npm - dashclaw - Versions diffs - 2.11.1 → 2.13.0 - Mend

dashclaw 2.11.1 → 2.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -1,4 +1,4 @@
-# DashClaw SDK (v2.11.1)
+# DashClaw SDK (v2.12.0)
 **Minimal governance runtime for AI agents.**
@@ -18,58 +18,211 @@ pip install dashclaw
 ## The Governance Loop
-DashClaw v2 is designed around a single 4-step loop.
+DashClaw v2 is designed around a 4-step loop, with an optional
+human-in-the-loop (HITL) branch when policy requires approval.
+```
+guard ─▶ createAction ─▶ (if pending_approval: waitForApproval) ─▶ updateOutcome
+```
 ### Node.js
 ```javascript
-import { DashClaw } from 'dashclaw';
+import { DashClaw, GuardBlockedError, ApprovalDeniedError } from 'dashclaw';
 const claw = new DashClaw({
   baseUrl: process.env.DASHCLAW_BASE_URL,
   apiKey: process.env.DASHCLAW_API_KEY,
-  agentId: 'my-agent'
+  agentId: 'my-agent',
+  agentName: 'My Agent',  // optional — stored in audit trail for attribution
+  // Phase 2 (optional): attach a JWT from your OIDC provider for cryptographic
+  // attribution. When set, the server verifies the signature via JWKS and the
+  // JWT sub claim overrides agentId in the audit record.
+  // authToken: process.env.MY_AGENT_JWT,
 });
 // 1. Ask permission
-const res = await claw.guard({ action_type: 'deploy' });
+const decision = await claw.guard({
+  action_type: 'deploy',
+  declared_goal: 'Ship v2.4.0 to production',
+  risk_score: 90,
+});
+if (decision.decision === 'block') {
+  throw new GuardBlockedError(decision);
+}
-// 2. Log intent
-const { action_id } = await claw.createAction({ action_type: 'deploy' });
+// 2. Log intent. Server may gate this if policy requires approval —
+//    check action.status before assuming you're clear to execute.
+const { action, action_id } = await claw.createAction({
+  action_type: 'deploy',
+  declared_goal: 'Ship v2.4.0 to production',
+  risk_score: 90,
+});
-// 3. Log evidence
-await claw.recordAssumption({ action_id, assumption: 'Tests passed' });
+// 3. If the server flagged this for human review, wait for an operator.
+if (action?.status === 'pending_approval') {
+  try {
+    await claw.waitForApproval(action_id);
+  } catch (err) {
+    if (err instanceof ApprovalDeniedError) return; // operator denied
+    throw err;
+  }
+}
-// 4. Update result
-await claw.updateOutcome(action_id, { status: 'completed' });
+// 4. Execute the real work, then record the outcome
+await claw.recordAssumption({ action_id, assumption: 'Staging tests passed' });
+try {
+  const result = await myLlmCall();
+  await claw.updateOutcome(action_id, {
+    status: 'completed',
+    // Optional — populate Analytics cost/token charts. Cost is derived
+    // server-side from the configured pricing table when model + tokens
+    // are provided without an explicit cost_estimate.
+    tokens_in: result.usage.input_tokens,
+    tokens_out: result.usage.output_tokens,
+    model: result.model,
+  });
+} catch (err) {
+  await claw.updateOutcome(action_id, { status: 'failed', error_message: err.message });
+}
 ```
 ### Python
 ```python
 import os
-from dashclaw import DashClaw
+from dashclaw import DashClaw, GuardBlockedError, ApprovalDeniedError
 claw = DashClaw(
     base_url=os.environ["DASHCLAW_BASE_URL"],
     api_key=os.environ["DASHCLAW_API_KEY"],
-    agent_id="my-agent"
+    agent_id="my-agent",
+    agent_name="My Agent",  # optional — stored in audit trail for attribution
 )
 # 1. Ask permission
-res = claw.guard({"action_type": "deploy"})
+decision = claw.guard({
+    "action_type": "deploy",
+    "declared_goal": "Ship v2.4.0 to production",
+    "risk_score": 90,
+})
+if decision["decision"] == "block":
+    raise GuardBlockedError(decision)
 # 2. Log intent
-action = claw.create_action(action_type="deploy")
+action = claw.create_action(
+    action_type="deploy",
+    declared_goal="Ship v2.4.0 to production",
+    risk_score=90,
+)
 action_id = action["action_id"]
-# 3. Log evidence
-claw.record_assumption({"action_id": action_id, "assumption": "Tests passed"})
+# 3. If the server flagged this for human review, wait for an operator.
+if action.get("action", {}).get("status") == "pending_approval":
+    try:
+        claw.wait_for_approval(action_id)
+    except ApprovalDeniedError:
+        raise SystemExit("Approval denied by operator")  # stop here; do not run step 4
-# 4. Update result
+# 4. Execute and record outcome
+claw.record_assumption({"action_id": action_id, "assumption": "Staging tests passed"})
 claw.update_outcome(action_id, status="completed")
 ```
 ---
+## Human-in-the-Loop (HITL) Approval Flow
+When a guard policy, a capability `requires_approval` flag, or any server-side
+rule triggers human review, the server responds to `createAction()` with
+`action.status === 'pending_approval'` and HTTP **202**. Your agent's job is to
+pause on `waitForApproval()` until an operator clicks **Approve** or **Deny** from the dashboard, the
+CLI, the mobile PWA, or — on instances with Telegram configured — an inline
+Telegram button.
+### The rule every agent author needs to know
+**`waitForApproval()` must be called with the `action_id` returned by
+`createAction()`, NOT with the `action_id` returned by `guard()`.**
+These are two different records in two different tables:
+| Call | Returns `action_id` that refers to… | Prefix |
+|---|---|---|
+| `guard()` | A row in `guard_decisions` (the decision log) | `act_gd_…` |
+| `createAction()` | A row in `action_records` (the thing you're actually doing) | `act_…` |
+`waitForApproval()` polls `GET /api/actions/:id`, which is the
+`action_records` table. Passing it a `guard_decisions` ID (`act_gd_…`) will
+either return 404 or time out waiting on a row that doesn't exist. This was a
+real bug in an early version of the OpenClaw plugin — don't reproduce it.
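The prefix column in the table above suggests a cheap client-side check against this mistake. The following is a minimal sketch, assuming the `act_gd_` prefix reliably marks `guard_decisions` rows; `assertActionRecordId` is a hypothetical helper and not part of the SDK.

```javascript
// Hypothetical helper (not part of the dashclaw SDK): reject guard-decision IDs
// before they reach waitForApproval(). Assumes the `act_gd_` prefix from the
// table above reliably identifies rows in guard_decisions.
function assertActionRecordId(id) {
  if (typeof id !== 'string' || id.length === 0) {
    throw new TypeError('action_id must be a non-empty string');
  }
  if (id.startsWith('act_gd_')) {
    throw new Error(
      `"${id}" looks like a guard decision ID; pass the action_id returned by createAction() instead`
    );
  }
  return id;
}
```

One way to use it is to wrap the argument at the call site, for example `claw.waitForApproval(assertActionRecordId(action_id))`, so a mixed-up ID fails immediately instead of timing out.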
+### Correct sequence
+```javascript
+// 1. Guard — advisory; may return 'allow', 'block', 'warn', or 'require_approval'
+const decision = await claw.guard({
+  action_type: 'post_message',
+  declared_goal: 'Notify #ops of deploy start',
+  risk_score: 40,
+});
+if (decision.decision === 'block') {
+  throw new GuardBlockedError(decision);
+}
+// 2. Create the action. The server re-evaluates policy at this point and is
+//    the authoritative source for whether human review is required. Even if
+//    guard returned 'allow', the server may still set status='pending_approval'
+//    (for example, if a capability has requires_approval=true).
+const { action, action_id } = await claw.createAction({
+  action_type: 'post_message',
+  declared_goal: 'Notify #ops of deploy start',
+  risk_score: 40,
+});
+// 3. Check the SERVER's verdict, not the guard decision.
+if (action?.status === 'pending_approval') {
+  try {
+    // Use createAction's action_id, never the guard decision's action_id.
+    await claw.waitForApproval(action_id, { timeout: 600_000 });
+  } catch (err) {
+    if (err instanceof ApprovalDeniedError) {
+      // Operator denied — do NOT execute the action
+      return { denied: true, reason: err.message };
+    }
+    throw err;
+  }
+}
+// 4. Execute and record outcome
+await doTheWork();
+await claw.updateOutcome(action_id, { status: 'completed' });
+```
+### What `waitForApproval()` does under the hood
+- Opens an SSE connection to `/api/stream` and watches for
+  `action.updated` events scoped to the given `actionId`.
+- Falls back to HTTP polling of `GET /api/actions/:id` every 5 seconds if
+  SSE is unavailable.
+- Resolves when `action.approved_by` is set (operator approved).
+- Throws `ApprovalDeniedError` when `action.status` becomes `failed` or
+  `cancelled` (operator denied).
+- Throws a timeout error after `options.timeout` milliseconds (default
+  `300_000` = 5 minutes).
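The polling fallback described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's actual implementation: `pollForApproval` and `fetchAction` are hypothetical names, the SSE path is omitted, and the error class is a local stand-in for the SDK's `ApprovalDeniedError`.

```javascript
// Local stand-in for the SDK's ApprovalDeniedError so the sketch is self-contained.
class ApprovalDeniedError extends Error {}

// Sketch of the polling fallback only (no SSE). `fetchAction` is a hypothetical
// async function that returns the action record, e.g. via GET /api/actions/:id.
async function pollForApproval(fetchAction, actionId, { timeout = 300_000, interval = 5_000 } = {}) {
  const deadline = Date.now() + timeout;
  while (true) {
    const action = await fetchAction(actionId);
    // Resolve once an operator has approved.
    if (action.approved_by) return action;
    // A failed or cancelled status means the operator denied the action.
    if (action.status === 'failed' || action.status === 'cancelled') {
      throw new ApprovalDeniedError(`Action ${actionId} was denied`);
    }
    // Give up if the next poll would land after the deadline.
    if (Date.now() + interval > deadline) {
      throw new Error(`Timed out waiting for approval of ${actionId}`);
    }
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}
```

Checking `approved_by` before `status` keeps an approved action from being misread as denied if its status later changes during execution; whether the SDK orders the checks this way is not stated in the diff.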
+### Why guard and the server can disagree
+`guard()` is fast, in-memory, advisory. The server's `createAction` handler
+re-runs the exact same `evaluateGuard()` pipeline against the **persisted**
+action record, plus any capability-specific `requires_approval` flags and
+org-scoped rules that can only be resolved at write time. So the authoritative
+answer to "does this need human review?" is always `action.status` on the
+`createAction()` response — not `decision.decision` on the `guard()` response.
+Short version: **trust `action.status`, not `decision.decision`, for HITL
+branching.**
+---
 ## SDK Tiers
 DashClaw currently exposes a canonical Node SDK surface plus a legacy compatibility layer:
@@ -98,33 +251,43 @@ See:
 ---
-## SDK Surface Area (v2.11.1)
+## SDK Surface Area (v2.12.0)
 The v2 SDK exposes the stable governance runtime plus promoted execution domains in the canonical Node client:
 ### Core Runtime
-- `guard(context)` -- Policy evaluation ("Can I do X?"). Returns `risk_score` (server-computed) and `agent_risk_score` (raw agent value)
-- `createAction(action)` -- Lifecycle tracking ("I am doing X")
-- `updateOutcome(id, outcome)` -- Result recording ("X finished with Y")
+- `guard(context)` -- Policy evaluation ("Can I do X?"). Returns `risk_score` (server-computed), `agent_risk_score` (raw agent value), and `verification_status` (`verified` | `unverified` | `expired` | `failed` | `unknown_issuer`). Automatically includes `agent_name` from the constructor if not overridden in the call context. Pass `authToken` in the constructor to enable JWKS-backed cryptographic attribution (Phase 2 — see `docs/agent-identity.md`).
+- `createAction(action)` -- Lifecycle tracking ("I am doing X"). Accepts optional `idempotency_key`; on collision returns the existing row with `{ idempotent_replay: true }` instead of inserting a duplicate.
+- `updateOutcome(id, outcome)` -- Result recording ("X finished with Y"). `outcome` accepts `status`, `output_summary`, `side_effects`, `artifacts_created`, `error_message`, `duration_ms`, `tokens_in`, `tokens_out`, `model`, `cost_estimate`. When `tokens_in` / `tokens_out` are reported without an explicit `cost_estimate`, the server derives cost from `model` using the configured pricing table.
 - `recordAssumption(assumption)` -- Integrity tracking ("I believe Z while doing X")
 - `waitForApproval(id)` -- Real-time SSE listener for human-in-the-loop approvals (automatic polling fallback)
 - `approveAction(id, decision, reasoning?)` -- Submit approval decisions from code
 - `getPendingApprovals()` -- List actions awaiting human review
+### Durable Execution Finality (v2.13.3+)
+Terminal outcome reporting that is one-shot, retry-safe, and immutable once non-pending. Separate from `updateOutcome`, which remains the lifecycle-PATCH path. Full spec: [`docs/architecture/durable-execution-finality.md`](../docs/architecture/durable-execution-finality.md). Detailed examples in the [Action Outcome](#action-outcome-durable-execution-finality) subsection of Execution Studio below.
+- `reportActionOutcome(id, { status, summary?, error_message?, progress? })` -- Record the terminal outcome. `status` must be `completed`, `partial`, or `failed`; `lost_confirmation` is reserved for the system sweep. First call wins; subsequent POSTs return 409 with `current_status`.
+- `getActionOutcome(id)` -- Read the current outcome state. Returns `status` (one of `pending` / `completed` / `partial` / `failed` / `lost_confirmation`), `outcome_at`, `summary`, `error_message`, `progress`, `elapsed_ms`. Poll this before retrying any approved action.
+- `reportActionSuccess(id, summary?)` -- Convenience wrapper for `completed`.
+- `reportActionFailure(id, errorMessage, summary?)` -- Convenience wrapper for `failed`. `error_message` is required.
+- `reportActionPartial(id, progress, summary?)` -- Convenience wrapper for `partial`. `progress` (object) is required.
+- `deriveIdempotencyKey(parts)` -- SHA-256 hex digest of intent-fields for the `idempotency_key` field on `createAction`. Order-independent. Derive from intent (agent, action_type, scope, request_id), not timestamps.
 ### Decision Integrity
 - `registerOpenLoop(actionId, type, desc)` -- Register unresolved dependencies.
 - `resolveOpenLoop(loopId, status, res)` -- Resolve pending loops.
 - `getSignals()` -- Get current risk signals across all agents.
 ### Swarm & Connectivity
-- `heartbeat(status, metadata)` -- Report agent presence and health. **As of DashClaw 2.13.0, heartbeats are implicit on `createAction()` — you only need this if you want to report presence without recording an action.**
+- `heartbeat(status, metadata)` -- Report agent presence and health. **As of DashClaw platform 2.13.0 (server-side change, independent of SDK version), heartbeats are implicit on `createAction()` — you only need this if you want to report presence without recording an action.**
 - `reportConnections(connections)` -- Report active provider connections.
 ### Learning & Optimization
 - `getLearningVelocity()` -- Track agent improvement rate.
 - `getLearningCurves()` -- Measure efficiency gains per action type.
 - `getLessons({ actionType, limit })` -- Fetch consolidated lessons from scored outcomes.
-- `renderPrompt(context)` -- Fetch rendered prompt templates from DashClaw.
+- `renderPrompt({ template_id, version_id, variables, record })` -- Fetch a rendered prompt template from DashClaw. `template_id` is required; `version_id` defaults to the active version; `variables` is an object of mustache values; `record: true` persists the render as a governance event.
 ### Learning Loop
@@ -367,30 +530,55 @@ Messages sent through the context are automatically correlated with the action i
 DashClaw uses standard HTTP status codes and custom error classes:
-- `GuardBlockedError` -- Thrown when `claw.guard()` returns a `block` decision.
-- `ApprovalDeniedError` -- Thrown when an operator denies an action during `waitForApproval()`.
+- `GuardBlockedError` -- Thrown by **any** SDK call when the server returns HTTP 403 with `{ decision: { decision: 'block' } }`. Note that a successful `guard()` call returning `{ decision: 'block' }` in a **200** body does **not** throw — it just returns the decision object. Always check `decision.decision === 'block'` after `guard()` and throw `new GuardBlockedError(decision)` yourself if you want to abort early, as shown in the governance loop above.
+- `ApprovalDeniedError` -- Thrown by `waitForApproval()` when an operator denies the action (server sets `status` to `failed` or `cancelled`).
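Because a `block` decision in a 200 body does not throw on its own, callers who prefer exception-based control flow may want a thin wrapper. The sketch below assumes only that `guard()` resolves to an object with a `decision` field; `guardOrThrow` is a hypothetical helper, and the error class is a local stand-in whose constructor shape is an assumption.

```javascript
// Local stand-in for the SDK's GuardBlockedError so the sketch runs on its own.
// The constructor signature here is an assumption.
class GuardBlockedError extends Error {
  constructor(decision) {
    super(decision.reason || 'Blocked by guard policy');
    this.name = 'GuardBlockedError';
    this.decision = decision;
  }
}

// Hypothetical wrapper: turn a 200-body `block` decision into an exception.
// `claw` is any object exposing an async guard(context) method.
async function guardOrThrow(claw, context) {
  const decision = await claw.guard(context);
  if (decision.decision === 'block') {
    throw new GuardBlockedError(decision);
  }
  return decision; // 'allow', 'warn', or 'require_approval'
}
```

The wrapper deliberately lets `require_approval` through, since the HITL section treats `action.status` from `createAction()` as the authoritative signal for human review.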
 ---
-## CLI Approval Channel
+## CLI (`@dashclaw/cli`)
-Install the DashClaw CLI to approve agent actions from the terminal:
+Install the DashClaw CLI for terminal approvals and self-host diagnostics:
 ```bash
 npm install -g @dashclaw/cli
 ```
+**Approvals:**
 ```bash
 dashclaw approvals              # interactive approval inbox
 dashclaw approve <actionId>     # approve a specific action
 dashclaw deny <actionId>        # deny a specific action
 ```
-When an agent calls `waitForApproval()`, it prints the action ID and replay link to stdout. Approve from any terminal or the dashboard, and the agent unblocks instantly.
+**Diagnostics:**
-## MCP Server (Zero-Code Integration)
+```bash
+dashclaw doctor                 # diagnose + auto-fix safe issues (database, config, auth, deployment, SDK, governance, drift)
+dashclaw doctor --json          # CI/machine-readable
+dashclaw doctor --no-fix        # diagnose only
+dashclaw doctor --category database,config
+```
-If your agent supports MCP (Claude Code, Claude Desktop, Managed Agents), you can skip the SDK entirely:
+Config resolution order: env vars (`DASHCLAW_BASE_URL`, `DASHCLAW_API_KEY`, optional `DASHCLAW_AGENT_ID`) → `~/.dashclaw/config.json` (`600`, persisted after interactive prompt) → first-run prompt. `dashclaw logout` removes saved config.
+When an agent calls `waitForApproval()`, it prints the action ID and replay link to stdout. Approve from any terminal, the browser dashboard, the `/approve` mobile PWA, or, if the instance has Telegram configured, an inline Telegram Approve/Reject button pushed to the admin chat. Decisions sync over Redis SSE within ~1 second.
+## Self-Host Doctor (`npm run doctor`)
+For operators running a self-hosted DashClaw instance, Doctor is also available as a local script with filesystem-level fix powers:
+```bash
+npm run doctor                  # can write .env, run migrations, seed default policy
+```
+Doctor check modules are emitted from the livingcode shape (`app/lib/doctor/generated/checks-from-shape.mjs`) and run against `GET /api/doctor` / `POST /api/doctor/fix`. The `.env` is always backed up before any write. Includes a drift guard that flags when shape-derived artifacts are out of sync — fix with `npm run livingcode:refresh`.
+## MCP Server (`@dashclaw/mcp-server`)
+If your agent supports Model Context Protocol (Claude Code, Claude Desktop, Managed Agents, MCP Inspector), skip the SDK entirely and let the MCP server wire governance into your agent loop.
+**stdio transport** (recommended for Claude Desktop / Claude Code):
 ```json
 {
@@ -404,21 +592,59 @@ If your agent supports MCP (Claude Code, Claude Desktop, Managed Agents), you ca
 }
 ```
-The MCP server exposes the same governance surface as the SDK (guard, record, invoke, wait for approval) plus discovery (capabilities, policies) and session lifecycle.
+**Streamable HTTP transport** (same surface, served by your DashClaw instance at `POST /api/mcp`).
+**23 tools** in 7 groups:
+- **Core governance (8):** `dashclaw_guard`, `dashclaw_record`, `dashclaw_invoke`, `dashclaw_capabilities_list`, `dashclaw_policies_list`, `dashclaw_wait_for_approval`, `dashclaw_session_start`, `dashclaw_session_end`.
+- **Optimal files (2):** `dashclaw_optimal_files_preview`, `dashclaw_optimal_files_manifest` — Code Sessions optimizer output (root CLAUDE.md, path-scoped rules, hooks, skill packs).
+- **Session continuity (3):** `dashclaw_handoff_create`, `dashclaw_handoff_latest`, `dashclaw_handoff_consume` — agent-runtime handoff bundle for the next session.
+- **Credential hygiene (3):** `dashclaw_secret_list`, `dashclaw_secret_due`, `dashclaw_secret_mark_rotated` — check rotation due-dates before acting on tracked credentials.
+- **Skill safety (1):** `dashclaw_skill_scan` — static safety scan of untrusted skill files; results cached by content hash.
+- **Open loops (3):** `dashclaw_loop_add`, `dashclaw_loop_list`, `dashclaw_loop_close` — action-scoped commitments (the "I will X later" tracker).
+- **Learning + retrospection (3):** `dashclaw_learning_log`, `dashclaw_learning_query`, `dashclaw_decisions_recent` — log + query non-obvious decisions; recent governed-action ledger.
+**4 resources:** `dashclaw://policies`, `dashclaw://capabilities`, `dashclaw://agent/{agent_id}/history`, `dashclaw://status`.
+### Agent runtime endpoints (server-side, no SDK wrapper)
+DashClaw 2.17 (platform) added three route families that are **agent-runtime infrastructure, not developer SDK methods**. They are called by the MCP server (the tools listed above), by Hermes Agent hooks, and by other governance plumbing — never directly from agent code. By design, they are not exposed on `claw.*`:
+| Family | Endpoints | Where called from |
+|---|---|---|
+| Session handoffs | `POST/GET /api/handoffs`, `GET /api/handoffs/latest`, `GET /api/handoffs/{id}`, `POST /api/handoffs/{id}/consume` | Hermes `on_session_end` / `on_session_start` / `pre_llm_call` hooks; MCP `dashclaw_handoff_*` tools |
+| Operator-tracked secrets | `GET/POST /api/secrets`, `PATCH/DELETE /api/secrets/{id}`, `GET /api/secrets/rotation-due` | MCP `dashclaw_secret_*` tools; operator UI |
+| Skill safety scan | `POST /api/skills/scan`, `GET /api/skills/scans/{id}` | MCP `dashclaw_skill_scan` tool; agents before loading an untrusted skill |
+If you're building a custom integration that needs these without MCP, call them as plain HTTP — see `docs/api-inventory.md` and the OpenAPI spec at `docs/openapi/critical-stable.openapi.json`.
+## OpenClaw Plugin (`@dashclaw/openclaw-plugin`)
+For teams using the OpenClaw agent framework, the governance plugin intercepts `PreToolUse` / `PostToolUse` lifecycle hooks and runs guard → record → wait-for-approval automatically. Tool classification vocabulary aligns with DashClaw's guard action types. Install via the openclaw CLI, which picks up the bundled `HOOK.md` pack.
+## Governance Skill for Claude (Anthropic)
+For Anthropic Managed Agents or Claude Code sessions, the `@dashclaw/governance` skill teaches the agent how to use the MCP tools correctly — risk thresholds, decision handling, recording rules, session lifecycle. Pairs with `@dashclaw/mcp-server`. Download at `https://<your-instance>/downloads/dashclaw-governance.zip` or see `public/downloads/dashclaw-governance/`.
 ---
 ## Claude Code Hooks
-Govern Claude Code tool calls without any SDK instrumentation. Copy two files from the `hooks/` directory in the repo into your `.claude/hooks/` folder:
+Govern Claude Code tool calls without any SDK instrumentation. One command from anywhere DashClaw is cloned:
 ```bash
-# In your project directory
-cp path/to/DashClaw/hooks/dashclaw_pretool.py .claude/hooks/
-cp path/to/DashClaw/hooks/dashclaw_posttool.py .claude/hooks/
+# From a DashClaw checkout
+npm run hooks:install
+# From any other project, pointing at a DashClaw checkout
+node /path/to/DashClaw/scripts/install-hooks.mjs --target=.
 ```
-Then merge the hooks block from `hooks/settings.json` into your `.claude/settings.json`. Set `DASHCLAW_BASE_URL`, `DASHCLAW_API_KEY`, and optionally `DASHCLAW_HOOK_MODE=enforce`.
+This installs three hooks (`dashclaw_pretool.py`, `dashclaw_posttool.py`, `dashclaw_stop.py`) plus the bundled `dashclaw_agent_intel/` tool-classification module into `.claude/hooks/`, then merges the `PreToolUse`, `PostToolUse`, and `Stop` blocks into `.claude/settings.json`. Idempotent: re-run after `git pull` to upgrade.
+The Stop hook captures per-turn LLM token usage from the session transcript and PATCHes it onto the action records the pretool opened during the turn, so cost analytics light up without per-agent instrumentation.
+Set `DASHCLAW_BASE_URL`, `DASHCLAW_API_KEY`, and optionally `DASHCLAW_HOOK_MODE=enforce`. Full guide and per-hook details in [`hooks/README.md`](../hooks/README.md).
 ---
@@ -463,6 +689,69 @@ const { rootActionId, nodes, edges } = await claw.getActionGraph(actionId);
 // edges: parent_child | related | assumption_of | loop_from
 ```
+### Action Outcome (durable execution finality)
+Every approved action carries a terminal outcome: `pending`, `completed`, `partial`, `failed`, or `lost_confirmation`. Agents call `reportActionOutcome` to record finality, and `getActionOutcome` before retry to avoid re-executing already-completed work. Outcomes are one-shot — once non-pending, they cannot be rewritten.
+```javascript
+// Report success
+await claw.reportActionOutcome(actionId, {
+  status: 'completed',
+  summary: 'Deployed dashclaw 2.13.4 to production'
+});
+// Convenience wrappers
+await claw.reportActionSuccess(actionId, 'Deployed dashclaw 2.13.4');
+await claw.reportActionFailure(actionId, 'Downstream API returned 503');
+await claw.reportActionPartial(actionId, { step: 2, of: 5 });
+// Report failure (error_message required)
+await claw.reportActionOutcome(actionId, {
+  status: 'failed',
+  error_message: 'Downstream API returned 503'
+});
+// Report partial progress (progress object required)
+await claw.reportActionOutcome(actionId, {
+  status: 'partial',
+  progress: { step: 2, of: 5 }
+});
+// Retry-safe poll before re-trying any approved action
+const outcome = await claw.getActionOutcome(actionId);
+switch (outcome.status) {
+  case 'pending':            /* still in flight, WAIT */ break;
+  case 'completed':          /* already executed, SKIP */ break;
+  case 'failed':             /* safe to RETRY */ break;
+  case 'lost_confirmation':  /* sweep gave up, safe to RETRY */ break;
+  case 'partial':            /* clean up then retry */ break;
+}
+```
+HTTP surface (when the SDK isn't available):
+```bash
+curl -X POST "$BASE_URL/api/actions/$ACTION_ID/outcome" \
+  -H "x-api-key: $API_KEY" -H "Content-Type: application/json" \
+  -d '{"status":"completed","summary":"shipped"}'
+# 200 → { outcome: { ... } }
+# 409 → { error: "outcome already set", current_status: "completed" }
+```
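For callers using the raw HTTP surface, the two responses in the curl comments above can be folded into one result shape. This is a minimal sketch: `interpretOutcomeResponse` is a hypothetical helper, and it assumes the `outcome` object in the 200 body carries a `status` field, which the diff does not spell out.

```javascript
// Hypothetical helper: classify a POST /api/actions/:id/outcome response using
// only the status codes and body shapes shown in the curl comments above.
function interpretOutcomeResponse(httpStatus, body) {
  if (httpStatus === 200) {
    // Assumption: the returned outcome object includes its own `status`.
    return { recorded: true, status: body.outcome?.status ?? null };
  }
  if (httpStatus === 409) {
    // First call wins: an earlier report (possibly a retry of this one)
    // already set the outcome, so surface what the server currently holds.
    return { recorded: false, status: body.current_status };
  }
  throw new Error(`Unexpected response ${httpStatus}: ${body?.error ?? 'no error message'}`);
}
```

Treating 409 as a normal result rather than an error fits the one-shot semantics: a retried report that loses the race can still learn the stored status and move on.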
+Outcomes that stay `pending` and are never reported are swept to `lost_confirmation` by `/api/cron/outcome-sweep` (Vercel runs it daily on Hobby). Each `lost_confirmation` event fires a `signal.detected` webhook so subscribers can see and recover. The per-org timeout, in minutes, is configurable via the `DASHCLAW_OUTCOME_TIMEOUT_MINUTES` setting (default 15).
+**Idempotency keys.** Network errors on the *create* side of the create-then-execute flow used to leave duplicate `action_records` behind. Pass `idempotency_key` on `POST /api/actions` to make creates retry-safe — a second POST with the same `(org_id, idempotency_key)` returns the original row with `{ idempotent_replay: true }` instead of inserting a duplicate. Derive keys from intent, not timestamps:
+```javascript
+const idempotency_key = claw.deriveIdempotencyKey({
+  agent_id: 'deploy-bot',
+  action_type: 'deploy',
+  scope: 'prod-us-east',
+  request_id: requestId, // your own attempt discriminator
+});
+await claw.createAction({ /* ... */, idempotency_key });
+```
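The SDK's `deriveIdempotencyKey` relies on Node's `crypto` module (see the `package/dashclaw.js` diff), and its doc comment asks browser-only callers to compute the key themselves. The sketch below shows one way to do that with Web Crypto. `deriveIdempotencyKeyWeb` is a hypothetical name; it copies the canonical form used in the diff (sorted keys joined as `k=v|k=v`), so in principle it should yield the same digest for the same `parts`, but that has not been verified against the SDK here.

```javascript
// Sketch of a browser-friendly key derivation using Web Crypto. It mirrors the
// canonical form used by deriveIdempotencyKey in this diff: keys sorted, joined
// as `k=v|k=v`, hashed with SHA-256, and hex-encoded.
async function deriveIdempotencyKeyWeb(parts) {
  if (!parts || typeof parts !== 'object') {
    throw new TypeError('parts must be an object');
  }
  const canonical = Object.keys(parts)
    .sort()
    .map((k) => `${k}=${parts[k] ?? ''}`)
    .join('|');
  // Browsers and recent Node versions expose Web Crypto as a global; the
  // dynamic import is a fallback for older Node releases.
  const subtle = globalThis.crypto?.subtle ?? (await import('node:crypto')).webcrypto.subtle;
  const digest = await subtle.digest('SHA-256', new TextEncoder().encode(canonical));
  // Convert the ArrayBuffer to a lowercase hex string.
  return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, '0')).join('');
}
```

Unlike the SDK method, this version is asynchronous because `SubtleCrypto.digest` returns a promise, so the key has to be awaited before it is passed as `idempotency_key` to `createAction`.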
 ### Workflow Templates
 ```javascript
@@ -704,5 +993,35 @@ Health responses now include certification and recency fields such as:
 ---
+## Hosted provisioning (operator surface — not an SDK method)
+When `DASHCLAW_HOSTED=true` the deployment exposes `/api/hosted/*` routes for one-click trial provisioning. These are operator-facing routes, not SDK methods — they produce the API key the SDK consumes.
+```bash
+# Mint a trial workspace (no auth required; Turnstile-gated in production)
+curl -X POST https://hosted.example.com/api/hosted/workspaces \
+  -H "content-type: application/json" \
+  -d '{"turnstile_token": "..."}'
+# → { "workspace_id": "org_...", "api_key": "oc_live_...", "endpoint": "...",
+#     "expires_at": "...", "trial_action_cap": 10000, "key_prefix": "oc_live_",
+#     "next_steps_url": "https://hosted.example.com/connect?hosted=org_..." }
+# Admin: inspect a trial workspace (x-api-key with admin role)
+curl https://hosted.example.com/api/hosted/workspaces/org_abc \
+  -H "x-api-key: <admin_key>"
+# Admin: delete a trial workspace
+curl -X DELETE https://hosted.example.com/api/hosted/workspaces/org_abc \
+  -H "x-api-key: <admin_key>"
+# Cron: sweep expired trials (admin role OR X-Cleanup-Secret)
+curl -X POST https://hosted.example.com/api/hosted/cleanup \
+  -H "X-Cleanup-Secret: $HOSTED_CLEANUP_SECRET"
+```
+These routes return 404 when `DASHCLAW_HOSTED` is unset — self-host deploys are unaffected.
+---
 ## License
 MIT

package/dashclaw.js CHANGED

@@ -1,8 +1,10 @@
 /**
- * DashClaw SDK v2.11.0 (Stable Runtime API)
+ * DashClaw SDK v2.12.0 (Stable Runtime API)
  * Focused governance runtime client for AI agents.
  */
+import { createHash } from 'crypto';
 class ApprovalDeniedError extends Error {
   constructor(message, decision) {
     super(message);
@@ -25,8 +27,13 @@ class DashClaw {
    * @param {string} options.baseUrl - DashClaw base URL
    * @param {string} options.apiKey - API key for authentication
    * @param {string} options.agentId - Unique identifier for this agent
+   * @param {string} [options.agentName] - Human-readable label for this agent (stored in audit trail)
+   * @param {string} [options.authToken] - Phase 2: JWT bearer token from your OIDC provider.
+   *   When set, DashClaw server verifies the token via JWKS and returns `verification_status`
+   *   in every guard response. The JWT `sub` claim overrides agentId in the audit record
+   *   when verification succeeds — cryptographic proof beats self-assertion.
    */
-  constructor({ baseUrl, apiKey, agentId }) {
+  constructor({ baseUrl, apiKey, agentId, agentName, authToken }) {
     if (!baseUrl) throw new Error('baseUrl is required');
     if (!apiKey) throw new Error('apiKey is required');
     if (!agentId) throw new Error('agentId is required');
@@ -34,6 +41,8 @@ class DashClaw {
     this.baseUrl = baseUrl.replace(/\/$/, '');
     this.apiKey = apiKey;
     this.agentId = agentId;
+    this.agentName = agentName || null;
+    this.authToken = authToken || null;
     this.execution = {
       capabilities: {
@@ -59,7 +68,8 @@ class DashClaw {
     const headers = {
       'Content-Type': 'application/json',
-      'x-api-key': this.apiKey
+      'x-api-key': this.apiKey,
+      ...(this.authToken ? { 'Authorization': `Bearer ${this.authToken}` } : {}),
     };
     const res = await fetch(url, {
@@ -90,12 +100,30 @@ class DashClaw {
   /**
    * POST /api/guard — "Can I do X?"
    * @param {Object} context
-   * @returns {Promise<{decision: 'allow'|'block'|'require_approval', action_id: string, reason: string, signals: string[]}>}
+   * @returns {Promise<{
+   *   decision: 'allow'|'block'|'require_approval'|'warn',
+   *   action_id: string,
+   *   reason: string,
+   *   signals: string[],
+   *   verification_status: 'verified'|'unverified'|'expired'|'failed'|'unknown_issuer',
+   *   agent_id: string|null,
+   *   agent_name: string|null,
+   * }>}
+   *
+   * `verification_status` reflects whether the JWT bearer token (if provided
+   * via the `authToken` constructor option) was cryptographically verified:
+   *   verified       — signature valid; audit entry anchored to JWT sub
+   *   unverified     — no token, or issuer temporarily unreachable (fail-soft)
+   *   expired        — token expired; consider refreshing before next call
+   *   failed         — bad signature, malformed token, or audience mismatch
+   *   unknown_issuer — issuer not in DASHCLAW_ALLOWED_ISSUER (server config)
    */
   async guard(context) {
     return this._request('/api/guard', 'POST', {
       ...context,
       agent_id: context.agent_id || this.agentId,
+      // Include agent_name for audit attribution if not already provided by caller
+      ...(context.agent_name == null && this.agentName ? { agent_name: this.agentName } : {}),
     });
   }
@@ -749,6 +777,95 @@ class DashClaw {
     return this._request(`/api/actions/${actionId}/graph`, 'GET');
   }
+  // ---------------------------------------------------------------------------
+  // Durable execution finality — terminal outcome reporting
+  // See docs/architecture/durable-execution-finality.md
+  // ---------------------------------------------------------------------------
+  /**
+   * POST /api/actions/:id/outcome — Record the terminal outcome of an action.
+   *
+   * @param {string} actionId
+   * @param {Object} payload
+   * @param {'completed'|'partial'|'failed'} payload.status
+   * @param {string} [payload.summary]
+   * @param {string} [payload.error_message] — required when status=failed
+   * @param {Object} [payload.progress] — required when status=partial
+   * @returns {Promise<{ outcome: object, security: object }>}
+   * @throws on 409 when the outcome is already terminal — inspect the response
+   *   body for `current_status` before deciding what to do next.
+   */
+  async reportActionOutcome(actionId, payload) {
+    return this._request(`/api/actions/${actionId}/outcome`, 'POST', payload);
+  }
+  /**
+   * GET /api/actions/:id/outcome — Read the current outcome state of an action.
+   *
+   * Returns `{ action_id, status, outcome_at, summary, error_message, progress, elapsed_ms }`.
+   * Status is one of: pending, completed, partial, failed, lost_confirmation.
+   * Use this BEFORE retrying any approved action to avoid double-execution.
+   */
+  async getActionOutcome(actionId) {
+    return this._request(`/api/actions/${actionId}/outcome`, 'GET');
+  }
+  /**
+   * Convenience: report a successful terminal outcome.
+   */
+  async reportActionSuccess(actionId, summary) {
+    return this.reportActionOutcome(actionId, { status: 'completed', summary });
+  }
+  /**
+   * Convenience: report a failed terminal outcome. `error_message` is required.
+   */
+  async reportActionFailure(actionId, errorMessage, summary) {
+    return this.reportActionOutcome(actionId, {
+      status: 'failed',
+      error_message: errorMessage,
+      summary,
+    });
+  }
+  /**
+   * Convenience: report a partial outcome with progress state. Progress is
+   * required (an object describing where the agent stopped).
+   */
+  async reportActionPartial(actionId, progress, summary) {
+    return this.reportActionOutcome(actionId, {
+      status: 'partial',
+      progress,
+      summary,
+    });
+  }
+  /**
+   * Derive a stable idempotency key from the *intent* of an action so a
+   * retried `createAction` call returns the original row instead of creating
+   * a duplicate. Pass the same `parts` for the same logical action; vary at
+   * least one part for distinct actions.
+   *
+   * The hash function uses SHA-256 hex via Node's built-in crypto. In
+   * browser-only environments without Node's `crypto` module, callers should
+   * compute the key themselves and pass it directly to
+   * `createAction({ idempotency_key })`.
+   *
+   * @param {Object} parts — at minimum agent_id + action_type + a request
+   *   discriminator that uniquely identifies this attempt. Reusing the key
+   *   for a logically distinct action is the agent's bug, not DashClaw's.
+   * @returns {string} SHA-256 hex digest
+   */
+  deriveIdempotencyKey(parts) {
+    if (!parts || typeof parts !== 'object') {
+      throw new TypeError('deriveIdempotencyKey: parts must be an object');
+    }
+    const ordered = Object.keys(parts)
+      .sort()
+      .map((k) => `${k}=${parts[k] ?? ''}`)
+      .join('|');
+    return createHash('sha256').update(ordered).digest('hex');
+  }
   // ---------------------------------------------------------------------------
   // Execution Studio — Workflow Templates
   // ---------------------------------------------------------------------------

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "dashclaw",
-  "version": "2.11.1",
+  "version": "2.13.0",
   "description": "Minimal governance runtime for AI agents. Intercept, govern, and verify agent actions.",
   "type": "module",
   "publishConfig": {