elasticdash-sdk 0.2.8 → 0.2.9

package/README.md CHANGED
@@ -28,7 +28,7 @@ An AI-native test runner for ElasticDash workflow testing. Built for async AI pi
  ## Features
 
  - 🎯 **Trace-first testing** - every test gets a `trace` context to record and assert on LLM calls and tool invocations
- - 🔍 **Automatic AI interception** - captures OpenAI, Gemini, and Grok calls without code changes
+ - 🔍 **Automatic AI interception** - captures OpenAI, Anthropic, Gemini, Grok, Kimi, and AWS Bedrock calls without code changes
  - 🧪 **AI-specific matchers** - semantic output matching, LLM-judged evaluations, prompt assertions
  - 🛠️ **Tool & LLM recording & replay** - automatically trace tool and AI calls with checkpoint-based replay and mock support
  - 📊 **Interactive dashboard** - browse workflows, debug traces, validate fixes visually
@@ -201,7 +201,7 @@ Duration: 3.4s
 
  ### Recording Trace Data
 
- **Automatic (recommended):** Workflow code making real API calls to OpenAI, Gemini, or Grok is automatically intercepted and recorded.
+ **Automatic (recommended):** Workflow code making real API calls to OpenAI, Anthropic, Gemini, Grok, Kimi, or AWS Bedrock is automatically intercepted and recorded.
 
  **Manual (for custom providers or mocks):**
 
@@ -255,9 +255,13 @@ The runner automatically intercepts and records calls to:
  - OpenAI (`api.openai.com`)
  - Gemini (`generativelanguage.googleapis.com`)
  - Grok/xAI (`api.x.ai`)
+ - Kimi/Moonshot (`api.moonshot.ai`)
+ - AWS Bedrock (`bedrock-runtime.<region>.amazonaws.com`) - both `InvokeModel`/`InvokeModelWithResponseStream` and `Converse`/`ConverseStream`
 
  No code changes needed - just run your workflow and assertions work automatically. Because these providers are auto-captured, most workflows do **not** need to wrap LLM calls with `wrapAI`. See [Picking a wrapper](#picking-a-wrapper) below.
 
+ > **Note on Bedrock:** The interceptor sits on `globalThis.fetch`, so any code that reaches Bedrock through `fetch` is auto-captured (browsers, Workers, Deno, thin REST wrappers, and SDKs that use undici/fetch under the hood). `@aws-sdk/client-bedrock-runtime` on Node uses its own HTTP signer and bypasses `globalThis.fetch` - wrap those calls with `wrapAI({ provider: 'bedrock', model })` so events still get tagged and mocked rerun can match them. See [AWS Bedrock](#aws-bedrock) below.
+
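The host list in this hunk suggests that interception is keyed on the request URL. A minimal sketch of that kind of matching follows, assuming a plain hostname lookup plus one pattern for the regional Bedrock endpoints. `detectProvider`, the label strings, and the matching strategy are hypothetical and only illustrate the idea; they are not the SDK's interceptor code, and only the hosts shown in this hunk are included.

```ts
// Hypothetical sketch: map a request URL to a provider label by hostname.
// Hostnames are taken from the README list in this hunk; everything else is assumed.
type Provider = 'openai' | 'gemini' | 'grok' | 'kimi' | 'bedrock'

const EXACT_HOSTS = new Map<string, Provider>([
  ['api.openai.com', 'openai'],
  ['generativelanguage.googleapis.com', 'gemini'],
  ['api.x.ai', 'grok'],
  ['api.moonshot.ai', 'kimi'],
])

// Matches bedrock-runtime.<region>.amazonaws.com, where <region> is e.g. us-east-1.
const BEDROCK_HOST = /^bedrock-runtime\.[a-z0-9-]+\.amazonaws\.com$/

function detectProvider(url: string): Provider | null {
  let host: string
  try {
    host = new URL(url).hostname
  } catch {
    return null // not an absolute URL, so nothing to match
  }
  const exact = EXACT_HOSTS.get(host)
  if (exact !== undefined) return exact
  return BEDROCK_HOST.test(host) ? 'bedrock' : null
}
```

Under these assumptions, a URL on any regional `bedrock-runtime` host maps to `'bedrock'`, while an unrelated host maps to `null` and would be left alone.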
  ### Picking a wrapper
 
  The SDK exposes three wrappers that look similar but solve different problems. Pick by what your function actually does:
@@ -267,7 +271,7 @@ The SDK exposes three wrappers that look similar but solve different problems. P
  | Deterministic (REST call, DB query, file IO - no LLM inside) | **`edTool`** | Records as a `tool` event AND registers in the global tool registry so CLI `run-tool`, MCP `run_tool`, and dashboard rerun can find it by name. |
  | Exactly one LLM round-trip, AND you need prompt mocks, AI output mocks by name, OR the provider isn't auto-intercepted | **`wrapAI`** | Records as an `ai` event with token usage. Only `wrapAI` supports prompt rewriting (`resolvePromptMock` / `resolveUserPromptMock`) and named AI output mocks. |
  | An agent loop (LLM + inner tools, multiple round-trips) | **`edTool`** on the outer boundary | The inner LLM calls are auto-captured by the AI interceptor. Wrapping the outer agent with `wrapAI` would hide the inner detail. |
- | A direct single call to an auto-intercepted provider SDK (Anthropic / OpenAI / Gemini / Grok) | **No wrapper** | The AI interceptor already records it as an `ai` event with token usage. |
+ | A direct single call to an auto-intercepted provider SDK (Anthropic / OpenAI / Gemini / Grok / Kimi / Bedrock via `fetch`) | **No wrapper** | The AI interceptor already records it as an `ai` event with token usage. |
 
  > **`wrapTool`** is the primitive that `edTool` builds on. Use `wrapTool` directly only when you specifically do not want registry registration - for example, wrapping an inline closure inside another function.
 
@@ -378,7 +382,11 @@ export const callClaude = wrapAI('claude-sonnet-4-5', async (messages: Anthropic
 
  #### AWS Bedrock
 
- Bedrock calls go through the AWS SDK (which uses Node's HTTP stack, not `globalThis.fetch`), so they are **not auto-intercepted**. Wrap them with `wrapAI` using the unified **Converse API** - its `{ usage: { inputTokens, outputTokens } }` response shape is auto-extracted, and tagging `provider` with the underlying vendor (e.g. `'claude'` for `anthropic.*` model IDs) means existing matchers like `expect(trace).toHaveLLMStep({ provider: 'claude' })` match Bedrock-served calls with no change:
+ Bedrock is recognised by URL pattern (`bedrock-runtime.<region>.amazonaws.com`) and supports both API families: `InvokeModel` / `InvokeModelWithResponseStream` (including streaming via the binary `application/vnd.amazon.eventstream` format) and the unified `Converse` / `ConverseStream`. Model IDs, prompts, completions, and token usage are extracted automatically - including for cross-region inference profiles like `us.anthropic.…` or `au.anthropic.…`.
+
+ **If your code reaches Bedrock through `globalThis.fetch`** (browsers, Cloudflare Workers, Deno, undici-based clients, or a thin REST wrapper), nothing else is required. The interceptor captures the call, records it as an `ai` event with token usage, freezes it during `rerun_step`, and replays it during `rerun_workflow_mocked`.
+
+ **If your code uses `@aws-sdk/client-bedrock-runtime` on Node**, the AWS SDK runs through its own HTTP signer and bypasses `globalThis.fetch`. Wrap the call with `wrapAI` so events still get tagged and mocked rerun can match them - the Converse response's `{ usage: { inputTokens, outputTokens } }` shape is auto-extracted, and tagging `provider` with the underlying vendor (e.g. `'claude'` for `anthropic.*` model IDs) means existing matchers like `expect(trace).toHaveLLMStep({ provider: 'claude' })` work unchanged:
 
  ```ts
  import { wrapAI } from 'elasticdash-sdk'
@@ -399,21 +407,20 @@ export const callClaudeOnBedrock = wrapAI(
  inferenceConfig: { maxTokens: 1024 },
  }))
  },
- { provider: 'claude', model: MODEL_ID },
+ { provider: 'bedrock', model: MODEL_ID },
  )
  ```
 
  Notes:
 
  - **Credentials** come from the standard AWS provider chain (env vars, shared credentials file, IAM role) - the SDK does not manage them.
- - **Other vendors on Bedrock** (Llama, Titan, Mistral, Cohere, AI21) use the same pattern - change `modelId` and tag `provider` with the underlying vendor name (e.g. `'meta'`, `'amazon'`).
- - **Dashboard reruns of Bedrock events are not supported.** Re-run the workflow that contains the call instead of clicking rerun on the individual step.
+ - **Other vendors on Bedrock** (Llama, Titan, Mistral, Cohere, AI21) use the same pattern. For Converse the response shape is identical across vendors. For raw `InvokeModel`, Anthropic gets first-class extraction; other vendors fall back to a best-effort `outputText` / `generation` / `choices` lookup.
 
  #### Use `wrapAI` when
 
  The function body is essentially one LLM round-trip, AND at least one of the following applies:
 
- - The provider is **not auto-intercepted** (anything outside Anthropic / OpenAI / Gemini / Grok - e.g., Mistral, Cohere, local Ollama, Bedrock).
+ - The provider is **not auto-intercepted** (anything outside Anthropic / OpenAI / Gemini / Grok / Kimi / Bedrock - e.g., Mistral direct, Cohere direct, local Ollama), or the SDK bypasses `globalThis.fetch` (notably `@aws-sdk/client-bedrock-runtime` on Node).
  - You want **prompt mocks** - system or user prompt rewriting via `resolvePromptMock` / `resolveUserPromptMock` keyed by the name you pass to `wrapAI`. This is exclusive to `wrapAI`.
  - You want **AI output mocks keyed by a named step** - e.g., mock the `"router"` call without mocking every call to the same model. `resolveAIMock` keys off the name argument.
  - You want **one labelled boundary per logical step** in the trace (e.g., `"router"`, `"summarizer"`) with token usage attributed to that label, distinct from the raw provider-level event.