retrace-sdk 0.16.0 → 0.16.1

package/README.md CHANGED
@@ -1,58 +1,118 @@
1
- # retrace-sdk
1
+ <div align="center">
2
2
 
3
- The execution replay engine for AI agents. Record, replay, fork & share AI agent executions — TypeScript SDK.
3
+ <img src="https://raw.githubusercontent.com/yash1511-bogam/retrace-sdk/main/assets/banner.gif" alt="Retrace" width="480" />
4
4
 
5
- ## Installation
5
+ ![TypeScript](https://img.shields.io/badge/TypeScript-3178C6?logo=typescript&logoColor=white)
6
+
7
+ Record every LLM call, tool invocation, and error your agent makes. Replay it step-by-step like a video. Fork from any point, change the input, and watch the whole agent re-execute down a new path. Share any run as an interactive, public link.
8
+
9
+ [Quick Start](#quick-start) · [How It Works](#how-it-works) · [Recipes](#usage-recipes) · [Enforcement](#enforcement-circuit-breakers) · [Docs](https://docs.retraceai.tech)
10
+
11
+ </div>
12
+
13
+ ---
14
+
15
+ ## Install
6
16
 
7
17
  ```bash
8
18
  npm install retrace-sdk
9
19
  ```
10
20
 
11
- Requires Node.js 20+. ESM-only package.
21
+ Requires Node.js 20+. ESM-only. Works in Node and edge runtimes.
22
+
23
+ ---
12
24
 
13
25
  ## Quick Start
14
26
 
15
27
  ```typescript
16
28
  import { configure, trace } from "retrace-sdk";
17
29
 
18
- configure({ apiKey: "rt_..." }); // Get your key at retraceai.tech/settings
30
+ configure({ apiKey: "rt_..." }); // get a key in the dashboard
19
31
 
20
- const myAgent = trace(async (prompt: string) => {
21
- const response = await openai.chat.completions.create({
22
- model: "gpt-5.5",
23
- messages: [{ role: "user", content: prompt }],
24
- });
25
- return response.choices[0].message.content;
26
- }, { name: "my-agent" });
32
+ const runAgent = trace(async (prompt: string) => {
33
+ const plan = await callPlanner(prompt); // captured automatically
34
+ const results = await callTools(plan); // captured automatically
35
+ return summarize(results); // captured automatically
36
+ }, { name: "research-agent", resumable: true });
27
37
 
28
- await myAgent("What is quantum computing?");
38
+ await runAgent("What changed in vector databases this year?");
29
39
  ```
30
40
 
41
+ That's it. The run appears in the dashboard as an interactive timeline you can scrub, replay, and fork.
42
+
43
+ ---
44
+
45
+ ## How It Works
46
+
47
+ The SDK is a thin capture layer. It wraps your function, auto-instruments provider calls, and streams spans to Retrace over a resilient transport — never blocking or crashing your agent.
48
+
49
+ ```mermaid
50
+ flowchart LR
51
+ subgraph yourproc["Your process"]
52
+ fn["trace(fn)"] --> cap["auto-captured spans<br/>(LLM · tool · error)"]
53
+ cap --> buf["offline buffer<br/>(bounded, flush on reconnect)"]
54
+ end
55
+
56
+ buf == "WebSocket (primary)" ==> api["Retrace"]
57
+ buf -. "HTTP fallback" .-> api
58
+ api == "resume: re-execute from fork point" ==> fn
59
+
60
+ classDef p fill:#0f3d2e,stroke:#10b981,color:#d1fae5;
61
+ class api p;
62
+ ```
63
+
64
+ - **Capture** — provider calls (OpenAI, Anthropic, Gemini) are intercepted automatically; you can also emit manual spans.
65
+ - **Transport** — spans stream over a **WebSocket**; if it drops, the SDK falls back to **HTTP** and replays a bounded **offline buffer** on reconnect, so nothing is lost.
66
+ - **Resumable** — with `resumable: true`, the SDK listens for a `resume` command and **re-executes your function from a fork point** with modified input, powering cascade replay from the dashboard.
67
+ - **Safe by default** — failures in the SDK never throw into your agent; typed errors surface real problems explicitly.
68
+
69
+ ---
70
+
31
71
  ## Auto-Instrumentation
32
72
 
33
- LLM calls from all major providers are captured automatically:
73
+ LLM calls from major providers are captured with no extra code — just install the provider SDK alongside `retrace-sdk`:
74
+
75
+ | Provider | Captured call |
76
+ |---|---|
77
+ | **OpenAI** | `openai.chat.completions.create()` |
78
+ | **Anthropic** | `anthropic.messages.create()` |
79
+ | **Google Gemini** | `ai.models.generateContent()` |
80
+
81
+ Framework adapters are available for agent frameworks (e.g. LangChain, Vercel AI SDK) — see the [docs](https://docs.retraceai.tech).
34
82
 
35
- - **OpenAI** — `openai.chat.completions.create()` captured
36
- - **Anthropic** — `anthropic.messages.create()` captured
37
- - **Google Gemini** — `ai.models.generateContent()` captured
83
+ ---
38
84
 
39
- No extra setup needed. Install the provider SDK alongside `retrace-sdk`.
85
+ ## Capabilities
40
86
 
41
- ## Configuration
87
+ | Capability | What it does |
88
+ |---|---|
89
+ | **Record** | One `trace()` wrapper captures the full execution tree. |
90
+ | **Cascade replay** | `resumable: true` lets a dashboard fork re-execute the whole function from any step. |
91
+ | **Enforcement** | Local budget/step/loop ceilings stop a runaway agent *before* the next call. |
92
+ | **Multi-agent** | Tag spans with an agent id/role for topology + inter-agent detectors. |
93
+ | **Golden cassettes** | Record a run as a CI regression fixture and gate on it offline. |
94
+ | **Sampling** | Record a fraction of traffic in production. |
95
+ | **Sessions** | Group multi-turn conversations under one session. |
96
+
97
+ ---
98
+
99
+ ## Usage Recipes
100
+
101
+ ### Configure
42
102
 
43
103
  ```typescript
44
104
  import { configure } from "retrace-sdk";
45
105
 
46
106
  configure({
47
- apiKey: "rt_...", // or RETRACE_API_KEY env var
107
+ apiKey: "rt_...", // or RETRACE_API_KEY
48
108
  baseUrl: "https://api.retraceai.tech",
49
- projectId: "...", // or RETRACE_PROJECT_ID env var
109
+ projectId: "...", // or RETRACE_PROJECT_ID
50
110
  });
51
111
  ```
52
112
 
53
113
  Set `RETRACE_ENABLED=false` to disable recording without changing code.
54
114
 
55
- ## Manual Span Creation
115
+ ### Manual spans
56
116
 
57
117
  ```typescript
58
118
  import { record, SpanType } from "retrace-sdk";
@@ -67,35 +127,19 @@ recorder.endSpan(span, { results: ["..."] });
67
127
  recorder.end("Done");
68
128
  ```
69
129
 
70
- ## Resumable Execution (Cascade Replay)
71
-
72
- Mark a function as resumable to enable full cascade replay from the dashboard:
130
+ ### Resumable execution (cascade replay)
73
131
 
74
132
  ```typescript
75
- import { configure, trace } from "retrace-sdk";
76
-
77
- configure({ apiKey: "rt_..." });
78
-
79
- const myAgent = trace(async (prompt: string) => {
133
+ const runAgent = trace(async (prompt: string) => {
80
134
  const plan = await planner(prompt);
81
135
  const result = await executor(plan);
82
136
  return summarize(result);
83
137
  }, { name: "my-agent", resumable: true });
84
138
  ```
85
139
 
86
- When you fork at any span in the dashboard, the SDK re-executes the entire function with modified input — not just one LLM call.
87
-
88
- ## Error Handling
89
-
90
- ```typescript
91
- import { RetraceError, RetraceAuthError, RetraceCreditsExhaustedError, RetraceRateLimitError, RetraceEnforcementError } from "retrace-sdk";
92
- ```
93
-
94
- Typed errors for auth failures, credit exhaustion, and rate limiting.
95
-
96
- ## Enforcement (Circuit Breakers)
140
+ When you fork at any span in the dashboard, the SDK re-executes the **entire** function with the modified input — not just one call. Every downstream step that depends on the change diverges.
97
141
 
98
- Hard ceilings that stop a runaway agent before the next call. Local limits are enforced offline (zero network); `serverEnforcement: true` also consults centrally-managed server policies.
142
+ ### Enforcement (circuit breakers)
99
143
 
100
144
  ```typescript
101
145
  import { configure, RetraceEnforcementError } from "retrace-sdk";
@@ -104,7 +148,7 @@ configure({
104
148
  apiKey: "rt_...",
105
149
  maxStepsPerRun: 50,
106
150
  maxUsdPerRun: 2.0,
107
- serverEnforcement: true, // optional: also consult server policies
151
+ serverEnforcement: true, // optional: also consult centrally-managed server policies
108
152
  });
109
153
 
110
154
  try {
@@ -114,11 +158,9 @@ try {
114
158
  }
115
159
  ```
116
160
 
117
- Precedence: explicit arg > env var (`RETRACE_MAX_STEPS_PER_RUN`, `RETRACE_MAX_TOKENS_PER_RUN`, `RETRACE_MAX_USD_PER_RUN`, `RETRACE_SERVER_ENFORCEMENT`) > unset. If the server check is unreachable, local limits still apply.
118
-
119
- ## Multi-Agent Context
161
+ Local ceilings are enforced offline (zero network). Precedence: explicit arg > env var (`RETRACE_MAX_STEPS_PER_RUN`, `RETRACE_MAX_TOKENS_PER_RUN`, `RETRACE_MAX_USD_PER_RUN`, `RETRACE_SERVER_ENFORCEMENT`) > unset. If the server check is unreachable, local limits still apply.
120
162
 
121
- Tag spans with an agent id/role so the dashboard can draw the agent topology and run inter-agent detectors:
163
+ ### Multi-agent context
122
164
 
123
165
  ```typescript
124
166
  import { withAgent } from "retrace-sdk";
@@ -128,9 +170,9 @@ await withAgent({ id: "planner", role: "planner" }, async () => {
128
170
  });
129
171
  ```
130
172
 
131
- ## Golden Cassettes (CI Regression Gates)
173
+ Tags spans so the dashboard can draw the agent topology and run inter-agent detectors (ping-pong, reasoning–action mismatch, task derailment).
132
174
 
133
- Record a run as a golden cassette and gate on it offline in CI with `retrace ci replay`:
175
+ ### Golden cassettes (CI regression gates)
134
176
 
135
177
  ```typescript
136
178
  import { writeGoldenCassette } from "retrace-sdk";
@@ -138,51 +180,36 @@ import { writeGoldenCassette } from "retrace-sdk";
138
180
  writeGoldenCassette("golden.json", { recorder });
139
181
  ```
140
182
 
141
- ## Sampling
183
+ Gate on it offline in CI with `retrace ci replay`.
184
+
185
+ ### Sampling
142
186
 
143
187
  ```typescript
144
- configure({ apiKey: "rt_...", sampleRate: 0.1 }); // Record 10% of traces
188
+ configure({ apiKey: "rt_...", sampleRate: 0.1 }); // record 10% of traces
145
189
  ```
146
190
 
147
- ## Changelog
148
-
149
- ### 0.13.0
150
-
151
- - **Multi-agent context** — `withAgent({ id, role })` tags spans for topology + inter-agent detectors
152
- - **Golden cassettes** — `writeGoldenCassette(path, { recorder })` records a run as a CI regression fixture
153
- - **Pre-call enforcement gate** — local step/token/USD-per-run ceilings enforced offline; `RetraceEnforcementError` thrown instead of silently skipping the call
191
+ ### Error handling
154
192
 
193
+ ```typescript
194
+ import {
195
+ RetraceError,
196
+ RetraceAuthError,
197
+ RetraceCreditsExhaustedError,
198
+ RetraceRateLimitError,
199
+ RetraceEnforcementError,
200
+ } from "retrace-sdk";
201
+ ```
155
202
 
156
- - **Sessions** `sessionId` option in `TraceRecorder` and `trace()` to group multi-turn conversations
157
- - **Multi-Agent** — `setAgentId()` on `SpanBuilder` for cross-agent tracing
158
- - **Guardrail support** — SDK respects HALT commands from server-side guardrail policies
159
-
160
- ### 0.2.2
161
-
162
- - **Fixed** — OpenAI interceptor no longer creates dummy client instance to find prototype
163
-
164
- ### 0.6.0
165
-
166
- - **Token ID capture** — Stores output token IDs + logprobs from OpenAI responses (enables speculative decoding during replay)
167
- - **SpanData extended** — New `token_ids` and `logprobs` fields on SpanData interface
168
- - **Shared schema** — SpanInputSchema updated with `token_ids` and `logprobs` optional arrays
169
-
170
- ### 0.2.1
171
-
172
- - **Offline buffer** — stores up to 1000 messages when WebSocket disconnects, flushes on reconnect
173
- - **HTTP retry** — 3 attempts with exponential backoff on fallback transport
174
- - **Cascade replay** — `resumable: true` option registers function for SDK-level re-execution
175
- - **Resume listener** — handles server 'resume' commands for fork replay
176
-
177
- ### 0.2.0
203
+ Typed errors for auth failures, credit exhaustion, rate limiting, and enforcement blocks. Transient transport problems never crash your agent.
178
204
 
179
- - Typed errors (RetraceAuthError, RetraceCreditsExhaustedError, RetraceRateLimitError)
180
- - Trace sampling via `sampleRate` config
181
- - Auto-instrumentation for OpenAI, Anthropic, Gemini
182
- - WebSocket + HTTP fallback transport
205
+ ---
183
206
 
184
207
  ## Links
185
208
 
186
- - [Documentation](https://retraceai.tech/docs)
187
- - [GitHub](https://github.com/yash1511-bogam/retrace)
209
+ - [Documentation](https://docs.retraceai.tech)
210
+ - [GitHub](https://github.com/yash1511-bogam/retrace-sdk)
188
211
  - [npm](https://www.npmjs.com/package/retrace-sdk)
212
+
213
+ ## License
214
+
215
+ MIT
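The enforcement recipe in the README hunk above states a precedence of explicit argument > environment variable > unset. A minimal sketch of that rule follows, under stated assumptions: `resolveCeiling` is a hypothetical helper, not part of the `retrace-sdk` API, and treating an unparsable environment value as unset is a guess about behavior the diff does not show.

```typescript
// Hypothetical helper, for illustration only; not exported by retrace-sdk.
// Resolves one ceiling (e.g. max steps per run) from its two possible sources.
function resolveCeiling(
  explicit: number | undefined,
  envValue: string | undefined,
): number | undefined {
  // 1. An explicit configure() argument always wins.
  if (explicit !== undefined) return explicit;

  // 2. Otherwise fall back to the environment variable, if it parses as a number.
  //    (Assumption: a blank or non-numeric value is treated the same as unset.)
  if (envValue !== undefined && envValue.trim() !== "") {
    const parsed = Number(envValue);
    if (Number.isFinite(parsed)) return parsed;
  }

  // 3. Unset: no ceiling is applied.
  return undefined;
}

// Example call shape (in Node):
//   resolveCeiling(options.maxStepsPerRun, process.env.RETRACE_MAX_STEPS_PER_RUN)
```

Returning `undefined` for the unset case keeps "no ceiling" distinct from a ceiling of `0`, which would block every call.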
@@ -280,8 +280,13 @@ function createPatchedCreate() {
280
280
  // Wrap provider errors in typed Retrace exceptions for user-facing clarity
281
281
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
282
282
  const status = err?.status || err?.response?.status;
283
+ // openai v5+ makes APIError.headers a Web `Headers` instance (use .get()); pre-v5 it was a
284
+ // plain record (bracket access). Support both so retry-after is honored across the peer range.
285
+ // eslint-disable-next-line @typescript-eslint/no-explicit-any
286
+ const eh = err?.headers;
287
+ const retryAfter = (eh && typeof eh.get === "function" ? eh.get("retry-after") : eh?.["retry-after"]) || "60";
283
288
  if (status === 429)
284
- throw new RetraceRateLimitError(parseInt(err?.headers?.["retry-after"] || "60", 10));
289
+ throw new RetraceRateLimitError(parseInt(retryAfter, 10));
285
290
  if (status === 401 || status === 403)
286
291
  throw new RetraceAuthError(`OpenAI auth failed: ${err.message}`);
287
292
  if (err?.message?.includes("ECONNREFUSED") || err?.message?.includes("fetch failed")) {
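The hunk above reads `retry-after` from either a Web `Headers` instance or a plain record. The same dual-shape lookup can be sketched as a standalone function; `readRetryAfterSeconds` and `HeaderLike` are hypothetical names introduced here. One deliberate difference from the hunk: this sketch also falls back to the default when the header is not an integer (for example an HTTP-date), whereas the SDK code passes the raw value straight to `parseInt`.

```typescript
// Hypothetical names, for illustration only; not part of retrace-sdk.
interface HeaderLike {
  get(name: string): string | null;
}

// Duck-type check: anything with a callable .get() is treated like Web Headers.
function isHeaderLike(value: unknown): value is HeaderLike {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { get?: unknown }).get === "function"
  );
}

function readRetryAfterSeconds(headers: unknown, fallbackSeconds = 60): number {
  let raw: unknown;
  if (isHeaderLike(headers)) {
    // Web Headers shape: lookup goes through .get().
    raw = headers.get("retry-after");
  } else if (typeof headers === "object" && headers !== null) {
    // Plain-record shape: bracket access on a lower-cased key.
    raw = (headers as Record<string, unknown>)["retry-after"];
  }
  const parsed = typeof raw === "string" ? parseInt(raw, 10) : NaN;
  // Addition in this sketch: non-numeric values use the fallback instead of NaN.
  return Number.isFinite(parsed) ? parsed : fallbackSeconds;
}
```

Checking for a callable `get` rather than using `instanceof Headers` matches the approach in the hunk and avoids depending on which `Headers` implementation the provider SDK bundles.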
package/dist/telemetry.js CHANGED
@@ -14,7 +14,7 @@ import { getConfig } from "./config.js";
14
14
  const ANON_ID = Math.random().toString(16).slice(2, 18);
15
15
  const DISABLED = new Set(["0", "false", "no", "off"]);
16
16
  // Keep in sync with package.json version.
17
- const SDK_VERSION = "0.16.0";
17
+ const SDK_VERSION = "0.16.1";
18
18
  function enabled() {
19
19
  return !DISABLED.has((process.env.RETRACE_TELEMETRY ?? "1").trim().toLowerCase());
20
20
  }
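The `enabled()` context lines above imply a small truth table for `RETRACE_TELEMETRY`. A sketch that takes the raw value as a parameter (so it runs without touching `process.env`) makes the edge cases visible; `telemetryEnabled` is a hypothetical name, while the set contents and the normalization are copied from the hunk.

```typescript
// Same opt-out values as the DISABLED set in the hunk above.
const DISABLED = new Set(["0", "false", "no", "off"]);

// Hypothetical wrapper: identical logic, but the env value is passed in.
function telemetryEnabled(raw: string | undefined): boolean {
  // Unset defaults to "1"; whitespace and letter case are ignored.
  return !DISABLED.has((raw ?? "1").trim().toLowerCase());
}

telemetryEnabled(undefined); // true: the variable is unset
telemetryEnabled(" OFF ");   // false: trimmed and lower-cased to "off"
telemetryEnabled("");        // true: an empty string is not in the opt-out set
```

Only the four listed values opt out, so a value such as `disabled` or an empty string leaves telemetry on.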
package/dist/transport.js CHANGED
@@ -2,7 +2,7 @@ import { getConfig } from "./config.js";
2
2
  import { classifyServerSignal } from "./errors.js";
3
3
  // Client identifier sent on every request so the backend can attribute SDK usage/version.
4
4
  // Keep in sync with package.json on release.
5
- const CLIENT_ID = "typescript-sdk/0.16.0";
5
+ const CLIENT_ID = "typescript-sdk/0.16.1";
6
6
  // ─── Runtime-agnostic WebSocket ──────────────────────────────────────────────
7
7
  // Prefer the global Web `WebSocket` (Node 20+, Bun, Deno, browsers, and every edge runtime); fall
8
8
  // back to the OPTIONAL `ws` package only on older Node that lacks a global. Both expose the standard
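The comment in the hunk above describes preferring the global `WebSocket` and falling back to the optional `ws` package. The selection step might look roughly like the sketch below. `pickWebSocket` and `WebSocketCtor` are hypothetical names, and the fallback loader is injected so the function stays runtime-neutral; how the SDK actually loads `ws` is not visible in this diff.

```typescript
// Hypothetical names, for illustration only; not part of retrace-sdk.
type WebSocketCtor = new (url: string) => unknown;

async function pickWebSocket(
  scope: { WebSocket?: unknown },
  loadFallback: () => Promise<WebSocketCtor>,
): Promise<WebSocketCtor> {
  // Prefer the runtime's own WebSocket when one is defined on the global scope.
  if (typeof scope.WebSocket === "function") {
    return scope.WebSocket as WebSocketCtor;
  }
  // Otherwise defer to the caller-supplied loader. It only runs on this path,
  // so the optional dependency is never imported when a global exists.
  return loadFallback();
}

// Example call shape in Node, assuming the optional `ws` package is installed:
//   const WS = await pickWebSocket(globalThis, async () => (await import("ws")).default);
```

Passing the loader as a function keeps the dynamic import lazy, which matters on edge runtimes where `ws` cannot be resolved at all.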
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "retrace-sdk",
3
- "version": "0.16.0",
3
+ "version": "0.16.1",
4
4
  "description": "The execution replay engine for AI agents. Record, replay, fork, and share agent executions.",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -57,10 +57,10 @@
57
57
  "ws": "^8.20.1"
58
58
  },
59
59
  "peerDependencies": {
60
- "@google/genai": ">=1.52.0",
61
- "openai": ">=4.0.0",
62
60
  "@anthropic-ai/sdk": ">=0.30.0",
63
- "@langchain/core": ">=0.3.0"
61
+ "@google/genai": ">=1.52.0",
62
+ "@langchain/core": ">=0.3.0",
63
+ "openai": ">=4.0.0"
64
64
  },
65
65
  "peerDependenciesMeta": {
66
66
  "@google/genai": {
@@ -77,11 +77,11 @@
77
77
  }
78
78
  },
79
79
  "devDependencies": {
80
- "@google/genai": "^1.52.0",
81
- "@types/node": "22.15.3",
80
+ "@anthropic-ai/sdk": "^0.105.0",
81
+ "@google/genai": "^2.9.0",
82
+ "@types/node": "24.13.2",
82
83
  "@types/ws": "8.18.1",
83
- "typescript": "6.0.3",
84
- "openai": "^4.90.0",
85
- "@anthropic-ai/sdk": "^0.95.0"
84
+ "openai": "^6.44.0",
85
+ "typescript": "6.0.3"
86
86
  }
87
87
  }