@heystack/otel 0.9.2 → 0.10.0

package/README.md CHANGED
@@ -22,7 +22,7 @@ HEYSTACK_API_KEY=sk_live_…
  | Next.js — **any** deploy target (Vercel/Node **and** Cloudflare/OpenNext) | `@heystack/otel/next` | `registerHeystack` in `instrumentation.ts`. Auto-detects Node vs Cloudflare workerd and picks the right exporter. No-op on Edge. |
  | Standalone Cloudflare Workers (hand-written `export default { fetch }`) | `@heystack/otel/workers` | `instrument()` wraps your handler. Fetch-based exporter, flushes via `ctx.waitUntil`. |
  | Node / Express / Fastify / NestJS (long-running server) | `@heystack/otel/node` | `initHeystack`: auto-instrumentations + graceful shutdown. |
- | Browser (SPA / any web frontend) | `@heystack/otel/web` | `instrumentWeb`: session replay + W3C `traceparent` on outgoing fetch. No-op on the server (SSR-safe). |
+ | Browser (SPA / any web frontend) | `@heystack/otel/web` | `instrumentWeb`: session replay + opt-in browser distributed tracing (`tracing: true`) that emits CLIENT spans + propagates W3C `traceparent` (browser→API shows as one trace). No-op on the server (SSR-safe). |
  | Anywhere (pure helpers) | `@heystack/otel` | `buildExporterConfig`, types. No Node SDK loaded. |

  ## Node / Express / etc.
@@ -306,11 +306,16 @@ The default export still needs to be wrapped with `instrument()` (or `initHeysta

  For a browser frontend (any SPA / web app), `instrumentWeb` records **session replay** and injects a W3C `traceparent` header on outgoing `fetch` calls, so replays correlate with the backend traces they triggered. It is a **no-op on the server** (SSR-safe), so it's safe to call from code that also runs during server rendering.

+ The rrweb recorder **ships inside this package** — there is nothing else to install. Uploads go cross-origin to the Heystack ingest endpoint and **work out of the box** (no CORS configuration on your side).
+
  ```ts
  import { instrumentWeb } from "@heystack/otel/web";

  const stop = await instrumentWeb({
-   apiKey: process.env.HEYSTACK_API_KEY, // same ingest key as the rest of the SDK
+   // A BROWSER-exposed ingest key: it ships to the client, like an analytics
+   // write key. Use a public env var (below), ideally a dedicated key you can
+   // rotate independently of your server-side key. NOT your server secret.
+   apiKey: import.meta.env.VITE_HEYSTACK_API_KEY, // Vite; Next.js: process.env.NEXT_PUBLIC_HEYSTACK_API_KEY
    service: "my-web-app",
  });

@@ -320,17 +325,36 @@ stop();

  `instrumentWeb` returns a `stop()` function that ends the recording session.

+ **Where to call it.** It must run in the **browser**. In a Vite/CRA SPA, call it once in your client entry (`main.tsx`). In **Next.js (App Router)**, wrap it in a small `"use client"` component that calls it from a `useEffect` and mount that once in your root layout (server components can't call it). Recording is **server-gated**: nothing is captured until you enable replay for the app in the console, so it's safe to ship this before flipping the switch.
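Editor's sketch for the "Where to call it" note above. Because `instrumentWeb` is async and returns `stop()`, an effect cleanup can run before startup has resolved (for example when React re-runs effects in development). The helper below is a hypothetical wrapper, not part of `@heystack/otel`: `createOnceStarter`, `Stop`, and `Instrument` are names invented for illustration, and the React wiring is shown only in comments because it needs React and the SDK.

```typescript
// Hypothetical helper (NOT part of @heystack/otel): start an async instrument
// function at most once, and stop it safely even if startup is still pending.
type Stop = () => void;
type Instrument = () => Promise<Stop>;

function createOnceStarter(instrument: Instrument) {
  let pending: Promise<Stop> | undefined;
  return {
    /** Starts on the first call; later calls reuse the same pending promise. */
    start(): Promise<Stop> {
      if (!pending) pending = instrument();
      return pending;
    },
    /** Waits for startup (if any), then calls the returned stop() once. */
    async stop(): Promise<void> {
      if (!pending) return;
      const started = pending;
      pending = undefined; // allow a later start() to instrument again
      (await started)();
    },
  };
}

// Assumed wiring inside a "use client" component (comments only):
//   const replay = createOnceStarter(() =>
//     instrumentWeb({
//       apiKey: process.env.NEXT_PUBLIC_HEYSTACK_API_KEY!,
//       service: "my-web-app",
//     }));
//   useEffect(() => {
//     void replay.start();
//     return () => { void replay.stop(); };
//   }, []);
```

The point of holding the promise rather than the resolved `stop` is that cleanup never has to guess whether startup finished; it simply awaits the same promise.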
+
  ### Options

  | Option | Type | Notes |
  | --- | --- | --- |
- | `apiKey` | `string` | **Required.** The same Heystack ingest key used by the rest of the SDK. |
+ | `apiKey` | `string` | **Required.** A Heystack ingest key, **exposed to the browser** (public env var). Prefer a dedicated key you can rotate — not your server-side secret. |
  | `service` | `string` | **Required.** The OTel service name (matches the app's service in the console). |
  | `userId` | `string?` | Optional app-supplied identifier stamped on the session. |
  | `endpoint` | `string?` | Optional ingest endpoint override (defaults to the Heystack ingest endpoint). |
  | `sampleRate` | `number?` | Optional **local** override for the recording sample rate (0–1). By default sampling is controlled from the console. |
  | `flushIntervalMs` | `number?` | How often buffered events are flushed (default 5000ms). |
  | `flushEveryEvents` | `number?` | Max buffered events before an early flush (default 200). |
+ | `tracing` | `boolean?` | Opt in to **browser distributed tracing** (default off). Emits a real CLIENT span per outbound `fetch` and propagates W3C trace context, so browser→backend calls show as one connected trace + a service-map edge. Independent of replay. |
+ | `traceSampleRate` | `number?` | Head sample rate for browser tracing (0–1, default 1 when `tracing` is on). Lower it to cap span volume/cost on busy apps. |
+
+ ### Browser distributed tracing
+
+ By default `/web` only records session replay. Set `tracing: true` to also trace the browser: each outbound `fetch` becomes a CLIENT span, and the injected `traceparent` makes the downstream service's SERVER span its child — so a browser→API call renders as **one connected trace** and a **service-map edge** (`web → api`). This is separate from replay (it works even with replay off).
+
+ ```ts
+ await instrumentWeb({
+   apiKey: import.meta.env.VITE_HEYSTACK_API_KEY,
+   service: "my-web-app",
+   tracing: true,
+   traceSampleRate: 0.25, // sample 25% of requests — tune for cost
+ });
+ ```
+
+ It's **cost-aware and safe by design**: off unless you opt in; head-sampled (an unsampled request still propagates `traceparent` with the sampled flag cleared, so the backend makes the same keep/drop decision — no orphaned server spans); and the exporter posts through the *original* `fetch`, never tracing its own upload (no self-export loop). Spans post to `/v1/traces` cross-origin with no CORS setup on your side.
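Editor's sketch of the "sampled flag cleared" behaviour described above. `makeTraceparent` mirrors the 0.10.0 implementation shown later in this diff; `isSampled` is a hypothetical receiver-side helper added for illustration, and it assumes the backend honours the parent's sampled bit (as a parent-based sampler would). The trace and span ids are arbitrary example values.

```typescript
// W3C traceparent layout: version "-" trace-id "-" parent-id "-" trace-flags.
// The low bit of trace-flags is the "sampled" flag.
function makeTraceparent(traceId: string, spanId: string, sampled = true): string {
  return `00-${traceId}-${spanId}-${sampled ? "01" : "00"}`;
}

// Hypothetical helper: how a receiver could read the sampled bit.
function isSampled(traceparent: string): boolean {
  const parts = traceparent.split("-");
  if (parts.length !== 4) return false;
  const flags = parseInt(parts[3] ?? "", 16);
  return !Number.isNaN(flags) && (flags & 0x01) === 0x01;
}

const traceId = "4bf92f3577b34da6a3ce929d0e0e4736"; // 16 bytes, hex
const spanId = "00f067aa0ba902b7"; // 8 bytes, hex

const kept = makeTraceparent(traceId, spanId, true);
const dropped = makeTraceparent(traceId, spanId, false);

console.log(kept, isSampled(kept));
console.log(dropped, isSampled(dropped));
```

Both headers carry the same ids; only the final flags byte differs, which is what lets the downstream service reach the same keep/drop decision as the browser.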

  ### Sampling & masking come from the console

package/dist/web.d.ts CHANGED
@@ -65,11 +65,21 @@ export interface InstrumentWebOptions {
      flushIntervalMs?: number;
      /** Max buffered events before an early flush. */
      flushEveryEvents?: number;
+     /**
+      * Opt in to browser distributed tracing: emit a real CLIENT span per outbound
+      * fetch and propagate W3C trace context, so browser→backend calls show as one
+      * connected trace (and a service-map edge). Off by default — it adds span
+      * volume (backend cost). Independent of session replay.
+      */
+     tracing?: boolean;
+     /** Head sample rate for browser tracing (0–1, default 1 when `tracing` is on).
+      * Lower it to control span volume/cost on high-traffic apps. */
+     traceSampleRate?: number;
  }
  /** Entry point: fetch config, decide sampling, start rrweb, stream chunks.
   * Returns a stop() function. Safe to call in any browser; no-ops on the server. */
  export declare function instrumentWeb(opts: InstrumentWebOptions): Promise<() => void>;
- export declare function makeTraceparent(traceId: string, spanId: string): string;
+ export declare function makeTraceparent(traceId: string, spanId: string, sampled?: boolean): string;
  /** Collects trace ids observed during a session (deduped, capped). */
  export declare class TraceIdCollector {
      private readonly cap;
@@ -81,4 +91,49 @@ export declare class TraceIdCollector {
  /** Patch window.fetch to inject traceparent on outgoing calls and record the
   * trace id for correlation. Returns an unpatch function. */
  export declare function patchFetchForCorrelation(collector: TraceIdCollector): () => void;
+ /** One recorded browser CLIENT span (an outbound fetch). */
+ export interface BrowserClientSpan {
+     traceId: string;
+     spanId: string;
+     name: string;
+     startMs: number;
+     endMs: number;
+     method: string;
+     url: string;
+     /** HTTP response status; 0 for a network error / throw. */
+     statusCode: number;
+     error: boolean;
+ }
+ /** Build an OTLP/JSON ExportTraceServiceRequest for a batch of client spans. */
+ export declare function buildTraceExport(service: string, spans: BrowserClientSpan[]): Record<string, unknown>;
+ export interface TraceExporterOpts {
+     endpoint: string;
+     apiKey: string;
+     service: string;
+     /** MUST be the ORIGINAL fetch captured before patching, or the export POST
+      * self-traces and loops (cost guardrail #1). */
+     fetchImpl: typeof fetch;
+     maxBatch?: number;
+ }
+ /** Buffers browser CLIENT spans and POSTs them as OTLP/JSON to /v1/traces. */
+ export declare class BrowserTraceExporter {
+     private readonly o;
+     private buf;
+     constructor(o: TraceExporterOpts);
+     add(span: BrowserClientSpan): void;
+     flush(keepalive?: boolean): Promise<void>;
+ }
+ export interface TracingPatchOpts {
+     onSpan: (s: BrowserClientSpan) => void;
+     sampleRate: number;
+     /** Bare ingest hostname; calls to it are never traced (self-export loop guard). */
+     ingestHost: string;
+     /** Optional: also record the real trace id for replay↔trace correlation. */
+     collector?: TraceIdCollector;
+     rng?: () => number;
+ }
+ /** Patch window.fetch to emit a real CLIENT span per outbound call, inject that
+  * span's W3C traceparent, and (head-sampled) hand the finished span to onSpan.
+  * Returns an unpatch function. */
+ export declare function patchFetchForTracing(o: TracingPatchOpts): () => void;
  export {};
package/dist/web.js CHANGED
@@ -1,4 +1,8 @@
  import { DEFAULT_ENDPOINT } from "./core.js";
+ // safeHostname is pure string logic (no runtime imports) — safe in the browser
+ // bundle. Used to skip tracing our OWN telemetry POSTs (the self-export loop that
+ // caused the June 2026 cost incident); see CLAUDE.md → "Cost guardrails".
+ import { safeHostname } from "./self-span.js";
  /** Pure: decide once per session whether to record. rng defaults to Math.random. */
  export function shouldRecord(cfg, rng = Math.random) {
      if (!cfg.enabled)
@@ -93,10 +97,41 @@ export async function instrumentWeb(opts) {
      catch { /* offline / blocked - fall back to disabled */ }
      if (opts.sampleRate !== undefined)
          cfg = { ...cfg, sample_rate: opts.sampleRate };
-     // 2. Session-level sampling decision.
+     // Capture the ORIGINAL fetch before any patching — telemetry exporters MUST use
+     // it so their own POSTs to the ingest endpoint aren't traced/looped (guardrail #1).
+     const originalFetch = fetch.bind(globalThis);
+     const ingestHost = safeHostname(endpoint);
+     const traces = new TraceIdCollector();
+     // 2. Browser distributed tracing (opt-in) — INDEPENDENT of replay sampling. Emits
+     // a real CLIENT span per outbound fetch + propagates W3C context to the backend.
+     let stopTracing = () => { };
+     let fetchPatched = false;
+     if (opts.tracing) {
+         const exporter = new BrowserTraceExporter({
+             endpoint, apiKey: opts.apiKey, service: opts.service, fetchImpl: originalFetch,
+         });
+         const unpatchTrace = patchFetchForTracing({
+             onSpan: (s) => exporter.add(s),
+             sampleRate: opts.traceSampleRate ?? 1,
+             ingestHost,
+             collector: traces, // real trace ids also tag the replay session (better correlation)
+         });
+         fetchPatched = true;
+         const traceFlush = setInterval(() => void exporter.flush(), opts.flushIntervalMs ?? DEFAULT_FLUSH_MS);
+         const onHideTrace = () => { if (document.visibilityState === "hidden")
+             void exporter.flush(true); };
+         document.addEventListener("visibilitychange", onHideTrace);
+         window.addEventListener("pagehide", () => void exporter.flush(true));
+         stopTracing = () => {
+             clearInterval(traceFlush);
+             document.removeEventListener("visibilitychange", onHideTrace);
+             unpatchTrace();
+             void exporter.flush(true);
+         };
+     }
+     // 3. Session replay — gated on the replay sampling decision (independent of tracing).
      if (!shouldRecord(cfg))
-         return () => { };
-     // 3. Start the recorder.
+         return () => { stopTracing(); };
      const { record } = await import("rrweb");
      // An element marked `data-hs-unmask` (or any descendant of one) is recorded in
      // cleartext; everything else is masked. rrweb 2.0.1's real opt-out hooks are
@@ -109,12 +144,11 @@ export async function instrumentWeb(opts) {
      const maskInputFn = (text, element) => reveal(text, element) ? text : "*".repeat(text.length);
      const maskTextFn = (text, element) => reveal(text, element) ? text : "*".repeat(text.length);
      const sessionId = crypto.randomUUID();
-     const transport = new ReplayTransport({ endpoint, apiKey: opts.apiKey, sessionId });
-     // Install the fetch patch AFTER the transport is constructed so the recorder's
-     // own upload POSTs (which captured the original fetch at construction time via
-     // fetch.bind(globalThis)) are not self-traced.
-     const traces = new TraceIdCollector();
-     const unpatch = patchFetchForCorrelation(traces);
+     const transport = new ReplayTransport({ endpoint, apiKey: opts.apiKey, fetchImpl: originalFetch, sessionId });
+     // Only patch fetch for replay correlation if tracing didn't already patch it
+     // (tracing's patch injects real context AND feeds `traces`). Avoids double-wrapping
+     // window.fetch. Uses the original fetch for uploads (self-span suppression).
+     const unpatch = fetchPatched ? () => { } : patchFetchForCorrelation(traces);
      let buffer = [];
      let errorCount = 0;
      const browser = navigator.userAgent;
@@ -159,6 +193,7 @@ export async function instrumentWeb(opts) {
          stopRecord?.();
          unpatch();
          flush(true);
+         stopTracing();
      };
  }
  function matchMediaDevice() {
@@ -171,8 +206,11 @@ function randHex(bytes) {
      crypto.getRandomValues(a);
      return Array.from(a, (b) => b.toString(16).padStart(2, "0")).join("");
  }
- export function makeTraceparent(traceId, spanId) {
-     return `00-${traceId}-${spanId}-01`;
+ export function makeTraceparent(traceId, spanId, sampled = true) {
+     // The trace-flags byte's low bit is "sampled". When head sampling drops a
+     // request we still inject the context (00) so the downstream service makes the
+     // SAME keep/drop decision — coordinated sampling, no orphaned server spans.
+     return `00-${traceId}-${spanId}-${sampled ? "01" : "00"}`;
  }
  /** Collects trace ids observed during a session (deduped, capped). */
  export class TraceIdCollector {
@@ -209,3 +247,117 @@ export function patchFetchForCorrelation(collector) {
      });
      return () => { window.fetch = orig; };
  }
+ const kvStr = (key, value) => ({ key, value: { stringValue: value } });
+ const kvInt = (key, n) => ({ key, value: { intValue: String(n) } });
+ const msToNano = (ms) => `${Math.trunc(ms)}000000`;
+ function spanToOtlp(s) {
+     const attributes = [
+         kvStr("http.request.method", s.method),
+         kvStr("url.full", s.url),
+         kvStr("server.address", safeHostname(s.url)),
+     ];
+     if (s.statusCode > 0)
+         attributes.push(kvInt("http.response.status_code", s.statusCode));
+     return {
+         traceId: s.traceId,
+         spanId: s.spanId,
+         name: s.name,
+         kind: 3, // SPAN_KIND CLIENT
+         startTimeUnixNano: msToNano(s.startMs),
+         endTimeUnixNano: msToNano(s.endMs),
+         attributes,
+         status: { code: s.error ? 2 : 1 }, // STATUS_CODE ERROR : OK
+     };
+ }
+ /** Build an OTLP/JSON ExportTraceServiceRequest for a batch of client spans. */
+ export function buildTraceExport(service, spans) {
+     return {
+         resourceSpans: [
+             {
+                 resource: { attributes: [kvStr("service.name", service)] },
+                 scopeSpans: [{ scope: { name: "@heystack/otel/web" }, spans: spans.map(spanToOtlp) }],
+             },
+         ],
+     };
+ }
+ /** Buffers browser CLIENT spans and POSTs them as OTLP/JSON to /v1/traces. */
+ export class BrowserTraceExporter {
+     o;
+     buf = [];
+     constructor(o) {
+         this.o = o;
+     }
+     add(span) {
+         this.buf.push(span);
+         if (this.buf.length >= (this.o.maxBatch ?? 50))
+             void this.flush();
+     }
+     async flush(keepalive = false) {
+         if (this.buf.length === 0)
+             return;
+         const spans = this.buf;
+         this.buf = [];
+         const body = JSON.stringify(buildTraceExport(this.o.service, spans));
+         await this.o
+             .fetchImpl(`${this.o.endpoint}/v1/traces`, {
+                 method: "POST",
+                 keepalive,
+                 headers: { authorization: `Bearer ${this.o.apiKey}`, "content-type": "application/json" },
+                 body,
+             })
+             .catch(() => { });
+     }
+ }
+ function fetchUrl(input) {
+     if (typeof input === "string")
+         return input;
+     if (input instanceof URL)
+         return input.href;
+     return input.url ?? "";
+ }
+ function fetchMethod(input, init) {
+     const m = init?.method ?? (input instanceof Request ? input.method : undefined) ?? "GET";
+     return m.toUpperCase();
+ }
+ function pathOf(url) {
+     try {
+         return new URL(url).pathname;
+     }
+     catch {
+         return url;
+     }
+ }
+ /** Patch window.fetch to emit a real CLIENT span per outbound call, inject that
+  * span's W3C traceparent, and (head-sampled) hand the finished span to onSpan.
+  * Returns an unpatch function. */
+ export function patchFetchForTracing(o) {
+     if (typeof window === "undefined" || !window.fetch)
+         return () => { };
+     const orig = window.fetch.bind(window);
+     const rng = o.rng ?? Math.random;
+     window.fetch = ((input, init) => {
+         const url = fetchUrl(input);
+         // Never trace our own telemetry POSTs (trace export, replay upload, config) —
+         // they'd create spans that export → re-trace → loop. The exporter already uses
+         // the original fetch; this is the belt-and-suspenders host-match guard.
+         if (o.ingestHost && safeHostname(url) === o.ingestHost)
+             return orig(input, init);
+         const sampled = rng() < o.sampleRate;
+         const traceId = randHex(16);
+         const spanId = randHex(8);
+         if (sampled && o.collector)
+             o.collector.add(traceId);
+         const headers = new Headers(init?.headers ?? (input instanceof Request ? input.headers : undefined));
+         if (!headers.has("traceparent"))
+             headers.set("traceparent", makeTraceparent(traceId, spanId, sampled));
+         const method = fetchMethod(input, init);
+         const startMs = Date.now();
+         const emit = (statusCode, error) => {
+             if (!sampled)
+                 return;
+             o.onSpan({ traceId, spanId, name: `${method} ${pathOf(url)}`, startMs, endMs: Date.now(), method, url, statusCode, error });
+         };
+         return orig(input, { ...init, headers }).then((res) => { emit(res.status, res.status >= 400); return res; }, (err) => { emit(0, true); throw err; });
+     });
+     return () => { window.fetch = orig; };
+ }
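Editor's note on `msToNano` in the hunk above: it builds the nanosecond timestamp by appending six zeros to the millisecond integer rather than multiplying. A likely reason (an inference, not stated in the package) is that epoch nanoseconds exceed `Number.MAX_SAFE_INTEGER`, and OTLP/JSON carries 64-bit integers as decimal strings anyway. The sketch below reproduces the one-liner and compares it with naive multiplication; the timestamp is an arbitrary example value.

```typescript
// Same approach as the diff: string concatenation keeps every digit exact.
const msToNano = (ms: number): string => `${Math.trunc(ms)}000000`;

const ms = 1700000000123; // example epoch-millisecond timestamp
const viaString = msToNano(ms);

// Naive multiplication leaves the range where doubles represent every integer.
const viaFloat = ms * 1e6;

console.log(viaString); // 1700000000123000000
console.log(Number.isSafeInteger(ms)); // true: milliseconds fit comfortably
console.log(Number.isSafeInteger(viaFloat)); // false: nanoseconds do not
```

Fractional milliseconds (for example from `performance.now()`) are truncated by `Math.trunc`, so this form has millisecond resolution even though the unit is nanoseconds.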
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@heystack/otel",
-   "version": "0.9.2",
+   "version": "0.10.0",
    "description": "Runtime-aware OpenTelemetry tracing that exports to Heystack (Node, Next.js, Workers).",
    "license": "MIT",
    "type": "module",