@heystack/otel 0.4.3 → 0.6.0

package/README.md CHANGED
@@ -85,32 +85,120 @@ export default instrument(
85
85
  return new Response("ok");
86
86
  },
87
87
  },
88
- { service: "my-worker" }, // apiKey defaults to env.HEYSTACK_API_KEY
88
+ {
89
+ service: "my-worker", // apiKey defaults to env.HEYSTACK_API_KEY
90
+ getUser: (req) => ({
91
+ id: req.headers.get("x-user-id") ?? undefined,
92
+ }),
93
+ instrumentBindings: true, // auto-trace D1/KV/R2/Vectorize
94
+ },
89
95
  );
90
96
  ```
91
97
 
92
- As of **0.3.0** `instrument()` registers the **global** tracer provider and creates the per-request SERVER span via the global tracer, so nested spans created through the global `trace.getTracer()` API (framework/library/manual) also export — you get a trace tree, not a lone SERVER span.
98
+ `instrument()` must be the **outermost** wrapper if other middleware also wraps the handler, so the request span covers everything inside:
99
+
100
+ ```ts
101
+ export default instrument(withOtherMiddleware(worker), { service: "my-worker" });
102
+ ```
103
+
104
+ Set the key as a secret: `wrangler secret put HEYSTACK_API_KEY`.
93
105
 
94
- > **Requires `nodejs_compat` on workerd.** As of **0.3.2** the SDK registers an OpenTelemetry **ContextManager** at init (see below), backed by `AsyncLocalStorage` from `node:async_hooks`. On Cloudflare Workers that means your `wrangler.toml` must enable the Node.js compatibility flag:
106
+ > **Requires `nodejs_compat` on workerd.** The SDK uses `AsyncLocalStorage` for per-request context isolation. Add to `wrangler.toml`:
95
107
  > ```toml
96
108
  > compatibility_flags = ["nodejs_compat"]
97
109
  > ```
98
- > If `node:async_hooks` is unavailable, the SDK transparently falls back to a synchronous stack-based ContextManager (no extra dependency) — suppression still works, but cross-`await` parent linking and per-request context isolation degrade to best-effort.
110
+ > Without it the SDK falls back to a synchronous stack-based context manager — suppression still works, but cross-`await` span parenting and per-request isolation degrade to best-effort.
111
+
112
+ ### `WorkersConfig` options
99
113
 
100
- ### Why a ContextManager (0.3.2)
114
+ | Option | Type | Notes |
115
+ | --- | --- | --- |
116
+ | `service` | `string` | **Required.** Service name that appears in the Heystack console. |
117
+ | `apiKey` | `string?` | Defaults to `env.HEYSTACK_API_KEY`. |
118
+ | `getUser` | `(req: Request) => { id?, session?, requestId? } \| undefined` | Called per request. `id` → `enduser.id`, `session` → `session.id`, `requestId` → `http.request.id` (falls back to the `cf-ray` header). |
119
+ | `instrumentBindings` | `boolean \| string[]` | `true` = auto child spans for all detected D1/KV/R2/Vectorize bindings; `string[]` = only the named bindings. Default `false`. |
120
+ | `sampling` | `{ rate?: number }` | Head-sampling rate 0–1. `1` (default) keeps everything. `0.2` keeps ~20% of fresh root traces. Parent-respecting: a request that arrives with an inbound sampled `traceparent` is always recorded regardless of the rate. See [Head sampling](#head-sampling) below. |
121
+ | `waitUntil` | `(p: Promise<unknown>) => void` | Override the isolate keep-alive hook; defaults to the auto-detected `ctx.waitUntil`. |
122
+ | `endpoint` | `string?` | Override the ingest endpoint (advanced). |
101
123
 
102
- `context.with(...)` in OpenTelemetry is a **no-op unless a ContextManager is registered** with the global API. Before 0.3.2 the Workers path registered only a tracer provider, so `suppressTracing()` — the primary defence against the self-trace feedback loop — silently did nothing in production (the exporter's own `POST /v1/traces` could be re-traced by host fetch auto-instrumentation, looping). As of **0.3.2** the SDK registers a ContextManager exactly once at init. With `AsyncLocalStorageContextManager` (the default, on Node and on workerd under `nodejs_compat`) you also get **cross-`await` parent→child span linking** and **per-request context isolation** — concurrent requests no longer share or clobber the active span.
124
+ ### Head sampling
103
125
 
104
- `instrument()` must be the **outermost** wrapper if other middleware also wraps the handler, so the request span covers everything inside:
126
+ `sampling: { rate }` performs **head sampling**: the keep/drop decision is made at the start of each fresh root trace, before any export. Traces that are dropped never leave the worker (no egress, no ingest cost, no storage). Traces that are kept are exported in full.
105
127
 
106
128
  ```ts
107
- export default instrument(withOtherMiddleware(worker), { service: "my-worker" });
129
+ export default instrument(worker, {
130
+ service: "my-api",
131
+ sampling: { rate: 0.2 }, // keep ~20% of traffic
132
+ });
108
133
  ```
109
134
 
110
- Set the key as a secret: `wrangler secret put HEYSTACK_API_KEY`.
135
+ The sampler is **deterministic by trace ID**: the same trace always makes the same decision, so parent→child consistency is maintained when this worker is in a call chain sampled at the same rate by another service.
136
+
137
+ **Tradeoff:** head sampling can't guarantee keeping error traces — the drop/keep decision is made before the response status is known. If you need every error captured, keep `rate: 1` (the default). Heystack's ingest-side keep rules (error and slow retention) apply among traces that _are_ exported; they can't rescue a trace that was dropped in the worker before export.
138
+
139
+ The `rate` governs **fresh root traces only** (no inbound `traceparent`, or `traceparent` with `sampled=0`). A request arriving with a sampled inbound `traceparent` (`sampled=1`) is always recorded — the parent's decision is respected, so a distributed trace is never split mid-flight by the child's sample rate.
140
+
141
+ ### Automatic tracing
142
+
143
+ `instrument()` traces the following automatically, with no additional config:
144
+
145
+ - **Incoming requests (`fetch`)** — a SERVER span per request, carrying `http.request.method`, `url.full`, `url.path`, `server.address`, `http.response.status_code`, and `enduser.id`/`client.address`/geo attributes (see [Client enrichment](#client-enrichment) below). An inbound W3C `traceparent` header is continued so the client and server share one trace; a `traceparent` response header (+ `Access-Control-Expose-Headers: traceparent`) is set so downstream clients can read it.
146
+ - **Outbound `fetch`** — each outbound subrequest while a request span is active gets a CLIENT child span (`http.request.method`, `url.full`, `server.address`, `http.response.status_code`). A W3C `traceparent` header is injected into the subrequest so a downstream Heystack-instrumented service continues the same trace (distributed tracing across services). The exporter's own ingest POST is never traced.
147
+ - **Queue consumers (`queue`)** — a CONSUMER span per batch, with `messaging.destination.name` (queue name) and `messaging.batch.message_count`.
148
+ - **Scheduled handlers (`scheduled`)** — an INTERNAL span per invocation, with `controller.cron`.
149
+ - **Binding calls** (when `instrumentBindings` is set) — a child span for every D1 query (`db.statement`), KV read/write, R2 operation, and Vectorize query.
150
+
151
+ ### Client enrichment
152
+
153
+ These attributes are set automatically on every SERVER span from request metadata:
154
+
155
+ | Attribute | Source |
156
+ | --- | --- |
157
+ | `enduser.id` | `getUser(req).id` |
158
+ | `session.id` | `getUser(req).session` |
159
+ | `http.request.id` | `getUser(req).requestId` or `cf-ray` header |
160
+ | `client.address` | `CF-Connecting-IP` header |
161
+ | `geo.country`, `geo.region`, `geo.city`, `geo.asn` | Cloudflare `req.cf` object |
162
+
163
+ ### Manual spans: `withSpan` / `addEvent`
164
+
165
+ Inside a traced handler, add finer-grained spans without touching the OpenTelemetry API directly:
166
+
167
+ ```ts
168
+ import { instrument, withSpan, addEvent } from "@heystack/otel/workers";
169
+
170
+ // Inside a fetch handler:
171
+ const result = await withSpan("parse-payload", { "source": "body" }, async (span) => {
172
+ addEvent("parsing-started");
173
+ span.setAttribute("content-type", req.headers.get("content-type") ?? "");
174
+ return JSON.parse(await req.text());
175
+ });
176
+
177
+ // Without attrs:
178
+ const data = await withSpan("call-llm", async () => {
179
+ return callMyAI();
180
+ });
181
+ ```
182
+
183
+ `withSpan(name, attrs?, fn)` — runs `fn` inside a new child span parented to the currently-active span. The span is started before `fn` and ended in `finally`; exceptions are recorded on the span and re-thrown. The `fn` receives the live `Span` as its argument.
184
+
185
+ `addEvent(name, attrs?)` — adds a named event to the currently-active span. No-op when no span is active.
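
The lifecycle described for `withSpan` (start the span, run `fn`, record any exception, end in `finally`) and its optional `attrs` argument can be sketched as below. This is a minimal illustration, not the package's implementation: `withSpanSketch`, `SpanLike`, `FakeSpan`, and `finished` are hypothetical names, the span is an in-memory stand-in rather than an OpenTelemetry `Span`, and parenting to the currently-active span is not modelled.

```ts
// Minimal span surface used by this sketch (a small subset of an OTel Span).
interface SpanLike {
  setAttribute(key: string, value: string | number | boolean): void;
  recordException(err: Error): void;
  end(): void;
}
type Attrs = Record<string, string | number | boolean>;
type SpanFn<T> = (span: SpanLike) => Promise<T> | T;

// Hypothetical in-memory span so the sketch runs without a tracer provider.
class FakeSpan implements SpanLike {
  readonly attributes: Attrs = {};
  readonly exceptions: Error[] = [];
  ended = false;
  constructor(readonly name: string) {}
  setAttribute(key: string, value: string | number | boolean): void {
    this.attributes[key] = value;
  }
  recordException(err: Error): void {
    this.exceptions.push(err);
  }
  end(): void {
    this.ended = true;
  }
}

// Spans that have ended, in order; stands in for an exporter.
const finished: FakeSpan[] = [];

// Two call shapes: (name, fn) and (name, attrs, fn).
function withSpanSketch<T>(name: string, fn: SpanFn<T>): Promise<T>;
function withSpanSketch<T>(name: string, attrs: Attrs, fn: SpanFn<T>): Promise<T>;
async function withSpanSketch<T>(
  name: string,
  attrsOrFn: Attrs | SpanFn<T>,
  maybeFn?: SpanFn<T>,
): Promise<T> {
  // Resolve the optional middle argument.
  const attrs: Attrs = typeof attrsOrFn === "function" ? {} : attrsOrFn;
  const fn = typeof attrsOrFn === "function" ? attrsOrFn : maybeFn;
  if (!fn) throw new TypeError("withSpanSketch: missing callback");

  const span = new FakeSpan(name);
  for (const [key, value] of Object.entries(attrs)) span.setAttribute(key, value);
  try {
    return await fn(span);
  } catch (err) {
    // Record the failure on the span, then let the caller see the original error.
    span.recordException(err instanceof Error ? err : new Error(String(err)));
    throw err;
  } finally {
    // Runs on both success and failure, so the span always ends exactly once.
    span.end();
    finished.push(span);
  }
}
```

Calling `withSpanSketch("parse-payload", { source: "body" }, async () => 42)` resolves to `42` and leaves one ended span in `finished`. A callback that throws still leaves an ended span, with the error recorded, before the rejection reaches the caller.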
186
+
187
+ ### Why a ContextManager
188
+
189
+ `context.with(...)` in OpenTelemetry is a **no-op unless a ContextManager is registered** with the global API. Without one, `suppressTracing()` — the primary defence against the self-trace feedback loop — silently does nothing in production. As of **0.3.2** the SDK registers a ContextManager exactly once at init. With `AsyncLocalStorage` (workerd under `nodejs_compat`, Node) you also get **cross-`await` parent→child span linking** and **per-request context isolation** — concurrent requests no longer share or clobber the active span.
111
190
 
112
191
  As of **0.3.1** `instrument()` **forwards every other handler your Worker exports** — `queue`, `scheduled`, `tail`, etc. — untouched, so wrapping never drops a handler Cloudflare requires for deploy (it previously returned only `{ fetch }`, which broke Queue/Cron Workers). On top of forwarding, `queue` and `scheduled` are themselves traced when present: each gets a root span via the global tracer (`queue <queueName>` as a CONSUMER span with batch attributes; `scheduled <cron>` as an INTERNAL span with the cron attribute), flushed via `ctx.waitUntil` just like `fetch`.
113
192
 
193
+ ### Streaming responses & trace correlation (`/workers`)
194
+
195
+ `instrument()` keeps the SERVER span open until the response body finishes
196
+ streaming, so a streamed response's duration includes time-to-last-byte and the
197
+ span carries a `first_byte` event (time-to-first-byte). It also continues an
198
+ inbound W3C `traceparent` (so a client request and the server handler share one
199
+ trace) and returns a `traceparent` response header — plus
200
+ `Access-Control-Expose-Headers: traceparent` so browser clients can read it.
201
+
114
202
  ### Durable Objects are NOT covered by `instrument()`
115
203
 
116
204
  `instrument()` wraps the keys of the **default-export handler object** (`fetch`/`queue`/`scheduled`/… ). **Durable Objects are separate named class exports**, so spreading the handler object does not touch them — a DO's `fetch`/`alarm` methods run **untraced** even when your Worker's default export is wrapped.
@@ -204,6 +292,8 @@ As belt-and-suspenders the exporter also drops any span whose HTTP target points
204
292
 
205
293
  ## Migration / versioning
206
294
 
295
+ - **`0.6.0`** — **`/workers`: head sampling (`sampling: { rate }`).** New optional `WorkersConfig` field: `sampling.rate` (0–1, default `1`). Keeps a deterministic fraction of fresh root traces — the drop decision is made in the worker before export (no egress, no ingest cost). Parent-respecting: requests arriving with a sampled `traceparent` are always recorded. Consistent with server-side sampling (same trace-ID hash). No breaking changes; all new options are optional. See [Head sampling](#head-sampling).
296
+ - **`0.5.0`** — **`/workers`: identity enrichment, binding tracing, outbound-fetch tracing, manual span helpers.** New `WorkersConfig` options: `getUser` (attach `enduser.id`/`session.id`/`http.request.id` per request from a synchronous callback), `instrumentBindings` (auto child spans for D1/KV/R2/Vectorize — `true` = all detected, or a `string[]` to select). Outbound `fetch` calls made inside a traced handler automatically get CLIENT child spans with `traceparent` injection (distributed tracing across services). New ergonomic exports from `/workers`: `withSpan(name, attrs?, fn)` runs a function inside a named child span (auto-parented, exceptions recorded, `span.end()` in `finally`); `addEvent(name, attrs?)` adds an event to the active span. No breaking changes; all new options are optional.
207
297
  - **`0.4.3`** — **feedback-loop guard extended to the Node path (cost fix).** The self-instrumentation loop the Workers path fixed in 0.3.1/0.3.2 was still live on plain Node / Next-on-Node: the OTLP-over-`http` exporter's `POST /v1/traces` was auto-instrumented and re-exported, so ~77% of ingested spans in production were the exporter tracing itself — needless ingest + ClickHouse compute. 0.4.3 ignores ingest-host requests in the HTTP/undici auto-instrumentations **and** filters self-spans at the exporter boundary (covers caller-supplied `instrumentations` too). The hostname matcher is now a shared module used by both `/node` and `/workers`. No API change. **Action: upgrade and redeploy any Node/Next-on-Node app** — it cuts ingested span volume sharply.
208
298
  - **`0.3.5`** — **type-constraint fix (Workers).** A Worker whose `queue` consumer is typed with a concrete message body — `queue(batch: MessageBatch<MyJob>, …)`, the normal case — failed to compile against `instrument()` in 0.3.4 (`TS2345`: `MessageBatch<unknown>` not assignable to `MessageBatch<MyJob>`). The `WorkerHandler` constraint declared its entrypoints as arrow properties, whose parameters are checked **contravariantly** under `strictFunctionTypes`, so a narrowed handler wasn't assignable. 0.3.5 declares them with **method syntax** (bivariant parameters) and widens the batch to `MessageBatch<any>`, mirroring Cloudflare's own `ExportedHandler` — a typed-queue Worker now type-checks with a bare `instrument(handler, cfg)`. Runtime behaviour unchanged. Also adds a **consumer type-check gate** (`pnpm consumer-typecheck`, run in `check` and `prepublishOnly`) that compiles a fully-typed Worker against the built `dist` through the public `exports` map and asserts `satisfies ExportedHandler<Env>` — the regression that escaped in 0.3.3/0.3.4 now fails the build before publish.
209
299
  - **`0.3.4`** — **type-inference fix (Workers).** Restores `instrument()`'s ability to infer the handler's concrete `Env` type. In 0.3.3 the signature was `instrument<E = unknown, H extends WorkerHandler<E>>`, so `E` defaulted to `unknown` and was never recovered from the handler — under `strictFunctionTypes` a Worker typed `fetch(req, env: Env, ctx)` then failed to compile (`TS2345: 'Env' is not assignable to 'unknown'`) unless the caller passed `instrument<Env>(...)` explicitly. 0.3.4 infers `E` from the handler argument (`instrument<H extends WorkerHandler<any>>(...): Instrumented<EnvOf<H>, H>`), so a bare `instrument(handler, cfg)` type-checks again. Runtime behaviour is unchanged; no `0.3.3` consumer needs the explicit type arg after upgrading.
@@ -0,0 +1,33 @@
1
+ import { type Span } from "@opentelemetry/api";
2
+ export interface InstrumentBindingsOpts {
3
+ /**
4
+ * Factory that creates and starts a new child span. Called at binding method
5
+ * invocation time (inside the traced handler scope), so `context.active()`
6
+ * at that moment correctly parents to the root span.
7
+ *
8
+ * For integration: pass `(name, attrs) => tracer.startSpan(name, { attributes: attrs }, context.active())`.
9
+ * For unit tests: inject a fake so no global provider is required.
10
+ */
11
+ startSpan: (name: string, attrs: Record<string, unknown>) => Span;
12
+ /**
13
+ * `true` → auto-detect and wrap all D1/KV/R2/Vectorize bindings.
14
+ * `string[]` → only wrap bindings whose env key is listed.
15
+ */
16
+ select: boolean | string[];
17
+ }
18
+ /**
19
+ * Wrap an env object's Cloudflare bindings so that each binding operation
20
+ * emits a child span under the currently-active OTel context.
21
+ *
22
+ * Detects binding type by duck-typing (D1: `prepare`; KV: `get`+`put`+`list`;
23
+ * R2: `get`+`put`+`head`; Vectorize: `query`+`upsert`). Unrecognised bindings
24
+ * are passed through unchanged.
25
+ *
26
+ * Each wrapped binding is a `Proxy` over the original — non-wrapped prototype
27
+ * methods fall through to the real binding so no functionality is lost.
28
+ *
29
+ * @param env - The Worker env / binding bag.
30
+ * @param opts - `startSpan` factory + `select` filter.
31
+ * @returns A shallow copy of `env` with selected bindings replaced by proxies.
32
+ */
33
+ export declare function instrumentEnv<E extends Record<string, unknown>>(env: E, opts: InstrumentBindingsOpts): E;
@@ -0,0 +1,205 @@
1
+ // ---------------------------------------------------------------------------
2
+ // Cloudflare binding instrumentation for @heystack/otel/workers.
3
+ //
4
+ // Wraps D1, KV, R2, and Vectorize bindings with OTel child spans so that
5
+ // every binding operation is visible as a child of the active request span.
6
+ //
7
+ // WinterCG-safe: no `node:*` imports. Span factory is injected so the logic
8
+ // is pure and unit-testable without a global provider.
9
+ // ---------------------------------------------------------------------------
10
+ import { context, SpanStatusCode } from "@opentelemetry/api";
11
+ import { isTracingSuppressed } from "@opentelemetry/core";
12
+ // ---------------------------------------------------------------------------
13
+ // Duck-type detectors — conservative; require the distinctive method set
14
+ // exactly as documented in the task brief.
15
+ // ---------------------------------------------------------------------------
16
+ function isD1Like(b) {
17
+ return typeof b?.prepare === "function";
18
+ }
19
+ /** R2 is checked BEFORE KV because R2 also exposes `list`. */
20
+ function isR2Like(b) {
21
+ return (typeof b?.get === "function" &&
22
+ typeof b?.put === "function" &&
23
+ typeof b?.head === "function");
24
+ }
25
+ function isKVLike(b) {
26
+ return (typeof b?.get === "function" &&
27
+ typeof b?.put === "function" &&
28
+ typeof b?.list === "function");
29
+ }
30
+ function isVectorizeLike(b) {
31
+ return (typeof b?.query === "function" &&
32
+ typeof b?.upsert === "function");
33
+ }
34
+ // ---------------------------------------------------------------------------
35
+ // Span lifecycle helper
36
+ // ---------------------------------------------------------------------------
37
+ async function runWithSpan(span, fn) {
38
+ try {
39
+ return await fn();
40
+ }
41
+ catch (err) {
42
+ span.recordException(err instanceof Error ? err : new Error(String(err)));
43
+ span.setStatus({
44
+ code: SpanStatusCode.ERROR,
45
+ message: err instanceof Error ? err.message : String(err),
46
+ });
47
+ throw err;
48
+ }
49
+ finally {
50
+ span.end();
51
+ }
52
+ }
53
+ // ---------------------------------------------------------------------------
54
+ // Proxy helper — intercepts listed handlers, falls through to prototype for rest
55
+ // ---------------------------------------------------------------------------
56
+ function makeProxy(target, handlers) {
57
+ return new Proxy(target, {
58
+ get(t, prop, receiver) {
59
+ if (typeof prop === "string" && prop in handlers)
60
+ return handlers[prop];
61
+ const val = Reflect.get(t, prop, receiver);
62
+ return typeof val === "function" ? val.bind(t) : val;
63
+ },
64
+ });
65
+ }
66
+ // ---------------------------------------------------------------------------
67
+ // D1 wrappers
68
+ // ---------------------------------------------------------------------------
69
+ function wrapD1Statement(stmt, sql, opts) {
70
+ const wrapOp = (op) => async (...args) => {
71
+ if (isTracingSuppressed(context.active())) {
72
+ return stmt[op](...args);
73
+ }
74
+ const span = opts.startSpan(`D1 ${op}`, {
75
+ "db.system": "d1",
76
+ "db.statement": sql,
77
+ });
78
+ return runWithSpan(span, () => stmt[op](...args));
79
+ };
80
+ return makeProxy(stmt, {
81
+ all: wrapOp("all"),
82
+ first: wrapOp("first"),
83
+ run: wrapOp("run"),
84
+ raw: wrapOp("raw"),
85
+ bind(...bindArgs) {
86
+ // Return a wrapped statement so the sql propagates through bind chains.
87
+ return wrapD1Statement(stmt.bind(...bindArgs), sql, opts);
88
+ },
89
+ });
90
+ }
91
+ function wrapD1(binding, opts) {
92
+ return makeProxy(binding, {
93
+ prepare(sql) {
94
+ return wrapD1Statement(binding.prepare(sql), sql, opts);
95
+ },
96
+ });
97
+ }
98
+ // ---------------------------------------------------------------------------
99
+ // KV wrappers
100
+ // ---------------------------------------------------------------------------
101
+ function wrapKV(binding, opts, namespace) {
102
+ const wrapOp = (op, getKey) => async (...args) => {
103
+ if (isTracingSuppressed(context.active())) {
104
+ return binding[op](...args);
105
+ }
106
+ const attrs = { "kv.namespace": namespace };
107
+ const key = getKey?.(...args);
108
+ if (key !== undefined)
109
+ attrs["kv.key"] = key;
110
+ const span = opts.startSpan(`KV ${op}`, attrs);
111
+ return runWithSpan(span, () => binding[op](...args));
112
+ };
113
+ return makeProxy(binding, {
114
+ get: wrapOp("get", (key) => key),
115
+ put: wrapOp("put", (key) => key),
116
+ list: wrapOp("list"),
117
+ delete: wrapOp("delete", (key) => key),
118
+ });
119
+ }
120
+ // ---------------------------------------------------------------------------
121
+ // R2 wrappers
122
+ // ---------------------------------------------------------------------------
123
+ function wrapR2(binding, opts, bucket) {
124
+ const wrapOp = (op, getKey) => async (...args) => {
125
+ if (isTracingSuppressed(context.active())) {
126
+ return binding[op](...args);
127
+ }
128
+ const attrs = { "r2.bucket": bucket };
129
+ const key = getKey?.(...args);
130
+ if (key !== undefined)
131
+ attrs["r2.key"] = key;
132
+ const span = opts.startSpan(`R2 ${op}`, attrs);
133
+ return runWithSpan(span, () => binding[op](...args));
134
+ };
135
+ return makeProxy(binding, {
136
+ get: wrapOp("get", (key) => key),
137
+ put: wrapOp("put", (key) => key),
138
+ head: wrapOp("head", (key) => key),
139
+ delete: wrapOp("delete", (key) => key),
140
+ });
141
+ }
142
+ // ---------------------------------------------------------------------------
143
+ // Vectorize wrappers
144
+ // ---------------------------------------------------------------------------
145
+ function wrapVectorize(binding, opts, indexName) {
146
+ const wrapOp = (op) => async (...args) => {
147
+ if (isTracingSuppressed(context.active())) {
148
+ return binding[op](...args);
149
+ }
150
+ const span = opts.startSpan(`Vectorize ${op}`, {
151
+ "vectorize.index": indexName,
152
+ });
153
+ return runWithSpan(span, () => binding[op](...args));
154
+ };
155
+ return makeProxy(binding, {
156
+ query: wrapOp("query"),
157
+ upsert: wrapOp("upsert"),
158
+ insert: wrapOp("insert"),
159
+ deleteByIds: wrapOp("deleteByIds"),
160
+ getByIds: wrapOp("getByIds"),
161
+ });
162
+ }
163
+ // ---------------------------------------------------------------------------
164
+ // Main export
165
+ // ---------------------------------------------------------------------------
166
+ /**
167
+ * Wrap an env object's Cloudflare bindings so that each binding operation
168
+ * emits a child span under the currently-active OTel context.
169
+ *
170
+ * Detects binding type by duck-typing (D1: `prepare`; KV: `get`+`put`+`list`;
171
+ * R2: `get`+`put`+`head`; Vectorize: `query`+`upsert`). Unrecognised bindings
172
+ * are passed through unchanged.
173
+ *
174
+ * Each wrapped binding is a `Proxy` over the original — non-wrapped prototype
175
+ * methods fall through to the real binding so no functionality is lost.
176
+ *
177
+ * @param env - The Worker env / binding bag.
178
+ * @param opts - `startSpan` factory + `select` filter.
179
+ * @returns A shallow copy of `env` with selected bindings replaced by proxies.
180
+ */
181
+ export function instrumentEnv(env, opts) {
182
+ const result = { ...env };
183
+ const { select } = opts;
184
+ for (const key of Object.keys(env)) {
185
+ // Filter: if select is an array, only wrap keys in the list.
186
+ if (select !== true && !select.includes(key))
187
+ continue;
188
+ const binding = env[key];
189
+ if (isD1Like(binding)) {
190
+ result[key] = wrapD1(binding, opts);
191
+ }
192
+ else if (isR2Like(binding)) {
193
+ // R2 before KV — R2 also has `list`, so checking `head` first avoids mis-classifying.
194
+ result[key] = wrapR2(binding, opts, key);
195
+ }
196
+ else if (isKVLike(binding)) {
197
+ result[key] = wrapKV(binding, opts, key);
198
+ }
199
+ else if (isVectorizeLike(binding)) {
200
+ result[key] = wrapVectorize(binding, opts, key);
201
+ }
202
+ // Unrecognised bindings are left as-is.
203
+ }
204
+ return result;
205
+ }
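
The detection order in `instrumentEnv` matters because an R2 bucket also exposes `get`, `put`, and `list`, so it would pass the KV check if that ran first. The sketch below reproduces only the classification step, with hand-written stand-in bindings. `classifyBinding`, `hasMethods`, and the `fake*` objects are hypothetical names for illustration. Unlike the detectors above, it only accepts non-null objects.

```ts
type BindingKind = "d1" | "r2" | "kv" | "vectorize";

// True when `b` is an object exposing every listed name as a function.
function hasMethods(b: unknown, ...names: string[]): boolean {
  if (typeof b !== "object" || b === null) return false;
  const bag = b as Record<string, unknown>;
  return names.every((n) => typeof bag[n] === "function");
}

// Same check order as the dispatch above: D1, then R2, then KV, then Vectorize.
function classifyBinding(b: unknown): BindingKind | undefined {
  if (hasMethods(b, "prepare")) return "d1";
  if (hasMethods(b, "get", "put", "head")) return "r2"; // `head` distinguishes R2
  if (hasMethods(b, "get", "put", "list")) return "kv";
  if (hasMethods(b, "query", "upsert")) return "vectorize";
  return undefined; // unrecognised: would be passed through unchanged
}

// Hypothetical stand-ins; real bindings come from the Worker's env.
const noop = (): void => {};
const fakeR2 = { get: noop, put: noop, head: noop, list: noop, delete: noop };
const fakeKV = { get: noop, put: noop, list: noop, delete: noop };
const fakeD1 = { prepare: noop };
const fakeVectorize = { query: noop, upsert: noop };
```

With these stand-ins, `classifyBinding(fakeR2)` yields `"r2"` even though `fakeR2` also has `list`, while `classifyBinding(fakeKV)` yields `"kv"`. Swapping the R2 and KV checks would classify the bucket as KV.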
@@ -0,0 +1,13 @@
1
+ import { type Sampler, type SamplingResult } from "@opentelemetry/sdk-trace-base";
2
+ import type { Context, Attributes, Link, SpanKind } from "@opentelemetry/api";
3
+ export declare function fnv01(s: string): number;
4
+ export declare function traceKept(traceId: string, rate: number): boolean;
5
+ export declare class HeystackRatioSampler implements Sampler {
6
+ private readonly rate;
7
+ constructor(rate: number);
8
+ shouldSample(_ctx: Context, traceId: string, _name: string, _kind: SpanKind, _attrs: Attributes, _links: Link[]): SamplingResult;
9
+ toString(): string;
10
+ }
11
+ export declare function makeSampler(sampling?: {
12
+ rate?: number;
13
+ }): Sampler;
@@ -0,0 +1,42 @@
1
+ import { SamplingDecision, ParentBasedSampler, AlwaysOnSampler, } from "@opentelemetry/sdk-trace-base";
2
+ // FNV-1a 32-bit → [0,1). MUST stay byte-identical to apps/ingest/src/sampling.ts
3
+ // so the SDK and ingest agree on which traces to keep.
4
+ export function fnv01(s) {
5
+ let h = 0x811c9dc5;
6
+ for (let i = 0; i < s.length; i++) {
7
+ h ^= s.charCodeAt(i);
8
+ h = Math.imul(h, 0x01000193);
9
+ }
10
+ return (h >>> 0) / 0x100000000;
11
+ }
12
+ export function traceKept(traceId, rate) {
13
+ if (rate >= 1)
14
+ return true;
15
+ if (rate <= 0)
16
+ return false;
17
+ return fnv01(traceId) < rate;
18
+ }
19
+ export class HeystackRatioSampler {
20
+ rate;
21
+ constructor(rate) {
22
+ this.rate = rate;
23
+ }
24
+ shouldSample(_ctx, traceId, _name, _kind, _attrs, _links) {
25
+ return {
26
+ decision: traceKept(traceId, this.rate)
27
+ ? SamplingDecision.RECORD_AND_SAMPLED
28
+ : SamplingDecision.NOT_RECORD,
29
+ };
30
+ }
31
+ toString() {
32
+ return `HeystackRatioSampler{${this.rate}}`;
33
+ }
34
+ }
35
+ export function makeSampler(sampling) {
36
+ const rate = sampling?.rate;
37
+ if (rate === undefined || rate >= 1)
38
+ return new AlwaysOnSampler();
39
+ // ParentBased: a sampled/dropped PARENT (local or remote via traceparent) wins,
40
+ // so a distributed trace samples consistently; only the root uses the ratio.
41
+ return new ParentBasedSampler({ root: new HeystackRatioSampler(rate) });
42
+ }
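
A short usage sketch of the hash-based decision may help. `fnv01` and `traceKept` are copied from the hunk above so the block is self-contained. `syntheticTraceId` and `keptFraction` are hypothetical helpers added for illustration only. The sketch shows two properties that follow directly from `fnv01(traceId) < rate`: the decision for a given trace ID never changes, and raising the rate never drops a trace that a lower rate kept.

```ts
// Copied from the hunk above: FNV-1a 32-bit hash mapped into [0, 1).
function fnv01(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) / 0x100000000;
}

// Copied from the hunk above: keep a trace when its hash falls below the rate.
function traceKept(traceId: string, rate: number): boolean {
  if (rate >= 1) return true;
  if (rate <= 0) return false;
  return fnv01(traceId) < rate;
}

// Hypothetical helper: a 32-hex-character trace ID built from a counter.
function syntheticTraceId(n: number): string {
  return n.toString(16).padStart(32, "0");
}

// Hypothetical helper: fraction of `count` synthetic trace IDs kept at `rate`.
function keptFraction(rate: number, count: number): number {
  let kept = 0;
  for (let i = 1; i <= count; i++) {
    if (traceKept(syntheticTraceId(i), rate)) kept++;
  }
  return kept / count;
}

// Same ID, same answer: two services hashing the same trace ID at the same
// rate reach the same keep/drop decision without coordinating.
const sampleId = syntheticTraceId(12345);
const firstDecision = traceKept(sampleId, 0.2);
const secondDecision = traceKept(sampleId, 0.2);
```

`keptFraction(0.2, n)` gives an empirical check of how close the kept share is to the configured rate for a particular ID distribution. The result depends on how the IDs are generated, so no figure is claimed here.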
package/dist/workers.d.ts CHANGED
@@ -23,6 +23,11 @@ interface OtlpKeyValue {
23
23
  key: string;
24
24
  value: OtlpAnyValue;
25
25
  }
26
+ interface OtlpSpanEvent {
27
+ timeUnixNano: string;
28
+ name: string;
29
+ attributes: OtlpKeyValue[];
30
+ }
26
31
  interface OtlpSpan {
27
32
  traceId: string;
28
33
  spanId: string;
@@ -32,6 +37,7 @@ interface OtlpSpan {
32
37
  startTimeUnixNano: string;
33
38
  endTimeUnixNano: string;
34
39
  attributes: OtlpKeyValue[];
40
+ events: OtlpSpanEvent[];
35
41
  status: {
36
42
  code: number;
37
43
  message?: string;
@@ -54,6 +60,14 @@ interface OtlpTracesPayload {
54
60
  * hence the same resource, so we emit a single resourceSpans entry.
55
61
  */
56
62
  export declare function serializeSpans(spans: ReadableSpan[]): OtlpTracesPayload;
63
+ /** Parse a W3C `traceparent`. Returns null for malformed or all-zero ids. */
64
+ export declare function parseTraceparent(header: string | null): {
65
+ traceId: string;
66
+ spanId: string;
67
+ traceFlags: number;
68
+ } | null;
69
+ /** Add `value` to Access-Control-Expose-Headers without duplicating it. */
70
+ export declare function appendExposeHeader(headers: Headers, value: string): void;
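
The two declarations above only give signatures. The sketch below shows one way such helpers could behave, based on the W3C `traceparent` layout (`version-traceid-parentid-flags`, lowercase hex) and the doc comments. It is not the package's implementation: `parseTraceparentSketch`, `appendExposeHeaderSketch`, and `HeaderBag` are hypothetical names. Accepting only the four-field form and comparing header names case-insensitively are assumptions.

```ts
interface ParsedTraceparent {
  traceId: string;
  spanId: string;
  traceFlags: number;
}

// version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparentSketch(header: string | null): ParsedTraceparent | null {
  if (header === null) return null;
  const m = TRACEPARENT_RE.exec(header.trim());
  if (m === null) return null;
  const [, version, traceId, spanId, flags] = m;
  if (!version || !traceId || !spanId || !flags) return null;
  if (version === "ff") return null; // version ff is invalid in the W3C spec
  // All-zero IDs are explicitly invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, traceFlags: parseInt(flags, 16) };
}

// Minimal header surface; the platform `Headers` class has these two methods.
interface HeaderBag {
  get(name: string): string | null;
  set(name: string, value: string): void;
}

function appendExposeHeaderSketch(headers: HeaderBag, value: string): void {
  const name = "Access-Control-Expose-Headers";
  // Split the existing comma-separated list into trimmed, non-empty entries.
  const existing = (headers.get(name) ?? "")
    .split(",")
    .map((v) => v.trim())
    .filter((v) => v.length > 0);
  const already = existing.some((v) => v.toLowerCase() === value.toLowerCase());
  if (!already) headers.set(name, [...existing, value].join(", "));
}
```

In the parsed result, the lowest bit of `traceFlags` is the sampled flag, so `(traceFlags & 1) === 1` corresponds to the `sampled=1` case the README describes.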
57
71
  /**
58
72
  * Test-only helper: run the self-span attribute check directly against a plain
59
73
  * attribute bag + ingest hostname, without constructing a ReadableSpan. The
@@ -61,6 +75,12 @@ export declare function serializeSpans(spans: ReadableSpan[]): OtlpTracesPayload
61
75
  * the exporter derives via `safeHostname(cfg.url)`.
62
76
  */
63
77
  export declare function isSelfSpanForTest(attrs: Record<string, unknown>, ingestHost: string): boolean;
78
+ /**
79
+ * Reset the outbound-fetch instrumentation: restore the captured platform fetch
80
+ * (only when our wrapper is still the installed global) and clear the guard.
81
+ * Internal/testing helper.
82
+ */
83
+ export declare function __resetFetchInstrumentation(): void;
64
84
  /**
65
85
  * A WinterCG-compatible OTLP/JSON span exporter. POSTs ended spans to the
66
86
  * Heystack ingest using the platform `fetch` — no Node built-ins.
@@ -68,7 +88,7 @@ export declare function isSelfSpanForTest(attrs: Record<string, unknown>, ingest
68
88
  export declare class HeystackSpanExporter implements SpanExporter {
69
89
  private readonly url;
70
90
  /** Hostname (no port) of the ingest endpoint, used to drop self-trace spans. */
71
- private readonly ingestHost;
91
+ readonly ingestHost: string;
72
92
  private readonly headers;
73
93
  private shutdownState;
74
94
  /**
@@ -122,6 +142,36 @@ export interface WorkersConfig {
122
142
  * automatically borrows the request context's `ctx.waitUntil`.
123
143
  */
124
144
  waitUntil?: (p: Promise<unknown>) => void;
145
+ /**
146
+ * Optional hook called on each incoming request to supply identity context.
147
+ * Return `{ id }` to tag the SERVER span with `enduser.id`, `{ session }` for
148
+ * `session.id`, or `{ requestId }` to override `http.request.id` (otherwise
149
+ * the `cf-ray` request header is used as a fallback). Any field may be omitted.
150
+ */
151
+ getUser?: (req: Request) => {
152
+ id?: string;
153
+ session?: string;
154
+ requestId?: string;
155
+ } | undefined;
156
+ /**
157
+ * Declare which bindings to instrument with tracing (Task 5). Pass `true` to
158
+ * trace all bindings, or an array of binding names to trace selectively.
159
+ * Defaults to `false` (no binding tracing). Consumed by a later task.
160
+ */
161
+ instrumentBindings?: boolean | string[];
162
+ /**
163
+ * Head-sampling configuration. When omitted (or `rate` >= 1), all traces are
164
+ * kept. When `rate` < 1, a deterministic FNV-1a hash of the trace ID decides
165
+ * keep/drop at the root span — consistent with the ingest-side sampler so the
166
+ * SDK and backend agree. Error / slow-span retention is an ingest concern.
167
+ *
168
+ * Note: head sampling is parent-respecting, so an incoming request carrying a
169
+ * sampled `traceparent` is still recorded even at `rate: 0` (it is not an
170
+ * absolute kill-switch; it governs only fresh/root traces).
171
+ */
172
+ sampling?: {
173
+ rate?: number;
174
+ };
125
175
  }
126
176
  /**
127
177
  * A `BasicTracerProvider` with the underlying `HeystackSpanExporter` attached so
@@ -141,14 +191,37 @@ export type HeystackTracerProvider = BasicTracerProvider & {
141
191
  * must drain the exporter — `await provider.heystackExporter.forceFlush()` (or
142
192
  * `flushHeystack()`), not just `provider.forceFlush()`.
143
193
  */
144
- export declare function createTracerProvider(config: HeystackOptions): HeystackTracerProvider;
194
+ export declare function createTracerProvider(config: HeystackOptions & {
195
+ sampling?: {
196
+ rate?: number;
197
+ };
198
+ }): HeystackTracerProvider;
199
+ /**
200
+ * An AsyncLocalStorage-backed ContextManager for the /workers entry. When the
201
+ * runtime exposes `globalThis.AsyncLocalStorage` (workerd and Node do), this
202
+ * manager is preferred over `SyncStackContextManager` because it propagates
203
+ * context across `await` boundaries — child spans created after an `await`
204
+ * inside a handler are correctly parented to the request's root span.
205
+ *
206
+ * WinterCG-safe: it does NOT import `node:async_hooks`; instead it uses the
207
+ * ALS global that the runtime exposes. Falls back to `SyncStackContextManager`
208
+ * when the global is absent (Deno, Bun without globals, etc.).
209
+ */
210
+ export declare class AlsContextManager implements ContextManager {
211
+ private _als;
212
+ constructor();
213
+ active(): Context;
214
+ with<A extends unknown[], F extends (...args: A) => ReturnType<F>>(ctx: Context, fn: F, thisArg?: ThisParameterType<F>, ...args: A): ReturnType<F>;
215
+ bind<T>(_ctx: Context, target: T): T;
216
+ enable(): this;
217
+ disable(): this;
218
+ }
145
219
  /**
146
- * A minimal SYNCHRONOUS, stack-based ContextManager — the registered manager for
147
- * the /workers entry (no `node:async_hooks`, so it works on any WinterCG runtime).
148
- * It makes `context.with()` propagate synchronously, which is enough for the
149
- * exporter's `suppressTracing` to take effect and for the belt-and-suspenders
150
- * self-span filter — but it does NOT carry context across `await` boundaries (so
151
- * cross-`await` parent linking and per-request isolation are best-effort).
220
+ * A minimal SYNCHRONOUS, stack-based ContextManager — the fallback for the
221
+ * /workers entry when `globalThis.AsyncLocalStorage` is absent (any WinterCG
222
+ * runtime without the global). It makes `context.with()` propagate
223
+ * synchronously, which is enough for the exporter's `suppressTracing` to take
224
+ * effect — but it does NOT carry context across `await` boundaries.
152
225
  */
153
226
  export declare class SyncStackContextManager implements ContextManager {
154
227
  private _stack;
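Note on the two context managers declared in the hunk above: the practical difference is what `active()` returns after an `await`. The sketch below reproduces only the synchronous, stack-based behaviour with a hypothetical `SyncStackSketch` class and a simplified `Ctx` type (neither is part of the package or of `@opentelemetry/api`). It shows why context is visible synchronously inside `with()` but gone once the callback resumes after an `await`.

```typescript
// Hypothetical, dependency-free sketch of a synchronous stack-based manager.
type Ctx = { readonly label: string };
const ROOT: Ctx = { label: "root" };

class SyncStackSketch {
  private stack: Ctx[] = [];

  /** The innermost context pushed by a still-running `with()`, else ROOT. */
  active(): Ctx {
    return this.stack[this.stack.length - 1] ?? ROOT;
  }

  /** Push `ctx`, run `fn` synchronously, then pop even if `fn` throws. */
  with<T>(ctx: Ctx, fn: () => T): T {
    this.stack.push(ctx);
    try {
      return fn();
    } finally {
      this.stack.pop();
    }
  }
}

const mgr = new SyncStackSketch();
const requestCtx: Ctx = { label: "request" };

// Synchronous read inside `with()`: the request context is active.
const seenSync = mgr.with(requestCtx, () => mgr.active().label);

// The async callback returns its promise at the first `await`; `with()` then
// pops the stack, so the continuation observes ROOT instead of the request.
const seenAfterAwait: Promise<string> = mgr.with(requestCtx, async () => {
  await Promise.resolve();
  return mgr.active().label;
});
```

An `AsyncLocalStorage`-backed manager avoids this loss because the runtime carries the store into the continuation, which is the reason the new code prefers it whenever the global is present.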
@@ -269,4 +342,22 @@ type EnvOf<H> = H extends {
269
342
  scheduled: (controller: ScheduledController, env: infer E, ctx: ExecutionContext) => unknown;
270
343
  } ? E : unknown;
271
344
  export declare function instrument<H extends WorkerHandler<any>>(handler: H, config: WorkersConfig): Instrumented<EnvOf<H>, H>;
345
+ /**
346
+ * Run `fn` inside a new child span named `name`. The span is automatically
347
+ * parented to the currently-active span (via `context.active()`), started
348
+ * before `fn` and ended in `finally` — so the caller never needs to call
349
+ * `span.end()`. If `fn` throws, the exception is recorded on the span and
350
+ * the status is set to ERROR before re-throwing.
351
+ *
352
+ * Two call signatures are supported:
353
+ * withSpan("name", async (span) => { ... })
354
+ * withSpan("name", { attr: "value" }, async (span) => { ... })
355
+ */
356
+ export declare function withSpan<T>(name: string, fn: (span: Span) => T | Promise<T>): Promise<T>;
357
+ export declare function withSpan<T>(name: string, attrs: Record<string, string | number | boolean>, fn: (span: Span) => T | Promise<T>): Promise<T>;
358
+ /**
359
+ * Add a named event (with optional attributes) to the currently-active span.
360
+ * No-op when no span is active.
361
+ */
362
+ export declare function addEvent(name: string, attrs?: Record<string, string | number | boolean>): void;
272
363
  export type { Span };
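Note on `withSpan` as declared above: the two call signatures and the "end in `finally`, record on throw" contract can be illustrated without the OpenTelemetry SDK. The sketch below is a stand-in, not the package's implementation: `SketchSpan`, `finishedSpans`, and `withSpanSketch` are hypothetical names, and the span is a plain object. Unlike the shipped code in this diff, which reads `err.message` directly, the sketch normalises non-`Error` throwables; that is one possible hardening, shown here as an assumption.

```typescript
// Hypothetical stand-in for the withSpan contract; no real tracer involved.
type Attrs = Record<string, string | number | boolean>;

interface SketchSpan {
  spanName: string;
  attrs: Attrs;
  ended: boolean;
  error?: string;
}

type SpanFn<T> = (span: SketchSpan) => T | Promise<T>;

const finishedSpans: SketchSpan[] = [];

async function withSpanSketch<T>(
  spanName: string,
  attrsOrFn: Attrs | SpanFn<T>,
  maybeFn?: SpanFn<T>,
): Promise<T> {
  // Overload dispatch: the second argument is either attributes or the callback.
  const attrs: Attrs = typeof attrsOrFn === "function" ? {} : attrsOrFn;
  const fn = typeof attrsOrFn === "function" ? attrsOrFn : maybeFn;
  if (!fn) throw new TypeError("withSpanSketch: callback missing");

  const span: SketchSpan = { spanName, attrs, ended: false };
  try {
    return await fn(span);
  } catch (err) {
    // Record the failure, then re-throw so callers still see it.
    span.error = err instanceof Error ? err.message : String(err);
    throw err;
  } finally {
    span.ended = true; // the caller never ends the span manually
    finishedSpans.push(span);
  }
}

const okResult = withSpanSketch("load-user", { "user.kind": "demo" }, async (span) => {
  span.attrs["rows"] = 1;
  return 42;
});

const failedResult = withSpanSketch<void>("explode", () => {
  throw new Error("boom");
}).catch((err: unknown) => err);
```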
package/dist/workers.js CHANGED
@@ -8,13 +8,15 @@
8
8
  // ships its own OTLP/JSON-over-fetch span exporter so it runs on Workers/Edge
9
9
  // where the Node SDK cannot.
10
10
  import { context, trace, SpanKind, SpanStatusCode, } from "@opentelemetry/api";
11
- import { suppressTracing } from "@opentelemetry/core";
11
+ import { suppressTracing, isTracingSuppressed } from "@opentelemetry/core";
12
12
  import { ROOT_CONTEXT } from "@opentelemetry/api";
13
13
  import { Resource } from "@opentelemetry/resources";
14
14
  import { BasicTracerProvider, SimpleSpanProcessor, } from "@opentelemetry/sdk-trace-base";
15
15
  import { ATTR_SERVICE_NAME } from "@opentelemetry/semantic-conventions";
16
16
  import { buildExporterConfig } from "./core.js";
17
17
  import { isSelfSpanAttrs, safeHostname } from "./self-span.js";
18
+ import { instrumentEnv } from "./workers-bindings.js";
19
+ import { makeSampler } from "./workers-sampler.js";
18
20
  // `ExportResult` / `ExportResultCode` mirror `@opentelemetry/core`. We define
19
21
  // them inline (structurally identical) rather than import them: core is only a
20
22
  // transitive dep of sdk-trace-base and isn't reliably resolvable, and keeping it
@@ -116,6 +118,11 @@ function readableSpanToOtlp(span) {
116
118
  startTimeUnixNano: hrTimeToUnixNano(span.startTime),
117
119
  endTimeUnixNano: hrTimeToUnixNano(span.endTime),
118
120
  attributes: toKeyValues(span.attributes),
121
+ events: span.events.map((ev) => ({
122
+ timeUnixNano: hrTimeToUnixNano(ev.time),
123
+ name: ev.name,
124
+ attributes: toKeyValues((ev.attributes ?? {})),
125
+ })),
119
126
  status: toStatus(span.status),
120
127
  };
121
128
  }
@@ -139,6 +146,36 @@ export function serializeSpans(spans) {
139
146
  };
140
147
  }
141
148
  // ---------------------------------------------------------------------------
149
+ // W3C traceparent + CORS expose-header helpers (FR5 tap→server correlation)
150
+ // ---------------------------------------------------------------------------
151
+ const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;
152
+ /** Parse a W3C `traceparent`. Returns null for malformed or all-zero ids. */
153
+ export function parseTraceparent(header) {
154
+ if (!header)
155
+ return null;
156
+ const m = TRACEPARENT_RE.exec(header.trim());
157
+ if (!m)
158
+ return null;
159
+ const traceId = m[1];
160
+ const spanId = m[2];
161
+ const flags = m[3];
162
+ if (traceId === "0".repeat(32) || spanId === "0".repeat(16))
163
+ return null;
164
+ return { traceId, spanId, traceFlags: parseInt(flags, 16) };
165
+ }
166
+ /** Add `value` to Access-Control-Expose-Headers without duplicating it. */
167
+ export function appendExposeHeader(headers, value) {
168
+ const existing = headers.get("access-control-expose-headers");
169
+ if (!existing) {
170
+ headers.set("access-control-expose-headers", value);
171
+ return;
172
+ }
173
+ const present = existing.split(",").map((s) => s.trim().toLowerCase());
174
+ if (!present.includes(value.toLowerCase())) {
175
+ headers.set("access-control-expose-headers", `${existing}, ${value}`);
176
+ }
177
+ }
178
+ // ---------------------------------------------------------------------------
142
179
  // Self-span filtering (feedback-loop guard)
143
180
  //
144
181
  // On Next/OpenNext the host auto-instruments outbound `fetch`, so the exporter's
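Note on the helpers added in the hunk above: both are pure string logic, so their behaviour is easy to check in isolation. The sketch below restates the `traceparent` parsing rule with explicit types and adds a string-only variant of the expose-header de-duplication. `parseTraceparentSketch` and `appendExposeValue` are hypothetical names used for illustration; the real `appendExposeHeader` operates on a `Headers` object rather than a string.

```typescript
// Illustrative, self-contained sketch; not the package's exported API.
interface ParsedTraceparent {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters
  traceFlags: number; // e.g. 1 means "sampled"
}

// Version "00" only, lowercase hex only, exactly four dash-separated fields.
const TRACEPARENT_PATTERN = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparentSketch(header: string | null | undefined): ParsedTraceparent | null {
  if (!header) return null;
  const m = TRACEPARENT_PATTERN.exec(header.trim());
  if (!m) return null;
  const [, traceId, spanId, flags] = m;
  // All-zero ids are syntactically valid hex but rejected as invalid.
  if (traceId === "0".repeat(32) || spanId === "0".repeat(16)) return null;
  return { traceId, spanId, traceFlags: parseInt(flags, 16) };
}

/** String-only variant: append `value` unless it is already listed (case-insensitive). */
function appendExposeValue(existing: string | null, value: string): string {
  if (!existing) return value;
  const present = existing.split(",").map((s) => s.trim().toLowerCase());
  return present.includes(value.toLowerCase()) ? existing : `${existing}, ${value}`;
}

const parsed = parseTraceparentSketch("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
const rejectedUppercase = parseTraceparentSketch("00-4BF92F3577B34DA6A3CE929D0E0E4736-00f067aa0ba902b7-01");
const rejectedZeroTrace = parseTraceparentSketch(`00-${"0".repeat(32)}-00f067aa0ba902b7-01`);
const exposed = appendExposeValue("ETag, Traceparent", "traceparent");
```

Returning `null` for anything malformed means the caller simply starts a fresh root trace instead of continuing a corrupt one.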
@@ -164,6 +201,143 @@ export function isSelfSpanForTest(attrs, ingestHost) {
164
201
  return isSelfSpanAttrs(attrs, ingestHost);
165
202
  }
166
203
  // ---------------------------------------------------------------------------
204
+ // Outbound fetch auto-instrumentation (CLIENT child spans + traceparent)
205
+ //
206
+ // Patches globalThis.fetch ONCE (guarded), capturing the platform fetch the
207
+ // instant before replacing it. Each outbound call made inside a traced request
208
+ // (an active span is present, tracing is not suppressed, and the target is not
209
+ // the ingest host) gets a CLIENT child span AND a W3C `traceparent` injected
210
+ // into a CLONE of the request — so a downstream Heystack-instrumented service
211
+ // continues the SAME trace (distributed tracing).
212
+ //
213
+ // Why patch-once + ALS rather than a per-request global swap: under workerd a
214
+ // single isolate serves concurrent requests, so swapping globalThis.fetch per
215
+ // request (install/restore in a finally) would race across overlapping requests.
216
+ // Task 2's AsyncLocalStorage context manager makes `context.active()` reflect the
217
+ // CURRENT request's active span across awaits, so ONE global wrapper can decide
218
+ // per-call which request (if any) the subrequest belongs to.
219
+ //
220
+ // `_originalFetch` is the captured platform fetch. The exporter's own POST uses
221
+ // it directly (see `export()`), so an export is never re-entered by this wrapper
222
+ // (belt; the wrapper also bails on suppressed contexts + ingest-host targets —
223
+ // suspenders). Reading `safeHostname` / the self-span host concept keeps the
224
+ // self-span guardrail satisfied and the path WinterCG-safe (pure string logic).
225
+ // ---------------------------------------------------------------------------
226
+ let _fetchInstrumented = false;
227
+ let _originalFetch;
228
+ let _fetchWrapper;
229
+ /** Resolve the absolute URL of an outbound fetch arg (string | URL | Request). */
230
+ function outboundUrl(input) {
231
+ if (typeof input === "string")
232
+ return input;
233
+ if (input instanceof URL)
234
+ return input.href;
235
+ if (input instanceof Request)
236
+ return input.url;
237
+ try {
238
+ return String(input.url ?? input);
239
+ }
240
+ catch {
241
+ return "";
242
+ }
243
+ }
244
+ /** Resolve the HTTP method of an outbound fetch arg. */
245
+ function outboundMethod(input, init) {
246
+ if (init?.method)
247
+ return init.method;
248
+ if (input instanceof Request)
249
+ return input.method;
250
+ return "GET";
251
+ }
252
+ /**
253
+ * Return an [input, init] pair with `traceparent` injected, WITHOUT mutating the
254
+ * caller's Request/Headers. For a Request input we build a fresh Request copying
255
+ * it; for string/URL we clone the init headers.
256
+ */
257
+ function injectTraceparent(input, init, traceparent) {
258
+ if (input instanceof Request) {
259
+ const headers = new Headers(init?.headers ?? input.headers);
260
+ headers.set("traceparent", traceparent);
261
+ return [new Request(input, { ...(init ?? {}), headers }), undefined];
262
+ }
263
+ const headers = new Headers(init?.headers);
264
+ headers.set("traceparent", traceparent);
265
+ return [input, { ...(init ?? {}), headers }];
266
+ }
267
+ /**
268
+ * Patch `globalThis.fetch` exactly once to emit a CLIENT child span + inject
269
+ * `traceparent` for outbound subrequests. `ingestHost` is the bare ingest
270
+ * hostname (lower-case, no port) so the exporter's own uploads are never traced.
271
+ */
272
+ function ensureFetchInstrumentation(ingestHost) {
273
+ if (_fetchInstrumented)
274
+ return;
275
+ _fetchInstrumented = true;
276
+ // Capture the platform fetch the instant before replacing it (cost guardrail:
277
+ // "capture original before patching"). The exporter reuses this captured
278
+ // reference, so its POST is never routed back through the wrapper.
279
+ const originalFetch = globalThis.fetch.bind(globalThis);
280
+ _originalFetch = originalFetch;
281
+ const wrapper = async (input, init) => {
282
+ const active = context.active();
283
+ // Passthrough UNCHANGED when there is nothing to parent to (no active span,
284
+ // e.g. a fetch outside a traced request) or tracing is suppressed (the
285
+ // exporter's own POST, or anything under suppressTracing).
286
+ if (isTracingSuppressed(active) || !trace.getSpan(active)) {
287
+ return originalFetch(input, init);
288
+ }
289
+ const target = outboundUrl(input);
290
+ const host = safeHostname(target);
291
+ // No parseable host, or the ingest host itself (self-span) → don't trace.
292
+ if (!host || host === ingestHost) {
293
+ return originalFetch(input, init);
294
+ }
295
+ const method = outboundMethod(input, init);
296
+ const span = trace.getTracer("heystack").startSpan(`${method} ${host}`, {
297
+ kind: SpanKind.CLIENT,
298
+ attributes: {
299
+ "http.request.method": method,
300
+ "url.full": target,
301
+ "server.address": host,
302
+ },
303
+ }, active);
304
+ const sc = span.spanContext();
305
+ const traceparent = `00-${sc.traceId}-${sc.spanId}-01`;
306
+ const [reqInput, reqInit] = injectTraceparent(input, init, traceparent);
307
+ try {
308
+ const response = await originalFetch(reqInput, reqInit);
309
+ span.setAttribute("http.response.status_code", response.status);
310
+ return response;
311
+ }
312
+ catch (error) {
313
+ span.recordException(error instanceof Error ? error : new Error(String(error)));
314
+ span.setStatus({
315
+ code: SpanStatusCode.ERROR,
316
+ message: error instanceof Error ? error.message : String(error),
317
+ });
318
+ throw error;
319
+ }
320
+ finally {
321
+ span.end();
322
+ }
323
+ };
324
+ _fetchWrapper = wrapper;
325
+ globalThis.fetch = _fetchWrapper;
326
+ }
327
+ /**
328
+ * Reset the outbound-fetch instrumentation: restore the captured platform fetch
329
+ * (only when our wrapper is still the installed global) and clear the guard.
330
+ * Internal/testing helper.
331
+ */
332
+ export function __resetFetchInstrumentation() {
333
+ if (_fetchWrapper && globalThis.fetch === _fetchWrapper && _originalFetch) {
334
+ globalThis.fetch = _originalFetch;
335
+ }
336
+ _fetchInstrumented = false;
337
+ _originalFetch = undefined;
338
+ _fetchWrapper = undefined;
339
+ }
340
+ // ---------------------------------------------------------------------------
167
341
  // Exporter
168
342
  // ---------------------------------------------------------------------------
169
343
  /**
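Note on the patch-once design described in the hunk above: the wrapper is installed a single time, the original function is captured just before replacement, internal callers use the captured reference, and the reset helper restores the original only if the wrapper is still installed. The sketch below shows that pattern on a hypothetical `fakeGlobal.call` function rather than `globalThis.fetch`, so it runs anywhere; every name in it is illustrative.

```typescript
// Hypothetical sketch of "capture original, patch once, reset safely".
type Call = (input: string) => string;

// Stand-in for a global that exposes a patchable function.
const fakeGlobal: { call: Call } = { call: (input) => `platform:${input}` };

let patched = false;
let capturedOriginal: Call | undefined;
let installedWrapper: Call | undefined;

function ensurePatched(): void {
  if (patched) return; // guard: a second call must not wrap the wrapper
  patched = true;
  const original = fakeGlobal.call; // capture immediately before replacing
  capturedOriginal = original;
  installedWrapper = (input) => `traced(${original(input)})`;
  fakeGlobal.call = installedWrapper;
}

/** Internal caller (the exporter's role): bypasses the wrapper when one is installed. */
function internalCall(input: string): string {
  return (capturedOriginal ?? fakeGlobal.call)(input);
}

function resetPatch(): void {
  // Restore only if nobody else has replaced the function since we patched it.
  if (installedWrapper && fakeGlobal.call === installedWrapper && capturedOriginal) {
    fakeGlobal.call = capturedOriginal;
  }
  patched = false;
  capturedOriginal = undefined;
  installedWrapper = undefined;
}

ensurePatched();
ensurePatched(); // no-op thanks to the guard
const viaWrapper = fakeGlobal.call("x");
const viaInternal = internalCall("x");
```

The identity check in `resetPatch` matters when another library patches the same function later: blindly restoring would silently discard that library's wrapper.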
@@ -231,9 +405,12 @@ export class HeystackSpanExporter {
231
405
  // The POST runs inside a tracing-suppressed context so that host fetch
232
406
  // auto-instrumentation (e.g. Next/OpenNext) does NOT create a CLIENT span
233
407
  // for it — which would otherwise be exported and re-captured, a sustained
234
- // feedback loop.
408
+ // feedback loop. As a belt-and-suspenders second layer it also uses the
409
+ // CAPTURED platform fetch (when our outbound-fetch wrapper is installed), so
410
+ // the export can never be re-entered by that wrapper regardless of context.
411
+ const doFetch = _originalFetch ?? fetch;
235
412
  const p = context
236
- .with(suppressTracing(context.active()), () => fetch(this.url, { method: "POST", headers: this.headers, body }))
413
+ .with(suppressTracing(context.active()), () => doFetch(this.url, { method: "POST", headers: this.headers, body }))
237
414
  .then((res) => {
238
415
  if (res.ok) {
239
416
  resultCallback({ code: ExportResultCode.SUCCESS });
@@ -347,6 +524,7 @@ export function createTracerProvider(config) {
347
524
  const provider = new BasicTracerProvider({
348
525
  resource: new Resource({ [ATTR_SERVICE_NAME]: config.service }),
349
526
  spanProcessors: [new SimpleSpanProcessor(exporter)],
527
+ sampler: makeSampler(config.sampling),
350
528
  });
351
529
  // Attach the exporter so flush paths can await its in-flight fetches.
352
530
  return Object.assign(provider, { heystackExporter: exporter });
@@ -355,31 +533,51 @@ export function createTracerProvider(config) {
355
533
  // Global tracer provider registration (for host frameworks, e.g. Next.js)
356
534
  // ---------------------------------------------------------------------------
357
535
  let _provider = null;
358
- // ---------------------------------------------------------------------------
359
- // Context manager registration (makes suppressTracing() actually work)
360
- //
361
- // `context.with(...)` is a NO-OP unless a ContextManager is registered with the
362
- // global OTel API. Without one, `suppressTracing(context.active())` produces a
363
- // context that is never made active, so the exporter's POST is NOT suppressed
364
- // in production and host fetch auto-instrumentation can re-trace it (feedback
365
- // loop). We therefore register a manager exactly ONCE in `ensureGlobalProvider`.
366
- //
367
- // We register a dependency-free SYNCHRONOUS stack manager (below). Deliberately
368
- // NOT AsyncLocalStorageContextManager: that statically imports `node:async_hooks`,
369
- // which would break `import "@heystack/otel/workers"` on a bare workerd without
370
- // `nodejs_compat` (and on other WinterCG runtimes) — defeating the whole point of
371
- // this entry being node-builtin-free. The sync manager covers the critical path:
372
- // the exporter's POST runs synchronously inside the suppressed `context.with`, so
373
- // `suppressTracing` takes effect. Trade-off: no cross-`await` context propagation,
374
- // so deep nested-span parenting is limited on the edge (documented).
375
- // ---------------------------------------------------------------------------
536
+ /** Read the ALS global lazily — at call time, not module load time. */
537
+ function getALS() {
538
+ return globalThis.AsyncLocalStorage;
539
+ }
376
540
  /**
377
- * A minimal SYNCHRONOUS, stack-based ContextManager - the registered manager for
378
- * the /workers entry (no `node:async_hooks`, so it works on any WinterCG runtime).
379
- * It makes `context.with()` propagate synchronously, which is enough for the
380
- * exporter's `suppressTracing` to take effect and for the belt-and-suspenders
381
- * self-span filter - but it does NOT carry context across `await` boundaries (so
382
- * cross-`await` parent linking and per-request isolation are best-effort).
541
+ * An AsyncLocalStorage-backed ContextManager for the /workers entry. When the
542
+ * runtime exposes `globalThis.AsyncLocalStorage` (workerd and Node do), this
543
+ * manager is preferred over `SyncStackContextManager` because it propagates
544
+ * context across `await` boundaries - child spans created after an `await`
545
+ * inside a handler are correctly parented to the request's root span.
546
+ *
547
+ * WinterCG-safe: it does NOT import `node:async_hooks`; instead it uses the
548
+ * ALS global that the runtime exposes. Falls back to `SyncStackContextManager`
549
+ * when the global is absent (Deno, Bun without globals, etc.).
550
+ */
551
+ export class AlsContextManager {
552
+ _als;
553
+ constructor() {
554
+ const ALS = getALS();
555
+ if (!ALS)
556
+ throw new Error("AlsContextManager: globalThis.AsyncLocalStorage is not available in this runtime");
557
+ this._als = new ALS();
558
+ }
559
+ active() {
560
+ return this._als.getStore() ?? ROOT_CONTEXT;
561
+ }
562
+ with(ctx, fn, thisArg, ...args) {
563
+ return this._als.run(ctx, () => fn.call(thisArg, ...args));
564
+ }
565
+ bind(_ctx, target) {
566
+ return target;
567
+ }
568
+ enable() {
569
+ return this;
570
+ }
571
+ disable() {
572
+ return this;
573
+ }
574
+ }
575
+ /**
576
+ * A minimal SYNCHRONOUS, stack-based ContextManager — the fallback for the
577
+ * /workers entry when `globalThis.AsyncLocalStorage` is absent (any WinterCG
578
+ * runtime without the global). It makes `context.with()` propagate
579
+ * synchronously, which is enough for the exporter's `suppressTracing` to take
580
+ * effect — but it does NOT carry context across `await` boundaries.
383
581
  */
384
582
  export class SyncStackContextManager {
385
583
  _stack = [];
@@ -412,20 +610,19 @@ let _contextManagerRegistered = false;
412
610
  * `context.with(suppressTracing(...))` in the exporter is actually honoured —
413
611
  * otherwise suppression is a no-op and the exporter's POST can be re-traced.
414
612
  *
415
- * We register a synchronous, dependency-free stack manager. This keeps the
416
- * /workers entry WinterCG-safe (no `node:async_hooks` import → works on bare
417
- * workerd WITHOUT nodejs_compat, Deno, Bun, etc.). It fully covers the critical
418
- * path (the export fetch runs synchronously inside the suppressed `context.with`,
419
- * and per-request root spans). Trade-off: it does not propagate context across
420
- * `await` boundaries, so deep nested-span parenting is limited on the edge — an
421
- * acceptable, documented limitation (workerd has no async context manager by
422
- * default regardless).
613
+ * When `globalThis.AsyncLocalStorage` is available (workerd, Node), we register
614
+ * an `AlsContextManager` that propagates context across `await` boundaries -
615
+ * child spans created after an `await` are correctly parented to the root span.
616
+ * When absent we fall back to `SyncStackContextManager`, which covers the
617
+ * critical suppression path synchronously but does not carry context across
618
+ * `await` boundaries.
423
619
  */
424
620
  function ensureContextManager() {
425
621
  if (_contextManagerRegistered)
426
622
  return;
427
623
  _contextManagerRegistered = true;
428
- context.setGlobalContextManager(new SyncStackContextManager().enable());
624
+ const mgr = getALS() ? new AlsContextManager() : new SyncStackContextManager();
625
+ context.setGlobalContextManager(mgr.enable());
429
626
  }
430
627
  /** Reset the context-manager registration guard. Internal/testing helper. */
431
628
  export function __resetContextManager() {
@@ -456,6 +653,10 @@ function ensureGlobalProvider(config) {
456
653
  // the exporter actually takes effect — without one, `context.with` is a no-op
457
654
  // and suppression silently does nothing in production.
458
655
  ensureContextManager();
656
+ // Patch globalThis.fetch (once) so outbound subrequests get CLIENT child spans
657
+ // + `traceparent` injection (distributed tracing). The exporter's own POST uses
658
+ // the captured original fetch, so it is never re-entered by this wrapper.
659
+ ensureFetchInstrumentation(_provider.heystackExporter.ingestHost);
459
660
  return _provider;
460
661
  }
461
662
  /**
@@ -524,6 +725,7 @@ export function instrument(handler, config) {
524
725
  service: config.service,
525
726
  endpoint: config.endpoint,
526
727
  waitUntil: config.waitUntil,
728
+ sampling: config.sampling,
527
729
  });
528
730
  return { provider, tracer: trace.getTracer("heystack") };
529
731
  };
@@ -562,6 +764,17 @@ export function instrument(handler, config) {
562
764
  return originalFetch(req, env, ctx);
563
765
  const { provider, tracer } = s;
564
766
  const url = new URL(req.url);
767
+ // FR5: continue an inbound W3C traceparent so tap→server is one trace.
768
+ const parent = parseTraceparent(req.headers.get("traceparent"));
769
+ let startCtx = context.active();
770
+ if (parent) {
771
+ startCtx = trace.setSpanContext(startCtx, {
772
+ traceId: parent.traceId,
773
+ spanId: parent.spanId,
774
+ traceFlags: parent.traceFlags,
775
+ isRemote: true,
776
+ });
777
+ }
565
778
  const span = tracer.startSpan(`${req.method} ${url.pathname}`, {
566
779
  kind: SpanKind.SERVER,
567
780
  attributes: {
@@ -570,16 +783,92 @@ export function instrument(handler, config) {
570
783
  "url.path": url.pathname,
571
784
  "server.address": url.host,
572
785
  },
573
- });
786
+ }, startCtx);
787
+ // A1 enrichment: identity, request id, client ip, geo.
788
+ // All attributes are set only when non-empty to keep spans lean.
789
+ const userInfo = config.getUser?.(req);
790
+ if (userInfo?.id)
791
+ span.setAttribute("enduser.id", userInfo.id);
792
+ if (userInfo?.session)
793
+ span.setAttribute("session.id", userInfo.session);
794
+ const reqId = userInfo?.requestId ?? req.headers.get("cf-ray") ?? "";
795
+ if (reqId)
796
+ span.setAttribute("http.request.id", reqId);
797
+ const clientIp = req.headers.get("CF-Connecting-IP") ?? "";
798
+ if (clientIp)
799
+ span.setAttribute("client.address", clientIp);
800
+ // Cloudflare geo — only present on the real CF runtime; read defensively.
801
+ const cf = req.cf;
802
+ if (cf) {
803
+ if (typeof cf.country === "string" && cf.country)
804
+ span.setAttribute("geo.country", cf.country);
805
+ if (typeof cf.region === "string" && cf.region)
806
+ span.setAttribute("geo.region", cf.region);
807
+ if (typeof cf.city === "string" && cf.city)
808
+ span.setAttribute("geo.city", cf.city);
809
+ if (cf.asn != null)
810
+ span.setAttribute("geo.asn", String(cf.asn));
811
+ }
812
+ // FR5: response header carrying THIS span's trace + span id.
813
+ const sc = span.spanContext();
814
+ const traceparent = `00-${sc.traceId}-${sc.spanId}-01`;
815
+ // Instrument bindings when requested — wrap env BEFORE handing to the
816
+ // handler so binding calls made inside `originalFetch` (which runs inside
817
+ // `context.with` below) correctly parent to the root span via ALS.
818
+ let handlerEnv = env;
819
+ if (config.instrumentBindings) {
820
+ const binTracer = trace.getTracer("heystack");
821
+ handlerEnv = instrumentEnv(env, {
822
+ startSpan: (name, attrs) => binTracer.startSpan(name, { attributes: attrs }, context.active()),
823
+ select: config.instrumentBindings,
824
+ });
825
+ }
574
826
  try {
575
- const response = await context.with(trace.setSpan(context.active(), span), () => originalFetch(req, env, ctx));
827
+ const response = await context.with(trace.setSpan(startCtx, span), () => originalFetch(req, handlerEnv, ctx));
828
+ const headers = new Headers(response.headers);
829
+ headers.set("traceparent", traceparent);
830
+ appendExposeHeader(headers, "traceparent");
576
831
  span.setAttribute("http.response.status_code", response.status);
577
- span.setStatus({
578
- code: response.status >= 500 ? SpanStatusCode.ERROR : SpanStatusCode.UNSET,
832
+ const finalize = () => {
833
+ span.setStatus({
834
+ code: response.status >= 500 ? SpanStatusCode.ERROR : SpanStatusCode.UNSET,
835
+ });
836
+ span.end();
837
+ drain(provider, ctx);
838
+ };
839
+ // No body to stream → finalize now (redirects, 204/304, etc.).
840
+ if (!response.body) {
841
+ finalize();
842
+ return new Response(null, {
843
+ status: response.status,
844
+ statusText: response.statusText,
845
+ headers,
846
+ });
847
+ }
848
+ // FR1: keep the span open until the streamed body drains; the first
849
+ // chunk records time-to-first-byte. `finished` guards double-finalize.
850
+ let firstByte = false;
851
+ let finished = false;
852
+ const monitor = new TransformStream({
853
+ transform(chunk, controller) {
854
+ if (!firstByte) {
855
+ firstByte = true;
856
+ span.addEvent("first_byte");
857
+ }
858
+ controller.enqueue(chunk);
859
+ },
860
+ flush() {
861
+ if (finished)
862
+ return;
863
+ finished = true;
864
+ finalize();
865
+ },
866
+ });
867
+ return new Response(response.body.pipeThrough(monitor), {
868
+ status: response.status,
869
+ statusText: response.statusText,
870
+ headers,
579
871
  });
580
- span.end();
581
- drain(provider, ctx);
582
- return response;
583
872
  }
584
873
  catch (error) {
585
874
  span.recordException(error instanceof Error ? error : new Error(String(error)));
@@ -658,3 +947,29 @@ export function instrument(handler, config) {
658
947
  }
659
948
  return wrapped;
660
949
  }
950
+ export async function withSpan(name, attrsOrFn, maybeFn) {
951
+ const attrs = typeof attrsOrFn === "function" ? undefined : attrsOrFn;
952
+ const fn = (typeof attrsOrFn === "function" ? attrsOrFn : maybeFn);
953
+ const tracer = trace.getTracer("heystack");
954
+ const span = tracer.startSpan(name, attrs ? { attributes: attrs } : undefined);
955
+ const ctx = trace.setSpan(context.active(), span);
956
+ try {
957
+ return await context.with(ctx, () => fn(span));
958
+ }
959
+ catch (err) {
960
+ span.recordException(err);
961
+ span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
962
+ throw err;
963
+ }
964
+ finally {
965
+ span.end();
966
+ }
967
+ }
968
+ /**
969
+ * Add a named event (with optional attributes) to the currently-active span.
970
+ * No-op when no span is active.
971
+ */
972
+ export function addEvent(name, attrs) {
973
+ const span = trace.getSpan(context.active());
974
+ span?.addEvent(name, attrs);
975
+ }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@heystack/otel",
3
- "version": "0.4.3",
3
+ "version": "0.6.0",
4
4
  "description": "Runtime-aware OpenTelemetry tracing that exports to Heystack (Node, Next.js, Workers).",
5
5
  "license": "MIT",
6
6
  "type": "module",