@heystack/otel 0.5.0 → 0.7.0

package/README.md CHANGED
@@ -117,9 +117,40 @@ Set the key as a secret: `wrangler secret put HEYSTACK_API_KEY`.
  | `apiKey` | `string?` | Defaults to `env.HEYSTACK_API_KEY`. |
  | `getUser` | `(req: Request) => { id?, session?, requestId? } \| undefined` | Called per request. `id` → `enduser.id`, `session` → `session.id`, `requestId` → `http.request.id` (falls back to the `cf-ray` header). |
  | `instrumentBindings` | `boolean \| string[]` | `true` = auto child spans for all detected D1/KV/R2/Vectorize bindings; `string[]` = only the named bindings. Default `false`. |
+ | `sampling` | `{ rate?: number } \| { remote: true }` | Head-sampling configuration. `{ rate }`: keep a deterministic fraction of fresh root traces (0–1; default `1` = keep all). `{ remote: true }`: fetch the rate from the Heystack config endpoint instead — lets you change it centrally without redeploying. Cold isolates keep all traffic until the first config fetch resolves; fails open if the config can't be reached. Parent-respecting in both modes: a request arriving with a sampled `traceparent` is always recorded. See [Head sampling](#head-sampling) below. |
  | `waitUntil` | `(p: Promise<unknown>) => void` | Override the isolate keep-alive hook; defaults to the auto-detected `ctx.waitUntil`. |
  | `endpoint` | `string?` | Override the ingest endpoint (advanced). |
 
+ ### Head sampling
+
+ `sampling: { rate }` performs **head sampling** — the keep/drop decision is made at the start of each fresh root trace, before any export. Traces that are dropped never leave the worker (no egress, no ingest cost, no storage). Traces that are kept are exported in full.
+
+ ```ts
+ export default instrument(worker, {
+   service: "my-api",
+   sampling: { rate: 0.2 }, // keep ~20% of traffic
+ });
+ ```
+
+ The sampler is **deterministic by trace ID**: the same trace always makes the same decision, so parent→child consistency is maintained when this worker is in a call chain sampled at the same rate by another service.
+
+ **Tradeoff:** head sampling can't guarantee keeping error traces — the drop/keep decision is made before the response status is known. If you need every error captured, keep `rate: 1` (the default). Heystack's ingest-side keep rules (error and slow retention) apply among traces that _are_ exported; they can't rescue a trace that was dropped in the worker before export.
+
+ The `rate` governs **fresh root traces only** (no inbound `traceparent`, or `traceparent` with `sampled=0`). A request arriving with a sampled inbound `traceparent` (`sampled=1`) is always recorded — the parent's decision is respected, so a distributed trace is never split mid-flight by the child's sample rate.
+
+ ### Remote sampling
+
+ `sampling: { remote: true }` lets Heystack control the rate centrally — change it from the dashboard without redeploying:
+
+ ```ts
+ export default instrument(worker, {
+   service: "my-api",
+   sampling: { remote: true },
+ });
+ ```
+
+ On the first request in each isolate, the worker fetches its configured rate from the Heystack config endpoint (`{endpoint}/v1/sampling/config`), at most once per isolate. **Cold isolates keep all traffic until that first fetch resolves** (fails open: nothing is dropped while loading). If the config endpoint can't be reached, responds with a non-2xx status, or returns a rate that can't be parsed or falls outside 0–1, the worker keeps everything. The same parent-respecting rule applies: a request with an inbound sampled `traceparent` is always recorded regardless of the fetched rate.
+
  ### Automatic tracing
 
  `instrument()` traces the following automatically, with no additional config:
@@ -274,6 +305,8 @@ As belt-and-suspenders the exporter also drops any span whose HTTP target points
 
  ## Migration / versioning
 
+ - **`0.7.0`** — **`/workers`: remote sampling (`sampling: { remote: true }`).** New `sampling` variant that fetches the head-sampling rate from the Heystack config endpoint at runtime, so you can change it from the console without redeploying. Cold isolates keep all traffic until the first config fetch resolves (fails open). If the config endpoint is unreachable, the worker keeps everything. Same parent-respecting rule as `sampling: { rate }`. No breaking changes; existing `sampling: { rate }` configs are unchanged.
+ - **`0.6.0`** — **`/workers`: head sampling (`sampling: { rate }`).** New optional `WorkersConfig` field: `sampling.rate` (0–1, default `1`). Keeps a deterministic fraction of fresh root traces — the drop decision is made in the worker before export (no egress, no ingest cost). Parent-respecting: requests arriving with a sampled `traceparent` are always recorded. Consistent with server-side sampling (same trace-ID hash). No breaking changes; all new options are optional. See [Head sampling](#head-sampling).
  - **`0.5.0`** — **`/workers`: identity enrichment, binding tracing, outbound-fetch tracing, manual span helpers.** New `WorkersConfig` options: `getUser` (attach `enduser.id`/`session.id`/`http.request.id` per request from a synchronous callback), `instrumentBindings` (auto child spans for D1/KV/R2/Vectorize — `true` = all detected, or a `string[]` to select). Outbound `fetch` calls made inside a traced handler automatically get CLIENT child spans with `traceparent` injection (distributed tracing across services). New ergonomic exports from `/workers`: `withSpan(name, attrs?, fn)` runs a function inside a named child span (auto-parented, exceptions recorded, `span.end()` in `finally`); `addEvent(name, attrs?)` adds an event to the active span. No breaking changes; all new options are optional.
  - **`0.4.3`** — **feedback-loop guard extended to the Node path (cost fix).** The self-instrumentation loop the Workers path fixed in 0.3.1/0.3.2 was still live on plain Node / Next-on-Node: the OTLP-over-`http` exporter's `POST /v1/traces` was auto-instrumented and re-exported, so ~77% of ingested spans in production were the exporter tracing itself — needless ingest + ClickHouse compute. 0.4.3 ignores ingest-host requests in the HTTP/undici auto-instrumentations **and** filters self-spans at the exporter boundary (covers caller-supplied `instrumentations` too). The hostname matcher is now a shared module used by both `/node` and `/workers`. No API change. **Action: upgrade and redeploy any Node/Next-on-Node app** — it cuts ingested span volume sharply.
  - **`0.3.5`** — **type-constraint fix (Workers).** A Worker whose `queue` consumer is typed with a concrete message body — `queue(batch: MessageBatch<MyJob>, …)`, the normal case — failed to compile against `instrument()` in 0.3.4 (`TS2345`: `MessageBatch<unknown>` not assignable to `MessageBatch<MyJob>`). The `WorkerHandler` constraint declared its entrypoints as arrow properties, whose parameters are checked **contravariantly** under `strictFunctionTypes`, so a narrowed handler wasn't assignable. 0.3.5 declares them with **method syntax** (bivariant parameters) and widens the batch to `MessageBatch<any>`, mirroring Cloudflare's own `ExportedHandler` — a typed-queue Worker now type-checks with a bare `instrument(handler, cfg)`. Runtime behaviour unchanged. Also adds a **consumer type-check gate** (`pnpm consumer-typecheck`, run in `check` and `prepublishOnly`) that compiles a fully-typed Worker against the built `dist` through the public `exports` map and asserts `satisfies ExportedHandler<Env>` — the regression that escaped in 0.3.3/0.3.4 now fails the build before publish.
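Note on the README's parent-respecting rule: the sketch below shows, under stated assumptions, how the `sampled` flag of an inbound W3C `traceparent` header could be read and combined with a rate decision. `parseSampledFlag` and `shouldRecord` are hypothetical names and are not part of `@heystack/otel`; the package's own `parseTraceparent` is not included in this diff and may behave differently. The sketch follows the README wording, which treats `sampled=0` like a fresh root, and `keptByRate` stands in for the trace-ID hash decision.

```typescript
// Illustrative only: hypothetical helpers, not the package's API.
// W3C traceparent layout: "<version>-<trace-id>-<parent-id>-<trace-flags>".
// Bit 0 of trace-flags is the "sampled" flag.
function parseSampledFlag(header: string | null): boolean | undefined {
  if (!header) return undefined;
  const m = /^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-([0-9a-f]{2})$/.exec(header.trim());
  if (!m) return undefined; // malformed header: treated here as "no parent"
  const flags = parseInt(m[1] ?? "0", 16);
  return (flags & 0x01) === 0x01;
}

// Per the README text: a sampled parent is always recorded; anything else
// (no traceparent, or sampled=0) is governed by the configured rate.
function shouldRecord(header: string | null, keptByRate: () => boolean): boolean {
  if (parseSampledFlag(header) === true) return true;
  return keptByRate();
}

const sampledParent = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
console.log(shouldRecord(sampledParent, () => false)); // sampled parent wins over the rate
```

The regular expression is deliberately simplified: it checks the field lengths but not the extra validity rules of the specification (for example all-zero IDs).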
@@ -0,0 +1,35 @@
+ import { type Sampler, type SamplingResult } from "@opentelemetry/sdk-trace-base";
+ import type { Context, Attributes, Link, SpanKind } from "@opentelemetry/api";
+ export declare function fnv01(s: string): number;
+ export declare function traceKept(traceId: string, rate: number): boolean;
+ /** Test-only: set the remote rate directly. */
+ export declare function __setRemoteRate(n: number): void;
+ /** Test-only: read the current remote rate. */
+ export declare function __getRemoteRate(): number;
+ export declare class HeystackRatioSampler implements Sampler {
+     private readonly rate;
+     constructor(rate: number | (() => number));
+     shouldSample(_ctx: Context, traceId: string, _name: string, _kind: SpanKind, _attrs: Attributes, _links: Link[]): SamplingResult;
+     toString(): string;
+ }
+ /**
+  * Fetch the sampling rate from the Heystack ingest config endpoint and update
+  * the module-level `_remoteRate` ref. Call once per isolate (guarded in
+  * `workers.ts` via `_remoteSamplingKicked`). Uses the provided `fetchImpl`
+  * (the captured pre-patch fetch) so the GET is never re-entered by outbound
+  * fetch instrumentation and is wrapped in `suppressTracing` at the call site
+  * in `workers.ts` (belt-and-suspenders against self-tracing).
+  *
+  * Fail-open: any network failure, non-200, or parse error leaves `_remoteRate`
+  * at its current value (initially 1 = keep-all). Remote sampling must never
+  * drop telemetry due to a config-service outage.
+  */
+ export declare function loadRemoteSamplingRate(opts: {
+     endpoint: string;
+     apiKey: string;
+     fetchImpl: typeof fetch;
+ }): Promise<void>;
+ export declare function makeSampler(sampling?: {
+     rate?: number;
+     remote?: boolean;
+ }): Sampler;
@@ -0,0 +1,89 @@
+ import { SamplingDecision, ParentBasedSampler, AlwaysOnSampler, } from "@opentelemetry/sdk-trace-base";
+ // FNV-1a 32-bit → [0,1). MUST stay byte-identical to apps/ingest/src/sampling.ts
+ // so the SDK and ingest agree on which traces to keep.
+ export function fnv01(s) {
+     let h = 0x811c9dc5;
+     for (let i = 0; i < s.length; i++) {
+         h ^= s.charCodeAt(i);
+         h = Math.imul(h, 0x01000193);
+     }
+     return (h >>> 0) / 0x100000000;
+ }
+ export function traceKept(traceId, rate) {
+     if (rate >= 1)
+         return true;
+     if (rate <= 0)
+         return false;
+     return fnv01(traceId) < rate;
+ }
+ // Module-level mutable ref for the remote-loaded rate. Starts at 1 (keep-all)
+ // so cold-isolate requests are fully preserved until the config fetch resolves.
+ const _remoteRate = { value: 1 };
+ /** Test-only: set the remote rate directly. */
+ export function __setRemoteRate(n) {
+     _remoteRate.value = n;
+ }
+ /** Test-only: read the current remote rate. */
+ export function __getRemoteRate() {
+     return _remoteRate.value;
+ }
+ export class HeystackRatioSampler {
+     rate;
+     constructor(rate) {
+         this.rate = rate;
+     }
+     shouldSample(_ctx, traceId, _name, _kind, _attrs, _links) {
+         const rate = typeof this.rate === "function" ? this.rate() : this.rate;
+         return {
+             decision: traceKept(traceId, rate)
+                 ? SamplingDecision.RECORD_AND_SAMPLED
+                 : SamplingDecision.NOT_RECORD,
+         };
+     }
+     toString() {
+         const r = typeof this.rate === "function" ? this.rate() : this.rate;
+         return `HeystackRatioSampler{${r}}`;
+     }
+ }
+ /**
+  * Fetch the sampling rate from the Heystack ingest config endpoint and update
+  * the module-level `_remoteRate` ref. Call once per isolate (guarded in
+  * `workers.ts` via `_remoteSamplingKicked`). Uses the provided `fetchImpl`
+  * (the captured pre-patch fetch) so the GET is never re-entered by outbound
+  * fetch instrumentation and is wrapped in `suppressTracing` at the call site
+  * in `workers.ts` (belt-and-suspenders against self-tracing).
+  *
+  * Fail-open: any network failure, non-200, or parse error leaves `_remoteRate`
+  * at its current value (initially 1 = keep-all). Remote sampling must never
+  * drop telemetry due to a config-service outage.
+  */
+ export async function loadRemoteSamplingRate(opts) {
+     try {
+         const url = `${opts.endpoint.replace(/\/+$/, "")}/v1/sampling/config`;
+         const res = await opts.fetchImpl(url, {
+             headers: { Authorization: `Bearer ${opts.apiKey}` },
+         });
+         if (!res.ok)
+             return; // fail open — keep current rate
+         const cfg = (await res.json());
+         const r = Number(cfg?.trace_sample_rate);
+         if (Number.isFinite(r) && r >= 0 && r <= 1)
+             _remoteRate.value = r;
+     }
+     catch {
+         /* fail open: leave rate at keep-all */
+     }
+ }
+ export function makeSampler(sampling) {
+     if (sampling?.remote) {
+         // Dynamic rate: reads from the per-isolate ref that loadRemoteSamplingRate sets.
+         // Starts at 1 (keep-all) until the config fetch resolves on the first request.
+         return new ParentBasedSampler({ root: new HeystackRatioSampler(() => _remoteRate.value) });
+     }
+     const rate = sampling?.rate;
+     if (rate === undefined || rate >= 1)
+         return new AlwaysOnSampler();
+     // ParentBased: a sampled/dropped PARENT (local or remote via traceparent) wins,
+     // so a distributed trace samples consistently; only the root uses the ratio.
+     return new ParentBasedSampler({ root: new HeystackRatioSampler(rate) });
+ }
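Note on the hash-based decision above: a minimal sketch, assuming random 32-character hex trace IDs, of how one could check that `traceKept` keeps roughly the configured fraction and that the decision is stable per trace ID. `fnv01` and `traceKept` are copied from the hunk with type annotations added; `mulberry32`, `randomTraceId`, and `keptFraction` are hypothetical helpers introduced only for this illustration.

```typescript
// Copied from the hunk above, with type annotations added.
function fnv01(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) / 0x100000000;
}

function traceKept(traceId: string, rate: number): boolean {
  if (rate >= 1) return true;
  if (rate <= 0) return false;
  return fnv01(traceId) < rate;
}

// Hypothetical helper: small seeded PRNG (mulberry32-style) so runs repeat.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: build a 32-hex-character trace ID.
function randomTraceId(rand: () => number): string {
  let id = "";
  for (let i = 0; i < 32; i++) id += Math.floor(rand() * 16).toString(16);
  return id;
}

// Fraction of n generated trace IDs that traceKept() keeps at the given rate.
function keptFraction(rate: number, n: number, seed = 1): number {
  const rand = mulberry32(seed);
  let kept = 0;
  for (let i = 0; i < n; i++) {
    if (traceKept(randomTraceId(rand), rate)) kept++;
  }
  return kept / n;
}

console.log(keptFraction(0.2, 10_000)); // expected to land near 0.2
```

Because the decision is a threshold on a fixed hash value, a trace kept at a given rate is also kept at any higher rate. This is what lets two services that use the same hash and the same rate agree on a trace without coordination.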
package/dist/workers.d.ts CHANGED
@@ -159,6 +159,25 @@ export interface WorkersConfig {
       * Defaults to `false` (no binding tracing). Consumed by a later task.
       */
      instrumentBindings?: boolean | string[];
+     /**
+      * Head-sampling configuration. When omitted (or `rate` >= 1), all traces are
+      * kept. When `rate` < 1, a deterministic FNV-1a hash of the trace ID decides
+      * keep/drop at the root span — consistent with the ingest-side sampler so the
+      * SDK and backend agree. Error / slow-span retention is an ingest concern.
+      *
+      * Note: head sampling is parent-respecting, so an incoming request carrying a
+      * sampled `traceparent` is still recorded even at `rate: 0` (it is not an
+      * absolute kill-switch; it governs only fresh/root traces).
+      *
+      * When `remote: true`, the sampling rate is fetched once per isolate from
+      * `{endpoint}/v1/sampling/config` on the first request. Cold-isolate spans
+      * are kept (rate=1) until the fetch resolves. On any failure the rate stays
+      * at keep-all (fail-open). Incompatible with inline `rate` (remote wins).
+      */
+     sampling?: {
+         rate?: number;
+         remote?: boolean;
+     };
  }
  /**
   * A `BasicTracerProvider` with the underlying `HeystackSpanExporter` attached so
@@ -178,7 +197,11 @@ export type HeystackTracerProvider = BasicTracerProvider & {
   * must drain the exporter — `await provider.heystackExporter.forceFlush()` (or
   * `flushHeystack()`), not just `provider.forceFlush()`.
   */
- export declare function createTracerProvider(config: HeystackOptions): HeystackTracerProvider;
+ export declare function createTracerProvider(config: HeystackOptions & {
+     sampling?: {
+         rate?: number;
+     };
+ }): HeystackTracerProvider;
  /**
   * An AsyncLocalStorage-backed ContextManager for the /workers entry. When the
   * runtime exposes `globalThis.AsyncLocalStorage` (workerd and Node do), this
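Note on the `sampling` doc comment above ("remote wins"): a small sketch, assuming the branch order of `makeSampler` in `./workers-sampler.js`, of which mode a given config resolves to. `SamplingConfig`, `SamplingMode`, and `resolveSamplingMode` are hypothetical names used only for illustration; the package itself returns OpenTelemetry sampler instances rather than strings.

```typescript
type SamplingConfig = { rate?: number; remote?: boolean };
type SamplingMode = "remote" | "keep-all" | "ratio";

// Hypothetical helper mirroring the branch order of makeSampler():
// 1. remote: true takes precedence over any inline rate
// 2. a missing rate, or rate >= 1, keeps everything
// 3. any other rate uses the trace-ID ratio for fresh root traces
function resolveSamplingMode(sampling?: SamplingConfig): SamplingMode {
  if (sampling?.remote) return "remote";
  const rate = sampling?.rate;
  if (rate === undefined || rate >= 1) return "keep-all";
  return "ratio";
}

console.log(resolveSamplingMode({ rate: 0.2, remote: true })); // "remote"
console.log(resolveSamplingMode({ rate: 0.2 }));               // "ratio"
console.log(resolveSamplingMode());                            // "keep-all"
```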
package/dist/workers.js CHANGED
@@ -13,9 +13,10 @@ import { ROOT_CONTEXT } from "@opentelemetry/api";
  import { Resource } from "@opentelemetry/resources";
  import { BasicTracerProvider, SimpleSpanProcessor, } from "@opentelemetry/sdk-trace-base";
  import { ATTR_SERVICE_NAME } from "@opentelemetry/semantic-conventions";
- import { buildExporterConfig } from "./core.js";
+ import { buildExporterConfig, DEFAULT_ENDPOINT } from "./core.js";
  import { isSelfSpanAttrs, safeHostname } from "./self-span.js";
  import { instrumentEnv } from "./workers-bindings.js";
+ import { makeSampler, loadRemoteSamplingRate } from "./workers-sampler.js";
  // `ExportResult` / `ExportResultCode` mirror `@opentelemetry/core`. We define
  // them inline (structurally identical) rather than import them: core is only a
  // transitive dep of sdk-trace-base and isn't reliably resolvable, and keeping it
@@ -523,6 +524,7 @@ export function createTracerProvider(config) {
      const provider = new BasicTracerProvider({
          resource: new Resource({ [ATTR_SERVICE_NAME]: config.service }),
          spanProcessors: [new SimpleSpanProcessor(exporter)],
+         sampler: makeSampler(config.sampling),
      });
      // Attach the exporter so flush paths can await its in-flight fetches.
      return Object.assign(provider, { heystackExporter: exporter });
@@ -694,7 +696,10 @@ export async function flushHeystack() {
  /** Reset the singleton global provider. Internal/testing helper. */
  export function __resetProvider() {
      _provider = null;
+     _remoteSamplingKicked = false;
  }
+ /** Guard: the remote sampling config GET fires at most once per isolate. */
+ let _remoteSamplingKicked = false;
  let warnedNoKey = false;
  function warnOnceNoKey() {
      if (warnedNoKey)
@@ -723,6 +728,7 @@ export function instrument(handler, config) {
              service: config.service,
              endpoint: config.endpoint,
              waitUntil: config.waitUntil,
+             sampling: config.sampling,
          });
          return { provider, tracer: trace.getTracer("heystack") };
      };
@@ -760,6 +766,22 @@ export function instrument(handler, config) {
              if (!s)
                  return originalFetch(req, env, ctx);
              const { provider, tracer } = s;
+             // Once per isolate: kick off the remote sampling config GET so the rate
+             // is available for subsequent requests without a redeploy. Uses the
+             // captured pre-patch fetch under suppressTracing — never self-traced,
+             // never looped. Fail-open: any error leaves the rate at 1 (keep-all).
+             if (config.sampling?.remote && !_remoteSamplingKicked) {
+                 _remoteSamplingKicked = true;
+                 const resolvedKey = config.apiKey ?? env?.HEYSTACK_API_KEY;
+                 if (resolvedKey) {
+                     const ep = config.endpoint ?? DEFAULT_ENDPOINT;
+                     ctx.waitUntil(context.with(suppressTracing(context.active()), () => loadRemoteSamplingRate({
+                         endpoint: ep,
+                         apiKey: resolvedKey,
+                         fetchImpl: _originalFetch ?? fetch,
+                     })));
+                 }
+             }
              const url = new URL(req.url);
              // FR5: continue an inbound W3C traceparent so tap→server is one trace.
              const parent = parseTraceparent(req.headers.get("traceparent"));
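Note on the once-per-isolate guard above: a self-contained sketch, under stated assumptions, of the "kick once, fail open" pattern that the hunk and `loadRemoteSamplingRate` describe. `ConfigResponse`, `FetchLike`, `nextRate`, `loadRate`, and `kickOnce` are hypothetical names, the endpoint URL is a placeholder, and the structural fetch type exists only so the example runs without a network; none of this is the package's API.

```typescript
// Structural stand-ins so the sketch runs without a real network (assumptions).
type ConfigResponse = { ok: boolean; json(): Promise<unknown> };
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<ConfigResponse>;

let remoteRate = 1; // keep-all until a valid rate arrives
let kicked = false; // once-per-isolate guard

// Coerce with Number() and accept the result only if it is finite and within
// [0, 1]; otherwise keep the current rate.
function nextRate(cfg: unknown, current: number): number {
  const raw = (cfg as { trace_sample_rate?: unknown } | null)?.trace_sample_rate;
  const r = Number(raw);
  return Number.isFinite(r) && r >= 0 && r <= 1 ? r : current;
}

async function loadRate(endpoint: string, apiKey: string, fetchImpl: FetchLike): Promise<void> {
  try {
    const url = `${endpoint.replace(/\/+$/, "")}/v1/sampling/config`;
    const res = await fetchImpl(url, { headers: { Authorization: `Bearer ${apiKey}` } });
    if (!res.ok) return; // fail open: keep the current rate
    remoteRate = nextRate(await res.json(), remoteRate);
  } catch {
    // fail open: network or parse errors leave the rate unchanged
  }
}

// Call at the top of each request; only the first call starts the fetch.
function kickOnce(endpoint: string, apiKey: string, fetchImpl: FetchLike): Promise<void> | undefined {
  if (kicked) return undefined;
  kicked = true;
  return loadRate(endpoint, apiKey, fetchImpl);
}

// Usage with a fake fetch that returns a configured rate.
const fakeFetch: FetchLike = async () => ({
  ok: true,
  json: async () => ({ trace_sample_rate: 0.25 }),
});
const pending = kickOnce("https://config.example.invalid/", "test-key", fakeFetch);
pending?.then(() => console.log("rate after load:", remoteRate));
```

In a Worker, the returned promise would typically be handed to `ctx.waitUntil` so the isolate stays alive until the fetch settles, as the hunk above does.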
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@heystack/otel",
-   "version": "0.5.0",
+   "version": "0.7.0",
    "description": "Runtime-aware OpenTelemetry tracing that exports to Heystack (Node, Next.js, Workers).",
    "license": "MIT",
    "type": "module",