@service-bridge/node 0.1.3

Files changed (51)
  1. package/README.md +854 -0
  2. package/biome.json +28 -0
  3. package/bun.lock +249 -0
  4. package/dist/express.d.ts +51 -0
  5. package/dist/express.js +129 -0
  6. package/dist/fastify.d.ts +43 -0
  7. package/dist/fastify.js +122 -0
  8. package/dist/index.js +34410 -0
  9. package/dist/trace.d.ts +19 -0
  10. package/http/dist/express.d.ts +51 -0
  11. package/http/dist/express.d.ts.map +1 -0
  12. package/http/dist/express.test.d.ts +2 -0
  13. package/http/dist/express.test.d.ts.map +1 -0
  14. package/http/dist/fastify.d.ts +43 -0
  15. package/http/dist/fastify.d.ts.map +1 -0
  16. package/http/dist/fastify.test.d.ts +2 -0
  17. package/http/dist/fastify.test.d.ts.map +1 -0
  18. package/http/dist/index.d.ts +7 -0
  19. package/http/dist/index.d.ts.map +1 -0
  20. package/http/dist/trace.d.ts +19 -0
  21. package/http/dist/trace.d.ts.map +1 -0
  22. package/http/dist/trace.test.d.ts +2 -0
  23. package/http/dist/trace.test.d.ts.map +1 -0
  24. package/http/package.json +48 -0
  25. package/http/src/express.test.ts +125 -0
  26. package/http/src/express.ts +209 -0
  27. package/http/src/fastify.test.ts +142 -0
  28. package/http/src/fastify.ts +159 -0
  29. package/http/src/index.ts +10 -0
  30. package/http/src/sdk-augment.d.ts +11 -0
  31. package/http/src/servicebridge.d.ts +23 -0
  32. package/http/src/trace.test.ts +97 -0
  33. package/http/src/trace.ts +56 -0
  34. package/http/tsconfig.json +17 -0
  35. package/http/tsconfig.test.json +6 -0
  36. package/package.json +65 -0
  37. package/sdk/dist/generated/servicebridge-package-definition.d.ts +4709 -0
  38. package/sdk/dist/grpc-client.d.ts +304 -0
  39. package/sdk/dist/grpc-client.test.d.ts +1 -0
  40. package/sdk/dist/index.d.ts +2 -0
  41. package/sdk/package.json +30 -0
  42. package/sdk/scripts/generate-proto.ts +65 -0
  43. package/sdk/src/generated/servicebridge-package-definition.ts +5198 -0
  44. package/sdk/src/grpc-client.d.ts +305 -0
  45. package/sdk/src/grpc-client.d.ts.map +1 -0
  46. package/sdk/src/grpc-client.test.ts +422 -0
  47. package/sdk/src/grpc-client.ts +2924 -0
  48. package/sdk/src/index.d.ts +3 -0
  49. package/sdk/src/index.d.ts.map +1 -0
  50. package/sdk/src/index.ts +29 -0
  51. package/sdk/tsconfig.json +13 -0
package/README.md ADDED
@@ -0,0 +1,854 @@
1
+ # @service-bridge/node
2
+
3
+ [![npm version](https://img.shields.io/npm/v/%40service-bridge%2Fnode?color=cb3837&logo=npm)](https://www.npmjs.com/package/@service-bridge/node)
4
+ [![License](https://img.shields.io/badge/License-Free%20%2F%20Commercial-blue)](../LICENSE)
5
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5%2B-3178c6?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
6
+ [![Node](https://img.shields.io/badge/Node.js-18%2B-339933?logo=node.js&logoColor=white)](https://nodejs.org/)
7
+
8
+ **The Unified Bridge for Microservices Interaction**
9
+
10
+ Node.js SDK for [ServiceBridge](https://servicebridge.dev) — production-ready RPC, durable events, workflows, jobs, and distributed tracing in a single SDK. One Go runtime and PostgreSQL.
11
+
12
+ ```
13
+ ┌─────────────────────────────────────────────────────────────────┐
14
+ │ BEFORE: 10 moving parts │
15
+ │ Istio · Envoy · RabbitMQ · Temporal · Jaeger · Consul · │
16
+ │ cert-manager · Alertmanager · cron · custom glue │
17
+ └─────────────────────────────────────────────────────────────────┘
18
+
19
+ ┌─────────────────────────────────────────────────────────────────┐
20
+ │ AFTER: ServiceBridge + PostgreSQL │
21
+ │ RPC · Events · Workflows · Jobs · Tracing · mTLS · Dashboard │
22
+ │ One SDK · One runtime · Zero sidecars │
23
+ └─────────────────────────────────────────────────────────────────┘
24
+ ```
25
+
26
+ ## Table of Contents
27
+
28
+ - [Why ServiceBridge](#why-servicebridge)
29
+ - [Use Cases](#use-cases)
30
+ - [Quick Start](#quick-start)
31
+ - [Install](#install)
32
+ - [Runtime Setup](#runtime-setup)
33
+ - [End-to-End Example](#end-to-end-example)
34
+ - [Platform Features](#platform-features)
35
+ - [How It Compares](#how-it-compares)
36
+ - [API Reference](#api-reference)
37
+ - [HTTP Plugins](#http-plugins)
38
+ - [Configuration](#configuration)
39
+ - [Environment Variables](#environment-variables)
40
+ - [Error Handling](#error-handling)
41
+ - [When to Use / When Not to Use](#when-to-use--when-not-to-use)
42
+ - [FAQ](#faq)
43
+ - [Community and Support](#community-and-support)
44
+ - [License](#license)
45
+
46
+ ---
47
+
48
+ ## Why ServiceBridge
49
+
50
+ | Problem | Without ServiceBridge | With ServiceBridge |
51
+ |---|---|---|
52
+ | Service-to-service calls | Istio/Envoy sidecar proxy per pod | **Direct SDK-to-worker gRPC, zero proxy hops** |
53
+ | Async messaging | Kafka/RabbitMQ + retry logic + DLQ setup | **Built-in durable events with retry, DLQ, replay** |
54
+ | Background jobs | Bull/BullMQ + Redis + cron daemon | **Built-in cron and delayed jobs** |
55
+ | Workflow orchestration | Temporal/Conductor cluster + persistence | **Built-in DAG workflows** |
56
+ | Distributed tracing | Jaeger/Tempo + OTEL collector + dashboards | **Built-in traces + realtime UI** |
57
+ | Service discovery | Consul/etcd + DNS glue | **Built-in registry + health-aware balancing** |
58
+ | mTLS | cert-manager + Vault PKI | **Auto-provisioned certs from service key** |
59
+
60
+ **Result**: `10 tools → 1 runtime`. One Go binary + PostgreSQL replaces the entire stack.
61
+
62
+ ---
63
+
64
+ ## Use Cases
65
+
66
+ **Microservice communication** — Replace sidecar mesh with direct RPC calls. Get sub-millisecond overhead instead of double proxy hop latency.
67
+
68
+ **Event-driven architecture** — Publish durable events with fan-out, retries, DLQ, idempotency, and server-side filtering. No broker infrastructure to manage.
69
+
70
+ **Background job scheduling** — Cron jobs, delayed execution, and job-triggered workflows in a single API. No Redis, no separate queue workers.
71
+
72
+ **Saga / distributed transactions** — DAG workflows with typed steps (`rpc`, `event`, `event_wait`, `sleep`, child workflow). Compensations and rollbacks via workflow step dependencies.
73
+
74
+ **AI agent orchestration** — Stream LLM tokens via realtime run streams with replay. Orchestrate multi-step AI pipelines as workflows.
75
+
76
+ **Full-stack observability** — Every RPC call, event delivery, workflow step, and HTTP request traced automatically. One timeline, one dashboard. Prometheus metrics and Loki-compatible log API included.
77
+
78
+ ---
79
+
80
+ ## Quick Start
81
+
82
+ ### 1. Install
83
+
84
+ ```bash
85
+ npm install @service-bridge/node
86
+ # or
87
+ bun add @service-bridge/node
88
+ ```
89
+
90
+ ### 2. Create a worker (service that handles calls)
91
+
92
+ ```ts
93
+ import { servicebridge } from "@service-bridge/node";
94
+
95
+ const sb = servicebridge(
96
+ process.env.SERVICEBRIDGE_URL ?? "127.0.0.1:14445",
97
+ process.env.SERVICE_KEY!,
98
+ "payments",
99
+ );
100
+
101
+ sb.handleRpc("charge", async (payload: { orderId: string; amount: number }) => {
102
+ return { ok: true, txId: `tx_${Date.now()}`, orderId: payload.orderId };
103
+ });
104
+
105
+ await sb.serve({ host: "127.0.0.1" });
106
+ ```
107
+
108
+ ### 3. Call it from another service
109
+
110
+ ```ts
111
+ import { servicebridge } from "@service-bridge/node";
112
+
113
+ const sb = servicebridge(
114
+ process.env.SERVICEBRIDGE_URL ?? "127.0.0.1:14445",
115
+ process.env.SERVICE_KEY!,
116
+ "orders",
117
+ );
118
+
119
+ const result = await sb.rpc<{ ok: boolean; txId: string }>("payments/charge", {
120
+ orderId: "ord_42",
121
+ amount: 4990,
122
+ });
123
+
124
+ console.log(result.txId); // tx_1711234567890
125
+ ```
126
+
127
+ That's it. No broker, no sidecar, no proxy — direct gRPC call between services.
128
+
129
+ ---
130
+
131
+ ## Runtime Setup
132
+
133
+ The SDK connects to a ServiceBridge runtime. The fastest way to start:
134
+
135
+ ```bash
136
+ bash <(curl -fsSL https://servicebridge.dev/install.sh)
137
+ ```
138
+
139
+ This installs ServiceBridge + PostgreSQL via Docker Compose and generates an admin password automatically. After install, the dashboard is at `http://localhost:14444` and the gRPC control plane at `127.0.0.1:14445`.
140
+
141
+ For manual Docker Compose setup, configuration reference, and all runtime environment variables, see the **[Runtime Setup](../README.md#runtime-setup)** section in the main SDK README.
142
+
143
+ ---
144
+
145
+ ## End-to-End Example
146
+
147
+ A complete order flow: HTTP request → RPC → Event → Event handler with streaming.
148
+
149
+ ```ts
150
+ import { servicebridge } from "@service-bridge/node";
151
+
152
+ // --- Payments service (worker) ---
153
+
154
+ const payments = servicebridge("127.0.0.1:14445", process.env.SERVICE_KEY!, "payments");
155
+
156
+ payments.handleRpc("charge", async (payload: { orderId: string; amount: number }, ctx) => {
157
+ await ctx?.stream.write({ status: "charging", orderId: payload.orderId }, "progress");
158
+
159
+ // ... charge logic ...
160
+
161
+ await ctx?.stream.write({ status: "charged" }, "progress");
162
+ return { ok: true, txId: `tx_${Date.now()}` };
163
+ });
164
+
165
+ await payments.serve({ host: "127.0.0.1" });
166
+ ```
167
+
168
+ ```ts
169
+ // --- Orders service (caller + event publisher) ---
170
+
171
+ const orders = servicebridge("127.0.0.1:14445", process.env.SERVICE_KEY!, "orders");
172
+
173
+ // Call payments, then publish event
174
+ const charge = await orders.rpc<{ ok: boolean; txId: string }>("payments/charge", {
175
+ orderId: "ord_42",
176
+ amount: 4990,
177
+ });
178
+
179
+ await orders.event("orders.completed", {
180
+ orderId: "ord_42",
181
+ txId: charge.txId,
182
+ }, {
183
+ idempotencyKey: "order:ord_42:completed",
184
+ headers: { source: "checkout" },
185
+ });
186
+ ```
187
+
188
+ ```ts
189
+ // --- Notifications service (event consumer) ---
190
+
191
+ const notifications = servicebridge("127.0.0.1:14445", process.env.SERVICE_KEY!, "notifications");
192
+
193
+ notifications.handleEvent("orders.*", async (payload, ctx) => {
194
+ const body = payload as { orderId: string; txId: string };
195
+ await ctx.stream.write({ status: "sending_email", orderId: body.orderId }, "progress");
196
+ // ... send email ...
197
+ });
198
+
199
+ await notifications.serve({ host: "127.0.0.1" });
200
+ ```
201
+
202
+ ```ts
203
+ // --- Orchestrate as a workflow ---
204
+
205
+ await orders.workflow("order.fulfillment", [
206
+ { id: "reserve", type: "rpc", ref: "inventory/reserve" },
207
+ { id: "charge", type: "rpc", ref: "payments/charge", deps: ["reserve"] },
208
+ { id: "wait_dlv", type: "event_wait", ref: "shipping.delivered", deps: ["charge"] },
209
+ { id: "notify", type: "event", ref: "orders.fulfilled", deps: ["wait_dlv"] },
210
+ ]);
211
+ ```
212
+
213
+ Every step above — RPC, event publish, event delivery, workflow execution — appears in a single trace timeline in the built-in dashboard.
214
+
215
+ ---
216
+
217
+ ## Platform Features
218
+
219
+ ### Communication
220
+ - **Direct RPC** — zero-hop gRPC calls with retries, deadlines, and mTLS identity
221
+ - **Durable Events** — fan-out delivery, at-least-once guarantees, retries, DLQ, replay, idempotency
222
+ - **Realtime Streams** — live chunks with replay for AI/progress/log streaming
223
+ - **Service Discovery** — automatic endpoint resolution and round-robin balancing
224
+ - **HTTP Middleware** — Express and Fastify instrumentation with automatic trace propagation
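To make the "Service Discovery" bullet above more concrete, here is a minimal sketch of round-robin selection over healthy endpoints. `EndpointSketch` and `RoundRobin` are hypothetical names for illustration only; the SDK's real balancer and its health signals are not shown in this README.

```ts
// Hypothetical endpoint record; the real registry entry likely has more fields.
type EndpointSketch = { address: string; healthy: boolean };

// Cycles through the healthy endpoints in order, skipping unhealthy ones.
class RoundRobin {
  private cursor = 0;

  pick(endpoints: EndpointSketch[]): EndpointSketch | undefined {
    const healthy = endpoints.filter((e) => e.healthy);
    if (healthy.length === 0) return undefined; // nothing to route to
    const chosen = healthy[this.cursor % healthy.length];
    this.cursor += 1;
    return chosen;
  }
}

const balancer = new RoundRobin();
const workers: EndpointSketch[] = [
  { address: "10.0.0.1:9000", healthy: true },
  { address: "10.0.0.2:9000", healthy: false },
  { address: "10.0.0.3:9000", healthy: true },
];
const picked = [1, 2, 3].map(() => balancer.pick(workers)?.address);
console.log(picked); // 10.0.0.1:9000, 10.0.0.3:9000, then 10.0.0.1:9000 again
```

The unhealthy worker is never returned, and the cursor wraps once every healthy endpoint has been used.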
225
+
226
+ ### Orchestration
227
+ - **Workflows** — DAG steps: `rpc`, `event`, `event_wait`, `sleep`, child workflow
228
+ - **Jobs** — cron, delayed, and workflow-triggered scheduling
229
+
230
+ ### Security
231
+ - **Auto mTLS** — automatic certificate provisioning for workers
232
+ - **Access Policy** — service-level caller/target restrictions and RBAC
233
+
234
+ ### Observability
235
+ - **Unified Tracing** — single trace timeline across HTTP, RPC, events, workflows, and jobs
236
+ - **Metrics** — Prometheus-compatible `/metrics` endpoint (30+ metric families)
237
+ - **Logs** — structured log ingest with Loki-compatible query API
238
+ - **Alerts** — runtime alerts for delivery failures, errors, and service health
239
+ - **Dashboard** — realtime web UI for runs, events, workflows, jobs, DLQ, service map, and service keys
240
+
241
+ ---
242
+
243
+ ## How It Compares
244
+
245
+ | Concern | Istio + Envoy | Dapr | Temporal + Kafka | ServiceBridge |
246
+ |---|---|---|---|---|
247
+ | RPC data path | Sidecar proxy hop | Sidecar/daemon hop | N/A | **Direct (proxyless)** |
248
+ | Service discovery | K8s control plane | Sidecar placement | External registry | **Built-in registry** |
249
+ | Durable events + DLQ | External broker | Pub/Sub component | Kafka + consumers | **Built-in** |
250
+ | Workflow orchestration | External engine | External engine | Built-in | **Built-in** |
251
+ | Job scheduling | External cron/queue | External scheduler | External scheduler | **Built-in** |
252
+ | Traces + UI | Jaeger/Tempo + dashboards | OTEL backend + dashboards | Temporal UI | **Built-in** |
253
+ | Logs for Grafana | Loki + Promtail pipeline | Log pipeline | Log pipeline | **Built-in Loki API** |
254
+ | Metrics | App/exporter setup | App/exporter setup | Multiple exporters | **Built-in `/metrics`** |
255
+ | Security model | Mesh PKI + policy | Deployment-dependent mTLS | Mixed | **Service keys + auto mTLS** |
256
+ | Operational footprint | Multi-component mesh | Runtime + sidecars | Workflow + broker + DB | **One binary + PostgreSQL** |
257
+
258
+ ---
259
+
260
+ ## API Reference
261
+
262
+ ### `servicebridge(url, serviceKey, service?, globalOpts?)`
263
+
264
+ ```ts
265
+ function servicebridge(
266
+ url: string,
267
+ serviceKey: string,
268
+ service?: string,
269
+ globalOpts?: ServiceBridgeOpts,
270
+ ): ServiceBridgeService
271
+ ```
272
+
273
+ Creates an SDK client instance.
274
+
275
+ `ServiceBridgeOpts`:
276
+
277
+ | Option | Type | Default | Description |
278
+ |---|---|---|---|
279
+ | `timeout` | `number` | `30000` | Default timeout (ms) for SDK operations. |
280
+ | `retries` | `number` | `3` | Default retry count for `rpc()`. |
281
+ | `retryDelay` | `number` | `300` | Base backoff delay (ms) for `rpc()`. |
282
+ | `discoveryRefreshMs` | `number` | `10000` | Discovery refresh period for endpoint updates. |
283
+ | `queueMaxSize` | `number` | `1000` | Max offline queue size for control-plane writes. |
284
+ | `queueOverflow` | `"drop-oldest" \| "drop-newest" \| "error"` | `"drop-oldest"` | Overflow strategy for offline queue. |
285
+ | `heartbeatIntervalMs` | `number` | `10000` | Base heartbeat period for worker registrations. |
286
+ | `workerTransport` | `"tls"` | `"tls"` | Worker server transport. |
287
+ | `workerTLS` | `WorkerTLSOpts` | auto | Explicit cert/key/CA for worker mTLS. |
288
+ | `adminUrl` | `string` | derived from `url` | HTTP admin base URL (TLS provisioning and management API calls). |
289
+ | `captureLogs` | `boolean` | `true` | Forward `console.*` logs to ServiceBridge. |
290
+
291
+ `WorkerTLSOpts`:
292
+
293
+ ```ts
294
+ type WorkerTLSOpts = {
295
+ caCert?: string | Buffer;
296
+ cert?: string | Buffer;
297
+ key?: string | Buffer;
298
+ serverName?: string;
299
+ }
300
+ ```
301
+
302
+ ---
303
+
304
+ ### `rpc(fn, payload?, opts?)`
305
+
306
+ ```ts
307
+ rpc<T = unknown>(fn: string, payload?: unknown, opts?: RpcOpts): Promise<T>
308
+ ```
309
+
310
+ Calls a registered RPC handler on another service. Direct gRPC path, no proxy.
311
+
312
+ `RpcOpts`:
313
+
314
+ | Option | Type | Description |
315
+ |---|---|---|
316
+ | `timeout` | `number` | Call timeout in ms. |
317
+ | `retries` | `number` | Retry count override. |
318
+ | `retryDelay` | `number` | Base retry delay override. |
319
+ | `traceId` | `string` | Explicit trace id. |
320
+ | `parentSpanId` | `string` | Explicit parent span id. |
321
+
322
+ ```ts
323
+ const user = await sb.rpc<{ id: string; name: string }>("users/get", { id: "u_1" }, {
324
+ timeout: 5000,
325
+ retries: 2,
326
+ });
327
+ ```
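As a rough illustration of how `retries` and `retryDelay` could combine, the sketch below computes a retry schedule. It assumes the delay doubles on every attempt and that no jitter is applied; this README does not specify the SDK's actual backoff curve, and `backoffSchedule` is a hypothetical helper rather than an SDK export.

```ts
// Hypothetical helper, not part of @service-bridge/node.
// ASSUMPTION: delay = retryDelay * 2^attempt (pure exponential, no jitter).
function backoffSchedule(retries: number, retryDelay: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(retryDelay * 2 ** attempt);
  }
  return delays;
}

// With the documented defaults (retries: 3, retryDelay: 300 ms):
const defaultSchedule = backoffSchedule(3, 300);
console.log(defaultSchedule); // 300, 600, 1200
```

Under these assumptions the worst-case extra wait for the defaults is 2100 ms, which is worth comparing against the `timeout` you choose.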
328
+
329
+ ---
330
+
331
+ ### `event(topic, payload?, opts?)`
332
+
333
+ ```ts
334
+ event(topic: string, payload?: unknown, opts?: EventOpts): Promise<string>
335
+ ```
336
+
337
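+ Publishes a durable event. Returns the `messageId` when the control plane is reachable; a publish that lands in the offline queue may return an empty string until it is flushed.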
338
+
339
+ `EventOpts`:
340
+
341
+ | Option | Type | Description |
342
+ |---|---|---|
343
+ | `traceId` | `string` | Explicit trace id. |
344
+ | `parentSpanId` | `string` | Explicit parent span id. |
345
+ | `idempotencyKey` | `string` | Idempotency key for dedup-safe publishing. |
346
+ | `headers` | `Record<string, string>` | Custom metadata headers. |
347
+
348
+ ```ts
349
+ await sb.event("orders.created", { orderId: "ord_42" }, {
350
+ idempotencyKey: "order:ord_42",
351
+ headers: { source: "checkout" },
352
+ });
353
+ ```
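The effect of `idempotencyKey` can be pictured with a small in-memory model: a repeated publish with the same key hands back the original message instead of creating a new one. This is only a sketch under assumptions. `DedupingPublisher` is hypothetical, the real runtime persists events in PostgreSQL, and its dedup window and return value for duplicates are not documented here.

```ts
type PublishResult = { messageId: string; duplicate: boolean };

// Hypothetical in-memory model of dedup-by-key.
class DedupingPublisher {
  // idempotencyKey -> messageId of the first accepted publish
  private seen: Record<string, string> = Object.create(null);
  private counter = 0;

  publish(topic: string, idempotencyKey?: string): PublishResult {
    if (idempotencyKey !== undefined && idempotencyKey in this.seen) {
      return { messageId: this.seen[idempotencyKey], duplicate: true };
    }
    this.counter += 1;
    const messageId = `${topic}#${this.counter}`;
    if (idempotencyKey !== undefined) this.seen[idempotencyKey] = messageId;
    return { messageId, duplicate: false };
  }
}

const publisher = new DedupingPublisher();
const first = publisher.publish("orders.created", "order:ord_42");
const again = publisher.publish("orders.created", "order:ord_42");
console.log(first.messageId === again.messageId, again.duplicate); // true true
```

A caller that retries a publish after a timeout can therefore reuse the same key without risking a second delivery, as long as the key is derived from the business entity rather than from the attempt.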
354
+
355
+ ---
356
+
357
+ ### `job(target, opts)`
358
+
359
+ ```ts
360
+ job(target: string, opts: ScheduleOpts): Promise<string>
361
+ ```
362
+
363
+ Registers a scheduled or delayed job.
364
+
365
+ `ScheduleOpts`:
366
+
367
+ | Option | Type | Description |
368
+ |---|---|---|
369
+ | `cron` | `string` | Cron expression. |
370
+ | `delay` | `number` | Delay in ms before execution. |
371
+ | `timezone` | `string` | Timezone for cron execution. |
372
+ | `misfire` | `"fire_now" \| "skip"` | Misfire policy. |
373
+ | `via` | `"event" \| "rpc" \| "workflow"` | Target type. |
374
+ | `retryPolicyJson` | `string` | Retry policy JSON string. |
375
+
376
+ ```ts
377
+ await sb.job("billing/collect", {
378
+ cron: "0 * * * *",
379
+ timezone: "UTC",
380
+ via: "rpc",
381
+ });
382
+ ```
383
+
384
+ ---
385
+
386
+ ### `workflow(name, steps)`
387
+
388
+ ```ts
389
+ workflow(name: string, steps: WorkflowStep[]): Promise<string>
390
+ ```
391
+
392
+ Registers a workflow definition as a DAG of typed steps.
393
+
394
+ ```ts
395
+ await sb.workflow("order.fulfillment", [
396
+ { id: "reserve", type: "rpc", ref: "inventory/reserve" },
397
+ { id: "charge", type: "rpc", ref: "payments/charge", deps: ["reserve"] },
398
+ { id: "notify", type: "event", ref: "orders.fulfilled", deps: ["charge"] },
399
+ ]);
400
+ ```
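To show how `deps` shape execution order, the sketch below groups steps into "waves": a step becomes runnable once every step it depends on has finished. `WorkflowStepSketch` and `executionWaves` are hypothetical and only mirror the fields used in the example above; the runtime's real scheduler is not described in this README.

```ts
// Local mirror of the step shape from the example; the SDK type may have more fields.
type WorkflowStepSketch = { id: string; type: string; ref: string; deps?: string[] };

// Groups step ids into waves; every step in a wave has all of its deps completed.
function executionWaves(steps: WorkflowStepSketch[]): string[][] {
  const done: Record<string, boolean> = Object.create(null);
  let pending = steps.slice();
  const waves: string[][] = [];
  while (pending.length > 0) {
    const ready = pending.filter((s) => (s.deps ?? []).every((d) => done[d] === true));
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency in workflow steps");
    }
    for (const s of ready) done[s.id] = true;
    pending = pending.filter((s) => done[s.id] !== true);
    waves.push(ready.map((s) => s.id));
  }
  return waves;
}

const fulfillmentWaves = executionWaves([
  { id: "reserve", type: "rpc", ref: "inventory/reserve" },
  { id: "charge", type: "rpc", ref: "payments/charge", deps: ["reserve"] },
  { id: "notify", type: "event", ref: "orders.fulfilled", deps: ["charge"] },
]);
console.log(fulfillmentWaves); // three waves: reserve, then charge, then notify
```

Two steps that share the same `deps` would land in the same wave, which is where a DAG differs from a plain sequential list.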
401
+
402
+ ---
403
+
404
+ ### `cancelWorkflowRun(runId)`
405
+
406
+ ```ts
407
+ cancelWorkflowRun(runId: string): Promise<void>
408
+ ```
409
+
410
+ Cancels a running workflow instance.
411
+
412
+ ```ts
413
+ await sb.cancelWorkflowRun("run_01HQ...XYZ");
414
+ ```
415
+
416
+ ---
417
+
418
+ ### `handleRpc(fn, handler, opts?)`
419
+
420
+ ```ts
421
+ handleRpc(
422
+ fn: string,
423
+ handler: (payload: unknown, ctx?: RpcContext) => unknown | Promise<unknown>,
424
+ opts?: HandleRpcOpts,
425
+ ): ServiceBridgeService
426
+ ```
427
+
428
+ Registers an RPC handler. Chainable.
429
+
430
+ `HandleRpcOpts`:
431
+
432
+ | Option | Type | Description |
433
+ |---|---|---|
434
+ | `timeout` | `number` | Handler timeout hint. |
435
+ | `retryable` | `boolean` | Retryable hint for runtime policies. |
436
+ | `concurrency` | `number` | Worker-side concurrency hint. |
437
+ | `schema` | `RpcSchemaOpts` | Inline protobuf schema for binary encode/decode. |
438
+ | `allowedCallers` | `string[]` | Allow-list of caller service names. |
439
+
440
+ ```ts
441
+ sb.handleRpc("ai/generate", async (payload: { prompt: string }, ctx) => {
442
+ await ctx?.stream.write({ token: "Hello" }, "output");
443
+ await ctx?.stream.write({ token: " world" }, "output");
444
+ return { text: "Hello world" };
445
+ });
446
+ ```
447
+
448
+ ---
449
+
450
+ ### `handleEvent(pattern, handler, opts?)`
451
+
452
+ ```ts
453
+ handleEvent(
454
+ pattern: string,
455
+ handler: (payload: unknown, ctx: EventContext) => void | Promise<void>,
456
+ opts?: HandleEventOpts,
457
+ ): ServiceBridgeService
458
+ ```
459
+
460
+ Registers an event consumer handler. Chainable.
461
+
462
+ `HandleEventOpts`:
463
+
464
+ | Option | Type | Description |
465
+ |---|---|---|
466
+ | `groupName` | `string` | Consumer group name. Default: `<service>:<pattern>`. |
467
+ | `concurrency` | `number` | Concurrency hint for consumer processing. |
468
+ | `prefetch` | `number` | Prefetch hint. |
469
+ | `retryPolicyJson` | `string` | Retry policy JSON string. |
470
+ | `filterExpr` | `string` | Server-side filter expression. |
471
+
472
+ `EventContext` helpers:
473
+
474
+ - `ctx.retry(delayMs?)` — ask for redelivery with optional delay
475
+ - `ctx.reject(reason)` — reject without retry
476
+ - `ctx.refs` — metadata (`topic`, `groupName`, `messageId`, `attempt`, `headers`)
477
+ - `ctx.stream.write(...)` — append real-time chunks to run stream
478
+
479
+ ```ts
480
+ sb.handleEvent("orders.*", async (payload, ctx) => {
481
+ const body = payload as { orderId?: string };
482
+ if (!body.orderId) {
483
+ ctx.reject("missing_order_id");
484
+ return;
485
+ }
486
+ await ctx.stream.write({ status: "processing", orderId: body.orderId }, "progress");
487
+ });
488
+ ```
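The sketch below illustrates two details from this section: the documented default consumer group name, and one plausible reading of a pattern such as `orders.*`. The wildcard rule is an assumption (dot-separated topics where `*` matches exactly one segment); this README does not define the matching grammar, and both helpers are hypothetical.

```ts
// Documented default: the consumer group name is "<service>:<pattern>".
function defaultGroupName(service: string, pattern: string): string {
  return `${service}:${pattern}`;
}

// ASSUMPTION: topics are dot-separated and "*" matches exactly one segment.
function topicMatches(pattern: string, topic: string): boolean {
  const patternParts = pattern.split(".");
  const topicParts = topic.split(".");
  if (patternParts.length !== topicParts.length) return false;
  return patternParts.every((seg, i) => seg === "*" || seg === topicParts[i]);
}

console.log(defaultGroupName("notifications", "orders.*")); // notifications:orders.*
console.log(topicMatches("orders.*", "orders.completed")); // true
console.log(topicMatches("orders.*", "billing.completed")); // false
```

If the runtime's wildcard also spans multiple segments, only `topicMatches` would need to change; the group-name default is independent of it.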
489
+
490
+ ---
491
+
492
+ ### `serve(opts?)`
493
+
494
+ ```ts
495
+ serve(opts?: ServeOpts): Promise<void>
496
+ ```
497
+
498
+ Starts the worker gRPC server and registers handlers with the control plane.
499
+
500
+ `ServeOpts`:
501
+
502
+ | Option | Type | Description |
503
+ |---|---|---|
504
+ | `host` | `string` | Bind host. Default: `127.0.0.1`. |
505
+ | `instanceId` | `string` | Stable worker instance identifier. |
506
+ | `weight` | `number` | Scheduling/discovery weight hint. |
507
+ | `transport` | `"tls"` | Worker transport. |
508
+ | `tls` | `WorkerTLSOpts` | Per-serve TLS override. |
509
+
510
+ ```ts
511
+ await sb.serve({
512
+ host: "0.0.0.0",
513
+ instanceId: process.env.HOSTNAME,
514
+ });
515
+ ```
516
+
517
+ ---
518
+
519
+ ### `stop()`
520
+
521
+ ```ts
522
+ stop(): void
523
+ ```
524
+
525
+ Stops the worker server, heartbeats, channels, and SDK internals.
526
+
527
+ ---
528
+
529
+ ### `startHttpSpan(opts)`
530
+
531
+ ```ts
532
+ startHttpSpan(opts: {
533
+ method: string;
534
+ path: string;
535
+ traceId?: string;
536
+ parentSpanId?: string;
537
+ }): HttpSpan
538
+ ```
539
+
540
+ Manual HTTP tracing primitive.
541
+
542
+ ```ts
543
+ const span = sb.startHttpSpan({ method: "GET", path: "/health" });
544
+ try { // ... handle the request here, then close the span
545
+ span.end({ statusCode: 200, success: true });
546
+ } catch (e) {
547
+ span.end({ success: false, error: String(e) });
548
+ }
549
+ ```
550
+
551
+ ---
552
+
553
+ ### `registerHttpEndpoint(opts)`
554
+
555
+ ```ts
556
+ registerHttpEndpoint(opts: {
557
+ method: string;
558
+ route: string;
559
+ instanceId?: string;
560
+ endpoint?: string;
561
+ allowedCallers?: string[];
562
+ }): Promise<void>
563
+ ```
564
+
565
+ Registers HTTP route metadata in the ServiceBridge service catalog.
566
+
567
+ ```ts
568
+ await sb.registerHttpEndpoint({ method: "GET", route: "/users/:id" });
569
+ ```
570
+
571
+ ---
572
+
573
+ ### `watchRun(runId, opts?)`
574
+
575
+ ```ts
576
+ watchRun(runId: string, opts?: WatchRunOpts): AsyncIterable<RunStreamEvent>
577
+ ```
578
+
579
+ Subscribes to a run stream with replay and live updates.
580
+
581
+ `WatchRunOpts`:
582
+
583
+ | Option | Type | Default | Description |
584
+ |---|---|---|---|
585
+ | `key` | `string` | `"default"` | Stream key filter. |
586
+ | `fromSequence` | `number` | `0` | Replay from sequence cursor. |
587
+
588
+ ```ts
589
+ for await (const evt of sb.watchRun(runId, { key: "output", fromSequence: 0 })) {
590
+ if (evt.type === "chunk") {
591
+ process.stdout.write(String((evt.data as { token?: string }).token ?? ""));
592
+ }
593
+ }
594
+ ```
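Replay with `fromSequence` can be modeled as a filter over a stored chunk log: a reconnecting consumer asks for everything at or after its cursor for one stream key. This is a simplified sketch. `ChunkSketch` and `replayFrom` are hypothetical, and treating `fromSequence` as inclusive is an assumption.

```ts
type ChunkSketch = { sequence: number; key: string; data: unknown };

// Returns stored chunks for one key at or after the cursor, in sequence order.
// ASSUMPTION: fromSequence is inclusive.
function replayFrom(log: ChunkSketch[], key: string, fromSequence: number): ChunkSketch[] {
  return log
    .filter((c) => c.key === key && c.sequence >= fromSequence)
    .sort((a, b) => a.sequence - b.sequence);
}

const storedChunks: ChunkSketch[] = [
  { sequence: 0, key: "output", data: { token: "Hello" } },
  { sequence: 1, key: "progress", data: { status: "charging" } },
  { sequence: 2, key: "output", data: { token: " world" } },
];

// A client that already saw sequence 0 resumes from 1 and gets only " world".
const resumed = replayFrom(storedChunks, "output", 1);
console.log(resumed.map((c) => c.sequence)); // only sequence 2
```

Tracking the last sequence you processed and passing the next one on reconnect is what keeps a resumed stream free of gaps and duplicates under this model.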
595
+
596
+ ---
597
+
598
+ ### Trace Utilities
599
+
600
+ #### `getTraceContext()`
601
+
602
+ ```ts
603
+ getTraceContext(): { traceId: string; spanId: string } | undefined
604
+ ```
605
+
606
+ Returns the current async-local trace context.
607
+
608
+ #### `runWithTraceContext(ctx, fn)`
609
+
610
+ ```ts
611
+ runWithTraceContext<T>(ctx: { traceId: string; spanId: string }, fn: () => T): T
612
+ ```
613
+
614
+ Runs a function inside an explicit trace context.
615
+
616
+ ```ts
617
+ await runWithTraceContext({ traceId: "trace-1", spanId: "span-1" }, async () => {
618
+ await sb.event("audit.log", { action: "user.login" });
619
+ });
620
+ ```
621
+
622
+ ---
623
+
624
+ ## HTTP Plugins
625
+
626
+ ### Express (`@service-bridge/node/express`)
627
+
628
+ ```bash
629
+ npm install express
630
+ ```
631
+
632
+ ```ts
633
+ import express from "express";
634
+ import { servicebridge } from "@service-bridge/node";
635
+ import { servicebridgeMiddleware, registerExpressRoutes } from "@service-bridge/node/express";
636
+
637
+ const sb = servicebridge(process.env.SERVICEBRIDGE_URL!, process.env.SERVICE_KEY!, "api");
638
+ const app = express();
639
+
640
+ app.use(servicebridgeMiddleware({
641
+ client: sb,
642
+ excludePaths: ["/health"],
643
+ autoRegister: true,
644
+ }));
645
+
646
+ app.get("/users/:id", async (req, res) => {
647
+ const user = await req.servicebridge.rpc("users/get", { id: req.params.id });
648
+ res.json(user);
649
+ });
650
+ ```
651
+
652
+ #### `servicebridgeMiddleware(options)`
653
+
654
+ ```ts
655
+ servicebridgeMiddleware(options: {
656
+ client: ServiceBridgeService;
657
+ excludePaths?: string[];
658
+ propagateTraceHeader?: boolean;
659
+ autoRegister?: boolean;
660
+ }): express.RequestHandler
661
+ ```
662
+
663
+ - Attaches `req.servicebridge`, `req.traceId`, `req.spanId`
664
+ - Starts/ends HTTP span automatically
665
+ - Optionally sets `x-trace-id` response header
666
+ - Optionally auto-registers route pattern in catalog on first hit
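The per-request behaviour listed above can be approximated with a small pure function: decide on a trace id, mint a span id, and optionally echo the trace id in a response header. This is a sketch under assumptions. Reusing an inbound `x-trace-id` header is a guess about the middleware, the id lengths are arbitrary, and `beginRequestTrace` is a hypothetical name.

```ts
type HeaderMap = Record<string, string | undefined>;

// Builds a lowercase hex string of the given byte length (not cryptographically secure).
function randomHex(bytes: number): string {
  let out = "";
  for (let i = 0; i < bytes; i++) {
    const b = Math.floor(Math.random() * 256);
    out += (b < 16 ? "0" : "") + b.toString(16);
  }
  return out;
}

// ASSUMPTION: an inbound x-trace-id is reused; otherwise a new id is generated.
function beginRequestTrace(incoming: HeaderMap, propagateTraceHeader: boolean) {
  const traceId = incoming["x-trace-id"] ?? randomHex(16);
  const spanId = randomHex(8);
  const responseHeaders: Record<string, string> = {};
  if (propagateTraceHeader) responseHeaders["x-trace-id"] = traceId;
  return { traceId, spanId, responseHeaders };
}

const traced = beginRequestTrace({ "x-trace-id": "abc123" }, true);
console.log(traced.traceId, traced.responseHeaders["x-trace-id"]); // abc123 abc123
```

Node lowercases incoming header names, which is why the lookup uses `x-trace-id` rather than a mixed-case key.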
667
+
668
+ #### `registerExpressRoutes(app, client, opts?)`
669
+
670
+ Registers routes in the catalog eagerly, without waiting for the first request.
671
+
672
+ ```ts
673
+ await registerExpressRoutes(app, sb, {
674
+ endpoint: "http://10.0.0.5:3000",
675
+ allowedCallers: ["api-gateway"],
676
+ excludePaths: ["/health"],
677
+ });
678
+ ```
679
+
680
+ ---
681
+
682
+ ### Fastify (`@service-bridge/node/fastify`)
683
+
684
+ ```bash
685
+ npm install fastify
686
+ ```
687
+
688
+ ```ts
689
+ import Fastify from "fastify";
690
+ import { servicebridge } from "@service-bridge/node";
691
+ import { servicebridgePlugin, wrapHandler } from "@service-bridge/node/fastify";
692
+
693
+ const sb = servicebridge(process.env.SERVICEBRIDGE_URL!, process.env.SERVICE_KEY!, "api");
694
+ const app = Fastify();
695
+
696
+ await app.register(servicebridgePlugin, {
697
+ client: sb,
698
+ excludePaths: ["/health"],
699
+ autoRegister: true,
700
+ });
701
+
702
+ app.get("/users/:id", wrapHandler(async (request, reply) => {
703
+ const user = await request.servicebridge.rpc("users/get", {
704
+ id: (request.params as any).id,
705
+ });
706
+ return reply.send(user);
707
+ }));
708
+ ```
709
+
710
+ #### `servicebridgePlugin(fastify, options)`
711
+
712
+ ```ts
713
+ servicebridgePlugin(fastify, {
714
+ client,
715
+ excludePaths?,
716
+ propagateTraceHeader?,
717
+ autoRegister?,
718
+ register?: {
719
+ instanceId?,
720
+ endpoint?,
721
+ allowedCallers?,
722
+ excludePaths?,
723
+ },
724
+ })
725
+ ```
726
+
727
+ - Decorates `request.servicebridge`, `request.traceId`, `request.spanId`
728
+ - Traces HTTP lifecycle via hooks
729
+ - Auto-registers routes on `onRoute` before traffic
730
+
731
+ #### `wrapHandler(handler)`
732
+
733
+ Runs a Fastify handler inside the current trace context so downstream SDK calls inherit the trace.
734
+
735
+ ---
736
+
737
+ ## Configuration
738
+
739
+ ### TLS behavior
740
+
741
+ - Worker transport is TLS-only.
742
+ - If `workerTLS` is not provided, the SDK auto-provisions certs through the admin API.
743
+ - `workerTLS.cert` and `workerTLS.key` must be provided together.
744
+ - `serve({ tls })` overrides global `workerTLS` for a specific worker instance.
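The TLS rules above can be read as a small resolution function. The sketch below is one interpretation, not the SDK's code: `resolveWorkerTLS` and `WorkerTLSSketch` are hypothetical, the override is assumed to replace the global options wholesale rather than merge with them, and `Uint8Array` stands in for `Buffer` to keep the example free of Node typings.

```ts
type PemSketch = string | Uint8Array;
type WorkerTLSSketch = { caCert?: PemSketch; cert?: PemSketch; key?: PemSketch; serverName?: string };

// Picks the TLS options for one worker and enforces the cert/key pairing rule.
function resolveWorkerTLS(
  globalTLS?: WorkerTLSSketch,
  serveTLS?: WorkerTLSSketch,
): WorkerTLSSketch | "auto-provision" {
  // ASSUMPTION: serve({ tls }) replaces workerTLS as a whole (no field merge).
  const chosen = serveTLS ?? globalTLS;
  if (chosen === undefined) return "auto-provision";
  const hasCert = chosen.cert !== undefined;
  const hasKey = chosen.key !== undefined;
  if (hasCert !== hasKey) {
    throw new Error("workerTLS.cert and workerTLS.key must be provided together");
  }
  return chosen;
}

console.log(resolveWorkerTLS()); // auto-provision
```

What happens when only `caCert` is supplied is not covered by the rules above, so the sketch simply passes such an object through.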
745
+
746
+ ### Offline queue behavior
747
+
748
+ When the control plane is unavailable, the SDK queues write operations (`event`, `job`, `workflow`, telemetry writes).
749
+
750
+ - Queue size: `queueMaxSize` (default: 1000)
751
+ - Overflow policy: `queueOverflow` (default: `"drop-oldest"`)
752
+ - Return values for queued writes may be empty strings until flushed
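The three `queueOverflow` strategies can be sketched with a bounded in-memory queue. `BoundedQueue` is a hypothetical stand-in for the SDK's internal queue, and reading `"drop-newest"` as "discard the incoming write" is an assumption.

```ts
type QueueOverflow = "drop-oldest" | "drop-newest" | "error";

// Hypothetical bounded queue illustrating the overflow strategies.
class BoundedQueue<T> {
  private items: T[] = [];
  private maxSize: number;
  private overflow: QueueOverflow;

  constructor(maxSize: number, overflow: QueueOverflow) {
    this.maxSize = maxSize;
    this.overflow = overflow;
  }

  // Returns true if the item was stored, false if it was dropped.
  push(item: T): boolean {
    if (this.items.length < this.maxSize) {
      this.items.push(item);
      return true;
    }
    if (this.overflow === "error") throw new Error("offline queue is full");
    if (this.overflow === "drop-newest") return false; // discard the incoming write
    this.items.shift(); // drop-oldest: evict the head to make room
    this.items.push(item);
    return true;
  }

  // Hands back everything queued, in order, and empties the queue.
  flush(): T[] {
    const out = this.items;
    this.items = [];
    return out;
  }
}

const offlineQueue = new BoundedQueue<string>(2, "drop-oldest");
offlineQueue.push("a");
offlineQueue.push("b");
offlineQueue.push("c"); // evicts "a"
console.log(offlineQueue.flush()); // b, c
```

With the default `"drop-oldest"` the most recent writes survive an outage; choose `"error"` if silently losing a queued event is worse than failing the call.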
753
+
754
+ ---
755
+
756
+ ## Environment Variables
757
+
758
+ The SDK takes its configuration from the arguments you pass to `servicebridge(...)`. A common setup reads them from environment variables:
759
+
760
+ | Variable | Required | Example | Description |
761
+ |---|---|---|---|
762
+ | `SERVICEBRIDGE_URL` | yes | `127.0.0.1:14445` | gRPC control plane URL |
763
+ | `SERVICE_KEY` | yes | `sb_live_...` | Service authentication key |
764
+ | `SERVICEBRIDGE_SERVICE` | yes (worker mode) | `orders` | Service name in registry |
765
+ | `SERVICEBRIDGE_ADMIN_URL` | optional | `http://127.0.0.1:14444` | Explicit admin API base URL |
766
+
767
+ ```ts
768
+ const sb = servicebridge(
769
+ process.env.SERVICEBRIDGE_URL ?? "127.0.0.1:14445",
770
+ process.env.SERVICE_KEY!,
771
+ process.env.SERVICEBRIDGE_SERVICE ?? "orders",
772
+ {
773
+ adminUrl: process.env.SERVICEBRIDGE_ADMIN_URL,
774
+ },
775
+ );
776
+ ```
777
+
778
+ ---
779
+
780
+ ## Error Handling
781
+
782
+ `ServiceBridgeError` is exported for normalized SDK and runtime errors.
783
+
784
+ ```ts
785
+ import { servicebridge, ServiceBridgeError } from "@service-bridge/node";
786
+
787
+ try {
788
+ await sb.rpc("payments/charge", { orderId: "ord_1" });
789
+ } catch (e) {
790
+ if (e instanceof ServiceBridgeError) {
791
+ console.error(e.component, e.operation, e.severity, e.code);
792
+ }
793
+ throw e;
794
+ }
795
+ ```
796
+
797
+ ---
798
+
799
+ ## When to Use / When Not to Use
800
+
801
+ ### ServiceBridge is a good fit when you:
802
+
803
+ - Have **3+ microservices** that need to communicate via RPC, events, or both
804
+ - Want **RPC + events + workflows + jobs** without managing separate infrastructure for each
805
+ - Need **end-to-end tracing** across all communication patterns in one timeline
806
+ - Want to **eliminate sidecar proxies** and reduce operational overhead
807
+ - Need **durable event delivery** with retry, DLQ, and replay without running a broker
808
+ - Are building **AI/LLM pipelines** and need realtime streaming with replay
809
+
810
+ ### Consider alternatives when you:
811
+
812
+ - Run a **single monolith** with no service decomposition plans
813
+ - Need **ultra-high-throughput event streaming** (100K+ msg/s sustained) — Kafka is purpose-built for this
814
+ - Need a **full API gateway** with rate limiting, auth plugins, and request transformation — use Kong/Envoy Gateway
815
+ - Already have a **mature Istio/Linkerd mesh** and only need traffic management (no events/workflows/jobs)
816
+ - Need **multi-region event replication** — ServiceBridge currently targets single-region deployments
817
+
818
+ ---
819
+
820
+ ## FAQ
821
+
822
+ **How does ServiceBridge handle service failures?**
823
+ RPC calls have configurable retries with exponential backoff. Events are durable (PostgreSQL-backed) with at-least-once delivery per consumer group. Failed deliveries are retried according to policy, then moved to DLQ. Workflows track step state and can be resumed.
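The retry-then-DLQ path described in this answer can be traced with a tiny simulation. It assumes the policy is a plain attempt cap; the actual fields accepted by `retryPolicyJson` are not documented in this README, and `simulateDelivery` is a hypothetical helper.

```ts
// ASSUMPTION: the retry policy is just a maximum number of delivery attempts.
function simulateDelivery(failuresBeforeSuccess: number, maxAttempts: number): string[] {
  const trail: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (attempt > failuresBeforeSuccess) {
      trail.push(`attempt ${attempt}: delivered`);
      return trail;
    }
    trail.push(`attempt ${attempt}: failed`);
  }
  trail.push("moved to DLQ"); // attempts exhausted without a success
  return trail;
}

console.log(simulateDelivery(1, 3)); // fails once, then is delivered on attempt 2
console.log(simulateDelivery(5, 3)); // fails three times, then is moved to DLQ
```

Because delivery is at-least-once, a handler may see the same message more than once, so it should be safe to run repeatedly.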
824
+
825
+ **Is there vendor lock-in?**
826
+ ServiceBridge is self-hosted. The runtime is a single Go binary + PostgreSQL. SDK calls map to standard patterns (RPC, pub/sub, cron) — migrating away means replacing SDK calls with equivalent library calls.
827
+
828
+ **How does tracing work without an OTEL collector?**
829
+ The SDK automatically reports trace spans for every RPC call, event publish/delivery, workflow step, and HTTP request. The runtime stores traces in PostgreSQL and serves them via the built-in dashboard; logs are exposed through a Loki-compatible API for Grafana integration.
830
+
831
+ **Can I use ServiceBridge alongside existing infrastructure?**
832
+ Yes. You can adopt incrementally — start with RPC between two services, add events later, then workflows. ServiceBridge doesn't require replacing your existing broker or mesh all at once.
833
+
834
+ **What happens when the control plane is down?**
835
+ In-flight direct RPC calls continue working (they go service-to-service, not through the control plane). New discovery lookups, event publishes, and telemetry writes are queued in the SDK offline queue and flushed when the control plane recovers.
836
+
837
+ **What databases does the runtime support?**
838
+ PostgreSQL 16+. The runtime uses PostgreSQL for all persistence: traces, events, workflows, jobs, service registry, and configuration.
839
+
840
+ ---
841
+
842
+ ## Community and Support
843
+
844
+ - Website: [servicebridge.dev](https://servicebridge.dev)
845
+ - GitHub: [github.com/service-bridge](https://github.com/service-bridge)
846
+ - SDK monorepo: [README.md](../README.md)
847
+
848
+ ---
849
+
850
+ ## License
851
+
852
+ Free for non-commercial use. Commercial use requires a separate license. See [LICENSE](../LICENSE).
853
+
854
+ Copyright (c) 2026 Eugene Surkov.