service-bridge 1.9.0-dev.52 → 2.0.0-alpha
- package/LICENSE +21 -0
- package/README.md +382 -1096
- package/dist/http/express/index.d.ts +31 -0
- package/dist/http/express/index.js +2765 -0
- package/dist/http/express/index.js.map +1 -0
- package/dist/http/fastify/index.d.ts +38 -0
- package/dist/http/fastify/index.js +2726 -0
- package/dist/http/fastify/index.js.map +1 -0
- package/dist/http/hono/index.d.ts +39 -0
- package/dist/http/hono/index.js +2706 -0
- package/dist/http/hono/index.js.map +1 -0
- package/dist/index.d.ts +27 -0
- package/dist/index.js +14885 -3140
- package/dist/index.js.map +1 -0
- package/dist/service-bridge-CPmirNES.d.ts +2261 -0
- package/package.json +107 -123
- package/dist/express.d.ts +0 -51
- package/dist/express.js +0 -129
- package/dist/fastify.d.ts +0 -43
- package/dist/fastify.js +0 -122
- package/dist/trace.d.ts +0 -19
package/README.md
CHANGED
|
@@ -1,1337 +1,623 @@
|
|
|
1
|
-
<!--
|
|
1
|
+
<!--
|
|
2
|
+
Keywords: service-bridge, ServiceBridge, microservices, Node.js SDK, TypeScript SDK, Bun,
|
|
3
|
+
gRPC, mTLS, RPC framework, durable events, pub/sub, message broker alternative, RabbitMQ alternative,
|
|
4
|
+
workflow engine, saga, orchestration, Temporal alternative, job scheduler, cron, distributed tracing,
|
|
5
|
+
observability, OpenTelemetry alternative, Jaeger alternative, service mesh alternative, Istio alternative,
|
|
6
|
+
self-hosted, PostgreSQL, Express, Fastify, Hono, circuit breaker, idempotency, retries, load balancing.
|
|
7
|
+
-->
|
|
2
8
|
|
|
3
9
|
# service-bridge
|
|
4
10
|
|
|
5
|
-
[npm](https://www.npmjs.com/package/service-bridge)
|
|
12
|
+
[License](./LICENSE)
|
|
13
|
+
[TypeScript](https://www.typescriptlang.org/)
|
|
14
|
+
[Node.js](https://nodejs.org/)
|
|
9
15
|
|
|
10
|
-
**The
|
|
16
|
+
**The Node.js / Bun SDK for [ServiceBridge](https://servicebridge.dev) — RPC, durable events, workflows, jobs, streaming and full observability over one self-hosted runtime. No broker. No sidecar. No tracing stack. Just one Go binary plus PostgreSQL.**
|
|
11
17
|
|
|
12
|
-
|
|
18
|
+
You declare what your service handles and what it calls. ServiceBridge does the rest: provisions an mTLS identity, opens the connection, registers your handlers, and routes every RPC, event, job and workflow step — with tracing, metrics and access policy built in.
|
|
13
19
|
|
|
14
20
|
```
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
│
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
│
|
|
23
|
-
│
|
|
24
|
-
│
|
|
25
|
-
|
|
21
|
+
BEFORE                                       AFTER
|
|
22
|
+
|
|
23
|
+
┌─────────────────────┐
|
|
24
|
+
│ Istio + Envoy       │ ← mesh / mTLS
|
|
25
|
+
│ RabbitMQ / Kafka    │ ← events             ┌──────────────────────┐
|
|
26
|
+
│ Temporal            │ ← workflows          │                      │
|
|
27
|
+
│ a cron scheduler    │ ← jobs               │    ServiceBridge     │
|
|
28
|
+
│ gRPC plumbing       │ ← RPC          ═══►  │  runtime (1 binary)  │
|
|
29
|
+
│ Jaeger / Tempo      │ ← tracing            │          +           │
|
|
30
|
+
│ Prometheus wiring   │ ← metrics            │      PostgreSQL      │
|
|
31
|
+
│ Loki                │ ← logs               │                      │
|
|
32
|
+
│ a load balancer     │ ← LB / retries       └──────────────────────┘
|
|
33
|
+
│ service registry    │ ← discovery
|
|
34
|
+
└─────────────────────┘
|
|
35
|
+
10+ moving parts                             2 things to run
|
|
26
36
|
```
|
|
27
37
|
|
|
28
|
-
|
|
38
|
+
---
|
|
39
|
+
|
|
40
|
+
## Table of contents
|
|
29
41
|
|
|
30
|
-
- [Why ServiceBridge](#why-servicebridge)
|
|
31
|
-
- [Use Cases](#use-cases)
|
|
32
|
-
- [Quick Start](#quick-start)
|
|
33
42
|
- [Install](#install)
|
|
34
|
-
- [
|
|
35
|
-
- [
|
|
36
|
-
- [
|
|
37
|
-
- [
|
|
38
|
-
- [
|
|
39
|
-
- [
|
|
43
|
+
- [Why ServiceBridge](#why-servicebridge)
|
|
44
|
+
- [Use cases](#use-cases)
|
|
45
|
+
- [Quick start](#quick-start)
|
|
46
|
+
- [Runtime setup](#runtime-setup)
|
|
47
|
+
- [End-to-end example](#end-to-end-example)
|
|
48
|
+
- [Platform features](#platform-features)
|
|
49
|
+
- [How it compares](#how-it-compares)
|
|
50
|
+
- [API reference](#api-reference)
|
|
51
|
+
- [RPC](#rpc)
|
|
52
|
+
- [Events](#events)
|
|
53
|
+
- [Jobs](#jobs)
|
|
54
|
+
- [Workflows](#workflows)
|
|
55
|
+
- [Streaming](#streaming)
|
|
56
|
+
- [Telemetry](#telemetry)
|
|
57
|
+
- [HTTP](#http)
|
|
58
|
+
- [HTTP plugins](#http-plugins)
|
|
40
59
|
- [Configuration](#configuration)
|
|
41
|
-
- [
|
|
42
|
-
- [Error Handling](#error-handling)
|
|
43
|
-
- [When to Use / When Not to Use](#when-to-use--when-not-to-use)
|
|
60
|
+
- [Error handling](#error-handling)
|
|
44
61
|
- [FAQ](#faq)
|
|
45
|
-
- [Community
|
|
62
|
+
- [Community](#community)
|
|
46
63
|
- [License](#license)
|
|
47
64
|
|
|
48
65
|
---
|
|
49
66
|
|
|
50
|
-
##
|
|
51
|
-
|
|
52
|
-
| Problem | Without ServiceBridge | With ServiceBridge |
|
|
53
|
-
|---|---|---|
|
|
54
|
-
| Service-to-service calls | Istio/Envoy sidecar proxy per pod | **Direct SDK-to-worker gRPC, zero proxy hops** |
|
|
55
|
-
| Async messaging | Kafka/RabbitMQ + retry logic + DLQ setup | **Built-in durable events with retry, DLQ, replay** |
|
|
56
|
-
| Background jobs | Bull/BullMQ + Redis + cron daemon | **Built-in cron and delayed jobs** |
|
|
57
|
-
| Workflow orchestration | Temporal/Conductor cluster + persistence | **Built-in DAG workflows** |
|
|
58
|
-
| Distributed tracing | Jaeger/Tempo + OTEL collector + dashboards | **Built-in traces + realtime UI** |
|
|
59
|
-
| Service discovery | Consul/etcd + DNS glue | **Built-in registry + health-aware balancing** |
|
|
60
|
-
| mTLS | cert-manager + Vault PKI | **Auto-provisioned certs from service key** |
|
|
61
|
-
|
|
62
|
-
**Result**: `10 tools → 1 runtime`. One Go binary + PostgreSQL replaces the entire stack.
|
|
63
|
-
|
|
64
|
-
---
|
|
65
|
-
|
|
66
|
-
## Use Cases
|
|
67
|
-
|
|
68
|
-
**Microservice communication** — Replace sidecar mesh with direct RPC calls. Get sub-millisecond overhead instead of double proxy hop latency.
|
|
69
|
-
|
|
70
|
-
**Event-driven architecture** — Publish durable events with fan-out, retries, DLQ, idempotency, and server-side filtering. No broker infrastructure to manage.
|
|
71
|
-
|
|
72
|
-
**Background job scheduling** — Cron jobs, delayed execution, and job-triggered workflows in a single API. No Redis, no separate queue workers.
|
|
73
|
-
|
|
74
|
-
**Saga / distributed transactions** — DAG workflows with typed steps (`rpc`, `event`, `event_wait`, `sleep`, child workflow). Compensations and rollbacks via workflow step dependencies.
|
|
75
|
-
|
|
76
|
-
**AI agent orchestration** — Stream LLM tokens via realtime trace streams with replay. Orchestrate multi-step AI pipelines as workflows.
|
|
77
|
-
|
|
78
|
-
**Full-stack observability** — Every RPC call, event delivery, workflow step, and HTTP request traced automatically. One timeline, one dashboard. Prometheus metrics and Loki-compatible log API included.
|
|
79
|
-
|
|
80
|
-
---
|
|
81
|
-
|
|
82
|
-
## Quick Start
|
|
83
|
-
|
|
84
|
-
### 1. Install
|
|
67
|
+
## Install
|
|
85
68
|
|
|
86
|
-
```
|
|
69
|
+
```sh
|
|
87
70
|
npm i service-bridge
|
|
88
71
|
# or
|
|
89
72
|
bun add service-bridge
|
|
90
73
|
```
|
|
91
74
|
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
import { ServiceBridge } from "service-bridge";
|
|
96
|
-
|
|
97
|
-
const sb = new ServiceBridge(
|
|
98
|
-
process.env.SERVICEBRIDGE_URL ?? "localhost:14445",
|
|
99
|
-
process.env.SERVICEBRIDGE_SERVICE_KEY!,
|
|
100
|
-
);
|
|
101
|
-
|
|
102
|
-
sb.rpc.handle("payment.charge", async (payload: { orderId: string; amount: number }) => {
|
|
103
|
-
return { ok: true, txId: `tx_${Date.now()}`, orderId: payload.orderId };
|
|
104
|
-
});
|
|
105
|
-
|
|
106
|
-
await sb.start({ host: "localhost" });
|
|
107
|
-
```
|
|
108
|
-
|
|
109
|
-
### 3. Call it from another service
|
|
75
|
+
- **Runtime:** Node.js 18+ or any current Bun release.
|
|
76
|
+
- **Types:** included, written in TypeScript 5.
|
|
77
|
+
- **Backend:** a running ServiceBridge runtime (gRPC control plane on `:14445`) backed by PostgreSQL 18+. See [Runtime setup](#runtime-setup).
|
|
110
78
|
|
|
111
79
|
```ts
|
|
112
80
|
import { ServiceBridge } from "service-bridge";
|
|
113
81
|
|
|
114
82
|
const sb = new ServiceBridge(
|
|
115
|
-
|
|
116
|
-
|
|
83
|
+
"localhost:14445", // runtime control-plane address
|
|
84
|
+
"sb_key_...", // bootstrap service key from the runtime
|
|
117
85
|
);
|
|
118
|
-
|
|
119
|
-
const result = await sb.rpc.invoke<{ ok: boolean; txId: string }>("payment.charge", {
|
|
120
|
-
orderId: "ord_42",
|
|
121
|
-
amount: 4990,
|
|
122
|
-
});
|
|
123
|
-
|
|
124
|
-
console.log(result.txId); // tx_1711234567890
|
|
125
|
-
```
|
|
126
|
-
|
|
127
|
-
That's it. No broker, no sidecar, no proxy — direct gRPC call between services.
|
|
128
|
-
|
|
129
|
-
---
|
|
130
|
-
|
|
131
|
-
## Runtime Setup
|
|
132
|
-
|
|
133
|
-
The SDK connects to a ServiceBridge runtime. The fastest way to start:
|
|
134
|
-
|
|
135
|
-
```bash
|
|
136
|
-
bash <(curl -fsSL https://servicebridge.dev/install.sh)
|
|
137
86
|
```
|
|
138
87
|
|
|
139
|
-
|
|
140
|
-
|
|
141
|
-
For manual Docker Compose setup, configuration reference, and all runtime environment variables, see the **[Runtime Setup](../README.md#runtime-setup)** section in the main SDK README.
|
|
88
|
+
The third constructor argument is an [options](#configuration) object. The SDK reads **no environment variables** — every knob is a constructor option, so you stay in control of where config comes from.
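Because the SDK reads no environment variables itself, the application has to map its own configuration onto the constructor arguments. A minimal sketch of that mapping, assuming the variable names `SERVICEBRIDGE_URL` and `SERVICEBRIDGE_SERVICE_KEY` from the 1.9.0-dev.52 README examples; `loadBridgeConfig` and `BridgeConfig` are hypothetical names, not part of the package:

```ts
// Hypothetical helper: the SDK does not read the environment,
// so the application decides where the two constructor arguments come from.
type Env = Record<string, string | undefined>;

interface BridgeConfig {
  url: string;        // runtime control-plane address
  serviceKey: string; // bootstrap service key from the runtime
}

function loadBridgeConfig(env: Env): BridgeConfig {
  const serviceKey = env["SERVICEBRIDGE_SERVICE_KEY"];
  if (!serviceKey) {
    // Fail fast: there is no sensible default for a credential.
    throw new Error("SERVICEBRIDGE_SERVICE_KEY is required");
  }
  return {
    // Fall back to the local control-plane port used throughout the README.
    url: env["SERVICEBRIDGE_URL"] ?? "localhost:14445",
    serviceKey,
  };
}

// Usage (not executed here):
//   const { url, serviceKey } = loadBridgeConfig(process.env);
//   const sb = new ServiceBridge(url, serviceKey);
```

Passing the environment in as a parameter keeps the helper testable and makes it easy to swap in a secrets manager or a config file later.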
|
|
142
89
|
|
|
143
90
|
---
|
|
144
91
|
|
|
145
|
-
##
|
|
146
|
-
|
|
147
|
-
A complete order flow: HTTP request → RPC → Event → Event handler with streaming.
|
|
148
|
-
|
|
149
|
-
```ts
|
|
150
|
-
import { ServiceBridge } from "service-bridge";
|
|
151
|
-
|
|
152
|
-
// --- Payments service (worker) ---
|
|
153
|
-
|
|
154
|
-
const payments = new ServiceBridge("localhost:14445", process.env.SERVICEBRIDGE_SERVICE_KEY!);
|
|
155
|
-
|
|
156
|
-
payments.rpc.handle("payment.charge", async (payload: { orderId: string; amount: number }, ctx) => {
|
|
157
|
-
await ctx?.stream.write({ status: "charging", orderId: payload.orderId }, "progress");
|
|
158
|
-
|
|
159
|
-
// ... charge logic ...
|
|
160
|
-
|
|
161
|
-
await ctx?.stream.write({ status: "charged" }, "progress");
|
|
162
|
-
return { ok: true, txId: `tx_${Date.now()}` };
|
|
163
|
-
});
|
|
164
|
-
|
|
165
|
-
await payments.start({ host: "localhost" });
|
|
166
|
-
```
|
|
167
|
-
|
|
168
|
-
```ts
|
|
169
|
-
// --- Orders service (caller + event publisher) ---
|
|
170
|
-
|
|
171
|
-
const orders = new ServiceBridge("localhost:14445", process.env.SERVICEBRIDGE_SERVICE_KEY!);
|
|
172
|
-
|
|
173
|
-
// Call payments, then publish event
|
|
174
|
-
const charge = await orders.rpc.invoke<{ ok: boolean; txId: string }>("payment.charge", {
|
|
175
|
-
orderId: "ord_42",
|
|
176
|
-
amount: 4990,
|
|
177
|
-
});
|
|
178
|
-
|
|
179
|
-
await orders.events.publish("orders.completed", {
|
|
180
|
-
orderId: "ord_42",
|
|
181
|
-
txId: charge.txId,
|
|
182
|
-
}, {
|
|
183
|
-
idempotencyKey: "order:ord_42:completed",
|
|
184
|
-
headers: { source: "checkout" },
|
|
185
|
-
});
|
|
186
|
-
```
|
|
187
|
-
|
|
188
|
-
```ts
|
|
189
|
-
// --- Notifications service (event consumer) ---
|
|
190
|
-
|
|
191
|
-
const notifications = new ServiceBridge("localhost:14445", process.env.SERVICEBRIDGE_SERVICE_KEY!);
|
|
92
|
+
## Why ServiceBridge
|
|
192
93
|
|
|
193
|
-
|
|
194
|
-
const body = payload as { orderId: string; txId: string };
|
|
195
|
-
await ctx.stream.write({ status: "sending_email", orderId: body.orderId }, "progress");
|
|
196
|
-
// ... send email ...
|
|
197
|
-
});
|
|
94
|
+
Microservices rarely fail because of business logic. They fail in the gaps *between* services — the broker that dropped a message, the workflow engine nobody fully understands, the trace that stops at a service boundary, the mesh config that takes a week to debug. Each gap is another system to run, secure and correlate.
|
|
198
95
|
|
|
199
|
-
|
|
200
|
-
```
|
|
96
|
+
ServiceBridge collapses those gaps into one runtime. Your service talks to a single gRPC endpoint over mTLS; the runtime is the single source of truth for routing, delivery and state.
|
|
201
97
|
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
207
|
-
|
|
208
|
-
|
|
209
|
-
|
|
210
|
-
]);
|
|
211
|
-
```
|
|
98
|
+
| Problem | Without ServiceBridge | With ServiceBridge |
|
|
99
|
+
|---|---|---|
|
|
100
|
+
| Service-to-service calls | gRPC/HTTP plumbing + a mesh for mTLS + retries | `sb.rpc.call("svc", "Method", req)` — mTLS, LB, retries, breakers built in |
|
|
101
|
+
| Reliable async messaging | Stand up and operate a broker | `sb.event.publish(...)` — durable outbox, at-least-once, fan-out, DLQ |
|
|
102
|
+
| Multi-step business processes | A separate workflow engine to learn and host | `sb.workflow.handle(...)` — durable DAGs with compensation and replay |
|
|
103
|
+
| Scheduled work | A cron box or a job scheduler service | `sb.job.handle(...)` — cron / interval / delay, leased and retried |
|
|
104
|
+
| Knowing what happened | Wire up tracing + metrics + logs across N tools | Every hop is traced, measured and logged automatically |
|
|
105
|
+
| Identity & access | Certificates, a mesh policy layer | mTLS from a service key + granular access policy, on by default |
|
|
212
106
|
|
|
213
|
-
|
|
107
|
+
One binary, one database, one place to look when something breaks.
|
|
214
108
|
|
|
215
109
|
---
|
|
216
110
|
|
|
217
|
-
##
|
|
218
|
-
|
|
219
|
-
### Communication
|
|
220
|
-
- **Direct RPC** — zero-hop gRPC calls with retries, deadlines, and mTLS identity
|
|
221
|
-
- **Durable Events** — fan-out delivery, guaranteed delivery (RabbitMQ-style), at-least-once guarantees, retries, DLQ, replay, idempotency. If a consumer is offline, the message waits in the server-side queue and is dispatched the moment the consumer reconnects — no retry budget consumed while waiting.
|
|
222
|
-
- **Realtime Streams** — live chunks with replay for AI/progress/log streaming
|
|
223
|
-
- **Service Discovery** — automatic endpoint resolution and round-robin balancing
|
|
224
|
-
- **HTTP Middleware** — Express and Fastify instrumentation with automatic trace propagation
|
|
225
|
-
|
|
226
|
-
### Orchestration
|
|
227
|
-
- **Workflows** — DAG steps: `rpc`, `event`, `event_wait`, `sleep`, child workflow
|
|
228
|
-
- **Jobs** — cron, delayed, and workflow-triggered scheduling
|
|
111
|
+
## Use cases
|
|
229
112
|
|
|
230
|
-
|
|
231
|
-
- **
|
|
232
|
-
- **
|
|
233
|
-
|
|
234
|
-
|
|
235
|
-
- **
|
|
236
|
-
- **Metrics** — Prometheus-compatible `/metrics` endpoint (30+ metric families)
|
|
237
|
-
- **Logs** — structured log ingest with Loki-compatible query API
|
|
238
|
-
- **Alerts** — runtime alerts for delivery failures, errors, and service health
|
|
239
|
-
- **Dashboard** — realtime web UI for traces, events, workflows, jobs, DLQ, service map, and service keys
|
|
240
|
-
|
|
241
|
-
---
|
|
242
|
-
|
|
243
|
-
## How It Compares
|
|
244
|
-
|
|
245
|
-
| Concern | Istio + Envoy | Dapr | Temporal + Kafka | ServiceBridge |
|
|
246
|
-
|---|---|---|---|---|
|
|
247
|
-
| RPC data path | Sidecar proxy hop | Sidecar/daemon hop | N/A | **Direct (proxyless)** |
|
|
248
|
-
| Service discovery | K8s control plane | Sidecar placement | External registry | **Built-in registry** |
|
|
249
|
-
| Durable events + DLQ | External broker | Pub/Sub component | Kafka + consumers | **Built-in** |
|
|
250
|
-
| Workflow orchestration | External engine | External engine | Built-in | **Built-in** |
|
|
251
|
-
| Job scheduling | External cron/queue | External scheduler | External scheduler | **Built-in** |
|
|
252
|
-
| Traces + UI | Jaeger/Tempo + dashboards | OTEL backend + dashboards | Temporal UI | **Built-in** |
|
|
253
|
-
| Logs for Grafana | Loki + Promtail pipeline | Log pipeline | Log pipeline | **Built-in Loki API** |
|
|
254
|
-
| Metrics | App/exporter setup | App/exporter setup | Multiple exporters | **Built-in `/metrics`** |
|
|
255
|
-
| Security model | Mesh PKI + policy | Deployment-dependent mTLS | Mixed | **Service keys + auto mTLS** |
|
|
256
|
-
| Operational footprint | Multi-component mesh | Runtime + sidecars | Workflow + broker + DB | **One binary + PostgreSQL** |
|
|
113
|
+
- **Replace a broker** — durable, at-least-once events with fan-out and a dead-letter queue, without operating Kafka or RabbitMQ.
|
|
114
|
+
- **Run sagas / orchestration** — checkout, onboarding, fulfilment as durable workflows with automatic compensation on failure.
|
|
115
|
+
- **Internal RPC backbone** — typed service-to-service calls with load balancing, retries and circuit breakers, secured by mTLS.
|
|
116
|
+
- **Scheduled & delayed work** — nightly rollups, reminders, periodic syncs as leased, retried jobs.
|
|
117
|
+
- **Streaming responses** — token-by-token LLM output or progress feeds over server-side streaming RPC.
|
|
118
|
+
- **Observability for free** — get a full distributed trace across RPC → event → workflow → job without instrumenting by hand.
|
|
257
119
|
|
|
258
120
|
---
|
|
259
121
|
|
|
260
|
-
##
|
|
261
|
-
|
|
262
|
-
### `ServiceBridge` / `ServiceBridgeService` surface
|
|
263
|
-
|
|
264
|
-
Per-instance API for `new ServiceBridge(...)` (implements `ServiceBridgeService`):
|
|
122
|
+
## Quick start
|
|
265
123
|
|
|
266
|
-
|
|
267
|
-
- **Lifecycle:** `start(opts?)`, `stop()`.
|
|
268
|
-
- **Workflows:** `cancelWorkflow(traceId)`.
|
|
269
|
-
- **HTTP & traces:** `startHttpSpan(opts)`, `registerHttpEndpoint(opts)`, `watchTrace(traceId, opts?)`.
|
|
270
|
-
- **Module helpers (exported from `service-bridge`):** `getTraceContext`, `withTraceContext`, `ServiceBridgeError`, `mapGrpcStatus`, `SB`, `SB_MESSAGES`. (`captureConsole` exists internally for log capture but is not part of the public package exports.)
|
|
124
|
+
Schemas are **file-based**: point the SDK at a `.proto` file (it resolves request/response types from the `service` block) or a `.schema.json` with explicit field numbers. There is no inline schema.
|
|
271
125
|
|
|
272
|
-
|
|
273
|
-
|
|
274
|
-
|
|
275
|
-
|
|
276
|
-
|
|
277
|
-
|
|
278
|
-
|
|
279
|
-
|
|
280
|
-
- Constructor TLS overrides: `workerTLS`/`caCert` (Node), `WorkerTLS`/`CACert` (Go), `worker_tls`/`ca_cert` (Python)
|
|
281
|
-
- Handler hints: timeout/retryable/concurrency/prefetch are advisory in all SDKs
|
|
282
|
-
- Shared `start()` fields across SDKs: host, max in-flight, instance ID, weight, and per-start TLS override
|
|
283
|
-
|
|
284
|
-
### `new ServiceBridge(url, serviceKey, opts?)`
|
|
285
|
-
|
|
286
|
-
```ts
|
|
287
|
-
class ServiceBridge {
|
|
288
|
-
constructor(url: string, serviceKey: string, opts?: ServiceBridgeOpts);
|
|
126
|
+
```proto
|
|
127
|
+
// payment.proto
|
|
128
|
+
syntax = "proto3";
|
|
129
|
+
message ChargeRequest { string user_id = 1; int64 amount = 2; }
|
|
130
|
+
message ChargeReply { bool ok = 1; }
|
|
131
|
+
service Payment {
|
|
132
|
+
rpc Charge(ChargeRequest) returns (ChargeReply);
|
|
289
133
|
}
|
|
290
134
|
```
|
|
291
135
|
|
|
292
|
-
|
|
293
|
-
|
|
294
|
-
`ServiceBridgeOpts`:
|
|
295
|
-
|
|
296
|
-
| Option | Type | Default | Description |
|
|
297
|
-
|---|---|---|---|
|
|
298
|
-
| `timeout` | `number` | `30000` | Default hard timeout per `rpc.invoke()` attempt (ms). |
|
|
299
|
-
| `retries` | `number` | `3` | Default retry count for `rpc.invoke()`. |
|
|
300
|
-
| `retryDelay` | `number` | `300` | Base backoff delay (ms) for `rpc.invoke()`. |
|
|
301
|
-
| `discoveryRefreshMs` | `number` | `10000` | Discovery refresh period for endpoint updates. |
|
|
302
|
-
| `queueMaxSize` | `number` | `1000` | Max offline queue size for control-plane writes. |
|
|
303
|
-
| `queueOverflow` | `"drop-oldest" \| "drop-newest" \| "error"` | `"drop-oldest"` | Overflow strategy for offline queue. |
|
|
304
|
-
| `heartbeatIntervalMs` | `number` | `10000` | Base heartbeat period for worker registrations. |
|
|
305
|
-
| `captureLogs` | `boolean` | `true` | Forward `console.*` logs to ServiceBridge. |
|
|
306
|
-
| `strictOutboundDeclarations` | `boolean` | `false` | When `true`, every outbound `rpc.invoke()` must be preceded by `rpc.declare(fn)` for the resolved target. |
|
|
307
|
-
|
|
308
|
-
### Advanced TLS overrides
|
|
309
|
-
|
|
310
|
-
| Option | Type | Default | Description |
|
|
311
|
-
|---|---|---|---|
|
|
312
|
-
| `workerTLS` | `WorkerTLSOpts` | auto | Explicit cert/key/CA for worker mTLS. |
|
|
313
|
-
| `caCert` | `string \| Buffer` | from `serviceKey` | Optional control-plane CA override. By default SDK reads CA from sbv2 service key. |
|
|
314
|
-
|
|
315
|
-
`WorkerTLSOpts`:
|
|
136
|
+
**Worker** — register the handler. One argument in, one value out.
|
|
316
137
|
|
|
317
138
|
```ts
|
|
318
|
-
|
|
319
|
-
caCert?: string | Buffer;
|
|
320
|
-
cert?: string | Buffer;
|
|
321
|
-
key?: string | Buffer;
|
|
322
|
-
serverName?: string;
|
|
323
|
-
}
|
|
324
|
-
```
|
|
139
|
+
import { ServiceBridge } from "service-bridge";
|
|
325
140
|
|
|
326
|
-
|
|
141
|
+
const sb = new ServiceBridge("localhost:14445", process.env.PAYMENT_KEY!);
|
|
327
142
|
|
|
328
|
-
|
|
143
|
+
sb.rpc.handle(
|
|
144
|
+
"Charge",
|
|
145
|
+
async (req: { userId: string; amount: number }) => {
|
|
146
|
+
return { ok: req.amount > 0 };
|
|
147
|
+
},
|
|
148
|
+
{ schema: { protoFile: "./payment.proto" } },
|
|
149
|
+
);
|
|
329
150
|
|
|
330
|
-
|
|
331
|
-
invoke<T = unknown>(fn: string, payload?: unknown, opts?: RpcOpts): Promise<T>
|
|
151
|
+
await sb.start();
|
|
332
152
|
```
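Note that the handler above receives `userId` while `payment.proto` declares `user_id`. That matches the lowerCamelCase convention of the proto3 JSON mapping, although this README does not state which mapping the SDK applies. A sketch of the conversion, with `toLowerCamel` as a hypothetical helper that is not part of the package:

```ts
// Convert a snake_case proto field name to lowerCamelCase,
// e.g. the ChargeRequest field "user_id" becomes "userId".
function toLowerCamel(field: string): string {
  return field.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

console.log(toLowerCamel("user_id")); // userId
console.log(toLowerCamel("amount"));  // amount
```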
|
|
333
153
|
|
|
334
|
-
|
|
335
|
-
|
|
336
|
-
**Function name** — `fn` is a single **global function name** (the same string passed to `rpc.handle` on the callee), e.g. `payment.charge` or `user.get`. It must be unique in the catalog and **must not contain `/`**.
|
|
337
|
-
|
|
338
|
-
`RpcOpts`:
|
|
339
|
-
|
|
340
|
-
| Option | Type | Description |
|
|
341
|
-
|---|---|---|
|
|
342
|
-
| `timeout` | `number` | Call timeout in ms. |
|
|
343
|
-
| `retries` | `number` | Retry count override. |
|
|
344
|
-
| `retryDelay` | `number` | Base retry delay override. |
|
|
345
|
-
| `traceId` | `string` | Explicit trace id. |
|
|
346
|
-
| `parentSpanId` | `string` | Explicit parent span id. |
|
|
347
|
-
| `mode` | `"direct" \| "proxy"` | Transport mode. `"direct"` (default) connects directly to the worker. `"proxy"` routes through the control plane when direct connection is unavailable. |
|
|
154
|
+
**Caller** — in another process, build a typed client and call it. `sb.client()` reads the `.proto` once, declares every method in its `service` block as an outgoing dependency, loads the schemas, and returns a typed proxy.
|
|
348
155
|
|
|
349
156
|
```ts
|
|
350
|
-
|
|
351
|
-
|
|
352
|
-
const user2 = await sb.rpc.invoke<{ id: string; name: string }>("user.get", { id: "u_1" }, {
|
|
353
|
-
timeout: 5000,
|
|
354
|
-
retries: 2,
|
|
355
|
-
});
|
|
356
|
-
```
|
|
357
|
-
|
|
358
|
-
`rpc.invoke()` is bounded even when a downstream worker is silent:
|
|
359
|
-
each attempt has a hard local timeout, retries are finite (`retries + 1` total attempts),
|
|
360
|
-
and after the final failed attempt the root RPC span is closed with `error`.
|
|
361
|
-
|
|
362
|
-
Retry delay uses exponential backoff: `retryDelay * 2^(attempt-1)`.
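To make the removed 1.9.0-dev.52 retry description concrete, here is a sketch that computes the wait before each retry from the stated formula. It assumes `attempt` counts from 1 and that the delay applies before each retry; `backoffDelays` is a hypothetical helper, not an SDK function:

```ts
// Delays (ms) before each retry: retryDelay * 2^(attempt - 1).
// `retries` retries means `retries + 1` attempts in total.
function backoffDelays(retries: number, retryDelay: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= retries; attempt++) {
    delays.push(retryDelay * Math.pow(2, attempt - 1));
  }
  return delays;
}

// With the documented defaults (retries = 3, retryDelay = 300):
console.log(backoffDelays(3, 300)); // [ 300, 600, 1200 ]
```

Under these assumptions a call that never succeeds makes four attempts and spends 2100 ms waiting between them, on top of the per-attempt timeouts.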
|
|
157
|
+
import { ServiceBridge } from "service-bridge";
|
|
363
158
|
|
|
364
|
-
|
|
159
|
+
const sb = new ServiceBridge("localhost:14445", process.env.ORDERS_KEY!);
|
|
160
|
+
const payment = await sb.client("payment-svc", "./payment.proto");
|
|
365
161
|
|
|
366
|
-
|
|
162
|
+
await sb.start();
|
|
367
163
|
|
|
368
|
-
|
|
369
|
-
|
|
164
|
+
const res = await payment.Charge({ userId: "u-1", amount: 100 });
|
|
165
|
+
// res.ok === true
|
|
370
166
|
```
|
|
371
167
|
|
|
372
|
-
|
|
168
|
+
Declare dependencies and build typed clients **before** `start()` — they ride along in the first registration. Calls succeed once `start()` has connected.
|
|
373
169
|
|
|
374
170
|
---
|
|
375
171
|
|
|
376
|
-
|
|
172
|
+
## Runtime setup
|
|
377
173
|
|
|
378
|
-
|
|
379
|
-
publish(topic: string, payload?: unknown, opts?: EventOpts): Promise<string>
|
|
380
|
-
```
|
|
174
|
+
The SDK needs a running ServiceBridge runtime. Spin one up with the one-line installer:
|
|
381
175
|
|
|
382
|
-
|
|
383
|
-
|
|
384
|
-
`EventOpts`:
|
|
385
|
-
|
|
386
|
-
| Option | Type | Description |
|
|
387
|
-
|---|---|---|
|
|
388
|
-
| `traceId` | `string` | Explicit trace id. |
|
|
389
|
-
| `parentSpanId` | `string` | Explicit parent span id. |
|
|
390
|
-
| `idempotencyKey` | `string` | Idempotency key for dedup-safe publishing. |
|
|
391
|
-
| `headers` | `Record<string, string>` | Custom metadata headers. |
|
|
392
|
-
|
|
393
|
-
```ts
|
|
394
|
-
await sb.events.publish("orders.created", { orderId: "ord_42" }, {
|
|
395
|
-
idempotencyKey: "order:ord_42",
|
|
396
|
-
headers: { source: "checkout" },
|
|
397
|
-
});
|
|
176
|
+
```sh
|
|
177
|
+
bash <(curl -fsSL https://servicebridge.dev/install.sh)
|
|
398
178
|
```
|
|
399
179
|
|
|
400
|
-
|
|
401
|
-
|
|
402
|
-
### `events.publishWorker(topic, payload?, opts?)`
|
|
180
|
+
It pulls the runtime container, wires it to PostgreSQL 18+, and exposes the gRPC control plane on `:14445` and the dashboard on `:14444`. Open the dashboard, create a service, and copy its **bootstrap service key** — that opaque string is the second argument to `new ServiceBridge(url, key)`.
|
|
403
181
|
|
|
404
|
-
|
|
405
|
-
publishWorker(
|
|
406
|
-
topic: string,
|
|
407
|
-
payload?: unknown,
|
|
408
|
-
opts?: { traceId?: string; parentSpanId?: string; headers?: Record<string, string> },
|
|
409
|
-
): Promise<string>
|
|
410
|
-
```
|
|
182
|
+
Each instance authenticates with its key: the SDK calls `Bootstrap.Provision`, receives a short-lived leaf certificate, opens an mTLS gRPC channel and registers. Certificates rotate automatically with overlap (the new session is live before the old one closes), so long-running instances never drop traffic at renewal.
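The overlap rule ("the new session is live before the old one closes") is a make-before-break pattern. The following sketch shows only the ordering; `Session` and `RotatingSession` are hypothetical types for illustration and say nothing about how the SDK implements rotation internally:

```ts
// Hypothetical session handle: something that can be closed.
interface Session {
  id: number;
  close(): void;
}

class RotatingSession {
  private current: Session;
  private readonly connect: () => Session;

  constructor(connect: () => Session) {
    this.connect = connect;
    this.current = connect();
  }

  active(): Session {
    return this.current;
  }

  // Make-before-break: open the replacement first, switch traffic to it,
  // and only then close the previous session.
  rotate(): void {
    const next = this.connect();
    const previous = this.current;
    this.current = next;
    previous.close();
  }
}
```

Because `current` always points at an open session, callers that read `active()` never observe a gap during renewal.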
|
|
411
183
|
|
|
412
|
-
|
|
184
|
+
Full self-hosting docs live at **[servicebridge.dev/docs](https://servicebridge.dev/docs)**.
|
|
413
185
|
|
|
414
186
|
---
|
|
415
187
|
|
|
416
|
-
|
|
188
|
+
## End-to-end example
|
|
417
189
|
|
|
418
|
-
|
|
419
|
-
declare(topic: string): void
|
|
420
|
-
```
|
|
421
|
-
|
|
422
|
-
Declares an outbound event dependency for registration metadata (does not publish a message).
|
|
423
|
-
|
|
424
|
-
---
|
|
425
|
-
|
|
426
|
-
### `jobs.run(service, fn, opts)` / `jobs.run(target, opts)`
|
|
190
|
+
A small order flow: an HTTP request triggers a workflow that charges a payment, then publishes an event another service consumes — all traced as one tree.
|
|
427
191
|
|
|
428
192
|
```ts
|
|
429
|
-
|
|
430
|
-
run(target: string, opts: ScheduleOpts & { via: "event" | "workflow" }): Promise<string>
|
|
431
|
-
```
|
|
432
|
-
|
|
433
|
-
Registers a scheduled or delayed job. Resolves to the registration key: `"${service}/${fn}"` for the RPC overload, or the `target` string for the `event` / `workflow` overload.
|
|
434
|
-
|
|
435
|
-
`ScheduleOpts`:
|
|
436
|
-
|
|
437
|
-
| Option | Type | Description |
|
|
438
|
-
|---|---|---|
|
|
439
|
-
| `cron` | `string` | Cron expression. |
|
|
440
|
-
| `delay` | `number` | Delay in ms before execution. Backed by `int32` in the proto — maximum ~24.8 days (~2,147,483,647 ms). |
|
|
441
|
-
| `timezone` | `string` | Timezone for cron execution. |
|
|
442
|
-
| `misfire` | `"fire_now" \| "skip"` | Misfire policy. |
|
|
443
|
-
| `via` | `"event" \| "rpc" \| "workflow"` | Target type. |
|
|
444
|
-
| `retryPolicyJson` | `string` | Retry policy JSON string. |
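The `delay` ceiling in the removed table follows from the signed 32-bit range. A small sketch of the arithmetic and a guard that a caller could apply before scheduling; `assertValidDelay` is a hypothetical helper, not an SDK function:

```ts
const MAX_DELAY_MS = Math.pow(2, 31) - 1;   // 2_147_483_647, the largest int32 value
const MS_PER_DAY = 24 * 60 * 60 * 1000;     // 86_400_000

// 2_147_483_647 / 86_400_000 is roughly 24.86 days.
const maxDelayDays = MAX_DELAY_MS / MS_PER_DAY;

function assertValidDelay(delayMs: number): number {
  const isWhole = Math.floor(delayMs) === delayMs;
  if (!isWhole || delayMs < 0 || delayMs > MAX_DELAY_MS) {
    throw new RangeError("delay must be a whole number of ms between 0 and " + MAX_DELAY_MS);
  }
  return delayMs;
}
```

A job that must run further out than this would need a cron expression or a chain of shorter delays instead.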
|
|
193
|
+
import { ServiceBridge } from "service-bridge";
|
|
445
194
|
|
|
446
|
-
|
|
447
|
-
|
|
448
|
-
|
|
449
|
-
|
|
450
|
-
|
|
195
|
+
const sb = new ServiceBridge("localhost:14445", process.env.ORDERS_KEY!);
|
|
196
|
+
|
|
197
|
+
// Outgoing dependencies — declared before start().
|
|
198
|
+
sb.service("payment-svc", { rpc: ["Charge"] });
|
|
199
|
+
sb.event.define("order.placed", { protoFile: "./events.proto", input: "OrderPlaced" });
|
|
200
|
+
|
|
201
|
+
// A durable workflow: charge, then announce. Steps run by dependency level.
|
|
202
|
+
sb.workflow.handle("checkout", {
|
|
203
|
+
input: { type: "object", properties: { orderId: { type: "string" } } },
|
|
204
|
+
steps: [
|
|
205
|
+
{ id: "charge", type: "call", service: "payment-svc", method: "Charge",
|
|
206
|
+
input: "$.input" },
|
|
207
|
+
{ id: "announce", type: "publish", event: "order.placed",
|
|
208
|
+
input: "$.input", waitFor: ["charge"] },
|
|
209
|
+
],
|
|
451
210
|
});
|
|
452
|
-
```
|
|
453
|
-
|
|
454
|
-
---
|
|
455
|
-
|
|
456
|
-
### `workflows.run(name, steps, opts?)` — register DAG
|
|
457
|
-
|
|
458
|
-
TypeScript (single method; behavior depends on the second argument):
|
|
459
|
-
|
|
460
|
-
```ts
|
|
461
|
-
run(
|
|
462
|
-
nameOrService: string,
|
|
463
|
-
stepsOrName: WorkflowStep[] | string,
|
|
464
|
-
inputOrOpts?: unknown,
|
|
465
|
-
opts?: ExecuteWorkflowOpts,
|
|
466
|
-
): Promise<string | ExecuteWorkflowResult>
|
|
467
|
-
```
|
|
468
|
-
|
|
469
|
-
- **Register:** when `stepsOrName` is `WorkflowStep[]`, `nameOrService` is the workflow name, `inputOrOpts` is optional `WorkflowOpts`, and the promise resolves to that name (`string`).
|
|
470
|
-
- **Execute:** when `stepsOrName` is a `string`, `nameOrService` is the target **service** name, `stepsOrName` is the workflow name, `inputOrOpts` is the optional execution input, and `opts` is optional `ExecuteWorkflowOpts` (see execute section below).
|
|
471
|
-
|
|
472
|
-
Overload as used when registering:
|
|
473
|
-
|
|
474
|
-
```ts
|
|
475
|
-
run(name: string, steps: WorkflowStep[], opts?: WorkflowOpts): Promise<string>
|
|
476
|
-
```
|
|
477
|
-
|
|
478
|
-
Registers (or updates) a workflow definition as a DAG of typed steps. Returns the workflow name.
|
|
479
211
|
|
|
480
|
-
`
|
|
212
|
+
sb.on("connected", ({ serviceName }) => console.log(`up as ${serviceName}`));
|
|
481
213
|
|
|
482
|
-
|
|
483
|
-
|---|---|---|
|
|
484
|
-
| `id` | `string` | Unique step identifier in the DAG. |
|
|
485
|
-
| `type` | `"rpc" \| "event" \| "event_wait" \| "sleep" \| "workflow"` | Step execution type. |
|
|
486
|
-
| `service` | `string` | Required for `rpc` and `workflow`: target service that owns the function or child workflow. |
|
|
487
|
-
| `ref` | `string` | Target name: RPC function, event topic, waited topic, or child workflow name (per `type`). |
|
|
488
|
-
| `deps` | `string[]` | Dependencies. Empty/omitted means root step. |
|
|
489
|
-
| `if` | `string` | Optional filter expression (step is skipped if false). |
|
|
490
|
-
| `timeoutMs` | `number` | Optional timeout for `rpc` and `event_wait` steps. |
|
|
491
|
-
| `durationMs` | `number` | Required for `sleep` steps. |
|
|
492
|
-
|
|
493
|
-
`WorkflowOpts` (third argument when registering a DAG — shape below; the interface is defined in the SDK but **not** re-exported from the main `service-bridge` package entry, so use an inline object in app code):
|
|
214
|
+
await sb.start();
|
|
494
215
|
|
|
495
|
-
|
|
496
|
-
|
|
497
|
-
|
|
498
|
-
|
|
499
|
-
}
|
|
216
|
+
// Kick off a run and wait for the final state.
|
|
217
|
+
const { runId } = await sb.workflow.start("checkout", { orderId: "o-1" });
|
|
218
|
+
const state = await sb.workflow.await(runId);
|
|
219
|
+
console.log("done", state);
|
|
500
220
|
```
|
|
501
221
|
|
|
502
|
-
|
|
503
|
-
|---|---|---|---|
|
|
504
|
-
| `stateLimitBytes` | `number` | `262144` (256 KB) | Maximum serialized state size in bytes. |
|
|
505
|
-
| `stepTimeoutMs` | `number` | `30000` (30 s) | Default per-step timeout in milliseconds. |
|
|
222
|
+
The consuming service just subscribes:
|
|
506
223
|
|
|
507
224
|
```ts
|
|
508
|
-
|
|
509
|
-
|
|
510
|
-
|
|
511
|
-
|
|
512
|
-
{ id: "notify", type: "event", ref: "orders.fulfilled", deps: ["wait_5m"] },
|
|
513
|
-
]);
|
|
225
|
+
sb.event.handle("order.placed", async (payload) => {
|
|
226
|
+
await sendReceipt(payload);
|
|
227
|
+
});
|
|
228
|
+
await sb.start();
|
|
514
229
|
```
|
|
515
230
|
|
|
516
|
-
|
|
517
|
-
|
|
518
|
-
```ts
|
|
519
|
-
await sb.workflows.run("checkout.flow", steps, { stepTimeoutMs: 60_000 });
|
|
520
|
-
```
|
|
231
|
+
In the dashboard you see one trace spanning the workflow run, the `Charge` RPC, the `order.placed` publish, and its delivery to the subscriber.
|
|
521
232
|
|
|
522
233
|
---
|
|
523
234
|
|
|
524
|
-
|
|
235
|
+
## Platform features
|
|
525
236
|
|
|
526
|
-
|
|
527
|
-
|
|
528
|
-
|
|
237
|
+
| Area | What you get |
|
|
238
|
+
|---|---|
|
|
239
|
+
| **Communication** | Direct RPC, server-side streaming, durable events, service discovery, full-mesh routing, a live service map |
|
|
240
|
+
| **Orchestration** | Workflows (DAG steps with compensation), sub-workflows, jobs (cron / interval / delayed), bidirectional replay |
|
|
241
|
+
| **Reliability** | At-least-once delivery, retries, DLQ, idempotency, fan-out, session resilience, multi-instance failover, circuit breakers |
|
|
242
|
+
| **Traffic control** | Load balancing, rate limiting, per-definition limits, filter expressions, adaptive performance |
|
|
243
|
+
| **Security** | TLS by default, mTLS identity, auto-provisioned certs from a service key, granular access policy |
|
|
244
|
+
| **Observability** | Unified tracing with propagation, Prometheus-compatible metrics, structured logs, smart alerts |
|
|
529
245
|
|
|
530
|
-
|
|
246
|
+
Designed to run up to 1000 services against a single runtime.
|
|
531
247
|
|
|
532
248
|
---
|
|
533
249
|
|
|
534
|
-
|
|
250
|
+
## How it compares
|
|
535
251
|
|
|
536
|
-
|
|
252
|
+
| You'd otherwise reach for | ServiceBridge gives you |
|
|
253
|
+
|---|---|
|
|
254
|
+
| Istio / Linkerd (mesh, mTLS) | mTLS identity + routing + policy, no sidecars |
|
|
255
|
+
| RabbitMQ / Kafka / NATS | Durable events with outbox, fan-out, retries, DLQ |
|
|
256
|
+
| Temporal / Cadence | Durable workflows with compensation, signals, replay |
|
|
257
|
+
| A cron service / Quartz | Leased, retried scheduled jobs |
|
|
258
|
+
| Jaeger / Tempo + Prometheus + Loki | Tracing, metrics and logs, correlated out of the box |
|
|
259
|
+
| gRPC + a service registry | Typed RPC with discovery, LB and breakers |
|
|
537
260
|
|
|
538
|
-
|
|
539
|
-
run(service: string, name: string, input?: unknown, opts?: ExecuteWorkflowOpts): Promise<ExecuteWorkflowResult>
|
|
540
|
-
```
|
|
261
|
+
The point isn't that ServiceBridge beats each tool at its own game — it's that you stop running and correlating ten of them.
|
|
541
262
|
|
|
542
|
-
|
|
543
|
-
An alternative to scheduling via `jobs.run(target, { via: "workflow", ... })` — triggers the execution immediately.
|
|
263
|
+
---
|
|
544
264
|
|
|
545
|
-
|
|
546
|
-
|---|---|---|---|
|
|
547
|
-
| `service` / `name` | `string` | required | Target service and workflow name. |
|
|
548
|
-
| `input` | `unknown` | `undefined` | Optional JSON-serializable input payload (serialized as JSON for the runtime). |
|
|
265
|
+
## API reference
|
|
549
266
|
|
|
550
|
-
|
|
267
|
+
The bridge exposes four domains (`sb.rpc`, `sb.event`, `sb.job`, `sb.workflow`) plus `sb.stream()` and `sb.telemetry`. Register handlers and declare dependencies **before** `start()`.
|
|
551
268
|
|
|
552
|
-
|
|
269
|
+
### RPC
|
|
553
270
|
|
|
554
|
-
|
|
555
|
-
|---|---|---|
|
|
556
|
-
| `traceId` | `string` | Declared on the exported type for API parity; the current Node implementation does **not** forward this field to the control plane (the gRPC request is built without it). Prefer relying on the returned `traceId`. |
|
|
271
|
+
`sb.rpc` is request/response: register handlers, call other services.
|
|
557
272
|
|
|
558
273
|
```ts
|
|
559
|
-
|
|
560
|
-
|
|
561
|
-
|
|
562
|
-
|
|
563
|
-
|
|
564
|
-
|
|
274
|
+
// Unary handler: (req) => res
|
|
275
|
+
sb.rpc.handle<ChargeRequest, ChargeReply>(
|
|
276
|
+
"Charge",
|
|
277
|
+
async (req) => ({ ok: req.amount > 0 }),
|
|
278
|
+
{ schema: { protoFile: "./payment.proto" } },
|
|
279
|
+
);
|
|
565
280
|
|
|
566
|
-
|
|
567
|
-
|
|
281
|
+
// Server-side streaming handler: (req) => AsyncIterable<chunk>
|
|
282
|
+
sb.rpc.handleStream<GenRequest, Token>(
|
|
283
|
+
"Generate",
|
|
284
|
+
async function* (req) {
|
|
285
|
+
for (const word of req.prompt.split(" ")) yield { token: word };
|
|
286
|
+
},
|
|
287
|
+
{ schema: { protoFile: "./gen.proto" } },
|
|
288
|
+
);
|
|
568
289
|
```
|
|
569
290
|
|
|
570
|
-
|
|
291
|
+
To call a method, use the typed proxy from `sb.client()` (preferred) or the lower-level `sb.rpc.call()`:
|
|
571
292
|
|
|
572
293
|
```ts
|
|
573
|
-
await
|
|
574
|
-
```
|
|
575
|
-
|
|
576
|
-
---
|
|
294
|
+
const res = await payment.Charge({ userId: "u-1", amount: 100 });
|
|
577
295
|
|
|
578
|
-
|
|
579
|
-
|
|
580
|
-
|
|
581
|
-
|
|
582
|
-
fn: string,
|
|
583
|
-
handler: (payload: unknown, ctx?: RpcContext) => unknown | Promise<unknown>,
|
|
584
|
-
opts?: HandleRpcOpts,
|
|
585
|
-
): ServiceBridgeService
|
|
296
|
+
const res2 = await sb.rpc.call("payment-svc", "Charge",
|
|
297
|
+
{ userId: "u-1", amount: 100 },
|
|
298
|
+
{ timeout: "5s", idempotencyKey: "order-42" },
|
|
299
|
+
);
|
|
586
300
|
```
|
|
587
301
|
|
|
588
|
-
|
|
302
|
+
`CallOpts` apply per call, layered over `callDefaults` from the constructor:
|
|
589
303
|
|
|
590
|
-
`
|
|
591
|
-
|
|
592
|
-
|
|
|
593
|
-
|
|
594
|
-
| `
|
|
595
|
-
| `
|
|
596
|
-
| `
|
|
597
|
-
|
|
598
|
-
`HandleRpcOpts`:
|
|
599
|
-
|
|
600
|
-
| Option | Type | Description |
|
|
601
|
-
|---|---|---|
|
|
602
|
-
| `timeout` | `number` | Advisory timeout hint (currently metadata-level, not hard-enforced by runtime). |
|
|
603
|
-
| `retryable` | `boolean` | Advisory retry hint (currently metadata-level, not a strict policy switch). |
|
|
604
|
-
| `concurrency` | `number` | Advisory concurrency hint (currently not hard-enforced). |
|
|
605
|
-
| `schema` | `RpcSchemaOpts` | Inline protobuf schema for binary encode/decode. |
|
|
606
|
-
| `allowedCallers` | `string[]` | Allow-list of caller service names. |
|
|
607
|
-
|
|
608
|
-
```ts
|
|
609
|
-
sb.rpc.handle("ai.generate", async (payload: { prompt: string }, ctx) => {
|
|
610
|
-
await ctx?.stream.write({ token: "Hello" }, "output");
|
|
611
|
-
await ctx?.stream.write({ token: " world" }, "output");
|
|
612
|
-
return { text: "Hello world" };
|
|
613
|
-
});
|
|
614
|
-
```
|
|
615
|
-
|
|
616
|
-
`StreamWriter`:
|
|
304
|
+
| `CallOpts` | Type | Default | Description |
|
|
305
|
+
|---|---|---|---|
|
|
306
|
+
| `timeout` | `string` | `"30s"` | Deadline, e.g. `"500ms"`, `"10s"`, `"2m"`. |
|
|
307
|
+
| `requestId` | `string` | random UUID v4 | Correlation id carried to the callee. |
|
|
308
|
+
| `transport` | `"direct" \| "proxy" \| "auto"` | `"auto"` | `direct` = caller→callee mTLS; `proxy` = via the runtime; `auto` = direct when an endpoint is known. |
|
|
309
|
+
| `idempotencyKey` | `string` | none | Opts into runtime-side dedup; replays within the TTL return the cached response. |
|
|
310
|
+
| `retry` | `Partial<RetryOpts>` | exp. backoff | `{ maxAttempts: 3, baseDelayMs: 200, factor: 2, maxDelayMs: 5000, jitter: 0.3 }`. Set `maxAttempts: 1` to disable. |
|
|
617
311
|
|
|
618
|
-
|
|
619
|
-
|---|---|---|
|
|
620
|
-
| `write` | `write(data: unknown, key?: string): Promise<void>` | Append a real-time chunk to the trace stream. |
|
|
621
|
-
| `end` | `end(key?: string): Promise<void>` | No-op placeholder for API symmetry (lifecycle managed by runtime). |
|
|
312
|
+
Without an `idempotencyKey`, ambiguous failures (`INTERNAL` / `ABORTED` / `UNKNOWN`) are treated as non-retryable so a non-idempotent call is never silently repeated. Schema-version mismatches are filtered at routing time, so blue-green deploys route `v1→v1` and `v2→v2` automatically.
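To make the default retry policy concrete, the sketch below computes the wait before each retry from the `RetryOpts` values in the table. It assumes a conventional exponential-backoff reading (`baseDelayMs * factor^(attempt - 1)`, capped at `maxDelayMs`, then spread by plus or minus `jitter`); the SDK's actual formula is not stated here, and `backoffDelayMs` is a hypothetical helper, not a package export.

```ts
// Hypothetical helper, not an SDK export: one common reading of RetryOpts.
interface RetryOpts {
  maxAttempts: number;
  baseDelayMs: number;
  factor: number;
  maxDelayMs: number;
  jitter: number; // fraction of the delay, e.g. 0.3 means plus or minus 30%
}

// Default values as listed in the CallOpts table.
const defaults: RetryOpts = {
  maxAttempts: 3,
  baseDelayMs: 200,
  factor: 2,
  maxDelayMs: 5000,
  jitter: 0.3,
};

// Delay to wait after failed attempt number `attempt` (1-based).
// `rand` is injectable so the jitter can be made deterministic in tests.
function backoffDelayMs(
  attempt: number,
  opts: RetryOpts = defaults,
  rand: () => number = Math.random,
): number {
  const raw = opts.baseDelayMs * Math.pow(opts.factor, attempt - 1);
  const capped = Math.min(raw, opts.maxDelayMs);
  const spread = capped * opts.jitter * (rand() * 2 - 1);
  return Math.max(0, Math.round(capped + spread));
}

// With maxAttempts: 3 there are at most two waits: after attempt 1 and after attempt 2.
for (let attempt = 1; attempt < defaults.maxAttempts; attempt++) {
  console.log(`attempt ${attempt} failed, waiting ~${backoffDelayMs(attempt)} ms`);
}
```

Under this reading, `maxAttempts: 1` leaves no retries at all, which matches the "set `maxAttempts: 1` to disable" note in the table.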
|
|
622
313
|
|
|
623
|
-
|
|
314
|
+
### Events
|
|
624
315
|
|
|
625
|
-
|
|
316
|
+
Durable, at-least-once publish/subscribe. Events hit a local SQLite outbox first, then drain to the runtime, so a publish survives a transient disconnect.
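The outbox idea can be sketched in a few lines. This is an illustration of the pattern only: an in-memory array stands in for the SQLite file, and `Outbox`, `enqueue`, `drain` and the local `OutboxFullError` class are hypothetical names that merely mirror the behaviour described here, not the SDK's internals.

```ts
// Illustrative only: an in-memory array stands in for the SDK's SQLite outbox.
class OutboxFullError extends Error {} // local stand-in for the error named in this README

interface PendingEvent {
  name: string;
  payload: unknown;
}

class Outbox {
  private queue: PendingEvent[] = [];
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Publishing only appends locally, so it succeeds even while disconnected.
  enqueue(evt: PendingEvent): void {
    if (this.queue.length >= this.capacity) throw new OutboxFullError("outbox is full");
    this.queue.push(evt);
  }

  // Drain in order. An event leaves the outbox only after `send` resolves;
  // if `send` throws (connection lost), it stays queued for the next drain.
  async drain(send: (evt: PendingEvent) => Promise<void>): Promise<number> {
    let sent = 0;
    for (;;) {
      const next = this.queue[0];
      if (next === undefined) break;
      await send(next);
      this.queue.shift();
      sent++;
    }
    return sent;
  }

  get size(): number {
    return this.queue.length;
  }
}
```

Because an event is removed only after a successful send, a crash between send and removal can deliver the same event twice, which is why delivery is at-least-once and handlers should be idempotent.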
|
|
626
317
|
|
|
627
318
|
```ts
|
|
628
|
-
|
|
629
|
-
|
|
630
|
-
handler: (payload: unknown, ctx: EventContext) => void | Promise<void>,
|
|
631
|
-
opts?: HandleEventOpts,
|
|
632
|
-
): ServiceBridgeService
|
|
633
|
-
```
|
|
319
|
+
// Declare what you publish (same file-based SchemaSpec as RPC).
|
|
320
|
+
sb.event.define("order.placed", { protoFile: "./events.proto", input: "OrderPlaced" });
|
|
634
321
|
|
|
635
|
-
|
|
636
|
-
|
|
637
|
-
|
|
638
|
-
|
|
639
|
-
| Option | Type | Description |
|
|
640
|
-
|---|---|---|
|
|
641
|
-
| `concurrency` | `number` | Advisory concurrency hint (currently not hard-enforced). |
|
|
642
|
-
| `prefetch` | `number` | Advisory prefetch hint (currently not hard-enforced). |
|
|
643
|
-
| `retryPolicyJson` | `string` | Retry policy JSON string. |
|
|
644
|
-
| `filterExpr` | `string` | Server-side filter expression. |
|
|
645
|
-
|
|
646
|
-
The consumer group name is fixed as `<service-key-id>.<pattern>` (derived from your sbv2 key and the pattern string). Registering a second handler for the same pattern throws a duplicate consumer-group error.
|
|
647
|
-
|
|
648
|
-
**Delivery guarantee**: once a message is accepted by the runtime, delivery to each consumer group
|
|
649
|
-
is guaranteed. If the consumer is offline, the message waits in the server-side queue and is
|
|
650
|
-
dispatched automatically the moment the service reconnects and registers its handlers — no retry
|
|
651
|
-
budget is consumed while waiting. After `SERVICEBRIDGE_DELIVERY_TTL_DAYS` (default 7) days without
|
|
652
|
-
a consumer, the delivery moves to DLQ with reason `delivery_ttl_exceeded`.
|
|
653
|
-
|
|
654
|
-
`EventContext` helpers:
|
|
655
|
-
|
|
656
|
-
- `ctx.traceId` — current trace ID
|
|
657
|
-
- `ctx.spanId` — current span ID
|
|
658
|
-
- `ctx.retry(delayMs?)` — ask for redelivery with optional delay
|
|
659
|
-
- `ctx.reject(reason)` — move to DLQ immediately, bypassing remaining retries
|
|
660
|
-
- `ctx.refs` — metadata (`topic`, `groupName`, `messageId`, `attempt`, `headers`)
|
|
661
|
-
- `ctx.stream.write(...)` — append real-time chunks to trace stream
|
|
662
|
-
|
|
663
|
-
```ts
|
|
664
|
-
sb.events.handle("orders.*", async (payload, ctx) => {
|
|
665
|
-
const body = payload as { orderId?: string };
|
|
666
|
-
if (!body.orderId) {
|
|
667
|
-
ctx.reject("missing_order_id");
|
|
668
|
-
return;
|
|
669
|
-
}
|
|
670
|
-
await ctx.stream.write({ status: "processing", orderId: body.orderId }, "progress");
|
|
322
|
+
// Subscribe — exact name or wildcard ("order.*", "order.#").
|
|
323
|
+
sb.event.handle("order.placed", async (payload) => {
|
|
324
|
+
await fulfil(payload);
|
|
671
325
|
});
|
|
672
|
-
```
|
|
673
326
|
|
|
674
|
-
|
|
327
|
+
await sb.start();
|
|
675
328
|
|
|
676
|
-
|
|
677
|
-
|
|
678
|
-
```ts
|
|
679
|
-
start(opts?: StartOpts): Promise<void>
|
|
329
|
+
const { eventId } = await sb.event.publish("order.placed", { orderId: "o-1", total: 4200 });
|
|
680
330
|
```
|
|
681
331
|
|
|
682
|
-
|
|
683
|
-
The promise resolves once startup/registration is complete (it does not block
|
|
684
|
-
the Node.js process). Throws immediately if no handlers are registered (neither `rpc.handle()` nor `events.handle()` have been called).
|
|
685
|
-
|
|
686
|
-
`StartOpts`:
|
|
332
|
+
Event names must match `^[a-z0-9_-]+(\.[a-z0-9_-]+)*$` (invalid → `InvalidEventNameError`). A full outbox throws `OutboxFullError`.
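If you want to reject bad names before they reach `publish()`, the same pattern can be checked locally. The regular expression is copied from the rule above; `isValidEventName` is an illustrative helper, not an SDK export.

```ts
// Pattern copied from the naming rule above; the helper itself is illustrative.
const EVENT_NAME = /^[a-z0-9_-]+(\.[a-z0-9_-]+)*$/;

function isValidEventName(name: string): boolean {
  return EVENT_NAME.test(name);
}

const samples = [
  "order.placed",             // lowercase dot-separated segments
  "order_v2.line-item.added", // underscores and hyphens are allowed
  "Order.Placed",             // uppercase letters are not in the character class
  "order..placed",            // empty segment
  "order.*",                  // wildcard characters are not in the character class
];

for (const name of samples) {
  console.log(name, isValidEventName(name) ? "ok" : "rejected");
}
```

Note that the pattern has no room for `*` or `#`, so wildcard forms such as `order.*` presumably apply to subscription patterns only, not to published names.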
|
|
687
333
|
|
|
688
|
-
|
|
|
334
|
+
| `PublishOpts` | Type | Description |
|
|
689
335
|
|---|---|---|
|
|
690
|
-
| `
|
|
691
|
-
| `
|
|
692
|
-
| `
|
|
693
|
-
| `
|
|
694
|
-
| `
|
|
336
|
+
| `idempotencyKey` | `string` | Dedup key for at-least-once delivery. |
|
|
337
|
+
| `partitionKey` | `string` | Orders delivery within a partition. |
|
|
338
|
+
| `fireAndForget` | `boolean` | Skip the durable wait for the publish ack. |
|
|
339
|
+
| `headers` | `Record<string, string>` | Custom envelope headers. |
|
|
340
|
+
| `occurredAtMs` | `number` | Event time (unix-ms); defaults to now. |
|
|
695
341
|
|
|
696
|
-
|
|
697
|
-
await sb.start({
|
|
698
|
-
host: "localhost",
|
|
699
|
-
instanceId: process.env.HOSTNAME,
|
|
700
|
-
});
|
|
701
|
-
```
|
|
342
|
+
The runtime delivers at-least-once, retries failures, fans out to every matching subscriber, and dead-letters exhausted messages. The DLQ is operated from the dashboard — the SDK has no DLQ API; make handlers idempotent and throw to signal "retry me".
|
|
702
343
|
|
|
703
|
-
|
|
344
|
+
### Jobs
|
|
704
345
|
|
|
705
|
-
|
|
346
|
+
Scheduled work: cron, fixed interval, or one-shot delay. The runtime owns the schedule, leasing and retries.
|
|
706
347
|
|
|
707
348
|
```ts
|
|
708
|
-
|
|
709
|
-
|
|
710
|
-
|
|
711
|
-
|
|
712
|
-
|
|
713
|
-
---
|
|
714
|
-
|
|
715
|
-
### `startHttpSpan(opts)`
|
|
716
|
-
|
|
717
|
-
```ts
|
|
718
|
-
startHttpSpan(opts: {
|
|
719
|
-
method: string;
|
|
720
|
-
path: string;
|
|
721
|
-
traceId?: string;
|
|
722
|
-
parentSpanId?: string;
|
|
723
|
-
}): HttpSpan
|
|
724
|
-
```
|
|
725
|
-
|
|
726
|
-
Manual HTTP tracing primitive.
|
|
727
|
-
|
|
728
|
-
```ts
|
|
729
|
-
const span = sb.startHttpSpan({ method: "GET", path: "/health" });
|
|
730
|
-
try {
|
|
731
|
-
span.end({ statusCode: 200, success: true });
|
|
732
|
-
} catch (e) {
|
|
733
|
-
span.end({ success: false, error: String(e) });
|
|
734
|
-
}
|
|
735
|
-
```
|
|
736
|
-
|
|
737
|
-
---
|
|
349
|
+
sb.job.handle("nightly-rollup",
|
|
350
|
+
{ trigger: { cron: "0 3 * * *", tz: "UTC" } }, // 5-field cron, no seconds
|
|
351
|
+
async (ctx) => { await rollup(ctx.scheduledAt); },
|
|
352
|
+
);
|
|
738
353
|
|
|
739
|
-
|
|
354
|
+
sb.job.handle("heartbeat", { trigger: { interval: 30_000 } }, async () => { await ping(); });
|
|
740
355
|
|
|
741
|
-
|
|
742
|
-
|
|
743
|
-
|
|
744
|
-
|
|
745
|
-
instanceId?: string;
|
|
746
|
-
endpoint?: string;
|
|
747
|
-
allowedCallers?: string[];
|
|
748
|
-
requestSchemaJson?: string;
|
|
749
|
-
responseSchemaJson?: string;
|
|
750
|
-
transport?: string;
|
|
751
|
-
}): Promise<void>
|
|
356
|
+
sb.job.handle("send-reminder",
|
|
357
|
+
{ trigger: { delayed: { at: Date.now() + 60_000 } } }, // Date | number | ISO string
|
|
358
|
+
async (ctx) => { await remind(ctx.idempotencyKey); },
|
|
359
|
+
);
|
|
752
360
|
```
|
|
753
361
|
|
|
754
|
-
|
|
362
|
+
The handler receives a `JobHandlerCtx`: `{ jobName, executionId, scheduledAt, localScheduledAt, attempt, idempotencyKey, signal }`.
|
|
755
363
|
|
|
756
|
-
|
|
|
757
|
-
|
|
758
|
-
| `
|
|
759
|
-
| `
|
|
760
|
-
| `
|
|
761
|
-
| `
|
|
762
|
-
| `
|
|
763
|
-
|
|
764
|
-
|
|
765
|
-
|
|
766
|
-
|
|
767
|
-
|
|
768
|
-
|
|
769
|
-
|
|
770
|
-
|
|
771
|
-
|
|
772
|
-
|
|
364
|
+
| `JobOpts` | Type | Default | Description |
|
|
365
|
+
|---|---|---|---|
|
|
366
|
+
| `trigger` | `{cron, tz?} \| {delayed:{at}} \| {interval}` | required | Exactly one trigger; `interval` is in ms. |
|
|
367
|
+
| `catchup` | `"skip" \| "fire_once" \| "fire_all"` | `skip` | What to do for fire times missed during downtime. |
|
|
368
|
+
| `overlap` | `"skip" \| "allow" \| "buffer_one"` | `allow` | Behaviour when a previous run is still in flight. |
|
|
369
|
+
| `deps` | `DeclaredDep[]` | none | Outgoing deps: `{ rpc }`, `{ event }`, `{ workflow }`. |
|
|
370
|
+
| `maxAttempts` / `leaseTtlMs` / `maxConcurrent` / `retry` | — | runtime default | Execution limits and `{ initialMs, maxMs, multiplier, jitter }` retry. |
|
|
371
|
+
|
|
372
|
+
### Workflows
|
|
373
|
+
|
|
374
|
+
Durable DAGs. Declare the graph once; the runtime executes it, persists state between steps, survives restarts, and compensates on failure or cancel.
|
|
375
|
+
|
|
376
|
+
```ts
|
|
377
|
+
sb.workflow.handle("checkout", {
|
|
378
|
+
input: { type: "object", properties: { orderId: { type: "string" } } },
|
|
379
|
+
steps: [
|
|
380
|
+
{ id: "reserve", type: "call", service: "inventory-svc", method: "Reserve",
|
|
381
|
+
input: "$.input",
|
|
382
|
+
compensate: { service: "inventory-svc", method: "Release", input: "$.reserve" } },
|
|
383
|
+
{ id: "charge", type: "call", service: "payment-svc", method: "Charge",
|
|
384
|
+
input: "$.input", waitFor: ["reserve"] },
|
|
385
|
+
{ id: "notify", type: "publish", event: "order.placed",
|
|
386
|
+
input: "$.input", waitFor: ["charge"] },
|
|
387
|
+
],
|
|
773
388
|
});
|
|
774
389
|
```
|
|
775
390
|
|
|
776
|
-
|
|
391
|
+
Top-level steps run in parallel by default; `waitFor` declares dependencies and defines the execution levels. Step types: `call`, `publish`, `sleep`, `wait_event`, `wait_signal`, `workflow` (sub-workflow), `parallel`, `sequence`, `local`. Inputs are JSON-path expressions (`"$.input"`, `"$.reserve.id"`) over the accumulated run state.
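One way to picture "execution levels" is to group steps by how many `waitFor` hops separate them from a root. The sketch below does that for a list of `{ id, waitFor }` objects. It is a hypothetical illustration of the idea, not the runtime's scheduler, and `executionLevels` is not an SDK function.

```ts
// Hypothetical sketch: derives levels from `waitFor`. Not the runtime's scheduler.
interface Step {
  id: string;
  waitFor?: string[];
}

function executionLevels(steps: Step[]): string[][] {
  const remaining = new Map<string, Step>();
  for (const s of steps) remaining.set(s.id, s);
  const done = new Set<string>();
  const levels: string[][] = [];

  while (remaining.size > 0) {
    // A step is ready once every step it waits for finished in an earlier level.
    const ready: string[] = [];
    for (const s of Array.from(remaining.values())) {
      if ((s.waitFor ?? []).every((dep) => done.has(dep))) ready.push(s.id);
    }
    if (ready.length === 0) throw new Error("cycle or unknown dependency in waitFor");
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    levels.push(ready); // steps in the same level can run in parallel
  }
  return levels;
}

// The checkout graph from the example above, reduced to ids and dependencies.
const checkout: Step[] = [
  { id: "reserve" },
  { id: "charge", waitFor: ["reserve"] },
  { id: "notify", waitFor: ["charge"] },
];

console.log(executionLevels(checkout));
```

For the `checkout` graph every step waits on the previous one, so each level holds a single step; two steps with no `waitFor` would share the first level and run side by side.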
|
|
777
392
|
|
|
778
|
-
|
|
393
|
+
Driving a run:
|
|
779
394
|
|
|
780
395
|
```ts
|
|
781
|
-
|
|
782
|
-
```
|
|
783
|
-
|
|
784
|
-
Subscribes to a trace stream with replay and live updates. `traceId` is the stream
|
|
785
|
-
identifier used by `ctx.stream.write(...)`.
|
|
786
|
-
|
|
787
|
-
`WatchTraceOpts`:
|
|
788
|
-
|
|
789
|
-
| Option | Type | Default | Description |
|
|
790
|
-
|---|---|---|---|
|
|
791
|
-
| `key` | `string` | `""` | Stream key filter (`""` = all keys). |
|
|
792
|
-
| `fromSequence` | `number` | `0` | Replay from sequence cursor. |
|
|
396
|
+
const { runId } = await sb.workflow.start("checkout", { orderId: "o-1" });
|
|
793
397
|
|
|
794
|
-
|
|
398
|
+
const state = await sb.workflow.await(runId); // block until terminal
|
|
399
|
+
const snap = await sb.workflow.query(runId); // { status, state, steps: [...] }
|
|
400
|
+
await sb.workflow.signal(runId, "approval", { ok: 1 }); // resume a wait_signal step
|
|
401
|
+
await sb.workflow.cancel(runId); // compensate in reverse
|
|
402
|
+
const { runId: forked } = await sb.workflow.replay(runId, { fromStepId: "charge" });
|
|
403
|
+
```
|
|
795
404
|
|
|
796
|
-
|
|
797
|
-
|---|---|---|
|
|
798
|
-
| `type` | `"chunk" \| "trace_complete"` | Event kind. |
|
|
799
|
-
| `traceId` | `string` | Trace identifier being watched. |
|
|
800
|
-
| `key` | `string` | Stream lane key. |
|
|
801
|
-
| `sequence` | `number` | Monotonic sequence number. |
|
|
802
|
-
| `data` | `unknown` | JSON-decoded chunk payload. |
|
|
803
|
-
| `traceStatus` | `string \| undefined` | Final status on `trace_complete`. |
|
|
405
|
+
Use `sb.workflow.query()` for the snapshot; there is no `getStatus`. Calling `start()` without permission throws `WorkflowAccessDeniedError`; an unknown name throws `WorkflowNotFoundError`; signalling or cancelling a finished run throws `WorkflowTerminalError`.
|
|
804
406
|
|
|
805
|
-
|
|
407
|
+
### Streaming
|
|
806
408
|
|
|
807
|
-
-
|
|
808
|
-
- Deduplicates by `sequence` across reconnects.
|
|
809
|
-
- Enforces strict JSON for `type="chunk"` payloads (non-JSON chunk terminates stream with fatal error).
|
|
810
|
-
- Enforces internal queue limit `256`; overflow is fatal (consumer must drain promptly).
|
|
409
|
+
Server-side streaming is a first-class RPC shape. Register with `sb.rpc.handleStream`, consume with `sb.stream()` (or the typed proxy, which auto-detects `returns (stream T)` methods).
|
|
811
410
|
|
|
812
411
|
```ts
|
|
813
|
-
for await (const
|
|
814
|
-
|
|
815
|
-
process.stdout.write(String((evt.data as { token?: string }).token ?? ""));
|
|
816
|
-
}
|
|
817
|
-
if (evt.type === "trace_complete") break;
|
|
412
|
+
for await (const chunk of sb.stream("gen-svc", "Generate", { prompt: "write a haiku" })) {
|
|
413
|
+
process.stdout.write(chunk.token);
|
|
818
414
|
}
|
|
819
415
|
```
|
|
820
416
|
|
|
821
|
-
|
|
822
|
-
|
|
823
|
-
### Trace Utilities
|
|
824
|
-
|
|
825
|
-
#### `getTraceContext()`
|
|
417
|
+
Breaking the loop (`break`/`return`) tears down the gRPC stream end to end. Streams are single-pick — never retried — by design.
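The teardown-on-`break` behaviour builds on a standard JavaScript mechanism: leaving a `for await` loop early calls the iterator's `return()`, which runs the generator's `finally` block. The sketch below shows only that language mechanism with a local generator; `tokens` and `readTwo` are made-up names and no SDK call is involved.

```ts
// Plain TypeScript, no SDK involved: demonstrates iterator cleanup on early exit.
let cleanedUp = false;

async function* tokens(): AsyncGenerator<string> {
  try {
    for (const word of ["one", "two", "three", "four"]) yield word;
  } finally {
    // Runs when the consumer breaks out early or the generator completes.
    // A streaming client would cancel the underlying call at this point.
    cleanedUp = true;
  }
}

async function readTwo(): Promise<string[]> {
  const seen: string[] = [];
  for await (const t of tokens()) {
    seen.push(t);
    if (seen.length === 2) break; // early exit calls the generator's return()
  }
  return seen;
}

readTwo().then((seen) => console.log(seen, "cleaned up:", cleanedUp));
```

The cleanup has finished by the time `readTwo()` resolves, so code after the loop can assume the producer side has already been told to stop.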
|
|
826
418
|
|
|
827
|
-
|
|
828
|
-
getTraceContext(): TraceCtx | undefined
|
|
829
|
-
```
|
|
419
|
+
### Telemetry
|
|
830
420
|
|
|
831
|
-
|
|
421
|
+
Telemetry flows automatically: every RPC, event, job, workflow step and HTTP request emits an operation span and propagates the trace across hops. Add your own through `sb.telemetry`; anything emitted inside a handler nests under that handler's trace.
|
|
832
422
|
|
|
833
423
|
```ts
|
|
834
|
-
import {
|
|
424
|
+
import { Channel, UserSubOp } from "service-bridge";
|
|
835
425
|
|
|
836
|
-
const
|
|
837
|
-
|
|
838
|
-
|
|
426
|
+
const op = sb.telemetry.startOp({
|
|
427
|
+
channel: Channel.USER, kind: UserSubOp, subject: "reprice-cart", businessKey: cartId,
|
|
428
|
+
});
|
|
429
|
+
try {
|
|
430
|
+
await reprice(cartId);
|
|
431
|
+
op.end(/* Status.SUCCESS */);
|
|
432
|
+
} catch (err) {
|
|
433
|
+
op.end(/* Status.ERROR */, String(err));
|
|
434
|
+
throw err;
|
|
839
435
|
}
|
|
840
|
-
```
|
|
841
|
-
|
|
842
|
-
#### `withTraceContext(ctx, fn)`
|
|
843
436
|
|
|
844
|
-
|
|
845
|
-
|
|
437
|
+
sb.telemetry.log.info("cart repriced", { cartId, items: 7 }); // also sb.logger
|
|
438
|
+
sb.telemetry.counter("carts_repriced_total").inc();
|
|
439
|
+
sb.telemetry.gauge("queue_depth").set(42);
|
|
440
|
+
sb.telemetry.histogram("reprice_ms", "ms").observe(12.5);
|
|
846
441
|
```
|
|
847
442
|
|
|
848
|
-
|
|
443
|
+
`startOp()` returns a handle whose `.end(status, message?)` closes the span. Anything emitted before `start()` buffers in an in-memory ring and drains once connected.
|
|
849
444
|
|
|
850
|
-
|
|
851
|
-
import { withTraceContext } from "service-bridge";
|
|
445
|
+
### HTTP
|
|
852
446
|
|
|
853
|
-
|
|
854
|
-
|
|
855
|
-
|
|
856
|
-
```
|
|
447
|
+
ServiceBridge does **not** proxy your business HTTP. You run your own server; the integration discovers your routes, publishes them to the Service Map, and wraps each request in a trace span so HTTP stitches into the same trace as the RPCs and events it triggers. See [HTTP plugins](#http-plugins).
|
|
448
|
+
|
|
449
|
+
Useful read accessors after `start()`: `sb.identity()` (current session identity or `null`), `sb.serviceMap()` (live registry: visible methods, instances, endpoints), `sb.policyEvaluation()` (the runtime's current access-policy verdict).
|
|
857
450
|
|
|
858
451
|
---
|
|
859
452
|
|
|
860
|
-
## HTTP
|
|
453
|
+
## HTTP plugins
|
|
861
454
|
|
|
862
|
-
|
|
455
|
+
Each integration is a subpath import with an optional peer dependency.
|
|
863
456
|
|
|
864
|
-
|
|
865
|
-
npm install express
|
|
866
|
-
```
|
|
457
|
+
**Express** — `service-bridge/express`:
|
|
867
458
|
|
|
868
459
|
```ts
|
|
869
460
|
import express from "express";
|
|
870
461
|
import { ServiceBridge } from "service-bridge";
|
|
871
|
-
import {
|
|
462
|
+
import { attachExpress } from "service-bridge/express";
|
|
872
463
|
|
|
873
|
-
const sb = new ServiceBridge(process.env.SERVICEBRIDGE_URL!, process.env.SERVICEBRIDGE_SERVICE_KEY!);
|
|
874
464
|
const app = express();
|
|
465
|
+
app.post("/orders", (req, res) => res.json({ ok: true }));
|
|
875
466
|
|
|
876
|
-
|
|
877
|
-
|
|
878
|
-
excludePaths: ["/health"],
|
|
879
|
-
autoRegister: true,
|
|
880
|
-
}));
|
|
467
|
+
const sb = new ServiceBridge("localhost:14445", KEY);
|
|
468
|
+
await sb.start();
|
|
881
469
|
|
|
882
|
-
app.
|
|
883
|
-
const user = await req.servicebridge.rpc.invoke("user.get", { id: req.params.id });
|
|
884
|
-
res.json(user);
|
|
885
|
-
});
|
|
470
|
+
app.listen(3000, () => attachExpress(app, sb, { port: 3000 }));
|
|
886
471
|
```
|
|
887
472
|
|
|
888
|
-
|
|
889
|
-
|
|
890
|
-
```ts
|
|
891
|
-
servicebridgeMiddleware(options: {
|
|
892
|
-
client: ServiceBridgeService;
|
|
893
|
-
excludePaths?: string[];
|
|
894
|
-
propagateTraceHeader?: boolean;
|
|
895
|
-
autoRegister?: boolean;
|
|
896
|
-
}): express.RequestHandler
|
|
897
|
-
```
|
|
898
|
-
|
|
899
|
-
- Attaches `req.servicebridge`, `req.traceId`, `req.spanId`
|
|
900
|
-
- Starts/ends HTTP span automatically
|
|
901
|
-
- Optionally sets `x-trace-id` response header
|
|
902
|
-
- Optionally auto-registers route pattern in catalog on first hit
|
|
903
|
-
|
|
904
|
-
#### `registerExpressRoutes(app, client, opts?)`
|
|
905
|
-
|
|
906
|
-
Eager route catalog registration without waiting for first request.
|
|
907
|
-
|
|
908
|
-
```ts
|
|
909
|
-
await registerExpressRoutes(app, sb, {
|
|
910
|
-
endpoint: "http://10.0.0.5:3000",
|
|
911
|
-
allowedCallers: ["api-gateway"],
|
|
912
|
-
excludePaths: ["/health"],
|
|
913
|
-
});
|
|
914
|
-
```
|
|
915
|
-
|
|
916
|
-
---
|
|
917
|
-
|
|
918
|
-
### Fastify (`service-bridge/fastify`)
|
|
919
|
-
|
|
920
|
-
```bash
|
|
921
|
-
npm install fastify
|
|
922
|
-
```
|
|
473
|
+
**Fastify** — `service-bridge/fastify`:
|
|
923
474
|
|
|
924
475
|
```ts
|
|
925
476
|
import Fastify from "fastify";
|
|
926
477
|
import { ServiceBridge } from "service-bridge";
|
|
927
|
-
import {
|
|
478
|
+
import { sbFastify } from "service-bridge/fastify";
|
|
928
479
|
|
|
929
|
-
const sb = new ServiceBridge(process.env.SERVICEBRIDGE_URL!, process.env.SERVICEBRIDGE_SERVICE_KEY!);
|
|
930
480
|
const app = Fastify();
|
|
481
|
+
const sb = new ServiceBridge("localhost:14445", KEY);
|
|
931
482
|
|
|
932
|
-
|
|
933
|
-
|
|
934
|
-
excludePaths: ["/health"],
|
|
935
|
-
autoRegister: true,
|
|
936
|
-
});
|
|
483
|
+
app.post("/orders", async () => ({ ok: true }));
|
|
484
|
+
await app.register(sbFastify, { sb }); // discovers routes + endpoint in onListen
|
|
937
485
|
|
|
938
|
-
|
|
939
|
-
|
|
940
|
-
id: (request.params as any).id,
|
|
941
|
-
});
|
|
942
|
-
return reply.send(user);
|
|
943
|
-
}));
|
|
486
|
+
await sb.start();
|
|
487
|
+
await app.listen({ port: 3000 });
|
|
944
488
|
```
|
|
945
489
|
|
|
946
|
-
|
|
490
|
+
**Hono** — `service-bridge/hono`:
|
|
947
491
|
|
|
948
492
|
```ts
|
|
949
|
-
|
|
950
|
-
|
|
951
|
-
|
|
952
|
-
propagateTraceHeader?,
|
|
953
|
-
autoRegister?,
|
|
954
|
-
register?: {
|
|
955
|
-
instanceId?,
|
|
956
|
-
endpoint?,
|
|
957
|
-
allowedCallers?,
|
|
958
|
-
excludePaths?,
|
|
959
|
-
},
|
|
960
|
-
})
|
|
961
|
-
```
|
|
962
|
-
|
|
963
|
-
- Decorates `request.servicebridge`, `request.traceId`, `request.spanId`
|
|
964
|
-
- Traces HTTP lifecycle via hooks
|
|
965
|
-
- Auto-registers routes on `onRoute` before traffic
|
|
966
|
-
|
|
967
|
-
#### `wrapHandler(handler)`
|
|
968
|
-
|
|
969
|
-
Runs a Fastify handler inside the current trace context so downstream SDK calls inherit the trace.
|
|
970
|
-
|
|
971
|
-
---
|
|
972
|
-
|
|
973
|
-
### Trace Utilities (HTTP Plugins)
|
|
493
|
+
import { Hono } from "hono";
|
|
494
|
+
import { ServiceBridge } from "service-bridge";
|
|
495
|
+
import { attachHono } from "service-bridge/hono";
|
|
974
496
|
|
|
975
|
-
|
|
497
|
+
const app = new Hono();
|
|
498
|
+
app.post("/orders", (c) => c.json({ ok: true }));
|
|
976
499
|
|
|
977
|
-
|
|
978
|
-
|
|
979
|
-
// or
|
|
980
|
-
import { extractTraceFromHeaders } from "service-bridge/fastify";
|
|
500
|
+
const sb = new ServiceBridge("localhost:14445", KEY);
|
|
501
|
+
await sb.start();
|
|
981
502
|
|
|
982
|
-
|
|
503
|
+
attachHono(app, sb, { port: 3000 }); // Hono doesn't own the socket — pass the port
|
|
504
|
+
Bun.serve({ port: 3000, fetch: app.fetch });
|
|
983
505
|
```
|
|
984
506
|
|
|
985
|
-
|
|
507
|
+
`attachExpress`/`attachHono` take `{ port, host? }`; `sbFastify` reads the bound address itself. Host defaults to the bound socket, falling back to `127.0.0.1`. Attaching before `start()` is safe — the endpoint rides along in the first registration.
|
|
986
508
|
|
|
987
509
|
---
|
|
988
510
|
|
|
989
511
|
## Configuration
|
|
990
512
|
|
|
991
|
-
|
|
513
|
+
All configuration lives on the `ServiceBridge` constructor — `new ServiceBridge(url, key, options)`. The SDK reads no environment variables; you decide where `url`, `key` and options come from. Every option is optional.
|
|
992
514
|
|
|
993
|
-
|
|
994
|
-
- Control plane is TLS-only. Trust source is embedded into sbv2 service key by default.
|
|
995
|
-
- Embedded/explicit CA PEM is validated with strict x509 parsing.
|
|
996
|
-
- If `workerTLS` is not provided, SDK auto-provisions worker certs via gRPC `ProvisionWorkerCertificate`.
|
|
997
|
-
- `workerTLS.cert` and `workerTLS.key` must be provided together.
|
|
998
|
-
- `start({ tls })` overrides global `workerTLS` for a specific worker instance.
|
|
999
|
-
|
|
1000
|
-
### Offline queue behavior
|
|
1001
|
-
|
|
1002
|
-
When the control plane is unavailable, SDK queues write operations (`events.publish`, `jobs.run`, `workflows.run`, telemetry writes).
|
|
1003
|
-
|
|
1004
|
-
- Queue size: `queueMaxSize` (default: 1000)
|
|
1005
|
-
- Overflow policy: `queueOverflow` (default: `"drop-oldest"`)
|
|
1006
|
-
- Return values for queued writes may be empty strings until flushed
|
|
1007
|
-
|
|
1008
|
-
---
|
|
1009
|
-
|
|
1010
|
-
## Environment Variables
|
|
1011
|
-
|
|
1012
|
-
The SDK requires values you pass into `new ServiceBridge(...)`. Common setup:
|
|
1013
|
-
|
|
1014
|
-
| Variable | Required | Example | Description |
|
|
515
|
+
| Option | Type | Default | Description |
|
|
1015
516
|
|---|---|---|---|
|
|
1016
|
-
| `
|
|
1017
|
-
| `
|
|
1018
|
-
|
|
1019
|
-
|
|
1020
|
-
|
|
1021
|
-
|
|
1022
|
-
|
|
1023
|
-
|
|
1024
|
-
|
|
1025
|
-
|
|
1026
|
-
|
|
1027
|
-
|
|
1028
|
-
|
|
1029
|
-
|
|
1030
|
-
|
|
1031
|
-
|
|
1032
|
-
|
|
1033
|
-
|
|
1034
|
-
|
|
1035
|
-
try {
|
|
1036
|
-
await sb.rpc.invoke("payment.charge", { orderId: "ord_1" });
|
|
1037
|
-
} catch (e) {
|
|
1038
|
-
if (e instanceof ServiceBridgeError) {
|
|
1039
|
-
console.error(e.component, e.operation, e.severity, e.retryable, e.code, e.grpcStatus);
|
|
1040
|
-
}
|
|
1041
|
-
throw e;
|
|
1042
|
-
}
|
|
1043
|
-
```
|
|
1044
|
-
|
|
1045
|
-
| Field | Type | Description |
|
|
1046
|
-
|---|---|---|
|
|
1047
|
-
| `component` | `string` | SDK subsystem (for example, `"rpc"` or `"event"`). |
|
|
1048
|
-
| `operation` | `string` | Operation that failed. |
|
|
1049
|
-
| `severity` | `"fatal" \| "retriable" \| "ignorable"` | Error classification. |
|
|
1050
|
-
| `retryable` | `boolean` | Whether retry is recommended (`true` when `severity === "retriable"`). |
|
|
1051
|
-
| `code` | `ServiceBridgeErrorCode` | Stable SDK error id (`SB_*`). |
|
|
1052
|
-
| `grpcStatus` | `number \| undefined` | gRPC status code when the error came from gRPC. |
|
|
1053
|
-
| `cause` | `unknown` | Original underlying error. |
|
|
1054
|
-
|
|
1055
|
-
---
|
|
1056
|
-
|
|
1057
|
-
## When to Use / When Not to Use
|
|
1058
|
-
|
|
1059
|
-
### ServiceBridge is a good fit when you:
|
|
1060
|
-
|
|
1061
|
-
- Have **3+ microservices** that need to communicate via RPC, events, or both
|
|
1062
|
-
- Want **RPC + events + workflows + jobs** without managing separate infrastructure for each
|
|
1063
|
-
- Need **end-to-end tracing** across all communication patterns in one timeline
|
|
1064
|
-
- Want to **eliminate sidecar proxies** and reduce operational overhead
|
|
1065
|
-
- Need **durable event delivery** with retry, DLQ, and replay without running a broker
|
|
1066
|
-
- Are building **AI/LLM pipelines** and need realtime streaming with replay
|
|
1067
|
-
|
|
1068
|
-
### Consider alternatives when you:
|
|
1069
|
-
|
|
1070
|
-
- Run a **single monolith** with no service decomposition plans
|
|
1071
|
-
- Need **ultra-high-throughput event streaming** (100K+ msg/s sustained) — Kafka is purpose-built for this
|
|
1072
|
-
- Need a **full API gateway** with rate limiting, auth plugins, and request transformation — use Kong/Envoy Gateway
|
|
1073
|
-
- Already have a **mature Istio/Linkerd mesh** and only need traffic management (no events/workflows/jobs)
|
|
1074
|
-
- Need **multi-region event replication** — ServiceBridge currently targets single-region deployments
|
|
1075
|
-
|
|
1076
|
-
---
|
|
1077
|
-
|
|
1078
|
-
## v2 Session API
|
|
1079
|
-
|
|
1080
|
-
`session_v2.ts` implements the new Enterprise Session Protocol: a channel-based bidirectional stream with an 8-state FSM, adaptive heartbeat, and credit-based flow control. It mirrors the Go and Python SDKs.
|
|
1081
|
-
|
|
1082
|
-
### Session lifecycle (8-state FSM)
|
|
1083
|
-
|
|
1084
|
-
```
|
|
1085
|
-
connecting → handshaking → ready ↔ active
|
|
1086
|
-
↘ suspended → (reconnect)
|
|
1087
|
-
↘ draining → closed
|
|
1088
|
-
↘ fenced (permanent)
|
|
1089
|
-
```
|
|
1090
|
-
|
|
1091
|
-
| State | Description |
|
|
1092
|
-
|-----------|----------|
|
|
1093
|
-
| `connecting` | The TCP/TLS connection is being established |
|
|
1094
|
-
| `handshaking` | Hello sent, waiting for HelloAck |
|
|
1095
|
-
| `ready` | HelloAck received, no commands in flight |
|
|
1096
|
-
| `active` | Commands are in flight |
|
|
1097
|
-
| `suspended` | Heartbeat missed 2+ times |
|
|
1098
|
-
| `draining` | Graceful shutdown initiated |
|
|
1099
|
-
| `fenced` | The server sent GOAWAY_FENCED; the session is closed permanently |
|
|
1100
|
-
| `closed` | Connection closed |
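
The state table can also be read as a transition map. The sketch below is one possible reading of the diagram, not the SDK's implementation: the state names come from the table, while `TRANSITIONS` and `canTransition` are hypothetical names, and the exact edges (in particular which states may move to `suspended`, `draining`, and `fenced`) are assumptions.

```ts
// Hypothetical sketch: state names are taken from the table above;
// the transition edges are an assumed reading of the diagram.
type SessionState =
  | "connecting"
  | "handshaking"
  | "ready"
  | "active"
  | "suspended"
  | "draining"
  | "fenced"
  | "closed";

const TRANSITIONS: Record<SessionState, readonly SessionState[]> = {
  connecting: ["handshaking", "closed"],
  handshaking: ["ready", "fenced", "closed"],
  ready: ["active", "suspended", "draining", "fenced"],
  active: ["ready", "suspended", "draining", "fenced"],
  suspended: ["connecting", "fenced", "closed"], // reconnect restarts the handshake
  draining: ["closed"],
  fenced: [], // permanent: the session never leaves this state
  closed: [],
};

// Returns true when moving from `from` to `to` is allowed by the map above.
function canTransition(from: SessionState, to: SessionState): boolean {
  return TRANSITIONS[from].includes(to);
}
```

Keeping the edges in a single map makes terminal states explicit: `fenced` and `closed` simply have no outgoing entries.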
|
|
1101
|
-
|
|
1102
|
-
### Quick start
|
|
1103
|
-
|
|
1104
|
-
```typescript
|
|
1105
|
-
import { V2SessionClient, validateV2Config } from 'service-bridge';
|
|
1106
|
-
|
|
1107
|
-
const cfg = {
|
|
1108
|
-
serverAddress: 'localhost:9090',
|
|
1109
|
-
instanceId: 'worker-1',
|
|
1110
|
-
zone: 'us-east-1a',
|
|
1111
|
-
transportMode: 'direct' as const,
|
|
1112
|
-
maxInflight: 64,
|
|
1113
|
-
};
|
|
1114
|
-
|
|
1115
|
-
validateV2Config(cfg);
|
|
1116
|
-
const session = new V2SessionClient(cfg);
|
|
1117
|
-
|
|
1118
|
-
// Send Hello on connect
|
|
1119
|
-
const hello = session.getHelloFields();
|
|
1120
|
-
|
|
1121
|
-
// Handle the server's HelloAck
|
|
1122
|
-
session.onHelloAck({
|
|
1123
|
-
sessionId: 'sess-abc',
|
|
1124
|
-
resumeToken: 'token-xyz',
|
|
1125
|
-
epoch: 1n,
|
|
1126
|
-
resumed: false,
|
|
1127
|
-
resumeFromSeq: 0n,
|
|
1128
|
-
replayedCommands: 0,
|
|
1129
|
-
reconciledResults: 0,
|
|
1130
|
-
heartbeatIntervalMs: 10_000,
|
|
1131
|
-
heartbeatTimeoutMs: 30_000,
|
|
1132
|
-
initialPermits: 64,
|
|
1133
|
-
maxPermits: 128,
|
|
1134
|
-
effectiveTransportMode: 'direct',
|
|
517
|
+
| `advertise` | `{ host, port } \| false` | `127.0.0.1` on a free port (with a warning) | Inbound RPC server address. Pass `{ host, port }` in containers / k8s; `false` for caller-only instances that never serve RPC. |
|
|
518
|
+
| `callDefaults` | `CallOpts` | `{}` | Default `CallOpts` merged under every `sb.rpc.call()` / `sb.stream()`. |
|
|
519
|
+
| `failOnPolicyViolation` | `boolean` | `false` | When `true`, any policy warning at registration makes `start()` surface a `disconnected` event and stop. Otherwise warnings are logged and emitted as `policy_violation`. |
|
|
520
|
+
| `telemetry` | `boolean` | `true` | Emit ops/logs/metrics to the runtime. `false` fully disables the telemetry transport. |
|
|
521
|
+
| `telemetryRingSize` | `number` | `262144` (256 KiB) | Byte budget for the in-memory ops ring buffer. |
|
|
522
|
+
| `dataDir` | `string` | `"./.servicebridge"` | Directory for the local SQLite event outbox. |
|
|
523
|
+
| `maxOutboxRows` | `number` | `100000` | Outbox rows before `publish` back-pressures with `OutboxFullError`. |
|
|
524
|
+
| `eventsDrainerBatch` | `number` | `50` | Outbox rows drained to the runtime per tick. |
|
|
525
|
+
| `eventsMaxInFlight` | `number` | `32` | Max concurrent inbound events processed by subscribers. |
|
|
526
|
+
| `payloadMaxBytes` | `number` | `65536` | Per-direction cap on captured payload bytes. |
|
|
527
|
+
| `reconnectIntervalMs` | `number` | `3000` | Delay between reconnect attempts. |
|
|
528
|
+
| `reconnectAttempts` | `number` | `3` | Reconnect attempts before giving up. `0` = unlimited. |
|
|
529
|
+
|
|
530
|
+
```ts
|
|
531
|
+
const sb = new ServiceBridge("localhost:14445", KEY, {
|
|
532
|
+
advertise: { host: process.env.POD_IP!, port: 50051 },
|
|
533
|
+
callDefaults: { timeout: "10s" },
|
|
534
|
+
reconnectAttempts: 0,
|
|
535
|
+
dataDir: "/var/lib/myservice/sb",
|
|
1135
536
|
});
|
|
1136
|
-
|
|
1137
|
-
console.log(session.state); // 'ready'
|
|
1138
|
-
|
|
1139
|
-
// Incoming command
|
|
1140
|
-
const accepted = session.onCommandReceived(1n, 'cmd-001');
|
|
1141
|
-
if (!accepted) {
|
|
1142
|
-
// backpressure — permits = 0
|
|
1143
|
-
}
|
|
1144
|
-
|
|
1145
|
-
// Command completed
|
|
1146
|
-
session.onCommandCompleted(1n, 'cmd-001');
|
|
1147
537
|
```
|
|
1148
538
|
|
|
1149
|
-
###
|
|
539
|
+
### Lifecycle
|
|
1150
540
|
|
|
1151
|
-
```
|
|
1152
|
-
|
|
1153
|
-
|
|
1154
|
-
const hb = new AdaptiveHeartbeatV2(10_000, 30_000);
|
|
1155
|
-
|
|
1156
|
-
// Pong received
|
|
1157
|
-
hb.onPong(25); // rttMs
|
|
1158
|
-
|
|
1159
|
-
// Next interval (adapts to the EWMA of the RTT)
|
|
1160
|
-
const nextMs = hb.nextIntervalMs();
|
|
1161
|
-
|
|
1162
|
-
// Missed heartbeat: speed up pings
|
|
1163
|
-
const missCount = hb.onMiss();
|
|
1164
|
-
if (missCount >= 2) {
|
|
1165
|
-
// reconnect
|
|
1166
|
-
}
|
|
1167
|
-
```
|
|
1168
|
-
|
|
1169
|
-
Algorithm: the base interval is `intervalMs / 3`; on misses it is divided by `2^miss` (min 2s); when the RTT is stable below 50ms it is doubled (max 30s).
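
The interval rule above can be sketched as a pure function. This is a minimal illustration, not the code of `AdaptiveHeartbeatV2`: the function name `nextHeartbeatIntervalMs` and its parameters are hypothetical, and treating "stable RTT" as "the smoothed RTT is below 50ms and no heartbeat was missed" is an assumption.

```ts
// Hypothetical helper; the constants mirror the limits quoted in the text.
const MIN_INTERVAL_MS = 2_000;
const MAX_INTERVAL_MS = 30_000;
const STABLE_RTT_MS = 50;

function nextHeartbeatIntervalMs(
  intervalMs: number, // negotiated heartbeat interval
  missCount: number, // consecutive missed pongs
  ewmaRttMs: number, // smoothed round-trip time
): number {
  const base = intervalMs / 3;
  if (missCount > 0) {
    // Each miss halves the interval again, down to the 2s floor.
    return Math.max(MIN_INTERVAL_MS, base / 2 ** missCount);
  }
  if (ewmaRttMs < STABLE_RTT_MS) {
    // Healthy link (assumed meaning of "stable"): ping half as often, up to 30s.
    return Math.min(MAX_INTERVAL_MS, base * 2);
  }
  return base;
}
```

With `intervalMs = 30_000` the base is 10s; one miss brings it down to 5s, and three misses hit the 2s floor.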
|
|
1170
|
-
|
|
1171
|
-
### Credit-based flow control
|
|
1172
|
-
|
|
1173
|
-
```typescript
|
|
1174
|
-
import { FlowControlStateV2 } from 'service-bridge';
|
|
541
|
+
```ts
|
|
542
|
+
const sb = new ServiceBridge("localhost:14445", KEY);
|
|
1175
543
|
|
|
1176
|
-
|
|
544
|
+
sb.service("payment-svc", { rpc: ["Charge"] }); // what you call
|
|
545
|
+
sb.rpc.handle("Ship", shipHandler, { schema: { protoFile: "./ship.proto" } }); // what you serve
|
|
1177
546
|
|
|
1178
|
-
|
|
1179
|
-
|
|
1180
|
-
}
|
|
547
|
+
sb.on("connected", ({ serviceName }) => console.log(`connected as ${serviceName}`));
|
|
548
|
+
sb.on("reconnecting", ({ attempt, reason }) => console.warn(`reconnecting #${attempt}: ${reason}`));
|
|
549
|
+
sb.on("disconnected", ({ reason }) => console.error(`disconnected: ${reason}`));
|
|
550
|
+
sb.on("policy_violation", (v) => console.warn(`policy: ${v.declaration} ${v.value} — ${v.reason}`));
|
|
1181
551
|
|
|
1182
|
-
|
|
1183
|
-
fc.release(1);
|
|
552
|
+
await sb.start();
|
|
1184
553
|
|
|
1185
|
-
|
|
1186
|
-
fc.setWindow(32);
|
|
554
|
+
process.on("SIGTERM", async () => { await sb.stop(); process.exit(0); });
|
|
1187
555
|
```
|
|
1188
556
|
|
|
1189
|
-
|
|
1190
|
-
|
|
1191
|
-
`BackoffV2` implements exponential backoff with full jitter (base=100ms, max=30s). On reconnect, `getHelloFields()` automatically includes `resumeToken`, `epoch`, `lastReceivedSeq`, `lastSentSeq`, and `completedCommandIds`, so the server resumes the session from the right position.
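
For readers unfamiliar with full jitter, here is a minimal sketch of the technique: each delay is drawn uniformly from zero up to an exponentially growing, capped ceiling. It is not the source of `BackoffV2`; `createBackoff` and its parameters are hypothetical, and the threshold of 10 consecutive failures is taken from the `isCircuitOpen()` comment in the README example.

```ts
// Hypothetical sketch of exponential backoff with full jitter.
function createBackoff(
  baseMs = 100,
  maxMs = 30_000,
  circuitThreshold = 10, // assumed from the "10+ consecutive failures" comment
  random: () => number = Math.random, // injectable for deterministic tests
) {
  let failures = 0;
  return {
    // Records one failure and returns the delay before the next attempt.
    nextDelayMs(): number {
      const exponent = Math.min(failures, 30); // keep 2 ** exponent finite and small
      const ceiling = Math.min(maxMs, baseMs * 2 ** exponent);
      failures += 1;
      return Math.floor(random() * ceiling); // full jitter: uniform in [0, ceiling)
    },
    // True once the failure streak reaches the threshold.
    isCircuitOpen(): boolean {
      return failures >= circuitThreshold;
    },
    // Call after a successful connect to clear the streak.
    reset(): void {
      failures = 0;
    },
  };
}
```

Drawing the whole delay at random, rather than adding a small random offset, spreads reconnecting clients across the full window and avoids synchronized retry bursts.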
|
|
1192
|
-
|
|
1193
|
-
```typescript
|
|
1194
|
-
import { BackoffV2 } from 'service-bridge';
|
|
557
|
+
---
|
|
1195
558
|
|
|
1196
|
-
|
|
559
|
+
## Error handling
|
|
1197
560
|
|
|
1198
|
-
|
|
1199
|
-
if (backoff.isCircuitOpen()) break; // 10+ consecutive failures
|
|
561
|
+
Typed errors are exported from the package root, so you can `catch` precisely:
|
|
1200
562
|
|
|
1201
|
-
|
|
1202
|
-
|
|
563
|
+
```ts
|
|
564
|
+
import {
|
|
565
|
+
RpcAccessDeniedError,
|
|
566
|
+
WorkflowAccessDeniedError,
|
|
567
|
+
InvalidEventNameError,
|
|
568
|
+
OutboxFullError,
|
|
569
|
+
ServiceBridgeError,
|
|
570
|
+
} from "service-bridge";
|
|
1203
571
|
|
|
1204
|
-
|
|
1205
|
-
|
|
1206
|
-
|
|
1207
|
-
|
|
1208
|
-
|
|
572
|
+
try {
|
|
573
|
+
await payment.Charge({ userId: "u-1", amount: 100 });
|
|
574
|
+
} catch (err) {
|
|
575
|
+
if (err instanceof RpcAccessDeniedError) {
|
|
576
|
+
// denied by access policy: { serviceName, methodName, reason }
|
|
577
|
+
} else if (err instanceof ServiceBridgeError) {
|
|
578
|
+
// connection / provisioning failure with a typed .code
|
|
1209
579
|
}
|
|
1210
580
|
}
|
|
1211
581
|
```
|
|
1212
582
|
|
|
1213
|
-
|
|
1214
|
-
|
|
1215
|
-
|
|
1216
|
-
|
|
1217
|
-
|
|
1218
|
-
|
|
1219
|
-
|
|
1220
|
-
|
|
1221
|
-
|
|
1222
|
-
},
|
|
1223
|
-
functionOverrides: {
|
|
1224
|
-
'payment.charge': { mode: 'proxy', timeoutMs: 5000 },
|
|
1225
|
-
},
|
|
1226
|
-
});
|
|
1227
|
-
|
|
1228
|
-
// Resolve the transport mode for a function
|
|
1229
|
-
const mode = session.resolveTransportMode('payment.charge'); // 'proxy'
|
|
1230
|
-
```
|
|
1231
|
-
|
|
1232
|
-
### All session events
|
|
1233
|
-
|
|
1234
|
-
| Method | Description |
|
|
1235
|
-
|-------|----------|
|
|
1236
|
-
| `getHelloFields()` | Fields to send in Hello (initial + resume) |
|
|
1237
|
-
| `onHelloAck(ack)` | Handle the server's HelloAck |
|
|
1238
|
-
| `onCommandReceived(seq, id)` | Incoming command; returns `false` under backpressure |
|
|
1239
|
-
| `onCommandCompleted(seq, id)` | Command completed; releases a permit |
|
|
1240
|
-
| `onPermitGrant(n)` | The server granted `n` permits |
|
|
1241
|
-
| `onFlowControlUpdate(size, reason)` | The server changed the window size |
|
|
1242
|
-
| `onPong(rttMs)` | Pong received; updates the EWMA |
|
|
1243
|
-
| `onHeartbeatMiss()` | Pong timed out; returns `true` when the session moves to `suspended` |
|
|
1244
|
-
| `onDrain(reason, deadlineMs)` | Initiate a graceful drain |
|
|
1245
|
-
| `onGoaway(code, reason)` | GoawaySignal from the server |
|
|
1246
|
-
| `onConfigPush(config)` | Apply a new transport configuration |
|
|
1247
|
-
| `resolveTransportMode(fnName)` | Get the transport mode for a function |
|
|
1248
|
-
| `stop()` | Close the session immediately |
|
|
1249
|
-
|
|
1250
|
-
### Exported classes and types
|
|
1251
|
-
|
|
1252
|
-
| Symbol | Type | Description |
|
|
1253
|
-
|--------|-----|----------|
|
|
1254
|
-
| `V2SessionClient` | class | Main session client |
|
|
1255
|
-
| `AdaptiveHeartbeatV2` | class | EWMA RTT heartbeat controller |
|
|
1256
|
-
| `FlowControlStateV2` | class | Credit-based flow control |
|
|
1257
|
-
| `BackoffV2` | class | Exponential backoff + circuit |
|
|
1258
|
-
| `PositionTrackerV2` | class | Tracker for seq / completed IDs |
|
|
1259
|
-
| `ConfigPushStateV2` | class | Dynamic configuration manager |
|
|
1260
|
-
| `validateV2Config` | function | Validates the config; throws `Error` |
|
|
1261
|
-
| `V2Config` | interface | Session configuration |
|
|
1262
|
-
| `SessionStateV2` | type | Union of the 8 FSM states |
|
|
1263
|
-
| `TransportMode` | type | `'direct' \| 'proxy'` |
|
|
1264
|
-
| `HelloAckV2` | interface | HelloAck data from the server |
|
|
1265
|
-
| `TransportConfigV2` | interface | ConfigPush payload |
|
|
1266
|
-
| `ReconcileRequestV2` | interface | Declarative worker registration request |
|
|
1267
|
-
| `FunctionDeclarationV2` | interface | Function declaration for Reconcile |
|
|
1268
|
-
| `ConsumerGroupDeclarationV2` | interface | Consumer group declaration |
|
|
1269
|
-
| `HttpRouteDeclarationV2` | interface | HTTP route declaration |
|
|
1270
|
-
| `JobDeclarationV2` | interface | Job declaration |
|
|
1271
|
-
| `WorkflowDeclarationV2` | interface | Workflow declaration |
|
|
1272
|
-
| `SubscribeRequestV2` | interface | Registry subscribe request |
|
|
1273
|
-
| `WorkerEndpointV2` | interface | Worker endpoint info |
|
|
1274
|
-
| `IssueCertificateRequestV2` | interface | Certificate request |
|
|
1275
|
-
| `IssueCertificateResponseV2` | interface | Certificate response |
|
|
1276
|
-
| `CircuitBreakerConfigV2` | interface | Circuit breaker config |
|
|
1277
|
-
| `ZoneConfigV2` | interface | Zone-aware config |
|
|
1278
|
-
| `ServiceTransportOverride` | interface | Per-service transport override |
|
|
1279
|
-
| `FunctionTransportOverride` | interface | Per-function transport override |
|
|
1280
|
-
| `ResumeState` | interface | Reconnect resume state |
|
|
1281
|
-
|
|
1282
|
-
From the main entry `service-bridge`, types such as `ServiceBridgeOpts`, `RpcOpts`, `EventOpts`, `HandleRpcOpts`, `HandleEventOpts`, `ScheduleOpts`, `StartOpts`, `ExecuteWorkflowOpts`, and `ExecuteWorkflowResult` are available. The DAG shapes **`WorkflowStep` and `WorkflowOpts` are documented above but are not named exports** from that entry — use inline object literals (inference from `workflows.run(...)`) unless your toolchain exposes deep paths. Example:
|
|
1283
|
-
|
|
1284
|
-
```ts
|
|
1285
|
-
import type {
|
|
1286
|
-
RpcContext,
|
|
1287
|
-
EventContext,
|
|
1288
|
-
StreamWriter,
|
|
1289
|
-
TraceCtx,
|
|
1290
|
-
RetryPolicy,
|
|
1291
|
-
ServiceBridgeErrorSeverity,
|
|
1292
|
-
} from "service-bridge";
|
|
1293
|
-
```
|
|
583
|
+
| Error | Thrown when |
|
|
584
|
+
|---|---|
|
|
585
|
+
| `RpcAccessDeniedError` | An RPC call is denied by access policy. Also fires a `policy_violation` event. |
|
|
586
|
+
| `WorkflowAccessDeniedError` | A workflow `start()` is denied by access policy. |
|
|
587
|
+
| `WorkflowNotFoundError` | Starting a workflow name the runtime doesn't know. |
|
|
588
|
+
| `WorkflowTerminalError` | Signalling/cancelling a run that already finished. |
|
|
589
|
+
| `InvalidEventNameError` | Publishing/defining an event whose name fails the naming rule. |
|
|
590
|
+
| `OutboxFullError` | The local event outbox is at `maxOutboxRows` (back-pressure). |
|
|
591
|
+
| `ServiceBridgeError` | Connection / provisioning failures; carries a typed `.code` (retryable ones drive auto-reconnect). |
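
Of the errors in this table, `OutboxFullError` signals back-pressure rather than a hard failure, so a caller may prefer to wait and retry instead of dropping the event. The sketch below shows one way to do that under stated assumptions: the local `OutboxFullError` class is only a stand-in for the one exported by the package so the example runs on its own, and `retryOnOutboxFull`, `attempts`, and `delayMs` are hypothetical names, not part of the SDK.

```ts
// Stand-in for the error exported by service-bridge (assumption for a self-contained sketch).
class OutboxFullError extends Error {
  constructor(message?: string) {
    super(message);
    // Keeps `instanceof` working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `publish` only while it fails with OutboxFullError.
async function retryOnOutboxFull<T>(
  publish: () => Promise<T>, // caller-supplied publish call
  attempts = 5,
  delayMs = 200,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await publish();
    } catch (err) {
      // Any other error, or the last failed attempt, propagates to the caller.
      if (!(err instanceof OutboxFullError) || attempt >= attempts) throw err;
      await sleep(delayMs * attempt); // wait a little longer each time while the outbox drains
    }
  }
}
```

In real code the stand-in class would be replaced by the import from `service-bridge`, and `publish` would wrap whatever publish call the service already makes.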
|
|
1294
592
|
|
|
1295
593
|
---
|
|
1296
594
|
|
|
1297
595
|
## FAQ
|
|
1298
596
|
|
|
1299
|
-
**
|
|
1300
|
-
RPC calls have configurable retries with exponential backoff and hard per-attempt timeouts, so a silent downstream service cannot keep a call pending forever. Events are durable (PostgreSQL-backed) with at-least-once delivery per consumer group. Failed deliveries are retried according to policy, then moved to DLQ. Workflows track step state and can be resumed.
|
|
597
|
+
**Do I have to use Protobuf?** You point handlers at a `.proto` file or a `.schema.json` with explicit field numbers. Both are file-based; there is no inline schema.
|
|
1301
598
|
|
|
1302
|
-
**
|
|
1303
|
-
ServiceBridge is self-hosted. The runtime is a single Go binary + PostgreSQL. SDK calls map to standard patterns (RPC, pub/sub, cron) — migrating away means replacing SDK calls with equivalent library calls.
|
|
599
|
+
**Does ServiceBridge proxy my HTTP traffic?** No. You run your own Express / Fastify / Hono server. The integration only discovers your routes for the Service Map and adds trace spans — your HTTP path is untouched.
|
|
1304
600
|
|
|
1305
|
-
**How
|
|
1306
|
-
The SDK automatically reports trace spans for every RPC call, event publish/delivery, workflow step, and HTTP request. The runtime stores traces in PostgreSQL and serves them via the built-in dashboard and a Loki-compatible API for Grafana integration.
|
|
601
|
+
**How do I scale horizontally?** Run as many SDK instances as you like; the runtime load-balances RPC across live instances and fails over automatically. The runtime itself is a single source of truth backed by PostgreSQL.
|
|
1307
602
|
|
|
1308
|
-
**
|
|
1309
|
-
Yes. You can adopt incrementally — start with RPC between two services, add events later, then workflows. ServiceBridge doesn't require replacing your existing broker or mesh all at once.
|
|
603
|
+
**What happens on a transient disconnect?** Published events sit in the local SQLite outbox and drain when the connection returns. The SDK auto-reconnects (configurable) and rotates certs with overlap so live instances don't drop traffic.
|
|
1310
604
|
|
|
1311
|
-
**
|
|
1312
|
-
In-flight direct RPC calls continue working (they go service-to-service, not through the control plane). New discovery lookups, event publishes, and telemetry writes are queued in the SDK offline queue and flushed when the control plane recovers.
|
|
605
|
+
**Where do I see traces, metrics and the DLQ?** In the runtime dashboard on `:14444`. Tracing, metrics and the dead-letter queue are operated there.
|
|
1313
606
|
|
|
1314
|
-
**
|
|
1315
|
-
PostgreSQL 16+. The runtime uses PostgreSQL for all persistence: traces, events, workflows, jobs, service registry, and configuration.
|
|
607
|
+
**Node or Bun?** Both. Node 18+ or any current Bun. Bun-native APIs are used where available.
|
|
1316
608
|
|
|
1317
609
|
---
|
|
1318
610
|
|
|
1319
|
-
## Community
|
|
611
|
+
## Community
|
|
1320
612
|
|
|
1321
|
-
- Website
|
|
1322
|
-
-
|
|
1323
|
-
-
|
|
613
|
+
- **Website & docs:** [servicebridge.dev](https://servicebridge.dev) · [servicebridge.dev/docs](https://servicebridge.dev/docs)
|
|
614
|
+
- **SDK umbrella repo (all languages):** [github.com/service-bridge/sdk](https://github.com/service-bridge/sdk)
|
|
615
|
+
- **Runtime:** [github.com/servicebridge2/runtime](https://github.com/servicebridge2/runtime)
|
|
1324
616
|
|
|
1325
|
-
|
|
1326
|
-
|
|
1327
|
-
## License
|
|
1328
|
-
|
|
1329
|
-
Free for non-commercial use. Commercial use requires a separate license. See [LICENSE](../LICENSE).
|
|
1330
|
-
|
|
1331
|
-
Copyright (c) 2026 Eugene Surkov.
|
|
617
|
+
This is an alpha release (`2.0.0-alpha`). The API is stabilising — issues and feedback are welcome.
|
|
1332
618
|
|
|
1333
619
|
---
|
|
1334
620
|
|
|
1335
|
-
##
|
|
621
|
+
## License
|
|
1336
622
|
|
|
1337
|
-
|
|
623
|
+
Licensed under the **MIT License** — see [LICENSE](./LICENSE). Free for any use, including commercial; you only need to keep the copyright and license notice (attribution to esurkov1 <esurkovv@yandex.ru>).
|