service-bridge 2.0.0-alpha.1 → 2.0.0-alpha.2

package/README.md CHANGED
@@ -89,13 +89,19 @@ The third constructor argument is an [options](#configuration) object. The SDK r
89
89
 
90
90
  ### Using an AI coding agent?
91
91
 
92
- Drop in the official **`servicebridge-node`** skill and your agent (Claude Code, etc.) writes correct ServiceBridge code on the first try — RPC, events, workflows, jobs and HTTP integration, grounded in this exact SDK:
92
+ This package ships an official skill so your agent (Claude Code, etc.) writes correct ServiceBridge code on the first try — RPC, events, workflows, jobs and HTTP integration, grounded in this exact SDK. It comes with the install; copy it into your agent's skills directory:
93
93
 
94
94
  ```sh
95
- npx degit service-bridge/sdk/skills/servicebridge-node .claude/skills/servicebridge-node
95
+ cp -r node_modules/service-bridge/skill .claude/skills/servicebridge-node
96
96
  ```
97
97
 
98
- Source and details: [`skills/servicebridge-node/`](https://github.com/service-bridge/sdk/tree/main/skills/servicebridge-node).
98
+ Not installed yet? Pull it straight from the repo:
99
+
100
+ ```sh
101
+ npx degit service-bridge/sdk/node/skill .claude/skills/servicebridge-node
102
+ ```
103
+
104
+ Source: [`node/skill/`](https://github.com/service-bridge/sdk/tree/main/node/skill).
99
105
 
100
106
  ---
101
107
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "service-bridge",
3
- "version": "2.0.0-alpha.1",
3
+ "version": "2.0.0-alpha.2",
4
4
  "description": "TypeScript SDK for ServiceBridge — RPC, durable events, workflows, jobs and observability over a single self-hosted gRPC runtime with mTLS.",
5
5
  "license": "MIT",
6
6
  "type": "module",
@@ -33,6 +33,7 @@
33
33
  },
34
34
  "files": [
35
35
  "dist",
36
+ "skill",
36
37
  "README.md",
37
38
  "LICENSE"
38
39
  ],
package/skill/SKILL.md ADDED
@@ -0,0 +1,114 @@
1
+ ---
2
+ name: servicebridge-node
3
+ description: Build backend services with the ServiceBridge Node SDK (service-bridge on npm) — RPC, durable events, workflows, scheduled jobs, and Express/Fastify/Hono integration against a self-hosted ServiceBridge runtime. Use when writing TypeScript/JavaScript that calls or handles RPC, publishes or consumes events, defines workflows or jobs, or wires an HTTP framework into the service mesh.
4
+ ---
5
+
6
+ # ServiceBridge Node SDK
7
+
8
+ ServiceBridge is a self-hosted runtime ("one container + PostgreSQL") that replaces a service mesh, a message broker, and a workflow engine. Your service declares its handlers and dependencies, then `start()`s — the runtime owns transport, delivery, orchestration, policy, and observability. No sidecars.
9
+
10
+ This SDK is the **backend** client for that runtime. The npm package is **`service-bridge`** (with a hyphen). Everything in this skill is the real, current API — do not invent methods or options.
11
+
12
+ ## The one mental model that prevents most mistakes
13
+
14
+ A `ServiceBridge` instance has two phases, declare and act, separated by `start()`. Get this wrong and nothing connects.
15
+
16
+ 1. **Before `start()` — declare.** Register incoming handlers (`rpc.handle`, `event.handle`, `workflow.handle`, `job.handle`), declare outgoing dependencies (`service()`, `client()`, `useSchema()`), attach HTTP frameworks. These ride along in the first registration to the runtime.
17
+ 2. **`await sb.start()`** — connect, authenticate (mTLS from the bootstrap key), register everything atomically.
18
+ 3. **After `start()` — act.** Make outgoing calls: `rpc.call`, `event.publish`, `workflow.start`. These need a live connection.
19
+
20
+ ```ts
21
+ import { ServiceBridge } from "service-bridge";
22
+
23
+ // 1. construct — url + bootstrap key, nothing else is read from env
24
+ const sb = new ServiceBridge("localhost:14445", process.env.PAYMENT_KEY!);
25
+
26
+ // 2. declare (before start)
27
+ sb.rpc.handle(
28
+ "Charge",
29
+ async (req: { userId: string; amount: number }) => ({ ok: req.amount > 0 }),
30
+ { schema: { protoFile: "./payment.proto", input: "ChargeRequest", output: "ChargeReply" } },
31
+ );
32
+
33
+ // 3. connect
34
+ await sb.start();
35
+
36
+ // 4. act (after start)
37
+ // const res = await sb.rpc.call("other-svc", "Method", { ... });
38
+
39
+ // teardown
40
+ // await sb.stop();
41
+ ```
42
+
43
+ ## Golden rules
44
+
45
+ - **Package name is `service-bridge`.** `npm i service-bridge` / `import { ServiceBridge } from "service-bridge"`. Never `servicebridge`.
46
+ - **The SDK reads NO env vars for `url`/`key`.** You pass them explicitly (e.g. `process.env.PAYMENT_KEY!`). It only reads `SB_HTTP_ADVERTISE_HOST`/`SB_ADVERTISE_HOST` as a fallback for HTTP-plugin advertise host.
47
+ - **Get the bootstrap key from the dashboard.** Open the runtime dashboard (`http://localhost:14444`) → **Services → Create service** → copy the `sb.…` string. That opaque value is the second constructor arg. (From runtime source you can also mint one with `go run ./cmd/sbkey-gen -name <svc> -dsn <postgres-dsn>`.)
48
+ - **Every RPC handler needs a `schema`.** `rpc.handle(name, fn, { schema })` — schema is required. Without it, registration fails.
49
+ - **`event.handle` matches the EXACT event name**, not a wildcard. Wildcard routing is configured server-side, not in the handler string.
50
+ - **Event and job handlers must be idempotent.** Delivery is at-least-once. For jobs, dedup on `ctx.idempotencyKey`, never on `ctx.attempt`.
51
+ - **Declare before `start()`, call after `start()`.** Outgoing `rpc.call`/`event.publish`/`workflow.start` before `start()` throw "not ready".
52
+ - **Teardown is `await sb.stop()`.** There is no `close()`.
53
+ - **Set `advertise` in production.** If a service handles RPC, pass `{ advertise: { host, port } }` (e.g. the pod IP). Omitting it falls back to `127.0.0.1` (local-only) with a warning. A pure caller can pass `advertise: false`.
54
+
55
+ ## Install & connect
56
+
57
+ ```sh
58
+ npm i service-bridge # or: bun add service-bridge
59
+ ```
60
+
61
+ The runtime must be running. One-line install of the runtime:
62
+
63
+ ```sh
64
+ bash <(curl -fsSL https://servicebridge.dev/install.sh)
65
+ ```
66
+
67
+ Dashboard at `http://localhost:14444` (create your admin account on first open, then create services to get keys); SDK connects to the gRPC control plane at `localhost:14445`.
68
+
69
+ ## What each domain is for
70
+
71
+ | You want to… | Use | Reference |
72
+ |---|---|---|
73
+ | Request/response between services (incl. streaming) | `sb.rpc` | [reference/rpc.md](reference/rpc.md) |
74
+ | Fire-and-forget / pub-sub with durable at-least-once delivery | `sb.event` | [reference/events.md](reference/events.md) |
75
+ | Multi-step orchestration (DAG, compensation, signals, replay) | `sb.workflow` | [reference/workflows.md](reference/workflows.md) |
76
+ | Cron / delayed / interval scheduled work | `sb.job` | [reference/jobs.md](reference/jobs.md) |
77
+ | Expose your Express/Fastify/Hono app to Service Map + discovery | `service-bridge/{express,fastify,hono}` | [reference/http-integrations.md](reference/http-integrations.md) |
78
+ | Constructor options, error types, capacity tuning | `ServiceBridgeOptions` | [reference/configuration.md](reference/configuration.md) |
79
+
80
+ Read the matching reference file before writing code for a domain — each has exact signatures, defaults, and a runnable recipe. When a task spans domains, the lifecycle rule above still holds: one `ServiceBridge` per process, declare everything, then `start()`.
81
+
82
+ ## Minimal two-service RPC (the canonical smoke test)
83
+
84
+ ```ts
85
+ // provider.ts
86
+ import { ServiceBridge } from "service-bridge";
87
+ const sb = new ServiceBridge("localhost:14445", process.env.PROVIDER_KEY!);
88
+ sb.rpc.handle(
89
+ "Charge",
90
+ async (req: { userId: string; amount: number }) => ({ transactionId: `tx-${req.userId}`, ok: req.amount > 0 }),
91
+ { schema: { protoFile: "./payment.proto", input: "ChargeRequest", output: "ChargeReply" } },
92
+ );
93
+ await sb.start();
94
+ ```
95
+
96
+ ```ts
97
+ // caller.ts
98
+ import { ServiceBridge } from "service-bridge";
99
+ const sb = new ServiceBridge("localhost:14445", process.env.CALLER_KEY!);
100
+ const payment = await sb.client("payment-svc", "./payment.proto"); // payment-svc = provider's service name
101
+ await sb.start();
102
+ const res = await payment.Charge({ userId: "u-1", amount: 100 }); // { transactionId: "tx-u-1", ok: true }
103
+ ```
104
+
105
+ ```proto
106
+ // payment.proto — the shared contract
107
+ syntax = "proto3";
108
+ package demo;
109
+ message ChargeRequest { string user_id = 1; double amount = 2; }
110
+ message ChargeReply { string transaction_id = 1; bool ok = 2; }
111
+ service Payment { rpc Charge(ChargeRequest) returns (ChargeReply); }
112
+ ```
113
+
114
+ The caller targets the provider by its **service name** (the name used when the service was created in the dashboard), not by host/port — the runtime resolves routing.
package/skill/reference/configuration.md ADDED
@@ -0,0 +1,93 @@
1
+ # Configuration, lifecycle & errors
2
+
3
+ ## Constructor
4
+
5
+ ```ts
6
+ new ServiceBridge(url: string, key: string, options?: ServiceBridgeOptions)
7
+ ```
8
+
9
+ - `url` — runtime gRPC control plane, e.g. `"localhost:14445"`.
10
+ - `key` — the `sb.…` bootstrap key (dashboard → Services → Create service, or `sbkey-gen`).
11
+ - The SDK reads **no env vars** for `url`/`key`. Pass them yourself: `new ServiceBridge(process.env.SERVICEBRIDGE_URL!, process.env.MY_SERVICE_KEY!)`.
12
+
13
+ ## ServiceBridgeOptions
14
+
15
+ ```ts
16
+ interface ServiceBridgeOptions {
17
+ advertise?: { host: string; port: number } | false; // see below
18
+ callDefaults?: CallOpts; // base opts for every rpc.call; default {}
19
+ failOnPolicyViolation?: boolean; // default false (warn only)
20
+ telemetry?: boolean; // default true
21
+ telemetryRingSize?: number; // default 262144 (256 KiB)
22
+ dataDir?: string; // event outbox dir; default "./.servicebridge"
23
+ maxOutboxRows?: number; // default 100000 (publish backpressures at cap)
24
+ eventsDrainerBatch?: number; // default 50
25
+ eventsMaxInFlight?: number; // default 32
26
+ payloadMaxBytes?: number; // per-direction capture cap; default 65536
27
+ reconnectIntervalMs?: number; // default 3000
28
+ reconnectAttempts?: number; // default 3 (0 = unlimited)
29
+ }
30
+ ```
31
+
32
+ ### advertise
33
+
34
+ Controls the inbound Call RPC server (needed if this service **handles** RPC or workflows):
35
+
36
+ - `{ host, port }` — explicit; use in production (e.g. `{ host: process.env.POD_IP!, port: 7777 }`, `port: 0` lets the OS pick).
37
+ - omitted — binds `127.0.0.1` on a free port and logs a warning (local dev only; not reachable cross-host).
38
+ - `false` — caller-only; bind no inbound server. Use when the instance only makes outgoing calls.
39
+
40
+ ## Lifecycle
41
+
42
+ ```ts
43
+ await sb.start(); // connect, authenticate, register everything declared so far
44
+ await sb.stop(); // graceful shutdown (there is no close())
45
+ ```
46
+
47
+ Declare handlers/dependencies/clients/HTTP plugins **before** `start()`. Make outgoing calls (`rpc.call`, `event.publish`, `workflow.start`) **after** `start()`.
48
+
49
+ ## Declaring dependencies
50
+
51
+ ```ts
52
+ sb.service(serviceName: string, deps: { rpc?: string[]; workflows?: string[]; http?: string[] }): void
53
+ ```
54
+
55
+ Declares what this service calls, so the runtime can wire the graph and enforce policy. `client()` does this for you for RPC; use `service()` for explicit/low-level `rpc.call`.
56
+
57
+ ## Introspection & events
58
+
59
+ ```ts
60
+ sb.identity(): { sessionId; serviceId; serviceName; instanceId } | null // null until connected
61
+ sb.serviceMap(): ReadonlyMap<string, ServiceMapEntry> // live discovery snapshot
62
+ sb.on("connected" | "reconnecting" | "disconnected" | "policy_violation", handler): this
63
+ ```
64
+
65
+ ```ts
66
+ sb.on("disconnected", (e) => {
67
+ console.error("disconnected:", e.reason, e.error?.code); // e.error is a ServiceBridgeError (gRPC code)
68
+ });
69
+ ```
70
+
71
+ ## Error types
72
+
73
+ ```ts
74
+ import {
75
+ ServiceBridgeError, // base; .code = gRPC status (16 UNAUTHENTICATED, 7 PERMISSION_DENIED, ...)
76
+ RpcAccessDeniedError, // rpc.call denied by policy (serviceName, methodName, reason)
77
+ WorkflowAccessDeniedError, // workflow.start denied by policy
78
+ WorkflowNotFoundError, // workflow.start on unknown name
79
+ WorkflowTerminalError, // signal/cancel on a terminal run
80
+ InvalidEventNameError, // bad event name
81
+ OutboxFullError, // event outbox at cap
82
+ } from "service-bridge";
83
+ ```
84
+
85
+ Connection/auth problems arrive via the `disconnected` event carrying a `ServiceBridgeError`. Codes `UNAUTHENTICATED (16)`, `PERMISSION_DENIED (7)`, `NOT_FOUND (5)`, `INVALID_ARGUMENT (3)` are fatal (no reconnect); others trigger reconnect per `reconnectAttempts`/`reconnectIntervalMs`.
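The fatal-versus-retryable split described above can be sketched as a small classifier. This is an illustration only: `isFatalDisconnect` is a hypothetical helper, not an SDK export, and treating a missing code as non-fatal is an assumption. The four numeric codes are the ones the text lists.

```typescript
// gRPC status codes the text above lists as fatal (no reconnect).
const FATAL_GRPC_CODES: ReadonlySet<number> = new Set<number>([
  3, // INVALID_ARGUMENT
  5, // NOT_FOUND
  7, // PERMISSION_DENIED
  16, // UNAUTHENTICATED
]);

// Hypothetical helper: returns true when a disconnect should be treated as
// permanent. A missing code is treated as non-fatal (an assumption).
function isFatalDisconnect(code: number | undefined): boolean {
  return code !== undefined && FATAL_GRPC_CODES.has(code);
}

// 16 is UNAUTHENTICATED (fatal); 14 is UNAVAILABLE (not in the fatal list).
console.log(isFatalDisconnect(16), isFatalDisconnect(14));
```

A `disconnected` handler could call `isFatalDisconnect(e.error?.code)` to decide whether to exit the process or wait for the SDK's reconnect loop.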
86
+
87
+ ## Production checklist
88
+
89
+ - Set `advertise: { host: <reachable host>, port }` on any service that handles RPC/workflows.
90
+ - Pass `url`/`key` from your own config/secrets; key from the dashboard.
91
+ - Consider `reconnectAttempts: 0` (unlimited) for long-lived services.
92
+ - Make event and job handlers idempotent.
93
+ - Tune `maxOutboxRows`/`eventsMaxInFlight` for high event throughput; `payloadMaxBytes` for capture size.
package/skill/reference/events.md ADDED
@@ -0,0 +1,102 @@
1
+ # Events — durable pub/sub
2
+
3
+ At-least-once delivery through a local SQLite outbox → runtime → subscribers. Fan-out to every matching subscriber, retries, and a dead-letter queue (operated from the dashboard, not the SDK).
4
+
5
+ ## Declare an event
6
+
7
+ ```ts
8
+ sb.event.define(name: string, spec?: SchemaSpec): void
9
+ ```
10
+
11
+ - Call before `await sb.start()`, on **both** publisher and subscriber (each indexes schemas locally to encode/decode; there's no global decoder registry).
12
+ - `name` must match `^[a-z0-9_-]+(\.[a-z0-9_-]+)*$` (dotted segments, lowercase). A bad name throws `InvalidEventNameError`.
13
+ - **Schema:** the proto needs a `service` block, and you reference the message by `method` — the SDK resolves the input message from the rpc of that name. (Event names contain dots, so they can't be rpc names directly; pick a valid rpc identifier for `method`.) Alternatively pass explicit `input` **and** `output`. Passing `input` alone does **not** resolve.
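The naming rule in the list above can be checked locally before calling `define()`. A minimal sketch, assuming only the regular expression quoted in the text; `isValidEventName` is a hypothetical helper rather than an SDK function, and the SDK's own validation may differ in detail.

```typescript
// Pattern copied from the rule above: lowercase dotted segments.
const EVENT_NAME_PATTERN = /^[a-z0-9_-]+(\.[a-z0-9_-]+)*$/;

// Hypothetical helper: true when `name` satisfies the pattern.
function isValidEventName(name: string): boolean {
  return EVENT_NAME_PATTERN.test(name);
}

// Print the verdict for a few candidate names.
const candidates: string[] = [
  "billing.charged",
  "orders_v2.item-added",
  "Billing.Charged", // uppercase letters
  "billing..charged", // empty segment
  "billing.*", // wildcard character
];
for (const name of candidates) {
  console.log(name, isValidEventName(name));
}
```

Running the check in a unit test catches a bad name before it surfaces at runtime as `InvalidEventNameError`.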
14
+
15
+ ```proto
16
+ // events.proto
17
+ syntax = "proto3";
18
+ package billing;
19
+ message ChargedEvent { string charge_id = 1; double amount = 2; }
20
+ service BillingEvents {
21
+ // method for event.define; output is irrelevant for events but the block requires one.
22
+ rpc billing_charged (ChargedEvent) returns (ChargedEvent);
23
+ }
24
+ ```
25
+
26
+ ## Handle (subscribe)
27
+
28
+ ```ts
29
+ sb.event.handle(name: string, fn: (payload: unknown) => Promise<void> | void): void
30
+ ```
31
+
32
+ - **Exact event name** — not a wildcard. (Wildcard subscription routing is a server-side concern, not the handler string.)
33
+ - Register before `start()`. A single instance may register several handlers; each matching event fires all of them.
34
+ - **Handlers must be idempotent.** Delivery is at-least-once. Throwing causes a Nack → retry → DLQ after the runtime's max attempts. Returning normally Acks.
35
+
36
+ ```ts
37
+ sb.event.define("billing.charged", { protoFile: "./events.proto", method: "billing_charged" });
38
+ sb.event.handle("billing.charged", async (payload) => {
39
+ const e = payload as { charge_id: string; amount: number };
40
+ await applyOnce(e.charge_id, e.amount); // idempotent by charge_id
41
+ });
42
+ await sb.start();
43
+ ```
44
+
45
+ ## Publish
46
+
47
+ ```ts
48
+ await sb.event.publish<T>(
49
+ name: string,
50
+ payload: T,
51
+ opts?: PublishOpts,
52
+ ): Promise<{ eventId: string }>
53
+ ```
54
+
55
+ - Call **after** `await sb.start()`. The event must be `define()`d first.
56
+ - The payload is validated against the schema before it enters the outbox; encoding errors throw synchronously.
57
+ - Returns `{ eventId }` (a monotonic UUID).
58
+
59
+ ```ts
60
+ sb.event.define("billing.charged", { protoFile: "./events.proto", method: "billing_charged" });
61
+ await sb.start();
62
+ const { eventId } = await sb.event.publish(
63
+ "billing.charged",
64
+ { charge_id: "ch-123", amount: 100 },
65
+ { idempotencyKey: "ch-123", partitionKey: "user-42" },
66
+ );
67
+ ```
68
+
69
+ > **Start the subscriber before the publisher.** Fan-out matches against registered subscriptions at publish time — a publish to a name nobody subscribes to yet is accepted but delivered to no one. After `start()`, the subscription registers asynchronously; in a script that publishes microseconds later, wait for the `connected` event (or briefly settle) before the first publish. Long-running services never hit this.
70
+
71
+ ### PublishOpts
72
+
73
+ ```ts
74
+ interface PublishOpts {
75
+ idempotencyKey?: string; // runtime dedups same key (24h window)
76
+ partitionKey?: string; // FIFO ordering per key per consumer
77
+ fireAndForget?: boolean; // skip the durable outbox (best-effort); default false
78
+ headers?: Record<string, string>; // metadata; not passed to the handler payload
79
+ occurredAtMs?: number; // business timestamp; default now
80
+ }
81
+ ```
82
+
83
+ ## Delivery semantics (what to rely on)
84
+
85
+ - **At-least-once**: a handler may see the same event more than once → make it idempotent (use `idempotencyKey` on publish and/or a dedup key in the handler).
86
+ - **Fan-out**: every subscriber whose subscription matches gets its own delivery.
87
+ - **Ordering**: only guaranteed per `partitionKey`, per consumer.
88
+ - **Retries + DLQ**: a throwing handler is retried by the runtime; after max attempts the delivery goes to the DLQ. Replay/purge the DLQ from the dashboard — the SDK has no DLQ API.
89
+
90
+ ## Errors
91
+
92
+ ```ts
93
+ import { InvalidEventNameError, OutboxFullError } from "service-bridge";
94
+ ```
95
+
96
+ - `InvalidEventNameError` — from `define()`/`publish()` when the name fails the regex.
97
+ - `OutboxFullError` — from `publish()` when the local outbox hits `maxOutboxRows` (backpressure; default 100000). Slow down or raise the cap.
98
+ - "no schema registered" — publishing/handling a name that was never `define()`d with a schema.
99
+
100
+ ## Capacity knobs (constructor options)
101
+
102
+ `dataDir` (outbox location, default `./.servicebridge`), `maxOutboxRows` (100000), `eventsDrainerBatch` (50 rows/tick), `eventsMaxInFlight` (32 concurrent inbound handlers). See [configuration.md](configuration.md).
package/skill/reference/http-integrations.md ADDED
@@ -0,0 +1,87 @@
1
+ # HTTP integrations — Express, Fastify, Hono
2
+
3
+ ServiceBridge does **not** proxy your business HTTP. You run your own HTTP server; these plugins register its routes into the Service Map and Service Discovery and add trace/metric capture. Your framework keeps serving traffic exactly as before.
4
+
5
+ Each plugin ships as a subpath export:
6
+
7
+ ```ts
8
+ import { attachExpress } from "service-bridge/express";
9
+ import { sbFastify } from "service-bridge/fastify";
10
+ import { attachHono } from "service-bridge/hono";
11
+ ```
12
+
13
+ All three: parse the `X-SB-Trace` header so handler-internal `sb.rpc.call`/`sb.event.publish` join the same trace, emit an `HTTP.HANDLE` op per request (status + optional body capture), and publish your routes + advertise endpoint. Safe to call before `sb.start()` — the endpoint queues into the first registration.
14
+
15
+ ## Express
16
+
17
+ ```ts
18
+ attachExpress(app: Express, sb: ServiceBridge, endpoint: { host?: string; port: number }): void
19
+ ```
20
+
21
+ Call **after** all routes are registered. `port` is required (Express can bind `0`, so the plugin can't infer it). Mount a body parser (`express.json()`) before routes so request bodies are captured.
22
+
23
+ ```ts
24
+ import express from "express";
25
+ import { ServiceBridge } from "service-bridge";
26
+ import { attachExpress } from "service-bridge/express";
27
+
28
+ const sb = new ServiceBridge("localhost:14445", process.env.API_KEY!);
29
+ const app = express();
30
+ app.use(express.json());
31
+ app.get("/api/orders/:id", (req, res) => res.json({ id: req.params.id }));
32
+ app.post("/api/orders", (req, res) => res.json({ created: true }));
33
+
34
+ attachExpress(app, sb, { host: process.env.POD_IP, port: 3000 }); // after routes
35
+ await sb.start();
36
+ app.listen(3000);
37
+ ```
38
+
39
+ ## Fastify
40
+
41
+ ```ts
42
+ await app.register(sbFastify, { sb: ServiceBridge, host?: string });
43
+ ```
44
+
45
+ Register **before** your routes (it collects them via the `onRoute` hook) — the advertise endpoint is published after `app.listen()` via the `onListen` hook, so the real port (even `0`) is known automatically. Supports Fastify 4.x and 5.x.
46
+
47
+ ```ts
48
+ import Fastify from "fastify";
49
+ import { ServiceBridge } from "service-bridge";
50
+ import { sbFastify } from "service-bridge/fastify";
51
+
52
+ const sb = new ServiceBridge("localhost:14445", process.env.API_KEY!);
53
+ const app = Fastify();
54
+ await app.register(sbFastify, { sb, host: process.env.POD_IP }); // before routes
55
+ app.get("/api/orders/:id", async (req) => ({ id: (req.params as any).id }));
56
+ app.post("/api/orders", async () => ({ created: true }));
57
+
58
+ await sb.start();
59
+ await app.listen({ port: 3000 });
60
+ ```
61
+
62
+ ## Hono
63
+
64
+ ```ts
65
+ attachHono(app: Hono, sb: ServiceBridge, endpoint: { host?: string; port: number }): void
66
+ ```
67
+
68
+ Call **after** routes are registered. Hono doesn't bind a socket itself, so `port` must be passed and must match what you give `Bun.serve` / `@hono/node-server` / `Deno.serve`. Routes declared with `app.all(...)` are not collected (no concrete method).
69
+
70
+ ```ts
71
+ import { Hono } from "hono";
72
+ import { ServiceBridge } from "service-bridge";
73
+ import { attachHono } from "service-bridge/hono";
74
+
75
+ const sb = new ServiceBridge("localhost:14445", process.env.API_KEY!);
76
+ const app = new Hono();
77
+ app.get("/api/orders/:id", (c) => c.json({ id: c.req.param("id") }));
78
+ app.post("/api/orders", (c) => c.json({ created: true }));
79
+
80
+ attachHono(app, sb, { host: process.env.POD_IP, port: 3000 }); // after routes
81
+ await sb.start();
82
+ export default { port: 3000, fetch: app.fetch }; // Bun
83
+ ```
84
+
85
+ ## Advertise host
86
+
87
+ `host` is optional on all three. If omitted, the plugin falls back to `SB_HTTP_ADVERTISE_HOST` → `SB_ADVERTISE_HOST` → `127.0.0.1` (with a one-time warning). Pass an explicit reachable host (e.g. the pod IP) in production.
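The fallback order described above can be sketched as a pure function. This is not the plugins' actual code: `resolveAdvertiseHost` and the `Env` type are hypothetical, and treating an empty string as unset is an assumption. The environment is passed in as a plain object so the sketch stays self-contained.

```typescript
// Hypothetical shape for the environment; in Node this would be process.env.
type Env = Readonly<Record<string, string | undefined>>;

// Hypothetical helper mirroring the documented order:
// explicit host, then SB_HTTP_ADVERTISE_HOST, then SB_ADVERTISE_HOST, then 127.0.0.1.
function resolveAdvertiseHost(explicitHost: string | undefined, env: Env): string {
  const ordered = [explicitHost, env["SB_HTTP_ADVERTISE_HOST"], env["SB_ADVERTISE_HOST"]];
  for (const candidate of ordered) {
    // Assumption: empty strings count as "not set".
    if (candidate !== undefined && candidate !== "") return candidate;
  }
  return "127.0.0.1"; // local-only fallback
}

console.log(resolveAdvertiseHost(undefined, { SB_ADVERTISE_HOST: "10.0.0.7" }));
```

Because the last fallback is loopback, a deployment that sets neither the option nor the variables is reachable only from the same host.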
package/skill/reference/jobs.md ADDED
@@ -0,0 +1,90 @@
1
+ # Jobs — scheduled work
2
+
3
+ Cron, delayed, and interval jobs driven by the runtime. The runtime fires the job on schedule, leases it to one instance, and retries on failure. Jobs have no incoming caller and no payload.
4
+
5
+ ## Register a job
6
+
7
+ ```ts
8
+ sb.job.handle(name: string, opts: JobOpts, fn: (ctx: JobHandlerCtx) => Promise<void>): void
9
+ ```
10
+
11
+ Register before `await sb.start()`. Names must be unique per service (duplicate throws).
12
+
13
+ ```ts
14
+ interface JobOpts {
15
+ trigger: Trigger; // required — exactly one shape below
16
+ catchup?: CatchupPolicy; // "skip" (default) | "fire_once" | "fire_all"
17
+ overlap?: OverlapPolicy; // "skip" (default) | "allow" | "buffer_one"
18
+ deps?: DeclaredDep[]; // declared downstream dependencies
19
+ maxAttempts?: number;
20
+ leaseTtlMs?: number;
21
+ maxConcurrent?: number; // with overlap "allow"
22
+ retry?: RetryPolicy; // { initialMs, maxMs, multiplier, jitter }
23
+ }
24
+
25
+ type Trigger =
26
+ | { cron: string; tz?: string } // 5-field cron, no seconds; e.g. "0 9 * * 1"
27
+ | { delayed: { at: Date | string | number } }
28
+ | { interval: number }; // milliseconds
29
+
30
+ type DeclaredDep = { rpc: string } | { event: string } | { workflow: string }; // rpc form: "service.Method"
31
+ ```
32
+
33
+ ## Handler context
34
+
35
+ ```ts
36
+ interface JobHandlerCtx {
37
+ jobName: string;
38
+ executionId: string;
39
+ scheduledAt: Date; // UTC fire time
40
+ localScheduledAt: Date; // in the cron tz (UTC for interval/delayed)
41
+ attempt: number; // 1,2,3… diagnostic only — DO NOT dedup on this
42
+ idempotencyKey: string; // stable per (job, scheduled tick) — DEDUP ON THIS
43
+ signal: AbortSignal; // aborts on lease loss / reconnect
44
+ }
45
+ ```
46
+
47
+ **Idempotency:** a tick may be delivered more than once (retry, failover). Dedup on `ctx.idempotencyKey` (e.g. a DB unique constraint or `SET NX`), never on `ctx.attempt`. Respect `ctx.signal` for long jobs.
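The dedup guard described above can be sketched as follows. The names `tryClaim` and `claimOnce` are hypothetical, and the in-memory `Set` is only a stand-in: it is lost on restart and not shared between instances, so a real service would use the durable store the text mentions (a DB unique constraint or `SET NX`).

```typescript
// In-memory stand-in for a durable dedup store. Illustration only:
// it does not survive restarts and is not shared across instances.
const claimedKeys = new Set<string>();

// Hypothetical helper: returns true the first time a key is seen, false afterwards.
function tryClaim(idempotencyKey: string): boolean {
  if (claimedKeys.has(idempotencyKey)) return false;
  claimedKeys.add(idempotencyKey);
  return true;
}

// Async wrapper with the shape a job handler would await.
async function claimOnce(idempotencyKey: string): Promise<boolean> {
  return tryClaim(idempotencyKey);
}
```

A handler would start with `if (!(await claimOnce(ctx.idempotencyKey))) return;` so a redelivered tick exits before doing any work.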
48
+
49
+ ## Policies
50
+
51
+ - `catchup` — after the runtime was down across scheduled ticks: `skip` (ignore them), `fire_once` (one catch-up run), `fire_all` (replay all missed, capped by runtime budget).
52
+ - `overlap` — when the previous run is still going: `skip` (drop this tick), `buffer_one` (queue one), `allow` (run up to `maxConcurrent` concurrently).
53
+
54
+ ## Recipe — cron job that calls an RPC
55
+
56
+ ```ts
57
+ import { ServiceBridge } from "service-bridge";
58
+ const sb = new ServiceBridge("localhost:14445", process.env.JOBS_KEY!);
59
+
60
+ sb.service("billing", { rpc: ["Reconcile"] }); // declare the downstream call
61
+ sb.job.handle(
62
+ "nightly-reconcile",
63
+ {
64
+ trigger: { cron: "0 2 * * *", tz: "Europe/Moscow" }, // 02:00 Moscow daily
65
+ catchup: "fire_once",
66
+ overlap: "skip",
67
+ maxAttempts: 3,
68
+ deps: [{ rpc: "billing.Reconcile" }],
69
+ },
70
+ async (ctx) => {
71
+ if (!(await claimOnce(ctx.idempotencyKey))) return; // idempotency guard
72
+ await sb.rpc.call("billing", "Reconcile", { date: ctx.localScheduledAt.toISOString() });
73
+ },
74
+ );
75
+
76
+ await sb.start();
77
+ ```
78
+
79
+ ## Recipe — one-shot delayed job
80
+
81
+ ```ts
82
+ sb.job.handle(
83
+ "send-reminder",
84
+ { trigger: { delayed: { at: Date.now() + 60_000 } }, maxAttempts: 2 },
85
+ async (ctx) => { await sendReminder(ctx.idempotencyKey); },
86
+ );
87
+ await sb.start();
88
+ ```
89
+
90
+ Per-service job rate limits are enforced by the runtime at registration time; tune them in the dashboard, not the SDK.
package/skill/reference/rpc.md ADDED
@@ -0,0 +1,150 @@
1
+ # RPC — request/response
2
+
3
+ Direct, typed request/response between services. The runtime resolves routing by service name; you never hardcode host/port.
4
+
5
+ ## Handle incoming calls
6
+
7
+ ```ts
8
+ sb.rpc.handle<Req, Res>(
9
+ name: string,
10
+ fn: (req: Req) => Promise<Res> | Res,
11
+ opts: { schema: SchemaSpec; captureMode?: "all" | "errors" | "none" },
12
+ ): void
13
+ ```
14
+
15
+ - `schema` is **required**. See [Schemas](#schemas).
16
+ - Register before `await sb.start()`.
17
+ - Throw from the handler to signal failure to the caller; the runtime maps it to an error response.
18
+
19
+ ```ts
20
+ sb.rpc.handle(
21
+ "Charge",
22
+ async (req: { userId: string; amount: number }) => {
23
+ if (req.amount <= 0) throw new Error("amount must be positive");
24
+ return { transactionId: `tx-${req.userId}`, ok: true };
25
+ },
26
+ { schema: { protoFile: "./payment.proto", input: "ChargeRequest", output: "ChargeReply" } },
27
+ );
28
+ ```
29
+
30
+ ## Call another service
31
+
32
+ Two ways. Prefer the typed client for ergonomics; use `rpc.call` for dynamic/low-level calls.
33
+
34
+ ### Typed client (recommended)
35
+
36
+ ```ts
37
+ const payment = await sb.client(
38
+ serviceName: string,
39
+ protoFile: string,
40
+ opts?: { methods?: string[]; callDefaults?: CallOpts },
41
+ ); // returns a proxy with one method per rpc in the .proto service block
42
+ ```
43
+
44
+ ```ts
45
+ const payment = await sb.client("payment-svc", "./payment.proto");
46
+ await sb.start();
47
+ const res = await payment.Charge({ userId: "u-1", amount: 100 });
48
+ ```
49
+
50
+ `client()` reads the `.proto` once, declares every method in its `service` block as an outgoing dependency, and loads schemas. Call `client()` **before** `start()` so the dependency rides along in the first registration. Calls succeed once `start()` has connected.
51
+
52
+ ### Low-level call
53
+
54
+ ```ts
55
+ await sb.rpc.call<Req, Res>(
56
+ serviceName: string,
57
+ methodName: string,
58
+ payload: Req,
59
+ opts?: CallOpts,
60
+ ): Promise<Res>
61
+ ```
62
+
63
+ When using `rpc.call` for a method whose schema the SDK doesn't yet know, register it first with `sb.useSchema(serviceName, methodName, spec)` (before `start()`), or declare the dependency with `sb.service(serviceName, { rpc: ["Method"] })`.
64
+
65
+ ```ts
66
+ sb.service("payment-svc", { rpc: ["Charge"] });
67
+ await sb.useSchema("payment-svc", "Charge", {
68
+ protoFile: "./payment.proto", input: "ChargeRequest", output: "ChargeReply",
69
+ });
70
+ await sb.start();
71
+ const res = await sb.rpc.call("payment-svc", "Charge", { userId: "u-1", amount: 100 }, { timeout: "15s" });
72
+ ```
73
+
74
+ ### CallOpts
75
+
76
+ ```ts
77
+ interface CallOpts {
78
+ timeout?: string; // "10s", "500ms" — default "30s"
79
+ requestId?: string; // auto UUID v4 if omitted
80
+ transport?: "direct" | "proxy" | "auto"; // default "auto" (direct when endpoint known, else proxy)
81
+ idempotencyKey?: string; // opt-in runtime-side dedup; empty = no dedup
82
+ retry?: Partial<RetryOpts>; // defaults below
83
+ }
84
+
85
+ interface RetryOpts {
86
+ maxAttempts: number; // default 3
87
+ baseDelayMs: number; // default 200
88
+ factor: number; // default 2
89
+ maxDelayMs: number; // default 5000
90
+ jitter: number; // [0,1], default 0.3
91
+ }
92
+ ```
93
+
94
+ Per-call opts override `callDefaults` from the constructor.
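To show how the `RetryOpts` defaults interact, here is a sketch of one common backoff scheme. The formula is an assumption, not taken from the SDK: the delay before retry `n` is `baseDelayMs * factor^(n - 1)`, capped at `maxDelayMs`, then spread by up to plus or minus `jitter` as a fraction. `backoffDelayMs` is a hypothetical helper, and the random source is injectable so the result can be made deterministic.

```typescript
interface RetryOpts {
  maxAttempts: number;
  baseDelayMs: number;
  factor: number;
  maxDelayMs: number;
  jitter: number;
}

// Defaults as documented in the interface comments above.
const DEFAULT_RETRY: RetryOpts = {
  maxAttempts: 3,
  baseDelayMs: 200,
  factor: 2,
  maxDelayMs: 5000,
  jitter: 0.3,
};

// Hypothetical helper. `retryNumber` is 1 for the first retry.
// Assumed formula: exponential growth, cap, then symmetric jitter.
function backoffDelayMs(
  retryNumber: number,
  overrides: Partial<RetryOpts> = {},
  random: () => number = Math.random,
): number {
  const opts: RetryOpts = { ...DEFAULT_RETRY, ...overrides };
  const exponential = opts.baseDelayMs * Math.pow(opts.factor, retryNumber - 1);
  const capped = Math.min(exponential, opts.maxDelayMs);
  // random() in [0, 1) maps to a spread in [-jitter, +jitter).
  const spread = (random() * 2 - 1) * opts.jitter;
  return Math.round(capped * (1 + spread));
}

// With the jitter pinned to its midpoint the delays are 200, 400, 800 ms.
for (let n = 1; n <= 3; n++) {
  console.log(n, backoffDelayMs(n, {}, () => 0.5));
}
```

In this sketch the jitter is applied after the cap, so a delay can exceed `maxDelayMs` by up to the jitter fraction; whether the SDK does the same is not stated in the source.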
95
+
96
+ ## Streaming
97
+
98
+ Server-streaming: handler returns an async iterable; caller consumes one.
99
+
100
+ ```ts
101
+ // provider
102
+ sb.rpc.handleStream<Req, Chunk>(
103
+ name: string,
104
+ fn: (req: Req) => AsyncIterable<Chunk>,
105
+ opts: { schema: SchemaSpec },
106
+ ): void
107
+
108
+ // caller — typed client method returns an async iterable, or use sb.stream:
109
+ sb.stream<Req, Chunk>(serviceName, methodName, payload, opts?): AsyncIterable<Chunk>
110
+ ```
111
+
112
+ ```ts
113
+ sb.rpc.handleStream("Ticks", async function* (req: { n: number }) {
114
+ for (let i = 0; i < req.n; i++) yield { i };
115
+ }, { schema: { protoFile: "./ticks.proto", input: "TicksRequest", output: "Tick" } });
116
+
117
+ // caller
118
+ for await (const chunk of sb.stream("tick-svc", "Ticks", { n: 5 })) {
119
+ console.log(chunk);
120
+ }
121
+ ```
122
+
123
+ ## Schemas
124
+
125
+ Every RPC handler and every declared call needs a schema. Two forms:
126
+
127
+ ```ts
128
+ type SchemaSpec =
129
+ | { protoFile: string; input?: string; output?: string; method?: string }
130
+ | { schemaFile: string; method?: string }; // .schema.json with explicit fieldNumber per property
131
+ ```
132
+
133
+ - With `protoFile` and no `input`/`output`, the SDK finds the rpc in the `.proto` `service` block whose name matches the method and uses its request/response messages. Provide `input`/`output` explicitly when the message names differ from the auto-resolution or there's ambiguity.
134
+ - Paths are relative to `process.cwd()` unless absolute.
135
+
136
+ ## Errors
137
+
138
+ - `RpcAccessDeniedError` (`serviceName`, `methodName`, `reason`) — thrown from `rpc.call` when the runtime's bilateral access policy denies the call. Not retryable.
139
+ - Handler exceptions surface to the caller as an error response.
140
+ - Connection/auth failures surface via the `disconnected` event with a `ServiceBridgeError` (see [configuration.md](configuration.md)).
141
+
142
+ ```ts
143
+ import { RpcAccessDeniedError } from "service-bridge";
144
+ try {
145
+ await sb.rpc.call("payment-svc", "Charge", payload);
146
+ } catch (e) {
147
+ if (e instanceof RpcAccessDeniedError) { /* policy denial — fix the access policy */ }
148
+ else throw e;
149
+ }
150
+ ```
@@ -0,0 +1,150 @@
1
+ # Workflows — durable orchestration
2
+
3
+ A workflow is a DAG of steps persisted in the runtime: steps run in parallel by default and are ordered only by their `waitFor` dependencies. The runtime drives execution, survives restarts, and supports compensation (saga rollback), external signals, and replay.
4
+
5
+ There are two sides: the **owner** registers the definition (`workflow.handle`); a **caller** launches runs (`workflow.start`). They can be the same service or different ones.
6
+
7
+ ## Define & register (owner side)
8
+
9
+ ```ts
10
+ sb.workflow.handle(name: string, def: WorkflowDef, opts?: { input?: Record<string, unknown> }): void
11
+
12
+ interface WorkflowDef {
13
+ input?: Record<string, unknown>; // JSON-Schema-ish shape for run input
14
+ steps: Step[]; // the DAG
15
+ retry?: Partial<RetryOpts>; // default retry for steps
16
+ maxParallelism?: number; // cap concurrent steps (0 = unlimited)
17
+ timeoutSec?: number; // whole-run wall-clock timeout
18
+ }
19
+ ```
20
+
21
+ Register before `await sb.start()`. The definition is canonicalized and fingerprinted; the runtime rejects re-registering the same name with a different shape while runs exist (in-flight runs keep their frozen plan).
22
+
23
+ ## Step types
24
+
25
+ Every step shares these control fields:
26
+
27
+ ```ts
28
+ interface StepControlFields {
29
+ id: string; // unique within the workflow, ^[a-z0-9_]+$
30
+ waitFor?: string[]; // ids that must finish first (this is what orders the DAG)
31
+ when?: Predicate; // run only if predicate is true (see below)
32
+ compensate?: CompensateSpec; // rollback action if the run later fails (call/publish steps)
33
+ timeoutSec?: number; // workflow-control timeout for this step
34
+ retry?: Partial<RetryOpts>;
35
+ }
36
+ ```
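To make the role of `waitFor` concrete, here is a minimal sketch that groups steps into "waves" in which every step's dependencies sit in an earlier wave. It is not the runtime's scheduler (which presumably starts a step as soon as its own dependencies finish); `StepRef` and `planWaves` are hypothetical names used only for illustration.

```ts
// Hypothetical, trimmed-down view of a step: only the fields that order the DAG.
interface StepRef {
  id: string;
  waitFor?: string[];
}

// Group step ids into waves; a step lands in the first wave after all its waitFor ids.
function planWaves(steps: StepRef[]): string[][] {
  const ids = new Set(steps.map((s) => s.id));
  for (const s of steps) {
    for (const dep of s.waitFor ?? []) {
      if (!ids.has(dep)) {
        throw new Error(`step "${s.id}" waits for unknown step "${dep}"`);
      }
    }
  }

  const done = new Set<string>();
  const waves: string[][] = [];
  let pending = [...steps];
  while (pending.length > 0) {
    // A step is ready once every id it waits for has been placed in an earlier wave.
    const ready = pending.filter((s) => (s.waitFor ?? []).every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("waitFor cycle detected");
    waves.push(ready.map((s) => s.id));
    for (const s of ready) done.add(s.id);
    pending = pending.filter((s) => !done.has(s.id));
  }
  return waves;
}
```

Steps without `waitFor` all land in the first wave, which mirrors the "parallel by default" rule; a linear chain of `waitFor` references yields one step per wave.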
37
+
38
+ | `type` | Extra fields | Does |
39
+ |---|---|---|
40
+ | `"call"` | `service`, `method`, `input`, `opts?` | `rpc.call` to another service |
41
+ | `"publish"` | `event`, `input`, `opts?` | publish an event |
42
+ | `"sleep"` | `durationSec` | durable timer |
43
+ | `"wait_event"` | `event`, `filter?` | park until a matching event is ingested |
44
+ | `"wait_signal"` | `signal` | park until `sb.workflow.signal(runId, signal, …)` |
45
+ | `"workflow"` | `workflow`, `input`, `opts?` | start a child workflow and wait for it |
46
+ | `"parallel"` | `steps`, `forEach?` | group; all inner steps start at once |
47
+ | `"sequence"` | `steps`, `forEach?` | group; inner steps run one after another |
48
+ | `"local"` | `fn: (state) => Promise<unknown>` | arbitrary JS in the SDK process — use sparingly |
49
+
50
+ `forEach?: { from: JsonExpression; as: string }` on parallel/sequence fans the group out over an array.
51
+
52
+ ## Expressions (JSONPath-lite)
53
+
54
+ Step `input`, `when`, `idempotencyKey`, `forEach.from`, etc. accept declarative expressions:
55
+
56
+ - `"$.input.userId"` — a path into run input or accumulated state. Each completed step's output is stored under its `id`, so `"$.reserve.token"` reads the `reserve` step's output field `token`.
57
+ - A string **not** starting with `$.` is a literal. Use `{ literal: "$.x" }` to force a literal that looks like a path.
58
+ - Objects/arrays are evaluated recursively.
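The three rules above can be sketched as a small resolver. This is an illustrative reading of the rules, not the runtime's evaluator: the `resolve` name, returning `null` for a missing path, treating only a single-key `{ literal }` object as the escape form, and ignoring array indices in paths are all assumptions.

```ts
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Resolve an expression against run state (run input under "input", step outputs under step ids).
function resolve(expr: Json, state: { [key: string]: Json }): Json {
  if (typeof expr === "string") {
    if (!expr.startsWith("$.")) return expr; // rule 2: plain strings are literals
    let cur: Json = state;
    for (const key of expr.slice(2).split(".")) {
      if (cur === null || typeof cur !== "object" || Array.isArray(cur)) return null;
      if (!Object.prototype.hasOwnProperty.call(cur, key)) return null; // assumption: missing path yields null
      cur = cur[key];
    }
    return cur; // rule 1: value found at the path
  }
  if (Array.isArray(expr)) return expr.map((item) => resolve(item, state)); // rule 3
  if (expr !== null && typeof expr === "object") {
    const keys = Object.keys(expr);
    // Assumption: an object whose only key is "literal" is the escape form.
    if (keys.length === 1 && keys[0] === "literal") return expr["literal"];
    const out: { [key: string]: Json } = {};
    for (const key of keys) out[key] = resolve(expr[key], state); // rule 3
    return out;
  }
  return expr; // numbers, booleans and null pass through unchanged
}
```

With state `{ input: { userId: "u-1" }, reserve: { token: "t-7" } }`, an input of `{ userId: "$.input.userId", token: "$.reserve.token" }` would resolve to `{ userId: "u-1", token: "t-7" }` under these assumptions.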
59
+
60
+ `Predicate` for `when`:
61
+
62
+ ```ts
63
+ type Predicate =
64
+ | string // truthy expression, e.g. "$.input.enabled"
65
+ | { not: Predicate }
66
+ | { equals: [JsonExpression, JsonExpression] }
67
+ | { in: [JsonExpression, JsonExpression] } // [value, array]
68
+ | { and: Predicate[] }
69
+ | { or: Predicate[] };
70
+ ```
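As a rough illustration of how such a predicate could be evaluated, the sketch below walks the union recursively. It is a hypothetical reading, not the runtime's logic: `lookup` is a deliberately minimal path resolver, `JsonExpression` is approximated by a local `Json` type, and equality is approximated by comparing `JSON.stringify` output (so object key order matters).

```ts
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type State = { [key: string]: Json };
type Predicate =
  | string
  | { not: Predicate }
  | { equals: [Json, Json] }
  | { in: [Json, Json] } // [value, array]
  | { and: Predicate[] }
  | { or: Predicate[] };

// Hypothetical minimal resolver: "$.a.b" paths are looked up, anything else is a literal.
function lookup(expr: Json, state: State): Json {
  if (typeof expr !== "string" || !expr.startsWith("$.")) return expr;
  let cur: Json = state;
  for (const key of expr.slice(2).split(".")) {
    if (cur === null || typeof cur !== "object" || Array.isArray(cur)) return null;
    if (!Object.prototype.hasOwnProperty.call(cur, key)) return null;
    cur = cur[key];
  }
  return cur;
}

// Assumption: structural equality via JSON text; sensitive to object key order.
function same(a: Json, b: Json): boolean {
  return JSON.stringify(a) === JSON.stringify(b);
}

function evaluate(p: Predicate, state: State): boolean {
  if (typeof p === "string") return Boolean(lookup(p, state)); // truthy expression
  if ("not" in p) return !evaluate(p.not, state);
  if ("equals" in p) return same(lookup(p.equals[0], state), lookup(p.equals[1], state));
  if ("in" in p) {
    const value = lookup(p.in[0], state);
    const list = lookup(p.in[1], state);
    return Array.isArray(list) && list.some((item) => same(item, value));
  }
  if ("and" in p) return p.and.every((q) => evaluate(q, state));
  return p.or.some((q) => evaluate(q, state));
}
```

For example, `{ and: ["$.input.enabled", { in: ["$.input.tier", ["gold", "silver"]] }] }` holds in this sketch when `input.enabled` is truthy and `input.tier` is one of the listed values.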
71
+
72
+ `CompensateSpec` (rollback for a `call`/`publish` step, run in reverse order if a later step fails):
73
+
74
+ ```ts
75
+ interface CompensateSpec {
76
+ type?: "call" | "publish";
77
+ service?: string; method?: string; // for call
78
+ event?: string; // for publish
79
+ input: JsonExpression;
80
+ retry?: Partial<RetryOpts>;
81
+ idempotencyKey?: string;
82
+ }
83
+ ```
84
+
85
+ ## Launch & observe (caller side)
86
+
87
+ ```ts
88
+ await sb.workflow.start(name, input, opts?): Promise<{ runId: string }> // opts: { idempotencyKey?, timeoutSec?, parentRunId? }
89
+ await sb.workflow.await(runId): Promise<Record<string, unknown>> // resolves on success, rejects on failed/cancelled/failed_compensated
90
+ await sb.workflow.query(runId): Promise<{ status; state; steps }> // point-in-time snapshot
91
+ await sb.workflow.signal(runId, signalName, payload): Promise<void> // deliver to a wait_signal step
92
+ await sb.workflow.cancel(runId): Promise<void> // cooperative cancel → compensating → cancelled
93
+ await sb.workflow.replay(runId, opts?): Promise<{ runId: string }> // fork a new run; opts: { fromStepId? }
94
+ ```
95
+
96
+ Caller ops require `await sb.start()` first. `query().status` is one of `pending | running | waiting | success | failed | cancelling | cancelled | compensating | failed_compensated`.
97
+
98
+ > When the **same instance** both `handle()`s a workflow and `start()`s it, the definition registers asynchronously after `sb.start()`, so a `workflow.start()` issued microseconds later can throw `WorkflowNotFoundError`. Wait for the `connected` event (or briefly settle) first. With a separate owner service (the usual case), this doesn't apply.
99
+
100
+ ## Recipe — saga with compensation
101
+
102
+ ```ts
103
+ // owner
104
+ sb.workflow.handle("checkout", {
105
+ input: { userId: "string", item: "string", quantity: "number", amount: "number" },
106
+ steps: [
107
+ {
108
+ type: "call", id: "reserve",
109
+ service: "inventory", method: "Reserve",
110
+ input: { item: "$.input.item", quantity: "$.input.quantity" },
111
+ compensate: { service: "inventory", method: "Release", input: { token: "$.reserve.token" } },
112
+ },
113
+ {
114
+ type: "call", id: "charge",
115
+ service: "billing", method: "Charge",
116
+ input: { userId: "$.input.userId", amount: "$.input.amount" },
117
+ waitFor: ["reserve"],
118
+ // if "charge" fails, the runtime runs "reserve"'s compensate (Release) automatically
119
+ },
120
+ {
121
+ type: "publish", id: "notify",
122
+ event: "order.placed",
123
+ input: { userId: "$.input.userId", token: "$.reserve.token" },
124
+ waitFor: ["charge"],
125
+ },
126
+ ],
127
+ retry: { maxAttempts: 3 },
128
+ });
129
+ await sb.start();
130
+ ```
131
+
132
+ ```ts
133
+ // caller
134
+ await sb.start();
135
+ const { runId } = await sb.workflow.start("checkout", {
136
+ userId: "u-1", item: "sku-9", quantity: 2, amount: 100,
137
+ });
138
+ const finalState = await sb.workflow.await(runId);
139
+ ```
140
+
141
+ ## Errors
142
+
143
+ ```ts
144
+ import { WorkflowAccessDeniedError, WorkflowNotFoundError, WorkflowTerminalError } from "service-bridge";
145
+ ```
146
+
147
+ - `WorkflowNotFoundError` — `start()` on an unregistered name.
148
+ - `WorkflowAccessDeniedError` — bilateral access policy denied the start.
149
+ - `WorkflowTerminalError` — `signal()`/`cancel()` on an already-terminal run.
150
+ - `await(runId)` rejects when the run ends in `failed`/`cancelled`/`failed_compensated`.