@bluelibs/runner 6.3.0 → 6.3.1

@@ -0,0 +1,258 @@
1
+ # Durable Workflows (v2) — Token-Friendly
2
+
3
+ ← [Back to main README](../README.md) | [Full documentation](./DURABLE_WORKFLOWS.md)
4
+
5
+ ---
6
+
7
+ Durable workflows are **Runner tasks with replay-safe checkpoints** (Node-only: `@bluelibs/runner/node`).
8
+
9
+ They're designed for flows that span time (minutes → days): approvals, payments, onboarding, shipping.
10
+
11
+ ## The mental model
12
+
13
+ - A workflow does not "resume the instruction pointer".
14
+ - On every wake-up (sleep/signal/retry/recover), it **re-runs from the top** and fast-forwards using stored results:
15
+ - `durableContext.step("id", fn)` runs `fn` once, persists the result, and returns the cached result on replay.
16
+ - `durableContext.sleep(...)` and `durableContext.waitForSignal(...)` persist durable checkpoints.
17
+
18
+ Rule: side effects belong inside `durableContext.step(...)`.
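The replay model above can be illustrated with a toy in-memory version of step memoization. This is a minimal sketch, not the `@bluelibs/runner` implementation: `MemoryStore`, `makeContext`, `chargeCard`, and `demoReplay` are hypothetical names invented for the illustration.

```typescript
// Toy model of "re-run from the top, fast-forward from stored results".
// NOT the library's code; every name here is a hypothetical stand-in.
type MemoryStore = Map<string, unknown>;

function makeContext(store: MemoryStore) {
  return {
    // Runs `fn` the first time a step id is seen, then serves the stored result.
    async step<T>(id: string, fn: () => Promise<T>): Promise<T> {
      if (store.has(id)) return store.get(id) as T;
      const result = await fn();
      store.set(id, result);
      return result;
    },
  };
}

let charges = 0; // counts how often the side effect really ran

async function chargeCard(): Promise<{ chargeId: string }> {
  charges += 1;
  return { chargeId: `charge-${charges}` };
}

// The workflow body always starts from the top; the store decides what is skipped.
async function workflow(store: MemoryStore): Promise<string> {
  const ctx = makeContext(store);
  const charge = await ctx.step("charge-card", chargeCard);
  return charge.chargeId;
}

async function demoReplay() {
  const store: MemoryStore = new Map();
  const first = await workflow(store); // first run executes the side effect
  const second = await workflow(store); // "wake-up": same body, step served from the store
  return { first, second, charges };
}
```

In this sketch the side effect runs once even though the workflow body runs twice, which is why side effects placed outside a step would be repeated on every wake-up.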
19
+
20
+ ## The happy path
21
+
22
+ 1. **Register a durable resource** (store + queue + event bus).
23
+ 2. **Write a durable task**:
24
+ - stable `durableContext.step("...")` ids
25
+ - explicit `{ stepId }` for `sleep/emit/waitForSignal` in production
26
+ 3. **Start the workflow**:
27
+ - `executionId = await service.start(taskOrTaskId, input)`
28
+ - persist `executionId` in your domain row (e.g. `orders.execution_id`)
29
+ - or `await service.startAndWait(taskOrTaskId, input)` to start + wait in one call
30
+ 4. **Interact later via signals**:
31
+ - look up `executionId`
32
+ - `await service.signal(executionId, SignalDef, payload)`
33
+
34
+ For user-facing status pages, read the execution on demand from the durable store by `executionId` instead of mirroring it into your own database (e.g. Postgres): use `store.getExecution(executionId)`, or `new DurableOperator(store).getExecutionDetail(executionId)` when supported.
35
+
36
+ Signals buffer if no waiter exists yet; the next `waitForSignal(...)` consumes the payload.
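The buffering behaviour can be pictured with a small in-memory mailbox. This is a sketch under assumptions, not the library's signal machinery: `SignalBox` is a hypothetical class, and the first-in, first-out ordering of buffered payloads is an assumption the source does not state.

```typescript
// Toy mailbox: a signal that arrives before any waiter is kept until someone waits.
// Hypothetical illustration only; FIFO ordering is an assumption.
class SignalBox<P> {
  private buffered: P[] = [];
  private waiters: Array<(payload: P) => void> = [];

  signal(payload: P): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(payload); // someone is already waiting: deliver now
    else this.buffered.push(payload); // nobody waiting yet: buffer it
  }

  waitForSignal(): Promise<P> {
    if (this.buffered.length > 0) {
      // A buffered payload is consumed by the next wait.
      return Promise.resolve(this.buffered.shift() as P);
    }
    return new Promise<P>((resolve) => {
      this.waiters.push(resolve);
    });
  }
}
```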
37
+
38
+ `taskOrTaskId` can be:
39
+
40
+ - an `ITask` (recommended, keeps full input/result type-safety)
41
+ - a task id `string` (resolved via the runtime registry; fails fast if not found)
42
+
43
+ `taskOrTaskId` is the built task object (`.build()`) or its id string, not the injected dependency callable from `.dependencies({...})`.
44
+
45
+ `start()` vs `startAndWait()`:
46
+
47
+ - `start(taskOrTaskId, input)` returns `executionId` immediately.
48
+ - `startAndWait(taskOrTaskId, input)` starts, waits, and returns `{ durable: { executionId }, data }`.
49
+
50
+ ## Tagging workflows (required for discovery)
51
+
52
+ Durable workflows are regular Runner tasks, but **must be tagged with `tags.durableWorkflow`** to make them discoverable at runtime. Always add this tag to your workflow tasks:
53
+
54
+ ```ts
+ import { r } from "@bluelibs/runner";
+ import { resources, tags } from "@bluelibs/runner/node";
+
+ const durable = resources.memoryWorkflow.fork("app-durable");
+
+ const onboarding = r
+   .task("onboarding")
+   .dependencies({ durable })
+   .tags([
+     tags.durableWorkflow.with({
+       category: "users",
+       defaults: { invitedBy: "system" },
+     }),
+   ])
+   .run(async (_input, { durable }) => {
+     const durableContext = durable.use();
+     await durableContext.step("create-user", async () => ({ ok: true }));
+     return { ok: true };
+   })
+   .build();
+
+ // Later, after run(...):
+ // const durableRuntime = runtime.getResourceValue(durable);
+ // const workflows = durableRuntime.getWorkflows();
+ ```
80
+
81
+ `tags.durableWorkflow` is **required** — workflows without this tag will not be discoverable via `getWorkflows()`. Register `resources.durable` once in the app so the durable tag definition and durable events are available at runtime.
82
+
83
+ `tags.durableWorkflow` is discovery metadata, and can also carry optional `defaults` for `describe(...)`.
84
+ The unified response envelope is produced by `startAndWait(...)`: `{ durable: { executionId }, data }`.
85
+ `defaults` are applied only by `describe(task)` when no explicit describe input is passed.
86
+
87
+ ### Starting workflows from dependencies (HTTP route)
88
+
89
+ The tag is discovery metadata only; it does not start anything. Start workflows explicitly via `durable.start(...)` (or `durable.startAndWait(...)` when you want to wait for completion):
90
+
91
+ ```ts
+ import express from "express";
+ import { r, run } from "@bluelibs/runner";
+ import { resources, tags } from "@bluelibs/runner/node";
+
+ const durable = resources.memoryWorkflow.fork("app-durable");
+
+ const approveOrder = r
+   .task("approve-order")
+   .dependencies({ durable })
+   .tags([tags.durableWorkflow.with({ category: "orders" })])
+   .run(async (input: { orderId: string }, { durable }) => {
+     const durableContext = durable.use();
+     await durableContext.step("approve", async () => ({ approved: true }));
+     return { orderId: input.orderId, status: "approved" as const };
+   })
+   .build();
+
+ const api = r
+   .resource("api")
+   .register([resources.durable, durable.with({ worker: false }), approveOrder])
+   .dependencies({ durable, approveOrder })
+   .init(async (_cfg, { durable, approveOrder }) => {
+     const app = express();
+     app.use(express.json());
+
+     app.post("/orders/:id/approve", async (req, res) => {
+       const executionId = await durable.start(approveOrder, {
+         orderId: req.params.id,
+       });
+       res.status(202).json({ executionId });
+     });
+
+     app.listen(3000);
+   })
+   .build();
+
+ await run(api);
+ ```
130
+
131
+ Recommended wiring (config-only resources):
132
+
133
+ ```ts
+ import { resources } from "@bluelibs/runner/node";
+
+ // dev/tests
+ const durable = resources.memoryWorkflow.fork("app-durable").with({
+   worker: true,
+ });
+
+ // production (Redis + optional RabbitMQ queue)
+ const durableProd = resources.redisWorkflow.fork("app-durable").with({
+   redis: { url: process.env.REDIS_URL! },
+   queue: { url: process.env.RABBITMQ_URL! },
+   worker: true,
+ });
+ ```
148
+
149
+ `waitForSignal()` return shapes:
150
+
151
+ - `await durableContext.waitForSignal(Signal)` → `payload` (throws on timeout)
152
+ - `await durableContext.waitForSignal(Signal, { timeoutMs })` → `{ kind: "signal", payload } | { kind: "timeout" }`
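The second return shape is a discriminated union, so callers can branch on `kind`. The sketch below only models the result type listed above and does not call the library; `WaitResult`, `Paid`, and `nextAction` are hypothetical names.

```typescript
// Result shape copied from the list above; the surrounding names are illustrative.
type WaitResult<P> = { kind: "signal"; payload: P } | { kind: "timeout" };

type Paid = { paidAt: string };

// Decide what the workflow should do next, depending on how the wait ended.
function nextAction(result: WaitResult<Paid>): string {
  switch (result.kind) {
    case "signal":
      // `payload` is only accessible in this branch thanks to narrowing on `kind`.
      return `ship (paid at ${result.payload.paidAt})`;
    case "timeout":
      return "cancel-order";
  }
}
```

Handling both branches explicitly avoids wrapping the wait in `try/catch`, which the first form (no `timeoutMs`) would need because it throws on timeout.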
153
+
154
+ ## Scheduling
155
+
156
+ - One-time: `service.schedule(taskOrTaskId, input, { at } | { delay })`
157
+ - Recurring: `service.ensureSchedule(taskOrTaskId, input, { id, cron } | { id, interval })`
158
+ - Manage: `pauseSchedule/resumeSchedule/getSchedule/listSchedules/updateSchedule/removeSchedule`
159
+
160
+ ## Recovery
161
+
162
+ - `await service.recover()` on startup kicks off incomplete executions again.
163
+ - Timers (sleeps, signal timeouts, schedules) require polling enabled in at least one process.
164
+
165
+ ## Inspecting an execution
166
+
167
+ - `createDashboardMiddleware` is now part of `@bluelibs/runner-durable-dashboard` (not core).
168
+ - `store.getExecution(executionId)` → status (running/sleeping/completed/failed/etc)
169
+ - When supported:
170
+ - `store.listStepResults(executionId)` → completed steps
171
+ - `store.listAuditEntries(executionId)` → timeline (step_completed, signal_waiting, signal_delivered, sleeps, status changes)
172
+ - `new DurableOperator(store).getExecutionDetail(executionId)` returns `{ execution, steps, audit }`.
173
+
174
+ "Internal steps" are recorded steps created by durable primitives (`sleep/waitForSignal/emit` and some bookkeeping). They typically use reserved step id prefixes like `__...` or `rollback:...`.
175
+
176
+ Audit can be enabled via `audit: { enabled: true }`; inside workflows you can add replay-safe notes via `durableContext.note("msg", meta)`. In Runner integration, audit entries are also emitted via `durableEvents.*`.
177
+
178
+ Runner integration detail: durable event emission does not depend on `audit.enabled`, which only controls persistence to the store. Events are emitted as long as an audit emitter is configured (the built-in durable workflow resources wire one by default).
179
+
180
+ Import and subscribe using event definitions (not strings): `import { durableEvents } from "@bluelibs/runner/node"` and `.on(durableEvents.audit.appended)` (or a specific durable event).
181
+
182
+ ## Compensation / rollback
183
+
184
+ - `durableContext.step("id").up(...).down(...)` registers compensations.
185
+ - `await durableContext.rollback()` runs compensations in reverse order.
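To show what "reverse order" means in practice, here is a toy saga that records an undo action after each forward action. It is a sketch, not the library's `up/down` builder: `Saga`, its `run` method, and `demoRollback` are hypothetical names.

```typescript
// Toy compensation stack. NOT the @bluelibs/runner implementation.
type Compensation = () => Promise<void>;

class Saga {
  private compensations: Compensation[] = [];

  // Runs the forward action, then remembers how to undo it.
  async run<T>(up: () => Promise<T>, down: (result: T) => Promise<void>): Promise<T> {
    const result = await up();
    this.compensations.push(() => down(result));
    return result;
  }

  // Undoes completed actions, newest first.
  async rollback(): Promise<void> {
    while (this.compensations.length > 0) {
      const undo = this.compensations.pop()!;
      await undo();
    }
  }
}

async function demoRollback(): Promise<string[]> {
  const log: string[] = [];
  const saga = new Saga();
  await saga.run(
    async () => { log.push("reserve-stock"); },
    async () => { log.push("release-stock"); },
  );
  await saga.run(
    async () => { log.push("charge-card"); },
    async () => { log.push("refund-card"); },
  );
  await saga.rollback(); // refund first, then release
  return log;
}
```

Undoing the newest action first means a later step that depended on an earlier one (the charge depends on the reservation) is reversed before the thing it depended on.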
186
+
187
+ ## Branching with durableContext.switch()
188
+
189
+ `durableContext.switch()` is a replay-safe branching primitive. It evaluates matchers against a value, persists which branch was taken, and on replay skips the matchers entirely.
190
+
191
+ ```ts
+ const result = await durableContext.switch(
+   "route-order",
+   order.status,
+   [
+     {
+       id: "approve",
+       match: (s) => s === "paid",
+       run: async (s) => {
+         /* ... */ return "approved";
+       },
+     },
+     {
+       id: "reject",
+       match: (s) => s === "declined",
+       run: async () => "rejected",
+     },
+   ],
+   // optional default branch, used when no matcher matches
+   { id: "manual-review", run: async () => "needs-review" },
+ );
+ ```
212
+
213
+ - First arg is the step id (must be unique, like `durableContext.step`).
214
+ - Matchers evaluate in order; first match wins.
215
+ - The matched branch `id` + result are persisted; on replay the cached result is returned immediately.
216
+ - Throws if no branch matches and no default is provided.
217
+ - Audit emits a `switch_evaluated` entry with `branchId` and `durationMs`.
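The rules above can be sketched as a toy function that persists the chosen branch and skips the matchers on replay. This is an illustration under assumptions, not the library's `switch`: `durableSwitch`, `Branch`, `Stored`, and `demoSwitch` are hypothetical, and the default-branch argument is left out for brevity.

```typescript
// Toy replay-safe branching. NOT the @bluelibs/runner implementation.
type Branch<V, R> = {
  id: string;
  match: (value: V) => boolean;
  run: (value: V) => Promise<R>;
};
type Stored<R> = { branchId: string; result: R };

async function durableSwitch<V, R>(
  store: Map<string, Stored<unknown>>,
  stepId: string,
  value: V,
  branches: Branch<V, R>[],
): Promise<R> {
  const cached = store.get(stepId);
  if (cached) return cached.result as R; // replay: matchers are not evaluated
  const branch = branches.find((b) => b.match(value)); // first match wins
  if (!branch) throw new Error(`No branch matched for "${stepId}"`);
  const result = await branch.run(value);
  store.set(stepId, { branchId: branch.id, result });
  return result;
}

async function demoSwitch() {
  const store = new Map<string, Stored<unknown>>();
  let matcherCalls = 0;
  const branches: Branch<string, string>[] = [
    { id: "approve", match: (s) => { matcherCalls += 1; return s === "paid"; }, run: async () => "approved" },
    { id: "reject", match: (s) => { matcherCalls += 1; return s === "declined"; }, run: async () => "rejected" },
  ];
  const first = await durableSwitch(store, "route-order", "declined", branches);
  // Replay with a different live value: the persisted decision still wins.
  const replayed = await durableSwitch(store, "route-order", "paid", branches);
  return { first, replayed, matcherCalls };
}
```

Persisting the decision rather than re-evaluating it keeps a replay on the same path even if the inspected value has changed since the first run.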
218
+
219
+ ## Describing a flow (static shape export)
220
+
221
+ Use `durable.describe(...)` to export the structure of a workflow without executing it. Useful for documentation, visualization, and tooling.
222
+
223
+ **Easiest: pass the task directly** — no refactoring needed:
224
+
225
+ ```ts
+ // Get your durable dependency from runtime, then:
+ const durableRuntime = runtime.getResourceValue(durable);
+ const shape = await durableRuntime.describe(myTask);
+ // shape.nodes = [{ kind: "step", stepId: "validate", ... }, ...]
+
+ // TInput is inferred from the task, or can be specified explicitly:
+ const shape2 = await durableRuntime.describe<{ orderId: string }>(myTask, {
+   orderId: "123",
+ });
+ ```
236
+
237
+ The recorder shims `durable.use()` inside the task's `run` and records every `durableContext.*` operation.
238
+
239
+ If the task uses `tags.durableWorkflow.with({ defaults: {...} })`, `describe(task)` uses those defaults.
240
+ `describe(task, input)` always overrides tag defaults.
241
+
242
+ Notes:
243
+
244
+ - The recorder captures each `durableContext.*` call as a `FlowNode`; step bodies are never executed.
245
+ - Supported node kinds: `step`, `sleep`, `waitForSignal`, `emit`, `switch`, `note`.
246
+ - `DurableFlowShape` and all `FlowNode` types are exported for type-safe consumption.
247
+ - Conditional logic should be modeled with `durableContext.switch()` (not JS `if/else`) for the shape to capture it.
248
+
249
+ ## Versioning (don't get burned)
250
+
251
+ - Step ids are part of the durable contract: don't rename/reorder casually.
252
+ - For breaking behavior changes, ship a **new workflow task id** (e.g. `...v2`) and route new starts to it while v1 drains.
253
+ - A "dispatcher/alias" task is great for _new starts_, but in-flight stability requires the version choice to be stable (don't silently change behavior under the same durable task id).
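One way to keep the version choice stable is to record the chosen workflow id when an execution starts and to read that record, rather than "current", on every later interaction. The sketch below is an assumption about how an application might do this; the ids `approve-order-v1` and `approve-order-v2`, the `ExecutionRecord` type, and both helpers are hypothetical.

```typescript
// Hypothetical ids and helpers; this only illustrates the routing rule.
const CURRENT_WORKFLOW_ID = "approve-order-v2";

type ExecutionRecord = { executionId: string; workflowId: string };

// New starts always pick the current version and record that choice.
function startRecord(executionId: string): ExecutionRecord {
  return { executionId, workflowId: CURRENT_WORKFLOW_ID };
}

// Later interactions read the recorded id; they never re-evaluate "current".
function workflowIdForResume(record: ExecutionRecord): string {
  return record.workflowId;
}
```

With this split, changing `CURRENT_WORKFLOW_ID` affects only executions started afterwards, while in-flight v1 executions keep replaying against v1.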
254
+
255
+ ## Operational notes
256
+
257
+ - `durableContext.emit(...)` is best-effort (notifications), not guaranteed delivery.
258
+ - Queue mode is at-least-once; correctness comes from the store + step memoization.
@@ -0,0 +1,219 @@
1
+ # BlueLibs Runner — Enterprise
2
+
3
+ ← [Back to main README](../README.md)
4
+
5
+ Build reliable, observable, and compliant services with a TypeScript-first framework designed for governed environments.
6
+
7
+ - Strong typing and explicit dependency graphs
8
+ - Tasks, Resources, Events, Middleware as clear building blocks
9
+ - First-class observability and graceful lifecycle management
10
+ - Predictable releases with LTS options
11
+ - Enterprise support with SLAs and architecture guidance
12
+
13
+ ---
14
+
15
+ ## Why Enterprises Choose BlueLibs Runner
16
+
17
+ - Reliability by design
18
+ - Validated dependency graph at boot (dry-run option)
19
+ - Error boundaries and graceful shutdown
20
+ - Resilience patterns: retries, timeouts, caching
21
+
22
+ - Operability out of the box
23
+ - Structured logging with configurable formats
24
+ - System lifecycle events for orchestration
25
+ - Task- and event-level observability hooks
26
+
27
+ - Performance at scale
28
+ - Minimal-overhead DI and middleware
29
+ - Async-first, highly concurrent execution
30
+ - Benchmark guidance and tuning tips
31
+
32
+ - Low-risk adoption
33
+ - Type-safe contracts reduce defects
34
+ - Modular, opt-in architecture
35
+ - Clear upgrade and migration paths
36
+
37
+ ---
38
+
39
+ ## Long-Term Support (LTS) & Release Governance
40
+
41
+ We align with enterprise change management: stability, predictability, and controlled upgrades.
42
+
43
+ - Semantic Versioning
44
+ - Patch: bug/security fixes, no breaking changes
45
+ - Minor: backward-compatible improvements
46
+ - Major: planned, documented changes with migration guides
47
+
48
+ - LTS Policy (current)
49
+ - Version 6.x: actively maintained
50
+ - Version 5.x: LTS through December 31, 2026
51
+ - New features, improvements, and public API evolution land on 6.x first
52
+ - 5.x receives important maintenance and critical fixes during the LTS window
53
+
54
+ - Release channels
55
+ - Tagged releases: [GitHub Releases](https://github.com/bluelibs/runner/releases)
56
+
57
+ - Governance and Change Management
58
+ - Deprecation policy with advance notice
59
+ - Documented migration guides for major versions
60
+ - Release note entry for every public behavior/API change
61
+ - Optional dry-run to validate dependency graphs in CI
62
+
63
+ ---
64
+
65
+ ## Security & Compliance
66
+
67
+ Operate confidently under security review and audit.
68
+
69
+ - Security posture
70
+ - No telemetry by default
71
+ - Small, explicit surface area (functional DI, no hidden globals)
72
+ - Error boundary and controlled shutdown hooks
73
+
74
+ - Vulnerability management
75
+ - Rapid triage and patching for reported issues
76
+ - Coordinated disclosure process (contact below)
77
+ - Clear security advisories and release notes
78
+
79
+ - Supply chain considerations
80
+ - Deterministic builds via lockfiles
81
+ - Compatible with private registries/proxies
82
+ - Supports Node.js LTS releases
83
+
84
+ - Data handling
85
+ - Framework does not persist data by itself
86
+ - Configurable structured logging to avoid sensitive output
87
+
88
+ Security contact: theodor@bluelibs.com
89
+
90
+ ---
91
+
92
+ ## Operability: Observability, Resilience, and Lifecycle
93
+
94
+ - Observability
95
+ - Structured logger with multiple output strategies
96
+ - System-ready event to coordinate startup
97
+ - Debug modes for local/incident analysis
98
+
99
+ - Resilience
100
+ - Retry middleware with configurable strategies
101
+ - Timeout middleware (AbortController-based)
102
+ - Caching middleware for expensive operations
103
+
104
+ - Lifecycle
105
+ - Graceful shutdown (SIGINT/SIGTERM)
106
+ - Uncaught exception/rejection handling
107
+ - Resource disposal in reverse dependency order
108
+
109
+ ---
110
+
111
+ ## Compatibility & Environments
112
+
113
+ - TypeScript-first (strong types across tasks/resources/events)
114
+ - Node.js: `>=22` (matches package engines)
115
+ - TypeScript: `5.6+` recommended for best type inference and tooling parity
116
+ - Integrations:
117
+ - HTTP servers (e.g., Express)
118
+ - Message/event systems
119
+ - Async-capable data stores and services
120
+
121
+ ---
122
+
123
+ ## Support Plans
124
+
125
+ - Professional Support
126
+ - 4-hour response for urgent issues (business hours)
127
+ - Guidance on configuration, debugging, and best practices
128
+ - Covers one production application
129
+ - Up to 25 developers
130
+
131
+ - Enterprise Support
132
+ - 1-hour response for urgent issues, 24/7 for critical incidents
133
+ - Dedicated support engineer familiar with your setup
134
+ - Architecture reviews and performance guidance
135
+ - Covers one production application
136
+ - Up to 100 developers
137
+
138
+ - Strategic Support (Custom)
139
+ - Custom SLAs and escalation paths
140
+ - Multi-app, multi-team rollouts
141
+ - Training and quarterly reviews
142
+ - Input on roadmap and feature planning
143
+
144
+ Severity targets (typical):
145
+
146
+ - Sev-1 (Production down/data loss): 1h response (24/7), work until mitigated
147
+ - Sev-2 (Critical degradation, no workaround): 4h response, prioritized mitigation
148
+ - Sev-3 (Non-critical defect, workaround exists): next business day response
149
+ - Sev-4 (How-to/consulting): 2 business days
150
+
151
+ ---
152
+
153
+ ## Custom Work
154
+
155
+ Extend the framework for your environment without bespoke debt.
156
+
157
+ - Framework Extensions
158
+ - Custom middleware (security, compliance, integrations)
159
+ - Observability adapters
160
+
161
+ - Migration Tooling
162
+ - From legacy frameworks/systems
163
+ - Data/config transformations and compatibility shims
164
+
165
+ - Integration Adapters
166
+ - Message queues, proprietary protocols, legacy services
167
+
168
+ - Performance Engineering
169
+ - Profiling and optimization
170
+ - Tailored caching and resource pooling
171
+
172
+ Delivery model:
173
+
174
+ - Discovery → Fixed proposal (scope, timeline) → Dev & test → Documentation & handover
175
+
176
+ ---
177
+
178
+ ## Adoption Playbook
179
+
180
+ A practical, low-risk path from evaluation to production.
181
+
182
+ 1. Evaluation (days)
183
+
184
+ - Run example apps and benchmarks
185
+ - Validate logging, shutdown, retries/timeouts
186
+ - Use dry-run in CI to inspect dependency graphs
187
+
188
+ 2. Pilot (weeks)
189
+
190
+ - Wrap one service/workflow with tasks/resources/middleware
191
+ - Add observability and resilience policies
192
+ - Define SLOs and validate on staging
193
+
194
+ 3. Production rollout (weeks+)
195
+
196
+ - Phase-by-phase migration
197
+ - Architecture review and performance check
198
+ - Runbooks and on-call integration
199
+ - Formalize support plan and escalation
200
+
201
+ Success metrics:
202
+
203
+ - MTTR and incident count trending down
204
+ - Error rate and p95 latency stable or improved
205
+ - Predictable, low-friction upgrades
206
+
207
+ ---
208
+
209
+ ## Getting Started
210
+
211
+ - Schedule a call: theodor@bluelibs.com
212
+ - For rapid evaluations, share timelines and constraints to fast-track review.
213
+
214
+ Please include:
215
+
216
+ - Team size, criticality, and environment (cloud/on‑prem)
217
+ - Target Node.js/TypeScript versions
218
+ - Security/compliance requirements
219
+ - Desired timelines and success