@proompteng/temporal-bun-sdk 0.7.1 → 0.9.0
- package/README.md +80 -544
- package/dist/agent-readiness.json +46 -0
- package/dist/production-readiness.json +93 -0
- package/dist/src/bin/lint-workflows-command.d.ts.map +1 -1
- package/dist/src/bin/lint-workflows-command.js +24 -2
- package/dist/src/bin/lint-workflows-command.js.map +1 -1
- package/dist/src/bin/temporal-bun.d.ts.map +1 -1
- package/dist/src/bin/temporal-bun.js +20 -5
- package/dist/src/bin/temporal-bun.js.map +1 -1
- package/dist/src/otel/sdk-trace.d.ts.map +1 -1
- package/dist/src/otel/sdk-trace.js +5 -0
- package/dist/src/otel/sdk-trace.js.map +1 -1
- package/dist/src/workflow/executor.d.ts.map +1 -1
- package/dist/src/workflow/executor.js +41 -6
- package/dist/src/workflow/executor.js.map +1 -1
- package/dist/src/workflow/guards.d.ts.map +1 -1
- package/dist/src/workflow/guards.js +15 -0
- package/dist/src/workflow/guards.js.map +1 -1
- package/docs/agent-adoption-guide.md +61 -0
- package/docs/feature-matrix.md +31 -0
- package/docs/production-design.md +589 -0
- package/docs/production-readiness-implementation-plan.md +370 -0
- package/docs/release-runbook.md +48 -0
- package/docs/support-policy.md +76 -0
- package/docs/temporal-ci-cluster-requirement.md +36 -0
- package/docs/workflow-updates.md +107 -0
- package/package.json +21 -17
@@ -0,0 +1,370 @@

# Temporal Bun SDK Production Readiness Implementation Plan

_Last updated: May 5, 2026_

## Goal

Make `@proompteng/temporal-bun-sdk` the default Temporal choice for Bun-based
agents by turning the current project-proven runtime into a public,
repeatable, evidence-backed production library.

This is not a plan to load the official `@temporalio/worker` stack in Bun. The
SDK is a pure Bun implementation: package metadata only ships `dist`, `docs`,
`skills`, and `README.md`; package dependencies are protobuf/connect/Effect/TypeScript; and the
production verification gate rejects `@temporalio/worker`,
`@temporalio/core-bridge`, `node-gyp`, `dist/native`, and stale native Docker
paths.

## Actual Concern

The concern behind "not production ready" is not that Bun cannot load a Node
NAPI bridge. This SDK does not rely on that bridge.

The real concern is that Temporal worker correctness is a protocol and
determinism contract. The official SDK gets trust from Temporal-maintained Core,
long-running compatibility coverage, and years of production history. A pure
Bun worker can be production quality, but it must publish enough proof that:

- Workflow code cannot silently observe nondeterministic Bun/JS runtime state.
- Replay reconstruction matches real Temporal histories across feature and
  version combinations.
- Worker pollers, sticky queues, heartbeats, updates, shutdown, and retries hold
  under load, restart, and failure conditions.
- Protocol command materialization remains compatible with Temporal Server
  changes.
- Release artifacts make the package boundary, test results, and support matrix
  easy for agents and humans to verify.

The current library is already deployed and tested on the current infra. That
counts, but agents outside this project cannot infer it. The implementation
work is to convert private confidence into public, machine-checkable evidence.

## Code Read Findings

| Surface | Current implementation | Production gap to close |
| --- | --- | --- |
| Package boundary | `packages/temporal-bun-sdk/package.json` ships `dist`, `docs`, `skills`, and `README.md`; runtime dependencies are `@bufbuild/protobuf`, Connect, Effect, and TypeScript. `verify:production` runs `tests/packaging/manifest-packaging.test.ts`. | Keep this as a release gate and publish the result in a release manifest so agents can prove the package is not a Node/native wrapper. |
| Worker runtime | `src/worker/runtime.ts` owns config load, WorkflowService transport, workflow/activity pollers, sticky queues, deployment/build IDs, scheduler, metrics, plugins, graceful shutdown, determinism marker emission, and activity task lifecycle. | Add restart/chaos/soak scenarios for poll cancellation, sticky cache drift, task-not-found, heartbeat failure, tuner changes, and shutdown during active workflow/activity tasks. |
| Workflow execution | `src/workflow/executor.ts` runs registered workflows through Effect, creates `WorkflowCommandContext`, evaluates queries, processes updates, and materializes success/failure commands. | Add protocol golden tests that compare emitted commands and update protocol messages against captured histories and expected server-visible events. |
| Determinism guard | `src/workflow/determinism.ts` records command, random, time, signal, query, and update streams and throws `WorkflowNondeterminismError` on replay mismatch. | Add async interleaving fuzz tests and query-mode negative tests. Query handlers must never read live time/randomness as a hidden side channel. |
| Runtime guards | `src/workflow/guards.ts` patches `Date`, `Date.now`, `Math.random`, `crypto.randomUUID`, `crypto.getRandomValues`, `fetch`, timers, `performance.now`, `WebSocket`, `Bun.spawn`, and `Bun.nanoseconds`; `WorkflowExecutor` requires strict guards in production. | Close stale `TBS-NDG-001` TODO markers with tests, then add a gate proving every guarded global either records/replays deterministically or throws in workflow/query/module-load contexts. |
| Static workflow lint | `src/bin/lint-workflows-command.ts` walks workflow import graphs and denies unsafe imports/globals/member expressions. Tests cover `fetch`, `process.env`, captured `Date.now`, and importing client APIs from workflows. | Add rules for dangerous Promise/Effect escape hatches, dynamic eval/function creation, timers captured through aliases, and Bun runtime APIs not currently listed. Make release CI fail if configured workflow entries are missing. |
| Replay | `src/workflow/replay.ts` ingests real histories, applies full/delta determinism markers, reconstructs command history, tracks updates, and diffs mismatch metadata. Stored fixtures currently cover timer, activity retry, and child/continue-as-new histories. | Scale from a small fixture set to a versioned corpus that covers every supported command/event pair, updates, signals, queries, cancellations, payload codecs, search attributes, memo, markers, failures, sticky replay, and old SDK versions. |
| Integration | `tests/integration/**` covers history replay, activity lifecycle, query-only workflows, signal/query, workflow updates, payload codecs, client resilience, worker ops, schedules, and worker runtime behavior behind `TEMPORAL_INTEGRATION_TESTS=1`. | Split optional service-unavailable skips from release-blocking skips. In release CI, a missing dev server or unimplemented critical endpoint must fail instead of silently reducing coverage. |
| Load | `tests/integration/load/**` submits CPU, activity, and update workflows, checks throughput, sticky hit ratio, and poll p95 latency, and writes JSONL/report artifacts. | Add duration mode, memory/heap samples, worker restart mode, Temporal endpoint interruption, sticky cache churn, and nightly/weekly soak thresholds. |
| CI | `.github/workflows/temporal-bun-sdk.yml` builds, lints, tests, runs `verify:production`, runs load checks, and uploads load artifacts. | Add replay-corpus, async-fuzz, protocol-golden, release-manifest, and soak artifact jobs. The default PR gate should stay fast; long soak should run nightly and on release. |

## Production Bar

The library becomes the obvious default when every release can point to a
single production-readiness artifact containing:

- Package boundary proof: no official Node worker dependency, no native bridge,
  no `node-gyp`, no `dist/native`, no stale native Docker path.
- Determinism proof: replay corpus result, async fuzz result, query/global guard
  matrix, and mismatch diagnostics samples.
- Protocol proof: command/update/query/signal/activity golden tests by Temporal
  Server version and SDK version.
- Operations proof: load and soak reports with throughput, latency, sticky cache
  ratio, heartbeat retries/failures, worker failures, memory slope, and restart
  outcomes.
- Compatibility proof: supported Bun, Temporal Server, Temporal Cloud, TLS/mTLS,
  payload codec, and deployment/versioning matrix.
- Support proof: documented feature status, known limits, upgrade policy,
  security/reporting contact, and release deprecation policy.

## Implementation Phases

### P0 - Credibility Gate And Public Evidence

Purpose: make the current hardening visible and stop regressions.

Implementation:

- Add `scripts/production-readiness/collect-release-evidence.ts`.
- Generate `dist/production-readiness.json` during `prepack`.
- Include package boundary, Bun/Temporal versions, git SHA, command results,
  replay fixture count, integration skip count, load report path, and docs hash.
- Extend `verify:production` to validate the generated evidence schema.
- Remove or rename stale `TODO(TBS-NDG-*)` comments once the matching tests prove
  they are complete; leave new TODOs only for real unfinished work.
- Add a release CI step that uploads `production-readiness.json` with npm pack
  output and worker-load artifacts.

Acceptance:

- `bun run --filter @proompteng/temporal-bun-sdk verify:production` fails if the
  evidence file is missing, malformed, or reports native/Node worker artifacts.
- `npm pack --dry-run --json` output can be matched to the same evidence file.
- Docs link to the evidence fields so agents can make a yes/no decision without
  reading the whole repo.
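
The evidence check in the acceptance list above could look roughly like the following. This is a minimal sketch only: the plan lists what the manifest should contain but not its schema, so the `ReleaseEvidence` field names and the `validateEvidence` helper are hypothetical.

```typescript
// Hypothetical shape for dist/production-readiness.json. The plan lists what the
// file should contain but not its schema, so every field name is an assumption.
interface ReleaseEvidence {
  gitSha: string
  bunVersion: string
  replayFixtureCount: number
  integrationSkipCount: number
  forbiddenArtifacts: string[] // native/Node worker artifacts found in the package
}

// Returns human-readable problems; an empty array means the gate passes.
function validateEvidence(raw: unknown): string[] {
  if (typeof raw !== 'object' || raw === null) {
    return ['evidence is missing or not an object']
  }
  const evidence = raw as Partial<Record<keyof ReleaseEvidence, unknown>>
  const problems: string[] = []
  for (const key of ['gitSha', 'bunVersion'] as const) {
    const value = evidence[key]
    if (typeof value !== 'string' || value.length === 0) {
      problems.push(`${key} must be a non-empty string`)
    }
  }
  for (const key of ['replayFixtureCount', 'integrationSkipCount'] as const) {
    const value = evidence[key]
    if (typeof value !== 'number' || !Number.isInteger(value) || value < 0) {
      problems.push(`${key} must be a non-negative integer`)
    }
  }
  const forbidden = evidence.forbiddenArtifacts
  if (!Array.isArray(forbidden)) {
    problems.push('forbiddenArtifacts must be an array')
  } else if (forbidden.length > 0) {
    problems.push(`native/Node worker artifacts reported: ${forbidden.join(', ')}`)
  }
  return problems
}
```

A gate built this way would fail whenever the returned array is non-empty, which covers the three failure modes named above: missing, malformed, or reporting native artifacts.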

### P1 - Determinism And Async Semantics Audit

Purpose: answer the specific "Bun async semantics" concern.

Implementation:

- Add `tests/workflow/query-guard-matrix.test.ts`.
  - Assert `Date`, `Date.now`, `Math.random`, `crypto.randomUUID`,
    `crypto.getRandomValues`, `performance.now`, timers, fetch, WebSocket, and
    Bun process APIs either throw `WorkflowQueryViolationError` in query mode or
    replay from prior determinism state. No query path may read live wall-clock
    or random state.
- Add `tests/workflow/async-determinism-fuzz.test.ts`.
  - Generate seeded workflows that interleave Effect yields, Promise microtasks,
    timers via workflow APIs, activities, signals, queries, updates, sideEffect,
    getVersion, patch, and local activities.
  - Run each seed twice: initial execution records state; replay must produce
    identical output and zero `diffDeterminismState` mismatches.
  - Mutate one command/random/time/query/update slot per seed and assert the
    mismatch is detected with useful metadata.
- Extend `lint-workflows` rules to flag hidden async escape hatches:
  - raw `Promise` constructors in workflow modules unless allowlisted;
  - `Effect.tryPromise` and `Effect.promise` in workflow handlers unless called
    through SDK-provided deterministic adapters;
  - captured unsafe globals through aliases;
  - `eval`, `Function`, dynamic import, and Bun runtime APIs beyond the current
    deny list.
- Add a CLI gate that runs `temporal-bun lint-workflows` in strict JSON mode and
  writes `.artifacts/workflow-lint/report.json`.
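
The "run each seed twice, then mutate one slot" idea from the fuzz bullets above can be illustrated with a toy model. This sketch does not use the SDK: `runToyWorkflow`, `Slot`, and `firstMismatch` are hypothetical stand-ins for a real workflow run, the recorded determinism state, and `diffDeterminismState`.

```typescript
// Small deterministic PRNG (mulberry32 style) so a seed always reproduces a run.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0
  return () => {
    a = (a + 0x6d2b79f5) >>> 0
    let t = a
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

type Slot = { kind: 'command' | 'random' | 'time'; value: string }

// Toy stand-in for a workflow: yields to the microtask queue between steps and
// records one slot per step, driven only by the seeded PRNG.
async function runToyWorkflow(seed: number): Promise<Slot[]> {
  const next = mulberry32(seed)
  const kinds: Slot['kind'][] = ['command', 'random', 'time']
  const log: Slot[] = []
  for (let step = 0; step < 8; step++) {
    await Promise.resolve()
    const kind = kinds[Math.floor(next() * kinds.length)] ?? 'command'
    log.push({ kind, value: next().toFixed(6) })
  }
  return log
}

// Index of the first slot that differs, or -1 when both logs agree.
function firstMismatch(recorded: Slot[], replayed: Slot[]): number {
  const length = Math.max(recorded.length, replayed.length)
  for (let i = 0; i < length; i++) {
    const a = recorded[i]
    const b = replayed[i]
    if (!a || !b || a.kind !== b.kind || a.value !== b.value) return i
  }
  return -1
}
```

A fuzz suite in this style would loop over a stable seed file, assert `firstMismatch(first, second) === -1` for two runs of the same seed, then corrupt one slot and assert that the reported index matches the corrupted position.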

Acceptance:

- Runtime guard, query guard matrix, and async fuzz suites pass locally and in
  CI.
- Fuzz default: at least 1,000 seeds in PR CI under a stable seed file.
- Fuzz release/nightly: at least 10,000 seeds with artifacted seed failures.
- Any workflow query reading live time/randomness is a release blocker.
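
To make the query rule above concrete, here is a minimal sketch of a guard that throws when a query handler touches live time or randomness. The error name comes from this plan, but the class and the `runInQueryMode` helper below are local stand-ins written for illustration, not the SDK's `src/workflow/guards.ts` implementation, and only two of the guarded globals are covered.

```typescript
// Local stand-in for the SDK error named in this plan.
class WorkflowQueryViolationError extends Error {
  constructor(api: string) {
    super(`${api} is not allowed while evaluating a workflow query`)
    this.name = 'WorkflowQueryViolationError'
  }
}

// Runs `handler` with Date.now and Math.random replaced by throwing stubs and
// restores the originals afterwards, even when the handler throws.
function runInQueryMode<T>(handler: () => T): T {
  const originalNow = Date.now
  const originalRandom = Math.random
  Date.now = () => {
    throw new WorkflowQueryViolationError('Date.now')
  }
  Math.random = () => {
    throw new WorkflowQueryViolationError('Math.random')
  }
  try {
    return handler()
  } finally {
    Date.now = originalNow
    Math.random = originalRandom
  }
}
```

A matrix test would call `runInQueryMode` once per guarded API and assert that each call throws, while a handler that only reads workflow state returns normally.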

### P2 - Replay Corpus Expansion

Purpose: prove replay against real Temporal histories, not only synthetic unit
fixtures.

Implementation:

- Add `tests/replay/corpus/manifest.json`.
  - Fields: fixture name, Temporal Server version, SDK version, Bun version,
    workflow type, feature tags, history event count, expected command count,
    payload codec profile, captured date, and source command.
- Add `scripts/replay/capture-corpus-fixture.ts`.
  - Starts or references a workflow, fetches history through WorkflowService or
    Temporal CLI, normalizes JSON, runs `ingestWorkflowHistory`, stores expected
    determinism state, and updates the manifest.
- Add `scripts/replay/verify-corpus.ts`.
  - Runs all fixtures, rejects empty corpus, rejects duplicate tags without a
    replacement marker, rejects unsupported schema versions, and writes
    `.artifacts/replay-corpus/report.json`.
- Keep the existing small `tests/replay/fixtures.test.ts`, but make the larger
  corpus the release gate.
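
The manifest checks listed for `verify-corpus.ts` can be sketched as a pure function over the parsed manifest. Everything here is an assumption for illustration: the plan does not define the manifest schema, so `schemaVersion`, `featureTags`, and the `replaces` field (one possible reading of "replacement marker") are hypothetical names.

```typescript
interface CorpusFixture {
  name: string
  featureTags: string[]
  replaces?: string // hypothetical replacement marker: the fixture this one supersedes
}

interface CorpusManifest {
  schemaVersion: number
  fixtures: CorpusFixture[]
}

const SUPPORTED_SCHEMA_VERSIONS = [1] // assumed; the plan does not name versions

// Returns a list of rejection reasons; an empty list means the corpus is accepted.
function verifyCorpusManifest(manifest: CorpusManifest, minFixtures: number): string[] {
  const errors: string[] = []
  if (!SUPPORTED_SCHEMA_VERSIONS.includes(manifest.schemaVersion)) {
    errors.push(`unsupported schema version ${manifest.schemaVersion}`)
  }
  if (manifest.fixtures.length === 0) {
    errors.push('corpus is empty')
  } else if (manifest.fixtures.length < minFixtures) {
    errors.push(`expected at least ${minFixtures} fixtures, found ${manifest.fixtures.length}`)
  }
  // Two fixtures with the same tag set are duplicates unless the later one
  // explicitly names the earlier one as replaced.
  const seen = new Map<string, string>()
  for (const fixture of manifest.fixtures) {
    const key = [...fixture.featureTags].sort().join('+')
    const previous = seen.get(key)
    if (previous !== undefined && fixture.replaces !== previous) {
      errors.push(`fixture ${fixture.name} duplicates the tags of ${previous}`)
    }
    seen.set(key, fixture.name)
  }
  return errors
}
```

Passing `minFixtures` as a parameter lets one function serve the PR, release, and default-choice fixture counts.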

Coverage targets:

- Activities: success, retry success, retry exhaustion, cancellation, heartbeat
  timeout, non-retryable failure, payload codec failure details.
- Timers: fire, cancel, replay after marker, long timeout normalization.
- Child workflows: start, complete, fail, cancel, terminate, timeout, pending
  child replay.
- Continue-as-new and versioning: marker chain, getVersion, patch/deprecate,
  old-build replay.
- Signals and queries: signal delivery order, query-only task, legacy query,
  multi-query, query failure.
- Updates: admitted, accepted, rejected, completed success/failure, duplicate
  update ID, cancellation while awaiting.
- Search attributes and memo: upsert and workflow properties modified.
- Nexus operations: schedule, complete, fail, cancel, timeout.
- Payload codecs: JSON, binary tunnel, gzip, AES-GCM, key rotation metadata.

Acceptance:

- PR gate: at least 25 high-signal corpus fixtures.
- Release gate: at least 75 fixtures covering every GA-critical command/event
  family.
- Default-choice gate: at least 150 fixtures across at least two Temporal Server
  minor versions and two SDK minor versions.

### P3 - Protocol Golden And Compatibility Tests

Purpose: prove the custom Bun worker emits server-compatible commands.

Implementation:

- Add `tests/protocol/command-golden.test.ts`.
  - Given workflow command intents, materialize commands and compare stable JSON
    projections of `Command` protos.
  - Include headers, memo, search attributes, retry policies, parent close
    policy, task queues, timeouts, cancellation, continue-as-new, and Nexus.
- Add `tests/protocol/update-protocol-golden.test.ts`.
  - Validate `buildUpdateProtocolMessages` output for acceptance, rejection,
    success completion, failure completion, and replay duplicate suppression.
- Add `tests/protocol/history-roundtrip.test.ts`.
  - For selected corpus fixtures, reconstruct determinism state, replay the
    workflow, materialize commands, and assert stable equivalence against the
    next command-bearing history events.
- Add a Temporal compatibility matrix job:
  - local dev server for PRs;
  - pinned Temporal Server minors in nightly;
  - Temporal Cloud smoke in release when credentials are present.

Acceptance:

- Every command kind in `src/workflow/commands.ts` has a golden projection.
- Every event kind handled in `src/workflow/replay.ts` has at least one corpus
  fixture or an explicit unsupported-status entry.
- Protocol golden tests fail on accidental protobuf field drift.
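
A "stable JSON projection" for golden tests needs key order to be independent of how the object was built. One common way to get that is a canonical stringifier with sorted keys, sketched below. The `stableStringify` helper and the sample command object are illustrative assumptions, not the SDK's projection code or real `Command` proto fields.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

// Serializes a JSON value with object keys sorted, so two structurally equal
// values always produce the same string regardless of insertion order.
function stableStringify(value: Json): string {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value)
  }
  if (Array.isArray(value)) {
    // Array order is significant and is preserved.
    return `[${value.map((item) => stableStringify(item)).join(',')}]`
  }
  const record: { [key: string]: Json } = value
  const parts: string[] = []
  for (const key of Object.keys(record).sort()) {
    parts.push(`${JSON.stringify(key)}:${stableStringify(record[key] ?? null)}`)
  }
  return `{${parts.join(',')}}`
}

// Hypothetical command intent used only to show the comparison.
const projected = stableStringify({ taskQueue: 'default', attempt: 1, input: ['a', 'b'] })
const golden = stableStringify({ input: ['a', 'b'], attempt: 1, taskQueue: 'default' })
const matchesGolden = projected === golden
```

Because any added, removed, or renamed field changes the string, a comparison like `matchesGolden` is what makes accidental field drift show up as a test failure.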

### P4 - Worker Lifecycle, Chaos, And Soak

Purpose: prove the runtime holds under real operating conditions.

Implementation:

- Extend `tests/integration/load/config.ts` with:
  - `TEMPORAL_LOAD_TEST_DURATION_MS`;
  - `TEMPORAL_LOAD_TEST_RESTART_INTERVAL_MS`;
  - `TEMPORAL_LOAD_TEST_ENDPOINT_BLACKHOLE_MS`;
  - `TEMPORAL_LOAD_TEST_STICKY_CHURN_RATIO`;
  - `TEMPORAL_LOAD_TEST_MEMORY_SLOPE_MAX_MB_PER_HOUR`;
  - `TEMPORAL_LOAD_TEST_SHUTDOWN_DURING_ACTIVITY_RATIO`.
- Extend `tests/integration/load/runner.ts` to:
  - sample RSS/heap and write `memory.jsonl`;
  - periodically shut down and recreate `WorkerRuntime`;
  - interrupt pollers and confirm `#withRpcAbort` and scheduler shutdown do not
    strand tasks;
  - force sticky cache eviction and drift rebuilds;
  - track task-not-found, nondeterminism, heartbeat retry/failure, activity
    failure, workflow failure, and sticky heal metrics.
- Add `scripts/run-worker-soak.ts`.
  - Duration-mode wrapper around the load runner with stricter artifact
    validation.
- Add `.github/workflows/temporal-bun-sdk-nightly.yml`.
  - Runs 2-hour soak nightly.
  - Runs 6-hour soak on release branch.
  - Allows 24-hour soak manually before declaring the package default for new
    agents.

Acceptance:

- PR load smoke: current short load test remains under 10 minutes.
- Nightly soak: 2 hours, zero stuck workflows, no unhandled runtime rejection,
  memory slope below threshold, sticky heal rate below threshold, and all
  metrics artifacts uploaded.
- Release soak: 6 hours against pinned Temporal Server with restart mode
  enabled.
- Default-choice soak: one 24-hour run with restart mode, sticky churn, payload
  codecs, updates, and activities enabled.
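
The "memory slope below threshold" criterion implies fitting a trend line through the sampled RSS values. A minimal sketch, assuming samples shaped like the hypothetical `MemorySample` below (the plan does not define the `memory.jsonl` record format), is an ordinary least-squares slope converted to MB per hour:

```typescript
// Assumed sample shape; the real memory.jsonl format is not specified here.
interface MemorySample {
  elapsedMs: number
  rssBytes: number
}

// Ordinary least-squares slope of RSS over time, in MiB per hour.
function memorySlopeMbPerHour(samples: MemorySample[]): number {
  const n = samples.length
  if (n < 2) return 0
  const hours = samples.map((s) => s.elapsedMs / 3_600_000)
  const mib = samples.map((s) => s.rssBytes / (1024 * 1024))
  const meanX = hours.reduce((sum, x) => sum + x, 0) / n
  const meanY = mib.reduce((sum, y) => sum + y, 0) / n
  let numerator = 0
  let denominator = 0
  for (let i = 0; i < n; i++) {
    const dx = (hours[i] ?? 0) - meanX
    numerator += dx * ((mib[i] ?? 0) - meanY)
    denominator += dx * dx
  }
  // All samples at the same instant: no trend can be computed.
  return denominator === 0 ? 0 : numerator / denominator
}

// Compares the fitted slope with a limit such as the value of
// TEMPORAL_LOAD_TEST_MEMORY_SLOPE_MAX_MB_PER_HOUR.
function memorySlopeWithinLimit(samples: MemorySample[], maxMbPerHour: number): boolean {
  return memorySlopeMbPerHour(samples) <= maxMbPerHour
}
```

A regression line is less sensitive to a single garbage-collection spike than comparing only the first and last samples, which is why it suits multi-hour soak runs. In Bun or Node the samples themselves could come from `process.memoryUsage().rss`.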

### P5 - CI Skip Policy And Release Blocking Rules

Purpose: prevent false confidence from skipped integration tests.

Implementation:

- Add a test reporter that counts Bun skipped tests and scenario-level "skipped"
  console warnings for the SDK package.
- Add `TEMPORAL_BUN_RELEASE_GATE=1`.
  - In release mode, missing Temporal CLI/dev server, missing corpus fixtures,
    and critical integration skips are failures.
  - Optional external services remain optional only if the skip is tagged with a
    non-critical capability, for example Temporal Cloud Ops without credentials.
- Split integration suites into:
  - `test:integration:critical`;
  - `test:integration:optional`;
  - `test:load`;
  - `test:soak`.
- Publish `.artifacts/temporal-bun-sdk/test-summary.json`.

Acceptance:

- Release CI fails if a critical integration test silently skips.
- The release manifest includes skip counts by suite and reason.
- PR CI remains usable without requiring Temporal Cloud credentials.
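
The skip policy above reduces to one decision: count every skip, but only let critical skips block when the release gate is on. A minimal sketch follows; `SkipRecord` and `evaluateSkips` are hypothetical names, and the `releaseGate` flag stands in for `TEMPORAL_BUN_RELEASE_GATE=1` being set.

```typescript
// Assumed record shape for one skipped test; not a format defined by the plan.
interface SkipRecord {
  suite: string
  test: string
  reason: string
  critical: boolean // false for optional capabilities such as Cloud Ops without credentials
}

interface SkipVerdict {
  ok: boolean
  countsBySuite: Record<string, number>
  blocking: SkipRecord[]
}

// Counts skips per suite and, in release mode only, treats critical skips as failures.
function evaluateSkips(skips: SkipRecord[], releaseGate: boolean): SkipVerdict {
  const countsBySuite: Record<string, number> = {}
  for (const skip of skips) {
    countsBySuite[skip.suite] = (countsBySuite[skip.suite] ?? 0) + 1
  }
  const blocking = releaseGate ? skips.filter((skip) => skip.critical) : []
  return { ok: blocking.length === 0, countsBySuite, blocking }
}
```

Returning the per-suite counts alongside the verdict gives the release manifest the skip totals it needs without a second pass over the test output.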

### P6 - Docs, Support Policy, And Agent Default Metadata

Purpose: make choosing the SDK easy and defensible.

Implementation:

- Add `docs/support-policy.md`.
  - Supported Bun versions, Temporal Server versions, Temporal Cloud support,
    OS/arch, payload codecs, known limits, security reporting, and deprecation
    policy.
- Add `docs/feature-matrix.md`.
  - Feature status with test/corpus references and unsupported/experimental
    labels.
- Add `docs/agent-adoption-guide.md`.
  - Short guidance for agents: when to choose `@proompteng/temporal-bun-sdk`,
    required gates, project scaffolding command, Docker command, and fallback
    criteria.
- Add `dist/agent-readiness.json`.
  - `recommended: true` only when the production-readiness artifact meets all
    default-choice thresholds.
  - Include machine-readable gate names, versions, artifact paths, and known
    limits.
- Update `apps/docs/content/docs/temporal-bun-sdk.mdx` and
  `apps/docs/content/docs/temporal-bun-sdk-comparison.mdx` to link the evidence
  instead of asking users to trust prose.

Acceptance:

- An agent can answer "is this the default Bun Temporal SDK?" by reading
  `agent-readiness.json` plus the release manifest.
- Public docs explain the actual concern: pure Bun worker correctness proof and
  support policy, not Node NAPI removal.
- The comparison page no longer leaves "production ready" as an ambiguous
  phrase.

## Gate Matrix

| Gate | Command | PR | Release | Default-choice |
| --- | --- | --- | --- | --- |
| Package boundary | `bun run --filter @proompteng/temporal-bun-sdk verify:production` | required | required | required |
| Build | `bun run --filter @proompteng/temporal-bun-sdk build` | required | required | required |
| Unit/runtime guards | `bun test tests/workflow/runtime-guards.test.ts` | required | required | required |
| Query guard matrix | `bun test tests/workflow/query-guard-matrix.test.ts` | required | required | required |
| Async fuzz | `bun test tests/workflow/async-determinism-fuzz.test.ts` | 1k seeds | 10k seeds | 10k+ seeds, last 7 days green |
| Replay corpus | `bun scripts/replay/verify-corpus.ts` | 25 fixtures | 75 fixtures | 150 fixtures |
| Protocol golden | `bun test tests/protocol/*.test.ts` | required | required | required |
| Critical integration | `TEMPORAL_INTEGRATION_TESTS=1 bun test tests/integration` | required | no critical skips | no critical skips |
| Load smoke | `bun run --filter @proompteng/temporal-bun-sdk test:load` | required | required | required |
| Soak | `bun scripts/run-worker-soak.ts` | optional | 6h | 24h |
| Docs | `bun run --filter docs build` | required when docs change | required | required |
| Pack | `npm pack --dry-run --json` | optional | required | required |
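
The numeric rows of the matrix above (async fuzz seeds, replay fixtures, soak hours) lend themselves to a machine-readable check. The sketch below encodes only those three rows. The `Tier`, `Measured`, and `meetsTier` names are hypothetical, and the "last 7 days green" condition for the default-choice tier is not modeled.

```typescript
type Tier = 'pr' | 'release' | 'default-choice'

interface Measured {
  fuzzSeeds: number
  corpusFixtures: number
  soakHours: number
}

// Minimums copied from the Async fuzz, Replay corpus, and Soak rows.
// Soak is optional for PRs, so its PR minimum is zero.
const TIER_MINIMUMS: Record<Tier, Measured> = {
  pr: { fuzzSeeds: 1_000, corpusFixtures: 25, soakHours: 0 },
  release: { fuzzSeeds: 10_000, corpusFixtures: 75, soakHours: 6 },
  'default-choice': { fuzzSeeds: 10_000, corpusFixtures: 150, soakHours: 24 },
}

// True when every measured value reaches the tier's minimum.
function meetsTier(measured: Measured, tier: Tier): boolean {
  const minimum = TIER_MINIMUMS[tier]
  return (
    measured.fuzzSeeds >= minimum.fuzzSeeds &&
    measured.corpusFixtures >= minimum.corpusFixtures &&
    measured.soakHours >= minimum.soakHours
  )
}
```

A generator for `agent-readiness.json` could use `meetsTier(measured, 'default-choice')` as one input to `recommended`, alongside the pass/fail gates that this sketch leaves out.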

## First Implementation Order

1. **P0 evidence manifest.** This is the fastest way to make the already-shipped
   production boundary and current CI/load results visible to agents.
2. **P1 query/async guard tests.** This directly answers the ChatGPT concern
   about Bun async semantics and hidden nondeterminism.
3. **P2 replay corpus runner.** This converts the existing replay engine into
   durable proof across real histories.
4. **P3 protocol golden tests.** This protects the pure-Bun command
   materializer from Temporal protobuf/server drift.
5. **P4 soak mode.** This moves the load harness from a smoke test to
   operational proof.
6. **P6 agent metadata.** Only mark the SDK as default when the evidence gates
   are green, not because the docs claim it.

## Non-Goals

- Running `@temporalio/worker` on Bun.
- Depending on Node NAPI, `@temporalio/core-bridge`, `node-gyp`, or a native
  worker bundle.
- Byte-for-byte equivalence with the official SDK internals.
- Hiding unsupported features. Missing compatibility should be listed in the
  feature matrix and reflected in `agent-readiness.json`.

## Definition Of Done

The SDK is production-default for agents when:

- `agent-readiness.json` says `recommended: true`.
- The latest release manifest proves all default-choice gates are green.
- The replay corpus covers every GA-critical workflow/event family.
- Async fuzz and query guard tests have no open determinism escapes.
- A recent 24-hour soak run is linked from release artifacts.
- Docs and comparison pages explain that this is a pure Bun SDK with public
  evidence, not an unofficial wrapper around the Node worker.

@@ -0,0 +1,48 @@

# Temporal Bun SDK Release Runbook

Automation now drives every step via `.github/workflows/temporal-bun-sdk.yml`. Use
this runbook as a quick reference for triggering and validating releases.

> **Status:** TBS-009 (Release Automation) closed on 2025-11-17 via the trusted
> publishing workflow; this document captures the final process.

## Prerequisites

- npm trusted publishing is enabled for `proompteng/lab → temporal-bun-sdk.yml`
  (Trusted Publisher entry in the npm org). No automation token is required.
- Maintainer access to run the Temporal Bun SDK workflow manually on `main`.
- Desired npm dist-tag (`latest`, `beta`, etc.) for the publish run.
- Buf CLI installed (`buf` on `PATH`) if you plan to run the proto regeneration
  script locally.

## Release Steps

1. **Prepare mode**
   - Trigger the workflow with `release_mode=prepare` (optionally set
     `npm_tag`). This runs release-please to open/update the automated release
     PR (`release-please--branches--main--components--temporal-bun-sdk`).
   - The job runs Oxfmt + build + unit + load suites against the release PR
     branch so we can verify artifacts before merging. Review the PR and merge
     after CI is green.
2. **Publish mode**
   - Re-run the workflow with `release_mode=publish` (set `dry_run=true` for a
     rehearsal) **from the `main` branch only**; the workflow refuses to run on
     other refs. The job reads the merged `package.json` version on `main`, reruns
     the validations, then executes
     `npm publish --provenance --access public --tag <dist>`.
   - Keep the workflow logs + artifacts linked to the tracking issue/PR as proof
     of the dry-run and the final publish.
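
The publish step above combines two rules: refuse any ref other than `main`, and pass the chosen dist-tag to `npm publish`. The sketch below shows one way a script could assemble that command line. It is not the workflow's actual implementation: `buildPublishArgs` is a hypothetical helper, the tag pattern is a deliberately conservative assumption, and mapping `dry_run=true` to npm's `--dry-run` flag is a guess about how the rehearsal works.

```typescript
interface PublishInputs {
  ref: string // full git ref, for example refs/heads/main
  npmTag: string // dist-tag such as latest or beta
  dryRun: boolean
}

// Builds the argument list for `npm`, or throws when the inputs are not allowed.
function buildPublishArgs(inputs: PublishInputs): string[] {
  if (inputs.ref !== 'refs/heads/main') {
    throw new Error(`publish mode only runs on main, got ${inputs.ref}`)
  }
  // Assumed rule: lowercase letters, digits, and dashes, starting with a letter.
  if (!/^[a-z][a-z0-9-]*$/.test(inputs.npmTag)) {
    throw new Error(`unexpected dist-tag: ${inputs.npmTag}`)
  }
  const args = ['publish', '--provenance', '--access', 'public', '--tag', inputs.npmTag]
  if (inputs.dryRun) {
    args.push('--dry-run')
  }
  return args
}
```

Building the arguments as an array rather than a shell string avoids quoting problems if the tag ever comes from user-supplied workflow input.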

## Support & Incident Response

- Direct any security or incident reports to `security@proompteng.ai`.

## Post-Release Checklist

- Confirm the npm release metadata (version, dist-tag, provenance) matches the
  workflow output.
- Close the tracking issue (e.g., #1788) and note the publish link.
- Schedule any follow-up docs/DevRel announcements if required.
- If upstream Temporal protos changed, trigger the "Temporal Bun SDK Proto
  Regen" workflow (or confirm the scheduled run already merged) so generated
  sources stay current.

@@ -0,0 +1,76 @@

# Temporal Bun SDK Support Policy

_Last updated: May 5, 2026_

## Supported Runtime Matrix

| Surface | Supported | Notes |
| --- | --- | --- |
| Bun | `>=1.3.13` | CI pins Bun `1.3.13`; newer Bun releases must pass the SDK test, replay, and load gates before being documented as preferred. |
| Node | Build tooling only | The worker runtime is Bun TypeScript. Node is used by repository tooling and GitHub Actions setup, not by the published worker runtime. |
| Temporal Server | Current CI cluster plus pinned dev server histories | Replay-corpus expansion must add explicit server minor coverage before the SDK is marked default-choice recommended. |
| Temporal Cloud | Client/worker TLS and mTLS paths are supported when credentials are configured. | Cloud Ops tests remain optional unless credentials are present. |
| OS/arch | Linux arm64 in CI, local macOS development | Additional platforms require package-boundary, build, test, replay, and load evidence. |
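
The `>=1.3.13` requirement in the matrix above has to be compared numerically, because a plain string comparison would rank `1.3.9` above `1.3.13`. A minimal sketch of such a check follows. `parseVersion` and `satisfiesMinimum` are hypothetical helpers, and pre-release suffixes are simply ignored rather than ordered according to full semver rules.

```typescript
// Parses "major.minor.patch", dropping any pre-release or build suffix.
function parseVersion(version: string): [number, number, number] {
  const core = version.split(/[-+]/)[0] ?? ''
  const parts = core.split('.').map((part) => Number.parseInt(part, 10))
  const [major = 0, minor = 0, patch = 0] = parts
  if ([major, minor, patch].some((n) => Number.isNaN(n))) {
    throw new Error(`not a semantic version: ${version}`)
  }
  return [major, minor, patch]
}

// True when `actual` is greater than or equal to `minimum`, component by component.
function satisfiesMinimum(actual: string, minimum: string): boolean {
  const a = parseVersion(actual)
  const m = parseVersion(minimum)
  for (let i = 0; i < 3; i++) {
    const left = a[i] ?? 0
    const right = m[i] ?? 0
    if (left !== right) return left > right
  }
  return true
}
```

In a Bun process the running version is available as `Bun.version`, so a startup check could call `satisfiesMinimum(Bun.version, '1.3.13')`.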

## Supported Package Boundary

The published package is expected to contain only:

- `dist`;
- `docs`;
- `skills`;
- `README.md`.

The production gate rejects:

- `@temporalio/worker`;
- `@temporalio/core-bridge`;
- `@temporalio/client`;
- `node-gyp`, `node-addon-api`, `node-gyp-build`;
- native artifacts such as `.node`, `.dylib`, `.so`, and `.a`;
- stale native bridge paths such as `bruke`, `native`, and `dist/native`.
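
The rejection list above maps directly onto a check over the package manifest and the list of packed files. The sketch below is not the SDK's `verify:production` implementation. `findBoundaryViolations` and `PackageManifest` are hypothetical, and the file list is assumed to come from something like `npm pack --dry-run --json` output.

```typescript
const FORBIDDEN_PACKAGES = [
  '@temporalio/worker',
  '@temporalio/core-bridge',
  '@temporalio/client',
  'node-gyp',
  'node-addon-api',
  'node-gyp-build',
]
const NATIVE_EXTENSIONS = ['.node', '.dylib', '.so', '.a']
// `dist/native` is caught by matching the `native` path segment.
const STALE_PATH_SEGMENTS = ['bruke', 'native']

interface PackageManifest {
  dependencies?: Record<string, string>
  optionalDependencies?: Record<string, string>
}

// Returns one message per violation; an empty array means the boundary holds.
function findBoundaryViolations(manifest: PackageManifest, packedFiles: string[]): string[] {
  const violations: string[] = []
  const declared = { ...manifest.dependencies, ...manifest.optionalDependencies }
  for (const name of FORBIDDEN_PACKAGES) {
    if (name in declared) violations.push(`forbidden dependency: ${name}`)
  }
  for (const file of packedFiles) {
    if (NATIVE_EXTENSIONS.some((extension) => file.endsWith(extension))) {
      violations.push(`native artifact: ${file}`)
    }
    if (file.split('/').some((segment) => STALE_PATH_SEGMENTS.includes(segment))) {
      violations.push(`stale native path: ${file}`)
    }
  }
  return violations
}
```

Checking path segments instead of substrings keeps a file such as `docs/native-codecs.md` from being flagged by accident.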

## Required Release Evidence

Every release should publish or upload:

- `dist/production-readiness.json`;
- `dist/agent-readiness.json`;
- replay corpus report;
- worker load report;
- npm pack or publish provenance output.

`agent-readiness.json` must not set `recommended: true` until the replay corpus,
async determinism fuzz, and soak evidence meet the default-choice thresholds in
`docs/production-readiness-implementation-plan.md`.

## Known Limits

- This is a pure Bun SDK, not Temporal's official TypeScript SDK and not a
  wrapper around the official Node worker.
- Official SDK internal sandbox behavior is not a compatibility promise. The
  compatibility promise is Temporal protocol behavior plus deterministic replay
  evidence.
- Long-running default-choice status requires 24-hour soak artifacts and a broad
  replay corpus. Until those artifacts exist, agents should treat the SDK as
  production-capable for Bun-first deployments that accept the published gates,
  not as a blanket replacement for every official SDK use case.

## Security And Reliability Issues

Report security or reliability issues through the repository's normal private
security channel. For determinism issues, include:

- SDK version and Bun version;
- Temporal Server or Cloud version if known;
- workflow type and task queue;
- history JSON or `temporal-bun replay --json` output;
- `production-readiness.json` from the release in use.

## Deprecation Policy

Breaking workflow behavior must be guarded by deterministic versioning APIs such
as `determinism.getVersion` or `determinism.patched`. Removing support for a Bun
or Temporal Server version requires a release note, feature-matrix update, and a
replay-corpus fixture proving old histories still replay or an explicit
unsupported-status entry.

@@ -0,0 +1,36 @@
|
|
|
1
|
+
# Temporal CI Cluster Requirement
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
|
|
5
|
+
Temporal integration tests in `@proompteng/temporal-bun-sdk` must run against the shared Temporal cluster managed by ArgoCD.
|
|
6
|
+
|
|
7
|
+
They must **not** be switched to a local `start-dev` Temporal server in CI.
|
|
8
|
+
|
|
9
|
+
## Why
|
|
10
|
+
|
|
11
|
+
- The integration suite assumes shared-cluster behavior and timing characteristics.
|
|
12
|
+
- Switching CI to local `start-dev` introduced regressions:
|
|
13
|
+
- hook and scenario timeouts,
|
|
14
|
+
- connection-refused races,
|
|
15
|
+
- unstable harness startup/teardown behavior under parallel test execution.
|
|
16
|
+
- This produced false negatives that were unrelated to the production task-queue-kind fix.
|
|
17
|
+
|
|
18
|
+
## Required CI behavior
|
|
19
|
+
|
|
20
|
+
- Keep CI pointed at the ArgoCD Temporal endpoint via the short Tailscale hostname
|
|
21
|
+
(`TEMPORAL_ADDRESS=temporal-grpc:7233` in the current workflow).
|
|
22
|
+
- ARC runners must preserve the tailnet search suffix so `temporal-grpc` resolves to the shared
|
|
23
|
+
cluster endpoint on every job runner without hardcoding the full `*.ts.net` hostname in workflows.
|
|
24
|
+
- Keep `TEMPORAL_TEST_SERVER=1` in CI for SDK test jobs.
|
|
25
|
+
- Keep `TEMPORAL_ENFORCE_REMOTE_ADDRESS=1` in CI so localhost targets fail fast.
|
|
26
|
+
- If readiness is slow, improve readiness retries/diagnostics, but do not redirect CI to local Temporal.
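
The fail-fast behavior described for `TEMPORAL_ENFORCE_REMOTE_ADDRESS=1` could look roughly like the guard below. This is a sketch, not the harness's real check: `assertRemoteTemporalAddress`, `hostOf`, and the list of local hosts are assumptions made for illustration.

```ts
// Hypothetical guard; the actual CI harness check may differ.
type Env = Record<string, string | undefined>

const LOCAL_HOSTS = ['localhost', '127.0.0.1', '::1', '0.0.0.0']

// Extract the host part of "host:port" or "[ipv6]:port".
function hostOf(address: string): string {
  if (address.startsWith('[')) return address.slice(1, address.indexOf(']'))
  const colon = address.lastIndexOf(':')
  return colon === -1 ? address : address.slice(0, colon)
}

function assertRemoteTemporalAddress(env: Env): string {
  const address = env['TEMPORAL_ADDRESS'] ?? ''
  if (address === '') throw new Error('TEMPORAL_ADDRESS is not set')
  // Enforcement is opt-in so local development can still target a dev server.
  if (env['TEMPORAL_ENFORCE_REMOTE_ADDRESS'] !== '1') return address
  if (LOCAL_HOSTS.includes(hostOf(address))) {
    throw new Error(`TEMPORAL_ADDRESS points at a local target (${address}); CI must use the shared cluster`)
  }
  return address
}
```

Taking the environment as a parameter keeps the guard easy to exercise in tests without mutating the process environment.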
|
|
27
|
+
|
|
28
|
+
## Guardrail for future changes
|
|
29
|
+
|
|
30
|
+
When touching Temporal CI or test harness code:
|
|
31
|
+
|
|
32
|
+
1. Do not replace the cluster target with `127.0.0.1`/local dev server in CI.
|
|
33
|
+
2. Treat failures first as cluster readiness or connectivity problems, and rule those out before changing the execution mode.
|
|
34
|
+
3. Prefer stronger readiness waiting, transient CLI retries, and DNS diagnostics over topology changes
|
|
35
|
+
or swapping back to a fully-qualified fallback.
|
|
36
|
+
4. Keep the workflow checks that enforce a remote endpoint and verify `temporal-grpc` resolves on ARC.
|
|
@@ -0,0 +1,107 @@
|
|
|
1
|
+
# Temporal Bun SDK – Workflow Update Support
|
|
2
|
+
|
|
3
|
+
## Overview
|
|
4
|
+
|
|
5
|
+
Temporal Workflow Updates let callers send strongly typed, low-latency mutations to running workflows without relying on signals/queries. This document captures the design and operational considerations for the Bun SDK implementation.
|
|
6
|
+
|
|
7
|
+
## Components
|
|
8
|
+
|
|
9
|
+
### Client surface
|
|
10
|
+
|
|
11
|
+
- `TemporalWorkflowClient.workflow.update()` issues `UpdateWorkflowExecutionRequest` RPCs.
|
|
12
|
+
- Helper APIs (`getUpdateHandle`, `awaitUpdate`, `cancelUpdate`) let services resume or cancel pending updates.
|
|
13
|
+
- Update calls inherit the standard call options (`retryPolicy`, `timeoutMs`, `headers`, `signal`).
|
|
14
|
+
- `workflow.update` assigns an idempotent `updateId` (callers can override it) and defaults to waiting until the update is **accepted**; `waitForStage` can be set to `'admitted' | 'accepted' | 'completed'`.
|
|
15
|
+
- `workflow.awaitUpdate` defaults to waiting until the update is **completed** when no `waitForStage` is provided, matching Temporal's server-side default.
|
|
16
|
+
- The client records per-update AbortControllers so cancelling an update (or aborting a request) cleans up pending polls appropriately.
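
The stage defaults described above can be summarized in a few lines. This sketch only models the ordering and defaults stated in the list; `resolveWaitStage` and `hasReachedStage` are hypothetical helpers, not part of the SDK's client surface.

```ts
// Hypothetical helpers that model the wait-stage rules; not SDK APIs.
type WaitStage = 'admitted' | 'accepted' | 'completed'

const STAGE_ORDER: WaitStage[] = ['admitted', 'accepted', 'completed']

// `update` defaults to 'accepted'; `awaitUpdate` defaults to 'completed'.
function resolveWaitStage(call: 'update' | 'awaitUpdate', requested?: WaitStage): WaitStage {
  if (requested !== undefined) return requested
  return call === 'update' ? 'accepted' : 'completed'
}

// A wait is satisfied once the update has reached the target stage or a later one.
function hasReachedStage(current: WaitStage, target: WaitStage): boolean {
  return STAGE_ORDER.indexOf(current) >= STAGE_ORDER.indexOf(target)
}
```

For example, a caller waiting for `'accepted'` is released as soon as the update is accepted, while the default `awaitUpdate` wait holds until completion.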
|
|
17
|
+
|
|
18
|
+
### Workflow runtime
|
|
19
|
+
|
|
20
|
+
- Workflows register update handlers via `defineWorkflowUpdates` and `workflowContext.updates.register`.
|
|
21
|
+
- `WorkflowExecutor` passes incoming invocations through Effect Schema validators, records update determinism entries for `admitted`, `accepted`, `rejected`, and `completed` stages, and surfaces dispatch metadata (acceptance/rejection/completion) to the worker runtime.
|
|
22
|
+
|
|
23
|
+
### Worker runtime
|
|
24
|
+
|
|
25
|
+
- `collectWorkflowUpdates` reads `PollWorkflowTaskQueueResponse.messages`, decoding `temporal.api.update.v1.Request` payloads into invocation structs.
|
|
26
|
+
- `WorkflowExecutor` runs registered handlers deterministically; `buildUpdateProtocolMessages` translates dispatches into protocol `Acceptance`, `Rejection`, and `Response` messages and attaches them to `RespondWorkflowTaskCompleted`.
|
|
27
|
+
- Scheduler fairness: update protocol messages piggyback on workflow tasks, so updates share the workflow concurrency lane. Most updates are short-lived RPCs, but long-running handler code should be treated like normal workflow code (no async side-effects, deterministic behavior).
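
The translation from dispatch outcomes to protocol messages can be pictured as a simple mapping. The types and the `toProtocolMessages` function below are illustrative assumptions and do not reproduce the signature of `buildUpdateProtocolMessages`; only the `Acceptance`, `Rejection`, and `Response` message kinds come from the description above.

```ts
// Hypothetical shapes; the SDK's real dispatch and message types may differ.
type UpdateDispatch =
  | { updateId: string; status: 'accepted' }
  | { updateId: string; status: 'rejected'; reason: string }
  | { updateId: string; status: 'completed'; result: unknown }

type ProtocolMessage =
  | { kind: 'Acceptance'; updateId: string }
  | { kind: 'Rejection'; updateId: string; reason: string }
  | { kind: 'Response'; updateId: string; result: unknown }

// Map each dispatch to the protocol message that reports it, preserving order.
function toProtocolMessages(dispatches: UpdateDispatch[]): ProtocolMessage[] {
  const messages: ProtocolMessage[] = []
  for (const dispatch of dispatches) {
    switch (dispatch.status) {
      case 'accepted':
        messages.push({ kind: 'Acceptance', updateId: dispatch.updateId })
        break
      case 'rejected':
        messages.push({ kind: 'Rejection', updateId: dispatch.updateId, reason: dispatch.reason })
        break
      case 'completed':
        messages.push({ kind: 'Response', updateId: dispatch.updateId, result: dispatch.result })
        break
    }
  }
  return messages
}
```

Preserving dispatch order matters in this sketch because an update that is accepted and then completed in the same workflow task should report acceptance before its response.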
|
|
28
|
+
|
|
29
|
+
### Determinism & replay
|
|
30
|
+
|
|
31
|
+
- `WorkflowDeterminismState` now tracks update lifecycle entries. Replay ingestion consumes `WORKFLOW_EXECUTION_UPDATE_*` history events; sticky cache snapshots include sequencing/event IDs so drift detection works out of the box.
|
|
32
|
+
- Determinism markers encode update state alongside command history; if markers are missing, replay reconstructs update entries from history.
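
Drift detection over update lifecycle entries amounts to comparing what history recorded with what the replay produced. The sketch below assumes a simplified entry shape (`kind`, `updateId`, `stage`); the real `WorkflowDeterminismState` entries carry more fields, and `firstUpdateMismatch` is a hypothetical name.

```ts
// Simplified, hypothetical entry shape for illustration only.
type LifecycleStage = 'admitted' | 'accepted' | 'rejected' | 'completed'

interface UpdateEntry {
  kind: 'update'
  updateId: string
  stage: LifecycleStage
}

// Return the index of the first entry that differs, or -1 when both sequences match.
function firstUpdateMismatch(recorded: UpdateEntry[], replayed: UpdateEntry[]): number {
  const length = Math.max(recorded.length, replayed.length)
  for (let i = 0; i < length; i++) {
    const a = recorded[i]
    const b = replayed[i]
    // A missing entry on either side counts as drift, not only a differing one.
    if (a === undefined || b === undefined || a.updateId !== b.updateId || a.stage !== b.stage) return i
  }
  return -1
}
```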
|
|
33
|
+
|
|
34
|
+
## Operational guidance
|
|
35
|
+
|
|
36
|
+
### Configuration
|
|
37
|
+
|
|
38
|
+
- Ensure workers poll `messages` on workflow tasks. The SDK already passes `messages` through; if you implement your own poller, include protocol message handling.
|
|
39
|
+
- Temporal server version must include Workflow Update GA (Cloud and ≥1.22 server builds).
|
|
40
|
+
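- When running against dev clusters (`temporal server start-dev`) on builds where updates are not enabled by default, enable them through dynamic config, for example `temporal server start-dev --dynamic-config-value frontend.enableUpdateWorkflowExecution=true`.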
|
|
41
|
+
|
|
42
|
+
### Metrics & observability
|
|
43
|
+
|
|
44
|
+
- Worker logs include update dispatch outcomes at `debug` level; promote them to structured logs if you need audit trails.
|
|
45
|
+
- Determinism mismatches now mention update entries (`kind: 'update'`), which helps you debug cases where the worker admitted or rejected updates differently from history.
|
|
46
|
+
|
|
47
|
+
### Rollout plan
|
|
48
|
+
|
|
49
|
+
- Deploy the SDK update to a staging namespace first; verify `workflow.update` calls succeed and determinism markers continue to sync.
|
|
50
|
+
- Monitor worker logs for `workflow update message missing identifiers` or `failed to decode workflow update request`; those indicate mismatched protocol payloads.
|
|
51
|
+
- If you need to disable updates temporarily, skip calling `workflow.update` and leave registered handlers in place; they are no-ops when no update protocol messages arrive.
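
During rollout, the two log messages named above can be picked out of worker output with a simple filter. This is a minimal sketch that assumes logs are available as plain text lines; `findUpdateProtocolWarnings` is a hypothetical helper.

```ts
// The two messages come from the rollout notes above; the helper itself is hypothetical.
const UPDATE_WARNING_PATTERNS = [
  'workflow update message missing identifiers',
  'failed to decode workflow update request',
]

// Return only the log lines that indicate mismatched update protocol payloads.
function findUpdateProtocolWarnings(logLines: string[]): string[] {
  return logLines.filter((line) => UPDATE_WARNING_PATTERNS.some((pattern) => line.includes(pattern)))
}
```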
|
|
52
|
+
|
|
53
|
+
## Quick reference
|
|
54
|
+
|
|
55
|
+
### Registering updates
|
|
56
|
+
|
|
57
|
+
```ts
import { Effect } from 'effect'
import * as Schema from 'effect/Schema'
import { defineWorkflow, defineWorkflowUpdates } from '@proompteng/temporal-bun-sdk/workflow'

// Update definitions: each entry pairs a name with an input schema and a default handler.
const counterUpdates = defineWorkflowUpdates([
  {
    name: 'setCounter',
    input: Schema.Number,
    handler: async ({ info }, value: number) => {
      console.log('Update from', info.workflowId, 'value', value)
      return value
    },
  },
])

export const counterWorkflow = defineWorkflow(
  'counterWorkflow',
  Schema.Number,
  ({ input, updates }) => {
    let count = input

    // `updates` is the workflow context registry; the definitions live in `counterUpdates`,
    // so the two names no longer shadow each other.
    updates.register(counterUpdates[0], async (_ctx, value) => {
      count = value
      return count
    })

    return Effect.sync(() => count)
  },
  { updates: counterUpdates },
)
```
|
|
88
|
+
|
|
89
|
+
### Invoking updates
|
|
90
|
+
|
|
91
|
+
```ts
// Assumes `client` is a connected client and `handle` identifies a running `counterWorkflow`.
const result = await client.workflow.update(handle, {
  updateName: 'setCounter',
  args: [42],
  waitForStage: 'completed',
})

if (result.outcome?.status === 'success') {
  console.log('Counter updated to', result.outcome.result)
}
```
|
|
102
|
+
|
|
103
|
+
### Monitoring
|
|
104
|
+
|
|
105
|
+
- Use `temporal workflow update` CLI commands or Cloud UI to inspect update states.
|
|
106
|
+
- The worker logs when it decodes protocol messages; enable `LOG_LEVEL=debug` if you need to trace update routing.
|
|
107
|
+
- Determinism mismatches now show `kind: 'update'` entries referencing update IDs.
|