@proompteng/temporal-bun-sdk 0.8.0 → 0.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,370 @@
1
+ # Temporal Bun SDK Production Readiness Implementation Plan
2
+
3
+ _Last updated: May 5, 2026_
4
+
5
+ ## Goal
6
+
7
+ Make `@proompteng/temporal-bun-sdk` the default Temporal choice for Bun-based
8
+ agents by turning the current project-proven runtime into a public,
9
+ repeatable, evidence-backed production library.
10
+
11
+ This is not a plan to load the official `@temporalio/worker` stack in Bun. The
12
+ SDK is a pure Bun implementation: package metadata only ships `dist`, `docs`,
13
+ `skills`, and `README.md`; package dependencies are protobuf/connect/Effect/TypeScript; and the
14
+ production verification gate rejects `@temporalio/worker`,
15
+ `@temporalio/core-bridge`, `node-gyp`, `dist/native`, and stale native Docker
16
+ paths.
17
+
18
+ ## Actual Concern
19
+
20
+ The concern behind "not production ready" is not that Bun cannot load a Node
21
+ NAPI bridge. This SDK does not rely on that bridge.
22
+
23
+ The real concern is that Temporal worker correctness is a protocol and
24
+ determinism contract. The official SDK gets trust from Temporal-maintained Core,
25
+ long-running compatibility coverage, and years of production history. A pure
26
+ Bun worker can be production quality, but it must publish enough proof that:
27
+
28
+ - Workflow code cannot silently observe nondeterministic Bun/JS runtime state.
29
+ - Replay reconstruction matches real Temporal histories across feature and
30
+ version combinations.
31
+ - Worker pollers, sticky queues, heartbeats, updates, shutdown, and retries hold
32
+ under load, restart, and failure conditions.
33
+ - Protocol command materialization remains compatible with Temporal Server
34
+ changes.
35
+ - Release artifacts make the package boundary, test results, and support matrix
36
+ easy for agents and humans to verify.
37
+
38
+ The current library is already deployed and tested on the current infra. That
39
+ counts, but agents outside this project cannot infer it. The implementation
40
+ work is to convert private confidence into public, machine-checkable evidence.
41
+
42
+ ## Code Read Findings
43
+
44
+ | Surface | Current implementation | Production gap to close |
45
+ | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
46
+ | Package boundary | `packages/temporal-bun-sdk/package.json` ships `dist`, `docs`, `skills`, and `README.md`; runtime dependencies are `@bufbuild/protobuf`, Connect, Effect, and TypeScript. `verify:production` runs `tests/packaging/manifest-packaging.test.ts`. | Keep this as a release gate and publish the result in a release manifest so agents can prove the package is not a Node/native wrapper. |
47
+ | Worker runtime | `src/worker/runtime.ts` owns config load, WorkflowService transport, workflow/activity pollers, sticky queues, deployment/build IDs, scheduler, metrics, plugins, graceful shutdown, determinism marker emission, and activity task lifecycle. | Add restart/chaos/soak scenarios for poll cancellation, sticky cache drift, task-not-found, heartbeat failure, tuner changes, and shutdown during active workflow/activity tasks. |
48
+ | Workflow execution | `src/workflow/executor.ts` runs registered workflows through Effect, creates `WorkflowCommandContext`, evaluates queries, processes updates, and materializes success/failure commands. | Add protocol golden tests that compare emitted commands and update protocol messages against captured histories and expected server-visible events. |
49
+ | Determinism guard | `src/workflow/determinism.ts` records command, random, time, signal, query, and update streams and throws `WorkflowNondeterminismError` on replay mismatch. | Add async interleaving fuzz tests and query-mode negative tests. Query handlers must never read live time/randomness as a hidden side channel. |
50
+ | Runtime guards | `src/workflow/guards.ts` patches `Date`, `Date.now`, `Math.random`, `crypto.randomUUID`, `crypto.getRandomValues`, `fetch`, timers, `performance.now`, `WebSocket`, `Bun.spawn`, and `Bun.nanoseconds`; `WorkflowExecutor` requires strict guards in production. | Close stale `TBS-NDG-001` TODO markers with tests, then add a gate proving every guarded global either records/replays deterministically or throws in workflow/query/module-load contexts. |
51
+ | Static workflow lint | `src/bin/lint-workflows-command.ts` walks workflow import graphs and denies unsafe imports/globals/member expressions. Tests cover `fetch`, `process.env`, captured `Date.now`, and importing client APIs from workflows. | Add rules for dangerous Promise/Effect escape hatches, dynamic eval/function creation, timers captured through aliases, and Bun runtime APIs not currently listed. Make release CI fail if configured workflow entries are missing. |
52
+ | Replay | `src/workflow/replay.ts` ingests real histories, applies full/delta determinism markers, reconstructs command history, tracks updates, and diffs mismatch metadata. Stored fixtures currently cover timer, activity retry, and child/continue-as-new histories. | Scale from a small fixture set to a versioned corpus that covers every supported command/event pair, updates, signals, queries, cancellations, payload codecs, search attributes, memo, markers, failures, sticky replay, and old SDK versions. |
53
+ | Integration | `tests/integration/**` covers history replay, activity lifecycle, query-only workflows, signal/query, workflow updates, payload codecs, client resilience, worker ops, schedules, and worker runtime behavior behind `TEMPORAL_INTEGRATION_TESTS=1`. | Split optional service-unavailable skips from release-blocking skips. In release CI, a missing dev server or unimplemented critical endpoint must fail instead of silently reducing coverage. |
54
+ | Load | `tests/integration/load/**` submits CPU, activity, and update workflows, checks throughput, sticky hit ratio, and poll p95 latency, and writes JSONL/report artifacts. | Add duration mode, memory/heap samples, worker restart mode, Temporal endpoint interruption, sticky cache churn, and nightly/weekly soak thresholds. |
55
+ | CI | `.github/workflows/temporal-bun-sdk.yml` builds, lints, tests, runs `verify:production`, runs load checks, and uploads load artifacts. | Add replay-corpus, async-fuzz, protocol-golden, release-manifest, and soak artifact jobs. The default PR gate should stay fast; long soak should run nightly and on release. |
56
+
57
+ ## Production Bar
58
+
59
+ The library becomes the obvious default when every release can point to a
60
+ single production-readiness artifact containing:
61
+
62
+ - Package boundary proof: no official Node worker dependency, no native bridge,
63
+ no `node-gyp`, no `dist/native`, no stale native Docker path.
64
+ - Determinism proof: replay corpus result, async fuzz result, query/global guard
65
+ matrix, and mismatch diagnostics samples.
66
+ - Protocol proof: command/update/query/signal/activity golden tests by Temporal
67
+ Server version and SDK version.
68
+ - Operations proof: load and soak reports with throughput, latency, sticky cache
69
+ ratio, heartbeat retries/failures, worker failures, memory slope, and restart
70
+ outcomes.
71
+ - Compatibility proof: supported Bun, Temporal Server, Temporal Cloud, TLS/mTLS,
72
+ payload codec, and deployment/versioning matrix.
73
+ - Support proof: documented feature status, known limits, upgrade policy,
74
+ security/reporting contact, and release deprecation policy.
75
+
76
+ ## Implementation Phases
77
+
78
+ ### P0 - Credibility Gate And Public Evidence
79
+
80
+ Purpose: make the current hardening visible and stop regressions.
81
+
82
+ Implementation:
83
+
84
+ - Add `scripts/production-readiness/collect-release-evidence.ts`.
85
+ - Generate `dist/production-readiness.json` during `prepack`.
86
+ - Include package boundary, Bun/Temporal versions, git SHA, command results,
87
+ replay fixture count, integration skip count, load report path, and docs hash.
88
+ - Extend `verify:production` to validate the generated evidence schema.
89
+ - Remove or rename stale `TODO(TBS-NDG-*)` comments once the matching tests prove
90
+ they are complete; leave new TODOs only for real unfinished work.
91
+ - Add a release CI step that uploads `production-readiness.json` with npm pack
92
+ output and worker-load artifacts.
93
+
94
+ Acceptance:
95
+
96
+ - `bun run --filter @proompteng/temporal-bun-sdk verify:production` fails if the
97
+ evidence file is missing, malformed, or reports native/Node worker artifacts.
98
+ - `npm pack --dry-run --json` output can be matched to the same evidence file.
99
+ - Docs link to the evidence fields so agents can make a yes/no decision without
100
+ reading the whole repo.
101
+
102
+ ### P1 - Determinism And Async Semantics Audit
103
+
104
+ Purpose: answer the specific "Bun async semantics" concern.
105
+
106
+ Implementation:
107
+
108
+ - Add `tests/workflow/query-guard-matrix.test.ts`.
109
+ - Assert `Date`, `Date.now`, `Math.random`, `crypto.randomUUID`,
110
+ `crypto.getRandomValues`, `performance.now`, timers, fetch, WebSocket, and
111
+ Bun process APIs either throw `WorkflowQueryViolationError` in query mode or
112
+ replay from prior determinism state. No query path may read live wall-clock
113
+ or random state.
114
+ - Add `tests/workflow/async-determinism-fuzz.test.ts`.
115
+ - Generate seeded workflows that interleave Effect yields, Promise microtasks,
116
+ timers via workflow APIs, activities, signals, queries, updates, sideEffect,
117
+ getVersion, patch, and local activities.
118
+ - Run each seed twice: initial execution records state; replay must produce
119
+ identical output and zero `diffDeterminismState` mismatches.
120
+ - Mutate one command/random/time/query/update slot per seed and assert the
121
+ mismatch is detected with useful metadata.
122
+ - Extend `lint-workflows` rules to flag hidden async escape hatches:
123
+ - raw `Promise` constructors in workflow modules unless allowlisted;
124
+ - `Effect.tryPromise` and `Effect.promise` in workflow handlers unless called
125
+ through SDK-provided deterministic adapters;
126
+ - captured unsafe globals through aliases;
127
+ - `eval`, `Function`, dynamic import, and Bun runtime APIs beyond the current
128
+ deny list.
129
+ - Add a CLI gate that runs `temporal-bun lint-workflows` in strict JSON mode and
130
+ writes `.artifacts/workflow-lint/report.json`.
131
+
132
+ Acceptance:
133
+
134
+ - Runtime guard, query guard matrix, and async fuzz suites pass locally and in
135
+ CI.
136
+ - Fuzz default: at least 1,000 seeds in PR CI under a stable seed file.
137
+ - Fuzz release/nightly: at least 10,000 seeds with artifacted seed failures.
138
+ - Any workflow query reading live time/randomness is a release blocker.
139
+
140
+ ### P2 - Replay Corpus Expansion
141
+
142
+ Purpose: prove replay against real Temporal histories, not only synthetic unit
143
+ fixtures.
144
+
145
+ Implementation:
146
+
147
+ - Add `tests/replay/corpus/manifest.json`.
148
+ - Fields: fixture name, Temporal Server version, SDK version, Bun version,
149
+ workflow type, feature tags, history event count, expected command count,
150
+ payload codec profile, captured date, and source command.
151
+ - Add `scripts/replay/capture-corpus-fixture.ts`.
152
+ - Starts or references a workflow, fetches history through WorkflowService or
153
+ Temporal CLI, normalizes JSON, runs `ingestWorkflowHistory`, stores expected
154
+ determinism state, and updates the manifest.
155
+ - Add `scripts/replay/verify-corpus.ts`.
156
+ - Runs all fixtures, rejects empty corpus, rejects duplicate tags without a
157
+ replacement marker, rejects unsupported schema versions, and writes
158
+ `.artifacts/replay-corpus/report.json`.
159
+ - Keep the existing small `tests/replay/fixtures.test.ts`, but make the larger
160
+ corpus the release gate.
161
+
162
+ Coverage targets:
163
+
164
+ - Activities: success, retry success, retry exhaustion, cancellation, heartbeat
165
+ timeout, non-retryable failure, payload codec failure details.
166
+ - Timers: fire, cancel, replay after marker, long timeout normalization.
167
+ - Child workflows: start, complete, fail, cancel, terminate, timeout, pending
168
+ child replay.
169
+ - Continue-as-new and versioning: marker chain, getVersion, patch/deprecate,
170
+ old-build replay.
171
+ - Signals and queries: signal delivery order, query-only task, legacy query,
172
+ multi-query, query failure.
173
+ - Updates: admitted, accepted, rejected, completed success/failure, duplicate
174
+ update ID, cancellation while awaiting.
175
+ - Search attributes and memo: upsert and workflow properties modified.
176
+ - Nexus operations: schedule, complete, fail, cancel, timeout.
177
+ - Payload codecs: JSON, binary tunnel, gzip, AES-GCM, key rotation metadata.
178
+
179
+ Acceptance:
180
+
181
+ - PR gate: at least 25 high-signal corpus fixtures.
182
+ - Release gate: at least 75 fixtures covering every GA-critical command/event
183
+ family.
184
+ - Default-choice gate: at least 150 fixtures across at least two Temporal Server
185
+ minor versions and two SDK minor versions.
186
+
187
+ ### P3 - Protocol Golden And Compatibility Tests
188
+
189
+ Purpose: prove the custom Bun worker emits server-compatible commands.
190
+
191
+ Implementation:
192
+
193
+ - Add `tests/protocol/command-golden.test.ts`.
194
+ - Given workflow command intents, materialize commands and compare stable JSON
195
+ projections of `Command` protos.
196
+ - Include headers, memo, search attributes, retry policies, parent close
197
+ policy, task queues, timeouts, cancellation, continue-as-new, and Nexus.
198
+ - Add `tests/protocol/update-protocol-golden.test.ts`.
199
+ - Validate `buildUpdateProtocolMessages` output for acceptance, rejection,
200
+ success completion, failure completion, and replay duplicate suppression.
201
+ - Add `tests/protocol/history-roundtrip.test.ts`.
202
+ - For selected corpus fixtures, reconstruct determinism state, replay the
203
+ workflow, materialize commands, and assert stable equivalence against the
204
+ next command-bearing history events.
205
+ - Add a Temporal compatibility matrix job:
206
+ - local dev server for PRs;
207
+ - pinned Temporal Server minors in nightly;
208
+ - Temporal Cloud smoke in release when credentials are present.
209
+
210
+ Acceptance:
211
+
212
+ - Every command kind in `src/workflow/commands.ts` has a golden projection.
213
+ - Every event kind handled in `src/workflow/replay.ts` has at least one corpus
214
+ fixture or an explicit unsupported-status entry.
215
+ - Protocol golden tests fail on accidental protobuf field drift.
216
+
217
+ ### P4 - Worker Lifecycle, Chaos, And Soak
218
+
219
+ Purpose: prove the runtime holds under real operating conditions.
220
+
221
+ Implementation:
222
+
223
+ - Extend `tests/integration/load/config.ts` with:
224
+ - `TEMPORAL_LOAD_TEST_DURATION_MS`;
225
+ - `TEMPORAL_LOAD_TEST_RESTART_INTERVAL_MS`;
226
+ - `TEMPORAL_LOAD_TEST_ENDPOINT_BLACKHOLE_MS`;
227
+ - `TEMPORAL_LOAD_TEST_STICKY_CHURN_RATIO`;
228
+ - `TEMPORAL_LOAD_TEST_MEMORY_SLOPE_MAX_MB_PER_HOUR`;
229
+ - `TEMPORAL_LOAD_TEST_SHUTDOWN_DURING_ACTIVITY_RATIO`.
230
+ - Extend `tests/integration/load/runner.ts` to:
231
+ - sample RSS/heap and write `memory.jsonl`;
232
+ - periodically shutdown and recreate `WorkerRuntime`;
233
+ - interrupt pollers and confirm `#withRpcAbort` and scheduler shutdown do not
234
+ strand tasks;
235
+ - force sticky cache eviction and drift rebuilds;
236
+ - track task-not-found, nondeterminism, heartbeat retry/failure, activity
237
+ failure, workflow failure, and sticky heal metrics.
238
+ - Add `scripts/run-worker-soak.ts`.
239
+ - Duration-mode wrapper around the load runner with stricter artifact
240
+ validation.
241
+ - Add `.github/workflows/temporal-bun-sdk-nightly.yml`.
242
+ - Runs 2-hour soak nightly.
243
+ - Runs 6-hour soak on release branch.
244
+ - Allows 24-hour soak manually before declaring the package default for new
245
+ agents.
246
+
247
+ Acceptance:
248
+
249
+ - PR load smoke: current short load test remains under 10 minutes.
250
+ - Nightly soak: 2 hours, zero stuck workflows, no unhandled runtime rejection,
251
+ memory slope below threshold, sticky heal rate below threshold, and all
252
+ metrics artifacts uploaded.
253
+ - Release soak: 6 hours against pinned Temporal Server with restart mode
254
+ enabled.
255
+ - Default-choice soak: one 24-hour run with restart mode, sticky churn, payload
256
+ codecs, updates, and activities enabled.
257
+
258
+ ### P5 - CI Skip Policy And Release Blocking Rules
259
+
260
+ Purpose: prevent false confidence from skipped integration tests.
261
+
262
+ Implementation:
263
+
264
+ - Add a test reporter that counts Bun skipped tests and scenario-level "skipped"
265
+ console warnings for the SDK package.
266
+ - Add `TEMPORAL_BUN_RELEASE_GATE=1`.
267
+ - In release mode, missing Temporal CLI/dev server, missing corpus fixtures,
268
+ and critical integration skips are failures.
269
+ - Optional external services remain optional only if the skip is tagged with a
270
+ non-critical capability, for example Temporal Cloud Ops without credentials.
271
+ - Split integration suites into:
272
+ - `test:integration:critical`;
273
+ - `test:integration:optional`;
274
+ - `test:load`;
275
+ - `test:soak`.
276
+ - Publish `.artifacts/temporal-bun-sdk/test-summary.json`.
277
+
278
+ Acceptance:
279
+
280
+ - Release CI fails if a critical integration test silently skips.
281
+ - The release manifest includes skip counts by suite and reason.
282
+ - PR CI remains usable without requiring Temporal Cloud credentials.
283
+
284
+ ### P6 - Docs, Support Policy, And Agent Default Metadata
285
+
286
+ Purpose: make choosing the SDK easy and defensible.
287
+
288
+ Implementation:
289
+
290
+ - Add `docs/support-policy.md`.
291
+ - Supported Bun versions, Temporal Server versions, Temporal Cloud support,
292
+ OS/arch, payload codecs, known limits, security reporting, and deprecation
293
+ policy.
294
+ - Add `docs/feature-matrix.md`.
295
+ - Feature status with test/corpus references and unsupported/experimental
296
+ labels.
297
+ - Add `docs/agent-adoption-guide.md`.
298
+ - Short guidance for agents: when to choose `@proompteng/temporal-bun-sdk`,
299
+ required gates, project scaffolding command, Docker command, and fallback
300
+ criteria.
301
+ - Add `dist/agent-readiness.json`.
302
+ - `recommended: true` only when the production-readiness artifact meets all
303
+ default-choice thresholds.
304
+ - Include machine-readable gate names, versions, artifact paths, and known
305
+ limits.
306
+ - Update `apps/docs/content/docs/temporal-bun-sdk.mdx` and
307
+ `apps/docs/content/docs/temporal-bun-sdk-comparison.mdx` to link the evidence
308
+ instead of asking users to trust prose.
309
+
310
+ Acceptance:
311
+
312
+ - An agent can answer "is this the default Bun Temporal SDK?" by reading
313
+ `agent-readiness.json` plus the release manifest.
314
+ - Public docs explain the actual concern: pure Bun worker correctness proof and
315
+ support policy, not Node NAPI removal.
316
+ - The comparison page no longer leaves "production ready" as an ambiguous
317
+ phrase.
318
+
319
+ ## Gate Matrix
320
+
321
+ | Gate | Command | PR | Release | Default-choice |
322
+ | -------------------- | ----------------------------------------------------------------- | ------------------------- | ----------------- | ----------------------------- |
323
+ | Package boundary | `bun run --filter @proompteng/temporal-bun-sdk verify:production` | required | required | required |
324
+ | Build | `bun run --filter @proompteng/temporal-bun-sdk build` | required | required | required |
325
+ | Unit/runtime guards | `bun test tests/workflow/runtime-guards.test.ts` | required | required | required |
326
+ | Query guard matrix | `bun test tests/workflow/query-guard-matrix.test.ts` | required | required | required |
327
+ | Async fuzz | `bun test tests/workflow/async-determinism-fuzz.test.ts` | 1k seeds | 10k seeds | 10k+ seeds, last 7 days green |
328
+ | Replay corpus | `bun scripts/replay/verify-corpus.ts` | 25 fixtures | 75 fixtures | 150 fixtures |
329
+ | Protocol golden | `bun test tests/protocol/*.test.ts` | required | required | required |
330
+ | Critical integration | `TEMPORAL_INTEGRATION_TESTS=1 bun test tests/integration` | required | no critical skips | no critical skips |
331
+ | Load smoke | `bun run --filter @proompteng/temporal-bun-sdk test:load` | required | required | required |
332
+ | Soak | `bun scripts/run-worker-soak.ts` | optional | 6h | 24h |
333
+ | Docs | `bun run --filter docs build` | required when docs change | required | required |
334
+ | Pack | `npm pack --dry-run --json` | optional | required | required |
335
+
336
+ ## First Implementation Order
337
+
338
+ 1. **P0 evidence manifest.** This is the fastest way to make the already-shipped
339
+ production boundary and current CI/load results visible to agents.
340
+ 2. **P1 query/async guard tests.** This directly answers the ChatGPT concern
341
+ about Bun async semantics and hidden nondeterminism.
342
+ 3. **P2 replay corpus runner.** This converts the existing replay engine into
343
+ durable proof across real histories.
344
+ 4. **P3 protocol golden tests.** This protects the pure-Bun command
345
+ materializer from Temporal protobuf/server drift.
346
+ 5. **P4 soak mode.** This moves the load harness from a smoke test to
347
+ operational proof.
348
+ 6. **P6 agent metadata.** Only mark the SDK as default when the evidence gates
349
+ are green, not because the docs claim it.
350
+
351
+ ## Non-Goals
352
+
353
+ - Running `@temporalio/worker` on Bun.
354
+ - Depending on Node NAPI, `@temporalio/core-bridge`, `node-gyp`, or a native
355
+ worker bundle.
356
+ - Byte-for-byte equivalence with the official SDK internals.
357
+ - Hiding unsupported features. Missing compatibility should be listed in the
358
+ feature matrix and reflected in `agent-readiness.json`.
359
+
360
+ ## Definition Of Done
361
+
362
+ The SDK is production-default for agents when:
363
+
364
+ - `agent-readiness.json` says `recommended: true`.
365
+ - The latest release manifest proves all default-choice gates are green.
366
+ - The replay corpus covers every GA-critical workflow/event family.
367
+ - Async fuzz and query guard tests have no open determinism escapes.
368
+ - A recent 24-hour soak run is linked from release artifacts.
369
+ - Docs and comparison pages explain that this is a pure Bun SDK with public
370
+ evidence, not an unofficial wrapper around the Node worker.
@@ -0,0 +1,48 @@
1
+ # Temporal Bun SDK Release Runbook
2
+
3
+ Automation now drives every step via `.github/workflows/temporal-bun-sdk.yml`. Use
4
+ this runbook as a quick reference for triggering and validating releases.
5
+
6
+ > **Status:** TBS-009 (Release Automation) closed on 2025-11-17 via the trusted
7
+ > publishing workflow; this document captures the final process.
8
+
9
+ ## Prerequisites
10
+
11
+ - npm trust publishing is enabled for `proompteng/lab → temporal-bun-sdk.yml`
12
+ (Trusted Publisher entry in the npm org). No automation token is required.
13
+ - Maintainer access to run the Temporal Bun SDK workflow manually on `main`.
14
+ - Desired npm dist-tag (`latest`, `beta`, etc.) for the publish run.
15
+ - Buf CLI installed (`buf` on `PATH`) if you plan to run the proto regeneration
16
+ script locally.
17
+
18
+ ## Release Steps
19
+
20
+ 1. **Prepare mode**
21
+ - Trigger the workflow with `release_mode=prepare` (optionally set
22
+ `npm_tag`). This runs release-please to open/update the automated release
23
+ PR (`release-please--branches--main--components--temporal-bun-sdk`).
24
+ - The job runs Oxfmt + build + unit + load suites against the release PR
25
+ branch so we can verify artifacts before merging. Review the PR and merge
26
+ after CI is green.
27
+ 2. **Publish mode**
28
+ - Re-run the workflow with `release_mode=publish` (set `dry_run=true` for a
29
+ rehearsal) **from the `main` branch only**; the workflow refuses to run on
30
+ other refs. The job reads the merged `package.json` version on `main`, reruns
31
+ the validations, then executes `npm publish --provenance --access public
32
+ --tag <dist>`.
33
+ - Keep the workflow logs + artifacts linked to the tracking issue/PR as proof
34
+ of the dry-run and the final publish.
35
+
36
+ ## Support & Incident Response
37
+
38
+ - Direct any security or incident reports to `security@proompteng.ai`.
39
+
40
+ ## Post-Release Checklist
41
+
42
+ - Confirm the npm release metadata (version, dist-tag, provenance) matches the
43
+ workflow output.
44
+ - Close the tracking issue (e.g., #1788) and note the publish link.
45
+ - Schedule any follow-up docs/DevRel announcements if required.
46
+ - If upstream Temporal protos changed, trigger the "Temporal Bun SDK Proto
47
+ Regen" workflow (or confirm the scheduled run already merged) so generated
48
+ sources stay current.
@@ -0,0 +1,76 @@
1
+ # Temporal Bun SDK Support Policy
2
+
3
+ _Last updated: May 5, 2026_
4
+
5
+ ## Supported Runtime Matrix
6
+
7
+ | Surface | Supported | Notes |
8
+ | --------------- | ------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
9
+ | Bun | `>=1.3.13` | CI pins Bun `1.3.13`; newer Bun releases must pass the SDK test, replay, and load gates before being documented as preferred. |
10
+ | Node | Build tooling only | The worker runtime is Bun TypeScript. Node is used by repository tooling and GitHub Actions setup, not by the published worker runtime. |
11
+ | Temporal Server | Current CI cluster plus pinned dev server histories | Replay-corpus expansion must add explicit server minor coverage before the SDK is marked default-choice recommended. |
12
+ | Temporal Cloud | Client/worker TLS and mTLS paths are supported when credentials are configured. | Cloud Ops tests remain optional unless credentials are present. |
13
+ | OS/arch | Linux arm64 in CI, local macOS development | Additional platforms require package-boundary, build, test, replay, and load evidence. |
14
+
15
+ ## Supported Package Boundary
16
+
17
+ The published package is expected to contain only:
18
+
19
+ - `dist`;
20
+ - `docs`;
21
+ - `skills`;
22
+ - `README.md`.
23
+
24
+ The production gate rejects:
25
+
26
+ - `@temporalio/worker`;
27
+ - `@temporalio/core-bridge`;
28
+ - `@temporalio/client`;
29
+ - `node-gyp`, `node-addon-api`, `node-gyp-build`;
30
+ - native artifacts such as `.node`, `.dylib`, `.so`, and `.a`;
31
+ - stale native bridge paths such as `bruke`, `native`, and `dist/native`.
32
+
33
+ ## Required Release Evidence
34
+
35
+ Every release should publish or upload:
36
+
37
+ - `dist/production-readiness.json`;
38
+ - `dist/agent-readiness.json`;
39
+ - replay corpus report;
40
+ - worker load report;
41
+ - npm pack or publish provenance output.
42
+
43
+ `agent-readiness.json` must not set `recommended: true` until the replay corpus,
44
+ async determinism fuzz, and soak evidence meet the default-choice thresholds in
45
+ `docs/production-readiness-implementation-plan.md`.
46
+
47
+ ## Known Limits
48
+
49
+ - This is a pure Bun SDK, not Temporal's official TypeScript SDK and not a
50
+ wrapper around the official Node worker.
51
+ - Official SDK internal sandbox behavior is not a compatibility promise. The
52
+ compatibility promise is Temporal protocol behavior plus deterministic replay
53
+ evidence.
54
+ - Long-running default-choice status requires 24-hour soak artifacts and a broad
55
+ replay corpus. Until those artifacts exist, agents should treat the SDK as
56
+ production-capable for Bun-first deployments that accept the published gates,
57
+ not as a blanket replacement for every official SDK use case.
58
+
59
+ ## Security And Reliability Issues
60
+
61
+ Report security or reliability issues through the repository's normal private
62
+ security channel. For determinism issues, include:
63
+
64
+ - SDK version and Bun version;
65
+ - Temporal Server or Cloud version if known;
66
+ - workflow type and task queue;
67
+ - history JSON or `temporal-bun replay --json` output;
68
+ - `production-readiness.json` from the release in use.
69
+
70
+ ## Deprecation Policy
71
+
72
+ Breaking workflow behavior must be guarded by deterministic versioning APIs such
73
+ as `determinism.getVersion` or `determinism.patched`. Removing support for a Bun
74
+ or Temporal Server version requires a release note, feature-matrix update, and a
75
+ replay-corpus fixture proving old histories still replay or an explicit
76
+ unsupported-status entry.
@@ -0,0 +1,36 @@
1
+ # Temporal CI Cluster Requirement
2
+
3
+ ## Summary
4
+
5
+ Temporal integration tests in `@proompteng/temporal-bun-sdk` must run against the shared Temporal cluster managed by ArgoCD.
6
+
7
+ They must **not** be switched to a local `start-dev` Temporal server in CI.
8
+
9
+ ## Why
10
+
11
+ - The integration suite assumes shared-cluster behavior and timing characteristics.
12
+ - Switching CI to local `start-dev` introduced regressions:
13
+ - hook and scenario timeouts,
14
+ - connection-refused races,
15
+ - unstable harness startup/teardown behavior under parallel test execution.
16
+ - This produced false negatives that were unrelated to the production task-queue-kind fix.
17
+
18
+ ## Required CI behavior
19
+
20
+ - Keep CI pointed at the ArgoCD Temporal endpoint via the short Tailscale hostname
21
+ (`TEMPORAL_ADDRESS=temporal-grpc:7233` in the current workflow).
22
+ - ARC runners must preserve the tailnet search suffix so `temporal-grpc` resolves to the shared
23
+ cluster endpoint on every job runner without hardcoding the full `*.ts.net` hostname in workflows.
24
+ - Keep `TEMPORAL_TEST_SERVER=1` in CI for SDK test jobs.
25
+ - Keep `TEMPORAL_ENFORCE_REMOTE_ADDRESS=1` in CI so localhost targets fail fast.
26
+ - If readiness is slow, improve readiness retries/diagnostics, but do not redirect CI to local Temporal.
27
+
28
+ ## Guardrail for future changes
29
+
30
+ When touching Temporal CI or test harness code:
31
+
32
+ 1. Do not replace the cluster target with `127.0.0.1`/local dev server in CI.
33
+ 2. Validate failures first as cluster readiness/connectivity before changing execution mode.
34
+ 3. Prefer stronger readiness waiting, transient CLI retries, and DNS diagnostics over topology changes
35
+ or swapping back to a fully-qualified fallback.
36
+ 4. Keep the workflow checks that enforce a remote endpoint and verify `temporal-grpc` resolves on ARC.
@@ -0,0 +1,107 @@
1
+ # Temporal Bun SDK – Workflow Update Support
2
+
3
+ ## Overview
4
+
5
+ Temporal Workflow Updates let callers send strongly typed, low-latency mutations to running workflows without relying on signals/queries. This document captures the design and operational considerations for the Bun SDK implementation.
6
+
7
+ ## Components
8
+
9
+ ### Client surface
10
+
11
+ - `TemporalWorkflowClient.workflow.update()` issues `UpdateWorkflowExecutionRequest` RPCs.
12
+ - Helper APIs (`getUpdateHandle`, `awaitUpdate`, `cancelUpdate`) let services resume or cancel pending updates.
13
+ - Update calls inherit the standard call options (`retryPolicy`, `timeoutMs`, `headers`, `signal`).
14
+ - `workflow.update` assigns an idempotent `updateId` (callers can override it) and defaults to waiting until the update is **accepted**; `waitForStage` can be set to `'admitted' | 'accepted' | 'completed'`.
15
+ - `workflow.awaitUpdate` defaults to waiting until the update is **completed** when no `waitForStage` is provided, matching Temporal's server-side default.
16
+ - The client records per-update AbortControllers so cancelling an update (or aborting a request) cleans up pending polls appropriately.
17
+
18
+ ### Workflow runtime
19
+
20
+ - Workflows register update handlers via `defineWorkflowUpdates` and `workflowContext.updates.register`.
21
+ - `WorkflowExecutor` passes incoming invocations through Effect Schema validators, records update determinism entries for `admitted`, `accepted`, `rejected`, and `completed` stages, and surfaces dispatch metadata (acceptance/rejection/completion) to the worker runtime.
22
+
23
+ ### Worker runtime
24
+
25
+ - `collectWorkflowUpdates` reads `PollWorkflowTaskQueueResponse.messages`, decoding `temporal.api.update.v1.Request` payloads into invocation structs.
26
+ - `WorkflowExecutor` runs registered handlers deterministically; `buildUpdateProtocolMessages` translates dispatches into protocol `Acceptance`, `Rejection`, and `Response` messages and attaches them to `RespondWorkflowTaskCompleted`.
27
+ - Scheduler fairness: update protocol messages piggyback on workflow tasks, so updates share the workflow concurrency lane. Most updates are short-lived RPCs, but long-running handler code should be treated like normal workflow code (no async side-effects, deterministic behavior).
28
+
29
+ ### Determinism & replay
30
+
31
+ - `WorkflowDeterminismState` now tracks update lifecycle entries. Replay ingestion consumes `WORKFLOW_EXECUTION_UPDATE_*` history events; sticky cache snapshots include sequencing/event IDs so drift detection works out of the box.
32
+ - Determinism markers encode update state alongside command history; if markers are missing, replay reconstructs update entries from history.
33
+
34
+ ## Operational guidance
35
+
36
+ ### Configuration
37
+
38
+ - Ensure workers poll `messages` on workflow tasks. The SDK already passes `messages` through; if you implement your own poller, include protocol message handling.
39
+ - Temporal server version must include Workflow Update GA (Cloud and ≥1.22 server builds).
40
+ - When running against dev clusters (`temporal server start-dev`), enable updates via `temporal server start-dev --enable-workflow-updates` if required.
41
+
42
+ ### Metrics & observability
43
+
44
+ - Worker logs include update dispatch outcomes at `debug` level; promote them to structured logs if you need audit trails.
45
+ - Determinism mismatches will now mention update entries (kind = 'update'), helping you debug cases where the worker admitted/rejected updates differently from history.
46
+
47
+ ### Rollout plan
48
+
49
+ - Deploy the SDK update to a staging namespace first; verify `workflow.update` calls succeed and determinism markers continue to sync.
50
+ - Monitor worker logs for `workflow update message missing identifiers` or `failed to decode workflow update request`; those indicate mismatched protocol payloads.
51
+ - If you need to disable updates temporarily, skip calling `workflow.update` and leave registered handlers in place; they are no-ops when no update protocol messages arrive.
52
+
53
+ ## Quick reference
54
+
55
+ ### Registering updates
56
+
57
+ ```ts
58
+ import { defineWorkflow, defineWorkflowUpdates } from '@proompteng/temporal-bun-sdk/workflow'
59
+ import * as Schema from 'effect/Schema'
60
+
61
+ const updates = defineWorkflowUpdates([
62
+ {
63
+ name: 'setCounter',
64
+ input: Schema.Number,
65
+ handler: async ({ info }, value: number) => {
66
+ console.log('Update from', info.workflowId, 'value', value)
67
+ return value
68
+ },
69
+ },
70
+ ])
71
+
72
+ export const counterWorkflow = defineWorkflow(
73
+ 'counterWorkflow',
74
+ Schema.Number,
75
+ ({ input, updates }) => {
76
+ let count = input
77
+
78
+ updates.register(updates[0], async (_ctx, value) => {
79
+ count = value
80
+ return count
81
+ })
82
+
83
+ return Effect.sync(() => count)
84
+ },
85
+ { updates },
86
+ )
87
+ ```
88
+
89
+ ### Invoking updates
90
+
91
+ ```ts
92
+ const result = await client.workflow.update(handle, {
93
+ updateName: 'setCounter',
94
+ args: [42],
95
+ waitForStage: 'completed',
96
+ })
97
+
98
+ if (result.outcome?.status === 'success') {
99
+ console.log('Counter updated to', result.outcome.result)
100
+ }
101
+ ```
102
+
103
+ ### Monitoring
104
+
105
+ - Use `temporal workflow update` CLI commands or Cloud UI to inspect update states.
106
+ - Worker logs emitted when decoding protocol messages: enable `LOG_LEVEL=debug` if you need to trace update routing.
107
+ - Determinism mismatches now show `kind: 'update'` entries referencing update IDs.