defuss-express 0.1.0

package/LICENSE ADDED
MIT License

Copyright (c) 2026

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md ADDED
# defuss-express

> Extremely performant, express-compatible, auto-multi-core, QUIC/HTTP/3-enabled, WebSocket-capable, load-balanced, `ultimate-express`-powered server runtime for defuss.

`defuss-express` is a thin runtime wrapper around `ultimate-express` (and therefore compatible with the `express` API) that starts your app in one worker per CPU core and places a tiny TCP load balancer in front of those workers.

The intended developer experience is boring in the best way: change your import from `express` to `defuss-express`, and your app immediately scales across all available cores.

This package handles worker ports, process restarts, request fan-out, and worker telemetry using `defuss-open-telemetry` hooks.

You can customize the load-balancing strategy based on request metadata and live worker stats, or simply use the built-in round-robin or least-connections balancer algorithms.

## Features

- Node.js-based for maximum compatibility with existing Express apps and middleware
- automatic multi-core startup
- TCP-level load-balancing proxy in the primary process
- request-aware load-balancing hooks
- default round-robin balancing
- worker CPU and memory telemetry over IPC
- worker auto-respawn
- `ultimate-express` re-exported as `express`
- zero app-level cluster boilerplate
- graceful shutdown handling with signal handlers and timeouts
- QUIC/HTTP/3 and WebSocket support out of the box (via `ultimate-express`)
- built-in benchmarks with realistic, real-world payloads
- as few third-party dependencies as possible (`ultimate-express`, plus `defuss-open-telemetry` for telemetry features)

## Install

```bash
bun add defuss-express # or: pnpm add defuss-express / yarn add defuss-express
```

Note: we use bun as a _package manager_ only here.
Node.js is the intended runtime for `defuss-express` apps.

## Basic usage

```ts
import { express, startServer, stopServer } from "defuss-express";

const app = express({ threads: 0 });
app.disable("x-powered-by");

app.get("/", (_req, res) => {
  res.status(200).send("hello");
});

await startServer(app);

process.on("SIGINT", () => {
  void stopServer();
});
process.on("SIGTERM", () => {
  void stopServer();
});
```

## Custom balancing

`loadBalancer` receives the parsed HTTP request head, the candidate backend list, and the client socket.

```ts
import {
  express,
  setServerConfig,
  startServer,
  type BackendCandidate,
} from "defuss-express";

const chooseLowestCpu = (candidates: BackendCandidate[]) =>
  [...candidates].sort(
    (left, right) => (left.stats?.cpuPercent ?? 0) - (right.stats?.cpuPercent ?? 0),
  )[0]!;

setServerConfig({
  loadBalancer: ({ request, candidates }) => {
    if (request.path?.startsWith("/realtime")) {
      return chooseLowestCpu(candidates);
    }

    return candidates[0]!;
  },
});

const app = express({ threads: 0 });
await startServer(app);
```

## Advanced server config

The full config object is passed to `startServer` or `setServerConfig` before startup. This example shows request-aware routing with a custom load balancer, opt-in telemetry via `defuss-open-telemetry`, and tuned timeouts:

```ts
import { express, startServer, type LoadBalancerContext } from "defuss-express";
import { createOpenTelemetrySink, OtelMeterAdapter } from "defuss-open-telemetry";
import { metrics } from "@opentelemetry/api";

// Custom load balancer: sticky sessions for /api, lowest CPU for everything else
const customBalancer = ({ request, candidates, previousIndex }: LoadBalancerContext) => {
  if (request.path?.startsWith("/api") && request.headers["x-session-id"]) {
    // Hash the session header to a stable backend index
    const hash = [...request.headers["x-session-id"]].reduce(
      (acc, ch) => ((acc << 5) - acc + ch.charCodeAt(0)) | 0,
      0,
    );
    return candidates[Math.abs(hash) % candidates.length]!;
  }

  // Default: pick the worker with the lowest CPU usage
  return [...candidates].sort(
    (a, b) => (a.stats?.cpuPercent ?? 0) - (b.stats?.cpuPercent ?? 0),
  )[0]!;
};

const app = express({ threads: 0 });

await startServer(app, {
  host: "0.0.0.0",
  port: 8080,
  workers: 4,
  loadBalancer: customBalancer,

  // Opt-in OpenTelemetry (omit for a silent no-op)
  telemetry: createOpenTelemetrySink({
    meter: new OtelMeterAdapter(metrics.getMeter("my-app")),
    prefix: "defuss.express.",
  }),

  // Tuning
  requestInspectionTimeoutMs: 25, // more time to sniff headers
  maxHeaderBytes: 32 * 1024, // allow larger headers
  workerHeartbeatIntervalMs: 30_000,
  gracefulShutdownTimeoutMs: 15_000,
});
```

The `LoadBalancerContext` provides `candidates` (healthy backends with live CPU/memory stats), `request` (parsed method, path, host, headers), `socket` (raw TCP socket), and `previousIndex` (for round-robin tracking). Return the `BackendCandidate` that should receive the connection.

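As a mental model, the balancer contract described above can be sketched as plain TypeScript types. This is illustrative only — any field name not mentioned in the text (`index`, `port`, `memoryRssBytes`) is an assumption, not the package's actual type definition:

```typescript
// Illustrative shapes for the balancer contract — NOT the package's real types.
// Fields beyond those named in the docs (cpuPercent, method, path, host,
// headers, previousIndex) are assumptions for the sake of the sketch.
interface WorkerStats {
  cpuPercent?: number;
  memoryRssBytes?: number; // assumed field name
}

interface BackendCandidate {
  index: number; // position in the healthy-backend list (assumed)
  port: number;  // worker port (assumed)
  stats?: WorkerStats;
}

interface ParsedRequestHead {
  method?: string;
  path?: string;
  host?: string;
  headers: Record<string, string>;
}

interface LoadBalancerContext {
  candidates: BackendCandidate[];
  request: ParsedRequestHead;
  socket: unknown; // raw TCP socket (net.Socket in Node)
  previousIndex: number;
}

// A load balancer is just a function from context to chosen candidate.
type LoadBalancer = (ctx: LoadBalancerContext) => BackendCandidate;

const pickFirst: LoadBalancer = (ctx) => ctx.candidates[0]!;
```

Thinking of the balancer as a pure function of this context makes strategies easy to unit-test without starting any workers.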
## API

### `express`

Re-export of `ultimate-express`. Please see the [ultimate-express documentation](https://github.com/dimdenGD/ultimate-express/tree/main) for details on compatibility, caveats, and limitations. Almost all `express` features are supported, including WebSockets and HTTP/3. There are _tiny_ differences and edge cases to be aware of, but these shouldn't be a concern for typical apps.

If you rely on advanced or rarely used `express` features, make sure to implement end-to-end tests for your app to verify compatibility before you migrate to `defuss-express`. We haven't found any issues in our testing, but `express` has so many features and feature combinations that we can't guarantee 100% compatibility in every edge case. That said, if you do find an issue, please report it! This package is actively maintained, and we will prioritize fixes to ensure maximum compatibility with the `express` API.

### `setServerConfig(config)`

Sets the global runtime config before `startServer(app)` is called.

```ts
setServerConfig({
  host: "0.0.0.0",
  port: 3000,
  workerHost: "127.0.0.1",
  baseWorkerPort: 3001,
  workers: "auto",
  workerHeartbeatIntervalMs: 60_000,
  workerHeartbeatStaleAfterMs: 150_000,
  requestInspectionTimeoutMs: 10,
  maxHeaderBytes: 16 * 1024,
  gracefulShutdownTimeoutMs: 10_000,
  respawnWorkers: true,
  installSignalHandlers: true,
});
```

### `startServer(app, config?)`

Starts the runtime. In the primary process it forks workers and starts the TCP balancer. In worker processes it binds the app to a worker port.

### `stopServer(graceful=true)`

Stops the balancer or the worker server, depending on the current process role.

A _graceful_ shutdown (the default) first stops accepting new connections, then waits for in-flight requests to finish until `gracefulShutdownTimeoutMs` is reached, at which point it forcefully terminates any remaining connections and exits.

A _non-graceful_ shutdown immediately terminates all connections and exits.

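The graceful path can be thought of as a race between draining in-flight work and a hard deadline. A minimal sketch of that pattern (generic Node.js code, not the package's actual implementation):

```typescript
// Sketch of the graceful-shutdown timeout pattern (hypothetical helper, not
// defuss-express's real internals): wait for in-flight work to drain, but
// give up and force-close once the deadline elapses.
async function gracefulStop(
  drain: Promise<void>, // resolves when all in-flight requests have finished
  timeoutMs: number,    // corresponds to gracefulShutdownTimeoutMs
): Promise<"drained" | "forced"> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<"forced">((resolve) => {
    timer = setTimeout(() => resolve("forced"), timeoutMs);
  });
  const result = await Promise.race([
    drain.then(() => "drained" as const),
    deadline,
  ]);
  clearTimeout(timer); // don't keep the event loop alive after a clean drain
  return result;
}
```

In the "forced" branch, a real server would then destroy the remaining sockets before exiting.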
### Built-in load balancer strategies

- `roundRobinLoadBalancer`
- `leastConnectionsLoadBalancer`
- `resourceAwareLoadBalancer`
- `defaultLoadBalancer`

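To illustrate what the first two strategies do, here is their selection logic reduced to pure functions. This is a sketch only — the shipped implementations live inside `defuss-express`, and the `activeConnections` stat field is an assumption:

```typescript
// Illustrative selection logic for round-robin and least-connections.
// `Candidate` and its `activeConnections` field are assumptions for this
// sketch, not defuss-express's actual exported types.
interface Candidate {
  id: number;
  stats?: { activeConnections?: number };
}

// Round-robin: advance the previous index modulo the candidate count.
function roundRobin(candidates: Candidate[], previousIndex: number): Candidate {
  return candidates[(previousIndex + 1) % candidates.length]!;
}

// Least-connections: pick the backend with the fewest active connections.
function leastConnections(candidates: Candidate[]): Candidate {
  return [...candidates].sort(
    (a, b) =>
      (a.stats?.activeConnections ?? 0) - (b.stats?.activeConnections ?? 0),
  )[0]!;
}
```

Round-robin is cheapest and spreads load evenly under uniform request cost; least-connections adapts better when some requests are much slower than others.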
## Implementation details and design notes

### Should I store state in-memory in my app? Can I simply declare a global variable to share state across requests or per-session? Does this handle statefulness?

General advice: don't. If you can choose the architecture, do not implement any in-memory statefulness if you want your system to be horizontally scalable.

Memory is not shared across processes. When a user request hits the load balancer, it can be routed to **any** of the available worker processes. Your app will behave **flakily** if a request _sometimes_ hits the worker holding the expected in-memory state and _sometimes_ hits a worker that doesn't. This is a fundamental consequence of the multi-core model and of how load balancers work.

Therefore, you can:

- Use an external state store. `defuss-sharedmemory`, `defuss-redis`, or `defuss-db` are great candidates.

- Use `defuss-redis` or `defuss-db` if you need to share state across multiple machines, or want the convenience of a higher-level data model and don't mind the extra latency of a network round trip to your state store.

- Use `defuss-sharedmemory` if you only deploy to a **single host** (no horizontal scaling across multiple machines) and want the lowest possible latency for state access.

That said, if you have other specific statefulness requirements (e.g. you need sticky sessions, or `Bearer`/API auth with state managed per-session **in memory and per process**), you **must** implement a custom load balancer strategy via `setServerConfig` to pin requests to a specific process as soon as the machine has more than one CPU core/hyperthread. You can use request metadata (headers, path, etc.) to hash to a specific backend index (see `defuss-hash` for federated hashing).

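The hash-to-a-backend-index approach mentioned above reduces to a small pure function. A sketch (the djb2-style hash below is just one common choice, not a `defuss-hash` API):

```typescript
// Sketch: pin a session to a stable backend index by hashing an identifier
// (e.g. a session header). The hash is an arbitrary djb2-style choice made
// for this example, not defuss-hash's actual algorithm.
function backendIndexFor(sessionId: string, backendCount: number): number {
  let hash = 0;
  for (const ch of sessionId) {
    hash = ((hash << 5) - hash + ch.charCodeAt(0)) | 0; // keep in 32-bit range
  }
  return Math.abs(hash) % backendCount;
}
```

Because the function is deterministic, the same session always lands on the same worker — but note that the mapping changes whenever the backend count changes (e.g. after a worker respawn shrinks the healthy set), which is exactly why external state stores are the more robust option.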
### Operational notes on multi-core behavior

The same entry module is executed in both the primary process and each worker. That means app construction should be deterministic and free of one-shot side effects that are only safe in a single process.

The primary process does not terminate TLS or parse full HTTP bodies. It only sniffs the request head long enough to let a custom balancer inspect the method, path, and headers, then proxies bytes to the selected worker.

WebSocket upgrades continue to work because the proxy is plain TCP after backend selection.

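To make "sniffs the request head" concrete: at the byte level this means reading only up to the blank line (`\r\n\r\n`) that terminates the header block, then splitting out the method, path, and headers. A minimal illustrative parser (not the package's actual code):

```typescript
// Illustrative HTTP/1.1 request-head parser — a sketch of what head
// "sniffing" involves, not defuss-express's real implementation.
function parseRequestHead(buffer: string):
  | { method: string; path: string; headers: Record<string, string> }
  | null {
  const end = buffer.indexOf("\r\n\r\n");
  if (end === -1) return null; // head not fully received yet; keep buffering

  const [requestLine, ...headerLines] = buffer.slice(0, end).split("\r\n");
  const [method, path] = requestLine!.split(" "); // "GET /path HTTP/1.1"

  const headers: Record<string, string> = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon > 0) {
      headers[line.slice(0, colon).toLowerCase()] = line.slice(colon + 1).trim();
    }
  }
  return { method: method ?? "", path: path ?? "", headers };
}
```

The `null` return for an incomplete head is why a `requestInspectionTimeoutMs` exists: if the full head never arrives within the window, the balancer must fall back rather than stall the connection.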
## Commands

```bash
bun run check
bun run test
bun run build
bun run bench        # throughput benchmark (DSON payloads)
bun run bench:server # start benchmark server standalone
```

## Benchmarks

Measured with [autocannon](https://github.com/mcollina/autocannon) on an Apple M1 Pro (10 cores), 100 concurrent connections, 10 s per scenario plus a 3 s warmup. The server uses `resourceAwareLoadBalancer`. Payloads are serialized/deserialized through `defuss-dson` (a typed superset of JSON supporting `Date`, `Map`, `Set`, `RegExp`, `BigInt`, `Uint8Array`, …).

```
bun run bench
```

| Scenario | Avg Req/s | p50 | p99 | Throughput |
|---|---|---|---|---|
| GET /dson/generate (complex typed object) | **36,384** | 2 ms | 13 ms | 186 MB/s |
| POST /dson/echo (small, 3 fields) | **52,841** | 1 ms | 4 ms | 18 MB/s |
| POST /dson/echo (medium, 100 users) | **10,168** | 8 ms | 25 ms | 179 MB/s |
| POST /dson/transform (enrich 100 users) | **10,560** | 8 ms | 23 ms | 188 MB/s |
| POST /dson/echo (large, 500 records + binary) | **750** | 120 ms | 299 ms | 228 MB/s |

Zero errors across all scenarios. Results scale roughly linearly with core count.

## License

MIT