alvin-bot 4.9.3 → 4.9.4
- package/CHANGELOG.md +43 -0
- package/README.md +12 -1
- package/dist/index.js +8 -3
- package/dist/web/bind-strategy.js +42 -0
- package/dist/web/server.js +231 -101
- package/package.json +1 -1
- package/test/stress-scenarios.test.ts +1 -1
- package/test/web-server-integration.test.ts +189 -0
- package/test/web-server-resilience.test.ts +118 -0
- package/test/web-server-shutdown.test.ts +7 -1
package/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,49 @@
|
|
|
2
2
|
|
|
3
3
|
All notable changes to Alvin Bot are documented here.
|
|
4
4
|
|
|
5
|
+
## [4.9.4] - 2026-04-13
|
|
6
|
+
|
|
7
|
+
### Web UI fully decoupled from main bot - port conflicts no longer crash anything
|
|
8
|
+
|
|
9
|
+
Colleague feedback (WhatsApp voice note, 2026-04-13):
|
|
10
|
+
> *"The gateway binds to port 3100 like OpenClaw. When the bot restarts,
|
|
11
|
+
> the port is often still held - catastrophic crash. I ended up
|
|
12
|
+
> decoupling the gateway process completely, because the actual bot
|
|
13
|
+
> runs independently of the gateway - it can still answer Telegram
|
|
14
|
+
> even if the web endpoint isn't reachable yet. It's weird that the
|
|
15
|
+
> main routine crashes when the port is busy. It should just run in
|
|
16
|
+
> the background, watch for the port to become free, and connect
|
|
17
|
+
> then. Zero impact on the main routine."*
|
|
18
|
+
|
|
19
|
+
He was right. My v4.9.0 `stopWebServer()` fix was *prevention*: it stopped the bot itself from holding 3100 across restarts. But it didn't cover the *resilience* side: a foreign process holding 3100 (another dev server, an OpenClaw-style orphan, a TIME_WAIT race after SIGKILL) still crashed the boot, because `startWebServer()` was synchronous and the `uncaught exception` from `server.listen()` escaped to the main event loop.
|
|
20
|
+
|
|
21
|
+
**Complete rewrite of the bind loop:**
|
|
22
|
+
|
|
23
|
+
- **`src/web/bind-strategy.ts` (new): pure decision helper.** `decideNextBindAction(err, attempt, opts)` returns either `{type: "retry-port", port, attempt}` (climb the ladder) or `{type: "retry-background", delayMs, port}` (back off, retry the original port in 30 s). EADDRINUSE with attempts remaining → ladder. EADDRINUSE exhausted → background. Any other error → background. 8 unit tests covering every branch + purity.
|
|
24
|
+
|
|
25
|
+
- **`src/web/server.ts` startWebServer: non-blocking, fresh-server-per-attempt.** Returns `void` synchronously, NEVER throws, NEVER blocks on bind. Each attempt creates a new `http.Server` (no state-recycling bugs) and attaches its own error handler. On failure, cleans up and calls `decideNextBindAction` to decide the next move. If the ladder is exhausted, schedules a 30 s background retry at the original port. The Telegram bot keeps running the whole time; the web UI just isn't reachable yet.
|
|
26
|
+
|
|
27
|
+
- **`src/web/server.ts` WebSocketServer attached POST-bind.** The `ws` library's `WebSocketServer` constructor installs its own event plumbing on the underlying `http.Server` and, crucially, causes EADDRINUSE errors to escape as uncaught exceptions when attached pre-listen. Debugging this chewed an hour on 2026-04-13. Fix: only `new WebSocketServer({ server })` AFTER `listen()` has fired its callback. The unit-test `test/web-server-integration.test.ts "when the primary port is taken"` pins this behaviour.
|
|
28
|
+
|
|
29
|
+
- **`src/web/server.ts` error handler: `on` not `once`.** The previous version used `.once("error", handler)`, so a Node edge case where a single bind failure emits TWO error events left the second one uncaught. The handler is now attached with `on` and a `handled` guard, which makes it idempotent, and a post-bind quiet logger replaces it on success.
|
|
30
|
+
|
|
31
|
+
- **`src/web/server.ts` defensive try/catch around `server.listen()`.** In the wild Node sometimes throws synchronously for edge-case binds (already-listening, invalid backlog, kernel race). The catch funnels sync throws through the same `handleBindFailure` path as async error events.
|
|
32
|
+
|
|
33
|
+
- **`src/web/server.ts` `closeHttpServerGracefully(server)` + `stopWebServer()`.** The old `stopWebServer(server)` took an explicit server arg; it's been split into a low-level helper (`closeHttpServerGracefully(server)`, exported for tests) and a stateful top-level (`stopWebServer()`, no args, cleans up `currentServer` + `wsServerRef` + `bindRetryTimer`). Safe to call before start, safe to call twice, cancels pending background retries.
|
|
34
|
+
|
|
35
|
+
- **`src/index.ts` call sites adjusted.** `const webServer = startWebServer()` → `startWebServer()`. `stopWebServer(webServer)` → `stopWebServer()`. The comment above the call explains the decoupling so nobody accidentally re-couples it in a future "clean up" refactor.
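The decision rules in the first bullet can be traced with a small standalone sketch. It repeats `decideNextBindAction` as shipped in `dist/web/bind-strategy.js` so that it runs on its own; the `simulateBind` driver and the sets of busy ports are hypothetical and only stand in for real `listen()` failures.

```javascript
// Copy of the shipped decision helper, repeated so the sketch is self-contained.
function decideNextBindAction(err, attempt, opts) {
  const code = err?.code;
  if (code === "EADDRINUSE" && attempt < opts.maxPortTries - 1) {
    return { type: "retry-port", port: opts.originalPort + attempt + 1, attempt: attempt + 1 };
  }
  return { type: "retry-background", delayMs: opts.backgroundRetryMs, port: opts.originalPort };
}

// Hypothetical driver: walks the ladder against a pretend set of busy ports
// instead of binding real sockets.
function simulateBind(busyPorts, opts) {
  let port = opts.originalPort;
  let attempt = 0;
  const tried = [];
  for (;;) {
    tried.push(port);
    if (!busyPorts.has(port)) return { outcome: "bound", port, tried };
    const action = decideNextBindAction({ code: "EADDRINUSE" }, attempt, opts);
    if (action.type === "retry-background") {
      return { outcome: "background", port: action.port, delayMs: action.delayMs, tried };
    }
    port = action.port;
    attempt = action.attempt;
  }
}

const opts = { originalPort: 3100, maxPortTries: 20, backgroundRetryMs: 30000 };

// Only the primary port is busy: the ladder stops at 3101.
const climbed = simulateBind(new Set([3100]), opts);

// All 20 ladder ports (3100 to 3119) are busy: fall back to a background retry at 3100.
const allBusy = simulateBind(new Set(Array.from({ length: 20 }, (_, i) => 3100 + i)), opts);

// A non-EADDRINUSE error skips the ladder entirely.
const eacces = decideNextBindAction({ code: "EACCES" }, 0, opts);

console.log(climbed.outcome, climbed.port);
console.log(allBusy.outcome, allBusy.port, allBusy.tried.length);
console.log(eacces.type);
```

Because the helper has no timers or I/O, the same calls work unchanged inside a unit test.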
|
|
36
|
+
|
|
37
|
+
**Testing: 186 → 201 (+15 new).**
|
|
38
|
+
|
|
39
|
+
- `test/web-server-resilience.test.ts` - 8 unit tests for `decideNextBindAction`
|
|
40
|
+
- `test/web-server-integration.test.ts` - 7 real-server integration tests: startWebServer returns void, binds, stops, is idempotent, survives primary-port conflict by climbing the ladder, closes servers with hanging sockets.
|
|
41
|
+
- **Live-verified on the maintainer's machine**: `launchctl unload` + dual-stack Node hog on port 3100 + `launchctl load` → bot booted cleanly → out.log contained `[web] port 3100 busy (EADDRINUSE) - trying 3101` → `Web UI: http://localhost:3101 (Port 3100 was busy, using 3101 instead)` → Telegram responsive throughout. Exactly what the colleague described.
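The idempotency that the integration tests check can be sketched without a real server. This is a simplified illustration, not the shipped `stopWebServer()`: it keeps only the `stopRequested` flag and the `bindRetryTimer` handle named in the changelog, and the `scheduleBackgroundRetry` and `stop` functions and the `retriesFired` counter are hypothetical.

```javascript
// Module-level state, mirroring the names used in the changelog.
let stopRequested = false;
let bindRetryTimer = null;
let retriesFired = 0; // hypothetical counter, only here to observe the timer

// Hypothetical stand-in for the background retry scheduled after a failed bind.
function scheduleBackgroundRetry(delayMs) {
  if (stopRequested) return;
  bindRetryTimer = setTimeout(() => {
    bindRetryTimer = null;
    retriesFired += 1;
  }, delayMs);
}

// Simplified stop: flips the flag and cancels a pending retry, nothing else.
function stop() {
  stopRequested = true;
  if (bindRetryTimer) {
    clearTimeout(bindRetryTimer);
    bindRetryTimer = null;
  }
}

stop();                          // before any start: nothing to cancel, no throw
stopRequested = false;           // pretend a start call reset the flag
scheduleBackgroundRetry(30000);  // a retry is now pending
const timerWasPending = bindRetryTimer !== null;
stop();                          // cancels the pending retry
stop();                          // second call is a no-op

console.log(timerWasPending, bindRetryTimer, retriesFired);
```

Cancelling the timer in `stop` is what keeps a late retry from grabbing the port after shutdown, and it also lets the process exit without waiting out the delay.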
|
|
42
|
+
|
|
43
|
+
**Non-goals / intentionally unchanged:**
|
|
44
|
+
- Timeouts stay unlimited (v4.8.8 behaviour preserved).
|
|
45
|
+
- The primary port is still `WEB_PORT || 3100`; no config schema change.
|
|
46
|
+
- When the bot binds on a non-primary port (e.g. 3101), the README permalink still points at 3100. Users hitting a ladder-climbed bot should check the startup log; this is rare and temporary.
|
|
47
|
+
|
|
5
48
|
## [4.9.3] - 2026-04-11
|
|
6
49
|
|
|
7
50
|
### Two UX bugs found in production after v4.9.2 - now closed
|
package/README.md
CHANGED
|
@@ -114,7 +114,18 @@ That's it. The setup wizard validates everything:
|
|
|
114
114
|
|
|
115
115
|
**Requires:** Node.js 18+ ([nodejs.org](https://nodejs.org)) · Telegram bot token ([@BotFather](https://t.me/BotFather)) · Your Telegram user ID ([@userinfobot](https://t.me/userinfobot))
|
|
116
116
|
|
|
117
|
-
Free AI providers available - no credit card needed.
|
|
117
|
+
Free AI providers available - no credit card needed. **Privacy-first?** Pick the **Offline - Gemma 4 E4B** option in setup for a fully local LLM via Ollama (macOS/Linux: automated install; Windows: manual).
|
|
118
|
+
|
|
119
|
+
### First-time setup walkthroughs
|
|
120
|
+
|
|
121
|
+
Step-by-step guides with screenshots and screen-for-screen instructions:
|
|
122
|
+
|
|
123
|
+
| Platform | PDF (printable) |
|
|
124
|
+
|---|---|
|
|
125
|
+
| **macOS** (with `launchd` background service) | [Download PDF](https://github.com/alvbln/Alvin-Bot/releases/latest/download/Alvin-Bot-macOS-Setup-Guide.pdf) |
|
|
126
|
+
| **Windows** (with Task Scheduler / Startup folder) | [Download PDF](https://github.com/alvbln/Alvin-Bot/releases/latest/download/Alvin-Bot-Windows-Setup-Guide.pdf) |
|
|
127
|
+
|
|
128
|
+
Both guides cover: Node.js install · Telegram bot creation · first-time `setup` · foreground test · background service · offline Gemma 4 mode · troubleshooting. ~15 min end-to-end for a first-time user.
|
|
118
129
|
|
|
119
130
|
### macOS: use `launchd` instead of pm2 (recommended)
|
|
120
131
|
|
package/dist/index.js
CHANGED
|
@@ -267,7 +267,7 @@ const shutdown = async () => {
|
|
|
267
267
|
}
|
|
268
268
|
// Release :3100 so the next launchd boot doesn't hit EADDRINUSE.
|
|
269
269
|
// Must happen before exit - see src/web/server.ts stopWebServer() comment.
|
|
270
|
-
await stopWebServer(
|
|
270
|
+
await stopWebServer().catch((err) => console.warn("[shutdown] stopWebServer failed:", err));
|
|
271
271
|
await unloadPlugins().catch(() => { });
|
|
272
272
|
await disconnectMCP().catch(() => { });
|
|
273
273
|
// Tear down any bot-managed local runners (Ollama, LM Studio, …) so VRAM
|
|
@@ -404,8 +404,13 @@ async function startOptionalPlatforms() {
|
|
|
404
404
|
}
|
|
405
405
|
}
|
|
406
406
|
startOptionalPlatforms().catch(err => console.error("Platform startup error:", err));
|
|
407
|
-
// Start Web UI (ALWAYS - regardless of Telegram/AI config)
|
|
408
|
-
|
|
407
|
+
// Start Web UI (ALWAYS - regardless of Telegram/AI config).
|
|
408
|
+
// startWebServer is now non-blocking and will never throw: if port 3100
|
|
409
|
+
// is busy (foreign process, TIME_WAIT, another bot instance), it climbs
|
|
410
|
+
// the port ladder up to 3119 and then enters a background retry loop
|
|
411
|
+
// at 3100 every 30s. The Telegram bot runs independently - Web UI is a
|
|
412
|
+
// feature, not core. See src/web/bind-strategy.ts for the retry rules.
|
|
413
|
+
startWebServer();
|
|
409
414
|
// Start Cron Scheduler - route notifications through delivery queue for reliability
|
|
410
415
|
setNotifyCallback(async (target, text) => {
|
|
411
416
|
if (target.platform === "web") {
|
|
package/dist/web/bind-strategy.js
@@ -0,0 +1,42 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Pure decision helper for the web-server bind loop.
|
|
3
|
+
*
|
|
4
|
+
* Decouples the "what should happen next" logic from the side-effect
|
|
5
|
+
* spaghetti of real http.Server binding so it can be unit-tested in
|
|
6
|
+
* isolation. See test/web-server-resilience.test.ts for the contract.
|
|
7
|
+
*
|
|
8
|
+
* Why this exists: the v4.8.x and earlier implementations crashed the
|
|
9
|
+
* entire bot when port 3100 was held by a foreign process. A colleague
|
|
10
|
+
* running an OpenClaw fork hit the same bug years ago and ended up
|
|
11
|
+
* decoupling the web server completely - the main bot should never be
|
|
12
|
+
* gated on a web-UI bind. This helper encodes the decision logic so
|
|
13
|
+
* the new startWebServer() can just act on the returned action.
|
|
14
|
+
*/
|
|
15
|
+
/**
|
|
16
|
+
* Decide what the bind loop should do next after a failed listen().
|
|
17
|
+
*
|
|
18
|
+
* Rule of thumb:
|
|
19
|
+
* - EADDRINUSE AND attempts remaining → climb the port ladder.
|
|
20
|
+
* - EADDRINUSE AND ladder exhausted → background retry at original port.
|
|
21
|
+
* - any other error (EACCES, listen-called-twice, etc.) → background retry.
|
|
22
|
+
*
|
|
23
|
+
* PURE: no timers, no I/O, no mutation of inputs. Safe to call from tests.
|
|
24
|
+
*/
|
|
25
|
+
export function decideNextBindAction(err, attempt, opts) {
|
|
26
|
+
const code = err?.code;
|
|
27
|
+
if (code === "EADDRINUSE" && attempt < opts.maxPortTries - 1) {
|
|
28
|
+
return {
|
|
29
|
+
type: "retry-port",
|
|
30
|
+
port: opts.originalPort + attempt + 1,
|
|
31
|
+
attempt: attempt + 1,
|
|
32
|
+
};
|
|
33
|
+
}
|
|
34
|
+
// EADDRINUSE with no attempts left, OR any non-EADDRINUSE error:
|
|
35
|
+
// don't walk the port ladder further, just back off and retry the
|
|
36
|
+
// original port in the background.
|
|
37
|
+
return {
|
|
38
|
+
type: "retry-background",
|
|
39
|
+
delayMs: opts.backgroundRetryMs,
|
|
40
|
+
port: opts.originalPort,
|
|
41
|
+
};
|
|
42
|
+
}
|
package/dist/web/server.js
CHANGED
|
@@ -30,10 +30,24 @@ import { addCanvasClient } from "./canvas.js";
|
|
|
30
30
|
import { BOT_ROOT, ENV_FILE, PUBLIC_DIR, MEMORY_DIR, MEMORY_FILE, SOUL_FILE, DATA_DIR, MCP_CONFIG, SKILLS_DIR } from "../paths.js";
|
|
31
31
|
import { broadcast } from "../services/broadcast.js";
|
|
32
32
|
import { BOT_VERSION } from "../version.js";
|
|
33
|
+
import { decideNextBindAction } from "./bind-strategy.js";
|
|
33
34
|
const WEB_PORT = parseInt(process.env.WEB_PORT || "3100");
|
|
34
|
-
/**
|
|
35
|
-
*
|
|
35
|
+
/** Tuning for the bind loop. Walk the port ladder `MAX_PORT_TRIES` times
|
|
36
|
+
* then fall back to a `BACKGROUND_RETRY_MS` idle loop - the bot keeps
|
|
37
|
+
* running on Telegram either way; see bind-strategy.ts for the pure
|
|
38
|
+
* decision logic. */
|
|
39
|
+
const MAX_PORT_TRIES = 20;
|
|
40
|
+
const BACKGROUND_RETRY_MS = 30_000;
|
|
41
|
+
/** Current live http.Server, if one has successfully bound. */
|
|
42
|
+
let currentServer = null;
|
|
43
|
+
/** Current live WebSocketServer attached to currentServer. */
|
|
36
44
|
let wsServerRef = null;
|
|
45
|
+
/** Background-retry timer handle - set when the bind loop is in its
|
|
46
|
+
* idle wait between cycles, cleared when stopWebServer() cancels. */
|
|
47
|
+
let bindRetryTimer = null;
|
|
48
|
+
/** Flag flipped by stopWebServer(). Every bind-loop callback checks
|
|
49
|
+
* this and exits silently if set, so stop is truly terminal. */
|
|
50
|
+
let stopRequested = false;
|
|
37
51
|
const WEB_PASSWORD = process.env.WEB_PASSWORD || "";
|
|
38
52
|
/** The actual port the Web UI is running on (may differ from WEB_PORT if busy). */
|
|
39
53
|
let actualWebPort = WEB_PORT;
|
|
@@ -1371,126 +1385,207 @@ function handleWebSocket(wss) {
|
|
|
1371
1385
|
});
|
|
1372
1386
|
}
|
|
1373
1387
|
// ── Start Server ────────────────────────────────────────
|
|
1374
|
-
|
|
1375
|
-
|
|
1376
|
-
|
|
1377
|
-
|
|
1378
|
-
|
|
1379
|
-
|
|
1380
|
-
|
|
1381
|
-
|
|
1382
|
-
|
|
1383
|
-
|
|
1384
|
-
|
|
1385
|
-
|
|
1386
|
-
|
|
1387
|
-
|
|
1388
|
-
|
|
1389
|
-
|
|
1390
|
-
|
|
1391
|
-
|
|
1392
|
-
|
|
1393
|
-
|
|
1394
|
-
|
|
1395
|
-
|
|
1396
|
-
|
|
1397
|
-
|
|
1398
|
-
|
|
1399
|
-
|
|
1400
|
-
|
|
1401
|
-
|
|
1402
|
-
|
|
1403
|
-
}
|
|
1404
|
-
catch {
|
|
1405
|
-
res.statusCode = 404;
|
|
1406
|
-
res.end("Not found");
|
|
1407
|
-
}
|
|
1408
|
-
return;
|
|
1409
|
-
}
|
|
1410
|
-
// Static files
|
|
1411
|
-
let filePath = urlPath === "/" ? "/index.html" : urlPath;
|
|
1412
|
-
filePath = resolve(PUBLIC_DIR, filePath.slice(1));
|
|
1413
|
-
// Security: prevent path traversal
|
|
1414
|
-
if (!filePath.startsWith(PUBLIC_DIR)) {
|
|
1415
|
-
res.statusCode = 403;
|
|
1416
|
-
res.end("Forbidden");
|
|
1417
|
-
return;
|
|
1418
|
-
}
|
|
1388
|
+
/**
|
|
1389
|
+
* HTTP request handler for the web UI. Hoisted to a top-level function
|
|
1390
|
+
* so every bind attempt can create a fresh http.Server without
|
|
1391
|
+
* rebuilding the handler closure.
|
|
1392
|
+
*/
|
|
1393
|
+
function handleWebRequest(req, res) {
|
|
1394
|
+
let body = "";
|
|
1395
|
+
req.on("data", (chunk) => { body += chunk; });
|
|
1396
|
+
req.on("end", () => {
|
|
1397
|
+
const urlPath = (req.url || "/").split("?")[0];
|
|
1398
|
+
// OpenAI-compatible API (/v1/chat/completions, /v1/models)
|
|
1399
|
+
if (urlPath.startsWith("/v1/")) {
|
|
1400
|
+
handleOpenAICompat(req, res, urlPath, body);
|
|
1401
|
+
return;
|
|
1402
|
+
}
|
|
1403
|
+
// API routes
|
|
1404
|
+
if (urlPath.startsWith("/api/")) {
|
|
1405
|
+
handleAPI(req, res, urlPath, body);
|
|
1406
|
+
return;
|
|
1407
|
+
}
|
|
1408
|
+
// Auth page (if password set and not authenticated)
|
|
1409
|
+
if (WEB_PASSWORD && !checkAuth(req) && urlPath !== "/login.html") {
|
|
1410
|
+
res.writeHead(302, { Location: "/login.html" });
|
|
1411
|
+
res.end();
|
|
1412
|
+
return;
|
|
1413
|
+
}
|
|
1414
|
+
// Canvas UI
|
|
1415
|
+
if (urlPath === "/canvas") {
|
|
1416
|
+
const canvasFile = resolve(PUBLIC_DIR, "canvas.html");
|
|
1419
1417
|
try {
|
|
1420
|
-
const content = fs.readFileSync(
|
|
1421
|
-
|
|
1422
|
-
res.setHeader("Content-Type", MIME[ext] || "application/octet-stream");
|
|
1418
|
+
const content = fs.readFileSync(canvasFile);
|
|
1419
|
+
res.setHeader("Content-Type", "text/html");
|
|
1423
1420
|
res.end(content);
|
|
1424
1421
|
}
|
|
1425
1422
|
catch {
|
|
1426
1423
|
res.statusCode = 404;
|
|
1427
1424
|
res.end("Not found");
|
|
1428
1425
|
}
|
|
1429
|
-
|
|
1426
|
+
return;
|
|
1427
|
+
}
|
|
1428
|
+
// Static files
|
|
1429
|
+
let filePath = urlPath === "/" ? "/index.html" : urlPath;
|
|
1430
|
+
filePath = resolve(PUBLIC_DIR, filePath.slice(1));
|
|
1431
|
+
// Security: prevent path traversal
|
|
1432
|
+
if (!filePath.startsWith(PUBLIC_DIR)) {
|
|
1433
|
+
res.statusCode = 403;
|
|
1434
|
+
res.end("Forbidden");
|
|
1435
|
+
return;
|
|
1436
|
+
}
|
|
1437
|
+
try {
|
|
1438
|
+
const content = fs.readFileSync(filePath);
|
|
1439
|
+
const ext = path.extname(filePath);
|
|
1440
|
+
res.setHeader("Content-Type", MIME[ext] || "application/octet-stream");
|
|
1441
|
+
res.end(content);
|
|
1442
|
+
}
|
|
1443
|
+
catch {
|
|
1444
|
+
res.statusCode = 404;
|
|
1445
|
+
res.end("Not found");
|
|
1446
|
+
}
|
|
1430
1447
|
});
|
|
1431
|
-
|
|
1432
|
-
|
|
1433
|
-
|
|
1434
|
-
|
|
1435
|
-
|
|
1436
|
-
|
|
1437
|
-
|
|
1438
|
-
|
|
1439
|
-
|
|
1440
|
-
|
|
1441
|
-
|
|
1442
|
-
|
|
1443
|
-
|
|
1448
|
+
}
|
|
1449
|
+
/**
|
|
1450
|
+
* Kick off the web-UI bind loop. NEVER throws, NEVER blocks.
|
|
1451
|
+
*
|
|
1452
|
+
* History: earlier versions returned an http.Server synchronously and
|
|
1453
|
+
* let listen() errors bubble up as uncaught exceptions - a colleague
|
|
1454
|
+
* flagged this on 2026-04-13 after spending months fighting the exact
|
|
1455
|
+
* same bug on a parallel OpenClaw fork. Their resolution: "the gateway
|
|
1456
|
+
* is a feature, not core. Decouple it."
|
|
1457
|
+
*
|
|
1458
|
+
* New contract:
|
|
1459
|
+
* - Returns `void` immediately. The actual bind happens asynchronously.
|
|
1460
|
+
* - If port 3100 is busy, tries 3101…3119 in sequence (same as before).
|
|
1461
|
+
* - If ALL 20 ports are busy, schedules a background retry at 3100
|
|
1462
|
+
* in `BACKGROUND_RETRY_MS` - keeps trying forever until success
|
|
1463
|
+
* or stopWebServer() is called.
|
|
1464
|
+
* - Any non-EADDRINUSE error also falls through to background retry.
|
|
1465
|
+
* - Each attempt uses a FRESH http.Server to avoid node's fragile
|
|
1466
|
+
* "listen-called-twice" state-recycling behaviour.
|
|
1467
|
+
* - The main Telegram bot is completely independent of this - if the
|
|
1468
|
+
* web UI never binds, the bot still answers messages.
|
|
1469
|
+
*/
|
|
1470
|
+
export function startWebServer() {
|
|
1471
|
+
stopRequested = false;
|
|
1472
|
+
scheduleBindAttempt(WEB_PORT, 0);
|
|
1473
|
+
}
|
|
1474
|
+
function scheduleBindAttempt(port, attempt) {
|
|
1475
|
+
if (stopRequested)
|
|
1476
|
+
return;
|
|
1477
|
+
// Read WEB_PORT live every time rather than closing over the
|
|
1478
|
+
// module-load value, so tests that change process.env.WEB_PORT
|
|
1479
|
+
// between runs see the new port.
|
|
1480
|
+
const originalPort = parseInt(process.env.WEB_PORT || "3100");
|
|
1481
|
+
// Fresh server for each attempt. Recycling a server that has already
|
|
1482
|
+
// emitted an EADDRINUSE error has produced "Listen method has been
|
|
1483
|
+
// called more than once" crashes in the wild.
|
|
1484
|
+
//
|
|
1485
|
+
// IMPORTANT: do NOT attach the WebSocketServer yet. The `ws` library
|
|
1486
|
+
// installs its own event plumbing on the http.Server in its
|
|
1487
|
+
// constructor, which causes bind errors to escape as uncaught
|
|
1488
|
+
// exceptions. We only attach it AFTER listen() has succeeded.
|
|
1489
|
+
const server = http.createServer(handleWebRequest);
|
|
1490
|
+
// Double-invocation guard: on some Node versions `server.listen`
|
|
1491
|
+
// both throws synchronously AND emits an `error` event for the same
|
|
1492
|
+
// bind failure. Without the guard we'd climb the ladder twice in
|
|
1493
|
+
// parallel and end up with two retry cascades racing each other.
|
|
1494
|
+
let handled = false;
|
|
1495
|
+
const cleanupDeadAttempt = () => {
|
|
1496
|
+
try {
|
|
1497
|
+
server.removeAllListeners("error");
|
|
1498
|
+
}
|
|
1499
|
+
catch { /* ignore */ }
|
|
1500
|
+
try {
|
|
1501
|
+
server.close(() => { });
|
|
1502
|
+
}
|
|
1503
|
+
catch { /* ignore */ }
|
|
1504
|
+
};
|
|
1505
|
+
const handleBindFailure = (err) => {
|
|
1506
|
+
if (handled)
|
|
1507
|
+
return;
|
|
1508
|
+
handled = true;
|
|
1509
|
+
cleanupDeadAttempt();
|
|
1510
|
+
if (stopRequested)
|
|
1511
|
+
return;
|
|
1512
|
+
const action = decideNextBindAction(err, attempt, {
|
|
1513
|
+
originalPort,
|
|
1514
|
+
maxPortTries: MAX_PORT_TRIES,
|
|
1515
|
+
backgroundRetryMs: BACKGROUND_RETRY_MS,
|
|
1444
1516
|
});
|
|
1517
|
+
if (action.type === "retry-port") {
|
|
1518
|
+
console.warn(`[web] port ${port} busy (${err.code || err.message}) - trying ${action.port}`);
|
|
1519
|
+
scheduleBindAttempt(action.port, action.attempt);
|
|
1520
|
+
return;
|
|
1521
|
+
}
|
|
1522
|
+
// action.type === "retry-background"
|
|
1523
|
+
console.warn(`[web] bind failed (${err.code || err.message}) - ` +
|
|
1524
|
+
`backing off ${action.delayMs / 1000}s then retrying port ${action.port}. ` +
|
|
1525
|
+
`Bot is unaffected; Telegram remains live.`);
|
|
1526
|
+
bindRetryTimer = setTimeout(() => {
|
|
1527
|
+
bindRetryTimer = null;
|
|
1528
|
+
scheduleBindAttempt(action.port, 0);
|
|
1529
|
+
}, action.delayMs);
|
|
1530
|
+
};
|
|
1531
|
+
// Use `on` (not `once`) so a pathological server that emits two
|
|
1532
|
+
// error events for a single failure doesn't leave the second one
|
|
1533
|
+
// uncaught. The `handled` guard makes the handler idempotent.
|
|
1534
|
+
server.on("error", handleBindFailure);
|
|
1535
|
+
// Defensive try/catch - `server.listen()` usually emits async errors,
|
|
1536
|
+
// but certain Node versions + edge cases (already-listening server,
|
|
1537
|
+
// invalid backlog, kernel hiccup) can throw synchronously. Catch here
|
|
1538
|
+
// so the main routine never crashes during web-UI bind.
|
|
1539
|
+
try {
|
|
1445
1540
|
server.listen(port, () => {
|
|
1541
|
+
if (handled)
|
|
1542
|
+
return; // Should be impossible; paranoia.
|
|
1543
|
+
handled = true;
|
|
1544
|
+
// Now, and only now, attach the WebSocketServer. Before the
|
|
1545
|
+
// bind succeeded, the ws library's constructor would hijack the
|
|
1546
|
+
// http.Server's error event chain and let EADDRINUSE escape as
|
|
1547
|
+
// uncaught. Post-bind is safe.
|
|
1548
|
+
const wss = new WebSocketServer({ server });
|
|
1549
|
+
handleWebSocket(wss);
|
|
1550
|
+
currentServer = server;
|
|
1551
|
+
wsServerRef = wss;
|
|
1446
1552
|
actualWebPort = port;
|
|
1553
|
+
// Remove the bind error handler - post-listen errors (socket
|
|
1554
|
+
// errors, close events) should not kick off a spurious retry
|
|
1555
|
+
// cycle. Install a quiet logger for any stray error events so
|
|
1556
|
+
// they can't escape as uncaught.
|
|
1557
|
+
server.removeListener("error", handleBindFailure);
|
|
1558
|
+
server.on("error", (err) => {
|
|
1559
|
+
console.warn(`[web] post-bind server error (ignored): ${err.message}`);
|
|
1560
|
+
});
|
|
1447
1561
|
console.log(`Web UI: http://localhost:${actualWebPort}`);
|
|
1448
|
-
if (actualWebPort !==
|
|
1449
|
-
console.log(` (Port ${
|
|
1562
|
+
if (actualWebPort !== originalPort) {
|
|
1563
|
+
console.log(` (Port ${originalPort} was busy, using ${actualWebPort} instead)`);
|
|
1450
1564
|
}
|
|
1451
1565
|
});
|
|
1452
1566
|
}
|
|
1453
|
-
|
|
1454
|
-
|
|
1567
|
+
catch (err) {
|
|
1568
|
+
handleBindFailure(err);
|
|
1569
|
+
}
|
|
1455
1570
|
}
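The double-invocation guard used in `scheduleBindAttempt` can be isolated into a few lines. This is a minimal illustration rather than the project's code: `makeGuardedHandler` and the `log` array are hypothetical names, and plain function calls stand in for a synchronous throw from `listen()` followed by an `error` event for the same failure.

```javascript
// Hypothetical helper: wraps a failure handler so that it runs at most once,
// no matter how many times the same bind failure is reported.
function makeGuardedHandler(onFailure) {
  let handled = false;
  return function guarded(err) {
    if (handled) return false; // duplicate report: swallow it quietly
    handled = true;
    onFailure(err);
    return true;
  };
}

// Stand-in for "decide and schedule the next bind attempt".
const log = [];
const handleBindFailure = makeGuardedHandler((err) => log.push(err.code));

// The same failure arrives twice, once per reporting path.
const bindError = Object.assign(new Error("listen EADDRINUSE"), { code: "EADDRINUSE" });
const first = handleBindFailure(bindError);  // acted on
const second = handleBindFailure(bindError); // ignored by the guard

console.log(first, second, log.length);
```

Without the guard, each report would start its own retry cascade, which is the race the comment in `scheduleBindAttempt` warns about.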
|
|
1456
1571
|
/**
|
|
1457
|
-
* Gracefully
|
|
1458
|
-
*
|
|
1459
|
-
*
|
|
1460
|
-
*
|
|
1461
|
-
* listening socket in the socket table, so launchd's next boot of the bot
|
|
1462
|
-
* hit `EADDRINUSE :::3100`, threw an Uncaught exception and crash-looped.
|
|
1572
|
+
* Gracefully close a specific http.Server - the low-level building
|
|
1573
|
+
* block. Exported for tests and for any future callers that manage
|
|
1574
|
+
* their own servers. Production bot code uses `stopWebServer()` below
|
|
1575
|
+
* which operates on the module-global current server instead.
|
|
1463
1576
|
*
|
|
1464
1577
|
* What this does:
|
|
1465
|
-
* 1. Force-close idle keep-alive sockets (
|
|
1466
|
-
* 2. Force-close active open requests (long-poll clients
|
|
1467
|
-
*
|
|
1468
|
-
* 3. Tear down the WebSocket server so its own sockets don't linger.
|
|
1469
|
-
* 4. Await `server.close()` so the listening socket is truly released
|
|
1470
|
-
* before the caller's shutdown continues.
|
|
1578
|
+
* 1. Force-close idle keep-alive sockets (Node 18.2+).
|
|
1579
|
+
* 2. Force-close active open requests (long-poll clients).
|
|
1580
|
+
* 3. Await `server.close()` so the listening socket is truly freed.
|
|
1471
1581
|
*
|
|
1472
|
-
* Safe to call
|
|
1473
|
-
*
|
|
1582
|
+
* Safe to call on already-closed, never-listened, or mid-listen servers.
|
|
1583
|
+
* Never throws.
|
|
1474
1584
|
*/
|
|
1475
|
-
export async function
|
|
1476
|
-
try {
|
|
1477
|
-
if (wsServerRef) {
|
|
1478
|
-
for (const client of wsServerRef.clients) {
|
|
1479
|
-
try {
|
|
1480
|
-
client.terminate();
|
|
1481
|
-
}
|
|
1482
|
-
catch { /* ignore */ }
|
|
1483
|
-
}
|
|
1484
|
-
await new Promise((resolve) => wsServerRef.close(() => resolve()));
|
|
1485
|
-
wsServerRef = null;
|
|
1486
|
-
}
|
|
1487
|
-
}
|
|
1488
|
-
catch { /* ignore */ }
|
|
1585
|
+
export async function closeHttpServerGracefully(server) {
|
|
1489
1586
|
if (!server.listening)
|
|
1490
1587
|
return;
|
|
1491
1588
|
try {
|
|
1492
|
-
// Node 18.2+ APIs - break any keep-alive / long-poll stalls so
|
|
1493
|
-
// server.close() can actually resolve.
|
|
1494
1589
|
const s = server;
|
|
1495
1590
|
if (typeof s.closeIdleConnections === "function")
|
|
1496
1591
|
s.closeIdleConnections();
|
|
@@ -1499,12 +1594,47 @@ export async function stopWebServer(server) {
|
|
|
1499
1594
|
}
|
|
1500
1595
|
catch { /* ignore */ }
|
|
1501
1596
|
await new Promise((resolve) => {
|
|
1502
|
-
// close() callback fires with an Error arg when the server wasn't
|
|
1503
|
-
// listening - we just resolve in either case. The caller only cares
|
|
1504
|
-
// that the port is free when this awaits.
|
|
1505
1597
|
server.close(() => resolve());
|
|
1506
1598
|
});
|
|
1507
1599
|
}
|
|
1600
|
+
/**
|
|
1601
|
+
* Stop the web server: cancel any pending background-retry, close
|
|
1602
|
+
* WebSocket clients, then gracefully close the HTTP server.
|
|
1603
|
+
*
|
|
1604
|
+
* Idempotent - safe to call multiple times, and safe to call before
|
|
1605
|
+
* startWebServer() ever successfully bound. Never throws.
|
|
1606
|
+
*/
|
|
1607
|
+
export async function stopWebServer() {
|
|
1608
|
+
stopRequested = true;
|
|
1609
|
+
// Cancel any pending background-retry timer so a late retry doesn't
|
|
1610
|
+
// grab the port AFTER we thought we'd shut everything down.
|
|
1611
|
+
if (bindRetryTimer) {
|
|
1612
|
+
clearTimeout(bindRetryTimer);
|
|
1613
|
+
bindRetryTimer = null;
|
|
1614
|
+
}
|
|
1615
|
+
// Tear down the WebSocket server first so its sockets can't keep
|
|
1616
|
+
// the underlying http.Server alive.
|
|
1617
|
+
if (wsServerRef) {
|
|
1618
|
+
try {
|
|
1619
|
+
for (const client of wsServerRef.clients) {
|
|
1620
|
+
try {
|
|
1621
|
+
client.terminate();
|
|
1622
|
+
}
|
|
1623
|
+
catch { /* ignore */ }
|
|
1624
|
+
}
|
|
1625
|
+
await new Promise((resolve) => wsServerRef.close(() => resolve()));
|
|
1626
|
+
}
|
|
1627
|
+
catch { /* ignore */ }
|
|
1628
|
+
wsServerRef = null;
|
|
1629
|
+
}
|
|
1630
|
+
if (currentServer) {
|
|
1631
|
+
try {
|
|
1632
|
+
await closeHttpServerGracefully(currentServer);
|
|
1633
|
+
}
|
|
1634
|
+
catch { /* ignore */ }
|
|
1635
|
+
currentServer = null;
|
|
1636
|
+
}
|
|
1637
|
+
}
|
|
1508
1638
|
/** Get the actual port the Web UI is running on. */
|
|
1509
1639
|
export function getWebPort() {
|
|
1510
1640
|
return actualWebPort;
|
package/package.json
CHANGED
|
package/test/stress-scenarios.test.ts
@@ -19,7 +19,7 @@
|
|
|
19
19
|
*/
|
|
20
20
|
import { describe, it, expect, beforeEach, vi } from "vitest";
|
|
21
21
|
import http from "http";
|
|
22
|
-
import { stopWebServer } from "../src/web/server.js";
|
|
22
|
+
import { closeHttpServerGracefully as stopWebServer } from "../src/web/server.js";
|
|
23
23
|
import {
|
|
24
24
|
handleStartupCatchup,
|
|
25
25
|
prepareForExecution,
|
|
package/test/web-server-integration.test.ts
@@ -0,0 +1,189 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Fix #16 (integration) - end-to-end tests for the decoupled
|
|
3
|
+
* startWebServer + stopWebServer pair.
|
|
4
|
+
*
|
|
5
|
+
* These tests exercise the ACTUAL http.Server binding, not the pure
|
|
6
|
+
* decision helper. They rely on:
|
|
7
|
+
* - process.env.WEB_PORT to keep the test off the running bot's 3100
|
|
8
|
+
* - process.env.ALVIN_DATA_DIR to keep touch-points away from
|
|
9
|
+
* the maintainer's real ~/.alvin-bot/.env
|
|
10
|
+
*
|
|
11
|
+
* What's covered here:
|
|
12
|
+
* 1. startWebServer() returns synchronously (void) without throwing
|
|
13
|
+
* 2. stopWebServer() releases the port so another server can bind
|
|
14
|
+
* 3. Start → stop → start cycle doesn't leak sockets or timers
|
|
15
|
+
* 4. If the configured port is already busy, startWebServer still
|
|
16
|
+
* returns cleanly (no throw); the bot keeps running.
|
|
17
|
+
* 5. stopWebServer() is idempotent - safe to call twice in a row
|
|
18
|
+
* and safe to call before startWebServer ever succeeded.
|
|
19
|
+
*
|
|
20
|
+
* The deliberate EADDRINUSE scenario is tested HERE against a real
|
|
21
|
+
* running hog โ no mocking.
|
|
22
|
+
*/
|
|
23
|
+
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
|
|
24
|
+
import http from "http";
|
|
25
|
+
import fs from "fs";
|
|
26
|
+
import os from "os";
|
|
27
|
+
import { resolve } from "path";
|
|
28
|
+
|
|
29
|
+
const TEST_DATA_DIR = resolve(os.tmpdir(), `alvin-bot-web-int-${process.pid}-${Date.now()}`);
|
|
30
|
+
|
|
31
|
+
function getFreePort(): Promise<number> {
|
|
32
|
+
return new Promise((resolve, reject) => {
|
|
33
|
+
const s = http.createServer();
|
|
34
|
+
s.listen(0, () => {
|
|
35
|
+
const addr = s.address();
|
|
36
|
+
if (typeof addr === "object" && addr) {
|
|
37
|
+
const p = addr.port;
|
|
38
|
+
s.close(() => resolve(p));
|
|
39
|
+
} else {
|
|
40
|
+
reject(new Error("no address"));
|
|
41
|
+
}
|
|
42
|
+
});
|
|
43
|
+
});
|
|
44
|
+
}
|
|
45
|
+
|
|
46
|
+
async function waitForPortBound(port: number, timeoutMs = 3000): Promise<boolean> {
|
|
47
|
+
const deadline = Date.now() + timeoutMs;
|
|
48
|
+
while (Date.now() < deadline) {
|
|
49
|
+
try {
|
|
50
|
+
const code = await new Promise<number>((resolveCode, reject) => {
|
|
51
|
+
const req = http.get(`http://127.0.0.1:${port}/`, (res) => {
|
|
52
|
+
res.resume();
|
|
53
|
+
resolveCode(res.statusCode ?? 0);
|
|
54
|
+
});
|
|
55
|
+
req.on("error", (err) => reject(err));
|
|
56
|
+
req.setTimeout(500, () => {
|
|
57
|
+
req.destroy(new Error("timeout"));
|
|
58
|
+
});
|
|
59
|
+
});
|
|
60
|
+
if (code > 0) return true;
|
|
61
|
+
} catch {
|
|
62
|
+
/* not yet */
|
|
63
|
+
}
|
|
64
|
+
await new Promise((r) => setTimeout(r, 100));
|
|
65
|
+
}
|
|
66
|
+
return false;
|
|
67
|
+
}
|
|
68
|
+
+beforeEach(async () => {
+  if (fs.existsSync(TEST_DATA_DIR)) fs.rmSync(TEST_DATA_DIR, { recursive: true, force: true });
+  fs.mkdirSync(TEST_DATA_DIR, { recursive: true });
+  process.env.ALVIN_DATA_DIR = TEST_DATA_DIR;
+  // Write a minimal .env so config.ts loads cleanly
+  fs.writeFileSync(`${TEST_DATA_DIR}/.env`, "WEB_PASSWORD=\n", "utf-8");
+  process.env.WEB_PORT = String(await getFreePort());
+  // Reset module cache so each test imports server.js fresh and
+  // picks up the new WEB_PORT env var at module-load time.
+  vi.resetModules();
+});
+
+afterEach(async () => {
+  // Best-effort: stop whatever is running in the current module instance
+  try {
+    const { stopWebServer } = await import("../src/web/server.js");
+    await stopWebServer();
+  } catch {
+    /* ignore */
+  }
+  // Give the OS a moment to release ports before the next test
+  await new Promise((r) => setTimeout(r, 50));
+});
+
+describe("startWebServer / stopWebServer integration (Fix #16)", () => {
+  it("startWebServer returns void synchronously without throwing", async () => {
+    const { startWebServer } = await import("../src/web/server.js");
+    const result = startWebServer();
+    // Must return void (undefined). If it returned a Server instance
+    // the old API is still in place.
+    expect(result).toBeUndefined();
+  });
+
+  it("actually binds the web server and serves HTTP", async () => {
+    const port = Number(process.env.WEB_PORT);
+    const { startWebServer } = await import("../src/web/server.js");
+    startWebServer();
+    const up = await waitForPortBound(port, 3000);
+    expect(up).toBe(true);
+  });
+
+  it("stopWebServer releases the port", async () => {
+    const port = Number(process.env.WEB_PORT);
+    const { startWebServer, stopWebServer } = await import("../src/web/server.js");
+    startWebServer();
+    expect(await waitForPortBound(port, 3000)).toBe(true);
+    await stopWebServer();
+
+    // Port should now be free: a fresh bind must succeed
+    const reuse = http.createServer();
+    await new Promise<void>((resolve, reject) => {
+      reuse.once("error", reject);
+      reuse.listen(port, () => resolve());
+    });
+    await new Promise<void>((r) => reuse.close(() => r()));
+  });
+
+  it("stopWebServer is idempotent - safe to call multiple times", async () => {
+    const { startWebServer, stopWebServer } = await import("../src/web/server.js");
+    startWebServer();
+    await new Promise((r) => setTimeout(r, 200));
+    await stopWebServer();
+    // Second call must not throw
+    await expect(stopWebServer()).resolves.toBeUndefined();
+    // Third call must also not throw
+    await expect(stopWebServer()).resolves.toBeUndefined();
+  });
+
+  it("stopWebServer is safe to call before startWebServer ever bound", async () => {
+    const { stopWebServer } = await import("../src/web/server.js");
+    // Module just imported, nothing started yet
+    await expect(stopWebServer()).resolves.toBeUndefined();
+  });
+
+  it("when the primary port is taken, startWebServer still returns cleanly (climbs the ladder)", async () => {
+    const originalPort = Number(process.env.WEB_PORT);
+    // Plant a hog on the primary port BEFORE startWebServer
+    const hog = http.createServer();
+    await new Promise<void>((r) => hog.listen(originalPort, () => r()));
+
+    try {
+      const { startWebServer } = await import("../src/web/server.js");
+      // Must NOT throw even though the port is occupied
+      expect(() => startWebServer()).not.toThrow();
+
+      // The bot should have climbed the ladder: one port higher should
+      // now be serving HTTP.
+      const climbed = await waitForPortBound(originalPort + 1, 3000);
+      expect(climbed).toBe(true);
+    } finally {
+      await new Promise<void>((r) => hog.close(() => r()));
+    }
+  });
+
+  it("closeHttpServerGracefully closes a server that's holding an open socket", async () => {
+    const { closeHttpServerGracefully } = await import("../src/web/server.js");
+    const port = await getFreePort();
+    const server = http.createServer((_req, res) => {
+      res.writeHead(200, { "Content-Type": "text/plain" });
+      res.write("chunk");
+      // never res.end, so the client hangs forever
+    });
+    await new Promise<void>((r) => server.listen(port, () => r()));
+
+    const req = http.get(`http://127.0.0.1:${port}/hang`);
+    req.on("error", () => { /* expected */ });
+    await new Promise((r) => setTimeout(r, 100));
+
+    const t0 = Date.now();
+    await closeHttpServerGracefully(server);
+    expect(Date.now() - t0).toBeLessThan(2000);
+
+    // Port is reusable
+    const reuse = http.createServer();
+    await new Promise<void>((resolve, reject) => {
+      reuse.once("error", reject);
+      reuse.listen(port, () => resolve());
+    });
+    await new Promise<void>((r) => reuse.close(() => r()));
+  });
+});
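The last integration test expects `closeHttpServerGracefully` to finish within two seconds even though a client still holds an open socket. The package's implementation in `server.js` is not part of this hunk, so the following is only a minimal sketch of one way such a helper could behave. The name `closeServerSketch` and the `forceAfterMs` parameter are hypothetical, and the sketch assumes Node.js 18.2 or later for `closeIdleConnections()` and `closeAllConnections()`.

```typescript
import * as http from "node:http";

// Hypothetical helper for illustration; NOT the package's closeHttpServerGracefully.
export function closeServerSketch(
  server: http.Server,
  forceAfterMs = 500, // assumed grace period before open sockets are dropped
): Promise<void> {
  return new Promise<void>((resolve) => {
    // Idempotent: a server that never bound or is already closing resolves at once.
    if (!server.listening) {
      resolve();
      return;
    }
    // After the grace period, destroy connections that are still open
    // (for example a response that never called res.end()).
    const timer = setTimeout(() => server.closeAllConnections(), forceAfterMs);
    // close() stops accepting new connections; the callback runs once every
    // existing connection has ended. Any error is ignored so this never rejects.
    server.close(() => {
      clearTimeout(timer);
      resolve();
    });
    // Keep-alive sockets with no request in flight can go immediately.
    server.closeIdleConnections();
  });
}
```

Under these assumptions a hanging client delays shutdown by at most roughly `forceAfterMs`, which would fit the two-second bound in the test. Note that `closeAllConnections()` does not destroy sockets upgraded to another protocol such as WebSocket, so a real helper may need extra bookkeeping for those.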
@@ -0,0 +1,118 @@
+/**
+ * Fix #16 - Web server must never crash the bot.
+ *
+ * Colleague feedback (WhatsApp voice note, 2026-04-13):
+ * > The gateway binds to port 3100 like OpenClaw. When the bot
+ * > restarts, the port is often still held - catastrophic crash.
+ * > I ended up decoupling the gateway process completely, because
+ * > the actual bot runs independently of the gateway - it can still
+ * > answer Telegram even if the web endpoint isn't reachable yet.
+ * > It's weird that the main routine crashes when the port is busy.
+ * > It should just run in the background, watch for the port to
+ * > become free, and connect then. Zero impact on the main routine.
+ *
+ * This file tests the pure decision helper that the new startWebServer
+ * uses to choose between "try the next port immediately" and "retry
+ * the default port in the background after a delay".
+ *
+ * Contract:
+ *   decideNextBindAction(err, attempt, opts)
+ *
+ *   err.code = "EADDRINUSE", attempt + 1 < maxPortTries
+ *     → { type: "retry-port", port: opts.originalPort + attempt + 1, attempt: attempt + 1 }
+ *
+ *   err.code = "EADDRINUSE", attempt + 1 >= maxPortTries
+ *     → { type: "retry-background", delayMs: opts.backgroundRetryMs, port: opts.originalPort }
+ *
+ *   err.code = anything else (EACCES, ECONNRESET, "Listen method called twice"…)
+ *     → { type: "retry-background", delayMs: opts.backgroundRetryMs, port: opts.originalPort }
+ *
+ * Pure function, no side effects, no timers, no I/O.
+ */
+import { describe, it, expect } from "vitest";
+import { decideNextBindAction } from "../src/web/bind-strategy.js";
+
+const defaultOpts = {
+  originalPort: 3100,
+  maxPortTries: 20,
+  backgroundRetryMs: 30_000,
+};
+
+describe("decideNextBindAction (Fix #16)", () => {
+  it("retries on the next port when EADDRINUSE and attempts remain", () => {
+    const err = Object.assign(new Error("EADDRINUSE"), { code: "EADDRINUSE" });
+    const result = decideNextBindAction(err, 0, defaultOpts);
+    expect(result).toEqual({ type: "retry-port", port: 3101, attempt: 1 });
+  });
+
+  it("walks the port ladder across multiple attempts", () => {
+    const err = Object.assign(new Error("EADDRINUSE"), { code: "EADDRINUSE" });
+    expect(decideNextBindAction(err, 5, defaultOpts)).toEqual({
+      type: "retry-port",
+      port: 3106,
+      attempt: 6,
+    });
+    expect(decideNextBindAction(err, 18, defaultOpts)).toEqual({
+      type: "retry-port",
+      port: 3119,
+      attempt: 19,
+    });
+  });
+
+  it("switches to background retry when all port attempts are exhausted", () => {
+    const err = Object.assign(new Error("EADDRINUSE"), { code: "EADDRINUSE" });
+    const result = decideNextBindAction(err, 19, defaultOpts); // 20th failure
+    expect(result).toEqual({
+      type: "retry-background",
+      delayMs: 30_000,
+      port: 3100,
+    });
+  });
+
+  it("goes straight to background retry on non-EADDRINUSE errors", () => {
+    const err = Object.assign(new Error("EACCES"), { code: "EACCES" });
+    const result = decideNextBindAction(err, 0, defaultOpts);
+    expect(result).toEqual({
+      type: "retry-background",
+      delayMs: 30_000,
+      port: 3100,
+    });
+  });
+
+  it("handles errors without a .code field by doing background retry", () => {
+    const err = new Error("Listen method has been called more than once");
+    const result = decideNextBindAction(err, 3, defaultOpts);
+    expect(result.type).toBe("retry-background");
+    if (result.type === "retry-background") {
+      expect(result.port).toBe(3100);
+    }
+  });
+
+  it("respects custom maxPortTries", () => {
+    const err = Object.assign(new Error("EADDRINUSE"), { code: "EADDRINUSE" });
+    const opts = { ...defaultOpts, maxPortTries: 3 };
+    // attempts 0 and 1 still retry; attempt 2 (the 3rd failure) -> background
+    expect(decideNextBindAction(err, 0, opts).type).toBe("retry-port");
+    expect(decideNextBindAction(err, 1, opts).type).toBe("retry-port");
+    expect(decideNextBindAction(err, 2, opts).type).toBe("retry-background");
+  });
+
+  it("respects custom backgroundRetryMs", () => {
+    const err = Object.assign(new Error("EACCES"), { code: "EACCES" });
+    const opts = { ...defaultOpts, backgroundRetryMs: 5_000 };
+    const result = decideNextBindAction(err, 0, opts);
+    expect(result).toEqual({
+      type: "retry-background",
+      delayMs: 5_000,
+      port: 3100,
+    });
+  });
+
+  it("is pure - same input, same output, no mutation", () => {
+    const err = Object.assign(new Error("EADDRINUSE"), { code: "EADDRINUSE" });
+    const snapshot = JSON.stringify({ ...defaultOpts });
+    decideNextBindAction(err, 5, defaultOpts);
+    decideNextBindAction(err, 5, defaultOpts);
+    expect(JSON.stringify({ ...defaultOpts })).toBe(snapshot);
+  });
+});
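The tests above pin down the decision helper's behaviour closely enough to sketch a function that would satisfy them. The real `bind-strategy` source is not shown in this hunk, so the names `decideNextBindActionSketch`, `BindOptions` and `BindAction` are hypothetical. The `attempt + 1 < maxPortTries` threshold is inferred from the expectations (attempt 18 of 20 still climbs, attempt 19 falls back to the background) and is not taken from the implementation.

```typescript
// Hypothetical sketch inferred from the tests; not the package's bind-strategy code.
export interface BindOptions {
  originalPort: number;
  maxPortTries: number;
  backgroundRetryMs: number;
}

export type BindAction =
  | { type: "retry-port"; port: number; attempt: number }
  | { type: "retry-background"; delayMs: number; port: number };

export function decideNextBindActionSketch(
  err: unknown,
  attempt: number,
  opts: BindOptions,
): BindAction {
  // Errors without a .code field (plain Error objects) fall through to background retry.
  const code =
    typeof err === "object" && err !== null
      ? (err as { code?: unknown }).code
      : undefined;
  const nextAttempt = attempt + 1;

  // Port busy and ladder not exhausted: try the next port up right away.
  if (code === "EADDRINUSE" && nextAttempt < opts.maxPortTries) {
    return {
      type: "retry-port",
      port: opts.originalPort + nextAttempt,
      attempt: nextAttempt,
    };
  }

  // Ladder exhausted, or any other failure: retry the original port later.
  return {
    type: "retry-background",
    delayMs: opts.backgroundRetryMs,
    port: opts.originalPort,
  };
}
```

Because the function only maps inputs to a plain action object, the timers and `listen()` calls stay in the caller, which is presumably what lets the test file exercise it without opening any sockets.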
@@ -17,7 +17,13 @@
 import { describe, it, expect } from "vitest";
 import http from "http";
 import { once } from "events";
-
+// Fix #1 shipped as stopWebServer(server). Fix #16 (v4.9.4) promoted
+// that to `closeHttpServerGracefully(server)` and reserved the name
+// `stopWebServer()` for the module-state-aware shutdown. The underlying
+// contract (close an http.Server even when clients hold open sockets,
+// release the port, idempotent, never throw) is unchanged; these
+// tests now exercise the renamed helper.
+import { closeHttpServerGracefully as stopWebServer } from "../src/web/server.js";
 
 function getFreePort(): Promise<number> {
   return new Promise((resolve, reject) => {
|