alvin-bot 4.8.9 → 4.9.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +61 -0
- package/dist/handlers/commands.js +18 -7
- package/dist/handlers/message.js +5 -2
- package/dist/index.js +14 -10
- package/dist/platforms/whatsapp-auth-helpers.js +53 -0
- package/dist/platforms/whatsapp.js +6 -2
- package/dist/services/browser-manager.js +82 -10
- package/dist/services/browser-webfetch.js +93 -0
- package/dist/services/cron-resolver.js +58 -0
- package/dist/services/cron-scheduling.js +142 -0
- package/dist/services/cron.js +70 -10
- package/dist/services/skills.js +15 -11
- package/dist/services/subagent-delivery.js +8 -2
- package/dist/services/subagents.js +49 -8
- package/dist/services/telegram.js +12 -3
- package/dist/services/watchdog-brake.js +113 -0
- package/dist/services/watchdog.js +56 -42
- package/dist/util/console-formatter.js +109 -0
- package/dist/util/debounce.js +24 -0
- package/dist/util/telegram-error-filter.js +62 -0
- package/dist/web/server.js +56 -0
- package/package.json +1 -1
- package/test/browser-webfetch.test.ts +121 -0
- package/test/console-timestamps.test.ts +98 -0
- package/test/cron-restart-resilience.test.ts +191 -0
- package/test/cron-run-resolver.test.ts +133 -0
- package/test/debounce.test.ts +60 -0
- package/test/subagent-final-text.test.ts +132 -0
- package/test/telegram-error-filter.test.ts +85 -0
- package/test/watchdog-brake.test.ts +157 -0
- package/test/web-server-shutdown.test.ts +111 -0
- package/test/whatsapp-auth-resilience.test.ts +96 -0
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,67 @@
 
 All notable changes to Alvin Bot are documented here.
 
+## [4.9.1] - 2026-04-11
+
+### `/cron run <name>` accepts the job name, not just the opaque ID
+
+Reported via screenshot: `/cron run Daily Job Alert` replied with `❌ Job not found.` because `runJobNow(id)` only matched against `job.id` - the random base-36 string (`mn90rrsndzto`) that nobody types. Worse, when Claude tried to trigger the same job through a natural-language request in an earlier session, it retried with different variants until one happened to succeed - and the absence of a re-entry guard in `runJobNow` meant the retries sometimes spawned a second parallel sub-agent, producing the "ups… wurde doppelt ausgeführt" message.
+
+**Fix - pure resolver + guard, wired into the public API:**
+
+- **`src/services/cron-resolver.ts` (new).** Two pure helpers:
+  - `resolveJobByNameOrId(jobs, query)` - priority: exact ID > exact name > unique case-insensitive name > `null` on miss/ambiguous.
+  - `runJobNowGuard(id, isRunning, run)` - higher-order re-entry check, testable without the scheduler loop.
+- **`src/services/cron.ts` runJobNow**. Now returns a typed outcome (`not-found` | `already-running` | `ran`), consults the `runningJobs` set (previously only the scheduler loop did), and, when it actually runs, persists `lastAttemptAt` / `lastRunAt` / `runCount` / `lastResult` / `lastError` exactly like the scheduler path, so manual triggers show up in the timeline instead of vanishing.
+- **`src/handlers/commands.ts /cron run`**. Matches against name OR ID, prints a helpful "Available:" list on miss, and announces the already-running case instead of silently double-firing.
+- **10 new tests** (`test/cron-run-resolver.test.ts`) covering exact ID, exact name, case-insensitive, trimmed input, miss, ambiguity, ID-over-name preference, and both guard branches. **164 tests total.**
+
+**What this also quietly fixes:** natural-language triggers ("Alvin, run the daily job alert"). When Claude invokes `/cron run Daily Job Alert` via its own turn, the command now succeeds on the first try - no retry cascade, no double execution.
+
+## [4.9.0] - 2026-04-11
+
+### 🛡 Stability batch: crash-loop eliminated, cron jobs restart-resistant, cleaner logs
+
+Production users reported a daily-job-alert that "kept crashing" - the cron job triggered at 08:00, died mid-execution, and the next scheduled run silently disappeared until the next day. Root cause was not a single bug but a chain of four: the HTTP Web UI never released its port on shutdown → `EADDRINUSE :::3100` uncaught crash-loop → the cron scheduler persisted `nextRunAt = null` pre-execution → restart rewrote it to "tomorrow 08:00" → the run was lost. In parallel, sub-agents that ended on a tool call reported "completed" with only the pre-tool text as output, and grammy's "message is not modified" races leaked into Telegram replies as `Fehler: Call to 'editMessageText' failed!`.
+
+This release closes the whole chain, adds the Tier 0 of the browser fallback, and installs timestamped logs so future forensics don't need timestamp-free grep archaeology.
+
+**Pure functions extracted for isolated testing** (36 new tests, 154 total):
+
+- `src/services/cron-scheduling.ts` - `prepareForExecution(job, now)` and `handleStartupCatchup(jobs, now, graceMs)`. The old scheduler set `nextRunAt = null` before `await executeJob(job)` and only recomputed after completion. A crash mid-execution left `nextRunAt = null`; the next boot recomputed it from the current time → always landed on tomorrow's trigger. Now `prepareForExecution` persists the NEXT regular trigger BEFORE running, and stamps `lastAttemptAt`. If `lastAttemptAt > lastRunAt` at boot and the attempt is ≤ 6 h old, `handleStartupCatchup` rewinds `nextRunAt` to `now` so the next tick picks it up. New `CronJob.lastAttemptAt` field (`number | null`).
+- `src/services/watchdog-brake.ts` - `decideBrakeAction(prev, now, opts)` and `shouldResetCrashCounter(uptimeMs, opts)`. The old brake reset `crashCount` after 5 minutes of clean uptime, which was shorter than the typical sub-agent lifetime - chronic crashes with 5-10 min gaps passed the brake indefinitely. New policy: **1 h clean uptime required for reset**, plus a hard **20 crashes / 24 h** daily cap alongside the existing 10 crashes / 10 min short-window cap. Both counters persist in the beacon.
+- `src/util/debounce.ts` - trailing-edge debounce for fs.watch coalescing.
+- `src/util/console-formatter.ts` - `installConsoleFormatter()`: monkey-patches `console.log/warn/info/error` to prefix every line with an ISO timestamp, and drops libsignal "Closing session" multi-line SessionEntry dumps + `[claude] Native binary` spam that were pushing tens of KB per day into `alvin-bot.out.log` / `alvin-bot.err.log`.
+- `src/util/telegram-error-filter.ts` - `isHarmlessTelegramError(err)`: single source of truth for benign grammy races (`message is not modified`, `query is too old`, `message to edit not found`, `MESSAGE_ID_INVALID`, …).
+- `src/services/browser-webfetch.ts` - `webfetchNavigate(url, opts)` + `parseTitle(html)` + `WebfetchFailed`: Tier 0 of the browser fallback chain. Plain `fetch()` instead of Playwright for static pages.
+- `src/platforms/whatsapp-auth-helpers.ts` - `makeResilientSaveCreds(authDir, inner)`: wraps baileys' `saveCreds` so an ENOENT from a vanished auth dir transparently recreates the directory and retries once.
+
+**Fixes wired into the existing modules:**
+
+- **`src/web/server.ts` - new `stopWebServer(server)`.** Closes WebSocket clients, calls `closeIdleConnections()` + `closeAllConnections()` (Node 18.2+) so long-poll clients can't stall the shutdown, then awaits `server.close()`. Called from `shutdown()` in `src/index.ts`. Before this fix, launchd restarted the bot → new process tried `server.listen(3100)` → `EADDRINUSE` → uncaught exception → exit → launchd again. Classic crash-loop. **This single fix stops the chain.**
+- **`src/services/cron.ts`** - scheduler rewired to call `prepareForExecution` pre-execution and `handleStartupCatchup` at boot. `lastResult` truncation bumped from 500 → 4000 chars so post-mortem is possible without running the job again.
+- **`src/services/watchdog.ts`** - beacon schema extended with `dailyCrashCount` + `dailyCrashWindowStart`; `startWatchdog` now delegates the brake decision to the pure `decideBrakeAction`. Recovery timer still fires, but only resets the counter if `shouldResetCrashCounter` agrees (≥ 1 h uptime).
+- **`src/services/subagents.ts`** - `runSubAgent` now reads `finalText` from the `done` chunk as the authoritative final output (was ignored before), preserves buffered text when the stream emits an `error` chunk, and, most importantly, keeps `finalText` when the catch handler fires (was `output: ""`, throwing away multi-minute runs). Variable scope moved outside the try block. New `error` status branch for mid-stream provider failures.
+- **`src/services/subagent-delivery.ts`** - `buildBanner` now renders `⚠️ completed · empty output` for the "successful run with zero text" case so truncated runs are immediately visible instead of hiding behind a green tick.
+- **`src/services/skills.ts`** - `fs.watch` callbacks wrapped in `debounce(…, 300)` so macOS FSEvents duplicates coalesce into one reload.
+- **`src/services/browser-manager.ts`** - new `webfetch` tier added as default for non-interactive tasks. `resolveStrategy` cascade is now `webfetch → hub-stealth → cdp → gateway → cli`. `navigate()` has an error-based fallback: if `webfetch` throws (403, 5xx, content-type mismatch), it transparently upgrades to `hub-stealth` then `cli` before giving up.
+- **`src/platforms/whatsapp.ts`** - `saveCreds` wrapped in `makeResilientSaveCreds` so a vanished auth dir self-heals instead of becoming an unhandled rejection.
+- **`src/handlers/message.ts`, `src/services/telegram.ts`, `src/index.ts` (bot.catch + streaming finalize)** - all three call sites that used to ship the raw grammy error to users now route through `isHarmlessTelegramError`. The `Fehler: Call to 'editMessageText' failed!` noise that 2-3 users per day were seeing is gone.
+
+**What is NOT changed:**
+
+- **Timeouts.** The v4.8.8 `defaultTimeoutMs = -1` (unlimited) behavior is preserved. Sub-agents and cron jobs can still run as long as they need.
+- **The cron job `payload.prompt`s.** Users' existing cron definitions keep working unchanged.
+- **The beacon file format back-compat.** Old beacons without the daily counters are read correctly; the new fields are seeded to 0/now on first boot.
+
+**How to verify after update:**
+
+1. `launchctl unload ~/Library/LaunchAgents/com.alvinbot.app.plist && launchctl load ~/Library/LaunchAgents/com.alvinbot.app.plist`
+2. Tail `~/.alvin-bot/logs/alvin-bot.out.log` - every line should now carry an ISO timestamp and libsignal SessionEntry dumps should be gone.
+3. Check `~/.alvin-bot/state/watchdog.json` - should contain `dailyCrashCount` / `dailyCrashWindowStart` within a minute.
+4. Send `/cron run Daily Job Alert` - subagent-delivery banner should render fully, `~/.alvin-bot/cron-jobs.json` should show `lastAttemptAt` and a post-execution `lastRunAt`.
+5. Trigger a deliberate edit race (double-tap an inline button quickly) - no `Fehler: Call to 'editMessageText' failed!` reply should land in the chat.
+
 ## [4.8.9] - 2026-04-11
 
 ### Browser automation: dead `browse-server.cjs` path removed, 3-tier router now the source of truth
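The restart-resilience rule from the 4.9.0 entry can be illustrated with a minimal sketch. The function names follow the changelog, but the bodies are assumptions, because `cron-scheduling.js` itself is not shown in this part of the diff. In particular, the `computeNext` callback is hypothetical and stands in for whatever schedule parser the package really uses.

```javascript
// Hypothetical sketch of the crash-safe scheduling rule; not the package's code.
const SIX_HOURS_MS = 6 * 60 * 60 * 1000;

// Persist the NEXT regular trigger and stamp the attempt BEFORE running,
// so a crash mid-run never leaves nextRunAt empty.
// `computeNext(job, now)` is an assumed, caller-supplied schedule function.
function prepareForExecution(job, now, computeNext) {
  return { ...job, nextRunAt: computeNext(job, now), lastAttemptAt: now };
}

// At boot: a job whose last attempt is newer than its last completed run,
// and no older than the grace window, is rescheduled for "now".
function handleStartupCatchup(jobs, now, graceMs = SIX_HOURS_MS) {
  return jobs.map((job) => {
    const attempted = job.lastAttemptAt ?? null;
    if (attempted === null) return job;
    const interrupted = attempted > (job.lastRunAt ?? 0);
    const recent = now - attempted <= graceMs;
    return interrupted && recent ? { ...job, nextRunAt: now } : job;
  });
}
```

Both helpers return new objects rather than mutating their input, which is one way to keep them testable without a running scheduler.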
package/dist/handlers/commands.js
CHANGED
@@ -1441,17 +1441,28 @@ export function registerCommands(bot) {
             }
             return;
         }
-        // /cron run <id>
+        // /cron run <name-or-id>
         if (arg.startsWith("run ")) {
-            const
+            const nameOrId = arg.slice(4).trim();
             await ctx.api.sendChatAction(ctx.chat.id, "typing");
-            const
-            if (
-
+            const outcome = await runJobNow(nameOrId);
+            if (outcome.status === "not-found") {
+                const jobs = listJobs();
+                const hint = jobs.length > 0
+                    ? `\n\nAvailable:\n${jobs.slice(0, 10).map(j => `• ${j.name}`).join("\n")}`
+                    : "";
+                await ctx.reply(`❌ No job matches <code>${nameOrId}</code>.${hint}`, { parse_mode: "HTML" });
+                return;
+            }
+            if (outcome.status === "already-running") {
+                await ctx.reply(`⏳ Job "${outcome.job.name}" is already running - not starting a duplicate. ` +
+                    `Wait for the current run to finish, or /subagents cancel to abort it.`);
                 return;
             }
-            const output =
-
+            const output = outcome.output
+                ? `\`\`\`\n${outcome.output.slice(0, 2000)}\n\`\`\``
+                : "(no output)";
+            await ctx.reply(`🔧 Job "${outcome.job.name}" executed:\n${output}${outcome.error ? `\n\n⚠️ ${outcome.error}` : ""}`, { parse_mode: "Markdown" });
             return;
         }
         await ctx.reply("Unknown cron command. Use /cron for help.");
package/dist/handlers/message.js
CHANGED
@@ -14,6 +14,7 @@ import { emit } from "../services/hooks.js";
 import { trackUsage } from "../services/usage-tracker.js";
 import { emitUserMessage as broadcastUserMessage, emitResponseStart as broadcastResponseStart, emitResponseDelta as broadcastResponseDelta, emitResponseDone as broadcastResponseDone, } from "../services/broadcast.js";
 import { t } from "../i18n.js";
+import { isHarmlessTelegramError } from "../util/telegram-error-filter.js";
 /**
  * Stuck-only timeout - NO absolute cap.
  *
@@ -367,7 +368,7 @@ export async function handleMessage(ctx) {
                 if (timedOut) {
                     await ctx.reply(t("bot.error.timeoutStuck", session.language, { min: STUCK_TIMEOUT_MINUTES }));
                 }
-                else {
+                else if (!isHarmlessTelegramError(chunk.error)) {
                     await ctx.reply(`${t("bot.error.prefix", session.language)} ${chunk.error}`);
                 }
                 break;
@@ -419,7 +420,9 @@ export async function handleMessage(ctx) {
         else if (errorMsg.includes("abort")) {
             await ctx.reply(t("bot.error.requestCancelled", lang));
         }
-        else {
+        else if (!isHarmlessTelegramError(err)) {
+            // Drop benign grammy races ("message is not modified", etc.)
+            // instead of surfacing them as "Fehler: ..." replies.
             await ctx.reply(`${t("bot.error.prefix", lang)} ${errorMsg}`);
         }
     }
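The file `telegram-error-filter.js` is not part of this excerpt, so the following is only a sketch of what a filter like `isHarmlessTelegramError` might look like. The pattern list is taken from the changelog text; the way the message is extracted from the error (`description`, then `message`) is an assumption, chosen so that both the string passed in as `chunk.error` and the `Error` object passed in the catch branch are handled.

```javascript
// Hypothetical sketch of a benign-error filter; patterns come from the changelog.
const HARMLESS_PATTERNS = [
  /message is not modified/i,
  /query is too old/i,
  /message to edit not found/i,
  /MESSAGE_ID_INVALID/i,
];

function isHarmlessTelegramError(err) {
  if (err == null) return false;
  // Accept plain strings as well as Error-like objects.
  const text = typeof err === "string"
    ? err
    : String(err.description ?? err.message ?? err);
  return HARMLESS_PATTERNS.some((re) => re.test(text));
}
```

Keeping the patterns in one array means every call site shares the same definition of "harmless", which is the design goal the changelog describes.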
package/dist/index.js
CHANGED
@@ -1,6 +1,12 @@
 // ── Bootstrap: ensure ~/.alvin-bot/ exists + migrate legacy data ────
 import { ensureDataDirs, seedDefaults } from "./init-data-dir.js";
 import { hasLegacyData, migrateFromLegacy } from "./migrate.js";
+import { installConsoleFormatter } from "./util/console-formatter.js";
+import { isHarmlessTelegramError } from "./util/telegram-error-filter.js";
+// 0. Install timestamp + noise-filter formatters on console.* so every
+// line in out.log / err.log carries an ISO timestamp and libsignal's
+// SessionEntry dumps stop burying the signal.
+installConsoleFormatter();
 // 1. Create directory structure (no files yet)
 ensureDataDirs();
 // 2. Migrate legacy data BEFORE seeding defaults (so real data wins over templates)
@@ -70,7 +76,7 @@ import { handleVideo } from "./handlers/video.js";
 import { initEngine } from "./engine.js";
 import { loadPlugins, registerPluginCommands, unloadPlugins } from "./services/plugins.js";
 import { initMCP, disconnectMCP, hasMCPConfig } from "./services/mcp.js";
-import { startWebServer } from "./web/server.js";
+import { startWebServer, stopWebServer } from "./web/server.js";
 import { startScheduler, stopScheduler, setNotifyCallback } from "./services/cron.js";
 import { startSessionCleanup, stopSessionCleanup } from "./services/session.js";
 import { processQueue, cleanupQueue, setSenders, enqueue } from "./services/delivery-queue.js";
@@ -220,16 +226,11 @@ if (hasTelegram) {
     bot.catch((err) => {
         const ctx = err.ctx;
         const e = err.error;
-        //
-        //
-        //
-
-        // re-render). Swallow it silently so it neither pollutes the logs
-        // nor bubbles up to the user as "internal error".
-        const msg = e instanceof Error ? e.message : String(e);
-        if (/message is not modified/i.test(msg) || /specified new message content.*exactly the same/i.test(msg)) {
+        // Swallow the well-known harmless grammy races (message is not
+        // modified, query too old, message to edit not found …) silently.
+        // See src/util/telegram-error-filter.ts for the exhaustive list.
+        if (isHarmlessTelegramError(e))
             return;
-        }
         console.error(`Error handling update ${ctx?.update?.update_id}:`, e);
         // Try to notify the user
         if (ctx?.chat?.id) {
@@ -260,6 +261,9 @@ const shutdown = async () => {
     clearInterval(queueCleanupInterval);
     if (bot)
         bot.stop();
+    // Release :3100 so the next launchd boot doesn't hit EADDRINUSE.
+    // Must happen before exit - see src/web/server.ts stopWebServer() comment.
+    await stopWebServer(webServer).catch((err) => console.warn("[shutdown] stopWebServer failed:", err));
     await unloadPlugins().catch(() => { });
     await disconnectMCP().catch(() => { });
     // Tear down any bot-managed local runners (Ollama, LM Studio, …) so VRAM
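The `stopWebServer` implementation lives in `web/server.js`, which is outside this excerpt. As a rough illustration of the shutdown order the changelog describes (WebSocket clients, then `closeIdleConnections()` and `closeAllConnections()`, then `close()`), a sketch might look like the following. The `wsClients` parameter and its `terminate()` call are assumptions modelled on the `ws` library; the real function takes only `server`.

```javascript
// Hypothetical sketch of a port-releasing shutdown; not the package's code.
// `server` is expected to behave like a Node http.Server.
function stopWebServer(server, wsClients = []) {
  return new Promise((resolve, reject) => {
    // Nothing to do if the server never started listening.
    if (!server || !server.listening) return resolve();
    // 1. Drop WebSocket clients (assumed to expose terminate()).
    for (const client of wsClients) {
      try { client.terminate(); } catch { /* ignore a client that is already gone */ }
    }
    // 2. Force-close keep-alive and in-flight HTTP connections (Node 18.2+).
    server.closeIdleConnections?.();
    server.closeAllConnections?.();
    // 3. Stop listening; the callback fires once every connection has ended.
    server.close((err) => (err ? reject(err) : resolve()));
  });
}
```

Awaiting this promise before the process exits is what lets the next boot bind the same port without hitting `EADDRINUSE`.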
package/dist/platforms/whatsapp-auth-helpers.js
ADDED
@@ -0,0 +1,53 @@
+/**
+ * WhatsApp auth helpers - tiny resilience wrappers around baileys'
+ * use-multi-file-auth-state output.
+ *
+ * Why this exists: baileys' `saveCreds` is called asynchronously from
+ * the `creds.update` socket event, long after the auth directory was
+ * created at init time. If anything wipes the directory between init
+ * and the first save - a crash mid-init, a manual rm -rf, a stale
+ * worker on a different code path - the write throws ENOENT and becomes
+ * an `unhandledRejection`, which node 15+ default-reports as a crash.
+ *
+ * This module keeps the wrapper separate from `whatsapp.ts` so it can
+ * be unit-tested without having to drag baileys into the test process.
+ */
+import fs from "fs";
+/**
+ * Wrap a baileys saveCreds so a missing auth directory is transparently
+ * recreated once and the save is retried. Any other error, and any
+ * second ENOENT in a row, surfaces unchanged.
+ */
+export function makeResilientSaveCreds(authDir, innerSaveCreds) {
+    return async function resilientSaveCreds() {
+        try {
+            await innerSaveCreds();
+            return;
+        }
+        catch (err) {
+            if (!isEnoent(err))
+                throw err;
+            // baileys-auth dir vanished between init and now - rebuild and retry once.
+            try {
+                fs.mkdirSync(authDir, { recursive: true });
+            }
+            catch {
+                // If mkdir itself fails, fall through to the retry - it will surface
+                // the real error below with its original stack.
+            }
+            await innerSaveCreds();
+        }
+    };
+}
+function isEnoent(err) {
+    if (!err || typeof err !== "object")
+        return false;
+    const code = err.code;
+    if (code === "ENOENT")
+        return true;
+    // Some baileys wrapper paths re-throw as a plain Error with a message
+    // like "ENOENT: no such file or directory, open '.../creds.json'" but
+    // without .code - match the message as a fallback.
+    const msg = err.message || "";
+    return /ENOENT/.test(msg);
+}
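To make the retry-once behaviour of the wrapper above easier to see in isolation, here is a small variant in which the directory re-creation step is injected as a callback instead of calling `fs.mkdirSync` directly. The name `makeRetryOnceOnEnoent` and the `recreateDir` parameter are hypothetical and exist only for this illustration.

```javascript
// Illustrative variant of the ENOENT retry wrapper with the mkdir step injected.
function makeRetryOnceOnEnoent(recreateDir, inner) {
  return async function wrapped() {
    try {
      return await inner();
    } catch (err) {
      const isEnoent = err?.code === "ENOENT" || /ENOENT/.test(err?.message ?? "");
      if (!isEnoent) throw err;
      // Rebuild the directory; if that fails, the retry below surfaces the error.
      try { recreateDir(); } catch { /* intentionally ignored */ }
      return inner(); // a second ENOENT propagates to the caller unchanged
    }
  };
}
```

A caller can pass a counter-incrementing `recreateDir` and an `inner` that fails once to confirm that exactly one retry happens.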
package/dist/platforms/whatsapp.js
CHANGED
@@ -19,6 +19,7 @@
 import fs from "fs";
 import { dirname, join } from "path";
 import { WHATSAPP_AUTH as AUTH_DIR, WA_GROUPS as GROUP_CONFIG_FILE, WA_MEDIA_DIR } from "../paths.js";
+import { makeResilientSaveCreds } from "./whatsapp-auth-helpers.js";
 function loadGroupConfig() {
     try {
         return JSON.parse(fs.readFileSync(GROUP_CONFIG_FILE, "utf-8"));
@@ -250,8 +251,11 @@ export class WhatsAppAdapter {
             generateHighQualityLinkPreview: false,
         });
         this.sock = sock;
-        // Save credentials on update
-
+        // Save credentials on update. Wrapped so a vanished auth dir (crash
+        // mid-init, manual cleanup, etc.) doesn't turn the next creds.update
+        // into an unhandled ENOENT rejection.
+        const resilientSaveCreds = makeResilientSaveCreds(authDir, saveCreds);
+        sock.ev.on("creds.update", resilientSaveCreds);
         // Connection state
         sock.ev.on("connection.update", (update) => {
             const { connection, lastDisconnect, qr } = update;
package/dist/services/browser-manager.js
CHANGED
@@ -16,6 +16,7 @@ import fs from "fs";
 import { config } from "../config.js";
 import { BROWSE_SERVER_SCRIPT, HUB_BROWSER_SH } from "../paths.js";
 import { screenshotUrl, extractText, generatePdf } from "./browser.js";
+import { webfetchNavigate, WebfetchFailed } from "./browser-webfetch.js";
 const CDP_PORT = 9222;
 const EXEC_TIMEOUT = 60_000; // 60s for page loads via shell
 // ── Logging ──────────────────────────────────────────────────────────
@@ -53,30 +54,52 @@ async function isCDPAvailable() {
     });
 }
 // ── Strategy Selection with Fallback ─────────────────────────────────
-/** Pick the preferred strategy based on task type
+/** Pick the preferred strategy based on task type.
+ *
+ * Default for a one-shot read is `webfetch` - the cheapest tier. It
+ * only fails on JS-heavy or bot-guarded pages, and the cascade in
+ * resolveStrategy() handles the upgrade path automatically.
+ */
 export function selectStrategy(task = {}) {
     if (task.useUserBrowser || config.cdpUrl)
         return "cdp";
     if (task.interactive || task.multiStep)
         return "gateway";
-    return "
+    return "webfetch";
 }
 /**
  * Resolve the preferred strategy to one that's actually available.
- *
+ *
+ * Cascade order:
+ *   webfetch → hub-stealth → cdp → gateway → cli
+ *
+ * Rationale:
+ *   - `webfetch` is a plain HTTP GET - instant, zero footprint.
+ *   - `hub-stealth` (playwright+stealth) handles JS-rendered pages
+ *     without a persistent browser process.
+ *   - `cdp` brings cookies/auth for login-walled sites.
+ *   - `gateway` exposes the multi-step HTTP API (ref-based ops, long
+ *     sessions) when the browse-server.cjs helper is available.
+ *   - `cli` (raw Playwright) is the last-resort fallback.
  */
 export async function resolveStrategy(preferred) {
     const chain = [];
-    // Build fallback chain starting from preferred
+    // Build fallback chain starting from preferred. webfetch and
+    // hub-stealth are always available (no external state check), so
+    // they're included as floor entries. CDP/gateway only get in if the
+    // caller asked for them explicitly, since they need running daemons.
     switch (preferred) {
+        case "webfetch":
+            chain.push("webfetch", "hub-stealth", "cli");
+            break;
         case "gateway":
-            chain.push("gateway", "cdp", "hub-stealth", "cli");
+            chain.push("gateway", "cdp", "hub-stealth", "webfetch", "cli");
             break;
         case "cdp":
-            chain.push("cdp", "hub-stealth", "cli");
+            chain.push("cdp", "hub-stealth", "webfetch", "cli");
             break;
         case "hub-stealth":
-            chain.push("hub-stealth", "cli");
+            chain.push("hub-stealth", "webfetch", "cli");
             break;
         case "cli":
             chain.push("cli");
@@ -84,6 +107,11 @@ export async function resolveStrategy(preferred) {
     }
     for (const strategy of chain) {
         switch (strategy) {
+            case "webfetch":
+                // Native fetch is always present on Node ≥ 18 - no availability
+                // probe needed. Each call is self-contained, so we return the
+                // strategy tag and let navigate() handle per-call errors.
+                return "webfetch";
             case "gateway":
                 if (isGatewayScriptPresent() && (await isGatewayRunning()))
                     return "gateway";
@@ -202,11 +230,55 @@ async function ensureGateway() {
     return false;
 }
 // ── Unified Operations ───────────────────────────────────────────────
-/** Navigate to URL using best available strategy
+/** Navigate to URL using best available strategy.
+ *
+ * Error-based cascade: if the chosen tier throws, we walk DOWN the
+ * priority chain until one succeeds or we exhaust the list. This lets
+ * a 403 from webfetch transparently upgrade to hub-stealth without
+ * callers having to know about the fallback graph.
+ */
 export async function navigate(url, task = {}) {
-    const
-    log(`navigate(${url}) using strategy: ${
+    const primary = await resolveStrategy(selectStrategy(task));
+    log(`navigate(${url}) using strategy: ${primary}`);
+    // Try primary, then hub-stealth as a universal fallback. We keep the
+    // fallback list short here to avoid cascading timeouts - the full
+    // cascade is only for resolveStrategy's availability check.
+    const attempt = async (strategy) => {
+        return navigateOne(strategy, url);
+    };
+    try {
+        return await attempt(primary);
+    }
+    catch (err) {
+        log(`navigate(${url}) ${primary} failed: ${err.message}`);
+        if (primary === "webfetch") {
+            // Webfetch is the most common tier and the most common to hit a
+            // bot guard - cascade to hub-stealth explicitly, then cli.
+            try {
+                return await attempt("hub-stealth");
+            }
+            catch (err2) {
+                log(`navigate(${url}) hub-stealth fallback failed: ${err2.message}`);
+                return await attempt("cli");
+            }
+        }
+        throw err;
+    }
+}
+/** Single-strategy navigate - no fallback logic, just do the thing. */
+async function navigateOne(strategy, url) {
     switch (strategy) {
+        case "webfetch": {
+            try {
+                const r = await webfetchNavigate(url);
+                return { title: r.title, url: r.url };
+            }
+            catch (err) {
+                if (err instanceof WebfetchFailed)
+                    throw err;
+                throw new WebfetchFailed(url, err.message, { cause: err });
+            }
+        }
         case "gateway": {
             await ensureGateway();
             return gatewayRequest("/navigate", { url });
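The nested try/catch in `navigate()` hard-codes one fallback path (`webfetch`, then `hub-stealth`, then `cli`). The same idea can be expressed as a generic loop over an ordered list of tiers. The helper below is a hypothetical generalization for illustration only; `firstSuccessful` is not a function in the package.

```javascript
// Hypothetical generalization of the error-based cascade in navigate().
// Tries each strategy in order and returns the first successful result.
async function firstSuccessful(strategies, attempt, log = () => {}) {
  let lastError;
  for (const strategy of strategies) {
    try {
      return await attempt(strategy);
    } catch (err) {
      lastError = err;
      log(`${strategy} failed: ${err.message}`);
    }
  }
  // Every tier failed: surface the most recent error.
  throw lastError ?? new Error("no strategies supplied");
}
```

With this shape, the webfetch path would read `firstSuccessful(["webfetch", "hub-stealth", "cli"], (s) => navigateOne(s, url), log)`. The diff keeps the explicit form, which has the advantage that non-webfetch primaries fail fast instead of cascading through further timeouts.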
package/dist/services/browser-webfetch.js
ADDED
@@ -0,0 +1,93 @@
+/**
+ * WebFetch - Tier 0 of the browser fallback chain.
+ *
+ * For URLs that don't need JavaScript, cookies, or a real browser -
+ * RSS feeds, JSON APIs, static HTML, OG-tag sniffs - a plain `fetch()`
+ * is 100× faster than spinning up Playwright and never shows up in
+ * bot-detection traffic. When this tier fails (4xx, 5xx, JS-heavy
+ * page, certificate error), callers should catch `WebfetchFailed`
+ * and cascade to the next tier (hub-stealth → cdp → gateway).
+ *
+ * See browser-manager.ts for the full cascade; this module is the
+ * leaf-level primitive with no dependencies on that file so both can
+ * be unit-tested in isolation.
+ */
+const DEFAULT_TIMEOUT_MS = 15_000;
+const DEFAULT_USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_0) AppleWebKit/605.1.15 " +
+    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15 AlvinBot/webfetch";
+export class WebfetchFailed extends Error {
+    status;
+    url;
+    cause;
+    constructor(url, message, opts = {}) {
+        super(`webfetch(${url}): ${message}`);
+        this.name = "WebfetchFailed";
+        this.url = url;
+        this.status = opts.status;
+        this.cause = opts.cause;
+    }
+}
+const ENTITY_MAP = {
+    "&amp;": "&",
+    "&quot;": '"',
+    "&#39;": "'",
+    "&apos;": "'",
+    "&lt;": "<",
+    "&gt;": ">",
+    "&nbsp;": " ",
+};
+function decodeEntities(s) {
+    return s.replace(/&(amp|quot|#39|apos|lt|gt|nbsp);/gi, (m) => ENTITY_MAP[m.toLowerCase()] ?? m);
+}
+/**
+ * Return the contents of the first `<title>` tag, normalised:
+ * whitespace collapsed, common HTML entities decoded. If there's no
+ * `<title>` at all, returns the empty string - callers decide what to
+ * do with that (the URL is a reasonable default display value).
+ */
+export function parseTitle(html) {
+    const match = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
+    if (!match)
+        return "";
+    const inner = match[1].replace(/\s+/g, " ").trim();
+    return decodeEntities(inner);
+}
+export async function webfetchNavigate(url, options = {}) {
+    const timeoutMs = options.timeoutMs ?? DEFAULT_TIMEOUT_MS;
+    const controller = new AbortController();
+    const timer = setTimeout(() => controller.abort(), timeoutMs);
+    try {
+        let response;
+        try {
+            response = await fetch(url, {
+                method: "GET",
+                headers: {
+                    "User-Agent": options.userAgent ?? DEFAULT_USER_AGENT,
+                    Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
+                },
+                redirect: "follow",
+                signal: controller.signal,
+            });
+        }
+        catch (err) {
+            throw new WebfetchFailed(url, err.message, { cause: err });
+        }
+        if (!response.ok) {
+            throw new WebfetchFailed(url, `HTTP ${response.status}`, { status: response.status });
+        }
+        const contentType = response.headers.get("content-type") || "";
+        const isHtml = /text\/html|application\/xhtml\+xml/i.test(contentType);
+        if (options.forceHtml && !isHtml) {
+            throw new WebfetchFailed(url, `expected HTML, got ${contentType || "unknown"}`, { status: response.status });
+        }
+        const body = await response.text();
+        const title = parseTitle(body);
+        return {
+            title: title || url,
+            url,
+        };
+    }
+    finally {
+        clearTimeout(timer);
+    }
+}
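As a quick illustration of what `parseTitle` returns for typical input, the snippet below restates the two pure helpers from the file above (minus the `export` keyword so it can run standalone) and applies them to a small HTML string. The sample markup is made up for the example.

```javascript
// Standalone restatement of the title parser from the diff, for experimentation.
const ENTITY_MAP = {
  "&amp;": "&", "&quot;": '"', "&#39;": "'", "&apos;": "'",
  "&lt;": "<", "&gt;": ">", "&nbsp;": " ",
};

function decodeEntities(s) {
  return s.replace(/&(amp|quot|#39|apos|lt|gt|nbsp);/gi, (m) => ENTITY_MAP[m.toLowerCase()] ?? m);
}

function parseTitle(html) {
  const match = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  if (!match) return "";
  // Collapse runs of whitespace (including newlines) before decoding entities.
  return decodeEntities(match[1].replace(/\s+/g, " ").trim());
}

// Hypothetical sample page: title spans lines and contains two entities.
const sample = "<html><head><title>\n  Tom &amp; Jerry&#39;s\n  Page </title></head></html>";
console.log(parseTitle(sample)); // prints: Tom & Jerry's Page
```

Only the seven entities in the map are decoded; anything else, such as `&eacute;`, is left in the title as written.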
package/dist/services/cron-resolver.js
ADDED
@@ -0,0 +1,58 @@
+/**
+ * Pure cron-job name/ID resolver and re-entry guard.
+ *
+ * See test/cron-run-resolver.test.ts for the regressions this closes:
+ *   - `/cron run Daily Job Alert` returned "Job not found" because the
+ *     old runJobNow only matched on `job.id`. Real IDs are random
+ *     base-36 strings, nobody types those.
+ *   - Natural-language triggers double-ran jobs because runJobNow
+ *     didn't consult the `runningJobs` set.
+ *
+ * Both helpers are pure (or pure-over-callbacks) so they can be unit-
+ * tested without touching the filesystem or the scheduler loop.
+ */
+/**
+ * Resolve a user-facing query (name, case-insensitive name, or ID) to
+ * a specific job. Priority:
+ *   1. Exact ID match
+ *   2. Exact name match (case-sensitive)
+ *   3. Unique case-insensitive name match
+ *   4. null (miss or ambiguous)
+ *
+ * Trimmed whitespace on the query. Never mutates the input array.
+ */
+export function resolveJobByNameOrId(jobs, query) {
+    const q = query.trim();
+    if (!q)
+        return null;
+    // 1. Exact ID match
+    const byId = jobs.find((j) => j.id === q);
+    if (byId)
+        return byId;
+    // 2. Exact name match
+    const byExactName = jobs.find((j) => j.name === q);
+    if (byExactName)
+        return byExactName;
+    // 3. Unique case-insensitive name match
+    const qLower = q.toLowerCase();
+    const ciMatches = jobs.filter((j) => j.name.toLowerCase() === qLower);
+    if (ciMatches.length === 1)
+        return ciMatches[0];
+    // 4. Ambiguous or not found
+    return null;
+}
+/**
+ * Re-entry guard for runJobNow: only calls `run` when `isRunning`
+ * reports the job is idle. Otherwise reports back "already-running"
+ * so the caller can tell the user instead of silently double-firing.
+ *
+ * Kept as a higher-order function so the test doesn't need to stand
+ * up the whole cron loop - we mock the two callbacks.
+ */
+export async function runJobNowGuard(id, isRunning, run) {
+    if (isRunning(id)) {
+        return { status: "already-running" };
+    }
+    const result = await run(id);
+    return { status: "ran", output: result.output, error: result.error };
+}
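The changelog says `runJobNow` now returns a typed outcome (`not-found`, `already-running`, `ran`) by combining the resolver with a running-set check. The `cron.js` changes are not part of this excerpt, so the wiring below is a guess at how the pieces could fit together. `makeRunJobNow` and its injected `jobs`, `running`, and `execute` arguments are hypothetical, and the resolver is restated in compact form so the snippet runs on its own.

```javascript
// Compact restatement of the resolver's priority order from the diff.
function resolveJobByNameOrId(jobs, query) {
  const q = query.trim();
  if (!q) return null;
  const exact = jobs.find((j) => j.id === q) ?? jobs.find((j) => j.name === q);
  if (exact) return exact;
  const ci = jobs.filter((j) => j.name.toLowerCase() === q.toLowerCase());
  return ci.length === 1 ? ci[0] : null;
}

// Hypothetical wiring: state and the executor are injected so the function
// can be exercised without a scheduler. Not the package's actual runJobNow.
function makeRunJobNow(jobs, running, execute) {
  return async function runJobNow(nameOrId) {
    const job = resolveJobByNameOrId(jobs, nameOrId);
    if (!job) return { status: "not-found" };
    if (running.has(job.id)) return { status: "already-running", job };
    running.add(job.id); // mark before the first await to close the re-entry window
    try {
      const { output, error } = await execute(job);
      return { status: "ran", job, output, error };
    } finally {
      running.delete(job.id);
    }
  };
}
```

Marking the job as running synchronously, before any `await`, is what prevents two near-simultaneous triggers from both passing the check.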