alvin-bot 4.8.7 → 4.8.9

package/CHANGELOG.md CHANGED
@@ -2,6 +2,67 @@
2
2
 
3
3
  All notable changes to Alvin Bot are documented here.
4
4
 
5
+ ## [4.8.9] — 2026-04-11
6
+
7
+ ### 🐛 Browser automation: dead `browse-server.cjs` path removed, 3-tier router now the source of truth
8
+
9
+ The `browse` skill used to instruct the agent to start `node scripts/browse-server.cjs` on port 3800 for every browser task. That file was deleted in an earlier cleanup (see `20283c9` for the original 577-line version — now gone), but `skills/browse/SKILL.md` was never updated. Result: any browser-related user message on Telegram — or any cron job that hit the skill — got a system-prompt injection telling it to call a gateway that didn't exist, producing half-failed runs like the "Daily Job Alert" cron that couldn't load LinkedIn or StepStone.
10
+
11
+ **What changed:**
12
+
13
+ - **`skills/browse/SKILL.md` — full rewrite.** Now documents the hub 3-tier router at `~/.claude/hub/SCRIPTS/browser.sh`:
14
+ - **Tier 0** — WebFetch / `curl` for static pages and APIs
15
+ - **Tier 1** — `browser.sh stealth <url>` (Playwright + stealth plugin, headless, Cloudflare-masking)
16
+ - **Tier 2** — `browser.sh cdp {start|goto|shot|tabs|stop}` (real Chrome with persistent profile at `~/.claude/hub/BROWSER/profile/`, login cookies survive restarts)
17
+ - **Tier 3** — Claude-in-Chrome extension via MCP tools (interactive CLI only)
18
+ - Explicit escalation ladder (WebFetch → stealth → CDP → ask Ali to log in) and a `NIEMALS browse-server.cjs nutzen` ("never use browse-server.cjs") anti-rule.
19
+ - Concrete working targets (StepStone ✅, Michael Page ✅, LinkedIn ✅ with login, Indeed ❌) so the agent knows what to try where.
20
+
21
+ - **`src/services/browser-manager.ts` — hardened fallback chain.** The multi-strategy manager already had the right *shape* (`gateway → cdp → hub-stealth → cli`) but several ops silently broke or hung:
22
+ - **`gatewayRequest` now has a 15 s timeout** (`req.destroy` on elapse). Previously a hung gateway would wedge the caller forever.
23
+ - **CDP fallback for interactive ops.** `click`, `fill`, `type`, `press`, `scroll`, `evaluate`, `info`, and `getTree` used to hard-throw `"requires gateway"` when `browse-server.cjs` wasn't running. They now try the gateway first, then a short-lived `chromium.connectOverCDP()` via a new `withCdpPage()` helper that reuses Ali's live Chrome on port 9222. Refs are interpreted as CSS selectors when gateway is absent.
24
+ - **Explicit PNG extension** on auto-generated screenshot filenames (`shot_<ts>.png`) so Playwright's format inference is unambiguous.
25
+ - **Better error messages** — every "needs interactive" throw now includes the exact command to start CDP Chrome (`~/.claude/hub/SCRIPTS/browser.sh cdp start headless`).
26
+
27
+ - **`src/paths.ts` — `HUB_BROWSER_SH` constant.** New absolute path to `~/.claude/hub/SCRIPTS/browser.sh` so the manager can shell out without hard-coding `os.homedir()` inline.
28
+
29
+ **Why this matters:** `browser-manager.ts` is still not wired into any bot code path (it's future-proofing), so the production fix for user-interactive flows is `SKILL.md`. The manager hardening ensures that when it does eventually get wired into a sub-agent tool, it won't hang on missing gateways or lose all interactive capability when only CDP is available.
30
+
31
+ **Testing:** Tier 1 stealth end-to-end against `stepstone.de/jobs/it-delivery-director` → 1.2 MB HTML, title parsed. Module-level integration test: `navigate('https://example.com')` via auto-selected hub-stealth → correct title/URL. `resolveStrategy('gateway')` → cascades to CDP with visible warning. `info()` via CDP fallback → returns live Chrome state without throwing. Skills reload picks up the new SKILL.md (5977 chars), `matchSkills("browse linkedin")` hits the browse skill, `buildSkillContext("open stepstone.de")` injects the 3-tier guidance block.
32
+
33
+ ## [4.8.8] — 2026-04-11
34
+
35
+ ### ✨ Unlimited sub-agent & cron timeouts (user-configurable)
36
+
37
+ Sub-agents and `ai-query` cron jobs used to hard-cap at 5 minutes (`SUBAGENT_TIMEOUT=300000` default), and `shell` cron jobs at 60 s. Long-running research, deep-dive audits, or anything that crossed the threshold got killed mid-stream with `status: "timeout"`. 4.8.8 flips the default to **unlimited** and lets the user override both globally and per job.
38
+
39
+ **What changed:**
40
+
41
+ - **Default is now infinite.** `src/config.ts` seeds `subAgentTimeout` from `SUBAGENT_TIMEOUT` env or falls back to `-1` (unlimited). The runtime value lives in `~/.alvin-bot/sub-agents.json` as `defaultTimeoutMs` and is changeable at runtime without restart.
42
+ - **New `/subagents timeout` command.** `/subagents timeout` shows the current value; `/subagents timeout 3600` sets 1 h; `/subagents timeout off` (or `-1`, `0`, `unlimited`, `infinite`) disables the cap entirely. The default-status output now includes a `⏱ Timeout` line.
43
+ - **Per-job override on cron.** `/cron add 1h ai-query "deep audit" --timeout off` gives this one job no timeout. `/cron add 5m shell "pm2 ls" --timeout 30` caps this shell at 30 s. Omitting `--timeout` inherits the current global default. Same flag exists on `scripts/cron-manage.js add --timeout <sec|off>`.
44
+ - **`CronJob.timeoutMs` field.** Optional number in `cron-jobs.json`. Undefined = inherit global default. Value ≤ 0 = unlimited.
45
+ - **Semantics.** `spawnSubAgent` now only arms the `setTimeout(abort)` when `timeout > 0`. At ≤ 0, no abort timer is created, existing `if (timeoutId) clearTimeout(…)` call sites are null-safe, and the agent runs until it finishes, is cancelled via `/subagents cancel`, or the process dies.
46
+ - **Shell cron jobs without a timeout run unlimited.** If a shell job has no `timeoutMs`, `execSync` is called without a `timeout` option, which Node treats as no time limit. This is the behaviour the old code was *meant* to provide before the hard-coded 60 s cap overrode it.
47
+
48
+ **ENV var still works but is seed-only.** `SUBAGENT_TIMEOUT=600000` at startup still seeds the config on first load, but the persisted value in `sub-agents.json` wins after that.
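The timeout semantics described in this entry can be sketched as follows. This is a minimal illustration, not the project's `spawnSubAgent`, which this diff does not show; `armAbortTimer` and `MAX_TIMER_MS` are hypothetical names. One assumption goes beyond the changelog: Node's `setTimeout` coerces delays above 2^31 - 1 ms (about 24.8 days) to 1 ms, so the sketch treats such values as unlimited instead of arming a timer that would fire almost immediately.

```javascript
// Hypothetical helper, not taken from the package source.
// Largest delay setTimeout honours; larger values are coerced to 1 ms by Node.
const MAX_TIMER_MS = 2 ** 31 - 1;

function armAbortTimer(timeoutMs, abort) {
  // Arm a timer only for a positive, finite delay that setTimeout can represent.
  // Values <= 0, NaN, or above MAX_TIMER_MS are treated as "unlimited" (assumption).
  const usable =
    Number.isFinite(timeoutMs) && timeoutMs > 0 && timeoutMs <= MAX_TIMER_MS;
  const timeoutId = usable ? setTimeout(abort, timeoutMs) : null;

  return {
    armed: timeoutId !== null,
    // Safe to call whether or not a timer was created.
    disarm() {
      if (timeoutId !== null) clearTimeout(timeoutId);
    },
  };
}
```

A caller would invoke `disarm()` once the agent finishes or is cancelled, which mirrors the null-safe `clearTimeout` call sites the changelog mentions.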
49
+
50
+ ### 🐛 Silenced harmless `message is not modified` Telegram errors
51
+
52
+ Occasionally Ali would see a red banner at the bottom of an Alvin message:
53
+
54
+ > Error: Call to 'editMessageText' failed! (400: Bad Request: message is not modified: specified new message content and reply markup are exactly the same as a current content and reply markup of the message)
55
+
56
+ It never broke anything, but it polluted logs and showed up as an "internal error" reply to the user. Root cause: Telegram's Bot API refuses `editMessageText` when the new content + reply markup are byte-identical to the existing message. This happens legitimately in callback handlers — e.g. tapping a cron-toggle button twice, re-rendering a sudo/keys/platforms menu, language-switch callbacks that render the same content, or stream flushes where the throttled partial hasn't changed since the last edit.
57
+
58
+ **Fix**: `bot.catch()` in `src/index.ts` now filters out this specific error early. Two regex patterns (`/message is not modified/i` and `/specified new message content.*exactly the same/i`) cover both variants Telegram sends. Real errors (network, SDK, provider failures) still log and still surface the "internal error" reply to the user — only this one harmless class gets dropped.
59
+
60
+ ### 📝 CLAUDE.md: PM2 references updated to launchd
61
+
62
+ The project `CLAUDE.md` still said *"PM2: `alvin-bot` Prozess, Config in `ecosystem.config.cjs`"* — outdated since the 4.8.6 switch to launchd. Updated to reflect the actual process manager (`~/Library/LaunchAgents/com.alvinbot.app.plist`, `KeepAlive=true`, `RunAtLoad=true`), the log paths, and a note that `watchdog.ts` only brakes process crash-loops — it does **not** kill long-running sessions or sub-agents. `ecosystem.config.cjs` is now labelled legacy.
63
+
64
+ The global `~/.claude/CLAUDE.md` was also corrected: `alvin-bot` was removed from the VPS PM2-process list (it runs locally, not on the VPS) and the cron-hub note now correctly says "als **launchd LaunchAgent**" ("as a **launchd LaunchAgent**").
65
+
5
66
  ## [4.8.7] — 2026-04-11
6
67
 
7
68
  ### 🐛 `/update` now detects stale-runtime (rebuild without restart)
package/dist/config.js CHANGED
@@ -45,7 +45,13 @@ export const config = {
45
45
  compactionThreshold: Number(process.env.COMPACTION_THRESHOLD) || 80000,
46
46
  // Sub-Agents
47
47
  maxSubAgents: Number(process.env.MAX_SUBAGENTS) || 4,
48
- subAgentTimeout: Number(process.env.SUBAGENT_TIMEOUT) || 300000, // 5 min
48
+ // Default sub-agent timeout. -1 / 0 = unlimited (no hard cut-off).
49
+ // The runtime value lives in sub-agents.json and can be changed at runtime
50
+ // via /subagents timeout; this value only seeds the initial config on
51
+ // first launch (from SUBAGENT_TIMEOUT if set, otherwise -1).
52
+ subAgentTimeout: process.env.SUBAGENT_TIMEOUT !== undefined && process.env.SUBAGENT_TIMEOUT !== ""
53
+ ? Number(process.env.SUBAGENT_TIMEOUT)
54
+ : -1,
49
55
  // TTS Provider
50
56
  ttsProvider: (process.env.TTS_PROVIDER || "edge"),
51
57
  elevenlabs: {
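The seeding rule in the `config.js` hunk above can be isolated as a small function. The sketch below is illustrative only; `seedSubAgentTimeout` is a hypothetical name. It also adds one guard that the diff does not have: with the expression as written, a non-numeric `SUBAGENT_TIMEOUT` such as `abc` evaluates to `NaN`, and the sketch assumes that mapping such input to `-1` (unlimited) is the intended outcome.

```javascript
// Hypothetical helper mirroring the seeding rule shown in the diff.
function seedSubAgentTimeout(raw) {
  // Unset or empty env var: fall back to -1 (unlimited), as in the diff.
  if (raw === undefined || raw === "") return -1;
  const parsed = Number(raw);
  // Assumption: non-numeric input should also mean "unlimited" rather than NaN.
  return Number.isFinite(parsed) ? parsed : -1;
}

// Example: seedSubAgentTimeout(process.env.SUBAGENT_TIMEOUT)
```

Unlike the previous `Number(...) || 300000` form, this keeps an explicit `0` instead of silently replacing it with the default.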
@@ -1277,9 +1277,29 @@ export function registerCommands(bot) {
1277
1277
  `Commands: /cron add · delete · toggle · run · info`, { parse_mode: "HTML", reply_markup: keyboard });
1278
1278
  return;
1279
1279
  }
1280
- // /cron add <schedule> <type> <payload>
1280
+ // /cron add <schedule> <type> <payload> [--timeout <sec|off>]
1281
1281
  if (arg.startsWith("add ")) {
1282
- const rest = arg.slice(4).trim();
1282
+ let rest = arg.slice(4).trim();
1283
+ // Extract optional --timeout flag from anywhere in the command.
1284
+ // Accepts seconds, or "off", "unlimited", "infinite", "-1", "0" for
1285
+ // unlimited (stored as -1). Negative or non-numeric values are rejected.
1286
+ let timeoutMs;
1287
+ const timeoutMatch = rest.match(/(^|\s)--timeout\s+(\S+)/);
1288
+ if (timeoutMatch) {
1289
+ const val = timeoutMatch[2].toLowerCase();
1290
+ if (["off", "unlimited", "infinite", "-1", "0"].includes(val)) {
1291
+ timeoutMs = -1;
1292
+ }
1293
+ else {
1294
+ const secs = Number(timeoutMatch[2]);
1295
+ if (!Number.isFinite(secs) || secs < 0) {
1296
+ await ctx.reply(`❌ Invalid <code>--timeout</code> value: ${timeoutMatch[2]}`, { parse_mode: "HTML" });
1297
+ return;
1298
+ }
1299
+ timeoutMs = Math.floor(secs * 1000);
1300
+ }
1301
+ rest = rest.replace(/(^|\s)--timeout\s+\S+/, "").trim();
1302
+ }
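The flag handling above can be read as a pure function, which makes its edge cases easier to see. The following is a sketch under that assumption; `parseTimeoutFlag` and its return shape are hypothetical and not part of the package. The regex and the list of "unlimited" keywords are copied from the hunk.

```javascript
// Hypothetical extraction of the --timeout parsing shown in the diff.
const UNLIMITED_WORDS = ["off", "unlimited", "infinite", "-1", "0"];
const TIMEOUT_FLAG = /(^|\s)--timeout\s+(\S+)/;

function parseTimeoutFlag(input) {
  const match = input.match(TIMEOUT_FLAG);
  // No flag: leave timeoutMs undefined so the job inherits the default.
  if (!match) return { rest: input.trim(), timeoutMs: undefined };

  const raw = match[2];
  // Remove the flag and its value from wherever they appeared.
  const rest = input.replace(TIMEOUT_FLAG, "").trim();

  if (UNLIMITED_WORDS.includes(raw.toLowerCase())) {
    return { rest, timeoutMs: -1 };
  }
  const secs = Number(raw);
  if (!Number.isFinite(secs) || secs < 0) {
    return { rest, error: `Invalid --timeout value: ${raw}` };
  }
  // Seconds on the command line, milliseconds in storage.
  return { rest, timeoutMs: Math.floor(secs * 1000) };
}
```

For example, `parseTimeoutFlag('5m shell "pm2 ls" --timeout 30')` yields `timeoutMs: 30000` and leaves `5m shell "pm2 ls"` for the schedule parser.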
1283
1303
  // Natural language schedule shortcuts (German + English)
1284
1304
  const naturalSchedules = {
1285
1305
  "täglich": "0 8 * * *", "daily": "0 8 * * *",
@@ -1342,7 +1362,7 @@ export function registerCommands(bot) {
1342
1362
  else {
1343
1363
  const sp = rest.indexOf(" ");
1344
1364
  if (sp < 0) {
1345
- await ctx.reply("Format: <code>/cron add &lt;schedule&gt; &lt;type&gt; &lt;payload&gt;</code>\n\nSchedule options:\n• <b>Intervals:</b> 5m, 1h, 30s, 2d\n• <b>Natural:</b> daily, weekly, monthly, weekdays, hourly\n• <b>With time:</b> 8:30 daily, weekdays 9:00\n• <b>German:</b> täglich, wöchentlich, morgens, abends\n• <b>Cron:</b> \"0 9 * * 1-5\"", { parse_mode: "HTML" });
1365
+ await ctx.reply("Format: <code>/cron add &lt;schedule&gt; &lt;type&gt; &lt;payload&gt; [--timeout &lt;sec|off&gt;]</code>\n\nSchedule options:\n• <b>Intervals:</b> 5m, 1h, 30s, 2d\n• <b>Natural:</b> daily, weekly, monthly, weekdays, hourly\n• <b>With time:</b> 8:30 daily, weekdays 9:00\n• <b>German:</b> täglich, wöchentlich, morgens, abends\n• <b>Cron:</b> \"0 9 * * 1-5\"\n\nOptional <code>--timeout</code> in seconds, or <code>off</code>/<code>-1</code> for unlimited.", { parse_mode: "HTML" });
1346
1366
  return;
1347
1367
  }
1348
1368
  schedule = rest.slice(0, sp);
@@ -1381,12 +1401,19 @@ export function registerCommands(bot) {
1381
1401
  payload,
1382
1402
  target: { platform: "telegram", chatId: String(chatId) },
1383
1403
  createdBy: `telegram:${userId}`,
1404
+ ...(timeoutMs !== undefined ? { timeoutMs } : {}),
1384
1405
  });
1385
1406
  const readableSched = humanReadableSchedule(job.schedule);
1407
+ const timeoutLine = typeof job.timeoutMs === "number"
1408
+ ? job.timeoutMs <= 0
1409
+ ? `<b>Timeout:</b> ∞ (unlimited)\n`
1410
+ : `<b>Timeout:</b> ${Math.round(job.timeoutMs / 1000)}s\n`
1411
+ : "";
1386
1412
  await ctx.reply(`✅ <b>Cron Job created</b>\n\n` +
1387
1413
  `<b>Name:</b> ${job.name}\n` +
1388
1414
  `📅 <b>${readableSched}</b>\n` +
1389
1415
  `<b>Type:</b> ${job.type}\n` +
1416
+ timeoutLine +
1390
1417
  `<b>Next run:</b> ${formatNextRun(job.nextRunAt)}\n` +
1391
1418
  `<b>ID:</b> <code>${job.id}</code>`, { parse_mode: "HTML" });
1392
1419
  return;
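How a stored `timeoutMs` might be resolved at run time is not shown in this diff. The sketch below follows the changelog's stated rule for `ai-query` jobs (undefined inherits the global default, any value ≤ 0 means unlimited). `effectiveTimeoutMs` is a hypothetical name, and returning `null` for "no timer" is a choice made for this illustration. Per the changelog, shell jobs without a `timeoutMs` simply run without a limit, so they would not go through the inheritance step.

```javascript
// Hypothetical resolver for ai-query jobs; not taken from the package source.
function effectiveTimeoutMs(job, globalDefaultMs) {
  // A per-job value wins; otherwise inherit the global default.
  const chosen =
    typeof job.timeoutMs === "number" ? job.timeoutMs : globalDefaultMs;
  // Anything <= 0 (or NaN) means unlimited, signalled here as null.
  return chosen > 0 ? chosen : null;
}
```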
@@ -1734,7 +1761,7 @@ export function registerCommands(bot) {
1734
1761
  // type both "/sub-agents" and "/subagents" — Telegram routes both to this.
1735
1762
  bot.command(["sub_agents", "subagents"], async (ctx) => {
1736
1763
  const lang = getSession(ctx.from.id).language;
1737
- const { listSubAgents, cancelSubAgent, getSubAgentResult, getMaxParallelAgents, getConfiguredMaxParallel, setMaxParallelAgents, findSubAgentByName, getVisibility, setVisibility, getQueueCap, setQueueCap, } = await import("../services/subagents.js");
1764
+ const { listSubAgents, cancelSubAgent, getSubAgentResult, getMaxParallelAgents, getConfiguredMaxParallel, setMaxParallelAgents, findSubAgentByName, getVisibility, setVisibility, getQueueCap, setQueueCap, getDefaultTimeoutMs, setDefaultTimeoutMs, } = await import("../services/subagents.js");
1738
1765
  const arg = (ctx.match || "").trim();
1739
1766
  const tokens = arg.split(/\s+/).filter(Boolean);
1740
1767
  const sub = tokens[0]?.toLowerCase() || "";
@@ -1792,6 +1819,47 @@ export function registerCommands(bot) {
1792
1819
  await ctx.reply(lines.join("\n"), { parse_mode: "Markdown" });
1793
1820
  return;
1794
1821
  }
1822
+ // /subagents timeout [sec|off|unlimited|-1] — set default sub-agent timeout
1823
+ if (sub === "timeout") {
1824
+ const val = tokens[1];
1825
+ const formatTimeout = (ms) => {
1826
+ if (ms <= 0)
1827
+ return "∞ (unlimited)";
1828
+ if (ms < 1000)
1829
+ return `${ms}ms`;
1830
+ const sec = ms / 1000;
1831
+ if (sec < 60)
1832
+ return `${sec}s`;
1833
+ const min = sec / 60;
1834
+ if (min < 60)
1835
+ return `${min.toFixed(min < 10 ? 1 : 0)}min`;
1836
+ return `${(min / 60).toFixed(1)}h`;
1837
+ };
1838
+ if (!val) {
1839
+ const current = getDefaultTimeoutMs();
1840
+ await ctx.reply(`⏱ Default sub-agent timeout: *${formatTimeout(current)}*\n\n` +
1841
+ `Usage: \`/subagents timeout <sec>\` · \`/subagents timeout off\`\n` +
1842
+ `\`off\`, \`unlimited\`, \`-1\` oder \`0\` = kein Timeout. ` +
1843
+ `Gilt für neue Subagents und ai-query Cron-Jobs ohne eigenen Wert.`, { parse_mode: "Markdown" });
1844
+ return;
1845
+ }
1846
+ const lower = val.toLowerCase();
1847
+ let ms;
1848
+ if (["off", "unlimited", "infinite", "-1", "0"].includes(lower)) {
1849
+ ms = -1;
1850
+ }
1851
+ else {
1852
+ const secs = Number(val);
1853
+ if (!Number.isFinite(secs) || secs < 0) {
1854
+ await ctx.reply(`❌ Ungültiger Wert \`${val}\`. Nutze Sekunden (z.B. \`300\`) oder \`off\`.`, { parse_mode: "Markdown" });
1855
+ return;
1856
+ }
1857
+ ms = Math.floor(secs * 1000);
1858
+ }
1859
+ const effective = setDefaultTimeoutMs(ms);
1860
+ await ctx.reply(`✅ Default sub-agent timeout: *${formatTimeout(effective)}*`, { parse_mode: "Markdown" });
1861
+ return;
1862
+ }
1795
1863
  // /subagents queue <n> — set bounded-queue cap (0 disables queue)
1796
1864
  if (sub === "queue") {
1797
1865
  const n = parseInt(tokens[1] || "", 10);
@@ -1921,6 +1989,10 @@ export function registerCommands(bot) {
1921
1989
  ? `${t("bot.subagents.maxLabel", lang)} 0 ${t("bot.subagents.autoSuffix", lang, { n: effective })}`
1922
1990
  : `${t("bot.subagents.maxLabel", lang)} ${configured}`;
1923
1991
  const visibilityLabel = `${t("bot.subagents.visibilityLabel", lang)} *${getVisibility()}*`;
1992
+ const currentTimeout = getDefaultTimeoutMs();
1993
+ const timeoutLabel = currentTimeout <= 0
1994
+ ? `⏱ Timeout: *∞ (unlimited)*`
1995
+ : `⏱ Timeout: *${Math.round(currentTimeout / 1000)}s*`;
1924
1996
  const agents = listSubAgents();
1925
1997
  let body = "";
1926
1998
  if (agents.length === 0) {
@@ -1931,7 +2003,7 @@ export function registerCommands(bot) {
1931
2003
  }
1932
2004
  const header = t("bot.subagents.header", lang);
1933
2005
  const usage = `\n\n${t("bot.subagents.usage", lang)}`;
1934
- const full = `${header}\n${maxLabel}\n${visibilityLabel}${body}${usage}`;
2006
+ const full = `${header}\n${maxLabel}\n${visibilityLabel}\n${timeoutLabel}${body}${usage}`;
1935
2007
  await ctx.reply(full, { parse_mode: "Markdown" }).catch(() => ctx.reply(full));
1936
2008
  });
1937
2009
  }
package/dist/i18n.js CHANGED
@@ -519,10 +519,10 @@ const strings = {
519
519
  fr: "Durée : {sec}s · Tokens : {in}/{out}",
520
520
  },
521
521
  "bot.subagents.usage": {
522
- en: "Commands:\n/subagents — show status\n/subagents max <n> — set parallel limit (0=auto)\n/subagents visibility <auto|banner|silent|live> — delivery mode\n/subagents queue <n> — bounded-queue cap (0 = disabled)\n/subagents stats — last 24h run stats\n/subagents list — list all\n/subagents cancel <name|id> — cancel one\n/subagents result <name|id> — show result",
523
- de: "Befehle:\n/subagents — Status anzeigen\n/subagents max <n> — Parallel-Limit setzen (0=auto)\n/subagents visibility <auto|banner|silent|live> — Delivery-Modus\n/subagents list — alle anzeigen\n/subagents cancel <name|id> — abbrechen\n/subagents result <name|id> — Ergebnis anzeigen",
524
- es: "Comandos:\n/subagents — ver estado\n/subagents max <n> — establecer límite (0=auto)\n/subagents visibility <auto|banner|silent|live> — modo de entrega\n/subagents list — listar todos\n/subagents cancel <nombre|id> — cancelar uno\n/subagents result <nombre|id> — ver resultado",
525
- fr: "Commandes :\n/subagents — état\n/subagents max <n> — limite parallèle (0=auto)\n/subagents visibility <auto|banner|silent|live> — mode de livraison\n/subagents list — lister tous\n/subagents cancel <nom|id> — annuler un\n/subagents result <nom|id> — voir résultat",
522
+ en: "Commands:\n/subagents — show status\n/subagents max <n> — set parallel limit (0=auto)\n/subagents timeout <sec|off> — default timeout (off = unlimited)\n/subagents visibility <auto|banner|silent|live> — delivery mode\n/subagents queue <n> — bounded-queue cap (0 = disabled)\n/subagents stats — last 24h run stats\n/subagents list — list all\n/subagents cancel <name|id> — cancel one\n/subagents result <name|id> — show result",
523
+ de: "Befehle:\n/subagents — Status anzeigen\n/subagents max <n> — Parallel-Limit setzen (0=auto)\n/subagents timeout <sec|off> — Default-Timeout (off = unendlich)\n/subagents visibility <auto|banner|silent|live> — Delivery-Modus\n/subagents queue <n> — Queue-Cap (0 = deaktiviert)\n/subagents list — alle anzeigen\n/subagents cancel <name|id> — abbrechen\n/subagents result <name|id> — Ergebnis anzeigen",
524
+ es: "Comandos:\n/subagents — ver estado\n/subagents max <n> — establecer límite (0=auto)\n/subagents timeout <seg|off> — timeout por defecto (off = sin límite)\n/subagents visibility <auto|banner|silent|live> — modo de entrega\n/subagents list — listar todos\n/subagents cancel <nombre|id> — cancelar uno\n/subagents result <nombre|id> — ver resultado",
525
+ fr: "Commandes :\n/subagents — état\n/subagents max <n> — limite parallèle (0=auto)\n/subagents timeout <sec|off> — délai par défaut (off = illimité)\n/subagents visibility <auto|banner|silent|live> — mode de livraison\n/subagents list — lister tous\n/subagents cancel <nom|id> — annuler un\n/subagents result <nom|id> — voir résultat",
526
526
  },
527
527
  "bot.subagents.visibilityLabel": {
528
528
  en: "Visibility:",
package/dist/index.js CHANGED
@@ -216,10 +216,20 @@ if (hasTelegram) {
216
216
  bot.on("message:photo", handlePhoto);
217
217
  bot.on("message:document", handleDocument);
218
218
  bot.on("message:text", handleMessage);
219
- // Error handling — log but don't crash
219
+ // Error handling — log but don't crash.
220
220
  bot.catch((err) => {
221
221
  const ctx = err.ctx;
222
222
  const e = err.error;
223
+ // Telegram's "message is not modified" (400) is harmless — it fires
224
+ // when a callback handler re-renders an inline keyboard / edited
225
+ // message with content that happens to match the current message
226
+ // exactly (e.g. double-tapped toggle button, identical list after
227
+ // re-render). Swallow it silently so it neither pollutes the logs
228
+ // nor bubbles up to the user as "internal error".
229
+ const msg = e instanceof Error ? e.message : String(e);
230
+ if (/message is not modified/i.test(msg) || /specified new message content.*exactly the same/i.test(msg)) {
231
+ return;
232
+ }
223
233
  console.error(`Error handling update ${ctx?.update?.update_id}:`, e);
224
234
  // Try to notify the user
225
235
  if (ctx?.chat?.id) {
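The filter added to `bot.catch()` amounts to a predicate over the error message. A minimal sketch, using the two patterns from the hunk above; the names `HARMLESS_PATTERNS` and `isHarmlessEditError` are hypothetical:

```javascript
// Patterns copied from the diff; the helper name is an illustration only.
const HARMLESS_PATTERNS = [
  /message is not modified/i,
  /specified new message content.*exactly the same/i,
];

function isHarmlessEditError(e) {
  // Normalise Error objects and thrown non-Error values to a string.
  const msg = e instanceof Error ? e.message : String(e);
  return HARMLESS_PATTERNS.some((re) => re.test(msg));
}
```

Keeping the check as a standalone function would let it be unit-tested against the exact banner text quoted in the changelog without starting a bot.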
package/dist/paths.js CHANGED
@@ -86,6 +86,8 @@ export const AGENTS_FILE = resolve(DATA_DIR, "AGENTS.md");
86
86
  export const HOOKS_DIR = resolve(DATA_DIR, "hooks");
87
87
  /** scripts/browse-server.cjs — HTTP gateway for persistent browser sessions */
88
88
  export const BROWSE_SERVER_SCRIPT = resolve(BOT_ROOT, "scripts", "browse-server.cjs");
89
+ /** ~/.claude/hub/SCRIPTS/browser.sh — Hub 3-tier browser router (stealth, CDP, ext) */
90
+ export const HUB_BROWSER_SH = resolve(os.homedir(), ".claude", "hub", "SCRIPTS", "browser.sh");
89
91
  /** data/exec-allowlist.json — User-defined exec allowlist */
90
92
  export const EXEC_ALLOWLIST_FILE = resolve(DATA_DIR, "exec-allowlist.json");
91
93
  /** assets/ — User asset files (CVs, cover letters, legal docs, photos) */
@@ -1,34 +1,166 @@
1
1
  /**
2
- * Multi-Strategy Browser Manager
2
+ * Multi-Strategy Browser Manager — with automatic fallback chain.
3
3
  *
4
- * Auto-selects between three browser strategies:
5
- * - CLI: Headless Playwright, one-shot (screenshots, text extraction, PDF)
6
- * - Gateway: Persistent HTTP browser server (interactive browsing, form-filling)
7
- * - CDP: Attach to user's live Chrome via DevTools Protocol
4
+ * Strategy priority:
5
+ * 1. Gateway (browse-server.cjs HTTP server) if script exists and is running
6
+ * 2. CDP (Chrome DevTools Protocol) via hub browser.sh cdp, persistent cookies
7
+ * 3. Hub Stealth (Playwright + stealth plugin) via hub browser.sh stealth
8
+ * 4. Raw CLI (bare Playwright) — last resort, easily blocked
9
+ *
10
+ * If a strategy is unavailable, we automatically cascade to the next one
11
+ * and log a warning so failures are visible, not silent.
8
12
  */
9
- import { spawn } from "child_process";
13
+ import { execSync, spawn } from "child_process";
10
14
  import http from "http";
11
15
  import fs from "fs";
12
16
  import { config } from "../config.js";
13
- import { BROWSE_SERVER_SCRIPT } from "../paths.js";
17
+ import { BROWSE_SERVER_SCRIPT, HUB_BROWSER_SH } from "../paths.js";
14
18
  import { screenshotUrl, extractText, generatePdf } from "./browser.js";
15
- /** Auto-select the best browser strategy for a task */
19
+ const CDP_PORT = 9222;
20
+ const EXEC_TIMEOUT = 60_000; // 60s for page loads via shell
21
+ // ── Logging ──────────────────────────────────────────────────────────
22
+ function log(msg) {
23
+ console.warn(`[browser-manager] ${msg}`);
24
+ }
25
+ // ── Availability Checks ──────────────────────────────────────────────
26
+ function isGatewayScriptPresent() {
27
+ return fs.existsSync(BROWSE_SERVER_SCRIPT);
28
+ }
29
+ async function isGatewayRunning() {
30
+ try {
31
+ const health = await gatewayRequest("/health");
32
+ return !!health?.ok;
33
+ }
34
+ catch {
35
+ return false;
36
+ }
37
+ }
38
+ function isHubBrowserAvailable() {
39
+ return fs.existsSync(HUB_BROWSER_SH);
40
+ }
41
+ async function isCDPAvailable() {
42
+ return new Promise((resolve) => {
43
+ const req = http.get(`http://127.0.0.1:${CDP_PORT}/json/version`, (res) => {
44
+ let data = "";
45
+ res.on("data", (chunk) => (data += chunk));
46
+ res.on("end", () => resolve(res.statusCode === 200));
47
+ });
48
+ req.on("error", () => resolve(false));
49
+ req.setTimeout(3000, () => {
50
+ req.destroy();
51
+ resolve(false);
52
+ });
53
+ });
54
+ }
55
+ // ── Strategy Selection with Fallback ─────────────────────────────────
56
+ /** Pick the preferred strategy based on task type */
16
57
  export function selectStrategy(task = {}) {
17
58
  if (task.useUserBrowser || config.cdpUrl)
18
59
  return "cdp";
19
60
  if (task.interactive || task.multiStep)
20
61
  return "gateway";
62
+ return "hub-stealth";
63
+ }
64
+ /**
65
+ * Resolve the preferred strategy to one that's actually available.
66
+ * Cascades: gateway → cdp → hub-stealth → cli
67
+ */
68
+ export async function resolveStrategy(preferred) {
69
+ const chain = [];
70
+ // Build fallback chain starting from preferred
71
+ switch (preferred) {
72
+ case "gateway":
73
+ chain.push("gateway", "cdp", "hub-stealth", "cli");
74
+ break;
75
+ case "cdp":
76
+ chain.push("cdp", "hub-stealth", "cli");
77
+ break;
78
+ case "hub-stealth":
79
+ chain.push("hub-stealth", "cli");
80
+ break;
81
+ case "cli":
82
+ chain.push("cli");
83
+ break;
84
+ }
85
+ for (const strategy of chain) {
86
+ switch (strategy) {
87
+ case "gateway":
88
+ if (isGatewayScriptPresent() && (await isGatewayRunning()))
89
+ return "gateway";
90
+ if (!isGatewayScriptPresent()) {
91
+ log("Gateway unavailable: browse-server.cjs not found. Falling back.");
92
+ }
93
+ else {
94
+ log("Gateway not running. Falling back.");
95
+ }
96
+ break;
97
+ case "cdp":
98
+ if (await isCDPAvailable())
99
+ return "cdp";
100
+ // Try starting CDP via hub script
101
+ if (isHubBrowserAvailable()) {
102
+ try {
103
+ log("CDP Chrome not running — attempting to start via hub browser.sh...");
104
+ execSync(`"${HUB_BROWSER_SH}" cdp start headless`, {
105
+ stdio: "pipe",
106
+ timeout: 15_000,
107
+ });
108
+ // Give it a moment to spin up
109
+ await new Promise((r) => setTimeout(r, 3000));
110
+ if (await isCDPAvailable()) {
111
+ log("CDP Chrome started successfully.");
112
+ return "cdp";
113
+ }
114
+ }
115
+ catch (err) {
116
+ log(`Failed to start CDP Chrome: ${err.message}`);
117
+ }
118
+ }
119
+ log("CDP unavailable. Falling back.");
120
+ break;
121
+ case "hub-stealth":
122
+ if (isHubBrowserAvailable())
123
+ return "hub-stealth";
124
+ log("Hub browser.sh not found. Falling back to raw Playwright.");
125
+ break;
126
+ case "cli":
127
+ return "cli"; // Always available as last resort
128
+ }
129
+ }
21
130
  return "cli";
22
131
  }
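The cascade in `resolveStrategy` can also be expressed as an ordered list plus availability probes. This is an alternative sketch, not the code in the diff: `STRATEGY_ORDER`, `fallbackChain`, and `firstAvailable` are hypothetical names, and the probes are passed in as plain async functions so the logic can be exercised without a browser.

```javascript
// Hypothetical restructuring of the fallback chain; behaviour is meant to match
// the switch statement in the diff, including "cli" as the final resort.
const STRATEGY_ORDER = ["gateway", "cdp", "hub-stealth", "cli"];

function fallbackChain(preferred) {
  const start = STRATEGY_ORDER.indexOf(preferred);
  // Unknown strategy names fall straight through to "cli".
  return start < 0 ? ["cli"] : STRATEGY_ORDER.slice(start);
}

async function firstAvailable(chain, probes) {
  for (const strategy of chain) {
    if (strategy === "cli") return "cli"; // always available
    const probe = probes[strategy];
    if (probe !== undefined && (await probe())) return strategy;
  }
  return "cli";
}
```

With this shape, adding a new tier would mean inserting one entry in `STRATEGY_ORDER` and supplying one probe.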
23
- // ── Gateway Management ────────────────────────────────────────────────
132
+ function execHub(args) {
133
+ try {
134
+ const result = execSync(`"${HUB_BROWSER_SH}" ${args}`, {
135
+ stdio: "pipe",
136
+ timeout: EXEC_TIMEOUT,
137
+ env: { ...process.env, PATH: process.env.PATH },
138
+ });
139
+ const stdout = result.toString().trim();
140
+ // Try to parse as JSON (stealth outputs JSON)
141
+ try {
142
+ return JSON.parse(stdout);
143
+ }
144
+ catch {
145
+ // Not JSON — return as raw text
146
+ return { title: "", url: "", raw: stdout };
147
+ }
148
+ }
149
+ catch (err) {
150
+ const error = err;
151
+ log(`Hub script failed: ${error.stderr?.toString()?.trim() || error.message}`);
152
+ return null;
153
+ }
154
+ }
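`execHub` builds a shell command string, and later call sites interpolate the target URL into it (for example `stealth "${url}"`). A URL containing a double quote or `$(...)` would then be interpreted by the shell. One way to avoid this, sketched below under stated assumptions, is to validate the URL and pass an argument array to `execFileSync`, which does not spawn a shell by default. `buildHubArgs` and the `cdp-goto` key are hypothetical; the subcommands themselves (`stealth`, `cdp goto`) come from the changelog.

```javascript
// Hypothetical helper: turns a request into an argv array for browser.sh.
const HUB_SUBCOMMANDS = new Map([
  ["stealth", ["stealth"]],
  ["cdp-goto", ["cdp", "goto"]],
]);

function buildHubArgs(kind, rawUrl) {
  const prefix = HUB_SUBCOMMANDS.get(kind);
  if (!prefix) throw new Error(`Unknown hub subcommand: ${kind}`);

  // new URL() throws on input that is not a valid absolute URL.
  const parsed = new URL(rawUrl);
  // Assumption: only web URLs should ever reach the browser router.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return [...prefix, parsed.href];
}
```

A caller could then run `execFileSync(HUB_BROWSER_SH, buildHubArgs("stealth", url), { stdio: "pipe", timeout: 60_000 })`, with `execFileSync` imported from `child_process`, so the URL is never parsed by a shell.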
155
+ // ── Gateway Management ───────────────────────────────────────────────
24
156
  let gatewayProcess = null;
25
- async function gatewayRequest(path, params = {}) {
157
+ async function gatewayRequest(urlPath, params = {}, timeoutMs = 15_000) {
26
158
  const query = new URLSearchParams(params).toString();
27
- const url = `http://127.0.0.1:${config.browseServerPort}${path}${query ? "?" + query : ""}`;
159
+ const url = `http://127.0.0.1:${config.browseServerPort}${urlPath}${query ? "?" + query : ""}`;
28
160
  return new Promise((resolve, reject) => {
29
- http.get(url, (res) => {
161
+ const req = http.get(url, (res) => {
30
162
  let data = "";
31
- res.on("data", chunk => data += chunk);
163
+ res.on("data", (chunk) => (data += chunk));
32
164
  res.on("end", () => {
33
165
  try {
34
166
  resolve(JSON.parse(data));
@@ -37,107 +169,270 @@ async function gatewayRequest(path, params = {}) {
37
169
  reject(new Error(`Invalid JSON from gateway: ${data.slice(0, 200)}`));
38
170
  }
39
171
  });
40
- }).on("error", reject);
172
+ });
173
+ req.on("error", reject);
174
+ req.setTimeout(timeoutMs, () => {
175
+ req.destroy(new Error(`Gateway request timed out after ${timeoutMs}ms: ${urlPath}`));
176
+ });
41
177
  });
42
178
  }
43
179
  async function ensureGateway() {
44
180
  // Check if already running
45
- try {
46
- const health = await gatewayRequest("/health");
47
- if (health.ok)
48
- return true;
49
- }
50
- catch { /* not running */ }
51
- // Start it
52
- if (!fs.existsSync(BROWSE_SERVER_SCRIPT))
181
+ if (await isGatewayRunning())
182
+ return true;
183
+ // Try to start it
184
+ if (!isGatewayScriptPresent()) {
185
+ log("Cannot start gateway: browse-server.cjs not found.");
53
186
  return false;
187
+ }
54
188
  gatewayProcess = spawn("node", [BROWSE_SERVER_SCRIPT, String(config.browseServerPort)], {
55
189
  stdio: "pipe",
56
190
  detached: false,
57
191
  });
58
- gatewayProcess.on("exit", () => { gatewayProcess = null; });
192
+ gatewayProcess.on("exit", () => {
193
+ gatewayProcess = null;
194
+ });
59
195
  // Wait for startup (max 10s)
60
196
  for (let i = 0; i < 20; i++) {
61
- await new Promise(r => setTimeout(r, 500));
62
- try {
63
- const health = await gatewayRequest("/health");
64
- if (health.ok)
65
- return true;
66
- }
67
- catch { /* still starting */ }
197
+ await new Promise((r) => setTimeout(r, 500));
198
+ if (await isGatewayRunning())
199
+ return true;
68
200
  }
201
+ log("Gateway failed to start within 10s.");
69
202
  return false;
70
203
  }
71
- // ── Unified Operations ────────────────────────────────────────────────
72
- /** Navigate to URL using best strategy */
204
+ // ── Unified Operations ───────────────────────────────────────────────
205
+ /** Navigate to URL using best available strategy */
73
206
  export async function navigate(url, task = {}) {
74
- const strategy = selectStrategy(task);
75
- if (strategy === "gateway") {
76
- await ensureGateway();
77
- return gatewayRequest("/navigate", { url });
78
- }
79
- if (strategy === "cdp") {
80
- // CDP: use playwright connectOverCDP
81
- const { chromium } = await import("playwright");
82
- const browser = await chromium.connectOverCDP(config.cdpUrl);
83
- const contexts = browser.contexts();
84
- const page = contexts[0]?.pages()[0] || await contexts[0]?.newPage() || await browser.newPage();
85
- await page.goto(url, { waitUntil: "networkidle", timeout: 30000 });
86
- const title = await page.title();
87
- return { title, url: page.url() };
88
- }
89
- // CLI: simple text extraction
90
- const text = await extractText(url);
91
- return { title: url, url, tree: [text.slice(0, 500)] };
207
+ const strategy = await resolveStrategy(selectStrategy(task));
208
+ log(`navigate(${url}) using strategy: ${strategy}`);
209
+ switch (strategy) {
210
+ case "gateway": {
211
+ await ensureGateway();
212
+ return gatewayRequest("/navigate", { url });
213
+ }
214
+ case "cdp": {
215
+ // Try hub CDP first
216
+ if (isHubBrowserAvailable()) {
217
+ const result = execHub(`cdp goto "${url}"`);
218
+ if (result && !result.error) {
219
+ return { title: result.title || "", url: result.url || url };
220
+ }
221
+ }
222
+ // Fallback: direct Playwright CDP
223
+ try {
224
+ const { chromium } = await import("playwright");
225
+ const browser = await chromium.connectOverCDP(config.cdpUrl || `http://127.0.0.1:${CDP_PORT}`);
226
+ const contexts = browser.contexts();
227
+ const page = contexts[0]?.pages()[0] || (await contexts[0]?.newPage()) || (await browser.newPage());
228
+ await page.goto(url, { waitUntil: "networkidle", timeout: 30000 });
229
+ const title = await page.title();
230
+ return { title, url: page.url() };
231
+ }
232
+ catch (err) {
233
+ log(`Direct CDP failed: ${err.message}`);
234
+ // Last resort: try stealth
235
+ if (isHubBrowserAvailable()) {
236
+ const stealthResult = execHub(`stealth "${url}"`);
237
+ if (stealthResult) {
238
+ return { title: stealthResult.title || "", url: stealthResult.url || url };
239
+ }
240
+ }
241
+ throw err;
242
+ }
243
+ }
244
+ case "hub-stealth": {
245
+ const result = execHub(`stealth "${url}"`);
246
+ if (result && !result.error) {
247
+ return { title: result.title || "", url: result.url || url };
248
+ }
249
+ // Fallback to raw CLI
250
+ log("Hub stealth failed, falling back to raw Playwright.");
251
+ const text = await extractText(url);
252
+ return { title: url, url, tree: [text.slice(0, 500)] };
253
+ }
254
+ case "cli":
255
+ default: {
256
+ const text = await extractText(url);
257
+ return { title: url, url, tree: [text.slice(0, 500)] };
258
+ }
259
+ }
92
260
  }
93
261
  /** Take a screenshot */
94
262
  export async function screenshot(url, options = {}) {
95
- const strategy = selectStrategy();
96
- if (strategy === "gateway") {
97
- await ensureGateway();
98
- if (url)
99
- await gatewayRequest("/navigate", { url });
100
- const result = await gatewayRequest("/screenshot", options.fullPage ? { full: "true" } : {});
101
- return result.path;
102
- }
103
- // CLI fallback
104
- return screenshotUrl(url, { fullPage: options.fullPage });
105
- }
106
- /** Get accessibility tree (gateway only) */
263
+ const strategy = await resolveStrategy(selectStrategy());
264
+ log(`screenshot(${url}) using strategy: ${strategy}`);
265
+ switch (strategy) {
266
+ case "gateway": {
267
+ await ensureGateway();
268
+ if (url)
269
+ await gatewayRequest("/navigate", { url });
270
+ const result = await gatewayRequest("/screenshot", options.fullPage ? { full: "true" } : {});
271
+ return result.path;
272
+ }
273
+ case "cdp": {
274
+ if (isHubBrowserAvailable()) {
275
+ const tmpName = `shot_${Date.now()}.png`;
276
+ const result = execHub(`cdp shot "${url}" ${tmpName}`);
277
+ if (result?.screenshot)
278
+ return result.screenshot;
279
+ }
280
+ // Fallback to raw Playwright
281
+ return screenshotUrl(url, { fullPage: options.fullPage });
282
+ }
283
+ case "hub-stealth": {
284
+ const tmpName = `shot_${Date.now()}.png`;
285
+ const result = execHub(`stealth "${url}" --screenshot=${tmpName}`);
286
+ if (result?.screenshot)
287
+ return result.screenshot;
288
+ // Fallback
289
+ return screenshotUrl(url, { fullPage: options.fullPage });
290
+ }
291
+ case "cli":
292
+ default:
293
+ return screenshotUrl(url, { fullPage: options.fullPage });
294
+ }
295
+ }
296
+ // ── CDP Direct-Playwright Helper ─────────────────────────────────────
297
+ // Used as fallback when the gateway isn't running but CDP Chrome is.
298
+ // Each call opens a short-lived CDP connection, operates on the newest
299
+ // existing page in the current context (keeps Chrome itself alive), and
300
+ // disconnects. Safe for sub-agents that need a single op at a time.
301
+ async function withCdpPage(fn) {
302
+ const { chromium } = await import("playwright");
303
+ const browser = await chromium.connectOverCDP(config.cdpUrl || `http://127.0.0.1:${CDP_PORT}`);
304
+ try {
305
+ const context = browser.contexts()[0];
306
+ if (!context)
307
+ throw new Error("No CDP contexts available — is Chrome CDP running?");
308
+ const pages = context.pages();
309
+ const page = pages[pages.length - 1] || (await context.newPage());
310
+ return await fn(page);
311
+ }
312
+ finally {
313
+ await browser.close(); // Closes CDP connection, not Chrome itself
314
+ }
315
+ }
316
+ const NEEDS_INTERACTIVE_HINT = "Start CDP Chrome: ~/.claude/hub/SCRIPTS/browser.sh cdp start headless";
317
+ /**
318
+ * Get accessibility tree (gateway preferred, CDP fallback lists interactive elements).
319
+ * The @eN ref model only exists in the gateway; under CDP the @eN labels are
320
+ * display-only numbering, so click()/fill() need a CSS selector there.
321
+ */
107
322
  export async function getTree(limit = 100) {
108
- await ensureGateway();
109
- return gatewayRequest("/tree", { limit: String(limit) });
110
- }
111
- /** Click element by ref (gateway only) */
112
- export async function click(ref) {
113
- await ensureGateway();
114
- return gatewayRequest("/click", { ref });
115
- }
116
- /** Fill input (gateway only) */
117
- export async function fill(ref, value) {
118
- await ensureGateway();
119
- await gatewayRequest("/fill", { ref, value });
120
- }
121
- /** Type text (gateway only) */
122
- export async function type(ref, text) {
123
- await ensureGateway();
124
- await gatewayRequest("/type", { ref, text });
125
- }
126
- /** Press key (gateway only) */
127
- export async function press(key, ref) {
128
- await ensureGateway();
129
- await gatewayRequest("/press", ref ? { key, ref } : { key });
130
- }
131
- /** Scroll page (gateway only) */
323
+ if (await isGatewayRunning()) {
324
+ return gatewayRequest("/tree", { limit: String(limit) });
325
+ }
326
+ if (await isCDPAvailable()) {
327
+ return withCdpPage(async (page) => {
328
+ const elements = await page.$$eval("a, button, input, select, textarea, [role=button], [role=link]", (els, max) => els.slice(0, max).map((el, i) => {
329
+ const tag = el.tagName.toLowerCase();
330
+ const text = (el.textContent || "").trim().slice(0, 60);
331
+ const id = el.id ? `#${el.id}` : "";
332
+ const name = el.name
333
+ ? `[name=${el.name}]`
334
+ : "";
335
+ return `@e${i + 1} <${tag}${id}${name}> "${text}"`;
336
+ }), limit);
337
+ return { tree: elements, total: elements.length };
338
+ });
339
+ }
340
+ throw new Error(`[browser-manager] getTree requires gateway or CDP. ${NEEDS_INTERACTIVE_HINT}`);
341
+ }
342
+ /**
343
+ * Click an element. Accepts a gateway ref (@eN → "eN") when gateway is
344
+ * running, or a CSS selector when only CDP is available.
345
+ */
346
+ export async function click(refOrSelector) {
347
+ if (await isGatewayRunning()) {
348
+ return gatewayRequest("/click", { ref: refOrSelector });
349
+ }
350
+ if (await isCDPAvailable()) {
351
+ return withCdpPage(async (page) => {
352
+ await page.click(refOrSelector, { timeout: 10_000 });
353
+ return { title: await page.title(), url: page.url() };
354
+ });
355
+ }
356
+ throw new Error(`[browser-manager] click() requires gateway or CDP. ${NEEDS_INTERACTIVE_HINT}`);
357
+ }
358
+ /** Fill an input. refOrSelector semantics match click(). */
359
+ export async function fill(refOrSelector, value) {
360
+ if (await isGatewayRunning()) {
361
+ await gatewayRequest("/fill", { ref: refOrSelector, value });
362
+ return;
363
+ }
364
+ if (await isCDPAvailable()) {
365
+ await withCdpPage(async (page) => {
366
+ await page.fill(refOrSelector, value, { timeout: 10_000 });
367
+ });
368
+ return;
369
+ }
370
+ throw new Error(`[browser-manager] fill() requires gateway or CDP. ${NEEDS_INTERACTIVE_HINT}`);
371
+ }
372
+ /** Type text character-by-character (for inputs that reject page.fill). */
373
+ export async function type(refOrSelector, text) {
374
+ if (await isGatewayRunning()) {
375
+ await gatewayRequest("/type", { ref: refOrSelector, text });
376
+ return;
377
+ }
378
+ if (await isCDPAvailable()) {
379
+ await withCdpPage(async (page) => {
380
+ await page.type(refOrSelector, text, { timeout: 10_000 });
381
+ });
382
+ return;
383
+ }
384
+ throw new Error(`[browser-manager] type() requires gateway or CDP. ${NEEDS_INTERACTIVE_HINT}`);
385
+ }
386
+ /** Press a keyboard key (page-level if no ref, element-level with ref). */
387
+ export async function press(key, refOrSelector) {
388
+ if (await isGatewayRunning()) {
389
+ await gatewayRequest("/press", refOrSelector ? { key, ref: refOrSelector } : { key });
390
+ return;
391
+ }
392
+ if (await isCDPAvailable()) {
393
+ await withCdpPage(async (page) => {
394
+ if (refOrSelector) {
395
+ await page.locator(refOrSelector).press(key, { timeout: 10_000 });
396
+ }
397
+ else {
398
+ await page.keyboard.press(key);
399
+ }
400
+ });
401
+ return;
402
+ }
403
+ throw new Error(`[browser-manager] press() requires gateway or CDP. ${NEEDS_INTERACTIVE_HINT}`);
404
+ }
405
+ /** Scroll page. CDP fallback uses window.scrollBy. */
132
406
  export async function scroll(direction, amount = 600) {
133
- await ensureGateway();
134
- return gatewayRequest("/scroll", { direction, amount: String(amount) });
407
+ if (await isGatewayRunning()) {
408
+ return gatewayRequest("/scroll", { direction, amount: String(amount) });
409
+ }
410
+ if (await isCDPAvailable()) {
411
+ return withCdpPage(async (page) => {
412
+ const delta = direction === "up" ? -amount :
413
+ direction === "top" ? -1e9 :
414
+ direction === "bottom" ? 1e9 :
415
+ amount;
416
+ await page.evaluate((d) => window.scrollBy(0, d), delta);
417
+ return { title: await page.title(), url: page.url() };
418
+ });
419
+ }
420
+ throw new Error(`[browser-manager] scroll() requires gateway or CDP. ${NEEDS_INTERACTIVE_HINT}`);
135
421
  }
136
- /** Evaluate JS (gateway only) */
422
+ /** Evaluate JS in the page context. */
137
423
  export async function evaluate(js) {
138
- await ensureGateway();
139
- const result = await gatewayRequest("/eval", { js });
140
- return result.result;
424
+ if (await isGatewayRunning()) {
425
+ const result = await gatewayRequest("/eval", { js });
426
+ return result.result;
427
+ }
428
+ if (await isCDPAvailable()) {
429
+ return withCdpPage(async (page) => {
430
+ // `js` is a raw expression string, so wrap it in a function (via
431
+ // `new Function`) whose body returns the expression; Playwright runs it in the page.
432
+ return page.evaluate(new Function(`return (${js})`));
433
+ });
434
+ }
435
+ throw new Error(`[browser-manager] evaluate() requires gateway or CDP. ${NEEDS_INTERACTIVE_HINT}`);
141
436
  }
142
437
  /** Generate PDF from URL */
143
438
  export async function pdf(url) {
@@ -154,8 +449,16 @@ export async function close() {
154
449
  gatewayProcess = null;
155
450
  }
156
451
  }
157
- /** Get current page info (gateway) */
452
+ /** Get current page info (gateway preferred, CDP fallback reads newest page). */
158
453
  export async function info() {
159
- await ensureGateway();
160
- return gatewayRequest("/info");
454
+ if (await isGatewayRunning()) {
455
+ return gatewayRequest("/info");
456
+ }
457
+ if (await isCDPAvailable()) {
458
+ return withCdpPage(async (page) => ({
459
+ title: await page.title(),
460
+ url: page.url(),
461
+ }));
462
+ }
463
+ throw new Error(`[browser-manager] info() requires gateway or CDP. ${NEEDS_INTERACTIVE_HINT}`);
161
464
  }
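Reviewer sketch: every interactive operation in the hunk above (`getTree`, `click`, `fill`, `type`, `press`, `scroll`, `evaluate`, `info`) repeats the same shape: probe the gateway, then probe CDP, then throw with a hint. A minimal illustration of that shape follows. `runWithFallback` and `demoBackends` are hypothetical names that do not exist in alvin-bot, and the backends are stubs rather than real gateway or Playwright calls.

```javascript
// Hypothetical helper, not part of alvin-bot: try backends in order and
// run the first one whose availability probe resolves to true.
async function runWithFallback(opName, backends, hint = "") {
  for (const backend of backends) {
    if (await backend.isAvailable()) {
      return backend.run();
    }
  }
  // Same error shape as the throw statements in the hunk above.
  throw new Error(`[browser-manager] ${opName}() requires gateway or CDP. ${hint}`.trim());
}

// Stub backends for illustration only; real code would call
// isGatewayRunning()/isCDPAvailable() and the matching request paths.
function demoBackends(gatewayUp, cdpUp) {
  return [
    { name: "gateway", isAvailable: async () => gatewayUp, run: async () => ({ via: "gateway" }) },
    { name: "cdp", isAvailable: async () => cdpUp, run: async () => ({ via: "cdp" }) },
  ];
}
```

With both probes false the helper rejects, which matches how the diffed functions surface `NEEDS_INTERACTIVE_HINT` to the caller instead of silently starting a gateway.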
@@ -122,11 +122,16 @@ async function executeJob(job) {
122
122
  }
123
123
  case "shell": {
124
124
  const cmd = job.payload.command || "echo 'no command'";
125
- const output = execSync(cmd, {
126
- timeout: 60_000,
125
+ // Per-job timeout; the default is no timeout (execSync treats a timeout
126
+ // of 0 or undefined as unlimited). Users opt in via /cron add … --timeout N.
127
+ const shellOpts = {
127
128
  stdio: "pipe",
128
129
  env: { ...process.env, PATH: process.env.PATH + ":/opt/homebrew/bin:/usr/local/bin" },
129
- }).toString().trim();
130
+ };
131
+ if (typeof job.timeoutMs === "number" && job.timeoutMs > 0) {
132
+ shellOpts.timeout = job.timeoutMs;
133
+ }
134
+ const output = execSync(cmd, shellOpts).toString().trim();
130
135
  // Notify with output
131
136
  if (notifyCallback && output) {
132
137
  await notifyCallback(job.target, `🔧 ${job.name}\n\`\`\`\n${output.slice(0, 3000)}\n\`\`\``);
@@ -173,14 +178,20 @@ async function executeJob(job) {
173
178
  ? Number(job.target.chatId)
174
179
  : undefined;
175
180
  const result = await new Promise((resolve, reject) => {
176
- spawnSubAgent({
181
+ // Only pass `timeout` through when the job has a per-job value.
182
+ // Otherwise the sub-agent inherits the current /subagents default.
183
+ const spawnConfig = {
177
184
  name: job.name,
178
185
  prompt,
179
186
  workingDir: BOT_ROOT,
180
187
  source: "cron",
181
188
  parentChatId,
182
189
  onComplete: (r) => resolve(r),
183
- }).catch(reject);
190
+ };
191
+ if (typeof job.timeoutMs === "number") {
192
+ spawnConfig.timeout = job.timeoutMs;
193
+ }
194
+ spawnSubAgent(spawnConfig).catch(reject);
184
195
  });
185
196
  // Non-success: don't notify here. The I3 delivery router has
186
197
  // already posted the appropriate banner (cancelled / timeout /
@@ -309,6 +320,7 @@ export function createJob(input) {
309
320
  nextRunAt: null,
310
321
  runCount: 0,
311
322
  createdBy: input.createdBy || "unknown",
323
+ ...(typeof input.timeoutMs === "number" ? { timeoutMs: input.timeoutMs } : {}),
312
324
  };
313
325
  // Calculate first run
314
326
  job.nextRunAt = calculateNextRun(job);
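Reviewer sketch: the shell-job change above only sets `timeout` when the job carries a positive numeric `timeoutMs`. The standalone function below isolates that rule so it can be checked without running a command. `buildShellOptions` is a hypothetical name, and the `env` handling is simplified relative to the PATH extension in the hunk.

```javascript
// Hypothetical helper mirroring the option-building rule in the hunk above.
function buildShellOptions(job, baseEnv = process.env) {
  const opts = {
    stdio: "pipe",
    env: { ...baseEnv },
  };
  // Only a positive number arms a timeout; 0, negatives, strings and
  // missing values leave the key unset, so the command may run unbounded.
  if (typeof job.timeoutMs === "number" && job.timeoutMs > 0) {
    opts.timeout = job.timeoutMs;
  }
  return opts;
}
```

The returned object can be passed as the second argument to `execSync` from `node:child_process`. Leaving the key out entirely, rather than writing `timeout: undefined`, keeps the options object identical to the pre-change behaviour minus the fixed 60-second limit.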
@@ -21,6 +21,14 @@ let configCache = null;
21
21
  function isValidVisibility(v) {
22
22
  return v === "auto" || v === "banner" || v === "silent" || v === "live";
23
23
  }
24
+ /** Resolve the initial default timeout from config.ts, which itself seeds
25
+ * from the SUBAGENT_TIMEOUT env var. -1 = unlimited. */
26
+ function seedDefaultTimeout() {
27
+ const raw = config.subAgentTimeout;
28
+ if (typeof raw !== "number" || !Number.isFinite(raw) || raw <= 0)
29
+ return -1;
30
+ return Math.floor(raw);
31
+ }
24
32
  function loadSubAgentsConfig() {
25
33
  if (configCache)
26
34
  return configCache;
@@ -33,14 +41,18 @@ function loadSubAgentsConfig() {
33
41
  queueCap: typeof parsed.queueCap === "number"
34
42
  ? Math.max(0, Math.min(Math.floor(parsed.queueCap), ABSOLUTE_MAX_QUEUE))
35
43
  : DEFAULT_QUEUE_CAP,
44
+ defaultTimeoutMs: typeof parsed.defaultTimeoutMs === "number" && Number.isFinite(parsed.defaultTimeoutMs)
45
+ ? (parsed.defaultTimeoutMs <= 0 ? -1 : Math.floor(parsed.defaultTimeoutMs))
46
+ : seedDefaultTimeout(),
36
47
  };
37
48
  }
38
49
  catch {
39
- // File missing or invalid — seed from env var then default to auto
50
+ // File missing or invalid — seed from env vars then default to auto/unlimited
40
51
  configCache = {
41
52
  maxParallel: Number(process.env.MAX_SUBAGENTS) || 0,
42
53
  visibility: "auto",
43
54
  queueCap: DEFAULT_QUEUE_CAP,
55
+ defaultTimeoutMs: seedDefaultTimeout(),
44
56
  };
45
57
  }
46
58
  return configCache;
@@ -102,6 +114,18 @@ export function setQueueCap(n) {
102
114
  saveSubAgentsConfig({ ...cfg, queueCap: clamped });
103
115
  return clamped;
104
116
  }
117
+ /** Current default timeout in ms. -1 = unlimited. */
118
+ export function getDefaultTimeoutMs() {
119
+ return loadSubAgentsConfig().defaultTimeoutMs;
120
+ }
121
+ /** Set the default timeout in ms. Any value ≤ 0 or non-finite collapses
122
+ * to -1 (unlimited). Returns the persisted value. */
123
+ export function setDefaultTimeoutMs(ms) {
124
+ const normalized = !Number.isFinite(ms) || ms <= 0 ? -1 : Math.floor(ms);
125
+ const cfg = loadSubAgentsConfig();
126
+ saveSubAgentsConfig({ ...cfg, defaultTimeoutMs: normalized });
127
+ return normalized;
128
+ }
105
129
  // ── State ───────────────────────────────────────────────
106
130
  const activeAgents = new Map();
107
131
  // ── Name resolver (B2) ──────────────────────────────────
@@ -433,14 +457,23 @@ export function spawnSubAgent(agentConfig) {
433
457
  const resolved = resolveAgentName(agentConfig.name);
434
458
  const resolvedName = resolved.name;
435
459
  const id = crypto.randomUUID();
436
- const timeout = agentConfig.timeout ?? config.subAgentTimeout;
460
+ // Timeout resolution order:
461
+ // 1. Per-spawn override (agentConfig.timeout) — used by cron jobs that
462
+ // carry their own timeoutMs.
463
+ // 2. Runtime default from sub-agents.json (set via /subagents timeout).
464
+ // 3. config.subAgentTimeout fallback (seeded from SUBAGENT_TIMEOUT env).
465
+ // Any value ≤ 0 means "no timeout" — we simply don't arm the abort timer.
466
+ // The existing null-safe `clearTimeout(timeoutId)` call sites make this
467
+ // a safe no-op when the agent finishes or is cancelled.
468
+ const timeout = agentConfig.timeout ?? getDefaultTimeoutMs();
437
469
  const abort = new AbortController();
438
- const timeoutId = setTimeout(() => abort.abort(), timeout);
470
+ const timeoutId = timeout > 0 ? setTimeout(() => abort.abort(), timeout) : null;
439
471
  const willRunImmediately = running < maxParallel;
440
472
  const canQueue = !willRunImmediately && queueCap > 0 && queuedLen < queueCap;
441
473
  if (!willRunImmediately && !canQueue) {
442
474
  // No slot, no queue room → priority-aware reject
443
- clearTimeout(timeoutId);
475
+ if (timeoutId)
476
+ clearTimeout(timeoutId);
444
477
  const source = sourceOf(agentConfig);
445
478
  const runningAgents = [...activeAgents.values()].filter((a) => a.info.status === "running");
446
479
  const userSlots = runningAgents.filter((a) => a.info.source === "user").length;
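Reviewer sketch: the sub-agent hunks above combine three small rules: stored defaults are normalized so that anything non-finite or at most 0 becomes -1, a per-spawn value wins over the runtime default via `??`, and the abort timer is armed only for positive values. The functions below restate those rules in isolation. The names `normalizeTimeoutMs`, `resolveTimeoutMs` and `armAbortTimer` are hypothetical and are not exports of alvin-bot.

```javascript
// Same normalization as setDefaultTimeoutMs(): non-finite or <= 0 means unlimited (-1).
function normalizeTimeoutMs(ms) {
  return !Number.isFinite(ms) || ms <= 0 ? -1 : Math.floor(ms);
}

// Per-spawn override first, runtime default second. `??` only falls
// through on null/undefined, so an explicit 0 override is kept as-is.
function resolveTimeoutMs(perSpawn, runtimeDefault) {
  return perSpawn ?? runtimeDefault;
}

// Arm the abort timer only for positive timeouts; otherwise return null.
function armAbortTimer(timeoutMs, abortController) {
  return timeoutMs > 0 ? setTimeout(() => abortController.abort(), timeoutMs) : null;
}
```

One consequence worth noting: because the cron path forwards any numeric `timeoutMs`, a job stored with `timeoutMs: 0` resolves to 0 here and therefore runs without a timer, instead of inheriting the `/subagents` default.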
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "alvin-bot",
3
- "version": "4.8.7",
3
+ "version": "4.8.9",
4
4
  "description": "Alvin Bot — Your personal AI agent on Telegram, WhatsApp, Discord, Signal, and Web.",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
@@ -1,136 +1,161 @@
1
1
  ---
2
2
  name: Browser Automation
3
- description: Interactive browser control — navigate, click, fill forms, screenshot, test web apps
4
- triggers: browse, browser, test webapp, test app, test website, screenshot page, interact with, click on, fill form, visual test, qa test, check page, open page, test my app, browse to, open url, puppeteer, playwright, browser automation, test die seite, teste die app, schau dir an, öffne die seite, teste mal, visual check, check the ui, check the page
3
+ description: 3-tier browser control — stealth scraping, CDP with persistent cookies, visual oversight. Navigate, screenshot, extract text, interact with logged-in pages.
4
+ triggers: browse, browser, test webapp, test app, test website, screenshot page, interact with, click on, fill form, visual test, qa test, check page, open page, test my app, browse to, open url, puppeteer, playwright, browser automation, linkedin, stepstone, indeed, scrape, fetch page, crawl, teste die seite, teste die app, schau dir an, öffne die seite, teste mal, visual check, check the ui, check the page, webseite öffnen, seite abrufen
5
5
  priority: 8
6
6
  category: automation
7
7
  ---
8
8
 
9
- # Browser Automation — Playwright Interactive
9
+ # Browser Automation — 3-Tier Router
10
10
 
11
- ## Browser Strategies
11
+ You have three browser strategies plus WebFetch. **Pick the cheapest tier that fits** and escalate only when necessary.
12
12
 
13
- Alvin Bot auto-selects the best browser approach:
13
+ ## Decision rule (in this order)
14
14
 
15
- | Strategy | When | How |
16
- |----------|------|-----|
17
- | **CLI** (default) | Simple screenshots, text extraction, PDF | Headless Playwright, one-shot |
18
- | **HTTP Gateway** | Interactive browsing, form-filling, QA testing | Persistent browser server on port 3800 |
19
- | **CDP** | Attach to user's Chrome (with login state) | Chrome DevTools Protocol via CDP_URL |
15
+ | Task | Tool | Why |
16
+ |------|------|-------|
17
+ | Single public page, text only | WebFetch or `curl` | Fastest, no browser engine |
18
+ | Public page with JS / Cloudflare | **Tier 1 Stealth** | Headless + fingerprint masking |
19
+ | Login-gated page (LinkedIn, Gmail, …) | **Tier 2 CDP** | Real Chrome, persistent cookies |
20
+ | Complex multi-step flow, user should watch | **Tier 3 Extension** | Visual control |
20
21
 
21
- The gateway starts automatically when needed and shuts down after 5 min idle.
22
- For CDP: Launch Chrome with `--remote-debugging-port=9222` and set `CDP_URL=http://localhost:9222`.
22
+ **NEVER** use `scripts/browse-server.cjs`; it no longer exists. **NEVER** run a bare `node -e "const {chromium}…"` against external sites; it gets blocked immediately.
23
23
 
24
24
  ---
25
25
 
26
- You have a persistent Playwright browser server that gives you **eyes** and **hands** to interact with web pages. You can navigate, see screenshots, read the accessibility tree, click buttons, fill forms, and test running web apps.
26
+ ## Tier 0 — WebFetch / curl (fastest path)
27
27
 
28
- ## Quick Start
28
+ For static pages or APIs that do not need JS rendering:
29
29
 
30
30
  ```bash
31
- # 1. Ensure server is running (auto-shuts down after 5 min idle)
32
- curl -s http://127.0.0.1:3800/health 2>/dev/null | grep -q '"ok":true' || \
33
- (BOT_DIR=$(node -e "console.log(require('path').resolve(require.resolve('alvin-bot/package.json'), '..'))" 2>/dev/null || echo ".") && cd "$BOT_DIR" && node scripts/browse-server.cjs &) && sleep 3
31
+ # Direct curl
32
+ curl -sL -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36" \
33
+ "https://www.michaelpage.de/jobs/it-director" | htmlq -t "h1, .job-title"
34
34
 
35
- # 2. Navigate to a page
36
- curl -s "http://127.0.0.1:3800/navigate?url=https://example.com" | jq
35
+ # Or use the WebFetch tool, if available (it interprets the content directly)
36
+ ```
37
37
 
38
- # 3. Take a screenshot (view it with Read tool)
39
- SHOT=$(curl -s "http://127.0.0.1:3800/screenshot" | jq -r '.path')
40
- # Then use Read tool on $SHOT to see the image
38
+ If that returns a 403 or a captcha, escalate to Tier 1.
41
39
 
42
- # 4. Get interactive elements
43
- curl -s "http://127.0.0.1:3800/tree" | jq '.tree[]' -r
40
+ ---
44
41
 
45
- # 5. Click something
46
- curl -s "http://127.0.0.1:3800/click?ref=e5" | jq
47
- ```
42
+ ## Tier 1 — Playwright Stealth (headless, fast, masked)
43
+
44
+ **Router script:** `~/.claude/hub/SCRIPTS/browser.sh`
45
+
46
+ ```bash
47
+ # Load the page, get JSON metadata back (title, url, html_length)
48
+ ~/.claude/hub/SCRIPTS/browser.sh stealth "https://www.stepstone.de/jobs/it-delivery"
48
49
 
49
- ## All Routes
50
-
51
- | Route | Params | What it does |
52
- |-------|--------|-------------|
53
- | `/navigate` | `url` | Open a URL, returns title + accessibility tree |
54
- | `/screenshot` | `full=true` (optional) | Take screenshot, returns file path |
55
- | `/tree` | `limit=N` (optional) | Get all interactive elements with @eN refs |
56
- | `/click` | `ref=eN` | Click element by ref |
57
- | `/fill` | `ref=eN`, `value=text` | Fill input field |
58
- | `/type` | `ref=eN`, `text=chars` | Type character by character (for special inputs) |
59
- | `/press` | `key=Enter`, `ref=eN` (opt) | Press keyboard key |
60
- | `/select` | `ref=eN`, `value=opt` | Select dropdown option |
61
- | `/hover` | `ref=eN` | Hover over element |
62
- | `/scroll` | `direction=down/up/top/bottom`, `amount=600` | Scroll page |
63
- | `/eval` | `js=expression` | Run JavaScript on page |
64
- | `/wait` | `ms=2000` or `selector=.class` | Wait for time or element |
65
- | `/viewport` | `device=mobile/tablet` or `width=W&height=H` | Change viewport |
66
- | `/cookies` | `set=[{...}]` (optional) | Get or set cookies |
67
- | `/back` | — | Browser back |
68
- | `/forward` | — | Browser forward |
69
- | `/reload` | — | Reload page |
70
- | `/network` | `limit=20` | Recent network requests |
71
- | `/info` | — | Current page info |
72
- | `/close` | — | Close browser + shutdown server |
73
- | `/health` | — | Server status check |
74
-
75
- ## Element Refs (@eN)
76
-
77
- The accessibility tree assigns **refs** like `@e1`, `@e2`, `@e3` to every interactive element (links, buttons, inputs, etc.). Use these refs for all interactions — they're more robust than CSS selectors.
78
-
79
- Example tree:
50
+ # With a screenshot (PNG in ~/.claude/hub/BROWSER/screenshots/)
51
+ ~/.claude/hub/SCRIPTS/browser.sh stealth "https://example.com" --screenshot=page.png
80
52
  ```
81
- @e1 <a href="/"> "Home"
82
- @e2 <a href="/dashboard"> "Dashboard"
83
- @e3 <input type="email" name="email" placeholder="Enter email">
84
- @e4 <input type="password" name="password" placeholder="Password">
85
- @e5 <button> "Sign In"
86
- @e6 <a href="/forgot"> "Forgot password?"
53
+
54
+ **What you get:** JSON with `{title, url, html_length, screenshot}`. The full HTML is not written to stdout; to parse it, import `stealth.js` directly as a module or read the `/tmp/` file.
55
+
56
+ **When this gets blocked:** reCAPTCHA v3, aggressive Cloudflare, login walls.
57
+
58
+ **Concrete working targets (as of 2026):**
59
+ - StepStone (all job searches) ✅
60
+ - Michael Page ✅
61
+ - Hays ✅
62
+ - Public blog posts, news sites ✅
63
+ - LinkedIn (without login) ❌ → Tier 2
64
+ - Indeed / Glassdoor ❌ (403 scraping block) → only via e-mail alerts
65
+
66
+ ---
67
+
68
+ ## Tier 2 — Chrome CDP (persistent profile, real cookies)
69
+
70
+ Real Chrome with its profile at `~/.claude/hub/BROWSER/profile/`. Login cookies for LinkedIn, Gmail, etc. persist across sessions.
71
+
72
+ ```bash
73
+ # Start once (checks whether it is already running)
74
+ ~/.claude/hub/SCRIPTS/browser.sh cdp start headless # headless: for cron/daemon
75
+ ~/.claude/hub/SCRIPTS/browser.sh cdp start headful # visible: when the user should watch
76
+
77
+ # Navigate
78
+ ~/.claude/hub/SCRIPTS/browser.sh cdp goto "https://www.linkedin.com/jobs/search/?keywords=IT+Director"
79
+
80
+ # Screenshot
81
+ ~/.claude/hub/SCRIPTS/browser.sh cdp shot "https://www.linkedin.com/feed/" linkedin_feed.png
82
+
83
+ # List tabs
84
+ ~/.claude/hub/SCRIPTS/browser.sh cdp tabs
85
+
86
+ # Stop (usually unnecessary, Chrome keeps running)
87
+ ~/.claude/hub/SCRIPTS/browser.sh cdp stop
87
88
  ```
88
89
 
89
- To login:
90
+ **Login setup (one-time):** If LinkedIn is logged out, ask Ali via Telegram:
91
+ > "Please log in to LinkedIn once in Chrome (hub profile). The cookies will then persist."
92
+
93
+ Start with `cdp start headful` and Chrome opens visibly → Ali logs in → from then on the cookies stay in the profile.
94
+
95
+ **How to test whether you are logged in:** check the URL after `cdp goto`. If the path contains `/authwall` or `/login`, you are logged out.
96
+
97
+ ---
98
+
99
+ ## Tier 3 — Claude-in-Chrome Extension (visual control)
100
+
101
+ Only in interactive CLI sessions, not in cron/daemon runs.
102
+
90
103
  ```bash
91
- curl -s "http://127.0.0.1:3800/fill?ref=e3&value=user@example.com"
92
- curl -s "http://127.0.0.1:3800/fill?ref=e4&value=mypassword"
93
- curl -s "http://127.0.0.1:3800/click?ref=e5"
104
+ # Check whether the extension is connected
105
+ ~/.claude/hub/SCRIPTS/browser.sh ext check
106
+
107
+ # Then load the MCP tools via ToolSearch:
108
+ # mcp__claude-in-chrome__tabs_context_mcp
109
+ # mcp__claude-in-chrome__navigate
110
+ # mcp__claude-in-chrome__computer
94
111
  ```
95
112
 
96
- ## Standard Workflow: Test a Web App
113
+ **When to use:** drag and drop, complex UI, or when the user should watch live and be able to step in.
97
114
 
98
- 1. **Start** the browse server if not running
99
- 2. **Navigate** to the app URL
100
- 3. **Screenshot** → view with Read tool to see current state
101
- 4. **Tree** → see all interactive elements
102
- 5. **Interact** (click, fill, press) using @eN refs
103
- 6. **Screenshot** again to verify the result
104
- 7. **Repeat** for each test step
105
- 8. **Report** findings to the user
106
- 9. **Close** when done
115
+ ---
107
116
 
108
- ## Mobile Testing
117
+ ## Escalation rule (MANDATORY)
118
+
119
+ ```
120
+ Public text page → Tier 0 (WebFetch/curl)
121
+ ↓ 403/Cloudflare/empty HTML?
122
+ Tier 1 (stealth) → browser.sh stealth <url>
123
+ ↓ Captcha/login wall?
124
+ Tier 2 (CDP) → cdp start headless/headful + cdp goto <url>
125
+ ↓ Cookies missing?
126
+ Ask Ali: "Please log in to X once in Chrome, then I can continue."
127
+ ```
128
+
129
+ **NEVER give up with "browser doesn't work"**; there is always a next step. It is better to report honestly "Tier 1 is blocked by a captcha, trying Tier 2" than "Failed to load".
130
+
131
+ ## Status checks
109
132
 
110
133
  ```bash
111
- # Switch to mobile viewport
112
- curl -s "http://127.0.0.1:3800/viewport?device=mobile"
113
- curl -s "http://127.0.0.1:3800/screenshot" | jq -r '.path'
114
- # Switch back to desktop
115
- curl -s "http://127.0.0.1:3800/viewport?width=1280&height=720"
134
+ # Overview of all tiers + health
135
+ ~/.claude/hub/SCRIPTS/browser.sh status
136
+
137
+ # Is CDP Chrome currently listening on port 9222?
138
+ curl -s http://127.0.0.1:9222/json/version | head -c 200
116
139
  ```
117
140
 
118
- ## Auth / Cookie Injection
141
+ ## Viewing screenshot output
142
+
143
+ Screenshots are saved under `~/.claude/hub/BROWSER/screenshots/` (for relative names) or at the absolute path you pass. Using the Read tool on the path shows you the image directly.
144
+
145
+ ## Interactive ops (clicking, filling forms)
146
+
147
+ For simple cases: `cdp eval` with JavaScript that runs inside the page:
119
148
 
120
- For pages that need authentication:
121
149
  ```bash
122
- # Set cookies manually
123
- curl -s 'http://127.0.0.1:3800/cookies?set=[{"name":"session","value":"abc123","domain":"example.com","path":"/"}]'
124
- # Then navigate to the authenticated page
125
- curl -s "http://127.0.0.1:3800/navigate?url=https://example.com/dashboard"
150
+ ~/.claude/hub/SCRIPTS/browser.sh cdp eval "https://example.com/login" \
151
+ "document.querySelector('#username').value='test'; document.querySelector('#password').value='pw'; document.querySelector('form').submit();"
126
152
  ```
127
153
 
128
- ## Important Notes
154
+ For more complex flows (sequential clicks after DOM updates) → use Tier 3 (Extension).
155
+
156
+ ## Important notes
129
157
 
130
- - **Server auto-shuts down** after 5 min idle; restart if needed
131
- - **One page at a time**: navigation replaces the current page
132
- - **Screenshots** are saved to `/tmp/alvin-bot/browse/`; view with Read tool
133
- - **127.0.0.1 only**: not accessible from outside
134
- - **URL-encode** values with special chars: `value=hello%20world`
135
- - **Refs reset** on every navigation/click — always get fresh /tree after page changes
136
- - For **local dev servers**: use `http://localhost:PORT` as the URL
158
+ - **CDP profile conflict:** Chrome cannot open `~/.claude/hub/BROWSER/profile/` twice. If Ali had it open locally, check port 9222 and run `cdp stop` followed by `cdp start`.
159
+ - **Headless vs headful:** In cron/daemon (launchd) runs, ALWAYS use `headless`; otherwise Chrome fails for lack of a display.
160
+ - **After page navigation** (`cdp goto`): Playwright opens a new tab by default; `reuseTab` is not exposed. That is fine for one-off scrapes but can lead to a tab explosion. `cdp stop` plus a restart cleans up.
161
+ - **Persistence:** Cookies, LocalStorage, IndexedDB all live in `~/.claude/hub/BROWSER/profile/` and persist fully across bot restarts.
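Reviewer sketch: the rewritten skill describes two checks that are easy to express in code: the login test (a path containing `/authwall` or `/login` means logged out) and the escalation ladder that moves to the next tier instead of giving up. The snippet below is only an illustration under stated assumptions. `looksLoggedOut` and `escalate` are hypothetical names, and each tier is a stub object with a `run(url)` function rather than a real call to `browser.sh`.

```javascript
// Hypothetical helper: true when the final URL's path suggests a login wall.
function looksLoggedOut(finalUrl) {
  const { pathname } = new URL(finalUrl);
  return pathname.includes("/authwall") || pathname.includes("/login");
}

// Hypothetical ladder: each tier is { name, run(url) } and either resolves
// to an object with a `url` field or throws when the tier is blocked.
async function escalate(url, tiers) {
  const attempts = [];
  for (const tier of tiers) {
    try {
      const result = await tier.run(url);
      if (!looksLoggedOut(result.url)) {
        return { ok: true, tier: tier.name, result, attempts };
      }
      attempts.push(`${tier.name}: login wall`);
    } catch (err) {
      attempts.push(`${tier.name}: ${err.message}`);
    }
  }
  // Every tier failed: report what was tried instead of a bare failure.
  return { ok: false, tier: null, attempts };
}
```

The `attempts` list is what makes the "report honestly which tier blocked" rule practical: the caller can pass it on verbatim, and an `ok: false` result is the point at which the skill says to ask the user to log in.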