npm - @pa1nd/horse-browser - Versions diffs - 0.4.0 → 0.6.0 - Mend

@pa1nd/horse-browser 0.4.0 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/SKILL.md CHANGED

@@ -42,6 +42,37 @@ If `horse-browser` isn't on your PATH, the one-time setup hasn't been run — te
 to run the repo's `./install.sh` (fetches the browser, registers the launcher). Don't
 attempt setup yourself.
+## Input — use trusted, real events
+Drive clicks and typing with **`click(css)`** and **`type_into(css, text)`** (or
+`type_text(text)` into the already-focused field). They fire the same key/mouse events a
+real browser generates, so the page's `keyup`/`input`/`mousedown` listeners actually run —
+submit buttons enable, autocompletes fire, React/Vue state updates, menus open. **Never
+drive a form with `el.value = …` or `el.click()` in `js(...)`**: `el.value = …` fires *no*
+events and `el.click()` fires only `click` (no mousedown/mouseup/pointer chain), so the
+value or click *looks* applied while the page's logic never ran (disabled submit, dead
+dropdown, stale state). This is **correctness, not just bot-evasion** — plain sites break
+too; the anti-detection win rides along free.
+- `click(css)` — trusted mousedown→mouseup→click (+pointer). `click_xy(x, y)` for coords /
+  shadow DOM / cross-iframe (CDP input passes through iframes).
+- `type_into(css, text, clear=?, enter=?)` — real per-char keys, fast. `type_text(text)`
+  types the focused element. `press("Enter"|"Tab"|"Escape"|"Arrow…")` for a named key.
+- Escape hatch: `insert_text_fast(text)` dumps via insertText (no key events) — only for a
+  plain `<textarea>` with no listeners where speed matters.
+- Fast untrusted (`js("el.click()")`) is fine on trivial internal/dev pages, but **always**
+  use trusted input on any **login / signup / checkout**, anything behind a **bot vendor**
+  (Akamai / PerimeterX / DataDome / Cloudflare / hCaptcha / reCAPTCHA), or **after any
+  challenge appeared**.
+### Easy challenges: solve them, don't halt
+Many "captchas" are just a gesture — **click a checkbox, press-&-hold, slide-to-verify**.
+Do them; don't escalate. With a real fingerprint (always-on) plus a trusted click, the easy
+ones usually clear. Call **`solve_challenge()`** — it classifies the challenge and solves the
+easy kind, or returns `escalate:<why>` for the **perception** kind (identify images, read
+distorted text, rotate, audio) — and *only those* go to the operator. The gesture verbs:
+`press_hold(css, seconds)` and `drag(css, dx=… / to=(x,y))`.
 ## Extension
 Gives each Claude session its own coloured tab group (label = last 4 chars of `CLAUDE_CODE_SESSION_ID`; subagents inherit it and share the group). Keeps RAM and tab-strip clutter from bleeding across parallel sessions, and lets you reason about "my tabs" as a real set. `chrome.tabGroups` is extension-only — no CDP equivalent — which is why an extension exists at all.
@@ -78,7 +109,7 @@ self.listTabs(label: string) -> Promise<Tab[]>
 //   }
 ```
-**`bh_open` / `bh_list` / `bh_switch_tab` are pre-installed.** `install.sh` writes them into browser-harness's `agent-workspace/agent_helpers.py` (which auto-loads on every call), so they're available immediately — just call `bh_open(url)`.
+**`bh_open` / `bh_list` / `bh_switch_tab` are pre-installed.** `install.sh` writes them to browser-harness's `agent-workspace/horse_helpers.py`, loaded on every call via a small stub it adds once to `agent_helpers.py` (your own additions to that file are never touched), so they're available immediately — just call `bh_open(url)`.
 If `bh_open` is somehow undefined (a browser-harness checkout that never ran our `install.sh`), re-run `horse-browser`'s `install.sh` to install them — don't hand-roll your own; the focus-safe behaviour is subtle. You pass CDP `targetId`s only — the extension bridges to chrome `tabId`s internally.
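
As a rough illustration of the "solve them, don't halt" guidance added to SKILL.md in this hunk, an agent loop might branch on the prefix of the status string. The prefixes (`none`, `solved:`, `easy:`, `escalate:`) are taken from `solve_challenge()` in package/agent-input.py further down this diff; the `next_step` helper and its action labels are hypothetical and not part of the package.

```python
def next_step(status: str) -> str:
    """Map a solve_challenge() status string to a coarse next action.

    Hypothetical helper for illustration only; the action labels are assumptions.
    """
    if status == "none":
        return "continue"              # no challenge detected, carry on
    if status.startswith("escalate:"):
        return "ask_operator"          # perception challenge or failed gesture
    if status.startswith("solved:"):
        return "verify_then_continue"  # gesture performed; confirm it cleared
    if status.startswith("easy:"):
        return "manual_gesture"        # classified as easy, gesture left to the caller
    return "ask_operator"              # unknown status: be conservative


if __name__ == "__main__":
    for s in ("none", "solved:hold", "easy:checkbox", "escalate:image/text/audio challenge"):
        print(s.split(":")[0], "->", next_step(s))
```

Treating any unrecognised status as an escalation keeps the sketch on the safe side: the agent stops and asks instead of guessing.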

package/agent-helpers.py CHANGED

@@ -1,8 +1,10 @@
 # horse-browser helpers for browser-harness.
 #
-# install.sh appends these into browser-harness's agent-workspace/agent_helpers.py
-# (which auto-loads on every browser-harness call), so bh_open() is available on
-# the first run — no agent has to install the recipe by hand.
+# install.sh installs this file as <workspace>/horse_helpers.py and appends a
+# load-once stub to agent_helpers.py (which browser-harness auto-loads on every
+# call), so bh_open() is available on the first run — no agent has to install the
+# recipe by hand. Updates overwrite horse_helpers.py only; anything a user keeps
+# in agent_helpers.py itself is never touched.
 #
 # What they give you (pass CDP targetIds; the extension bridges to chrome tabIds):
 #   bh_open(url)         open a tab WITHOUT raising the browser over your macOS app,
@@ -213,3 +215,63 @@ def bh_open(url):
 def bh_list():
     return ext_call("listTabs", _session_id()) or [] if _session_id() else []
+# ── screenshots: a unique file per call ──────────────────────────────────────────
+# Stock capture_screenshot() defaults every call — from EVERY session's daemon — to
+# the ONE shared file <tmp>/shot.png. With concurrent sessions, a neighbour overwrites
+# it between this session writing the shot and the agent Reading the path it printed,
+# so the "screenshot" shows another session's tab. Same namespace-merge trick as cdp()
+# above: defining capture_screenshot here replaces the stock one everywhere. Each call
+# mints its own file (named after the session's daemon lane) and returns that path, so
+# the agent's existing print-the-path-then-Read flow just works.
+from browser_harness.helpers import capture_screenshot as _real_capture_screenshot
+from browser_harness import _ipc as _bh_ipc
+_hb_shots_swept = False
+def _hb_sweep_shots(max_age_s=86400):
+    # Unique names accumulate where the single shot.png didn't — drop day-old ones.
+    # Once per process, and only in processes that actually screenshot.
+    global _hb_shots_swept
+    if _hb_shots_swept:
+        return
+    _hb_shots_swept = True
+    import time
+    now = time.time()
+    try:
+        for e in os.scandir(_bh_ipc._TMP):
+            if e.name.startswith("shot-") and e.name.endswith(".png") and now - e.stat().st_mtime > max_age_s:
+                try: os.remove(e.path)
+                except OSError: pass
+    except OSError:
+        pass
+def capture_screenshot(path=None, full=False, max_dim=None):
+    """Save a PNG of the current viewport to a per-call unique file; returns the path.
+    Args as stock browser-harness: full=capture beyond viewport, max_dim=downscale."""
+    if path is None:
+        _hb_sweep_shots()
+        import tempfile
+        fd, path = tempfile.mkstemp(
+            prefix=f"shot-{os.environ.get('BU_NAME', 'default')}-",
+            suffix=".png", dir=str(_bh_ipc._TMP))
+        os.close(fd)
+    return _real_capture_screenshot(path, full=full, max_dim=max_dim)
+# ── Tier 2 trusted-input layer — shipped as the sibling horse_input.py ───────────
+# horse-browser splits its managed helpers by concern: THIS file drives tabs (focus-safe
+# open/switch/list, per-call screenshots); horse_input.py does trusted, correct INPUT
+# (real click/key events, easy-challenge gestures). We exec the sibling here so the single
+# "do not edit" loader stub in agent_helpers.py bootstraps both. `_hb_path` is the path to
+# THIS file, set by that stub; horse_input.py sits next to it. Missing/failed → skipped, so
+# tab-driving still works even if the input file didn't ship.
+try:
+    _hb_input = os.path.join(os.path.dirname(_hb_path), "horse_input.py")
+    exec(compile(open(_hb_input).read(), _hb_input, "exec"))
+except Exception as _hb_input_err:
+    import sys as _hb_isys
+    print("horse-browser: couldn't load horse_input.py (%r) — re-run horse-browser's install.sh" % (_hb_input_err,), file=_hb_isys.stderr)
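
The per-call screenshot naming and day-old sweep added in this file can be tried in isolation. This is a minimal sketch, assuming a throwaway temp directory in place of browser-harness's `_ipc._TMP`; `unique_shot_path` and `sweep_old_shots` are hypothetical names, and no screenshot is actually captured.

```python
import os
import tempfile
import time


def unique_shot_path(tmp_dir: str, lane: str = "default") -> str:
    """Reserve a fresh shot-<lane>-*.png file in tmp_dir and return its path."""
    fd, path = tempfile.mkstemp(prefix=f"shot-{lane}-", suffix=".png", dir=tmp_dir)
    os.close(fd)  # only the name is needed; the caller writes the PNG later
    return path


def sweep_old_shots(tmp_dir: str, max_age_s: float = 86400) -> int:
    """Delete shot-*.png files older than max_age_s; return how many were removed."""
    now = time.time()
    removed = 0
    for entry in list(os.scandir(tmp_dir)):
        if not (entry.name.startswith("shot-") and entry.name.endswith(".png")):
            continue
        if now - entry.stat().st_mtime > max_age_s:
            try:
                os.remove(entry.path)
                removed += 1
            except OSError:
                pass  # raced with another process; ignore
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        a = unique_shot_path(tmp, "lane1")
        b = unique_shot_path(tmp, "lane1")
        print("distinct paths:", a != b)
        # Backdate one file by two days, then sweep with the one-day default.
        old = time.time() - 2 * 86400
        os.utime(a, (old, old))
        print("removed:", sweep_old_shots(tmp))
```

Because `mkstemp` creates the file atomically, two sessions calling at the same moment still get different names, which is the property the shared `shot.png` lacked.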

package/agent-input.py ADDED

@@ -0,0 +1,275 @@
+# horse-browser Tier 2 — trusted, correct input (installed as horse_input.py).
+#
+# Loaded by horse_helpers.py, which execs this sibling — so every agent that drives
+# the browser gets these by default. This is the AGENT LAYER of realness: input sent
+# over CDP that fires the SAME events a real browser would, applied on every site.
+#
+# WHY IT'S NOT JUST STEALTH — it's correctness. Sites bind real logic to real events:
+#   • keyup / keydown / input  → enable the submit button, fire autocomplete, validate,
+#                                update React/Vue/Svelte controlled state.
+#   • mousedown / pointerdown  → open menus, custom widgets, "close on outside mousedown".
+# insertText sets the value but fires NO key events; `el.value=` fires nothing at all;
+# `el.click()` fires only `click`, not the down/up/pointer chain. In every case the text
+# or click *appears* to work while the page's logic never ran — a silent break that hits
+# plain forms, not just defended ones. These verbs fire the real events, so pages behave.
+# (Bot-detector realness rides along for free.) Human-like MOTION — bezier paths, warm-up,
+# gaussian cadence — is the separate, gated Tier 3 layer; this file stays cheap and fast.
+#
+# Reach for these (they shadow the untrusted shortcuts):
+#   click(css)              trusted mousedown->mouseup->click at the element's center
+#   type_into(css, text)    focus + real per-char keyDown/keyUp (fires keyup/input/change)
+#   type_text(text)         OVERRIDE of the stock insertText typer → real key events
+#   press(name, times=1)    a trusted named key (Enter, Tab, Escape, Arrow*, Backspace)
+#   press_hold(css, s)      trusted press-and-hold (Press & Hold challenges)
+#   drag(css, to=/dx=/dy=)  trusted drag (slide-to-verify)
+#   solve_challenge(act=1)  classify a challenge → solve the EASY ones (click/hold/drag),
+#                           or return "escalate:<why>" for image/text/audio ones.
+# Deliberate escape hatch:
+#   insert_text_fast(text)  raw Input.insertText — fast, but fires NO key events; only for
+#                           dumping into a plain <textarea> with no listeners.
+#
+# `cdp` is provided by horse_helpers.py (loaded first) — we drive everything through it.
+import math as _im
+import random as _ir
+import time as _it
+import json as _ij
+import sys as _isys
+_mouse = {"x": 240.0, "y": 240.0}
+def _eval(expr):
+    return (cdp("Runtime.evaluate", expression=expr, returnByValue=True).get("result") or {}).get("value")
+def _center(css):
+    """Viewport-center (CSS px) of the first `css` match, scrolled into view; None if absent/hidden."""
+    expr = ("(function(){var e=document.querySelector(" + _ij.dumps(css) + ");if(!e)return null;"
+            "try{e.scrollIntoView({block:'center',inline:'center'});}catch(_){}"
+            "var b=e.getBoundingClientRect();if(b.width===0&&b.height===0)return null;"
+            "return [b.x+b.width/2,b.y+b.height/2];})()")
+    return _eval(expr)
+def _focus(css):
+    return bool(_eval("(function(){var e=document.querySelector(" + _ij.dumps(css) + ");if(!e)return false;e.focus();return true;})()"))
+# ── mouse ────────────────────────────────────────────────────────────────────────
+def _move(x, y):
+    cdp("Input.dispatchMouseEvent", type="mouseMoved", x=x, y=y)
+    _mouse["x"], _mouse["y"] = x, y
+def click_xy(x, y):
+    """Trusted left click at viewport coords — the full mousedown/mouseup/click (+pointer)
+    chain, so the page reacts exactly as it would to a person (unlike el.click())."""
+    _move(x, y)
+    _it.sleep(_ir.uniform(0.02, 0.06))
+    cdp("Input.dispatchMouseEvent", type="mousePressed", x=x, y=y, button="left", clickCount=1)
+    _it.sleep(_ir.uniform(0.03, 0.08))
+    cdp("Input.dispatchMouseEvent", type="mouseReleased", x=x, y=y, button="left", clickCount=1)
+def click(css):
+    """Trusted click at the center of `css`. Never el.click() — this fires the whole
+    event chain (mousedown/mouseup/pointer/click) so menus, widgets and validation run."""
+    c = _center(css)
+    if not c:
+        raise RuntimeError("click: no visible element " + css)
+    click_xy(c[0], c[1])
+def press_hold(css, seconds=6.0):
+    """Trusted press-and-hold at `css` — for Press & Hold challenges (PerimeterX/DataDome).
+    Holds the button down with tiny jitter for `seconds`, which real widgets require."""
+    c = _center(css)
+    if not c:
+        raise RuntimeError("press_hold: no visible element " + css)
+    x, y = c
+    _move(x, y)
+    cdp("Input.dispatchMouseEvent", type="mousePressed", x=x, y=y, button="left", clickCount=1)
+    end = _it.time() + seconds
+    while _it.time() < end:
+        cdp("Input.dispatchMouseEvent", type="mouseMoved", x=x + _ir.uniform(-1.4, 1.4), y=y + _ir.uniform(-1.4, 1.4), button="left")
+        _it.sleep(_ir.uniform(0.08, 0.2))
+    cdp("Input.dispatchMouseEvent", type="mouseReleased", x=x, y=y, button="left", clickCount=1)
+def drag(css, to=None, dx=None, dy=0):
+    """Trusted drag from `css` — for slide-to-verify sliders. Give an absolute `to`=(x,y)
+    target, or a relative `dx`/`dy`. Moves in small held-button steps (ease-in-out + tiny
+    jitter) so the site sees a real pointer drag, not a teleport."""
+    c = _center(css)
+    if not c:
+        raise RuntimeError("drag: no visible element " + css)
+    x0, y0 = c
+    x1, y1 = (to if to else (x0 + (dx or 0), y0 + dy))
+    _move(x0, y0)
+    cdp("Input.dispatchMouseEvent", type="mousePressed", x=x0, y=y0, button="left", clickCount=1)
+    _it.sleep(_ir.uniform(0.05, 0.12))
+    steps = max(14, int(_im.hypot(x1 - x0, y1 - y0) / 10))
+    for i in range(1, steps + 1):
+        t = i / steps
+        e = t * t * (3 - 2 * t)                                   # smoothstep ease
+        px = x0 + (x1 - x0) * e + (_ir.uniform(-1.0, 1.0) if i < steps else 0)
+        py = y0 + (y1 - y0) * e + (_ir.uniform(-1.0, 1.0) if i < steps else 0)
+        cdp("Input.dispatchMouseEvent", type="mouseMoved", x=px, y=py, button="left")
+        _it.sleep(_ir.uniform(0.008, 0.022))
+    cdp("Input.dispatchMouseEvent", type="mouseReleased", x=x1, y=y1, button="left", clickCount=1)
+    _mouse["x"], _mouse["y"] = x1, y1
+# ── keyboard ─────────────────────────────────────────────────────────────────────
+_PUNCT = {'/': ('Slash', 191), '.': ('Period', 190), ',': ('Comma', 188), '-': ('Minus', 189),
+          ' ': ('Space', 32), ';': ('Semicolon', 186), ':': ('Semicolon', 186), "'": ('Quote', 222),
+          '"': ('Quote', 222), '@': ('Digit2', 50), '_': ('Minus', 189), '=': ('Equal', 187),
+          '+': ('Equal', 187), '(': ('Digit9', 57), ')': ('Digit0', 48), '!': ('Digit1', 49),
+          '?': ('Slash', 191), '#': ('Digit3', 51)}
+_SPECIAL = {'Enter': ('Enter', 13, '\r'), 'Tab': ('Tab', 9, '\t'), 'Backspace': ('Backspace', 8, ''),
+            'Escape': ('Escape', 27, ''), 'Delete': ('Delete', 46, ''), 'Space': ('Space', 32, ' '),
+            'ArrowDown': ('ArrowDown', 40, ''), 'ArrowUp': ('ArrowUp', 38, ''),
+            'ArrowLeft': ('ArrowLeft', 37, ''), 'ArrowRight': ('ArrowRight', 39, '')}
+def _keyinfo(ch):
+    if ch.isalpha():
+        u = ch.upper()
+        return (ch, 'Key' + u, ord(u))
+    if ch.isdigit():
+        return (ch, 'Digit' + ch, ord(ch))
+    if ch in _PUNCT:
+        code, vk = _PUNCT[ch]
+        return (ch, code, vk)
+    return (ch, '', 0)
+def _key(ch):
+    key, code, vk = _keyinfo(ch)
+    base = dict(key=key, code=code, windowsVirtualKeyCode=vk, nativeVirtualKeyCode=vk)
+    # keyDown WITH text makes Chrome actually insert the char (fires a native `input`
+    # event); keyUp without text. Real, fully-formed events → site keyup/input listeners fire.
+    cdp("Input.dispatchKeyEvent", type="keyDown", text=ch, **base)
+    cdp("Input.dispatchKeyEvent", type="keyUp", **base)
+def press(name, times=1):
+    """Press a named key with a real, trusted key event: Enter, Tab, Escape, Backspace,
+    Delete, Space, Arrow{Up,Down,Left,Right}."""
+    code, vk, txt = _SPECIAL[name]
+    base = dict(key=code, code=code, windowsVirtualKeyCode=vk, nativeVirtualKeyCode=vk)
+    for _ in range(times):
+        cdp("Input.dispatchKeyEvent", type="keyDown", **(dict(base, text=txt) if txt else base))
+        cdp("Input.dispatchKeyEvent", type="keyUp", **base)
+        _it.sleep(0.03)
+def _clear_focused():
+    mods = 4 if _isys.platform == "darwin" else 2                 # Cmd on macOS, Ctrl elsewhere
+    sa = dict(key='a', code='KeyA', windowsVirtualKeyCode=65, nativeVirtualKeyCode=65, modifiers=mods)
+    cdp("Input.dispatchKeyEvent", type="rawKeyDown", **sa)
+    cdp("Input.dispatchKeyEvent", type="keyUp", **sa)
+    press("Delete")
+def type_into(css, text, per=0.0, clear=False, enter=False):
+    """Type `text` into `css` with REAL per-char key events so keyup/input/change fire —
+    enabling submit buttons, triggering autocompletes, updating framework state. Fast by
+    default (per=0); pass per>0 for a light cadence, or use the Tier 3 human_* helpers for
+    full human timing. clear=True empties the field first; enter=True presses Enter after."""
+    if not _focus(css):
+        raise RuntimeError("type_into: no element " + css)
+    if clear:
+        _clear_focused()
+    for ch in text:
+        _key(ch)
+        if per:
+            _it.sleep(per)
+    if enter:
+        press("Enter")
+def type_text(text):
+    """OVERRIDE of the stock browser-harness typer. Stock type_text used Input.insertText,
+    which sets the value but fires NO key events — so keyup/input listeners never run and
+    the page silently misbehaves (submit stays disabled, autocomplete dead, React state
+    stale). This types the currently-focused element with REAL key events instead."""
+    for ch in text:
+        _key(ch)
+def insert_text_fast(text):
+    """The old fast path: Input.insertText in one shot. Fires NO key events — use ONLY for
+    dumping into a plain <textarea> with no keyup/input listeners, where speed matters."""
+    cdp("Input.insertText", text=text)
+# ── easy-challenge solving — a gesture, never perception ───────────────────────────
+# Classify what's on the page. EASY = something a trusted gesture clears with no
+# understanding of content (checkbox, press-&-hold, slide-to-verify). HARD = anything
+# needing to perceive content (pick images, read distorted text, rotate, audio) — we
+# NEVER guess at those; we say escalate. Detection is heuristic (best-effort DOM sniff).
+_DETECT_JS = r"""
+(() => {
+  const q = (s) => document.querySelector(s);
+  const txt = (document.body ? document.body.innerText : '').toLowerCase();
+  const seen = (...ss) => ss.find(s => q(s));
+  // An image/interactive challenge popup that's OPEN — reCAPTCHA bframe or hCaptcha
+  // challenge iframe. It's a top-document iframe (cross-origin, can't read inside) but
+  // we can see it's visibly expanded. That means a checkbox already escalated to the
+  // perception kind → hard, escalate (don't re-report the checkbox behind it).
+  const pop = q('iframe[src*="recaptcha/api2/bframe"], iframe[src*="hcaptcha.com/captcha"][title*="hallenge"], iframe[title*="recaptcha challenge"]');
+  if (pop) { const pb = pop.getBoundingClientRect(); if (pb.height > 120 && pb.width > 120 && getComputedStyle(pop).visibility !== 'hidden') return {kind:'hard', why:'image challenge popup is open'}; }
+  // HARD first — if a perception challenge is present, don't attempt a gesture.
+  const hardTxt = /select all|click each|images? (with|containing)|type the (characters|text)|what does this say|rotate|listen and|audio challenge/;
+  if (hardTxt.test(txt) || q('table.rc-imageselect-table') || q('.geetest_item_wrap')) return {kind:'hard', why:'image/text/audio challenge'};
+  // Press & Hold (PerimeterX / DataDome)
+  if (/press\s*(?:&|and)?\s*hold/.test(txt) || q('#px-captcha') || q('[id*="px-captcha"]'))
+    { const el = q('#px-captcha [role=button]') || q('#px-captcha') || q('[id*="press"]'); return {kind:'hold', sel: el ? _sel(el) : '#px-captcha', why:'press & hold'}; }
+  // Slider / slide-to-verify
+  const slider = q('.slider, [class*="slide"] [class*="btn"], [class*="drag"][class*="btn"], .yidun_slider, .nc_iconfont');
+  if (/slide to|drag the slider|slide right|slide to verify/.test(txt) || slider)
+    return {kind:'drag', sel: slider ? _sel(slider) : null, why:'slide to verify'};
+  // Checkbox captchas (reCAPTCHA / hCaptcha / Turnstile) — usually a cross-origin iframe.
+  if (q('iframe[src*="recaptcha/api2/anchor"]') || q('iframe[title*="hCaptcha"]') || q('iframe[src*="challenges.cloudflare.com"]') || q('.cf-turnstile') || q('.g-recaptcha') || q('.h-captcha'))
+    return {kind:'checkbox', why:'checkbox captcha (in an iframe — click its coords)'};
+  function _sel(e){ if(e.id) return '#'+CSS.escape(e.id); if(e.className && typeof e.className==='string'){const c=e.className.trim().split(/\s+/)[0]; if(c) return e.tagName.toLowerCase()+'.'+CSS.escape(c);} return e.tagName.toLowerCase(); }
+  return {kind:'none'};
+})()
+"""
+def solve_challenge(act=True, hold_seconds=6.0):
+    """Detect a challenge and, if it's EASY (a trusted gesture — checkbox click, press-&-hold,
+    slide-to-verify), solve it; return a short status string. For HARD challenges (identify
+    images, read text, rotate, audio) it does NOT guess — it returns 'escalate:<why>' so you
+    stop and ask the operator. Returns 'none' if no challenge is found. act=False = classify
+    only (don't perform the gesture)."""
+    d = _eval(_DETECT_JS) or {"kind": "none"}
+    kind, sel, why = d.get("kind"), d.get("sel"), d.get("why", "")
+    if kind == "none":
+        return "none"
+    if kind == "hard":
+        return "escalate:" + why + " — needs perception; ask the operator, don't guess"
+    if not act:
+        return "easy:%s (%s) sel=%s" % (kind, why, sel)
+    try:
+        if kind == "hold":
+            press_hold(sel, seconds=hold_seconds)
+            return "solved:hold — press-held %s (verify it cleared; retry once, else escalate)" % sel
+        if kind == "drag":
+            if not sel:
+                return "easy:drag (%s) — found a slider but couldn't pin a selector; drag it by hand with drag(sel, dx=<track width>)" % why
+            c = _center(sel)
+            if c:
+                drag(sel, dx=320)                                # slide well to the right; simple sliders latch at the end
+                return "solved:drag — slid %s right (verify; if it snapped back, escalate)" % sel
+            return "easy:drag — slider not locatable; escalate if it blocks you"
+        if kind == "checkbox":
+            return ("easy:checkbox (%s) — it's in a cross-origin iframe. Screenshot, read the checkbox pixel, "
+                    "then click_xy(x, y) (a trusted click passes through the iframe). If an image grid appears "
+                    "after, that's the HARD kind — escalate." % why)
+    except Exception as e:
+        return "escalate:gesture failed (%r) — ask the operator" % (e,)
+    return "none"
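
The easing used by `drag()` above is easier to inspect without a browser attached. The sketch below computes only the points such a drag would visit, using the same step count and smoothstep formula as the diff; `drag_path` and its `jitter`/`rng` parameters are hypothetical additions for illustration, and nothing is sent over CDP.

```python
import math
import random


def drag_path(x0, y0, x1, y1, jitter=1.0, rng=None):
    """Return the (x, y) points a smoothstep-eased drag would visit.

    Jitter is applied to every point except the last, so the drag ends
    exactly on the target.
    """
    rng = rng or random.Random()
    steps = max(14, int(math.hypot(x1 - x0, y1 - y0) / 10))
    points = []
    for i in range(1, steps + 1):
        t = i / steps
        e = t * t * (3 - 2 * t)  # smoothstep: slow start, slow finish
        last = i == steps
        jx = 0.0 if last else rng.uniform(-jitter, jitter)
        jy = 0.0 if last else rng.uniform(-jitter, jitter)
        points.append((x0 + (x1 - x0) * e + jx, y0 + (y1 - y0) * e + jy))
    return points


if __name__ == "__main__":
    pts = drag_path(100, 200, 420, 200, jitter=0.0)
    print(len(pts), "points; first:", pts[0], "last:", pts[-1])
```

With `jitter=0.0` the x coordinates rise monotonically from just past the start to the target, which makes it simple to check the easing before adding randomness back in.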

package/extension/background.js CHANGED

@@ -137,3 +137,42 @@ chrome.runtime.onInstalled.addListener((details) => {
   // First run on a fresh profile only: open the welcome page (not on updates/restarts).
   if (details.reason === "install") chrome.tabs.create({ url: chrome.runtime.getURL("hello.html"), active: true });
 });
+// ── Passive network monitor ── Tier 1 realness debug aid. chrome.webRequest observes every
+// request WITHOUT touching the page — no window.fetch/XHR wrapper, which would itself be a
+// fingerprint tell (a non-native fetch.toString). We buffer the last N *outcomes* per tab
+// (onCompleted carries the status; onErrorOccurred carries blocks/aborts), so an agent can
+// see what a click actually fired — decisive for spotting silent gating. State lives in the
+// SW's memory: the 30s keepalive alarm keeps it warm during a session, but a hard SW
+// eviction resets it — fine for a read-right-after-you-act ring buffer.
+const NETLOG = new Map();          // tabId -> [{ t, method, url, type, status, fromCache, error }]
+const NETLOG_MAX = 200;
+function netPush(tabId, entry) {
+  if (tabId == null || tabId < 0) return;   // -1 = not tied to a tab (SW/extension fetches)
+  let buf = NETLOG.get(tabId);
+  if (!buf) { buf = []; NETLOG.set(tabId, buf); }
+  buf.push(entry);
+  if (buf.length > NETLOG_MAX) buf.shift();
+}
+const NET_URLS = { urls: ["http://*/*", "https://*/*"] };
+chrome.webRequest.onCompleted.addListener(
+  (d) => netPush(d.tabId, { t: d.timeStamp, method: d.method, url: d.url, type: d.type, status: d.statusCode, fromCache: d.fromCache }),
+  NET_URLS,
+);
+chrome.webRequest.onErrorOccurred.addListener(
+  (d) => netPush(d.tabId, { t: d.timeStamp, method: d.method, url: d.url, type: d.type, status: null, error: d.error }),
+  NET_URLS,
+);
+chrome.tabs.onRemoved.addListener((tabId) => NETLOG.delete(tabId));   // don't leak buffers
+// getNetLog(targetId) / clearNetLog(targetId) — CDP-callable like groupTab/listTabs. Agents
+// pass a CDP targetId; we bridge to the chrome tabId the same way the grouper does.
+self.getNetLog = async (targetId) => {
+  const tabId = await tabIdForTargetId(targetId);
+  return NETLOG.get(tabId) || [];
+};
+self.clearNetLog = async (targetId) => {
+  const tabId = await tabIdForTargetId(targetId);
+  NETLOG.delete(tabId);
+  return true;
+};
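
The per-tab ring buffer above is written in JavaScript; the same bounded-buffer behaviour can be sketched in Python for readers who want to reason about it on its own. This is an illustrative analogue, not code from the package: the `NetLog` class and its method names are assumptions, and only the 200-entry cap and the "ignore negative tab ids" rule come from the diff.

```python
from collections import deque

NETLOG_MAX = 200  # same cap as the extension's NETLOG_MAX


class NetLog:
    """Per-tab ring buffer of request outcomes (illustrative Python analogue)."""

    def __init__(self, max_entries: int = NETLOG_MAX):
        self._max = max_entries
        self._by_tab = {}

    def push(self, tab_id, entry: dict) -> None:
        if tab_id is None or tab_id < 0:  # -1 means "not tied to a tab"
            return
        buf = self._by_tab.get(tab_id)
        if buf is None:
            buf = deque(maxlen=self._max)
            self._by_tab[tab_id] = buf
        buf.append(entry)  # a full deque drops its oldest entry automatically

    def get(self, tab_id) -> list:
        return list(self._by_tab.get(tab_id, ()))

    def clear(self, tab_id) -> None:
        self._by_tab.pop(tab_id, None)


if __name__ == "__main__":
    log = NetLog()
    for i in range(205):
        log.push(7, {"url": f"https://example.test/{i}", "status": 200})
    print(len(log.get(7)), log.get(7)[0]["url"])
```

A `deque` with `maxlen` plays the role of the `push` then `shift` pair in `netPush`, so the oldest outcome falls off as soon as the cap is exceeded.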

package/extension/manifest.json CHANGED

@@ -1,14 +1,28 @@
 {
   "manifest_version": 3,
   "name": "Agent Tab Grouper",
-  "version": "0.5.12",
-  "description": "Groups CDP-driven automation tabs into a per-session tab group, and serves the Agent Monitor — a live CCTV grid of all sessions' tabs. Driven over CDP via Runtime.evaluate against this extension's service worker.",
+  "version": "0.6.0",
+  "description": "Groups CDP-driven automation tabs into a per-session tab group and serves the Agent Monitor (a live CCTV grid). Also carries Tier 1 realness: an always-on Chrome-real fingerprint (UA-CH + sec-ch-ua) and a passive per-tab network log. Driven over CDP via Runtime.evaluate against this extension's service worker.",
   "icons": { "16": "icons/icon16.png", "48": "icons/icon48.png", "128": "icons/icon128.png" },
   "background": { "service_worker": "background.js" },
+  "content_scripts": [
+    {
+      "matches": ["http://*/*", "https://*/*"],
+      "js": ["realchrome.js"],
+      "run_at": "document_start",
+      "world": "MAIN",
+      "all_frames": true
+    }
+  ],
+  "declarative_net_request": {
+    "rule_resources": [
+      { "id": "realness_headers", "enabled": true, "path": "rules.json" }
+    ]
+  },
   "action": {
     "default_title": "Open Agent Monitor",
     "default_icon": { "16": "icons/icon16.png", "48": "icons/icon48.png", "128": "icons/icon128.png" }
   },
-  "permissions": ["tabs", "tabGroups", "debugger", "alarms"],
-  "host_permissions": ["http://127.0.0.1:9223/*"]
+  "permissions": ["tabs", "tabGroups", "debugger", "alarms", "webRequest", "declarativeNetRequest"],
+  "host_permissions": ["http://*/*", "https://*/*"]
 }

package/extension/realchrome.js ADDED

@@ -0,0 +1,27 @@
+// Tier 1 realness — the JS half of the Chrome-for-Testing → stable Google Chrome mask.
+// Runs in the page's MAIN world at document_start (see manifest content_scripts), so it
+// patches navigator.userAgentData BEFORE the page's first script reads it (verified: a
+// parse-time read already sees "Google Chrome", 10/10). The wire half — the sec-ch-ua
+// request header — is done coherently by declarativeNetRequest (rules.json), so JS and
+// headers agree. Payload lifted verbatim from the proven real_chrome() helper.
+(function(){ 'use strict'; try{
+  var u = navigator.userAgentData; if(!u) return;
+  var FULL='150.0.7871.47', MAJOR='150';
+  function low(){ return [{brand:'Not;A=Brand',version:'8'},{brand:'Chromium',version:MAJOR},{brand:'Google Chrome',version:MAJOR}]; }
+  function full(){ return [{brand:'Not;A=Brand',version:'8.0.0.0'},{brand:'Chromium',version:FULL},{brand:'Google Chrome',version:FULL}]; }
+  var proto = Object.getPrototypeOf(u);
+  function nativeProxy(orig, impl){ return new Proxy(orig, { apply:function(t,ta,a){ return impl(t,ta,a); } }); }
+  var bd = Object.getOwnPropertyDescriptor(proto,'brands');
+  if(bd && bd.get && !u.brands.some(function(b){return b.brand==='Google Chrome'})){
+    Object.defineProperty(proto,'brands',{ get: nativeProxy(bd.get, function(){ return low(); }), set:undefined, enumerable:bd.enumerable, configurable:true });
+  }
+  if(typeof proto.getHighEntropyValues==='function'){
+    proto.getHighEntropyValues = nativeProxy(proto.getHighEntropyValues, function(t,ta,a){
+      return Reflect.apply(t,ta,a).then(function(r){ if(r&&typeof r==='object'){ if('brands' in r) r.brands=low(); if('fullVersionList' in r) r.fullVersionList=full(); if('uaFullVersion' in r) r.uaFullVersion=FULL; } return r; });
+    });
+  }
+  if(typeof proto.toJSON==='function'){
+    proto.toJSON = nativeProxy(proto.toJSON, function(t,ta,a){ var o=Reflect.apply(t,ta,a); if(o&&typeof o==='object') o.brands=low(); return o; });
+  }
+  try{ if(navigator.webdriver===true) Object.defineProperty(Navigator.prototype,'webdriver',{get:function(){return false},configurable:true}); }catch(e){}
+}catch(e){} })();

package/extension/rules.json ADDED

@@ -0,0 +1,23 @@
+[
+  {
+    "id": 1,
+    "priority": 1,
+    "action": {
+      "type": "modifyHeaders",
+      "requestHeaders": [
+        {
+          "header": "sec-ch-ua",
+          "operation": "set",
+          "value": "\"Not;A=Brand\";v=\"8\", \"Chromium\";v=\"150\", \"Google Chrome\";v=\"150\""
+        }
+      ]
+    },
+    "condition": {
+      "urlFilter": "*",
+      "resourceTypes": [
+        "main_frame", "sub_frame", "stylesheet", "script", "image",
+        "font", "object", "xmlhttprequest", "ping", "csp_report", "media", "websocket", "other"
+      ]
+    }
+  }
+]
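
rules.json hard-codes the `sec-ch-ua` header value, while realchrome.js hard-codes the matching low-entropy brand list, so the two can drift apart on a version bump. A small consistency check might look like the sketch below; `format_sec_ch_ua` is a hypothetical helper, and the brand list and header value are copied from the two files in this diff.

```python
def format_sec_ch_ua(brands):
    """Serialise [(brand, major_version), ...] in the sec-ch-ua list style."""
    return ", ".join(f'"{brand}";v="{version}"' for brand, version in brands)


# Brand list as returned by low() in realchrome.js (MAJOR = '150').
LOW_ENTROPY_BRANDS = [("Not;A=Brand", "8"), ("Chromium", "150"), ("Google Chrome", "150")]

# Header value as set in rules.json, with the JSON escapes removed.
RULES_JSON_VALUE = '"Not;A=Brand";v="8", "Chromium";v="150", "Google Chrome";v="150"'


if __name__ == "__main__":
    print("JS and header agree:", format_sec_ch_ua(LOW_ENTROPY_BRANDS) == RULES_JSON_VALUE)
```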

package/install.sh CHANGED

@@ -64,12 +64,14 @@ else
 fi
 # 3. bh_open helpers into browser-harness ───────────────────────────────────────
-# browser-harness auto-loads <workspace>/agent_helpers.py on every call; we append our
-# bh_open/bh_list/bh_switch_tab helpers there so they exist on the first run — no agent
-# installs the recipe by hand. The workspace location varies by version/install (≤0.1.0:
-# <repo>/agent-workspace; 0.1.1+: ~/.config/browser-harness/agent-workspace; or whatever
-# BH_AGENT_WORKSPACE points at) — so instead of guessing, we ASK browser-harness itself
-# where it loads from via its own python (helpers.AGENT_WORKSPACE). Append-once.
+# browser-harness auto-loads <workspace>/agent_helpers.py on every call. Our helpers
+# ship as horse_helpers.py — a file THIS installer owns and overwrites outright on
+# every sync — plus a load-once stub appended to agent_helpers.py exactly once and
+# never rewritten, so anything a user adds to agent_helpers.py is never touched.
+# The workspace location varies by version/install (≤0.1.0: <repo>/agent-workspace;
+# 0.1.1+: ~/.config/browser-harness/agent-workspace; or whatever BH_AGENT_WORKSPACE
+# points at) — so instead of guessing, we ASK browser-harness itself where it loads
+# from via its own python (helpers.AGENT_WORKSPACE).
 if ! command -v browser-harness >/dev/null 2>&1; then
   # browser-harness is a separate (Python) prerequisite. In a hand/curl install it's
   # required up front. Under npm it's a *declared* prereq the user may not have yet — so
@@ -82,17 +84,56 @@ if ! command -v browser-harness >/dev/null 2>&1; then
   exit 1
 fi
 HELPERS_SRC="$HERE/agent-helpers.py"
+INPUT_SRC="$HERE/agent-input.py"   # Tier 2 trusted-input layer → workspace/horse_input.py
+# Legacy marker: pre-0.4.1 installs appended the helpers INLINE under this line, and
+# re-syncs replaced marker→EOF — silently eating anything a user had added below the
+# block. Kept only so those files can be migrated once.
 HELPERS_MARKER="# ── horse-browser helpers (installed by horse-browser/install.sh) ──"
-install_helpers_into() {  # $1 = workspace dir; (re)syncs our helpers block — idempotent, so
-  local dst="$1/agent_helpers.py"  # re-running install.sh deploys helper UPDATES, not just the first time.
-  mkdir -p "$1" 2>/dev/null || return 0
-  # drop any previously-installed block (marker → EOF), trim trailing blanks, then re-append
+LOADER_MARKER="# >>> horse-browser: bh_open helpers (managed loader — do not edit) >>>"
+install_helpers_into() {  # $1 = workspace dir; (re)syncs the helpers — idempotent, so
+  local ws="$1" dst="$1/agent_helpers.py"  # re-running install.sh deploys helper UPDATES too.
+  mkdir -p "$ws" 2>/dev/null || return 0
+  cp "$HELPERS_SRC" "$ws/horse_helpers.py"
+  cp "$INPUT_SRC" "$ws/horse_input.py"   # loaded by horse_helpers.py (chain-exec)
+  # one-time migration of a legacy inline block: strip it ONLY on an exact byte match
+  # with the shipped source, so user additions below it survive. A modified/unknown
+  # block is left in place — harmless, the loader runs after it and its defs win.
   if [ -f "$dst" ] && grep -qF "$HELPERS_MARKER" "$dst"; then
-    awk -v m="$HELPERS_MARKER" 'index($0,m){exit} {print}' "$dst" \
-      | awk 'NF{last=NR} {b[NR]=$0} END{for(i=1;i<=last;i++)print b[i]}' > "$dst.tmp" && mv "$dst.tmp" "$dst"
+    python3 - "$dst" "$HELPERS_SRC" "$HELPERS_MARKER" <<'PY' || true
+import sys
+dst, src, marker = sys.argv[1], sys.argv[2], sys.argv[3]
+text, block = open(dst).read(), open(src).read()
+i = text.find(marker)
+legacy = marker + "\n" + block
+if i >= 0 and text[i:].startswith(legacy):
+    head = text[:i].rstrip("\n")
+    rest = text[i + len(legacy):].lstrip("\n")
+    parts = [p for p in (head, rest) if p]
+    open(dst, "w").write("\n\n".join(parts) + ("\n" if parts else ""))
+    print("  migrated: legacy inline helper block removed (everything else kept)")
+elif i >= 0:
+    print("  note: a legacy horse-browser block is present but doesn't match the shipped", file=sys.stderr)
+    print("        source — left in place (the horse_helpers.py loader below supersedes", file=sys.stderr)
+    print("        it); remove it from " + dst + " by hand when convenient.", file=sys.stderr)
+PY
+  fi
+  if ! grep -qF "$LOADER_MARKER" "$dst" 2>/dev/null; then
+    if [ -s "$dst" ]; then printf '\n' >> "$dst"; fi
+    printf '%s\n' "$LOADER_MARKER" >> "$dst"
+    cat >> "$dst" <<'LOADER'
+# The helpers live in horse_helpers.py next to this file. install.sh overwrites THAT
+# file on every sync and never rewrites this one — your own code here is safe.
+try:
+    import os as _hb_os
+    _hb_path = _hb_os.path.join(_hb_os.path.dirname(_hb_os.path.abspath(__file__)), "horse_helpers.py")
+    exec(compile(open(_hb_path).read(), _hb_path, "exec"))
+except Exception as _hb_err:
+    import sys as _hb_sys
+    print("horse-browser: couldn't load horse_helpers.py (%r) — re-run horse-browser's install.sh" % (_hb_err,), file=_hb_sys.stderr)
+# <<< horse-browser: bh_open helpers <<<
+LOADER
   fi
-  { [ -s "$dst" ] && printf '\n\n'; printf '%s\n' "$HELPERS_MARKER"; cat "$HELPERS_SRC"; } >> "$dst"
-  echo "✓ synced bh_open helpers → $dst"
+  echo "✓ synced bh_open helpers → $ws/horse_helpers.py (loaded via stub in $dst)"
 }
 # (a) the workspace browser-harness ACTUALLY loads from — ask it via its own python
 # (the CLI's shebang points at it); helpers.AGENT_WORKSPACE honours BH_AGENT_WORKSPACE and
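
The append-once behaviour of the loader stub is the part of this change that protects user edits, so it is worth seeing on its own. The following is a minimal sketch of the same idea in Python, assuming a stand-in marker string and stub body; `ensure_loader_stub` is a hypothetical name, and the real installer does this in shell with `grep -qF` and `printf`.

```python
import os
import tempfile


def ensure_loader_stub(agent_helpers_path: str, marker: str, stub_body: str) -> bool:
    """Append marker + stub_body once; return True if the file was modified."""
    existing = ""
    if os.path.exists(agent_helpers_path):
        with open(agent_helpers_path, encoding="utf-8") as fh:
            existing = fh.read()
    if marker in existing:
        return False  # already installed: leave the user's file untouched
    with open(agent_helpers_path, "a", encoding="utf-8") as fh:
        if existing:
            fh.write("\n")  # separator after existing user code
        fh.write(marker + "\n" + stub_body)
    return True


if __name__ == "__main__":
    marker = "# >>> demo loader stub >>>"      # stand-in for LOADER_MARKER
    stub = "# (loader code would go here)\n"   # stand-in for the exec stub
    with tempfile.TemporaryDirectory() as ws:
        dst = os.path.join(ws, "agent_helpers.py")
        with open(dst, "w", encoding="utf-8") as fh:
            fh.write("my_own_helper = 1\n")
        print(ensure_loader_stub(dst, marker, stub))  # first sync modifies the file
        print(ensure_loader_stub(dst, marker, stub))  # second sync is a no-op
```

Keying idempotency on the marker line, and only ever appending, is what lets repeated syncs overwrite `horse_helpers.py` freely without rewriting `agent_helpers.py`.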

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@pa1nd/horse-browser",
-  "version": "0.4.0",
+  "version": "0.6.0",
   "publishConfig": {
     "access": "public"
   },
@@ -12,6 +12,7 @@
     "bin",
     "extension",
     "agent-helpers.py",
+    "agent-input.py",
     "scripts",
     "install.sh",
     "claude-md.sh",