npm - @empir3/empir3-bridge - Versions diffs - 0.3.21 - Mend

@empir3/empir3-bridge 0.3.21

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (62)

package/CHANGELOG.md +1531 -0
package/CODE_OF_CONDUCT.md +9 -0
package/CONTRIBUTING.md +75 -0
package/LICENSE +21 -0
package/README.md +464 -0
package/SECURITY.md +130 -0
package/assets/accuracy-lab.html +2639 -0
package/assets/api-clis-real.jpg +0 -0
package/assets/bridge-console-hero.jpg +0 -0
package/assets/browser-privacy.svg +151 -0
package/assets/demo-orchestration.svg +74 -0
package/assets/desktop-select-region.jpg +0 -0
package/assets/in-page-chat.gif +0 -0
package/assets/orchestration-hero.svg +126 -0
package/assets/social-preview.png +0 -0
package/assets/zara-accent.png +0 -0
package/build/bootstrap.js +548 -0
package/build/build.js +680 -0
package/build/payload-entry.js +649 -0
package/build/payload-signing-pub.json +7 -0
package/docs/AGENT_GUIDE.md +259 -0
package/docs/RELEASE.md +106 -0
package/docs/SAFETY.md +112 -0
package/docs/TESTING.md +181 -0
package/installer/server.js +231 -0
package/installer/ui/app.js +278 -0
package/installer/ui/index.html +24 -0
package/installer/ui/styles.css +146 -0
package/package.json +95 -0
package/scripts/bootstrap-e2e.mjs +650 -0
package/scripts/certify-bridge.mjs +636 -0
package/scripts/check-companion-surface.mjs +118 -0
package/scripts/extract-welcome.mjs +64 -0
package/scripts/gh-route-handler-check.mjs +57 -0
package/scripts/gh-wire-test.mjs +107 -0
package/scripts/publish-downloads.mjs +180 -0
package/scripts/smoke-all-tools.mjs +509 -0
package/scripts/smoke-live-bridge.mjs +696 -0
package/scripts/splice-welcome.mjs +63 -0
package/scripts/welcome-body.txt +2733 -0
package/src/anthropic-client.ts +192 -0
package/src/bootstrap-exe.ts +69 -0
package/src/bridge.ts +2444 -0
package/src/chat.ts +345 -0
package/src/cli-runner.ts +239 -0
package/src/cli.ts +649 -0
package/src/config.ts +199 -0
package/src/desktop-overlay.ps1 +121 -0
package/src/executable-resolver.ts +330 -0
package/src/handlers/agy-imagegen.ts +179 -0
package/src/handlers/github-cli.ts +399 -0
package/src/handlers/higgsfield-cli.ts +783 -0
package/src/launch.js +337 -0
package/src/mcp-server.ts +1265 -0
package/src/pair-claim.ts +218 -0
package/src/payload-daemon.ts +168 -0
package/src/server.ts +21036 -0
package/src/tool-defaults.ts +230 -0
package/src/update-check.js +136 -0
package/tray/build.py +76 -0
package/tray/requirements.txt +2 -0
package/tray/tray.py +1843 -0

package/docs/AGENT_GUIDE.md ADDED

@@ -0,0 +1,259 @@
+# Bridge agent guide — decision tree for picking the right tool
+> Audience: AI agents driving the bridge (Claude in the overlay, Vincent on
+> app.empir3.com, MCP clients in Claude Code / Codex / Cursor). Not the
+> human user.
+The bridge currently exposes **57 tools** across browser, desktop, overlay
++ recording, reliability + safety, and API & CLIs (`custom_llm` +
+`higgsfield_*`). Most tasks need 2-3 of them; the rest are specialised
+fallbacks. Read this guide top-to-bottom once per session and you'll skip
+the trial-and-error.
+When asked to "test the bridge", use the standard smoke plan at
+`/api/bridge-smoke-test-plan` and open `/desktop-test` first. That page is the
+shared harness for browser actions, desktop actions, recording/playback,
+overlay reinjection, and calibration checks; do not substitute a random page
+unless the user asked for a site-specific test.
+---
+## The five things you can do
+Every bridge action falls into one of these:
+1. **See** what's on screen — text, structure, or pixels
+2. **Find** a specific element (button, icon, field)
+3. **Act** on it (click, type, scroll)
+4. **Point** at it for the user (without taking control)
+5. **Manage state** (focus region, calibration, permissions)
+Pick the lane first, then the tool.
+---
+## 1 · See what's on screen
+| You want… | Tool | Why |
+|---|---|---|
+| Page text | `browser_text` | Cheapest. Always try first if content is web. |
+| Page structure as JSON refs | `browser_snapshot` | Get clickable refs (e0, e1, …) with bounds + names. Use for any web target. |
+| Visual confirmation of web page | `browser_screenshot` | After a write, to verify it landed. |
+| Desktop pixels | `desktop_screenshot` | Native desktop apps, games, anything outside the bridge browser. |
+| Tight zoom around a pixel | `desktop_screenshot_zoom` | Pixel-accurate inspection of a small area. |
+| Which monitors exist | `desktop_monitors` | DPI-aware bounds, including negative coords. |
+**Rule:** if the target is in the bridge tab, `browser_snapshot` beats every
+desktop tool. Web work should never touch `desktop_*` unless you need pixels
+outside the page.
+---
+## 2 · Find a specific element
+The most common failure mode for agents is "I need to click X but I don't
+know where it is." Pick the right finder for the surface:
+| Surface | Tool | Returns |
+|---|---|---|
+| Web page in bridge tab | `browser_snapshot` | `e0`, `e1`, … refs with `role`, `name`, `bounds`. |
+| Native Win32 / UWP app | `desktop_snapshot` | `d0`, `d1`, … refs with `role`, `name`, `bounds`. |
+| **Any visible region, agent reads numbers off image** | `desktop_snapshot_som` | Numbered boxes drawn on a screenshot. You read "click 14" — no pixel math. |
+| Pixel-only (no UIA, no DOM) — e.g. games, Photoshop, CEF | _Phase 2 (OmniParser)_ | Not shipped. Today: ask user to select region + use grid (see §5). |
+**`desktop_snapshot_som` is the killer tool** when the user has selected
+a focus region. It returns an annotated screenshot AND the element list —
+you pick the number and call `desktop_click_ref` with the matching `ref`.
+Zero pixel arithmetic.
+### When `_snapshot_som` returns `empty: true`
+That means UIA found no elements. Reasons:
+- App is CEF/Electron (Discord, Spotify, Steam, VS Code content area)
+- App is a game or custom GPU surface (Photoshop, Illustrator)
+- App is web content but in the bridge browser — use `browser_snapshot` instead
+Fallback: ask the user to point (`desktop_pick_point`) or use the focus
+chess-board grid (`desktop_click_cell`).
+---
+## 3 · Act on it (click / type / scroll)
+### Web
+| Intent | Tool |
+|---|---|
+| Click a ref from `browser_snapshot` | `browser_click_ref` |
+| Click by CSS selector | `browser_click` |
+| Click at known viewport coords | `browser_click_xy` |
+| Type into a ref | `browser_type_ref` |
+| Type by selector | `browser_type` |
+| Press a key globally | `browser_press` |
+| Scroll | `browser_scroll` |
+| Visual cue for the user | `browser_highlight` |
+### Desktop
+| Intent | Tool |
+|---|---|
+| Click a ref from `desktop_snapshot` / `_som` | `desktop_click_ref` |
+| Click at known screen coords | `desktop_click` |
+| Hover (no click) | `desktop_hover` / `desktop_hover_ref` |
+| Drag | `desktop_drag` |
+| Click cell N,M in the focus grid | `desktop_click_cell` |
+**Always prefer `_ref` over `_xy` / `_click`.** Refs survive screen movement
+and DPI changes. Coords don't.
+### Browser eval
+| Intent | Tool |
+|---|---|
+| Run arbitrary JS | `browser_evaluate` |
+Default-off because it's effectively root on the page. Use only when no
+other tool can get the data (e.g. inspecting `window.someAppState`).
+---
+## 4 · Point at it (don't take control)
+Use these when the user is doing the work and you're guiding them. The
+ghost cursor doesn't touch the real mouse.
+| Intent | Tool |
+|---|---|
+| Show a labeled arrow at coords | `desktop_pointer_show` |
+| Move the arrow | `desktop_pointer_move` |
+| Pulse animation for emphasis | `desktop_pointer_pulse` |
+| Hide it | `desktop_pointer_hide` |
+| Show pointer at a focus-grid cell | `desktop_pointer_cell` |
+| Check whether arrow is up | `desktop_pointer_status` |
+**Tutorial pattern:**
+```
+desktop_snapshot_som → "the brush tool is number 14"
+desktop_pointer_show at element 14 bounds, label "click here"
+… user clicks …
+desktop_pointer_hide
+desktop_snapshot_som → confirm next state
+```
+---
+## 5 · Manage focus, grid, and calibration
+These are scaffolding — agents rarely call them directly, but should know
+they exist.
+| Intent | Tool |
+|---|---|
+| Ask user to select an area to work in | `desktop_select_region` (user-interactive) |
+| Check whether a region is active | `desktop_focus_status` |
+| Clear the region | `desktop_release_focus` |
+| Show on-screen grid matching the agent's view | `desktop_focus_grid` |
+| User clicks → bridge reports cell coords | `desktop_pick_point` (user-interactive) |
+| Click a cell of the focus grid | `desktop_click_cell` |
+| Calibrate clicks (first-time or after monitor change) | `desktop_calibrate_pointer` (user-interactive) |
+| Read saved calibration | `desktop_calibration_status` |
+**Focus region** is the agent's working area inside an arbitrary monitor
+layout. When active, `desktop_screenshot` and `desktop_snapshot_som`
+auto-scope to it. Pixel coords in the screenshot are then focus-relative,
+which simplifies the agent's mental model.
+---
+## Recordings
+| Intent | Tool |
+|---|---|
+| Start recording user actions | `browser_record_start` |
+| Stop and save | `browser_record_stop` |
+| List saved recordings | `browser_recordings` |
+| Replay one | `browser_play` |
+| Push a message into the overlay chat | `browser_chat` |
+| Read overlay chat history | `browser_read_chat` |
+---
+## Common recipes
+### Recipe: click a button on a website
+```
+browser_snapshot → find { role:"button", name:"Continue" } → click_ref
+```
+### Recipe: click a small icon in a native app the user selected
+```
+desktop_snapshot_som → read numbered boxes → desktop_click_ref by id
+```
+### Recipe: guide user through Photoshop tutorial
+```
+desktop_select_region (one-time)
+desktop_calibrate_pointer (one-time)
+for each step:
+  desktop_pointer_show at the target, with label
+  wait for user click
+  desktop_pointer_hide
+```
+### Recipe: confirm an action worked
+```
+… action …
+browser_screenshot OR desktop_screenshot
+```
+### Recipe: agent doesn't know what app is open
+```
+desktop_snapshot scope:"all-windows" → returns each window's title + pid
+```
+---
+## Anti-patterns
+- ❌ Eyeballing pixel coords from a chat-resized screenshot. Use refs.
+- ❌ Calling `desktop_click x:… y:…` when `_snapshot_som` would work.
+- ❌ Taking a `desktop_screenshot` to "see" a web page you could `browser_snapshot`.
+- ❌ Repeating screenshots after every click — only re-capture when state changes meaningfully.
+- ❌ Calling `desktop_calibrate_pointer` without warning the user — it's interactive.
+---
+## Permissions (won't fire without these)
+Tools that *write* (anything in the Act / Point lanes, plus recordings)
+need `globalSafety.write` true AND per-tool `enabledTools[name]` true.
+Tools that *read* need `globalSafety.read`. The bridge returns
+`Permission denied` if either is off.
+Surface tools to the user via the bridge control center; never disable
+permissions silently from agent code.
+---
+## Discoverability — when in doubt
+Call `bridge_tool_advisor(intent: "I want to …")` — returns the relevant
+slice of this guide plus the tool names that fit.
+---
+## 6 · API & CLIs (talk to other models)
+The bridge can dispatch to other model endpoints you've already set up.
+Configure them in the welcome console (**API & CLIs** pane) once, then
+call from any MCP client.
+| Intent | Tool |
+|---|---|
+| Call any custom LLM (OpenAI-compatible protocol) | `custom_llm` (route by `provider` slug — Ollama, LM Studio, OpenRouter, vLLM, etc.) |
+| Check Higgsfield CLI status / auth | `higgsfield_status` |
+| List Higgsfield models / generations | `higgsfield_list` |
+| Generate an image with Higgsfield | `higgsfield_generate` (writes to `~/.empir3-bridge/artifacts/higgsfield/`) |
+**Family gates:**
+- `higgsfield_*` is gated by the `higgsfield-cli` handler toggle in bridge settings.
+- `custom_llm` is gated by `customProviders.length` — it isn't registered (and doesn't appear in the permissions list) until the user adds at least one custom provider. Adding the first provider auto-enables the tool; removing the last provider auto-disables it.
+Toggling any family after the MCP client is already connected requires
+reconnecting that client for the tool list to refresh.
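
Reviewer sketch (not part of the package): the permission rule and the `custom_llm` family gate described in AGENT_GUIDE.md above can be modelled roughly as below. Only the names `globalSafety`, `enabledTools`, `customProviders` and the `Permission denied` message come from the guide; the `BridgeConfig` shape and both helper functions are hypothetical, and the sketch assumes read tools need only the global read flag.

```typescript
// Sketch only: BridgeConfig, checkPermission and isCustomLlmRegistered are
// hypothetical names used for illustration, not the bridge's real API.
type Lane = "read" | "write";

interface BridgeConfig {
  globalSafety: { read: boolean; write: boolean };
  enabledTools: Record<string, boolean>;
  customProviders: string[];
}

// Write tools need the global write flag AND their own per-tool toggle.
// Read tools are assumed here to need only the global read flag.
function checkPermission(
  config: BridgeConfig,
  tool: string,
  lane: Lane,
): "ok" | "Permission denied" {
  if (lane === "read") {
    return config.globalSafety.read ? "ok" : "Permission denied";
  }
  const allowed = config.globalSafety.write && config.enabledTools[tool] === true;
  return allowed ? "ok" : "Permission denied";
}

// custom_llm counts as registered only once at least one provider exists.
function isCustomLlmRegistered(config: BridgeConfig): boolean {
  return config.customProviders.length > 0;
}

const sampleConfig: BridgeConfig = {
  globalSafety: { read: true, write: true },
  enabledTools: { browser_click_ref: true, desktop_click: false },
  customProviders: [],
};

console.log(checkPermission(sampleConfig, "browser_click_ref", "write"));
console.log(checkPermission(sampleConfig, "desktop_click", "write"));
console.log(isCustomLlmRegistered(sampleConfig));
```

Under these assumptions a write call is refused when either the global flag or the per-tool toggle is off, which matches the "if either is off" wording in the guide.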

package/docs/RELEASE.md ADDED

@@ -0,0 +1,106 @@
+# Release And Download Pipeline
+This repo is the canonical source for the open-source bridge and the Windows download.
+Normal users install from:
+```text
+https://empir3.com/download
+```
+The direct artifact path is:
+```text
+https://app.empir3.com/downloads/Empir3Setup.exe
+```
+## Version Source
+`package.json` is the bridge payload version source of truth.
+The tray menu displays the active payload version read from the downloaded payload. The public update manifest is:
+```text
+https://app.empir3.com/downloads/bridge-version.json
+```
+Do not guess the next version from this document. Before release, check both:
+```bash
+node -p "require('./package.json').version"
+curl -fsS https://app.empir3.com/downloads/bridge-version.json
+```
+If runtime behavior changes, bump `package.json`, build, dry-run publish, publish, then verify the live manifest reports the new version.
+## Build
+```bash
+npm install
+npm run build:windows
+```
+Build output lands in `build/dist/`:
+- `Empir3Setup.exe`
+- `bridge-payload-vX.Y.Z.tar.gz`
+- `bridge-payload-vX.Y.Z.sig`
+- `bridge-version.json`
+- `empir3-bridge.crx`
+- `empir3-bridge-update.xml`
+`Empir3Setup.exe` is the stable bootstrapper. The payload tarball contains the actual bridge runtime, installer UI, extension, and tray wrapper.
+## Publish
+Dry run:
+```bash
+npm run publish:downloads -- --dry-run
+```
+Publish (the deploy target comes from the environment — it is not hardcoded in the repo):
+```bash
+export EMPIR3_DOWNLOAD_HOST=user@your-host
+export EMPIR3_DOWNLOAD_DIR=/var/www/your-app/downloads
+npm run publish:downloads
+```
+The helper uploads the release artifacts to `$EMPIR3_DOWNLOAD_HOST:$EMPIR3_DOWNLOAD_DIR`, then verifies they are live:
+```text
+https://app.empir3.com/downloads/Empir3Setup.exe
+https://app.empir3.com/downloads/bridge-version.json
+```
+## Release Rule
+Do not ship bridge source changes without also checking whether they affect the Windows installer path. If the change affects runtime behavior, bump `package.json`, build the Windows payload, publish `bridge-version.json`, and smoke the tray version line after install/update.
+This release process is self-contained — it publishes the bridge payload + installer and is separate from any Empir3 app deploy. Do not use app deploy scripts (`deploy.ps1` / `deploy.sh`) for bridge releases.
+## Two Distribution Channels — Keep Them In Sync
+This private staging repo (`empir3labs/empir3-bridge-staging`) is the single source of truth. Two **independent, manual** pipelines fan out from it, and they drift if you update one and forget the other:
+| Channel | Tool | Target | What it is |
+|---|---|---|---|
+| **Public source** | `scripts/export-public.mjs` | `empir3hq/empir3-bridge` | A scrubbed, zero-history snapshot users clone / read (install-from-source path). |
+| **Runtime payload** | `build:windows` + `publish:downloads` | `app.empir3.com/downloads` | The signed payload installed daemons auto-update from. |
+`git push` to staging updates **neither** channel.
+### The coupling rule
+**Any change to runtime behavior ships to BOTH channels in the same pass**, or the public source and the running binary diverge. Doc-only changes (README, this file) don't need a payload publish — they can ride the next HQ export.
+One ordered checklist per runtime release:
+1. **Verify** the change live (alt-port `tsx` instance — never disturb the installed daemon).
+2. **Bump** `package.json` (the payload + HQ snapshot both carry it; it's how daemons detect updates — never publish two builds under the same version).
+3. **Build + publish payload**: `build:windows` → `publish:downloads -- --dry-run` → `publish:downloads`.
+4. **Export + push HQ from the same commit**: `export-public.mjs` → eyeball any new images (the scanner can't read them) → in `build/public-export/`, `git init` → commit **as "Empir3 Labs"** (never the maintainer's personal git identity — the export scanner hard-fails on it) → push to `empir3hq/empir3-bridge`.
+5. **Verify parity**: tray version == live `bridge-version.json` == HQ `package.json`.
+The `Empir3Setup.exe` distribution stays gated on Authenticode signing; publishing the payload updates already-installed daemons regardless.
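
Reviewer sketch (not part of the package): step 5 of the checklist in RELEASE.md above asks for parity between the tray version, the live `bridge-version.json`, and the HQ `package.json`. A minimal comparison might look like the following. The `VersionReport` shape and `findParityMismatches` are hypothetical, the three strings are assumed to have been collected by hand, and the version numbers are placeholders.

```typescript
// Sketch only: VersionReport and findParityMismatches are illustrative names.
interface VersionReport {
  tray: string;      // version line shown in the tray menu
  manifest: string;  // version reported by the live bridge-version.json
  hqPackage: string; // version in the public repo's package.json
}

// Treat the live manifest as the reference and list everything that differs.
function findParityMismatches(report: VersionReport): string[] {
  const mismatches: string[] = [];
  if (report.tray !== report.manifest) {
    mismatches.push(`tray ${report.tray} != manifest ${report.manifest}`);
  }
  if (report.hqPackage !== report.manifest) {
    mismatches.push(`HQ package.json ${report.hqPackage} != manifest ${report.manifest}`);
  }
  return mismatches;
}

// Placeholder versions: the HQ snapshot lags one patch behind the payload.
const report: VersionReport = { tray: "1.2.3", manifest: "1.2.3", hqPackage: "1.2.2" };
const problems = findParityMismatches(report);
console.log(problems.length === 0 ? "parity ok" : problems.join("\n"));
```

An empty result means both channels carry the same version; any entry points at the channel that was skipped.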

package/docs/SAFETY.md ADDED

@@ -0,0 +1,112 @@
+# Safety Model
+Empir3 Bridge can read browser state and, when explicitly enabled, operate pages and the desktop. This document explains the safety boundary.
+## Defaults
+The first-run default is read-heavy and write-light:
+- Read tools: enabled
+- Navigation tools: enabled
+- Browser click/type tools: disabled
+- Desktop mouse tools: disabled
+- JavaScript eval: disabled
+- Recording and replay tools: disabled
+Disabled tools are not sent to the chat model as available tools. The dispatcher also rejects disabled tool calls as a second layer of protection.
+## Visible Control State
+The dashboard at `http://localhost:3006` shows a `Control Safety` card:
+- `Read Only`: no write-capable tools are enabled.
+- `Write Enabled`: one or more click, type, desktop, eval, or recording tools are enabled.
+The current state is also available through:
+```bash
+npx tsx src/cli.ts safety-status
+```
+and through MCP:
+```text
+bridge_safety_status
+```
+## Revoke Control
+To disable all write-capable tools immediately:
+```bash
+npx tsx src/cli.ts revoke-control
+```
+or call the MCP tool:
+```text
+bridge_revoke_control
+```
+or press `Revoke Write Control` on the dashboard.
+This turns off:
+- browser clicks
+- browser typing
+- browser keypresses
+- desktop click, hover, and drag
+- JavaScript eval
+- recording and replay tools
+- overlay chat programmatic read/write tools
+Read tools and browser navigation remain enabled.
+## Local Network Boundary
+By default, the wrapper and CDP bridge bind to `127.0.0.1`.
+Chrome is launched with:
+```text
+--remote-debugging-address=127.0.0.1
+```
+The bridge is intended for local tools on your own machine. Do not expose it to the LAN or internet.
+## Data Boundary
+The bridge uses a dedicated Chrome profile:
+```text
+~/.empir3-bridge/profile/
+```
+It does not use your normal Chrome profile. Site logins inside the bridge profile are separate from your daily browser.
+Local data paths:
+- `~/.empir3-bridge/config.json`: settings
+- `~/.empir3-bridge/conversations/`: chat transcripts
+- `./feedback/`: screenshots and action feedback
+- `./recordings/`: saved replay flows
+These paths can contain sensitive page state if you use the bridge on private sites. Treat them accordingly.
+## When To Enable Desktop Tools
+Enable desktop tools only when you want an agent to operate the host desktop, not just Chrome.
+Useful cases:
+- desktop app smoke tests
+- multi-monitor screenshots
+- browser UI that cannot be reached through the DOM
+- canvas or game interactions
+- drag/drop testing
+Use `http://localhost:3006/desktop-test` before trying desktop click/drag on real windows.
+## Reporting Security Issues
+Do not open a public issue for security bugs. See [../SECURITY.md](../SECURITY.md).
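
Reviewer sketch (not part of the package): SAFETY.md above describes two layers of protection (disabled tools are not advertised, and the dispatcher rejects them anyway), a `Read Only` / `Write Enabled` control state, and a revoke action that leaves read tools on. The model below illustrates that logic. The `ToolState` shape and all four functions are hypothetical; only the tool names and the two state labels are taken from the docs.

```typescript
// Sketch only: ToolState and the helpers below are illustrative, not bridge code.
interface ToolState {
  name: string;
  writeCapable: boolean;
  enabled: boolean;
}

// Layer 1: only enabled tools are offered to the chat model.
function advertisedTools(tools: ToolState[]): string[] {
  return tools.filter((t) => t.enabled).map((t) => t.name);
}

// Layer 2: the dispatcher refuses unknown or disabled tools regardless.
function dispatch(tools: ToolState[], name: string): string {
  const tool = tools.find((t) => t.name === name);
  if (!tool || !tool.enabled) return `rejected: ${name} is not enabled`;
  return `dispatched: ${name}`;
}

// The dashboard card shows Write Enabled if any write-capable tool is on.
function controlState(tools: ToolState[]): "Read Only" | "Write Enabled" {
  return tools.some((t) => t.writeCapable && t.enabled) ? "Write Enabled" : "Read Only";
}

// Revoking control disables every write-capable tool and keeps read tools.
function revokeControl(tools: ToolState[]): ToolState[] {
  return tools.map((t) => (t.writeCapable ? { ...t, enabled: false } : t));
}

const tools: ToolState[] = [
  { name: "browser_text", writeCapable: false, enabled: true },
  { name: "browser_click", writeCapable: true, enabled: true },
  { name: "browser_evaluate", writeCapable: true, enabled: false },
];
const revoked = revokeControl(tools);
console.log(controlState(tools), "->", controlState(revoked));
console.log(advertisedTools(revoked));
console.log(dispatch(revoked, "browser_click"));
```

Keeping the check in both places means a stale tool list on the client side cannot by itself trigger a write.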

package/docs/TESTING.md ADDED

@@ -0,0 +1,181 @@
+# Testing The Bridge
+This guide is for maintainers, contributors, and agents making bridge changes.
+## Static Checks
+Run these before committing:
+```bash
+npx tsc --noEmit
+npm run build:mcp
+npm test
+git diff --check
+```
+Before a release or package change:
+```bash
+npm pack --dry-run
+```
+## Standard Smoke Test Plan
+Agents and maintainers should use the same quick smoke every time someone says
+"test the bridge." Do not skip `/desktop-test`; it is the shared harness for
+browser tools, desktop tools, calibration checks, recording, and playback.
+Open the live plan:
+```text
+http://localhost:3006/api/bridge-smoke-test-plan
+```
+Or print it from the CLI:
+```bash
+npx tsx src/cli.ts smoke-plan
+```
+Or open the visual harness:
+```text
+http://localhost:3006/desktop-test
+```
+Run the smoke in this order and stop after the first reproducible failure:
+1. Health: `status`, `reliability_status`, and `safety_status`.
+2. Overlay: navigate to `/desktop-test`, then run `bridge_overlay_reinject`.
+   Verify the chat bubble, cursor hook, and overlay transport are present.
+3. Browser tools: use `text`, `snapshot`, `screenshot`, `click #clickTarget`,
+   `type #nameInput`, `press Tab`, and scroll to `#scrollTarget`.
+4. Recording loop: `record_start`, click `#clickTarget`, `record_stop`,
+   list recordings, then play the saved recording once.
+5. Desktop tools: run `desktop_monitors`, `desktop_calibration_status`,
+   `desktop_cursor_position`, `desktop_screenshot_zoom`,
+   `desktop_focus_status`, and `desktop_release_focus`.
+6. Tray toolbar: run `desktop_toolbar status`, then `desktop_toolbar show`.
+Required selectors on the harness:
+```text
+#clickTarget
+#dragSource
+#dropTarget
+#nameInput
+#emailInput
+#notesInput
+#modeKeyboard
+#modeMouse
+#agreeBox
+#prioritySelect
+#submitForm
+#scrollTarget
+```
+## Basic Smoke
+Start the bridge:
+```bash
+npm start
+```
+In another shell:
+```bash
+npx tsx src/cli.ts status
+npx tsx src/cli.ts reliability-smoke
+npx tsx src/cli.ts safety-status
+```
+Expected:
+- status reports the bridge is running
+- reliability smoke passes
+- safety status reports either `read_only` or lists enabled write tools
+## Browser Smoke
+```bash
+npx tsx src/cli.ts desktop-test
+npx tsx src/cli.ts snapshot
+npx tsx src/cli.ts screenshot
+npx tsx src/cli.ts text
+```
+For write-capable browser tests, enable the relevant tool in settings first:
+```text
+http://localhost:3006/settings
+```
+Then use a harmless page before trying a real app.
+## Desktop Smoke
+Open the safe desktop test harness:
+```bash
+npx tsx src/cli.ts desktop-test
+```
+Or visit:
+```text
+http://localhost:3006/desktop-test
+```
+Useful checks:
+```bash
+npx tsx src/cli.ts desktop-monitors
+npx tsx src/cli.ts desktop-screenshot all
+npx tsx src/cli.ts desktop-hover 960 540 DISPLAY1
+```
+Only run `desktop-click` or `desktop-drag` when the test harness window is visible and positioned where the target coordinates are known. Blind drag tests can move windows or select real UI.
+## Parallel Bridge Smoke
+Use a separate profile and ports so you do not disturb the normal bridge:
+```bash
+EMPIR3_PW_PORT=3106 \
+EMPIR3_BRIDGE_HTTP_PORT=9967 \
+EMPIR3_CDP_PORT=9322 \
+EMPIR3_BRIDGE_PROFILE=$HOME/.empir3-bridge/profile-smoke \
+EMPIR3_BRIDGE_LABEL=SMOKE \
+npm start -- --fresh
+```
+Drive it:
+```bash
+BRIDGE_URL=http://localhost:3106 npx tsx src/cli.ts reliability-smoke
+BRIDGE_URL=http://localhost:3106 npx tsx src/cli.ts desktop-test
+```
+Stop it:
+```bash
+EMPIR3_PW_PORT=3106 \
+EMPIR3_BRIDGE_HTTP_PORT=9967 \
+EMPIR3_CDP_PORT=9322 \
+EMPIR3_BRIDGE_PROFILE=$HOME/.empir3-bridge/profile-smoke \
+EMPIR3_BRIDGE_LABEL=SMOKE \
+npm run kill
+```
+## What To Include In Bug Reports
+- OS and version
+- `node -v`
+- Chrome version
+- exact command run
+- `npm run status` output
+- `npx tsx src/cli.ts reliability-status` output
+- relevant screenshots or `feedback/` paths
+Do not paste API keys, site cookies, or private page data into public issues.
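
Reviewer sketch (not part of the package): the standard smoke plan in TESTING.md above runs six stages in a fixed order and stops after the first reproducible failure. A runner with that behaviour could be sketched as follows. `SmokeStep`, `SmokeResult` and `runSmoke` are hypothetical names, and each `run` callback is a stub standing in for the real CLI or MCP calls.

```typescript
// Sketch only: the step bodies are stubs; real steps would call the bridge.
interface SmokeStep {
  name: string;
  run: () => boolean; // true when the stage passes
}

interface SmokeResult {
  passed: string[];
  failedAt: string | null;
}

// Run stages in order and stop at the first one that fails.
function runSmoke(steps: SmokeStep[]): SmokeResult {
  const passed: string[] = [];
  for (const step of steps) {
    if (!step.run()) return { passed, failedAt: step.name };
    passed.push(step.name);
  }
  return { passed, failedAt: null };
}

// Stage names follow the ordered list in the testing guide.
const plan: SmokeStep[] = [
  { name: "Health", run: () => true },
  { name: "Overlay", run: () => true },
  { name: "Browser tools", run: () => false }, // simulated failure
  { name: "Recording loop", run: () => true },
  { name: "Desktop tools", run: () => true },
  { name: "Tray toolbar", run: () => true },
];
const result = runSmoke(plan);
console.log(result);
```

Stopping early keeps the report focused on one failure instead of a cascade of follow-on errors from later stages.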