@empir3/empir3-bridge 0.3.21

Files changed (62)
  1. package/CHANGELOG.md +1531 -0
  2. package/CODE_OF_CONDUCT.md +9 -0
  3. package/CONTRIBUTING.md +75 -0
  4. package/LICENSE +21 -0
  5. package/README.md +464 -0
  6. package/SECURITY.md +130 -0
  7. package/assets/accuracy-lab.html +2639 -0
  8. package/assets/api-clis-real.jpg +0 -0
  9. package/assets/bridge-console-hero.jpg +0 -0
  10. package/assets/browser-privacy.svg +151 -0
  11. package/assets/demo-orchestration.svg +74 -0
  12. package/assets/desktop-select-region.jpg +0 -0
  13. package/assets/in-page-chat.gif +0 -0
  14. package/assets/orchestration-hero.svg +126 -0
  15. package/assets/social-preview.png +0 -0
  16. package/assets/zara-accent.png +0 -0
  17. package/build/bootstrap.js +548 -0
  18. package/build/build.js +680 -0
  19. package/build/payload-entry.js +649 -0
  20. package/build/payload-signing-pub.json +7 -0
  21. package/docs/AGENT_GUIDE.md +259 -0
  22. package/docs/RELEASE.md +106 -0
  23. package/docs/SAFETY.md +112 -0
  24. package/docs/TESTING.md +181 -0
  25. package/installer/server.js +231 -0
  26. package/installer/ui/app.js +278 -0
  27. package/installer/ui/index.html +24 -0
  28. package/installer/ui/styles.css +146 -0
  29. package/package.json +95 -0
  30. package/scripts/bootstrap-e2e.mjs +650 -0
  31. package/scripts/certify-bridge.mjs +636 -0
  32. package/scripts/check-companion-surface.mjs +118 -0
  33. package/scripts/extract-welcome.mjs +64 -0
  34. package/scripts/gh-route-handler-check.mjs +57 -0
  35. package/scripts/gh-wire-test.mjs +107 -0
  36. package/scripts/publish-downloads.mjs +180 -0
  37. package/scripts/smoke-all-tools.mjs +509 -0
  38. package/scripts/smoke-live-bridge.mjs +696 -0
  39. package/scripts/splice-welcome.mjs +63 -0
  40. package/scripts/welcome-body.txt +2733 -0
  41. package/src/anthropic-client.ts +192 -0
  42. package/src/bootstrap-exe.ts +69 -0
  43. package/src/bridge.ts +2444 -0
  44. package/src/chat.ts +345 -0
  45. package/src/cli-runner.ts +239 -0
  46. package/src/cli.ts +649 -0
  47. package/src/config.ts +199 -0
  48. package/src/desktop-overlay.ps1 +121 -0
  49. package/src/executable-resolver.ts +330 -0
  50. package/src/handlers/agy-imagegen.ts +179 -0
  51. package/src/handlers/github-cli.ts +399 -0
  52. package/src/handlers/higgsfield-cli.ts +783 -0
  53. package/src/launch.js +337 -0
  54. package/src/mcp-server.ts +1265 -0
  55. package/src/pair-claim.ts +218 -0
  56. package/src/payload-daemon.ts +168 -0
  57. package/src/server.ts +21036 -0
  58. package/src/tool-defaults.ts +230 -0
  59. package/src/update-check.js +136 -0
  60. package/tray/build.py +76 -0
  61. package/tray/requirements.txt +2 -0
  62. package/tray/tray.py +1843 -0
package/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,9 @@
1
+ # Code of Conduct
2
+
3
+ This project adopts the [Contributor Covenant](https://www.contributor-covenant.org/version/2/1/code_of_conduct/), version 2.1, as its code of conduct.
4
+
5
+ In short: be kind, assume good faith, focus on the work. Disagreements about technical direction are welcome and encouraged. Personal attacks, harassment, or making the project a hostile place to participate are not.
6
+
7
+ If you experience or witness unacceptable behavior, or have any other concerns, please report it by emailing **conduct@empir3.com**. All reports will be handled confidentially. The project maintainers are responsible for clarifying standards and may take any action they deem appropriate, including warning the offender or banning them temporarily or permanently.
8
+
9
+ The full text of the Contributor Covenant 2.1 governs the specifics of what is expected and what is not, and how reports are reviewed. Read it at the link above.
package/CONTRIBUTING.md ADDED
@@ -0,0 +1,75 @@
1
+ # Contributing
2
+
3
+ Thanks for helping make Empir3 Bridge useful outside the Empir3 app.
4
+
5
+ ## Local Setup
6
+
7
+ ```bash
8
+ git clone https://github.com/empir3hq/empir3-bridge
9
+ cd empir3-bridge
10
+ npm install
11
+ npm start
12
+ ```
13
+
14
+ Useful commands:
15
+
16
+ ```bash
17
+ npm run status
18
+ npm run kill
19
+ npx tsx src/cli.ts reliability-smoke
20
+ npx tsx src/cli.ts safety-status
21
+ ```
22
+
23
+ ## Development Checks
24
+
25
+ Run before opening a PR:
26
+
27
+ ```bash
28
+ npx tsc --noEmit
29
+ npm run build:mcp
30
+ npm test
31
+ git diff --check
32
+ ```
33
+
34
+ For packaging-related changes:
35
+
36
+ ```bash
37
+ npm pack --dry-run
38
+ ```
39
+
40
+ ## Good First Areas
41
+
42
+ - Cross-platform Chrome detection.
43
+ - Better install and first-run messages.
44
+ - More deterministic smoke tests.
45
+ - Desktop tool polish on macOS and Linux.
46
+ - Diagnostic quality for failing CDP commands.
47
+ - Docs that help a fresh Claude Code or Codex user succeed quickly.
48
+ - Worked examples for the **API & CLIs** pane: a recipe per provider (Ollama on `localhost:11434`, LM Studio, OpenRouter, vLLM, a self-hosted gateway) so the `openai_chat` dispatcher is approachable without trial-and-error.
49
+
50
+ ## PR Guidelines
51
+
52
+ - Keep changes scoped to one concern.
53
+ - Include verification steps.
54
+ - Update README or docs for user-facing behavior.
55
+ - Avoid new dependencies unless they remove real complexity.
56
+ - Do not commit `feedback/`, `recordings/`, `node_modules/`, local profiles, or API keys.
57
+
58
+ ## Bug Reports
59
+
60
+ Please include:
61
+
62
+ - OS and version
63
+ - `node -v`
64
+ - Chrome version
65
+ - command you ran
66
+ - what you expected
67
+ - what happened
68
+ - `npm run status` output
69
+ - `npx tsx src/cli.ts reliability-status` output
70
+
71
+ Review logs and screenshots for private data before posting publicly.
72
+
73
+ ## Security
74
+
75
+ Do not open a public issue for security problems. Email security@empir3.com or use a private GitHub security advisory.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Empir3
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,464 @@
1
+ # Empir3 Bridge
2
+
3
+ ### Your AI subscriptions, finally working as a team.
4
+
5
+ Codex, Grok, Gemini, and Claude — the seats you're **already paying for** — wired into a single conversation where one agent quarterbacks the rest. Have Codex build it, Grok draft it, Gemini check it, Claude tie it all together, and pull every answer back into one thread.
6
+
7
+ **No API keys handed out. No per-token meter running.** Just the flat-rate CLIs you're already signed into — suddenly collaborating instead of sitting in four lonely terminals talking to nobody.
8
+
9
+ Then the team acts where chat can't: a real browser and your live desktop, driven by **pixel-accurate, OS-level clicks** that map page coordinates to physical screen pixels — landing on the canvas, trusted-event, and React-Native-Web UIs that synthetic automation can't touch.
10
+
11
+ **Three ways to drive it — MCP, CLI, or WebSocket.** MCP for the agents you talk to, CLI for your terminal and scripts, or WebSocket via [empir3.com](https://empir3.com) for paired remote control. Same engine behind all three — pick the door that fits.
12
+
13
+ ![Terminal demo — one prompt ("Have Codex scaffold the API, Gemini review the diff, then open the dashboard in a real browser and screenshot it") fans out to Codex on your OpenAI seat, Gemini on your Google seat, and the bridge's own Chrome, with every result reported back into one thread — no API keys, no per-token meter](assets/demo-orchestration.svg)
14
+
15
+ <sub>One prompt, the whole team — Codex on your OpenAI seat, Gemini on your Google seat, the browser on this machine, all in one thread. ([how it's wired ↓](#architecture))</sub>
16
+
17
+ ![One agent in any MCP client delegates through the local Empir3 Bridge to your Codex, Grok, Gemini, and Claude seats — plus a dedicated Chrome and a scoped region of your desktop](assets/orchestration-hero.svg)
18
+
19
+ Empir3 Bridge turns any MCP client (Claude Code, Codex, Cursor, …) into an agent that can:
20
+
21
+ 1. **Run a whole AI team from one chat.** Orchestrate the other CLIs you're already signed into — Codex, Grok, Gemini, Claude — from a single conversation, with **no API keys handed out and no extra cost**. Have Codex build, Grok draft, Gemini review, and pull the answers back into the thread you're already in. The bridge runs each CLI under your own subscription and quietly handles its invocation quirks.
22
+ 2. **Drive a real Chrome — walled off from your personal one.** The bridge runs its own dedicated Chrome profile that holds only the logins and history *you* give it, never your everyday browsing. Interact with accessibility-tree refs *and* real OS-level clicks that map page coordinates to physical pixels (so trusted-event-gated, canvas, and React-Native-Web UIs that ignore synthetic clicks still work).
23
+ 3. **Operate your desktop** — DPI-aware mouse and screenshots across multiple monitors, with per-display click calibration.
24
+ 4. **Generate images and video** — 40+ models through the Higgsfield CLI, discovered at runtime.
25
+
26
+ It runs entirely on your machine, keeps its own Chrome profile, binds its control servers to `127.0.0.1`, and **starts with every write-capable control off by default**. A visible overlay and ghost cursor show you exactly what the agent is targeting, and one toggle revokes control.
27
+
28
+ <img align="right" width="92" src="assets/zara-accent.png" alt="Zara, one of the Empir3 agents">
29
+
30
+ It's the same bridge pattern used inside [Empir3](https://empir3.com), open-sourced so developers can use it directly. Empir3 pairing is optional: local MCP use works without an account, and the remote relay only turns on after you explicitly pair this PC.
31
+
32
+ It all runs from one local console — what's connected, which capabilities are enabled, and a live log of every action the agent takes:
33
+
34
+ ![The Empir3 Bridge console running locally at 127.0.0.1:3006 — daemon running, MCP ready, 70/70 permissions, read/write/exec safety toggles, and a live activity log, all running on this PC with no account required](assets/bridge-console-hero.jpg)
35
+
36
+ ### One prompt, the whole team moving
37
+
38
+ > *"Have Codex scaffold the API, Gemini review the diff, then open the dashboard in a real browser and screenshot it for me."*
39
+
40
+ Codex writes under your OpenAI seat. Gemini reviews under your Google seat. The bridge drives its own Chrome to grab the shot — all from the single agent you started talking to. No keys copied, no tokens metered, no tab-switching, no four-terminals juggling act. That's the bridge.
41
+
42
+ ## Why It's Different
43
+
44
+ Plenty of tools let an agent click around a browser. Almost none let your agents work *together*. That's the whole point of the bridge:
45
+
46
+ - **A team of models, not a lonely one.** This is the headline. `cli_run` lets the agent you're talking to delegate out to Codex / Grok / Gemini / Claude — each playing to its strengths, all on the subscriptions you already pay for. No API keys, no per-token billing, no glue code. `cli_status` tells the agent which models are ready *before* it tries, so a task always routes to a model that will actually run. One conversation, four minds.
47
+ - **A real browser, walled off from yours.** The bridge runs its own dedicated Chrome profile — separate from your personal browser — that only ever stores the passwords, cookies, and history *you* choose to give it. Your everyday browsing stays private and untouched. It's real Chrome (not a headless renderer), so sites work exactly as they should and the logins you grant it persist across runs.
48
+ - **Chat with the agent on the page itself — point, don't describe.** A chat panel drops right onto whatever page the bridge is driving. Instead of typing *"the blue button near the top-right,"* you **point at it**: pin a numbered note to any element, draw on the layout, or snap a screenshot — and it all goes to the agent as context. Talk to Claude standalone, or your whole Empir3 team if paired. The agent acts on the same page in front of you, so you watch the change land instead of alt-tabbing to a terminal.
49
+ - **You choose what it sees on your desktop.** Desktop control is the scary part of any agent, so the bridge inverts it. Instead of handing over your whole screen, **drag a box around just the app or area you want help with.** Screenshots and snapshots auto-scope to that box, marked by a visible on-screen frame; everything outside it stays private. The box **stays as long as you're using it**: the 30-minute timer is idle-only and resets on every action, so active work never loses scope and you can keep it open indefinitely for a long watch. **Close it anytime by clicking the ✕ on the box itself.**
50
+ - **Trusted, OS-level clicks.** `desktop_click_page` maps a page element to a physical screen pixel (content-window origin + DPR + per-display calibration) and fires a real hardware click — the thing synthetic CDP clicks can't do on trusted-event-gated UIs.
51
+ - **Visible and governed by default.** Read-only out of the box; a click-through overlay + ghost cursor show what's being targeted; action receipts, per-capability lend toggles, and one-click revoke keep the human in control.
52
+
53
+ Plus the table stakes: accessibility-ref interaction, multi-monitor desktop control, a safe `/desktop-test` harness, and a tray app with signed payload updates and version status.
54
+
55
+ ### Orchestrate the CLIs you already pay for
56
+
57
+ Every AI CLI you're signed into, discovered and authed by the bridge — flip one toggle to lend it to the agent. No API keys, no per-token billing.
58
+
59
+ ![The bridge console's API & CLIs pane: Claude Code, OpenAI Codex, Gemini, Grok, Antigravity, Higgsfield, and GitHub CLI, each shown as authed with a per-CLI lend toggle](assets/api-clis-real.jpg)
60
+
61
+ ### A real browser, walled off from yours
62
+
63
+ The bridge drives its own dedicated Chrome profile — your everyday browser, logins, and history stay private and untouched.
64
+
65
+ ![Your personal Chrome on the left holds your real logins and history, untouched; the bridge's dedicated Chrome on the right holds only the logins you grant it and is the one the agent drives — separated by a wall](assets/browser-privacy.svg)
66
+
67
+ ### Chat with the agent right on the page
68
+
69
+ Press one shortcut (`Ctrl+Shift+C`) and a chat panel drops onto whatever page the bridge is driving. Talk to Claude there — and instead of *describing* what you mean, **point at it**:
70
+
71
+ - **📌 Annotate** — click any element to pin a numbered note to it. Your comment, the element, and its exact selector go to the agent as structured context.
72
+ - **✎ Draw** — scribble on the layout to circle, cross out, or sketch the change you want.
73
+ - **📷 Snap** — capture the current view and send it along.
74
+
75
+ The agent acts on the **same page you're looking at** — clicking, typing, navigating, even **generating a fresh image and dropping it straight in** — while a visible overlay and ghost cursor show exactly what it's targeting. No alt-tabbing to a terminal, no describing in words what you could just point at. Standalone, it's you and Claude; paired with Empir3, the panel talks to your whole team.
76
+
77
+ ![The in-page chat panel open on a sample travel-brand landing page: the user pins a note on a bland stock-photo hero — "wrong photo, generate the real fjord" — and asks to regenerate it; the agent runs image generation (Higgsfield) and swaps the dull meadow for a dramatic golden-hour fjord-cabin shot, right there on the page](assets/in-page-chat.gif)
78
+
79
+ ### You choose what it sees on your desktop
80
+
81
+ Drag a box around just the app you want help with. Screenshots and snapshots auto-scope to that box; everything outside it stays private. Close it anytime by clicking the ✕ on the box.
82
+
83
+ ![Selecting a desktop region to share: the "Select an area to share with the agent" prompt with a green rectangle drawn around a single app — only what's inside the box becomes visible to the bridge](assets/desktop-select-region.jpg)
84
+
85
+ ## Install
86
+
87
+ **Point your agent at this repo and tell it to install.** Claude Code, Codex, Cursor — any agent with a terminal — will clone it, install dependencies, and start the bridge from source in about a minute. Or do it yourself with the [60-second Quickstart](#60-second-developer-quickstart) below. It runs entirely on your machine — no account required.
88
+
89
+ > **A packaged Windows installer is coming.** A one-click `Empir3Setup.exe` — a tiny bootstrapper that fetches an Ed25519-signed payload, installs the tray app, and starts on login — is built by the release pipeline but **isn't recommended for general use yet** while we finish Authenticode code-signing (Azure Trusted Signing, pending domain approval). Until it's signed, unsigned builds can trip Windows SmartScreen and antivirus false-positives, so for now install from source (above). This note goes away once signing is live.
90
+
91
+ Windows-first. The bridge core, browser tools, and CLI work on macOS and Linux, but the full desktop tool surface (UIA snapshot, pointer overlay, calibration, native screenshot grids) is Windows-only today.
92
+
93
+ ## 60-Second Developer Quickstart
94
+
95
+ Clone it, install, and run — about sixty seconds:
96
+
97
+ ```bash
98
+ git clone https://github.com/empir3hq/empir3-bridge
99
+ cd empir3-bridge
100
+ npm install
101
+ npm start
102
+ ```
103
+
104
+ Open the dashboard:
105
+
106
+ ```text
107
+ http://localhost:3006
108
+ ```
109
+
110
+ Then add the bridge to a Claude Code project with `.mcp.json`:
111
+
112
+ ```json
113
+ {
114
+ "mcpServers": {
115
+ "empir3-bridge": {
116
+ "type": "stdio",
117
+ "command": "npx",
118
+ "args": ["tsx", "<path-to-bridge>/src/mcp-server.ts"]
119
+ }
120
+ }
121
+ }
122
+ ```
123
+
124
+ Try a browser task:
125
+
126
+ ```text
127
+ Use the browser bridge to open example.com and take a screenshot.
128
+ ```
129
+
130
+ Or put the headline to work — one agent driving another, using a CLI you already pay for:
131
+
132
+ ```text
133
+ Use cli_status to see which CLIs are ready, then use cli_run to have Gemini summarize this README.
134
+ ```
135
+
136
+ ## What You Get
137
+
138
+ ### Orchestrate Other AI CLIs (4 tools)
139
+
140
+ The headline capability: drive the coding CLIs you're already logged into, from the agent you're already talking to. Toggle each CLI on in the welcome console (**API & CLIs** pane); the bridge runs it under your own subscription — no API keys leave your machine.
141
+
142
+ - `cli_status` — which lent CLIs are ready *right now* (installed + lent + authenticated), one row per model with the blocker if not ready. Call this first to route a task to a model that will actually run.
143
+ - `cli_run` — run a lent CLI (`codex` / `grok` / `gemini` / `claude`) with a prompt and get its text back. `mode:"text"` returns the answer read-only; `mode:"agentic"` lets it write files in a working dir. Pass `background:true` for long runs.
144
+ - `cli_runs`, `cli_run_status` — list invocations and poll a background run to completion, each with a saved transcript path.
145
+
146
+ Example prompt: *"Use cli_run to have Codex scaffold the endpoint, then have Gemini review the diff."* One agent, multiple models, your seats — no keys handed out.
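
The readiness rule described above (installed + lent + authenticated) can be sketched as a small routing helper. This is only an illustration: the `CliRow` shape and the `isReady`, `blocker`, and `pickReadyCli` names are hypothetical, and the real `cli_status` response format is not shown in this README.

```typescript
// Hypothetical shape of one cli_status row; the real response may differ.
type CliName = "codex" | "grok" | "gemini" | "claude";

interface CliRow {
  name: CliName;
  installed: boolean;
  lent: boolean;
  authenticated: boolean;
}

// A CLI is ready only when all three conditions hold.
function isReady(row: CliRow): boolean {
  return row.installed && row.lent && row.authenticated;
}

// Name the first unmet condition so the agent can report a blocker.
function blocker(row: CliRow): string | null {
  if (!row.installed) return "not installed";
  if (!row.lent) return "not lent";
  if (!row.authenticated) return "not authenticated";
  return null;
}

// Return the first preferred CLI that is ready, or null if none is.
function pickReadyCli(rows: CliRow[], preferred: CliName[]): CliName | null {
  for (const name of preferred) {
    const row = rows.find((r) => r.name === name);
    if (row && isReady(row)) return name;
  }
  return null;
}

const rows: CliRow[] = [
  { name: "codex", installed: true, lent: false, authenticated: true },
  { name: "gemini", installed: true, lent: true, authenticated: true },
];
console.log(pickReadyCli(rows, ["codex", "gemini"])); // gemini
console.log(blocker(rows[0])); // not lent
```

Checking readiness before dispatch is what lets an agent fall back to a second model instead of failing on an unlent or signed-out CLI.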
147
+
148
+ ### Browser Control (15 tools)
149
+
150
+ - `browser_status`, `browser_navigate`, `browser_refresh`
151
+ - `browser_screenshot`, `browser_snapshot`, `browser_text`
152
+ - `browser_click`, `browser_click_ref`, `browser_click_xy`
153
+ - `browser_type`, `browser_type_ref`, `browser_press`, `browser_scroll`
154
+ - `browser_highlight`, `browser_evaluate`
155
+
156
+ ### Desktop Control (29 tools)
157
+
158
+ Mouse and screenshot primitives:
159
+
160
+ - `desktop_monitors`, `desktop_cursor_position`
161
+ - `desktop_screenshot` — supports `region:{x,y,width,height}` for native-res crops and `grid:true` to overlay a coordinate grid on the saved image (useful for vision-coord targeting on CEF/Electron apps where UIA is blind)
162
+ - `desktop_screenshot_zoom` — zoomed-in slice of the desktop for fine pointing
163
+ - `desktop_click`, `desktop_hover`, `desktop_drag`
164
+
165
+ UI Automation snapshots (Windows):
166
+
167
+ - `desktop_snapshot` — enumerate visible interactive elements via UIA; returns refs `d0..dN`
168
+ - `desktop_snapshot_som` — set-of-marks overlay variant for vision models
169
+ - `desktop_click_ref`, `desktop_hover_ref` — operate on snapshot refs instead of pixel coords
170
+ - `desktop_overlay` — toggle click-through labeled-box overlay over the snapshot
171
+
172
+ Agent-focus region (gesture "help me here" instead of granting whole-desktop control):
173
+
174
+ - `desktop_select_region` — user drags a rectangle; subsequent screenshot/snapshot calls auto-scope to it (30-min TTL). A click-through chip anchored to the region tells the user focus is active.
175
+ - `desktop_release_focus`, `desktop_focus_status`
176
+ - `desktop_focus_grid` — overlay a labeled coordinate grid on the focused region
177
+ - `desktop_click_cell`, `desktop_pointer_cell` — click or move to a named grid cell (e.g. `B3`)
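
The 30-minute TTL on the focus region is described elsewhere in this README as idle-only, resetting on every action. A minimal sketch of that behavior follows, assuming a hypothetical `FocusRegion` class with `touch` and `isExpired` methods; none of these names come from the bridge source.

```typescript
// Sketch of an idle-only timeout: the deadline moves forward on every action.
const IDLE_TTL_MS = 30 * 60 * 1000; // 30 minutes, per the README

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

class FocusRegion {
  rect: Rect;
  private lastActionAt: number;

  constructor(rect: Rect, now: number) {
    this.rect = rect;
    this.lastActionAt = now;
  }

  // Call on every screenshot, snapshot, or click scoped to the region.
  touch(now: number): void {
    this.lastActionAt = now;
  }

  // Expired only when no action has happened for the full TTL.
  isExpired(now: number): boolean {
    return now - this.lastActionAt >= IDLE_TTL_MS;
  }
}

const minute = 60 * 1000;
const region = new FocusRegion({ x: 100, y: 100, width: 800, height: 600 }, 0);
region.touch(29 * minute); // an action just before the original deadline
console.log(region.isExpired(45 * minute)); // false (16 idle minutes)
console.log(region.isExpired(59 * minute)); // true (30 idle minutes)
```

Timestamps are passed in rather than read from the clock so the expiry rule can be exercised without waiting.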
178
+
179
+ On-screen pointer hint (a visible cursor agents can show before clicking):
180
+
181
+ - `desktop_pointer_show`, `desktop_pointer_move`, `desktop_pointer_pulse`, `desktop_pointer_hide`, `desktop_pointer_status`
182
+
183
+ Pointer calibration (correct for per-display offset between OS cursor and rendered visuals):
184
+
185
+ - `desktop_calibrate_pointer`, `desktop_calibration_status`, `desktop_pick_point`
186
+
187
+ Browser-page → physical-screen clicks (drive the bridge's own Chrome page with real OS-level input):
188
+
189
+ - `page_to_screen` — inspect-only: resolve a page element (CSS selector, snapshot ref, or `cssX,cssY`) to its physical virtual-screen pixel, the calibrated click coordinate, the content-window origin, and devicePixelRatio. Use it to verify where a click will land before firing one.
190
+ - `desktop_click_page` — a real OS-level mouse click on an element in the bridge's own Chrome page, mapped page→screen (content-window origin + DPR + per-display calibration). Use it for trusted-event-gated, drag-handle, and native-feel widgets that ignore synthetic clicks.
191
+ - `desktop_pointer_page` — show the click-through ghost cursor on a page element (visual-only, no click).
192
+
193
+ Desktop tools use DPI-aware physical virtual-screen coordinates. Multi-monitor layouts with negative coordinates are supported.
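
The page-to-screen mapping named above (content-window origin + DPR + per-display calibration) can be illustrated with a little arithmetic. The formula below is an assumption about how those three inputs combine, and `pageToScreen` and its field names are hypothetical; the bridge's actual implementation may differ.

```typescript
interface Point {
  x: number;
  y: number;
}

interface PageToScreenInput {
  css: Point; // element position in CSS pixels, relative to the viewport
  contentOrigin: Point; // top-left of the page content area, in physical screen pixels
  devicePixelRatio: number; // physical pixels per CSS pixel
  calibration: Point; // per-display correction, in physical pixels
}

// Assumed mapping: scale CSS pixels by DPR, offset by the content origin,
// then apply the calibration offset to get the coordinate to click.
function pageToScreen(input: PageToScreenInput): { physical: Point; click: Point } {
  const physical = {
    x: Math.round(input.contentOrigin.x + input.css.x * input.devicePixelRatio),
    y: Math.round(input.contentOrigin.y + input.css.y * input.devicePixelRatio),
  };
  const click = {
    x: physical.x + input.calibration.x,
    y: physical.y + input.calibration.y,
  };
  return { physical, click };
}

// A monitor to the left of the primary display has negative x coordinates.
const result = pageToScreen({
  css: { x: 200, y: 100 },
  contentOrigin: { x: -1920, y: 80 },
  devicePixelRatio: 1.5,
  calibration: { x: 2, y: -1 },
});
console.log(result.physical); // { x: -1620, y: 230 }
console.log(result.click); // { x: -1618, y: 229 }
```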
194
+
195
+ ### In-Page Chat, Annotation & Recording (6 tools)
196
+
197
+ The bridge injects a chat panel into every page its Chrome loads — see [Chat with the agent right on the page](#chat-with-the-agent-right-on-the-page). The agent reads and writes that panel through these tools, and annotations (pinned element notes), drawings, and screenshots the user makes there arrive as context on the next message.
198
+
199
+ - `browser_chat`, `browser_read_chat` — post to and read the in-page panel.
200
+ - `browser_record_start`, `browser_record_stop`, `browser_play`, `browser_recordings` — record a flow of clicks/types/scrolls and replay it later.
201
+
202
+ ### Reliability And Safety (6 tools)
203
+
204
+ - `bridge_tool_advisor` — agent asks "what tool should I use for X?", bridge answers with the right one and shows current safety state
205
+ - `bridge_reliability_status`, `bridge_reliability_smoke`, `bridge_action_log`
206
+ - `bridge_safety_status`, `bridge_revoke_control`
207
+
208
+ These exist so an agent can diagnose the bridge before taking action, and so the user can see or revoke write-capable controls.
209
+
210
+ ### Generative Media & Custom Models (5 tools)
211
+
212
+ The bridge is also a thin gateway to image/video generation and to any custom model you have set up locally. Configure them once in the welcome console (`http://localhost:3006/welcome`, **API & CLIs** pane):
213
+
214
+ - `higgsfield_models` — list the available Higgsfield models (40+, typed `image` / `video` / `text`) so the agent picks a valid id at runtime instead of guessing from a hard-coded list.
215
+ - `higgsfield_status`, `higgsfield_list`, `higgsfield_generate` — generate via the Higgsfield CLI; `higgsfield_generate` is self-documenting (an unknown model id returns the live catalog). Output lands in `~/.empir3-bridge/artifacts/higgsfield/`.
216
+ - `custom_llm` — generic dispatcher for any OpenAI-compatible model you've configured (Ollama, LM Studio, OpenRouter, vLLM, your own server). Routes by `provider` slug; registered only once at least one custom provider exists.
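
Routing by `provider` slug to an OpenAI-compatible endpoint might look like the sketch below. The registry, the `buildChatRequest` helper, and the `llama3` model name are hypothetical examples. The Ollama URL uses its OpenAI-compatible `/v1` path on the `localhost:11434` port mentioned in CONTRIBUTING.md.

```typescript
// Hypothetical provider registry; slugs, URLs, and model names are examples.
interface Provider {
  baseUrl: string; // OpenAI-compatible base URL, typically ending in /v1
  model: string;
  apiKey?: string; // local servers such as Ollama usually need none
}

const providers: Record<string, Provider> = {
  ollama: { baseUrl: "http://localhost:11434/v1", model: "llama3" },
};

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Look up the provider by slug and build a chat-completions request for it.
function buildChatRequest(slug: string, prompt: string): ChatRequest {
  const provider = providers[slug];
  if (!provider) {
    const known = Object.keys(providers).join(", ");
    throw new Error(`Unknown provider "${slug}". Known providers: ${known}`);
  }
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (provider.apiKey) headers["Authorization"] = `Bearer ${provider.apiKey}`;
  return {
    url: `${provider.baseUrl}/chat/completions`,
    headers,
    body: JSON.stringify({
      model: provider.model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

const request = buildChatRequest("ollama", "Summarize this README.");
console.log(request.url); // http://localhost:11434/v1/chat/completions
```

The returned object can be passed to `fetch(request.url, { method: "POST", headers: request.headers, body: request.body })`. Building the request separately from sending it keeps the routing logic testable without a running model server.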
217
+
218
+ The **API & CLIs** pane is also where you flip the **lend toggles** that power the [CLI orchestration tools](#orchestrate-other-ai-clis-4-tools) above — Codex, Grok, Gemini, Claude, and Higgsfield. Each row shows install status, auth status, and a one-click auth-launch that opens the CLI in a console with the project cwd. See [docs/AGENT_GUIDE.md](docs/AGENT_GUIDE.md) for the full integration model.
219
+
220
+ ## Control And Trust Model
221
+
222
+ Empir3 Bridge is powerful software. Treat it like a local automation driver, not a passive browser extension.
223
+
224
+ - Local MCP mode is the default. The bridge listens on `127.0.0.1`, launches a dedicated Chrome profile, and exposes tools only to local clients configured to talk to it.
225
+ - Paired Empir3 mode is opt-in. Pairing stores a bridge token locally and opens a websocket relay to Empir3 so approved remote agents can send commands to this PC.
226
+ - Local permissions are still enforced. Read, write, execute, desktop, eval, recording, handler-family, and CLI-lending controls live on the device and can be toggled from the welcome console.
227
+ - Browser-origin requests are hardened with a per-launch nonce. Cross-origin browser writes and overlay websocket connections must present the bridge nonce instead of relying on open localhost access.
228
+ - The tray is part of the safety surface. It shows the running version, relay/account state, update status, logs, and quick actions for reconnect, sign-out, uninstall, and clean quit.
229
+ - Sensitive outputs stay local by default: screenshots, recordings, logs, transcripts, generated artifacts, provider keys, and bridge auth live in local data paths listed below.
230
+
231
+ ## Safety Model
232
+
233
+ Empir3 Bridge is a control surface, so the default posture is conservative.
234
+
235
+ - Read tools are on by default.
236
+ - Navigation tools are on by default.
237
+ - Page interaction tools are off by default.
238
+ - Desktop mouse tools are off by default.
239
+ - JavaScript eval is off by default.
240
+ - Recording and replay tools are off by default.
241
+ - Empir3 relay is off until this PC is paired.
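
The defaults listed above can be pictured as a plain settings object. This is a sketch only: the field names are hypothetical rather than the keys used in `bridge-settings.json`, and which capabilities count as write-capable for a revoke is an assumption (navigation is left untouched here).

```typescript
// Capability names are illustrative; the default values mirror the list above.
interface SafetyState {
  read: boolean;
  navigation: boolean;
  pageInteraction: boolean;
  desktopMouse: boolean;
  jsEval: boolean;
  recording: boolean;
  relay: boolean;
}

const DEFAULTS: SafetyState = {
  read: true,
  navigation: true,
  pageInteraction: false,
  desktopMouse: false,
  jsEval: false,
  recording: false,
  relay: false, // off until this PC is paired
};

// Assumption: these four are the write-capable capabilities a revoke turns off.
const WRITE_CAPABLE: (keyof SafetyState)[] = [
  "pageInteraction",
  "desktopMouse",
  "jsEval",
  "recording",
];

// Return a copy of the state with every write-capable capability disabled.
function revokeControl(state: SafetyState): SafetyState {
  const next = { ...state };
  for (const key of WRITE_CAPABLE) next[key] = false;
  return next;
}

const lent: SafetyState = { ...DEFAULTS, pageInteraction: true, desktopMouse: true };
const revoked = revokeControl(lent);
console.log(revoked.desktopMouse); // false
console.log(revoked.read); // true
```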
242
+
243
+ Open the welcome console:
244
+
245
+ ```text
246
+ http://localhost:3006/welcome
247
+ ```
248
+
249
+ Check current control state:
250
+
251
+ ```bash
252
+ npx tsx src/cli.ts safety-status
253
+ ```
254
+
255
+ ## Release Builds
256
+
257
+ Windows release artifacts are built from this repository:
258
+
259
+ ```bash
260
+ npm run build:windows
261
+ ```
262
+
263
+ That writes artifacts under `build/dist/`:
264
+
265
+ - `Empir3Setup.exe`
266
+ - `bridge-payload-vX.Y.Z.tar.gz`
267
+ - `bridge-payload-vX.Y.Z.sig`
268
+ - `bridge-version.json`
269
+ - `empir3-bridge.crx`
270
+ - `empir3-bridge-update.xml`
271
+
272
+ The payload version comes from `package.json`. Keep the tray label, manifest, download metadata, and release notes tied to that version. See `docs/RELEASE.md`.
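
The `.sig` file next to the payload, together with the Install section's mention of an Ed25519-signed payload, suggests a detached-signature check. The sketch below shows the general technique with Node's built-in `crypto` module. It generates a throwaway key pair for the demo and is not the bridge's actual verification code; `isPayloadTrusted` is a hypothetical name.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Throwaway key pair for the demo. In a real release flow the private key
// stays in the release pipeline and only the public key ships with the app.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Stand-in bytes for the payload archive.
const payload = new TextEncoder().encode("bridge-payload contents");

// Ed25519 does its own hashing, so the algorithm argument is null.
const signature = sign(null, payload, privateKey);

// Accept the payload only if the detached signature verifies.
function isPayloadTrusted(data: Uint8Array, sig: Uint8Array): boolean {
  return verify(null, data, publicKey, sig);
}

const tampered = new TextEncoder().encode("bridge-payload contents!");
console.log(isPayloadTrusted(payload, signature)); // true
console.log(isPayloadTrusted(tampered, signature)); // false
```

A single changed byte in the payload makes verification fail, which is why the signature is checked before anything is unpacked or installed.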
273
+
274
+ Disable all write-capable tools immediately:
275
+
276
+ ```bash
277
+ npx tsx src/cli.ts revoke-control
278
+ ```
279
+
280
+ The dashboard also shows a visible `Control Safety` card and has a `Revoke Write Control` button.
281
+
282
+ Read the full safety notes in [docs/SAFETY.md](docs/SAFETY.md).
283
+
284
+ ## Test The Bridge Safely
285
+
286
+ The bridge ships with a local test harness for click, hover, and drag accuracy:
287
+
288
+ ```text
289
+ http://localhost:3006/desktop-test
290
+ ```
291
+
292
+ Or open it from the CLI:
293
+
294
+ ```bash
295
+ npx tsx src/cli.ts desktop-test
296
+ ```
297
+
298
+ This page gives agents safe targets for desktop hover, click, and drag tests without moving your real windows around. See [docs/TESTING.md](docs/TESTING.md).
299
+
300
+ ## Use Standalone From CLI
301
+
302
+ ```bash
303
+ npx tsx src/cli.ts status
304
+ npx tsx src/cli.ts navigate "https://example.com"
305
+ npx tsx src/cli.ts snapshot
306
+ npx tsx src/cli.ts click-ref "e5"
307
+ npx tsx src/cli.ts click-xy 500 320
308
+ npx tsx src/cli.ts type-ref "e3" "hello"
309
+ npx tsx src/cli.ts screenshot
310
+ npx tsx src/cli.ts text
311
+ npx tsx src/cli.ts desktop-monitors
312
+ npx tsx src/cli.ts desktop-screenshot all
313
+ npx tsx src/cli.ts reliability-smoke
314
+ ```
315
+
316
+ Run with no args for the full command list.
317
+
318
+ ## Architecture
319
+
320
+ ```mermaid
321
+ flowchart LR
322
+ A["Claude Code / Codex / MCP Client"] -->|stdio| B["MCP Server (auto-launched on first client connect)"]
323
+ C["CLI / HTTP Client"] --> D["HTTP Wrapper :3006"]
324
+ B --> D
325
+ D --> E["CDP Bridge :9867"]
326
+ E --> F["Chrome with dedicated profile"]
327
+ D --> G["Local dashboard + overlay :3006/welcome"]
328
+ D --> H["Desktop tools on host OS"]
329
+ D --> I["CLI orchestration + media (cli_run → lent Codex/Grok/Gemini/Claude, higgsfield_*, custom_llm)"]
330
+ D -.-> J["Optional Empir3 relay websocket"]
331
+ ```
332
+
333
+ The MCP server is a thin stdio shim. On the first connection from a Claude Code / Codex / Cursor client it boots the HTTP wrapper and the CDP bridge if they aren't already running, so you don't have to babysit two processes — installing the `.mcp.json` block is enough.
334
+
335
+ Core files:
336
+
337
+ - `src/launch.js`: starts and stops the bridge process group.
338
+ - `src/bridge.ts`: talks to Chrome through CDP.
339
+ - `src/server.ts`: HTTP/WebSocket wrapper, dashboard, settings, safety, desktop tools.
340
+ - `src/mcp-server.ts`: MCP tool server.
341
+ - `src/cli.ts`: scriptable local CLI.
342
+
343
+ ## Local-Only Network Defaults
344
+
345
+ The bridge binds its wrapper and CDP HTTP server to `127.0.0.1` by default. Chrome remote debugging is also launched with `--remote-debugging-address=127.0.0.1`.
346
+
347
+ This means the bridge is intended for local agents on your machine, not LAN or internet access. Paired Empir3 mode uses an outbound websocket to Empir3; it does not expose the local bridge as a public server.
348
+
349
+ The bridge also stamps a per-launch nonce into its controlled welcome/overlay surfaces. Browser-origin write requests and overlay websocket connections from non-local origins must present that nonce.
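A minimal sketch of the nonce rule described above, under stated assumptions: the function names, the set of hostnames treated as local, and the choice to let requests without an `Origin` header through are all hypothetical and are not taken from `src/server.ts`.

```typescript
import { timingSafeEqual } from "node:crypto";

// Hostnames treated as local in this sketch (an assumption).
const LOCAL_HOSTS = new Set(["127.0.0.1", "localhost", "[::1]"]);

function isLocalOrigin(origin: string | undefined): boolean {
  // Assumption: non-browser clients (CLI, curl) send no Origin header.
  if (!origin) return true;
  try {
    return LOCAL_HOSTS.has(new URL(origin).hostname);
  } catch {
    return false; // malformed Origin is treated as non-local
  }
}

function nonceMatches(presented: string | undefined, expected: string): boolean {
  if (!presented) return false;
  const a = Buffer.from(presented);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}

// A write is allowed from a local origin, or from any origin that
// presents the current per-launch nonce.
function isWriteAllowed(
  origin: string | undefined,
  presentedNonce: string | undefined,
  launchNonce: string,
): boolean {
  return isLocalOrigin(origin) || nonceMatches(presentedNonce, launchNonce);
}
```

The constant-time comparison is a general precaution for secret comparison, not something the README states the bridge does.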
350
+
351
+ ## Data Locations
352
+
353
+ - Chrome profile: `~/.empir3-bridge/profile/`
354
+ - Chat config (mode, API key, per-tool toggles): `~/.empir3-bridge/config.json`
355
+ - Bridge auth token after Empir3 pairing: `%APPDATA%\Empir3\bridge-auth.json` on Windows, `~/.empir3/Empir3/bridge-auth.json` on macOS/Linux
356
+ - Bridge settings (permissions, device name, home directory, handlers, custom providers): `%APPDATA%\Empir3\bridge-settings.json` on Windows, `~/.empir3/Empir3/bridge-settings.json` on macOS/Linux
357
+ - Current per-launch bridge nonce: `~/.empir3-bridge/nonce`
358
+ - Conversation transcripts: `~/.empir3-bridge/conversations/`
359
+ - Lent-CLI run transcripts (`cli_run`): `~/.empir3-bridge/cli-runs/`
360
+ - Generated artifacts (e.g. Higgsfield images): `~/.empir3-bridge/artifacts/`
361
+ - Screenshots and action feedback: `./feedback/`
362
+ - Recordings: `./recordings/`
363
+
364
+ `feedback/` and `recordings/` are gitignored.
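The platform split for the pairing token can be expressed as a small resolver. This is a sketch only: `bridgeAuthPath` is a hypothetical helper, and the `AppData\Roaming` fallback used when `APPDATA` is unset is an assumption, not documented bridge behavior.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Returns the documented bridge-auth.json location for a given platform.
function bridgeAuthPath(platform: string, env: Env, home: string): string {
  if (platform === "win32") {
    // Assumed fallback if APPDATA is not set in the environment.
    const appData = env.APPDATA ?? path.win32.join(home, "AppData", "Roaming");
    return path.win32.join(appData, "Empir3", "bridge-auth.json");
  }
  // macOS and Linux share the same location per the list above.
  return path.posix.join(home, ".empir3", "Empir3", "bridge-auth.json");
}

// Resolve for the current machine.
const authFile = bridgeAuthPath(process.platform, process.env, os.homedir());
```

The same shape would apply to `bridge-settings.json`, which sits in the same directory on each platform.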
365
+
366
+ ## Troubleshooting
367
+
368
+ ### Higgsfield CLI install fails on Windows with a tar error
369
+
370
+ The Higgsfield npm package ships a postinstall tarball with colons in some file paths. The bundled BusyBox/Git-Bash `tar` on Windows refuses those names and the install aborts.
371
+
372
+ Workaround (run in Command Prompt; the paths below use `cmd.exe` syntax, which Git Bash will not expand):
373
+
374
+ ```bat
375
+ npm install -g higgsfield-cli --ignore-scripts
376
+ cd "%APPDATA%\npm\node_modules\higgsfield-cli"
377
+ "C:\Windows\System32\tar.exe" -xzf path\to\postinstall-bundle.tgz
378
+ ```
379
+
380
+ The Windows-system `tar.exe` (Microsoft's libarchive build) handles the colon paths. After that the CLI is on `PATH` and the welcome console's **API & CLIs** pane will pick it up. The bridge probe (`higgsfield_status`) reports an actionable error if any of these steps were skipped.
381
+
382
+ ### MCP client says "Connected" but no `browser_*` / `desktop_*` tools appear
383
+
384
+ Three known causes:
385
+
386
+ - `node_modules/` missing in the bridge repo — run `npm install`.
387
+ - `zod` resolved to v4 (the MCP SDK silently fails tool registration). The repo pins `zod ^3.25` — make sure your lockfile didn't override it.
388
+ - The path in your `.mcp.json` contains a space. The `npx tsx` ESM loader splits the path at the first space and crashes without a useful error.
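The three causes above can be checked mechanically. The helper below is a hypothetical sketch, not part of the bridge: it takes facts you would gather yourself (whether `node_modules/` exists, the version string in `node_modules/zod/package.json`, and the script path from `.mcp.json`) and returns a list of findings.

```typescript
interface McpSetup {
  nodeModulesPresent: boolean; // does <bridge repo>/node_modules exist?
  zodVersion: string | null; // version from node_modules/zod/package.json, if found
  entryPath: string; // script path referenced from .mcp.json
}

// Returns one finding per known cause that applies; an empty array means
// none of the three causes was detected.
function diagnoseMissingTools(setup: McpSetup): string[] {
  const findings: string[] = [];
  if (!setup.nodeModulesPresent) {
    findings.push("node_modules/ is missing: run `npm install` in the bridge repo");
  }
  if (setup.zodVersion !== null) {
    const major = Number.parseInt(setup.zodVersion.split(".")[0] ?? "", 10);
    if (major >= 4) {
      findings.push(`zod ${setup.zodVersion} is installed: the repo expects zod ^3.25`);
    }
  }
  if (/\s/.test(setup.entryPath)) {
    findings.push("the .mcp.json path contains a space");
  }
  return findings;
}
```

An empty result only rules out these three causes; it does not prove the tool registration succeeded.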
389
+
390
+ ### Chrome won't launch the dedicated profile
391
+
392
+ Kill any existing instance with `npm run kill`, then relaunch with `npm start -- --fresh`. The profile lives at `~/.empir3-bridge/profile/`; deleting it is safe and only signs you out of bridge tabs.
393
+
394
+ ## Fresh Runs And Parallel Bridges
395
+
396
+ Fresh launch:
397
+
398
+ ```bash
399
+ npm start -- --fresh
400
+ ```
401
+
402
+ Stop the bridge:
403
+
404
+ ```bash
405
+ npm run kill
406
+ ```
407
+
408
+ Status:
409
+
410
+ ```bash
411
+ npm run status
412
+ ```
413
+
414
+ Parallel bridge example:
415
+
416
+ ```bash
417
+ EMPIR3_PW_PORT=3106 \
418
+ EMPIR3_BRIDGE_HTTP_PORT=9967 \
419
+ EMPIR3_CDP_PORT=9322 \
420
+ EMPIR3_BRIDGE_PROFILE=$HOME/.empir3-bridge/profile-test \
421
+ EMPIR3_BRIDGE_LABEL=TEST \
422
+ npm start
423
+ ```
424
+
425
+ Drive it:
426
+
427
+ ```bash
428
+ BRIDGE_URL=http://127.0.0.1:3106 npx tsx src/cli.ts status
429
+ ```
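If you launch parallel bridges from a script, the environment block above can be generated from a port offset. This is a sketch under assumptions: `parallelBridgeEnv` is a hypothetical helper, the wrapper and bridge defaults (3006, 9867) come from the architecture diagram, and the CDP default of 9222 is assumed from Chrome's conventional debugging port rather than stated in this README.

```typescript
// cdp: 9222 is an assumption; the other two defaults appear in the diagram.
const DEFAULT_PORTS = { wrapper: 3006, bridgeHttp: 9867, cdp: 9222 };

// Builds the environment variables for a second bridge instance whose
// ports are shifted by `offset` and whose profile directory is separate.
function parallelBridgeEnv(label: string, offset: number, home: string): Record<string, string> {
  if (!Number.isInteger(offset) || offset <= 0) {
    throw new RangeError("offset must be a positive integer");
  }
  return {
    EMPIR3_PW_PORT: String(DEFAULT_PORTS.wrapper + offset),
    EMPIR3_BRIDGE_HTTP_PORT: String(DEFAULT_PORTS.bridgeHttp + offset),
    EMPIR3_CDP_PORT: String(DEFAULT_PORTS.cdp + offset),
    EMPIR3_BRIDGE_PROFILE: `${home}/.empir3-bridge/profile-${label.toLowerCase()}`,
    EMPIR3_BRIDGE_LABEL: label,
  };
}
```

With a label of `TEST` and an offset of 100, this reproduces the values in the parallel bridge example; merge the result into `process.env` when spawning `npm start`.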
430
+
431
+ ## Use With Empir3
432
+
433
+ The bridge is useful by itself. Empir3 is what happens when you put a team around it.
434
+
435
+ With Empir3, Vincent coordinates specialist agents that can work through the same bridge: research, browser work, app checks, design review, and implementation loops. The bridge is the local control plane. Empir3 is the collaborative AI team.
436
+
437
+ Pairing is optional and will stay opt-in. No Empir3 account is required to use this repo. If you pair, the bridge stores a local token, reports this device to Empir3, and opens an outbound relay websocket. Sign out from the tray or welcome console to remove the local pairing token and return to local-only use.
438
+
439
+ Try Empir3 at [empir3.com](https://empir3.com).
440
+
441
+ ## Contributing
442
+
443
+ We welcome bug reports, smoke-test notes, and PRs. Start with:
444
+
445
+ - [CONTRIBUTING.md](CONTRIBUTING.md)
446
+ - [SECURITY.md](SECURITY.md)
447
+ - [docs/TESTING.md](docs/TESTING.md)
448
+
449
+ Useful local checks:
450
+
451
+ ```bash
452
+ npx tsc --noEmit
453
+ npm run build:mcp
454
+ npm test
455
+ npm pack --dry-run
456
+ ```
457
+
458
+ ## Project Status
459
+
460
+ Pre-1.0. The bridge is used heavily in Empir3 development and is being shaped into a clean standalone OSS product. Expect rapid iteration around install, cross-platform polish, and safety UX.
461
+
462
+ ## License
463
+
464
+ MIT. See [LICENSE](LICENSE).