mobygate 0.3.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,207 @@
1
+ # Changelog
2
+
3
+ All notable changes to mobygate are documented here. Format loosely follows
4
+ [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
5
+ [Semantic Versioning](https://semver.org/).
6
+
7
+ ## [0.3.0] — 2026-04-19
8
+
9
+ Shippable on npm. Sessions survive restarts. Logs live in a canonical
10
+ user-level location so every install method (git clone, `npm install -g`,
11
+ per-user service) sees the same files.
12
+
13
+ ### Added
14
+
15
+ - **Persistent session store** — `~/.mobygate/sessions.json`. Sessions
16
+ rehydrate on boot; mutations are debounced-written every 500 ms;
17
+ SIGTERM / SIGINT / SIGHUP flushes synchronously. Tested end-to-end:
18
+ a Claude conversation's context survives `mobygate restart`. See
19
+ `lib/session-store.js`.
20
+ - **npm publish** — installable via `npm install -g mobygate`. Package
21
+ metadata added (`repository`, `homepage`, `bugs`, `license`, `author`,
22
+ `keywords`, `engines: node >=18`) plus an explicit `files` array to
23
+ keep the tarball lean. MIT `LICENSE` file added.
24
+
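The debounce-and-flush behaviour described for the session store can be sketched roughly as follows. This is an illustration only, not the contents of `lib/session-store.js`: the class name `DebouncedJsonStore`, the helper `installFlushOnSignals`, and the temp-file rename are assumptions made for the example.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical sketch; the real lib/session-store.js may be organised differently.
class DebouncedJsonStore {
  constructor(file, delayMs = 500) {
    this.file = file;
    this.delayMs = delayMs;
    this.timer = null;
    // Rehydrate on boot; a missing or unreadable file means an empty store.
    try {
      this.data = JSON.parse(fs.readFileSync(file, 'utf8'));
    } catch {
      this.data = {};
    }
  }

  set(key, value) {
    this.data[key] = value;
    this.scheduleWrite();
  }

  delete(key) {
    delete this.data[key];
    this.scheduleWrite();
  }

  // Coalesce a burst of mutations into at most one write per delay window.
  scheduleWrite() {
    if (this.timer) return;
    this.timer = setTimeout(() => this.flushSync(), this.delayMs);
    this.timer.unref(); // a pending write alone should not keep the process alive
  }

  // Synchronous so it can finish inside a signal handler before exit.
  flushSync() {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    fs.mkdirSync(path.dirname(this.file), { recursive: true });
    const tmp = `${this.file}.tmp`;
    fs.writeFileSync(tmp, JSON.stringify(this.data, null, 2));
    fs.renameSync(tmp, this.file); // swap in the fully written file
  }
}

// Flush once on the three signals named in the entry above, then exit.
function installFlushOnSignals(store) {
  for (const sig of ['SIGTERM', 'SIGINT', 'SIGHUP']) {
    process.once(sig, () => {
      store.flushSync();
      process.exit(0);
    });
  }
}
```

Writing to a temporary file and renaming it is one way to avoid leaving a half-written `sessions.json` behind if the process dies mid-write; whether mobygate does this is not stated in the changelog.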
25
+ ### Changed
26
+
27
+ - **Canonical log location** — `~/.mobygate/logs/` instead of
28
+ `{install-dir}/logs/`. Global npm installs often live under a
29
+ root-owned directory; keeping logs in the per-user config dir makes
30
+ the service work identically across git-clone, npm-global, and
31
+ per-user installs. `mobygate init` writes to the new path; existing
32
+ users will see empty logs after upgrade (a one-time reset).
33
+
34
+ ### Migration from 0.2.x
35
+
36
+ If upgrading a git-clone install:
37
+
38
+ ```
39
+ git pull
40
+ npm install
41
+ mobygate restart
42
+ ```
43
+
44
+ Logs from `{repo}/logs/` are left in place but no longer written to;
45
+ new entries go to `~/.mobygate/logs/`. Move the old logs if you want
46
+ to keep history.
47
+
48
+ ## [0.2.0] — 2026-04-19
49
+
50
+ Real product polish. v0.1.0 got the proxy working on three OSes; v0.2.0
51
+ makes it feel like a real dev tool you'd want open on a second monitor.
52
+
53
+ ### Added
54
+
55
+ - **Full web dashboard** at `http://localhost:3456/` — a faithful port
56
+ of the Paper design (artboard `C1-0`), built from the exported JSX as
57
+ source of truth (not screenshots):
58
+ - 4-card KPI strip: Uptime (live-ticking clock), Requests (with
59
+ stream/tool/img breakdown), Success rate (7-segment progress bar),
60
+ Avg latency (p50 headline + p95 sub + 14-bar color-thresholded
61
+ sparkline).
62
+ - Server / Auth / Traffic row. Server card shows model, context
63
+ window, build string (e.g., `v0.2.0 · darwin-arm64`). Auth card has
64
+ LOGGED IN badge + MAX plan pill + force-refresh CTA. Traffic card
65
+ is a 15-bucket rolling req/min column chart.
66
+ - Live requests table with kind chips (stream/tool/img/sync), inline
67
+ latency bars, rounded HTTP status pills, filter buttons
68
+ (ALL / ERRORS / SLOW > 15 s), click-to-expand details modal.
69
+ - Sessions panel with per-row expire + expire-all.
70
+ - Server log tail with auto-refresh (2.5 s poll, smart auto-scroll).
71
+ - Terminal-style footer with endpoint pills and status line.
72
+ - **`/dashboard/recent`, `/dashboard/sessions`, `/dashboard/logs`**
73
+ endpoints feeding the UI. `/events` SSE stream for live updates.
74
+ - **Rolling latency + traffic metrics** on the event bus. Per-variant
75
+ latency samples (stream / sync × last 50) with p50 / p95 computation.
76
+ Per-minute traffic bucketing retained for the last 15 min.
77
+ - **Terminal banner redesign** — `mobygate init` and `mobygate status`
78
+ now use the exact Paper whale ASCII and palette (`#B7E56D` green,
79
+ `#E89B2E` orange, `#4EA4C4` blue, `#8A9A6A` olive, truecolor ANSI).
80
+ Whale-in-terminal and whale-in-browser are visually identical now.
81
+ - **Build metadata** surfaced from `package.json` via `/dashboard/recent`
82
+ so the UI always displays the running version.
83
+ - **Web-fonts**: JetBrains Mono (400/500/700) + VT323 loaded from
84
+ Google Fonts — matches the design-system fonts in Paper.
85
+
86
+ ### Changed
87
+
88
+ - **Rebranded `claude-gate` → `mobygate`.** Distinct from Anthropic's
89
+ trademarks; Möbius-themed name independent of any single model
90
+ provider. GitHub repo renamed with auto-redirect kept.
91
+ - Terminal palette switched from 256-color to 24-bit truecolor so the
92
+ exact design hex values render correctly on any modern terminal.
93
+ - Banner functions accept a `{ version }` option and render a dim
94
+ `v0.X.Y` next to the title; version always sourced from
95
+ `package.json` so it stays accurate through rename / tag cycles.
96
+
97
+ ### Fixed
98
+
99
+ - On Windows, `mobygate init` was printing PowerShell instructions
100
+ instead of registering scheduled tasks. Automation is now wired up
101
+ through PowerShell spawns with a `.mobygate-server.cmd` launcher
102
+ wrapping `node server.js` for stdout/stderr redirection into
103
+ `logs/server.log`. Quoting is no longer fragile across platforms.
104
+ - Node 20+ `DEP0190` deprecation warning on `spawn(..., { shell: true })`
105
+ with args array in the auth helper; replaced with explicit
106
+ `spawn('cmd.exe', ['/c', quoted-cmdline])` on Windows.
107
+ - 401 responses that the SDK surfaces as **result-message text**
108
+ (rather than thrown exceptions) are now detected via pattern match
109
+ in both streaming and sync handlers. The same refresh + retry path
110
+ fires as it would for an exception-form 401.
111
+
112
+ ### Known Gaps
113
+
114
+ - **Day-over-day deltas** (e.g., `+3 today`, `↓ 3.2s vs yday` in the
115
+ design) require historical persistence we haven't built yet. Stats
116
+ reset on each server restart. Lands with a persistence layer later.
117
+ - **Long-uptime auth** — the Agent SDK caches OAuth creds in memory
118
+ per-process. After ~7–8 h uptime the in-memory state can go stale
119
+ even after a keychain refresh. Current mitigation: reactive retry
120
+ catches most cases; proactive 4-hour cron covers the rest. Full fix
121
+ lands later (either SDK patch or auto-restart on persistent 401).
122
+
123
+ ## [0.1.0] — 2026-04-19
124
+
125
+ First tagged release. Project rebranded from `claude-max-sdk-proxy` →
126
+ `claude-gate` → `mobygate` during the lead-up to this tag; prior commits
127
+ live in the same GitHub repo (`khnfrhn/mobygate`) but were not semver-tagged.
128
+
129
+ ### Added
130
+
131
+ - **Cross-platform installer** — `mobygate init` sets up the proxy as a
132
+ managed service on macOS (launchd), Linux (systemd user units), and
133
+ Windows (Task Scheduler). Interactive prompts for port, default model,
134
+ session TTL, CLAUDE_BIN override. Writes `~/.mobygate/config.yaml`,
135
+ starts the services, smoke-tests `/health`. No admin/sudo needed on any
136
+ platform.
137
+ - **`mobygate` CLI** — `init`, `start`, `stop`, `restart`, `status`,
138
+ `logs`, `auth`, `uninstall`, `version`.
139
+ - **ASCII whale banner** — orange starfield, green whale with barnacles
140
+ and baleen-through-mouth water, blue ripple waves. Möbius motif in the
141
+ whale's eye and waterline. Color auto-disables on non-TTY stdout
142
+ (pipes, CI, systemd).
143
+ - **OAuth auto-refresh on 401** — `runWithAuthRetry` wraps both stream
144
+ and sync query loops. Catches exception-form 401s and text-form 401s
145
+ (SDK sometimes surfaces auth errors as result-message text rather than
146
+ throws, especially on long-running proxies). Force-refreshes via
147
+ `claude -p` probe, retries once.
148
+ - **Proactive auth refresh cron** — every 4 hours on all three platforms,
149
+ using launchd / systemd timer / Task Scheduler. Access tokens last ~8 h,
150
+ so the 4 h cadence keeps us well inside the valid window.
151
+ - **Tool calling (OpenAI function-calling)** via a prompt-embedded
152
+ protocol. `<tool_call>` tags in the model's output are parsed and
153
+ emitted as OpenAI `tool_calls` with `finish_reason: "tool_calls"`.
154
+ Parallel calls supported. Built-in SDK tools disabled during tool
155
+ requests (`allowedTools: []`) so the model uses only client-defined
156
+ tools. Nudge appended when resuming with only tool results so the
157
+ model doesn't return empty text.
158
+ - **Multimodal passthrough** — OpenAI `image_url` content parts
159
+ (base64 data URLs + remote HTTP URLs) are translated to Anthropic
160
+ `image` content blocks and sent via an async-iterable `SDKUserMessage`.
161
+ - **1M context for Opus 4.7** — `claude-opus-4-7` routes to the native
162
+ `claude-opus-4-7[1m]` variant. Aliases: `claude-opus-4-7-1m` (explicit
163
+ 1M), `claude-opus-4-7-200k` (standard tier).
164
+ - **`/auth/status` and `/auth/refresh`** HTTP endpoints.
165
+ - **npm scripts** — `up` (install + start), `auth:status`, `auth:refresh`.
166
+ - **Startup preflight** — if node_modules is stale, the server dies
167
+ with a readable boxed error pointing at `npm install`.
168
+ - **`mcp-inspect.mjs`** — raw MCP response inspector over stdio /
169
+ StreamableHTTP / SSE. Used to confirm that when an MCP server returns
170
+ image content, the bytes are real — useful for diagnosing client-side
171
+ image-drop bugs (e.g. the Paper MCP → Hermes image gap).
172
+ - **Hermes patch (out of repo)** — fix for Hermes's MCP→LLM adapter to
173
+ surface image content blocks from MCP tools as an `image_url` user
174
+ message, rather than silently dropping them. Documented but not
175
+ auto-applied.
176
+
177
+ ### Changed
178
+
179
+ - `@anthropic-ai/claude-agent-sdk` bumped `0.2.101` → `0.2.112`.
180
+ - Default model is now `claude-opus-4-7[1m]`.
181
+ - Server listens on port 3456 (default; configurable via
182
+ `~/.mobygate/config.yaml` or `PORT` env).
183
+ - Config precedence: env vars > `~/.mobygate/config.yaml` > built-in
184
+ defaults.
185
+
186
+ ### Removed
187
+
188
+ - Dropped the `claude-gate`/`claude-max-sdk-proxy` names. `claude` is an
189
+ Anthropic trademark and the proxy shape is provider-agnostic; future
190
+ releases may route through additional providers without the name
191
+ becoming misleading.
192
+
193
+ ### Known Gaps
194
+
195
+ - **Long-running proxy uptime** — the Agent SDK appears to cache OAuth
196
+ credentials in memory per-process, so even after a keychain refresh
197
+ the in-memory state in a 7+ h old process may still be stale. The
198
+ new result-text 401 detection + retry handles most cases; full fix
199
+ (either patch the SDK or auto-restart on persistent auth failure)
200
+ lands later.
201
+ - **Web dashboard** — `/` still serves the simple status page. Live
202
+ request stream, session browser, auth panel with a "force refresh"
203
+ button, etc. are the next release.
204
+ - **Tool-calling edge cases** — ~95% format compliance on
205
+ `<tool_call>` emission; `tool_choice` (force-tool / specific-tool)
206
+ is not honored; streaming tool-call delta chunks are buffered
207
+ into a single final chunk.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Farhan Khan
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,429 @@
1
+ # mobygate
2
+
3
+ > OpenAI-compatible local proxy for **Claude Max**.
4
+ > The Möbius-strip gateway: OpenAI shape in, Claude Max out, on a single continuous loop.
5
+
6
+ Point any OpenAI-shaped client (Hermes, OpenClaw, custom tools, SDKs) at `http://localhost:3456` and you get Claude Max inference out the other side — without hitting the paid Anthropic API.
7
+
8
+ - ✓ Real streaming (SSE)
9
+ - ✓ Multimodal (image URLs + base64 data URLs)
10
+ - ✓ OpenAI-style function calling (`tools` in, OpenAI-shaped `tool_calls` out; `tool_choice` is not yet honored)
11
+ - ✓ Opus 4.7 with native 1M context variant
12
+ - ✓ Session resume (map a client key → SDK session ID)
13
+ - ✓ OAuth auto-refresh (no more 8-hour-cliff 401 storms)
14
+ - ✓ Live web dashboard with per-request tracing
15
+ - ✓ Cross-platform service install (macOS / Linux / Windows — one command)
16
+
17
+ **Current release:** [v0.2.0](https://github.com/khnfrhn/mobygate/releases/tag/v0.2.0) · see [CHANGELOG.md](./CHANGELOG.md) for history.
18
+
19
+ ## Why
20
+
21
+ The older `claude-max-api-proxy` spawned a new Claude Code CLI subprocess for every request — ~500 ms overhead per call, Windows stdin pipe hacks, and patches that got nuked on every `npm update`. mobygate uses the Claude Agent SDK directly: no subprocess spawning, no patches, no maintenance. Same subscription, real streaming, multimodal, tool calling.
22
+
23
+ ## Quick start
24
+
25
+ ```bash
26
+ npm install -g mobygate
27
+ mobygate init # interactive setup: config + service install + smoke test
28
+ ```
29
+
30
+ Or from source (for hacking on mobygate itself):
31
+
32
+ ```bash
33
+ git clone https://github.com/khnfrhn/mobygate.git
34
+ cd mobygate
35
+ npm install
36
+ npm link # makes the `mobygate` command available globally
37
+ mobygate init
38
+ ```
39
+
40
+ That single `init` does the full cross-platform install:
41
+
42
+ | Step | Mac | Linux | Windows |
43
+ |---|---|---|---|
44
+ | Verify Node ≥ 18, `claude` CLI on PATH, `claude auth login` done | ✓ | ✓ | ✓ |
45
+ | Write config to `~/.mobygate/config.yaml` | ✓ | ✓ | ✓ |
46
+ | Install long-running server as user-level service | launchd (`ai.mobygate.server`) | systemd user unit (`mobygate-server.service`) | Task Scheduler (`mobygate-server`) |
47
+ | Install 4-hour auth-refresh cron | launchd plist | systemd `.timer` | Task Scheduler (4h repetition) |
48
+ | Redirect stdout/stderr to `~/.mobygate/logs/server.log` | ✓ | ✓ | ✓ (via `.cmd` launcher) |
49
+ | Auto-restart on crash | KeepAlive | `Restart=on-failure` | Task Scheduler RestartCount=3 |
50
+ | Smoke-test `/health` | ✓ | ✓ | ✓ |
51
+
52
+ No `sudo` required on any platform. No `nssm` on Windows. If the auto-install fails for any reason, `mobygate init` falls back to printing the exact commands to run manually.
53
+
54
+ Once installed, the service survives reboots. The everyday commands are:
55
+
56
+ ```bash
57
+ mobygate status # service state, auth state, /health probe
58
+ mobygate logs # tail ~/.mobygate/logs/server.log
59
+ mobygate auth # check + force-refresh OAuth token
60
+ mobygate start # start service (if stopped)
61
+ mobygate stop # stop service
62
+ mobygate restart # stop + start
63
+ mobygate uninstall # remove services (leaves the repo in place)
64
+ mobygate version
65
+ ```
66
+
67
+ Open **http://localhost:3456/** in your browser for the live dashboard (see below).
68
+
69
+ > **Linux headless tip:** user systemd units stop when you log out. For a mobygate that stays up on a server, run `sudo loginctl enable-linger $USER` once. Then it runs whether you're logged in or not.
70
+
71
+ > **After `git pull`:** always re-run `npm install` — new commits can bump the SDK or add packages. If you skip this, the server dies with a readable boxed "Missing package" error pointing at `npm install` (or `npm run up` which does both in one step).
72
+
73
+ ## Dashboard
74
+
75
+ Open **http://localhost:3456/** after install for a live, zero-config dashboard:
76
+
77
+ - **Header** — whale ASCII · `mobygate vX.Y.Z` · "healthy · live" pill that turns red on disconnect · `clear log` / `force refresh auth` buttons.
78
+ - **KPI strip** — Uptime (live-ticking clock), Requests (total + stream/tool/image breakdown), Success rate (with 7-segment progress bar), Avg latency (p50 headline + p95 secondary + 14-bar color-thresholded sparkline).
79
+ - **Server / Auth / Traffic row** — default model, active sessions, context window, build (`v0.2.0 · darwin-arm64`); email, plan, auth method, last probe, refresh count; 15-minute rolling req/min column chart.
80
+ - **Live requests** — table auto-updates as requests come in. Chips for `stream` / `tool` / `img` / `sync`. Inline latency bar (green < 3 s, blue < 15 s, orange > 15 s). Rounded status pill. Click any row → full start + end event JSON modal. Filter buttons: `ALL / ERRORS / SLOW > 15 s`.
81
+ - **Sessions panel** — active session-key map, per-row `expire`, `expire all`. Live-refreshes when a session is created / updated / expires.
82
+ - **Server log tail** — last 200 lines of `logs/server.log`. Auto-refresh every 2.5 s, smart auto-scroll that doesn't yank you back if you've scrolled up to read.
83
+ - **Footer** — clickable endpoint pills, terminal-style `stream · connected | mobygate · tty0 · 0.2.0` status line.
84
+
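As a rough illustration of the latency figures in the KPI strip and the colour thresholds on the latency bar, the sketch below keeps a bounded window of samples and computes nearest-rank percentiles. The class `RollingSamples` and the function `latencyColor` are hypothetical names, the window of 50 follows the changelog's "last 50" note, and the nearest-rank method is an assumption about how p50 / p95 might be derived.

```javascript
// Hypothetical helper: bounded sample window with nearest-rank percentiles.
class RollingSamples {
  constructor(limit = 50) {
    this.limit = limit;
    this.samples = [];
  }

  push(ms) {
    this.samples.push(ms);
    if (this.samples.length > this.limit) this.samples.shift(); // drop the oldest
  }

  percentile(p) {
    if (this.samples.length === 0) return null;
    const sorted = [...this.samples].sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(0, rank - 1)];
  }
}

// Thresholds as listed for the inline latency bar; a request at exactly
// 15 s is treated as orange here, which the list above leaves unspecified.
function latencyColor(ms) {
  if (ms < 3000) return 'green';
  if (ms < 15000) return 'blue';
  return 'orange';
}
```

Keeping one `RollingSamples` instance per variant (stream and sync) would match the per-variant split mentioned in the changelog.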
85
+ Design ported from the Paper artboard (`01KPFE5G6MJGMT5E5MGA94DQRF/C1-0`) via `get_jsx` — exact colors, typography, and ASCII.
86
+
87
+ ## Run (without the CLI)
88
+
89
+ If you just want a foreground process without installing services:
90
+
91
+ ```bash
92
+ node server.js # normal start
93
+ npm run dev # auto-reload on changes
94
+ npm run up # install deps + start (one command — use after git pull)
95
+ ```
96
+
97
+ The server starts on **port 3456** (same as the old proxy).
98
+
99
+ ## How It Works
100
+
101
+ ```
102
+ Discord / Hermes / OpenClaw → POST localhost:3456/v1/chat/completions → Agent SDK query() → Claude Max
103
+ ```
104
+
105
+ 1. Receives OpenAI-format chat completion requests
106
+ 2. Converts `messages[]` array to a single prompt string
107
+ 3. Calls `query()` from `@anthropic-ai/claude-agent-sdk`
108
+ 4. Streams responses back as SSE (Server-Sent Events) in OpenAI format
109
+
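Steps 2 and 4 above can be pictured with a small sketch. The helper names `messagesToPrompt` and `sseChunk`, the `ROLE: text` flattening, and the chunk id are illustrative assumptions; the chunk fields follow the public OpenAI streaming format, and mobygate's actual prompt layout may differ.

```javascript
// Hypothetical flattening of an OpenAI messages[] array into one prompt string.
function messagesToPrompt(messages) {
  return messages
    .map((m) => {
      const text = typeof m.content === 'string'
        ? m.content
        : (m.content || [])
            .filter((part) => part.type === 'text')
            .map((part) => part.text)
            .join('\n');
      return `${m.role.toUpperCase()}: ${text}`;
    })
    .join('\n\n');
}

// Format one OpenAI-style streaming chunk as a Server-Sent Events frame.
function sseChunk(id, model, delta, finishReason = null) {
  const payload = {
    id,
    object: 'chat.completion.chunk',
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [{ index: 0, delta, finish_reason: finishReason }],
  };
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// OpenAI streams end with this sentinel frame.
const SSE_DONE = 'data: [DONE]\n\n';
```

A streaming handler would write one `sseChunk(...)` per text delta, a final chunk with `finish_reason: 'stop'`, and then `SSE_DONE`.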
110
+ ## Endpoints
111
+
112
+ **OpenAI-compatible:**
113
+
114
+ | Method | Path | Description |
115
+ |--------|------|-------------|
116
+ | `POST` | `/v1/chat/completions` | Chat completions (streaming + non-streaming) |
117
+ | `GET` | `/v1/models` | List available models with context lengths |
118
+
119
+ **Operations:**
120
+
121
+ | Method | Path | Description |
122
+ |--------|------|-------------|
123
+ | `GET` | `/health` | Liveness + active session count |
124
+ | `GET` | `/auth/status` | OAuth state (add `?quick=1` to skip the live probe) |
125
+ | `POST` | `/auth/refresh` | Force an OAuth refresh probe (cron hook) |
126
+
127
+ **Session management:**
128
+
129
+ | Method | Path | Description |
130
+ |--------|------|-------------|
131
+ | `GET` | `/sessions` | List all active sessions |
132
+ | `GET` | `/sessions/:key` | Inspect one session |
133
+ | `DELETE` | `/sessions/:key` | Expire a single session |
134
+ | `DELETE` | `/sessions` | Expire all sessions |
135
+
136
+ **Dashboard feed:**
137
+
138
+ | Method | Path | Description |
139
+ |--------|------|-------------|
140
+ | `GET` | `/` | The live dashboard (HTML) |
141
+ | `GET` | `/events` | SSE stream of all dashboard events (`request.start`, `request.end`, `auth.refresh`, `session.*`, `server.boot`) with 15 s heartbeat |
142
+ | `GET` | `/dashboard/recent?limit=N` | Ring-buffer snapshot + stats + build meta for initial page load |
143
+ | `GET` | `/dashboard/sessions` | Per-session detail with idle + TTL-remaining times |
144
+ | `GET` | `/dashboard/logs?lines=N` | Last N lines of `~/.mobygate/logs/server.log` |
145
+
146
+ ## Model Mapping
147
+
148
+ | Input | Resolves To |
149
+ |-------|------------|
150
+ | `claude-opus-4`, `claude-opus-4-7`, `opus` | `claude-opus-4-7[1m]` (1M context) |
151
+ | `claude-opus-4-7-200k` | `claude-opus-4-7` (standard 200k) |
152
+ | `claude-opus-4-6` | `claude-opus-4-6` |
153
+ | `claude-sonnet-4`, `claude-sonnet-4-5`, `claude-sonnet-4-6`, `sonnet` | `claude-sonnet-4-5-20250929` |
154
+ | `claude-haiku-4`, `claude-haiku-4-5`, `haiku` | `claude-haiku-4-5-20251001` |
155
+
156
+ Provider prefixes are stripped automatically (e.g., `claude-max-proxy/claude-opus-4-7` → `claude-opus-4-7`).
157
+
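The mapping table and the prefix stripping can be expressed as a small lookup. This is a sketch built from the table above, not the proxy's own resolver: `resolveModel` is a hypothetical name, and passing unknown names through unchanged is an assumption.

```javascript
// Alias table transcribed from the Model Mapping table above.
const MODEL_ALIASES = new Map([
  ['claude-opus-4', 'claude-opus-4-7[1m]'],
  ['claude-opus-4-7', 'claude-opus-4-7[1m]'],
  ['opus', 'claude-opus-4-7[1m]'],
  ['claude-opus-4-7-200k', 'claude-opus-4-7'],
  ['claude-opus-4-6', 'claude-opus-4-6'],
  ['claude-sonnet-4', 'claude-sonnet-4-5-20250929'],
  ['claude-sonnet-4-5', 'claude-sonnet-4-5-20250929'],
  ['claude-sonnet-4-6', 'claude-sonnet-4-5-20250929'],
  ['sonnet', 'claude-sonnet-4-5-20250929'],
  ['claude-haiku-4', 'claude-haiku-4-5-20251001'],
  ['claude-haiku-4-5', 'claude-haiku-4-5-20251001'],
  ['haiku', 'claude-haiku-4-5-20251001'],
]);

// Hypothetical resolver: strip any "provider/" prefix, then apply the alias table.
function resolveModel(input, fallback = 'claude-opus-4-7[1m]') {
  if (!input) return fallback; // no model given: use the default
  const bare = input.includes('/') ? input.slice(input.lastIndexOf('/') + 1) : input;
  // Assumption: names without an alias are passed through unchanged.
  return MODEL_ALIASES.get(bare) ?? bare;
}
```

For example, `resolveModel('claude-max-proxy/claude-opus-4-7')` yields `claude-opus-4-7[1m]`, matching the prefix-stripping note above.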
158
+ ## Client Configuration
159
+
160
+ ### OpenClaw (`~/.openclaw/openclaw.json`)
161
+
162
+ Add under `models.providers`:
163
+
164
+ ```json
165
+ "claude-max-proxy": {
166
+ "baseUrl": "http://localhost:3456/v1",
167
+ "apiKey": "claude-max",
168
+ "api": "openai-completions",
169
+ "models": [
170
+ { "id": "claude-opus-4-7", "contextWindow": 1000000, "maxTokens": 16384 },
171
+ { "id": "claude-opus-4-6", "contextWindow": 200000, "maxTokens": 16384 },
172
+ { "id": "claude-sonnet-4-6", "contextWindow": 200000, "maxTokens": 16384 },
173
+ { "id": "claude-haiku-4-5", "contextWindow": 200000, "maxTokens": 16384 }
174
+ ]
175
+ }
176
+ ```
177
+
178
+ Set as default in `agents.defaults.model`:
179
+
180
+ ```json
181
+ "primary": "claude-max-proxy/claude-opus-4-7"
182
+ ```
183
+
184
+ ### Hermes Agent (`~/.hermes/config.yaml`)
185
+
186
+ ```yaml
187
+ model:
188
+ default: claude-opus-4-7
189
+ provider: custom # MUST be "custom", not "openai" or "custom:name"
190
+ api_key: claude-max
191
+ base_url: http://127.0.0.1:3456/v1
192
+ context_length: 1000000 # explicit override — ensures 1M context
193
+
194
+ providers:
195
+ claude-max-proxy:
196
+ api: http://127.0.0.1:3456/v1
197
+ name: Claude Max Proxy
198
+ api_key: claude-max
199
+ default_model: claude-opus-4-7
200
+ ```
201
+
202
+ Also add to `~/.hermes/auth.json` credential_pool:
203
+
204
+ ```json
205
+ "custom:claude-max-proxy": [{
206
+ "id": "a1b2c3",
207
+ "label": "Claude Max Proxy",
208
+ "auth_type": "api_key",
209
+ "priority": 0,
210
+ "source": "config:Claude Max Proxy",
211
+ "access_token": "claude-max",
212
+ "base_url": "http://127.0.0.1:3456/v1",
213
+ "request_count": 0
214
+ }]
215
+ ```
216
+
217
+ > **Hermes provider caveat:** The top-level `model.provider` must be `custom`. Hermes doesn't recognize `openai` as a provider, and `custom:name` only works in `delegation` blocks, not at the model level. The `custom` keyword tells Hermes to read `base_url` and `api_key` from the `model:` config. Aliases that also work: `ollama`, `lmstudio`, `vllm`, `llamacpp`.
218
+
219
+ ### Any OpenAI-compatible client
220
+
221
+ ```
222
+ base_url: http://localhost:3456/v1
223
+ api_key: claude-max (any non-empty string works)
224
+ model: claude-opus-4-7
225
+ ```
226
+
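For clients without a config file, a plain HTTP call works too. The sketch below assumes a mobygate instance on the default port and uses the global `fetch` available in Node 18+; `buildChatRequest` and `chat` are hypothetical helper names, and the response shape is the standard OpenAI chat-completion object.

```javascript
// Assumes a local mobygate instance on the default port.
const BASE_URL = 'http://localhost:3456/v1';

// Build the request separately so it can be inspected without a running server.
function buildChatRequest(prompt, model = 'claude-opus-4-7') {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: 'Bearer claude-max', // any non-empty key works
      },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: prompt }],
      }),
    },
  };
}

// Send a single non-streaming request and return the assistant text.
async function chat(prompt) {
  const { url, init } = buildChatRequest(prompt);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const json = await res.json();
  return json.choices[0].message.content;
}
```

Calling `chat('Say hello')` with the service running should resolve to the model's reply; adding `stream: true` to the body would switch the response to SSE chunks instead.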
227
+ ## Configuration
228
+
229
+ Precedence, highest wins: **env var → `~/.mobygate/config.yaml` → built-in default**.
230
+
231
+ `mobygate init` writes a commented YAML file you can hand-edit. Env vars always override the file, so you can set one-off values (e.g. a different port per shell) without editing config.
232
+
233
+ | Variable | Config field | Default | Description |
234
+ |----------|-------------|---------|-------------|
235
+ | `PORT` | `port` | `3456` | Server port |
236
+ | `DEFAULT_MODEL` | `default_model` | `claude-opus-4-7[1m]` | Fallback model when none specified |
237
+ | `SESSION_TTL_MINUTES` | `session_ttl_minutes` | `60` | Idle timeout for session keys mapped to SDK sessions |
238
+ | `AUTH_REFRESH_INTERVAL_HOURS` | `auth_refresh_interval_hours` | `4` | How often the proactive refresh cron fires |
239
+ | `CLAUDE_BIN` | `claude_bin` | *(empty → PATH lookup)* | Absolute path to the `claude` binary if not on PATH |
240
+ | `LOG_LEVEL` | `log_level` | `info` | Reserved; currently informational only |
241
+ | `MOBYGATE_HOME` | — | `~/.mobygate` | Directory for config + state files |
242
+ | `MOBYGATE_NODE_BIN` | — | `process.execPath` | Node binary baked into service definitions (launchd/systemd/Task Scheduler) |
243
+ | `NO_COLOR` | — | unset | Disable ANSI color in CLI banner output |
244
+
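The precedence rule (env var, then config file, then built-in default) can be sketched as below. `resolveConfig` is a hypothetical function; it takes an already-parsed config object so that no YAML library is assumed, and treating an empty env var as unset is an assumption of this example.

```javascript
// Defaults and env names transcribed from the table above (file-backed fields only).
const DEFAULTS = {
  port: 3456,
  default_model: 'claude-opus-4-7[1m]',
  session_ttl_minutes: 60,
  auth_refresh_interval_hours: 4,
  claude_bin: '',
  log_level: 'info',
};

const ENV_NAMES = {
  port: 'PORT',
  default_model: 'DEFAULT_MODEL',
  session_ttl_minutes: 'SESSION_TTL_MINUTES',
  auth_refresh_interval_hours: 'AUTH_REFRESH_INTERVAL_HOURS',
  claude_bin: 'CLAUDE_BIN',
  log_level: 'LOG_LEVEL',
};

const NUMERIC_FIELDS = new Set(['port', 'session_ttl_minutes', 'auth_refresh_interval_hours']);

// fileConfig: the parsed contents of ~/.mobygate/config.yaml (parsing not shown).
function resolveConfig(fileConfig = {}, env = process.env) {
  const resolved = {};
  for (const [field, fallback] of Object.entries(DEFAULTS)) {
    const fromEnv = env[ENV_NAMES[field]];
    let value = fallback;
    if (fromEnv !== undefined && fromEnv !== '') {
      value = fromEnv; // env var wins
    } else if (fileConfig[field] !== undefined) {
      value = fileConfig[field]; // then the config file
    }
    resolved[field] = NUMERIC_FIELDS.has(field) ? Number(value) : value;
  }
  return resolved;
}
```

With `PORT=5000` in the environment and `port: 4000` in the file, this resolves `port` to `5000`, which is the per-shell override case described above.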
245
+ ## Diagnosing MCP Image Drops
246
+
247
+ If a client (e.g. Hermes) reports that an MCP tool returned an empty screenshot or image, use `mcp-inspect.mjs` to bypass the client and talk to the MCP server directly — this isolates whether the image is being dropped in the MCP server itself or in the client's normalization layer.
248
+
249
+ ```bash
250
+ # stdio transport — spawn the MCP server as a subprocess
251
+ node mcp-inspect.mjs --cmd "<server-exe>" --args '["<arg1>"]' --list
252
+ node mcp-inspect.mjs --cmd "<server-exe>" --args '["<arg1>"]' \
253
+ --tool get_screenshot --params '{"nodeId":"WL-0"}'
254
+
255
+ # HTTP (StreamableHTTP) transport — e.g. Paper running at localhost:29979/mcp
256
+ node mcp-inspect.mjs --url "http://127.0.0.1:29979/mcp" --list
257
+ node mcp-inspect.mjs --url "http://127.0.0.1:29979/mcp" \
258
+ --tool get_screenshot --params '{"nodeId":"WL-0"}'
259
+
260
+ # Legacy SSE transport
261
+ node mcp-inspect.mjs --url "http://127.0.0.1:1234/sse" --transport sse --list
262
+ ```
263
+
264
+ If the output shows a non-empty `image` content block with hundreds of KB of base64, the MCP server is fine and the client is stripping the image. If the image block is missing or empty, the MCP server itself is the culprit.
265
+
266
+ ## Auth & Token Refresh
267
+
268
+ The proxy inherits Claude Max OAuth credentials from the local CLI keychain (macOS: `Claude Code-credentials`; Windows: Credential Manager; Linux: libsecret / GNOME Keyring). Access tokens last ~8 hours and are supposed to refresh silently, but in practice the SDK occasionally surfaces `401 Invalid authentication credentials` — either as a thrown error, or as the literal text of a `result` message on long-uptime processes.
269
+
270
+ `mobygate init` installs both defenses automatically; you shouldn't need to touch any of this. Reference only:
271
+
272
+ **1. Reactive retry on 401.** Both streaming and non-streaming handlers wrap the SDK query in `runWithAuthRetry` (see `scripts/auth-helper.js`). Exception-form 401s and result-text-form 401s (`Failed to authenticate. API Error: 401 ...`) both trigger a shell-out to `claude -p`, which forces a token refresh via the still-valid refresh token; the query is then retried once. Every step is logged: `[auth] 401 on sync call — refreshing`, `[auth] refreshed in 1234 ms — retrying sync call`.
273
+
274
+ **2. Proactive 4-hour cron.** `scripts/auth-refresh.js` is cross-platform. `mobygate init` wires it up via launchd (macOS), systemd `.timer` (Linux), or Task Scheduler (Windows). Access tokens last ~8 hours, so a 4-hour cadence keeps us comfortably inside the valid window even if one run fails.
275
+
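The reactive retry from item 1 can be sketched as a wrapper that treats a thrown 401 and a 401 surfaced as result text the same way. This is not the code in `scripts/auth-helper.js`: the name `runWithAuthRetrySketch`, the `run` / `refresh` callbacks, and the exact regular expression are assumptions for illustration.

```javascript
// Patterns taken from the two error strings quoted in this section.
const AUTH_401_PATTERN = /API Error: 401|401 Invalid authentication credentials/i;

// True for an Error or a result string that carries one of the 401 markers.
function looksLike401(value) {
  const text = value instanceof Error ? value.message : String(value ?? '');
  return AUTH_401_PATTERN.test(text);
}

// run:     async () => string   (performs the SDK query, returns the result text)
// refresh: async () => void     (e.g. shells out to `claude -p` to force a refresh)
async function runWithAuthRetrySketch(run, refresh, log = console.log) {
  const attempt = async () => {
    try {
      const text = await run();
      return { text, authFailed: looksLike401(text) };
    } catch (err) {
      if (!looksLike401(err)) throw err; // unrelated failures propagate untouched
      return { text: null, authFailed: true };
    }
  };

  let result = await attempt();
  if (result.authFailed) {
    log('[auth] 401 detected, refreshing');
    await refresh();
    result = await attempt(); // retry exactly once
    if (result.authFailed) throw new Error('401 persisted after refresh');
  }
  return result.text;
}
```

Matching on result text can misfire if a legitimate reply happens to quote one of these error strings, so a real implementation would likely want the pattern as narrow as possible.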
276
+ **CLI helpers:**
277
+
278
+ ```bash
279
+ mobygate auth # show status + run a live probe
280
+ npm run auth:status # same via npm script (prints JSON)
281
+ npm run auth:status:quick # keychain-only, no live probe (instant)
282
+ npm run auth:refresh # force a refresh probe, print JSON result
283
+ ```
284
+
285
+ **Escape hatch — full re-auth required:** if `claude auth status --json` reports `loggedIn: true` but you're still getting 401s after `mobygate auth` successfully refreshes, the refresh token itself has been revoked. Run `claude auth login` to do a full OAuth reauth, then `mobygate restart`. Rare; happens if you've signed out of Claude from another device.
286
+
287
+ <details>
288
+ <summary><b>Manual cron install (fallback if <code>mobygate init</code> didn't run the scheduler for you)</b></summary>
289
+
290
+ **macOS (launchd):**
291
+
292
+ ```bash
293
+ cp launchd/ai.mobygate.auth-refresh.plist ~/Library/LaunchAgents/
294
+ launchctl load ~/Library/LaunchAgents/ai.mobygate.auth-refresh.plist
295
+ ```
296
+
297
+ **Linux (cron):**
298
+
299
+ ```
300
+ 0 */4 * * * cd /path/to/mobygate && /usr/bin/node scripts/auth-refresh.js >> logs/auth-refresh.log 2>&1
301
+ ```
302
+
303
+ Or systemd timer: `mobygate init` generates these by default. To do it by hand, create `~/.config/systemd/user/mobygate-auth.{service,timer}` — service runs `/usr/bin/node /path/to/mobygate/scripts/auth-refresh.js`, timer has `OnUnitActiveSec=4h` and `OnBootSec=1min`. Then `systemctl --user enable --now mobygate-auth.timer`.
304
+
305
+ **Windows (Task Scheduler):**
306
+
307
+ ```powershell
308
+ $A = New-ScheduledTaskAction -Execute "node.exe" `
309
+ -Argument "scripts\auth-refresh.js" `
310
+ -WorkingDirectory "C:\path\to\mobygate"
311
+ $T = New-ScheduledTaskTrigger -Once -At (Get-Date) `
312
+ -RepetitionInterval (New-TimeSpan -Hours 4)
313
+ Register-ScheduledTask -TaskName "mobygate-auth-refresh" -Action $A -Trigger $T
314
+ ```
315
+
316
+ </details>
317
+
318
+ ## Multimodal
319
+
320
+ OpenAI `image_url` content parts are translated to Anthropic `image` content blocks. Both base64 data URLs and remote `https:` URLs work:
321
+
322
+ ```json
323
+ {
324
+ "role": "user",
325
+ "content": [
326
+ { "type": "text", "text": "What's in this image?" },
327
+ { "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBORw0KGgo..." } }
328
+ ]
329
+ }
330
+ ```
331
+
332
+ When images are present in the request, the proxy switches from a plain-string prompt to an async-iterable `SDKUserMessage` with mixed-content blocks. Nothing else in the OpenAI shape changes. The dashboard shows an `img` chip on any request that carried images.
333
+
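A minimal sketch of that translation is shown below. The function names are hypothetical; the `image` block shapes follow Anthropic's published content-block format, while the `SDKUserMessage` fields (`session_id`, `parent_tool_use_id`) and the choice to forward remote URLs as `url` sources rather than downloading them are assumptions.

```javascript
// Hypothetical translator from one OpenAI content part to one Anthropic content block.
function toAnthropicBlock(part) {
  if (part.type === 'text') return { type: 'text', text: part.text };
  if (part.type === 'image_url') {
    const url = part.image_url.url;
    const m = /^data:([^;,]+);base64,(.*)$/s.exec(url);
    if (m) {
      // data URL: split into media type and base64 payload
      return { type: 'image', source: { type: 'base64', media_type: m[1], data: m[2] } };
    }
    // remote URL: passed through by reference (assumption)
    return { type: 'image', source: { type: 'url', url } };
  }
  throw new Error(`Unsupported content part: ${part.type}`);
}

// Wrap the translated blocks in a one-message async iterable.
// Field names beyond `type` and `message` are assumed from the SDK's user-message type.
async function* asUserMessageStream(parts) {
  yield {
    type: 'user',
    session_id: '',
    parent_tool_use_id: null,
    message: { role: 'user', content: parts.map(toAnthropicBlock) },
  };
}
```

The async generator stands in for the "async-iterable `SDKUserMessage`" mentioned above; a text-only request would skip it and pass a plain string prompt.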
334
+ ## Tool Calling
335
+
336
+ OpenAI-style function calling is supported via a **prompt-embedded protocol** (the Agent SDK's native MCP mechanism pollutes session state on abort and gates tools behind `ToolSearch` — neither works for OpenAI's "emit call, client executes, send result back" flow).
337
+
338
+ How it works:
339
+
340
+ - Client sends `tools: [{type: "function", function: {...}}]` in the OpenAI request.
341
+ - Proxy injects the tool schemas into the system prompt and instructs the model to emit `<tool_call>{"name":"...","arguments":{...}}</tool_call>` tags.
342
+ - When a complete `<tool_call>` tag is detected in the model's stream, the SDK query is aborted, tags are parsed, and the response is emitted as OpenAI `tool_calls` with `finish_reason: "tool_calls"`.
343
+ - On the follow-up request, `role: "tool"` messages are translated into `<tool_result id="..." name="...">...</tool_result>` blocks for the model.
344
+ - Parallel calls supported — the model can emit multiple `<tool_call>` tags in one turn.
345
+ - Streaming responses with tools are buffered and emitted as a single chunk (OpenAI tool-call streaming deltas are not currently exposed piecewise).
346
+ - Built-in SDK tools (Read, Bash, Grep, etc.) are disabled via `allowedTools: []` during tool-calling requests so the model can only use client-defined tools.
347
+
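The parsing side of this protocol can be sketched as follows. The function names and the generated call id format are hypothetical; the tag format and the `tool_calls` / `finish_reason` fields come from the description above and the OpenAI response shape.

```javascript
const crypto = require('node:crypto');

const TOOL_CALL_RE = /<tool_call>([\s\S]*?)<\/tool_call>/g;

// Extract every well-formed <tool_call> tag as an OpenAI tool_calls entry.
function parseToolCalls(text) {
  const calls = [];
  for (const match of text.matchAll(TOOL_CALL_RE)) {
    let parsed;
    try {
      parsed = JSON.parse(match[1]);
    } catch {
      continue; // malformed JSON inside a tag is dropped
    }
    if (!parsed || typeof parsed.name !== 'string') continue;
    calls.push({
      id: `call_${crypto.randomBytes(6).toString('hex')}`, // hypothetical id scheme
      type: 'function',
      function: {
        name: parsed.name,
        arguments: JSON.stringify(parsed.arguments ?? {}), // OpenAI carries arguments as a JSON string
      },
    });
  }
  return calls;
}

// Shape the assistant turn: tool calls if any were found, plain text otherwise.
function toAssistantChoice(text) {
  const toolCalls = parseToolCalls(text);
  if (toolCalls.length === 0) {
    return { message: { role: 'assistant', content: text }, finish_reason: 'stop' };
  }
  return {
    message: { role: 'assistant', content: null, tool_calls: toolCalls },
    finish_reason: 'tool_calls',
  };
}

// Follow-up direction: a role:"tool" message becomes a <tool_result> block.
function toolMessageToBlock(msg, name) {
  return `<tool_result id="${msg.tool_call_id}" name="${name}">${msg.content}</tool_result>`;
}
```

Because several tags can appear in one turn, `parseToolCalls` returns an array, which is how parallel calls map onto a single `tool_calls` list.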
348
+ Limitations:
349
+
350
+ - Relies on model format compliance (~95% in practice). Malformed JSON inside a `<tool_call>` tag is silently dropped.
351
+ - `tool_choice` (force-tool, specific-tool) is not yet honored — the model decides whether to call a tool based on prompt cues.
352
+
353
+ ## Gotchas & Fixes
354
+
355
+ Things we learned getting this working:
356
+
357
+ | Issue | Fix |
358
+ |-------|-----|
359
+ | `claude-sonnet-4-6` invalid | SDK resolves it to `claude-sonnet-4-6-20250514` which doesn't exist. Mapped to `claude-sonnet-4-5-20250929` |
360
+ | Old proxy still on port 3456 | Kill stale processes: `lsof -ti :3456 \| xargs kill` (Mac) or `netstat -ano \| findstr 3456` then `taskkill /PID <pid> /F` (Win) |
361
+ | `startup aborted — Missing package` box on start | You pulled new commits but haven't run `npm install` yet. Run `npm install` (or `npm run up` to do both in one step). This is the most common cause of "network connection error" / `ECONNREFUSED` on :3456: the proxy never started because startup bailed |
362
+ | SDK message structure | Assistant text is at `message.message.content[]` (nested), NOT `message.content` |
363
+ | Double/duplicate responses | SDK emits text in `assistant` events AND again in `result`. Only use `result` as fallback when no assistant content was already sent |
364
+ | `maxTurns: 1` blocks tools | Set `maxTurns: 200` for full agent capability. Use `1` only for pure text responses |
365
+ | Rate limiting | Each `query()` spawns a Claude Code session. Avoid running Claude Code CLI alongside the proxy |
366
+ | OpenClaw agents failing | Remove all `anthropic` fallbacks from `openclaw.json` — route everything through `claude-max-proxy` |
367
+ | Hermes `Unknown provider` | Use `provider: custom` in config.yaml. `openai` is NOT a valid Hermes provider. `custom:name` fails at model level — only works in `delegation` blocks |
368
+ | Context shows 0/128K in Hermes | Hermes calls `/v1/models` to detect the context window. The proxy must return `context_length` in each model object. Without it, Hermes falls back to 128K, which can truncate memory injection. Also set `model.context_length: 1000000` in `config.yaml` as an explicit override |
369
+ | Hermes memories not loading | Caused by the 128K context fallback truncating the system prompt before memories are injected. Setting `context_length` to 1M resolves this |
370
+ | Empty result after rate limit | SDK emits `rate_limit_event` then returns empty result. First request usually succeeds |
371
+ | node_modules cross-platform | Delete `node_modules` and `npm install` fresh when moving between Windows and Mac |
372
+
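The "SDK message structure" and "Double/duplicate responses" rows can be illustrated together. The sketch below works on plain objects shaped the way the table describes (`assistant` events with text under `message.content[]`, plus a final `result` event). The exact event shapes and the helper name `collectResponseText` are assumptions for this example, and it does not call the SDK itself.

```javascript
// Assumed event shapes, based on the table above:
//   { type: 'assistant', message: { content: [{ type: 'text', text: '...' }] } }
//   { type: 'result', result: 'final text' }
// Hypothetical helper; not the proxy's actual code.
function collectResponseText(events) {
  let text = '';
  let fallback = '';
  for (const event of events) {
    if (event.type === 'assistant') {
      // Assistant text is nested one level down, under event.message.content
      const blocks = event.message?.content ?? [];
      for (const block of blocks) {
        if (block.type === 'text') text += block.text;
      }
    } else if (event.type === 'result' && typeof event.result === 'string') {
      // The result repeats the assistant text, so keep it only as a fallback
      fallback = event.result;
    }
  }
  return text !== '' ? text : fallback;
}
```

Reading `result` only when no assistant text was collected is what prevents the same reply from being emitted twice.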
373
+ ## Testing
374
+
375
+ ```bash
376
+ node test.js
377
+ ```
378
+
379
+ Runs health, models, validation, non-streaming, and streaming tests.
380
+
381
+ ## What This Replaces
382
+
383
+ | Old (CLI Proxy) | New (SDK Proxy) |
384
+ |-----------------|-----------------|
385
+ | Spawns CLI subprocess per request | Native SDK `query()` call |
386
+ | ~500ms process overhead | Near-zero overhead |
387
+ | Patches nuked on `npm update` | No patches needed |
388
+ | `--dangerously-skip-permissions` flag | `permissionMode: 'bypassPermissions'` |
389
+ | Windows stdin pipe hack | Not needed |
390
+ | `manager.js` + `openai-to-cli.js` patches | Single `server.js` |
391
+
392
+ ## Dependencies
393
+
394
+ Runtime:
395
+ - [`@anthropic-ai/claude-agent-sdk`](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk) — Claude Agent SDK (talks to Claude Max through the CLI keychain)
396
+ - [`express`](https://www.npmjs.com/package/express) — HTTP server
397
+ - [`js-yaml`](https://www.npmjs.com/package/js-yaml) — Parses `~/.mobygate/config.yaml`
398
+ - [`uuid`](https://www.npmjs.com/package/uuid) — Request ID generation
399
+
400
+ Transitive (used in `mcp-inspect.mjs`):
401
+ - [`@modelcontextprotocol/sdk`](https://www.npmjs.com/package/@modelcontextprotocol/sdk) — MCP client for diagnosing image-drop bugs in MCP servers
402
+
403
+ Frontend (loaded via CDN, no build step):
404
+ - [Tailwind CSS](https://tailwindcss.com/) via `cdn.tailwindcss.com`
405
+ - [JetBrains Mono](https://fonts.google.com/specimen/JetBrains+Mono) + [VT323](https://fonts.google.com/specimen/VT323) via Google Fonts
406
+
407
+ ## Releases
408
+
409
+ Tagged releases live at **[github.com/khnfrhn/mobygate/releases](https://github.com/khnfrhn/mobygate/releases)**. Pin by version when cloning for a reproducible install:
410
+
411
+ ```bash
412
+ git clone https://github.com/khnfrhn/mobygate.git
413
+ cd mobygate
414
+ git checkout v0.2.0 # or any other tag
415
+ npm install && npm link && mobygate init
416
+ ```
417
+
418
+ See [CHANGELOG.md](./CHANGELOG.md) for per-version change lists.
419
+
420
+ ## Contributing
421
+
422
+ Designs live in Paper (artboard `01KPFE5G6MJGMT5E5MGA94DQRF`). To port a new design into the dashboard:
423
+
424
+ 1. Select the node in Paper.
425
+ 2. Export its JSX via the Paper MCP `get_jsx` tool.
426
+ 3. Hand the JSX to a Claude session along with the current `index.html`.
427
+ 4. Colors, fonts, spacing, and any ASCII art will translate character-accurately.
428
+
429
+ This is how the v0.2.0 dashboard was built. Screenshots are fine for review; JSX is the source of truth for implementation.