npm - mobygate - Versions diffs - 0.3.0 - Mend

mobygate 0.3.0


Files changed (17)

package/CHANGELOG.md +207 -0
package/LICENSE +21 -0
package/README.md +429 -0
package/bin/mobygate.js +443 -0
package/index.html +805 -0
package/launchd/ai.mobygate.auth-refresh.plist +83 -0
package/lib/ascii.js +108 -0
package/lib/config.js +131 -0
package/lib/dashboard-bus.js +158 -0
package/lib/platform.js +584 -0
package/lib/session-store.js +112 -0
package/mcp-inspect.mjs +186 -0
package/package.json +62 -0
package/scripts/auth-helper.js +198 -0
package/scripts/auth-refresh.js +41 -0
package/scripts/auth-status.js +36 -0
package/server.js +1076 -0

package/CHANGELOG.md ADDED

@@ -0,0 +1,207 @@
+# Changelog
+All notable changes to mobygate are documented here. Format loosely follows
+[Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
+[Semantic Versioning](https://semver.org/).
+## [0.3.0] — 2026-04-19
+Shippable on npm. Sessions survive restarts. Logs live in a canonical
+user-level location so every install method (git clone, `npm install -g`,
+per-user service) sees the same files.
+### Added
+- **Persistent session store** — `~/.mobygate/sessions.json`. Sessions
+  rehydrate on boot; mutations are written to disk on a 500 ms debounce;
+  SIGTERM, SIGINT, and SIGHUP each trigger a synchronous flush. Tested end-to-end:
+  a Claude conversation's context survives `mobygate restart`. See
+  `lib/session-store.js`.
+- **npm publish** — installable via `npm install -g mobygate`. Package
+  metadata added (`repository`, `homepage`, `bugs`, `license`, `author`,
+  `keywords`, `engines: node >=18`) plus an explicit `files` array to
+  keep the tarball lean. MIT `LICENSE` file added.
+### Changed
+- **Canonical log location** — `~/.mobygate/logs/` instead of
+  `{install-dir}/logs/`. Global npm installs often live under a
+  root-owned directory; keeping logs in the per-user config dir makes
+  the service work identically across git-clone, npm-global, and
+  per-user installs. `mobygate init` writes to the new path; existing
+  users will see empty logs after upgrade (a one-time reset).
+### Migration from 0.2.x
+If upgrading a git-clone install:
+```
+git pull
+npm install
+mobygate restart
+```
+Logs from `{repo}/logs/` are left in place but no longer written to;
+new entries go to `~/.mobygate/logs/`. Move the old logs if you want
+to keep history.
+## [0.2.0] — 2026-04-19
+Real product polish. v0.1.0 got the proxy working on three OSes; v0.2.0
+makes it feel like a real dev tool you'd want open on a second monitor.
+### Added
+- **Full web dashboard** at `http://localhost:3456/` — a faithful port
+  of the Paper design (artboard `C1-0`), built from the exported JSX as
+  source of truth (not screenshots):
+  - 4-card KPI strip: Uptime (live-ticking clock), Requests (with
+    stream/tool/img breakdown), Success rate (7-segment progress bar),
+    Avg latency (p50 headline + p95 sub + 14-bar color-thresholded
+    sparkline).
+  - Server / Auth / Traffic row. Server card shows model, context
+    window, build string (e.g., `v0.2.0 · darwin-arm64`). Auth card has
+    LOGGED IN badge + MAX plan pill + force-refresh CTA. Traffic card
+    is a 15-bucket rolling req/min column chart.
+  - Live requests table with kind chips (stream/tool/img/sync), inline
+    latency bars, rounded HTTP status pills, filter buttons
+    (ALL / ERRORS / SLOW > 15 s), click-to-expand details modal.
+  - Sessions panel with per-row expire + expire-all.
+  - Server log tail with auto-refresh (2.5 s poll, smart auto-scroll).
+  - Terminal-style footer with endpoint pills and status line.
+- **`/dashboard/recent`, `/dashboard/sessions`, `/dashboard/logs`**
+  endpoints feeding the UI. `/events` SSE stream for live updates.
+- **Rolling latency + traffic metrics** on the event bus. Per-variant
+  latency samples (stream / sync × last 50) with p50 / p95 computation.
+  Per-minute traffic bucketing retained for the last 15 min.
+- **Terminal banner redesign** — `mobygate init` and `mobygate status`
+  now use the exact Paper whale ASCII and palette (`#B7E56D` green,
+  `#E89B2E` orange, `#4EA4C4` blue, `#8A9A6A` olive, truecolor ANSI).
+  Whale-in-terminal and whale-in-browser are visually identical now.
+- **Build metadata** surfaced from `package.json` via `/dashboard/recent`
+  so the UI always displays the running version.
+- **Web-fonts**: JetBrains Mono (400/500/700) + VT323 loaded from
+  Google Fonts — matches the design-system fonts in Paper.
+### Changed
+- **Rebranded `claude-gate` → `mobygate`.** Distinct from Anthropic's
+  trademarks; Möbius-themed name independent of any single model
+  provider. GitHub repo renamed with auto-redirect kept.
+- Terminal palette switched from 256-color to 24-bit truecolor so the
+  exact design hex values render correctly on any modern terminal.
+- Banner functions accept a `{ version }` option and render a dim
+  `v0.X.Y` next to the title; version always sourced from
+  `package.json` so it stays accurate through rename / tag cycles.
+### Fixed
+- On Windows, `mobygate init` was printing PowerShell instructions
+  instead of registering scheduled tasks. Automation is now wired up
+  through PowerShell spawns with a `.mobygate-server.cmd` launcher
+  wrapping `node server.js` for stdout/stderr redirection into
+  `logs/server.log`. Quoting is no longer fragile across platforms.
+- Node 20+ `DEP0190` deprecation warning on `spawn(..., { shell: true })`
+  with args array in the auth helper; replaced with explicit
+  `spawn('cmd.exe', ['/c', quoted-cmdline])` on Windows.
+- 401 responses that the SDK surfaces as **result-message text**
+  (rather than thrown exceptions) are now detected via pattern match
+  in both streaming and sync handlers. The same refresh + retry path
+  fires as it would for an exception-form 401.
+### Known Gaps
+- **Day-over-day deltas** (e.g., `+3 today`, `↓ 3.2s vs yday` in the
+  design) require historical persistence we haven't built yet. Stats
+  reset on each server restart. Lands with a persistence layer later.
+- **Long-uptime auth** — the Agent SDK caches OAuth creds in memory
+  per-process. After ~7–8 h uptime the in-memory state can go stale
+  even after a keychain refresh. Current mitigation: reactive retry
+  catches most cases; proactive 4-hour cron covers the rest. Full fix
+  lands later (either SDK patch or auto-restart on persistent 401).
+## [0.1.0] — 2026-04-19
+First tagged release. Project rebranded from `claude-max-sdk-proxy` →
+`claude-gate` → `mobygate` during the lead-up to this tag; prior commits
+live in the same GitHub repo (`khnfrhn/mobygate`) but were not semver-tagged.
+### Added
+- **Cross-platform installer** — `mobygate init` sets up the proxy as a
+  managed service on macOS (launchd), Linux (systemd user units), and
+  Windows (Task Scheduler). Interactive prompts for port, default model,
+  session TTL, CLAUDE_BIN override. Writes `~/.mobygate/config.yaml`,
+  starts the services, smoke-tests `/health`. No admin/sudo needed on any
+  platform.
+- **`mobygate` CLI** — `init`, `start`, `stop`, `restart`, `status`,
+  `logs`, `auth`, `uninstall`, `version`.
+- **ASCII whale banner** — orange starfield, green whale with barnacles
+  and baleen-through-mouth water, blue ripple waves. Möbius motif in the
+  whale's eye and waterline. Color auto-disables on non-TTY stdout
+  (pipes, CI, systemd).
+- **OAuth auto-refresh on 401** — `runWithAuthRetry` wraps both stream
+  and sync query loops. Catches exception-form 401s and text-form 401s
+  (SDK sometimes surfaces auth errors as result-message text rather than
+  throws, especially on long-running proxies). Force-refreshes via
+  `claude -p` probe, retries once.
+- **Proactive auth refresh cron** — every 4 hours on all three platforms,
+  using launchd / systemd timer / Task Scheduler. Access tokens last ~8 h,
+  so the 4 h cadence keeps us well inside the valid window.
+- **Tool calling (OpenAI function-calling)** via a prompt-embedded
+  protocol. `<tool_call>` tags in the model's output are parsed and
+  emitted as OpenAI `tool_calls` with `finish_reason: "tool_calls"`.
+  Parallel calls supported. Built-in SDK tools disabled during tool
+  requests (`allowedTools: []`) so the model uses only client-defined
+  tools. Nudge appended when resuming with only tool results so the
+  model doesn't return empty text.
+- **Multimodal passthrough** — OpenAI `image_url` content parts
+  (base64 data URLs + remote HTTP URLs) are translated to Anthropic
+  `image` content blocks and sent via an async-iterable `SDKUserMessage`.
+- **1M context for Opus 4.7** — `claude-opus-4-7` routes to the native
+  `claude-opus-4-7[1m]` variant. Aliases: `claude-opus-4-7-1m` (explicit
+  1M), `claude-opus-4-7-200k` (standard tier).
+- **`/auth/status` and `/auth/refresh`** HTTP endpoints.
+- **npm scripts** — `up` (install + start), `auth:status`, `auth:refresh`.
+- **Startup preflight** — if node_modules is stale, the server dies
+  with a readable boxed error pointing at `npm install`.
+- **`mcp-inspect.mjs`** — raw MCP response inspector over stdio /
+  StreamableHTTP / SSE. Used to confirm that when an MCP server returns
+  image content, the bytes are real — useful for diagnosing client-side
+  image-drop bugs (e.g. the Paper MCP → Hermes image gap).
+- **Hermes patch (out of repo)** — fix for Hermes's MCP→LLM adapter to
+  surface image content blocks from MCP tools as an `image_url` user
+  message, rather than silently dropping them. Documented but not
+  auto-applied.
+### Changed
+- `@anthropic-ai/claude-agent-sdk` bumped `0.2.101` → `0.2.112`.
+- Default model is now `claude-opus-4-7[1m]`.
+- Server listens on port 3456 (default; configurable via
+  `~/.mobygate/config.yaml` or `PORT` env).
+- Config precedence: env vars > `~/.mobygate/config.yaml` > built-in
+  defaults.
+### Removed
+- Dropped the `claude-gate`/`claude-max-sdk-proxy` names. `claude` is an
+  Anthropic trademark and the proxy shape is provider-agnostic; future
+  releases may route through additional providers without the name
+  becoming misleading.
+### Known Gaps
+- **Long-running proxy uptime** — the Agent SDK appears to cache OAuth
+  credentials in memory per-process, so even after a keychain refresh
+  the in-memory state in a 7+ h old process may still be stale. The
+  new result-text 401 detection + retry handles most cases; full fix
+  (either patch the SDK or auto-restart on persistent auth failure)
+  lands later.
+- **Web dashboard** — `/` still serves the simple status page. Live
+  request stream, session browser, auth panel with a "force refresh"
+  button, etc. are planned for the next release.
+- **Tool-calling edge cases** — ~95% format compliance on
+  `<tool_call>` emission; `tool_choice` (force-tool / specific-tool)
+  is not honored; streaming tool-call delta chunks are buffered
+  into a single final chunk.

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Farhan Khan
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,429 @@
+# mobygate
+> OpenAI-compatible local proxy for **Claude Max**.
+> The Möbius-strip gateway: OpenAI shape in, Claude Max out, on a single continuous loop.
+Point any OpenAI-shaped client (Hermes, OpenClaw, custom tools, SDKs) at `http://localhost:3456` and you get Claude Max inference out the other side — without hitting the paid Anthropic API.
+- ✓ Real streaming (SSE)
+- ✓ Multimodal (image URLs + base64 data URLs)
+- ✓ OpenAI-style function calling (`tools` in the request, `tool_calls` in the response; `tool_choice` is not yet honored)
+- ✓ Opus 4.7 with native 1M context variant
+- ✓ Session resume (map a client key → SDK session ID)
+- ✓ OAuth auto-refresh (no more 8-hour-cliff 401 storms)
+- ✓ Live web dashboard with per-request tracing
+- ✓ Cross-platform service install (macOS / Linux / Windows — one command)
+**Current release:** [v0.2.0](https://github.com/khnfrhn/mobygate/releases/tag/v0.2.0) · see [CHANGELOG.md](./CHANGELOG.md) for history.
+## Why
+The older `claude-max-api-proxy` spawned a new Claude Code CLI subprocess for every request — ~500 ms overhead per call, Windows stdin pipe hacks, and patches that got nuked on every `npm update`. mobygate uses the Claude Agent SDK directly: no subprocess spawning, no patches, no maintenance. Same subscription, real streaming, multimodal, tool calling.
+## Quick start
+```bash
+npm install -g mobygate
+mobygate init      # interactive setup: config + service install + smoke test
+```
+Or from source (for hacking on mobygate itself):
+```bash
+git clone https://github.com/khnfrhn/mobygate.git
+cd mobygate
+npm install
+npm link            # makes the `mobygate` command available globally
+mobygate init
+```
+That single `init` does the full cross-platform install:
+| Step | Mac | Linux | Windows |
+|---|---|---|---|
+| Verify Node ≥ 18, `claude` CLI on PATH, `claude auth login` done | ✓ | ✓ | ✓ |
+| Write config to `~/.mobygate/config.yaml` | ✓ | ✓ | ✓ |
+| Install long-running server as user-level service | launchd (`ai.mobygate.server`) | systemd user unit (`mobygate-server.service`) | Task Scheduler (`mobygate-server`) |
+| Install 4-hour auth-refresh cron | launchd plist | systemd `.timer` | Task Scheduler (4h repetition) |
+| Redirect stdout/stderr to `logs/server.log` | ✓ | ✓ | ✓ (via `.cmd` launcher) |
+| Auto-restart on crash | KeepAlive | `Restart=on-failure` | Task Scheduler RestartCount=3 |
+| Smoke-test `/health` | ✓ | ✓ | ✓ |
+No `sudo` required on any platform. No `nssm` on Windows. If the auto-install fails for any reason, `mobygate init` falls back to printing the exact commands to run manually.
+Once installed, the service survives reboots and the daily driver commands are:
+```bash
+mobygate status     # service state, auth state, /health probe
+mobygate logs       # tail logs/server.log
+mobygate auth       # check + force-refresh OAuth token
+mobygate start      # start service (if stopped)
+mobygate stop       # stop service
+mobygate restart    # stop + start
+mobygate uninstall  # remove services (leaves the repo in place)
+mobygate version
+```
+Open **http://localhost:3456/** in your browser for the live dashboard (see below).
+> **Linux headless tip:** user systemd units stop when you log out. For a mobygate that stays up on a server, run `sudo loginctl enable-linger $USER` once. Then it runs whether you're logged in or not.
+> **After `git pull`:** always re-run `npm install` — new commits can bump the SDK or add packages. If you skip this, the server dies with a readable boxed "Missing package" error pointing at `npm install` (or `npm run up` which does both in one step).
+## Dashboard
+Open **http://localhost:3456/** after install for a live, zero-config dashboard:
+- **Header** — whale ASCII · `mobygate vX.Y.Z` · "healthy · live" pill that turns red on disconnect · `clear log` / `force refresh auth` buttons.
+- **KPI strip** — Uptime (live-ticking clock), Requests (total + stream/tool/image breakdown), Success rate (with 7-segment progress bar), Avg latency (p50 headline + p95 secondary + 14-bar color-thresholded sparkline).
+- **Server / Auth / Traffic row** — default model, active sessions, context window, build (`v0.2.0 · darwin-arm64`); email, plan, auth method, last probe, refresh count; 15-minute rolling req/min column chart.
+- **Live requests** — table auto-updates as requests come in. Chips for `stream` / `tool` / `img` / `sync`. Inline latency bar (green < 3 s, blue < 15 s, orange > 15 s). Rounded status pill. Click any row → full start + end event JSON modal. Filter buttons: `ALL / ERRORS / SLOW > 15 s`.
+- **Sessions panel** — active session-key map, per-row `expire`, `expire all`. Live-refreshes when a session is created / updated / expires.
+- **Server log tail** — last 200 lines of `logs/server.log`. Auto-refresh every 2.5 s, smart auto-scroll that doesn't yank you back if you've scrolled up to read.
+- **Footer** — clickable endpoint pills, terminal-style `stream · connected | mobygate · tty0 · 0.2.0` status line.
+Design ported from the Paper artboard (`01KPFE5G6MJGMT5E5MGA94DQRF/C1-0`) via `get_jsx` — exact colors, typography, and ASCII.
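The p50 / p95 figures in the KPI strip can be computed from a small rolling window of latency samples. The sketch below is illustrative only: `createLatencyWindow` is a hypothetical name, the window size of 50 follows the changelog's "last 50" samples, and the nearest-rank percentile method is an assumption, since the source does not say which method mobygate uses.

```javascript
// Hypothetical rolling latency window: keeps only the most recent `size` samples.
function createLatencyWindow(size = 50) {
  const samples = [];
  return {
    record(ms) {
      samples.push(ms);
      if (samples.length > size) samples.shift(); // drop the oldest sample
    },
    // Nearest-rank percentile (assumed method): the value at rank ceil(p% * n).
    percentile(p) {
      if (samples.length === 0) return null;
      const sorted = [...samples].sort((a, b) => a - b);
      const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
      return sorted[rank - 1];
    },
    count() {
      return samples.length;
    },
  };
}
```

Keeping one window per variant (stream and sync) would match the per-variant sampling the changelog describes.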
+## Run (without the CLI)
+If you just want a foreground process without installing services:
+```bash
+node server.js         # normal start
+npm run dev            # auto-reload on changes
+npm run up             # install deps + start (one command — use after git pull)
+```
+The server starts on **port 3456** (same as the old proxy).
+## How It Works
+```
+Discord / Hermes / OpenClaw → POST localhost:3456/v1/chat/completions → Agent SDK query() → Claude Max
+```
+1. Receives OpenAI-format chat completion requests
+2. Converts `messages[]` array to a single prompt string
+3. Calls `query()` from `@anthropic-ai/claude-agent-sdk`
+4. Streams responses back as SSE (Server-Sent Events) in OpenAI format
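Steps 2 and 4 above can be sketched roughly as follows. This is a minimal illustration, not the code in `server.js`: `messagesToPrompt` and `toSseChunk` are hypothetical names, and the `role: text` flattening layout is an assumption because the actual prompt format is not documented here. The chunk shape and the `[DONE]` sentinel follow the OpenAI streaming format.

```javascript
// Step 2 (assumed layout): flatten OpenAI messages into one prompt string.
function messagesToPrompt(messages) {
  return messages
    .map((m) => {
      const text = typeof m.content === 'string'
        ? m.content
        : Array.isArray(m.content)
          ? m.content.filter((p) => p.type === 'text').map((p) => p.text).join('\n')
          : '';
      return `${m.role}: ${text}`;
    })
    .join('\n\n');
}

// Step 4: wrap one text delta as an OpenAI-style chunk in a Server-Sent Event.
function toSseChunk(id, model, deltaText) {
  const chunk = {
    id,
    object: 'chat.completion.chunk',
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [{ index: 0, delta: { content: deltaText }, finish_reason: null }],
  };
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// OpenAI-format streams end with a literal [DONE] sentinel event.
const SSE_DONE = 'data: [DONE]\n\n';
```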
+## Endpoints
+**OpenAI-compatible:**
+| Method | Path | Description |
+|--------|------|-------------|
+| `POST` | `/v1/chat/completions` | Chat completions (streaming + non-streaming) |
+| `GET`  | `/v1/models` | List available models with context lengths |
+**Operations:**
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET`  | `/health` | Liveness + active session count |
+| `GET`  | `/auth/status` | OAuth state (add `?quick=1` to skip the live probe) |
+| `POST` | `/auth/refresh` | Force an OAuth refresh probe (cron hook) |
+**Session management:**
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET`    | `/sessions`           | List all active sessions |
+| `GET`    | `/sessions/:key`      | Inspect one session |
+| `DELETE` | `/sessions/:key`      | Expire a single session |
+| `DELETE` | `/sessions`           | Expire all sessions |
+**Dashboard feed:**
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET` | `/` | The live dashboard (HTML) |
+| `GET` | `/events` | SSE stream of all dashboard events (`request.start`, `request.end`, `auth.refresh`, `session.*`, `server.boot`) with 15 s heartbeat |
+| `GET` | `/dashboard/recent?limit=N` | Ring-buffer snapshot + stats + build meta for initial page load |
+| `GET` | `/dashboard/sessions` | Per-session detail with idle + TTL-remaining times |
+| `GET` | `/dashboard/logs?lines=N` | Last N lines of `logs/server.log` |
+## Model Mapping
+| Input | Resolves To |
+|-------|------------|
+| `claude-opus-4`, `claude-opus-4-7`, `opus` | `claude-opus-4-7[1m]` (1M context) |
+| `claude-opus-4-7-200k` | `claude-opus-4-7` (standard 200k) |
+| `claude-opus-4-6` | `claude-opus-4-6` |
+| `claude-sonnet-4`, `claude-sonnet-4-5`, `claude-sonnet-4-6`, `sonnet` | `claude-sonnet-4-5-20250929` |
+| `claude-haiku-4`, `claude-haiku-4-5`, `haiku` | `claude-haiku-4-5-20251001` |
+Provider prefixes are stripped automatically (e.g., `claude-max-proxy/claude-opus-4-7` → `claude-opus-4-7`).
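The mapping table and prefix stripping can be expressed as a small lookup. The sketch below is an assumption-laden illustration: `resolveModel` and `MODEL_ALIASES` are hypothetical names, the alias entries are transcribed from the table above, and passing unknown model names through unchanged is a guess about behavior that the source does not describe.

```javascript
// Alias table transcribed from the Model Mapping table.
const MODEL_ALIASES = new Map([
  ['claude-opus-4', 'claude-opus-4-7[1m]'],
  ['claude-opus-4-7', 'claude-opus-4-7[1m]'],
  ['opus', 'claude-opus-4-7[1m]'],
  ['claude-opus-4-7-200k', 'claude-opus-4-7'],
  ['claude-opus-4-6', 'claude-opus-4-6'],
  ['claude-sonnet-4', 'claude-sonnet-4-5-20250929'],
  ['claude-sonnet-4-5', 'claude-sonnet-4-5-20250929'],
  ['claude-sonnet-4-6', 'claude-sonnet-4-5-20250929'],
  ['sonnet', 'claude-sonnet-4-5-20250929'],
  ['claude-haiku-4', 'claude-haiku-4-5-20251001'],
  ['claude-haiku-4-5', 'claude-haiku-4-5-20251001'],
  ['haiku', 'claude-haiku-4-5-20251001'],
]);

const DEFAULT_MODEL = 'claude-opus-4-7[1m]';

function resolveModel(requested) {
  if (!requested) return DEFAULT_MODEL; // fallback when the client sends no model
  // Strip a provider prefix such as "claude-max-proxy/".
  const bare = requested.includes('/')
    ? requested.slice(requested.lastIndexOf('/') + 1)
    : requested;
  // Assumption: names missing from the table pass through unchanged.
  return MODEL_ALIASES.get(bare) ?? bare;
}
```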
+## Client Configuration
+### OpenClaw (`~/.openclaw/openclaw.json`)
+Add under `models.providers`:
+```json
+"claude-max-proxy": {
+  "baseUrl": "http://localhost:3456/v1",
+  "apiKey": "claude-max",
+  "api": "openai-completions",
+  "models": [
+    { "id": "claude-opus-4-7", "contextWindow": 1000000, "maxTokens": 16384 },
+    { "id": "claude-opus-4-6", "contextWindow": 200000, "maxTokens": 16384 },
+    { "id": "claude-sonnet-4-6", "contextWindow": 200000, "maxTokens": 16384 },
+    { "id": "claude-haiku-4-5", "contextWindow": 200000, "maxTokens": 16384 }
+  ]
+}
+```
+Set as default in `agents.defaults.model`:
+```json
+"primary": "claude-max-proxy/claude-opus-4-7"
+```
+### Hermes Agent (`~/.hermes/config.yaml`)
+```yaml
+model:
+  default: claude-opus-4-7
+  provider: custom          # MUST be "custom", not "openai" or "custom:name"
+  api_key: claude-max
+  base_url: http://127.0.0.1:3456/v1
+  context_length: 1000000   # explicit override — ensures 1M context
+providers:
+  claude-max-proxy:
+    api: http://127.0.0.1:3456/v1
+    name: Claude Max Proxy
+    api_key: claude-max
+    default_model: claude-opus-4-7
+```
+Also add to `~/.hermes/auth.json` credential_pool:
+```json
+"custom:claude-max-proxy": [{
+  "id": "a1b2c3",
+  "label": "Claude Max Proxy",
+  "auth_type": "api_key",
+  "priority": 0,
+  "source": "config:Claude Max Proxy",
+  "access_token": "claude-max",
+  "base_url": "http://127.0.0.1:3456/v1",
+  "request_count": 0
+}]
+```
+> **Hermes provider caveat:** The top-level `model.provider` must be `custom`. Hermes doesn't recognize `openai` as a provider, and `custom:name` only works in `delegation` blocks, not at the model level. The `custom` keyword tells Hermes to read `base_url` and `api_key` from the `model:` config. Aliases that also work: `ollama`, `lmstudio`, `vllm`, `llamacpp`.
+### Any OpenAI-compatible client
+```
+base_url: http://localhost:3456/v1
+api_key:  claude-max   (any non-empty string works)
+model:    claude-opus-4-7
+```
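For a client without an OpenAI SDK, the same three settings map onto a plain HTTP request. The sketch below assumes Node 18+ (for the built-in `fetch`) and a mobygate instance listening on the default port; `buildChatRequest` and `chat` are hypothetical helper names, and sending the key as a `Bearer` header follows the usual OpenAI client convention rather than anything the proxy is documented to require.

```javascript
const BASE_URL = 'http://localhost:3456/v1';

// Build the fetch options for a non-streaming chat completion.
function buildChatRequest(prompt, model = 'claude-opus-4-7') {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: 'Bearer claude-max', // any non-empty string works
    },
    body: JSON.stringify({
      model,
      messages: [{ role: 'user', content: prompt }],
    }),
  };
}

// Send the request and return the assistant's text.
async function chat(prompt) {
  const res = await fetch(`${BASE_URL}/chat/completions`, buildChatRequest(prompt));
  if (!res.ok) throw new Error(`mobygate returned HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}

// Usage (requires a running mobygate instance):
// chat('Say hello in five words.').then(console.log);
```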
+## Configuration
+Precedence, highest wins: **env var → `~/.mobygate/config.yaml` → built-in default**.
+`mobygate init` writes a commented YAML file you can hand-edit. Env vars always override the file, so you can set one-off values (e.g. a different port per shell) without editing config.
+| Variable | Config field | Default | Description |
+|----------|-------------|---------|-------------|
+| `PORT` | `port` | `3456` | Server port |
+| `DEFAULT_MODEL` | `default_model` | `claude-opus-4-7[1m]` | Fallback model when none specified |
+| `SESSION_TTL_MINUTES` | `session_ttl_minutes` | `60` | Idle timeout for session keys mapped to SDK sessions |
+| `AUTH_REFRESH_INTERVAL_HOURS` | `auth_refresh_interval_hours` | `4` | How often the proactive refresh cron fires |
+| `CLAUDE_BIN` | `claude_bin` | *(empty → PATH lookup)* | Absolute path to the `claude` binary if not on PATH |
+| `LOG_LEVEL` | `log_level` | `info` | Reserved; currently informational only |
+| `MOBYGATE_HOME` | — | `~/.mobygate` | Directory for config + state files |
+| `MOBYGATE_NODE_BIN` | — | `process.execPath` | Node binary baked into service definitions (launchd/systemd/Task Scheduler) |
+| `NO_COLOR` | — | unset | Disable ANSI color in CLI banner output |
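The precedence rule (env var over config file over built-in default) can be sketched as a layered merge. This is an illustration under stated assumptions, not the contents of `lib/config.js`: `resolveConfig` is a hypothetical name, only four of the fields from the table are covered, and treating an empty env var as unset is a guess.

```javascript
// Built-in defaults, taken from the table above.
const DEFAULTS = {
  port: 3456,
  default_model: 'claude-opus-4-7[1m]',
  session_ttl_minutes: 60,
  auth_refresh_interval_hours: 4,
};

// Env var name -> config field, also from the table.
const ENV_TO_FIELD = {
  PORT: 'port',
  DEFAULT_MODEL: 'default_model',
  SESSION_TTL_MINUTES: 'session_ttl_minutes',
  AUTH_REFRESH_INTERVAL_HOURS: 'auth_refresh_interval_hours',
};

// fileConfig is the already-parsed YAML object; env is typically process.env.
function resolveConfig(fileConfig = {}, env = {}) {
  const config = { ...DEFAULTS, ...fileConfig }; // file overrides defaults
  for (const [envName, field] of Object.entries(ENV_TO_FIELD)) {
    const raw = env[envName];
    if (raw === undefined || raw === '') continue; // assumption: empty means unset
    // Env values are strings; coerce to number where the default is numeric.
    config[field] = typeof DEFAULTS[field] === 'number' ? Number(raw) : raw;
  }
  return config;
}
```

With this shape, `PORT=5000 node server.js` would override a `port: 4000` entry in the YAML file for that one shell without editing the file.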
+## Diagnosing MCP Image Drops
+If a client (e.g. Hermes) reports that an MCP tool returned an empty screenshot or image, use `mcp-inspect.mjs` to bypass the client and talk to the MCP server directly — this isolates whether the image is being dropped in the MCP server itself or in the client's normalization layer.
+```bash
+# stdio transport — spawn the MCP server as a subprocess
+node mcp-inspect.mjs --cmd "<server-exe>" --args '["<arg1>"]' --list
+node mcp-inspect.mjs --cmd "<server-exe>" --args '["<arg1>"]' \
+  --tool get_screenshot --params '{"nodeId":"WL-0"}'
+# HTTP (StreamableHTTP) transport — e.g. Paper running at localhost:29979/mcp
+node mcp-inspect.mjs --url "http://127.0.0.1:29979/mcp" --list
+node mcp-inspect.mjs --url "http://127.0.0.1:29979/mcp" \
+  --tool get_screenshot --params '{"nodeId":"WL-0"}'
+# Legacy SSE transport
+node mcp-inspect.mjs --url "http://127.0.0.1:1234/sse" --transport sse --list
+```
+If the output shows a non-empty `image` content block with hundreds of KB of base64, the MCP server is fine and the client is stripping the image. If the image block is missing or empty, the MCP server itself is the culprit.
+## Auth & Token Refresh
+The proxy inherits Claude Max OAuth credentials from the local CLI keychain (macOS: `Claude Code-credentials`; Windows: Credential Manager; Linux: libsecret / GNOME Keyring). Access tokens last ~8 hours and are supposed to refresh silently, but in practice the SDK occasionally surfaces `401 Invalid authentication credentials` — either as a thrown error, or as the literal text of a `result` message on long-uptime processes.
+`mobygate init` installs both defenses automatically; you shouldn't need to touch any of this. Reference only:
+**1. Reactive retry on 401.** Both streaming and non-streaming handlers wrap the SDK query in `runWithAuthRetry` (see `scripts/auth-helper.js`). Exception-form 401s and result-text-form 401s (`Failed to authenticate. API Error: 401 ...`) both trigger a shell-out to `claude -p`, which forces a token refresh via the still-valid refresh token; the query is then retried once. Every step is logged, for example `[auth] 401 on sync call — refreshing` and `[auth] refreshed in 1234 ms — retrying sync call`.
+**2. Proactive 4-hour cron.** `scripts/auth-refresh.js` is cross-platform. `mobygate init` wires it up via launchd (macOS), systemd `.timer` (Linux), or Task Scheduler (Windows). Access tokens last ~8 hours, so a 4-hour cadence keeps us comfortably inside the valid window even if one run fails.
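The reactive retry in defense 1 follows a detect, refresh, retry-once shape. The sketch below is a simplified illustration and not the code in `scripts/auth-helper.js`: `runWithAuthRetrySketch`, `isAuthError`, and the regular expression are hypothetical, and the `run` / `refresh` callbacks stand in for the SDK query and the `claude -p` probe.

```javascript
// Assumed pattern for 401s, based on the two messages quoted in this section.
const AUTH_ERROR_PATTERN = /API Error: 401|401 Invalid authentication credentials/i;

// True for a thrown Error or a result string that looks like a 401.
function isAuthError(value) {
  const text = value instanceof Error ? value.message : String(value ?? '');
  return AUTH_ERROR_PATTERN.test(text);
}

// `run` performs the query and resolves to its result text;
// `refresh` forces a token refresh. Both are supplied by the caller.
async function runWithAuthRetrySketch(run, refresh) {
  for (let attempt = 0; attempt < 2; attempt++) {
    const isLastAttempt = attempt === 1;
    try {
      const result = await run();
      // Text-form 401: retry only if this was the first attempt.
      if (!isAuthError(result) || isLastAttempt) return result;
    } catch (err) {
      // Exception-form 401: same rule; anything else propagates immediately.
      if (!isAuthError(err) || isLastAttempt) throw err;
    }
    await refresh(); // reached only after a 401 on the first attempt
  }
}
```

Limiting the loop to two attempts is what keeps a revoked refresh token from causing an endless refresh cycle; the second 401 is returned or thrown to the caller as-is.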
+**CLI helpers:**
+```bash
+mobygate auth           # show status + run a live probe
+npm run auth:status     # same via npm script (prints JSON)
+npm run auth:status:quick  # keychain-only, no live probe (instant)
+npm run auth:refresh    # force a refresh probe, print JSON result
+```
+**Escape hatch — full re-auth required:** if `claude auth status --json` reports `loggedIn: true` but you're still getting 401s after `mobygate auth` successfully refreshes, the refresh token itself has been revoked. Run `claude auth login` to do a full OAuth reauth, then `mobygate restart`. Rare; happens if you've signed out of Claude from another device.
+<details>
+<summary><b>Manual cron install (fallback if <code>mobygate init</code> didn't run the scheduler for you)</b></summary>
+**macOS (launchd):**
+```bash
+cp launchd/ai.mobygate.auth-refresh.plist ~/Library/LaunchAgents/
+launchctl load ~/Library/LaunchAgents/ai.mobygate.auth-refresh.plist
+```
+**Linux (cron):**
+```
+0 */4 * * * cd /path/to/mobygate && /usr/bin/node scripts/auth-refresh.js >> logs/auth-refresh.log 2>&1
+```
+Or systemd timer: `mobygate init` generates these by default. To do it by hand, create `~/.config/systemd/user/mobygate-auth.{service,timer}` — service runs `/usr/bin/node /path/to/mobygate/scripts/auth-refresh.js`, timer has `OnUnitActiveSec=4h` and `OnBootSec=1min`. Then `systemctl --user enable --now mobygate-auth.timer`.
+**Windows (Task Scheduler):**
+```powershell
+$A = New-ScheduledTaskAction -Execute "node.exe" `
+  -Argument "scripts\auth-refresh.js" `
+  -WorkingDirectory "C:\path\to\mobygate"
+$T = New-ScheduledTaskTrigger -Once -At (Get-Date) `
+  -RepetitionInterval (New-TimeSpan -Hours 4)
+Register-ScheduledTask -TaskName "mobygate-auth-refresh" -Action $A -Trigger $T
+```
+</details>
+## Multimodal
+OpenAI `image_url` content parts are translated to Anthropic `image` content blocks. Both base64 data URLs and remote `https:` URLs work:
+```json
+{
+  "role": "user",
+  "content": [
+    { "type": "text", "text": "What's in this image?" },
+    { "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBORw0KGgo..." } }
+  ]
+}
+```
+When images are present in the request, the proxy switches from a plain-string prompt to an async-iterable `SDKUserMessage` with mixed-content blocks. Nothing else in the OpenAI shape changes. The dashboard shows an `img` chip on any request that carried images.
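The translation from OpenAI content parts to Anthropic content blocks can be sketched as below. The function names are hypothetical, and passing remote URLs through as a `url` image source (rather than downloading and re-encoding them) is an assumption; the source only states that both forms work.

```javascript
// Convert one OpenAI image_url part to an Anthropic image content block.
function imageUrlPartToAnthropic(part) {
  const url = part.image_url.url;
  // data:<media-type>;base64,<payload>
  const match = /^data:([^;,]+);base64,(.+)$/s.exec(url);
  if (match) {
    return {
      type: 'image',
      source: { type: 'base64', media_type: match[1], data: match[2] },
    };
  }
  // Assumption: remote URLs are forwarded as a URL image source.
  return { type: 'image', source: { type: 'url', url } };
}

// Convert a full OpenAI `content` value (string or parts array) to blocks.
function toAnthropicContent(content) {
  if (typeof content === 'string') return [{ type: 'text', text: content }];
  return content.map((part) =>
    part.type === 'image_url'
      ? imageUrlPartToAnthropic(part)
      : { type: 'text', text: part.text });
}
```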
+## Tool Calling
+OpenAI-style function calling is supported via a **prompt-embedded protocol** (the Agent SDK's native MCP mechanism pollutes session state on abort and gates tools behind `ToolSearch` — neither works for OpenAI's "emit call, client executes, send result back" flow).
+How it works:
+- Client sends `tools: [{type: "function", function: {...}}]` in the OpenAI request.
+- Proxy injects the tool schemas into the system prompt and instructs the model to emit `<tool_call>{"name":"...","arguments":{...}}</tool_call>` tags.
+- When a complete `<tool_call>` tag is detected in the model's stream, the SDK query is aborted, tags are parsed, and the response is emitted as OpenAI `tool_calls` with `finish_reason: "tool_calls"`.
+- On the follow-up request, `role: "tool"` messages are translated into `<tool_result id="..." name="...">...</tool_result>` blocks for the model.
+- Parallel calls supported — the model can emit multiple `<tool_call>` tags in one turn.
+- Streaming responses with tools are buffered and emitted as a single chunk (OpenAI tool-call streaming deltas are not currently exposed piecewise).
+- Built-in SDK tools (Read, Bash, Grep, etc.) are disabled via `allowedTools: []` during tool-calling requests so the model can only use client-defined tools.
+Limitations:
+- Relies on model format compliance (~95% in practice). Malformed JSON inside a `<tool_call>` tag is silently dropped.
+- `tool_choice` (force-tool, specific-tool) is not yet honored — the model decides whether to call a tool based on prompt cues.
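The tag-parsing step of the protocol can be sketched as follows. This is a minimal illustration rather than the parser in `server.js`: `parseToolCalls`, `toToolCallChoice`, and the `call_N` id scheme are hypothetical. Malformed JSON is skipped silently to mirror the limitation noted above, and `arguments` is re-serialized to a string because the OpenAI `tool_calls` format carries arguments as a JSON string.

```javascript
// Extract every well-formed <tool_call> tag from the model's text output.
function parseToolCalls(modelText) {
  const toolCalls = [];
  const tagPattern = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  for (const match of modelText.matchAll(tagPattern)) {
    let parsed;
    try {
      parsed = JSON.parse(match[1]);
    } catch {
      continue; // malformed JSON inside a tag is dropped
    }
    if (!parsed || typeof parsed.name !== 'string') continue;
    toolCalls.push({
      id: `call_${toolCalls.length}`, // hypothetical id scheme
      type: 'function',
      function: {
        name: parsed.name,
        arguments: JSON.stringify(parsed.arguments ?? {}),
      },
    });
  }
  return toolCalls;
}

// Build the OpenAI-style choice for a turn that produced tool calls.
function toToolCallChoice(modelText) {
  const tool_calls = parseToolCalls(modelText);
  if (tool_calls.length === 0) return null; // caller falls back to plain text
  return {
    index: 0,
    message: { role: 'assistant', content: null, tool_calls },
    finish_reason: 'tool_calls',
  };
}
```

Because the pattern is global and non-greedy, several tags in one turn yield several entries, which is how parallel calls fall out of the same parser.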
+## Gotchas & Fixes
+Things we learned getting this working:
+| Issue | Fix |
+|-------|-----|
+| `claude-sonnet-4-6` invalid | SDK resolves it to `claude-sonnet-4-6-20250514` which doesn't exist. Mapped to `claude-sonnet-4-5-20250929` |
+| Old proxy still on port 3456 | Kill stale processes: `lsof -ti :3456 \| xargs kill` (Mac) or `netstat -ano \| findstr 3456` then `taskkill /PID <pid> /F` (Win) |
+| `startup aborted — Missing package` box on start | You pulled new commits but didn't run `npm install` yet. Run `npm install` (or `npm run up` to do both in one step). Most common cause of "network connection error" / `ECONNREFUSED` on :3456 — the proxy wasn't running because startup bailed |
+| SDK message structure | Assistant text is at `message.message.content[]` (nested), NOT `message.content` |
+| Double/duplicate responses | SDK emits text in `assistant` events AND again in `result`. Only use `result` as fallback when no assistant content was already sent |
+| `maxTurns: 1` blocks tools | Set `maxTurns: 200` for full agent capability. Use `1` only for pure text responses |
+| Rate limiting | Each `query()` spawns a Claude Code session. Avoid running Claude Code CLI alongside the proxy |
+| OpenClaw agents failing | Remove all `anthropic` fallbacks from `openclaw.json` — route everything through `claude-max-proxy` |
+| Hermes `Unknown provider` | Use `provider: custom` in config.yaml. `openai` is NOT a valid Hermes provider. `custom:name` fails at model level — only works in `delegation` blocks |
+| Context shows 0/128K in Hermes | Hermes calls `/v1/models` to detect context window. Proxy must return `context_length` in each model object. Without it, Hermes falls back to 128K which can truncate memory injection. Also set `model.context_length: 1000000` in `config.yaml` as explicit override |
+| Hermes memories not loading | Caused by 128K context fallback truncating system prompt before memories get injected. Fixing context_length to 1M resolves this |
+| Empty result after rate limit | SDK emits `rate_limit_event` then returns empty result. First request usually succeeds |
+| node_modules cross-platform | Delete `node_modules` and `npm install` fresh when moving between Windows and Mac |
+## Testing
+```bash
+node test.js
+```
+Runs health, models, validation, non-streaming, and streaming tests.
+## What This Replaces
+| Old (CLI Proxy) | New (SDK Proxy) |
+|-----------------|-----------------|
+| Spawns CLI subprocess per request | Native SDK `query()` call |
+| ~500ms process overhead | Near-zero overhead |
+| Patches nuked on `npm update` | No patches needed |
+| `--dangerously-skip-permissions` flag | `permissionMode: 'bypassPermissions'` |
+| Windows stdin pipe hack | Not needed |
+| `manager.js` + `openai-to-cli.js` patches | Single `server.js` |
+## Dependencies
+Runtime:
+- [`@anthropic-ai/claude-agent-sdk`](https://www.npmjs.com/package/@anthropic-ai/claude-agent-sdk) — Claude Agent SDK (talks to Claude Max through the CLI keychain)
+- [`express`](https://www.npmjs.com/package/express) — HTTP server
+- [`js-yaml`](https://www.npmjs.com/package/js-yaml) — Parses `~/.mobygate/config.yaml`
+- [`uuid`](https://www.npmjs.com/package/uuid) — Request ID generation
+Transitive (used in `mcp-inspect.mjs`):
+- [`@modelcontextprotocol/sdk`](https://www.npmjs.com/package/@modelcontextprotocol/sdk) — MCP client for diagnosing image-drop bugs in MCP servers
+Frontend (loaded via CDN, no build step):
+- [Tailwind CSS](https://tailwindcss.com/) via `cdn.tailwindcss.com`
+- [JetBrains Mono](https://fonts.google.com/specimen/JetBrains+Mono) + [VT323](https://fonts.google.com/specimen/VT323) via Google Fonts
+## Releases
+Tagged releases live at **[github.com/khnfrhn/mobygate/releases](https://github.com/khnfrhn/mobygate/releases)**. Pin by version when cloning for a reproducible install:
+```bash
+git clone https://github.com/khnfrhn/mobygate.git
+cd mobygate
+git checkout v0.2.0   # or any other tag
+npm install && npm link && mobygate init
+```
+See [CHANGELOG.md](./CHANGELOG.md) for per-version change lists.
+## Contributing
+Designs live in Paper (artboard `01KPFE5G6MJGMT5E5MGA94DQRF`). To port a new design into the dashboard:
+1. Select the node in Paper.
+2. Export its JSX via the Paper MCP `get_jsx` tool.
+3. Hand the JSX to a Claude session along with the current `index.html`.
+4. Colors, fonts, spacing, and any ASCII art will translate character-accurately.
+This is how the v0.2.0 dashboard was built. Screenshots are fine for review; JSX is the source of truth for implementation.