npm - mobygate - Versions diffs - 0.7.1 → 0.7.3 - Mend

mobygate 0.7.1 → 0.7.3

Files changed (8)

package/CHANGELOG.md CHANGED

@@ -4,6 +4,111 @@ All notable changes to mobygate are documented here. Format loosely follows
 [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
 [Semantic Versioning](https://semver.org/).
+## [0.7.3] — 2026-04-25
+Hotfix bundle from a thorough security + bugs + ops audit. Six items.
+### Fixed (security)
+- **Same-origin gate on control-plane endpoints.** `/update/apply`,
+  `/auth/refresh`, `DELETE /sessions(/:key)`, `/dashboard/logs`, and
+  `/events` now require the request's `Host` header to be localhost
+  and (when present) the `Origin` header to match. This blocks the
+  DNS-rebinding scenario where a malicious site reroutes its DNS to
+  `127.0.0.1` and triggers `npm install -g`, drains Claude Max quota
+  via auth-refresh spam, or tails prompt content from server logs
+  through any browser tab the user has open. Proxy endpoints
+  (`/v1/chat/completions`, `/v1/messages`, `/v1/models`, `/health`)
+  stay open for client traffic.
+### Fixed (operational)
+- **Dashboard "Update now" silently no-op'd on Windows.** `lib/updater.js`
+  hardcoded `WIN_SERVER_TASK = 'ai.mobygate.server'` while the task
+  `lib/platform.js` actually registers is `'mobygate-server'`. The
+  cmd.exe `&&` chain short-circuited on the failed `schtasks /End`
+  and never ran `npm install`. Now imports `WIN_LABELS` and
+  `LINUX_UNITS` directly from platform.js, single-source-of-truth.
+- **`mobygate stop` and `mobygate update` now stop BOTH services**
+  (server + auth-refresh) on all three platforms. Earlier the auth
+  task could fire mid-update on Windows, grab file handles in
+  `node_modules\mobygate`, and trigger EBUSY — root cause of the
+  v0.6/v0.7 EBUSY churn. The dashboard `/update/apply` flow also
+  stops both services in the detached child script.
+- **Mac auth-refresh plist now generated programmatically.** Earlier
+  releases shipped a static plist template (`launchd/ai.mobygate.auth-refresh.plist`)
+  with hardcoded paths from Farhan's machine and sed-replaced them
+  at install time. Anyone whose username, install path, or fnm
+  Node version didn't EXACTLY match the patterns ended up with a
+  plist pointing at non-existent paths and the cron silently
+  failed forever. New `writeMacAuthRefreshPlist()` in `lib/platform.js`
+  mirrors `writeMacServerPlist`, generates from the user's actual
+  paths, portable across any user.
+### Fixed (bugs)
+- **Image + 401 auth-retry no longer hangs / returns empty.** When a
+  multimodal request hit the SDK right as the OAuth token expired,
+  `runWithAuthRetry` would invoke `runQuery` a second time with the
+  same already-exhausted async iterator (multimodal returns a
+  single-use generator). The SDK got an empty user message and the
+  client received an empty response. `prompt` is now built lazily
+  inside `runQuery` so each retry attempt rebuilds the iterator.
+  All four handlers fixed.
+- **400 instead of "model responds to its own reply"** when a resumed
+  request's history terminates with an assistant turn. Earlier
+  `messagesToPrompt` in resume mode fell back to extracting whatever
+  was at `messages[-1]`, sending the assistant's previous reply to
+  the SDK as the new user prompt. Now both `messagesToPrompt`
+  (OpenAI) and `anthropicMessagesToPrompt` (Anthropic) return a
+  structured `{ promptText, error }` and the handler returns 400
+  with a readable message when the trailing turn isn't user/tool.
+### Notes
+- For LAN-exposed installs (`bind: 0.0.0.0`), the same-origin gate
+  is necessary but not sufficient — anyone on the LAN can still hit
+  endpoints with a faked `Host` header. A real `MOBYGATE_TOKEN` for
+  LAN auth is queued for a follow-up release.
+- The `launchd/ai.mobygate.auth-refresh.plist` template file is now
+  unused. It is left in the package for backward compatibility and
+  will be removed in a future release.
+## [0.7.2] — 2026-04-25
+### Fixed
+- **"I can't use the tool 'grep' here because it isn't available" refusals**
+  in long-running tasks. Even with `allowedTools: ['mcp__mobygate__*']`
+  blocking everything except client-defined tools, the model retains
+  strong priors from training for Claude Code's built-ins (Bash, Grep,
+  Read, Edit, Glob, WebFetch, ToolSearch, etc.). When a task seemed to
+  call for one — e.g., "find all TODOs" → instinctive reach for Grep —
+  the model would attempt it, get blocked, refuse the task, and stop,
+  instead of falling back to the available client tool (`searchFiles`,
+  `terminal`, etc.).
+  **Fix:** for any tool-enabled request, append a short system-prompt
+  block (~150 tokens) via the SDK's
+  `systemPrompt: { type: 'preset', preset: 'claude_code', append: ... }`
+  option. The append explicitly lists the available client tools and
+  states that Claude Code's built-ins are NOT in this environment.
+  Calibrated to be matter-of-fact ("here's the environment, work
+  within it") rather than over-restrictive — the model now uses
+  available tools or briefly says what's missing, instead of refusing
+  silently.
+  Applies to both `/v1/chat/completions` and `/v1/messages`.
+### Notes
+- New helper: `buildToolUsageGuidance(tools)` in `lib/tool-bridge.js`
+  produces the append text from the OpenAI-shape tool array. The
+  Anthropic surface translates its tool defs to OpenAI shape for the
+  bridge already, so the helper takes one input shape across both.
+- Per-request token overhead: ~150 tokens, only when `tools` is non-empty.
+  No effect on text-only chat or non-tool requests.
 ## [0.7.1] — 2026-04-24
 Fixes a meaningful token-burn issue for clients that don't pass session
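
The same-origin gate described in the 0.7.3 entry can be sketched roughly as follows. This is a minimal illustration, not mobygate's actual check (which is not part of this diff): the helper names `parseHostHeader` and `isLocalControlRequest`, the set of accepted hostnames, and the rule that `Origin` must carry the same host and port as `Host` are all assumptions.

```javascript
// Hypothetical helpers: mobygate's real gate is not shown in this diff.
const LOCAL_HOSTNAMES = new Set(['localhost', '127.0.0.1', '[::1]']);

// Parse a Host header value ("name", "name:port", "[v6]:port") into a URL.
function parseHostHeader(hostHeader) {
  if (typeof hostHeader !== 'string' || hostHeader === '') return null;
  try {
    return new URL(`http://${hostHeader}`);
  } catch {
    return null;
  }
}

// Accept only requests whose Host is local and whose Origin, when
// present, names the same host and port (assumed matching rule).
function isLocalControlRequest({ host, origin }) {
  const hostUrl = parseHostHeader(host);
  if (!hostUrl || !LOCAL_HOSTNAMES.has(hostUrl.hostname)) return false;
  if (origin === undefined) return true;
  let originUrl;
  try {
    originUrl = new URL(origin);
  } catch {
    return false; // e.g. the literal string "null" from an opaque origin
  }
  return originUrl.host === hostUrl.host;
}
```

In a DNS-rebinding attempt the browser still sends the attacker's domain in `Host`, even though that name now resolves to `127.0.0.1`, which is why a `Host` check alone already rejects the request in this sketch.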

package/bin/mobygate.js CHANGED

@@ -28,7 +28,7 @@ import { loadConfig, writeConfig, writeState, readState, CONFIG_DIR, CONFIG_PATH
 import {
   PLATFORM, IS_MAC, IS_LINUX, IS_WIN,
   resolveNodeBin,
-  writeMacServerPlist, launchctlLoad, launchctlUnload,
+  writeMacServerPlist, writeMacAuthRefreshPlist, launchctlLoad, launchctlUnload,
   plistPathForLabel, queryLaunchd, uninstallAllServices,
   installWindowsServices, uninstallWindowsServices,
   queryWindowsTask, startWindowsTask, stopWindowsTask, WIN_LABELS,
@@ -204,21 +204,18 @@ async function cmdInit() {
     launchctlLoad(serverPlist);
     ok(`Installed ${SERVER_LABEL} (launchd)`);
-    // Auth refresh plist (we ship a template in launchd/ — copy + rewrite
-    // WorkingDirectory, node path, and log paths to match this install).
-    const authSrc = join(REPO_ROOT, 'launchd', 'ai.mobygate.auth-refresh.plist');
-    if (existsSync(authSrc)) {
-      const authDst = plistPathForLabel(AUTH_LABEL);
-      const tmpl = readFileSync(authSrc, 'utf8')
-        // WorkingDirectory + any path that referenced the repo root
-        .replace(/\/Users\/farhan\/openclaude\/claude-max-sdk-proxy\/logs/g, logsDir)
-        .replace(/\/Users\/farhan\/openclaude\/claude-max-sdk-proxy/g, REPO_ROOT)
-        // node binary baked into ProgramArguments
-        .replace(/\/Users\/farhan\/\.local\/share\/fnm\/aliases\/default\/bin\/node/g, nodeBin);
-      writeFileSync(authDst, tmpl);
-      launchctlLoad(authDst);
-      ok(`Installed ${AUTH_LABEL} (launchd, every ${existing.auth_refresh_interval_hours}h)`);
-    }
+    // Auth refresh plist — generated programmatically with the user's
+    // actual paths. Earlier we shipped a static template and sed-replaced
+    // hardcoded paths inside it, which silently broke for anyone whose
+    // username/install-path/fnm-version didn't EXACTLY match Farhan's.
+    const authPlist = writeMacAuthRefreshPlist({
+      installPath: REPO_ROOT,
+      nodeBin,
+      logsDir,
+      intervalHours: existing.auth_refresh_interval_hours,
+    });
+    launchctlLoad(authPlist);
+    ok(`Installed ${AUTH_LABEL} (launchd, every ${existing.auth_refresh_interval_hours}h)`);
   } else if (IS_WIN) {
     // Register Task Scheduler entries and kick the server task now.
     const r = installWindowsServices({
@@ -320,25 +317,47 @@ function cmdStart() {
 }
 function cmdStop() {
+  // Stop BOTH services: the server AND the auth-refresh task. Earlier
+  // releases only stopped the server, leaving the 4-hourly auth-refresh
+  // cron free to fire mid-update and grab file handles in node_modules
+  // — that was the root cause of the v0.6/v0.7 EBUSY churn on Windows.
+  // We tolerate "not running" failures on both since the user just
+  // wants the end state of "nothing mobygate is running."
   if (IS_MAC) {
-    const p = plistPathForLabel(SERVER_LABEL);
-    launchctlUnload(p);
+    const serverPlist = plistPathForLabel(SERVER_LABEL);
+    const authPlist   = plistPathForLabel(AUTH_LABEL);
+    launchctlUnload(serverPlist);
+    launchctlUnload(authPlist);
     ok(`Unloaded ${SERVER_LABEL}`);
+    ok(`Unloaded ${AUTH_LABEL}`);
   } else if (IS_WIN) {
-    const r = stopWindowsTask(WIN_LABELS.server);
-    if (!r.ok) return die(`Failed to stop ${WIN_LABELS.server}: ${r.stderr || 'unknown'}`);
-    ok(`Stopped ${WIN_LABELS.server}`);
+    const rServer = stopWindowsTask(WIN_LABELS.server);
+    const rAuth   = stopWindowsTask(WIN_LABELS.auth);
+    if (rServer.ok) ok(`Stopped ${WIN_LABELS.server}`);
+    else warn(`${WIN_LABELS.server}: ${rServer.stderr || 'not running or already stopped'}`);
+    if (rAuth.ok) ok(`Stopped ${WIN_LABELS.auth}`);
+    else warn(`${WIN_LABELS.auth}: ${rAuth.stderr || 'not running or already stopped'}`);
   } else if (IS_LINUX) {
-    const r = stopLinuxUnit(LINUX_UNITS.server);
-    if (!r.ok) return die(`Failed to stop ${LINUX_UNITS.server}: ${r.stderr || 'unknown'}`);
-    ok(`Stopped ${LINUX_UNITS.server}`);
+    const rServer = stopLinuxUnit(LINUX_UNITS.server);
+    if (rServer.ok) ok(`Stopped ${LINUX_UNITS.server}`);
+    else warn(`${LINUX_UNITS.server}: ${rServer.stderr || 'not running or already stopped'}`);
+    if (LINUX_UNITS.timer) {
+      const rTimer = stopLinuxUnit(LINUX_UNITS.timer);
+      if (rTimer.ok) ok(`Stopped ${LINUX_UNITS.timer}`);
+    }
+    if (LINUX_UNITS.auth) {
+      const rAuth = stopLinuxUnit(LINUX_UNITS.auth);
+      if (rAuth.ok) ok(`Stopped ${LINUX_UNITS.auth}`);
+    }
   } else {
     die('`mobygate stop` not supported on this platform.');
   }
 }
 function cmdRestart() {
-  cmdStop();
+  // Tolerate cmdStop failure (target may already be stopped). Only die
+  // on cmdStart errors, which are the actually-blocking ones.
+  try { cmdStop(); } catch {}
   cmdStart();
 }
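
The tolerant stop behaviour in `cmdStop` above (attempt every service, report failures, never abort) can be sketched in isolation. `stopAll` and the fake `stopFn` are hypothetical names for illustration; the only assumption carried over from the diff is that a stop call returns an object with `ok` and an optional `stderr`.

```javascript
// Hypothetical helper: try to stop every label, never throw, and
// report per-label outcomes so the caller can print ok/warn lines.
function stopAll(labels, stopFn) {
  const results = [];
  for (const label of labels.filter(Boolean)) {
    const r = stopFn(label);
    results.push({
      label,
      ok: Boolean(r.ok),
      note: r.ok ? 'stopped' : (r.stderr || 'not running or already stopped'),
    });
  }
  return results;
}

// Fake stop function standing in for stopWindowsTask / stopLinuxUnit.
const fakeStop = (label) =>
  label === 'server' ? { ok: true } : { ok: false, stderr: '' };

const outcome = stopAll(['server', undefined, 'auth'], fakeStop);
console.log(outcome);
```

Skipping falsy labels mirrors the `if (LINUX_UNITS.timer)` guards in the hunk: an unset unit name is simply not attempted.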
@@ -581,23 +600,23 @@ async function cmdUpdate() {
   }
   print('');
-  // ---- Stop the service FIRST on Windows, otherwise running Node holds
-  // open file handles inside the install dir and `npm install -g` fails
-  // with EBUSY when it tries to rename the directory. On macOS/Linux we
-  // can replace open files freely, but stopping early there too is harmless
-  // and gives a cleaner restart sequence — so we do it everywhere.
-  let stoppedForUpdate = false;
+  // ---- Stop BOTH services first (server + auth-refresh). The auth task
+  // imports mobygate code from the same node_modules dir, so if it fires
+  // mid-install on Windows it grabs file handles and triggers EBUSY.
+  // POSIX systems can replace open files freely, but stopping early there
+  // too is harmless and gives a cleaner restart sequence — so we do it
+  // everywhere. Tolerate "already stopped" failures silently.
+  info('Stopping services so npm install can replace files...');
   if (IS_WIN) {
-    info('Stopping service so npm install can replace files...');
     stopWindowsTask(WIN_LABELS.server);
-    stoppedForUpdate = true;
+    stopWindowsTask(WIN_LABELS.auth);
   } else if (IS_MAC) {
-    const p = plistPathForLabel(SERVER_LABEL);
-    launchctlUnload(p);
-    stoppedForUpdate = true;
+    launchctlUnload(plistPathForLabel(SERVER_LABEL));
+    launchctlUnload(plistPathForLabel(AUTH_LABEL));
   } else if (IS_LINUX) {
     stopLinuxUnit(LINUX_UNITS.server);
-    stoppedForUpdate = true;
+    if (LINUX_UNITS.timer) stopLinuxUnit(LINUX_UNITS.timer);
+    if (LINUX_UNITS.auth)  stopLinuxUnit(LINUX_UNITS.auth);
   }
   // ---- Perform the upgrade
@@ -618,19 +637,23 @@ async function cmdUpdate() {
     return die(`Install mode is "${mode}" — can't auto-update. Reinstall via npm or git.`);
   }
-  // ---- Bring the service back up on the new code
+  // ---- Bring services back up on the new code (server first, then
+  // auth-refresh — server is the load-bearing one; auth restart is
+  // best-effort since it'll naturally fire on its next interval anyway).
   section('Restart');
-  info('Starting service on the new build...');
+  info('Starting services on the new build...');
   if (IS_MAC) {
-    const p = plistPathForLabel(SERVER_LABEL);
-    launchctlLoad(p);
+    launchctlLoad(plistPathForLabel(SERVER_LABEL));
     ok(`Loaded ${SERVER_LABEL}`);
+    try { launchctlLoad(plistPathForLabel(AUTH_LABEL)); ok(`Loaded ${AUTH_LABEL}`); } catch {}
   } else if (IS_WIN) {
     startWindowsTask(WIN_LABELS.server);
     ok(`Started ${WIN_LABELS.server}`);
+    try { startWindowsTask(WIN_LABELS.auth); ok(`Started ${WIN_LABELS.auth}`); } catch {}
   } else if (IS_LINUX) {
     startLinuxUnit(LINUX_UNITS.server);
     ok(`Started ${LINUX_UNITS.server}`);
+    if (LINUX_UNITS.timer) { try { startLinuxUnit(LINUX_UNITS.timer); ok(`Started ${LINUX_UNITS.timer}`); } catch {} }
   }
   print('');
   info(`Tip: if the install-layout changed (new service file, new paths), run \`mobygate init\` to re-install the service definitions.`);

package/lib/anthropic.js CHANGED

@@ -112,15 +112,26 @@ export function anthropicMessagesToPrompt(body, { resuming = false } = {}) {
     // SDK has full history. Send only the new tail: tool_results from
     // the last user message (if any) plus any fresh user text.
     const last = messages[messages.length - 1];
-    if (!last || last.role !== 'user') return '';
+    if (!last || last.role !== 'user') {
+      return {
+        promptText: '',
+        error: 'Resume mode requires the last message to be from the user. Last message has role "' + (last?.role || 'none') + '".',
+      };
+    }
     const trBlocks = anthropicToolResultsOf(last.content);
     const text = anthropicTextOf(last.content);
+    if (!trBlocks.length && !text) {
+      return {
+        promptText: '',
+        error: 'Resume mode requires the last user message to contain text or tool_result blocks.',
+      };
+    }
     const parts = [];
     if (trBlocks.length) {
       parts.push(`<tool_results>\n${trBlocks.map(formatToolResultBlock).join('\n')}\n</tool_results>`);
     }
     if (text) parts.push(text);
-    return parts.join('\n\n');
+    return { promptText: parts.join('\n\n') };
   }
   // Fresh request: serialize visible history. System prompt at top, then
@@ -154,7 +165,7 @@ export function anthropicMessagesToPrompt(body, { resuming = false } = {}) {
     }
   }
   flushTools();
-  return parts.join('\n').trim();
+  return { promptText: parts.join('\n').trim() };
 }
 /**
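
The structured `{ promptText, error }` return and the way a handler might consume it can be sketched as below. This is a simplified stand-in, not the package's code: `resumeTail` handles only string content, and `toResponse` is a hypothetical handler that returns a plain object instead of writing to an HTTP response.

```javascript
// Simplified stand-in for the resume branch: string content only.
function resumeTail(messages) {
  const last = messages[messages.length - 1];
  if (!last || last.role !== 'user') {
    return {
      promptText: '',
      error: `Resume mode requires the last message to be from the user. Last message has role "${last?.role || 'none'}".`,
    };
  }
  const text = typeof last.content === 'string' ? last.content.trim() : '';
  if (!text) {
    return { promptText: '', error: 'Resume mode requires the last user message to contain text.' };
  }
  return { promptText: text };
}

// Hypothetical handler: surface the error as a 400 instead of
// forwarding an empty or wrong prompt to the SDK.
function toResponse(messages) {
  const { promptText, error } = resumeTail(messages);
  if (error) return { status: 400, body: { error: { message: error } } };
  return { status: 200, body: { promptText } };
}
```

Returning an object in both branches keeps the call site to a single destructuring, which is the shape the server handlers in this diff rely on.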

package/lib/platform.js CHANGED

@@ -110,6 +110,80 @@ export function writeMacServerPlist({ installPath, nodeBin, port, logsDir }) {
   return plistPath;
 }
+/**
+ * Generate the macOS auth-refresh plist with the user's actual paths
+ * baked in. Earlier we shipped a static plist template and sed-replaced
+ * Farhan's hardcoded paths inside it — anyone who installed without an
+ * EXACT path match (different username, different fnm version, etc.)
+ * ended up with a plist pointing at /Users/farhan/... and the cron
+ * silently failed forever. This generator mirrors writeMacServerPlist
+ * and uses the same nodeBin / installPath / logsDir resolution so the
+ * resulting plist is portable across any user's machine.
+ */
+export function writeMacAuthRefreshPlist({ installPath, nodeBin, logsDir, intervalHours = 4 }) {
+  if (!IS_MAC) throw new Error('writeMacAuthRefreshPlist called on non-macOS');
+  if (!existsSync(LAUNCH_AGENTS_DIR)) mkdirSync(LAUNCH_AGENTS_DIR, { recursive: true });
+  const plistPath = join(LAUNCH_AGENTS_DIR, `${AUTH_LABEL}.plist`);
+  const intervalSec = Math.max(60, parseInt(intervalHours, 10) * 3600);
+  const pathChain = [
+    dirname(nodeBin),
+    '/usr/local/bin', '/usr/bin', '/bin', '/opt/homebrew/bin',
+    join(homedir(), '.local/bin'),
+  ].join(':');
+  const xml = `<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+<!--
+  Generated by \`mobygate init\` on ${new Date().toISOString()}.
+  Proactive Claude Max OAuth refresh cron.
+    - Runs scripts/auth-refresh.js every ${intervalHours}h via launchd
+    - Anthropic OAuth tokens last ~8h, so ${intervalHours}h cadence keeps
+      us inside the valid window even if one run fails
+  Install:   launchctl load ~/Library/LaunchAgents/${AUTH_LABEL}.plist
+  Uninstall: launchctl unload ~/Library/LaunchAgents/${AUTH_LABEL}.plist
+-->
+<plist version="1.0">
+<dict>
+    <key>Label</key>
+    <string>${AUTH_LABEL}</string>
+    <key>ProgramArguments</key>
+    <array>
+        <string>${nodeBin}</string>
+        <string>scripts/auth-refresh.js</string>
+    </array>
+    <key>WorkingDirectory</key>
+    <string>${installPath}</string>
+    <key>EnvironmentVariables</key>
+    <dict>
+        <key>PATH</key>
+        <string>${pathChain}</string>
+        <key>HOME</key>
+        <string>${homedir()}</string>
+    </dict>
+    <key>StartInterval</key>
+    <integer>${intervalSec}</integer>
+    <key>RunAtLoad</key>
+    <true/>
+    <key>StandardOutPath</key>
+    <string>${logsDir}/auth-refresh.log</string>
+    <key>StandardErrorPath</key>
+    <string>${logsDir}/auth-refresh.err.log</string>
+    <key>KeepAlive</key>
+    <false/>
+</dict>
+</plist>
+`;
+  writeFileSync(plistPath, xml);
+  return plistPath;
+}
 /**
  * Install (copy + load) a plist. Returns {installed: true, path}.
  * Safe to call when already loaded — we unload first.
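
One detail of the `StartInterval` computation above is worth illustrating: `Math.max(60, parseInt(x, 10) * 3600)` evaluates to `NaN` when `x` is not numeric, because `Math.max` returns `NaN` if any argument is `NaN`. The sketch below shows one possible guard. The helper name `intervalSeconds` and the fallback of 4 hours (taken from the default parameter in the diff) are assumptions, not the package's behaviour.

```javascript
// Hypothetical guard around the interval math; not part of the package.
function intervalSeconds(intervalHours, fallbackHours = 4) {
  const hours = Number.parseInt(intervalHours, 10);
  // Fall back when the value is missing, non-numeric, zero, or negative.
  const safeHours = Number.isFinite(hours) && hours > 0 ? hours : fallbackHours;
  return Math.max(60, safeHours * 3600);
}

console.log(intervalSeconds(4));      // 14400
console.log(intervalSeconds('6'));    // 21600
console.log(intervalSeconds('abc'));  // 14400 (fallback)

// The unguarded expression for comparison:
console.log(Math.max(60, parseInt('abc', 10) * 3600)); // NaN
```

Whether a non-numeric value can actually reach this function depends on how the config is validated upstream, which this diff does not show.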

package/lib/tool-bridge.js CHANGED

@@ -218,6 +218,50 @@ export function hasToolUse(assistantMessage) {
 // Tool results (OpenAI tool messages → Anthropic tool_result content blocks)
 // ---------------------------------------------------------------------------
+// ---------------------------------------------------------------------------
+// Strict-tool guidance (system-prompt append for tool-enabled requests)
+// ---------------------------------------------------------------------------
+// Even with native MCP registration + a tight `allowedTools` allowlist, the
+// model retains strong priors for Claude Code's built-in tools (Bash, Read,
+// Edit, Grep, Glob, WebFetch, ToolSearch, etc.) from training. When a task
+// seems to need one of those, the model reaches for it, gets blocked by
+// `allowedTools`, says "I can't use the tool 'grep' here because it isn't
+// available," and gives up — instead of falling back to the available
+// client-defined tools. We saw this in production OpenClaw use.
+//
+// The fix: append a short, explicit guidance block to Claude Code's system
+// prompt (via `systemPrompt: { type: 'preset', preset: 'claude_code', append: ... }`)
+// telling the model exactly which tools are available and that built-ins
+// are NOT in this environment. The positive list reinforces what the model
+// already sees via MCP registration; the negative list shuts down the
+// trained-in instinct to reach for built-ins.
+//
+// Calibration matters: too directive and the model becomes over-conservative
+// and refuses legitimate work. We aim for matter-of-fact "here's the
+// environment, work within it" rather than threatening prohibition.
+const KNOWN_BUILTINS = 'Bash, Read, Edit, Write, Grep, Glob, NotebookEdit, WebFetch, WebSearch, Task, ToolSearch';
+export function buildToolUsageGuidance(openaiTools) {
+  if (!Array.isArray(openaiTools) || openaiTools.length === 0) return null;
+  const names = [];
+  for (const t of openaiTools) {
+    if (t?.type !== 'function' || !t.function?.name) continue;
+    names.push(t.function.name);
+  }
+  if (names.length === 0) return null;
+  return [
+    'Tool environment: this session is running through a proxy that exposes only the client-defined tools listed below. Claude Code\'s default built-in tools',
+    `(${KNOWN_BUILTINS}, etc.) are NOT available in this environment and cannot be invoked — calls to them will fail.`,
+    '',
+    'Available tools:',
+    ...names.map((n) => `  - ${n}`),
+    '',
+    'If a task seems to require a built-in tool that isn\'t in this list, accomplish what you can with the available tools and briefly note what\'s missing — do not refuse silently or claim you have no tools.',
+  ].join('\n');
+}
 /**
  * Format OpenAI role:'tool' messages as a single user-readable text
  * block to splice into a resumed prompt.
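
The filtering step inside `buildToolUsageGuidance` (keep only `type: 'function'` entries that carry a name) can be looked at on its own. `toolNames` below is a hypothetical helper that isolates that loop for illustration; the sample tool array is made up.

```javascript
// Hypothetical helper isolating the name filter from buildToolUsageGuidance.
function toolNames(openaiTools) {
  if (!Array.isArray(openaiTools)) return [];
  return openaiTools
    .filter((t) => t?.type === 'function' && t.function?.name)
    .map((t) => t.function.name);
}

const names = toolNames([
  { type: 'function', function: { name: 'searchFiles' } },
  { type: 'function', function: {} },   // no name: skipped
  { type: 'retrieval' },                // not a function tool: skipped
  { type: 'function', function: { name: 'terminal' } },
]);

// Rendered the same way the guidance block lists available tools.
console.log(names.map((n) => `  - ${n}`).join('\n'));
```

Because malformed entries are dropped rather than rejected, a request whose tools are all malformed produces an empty list, which corresponds to the `return null` path in the function above.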

package/lib/updater.js CHANGED

@@ -26,6 +26,12 @@ import { readFileSync, writeFileSync, existsSync, mkdirSync, openSync } from 'fs
 import { join, sep, dirname } from 'path';
 import { fileURLToPath } from 'url';
 import { LOGS_DIR } from './config.js';
+// Single-source service labels from platform.js — earlier we duplicated
+// these constants here and they drifted (WIN_SERVER_TASK was 'ai.mobygate.server'
+// while platform.js registered 'mobygate-server'), so the dashboard's
+// "Update now" silently no-op'd on Windows because the schtasks /End in the
+// update chain failed and short-circuited the rest via &&.
+import { WIN_LABELS, LINUX_UNITS } from './platform.js';
 const __filename = fileURLToPath(import.meta.url);
 const REPO_ROOT = dirname(dirname(__filename)); // lib/updater.js → repo root
@@ -35,8 +41,7 @@ const IS_MAC = process.platform === 'darwin';
 const IS_LINUX = process.platform === 'linux';
 const SERVER_LABEL = 'ai.mobygate.server';
-const WIN_SERVER_TASK = 'ai.mobygate.server';
-const LINUX_SERVER_UNIT = 'mobygate-server.service';
+const AUTH_LABEL = 'ai.mobygate.auth-refresh';
 const UPDATE_LOG = join(LOGS_DIR, 'update.log');
 const UPDATE_MARKER = join(LOGS_DIR, 'update.state.json');
@@ -174,14 +179,17 @@ function writeUpdateState(patch) {
 function buildUpdateCommand({ mode, repoRoot, logPath }) {
   if (IS_WIN) {
     // cmd.exe — `>>` for append, `2>&1` to merge. Each step on its own
-    // line so failures short-circuit via `||`.
+    // line so failures short-circuit via `&&`. The auth-refresh task is
+    // also stopped because it's a separate scheduled task that imports
+    // mobygate code; if it fires mid-install it grabs file handles in
+    // node_modules\mobygate and we hit EBUSY just like the server task.
+    // Note: trailing 2>nul on End calls so "task not found" doesn't
+    // short-circuit the chain — the start steps will surface real errors.
     const steps = [];
     steps.push(`echo [mobygate-update] start at %DATE% %TIME%`);
-    // Stop FIRST so npm can replace files without EBUSY. /F forces close
-    // even if the process is mid-request; the SDK session map writes are
-    // synchronous and the SIGTERM handler flushes before exit.
-    steps.push(`echo [mobygate-update] stopping service`);
-    steps.push(`schtasks /End /TN "${WIN_SERVER_TASK}"`);
+    steps.push(`echo [mobygate-update] stopping services`);
+    steps.push(`(schtasks /End /TN "${WIN_LABELS.server}" 2>nul) | rem`);
+    steps.push(`(schtasks /End /TN "${WIN_LABELS.auth}"   2>nul) | rem`);
     if (mode === 'npm') {
       steps.push(`npm install -g mobygate@latest`);
     } else if (mode === 'git') {
@@ -189,22 +197,29 @@ function buildUpdateCommand({ mode, repoRoot, logPath }) {
       steps.push(`git pull --ff-only`);
       steps.push(`npm install`);
     }
-    steps.push(`echo [mobygate-update] restarting service`);
-    steps.push(`schtasks /Run /TN "${WIN_SERVER_TASK}"`);
+    steps.push(`echo [mobygate-update] starting services on new build`);
+    steps.push(`schtasks /Run /TN "${WIN_LABELS.server}"`);
+    steps.push(`(schtasks /Run /TN "${WIN_LABELS.auth}" 2>nul) | rem`);
     steps.push(`echo [mobygate-update] done`);
     // Join with && so any failure stops the chain. Final redirect to log.
     const inner = steps.map((s) => `(${s})`).join(' && ');
     return { shell: 'cmd', cmd: `${inner} >> "${logPath}" 2>&1` };
   }
-  // POSIX: sh -c, bail-on-first-failure via set -e. Stop service first
-  // for the same reason — symmetry, cleaner restart, no harm.
+  // POSIX: sh -c, bail-on-first-failure via set -e. Same dual-task stop
+  // applies — auth-refresh runs on its own launchd plist / systemd timer
+  // and would lock files mid-install if not stopped. `|| true` because
+  // a not-loaded service shouldn't kill the chain.
   const parts = [`set -e`, `echo "[mobygate-update] start $(date)"`];
-  parts.push(`echo "[mobygate-update] stopping service"`);
+  parts.push(`echo "[mobygate-update] stopping services"`);
   if (IS_MAC) {
-    const plist = join(process.env.HOME || '~', 'Library', 'LaunchAgents', `${SERVER_LABEL}.plist`);
-    parts.push(`launchctl unload "${plist}" 2>/dev/null || true`);
+    const serverPlist = join(process.env.HOME || '~', 'Library', 'LaunchAgents', `${SERVER_LABEL}.plist`);
+    const authPlist   = join(process.env.HOME || '~', 'Library', 'LaunchAgents', `${AUTH_LABEL}.plist`);
+    parts.push(`launchctl unload "${serverPlist}" 2>/dev/null || true`);
+    parts.push(`launchctl unload "${authPlist}"   2>/dev/null || true`);
   } else if (IS_LINUX) {
-    parts.push(`systemctl --user stop ${LINUX_SERVER_UNIT} 2>/dev/null || true`);
+    parts.push(`systemctl --user stop ${LINUX_UNITS.server} 2>/dev/null || true`);
+    if (LINUX_UNITS.timer) parts.push(`systemctl --user stop ${LINUX_UNITS.timer} 2>/dev/null || true`);
+    if (LINUX_UNITS.auth)  parts.push(`systemctl --user stop ${LINUX_UNITS.auth}  2>/dev/null || true`);
   }
   if (mode === 'npm') {
     parts.push(`npm install -g mobygate@latest`);
@@ -213,12 +228,15 @@ function buildUpdateCommand({ mode, repoRoot, logPath }) {
     parts.push(`git pull --ff-only`);
     parts.push(`npm install`);
   }
-  parts.push(`echo "[mobygate-update] starting service on new build"`);
+  parts.push(`echo "[mobygate-update] starting services on new build"`);
   if (IS_MAC) {
-    const plist = join(process.env.HOME || '~', 'Library', 'LaunchAgents', `${SERVER_LABEL}.plist`);
-    parts.push(`launchctl load "${plist}"`);
+    const serverPlist = join(process.env.HOME || '~', 'Library', 'LaunchAgents', `${SERVER_LABEL}.plist`);
+    const authPlist   = join(process.env.HOME || '~', 'Library', 'LaunchAgents', `${AUTH_LABEL}.plist`);
+    parts.push(`launchctl load "${serverPlist}"`);
+    parts.push(`launchctl load "${authPlist}" 2>/dev/null || true`);
   } else if (IS_LINUX) {
-    parts.push(`systemctl --user start ${LINUX_SERVER_UNIT}`);
+    parts.push(`systemctl --user start ${LINUX_UNITS.server}`);
+    if (LINUX_UNITS.timer) parts.push(`systemctl --user start ${LINUX_UNITS.timer} 2>/dev/null || true`);
   }
   parts.push(`echo "[mobygate-update] done"`);
   const script = parts.join('\n');
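
The POSIX half of `buildUpdateCommand` combines `set -e` (abort on the first failing command) with `|| true` on steps that are allowed to fail. A minimal sketch of that pattern follows; `buildPosixScript` and the `example.service` unit name are placeholders, not names from the package.

```javascript
// Hypothetical builder: stop steps are tolerant, install and start
// steps are strict, so a failed install aborts before any restart.
function buildPosixScript({ stop, install, start }) {
  const tolerant = (cmd) => `${cmd} 2>/dev/null || true`;
  return ['set -e', ...stop.map(tolerant), ...install, ...start].join('\n');
}

const script = buildPosixScript({
  stop: ['systemctl --user stop example.service'],
  install: ['npm install -g mobygate@latest'],
  start: ['systemctl --user start example.service'],
});
console.log(script);
```

Under `set -e` a command list of the form `cmd || true` always succeeds, so a service that is not loaded cannot end the script, while a failing `npm install` still does.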

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mobygate",
-  "version": "0.7.1",
+  "version": "0.7.3",
   "description": "OpenAI-compatible local proxy for Claude Max. The Möbius-strip gateway: OpenAI shape in, Claude Max out.",
   "type": "module",
   "main": "server.js",

package/server.js CHANGED

@@ -55,6 +55,7 @@ import { loadSessions, saveSessions, flushSessionsNow } from './lib/session-stor
 import { LOGS_DIR } from './lib/config.js';
 import {
   buildClientToolsServer,
+  buildToolUsageGuidance,
   extractToolUses,
   hasToolUse,
   toolMessagesToText,
@@ -285,12 +286,20 @@ function messagesToPrompt(messages, { resuming = false } = {}) {
       }
     }
     const toolResultsText = toolMessagesToText(trailingToolMessages);
+    if (!userText && !toolResultsText) {
+      // Earlier code fell back to extracting whatever was at messages[-1],
+      // which on an assistant-terminated history sent the assistant's own
+      // previous reply back to the SDK as the new user prompt — and the
+      // model would "respond to its own reply." Catch this clearly instead.
+      return {
+        promptText: '',
+        error: 'Resume mode requires the request to end with a user message or tool result. Last message has role "' + (messages[messages.length - 1]?.role || 'unknown') + '".',
+      };
+    }
     const parts = [];
     if (toolResultsText) parts.push(toolResultsText);
     if (userText) parts.push(userText);
-    return {
-      promptText: parts.join('\n\n') || extractContent(messages[messages.length - 1]?.content || ''),
-    };
+    return { promptText: parts.join('\n\n') };
   }
   // Fresh request: serialize visible history as XML-wrapped text. No
@@ -395,13 +404,30 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
   const existing = getSession(sessionKey);
   const resuming = !!existing?.sdkSessionId;
   const toolsEnabled = hasTools(body);
-  const { promptText } = messagesToPrompt(body.messages, { resuming });
+  const { promptText, error: promptError } = messagesToPrompt(body.messages, { resuming });
+  if (promptError) {
+    return res.status(400).json({
+      error: { message: promptError, type: 'invalid_request_error', code: 'invalid_resume_messages' },
+    });
+  }
   const images = collectImages(body.messages);
-  const prompt = buildQueryPrompt(promptText, images);
+  // NOTE: `prompt` is built inside runQuery (not here) when images are
+  // present, because buildQueryPrompt returns a single-use async iterator
+  // for multimodal requests. If we built it here and the SDK call hit a
+  // 401, runWithAuthRetry would invoke runQuery a second time with the
+  // same exhausted iterator → SDK gets an empty user message → silent
+  // empty response. Lazy construction inside runQuery rebuilds the
+  // iterator per attempt.
   const model = resolveModel(body.model);
   // Build the in-process MCP server exposing client tools to the SDK.
   // null when toolsEnabled is false (or all tools are malformed).
   const clientToolsServer = toolsEnabled ? buildClientToolsServer(body.tools) : null;
+  // System-prompt append: tells the model exactly which tools are
+  // available and that Claude Code's built-ins (Bash, Grep, Read, etc.)
+  // are NOT in this environment. Without this, the model's trained-in
+  // priors lead it to call Grep/Bash, get blocked by allowedTools, and
+  // refuse the task instead of falling back to client tools. ~150 tokens.
+  const toolsGuidance = clientToolsServer ? buildToolUsageGuidance(body.tools) : null;
   if (images.length) console.log(`  [multimodal] ${images.length} image block(s)`);
   if (toolsEnabled) console.log(`  [tools] ${body.tools.length} client tool(s) registered as MCP`);
@@ -443,6 +469,9 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
     resolvedModel = model;
     capturedSessionId = existing?.sdkSessionId || null;
+    // Build the prompt lazily on each attempt — multimodal returns a
+    // single-use async iterator. Keeps 401 auth-retries safe.
+    const prompt = buildQueryPrompt(promptText, images);
     for await (const message of query({
       prompt,
       options: {
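
Why the prompt has to be rebuilt per attempt can be seen with a plain async generator, independent of the SDK. In the sketch below, `singleUsePrompt` and `drain` are hypothetical stand-ins for `buildQueryPrompt` and the SDK's consumption of the prompt; nothing here calls the real SDK.

```javascript
// Stand-in for a multimodal prompt: an async generator is single-use.
async function* singleUsePrompt(text) {
  yield { role: 'user', content: text };
}

// Stand-in for the consumer: read every item the iterable yields.
async function drain(iterable) {
  const out = [];
  for await (const item of iterable) out.push(item);
  return out;
}

async function demo() {
  // Eager: one generator shared by the first attempt and the retry.
  const shared = singleUsePrompt('describe this image');
  const eagerFirst = await drain(shared);
  const eagerRetry = await drain(shared); // generator already finished

  // Lazy: each attempt builds its own generator.
  const build = () => singleUsePrompt('describe this image');
  const lazyFirst = await drain(build());
  const lazyRetry = await drain(build());

  return {
    eagerFirst: eagerFirst.length, // 1
    eagerRetry: eagerRetry.length, // 0
    lazyFirst: lazyFirst.length,   // 1
    lazyRetry: lazyRetry.length,   // 1
  };
}

demo().then((counts) => console.log(counts));
```

The eager retry sees zero messages because a finished generator keeps reporting `done: true`, which matches the empty user message described in the comments above.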
@@ -458,6 +487,7 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
           ? {
               mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
               allowedTools: [`${MCP_TOOL_PREFIX}*`],
+              systemPrompt: { type: 'preset', preset: 'claude_code', append: toolsGuidance },
             }
           : toolsEnabled
             // Tools were requested but none were valid — disable all tools.
@@ -615,11 +645,23 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
   const existing = getSession(sessionKey);
   const resuming = !!existing?.sdkSessionId;
   const toolsEnabled = hasTools(body);
-  const { promptText } = messagesToPrompt(body.messages, { resuming });
+  const { promptText, error: promptError } = messagesToPrompt(body.messages, { resuming });
+  if (promptError) {
+    return res.status(400).json({
+      error: { message: promptError, type: 'invalid_request_error', code: 'invalid_resume_messages' },
+    });
+  }
   const images = collectImages(body.messages);
-  const prompt = buildQueryPrompt(promptText, images);
+  // NOTE: `prompt` is built inside runQuery (not here) when images are
+  // present, because buildQueryPrompt returns a single-use async iterator
+  // for multimodal requests. If we built it here and the SDK call hit a
+  // 401, runWithAuthRetry would invoke runQuery a second time with the
+  // same exhausted iterator → SDK gets an empty user message → silent
+  // empty response. Lazy construction inside runQuery rebuilds the
+  // iterator per attempt.
   const model = resolveModel(body.model);
   const clientToolsServer = toolsEnabled ? buildClientToolsServer(body.tools) : null;
+  const toolsGuidance = clientToolsServer ? buildToolUsageGuidance(body.tools) : null;
   if (images.length) console.log(`  [multimodal] ${images.length} image block(s)`);
   if (toolsEnabled) console.log(`  [tools] ${body.tools.length} client tool(s) registered as MCP`);
@@ -644,6 +686,9 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
     outputTokens = 0;
     capturedSessionId = existing?.sdkSessionId || null;
+    // Build the prompt lazily on each attempt — multimodal returns a
+    // single-use async iterator. Keeps 401 auth-retries safe.
+    const prompt = buildQueryPrompt(promptText, images);
     for await (const message of query({
       prompt,
       options: {
@@ -656,6 +701,7 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
           ? {
               mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
               allowedTools: [`${MCP_TOOL_PREFIX}*`],
+              systemPrompt: { type: 'preset', preset: 'claude_code', append: toolsGuidance },
             }
           : toolsEnabled
             ? { allowedTools: [] }
@@ -791,9 +837,17 @@ async function handleAnthropicNonStreaming(res, body, requestId, sessionKey) {
   const existing = getSession(sessionKey);
   const resuming = !!existing?.sdkSessionId;
   const toolsEnabled = hasAnthropicTools(body);
-  const promptText = anthropicMessagesToPrompt(body, { resuming });
+  const { promptText, error: promptError } = anthropicMessagesToPrompt(body, { resuming });
+  if (promptError) {
+    return res.status(400).json({
+      type: 'error',
+      error: { type: 'invalid_request_error', message: promptError },
+    });
+  }
   const images = collectAnthropicImages(body.messages || []);
-  const prompt = buildQueryPrompt(promptText, images);
+  // See note in handleStreaming — `prompt` is built lazily inside runQuery
+  // because the multimodal path returns a single-use async iterator that
+  // the first attempt exhausts, leaving a 401 retry with nothing to read.
   const model = resolveModel(body.model);
   // Translate Anthropic tool defs → OpenAI shape that buildClientToolsServer
   // expects. Both go through the same JSON-Schema → Zod path on the way to
@@ -806,6 +860,7 @@ async function handleAnthropicNonStreaming(res, body, requestId, sessionKey) {
       }))
     : null;
   const clientToolsServer = toolsForBridge ? buildClientToolsServer(toolsForBridge) : null;
+  const toolsGuidance = clientToolsServer ? buildToolUsageGuidance(toolsForBridge) : null;
   if (images.length) console.log(`  [multimodal] ${images.length} image block(s)`);
   if (toolsEnabled) console.log(`  [tools] ${body.tools.length} client tool(s) registered as MCP`);
@@ -832,6 +887,9 @@ async function handleAnthropicNonStreaming(res, body, requestId, sessionKey) {
     capturedSessionId = existing?.sdkSessionId || null;
     stopReason = 'end_turn';
+    // Build the prompt lazily on each attempt — multimodal returns a
+    // single-use async iterator. Keeps 401 auth-retries safe.
+    const prompt = buildQueryPrompt(promptText, images);
     for await (const message of query({
       prompt,
       options: {
@@ -844,6 +902,7 @@ async function handleAnthropicNonStreaming(res, body, requestId, sessionKey) {
           ? {
               mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
               allowedTools: [`${MCP_TOOL_PREFIX}*`],
+              systemPrompt: { type: 'preset', preset: 'claude_code', append: toolsGuidance },
             }
           : toolsEnabled
             ? { allowedTools: [] }
@@ -937,9 +996,17 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
   const existing = getSession(sessionKey);
   const resuming = !!existing?.sdkSessionId;
   const toolsEnabled = hasAnthropicTools(body);
-  const promptText = anthropicMessagesToPrompt(body, { resuming });
+  const { promptText, error: promptError } = anthropicMessagesToPrompt(body, { resuming });
+  if (promptError) {
+    return res.status(400).json({
+      type: 'error',
+      error: { type: 'invalid_request_error', message: promptError },
+    });
+  }
   const images = collectAnthropicImages(body.messages || []);
-  const prompt = buildQueryPrompt(promptText, images);
+  // See note in handleStreaming — `prompt` is built lazily inside runQuery
+  // because the multimodal path returns a single-use async iterator that
+  // the first attempt exhausts, leaving a 401 retry with nothing to read.
   const model = resolveModel(body.model);
   const toolsForBridge = toolsEnabled
     ? body.tools.map((t) => ({
@@ -948,6 +1015,7 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
       }))
     : null;
   const clientToolsServer = toolsForBridge ? buildClientToolsServer(toolsForBridge) : null;
+  const toolsGuidance = clientToolsServer ? buildToolUsageGuidance(toolsForBridge) : null;
   if (images.length) console.log(`  [multimodal] ${images.length} image block(s)`);
   if (toolsEnabled) console.log(`  [tools] ${body.tools.length} client tool(s) registered as MCP`);
@@ -992,6 +1060,9 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
     textEmittedSoFar = '';
     toolUseEmitted = false;
+    // Build the prompt lazily on each attempt — multimodal returns a
+    // single-use async iterator. Keeps 401 auth-retries safe.
+    const prompt = buildQueryPrompt(promptText, images);
     for await (const message of query({
       prompt,
       options: {
@@ -1004,6 +1075,7 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
           ? {
               mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
               allowedTools: [`${MCP_TOOL_PREFIX}*`],
+              systemPrompt: { type: 'preset', preset: 'claude_code', append: toolsGuidance },
             }
           : toolsEnabled
             ? { allowedTools: [] }
@@ -1151,6 +1223,62 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
 const app = express();
 app.use(express.json({ limit: '10mb' }));
+// ---------------------------------------------------------------------------
+// Same-origin gate for control-plane endpoints
+// ---------------------------------------------------------------------------
+// The proxy endpoints (/v1/chat/completions, /v1/messages, /v1/models,
+// /health) are intentionally open: clients from other localhost processes
+// (Hermes, OpenClaw, etc.) need to hit them. But the *control-plane*
+// endpoints — anything that triggers privileged actions (npm install,
+// auth refresh, session deletion) or exposes sensitive data (server log
+// containing prompt text, live event metadata) — must NOT be reachable
+// from a browser tab on a malicious site (DNS-rebinding) or a LAN
+// attacker (when bind: 0.0.0.0).
+//
+// Defense:
+//   - Host header must name localhost. DNS rebinding makes the
+//     network connect to 127.0.0.1, but the browser still sends the
+//     attacker's hostname in the Host header — block it.
+//   - If Origin is present (browsers always send it on POST), the
+//     hostname must also be local. Catches cross-origin fetches.
+//   - Local callers (curl, the dashboard's own same-origin JS,
+//     programmatic callers) pass through unchanged.
+//
+// Limitation: this is NOT a substitute for real auth on a LAN-exposed
+// proxy. With bind: 0.0.0.0, anyone on the LAN can still hit endpoints
+// directly with a faked Host header. For v0.7.3 we accept that and warn
+// in the startup banner; a real `MOBYGATE_TOKEN` for LAN use is a
+// follow-up.
+function isLocalHostname(name) {
+  if (!name) return false;
+  const lower = String(name).toLowerCase();
+  // Strip optional brackets (IPv6) and port suffix.
+  const stripped = lower.replace(/^\[|\]$/g, '').replace(/:[0-9]+$/, '');
+  return stripped === '127.0.0.1' || stripped === 'localhost' || stripped === '::1';
+}
+function requireLocalOrigin(req, res, next) {
+  if (!isLocalHostname(req.headers.host)) {
+    return res.status(403).json({
+      error: { type: 'forbidden', message: 'Host header is not localhost. Mobygate refuses non-local origins on control-plane endpoints (DNS-rebinding protection).' },
+    });
+  }
+  const origin = req.headers.origin;
+  if (origin) {
+    try {
+      if (!isLocalHostname(new URL(origin).hostname)) {
+        return res.status(403).json({
+          error: { type: 'forbidden', message: 'Origin header is not localhost. Cross-origin fetch refused on control-plane endpoint.' },
+        });
+      }
+    } catch {
+      return res.status(403).json({ error: { type: 'forbidden', message: 'Invalid Origin header.' } });
+    }
+  }
+  next();
+}
 // GET / — serve dashboard. No-cache headers so browsers always re-fetch
 // after a mobygate upgrade; otherwise they keep serving the old index.html
 // from cache and users see a stale dashboard long after the service updated.
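Reviewer note: tracing the two `replace` calls in `isLocalHostname` by hand suggests that IPv6 loopback values may not pass the gate. `[::1]:3456` keeps its closing bracket because `\]$` only matches at the very end of the string, leaving `::1]`. A bare `::1` or a bracketed `[::1]` (the form `new URL(...).hostname` returns) loses its trailing `:1` to the port-stripping regex, leaving `:`. The sketch below shows one way the parsing could be split so that brackets and ports are handled together. `parseHostname` and `isLocalHostnameSketch` are hypothetical names and not part of mobygate.

```javascript
// Hypothetical helper: extract the bare hostname from a Host header value
// or a URL hostname. Returns null for empty input.
function parseHostname(value) {
  if (!value) return null;
  const lower = String(value).trim().toLowerCase();
  // Bracketed IPv6 literal with an optional port, e.g. "[::1]:3456".
  const bracketed = lower.match(/^\[([^\]]+)\](?::\d+)?$/);
  if (bracketed) return bracketed[1];
  // Bare IPv6 literal (more than one colon): a port cannot be attached
  // without brackets, so nothing is stripped.
  if ((lower.match(/:/g) || []).length > 1) return lower;
  // Hostname or IPv4 address with an optional port.
  return lower.replace(/:\d+$/, '');
}

// Hypothetical variant of the localhost check built on parseHostname.
function isLocalHostnameSketch(value) {
  const host = parseHostname(value);
  return host === '127.0.0.1' || host === 'localhost' || host === '::1';
}
```

Under these assumptions `isLocalHostnameSketch('[::1]:3456')` and `isLocalHostnameSketch('localhost:3456')` both return true, while `isLocalHostnameSketch('evil.example:3456')` returns false. Like the original, this only compares names and is not a substitute for authentication on a LAN-exposed bind.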
@@ -1374,7 +1502,7 @@ app.get('/sessions/:key', (req, res) => {
 });
 // DELETE /sessions/:key — clear a session
-app.delete('/sessions/:key', (req, res) => {
+app.delete('/sessions/:key', requireLocalOrigin, (req, res) => {
   const existed = sessions.delete(req.params.key);
   if (existed) {
     dashboardBus.emitEvent({ type: 'session.expired', key: req.params.key, reason: 'manual' });
@@ -1384,7 +1512,7 @@ app.delete('/sessions/:key', (req, res) => {
 });
 // DELETE /sessions — clear all sessions
-app.delete('/sessions', (_req, res) => {
+app.delete('/sessions', requireLocalOrigin, (_req, res) => {
   const keys = [...sessions.keys()];
   const count = sessions.size;
   sessions.clear();
@@ -1425,7 +1553,7 @@ app.get('/auth/status', async (req, res) => {
 // POST /auth/refresh
 // Fires the refresh probe. Intended for use by cron / launchd.
-app.post('/auth/refresh', async (_req, res) => {
+app.post('/auth/refresh', requireLocalOrigin, async (_req, res) => {
   const probe = await forceRefresh();
   dashboardBus.emitEvent({ type: 'auth.refresh', ok: probe.ok, durationMs: probe.durationMs, error: probe.error });
   res.status(probe.ok ? 200 : 502).json({
@@ -1439,7 +1567,7 @@ app.post('/auth/refresh', async (_req, res) => {
 // ---------------------------------------------------------------------------
 // GET /events — SSE stream of dashboard events
-app.get('/events', (req, res) => {
+app.get('/events', requireLocalOrigin, (req, res) => {
   res.setHeader('Content-Type', 'text/event-stream');
   res.setHeader('Cache-Control', 'no-cache, no-transform');
   res.setHeader('Connection', 'keep-alive');
@@ -1514,7 +1642,7 @@ app.get('/dashboard/sessions', (_req, res) => {
 });
 // GET /dashboard/logs — tail the server log file
-app.get('/dashboard/logs', async (req, res) => {
+app.get('/dashboard/logs', requireLocalOrigin, async (req, res) => {
   const lines = Math.min(2000, parseInt(req.query.lines || '200', 10));
   const logPath = join(LOGS_DIR, 'server.log');
   try {
@@ -1553,7 +1681,7 @@ app.get('/update/check', async (req, res) => {
 // `npm install -g mobygate@latest` (or `git pull && npm install`), then
 // restarts the service — which kills us. The dashboard polls
 // /update/status to show progress and reconnects once the new server is up.
-app.post('/update/apply', (_req, res) => {
+app.post('/update/apply', requireLocalOrigin, (_req, res) => {
   try {
     const result = applyUpdate({});
     const status = result.started ? 202 : 409;