npm - mobygate - Versions diffs - 0.8.1 → 0.8.3 - Mend

mobygate 0.8.1 → 0.8.3

Files changed (6)

package/CHANGELOG.md +142 -0
package/lib/anthropic.js +9 -3
package/lib/connectors/openclaw.js +72 -12
package/lib/session-derive.js +20 -1
package/package.json +1 -1
package/server.js +20 -4

package/CHANGELOG.md CHANGED

@@ -4,6 +4,148 @@ All notable changes to mobygate are documented here. Format loosely follows
 [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
 [Semantic Versioning](https://semver.org/).
+## [0.8.3] — 2026-04-28
+OpenClaw .26 compatibility, sonnet 1M context, and observability for
+silent model swaps. Found by upgrading mobygate to v0.8.1 on Windows
+during heavy multi-agent testing.
+### Fixed
+- **OpenClaw connector wrote deprecated `models.main` / `models.default`
+  keys**, which OpenClaw `2026.4.26+` outright rejects with `Unrecognized
+  keys: "main", "default"`. The gateway refused to start until
+  `openclaw doctor --fix` cleaned them up.
+  v0.8.3 connector now writes the modern path:
+  - `agents.defaults.model.primary` (the active default)
+  - `agents.defaults.models` (registered model id map)
+  Plus on every `plan()` it removes the deprecated top-level keys if a
+  previous mobygate version (v0.8.0–v0.8.2) wrote them. Re-running
+  `mobygate connect openclaw` after upgrading will repair any broken
+  config in place.
+  Why this didn't bite Mac users: hand-rolled configs that never went
+  through `mobygate connect`'s apply path were never written by the
+  bad code. It surfaced on Windows when `mobygate init` auto-wired
+  detected clients during a fresh upgrade.
+- **Sonnet 4.6 silently capped at 200k context.** v0.8.2 routed
+  `claude-sonnet-4-6` through directly but didn't append the `[1m]`
+  suffix the way opus 4.7 does, so the SDK ran sonnet at the default
+  200k window. Confirmed via `modelUsage.contextWindow=200000` in the
+  diagnostic logging.
+  Fix: `claude-sonnet-4-6` now maps to `claude-sonnet-4-6[1m]`. Verified
+  next-turn `modelUsage` shows `contextWindow: 1000000`. Matches the
+  capability we already advertise in `/v1/models`.
+  Side note: `claude-sonnet-4-6-200k` added as an explicit alias for
+  callers that want the cheaper 200k variant.
+### Added
+- **`[model-billed]` diagnostic log line.** Every successful (non-tool-
+  use-aborted) request now logs the SDK's `modelUsage` map showing
+  what model Anthropic actually billed against. Output looks like:
+  ```
+  [model-billed] requested=claude-sonnet-4-6[1m]
+  modelUsage={"claude-sonnet-4-6[1m]":{"inputTokens":3,"costUSD":0.029,
+              "contextWindow":1000000,...}}
+  ```
+  This was originally added as a temporary diagnostic to chase whether
+  Anthropic was silently swapping sonnet → opus. The data confirmed
+  no swap is happening — but the line is too useful to remove. Surfaces
+  cost-per-turn, actual context window, and any future silent model
+  changes the SDK might introduce. Low log volume (one line per
+  result message, ~50% of requests since tool_use turns abort
+  before result).
+  Bonus finding from this log: the SDK transparently uses **two models
+  per opus turn** — a haiku for fast intermediate work and opus for
+  the final response. Adds ~$0.024 per opus turn beyond what the
+  capture summary shows. Worth knowing for cost analysis.
+### Notes
+The "claude.ai/settings/usage Sonnet only stuck at 0%" mystery turned
+out to be a Claude Max plan accounting design: the "Sonnet only" bar
+is overflow that only ticks once "All models" is exhausted. On Max
+plans, sonnet usage rolls into the "All models" bar (which was
+climbing as expected). Not a mobygate bug — investigation surfaced
+the v0.8.2 [1m] context omission, which IS a mobygate bug, so net
+positive on the chase.
+## [0.8.2] — 2026-04-28
+Multi-agent fixes. Found the day after v0.8.1 shipped, while testing
+three OpenClaw bots (Mobius/Lux/Mercury) in parallel on the same
+machine. Both bugs were invisible without the v0.8.1 inspector.
+### Fixed
+- **Session-key collision when multiple agents share a boilerplate
+  prefix in their system prompt.** v0.7.1's auto-derive hashed the
+  first 500 characters of the system prompt; OpenClaw's "You are a
+  personal assistant running inside OpenClaw…" preamble fills more
+  than that, so the per-agent personality content (loaded later from
+  workspace SOUL.md / IDENTITY.md / etc.) didn't reach the hash. Two
+  separate agents (Lux on sonnet-4-6, Mercury on sonnet-4-6) collided
+  onto the same auto-key when given the same first user message
+  ("@Lux @Mercury Hi"). Same key → same SDK session reuse → cache
+  thrash and potential session-state mixing.
+  Bumped `SYSTEM_TRIM` from 500 → 20000 chars. Verified against real
+  captured request bodies that collided in v0.8.1 — they now hash to
+  distinct keys (`auto_b0371e5c…` vs `auto_2b90afd7…`).
+  SHA-256 cost on 20kB is ~10-20µs per request, irrelevant in the
+  hot path.
+- **Model map silently downgraded `claude-sonnet-4-6` to retired
+  `claude-sonnet-4-5-20250929`.** When the v0.8.0 model map was
+  written, the Claude Agent SDK didn't recognize the un-dated
+  `claude-sonnet-4-6` alias and we worked around it by routing to the
+  most recent dated 4-5. The SDK has since added native 4-6 support,
+  but mobygate kept the workaround in place. Result: clients (OpenClaw
+  Lux/Mercury) configured for sonnet-4-6 were having their requests
+  rewritten to the retired 4-5-20250929 dated id. Anthropic accepted
+  the call but the response wasn't billing into the user's "Sonnet
+  only" quota — it was showing 0% used despite live traffic. Likely
+  Claude was falling back internally to opus or returning a
+  zero-billed degraded response.
+  Fix: route `claude-sonnet-4-6` through directly. Also updated
+  `claude-sonnet-4` and the `sonnet` shorthand to point at 4-6
+  (current latest) instead of the retired dated 4-5 entry. Explicit
+  `claude-sonnet-4-5` requests still route to the dated id for
+  backward compatibility.
+  Discovery: the inspector showed Lux/Mercury captures all stamped
+  with `model: claude-sonnet-4-6` (correct from the request side) but
+  Anthropic's quota panel reported 0% sonnet usage. The server.log's
+  `model=claude-sonnet-4-6 → claude-sonnet-4-5-20250929` translation
+  line was the smoking gun.
+### Notes
+The proper long-term fix is for clients to pass an explicit
+`X-Session-Id` header per agent (mobygate has supported this since
+v0.7.1 — it always wins over auto-derive). This bump is a defensive
+measure for clients that don't.
+Discovery flow is a nice validation of the v0.8.1 inspector: the
+collision was invisible at the OpenClaw level (each bot's replies
+arrived correctly because OpenClaw maintains its own per-agent SDK
+state) but jumped out as soon as the captures were sorted by session
+key in the inspector — two different model requests with the same
+session-key, with bootstrap text 55kB long but identical first 500
+chars. Without the inspector, this would have surfaced as
+unpredictable cache hit rates and been blamed on Anthropic.
 ## [0.8.1] — 2026-04-27
 Diagnostic visibility release. Adds a request/response capture system,

package/lib/anthropic.js CHANGED

@@ -36,17 +36,23 @@ import { v4 as uuidv4 } from 'uuid';
  * against shape variations (the Claude Agent SDK sometimes nests these
  * under `.usage`, sometimes places them flat on the message). Returns
  * a complete usage shape with cache_read / cache_creation fields zeroed
- * out if absent. Used by the 4 mobygate handlers to populate response
- * captures and dashboard cache-hit metrics.
+ * out if absent, plus `modelUsage` (the SDK's per-model usage breakdown,
+ * keyed by the actual model name Anthropic billed against — useful for
+ * spotting silent model fallbacks where the requested model differs
+ * from what Anthropic actually ran).
+ *
+ * Used by the 4 mobygate handlers to populate response captures and
+ * dashboard cache-hit metrics.
  */
 export function extractSdkUsage(message) {
-  if (!message) return { input_tokens: 0, output_tokens: 0, cache_read_input_tokens: 0, cache_creation_input_tokens: 0 };
+  if (!message) return { input_tokens: 0, output_tokens: 0, cache_read_input_tokens: 0, cache_creation_input_tokens: 0, modelUsage: null };
   const u = message.usage || message;
   return {
     input_tokens: u.input_tokens || 0,
     output_tokens: u.output_tokens || 0,
     cache_read_input_tokens: u.cache_read_input_tokens || 0,
     cache_creation_input_tokens: u.cache_creation_input_tokens || 0,
+    modelUsage: message.modelUsage || null,
   };
 }
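
Reviewer note: the new `modelUsage` field is keyed by the model name that was actually billed, so a caller can compare those keys against the requested id to spot a silent swap. The following is a minimal sketch of that comparison, not code from the package; `findUnexpectedModels` is a hypothetical helper and the sample object only mimics the shape shown in the changelog's example log line.

```javascript
// Hypothetical helper (not part of mobygate): list every billed model id
// that differs from the id the request resolved to.
function findUnexpectedModels(requestedModel, modelUsage) {
  // extractSdkUsage returns `modelUsage: null` when the SDK message has none.
  if (!modelUsage) return [];
  return Object.keys(modelUsage).filter((billed) => billed !== requestedModel);
}

// Sample shaped like the changelog's `[model-billed]` example (values illustrative).
const sampleUsage = {
  'claude-sonnet-4-6[1m]': { inputTokens: 3, costUSD: 0.029, contextWindow: 1000000 },
};

console.log(findUnexpectedModels('claude-sonnet-4-6[1m]', sampleUsage)); // []
console.log(findUnexpectedModels('claude-sonnet-4-6', sampleUsage));     // [ 'claude-sonnet-4-6[1m]' ]
```

Per the changelog, an opus turn may legitimately report a second (haiku) entry, so a non-empty result is a prompt to look closer rather than proof of a swap.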

package/lib/connectors/openclaw.js CHANGED

@@ -169,8 +169,12 @@ export const openclawConnector = {
       configPath: det.configPath,
       mobyProviderExists: !!providers[PROVIDER_NAME_OPENAI],
       mobyNativeProviderExists: !!providers[PROVIDER_NAME_ANTHROPIC],
-      currentMain: det.parsed?.models?.main || null,
-      currentDefault: det.parsed?.models?.default || null,
+      currentPrimary: det.parsed?.agents?.defaults?.model?.primary || null,
+      // Old keys: still surfaced so the user/dashboard can spot if a
+      // pre-v0.8.3 connector run left these around (they're invalid in
+      // OpenClaw .26+). A non-null value here is a "needs reconnect" hint.
+      deprecatedMain: det.parsed?.models?.main || null,
+      deprecatedDefault: det.parsed?.models?.default || null,
       shadowProviders, // pre-v0.8.0 entries pointing at our base URL
     };
   },
@@ -212,14 +216,52 @@ export const openclawConnector = {
           : null;
       if (preferredProvider) {
         const target = `${preferredProvider}/claude-opus-4-7`;
-        after.models.main = target;
-        after.models.default = target;
+        // Modern OpenClaw schema (>=2026.4.x): defaults live under
+        // `agents.defaults.model.primary` and `agents.defaults.models`.
+        // The old `models.main` / `models.default` top-level keys were
+        // valid in earlier OpenClaw versions but `.26+` rejects them
+        // ("Unrecognized keys: main, default"). v0.8.0/0.8.1 connector
+        // wrote the old shape and broke OpenClaw `.26` startup. Fixed
+        // in v0.8.3 by switching to the modern path + cleaning up any
+        // old keys it previously wrote.
+        if (!after.agents)                  after.agents = {};
+        if (!after.agents.defaults)         after.agents.defaults = {};
+        if (!after.agents.defaults.model)   after.agents.defaults.model = {};
+        if (!after.agents.defaults.models)  after.agents.defaults.models = {};
+        after.agents.defaults.model.primary = target;
+        // Register every model we surface in the provider so OpenClaw
+        // sees them in agents-list. Keep existing entries to preserve
+        // user-registered models (ollama, anthropic-direct, etc).
+        for (const m of MODELS_NATIVE_SURFACE) {
+          const id = `${preferredProvider}/${m.id}`;
+          if (!after.agents.defaults.models[id]) {
+            after.agents.defaults.models[id] = {};
+          }
+        }
+        // Cleanup: remove the deprecated top-level keys if a previous
+        // connector run (v0.8.0/0.8.1) wrote them. OpenClaw `.26`
+        // rejects the config outright if these are present.
+        if (after.models?.main !== undefined)    delete after.models.main;
+        if (after.models?.default !== undefined) delete after.models.default;
       }
     }
     const summary = diffSummary(
-      { providers: before.models?.providers, main: before.models?.main, default: before.models?.default },
-      { providers: after.models.providers,  main: after.models.main,  default: after.models.default },
+      {
+        providers: before.models?.providers,
+        primary: before.agents?.defaults?.model?.primary,
+        deprecatedMain: before.models?.main,
+        deprecatedDefault: before.models?.default,
+      },
+      {
+        providers: after.models.providers,
+        primary: after.agents?.defaults?.model?.primary,
+        deprecatedMain: after.models?.main,        // should be undefined post-fix
+        deprecatedDefault: after.models?.default,  // should be undefined post-fix
+      },
     );
     return {
@@ -263,14 +305,32 @@ export const openclawConnector = {
         if (providers[name]) { delete providers[name]; changed = true; }
       }
     }
-    // If main/default was pointing at us, blank them — let the user
-    // re-pick rather than guess at a replacement.
-    if (isMobyDefaultPointer(after.models?.main)) {
-      after.models.main = null;
+    // Modern path: if the agents.defaults.model.primary points at us,
+    // blank it out — let OpenClaw fall back to whatever default the
+    // user has configured otherwise.
+    const primary = after.agents?.defaults?.model?.primary;
+    if (isMobyDefaultPointer(primary)) {
+      after.agents.defaults.model.primary = null;
+      changed = true;
+    }
+    // Remove our model registrations from agents.defaults.models.
+    if (after.agents?.defaults?.models) {
+      for (const key of Object.keys(after.agents.defaults.models)) {
+        if (isMobyDefaultPointer(key)) {
+          delete after.agents.defaults.models[key];
+          changed = true;
+        }
+      }
+    }
+    // Legacy cleanup: remove deprecated top-level main/default if present
+    // (left over from v0.8.0/0.8.1 connector). OpenClaw .26 rejects them
+    // outright, so removing on disconnect is the right move.
+    if (after.models?.main !== undefined) {
+      delete after.models.main;
       changed = true;
     }
-    if (isMobyDefaultPointer(after.models?.default)) {
-      after.models.default = null;
+    if (after.models?.default !== undefined) {
+      delete after.models.default;
       changed = true;
     }
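
Reviewer note: the `plan()` change above amounts to a small config migration: create the `agents.defaults` path if missing, set `model.primary`, register the model id, and delete the deprecated `models.main` / `models.default` keys. The sketch below restates that on a plain object so the before/after shape is easy to see. `migrateDefaults` and the `example-provider` name are placeholders, and the function is an illustration under those assumptions, not the connector's actual code.

```javascript
// Hypothetical standalone restatement of the migration performed in plan().
function migrateDefaults(config, target) {
  // Work on a copy; the config is plain JSON, so a JSON round-trip is enough.
  const after = JSON.parse(JSON.stringify(config));

  // Ensure the modern path exists without clobbering user entries.
  after.agents ??= {};
  after.agents.defaults ??= {};
  after.agents.defaults.model ??= {};
  after.agents.defaults.models ??= {};

  after.agents.defaults.model.primary = target;
  if (!after.agents.defaults.models[target]) {
    after.agents.defaults.models[target] = {};
  }

  // Drop the deprecated top-level keys if an earlier run wrote them.
  if (after.models) {
    delete after.models.main;
    delete after.models.default;
  }
  return after;
}

// Placeholder config with the old shape; the provider name is invented.
const before = {
  models: {
    providers: { 'example-provider': {} },
    main: 'example-provider/claude-opus-4-7',
    default: 'example-provider/claude-opus-4-7',
  },
};
const migrated = migrateDefaults(before, 'example-provider/claude-opus-4-7');
console.log(JSON.stringify(migrated, null, 2));
```

The copy-then-mutate shape keeps the input untouched, which makes a before/after diff summary straightforward to compute.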

package/lib/session-derive.js CHANGED

@@ -40,6 +40,12 @@
  *     user message from history mid-conversation, the auto-key changes
  *     and the SDK starts a new session. One turn of double-billing,
  *     then we're back on the new key. Acceptable.
+ *   - **Multi-agent collisions** (fixed in v0.8.2): two agents that
+ *     share boilerplate at the start of their system prompt previously
+ *     collided onto one session key when the trim window only covered
+ *     the boilerplate. SYSTEM_TRIM was raised from 500 to 20000 chars
+ *     to capture the per-agent personality content that follows the
+ *     shared preamble. See note on the constant below for details.
  *
  * Opt-out: `X-Session-Id: none` tells us the client explicitly wants
  * stateless behavior — we return null and the request flows through
@@ -51,7 +57,20 @@
 import { createHash } from 'crypto';
 const HASH_LEN = 16;
-const SYSTEM_TRIM = 500;
+// SYSTEM_TRIM was 500 in v0.7.1 — large enough for casual single-agent
+// scenarios (Hermes, single-bot OpenClaw) but caused collisions when
+// multiple agents shared a common boilerplate prefix. Observed in v0.8.1
+// production: Lux + Mercury (two OpenClaw agents) both started their
+// system prompt with the OpenClaw "You are a personal assistant…"
+// boilerplate that filled the first ~500 chars, so their personality
+// markers (loaded from per-agent SOUL.md / IDENTITY.md / etc.) didn't
+// reach the hash and they collided onto the same session key.
+//
+// Bumping to 20kB covers realistic agent system prompts including
+// rich workspace bootstrap (Lux: ~42kB, Mercury: ~80kB total — but
+// the first 20kB has more than enough divergence to fingerprint each).
+// SHA-256 cost on 20kB is ~10-20µs, irrelevant per request.
+const SYSTEM_TRIM = 20000;
 const USER_TRIM = 500;
 /**
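
Reviewer note: the collision is easy to reproduce in isolation. The sketch below hashes a trimmed system prompt plus the first user message, once with a 500-char window and once with 20000. The `auto_` prefix, the 16-character hash length, and the 500-char user trim come from the diff and changelog; the exact way `deriveKey` combines its inputs is an assumption, and the prompts are synthetic stand-ins.

```javascript
// CommonJS require so the snippet runs standalone; the package itself uses ESM imports.
const { createHash } = require('node:crypto');

const HASH_LEN = 16;
const USER_TRIM = 500;

// Hypothetical derivation: hash the first `systemTrim` chars of the system
// prompt together with the first USER_TRIM chars of the first user message.
function deriveKey(system, firstUser, systemTrim) {
  const hash = createHash('sha256');
  hash.update(system.slice(0, systemTrim));
  hash.update('\n');
  hash.update(firstUser.slice(0, USER_TRIM));
  return 'auto_' + hash.digest('hex').slice(0, HASH_LEN);
}

// Synthetic prompts: a shared preamble longer than 500 chars, then per-agent text.
const preamble = 'You are a personal assistant running inside OpenClaw. '.repeat(20);
const lux = preamble + 'IDENTITY: Lux';
const mercury = preamble + 'IDENTITY: Mercury';
const firstMessage = '@Lux @Mercury Hi';

// With a 500-char window only the shared preamble is hashed, so the keys match.
console.log(deriveKey(lux, firstMessage, 500) === deriveKey(mercury, firstMessage, 500)); // true
// With a 20000-char window the per-agent text reaches the hash.
console.log(deriveKey(lux, firstMessage, 20000) === deriveKey(mercury, firstMessage, 20000)); // false
```

A larger window only helps while the distinguishing text falls inside it; an explicit `X-Session-Id` per agent, as the changelog recommends, avoids the dependence on prompt layout altogether.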

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mobygate",
-  "version": "0.8.1",
+  "version": "0.8.3",
   "description": "OpenAI-compatible local proxy for Claude Max. The Möbius-strip gateway: OpenAI shape in, Claude Max out.",
   "type": "module",
   "main": "server.js",

package/server.js CHANGED

@@ -168,6 +168,17 @@ for (const sig of ['SIGTERM', 'SIGINT', 'SIGHUP']) {
 // Opus 4.7 ships a native 1M-context variant addressed as `claude-opus-4-7[1m]`.
 // Default opus aliases route to the 1M form to match the advertised context window.
 // Pass `claude-opus-4-7-200k` for the standard (cheaper) 200k variant.
+//
+// History: the sonnet-4-6 entry previously mapped to the dated
+// `claude-sonnet-4-5-20250929` because at the time, the SDK didn't
+// recognize `claude-sonnet-4-6` natively. The SDK has since added
+// native support for the un-dated 4-6 alias, so sonnet-4-6 requests
+// were silently being downgraded to retired 4-5-20250929. This caused
+// the "Sonnet only" Anthropic quota to show 0% usage even when Lux
+// and Mercury (configured for sonnet-4-6) were chatting actively —
+// the SDK was accepting the retired model id but Claude was likely
+// falling back to opus or returning a zero-billed response. Fixed in
+// v0.8.2 by routing 4-6 through directly.
 const MODEL_MAP = {
   'claude-opus-4': 'claude-opus-4-7[1m]',
   'claude-opus-4-6': 'claude-opus-4-6',
@@ -175,13 +186,14 @@ const MODEL_MAP = {
   'claude-opus-4-7[1m]': 'claude-opus-4-7[1m]',
   'claude-opus-4-7-1m': 'claude-opus-4-7[1m]',
   'claude-opus-4-7-200k': 'claude-opus-4-7',
-  'claude-sonnet-4': 'claude-sonnet-4-5-20250929',
-  'claude-sonnet-4-5': 'claude-sonnet-4-5-20250929',
-  'claude-sonnet-4-6': 'claude-sonnet-4-5-20250929', // SDK resolves 4-6 to non-existent dated version
+  'claude-sonnet-4': 'claude-sonnet-4-6[1m]',         // current latest sonnet, 1M context
+  'claude-sonnet-4-5': 'claude-sonnet-4-5-20250929',  // explicit request for older 4-5
+  'claude-sonnet-4-6': 'claude-sonnet-4-6[1m]',       // SDK supports natively; [1m] unlocks 1M context (same pattern as opus 4.7)
+  'claude-sonnet-4-6-200k': 'claude-sonnet-4-6',      // explicit 200k variant
   'claude-haiku-4': 'claude-haiku-4-5-20251001',
   'claude-haiku-4-5': 'claude-haiku-4-5-20251001',
   'opus': 'claude-opus-4-7[1m]',
-  'sonnet': 'claude-sonnet-4-5-20250929',
+  'sonnet': 'claude-sonnet-4-6[1m]',                  // current latest sonnet, 1M context
   'haiku': 'claude-haiku-4-5-20251001',
 };
@@ -579,6 +591,7 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
         outputTokens = usage.output_tokens;
         cacheReadTokens = usage.cache_read_input_tokens;
         cacheCreateTokens = usage.cache_creation_input_tokens;
+        console.log(`  [model-billed] requested=${resolvedModel} modelUsage=${JSON.stringify(usage.modelUsage || '(none)')}`);
         break;
       }
     }
@@ -784,6 +797,7 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
         outputTokens = usage.output_tokens;
         cacheReadTokens = usage.cache_read_input_tokens;
         cacheCreateTokens = usage.cache_creation_input_tokens;
+        console.log(`  [model-billed] requested=${resolvedModel} modelUsage=${JSON.stringify(usage.modelUsage || '(none)')}`);
         if (message.subtype) stopReason = message.subtype;
         break;
       }
@@ -998,6 +1012,7 @@ async function handleAnthropicNonStreaming(res, body, requestId, sessionKey) {
         outputTokens = usage.output_tokens;
         cacheReadTokens = usage.cache_read_input_tokens;
         cacheCreateTokens = usage.cache_creation_input_tokens;
+        console.log(`  [model-billed] requested=${resolvedModel} modelUsage=${JSON.stringify(usage.modelUsage || '(none)')}`);
         stopReason = mapStopReason(message);
         break;
       }
@@ -1243,6 +1258,7 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
         outputTokens = usage.output_tokens;
         cacheReadTokens = usage.cache_read_input_tokens;
         cacheCreateTokens = usage.cache_creation_input_tokens;
+        console.log(`  [model-billed] requested=${resolvedModel} modelUsage=${JSON.stringify(usage.modelUsage || '(none)')}`);
         if (!toolUseEmitted) stopReason = mapStopReason(message);
         break;
       }
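
Reviewer note: the net effect of the `MODEL_MAP` changes is easiest to check by resolving a few aliases. The entries below are copied from the new side of the hunk; `resolveModel` is a hypothetical lookup, and passing unknown ids through unchanged is an assumption, since the diff does not show how `server.js` handles a miss.

```javascript
// Subset of the post-change MODEL_MAP, copied from the diff above.
const MODEL_MAP = {
  'claude-sonnet-4': 'claude-sonnet-4-6[1m]',
  'claude-sonnet-4-5': 'claude-sonnet-4-5-20250929',
  'claude-sonnet-4-6': 'claude-sonnet-4-6[1m]',
  'claude-sonnet-4-6-200k': 'claude-sonnet-4-6',
  'opus': 'claude-opus-4-7[1m]',
  'sonnet': 'claude-sonnet-4-6[1m]',
};

// Hypothetical resolver: mapped ids are rewritten, anything else passes
// through unchanged (assumed behavior, not confirmed by the diff).
function resolveModel(requested) {
  return Object.hasOwn(MODEL_MAP, requested) ? MODEL_MAP[requested] : requested;
}

for (const id of ['sonnet', 'claude-sonnet-4-6', 'claude-sonnet-4-6-200k', 'claude-sonnet-4-5']) {
  console.log(`${id} -> ${resolveModel(id)}`);
}
```

Under this mapping the plain `claude-sonnet-4-6` id now selects the 1M-context variant, and callers who want the 200k window have to ask for `claude-sonnet-4-6-200k` explicitly.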