mobygate 0.8.1 → 0.8.3

package/CHANGELOG.md CHANGED
@@ -4,6 +4,148 @@ All notable changes to mobygate are documented here. Format loosely follows
  [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
  [Semantic Versioning](https://semver.org/).

+ ## [0.8.3] — 2026-04-28
+
+ OpenClaw .26 compatibility, sonnet 1M context, and observability for
+ silent model swaps. Found by upgrading mobygate to v0.8.1 on Windows
+ during heavy multi-agent testing.
+
+ ### Fixed
+
+ - **OpenClaw connector wrote deprecated `models.main` / `models.default`
+ keys**, which OpenClaw `2026.4.26+` outright rejects with `Unrecognized
+ keys: "main", "default"`. The gateway refused to start until
+ `openclaw doctor --fix` cleaned them up.
+
+ v0.8.3 connector now writes the modern path:
+ - `agents.defaults.model.primary` (the active default)
+ - `agents.defaults.models` (registered model id map)
+
+ Plus on every `plan()` it removes the deprecated top-level keys if a
+ previous mobygate version (v0.8.0–v0.8.2) wrote them. Re-running
+ `mobygate connect openclaw` after upgrading will repair any broken
+ config in place.
+
+ Why this didn't bite Mac users: hand-rolled configs that never went
+ through `mobygate connect`'s apply path were never written by the
+ bad code. It surfaced on Windows when `mobygate init` auto-wired
+ detected clients during a fresh upgrade.
+
+ - **Sonnet 4.6 silently capped at 200k context.** v0.8.2 routed
+ `claude-sonnet-4-6` through directly but didn't append the `[1m]`
+ suffix the way opus 4.7 does, so the SDK ran sonnet at the default
+ 200k window. Confirmed via `modelUsage.contextWindow=200000` in the
+ diagnostic logging.
+
+ Fix: `claude-sonnet-4-6` now maps to `claude-sonnet-4-6[1m]`. Verified
+ next-turn `modelUsage` shows `contextWindow: 1000000`. Matches the
+ capability we already advertise in `/v1/models`.
+
+ Side note: `claude-sonnet-4-6-200k` added as an explicit alias for
+ callers that want the cheaper 200k variant.
+
+ ### Added
+
+ - **`[model-billed]` diagnostic log line.** Every successful
+ (non-tool-use-aborted) request now logs the SDK's `modelUsage` map showing
+ what model Anthropic actually billed against. Output looks like:
+
+ ```
+ [model-billed] requested=claude-sonnet-4-6[1m]
+ modelUsage={"claude-sonnet-4-6[1m]":{"inputTokens":3,"costUSD":0.029,
+ "contextWindow":1000000,...}}
+ ```
+
+ This was originally added as a temporary diagnostic to chase whether
+ Anthropic was silently swapping sonnet → opus. The data confirmed
+ no swap is happening — but the line is too useful to remove. Surfaces
+ cost-per-turn, actual context window, and any future silent model
+ changes the SDK might introduce. Low log volume (one line per
+ result message, ~50% of requests since tool_use turns abort
+ before result).
+
+ Bonus finding from this log: the SDK transparently uses **two models
+ per opus turn** — a haiku for fast intermediate work and opus for
+ the final response. Adds ~$0.024 per opus turn beyond what the
+ capture summary shows. Worth knowing for cost analysis.
+
+ ### Notes
+
+ The "claude.ai/settings/usage Sonnet only stuck at 0%" mystery turned
+ out to be a Claude Max plan accounting design: the "Sonnet only" bar
+ is overflow that only ticks once "All models" is exhausted. On Max
+ plans, sonnet usage rolls into the "All models" bar (which was
+ climbing as expected). Not a mobygate bug — investigation surfaced
+ the v0.8.2 [1m] context omission, which IS a mobygate bug, so net
+ positive on the chase.
+
+ ## [0.8.2] — 2026-04-28
+
+ Multi-agent fixes. Found the day after v0.8.1 shipped, while testing
+ three OpenClaw bots (Mobius/Lux/Mercury) in parallel on the same
+ machine. Both bugs were invisible without the v0.8.1 inspector.
+
+ ### Fixed
+
+ - **Session-key collision when multiple agents share a boilerplate
+ prefix in their system prompt.** v0.7.1's auto-derive hashed the
+ first 500 characters of the system prompt; OpenClaw's "You are a
+ personal assistant running inside OpenClaw…" preamble fills more
+ than that, so the per-agent personality content (loaded later from
+ workspace SOUL.md / IDENTITY.md / etc.) didn't reach the hash. Two
+ separate agents (Lux on sonnet-4-6, Mercury on sonnet-4-6) collided
+ onto the same auto-key when given the same first user message
+ ("@Lux @Mercury Hi"). Same key → same SDK session reuse → cache
+ thrash and potential session-state mixing.
+
+ Bumped `SYSTEM_TRIM` from 500 → 20000 chars. Verified against real
+ captured request bodies that collided in v0.8.1 — they now hash to
+ distinct keys (`auto_b0371e5c…` vs `auto_2b90afd7…`).
+
+ SHA-256 cost on 20kB is ~10-20µs per request, irrelevant in the
+ hot path.
+
+ - **Model map silently downgraded `claude-sonnet-4-6` to retired
+ `claude-sonnet-4-5-20250929`.** When the v0.8.0 model map was
+ written, the Claude Agent SDK didn't recognize the un-dated
+ `claude-sonnet-4-6` alias and we worked around it by routing to the
+ most recent dated 4-5. The SDK has since added native 4-6 support,
+ but mobygate kept the workaround in place. Result: clients (OpenClaw
+ Lux/Mercury) configured for sonnet-4-6 were having their requests
+ rewritten to the retired 4-5-20250929 dated id. Anthropic accepted
+ the call but the response wasn't billing into the user's "Sonnet
+ only" quota — it was showing 0% used despite live traffic. Likely
+ Claude was falling back internally to opus or returning a
+ zero-billed degraded response.
+
+ Fix: route `claude-sonnet-4-6` through directly. Also updated
+ `claude-sonnet-4` and the `sonnet` shorthand to point at 4-6
+ (current latest) instead of the retired dated 4-5 entry. Explicit
+ `claude-sonnet-4-5` requests still route to the dated id for
+ backward compatibility.
+
+ Discovery: the inspector showed Lux/Mercury captures all stamped
+ with `model: claude-sonnet-4-6` (correct from the request side) but
+ Anthropic's quota panel reported 0% sonnet usage. The server.log's
+ `model=claude-sonnet-4-6 → claude-sonnet-4-5-20250929` translation
+ line was the smoking gun.
+
+ ### Notes
+
+ The proper long-term fix is for clients to pass an explicit
+ `X-Session-Id` header per agent (mobygate has supported this since
+ v0.7.1 — it always wins over auto-derive). This bump is a defensive
+ measure for clients that don't.
+
+ Discovery flow is a nice validation of the v0.8.1 inspector: the
+ collision was invisible at the OpenClaw level (each bot's replies
+ arrived correctly because OpenClaw maintains its own per-agent SDK
+ state) but jumped out as soon as the captures were sorted by session
+ key in the inspector — two different model requests with the same
+ session-key, with bootstrap text 55kB long but identical first 500
+ chars. Without the inspector, this would have surfaced as
+ unpredictable cache hit rates and been blamed on Anthropic.
+
  ## [0.8.1] — 2026-04-27

  Diagnostic visibility release. Adds a request/response capture system,
package/lib/anthropic.js CHANGED
@@ -36,17 +36,23 @@ import { v4 as uuidv4 } from 'uuid';
  * against shape variations (the Claude Agent SDK sometimes nests these
  * under `.usage`, sometimes places them flat on the message). Returns
  * a complete usage shape with cache_read / cache_creation fields zeroed
- * out if absent. Used by the 4 mobygate handlers to populate response
- * captures and dashboard cache-hit metrics.
+ * out if absent, plus `modelUsage` (the SDK's per-model usage breakdown,
+ * keyed by the actual model name Anthropic billed against — useful for
+ * spotting silent model fallbacks where the requested model differs
+ * from what Anthropic actually ran).
+ *
+ * Used by the 4 mobygate handlers to populate response captures and
+ * dashboard cache-hit metrics.
  */
  export function extractSdkUsage(message) {
- if (!message) return { input_tokens: 0, output_tokens: 0, cache_read_input_tokens: 0, cache_creation_input_tokens: 0 };
+ if (!message) return { input_tokens: 0, output_tokens: 0, cache_read_input_tokens: 0, cache_creation_input_tokens: 0, modelUsage: null };
  const u = message.usage || message;
  return {
  input_tokens: u.input_tokens || 0,
  output_tokens: u.output_tokens || 0,
  cache_read_input_tokens: u.cache_read_input_tokens || 0,
  cache_creation_input_tokens: u.cache_creation_input_tokens || 0,
+ modelUsage: message.modelUsage || null,
  };
  }
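
To illustrate how the new `modelUsage` field might be used to spot a billed model that differs from the requested one, here is a minimal sketch. It re-declares `extractSdkUsage` as it appears after this change so the snippet runs on its own. The helper `unexpectedBilledModels` and the sample message are assumptions made for illustration; neither is part of mobygate, and the message shape is modeled only on the `[model-billed]` example in the changelog.

```javascript
// Copy of the post-change function from the diff, so the sketch is standalone.
function extractSdkUsage(message) {
  if (!message) {
    return {
      input_tokens: 0,
      output_tokens: 0,
      cache_read_input_tokens: 0,
      cache_creation_input_tokens: 0,
      modelUsage: null,
    };
  }
  const u = message.usage || message;
  return {
    input_tokens: u.input_tokens || 0,
    output_tokens: u.output_tokens || 0,
    cache_read_input_tokens: u.cache_read_input_tokens || 0,
    cache_creation_input_tokens: u.cache_creation_input_tokens || 0,
    modelUsage: message.modelUsage || null,
  };
}

// Hypothetical helper (not in mobygate): billed model names other than the requested one.
function unexpectedBilledModels(requestedModel, usage) {
  if (!usage.modelUsage) return [];
  return Object.keys(usage.modelUsage).filter((name) => name !== requestedModel);
}

// Assumed result-message shape, modeled on the changelog's log example.
const sampleMessage = {
  usage: { input_tokens: 3, output_tokens: 120 },
  modelUsage: {
    'claude-sonnet-4-6[1m]': { inputTokens: 3, costUSD: 0.029, contextWindow: 1000000 },
  },
};

const usage = extractSdkUsage(sampleMessage);
console.log(unexpectedBilledModels('claude-sonnet-4-6[1m]', usage)); // []
console.log(unexpectedBilledModels('claude-opus-4-7[1m]', usage));
```

Because the changelog reports that opus turns also bill a haiku model, a non-empty result is a prompt to look closer rather than proof of a swap.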
 
@@ -169,8 +169,12 @@ export const openclawConnector = {
  configPath: det.configPath,
  mobyProviderExists: !!providers[PROVIDER_NAME_OPENAI],
  mobyNativeProviderExists: !!providers[PROVIDER_NAME_ANTHROPIC],
- currentMain: det.parsed?.models?.main || null,
- currentDefault: det.parsed?.models?.default || null,
+ currentPrimary: det.parsed?.agents?.defaults?.model?.primary || null,
+ // Old keys: still surfaced so the user/dashboard can spot if a
+ // pre-v0.8.3 connector run left these around (they're invalid in
+ // OpenClaw .26+). A non-null value here is a "needs reconnect" hint.
+ deprecatedMain: det.parsed?.models?.main || null,
+ deprecatedDefault: det.parsed?.models?.default || null,
  shadowProviders, // pre-v0.8.0 entries pointing at our base URL
  };
  },
@@ -212,14 +216,52 @@ export const openclawConnector = {
  : null;
  if (preferredProvider) {
  const target = `${preferredProvider}/claude-opus-4-7`;
- after.models.main = target;
- after.models.default = target;
+
+ // Modern OpenClaw schema (>=2026.4.x): defaults live under
+ // `agents.defaults.model.primary` and `agents.defaults.models`.
+ // The old `models.main` / `models.default` top-level keys were
+ // valid in earlier OpenClaw versions but `.26+` rejects them
+ // ("Unrecognized keys: main, default"). The v0.8.0–v0.8.2 connector
+ // wrote the old shape and broke OpenClaw `.26` startup. Fixed
+ // in v0.8.3 by switching to the modern path + cleaning up any
+ // old keys it previously wrote.
+ if (!after.agents) after.agents = {};
+ if (!after.agents.defaults) after.agents.defaults = {};
+ if (!after.agents.defaults.model) after.agents.defaults.model = {};
+ if (!after.agents.defaults.models) after.agents.defaults.models = {};
+
+ after.agents.defaults.model.primary = target;
+ // Register every model we surface in the provider so OpenClaw
+ // sees them in agents-list. Keep existing entries to preserve
+ // user-registered models (ollama, anthropic-direct, etc).
+ for (const m of MODELS_NATIVE_SURFACE) {
+ const id = `${preferredProvider}/${m.id}`;
+ if (!after.agents.defaults.models[id]) {
+ after.agents.defaults.models[id] = {};
+ }
+ }
+
+ // Cleanup: remove the deprecated top-level keys if a previous
+ // connector run (v0.8.0–v0.8.2) wrote them. OpenClaw `.26`
+ // rejects the config outright if these are present.
+ if (after.models?.main !== undefined) delete after.models.main;
+ if (after.models?.default !== undefined) delete after.models.default;
  }
  }

  const summary = diffSummary(
- { providers: before.models?.providers, main: before.models?.main, default: before.models?.default },
- { providers: after.models.providers, main: after.models.main, default: after.models.default },
+ {
+ providers: before.models?.providers,
+ primary: before.agents?.defaults?.model?.primary,
+ deprecatedMain: before.models?.main,
+ deprecatedDefault: before.models?.default,
+ },
+ {
+ providers: after.models.providers,
+ primary: after.agents?.defaults?.model?.primary,
+ deprecatedMain: after.models?.main, // should be undefined post-fix
+ deprecatedDefault: after.models?.default, // should be undefined post-fix
+ },
  );

  return {
@@ -263,14 +305,32 @@ export const openclawConnector = {
  if (providers[name]) { delete providers[name]; changed = true; }
  }
  }
- // If main/default was pointing at us, blank them — let the user
- // re-pick rather than guess at a replacement.
- if (isMobyDefaultPointer(after.models?.main)) {
- after.models.main = null;
+ // Modern path: if the agents.defaults.model.primary points at us,
+ // blank it out and let OpenClaw fall back to whatever default the
+ // user has configured otherwise.
+ const primary = after.agents?.defaults?.model?.primary;
+ if (isMobyDefaultPointer(primary)) {
+ after.agents.defaults.model.primary = null;
+ changed = true;
+ }
+ // Remove our model registrations from agents.defaults.models.
+ if (after.agents?.defaults?.models) {
+ for (const key of Object.keys(after.agents.defaults.models)) {
+ if (isMobyDefaultPointer(key)) {
+ delete after.agents.defaults.models[key];
+ changed = true;
+ }
+ }
+ }
+ // Legacy cleanup: remove deprecated top-level main/default if present
+ // (left over from the v0.8.0–v0.8.2 connector). OpenClaw .26 rejects them
+ // outright, so removing on disconnect is the right move.
+ if (after.models?.main !== undefined) {
+ delete after.models.main;
  changed = true;
  }
- if (isMobyDefaultPointer(after.models?.default)) {
- after.models.default = null;
+ if (after.models?.default !== undefined) {
+ delete after.models.default;
  changed = true;
  }
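
The connect-side migration described in these hunks can be sketched as a pure function over a parsed config object. This is an illustration under stated assumptions, not the connector's actual code: `migrateOpenClawConfig`, the provider name `moby`, and the sample config are hypothetical, and the model list stands in for `MODELS_NATIVE_SURFACE`, whose contents the diff does not show.

```javascript
// Hypothetical standalone version of the plan() migration shown in the diff.
// `provider`, `primaryModel`, and `modelIds` are placeholders for values the
// real connector derives elsewhere.
function migrateOpenClawConfig(config, provider, primaryModel, modelIds) {
  // Work on a copy so the caller's object is left untouched.
  const after = JSON.parse(JSON.stringify(config));

  if (!after.agents) after.agents = {};
  if (!after.agents.defaults) after.agents.defaults = {};
  if (!after.agents.defaults.model) after.agents.defaults.model = {};
  if (!after.agents.defaults.models) after.agents.defaults.models = {};

  after.agents.defaults.model.primary = `${provider}/${primaryModel}`;
  for (const id of modelIds) {
    const key = `${provider}/${id}`;
    // Keep existing entries so user-registered models survive.
    if (!after.agents.defaults.models[key]) after.agents.defaults.models[key] = {};
  }

  // Drop the deprecated top-level keys that OpenClaw 2026.4.26+ rejects.
  if (after.models?.main !== undefined) delete after.models.main;
  if (after.models?.default !== undefined) delete after.models.default;
  return after;
}

// Made-up config in the shape an older connector run would have left behind.
const broken = {
  models: {
    providers: { moby: {} },
    main: 'moby/claude-opus-4-7',
    default: 'moby/claude-opus-4-7',
  },
  agents: { defaults: { models: { 'ollama/llama3': { note: 'user entry' } } } },
};

const fixed = migrateOpenClawConfig(broken, 'moby', 'claude-opus-4-7', [
  'claude-opus-4-7',
  'claude-sonnet-4-6',
]);
console.log(JSON.stringify(fixed, null, 2));
```

Running the function twice on its own output yields the same config, which matches the changelog's note that re-running `mobygate connect openclaw` repairs a broken config in place.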
 
@@ -40,6 +40,12 @@
  * user message from history mid-conversation, the auto-key changes
  * and the SDK starts a new session. One turn of double-billing,
  * then we're back on the new key. Acceptable.
+ * - **Multi-agent collisions** (fixed in v0.8.2): two agents that
+ * share boilerplate at the start of their system prompt previously
+ * collided onto one session key when the trim window only covered
+ * the boilerplate. SYSTEM_TRIM was raised from 500 to 20000 chars
+ * to capture the per-agent personality content that follows the
+ * shared preamble. See note on the constant below for details.
  *
  * Opt-out: `X-Session-Id: none` tells us the client explicitly wants
  * stateless behavior — we return null and the request flows through
@@ -51,7 +57,20 @@
  import { createHash } from 'crypto';

  const HASH_LEN = 16;
- const SYSTEM_TRIM = 500;
+ // SYSTEM_TRIM was 500 in v0.7.1 — large enough for casual single-agent
+ // scenarios (Hermes, single-bot OpenClaw) but caused collisions when
+ // multiple agents shared a common boilerplate prefix. Observed in v0.8.1
+ // production: Lux + Mercury (two OpenClaw agents) both started their
+ // system prompt with the OpenClaw "You are a personal assistant…"
+ // boilerplate that filled the first ~500 chars, so their personality
+ // markers (loaded from per-agent SOUL.md / IDENTITY.md / etc.) didn't
+ // reach the hash and they collided onto the same session key.
+ //
+ // Bumping to 20kB covers realistic agent system prompts including
+ // rich workspace bootstrap (Lux: ~42kB, Mercury: ~80kB total — but
+ // the first 20kB has more than enough divergence to fingerprint each).
+ // SHA-256 cost on 20kB is ~10-20µs, irrelevant per request.
+ const SYSTEM_TRIM = 20000;
  const USER_TRIM = 500;

  /**
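
The effect of the trim window can be reproduced without any hashing, since a deterministic hash of identical input always yields an identical key. The sketch below compares the strings that would be fed to the hash at both window sizes. The function `fingerprintInput`, the way it joins the two parts, and the sample prompts are assumptions made for illustration; the diff does not show how mobygate combines the system prompt and first user message before calling `createHash`.

```javascript
const USER_TRIM = 500;

// Assumption: the auto-key hashes a trimmed system prompt plus a trimmed first
// user message. The newline join is a placeholder for whatever mobygate does.
function fingerprintInput(systemPrompt, firstUserMessage, systemTrim) {
  return systemPrompt.slice(0, systemTrim) + '\n' + firstUserMessage.slice(0, USER_TRIM);
}

// Made-up prompts: a shared preamble longer than 500 chars, then per-agent text.
const preamble = 'You are a personal assistant running inside OpenClaw. '.repeat(12);
const luxPrompt = preamble + 'SOUL.md: Lux, calm and precise.';
const mercuryPrompt = preamble + 'SOUL.md: Mercury, fast and terse.';
const firstMessage = '@Lux @Mercury Hi';

// With the old 500-char window both agents produce the same hash input.
const collidesAt500 =
  fingerprintInput(luxPrompt, firstMessage, 500) ===
  fingerprintInput(mercuryPrompt, firstMessage, 500);

// With the 20000-char window the per-agent text is included, so inputs differ.
const collidesAt20000 =
  fingerprintInput(luxPrompt, firstMessage, 20000) ===
  fingerprintInput(mercuryPrompt, firstMessage, 20000);

console.log({ collidesAt500, collidesAt20000 }); // { collidesAt500: true, collidesAt20000: false }
```

A wider window only helps when the prompts diverge inside it, which is why the changelog still recommends an explicit `X-Session-Id` header per agent.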
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "mobygate",
- "version": "0.8.1",
+ "version": "0.8.3",
  "description": "OpenAI-compatible local proxy for Claude Max. The Möbius-strip gateway: OpenAI shape in, Claude Max out.",
  "type": "module",
  "main": "server.js",
package/server.js CHANGED
@@ -168,6 +168,17 @@ for (const sig of ['SIGTERM', 'SIGINT', 'SIGHUP']) {
  // Opus 4.7 ships a native 1M-context variant addressed as `claude-opus-4-7[1m]`.
  // Default opus aliases route to the 1M form to match the advertised context window.
  // Pass `claude-opus-4-7-200k` for the standard (cheaper) 200k variant.
+ //
+ // History: the sonnet-4-6 entry previously mapped to the dated
+ // `claude-sonnet-4-5-20250929` because at the time, the SDK didn't
+ // recognize `claude-sonnet-4-6` natively. The SDK has since added
+ // native support for the un-dated 4-6 alias, but the workaround
+ // stayed in place, so sonnet-4-6 requests were silently being
+ // downgraded to the retired 4-5-20250929. At the time this was
+ // blamed for the "Sonnet only" Anthropic quota showing 0% usage
+ // while Lux and Mercury (configured for sonnet-4-6) were chatting
+ // actively; v0.8.3 later traced that reading to Claude Max plan
+ // accounting. Fixed in v0.8.2 by routing 4-6 through directly.
  const MODEL_MAP = {
  'claude-opus-4': 'claude-opus-4-7[1m]',
  'claude-opus-4-6': 'claude-opus-4-6',
@@ -175,13 +186,14 @@ const MODEL_MAP = {
  'claude-opus-4-7[1m]': 'claude-opus-4-7[1m]',
  'claude-opus-4-7-1m': 'claude-opus-4-7[1m]',
  'claude-opus-4-7-200k': 'claude-opus-4-7',
- 'claude-sonnet-4': 'claude-sonnet-4-5-20250929',
- 'claude-sonnet-4-5': 'claude-sonnet-4-5-20250929',
- 'claude-sonnet-4-6': 'claude-sonnet-4-5-20250929', // SDK resolves 4-6 to non-existent dated version
+ 'claude-sonnet-4': 'claude-sonnet-4-6[1m]', // current latest sonnet, 1M context
+ 'claude-sonnet-4-5': 'claude-sonnet-4-5-20250929', // explicit request for older 4-5
+ 'claude-sonnet-4-6': 'claude-sonnet-4-6[1m]', // SDK supports natively; [1m] unlocks 1M context (same pattern as opus 4.7)
+ 'claude-sonnet-4-6-200k': 'claude-sonnet-4-6', // explicit 200k variant
  'claude-haiku-4': 'claude-haiku-4-5-20251001',
  'claude-haiku-4-5': 'claude-haiku-4-5-20251001',
  'opus': 'claude-opus-4-7[1m]',
- 'sonnet': 'claude-sonnet-4-5-20250929',
+ 'sonnet': 'claude-sonnet-4-6[1m]', // current latest sonnet, 1M context
  'haiku': 'claude-haiku-4-5-20251001',
  };
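
A small lookup sketch shows how the post-change sonnet entries resolve. Only entries visible in this diff are copied into the map. The `resolveModel` function and its pass-through behavior for unknown ids are assumptions; the diff does not show how server.js consults `MODEL_MAP` or what it does on a miss.

```javascript
// Entries copied from the post-change MODEL_MAP in the diff (subset).
const MODEL_MAP = {
  'claude-opus-4-7-200k': 'claude-opus-4-7',
  'claude-sonnet-4': 'claude-sonnet-4-6[1m]',
  'claude-sonnet-4-5': 'claude-sonnet-4-5-20250929',
  'claude-sonnet-4-6': 'claude-sonnet-4-6[1m]',
  'claude-sonnet-4-6-200k': 'claude-sonnet-4-6',
  'opus': 'claude-opus-4-7[1m]',
  'sonnet': 'claude-sonnet-4-6[1m]',
  'haiku': 'claude-haiku-4-5-20251001',
};

// Assumption: ids missing from the map pass through unchanged.
function resolveModel(requested) {
  return Object.prototype.hasOwnProperty.call(MODEL_MAP, requested)
    ? MODEL_MAP[requested]
    : requested;
}

for (const id of ['sonnet', 'claude-sonnet-4-6', 'claude-sonnet-4-6-200k', 'claude-sonnet-4-5']) {
  console.log(`${id} -> ${resolveModel(id)}`);
}
```

The `-200k` alias is the only sonnet 4.6 entry that resolves to the bare id, which is how a caller opts out of the 1M window.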
 
@@ -579,6 +591,7 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
  outputTokens = usage.output_tokens;
  cacheReadTokens = usage.cache_read_input_tokens;
  cacheCreateTokens = usage.cache_creation_input_tokens;
+ console.log(` [model-billed] requested=${resolvedModel} modelUsage=${JSON.stringify(usage.modelUsage || '(none)')}`);
  break;
  }
  }
@@ -784,6 +797,7 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
  outputTokens = usage.output_tokens;
  cacheReadTokens = usage.cache_read_input_tokens;
  cacheCreateTokens = usage.cache_creation_input_tokens;
+ console.log(` [model-billed] requested=${resolvedModel} modelUsage=${JSON.stringify(usage.modelUsage || '(none)')}`);
  if (message.subtype) stopReason = message.subtype;
  break;
  }
@@ -998,6 +1012,7 @@ async function handleAnthropicNonStreaming(res, body, requestId, sessionKey) {
  outputTokens = usage.output_tokens;
  cacheReadTokens = usage.cache_read_input_tokens;
  cacheCreateTokens = usage.cache_creation_input_tokens;
+ console.log(` [model-billed] requested=${resolvedModel} modelUsage=${JSON.stringify(usage.modelUsage || '(none)')}`);
  stopReason = mapStopReason(message);
  break;
  }
@@ -1243,6 +1258,7 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
  outputTokens = usage.output_tokens;
  cacheReadTokens = usage.cache_read_input_tokens;
  cacheCreateTokens = usage.cache_creation_input_tokens;
+ console.log(` [model-billed] requested=${resolvedModel} modelUsage=${JSON.stringify(usage.modelUsage || '(none)')}`);
  if (!toolUseEmitted) stopReason = mapStopReason(message);
  break;
  }
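
For cost analysis, the `[model-billed]` lines written by the four handlers above could be read back with a parser along these lines. This is a sketch, not a mobygate feature: `parseModelBilled` is a hypothetical name, the per-model `costUSD` field is taken from the changelog's sample output, and the haiku model id and cost figures in the sample line are illustrative values only.

```javascript
// Parses one line in the format emitted by the console.log calls in the diff.
function parseModelBilled(line) {
  const match = line.match(/\[model-billed\] requested=(\S+) modelUsage=(.*)$/);
  if (!match) return null;

  const modelUsage = JSON.parse(match[2]);
  // The handlers log the JSON string "(none)" when the SDK returned no modelUsage.
  if (typeof modelUsage !== 'object' || modelUsage === null) {
    return { requested: match[1], billed: [], totalCostUSD: 0 };
  }

  const billed = Object.keys(modelUsage);
  const totalCostUSD = billed.reduce(
    (sum, name) => sum + (modelUsage[name].costUSD || 0),
    0,
  );
  return { requested: match[1], billed, totalCostUSD };
}

// Illustrative line for an opus turn that also billed a haiku model.
const opusLine =
  ' [model-billed] requested=claude-opus-4-7[1m] modelUsage=' +
  JSON.stringify({
    'claude-haiku-4-5-20251001': { costUSD: 0.024 },
    'claude-opus-4-7[1m]': { costUSD: 0.5 },
  });

const parsed = parseModelBilled(opusLine);
console.log(parsed.requested, parsed.billed, parsed.totalCostUSD.toFixed(3));

// The fallback form logged when modelUsage is absent.
const noneLine = ' [model-billed] requested=claude-sonnet-4-6[1m] modelUsage="(none)"';
console.log(parseModelBilled(noneLine).billed.length); // 0
```

Summing `costUSD` across every billed model, rather than only the requested one, is what surfaces the extra per-turn haiku cost the changelog mentions.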