mobygate 0.7.0 → 0.7.2

package/CHANGELOG.md CHANGED
@@ -4,6 +4,80 @@ All notable changes to mobygate are documented here. Format loosely follows
4
4
  [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
5
5
  [Semantic Versioning](https://semver.org/).
6
6
 
7
+ ## [0.7.2] — 2026-04-25
8
+
9
+ ### Fixed
10
+
11
+ - **"I can't use the tool 'grep' here because it isn't available" refusals**
12
+ in long-running tasks. Even with `allowedTools: ['mcp__mobygate__*']`
13
+ blocking everything except client-defined tools, the model retains
14
+ strong priors from training for Claude Code's built-ins (Bash, Grep,
15
+ Read, Edit, Glob, WebFetch, ToolSearch, etc.). When a task seemed to
16
+ call for one — e.g., "find all TODOs" → instinctive reach for Grep —
17
+ the model would attempt it, get blocked, refuse the task, and stop
18
+ instead of falling back to the available client tool (`searchFiles`,
19
+ `terminal`, etc.).
20
+
21
+ **Fix:** for any tool-enabled request, append a short system-prompt
22
+ block (~150 tokens) via the SDK's
23
+ `systemPrompt: { type: 'preset', preset: 'claude_code', append: ... }`
24
+ option. The append explicitly lists the available client tools and
25
+ states that Claude Code's built-ins are NOT in this environment.
26
+ Calibrated to be matter-of-fact ("here's the environment, work
27
+ within it") rather than over-restrictive — the model now uses
28
+ available tools or briefly says what's missing, instead of refusing
29
+ silently.
30
+
31
+ Applies to both `/v1/chat/completions` and `/v1/messages`.
32
+
33
+ ### Notes
34
+
35
+ - New helper: `buildToolUsageGuidance(tools)` in `lib/tool-bridge.js`
36
+ produces the append text from the OpenAI-shape tool array. The
37
+ Anthropic surface translates its tool defs to OpenAI shape for the
38
+ bridge already, so the helper takes one input shape across both.
39
+ - Per-request token overhead: ~150 tokens, only when `tools` is non-empty.
40
+ No effect on text-only chat or non-tool requests.
41
+
42
+ ## [0.7.1] — 2026-04-24
43
+
44
+ Fixes a meaningful token-burn issue for clients that don't pass session
45
+ keys.
46
+
47
+ ### Added
48
+
49
+ - **Auto-derived session keys.** When a request arrives without an
50
+ `X-Session-Id` header (and without `body.session_id`), mobygate now
51
+ hashes a stable signature of the conversation — model + system
52
+ prompt + first user message — and uses that as the session key.
53
+ Subsequent turns of the same conversation hit the same auto-key,
54
+ the SDK resume kicks in, and the client only pays input-token cost
55
+ for the *new* tail of each turn instead of resending 200 messages
56
+ of history every time.
57
+
58
+ Surfaced in logs as `session=auto_<hash> (auto)` so you can tell
59
+ client-keyed sessions from server-derived ones at a glance. New
60
+ module: `lib/session-derive.js`.
61
+
62
+ In production we observed an OpenClaw client repeatedly sending
63
+ 175–211-message conversation histories without a session key,
64
+ burning through Max usage in minutes. With this change, the same
65
+ workload re-uses the SDK session and only the new turn gets billed.
66
+
67
+ - **Per-request opt-out:** `X-Session-Id: none` (literal string) tells
68
+ mobygate to skip auto-derive and run the request fully stateless.
69
+
70
+ ### Notes
71
+
72
+ - Applies to both `/v1/chat/completions` (OpenAI) and `/v1/messages`
73
+ (Anthropic) surfaces.
74
+ - Auto-keys obey the same 60-minute idle TTL as explicit ones, so
75
+ stale auto-sessions clean themselves up.
76
+ - Two unrelated users starting with identical model + system + first
77
+ message would share an auto-session — fine for single-user dev
78
+ setups, but multi-tenant deployments should pass `X-Session-Id`
79
+ explicitly to scope per-user.
80
+
7
81
  ## [0.7.0] — 2026-04-24
8
82
 
9
83
  Phase 2: native Anthropic Messages surface.
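The 0.7.1 notes recommend that multi-tenant clients pass `X-Session-Id` explicitly, and describe `X-Session-Id: none` as the per-request opt-out. A minimal client-side sketch of those three cases follows. The helper name `sessionHeaders` and the `userId:conversationId` key format are assumptions made for illustration; only the header name and the literal `none` come from the changelog.

```javascript
// Hypothetical client-side helper (not part of mobygate). Only the
// X-Session-Id header and the literal 'none' opt-out come from the changelog.
function sessionHeaders({ userId, conversationId, stateless = false } = {}) {
  const headers = { 'Content-Type': 'application/json' };
  if (stateless) {
    // Explicit opt-out: the proxy should skip auto-derive for this request.
    headers['X-Session-Id'] = 'none';
  } else if (userId && conversationId) {
    // Assumed key format: scoping by user avoids the shared-session case
    // the changelog warns about for multi-tenant deployments.
    headers['X-Session-Id'] = `${userId}:${conversationId}`;
  }
  // With neither option set the header is omitted and the proxy auto-derives.
  return headers;
}

const scoped = sessionHeaders({ userId: 'alice', conversationId: 'c42' });
const optOut = sessionHeaders({ stateless: true });
const implicit = sessionHeaders();
```

Omitting the header entirely is the case 0.7.1 changes: such requests were previously stateless and now receive an auto-derived key.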
package/lib/session-derive.js ADDED
@@ -0,0 +1,164 @@
1
+ /**
2
+ * Auto-derive session keys for clients that don't send `X-Session-Id`.
3
+ *
4
+ * Why this exists: OpenAI's wire format is stateless by design — clients
5
+ * are expected to send the entire conversation history with every turn,
6
+ * and many clients (OpenClaw at the time of writing, plenty of others)
7
+ * don't bother passing a session identifier. Without one, mobygate
8
+ * treats every request as a fresh SDK session and the client ends up
9
+ * paying input-token cost for the full history on every single turn.
10
+ * On long conversations (175+ messages observed in production), this
11
+ * burns through Claude Max usage budgets in minutes.
12
+ *
13
+ * The fix: when a request arrives without an explicit session key, we
14
+ * compute one ourselves from a *stable signature* of the conversation —
15
+ * model + system prompt + first user message. The same conversation
16
+ * thread produces the same auto-key turn after turn, so the SDK resume
17
+ * machinery kicks in and only the new tail of each turn gets billed.
18
+ * Different conversations naturally produce different signatures and
19
+ * stay isolated. The existing 60-minute idle TTL keeps stale auto-keys
20
+ * from lingering forever.
21
+ *
22
+ * What's hashed (and why each piece):
23
+ * - **model** — different agent configs shouldn't share a session.
24
+ * - **system** (string or content blocks, plus any system-role
25
+ * messages) — typically stable for the lifetime of a conversation
26
+ * thread, distinguishes one agent's persona from another's.
27
+ * - **first user message text** — anchors the thread. Stable until
28
+ * the client prunes it from history; if/when that happens, a new
29
+ * auto-key forms and we lose continuity for that one transition.
30
+ * Graceful degradation, not a crash.
31
+ *
32
+ * Limitations to be aware of:
33
+ * - **Collisions across users:** if two unrelated users happen to
34
+ * start with the same model + system + first message ("hello"),
35
+ * they'd share a session. In single-user dev contexts (Hermes,
36
+ * OpenClaw on a personal machine) this is fine. For multi-tenant
37
+ * deployments, clients should pass `X-Session-Id` explicitly to
38
+ * scope per-user.
39
+ * - **History pruning shifts the key:** if the client drops the first
40
+ * user message from history mid-conversation, the auto-key changes
41
+ * and the SDK starts a new session. One turn of double-billing,
42
+ * then we're back on the new key. Acceptable.
43
+ *
44
+ * Opt-out: `X-Session-Id: none` tells us the client explicitly wants
45
+ * stateless behavior — we return null and the request flows through
46
+ * as a fresh SDK call. (An *empty* X-Session-Id is indistinguishable
47
+ * from "header not set" at the Express layer, so we treat it as
48
+ * "no explicit key, please auto-derive" rather than as opt-out.)
49
+ */
50
+
51
+ import { createHash } from 'crypto';
52
+
53
+ const HASH_LEN = 16;
54
+ const SYSTEM_TRIM = 500;
55
+ const USER_TRIM = 500;
56
+
57
+ /**
58
+ * Extract a flat text representation of a content field that might be
59
+ * either a string or an array of OpenAI/Anthropic content parts. We
60
+ * only pull the text — images/tool blocks/etc. are ignored for hashing
61
+ * because they vary in serialization but don't change conversation
62
+ * identity.
63
+ */
64
+ function flattenContent(content) {
65
+ if (typeof content === 'string') return content;
66
+ if (!Array.isArray(content)) return '';
67
+ const out = [];
68
+ for (const part of content) {
69
+ if (typeof part === 'string') out.push(part);
70
+ else if (part?.type === 'text' && part.text) out.push(part.text);
71
+ // image_url / image / tool_use / tool_result intentionally skipped
72
+ }
73
+ return out.join(' ');
74
+ }
75
+
76
+ /**
77
+ * Pull the system text out of a request body. The Anthropic surface
78
+ * carries it on `body.system` (string OR content blocks), the OpenAI
79
+ * surface carries it as messages with `role: 'system'`. Combine both.
80
+ */
81
+ function extractSystemText(body) {
82
+ let parts = [];
83
+ if (typeof body?.system === 'string') {
84
+ parts.push(body.system);
85
+ } else if (Array.isArray(body?.system)) {
86
+ parts.push(flattenContent(body.system));
87
+ }
88
+ for (const msg of body?.messages || []) {
89
+ if (msg?.role === 'system') {
90
+ parts.push(flattenContent(msg.content));
91
+ }
92
+ }
93
+ return parts.join('\n').slice(0, SYSTEM_TRIM);
94
+ }
95
+
96
+ /**
97
+ * First user-role message in the array, flattened to text. We use the
98
+ * first (oldest) one because it's the most stable anchor — later turns
99
+ * change every request.
100
+ */
101
+ function extractFirstUserText(body) {
102
+ for (const msg of body?.messages || []) {
103
+ if (msg?.role === 'user') {
104
+ const text = flattenContent(msg.content);
105
+ if (text) return text.slice(0, USER_TRIM);
106
+ }
107
+ }
108
+ return '';
109
+ }
110
+
111
+ /**
112
+ * Compute a stable session key from a request body. Returns a string
113
+ * like `auto_<16hex>` when there's enough signal to hash, or `null`
114
+ * when the request has no first user message text to anchor on (the
115
+ * caller should fall through to stateless behavior in that case).
116
+ *
117
+ * The hash uses SHA-256 truncated to 16 hex chars (~64 bits of
118
+ * collision space), which is a few orders of magnitude more than needed for the
119
+ * "same conversation prefix" matching use case.
120
+ */
121
+ export function deriveSessionKey(body) {
122
+ const model = body?.model || '';
123
+ const system = extractSystemText(body);
124
+ const firstUser = extractFirstUserText(body);
125
+
126
+ // Need at least *something* to anchor on. If the request has no
127
+ // model and no user message, there's literally nothing to identify
128
+ // the conversation with — better to return null and let the caller
129
+ // run stateless than to bucket everything into the same auto-key.
130
+ if (!model && !system && !firstUser) return null;
131
+ if (!firstUser) return null; // first user msg is the anchor; no anchor → no auto-key
132
+
133
+ const signature = [model, system, firstUser].join('||');
134
+ const digest = createHash('sha256').update(signature).digest('hex').slice(0, HASH_LEN);
135
+ return `auto_${digest}`;
136
+ }
137
+
138
+ /**
139
+ * Resolve the effective session key for a request. Order:
140
+ * 1. Explicit `X-Session-Id` header (or `body.session_id`) wins.
141
+ * Special value `'none'` means "explicitly stateless" and
142
+ * short-circuits to null without auto-deriving.
143
+ * 2. Auto-derived key from the conversation signature.
144
+ * 3. null (stateless) — only when there's nothing useful to hash.
145
+ *
146
+ * Returns `{ key, source }` where source is `'explicit' | 'auto' | 'none'`.
147
+ * The source label is informational — server.js logs it and the dashboard
148
+ * shows it so you can tell at a glance whether a session was client-keyed
149
+ * or server-derived.
150
+ */
151
+ export function resolveSessionKey({ headerKey, bodyKey, body }) {
152
+ const explicit = headerKey || bodyKey;
153
+ if (explicit) {
154
+ const trimmed = String(explicit).trim();
155
+ if (trimmed.toLowerCase() === 'none') {
156
+ return { key: null, source: 'none' };
157
+ }
158
+ if (trimmed) return { key: trimmed, source: 'explicit' };
159
+ }
160
+
161
+ const derived = deriveSessionKey(body);
162
+ if (derived) return { key: derived, source: 'auto' };
163
+ return { key: null, source: 'none' };
164
+ }
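To make the stability and pruning behavior described in the module comment concrete, here is a condensed stand-in for `deriveSessionKey`. It is a sketch under simplifying assumptions: it handles string content only, skips the 500-character trims, and ignores `body.system`. The name `sketchKey` and the sample messages are hypothetical, and it uses CommonJS `require` so it runs as a standalone script, whereas the module itself uses ESM imports.

```javascript
const { createHash } = require('node:crypto');

// Condensed stand-in for deriveSessionKey: string content only, no trimming,
// and no handling of the Anthropic-style body.system field.
function sketchKey(body) {
  const messages = body.messages || [];
  const system = messages
    .filter((m) => m.role === 'system')
    .map((m) => m.content)
    .join('\n');
  const firstUser = messages.find((m) => m.role === 'user' && m.content);
  if (!firstUser) return null; // no anchor, no auto-key
  const signature = [body.model || '', system, firstUser.content].join('||');
  const digest = createHash('sha256').update(signature).digest('hex');
  return 'auto_' + digest.slice(0, 16);
}

// Turn 1 and turn 2 of the same conversation (sample data).
const turn1 = {
  model: 'example-model',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Find all TODOs in the repo.' },
  ],
};
const turn2 = {
  ...turn1,
  messages: [
    ...turn1.messages,
    { role: 'assistant', content: 'Found three.' },
    { role: 'user', content: 'Fix the first one.' },
  ],
};
// The client prunes the first user message from history.
const pruned = {
  ...turn2,
  messages: [turn2.messages[0], ...turn2.messages.slice(2)],
};

const keyTurn1 = sketchKey(turn1);
const keyTurn2 = sketchKey(turn2);
const keyPruned = sketchKey(pruned);
```

Here `keyTurn1` and `keyTurn2` are equal because the model, system text, and first user message are unchanged, while `keyPruned` differs because the anchor message is gone. That is the "history pruning shifts the key" limitation listed in the module comment.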
package/lib/tool-bridge.js CHANGED
@@ -218,6 +218,50 @@ export function hasToolUse(assistantMessage) {
218
218
  // Tool results (OpenAI tool messages → Anthropic tool_result content blocks)
219
219
  // ---------------------------------------------------------------------------
220
220
 
221
+ // ---------------------------------------------------------------------------
222
+ // Strict-tool guidance (system-prompt append for tool-enabled requests)
223
+ // ---------------------------------------------------------------------------
224
+ // Even with native MCP registration + a tight `allowedTools` allowlist, the
225
+ // model retains strong priors for Claude Code's built-in tools (Bash, Read,
226
+ // Edit, Grep, Glob, WebFetch, ToolSearch, etc.) from training. When a task
227
+ // seems to need one of those, the model reaches for it, gets blocked by
228
+ // `allowedTools`, says "I can't use the tool 'grep' here because it isn't
229
+ // available," and gives up — instead of falling back to the available
230
+ // client-defined tools. We saw this in production OpenClaw use.
231
+ //
232
+ // The fix: append a short, explicit guidance block to Claude Code's system
233
+ // prompt (via `systemPrompt: { type: 'preset', preset: 'claude_code', append: ... }`)
234
+ // telling the model exactly which tools are available and that built-ins
235
+ // are NOT in this environment. The positive list reinforces what the model
236
+ // already sees via MCP registration; the negative list shuts down the
237
+ // trained-in instinct to reach for built-ins.
238
+ //
239
+ // Calibration matters: too directive and the model becomes over-conservative
240
+ // and refuses legitimate work. We aim for matter-of-fact "here's the
241
+ // environment, work within it" rather than threatening prohibition.
242
+
243
+ const KNOWN_BUILTINS = 'Bash, Read, Edit, Write, Grep, Glob, NotebookEdit, WebFetch, WebSearch, Task, ToolSearch';
244
+
245
+ export function buildToolUsageGuidance(openaiTools) {
246
+ if (!Array.isArray(openaiTools) || openaiTools.length === 0) return null;
247
+ const names = [];
248
+ for (const t of openaiTools) {
249
+ if (t?.type !== 'function' || !t.function?.name) continue;
250
+ names.push(t.function.name);
251
+ }
252
+ if (names.length === 0) return null;
253
+
254
+ return [
255
+ 'Tool environment: this session is running through a proxy that exposes only the client-defined tools listed below. Claude Code\'s default built-in tools',
256
+ `(${KNOWN_BUILTINS}, etc.) are NOT available in this environment and cannot be invoked — calls to them will fail.`,
257
+ '',
258
+ 'Available tools:',
259
+ ...names.map((n) => ` - ${n}`),
260
+ '',
261
+ 'If a task seems to require a built-in tool that isn\'t in this list, accomplish what you can with the available tools and briefly note what\'s missing — do not refuse silently or claim you have no tools.',
262
+ ].join('\n');
263
+ }
264
+
221
265
  /**
222
266
  * Format OpenAI role:'tool' messages as a single user-readable text
223
267
  * block to splice into a resumed prompt.
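The changelog notes that the Anthropic surface translates its tool definitions to OpenAI shape before they reach `buildToolUsageGuidance`. The sketch below shows one plausible form of that translation, followed by the same name filter the helper applies. `toOpenAiShape` and `listToolNames` are hypothetical names, and the actual mapping in `server.js` is not visible in this diff, so treat the field defaults as assumptions.

```javascript
// Hypothetical translation step: Anthropic tool defs carry name, description
// and input_schema; OpenAI function tools nest them under `function`.
function toOpenAiShape(anthropicTools) {
  return anthropicTools.map((t) => ({
    type: 'function',
    function: {
      name: t.name,
      description: t.description || '',
      parameters: t.input_schema || { type: 'object', properties: {} },
    },
  }));
}

// Mirrors the filter in buildToolUsageGuidance: keep only well-formed
// function tools and collect their names.
function listToolNames(openaiTools) {
  if (!Array.isArray(openaiTools)) return [];
  return openaiTools
    .filter((t) => t?.type === 'function' && t.function?.name)
    .map((t) => t.function.name);
}

// Sample tool definitions, named after the client tools the changelog mentions.
const anthropicTools = [
  {
    name: 'searchFiles',
    description: 'Search file contents',
    input_schema: { type: 'object', properties: { query: { type: 'string' } } },
  },
  { name: 'terminal', description: 'Run a shell command' },
];

const names = listToolNames(toOpenAiShape(anthropicTools));
const listing = names.map((n) => `- ${n}`).join('\n');
```

Normalizing to one shape first means the guidance builder only has to understand a single input format, which matches the rationale given in the 0.7.2 notes.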
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "mobygate",
3
- "version": "0.7.0",
3
+ "version": "0.7.2",
4
4
  "description": "OpenAI-compatible local proxy for Claude Max. The Möbius-strip gateway: OpenAI shape in, Claude Max out.",
5
5
  "type": "module",
6
6
  "main": "server.js",
package/server.js CHANGED
@@ -55,6 +55,7 @@ import { loadSessions, saveSessions, flushSessionsNow } from './lib/session-stor
55
55
  import { LOGS_DIR } from './lib/config.js';
56
56
  import {
57
57
  buildClientToolsServer,
58
+ buildToolUsageGuidance,
58
59
  extractToolUses,
59
60
  hasToolUse,
60
61
  toolMessagesToText,
@@ -76,6 +77,7 @@ import {
76
77
  hasAnthropicTools,
77
78
  mapStopReason,
78
79
  } from './lib/anthropic.js';
80
+ import { resolveSessionKey } from './lib/session-derive.js';
79
81
 
80
82
  const __filename = fileURLToPath(import.meta.url);
81
83
  const __dirname = dirname(__filename);
@@ -401,6 +403,12 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
401
403
  // Build the in-process MCP server exposing client tools to the SDK.
402
404
  // null when toolsEnabled is false (or all tools are malformed).
403
405
  const clientToolsServer = toolsEnabled ? buildClientToolsServer(body.tools) : null;
406
+ // System-prompt append: tells the model exactly which tools are
407
+ // available and that Claude Code's built-ins (Bash, Grep, Read, etc.)
408
+ // are NOT in this environment. Without this, the model's trained-in
409
+ // priors lead it to call Grep/Bash, get blocked by allowedTools, and
410
+ // refuse the task instead of falling back to client tools. ~150 tokens.
411
+ const toolsGuidance = clientToolsServer ? buildToolUsageGuidance(body.tools) : null;
404
412
  if (images.length) console.log(` [multimodal] ${images.length} image block(s)`);
405
413
  if (toolsEnabled) console.log(` [tools] ${body.tools.length} client tool(s) registered as MCP`);
406
414
 
@@ -457,6 +465,7 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
457
465
  ? {
458
466
  mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
459
467
  allowedTools: [`${MCP_TOOL_PREFIX}*`],
468
+ systemPrompt: { type: 'preset', preset: 'claude_code', append: toolsGuidance },
460
469
  }
461
470
  : toolsEnabled
462
471
  // Tools were requested but none were valid — disable all tools.
@@ -619,6 +628,7 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
619
628
  const prompt = buildQueryPrompt(promptText, images);
620
629
  const model = resolveModel(body.model);
621
630
  const clientToolsServer = toolsEnabled ? buildClientToolsServer(body.tools) : null;
631
+ const toolsGuidance = clientToolsServer ? buildToolUsageGuidance(body.tools) : null;
622
632
  if (images.length) console.log(` [multimodal] ${images.length} image block(s)`);
623
633
  if (toolsEnabled) console.log(` [tools] ${body.tools.length} client tool(s) registered as MCP`);
624
634
 
@@ -655,6 +665,7 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
655
665
  ? {
656
666
  mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
657
667
  allowedTools: [`${MCP_TOOL_PREFIX}*`],
668
+ systemPrompt: { type: 'preset', preset: 'claude_code', append: toolsGuidance },
658
669
  }
659
670
  : toolsEnabled
660
671
  ? { allowedTools: [] }
@@ -805,6 +816,7 @@ async function handleAnthropicNonStreaming(res, body, requestId, sessionKey) {
805
816
  }))
806
817
  : null;
807
818
  const clientToolsServer = toolsForBridge ? buildClientToolsServer(toolsForBridge) : null;
819
+ const toolsGuidance = clientToolsServer ? buildToolUsageGuidance(toolsForBridge) : null;
808
820
 
809
821
  if (images.length) console.log(` [multimodal] ${images.length} image block(s)`);
810
822
  if (toolsEnabled) console.log(` [tools] ${body.tools.length} client tool(s) registered as MCP`);
@@ -843,6 +855,7 @@ async function handleAnthropicNonStreaming(res, body, requestId, sessionKey) {
843
855
  ? {
844
856
  mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
845
857
  allowedTools: [`${MCP_TOOL_PREFIX}*`],
858
+ systemPrompt: { type: 'preset', preset: 'claude_code', append: toolsGuidance },
846
859
  }
847
860
  : toolsEnabled
848
861
  ? { allowedTools: [] }
@@ -947,6 +960,7 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
947
960
  }))
948
961
  : null;
949
962
  const clientToolsServer = toolsForBridge ? buildClientToolsServer(toolsForBridge) : null;
963
+ const toolsGuidance = clientToolsServer ? buildToolUsageGuidance(toolsForBridge) : null;
950
964
 
951
965
  if (images.length) console.log(` [multimodal] ${images.length} image block(s)`);
952
966
  if (toolsEnabled) console.log(` [tools] ${body.tools.length} client tool(s) registered as MCP`);
@@ -1003,6 +1017,7 @@ async function handleAnthropicStreaming(req, res, body, requestId, sessionKey) {
1003
1017
  ? {
1004
1018
  mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
1005
1019
  allowedTools: [`${MCP_TOOL_PREFIX}*`],
1020
+ systemPrompt: { type: 'preset', preset: 'claude_code', append: toolsGuidance },
1006
1021
  }
1007
1022
  : toolsEnabled
1008
1023
  ? { allowedTools: [] }
@@ -1193,10 +1208,20 @@ app.post('/v1/chat/completions', async (req, res) => {
1193
1208
  });
1194
1209
  }
1195
1210
 
1196
- // Session key: X-Session-Id header > body.session_id > null (stateless)
1197
- const sessionKey = req.headers['x-session-id'] || body.session_id || null;
1211
+ // Session key resolution: X-Session-Id header > body.session_id >
1212
+ // auto-derived from conversation signature > null (stateless).
1213
+ // Auto-derive protects clients that don't pass a session header from
1214
+ // re-paying input-token cost on every turn of a long conversation —
1215
+ // see lib/session-derive.js for the rationale and trade-offs.
1216
+ const { key: sessionKey, source: sessionKeySource } = resolveSessionKey({
1217
+ headerKey: req.headers['x-session-id'],
1218
+ bodyKey: body.session_id,
1219
+ body,
1220
+ });
1198
1221
  const existing = getSession(sessionKey);
1199
- const sessionTag = sessionKey ? ` | session=${sessionKey}${existing ? ' (resume)' : ' (new)'}` : '';
1222
+ const sessionTag = sessionKey
1223
+ ? ` | session=${sessionKey}${sessionKeySource === 'auto' ? ' (auto)' : ''}${existing ? ' (resume)' : ' (new)'}`
1224
+ : '';
1200
1225
 
1201
1226
  console.log(`[${new Date().toISOString()}] ${body.stream ? 'stream' : 'sync'} | model=${body.model} → ${resolveModel(body.model)} | msgs=${body.messages.length}${sessionTag}`);
1202
1227
 
@@ -1260,9 +1285,15 @@ app.post('/v1/messages', async (req, res) => {
1260
1285
  });
1261
1286
  }
1262
1287
 
1263
- const sessionKey = req.headers['x-session-id'] || body.session_id || null;
1288
+ const { key: sessionKey, source: sessionKeySource } = resolveSessionKey({
1289
+ headerKey: req.headers['x-session-id'],
1290
+ bodyKey: body.session_id,
1291
+ body,
1292
+ });
1264
1293
  const existing = getSession(sessionKey);
1265
- const sessionTag = sessionKey ? ` | session=${sessionKey}${existing ? ' (resume)' : ' (new)'}` : '';
1294
+ const sessionTag = sessionKey
1295
+ ? ` | session=${sessionKey}${sessionKeySource === 'auto' ? ' (auto)' : ''}${existing ? ' (resume)' : ' (new)'}`
1296
+ : '';
1266
1297
 
1267
1298
  console.log(`[${new Date().toISOString()}] anthropic ${body.stream ? 'stream' : 'sync'} | model=${body.model} → ${resolveModel(body.model)} | msgs=${body.messages.length}${sessionTag}`);
1268
1299
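Both route handlers build the same log tag from the resolved key, its source, and whether a stored session exists. The sketch below factors that expression into a small function to show the resulting strings. `formatSessionTag` is a hypothetical name, since the diff inlines this logic in each handler, and the sample keys are made up.

```javascript
// Hypothetical refactor of the inline sessionTag expression in server.js.
function formatSessionTag({ key, source, existing }) {
  if (!key) return ''; // stateless requests get no session tag
  const auto = source === 'auto' ? ' (auto)' : '';
  const state = existing ? ' (resume)' : ' (new)';
  return ` | session=${key}${auto}${state}`;
}

const autoNew = formatSessionTag({
  key: 'auto_0123456789abcdef',
  source: 'auto',
  existing: false,
});
const explicitResume = formatSessionTag({
  key: 'my-session',
  source: 'explicit',
  existing: true,
});
const stateless = formatSessionTag({ key: null, source: 'none', existing: false });
```

With these inputs, `autoNew` is ` | session=auto_0123456789abcdef (auto) (new)`, `explicitResume` is ` | session=my-session (resume)`, and `stateless` is the empty string.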