mobygate 0.7.0 → 0.7.1

package/CHANGELOG.md CHANGED
@@ -4,6 +4,45 @@ All notable changes to mobygate are documented here. Format loosely follows
  [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
  [Semantic Versioning](https://semver.org/).
 
+ ## [0.7.1] — 2026-04-24
+
+ Fixes a meaningful token-burn issue for clients that don't pass session
+ keys.
+
+ ### Added
+
+ - **Auto-derived session keys.** When a request arrives without an
+   `X-Session-Id` header (and without `body.session_id`), mobygate now
+   hashes a stable signature of the conversation — model + system
+   prompt + first user message — and uses that as the session key.
+   Subsequent turns of the same conversation hit the same auto-key,
+   the SDK resume kicks in, and the client only pays input-token cost
+   for the *new* tail of each turn instead of resending 200 messages
+   of history every time.
+
+   Surfaced in logs as `session=auto_<hash> (auto)` so you can tell
+   client-keyed sessions from server-derived ones at a glance. New
+   module: `lib/session-derive.js`.
+
+   In production we observed an OpenClaw client repeatedly sending
+   175–211-message conversation histories without a session key,
+   burning through Max usage in minutes. With this change, the same
+   workload re-uses the SDK session and only the new turn gets billed.
+
+ - **Per-request opt-out:** `X-Session-Id: none` (literal string) tells
+   mobygate to skip auto-derive and run the request fully stateless.
+
+ ### Notes
+
+ - Applies to both `/v1/chat/completions` (OpenAI) and `/v1/messages`
+   (Anthropic) surfaces.
+ - Auto-keys obey the same 60-minute idle TTL as explicit ones, so
+   stale auto-sessions clean themselves up.
+ - Two unrelated users starting with identical model + system + first
+   message would share an auto-session — fine for single-user dev
+   setups, but multi-tenant deployments should pass `X-Session-Id`
+   explicitly to scope per-user.
+
  ## [0.7.0] — 2026-04-24
 
  Phase 2: native Anthropic Messages surface.
package/lib/session-derive.js ADDED
@@ -0,0 +1,164 @@
+ /**
+  * Auto-derive session keys for clients that don't send `X-Session-Id`.
+  *
+  * Why this exists: OpenAI's wire format is stateless by design — clients
+  * are expected to send the entire conversation history with every turn,
+  * and many clients (OpenClaw at the time of writing, plenty of others)
+  * don't bother passing a session identifier. Without one, mobygate
+  * treats every request as a fresh SDK session and the client ends up
+  * paying input-token cost for the full history on every single turn.
+  * On long conversations (175+ messages observed in production), this
+  * burns through Claude Max usage budgets in minutes.
+  *
+  * The fix: when a request arrives without an explicit session key, we
+  * compute one ourselves from a *stable signature* of the conversation —
+  * model + system prompt + first user message. The same conversation
+  * thread produces the same auto-key turn after turn, so the SDK resume
+  * machinery kicks in and only the new tail of each turn gets billed.
+  * Different conversations naturally produce different signatures and
+  * stay isolated. The existing 60-minute idle TTL keeps stale auto-keys
+  * from lingering forever.
+  *
+  * What's hashed (and why each piece):
+  *   - **model** — different agent configs shouldn't share a session.
+  *   - **system** (string or content blocks, plus any system-role
+  *     messages) — typically stable for the lifetime of a conversation
+  *     thread, distinguishes one agent's persona from another's.
+  *   - **first user message text** — anchors the thread. Stable until
+  *     the client prunes it from history; if/when that happens, a new
+  *     auto-key forms and we lose continuity for that one transition.
+  *     Graceful degradation, not a crash.
+  *
+  * Limitations to be aware of:
+  *   - **Collisions across users:** if two unrelated users happen to
+  *     start with the same model + system + first message ("hello"),
+  *     they'd share a session. In single-user dev contexts (Hermes,
+  *     OpenClaw on a personal machine) this is fine. For multi-tenant
+  *     deployments, clients should pass `X-Session-Id` explicitly to
+  *     scope per-user.
+  *   - **History pruning shifts the key:** if the client drops the first
+  *     user message from history mid-conversation, the auto-key changes
+  *     and the SDK starts a new session. One turn of double-billing,
+  *     then we're back on the new key. Acceptable.
+  *
+  * Opt-out: `X-Session-Id: none` tells us the client explicitly wants
+  * stateless behavior — we return null and the request flows through
+  * as a fresh SDK call. (An *empty* X-Session-Id is indistinguishable
+  * from "header not set" at the Express layer, so we treat it as
+  * "no explicit key, please auto-derive" rather than as opt-out.)
+  */
+
+ import { createHash } from 'crypto';
+
+ const HASH_LEN = 16;
+ const SYSTEM_TRIM = 500;
+ const USER_TRIM = 500;
+
+ /**
+  * Extract a flat text representation of a content field that might be
+  * either a string or an array of OpenAI/Anthropic content parts. We
+  * only pull the text — images/tool blocks/etc. are ignored for hashing
+  * because they vary in serialization but don't change conversation
+  * identity.
+  */
+ function flattenContent(content) {
+   if (typeof content === 'string') return content;
+   if (!Array.isArray(content)) return '';
+   const out = [];
+   for (const part of content) {
+     if (typeof part === 'string') out.push(part);
+     else if (part?.type === 'text' && part.text) out.push(part.text);
+     // image_url / image / tool_use / tool_result intentionally skipped
+   }
+   return out.join(' ');
+ }
+
+ /**
+  * Pull the system text out of a request body. The Anthropic surface
+  * carries it on `body.system` (string OR content blocks), the OpenAI
+  * surface carries it as messages with `role: 'system'`. Combine both.
+  */
+ function extractSystemText(body) {
+   let parts = [];
+   if (typeof body?.system === 'string') {
+     parts.push(body.system);
+   } else if (Array.isArray(body?.system)) {
+     parts.push(flattenContent(body.system));
+   }
+   for (const msg of body?.messages || []) {
+     if (msg?.role === 'system') {
+       parts.push(flattenContent(msg.content));
+     }
+   }
+   return parts.join('\n').slice(0, SYSTEM_TRIM);
+ }
+
+ /**
+  * First user-role message in the array, flattened to text. We use the
+  * first (oldest) one because it's the most stable anchor — later turns
+  * change every request.
+  */
+ function extractFirstUserText(body) {
+   for (const msg of body?.messages || []) {
+     if (msg?.role === 'user') {
+       const text = flattenContent(msg.content);
+       if (text) return text.slice(0, USER_TRIM);
+     }
+   }
+   return '';
+ }
+
+ /**
+  * Compute a stable session key from a request body. Returns a string
+  * like `auto_<16hex>` when there's enough signal to hash, or `null`
+  * when the body is too sparse (no model, no system, no user text — the
+  * caller should fall through to stateless behavior in that case).
+  *
+  * The hash uses SHA-256 truncated to 16 hex chars (~64 bits of
+  * collision space). A few orders of magnitude more than needed for the
+  * "same conversation prefix" matching use case.
+  */
+ export function deriveSessionKey(body) {
+   const model = body?.model || '';
+   const system = extractSystemText(body);
+   const firstUser = extractFirstUserText(body);
+
+   // Need at least *something* to anchor on. If the request has no
+   // model and no user message, there's literally nothing to identify
+   // the conversation with — better to return null and let the caller
+   // run stateless than to bucket everything into the same auto-key.
+   if (!model && !system && !firstUser) return null;
+   if (!firstUser) return null; // first user msg is the anchor; no anchor → no auto-key
+
+   const signature = [model, system, firstUser].join('||');
+   const digest = createHash('sha256').update(signature).digest('hex').slice(0, HASH_LEN);
+   return `auto_${digest}`;
+ }
+
+ /**
+  * Resolve the effective session key for a request. Order:
+  *   1. Explicit `X-Session-Id` header (or `body.session_id`) wins.
+  *      Special value `'none'` means "explicitly stateless" and
+  *      short-circuits to null without auto-deriving.
+  *   2. Auto-derived key from the conversation signature.
+  *   3. null (stateless) — only when there's nothing useful to hash.
+  *
+  * Returns `{ key, source }` where source is `'explicit' | 'auto' | 'none'`.
+  * The source label is informational — server.js logs it and the dashboard
+  * shows it so you can tell at a glance whether a session was client-keyed
+  * or server-derived.
+  */
+ export function resolveSessionKey({ headerKey, bodyKey, body }) {
+   const explicit = headerKey || bodyKey;
+   if (explicit) {
+     const trimmed = String(explicit).trim();
+     if (trimmed.toLowerCase() === 'none') {
+       return { key: null, source: 'none' };
+     }
+     if (trimmed) return { key: trimmed, source: 'explicit' };
+   }
+
+   const derived = deriveSessionKey(body);
+   if (derived) return { key: derived, source: 'auto' };
+   return { key: null, source: 'none' };
+ }
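The module's header comment claims that later turns of one thread keep the same auto-key, that pruning the first user message shifts it, and that `none` opts out. The standalone sketch below checks those properties. It is a condensed approximation, not the shipped code: `sketchDeriveKey`, `sketchResolveKey`, and `textForRole` are hypothetical names, it assumes string message content and OpenAI-style system messages only, and it uses CommonJS `require` so it can run as a single script.

```javascript
// Condensed, hypothetical mirror of the derivation in the diff above.
// Assumption: message content is a plain string; the real module also
// flattens content-part arrays and reads Anthropic-style `body.system`.
const { createHash } = require('node:crypto');

const TRIM = 500; // same trim length the module applies to system and first-user text

// Text of every message with the given role, in order.
function textForRole(messages, role) {
  return messages
    .filter((m) => m && m.role === role && typeof m.content === 'string')
    .map((m) => m.content);
}

function sketchDeriveKey(body) {
  const messages = Array.isArray(body?.messages) ? body.messages : [];
  const system = textForRole(messages, 'system').join('\n').slice(0, TRIM);
  const firstUser = (textForRole(messages, 'user').find((t) => t) || '').slice(0, TRIM);
  if (!firstUser) return null; // no anchor, so no auto-key
  const signature = [body?.model || '', system, firstUser].join('||');
  return 'auto_' + createHash('sha256').update(signature).digest('hex').slice(0, 16);
}

// Precedence: explicit key, with the literal 'none' as opt-out, then auto-derive.
function sketchResolveKey({ headerKey, bodyKey, body }) {
  const explicit = String(headerKey || bodyKey || '').trim();
  if (explicit.toLowerCase() === 'none') return { key: null, source: 'none' };
  if (explicit) return { key: explicit, source: 'explicit' };
  const derived = sketchDeriveKey(body);
  return derived ? { key: derived, source: 'auto' } : { key: null, source: 'none' };
}

// 'example-model' is a placeholder, not a real model id.
const turn1 = { model: 'example-model', messages: [
  { role: 'system', content: 'You are terse.' },
  { role: 'user', content: 'hello' },
] };
const turn2 = { model: 'example-model', messages: [
  ...turn1.messages,
  { role: 'assistant', content: 'hi' },
  { role: 'user', content: 'what changed in 0.7.1?' },
] };
// Same history after the client has pruned the first user message.
const pruned = { ...turn2, messages: turn2.messages.filter((m) => m.content !== 'hello') };

const key1 = sketchDeriveKey(turn1);
const key2 = sketchDeriveKey(turn2);
const key3 = sketchDeriveKey(pruned);

console.log(key1 === key2); // true: later turns of the same thread share the key
console.log(key1 === key3); // false: pruning the first user message shifts the key
console.log(sketchResolveKey({ headerKey: 'none', body: turn1 }).source); // none
console.log(sketchResolveKey({ headerKey: ' abc ', body: turn1 }).key);   // abc
console.log(sketchResolveKey({ body: turn1 }).source);                    // auto
```

Because the signature is built only from the model, the system text, and the first user message, any two requests that agree on those three inputs (after trimming to 500 characters) map to the same key, which is the cross-user collision the changelog warns about.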
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "mobygate",
-   "version": "0.7.0",
+   "version": "0.7.1",
    "description": "OpenAI-compatible local proxy for Claude Max. The Möbius-strip gateway: OpenAI shape in, Claude Max out.",
    "type": "module",
    "main": "server.js",
package/server.js CHANGED
@@ -76,6 +76,7 @@ import {
    hasAnthropicTools,
    mapStopReason,
  } from './lib/anthropic.js';
+ import { resolveSessionKey } from './lib/session-derive.js';
 
  const __filename = fileURLToPath(import.meta.url);
  const __dirname = dirname(__filename);
@@ -1193,10 +1194,20 @@ app.post('/v1/chat/completions', async (req, res) => {
      });
    }
 
-   // Session key: X-Session-Id header > body.session_id > null (stateless)
-   const sessionKey = req.headers['x-session-id'] || body.session_id || null;
+   // Session key resolution: X-Session-Id header > body.session_id >
+   // auto-derived from conversation signature > null (stateless).
+   // Auto-derive protects clients that don't pass a session header from
+   // re-paying input-token cost on every turn of a long conversation —
+   // see lib/session-derive.js for the rationale and trade-offs.
+   const { key: sessionKey, source: sessionKeySource } = resolveSessionKey({
+     headerKey: req.headers['x-session-id'],
+     bodyKey: body.session_id,
+     body,
+   });
    const existing = getSession(sessionKey);
-   const sessionTag = sessionKey ? ` | session=${sessionKey}${existing ? ' (resume)' : ' (new)'}` : '';
+   const sessionTag = sessionKey
+     ? ` | session=${sessionKey}${sessionKeySource === 'auto' ? ' (auto)' : ''}${existing ? ' (resume)' : ' (new)'}`
+     : '';
 
    console.log(`[${new Date().toISOString()}] ${body.stream ? 'stream' : 'sync'} | model=${body.model} → ${resolveModel(body.model)} | msgs=${body.messages.length}${sessionTag}`);
 
@@ -1260,9 +1271,15 @@ app.post('/v1/messages', async (req, res) => {
      });
    }
 
-   const sessionKey = req.headers['x-session-id'] || body.session_id || null;
+   const { key: sessionKey, source: sessionKeySource } = resolveSessionKey({
+     headerKey: req.headers['x-session-id'],
+     bodyKey: body.session_id,
+     body,
+   });
    const existing = getSession(sessionKey);
-   const sessionTag = sessionKey ? ` | session=${sessionKey}${existing ? ' (resume)' : ' (new)'}` : '';
+   const sessionTag = sessionKey
+     ? ` | session=${sessionKey}${sessionKeySource === 'auto' ? ' (auto)' : ''}${existing ? ' (resume)' : ' (new)'}`
+     : '';
 
    console.log(`[${new Date().toISOString()}] anthropic ${body.stream ? 'stream' : 'sync'} | model=${body.model} → ${resolveModel(body.model)} | msgs=${body.messages.length}${sessionTag}`);
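Both handlers build the same `sessionTag` string inline. As a rough illustration of what that expression yields for each combination of inputs, the sketch below factors it into a function. `formatSessionTag` is a hypothetical helper, not something mobygate defines, and the `existing` argument stands in for whatever `getSession` returned.

```javascript
// Hypothetical helper mirroring the inline sessionTag expression in both handlers.
function formatSessionTag(sessionKey, sessionKeySource, existing) {
  if (!sessionKey) return ''; // stateless requests get no session tag
  const auto = sessionKeySource === 'auto' ? ' (auto)' : '';
  const state = existing ? ' (resume)' : ' (new)';
  return ` | session=${sessionKey}${auto}${state}`;
}

// JSON.stringify is used only to make the leading space and empty string visible.
console.log(JSON.stringify(formatSessionTag('auto_0123456789abcdef', 'auto', null)));
// " | session=auto_0123456789abcdef (auto) (new)"
console.log(JSON.stringify(formatSessionTag('my-session', 'explicit', { id: 1 })));
// " | session=my-session (resume)"
console.log(JSON.stringify(formatSessionTag(null, 'none', null)));
// ""
```

Note that the `(auto)` marker is followed by `(new)` or `(resume)` in the actual log line, so an auto-keyed session reads as `session=auto_<hash> (auto) (resume)` once the SDK session is being reused.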