mobygate 0.7.0 → 0.7.1 (npm package version diff)

Files changed (4)

package/CHANGELOG.md CHANGED

@@ -4,6 +4,45 @@ All notable changes to mobygate are documented here. Format loosely follows
 [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
 [Semantic Versioning](https://semver.org/).
+## [0.7.1] — 2026-04-24
+Fixes a meaningful token-burn issue for clients that don't pass session
+keys.
+### Added
+- **Auto-derived session keys.** When a request arrives without an
+  `X-Session-Id` header (and without `body.session_id`), mobygate now
+  hashes a stable signature of the conversation — model + system
+  prompt + first user message — and uses that as the session key.
+  Subsequent turns of the same conversation hit the same auto-key,
+  the SDK resume kicks in, and the client only pays input-token cost
+  for the *new* tail of each turn instead of resending 200 messages
+  of history every time.
+  Surfaced in logs as `session=auto_<hash> (auto)` so you can tell
+  client-keyed sessions from server-derived ones at a glance. New
+  module: `lib/session-derive.js`.
+  In production we observed an OpenClaw client repeatedly sending
+  175–211-message conversation histories without a session key,
+  burning through Max usage in minutes. With this change, the same
+  workload re-uses the SDK session and only the new turn gets billed.
+- **Per-request opt-out:** `X-Session-Id: none` (literal string) tells
+  mobygate to skip auto-derive and run the request fully stateless.
+### Notes
+- Applies to both `/v1/chat/completions` (OpenAI) and `/v1/messages`
+  (Anthropic) surfaces.
+- Auto-keys obey the same 60-minute idle TTL as explicit ones, so
+  stale auto-sessions clean themselves up.
+- Two unrelated users starting with identical model + system + first
+  message would share an auto-session — fine for single-user dev
+  setups, but multi-tenant deployments should pass `X-Session-Id`
+  explicitly to scope per-user.
 ## [0.7.0] — 2026-04-24
 Phase 2: native Anthropic Messages surface.
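
For the multi-tenant note and the per-request opt-out in the 0.7.1 entry, a client might set the header along the following lines. This is only a sketch: `buildSessionHeaders`, the `userId:conversationId` key format, and the option names are hypothetical and not part of mobygate. The only behavior taken from the changelog is that an explicit `X-Session-Id` scopes the session and that the literal value `none` requests a stateless call.

```javascript
// Hypothetical client-side helper (not part of mobygate).
// Builds request headers for a call to a mobygate endpoint.
function buildSessionHeaders({ userId, conversationId, stateless = false } = {}) {
  const headers = { 'Content-Type': 'application/json' };
  if (stateless) {
    // The literal string "none" opts this request out of auto-derived keys.
    headers['X-Session-Id'] = 'none';
  } else if (userId && conversationId) {
    // Assumed key format: scoping by user keeps two users who open with
    // the same model + system prompt + first message in separate sessions.
    headers['X-Session-Id'] = `${userId}:${conversationId}`;
  }
  // With neither option set, no header is sent and the server falls back
  // to its auto-derived key.
  return headers;
}

console.log(buildSessionHeaders({ userId: 'alice', conversationId: 'c1' }));
console.log(buildSessionHeaders({ stateless: true }));
```

Leaving the header out entirely remains valid for single-user setups, where the auto-derived key is sufficient.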

package/lib/session-derive.js ADDED

@@ -0,0 +1,164 @@
+/**
+ * Auto-derive session keys for clients that don't send `X-Session-Id`.
+ *
+ * Why this exists: OpenAI's wire format is stateless by design — clients
+ * are expected to send the entire conversation history with every turn,
+ * and many clients (OpenClaw at the time of writing, plenty of others)
+ * don't bother passing a session identifier. Without one, mobygate
+ * treats every request as a fresh SDK session and the client ends up
+ * paying input-token cost for the full history on every single turn.
+ * On long conversations (175+ messages observed in production), this
+ * burns through Claude Max usage budgets in minutes.
+ *
+ * The fix: when a request arrives without an explicit session key, we
+ * compute one ourselves from a *stable signature* of the conversation —
+ * model + system prompt + first user message. The same conversation
+ * thread produces the same auto-key turn after turn, so the SDK resume
+ * machinery kicks in and only the new tail of each turn gets billed.
+ * Different conversations naturally produce different signatures and
+ * stay isolated. The existing 60-minute idle TTL keeps stale auto-keys
+ * from lingering forever.
+ *
+ * What's hashed (and why each piece):
+ *   - **model** — different agent configs shouldn't share a session.
+ *   - **system** (string or content blocks, plus any system-role
+ *     messages) — typically stable for the lifetime of a conversation
+ *     thread, distinguishes one agent's persona from another's.
+ *   - **first user message text** — anchors the thread. Stable until
+ *     the client prunes it from history; if/when that happens, a new
+ *     auto-key forms and we lose continuity for that one transition.
+ *     Graceful degradation, not a crash.
+ *
+ * Limitations to be aware of:
+ *   - **Collisions across users:** if two unrelated users happen to
+ *     start with the same model + system + first message ("hello"),
+ *     they'd share a session. In single-user dev contexts (Hermes,
+ *     OpenClaw on a personal machine) this is fine. For multi-tenant
+ *     deployments, clients should pass `X-Session-Id` explicitly to
+ *     scope per-user.
+ *   - **History pruning shifts the key:** if the client drops the first
+ *     user message from history mid-conversation, the auto-key changes
+ *     and the SDK starts a new session. One turn of double-billing,
+ *     then we're back on the new key. Acceptable.
+ *
+ * Opt-out: `X-Session-Id: none` tells us the client explicitly wants
+ * stateless behavior — we return null and the request flows through
+ * as a fresh SDK call. (An *empty* X-Session-Id is indistinguishable
+ * from "header not set" at the Express layer, so we treat it as
+ * "no explicit key, please auto-derive" rather than as opt-out.)
+ */
+import { createHash } from 'crypto';
+const HASH_LEN = 16;
+const SYSTEM_TRIM = 500;
+const USER_TRIM = 500;
+/**
+ * Extract a flat text representation of a content field that might be
+ * either a string or an array of OpenAI/Anthropic content parts. We
+ * only pull the text — images/tool blocks/etc. are ignored for hashing
+ * because they vary in serialization but don't change conversation
+ * identity.
+ */
+function flattenContent(content) {
+  if (typeof content === 'string') return content;
+  if (!Array.isArray(content)) return '';
+  const out = [];
+  for (const part of content) {
+    if (typeof part === 'string') out.push(part);
+    else if (part?.type === 'text' && part.text) out.push(part.text);
+    // image_url / image / tool_use / tool_result intentionally skipped
+  }
+  return out.join(' ');
+}
+/**
+ * Pull the system text out of a request body. The Anthropic surface
+ * carries it on `body.system` (string OR content blocks), the OpenAI
+ * surface carries it as messages with `role: 'system'`. Combine both.
+ */
+function extractSystemText(body) {
+  let parts = [];
+  if (typeof body?.system === 'string') {
+    parts.push(body.system);
+  } else if (Array.isArray(body?.system)) {
+    parts.push(flattenContent(body.system));
+  }
+  for (const msg of body?.messages || []) {
+    if (msg?.role === 'system') {
+      parts.push(flattenContent(msg.content));
+    }
+  }
+  return parts.join('\n').slice(0, SYSTEM_TRIM);
+}
+/**
+ * First user-role message in the array, flattened to text. We use the
+ * first (oldest) one because it's the most stable anchor — later turns
+ * change every request.
+ */
+function extractFirstUserText(body) {
+  for (const msg of body?.messages || []) {
+    if (msg?.role === 'user') {
+      const text = flattenContent(msg.content);
+      if (text) return text.slice(0, USER_TRIM);
+    }
+  }
+  return '';
+}
+/**
+ * Compute a stable session key from a request body. Returns a string
+ * like `auto_<16hex>` when the body has a user message with text, or
+ * `null` when it does not (the first user message is the anchor, so the
+ * caller should fall through to stateless behavior in that case).
+ *
+ * The hash uses SHA-256 truncated to 16 hex chars (~64 bits of
+ * collision space). A few orders of magnitude more than needed for the
+ * "same conversation prefix" matching use case.
+ */
+export function deriveSessionKey(body) {
+  const model = body?.model || '';
+  const system = extractSystemText(body);
+  const firstUser = extractFirstUserText(body);
+  // Need at least *something* to anchor on. If the request has no
+  // model and no user message, there's literally nothing to identify
+  // the conversation with — better to return null and let the caller
+  // run stateless than to bucket everything into the same auto-key.
+  if (!model && !system && !firstUser) return null;
+  if (!firstUser) return null; // first user msg is the anchor; no anchor → no auto-key
+  const signature = [model, system, firstUser].join('||');
+  const digest = createHash('sha256').update(signature).digest('hex').slice(0, HASH_LEN);
+  return `auto_${digest}`;
+}
+/**
+ * Resolve the effective session key for a request. Order:
+ *   1. Explicit `X-Session-Id` header (or `body.session_id`) wins.
+ *      Special value `'none'` means "explicitly stateless" and
+ *      short-circuits to null without auto-deriving.
+ *   2. Auto-derived key from the conversation signature.
+ *   3. null (stateless) — only when there's nothing useful to hash.
+ *
+ * Returns `{ key, source }` where source is `'explicit' | 'auto' | 'none'`.
+ * The source label is informational — server.js logs it and the dashboard
+ * shows it so you can tell at a glance whether a session was client-keyed
+ * or server-derived.
+ */
+export function resolveSessionKey({ headerKey, bodyKey, body }) {
+  const explicit = headerKey || bodyKey;
+  if (explicit) {
+    const trimmed = String(explicit).trim();
+    if (trimmed.toLowerCase() === 'none') {
+      return { key: null, source: 'none' };
+    }
+    if (trimmed) return { key: trimmed, source: 'explicit' };
+  }
+  const derived = deriveSessionKey(body);
+  if (derived) return { key: derived, source: 'auto' };
+  return { key: null, source: 'none' };
+}
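
The behavior of the module above can be checked with a small standalone script. The sketch below is a condensed re-implementation, not the shipped file: `flatten`, `sketchDeriveKey`, and `sketchResolveKey` are illustrative names, the model string is a placeholder, and it uses `require()` so that it runs as a single script even though the package itself is an ES module. It is meant to show two properties: later turns of one conversation map to the same auto-key, and the explicit / `none` / auto precedence matches the order documented in `resolveSessionKey`.

```javascript
const { createHash } = require('node:crypto');

const HASH_LEN = 16;
const TRIM = 500;

// Text-only view of a string or content-part array; non-text parts are skipped.
function flatten(content) {
  if (typeof content === 'string') return content;
  if (!Array.isArray(content)) return '';
  const out = [];
  for (const part of content) {
    if (typeof part === 'string') out.push(part);
    else if (part?.type === 'text' && part.text) out.push(part.text);
  }
  return out.join(' ');
}

// Hash model + system text + first user text into `auto_<16 hex chars>`.
function sketchDeriveKey(body) {
  const messages = body?.messages || [];
  const systemParts = [];
  if (typeof body?.system === 'string' || Array.isArray(body?.system)) {
    systemParts.push(flatten(body.system));
  }
  for (const m of messages) {
    if (m?.role === 'system') systemParts.push(flatten(m.content));
  }
  const system = systemParts.join('\n').slice(0, TRIM);

  let firstUser = '';
  for (const m of messages) {
    if (m?.role !== 'user') continue;
    const text = flatten(m.content);
    if (text) { firstUser = text.slice(0, TRIM); break; }
  }
  if (!firstUser) return null; // no anchor, so no auto-key

  const signature = [body?.model || '', system, firstUser].join('||');
  const digest = createHash('sha256').update(signature).digest('hex');
  return `auto_${digest.slice(0, HASH_LEN)}`;
}

// Explicit key wins; literal "none" forces stateless; otherwise auto-derive.
function sketchResolveKey({ headerKey, bodyKey, body }) {
  const trimmed = String(headerKey || bodyKey || '').trim();
  if (trimmed.toLowerCase() === 'none') return { key: null, source: 'none' };
  if (trimmed) return { key: trimmed, source: 'explicit' };
  const derived = sketchDeriveKey(body);
  return derived ? { key: derived, source: 'auto' } : { key: null, source: 'none' };
}

// Two turns of the same conversation; 'example-model' is a placeholder.
const turn1 = {
  model: 'example-model',
  messages: [
    { role: 'system', content: 'You are terse.' },
    { role: 'user', content: 'hello' },
  ],
};
const turn2 = {
  ...turn1,
  messages: [
    ...turn1.messages,
    { role: 'assistant', content: 'hi' },
    { role: 'user', content: [{ type: 'text', text: 'next question' }] },
  ],
};

const k1 = sketchDeriveKey(turn1);
const k2 = sketchDeriveKey(turn2);
console.log(k1 === k2); // true: only the tail changed, the anchor did not
console.log(sketchResolveKey({ headerKey: 'my-session', body: turn1 }).source); // explicit
console.log(sketchResolveKey({ headerKey: 'none', body: turn1 }).source); // none
console.log(sketchResolveKey({ body: turn1 }).source); // auto
```

Changing the model, the system prompt, or the first user message produces a different signature and therefore a different key, which is the isolation property the header comment describes.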

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "mobygate",
-  "version": "0.7.0",
+  "version": "0.7.1",
   "description": "OpenAI-compatible local proxy for Claude Max. The Möbius-strip gateway: OpenAI shape in, Claude Max out.",
   "type": "module",
   "main": "server.js",

package/server.js CHANGED

@@ -76,6 +76,7 @@ import {
   hasAnthropicTools,
   mapStopReason,
 } from './lib/anthropic.js';
+import { resolveSessionKey } from './lib/session-derive.js';
 const __filename = fileURLToPath(import.meta.url);
 const __dirname = dirname(__filename);
@@ -1193,10 +1194,20 @@ app.post('/v1/chat/completions', async (req, res) => {
     });
   }
-  // Session key: X-Session-Id header > body.session_id > null (stateless)
-  const sessionKey = req.headers['x-session-id'] || body.session_id || null;
+  // Session key resolution: X-Session-Id header > body.session_id >
+  // auto-derived from conversation signature > null (stateless).
+  // Auto-derive protects clients that don't pass a session header from
+  // re-paying input-token cost on every turn of a long conversation —
+  // see lib/session-derive.js for the rationale and trade-offs.
+  const { key: sessionKey, source: sessionKeySource } = resolveSessionKey({
+    headerKey: req.headers['x-session-id'],
+    bodyKey: body.session_id,
+    body,
+  });
   const existing = getSession(sessionKey);
-  const sessionTag = sessionKey ? ` | session=${sessionKey}${existing ? ' (resume)' : ' (new)'}` : '';
+  const sessionTag = sessionKey
+    ? ` | session=${sessionKey}${sessionKeySource === 'auto' ? ' (auto)' : ''}${existing ? ' (resume)' : ' (new)'}`
+    : '';
   console.log(`[${new Date().toISOString()}] ${body.stream ? 'stream' : 'sync'} | model=${body.model} → ${resolveModel(body.model)} | msgs=${body.messages.length}${sessionTag}`);
@@ -1260,9 +1271,15 @@ app.post('/v1/messages', async (req, res) => {
     });
   }
-  const sessionKey = req.headers['x-session-id'] || body.session_id || null;
+  const { key: sessionKey, source: sessionKeySource } = resolveSessionKey({
+    headerKey: req.headers['x-session-id'],
+    bodyKey: body.session_id,
+    body,
+  });
   const existing = getSession(sessionKey);
-  const sessionTag = sessionKey ? ` | session=${sessionKey}${existing ? ' (resume)' : ' (new)'}` : '';
+  const sessionTag = sessionKey
+    ? ` | session=${sessionKey}${sessionKeySource === 'auto' ? ' (auto)' : ''}${existing ? ' (resume)' : ' (new)'}`
+    : '';
   console.log(`[${new Date().toISOString()}] anthropic ${body.stream ? 'stream' : 'sync'} | model=${body.model} → ${resolveModel(body.model)} | msgs=${body.messages.length}${sessionTag}`);
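
Both handlers build the same log suffix from the resolved key, its source, and whether a session already exists. As a rough illustration of what that template yields, the sketch below extracts it into a pure function; `formatSessionTag` is a hypothetical name (the diff keeps the expression inline), and the key values are made up.

```javascript
// Hypothetical extraction of the inline template used in both handlers.
function formatSessionTag(sessionKey, source, existing) {
  if (!sessionKey) return ''; // stateless requests get no session tag
  const auto = source === 'auto' ? ' (auto)' : '';
  const state = existing ? ' (resume)' : ' (new)';
  return ` | session=${sessionKey}${auto}${state}`;
}

// First turn of an auto-keyed conversation: no stored session yet.
console.log(JSON.stringify(formatSessionTag('auto_0123456789abcdef', 'auto', null)));
// prints " | session=auto_0123456789abcdef (auto) (new)"

// Later turn with a client-supplied key and an existing session object.
console.log(JSON.stringify(formatSessionTag('my-session', 'explicit', { id: 1 })));
// prints " | session=my-session (resume)"

// Opt-out or nothing to hash: the tag is empty.
console.log(JSON.stringify(formatSessionTag(null, 'none', null)));
// prints ""
```

An auto-derived key therefore shows both markers in the log line, `(auto)` followed by `(new)` or `(resume)`, while explicit keys show only the latter.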