mobygate 0.5.3 → 0.6.0

package/CHANGELOG.md CHANGED
@@ -4,6 +4,83 @@ All notable changes to mobygate are documented here. Format loosely follows
4
4
  [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
5
5
  [Semantic Versioning](https://semver.org/).
6
6
 
7
+ ## [0.6.0] — 2026-04-24
8
+
9
+ Big one. Native tool calling + in-dashboard self-update.
10
+
11
+ ### Added
12
+
13
+ - **Native MCP tool calling.** Client-supplied OpenAI tools are now
14
+ registered with the Claude Agent SDK as in-process MCP tools (with
15
+ Zod schemas converted from JSON Schema). The model emits genuine
16
+ `tool_use` content blocks instead of the old `<tool_call>...`
17
+ text-pattern hack. Tool IDs returned to clients are now Anthropic-
18
+ native `toolu_*` strings, not synthesized `call_*` ones. New module:
19
+ `lib/tool-bridge.js`.
20
+ - **Dashboard update banner.** When npm has a newer mobygate, the
21
+ dashboard shows an orange pill at the top: `v0.6.0 → v0.6.1 available
22
+ · npm install · [changelog] [dismiss] [update now]`. Clicking
23
+ "update now" fires `npm install -g mobygate@latest` (or `git pull`
24
+ for git-mode installs) in a detached child process, restarts the
25
+ service, and auto-reloads the page. Dismissals stick per-version
26
+ via localStorage. New module: `lib/updater.js`.
27
+ - New endpoints: `GET /update/check`, `POST /update/apply`,
28
+ `GET /update/status`. The check endpoint caches the npm registry
29
+ lookup for 15 minutes so dashboards open all day don't hammer it.
30
+
31
+ ### Changed
32
+
33
+ - **No more prompt-injected tool definitions.** The `<system>...</system>`
34
+ block listing available tools as XML is gone — the SDK's MCP
35
+ registration is the model's source of truth now. This shrinks every
36
+ tool-enabled prompt by ~200-500 tokens depending on tool count.
37
+ - **Tool-flow detection** moved from text-pattern matching
38
+ (`hasCompleteToolCall`, `parseToolCalls` regexes) to native
39
+ `tool_use` content-block detection in the assistant message stream
40
+ (`hasToolUse`, `extractToolUses`). The moment a tool_use lands,
41
+ we abort the SDK and emit OpenAI-shape `tool_calls`.
42
+ - **`alwaysLoad: true`** on every registered tool. Without this, the
43
+ SDK lazily defers MCP tool schemas — the model has to call the
44
+ built-in `ToolSearch` tool to fetch each definition before invoking,
45
+ which leaks through to OpenAI clients as a confusing tool_call
46
+ for `ToolSearch` instead of their actual tool. Eager loading
47
+ keeps the surface clean.
48
+
49
+ ### Removed
50
+
51
+ - `buildToolInstructions` — the `<tool_call>...` protocol prose.
52
+ - `parseToolCalls` — the regex parser for `<tool_call>` JSON blocks.
53
+ - `hasCompleteToolCall` — the streaming-buffer heuristic that aborted
54
+ the SDK when a complete tag pair appeared.
55
+ - `formatAssistantForReplay`'s tool_calls→`<tool_call>` text
56
+ serialization (assistant replay is now best-effort text only).
57
+ - The "Use the tool results above to continue toward the final answer"
58
+ nudge — tool results are visible in conversation context now, so
59
+ the model handles continuation naturally without coaxing.
60
+
61
+ ### Known limitation (Phase 1 deliberate)
62
+
63
+ - Tool *results* coming back from the client are still spliced as
64
+ `<tool_results>` text in the resumed prompt, not native Anthropic
65
+ `tool_result` content blocks. Reason: aborting the SDK on a
66
+ `tool_use` block prevents the assistant turn from being persisted
67
+ in session state — on resume, native tool_result blocks have
68
+ nothing to bind to and the model re-calls the tool. Text-form
69
+ results work because the resumed model has the prior turn in
70
+ context. Phase 2's full Anthropic Messages wire surface will
71
+ keep the SDK alive through the tool turn and switch to native
72
+ tool_result blocks end-to-end.
73
+
74
+ ### Migration
75
+
76
+ - No client-facing changes. Existing OpenAI-shape requests with
77
+ `tools: [...]` work the same as before — what's improved is
78
+ reliability ("Model returned empty after tool calls" warnings
79
+ should largely disappear) and surface fidelity (tool_call IDs
80
+ are now native Anthropic IDs, not synthesized).
81
+ - Update with `mobygate update` (CLI) or click the new "update now"
82
+ button in the dashboard once it appears.
83
+
7
84
  ## [0.5.3] — 2026-04-19
8
85
 
9
86
  Security pass.
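
The 0.6.0 notes above say the check endpoint caches the npm registry lookup for 15 minutes. A minimal sketch of that kind of time-based cache follows. `makeCachedLookup` and the injectable `now` clock are illustrative assumptions, not mobygate's API (the package's own cache lives in `lib/updater.js`).

```javascript
// Hypothetical helper, not part of mobygate: wraps a lookup function in a
// time-to-live cache. `now` is injectable so the behavior can be tested.
function makeCachedLookup(lookup, ttlMs, now = () => Date.now()) {
  let cached = { value: null, fetchedAt: 0 };
  return function get({ force = false } = {}) {
    const t = now();
    const fresh = cached.value !== null && t - cached.fetchedAt < ttlMs;
    if (!force && fresh) return { value: cached.value, cached: true };
    const value = lookup();
    // A failed lookup (null) is never treated as fresh, so the next call retries.
    cached = { value, fetchedAt: t };
    return { value, cached: false };
  };
}

// Usage: a fake registry lookup that counts how often it is hit.
let hits = 0;
let clock = 0;
const getLatest = makeCachedLookup(() => { hits += 1; return '0.6.1'; }, 15 * 60 * 1000, () => clock);
const first = getLatest();   // miss: nothing cached yet
clock += 5 * 60 * 1000;
const second = getLatest();  // hit: 5 minutes old
clock += 11 * 60 * 1000;
const third = getLatest();   // miss: 16 minutes old, past the TTL
```

With a 30-minute dashboard poll and a 15-minute TTL, a single dashboard will usually miss the cache on each poll. The cache matters mostly when several tabs or page reloads hit the endpoint close together.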
package/index.html CHANGED
@@ -49,6 +49,34 @@
49
49
  <body class="antialiased">
50
50
  <div class="mx-auto px-12 pt-8 pb-7 flex flex-col gap-6 max-w-[1440px] min-h-screen">
51
51
 
52
+ <!-- ===== Update banner ===== -->
53
+ <!-- Hidden until /update/check reports updateAvailable=true. During
54
+ apply, this becomes a progress strip showing live log tail. -->
55
+ <section id="updateBanner" style="display:none" class="items-center gap-4 py-3 px-5 bg-[#121210] border-l-2 border-l-[#E89B2E] border-t border-b border-r border-[#2A2A1F] rounded-r-md">
56
+ <div class="flex items-center gap-2.5">
57
+ <span class="rounded-full bg-[#E89B2E] w-2 h-2 pulse-dot"></span>
58
+ <span class="uppercase text-[#E89B2E] font-medium text-[10px] tracking-[0.22em]">Update</span>
59
+ </div>
60
+ <div id="updateBannerText" class="grow text-[#F3EFE4] text-xs leading-4"></div>
61
+ <div id="updateBannerActions" class="flex items-center gap-2 shrink-0">
62
+ <a id="updateBannerChangelog" href="https://github.com/khnfrhn/mobygate/blob/master/CHANGELOG.md" target="_blank" rel="noreferrer" class="text-[#8A9A6A] hover:text-[#C9D9A8] text-[11px] tracking-[0.04em] underline decoration-dotted">changelog</a>
63
+ <button id="updateDismissBtn" class="rounded-full py-1.5 px-3 border border-[#2A2A1F] text-[#8A9A6A] hover:text-[#C9D9A8] hover:border-[#5A5F54] font-medium text-[11px] tracking-[0.04em] transition">dismiss</button>
64
+ <button id="updateApplyBtn" class="rounded-full py-1.5 px-3.5 bg-[#E89B2E] hover:brightness-110 text-[#0B0B09] font-bold text-[11px] tracking-[0.04em] transition">update now</button>
65
+ </div>
66
+ </section>
67
+ <!-- Apply-in-progress shelf: expands below the banner during update. -->
68
+ <section id="updateProgress" style="display:none" class="flex-col gap-2 py-3 px-5 bg-[#121210] border border-[#2A2A1F] rounded-md">
69
+ <div class="flex items-center justify-between">
70
+ <div class="flex items-center gap-2">
71
+ <span id="updateSpinner" class="rounded-full bg-[#E89B2E] w-2 h-2 pulse-dot"></span>
72
+ <span id="updateProgressTitle" class="uppercase text-[#C9D9A8] font-medium text-[10px] tracking-[0.22em]">Installing</span>
73
+ <span id="updateProgressSub" class="text-[#5A5F54] text-[11px]"></span>
74
+ </div>
75
+ <button id="updateProgressClose" style="display:none" class="text-[#5A5F54] hover:text-[#C9D9A8] text-[11px]">close ✕</button>
76
+ </div>
77
+ <pre id="updateProgressLog" class="text-[11px] leading-[15px] text-[#8A9A6A] max-h-[180px] overflow-auto whitespace-pre-wrap m-0"></pre>
78
+ </section>
79
+
52
80
  <!-- ===== Header ===== -->
53
81
  <header class="flex justify-between items-center shrink-0">
54
82
  <div class="flex items-center gap-[22px]">
@@ -804,6 +832,117 @@
804
832
  }
805
833
  }, 1000);
806
834
 
835
+ // ───────────────────────── Updater
836
+ // Dashboard-driven upgrade flow. On load (and every 30 min) we ask
837
+ // /update/check whether a newer mobygate is on npm. If so, a pill
838
+ // appears at the top of the page — click "update now" to fire the
839
+ // update, watch log lines stream in, then auto-reload when the new
840
+ // server is up. The child process is detached, so the server
841
+ // restart doesn't orphan it.
842
+ const UPDATE_DISMISS_KEY = 'mobygate:update:dismissedVersion';
843
+ let updateInfo = null;
844
+ let updatePollTimer = null;
845
+
846
+ function showBanner(info) {
847
+ if (!info?.updateAvailable) {
848
+ $('updateBanner').style.display = 'none';
849
+ return;
850
+ }
851
+ // Respect dismissal: if the user dismissed this exact version, don't
852
+ // re-pester until a newer one lands.
853
+ const dismissed = localStorage.getItem(UPDATE_DISMISS_KEY);
854
+ if (dismissed === info.latest) {
855
+ $('updateBanner').style.display = 'none';
856
+ return;
857
+ }
858
+ const msg = info.canApply
859
+ ? `v${escHtml(info.current)} → <span class="text-[#B7E56D]">v${escHtml(info.latest)}</span> available · <span class="text-[#5A5F54]">${escHtml(info.installMode)} install</span>`
860
+ : `v${escHtml(info.current)} → <span class="text-[#B7E56D]">v${escHtml(info.latest)}</span> available · <span class="text-[#E89B2E]">${escHtml(info.installMode)} install — update manually</span>`;
861
+ $('updateBannerText').innerHTML = msg;
862
+ $('updateApplyBtn').style.display = info.canApply ? '' : 'none';
863
+ $('updateBanner').style.display = 'flex';
864
+ }
865
+
866
+ async function checkForUpdates({ force = false } = {}) {
867
+ try {
868
+ const r = await fetch(`/update/check${force ? '?force=1' : ''}`);
869
+ if (!r.ok) return;
870
+ updateInfo = await r.json();
871
+ showBanner(updateInfo);
872
+ } catch (e) { /* offline is fine */ }
873
+ }
874
+
875
+ function renderUpdateLog(lines) {
876
+ const el = $('updateProgressLog');
877
+ el.textContent = (lines || []).join('\n');
878
+ // Pin to bottom so the user sees the latest line.
879
+ el.scrollTop = el.scrollHeight;
880
+ }
881
+
882
+ async function pollUpdateStatus() {
883
+ try {
884
+ const r = await fetch('/update/status?lines=200');
885
+ if (!r.ok) return;
886
+ const s = await r.json();
887
+ renderUpdateLog(s.lines);
888
+ if (!s.running) {
889
+ // Update finished. The service restart may have already swapped
890
+ // the running binary — our `currentVersion` reflects whatever
891
+ // server answered. If it matches `latest`, celebrate. Either
892
+ // way, give it a moment then reload so the dashboard comes
893
+ // back on the new code path.
894
+ clearInterval(updatePollTimer); updatePollTimer = null;
895
+ $('updateSpinner').classList.remove('pulse-dot');
896
+ $('updateSpinner').classList.remove('bg-[#E89B2E]');
897
+ $('updateSpinner').classList.add('bg-[#B7E56D]');
898
+ $('updateProgressTitle').textContent = 'Installed';
899
+ $('updateProgressSub').textContent = `now on v${s.currentVersion} — reloading in 3s…`;
900
+ $('updateProgressClose').style.display = '';
901
+ setTimeout(() => location.reload(), 3000);
902
+ }
903
+ } catch (e) {
904
+ // Server is mid-restart — keep polling, it'll come back.
905
+ }
906
+ }
907
+
908
+ function startUpdateProgress(mode) {
909
+ $('updateBanner').style.display = 'none';
910
+ $('updateProgress').style.display = 'flex';
911
+ $('updateProgressSub').textContent = mode ? `(${mode} install)` : '';
912
+ $('updateProgressTitle').textContent = 'Installing';
913
+ $('updateSpinner').classList.add('pulse-dot');
914
+ $('updateProgressLog').textContent = 'starting update…';
915
+ if (updatePollTimer) clearInterval(updatePollTimer);
916
+ updatePollTimer = setInterval(pollUpdateStatus, 1500);
917
+ pollUpdateStatus();
918
+ }
919
+
920
+ $('updateApplyBtn')?.addEventListener('click', async () => {
921
+ $('updateApplyBtn').disabled = true;
922
+ try {
923
+ const r = await fetch('/update/apply', { method: 'POST' });
924
+ const j = await r.json().catch(() => ({}));
925
+ if (!r.ok || !j.started) {
926
+ $('updateBannerText').innerHTML += ` <span class="text-[#E89B2E]">— ${escHtml(j.error || 'update failed to start')}</span>`;
927
+ $('updateApplyBtn').disabled = false;
928
+ return;
929
+ }
930
+ startUpdateProgress(j.mode);
931
+ } catch (e) {
932
+ $('updateBannerText').innerHTML += ` <span class="text-[#E89B2E]">— ${escHtml(e.message)}</span>`;
933
+ $('updateApplyBtn').disabled = false;
934
+ }
935
+ });
936
+
937
+ $('updateDismissBtn')?.addEventListener('click', () => {
938
+ if (updateInfo?.latest) localStorage.setItem(UPDATE_DISMISS_KEY, updateInfo.latest);
939
+ $('updateBanner').style.display = 'none';
940
+ });
941
+
942
+ $('updateProgressClose')?.addEventListener('click', () => {
943
+ $('updateProgress').style.display = 'none';
944
+ });
945
+
807
946
  // Kick off
808
947
  loadSnapshot();
809
948
  loadAuth({ verify: false });
@@ -811,6 +950,21 @@
811
950
  loadLogs();
812
951
  armLogAutoRefresh();
813
952
  connectStream();
953
+ // Surface update availability on load + every 30 min. The backend
954
+ // caches the npm registry lookup for 15 min, so this doesn't hammer
955
+ // the registry even with the dashboard open all day.
956
+ checkForUpdates();
957
+ setInterval(() => checkForUpdates(), 30 * 60 * 1000);
958
+ // If an update is in-flight when the page loads (e.g., user refreshed
959
+ // mid-apply), pick up where it left off.
960
+ (async () => {
961
+ try {
962
+ const r = await fetch('/update/status?lines=50');
963
+ if (!r.ok) return;
964
+ const s = await r.json();
965
+ if (s.running) startUpdateProgress(s.mode);
966
+ } catch {}
967
+ })();
814
968
  </script>
815
969
  </body>
816
970
  </html>
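
The `showBanner` logic in the hunk above comes down to a small decision. The banner is hidden when no update is available or when this exact version was dismissed. Otherwise it is shown, and the apply button appears only when `canApply` is true. A sketch of that decision as a pure function follows. `bannerState` is a hypothetical helper (not in the package) that takes the `/update/check` payload and the dismissed version string.

```javascript
// Hypothetical pure version of the showBanner decision; not part of mobygate.
function bannerState(info, dismissedVersion) {
  if (!info || !info.updateAvailable) return { visible: false, showApply: false };
  // A dismissal only suppresses the exact version the user dismissed.
  if (dismissedVersion === info.latest) return { visible: false, showApply: false };
  return { visible: true, showApply: Boolean(info.canApply) };
}

const info = { updateAvailable: true, latest: '0.6.1', canApply: true };
const shown = bannerState(info, null);                              // nothing dismissed
const dismissed = bannerState(info, '0.6.1');                       // this version dismissed
const newer = bannerState({ ...info, latest: '0.6.2' }, '0.6.1');   // newer release lands
const manual = bannerState({ ...info, canApply: false }, null);     // unknown install mode
```

Keeping the decision separate from the DOM writes makes the per-version dismissal rule easy to check without a browser.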
package/lib/tool-bridge.js ADDED
@@ -0,0 +1,257 @@
1
+ /**
2
+ * Native tool bridge — translates between OpenAI client tools and the
3
+ * Claude Agent SDK's MCP-tool model.
4
+ *
5
+ * Why this exists (Phase 1 of the mobygate native-tools refactor):
6
+ *
7
+ * Until now, mobygate handled client-supplied tools by injecting their
8
+ * schemas into the system prompt as <tool> XML and instructing the model
9
+ * to emit <tool_call>{...}</tool_call> tags in its text output. We then
10
+ * regex-parsed those tags. Fragile in obvious ways: the model sometimes
11
+ * wrapped tags in code fences, sometimes hallucinated partial blocks,
12
+ * and the "empty after tool_results" nudge existed to paper over the
13
+ * model treating bare <tool_results> as inert data.
14
+ *
15
+ * The SDK actually supports native tool definitions via MCP — but its
16
+ * MCP model assumes the **handler runs in-process** and returns a
17
+ * synchronous result. Our case is different: we're a proxy. The actual
18
+ * tool implementations live on the *other* side of an HTTP boundary,
19
+ * inside the client (Hermes / OpenClaw / etc.). We can't run them.
20
+ *
21
+ * The trick: register client tools as MCP tools with stub handlers that
22
+ * never resolve. The model emits **native** `tool_use` content blocks
23
+ * (in the SDKAssistantMessage stream, not buried in text). We watch the
24
+ * stream, abort the SDK on the first complete `tool_use`, and surface
25
+ * it to the client as an OpenAI `tool_calls` response. The stub handler
26
+ * is then aborted via the SDK's signal — we never actually execute it,
27
+ * the client does.
28
+ *
29
+ * The other end of the round-trip: when the client sends a follow-up
30
+ * request with tool results (role:'tool' messages), Phase 1 splices
31
+ * them into the resumed prompt as <tool_results> text instead of
32
+ * native `tool_result` content blocks (see toolMessagesToText for
33
+ * the reason); native blocks are deferred to Phase 2.
34
+ *
35
+ * Names round-trip via the MCP prefix convention. A client tool named
36
+ * `getWeather` is registered as `mcp__mobygate__getWeather` with the
37
+ * SDK; the model emits tool_use blocks under that prefixed name; we
38
+ * strip the prefix on the way back so the client sees its original name.
39
+ */
40
+
41
+ import { z } from 'zod';
42
+ import { tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
43
+
44
+ export const MCP_SERVER_NAME = 'mobygate';
45
+ export const MCP_TOOL_PREFIX = `mcp__${MCP_SERVER_NAME}__`;
46
+
47
+ // ---------------------------------------------------------------------------
48
+ // JSON Schema → Zod RawShape
49
+ // ---------------------------------------------------------------------------
50
+ // The SDK's `tool()` helper takes a Zod RawShape (a record of ZodTypes,
51
+ // like `{name: z.string(), age: z.number()}`) — NOT a JSON Schema object.
52
+ // OpenAI clients send JSON Schema (`{type:'object', properties:{...}, required:[...]}`),
53
+ // so we need to convert. This handles the common cases that cover ~95% of
54
+ // real-world tool schemas; anything weirder falls through to z.unknown().
55
+
56
+ function jsonSchemaPropToZod(prop) {
57
+ if (!prop || typeof prop !== 'object') return z.unknown();
58
+
59
+ // Handle enums up front — they apply across types.
60
+ if (Array.isArray(prop.enum) && prop.enum.length > 0) {
61
+ const stringy = prop.enum.every((v) => typeof v === 'string');
62
+ if (stringy) return z.enum(prop.enum);
63
+ // mixed-type enums fall through to z.union of literals
64
+ return z.union(prop.enum.map((v) => z.literal(v)));
65
+ }
66
+
67
+ switch (prop.type) {
68
+ case 'string': return z.string();
69
+ case 'number': return z.number();
70
+ case 'integer': return z.number().int();
71
+ case 'boolean': return z.boolean();
72
+ case 'null': return z.null();
73
+ case 'array': {
74
+ const item = prop.items ? jsonSchemaPropToZod(prop.items) : z.unknown();
75
+ return z.array(item);
76
+ }
77
+ case 'object': {
78
+ const shape = jsonSchemaToZodShape(prop);
79
+ return z.object(shape).passthrough();
80
+ }
81
+ default: return z.unknown();
82
+ }
83
+ }
84
+
85
+ /**
86
+ * Convert a JSON Schema *object* (with `properties` + `required`) into
87
+ * a Zod RawShape suitable for the SDK's `tool()` helper.
88
+ *
89
+ * Returns an empty shape `{}` when the schema isn't an object — the
90
+ * caller will pass this to `tool()`, and the model will see "no
91
+ * structured input expected." That's the right default for tool defs
92
+ * that arrive without a properties block (which OpenAI permits).
93
+ */
94
+ export function jsonSchemaToZodShape(schema) {
95
+ if (!schema || schema.type !== 'object' || !schema.properties) return {};
96
+ const shape = {};
97
+ const required = new Set(Array.isArray(schema.required) ? schema.required : []);
98
+ for (const [key, prop] of Object.entries(schema.properties)) {
99
+ let zType = jsonSchemaPropToZod(prop);
100
+ if (!required.has(key)) zType = zType.optional();
101
+ if (prop?.description) zType = zType.describe(prop.description);
102
+ shape[key] = zType;
103
+ }
104
+ return shape;
105
+ }
106
+
107
+ // ---------------------------------------------------------------------------
108
+ // Build the MCP server that exposes client tools to the SDK
109
+ // ---------------------------------------------------------------------------
110
+
111
+ /**
112
+ * Stub handler. The model emits a tool_use block, the SDK calls us, but
113
+ * we don't actually have an implementation to run — the client does.
114
+ * So we wait. The stream-watcher in server.js will abort the SDK as
115
+ * soon as it sees the tool_use block, which propagates here as a signal
116
+ * abort. We reject and the SDK cleans up.
117
+ *
118
+ * The 30s safety timeout is for the (rare) case where the SDK fires our
119
+ * handler but the abort never propagates back — we don't want to leak
120
+ * a Promise forever. 30s is well past any reasonable abort latency.
121
+ */
122
+ function deferredToolHandler(_args, extra) {
123
+ return new Promise((resolve, reject) => {
124
+ const onAbort = () => {
125
+ cleanup();
126
+ reject(new Error('mobygate: tool execution deferred to client (aborted)'));
127
+ };
128
+ const timer = setTimeout(() => {
129
+ cleanup();
130
+ reject(new Error('mobygate: tool execution deferred to client (timeout)'));
131
+ }, 30_000);
132
+ function cleanup() {
133
+ clearTimeout(timer);
134
+ extra?.signal?.removeEventListener?.('abort', onAbort);
135
+ }
136
+ if (extra?.signal?.aborted) return onAbort();
137
+ extra?.signal?.addEventListener?.('abort', onAbort, { once: true });
138
+ });
139
+ }
140
+
141
+ /**
142
+ * Build an in-process MCP server exposing the client's tools to the SDK.
143
+ * Returns the McpSdkServerConfigWithInstance; pass it to `query({options: { mcpServers: { [MCP_SERVER_NAME]: config } }})`.
144
+ *
145
+ * Returns `null` when there are no valid tools — caller should skip
146
+ * MCP setup entirely in that case.
147
+ */
148
+ export function buildClientToolsServer(openaiTools) {
149
+ if (!Array.isArray(openaiTools) || openaiTools.length === 0) return null;
150
+
151
+ const toolDefs = [];
152
+ for (const t of openaiTools) {
153
+ if (t?.type !== 'function' || !t.function?.name) continue;
154
+ const fn = t.function;
155
+ const shape = jsonSchemaToZodShape(fn.parameters);
156
+ toolDefs.push(tool(
157
+ fn.name,
158
+ fn.description || `Client-defined tool: ${fn.name}`,
159
+ shape,
160
+ deferredToolHandler,
161
+ // alwaysLoad: the SDK otherwise marks MCP tools as "deferred" — the
162
+ // model has to call the built-in `ToolSearch` to fetch the schema
163
+ // before invoking. That round-trip leaks through to OpenAI clients,
164
+ // who see a confusing tool_call for ToolSearch instead of getWeather.
165
+ // Eagerly loading our tools keeps the OpenAI surface clean.
166
+ { alwaysLoad: true },
167
+ ));
168
+ }
169
+ if (toolDefs.length === 0) return null;
170
+
171
+ return createSdkMcpServer({
172
+ name: MCP_SERVER_NAME,
173
+ version: '1.0.0',
174
+ tools: toolDefs,
175
+ });
176
+ }
177
+
178
+ // ---------------------------------------------------------------------------
179
+ // Tool-use extraction (SDK assistant message → OpenAI tool_calls)
180
+ // ---------------------------------------------------------------------------
181
+
182
+ /**
183
+ * Walk an SDKAssistantMessage's content array for native `tool_use` blocks.
184
+ * Returns an array of `{ id, name, arguments }` formatted for OpenAI
185
+ * tool_calls — name has the MCP prefix stripped, arguments is a JSON string.
186
+ *
187
+ * Returns `[]` when the message has no tool_use blocks (most assistant
188
+ * messages don't — they're just text deltas).
189
+ */
190
+ export function extractToolUses(assistantMessage) {
191
+ const content = assistantMessage?.message?.content;
192
+ if (!Array.isArray(content)) return [];
193
+ const calls = [];
194
+ for (const block of content) {
195
+ if (block?.type !== 'tool_use' || !block.id || !block.name) continue;
196
+ // Strip the MCP prefix so the client sees its original tool name.
197
+ const name = block.name.startsWith(MCP_TOOL_PREFIX)
198
+ ? block.name.slice(MCP_TOOL_PREFIX.length)
199
+ : block.name;
200
+ let argsString = '{}';
201
+ try { argsString = JSON.stringify(block.input ?? {}); } catch {}
202
+ calls.push({ id: block.id, name, arguments: argsString });
203
+ }
204
+ return calls;
205
+ }
206
+
207
+ /**
208
+ * Quick liveness check used by the stream loop to decide whether to abort
209
+ * early. Returns true the moment any tool_use block appears.
210
+ */
211
+ export function hasToolUse(assistantMessage) {
212
+ const content = assistantMessage?.message?.content;
213
+ if (!Array.isArray(content)) return false;
214
+ return content.some((b) => b?.type === 'tool_use');
215
+ }
216
+
217
+ // ---------------------------------------------------------------------------
218
+ // Tool results (OpenAI tool messages → <tool_results> text for the resumed prompt)
219
+ // ---------------------------------------------------------------------------
220
+
221
+ /**
222
+ * Format OpenAI role:'tool' messages as a single user-readable text
223
+ * block to splice into a resumed prompt.
224
+ *
225
+ * NOTE: Phase 1 deliberately does *not* round-trip tool results as
226
+ * native Anthropic `tool_result` content blocks. Why: when we abort
227
+ * the SDK on a tool_use, the assistant turn isn't persisted in the
228
+ * SDK's session state (we observed `msgs=1` on resume after a tool
229
+ * call, meaning the partial turn was dropped). On resume, sending a
230
+ * native tool_result block then has nothing to bind to — the model
231
+ * sees an orphan tool_result and re-calls the tool.
232
+ *
233
+ * Phase 2's full Anthropic Messages wire format will keep the SDK
234
+ * alive long enough to persist the turn properly. Until then, text-
235
+ * form tool results (which the model handles fine — it has the
236
+ * preceding tool_use in resume context) are the pragmatic answer.
237
+ *
238
+ * Returns a single string suitable for prepending to (or replacing)
239
+ * the user's prompt text on a resumed turn. Returns '' when there
240
+ * are no tool messages.
241
+ */
242
+ export function toolMessagesToText(toolMessages) {
243
+ const lines = [];
244
+ for (const msg of toolMessages) {
245
+ if (msg?.role !== 'tool') continue;
246
+ const id = msg.tool_call_id || 'unknown';
247
+ const name = msg.name || '';
248
+ const content = typeof msg.content === 'string'
249
+ ? msg.content
250
+ : Array.isArray(msg.content)
251
+ ? msg.content.map((c) => (typeof c === 'string' ? c : c?.text || '')).join('')
252
+ : (msg.content == null ? '' : String(msg.content));
253
+ lines.push(`<tool_result id="${id}"${name ? ` name="${name}"` : ''}>\n${content}\n</tool_result>`);
254
+ }
255
+ if (lines.length === 0) return '';
256
+ return `<tool_results>\n${lines.join('\n')}\n</tool_results>`;
257
+ }
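
`extractToolUses` returns flat `{ id, name, arguments }` records, which still have to be wrapped in the OpenAI chat-completions `tool_calls` shape before they reach a client. A sketch of that last step follows. `toOpenAiToolCallMessage` is a hypothetical helper and the `toolu_example` ID is made up. The real wrapping presumably happens in `server.js`, which is not part of this excerpt.

```javascript
// Hypothetical helper: wraps extractToolUses-style records in the OpenAI
// chat-completions assistant message shape. Not taken from mobygate.
function toOpenAiToolCallMessage(calls) {
  return {
    role: 'assistant',
    content: null,
    tool_calls: calls.map((c) => ({
      id: c.id,                 // native Anthropic ID is passed through unchanged
      type: 'function',
      function: { name: c.name, arguments: c.arguments }, // arguments stays a JSON string
    })),
  };
}

// Illustrative input; the ID and arguments are invented for the example.
const message = toOpenAiToolCallMessage([
  { id: 'toolu_example', name: 'getWeather', arguments: '{"city":"Oslo"}' },
]);
```

A response carrying such a message would normally also set `finish_reason` to `tool_calls` on the enclosing choice, which is how OpenAI clients know to run the tool and send back a `role: 'tool'` message.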
package/lib/updater.js ADDED
@@ -0,0 +1,275 @@
1
+ /**
2
+ * mobygate updater
3
+ *
4
+ * Shared helpers powering the dashboard's "update available → update now"
5
+ * flow (and re-usable from the CLI later). Two concerns:
6
+ *
7
+ * 1. Version lookup — read local package.json + query npm registry for
8
+ * the latest published version. Cached for 15 min so the dashboard
9
+ * can poll `/update/check` on load + every 30 min without hammering
10
+ * the registry (or getting rate-limited).
11
+ *
12
+ * 2. Apply — spawn the upgrade as a **detached** child process so the
13
+ * restart-the-service step can kill the running mobygate server
14
+ * without orphaning the update or losing log lines. Progress is
15
+ * streamed to `~/.mobygate/logs/update.log`, which the dashboard
16
+ * polls via `/update/status`.
17
+ *
18
+ * Install-mode routing matches `bin/mobygate.js cmdUpdate`:
19
+ * - `npm` → `npm install -g mobygate@latest` → restart service
20
+ * - `git` → `git pull && npm install` → restart service
21
+ * - `unknown` → refuse, surface a readable message
22
+ */
23
+
24
+ import { spawn, spawnSync } from 'child_process';
25
+ import { readFileSync, writeFileSync, existsSync, mkdirSync, openSync } from 'fs';
26
+ import { join, sep, dirname } from 'path';
27
+ import { fileURLToPath } from 'url';
28
+ import { LOGS_DIR } from './config.js';
29
+
30
+ const __filename = fileURLToPath(import.meta.url);
31
+ const REPO_ROOT = dirname(dirname(__filename)); // lib/updater.js → repo root
32
+
33
+ const IS_WIN = process.platform === 'win32';
34
+ const IS_MAC = process.platform === 'darwin';
35
+ const IS_LINUX = process.platform === 'linux';
36
+
37
+ const SERVER_LABEL = 'ai.mobygate.server';
38
+ const WIN_SERVER_TASK = 'ai.mobygate.server';
39
+ const LINUX_SERVER_UNIT = 'mobygate-server.service';
40
+
41
+ const UPDATE_LOG = join(LOGS_DIR, 'update.log');
42
+ const UPDATE_MARKER = join(LOGS_DIR, 'update.state.json');
43
+
44
+ // ---------------------------------------------------------------------------
45
+ // Version lookup
46
+ // ---------------------------------------------------------------------------
47
+
48
+ export function getCurrentVersion() {
49
+ try {
50
+ const pkg = JSON.parse(readFileSync(join(REPO_ROOT, 'package.json'), 'utf8'));
51
+ return pkg.version;
52
+ } catch {
53
+ return 'unknown';
54
+ }
55
+ }
56
+
57
+ export function detectInstallMode() {
58
+ if (existsSync(join(REPO_ROOT, '.git'))) return 'git';
59
+ if (REPO_ROOT.includes(`${sep}node_modules${sep}mobygate`)) return 'npm';
60
+ return 'unknown';
61
+ }
62
+
63
+ // 15-min in-memory cache so `/update/check` on dashboard load + 30-min
64
+ // repolls don't hit the npm registry every time. Bust with { force: true }.
65
+ const NPM_CACHE_MS = 15 * 60 * 1000;
66
+ let _npmCache = { version: null, fetchedAt: 0, error: null };
67
+
68
+ /**
69
+ * Resolve the latest published version on npm. Returns `{ version, cached, error }`.
70
+ * Never throws — on failure returns an error string so the endpoint can
71
+ * report "check failed" without 500'ing the dashboard.
72
+ */
73
+ export async function getLatestVersion({ force = false } = {}) {
74
+ const now = Date.now();
75
+ if (!force && _npmCache.version && (now - _npmCache.fetchedAt) < NPM_CACHE_MS) {
76
+ return { version: _npmCache.version, cached: true, error: null };
77
+ }
78
+ // `npm view` is a network call; cap at 10s so a bad connection doesn't
79
+ // wedge the dashboard.
80
+ const r = spawnSync('npm', ['view', 'mobygate', 'version'], {
81
+ encoding: 'utf8',
82
+ timeout: 10_000,
83
+ shell: IS_WIN, // on Windows, npm is a .cmd — needs shell resolution
84
+ });
85
+ if (r.status !== 0) {
86
+ const err = r.stderr?.trim() || r.error?.message || `npm exited ${r.status}`;
87
+ _npmCache = { version: null, fetchedAt: now, error: err };
88
+ return { version: null, cached: false, error: err };
89
+ }
90
+ const version = r.stdout.trim();
91
+ _npmCache = { version, fetchedAt: now, error: null };
92
+ return { version, cached: false, error: null };
93
+ }
94
+
95
+ /**
96
+ * Compare two semver-ish strings (x.y.z). Returns -1/0/1.
97
+ * Non-numeric suffixes (`-beta.1`) are ignored for simplicity — this
98
+ * matches what the registry returns for mobygate's release channel.
99
+ */
100
+ export function compareVersions(a, b) {
101
+ const pa = String(a).split('-')[0].split('.').map((n) => parseInt(n, 10) || 0);
102
+ const pb = String(b).split('-')[0].split('.').map((n) => parseInt(n, 10) || 0);
103
+ for (let i = 0; i < 3; i++) {
104
+ if ((pa[i] || 0) > (pb[i] || 0)) return 1;
105
+ if ((pa[i] || 0) < (pb[i] || 0)) return -1;
106
+ }
107
+ return 0;
108
+ }
109
+
110
+ export async function getUpdateCheck({ force = false } = {}) {
111
+ const current = getCurrentVersion();
112
+ const mode = detectInstallMode();
113
+ const { version: latest, cached, error } = await getLatestVersion({ force });
114
+ const updateAvailable = !!latest && compareVersions(latest, current) > 0;
115
+ return {
116
+ current,
117
+ latest: latest || null,
118
+ updateAvailable,
119
+ installMode: mode,
120
+ checkedAt: new Date().toISOString(),
121
+ cached,
122
+ error,
123
+ // If the install layout can't self-update, surface it so the UI can
124
+ // show "run manually" instead of a live-updating button.
125
+ canApply: updateAvailable && (mode === 'npm' || mode === 'git'),
126
+ };
127
+ }
128
+
129
+ // ---------------------------------------------------------------------------
130
+ // Apply
131
+ // ---------------------------------------------------------------------------
132
+
133
+ function ensureLogsDir() {
134
+ if (!existsSync(LOGS_DIR)) mkdirSync(LOGS_DIR, { recursive: true });
135
+ }
136
+
137
+ export function readUpdateState() {
138
+ if (!existsSync(UPDATE_MARKER)) return { running: false, startedAt: null, finishedAt: null, exitCode: null, pid: null };
139
+ try { return JSON.parse(readFileSync(UPDATE_MARKER, 'utf8')); } catch { return { running: false, error: 'marker-unreadable' }; }
140
+ }
141
+
142
+ export function readUpdateLogTail({ lines = 500 } = {}) {
143
+ if (!existsSync(UPDATE_LOG)) return [];
144
+ try {
145
+ const raw = readFileSync(UPDATE_LOG, 'utf8');
146
+ const split = raw.split(/\r?\n/);
147
+ return split.slice(-lines - 1, -1);
148
+ } catch { return []; }
149
+ }
150
+
151
+ function writeUpdateState(patch) {
152
+ ensureLogsDir();
153
+ const prev = readUpdateState();
154
+ const merged = { ...prev, ...patch, updatedAt: new Date().toISOString() };
155
+ writeFileSync(UPDATE_MARKER, JSON.stringify(merged, null, 2));
156
+ return merged;
157
+ }
158
+
159
+ /**
160
+ * Build the shell command that performs update + restart. Returns
161
+ * `{ shell, cmd }`, where `cmd` is a single string for `sh -c` / `cmd /c`.
162
+ * On Windows the string carries its own `>>` redirection for log capture;
163
+ */
164
+ function buildUpdateCommand({ mode, repoRoot, logPath }) {
165
+ if (IS_WIN) {
166
+ // cmd.exe: `>>` appends, `2>&1` merges stderr into stdout. Steps are
167
+ // chained with `&&` (see below) so the first failure stops the rest.
168
+ const steps = [];
169
+ steps.push(`echo [mobygate-update] start at %DATE% %TIME%`);
170
+ if (mode === 'npm') {
171
+ steps.push(`npm install -g mobygate@latest`);
172
+ } else if (mode === 'git') {
173
+ steps.push(`cd /d "${repoRoot}"`);
174
+ steps.push(`git pull --ff-only`);
175
+ steps.push(`npm install`);
176
+ }
177
+ steps.push(`echo [mobygate-update] restarting service`);
178
+ steps.push(`schtasks /End /TN "${WIN_SERVER_TASK}"`);
179
+ steps.push(`schtasks /Run /TN "${WIN_SERVER_TASK}"`);
180
+ steps.push(`echo [mobygate-update] done`);
181
+ // Join with && so any failure stops the chain. Final redirect to log.
182
+ const inner = steps.map((s) => `(${s})`).join(' && ');
183
+ return { shell: 'cmd', cmd: `(${inner}) >> "${logPath}" 2>&1` };
184
+ }
185
+ // POSIX: sh -c, bail-on-first-failure via set -e
186
+ const parts = [`set -e`, `echo "[mobygate-update] start $(date)"`];
187
+ if (mode === 'npm') {
188
+ parts.push(`npm install -g mobygate@latest`);
189
+ } else if (mode === 'git') {
190
+ parts.push(`cd "${repoRoot}"`);
191
+ parts.push(`git pull --ff-only`);
192
+ parts.push(`npm install`);
193
+ }
194
+ parts.push(`echo "[mobygate-update] restarting service"`);
195
+ if (IS_MAC) {
196
+ const plist = join(process.env.HOME || '~', 'Library', 'LaunchAgents', `${SERVER_LABEL}.plist`);
197
+ // unload may fail if not loaded — tolerate that specific case
198
+ parts.push(`launchctl unload "${plist}" 2>/dev/null || true`);
199
+ parts.push(`launchctl load "${plist}"`);
200
+ } else if (IS_LINUX) {
201
+ parts.push(`systemctl --user restart ${LINUX_SERVER_UNIT}`);
202
+ }
203
+ parts.push(`echo "[mobygate-update] done"`);
204
+ const script = parts.join('\n');
205
+ return { shell: 'sh', cmd: script };
206
+ }
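The two branches of `buildUpdateCommand()` use different failure strategies: `&&` chaining for cmd.exe and `set -e` for POSIX `sh`. The sketch below isolates both as pure string builders. `chainForCmd` and `scriptForSh` are hypothetical helpers, not functions from `lib/updater.js`. The outer parentheses in the cmd.exe variant rest on the assumption that a trailing redirect binds only to the last command or parenthesised block, so grouping the whole chain is what sends every step's output to the log.

```javascript
// Illustrative only: chain steps for cmd.exe and group them so a single
// redirect covers the output of every step, not just the last one.
function chainForCmd(steps, logPath) {
  const inner = steps.map((s) => `(${s})`).join(' && ');
  return `(${inner}) >> "${logPath}" 2>&1`;
}

// POSIX counterpart: one step per line; `set -e` aborts on the first failure.
function scriptForSh(steps) {
  return ['set -e', ...steps].join('\n');
}

const winCmd = chainForCmd(['echo start', 'npm install -g mobygate@latest'], 'C:\\logs\\update.log');
const shScript = scriptForSh(['echo start', 'npm install -g mobygate@latest']);
```

On POSIX no redirect is embedded because `applyUpdate()` passes file descriptors for the log to `spawn` instead.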
207
+
208
+ /**
209
+ * Kick off the update in a **detached** child process. The running
210
+ * mobygate server returns immediately and is then killed by the restart
211
+ * step. The dashboard polls `/update/status` to see progress, and the
212
+ * new server comes up with the upgraded code.
213
+ *
214
+ * Returns `{ started, pid, error }` — never throws. If another update
215
+ * is already running, returns `{ started: false, error: 'update-already-running', pid }`.
216
+ */
217
+ export function applyUpdate({ mode, repoRoot = REPO_ROOT } = {}) {
218
+ const resolvedMode = mode || detectInstallMode();
219
+ if (resolvedMode !== 'npm' && resolvedMode !== 'git') {
220
+ return { started: false, error: `install-mode ${resolvedMode} can't auto-update` };
221
+ }
222
+
223
+ ensureLogsDir();
224
+ const prev = readUpdateState();
225
+ if (prev.running && prev.pid) {
226
+ // Treat unknown-alive PIDs as in-progress. Dead PIDs fall through.
227
+ let alive = false;
228
+ try { process.kill(prev.pid, 0); alive = true; } catch {}
229
+ if (alive) return { started: false, error: 'update-already-running', pid: prev.pid };
230
+ }
231
+
232
+ // Truncate the log so each update starts fresh (the old content is
233
+ // still available via `mobygate logs` up to the boundary).
234
+ writeFileSync(UPDATE_LOG, '');
235
+
236
+ const { shell, cmd } = buildUpdateCommand({ mode: resolvedMode, repoRoot, logPath: UPDATE_LOG });
237
+
238
+ let child;
239
+ try {
240
+ if (shell === 'cmd') {
241
+ // Windows: use cmd.exe /c; redirection is inside cmd string.
242
+ child = spawn('cmd.exe', ['/c', cmd], {
243
+ detached: true,
244
+ stdio: 'ignore',
245
+ windowsHide: true,
246
+ });
247
+ } else {
248
+ // POSIX: use sh -c; redirect stdout/stderr into the log file via Node
249
+ // (simpler than embedding `>>` into the script, and avoids quoting
250
+ // headaches across macOS sh versions).
251
+ const fd = openSync(UPDATE_LOG, 'a');
252
+ child = spawn('sh', ['-c', cmd], {
253
+ detached: true,
254
+ stdio: ['ignore', fd, fd],
255
+ });
256
+ }
257
+ child.unref();
258
+ } catch (e) {
259
+ return { started: false, error: `spawn failed: ${e.message}` };
260
+ }
261
+
262
+ writeUpdateState({
263
+ running: true,
264
+ startedAt: new Date().toISOString(),
265
+ finishedAt: null,
266
+ exitCode: null,
267
+ pid: child.pid,
268
+ mode: resolvedMode,
269
+ });
270
+
271
+ // We don't await the child — it's detached and will outlive us. The
272
+ // dashboard determines completion by polling /update/status, which
273
+ // checks `process.kill(pid, 0)` to test liveness.
274
+ return { started: true, pid: child.pid, mode: resolvedMode };
275
+ }
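Both `applyUpdate()` and the `/update/status` route test liveness with `process.kill(pid, 0)`. A slightly stricter variant is sketched below. `isPidAlive` is a hypothetical helper, and its `EPERM` handling is an addition that the source does not have: the source treats any thrown error as "not running", whereas Node reports `EPERM` when the process exists but cannot be signalled by the current user.

```javascript
// Hypothetical helper: true if a process with this PID appears to exist.
function isPidAlive(pid) {
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    // Signal 0 performs the existence/permission check without delivering a signal.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but is owned by another user.
    // ESRCH (or anything else): treat as not running.
    return err.code === 'EPERM';
  }
}

const selfAlive = isPidAlive(process.pid);
```

A PID-based check cannot distinguish the original updater from an unrelated process that later reuses the same PID, so pairing it with the `startedAt` timestamp from the marker file would be one way to bound that risk.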
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "mobygate",
3
- "version": "0.5.3",
3
+ "version": "0.6.0",
4
4
  "description": "OpenAI-compatible local proxy for Claude Max. The Möbius-strip gateway: OpenAI shape in, Claude Max out.",
5
5
  "type": "module",
6
6
  "main": "server.js",
package/server.js CHANGED
@@ -53,6 +53,21 @@ import { banner } from './lib/ascii.js';
53
53
  import { bus as dashboardBus } from './lib/dashboard-bus.js';
54
54
  import { loadSessions, saveSessions, flushSessionsNow } from './lib/session-store.js';
55
55
  import { LOGS_DIR } from './lib/config.js';
56
+ import {
57
+ buildClientToolsServer,
58
+ extractToolUses,
59
+ hasToolUse,
60
+ toolMessagesToText,
61
+ MCP_SERVER_NAME,
62
+ MCP_TOOL_PREFIX,
63
+ } from './lib/tool-bridge.js';
64
+ import {
65
+ getUpdateCheck,
66
+ applyUpdate,
67
+ readUpdateState,
68
+ readUpdateLogTail,
69
+ getCurrentVersion,
70
+ } from './lib/updater.js';
56
71
 
57
72
  const __filename = fileURLToPath(import.meta.url);
58
73
  const __dirname = dirname(__filename);
@@ -214,114 +229,45 @@ function collectImages(messages) {
214
229
  }
215
230
 
216
231
  // ---------------------------------------------------------------------------
217
- // Tool calling (Path B: prompt-embedded protocol)
232
+ // Tool calling (Phase 1: native MCP tools — no more <tool_call> text hack)
218
233
  // ---------------------------------------------------------------------------
219
- // The Claude Agent SDK cannot stream OpenAI-style function-call events back to
220
- // the caller (MCP handlers execute in-process and pollute session state; see
221
- // README "Known Gaps"). Workaround: inject client-provided tool schemas into
222
- // the system prompt and instruct the model to emit <tool_call>{...}</tool_call>
223
- // tags. We parse those out and re-emit as OpenAI `tool_calls`. Tool results
224
- // coming back from the client get wrapped in <tool_result> blocks.
234
+ // Client-provided OpenAI tools are registered with the SDK as in-process MCP
235
+ // tools (see lib/tool-bridge.js). The model emits **native** tool_use content
236
+ // blocks in its assistant messages; we abort the SDK on the first one and
237
+ // return OpenAI tool_calls to the client. When the client replies with tool
238
+ // results, we splice them into the next prompt as <tool_results> text (see
239
+ // toolMessagesToText in lib/tool-bridge.js), not as native tool_result blocks.
225
240
 
226
241
  function hasTools(body) {
227
242
  return Array.isArray(body?.tools) && body.tools.length > 0;
228
243
  }
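`hasToolUse()` and `extractToolUses()` are imported from `lib/tool-bridge.js`, which is not shown in this part of the diff. The following sketch illustrates what such helpers might look like. It assumes the SDK's assistant message wraps an Anthropic message under `message.content`, that tool-use blocks carry `id`, `name`, and `input`, and that MCP tool names are exposed with a server prefix. The prefix value and both function names are hypothetical.

```javascript
// Hypothetical prefix; the real MCP_TOOL_PREFIX value lives in lib/tool-bridge.js.
const PREFIX_SKETCH = 'mcp__client_tools__';

// True if the assistant message contains at least one tool_use content block.
function hasToolUseSketch(message) {
  const blocks = message?.message?.content;
  return Array.isArray(blocks) && blocks.some((b) => b?.type === 'tool_use');
}

// Convert tool_use blocks into the { id, name, arguments } shape used by the handlers.
function extractToolUsesSketch(message) {
  const blocks = Array.isArray(message?.message?.content) ? message.message.content : [];
  return blocks
    .filter((b) => b?.type === 'tool_use')
    .map((b) => {
      const name = String(b.name);
      return {
        id: b.id,
        // Strip the MCP prefix so the client sees the tool name it registered.
        name: name.startsWith(PREFIX_SKETCH) ? name.slice(PREFIX_SKETCH.length) : name,
        // OpenAI expects arguments as a JSON string, not an object.
        arguments: JSON.stringify(b.input ?? {}),
      };
    });
}

const sampleMessage = {
  type: 'assistant',
  message: {
    content: [
      { type: 'text', text: 'Checking the weather.' },
      { type: 'tool_use', id: 'toolu_01', name: 'mcp__client_tools__get_weather', input: { city: 'Oslo' } },
    ],
  },
};
const sampleCalls = extractToolUsesSketch(sampleMessage);
```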
229
244
 
230
- function buildToolInstructions(tools) {
231
- const lines = [
232
- 'You have access to CLIENT-DEFINED tools listed below. To invoke a tool, emit one or more <tool_call> tags, each containing a strict JSON object with "name" and "arguments":',
233
- '',
234
- '<tool_call>{"name":"<tool_name>","arguments":{<args>}}</tool_call>',
235
- '',
236
- 'Rules:',
237
- '- Do NOT wrap <tool_call> tags in markdown code fences.',
238
- '- When you emit <tool_call> tags, output ONLY the tags — no prose, no explanation, no other text.',
239
- '- You may emit multiple <tool_call> tags to request parallel calls.',
240
- '- Tool results will be returned as <tool_result id="..." name="...">...</tool_result> blocks. After results arrive, continue toward the final answer.',
241
- '- When you have the final answer and need no more tool calls, respond normally WITHOUT any <tool_call> tag.',
242
- '- Do NOT call any other tool (Read, Bash, Grep, etc.) — only the tools listed below.',
243
- '',
244
- 'Available tools:',
245
- ];
246
- for (const t of tools) {
247
- if (t?.type !== 'function' || !t.function) continue;
248
- const fn = t.function;
249
- lines.push(`<tool name="${fn.name}">`);
250
- if (fn.description) lines.push(` <description>${fn.description}</description>`);
251
- lines.push(` <parameters>${JSON.stringify(fn.parameters || { type: 'object', properties: {} })}</parameters>`);
252
- lines.push('</tool>');
253
- }
254
- return lines.join('\n');
255
- }
256
-
257
- function formatAssistantForReplay(msg) {
258
- const parts = [];
259
- const text = extractContent(msg.content);
260
- if (text) parts.push(text);
261
- if (Array.isArray(msg.tool_calls)) {
262
- for (const tc of msg.tool_calls) {
263
- if (tc?.type === 'function' && tc.function) {
264
- let args = {};
265
- try { args = JSON.parse(tc.function.arguments || '{}'); } catch {}
266
- parts.push(`<tool_call>${JSON.stringify({ name: tc.function.name, arguments: args })}</tool_call>`);
267
- }
268
- }
269
- }
270
- return parts.join('\n');
271
- }
272
-
273
- function formatToolResult(msg) {
274
- const content = extractContent(msg.content);
275
- const id = msg.tool_call_id || 'unknown';
276
- const name = msg.name || '';
277
- return `<tool_result id="${id}" name="${name}">\n${content}\n</tool_result>`;
278
- }
279
-
280
- // Parse the model's text output for <tool_call> tags. Returns
281
- // { toolCalls: [{id, name, arguments}], textBefore: string }
282
- // when at least one valid call is found, else null.
283
- function parseToolCalls(text) {
284
- if (!text || !text.includes('<tool_call>')) return null;
285
- const re = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g;
286
- const calls = [];
287
- let firstIdx = -1;
288
- let m;
289
- while ((m = re.exec(text)) !== null) {
290
- if (firstIdx === -1) firstIdx = m.index;
291
- try {
292
- const obj = JSON.parse(m[1]);
293
- if (obj && typeof obj.name === 'string') {
294
- calls.push({
295
- id: `call_${uuidv4().replace(/-/g, '').slice(0, 20)}`,
296
- name: obj.name,
297
- arguments: JSON.stringify(obj.arguments ?? {}),
298
- });
299
- }
300
- } catch {
301
- // ignore malformed tool_call blocks
302
- }
303
- }
304
- if (!calls.length) return null;
305
- return { toolCalls: calls, textBefore: text.slice(0, firstIdx).trim() };
306
- }
307
-
308
- // Detect whether the running text contains a COMPLETE <tool_call>...</tool_call>
309
- // pair — used to abort the SDK early once a call has been emitted.
310
- function hasCompleteToolCall(text) {
311
- return /<tool_call>\s*[\s\S]*?<\/tool_call>/.test(text);
312
- }
313
-
314
- function messagesToPrompt(messages, { resuming = false, tools = null } = {}) {
315
- // When resuming, the SDK already has full history. Only send the new tail:
316
- // tool_results (if the client is replying with tool outputs) and/or a fresh
317
- // user message.
245
+ /**
246
+ * Build the prompt text from the OpenAI messages array.
247
+ *
248
+ * Returns `{ promptText }` — a single string ready for the SDK. Tool
249
+ * results are spliced in as <tool_results> XML when present (see
250
+ * lib/tool-bridge.js#toolMessagesToText for why we don't use native
251
+ * tool_result content blocks yet).
252
+ *
253
+ * Resuming vs fresh:
254
+ * - Resuming: SDK has full history. We only send the new tail —
255
+ * trailing tool results plus the most recent user text, if any.
256
+ * - Fresh: SDK starts cold. We serialize the visible history with
257
+ * <system>/<previous_response>/<tool_results> tags. No tool-
258
+ * instruction injection — the SDK MCP registration handles that.
259
+ */
260
+ function messagesToPrompt(messages, { resuming = false } = {}) {
318
261
  if (resuming) {
319
- const toolResults = [];
262
+ // Walk backwards from the end, collecting trailing tool messages and
263
+ // the most recent user text. Tool results are formatted as a text
264
+ // block (see lib/tool-bridge.js#toolMessagesToText for the rationale).
265
+ const trailingToolMessages = [];
320
266
  let userText = '';
321
267
  for (let i = messages.length - 1; i >= 0; i--) {
322
268
  const msg = messages[i];
323
269
  if (msg.role === 'tool') {
324
- toolResults.unshift(formatToolResult(msg));
270
+ trailingToolMessages.unshift(msg);
325
271
  } else if (msg.role === 'user') {
326
272
  userText = extractContent(msg.content);
327
273
  break;
@@ -329,39 +275,20 @@ function messagesToPrompt(messages, { resuming = false, tools = null } = {}) {
329
275
  break;
330
276
  }
331
277
  }
278
+ const toolResultsText = toolMessagesToText(trailingToolMessages);
332
279
  const parts = [];
333
- if (toolResults.length) {
334
- parts.push(`<tool_results>\n${toolResults.join('\n')}\n</tool_results>`);
335
- // The model sometimes treats a bare <tool_results> block as "just data"
336
- // and returns empty. A short nudge keeps the turn productive without
337
- // biasing what comes next.
338
- if (!userText) parts.push('Use the tool results above to continue toward the final answer. If more tool calls are needed, emit them; otherwise respond directly.');
339
- }
280
+ if (toolResultsText) parts.push(toolResultsText);
340
281
  if (userText) parts.push(userText);
341
- return parts.join('\n\n') || extractContent(messages[messages.length - 1].content);
282
+ return {
283
+ promptText: parts.join('\n\n') || extractContent(messages[messages.length - 1]?.content || ''),
284
+ };
342
285
  }
343
286
 
287
+ // Fresh request: serialize visible history as XML-wrapped text. No
288
+ // tool-instruction injection (the model learns about tools via the SDK
289
+ // MCP registration, not the prompt).
344
290
  const parts = [];
345
- // Tool instructions prepended once at the top of the system context.
346
- if (tools && tools.length) {
347
- parts.push(`<system>\n${buildToolInstructions(tools)}\n</system>\n`);
348
- }
349
-
350
- // Group consecutive tool-role messages so they emit as one <tool_results> block.
351
- let toolBuffer = [];
352
- const flushTools = () => {
353
- if (toolBuffer.length) {
354
- parts.push(`<tool_results>\n${toolBuffer.join('\n')}\n</tool_results>\n`);
355
- toolBuffer = [];
356
- }
357
- };
358
-
359
291
  for (const msg of messages) {
360
- if (msg.role === 'tool') {
361
- toolBuffer.push(formatToolResult(msg));
362
- continue;
363
- }
364
- flushTools();
365
292
  switch (msg.role) {
366
293
  case 'system':
367
294
  parts.push(`<system>\n${extractContent(msg.content)}\n</system>\n`);
@@ -369,18 +296,34 @@ function messagesToPrompt(messages, { resuming = false, tools = null } = {}) {
369
296
  case 'user':
370
297
  parts.push(extractContent(msg.content));
371
298
  break;
372
- case 'assistant':
373
- parts.push(`<previous_response>\n${formatAssistantForReplay(msg)}\n</previous_response>\n`);
299
+ case 'assistant': {
300
+ // Best-effort replay. tool_calls in non-resume history are dropped;
301
+ // the model can usually infer continuity from the surrounding text.
302
+ const text = extractContent(msg.content);
303
+ if (text) parts.push(`<previous_response>\n${text}\n</previous_response>\n`);
304
+ break;
305
+ }
306
+ case 'tool': {
307
+ // Tool messages on a fresh turn (rare — clients normally use
308
+ // session keys). Splice as text since there's no preceding
309
+ // tool_use turn we can bind to natively.
310
+ const text = toolMessagesToText([msg]);
311
+ if (text) parts.push(text);
374
312
  break;
313
+ }
375
314
  }
376
315
  }
377
- flushTools();
378
- return parts.join('\n').trim();
316
+ return {
317
+ promptText: parts.join('\n').trim(),
318
+ };
379
319
  }
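The resuming branch of `messagesToPrompt()` walks the message array backwards, and its stopping rule is easy to miss in diff form. The sketch below reproduces that walk on its own. `collectResumeTail` is a hypothetical name, the final `else` branch is inferred from the surrounding hunk context, and `content` is assumed to be a plain string (the real code calls `extractContent()`).

```javascript
// Walk back from the end: gather trailing tool messages, stop at the first
// user message (keeping its text) or at any other role.
function collectResumeTail(messages) {
  const trailingToolMessages = [];
  let userText = '';
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.role === 'tool') {
      trailingToolMessages.unshift(msg); // unshift keeps original order
    } else if (msg.role === 'user') {
      userText = typeof msg.content === 'string' ? msg.content : '';
      break;
    } else {
      break; // assistant or system: nothing newer to send
    }
  }
  return { trailingToolMessages, userText };
}

const afterToolCall = collectResumeTail([
  { role: 'user', content: 'weather in Oslo?' },
  { role: 'assistant', content: null },
  { role: 'tool', tool_call_id: 'toolu_01', content: '4 C' },
  { role: 'tool', tool_call_id: 'toolu_02', content: 'windy' },
]);

const toolThenUser = collectResumeTail([
  { role: 'tool', tool_call_id: 'toolu_01', content: '4 C' },
  { role: 'user', content: 'thanks, and tomorrow?' },
]);
```

One consequence of this rule: when a user message follows the tool results, the walk stops at the user message first, so the earlier tool messages are not collected.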
380
320
 
381
- // Wrap a prompt + optional image blocks into the form query() expects.
382
- // Returns a string when there are no images (fast path), or an async iterable
383
- // yielding one SDKUserMessage with multi-part content when there are.
321
+ /**
322
+ * Wrap promptText + optional image blocks into the form query() expects.
323
+ * Returns a string for the fast path (text-only, no images), or an
324
+ * async iterable yielding one SDKUserMessage with multi-part content
325
+ * when there are images.
326
+ */
384
327
  function buildQueryPrompt(promptText, imageBlocks) {
385
328
  if (!imageBlocks.length) return promptText;
386
329
  const content = [
@@ -443,12 +386,15 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
443
386
  const existing = getSession(sessionKey);
444
387
  const resuming = !!existing?.sdkSessionId;
445
388
  const toolsEnabled = hasTools(body);
446
- const promptText = messagesToPrompt(body.messages, { resuming, tools: toolsEnabled ? body.tools : null });
389
+ const { promptText } = messagesToPrompt(body.messages, { resuming });
447
390
  const images = collectImages(body.messages);
448
391
  const prompt = buildQueryPrompt(promptText, images);
449
392
  const model = resolveModel(body.model);
393
+ // Build the in-process MCP server exposing client tools to the SDK.
394
+ // null when toolsEnabled is false (or all tools are malformed).
395
+ const clientToolsServer = toolsEnabled ? buildClientToolsServer(body.tools) : null;
450
396
  if (images.length) console.log(` [multimodal] ${images.length} image block(s)`);
451
- if (toolsEnabled) console.log(` [tools] ${body.tools.length} client tool(s) buffering stream`);
397
+ if (toolsEnabled) console.log(` [tools] ${body.tools.length} client tool(s) registered as MCP`);
452
398
 
453
399
  res.setHeader('Content-Type', 'text/event-stream');
454
400
  res.setHeader('Cache-Control', 'no-cache');
@@ -473,11 +419,17 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
473
419
  console.log(` [session] resuming: ${sessionKey} → sdk=${existing.sdkSessionId} (msgs=${existing.messageCount})`);
474
420
  }
475
421
 
476
- let bufferedText = ''; // only used when toolsEnabled
422
+ // Tools-mode buffers text and collects native tool_use blocks. If the
423
+ // model emits text first then a tool_use, we want both: textBefore as
424
+ // the assistant content, plus the tool_calls. (Most clients display the
425
+ // text and then act on the tool_calls.)
426
+ let bufferedText = '';
427
+ let collectedToolCalls = []; // [{id, name, arguments}] from extractToolUses()
477
428
 
478
429
  const runQuery = async () => {
479
430
  // Reset per-attempt state so a 401 retry starts clean
480
431
  bufferedText = '';
432
+ collectedToolCalls = [];
481
433
  isFirst = true;
482
434
  resolvedModel = model;
483
435
  capturedSessionId = existing?.sdkSessionId || null;
@@ -490,7 +442,18 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
490
442
  permissionMode: 'bypassPermissions',
491
443
  allowDangerouslySkipPermissions: true,
492
444
  abortController,
493
- ...(toolsEnabled ? { allowedTools: [] } : {}),
445
+ // Tools-mode: register client tools as an in-process MCP server
446
+ // and allow only those (no Bash/Read/etc. — the SDK's built-ins
447
+ // would pollute the session and leak through to the model).
448
+ ...(clientToolsServer
449
+ ? {
450
+ mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
451
+ allowedTools: [`${MCP_TOOL_PREFIX}*`],
452
+ }
453
+ : toolsEnabled
454
+ // Tools were requested but none were valid — disable all tools.
455
+ ? { allowedTools: [] }
456
+ : {}),
494
457
  ...(resuming ? { resume: existing.sdkSessionId } : {}),
495
458
  ...(sessionKey && !resuming ? { persistSession: true } : {}),
496
459
  },
@@ -532,15 +495,25 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
532
495
  throw new AuthFailureInResultText(turnText);
533
496
  }
534
497
 
498
+ // Tools-mode: check for native tool_use content blocks. The moment
499
+ // we see one, abort the SDK — we don't want our stub handler to
500
+ // hang waiting on an execution that's actually happening client-side.
501
+ if (toolsEnabled && message.type === 'assistant' && hasToolUse(message)) {
502
+ const calls = extractToolUses(message);
503
+ if (calls.length) {
504
+ collectedToolCalls.push(...calls);
505
+ if (turnText) bufferedText += turnText;
506
+ console.log(` [tools] ${calls.length} native tool_use block(s) — aborting SDK`);
507
+ abortController.abort();
508
+ break;
509
+ }
510
+ }
511
+
535
512
  if (turnText) {
536
513
  if (toolsEnabled) {
514
+ // Buffer text in case it precedes a tool_use, or ends up as the
515
+ // final response when the model decides not to call any tools.
537
516
  bufferedText += turnText;
538
- // Abort early once we see a complete <tool_call>...</tool_call>
539
- if (hasCompleteToolCall(bufferedText)) {
540
- console.log(' [tools] complete tool_call detected — aborting SDK');
541
- abortController.abort();
542
- break;
543
- }
544
517
  } else {
545
518
  sendSSE(res, makeChunk(requestId, resolvedModel, turnText, isFirst ? 'assistant' : undefined, null));
546
519
  isFirst = false;
@@ -586,9 +559,8 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
586
559
  // Tools mode: emit the buffered response as a single chunk with either
587
560
  // tool_calls (+ finish_reason: tool_calls) or plain text (+ stop).
588
561
  if (toolsEnabled && !res.writableEnded) {
589
- const parsed = parseToolCalls(bufferedText);
590
- if (parsed) {
591
- console.log(` [tools] emitting ${parsed.toolCalls.length} tool_call(s)`);
562
+ if (collectedToolCalls.length > 0) {
563
+ console.log(` [tools] emitting ${collectedToolCalls.length} tool_call(s)`);
592
564
  const chunk = {
593
565
  id: `chatcmpl-${requestId}`,
594
566
  object: 'chat.completion.chunk',
@@ -598,8 +570,8 @@ async function handleStreaming(req, res, body, requestId, sessionKey) {
598
570
  index: 0,
599
571
  delta: {
600
572
  role: 'assistant',
601
- content: parsed.textBefore || null,
602
- tool_calls: parsed.toolCalls.map((tc, i) => ({
573
+ content: bufferedText.trim() || null,
574
+ tool_calls: collectedToolCalls.map((tc, i) => ({
603
575
  index: i,
604
576
  id: tc.id,
605
577
  type: 'function',
@@ -634,14 +606,16 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
634
606
  const existing = getSession(sessionKey);
635
607
  const resuming = !!existing?.sdkSessionId;
636
608
  const toolsEnabled = hasTools(body);
637
- const promptText = messagesToPrompt(body.messages, { resuming, tools: toolsEnabled ? body.tools : null });
609
+ const { promptText } = messagesToPrompt(body.messages, { resuming });
638
610
  const images = collectImages(body.messages);
639
611
  const prompt = buildQueryPrompt(promptText, images);
640
612
  const model = resolveModel(body.model);
613
+ const clientToolsServer = toolsEnabled ? buildClientToolsServer(body.tools) : null;
641
614
  if (images.length) console.log(` [multimodal] ${images.length} image block(s)`);
642
- if (toolsEnabled) console.log(` [tools] ${body.tools.length} client tool(s)`);
615
+ if (toolsEnabled) console.log(` [tools] ${body.tools.length} client tool(s) registered as MCP`);
643
616
 
644
617
  let resultText = '';
618
+ let collectedToolCalls = [];
645
619
  let resolvedModel = model;
646
620
  let inputTokens = 0;
647
621
  let outputTokens = 0;
@@ -655,6 +629,7 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
655
629
  const runQuery = async () => {
656
630
  // Reset per-attempt state so a 401 retry starts clean
657
631
  resultText = '';
632
+ collectedToolCalls = [];
658
633
  resolvedModel = model;
659
634
  inputTokens = 0;
660
635
  outputTokens = 0;
@@ -668,7 +643,14 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
668
643
  permissionMode: 'bypassPermissions',
669
644
  allowDangerouslySkipPermissions: true,
670
645
  abortController,
671
- ...(toolsEnabled ? { allowedTools: [] } : {}),
646
+ ...(clientToolsServer
647
+ ? {
648
+ mcpServers: { [MCP_SERVER_NAME]: clientToolsServer },
649
+ allowedTools: [`${MCP_TOOL_PREFIX}*`],
650
+ }
651
+ : toolsEnabled
652
+ ? { allowedTools: [] }
653
+ : {}),
672
654
  ...(resuming ? { resume: existing.sdkSessionId } : {}),
673
655
  ...(sessionKey && !resuming ? { persistSession: true } : {}),
674
656
  },
@@ -696,11 +678,15 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
696
678
  abortController.abort();
697
679
  throw new AuthFailureInResultText(resultText);
698
680
  }
699
- // Abort early once we see a complete <tool_call>...</tool_call>
700
- if (toolsEnabled && hasCompleteToolCall(resultText)) {
701
- console.log(' [tools] complete tool_call detected — aborting SDK');
702
- abortController.abort();
703
- break;
681
+ // Native tool_use detection: abort the moment a tool_use lands.
682
+ if (toolsEnabled && hasToolUse(message)) {
683
+ const calls = extractToolUses(message);
684
+ if (calls.length) {
685
+ collectedToolCalls.push(...calls);
686
+ console.log(` [tools] ${calls.length} native tool_use block(s) — aborting SDK`);
687
+ abortController.abort();
688
+ break;
689
+ }
704
690
  }
705
691
  }
706
692
 
@@ -740,32 +726,29 @@ async function handleNonStreaming(res, body, requestId, sessionKey) {
740
726
  if (sessionKey) responseHeaders['X-Session-Id'] = sessionKey;
741
727
 
742
728
  // Tool-calling response shape
743
- if (toolsEnabled) {
744
- const parsed = parseToolCalls(resultText);
745
- if (parsed) {
746
- console.log(` [tools] emitting ${parsed.toolCalls.length} tool_call(s)`);
747
- return res.set(responseHeaders).json({
748
- id: `chatcmpl-${requestId}`,
749
- object: 'chat.completion',
750
- created: Math.floor(Date.now() / 1000),
751
- model: normalizeModelName(resolvedModel),
752
- choices: [{
753
- index: 0,
754
- message: {
755
- role: 'assistant',
756
- content: parsed.textBefore || null,
757
- tool_calls: parsed.toolCalls.map((tc) => ({
758
- id: tc.id,
759
- type: 'function',
760
- function: { name: tc.name, arguments: tc.arguments },
761
- })),
762
- },
763
- finish_reason: 'tool_calls',
764
- }],
765
- usage: { prompt_tokens: inputTokens, completion_tokens: outputTokens, total_tokens: inputTokens + outputTokens },
766
- });
767
- }
768
- // No tool_call tags → fall through to normal text response
729
+ if (toolsEnabled && collectedToolCalls.length > 0) {
730
+ console.log(` [tools] emitting ${collectedToolCalls.length} tool_call(s)`);
731
+ return res.set(responseHeaders).json({
732
+ id: `chatcmpl-${requestId}`,
733
+ object: 'chat.completion',
734
+ created: Math.floor(Date.now() / 1000),
735
+ model: normalizeModelName(resolvedModel),
736
+ choices: [{
737
+ index: 0,
738
+ message: {
739
+ role: 'assistant',
740
+ content: resultText.trim() || null,
741
+ tool_calls: collectedToolCalls.map((tc) => ({
742
+ id: tc.id,
743
+ type: 'function',
744
+ function: { name: tc.name, arguments: tc.arguments },
745
+ })),
746
+ },
747
+ finish_reason: 'tool_calls',
748
+ }],
749
+ usage: { prompt_tokens: inputTokens, completion_tokens: outputTokens, total_tokens: inputTokens + outputTokens },
750
+ });
751
+ // No tool_use blocks: fall through to normal text response
769
752
  }
770
753
 
771
754
  res.set(responseHeaders).json({
@@ -1090,6 +1073,62 @@ app.get('/dashboard/logs', async (req, res) => {
1090
1073
  }
1091
1074
  });
1092
1075
 
1076
+ // ---------------------------------------------------------------------------
1077
+ // Updater — dashboard-driven "update available → update now" flow
1078
+ // ---------------------------------------------------------------------------
1079
+
1080
+ // GET /update/check — is there a newer mobygate on npm?
1081
+ // Response: { current, latest, updateAvailable, installMode, canApply, cached, error }
1082
+ // Safe to poll: the npm registry call is cached for 15 min in-process.
1083
+ app.get('/update/check', async (req, res) => {
1084
+ try {
1085
+ const force = req.query.force === '1' || req.query.force === 'true';
1086
+ const info = await getUpdateCheck({ force });
1087
+ res.json(info);
1088
+ } catch (e) {
1089
+ res.status(500).json({ error: e.message });
1090
+ }
1091
+ });
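The comment on `/update/check` says the npm registry lookup is cached for 15 minutes in-process, with `force` bypassing the cache. The caching code itself lives in `getLatestVersion()` and is not shown here, so the following is only a sketch of one way to get that behaviour. `createTtlCache` is a hypothetical helper, and the injectable `now` clock exists purely to make the example deterministic.

```javascript
// Hypothetical single-entry TTL cache with a force-refresh escape hatch.
function createTtlCache(ttlMs, now = Date.now) {
  let entry = null; // { value, at }
  return function get(loader, { force = false } = {}) {
    const fresh = entry !== null && now() - entry.at < ttlMs;
    if (fresh && !force) return { value: entry.value, cached: true };
    // Miss, expired, or forced: reload and remember when we did.
    entry = { value: loader(), at: now() };
    return { value: entry.value, cached: false };
  };
}

// Usage with a fake clock and a counting loader.
let fakeNow = 0;
let lookups = 0;
const FIFTEEN_MIN = 15 * 60 * 1000;
const getLatest = createTtlCache(FIFTEEN_MIN, () => fakeNow);
const lookup = () => { lookups += 1; return '0.6.1'; };

const first = getLatest(lookup);                     // miss: performs the lookup
const second = getLatest(lookup);                    // hit: served from cache
fakeNow = FIFTEEN_MIN;                               // TTL elapsed
const third = getLatest(lookup);                     // expired: looks up again
const forced = getLatest(lookup, { force: true });   // bypasses a fresh entry
```

If `loader` returns a promise, the cache stores the promise itself, which also collapses concurrent callers onto one in-flight request; whether the real implementation does this is not visible in the diff.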
1092
+
1093
+ // POST /update/apply — fire the update in a detached child process.
1094
+ // We return immediately with { started, pid }. The child runs
1095
+ // `npm install -g mobygate@latest` (or `git pull && npm install`), then
1096
+ // restarts the service — which kills us. The dashboard polls
1097
+ // /update/status to show progress and reconnects once the new server is up.
1098
+ app.post('/update/apply', (_req, res) => {
1099
+ try {
1100
+ const result = applyUpdate({});
1101
+ const status = result.started ? 202 : 409;
1102
+ res.status(status).json({ ...result, currentVersion: getCurrentVersion() });
1103
+ if (result.started) {
1104
+ dashboardBus.emitEvent({ type: 'update.started', pid: result.pid, mode: result.mode });
1105
+ }
1106
+ } catch (e) {
1107
+ res.status(500).json({ started: false, error: e.message });
1108
+ }
1109
+ });
1110
+
1111
+ // GET /update/status — progress for a running (or just-finished) update.
1112
+ // The dashboard polls this during apply. `running` is determined by
1113
+ // PID liveness, so even if our process is the one getting restarted,
1114
+ // the new one answers correctly.
1115
+ app.get('/update/status', (req, res) => {
1116
+ const state = readUpdateState();
1117
+ let running = false;
1118
+ if (state.pid) {
1119
+ try { process.kill(state.pid, 0); running = true; } catch {}
1120
+ }
1121
+ const lines = Math.min(1000, Math.max(1, parseInt(req.query.lines, 10) || 200));
1122
+ res.json({
1123
+ running,
1124
+ pid: state.pid || null,
1125
+ startedAt: state.startedAt || null,
1126
+ mode: state.mode || null,
1127
+ lines: readUpdateLogTail({ lines }),
1128
+ currentVersion: getCurrentVersion(),
1129
+ });
1130
+ });
1131
+
1093
1132
  // ---------------------------------------------------------------------------
1094
1133
  // Start
1095
1134
  // ---------------------------------------------------------------------------