mobygate 0.5.3 → 0.6.1

package/CHANGELOG.md CHANGED
@@ -4,6 +4,94 @@ All notable changes to mobygate are documented here. Format loosely follows
4
4
  [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
5
5
  [Semantic Versioning](https://semver.org/).
6
6
 
7
+ ## [0.6.1] — 2026-04-24
8
+
9
+ ### Fixed
10
+
11
+ - **`mobygate update` on Windows** failed with `spawnSync npm ENOENT`
12
+ because `npm` resolves to `npm.cmd` (a batch file) on Windows, and
13
+ Node's `spawn` won't pick up `.cmd` extensions without going through
14
+ cmd.exe. Added `shell: IS_WIN` to every npm/git invocation in the
15
+ CLI's update path. The dashboard's update endpoint already had this
16
+ fix in v0.6.0; now the CLI matches.
17
+
18
+ ## [0.6.0] — 2026-04-24
19
+
20
+ Big one. Native tool calling + in-dashboard self-update.
21
+
22
+ ### Added
23
+
24
+ - **Native MCP tool calling.** Client-supplied OpenAI tools are now
25
+ registered with the Claude Agent SDK as in-process MCP tools (with
26
+ Zod schemas converted from JSON Schema). The model emits genuine
27
+ `tool_use` content blocks instead of the old `<tool_call>...`
28
+ text-pattern hack. Tool IDs returned to clients are now Anthropic-
29
+ native `toolu_*` strings, not synthesized `call_*` ones. New module:
30
+ `lib/tool-bridge.js`.
31
+ - **Dashboard update banner.** When npm has a newer mobygate, the
32
+ dashboard shows an orange pill at the top: `v0.6.0 → v0.6.1 available
33
+ · npm install · [changelog] [dismiss] [update now]`. Clicking
34
+ "update now" fires `npm install -g mobygate@latest` (or `git pull`
35
+ for git-mode installs) in a detached child process, restarts the
36
+ service, and auto-reloads the page. Dismissals stick per-version
37
+ via localStorage. New module: `lib/updater.js`.
38
+ - New endpoints: `GET /update/check`, `POST /update/apply`,
39
+ `GET /update/status`. The check endpoint caches the npm registry
40
+ lookup for 15 minutes so dashboards open all day don't hammer it.
41
+
42
+ ### Changed
43
+
44
+ - **No more prompt-injected tool definitions.** The `<system>...</system>`
45
+ block listing available tools as XML is gone — the SDK's MCP
46
+ registration is the model's source of truth now. This shrinks every
47
+ tool-enabled prompt by ~200-500 tokens depending on tool count.
48
+ - **Tool-flow detection** moved from text-pattern matching
49
+ (`hasCompleteToolCall`, `parseToolCalls` regexes) to native
50
+ `tool_use` content-block detection in the assistant message stream
51
+ (`hasToolUse`, `extractToolUses`). The moment a tool_use lands,
52
+ we abort the SDK and emit OpenAI-shape `tool_calls`.
53
+ - **`alwaysLoad: true`** on every registered tool. Without this, the
54
+ SDK lazily defers MCP tool schemas — the model has to call the
55
+ built-in `ToolSearch` tool to fetch each definition before invoking,
56
+ which leaks through to OpenAI clients as a confusing tool_call
57
+ for `ToolSearch` instead of their actual tool. Eager loading
58
+ keeps the surface clean.
59
+
60
+ ### Removed
61
+
62
+ - `buildToolInstructions` — the `<tool_call>...` protocol prose.
63
+ - `parseToolCalls` — the regex parser for `<tool_call>` JSON blocks.
64
+ - `hasCompleteToolCall` — the streaming-buffer heuristic that aborted
65
+ the SDK when a complete tag pair appeared.
66
+ - `formatAssistantForReplay`'s tool_calls→`<tool_call>` text
67
+ serialization (assistant replay is now best-effort text only).
68
+ - The "Use the tool results above to continue toward the final answer"
69
+ nudge — tool results are visible in conversation context now, so
70
+ the model handles continuation naturally without coaxing.
71
+
72
+ ### Known limitation (Phase 1 deliberate)
73
+
74
+ - Tool *results* coming back from the client are still spliced as
75
+ `<tool_results>` text in the resumed prompt, not native Anthropic
76
+ `tool_result` content blocks. Reason: aborting the SDK on a
77
+ `tool_use` block prevents the assistant turn from being persisted
78
+ in session state — on resume, native tool_result blocks have
79
+ nothing to bind to and the model re-calls the tool. Text-form
80
+ results work because the resumed model has the prior turn in
81
+ context. Phase 2's full Anthropic Messages wire surface will
82
+ keep the SDK alive through the tool turn and switch to native
83
+ tool_result blocks end-to-end.
84
+
85
+ ### Migration
86
+
87
+ - No client-facing changes. Existing OpenAI-shape requests with
88
+ `tools: [...]` work the same as before — what's improved is
89
+ reliability ("Model returned empty after tool calls" warnings
90
+ should largely disappear) and surface fidelity (tool_call IDs
91
+ are now native Anthropic IDs, not synthesized).
92
+ - Update with `mobygate update` (CLI) or click the new "update now"
93
+ button in the dashboard once it appears.
94
+
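The 15-minute cache described for `GET /update/check` can be pictured as a small time-to-live wrapper around the registry lookup. This is a minimal sketch, not the code in `lib/updater.js` (which this diff does not show); `createCachedLookup`, `fetchLatest`, and the injectable `now` clock are hypothetical names chosen for illustration.

```javascript
// Hypothetical sketch of a TTL cache; names are illustrative only.
const CACHE_TTL_MS = 15 * 60 * 1000; // 15 minutes, as stated in the changelog

function createCachedLookup(fetchLatest, { ttlMs = CACHE_TTL_MS, now = Date.now } = {}) {
  let cached = null; // { value, at } once the first lookup has run

  // Returns the cached value while it is fresh; `force` bypasses the cache,
  // similar in spirit to the dashboard's `?force=1` query parameter.
  // With an async fetcher this caches the promise itself; a fuller version
  // would drop the entry if that promise rejects.
  return function lookup({ force = false } = {}) {
    if (!force && cached && now() - cached.at < ttlMs) return cached.value;
    const value = fetchLatest();
    cached = { value, at: now() };
    return value;
  };
}

// Usage: the second call inside the TTL window does not hit the fetcher again.
let registryHits = 0;
const latestVersion = createCachedLookup(() => { registryHits += 1; return '0.6.1'; });
latestVersion();
latestVersion();
console.log(registryHits);
```

Injecting the clock keeps the expiry logic testable without waiting 15 minutes.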
7
95
  ## [0.5.3] — 2026-04-19
8
96
 
9
97
  Security pass.
package/bin/mobygate.js CHANGED
@@ -564,8 +564,13 @@ async function cmdUpdate() {
564
564
  print(c.dim(`Current: v${pkg.version} · ${mode} install at ${REPO_ROOT}`));
565
565
 
566
566
  // ---- Look up latest published version on npm
567
+ // shell: IS_WIN is required on Windows because `npm` is `npm.cmd`
568
+ // (a batch file), and Node's spawn won't resolve .cmd extensions
569
+ // without going through cmd.exe. Same for git on Windows where some
570
+ // distributions install git as a shim. On macOS/Linux these are real
571
+ // binaries, so the flag is a no-op.
567
572
  info('Checking npm for the latest release...');
568
- const view = spawnSync('npm', ['view', 'mobygate', 'version'], { encoding: 'utf8', timeout: 10_000 });
573
+ const view = spawnSync('npm', ['view', 'mobygate', 'version'], { encoding: 'utf8', timeout: 10_000, shell: IS_WIN });
569
574
  if (view.status !== 0) {
570
575
  return die(`Couldn't reach npm registry: ${view.stderr?.trim() || view.error?.message || 'unknown'}`);
571
576
  }
@@ -579,15 +584,15 @@ async function cmdUpdate() {
579
584
  // ---- Perform the upgrade
580
585
  if (mode === 'npm') {
581
586
  info(`Running \`npm install -g mobygate@latest\`...`);
582
- const r = spawnSync('npm', ['install', '-g', 'mobygate@latest'], { stdio: 'inherit' });
587
+ const r = spawnSync('npm', ['install', '-g', 'mobygate@latest'], { stdio: 'inherit', shell: IS_WIN });
583
588
  if (r.status !== 0) return die('npm install failed. See output above.');
584
589
  ok(`Installed mobygate@${latest}`);
585
590
  } else if (mode === 'git') {
586
591
  info(`Running \`git pull\` in ${REPO_ROOT}...`);
587
- const pull = spawnSync('git', ['-C', REPO_ROOT, 'pull', '--ff-only'], { stdio: 'inherit' });
592
+ const pull = spawnSync('git', ['-C', REPO_ROOT, 'pull', '--ff-only'], { stdio: 'inherit', shell: IS_WIN });
588
593
  if (pull.status !== 0) return die('git pull failed. Resolve conflicts and retry.');
589
594
  info(`Running \`npm install\`...`);
590
- const install = spawnSync('npm', ['install'], { cwd: REPO_ROOT, stdio: 'inherit' });
595
+ const install = spawnSync('npm', ['install'], { cwd: REPO_ROOT, stdio: 'inherit', shell: IS_WIN });
591
596
  if (install.status !== 0) return die('npm install failed. See output above.');
592
597
  ok(`Pulled and installed. See git log for what changed.`);
593
598
  } else {
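The hunks above add `shell: IS_WIN` at four separate call sites. One way to keep that from being forgotten on a future call is a thin wrapper, sketched below. `runTool` is a hypothetical helper, and the definition of `IS_WIN` is an assumption (the diff uses the constant but does not show where it is defined).

```javascript
const { spawnSync } = require('node:child_process');

// Assumption: IS_WIN is derived from process.platform.
const IS_WIN = process.platform === 'win32';

// Hypothetical helper: applies the Windows shell flag in one place.
// When a shell is used, Node concatenates the arguments rather than escaping
// them, so this only suits fixed argument lists like the ones in the diff.
function runTool(command, args, options = {}) {
  return spawnSync(command, args, { encoding: 'utf8', ...options, shell: IS_WIN });
}

// Usage: `node` stands in for `npm` or `git` so the sketch runs anywhere Node does.
const versionCheck = runTool('node', ['--version']);
if (versionCheck.status !== 0) {
  console.error(versionCheck.stderr?.trim() || versionCheck.error?.message || 'unknown');
} else {
  console.log(versionCheck.stdout.trim());
}
```

Placing `shell: IS_WIN` after the spread means a caller cannot accidentally override it.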
package/index.html CHANGED
@@ -49,6 +49,34 @@
49
49
  <body class="antialiased">
50
50
  <div class="mx-auto px-12 pt-8 pb-7 flex flex-col gap-6 max-w-[1440px] min-h-screen">
51
51
 
52
+ <!-- ===== Update banner ===== -->
53
+ <!-- Hidden until /update/check reports updateAvailable=true. During
54
+ apply it is hidden and the #updateProgress shelf below shows the log tail. -->
55
+ <section id="updateBanner" style="display:none" class="items-center gap-4 py-3 px-5 bg-[#121210] border-l-2 border-l-[#E89B2E] border-t border-b border-r border-[#2A2A1F] rounded-r-md">
56
+ <div class="flex items-center gap-2.5">
57
+ <span class="rounded-full bg-[#E89B2E] w-2 h-2 pulse-dot"></span>
58
+ <span class="uppercase text-[#E89B2E] font-medium text-[10px] tracking-[0.22em]">Update</span>
59
+ </div>
60
+ <div id="updateBannerText" class="grow text-[#F3EFE4] text-xs leading-4"></div>
61
+ <div id="updateBannerActions" class="flex items-center gap-2 shrink-0">
62
+ <a id="updateBannerChangelog" href="https://github.com/khnfrhn/mobygate/blob/master/CHANGELOG.md" target="_blank" rel="noreferrer" class="text-[#8A9A6A] hover:text-[#C9D9A8] text-[11px] tracking-[0.04em] underline decoration-dotted">changelog</a>
63
+ <button id="updateDismissBtn" class="rounded-full py-1.5 px-3 border border-[#2A2A1F] text-[#8A9A6A] hover:text-[#C9D9A8] hover:border-[#5A5F54] font-medium text-[11px] tracking-[0.04em] transition">dismiss</button>
64
+ <button id="updateApplyBtn" class="rounded-full py-1.5 px-3.5 bg-[#E89B2E] hover:brightness-110 text-[#0B0B09] font-bold text-[11px] tracking-[0.04em] transition">update now</button>
65
+ </div>
66
+ </section>
67
+ <!-- Apply-in-progress shelf: expands below the banner during update. -->
68
+ <section id="updateProgress" style="display:none" class="flex-col gap-2 py-3 px-5 bg-[#121210] border border-[#2A2A1F] rounded-md">
69
+ <div class="flex items-center justify-between">
70
+ <div class="flex items-center gap-2">
71
+ <span id="updateSpinner" class="rounded-full bg-[#E89B2E] w-2 h-2 pulse-dot"></span>
72
+ <span id="updateProgressTitle" class="uppercase text-[#C9D9A8] font-medium text-[10px] tracking-[0.22em]">Installing</span>
73
+ <span id="updateProgressSub" class="text-[#5A5F54] text-[11px]"></span>
74
+ </div>
75
+ <button id="updateProgressClose" style="display:none" class="text-[#5A5F54] hover:text-[#C9D9A8] text-[11px]">close ✕</button>
76
+ </div>
77
+ <pre id="updateProgressLog" class="text-[11px] leading-[15px] text-[#8A9A6A] max-h-[180px] overflow-auto whitespace-pre-wrap m-0"></pre>
78
+ </section>
79
+
52
80
  <!-- ===== Header ===== -->
53
81
  <header class="flex justify-between items-center shrink-0">
54
82
  <div class="flex items-center gap-[22px]">
@@ -804,6 +832,117 @@
804
832
  }
805
833
  }, 1000);
806
834
 
835
+ // ───────────────────────── Updater
836
+ // Dashboard-driven upgrade flow. On load (and every 30 min) we ask
837
+ // /update/check whether a newer mobygate is on npm. If so, a pill
838
+ // appears at the top of the page — click "update now" to fire the
839
+ // update, watch log lines stream in, then auto-reload when the new
840
+ // server is up. The child process is detached, so the server
841
+ // restart doesn't kill it mid-install.
842
+ const UPDATE_DISMISS_KEY = 'mobygate:update:dismissedVersion';
843
+ let updateInfo = null;
844
+ let updatePollTimer = null;
845
+
846
+ function showBanner(info) {
847
+ if (!info?.updateAvailable) {
848
+ $('updateBanner').style.display = 'none';
849
+ return;
850
+ }
851
+ // Respect dismissal: if the user dismissed this exact version, don't
852
+ // re-pester until a newer one lands.
853
+ const dismissed = localStorage.getItem(UPDATE_DISMISS_KEY);
854
+ if (dismissed === info.latest) {
855
+ $('updateBanner').style.display = 'none';
856
+ return;
857
+ }
858
+ const msg = info.canApply
859
+ ? `v${escHtml(info.current)} → <span class="text-[#B7E56D]">v${escHtml(info.latest)}</span> available · <span class="text-[#5A5F54]">${escHtml(info.installMode)} install</span>`
860
+ : `v${escHtml(info.current)} → <span class="text-[#B7E56D]">v${escHtml(info.latest)}</span> available · <span class="text-[#E89B2E]">${escHtml(info.installMode)} install — update manually</span>`;
861
+ $('updateBannerText').innerHTML = msg;
862
+ $('updateApplyBtn').style.display = info.canApply ? '' : 'none';
863
+ $('updateBanner').style.display = 'flex';
864
+ }
865
+
866
+ async function checkForUpdates({ force = false } = {}) {
867
+ try {
868
+ const r = await fetch(`/update/check${force ? '?force=1' : ''}`);
869
+ if (!r.ok) return;
870
+ updateInfo = await r.json();
871
+ showBanner(updateInfo);
872
+ } catch (e) { /* offline is fine */ }
873
+ }
874
+
875
+ function renderUpdateLog(lines) {
876
+ const el = $('updateProgressLog');
877
+ el.textContent = (lines || []).join('\n');
878
+ // Pin to bottom so the user sees the latest line.
879
+ el.scrollTop = el.scrollHeight;
880
+ }
881
+
882
+ async function pollUpdateStatus() {
883
+ try {
884
+ const r = await fetch('/update/status?lines=200');
885
+ if (!r.ok) return;
886
+ const s = await r.json();
887
+ renderUpdateLog(s.lines);
888
+ if (!s.running) {
889
+ // Update finished. The service restart may have already swapped
890
+ // the running binary — our `currentVersion` reflects whatever
891
+ // server answered. If it matches `latest`, celebrate. Either
892
+ // way, give it a moment then reload so the dashboard comes
893
+ // back on the new code path.
894
+ clearInterval(updatePollTimer); updatePollTimer = null;
895
+ $('updateSpinner').classList.remove('pulse-dot');
896
+ $('updateSpinner').classList.remove('bg-[#E89B2E]');
897
+ $('updateSpinner').classList.add('bg-[#B7E56D]');
898
+ $('updateProgressTitle').textContent = 'Installed';
899
+ $('updateProgressSub').textContent = `now on v${s.currentVersion} — reloading in 3s…`;
900
+ $('updateProgressClose').style.display = '';
901
+ setTimeout(() => location.reload(), 3000);
902
+ }
903
+ } catch (e) {
904
+ // Server is mid-restart — keep polling, it'll come back.
905
+ }
906
+ }
907
+
908
+ function startUpdateProgress(mode) {
909
+ $('updateBanner').style.display = 'none';
910
+ $('updateProgress').style.display = 'flex';
911
+ $('updateProgressSub').textContent = mode ? `(${mode} install)` : '';
912
+ $('updateProgressTitle').textContent = 'Installing';
913
+ $('updateSpinner').classList.add('pulse-dot');
914
+ $('updateProgressLog').textContent = 'starting update…';
915
+ if (updatePollTimer) clearInterval(updatePollTimer);
916
+ updatePollTimer = setInterval(pollUpdateStatus, 1500);
917
+ pollUpdateStatus();
918
+ }
919
+
920
+ $('updateApplyBtn')?.addEventListener('click', async () => {
921
+ $('updateApplyBtn').disabled = true;
922
+ try {
923
+ const r = await fetch('/update/apply', { method: 'POST' });
924
+ const j = await r.json().catch(() => ({}));
925
+ if (!r.ok || !j.started) {
926
+ $('updateBannerText').innerHTML += ` <span class="text-[#E89B2E]">— ${escHtml(j.error || 'update failed to start')}</span>`;
927
+ $('updateApplyBtn').disabled = false;
928
+ return;
929
+ }
930
+ startUpdateProgress(j.mode);
931
+ } catch (e) {
932
+ $('updateBannerText').innerHTML += ` <span class="text-[#E89B2E]">— ${escHtml(e.message)}</span>`;
933
+ $('updateApplyBtn').disabled = false;
934
+ }
935
+ });
936
+
937
+ $('updateDismissBtn')?.addEventListener('click', () => {
938
+ if (updateInfo?.latest) localStorage.setItem(UPDATE_DISMISS_KEY, updateInfo.latest);
939
+ $('updateBanner').style.display = 'none';
940
+ });
941
+
942
+ $('updateProgressClose')?.addEventListener('click', () => {
943
+ $('updateProgress').style.display = 'none';
944
+ });
945
+
807
946
  // Kick off
808
947
  loadSnapshot();
809
948
  loadAuth({ verify: false });
@@ -811,6 +950,21 @@
811
950
  loadLogs();
812
951
  armLogAutoRefresh();
813
952
  connectStream();
953
+ // Surface update availability on load + every 30 min. The backend
954
+ // caches the npm registry lookup for 15 min, so this doesn't hammer
955
+ // the registry even with the dashboard open all day.
956
+ checkForUpdates();
957
+ setInterval(() => checkForUpdates(), 30 * 60 * 1000);
958
+ // If an update is in-flight when the page loads (e.g., user refreshed
959
+ // mid-apply), pick up where it left off.
960
+ (async () => {
961
+ try {
962
+ const r = await fetch('/update/status?lines=50');
963
+ if (!r.ok) return;
964
+ const s = await r.json();
965
+ if (s.running) startUpdateProgress(s.mode);
966
+ } catch {}
967
+ })();
814
968
  </script>
815
969
  </body>
816
970
  </html>
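The visibility rules in `showBanner()` (no update, dismissed version, `canApply`) can be separated from the DOM writes, which makes them easy to check in isolation. The sketch below is a hypothetical refactoring, not code from the package; `bannerState` and its return shape are invented for illustration.

```javascript
// Hypothetical pure helper mirroring the branches in showBanner().
// `info` is the JSON from /update/check; `dismissedVersion` is the value
// stored under the 'mobygate:update:dismissedVersion' localStorage key.
function bannerState(info, dismissedVersion) {
  // Nothing to offer: keep the banner hidden.
  if (!info || !info.updateAvailable) return { visible: false, showApply: false };
  // The user dismissed this exact version; stay quiet until a newer one lands.
  if (dismissedVersion === info.latest) return { visible: false, showApply: false };
  // Show the banner; only offer "update now" when the server can apply it.
  return { visible: true, showApply: Boolean(info.canApply) };
}

// Usage: a dismissal of 0.6.0 does not suppress the 0.6.1 banner.
console.log(bannerState({ updateAvailable: true, latest: '0.6.1', canApply: true }, '0.6.0'));
```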
package/lib/tool-bridge.js ADDED
@@ -0,0 +1,257 @@
1
+ /**
2
+ * Native tool bridge — translates between OpenAI client tools and the
3
+ * Claude Agent SDK's MCP-tool model.
4
+ *
5
+ * Why this exists (Phase 1 of the mobygate native-tools refactor):
6
+ *
7
+ * Until now, mobygate handled client-supplied tools by injecting their
8
+ * schemas into the system prompt as <tool> XML and instructing the model
9
+ * to emit <tool_call>{...}</tool_call> tags in its text output. We then
10
+ * regex-parsed those tags. Fragile in obvious ways: the model sometimes
11
+ * wrapped tags in code fences, sometimes hallucinated partial blocks,
12
+ * and the "empty after tool_results" nudge existed to paper over the
13
+ * model treating bare <tool_results> as inert data.
14
+ *
15
+ * The SDK actually supports native tool definitions via MCP — but its
16
+ * MCP model assumes the **handler runs in-process** and returns a
17
+ * synchronous result. Our case is different: we're a proxy. The actual
18
+ * tool implementations live on the *other* side of an HTTP boundary,
19
+ * inside the client (Hermes / OpenClaw / etc.). We can't run them.
20
+ *
21
+ * The trick: register client tools as MCP tools with stub handlers that
22
+ * never resolve. The model emits **native** `tool_use` content blocks
23
+ * (in the SDKAssistantMessage stream, not buried in text). We watch the
24
+ * stream, abort the SDK on the first complete `tool_use`, and surface
25
+ * it to the client as an OpenAI `tool_calls` response. The stub handler
26
+ * is then aborted via the SDK's signal — we never actually execute it,
27
+ * the client does.
28
+ *
29
+ * The other end of the round-trip: when the client sends a follow-up
30
+ * request with tool results (role:'tool' messages), Phase 1 splices
31
+ * them into the resumed prompt as <tool_results> text; see
32
+ * toolMessagesToText() below for why. Native `tool_result` content
33
+ * blocks inside an SDKUserMessage are deferred to Phase 2.
34
+ *
35
+ * Names round-trip via the MCP prefix convention. A client tool named
36
+ * `getWeather` is registered as `mcp__mobygate__getWeather` with the
37
+ * SDK; the model emits tool_use blocks under that prefixed name; we
38
+ * strip the prefix on the way back so the client sees its original name.
39
+ */
40
+
41
+ import { z } from 'zod';
42
+ import { tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
43
+
44
+ export const MCP_SERVER_NAME = 'mobygate';
45
+ export const MCP_TOOL_PREFIX = `mcp__${MCP_SERVER_NAME}__`;
46
+
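The name round-trip described in the header comment amounts to adding the prefix on registration and stripping it on the way back. A minimal sketch follows; the two constants are copied from the module, while `toSdkToolName` and `toClientToolName` are hypothetical helpers (the module itself only strips the prefix inline in `extractToolUses`).

```javascript
const MCP_SERVER_NAME = 'mobygate';
const MCP_TOOL_PREFIX = `mcp__${MCP_SERVER_NAME}__`;

// Hypothetical helper: the name the model sees for a client tool.
function toSdkToolName(clientName) {
  return `${MCP_TOOL_PREFIX}${clientName}`;
}

// Hypothetical helper: the name handed back to the client.
// Names without the prefix pass through unchanged.
function toClientToolName(sdkName) {
  return sdkName.startsWith(MCP_TOOL_PREFIX)
    ? sdkName.slice(MCP_TOOL_PREFIX.length)
    : sdkName;
}

console.log(toSdkToolName('getWeather'));
console.log(toClientToolName('mcp__mobygate__getWeather'));
```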
47
+ // ---------------------------------------------------------------------------
48
+ // JSON Schema → Zod RawShape
49
+ // ---------------------------------------------------------------------------
50
+ // The SDK's `tool()` helper takes a Zod RawShape (a record of ZodTypes,
51
+ // like `{name: z.string(), age: z.number()}`) — NOT a JSON Schema object.
52
+ // OpenAI clients send JSON Schema (`{type:'object', properties:{...}, required:[...]}`),
53
+ // so we need to convert. This handles the common cases that cover ~95% of
54
+ // real-world tool schemas; anything weirder falls through to z.unknown().
55
+
56
+ function jsonSchemaPropToZod(prop) {
57
+ if (!prop || typeof prop !== 'object') return z.unknown();
58
+
59
+ // Handle enums up front — they apply across types.
60
+ if (Array.isArray(prop.enum) && prop.enum.length > 0) {
61
+ const stringy = prop.enum.every((v) => typeof v === 'string');
62
+ if (stringy) return z.enum(prop.enum);
63
+ // mixed-type enums fall through to z.union of literals
64
+ return z.union(prop.enum.map((v) => z.literal(v)));
65
+ }
66
+
67
+ switch (prop.type) {
68
+ case 'string': return z.string();
69
+ case 'number': return z.number();
70
+ case 'integer': return z.number().int();
71
+ case 'boolean': return z.boolean();
72
+ case 'null': return z.null();
73
+ case 'array': {
74
+ const item = prop.items ? jsonSchemaPropToZod(prop.items) : z.unknown();
75
+ return z.array(item);
76
+ }
77
+ case 'object': {
78
+ const shape = jsonSchemaToZodShape(prop);
79
+ return z.object(shape).passthrough();
80
+ }
81
+ default: return z.unknown();
82
+ }
83
+ }
84
+
85
+ /**
86
+ * Convert a JSON Schema *object* (with `properties` + `required`) into
87
+ * a Zod RawShape suitable for the SDK's `tool()` helper.
88
+ *
89
+ * Returns an empty shape `{}` when the schema isn't an object — the
90
+ * caller will pass this to `tool()`, and the model will see "no
91
+ * structured input expected." That's the right default for tool defs
92
+ * that arrive without a properties block (which OpenAI permits).
93
+ */
94
+ export function jsonSchemaToZodShape(schema) {
95
+ if (!schema || schema.type !== 'object' || !schema.properties) return {};
96
+ const shape = {};
97
+ const required = new Set(Array.isArray(schema.required) ? schema.required : []);
98
+ for (const [key, prop] of Object.entries(schema.properties)) {
99
+ let zType = jsonSchemaPropToZod(prop);
100
+ if (!required.has(key)) zType = zType.optional();
101
+ if (prop?.description) zType = zType.describe(prop.description);
102
+ shape[key] = zType;
103
+ }
104
+ return shape;
105
+ }
106
+
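Because unsupported constructs silently become `z.unknown()`, it can help to see which parts of a client schema would take that path. The dependency-free sketch below walks a JSON Schema the same way the converter above does and lists the paths that would hit the `default` branch. `findFallthroughs` is a hypothetical diagnostic, not part of the module, and it only approximates the converter (for example, it does not model the empty-shape return for non-object roots).

```javascript
// Primitive types the converter maps directly to a Zod type.
const HANDLED_PRIMITIVES = new Set(['string', 'number', 'integer', 'boolean', 'null']);

// Hypothetical diagnostic: returns the paths that would end up as z.unknown().
function findFallthroughs(prop, path = '$', out = []) {
  if (!prop || typeof prop !== 'object') { out.push(path); return out; }
  // Non-empty enums are handled before the type switch.
  if (Array.isArray(prop.enum) && prop.enum.length > 0) return out;
  if (HANDLED_PRIMITIVES.has(prop.type)) return out;
  if (prop.type === 'array') {
    if (prop.items) findFallthroughs(prop.items, `${path}[]`, out);
    else out.push(`${path}[]`); // array items default to unknown
    return out;
  }
  if (prop.type === 'object') {
    for (const [key, child] of Object.entries(prop.properties || {})) {
      findFallthroughs(child, `${path}.${key}`, out);
    }
    return out;
  }
  out.push(path); // e.g. anyOf/oneOf, a type array, or a missing type
  return out;
}

// Usage with an illustrative tool schema.
const sampleParameters = {
  type: 'object',
  properties: {
    city: { type: 'string' },
    when: { anyOf: [{ type: 'string' }, { type: 'number' }] },
    tags: { type: 'array', items: { type: 'string' } },
    coords: { type: 'array' },
    unit: { enum: ['c', 'f'] },
    opts: { type: 'object', properties: { label: { type: ['string', 'null'] } } },
  },
  required: ['city'],
};
console.log(findFallthroughs(sampleParameters));
```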
107
+ // ---------------------------------------------------------------------------
108
+ // Build the MCP server that exposes client tools to the SDK
109
+ // ---------------------------------------------------------------------------
110
+
111
+ /**
112
+ * Stub handler. The model emits a tool_use block, the SDK calls us, but
113
+ * we don't actually have an implementation to run — the client does.
114
+ * So we wait. The stream-watcher in server.js will abort the SDK as
115
+ * soon as it sees the tool_use block, which propagates here as a signal
116
+ * abort. We reject and the SDK cleans up.
117
+ *
118
+ * The 30s safety timeout is for the (rare) case where the SDK fires our
119
+ * handler but the abort never propagates back — we don't want to leak
120
+ * a Promise forever. 30s is well past any reasonable abort latency.
121
+ */
122
+ function deferredToolHandler(_args, extra) {
123
+ return new Promise((resolve, reject) => {
124
+ const onAbort = () => {
125
+ cleanup();
126
+ reject(new Error('mobygate: tool execution deferred to client (aborted)'));
127
+ };
128
+ const timer = setTimeout(() => {
129
+ cleanup();
130
+ reject(new Error('mobygate: tool execution deferred to client (timeout)'));
131
+ }, 30_000);
132
+ function cleanup() {
133
+ clearTimeout(timer);
134
+ extra?.signal?.removeEventListener?.('abort', onAbort);
135
+ }
136
+ if (extra?.signal?.aborted) return onAbort();
137
+ extra?.signal?.addEventListener?.('abort', onAbort, { once: true });
138
+ });
139
+ }
140
+
141
+ /**
142
+ * Build an in-process MCP server exposing the client's tools to the SDK.
143
+ * Returns the McpSdkServerConfigWithInstance; pass it to `query({options: { mcpServers: { [MCP_SERVER_NAME]: config } }})`.
144
+ *
145
+ * Returns `null` when there are no valid tools — caller should skip
146
+ * MCP setup entirely in that case.
147
+ */
148
+ export function buildClientToolsServer(openaiTools) {
149
+ if (!Array.isArray(openaiTools) || openaiTools.length === 0) return null;
150
+
151
+ const toolDefs = [];
152
+ for (const t of openaiTools) {
153
+ if (t?.type !== 'function' || !t.function?.name) continue;
154
+ const fn = t.function;
155
+ const shape = jsonSchemaToZodShape(fn.parameters);
156
+ toolDefs.push(tool(
157
+ fn.name,
158
+ fn.description || `Client-defined tool: ${fn.name}`,
159
+ shape,
160
+ deferredToolHandler,
161
+ // alwaysLoad: the SDK otherwise marks MCP tools as "deferred" — the
162
+ // model has to call the built-in `ToolSearch` to fetch the schema
163
+ // before invoking. That round-trip leaks through to OpenAI clients,
164
+ // who see a confusing tool_call for ToolSearch instead of getWeather.
165
+ // Eagerly loading our tools keeps the OpenAI surface clean.
166
+ { alwaysLoad: true },
167
+ ));
168
+ }
169
+ if (toolDefs.length === 0) return null;
170
+
171
+ return createSdkMcpServer({
172
+ name: MCP_SERVER_NAME,
173
+ version: '1.0.0',
174
+ tools: toolDefs,
175
+ });
176
+ }
177
+
178
+ // ---------------------------------------------------------------------------
179
+ // Tool-use extraction (SDK assistant message → OpenAI tool_calls)
180
+ // ---------------------------------------------------------------------------
181
+
182
+ /**
183
+ * Walk an SDKAssistantMessage's content array for native `tool_use` blocks.
184
+ * Returns an array of `{ id, name, arguments }` formatted for OpenAI
185
+ * tool_calls — name has the MCP prefix stripped, arguments is a JSON string.
186
+ *
187
+ * Returns `[]` when the message has no tool_use blocks (most assistant
188
+ * messages don't — they're just text deltas).
189
+ */
190
+ export function extractToolUses(assistantMessage) {
191
+ const content = assistantMessage?.message?.content;
192
+ if (!Array.isArray(content)) return [];
193
+ const calls = [];
194
+ for (const block of content) {
195
+ if (block?.type !== 'tool_use' || !block.id || !block.name) continue;
196
+ // Strip the MCP prefix so the client sees its original tool name.
197
+ const name = block.name.startsWith(MCP_TOOL_PREFIX)
198
+ ? block.name.slice(MCP_TOOL_PREFIX.length)
199
+ : block.name;
200
+ let argsString = '{}';
201
+ try { argsString = JSON.stringify(block.input ?? {}); } catch {}
202
+ calls.push({ id: block.id, name, arguments: argsString });
203
+ }
204
+ return calls;
205
+ }
206
+
207
+ /**
208
+ * Quick presence check used by the stream loop to decide whether to abort
209
+ * early. Returns true the moment any tool_use block appears.
210
+ */
211
+ export function hasToolUse(assistantMessage) {
212
+ const content = assistantMessage?.message?.content;
213
+ if (!Array.isArray(content)) return false;
214
+ return content.some((b) => b?.type === 'tool_use');
215
+ }
216
+
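`extractToolUses` returns flat `{ id, name, arguments }` records, which still need wrapping before they match the OpenAI chat-completions `tool_calls` shape. The diff does not show how `server.js` does this, so the following is only a sketch; `toOpenAiAssistantMessage` is a hypothetical name and the `toolu_` ID is made up.

```javascript
// Hypothetical helper: wraps extractToolUses() output in an OpenAI-shape
// assistant message. `arguments` stays a JSON string, as OpenAI expects.
function toOpenAiAssistantMessage(calls) {
  return {
    role: 'assistant',
    content: null,
    tool_calls: calls.map((call) => ({
      id: call.id,
      type: 'function',
      function: { name: call.name, arguments: call.arguments },
    })),
  };
}

// Usage with an illustrative record in the shape extractToolUses() returns.
const exampleMessage = toOpenAiAssistantMessage([
  { id: 'toolu_example123', name: 'getWeather', arguments: '{"city":"Oslo"}' },
]);
console.log(JSON.stringify(exampleMessage, null, 2));
```

A response carrying this message would normally also set `finish_reason` to `tool_calls` on the enclosing choice.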
217
+ // ---------------------------------------------------------------------------
218
+ // Tool results (OpenAI tool messages → text spliced into the resumed prompt)
219
+ // ---------------------------------------------------------------------------
220
+
221
+ /**
222
+ * Format OpenAI role:'tool' messages as a single user-readable text
223
+ * block to splice into a resumed prompt.
224
+ *
225
+ * NOTE: Phase 1 deliberately does *not* round-trip tool results as
226
+ * native Anthropic `tool_result` content blocks. Why: when we abort
227
+ * the SDK on a tool_use, the assistant turn isn't persisted in the
228
+ * SDK's session state (we observed `msgs=1` on resume after a tool
229
+ * call, meaning the partial turn was dropped). On resume, sending a
230
+ * native tool_result block then has nothing to bind to — the model
231
+ * sees an orphan tool_result and re-calls the tool.
232
+ *
233
+ * Phase 2's full Anthropic Messages wire format will keep the SDK
234
+ * alive long enough to persist the turn properly. Until then, text-
235
+ * form tool results (which the model handles fine — it has the
236
+ * preceding tool_use in resume context) is the pragmatic answer.
237
+ *
238
+ * Returns a single string suitable for prepending to (or replacing)
239
+ * the user's prompt text on a resumed turn. Returns '' when there
240
+ * are no tool messages.
241
+ */
242
+ export function toolMessagesToText(toolMessages) {
243
+ const lines = [];
244
+ for (const msg of toolMessages) {
245
+ if (msg?.role !== 'tool') continue;
246
+ const id = msg.tool_call_id || 'unknown';
247
+ const name = msg.name || '';
248
+ const content = typeof msg.content === 'string'
249
+ ? msg.content
250
+ : Array.isArray(msg.content)
251
+ ? msg.content.map((c) => (typeof c === 'string' ? c : c?.text || '')).join('')
252
+ : (msg.content == null ? '' : String(msg.content));
253
+ lines.push(`<tool_result id="${id}"${name ? ` name="${name}"` : ''}>\n${content}\n</tool_result>`);
254
+ }
255
+ if (lines.length === 0) return '';
256
+ return `<tool_results>\n${lines.join('\n')}\n</tool_results>`;
257
+ }
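`toolMessagesToText` expects the caller to hand it the relevant `role:'tool'` messages. How `server.js` selects them is not part of this diff; one plausible approach, assuming only the results at the tail of the conversation belong to the pending tool turn, is sketched below. `trailingToolMessages` is a hypothetical helper.

```javascript
// Hypothetical helper: collects the contiguous run of role:'tool' messages
// at the end of an OpenAI-style messages array, preserving their order.
function trailingToolMessages(messages) {
  const tail = [];
  for (let i = messages.length - 1; i >= 0; i -= 1) {
    if (messages[i]?.role !== 'tool') break; // stop at the first non-tool message
    tail.unshift(messages[i]);
  }
  return tail;
}

// Usage with an illustrative conversation.
const conversation = [
  { role: 'user', content: 'Weather in Oslo and Bergen?' },
  { role: 'assistant', content: null, tool_calls: [] },
  { role: 'tool', tool_call_id: 'toolu_a', content: '4°C' },
  { role: 'tool', tool_call_id: 'toolu_b', content: '7°C' },
];
console.log(trailingToolMessages(conversation).map((m) => m.tool_call_id));
```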