mobygate 0.5.3 → 0.6.1
- package/CHANGELOG.md +88 -0
- package/bin/mobygate.js +9 -4
- package/index.html +154 -0
- package/lib/tool-bridge.js +257 -0
- package/lib/updater.js +275 -0
- package/package.json +1 -1
- package/server.js +219 -180
package/CHANGELOG.md
CHANGED
@@ -4,6 +4,94 @@ All notable changes to mobygate are documented here. Format loosely follows
 [Keep a Changelog](https://keepachangelog.com/en/1.1.0/); version numbers are
 [Semantic Versioning](https://semver.org/).
 
+## [0.6.1] — 2026-04-24
+
+### Fixed
+
+- **`mobygate update` on Windows** failed with `spawnSync npm ENOENT`
+  because `npm` resolves to `npm.cmd` (a batch file) on Windows, and
+  Node's `spawn` won't pick up `.cmd` extensions without going through
+  cmd.exe. Added `shell: IS_WIN` to every npm/git invocation in the
+  CLI's update path. The dashboard's update endpoint already had this
+  fix in v0.6.0; now the CLI matches.
+
+## [0.6.0] — 2026-04-24
+
+Big one. Native tool calling + in-dashboard self-update.
+
+### Added
+
+- **Native MCP tool calling.** Client-supplied OpenAI tools are now
+  registered with the Claude Agent SDK as in-process MCP tools (with
+  Zod schemas converted from JSON Schema). The model emits genuine
+  `tool_use` content blocks instead of the old `<tool_call>...`
+  text-pattern hack. Tool IDs returned to clients are now Anthropic-
+  native `toolu_*` strings, not synthesized `call_*` ones. New module:
+  `lib/tool-bridge.js`.
+- **Dashboard update banner.** When npm has a newer mobygate, the
+  dashboard shows an orange pill at the top: `v0.6.0 → v0.6.1 available
+  · npm install · [changelog] [dismiss] [update now]`. Clicking
+  "update now" fires `npm install -g mobygate@latest` (or `git pull`
+  for git-mode installs) in a detached child process, restarts the
+  service, and auto-reloads the page. Dismissals stick per-version
+  via localStorage. New module: `lib/updater.js`.
+- New endpoints: `GET /update/check`, `POST /update/apply`,
+  `GET /update/status`. The check endpoint caches the npm registry
+  lookup for 15 minutes so dashboards open all day don't hammer it.
+
+### Changed
+
+- **No more prompt-injected tool definitions.** The `<system>...</system>`
+  block listing available tools as XML is gone — the SDK's MCP
+  registration is the model's source of truth now. This shrinks every
+  tool-enabled prompt by ~200-500 tokens depending on tool count.
+- **Tool-flow detection** moved from text-pattern matching
+  (`hasCompleteToolCall`, `parseToolCalls` regexes) to native
+  `tool_use` content-block detection in the assistant message stream
+  (`hasToolUse`, `extractToolUses`). The moment a tool_use lands,
+  we abort the SDK and emit OpenAI-shape `tool_calls`.
+- **`alwaysLoad: true`** on every registered tool. Without this, the
+  SDK lazily defers MCP tool schemas — the model has to call the
+  built-in `ToolSearch` tool to fetch each definition before invoking,
+  which leaks through to OpenAI clients as a confusing tool_call
+  for `ToolSearch` instead of their actual tool. Eager loading
+  keeps the surface clean.
+
+### Removed
+
+- `buildToolInstructions` — the `<tool_call>...` protocol prose.
+- `parseToolCalls` — the regex parser for `<tool_call>` JSON blocks.
+- `hasCompleteToolCall` — the streaming-buffer heuristic that aborted
+  the SDK when a complete tag pair appeared.
+- `formatAssistantForReplay`'s tool_calls→`<tool_call>` text
+  serialization (assistant replay is now best-effort text only).
+- The "Use the tool results above to continue toward the final answer"
+  nudge — tool results are visible in conversation context now, so
+  the model handles continuation naturally without coaxing.
+
+### Known limitation (Phase 1 deliberate)
+
+- Tool *results* coming back from the client are still spliced as
+  `<tool_results>` text in the resumed prompt, not native Anthropic
+  `tool_result` content blocks. Reason: aborting the SDK on a
+  `tool_use` block prevents the assistant turn from being persisted
+  in session state — on resume, native tool_result blocks have
+  nothing to bind to and the model re-calls the tool. Text-form
+  results work because the resumed model has the prior turn in
+  context. Phase 2's full Anthropic Messages wire surface will
+  keep the SDK alive through the tool turn and switch to native
+  tool_result blocks end-to-end.
+
+### Migration
+
+- No client-facing changes. Existing OpenAI-shape requests with
+  `tools: [...]` work the same as before — what's improved is
+  reliability ("Model returned empty after tool calls" warnings
+  should largely disappear) and surface fidelity (tool_call IDs
+  are now native Anthropic IDs, not synthesized).
+- Update with `mobygate update` (CLI) or click the new "update now"
+  button in the dashboard once it appears.
+
 ## [0.5.3] — 2026-04-19
 
 Security pass.
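Note on the 15-minute cache mentioned in the 0.6.0 entry: `lib/updater.js` is not shown in this part of the diff, so the following is only a minimal sketch of how a time-boxed registry lookup with a force override could look. The names `createCachedLookup` and `CHECK_TTL_MS` and the injectable clock are hypothetical and chosen for illustration; only the 15-minute window and the idea of a forced refresh (`?force=1` in the dashboard code) come from the diff.

```javascript
// Assumed value: the changelog states a 15-minute cache window.
const CHECK_TTL_MS = 15 * 60 * 1000;

// Hypothetical helper: wrap an async lookup so that repeated calls within
// ttlMs reuse the last result. `now` is injectable to make the clock testable.
function createCachedLookup(lookup, ttlMs = CHECK_TTL_MS, now = Date.now) {
  let cached = null; // { value, at } once the first lookup has completed

  return async function get({ force = false } = {}) {
    const fresh = cached !== null && now() - cached.at < ttlMs;
    if (!force && fresh) return cached.value;
    const value = await lookup();
    cached = { value, at: now() };
    return value;
  };
}
```

A dashboard polling every 30 minutes would then reach the registry at most once per poll, and several open tabs inside the same window would share one lookup.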
package/bin/mobygate.js
CHANGED
@@ -564,8 +564,13 @@ async function cmdUpdate() {
   print(c.dim(`Current: v${pkg.version} · ${mode} install at ${REPO_ROOT}`));
 
   // ---- Look up latest published version on npm
+  // shell: IS_WIN is required on Windows because `npm` is `npm.cmd`
+  // (a batch file), and Node's spawn won't resolve .cmd extensions
+  // without going through cmd.exe. Same for git on Windows where some
+  // distributions install git as a shim. On macOS/Linux these are real
+  // binaries, so the flag is a no-op.
   info('Checking npm for the latest release...');
-  const view = spawnSync('npm', ['view', 'mobygate', 'version'], { encoding: 'utf8', timeout: 10_000 });
+  const view = spawnSync('npm', ['view', 'mobygate', 'version'], { encoding: 'utf8', timeout: 10_000, shell: IS_WIN });
   if (view.status !== 0) {
     return die(`Couldn't reach npm registry: ${view.stderr?.trim() || view.error?.message || 'unknown'}`);
   }
@@ -579,15 +584,15 @@ async function cmdUpdate() {
   // ---- Perform the upgrade
   if (mode === 'npm') {
     info(`Running \`npm install -g mobygate@latest\`...`);
-    const r = spawnSync('npm', ['install', '-g', 'mobygate@latest'], { stdio: 'inherit' });
+    const r = spawnSync('npm', ['install', '-g', 'mobygate@latest'], { stdio: 'inherit', shell: IS_WIN });
     if (r.status !== 0) return die('npm install failed. See output above.');
     ok(`Installed mobygate@${latest}`);
   } else if (mode === 'git') {
     info(`Running \`git pull\` in ${REPO_ROOT}...`);
-    const pull = spawnSync('git', ['-C', REPO_ROOT, 'pull', '--ff-only'], { stdio: 'inherit' });
+    const pull = spawnSync('git', ['-C', REPO_ROOT, 'pull', '--ff-only'], { stdio: 'inherit', shell: IS_WIN });
     if (pull.status !== 0) return die('git pull failed. Resolve conflicts and retry.');
     info(`Running \`npm install\`...`);
-    const install = spawnSync('npm', ['install'], { cwd: REPO_ROOT, stdio: 'inherit' });
+    const install = spawnSync('npm', ['install'], { cwd: REPO_ROOT, stdio: 'inherit', shell: IS_WIN });
     if (install.status !== 0) return die('npm install failed. See output above.');
     ok(`Pulled and installed. See git log for what changed.`);
   } else {
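Note on the `shell: IS_WIN` pattern in this file: the hunks use `IS_WIN` without showing where it is defined, so the sketch below assumes it is a plain `process.platform` check. The helper `withWindowsShell` is hypothetical and not part of the package; it only illustrates merging the flag into existing `spawnSync` options.

```javascript
const { spawnSync } = require('node:child_process');

// Assumption: the diff uses IS_WIN but does not show its definition.
// A platform check like this one is a common way to derive it.
const IS_WIN = process.platform === 'win32';

// Hypothetical helper (not in the package): add the Windows shell flag to
// whatever spawn options the caller already has, leaving them otherwise intact.
function withWindowsShell(options = {}, isWin = IS_WIN) {
  return { ...options, shell: isWin };
}

// Demo with a harmless command. On macOS/Linux `shell` is false, so the
// binary is executed directly; on Windows the command runs through cmd.exe.
const probe = spawnSync('node', ['--version'], withWindowsShell({ encoding: 'utf8', timeout: 10_000 }));
console.log(probe.error ? `spawn failed: ${probe.error.code}` : probe.stdout.trim());
```

One caveat worth keeping in mind: when `shell` is enabled, Node passes the command and arguments to the shell without escaping them, so this pattern is safest with fixed argument strings. A value such as an install path containing spaces would likely need quoting on Windows.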
package/index.html
CHANGED
@@ -49,6 +49,34 @@
 <body class="antialiased">
   <div class="mx-auto px-12 pt-8 pb-7 flex flex-col gap-6 max-w-[1440px] min-h-screen">
 
+    <!-- ===== Update banner ===== -->
+    <!-- Hidden until /update/check reports updateAvailable=true. During
+         apply, this becomes a progress strip showing live log tail. -->
+    <section id="updateBanner" style="display:none" class="items-center gap-4 py-3 px-5 bg-[#121210] border-l-2 border-l-[#E89B2E] border-t border-b border-r border-[#2A2A1F] rounded-r-md">
+      <div class="flex items-center gap-2.5">
+        <span class="rounded-full bg-[#E89B2E] w-2 h-2 pulse-dot"></span>
+        <span class="uppercase text-[#E89B2E] font-medium text-[10px] tracking-[0.22em]">Update</span>
+      </div>
+      <div id="updateBannerText" class="grow text-[#F3EFE4] text-xs leading-4"></div>
+      <div id="updateBannerActions" class="flex items-center gap-2 shrink-0">
+        <a id="updateBannerChangelog" href="https://github.com/khnfrhn/mobygate/blob/master/CHANGELOG.md" target="_blank" rel="noreferrer" class="text-[#8A9A6A] hover:text-[#C9D9A8] text-[11px] tracking-[0.04em] underline decoration-dotted">changelog</a>
+        <button id="updateDismissBtn" class="rounded-full py-1.5 px-3 border border-[#2A2A1F] text-[#8A9A6A] hover:text-[#C9D9A8] hover:border-[#5A5F54] font-medium text-[11px] tracking-[0.04em] transition">dismiss</button>
+        <button id="updateApplyBtn" class="rounded-full py-1.5 px-3.5 bg-[#E89B2E] hover:brightness-110 text-[#0B0B09] font-bold text-[11px] tracking-[0.04em] transition">update now</button>
+      </div>
+    </section>
+    <!-- Apply-in-progress shelf: expands below the banner during update. -->
+    <section id="updateProgress" style="display:none" class="flex-col gap-2 py-3 px-5 bg-[#121210] border border-[#2A2A1F] rounded-md">
+      <div class="flex items-center justify-between">
+        <div class="flex items-center gap-2">
+          <span id="updateSpinner" class="rounded-full bg-[#E89B2E] w-2 h-2 pulse-dot"></span>
+          <span id="updateProgressTitle" class="uppercase text-[#C9D9A8] font-medium text-[10px] tracking-[0.22em]">Installing</span>
+          <span id="updateProgressSub" class="text-[#5A5F54] text-[11px]"></span>
+        </div>
+        <button id="updateProgressClose" style="display:none" class="text-[#5A5F54] hover:text-[#C9D9A8] text-[11px]">close ✕</button>
+      </div>
+      <pre id="updateProgressLog" class="text-[11px] leading-[15px] text-[#8A9A6A] max-h-[180px] overflow-auto whitespace-pre-wrap m-0"></pre>
+    </section>
+
     <!-- ===== Header ===== -->
     <header class="flex justify-between items-center shrink-0">
       <div class="flex items-center gap-[22px]">
@@ -804,6 +832,117 @@
       }
     }, 1000);
 
+    // ───────────────────────── Updater
+    // Dashboard-driven upgrade flow. On load (and every 30 min) we ask
+    // /update/check whether a newer mobygate is on npm. If so, a pill
+    // appears at the top of the page — click "update now" to fire the
+    // update, watch log lines stream in, then auto-reload when the new
+    // server is up. The child process is detached, so the server
+    // restart doesn't orphan it.
+    const UPDATE_DISMISS_KEY = 'mobygate:update:dismissedVersion';
+    let updateInfo = null;
+    let updatePollTimer = null;
+
+    function showBanner(info) {
+      if (!info?.updateAvailable) {
+        $('updateBanner').style.display = 'none';
+        return;
+      }
+      // Respect dismissal: if the user dismissed this exact version, don't
+      // re-pester until a newer one lands.
+      const dismissed = localStorage.getItem(UPDATE_DISMISS_KEY);
+      if (dismissed === info.latest) {
+        $('updateBanner').style.display = 'none';
+        return;
+      }
+      const msg = info.canApply
+        ? `v${escHtml(info.current)} → <span class="text-[#B7E56D]">v${escHtml(info.latest)}</span> available · <span class="text-[#5A5F54]">${escHtml(info.installMode)} install</span>`
+        : `v${escHtml(info.current)} → <span class="text-[#B7E56D]">v${escHtml(info.latest)}</span> available · <span class="text-[#E89B2E]">${escHtml(info.installMode)} install — update manually</span>`;
+      $('updateBannerText').innerHTML = msg;
+      $('updateApplyBtn').style.display = info.canApply ? '' : 'none';
+      $('updateBanner').style.display = 'flex';
+    }
+
+    async function checkForUpdates({ force = false } = {}) {
+      try {
+        const r = await fetch(`/update/check${force ? '?force=1' : ''}`);
+        if (!r.ok) return;
+        updateInfo = await r.json();
+        showBanner(updateInfo);
+      } catch (e) { /* offline is fine */ }
+    }
+
+    function renderUpdateLog(lines) {
+      const el = $('updateProgressLog');
+      el.textContent = (lines || []).join('\n');
+      // Pin to bottom so the user sees the latest line.
+      el.scrollTop = el.scrollHeight;
+    }
+
+    async function pollUpdateStatus() {
+      try {
+        const r = await fetch('/update/status?lines=200');
+        if (!r.ok) return;
+        const s = await r.json();
+        renderUpdateLog(s.lines);
+        if (!s.running) {
+          // Update finished. The service restart may have already swapped
+          // the running binary — our `currentVersion` reflects whatever
+          // server answered. If it matches `latest`, celebrate. Either
+          // way, give it a moment then reload so the dashboard comes
+          // back on the new code path.
+          clearInterval(updatePollTimer); updatePollTimer = null;
+          $('updateSpinner').classList.remove('pulse-dot');
+          $('updateSpinner').classList.remove('bg-[#E89B2E]');
+          $('updateSpinner').classList.add('bg-[#B7E56D]');
+          $('updateProgressTitle').textContent = 'Installed';
+          $('updateProgressSub').textContent = `now on v${s.currentVersion} — reloading in 3s…`;
+          $('updateProgressClose').style.display = '';
+          setTimeout(() => location.reload(), 3000);
+        }
+      } catch (e) {
+        // Server is mid-restart — keep polling, it'll come back.
+      }
+    }
+
+    function startUpdateProgress(mode) {
+      $('updateBanner').style.display = 'none';
+      $('updateProgress').style.display = 'flex';
+      $('updateProgressSub').textContent = mode ? `(${mode} install)` : '';
+      $('updateProgressTitle').textContent = 'Installing';
+      $('updateSpinner').classList.add('pulse-dot');
+      $('updateProgressLog').textContent = 'starting update…';
+      if (updatePollTimer) clearInterval(updatePollTimer);
+      updatePollTimer = setInterval(pollUpdateStatus, 1500);
+      pollUpdateStatus();
+    }
+
+    $('updateApplyBtn')?.addEventListener('click', async () => {
+      $('updateApplyBtn').disabled = true;
+      try {
+        const r = await fetch('/update/apply', { method: 'POST' });
+        const j = await r.json().catch(() => ({}));
+        if (!r.ok || !j.started) {
+          $('updateBannerText').innerHTML += ` <span class="text-[#E89B2E]">— ${escHtml(j.error || 'update failed to start')}</span>`;
+          $('updateApplyBtn').disabled = false;
+          return;
+        }
+        startUpdateProgress(j.mode);
+      } catch (e) {
+        $('updateBannerText').innerHTML += ` <span class="text-[#E89B2E]">— ${escHtml(e.message)}</span>`;
+        $('updateApplyBtn').disabled = false;
+      }
+    });
+
+    $('updateDismissBtn')?.addEventListener('click', () => {
+      if (updateInfo?.latest) localStorage.setItem(UPDATE_DISMISS_KEY, updateInfo.latest);
+      $('updateBanner').style.display = 'none';
+    });
+
+    $('updateProgressClose')?.addEventListener('click', () => {
+      $('updateProgress').style.display = 'none';
+    });
+
     // Kick off
     loadSnapshot();
     loadAuth({ verify: false });
@@ -811,6 +950,21 @@
     loadLogs();
     armLogAutoRefresh();
     connectStream();
+    // Surface update availability on load + every 30 min. The backend
+    // caches the npm registry lookup for 15 min, so this doesn't hammer
+    // the registry even with the dashboard open all day.
+    checkForUpdates();
+    setInterval(() => checkForUpdates(), 30 * 60 * 1000);
+    // If an update is in-flight when the page loads (e.g., user refreshed
+    // mid-apply), pick up where it left off.
+    (async () => {
+      try {
+        const r = await fetch('/update/status?lines=50');
+        if (!r.ok) return;
+        const s = await r.json();
+        if (s.running) startUpdateProgress(s.mode);
+      } catch {}
+    })();
   </script>
 </body>
 </html>
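Note on the banner logic in `showBanner`: the visibility rules (no update, dismissed version, manual-only install) are interleaved with DOM writes in the diff. As a reading aid, here is a small sketch that restates the same decision as a pure function. `bannerState` is a hypothetical name and not part of the package; the field names `updateAvailable`, `latest`, and `canApply` are taken from the code above.

```javascript
// Hypothetical restatement of the showBanner() decision, without DOM access.
// `info` is the parsed /update/check response; `dismissedVersion` is whatever
// was stored under the dismissal key (or null if nothing was stored).
function bannerState(info, dismissedVersion) {
  const hidden = { visible: false, showApplyButton: false };
  if (!info || !info.updateAvailable) return hidden;
  // A dismissal only silences the exact version that was dismissed.
  if (dismissedVersion === info.latest) return hidden;
  return { visible: true, showApplyButton: Boolean(info.canApply) };
}

console.log(bannerState({ updateAvailable: true, latest: '0.6.1', canApply: true }, '0.6.0'));
```

Because the stored value is compared against `info.latest`, a dismissal of 0.6.0 stops applying as soon as 0.6.1 is published, which matches the "dismissals stick per-version" wording in the changelog.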
package/lib/tool-bridge.js
@@ -0,0 +1,257 @@
+/**
+ * Native tool bridge — translates between OpenAI client tools and the
+ * Claude Agent SDK's MCP-tool model.
+ *
+ * Why this exists (Phase 1 of the mobygate native-tools refactor):
+ *
+ * Until now, mobygate handled client-supplied tools by injecting their
+ * schemas into the system prompt as <tool> XML and instructing the model
+ * to emit <tool_call>{...}</tool_call> tags in its text output. We then
+ * regex-parsed those tags. Fragile in obvious ways: the model sometimes
+ * wrapped tags in code fences, sometimes hallucinated partial blocks,
+ * and the "empty after tool_results" nudge existed to paper over the
+ * model treating bare <tool_results> as inert data.
+ *
+ * The SDK actually supports native tool definitions via MCP — but its
+ * MCP model assumes the **handler runs in-process** and returns a
+ * synchronous result. Our case is different: we're a proxy. The actual
+ * tool implementations live on the *other* side of an HTTP boundary,
+ * inside the client (Hermes / OpenClaw / etc.). We can't run them.
+ *
+ * The trick: register client tools as MCP tools with stub handlers that
+ * never resolve. The model emits **native** `tool_use` content blocks
+ * (in the SDKAssistantMessage stream, not buried in text). We watch the
+ * stream, abort the SDK on the first complete `tool_use`, and surface
+ * it to the client as an OpenAI `tool_calls` response. The stub handler
+ * is then aborted via the SDK's signal — we never actually execute it,
+ * the client does.
+ *
+ * The other end of the round-trip: when the client sends a follow-up
+ * request with tool results (role:'tool' messages), we convert those
+ * into native `tool_result` content blocks inside an SDKUserMessage,
+ * resuming the SDK session. The model sees structured tool results,
+ * not <tool_result> XML, and continues the conversation cleanly.
+ *
+ * Names round-trip via the MCP prefix convention. A client tool named
+ * `getWeather` is registered as `mcp__mobygate__getWeather` with the
+ * SDK; the model emits tool_use blocks under that prefixed name; we
+ * strip the prefix on the way back so the client sees its original name.
+ */
+
+import { z } from 'zod';
+import { tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
+
+export const MCP_SERVER_NAME = 'mobygate';
+export const MCP_TOOL_PREFIX = `mcp__${MCP_SERVER_NAME}__`;
+
+// ---------------------------------------------------------------------------
+// JSON Schema → Zod RawShape
+// ---------------------------------------------------------------------------
+// The SDK's `tool()` helper takes a Zod RawShape (a record of ZodTypes,
+// like `{name: z.string(), age: z.number()}`) — NOT a JSON Schema object.
+// OpenAI clients send JSON Schema (`{type:'object', properties:{...}, required:[...]}`),
+// so we need to convert. This handles the common cases that cover ~95% of
+// real-world tool schemas; anything weirder falls through to z.unknown().
+
+function jsonSchemaPropToZod(prop) {
+  if (!prop || typeof prop !== 'object') return z.unknown();
+
+  // Handle enums up front — they apply across types.
+  if (Array.isArray(prop.enum) && prop.enum.length > 0) {
+    const stringy = prop.enum.every((v) => typeof v === 'string');
+    if (stringy) return z.enum(prop.enum);
+    // mixed-type enums fall through to z.union of literals
+    return z.union(prop.enum.map((v) => z.literal(v)));
+  }
+
+  switch (prop.type) {
+    case 'string': return z.string();
+    case 'number': return z.number();
+    case 'integer': return z.number().int();
+    case 'boolean': return z.boolean();
+    case 'null': return z.null();
+    case 'array': {
+      const item = prop.items ? jsonSchemaPropToZod(prop.items) : z.unknown();
+      return z.array(item);
+    }
+    case 'object': {
+      const shape = jsonSchemaToZodShape(prop);
+      return z.object(shape).passthrough();
+    }
+    default: return z.unknown();
+  }
+}
+
+/**
+ * Convert a JSON Schema *object* (with `properties` + `required`) into
+ * a Zod RawShape suitable for the SDK's `tool()` helper.
+ *
+ * Returns an empty shape `{}` when the schema isn't an object — the
+ * caller will pass this to `tool()`, and the model will see "no
+ * structured input expected." That's the right default for tool defs
+ * that arrive without a properties block (which OpenAI permits).
+ */
+export function jsonSchemaToZodShape(schema) {
+  if (!schema || schema.type !== 'object' || !schema.properties) return {};
+  const shape = {};
+  const required = new Set(Array.isArray(schema.required) ? schema.required : []);
+  for (const [key, prop] of Object.entries(schema.properties)) {
+    let zType = jsonSchemaPropToZod(prop);
+    if (!required.has(key)) zType = zType.optional();
+    if (prop?.description) zType = zType.describe(prop.description);
+    shape[key] = zType;
+  }
+  return shape;
+}
+
+// ---------------------------------------------------------------------------
+// Build the MCP server that exposes client tools to the SDK
+// ---------------------------------------------------------------------------
+
+/**
+ * Stub handler. The model emits a tool_use block, the SDK calls us, but
+ * we don't actually have an implementation to run — the client does.
+ * So we wait. The stream-watcher in server.js will abort the SDK as
+ * soon as it sees the tool_use block, which propagates here as a signal
+ * abort. We reject and the SDK cleans up.
+ *
+ * The 30s safety timeout is for the (rare) case where the SDK fires our
+ * handler but the abort never propagates back — we don't want to leak
+ * a Promise forever. 30s is well past any reasonable abort latency.
+ */
+function deferredToolHandler(_args, extra) {
+  return new Promise((resolve, reject) => {
+    const onAbort = () => {
+      cleanup();
+      reject(new Error('mobygate: tool execution deferred to client (aborted)'));
+    };
+    const timer = setTimeout(() => {
+      cleanup();
+      reject(new Error('mobygate: tool execution deferred to client (timeout)'));
+    }, 30_000);
+    function cleanup() {
+      clearTimeout(timer);
+      extra?.signal?.removeEventListener?.('abort', onAbort);
+    }
+    if (extra?.signal?.aborted) return onAbort();
+    extra?.signal?.addEventListener?.('abort', onAbort, { once: true });
+  });
+}
+
+/**
+ * Build an in-process MCP server exposing the client's tools to the SDK.
+ * Returns the McpSdkServerConfigWithInstance; pass it to `query({options: { mcpServers: { [MCP_SERVER_NAME]: config } }})`.
+ *
+ * Returns `null` when there are no valid tools — caller should skip
+ * MCP setup entirely in that case.
+ */
+export function buildClientToolsServer(openaiTools) {
+  if (!Array.isArray(openaiTools) || openaiTools.length === 0) return null;
+
+  const toolDefs = [];
+  for (const t of openaiTools) {
+    if (t?.type !== 'function' || !t.function?.name) continue;
+    const fn = t.function;
+    const shape = jsonSchemaToZodShape(fn.parameters);
+    toolDefs.push(tool(
+      fn.name,
+      fn.description || `Client-defined tool: ${fn.name}`,
+      shape,
+      deferredToolHandler,
+      // alwaysLoad: the SDK otherwise marks MCP tools as "deferred" — the
+      // model has to call the built-in `ToolSearch` to fetch the schema
+      // before invoking. That round-trip is invisible to OpenAI clients,
+      // who see a confusing tool_call for ToolSearch instead of getWeather.
+      // Eagerly loading our tools keeps the OpenAI surface clean.
+      { alwaysLoad: true },
+    ));
+  }
+  if (toolDefs.length === 0) return null;
+
+  return createSdkMcpServer({
+    name: MCP_SERVER_NAME,
+    version: '1.0.0',
+    tools: toolDefs,
+  });
+}
+
+// ---------------------------------------------------------------------------
+// Tool-use extraction (SDK assistant message → OpenAI tool_calls)
+// ---------------------------------------------------------------------------
+
+/**
+ * Walk an SDKAssistantMessage's content array for native `tool_use` blocks.
+ * Returns an array of `{ id, name, arguments }` formatted for OpenAI
+ * tool_calls — name has the MCP prefix stripped, arguments is a JSON string.
+ *
+ * Returns `[]` when the message has no tool_use blocks (most assistant
+ * messages don't — they're just text deltas).
+ */
+export function extractToolUses(assistantMessage) {
+  const content = assistantMessage?.message?.content;
+  if (!Array.isArray(content)) return [];
+  const calls = [];
+  for (const block of content) {
+    if (block?.type !== 'tool_use' || !block.id || !block.name) continue;
+    // Strip the MCP prefix so the client sees its original tool name.
+    const name = block.name.startsWith(MCP_TOOL_PREFIX)
+      ? block.name.slice(MCP_TOOL_PREFIX.length)
+      : block.name;
+    let argsString = '{}';
+    try { argsString = JSON.stringify(block.input ?? {}); } catch {}
+    calls.push({ id: block.id, name, arguments: argsString });
+  }
+  return calls;
+}
+
+/**
+ * Quick liveness check used by the stream loop to decide whether to abort
+ * early. Returns true the moment any tool_use block appears.
+ */
+export function hasToolUse(assistantMessage) {
+  const content = assistantMessage?.message?.content;
+  if (!Array.isArray(content)) return false;
+  return content.some((b) => b?.type === 'tool_use');
+}
+
+// ---------------------------------------------------------------------------
+// Tool results (OpenAI tool messages → Anthropic tool_result content blocks)
+// ---------------------------------------------------------------------------
+
+/**
+ * Format OpenAI role:'tool' messages as a single user-readable text
+ * block to splice into a resumed prompt.
+ *
+ * NOTE: Phase 1 deliberately does *not* round-trip tool results as
+ * native Anthropic `tool_result` content blocks. Why: when we abort
+ * the SDK on a tool_use, the assistant turn isn't persisted in the
+ * SDK's session state (we observed `msgs=1` on resume after a tool
+ * call, meaning the partial turn was dropped). On resume, sending a
+ * native tool_result block then has nothing to bind to — the model
+ * sees an orphan tool_result and re-calls the tool.
+ *
+ * Phase 2's full Anthropic Messages wire format will keep the SDK
+ * alive long enough to persist the turn properly. Until then, text-
+ * form tool results (which the model handles fine — it has the
+ * preceding tool_use in resume context) is the pragmatic answer.
+ *
+ * Returns a single string suitable for prepending to (or replacing)
+ * the user's prompt text on a resumed turn. Returns '' when there
+ * are no tool messages.
+ */
+export function toolMessagesToText(toolMessages) {
+  const lines = [];
+  for (const msg of toolMessages) {
+    if (msg?.role !== 'tool') continue;
+    const id = msg.tool_call_id || 'unknown';
+    const name = msg.name || '';
+    const content = typeof msg.content === 'string'
+      ? msg.content
+      : Array.isArray(msg.content)
+        ? msg.content.map((c) => (typeof c === 'string' ? c : c?.text || '')).join('')
+        : (msg.content == null ? '' : String(msg.content));
+    lines.push(`<tool_result id="${id}"${name ? ` name="${name}"` : ''}>\n${content}\n</tool_result>`);
+  }
+  if (lines.length === 0) return '';
+  return `<tool_results>\n${lines.join('\n')}\n</tool_results>`;
+}
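Note on the name round-trip in `extractToolUses`: the function returns flat `{ id, name, arguments }` records, and the step that wraps them into an OpenAI response lives in `server.js`, which is not shown in this part of the diff. The sketch below is therefore an assumption about how that last step could look, not the package's actual code. `toOpenAiToolCalls` is a hypothetical name; the prefix value and the filtering rule mirror the module above, and the `{ id, type: 'function', function: { name, arguments } }` layout is the standard OpenAI chat-completions `tool_calls` shape.

```javascript
// Same prefix the module derives from MCP_SERVER_NAME = 'mobygate'.
const MCP_TOOL_PREFIX = 'mcp__mobygate__';

// Hypothetical helper: turn Anthropic-style content blocks into OpenAI-shape
// tool_calls entries, stripping the MCP prefix so the client sees the tool
// name it originally registered.
function toOpenAiToolCalls(contentBlocks) {
  if (!Array.isArray(contentBlocks)) return [];
  return contentBlocks
    .filter((b) => b && b.type === 'tool_use' && b.id && b.name)
    .map((b) => ({
      id: b.id, // native toolu_* id, passed through unchanged
      type: 'function',
      function: {
        name: b.name.startsWith(MCP_TOOL_PREFIX)
          ? b.name.slice(MCP_TOOL_PREFIX.length)
          : b.name,
        arguments: JSON.stringify(b.input ?? {}), // OpenAI expects a JSON string
      },
    }));
}

const sample = [
  { type: 'text', text: 'Checking the weather.' },
  { type: 'tool_use', id: 'toolu_example', name: 'mcp__mobygate__getWeather', input: { city: 'Oslo' } },
];
console.log(JSON.stringify(toOpenAiToolCalls(sample), null, 2));
```

Text blocks are skipped and only the `tool_use` block is converted, so the sample yields a single entry named `getWeather` whose `arguments` field is the string `{"city":"Oslo"}`.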