@elvatis_com/openclaw-cli-bridge-elvatis 3.5.1 → 3.7.0

package/CLAUDE.md CHANGED
@@ -31,7 +31,14 @@ OpenClaw Gateway ──(HTTP)──> proxy-server.ts ──(spawn)──> claude
31
31
  - **Compact tool schema** — when >8 tools, only send name+params (skip descriptions/full JSON schema), cuts prompt ~60%
32
32
  - **Exit 143 = our SIGTERM** — not OOM, not crash. The bridge's timeout/stale-output detector sends SIGTERM, Claude CLI exits 143
33
33
  - **Consecutive timeout rotation** — after 3 timeouts in a row on the same session, auto-expire it and create a fresh one. Prevents poisoned sessions from blocking all requests
34
- - **Workspace project auto-detection** — scans `~/.openclaw/workspace/` for project directories; when the prompt contains an exact match of a project name, auto-sets `workdir` and injects `[Context: Working directory is ...]` into the prompt
34
+ - **Workspace project auto-detection** — scans `~/.openclaw/workspace/` for project directories; when the prompt contains an exact match of a project name (from user messages only), auto-sets `workdir` and injects context
35
+ - **Opus escalation** — when conversations exceed 20 messages with tools, automatically routes from Sonnet to Opus. Opus handles large contexts reliably (94% success vs Sonnet's 55%)
36
+ - **Opus 90s stale timeout** — Opus gets 90s stale-output timeout (vs 30s for Sonnet) to allow time for long-form generation (blog posts, Lexical JSON)
37
+ - **Session resume: Opus only** — Sonnet/Haiku use fresh `claude -p` every call (session resume caused 45% hang rate). Opus uses `--session-id`/`--resume` for context continuity
38
+ - **Generic skill auto-detection** — scans `~/.openclaw/skills/` for SKILL.md files, injects pointers when prompt matches a skill name. Fully generic, works with any installed skill
39
+ - **First user message pinning** — original user request is always included in the prompt window, even when conversation exceeds MAX_MESSAGES
40
+ - **Haiku skip in tool loops** — fallback chain skips Haiku when tool_calls are expected (Haiku consistently returns text instead of tool_calls in tool loops)
41
+ - **Improved JSON parser** — tries multiple `{` positions for embedded JSON, rescue-from-raw strategy, handles malformed tool_calls from fallback models
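The "first user message pinning" rule in the list above can be pictured as a windowing step. The following is only a sketch under stated assumptions: the `Msg` shape, the `MAX_MESSAGES` value, and the function name `windowWithPinnedFirstUser` are hypothetical and do not come from the package. It assumes a window size of at least 1.

```typescript
// Hypothetical message shape; the bridge's real ChatMessage type may differ.
interface Msg {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

const MAX_MESSAGES = 4; // illustrative value only, not the package's setting

/**
 * Keep the last `max` messages. If truncation dropped the first user
 * message, put it back at the front in place of the oldest kept message.
 */
function windowWithPinnedFirstUser(messages: Msg[], max: number = MAX_MESSAGES): Msg[] {
  if (messages.length <= max) return messages.slice();
  const tail = messages.slice(-max);
  const firstUser = messages.filter((m) => m.role === "user")[0];
  // Nothing to pin, or the original request is still inside the window.
  if (!firstUser || tail.indexOf(firstUser) !== -1) return tail;
  // Drop the oldest windowed message so the result still has `max` entries.
  return [firstUser, ...tail.slice(1)];
}
```

Replacing the oldest windowed message keeps the prompt size bounded; a real implementation might instead allow one extra message.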
35
42
 
36
43
  ## Build & Test
37
44
 
@@ -84,9 +91,12 @@ Parser tries 5 strategies: Claude JSON wrapper, direct JSON, code blocks, embedd
84
91
 
85
92
  ## Known Issues
86
93
 
87
- - **Sonnet intermittent hangs** — `claude -p` with Sonnet goes completely silent (~50% of the time) on large tool prompts (20KB+). First call often works, subsequent calls hang. NOT RAM-related. Likely API-side rate limiting or request dedup. Workaround: 30s stale-output detection + Haiku fallback.
88
- - **Haiku empty responses** — occasionally returns zero stdout (len:0). Cause unclear. The JSON reminder at prompt end helps but doesn't fully solve it.
89
- - **Pre-existing tsc errors** — 5 errors about `openclaw/plugin-sdk` module not found. These are expected — the SDK is injected at runtime by the gateway. Dist output is still generated.
94
+ - **Sonnet intermittent hangs** — `claude -p` with Sonnet goes completely silent (~45% of requests). Session resume makes it worse (corrupted sessions after SIGTERM). Workaround: session resume disabled for Sonnet (fresh `-p` every call), auto-escalate to Opus at 20+ messages. Opus has ~94% success rate.
95
+ - **Sonnet session resume disabled** — session resume caused corrupted sessions when SIGTERM killed processes. Only Opus uses `--session-id`/`--resume` now. Sonnet/Haiku send the full prompt every time (more tokens, but reliable).
96
+ - **Haiku unreliable for tool_calls** — returns text instead of tool_calls ~80% of the time in tool loops. Skipped in fallback chain when tools are expected.
97
+ - **Long-form generation limit** — generating 15KB+ responses (blog posts as Lexical JSON) can exceed even Opus's 90s stale timeout. The `claude -p` CLI sometimes goes silent during long generation. No workaround from the bridge side.
98
+ - **Agent delegation (disabled)** — infrastructure for delegating skills to `openclaw agent` is built but disabled. `openclaw agent` is single-turn only; multi-turn skill execution needs OpenClaw-side support.
99
+ - **Pre-existing tsc errors** — errors about `openclaw/plugin-sdk` module not found. Expected — the SDK is injected at runtime by the gateway. Dist output is still generated.
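The Haiku skip described in the known issues above amounts to filtering the fallback chain when tool calls are expected. A minimal sketch follows, assuming a flat list of model IDs. The chain order, the Haiku model ID, and the function name `fallbackCandidates` are hypothetical; only the skip rule itself comes from the notes.

```typescript
// Hypothetical chain; the Sonnet and Opus IDs appear in this diff, the Haiku ID is a placeholder.
const FALLBACK_CHAIN: string[] = [
  "cli-claude/claude-sonnet-4-6",
  "cli-claude/claude-opus-4-6",
  "cli-claude/claude-haiku-placeholder",
];

/**
 * Return the models to try after `failedModel`, skipping Haiku
 * whenever the caller expects tool_calls in the response.
 */
function fallbackCandidates(chain: string[], failedModel: string, expectsToolCalls: boolean): string[] {
  return chain.filter((m) => {
    if (m === failedModel) return false; // do not retry the model that just failed
    if (expectsToolCalls && m.indexOf("haiku") !== -1) return false; // Haiku tends to answer with text
    return true;
  });
}
```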
90
100
 
91
101
  ## Testing
92
102
 
package/README.md CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  > OpenClaw plugin that bridges locally installed AI CLIs (Codex, Gemini, Claude Code, OpenCode, Pi) as model providers — with slash commands for instant model switching, restore, health testing, and model listing.
4
4
 
5
- **Current version:** `3.5.1`
5
+ **Current version:** `3.7.0`
6
6
 
7
7
  ---
8
8
 
package/SKILL.md CHANGED
@@ -68,4 +68,4 @@ On gateway restart, if any session has expired, a **WhatsApp alert** is sent aut
68
68
 
69
69
  See `README.md` for full configuration reference and architecture diagram.
70
70
 
71
- **Version:** 3.5.1
71
+ **Version:** 3.7.0
@@ -2,7 +2,7 @@
2
2
  "id": "openclaw-cli-bridge-elvatis",
3
3
  "slug": "openclaw-cli-bridge-elvatis",
4
4
  "name": "OpenClaw CLI Bridge",
5
- "version": "3.5.1",
5
+ "version": "3.7.0",
6
6
  "license": "MIT",
7
7
  "description": "Phase 1: openai-codex auth bridge. Phase 2: local HTTP proxy routing model calls through gemini/claude CLIs (vllm provider).",
8
8
  "providers": [
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@elvatis_com/openclaw-cli-bridge-elvatis",
3
- "version": "3.5.1",
3
+ "version": "3.7.0",
4
4
  "description": "Bridges gemini, claude, and codex CLI tools as OpenClaw model providers. Reads existing CLI auth without re-login.",
5
5
  "type": "module",
6
6
  "openclaw": {
package/src/cli-runner.ts CHANGED
@@ -304,6 +304,8 @@ export interface RunCliOptions {
304
304
  */
305
305
  cwd?: string;
306
306
  timeoutMs?: number;
307
+ /** Override stale-output timeout (ms). Opus needs longer (90s) for long-form generation. */
308
+ staleTimeoutMs?: number;
307
309
  /** Optional logger for timeout events. */
308
310
  log?: (msg: string) => void;
309
311
  }
@@ -373,12 +375,13 @@ export function runCli(
373
375
  doKill(`timeout after ${Math.round(timeoutMs / 1000)}s`);
374
376
  }, timeoutMs);
375
377
 
376
- // ── Stale-output detection: kill if no stdout for STALE_OUTPUT_TIMEOUT_MS
377
- if (STALE_OUTPUT_TIMEOUT_MS > 0) {
378
+ // ── Stale-output detection: kill if no stdout for staleTimeoutMs
379
+ const effectiveStaleTimeout = opts.staleTimeoutMs ?? STALE_OUTPUT_TIMEOUT_MS;
380
+ if (effectiveStaleTimeout > 0) {
378
381
  const checkInterval = 15_000; // check every 15s
379
382
  staleTimer = setInterval(() => {
380
383
  const silent = Date.now() - lastOutputAt;
381
- if (silent >= STALE_OUTPUT_TIMEOUT_MS) {
384
+ if (silent >= effectiveStaleTimeout) {
382
385
  doKill(`stale output — no stdout for ${Math.round(silent / 1000)}s`);
383
386
  }
384
387
  }, checkInterval);
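The override logic in this hunk can be isolated as two pure functions. This is a sketch, not the package's code: the helper names are hypothetical and the 30 s default is assumed from the "30s for Sonnet" note in CLAUDE.md. It shows why `??` matters here: an explicit `0` disables detection instead of falling back to the default.

```typescript
const STALE_OUTPUT_TIMEOUT_MS = 30_000; // assumed default, per the CLAUDE.md note

/** Per-call override wins; `??` keeps an explicit 0 (disabled) intact. */
function resolveStaleTimeout(staleTimeoutMs?: number): number {
  return staleTimeoutMs ?? STALE_OUTPUT_TIMEOUT_MS;
}

/** True when the process has been silent for at least the effective timeout. */
function isStale(now: number, lastOutputAt: number, staleTimeoutMs?: number): boolean {
  const effective = resolveStaleTimeout(staleTimeoutMs);
  return effective > 0 && now - lastOutputAt >= effective;
}
```

Because the check only runs every 15 s, a silent process can survive up to roughly 15 s past the threshold before it is killed.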
@@ -667,7 +670,9 @@ export async function runClaude(
667
670
 
668
671
  const model = stripPrefix(modelId);
669
672
  const session = getOrCreateSession("claude", model);
670
- const isResume = session.requestCount > 0;
673
+ // Session resume: enabled for Opus (reliable), disabled for Sonnet/Haiku (45% hang rate)
674
+ const isOpus = model.includes("opus");
675
+ const isResume = isOpus && session.requestCount > 0;
671
676
 
672
677
  const args: string[] = [
673
678
  "-p",
@@ -679,24 +684,28 @@ export async function runClaude(
679
684
 
680
685
  if (isResume) {
681
686
  args.push("--resume", session.sessionId);
682
- } else {
687
+ } else if (isOpus) {
683
688
  args.push("--session-id", session.sessionId);
684
689
  }
690
+ // Sonnet/Haiku: no session args — fresh call every time
685
691
 
686
- // When tools are present, sandwich the conversation between tool instructions.
687
- // On resume: only send the last user message (Claude has the full history).
688
- // On first request: send the full prompt with tool block.
692
+ // On resume: only send the last user message (Opus has the full history).
693
+ // On fresh: send the full prompt with tool block.
689
694
  const effectivePrompt = opts?.tools?.length
690
- ? buildToolPromptBlock(opts.tools) + "\n\n" + prompt + "\n\nREMINDER: You MUST respond with ONLY valid JSON — either {\"tool_calls\":[...]} or {\"content\":\"...\"}. Nothing else."
695
+ ? (isResume
696
+ ? prompt + "\n\nREMINDER: You MUST respond with ONLY valid JSON — either {\"tool_calls\":[...]} or {\"content\":\"...\"}. Nothing else."
697
+ : buildToolPromptBlock(opts.tools) + "\n\n" + prompt + "\n\nREMINDER: You MUST respond with ONLY valid JSON — either {\"tool_calls\":[...]} or {\"content\":\"...\"}. Nothing else.")
691
698
  : prompt;
692
699
 
693
700
  const cwd = workdir ?? homedir();
694
- debugLog("CLAUDE", `${isResume ? "resume" : "new"} ${model} session=${session.sessionId.slice(0, 8)}`, {
701
+ debugLog("CLAUDE", `${isResume ? "resume" : "fresh"} ${model}${isResume ? ` session=${session.sessionId.slice(0, 8)}` : ""}`, {
695
702
  promptLen: effectivePrompt.length, promptKB: Math.round(effectivePrompt.length / 1024),
696
- requestCount: session.requestCount, cwd, timeoutMs: Math.round(timeoutMs / 1000),
703
+ cwd, timeoutMs: Math.round(timeoutMs / 1000), ...(isOpus ? { requestCount: session.requestCount } : {}),
697
704
  });
698
705
 
699
- const result = await runCli("claude", args, effectivePrompt, timeoutMs, { cwd, log: opts?.log });
706
+ // Opus gets a 90s stale timeout because it needs think time for long-form generation (blog posts, Lexical JSON)
707
+ const staleMs = isOpus ? 90_000 : undefined;
708
+ const result = await runCli("claude", args, effectivePrompt, timeoutMs, { cwd, log: opts?.log, staleTimeoutMs: staleMs });
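The session handling introduced in this hunk reduces to a three-way decision. The sketch below restates it as a standalone function; `buildSessionArgs` is a hypothetical name, while the flags `--resume` and `--session-id` and the `opus` substring check are taken from the diff.

```typescript
/**
 * Session flags for a `claude -p` call:
 *  - Sonnet/Haiku: none (fresh call every time)
 *  - Opus, first request: --session-id <id>
 *  - Opus, later requests: --resume <id>
 */
function buildSessionArgs(model: string, sessionId: string, requestCount: number): string[] {
  const isOpus = model.indexOf("opus") !== -1;
  if (!isOpus) return [];
  return requestCount > 0
    ? ["--resume", sessionId]
    : ["--session-id", sessionId];
}
```

Only the resume case can send the last user message alone; the other two cases need the full prompt, including the tool block.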
700
709
 
701
710
  // Session succeeded — update registry
702
711
  if (result.exitCode === 0 || result.stdout.length > 0) {
@@ -1003,6 +1012,92 @@ function detectProjectFromPrompt(prompt: string): { name: string; path: string }
1003
1012
  return null;
1004
1013
  }
1005
1014
 
1015
+ // ── Skill hint injection ─────────────────────────────────────────────────────
1016
+ // Scans ~/.openclaw/skills/ for skill directories with SKILL.md files.
1017
+ // When a user message contains a skill's directory name as a whole word,
1018
+ // injects a pointer so the model knows where to find it.
1019
+
1020
+ interface SkillEntry {
1021
+ name: string;
1022
+ path: string;
1023
+ description: string;
1024
+ keywords: string[];
1025
+ scripts: string[];
1026
+ }
1027
+
1028
+ let _skillRegistry: SkillEntry[] | null = null;
1029
+ let _skillRegistryRefreshedAt = 0;
1030
+ const SKILL_REGISTRY_CACHE_TTL = 120_000; // refresh every 2 min
1031
+
1032
+ function getSkillRegistry(): SkillEntry[] {
1033
+ const now = Date.now();
1034
+ if (_skillRegistry && (now - _skillRegistryRefreshedAt) < SKILL_REGISTRY_CACHE_TTL) {
1035
+ return _skillRegistry;
1036
+ }
1037
+ _skillRegistry = [];
1038
+ const skillsDir = join(homedir(), ".openclaw", "skills");
1039
+ try {
1040
+ if (!existsSync(skillsDir)) return _skillRegistry;
1041
+ const entries = readdirSync(skillsDir);
1042
+ for (const name of entries) {
1043
+ const skillDir = join(skillsDir, name);
1044
+ const skillMd = join(skillDir, "SKILL.md");
1045
+ try {
1046
+ if (!statSync(skillDir).isDirectory()) continue;
1047
+ if (!existsSync(skillMd)) continue;
1048
+ // Read first 500 chars of SKILL.md to extract description and keywords
1049
+ const content = readFileSync(skillMd, "utf8").slice(0, 500);
1050
+ const descMatch = content.match(/description:\s*"([^"]+)"/);
1051
+ const description = descMatch?.[1] ?? "";
1052
+ // Build keywords from: skill name, words in description, hyphen-split name parts
1053
+ const keywords = [
1054
+ name,
1055
+ ...name.split("-"),
1056
+ ...description.toLowerCase().split(/[\s,.:;]+/).filter(w => w.length > 3),
1057
+ ];
1058
+ // Find scripts
1059
+ const scriptsDir = join(skillDir, "scripts");
1060
+ let scripts: string[] = [];
1061
+ try {
1062
+ if (existsSync(scriptsDir) && statSync(scriptsDir).isDirectory()) {
1063
+ scripts = readdirSync(scriptsDir).filter(f => f.endsWith(".py") || f.endsWith(".sh"));
1064
+ }
1065
+ } catch { /* no scripts dir */ }
1066
+ _skillRegistry.push({ name, path: skillDir, description, keywords, scripts });
1067
+ } catch { /* skip unreadable skill */ }
1068
+ }
1069
+ } catch { /* no skills dir */ }
1070
+ _skillRegistryRefreshedAt = now;
1071
+ return _skillRegistry;
1072
+ }
1073
+
1074
+ function detectSkillHints(userText: string): string | null {
1075
+ const skills = getSkillRegistry();
1076
+ if (!skills.length) return null;
1077
+
1078
+ const matched: SkillEntry[] = [];
1079
+
1080
+ for (const skill of skills) {
1081
+ // Match by exact skill name in prompt only
1082
+ const nameRegex = new RegExp(`\\b${skill.name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")}\\b`, "i");
1083
+ if (nameRegex.test(userText)) {
1084
+ matched.push(skill);
1085
+ }
1086
+ }
1087
+
1088
+ if (!matched.length) return null;
1089
+
1090
+ // Keep hints compact — every byte counts at high message counts
1091
+ const hints = matched.map(skill => {
1092
+ const scripts = skill.scripts.length > 0
1093
+ ? ` Scripts: ${skill.scripts.map(s => `${skill.path}/scripts/${s}`).join(", ")}`
1094
+ : "";
1095
+ return `[Skill: ${skill.name}] Read: ${skill.path}/SKILL.md — follow workflow with read/exec tools.${scripts}`;
1096
+ });
1097
+
1098
+ return hints.join("\n");
1099
+ }
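The matching rule used by `detectSkillHints` is a case-insensitive whole-word regex built from the escaped skill name. A self-contained sketch of that rule follows; the helper names `escapeRegExp` and `matchSkillNames` are hypothetical, and the skill names in the usage note are examples only.

```typescript
/** Escape regex metacharacters so a skill name is matched literally. */
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/** Return the skill names that occur in `userText` as whole words (case-insensitive). */
function matchSkillNames(userText: string, skillNames: string[]): string[] {
  return skillNames.filter((name) =>
    new RegExp("\\b" + escapeRegExp(name) + "\\b", "i").test(userText)
  );
}
```

One caveat of `\b`: a hyphen counts as a word boundary, so a skill named `blog` would also match the text `blog-writer`, while `blogging` would not match.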
1100
+
1006
1101
  /**
1007
1102
  * Route a chat completion to the correct CLI based on model prefix.
1008
1103
  * cli-gemini/<id> → gemini CLI
@@ -1028,11 +1123,12 @@ export async function routeToCliRunner(
1028
1123
  const hasTools = toolCount > 0;
1029
1124
 
1030
1125
  // Auto-detect project from user messages only (not tool results which mention other projects)
1126
+ const userText = messages
1127
+ .filter((m) => m.role === "user")
1128
+ .map((m) => contentToString(m.content))
1129
+ .join(" ");
1130
+
1031
1131
  if (!opts.workdir) {
1032
- const userText = messages
1033
- .filter((m) => m.role === "user")
1034
- .map((m) => typeof m.content === "string" ? m.content : "")
1035
- .join(" ");
1036
1132
  const detected = detectProjectFromPrompt(userText);
1037
1133
  if (detected) {
1038
1134
  opts = { ...opts, workdir: detected.path };
@@ -1041,6 +1137,13 @@ export async function routeToCliRunner(
1041
1137
  }
1042
1138
  }
1043
1139
 
1140
+ // Skill hints: inject at END of prompt so they're the freshest context (not buried under system msg)
1141
+ const skillHints = detectSkillHints(userText);
1142
+ if (skillHints) {
1143
+ prompt = `${prompt}\n\n${skillHints}`;
1144
+ debugLog("SKILL-HINT", "injected skill hints at end of prompt", { len: skillHints.length });
1145
+ }
1146
+
1044
1147
  // Strip "vllm/" prefix if present — OpenClaw sends the full provider path
1045
1148
  // (e.g. "vllm/cli-claude/claude-sonnet-4-6") but the router only needs the
1046
1149
  // "cli-<type>/<model>" portion.
package/src/config.ts CHANGED
@@ -78,6 +78,12 @@ export const TOOL_HEAVY_THRESHOLD = 10;
78
78
  */
79
79
  export const TOOL_ROUTING_THRESHOLD = 8;
80
80
 
81
+ /**
82
+ * Prompt size threshold (bytes) for escalating Sonnet to Opus.
83
+ * Sonnet hangs ~50% at 30KB+ prompts. Opus handles large contexts reliably.
84
+ */
85
+ export const OPUS_ESCALATION_THRESHOLD = 30_000;
86
+
81
87
  /** Max characters per message content before truncation. */
82
88
  export const MAX_MSG_CHARS = 4_000;
83
89
 
@@ -9,7 +9,8 @@
9
9
  */
10
10
 
11
11
  import http from "node:http";
12
- import { randomBytes } from "node:crypto";
12
+ import { execSync } from "node:child_process";
13
+ import { randomBytes, createHash } from "node:crypto";
13
14
  import { type ChatMessage, type CliToolResult, type ToolDefinition, routeToCliRunner, extractMultimodalParts, cleanupMediaFiles } from "./cli-runner.js";
14
15
  import { scheduleTokenRefresh, setAuthLogger, stopTokenRefresh } from "./claude-auth.js";
15
16
  import { grokComplete, grokCompleteStream, type ChatMessage as GrokChatMessage } from "./grok-client.js";
@@ -33,9 +34,118 @@ import {
33
34
  BITNET_SYSTEM_PROMPT,
34
35
  DEFAULT_MODEL_TIMEOUTS,
35
36
  TOOL_ROUTING_THRESHOLD,
37
+ OPUS_ESCALATION_THRESHOLD,
36
38
  } from "./config.js";
37
39
  import { debugLog, DEBUG_LOG_PATH, getLogTail, watchLogFile, setDebugLogEnabled } from "./debug-log.js";
38
40
 
41
+ // ── Skill delegation via openclaw agent ─────────────────────────────────────
42
+
43
+ import { existsSync, readdirSync, statSync } from "node:fs";
44
+ import { join } from "node:path";
45
+ import { homedir } from "node:os";
46
+ import { spawn as spawnChild } from "node:child_process";
47
+
48
+ const activeDelegations = new Set<string>();
49
+
50
+ function extractUserText(messages: ChatMessage[]): string {
51
+ return messages
52
+ .filter((m) => m.role === "user")
53
+ .map((m) => {
54
+ if (typeof m.content === "string") return m.content;
55
+ if (Array.isArray(m.content)) {
56
+ return (m.content as Array<{ type: string; text?: string }>)
57
+ .filter((p) => p.type === "text" && p.text)
58
+ .map((p) => p.text!)
59
+ .join(" ");
60
+ }
61
+ return "";
62
+ })
63
+ .join(" ");
64
+ }
65
+
66
+ let _skillNames: string[] | null = null;
67
+ let _skillNamesAt = 0;
68
+
69
+ function getSkillNames(): string[] {
70
+ const now = Date.now();
71
+ if (_skillNames && (now - _skillNamesAt) < 120_000) return _skillNames;
72
+ _skillNames = [];
73
+ const dir = join(homedir(), ".openclaw", "skills");
74
+ try {
75
+ if (!existsSync(dir)) return _skillNames;
76
+ for (const name of readdirSync(dir)) {
77
+ try {
78
+ if (statSync(join(dir, name)).isDirectory() && existsSync(join(dir, name, "SKILL.md"))) {
79
+ _skillNames.push(name);
80
+ }
81
+ } catch {}
82
+ }
83
+ } catch {}
84
+ _skillNamesAt = now;
85
+ return _skillNames;
86
+ }
87
+
88
+ function detectMatchedSkill(userText: string): string | null {
89
+ for (const name of getSkillNames()) {
90
+ const re = new RegExp(`\\b${name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")}\\b`, "i");
91
+ if (re.test(userText)) return name;
92
+ }
93
+ return null;
94
+ }
95
+
96
+ async function delegateToAgent(prompt: string, timeoutMs: number): Promise<{ text: string; durationMs: number }> {
97
+ const start = Date.now();
98
+ const timeoutSec = Math.min(Math.floor(timeoutMs / 1000), 300);
99
+
100
+ return new Promise((resolve, reject) => {
101
+ // Use the same Node + openclaw entry point as the systemd service to avoid version mismatches
102
+ const openclawEntry = join(homedir(), ".npm-global", "lib", "node_modules", "openclaw", "dist", "entry.js");
103
+ const useEntryJs = existsSync(openclawEntry);
104
+ const cmd = useEntryJs ? process.execPath : "openclaw"; // process.execPath is the running Node binary (e.g. /usr/bin/node)
105
+ const args = useEntryJs
106
+ ? [openclawEntry, "agent", "--agent", "main", "--message", prompt, "--json", "--timeout", String(timeoutSec)]
107
+ : ["agent", "--agent", "main", "--message", prompt, "--json", "--timeout", String(timeoutSec)];
108
+ const child = spawnChild(cmd, args, {
109
+ env: { ...process.env, PATH: `${join(homedir(), ".local", "bin")}:${process.env.PATH ?? ""}` },
110
+ stdio: ["pipe", "pipe", "pipe"],
111
+ });
112
+
113
+ let stdout = "";
114
+ let stderr = "";
115
+ child.stdout?.on("data", (d: Buffer) => { stdout += d.toString(); });
116
+ child.stderr?.on("data", (d: Buffer) => { stderr += d.toString(); });
117
+
118
+ const timer = setTimeout(() => { child.kill("SIGTERM"); }, timeoutMs + 10_000);
119
+
120
+ child.on("close", (code) => {
121
+ clearTimeout(timer);
122
+ const durationMs = Date.now() - start;
123
+ // Only fail if no JSON output at all — stderr always has plugin log noise
124
+ const hasJsonOutput = stdout.includes('"status"') || stdout.includes('"result"');
125
+ if (code !== 0 && !hasJsonOutput) {
126
+ // Filter out plugin log lines from stderr to find real errors
127
+ const realErrors = stderr.split("\n").filter(l => !l.includes("[plugins]") && !l.includes("[memory-") && l.trim()).join("\n");
128
+ reject(new Error(`openclaw agent exited ${code}: ${realErrors.slice(0, 500) || stderr.slice(0, 500)}`));
129
+ return;
130
+ }
131
+ try {
132
+ const jsonStart = stdout.indexOf("{");
133
+ if (jsonStart === -1) {
134
+ reject(new Error("No JSON in openclaw agent output"));
135
+ return;
136
+ }
137
+ const result = JSON.parse(stdout.slice(jsonStart));
138
+ const text = result?.result?.payloads?.[0]?.text ?? result?.result?.text ?? "";
139
+ resolve({ text, durationMs });
140
+ } catch (e) {
141
+ reject(new Error(`Failed to parse agent result: ${(e as Error).message}`));
142
+ }
143
+ });
144
+
145
+ child.on("error", (err) => { clearTimeout(timer); reject(err); });
146
+ });
147
+ }
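`delegateToAgent` parses from the first `{` in stdout, which fails if log noise before the payload happens to contain a brace. CLAUDE.md mentions that the improved parser "tries multiple `{` positions"; the sketch below illustrates that idea in isolation. It is an assumption about the approach, not the bridge's actual parser, and `extractJson` is a hypothetical name.

```typescript
/**
 * Try JSON.parse from each "{" in turn and return the first value that parses.
 * Returns null when no position yields valid JSON.
 */
function extractJson(raw: string): unknown {
  let start = raw.indexOf("{");
  while (start !== -1) {
    try {
      return JSON.parse(raw.slice(start));
    } catch {
      // Not valid JSON from this position; move on to the next brace.
      start = raw.indexOf("{", start + 1);
    }
  }
  return null;
}
```

Like the code in the hunk above, this still requires the JSON to run to the end of the string (trailing whitespace is fine, trailing log lines are not).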
148
+
39
149
  // ── Active request tracking ─────────────────────────────────────────────────
40
150
 
41
151
  export interface ActiveRequest {
@@ -846,15 +956,107 @@ async function handleRequest(
846
956
  }
847
957
  // ─────────────────────────────────────────────────────────────────────────
848
958
 
959
+ // ── Skill delegation: delegate to openclaw agent for full workflow execution ──
960
+ const userText = extractUserText(cleanMessages);
961
+ const matchedSkill = detectMatchedSkill(userText);
962
+ const delegationKey = matchedSkill ? `${matchedSkill}:${createHash("md5").update(userText.slice(0, 500)).digest("hex").slice(0, 12)}` : null;
963
+
964
+ // TODO: delegation needs a multi-turn agent runner, not single-turn `openclaw agent`.
965
+ // `openclaw agent` returns after one turn (220ms) without executing the full workflow.
966
+ // Re-enable when OpenClaw supports multi-turn skill execution (e.g., `openclaw skill run blog-writer`).
967
+ if (false && matchedSkill && delegationKey && activeDelegations.size === 0) {
968
+ debugLog("DELEGATE", `skill "${matchedSkill}" detected, delegating to openclaw agent`, { msgs: cleanMessages.length });
969
+ activeDelegations.add(delegationKey);
970
+
971
+ // Send SSE headers early if streaming
972
+ if (stream) {
973
+ res.writeHead(200, { "Content-Type": "text/event-stream", "Cache-Control": "no-cache", Connection: "keep-alive", ...corsHeaders() });
974
+ res.write(": delegating to openclaw agent\n\n");
975
+ // Keepalive while agent runs
976
+ const ka = setInterval(() => { res.write(": agent working\n\n"); }, 15_000);
977
+ try {
978
+ const lastUser = [...cleanMessages].reverse().find(m => m.role === "user");
979
+ const delegatePrompt = typeof lastUser?.content === "string" ? lastUser.content
980
+ : Array.isArray(lastUser?.content) ? (lastUser!.content as Array<{ type: string; text?: string }>).filter(p => p.type === "text").map(p => p.text).join(" ")
981
+ : userText.slice(-2000);
982
+
983
+ const agentResult = await delegateToAgent(delegatePrompt, MAX_EFFECTIVE_TIMEOUT_MS);
984
+ debugLog("DELEGATE-OK", `skill "${matchedSkill}" completed in ${(agentResult.durationMs / 1000).toFixed(1)}s`, { contentLen: agentResult.text.length });
985
+ metrics.recordRequest(model, agentResult.durationMs, true, estPromptTokens, estimateTokens(agentResult.text), promptPreview);
986
+
987
+ const chunk = { id, object: "chat.completion.chunk", created, model, choices: [{ index: 0, delta: { role: "assistant", content: agentResult.text }, finish_reason: "stop" }] };
988
+ res.write(`data: ${JSON.stringify(chunk)}\n\n`);
989
+ res.write("data: [DONE]\n\n");
990
+ res.end();
991
+ } catch (err) {
992
+ const msg = (err as Error).message;
993
+ debugLog("DELEGATE-FAIL", `skill "${matchedSkill}" failed`, { error: msg.slice(0, 200) });
994
+ opts.warn(`[cli-bridge] agent delegation failed: ${msg.slice(0, 100)}`);
995
+ // SSE headers are already sent, so this path cannot fall through to CLI routing
996
+ clearInterval(ka);
997
+ activeDelegations.delete(delegationKey);
998
+ // Report the failure to the client as an SSE error event instead
999
+ res.write(`data: ${JSON.stringify({ error: { message: `Agent delegation failed: ${msg.slice(0, 200)}.`, type: "cli_error" } })}\n\n`);
1000
+ res.write("data: [DONE]\n\n");
1001
+ res.end();
1002
+ activeRequests.delete(id);
1003
+ cleanupMediaFiles(mediaFiles);
1004
+ return;
1005
+ } finally {
1006
+ clearInterval(ka);
1007
+ activeDelegations.delete(delegationKey);
1008
+ }
1009
+ activeRequests.delete(id);
1010
+ cleanupMediaFiles(mediaFiles);
1011
+ return;
1012
+ }
1013
+
1014
+ // Non-streaming delegation
1015
+ try {
1016
+ const lastUser = [...cleanMessages].reverse().find(m => m.role === "user");
1017
+ const delegatePrompt = typeof lastUser?.content === "string" ? lastUser.content
1018
+ : Array.isArray(lastUser?.content) ? (lastUser!.content as Array<{ type: string; text?: string }>).filter(p => p.type === "text").map(p => p.text).join(" ")
1019
+ : userText.slice(-2000);
1020
+
1021
+ const agentResult = await delegateToAgent(delegatePrompt, MAX_EFFECTIVE_TIMEOUT_MS);
1022
+ debugLog("DELEGATE-OK", `skill "${matchedSkill}" completed in ${(agentResult.durationMs / 1000).toFixed(1)}s`, { contentLen: agentResult.text.length });
1023
+
1024
+ res.writeHead(200, { "Content-Type": "application/json", ...corsHeaders() });
1025
+ res.end(JSON.stringify({
1026
+ id, object: "chat.completion", created, model,
1027
+ choices: [{ index: 0, message: { role: "assistant", content: agentResult.text }, finish_reason: "stop" }],
1028
+ usage: { prompt_tokens: estPromptTokens, completion_tokens: estimateTokens(agentResult.text), total_tokens: estPromptTokens + estimateTokens(agentResult.text) },
1029
+ }));
1030
+ activeRequests.delete(id);
1031
+ cleanupMediaFiles(mediaFiles);
1032
+ return;
1033
+ } catch (err) {
1034
+ debugLog("DELEGATE-FAIL", `skill "${matchedSkill}" failed, falling through to CLI`, { error: (err as Error).message.slice(0, 200) });
1035
+ activeDelegations.delete(delegationKey);
1036
+ // Fall through to normal CLI routing
1037
+ } finally {
1038
+ activeDelegations.delete(delegationKey);
1039
+ }
1040
+ }
1041
+
849
1042
  // ── CLI runner routing (Gemini / Claude Code / Codex) ──────────────────────
850
1043
  let result: CliToolResult;
851
1044
  let usedModel = model;
852
1045
 
853
- // ── Smart tool routing: Sonnet first (better reasoning), fast fallback to Haiku ──
854
- // Sonnet picks the right tools but intermittently hangs on large prompts.
855
- // Strategy: let Sonnet try first if it responds, great (better tool selection).
856
- // The stale-output detector (60s) kills it fast if it hangs, then fallback to Haiku.
857
- // This preserves Sonnet's intelligence for tool selection while keeping Haiku as safety net.
1046
+ // ── Opus escalation: route heavy conversations to Opus instead of Sonnet ──
1047
+ // Sonnet hangs ~50% at 30KB+ prompts. Opus handles large contexts reliably.
1048
+ // Measure by message count (proxy for formatted prompt size after truncation):
1049
+ // - With 21 tools + 12 messages (heavy tools window), prompt hits ~30KB
1050
+ // - Escalate when messages > 20 (conversation is deep enough to cause hangs)
1051
+ const shouldEscalate = model === "cli-claude/claude-sonnet-4-6"
1052
+ && cleanMessages.length > 20
1053
+ && hasTools;
1054
+ if (shouldEscalate) {
1055
+ const originalModel = model;
1056
+ usedModel = "cli-claude/claude-opus-4-6";
1057
+ debugLog("OPUS-ESCALATE", `${originalModel} → ${usedModel}`, { msgs: cleanMessages.length, tools: tools?.length ?? 0 });
1058
+ opts.log(`[cli-bridge] escalating to Opus (${cleanMessages.length} msgs with ${tools?.length ?? 0} tools)`);
1059
+ }
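The escalation condition above is a pure predicate on three inputs. Restated as a standalone function, under the assumption that nothing else influences the choice (`pickModel` and `ESCALATION_MESSAGE_COUNT` are hypothetical names; the diff hard-codes `20` inline):

```typescript
const SONNET = "cli-claude/claude-sonnet-4-6";
const OPUS = "cli-claude/claude-opus-4-6";
const ESCALATION_MESSAGE_COUNT = 20; // hypothetical constant for the inline literal

/** Route Sonnet requests to Opus when the conversation is deep and tools are in play. */
function pickModel(model: string, messageCount: number, hasTools: boolean): string {
  const shouldEscalate =
    model === SONNET && messageCount > ESCALATION_MESSAGE_COUNT && hasTools;
  return shouldEscalate ? OPUS : model;
}
```

Note that the condition in this hunk uses message count only; the byte-based `OPUS_ESCALATION_THRESHOLD` imported from `config.ts` is not referenced by it.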
858
1060
 
859
1061
  const routeOpts = { workdir, tools: hasTools ? tools : undefined, mediaFiles: mediaFiles.length ? mediaFiles : undefined, log: opts.log };
860
1062