@martian-engineering/lossless-claw 0.5.1 → 0.5.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/README.md +4 -3
package/docs/configuration.md +1 -0
package/docs/tui.md +10 -1
package/package.json +1 -1
package/src/compaction.ts +28 -7
package/src/engine.ts +10 -2
package/src/plugin/index.ts +2 -0
package/src/summarize.ts +28 -6
package/src/tools/lcm-expand-query-tool.ts +214 -122

package/README.md CHANGED

@@ -98,8 +98,8 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
           "ignoreSessionPatterns": [
             "agent:*:cron:**"
           ],
-          "summaryProvider": "anthropic",
-          "summaryModel": "claude-3-5-haiku"
+          "summaryModel": "anthropic/claude-haiku-4-5",
+          "expansionModel": "anthropic/claude-haiku-4-5"
         }
       }
     }
@@ -107,7 +107,7 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
 }
 ```
-`summaryModel` and `summaryProvider` let you pin compaction summarization to a cheaper or faster model than your main OpenClaw session model. When unset, LCM uses OpenClaw's configured default model/provider.
+`summaryModel` and `summaryProvider` let you pin compaction summarization to a cheaper or faster model than your main OpenClaw session model. `expansionModel` does the same for `lcm_expand_query` sub-agent calls (drilling into summaries to recover detail). When unset, both fall back to OpenClaw's configured default model/provider. See [Expansion model override requirements](#expansion-model-override-requirements) for the required `subagent` trust policy when using `expansionModel`.
 ### Environment variables
@@ -133,6 +133,7 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
 | `LCM_LARGE_FILE_SUMMARY_MODEL` | `""` | Model override for large-file summarization |
 | `LCM_SUMMARY_MODEL` | `""` | Model override for compaction summarization; falls back to OpenClaw's default model when unset |
 | `LCM_SUMMARY_PROVIDER` | `""` | Provider override for compaction summarization; falls back to `OPENCLAW_PROVIDER` or the provider embedded in the model ref |
+| `LCM_SUMMARY_BASE_URL` | *(from OpenClaw / provider default)* | Base URL override for summarization API calls |
 | `LCM_EXPANSION_MODEL` | *(from OpenClaw)* | Model override for `lcm_expand_query` sub-agent (e.g. `anthropic/claude-haiku-4-5`) |
 | `LCM_EXPANSION_PROVIDER` | *(from OpenClaw)* | Provider override for `lcm_expand_query` sub-agent |
 | `LCM_AUTOCOMPACT_DISABLED` | `false` | Disable automatic compaction after turns |

package/docs/configuration.md CHANGED

@@ -99,6 +99,7 @@ LCM uses the same model as the parent OpenClaw session for summarization by defa
 # Use a specific model for summarization
 export LCM_SUMMARY_MODEL=anthropic/claude-sonnet-4-20250514
 export LCM_SUMMARY_PROVIDER=anthropic
+export LCM_SUMMARY_BASE_URL=https://api.anthropic.com
 ```
 Using a cheaper/faster model for summarization can reduce costs, but quality matters — poor summaries compound as they're condensed into higher-level nodes.

package/docs/tui.md CHANGED

@@ -287,6 +287,9 @@ lcm-tui rewrite 44 --all --apply --diff
 # Rewrite with OpenAI Responses API
 lcm-tui rewrite 44 --summary sum_abc123 --provider openai --model gpt-5.3-codex --apply
+# Rewrite through a custom OpenAI-compatible proxy
+lcm-tui rewrite 44 --summary sum_abc123 --provider openai --model gpt-5.3-codex --base-url https://proxy.example.com/openai --apply
 # Use custom prompt templates
 lcm-tui rewrite 44 --all --apply --prompt-dir ~/.config/lcm-tui/prompts
 ```
@@ -301,6 +304,7 @@ lcm-tui rewrite 44 --all --apply --prompt-dir ~/.config/lcm-tui/prompts
 | `--diff` | Show unified diff |
 | `--provider <id>` | API provider (inferred from `--model` when omitted) |
 | `--model <model>` | API model (default depends on provider) |
+| `--base-url <url>` | Custom API base URL (overrides config and env) |
 | `--prompt-dir <path>` | Custom prompt template directory |
 | `--timestamps` | Inject timestamps into source text (default: true) |
 | `--tz <timezone>` | Timezone for timestamps (default: system local) |
@@ -378,6 +382,9 @@ lcm-tui backfill my-agent session_abc123 --apply --transplant-to 653
 # Backfill using OpenAI
 lcm-tui backfill my-agent session_abc123 --apply --provider openai --model gpt-5.3-codex
+# Backfill through a custom OpenAI-compatible proxy
+lcm-tui backfill my-agent session_abc123 --apply --provider openai --model gpt-5.3-codex --base-url https://proxy.example.com/openai
 ```
 All write paths are transactional:
@@ -404,6 +411,7 @@ An idempotency guard prevents duplicate imports for the same `session_id`.
 | `--fresh-tail <n>` | Preserve freshest N raw messages from leaf compaction |
 | `--provider <id>` | API provider (inferred from model when omitted) |
 | `--model <id>` | API model (default depends on provider) |
+| `--base-url <url>` | Custom API base URL (overrides config and env) |
 | `--prompt-dir <path>` | Custom depth-prompt directory |
 ### `lcm-tui prompts`
@@ -479,9 +487,10 @@ If the provider auth profile mode is `oauth` (not `api_key`), set the provider A
 Interactive rewrite (`w`/`W`) can be configured with:
 - `LCM_TUI_SUMMARY_PROVIDER`
 - `LCM_TUI_SUMMARY_MODEL`
+- `LCM_TUI_SUMMARY_BASE_URL`
 - `LCM_TUI_CONVERSATION_WINDOW_SIZE` (default `200`)
-It also honors `LCM_SUMMARY_PROVIDER` / `LCM_SUMMARY_MODEL` as fallback.
+It also honors `LCM_SUMMARY_PROVIDER` / `LCM_SUMMARY_MODEL` / `LCM_SUMMARY_BASE_URL` as fallback.
 ## Database
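
Note on the tui.md hunk above: the interactive rewrite reads `LCM_TUI_SUMMARY_BASE_URL` and, per the changed line, honors `LCM_SUMMARY_BASE_URL` as a fallback. A minimal sketch of that lookup order follows. The function name `resolveSummaryBaseUrl` and the `Env` type are hypothetical, introduced only for illustration; the sketch does not model the `--base-url` flag or any config file, and it is not the tool's actual implementation.

```typescript
// Hypothetical helper: not part of lcm-tui or lossless-claw.
type Env = Record<string, string | undefined>;

// Lookup order taken from the docs: the TUI-specific variable first,
// then the shared summarizer variable as a fallback.
const BASE_URL_KEYS = ["LCM_TUI_SUMMARY_BASE_URL", "LCM_SUMMARY_BASE_URL"];

function resolveSummaryBaseUrl(env: Env): string | undefined {
  for (const key of BASE_URL_KEYS) {
    const value = env[key]?.trim();
    if (value) {
      return value; // first non-empty value wins
    }
  }
  // Nothing set: the caller would use the provider's default endpoint.
  return undefined;
}
```

Treating a whitespace-only value as unset is an assumption of this sketch; the docs do not say how empty values are handled.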

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@martian-engineering/lossless-claw",
-  "version": "0.5.1",
+  "version": "0.5.2",
   "description": "Lossless Context Management plugin for OpenClaw — DAG-based conversation summarization with incremental compaction",
   "type": "module",
   "main": "index.ts",

package/src/compaction.ts CHANGED

@@ -2,6 +2,7 @@ import { createHash } from "node:crypto";
 import type { ConversationStore, CreateMessagePartInput } from "./store/conversation-store.js";
 import type { SummaryStore, SummaryRecord, ContextItemRecord } from "./store/summary-store.js";
 import { extractFileIdsFromContent } from "./large-files.js";
+import { LcmProviderAuthError } from "./summarize.js";
 // ── Public types ─────────────────────────────────────────────────────────────
@@ -1001,6 +1002,9 @@ export class CompactionEngine {
   /**
    * Run three-level summarization escalation:
    * normal -> aggressive -> deterministic fallback.
+   *
+   * Provider-auth failures are treated as non-compacting skips so we do not
+   * persist truncation artifacts into the summary DAG.
    */
   private async summarizeWithEscalation(params: {
     sourceText: string;
@@ -1026,17 +1030,31 @@ export class CompactionEngine {
         level: "fallback",
       };
     };
-    const runSummarizer = async (aggressiveMode: boolean): Promise<string | null> => {
-      const output = await params.summarize(sourceText, aggressiveMode, params.options);
+    const authFailure = Symbol("authFailure");
+    const runSummarizer = async (
+      aggressiveMode: boolean,
+    ): Promise<string | null | typeof authFailure> => {
+      let output: string;
+      try {
+        output = await params.summarize(sourceText, aggressiveMode, params.options);
+      } catch (err) {
+        if (err instanceof LcmProviderAuthError) {
+          return authFailure;
+        }
+        throw err;
+      }
       const trimmed = output.trim();
       return trimmed || null;
     };
     const initialSummary = await runSummarizer(false);
+    if (initialSummary === authFailure) {
+      return null;
+    }
     if (initialSummary === null) {
-      // Empty provider output should still compact deterministically so auth
-      // failures or empty responses do not stall compaction entirely.
+      // Empty provider output should still compact deterministically so a
+      // silent no-op does not stall compaction forever.
       return buildDeterministicFallback();
     }
     let summaryText = initialSummary;
@@ -1044,6 +1062,9 @@ export class CompactionEngine {
     if (estimateTokens(summaryText) >= inputTokens) {
       const aggressiveSummary = await runSummarizer(true);
+      if (aggressiveSummary === authFailure) {
+        return null;
+      }
       if (aggressiveSummary === null) {
         return buildDeterministicFallback();
       }
@@ -1149,7 +1170,7 @@ export class CompactionEngine {
     });
     if (!summary) {
       console.warn(
-        `[lcm] leaf summarizer returned empty content; conversationId=${conversationId}; chunkMessages=${messageContents.length}; skipping leaf chunk`,
+        `[lcm] leaf compaction skipped summary write; conversationId=${conversationId}; chunkMessages=${messageContents.length}`,
       );
       return null;
     }
@@ -1256,7 +1277,7 @@ export class CompactionEngine {
     });
     if (!condensed) {
       console.warn(
-        `[lcm] condensed summarizer returned empty content; conversationId=${conversationId}; depth=${targetDepth}; chunkSummaries=${summaryRecords.length}; skipping condensed chunk`,
+        `[lcm] condensed compaction skipped summary write; conversationId=${conversationId}; depth=${targetDepth}; chunkSummaries=${summaryRecords.length}`,
       );
       return null;
     }
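
Note on the compaction.ts hunks above: the change separates two cases that 0.5.1 handled the same way. An empty summarizer response still compacts through the deterministic fallback, while a provider-auth failure now returns `null` so nothing is written to the summary DAG. The sketch below is a simplified, synchronous rendering of that control flow, not the package's code. `ProviderAuthError`, the four-characters-per-token estimate, the half-length truncation fallback, and the final "aggressive output is still too large" branch are all assumptions, since the diff does not show those parts.

```typescript
// Hypothetical stand-in for the package's LcmProviderAuthError.
class ProviderAuthError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ProviderAuthError";
  }
}

type Summarize = (text: string, aggressive: boolean) => string;
type Escalated = { text: string; level: "normal" | "aggressive" | "fallback" };

// Assumption: a crude estimate of about four characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Sentinel that cannot collide with any string a summarizer returns.
const AUTH_FAILURE = Symbol("authFailure");

function summarizeWithEscalation(source: string, summarize: Summarize): Escalated | null {
  const inputTokens = estimateTokens(source);
  // Assumption: the deterministic fallback is modeled as plain truncation.
  const fallback = (): Escalated => ({
    text: source.slice(0, Math.max(1, Math.floor(source.length / 2))),
    level: "fallback",
  });
  const run = (aggressive: boolean): string | null | typeof AUTH_FAILURE => {
    try {
      return summarize(source, aggressive).trim() || null;
    } catch (err) {
      if (err instanceof ProviderAuthError) {
        return AUTH_FAILURE;
      }
      throw err; // any other error still propagates to the caller
    }
  };

  const first = run(false);
  if (first === AUTH_FAILURE) return null; // skip: write no summary at all
  if (first === null) return fallback(); // empty output still compacts
  if (estimateTokens(first) < inputTokens) return { text: first, level: "normal" };

  const second = run(true);
  if (second === AUTH_FAILURE) return null;
  if (second === null) return fallback();
  // Assumption: fall back when even the aggressive pass did not shrink the input.
  return estimateTokens(second) < inputTokens
    ? { text: second, level: "aggressive" }
    : fallback();
}
```

A caller would treat a `null` result as "leave the chunk uncompacted and try again later", which matches the reworded `skipped summary write` warnings in the later hunks.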

package/src/engine.ts CHANGED

@@ -45,7 +45,7 @@ import {
   type MessagePartType,
 } from "./store/conversation-store.js";
 import { SummaryStore } from "./store/summary-store.js";
-import { createLcmSummarizeFromLegacyParams } from "./summarize.js";
+import { createLcmSummarizeFromLegacyParams, LcmProviderAuthError } from "./summarize.js";
 import type { LcmDependencies } from "./types.js";
 type AgentMessage = Parameters<ContextEngine["ingest"]>[0]["message"];
@@ -1277,7 +1277,15 @@ export class LcmContextEngine implements ContextEngine {
       }
       this.largeFileTextSummarizer = async (prompt: string): Promise<string | null> => {
-        const summary = await result.fn(prompt, false);
+        let summary: string;
+        try {
+          summary = await result.fn(prompt, false);
+        } catch (err) {
+          if (err instanceof LcmProviderAuthError) {
+            return null;
+          }
+          throw err;
+        }
         if (typeof summary !== "string") {
           return null;
         }

package/src/plugin/index.ts CHANGED

@@ -1294,6 +1294,8 @@ function createLcmDependencies(api: OpenClawPluginApi): LcmDependencies {
           return sub.run({
             sessionKey: String(params.params?.sessionKey ?? ""),
             message: String(params.params?.message ?? ""),
+            provider: params.params?.provider as string | undefined,
+            model: params.params?.model as string | undefined,
             extraSystemPrompt: params.params?.extraSystemPrompt as string | undefined,
             lane: params.params?.lane as string | undefined,
             deliver: (params.params?.deliver as boolean) ?? false,

package/src/summarize.ts CHANGED

@@ -42,6 +42,28 @@ type ProviderAuthFailure = {
   missingModelRequestScope: boolean;
 };
+/**
+ * Signals that the summarizer hit a provider-auth failure and callers should
+ * avoid treating the result like an empty summary.
+ */
+export class LcmProviderAuthError extends Error {
+  readonly provider: string;
+  readonly model: string;
+  readonly failure: ProviderAuthFailure;
+  constructor(params: {
+    provider: string;
+    model: string;
+    failure: ProviderAuthFailure;
+  }) {
+    super(buildProviderAuthWarning(params));
+    this.name = "LcmProviderAuthError";
+    this.provider = params.provider;
+    this.model = params.model;
+    this.failure = params.failure;
+  }
+}
 /**
  * Default timeout for a single summarizer LLM call.  Long enough for large
  * context windows on slower providers, short enough to prevent the gateway
@@ -976,13 +998,13 @@ export async function createLcmSummarizeFromLegacyParams(params: {
           },
         ],
         maxTokens: targetTokens,
-        temperature: aggressive ? 0.1 : 0.2,
       }), SUMMARIZER_TIMEOUT_MS, "initial");
     } catch (err) {
       const authFailure = extractProviderAuthFailure(err);
       if (authFailure) {
-        console.warn(buildProviderAuthWarning({ provider, model, failure: authFailure }));
-        return "";
+        const authError = new LcmProviderAuthError({ provider, model, failure: authFailure });
+        console.warn(authError.message);
+        throw authError;
       }
       const errMsg = err instanceof Error ? err.message : String(err);
       const isTimeout = errMsg.includes("summarizer timeout");
@@ -1000,8 +1022,9 @@ export async function createLcmSummarizeFromLegacyParams(params: {
     const authFailure = extractProviderAuthFailure(result);
     if (authFailure) {
-      console.warn(buildProviderAuthWarning({ provider, model, failure: authFailure }));
-      return "";
+      const authError = new LcmProviderAuthError({ provider, model, failure: authFailure });
+      console.warn(authError.message);
+      throw authError;
     }
     const normalized = normalizeCompletionSummary(result.content);
@@ -1059,7 +1082,6 @@ export async function createLcmSummarizeFromLegacyParams(params: {
             },
           ],
           maxTokens: targetTokens,
-          temperature: 0.05,
           reasoning: "low",
         }), SUMMARIZER_TIMEOUT_MS, "retry");
         const retryAuthFailure = extractProviderAuthFailure(retryResult);

package/src/tools/lcm-expand-query-tool.ts CHANGED

@@ -80,6 +80,76 @@ type SummaryCandidate = {
   conversationId: number;
 };
+function collectExpansionFailureText(value: unknown, parts: string[], depth = 0): void {
+  if (depth > 3 || value == null) {
+    return;
+  }
+  if (typeof value === "string") {
+    const trimmed = value.trim();
+    if (trimmed) {
+      parts.push(trimmed);
+    }
+    return;
+  }
+  if (typeof value === "number" || typeof value === "boolean") {
+    parts.push(String(value));
+    return;
+  }
+  if (value instanceof Error) {
+    if (value.message.trim()) {
+      parts.push(value.message.trim());
+    }
+    collectExpansionFailureText(value.cause, parts, depth + 1);
+    return;
+  }
+  if (Array.isArray(value)) {
+    for (const entry of value) {
+      collectExpansionFailureText(entry, parts, depth + 1);
+    }
+    return;
+  }
+  if (typeof value === "object") {
+    const record = value as Record<string, unknown>;
+    for (const key of ["message", "error", "reason", "details", "response", "cause", "code"]) {
+      collectExpansionFailureText(record[key], parts, depth + 1);
+    }
+  }
+}
+function formatExpansionFailure(error: unknown): string {
+  const parts: string[] = [];
+  collectExpansionFailureText(error, parts);
+  const message = parts.join(" ").replace(/\s+/g, " ").trim();
+  if (message) {
+    return message;
+  }
+  if (typeof error === "string" && error.trim()) {
+    return error.trim();
+  }
+  return "Delegated expansion query failed.";
+}
+function shouldRetryWithoutOverride(message: string): boolean {
+  const normalized = message.toLowerCase();
+  return [
+    "model.request",
+    "missing scopes",
+    "insufficient scope",
+    "unauthorized",
+    "not authorized",
+    "forbidden",
+    "provider/model overrides are not authorized",
+    "model override is not authorized",
+    "unknown model",
+    "model not found",
+    "invalid model",
+    "not available",
+    "not supported",
+    "401",
+    "403",
+  ].some((signal) => normalized.includes(signal));
+}
 /**
  * Build the sub-agent task message for delegated expansion and prompt answering.
  */
@@ -401,9 +471,6 @@ export function createLcmExpandQueryTool(input: {
         });
       }
-      let childSessionKey = "";
-      let grantCreated = false;
       try {
         const candidates = await resolveSummaryCandidates({
           lcm: input.lcm,
@@ -448,26 +515,9 @@ export function createLcmExpandQueryTool(input: {
         const requesterAgentId = input.deps.normalizeAgentId(
           input.deps.parseAgentSessionKey(callerSessionKey)?.agentId,
         );
-        childSessionKey = `agent:${requesterAgentId}:subagent:${crypto.randomUUID()}`;
         const childExpansionDepth = resolveNextExpansionDepth(callerSessionKey);
         const originSessionKey = recursionCheck.originSessionKey || callerSessionKey || "main";
-        createDelegatedExpansionGrant({
-          delegatedSessionKey: childSessionKey,
-          issuerSessionId: callerSessionKey || "main",
-          allowedConversationIds: [sourceConversationId],
-          tokenCap: expansionTokenCap,
-          ttlMs: DELEGATED_WAIT_TIMEOUT_MS + 30_000,
-        });
-        stampDelegatedExpansionContext({
-          sessionKey: childSessionKey,
-          requestId,
-          expansionDepth: childExpansionDepth,
-          originSessionKey,
-          stampedBy: "lcm_expand_query",
-        });
-        grantCreated = true;
         const task = buildDelegatedExpandQueryTask({
           summaryIds,
           conversationId: sourceConversationId,
@@ -480,118 +530,160 @@ export function createLcmExpandQueryTool(input: {
           originSessionKey,
         });
-        const childIdem = crypto.randomUUID();
         const expansionProvider = input.deps.config.expansionProvider || undefined;
         const expansionModel = input.deps.config.expansionModel || undefined;
-        const response = (await input.deps.callGateway({
-          method: "agent",
-          params: {
-            message: task,
-            sessionKey: childSessionKey,
-            deliver: false,
-            lane: input.deps.agentLaneSubagent,
-            idempotencyKey: childIdem,
-            ...(expansionProvider ? { provider: expansionProvider } : {}),
-            ...(expansionModel ? { model: expansionModel } : {}),
-            extraSystemPrompt: input.deps.buildSubagentSystemPrompt({
-              depth: 1,
-              maxDepth: 8,
-              taskSummary: "Run lcm_expand and return prompt-focused JSON answer",
-            }),
-          },
-          timeoutMs: GATEWAY_TIMEOUT_MS,
-        })) as { runId?: string };
-        const runId = typeof response?.runId === "string" ? response.runId.trim() : "";
-        if (!runId) {
-          return jsonResult({
-            error: "Delegated expansion did not return a runId.",
-          });
-        }
+        const configuredOverrideLabel =
+          expansionProvider && expansionModel
+            ? `${expansionProvider}/${expansionModel}`
+            : expansionModel || expansionProvider || "configured override";
-        const wait = (await input.deps.callGateway({
-          method: "agent.wait",
-          params: {
-            runId,
-            timeoutMs: DELEGATED_WAIT_TIMEOUT_MS,
-          },
-          timeoutMs: DELEGATED_WAIT_TIMEOUT_MS,
-        })) as { status?: string; error?: string };
-        const status = typeof wait?.status === "string" ? wait.status : "error";
-        if (status === "timeout") {
-          recordExpansionDelegationTelemetry({
-            deps: input.deps,
-            component: "lcm_expand_query",
-            event: "timeout",
-            requestId,
-            sessionKey: callerSessionKey,
-            expansionDepth: childExpansionDepth,
-            originSessionKey,
-            runId,
-          });
-          return jsonResult({
-            error: "lcm_expand_query timed out waiting for delegated expansion (120s).",
-          });
-        }
-        if (status !== "ok") {
-          return jsonResult({
-            error:
-              typeof wait?.error === "string" && wait.error.trim()
-                ? wait.error
-                : "Delegated expansion query failed.",
-          });
-        }
-        const replyPayload = (await input.deps.callGateway({
-          method: "sessions.get",
-          params: { key: childSessionKey, limit: 80 },
-          timeoutMs: GATEWAY_TIMEOUT_MS,
-        })) as { messages?: unknown[] };
-        const reply = input.deps.readLatestAssistantReply(
-          Array.isArray(replyPayload.messages) ? replyPayload.messages : [],
-        );
-        const parsed = parseDelegatedExpandQueryReply(reply, summaryIds.length);
-        recordExpansionDelegationTelemetry({
-          deps: input.deps,
-          component: "lcm_expand_query",
-          event: "success",
-          requestId,
-          sessionKey: callerSessionKey,
-          expansionDepth: childExpansionDepth,
-          originSessionKey,
-          runId,
-        });
+        const runDelegatedQuery = async (provider?: string, model?: string) => {
+          const childSessionKey = `agent:${requesterAgentId}:subagent:${crypto.randomUUID()}`;
+          const childIdem = crypto.randomUUID();
+          let grantCreated = false;
-        return jsonResult({
-          answer: parsed.answer,
-          citedIds: parsed.citedIds,
-          sourceConversationId,
-          expandedSummaryCount: parsed.expandedSummaryCount,
-          totalSourceTokens: parsed.totalSourceTokens,
-          truncated: parsed.truncated,
-        });
-      } catch (error) {
-        return jsonResult({
-          error: error instanceof Error ? error.message : String(error),
-        });
-      } finally {
-        if (childSessionKey) {
           try {
-            await input.deps.callGateway({
-              method: "sessions.delete",
-              params: { key: childSessionKey, deleteTranscript: true },
+            createDelegatedExpansionGrant({
+              delegatedSessionKey: childSessionKey,
+              issuerSessionId: callerSessionKey || "main",
+              allowedConversationIds: [sourceConversationId],
+              tokenCap: expansionTokenCap,
+              ttlMs: DELEGATED_WAIT_TIMEOUT_MS + 30_000,
+            });
+            stampDelegatedExpansionContext({
+              sessionKey: childSessionKey,
+              requestId,
+              expansionDepth: childExpansionDepth,
+              originSessionKey,
+              stampedBy: "lcm_expand_query",
+            });
+            grantCreated = true;
+            const response = (await input.deps.callGateway({
+              method: "agent",
+              params: {
+                message: task,
+                sessionKey: childSessionKey,
+                deliver: false,
+                lane: input.deps.agentLaneSubagent,
+                idempotencyKey: childIdem,
+                ...(provider ? { provider } : {}),
+                ...(model ? { model } : {}),
+                extraSystemPrompt: input.deps.buildSubagentSystemPrompt({
+                  depth: 1,
+                  maxDepth: 8,
+                  taskSummary: "Run lcm_expand and return prompt-focused JSON answer",
+                }),
+              },
+              timeoutMs: GATEWAY_TIMEOUT_MS,
+            })) as { runId?: unknown; error?: unknown };
+            const runId = typeof response?.runId === "string" ? response.runId.trim() : "";
+            if (!runId) {
+              throw new Error(
+                formatExpansionFailure(response?.error ?? response)
+                  || "Delegated expansion did not return a runId.",
+              );
+            }
+            const wait = (await input.deps.callGateway({
+              method: "agent.wait",
+              params: {
+                runId,
+                timeoutMs: DELEGATED_WAIT_TIMEOUT_MS,
+              },
+              timeoutMs: DELEGATED_WAIT_TIMEOUT_MS,
+            })) as { status?: string; error?: unknown };
+            const status = typeof wait?.status === "string" ? wait.status : "error";
+            if (status === "timeout") {
+              recordExpansionDelegationTelemetry({
+                deps: input.deps,
+                component: "lcm_expand_query",
+                event: "timeout",
+                requestId,
+                sessionKey: callerSessionKey,
+                expansionDepth: childExpansionDepth,
+                originSessionKey,
+                runId,
+              });
+              throw new Error(
+                "lcm_expand_query timed out waiting for delegated expansion (120s).",
+              );
+            }
+            if (status !== "ok") {
+              throw new Error(formatExpansionFailure(wait?.error));
+            }
+            const replyPayload = (await input.deps.callGateway({
+              method: "sessions.get",
+              params: { key: childSessionKey, limit: 80 },
               timeoutMs: GATEWAY_TIMEOUT_MS,
+            })) as { messages?: unknown[] };
+            const reply = input.deps.readLatestAssistantReply(
+              Array.isArray(replyPayload.messages) ? replyPayload.messages : [],
+            );
+            const parsed = parseDelegatedExpandQueryReply(reply, summaryIds.length);
+            recordExpansionDelegationTelemetry({
+              deps: input.deps,
+              component: "lcm_expand_query",
+              event: "success",
+              requestId,
+              sessionKey: callerSessionKey,
+              expansionDepth: childExpansionDepth,
+              originSessionKey,
+              runId,
+            });
+            return jsonResult({
+              answer: parsed.answer,
+              citedIds: parsed.citedIds,
+              sourceConversationId,
+              expandedSummaryCount: parsed.expandedSummaryCount,
+              totalSourceTokens: parsed.totalSourceTokens,
+              truncated: parsed.truncated,
             });
-          } catch {
-            // Cleanup is best-effort.
+          } finally {
+            try {
+              await input.deps.callGateway({
+                method: "sessions.delete",
+                params: { key: childSessionKey, deleteTranscript: true },
+                timeoutMs: GATEWAY_TIMEOUT_MS,
+              });
+            } catch {
+              // Cleanup is best-effort.
+            }
+            if (grantCreated) {
+              revokeDelegatedExpansionGrantForSession(childSessionKey, { removeBinding: true });
+            }
+            clearDelegatedExpansionContext(childSessionKey);
           }
+        };
+        if (!expansionProvider && !expansionModel) {
+          return await runDelegatedQuery();
         }
-        if (grantCreated && childSessionKey) {
-          revokeDelegatedExpansionGrantForSession(childSessionKey, { removeBinding: true });
-        }
-        if (childSessionKey) {
-          clearDelegatedExpansionContext(childSessionKey);
+        try {
+          return await runDelegatedQuery(expansionProvider, expansionModel);
+        } catch (error) {
+          const failure = formatExpansionFailure(error);
+          input.deps.log.warn(
+            `[lcm] delegated expansion override failed (${configuredOverrideLabel}): ${failure}`,
+          );
+          if (!shouldRetryWithoutOverride(failure)) {
+            throw new Error(failure);
+          }
+          input.deps.log.warn(
+            `[lcm] retrying delegated expansion without provider/model override after: ${failure}`,
+          );
+          return await runDelegatedQuery();
         }
+      } catch (error) {
+        const failure = formatExpansionFailure(error);
+        input.deps.log.error(`[lcm] delegated expansion query failed: ${failure}`);
+        return jsonResult({
+          error: failure,
+        });
       }
     },
   };
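
Note on the lcm-expand-query-tool.ts hunks above: the refactor wraps the whole delegated run in `runDelegatedQuery`, so it can be attempted once with the configured provider/model override and, if the failure text looks like an authorization or unknown-model problem, once more without it. The sketch below isolates that retry shape. `runWithOverrideFallback` and the `Override` type are hypothetical names, the signal list is only a subset of the one in the diff, and the gateway calls, grants, and cleanup are left out.

```typescript
type Override = { provider?: string; model?: string };

// Subset of the substrings checked in the diff, for illustration only.
const RETRY_SIGNALS = [
  "unauthorized",
  "forbidden",
  "unknown model",
  "model not found",
  "401",
  "403",
];

function shouldRetryWithoutOverride(message: string): boolean {
  const normalized = message.toLowerCase();
  return RETRY_SIGNALS.some((signal) => normalized.includes(signal));
}

// Hypothetical wrapper: `run` stands in for the delegated sub-agent call.
async function runWithOverrideFallback<T>(
  run: (override?: Override) => Promise<T>,
  override: Override,
): Promise<T> {
  if (!override.provider && !override.model) {
    return run(); // nothing configured, so there is a single attempt
  }
  try {
    return await run(override);
  } catch (err) {
    const failure = err instanceof Error ? err.message : String(err);
    if (!shouldRetryWithoutOverride(failure)) {
      throw new Error(failure); // unrelated failure: do not mask it
    }
    return run(); // second attempt on the session's default model
  }
}
```

One trade-off of substring matching is that bare signals such as `401` or `not available` can also appear in unrelated error text, in which case the retry runs on the default model even though the override was not the cause.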