@ouro.bot/cli 0.1.0-alpha.490 → 0.1.0-alpha.492
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/changelog.json +16 -0
- package/dist/heart/core.js +123 -7
- package/dist/heart/daemon/cli-exec.js +10 -0
- package/dist/heart/session-events.js +107 -3
- package/dist/repertoire/tools-trip.js +76 -0
- package/dist/senses/shared-turn.js +43 -0
- package/package.json +1 -1
package/changelog.json
CHANGED
@@ -1,6 +1,22 @@
 {
   "_note": "This changelog is maintained as part of the PR/version-bump workflow. Agent-curated, not auto-generated. Agents read this file directly via read_file to understand what changed between versions.",
   "versions": [
+    {
+      "version": "0.1.0-alpha.492",
+      "changes": [
+        "Engine-level fix for the actual root cause of Slugger's empty-reply MCP bug (PR #611's strip+retry was the right shape, but missed the deepest layer). MiniMax-M2.7 occasionally emits an assistant message with BOTH inline `<think>...</think>` reasoning AND tool_calls. When that combination is replayed in a subsequent turn, MiniMax rejects with error 2013 ('tool result's tool id not found') and stalls the entire session — every subsequent turn fails the same way, the failover layer fires repeatedly suggesting a provider switch, and the agent's own answer never reaches the operator. Slugger's session was stuck for 11 unanswered user messages because of this exact loop.",
+        "The fix has two halves and an AX rule: (1) **Persist-time strip** — runAgent now strips `<think>` blocks from the assistant message's persisted `content` before saving, while preserving the original reasoning trace on `_inline_reasoning` for audit. New `engine.inline_reasoning_stripped` info-level nerve event fires when this happens. (2) **Load-time repair** — `sanitizeProviderMessages` self-heals existing sessions that were saved before (1) by stripping the same blocks at load time. (3) **AX rule: full agent awareness, no silent fixes**. When the load-time repair runs, the synthetic tool-result that fills in for the missing tool result is an **explanatory** one — it tells the agent specifically: \"your previous tool call's result was lost because the assistant message had inline reasoning blocks the provider rejected; the harness has stripped them; your reasoning trace is preserved out-of-band; if the work needs to be done, retry the tool call now.\" Tool calls whose parent didn't have stripped reasoning still get a generic-but-improved \"this tool call's result was lost — possible causes [...]; retry if needed\" message instead of the old vague \"interrupted (previous turn timed out)\" line. The agent always sees what happened and what to do next.",
+        "Also: the no-tool-call retry path (added in #611's last commit) now uses the same shared `stripThinkBlocksForViolationCheck` helper so the violation-detection logic is consistent. Three regression tests cover the full path: persist-time strip preserves `_inline_reasoning`, load-time repair produces the explanatory tool-result message, generic orphans get the generic message, and unclosed `<think>` tags drop everything from the open tag onward."
+      ]
+    },
+    {
+      "version": "0.1.0-alpha.491",
+      "changes": [
+        "Two live-runtime bugs Slugger surfaced during MCP roundtrip: (1) MCP `send_message` returned blank or raw `<think>` content instead of an actual reply when minimax-style models emitted only reasoning. The shared-turn runner now strips closed AND unclosed `<think>...</think>` blocks before returning, and when nothing remains it returns a clear diagnostic (`agent produced reasoning but no final answer this turn — try again`) plus emits a `senses.shared_turn_only_reasoning` warn-level nerve event so operators can see how often it's happening. (2) The `rest` tool's fresh-pending-work gate fired on every rest call within a turn because `hasFreshPendingWork(options)` reads from the turn-start snapshot of `pendingMessages` and never updates — once pending was non-empty, the agent could be told 'fresh work arrived' indefinitely even after surfacing or processing the items. The gate is now once-per-turn: the first rest call hits it, gets the message, the agent does whatever it needs, the next rest call passes. Emits `engine.fresh_work_gate_fired` info-level event the one time it fires. Both bugs reproduced as regression tests that fail without the fix.",
+        "New `trip_update_leg` tool to round out the trip ledger. The original Step 4 followup on substrate#35 listed `trip_ensure / trip_get / trip_update_leg / trip_attach_evidence` — the previous PR (#609) shipped `trip_upsert` instead of `trip_update_leg`, which forced the agent to re-emit the entire trip record to change one leg field. `trip_update_leg` updates specific fields of an existing leg in place: pass `tripId`, `legId`, a JSON object of field updates, and `updatedAt`. Identity-changing updates (`legId`, `kind`) and empty updates objects are rejected with operational error messages. Existing evidence is preserved unless the agent explicitly overwrites it. Emits a `trips.leg_updated` nerve event with the field list. Tool registry now at 74 tools (up from 73).",
+        "New `docs/trip-ledger.md` covering what the ledger actually is, why Slugger said it needed to exist (gap between mail body and travel doc — no authoritative source for cross-checks), the discriminated TripLeg union (lodging / flight / train / ground-transport / rental-car / ferry / event), the non-optional TripEvidence shape with `discoveryMethod`, the trust shape (per-agent keys, private key returned exactly once, hosted side never sees plaintext), the seven harness tools, on-disk layout for both harness and hosted sides, and an explicit answer to 'is this travel-specific or generalizable infra?' (current shape is travel-specific by design; the *pattern* is general and would be lifted to a shared abstraction the next time we build a per-agent encrypted record service)."
+      ]
+    },
     {
       "version": "0.1.0-alpha.490",
       "changes": [
package/dist/heart/core.js
CHANGED
@@ -180,6 +180,29 @@ function getProviderDisplayLabel(facing = "human") {
     };
     return providerLabelBuilders[provider]();
 }
+/**
+ * Strip <think>...</think> blocks for the violation-detection check at the
+ * end of a streaming turn. Used to tell legitimate text-only responses
+ * apart from the MiniMax-M2.7 "only thinking, no tool call" violation
+ * shape. Mirrors the more thorough stripThinkBlocks helper in
+ * senses/shared-turn.ts (which is for operator-facing output) — kept
+ * inline here to avoid pulling senses/ into the core module's import graph.
+ */
+function stripThinkBlocksForViolationCheck(input) {
+    let out = input;
+    for (;;) {
+        const open = out.indexOf("<think>");
+        if (open === -1)
+            break;
+        const close = out.indexOf("</think>", open + "<think>".length);
+        if (close === -1) {
+            out = out.slice(0, open);
+            break;
+        }
+        out = out.slice(0, open) + out.slice(close + "</think>".length);
+    }
+    return out.trim();
+}
 function hasFreshPendingWork(options) {
     const pendingMessages = options?.pendingMessages;
     if (!Array.isArray(pendingMessages))
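The helper added in this hunk can be tried in isolation. The sketch below copies its loop into a standalone function (`stripThink` is a hypothetical name used only here) and runs it on the shapes the changelog mentions: a closed block, several blocks in sequence, an unclosed tag, and a reasoning-only response.

```javascript
// Standalone copy of the loop from the hunk above. `stripThink` is a
// hypothetical name for illustration, not an export of the package.
function stripThink(input) {
    let out = input;
    for (;;) {
        const open = out.indexOf("<think>");
        if (open === -1)
            break;
        const close = out.indexOf("</think>", open + "<think>".length);
        if (close === -1) {
            // Unclosed tag: drop everything from <think> onward.
            out = out.slice(0, open);
            break;
        }
        // Closed block: cut out the leftmost <think>...</think> pair.
        out = out.slice(0, open) + out.slice(close + "</think>".length);
    }
    return out.trim();
}

console.log(JSON.stringify(stripThink("<think>plan</think>hello")));                // "hello"
console.log(JSON.stringify(stripThink("<think>a</think>one <think>b</think>two"))); // "one two"
console.log(JSON.stringify(stripThink("answer <think>never closed")));              // "answer"
console.log(JSON.stringify(stripThink("<think>nothing else</think>")));             // ""
```

An empty result from non-empty input is the "only thinking" shape that the later hunks in this file test for.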
@@ -380,10 +403,13 @@ function repairOrphanedToolCalls(messages) {
         }
         const missing = asst.tool_calls.filter((tc) => !resultIds.has(tc.id));
         if (missing.length > 0) {
+            // AX rule: the agent must see what happened. Don't say "interrupted"
+            // — that's vague. Tell them the result was lost, possible causes,
+            // and what to do next.
             const syntheticResults = missing.map((tc) => ({
                 role: "tool",
                 tool_call_id: tc.id,
-                content: "error: tool call was
+                content: "error: this tool call's result was lost — the previous turn ended before the tool finished (provider rejection, daemon interrupt, or the tool itself errored). if the work needs to be done, retry the tool call now.",
             }));
             let insertAt = i + 1;
             while (insertAt < messages.length && messages[insertAt].role === "tool")
@@ -545,6 +571,21 @@ async function runAgent(messages, callbacks, channel, signal, options) {
     let sawQuerySession = false;
     let sawBridgeManage = false;
     let sawExternalStateQuery = false;
+    // Once-per-turn flag for the fresh-work rest gate. Without this, an agent
+    // that called rest, was told "fresh work arrived", processed the items,
+    // and called rest again would get the same message forever — the gate
+    // condition is read from the turn-start snapshot of pendingMessages,
+    // which doesn't update mid-turn. The agent only needs to be told once;
+    // after that, repeated rest attempts mean they've acknowledged.
+    let freshWorkGateFired = false;
+    // Counter for "no tool call returned despite tool_choice=required" violations.
+    // MiniMax reasoning models occasionally emit only a <think>...</think>
+    // block and stop, without any tool call — even when tool_choice is set to
+    // "required". This is a provider-level violation; the harness retries with
+    // a corrective nudge up to a small cap rather than silently accepting an
+    // empty turn.
+    let noToolCallRetries = 0;
+    const NO_TOOL_CALL_MAX_RETRIES = 2;
     const toolLoopState = (0, tool_loop_1.createToolLoopState)();
     const toolFrictionLedger = (0, tool_friction_1.createToolFrictionLedger)();
     const finishTerminalProviderError = (error, classification) => {
@@ -726,8 +767,36 @@ async function runAgent(messages, callbacks, channel, signal, options) {
         const msg = {
             role: "assistant",
         };
-
-
+        // Persist assistant content WITHOUT inline <think>...</think> blocks.
+        // Reasoning content already routed through onReasoningChunk for live
+        // surfacing and persisted separately as `_reasoning_items` for
+        // providers that support a reasoning channel; saving it inline AND
+        // alongside tool_calls causes MiniMax to reject the replayed turn
+        // with "tool result's tool id not found" (error code 2013) because
+        // it can't reconcile reasoning-with-tools in the same assistant
+        // message. Strip aggressively at persist so the next replay is
+        // clean; preserve the original reasoning trace on the message via
+        // `_inline_reasoning` so debug/audit paths can still see it.
+        if (result.content) {
+            const stripped = stripThinkBlocksForViolationCheck(result.content);
+            if (stripped.length > 0)
+                msg.content = stripped;
+            if (stripped.length !== result.content.length) {
+                msg._inline_reasoning = result.content;
+                (0, runtime_1.emitNervesEvent)({
+                    level: "info",
+                    component: "engine",
+                    event: "engine.inline_reasoning_stripped",
+                    message: "stripped inline <think> blocks from persisted assistant message; preserved on _inline_reasoning",
+                    meta: {
+                        provider: providerRuntime.id,
+                        model: providerRuntime.model,
+                        originalLength: result.content.length,
+                        strippedLength: stripped.length,
+                    },
+                });
+            }
+        }
         if (result.toolCalls.length)
             msg.tool_calls = result.toolCalls.map((tc) => ({
                 id: tc.id,
@@ -751,14 +820,53 @@ async function runAgent(messages, callbacks, channel, signal, options) {
         if (hasPhaseAnnotation) {
             msg.phase = isSoleSettle ? "settle" : "commentary";
         }
+        // Detect the MiniMax "only-thinking, no tool call" violation: no tool
+        // calls returned, and the content is empty after stripping
+        // <think>...</think> blocks. This is a narrow check — legitimate
+        // content-only responses (text without think tags, or text outside
+        // think tags) still flow through the original "no tool calls →
+        // accept as-is" path so existing channels and tests are unaffected.
+        const onlyThinkContent = !result.toolCalls.length
+            && typeof result.content === "string"
+            && stripThinkBlocksForViolationCheck(result.content).length === 0
+            && result.content.length > 0;
         if (!result.toolCalls.length) {
-
-
-
+            if (onlyThinkContent && toolChoiceRequired && noToolCallRetries < NO_TOOL_CALL_MAX_RETRIES) {
+                // Provider-level violation: tool_choice was required, model emitted
+                // only a <think>...</think> block (or empty content) with no tool
+                // call. Retry with a corrective nudge up to NO_TOOL_CALL_MAX_RETRIES
+                // times. After cap, accept as-is (the readback path strips think
+                // tags and surfaces a clear diagnostic).
+                noToolCallRetries++;
+                (0, runtime_1.emitNervesEvent)({
+                    level: "warn",
+                    component: "engine",
+                    event: "engine.no_tool_call_retry",
+                    message: "model returned only <think> content with no tool call despite tool_choice=required; retrying with corrective nudge",
+                    meta: {
+                        attempt: noToolCallRetries,
+                        cap: NO_TOOL_CALL_MAX_RETRIES,
+                        provider: providerRuntime.id,
+                        model: providerRuntime.model,
+                        contentLength: result.content.length,
+                    },
+                });
+                messages.push(msg);
+                messages.push({
+                    role: "user",
+                    content: isInnerDialog
+                        ? "no tool was called this turn. you must end every turn by calling rest (or surface, ponder, observe). emit the tool call now."
+                        : "no tool was called this turn. you must end every turn by calling settle with your answer (or ponder/observe). emit the tool call now.",
+                });
+                continue;
+            }
+            // Legitimate text-only response, or cap reached — accept as-is.
             messages.push(msg);
             done = true;
         }
         else {
+            // Reset the retry counter on any successful tool call.
+            noToolCallRetries = 0;
             // Check for settle sole call: intercept before tool execution
             if (isSoleSettle) {
                 /* v8 ignore next -- defensive: JSON.parse catch for malformed settle args @preserve */
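The capped retry in this hunk can be modeled outside the engine. This is a minimal sketch under stated assumptions: `runTurn`, `isOnlyReasoning`, and `callModel` are hypothetical names, the regex check is a simplified stand-in for `stripThinkBlocksForViolationCheck`, and the nudge text is shortened. Only the cap of 2 and the three-part condition (think-only content, `tool_choice` required, retries below the cap) come from the diff.

```javascript
const MAX_RETRIES = 2; // mirrors NO_TOOL_CALL_MAX_RETRIES in the hunk

// Simplified stand-in for the violation check: true when the content is
// non-empty but nothing remains once <think> reasoning is removed.
function isOnlyReasoning(content) {
    const rest = content
        .replace(/<think>[\s\S]*?<\/think>/g, "") // closed blocks
        .replace(/<think>[\s\S]*$/, "")           // unclosed trailing block
        .trim();
    return content.length > 0 && rest.length === 0;
}

// `callModel` is a hypothetical stand-in for the provider call. It returns
// { content, toolCalls }, like the `result` object in the diff.
function runTurn(callModel, toolChoiceRequired) {
    const messages = [];
    let retries = 0;
    for (;;) {
        const result = callModel(messages);
        const violation = result.toolCalls.length === 0 && isOnlyReasoning(result.content);
        if (violation && toolChoiceRequired && retries < MAX_RETRIES) {
            retries++;
            // The persisted assistant message carries no inline reasoning.
            messages.push({ role: "assistant", content: null });
            messages.push({ role: "user", content: "no tool was called this turn. emit the tool call now." });
            continue; // ask the model again with the nudge appended
        }
        // Legitimate response, or cap reached: accept as-is.
        return { result, retries, messages };
    }
}
```

A model that never recovers is called three times in total (one original attempt plus two retries) before the think-only result is accepted, which is where the readback diagnostic in `shared-turn.js` takes over.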
@@ -896,12 +1004,20 @@ async function runAgent(messages, callbacks, channel, signal, options) {
                     providerRuntime.appendToolOutput(result.toolCalls[0].id, gateMessage);
                     continue;
                 }
-                if (hasFreshPendingWork(options)) {
+                if (hasFreshPendingWork(options) && !freshWorkGateFired) {
+                    freshWorkGateFired = true;
                     callbacks.onToolEnd("rest", (0, tools_1.summarizeArgs)("rest", restArgs), false);
                     messages.push(msg);
                     const gateMessage = "fresh work arrived for me this turn — inspect the pending messages above and take the next concrete action before you rest.";
                     messages.push({ role: "tool", tool_call_id: result.toolCalls[0].id, content: gateMessage });
                     providerRuntime.appendToolOutput(result.toolCalls[0].id, gateMessage);
+                    (0, runtime_1.emitNervesEvent)({
+                        level: "info",
+                        component: "engine",
+                        event: "engine.fresh_work_gate_fired",
+                        message: "rest deferred once because pending work arrived this turn; agent has been notified",
+                        meta: { pendingCount: options.pendingMessages.length },
+                    });
                     continue;
                 }
                 callbacks.onToolEnd("rest", (0, tools_1.summarizeArgs)("rest", restArgs), true);
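The once-per-turn gate is a latch over a snapshot that never changes during the turn. A minimal sketch, assuming a hypothetical `createRestGate` helper (in the diff the latch is the local `freshWorkGateFired` inside `runAgent`) and assuming any non-empty snapshot counts as fresh work; the real `hasFreshPendingWork` is only partly visible in the diff and may apply further checks.

```javascript
// Hypothetical helper: wraps the turn-start snapshot and a one-shot latch.
function createRestGate(pendingAtTurnStart) {
    let fired = false; // plays the role of freshWorkGateFired
    return function tryRest() {
        // Assumption: a non-empty snapshot means fresh work arrived.
        const hasFreshWork = Array.isArray(pendingAtTurnStart) && pendingAtTurnStart.length > 0;
        if (hasFreshWork && !fired) {
            fired = true;
            return { rested: false, message: "fresh work arrived for me this turn" };
        }
        // Either nothing was pending, or the agent has already been told once.
        return { rested: true };
    };
}

const tryRest = createRestGate([{ from: "operator", text: "ping" }]);
console.log(tryRest().rested); // false: first rest call is deferred
console.log(tryRest().rested); // true: second rest call passes
```

Without the latch, the second call would see the same snapshot and be deferred again, which is the indefinite loop the 0.1.0-alpha.491 changelog entry describes.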
package/dist/heart/daemon/cli-exec.js
CHANGED
@@ -2938,6 +2938,14 @@ async function executeConnectBlueBubbles(agent, deps) {
     const port = parseOptionalPort(await promptInput("Local webhook port [18790]: "), 18790, "BlueBubbles webhook port");
     const webhookPath = normalizeWebhookPath(await promptInput("Local webhook path [/bluebubbles-webhook]: "), "/bluebubbles-webhook");
     const requestTimeoutMs = parseOptionalPositiveInteger(await promptInput("Request timeout ms [30000]: "), 30000, "BlueBubbles request timeout");
+    // Capture the operator's known iMessage handles so the BB ingest path can
+    // filter group-chat echoes whose `isFromMe` flag was lost or never set.
+    // Without this, the agent would ingest its own outbound message as inbound
+    // and reply to itself ("Slugger talking to himself" in groups).
+    const ownHandlesRaw = (await promptInput("Your iMessage handle(s) — phone(s) and/or email(s) BlueBubbles attributes to your sent messages (comma-separated; needed for the group self-talk filter; blank to skip): ")).trim();
+    const ownHandles = ownHandlesRaw
+        ? ownHandlesRaw.split(",").map((h) => h.trim()).filter((h) => h.length > 0)
+        : [];
     const machineId = currentMachineId(deps);
     const progress = createHumanCommandProgress(deps, "connect bluebubbles");
     let stored;
@@ -2956,6 +2964,7 @@ async function executeConnectBlueBubbles(agent, deps) {
                 serverUrl,
                 password,
                 accountId: "default",
+                ownHandles,
             },
             bluebubblesChannel: {
                 port,
@@ -2984,6 +2993,7 @@ async function executeConnectBlueBubbles(agent, deps) {
                 `Stored: ${stored.itemPath}`,
                 "agent.json: senses.bluebubbles.enabled = true",
                 `Runtime: ${daemonApply}`,
+                `ownHandles: ${ownHandles.length > 0 ? ownHandles.join(", ") : "(none — group self-talk filter inactive)"}`,
                 "secret was not printed",
                 ...(syncSummary ? [syncSummary] : []),
             ],
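The handle parsing in these hunks is a trim, split, trim, filter chain. The sketch below wraps the same expression in a hypothetical `parseOwnHandles` function to show how blank input and stray commas behave; the sample handles are made up.

```javascript
// Hypothetical wrapper around the expression from the hunk above.
function parseOwnHandles(raw) {
    const trimmed = raw.trim();
    return trimmed
        ? trimmed.split(",").map((h) => h.trim()).filter((h) => h.length > 0)
        : [];
}

console.log(parseOwnHandles(" +15551234567, me@example.com ,, "));
console.log(parseOwnHandles("   ")); // blank input yields an empty list
```

An empty list is the case the new summary line reports as "group self-talk filter inactive".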
package/dist/heart/session-events.js
CHANGED
@@ -334,7 +334,7 @@ function repairSessionMessages(messages) {
     });
     return result.map(toProviderMessage);
 }
-function repairToolCallSequences(messages) {
+function repairToolCallSequences(messages, inlineReasoningStrippedCallIds = new Set()) {
     const normalized = messages.map(normalizeMessage);
     const validCallIds = new Set();
     for (const msg of normalized) {
@@ -381,7 +381,7 @@ function repairToolCallSequences(messages) {
             continue;
         const syntheticResults = missing.map((toolCall) => ({
             role: "tool",
-            content:
+            content: buildSyntheticToolResultMessage(toolCall.id, inlineReasoningStrippedCallIds),
             name: null,
             toolCallId: toolCall.id,
             toolCalls: [],
@@ -460,6 +460,105 @@ function migrateToolNames(messages) {
     }
     return safeMessages.map(normalizeMessage).map(toProviderMessage);
 }
+/**
+ * Strip inline `<think>...</think>` blocks from a string. Mirrors the
+ * helper at senses/shared-turn.ts (operator-facing) and core.ts
+ * (live-turn) — kept inline here because session-events.ts is the load-
+ * time repair path and needs its own copy to avoid sense/heart import
+ * cycles. If the close tag is missing, drops everything from the open
+ * tag onward.
+ */
+function stripInlineThinkBlocks(input) {
+    let out = input;
+    for (;;) {
+        const open = out.indexOf("<think>");
+        if (open === -1)
+            break;
+        const close = out.indexOf("</think>", open + "<think>".length);
+        if (close === -1) {
+            out = out.slice(0, open);
+            break;
+        }
+        out = out.slice(0, open) + out.slice(close + "</think>".length);
+    }
+    return out.trim();
+}
+/**
+ * Strip inline `<think>` content from any assistant message that ALSO has
+ * tool_calls. MiniMax-style models persist think-content + tool_calls on
+ * the same assistant turn; replaying that combination triggers MiniMax
+ * error 2013 ("tool result's tool id not found") and stalls the session.
+ *
+ * AX requirement: the agent MUST see that this happened. We don't silently
+ * paper over their previous turn — we strip for replay correctness AND
+ * collect the affected tool_call_ids in `inlineReasoningStrippedCallIds`
+ * so the downstream synthetic-tool-result repair can produce an
+ * explanatory message addressed to those specific calls. The agent sees:
+ * "your previous tool call's result was lost because the assistant message
+ * had inline reasoning blocks the provider couldn't replay — here's what
+ * happened, retry if needed." Full awareness, no silent corrections.
+ *
+ * This load-time repair self-heals existing sessions that were saved
+ * before the persist-time strip in core.ts landed.
+ */
+function repairInlineReasoningOnReplay(messages, inlineReasoningStrippedCallIds) {
+    let repaired = 0;
+    const result = messages.map((msg) => {
+        if (msg.role !== "assistant")
+            return msg;
+        const a = msg;
+        if (!a.tool_calls || a.tool_calls.length === 0)
+            return msg;
+        if (typeof a.content !== "string")
+            return msg;
+        if (!a.content.includes("<think>"))
+            return msg;
+        const stripped = stripInlineThinkBlocks(a.content);
+        repaired++;
+        for (const tc of a.tool_calls)
+            inlineReasoningStrippedCallIds.add(tc.id);
+        return { ...a, content: stripped.length > 0 ? stripped : null };
+    });
+    if (repaired > 0) {
+        (0, runtime_1.emitNervesEvent)({
+            level: "info",
+            event: "mind.session_inline_reasoning_repair",
+            component: "mind",
+            message: "stripped inline <think> blocks from assistant messages with tool_calls so replay is valid; agent will see explanatory tool-result messages",
+            meta: { repaired, affectedCallIds: inlineReasoningStrippedCallIds.size },
+        });
+    }
+    return result;
+}
+/**
+ * Compose the synthetic tool-result message the agent sees when their
+ * previous turn's tool call has no matching tool result. The default
+ * message tells the agent what happened (turn ended early, result lost)
+ * and what to do (retry if the work isn't done). When the parent
+ * assistant message had inline `<think>` reasoning that the provider
+ * rejected, the message is more specific so the agent can adjust.
+ *
+ * AX rule: every repair must produce a message the agent can read and
+ * act on. Silent strips are never OK.
+ */
+function buildSyntheticToolResultMessage(toolCallId, inlineReasoningStrippedCallIds) {
+    if (inlineReasoningStrippedCallIds.has(toolCallId)) {
+        return [
+            "error: this tool call's result was lost.",
+            "your previous assistant turn included inline `<think>...</think>` reasoning alongside tool_calls,",
+            "and the provider (likely MiniMax) rejects that combination on replay (error 2013).",
+            "the harness has stripped the inline reasoning from the persisted content so the next replay is valid;",
+            "your reasoning trace itself is preserved out-of-band and not lost.",
+            "if the underlying work still needs to be done, retry the tool call now —",
+            "the call may not have run, or it ran but the result didn't reach you.",
+        ].join(" ");
+    }
+    return [
+        "error: this tool call's result was lost — the previous turn ended before the tool finished",
+        "(provider rejection, daemon interrupt, or the tool itself errored).",
+        "if the work needs to be done, retry the tool call now.",
+    ].join(" ");
+}
 function sanitizeProviderMessages(messages) {
     const safeMessages = messages.filter((message) => Boolean(message) && typeof message === "object");
     const normalized = safeMessages.map(normalizeMessage);
@@ -473,7 +572,12 @@ function sanitizeProviderMessages(messages) {
             meta: { violations },
         });
     }
-
+    // Track which tool_call_ids belonged to assistant messages whose inline
+    // reasoning we just stripped. The synthetic-tool-result repair downstream
+    // uses this set to produce an explanatory message for those calls so the
+    // agent has full awareness of what happened.
+    const inlineReasoningStrippedCallIds = new Set();
+    return canonicalizeSystemMessageSequence(migrateToolNames(repairToolCallSequences(repairInlineReasoningOnReplay(repairSessionMessages(normalized.map(toProviderMessage)), inlineReasoningStrippedCallIds), inlineReasoningStrippedCallIds)));
 }
 function stampIngressTime(msg) {
     msg._ingressAt = new Date().toISOString();
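The load-time path in this file works in two steps that share one `Set`: strip inline reasoning from assistant messages that also carry tool calls, then give every unanswered call a synthetic result whose wording depends on whether its parent was stripped. The sketch below is a simplified model under stated assumptions, not the package's implementation: `repairSession` is a hypothetical name, the two messages are abbreviated, the stripping uses a regex shortcut, and synthetic results are placed directly after the assistant message rather than after any existing tool results as the real code does.

```javascript
// Abbreviated versions of the two messages in the diff.
const EXPLANATORY = "error: this tool call's result was lost. inline <think> reasoning was stripped so the next replay is valid; retry the tool call if the work still needs doing.";
const GENERIC = "error: this tool call's result was lost. if the work needs to be done, retry the tool call now.";

// Hypothetical, simplified model of sanitizeProviderMessages' repair steps.
function repairSession(messages) {
    const strippedCallIds = new Set();
    const answered = new Set(
        messages.filter((m) => m.role === "tool").map((m) => m.tool_call_id));
    const out = [];
    for (const msg of messages) {
        const calls = msg.role === "assistant" && Array.isArray(msg.tool_calls) ? msg.tool_calls : [];
        let repaired = msg;
        if (calls.length > 0 && typeof msg.content === "string" && msg.content.includes("<think>")) {
            // Step 1: strip the reasoning and remember which calls were affected.
            const text = msg.content
                .replace(/<think>[\s\S]*?<\/think>/g, "")
                .replace(/<think>[\s\S]*$/, "")
                .trim();
            repaired = { ...msg, content: text.length > 0 ? text : null };
            for (const tc of calls)
                strippedCallIds.add(tc.id);
        }
        out.push(repaired);
        // Step 2: every unanswered call gets a result the agent can act on.
        for (const tc of calls) {
            if (!answered.has(tc.id)) {
                out.push({
                    role: "tool",
                    tool_call_id: tc.id,
                    content: strippedCallIds.has(tc.id) ? EXPLANATORY : GENERIC,
                });
            }
        }
    }
    return out;
}
```

Passing the same set through both steps is what lets the second step address the explanatory message to exactly the calls whose parent message was rewritten.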
package/dist/repertoire/tools-trip.js
CHANGED
@@ -248,6 +248,82 @@ exports.tripToolDefinitions = [
         },
         summaryKeys: ["tripId", "legId"],
     },
+    {
+        tool: {
+            type: "function",
+            function: {
+                name: "trip_update_leg",
+                description: "Update specific fields of an existing leg in a trip. Pass tripId, legId, and a JSON object of field updates (e.g. {status:\"cancelled\", confirmationCode:\"PNR123\"}). Existing evidence is preserved unless explicitly overwritten. Use this instead of trip_upsert when you only need to change one leg without re-emitting the whole record. The leg's `kind` cannot be changed (changing kind means a new leg).",
+                parameters: {
+                    type: "object",
+                    properties: {
+                        tripId: { type: "string", description: "Canonical trip id." },
+                        legId: { type: "string", description: "Leg id within the trip." },
+                        updates: { type: "string", description: "JSON object of leg fields to update. Cannot include `legId` or `kind`. Common fields: status, confirmationCode, vendor, amount, checkInDate, checkOutDate, departureTime, arrivalTime, etc." },
+                        updatedAt: { type: "string", description: "ISO timestamp for the update. Used both for the leg's updatedAt and the trip's updatedAt." },
+                    },
+                    required: ["tripId", "legId", "updates", "updatedAt"],
+                },
+            },
+        },
+        handler: async (args, ctx) => {
+            if (!trustAllowsTripAccess(ctx))
+                return "trip ledger is private; this tool is only available in trusted contexts.";
+            const tripId = args.tripId;
+            const legId = args.legId;
+            const updatedAt = args.updatedAt;
+            if (typeof tripId !== "string" || tripId.length === 0)
+                return "tripId is required.";
+            if (typeof legId !== "string" || legId.length === 0)
+                return "legId is required.";
+            if (typeof updatedAt !== "string" || updatedAt.length === 0)
+                return "updatedAt is required.";
+            try {
+                const updates = parseJsonArg(args.updates, "updates");
+                if (!isRecord(updates))
+                    return "updates must be a JSON object.";
+                // Reject identity-changing fields — those would silently break referential integrity.
+                if ("legId" in updates)
+                    return "updates cannot change legId; create a new leg instead.";
+                if ("kind" in updates)
+                    return "updates cannot change kind; create a new leg instead.";
+                if (Object.keys(updates).length === 0)
+                    return "updates cannot be empty — pass at least one field.";
+                const trip = (0, store_1.readTripRecord)((0, identity_1.getAgentName)(), tripId);
+                const legIndex = trip.legs.findIndex((leg) => leg.legId === legId);
+                if (legIndex === -1)
+                    return `leg ${legId} not found in trip ${tripId}.`;
+                const leg = trip.legs[legIndex];
+                const updatedLeg = {
+                    ...leg,
+                    ...updates,
+                    legId: leg.legId,
+                    kind: leg.kind,
+                    updatedAt,
+                };
+                const updated = {
+                    ...trip,
+                    legs: [...trip.legs.slice(0, legIndex), updatedLeg, ...trip.legs.slice(legIndex + 1)],
+                    updatedAt,
+                };
+                (0, store_1.upsertTripRecord)((0, identity_1.getAgentName)(), updated);
+                (0, runtime_1.emitNervesEvent)({
+                    component: "trips",
+                    event: "trips.leg_updated",
+                    message: "trip leg fields updated",
+                    meta: { agentId: (0, identity_1.getAgentName)(), tripId, legId, fields: Object.keys(updates) },
+                });
+                const fieldList = Object.keys(updates).join(", ");
+                return `leg ${legId} updated in ${tripId}: ${fieldList}.`;
+            }
+            catch (error) {
+                if (error instanceof store_1.TripNotFoundError)
+                    return error.message;
+                return `update failed: ${error instanceof Error ? error.message : /* v8 ignore next -- non-Error throw is unreachable from parseJsonArg/store */ String(error)}`;
+            }
+        },
+        summaryKeys: ["tripId", "legId"],
+    },
     {
         tool: {
             type: "function",
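The core of the `trip_update_leg` handler is a guarded shallow merge. The sketch below lifts that merge into a hypothetical pure function, `applyLegUpdate`, with storage, trust checks, and nerve events left out. The sample trip shape (including the `evidence` array) is an assumption made for illustration; only the guard order and the spread order come from the diff.

```javascript
// Hypothetical pure version of the handler's merge step.
function applyLegUpdate(trip, legId, updates, updatedAt) {
    // Identity-changing fields are rejected before anything is read.
    if ("legId" in updates)
        return { error: "updates cannot change legId; create a new leg instead." };
    if ("kind" in updates)
        return { error: "updates cannot change kind; create a new leg instead." };
    if (Object.keys(updates).length === 0)
        return { error: "updates cannot be empty" };
    const legIndex = trip.legs.findIndex((leg) => leg.legId === legId);
    if (legIndex === -1)
        return { error: `leg ${legId} not found` };
    const leg = trip.legs[legIndex];
    // Later spreads win, so legId and kind are pinned after the updates.
    const updatedLeg = { ...leg, ...updates, legId: leg.legId, kind: leg.kind, updatedAt };
    return {
        trip: {
            ...trip,
            legs: [...trip.legs.slice(0, legIndex), updatedLeg, ...trip.legs.slice(legIndex + 1)],
            updatedAt,
        },
    };
}

// Sample record; field names other than legId/kind/updatedAt are assumptions.
const sampleTrip = {
    updatedAt: "2025-01-01T00:00:00Z",
    legs: [
        { legId: "l1", kind: "flight", status: "booked", evidence: ["mail-1"] },
        { legId: "l2", kind: "lodging", status: "booked" },
    ],
};
const outcome = applyLegUpdate(sampleTrip, "l1", { status: "cancelled" }, "2025-02-01T00:00:00Z");
console.log(outcome.trip.legs[0]);
```

Because the merge is shallow, fields that are not named in `updates` (such as `evidence` in the sample) carry over unchanged, which matches the "preserved unless explicitly overwritten" wording in the tool description.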
package/dist/senses/shared-turn.js
CHANGED
@@ -39,6 +39,7 @@ var __importStar = (this && this.__importStar) || (function () {
     };
 })();
 Object.defineProperty(exports, "__esModule", { value: true });
+exports.stripThinkBlocks = stripThinkBlocks;
 exports.runSenseTurn = runSenseTurn;
 const os = __importStar(require("os"));
 const path = __importStar(require("path"));
@@ -59,6 +60,29 @@ const pipeline_1 = require("./pipeline");
 const mcp_manager_1 = require("../repertoire/mcp-manager");
 const runtime_1 = require("../nerves/runtime");
 const RESPONSE_CAP = 50_000;
+/**
+ * Strip MiniMax-style `<think>...</think>` reasoning blocks from a response
+ * string. Handles unclosed open tags (treats everything from `<think>` to
+ * end of string as reasoning) and multiple blocks in sequence. Returns the
+ * trimmed remainder.
+ */
+function stripThinkBlocks(input) {
+    let out = input;
+    // Closed blocks first (greedy match removed by repeatedly slicing the leftmost pair).
+    for (;;) {
+        const open = out.indexOf("<think>");
+        if (open === -1)
+            break;
+        const close = out.indexOf("</think>", open + "<think>".length);
+        if (close === -1) {
+            // Unclosed — drop everything from <think> onward.
+            out = out.slice(0, open);
+            break;
+        }
+        out = out.slice(0, open) + out.slice(close + "</think>".length);
+    }
+    return out.trim();
+}
 /**
  * Run a single agent turn through the inbound pipeline.
  * Caller provides channel, session key, friend, and message;
@@ -191,6 +215,25 @@ async function runSenseTurn(options) {
     else {
         finalResponse = responseText;
     }
+    // Strip MiniMax-style <think>...</think> blocks from the final response.
+    // When a reasoning-style model emits only a think block and no final answer
+    // (no settle tool call, no post-think text), the readback path above
+    // surfaces the raw saved assistant content — which includes the think tags
+    // and renders as empty (or as raw reasoning) on MCP/CLI clients. Strip
+    // here so the caller sees the actual delivered text. If only reasoning
+    // came through and nothing else, surface a clear diagnostic message
+    // instead of a blank response so the operator knows what happened.
+    finalResponse = stripThinkBlocks(finalResponse);
+    if (finalResponse.length === 0) {
+        (0, runtime_1.emitNervesEvent)({
+            level: "warn",
+            component: "senses",
+            event: "senses.shared_turn_only_reasoning",
+            message: "agent produced only <think> reasoning with no final answer — likely a model that closed the think tag without continuing",
+            meta: { agentName, channel, sessionKey, friendId },
+        });
+        finalResponse = "(agent produced reasoning but no final answer this turn — try again, or check the session transcript for the trace)";
+    }
     // Cap response length
     if (finalResponse.length > RESPONSE_CAP) {
         finalResponse = finalResponse.slice(0, RESPONSE_CAP) + "\n\n[truncated — response exceeded 50K characters]";