@oh-my-pi/pi-agent-core 15.10.11 → 15.11.0

package/CHANGELOG.md CHANGED
@@ -2,6 +2,41 @@
2
2
 
3
3
  ## [Unreleased]
4
4
 
5
+ ## [15.11.0] - 2026-06-10
6
+ ### Breaking Changes
7
+
8
+ - Removed `compaction/index.ts` re-export of snapcompact helpers, so snapcompact utilities are no longer available from the agent compaction barrel and should be imported from `@oh-my-pi/snapcompact`
9
+ - Removed the `convertToLlm` alias export from `compaction/messages` — it duplicated `defaultConvertToLlm` under a second name. Import `defaultConvertToLlm` (array form) or the new `convertMessageToLlm` (single-message form) instead
10
+
11
+ ### Added
12
+
13
+ - Added `convertMessageToLlm()`: the single-message core transformer behind `defaultConvertToLlm()`. Embedders with app-specific message roles should handle their own roles and delegate every core role (`user`/`developer`/`assistant`/`toolResult`/`custom`/`hookMessage`/`branchSummary`/`compactionSummary`) to it instead of duplicating the conversion — a duplicated `compactionSummary` case is how snapcompact frames once silently dropped off provider requests
14
+ - Added `pruneSupersededToolResults()` and the opt-in `PruneConfig.supersedeKey` hook so harnesses can prune stale tool results superseded by a newer read of the same file; superseded results are pruned ahead of age-based victims during overflow pruning and replaced with a `[Superseded by a newer read of this file]` placeholder. Without the new config, `pruneToolOutputs()` behavior is unchanged.
15
+ - Added `readToolSupersedeKey()` implementing the read-tool path/selector grammar (selector-free reads supersede range reads of the same file; URL-scheme paths exempt). Pruning honors prompt-cache economics: per-turn prunes only fire when the post-candidate suffix is small or the cache is cold (idle gap).
16
+ - Added the `snapcompact` compaction strategy via `@oh-my-pi/snapcompact`: instead of an LLM summary, discarded history is printed onto dense bitmap frames and re-attached to the compaction summary message as image blocks. `CompactionSummaryMessage` gains an optional `images` field, `estimateTokens()` charges per attached frame, and frames persist under `preserveData.snapcompact` with an 8-frame middle-out eviction budget.
17
+ - Snapcompact frames are now rendered in a provider-aware shape (`SNAPCOMPACT_SHAPES` + `resolveSnapcompactShape(api)`), following the snapcompact 200k-token monolithic evals: Anthropic-family and unknown APIs get `8x8r-bw` (unscii-8 square cells, black ink, every line printed twice with the copy on a pale highlight band — read at F1 parity with raw text at ~2x lower cost and the most refusal-robust), Google gets `8x8r-sent` (sentence-hue ink, ~2.9x cheaper), and OpenAI gets `6x6u-sent` (unscii Lanczos-stretched to 6x6 cells — OpenAI bills a flat ~2.9k tokens per image, so frame count is the only cost lever) with `detail: "original"` on the frame images. `snapcompactCompact()` accepts `model`/`shape` options, frames persist their shape metadata, mixed-shape archives (provider switches, legacy 5x8 frames) are flagged in the reading instructions, and `snapcompactGeometry()`/`renderSnapcompactFrame()` now take a shape
18
+
19
+ ### Changed
20
+
21
+ - Compaction and branch-summary file lists are now a single `<files>` tag instead of `<read-files>`/`<modified-files>`: paths render as the grouped, prefix-folded directory tree the find/search tools emit (`# dir/` headers, bare basenames), each annotated `(Read)`, `(Write)`, or `(RW)` — modified files that were also read get `(RW)`. Legacy tags in summaries written by earlier versions are still stripped and self-heal on the next compaction
22
+
23
+ ### Fixed
24
+
25
+ - Fixed queued steering messages being drained into an externally aborted run: interrupting mid-tool execution (e.g. Enter with a pending steer) dequeued the steer into the dying run — it landed in history without a response and the post-abort resume saw an empty queue, so the agent stopped instead of continuing. Steering/follow-up/aside queue polls are now skipped once the run's abort signal fires, leaving the queue intact for `Agent.continue()`.
26
+ - Fixed `<read-files>` compaction lists recording the same file once per line-range/raw selector (`src/foo.ts:50-200`, `:raw`, `:1-50:raw`, …): read-tool selectors are now stripped before tracking, so reads dedupe to the base path and match their write/edit path when splitting read-only vs modified lists. Selector-polluted lists stored by earlier compactions self-heal on the next compaction. `readToolSupersedeKey()` now shares the same splitter (`splitReadSelector()`), gaining the `..` range alias and `L`-prefix forms it previously missed.
27
+ - Fixed `estimateTokens()` undercounting thinking-heavy assistant messages on replay: `thinkingSignature` payloads (OpenAI Responses encrypted reasoning items, Anthropic signed thinking blocks, etc.) and `redactedThinking.data` are now charged alongside the visible thinking text, so the local estimate tracks provider-reported usage instead of straddling the threshold on every turn ([#2275](https://github.com/can1357/oh-my-pi/issues/2275)).
28
+
29
+ ## [15.10.12] - 2026-06-10
30
+
31
+ ### Added
32
+
33
+ - Added `AgentLoopConfig.getDisableReasoning` so callers can override `disableReasoning` per LLM call, mirroring `getReasoning`.
34
+ - Added `transformProviderContext` to `AgentOptions`/`AgentLoopConfig`: an optional hook applied to the assembled provider context after conversion, normalization, and append-only handling, but before telemetry capture and provider send.
35
+
36
+ ### Fixed
37
+
38
+ - Fixed `Agent` runs so explicit reasoning disablement is forwarded to provider stream options and re-resolved per continuation, keeping mid-run thinking-off changes in sync with the next provider request.
39
+
5
40
  ## [15.10.11] - 2026-06-10
6
41
 
7
42
  ### Changed
@@ -10,6 +45,7 @@
10
45
  - Catalog imports moved to the new `@oh-my-pi/pi-catalog` package: subpath imports (`calculateCost`, Codex wire constants) plus catalog values previously taken from the `@oh-my-pi/pi-ai` root (`getBundledModel`, `clampThinkingLevelForModel`), which pi-ai no longer re-exports; type-only `Model`/`Api`/`Effort` imports from pi-ai are unchanged
11
46
 
12
47
  ## [15.10.8] - 2026-06-09
48
+
13
49
  ### Added
14
50
 
15
51
  - Added optional `fetch` overrides to `SummaryOptions` and `compact`/`generateSummary` so remote compaction can use custom HTTP clients
@@ -18,6 +54,7 @@
18
54
  - Added the upstream provider that served a request (`AssistantMessage.upstreamProvider`, e.g. OpenRouter's routed provider) as a `pi.gen_ai.response.upstream_provider` chat-span telemetry attribute, alongside the existing response id and time-to-first-chunk.
19
55
 
20
56
  ## [15.10.5] - 2026-06-08
57
+
21
58
  ### Removed
22
59
 
23
60
  - Removed the `maxToolCallsPerTurn` option from `AgentOptions` and `AgentLoopConfig`, so assistant turns are no longer capped after a configured number of completed tool calls
@@ -55,7 +92,6 @@
55
92
  - Removed stale synthetic user-message tag filters from OpenAI remote compaction output preservation; developer messages are now dropped by role instead.
56
93
  - Tool executions now receive the active turn `AbortSignal` unconditionally.
57
94
 
58
-
59
95
  ## [15.10.2] - 2026-06-08
60
96
 
61
97
  ### Fixed
@@ -87,6 +123,7 @@
87
123
  - Surfaced Anthropic stream failures whose message starts with `Output blocked by conten` as normal assistant error lifecycle events, so interactive clients render content-filter blocks instead of silently dropping the streaming bubble at `agent_end`.
88
124
 
89
125
  ## [15.8.3] - 2026-06-03
126
+
90
127
  ### Added
91
128
 
92
129
  - Added `getReadToolPath(context)` to `@oh-my-pi/pi-agent-core/compaction/tool-protection` to extract a paired `read` tool call's `path` for embedders building read-targeted protection matchers
@@ -1,4 +1,4 @@
1
- import { type ApiKeyResolveContext, type AssistantMessage, type AssistantMessageEvent, type CursorExecHandlers, type CursorToolResultHandler, type Effort, type ImageContent, type Message, type Model, type ProviderSessionState, type ServiceTier, type SimpleStreamOptions, type ThinkingBudgets, type ToolChoice } from "@oh-my-pi/pi-ai";
1
+ import { type ApiKeyResolveContext, type AssistantMessage, type AssistantMessageEvent, type Context, type CursorExecHandlers, type CursorToolResultHandler, type Effort, type ImageContent, type Message, type Model, type ProviderSessionState, type ServiceTier, type SimpleStreamOptions, type ThinkingBudgets, type ToolChoice } from "@oh-my-pi/pi-ai";
2
2
  import type { AppendOnlyContextManager } from "./append-only-context";
3
3
  import type { HarmonyAuditEvent } from "./harmony-leak";
4
4
  import type { AgentEvent, AgentLoopConfig, AgentMessage, AgentState, AgentTool, AgentToolContext, AsideMessage, StreamFn, ToolCallContext } from "./types";
@@ -17,6 +17,11 @@ export interface AgentOptions {
17
17
  * Use for context pruning, injecting external context, etc.
18
18
  */
19
19
  transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
20
+ /**
21
+ * Optional transform applied after provider context assembly and before
22
+ * telemetry capture/provider send.
23
+ */
24
+ transformProviderContext?: (context: Context) => Context;
20
25
  /**
21
26
  * Steering mode: "all" = send all steering messages at once, "one-at-a-time" = one per turn
22
27
  */
@@ -294,6 +299,7 @@ export declare class Agent {
294
299
  setSystemPrompt(v: string[]): void;
295
300
  setModel(m: Model): void;
296
301
  setThinkingLevel(l: Effort | undefined): void;
302
+ setDisableReasoning(disabled: boolean): void;
297
303
  setSteeringMode(mode: "all" | "one-at-a-time"): void;
298
304
  getSteeringMode(): "all" | "one-at-a-time";
299
305
  setFollowUpMode(mode: "all" | "one-at-a-time"): void;
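
Note on `transformProviderContext`: the declaration above is a synchronous `(context: Context) => Context` hook, and the agent-loop change assigns its return value back to the provider context. A minimal sketch of such a hook follows. `SketchContext`, `SketchTool`, and `makeToolAllowlistHook` are hypothetical stand-ins for illustration; the real `Context` type comes from `@oh-my-pi/pi-ai`, and its exact fields are assumed here, not taken from the diff.

```typescript
// Hypothetical stand-ins: the real Context type lives in @oh-my-pi/pi-ai.
type SketchTool = { name: string };
type SketchContext = {
  systemPrompt: string[];
  messages: unknown[];
  tools?: SketchTool[];
};

// Builds a hook that keeps only allowlisted tools in the outgoing context.
// It returns a new object instead of mutating its input.
function makeToolAllowlistHook(allowed: ReadonlySet<string>) {
  return (context: SketchContext): SketchContext => {
    if (!context.tools) return context; // nothing to filter
    return {
      ...context,
      tools: context.tools.filter(tool => allowed.has(tool.name)),
    };
  };
}
```

In the sketch the hook is pure, which seems the safer choice given that it runs after append-only context handling and just before telemetry capture and provider send.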
@@ -4,10 +4,10 @@
4
4
  * Pure functions for compaction logic. The session manager handles I/O,
5
5
  * and after compaction the session is reloaded.
6
6
  */
7
- import { type FetchImpl, type MessageAttribution, type Model, type Usage } from "@oh-my-pi/pi-ai";
7
+ import { type FetchImpl, type MessageAttribution, type Model, type Tool, type Usage } from "@oh-my-pi/pi-ai";
8
8
  import { type AgentTelemetry } from "../telemetry";
9
9
  import { ThinkingLevel } from "../thinking";
10
- import type { AgentMessage, AgentTool } from "../types";
10
+ import type { AgentMessage } from "../types";
11
11
  import type { SessionEntry } from "./entries";
12
12
  import { type ConvertToLlm } from "./messages";
13
13
  import { type FileOperations } from "./utils";
@@ -30,7 +30,7 @@ export interface CompactionResult<T = unknown> {
30
30
  }
31
31
  export interface CompactionSettings {
32
32
  enabled: boolean;
33
- strategy?: "context-full" | "handoff" | "shake" | "off";
33
+ strategy?: "context-full" | "handoff" | "shake" | "snapcompact" | "off";
34
34
  thresholdPercent?: number;
35
35
  thresholdTokens?: number;
36
36
  reserveTokens: number;
@@ -133,7 +133,7 @@ export interface HandoffOptions {
133
133
  /** Live agent system prompt — passed verbatim so providers hit the cached prefix. */
134
134
  systemPrompt: string[];
135
135
  /** Live agent tool list — same purpose. Forced to `toolChoice: "none"`. */
136
- tools?: AgentTool<any>[];
136
+ tools?: Tool[];
137
137
  customInstructions?: string;
138
138
  convertToLlm?: ConvertToLlm;
139
139
  initiatorOverride?: MessageAttribution;
@@ -33,6 +33,8 @@ export interface CompactionSummaryMessage {
33
33
  shortSummary?: string;
34
34
  tokensBefore: number;
35
35
  providerPayload?: ProviderPayload;
36
+ /** Snapcompact frames archived by this compaction; appended as image blocks after the summary text. */
37
+ images?: ImageContent[];
36
38
  timestamp: number;
37
39
  }
38
40
  export type CoreCompactionMessage = CustomMessage | HookMessage | BranchSummaryMessage | CompactionSummaryMessage;
@@ -48,8 +50,19 @@ export type ConvertToLlm = (messages: AgentMessage[]) => Message[];
48
50
  export declare function renderBranchSummaryContext(summary: string): string;
49
51
  export declare function renderCompactionSummaryContext(summary: string): string;
50
52
  export declare function createBranchSummaryMessage(summary: string, fromId: string, timestamp: string): BranchSummaryMessage;
51
- export declare function createCompactionSummaryMessage(summary: string, tokensBefore: number, timestamp: string, shortSummary?: string, providerPayload?: ProviderPayload): CompactionSummaryMessage;
53
+ export declare function createCompactionSummaryMessage(summary: string, tokensBefore: number, timestamp: string, shortSummary?: string, providerPayload?: ProviderPayload, images?: ImageContent[]): CompactionSummaryMessage;
52
54
  export declare function createCustomMessage(customType: string, content: string | (TextContent | ImageContent)[], display: boolean, details: unknown | undefined, timestamp: string, attribution?: MessageAttribution): CustomMessage;
55
+ /**
56
+ * Transform a single core-domain agent message to its LLM form; `undefined`
57
+ * drops it from the provider request.
58
+ *
59
+ * Single source of truth for the core roles (user/developer/assistant/
60
+ * toolResult) and the compaction messages owned by this package. Embedders
61
+ * with their own app messages (e.g. the coding agent) handle their custom
62
+ * roles and delegate every core role here — duplicating these cases is how
63
+ * snapcompact frames once silently fell off the provider request.
64
+ */
65
+ export declare function convertMessageToLlm(message: AgentMessage): Message | undefined;
53
66
  /**
54
67
  * Default compaction-domain transformer.
55
68
  *
@@ -58,4 +71,3 @@ export declare function createCustomMessage(customType: string, content: string
58
71
  * core LLM roles and the compaction messages owned by this package.
59
72
  */
60
73
  export declare function defaultConvertToLlm(messages: AgentMessage[]): Message[];
61
- export declare const convertToLlm: typeof defaultConvertToLlm;
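
Note on `convertMessageToLlm()`: the doc comment above asks embedders to handle only their own roles and delegate every core role. The sketch below illustrates that delegation pattern with local stand-in types so it runs on its own. `convertCoreMessage` stands in for `convertMessageToLlm`, and the `bashExecution` role, the message shapes, and the `<summary>` wrapper are hypothetical, not the package's actual types or output.

```typescript
// Stand-in types; the real AgentMessage and Message types are richer.
type CoreMessage = { role: "user" | "assistant" | "compactionSummary"; text: string };
type AppMessage = { role: "bashExecution"; command: string; output: string }; // hypothetical app role
type AnyMessage = CoreMessage | AppMessage;
type LlmMessage = { role: "user" | "assistant"; content: string };

// Stand-in for convertMessageToLlm(): one message in, one LLM message
// (or undefined to drop it) out.
function convertCoreMessage(message: CoreMessage): LlmMessage | undefined {
  switch (message.role) {
    case "user":
    case "assistant":
      return { role: message.role, content: message.text };
    case "compactionSummary":
      return { role: "user", content: `<summary>${message.text}</summary>` };
  }
}

// Embedder transformer: handle app-specific roles, delegate everything else.
function convertAppMessages(messages: AnyMessage[]): LlmMessage[] {
  const out: LlmMessage[] = [];
  for (const message of messages) {
    const converted =
      message.role === "bashExecution"
        ? { role: "user" as const, content: `$ ${message.command}\n${message.output}` }
        : convertCoreMessage(message);
    if (converted) out.push(converted); // undefined means "drop from the request"
  }
  return out;
}
```

Because the embedder never re-implements the `compactionSummary` case, any later change to how core messages convert (such as attached image frames) is picked up without touching the embedder's code.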
@@ -10,10 +10,58 @@ export interface PruneConfig {
10
10
  minimumSavings: number;
11
11
  /** Tool-result protection matchers. String entries protect every result from that tool; predicates may inspect the paired tool call. */
12
12
  protectedTools: ProtectedToolMatcher[];
13
+ /**
14
+ * Optional supersede key function (see {@link SupersedePruneConfig.supersedeKey}).
15
+ * When provided, superseded tool results are pruned first — even inside the
16
+ * `protectTokens` window — before age-based victims. Absent, behavior is
17
+ * unchanged.
18
+ */
19
+ supersedeKey?: SupersedeKeyFn;
13
20
  }
14
21
  export declare const DEFAULT_PRUNE_CONFIG: PruneConfig;
15
22
  export interface PruneResult {
16
23
  prunedCount: number;
17
24
  tokensSaved: number;
18
25
  }
26
+ /** Exact placeholder written over a superseded tool result. */
27
+ export declare const SUPERSEDED_NOTICE = "[Superseded by a newer read of this file]";
28
+ /**
29
+ * Maps a tool call to a supersede key. Results sharing a key form a group in
30
+ * which every result except the newest is a supersede candidate. A key `K`
31
+ * additionally supersedes keys with prefix `K + "\u0000"` (selector-free read
32
+ * supersedes selector-carrying reads of the same base path). Return
33
+ * `undefined` to exempt a call from supersede grouping.
34
+ */
35
+ export type SupersedeKeyFn = (toolName: string, args: Record<string, unknown>) => string | undefined;
36
+ export interface SupersedePruneConfig {
37
+ /** Supersede key function; results sharing a key supersede older ones. */
38
+ supersedeKey: SupersedeKeyFn;
39
+ /** Prune a candidate now when all messages after it total at most this many estimated tokens. Default 8 000. */
40
+ suffixTokenLimit?: number;
41
+ /** Prune all candidates when the last message is at least this old (prompt cache is cold anyway). Default 30 min. */
42
+ idleFlushMs?: number;
43
+ /** Clock override for tests. */
44
+ now?: number;
45
+ /** Tool-result protection matchers (same contract as {@link PruneConfig.protectedTools}). */
46
+ protectedTools: ProtectedToolMatcher[];
47
+ }
48
+ /**
49
+ * Prune superseded tool results (e.g. stale `read` outputs replaced by a newer
50
+ * read of the same file). Cheap, incremental, and prompt-cache-aware: a
51
+ * candidate is pruned now only when the suffix after it is small (tail case —
52
+ * the read→edit→read loop) or when the context has been idle long enough that
53
+ * the provider cache is cold anyway (then ALL candidates flush).
54
+ */
55
+ export declare function pruneSupersededToolResults(entries: SessionEntry[], config: SupersedePruneConfig): PruneResult;
19
56
  export declare function pruneToolOutputs(entries: SessionEntry[], config?: PruneConfig): PruneResult;
57
+ /**
58
+ * Supersede key for the `read` tool: the file path with the trailing line/raw
59
+ * selector stripped (the read tool's own splitter grammar via
60
+ * {@link splitReadSelector}, e.g. `src/foo.ts:50-200`, `:2-4:raw`).
61
+ * Internal/URL-scheme paths (`skill://…`, `https://…`) are exempt.
62
+ * Selector-free reads key on the bare path; selector-carrying reads key on
63
+ * `path + "\u0000" + selector`, so two reads collide only when the newer is
64
+ * selector-free or the selectors are identical (the pass's prefix rule lets a
65
+ * bare-path read supersede selector-carrying reads of the same file).
66
+ */
67
+ export declare function readToolSupersedeKey(toolName: string, args: Record<string, unknown>): string | undefined;
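
Note on the supersede-key rule: per the `SupersedeKeyFn` and `readToolSupersedeKey` comments above, an older result is a supersede candidate when a newer result has the same key, or when the newer key `K` is a bare path and the older key starts with `K + "\u0000"`. The sketch below illustrates only that grouping rule. `makeKey` and `supersededIndices` are hypothetical helpers, and the selector parsing, token limits, and idle-flush checks of the real pass are left out.

```typescript
const SEP = "\u0000";

// Key layout from the doc comment: bare path for selector-free reads,
// path + "\u0000" + selector for selector-carrying reads.
function makeKey(path: string, selector?: string): string {
  return selector === undefined ? path : path + SEP + selector;
}

// Given keys in oldest-first order, return the indices of results that a
// newer result supersedes. `undefined` keys are exempt from grouping.
function supersededIndices(keys: (string | undefined)[]): number[] {
  const superseded: number[] = [];
  for (let i = 0; i < keys.length; i++) {
    const older = keys[i];
    if (older === undefined) continue;
    for (let j = i + 1; j < keys.length; j++) {
      const newer = keys[j];
      if (newer === undefined) continue;
      // Same key, or a newer bare-path read covering an older ranged read.
      if (newer === older || older.startsWith(newer + SEP)) {
        superseded.push(i);
        break;
      }
    }
  }
  return superseded;
}
```

The nested loop is quadratic and is written for readability only. Note the asymmetry: a newer ranged read does not supersede an older bare-path read, because the bare key does not start with the ranged key.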
@@ -9,6 +9,22 @@ export interface FileOperations {
9
9
  edited: Set<string>;
10
10
  }
11
11
  export declare function createFileOps(): FileOperations;
12
+ /**
13
+ * Split a read-tool path into its base path and trailing selector, mirroring the
14
+ * read tool's own splitter. Single source of the grammar in this package: the
15
+ * file-operations list strips selectors via {@link stripReadSelector}, and the
16
+ * supersede-prune pass keys on both parts via `readToolSupersedeKey`.
17
+ */
18
+ export declare function splitReadSelector(path: string): {
19
+ path: string;
20
+ sel?: string;
21
+ };
22
+ /**
23
+ * Strip a trailing read-tool selector (`:50-200`, `:raw`, `:1-50:raw`, `:conflicts`, …)
24
+ * so the same file read with different line ranges dedupes to one `<files>` entry
25
+ * and matches its write/edit path when computing Read/Write/RW markers.
26
+ */
27
+ export declare function stripReadSelector(path: string): string;
12
28
  /**
13
29
  * Extract file operations from tool calls in an assistant message.
14
30
  */
@@ -21,8 +37,8 @@ export declare function computeFileLists(fileOps: FileOperations): {
21
37
  readFiles: string[];
22
38
  modifiedFiles: string[];
23
39
  };
24
- export declare function formatFileOperations(readFiles: string[], modifiedFiles: string[]): string;
25
- export declare function upsertFileOperations(summary: string, readFiles: string[], modifiedFiles: string[]): string;
40
+ export declare function formatFileOperations(readFiles: string[], modifiedFiles: string[], readSet?: ReadonlySet<string>): string;
41
+ export declare function upsertFileOperations(summary: string, readFiles: string[], modifiedFiles: string[], readSet?: ReadonlySet<string>): string;
26
42
  /**
27
43
  * Serialize LLM messages to text for summarization.
28
44
  * This prevents the model from treating it as a conversation to continue.
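
Note on the new `readSet` parameter: the changelog says each path in the `<files>` tag is annotated `(Read)`, `(Write)`, or `(RW)`, with modified files that were also read getting `(RW)`. A minimal sketch of that marker computation follows. `annotateFiles` is a hypothetical helper, and it assumes `readFiles` holds read-only paths while `readSet` holds every path that was read. It does not reproduce the grouped directory-tree rendering.

```typescript
type Marker = "Read" | "Write" | "RW";

// Compute one marker per path. Modified paths that also appear in the
// read set become "RW"; other modified paths become "Write".
function annotateFiles(
  readFiles: string[],
  modifiedFiles: string[],
  readSet: ReadonlySet<string> = new Set(),
): Map<string, Marker> {
  const markers = new Map<string, Marker>();
  for (const path of readFiles) markers.set(path, "Read");
  for (const path of modifiedFiles) {
    markers.set(path, readSet.has(path) ? "RW" : "Write");
  }
  return markers;
}
```

This only works if selectors such as `:50-200` were stripped before tracking, which is what `stripReadSelector` is described as doing; otherwise a ranged read would not match its write path.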
@@ -1,4 +1,4 @@
1
- import type { ApiKeyResolveContext, AssistantMessage, AssistantMessageEvent, AssistantMessageEventStream, Effort, ImageContent, Message, Model, SimpleStreamOptions, Static, streamSimple, TextContent, Tool, ToolChoice, ToolResultMessage, TSchema } from "@oh-my-pi/pi-ai";
1
+ import type { ApiKeyResolveContext, AssistantMessage, AssistantMessageEvent, AssistantMessageEventStream, Context, Effort, ImageContent, Message, Model, SimpleStreamOptions, Static, streamSimple, TextContent, Tool, ToolChoice, ToolResultMessage, TSchema } from "@oh-my-pi/pi-ai";
2
2
  import type { AppendOnlyContextManager } from "./append-only-context";
3
3
  import type { HarmonyAuditEvent } from "./harmony-leak";
4
4
  import type { AgentRunCoverage, AgentRunSummary } from "./run-collector";
@@ -79,6 +79,12 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
79
79
  * ```
80
80
  */
81
81
  transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
82
+ /**
83
+ * Optional transform applied to the final provider context after conversion,
84
+ * normalization, and append-only context handling, but before telemetry capture
85
+ * and provider send.
86
+ */
87
+ transformProviderContext?: (context: Context) => Context;
82
88
  /**
83
89
  * Resolves an API key dynamically for each LLM call.
84
90
  *
@@ -171,6 +177,14 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
171
177
  * the next model call instead of waiting for the next prompt.
172
178
  */
173
179
  getReasoning?: () => Effort | undefined;
180
+ /**
181
+ * Dynamic reasoning-disable override, resolved per LLM call. When set,
182
+ * its return value overrides the static `disableReasoning` from
183
+ * `SimpleStreamOptions` for that request. Pair with `getReasoning` so
184
+ * mid-run transitions into and out of the explicit `off` state propagate
185
+ * to the next provider call.
186
+ */
187
+ getDisableReasoning?: () => boolean | undefined;
174
188
  /**
175
189
  * Called after a tool call has been validated and is about to execute.
176
190
  *
@@ -307,6 +321,7 @@ export interface AgentState {
307
321
  systemPrompt: string[];
308
322
  model: Model;
309
323
  thinkingLevel?: Effort;
324
+ disableReasoning?: boolean;
310
325
  tools: AgentTool<any>[];
311
326
  messages: AgentMessage[];
312
327
  isStreaming: boolean;
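
Note on `getDisableReasoning`: the agent-loop change resolves the effective value as `dynamicDisableReasoning ?? config.disableReasoning`. The sketch below isolates that resolution so the precedence is easy to see. `ReasoningConfig` and `resolveDisableReasoning` are hypothetical names covering only the two fields involved.

```typescript
// Only the two fields relevant to the resolution; the real AgentLoopConfig
// has many more.
interface ReasoningConfig {
  disableReasoning?: boolean;
  getDisableReasoning?: () => boolean | undefined;
}

// The dynamic getter wins whenever it returns a boolean. `??` falls through
// only on null/undefined, so an explicit `false` from the getter still
// overrides a static `true`.
function resolveDisableReasoning(config: ReasoningConfig): boolean | undefined {
  return config.getDisableReasoning?.() ?? config.disableReasoning;
}
```

Because the getter is called per LLM call, a mid-run `setDisableReasoning()` on the agent state would be visible to the next request without rebuilding the config.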
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "type": "module",
3
3
  "name": "@oh-my-pi/pi-agent-core",
4
- "version": "15.10.11",
4
+ "version": "15.11.0",
5
5
  "description": "General-purpose agent with transport abstraction, state management, and attachment support",
6
6
  "homepage": "https://omp.sh",
7
7
  "author": "Can Boluk",
@@ -35,10 +35,11 @@
35
35
  "fmt": "biome format --write ."
36
36
  },
37
37
  "dependencies": {
38
- "@oh-my-pi/pi-ai": "15.10.11",
39
- "@oh-my-pi/pi-catalog": "15.10.11",
40
- "@oh-my-pi/pi-natives": "15.10.11",
41
- "@oh-my-pi/pi-utils": "15.10.11",
38
+ "@oh-my-pi/pi-ai": "15.11.0",
39
+ "@oh-my-pi/pi-catalog": "15.11.0",
40
+ "@oh-my-pi/pi-natives": "15.11.0",
41
+ "@oh-my-pi/pi-utils": "15.11.0",
42
+ "@oh-my-pi/snapcompact": "15.11.0",
42
43
  "@opentelemetry/api": "^1.9.1"
43
44
  },
44
45
  "devDependencies": {
package/src/agent-loop.ts CHANGED
@@ -564,8 +564,10 @@ async function runLoopBody(
564
564
  streamFn?: StreamFn,
565
565
  ): Promise<void> {
566
566
  let firstTurn = true;
567
- // Check for steering messages at start (user may have typed while waiting)
568
- let pendingMessages: AgentMessage[] = (await config.getSteeringMessages?.()) || [];
567
+ // Check for steering messages at start (user may have typed while waiting).
568
+ // Skip when the run is already externally aborted — dequeuing would strand
569
+ // the messages in a run that is about to die.
570
+ let pendingMessages: AgentMessage[] = signal?.aborted ? [] : (await config.getSteeringMessages?.()) || [];
569
571
  let harmonyRetryAttempt = 0;
570
572
  let harmonyTruncateResumeCount = 0;
571
573
 
@@ -743,7 +745,12 @@ async function runLoopBody(
743
745
 
744
746
  stream.push({ type: "turn_end", message, toolResults });
745
747
 
746
- const steering = steeringMessagesFromExecution ?? ((await config.getSteeringMessages?.()) || []);
748
+ // On external abort (user interrupt), leave the steering queue intact: the
749
+ // session aborts then continues, delivering the queue into a fresh run.
750
+ // Draining it here would inject the messages right before a model call that
751
+ // instantly aborts — message lands in history, agent never responds.
752
+ const steering =
753
+ steeringMessagesFromExecution ?? (signal?.aborted ? [] : (await config.getSteeringMessages?.()) || []);
747
754
  if (hasMoreToolCalls) {
748
755
  // Mid-work: fold any non-interrupting asides into the next turn alongside steering.
749
756
  const asides = resolveAsides(await config.getAsideMessages?.());
@@ -758,8 +765,9 @@ async function runLoopBody(
758
765
 
759
766
  // Agent would stop here. Drain non-interrupting asides + follow-up messages.
760
767
  await config.onBeforeYield?.();
761
- const asideMessages = resolveAsides(await config.getAsideMessages?.());
762
- const followUpMessages = (await config.getFollowUpMessages?.()) || [];
768
+ // Skip queue drains when externally aborted (same stranding hazard as above).
769
+ const asideMessages = signal?.aborted ? [] : resolveAsides(await config.getAsideMessages?.());
770
+ const followUpMessages = signal?.aborted ? [] : (await config.getFollowUpMessages?.()) || [];
763
771
  if (asideMessages.length > 0 || followUpMessages.length > 0) {
764
772
  // Set as pending so the inner loop processes them before stopping.
765
773
  pendingMessages = [...asideMessages, ...followUpMessages];
@@ -829,6 +837,9 @@ async function streamAssistantResponse(
829
837
  tools: normalizeTools(context.tools, !!config.intentTracing),
830
838
  };
831
839
  }
840
+ if (config.transformProviderContext) {
841
+ llmContext = config.transformProviderContext(llmContext);
842
+ }
832
843
 
833
844
  const streamFunction = streamFn || streamSimple;
834
845
 
@@ -845,6 +856,7 @@ async function streamAssistantResponse(
845
856
 
846
857
  const dynamicToolChoice = config.getToolChoice?.();
847
858
  const dynamicReasoning = config.getReasoning?.();
859
+ const dynamicDisableReasoning = config.getDisableReasoning?.();
848
860
  const harmonyMitigationEnabled = isHarmonyLeakMitigationTarget(config.model);
849
861
  const harmonyAbortController = harmonyMitigationEnabled ? new AbortController() : undefined;
850
862
  const requestSignal = harmonyAbortController
@@ -856,6 +868,7 @@ async function streamAssistantResponse(
856
868
  harmonyRetryAttempt > 0 && config.temperature !== undefined ? config.temperature + 0.05 : config.temperature;
857
869
  const effectiveToolChoice = dynamicToolChoice ?? config.toolChoice;
858
870
  const effectiveReasoning = dynamicReasoning ?? config.reasoning;
871
+ const effectiveDisableReasoning = dynamicDisableReasoning ?? config.disableReasoning;
859
872
 
860
873
  const chatStepNumber = stepCounter.count;
861
874
  stepCounter.count += 1;
@@ -916,6 +929,7 @@ async function streamAssistantResponse(
916
929
  metadata: resolvedMetadata,
917
930
  toolChoice: effectiveToolChoice,
918
931
  reasoning: effectiveReasoning,
932
+ disableReasoning: effectiveDisableReasoning,
919
933
  temperature: effectiveTemperature,
920
934
  signal: requestSignal,
921
935
  onResponse: captureOnResponse,
@@ -1247,11 +1261,16 @@ async function executeToolCalls(
1247
1261
  }));
1248
1262
 
1249
1263
  const checkSteering = async (): Promise<void> => {
1250
- if (!shouldInterruptImmediately || !getSteeringMessages || interruptState.triggered) {
1264
+ // `signal` (external/user abort) is checked separately from the internal
1265
+ // steeringAbortController: once the run is externally aborted it is
1266
+ // unwinding, and draining the steering queue here would strand the
1267
+ // messages in the dying run instead of leaving them for the post-abort
1268
+ // continue (interruptAndFlushQueuedMessages → Agent.continue()).
1269
+ if (!shouldInterruptImmediately || !getSteeringMessages || interruptState.triggered || signal?.aborted) {
1251
1270
  return;
1252
1271
  }
1253
1272
  const check = steeringCheckTail.then(async () => {
1254
- if (interruptState.triggered) return;
1273
+ if (interruptState.triggered || signal?.aborted) return;
1255
1274
  const steering = await getSteeringMessages();
1256
1275
  if (steering.length > 0) {
1257
1276
  steeringMessages = steering;
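
Note on the abort guard: every queue poll in this file's diff now follows the shape `signal?.aborted ? [] : drain()`. The sketch below shows why that leaves the queue intact for a later run. `SteeringQueue` and `pollUnlessAborted` are hypothetical, and the callbacks are simplified to synchronous calls, whereas the real `getSteeringMessages` is awaited.

```typescript
// Hypothetical in-memory queue; draining removes and returns all items.
class SteeringQueue {
  private items: string[] = [];
  push(item: string): void {
    this.items.push(item);
  }
  drain(): string[] {
    return this.items.splice(0);
  }
  get size(): number {
    return this.items.length;
  }
}

// Skip the drain entirely once the run's signal has fired, so queued
// messages are not consumed by a run that is already unwinding.
function pollUnlessAborted(queue: SteeringQueue, signal?: AbortSignal): string[] {
  return signal?.aborted ? [] : queue.drain();
}
```

The key point is that the check happens before the drain: checking after would already have emptied the queue, which is the stranding behavior the changelog entry describes.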
package/src/agent.ts CHANGED
@@ -6,6 +6,7 @@ import {
6
6
  type ApiKeyResolveContext,
7
7
  type AssistantMessage,
8
8
  type AssistantMessageEvent,
9
+ type Context,
9
10
  type CursorExecHandlers,
10
11
  type CursorToolResultHandler,
11
12
  type Effort,
@@ -93,6 +94,12 @@ export interface AgentOptions {
93
94
  */
94
95
  transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
95
96
 
97
+ /**
98
+ * Optional transform applied after provider context assembly and before
99
+ * telemetry capture/provider send.
100
+ */
101
+ transformProviderContext?: (context: Context) => Context;
102
+
96
103
  /**
97
104
  * Steering mode: "all" = send all steering messages at once, "one-at-a-time" = one per turn
98
105
  */
@@ -265,6 +272,7 @@ export class Agent {
265
272
  systemPrompt: [],
266
273
  model: getBundledModel("google", "gemini-2.5-flash-lite-preview-06-17"),
267
274
  thinkingLevel: undefined,
275
+ disableReasoning: false,
268
276
  tools: [],
269
277
  messages: [],
270
278
  isStreaming: false,
@@ -277,6 +285,7 @@ export class Agent {
277
285
  #abortController?: AbortController;
278
286
  #convertToLlm: (messages: AgentMessage[]) => Message[] | Promise<Message[]>;
279
287
  #transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
288
+ #transformProviderContext?: (context: Context) => Context;
280
289
  #steeringQueue: AgentMessage[] = [];
281
290
  #followUpQueue: AgentMessage[] = [];
282
291
  #steeringMode: "all" | "one-at-a-time";
@@ -375,6 +384,7 @@ export class Agent {
375
384
  this.afterToolCall = opts.afterToolCall;
376
385
  this.#telemetry = opts.telemetry;
377
386
  this.#appendOnlyContext = opts.appendOnlyContext;
387
+ this.#transformProviderContext = opts.transformProviderContext;
378
388
  }
379
389
 
380
390
  /**
@@ -658,6 +668,10 @@ export class Agent {
658
668
  this.#state.thinkingLevel = l;
659
669
  }
660
670
 
671
+ setDisableReasoning(disabled: boolean) {
672
+ this.#state.disableReasoning = disabled;
673
+ }
674
+
661
675
  setSteeringMode(mode: "all" | "one-at-a-time") {
662
676
  this.#steeringMode = mode;
663
677
  }
@@ -942,6 +956,7 @@ export class Agent {
942
956
  const config: AgentLoopConfig = {
943
957
  model,
944
958
  reasoning,
959
+ disableReasoning: this.#state.disableReasoning,
945
960
  temperature: this.#temperature,
946
961
  topP: this.#topP,
947
962
  topK: this.#topK,
@@ -961,6 +976,7 @@ export class Agent {
961
976
  kimiApiFormat: this.#kimiApiFormat,
962
977
  preferWebsockets: this.#preferWebsockets,
963
978
  convertToLlm: this.#convertToLlm,
979
+ transformProviderContext: this.#transformProviderContext,
964
980
  transformContext: this.#transformContext,
965
981
  onPayload: this.#onPayload,
966
982
  onResponse: this.#onResponse,
@@ -985,6 +1001,7 @@ export class Agent {
985
1001
  onHarmonyLeak: this.#onHarmonyLeak,
986
1002
  getToolChoice,
987
1003
  getReasoning: () => this.#state.thinkingLevel,
1004
+ getDisableReasoning: () => this.#state.disableReasoning,
988
1005
  getSteeringMessages: async () => {
989
1006
  if (skipInitialSteeringPoll) {
990
1007
  skipInitialSteeringPoll = false;
@@ -13,10 +13,10 @@ import { estimateTokens } from "./compaction";
13
13
  import type { ReadonlySessionManager, SessionEntry } from "./entries";
14
14
  import {
15
15
  type ConvertToLlm,
16
- convertToLlm,
17
16
  createBranchSummaryMessage,
18
17
  createCompactionSummaryMessage,
19
18
  createCustomMessage,
19
+ defaultConvertToLlm,
20
20
  } from "./messages";
21
21
  import branchSummaryPrompt from "./prompts/branch-summary.md" with { type: "text" };
22
22
  import branchSummaryPreamble from "./prompts/branch-summary-preamble.md" with { type: "text" };
@@ -27,6 +27,7 @@ import {
27
27
  type FileOperations,
28
28
  SUMMARIZATION_SYSTEM_PROMPT,
29
29
  serializeConversation,
30
+ stripReadSelector,
30
31
  upsertFileOperations,
31
32
  } from "./utils";
32
33
 
@@ -214,7 +215,7 @@ export function prepareBranchEntries(entries: SessionEntry[], tokenBudget: numbe
214
215
  if (entry.type === "branch_summary" && !entry.fromExtension && entry.details) {
215
216
  const details = entry.details as BranchSummaryDetails;
216
217
  if (Array.isArray(details.readFiles)) {
217
- for (const f of details.readFiles) fileOps.read.add(f);
218
+ for (const f of details.readFiles) fileOps.read.add(stripReadSelector(f));
218
219
  }
219
220
  if (Array.isArray(details.modifiedFiles)) {
220
221
  // Modified files go into both edited and written for proper deduplication
@@ -288,7 +289,7 @@ export async function generateBranchSummary(
288
289
 
289
290
  // Transform to LLM-compatible messages, then serialize to text
290
291
  // Serialization prevents the model from treating it as a conversation to continue
291
- const llmMessages = (options.convertToLlm ?? convertToLlm)(messages);
292
+ const llmMessages = (options.convertToLlm ?? defaultConvertToLlm)(messages);
292
293
  const conversationText = serializeConversation(llmMessages);
293
294
 
294
295
  // Build prompt
@@ -329,7 +330,7 @@ export async function generateBranchSummary(
329
330
 
330
331
  // Compute file lists and append to summary
331
332
  const { readFiles, modifiedFiles } = computeFileLists(fileOps);
332
- summary = upsertFileOperations(summary, readFiles, modifiedFiles);
333
+ summary = upsertFileOperations(summary, readFiles, modifiedFiles, fileOps.read);
333
334
 
334
335
  return {
335
336
  summary: summary || "No summary generated",
@@ -12,16 +12,18 @@ import {
12
12
  type Message,
13
13
  type MessageAttribution,
14
14
  type Model,
15
+ type Tool,
15
16
  type Usage,
16
17
  } from "@oh-my-pi/pi-ai";
17
18
  import { clampThinkingLevelForModel } from "@oh-my-pi/pi-catalog/model-thinking";
18
19
  import { countTokens } from "@oh-my-pi/pi-natives";
19
20
  import { logger, prompt } from "@oh-my-pi/pi-utils";
21
+ import { SNAPCOMPACT_FRAME_TOKEN_ESTIMATE } from "@oh-my-pi/snapcompact";
20
22
  import { type AgentTelemetry, instrumentedCompleteSimple } from "../telemetry";
21
23
  import { ThinkingLevel } from "../thinking";
22
- import type { AgentMessage, AgentTool } from "../types";
24
+ import type { AgentMessage } from "../types";
23
25
  import type { CompactionEntry, SessionEntry } from "./entries";
24
- import { type ConvertToLlm, convertToLlm, createBranchSummaryMessage, createCustomMessage } from "./messages";
26
+ import { type ConvertToLlm, createBranchSummaryMessage, createCustomMessage, defaultConvertToLlm } from "./messages";
25
27
  import {
26
28
  buildOpenAiNativeHistory,
27
29
  getPreservedOpenAiRemoteCompactionData,
@@ -44,6 +46,7 @@ import {
44
46
  type FileOperations,
45
47
  SUMMARIZATION_SYSTEM_PROMPT,
46
48
  serializeConversation,
49
+ stripReadSelector,
47
50
  upsertFileOperations,
48
51
  } from "./utils";
49
52
 
@@ -73,7 +76,7 @@ function extractFileOperations(
73
76
  if (!prevCompaction.fromExtension && prevCompaction.details) {
74
77
  const details = prevCompaction.details as CompactionDetails;
75
78
  if (Array.isArray(details.readFiles)) {
76
- for (const f of details.readFiles) fileOps.read.add(f);
79
+ for (const f of details.readFiles) fileOps.read.add(stripReadSelector(f));
77
80
  }
78
81
  if (Array.isArray(details.modifiedFiles)) {
79
82
  for (const f of details.modifiedFiles) fileOps.edited.add(f);
@@ -136,7 +139,7 @@ export interface CompactionResult<T = unknown> {
136
139
 
137
140
  export interface CompactionSettings {
138
141
  enabled: boolean;
139
- strategy?: "context-full" | "handoff" | "shake" | "off";
142
+ strategy?: "context-full" | "handoff" | "shake" | "snapcompact" | "off";
140
143
  thresholdPercent?: number;
141
144
  thresholdTokens?: number;
142
145
  reserveTokens: number;
@@ -284,9 +287,19 @@ export function estimateTokens(message: AgentMessage): number {
284
287
  fragments.push(block.text);
285
288
  } else if (block.type === "thinking") {
286
289
  fragments.push(block.thinking);
290
+ // Providers charge for the opaque signature/reasoning payload that
291
+ // rides alongside the thinking text (OpenAI Responses encrypted
292
+ // reasoning items, Anthropic signed thinking blocks, etc.). Without
293
+ // counting it, this estimator can read ~half of the provider-reported
294
+ // usage on thinking-heavy turns — see #2275 for the resulting
295
+ // compaction-trigger / post-check metric divergence.
296
+ if (block.thinkingSignature) fragments.push(block.thinkingSignature);
287
297
  } else if (block.type === "toolCall") {
288
298
  fragments.push(block.name);
289
299
  fragments.push(JSON.stringify(block.arguments));
300
+ } else if (block.type === "redactedThinking") {
301
+ // Encrypted reasoning blob the provider still bills for on replay.
302
+ fragments.push(block.data);
290
303
  }
291
304
  }
292
305
  break;
@@ -309,6 +322,10 @@ export function estimateTokens(message: AgentMessage): number {
309
322
  case "branchSummary":
310
323
  case "compactionSummary": {
311
324
  fragments.push(message.summary);
325
+ if (message.role === "compactionSummary" && message.images) {
326
+ // Snapcompact frames render at ≥1568px; providers bill the downscaled cap.
327
+ extra += message.images.length * SNAPCOMPACT_FRAME_TOKEN_ESTIMATE;
328
+ }
312
329
  break;
313
330
  }
314
331
  default:
@@ -624,7 +641,7 @@ export async function generateSummary(
624
641
 
625
642
  // Serialize conversation to text so model doesn't try to continue it
626
643
  // Convert to LLM messages first (handles custom app messages when caller provides a transformer).
627
- const llmMessages = (options?.convertToLlm ?? convertToLlm)(currentMessages);
644
+ const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(currentMessages);
628
645
  const conversationText = serializeConversation(llmMessages);
629
646
 
630
647
  // Build the prompt with conversation wrapped in tags
@@ -690,7 +707,7 @@ export interface HandoffOptions {
690
707
  /** Live agent system prompt — passed verbatim so providers hit the cached prefix. */
691
708
  systemPrompt: string[];
692
709
  /** Live agent tool list — same purpose. Forced to `toolChoice: "none"`. */
693
- tools?: AgentTool<any>[];
710
+ tools?: Tool[];
694
711
  customInstructions?: string;
695
712
  convertToLlm?: ConvertToLlm;
696
713
  initiatorOverride?: MessageAttribution;
@@ -723,7 +740,7 @@ export async function generateHandoff(
723
740
  options: HandoffOptions,
724
741
  signal?: AbortSignal,
725
742
  ): Promise<string> {
726
- const llmMessages = (options.convertToLlm ?? convertToLlm)(messages);
743
+ const llmMessages = (options.convertToLlm ?? defaultConvertToLlm)(messages);
727
744
  const requestMessages: Message[] = [
728
745
  ...llmMessages,
729
746
  {
@@ -772,7 +789,7 @@ async function generateShortSummary(
772
789
  options?: SummaryOptions,
773
790
  ): Promise<string> {
774
791
  const maxTokens = Math.min(512, Math.floor(0.2 * reserveTokens));
775
- const llmMessages = (options?.convertToLlm ?? convertToLlm)(recentMessages);
792
+ const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(recentMessages);
776
793
  const conversationText = serializeConversation(llmMessages);
777
794
 
778
795
  let promptText = `<conversation>\n${conversationText}\n</conversation>\n\n`;
@@ -1009,7 +1026,7 @@ export async function compact(
1009
1026
  ? previousRemoteCompaction.replacementHistory
1010
1027
  : undefined;
1011
1028
  const remoteHistory = buildOpenAiNativeHistory(
1012
- (summaryOptions.convertToLlm ?? convertToLlm)(remoteMessages),
1029
+ (summaryOptions.convertToLlm ?? defaultConvertToLlm)(remoteMessages),
1013
1030
  model,
1014
1031
  previousReplacementHistory,
1015
1032
  );
@@ -1097,7 +1114,7 @@ export async function compact(
1097
1114
 
1098
1115
  // Compute file lists and append to summary
1099
1116
  const { readFiles, modifiedFiles } = computeFileLists(fileOps);
1100
- summary = upsertFileOperations(summary, readFiles, modifiedFiles);
1117
+ summary = upsertFileOperations(summary, readFiles, modifiedFiles, fileOps.read);
1101
1118
 
1102
1119
  if (!firstKeptEntryId) {
1103
1120
  throw new Error("First kept entry has no ID - session may need migration");
@@ -1126,7 +1143,7 @@ async function generateTurnPrefixSummary(
1126
1143
  ): Promise<string> {
1127
1144
  const maxTokens = Math.floor(0.5 * reserveTokens); // Smaller budget for turn prefix
1128
1145
 
1129
- const llmMessages = (options?.convertToLlm ?? convertToLlm)(messages);
1146
+ const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(messages);
1130
1147
  const conversationText = serializeConversation(llmMessages);
1131
1148
  const promptText = `<conversation>\n${conversationText}\n</conversation>\n\n${TURN_PREFIX_SUMMARIZATION_PROMPT}`;
1132
1149
  const summarizationMessages = [
@@ -51,6 +51,8 @@ export interface CompactionSummaryMessage {
51
51
  shortSummary?: string;
52
52
  tokensBefore: number;
53
53
  providerPayload?: ProviderPayload;
54
+ /** Snapcompact frames archived by this compaction; appended as image blocks after the summary text. */
55
+ images?: ImageContent[];
54
56
  timestamp: number;
55
57
  }
56
58
 
@@ -98,6 +100,7 @@ export function createCompactionSummaryMessage(
98
100
  timestamp: string,
99
101
  shortSummary?: string,
100
102
  providerPayload?: ProviderPayload,
103
+ images?: ImageContent[],
101
104
  ): CompactionSummaryMessage {
102
105
  return {
103
106
  role: "compactionSummary",
@@ -105,6 +108,7 @@ export function createCompactionSummaryMessage(
105
108
  shortSummary,
106
109
  tokensBefore,
107
110
  providerPayload,
111
+ images: images && images.length > 0 ? images : undefined,
108
112
  timestamp: new Date(timestamp).getTime(),
109
113
  };
110
114
  }
@@ -137,6 +141,79 @@ function isCoreCompactionMessage(message: AgentMessage): message is AgentMessage
137
141
  );
138
142
  }
139
143
 
144
+ /**
145
+ * Transform a single core-domain agent message to its LLM form; `undefined`
146
+ * drops it from the provider request.
147
+ *
148
+ * Single source of truth for the core roles (user/developer/assistant/
149
+ * toolResult) and the compaction messages owned by this package. Embedders
150
+ * with their own app messages (e.g. the coding agent) handle their custom
151
+ * roles and delegate every core role here — duplicating these cases is how
152
+ * snapcompact frames once silently fell off the provider request.
153
+ */
154
+ export function convertMessageToLlm(message: AgentMessage): Message | undefined {
155
+ if (isCoreCompactionMessage(message)) {
156
+ switch (message.role) {
157
+ case "custom":
158
+ case "hookMessage": {
159
+ const content =
160
+ typeof message.content === "string"
161
+ ? [{ type: "text" as const, text: message.content }]
162
+ : message.content;
163
+ return {
164
+ role: "developer",
165
+ content,
166
+ attribution: message.attribution,
167
+ timestamp: message.timestamp,
168
+ };
169
+ }
170
+ case "branchSummary":
171
+ return {
172
+ role: "user",
173
+ content: [
174
+ {
175
+ type: "text" as const,
176
+ text: renderBranchSummaryContext(message.summary),
177
+ },
178
+ ],
179
+ attribution: "agent",
180
+ timestamp: message.timestamp,
181
+ };
182
+ case "compactionSummary":
183
+ return {
184
+ role: "user",
185
+ content: [
186
+ {
187
+ type: "text" as const,
188
+ text: renderCompactionSummaryContext(message.summary),
189
+ },
190
+ ...(message.images ?? []),
191
+ ],
192
+ attribution: "agent",
193
+ providerPayload: message.providerPayload,
194
+ timestamp: message.timestamp,
195
+ };
196
+ }
197
+ }
198
+
199
+ switch (message.role) {
200
+ case "user":
201
+ return { ...message, attribution: message.attribution ?? "user" };
202
+ case "developer":
203
+ return { ...message, attribution: message.attribution ?? "agent" };
204
+ case "assistant":
205
+ return message as AssistantMessage;
206
+ case "toolResult":
207
+ return {
208
+ ...message,
209
+ content: getPrunedToolResultContent(message as ToolResultMessage),
210
+ attribution: message.attribution ?? "agent",
211
+ };
212
+ default:
213
+ return undefined;
214
+ }
215
+ }
216
+
140
217
  /**
141
218
  * Default compaction-domain transformer.
142
219
  *
@@ -145,68 +222,5 @@ function isCoreCompactionMessage(message: AgentMessage): message is AgentMessage
145
222
  * core LLM roles and the compaction messages owned by this package.
146
223
  */
147
224
  export function defaultConvertToLlm(messages: AgentMessage[]): Message[] {
148
- return messages
149
- .map((message): Message | undefined => {
150
- if (isCoreCompactionMessage(message)) {
151
- switch (message.role) {
152
- case "custom":
153
- case "hookMessage": {
154
- const content =
155
- typeof message.content === "string"
156
- ? [{ type: "text" as const, text: message.content }]
157
- : message.content;
158
- return {
159
- role: "developer",
160
- content,
161
- attribution: message.attribution,
162
- timestamp: message.timestamp,
163
- };
164
- }
165
- case "branchSummary":
166
- return {
167
- role: "user",
168
- content: [
169
- {
170
- type: "text" as const,
171
- text: renderBranchSummaryContext(message.summary),
172
- },
173
- ],
174
- attribution: "agent",
175
- timestamp: message.timestamp,
176
- };
177
- case "compactionSummary":
178
- return {
179
- role: "user",
180
- content: [
181
- {
182
- type: "text" as const,
183
- text: renderCompactionSummaryContext(message.summary),
184
- },
185
- ],
186
- attribution: "agent",
187
- providerPayload: message.providerPayload,
188
- timestamp: message.timestamp,
189
- };
190
- }
191
- }
192
-
193
- switch (message.role) {
194
- case "user":
195
- return { ...message, attribution: message.attribution ?? "user" };
196
- case "developer":
197
- return { ...message, attribution: message.attribution ?? "agent" };
198
- case "assistant":
199
- return message as AssistantMessage;
200
- case "toolResult":
201
- return {
202
- ...message,
203
- content: getPrunedToolResultContent(message as ToolResultMessage),
204
- attribution: message.attribution ?? "agent",
205
- };
206
- default:
207
- return undefined;
208
- }
209
- })
210
- .filter(message => message !== undefined);
225
+ return messages.map(convertMessageToLlm).filter(message => message !== undefined);
211
226
  }
212
- export const convertToLlm = defaultConvertToLlm;
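The delegation pattern described in the `convertMessageToLlm` doc comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: `LlmMessage`, `CoreMessage`, `AppMessage`, the `bashExecution` role, and `convertCoreMessage` are hypothetical stand-ins for the real types and for `convertMessageToLlm()`. The point is that the embedder handles only its own role and forwards every core role to the single-message converter.

```typescript
// Hypothetical, simplified stand-ins for the real Message / AgentMessage types.
type LlmMessage = { role: "user" | "developer" | "assistant"; text: string };
type CoreMessage =
  | { role: "user"; text: string }
  | { role: "assistant"; text: string }
  | { role: "compactionSummary"; summary: string };
// Hypothetical app-specific role owned by the embedder.
type AppMessage = CoreMessage | { role: "bashExecution"; command: string; output: string };

// Stand-in for convertMessageToLlm(): one message in, one LLM message (or undefined) out.
function convertCoreMessage(message: CoreMessage): LlmMessage | undefined {
  switch (message.role) {
    case "user":
    case "assistant":
      return { role: message.role, text: message.text };
    case "compactionSummary":
      return { role: "user", text: `Summary of earlier conversation:\n${message.summary}` };
    default:
      return undefined;
  }
}

// Embedder-side converter: handle the app role, delegate everything else.
function convertAppMessages(messages: AppMessage[]): LlmMessage[] {
  const out: LlmMessage[] = [];
  for (const message of messages) {
    const converted =
      message.role === "bashExecution"
        ? { role: "user" as const, text: `$ ${message.command}\n${message.output}` }
        : convertCoreMessage(message);
    if (converted !== undefined) out.push(converted);
  }
  return out;
}
```

Because the core cases live in one function, a later change to a core role (such as appending image blocks to `compactionSummary`) reaches the embedder without any edit on its side.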
@@ -1,10 +1,5 @@
1
- {{#if readFiles.length}}
2
- {{#xml "read-files"}}
3
- {{join readFiles "\n"}}
4
- {{/xml}}
5
- {{/if}}
6
- {{#if modifiedFiles.length}}
7
- {{#xml "modified-files"}}
8
- {{join modifiedFiles "\n"}}
1
+ {{#if files}}
2
+ {{#xml "files"}}
3
+ {{files}}
9
4
  {{/xml}}
10
5
  {{/if}}
@@ -3,7 +3,7 @@
3
3
  */
4
4
 
5
5
  import type { ToolResultMessage } from "@oh-my-pi/pi-ai";
6
- import type { AgentMessage } from "../types";
6
+ import type { AgentMessage, AgentToolCall } from "../types";
7
7
  import { estimateTokens } from "./compaction";
8
8
  import type { SessionEntry, SessionMessageEntry } from "./entries";
9
9
  import {
@@ -12,6 +12,7 @@ import {
12
12
  isSkillReadToolResult,
13
13
  type ProtectedToolMatcher,
14
14
  } from "./tool-protection";
15
+ import { splitReadSelector } from "./utils";
15
16
 
16
17
  export interface PruneConfig {
17
18
  /** Keep the most recent tool output tokens intact. */
@@ -20,6 +21,13 @@ export interface PruneConfig {
20
21
  minimumSavings: number;
21
22
  /** Tool-result protection matchers. String entries protect every result from that tool; predicates may inspect the paired tool call. */
22
23
  protectedTools: ProtectedToolMatcher[];
24
+ /**
25
+ * Optional supersede key function (see {@link SupersedePruneConfig.supersedeKey}).
26
+ * When provided, superseded tool results are pruned first — even inside the
27
+ * `protectTokens` window — before age-based victims. Absent, behavior is
28
+ * unchanged.
29
+ */
30
+ supersedeKey?: SupersedeKeyFn;
23
31
  }
24
32
 
25
33
  export const DEFAULT_PRUNE_CONFIG: PruneConfig = {
@@ -33,6 +41,34 @@ export interface PruneResult {
33
41
  tokensSaved: number;
34
42
  }
35
43
 
44
+ /** Exact placeholder written over a superseded tool result. */
45
+ export const SUPERSEDED_NOTICE = "[Superseded by a newer read of this file]";
46
+
47
+ /**
48
+ * Maps a tool call to a supersede key. Results sharing a key form a group in
49
+ * which every result except the newest is a supersede candidate. A key `K`
50
+ * additionally supersedes keys with prefix `K + "\u0000"` (selector-free read
51
+ * supersedes selector-carrying reads of the same base path). Return
52
+ * `undefined` to exempt a call from supersede grouping.
53
+ */
54
+ export type SupersedeKeyFn = (toolName: string, args: Record<string, unknown>) => string | undefined;
55
+
56
+ export interface SupersedePruneConfig {
57
+ /** Supersede key function; results sharing a key supersede older ones. */
58
+ supersedeKey: SupersedeKeyFn;
59
+ /** Prune a candidate now when all messages after it total at most this many estimated tokens. Default 8 000. */
60
+ suffixTokenLimit?: number;
61
+ /** Prune all candidates when the last message is at least this old (prompt cache is cold anyway). Default 30 min. */
62
+ idleFlushMs?: number;
63
+ /** Clock override for tests. */
64
+ now?: number;
65
+ /** Tool-result protection matchers (same contract as {@link PruneConfig.protectedTools}). */
66
+ protectedTools: ProtectedToolMatcher[];
67
+ }
68
+
69
+ const DEFAULT_SUFFIX_TOKEN_LIMIT = 8_000;
70
+ const DEFAULT_IDLE_FLUSH_MS = 30 * 60_000;
71
+
36
72
  function createPrunedNotice(tokens: number): string {
37
73
  return `[Output truncated - ${tokens} tokens]`;
38
74
  }
@@ -44,18 +80,121 @@ function getToolResultMessage(entry: SessionEntry): ToolResultMessage | undefine
44
80
  return message as ToolResultMessage;
45
81
  }
46
82
 
47
- function estimatePrunedSavings(tokens: number): number {
48
- const noticeTokens = Math.ceil(createPrunedNotice(tokens).length / 4);
83
+ function estimatePrunedSavings(tokens: number, notice: string): number {
84
+ const noticeTokens = Math.ceil(notice.length / 4);
49
85
  return Math.max(0, tokens - noticeTokens);
50
86
  }
51
87
 
88
+ interface SupersedeCandidate {
89
+ entry: SessionMessageEntry;
90
+ message: ToolResultMessage;
91
+ /** Index of the entry within the `entries` array. */
92
+ index: number;
93
+ tokens: number;
94
+ }
95
+
96
+ /**
97
+ * Collect superseded tool results: for every unpruned, unprotected tool result
98
+ * whose paired call resolves a supersede key, a LATER result with the same key
99
+ * — or with a key that is the `"\u0000"`-prefix parent of this one — marks it
100
+ * superseded. Returned in message order.
101
+ */
102
+ function collectSupersededResults(
103
+ entries: readonly SessionEntry[],
104
+ toolCallsById: ReadonlyMap<string, AgentToolCall>,
105
+ supersedeKey: SupersedeKeyFn,
106
+ protectedTools: readonly ProtectedToolMatcher[],
107
+ ): SupersedeCandidate[] {
108
+ const candidates: SupersedeCandidate[] = [];
109
+ const seenKeys = new Set<string>();
110
+ for (let i = entries.length - 1; i >= 0; i--) {
111
+ const entry = entries[i];
112
+ const message = getToolResultMessage(entry);
113
+ if (!message || message.prunedAt !== undefined) continue;
114
+ const toolCall = toolCallsById.get(message.toolCallId);
115
+ if (!toolCall) continue;
116
+ if (isProtectedToolResult(message, toolCall, protectedTools)) continue;
117
+ const key = supersedeKey(toolCall.name, toolCall.arguments as Record<string, unknown>);
118
+ if (key === undefined) continue;
119
+ const separator = key.indexOf("\u0000");
120
+ const superseded = seenKeys.has(key) || (separator >= 0 && seenKeys.has(key.slice(0, separator)));
121
+ seenKeys.add(key);
122
+ if (!superseded) continue;
123
+ candidates.push({
124
+ entry: entry as SessionMessageEntry,
125
+ message,
126
+ index: i,
127
+ tokens: estimateTokens(message as AgentMessage),
128
+ });
129
+ }
130
+ return candidates.reverse();
131
+ }
132
+
133
+ /**
134
+ * Prune superseded tool results (e.g. stale `read` outputs replaced by a newer
135
+ * read of the same file). Cheap, incremental, and prompt-cache-aware: a
136
+ * candidate is pruned now only when the suffix after it is small (tail case —
137
+ * the read→edit→read loop) or when the context has been idle long enough that
138
+ * the provider cache is cold anyway (then ALL candidates flush).
139
+ */
140
+ export function pruneSupersededToolResults(entries: SessionEntry[], config: SupersedePruneConfig): PruneResult {
141
+ const toolCallsById = collectToolCallsById(entries);
142
+ const candidates = collectSupersededResults(entries, toolCallsById, config.supersedeKey, config.protectedTools);
143
+ if (candidates.length === 0) return { prunedCount: 0, tokensSaved: 0 };
144
+
145
+ const now = config.now ?? Date.now();
146
+ let lastMessageTimestamp: number | undefined;
147
+ for (let i = entries.length - 1; i >= 0; i--) {
148
+ const entry = entries[i];
149
+ if (entry.type !== "message") continue;
150
+ const timestamp = (entry.message as AgentMessage).timestamp;
151
+ if (typeof timestamp === "number") lastMessageTimestamp = timestamp;
152
+ break;
153
+ }
154
+ const idle =
155
+ lastMessageTimestamp !== undefined && now - lastMessageTimestamp >= (config.idleFlushMs ?? DEFAULT_IDLE_FLUSH_MS);
156
+
157
+ let toPrune: SupersedeCandidate[];
158
+ if (idle) {
159
+ toPrune = candidates;
160
+ } else {
161
+ const suffixTokenLimit = config.suffixTokenLimit ?? DEFAULT_SUFFIX_TOKEN_LIMIT;
162
+ // suffixTokens[i] = estimated tokens of all messages strictly after entry i.
163
+ const suffixTokens = new Array<number>(entries.length);
164
+ let accumulated = 0;
165
+ for (let i = entries.length - 1; i >= 0; i--) {
166
+ suffixTokens[i] = accumulated;
167
+ const entry = entries[i];
168
+ if (entry.type === "message") accumulated += estimateTokens(entry.message as AgentMessage);
169
+ }
170
+ toPrune = candidates.filter(candidate => suffixTokens[candidate.index] <= suffixTokenLimit);
171
+ }
172
+ if (toPrune.length === 0) return { prunedCount: 0, tokensSaved: 0 };
173
+
174
+ const prunedAt = Date.now();
175
+ let tokensSaved = 0;
176
+ for (const candidate of toPrune) {
177
+ candidate.message.content = [{ type: "text", text: SUPERSEDED_NOTICE }];
178
+ candidate.message.prunedAt = prunedAt;
179
+ tokensSaved += estimatePrunedSavings(candidate.tokens, SUPERSEDED_NOTICE);
180
+ }
181
+ return { prunedCount: toPrune.length, tokensSaved };
182
+ }
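The cache-aware gating in `pruneSupersededToolResults` can be sketched in isolation. The sketch below assumes per-entry token counts are already known and that the caller supplies the idle time directly; `selectPrunable` and its option names are hypothetical. Only the two rules come from the diff: flush every candidate when idle, otherwise keep only candidates whose suffix is small.

```typescript
// Hypothetical, simplified inputs: token estimate per entry, oldest first.
interface GateOptions {
  suffixTokenLimit: number; // the diff defaults this to 8_000
  idleFlushMs: number; // the diff defaults this to 30 minutes
  idleMs: number; // time since the last message (assumed to be precomputed)
}

function selectPrunable(entryTokens: number[], candidateIndexes: number[], opts: GateOptions): number[] {
  // Idle long enough: the provider cache is cold anyway, so flush all candidates.
  if (opts.idleMs >= opts.idleFlushMs) return candidateIndexes;

  // suffixTokens[i] = estimated tokens of all entries strictly after entry i.
  const suffixTokens = new Array<number>(entryTokens.length).fill(0);
  let accumulated = 0;
  for (let i = entryTokens.length - 1; i >= 0; i--) {
    suffixTokens[i] = accumulated;
    accumulated += entryTokens[i] ?? 0;
  }
  // Warm cache: only prune candidates near the tail, where little cached prefix is invalidated.
  return candidateIndexes.filter(i => (suffixTokens[i] ?? Infinity) <= opts.suffixTokenLimit);
}
```

With entry sizes `[5000, 4000, 3000, 2000]`, the suffixes are `[9000, 5000, 2000, 0]`, so an 8 000-token limit leaves the oldest candidate untouched until the context goes idle.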
183
+
52
184
  export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig = DEFAULT_PRUNE_CONFIG): PruneResult {
53
185
  let accumulatedTokens = 0;
54
186
  let tokensSaved = 0;
55
187
  let prunedCount = 0;
56
188
 
57
- const candidates: Array<{ entry: SessionMessageEntry; tokens: number }> = [];
189
+ const candidates: Array<{ entry: SessionMessageEntry; tokens: number; superseded: boolean }> = [];
58
190
  const toolCallsById = collectToolCallsById(entries);
191
+ const supersededMessages = config.supersedeKey
192
+ ? new Set(
193
+ collectSupersededResults(entries, toolCallsById, config.supersedeKey, config.protectedTools).map(
194
+ candidate => candidate.message,
195
+ ),
196
+ )
197
+ : undefined;
59
198
 
60
199
  for (let i = entries.length - 1; i >= 0; i--) {
61
200
  const entry = entries[i];
@@ -70,17 +209,23 @@ export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig =
70
209
  continue;
71
210
  }
72
211
 
73
- if (accumulatedTokens < config.protectTokens || isProtected) {
212
+ // Superseded results are pruned first: they bypass the protect window
213
+ // (a stale copy of re-read content is dead weight at any age).
214
+ const superseded = supersededMessages?.has(message) ?? false;
215
+ if (!superseded && (accumulatedTokens < config.protectTokens || isProtected)) {
74
216
  accumulatedTokens += tokens;
75
217
  continue;
76
218
  }
77
219
 
78
- candidates.push({ entry: entry as SessionMessageEntry, tokens });
220
+ candidates.push({ entry: entry as SessionMessageEntry, tokens, superseded });
79
221
  accumulatedTokens += tokens;
80
222
  }
81
223
 
82
224
  for (const candidate of candidates) {
83
- tokensSaved += estimatePrunedSavings(candidate.tokens);
225
+ tokensSaved += estimatePrunedSavings(
226
+ candidate.tokens,
227
+ candidate.superseded ? SUPERSEDED_NOTICE : createPrunedNotice(candidate.tokens),
228
+ );
84
229
  }
85
230
 
86
231
  if (tokensSaved < config.minimumSavings || candidates.length === 0) {
@@ -90,10 +235,31 @@ export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig =
90
235
  const prunedAt = Date.now();
91
236
  for (const candidate of candidates) {
92
237
  const message = candidate.entry.message as ToolResultMessage;
93
- message.content = [{ type: "text", text: createPrunedNotice(candidate.tokens) }];
238
+ message.content = [
239
+ { type: "text", text: candidate.superseded ? SUPERSEDED_NOTICE : createPrunedNotice(candidate.tokens) },
240
+ ];
94
241
  message.prunedAt = prunedAt;
95
242
  prunedCount++;
96
243
  }
97
244
 
98
245
  return { prunedCount, tokensSaved };
99
246
  }
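How superseded results bypass the `protectTokens` window in `pruneToolOutputs` can be sketched on simplified data. `ToolResult` and `selectVictims` are hypothetical names, and the sketch leaves out the `protectedTools` matchers and the `minimumSavings` check that the real function applies.

```typescript
// Hypothetical, simplified tool-result record.
interface ToolResult {
  id: string;
  tokens: number;
  superseded: boolean;
}

// Walk newest to oldest, as the real loop does. Results inside the protect
// window are kept unless they are superseded.
function selectVictims(resultsOldestFirst: ToolResult[], protectTokens: number): string[] {
  const victims: string[] = [];
  let accumulated = 0;
  for (let i = resultsOldestFirst.length - 1; i >= 0; i--) {
    const result = resultsOldestFirst[i];
    if (result === undefined) continue;
    const insideWindow = accumulated < protectTokens;
    accumulated += result.tokens; // counted whether kept or pruned, matching the diff
    if (!result.superseded && insideWindow) continue;
    victims.push(result.id);
  }
  return victims;
}
```

In this sketch a stale read that sits well inside the protected tail is still selected, while fresh results of the same age are kept.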
247
+
248
+ /**
249
+ * Supersede key for the `read` tool: the file path with the trailing line/raw
250
+ * selector stripped (the read tool's own splitter grammar via
251
+ * {@link splitReadSelector}, e.g. `src/foo.ts:50-200`, `:2-4:raw`).
252
+ * Internal/URL-scheme paths (`skill://…`, `https://…`) are exempt.
253
+ * Selector-free reads key on the bare path; selector-carrying reads key on
254
+ * `path + "\u0000" + selector`, so two reads collide only when the newer is
255
+ * selector-free or the selectors are identical (the pass's prefix rule lets a
256
+ * bare-path read supersede selector-carrying reads of the same file).
257
+ */
258
+ export function readToolSupersedeKey(toolName: string, args: Record<string, unknown>): string | undefined {
259
+ if (toolName !== "read") return undefined;
260
+ const path = args.path;
261
+ if (typeof path !== "string" || path.length === 0) return undefined;
262
+ if (path.includes("://")) return undefined;
263
+ const { path: base, sel } = splitReadSelector(path);
264
+ return sel === undefined ? base : `${base}\u0000${sel}`;
265
+ }
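The key scheme and the backward scan in `collectSupersededResults` can be illustrated with a reduced example. It is a sketch under assumptions: `simpleReadKey` recognizes only `:N` and `:N-M` selectors rather than the full read-tool grammar, and it takes a list of read paths instead of session entries. The `"\u0000"` separator and the prefix rule follow the diff.

```typescript
const SEPARATOR = "\u0000";

// Simplified key function (assumption: only `:N` / `:N-M` selectors are recognized).
function simpleReadKey(path: string): string | undefined {
  if (path.includes("://")) return undefined; // URL-scheme paths are exempt
  const colon = path.lastIndexOf(":");
  const selector = colon > 0 ? path.slice(colon + 1) : "";
  if (/^\d+(-\d+)?$/.test(selector)) return `${path.slice(0, colon)}${SEPARATOR}${selector}`;
  return path;
}

// Returns the indexes of reads that a later read makes stale, in message order.
function supersededIndexes(pathsOldestFirst: string[]): number[] {
  const seenKeys = new Set<string>();
  const superseded: number[] = [];
  for (let i = pathsOldestFirst.length - 1; i >= 0; i--) {
    const path = pathsOldestFirst[i];
    if (path === undefined) continue;
    const key = simpleReadKey(path);
    if (key === undefined) continue;
    const separator = key.indexOf(SEPARATOR);
    // Stale when a later read has the same key, or a later bare-path read covers this range read.
    const stale = seenKeys.has(key) || (separator >= 0 && seenKeys.has(key.slice(0, separator)));
    seenKeys.add(key);
    if (stale) superseded.push(i);
  }
  return superseded.reverse();
}
```

The rule is asymmetric: a later whole-file read supersedes an earlier range read, but a later range read leaves an earlier whole-file read in place.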
@@ -3,7 +3,7 @@
3
3
  */
4
4
 
5
5
  import type { Message } from "@oh-my-pi/pi-ai";
6
- import { prompt } from "@oh-my-pi/pi-utils";
6
+ import { formatGroupedPaths, prompt } from "@oh-my-pi/pi-utils";
7
7
  import type { AgentMessage } from "../types";
8
8
  import fileOperationsTemplate from "./prompts/file-operations.md" with { type: "text" };
9
9
  import summarizationSystemPrompt from "./prompts/summarization-system.md" with { type: "text" };
@@ -26,6 +26,55 @@ export function createFileOps(): FileOperations {
26
26
  };
27
27
  }
28
28
 
29
+ // Read-tool selector grammar, mirrored from the conservative filesystem splitter in
30
+ // packages/coding-agent/src/tools/path-utils.ts (splitPathAndSel). Keep in sync.
31
+ // A trailing `:chunk` is a selector only when it is a line-range list
32
+ // (`50`, `50-200`, `50+10`, `5-16,960-973`, `..` alias), `raw`, or `conflicts` —
33
+ // alone or as a `range:raw` / `raw:range` compound.
34
+ const RANGE_CHUNK_SRC = String.raw`L?\d+(?:(?:[-+]|\.\.)L?\d+|-|\.\.)?`;
35
+ const RANGE_LIST_SRC = `${RANGE_CHUNK_SRC}(?:,${RANGE_CHUNK_SRC})*`;
36
+ const READ_SELECTOR_RE = new RegExp(`^(?:${RANGE_LIST_SRC}|raw|conflicts)$`, "i");
37
+ const READ_RANGE_ONLY_RE = new RegExp(`^${RANGE_LIST_SRC}$`, "i");
38
+ const READ_RAW_ONLY_RE = /^raw$/i;
39
+
40
+ /**
41
+ * Split a read-tool path into its base path and trailing selector, mirroring the
42
+ * read tool's own splitter. Single source of the grammar in this package: the
43
+ * file-operations list strips selectors via {@link stripReadSelector}, and the
44
+ * supersede-prune pass keys on both parts via `readToolSupersedeKey`.
45
+ */
46
+ export function splitReadSelector(path: string): { path: string; sel?: string } {
47
+ const colon = path.lastIndexOf(":");
48
+ if (colon <= 0) return { path };
49
+ const candidate = path.slice(colon + 1);
50
+ if (!READ_SELECTOR_RE.test(candidate)) return { path };
51
+ let base = path.slice(0, colon);
52
+ let sel = candidate;
53
+ // Compound trailing selector: `path:1-50:raw` or `path:raw:1-50`.
54
+ const inner = base.lastIndexOf(":");
55
+ if (inner > 0) {
56
+ const innerCandidate = base.slice(inner + 1);
57
+ const innerIsRaw = READ_RAW_ONLY_RE.test(innerCandidate);
58
+ const outerIsRaw = READ_RAW_ONLY_RE.test(candidate);
59
+ const innerIsRange = READ_RANGE_ONLY_RE.test(innerCandidate);
60
+ const outerIsRange = READ_RANGE_ONLY_RE.test(candidate);
61
+ if ((innerIsRaw && outerIsRange) || (innerIsRange && outerIsRaw)) {
62
+ sel = `${innerCandidate}:${candidate}`;
63
+ base = base.slice(0, inner);
64
+ }
65
+ }
66
+ return { path: base, sel };
67
+ }
68
+
69
+ /**
70
+ * Strip a trailing read-tool selector (`:50-200`, `:raw`, `:1-50:raw`, `:conflicts`, …)
71
+ * so the same file read with different line ranges dedupes to one `<files>` entry
72
+ * and matches its write/edit path when computing Read/Write/RW markers.
73
+ */
74
+ export function stripReadSelector(path: string): string {
75
+ return splitReadSelector(path).path;
76
+ }
77
+
29
78
  /**
30
79
  * Extract file operations from tool calls in an assistant message.
31
80
  */
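To see which trailing chunks the selector grammar accepts, the regex sources from the hunk above can be exercised directly. The three constants are copied from the diff; `hasTrailingSelector` and the sample paths are hypothetical and check only the last `:chunk`, so they do not cover the compound `range:raw` handling in `splitReadSelector`.

```typescript
// Regex sources copied from the diff above.
const RANGE_CHUNK_SRC = String.raw`L?\d+(?:(?:[-+]|\.\.)L?\d+|-|\.\.)?`;
const RANGE_LIST_SRC = `${RANGE_CHUNK_SRC}(?:,${RANGE_CHUNK_SRC})*`;
const READ_SELECTOR_RE = new RegExp(`^(?:${RANGE_LIST_SRC}|raw|conflicts)$`, "i");

// Hypothetical helper: is the text after the last ":" a read selector?
function hasTrailingSelector(path: string): boolean {
  const colon = path.lastIndexOf(":");
  return colon > 0 && READ_SELECTOR_RE.test(path.slice(colon + 1));
}

const samples = [
  "src/foo.ts:50-200", // line range
  "src/foo.ts:5-16,960-973", // range list
  "src/foo.ts:raw", // raw selector
  "src/foo.ts:conflicts", // conflicts selector
  "src/foo.ts", // no colon at all
  "notes:todo", // colon, but the chunk is not a selector
];
const withSelector = samples.filter(hasTrailingSelector);
```

Here `withSelector` holds the first four samples. A path such as `notes:todo` is left whole, so a colon that is part of a file name is not mistaken for a selector.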
@@ -46,7 +95,7 @@ export function extractFileOpsFromMessage(message: AgentMessage, fileOps: FileOp
46
95
 
47
96
  switch (block.name) {
48
97
  case "read":
49
- fileOps.read.add(path);
98
+ fileOps.read.add(stripReadSelector(path));
50
99
  break;
51
100
  case "write":
52
101
  fileOps.written.add(path);
@@ -70,32 +119,48 @@ export function computeFileLists(fileOps: FileOperations): { readFiles: string[]
70
119
  }
71
120
 
72
121
  /**
73
- * Format file operations as XML tags for summary.
122
+ * Format file operations as one `<files>` tag: a grouped, prefix-folded
123
+ * directory tree (find-tool shape — `# dir/` headers, bare basenames) with a
124
+ * ` (Read)` / ` (Write)` / ` (RW)` marker per file instead of separate
125
+ * read/modified lists. `readSet` is the cumulative read set (`fileOps.read`),
126
+ * used to tell modified files that were also read (RW) from blind writes.
74
127
  */
75
128
  const FILE_OPERATION_SUMMARY_LIMIT = 20;
76
129
 
77
- function truncateFileList(files: string[]): string[] {
78
- if (files.length <= FILE_OPERATION_SUMMARY_LIMIT) return files;
79
- const omitted = files.length - FILE_OPERATION_SUMMARY_LIMIT;
80
- return [...files.slice(0, FILE_OPERATION_SUMMARY_LIMIT), `… (${omitted} more files omitted)`];
81
- }
82
-
83
130
  function stripFileOperationTags(summary: string): string {
84
- const withoutReadFiles = summary.replace(/<read-files>[\s\S]*?<\/read-files>\s*/g, "");
85
- const withoutModifiedFiles = withoutReadFiles.replace(/<modified-files>[\s\S]*?<\/modified-files>\s*/g, "");
86
- return withoutModifiedFiles.trimEnd();
131
+ // Legacy <read-files>/<modified-files> tags are still stripped so summaries
132
+ // written before the combined <files> tag self-heal on the next compaction.
133
+ return summary
134
+ .replace(/<files>[\s\S]*?<\/files>\s*/g, "")
135
+ .replace(/<read-files>[\s\S]*?<\/read-files>\s*/g, "")
136
+ .replace(/<modified-files>[\s\S]*?<\/modified-files>\s*/g, "")
137
+ .trimEnd();
87
138
  }
88
- export function formatFileOperations(readFiles: string[], modifiedFiles: string[]): string {
139
+ export function formatFileOperations(
140
+ readFiles: string[],
141
+ modifiedFiles: string[],
142
+ readSet?: ReadonlySet<string>,
143
+ ): string {
89
144
  if (readFiles.length === 0 && modifiedFiles.length === 0) return "";
90
- return prompt.render(fileOperationsTemplate, {
91
- readFiles: truncateFileList(readFiles),
92
- modifiedFiles: truncateFileList(modifiedFiles),
93
- });
145
+ const mode = new Map<string, "Read" | "Write" | "RW">();
146
+ for (const file of readFiles) mode.set(file, "Read");
147
+ for (const file of modifiedFiles) mode.set(file, readSet?.has(file) ? "RW" : "Write");
148
+ const all = [...mode.keys()].sort();
149
+ let files = formatGroupedPaths(all.slice(0, FILE_OPERATION_SUMMARY_LIMIT), path => ` (${mode.get(path)})`);
150
+ if (all.length > FILE_OPERATION_SUMMARY_LIMIT) {
151
+ files += `\n… (${all.length - FILE_OPERATION_SUMMARY_LIMIT} more files omitted)`;
152
+ }
153
+ return prompt.render(fileOperationsTemplate, { files });
94
154
  }
95
155
 
96
- export function upsertFileOperations(summary: string, readFiles: string[], modifiedFiles: string[]): string {
156
+ export function upsertFileOperations(
157
+ summary: string,
158
+ readFiles: string[],
159
+ modifiedFiles: string[],
160
+ readSet?: ReadonlySet<string>,
161
+ ): string {
97
162
  const baseSummary = stripFileOperationTags(summary);
98
- const fileOperations = formatFileOperations(readFiles, modifiedFiles);
163
+ const fileOperations = formatFileOperations(readFiles, modifiedFiles, readSet);
99
164
  if (!fileOperations) return baseSummary;
100
165
  if (!baseSummary) return fileOperations;
101
166
  return `${baseSummary}\n\n${fileOperations}`;
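
The marker logic in the new `formatFileOperations` can be tried in isolation. The following is a minimal sketch, not the package's implementation: `formatFlatPaths` is a hypothetical stand-in for `formatGroupedPaths` (whose body is not part of this hunk) and prints one full path per line without directory folding; `classifyFileModes` and `sketchFileList` are illustrative names as well.

```typescript
type FileMode = "Read" | "Write" | "RW";

// Mirrors FILE_OPERATION_SUMMARY_LIMIT from the diff.
const SUMMARY_LIMIT = 20;

// Hypothetical stand-in for formatGroupedPaths: no `# dir/` grouping,
// just each path followed by its suffix.
function formatFlatPaths(paths: string[], suffix: (path: string) => string): string {
	return paths.map(path => `${path}${suffix(path)}`).join("\n");
}

function classifyFileModes(
	readFiles: string[],
	modifiedFiles: string[],
	readSet?: ReadonlySet<string>,
): Map<string, FileMode> {
	const mode = new Map<string, FileMode>();
	for (const file of readFiles) mode.set(file, "Read");
	// A modified file overwrites any "Read" entry. It becomes "RW" only when
	// readSet contains it; otherwise it is treated as a blind write.
	for (const file of modifiedFiles) mode.set(file, readSet?.has(file) ? "RW" : "Write");
	return mode;
}

function sketchFileList(
	readFiles: string[],
	modifiedFiles: string[],
	readSet?: ReadonlySet<string>,
): string {
	const mode = classifyFileModes(readFiles, modifiedFiles, readSet);
	const all = [...mode.keys()].sort();
	let files = formatFlatPaths(all.slice(0, SUMMARY_LIMIT), path => ` (${mode.get(path)})`);
	if (all.length > SUMMARY_LIMIT) {
		files += `\n… (${all.length - SUMMARY_LIMIT} more files omitted)`;
	}
	return files;
}

const listing = sketchFileList(
	["src/a.ts", "src/b.ts"],
	["src/b.ts", "src/c.ts"],
	new Set(["src/a.ts", "src/b.ts"]),
);
console.log(listing);
// src/a.ts (Read)
// src/b.ts (RW)
// src/c.ts (Write)
```

One consequence visible in the hunk: when `readSet` is omitted, a file that appears in both `readFiles` and `modifiedFiles` is marked ` (Write)`, because the RW marker depends only on `readSet`.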
package/src/types.ts CHANGED
@@ -3,6 +3,7 @@ import type {
3
3
  AssistantMessage,
4
4
  AssistantMessageEvent,
5
5
  AssistantMessageEventStream,
6
+ Context,
6
7
  Effort,
7
8
  ImageContent,
8
9
  Message,
@@ -107,6 +108,13 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
107
108
  */
108
109
  transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
109
110
 
111
+ /**
112
+ * Optional transform applied to the final provider context after conversion,
113
+ * normalization, and append-only context handling, but before telemetry capture
114
+ * and provider send.
115
+ */
116
+ transformProviderContext?: (context: Context) => Context;
117
+
110
118
  /**
111
119
  * Resolves an API key dynamically for each LLM call.
112
120
  *
@@ -210,6 +218,15 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
210
218
  */
211
219
  getReasoning?: () => Effort | undefined;
212
220
 
221
+ /**
222
+ * Dynamic reasoning-disable override, resolved per LLM call. When set,
223
+ * its return value overrides the static `disableReasoning` from
224
+ * `SimpleStreamOptions` for that request. Pair with `getReasoning` so
225
+ * mid-run transitions into and out of the explicit `off` state propagate
226
+ * to the next provider call.
227
+ */
228
+ getDisableReasoning?: () => boolean | undefined;
229
+
213
230
  /**
214
231
  * Called after a tool call has been validated and is about to execute.
215
232
  *
@@ -358,6 +375,7 @@ export interface AgentState {
358
375
  systemPrompt: string[];
359
376
  model: Model;
360
377
  thinkingLevel?: Effort;
378
+ disableReasoning?: boolean;
361
379
  tools: AgentTool<any>[];
362
380
  messages: AgentMessage[]; // Can include attachments + custom message types
363
381
  isStreaming: boolean;
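
The two `AgentLoopConfig` hooks added in the `types.ts` hunks above are both resolved per LLM call. The sketch below shows one way a caller might wire them, under stated assumptions: `ProviderContextSketch` and `LoopHooksSketch` are local stand-ins (the real `Context` and `AgentLoopConfig` types are not reproduced here), `prepareRequest` is a hypothetical helper, and falling back to the static `disableReasoning` when the getter returns `undefined` is an assumption that the doc comment does not spell out.

```typescript
// Local stand-in for the provider Context type; the real shape is not shown in the diff.
interface ProviderContextSketch {
	messages: string[];
}

// Local stand-in for the relevant slice of AgentLoopConfig.
interface LoopHooksSketch {
	disableReasoning?: boolean;
	getDisableReasoning?: () => boolean | undefined;
	transformProviderContext?: (context: ProviderContextSketch) => ProviderContextSketch;
}

// Assumption: an undefined result from the getter falls back to the static value.
function resolveDisableReasoning(hooks: LoopHooksSketch): boolean | undefined {
	const dynamic = hooks.getDisableReasoning?.();
	return dynamic ?? hooks.disableReasoning;
}

// Hypothetical helper: applies the optional context transform last, then
// resolves the reasoning flag for this single request.
function prepareRequest(hooks: LoopHooksSketch, context: ProviderContextSketch) {
	const finalContext = hooks.transformProviderContext
		? hooks.transformProviderContext(context)
		: context;
	return { context: finalContext, disableReasoning: resolveDisableReasoning(hooks) };
}

let reasoningOff = false;
const hooks: LoopHooksSketch = {
	disableReasoning: false,
	getDisableReasoning: () => (reasoningOff ? true : undefined),
	// Example transform: keep only the last two messages.
	transformProviderContext: context => ({ ...context, messages: context.messages.slice(-2) }),
};

const first = prepareRequest(hooks, { messages: ["a", "b", "c"] });
reasoningOff = true; // mid-run transition into the explicit off state
const second = prepareRequest(hooks, { messages: ["a", "b", "c"] });
console.log(first.disableReasoning, second.disableReasoning, second.context.messages);
// false true [ 'b', 'c' ]
```

Because the getter is called on every request, flipping `reasoningOff` between calls changes the resolved flag without rebuilding the config, which is the mid-run behavior the doc comment describes.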