@oh-my-pi/pi-agent-core 15.10.11 → 15.11.0

Files changed (17)

package/CHANGELOG.md +38 -1
package/dist/types/agent.d.ts +7 -1
package/dist/types/compaction/compaction.d.ts +4 -4
package/dist/types/compaction/messages.d.ts +14 -2
package/dist/types/compaction/pruning.d.ts +48 -0
package/dist/types/compaction/utils.d.ts +18 -2
package/dist/types/types.d.ts +16 -1
package/package.json +6 -5
package/src/agent-loop.ts +26 -7
package/src/agent.ts +17 -0
package/src/compaction/branch-summarization.ts +5 -4
package/src/compaction/compaction.ts +28 -11
package/src/compaction/messages.ts +78 -64
package/src/compaction/prompts/file-operations.md +3 -8
package/src/compaction/pruning.ts +174 -8
package/src/compaction/utils.ts +84 -19
package/src/types.ts +18 -0

package/CHANGELOG.md CHANGED

@@ -2,6 +2,41 @@
 ## [Unreleased]
+## [15.11.0] - 2026-06-10
+### Breaking Changes
+- Removed `compaction/index.ts` re-export of snapcompact helpers, so snapcompact utilities are no longer available from the agent compaction barrel and should be imported from `@oh-my-pi/snapcompact`
+- Removed the `convertToLlm` alias export from `compaction/messages` — it duplicated `defaultConvertToLlm` under a second name. Import `defaultConvertToLlm` (array form) or the new `convertMessageToLlm` (single-message form) instead
+### Added
+- Added `convertMessageToLlm()`: the single-message core transformer behind `defaultConvertToLlm()`. Embedders with app-specific message roles should handle their own roles and delegate every core role (`user`/`developer`/`assistant`/`toolResult`/`custom`/`hookMessage`/`branchSummary`/`compactionSummary`) to it instead of duplicating the conversion — a duplicated `compactionSummary` case is how snapcompact frames once silently dropped off provider requests
+- Added `pruneSupersededToolResults()` and the opt-in `PruneConfig.supersedeKey` hook so harnesses can prune stale tool results superseded by a newer read of the same file; superseded results are pruned ahead of age-based victims during overflow pruning and replaced with a `[Superseded by a newer read of this file]` placeholder. Without the new config, `pruneToolOutputs()` behavior is unchanged.
+- Added `readToolSupersedeKey()` implementing the read-tool path/selector grammar (selector-free reads supersede range reads of the same file; URL-scheme paths exempt). Pruning honors prompt-cache economics: per-turn prunes only fire when the post-candidate suffix is small or the cache is cold (idle gap).
+- Added the `snapcompact` compaction strategy via `@oh-my-pi/snapcompact`: instead of an LLM summary, discarded history is printed onto dense bitmap frames and re-attached to the compaction summary message as image blocks. `CompactionSummaryMessage` gains an optional `images` field, `estimateTokens()` charges per attached frame, and frames persist under `preserveData.snapcompact` with an 8-frame middle-out eviction budget.
+- Snapcompact frames are now rendered in a provider-aware shape (`SNAPCOMPACT_SHAPES` + `resolveSnapcompactShape(api)`), following the snapcompact 200k-token monolithic evals: Anthropic-family and unknown APIs get `8x8r-bw` (unscii-8 square cells, black ink, every line printed twice with the copy on a pale highlight band — read at F1 parity with raw text at ~2x lower cost and the most refusal-robust), Google gets `8x8r-sent` (sentence-hue ink, ~2.9x cheaper), and OpenAI gets `6x6u-sent` (unscii Lanczos-stretched to 6x6 cells — OpenAI bills a flat ~2.9k tokens per image, so frame count is the only cost lever) with `detail: "original"` on the frame images. `snapcompactCompact()` accepts `model`/`shape` options, frames persist their shape metadata, mixed-shape archives (provider switches, legacy 5x8 frames) are flagged in the reading instructions, and `snapcompactGeometry()`/`renderSnapcompactFrame()` now take a shape
+### Changed
+- Compaction and branch-summary file lists are now a single `<files>` tag instead of `<read-files>`/`<modified-files>`: paths render as the grouped, prefix-folded directory tree the find/search tools emit (`# dir/` headers, bare basenames), each annotated `(Read)`, `(Write)`, or `(RW)` — modified files that were also read get `(RW)`. Legacy tags in summaries written by earlier versions are still stripped and self-heal on the next compaction
+### Fixed
+- Fixed queued steering messages being drained into an externally aborted run: interrupting mid-tool execution (e.g. Enter with a pending steer) dequeued the steer into the dying run — it landed in history without a response and the post-abort resume saw an empty queue, so the agent stopped instead of continuing. Steering/follow-up/aside queue polls are now skipped once the run's abort signal fires, leaving the queue intact for `Agent.continue()`.
+- Fixed `<read-files>` compaction lists recording the same file once per line-range/raw selector (`src/foo.ts:50-200`, `:raw`, `:1-50:raw`, …): read-tool selectors are now stripped before tracking, so reads dedupe to the base path and match their write/edit path when splitting read-only vs modified lists. Selector-polluted lists stored by earlier compactions self-heal on the next compaction. `readToolSupersedeKey()` now shares the same splitter (`splitReadSelector()`), gaining the `..` range alias and `L`-prefix forms it previously missed.
+- Fixed `estimateTokens()` undercounting thinking-heavy assistant messages on replay: `thinkingSignature` payloads (OpenAI Responses encrypted reasoning items, Anthropic signed thinking blocks, etc.) and `redactedThinking.data` are now charged alongside the visible thinking text, so the local estimate tracks provider-reported usage instead of straddling the threshold on every turn ([#2275](https://github.com/can1357/oh-my-pi/issues/2275)).
+## [15.10.12] - 2026-06-10
+### Added
+- Added `AgentLoopConfig.getDisableReasoning` so callers can override `disableReasoning` per LLM call, mirroring `getReasoning`.
+- Added `transformProviderContext` to `AgentOptions`/`AgentLoopConfig`: an optional hook applied to the assembled provider context after conversion, normalization, and append-only handling, but before telemetry capture and provider send.
+### Fixed
+- Fixed `Agent` runs so explicit reasoning disablement is forwarded to provider stream options and re-resolved per continuation, keeping mid-run thinking-off changes in sync with the next provider request.
 ## [15.10.11] - 2026-06-10
 ### Changed
@@ -10,6 +45,7 @@
 - Catalog imports moved to the new `@oh-my-pi/pi-catalog` package: subpath imports (`calculateCost`, Codex wire constants) plus catalog values previously taken from the `@oh-my-pi/pi-ai` root (`getBundledModel`, `clampThinkingLevelForModel`), which pi-ai no longer re-exports; type-only `Model`/`Api`/`Effort` imports from pi-ai are unchanged
 ## [15.10.8] - 2026-06-09
 ### Added
 - Added optional `fetch` overrides to `SummaryOptions` and `compact`/`generateSummary` so remote compaction can use custom HTTP clients
@@ -18,6 +54,7 @@
 - Added the upstream provider that served a request (`AssistantMessage.upstreamProvider`, e.g. OpenRouter's routed provider) as a `pi.gen_ai.response.upstream_provider` chat-span telemetry attribute, alongside the existing response id and time-to-first-chunk.
 ## [15.10.5] - 2026-06-08
 ### Removed
 - Removed the `maxToolCallsPerTurn` option from `AgentOptions` and `AgentLoopConfig`, so assistant turns are no longer capped after a configured number of completed tool calls
@@ -55,7 +92,6 @@
 - Removed stale synthetic user-message tag filters from OpenAI remote compaction output preservation; developer messages are now dropped by role instead.
 - Tool executions now receive the active turn `AbortSignal` unconditionally.
 ## [15.10.2] - 2026-06-08
 ### Fixed
@@ -87,6 +123,7 @@
 - Surfaced Anthropic stream failures whose message starts with `Output blocked by conten` as normal assistant error lifecycle events, so interactive clients render content-filter blocks instead of silently dropping the streaming bubble at `agent_end`.
 ## [15.8.3] - 2026-06-03
 ### Added
 - Added `getReadToolPath(context)` to `@oh-my-pi/pi-agent-core/compaction/tool-protection` to extract a paired `read` tool call's `path` for embedders building read-targeted protection matchers

package/dist/types/agent.d.ts CHANGED

@@ -1,4 +1,4 @@
-import { type ApiKeyResolveContext, type AssistantMessage, type AssistantMessageEvent, type CursorExecHandlers, type CursorToolResultHandler, type Effort, type ImageContent, type Message, type Model, type ProviderSessionState, type ServiceTier, type SimpleStreamOptions, type ThinkingBudgets, type ToolChoice } from "@oh-my-pi/pi-ai";
+import { type ApiKeyResolveContext, type AssistantMessage, type AssistantMessageEvent, type Context, type CursorExecHandlers, type CursorToolResultHandler, type Effort, type ImageContent, type Message, type Model, type ProviderSessionState, type ServiceTier, type SimpleStreamOptions, type ThinkingBudgets, type ToolChoice } from "@oh-my-pi/pi-ai";
 import type { AppendOnlyContextManager } from "./append-only-context";
 import type { HarmonyAuditEvent } from "./harmony-leak";
 import type { AgentEvent, AgentLoopConfig, AgentMessage, AgentState, AgentTool, AgentToolContext, AsideMessage, StreamFn, ToolCallContext } from "./types";
@@ -17,6 +17,11 @@ export interface AgentOptions {
      * Use for context pruning, injecting external context, etc.
      */
     transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
+    /**
+     * Optional transform applied after provider context assembly and before
+     * telemetry capture/provider send.
+     */
+    transformProviderContext?: (context: Context) => Context;
     /**
      * Steering mode: "all" = send all steering messages at once, "one-at-a-time" = one per turn
      */
@@ -294,6 +299,7 @@ export declare class Agent {
     setSystemPrompt(v: string[]): void;
     setModel(m: Model): void;
     setThinkingLevel(l: Effort | undefined): void;
+    setDisableReasoning(disabled: boolean): void;
     setSteeringMode(mode: "all" | "one-at-a-time"): void;
     getSteeringMode(): "all" | "one-at-a-time";
     setFollowUpMode(mode: "all" | "one-at-a-time"): void;
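
The `transformProviderContext` option declared above is a synchronous `Context` to `Context` function. The following is a minimal sketch of how such a hook could be written and applied. The `Context` shape here is a simplified stand-in (the real type comes from `@oh-my-pi/pi-ai` and is not shown in this diff), and `applyProviderTransform` and `redactSecrets` are hypothetical names used only for illustration.

```typescript
// Stand-in for the real `Context` from @oh-my-pi/pi-ai (assumed shape, illustration only).
interface Context {
	systemPrompt: string[];
	messages: { role: string; content: string }[];
}

type TransformProviderContext = (context: Context) => Context;

// Hypothetical helper: the hook is optional, and when present its return
// value replaces the assembled context before it is sent.
function applyProviderTransform(context: Context, transform?: TransformProviderContext): Context {
	return transform ? transform(context) : context;
}

// Hypothetical hook: mask anything that looks like an API key. It returns a
// new object instead of mutating its input.
const redactSecrets: TransformProviderContext = context => ({
	...context,
	messages: context.messages.map(m => ({
		...m,
		content: m.content.replace(/sk-[A-Za-z0-9]+/g, "[redacted]"),
	})),
});

const assembled: Context = {
	systemPrompt: ["You are a helpful agent."],
	messages: [{ role: "user", content: "my key is sk-abc123" }],
};

const sent = applyProviderTransform(assembled, redactSecrets);
const untouched = applyProviderTransform(assembled);
```

Because the declared hook is synchronous, a transform that needs asynchronous work would presumably belong in `transformContext` instead, which returns a `Promise`.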

package/dist/types/compaction/compaction.d.ts CHANGED

@@ -4,10 +4,10 @@
  * Pure functions for compaction logic. The session manager handles I/O,
  * and after compaction the session is reloaded.
  */
-import { type FetchImpl, type MessageAttribution, type Model, type Usage } from "@oh-my-pi/pi-ai";
+import { type FetchImpl, type MessageAttribution, type Model, type Tool, type Usage } from "@oh-my-pi/pi-ai";
 import { type AgentTelemetry } from "../telemetry";
 import { ThinkingLevel } from "../thinking";
-import type { AgentMessage, AgentTool } from "../types";
+import type { AgentMessage } from "../types";
 import type { SessionEntry } from "./entries";
 import { type ConvertToLlm } from "./messages";
 import { type FileOperations } from "./utils";
@@ -30,7 +30,7 @@ export interface CompactionResult<T = unknown> {
 }
 export interface CompactionSettings {
     enabled: boolean;
-    strategy?: "context-full" | "handoff" | "shake" | "off";
+    strategy?: "context-full" | "handoff" | "shake" | "snapcompact" | "off";
     thresholdPercent?: number;
     thresholdTokens?: number;
     reserveTokens: number;
@@ -133,7 +133,7 @@ export interface HandoffOptions {
     /** Live agent system prompt — passed verbatim so providers hit the cached prefix. */
     systemPrompt: string[];
     /** Live agent tool list — same purpose. Forced to `toolChoice: "none"`. */
-    tools?: AgentTool<any>[];
+    tools?: Tool[];
     customInstructions?: string;
     convertToLlm?: ConvertToLlm;
     initiatorOverride?: MessageAttribution;

package/dist/types/compaction/messages.d.ts CHANGED

@@ -33,6 +33,8 @@ export interface CompactionSummaryMessage {
     shortSummary?: string;
     tokensBefore: number;
     providerPayload?: ProviderPayload;
+    /** Snapcompact frames archived by this compaction; appended as image blocks after the summary text. */
+    images?: ImageContent[];
     timestamp: number;
 }
 export type CoreCompactionMessage = CustomMessage | HookMessage | BranchSummaryMessage | CompactionSummaryMessage;
@@ -48,8 +50,19 @@ export type ConvertToLlm = (messages: AgentMessage[]) => Message[];
 export declare function renderBranchSummaryContext(summary: string): string;
 export declare function renderCompactionSummaryContext(summary: string): string;
 export declare function createBranchSummaryMessage(summary: string, fromId: string, timestamp: string): BranchSummaryMessage;
-export declare function createCompactionSummaryMessage(summary: string, tokensBefore: number, timestamp: string, shortSummary?: string, providerPayload?: ProviderPayload): CompactionSummaryMessage;
+export declare function createCompactionSummaryMessage(summary: string, tokensBefore: number, timestamp: string, shortSummary?: string, providerPayload?: ProviderPayload, images?: ImageContent[]): CompactionSummaryMessage;
 export declare function createCustomMessage(customType: string, content: string | (TextContent | ImageContent)[], display: boolean, details: unknown | undefined, timestamp: string, attribution?: MessageAttribution): CustomMessage;
+/**
+ * Transform a single core-domain agent message to its LLM form; `undefined`
+ * drops it from the provider request.
+ *
+ * Single source of truth for the core roles (user/developer/assistant/
+ * toolResult) and the compaction messages owned by this package. Embedders
+ * with their own app messages (e.g. the coding agent) handle their custom
+ * roles and delegate every core role here — duplicating these cases is how
+ * snapcompact frames once silently fell off the provider request.
+ */
+export declare function convertMessageToLlm(message: AgentMessage): Message | undefined;
 /**
  * Default compaction-domain transformer.
  *
@@ -58,4 +71,3 @@ export declare function createCustomMessage(customType: string, content: string
  * core LLM roles and the compaction messages owned by this package.
  */
 export declare function defaultConvertToLlm(messages: AgentMessage[]): Message[];
-export declare const convertToLlm: typeof defaultConvertToLlm;
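
The doc comment on `convertMessageToLlm` asks embedders to handle only their own roles and delegate every core role. The delegation pattern can be sketched as follows. All types are simplified stand-ins, the `convertMessageToLlm` body is a placeholder rather than the package's implementation, and the `bashExecution` role and `appConvertToLlm` name are hypothetical.

```typescript
// Simplified stand-ins (assumed shapes); the real types come from the package.
type Message = { role: "user" | "assistant"; content: string };
type CoreMessage =
	| { role: "user" | "assistant"; content: string }
	| { role: "compactionSummary"; summary: string };
// Hypothetical app-specific role owned by the embedder.
type AppMessage = { role: "bashExecution"; command: string; output: string };
type AgentMessage = CoreMessage | AppMessage;

// Placeholder for the package's convertMessageToLlm: one message in, one LLM
// message out, or undefined to drop it from the provider request.
function convertMessageToLlm(message: CoreMessage): Message | undefined {
	switch (message.role) {
		case "compactionSummary":
			return { role: "user", content: message.summary };
		default:
			return message.content.length > 0 ? { role: message.role, content: message.content } : undefined;
	}
}

// Embedder converter: handle only the app role, delegate everything else.
function appConvertToLlm(messages: AgentMessage[]): Message[] {
	const out: Message[] = [];
	for (const message of messages) {
		const converted =
			message.role === "bashExecution"
				? { role: "user" as const, content: `$ ${message.command}\n${message.output}` }
				: convertMessageToLlm(message);
		if (converted) out.push(converted);
	}
	return out;
}

const llm = appConvertToLlm([
	{ role: "user", content: "hi" },
	{ role: "bashExecution", command: "ls", output: "a.txt" },
	{ role: "compactionSummary", summary: "earlier work" },
	{ role: "assistant", content: "" },
]);
```

With this layout, a change to how a core role is converted (for example, attaching `images` to a compaction summary) reaches the embedder without any edit to its converter.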

package/dist/types/compaction/pruning.d.ts CHANGED

@@ -10,10 +10,58 @@ export interface PruneConfig {
     minimumSavings: number;
     /** Tool-result protection matchers. String entries protect every result from that tool; predicates may inspect the paired tool call. */
     protectedTools: ProtectedToolMatcher[];
+    /**
+     * Optional supersede key function (see {@link SupersedePruneConfig.supersedeKey}).
+     * When provided, superseded tool results are pruned first — even inside the
+     * `protectTokens` window — before age-based victims. Absent, behavior is
+     * unchanged.
+     */
+    supersedeKey?: SupersedeKeyFn;
 }
 export declare const DEFAULT_PRUNE_CONFIG: PruneConfig;
 export interface PruneResult {
     prunedCount: number;
     tokensSaved: number;
 }
+/** Exact placeholder written over a superseded tool result. */
+export declare const SUPERSEDED_NOTICE = "[Superseded by a newer read of this file]";
+/**
+ * Maps a tool call to a supersede key. Results sharing a key form a group in
+ * which every result except the newest is a supersede candidate. A key `K`
+ * additionally supersedes keys with prefix `K + "\u0000"` (selector-free read
+ * supersedes selector-carrying reads of the same base path). Return
+ * `undefined` to exempt a call from supersede grouping.
+ */
+export type SupersedeKeyFn = (toolName: string, args: Record<string, unknown>) => string | undefined;
+export interface SupersedePruneConfig {
+    /** Supersede key function; results sharing a key supersede older ones. */
+    supersedeKey: SupersedeKeyFn;
+    /** Prune a candidate now when all messages after it total at most this many estimated tokens. Default 8 000. */
+    suffixTokenLimit?: number;
+    /** Prune all candidates when the last message is at least this old (prompt cache is cold anyway). Default 30 min. */
+    idleFlushMs?: number;
+    /** Clock override for tests. */
+    now?: number;
+    /** Tool-result protection matchers (same contract as {@link PruneConfig.protectedTools}). */
+    protectedTools: ProtectedToolMatcher[];
+}
+/**
+ * Prune superseded tool results (e.g. stale `read` outputs replaced by a newer
+ * read of the same file). Cheap, incremental, and prompt-cache-aware: a
+ * candidate is pruned now only when the suffix after it is small (tail case —
+ * the read→edit→read loop) or when the context has been idle long enough that
+ * the provider cache is cold anyway (then ALL candidates flush).
+ */
+export declare function pruneSupersededToolResults(entries: SessionEntry[], config: SupersedePruneConfig): PruneResult;
 export declare function pruneToolOutputs(entries: SessionEntry[], config?: PruneConfig): PruneResult;
+/**
+ * Supersede key for the `read` tool: the file path with the trailing line/raw
+ * selector stripped (the read tool's own splitter grammar via
+ * {@link splitReadSelector}, e.g. `src/foo.ts:50-200`, `:2-4:raw`).
+ * Internal/URL-scheme paths (`skill://…`, `https://…`) are exempt.
+ * Selector-free reads key on the bare path; selector-carrying reads key on
+ * `path + "\u0000" + selector`, so two reads collide only when the newer is
+ * selector-free or the selectors are identical (the pass's prefix rule lets a
+ * bare-path read supersede selector-carrying reads of the same file).
+ */
+export declare function readToolSupersedeKey(toolName: string, args: Record<string, unknown>): string | undefined;
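
The grouping rule in the `SupersedeKeyFn` comment and the two prompt-cache conditions in `SupersedePruneConfig` can be sketched as below. This is an illustration of the documented rules under simplifying assumptions, not the package's implementation: `KeyedResult`, `findSuperseded`, and `shouldPruneNow` are hypothetical names, and the token and timestamp inputs are passed in directly rather than derived from session entries.

```typescript
const SEP = "\u0000";

// Assumed minimal shape: a tool result with its supersede key (undefined = exempt).
interface KeyedResult {
	id: string;
	key: string | undefined;
}

// Walk newest to oldest. A result is a candidate when a newer result has the
// same key, or a newer key K such that this key starts with K + "\u0000".
function findSuperseded(results: KeyedResult[]): string[] {
	const newerKeys: string[] = [];
	const superseded: string[] = [];
	for (const { id, key } of [...results].reverse()) {
		if (key === undefined) continue;
		if (newerKeys.some(k => key === k || key.startsWith(k + SEP))) superseded.push(id);
		newerKeys.push(key);
	}
	return superseded.reverse();
}

interface GateOptions {
	suffixTokenLimit?: number;
	idleFlushMs?: number;
	now?: number;
}

// Prune now only when the suffix after the candidate is small, or the last
// message is old enough that the provider cache is cold anyway.
function shouldPruneNow(suffixTokens: number, lastMessageAt: number, opts: GateOptions = {}): boolean {
	const suffixTokenLimit = opts.suffixTokenLimit ?? 8000;
	const idleFlushMs = opts.idleFlushMs ?? 30 * 60 * 1000;
	const now = opts.now ?? Date.now();
	return suffixTokens <= suffixTokenLimit || now - lastMessageAt >= idleFlushMs;
}

const history: KeyedResult[] = [
	{ id: "r1", key: `src/foo.ts${SEP}50-200` }, // range read, older
	{ id: "r2", key: "src/bar.ts" },
	{ id: "r3", key: "src/foo.ts" }, // selector-free read, newer than r1
	{ id: "r4", key: undefined }, // exempt (e.g. URL-scheme path)
	{ id: "r5", key: "src/bar.ts" },
	{ id: "r6", key: `src/foo.ts${SEP}1-50` }, // range read, newest
];
const stale = findSuperseded(history);
```

Here `r1` is superseded by the later selector-free read `r3`, and `r2` by the identical key on `r5`. `r3` is not superseded by `r6`, because a range read does not cover a whole-file read.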

package/dist/types/compaction/utils.d.ts CHANGED

@@ -9,6 +9,22 @@ export interface FileOperations {
     edited: Set<string>;
 }
 export declare function createFileOps(): FileOperations;
+/**
+ * Split a read-tool path into its base path and trailing selector, mirroring the
+ * read tool's own splitter. Single source of the grammar in this package: the
+ * file-operations list strips selectors via {@link stripReadSelector}, and the
+ * supersede-prune pass keys on both parts via `readToolSupersedeKey`.
+ */
+export declare function splitReadSelector(path: string): {
+    path: string;
+    sel?: string;
+};
+/**
+ * Strip a trailing read-tool selector (`:50-200`, `:raw`, `:1-50:raw`, `:conflicts`, …)
+ * so the same file read with different line ranges dedupes to one `<files>` entry
+ * and matches its write/edit path when computing Read/Write/RW markers.
+ */
+export declare function stripReadSelector(path: string): string;
 /**
  * Extract file operations from tool calls in an assistant message.
  */
@@ -21,8 +37,8 @@ export declare function computeFileLists(fileOps: FileOperations): {
     readFiles: string[];
     modifiedFiles: string[];
 };
-export declare function formatFileOperations(readFiles: string[], modifiedFiles: string[]): string;
-export declare function upsertFileOperations(summary: string, readFiles: string[], modifiedFiles: string[]): string;
+export declare function formatFileOperations(readFiles: string[], modifiedFiles: string[], readSet?: ReadonlySet<string>): string;
+export declare function upsertFileOperations(summary: string, readFiles: string[], modifiedFiles: string[], readSet?: ReadonlySet<string>): string;
 /**
  * Serialize LLM messages to text for summarization.
  * This prevents the model from treating it as a conversation to continue.
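
To illustrate why stripping selectors matters for the `(Read)`/`(Write)`/`(RW)` markers, here is a rough sketch. The selector grammar below is a deliberate simplification that covers only the trailing forms quoted in the `stripReadSelector` doc comment (`:N-M`, `:raw`, `:conflicts`, and chains of them); the real splitter handles more forms whose syntax is not shown in this diff. `stripSelectorSketch` and `annotateFiles` are hypothetical names.

```typescript
// Simplified selector grammar (assumption): trailing `:N-M`, `:raw`,
// `:conflicts`, possibly chained, e.g. `:1-50:raw`.
const TRAILING_SELECTOR = /(?::(?:\d+-\d+|raw|conflicts))+$/;

function stripSelectorSketch(path: string): string {
	return path.replace(TRAILING_SELECTOR, "");
}

type Marker = "Read" | "Write" | "RW";

// Dedupe reads to their base paths, then label each file. Modified files that
// were also read get RW, as the changelog describes.
function annotateFiles(reads: string[], modified: string[]): Map<string, Marker> {
	const readSet = new Set(reads.map(stripSelectorSketch));
	const markers = new Map<string, Marker>();
	readSet.forEach(path => markers.set(path, "Read"));
	for (const path of modified) {
		markers.set(path, readSet.has(path) ? "RW" : "Write");
	}
	return markers;
}

const markers = annotateFiles(
	["src/foo.ts:50-200", "src/foo.ts:raw", "src/foo.ts:1-50:raw", "README.md"],
	["src/foo.ts", "src/new.ts"],
);
```

Without the stripping step, the three reads of `src/foo.ts` would be tracked as three distinct entries and none of them would match the edited path, so the file would be listed both as read-only and as modified.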

package/dist/types/types.d.ts CHANGED

@@ -1,4 +1,4 @@
-import type { ApiKeyResolveContext, AssistantMessage, AssistantMessageEvent, AssistantMessageEventStream, Effort, ImageContent, Message, Model, SimpleStreamOptions, Static, streamSimple, TextContent, Tool, ToolChoice, ToolResultMessage, TSchema } from "@oh-my-pi/pi-ai";
+import type { ApiKeyResolveContext, AssistantMessage, AssistantMessageEvent, AssistantMessageEventStream, Context, Effort, ImageContent, Message, Model, SimpleStreamOptions, Static, streamSimple, TextContent, Tool, ToolChoice, ToolResultMessage, TSchema } from "@oh-my-pi/pi-ai";
 import type { AppendOnlyContextManager } from "./append-only-context";
 import type { HarmonyAuditEvent } from "./harmony-leak";
 import type { AgentRunCoverage, AgentRunSummary } from "./run-collector";
@@ -79,6 +79,12 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
      * ```
      */
     transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
+    /**
+     * Optional transform applied to the final provider context after conversion,
+     * normalization, and append-only context handling, but before telemetry capture
+     * and provider send.
+     */
+    transformProviderContext?: (context: Context) => Context;
     /**
      * Resolves an API key dynamically for each LLM call.
      *
@@ -171,6 +177,14 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
      * the next model call instead of waiting for the next prompt.
      */
     getReasoning?: () => Effort | undefined;
+    /**
+     * Dynamic reasoning-disable override, resolved per LLM call. When set,
+     * its return value overrides the static `disableReasoning` from
+     * `SimpleStreamOptions` for that request. Pair with `getReasoning` so
+     * mid-run transitions into and out of the explicit `off` state propagate
+     * to the next provider call.
+     */
+    getDisableReasoning?: () => boolean | undefined;
     /**
      * Called after a tool call has been validated and is about to execute.
      *
@@ -307,6 +321,7 @@ export interface AgentState {
     systemPrompt: string[];
     model: Model;
     thinkingLevel?: Effort;
+    disableReasoning?: boolean;
     tools: AgentTool<any>[];
     messages: AgentMessage[];
     isStreaming: boolean;
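
The interplay between the static `disableReasoning` option and the per-call `getDisableReasoning` getter can be sketched as follows. `ReasoningConfig` is a reduced stand-in for `AgentLoopConfig`, and `resolveDisableReasoning` is a hypothetical helper name. The `??` resolution matches the `dynamicDisableReasoning ?? config.disableReasoning` line in the `agent-loop.ts` hunk of this diff.

```typescript
// Reduced stand-in for the two relevant AgentLoopConfig fields.
interface ReasoningConfig {
	disableReasoning?: boolean;
	getDisableReasoning?: () => boolean | undefined;
}

// The getter wins whenever it returns a boolean; `undefined` (or no getter)
// falls back to the static option.
function resolveDisableReasoning(config: ReasoningConfig): boolean | undefined {
	return config.getDisableReasoning?.() ?? config.disableReasoning;
}

// Mutable state the getter closes over, similar to how Agent reads its state.
const state = { disableReasoning: false };
const config: ReasoningConfig = {
	disableReasoning: false,
	getDisableReasoning: () => state.disableReasoning,
};

const before = resolveDisableReasoning(config);
state.disableReasoning = true; // e.g. thinking switched off mid-run
const after = resolveDisableReasoning(config);

// `??` only falls through on null/undefined, so an explicit `false` from the
// getter overrides a static `true`.
const explicitFalse = resolveDisableReasoning({ disableReasoning: true, getDisableReasoning: () => false });
const fallback = resolveDisableReasoning({ disableReasoning: true, getDisableReasoning: () => undefined });
```

Resolving the getter on every call is what lets a mid-run change take effect on the next provider request rather than on the next prompt.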

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
 	"type": "module",
 	"name": "@oh-my-pi/pi-agent-core",
-	"version": "15.10.11",
+	"version": "15.11.0",
 	"description": "General-purpose agent with transport abstraction, state management, and attachment support",
 	"homepage": "https://omp.sh",
 	"author": "Can Boluk",
@@ -35,10 +35,11 @@
 		"fmt": "biome format --write ."
 	},
 	"dependencies": {
-		"@oh-my-pi/pi-ai": "15.10.11",
-		"@oh-my-pi/pi-catalog": "15.10.11",
-		"@oh-my-pi/pi-natives": "15.10.11",
-		"@oh-my-pi/pi-utils": "15.10.11",
+		"@oh-my-pi/pi-ai": "15.11.0",
+		"@oh-my-pi/pi-catalog": "15.11.0",
+		"@oh-my-pi/pi-natives": "15.11.0",
+		"@oh-my-pi/pi-utils": "15.11.0",
+		"@oh-my-pi/snapcompact": "15.11.0",
 		"@opentelemetry/api": "^1.9.1"
 	},
 	"devDependencies": {

package/src/agent-loop.ts CHANGED

@@ -564,8 +564,10 @@ async function runLoopBody(
 	streamFn?: StreamFn,
 ): Promise<void> {
 	let firstTurn = true;
-	// Check for steering messages at start (user may have typed while waiting)
-	let pendingMessages: AgentMessage[] = (await config.getSteeringMessages?.()) || [];
+	// Check for steering messages at start (user may have typed while waiting).
+	// Skip when the run is already externally aborted — dequeuing would strand
+	// the messages in a run that is about to die.
+	let pendingMessages: AgentMessage[] = signal?.aborted ? [] : (await config.getSteeringMessages?.()) || [];
 	let harmonyRetryAttempt = 0;
 	let harmonyTruncateResumeCount = 0;
@@ -743,7 +745,12 @@ async function runLoopBody(
 			stream.push({ type: "turn_end", message, toolResults });
-			const steering = steeringMessagesFromExecution ?? ((await config.getSteeringMessages?.()) || []);
+			// On external abort (user interrupt), leave the steering queue intact: the
+			// session aborts then continues, delivering the queue into a fresh run.
+			// Draining it here would inject the messages right before a model call that
+			// instantly aborts — message lands in history, agent never responds.
+			const steering =
+				steeringMessagesFromExecution ?? (signal?.aborted ? [] : (await config.getSteeringMessages?.()) || []);
 			if (hasMoreToolCalls) {
 				// Mid-work: fold any non-interrupting asides into the next turn alongside steering.
 				const asides = resolveAsides(await config.getAsideMessages?.());
@@ -758,8 +765,9 @@ async function runLoopBody(
 		// Agent would stop here. Drain non-interrupting asides + follow-up messages.
 		await config.onBeforeYield?.();
-		const asideMessages = resolveAsides(await config.getAsideMessages?.());
-		const followUpMessages = (await config.getFollowUpMessages?.()) || [];
+		// Skip queue drains when externally aborted (same stranding hazard as above).
+		const asideMessages = signal?.aborted ? [] : resolveAsides(await config.getAsideMessages?.());
+		const followUpMessages = signal?.aborted ? [] : (await config.getFollowUpMessages?.()) || [];
 		if (asideMessages.length > 0 || followUpMessages.length > 0) {
 			// Set as pending so the inner loop processes them before stopping.
 			pendingMessages = [...asideMessages, ...followUpMessages];
@@ -829,6 +837,9 @@ async function streamAssistantResponse(
 			tools: normalizeTools(context.tools, !!config.intentTracing),
 		};
 	}
+	if (config.transformProviderContext) {
+		llmContext = config.transformProviderContext(llmContext);
+	}
 	const streamFunction = streamFn || streamSimple;
@@ -845,6 +856,7 @@ async function streamAssistantResponse(
 	const dynamicToolChoice = config.getToolChoice?.();
 	const dynamicReasoning = config.getReasoning?.();
+	const dynamicDisableReasoning = config.getDisableReasoning?.();
 	const harmonyMitigationEnabled = isHarmonyLeakMitigationTarget(config.model);
 	const harmonyAbortController = harmonyMitigationEnabled ? new AbortController() : undefined;
 	const requestSignal = harmonyAbortController
@@ -856,6 +868,7 @@ async function streamAssistantResponse(
 		harmonyRetryAttempt > 0 && config.temperature !== undefined ? config.temperature + 0.05 : config.temperature;
 	const effectiveToolChoice = dynamicToolChoice ?? config.toolChoice;
 	const effectiveReasoning = dynamicReasoning ?? config.reasoning;
+	const effectiveDisableReasoning = dynamicDisableReasoning ?? config.disableReasoning;
 	const chatStepNumber = stepCounter.count;
 	stepCounter.count += 1;
@@ -916,6 +929,7 @@ async function streamAssistantResponse(
 				metadata: resolvedMetadata,
 				toolChoice: effectiveToolChoice,
 				reasoning: effectiveReasoning,
+				disableReasoning: effectiveDisableReasoning,
 				temperature: effectiveTemperature,
 				signal: requestSignal,
 				onResponse: captureOnResponse,
@@ -1247,11 +1261,16 @@ async function executeToolCalls(
 	}));
 	const checkSteering = async (): Promise<void> => {
-		if (!shouldInterruptImmediately || !getSteeringMessages || interruptState.triggered) {
+		// `signal` (external/user abort) is checked separately from the internal
+		// steeringAbortController: once the run is externally aborted it is
+		// unwinding, and draining the steering queue here would strand the
+		// messages in the dying run instead of leaving them for the post-abort
+		// continue (interruptAndFlushQueuedMessages → Agent.continue()).
+		if (!shouldInterruptImmediately || !getSteeringMessages || interruptState.triggered || signal?.aborted) {
 			return;
 		}
 		const check = steeringCheckTail.then(async () => {
-			if (interruptState.triggered) return;
+			if (interruptState.triggered || signal?.aborted) return;
 			const steering = await getSteeringMessages();
 			if (steering.length > 0) {
 				steeringMessages = steering;
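
The guard repeated throughout the hunks above (`signal?.aborted ? [] : ...`) can be reduced to a small sketch. For brevity this version is synchronous and uses a stand-in `AbortLike` interface and a hypothetical `SteeringQueue` class; the real hooks (`getSteeringMessages` and friends) are async and take a real `AbortSignal`.

```typescript
// Minimal stand-in for AbortSignal: only the `aborted` flag matters here.
interface AbortLike {
	readonly aborted: boolean;
}

// Hypothetical queue whose poll is destructive, like a dequeue-on-read hook.
class SteeringQueue {
	private items: string[] = [];
	enqueue(message: string): void {
		this.items.push(message);
	}
	drain(): string[] {
		return this.items.splice(0);
	}
	get size(): number {
		return this.items.length;
	}
}

// Skip the poll entirely once the run is aborted, so nothing is dequeued
// into a run that will never respond to it.
function pollUnlessAborted(queue: SteeringQueue, signal?: AbortLike): string[] {
	return signal?.aborted ? [] : queue.drain();
}

const queue = new SteeringQueue();
queue.enqueue("use the other API");

const dyingRun = pollUnlessAborted(queue, { aborted: true });
const sizeAfterAbort = queue.size; // message is still queued
const freshRun = pollUnlessAborted(queue, { aborted: false });
```

The important property is that the aborted check happens before the drain: checking afterwards would already have removed the message from the queue.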

package/src/agent.ts CHANGED

@@ -6,6 +6,7 @@ import {
 	type ApiKeyResolveContext,
 	type AssistantMessage,
 	type AssistantMessageEvent,
+	type Context,
 	type CursorExecHandlers,
 	type CursorToolResultHandler,
 	type Effort,
@@ -93,6 +94,12 @@ export interface AgentOptions {
 	 */
 	transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
+	/**
+	 * Optional transform applied after provider context assembly and before
+	 * telemetry capture/provider send.
+	 */
+	transformProviderContext?: (context: Context) => Context;
 	/**
 	 * Steering mode: "all" = send all steering messages at once, "one-at-a-time" = one per turn
 	 */
@@ -265,6 +272,7 @@ export class Agent {
 		systemPrompt: [],
 		model: getBundledModel("google", "gemini-2.5-flash-lite-preview-06-17"),
 		thinkingLevel: undefined,
+		disableReasoning: false,
 		tools: [],
 		messages: [],
 		isStreaming: false,
@@ -277,6 +285,7 @@ export class Agent {
 	#abortController?: AbortController;
 	#convertToLlm: (messages: AgentMessage[]) => Message[] | Promise<Message[]>;
 	#transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
+	#transformProviderContext?: (context: Context) => Context;
 	#steeringQueue: AgentMessage[] = [];
 	#followUpQueue: AgentMessage[] = [];
 	#steeringMode: "all" | "one-at-a-time";
@@ -375,6 +384,7 @@ export class Agent {
 		this.afterToolCall = opts.afterToolCall;
 		this.#telemetry = opts.telemetry;
 		this.#appendOnlyContext = opts.appendOnlyContext;
+		this.#transformProviderContext = opts.transformProviderContext;
 	}
 	/**
@@ -658,6 +668,10 @@ export class Agent {
 		this.#state.thinkingLevel = l;
 	}
+	setDisableReasoning(disabled: boolean) {
+		this.#state.disableReasoning = disabled;
+	}
 	setSteeringMode(mode: "all" | "one-at-a-time") {
 		this.#steeringMode = mode;
 	}
@@ -942,6 +956,7 @@ export class Agent {
 		const config: AgentLoopConfig = {
 			model,
 			reasoning,
+			disableReasoning: this.#state.disableReasoning,
 			temperature: this.#temperature,
 			topP: this.#topP,
 			topK: this.#topK,
@@ -961,6 +976,7 @@ export class Agent {
 			kimiApiFormat: this.#kimiApiFormat,
 			preferWebsockets: this.#preferWebsockets,
 			convertToLlm: this.#convertToLlm,
+			transformProviderContext: this.#transformProviderContext,
 			transformContext: this.#transformContext,
 			onPayload: this.#onPayload,
 			onResponse: this.#onResponse,
@@ -985,6 +1001,7 @@ export class Agent {
 			onHarmonyLeak: this.#onHarmonyLeak,
 			getToolChoice,
 			getReasoning: () => this.#state.thinkingLevel,
+			getDisableReasoning: () => this.#state.disableReasoning,
 			getSteeringMessages: async () => {
 				if (skipInitialSteeringPoll) {
 					skipInitialSteeringPoll = false;

package/src/compaction/branch-summarization.ts CHANGED

@@ -13,10 +13,10 @@ import { estimateTokens } from "./compaction";
 import type { ReadonlySessionManager, SessionEntry } from "./entries";
 import {
 	type ConvertToLlm,
-	convertToLlm,
 	createBranchSummaryMessage,
 	createCompactionSummaryMessage,
 	createCustomMessage,
+	defaultConvertToLlm,
 } from "./messages";
 import branchSummaryPrompt from "./prompts/branch-summary.md" with { type: "text" };
 import branchSummaryPreamble from "./prompts/branch-summary-preamble.md" with { type: "text" };
@@ -27,6 +27,7 @@ import {
 	type FileOperations,
 	SUMMARIZATION_SYSTEM_PROMPT,
 	serializeConversation,
+	stripReadSelector,
 	upsertFileOperations,
 } from "./utils";
@@ -214,7 +215,7 @@ export function prepareBranchEntries(entries: SessionEntry[], tokenBudget: numbe
 		if (entry.type === "branch_summary" && !entry.fromExtension && entry.details) {
 			const details = entry.details as BranchSummaryDetails;
 			if (Array.isArray(details.readFiles)) {
-				for (const f of details.readFiles) fileOps.read.add(f);
+				for (const f of details.readFiles) fileOps.read.add(stripReadSelector(f));
 			}
 			if (Array.isArray(details.modifiedFiles)) {
 				// Modified files go into both edited and written for proper deduplication
@@ -288,7 +289,7 @@ export async function generateBranchSummary(
 	// Transform to LLM-compatible messages, then serialize to text
 	// Serialization prevents the model from treating it as a conversation to continue
-	const llmMessages = (options.convertToLlm ?? convertToLlm)(messages);
+	const llmMessages = (options.convertToLlm ?? defaultConvertToLlm)(messages);
 	const conversationText = serializeConversation(llmMessages);
 	// Build prompt
@@ -329,7 +330,7 @@ export async function generateBranchSummary(
 	// Compute file lists and append to summary
 	const { readFiles, modifiedFiles } = computeFileLists(fileOps);
-	summary = upsertFileOperations(summary, readFiles, modifiedFiles);
+	summary = upsertFileOperations(summary, readFiles, modifiedFiles, fileOps.read);
 	return {
 		summary: summary || "No summary generated",

package/src/compaction/compaction.ts CHANGED

@@ -12,16 +12,18 @@ import {
 	type Message,
 	type MessageAttribution,
 	type Model,
+	type Tool,
 	type Usage,
 } from "@oh-my-pi/pi-ai";
 import { clampThinkingLevelForModel } from "@oh-my-pi/pi-catalog/model-thinking";
 import { countTokens } from "@oh-my-pi/pi-natives";
 import { logger, prompt } from "@oh-my-pi/pi-utils";
+import { SNAPCOMPACT_FRAME_TOKEN_ESTIMATE } from "@oh-my-pi/snapcompact";
 import { type AgentTelemetry, instrumentedCompleteSimple } from "../telemetry";
 import { ThinkingLevel } from "../thinking";
-import type { AgentMessage, AgentTool } from "../types";
+import type { AgentMessage } from "../types";
 import type { CompactionEntry, SessionEntry } from "./entries";
-import { type ConvertToLlm, convertToLlm, createBranchSummaryMessage, createCustomMessage } from "./messages";
+import { type ConvertToLlm, createBranchSummaryMessage, createCustomMessage, defaultConvertToLlm } from "./messages";
 import {
 	buildOpenAiNativeHistory,
 	getPreservedOpenAiRemoteCompactionData,
@@ -44,6 +46,7 @@ import {
 	type FileOperations,
 	SUMMARIZATION_SYSTEM_PROMPT,
 	serializeConversation,
+	stripReadSelector,
 	upsertFileOperations,
 } from "./utils";
@@ -73,7 +76,7 @@ function extractFileOperations(
 		if (!prevCompaction.fromExtension && prevCompaction.details) {
 			const details = prevCompaction.details as CompactionDetails;
 			if (Array.isArray(details.readFiles)) {
-				for (const f of details.readFiles) fileOps.read.add(f);
+				for (const f of details.readFiles) fileOps.read.add(stripReadSelector(f));
 			}
 			if (Array.isArray(details.modifiedFiles)) {
 				for (const f of details.modifiedFiles) fileOps.edited.add(f);
@@ -136,7 +139,7 @@ export interface CompactionResult<T = unknown> {
 export interface CompactionSettings {
 	enabled: boolean;
-	strategy?: "context-full" | "handoff" | "shake" | "off";
+	strategy?: "context-full" | "handoff" | "shake" | "snapcompact" | "off";
 	thresholdPercent?: number;
 	thresholdTokens?: number;
 	reserveTokens: number;
@@ -284,9 +287,19 @@ export function estimateTokens(message: AgentMessage): number {
 					fragments.push(block.text);
 				} else if (block.type === "thinking") {
 					fragments.push(block.thinking);
+					// Providers charge for the opaque signature/reasoning payload that
+					// rides alongside the thinking text (OpenAI Responses encrypted
+					// reasoning items, Anthropic signed thinking blocks, etc.). Without
+					// counting it, this estimator can read ~half of the provider-reported
+					// usage on thinking-heavy turns — see #2275 for the resulting
+					// compaction-trigger / post-check metric divergence.
+					if (block.thinkingSignature) fragments.push(block.thinkingSignature);
 				} else if (block.type === "toolCall") {
 					fragments.push(block.name);
 					fragments.push(JSON.stringify(block.arguments));
+				} else if (block.type === "redactedThinking") {
+					// Encrypted reasoning blob the provider still bills for on replay.
+					fragments.push(block.data);
 				}
 			}
 			break;
@@ -309,6 +322,10 @@ export function estimateTokens(message: AgentMessage): number {
 		case "branchSummary":
 		case "compactionSummary": {
 			fragments.push(message.summary);
+			if (message.role === "compactionSummary" && message.images) {
+				// Snapcompact frames render at ≥1568px; providers bill the downscaled cap.
+				extra += message.images.length * SNAPCOMPACT_FRAME_TOKEN_ESTIMATE;
+			}
 			break;
 		}
 		default:
@@ -624,7 +641,7 @@ export async function generateSummary(
 	// Serialize conversation to text so model doesn't try to continue it
 	// Convert to LLM messages first (handles custom app messages when caller provides a transformer).
-	const llmMessages = (options?.convertToLlm ?? convertToLlm)(currentMessages);
+	const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(currentMessages);
 	const conversationText = serializeConversation(llmMessages);
 	// Build the prompt with conversation wrapped in tags
@@ -690,7 +707,7 @@ export interface HandoffOptions {
 	/** Live agent system prompt — passed verbatim so providers hit the cached prefix. */
 	systemPrompt: string[];
 	/** Live agent tool list — same purpose. Forced to `toolChoice: "none"`. */
-	tools?: AgentTool<any>[];
+	tools?: Tool[];
 	customInstructions?: string;
 	convertToLlm?: ConvertToLlm;
 	initiatorOverride?: MessageAttribution;
@@ -723,7 +740,7 @@ export async function generateHandoff(
 	options: HandoffOptions,
 	signal?: AbortSignal,
 ): Promise<string> {
-	const llmMessages = (options.convertToLlm ?? convertToLlm)(messages);
+	const llmMessages = (options.convertToLlm ?? defaultConvertToLlm)(messages);
 	const requestMessages: Message[] = [
 		...llmMessages,
 		{
@@ -772,7 +789,7 @@ async function generateShortSummary(
 	options?: SummaryOptions,
 ): Promise<string> {
 	const maxTokens = Math.min(512, Math.floor(0.2 * reserveTokens));
-	const llmMessages = (options?.convertToLlm ?? convertToLlm)(recentMessages);
+	const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(recentMessages);
 	const conversationText = serializeConversation(llmMessages);
 	let promptText = `<conversation>\n${conversationText}\n</conversation>\n\n`;
@@ -1009,7 +1026,7 @@ export async function compact(
 				? previousRemoteCompaction.replacementHistory
 				: undefined;
 		const remoteHistory = buildOpenAiNativeHistory(
-			(summaryOptions.convertToLlm ?? convertToLlm)(remoteMessages),
+			(summaryOptions.convertToLlm ?? defaultConvertToLlm)(remoteMessages),
 			model,
 			previousReplacementHistory,
 		);
@@ -1097,7 +1114,7 @@ export async function compact(
 	// Compute file lists and append to summary
 	const { readFiles, modifiedFiles } = computeFileLists(fileOps);
-	summary = upsertFileOperations(summary, readFiles, modifiedFiles);
+	summary = upsertFileOperations(summary, readFiles, modifiedFiles, fileOps.read);
 	if (!firstKeptEntryId) {
 		throw new Error("First kept entry has no ID - session may need migration");
@@ -1126,7 +1143,7 @@ async function generateTurnPrefixSummary(
 ): Promise<string> {
 	const maxTokens = Math.floor(0.5 * reserveTokens); // Smaller budget for turn prefix
-	const llmMessages = (options?.convertToLlm ?? convertToLlm)(messages);
+	const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(messages);
 	const conversationText = serializeConversation(llmMessages);
 	const promptText = `<conversation>\n${conversationText}\n</conversation>\n\n${TURN_PREFIX_SUMMARIZATION_PROMPT}`;
 	const summarizationMessages = [

package/src/compaction/messages.ts CHANGED

@@ -51,6 +51,8 @@ export interface CompactionSummaryMessage {
 	shortSummary?: string;
 	tokensBefore: number;
 	providerPayload?: ProviderPayload;
+	/** Snapcompact frames archived by this compaction; appended as image blocks after the summary text. */
+	images?: ImageContent[];
 	timestamp: number;
 }
@@ -98,6 +100,7 @@ export function createCompactionSummaryMessage(
 	timestamp: string,
 	shortSummary?: string,
 	providerPayload?: ProviderPayload,
+	images?: ImageContent[],
 ): CompactionSummaryMessage {
 	return {
 		role: "compactionSummary",
@@ -105,6 +108,7 @@ export function createCompactionSummaryMessage(
 		shortSummary,
 		tokensBefore,
 		providerPayload,
+		images: images && images.length > 0 ? images : undefined,
 		timestamp: new Date(timestamp).getTime(),
 	};
 }
@@ -137,6 +141,79 @@ function isCoreCompactionMessage(message: AgentMessage): message is AgentMessage
 	);
 }
+/**
+ * Transform a single core-domain agent message to its LLM form; `undefined`
+ * drops it from the provider request.
+ *
+ * Single source of truth for the core roles (user/developer/assistant/
+ * toolResult) and the compaction messages owned by this package. Embedders
+ * with their own app messages (e.g. the coding agent) handle their custom
+ * roles and delegate every core role here — duplicating these cases is how
+ * snapcompact frames once silently fell off the provider request.
+ */
+export function convertMessageToLlm(message: AgentMessage): Message | undefined {
+	if (isCoreCompactionMessage(message)) {
+		switch (message.role) {
+			case "custom":
+			case "hookMessage": {
+				const content =
+					typeof message.content === "string"
+						? [{ type: "text" as const, text: message.content }]
+						: message.content;
+				return {
+					role: "developer",
+					content,
+					attribution: message.attribution,
+					timestamp: message.timestamp,
+				};
+			}
+			case "branchSummary":
+				return {
+					role: "user",
+					content: [
+						{
+							type: "text" as const,
+							text: renderBranchSummaryContext(message.summary),
+						},
+					],
+					attribution: "agent",
+					timestamp: message.timestamp,
+				};
+			case "compactionSummary":
+				return {
+					role: "user",
+					content: [
+						{
+							type: "text" as const,
+							text: renderCompactionSummaryContext(message.summary),
+						},
+						...(message.images ?? []),
+					],
+					attribution: "agent",
+					providerPayload: message.providerPayload,
+					timestamp: message.timestamp,
+				};
+		}
+	}
+	switch (message.role) {
+		case "user":
+			return { ...message, attribution: message.attribution ?? "user" };
+		case "developer":
+			return { ...message, attribution: message.attribution ?? "agent" };
+		case "assistant":
+			return message as AssistantMessage;
+		case "toolResult":
+			return {
+				...message,
+				content: getPrunedToolResultContent(message as ToolResultMessage),
+				attribution: message.attribution ?? "agent",
+			};
+		default:
+			return undefined;
+	}
+}
 /**
  * Default compaction-domain transformer.
  *
@@ -145,68 +222,5 @@ function isCoreCompactionMessage(message: AgentMessage): message is AgentMessage
  * core LLM roles and the compaction messages owned by this package.
  */
 export function defaultConvertToLlm(messages: AgentMessage[]): Message[] {
-	return messages
-		.map((message): Message | undefined => {
-			if (isCoreCompactionMessage(message)) {
-				switch (message.role) {
-					case "custom":
-					case "hookMessage": {
-						const content =
-							typeof message.content === "string"
-								? [{ type: "text" as const, text: message.content }]
-								: message.content;
-						return {
-							role: "developer",
-							content,
-							attribution: message.attribution,
-							timestamp: message.timestamp,
-						};
-					}
-					case "branchSummary":
-						return {
-							role: "user",
-							content: [
-								{
-									type: "text" as const,
-									text: renderBranchSummaryContext(message.summary),
-								},
-							],
-							attribution: "agent",
-							timestamp: message.timestamp,
-						};
-					case "compactionSummary":
-						return {
-							role: "user",
-							content: [
-								{
-									type: "text" as const,
-									text: renderCompactionSummaryContext(message.summary),
-								},
-							],
-							attribution: "agent",
-							providerPayload: message.providerPayload,
-							timestamp: message.timestamp,
-						};
-				}
-			}
-			switch (message.role) {
-				case "user":
-					return { ...message, attribution: message.attribution ?? "user" };
-				case "developer":
-					return { ...message, attribution: message.attribution ?? "agent" };
-				case "assistant":
-					return message as AssistantMessage;
-				case "toolResult":
-					return {
-						...message,
-						content: getPrunedToolResultContent(message as ToolResultMessage),
-						attribution: message.attribution ?? "agent",
-					};
-				default:
-					return undefined;
-			}
-		})
-		.filter(message => message !== undefined);
+	return messages.map(convertMessageToLlm).filter(message => message !== undefined);
 }
-export const convertToLlm = defaultConvertToLlm;
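
The delegation pattern described in the `convertMessageToLlm` doc comment above can be sketched as follows. This is a minimal illustration, not the package's code: the types are simplified stand-ins, `convertCoreMessage` stands in for `convertMessageToLlm`, and the `note` role is a hypothetical app-specific role invented for the example.

```typescript
// Simplified stand-ins for the package's message types (assumption: the real ones are richer).
type LlmMessage = { role: "user" | "developer" | "assistant" | "toolResult"; content: string };
type CoreMessage =
	| { role: "user" | "developer" | "assistant" | "toolResult"; content: string }
	| { role: "compactionSummary"; summary: string };
// Hypothetical app-specific role, not part of the package.
type NoteMessage = { role: "note"; text: string };
type AppMessage = CoreMessage | NoteMessage;

// Stand-in for convertMessageToLlm(): one message in, one LLM message out.
// The real function may return undefined to drop a message; this stand-in never does.
function convertCoreMessage(message: CoreMessage): LlmMessage | undefined {
	switch (message.role) {
		case "compactionSummary":
			return { role: "user", content: message.summary };
		default:
			return message;
	}
}

// Embedder converter: handle only the app's own role, delegate every core role.
function convertAppMessages(messages: AppMessage[]): LlmMessage[] {
	const out: LlmMessage[] = [];
	for (const message of messages) {
		const converted =
			message.role === "note"
				? { role: "developer" as const, content: message.text }
				: convertCoreMessage(message);
		if (converted !== undefined) out.push(converted);
	}
	return out;
}
```

Because the embedder never re-implements the `compactionSummary` case, a later change to that case (such as appending image blocks) reaches it without further edits.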

package/src/compaction/prompts/file-operations.md CHANGED

@@ -1,10 +1,5 @@
-{{#if readFiles.length}}
-{{#xml "read-files"}}
-{{join readFiles "\n"}}
-{{/xml}}
-{{/if}}
-{{#if modifiedFiles.length}}
-{{#xml "modified-files"}}
-{{join modifiedFiles "\n"}}
+{{#if files}}
+{{#xml "files"}}
+{{files}}
 {{/xml}}
 {{/if}}

package/src/compaction/pruning.ts CHANGED

@@ -3,7 +3,7 @@
  */
 import type { ToolResultMessage } from "@oh-my-pi/pi-ai";
-import type { AgentMessage } from "../types";
+import type { AgentMessage, AgentToolCall } from "../types";
 import { estimateTokens } from "./compaction";
 import type { SessionEntry, SessionMessageEntry } from "./entries";
 import {
@@ -12,6 +12,7 @@ import {
 	isSkillReadToolResult,
 	type ProtectedToolMatcher,
 } from "./tool-protection";
+import { splitReadSelector } from "./utils";
 export interface PruneConfig {
 	/** Keep the most recent tool output tokens intact. */
@@ -20,6 +21,13 @@ export interface PruneConfig {
 	minimumSavings: number;
 	/** Tool-result protection matchers. String entries protect every result from that tool; predicates may inspect the paired tool call. */
 	protectedTools: ProtectedToolMatcher[];
+	/**
+	 * Optional supersede key function (see {@link SupersedePruneConfig.supersedeKey}).
+	 * When provided, superseded tool results are pruned first — even inside the
+	 * `protectTokens` window — before age-based victims. Absent, behavior is
+	 * unchanged.
+	 */
+	supersedeKey?: SupersedeKeyFn;
 }
 export const DEFAULT_PRUNE_CONFIG: PruneConfig = {
@@ -33,6 +41,34 @@ export interface PruneResult {
 	tokensSaved: number;
 }
+/** Exact placeholder written over a superseded tool result. */
+export const SUPERSEDED_NOTICE = "[Superseded by a newer read of this file]";
+/**
+ * Maps a tool call to a supersede key. Results sharing a key form a group in
+ * which every result except the newest is a supersede candidate. A key `K`
+ * additionally supersedes keys with prefix `K + "\u0000"` (selector-free read
+ * supersedes selector-carrying reads of the same base path). Return
+ * `undefined` to exempt a call from supersede grouping.
+ */
+export type SupersedeKeyFn = (toolName: string, args: Record<string, unknown>) => string | undefined;
+export interface SupersedePruneConfig {
+	/** Supersede key function; results sharing a key supersede older ones. */
+	supersedeKey: SupersedeKeyFn;
+	/** Prune a candidate now when all messages after it total at most this many estimated tokens. Default 8 000. */
+	suffixTokenLimit?: number;
+	/** Prune all candidates when the last message is at least this old (prompt cache is cold anyway). Default 30 min. */
+	idleFlushMs?: number;
+	/** Clock override for tests. */
+	now?: number;
+	/** Tool-result protection matchers (same contract as {@link PruneConfig.protectedTools}). */
+	protectedTools: ProtectedToolMatcher[];
+}
+const DEFAULT_SUFFIX_TOKEN_LIMIT = 8_000;
+const DEFAULT_IDLE_FLUSH_MS = 30 * 60_000;
 function createPrunedNotice(tokens: number): string {
 	return `[Output truncated - ${tokens} tokens]`;
 }
@@ -44,18 +80,121 @@ function getToolResultMessage(entry: SessionEntry): ToolResultMessage | undefine
 	return message as ToolResultMessage;
 }
-function estimatePrunedSavings(tokens: number): number {
-	const noticeTokens = Math.ceil(createPrunedNotice(tokens).length / 4);
+function estimatePrunedSavings(tokens: number, notice: string): number {
+	const noticeTokens = Math.ceil(notice.length / 4);
 	return Math.max(0, tokens - noticeTokens);
 }
+interface SupersedeCandidate {
+	entry: SessionMessageEntry;
+	message: ToolResultMessage;
+	/** Index of the entry within the `entries` array. */
+	index: number;
+	tokens: number;
+}
+/**
+ * Collect superseded tool results: for every unpruned, unprotected tool result
+ * whose paired call resolves a supersede key, a LATER result with the same key
+ * — or with a key that is the `"\u0000"`-prefix parent of this one — marks it
+ * superseded. Returned in message order.
+ */
+function collectSupersededResults(
+	entries: readonly SessionEntry[],
+	toolCallsById: ReadonlyMap<string, AgentToolCall>,
+	supersedeKey: SupersedeKeyFn,
+	protectedTools: readonly ProtectedToolMatcher[],
+): SupersedeCandidate[] {
+	const candidates: SupersedeCandidate[] = [];
+	const seenKeys = new Set<string>();
+	for (let i = entries.length - 1; i >= 0; i--) {
+		const entry = entries[i];
+		const message = getToolResultMessage(entry);
+		if (!message || message.prunedAt !== undefined) continue;
+		const toolCall = toolCallsById.get(message.toolCallId);
+		if (!toolCall) continue;
+		if (isProtectedToolResult(message, toolCall, protectedTools)) continue;
+		const key = supersedeKey(toolCall.name, toolCall.arguments as Record<string, unknown>);
+		if (key === undefined) continue;
+		const separator = key.indexOf("\u0000");
+		const superseded = seenKeys.has(key) || (separator >= 0 && seenKeys.has(key.slice(0, separator)));
+		seenKeys.add(key);
+		if (!superseded) continue;
+		candidates.push({
+			entry: entry as SessionMessageEntry,
+			message,
+			index: i,
+			tokens: estimateTokens(message as AgentMessage),
+		});
+	}
+	return candidates.reverse();
+}
+/**
+ * Prune superseded tool results (e.g. stale `read` outputs replaced by a newer
+ * read of the same file). Cheap, incremental, and prompt-cache-aware: a
+ * candidate is pruned now only when the suffix after it is small (tail case —
+ * the read→edit→read loop) or when the context has been idle long enough that
+ * the provider cache is cold anyway (then ALL candidates flush).
+ */
+export function pruneSupersededToolResults(entries: SessionEntry[], config: SupersedePruneConfig): PruneResult {
+	const toolCallsById = collectToolCallsById(entries);
+	const candidates = collectSupersededResults(entries, toolCallsById, config.supersedeKey, config.protectedTools);
+	if (candidates.length === 0) return { prunedCount: 0, tokensSaved: 0 };
+	const now = config.now ?? Date.now();
+	let lastMessageTimestamp: number | undefined;
+	for (let i = entries.length - 1; i >= 0; i--) {
+		const entry = entries[i];
+		if (entry.type !== "message") continue;
+		const timestamp = (entry.message as AgentMessage).timestamp;
+		if (typeof timestamp === "number") lastMessageTimestamp = timestamp;
+		break;
+	}
+	const idle =
+		lastMessageTimestamp !== undefined && now - lastMessageTimestamp >= (config.idleFlushMs ?? DEFAULT_IDLE_FLUSH_MS);
+	let toPrune: SupersedeCandidate[];
+	if (idle) {
+		toPrune = candidates;
+	} else {
+		const suffixTokenLimit = config.suffixTokenLimit ?? DEFAULT_SUFFIX_TOKEN_LIMIT;
+		// suffixTokens[i] = estimated tokens of all messages strictly after entry i.
+		const suffixTokens = new Array<number>(entries.length);
+		let accumulated = 0;
+		for (let i = entries.length - 1; i >= 0; i--) {
+			suffixTokens[i] = accumulated;
+			const entry = entries[i];
+			if (entry.type === "message") accumulated += estimateTokens(entry.message as AgentMessage);
+		}
+		toPrune = candidates.filter(candidate => suffixTokens[candidate.index] <= suffixTokenLimit);
+	}
+	if (toPrune.length === 0) return { prunedCount: 0, tokensSaved: 0 };
+	const prunedAt = Date.now();
+	let tokensSaved = 0;
+	for (const candidate of toPrune) {
+		candidate.message.content = [{ type: "text", text: SUPERSEDED_NOTICE }];
+		candidate.message.prunedAt = prunedAt;
+		tokensSaved += estimatePrunedSavings(candidate.tokens, SUPERSEDED_NOTICE);
+	}
+	return { prunedCount: toPrune.length, tokensSaved };
+}
 export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig = DEFAULT_PRUNE_CONFIG): PruneResult {
 	let accumulatedTokens = 0;
 	let tokensSaved = 0;
 	let prunedCount = 0;
-	const candidates: Array<{ entry: SessionMessageEntry; tokens: number }> = [];
+	const candidates: Array<{ entry: SessionMessageEntry; tokens: number; superseded: boolean }> = [];
 	const toolCallsById = collectToolCallsById(entries);
+	const supersededMessages = config.supersedeKey
+		? new Set(
+				collectSupersededResults(entries, toolCallsById, config.supersedeKey, config.protectedTools).map(
+					candidate => candidate.message,
+				),
+			)
+		: undefined;
 	for (let i = entries.length - 1; i >= 0; i--) {
 		const entry = entries[i];
@@ -70,17 +209,23 @@ export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig =
 			continue;
 		}
-		if (accumulatedTokens < config.protectTokens || isProtected) {
+		// Superseded results are pruned first: they bypass the protect window
+		// (a stale copy of re-read content is dead weight at any age).
+		const superseded = supersededMessages?.has(message) ?? false;
+		if (!superseded && (accumulatedTokens < config.protectTokens || isProtected)) {
 			accumulatedTokens += tokens;
 			continue;
 		}
-		candidates.push({ entry: entry as SessionMessageEntry, tokens });
+		candidates.push({ entry: entry as SessionMessageEntry, tokens, superseded });
 		accumulatedTokens += tokens;
 	}
 	for (const candidate of candidates) {
-		tokensSaved += estimatePrunedSavings(candidate.tokens);
+		tokensSaved += estimatePrunedSavings(
+			candidate.tokens,
+			candidate.superseded ? SUPERSEDED_NOTICE : createPrunedNotice(candidate.tokens),
+		);
 	}
 	if (tokensSaved < config.minimumSavings || candidates.length === 0) {
@@ -90,10 +235,31 @@ export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig =
 	const prunedAt = Date.now();
 	for (const candidate of candidates) {
 		const message = candidate.entry.message as ToolResultMessage;
-		message.content = [{ type: "text", text: createPrunedNotice(candidate.tokens) }];
+		message.content = [
+			{ type: "text", text: candidate.superseded ? SUPERSEDED_NOTICE : createPrunedNotice(candidate.tokens) },
+		];
 		message.prunedAt = prunedAt;
 		prunedCount++;
 	}
 	return { prunedCount, tokensSaved };
 }
+/**
+ * Supersede key for the `read` tool: the file path with the trailing line/raw
+ * selector stripped (the read tool's own splitter grammar via
+ * {@link splitReadSelector}, e.g. `src/foo.ts:50-200`, `:2-4:raw`).
+ * Internal/URL-scheme paths (`skill://…`, `https://…`) are exempt.
+ * Selector-free reads key on the bare path; selector-carrying reads key on
+ * `path + "\u0000" + selector`, so two reads collide only when the newer is
+ * selector-free or the selectors are identical (the pass's prefix rule lets a
+ * bare-path read supersede selector-carrying reads of the same file).
+ */
+export function readToolSupersedeKey(toolName: string, args: Record<string, unknown>): string | undefined {
+	if (toolName !== "read") return undefined;
+	const path = args.path;
+	if (typeof path !== "string" || path.length === 0) return undefined;
+	if (path.includes("://")) return undefined;
+	const { path: base, sel } = splitReadSelector(path);
+	return sel === undefined ? base : `${base}\u0000${sel}`;
+}
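
The keying rule above can be exercised in isolation. The sketch below copies the selector grammar and `splitReadSelector` from the `utils.ts` hunk of this diff and `readToolSupersedeKey` from the hunk above, so it runs without the package. The `supersedes` helper is hypothetical: it restates the prefix check from `collectSupersededResults` as a pairwise predicate for illustration only.

```typescript
// Grammar copied from the utils.ts hunk in this diff.
const RANGE_CHUNK_SRC = String.raw`L?\d+(?:(?:[-+]|\.\.)L?\d+|-|\.\.)?`;
const RANGE_LIST_SRC = `${RANGE_CHUNK_SRC}(?:,${RANGE_CHUNK_SRC})*`;
const READ_SELECTOR_RE = new RegExp(`^(?:${RANGE_LIST_SRC}|raw|conflicts)$`, "i");
const READ_RANGE_ONLY_RE = new RegExp(`^${RANGE_LIST_SRC}$`, "i");
const READ_RAW_ONLY_RE = /^raw$/i;

function splitReadSelector(path: string): { path: string; sel?: string } {
	const colon = path.lastIndexOf(":");
	if (colon <= 0) return { path };
	const candidate = path.slice(colon + 1);
	if (!READ_SELECTOR_RE.test(candidate)) return { path };
	let base = path.slice(0, colon);
	let sel = candidate;
	// Compound trailing selector: `path:1-50:raw` or `path:raw:1-50`.
	const inner = base.lastIndexOf(":");
	if (inner > 0) {
		const innerCandidate = base.slice(inner + 1);
		const innerIsRaw = READ_RAW_ONLY_RE.test(innerCandidate);
		const outerIsRaw = READ_RAW_ONLY_RE.test(candidate);
		const innerIsRange = READ_RANGE_ONLY_RE.test(innerCandidate);
		const outerIsRange = READ_RANGE_ONLY_RE.test(candidate);
		if ((innerIsRaw && outerIsRange) || (innerIsRange && outerIsRaw)) {
			sel = `${innerCandidate}:${candidate}`;
			base = base.slice(0, inner);
		}
	}
	return { path: base, sel };
}

function readToolSupersedeKey(toolName: string, args: Record<string, unknown>): string | undefined {
	if (toolName !== "read") return undefined;
	const path = args.path;
	if (typeof path !== "string" || path.length === 0) return undefined;
	if (path.includes("://")) return undefined;
	const { path: base, sel } = splitReadSelector(path);
	return sel === undefined ? base : `${base}\u0000${sel}`;
}

// Hypothetical helper: would a newer result with `newerKey` supersede an older one with `olderKey`?
// Identical keys collide; a bare-path key also supersedes any `path + "\u0000" + selector` key.
function supersedes(newerKey: string, olderKey: string): boolean {
	if (newerKey === olderKey) return true;
	const separator = olderKey.indexOf("\u0000");
	return separator >= 0 && olderKey.slice(0, separator) === newerKey;
}
```

Note the asymmetry: a newer ranged read does not supersede an older full read of the same file, since the older result still holds content the newer one lacks.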

package/src/compaction/utils.ts CHANGED

@@ -3,7 +3,7 @@
  */
 import type { Message } from "@oh-my-pi/pi-ai";
-import { prompt } from "@oh-my-pi/pi-utils";
+import { formatGroupedPaths, prompt } from "@oh-my-pi/pi-utils";
 import type { AgentMessage } from "../types";
 import fileOperationsTemplate from "./prompts/file-operations.md" with { type: "text" };
 import summarizationSystemPrompt from "./prompts/summarization-system.md" with { type: "text" };
@@ -26,6 +26,55 @@ export function createFileOps(): FileOperations {
 	};
 }
+// Read-tool selector grammar, mirrored from the conservative filesystem splitter in
+// packages/coding-agent/src/tools/path-utils.ts (splitPathAndSel). Keep in sync.
+// A trailing `:chunk` is a selector only when it is a line-range list
+// (`50`, `50-200`, `50+10`, `5-16,960-973`, `..` alias), `raw`, or `conflicts` —
+// alone or as a `range:raw` / `raw:range` compound.
+const RANGE_CHUNK_SRC = String.raw`L?\d+(?:(?:[-+]|\.\.)L?\d+|-|\.\.)?`;
+const RANGE_LIST_SRC = `${RANGE_CHUNK_SRC}(?:,${RANGE_CHUNK_SRC})*`;
+const READ_SELECTOR_RE = new RegExp(`^(?:${RANGE_LIST_SRC}|raw|conflicts)$`, "i");
+const READ_RANGE_ONLY_RE = new RegExp(`^${RANGE_LIST_SRC}$`, "i");
+const READ_RAW_ONLY_RE = /^raw$/i;
+/**
+ * Split a read-tool path into its base path and trailing selector, mirroring the
+ * read tool's own splitter. Single source of the grammar in this package: the
+ * file-operations list strips selectors via {@link stripReadSelector}, and the
+ * supersede-prune pass keys on both parts via `readToolSupersedeKey`.
+ */
+export function splitReadSelector(path: string): { path: string; sel?: string } {
+	const colon = path.lastIndexOf(":");
+	if (colon <= 0) return { path };
+	const candidate = path.slice(colon + 1);
+	if (!READ_SELECTOR_RE.test(candidate)) return { path };
+	let base = path.slice(0, colon);
+	let sel = candidate;
+	// Compound trailing selector: `path:1-50:raw` or `path:raw:1-50`.
+	const inner = base.lastIndexOf(":");
+	if (inner > 0) {
+		const innerCandidate = base.slice(inner + 1);
+		const innerIsRaw = READ_RAW_ONLY_RE.test(innerCandidate);
+		const outerIsRaw = READ_RAW_ONLY_RE.test(candidate);
+		const innerIsRange = READ_RANGE_ONLY_RE.test(innerCandidate);
+		const outerIsRange = READ_RANGE_ONLY_RE.test(candidate);
+		if ((innerIsRaw && outerIsRange) || (innerIsRange && outerIsRaw)) {
+			sel = `${innerCandidate}:${candidate}`;
+			base = base.slice(0, inner);
+		}
+	}
+	return { path: base, sel };
+}
+/**
+ * Strip a trailing read-tool selector (`:50-200`, `:raw`, `:1-50:raw`, `:conflicts`, …)
+ * so the same file read with different line ranges dedupes to one `<files>` entry
+ * and matches its write/edit path when computing Read/Write/RW markers.
+ */
+export function stripReadSelector(path: string): string {
+	return splitReadSelector(path).path;
+}
 /**
  * Extract file operations from tool calls in an assistant message.
  */
@@ -46,7 +95,7 @@ export function extractFileOpsFromMessage(message: AgentMessage, fileOps: FileOp
 		switch (block.name) {
 			case "read":
-				fileOps.read.add(path);
+				fileOps.read.add(stripReadSelector(path));
 				break;
 			case "write":
 				fileOps.written.add(path);
@@ -70,32 +119,48 @@ export function computeFileLists(fileOps: FileOperations): { readFiles: string[]
 }
 /**
- * Format file operations as XML tags for summary.
+ * Format file operations as one `<files>` tag: a grouped, prefix-folded
+ * directory tree (find-tool shape — `# dir/` headers, bare basenames) with a
+ * ` (Read)` / ` (Write)` / ` (RW)` marker per file instead of separate
+ * read/modified lists. `readSet` is the cumulative read set (`fileOps.read`),
+ * used to tell modified files that were also read (RW) from blind writes.
  */
 const FILE_OPERATION_SUMMARY_LIMIT = 20;
-function truncateFileList(files: string[]): string[] {
-	if (files.length <= FILE_OPERATION_SUMMARY_LIMIT) return files;
-	const omitted = files.length - FILE_OPERATION_SUMMARY_LIMIT;
-	return [...files.slice(0, FILE_OPERATION_SUMMARY_LIMIT), `… (${omitted} more files omitted)`];
-}
 function stripFileOperationTags(summary: string): string {
-	const withoutReadFiles = summary.replace(/<read-files>[\s\S]*?<\/read-files>\s*/g, "");
-	const withoutModifiedFiles = withoutReadFiles.replace(/<modified-files>[\s\S]*?<\/modified-files>\s*/g, "");
-	return withoutModifiedFiles.trimEnd();
+	// Legacy <read-files>/<modified-files> tags are still stripped so summaries
+	// written before the combined <files> tag self-heal on the next compaction.
+	return summary
+		.replace(/<files>[\s\S]*?<\/files>\s*/g, "")
+		.replace(/<read-files>[\s\S]*?<\/read-files>\s*/g, "")
+		.replace(/<modified-files>[\s\S]*?<\/modified-files>\s*/g, "")
+		.trimEnd();
 }
-export function formatFileOperations(readFiles: string[], modifiedFiles: string[]): string {
+export function formatFileOperations(
+	readFiles: string[],
+	modifiedFiles: string[],
+	readSet?: ReadonlySet<string>,
+): string {
 	if (readFiles.length === 0 && modifiedFiles.length === 0) return "";
-	return prompt.render(fileOperationsTemplate, {
-		readFiles: truncateFileList(readFiles),
-		modifiedFiles: truncateFileList(modifiedFiles),
-	});
+	const mode = new Map<string, "Read" | "Write" | "RW">();
+	for (const file of readFiles) mode.set(file, "Read");
+	for (const file of modifiedFiles) mode.set(file, readSet?.has(file) ? "RW" : "Write");
+	const all = [...mode.keys()].sort();
+	let files = formatGroupedPaths(all.slice(0, FILE_OPERATION_SUMMARY_LIMIT), path => ` (${mode.get(path)})`);
+	if (all.length > FILE_OPERATION_SUMMARY_LIMIT) {
+		files += `\n… (${all.length - FILE_OPERATION_SUMMARY_LIMIT} more files omitted)`;
+	}
+	return prompt.render(fileOperationsTemplate, { files });
 }
-export function upsertFileOperations(summary: string, readFiles: string[], modifiedFiles: string[]): string {
+export function upsertFileOperations(
+	summary: string,
+	readFiles: string[],
+	modifiedFiles: string[],
+	readSet?: ReadonlySet<string>,
+): string {
 	const baseSummary = stripFileOperationTags(summary);
-	const fileOperations = formatFileOperations(readFiles, modifiedFiles);
+	const fileOperations = formatFileOperations(readFiles, modifiedFiles, readSet);
 	if (!fileOperations) return baseSummary;
 	if (!baseSummary) return fileOperations;
 	return `${baseSummary}\n\n${fileOperations}`;
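
The Read/Write/RW classification in the new `formatFileOperations` can be illustrated on its own. The sketch below mirrors the mode map from the hunk above; `classifyFileModes` and `renderFlat` are hypothetical names, and the flat one-line-per-file rendering is a simplification that replaces the package's `formatGroupedPaths` directory grouping, whose output shape is not shown in this diff.

```typescript
type FileMode = "Read" | "Write" | "RW";

// Mirrors the mode map built in formatFileOperations: modified files overwrite the
// "Read" marker, and become "RW" only when the cumulative read set contains them.
function classifyFileModes(
	readFiles: string[],
	modifiedFiles: string[],
	readSet?: ReadonlySet<string>,
): Map<string, FileMode> {
	const mode = new Map<string, FileMode>();
	for (const file of readFiles) mode.set(file, "Read");
	for (const file of modifiedFiles) mode.set(file, readSet?.has(file) ? "RW" : "Write");
	return mode;
}

// Simplified flat rendering (assumption: stands in for formatGroupedPaths).
function renderFlat(mode: Map<string, FileMode>, limit = 20): string {
	const all = Array.from(mode.keys()).sort();
	const lines = all.slice(0, limit).map(path => `${path} (${mode.get(path)})`);
	if (all.length > limit) lines.push(`… (${all.length - limit} more files omitted)`);
	return lines.join("\n");
}
```

Without a `readSet`, every modified file is marked `Write`, which is why both call sites in this diff now pass `fileOps.read` as the fourth argument.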

package/src/types.ts CHANGED

@@ -3,6 +3,7 @@ import type {
 	AssistantMessage,
 	AssistantMessageEvent,
 	AssistantMessageEventStream,
+	Context,
 	Effort,
 	ImageContent,
 	Message,
@@ -107,6 +108,13 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
 	 */
 	transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
+	/**
+	 * Optional transform applied to the final provider context after conversion,
+	 * normalization, and append-only context handling, but before telemetry capture
+	 * and provider send.
+	 */
+	transformProviderContext?: (context: Context) => Context;
 	/**
 	 * Resolves an API key dynamically for each LLM call.
 	 *
@@ -210,6 +218,15 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
 	 */
 	getReasoning?: () => Effort | undefined;
+	/**
+	 * Dynamic reasoning-disable override, resolved per LLM call. When set,
+	 * its return value overrides the static `disableReasoning` from
+	 * `SimpleStreamOptions` for that request. Pair with `getReasoning` so
+	 * mid-run transitions into and out of the explicit `off` state propagate
+	 * to the next provider call.
+	 */
+	getDisableReasoning?: () => boolean | undefined;
 	/**
 	 * Called after a tool call has been validated and is about to execute.
 	 *
@@ -358,6 +375,7 @@ export interface AgentState {
 	systemPrompt: string[];
 	model: Model;
 	thinkingLevel?: Effort;
+	disableReasoning?: boolean;
 	tools: AgentTool<any>[];
 	messages: AgentMessage[]; // Can include attachments + custom message types
 	isStreaming: boolean;
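
How a per-call getter such as `getDisableReasoning` might be resolved against the static option can be sketched as below. The `agent-loop.ts` changes are not part of this section, so the resolution logic is an assumption: `ReasoningConfig` and `resolveDisableReasoning` are hypothetical names, and falling back to the static value when the getter returns `undefined` is a guess, not confirmed behavior.

```typescript
// Simplified stand-in; the package's AgentLoopConfig carries many more fields.
interface ReasoningConfig {
	disableReasoning?: boolean;
	getDisableReasoning?: () => boolean | undefined;
}

// Assumption: an undefined getter result falls back to the static option.
function resolveDisableReasoning(config: ReasoningConfig): boolean | undefined {
	return config.getDisableReasoning?.() ?? config.disableReasoning;
}

// Mutable state read lazily by the getter, as Agent does with #state.disableReasoning.
const state: { disableReasoning?: boolean } = { disableReasoning: false };
const config: ReasoningConfig = {
	disableReasoning: false,
	getDisableReasoning: () => state.disableReasoning,
};
```

Because the getter closes over mutable state, flipping `state.disableReasoning` between calls changes what the next call to `resolveDisableReasoning(config)` returns without rebuilding the config.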