@oh-my-pi/pi-agent-core 15.10.11 → 15.11.0
- package/CHANGELOG.md +38 -1
- package/dist/types/agent.d.ts +7 -1
- package/dist/types/compaction/compaction.d.ts +4 -4
- package/dist/types/compaction/messages.d.ts +14 -2
- package/dist/types/compaction/pruning.d.ts +48 -0
- package/dist/types/compaction/utils.d.ts +18 -2
- package/dist/types/types.d.ts +16 -1
- package/package.json +6 -5
- package/src/agent-loop.ts +26 -7
- package/src/agent.ts +17 -0
- package/src/compaction/branch-summarization.ts +5 -4
- package/src/compaction/compaction.ts +28 -11
- package/src/compaction/messages.ts +78 -64
- package/src/compaction/prompts/file-operations.md +3 -8
- package/src/compaction/pruning.ts +174 -8
- package/src/compaction/utils.ts +84 -19
- package/src/types.ts +18 -0
package/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,41 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [15.11.0] - 2026-06-10
|
|
6
|
+
### Breaking Changes
|
|
7
|
+
|
|
8
|
+
- Removed `compaction/index.ts` re-export of snapcompact helpers, so snapcompact utilities are no longer available from the agent compaction barrel and should be imported from `@oh-my-pi/snapcompact`
|
|
9
|
+
- Removed the `convertToLlm` alias export from `compaction/messages` — it duplicated `defaultConvertToLlm` under a second name. Import `defaultConvertToLlm` (array form) or the new `convertMessageToLlm` (single-message form) instead
|
|
10
|
+
|
|
11
|
+
### Added
|
|
12
|
+
|
|
13
|
+
- Added `convertMessageToLlm()`: the single-message core transformer behind `defaultConvertToLlm()`. Embedders with app-specific message roles should handle their own roles and delegate every core role (`user`/`developer`/`assistant`/`toolResult`/`custom`/`hookMessage`/`branchSummary`/`compactionSummary`) to it instead of duplicating the conversion — a duplicated `compactionSummary` case is how snapcompact frames once silently dropped off provider requests
|
|
14
|
+
- Added `pruneSupersededToolResults()` and the opt-in `PruneConfig.supersedeKey` hook so harnesses can prune stale tool results superseded by a newer read of the same file; superseded results are pruned ahead of age-based victims during overflow pruning and replaced with a `[Superseded by a newer read of this file]` placeholder. Without the new config, `pruneToolOutputs()` behavior is unchanged.
|
|
15
|
+
- Added `readToolSupersedeKey()` implementing the read-tool path/selector grammar (selector-free reads supersede range reads of the same file; URL-scheme paths exempt). Pruning honors prompt-cache economics: per-turn prunes only fire when the post-candidate suffix is small or the cache is cold (idle gap).
|
|
16
|
+
- Added the `snapcompact` compaction strategy via `@oh-my-pi/snapcompact`: instead of an LLM summary, discarded history is printed onto dense bitmap frames and re-attached to the compaction summary message as image blocks. `CompactionSummaryMessage` gains an optional `images` field, `estimateTokens()` charges per attached frame, and frames persist under `preserveData.snapcompact` with an 8-frame middle-out eviction budget.
|
|
17
|
+
- Snapcompact frames are now rendered in a provider-aware shape (`SNAPCOMPACT_SHAPES` + `resolveSnapcompactShape(api)`), following the snapcompact 200k-token monolithic evals: Anthropic-family and unknown APIs get `8x8r-bw` (unscii-8 square cells, black ink, every line printed twice with the copy on a pale highlight band — read at F1 parity with raw text at ~2x lower cost and the most refusal-robust), Google gets `8x8r-sent` (sentence-hue ink, ~2.9x cheaper), and OpenAI gets `6x6u-sent` (unscii Lanczos-stretched to 6x6 cells — OpenAI bills a flat ~2.9k tokens per image, so frame count is the only cost lever) with `detail: "original"` on the frame images. `snapcompactCompact()` accepts `model`/`shape` options, frames persist their shape metadata, mixed-shape archives (provider switches, legacy 5x8 frames) are flagged in the reading instructions, and `snapcompactGeometry()`/`renderSnapcompactFrame()` now take a shape
|
|
18
|
+
|
|
19
|
+
### Changed
|
|
20
|
+
|
|
21
|
+
- Compaction and branch-summary file lists are now a single `<files>` tag instead of `<read-files>`/`<modified-files>`: paths render as the grouped, prefix-folded directory tree the find/search tools emit (`# dir/` headers, bare basenames), each annotated `(Read)`, `(Write)`, or `(RW)` — modified files that were also read get `(RW)`. Legacy tags in summaries written by earlier versions are still stripped and self-heal on the next compaction
|
|
22
|
+
|
|
23
|
+
### Fixed
|
|
24
|
+
|
|
25
|
+
- Fixed queued steering messages being drained into an externally aborted run: interrupting mid-tool execution (e.g. Enter with a pending steer) dequeued the steer into the dying run — it landed in history without a response and the post-abort resume saw an empty queue, so the agent stopped instead of continuing. Steering/follow-up/aside queue polls are now skipped once the run's abort signal fires, leaving the queue intact for `Agent.continue()`.
|
|
26
|
+
- Fixed `<read-files>` compaction lists recording the same file once per line-range/raw selector (`src/foo.ts:50-200`, `:raw`, `:1-50:raw`, …): read-tool selectors are now stripped before tracking, so reads dedupe to the base path and match their write/edit path when splitting read-only vs modified lists. Selector-polluted lists stored by earlier compactions self-heal on the next compaction. `readToolSupersedeKey()` now shares the same splitter (`splitReadSelector()`), gaining the `..` range alias and `L`-prefix forms it previously missed.
|
|
27
|
+
- Fixed `estimateTokens()` undercounting thinking-heavy assistant messages on replay: `thinkingSignature` payloads (OpenAI Responses encrypted reasoning items, Anthropic signed thinking blocks, etc.) and `redactedThinking.data` are now charged alongside the visible thinking text, so the local estimate tracks provider-reported usage instead of straddling the threshold on every turn ([#2275](https://github.com/can1357/oh-my-pi/issues/2275)).
|
|
28
|
+
|
|
29
|
+
## [15.10.12] - 2026-06-10
|
|
30
|
+
|
|
31
|
+
### Added
|
|
32
|
+
|
|
33
|
+
- Added `AgentLoopConfig.getDisableReasoning` so callers can override `disableReasoning` per LLM call, mirroring `getReasoning`.
|
|
34
|
+
- Added `transformProviderContext` to `AgentOptions`/`AgentLoopConfig`: an optional hook applied to the assembled provider context after conversion, normalization, and append-only handling, but before telemetry capture and provider send.
|
|
35
|
+
|
|
36
|
+
### Fixed
|
|
37
|
+
|
|
38
|
+
- Fixed `Agent` runs so explicit reasoning disablement is forwarded to provider stream options and re-resolved per continuation, keeping mid-run thinking-off changes in sync with the next provider request.
|
|
39
|
+
|
|
5
40
|
## [15.10.11] - 2026-06-10
|
|
6
41
|
|
|
7
42
|
### Changed
|
|
@@ -10,6 +45,7 @@
|
|
|
10
45
|
- Catalog imports moved to the new `@oh-my-pi/pi-catalog` package: subpath imports (`calculateCost`, Codex wire constants) plus catalog values previously taken from the `@oh-my-pi/pi-ai` root (`getBundledModel`, `clampThinkingLevelForModel`), which pi-ai no longer re-exports; type-only `Model`/`Api`/`Effort` imports from pi-ai are unchanged
|
|
11
46
|
|
|
12
47
|
## [15.10.8] - 2026-06-09
|
|
48
|
+
|
|
13
49
|
### Added
|
|
14
50
|
|
|
15
51
|
- Added optional `fetch` overrides to `SummaryOptions` and `compact`/`generateSummary` so remote compaction can use custom HTTP clients
|
|
@@ -18,6 +54,7 @@
|
|
|
18
54
|
- Added the upstream provider that served a request (`AssistantMessage.upstreamProvider`, e.g. OpenRouter's routed provider) as a `pi.gen_ai.response.upstream_provider` chat-span telemetry attribute, alongside the existing response id and time-to-first-chunk.
|
|
19
55
|
|
|
20
56
|
## [15.10.5] - 2026-06-08
|
|
57
|
+
|
|
21
58
|
### Removed
|
|
22
59
|
|
|
23
60
|
- Removed the `maxToolCallsPerTurn` option from `AgentOptions` and `AgentLoopConfig`, so assistant turns are no longer capped after a configured number of completed tool calls
|
|
@@ -55,7 +92,6 @@
|
|
|
55
92
|
- Removed stale synthetic user-message tag filters from OpenAI remote compaction output preservation; developer messages are now dropped by role instead.
|
|
56
93
|
- Tool executions now receive the active turn `AbortSignal` unconditionally.
|
|
57
94
|
|
|
58
|
-
|
|
59
95
|
## [15.10.2] - 2026-06-08
|
|
60
96
|
|
|
61
97
|
### Fixed
|
|
@@ -87,6 +123,7 @@
|
|
|
87
123
|
- Surfaced Anthropic stream failures whose message starts with `Output blocked by conten` as normal assistant error lifecycle events, so interactive clients render content-filter blocks instead of silently dropping the streaming bubble at `agent_end`.
|
|
88
124
|
|
|
89
125
|
## [15.8.3] - 2026-06-03
|
|
126
|
+
|
|
90
127
|
### Added
|
|
91
128
|
|
|
92
129
|
- Added `getReadToolPath(context)` to `@oh-my-pi/pi-agent-core/compaction/tool-protection` to extract a paired `read` tool call's `path` for embedders building read-targeted protection matchers
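The 15.11.0 entries above recommend that embedders handle only their own roles and delegate every core role to `convertMessageToLlm()`. The following is a minimal sketch of that delegation pattern, not the package's implementation: `LlmMessage`, `CoreMessage`, `AppMessage`, the `notification` role, and `coreConvert` are hypothetical stand-ins so the block runs on its own. In real code, `coreConvert` would be replaced by the `convertMessageToLlm` export.

```typescript
// Stand-in types (assumptions): the real AgentMessage/Message types come from the packages.
type LlmMessage = { role: "user" | "assistant"; content: string };
type CoreMessage =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string }
  | { role: "compactionSummary"; summary: string };
// Hypothetical app-specific role owned by the embedder.
type AppMessage = CoreMessage | { role: "notification"; text: string };

// Stand-in for convertMessageToLlm(): returning undefined drops the message.
function coreConvert(message: CoreMessage): LlmMessage | undefined {
  switch (message.role) {
    case "user":
    case "assistant":
      return { role: message.role, content: message.content };
    case "compactionSummary":
      return { role: "user", content: `<summary>${message.summary}</summary>` };
    default:
      return undefined;
  }
}

// Embedder transformer: convert only the app role, delegate everything else.
function appConvertToLlm(messages: AppMessage[]): LlmMessage[] {
  const out: LlmMessage[] = [];
  for (const message of messages) {
    const converted =
      message.role === "notification"
        ? { role: "user" as const, content: `[notification] ${message.text}` }
        : coreConvert(message);
    if (converted) out.push(converted);
  }
  return out;
}
```

Because the embedder never re-implements a core case, a change to how a core role is converted (for example, attaching images to a compaction summary) cannot silently diverge between the two transformers.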
|
package/dist/types/agent.d.ts
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
import { type ApiKeyResolveContext, type AssistantMessage, type AssistantMessageEvent, type CursorExecHandlers, type CursorToolResultHandler, type Effort, type ImageContent, type Message, type Model, type ProviderSessionState, type ServiceTier, type SimpleStreamOptions, type ThinkingBudgets, type ToolChoice } from "@oh-my-pi/pi-ai";
|
|
1
|
+
import { type ApiKeyResolveContext, type AssistantMessage, type AssistantMessageEvent, type Context, type CursorExecHandlers, type CursorToolResultHandler, type Effort, type ImageContent, type Message, type Model, type ProviderSessionState, type ServiceTier, type SimpleStreamOptions, type ThinkingBudgets, type ToolChoice } from "@oh-my-pi/pi-ai";
|
|
2
2
|
import type { AppendOnlyContextManager } from "./append-only-context";
|
|
3
3
|
import type { HarmonyAuditEvent } from "./harmony-leak";
|
|
4
4
|
import type { AgentEvent, AgentLoopConfig, AgentMessage, AgentState, AgentTool, AgentToolContext, AsideMessage, StreamFn, ToolCallContext } from "./types";
|
|
@@ -17,6 +17,11 @@ export interface AgentOptions {
|
|
|
17
17
|
* Use for context pruning, injecting external context, etc.
|
|
18
18
|
*/
|
|
19
19
|
transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
|
|
20
|
+
/**
|
|
21
|
+
* Optional transform applied after provider context assembly and before
|
|
22
|
+
* telemetry capture/provider send.
|
|
23
|
+
*/
|
|
24
|
+
transformProviderContext?: (context: Context) => Context;
|
|
20
25
|
/**
|
|
21
26
|
* Steering mode: "all" = send all steering messages at once, "one-at-a-time" = one per turn
|
|
22
27
|
*/
|
|
@@ -294,6 +299,7 @@ export declare class Agent {
|
|
|
294
299
|
setSystemPrompt(v: string[]): void;
|
|
295
300
|
setModel(m: Model): void;
|
|
296
301
|
setThinkingLevel(l: Effort | undefined): void;
|
|
302
|
+
setDisableReasoning(disabled: boolean): void;
|
|
297
303
|
setSteeringMode(mode: "all" | "one-at-a-time"): void;
|
|
298
304
|
getSteeringMode(): "all" | "one-at-a-time";
|
|
299
305
|
setFollowUpMode(mode: "all" | "one-at-a-time"): void;
|
|
@@ -4,10 +4,10 @@
|
|
|
4
4
|
* Pure functions for compaction logic. The session manager handles I/O,
|
|
5
5
|
* and after compaction the session is reloaded.
|
|
6
6
|
*/
|
|
7
|
-
import { type FetchImpl, type MessageAttribution, type Model, type Usage } from "@oh-my-pi/pi-ai";
|
|
7
|
+
import { type FetchImpl, type MessageAttribution, type Model, type Tool, type Usage } from "@oh-my-pi/pi-ai";
|
|
8
8
|
import { type AgentTelemetry } from "../telemetry";
|
|
9
9
|
import { ThinkingLevel } from "../thinking";
|
|
10
|
-
import type { AgentMessage
|
|
10
|
+
import type { AgentMessage } from "../types";
|
|
11
11
|
import type { SessionEntry } from "./entries";
|
|
12
12
|
import { type ConvertToLlm } from "./messages";
|
|
13
13
|
import { type FileOperations } from "./utils";
|
|
@@ -30,7 +30,7 @@ export interface CompactionResult<T = unknown> {
|
|
|
30
30
|
}
|
|
31
31
|
export interface CompactionSettings {
|
|
32
32
|
enabled: boolean;
|
|
33
|
-
strategy?: "context-full" | "handoff" | "shake" | "off";
|
|
33
|
+
strategy?: "context-full" | "handoff" | "shake" | "snapcompact" | "off";
|
|
34
34
|
thresholdPercent?: number;
|
|
35
35
|
thresholdTokens?: number;
|
|
36
36
|
reserveTokens: number;
|
|
@@ -133,7 +133,7 @@ export interface HandoffOptions {
|
|
|
133
133
|
/** Live agent system prompt — passed verbatim so providers hit the cached prefix. */
|
|
134
134
|
systemPrompt: string[];
|
|
135
135
|
/** Live agent tool list — same purpose. Forced to `toolChoice: "none"`. */
|
|
136
|
-
tools?:
|
|
136
|
+
tools?: Tool[];
|
|
137
137
|
customInstructions?: string;
|
|
138
138
|
convertToLlm?: ConvertToLlm;
|
|
139
139
|
initiatorOverride?: MessageAttribution;
|
|
@@ -33,6 +33,8 @@ export interface CompactionSummaryMessage {
|
|
|
33
33
|
shortSummary?: string;
|
|
34
34
|
tokensBefore: number;
|
|
35
35
|
providerPayload?: ProviderPayload;
|
|
36
|
+
/** Snapcompact frames archived by this compaction; appended as image blocks after the summary text. */
|
|
37
|
+
images?: ImageContent[];
|
|
36
38
|
timestamp: number;
|
|
37
39
|
}
|
|
38
40
|
export type CoreCompactionMessage = CustomMessage | HookMessage | BranchSummaryMessage | CompactionSummaryMessage;
|
|
@@ -48,8 +50,19 @@ export type ConvertToLlm = (messages: AgentMessage[]) => Message[];
|
|
|
48
50
|
export declare function renderBranchSummaryContext(summary: string): string;
|
|
49
51
|
export declare function renderCompactionSummaryContext(summary: string): string;
|
|
50
52
|
export declare function createBranchSummaryMessage(summary: string, fromId: string, timestamp: string): BranchSummaryMessage;
|
|
51
|
-
export declare function createCompactionSummaryMessage(summary: string, tokensBefore: number, timestamp: string, shortSummary?: string, providerPayload?: ProviderPayload): CompactionSummaryMessage;
|
|
53
|
+
export declare function createCompactionSummaryMessage(summary: string, tokensBefore: number, timestamp: string, shortSummary?: string, providerPayload?: ProviderPayload, images?: ImageContent[]): CompactionSummaryMessage;
|
|
52
54
|
export declare function createCustomMessage(customType: string, content: string | (TextContent | ImageContent)[], display: boolean, details: unknown | undefined, timestamp: string, attribution?: MessageAttribution): CustomMessage;
|
|
55
|
+
/**
|
|
56
|
+
* Transform a single core-domain agent message to its LLM form; `undefined`
|
|
57
|
+
* drops it from the provider request.
|
|
58
|
+
*
|
|
59
|
+
* Single source of truth for the core roles (user/developer/assistant/
|
|
60
|
+
* toolResult) and the compaction messages owned by this package. Embedders
|
|
61
|
+
* with their own app messages (e.g. the coding agent) handle their custom
|
|
62
|
+
* roles and delegate every core role here — duplicating these cases is how
|
|
63
|
+
* snapcompact frames once silently fell off the provider request.
|
|
64
|
+
*/
|
|
65
|
+
export declare function convertMessageToLlm(message: AgentMessage): Message | undefined;
|
|
53
66
|
/**
|
|
54
67
|
* Default compaction-domain transformer.
|
|
55
68
|
*
|
|
@@ -58,4 +71,3 @@ export declare function createCustomMessage(customType: string, content: string
|
|
|
58
71
|
* core LLM roles and the compaction messages owned by this package.
|
|
59
72
|
*/
|
|
60
73
|
export declare function defaultConvertToLlm(messages: AgentMessage[]): Message[];
|
|
61
|
-
export declare const convertToLlm: typeof defaultConvertToLlm;
|
package/dist/types/compaction/pruning.d.ts
CHANGED
|
@@ -10,10 +10,58 @@ export interface PruneConfig {
|
|
|
10
10
|
minimumSavings: number;
|
|
11
11
|
/** Tool-result protection matchers. String entries protect every result from that tool; predicates may inspect the paired tool call. */
|
|
12
12
|
protectedTools: ProtectedToolMatcher[];
|
|
13
|
+
/**
|
|
14
|
+
* Optional supersede key function (see {@link SupersedePruneConfig.supersedeKey}).
|
|
15
|
+
* When provided, superseded tool results are pruned first — even inside the
|
|
16
|
+
* `protectTokens` window — before age-based victims. Absent, behavior is
|
|
17
|
+
* unchanged.
|
|
18
|
+
*/
|
|
19
|
+
supersedeKey?: SupersedeKeyFn;
|
|
13
20
|
}
|
|
14
21
|
export declare const DEFAULT_PRUNE_CONFIG: PruneConfig;
|
|
15
22
|
export interface PruneResult {
|
|
16
23
|
prunedCount: number;
|
|
17
24
|
tokensSaved: number;
|
|
18
25
|
}
|
|
26
|
+
/** Exact placeholder written over a superseded tool result. */
|
|
27
|
+
export declare const SUPERSEDED_NOTICE = "[Superseded by a newer read of this file]";
|
|
28
|
+
/**
|
|
29
|
+
* Maps a tool call to a supersede key. Results sharing a key form a group in
|
|
30
|
+
* which every result except the newest is a supersede candidate. A key `K`
|
|
31
|
+
* additionally supersedes keys with prefix `K + "\u0000"` (selector-free read
|
|
32
|
+
* supersedes selector-carrying reads of the same base path). Return
|
|
33
|
+
* `undefined` to exempt a call from supersede grouping.
|
|
34
|
+
*/
|
|
35
|
+
export type SupersedeKeyFn = (toolName: string, args: Record<string, unknown>) => string | undefined;
|
|
36
|
+
export interface SupersedePruneConfig {
|
|
37
|
+
/** Supersede key function; results sharing a key supersede older ones. */
|
|
38
|
+
supersedeKey: SupersedeKeyFn;
|
|
39
|
+
/** Prune a candidate now when all messages after it total at most this many estimated tokens. Default 8 000. */
|
|
40
|
+
suffixTokenLimit?: number;
|
|
41
|
+
/** Prune all candidates when the last message is at least this old (prompt cache is cold anyway). Default 30 min. */
|
|
42
|
+
idleFlushMs?: number;
|
|
43
|
+
/** Clock override for tests. */
|
|
44
|
+
now?: number;
|
|
45
|
+
/** Tool-result protection matchers (same contract as {@link PruneConfig.protectedTools}). */
|
|
46
|
+
protectedTools: ProtectedToolMatcher[];
|
|
47
|
+
}
|
|
48
|
+
/**
|
|
49
|
+
* Prune superseded tool results (e.g. stale `read` outputs replaced by a newer
|
|
50
|
+
* read of the same file). Cheap, incremental, and prompt-cache-aware: a
|
|
51
|
+
* candidate is pruned now only when the suffix after it is small (tail case —
|
|
52
|
+
* the read→edit→read loop) or when the context has been idle long enough that
|
|
53
|
+
* the provider cache is cold anyway (then ALL candidates flush).
|
|
54
|
+
*/
|
|
55
|
+
export declare function pruneSupersededToolResults(entries: SessionEntry[], config: SupersedePruneConfig): PruneResult;
|
|
19
56
|
export declare function pruneToolOutputs(entries: SessionEntry[], config?: PruneConfig): PruneResult;
|
|
57
|
+
/**
|
|
58
|
+
* Supersede key for the `read` tool: the file path with the trailing line/raw
|
|
59
|
+
* selector stripped (the read tool's own splitter grammar via
|
|
60
|
+
* {@link splitReadSelector}, e.g. `src/foo.ts:50-200`, `:2-4:raw`).
|
|
61
|
+
* Internal/URL-scheme paths (`skill://…`, `https://…`) are exempt.
|
|
62
|
+
* Selector-free reads key on the bare path; selector-carrying reads key on
|
|
63
|
+
* `path + "\u0000" + selector`, so two reads collide only when the newer is
|
|
64
|
+
* selector-free or the selectors are identical (the pass's prefix rule lets a
|
|
65
|
+
* bare-path read supersede selector-carrying reads of the same file).
|
|
66
|
+
*/
|
|
67
|
+
export declare function readToolSupersedeKey(toolName: string, args: Record<string, unknown>): string | undefined;
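The keying and prefix rules described in the doc comments above can be illustrated with a small sketch. It assumes a deliberately simplified selector grammar (only a trailing `:raw`, `:A-B`, or `:A-B:raw`); the package's `splitReadSelector()` covers more forms. `sketchReadKey` and `supersededIndices` are hypothetical names, and the grouping loop is only one possible reading of the documented rule, without the prompt-cache gating.

```typescript
type SupersedeKeyFn = (toolName: string, args: Record<string, unknown>) => string | undefined;

// Simplified selector grammar (assumption): trailing `:raw`, `:A-B`, or `:A-B:raw`.
const SELECTOR = /:(\d+-\d+(?::raw)?|raw)$/;
const URL_SCHEME = /^[a-z][a-z0-9+.-]*:\/\//i;

const sketchReadKey: SupersedeKeyFn = (toolName, args) => {
  if (toolName !== "read" || typeof args.path !== "string") return undefined;
  if (URL_SCHEME.test(args.path)) return undefined; // skill://…, https://… are exempt
  const match = SELECTOR.exec(args.path);
  if (!match) return args.path; // selector-free read keys on the bare path
  const base = args.path.slice(0, match.index);
  return `${base}\u0000${match[1]}`; // selector-carrying read: path + NUL + selector
};

// Given keys in oldest-first order, return the indices superseded by a newer result.
function supersededIndices(keys: (string | undefined)[]): number[] {
  const superseded: number[] = [];
  for (let i = 0; i < keys.length; i++) {
    const older = keys[i];
    if (older === undefined) continue;
    for (let j = i + 1; j < keys.length; j++) {
      const newer = keys[j];
      if (newer === undefined) continue;
      // Same key, or the newer key is the bare path of the older selector read.
      if (newer === older || older.startsWith(`${newer}\u0000`)) {
        superseded.push(i);
        break;
      }
    }
  }
  return superseded;
}
```

Under this sketch, a ranged read followed by a full read of the same file is superseded, while a full read followed by a ranged read is kept, since the ranged read does not cover the whole file.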
|
package/dist/types/compaction/utils.d.ts
CHANGED
|
@@ -9,6 +9,22 @@ export interface FileOperations {
|
|
|
9
9
|
edited: Set<string>;
|
|
10
10
|
}
|
|
11
11
|
export declare function createFileOps(): FileOperations;
|
|
12
|
+
/**
|
|
13
|
+
* Split a read-tool path into its base path and trailing selector, mirroring the
|
|
14
|
+
* read tool's own splitter. Single source of the grammar in this package: the
|
|
15
|
+
* file-operations list strips selectors via {@link stripReadSelector}, and the
|
|
16
|
+
* supersede-prune pass keys on both parts via `readToolSupersedeKey`.
|
|
17
|
+
*/
|
|
18
|
+
export declare function splitReadSelector(path: string): {
|
|
19
|
+
path: string;
|
|
20
|
+
sel?: string;
|
|
21
|
+
};
|
|
22
|
+
/**
|
|
23
|
+
* Strip a trailing read-tool selector (`:50-200`, `:raw`, `:1-50:raw`, `:conflicts`, …)
|
|
24
|
+
* so the same file read with different line ranges dedupes to one `<files>` entry
|
|
25
|
+
* and matches its write/edit path when computing Read/Write/RW markers.
|
|
26
|
+
*/
|
|
27
|
+
export declare function stripReadSelector(path: string): string;
|
|
12
28
|
/**
|
|
13
29
|
* Extract file operations from tool calls in an assistant message.
|
|
14
30
|
*/
|
|
@@ -21,8 +37,8 @@ export declare function computeFileLists(fileOps: FileOperations): {
|
|
|
21
37
|
readFiles: string[];
|
|
22
38
|
modifiedFiles: string[];
|
|
23
39
|
};
|
|
24
|
-
export declare function formatFileOperations(readFiles: string[], modifiedFiles: string[]): string;
|
|
25
|
-
export declare function upsertFileOperations(summary: string, readFiles: string[], modifiedFiles: string[]): string;
|
|
40
|
+
export declare function formatFileOperations(readFiles: string[], modifiedFiles: string[], readSet?: ReadonlySet<string>): string;
|
|
41
|
+
export declare function upsertFileOperations(summary: string, readFiles: string[], modifiedFiles: string[], readSet?: ReadonlySet<string>): string;
|
|
26
42
|
/**
|
|
27
43
|
* Serialize LLM messages to text for summarization.
|
|
28
44
|
* This prevents the model from treating it as a conversation to continue.
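The changelog describes `(Read)`, `(Write)`, and `(RW)` annotations computed after read selectors are stripped. A rough sketch of that marker computation follows. `stripSelectorSketch` and `markFiles` are hypothetical helpers, the regex handles only a simplified subset of the selector grammar, and the sketch does not reproduce the grouped directory-tree rendering.

```typescript
type Marker = "Read" | "Write" | "RW";

// Simplified selector stripping (assumption): drops a trailing `:raw`, `:A-B`, or `:A-B:raw`.
function stripSelectorSketch(path: string): string {
  return path.replace(/:(\d+-\d+(?::raw)?|raw)$/, "");
}

// Reads dedupe to their base path; a written file that was also read becomes RW.
function markFiles(reads: string[], writes: string[]): Map<string, Marker> {
  const readSet = new Set(reads.map(stripSelectorSketch));
  const markers = new Map<string, Marker>();
  readSet.forEach((path) => markers.set(path, "Read"));
  for (const path of writes) {
    markers.set(path, readSet.has(path) ? "RW" : "Write");
  }
  return markers;
}
```

Stripping before building the set is what lets `src/foo.ts:50-200` and `src/foo.ts:raw` collapse into one entry and match a later write to `src/foo.ts`.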
|
package/dist/types/types.d.ts
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
import type { ApiKeyResolveContext, AssistantMessage, AssistantMessageEvent, AssistantMessageEventStream, Effort, ImageContent, Message, Model, SimpleStreamOptions, Static, streamSimple, TextContent, Tool, ToolChoice, ToolResultMessage, TSchema } from "@oh-my-pi/pi-ai";
|
|
1
|
+
import type { ApiKeyResolveContext, AssistantMessage, AssistantMessageEvent, AssistantMessageEventStream, Context, Effort, ImageContent, Message, Model, SimpleStreamOptions, Static, streamSimple, TextContent, Tool, ToolChoice, ToolResultMessage, TSchema } from "@oh-my-pi/pi-ai";
|
|
2
2
|
import type { AppendOnlyContextManager } from "./append-only-context";
|
|
3
3
|
import type { HarmonyAuditEvent } from "./harmony-leak";
|
|
4
4
|
import type { AgentRunCoverage, AgentRunSummary } from "./run-collector";
|
|
@@ -79,6 +79,12 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
|
|
|
79
79
|
* ```
|
|
80
80
|
*/
|
|
81
81
|
transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
|
|
82
|
+
/**
|
|
83
|
+
* Optional transform applied to the final provider context after conversion,
|
|
84
|
+
* normalization, and append-only context handling, but before telemetry capture
|
|
85
|
+
* and provider send.
|
|
86
|
+
*/
|
|
87
|
+
transformProviderContext?: (context: Context) => Context;
|
|
82
88
|
/**
|
|
83
89
|
* Resolves an API key dynamically for each LLM call.
|
|
84
90
|
*
|
|
@@ -171,6 +177,14 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
|
|
|
171
177
|
* the next model call instead of waiting for the next prompt.
|
|
172
178
|
*/
|
|
173
179
|
getReasoning?: () => Effort | undefined;
|
|
180
|
+
/**
|
|
181
|
+
* Dynamic reasoning-disable override, resolved per LLM call. When set,
|
|
182
|
+
* its return value overrides the static `disableReasoning` from
|
|
183
|
+
* `SimpleStreamOptions` for that request. Pair with `getReasoning` so
|
|
184
|
+
* mid-run transitions into and out of the explicit `off` state propagate
|
|
185
|
+
* to the next provider call.
|
|
186
|
+
*/
|
|
187
|
+
getDisableReasoning?: () => boolean | undefined;
|
|
174
188
|
/**
|
|
175
189
|
* Called after a tool call has been validated and is about to execute.
|
|
176
190
|
*
|
|
@@ -307,6 +321,7 @@ export interface AgentState {
|
|
|
307
321
|
systemPrompt: string[];
|
|
308
322
|
model: Model;
|
|
309
323
|
thinkingLevel?: Effort;
|
|
324
|
+
disableReasoning?: boolean;
|
|
310
325
|
tools: AgentTool<any>[];
|
|
311
326
|
messages: AgentMessage[];
|
|
312
327
|
isStreaming: boolean;
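The `getDisableReasoning` doc comment above says the getter's return value overrides the static `disableReasoning` for that request. A minimal sketch of that per-call resolution, assuming nullish-coalescing semantics as in the `agent-loop.ts` hunk of this diff; `ReasoningConfigSketch` and `resolveDisableReasoning` are hypothetical names:

```typescript
// Hypothetical slice of the loop config, reduced to the two relevant fields.
interface ReasoningConfigSketch {
  disableReasoning?: boolean;
  getDisableReasoning?: () => boolean | undefined;
}

// Resolve once per LLM call: the dynamic value wins unless it is null/undefined.
function resolveDisableReasoning(config: ReasoningConfigSketch): boolean | undefined {
  const dynamic = config.getDisableReasoning?.();
  return dynamic ?? config.disableReasoning;
}
```

Using `??` rather than `||` matters here: a getter returning `false` must be able to override a static `true`, which `||` would not allow.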
|
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"type": "module",
|
|
3
3
|
"name": "@oh-my-pi/pi-agent-core",
|
|
4
|
-
"version": "15.
|
|
4
|
+
"version": "15.11.0",
|
|
5
5
|
"description": "General-purpose agent with transport abstraction, state management, and attachment support",
|
|
6
6
|
"homepage": "https://omp.sh",
|
|
7
7
|
"author": "Can Boluk",
|
|
@@ -35,10 +35,11 @@
|
|
|
35
35
|
"fmt": "biome format --write ."
|
|
36
36
|
},
|
|
37
37
|
"dependencies": {
|
|
38
|
-
"@oh-my-pi/pi-ai": "15.
|
|
39
|
-
"@oh-my-pi/pi-catalog": "15.
|
|
40
|
-
"@oh-my-pi/pi-natives": "15.
|
|
41
|
-
"@oh-my-pi/pi-utils": "15.
|
|
38
|
+
"@oh-my-pi/pi-ai": "15.11.0",
|
|
39
|
+
"@oh-my-pi/pi-catalog": "15.11.0",
|
|
40
|
+
"@oh-my-pi/pi-natives": "15.11.0",
|
|
41
|
+
"@oh-my-pi/pi-utils": "15.11.0",
|
|
42
|
+
"@oh-my-pi/snapcompact": "15.11.0",
|
|
42
43
|
"@opentelemetry/api": "^1.9.1"
|
|
43
44
|
},
|
|
44
45
|
"devDependencies": {
|
package/src/agent-loop.ts
CHANGED
|
@@ -564,8 +564,10 @@ async function runLoopBody(
|
|
|
564
564
|
streamFn?: StreamFn,
|
|
565
565
|
): Promise<void> {
|
|
566
566
|
let firstTurn = true;
|
|
567
|
-
// Check for steering messages at start (user may have typed while waiting)
|
|
568
|
-
|
|
567
|
+
// Check for steering messages at start (user may have typed while waiting).
|
|
568
|
+
// Skip when the run is already externally aborted — dequeuing would strand
|
|
569
|
+
// the messages in a run that is about to die.
|
|
570
|
+
let pendingMessages: AgentMessage[] = signal?.aborted ? [] : (await config.getSteeringMessages?.()) || [];
|
|
569
571
|
let harmonyRetryAttempt = 0;
|
|
570
572
|
let harmonyTruncateResumeCount = 0;
|
|
571
573
|
|
|
@@ -743,7 +745,12 @@ async function runLoopBody(
|
|
|
743
745
|
|
|
744
746
|
stream.push({ type: "turn_end", message, toolResults });
|
|
745
747
|
|
|
746
|
-
|
|
748
|
+
// On external abort (user interrupt), leave the steering queue intact: the
|
|
749
|
+
// session aborts then continues, delivering the queue into a fresh run.
|
|
750
|
+
// Draining it here would inject the messages right before a model call that
|
|
751
|
+
// instantly aborts — message lands in history, agent never responds.
|
|
752
|
+
const steering =
|
|
753
|
+
steeringMessagesFromExecution ?? (signal?.aborted ? [] : (await config.getSteeringMessages?.()) || []);
|
|
747
754
|
if (hasMoreToolCalls) {
|
|
748
755
|
// Mid-work: fold any non-interrupting asides into the next turn alongside steering.
|
|
749
756
|
const asides = resolveAsides(await config.getAsideMessages?.());
|
|
@@ -758,8 +765,9 @@ async function runLoopBody(
|
|
|
758
765
|
|
|
759
766
|
// Agent would stop here. Drain non-interrupting asides + follow-up messages.
|
|
760
767
|
await config.onBeforeYield?.();
|
|
761
|
-
|
|
762
|
-
const
|
|
768
|
+
// Skip queue drains when externally aborted (same stranding hazard as above).
|
|
769
|
+
const asideMessages = signal?.aborted ? [] : resolveAsides(await config.getAsideMessages?.());
|
|
770
|
+
const followUpMessages = signal?.aborted ? [] : (await config.getFollowUpMessages?.()) || [];
|
|
763
771
|
if (asideMessages.length > 0 || followUpMessages.length > 0) {
|
|
764
772
|
// Set as pending so the inner loop processes them before stopping.
|
|
765
773
|
pendingMessages = [...asideMessages, ...followUpMessages];
|
|
@@ -829,6 +837,9 @@ async function streamAssistantResponse(
|
|
|
829
837
|
tools: normalizeTools(context.tools, !!config.intentTracing),
|
|
830
838
|
};
|
|
831
839
|
}
|
|
840
|
+
if (config.transformProviderContext) {
|
|
841
|
+
llmContext = config.transformProviderContext(llmContext);
|
|
842
|
+
}
|
|
832
843
|
|
|
833
844
|
const streamFunction = streamFn || streamSimple;
|
|
834
845
|
|
|
@@ -845,6 +856,7 @@ async function streamAssistantResponse(
|
|
|
845
856
|
|
|
846
857
|
const dynamicToolChoice = config.getToolChoice?.();
|
|
847
858
|
const dynamicReasoning = config.getReasoning?.();
|
|
859
|
+
const dynamicDisableReasoning = config.getDisableReasoning?.();
|
|
848
860
|
const harmonyMitigationEnabled = isHarmonyLeakMitigationTarget(config.model);
|
|
849
861
|
const harmonyAbortController = harmonyMitigationEnabled ? new AbortController() : undefined;
|
|
850
862
|
const requestSignal = harmonyAbortController
|
|
@@ -856,6 +868,7 @@ async function streamAssistantResponse(
|
|
|
856
868
|
harmonyRetryAttempt > 0 && config.temperature !== undefined ? config.temperature + 0.05 : config.temperature;
|
|
857
869
|
const effectiveToolChoice = dynamicToolChoice ?? config.toolChoice;
|
|
858
870
|
const effectiveReasoning = dynamicReasoning ?? config.reasoning;
|
|
871
|
+
const effectiveDisableReasoning = dynamicDisableReasoning ?? config.disableReasoning;
|
|
859
872
|
|
|
860
873
|
const chatStepNumber = stepCounter.count;
|
|
861
874
|
stepCounter.count += 1;
|
|
@@ -916,6 +929,7 @@ async function streamAssistantResponse(
|
|
|
916
929
|
metadata: resolvedMetadata,
|
|
917
930
|
toolChoice: effectiveToolChoice,
|
|
918
931
|
reasoning: effectiveReasoning,
|
|
932
|
+
disableReasoning: effectiveDisableReasoning,
|
|
919
933
|
temperature: effectiveTemperature,
|
|
920
934
|
signal: requestSignal,
|
|
921
935
|
onResponse: captureOnResponse,
|
|
@@ -1247,11 +1261,16 @@ async function executeToolCalls(
|
|
|
1247
1261
|
}));
|
|
1248
1262
|
|
|
1249
1263
|
const checkSteering = async (): Promise<void> => {
|
|
1250
|
-
|
|
1264
|
+
// `signal` (external/user abort) is checked separately from the internal
|
|
1265
|
+
// steeringAbortController: once the run is externally aborted it is
|
|
1266
|
+
// unwinding, and draining the steering queue here would strand the
|
|
1267
|
+
// messages in the dying run instead of leaving them for the post-abort
|
|
1268
|
+
// continue (interruptAndFlushQueuedMessages → Agent.continue()).
|
|
1269
|
+
if (!shouldInterruptImmediately || !getSteeringMessages || interruptState.triggered || signal?.aborted) {
|
|
1251
1270
|
return;
|
|
1252
1271
|
}
|
|
1253
1272
|
const check = steeringCheckTail.then(async () => {
|
|
1254
|
-
if (interruptState.triggered) return;
|
|
1273
|
+
if (interruptState.triggered || signal?.aborted) return;
|
|
1255
1274
|
const steering = await getSteeringMessages();
|
|
1256
1275
|
if (steering.length > 0) {
|
|
1257
1276
|
steeringMessages = steering;
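The hunks above all apply the same guard: once the external abort signal has fired, queue polls are skipped so the messages stay queued for the resumed run. A small sketch of that pattern in isolation; `pollQueue` is a hypothetical helper, not a function in the package:

```typescript
// Poll a message queue unless the run has already been externally aborted.
async function pollQueue<T>(
  signal: AbortSignal | undefined,
  getMessages: (() => Promise<T[]>) | undefined,
): Promise<T[]> {
  // Aborted: do not call the getter at all, so nothing is dequeued into a dying run.
  if (signal?.aborted) return [];
  return (await getMessages?.()) || [];
}
```

The key point is that the getter is never invoked after abort. If it were called and its result discarded, a destructive dequeue would still lose the messages.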
|
package/src/agent.ts
CHANGED
|
@@ -6,6 +6,7 @@ import {
|
|
|
6
6
|
type ApiKeyResolveContext,
|
|
7
7
|
type AssistantMessage,
|
|
8
8
|
type AssistantMessageEvent,
|
|
9
|
+
type Context,
|
|
9
10
|
type CursorExecHandlers,
|
|
10
11
|
type CursorToolResultHandler,
|
|
11
12
|
type Effort,
|
|
@@ -93,6 +94,12 @@ export interface AgentOptions {
|
|
|
93
94
|
*/
|
|
94
95
|
transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
|
|
95
96
|
|
|
97
|
+
/**
|
|
98
|
+
* Optional transform applied after provider context assembly and before
|
|
99
|
+
* telemetry capture/provider send.
|
|
100
|
+
*/
|
|
101
|
+
transformProviderContext?: (context: Context) => Context;
|
|
102
|
+
|
|
96
103
|
/**
|
|
97
104
|
* Steering mode: "all" = send all steering messages at once, "one-at-a-time" = one per turn
|
|
98
105
|
*/
|
|
@@ -265,6 +272,7 @@ export class Agent {
|
|
|
265
272
|
systemPrompt: [],
|
|
266
273
|
model: getBundledModel("google", "gemini-2.5-flash-lite-preview-06-17"),
|
|
267
274
|
thinkingLevel: undefined,
|
|
275
|
+
disableReasoning: false,
|
|
268
276
|
tools: [],
|
|
269
277
|
messages: [],
|
|
270
278
|
isStreaming: false,
|
|
@@ -277,6 +285,7 @@ export class Agent {
|
|
|
277
285
|
#abortController?: AbortController;
|
|
278
286
|
#convertToLlm: (messages: AgentMessage[]) => Message[] | Promise<Message[]>;
|
|
279
287
|
#transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
|
|
288
|
+
#transformProviderContext?: (context: Context) => Context;
|
|
280
289
|
#steeringQueue: AgentMessage[] = [];
|
|
281
290
|
#followUpQueue: AgentMessage[] = [];
|
|
282
291
|
#steeringMode: "all" | "one-at-a-time";
|
|
@@ -375,6 +384,7 @@ export class Agent {
|
|
|
375
384
|
this.afterToolCall = opts.afterToolCall;
|
|
376
385
|
this.#telemetry = opts.telemetry;
|
|
377
386
|
this.#appendOnlyContext = opts.appendOnlyContext;
|
|
387
|
+
this.#transformProviderContext = opts.transformProviderContext;
|
|
378
388
|
}
|
|
379
389
|
|
|
380
390
|
/**
|
|
@@ -658,6 +668,10 @@ export class Agent {
|
|
|
658
668
|
this.#state.thinkingLevel = l;
|
|
659
669
|
}
|
|
660
670
|
|
|
671
|
+
setDisableReasoning(disabled: boolean) {
|
|
672
|
+
this.#state.disableReasoning = disabled;
|
|
673
|
+
}
|
|
674
|
+
|
|
661
675
|
setSteeringMode(mode: "all" | "one-at-a-time") {
|
|
662
676
|
this.#steeringMode = mode;
|
|
663
677
|
}
|
|
@@ -942,6 +956,7 @@ export class Agent {
|
|
|
942
956
|
const config: AgentLoopConfig = {
|
|
943
957
|
model,
|
|
944
958
|
reasoning,
|
|
959
|
+
disableReasoning: this.#state.disableReasoning,
|
|
945
960
|
temperature: this.#temperature,
|
|
946
961
|
topP: this.#topP,
|
|
947
962
|
topK: this.#topK,
|
|
@@ -961,6 +976,7 @@ export class Agent {
|
|
|
961
976
|
kimiApiFormat: this.#kimiApiFormat,
|
|
962
977
|
preferWebsockets: this.#preferWebsockets,
|
|
963
978
|
convertToLlm: this.#convertToLlm,
|
|
979
|
+
transformProviderContext: this.#transformProviderContext,
|
|
964
980
|
transformContext: this.#transformContext,
|
|
965
981
|
onPayload: this.#onPayload,
|
|
966
982
|
onResponse: this.#onResponse,
|
|
@@ -985,6 +1001,7 @@ export class Agent {
|
|
|
985
1001
|
onHarmonyLeak: this.#onHarmonyLeak,
|
|
986
1002
|
getToolChoice,
|
|
987
1003
|
getReasoning: () => this.#state.thinkingLevel,
|
|
1004
|
+
getDisableReasoning: () => this.#state.disableReasoning,
|
|
988
1005
|
getSteeringMessages: async () => {
|
|
989
1006
|
if (skipInitialSteeringPoll) {
|
|
990
1007
|
skipInitialSteeringPoll = false;
|
package/src/compaction/branch-summarization.ts
CHANGED
|
@@ -13,10 +13,10 @@ import { estimateTokens } from "./compaction";
|
|
|
13
13
|
import type { ReadonlySessionManager, SessionEntry } from "./entries";
|
|
14
14
|
import {
|
|
15
15
|
type ConvertToLlm,
|
|
16
|
-
convertToLlm,
|
|
17
16
|
createBranchSummaryMessage,
|
|
18
17
|
createCompactionSummaryMessage,
|
|
19
18
|
createCustomMessage,
|
|
19
|
+
defaultConvertToLlm,
|
|
20
20
|
} from "./messages";
|
|
21
21
|
import branchSummaryPrompt from "./prompts/branch-summary.md" with { type: "text" };
|
|
22
22
|
import branchSummaryPreamble from "./prompts/branch-summary-preamble.md" with { type: "text" };
|
|
@@ -27,6 +27,7 @@ import {
|
|
|
27
27
|
type FileOperations,
|
|
28
28
|
SUMMARIZATION_SYSTEM_PROMPT,
|
|
29
29
|
serializeConversation,
|
|
30
|
+
stripReadSelector,
|
|
30
31
|
upsertFileOperations,
|
|
31
32
|
} from "./utils";
|
|
32
33
|
|
|
@@ -214,7 +215,7 @@ export function prepareBranchEntries(entries: SessionEntry[], tokenBudget: numbe
|
|
|
214
215
|
if (entry.type === "branch_summary" && !entry.fromExtension && entry.details) {
|
|
215
216
|
const details = entry.details as BranchSummaryDetails;
|
|
216
217
|
if (Array.isArray(details.readFiles)) {
|
|
217
|
-
for (const f of details.readFiles) fileOps.read.add(f);
|
|
218
|
+
for (const f of details.readFiles) fileOps.read.add(stripReadSelector(f));
|
|
218
219
|
}
|
|
219
220
|
if (Array.isArray(details.modifiedFiles)) {
|
|
220
221
|
// Modified files go into both edited and written for proper deduplication
|
|
@@ -288,7 +289,7 @@ export async function generateBranchSummary(
|
|
|
288
289
|
|
|
289
290
|
// Transform to LLM-compatible messages, then serialize to text
|
|
290
291
|
// Serialization prevents the model from treating it as a conversation to continue
|
|
291
|
-
const llmMessages = (options.convertToLlm ??
|
|
292
|
+
const llmMessages = (options.convertToLlm ?? defaultConvertToLlm)(messages);
|
|
292
293
|
const conversationText = serializeConversation(llmMessages);
|
|
293
294
|
|
|
294
295
|
// Build prompt
|
|
@@ -329,7 +330,7 @@ export async function generateBranchSummary(
|
|
|
329
330
|
|
|
330
331
|
// Compute file lists and append to summary
|
|
331
332
|
const { readFiles, modifiedFiles } = computeFileLists(fileOps);
|
|
332
|
-
summary = upsertFileOperations(summary, readFiles, modifiedFiles);
|
|
333
|
+
summary = upsertFileOperations(summary, readFiles, modifiedFiles, fileOps.read);
|
|
333
334
|
|
|
334
335
|
return {
|
|
335
336
|
summary: summary || "No summary generated",
|
package/src/compaction/compaction.ts
CHANGED
|
@@ -12,16 +12,18 @@ import {
|
|
|
12
12
|
type Message,
|
|
13
13
|
type MessageAttribution,
|
|
14
14
|
type Model,
|
|
15
|
+
type Tool,
|
|
15
16
|
type Usage,
|
|
16
17
|
} from "@oh-my-pi/pi-ai";
|
|
17
18
|
import { clampThinkingLevelForModel } from "@oh-my-pi/pi-catalog/model-thinking";
|
|
18
19
|
import { countTokens } from "@oh-my-pi/pi-natives";
|
|
19
20
|
import { logger, prompt } from "@oh-my-pi/pi-utils";
|
|
21
|
+
import { SNAPCOMPACT_FRAME_TOKEN_ESTIMATE } from "@oh-my-pi/snapcompact";
|
|
20
22
|
import { type AgentTelemetry, instrumentedCompleteSimple } from "../telemetry";
|
|
21
23
|
import { ThinkingLevel } from "../thinking";
|
|
22
|
-
import type { AgentMessage
|
|
24
|
+
import type { AgentMessage } from "../types";
|
|
23
25
|
import type { CompactionEntry, SessionEntry } from "./entries";
|
|
24
|
-
import { type ConvertToLlm,
|
|
26
|
+
import { type ConvertToLlm, createBranchSummaryMessage, createCustomMessage, defaultConvertToLlm } from "./messages";
|
|
25
27
|
import {
|
|
26
28
|
buildOpenAiNativeHistory,
|
|
27
29
|
getPreservedOpenAiRemoteCompactionData,
|
|
@@ -44,6 +46,7 @@ import {
|
|
|
44
46
|
type FileOperations,
|
|
45
47
|
SUMMARIZATION_SYSTEM_PROMPT,
|
|
46
48
|
serializeConversation,
|
|
49
|
+
stripReadSelector,
|
|
47
50
|
upsertFileOperations,
|
|
48
51
|
} from "./utils";
|
|
49
52
|
|
|
@@ -73,7 +76,7 @@ function extractFileOperations(
|
|
|
73
76
|
if (!prevCompaction.fromExtension && prevCompaction.details) {
|
|
74
77
|
const details = prevCompaction.details as CompactionDetails;
|
|
75
78
|
if (Array.isArray(details.readFiles)) {
|
|
76
|
-
for (const f of details.readFiles) fileOps.read.add(f);
|
|
79
|
+
for (const f of details.readFiles) fileOps.read.add(stripReadSelector(f));
|
|
77
80
|
}
|
|
78
81
|
if (Array.isArray(details.modifiedFiles)) {
|
|
79
82
|
for (const f of details.modifiedFiles) fileOps.edited.add(f);
|
|
@@ -136,7 +139,7 @@ export interface CompactionResult<T = unknown> {
|
|
|
136
139
|
|
|
137
140
|
export interface CompactionSettings {
|
|
138
141
|
enabled: boolean;
|
|
139
|
-
strategy?: "context-full" | "handoff" | "shake" | "off";
|
|
142
|
+
strategy?: "context-full" | "handoff" | "shake" | "snapcompact" | "off";
|
|
140
143
|
thresholdPercent?: number;
|
|
141
144
|
thresholdTokens?: number;
|
|
142
145
|
reserveTokens: number;
|
|
@@ -284,9 +287,19 @@ export function estimateTokens(message: AgentMessage): number {
|
|
|
284
287
|
fragments.push(block.text);
|
|
285
288
|
} else if (block.type === "thinking") {
|
|
286
289
|
fragments.push(block.thinking);
|
|
290
|
+
// Providers charge for the opaque signature/reasoning payload that
|
|
291
|
+
// rides alongside the thinking text (OpenAI Responses encrypted
|
|
292
|
+
// reasoning items, Anthropic signed thinking blocks, etc.). Without
|
|
293
|
+
// counting it, this estimator can read ~half of the provider-reported
|
|
294
|
+
// usage on thinking-heavy turns — see #2275 for the resulting
|
|
295
|
+
// compaction-trigger / post-check metric divergence.
|
|
296
|
+
if (block.thinkingSignature) fragments.push(block.thinkingSignature);
|
|
287
297
|
} else if (block.type === "toolCall") {
|
|
288
298
|
fragments.push(block.name);
|
|
289
299
|
fragments.push(JSON.stringify(block.arguments));
|
|
300
|
+
} else if (block.type === "redactedThinking") {
|
|
301
|
+
// Encrypted reasoning blob the provider still bills for on replay.
|
|
302
|
+
fragments.push(block.data);
|
|
290
303
|
}
|
|
291
304
|
}
|
|
292
305
|
break;
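The hunk above makes `estimateTokens` count the opaque `thinkingSignature` and `redactedThinking` payloads alongside the visible text. A minimal sketch of that fragment collection follows. It is not the package's implementation: the `AssistantBlock` union name, the `billableFragments` and `roughTokens` helpers, and the chars/4 heuristic are assumptions made for illustration, and only the block field names are taken from the diff.

```typescript
// Block shapes mirrored from the hunk above; the union name is hypothetical.
type AssistantBlock =
	| { type: "text"; text: string }
	| { type: "thinking"; thinking: string; thinkingSignature?: string }
	| { type: "toolCall"; name: string; arguments: Record<string, unknown> }
	| { type: "redactedThinking"; data: string };

// Collect every string that is replayed to the provider for these blocks.
function billableFragments(blocks: AssistantBlock[]): string[] {
	const fragments: string[] = [];
	for (const block of blocks) {
		if (block.type === "text") {
			fragments.push(block.text);
		} else if (block.type === "thinking") {
			fragments.push(block.thinking);
			// The signature rides alongside the thinking text and is counted too.
			if (block.thinkingSignature) fragments.push(block.thinkingSignature);
		} else if (block.type === "toolCall") {
			fragments.push(block.name, JSON.stringify(block.arguments));
		} else {
			// redactedThinking: encrypted blob that is still sent on replay.
			fragments.push(block.data);
		}
	}
	return fragments;
}

// Assumption: a chars/4 heuristic stands in for the package's real tokenizer.
function roughTokens(fragments: string[]): number {
	return Math.ceil(fragments.join("").length / 4);
}
```

With a 4-character thinking text and a 36-character signature, this sketch estimates 10 tokens instead of 1, which illustrates how large the gap can get on thinking-heavy turns when the signature is ignored.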
|
|
@@ -309,6 +322,10 @@ export function estimateTokens(message: AgentMessage): number {
|
|
|
309
322
|
case "branchSummary":
|
|
310
323
|
case "compactionSummary": {
|
|
311
324
|
fragments.push(message.summary);
|
|
325
|
+
if (message.role === "compactionSummary" && message.images) {
|
|
326
|
+
// Snapcompact frames render at ≥1568px; providers bill the downscaled cap.
|
|
327
|
+
extra += message.images.length * SNAPCOMPACT_FRAME_TOKEN_ESTIMATE;
|
|
328
|
+
}
|
|
312
329
|
break;
|
|
313
330
|
}
|
|
314
331
|
default:
|
|
@@ -624,7 +641,7 @@ export async function generateSummary(
|
|
|
624
641
|
|
|
625
642
|
// Serialize conversation to text so model doesn't try to continue it
|
|
626
643
|
// Convert to LLM messages first (handles custom app messages when caller provides a transformer).
|
|
627
|
-
const llmMessages = (options?.convertToLlm ??
|
|
644
|
+
const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(currentMessages);
|
|
628
645
|
const conversationText = serializeConversation(llmMessages);
|
|
629
646
|
|
|
630
647
|
// Build the prompt with conversation wrapped in tags
|
|
@@ -690,7 +707,7 @@ export interface HandoffOptions {
|
|
|
690
707
|
/** Live agent system prompt — passed verbatim so providers hit the cached prefix. */
|
|
691
708
|
systemPrompt: string[];
|
|
692
709
|
/** Live agent tool list — same purpose. Forced to `toolChoice: "none"`. */
|
|
693
|
-
tools?:
|
|
710
|
+
tools?: Tool[];
|
|
694
711
|
customInstructions?: string;
|
|
695
712
|
convertToLlm?: ConvertToLlm;
|
|
696
713
|
initiatorOverride?: MessageAttribution;
|
|
@@ -723,7 +740,7 @@ export async function generateHandoff(
|
|
|
723
740
|
options: HandoffOptions,
|
|
724
741
|
signal?: AbortSignal,
|
|
725
742
|
): Promise<string> {
|
|
726
|
-
const llmMessages = (options.convertToLlm ??
|
|
743
|
+
const llmMessages = (options.convertToLlm ?? defaultConvertToLlm)(messages);
|
|
727
744
|
const requestMessages: Message[] = [
|
|
728
745
|
...llmMessages,
|
|
729
746
|
{
|
|
@@ -772,7 +789,7 @@ async function generateShortSummary(
|
|
|
772
789
|
options?: SummaryOptions,
|
|
773
790
|
): Promise<string> {
|
|
774
791
|
const maxTokens = Math.min(512, Math.floor(0.2 * reserveTokens));
|
|
775
|
-
const llmMessages = (options?.convertToLlm ??
|
|
792
|
+
const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(recentMessages);
|
|
776
793
|
const conversationText = serializeConversation(llmMessages);
|
|
777
794
|
|
|
778
795
|
let promptText = `<conversation>\n${conversationText}\n</conversation>\n\n`;
|
|
@@ -1009,7 +1026,7 @@ export async function compact(
|
|
|
1009
1026
|
? previousRemoteCompaction.replacementHistory
|
|
1010
1027
|
: undefined;
|
|
1011
1028
|
const remoteHistory = buildOpenAiNativeHistory(
|
|
1012
|
-
(summaryOptions.convertToLlm ??
|
|
1029
|
+
(summaryOptions.convertToLlm ?? defaultConvertToLlm)(remoteMessages),
|
|
1013
1030
|
model,
|
|
1014
1031
|
previousReplacementHistory,
|
|
1015
1032
|
);
|
|
@@ -1097,7 +1114,7 @@ export async function compact(
|
|
|
1097
1114
|
|
|
1098
1115
|
// Compute file lists and append to summary
|
|
1099
1116
|
const { readFiles, modifiedFiles } = computeFileLists(fileOps);
|
|
1100
|
-
summary = upsertFileOperations(summary, readFiles, modifiedFiles);
|
|
1117
|
+
summary = upsertFileOperations(summary, readFiles, modifiedFiles, fileOps.read);
|
|
1101
1118
|
|
|
1102
1119
|
if (!firstKeptEntryId) {
|
|
1103
1120
|
throw new Error("First kept entry has no ID - session may need migration");
|
|
@@ -1126,7 +1143,7 @@ async function generateTurnPrefixSummary(
|
|
|
1126
1143
|
): Promise<string> {
|
|
1127
1144
|
const maxTokens = Math.floor(0.5 * reserveTokens); // Smaller budget for turn prefix
|
|
1128
1145
|
|
|
1129
|
-
const llmMessages = (options?.convertToLlm ??
|
|
1146
|
+
const llmMessages = (options?.convertToLlm ?? defaultConvertToLlm)(messages);
|
|
1130
1147
|
const conversationText = serializeConversation(llmMessages);
|
|
1131
1148
|
const promptText = `<conversation>\n${conversationText}\n</conversation>\n\n${TURN_PREFIX_SUMMARIZATION_PROMPT}`;
|
|
1132
1149
|
const summarizationMessages = [
|
package/src/compaction/messages.ts
CHANGED
|
@@ -51,6 +51,8 @@ export interface CompactionSummaryMessage {
|
|
|
51
51
|
shortSummary?: string;
|
|
52
52
|
tokensBefore: number;
|
|
53
53
|
providerPayload?: ProviderPayload;
|
|
54
|
+
/** Snapcompact frames archived by this compaction; appended as image blocks after the summary text. */
|
|
55
|
+
images?: ImageContent[];
|
|
54
56
|
timestamp: number;
|
|
55
57
|
}
|
|
56
58
|
|
|
@@ -98,6 +100,7 @@ export function createCompactionSummaryMessage(
|
|
|
98
100
|
timestamp: string,
|
|
99
101
|
shortSummary?: string,
|
|
100
102
|
providerPayload?: ProviderPayload,
|
|
103
|
+
images?: ImageContent[],
|
|
101
104
|
): CompactionSummaryMessage {
|
|
102
105
|
return {
|
|
103
106
|
role: "compactionSummary",
|
|
@@ -105,6 +108,7 @@ export function createCompactionSummaryMessage(
|
|
|
105
108
|
shortSummary,
|
|
106
109
|
tokensBefore,
|
|
107
110
|
providerPayload,
|
|
111
|
+
images: images && images.length > 0 ? images : undefined,
|
|
108
112
|
timestamp: new Date(timestamp).getTime(),
|
|
109
113
|
};
|
|
110
114
|
}
|
|
@@ -137,6 +141,79 @@ function isCoreCompactionMessage(message: AgentMessage): message is AgentMessage
|
|
|
137
141
|
);
|
|
138
142
|
}
|
|
139
143
|
|
|
144
|
+
/**
|
|
145
|
+
* Transform a single core-domain agent message to its LLM form; `undefined`
|
|
146
|
+
* drops it from the provider request.
|
|
147
|
+
*
|
|
148
|
+
* Single source of truth for the core roles (user/developer/assistant/
|
|
149
|
+
* toolResult) and the compaction messages owned by this package. Embedders
|
|
150
|
+
* with their own app messages (e.g. the coding agent) handle their custom
|
|
151
|
+
* roles and delegate every core role here — duplicating these cases is how
|
|
152
|
+
* snapcompact frames once silently fell off the provider request.
|
|
153
|
+
*/
|
|
154
|
+
export function convertMessageToLlm(message: AgentMessage): Message | undefined {
|
|
155
|
+
if (isCoreCompactionMessage(message)) {
|
|
156
|
+
switch (message.role) {
|
|
157
|
+
case "custom":
|
|
158
|
+
case "hookMessage": {
|
|
159
|
+
const content =
|
|
160
|
+
typeof message.content === "string"
|
|
161
|
+
? [{ type: "text" as const, text: message.content }]
|
|
162
|
+
: message.content;
|
|
163
|
+
return {
|
|
164
|
+
role: "developer",
|
|
165
|
+
content,
|
|
166
|
+
attribution: message.attribution,
|
|
167
|
+
timestamp: message.timestamp,
|
|
168
|
+
};
|
|
169
|
+
}
|
|
170
|
+
case "branchSummary":
|
|
171
|
+
return {
|
|
172
|
+
role: "user",
|
|
173
|
+
content: [
|
|
174
|
+
{
|
|
175
|
+
type: "text" as const,
|
|
176
|
+
text: renderBranchSummaryContext(message.summary),
|
|
177
|
+
},
|
|
178
|
+
],
|
|
179
|
+
attribution: "agent",
|
|
180
|
+
timestamp: message.timestamp,
|
|
181
|
+
};
|
|
182
|
+
case "compactionSummary":
|
|
183
|
+
return {
|
|
184
|
+
role: "user",
|
|
185
|
+
content: [
|
|
186
|
+
{
|
|
187
|
+
type: "text" as const,
|
|
188
|
+
text: renderCompactionSummaryContext(message.summary),
|
|
189
|
+
},
|
|
190
|
+
...(message.images ?? []),
|
|
191
|
+
],
|
|
192
|
+
attribution: "agent",
|
|
193
|
+
providerPayload: message.providerPayload,
|
|
194
|
+
timestamp: message.timestamp,
|
|
195
|
+
};
|
|
196
|
+
}
|
|
197
|
+
}
|
|
198
|
+
|
|
199
|
+
switch (message.role) {
|
|
200
|
+
case "user":
|
|
201
|
+
return { ...message, attribution: message.attribution ?? "user" };
|
|
202
|
+
case "developer":
|
|
203
|
+
return { ...message, attribution: message.attribution ?? "agent" };
|
|
204
|
+
case "assistant":
|
|
205
|
+
return message as AssistantMessage;
|
|
206
|
+
case "toolResult":
|
|
207
|
+
return {
|
|
208
|
+
...message,
|
|
209
|
+
content: getPrunedToolResultContent(message as ToolResultMessage),
|
|
210
|
+
attribution: message.attribution ?? "agent",
|
|
211
|
+
};
|
|
212
|
+
default:
|
|
213
|
+
return undefined;
|
|
214
|
+
}
|
|
215
|
+
}
|
|
216
|
+
|
|
140
217
|
/**
|
|
141
218
|
* Default compaction-domain transformer.
|
|
142
219
|
*
|
|
@@ -145,68 +222,5 @@ function isCoreCompactionMessage(message: AgentMessage): message is AgentMessage
|
|
|
145
222
|
* core LLM roles and the compaction messages owned by this package.
|
|
146
223
|
*/
|
|
147
224
|
export function defaultConvertToLlm(messages: AgentMessage[]): Message[] {
|
|
148
|
-
return messages
|
|
149
|
-
.map((message): Message | undefined => {
|
|
150
|
-
if (isCoreCompactionMessage(message)) {
|
|
151
|
-
switch (message.role) {
|
|
152
|
-
case "custom":
|
|
153
|
-
case "hookMessage": {
|
|
154
|
-
const content =
|
|
155
|
-
typeof message.content === "string"
|
|
156
|
-
? [{ type: "text" as const, text: message.content }]
|
|
157
|
-
: message.content;
|
|
158
|
-
return {
|
|
159
|
-
role: "developer",
|
|
160
|
-
content,
|
|
161
|
-
attribution: message.attribution,
|
|
162
|
-
timestamp: message.timestamp,
|
|
163
|
-
};
|
|
164
|
-
}
|
|
165
|
-
case "branchSummary":
|
|
166
|
-
return {
|
|
167
|
-
role: "user",
|
|
168
|
-
content: [
|
|
169
|
-
{
|
|
170
|
-
type: "text" as const,
|
|
171
|
-
text: renderBranchSummaryContext(message.summary),
|
|
172
|
-
},
|
|
173
|
-
],
|
|
174
|
-
attribution: "agent",
|
|
175
|
-
timestamp: message.timestamp,
|
|
176
|
-
};
|
|
177
|
-
case "compactionSummary":
|
|
178
|
-
return {
|
|
179
|
-
role: "user",
|
|
180
|
-
content: [
|
|
181
|
-
{
|
|
182
|
-
type: "text" as const,
|
|
183
|
-
text: renderCompactionSummaryContext(message.summary),
|
|
184
|
-
},
|
|
185
|
-
],
|
|
186
|
-
attribution: "agent",
|
|
187
|
-
providerPayload: message.providerPayload,
|
|
188
|
-
timestamp: message.timestamp,
|
|
189
|
-
};
|
|
190
|
-
}
|
|
191
|
-
}
|
|
192
|
-
|
|
193
|
-
switch (message.role) {
|
|
194
|
-
case "user":
|
|
195
|
-
return { ...message, attribution: message.attribution ?? "user" };
|
|
196
|
-
case "developer":
|
|
197
|
-
return { ...message, attribution: message.attribution ?? "agent" };
|
|
198
|
-
case "assistant":
|
|
199
|
-
return message as AssistantMessage;
|
|
200
|
-
case "toolResult":
|
|
201
|
-
return {
|
|
202
|
-
...message,
|
|
203
|
-
content: getPrunedToolResultContent(message as ToolResultMessage),
|
|
204
|
-
attribution: message.attribution ?? "agent",
|
|
205
|
-
};
|
|
206
|
-
default:
|
|
207
|
-
return undefined;
|
|
208
|
-
}
|
|
209
|
-
})
|
|
210
|
-
.filter(message => message !== undefined);
|
|
225
|
+
return messages.map(convertMessageToLlm).filter(message => message !== undefined);
|
|
211
226
|
}
|
|
212
|
-
export const convertToLlm = defaultConvertToLlm;
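The doc comment on `convertMessageToLlm` asks embedders to handle their own roles and delegate every core role instead of duplicating the switch. The delegation pattern could look roughly like the sketch below. Everything in it is a stand-in: `LlmMessage`, `CoreMessage`, `NoteMessage`, the `"note"` role and `convertMessageToLlmStub` are hypothetical simplifications so the block runs on its own, and real code would import `convertMessageToLlm` from the package instead of the stub.

```typescript
// Stand-in types; the real AgentMessage/Message shapes are richer than this.
type LlmMessage = { role: "user" | "developer" | "assistant"; content: string };
type CoreMessage = { role: "user" | "assistant" | "compactionSummary"; content: string };
// Hypothetical app-specific role owned by the embedder.
type NoteMessage = { role: "note"; text: string };
type AppMessage = CoreMessage | NoteMessage;

// Stub standing in for the package's convertMessageToLlm(); only the
// delegation pattern in appConvertToLlm() is the point of this sketch.
function convertMessageToLlmStub(message: CoreMessage): LlmMessage | undefined {
	switch (message.role) {
		case "compactionSummary":
			return { role: "user", content: `[summary] ${message.content}` };
		case "user":
			return { role: "user", content: message.content };
		case "assistant":
			return { role: "assistant", content: message.content };
		default:
			return undefined;
	}
}

// Embedder transformer: convert app-owned roles, delegate everything else.
function appConvertToLlm(messages: AppMessage[]): LlmMessage[] {
	const out: LlmMessage[] = [];
	for (const message of messages) {
		const converted: LlmMessage | undefined =
			message.role === "note"
				? { role: "developer", content: message.text }
				: convertMessageToLlmStub(message);
		// `undefined` drops the message from the provider request.
		if (converted !== undefined) out.push(converted);
	}
	return out;
}
```

Because the embedder never re-implements the `compactionSummary` case, a later change to that case (such as the appended image blocks in this release) reaches the embedder's requests without further edits.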
|
package/src/compaction/pruning.ts
CHANGED
|
@@ -3,7 +3,7 @@
|
|
|
3
3
|
*/
|
|
4
4
|
|
|
5
5
|
import type { ToolResultMessage } from "@oh-my-pi/pi-ai";
|
|
6
|
-
import type { AgentMessage } from "../types";
|
|
6
|
+
import type { AgentMessage, AgentToolCall } from "../types";
|
|
7
7
|
import { estimateTokens } from "./compaction";
|
|
8
8
|
import type { SessionEntry, SessionMessageEntry } from "./entries";
|
|
9
9
|
import {
|
|
@@ -12,6 +12,7 @@ import {
|
|
|
12
12
|
isSkillReadToolResult,
|
|
13
13
|
type ProtectedToolMatcher,
|
|
14
14
|
} from "./tool-protection";
|
|
15
|
+
import { splitReadSelector } from "./utils";
|
|
15
16
|
|
|
16
17
|
export interface PruneConfig {
|
|
17
18
|
/** Keep the most recent tool output tokens intact. */
|
|
@@ -20,6 +21,13 @@ export interface PruneConfig {
|
|
|
20
21
|
minimumSavings: number;
|
|
21
22
|
/** Tool-result protection matchers. String entries protect every result from that tool; predicates may inspect the paired tool call. */
|
|
22
23
|
protectedTools: ProtectedToolMatcher[];
|
|
24
|
+
/**
|
|
25
|
+
* Optional supersede key function (see {@link SupersedePruneConfig.supersedeKey}).
|
|
26
|
+
* When provided, superseded tool results are pruned first — even inside the
|
|
27
|
+
* `protectTokens` window — before age-based victims. Absent, behavior is
|
|
28
|
+
* unchanged.
|
|
29
|
+
*/
|
|
30
|
+
supersedeKey?: SupersedeKeyFn;
|
|
23
31
|
}
|
|
24
32
|
|
|
25
33
|
export const DEFAULT_PRUNE_CONFIG: PruneConfig = {
|
|
@@ -33,6 +41,34 @@ export interface PruneResult {
|
|
|
33
41
|
tokensSaved: number;
|
|
34
42
|
}
|
|
35
43
|
|
|
44
|
+
/** Exact placeholder written over a superseded tool result. */
|
|
45
|
+
export const SUPERSEDED_NOTICE = "[Superseded by a newer read of this file]";
|
|
46
|
+
|
|
47
|
+
/**
|
|
48
|
+
* Maps a tool call to a supersede key. Results sharing a key form a group in
|
|
49
|
+
* which every result except the newest is a supersede candidate. A key `K`
|
|
50
|
+
* additionally supersedes keys with prefix `K + "\u0000"` (selector-free read
|
|
51
|
+
* supersedes selector-carrying reads of the same base path). Return
|
|
52
|
+
* `undefined` to exempt a call from supersede grouping.
|
|
53
|
+
*/
|
|
54
|
+
export type SupersedeKeyFn = (toolName: string, args: Record<string, unknown>) => string | undefined;
|
|
55
|
+
|
|
56
|
+
export interface SupersedePruneConfig {
|
|
57
|
+
/** Supersede key function; results sharing a key supersede older ones. */
|
|
58
|
+
supersedeKey: SupersedeKeyFn;
|
|
59
|
+
/** Prune a candidate now when all messages after it total at most this many estimated tokens. Default 8 000. */
|
|
60
|
+
suffixTokenLimit?: number;
|
|
61
|
+
/** Prune all candidates when the last message is at least this old (prompt cache is cold anyway). Default 30 min. */
|
|
62
|
+
idleFlushMs?: number;
|
|
63
|
+
/** Clock override for tests. */
|
|
64
|
+
now?: number;
|
|
65
|
+
/** Tool-result protection matchers (same contract as {@link PruneConfig.protectedTools}). */
|
|
66
|
+
protectedTools: ProtectedToolMatcher[];
|
|
67
|
+
}
|
|
68
|
+
|
|
69
|
+
const DEFAULT_SUFFIX_TOKEN_LIMIT = 8_000;
|
|
70
|
+
const DEFAULT_IDLE_FLUSH_MS = 30 * 60_000;
|
|
71
|
+
|
|
36
72
|
function createPrunedNotice(tokens: number): string {
|
|
37
73
|
return `[Output truncated - ${tokens} tokens]`;
|
|
38
74
|
}
|
|
@@ -44,18 +80,121 @@ function getToolResultMessage(entry: SessionEntry): ToolResultMessage | undefine
|
|
|
44
80
|
return message as ToolResultMessage;
|
|
45
81
|
}
|
|
46
82
|
|
|
47
|
-
function estimatePrunedSavings(tokens: number): number {
|
|
48
|
-
const noticeTokens = Math.ceil(
|
|
83
|
+
function estimatePrunedSavings(tokens: number, notice: string): number {
|
|
84
|
+
const noticeTokens = Math.ceil(notice.length / 4);
|
|
49
85
|
return Math.max(0, tokens - noticeTokens);
|
|
50
86
|
}
|
|
51
87
|
|
|
88
|
+
interface SupersedeCandidate {
|
|
89
|
+
entry: SessionMessageEntry;
|
|
90
|
+
message: ToolResultMessage;
|
|
91
|
+
/** Index of the entry within the `entries` array. */
|
|
92
|
+
index: number;
|
|
93
|
+
tokens: number;
|
|
94
|
+
}
|
|
95
|
+
|
|
96
|
+
/**
|
|
97
|
+
* Collect superseded tool results: for every unpruned, unprotected tool result
|
|
98
|
+
* whose paired call resolves a supersede key, a LATER result with the same key
|
|
99
|
+
* — or with a key that is the `"\u0000"`-prefix parent of this one — marks it
|
|
100
|
+
* superseded. Returned in message order.
|
|
101
|
+
*/
|
|
102
|
+
function collectSupersededResults(
|
|
103
|
+
entries: readonly SessionEntry[],
|
|
104
|
+
toolCallsById: ReadonlyMap<string, AgentToolCall>,
|
|
105
|
+
supersedeKey: SupersedeKeyFn,
|
|
106
|
+
protectedTools: readonly ProtectedToolMatcher[],
|
|
107
|
+
): SupersedeCandidate[] {
|
|
108
|
+
const candidates: SupersedeCandidate[] = [];
|
|
109
|
+
const seenKeys = new Set<string>();
|
|
110
|
+
for (let i = entries.length - 1; i >= 0; i--) {
|
|
111
|
+
const entry = entries[i];
|
|
112
|
+
const message = getToolResultMessage(entry);
|
|
113
|
+
if (!message || message.prunedAt !== undefined) continue;
|
|
114
|
+
const toolCall = toolCallsById.get(message.toolCallId);
|
|
115
|
+
if (!toolCall) continue;
|
|
116
|
+
if (isProtectedToolResult(message, toolCall, protectedTools)) continue;
|
|
117
|
+
const key = supersedeKey(toolCall.name, toolCall.arguments as Record<string, unknown>);
|
|
118
|
+
if (key === undefined) continue;
|
|
119
|
+
const separator = key.indexOf("\u0000");
|
|
120
|
+
const superseded = seenKeys.has(key) || (separator >= 0 && seenKeys.has(key.slice(0, separator)));
|
|
121
|
+
seenKeys.add(key);
|
|
122
|
+
if (!superseded) continue;
|
|
123
|
+
candidates.push({
|
|
124
|
+
entry: entry as SessionMessageEntry,
|
|
125
|
+
message,
|
|
126
|
+
index: i,
|
|
127
|
+
tokens: estimateTokens(message as AgentMessage),
|
|
128
|
+
});
|
|
129
|
+
}
|
|
130
|
+
return candidates.reverse();
|
|
131
|
+
}
|
|
132
|
+
|
|
133
|
+
/**
|
|
134
|
+
* Prune superseded tool results (e.g. stale `read` outputs replaced by a newer
|
|
135
|
+
* read of the same file). Cheap, incremental, and prompt-cache-aware: a
|
|
136
|
+
* candidate is pruned now only when the suffix after it is small (tail case —
|
|
137
|
+
* the read→edit→read loop) or when the context has been idle long enough that
|
|
138
|
+
* the provider cache is cold anyway (then ALL candidates flush).
|
|
139
|
+
*/
|
|
140
|
+
export function pruneSupersededToolResults(entries: SessionEntry[], config: SupersedePruneConfig): PruneResult {
|
|
141
|
+
const toolCallsById = collectToolCallsById(entries);
|
|
142
|
+
const candidates = collectSupersededResults(entries, toolCallsById, config.supersedeKey, config.protectedTools);
|
|
143
|
+
if (candidates.length === 0) return { prunedCount: 0, tokensSaved: 0 };
|
|
144
|
+
|
|
145
|
+
const now = config.now ?? Date.now();
|
|
146
|
+
let lastMessageTimestamp: number | undefined;
|
|
147
|
+
for (let i = entries.length - 1; i >= 0; i--) {
|
|
148
|
+
const entry = entries[i];
|
|
149
|
+
if (entry.type !== "message") continue;
|
|
150
|
+
const timestamp = (entry.message as AgentMessage).timestamp;
|
|
151
|
+
if (typeof timestamp === "number") lastMessageTimestamp = timestamp;
|
|
152
|
+
break;
|
|
153
|
+
}
|
|
154
|
+
const idle =
|
|
155
|
+
lastMessageTimestamp !== undefined && now - lastMessageTimestamp >= (config.idleFlushMs ?? DEFAULT_IDLE_FLUSH_MS);
|
|
156
|
+
|
|
157
|
+
let toPrune: SupersedeCandidate[];
|
|
158
|
+
if (idle) {
|
|
159
|
+
toPrune = candidates;
|
|
160
|
+
} else {
|
|
161
|
+
const suffixTokenLimit = config.suffixTokenLimit ?? DEFAULT_SUFFIX_TOKEN_LIMIT;
|
|
162
|
+
// suffixTokens[i] = estimated tokens of all messages strictly after entry i.
|
|
163
|
+
const suffixTokens = new Array<number>(entries.length);
|
|
164
|
+
let accumulated = 0;
|
|
165
|
+
for (let i = entries.length - 1; i >= 0; i--) {
|
|
166
|
+
suffixTokens[i] = accumulated;
|
|
167
|
+
const entry = entries[i];
|
|
168
|
+
if (entry.type === "message") accumulated += estimateTokens(entry.message as AgentMessage);
|
|
169
|
+
}
|
|
170
|
+
toPrune = candidates.filter(candidate => suffixTokens[candidate.index] <= suffixTokenLimit);
|
|
171
|
+
}
|
|
172
|
+
if (toPrune.length === 0) return { prunedCount: 0, tokensSaved: 0 };
|
|
173
|
+
|
|
174
|
+
const prunedAt = Date.now();
|
|
175
|
+
let tokensSaved = 0;
|
|
176
|
+
for (const candidate of toPrune) {
|
|
177
|
+
candidate.message.content = [{ type: "text", text: SUPERSEDED_NOTICE }];
|
|
178
|
+
candidate.message.prunedAt = prunedAt;
|
|
179
|
+
tokensSaved += estimatePrunedSavings(candidate.tokens, SUPERSEDED_NOTICE);
|
|
180
|
+
}
|
|
181
|
+
return { prunedCount: toPrune.length, tokensSaved };
|
|
182
|
+
}
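The gating in `pruneSupersededToolResults` above can be read as two rules: flush every candidate when the context has been idle, otherwise prune only candidates whose trailing suffix is small. The following sketch isolates that decision under simplifying assumptions. `EntryLike`, `GateOptions` and `selectPrunable` are hypothetical names, and each entry carries a precomputed token count rather than calling `estimateTokens`.

```typescript
// Hypothetical minimal entry: its token cost and whether it is superseded.
interface EntryLike {
	tokens: number;
	superseded: boolean;
}

interface GateOptions {
	suffixTokenLimit: number;
	idleFlushMs: number;
	lastMessageAt: number;
	now: number;
}

// Returns the indexes of superseded entries that would be pruned right now.
function selectPrunable(entries: EntryLike[], opts: GateOptions): number[] {
	const candidates: number[] = [];
	entries.forEach((entry, index) => {
		if (entry.superseded) candidates.push(index);
	});

	// Idle long enough: the prompt cache is assumed cold, so flush everything.
	if (opts.now - opts.lastMessageAt >= opts.idleFlushMs) return candidates;

	// suffix[i] = tokens of all entries strictly after entry i.
	const suffix = new Array<number>(entries.length);
	let accumulated = 0;
	for (let i = entries.length - 1; i >= 0; i--) {
		suffix[i] = accumulated;
		accumulated += entries[i].tokens;
	}
	// Rewriting an early entry invalidates the cached prefix after it, so only
	// candidates with a small suffix are worth pruning while the cache is warm.
	return candidates.filter(index => suffix[index] <= opts.suffixTokenLimit);
}
```

The single backward pass keeps the check linear in the number of entries, which matches the "cheap, incremental" intent stated in the doc comment.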
|
|
183
|
+
|
|
52
184
|
export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig = DEFAULT_PRUNE_CONFIG): PruneResult {
|
|
53
185
|
let accumulatedTokens = 0;
|
|
54
186
|
let tokensSaved = 0;
|
|
55
187
|
let prunedCount = 0;
|
|
56
188
|
|
|
57
|
-
const candidates: Array<{ entry: SessionMessageEntry; tokens: number }> = [];
|
|
189
|
+
const candidates: Array<{ entry: SessionMessageEntry; tokens: number; superseded: boolean }> = [];
|
|
58
190
|
const toolCallsById = collectToolCallsById(entries);
|
|
191
|
+
const supersededMessages = config.supersedeKey
|
|
192
|
+
? new Set(
|
|
193
|
+
collectSupersededResults(entries, toolCallsById, config.supersedeKey, config.protectedTools).map(
|
|
194
|
+
candidate => candidate.message,
|
|
195
|
+
),
|
|
196
|
+
)
|
|
197
|
+
: undefined;
|
|
59
198
|
|
|
60
199
|
for (let i = entries.length - 1; i >= 0; i--) {
|
|
61
200
|
const entry = entries[i];
|
|
@@ -70,17 +209,23 @@ export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig =
|
|
|
70
209
|
continue;
|
|
71
210
|
}
|
|
72
211
|
|
|
73
|
-
|
|
212
|
+
// Superseded results are pruned first: they bypass the protect window
|
|
213
|
+
// (a stale copy of re-read content is dead weight at any age).
|
|
214
|
+
const superseded = supersededMessages?.has(message) ?? false;
|
|
215
|
+
if (!superseded && (accumulatedTokens < config.protectTokens || isProtected)) {
|
|
74
216
|
accumulatedTokens += tokens;
|
|
75
217
|
continue;
|
|
76
218
|
}
|
|
77
219
|
|
|
78
|
-
candidates.push({ entry: entry as SessionMessageEntry, tokens });
|
|
220
|
+
candidates.push({ entry: entry as SessionMessageEntry, tokens, superseded });
|
|
79
221
|
accumulatedTokens += tokens;
|
|
80
222
|
}
|
|
81
223
|
|
|
82
224
|
for (const candidate of candidates) {
|
|
83
|
-
tokensSaved += estimatePrunedSavings(
|
|
225
|
+
tokensSaved += estimatePrunedSavings(
|
|
226
|
+
candidate.tokens,
|
|
227
|
+
candidate.superseded ? SUPERSEDED_NOTICE : createPrunedNotice(candidate.tokens),
|
|
228
|
+
);
|
|
84
229
|
}
|
|
85
230
|
|
|
86
231
|
if (tokensSaved < config.minimumSavings || candidates.length === 0) {
|
|
@@ -90,10 +235,31 @@ export function pruneToolOutputs(entries: SessionEntry[], config: PruneConfig =
|
|
|
90
235
|
const prunedAt = Date.now();
|
|
91
236
|
for (const candidate of candidates) {
|
|
92
237
|
const message = candidate.entry.message as ToolResultMessage;
|
|
93
|
-
message.content = [
|
|
238
|
+
message.content = [
|
|
239
|
+
{ type: "text", text: candidate.superseded ? SUPERSEDED_NOTICE : createPrunedNotice(candidate.tokens) },
|
|
240
|
+
];
|
|
94
241
|
message.prunedAt = prunedAt;
|
|
95
242
|
prunedCount++;
|
|
96
243
|
}
|
|
97
244
|
|
|
98
245
|
return { prunedCount, tokensSaved };
|
|
99
246
|
}
|
|
247
|
+
|
|
248
|
+
/**
|
|
249
|
+
* Supersede key for the `read` tool: the file path with the trailing line/raw
|
|
250
|
+
* selector stripped (the read tool's own splitter grammar via
|
|
251
|
+
* {@link splitReadSelector}, e.g. `src/foo.ts:50-200`, `:2-4:raw`).
|
|
252
|
+
* Internal/URL-scheme paths (`skill://…`, `https://…`) are exempt.
|
|
253
|
+
* Selector-free reads key on the bare path; selector-carrying reads key on
|
|
254
|
+
* `path + "\u0000" + selector`, so two reads collide only when the newer is
|
|
255
|
+
* selector-free or the selectors are identical (the pass's prefix rule lets a
|
|
256
|
+
* bare-path read supersede selector-carrying reads of the same file).
|
|
257
|
+
*/
|
|
258
|
+
export function readToolSupersedeKey(toolName: string, args: Record<string, unknown>): string | undefined {
|
|
259
|
+
if (toolName !== "read") return undefined;
|
|
260
|
+
const path = args.path;
|
|
261
|
+
if (typeof path !== "string" || path.length === 0) return undefined;
|
|
262
|
+
if (path.includes("://")) return undefined;
|
|
263
|
+
const { path: base, sel } = splitReadSelector(path);
|
|
264
|
+
return sel === undefined ? base : `${base}\u0000${sel}`;
|
|
265
|
+
}
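The key scheme described above (bare path for selector-free reads, `path + "\u0000" + selector` otherwise, with a bare-path read superseding selector-carrying reads) can be exercised in isolation. The sketch below is a simplified approximation, not the package code: `readKey` and `supersededIndexes` are hypothetical names, the splitter recognizes only a single trailing chunk and omits the compound `range:raw` handling, and the inputs are plain path strings rather than session entries.

```typescript
// Selector grammar copied from the diff; only one trailing chunk is checked.
const RANGE_CHUNK = String.raw`L?\d+(?:(?:[-+]|\.\.)L?\d+|-|\.\.)?`;
const SELECTOR_RE = new RegExp(`^(?:${RANGE_CHUNK}(?:,${RANGE_CHUNK})*|raw|conflicts)$`, "i");

// Simplified supersede key for a read path; URL-scheme paths are exempt.
function readKey(path: string): string | undefined {
	if (path.length === 0 || path.includes("://")) return undefined;
	const colon = path.lastIndexOf(":");
	if (colon <= 0) return path;
	const sel = path.slice(colon + 1);
	return SELECTOR_RE.test(sel) ? `${path.slice(0, colon)}\u0000${sel}` : path;
}

// Walk newest to oldest. A read is superseded when a later read has the same
// key, or has the bare-path key that is this key's "\u0000"-prefix parent.
function supersededIndexes(paths: string[]): number[] {
	const seen = new Set<string>();
	const out: number[] = [];
	for (let i = paths.length - 1; i >= 0; i--) {
		const key = readKey(paths[i]);
		if (key === undefined) continue;
		const separator = key.indexOf("\u0000");
		const superseded = seen.has(key) || (separator >= 0 && seen.has(key.slice(0, separator)));
		seen.add(key);
		if (superseded) out.push(i);
	}
	return out.reverse();
}
```

For the sequence `src/a.ts:1-50`, `src/a.ts`, `src/a.ts:1-50`, `src/b.ts`, `src/a.ts`, this marks indexes 0, 1 and 2 as superseded by the final bare read. The reverse does not hold: in `src/a.ts` followed by `src/a.ts:1-50`, nothing is superseded, because a ranged read does not cover the whole file.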
|
package/src/compaction/utils.ts
CHANGED
|
@@ -3,7 +3,7 @@
|
|
|
3
3
|
*/
|
|
4
4
|
|
|
5
5
|
import type { Message } from "@oh-my-pi/pi-ai";
|
|
6
|
-
import { prompt } from "@oh-my-pi/pi-utils";
|
|
6
|
+
import { formatGroupedPaths, prompt } from "@oh-my-pi/pi-utils";
|
|
7
7
|
import type { AgentMessage } from "../types";
|
|
8
8
|
import fileOperationsTemplate from "./prompts/file-operations.md" with { type: "text" };
|
|
9
9
|
import summarizationSystemPrompt from "./prompts/summarization-system.md" with { type: "text" };
|
|
@@ -26,6 +26,55 @@ export function createFileOps(): FileOperations {
|
|
|
26
26
|
};
|
|
27
27
|
}
|
|
28
28
|
|
|
29
|
+
// Read-tool selector grammar, mirrored from the conservative filesystem splitter in
|
|
30
|
+
// packages/coding-agent/src/tools/path-utils.ts (splitPathAndSel). Keep in sync.
|
|
31
|
+
// A trailing `:chunk` is a selector only when it is a line-range list
|
|
32
|
+
// (`50`, `50-200`, `50+10`, `5-16,960-973`, `..` alias), `raw`, or `conflicts` —
|
|
33
|
+
// alone or as a `range:raw` / `raw:range` compound.
|
|
34
|
+
const RANGE_CHUNK_SRC = String.raw`L?\d+(?:(?:[-+]|\.\.)L?\d+|-|\.\.)?`;
|
|
35
|
+
const RANGE_LIST_SRC = `${RANGE_CHUNK_SRC}(?:,${RANGE_CHUNK_SRC})*`;
|
|
36
|
+
const READ_SELECTOR_RE = new RegExp(`^(?:${RANGE_LIST_SRC}|raw|conflicts)$`, "i");
|
|
37
|
+
const READ_RANGE_ONLY_RE = new RegExp(`^${RANGE_LIST_SRC}$`, "i");
|
|
38
|
+
const READ_RAW_ONLY_RE = /^raw$/i;
|
|
39
|
+
|
|
40
|
+
/**
|
|
41
|
+
* Split a read-tool path into its base path and trailing selector, mirroring the
|
|
42
|
+
* read tool's own splitter. Single source of the grammar in this package: the
|
|
43
|
+
* file-operations list strips selectors via {@link stripReadSelector}, and the
|
|
44
|
+
* supersede-prune pass keys on both parts via `readToolSupersedeKey`.
|
|
45
|
+
*/
|
|
46
|
+
export function splitReadSelector(path: string): { path: string; sel?: string } {
|
|
47
|
+
const colon = path.lastIndexOf(":");
|
|
48
|
+
if (colon <= 0) return { path };
|
|
49
|
+
const candidate = path.slice(colon + 1);
|
|
50
|
+
if (!READ_SELECTOR_RE.test(candidate)) return { path };
|
|
51
|
+
let base = path.slice(0, colon);
|
|
52
|
+
let sel = candidate;
|
|
53
|
+
// Compound trailing selector: `path:1-50:raw` or `path:raw:1-50`.
|
|
54
|
+
const inner = base.lastIndexOf(":");
|
|
55
|
+
if (inner > 0) {
|
|
56
|
+
const innerCandidate = base.slice(inner + 1);
|
|
57
|
+
const innerIsRaw = READ_RAW_ONLY_RE.test(innerCandidate);
|
|
58
|
+
const outerIsRaw = READ_RAW_ONLY_RE.test(candidate);
|
|
59
|
+
const innerIsRange = READ_RANGE_ONLY_RE.test(innerCandidate);
|
|
60
|
+
const outerIsRange = READ_RANGE_ONLY_RE.test(candidate);
|
|
61
|
+
if ((innerIsRaw && outerIsRange) || (innerIsRange && outerIsRaw)) {
|
|
62
|
+
sel = `${innerCandidate}:${candidate}`;
|
|
63
|
+
base = base.slice(0, inner);
|
|
64
|
+
}
|
|
65
|
+
}
|
|
66
|
+
return { path: base, sel };
|
|
67
|
+
}
|
|
68
|
+
|
|
69
|
+
/**
|
|
70
|
+
* Strip a trailing read-tool selector (`:50-200`, `:raw`, `:1-50:raw`, `:conflicts`, …)
|
|
71
|
+
* so the same file read with different line ranges dedupes to one `<files>` entry
|
|
72
|
+
* and matches its write/edit path when computing Read/Write/RW markers.
|
|
73
|
+
*/
|
|
74
|
+
export function stripReadSelector(path: string): string {
|
|
75
|
+
return splitReadSelector(path).path;
|
|
76
|
+
}
|
|
77
|
+
|
|
29
78
|
/**
|
|
30
79
|
* Extract file operations from tool calls in an assistant message.
|
|
31
80
|
*/
|
|
@@ -46,7 +95,7 @@ export function extractFileOpsFromMessage(message: AgentMessage, fileOps: FileOp
|
|
|
46
95
|
|
|
47
96
|
switch (block.name) {
|
|
48
97
|
case "read":
|
|
49
|
-
fileOps.read.add(path);
|
|
98
|
+
fileOps.read.add(stripReadSelector(path));
|
|
50
99
|
break;
|
|
51
100
|
case "write":
|
|
52
101
|
fileOps.written.add(path);
|
|
@@ -70,32 +119,48 @@ export function computeFileLists(fileOps: FileOperations): { readFiles: string[]
|
|
|
70
119
|
}
|
|
71
120
|
|
|
72
121
|
/**
|
|
73
|
-
* Format file operations as
|
|
122
|
+
* Format file operations as one `<files>` tag: a grouped, prefix-folded
|
|
123
|
+
* directory tree (find-tool shape — `# dir/` headers, bare basenames) with a
|
|
124
|
+
* ` (Read)` / ` (Write)` / ` (RW)` marker per file instead of separate
|
|
125
|
+
* read/modified lists. `readSet` is the cumulative read set (`fileOps.read`),
|
|
126
|
+
* used to tell modified files that were also read (RW) from blind writes.
|
|
74
127
|
*/
|
|
75
128
|
const FILE_OPERATION_SUMMARY_LIMIT = 20;
|
|
76
129
|
|
|
77
|
-
function truncateFileList(files: string[]): string[] {
|
|
78
|
-
if (files.length <= FILE_OPERATION_SUMMARY_LIMIT) return files;
|
|
79
|
-
const omitted = files.length - FILE_OPERATION_SUMMARY_LIMIT;
|
|
80
|
-
return [...files.slice(0, FILE_OPERATION_SUMMARY_LIMIT), `… (${omitted} more files omitted)`];
|
|
81
|
-
}
|
|
82
|
-
|
|
83
130
|
function stripFileOperationTags(summary: string): string {
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
return
|
|
131
|
+
// Legacy <read-files>/<modified-files> tags are still stripped so summaries
|
|
132
|
+
// written before the combined <files> tag self-heal on the next compaction.
|
|
133
|
+
return summary
|
|
134
|
+
.replace(/<files>[\s\S]*?<\/files>\s*/g, "")
|
|
135
|
+
.replace(/<read-files>[\s\S]*?<\/read-files>\s*/g, "")
|
|
136
|
+
.replace(/<modified-files>[\s\S]*?<\/modified-files>\s*/g, "")
|
|
137
|
+
.trimEnd();
|
|
87
138
|
}
|
|
88
|
-
export function formatFileOperations(
|
|
139
|
+
export function formatFileOperations(
|
|
140
|
+
readFiles: string[],
|
|
141
|
+
modifiedFiles: string[],
|
|
142
|
+
readSet?: ReadonlySet<string>,
|
|
143
|
+
): string {
|
|
89
144
|
if (readFiles.length === 0 && modifiedFiles.length === 0) return "";
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
145
|
+
const mode = new Map<string, "Read" | "Write" | "RW">();
|
|
146
|
+
for (const file of readFiles) mode.set(file, "Read");
|
|
147
|
+
for (const file of modifiedFiles) mode.set(file, readSet?.has(file) ? "RW" : "Write");
|
|
148
|
+
const all = [...mode.keys()].sort();
|
|
149
|
+
let files = formatGroupedPaths(all.slice(0, FILE_OPERATION_SUMMARY_LIMIT), path => ` (${mode.get(path)})`);
|
|
150
|
+
if (all.length > FILE_OPERATION_SUMMARY_LIMIT) {
|
|
151
|
+
files += `\n… (${all.length - FILE_OPERATION_SUMMARY_LIMIT} more files omitted)`;
|
|
152
|
+
}
|
|
153
|
+
return prompt.render(fileOperationsTemplate, { files });
|
|
94
154
|
}
|
|
95
155
|
|
|
96
|
-
export function upsertFileOperations(
|
|
156
|
+
export function upsertFileOperations(
|
|
157
|
+
summary: string,
|
|
158
|
+
readFiles: string[],
|
|
159
|
+
modifiedFiles: string[],
|
|
160
|
+
readSet?: ReadonlySet<string>,
|
|
161
|
+
): string {
|
|
97
162
|
const baseSummary = stripFileOperationTags(summary);
|
|
98
|
-
const fileOperations = formatFileOperations(readFiles, modifiedFiles);
|
|
163
|
+
const fileOperations = formatFileOperations(readFiles, modifiedFiles, readSet);
|
|
99
164
|
if (!fileOperations) return baseSummary;
|
|
100
165
|
if (!baseSummary) return fileOperations;
|
|
101
166
|
return `${baseSummary}\n\n${fileOperations}`;
|
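The Read/Write/RW marker logic in the new `formatFileOperations` can be illustrated without the package's dependencies. This is a rough sketch under stated assumptions: `classifyFiles` and `renderFlat` are hypothetical names, and `renderFlat` is a flat stand-in for `formatGroupedPaths` from `@oh-my-pi/pi-utils`, whose grouped directory-tree output is not reproduced here.

```typescript
type FileMode = "Read" | "Write" | "RW";

// Same marker assignment as the diff: modified files that also appear in the
// cumulative read set become "RW"; modified files never read stay "Write".
function classifyFiles(
  readFiles: string[],
  modifiedFiles: string[],
  readSet?: ReadonlySet<string>,
): Map<string, FileMode> {
  const mode = new Map<string, FileMode>();
  for (const file of readFiles) mode.set(file, "Read");
  for (const file of modifiedFiles) mode.set(file, readSet?.has(file) ? "RW" : "Write");
  return mode;
}

// HYPOTHETICAL flat renderer: one "path (Marker)" line per file, sorted,
// truncated at `limit` with the same "more files omitted" trailer as the diff.
function renderFlat(mode: Map<string, FileMode>, limit = 20): string {
  const all = Array.from(mode.keys()).sort();
  const lines = all.slice(0, limit).map(path => `${path} (${mode.get(path)})`);
  if (all.length > limit) lines.push(`… (${all.length - limit} more files omitted)`);
  return lines.join("\n");
}

// src/b.ts was read and then modified; src/c.ts was written blind.
const readSet = new Set(["src/a.ts", "src/b.ts"]);
const mode = classifyFiles(["src/a.ts"], ["src/b.ts", "src/c.ts"], readSet);
const listing = renderFlat(mode);
// listing:
// src/a.ts (Read)
// src/b.ts (RW)
// src/c.ts (Write)
```

Passing no `readSet` marks every modified file as `Write`, which matches the optional third parameter in the diff.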
package/src/types.ts
CHANGED
|
@@ -3,6 +3,7 @@ import type {
|
|
|
3
3
|
AssistantMessage,
|
|
4
4
|
AssistantMessageEvent,
|
|
5
5
|
AssistantMessageEventStream,
|
|
6
|
+
Context,
|
|
6
7
|
Effort,
|
|
7
8
|
ImageContent,
|
|
8
9
|
Message,
|
|
@@ -107,6 +108,13 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
|
|
|
107
108
|
*/
|
|
108
109
|
transformContext?: (messages: AgentMessage[], signal?: AbortSignal) => Promise<AgentMessage[]>;
|
|
109
110
|
|
|
111
|
+
/**
|
|
112
|
+
* Optional transform applied to the final provider context after conversion,
|
|
113
|
+
* normalization, and append-only context handling, but before telemetry capture
|
|
114
|
+
* and provider send.
|
|
115
|
+
*/
|
|
116
|
+
transformProviderContext?: (context: Context) => Context;
|
|
117
|
+
|
|
110
118
|
/**
|
|
111
119
|
* Resolves an API key dynamically for each LLM call.
|
|
112
120
|
*
|
|
@@ -210,6 +218,15 @@ export interface AgentLoopConfig extends SimpleStreamOptions {
|
|
|
210
218
|
*/
|
|
211
219
|
getReasoning?: () => Effort | undefined;
|
|
212
220
|
|
|
221
|
+
/**
|
|
222
|
+
* Dynamic reasoning-disable override, resolved per LLM call. When set,
|
|
223
|
+
* its return value overrides the static `disableReasoning` from
|
|
224
|
+
* `SimpleStreamOptions` for that request. Pair with `getReasoning` so
|
|
225
|
+
* mid-run transitions into and out of the explicit `off` state propagate
|
|
226
|
+
* to the next provider call.
|
|
227
|
+
*/
|
|
228
|
+
getDisableReasoning?: () => boolean | undefined;
|
|
229
|
+
|
|
213
230
|
/**
|
|
214
231
|
* Called after a tool call has been validated and is about to execute.
|
|
215
232
|
*
|
|
@@ -358,6 +375,7 @@ export interface AgentState {
|
|
|
358
375
|
systemPrompt: string[];
|
|
359
376
|
model: Model;
|
|
360
377
|
thinkingLevel?: Effort;
|
|
378
|
+
disableReasoning?: boolean;
|
|
361
379
|
tools: AgentTool<any>[];
|
|
362
380
|
messages: AgentMessage[]; // Can include attachments + custom message types
|
|
363
381
|
isStreaming: boolean;
|
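As a usage illustration for the new `AgentLoopConfig` hooks, the sketch below wires `getReasoning`, `getDisableReasoning`, and `transformProviderContext` to a mutable settings object so that a mid-run change is visible on the next call. Everything outside the three hook signatures is an assumption: `Effort` and `Context` are simplified local stand-ins for the `@oh-my-pi/pi-ai` types, and `session` and the message-trimming transform are hypothetical. How the agent loop consumes the return values is not shown in this part of the diff.

```typescript
// Simplified local stand-ins (ASSUMPTIONS, not the real @oh-my-pi/pi-ai types).
type Effort = "low" | "medium" | "high";
interface Context {
  messages: unknown[];
}

// Only the three hooks touched by this diff, with the signatures it shows.
interface ReasoningHooks {
  getReasoning?: () => Effort | undefined;
  getDisableReasoning?: () => boolean | undefined;
  transformProviderContext?: (context: Context) => Context;
}

// HYPOTHETICAL mutable settings that a UI or command handler might update mid-run.
const session = {
  thinkingLevel: "high" as Effort | undefined,
  disableReasoning: false,
};

const hooks: ReasoningHooks = {
  // Both getters read live state, so each provider call can see the latest value.
  getReasoning: () => session.thinkingLevel,
  getDisableReasoning: () => session.disableReasoning,
  // Illustrative transform: returns a new context and leaves the input untouched.
  transformProviderContext: context => ({
    ...context,
    messages: context.messages.slice(-2),
  }),
};

const before = hooks.getDisableReasoning?.(); // false

// The user switches reasoning to the explicit "off" state mid-run.
session.thinkingLevel = undefined;
session.disableReasoning = true;

const after = hooks.getDisableReasoning?.(); // true

const input: Context = { messages: ["m1", "m2", "m3"] };
const output = hooks.transformProviderContext?.(input) ?? input;
```

Reading from shared state inside the getters, instead of capturing a value when the config is built, is what lets a transition into or out of `off` reach the next request, as the doc comment on `getDisableReasoning` describes.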