npm - @oh-my-pi/pi-agent-core - Versions diffs - 15.0.2 → 15.1.0 - Mend

@oh-my-pi/pi-agent-core 15.0.2 → 15.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

package/CHANGELOG.md +50 -0
package/README.md +102 -4
package/package.json +15 -5
package/src/agent-loop.ts +478 -182
package/src/agent.ts +29 -0
package/src/compaction/branch-summarization.ts +333 -0
package/src/compaction/compaction.ts +1047 -0
package/src/compaction/entries.ts +133 -0
package/src/compaction/errors.ts +31 -0
package/src/compaction/index.ts +12 -0
package/src/compaction/messages.ts +212 -0
package/src/compaction/openai.ts +551 -0
package/src/compaction/prompts/auto-handoff-threshold-focus.md +1 -0
package/src/compaction/prompts/branch-summary-context.md +5 -0
package/src/compaction/prompts/branch-summary-preamble.md +2 -0
package/src/compaction/prompts/branch-summary.md +30 -0
package/src/compaction/prompts/compaction-short-summary.md +9 -0
package/src/compaction/prompts/compaction-summary-context.md +5 -0
package/src/compaction/prompts/compaction-summary.md +38 -0
package/src/compaction/prompts/compaction-turn-prefix.md +17 -0
package/src/compaction/prompts/compaction-update-summary.md +45 -0
package/src/compaction/prompts/file-operations.md +10 -0
package/src/compaction/prompts/handoff-document.md +49 -0
package/src/compaction/prompts/summarization-system.md +3 -0
package/src/compaction/pruning.ts +92 -0
package/src/compaction/utils.ts +185 -0
package/src/compaction.ts +1 -0
package/src/index.ts +6 -0
package/src/proxy.ts +5 -1
package/src/run-collector.ts +628 -0
package/src/telemetry.ts +1800 -0
package/src/types.ts +21 -2

package/CHANGELOG.md CHANGED

@@ -2,6 +2,56 @@
 ## [Unreleased]
+## [15.1.0] - 2026-05-15
+### Breaking Changes
+- Removed the `@oh-my-pi/pi-agent-core/compaction/handoff` exports from the package surface, including `extractHandoffDocument`, `createHandoffContext`, and `createHandoffFileName`
+- Removed legacy telemetry constants from the public enum surface (including `AGGREGATE_ATTR`, `GenAIAttr.System`, and old `gen_ai.*` extension keys such as `gen_ai.request.service_tier`/cost/tool status/handoff fields) and replaced them with `OpenAIAttr`, `PiGenAIAttr`, and `PiGenAIAggregateAttr`
+### Added
+- Added `generateHandoff(messages, model, apiKey, options)` to `@oh-my-pi/pi-agent-core/compaction` to generate a handoff document by calling the model directly, using live system/tool context and optional metadata
+- Added generation filtering so the returned handoff document now includes only text content blocks from the model output
+- Added support for defining `AgentTool` schemas with Zod, with legacy TypeBox schemas still supported when generating tool schemas for model calls
+- Added `OpenAIAttr`, `PiGenAIAttr`, and `PiGenAIAggregateAttr` exports so consumers can reference the new `openai.*` and `pi.gen_ai.*` telemetry attribute keys directly
+- Added `onChatUsage` to `AgentTelemetryConfig`, an always-fired hook receiving a `ChatUsageEvent` for every chat step that produced usage. The event carries the chat `span`, `agent`, `conversationId`, `stepNumber`, `model`, `provider`, `serviceTier`, `usage`, optional `cost`, and resolved dynamic `attributes` — independent of whether a `costEstimator` is configured.
+- Added `agentLoopDetailed(...)` and `agentLoopContinueDetailed(...)` helpers that return the same event stream plus a `detailed()` result with run `telemetry` and `coverage`
+- Added `onRunEnd` to `AgentTelemetryConfig` to receive `AgentRunSummary` and `AgentRunCoverage` at the end of each invocation
+- Added run-level telemetry and coverage types/helpers (for example `AgentRunSummary`, `AgentRunCoverage`, `aggregateAgentRunSummaries`, and `aggregateAgentRunCoverage`) to package exports
+- Added generic telemetry extension hooks for dynamic span attributes, provider/agent-name normalization, per-step cost deltas, warning callbacks, bounded summary content capture, and manual chat telemetry for non-loop model calls.
+- Added opt-in OpenTelemetry instrumentation on the agent loop. Pass `telemetry: {}` (or a richer `AgentTelemetryConfig`) on `AgentLoopConfig` / `AgentOptions` / `createAgentSession({ telemetry })` to emit GenAI-semantic-convention spans plus `pi.gen_ai.*` extension attributes:
+- `invoke_agent {agent.name}` wraps each `agentLoop` invocation with `gen_ai.operation.name=invoke_agent`, agent identity, conversation id, and `pi.gen_ai.agent.step.count`.
+- `chat {model}` per provider call, parented under `invoke_agent`, with OTEL request/response/usage attributes (`gen_ai.request.{model,stream,temperature,top_p,top_k,max_tokens,presence_penalty,stop_sequences}`, `gen_ai.response.{model,id,finish_reasons,time_to_first_chunk}`, `gen_ai.usage.{input_tokens,output_tokens,cache_read.input_tokens,cache_creation.input_tokens,reasoning.output_tokens}`) and project extensions for reasoning effort, tool choice, available tools, usage totals, and cost.
+- `execute_tool {tool.name}` per tool call, parented under `invoke_agent`, with `gen_ai.tool.{name,call.id,description,type}` plus the active context so user/MCP/provider spans created inside `tool.execute()` attach as children.
+- One-shot `handoff` span available via the public `recordHandoff(...)` helper for agent-to-agent transitions.
+- Added `AgentTelemetryConfig` hooks (`onSpanStart`, `onSpanEnd`, `costEstimator`), `agent` identity, `attributes` envelope merged onto every span, `captureMessageContent` toggle (defaults to the `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` env var) emitting OTEL-shaped `gen_ai.input.messages` / `gen_ai.output.messages` / `gen_ai.system_instructions` / `gen_ai.tool.call.arguments` / `gen_ai.tool.call.result`, and tracer/tracerName override surfaces.
+- Added `Agent#setTelemetry(config)` so consumers can swap or disable instrumentation between invocations.
+- Added `@opentelemetry/api` as a runtime dependency; SDK setup (exporters, samplers, processors) remains the host's responsibility per standard OTEL conventions. When no SDK is registered, helpers fall through to no-op spans with zero overhead.
+- Added compaction APIs under `@oh-my-pi/pi-agent-core/compaction`, including context compaction, branch summarization, handoff prompt/context helpers, pruning, token budgeting, prompt templates, and OpenAI `/responses/compact` helpers.
+### Changed
+- Changed handoff document generation to force `toolChoice: "none"` when calling the model so tool invocation is disabled during generation
+- Changed `chat` spans to emit normalized provider identifiers in `gen_ai.provider.name` via OTEL-style values (for example `google` to `gcp.gemini`) instead of the legacy `gen_ai.system` label
+- Changed service-tier telemetry to emit `openai.request.service_tier`/`openai.response.service_tier` only when supported by provider via `shouldSendServiceTier`, rather than always using `gen_ai.request.service_tier`
+- Changed captured message payloads so full capture now records OTEL-structured message parts with `pi.gen_ai.request.messages`, `pi.gen_ai.system_instructions`, and `gen_ai.output.messages` including assistant `finish_reason`
+- Changed the `agent_end` event payload to include optional `telemetry` and `coverage` fields when telemetry is enabled, while keeping the legacy payload shape when disabled
+- Changed `invoke_agent` spans to include aggregate `pi.gen_ai.agent.*` attributes for chat/tool counts, latency, usage, cost, errors, and tool coverage
+### Fixed
+- Fixed intent-field injection for tool schemas defined with Zod by converting them to wire schema before mutation
+- Fixed token accounting in `ChatUsageEvent` and usage summaries so `inputTokens` and `totalTokens` now include cached read/write input tokens
+- Fixed `execute_tool` span attributes so `pi.gen_ai.tool.status` and `error.type` now reflect run-level tool outcomes (`ok`, `error`, `skipped`, `blocked`, `timeout`, `aborted`) instead of mapping all non-ok cases the same way
+- Fixed `onRunEnd` callbacks to be safe and idempotent by invoking them once per run and swallowing thrown callback errors so they cannot fail or duplicate successful runs
+- Fixed run telemetry to count interrupted, blocked, or otherwise skipped tool calls so run coverage and tool counters now include those paths
+- Fixed chat failure handling so failed chat steps are still represented in run summaries when provider streaming throws before yielding an assistant message
+- Fixed double-counting of interrupted tool calls in run summaries: the `runTool` early-return on a queued steering interrupt now defers to the post-batch tail sweep so each call is recorded exactly once
+- Fixed `coverage.toolsInvoked` and run-summary tool counters under-reporting tool calls embedded in an aborted/errored assistant message — those calls now record a collector orphan with status `aborted` or `error`
+- Fixed `AgentRunSummary.usage.inputTokens` so it now includes `cache_read` and `cache_write` input tokens, matching `ChatUsageEvent.inputTokens`
+- Fixed span lifecycle hooks (`onSpanStart`, `onSpanEnd`) so a thrown user callback is caught and surfaced via `onTelemetryWarning` (`on_span_start_failed` / `on_span_end_failed`) instead of leaking and aborting the surrounding span
+- Fixed unbounded recursion in summary content capture when a captured value contains a cyclic or deeply nested array — array recursion now respects the same depth cap as plain-object recursion and replaces back-references with `"[Circular]"`
 ## [15.0.1] - 2026-05-14
 ### Breaking Changes
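Note on the last `Fixed` entry for 15.1.0 (bounded summary content capture): the changelog describes arrays and plain objects sharing one depth cap, with back-references replaced by `"[Circular]"`. The sketch below illustrates that behaviour under stated assumptions. It is not the package's implementation: `capture`, `MAX_DEPTH`, and the `"[Truncated]"` marker are hypothetical names and values chosen for illustration, and the real cap is not shown in this diff.

```typescript
// Assumed depth cap; the changelog does not state the real value.
const MAX_DEPTH = 4;

// Hypothetical helper: returns a JSON-safe copy of `value`.
// `ancestors` holds the containers on the current path, so only true
// back-references (cycles) are replaced, not objects shared by siblings.
function capture(value: unknown, depth = 0, ancestors: object[] = []): unknown {
	if (typeof value !== "object" || value === null) {
		const kind = typeof value;
		if (value === null || kind === "string" || kind === "number" || kind === "boolean") {
			return value;
		}
		return String(value); // undefined, functions, symbols, bigints
	}
	if (ancestors.includes(value)) return "[Circular]";
	// The same cap applies to arrays and plain objects.
	if (depth >= MAX_DEPTH) return "[Truncated]";

	const next = [...ancestors, value];
	if (Array.isArray(value)) {
		return value.map(item => capture(item, depth + 1, next));
	}
	const out: Record<string, unknown> = {};
	for (const [key, item] of Object.entries(value)) {
		out[key] = capture(item, depth + 1, next);
	}
	return out;
}

// A self-referencing array terminates instead of recursing forever.
const cyclic: unknown[] = [1];
cyclic.push(cyclic);
console.log(JSON.stringify(capture(cyclic))); // [1,"[Circular]"]
```

Tracking the ancestor path rather than a global "seen" set keeps repeated but acyclic references intact, at the cost of a linear scan per container.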

package/README.md CHANGED

@@ -279,17 +279,17 @@ const agent = new Agent({
 ## Tools
-Define tools using `AgentTool`:
+Define tools using `AgentTool` with a Zod parameter schema (via `z` from `@oh-my-pi/pi-ai`). Legacy TypeBox-authored schemas are still accepted at runtime and are lifted to Zod internally.
 ```typescript
-import { Type } from "@sinclair/typebox";
+import { z } from "@oh-my-pi/pi-ai";
 const readFileTool: AgentTool = {
 	name: "read_file",
 	label: "Read File", // For UI display
 	description: "Read a file's contents",
-	parameters: Type.Object({
-		path: Type.String({ description: "File path" }),
+	parameters: z.object({
+		path: z.string().describe("File path"),
 	}),
 	execute: async (toolCallId, params, signal, onUpdate, context) => {
 		const content = await fs.readFile(params.path, "utf-8");
@@ -370,6 +370,104 @@ for await (const event of agentLoopContinue(context, config)) {
 }
 ```
+## Run-level telemetry
+Every `invoke_agent` produces two values alongside the OTEL spans:
+- **`AgentRunSummary`** — chat / tool / usage / cost / error counters bucketed
+  by status, with per-tool-name breakdowns. Pure aggregation, safe to
+  persist, diff, or assert.
+- **`AgentRunCoverage`** — sorted+deduped `toolsAvailable` / `toolsInvoked` /
+  `toolsUnused` / `modelsUsed` / `providersUsed` arrays. Stable for snapshot
+  tests.
+Three delivery channels (use whichever fits):
+### `agent_end` event (additive)
+```typescript
+for await (const event of agentLoop([userMessage], context, {
+	...config,
+	telemetry: {},
+})) {
+	if (event.type === "agent_end" && event.telemetry) {
+		console.log("tokens:", event.telemetry.usage.totalTokens);
+		console.log("unused tools:", event.coverage?.toolsUnused);
+	}
+}
+```
+The `messages` field is unchanged. Consumers that ignore `telemetry`/
+`coverage` continue to work.
+### `onRunEnd` hook (non-fatal)
+```typescript
+const stream = agentLoop([userMessage], context, {
+	...config,
+	telemetry: {
+		onRunEnd: async (summary, coverage) => {
+			await persistRunSummary(summary, coverage);
+		},
+	},
+});
+```
+Exceptions thrown from `onRunEnd` are caught and logged via `console.warn`;
+a misbehaving telemetry consumer can **never** turn a successful agent run
+into a failed one.
+### `agentLoopDetailed` (typed `detailed()` result)
+Convenience wrapper that preserves the existing stream API and exposes the
+rollup as a typed value:
+```typescript
+const { stream, detailed } = agentLoopDetailed([userMessage], context, {
+	...config,
+	telemetry: {}, // required to populate telemetry/coverage
+});
+for await (const event of stream) {
+	// existing event handling
+}
+const { messages, telemetry, coverage } = await detailed();
+```
+`stream.result()` still resolves to `AgentMessage[]` — no breaking change.
+### Multi-run aggregation
+Callers that drive the loop multiple times (verify pass, benchmark harness)
+fold N summaries with `aggregateAgentRunSummaries` / `aggregateAgentRunCoverage`:
+```typescript
+import {
+	aggregateAgentRunSummaries,
+	aggregateAgentRunCoverage,
+} from "@oh-my-pi/pi-agent-core";
+const summaries: AgentRunSummary[] = [];
+const coverages: AgentRunCoverage[] = [];
+for (const target of targets) {
+	const { detailed } = agentLoopDetailed(/* ... */);
+	const result = await detailed();
+	if (result.telemetry) summaries.push(result.telemetry);
+	if (result.coverage) coverages.push(result.coverage);
+}
+const runSummary = aggregateAgentRunSummaries(summaries);
+const runCoverage = aggregateAgentRunCoverage(coverages);
+```
+### Tool status reporting
+`execute_tool` spans carry `pi.gen_ai.tool.status` ∈
+`"ok" | "error" | "skipped" | "blocked" | "timeout" | "aborted"`.
+`beforeToolCall` blocks throw a distinguishable `ToolCallBlockedError`
+internally; the catch path reports `status: "blocked"` instead of conflating
+with generic tool errors. Pre-run interrupts and tail-sweep skips are
+recorded as `"skipped"` even though they never start a span.
 ## License
 MIT
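Note on the "Tool status reporting" section added in this README hunk: the text says `beforeToolCall` blocks throw a distinguishable `ToolCallBlockedError` so the catch path can report `"blocked"` rather than a generic `"error"`. A minimal sketch of that classification follows. The class below is a local stand-in for the internal error, which this diff does not show, and mapping `AbortError` / `TimeoutError` names to `"aborted"` / `"timeout"` is an assumption about how those outcomes might be detected.

```typescript
type ToolStatus = "ok" | "error" | "skipped" | "blocked" | "timeout" | "aborted";

// Local stand-in for the internal error named in the README.
class ToolCallBlockedError extends Error {
	constructor(message: string) {
		super(message);
		this.name = "ToolCallBlockedError";
	}
}

// Hypothetical classifier: maps a thrown value to a span status.
// Checking `name` keeps the mapping independent of class identity.
function statusFromError(error: unknown): ToolStatus {
	if (!(error instanceof Error)) return "error";
	switch (error.name) {
		case "ToolCallBlockedError":
			return "blocked";
		case "AbortError": // assumed: name used by aborted signals
			return "aborted";
		case "TimeoutError": // assumed: name used by timed-out signals
			return "timeout";
		default:
			return "error";
	}
}

// Wraps a tool execution and reports the status a span would carry.
async function runWithStatus(execute: () => Promise<unknown>): Promise<ToolStatus> {
	try {
		await execute();
		return "ok";
	} catch (error) {
		return statusFromError(error);
	}
}
```

`"skipped"` is absent from the classifier on purpose: per the README, pre-run interrupts and tail-sweep skips never start a span, so they would be recorded outside any catch path.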

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
 	"type": "module",
 	"name": "@oh-my-pi/pi-agent-core",
-	"version": "15.0.2",
+	"version": "15.1.0",
 	"description": "General-purpose agent with transport abstraction, state management, and attachment support",
 	"homepage": "https://omp.sh",
 	"author": "Can Boluk",
@@ -35,12 +35,14 @@
 		"fmt": "biome format --write ."
 	},
 	"dependencies": {
-		"@oh-my-pi/pi-ai": "15.0.2",
-		"@oh-my-pi/pi-natives": "15.0.2",
-		"@oh-my-pi/pi-utils": "15.0.2"
+		"@oh-my-pi/pi-ai": "15.1.0",
+		"@oh-my-pi/pi-natives": "15.1.0",
+		"@oh-my-pi/pi-utils": "15.1.0",
+		"@opentelemetry/api": "^1.9.0"
 	},
 	"devDependencies": {
-		"@sinclair/typebox": "^0.34.49",
+		"@opentelemetry/context-async-hooks": "^2.0.0",
+		"@opentelemetry/sdk-trace-base": "^2.0.0",
 		"@types/bun": "^1.3.14"
 	},
 	"engines": {
@@ -56,6 +58,14 @@
 			"types": "./src/index.ts",
 			"import": "./src/index.ts"
 		},
+		"./compaction": {
+			"types": "./src/compaction.ts",
+			"import": "./src/compaction.ts"
+		},
+		"./compaction/*": {
+			"types": "./src/compaction/*.ts",
+			"import": "./src/compaction/*.ts"
+		},
 		"./*": {
 			"types": "./src/*.ts",
 			"import": "./src/*.ts"
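Note on the new `exports` entries: `./compaction` is an exact key, while `./compaction/*` and `./*` are subpath patterns, so a specifier such as `@oh-my-pi/pi-agent-core/compaction/pruning` should take the more specific pattern. The sketch below is a simplified model of that lookup (exact match first, then the pattern with the longest prefix). `resolveSubpath` is a hypothetical helper, not Node's or Bun's actual resolver, and it covers only the three keys visible in this hunk.

```typescript
// The export keys visible in this hunk, mapped to their "import" targets.
const exportsMap: Record<string, string> = {
	"./compaction": "./src/compaction.ts",
	"./compaction/*": "./src/compaction/*.ts",
	"./*": "./src/*.ts",
};

// Hypothetical, simplified resolver: exact keys win, otherwise the
// pattern with the longest prefix before "*" is used.
function resolveSubpath(subpath: string): string | undefined {
	if (!subpath.includes("*") && exportsMap[subpath] !== undefined) {
		return exportsMap[subpath];
	}
	let best: { prefix: string; target: string } | undefined;
	for (const [key, target] of Object.entries(exportsMap)) {
		const star = key.indexOf("*");
		if (star === -1) continue; // not a pattern key
		const prefix = key.slice(0, star);
		const matches = subpath.startsWith(prefix) && subpath.length > prefix.length;
		if (matches && (!best || prefix.length > best.prefix.length)) {
			best = { prefix, target };
		}
	}
	if (!best) return undefined;
	// Substitute the matched remainder for "*" in the target.
	return best.target.split("*").join(subpath.slice(best.prefix.length));
}

console.log(resolveSubpath("./compaction")); // ./src/compaction.ts
console.log(resolveSubpath("./compaction/pruning")); // ./src/compaction/pruning.ts
console.log(resolveSubpath("./telemetry")); // ./src/telemetry.ts
```

All targets point at `.ts` sources, so consumers need a runtime or bundler that can load TypeScript directly.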