@oh-my-pi/pi-agent-core 15.0.2 → 15.1.0

package/CHANGELOG.md CHANGED
@@ -2,6 +2,56 @@
 
  ## [Unreleased]
 
+ ## [15.1.0] - 2026-05-15
+ ### Breaking Changes
+
+ - Removed the `@oh-my-pi/pi-agent-core/compaction/handoff` exports from the package surface, including `extractHandoffDocument`, `createHandoffContext`, and `createHandoffFileName`
+ - Removed legacy telemetry constants from the public enum surface (including `AGGREGATE_ATTR`, `GenAIAttr.System`, and old `gen_ai.*` extension keys such as `gen_ai.request.service_tier`/cost/tool status/handoff fields) and replaced them with `OpenAIAttr`, `PiGenAIAttr`, and `PiGenAIAggregateAttr`
+
+ ### Added
+
+ - Added `generateHandoff(messages, model, apiKey, options)` to `@oh-my-pi/pi-agent-core/compaction` to generate a handoff document by calling the model directly, using live system/tool context and optional metadata
+ - Added generation filtering so the returned handoff document now includes only text content blocks from the model output
+ - Added support for defining `AgentTool` schemas with Zod, with legacy TypeBox schemas still supported when generating tool schemas for model calls
+ - Added `OpenAIAttr`, `PiGenAIAttr`, and `PiGenAIAggregateAttr` exports so consumers can reference the new `openai.*` and `pi.gen_ai.*` telemetry attribute keys directly
+ - Added `onChatUsage` to `AgentTelemetryConfig`, an always-fired hook receiving a `ChatUsageEvent` for every chat step that produced usage. The event carries the chat `span`, `agent`, `conversationId`, `stepNumber`, `model`, `provider`, `serviceTier`, `usage`, optional `cost`, and resolved dynamic `attributes`, independent of whether a `costEstimator` is configured.
+ - Added `agentLoopDetailed(...)` and `agentLoopContinueDetailed(...)` helpers that return the same event stream plus a `detailed()` result with run `telemetry` and `coverage`
+ - Added `onRunEnd` to `AgentTelemetryConfig` to receive `AgentRunSummary` and `AgentRunCoverage` at the end of each invocation
+ - Added run-level telemetry and coverage types/helpers (for example `AgentRunSummary`, `AgentRunCoverage`, `aggregateAgentRunSummaries`, and `aggregateAgentRunCoverage`) to package exports
+ - Added generic telemetry extension hooks for dynamic span attributes, provider/agent-name normalization, per-step cost deltas, warning callbacks, bounded summary content capture, and manual chat telemetry for non-loop model calls.
+ - Added opt-in OpenTelemetry instrumentation on the agent loop. Pass `telemetry: {}` (or a richer `AgentTelemetryConfig`) on `AgentLoopConfig` / `AgentOptions` / `createAgentSession({ telemetry })` to emit GenAI-semantic-convention spans plus `pi.gen_ai.*` extension attributes:
+   - `invoke_agent {agent.name}` wraps each `agentLoop` invocation with `gen_ai.operation.name=invoke_agent`, agent identity, conversation id, and `pi.gen_ai.agent.step.count`.
+   - `chat {model}` per provider call, parented under `invoke_agent`, with OTEL request/response/usage attributes (`gen_ai.request.{model,stream,temperature,top_p,top_k,max_tokens,presence_penalty,stop_sequences}`, `gen_ai.response.{model,id,finish_reasons,time_to_first_chunk}`, `gen_ai.usage.{input_tokens,output_tokens,cache_read.input_tokens,cache_creation.input_tokens,reasoning.output_tokens}`) and project extensions for reasoning effort, tool choice, available tools, usage totals, and cost.
+   - `execute_tool {tool.name}` per tool call, parented under `invoke_agent`, with `gen_ai.tool.{name,call.id,description,type}` plus the active context so user/MCP/provider spans created inside `tool.execute()` attach as children.
+   - One-shot `handoff` span available via the public `recordHandoff(...)` helper for agent-to-agent transitions.
+ - Added `AgentTelemetryConfig` hooks (`onSpanStart`, `onSpanEnd`, `costEstimator`), `agent` identity, `attributes` envelope merged onto every span, `captureMessageContent` toggle (defaults to the `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` env var) emitting OTEL-shaped `gen_ai.input.messages` / `gen_ai.output.messages` / `gen_ai.system_instructions` / `gen_ai.tool.call.arguments` / `gen_ai.tool.call.result`, and tracer/tracerName override surfaces.
+ - Added `Agent#setTelemetry(config)` so consumers can swap or disable instrumentation between invocations.
+ - Added `@opentelemetry/api` as a runtime dependency; SDK setup (exporters, samplers, processors) remains the host's responsibility per standard OTEL conventions. When no SDK is registered, helpers fall through to no-op spans with zero overhead.
+ - Added compaction APIs under `@oh-my-pi/pi-agent-core/compaction`, including context compaction, branch summarization, handoff prompt/context helpers, pruning, token budgeting, prompt templates, and OpenAI `/responses/compact` helpers.
+
+ ### Changed
+
+ - Changed handoff document generation to force `toolChoice: "none"` when calling the model so tool invocation is disabled during generation
+ - Changed `chat` spans to emit normalized provider identifiers in `gen_ai.provider.name` via OTEL-style values (for example `google` to `gcp.gemini`) instead of the legacy `gen_ai.system` label
+ - Changed service-tier telemetry to emit `openai.request.service_tier`/`openai.response.service_tier` only when supported by provider via `shouldSendServiceTier`, rather than always using `gen_ai.request.service_tier`
+ - Changed captured message payloads so full capture now records OTEL-structured message parts with `pi.gen_ai.request.messages`, `pi.gen_ai.system_instructions`, and `gen_ai.output.messages` including assistant `finish_reason`
+ - Changed the `agent_end` event payload to include optional `telemetry` and `coverage` fields when telemetry is enabled, while keeping the legacy payload shape when disabled
+ - Changed `invoke_agent` spans to include aggregate `pi.gen_ai.agent.*` attributes for chat/tool counts, latency, usage, cost, errors, and tool coverage
+
+ ### Fixed
+
+ - Fixed intent-field injection for tool schemas defined with Zod by converting them to wire schema before mutation
+ - Fixed token accounting in `ChatUsageEvent` and usage summaries so `inputTokens` and `totalTokens` now include cached read/write input tokens
+ - Fixed `execute_tool` span attributes so `pi.gen_ai.tool.status` and `error.type` now reflect run-level tool outcomes (`ok`, `error`, `skipped`, `blocked`, `timeout`, `aborted`) instead of mapping all non-ok cases the same way
+ - Fixed `onRunEnd` callbacks to be safe and idempotent by invoking them once per run and swallowing thrown callback errors so they cannot fail or duplicate successful runs
+ - Fixed run telemetry to count interrupted, blocked, or otherwise skipped tool calls so run coverage and tool counters now include those paths
+ - Fixed chat failure handling so failed chat steps are still represented in run summaries when provider streaming throws before yielding an assistant message
+ - Fixed double-counting of interrupted tool calls in run summaries: the `runTool` early-return on a queued steering interrupt now defers to the post-batch tail sweep so each call is recorded exactly once
+ - Fixed `coverage.toolsInvoked` and run-summary tool counters under-reporting tool calls embedded in an aborted/errored assistant message; those calls now record a collector orphan with status `aborted` or `error`
+ - Fixed `AgentRunSummary.usage.inputTokens` so it now includes `cache_read` and `cache_write` input tokens, matching `ChatUsageEvent.inputTokens`
+ - Fixed span lifecycle hooks (`onSpanStart`, `onSpanEnd`) so a thrown user callback is caught and surfaced via `onTelemetryWarning` (`on_span_start_failed` / `on_span_end_failed`) instead of leaking and aborting the surrounding span
+ - Fixed unbounded recursion in summary content capture when a captured value contains a cyclic or deeply nested array; array recursion now respects the same depth cap as plain-object recursion and replaces back-references with `"[Circular]"`
+
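Reviewer note on the last Fixed entry: the changelog describes one depth cap shared by arrays and plain objects, with back-references replaced by `"[Circular]"`. The sketch below illustrates that idea only; it is not the package's capture code. `boundedCapture`, `MAX_DEPTH`, and the `"[MaxDepth]"` marker are hypothetical names, and the real cap value is not stated in this diff.

```typescript
// Hypothetical sketch, not the package's implementation.
const MAX_DEPTH = 4; // assumed cap; the changelog does not give the real value

function boundedCapture(
  value: unknown,
  depth = 0,
  ancestors: Set<object> = new Set(),
): unknown {
  // Primitives and null are returned unchanged.
  if (value === null || typeof value !== "object") return value;
  // A value that is already one of its own ancestors is a back-reference.
  if (ancestors.has(value)) return "[Circular]";
  // The same cap applies to arrays and plain objects.
  if (depth >= MAX_DEPTH) return "[MaxDepth]";

  ancestors.add(value);
  let out: unknown;
  if (Array.isArray(value)) {
    out = value.map((item) => boundedCapture(item, depth + 1, ancestors));
  } else {
    const copy: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      copy[key] = boundedCapture(child, depth + 1, ancestors);
    }
    out = copy;
  }
  // Only ancestors count as circular; a value repeated in siblings is copied twice.
  ancestors.delete(value);
  return out;
}
```

Tracking only the current ancestor chain, rather than every object seen so far, keeps shared but non-cyclic references intact in the captured copy.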
  ## [15.0.1] - 2026-05-14
  ### Breaking Changes
 
package/README.md CHANGED
@@ -279,17 +279,17 @@ const agent = new Agent({
 
  ## Tools
 
- Define tools using `AgentTool`:
+ Define tools using `AgentTool` with a Zod parameter schema (via `z` from `@oh-my-pi/pi-ai`). Legacy TypeBox-authored schemas are still accepted at runtime and are lifted to Zod internally.
 
  ```typescript
- import { Type } from "@sinclair/typebox";
+ import { z } from "@oh-my-pi/pi-ai";
 
  const readFileTool: AgentTool = {
    name: "read_file",
    label: "Read File", // For UI display
    description: "Read a file's contents",
-   parameters: Type.Object({
-     path: Type.String({ description: "File path" }),
+   parameters: z.object({
+     path: z.string().describe("File path"),
    }),
    execute: async (toolCallId, params, signal, onUpdate, context) => {
      const content = await fs.readFile(params.path, "utf-8");
@@ -370,6 +370,104 @@ for await (const event of agentLoopContinue(context, config)) {
  }
  ```
 
+ ## Run-level telemetry
+ Every `invoke_agent` produces two values alongside the OTEL spans:
+
+ - **`AgentRunSummary`**: chat / tool / usage / cost / error counters bucketed
+   by status, with per-tool-name breakdowns. Pure aggregation, safe to
+   persist, diff, or assert.
+ - **`AgentRunCoverage`**: sorted+deduped `toolsAvailable` / `toolsInvoked` /
+   `toolsUnused` / `modelsUsed` / `providersUsed` arrays. Stable for snapshot
+   tests.
+
+ Three delivery channels (use whichever fits):
+
+ ### `agent_end` event (additive)
+
+ ```typescript
+ for await (const event of agentLoop([userMessage], context, {
+   ...config,
+   telemetry: {},
+ })) {
+   if (event.type === "agent_end" && event.telemetry) {
+     console.log("tokens:", event.telemetry.usage.totalTokens);
+     console.log("unused tools:", event.coverage?.toolsUnused);
+   }
+ }
+ ```
+
+ The `messages` field is unchanged. Consumers that ignore `telemetry`/
+ `coverage` continue to work.
+
+ ### `onRunEnd` hook (non-fatal)
+
+ ```typescript
+ const stream = agentLoop([userMessage], context, {
+   ...config,
+   telemetry: {
+     onRunEnd: async (summary, coverage) => {
+       await persistRunSummary(summary, coverage);
+     },
+   },
+ });
+ ```
+
+ Exceptions thrown from `onRunEnd` are caught and logged via `console.warn`;
+ a misbehaving telemetry consumer can **never** turn a successful agent run
+ into a failed one.
+
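Reviewer note: the README and the changelog together describe `onRunEnd` as invoked once per run, with thrown errors swallowed. One way such a guard could look is sketched below. `makeSafeRunEnd` and its `warn` parameter are hypothetical names, and absorbing rejected promises from an async hook is an assumption of this sketch, not something the diff confirms.

```typescript
// Hypothetical sketch of a once-only, non-throwing hook wrapper.
type RunEndHook<S, C> = (summary: S, coverage: C) => void | Promise<void>;

function makeSafeRunEnd<S, C>(
  hook: RunEndHook<S, C> | undefined,
  warn: (message: string, error: unknown) => void = console.warn,
): (summary: S, coverage: C) => void {
  let fired = false;
  return (summary, coverage) => {
    // Idempotent: any call after the first is ignored.
    if (fired || !hook) return;
    fired = true;
    try {
      const result = hook(summary, coverage);
      // Assumption: also absorb a rejection when the hook is async.
      if (result instanceof Promise) {
        result.catch((error) => warn("onRunEnd failed", error));
      }
    } catch (error) {
      // A synchronous throw is reported and never propagates to the run.
      warn("onRunEnd failed", error);
    }
  };
}
```

Setting `fired` before calling the hook means a hook that throws is still not retried, which matches the "once per run" wording.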
+ ### `agentLoopDetailed` (typed `detailed()` result)
+
+ Convenience wrapper that preserves the existing stream API and exposes the
+ rollup as a typed value:
+
+ ```typescript
+ const { stream, detailed } = agentLoopDetailed([userMessage], context, {
+   ...config,
+   telemetry: {}, // required to populate telemetry/coverage
+ });
+
+ for await (const event of stream) {
+   // existing event handling
+ }
+
+ const { messages, telemetry, coverage } = await detailed();
+ ```
+
+ `stream.result()` still resolves to `AgentMessage[]`, so this is not a breaking change.
+
+ ### Multi-run aggregation
+
+ Callers that drive the loop multiple times (verify pass, benchmark harness)
+ fold N summaries with `aggregateAgentRunSummaries` / `aggregateAgentRunCoverage`:
+
+ ```typescript
+ import {
+   aggregateAgentRunSummaries,
+   aggregateAgentRunCoverage,
+ } from "@oh-my-pi/pi-agent-core";
+
+ const summaries: AgentRunSummary[] = [];
+ const coverages: AgentRunCoverage[] = [];
+ for (const target of targets) {
+   const { detailed } = agentLoopDetailed(/* ... */);
+   const result = await detailed();
+   if (result.telemetry) summaries.push(result.telemetry);
+   if (result.coverage) coverages.push(result.coverage);
+ }
+ const runSummary = aggregateAgentRunSummaries(summaries);
+ const runCoverage = aggregateAgentRunCoverage(coverages);
+ ```
+
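Reviewer note: the README calls the coverage arrays "sorted+deduped", which suggests that folding several runs amounts to a set union per field. The sketch below shows that reading under stated assumptions. `CoverageSketch` and `mergeCoverage` are hypothetical stand-ins for `AgentRunCoverage` and `aggregateAgentRunCoverage`, and recomputing `toolsUnused` as "available in any run, invoked in none" is a guess at the semantics.

```typescript
// Hypothetical shape built from the field names in the README text.
interface CoverageSketch {
  toolsAvailable: string[];
  toolsInvoked: string[];
  toolsUnused: string[];
  modelsUsed: string[];
  providersUsed: string[];
}

// Union several string lists, drop duplicates, and sort for stable snapshots.
function sortedUnion(lists: string[][]): string[] {
  const all = ([] as string[]).concat(...lists);
  return Array.from(new Set(all)).sort();
}

function mergeCoverage(runs: CoverageSketch[]): CoverageSketch {
  const toolsAvailable = sortedUnion(runs.map((run) => run.toolsAvailable));
  const toolsInvoked = sortedUnion(runs.map((run) => run.toolsInvoked));
  const invoked = new Set(toolsInvoked);
  return {
    toolsAvailable,
    toolsInvoked,
    // Assumption: a tool is unused only if no run invoked it.
    toolsUnused: toolsAvailable.filter((tool) => !invoked.has(tool)),
    modelsUsed: sortedUnion(runs.map((run) => run.modelsUsed)),
    providersUsed: sortedUnion(runs.map((run) => run.providersUsed)),
  };
}
```

Under this reading, a tool that one run left unused but another run invoked drops out of `toolsUnused` in the merged result.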
+ ### Tool status reporting
+
+ `execute_tool` spans carry `pi.gen_ai.tool.status` ∈
+ `"ok" | "error" | "skipped" | "blocked" | "timeout" | "aborted"`.
+ `beforeToolCall` blocks throw a distinguishable `ToolCallBlockedError`
+ internally; the catch path reports `status: "blocked"` instead of conflating
+ them with generic tool errors. Pre-run interrupts and tail-sweep skips are
+ recorded as `"skipped"` even though they never start a span.
+
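Reviewer note: the paragraph above implies a catch path that maps each failure to a distinct status. A rough sketch of such a mapping follows. The two error classes here are local stand-ins (the real `ToolCallBlockedError` is internal and not shown in this diff, and `ToolTimeoutError` is purely hypothetical), and `classifyToolOutcome` is an illustrative name. `"skipped"` is not produced here because, per the text, skipped calls never reach execution.

```typescript
type ToolStatus = "ok" | "error" | "skipped" | "blocked" | "timeout" | "aborted";

// Local stand-in for the internal error named in the README.
class ToolCallBlockedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ToolCallBlockedError";
  }
}

// Hypothetical: the diff does not say how timeouts are signalled.
class ToolTimeoutError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ToolTimeoutError";
  }
}

// Compare by name so the check does not depend on prototype chains.
function hasName(error: unknown, name: string): boolean {
  return error instanceof Error && error.name === name;
}

function classifyToolOutcome(
  error: unknown,
  signal?: { aborted: boolean },
): ToolStatus {
  if (error === undefined) return "ok";
  if (hasName(error, "ToolCallBlockedError")) return "blocked";
  if (hasName(error, "ToolTimeoutError")) return "timeout";
  // Any other failure while the caller's signal is aborted counts as aborted.
  if (signal?.aborted) return "aborted";
  return "error";
}
```

Checking the specific error types before the abort flag keeps a blocked call reported as `"blocked"` even if the run was aborted around the same time.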
  ## License
 
  MIT
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "type": "module",
    "name": "@oh-my-pi/pi-agent-core",
-   "version": "15.0.2",
+   "version": "15.1.0",
    "description": "General-purpose agent with transport abstraction, state management, and attachment support",
    "homepage": "https://omp.sh",
    "author": "Can Boluk",
@@ -35,12 +35,14 @@
      "fmt": "biome format --write ."
    },
    "dependencies": {
-     "@oh-my-pi/pi-ai": "15.0.2",
-     "@oh-my-pi/pi-natives": "15.0.2",
-     "@oh-my-pi/pi-utils": "15.0.2"
+     "@oh-my-pi/pi-ai": "15.1.0",
+     "@oh-my-pi/pi-natives": "15.1.0",
+     "@oh-my-pi/pi-utils": "15.1.0",
+     "@opentelemetry/api": "^1.9.0"
    },
    "devDependencies": {
-     "@sinclair/typebox": "^0.34.49",
+     "@opentelemetry/context-async-hooks": "^2.0.0",
+     "@opentelemetry/sdk-trace-base": "^2.0.0",
      "@types/bun": "^1.3.14"
    },
    "engines": {
  "engines": {
@@ -56,6 +58,14 @@
        "types": "./src/index.ts",
        "import": "./src/index.ts"
      },
+     "./compaction": {
+       "types": "./src/compaction.ts",
+       "import": "./src/compaction.ts"
+     },
+     "./compaction/*": {
+       "types": "./src/compaction/*.ts",
+       "import": "./src/compaction/*.ts"
+     },
      "./*": {
        "types": "./src/*.ts",
        "import": "./src/*.ts"