RubyGems - claude-agent-sdk - Versions diffs - 0.17.0 → 0.18.0 - Mend

claude-agent-sdk 0.17.0 → 0.18.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

checksums.yaml +4 -4
data/CHANGELOG.md +56 -0
data/README.md +4 -2
data/docs/configuration.md +13 -2
data/docs/observability.md +28 -4
data/docs/sessions.md +15 -2
data/lib/claude_agent_sdk/command_builder.rb +69 -22
data/lib/claude_agent_sdk/fiber_boundary.rb +39 -1
data/lib/claude_agent_sdk/instrumentation/otel.rb +97 -23
data/lib/claude_agent_sdk/message_parser.rb +4 -1
data/lib/claude_agent_sdk/observer.rb +23 -3
data/lib/claude_agent_sdk/query.rb +223 -88
data/lib/claude_agent_sdk/sdk_mcp_server.rb +232 -181
data/lib/claude_agent_sdk/session_store.rb +4 -0
data/lib/claude_agent_sdk/sessions.rb +144 -24
data/lib/claude_agent_sdk/subprocess_cli_transport.rb +184 -50
data/lib/claude_agent_sdk/testing/session_store_conformance.rb +15 -1
data/lib/claude_agent_sdk/types.rb +43 -5
data/lib/claude_agent_sdk/version.rb +1 -1
data/lib/claude_agent_sdk.rb +359 -93
metadata +12 -6

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: abd1a08c12369ca6417cc28946a153f2f641ba38e1770026ff53ae36465da34c
-  data.tar.gz: c1a35168f601b9bf8f6680cf66565c6f06f26f0c4817d2e5a8afbe2923b0935d
+  metadata.gz: 9ca7129acd0fb9330d1a49307fde85efff7febaf503bcbda5dbe63d82b3f6476
+  data.tar.gz: 49568fac24a25b129c9f97e5a8ef6c2cd0a3f4db6be6f721c69be7ea31261da5
 SHA512:
-  metadata.gz: 8fe246f00316a5d6d704e30d4a6df1d732be3d494d47fc222fe44738d7fc17914fd281a36b6fdf5200ddf69e7a8f2b4f6d471ec48799d0005375097a0d535ef8
-  data.tar.gz: 82cb35a8b05262b407014fd704eef12ac00f13bbb15f428c40e402b4fc6eb9576fd61b6f328bc4b00dc356365e2a16c18f6e5f97156877c0f7f4c23b499f8e96
+  metadata.gz: 54a7f40056f3b97db66a31e34dab0d659432731bef991e7262b32d0a09efd419248ea6f150fe2906292b281f293b964dde85cb0e7b9bdd7d220b0d8fb57a8441
+  data.tar.gz: 430e1f1d708a2b75b425523fd87d1a0aab8ef68850def2c66205fcb0d20552f374d2f92feb03217a301410d5b016d7c465eeac97b0aacfc497bafea0412c69ff

data/CHANGELOG.md CHANGED

@@ -7,6 +7,62 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.18.0] - 2026-06-12
+### Added
+- `query(transport:)` — inject a pre-constructed custom transport instance (Python parity): CLI discovery, version check, and resume materialization are skipped; the SDK calls `#connect`/`#close` on the instance (including after a failed connect — a documented safety deviation from Python).
+- `Client#query` now accepts an Enumerable of message Hashes / JSONL Strings in addition to a String (Python parity): items stream inline on the caller, `session_id` is stamped onto Hashes that lack one (explicit values, even nil, preserved), and non-Hash/String items raise instead of being silently serialized. Note: JSONL String items pass through verbatim and carry their own `session_id` — generate them with the matching `session_id:` argument (`Streaming.user_message` defaults to `'default'`).
+- `Client.open(prompt = nil, options:, ...) { |client| }` — block-scoped lifecycle mirroring Python's `async with ClaudeSDKClient()`: connects, yields, always disconnects; works standalone (creates a reactor via `Sync`) and inside `Async` blocks; returns the block's value.
+- `ClaudeAgentOptions#user` is now applied — the CLI subprocess is spawned as that OS user via spawn's `:uid` (String username or Integer uid, Unix; previously accepted but silently ignored). On unsupported platforms the failure is loud (`CLIConnectionError`), not silent.
+- `HookMatcher#timeout` is now sent to the CLI in the initialize request (per matcher, seconds — Python wire parity); the SDK-side enforcement remains as defense-in-depth.
+- `ClaudeAgentSDK.list_subagents` / `ClaudeAgentSDK.get_subagent_messages` — local-disk subagent transcript readers (disk counterparts to the existing `*_from_store` variants; Python SDK parity, upstream #825). Scans `<projectDir>/<sessionId>/subagents/**/agent-<id>.jsonl` including nested `workflows/<runId>/` paths. Note: `limit: 0` returns `[]` per the Ruby read-API family convention (Python returns all).
+- W3C trace-context propagation to the CLI subprocess (Python SDK #821 parity): when `opentelemetry` is loaded and a span is active at connect, `TRACEPARENT`/`TRACESTATE` (and any other propagator carrier keys such as `BAGGAGE`, uppercased) are injected into the CLI env so CLI-side OTel spans join the caller's distributed trace. Explicit `ClaudeAgentOptions#env` keys win; stale inherited W3C env is scrubbed only when an active span replaces it. No new dependency — no-op without the opentelemetry gem.
+- `CLAUDE_AGENT_SDK_SKIP_VERSION_CHECK` env var skips the CLI version check (any non-empty value, Python parity), and the check itself now has a 2-second deadline — a hung `claude -v` (NFS-mounted binary, wedged Node bootstrap) no longer hangs `connect` forever. The unsupported-version warning now includes the CLI path.
+- `ClaudeAgentOptions#skills` (Python parity): enable skills for the main session with `'all'` or an Array of names. Injects `Skill`/`Skill(name)` into `--allowedTools`, defaults `setting_sources` to `['user', 'project']` when unset, and sends explicit lists via the `initialize` control request so the CLI filters which skills load. `[]` hides all skills; this is a context filter, not a sandbox.
+- OTel spans now report prompt-cache usage: `gen_ai.usage.cache_creation_input_tokens` / `gen_ai.usage.cache_read_input_tokens` on generation and session spans, plus OpenInference `llm.token_count.prompt_details.cache_read`/`.cache_write` on the session span. Anthropic's `input_tokens` excludes cached tokens, so heavily cached sessions previously under-reported prompt volume by orders of magnitude. Strictly additive — existing keys unchanged.
+- `on_user_prompt` observers now fire for Enumerator/streaming-input prompts (once per `type: 'user'` message with extractable text). OTel traces for streaming sessions now get an `input.value` for the first trace; later turns' capture depends on prompt timing relative to each init (`OTelObserver` keeps one prompt per trace).
+### Fixed
+- `SubprocessCLITransport#end_input` now takes the stdin mutex like write/close (its lock-free close could race a concurrent writer into a misleading "undefined method ... for nil" error that also poisoned `@exit_error`).
+- `CLAUDE_CODE_ENTRYPOINT` now defaults to `sdk-rb` regardless of inherited process env (an ambient `cli` value from running inside a Claude Code terminal previously won via `||=`, mis-attributing telemetry); `options.env` can still override it. `CLAUDE_AGENT_SDK_VERSION` is now always SDK-set, never overridable (Python merge-order parity).
+- Unknown hook events now arrive as `UnknownHookInput` carrying the wire `hook_event_name` and the complete raw payload (previously all event-specific fields were dropped and the name was nil; Python passes the raw dict through, losing nothing).
+- Oversized CLI stdout lines no longer allocate unbounded memory: the read loop's 1MB buffer cap previously fired only AFTER `each_line` had read the whole line into memory; reads are now chunk-bounded at `max_buffer_size + 1` bytes (Python's TextReceiveStream reads ≤64KB chunks — same incremental-cap semantics). The same bound applies to the stderr drain loops.
+- `advisor_tool_result` content blocks now parse into `ServerToolResultBlock` (they previously fell through to `UnknownBlock`); the `server_tool_result` wire type was dead code — no CLI version emits it — and now takes the forward-compat `UnknownBlock` path.
+- README's Client quick-start example used `receive_messages` with no termination and hung forever when pasted; it now uses `receive_response`.
+- `Configuration#default_options` containers are now deep-duplicated when constructing `ClaudeAgentOptions`: `options.allowed_tools << 'Bash'` in one session no longer mutates the global default (cross-session permission widening) and nested default hashes/arrays no longer leak mutations. Leaf objects (callbacks, observer factories, SDK MCP server instances) keep identity.
+- Hash-form `thinking` config (`{ type: 'adaptive'|'enabled'|'disabled', budget_tokens:, display: }`) is now serialized to the CLI; it was previously dropped silently and also suppressed the `max_thinking_tokens` fallback. Invalid shapes raise a clear `ArgumentError`.
+- Control-protocol client methods (`interrupt`, `set_model`, …) can now be called from inside hook/`can_use_tool`/SDK-MCP callbacks (Python reentrancy parity): the call previously wrote the request to the CLI and then crashed with an opaque `RuntimeError: No async task available!`, silently dropping the response. Worker-thread callers now wait on a level-triggered queue with the same timeout semantics.
+- Closed a lost-wakeup race in control-request waiting: `Async::Condition` is edge-triggered, so a `control_response` arriving while the sender was still suspended in a transport write was dropped and the caller waited the full 1200s timeout (reachable with custom transports whose `#write` suspends after delivery, and via the read-loop error broadcast). Senders now check the result slot before and between waits (anyio.Event level-trigger semantics, like Python).
+- `SdkMcpServer#handle_json` now actually serves resources and prompts: `resources/list` crashed inside the mcp gem (`Class#to_h`), `resources/read` silently returned `{contents: []}` for every URI (the read path referenced `MCP::ResourceContents`, a constant that has never existed in any mcp gem version), `prompts/list` mangled names (`codeReview` → `code_review`) and dropped descriptions/arguments, and `prompts/get` could not find any prompt. Resources are now served as `MCP::Resource` instances with the gem's registered read handler; prompts via `MCP::Prompt.define`. Both delegate to the SDK's own readers, preserving the FiberBoundary hop and result validation.
+- Pre-built JSON Schemas with symbol values (`type: :object`) are no longer silently mangled into garbage (schema meta-keys leaked as parameters, every valid `tools/call` was then rejected with "Missing required arguments: type, properties, required"). Schema normalization is now a single source of truth; `additionalProperties`/`enum`/`description` survive, and a malformed pre-built schema raises a clear error lazily instead of producing silent garbage.
+- `list_sessions`, `get_session_info`, and `get_session_messages` no longer raise `Errno::ENOENT` for a `directory:` that does not exist — they return `[]`/`nil`, matching the Python SDK (a fresh checkout or a deleted project directory previously crashed; sessions recorded for a since-deleted directory are now found again).
+- `get_session_messages` and `import_session_to_store` no longer stop at a 0-byte transcript stub: the session-file search skips empty files and continues to worktree project dirs, matching Python's `st_size > 0` resolver (a stub in the canonical project dir previously hid the real worktree transcript).
+- An empty `CLAUDE_CONFIG_DIR` is now treated as unset (falling back to `~/.claude`, NFC-normalized) across all session read/mutation paths and SessionStore mirroring, matching the Node CLI and Python SDK — previously the projects dir resolved to `/projects` and transcript mirroring was silently disabled.
+- `list_sessions` / `get_session_info` no longer return `created_at: nil` when a transcript's first JSONL record is a metadata-only entry (e.g. `permission-mode`) with no `timestamp` field — the whole 64 KiB head window is now scanned for the first timestamp (Python #907 parity). The store-backed readers already folded every entry, so the disk and store paths now agree on `created_at`.
+- `Observer#on_error` is now actually invoked — once per error surfacing from `query()` or `Client#query`/`#receive_messages`/`#receive_response`/`#connect`, before `on_close` where both fire. Crashed sessions now produce OTel traces with error status and a recorded exception.
+- `OTelObserver` no longer mislabels traces when one instance is reused: the buffered prompt/output are reset at every trace boundary (previously every trace after the first showed the first query's prompt as its `input.value`, and a nil-result trace could leak the previous query's output).
+- `OTelObserver` no longer leaks unfinished (never-exported) spans: a new `InitMessage` without an intervening `ResultMessage` finishes the superseded root span, and tool spans still pending at a `ResultMessage` or superseded init are finished at that boundary instead of waiting for `on_close`.
+- OTel tool spans now serialize Array tool-result content as JSON with `output.mime_type: application/json` (previously Ruby `inspect` format); String results gain `output.mime_type: text/plain`; nil results omit `output.value` (previously empty string).
+- `break` inside the user block of `query()`, `Client#receive_messages` and `Client#receive_response` now stops iteration (returning the break value) instead of raising `LocalJumpError` — the FiberBoundary thread hop translates the break back to the calling fiber (new internal `FiberBoundary.invoke_iteration`).
+- CLI stdout/stderr are now always decoded as UTF-8: under `LANG=C`/`LC_ALL=C` (minimal Docker images, systemd, CI) the pipes inherited US-ASCII and the first non-ASCII byte from the CLI killed the read loop with `Encoding::CompatibilityError`. Mirrors the Python SDK's UTF-8 `TextReceiveStream`.
+- The unsupported-CLI-version warning now actually fires: `Array#<` does not exist, so the version comparison raised `NoMethodError` into the version check's blanket rescue and the warning had never been emitted; the `-v` output is also UTF-8-scrubbed so a stray invalid byte cannot suppress it.
+- `query()` can no longer hang forever when the CLI dies while a streaming-input Enumerator is blocked: the input stream task is tracked on the `Query` (new `Query#spawn_task`, mirroring the Python SDK's `_child_tasks`) and stopped by `close`, so the real error now propagates to the caller instead of decaying to an async console warning.
+- A CLI crash now always unblocks in-flight control requests (`interrupt`, `set_model`, …) immediately instead of leaving them to the 1200s control-request timeout, and a real crash in a multi-turn session is reported to consumers instead of being silently swallowed after the first result.
+- Hooks and SDK MCP servers no longer silently stop working when a one-shot `query()`'s first turn runs past 60 seconds: stdin stays open (without timeout) until the first result, mirroring Python SDK commit c3d96cb. String-prompt queries also now stream messages to the block while that wait is pending instead of deferring delivery.
+### Changed
+- `Client#connect(enumerable)` now streams the initial prompt in the **background** (Python parity): connect returns immediately instead of blocking until the stream is exhausted (interactive streams that wait for a response before yielding no longer deadlock), Hash messages are serialized as JSON (previously Ruby `inspect` via `to_s` — never valid wire format), stdin closes when the stream is exhausted, and stream errors fire `Observer#on_error` and are logged instead of raising out of `connect`.
+- SDK MCP `tools/call` now routes through the official `MCP::Server`: arguments are JSON-Schema-validated (draft4) against the tool's `inputSchema` **before** the handler runs (Python parity — its mcp lowlevel server does the same), and validation failures return in-band `isError` results ("Missing required arguments: …"/"Invalid arguments: …") without invoking the handler. The `mcp` dependency floor rises to `>= 0.6, < 1` (0.4 turned these into protocol errors; 0.5 serializes empty icons arrays into list responses). The SDK normalizes tools/call error envelopes itself, so the gem's per-version error-behavior swings (0.7.1+/0.18 raise protocol errors again) never leak through. Error texts on this path now come from the gem (e.g. "Tool not found: X", "Internal error calling tool X: msg"). Validation can be disabled globally via `MCP.configure { |c| c.validate_tool_call_arguments = false }`. Schemas the draft4 metaschema rejects (numeric `exclusiveMinimum`, `$ref` — valid modern JSON Schema that Python accepts) fall back to validation-disabled with a one-time warning instead of bricking the tool.
+- SDK MCP tool failures are now reported **in-band** (`isError: true` with the error text in `content`) instead of as JSON-RPC `-32603` protocol errors — matching the MCP spec, the Python SDK, and the official mcp gem. The model can now read the error text and self-correct. This covers handler exceptions, unknown tools, and malformed handler results; `SdkMcpServer#call_tool` no longer raises for these cases.
+- `SdkMcpServer#call_tool`/`#read_resource`/`#get_prompt` now accept string-keyed handler results (Python handlers return string-keyed dicts naturally); previously they rejected them with an error the model saw as a protocol failure.
+- When an explicit `directory:` is given, `get_session_messages` and `import_session_to_store` no longer fall back to scanning all project directories: a session that only exists in an unrelated project now returns `[]` / raises `Errno::ENOENT` instead of silently returning/importing another project's data under the wrong project key (Python parity; `directory: nil` still searches all projects). A 0-byte-stub-only session now raises `Errno::ENOENT` from import instead of silently importing zero entries.
+- `can_use_tool` callbacks now receive fully populated `ToolPermissionContext`s: the CLI display fields (`title`, `display_name`, `description`, `blocked_path`, `decision_reason`) are forwarded (previously always nil), and `suggestions` are typed `PermissionUpdate` objects instead of raw wire hashes (Python #920 parity) — `PermissionUpdate.new` also hydrates wire-format rule hashes into `PermissionRuleValue`. Code treating suggestion entries as plain Hashes (`dig`, `fetch`) must use the typed accessors; echoing `context.suggestions` into `updated_permissions` keeps working.
+- A `ProcessError` that directly follows a result with `is_error: true` (the CLI exits non-zero on purpose, e.g. structured-output errors) is now raised with the structured error text the CLI reported (`Claude Code returned an error result: …`, preserving `exit_code`/`stderr`) instead of ending the stream silently — matching the Python SDK.
+- `CLAUDE_CODE_STREAM_CLOSE_TIMEOUT` is now a no-op (the internal `Query::STREAM_CLOSE_TIMEOUT_*` constants were removed with the stdin-close timeout).
+### Removed
+- **Breaking**: `ClaudeAgentOptions#append_allowed_tools`. It emitted `--append-allowed-tools`, which no Claude Code CLI version accepts (`error: unknown option`), so any use failed at connect. The option never existed in the Python SDK (mis-port in v0.4.0). Use `allowed_tools` instead — the CLI's `--allowedTools` already appends to settings-derived permission rules. Passing `append_allowed_tools:` now raises `ArgumentError` at construction.
 ## [0.17.0] - 2026-06-10
 ### Added
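
Note on the version-check fix listed under Fixed in 0.18.0: it hinges on a Ruby detail. `Array` defines `<=>` but not `<`, so comparing split version components with `<` raises `NoMethodError`, which a blanket rescue then hides. The sketch below reproduces that failure and shows two comparisons that work. `MINIMUM_VERSION`, `parse_version`, and the `outdated?` helpers are hypothetical names chosen for illustration; this is not the SDK's code.

```ruby
require "rubygems" # Gem::Version; normally already loaded

# Hypothetical minimum version, used only for this illustration.
MINIMUM_VERSION = "2.0.0"

# Split "2.1.3" into [2, 1, 3] so components compare numerically.
def parse_version(text)
  text.split(".").map(&:to_i)
end

# Shape of the old bug: Array has no #<, so this always raises NoMethodError.
def broken_outdated?(version)
  parse_version(version) < parse_version(MINIMUM_VERSION)
end

# Array#<=> compares element by element and returns -1, 0, or 1.
def outdated?(version)
  (parse_version(version) <=> parse_version(MINIMUM_VERSION)) == -1
end

# Gem::Version is Comparable, so #< works on it directly.
def outdated_gem_version?(version)
  Gem::Version.new(version) < Gem::Version.new(MINIMUM_VERSION)
end

# Record whether the broken comparison raises instead of returning a boolean.
raised = begin
  broken_outdated?("1.9.0")
  false
rescue NoMethodError
  true
end
```

Either working form compares `2.10.0` as newer than `2.0.0`, which a plain String comparison would get wrong.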

data/README.md CHANGED

@@ -50,7 +50,7 @@ Async do
   client = ClaudeAgentSDK::Client.new(options: options)
   client.connect
   client.query("Hello")
-  client.receive_messages { |msg| puts msg }
+  client.receive_response { |msg| puts msg }
   client.disconnect
 end.wait
 ```
@@ -68,7 +68,7 @@ Add this line to your application's Gemfile:
 gem 'claude-agent-sdk', github: 'ya-luotao/claude-agent-sdk-ruby'
 # Or use a stable version from RubyGems
-gem 'claude-agent-sdk', '~> 0.17.0'
+gem 'claude-agent-sdk', '~> 0.18.0'
 ```
 Then `bundle install`, or install directly: `gem install claude-agent-sdk`.
@@ -186,6 +186,8 @@ options = ClaudeAgentSDK::ClaudeAgentOptions.new(
 )
 ```
+Tool arguments are JSON-Schema-validated (draft4, via the official mcp gem) before your handler runs: the simple `{ name: :string }` idiom marks every parameter required, so a missing argument returns an in-band error to the model instead of invoking the handler with `nil`. Handler exceptions and unknown tools are also reported in-band (`isError: true`) so the model can read the text and self-correct. Opt out globally with `MCP.configure { |c| c.validate_tool_call_arguments = false }`. Schemas the draft4 metaschema rejects (e.g. numeric `exclusiveMinimum`, `$ref`) fall back to validation-disabled with a warning.
 Resources, prompts, mixed (SDK + external) servers, RubyLLM schema compatibility → see [docs/mcp-servers.md](docs/mcp-servers.md).
 ## Hooks & Permission Callbacks

data/docs/configuration.md CHANGED

@@ -122,11 +122,22 @@ options = ClaudeAgentSDK::ClaudeAgentOptions.new(tools: ['Read', 'Edit', 'Bash']
 # Preset
 options = ClaudeAgentSDK::ClaudeAgentOptions.new(tools: ClaudeAgentSDK::ToolsPreset.new(preset: 'claude_code'))
+```
+## Skills
+`skills` is the single place to enable skills for the main session — it auto-allows the `Skill` tool and defaults `setting_sources` to `['user', 'project']` (when unset) so skill files are discovered:
-# Append to allowed tools
-options = ClaudeAgentSDK::ClaudeAgentOptions.new(append_allowed_tools: ['Write', 'Bash'])
+```ruby
+# Every discovered skill ('all' is the only valid String)
+options = ClaudeAgentSDK::ClaudeAgentOptions.new(skills: 'all')
+# Specific skills only — also sent to the CLI so only these are loaded
+options = ClaudeAgentSDK::ClaudeAgentOptions.new(skills: %w[pdf docx])
 ```
+Semantics: `nil` (default) leaves CLI defaults untouched; `[]` hides every skill from the listing; an Array adds `Skill(name)` allow-rules per entry (use `plugin:skill` for plugin-qualified names). An explicitly set `setting_sources` (including `[]`) is never overridden. This is a context filter, not a sandbox — skill files remain readable on disk.
 ## Sandbox Settings
 Configure [sandbox-runtime](https://github.com/anthropic-experimental/sandbox-runtime) restrictions (network policy, filesystem access) via the CLI's `--sandbox` flag. The CLI handles OS-level process isolation using `srt`.
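
Note on the `skills` semantics documented above: they can be restated as a standalone function that runs without the gem. This sketch follows the rules in the Skills section (`nil` leaves everything untouched, `'all'` adds bare `Skill`, an Array adds `Skill(name)` per entry, an explicit `setting_sources` is never overridden). The name `effective_skills_settings` and its keyword arguments are hypothetical; the SDK's own logic is `skills_defaults` in `command_builder.rb`.

```ruby
# Returns the effective [allowed_tools, setting_sources] pair without
# mutating its inputs. Hypothetical helper, for illustration only.
def effective_skills_settings(skills:, allowed_tools: [], setting_sources: nil)
  tools = allowed_tools.dup
  sources = setting_sources&.dup
  return [tools, sources] if skills.nil? # nil: leave CLI defaults untouched

  unless skills == "all" || skills.is_a?(Array)
    raise ArgumentError, "skills must be 'all' or an Array of skill names (got #{skills.inspect})"
  end

  # 'all' allows the bare Skill tool; a list allows Skill(name) per entry.
  entries = skills == "all" ? ["Skill"] : skills.map { |name| "Skill(#{name})" }
  entries.each { |entry| tools << entry unless tools.include?(entry) }

  # Default the sources only when the caller did not set them at all.
  sources = %w[user project] if sources.nil?
  [tools, sources]
end

all_tools, all_sources = effective_skills_settings(skills: "all", allowed_tools: ["Read"])
picked_tools, = effective_skills_settings(skills: %w[pdf docx])
_, kept_sources = effective_skills_settings(skills: [], setting_sources: [])
```

Here `all_tools` is `["Read", "Skill"]`, `all_sources` is `["user", "project"]`, `picked_tools` is `["Skill(pdf)", "Skill(docx)"]`, and `kept_sources` stays `[]` because an explicit empty list counts as set.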

data/docs/observability.md CHANGED

@@ -2,9 +2,17 @@
 The SDK includes a built-in **observer interface** and an **OpenTelemetry observer** for tracing agent sessions. Traces are emitted using standard `gen_ai.*` semantic conventions, compatible with Langfuse, Jaeger, Datadog, and any OTel backend.
+## Distributed Trace Context (W3C)
+When `connect` spawns the CLI and there is an active OTel span, the SDK injects `TRACEPARENT`/`TRACESTATE` (and any other propagator carrier keys, e.g. `BAGGAGE` — which may carry user-defined key/values — uppercased) into the subprocess environment so CLI-side telemetry (`CLAUDE_CODE_ENABLE_TELEMETRY=1`) joins the caller's distributed trace. This requires the `opentelemetry` gem to be loaded with a configured propagator — there is no hard dependency, and it is a no-op otherwise. Explicit `ClaudeAgentOptions#env` keys always win; stale inherited `TRACEPARENT`/`TRACESTATE` is replaced (or unset) only when an active span supersedes it. This works independently of `OTelObserver`: the CLI parents under the caller's surrounding span, not under `claude_agent.session` (which starts at InitMessage, after spawn).
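
Note on the merge order described in the paragraph above (carrier keys uppercased, stale W3C variables scrubbed only when a span is active, explicit `env` wins): it can be sketched without the opentelemetry gem by passing the carrier in as a plain Hash. In real code the carrier would presumably be filled by the configured propagator (for example `OpenTelemetry.propagation.inject(carrier)`); here it is supplied by hand. `cli_trace_env` and `W3C_KEYS` are hypothetical names, and this is an illustration of the documented behavior, not the SDK's implementation.

```ruby
# The two W3C trace-context variables that are scrubbed when superseded.
W3C_KEYS = %w[TRACEPARENT TRACESTATE].freeze

# carrier:      Hash produced by a propagator; empty when no span is active.
# explicit_env: caller-supplied environment overrides.
# Returns an env Hash suitable for Process.spawn, where a nil value asks
# spawn to unset that variable in the child process.
def cli_trace_env(carrier, explicit_env = {})
  # No active span: leave any inherited trace context alone.
  return explicit_env.dup if carrier.empty?

  injected = carrier.to_h { |key, value| [key.upcase, value] }
  scrub = W3C_KEYS.to_h { |key| [key, nil] }

  # Later merges win: scrub < injected carrier < explicit env.
  scrub.merge(injected).merge(explicit_env)
end

traceparent = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
from_span = cli_trace_env({ "traceparent" => traceparent })
overridden = cli_trace_env({ "traceparent" => traceparent }, { "TRACEPARENT" => "custom" })
no_span = cli_trace_env({}, { "FOO" => "bar" })
```

With an active span and no `tracestate` in the carrier, `from_span` sets `TRACEPARENT` and maps `TRACESTATE` to `nil`, so a stale inherited value does not leak into the child.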
 ## How It Works
-Register observers via `ClaudeAgentOptions`. The SDK calls `on_message` for every parsed message in both `query()` and `Client`, and `on_close` when the session ends. Observer errors are silently rescued so they never crash your application.
+Register observers via `ClaudeAgentOptions`. The SDK calls `on_user_prompt` when a prompt is sent — the verbatim string for String prompts (`query()` / `Client#query`), and once per `type: 'user'` message with extractable text for Enumerator/streaming input (`query()` stream path and `Client#connect` with an initial enumerable). It calls `on_message` for every parsed message, `on_error` once per error that surfaces to your code (before `on_close` where both fire), and `on_close` when the session ends. Observer errors are silently rescued so they never crash your application.
+For multi-turn streaming input note that `OTelObserver` captures one prompt per trace (the first one buffered before each init); prompts queued up-front for later turns may not appear as those traces' `input.value`.
+In `Client` mode, call `disconnect` (ideally in an `ensure` block) so `on_close` runs and OTel spans are flushed and exported.
 ```
 claude_agent.session            (root span — one per query/session)
@@ -86,14 +94,26 @@ end
 # OpenTelemetry.tracer_provider.shutdown
 ```
+### Reuse and concurrency
+A single `OTelObserver` instance is safe to reuse for **sequential** queries — per-trace state (buffered prompt/output, open spans) is reset at each trace boundary. It holds unsynchronized span state, however, so for **concurrent** sessions (Puma, Sidekiq, threads) pass a callable factory so each query/session gets a fresh instance:
+```ruby
+options = ClaudeAgentSDK::ClaudeAgentOptions.new(
+  observers: [-> { ClaudeAgentSDK::Instrumentation::OTelObserver.new }]
+)
+```
+See [docs/rails.md](rails.md) for the Rails-specific pattern.
 ## Span Attributes
 The OTel observer sets attributes using both `gen_ai.*` (OTel GenAI) and OpenInference conventions for maximum backend compatibility:
 | Span | Type | Key Attributes |
 |------|------|----------------|
-| `claude_agent.session` | `agent` | `gen_ai.system`, `gen_ai.request.model`, `session.id`, `input.value`, `output.value`, `gen_ai.usage.cost`, `llm.cost.total` |
-| `claude_agent.generation` | `generation` | `gen_ai.response.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `output.value` |
+| `claude_agent.session` | `agent` | `gen_ai.system`, `gen_ai.request.model`, `session.id`, `input.value`, `output.value`, `gen_ai.usage.cost`, `llm.cost.total`, `gen_ai.usage.cache_creation_input_tokens`, `gen_ai.usage.cache_read_input_tokens`, `llm.token_count.prompt_details.cache_read`, `llm.token_count.prompt_details.cache_write` |
+| `claude_agent.generation` | `generation` | `gen_ai.response.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.usage.cache_creation_input_tokens`, `gen_ai.usage.cache_read_input_tokens`, `output.value` |
 | `claude_agent.tool.*` | `tool` | `tool.name`, `input.value`, `output.value` |
 Events (`api_retry`, `rate_limit`, `tool_progress`) are recorded on the root span.
@@ -102,7 +122,7 @@ The `langfuse.observation.type` attribute is set on each span (`agent`/`generati
 ## Custom Observers
-Implement the `Observer` module to build your own instrumentation:
+Implement the `Observer` module to build your own instrumentation. Overridable callbacks: `on_user_prompt(prompt)`, `on_message(message)`, `on_error(error)`, `on_close`.
 ```ruby
 class MyObserver
@@ -115,6 +135,10 @@ class MyObserver
     end
   end
+  def on_error(error)
+    puts "Session error: #{error.message}"
+  end
   def on_close
     puts "Session ended"
   end

data/docs/sessions.md CHANGED

@@ -1,6 +1,8 @@
 # Session Browsing & Mutations
-Browse, read, mutate, fork, and resume Claude Code sessions directly from Ruby — no CLI subprocess required. These APIs read and write `~/.claude/projects/` JSONL files directly, respecting the `CLAUDE_CONFIG_DIR` environment variable and auto-detecting git worktrees.
+Browse, read, mutate, fork, and resume Claude Code sessions directly from Ruby — no CLI subprocess required. These APIs read and write `~/.claude/projects/` JSONL files directly, respecting the `CLAUDE_CONFIG_DIR` environment variable (an empty value is treated as unset, falling back to `~/.claude`) and auto-detecting git worktrees.
+Not-found semantics: the read APIs return `[]`/`nil` for unknown sessions and for directories that do not exist or have no recorded sessions. An explicit `directory:` strictly scopes the search to that project and its git worktrees — there is no cross-project fallback (pass `directory: nil` to search all projects). 0-byte transcript stubs are skipped during session-file resolution.
 ## Listing Sessions
@@ -21,7 +23,7 @@ ClaudeAgentSDK.list_sessions(directory: '.', limit: 10, offset: 10)
 ClaudeAgentSDK.list_sessions(directory: '.', include_worktrees: true)
 ```
-Each `SDKSessionInfo` includes: `session_id`, `summary`, `last_modified`, `file_size`, `custom_title`, `first_prompt`, `git_branch`, `cwd`.
+Each `SDKSessionInfo` includes: `session_id`, `summary`, `last_modified`, `file_size`, `custom_title`, `first_prompt`, `git_branch`, `cwd`, `tag`, `created_at`.
 ## Reading Session Messages
@@ -36,6 +38,17 @@ ClaudeAgentSDK.get_session_messages(session_id: 'abc-123-...', offset: 10, limit
 Each `SessionMessage` includes `type` (`"user"` or `"assistant"`), `uuid`, `session_id`, and `message` (raw API hash).
+## Reading Subagent Transcripts
+Subagent transcripts live at `<projectDir>/<sessionId>/subagents/agent-<id>.jsonl` and may nest under `workflows/<runId>/`:
+```ruby
+ids = ClaudeAgentSDK.list_subagents(session_id: "uuid-here", directory: "/path/to/project")
+messages = ClaudeAgentSDK.get_subagent_messages(session_id: "uuid-here", agent_id: ids.first, limit: 50)
+```
+With `directory:` given, only that project and its git worktrees are searched (no global fallback). Store-backed counterparts: `list_subagents_from_store` / `get_subagent_messages_from_store`.
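
Note on the transcript layout described above: the path scan alone can be sketched with `Dir.glob`, whose `**` segment matches zero or more directory levels and therefore covers both `subagents/agent-<id>.jsonl` and nested `workflows/<runId>/` paths. `subagent_ids` is a hypothetical helper; it does not resolve project directories or git worktrees the way the SDK readers are described as doing, and the session and agent names below are made up for the demo.

```ruby
require "tmpdir"
require "fileutils"

# Returns the sorted agent ids found under <project_dir>/<session_id>/subagents,
# including any nested directories such as workflows/<runId>/.
def subagent_ids(project_dir, session_id)
  pattern = File.join(project_dir, session_id, "subagents", "**", "agent-*.jsonl")
  Dir.glob(pattern)
     .map { |path| File.basename(path, ".jsonl").delete_prefix("agent-") }
     .uniq
     .sort
end

# Build a throwaway layout with one top-level and one nested transcript.
ids = Dir.mktmpdir do |project_dir|
  base = File.join(project_dir, "session-1", "subagents")
  FileUtils.mkdir_p(File.join(base, "workflows", "run-7"))
  FileUtils.touch(File.join(base, "agent-a1.jsonl"))
  FileUtils.touch(File.join(base, "workflows", "run-7", "agent-b2.jsonl"))
  subagent_ids(project_dir, "session-1")
end
```

Both the top-level transcript and the one nested under `workflows/run-7/` are found, so `ids` is `["a1", "b2"]`.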
 ## Renaming a Session
 ```ruby

data/lib/claude_agent_sdk/command_builder.rb CHANGED

@@ -17,9 +17,14 @@ module ClaudeAgentSDK
     def build
       cmd = [@cli_path, "--output-format", "stream-json", "--verbose"]
+      # skills auto-wires the Skill tool into --allowedTools and defaults
+      # --setting-sources; compute both once so the two flags cannot diverge
+      # (mirrors Python _apply_skills_defaults).
+      effective_allowed_tools, effective_setting_sources = skills_defaults
       append_system_prompt(cmd)
       append_tools(cmd)
-      append_allowed_tools(cmd)
+      append_allowed_tools(cmd, effective_allowed_tools)
       append_disallowed_tools(cmd)
       append_max_turns(cmd)
       append_model(cmd)
@@ -30,13 +35,12 @@ module ClaudeAgentSDK
       append_thinking(cmd)
       append_effort(cmd)
       append_betas(cmd)
-      append_append_allowed_tools(cmd)
       append_output_format(cmd)
       append_additional_dirs(cmd)
       append_mcp_servers(cmd)
       append_boolean_flags(cmd)
       append_plugins(cmd)
-      append_setting_sources(cmd)
+      append_setting_sources(cmd, effective_setting_sources)
       append_extra_args(cmd)
       # Always use streaming mode for bidirectional control protocol.
@@ -80,8 +84,33 @@ module ClaudeAgentSDK
       end
     end
-    def append_allowed_tools(cmd)
-      cmd.push("--allowedTools", @options.allowed_tools.join(",")) unless @options.allowed_tools.empty?
+    def append_allowed_tools(cmd, allowed_tools)
+      cmd.push("--allowedTools", allowed_tools.join(",")) unless allowed_tools.empty?
+    end
+    # Mirror of Python's _apply_skills_defaults: when skills are requested,
+    # auto-allow the Skill tool ('all' -> bare Skill, list -> Skill(name) per
+    # entry, no duplicates) and default setting_sources to user+project so
+    # skill files are actually discovered. Explicit setting_sources (including
+    # []) is never overridden. Non-mutating; returns the effective pair.
+    # Justified divergence: a non-'all' String raises (NoMethodError on #each)
+    # instead of Python's quirk of iterating its characters.
+    def skills_defaults
+      allowed_tools = @options.allowed_tools.dup
+      setting_sources = @options.setting_sources&.dup
+      skills = @options.skills
+      return [allowed_tools, setting_sources] if skills.nil?
+      # Fail loudly with a clear message instead of a bare NoMethodError from
+      # deep inside build for skills: :all / 'pdf' / Hash typos (and instead
+      # of Python's quirk of iterating a String's characters).
+      valid = skills == "all" || skills.is_a?(Array)
+      raise ArgumentError, "skills must be 'all' or an Array of skill names (got #{skills.inspect})" unless valid
+      entries = skills == "all" ? ["Skill"] : skills.map { |name| "Skill(#{name})" }
+      entries.each { |entry| allowed_tools << entry unless allowed_tools.include?(entry) }
+      setting_sources = %w[user project] if setting_sources.nil?
+      [allowed_tools, setting_sources]
     end
     def append_disallowed_tools(cmd)
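
Reviewer note on the hunk above: the new `skills_defaults` helper can be read as a pure function over three inputs. The sketch below restates that rule outside the gem so the edge cases are easy to check. The method name `effective_skill_settings` and its keyword arguments are hypothetical and not part of the SDK. The guard in the diff raises `ArgumentError`, so the earlier comment line that mentions `NoMethodError on #each` appears to describe the behavior before the guard was added.

```ruby
# Standalone sketch of the skills-defaulting rule shown in the diff.
# `effective_skill_settings` is a hypothetical name, not an SDK method.
def effective_skill_settings(allowed_tools:, setting_sources:, skills:)
  tools = allowed_tools.dup          # never mutate the caller's array
  sources = setting_sources&.dup     # nil means "not set by the user"
  return [tools, sources] if skills.nil?

  # Reject :all, a bare skill name, a Hash, and other typos up front.
  unless skills == "all" || skills.is_a?(Array)
    raise ArgumentError, "skills must be 'all' or an Array of skill names (got #{skills.inspect})"
  end

  entries = skills == "all" ? ["Skill"] : skills.map { |name| "Skill(#{name})" }
  entries.each { |entry| tools << entry unless tools.include?(entry) }

  # Only a nil setting_sources is defaulted; an explicit [] is kept as-is.
  sources = %w[user project] if sources.nil?
  [tools, sources]
end
```

Passing `skills: "all"` with no explicit sources adds a bare `Skill` entry and defaults the sources to `user` and `project`, while an explicit empty `setting_sources` is left untouched.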
@@ -171,24 +200,48 @@ module ClaudeAgentSDK
       cmd.push("--task-budget", total.to_s) if total
     end
-    # Thinking configuration takes precedence over deprecated max_thinking_tokens
+    # Thinking configuration takes precedence over deprecated
+    # max_thinking_tokens. Accepts the ThinkingConfig* classes and the
+    # wire-shaped Hash form ({ type: 'adaptive'|'enabled'|'disabled',
+    # budget_tokens:, display: }, symbol or string keys) — the Hash form
+    # was previously dropped silently AND suppressed the
+    # max_thinking_tokens fallback.
     def append_thinking(cmd)
       if @options.thinking
-        case @options.thinking
-        when ThinkingConfigAdaptive
+        type, budget, display = thinking_fields(@options.thinking)
+        case type
+        when "adaptive"
           cmd.push("--thinking", "adaptive")
-          append_thinking_display(cmd, @options.thinking.display)
-        when ThinkingConfigEnabled
-          cmd.push("--max-thinking-tokens", @options.thinking.budget_tokens.to_s)
-          append_thinking_display(cmd, @options.thinking.display)
-        when ThinkingConfigDisabled
+          append_thinking_display(cmd, display)
+        when "enabled"
+          raise ArgumentError, "thinking type 'enabled' requires budget_tokens" if budget.nil?
+          cmd.push("--max-thinking-tokens", budget.to_s)
+          append_thinking_display(cmd, display)
+        when "disabled"
           cmd.push("--thinking", "disabled")
+        else
+          raise ArgumentError, "unsupported thinking config: #{@options.thinking.inspect}"
         end
       elsif @options.max_thinking_tokens
         cmd.push("--max-thinking-tokens", @options.max_thinking_tokens.to_s)
       end
     end
+    # Explicit class dispatch — never respond_to? probes (Kernel#display
+    # exists on every object and PRINTS the receiver to $stdout).
+    def thinking_fields(thinking)
+      case thinking
+      when Hash
+        type = (thinking[:type] || thinking["type"])&.to_s
+        [type, thinking[:budget_tokens] || thinking["budget_tokens"], thinking[:display] || thinking["display"]]
+      when ThinkingConfigAdaptive then [thinking.type, nil, thinking.display]
+      when ThinkingConfigEnabled then [thinking.type, thinking.budget_tokens, thinking.display]
+      when ThinkingConfigDisabled then [thinking.type, nil, nil]
+      else [nil, nil, nil] # falls into append_thinking's else -> ArgumentError
+      end
+    end
     # `--thinking-display` toggles between `"summarized"` (visible thinking
     # text) and `"omitted"` (empty thinking, signature only). Opus 4.7 defaults
     # to `"omitted"`, so pass `display: "summarized"` to see reasoning.
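
Reviewer note on the `thinking_fields` hunk: the comment warns against `respond_to?` probes because `Kernel#display` exists on every object. The sketch below illustrates that pitfall and the symbol-or-string key normalization for the Hash form. `PlainConfig`, `capture_display`, and `thinking_fields_from_hash` are hypothetical names introduced for illustration, and the Hash helper covers only the Hash branch of the method in the diff.

```ruby
require "stringio"

# Hypothetical config object that defines no `display` attribute of its own.
PlainConfig = Struct.new(:type)

# Calls obj.display while $stdout is redirected, returning the method's
# return value and whatever was printed.
def capture_display(obj)
  buffer = StringIO.new
  original_stdout = $stdout
  $stdout = buffer
  result = obj.display # Kernel#display writes the receiver to $stdout
  [result, buffer.string]
ensure
  $stdout = original_stdout
end

# Normalizes a wire-shaped Hash with symbol or string keys into
# [type, budget_tokens, display], mirroring the Hash branch in the diff.
def thinking_fields_from_hash(thinking)
  type = (thinking[:type] || thinking["type"])&.to_s
  [
    type,
    thinking[:budget_tokens] || thinking["budget_tokens"],
    thinking[:display] || thinking["display"]
  ]
end
```

A duck-typed check such as `config.respond_to?(:display)` is true even for `PlainConfig`, and calling the method prints the object instead of returning a setting, which is why dispatching on the class is the safer choice here.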
@@ -229,12 +282,6 @@ module ClaudeAgentSDK
       end
     end
-    def append_append_allowed_tools(cmd)
-      return unless @options.append_allowed_tools && !@options.append_allowed_tools.empty?
-      cmd.push("--append-allowed-tools", @options.append_allowed_tools.join(","))
-    end
     def append_output_format(cmd)
       return unless @options.output_format
@@ -298,10 +345,10 @@ module ClaudeAgentSDK
       end
     end
-    def append_setting_sources(cmd)
-      return unless @options.setting_sources
+    def append_setting_sources(cmd, setting_sources)
+      return if setting_sources.nil?
-      cmd.push("--setting-sources", @options.setting_sources.join(","))
+      cmd.push("--setting-sources", setting_sources.join(","))
     end
     def append_extra_args(cmd)

data/lib/claude_agent_sdk/fiber_boundary.rb CHANGED

@@ -26,13 +26,33 @@ module ClaudeAgentSDK
   #
   # The thread hop severs `break`/`return`/`next` from the surrounding method,
   # so SDK loops yielding user callbacks must keep loop control outside the
-  # invoked block (see `Client#receive_response`).
+  # invoked block (see `Client#receive_response`); user-initiated `break` is
+  # bridged back to the calling fiber via `.invoke_iteration`.
+  #
+  # Deliberate carve-out: the STREAMING-INPUT enumerable is the one user-code
+  # path iterated ON the reactor (Query#stream_input), matching Python where
+  # async input generators run on the event loop. Enumerator#next is
+  # fiber-based and cannot be pulled across threads, and a whole-iteration
+  # thread bridge would break Async-native producers (Async::Queue#dequeue
+  # etc.). Thread::Queue#pop / sleep / socket IO inside the enumerator are
+  # scheduler-aware and park only the stream task; CPU-bound or
+  # scheduler-opaque work must be moved by the user (a producer Thread
+  # feeding a Thread::Queue, or FiberBoundary.invoke inside the enumerator).
   module FiberBoundary
     # Raised by .invoke when a timeout-bounded call exceeds its allotted time.
     # The worker thread is abandoned (cancellation is best-effort; the
     # in-flight call may still complete).
     class JoinTimeout < StandardError; end
+    # Sentinel returned by .invoke_iteration when the user block attempted `break`.
+    class Break
+      attr_reader :value
+      def initialize(value)
+        @value = value
+      end
+    end
     module_function
     # Run the given block on a plain thread when a Fiber scheduler is active.
@@ -51,5 +71,23 @@ module ClaudeAgentSDK
       thread.value
     end
+    # Invoke a user-supplied iteration block across the boundary. The thread
+    # hop severs `break` from the surrounding loop, surfacing as
+    # LocalJumpError(reason: :break) on the worker thread; translate it into
+    # a Break sentinel so the SDK loop can break on the calling fiber.
+    # Returns Break when the user broke, nil when the block completed.
+    # Without a scheduler the block runs in place and `break` unwinds
+    # natively, never reaching the translation.
+    def invoke_iteration(block, *args)
+      invoke do
+        block.call(*args)
+        nil
+      rescue LocalJumpError => e
+        raise unless e.reason == :break
+        Break.new(e.exit_value)
+      end
+    end
   end
 end
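
Reviewer note on `invoke_iteration`: the translation from `LocalJumpError` to a sentinel can be reproduced without the SDK or a Fiber scheduler. The sketch below always hops to a plain thread, which is an assumption made to force the cross-thread case; the gem only does so when a scheduler is active. `BreakSignal`, `call_on_thread`, `each_until_break`, and `first_even_doubled` are hypothetical names.

```ruby
# Sentinel carrying the value the user passed to `break`.
class BreakSignal
  attr_reader :value

  def initialize(value)
    @value = value
  end
end

# Runs the block on a separate thread. A `break` inside the block cannot
# unwind to its home frame from there, so Ruby raises LocalJumpError with
# reason :break; that is translated into a BreakSignal.
def call_on_thread(block, *args)
  Thread.new do
    block.call(*args)
    nil
  rescue LocalJumpError => e
    raise unless e.reason == :break

    BreakSignal.new(e.exit_value)
  end.value
end

# Loop control stays on the calling thread: the loop stops when the
# user block signalled a break and returns the break value.
def each_until_break(items, &block)
  items.each do |item|
    outcome = call_on_thread(block, item)
    return outcome.value if outcome.is_a?(BreakSignal)
  end
  nil
end

# Usage: stop at the first even number and return it doubled.
def first_even_doubled(items)
  each_until_break(items) { |n| break n * 2 if n.even? }
end
```

The design point is that the worker thread only reports that a break was requested, and the loop that owns the iteration performs the actual exit.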