RubyGems - ruby_llm-responses_api - Versions diffs - 0.5.4 → 0.6.0 - Mend

ruby_llm-responses_api 0.5.4 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

checksums.yaml +4 -4
data/CHANGELOG.md +18 -0
data/README.md +47 -9
data/lib/ruby_llm/providers/openai_responses/built_in_tools.rb +70 -2
data/lib/ruby_llm/providers/openai_responses/capabilities.rb +38 -1
data/lib/ruby_llm/providers/openai_responses/chat.rb +20 -3
data/lib/ruby_llm/providers/openai_responses/chat_extension.rb +46 -0
data/lib/ruby_llm/providers/openai_responses/compaction.rb +12 -0
data/lib/ruby_llm/providers/openai_responses/message_extension.rb +4 -1
data/lib/ruby_llm/providers/openai_responses/model_registry.rb +19 -1
data/lib/ruby_llm/providers/openai_responses/stream_accumulator_extension.rb +39 -0
data/lib/ruby_llm/providers/openai_responses/streaming.rb +7 -2
data/lib/ruby_llm/providers/openai_responses/tools.rb +11 -2
data/lib/ruby_llm/providers/openai_responses/web_socket.rb +48 -25
data/lib/ruby_llm/providers/openai_responses.rb +21 -1
data/lib/rubyllm_responses_api.rb +3 -1
metadata +5 -3

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 688542f3b394d7ecfe8873cf6700d95da8c566230826a35d332a8e3bf1b08e7a
-  data.tar.gz: 5b12fd0f0728c273821956f315856dc966b0d8826a3663c86893710c377f2880
+  metadata.gz: daf5eec383489e81c577b2f04965ca156b976ca30a708a39a6ab8dce39863ff7
+  data.tar.gz: df8e9a6f230724e40f479679229d20c1b2faeee8e186400f37367e059badea8a
 SHA512:
-  metadata.gz: e8203f307e819443cff01c51ebfc61cba907345e71782bc3f9286ba43a5f2e9b86612135a2fc4754ed6e3262b14a8187e2bda5c130fe434ba6389f5b2f5aa5e0
-  data.tar.gz: 05e88249b83ec50f67ff1e333f06df78cb9776f66419eefd14665548b2899ba63bfc605c78b74b6991d05d3d0ae369cd9a06b9b8d49f6106ea03eea3ed468f87
+  metadata.gz: 7b6ae940cd27e283b2fd06edcb9122fb25328f39d786f3c011073954a4e5574a63f4272b17e11816e499c3064343407181a044768ebaa7f4f9d1844ea3afae3b
+  data.tar.gz: e2f971bfe9d6d874e6b368775485dab455ce16a0291c11d1806adf279273c3b6e762fac015ab86d376d84c3d31e0e1de26475ec5230c912d7352ad0b29c0a7f1
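
The `checksums.yaml` entries are SHA-256 and SHA-512 digests of the `metadata.gz` and `data.tar.gz` members of the gem archive. As a minimal sketch of how such a digest can be checked with Ruby's standard `digest` library: the helper name `digest_matches?` is hypothetical, and the input below is a literal string rather than the real archive members, which would first have to be extracted from the `.gem` file.

```ruby
require 'digest'

# Hypothetical helper: compare the SHA-256 of some bytes with an expected hex digest.
def digest_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# SHA-256 of the empty string, a well-known reference value.
EMPTY_SHA256 = 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'

puts digest_matches?('', EMPTY_SHA256)     # true
puts digest_matches?('data', EMPTY_SHA256) # false
```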

data/CHANGELOG.md CHANGED

@@ -5,6 +5,24 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.6.0] - 2026-05-26
+### Added
+- Fire `on_tool_call` / `on_tool_result` (and the newer `before_tool_call` / `after_tool_result`) for server-side built-in tools: web search, file search, code interpreter, image generation, shell, apply patch, MCP, computer use, local shell (issue #1 by @myxoh)
+- Configurable WebSocket `response_timeout` (default 60s); stalled streams now raise `ConnectionError` instead of hanging forever on `queue.pop`
+### Fixed
+- Stop sending the full message history alongside `previous_response_id` in chained conversations; this caused server-side chain state to grow quadratically and reach the context-window ceiling far earlier than the visible content suggested (issue #10 reported by @theclunkerjunker)
+- Drop the rejected `OpenAI-Beta: responses.websocket=v1` header that prevented the live `wss://api.openai.com/v1/responses` endpoint from accepting connections
+- Send `response.create` fields at the top level over WebSocket instead of nested under a `response` key, which made the live endpoint reject every request as `missing_required_parameter: model` (PR #8 by @lucas-domeij)
+- Bind WebSocket `on(:message)` / `on(:close)` / `on(:error)` handlers via a local closure so they no longer reference ivars on the underlying client (which silently dropped every incoming frame and made `#call` hang) (PR #9 by @lucas-domeij)
+### Changed
+- `ruby_llm` dependency bumped to `>= 1.13` so the existing `thinking:` / `tool_prefs:` overrides match the upstream `Provider#complete` signature
 ## [0.5.4] - 2026-04-06
 ### Changed
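
The `response_timeout` entry in the 0.6.0 changelog describes replacing an unbounded `queue.pop` with a bounded wait. The sketch below illustrates that idea in isolation, assuming a sentinel value is returned on timeout so that `nil` can keep a separate meaning; the names `TIMED_OUT` and `pop_with_deadline` are hypothetical and are not the gem's API.

```ruby
require 'timeout'

# Sentinel returned when no event arrives in time (hypothetical name).
TIMED_OUT = :timed_out

# Pop one item from the queue, or return TIMED_OUT after `seconds`.
def pop_with_deadline(queue, seconds)
  Timeout.timeout(seconds) { queue.pop }
rescue Timeout::Error
  TIMED_OUT
end

queue = Queue.new
queue << 'event-1'

first  = pop_with_deadline(queue, 0.1) # the queued event
second = pop_with_deadline(queue, 0.1) # queue is now empty, so the wait expires
puts first.inspect
puts second.inspect
```

A caller can then turn the sentinel into an exception, which is what the changelog entry says the gem does with `ConnectionError`.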

data/README.md CHANGED

@@ -17,7 +17,7 @@ RubyLLM.configure do |config|
   config.openai_api_key = ENV['OPENAI_API_KEY']
 end
-chat = RubyLLM.chat(model: 'gpt-4o-mini', provider: :openai_responses)
+chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
 response = chat.ask("Hello!")
 puts response.content
 ```
@@ -29,7 +29,7 @@ All standard RubyLLM features work as expected (streaming, tools, vision, struct
 Conversations automatically chain via `previous_response_id`:
 ```ruby
-chat = RubyLLM.chat(model: 'gpt-4o-mini', provider: :openai_responses)
+chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
 chat.ask("My name is Alice.")
 chat.ask("What's my name?")  # => "Your name is Alice."
 ```
@@ -50,7 +50,7 @@ Then use normally:
 ```ruby
 # Day 1
-chat = Chat.create!(model_id: 'gpt-4o-mini', provider: :openai_responses)
+chat = Chat.create!(model_id: 'gpt-5.5', provider: :openai_responses)
 chat.ask("My name is Alice.")
 # Day 2 (after restart)
@@ -65,12 +65,15 @@ The Responses API provides built-in tools that don't require custom implementati
 ### Web Search
 ```ruby
-chat.with_params(tools: [{ type: 'web_search_preview' }])
+chat.with_params(tools: [{ type: 'web_search' }])
 chat.ask("Latest news about Ruby 3.4?")
 # Or with helper
 tool = RubyLLM::ResponsesAPI::BuiltInTools.web_search(search_context_size: 'high')
 chat.with_params(tools: [tool])
+# Legacy preview type is still available when needed
+tool = RubyLLM::ResponsesAPI::BuiltInTools.web_search_preview
 ```
 ### Code Interpreter
@@ -97,7 +100,7 @@ Execute commands in hosted containers or local terminal environments. Requires G
 ```ruby
 # Auto-provisioned container (default)
-chat = RubyLLM.chat(model: 'gpt-5.2', provider: :openai_responses)
+chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
 chat.with_params(tools: [{ type: 'shell', environment: { type: 'container_auto' } }])
 chat.ask("List all Python files in the project")
@@ -131,7 +134,7 @@ tool = RubyLLM::ResponsesAPI::BuiltInTools.shell(environment_type: 'local')
 Structured diff-based file editing. Requires GPT-5 family models.
 ```ruby
-chat = RubyLLM.chat(model: 'gpt-5.2', provider: :openai_responses)
+chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
 chat.with_params(tools: [{ type: 'apply_patch' }])
 chat.ask("Add error handling to the User#save method")
@@ -162,7 +165,7 @@ chat.with_params(tools: [tool])
 ```ruby
 chat.with_params(tools: [
-  { type: 'web_search_preview' },
+  { type: 'web_search' },
   { type: 'code_interpreter' },
   { type: 'shell', environment: { type: 'container_auto' } }
 ])
@@ -195,12 +198,29 @@ end
 When the token count crosses the threshold, the server automatically compacts the conversation. The compacted state is carried forward transparently via `previous_response_id`.
+You can also run an explicit compaction pass or count request input tokens before creating a response:
+```ruby
+provider = chat.instance_variable_get(:@provider)
+compacted = provider.compact_response(
+  model: 'gpt-5.5',
+  input: [{ type: 'message', role: 'user', content: 'Summarize this long session...' }]
+)
+tokens = provider.count_input_tokens(
+  model: 'gpt-5.5',
+  input: 'Tell me a joke.'
+)
+puts tokens['input_tokens']
+```
 ## Containers API
 Manage persistent execution environments for the shell tool and code interpreter:
 ```ruby
-chat = RubyLLM.chat(model: 'gpt-5.2', provider: :openai_responses)
+chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
 provider = chat.instance_variable_get(:@provider)
 # Create a container
@@ -241,9 +261,27 @@ result = provider.poll_response(response.response_id, interval: 2.0) do |status|
 end
 ```
+## Observing Built-in Tool Activity
+Server-side built-in tools (web search, code interpreter, file search, shell, apply patch, image generation, MCP, computer use, local shell) fire through the same `on_tool_call` / `on_tool_result` callbacks as locally executed function tools:
+```ruby
+chat = RubyLLM.chat(model: 'gpt-4o', provider: :openai_responses)
+chat.with_params(tools: [{ type: 'web_search' }])
+chat.on_tool_call  { |tc| puts "#{tc.name} called (id=#{tc.id})" }
+chat.on_tool_result { |r|  puts "  -> status=#{r[:status]}" }
+chat.ask("What's the latest Ruby release?")
+# web_search called (id=ws_...)
+#   -> status=completed
+```
+The newer `before_tool_call` / `after_tool_result` API (ruby_llm 1.13+) is supported too. Each `ToolCall` carries a normalized name (`web_search`, `code_interpreter`, `file_search`, `image_generation`, `shell`, `apply_patch`, `mcp`, `computer`, `local_shell`) and best-effort arguments extracted from the response item.
 ## Parsing Built-in Tool Results
-When the API returns results from built-in tools, use the parsers to extract structured data:
+When you want the structured payload rather than just the callback, use the parsers:
 ```ruby
 # Access raw response output (available via response.raw)

data/lib/ruby_llm/providers/openai_responses/built_in_tools.rb CHANGED

@@ -16,13 +16,19 @@ module RubyLLM
         # Web Search tool configuration
         # @param search_context_size [String, nil] 'low', 'medium', or 'high'
         # @param user_location [Hash, nil] { type: 'approximate', city: '...', country: '...' }
-        def web_search(search_context_size: nil, user_location: nil)
-          tool = { type: 'web_search_preview' }
+        # @param preview [Boolean] use the legacy preview tool type
+        def web_search(search_context_size: nil, user_location: nil, preview: false)
+          tool = { type: preview ? 'web_search_preview' : 'web_search' }
           tool[:search_context_size] = search_context_size if search_context_size
           tool[:user_location] = user_location if user_location
           tool
         end
+        # Legacy Web Search preview tool configuration.
+        def web_search_preview(search_context_size: nil, user_location: nil)
+          web_search(search_context_size: search_context_size, user_location: user_location, preview: true)
+        end
         # File Search tool configuration
         # @param vector_store_ids [Array<String>] IDs of vector stores to search
         # @param max_num_results [Integer, nil] Maximum results to return
@@ -187,6 +193,68 @@ module RubyLLM
             end
         end
+        # Server-executed built-in tool output item types and their argument
+        # extractors. The key is the output `type` from the Responses API; the
+        # value is a lambda that pulls the relevant arguments out of that item.
+        # To support a new built-in tool, add an entry here.
+        CALL_ARGUMENT_EXTRACTORS = {
+          'web_search_call' => ->(item) { { action: item['action'], query: item.dig('action', 'query') } },
+          'file_search_call' => ->(item) { { queries: item['queries'] } },
+          'code_interpreter_call' => ->(item) { { code: item['code'], container_id: item['container_id'] } },
+          'image_generation_call' => ->(_item) { {} },
+          'shell_call' => ->(item) { { action: item['action'], container_id: item['container_id'] } },
+          'local_shell_call' => ->(item) { { action: item['action'], container_id: item['container_id'] } },
+          'apply_patch_call' => ->(item) { { operation: item['operation'] } },
+          'mcp_call' => lambda { |item|
+            { name: item['name'], arguments: item['arguments'], server_label: item['server_label'] }
+          },
+          'computer_call' => ->(item) { { action: item['action'] } }
+        }.freeze
+        # Build a list of {tool_call:, result:} events from the response output.
+        # Used to surface server-side built-in tool activity through the standard
+        # on_tool_call / on_tool_result callbacks (issue #1).
+        def extract_events(output)
+          return [] unless output.is_a?(Array)
+          output.select { |item| CALL_ARGUMENT_EXTRACTORS.key?(item['type']) }
+                .map { |item| build_event(item) }
+        end
+        private_class_method def build_event(item)
+          type_label = item['type'].sub(/_call\z/, '')
+          tool_call = RubyLLM::ToolCall.new(
+            id: item['id'],
+            name: type_label,
+            arguments: call_arguments(item)
+          )
+          {
+            tool_call: tool_call,
+            result: call_result(item)
+          }
+        end
+        private_class_method def call_arguments(item)
+          extractor = CALL_ARGUMENT_EXTRACTORS[item['type']]
+          return {} unless extractor
+          extractor.call(item).compact
+        end
+        private_class_method def call_result(item)
+          {
+            id: item['id'],
+            type: item['type'],
+            status: item['status'],
+            results: item['results'],
+            result: item['result'],
+            output: item['output'],
+            action: item['action']
+          }.compact
+        end
         # Parse shell call results from output
         # @param output [Array] Response output array
         # @return [Array<Hash>] Parsed shell call results
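
To make the event-extraction flow in this file easier to follow, here is a reduced, self-contained sketch of it. It assumes a `Struct` stand-in for `RubyLLM::ToolCall`, keeps only two of the extractors, and runs on a made-up response output array; the real item shapes returned by the API may differ.

```ruby
# Stand-in for RubyLLM::ToolCall so the sketch runs without the gem.
ToolCall = Struct.new(:id, :name, :arguments, keyword_init: true)

# Reduced extractor table: output item type => lambda pulling its arguments.
EXTRACTORS = {
  'web_search_call' => ->(item) { { action: item['action'], query: item.dig('action', 'query') } },
  'code_interpreter_call' => ->(item) { { code: item['code'], container_id: item['container_id'] } }
}.freeze

# Turn recognised output items into { tool_call:, result: } pairs.
def extract_events(output)
  return [] unless output.is_a?(Array)

  output.select { |item| EXTRACTORS.key?(item['type']) }.map do |item|
    {
      tool_call: ToolCall.new(
        id: item['id'],
        name: item['type'].sub(/_call\z/, ''), # 'web_search_call' -> 'web_search'
        arguments: EXTRACTORS[item['type']].call(item).compact
      ),
      result: { id: item['id'], type: item['type'], status: item['status'] }.compact
    }
  end
end

# Made-up response output, for illustration only.
output = [
  { 'type' => 'message', 'id' => 'msg_1' },
  { 'type' => 'web_search_call', 'id' => 'ws_1', 'status' => 'completed',
    'action' => { 'type' => 'search', 'query' => 'ruby release' } }
]

events = extract_events(output)
puts events.length                  # 1
puts events.first[:tool_call].name  # web_search
puts events.first[:result][:status] # completed
```

Items whose `type` is not in the table (the plain `message` above) are skipped, and `compact` drops any argument the item did not carry.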

data/lib/ruby_llm/providers/openai_responses/capabilities.rb CHANGED

@@ -10,6 +10,9 @@ module RubyLLM
         # Models that support the Responses API
         RESPONSES_API_MODELS = %w[
+          gpt-5.5
+          gpt-5.2 gpt-5.1 gpt-5.1-codex-max gpt-5.1-codex gpt-5.1-codex-mini gpt-5.1-chat
+          gpt-5 gpt-5-pro gpt-5-mini gpt-5-nano
           gpt-4o gpt-4o-mini gpt-4o-2024-05-13 gpt-4o-2024-08-06 gpt-4o-2024-11-20
           gpt-4o-mini-2024-07-18
           gpt-4.1 gpt-4.1-mini gpt-4.1-nano
@@ -21,6 +24,8 @@ module RubyLLM
         # Models with vision capabilities
         VISION_MODELS = %w[
+          gpt-5.5
+          gpt-5.2 gpt-5.1 gpt-5.1-chat gpt-5 gpt-5-pro gpt-5-mini gpt-5-nano
           gpt-4o gpt-4o-mini gpt-4o-2024-05-13 gpt-4o-2024-08-06 gpt-4o-2024-11-20
           gpt-4o-mini-2024-07-18
           gpt-4.1 gpt-4.1-mini gpt-4.1-nano
@@ -30,22 +35,39 @@ module RubyLLM
         ].freeze
         # Reasoning models (o-series)
-        REASONING_MODELS = %w[o1 o1-mini o1-preview o1-2024-12-17 o3 o3-mini o4-mini].freeze
+        REASONING_MODELS = %w[
+          gpt-5.5 gpt-5.2 gpt-5.1 gpt-5.1-codex-max gpt-5.1-codex gpt-5.1-codex-mini
+          gpt-5 gpt-5-pro gpt-5-mini gpt-5-nano
+          o1 o1-mini o1-preview o1-2024-12-17 o3 o3-mini o4-mini
+        ].freeze
         # Models that support web search
         WEB_SEARCH_MODELS = %w[
+          gpt-5.5 gpt-5.2 gpt-5.1 gpt-5.1-codex-max gpt-5 gpt-5-pro gpt-5-mini
           gpt-4o gpt-4o-mini gpt-4.1 gpt-4.1-mini gpt-4.1-nano
           o1 o3 o3-mini o4-mini
         ].freeze
         # Models that support code interpreter
         CODE_INTERPRETER_MODELS = %w[
+          gpt-5.5 gpt-5.2 gpt-5.1 gpt-5 gpt-5-pro gpt-5-mini
           gpt-4o gpt-4o-mini gpt-4.1 gpt-4.1-mini gpt-4.1-nano
           o1 o3 o3-mini o4-mini
         ].freeze
         # Context windows by model
         CONTEXT_WINDOWS = {
+          'gpt-5.5' => 1_050_000,
+          'gpt-5.2' => 400_000,
+          'gpt-5.1' => 400_000,
+          'gpt-5.1-codex-max' => 400_000,
+          'gpt-5.1-codex' => 400_000,
+          'gpt-5.1-codex-mini' => 400_000,
+          'gpt-5.1-chat' => 128_000,
+          'gpt-5' => 400_000,
+          'gpt-5-pro' => 400_000,
+          'gpt-5-mini' => 400_000,
+          'gpt-5-nano' => 400_000,
           'gpt-4o' => 128_000,
           'gpt-4o-mini' => 128_000,
           'gpt-4o-2024-05-13' => 128_000,
@@ -67,6 +89,17 @@ module RubyLLM
         # Max output tokens by model
         MAX_OUTPUT_TOKENS = {
+          'gpt-5.5' => 128_000,
+          'gpt-5.2' => 128_000,
+          'gpt-5.1' => 128_000,
+          'gpt-5.1-codex-max' => 128_000,
+          'gpt-5.1-codex' => 128_000,
+          'gpt-5.1-codex-mini' => 128_000,
+          'gpt-5.1-chat' => 16_384,
+          'gpt-5' => 128_000,
+          'gpt-5-pro' => 128_000,
+          'gpt-5-mini' => 128_000,
+          'gpt-5-nano' => 128_000,
           'gpt-4o' => 16_384,
           'gpt-4o-mini' => 16_384,
           'gpt-4o-2024-05-13' => 4_096,
@@ -178,6 +211,10 @@ module RubyLLM
         def model_family(model_id)
           case model_id
+          when /^gpt-5\.5/ then 'gpt-5.5'
+          when /^gpt-5\.2/ then 'gpt-5.2'
+          when /^gpt-5\.1/ then 'gpt-5.1'
+          when /^gpt-5/ then 'gpt-5'
           when /^gpt-4\.1/ then 'gpt-4.1'
           when /^gpt-4o-mini/ then 'gpt-4o-mini'
           when /^gpt-4o/ then 'gpt-4o'
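
The order of the new `model_family` branches matters: a `case` takes the first matching pattern, so the more specific `gpt-5.x` prefixes have to come before the catch-all `/^gpt-5/`. A reduced sketch, keeping only the GPT-5 branches and adding an assumed `'other'` fallback that is not in the diff:

```ruby
# Reduced copy of the case statement; the 'other' fallback is an assumption.
def model_family(model_id)
  case model_id
  when /^gpt-5\.5/ then 'gpt-5.5'
  when /^gpt-5\.2/ then 'gpt-5.2'
  when /^gpt-5\.1/ then 'gpt-5.1'
  when /^gpt-5/ then 'gpt-5' # would also match the ids above if listed first
  else 'other'
  end
end

puts model_family('gpt-5.1-codex-max') # gpt-5.1
puts model_family('gpt-5-mini')        # gpt-5
puts model_family('gpt-5.5')           # gpt-5.5
```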

data/lib/ruby_llm/providers/openai_responses/chat.rb CHANGED

@@ -20,9 +20,12 @@ module RubyLLM
           instructions = system_messages.map { |m| extract_text_content(m.content) }.join("\n\n")
+          last_response_id = extract_last_response_id(messages)
+          input_messages = unchained_messages(non_system_messages, last_response_id)
           payload = {
             model: model.id,
-            input: format_input(non_system_messages),
+            input: format_input(input_messages),
             stream: stream
           }
@@ -30,8 +33,6 @@ module RubyLLM
           payload[:temperature] = temperature unless temperature.nil?
           apply_tools(payload, tools, tool_prefs)
           payload[:text] = build_schema_format(schema) if schema
-          last_response_id = extract_last_response_id(messages)
           payload[:previous_response_id] = last_response_id if last_response_id
           payload
@@ -85,6 +86,21 @@ module RubyLLM
             .last
         end
+        # When chaining via previous_response_id, the API expects only the new
+        # items in `input` -- the rest already lives in the server-side response
+        # chain. Sending the full history every turn appends it to that chain
+        # and causes O(N^2) input_tokens growth. See issue #10.
+        def unchained_messages(messages, last_response_id)
+          return messages unless last_response_id
+          anchor = messages.rindex do |m|
+            m.role == :assistant && m.respond_to?(:response_id) && m.response_id == last_response_id
+          end
+          return messages unless anchor
+          messages[(anchor + 1)..] || []
+        end
         def parse_completion_response(response)
           data = response.body
           return if data.nil? || data.empty?
@@ -114,6 +130,7 @@ module RubyLLM
             cache_creation_tokens: 0,
             model_id: data['model'],
             response_id: data['id'],
+            built_in_tool_events: BuiltInTools.extract_events(output),
             raw: response
           )
         end
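
The slicing done by `unchained_messages` can be tried in isolation. The sketch below assumes a `Struct` stand-in for `RubyLLM::Message` (so the `respond_to?` guard from the diff is dropped) and uses a made-up three-message history.

```ruby
# Stand-in for RubyLLM::Message with the response_id extension.
Message = Struct.new(:role, :content, :response_id, keyword_init: true)

# Keep only the messages after the assistant turn that produced last_response_id.
def unchained_messages(messages, last_response_id)
  return messages unless last_response_id

  anchor = messages.rindex do |m|
    m.role == :assistant && m.response_id == last_response_id
  end
  return messages unless anchor

  messages[(anchor + 1)..] || []
end

history = [
  Message.new(role: :user, content: 'My name is Alice.'),
  Message.new(role: :assistant, content: 'Hi Alice!', response_id: 'resp_1'),
  Message.new(role: :user, content: "What's my name?")
]

delta = unchained_messages(history, 'resp_1')
puts delta.map(&:content).inspect # ["What's my name?"]

# Without a previous response there is nothing to chain to, so everything is sent.
puts unchained_messages(history, nil).length # 3
```

If each turn adds a fixed number of messages, resending the whole history over N turns submits on the order of 1 + 2 + ... + N = N(N+1)/2 messages in total, against N when only the delta is sent, which is the quadratic growth the comment refers to.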

data/lib/ruby_llm/providers/openai_responses/chat_extension.rb ADDED

@@ -0,0 +1,46 @@
+# frozen_string_literal: true
+module RubyLLM
+  module Providers
+    class OpenAIResponses
+      # Extends RubyLLM::Chat to fire on_tool_call / on_tool_result for built-in
+      # server-side tools (web_search, code_interpreter, file_search, etc.)
+      # carried on the assistant message. This lets users observe built-in tool
+      # activity through the same callback API as locally executed function
+      # tools (issue #1).
+      module ChatExtension
+        def add_message(message_or_attributes)
+          message = super
+          dispatch_built_in_tool_events(message) if dispatch_built_in_tool_events?(message)
+          message
+        end
+        private
+        def dispatch_built_in_tool_events?(message)
+          message.respond_to?(:built_in_tool_events) &&
+            message.built_in_tool_events &&
+            !message.built_in_tool_events.empty?
+        end
+        def dispatch_built_in_tool_events(message)
+          message.built_in_tool_events.each do |event|
+            fire_callback(:before_tool_call, :tool_call, event[:tool_call])
+            fire_callback(:after_tool_result, :tool_result, event[:result])
+          end
+        end
+        # Mirrors RubyLLM::Chat#run_callbacks (private since 1.13): dispatches
+        # through the new @callbacks array API if present, and always falls
+        # back to the legacy @on hash so older ruby_llm versions still work.
+        def fire_callback(new_name, legacy_name, *args)
+          callbacks = instance_variable_defined?(:@callbacks) ? @callbacks : nil
+          callbacks[new_name]&.each { |cb| cb.call(*args) } if callbacks
+          @on[legacy_name]&.call(*args)
+        end
+      end
+    end
+  end
+end
+RubyLLM::Chat.prepend(RubyLLM::Providers::OpenAIResponses::ChatExtension)
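
The extension relies on `Module#prepend`: the prepended `add_message` runs first, calls the original through `super`, and then dispatches callbacks. A toy sketch of that pattern, assuming a hypothetical `ToyChat` class and hash-shaped messages rather than the real `RubyLLM::Chat` and `RubyLLM::Message`:

```ruby
# Toy stand-in for RubyLLM::Chat; only what the sketch needs.
class ToyChat
  attr_reader :messages

  def initialize
    @messages = []
    @on = {}
  end

  def on_tool_call(&block)
    @on[:tool_call] = block
    self
  end

  def add_message(message)
    @messages << message
    message
  end
end

# Prepended module: runs the original add_message via super, then fires callbacks.
module ToolEventDispatch
  def add_message(message)
    result = super
    Array(result[:built_in_tool_events]).each do |event|
      @on[:tool_call]&.call(event[:tool_call])
    end
    result
  end
end

ToyChat.prepend(ToolEventDispatch)

seen = []
chat = ToyChat.new.on_tool_call { |tc| seen << tc }
chat.add_message({ role: :assistant, built_in_tool_events: [{ tool_call: 'web_search' }] })
chat.add_message({ role: :user, content: 'hello' })
puts seen.inspect # ["web_search"]
```

Because the module is prepended rather than included, existing callers of `add_message` pick up the new behaviour without any change on their side.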

data/lib/ruby_llm/providers/openai_responses/compaction.rb CHANGED

@@ -34,6 +34,18 @@ module RubyLLM
           payload
         end
+        # URL for explicit Responses API compaction.
+        # @return [String] The URL path
+        def compact_url
+          'responses/compact'
+        end
+        # URL for Responses API input token counting.
+        # @return [String] The URL path
+        def input_tokens_url
+          'responses/input_tokens'
+        end
       end
     end
   end

data/lib/ruby_llm/providers/openai_responses/message_extension.rb CHANGED

@@ -4,8 +4,10 @@ module RubyLLM
   module Providers
     class OpenAIResponses
       # Extends RubyLLM::Message to support response_id for stateful conversations
+      # and built_in_tool_events for server-side tool calls (web_search,
+      # code_interpreter, etc.) that should fire on_tool_call/on_tool_result.
       module MessageExtension
-        attr_accessor :response_id
+        attr_accessor :response_id, :built_in_tool_events
         def self.included(base)
           base.class_eval do
@@ -14,6 +16,7 @@ module RubyLLM
             define_method(:initialize) do |options = {}|
               original_initialize(options)
               @response_id = options[:response_id]
+              @built_in_tool_events = options[:built_in_tool_events]
             end
             alias_method :original_to_h, :to_h

data/lib/ruby_llm/providers/openai_responses/model_registry.rb CHANGED

@@ -4,9 +4,27 @@ module RubyLLM
   module Providers
     class OpenAIResponses
       # Registers OpenAI Responses API models with RubyLLM
-      # Models updated January 2026 based on OpenAI documentation
+      # Models updated May 2026 based on OpenAI documentation
       module ModelRegistry
         MODELS = [
+          # ===================
+          # GPT-5.5 Series (Latest flagship - May 2026)
+          # ===================
+          {
+            id: 'gpt-5.5',
+            name: 'GPT-5.5',
+            provider: 'openai_responses',
+            family: 'gpt-5.5',
+            context_window: 1_050_000,
+            max_output_tokens: 128_000,
+            modalities: { input: %w[text image], output: ['text'] },
+            capabilities: %w[
+              streaming function_calling structured_output vision reasoning
+              web_search file_search image_generation code_interpreter shell
+              apply_patch computer_use mcp
+            ]
+          },
           # ===================
           # GPT-5.2 Series (Latest flagship - December 2025)
           # ===================

data/lib/ruby_llm/providers/openai_responses/stream_accumulator_extension.rb ADDED

@@ -0,0 +1,39 @@
+# frozen_string_literal: true
+module RubyLLM
+  module Providers
+    class OpenAIResponses
+      # Extends RubyLLM::StreamAccumulator to carry built_in_tool_events from
+      # chunks through to the final assembled Message. Without this the
+      # accumulator drops everything off the Chunk it does not know about.
+      module StreamAccumulatorExtension
+        def add(chunk)
+          super
+          events = chunk_built_in_events(chunk)
+          return if events.nil? || events.empty?
+          @built_in_tool_events ||= []
+          @built_in_tool_events.concat(events)
+        end
+        def to_message(response)
+          message = super
+          if @built_in_tool_events && !@built_in_tool_events.empty? && message.respond_to?(:built_in_tool_events=)
+            message.built_in_tool_events = @built_in_tool_events
+          end
+          message
+        end
+        private
+        def chunk_built_in_events(chunk)
+          return nil unless chunk.respond_to?(:built_in_tool_events)
+          chunk.built_in_tool_events
+        end
+      end
+    end
+  end
+end
+RubyLLM::StreamAccumulator.prepend(RubyLLM::Providers::OpenAIResponses::StreamAccumulatorExtension)

data/lib/ruby_llm/providers/openai_responses/streaming.rb CHANGED

@@ -34,10 +34,14 @@ module RubyLLM
             )
           when 'response.completed'
-            # Final response with usage stats
+            # Final response with usage stats and any server-side built-in
+            # tool activity (web_search_call, code_interpreter_call, etc.) that
+            # the model executed. StreamAccumulatorExtension forwards
+            # built_in_tool_events onto the assembled Message.
             response_data = data['response'] || {}
             usage = response_data['usage'] || {}
             cached_tokens = usage.dig('input_tokens_details', 'cached_tokens')
+            built_in_events = BuiltInTools.extract_events(response_data['output'] || [])
             Chunk.new(
               role: :assistant,
@@ -47,7 +51,8 @@ module RubyLLM
               cached_tokens: cached_tokens,
               cache_creation_tokens: 0,
               model_id: response_data['model'],
-              response_id: response_data['id']
+              response_id: response_data['id'],
+              built_in_tool_events: built_in_events.empty? ? nil : built_in_events
             )
           when 'response.output_item.added'

data/lib/ruby_llm/providers/openai_responses/tools.rb CHANGED

@@ -17,7 +17,8 @@ module RubyLLM
         # Built-in tool type constants
         BUILT_IN_TOOLS = {
-          web_search: { type: 'web_search_preview' },
+          web_search: { type: 'web_search' },
+          web_search_preview: { type: 'web_search_preview' },
           file_search: ->(vector_store_ids) { { type: 'file_search', vector_store_ids: vector_store_ids } },
           code_interpreter: { type: 'code_interpreter', container: { type: 'auto' } },
           image_generation: { type: 'image_generation' },
@@ -149,9 +150,17 @@ module RubyLLM
         end
         # Helper to create built-in tool configurations
-        def web_search_tool(search_context_size: nil)
+        def web_search_tool(search_context_size: nil, user_location: nil, preview: false)
+          tool = { type: preview ? 'web_search_preview' : 'web_search' }
+          tool[:search_context_size] = search_context_size if search_context_size
+          tool[:user_location] = user_location if user_location
+          tool
+        end
+        def web_search_preview_tool(search_context_size: nil, user_location: nil)
           tool = { type: 'web_search_preview' }
           tool[:search_context_size] = search_context_size if search_context_size
+          tool[:user_location] = user_location if user_location
           tool
         end

data/lib/ruby_llm/providers/openai_responses/web_socket.rb CHANGED

@@ -24,6 +24,7 @@ module RubyLLM
       class WebSocket # rubocop:disable Metrics/ClassLength
         WEBSOCKET_PATH = '/v1/responses'
         KNOWN_PARAMS = %i[store metadata compact_threshold context_management].freeze
+        RESPONSE_TIMEOUT = :response_timeout
         attr_reader :last_response_id
@@ -32,13 +33,16 @@ module RubyLLM
         # @param organization_id [String, nil] OpenAI organization ID
         # @param project_id [String, nil] OpenAI project ID
         # @param client_class [#connect, nil] WebSocket client class (for testing)
+        # @param response_timeout [Numeric] seconds to wait for a response event
+        # rubocop:disable Metrics/ParameterLists
         def initialize(api_key:, api_base: 'https://api.openai.com/v1', organization_id: nil, project_id: nil,
-                       client_class: nil)
+                       client_class: nil, response_timeout: 60)
           @api_key = api_key
           @api_base = api_base
           @organization_id = organization_id
           @project_id = project_id
           @client_class = client_class
+          @response_timeout = response_timeout
           @ws = nil
           @mutex = Mutex.new
@@ -47,6 +51,7 @@ module RubyLLM
           @last_response_id = nil
           @message_queue = nil
         end
+        # rubocop:enable Metrics/ParameterLists
         # Open the WebSocket connection. Blocks until the connection is established.
         # @param timeout [Numeric] seconds to wait for the connection (default: 10)
@@ -57,6 +62,10 @@ module RubyLLM
           ready = Queue.new
           error_holder = []
+          # websocket-client-simple invokes on() blocks with instance_exec, so any
+          # @ivar reference inside resolves to the underlying client, not us.
+          # Capture self as a local so the handlers can call back into this object.
+          owner = self
           @ws = client_class.connect(build_ws_url, headers: build_headers)
@@ -64,20 +73,11 @@ module RubyLLM
           @ws.on(:error) do |e|
             error_holder << e
-            ready.push(:error) unless @connected
+            ready.push(:error) unless owner.connected?
           end
-          @ws.on(:close) do
-            @mutex.synchronize do
-              @connected = false
-              @message_queue&.push(nil)
-            end
-          end
-          @ws.on(:message) do |msg|
-            q = @mutex.synchronize { @message_queue }
-            q&.push(msg.data)
-          end
+          @ws.on(:close) { owner.__send__(:handle_close) }
+          @ws.on(:message) { |msg| owner.__send__(:handle_message, msg.data) }
           result = pop_with_timeout(ready, timeout)
           if result == :error || result.nil?
@@ -103,7 +103,7 @@ module RubyLLM
           queue = Queue.new
           @mutex.synchronize { @message_queue = queue }
-          envelope = { type: 'response.create', response: payload.except(:stream) }
+          envelope = { type: 'response.create' }.merge(payload.except(:stream))
           send_json(envelope)
           accumulate_response(queue, &)
         ensure
@@ -144,10 +144,7 @@ module RubyLLM
           queue = Queue.new
           @mutex.synchronize { @message_queue = queue }
-          payload = {
-            type: 'response.create',
-            response: { model: model, generate: false }.merge(extra)
-          }
+          payload = { type: 'response.create', model: model, generate: false }.merge(extra)
           send_json(payload)
@@ -196,6 +193,18 @@ module RubyLLM
         private
+        def handle_close
+          @mutex.synchronize do
+            @connected = false
+            @message_queue&.push(nil)
+          end
+        end
+        def handle_message(data)
+          q = @mutex.synchronize { @message_queue }
+          q&.push(data)
+        end
         def resolve_client_class
           require 'websocket-client-simple'
           ::WebSocket::Client::Simple
@@ -213,8 +222,7 @@ module RubyLLM
         def build_headers
           headers = {
-            'Authorization' => "Bearer #{@api_key}",
-            'OpenAI-Beta' => 'responses.websocket=v1'
+            'Authorization' => "Bearer #{@api_key}"
           }
           headers['OpenAI-Organization'] = @organization_id if @organization_id
           headers['OpenAI-Project'] = @project_id if @project_id
@@ -241,9 +249,14 @@ module RubyLLM
         def accumulate_response(queue, &block)
           accumulator = StreamAccumulator.new
+          built_in_events = []
           loop do
-            raw = queue.pop
+            raw = pop_response_event(queue)
+            if raw == RESPONSE_TIMEOUT
+              raise ConnectionError, "Timed out waiting for WebSocket response after #{@response_timeout} seconds"
+            end
             break if raw.nil?
             data = JSON.parse(raw)
@@ -253,14 +266,18 @@ module RubyLLM
             accumulator.add(chunk)
             block&.call(chunk)
-            if event_type == 'response.completed'
-              track_response_id(data)
-              break
-            end
+            next unless event_type == 'response.completed'
+            track_response_id(data)
+            built_in_events.concat(BuiltInTools.extract_events(data.dig('response', 'output') || []))
+            break
           end
           message = accumulator.to_message(nil)
           message.response_id = @last_response_id
+          if message.respond_to?(:built_in_tool_events=) && built_in_events.any?
+            message.built_in_tool_events = built_in_events
+          end
           message
         end
@@ -290,6 +307,12 @@ module RubyLLM
         rescue Timeout::Error
           nil
         end
+        def pop_response_event(queue)
+          Timeout.timeout(@response_timeout) { queue.pop }
+        rescue Timeout::Error
+          RESPONSE_TIMEOUT
+        end
       end
     end
   end
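The `web_socket.rb` changes let the reader loop tell three cases apart: a message arrived, the socket closed (`handle_close` pushes `nil`), or nothing arrived within `@response_timeout`. The sketch below reproduces that pattern outside the gem. The sentinel definition and the `classify` helper are assumptions for illustration; the gem's own `RESPONSE_TIMEOUT` definition is not shown in this diff.

```ruby
require 'timeout'

# Assumption: a unique frozen object serves as the timeout sentinel.
RESPONSE_TIMEOUT = Object.new.freeze

# Pop one event, or return the sentinel if nothing arrives in time.
def pop_response_event(queue, seconds)
  Timeout.timeout(seconds) { queue.pop }
rescue Timeout::Error
  RESPONSE_TIMEOUT
end

# Hypothetical helper: name the branch the reader loop would take.
def classify(raw)
  if raw.equal?(RESPONSE_TIMEOUT)
    :timed_out
  elsif raw.nil?
    :closed
  else
    :event
  end
end

queue = Queue.new
queue.push('{"type":"response.completed"}') # what handle_message would enqueue
queue.push(nil)                             # what handle_close would enqueue

# The third pop finds the queue empty and times out.
results = 3.times.map { classify(pop_response_event(queue, 0.1)) }
p results # => [:event, :closed, :timed_out]
```

Using a dedicated sentinel rather than `nil` for the timeout keeps "connection closed" and "no response yet" distinct, which is why the diff raises `ConnectionError` in one case and simply breaks out of the loop in the other.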

data/lib/ruby_llm/providers/openai_responses.rb CHANGED

@@ -16,7 +16,7 @@ module RubyLLM
         @config.openai_api_base || 'https://api.openai.com/v1'
       end
-      # Override to support WebSocket transport via with_params(transport: :websocket)
+      # Override to support WebSocket transport via with_params(transport: :websocket).
       # rubocop:disable Metrics/ParameterLists
       def complete(messages, tools:, temperature:, model:, params: {}, headers: {},
                    schema: nil, thinking: nil, tool_prefs: nil, &block)
@@ -90,6 +90,26 @@ module RubyLLM
         end
       end
+      # Run an explicit compaction pass over a response input.
+      # @param model [String] Model ID used for compaction
+      # @param input [String, Array<Hash>] Response input items to compact
+      # @param params [Hash] Additional Responses API parameters
+      # @return [Hash] Compacted response data
+      def compact_response(model:, input:, **params)
+        response = @connection.post(Compaction.compact_url, { model: model, input: input }.merge(params))
+        response.body
+      end
+      # Count input tokens for a Responses API request without creating a response.
+      # @param model [String] Model ID used for tokenization
+      # @param input [String, Array<Hash>] Response input
+      # @param params [Hash] Additional Responses API parameters
+      # @return [Hash] Token count data
+      def count_input_tokens(model:, input:, **params)
+        response = @connection.post(Compaction.input_tokens_url, { model: model, input: input }.merge(params))
+        response.body
+      end
       # --- Container Management ---
       # Create a new container
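Both new provider methods build their request body the same way: `model` and `input` first, then any extra keyword arguments merged on top. The sketch below shows that construction against a stand-in connection. `FakeConnection`, `FakeResponse`, the placeholder URL, and the `instructions` parameter are all hypothetical and used only for illustration; the real method posts through the provider's `@connection` to `Compaction.input_tokens_url`.

```ruby
# Hypothetical stand-in for the provider's HTTP connection; it records
# the request instead of sending it.
FakeResponse = Struct.new(:body)

class FakeConnection
  attr_reader :requests

  def initialize
    @requests = []
  end

  def post(url, payload)
    @requests << [url, payload]
    FakeResponse.new({ 'echo' => payload })
  end
end

# Same body construction as the diff: required keys first, extras merged last.
def count_input_tokens(connection, url, model:, input:, **params)
  connection.post(url, { model: model, input: input }.merge(params)).body
end

conn = FakeConnection.new
count_input_tokens(conn, 'input-tokens-url-placeholder',
                   model: 'example-model', input: 'Hello', instructions: 'Be brief')
p conn.requests.last
```

Since the extras are merged last, every additional keyword ends up as a top-level key of the request body alongside `model` and `input`.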

data/lib/rubyllm_responses_api.rb CHANGED

@@ -22,6 +22,8 @@ require_relative 'ruby_llm/providers/openai_responses/containers'
 require_relative 'ruby_llm/providers/openai_responses/batches'
 require_relative 'ruby_llm/providers/openai_responses/batch'
 require_relative 'ruby_llm/providers/openai_responses/message_extension'
+require_relative 'ruby_llm/providers/openai_responses/stream_accumulator_extension'
+require_relative 'ruby_llm/providers/openai_responses/chat_extension'
 require_relative 'ruby_llm/providers/openai_responses/model_registry'
 require_relative 'ruby_llm/providers/openai_responses/active_record_extension'
 require_relative 'ruby_llm/providers/openai_responses/web_socket'
@@ -39,7 +41,7 @@ RubyLLM::Providers::OpenAIResponses::ModelRegistry.register_all!
 module RubyLLM
   # ResponsesAPI namespace for direct access to helpers and version
   module ResponsesAPI
-    VERSION = '0.5.4'
+    VERSION = '0.6.0'
     # Shorthand access to built-in tool helpers
     BuiltInTools = Providers::OpenAIResponses::BuiltInTools

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: ruby_llm-responses_api
 version: !ruby/object:Gem::Version
-  version: 0.5.4
+  version: 0.6.0
 platform: ruby
 authors:
 - Chris Hasinski
@@ -15,14 +15,14 @@ dependencies:
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '1.0'
+        version: '1.13'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '1.0'
+        version: '1.13'
 - !ruby/object:Gem::Dependency
   name: activerecord
   requirement: !ruby/object:Gem::Requirement
@@ -158,6 +158,7 @@ files:
 - lib/ruby_llm/providers/openai_responses/built_in_tools.rb
 - lib/ruby_llm/providers/openai_responses/capabilities.rb
 - lib/ruby_llm/providers/openai_responses/chat.rb
+- lib/ruby_llm/providers/openai_responses/chat_extension.rb
 - lib/ruby_llm/providers/openai_responses/compaction.rb
 - lib/ruby_llm/providers/openai_responses/containers.rb
 - lib/ruby_llm/providers/openai_responses/media.rb
@@ -165,6 +166,7 @@ files:
 - lib/ruby_llm/providers/openai_responses/model_registry.rb
 - lib/ruby_llm/providers/openai_responses/models.rb
 - lib/ruby_llm/providers/openai_responses/state.rb
+- lib/ruby_llm/providers/openai_responses/stream_accumulator_extension.rb
 - lib/ruby_llm/providers/openai_responses/streaming.rb
 - lib/ruby_llm/providers/openai_responses/tools.rb
 - lib/ruby_llm/providers/openai_responses/web_socket.rb
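
The metadata hunk raises a runtime dependency floor from `>= 1.0` to `>= 1.13`. RubyGems compares version segments numerically, so `1.9` is lower than `1.13` and no longer satisfies the constraint. A small check using the standard `Gem::Requirement` API (the version strings tested are arbitrary examples):

```ruby
require 'rubygems'

requirement = Gem::Requirement.new('>= 1.13')

# Print whether each sample version satisfies the new constraint.
%w[1.0 1.9 1.13 1.13.1 2.0].each do |v|
  puts "#{v}: #{requirement.satisfied_by?(Gem::Version.new(v))}"
end
```

Only `1.13`, `1.13.1`, and `2.0` pass, so a project pinned to an older release of that dependency would need to upgrade it before moving to 0.6.0.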