ruby_llm-responses_api 0.5.4 → 0.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 688542f3b394d7ecfe8873cf6700d95da8c566230826a35d332a8e3bf1b08e7a
- data.tar.gz: 5b12fd0f0728c273821956f315856dc966b0d8826a3663c86893710c377f2880
+ metadata.gz: daf5eec383489e81c577b2f04965ca156b976ca30a708a39a6ab8dce39863ff7
+ data.tar.gz: df8e9a6f230724e40f479679229d20c1b2faeee8e186400f37367e059badea8a
  SHA512:
- metadata.gz: e8203f307e819443cff01c51ebfc61cba907345e71782bc3f9286ba43a5f2e9b86612135a2fc4754ed6e3262b14a8187e2bda5c130fe434ba6389f5b2f5aa5e0
- data.tar.gz: 05e88249b83ec50f67ff1e333f06df78cb9776f66419eefd14665548b2899ba63bfc605c78b74b6991d05d3d0ae369cd9a06b9b8d49f6106ea03eea3ed468f87
+ metadata.gz: 7b6ae940cd27e283b2fd06edcb9122fb25328f39d786f3c011073954a4e5574a63f4272b17e11816e499c3064343407181a044768ebaa7f4f9d1844ea3afae3b
+ data.tar.gz: e2f971bfe9d6d874e6b368775485dab455ce16a0291c11d1806adf279273c3b6e762fac015ab86d376d84c3d31e0e1de26475ec5230c912d7352ad0b29c0a7f1
data/CHANGELOG.md CHANGED
@@ -5,6 +5,24 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [0.6.0] - 2026-05-26
+
+ ### Added
+
+ - Fire `on_tool_call` / `on_tool_result` (and the newer `before_tool_call` / `after_tool_result`) for server-side built-in tools: web search, file search, code interpreter, image generation, shell, apply patch, MCP, computer use, local shell (issue #1 by @myxoh)
+ - Configurable WebSocket `response_timeout` (default 60s); stalled streams now raise `ConnectionError` instead of hanging forever on `queue.pop`
+
+ ### Fixed
+
+ - Stop sending the full message history alongside `previous_response_id` in chained conversations; this caused server-side chain state to grow quadratically and reach the context-window ceiling far earlier than the visible content suggested (issue #10 reported by @theclunkerjunker)
+ - Drop the rejected `OpenAI-Beta: responses.websocket=v1` header that prevented the live `wss://api.openai.com/v1/responses` endpoint from accepting connections
+ - Send `response.create` fields at the top level over WebSocket instead of nested under a `response` key, which made the live endpoint reject every request as `missing_required_parameter: model` (PR #8 by @lucas-domeij)
+ - Bind WebSocket `on(:message)` / `on(:close)` / `on(:error)` handlers via a local closure so they no longer reference ivars on the underlying client (which silently dropped every incoming frame and made `#call` hang) (PR #9 by @lucas-domeij)
+
+ ### Changed
+
+ - `ruby_llm` dependency bumped to `>= 1.13` so the existing `thinking:` / `tool_prefs:` overrides match the upstream `Provider#complete` signature
+
  ## [0.5.4] - 2026-04-06
 
  ### Changed
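Note on the "grow quadratically" entry above: the effect can be illustrated with rough arithmetic. The sketch below is a simplified model, not a measurement. It assumes every turn adds the same number of new tokens (`TOKENS_PER_TURN` is a made-up figure) and that the server appends whatever arrives in `input` to the chain it already stores; both method names are hypothetical.

```ruby
# Hypothetical model: every turn contributes the same number of new tokens.
TOKENS_PER_TURN = 100

# Pre-0.6.0 behaviour under this model: turn k resends all k turns of
# history, and the server appends that to the chain it already holds.
def chain_size_resending_history(turns)
  (1..turns).sum { |k| k * TOKENS_PER_TURN }
end

# 0.6.0 behaviour under this model: turn k sends only its own new items.
def chain_size_sending_only_new(turns)
  turns * TOKENS_PER_TURN
end

[5, 10, 20].each do |turns|
  puts format('%2d turns: resend=%6d  new-only=%5d',
              turns,
              chain_size_resending_history(turns),
              chain_size_sending_only_new(turns))
end
```

Under these assumptions the resend case grows with the square of the turn count (5,500 tokens after 10 turns versus 1,000), which is consistent with the changelog's description of hitting the context-window ceiling early.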
data/README.md CHANGED
@@ -17,7 +17,7 @@ RubyLLM.configure do |config|
17
17
  config.openai_api_key = ENV['OPENAI_API_KEY']
18
18
  end
19
19
 
20
- chat = RubyLLM.chat(model: 'gpt-4o-mini', provider: :openai_responses)
20
+ chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
21
21
  response = chat.ask("Hello!")
22
22
  puts response.content
23
23
  ```
@@ -29,7 +29,7 @@ All standard RubyLLM features work as expected (streaming, tools, vision, struct
29
29
  Conversations automatically chain via `previous_response_id`:
30
30
 
31
31
  ```ruby
32
- chat = RubyLLM.chat(model: 'gpt-4o-mini', provider: :openai_responses)
32
+ chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
33
33
  chat.ask("My name is Alice.")
34
34
  chat.ask("What's my name?") # => "Your name is Alice."
35
35
  ```
@@ -50,7 +50,7 @@ Then use normally:
50
50
 
51
51
  ```ruby
52
52
  # Day 1
53
- chat = Chat.create!(model_id: 'gpt-4o-mini', provider: :openai_responses)
53
+ chat = Chat.create!(model_id: 'gpt-5.5', provider: :openai_responses)
54
54
  chat.ask("My name is Alice.")
55
55
 
56
56
  # Day 2 (after restart)
@@ -65,12 +65,15 @@ The Responses API provides built-in tools that don't require custom implementati
65
65
  ### Web Search
66
66
 
67
67
  ```ruby
68
- chat.with_params(tools: [{ type: 'web_search_preview' }])
68
+ chat.with_params(tools: [{ type: 'web_search' }])
69
69
  chat.ask("Latest news about Ruby 3.4?")
70
70
 
71
71
  # Or with helper
72
72
  tool = RubyLLM::ResponsesAPI::BuiltInTools.web_search(search_context_size: 'high')
73
73
  chat.with_params(tools: [tool])
74
+
75
+ # Legacy preview type is still available when needed
76
+ tool = RubyLLM::ResponsesAPI::BuiltInTools.web_search_preview
74
77
  ```
75
78
 
76
79
  ### Code Interpreter
@@ -97,7 +100,7 @@ Execute commands in hosted containers or local terminal environments. Requires G
97
100
 
98
101
  ```ruby
99
102
  # Auto-provisioned container (default)
100
- chat = RubyLLM.chat(model: 'gpt-5.2', provider: :openai_responses)
103
+ chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
101
104
  chat.with_params(tools: [{ type: 'shell', environment: { type: 'container_auto' } }])
102
105
  chat.ask("List all Python files in the project")
103
106
 
@@ -131,7 +134,7 @@ tool = RubyLLM::ResponsesAPI::BuiltInTools.shell(environment_type: 'local')
131
134
  Structured diff-based file editing. Requires GPT-5 family models.
132
135
 
133
136
  ```ruby
134
- chat = RubyLLM.chat(model: 'gpt-5.2', provider: :openai_responses)
137
+ chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
135
138
  chat.with_params(tools: [{ type: 'apply_patch' }])
136
139
  chat.ask("Add error handling to the User#save method")
137
140
 
@@ -162,7 +165,7 @@ chat.with_params(tools: [tool])
162
165
 
163
166
  ```ruby
164
167
  chat.with_params(tools: [
165
- { type: 'web_search_preview' },
168
+ { type: 'web_search' },
166
169
  { type: 'code_interpreter' },
167
170
  { type: 'shell', environment: { type: 'container_auto' } }
168
171
  ])
@@ -195,12 +198,29 @@ end
195
198
 
196
199
  When the token count crosses the threshold, the server automatically compacts the conversation. The compacted state is carried forward transparently via `previous_response_id`.
197
200
 
201
+ You can also run an explicit compaction pass or count request input tokens before creating a response:
202
+
203
+ ```ruby
204
+ provider = chat.instance_variable_get(:@provider)
205
+
206
+ compacted = provider.compact_response(
207
+ model: 'gpt-5.5',
208
+ input: [{ type: 'message', role: 'user', content: 'Summarize this long session...' }]
209
+ )
210
+
211
+ tokens = provider.count_input_tokens(
212
+ model: 'gpt-5.5',
213
+ input: 'Tell me a joke.'
214
+ )
215
+ puts tokens['input_tokens']
216
+ ```
217
+
198
218
  ## Containers API
199
219
 
200
220
  Manage persistent execution environments for the shell tool and code interpreter:
201
221
 
202
222
  ```ruby
203
- chat = RubyLLM.chat(model: 'gpt-5.2', provider: :openai_responses)
223
+ chat = RubyLLM.chat(model: 'gpt-5.5', provider: :openai_responses)
204
224
  provider = chat.instance_variable_get(:@provider)
205
225
 
206
226
  # Create a container
@@ -241,9 +261,27 @@ result = provider.poll_response(response.response_id, interval: 2.0) do |status|
241
261
  end
242
262
  ```
243
263
 
264
+ ## Observing Built-in Tool Activity
265
+
266
+ Server-side built-in tools (web search, code interpreter, file search, shell, apply patch, image generation, MCP, computer use, local shell) fire through the same `on_tool_call` / `on_tool_result` callbacks as locally executed function tools:
267
+
268
+ ```ruby
269
+ chat = RubyLLM.chat(model: 'gpt-4o', provider: :openai_responses)
270
+ chat.with_params(tools: [{ type: 'web_search' }])
271
+
272
+ chat.on_tool_call { |tc| puts "#{tc.name} called (id=#{tc.id})" }
273
+ chat.on_tool_result { |r| puts " -> status=#{r[:status]}" }
274
+
275
+ chat.ask("What's the latest Ruby release?")
276
+ # web_search called (id=ws_...)
277
+ # -> status=completed
278
+ ```
279
+
280
+ The newer `before_tool_call` / `after_tool_result` API (ruby_llm 1.13+) is supported too. Each `ToolCall` carries a normalized name (`web_search`, `code_interpreter`, `file_search`, `image_generation`, `shell`, `apply_patch`, `mcp`, `computer`, `local_shell`) and best-effort arguments extracted from the response item.
281
+
244
282
  ## Parsing Built-in Tool Results
245
283
 
246
- When the API returns results from built-in tools, use the parsers to extract structured data:
284
+ When you want the structured payload rather than just the callback, use the parsers:
247
285
 
248
286
  ```ruby
249
287
  # Access raw response output (available via response.raw)
@@ -16,13 +16,19 @@ module RubyLLM
16
16
  # Web Search tool configuration
17
17
  # @param search_context_size [String, nil] 'low', 'medium', or 'high'
18
18
  # @param user_location [Hash, nil] { type: 'approximate', city: '...', country: '...' }
19
- def web_search(search_context_size: nil, user_location: nil)
20
- tool = { type: 'web_search_preview' }
19
+ # @param preview [Boolean] use the legacy preview tool type
20
+ def web_search(search_context_size: nil, user_location: nil, preview: false)
21
+ tool = { type: preview ? 'web_search_preview' : 'web_search' }
21
22
  tool[:search_context_size] = search_context_size if search_context_size
22
23
  tool[:user_location] = user_location if user_location
23
24
  tool
24
25
  end
25
26
 
27
+ # Legacy Web Search preview tool configuration.
28
+ def web_search_preview(search_context_size: nil, user_location: nil)
29
+ web_search(search_context_size: search_context_size, user_location: user_location, preview: true)
30
+ end
31
+
26
32
  # File Search tool configuration
27
33
  # @param vector_store_ids [Array<String>] IDs of vector stores to search
28
34
  # @param max_num_results [Integer, nil] Maximum results to return
@@ -187,6 +193,68 @@ module RubyLLM
187
193
  end
188
194
  end
189
195
 
196
+ # Server-executed built-in tool output item types and their argument
197
+ # extractors. The key is the output `type` from the Responses API; the
198
+ # value is a lambda that pulls the relevant arguments out of that item.
199
+ # To support a new built-in tool, add an entry here.
200
+ CALL_ARGUMENT_EXTRACTORS = {
201
+ 'web_search_call' => ->(item) { { action: item['action'], query: item.dig('action', 'query') } },
202
+ 'file_search_call' => ->(item) { { queries: item['queries'] } },
203
+ 'code_interpreter_call' => ->(item) { { code: item['code'], container_id: item['container_id'] } },
204
+ 'image_generation_call' => ->(_item) { {} },
205
+ 'shell_call' => ->(item) { { action: item['action'], container_id: item['container_id'] } },
206
+ 'local_shell_call' => ->(item) { { action: item['action'], container_id: item['container_id'] } },
207
+ 'apply_patch_call' => ->(item) { { operation: item['operation'] } },
208
+ 'mcp_call' => lambda { |item|
209
+ { name: item['name'], arguments: item['arguments'], server_label: item['server_label'] }
210
+ },
211
+ 'computer_call' => ->(item) { { action: item['action'] } }
212
+ }.freeze
213
+
214
+ # Build a list of {tool_call:, result:} events from the response output.
215
+ # Used to surface server-side built-in tool activity through the standard
216
+ # on_tool_call / on_tool_result callbacks (issue #1).
217
+ def extract_events(output)
218
+ return [] unless output.is_a?(Array)
219
+
220
+ output.select { |item| CALL_ARGUMENT_EXTRACTORS.key?(item['type']) }
221
+ .map { |item| build_event(item) }
222
+ end
223
+
224
+ private_class_method def build_event(item)
225
+ type_label = item['type'].sub(/_call\z/, '')
226
+
227
+ tool_call = RubyLLM::ToolCall.new(
228
+ id: item['id'],
229
+ name: type_label,
230
+ arguments: call_arguments(item)
231
+ )
232
+
233
+ {
234
+ tool_call: tool_call,
235
+ result: call_result(item)
236
+ }
237
+ end
238
+
239
+ private_class_method def call_arguments(item)
240
+ extractor = CALL_ARGUMENT_EXTRACTORS[item['type']]
241
+ return {} unless extractor
242
+
243
+ extractor.call(item).compact
244
+ end
245
+
246
+ private_class_method def call_result(item)
247
+ {
248
+ id: item['id'],
249
+ type: item['type'],
250
+ status: item['status'],
251
+ results: item['results'],
252
+ result: item['result'],
253
+ output: item['output'],
254
+ action: item['action']
255
+ }.compact
256
+ end
257
+
190
258
  # Parse shell call results from output
191
259
  # @param output [Array] Response output array
192
260
  # @return [Array<Hash>] Parsed shell call results
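Note on `extract_events` in the hunk above: the following is a minimal standalone sketch of the same idea, runnable without the gem. `ToolCall` here is a stand-in `Struct` rather than `RubyLLM::ToolCall`, `EXTRACTORS` is a trimmed copy of the extractor table, and the sample `output` array is hand-written for illustration, not captured from the API.

```ruby
# Stand-in for RubyLLM::ToolCall so the sketch runs without the gem.
ToolCall = Struct.new(:id, :name, :arguments, keyword_init: true)

# Trimmed copy of the extractor table: output item type => argument lambda.
EXTRACTORS = {
  'web_search_call' => ->(item) { { action: item['action'], query: item.dig('action', 'query') } },
  'code_interpreter_call' => ->(item) { { code: item['code'], container_id: item['container_id'] } }
}.freeze

# Keep only items whose type has an extractor, then build one event each.
def extract_events(output)
  return [] unless output.is_a?(Array)

  output.select { |item| EXTRACTORS.key?(item['type']) }.map do |item|
    {
      tool_call: ToolCall.new(
        id: item['id'],
        name: item['type'].sub(/_call\z/, ''), # 'web_search_call' -> 'web_search'
        arguments: EXTRACTORS[item['type']].call(item).compact
      ),
      result: { id: item['id'], type: item['type'], status: item['status'] }.compact
    }
  end
end

# Hand-written sample output; the field values are made up.
output = [
  { 'type' => 'message', 'id' => 'msg_1' },
  { 'type' => 'web_search_call', 'id' => 'ws_1', 'status' => 'completed',
    'action' => { 'type' => 'search', 'query' => 'latest Ruby release' } }
]

events = extract_events(output)
puts events.first[:tool_call].name
puts events.first[:tool_call].arguments[:query]
```

The plain `message` item is skipped because its type has no extractor, so only the web search call becomes an event.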
@@ -10,6 +10,9 @@ module RubyLLM
10
10
 
11
11
  # Models that support the Responses API
12
12
  RESPONSES_API_MODELS = %w[
13
+ gpt-5.5
14
+ gpt-5.2 gpt-5.1 gpt-5.1-codex-max gpt-5.1-codex gpt-5.1-codex-mini gpt-5.1-chat
15
+ gpt-5 gpt-5-pro gpt-5-mini gpt-5-nano
13
16
  gpt-4o gpt-4o-mini gpt-4o-2024-05-13 gpt-4o-2024-08-06 gpt-4o-2024-11-20
14
17
  gpt-4o-mini-2024-07-18
15
18
  gpt-4.1 gpt-4.1-mini gpt-4.1-nano
@@ -21,6 +24,8 @@ module RubyLLM
21
24
 
22
25
  # Models with vision capabilities
23
26
  VISION_MODELS = %w[
27
+ gpt-5.5
28
+ gpt-5.2 gpt-5.1 gpt-5.1-chat gpt-5 gpt-5-pro gpt-5-mini gpt-5-nano
24
29
  gpt-4o gpt-4o-mini gpt-4o-2024-05-13 gpt-4o-2024-08-06 gpt-4o-2024-11-20
25
30
  gpt-4o-mini-2024-07-18
26
31
  gpt-4.1 gpt-4.1-mini gpt-4.1-nano
@@ -30,22 +35,39 @@ module RubyLLM
30
35
  ].freeze
31
36
 
32
37
  # Reasoning models (o-series)
33
- REASONING_MODELS = %w[o1 o1-mini o1-preview o1-2024-12-17 o3 o3-mini o4-mini].freeze
38
+ REASONING_MODELS = %w[
39
+ gpt-5.5 gpt-5.2 gpt-5.1 gpt-5.1-codex-max gpt-5.1-codex gpt-5.1-codex-mini
40
+ gpt-5 gpt-5-pro gpt-5-mini gpt-5-nano
41
+ o1 o1-mini o1-preview o1-2024-12-17 o3 o3-mini o4-mini
42
+ ].freeze
34
43
 
35
44
  # Models that support web search
36
45
  WEB_SEARCH_MODELS = %w[
46
+ gpt-5.5 gpt-5.2 gpt-5.1 gpt-5.1-codex-max gpt-5 gpt-5-pro gpt-5-mini
37
47
  gpt-4o gpt-4o-mini gpt-4.1 gpt-4.1-mini gpt-4.1-nano
38
48
  o1 o3 o3-mini o4-mini
39
49
  ].freeze
40
50
 
41
51
  # Models that support code interpreter
42
52
  CODE_INTERPRETER_MODELS = %w[
53
+ gpt-5.5 gpt-5.2 gpt-5.1 gpt-5 gpt-5-pro gpt-5-mini
43
54
  gpt-4o gpt-4o-mini gpt-4.1 gpt-4.1-mini gpt-4.1-nano
44
55
  o1 o3 o3-mini o4-mini
45
56
  ].freeze
46
57
 
47
58
  # Context windows by model
48
59
  CONTEXT_WINDOWS = {
60
+ 'gpt-5.5' => 1_050_000,
61
+ 'gpt-5.2' => 400_000,
62
+ 'gpt-5.1' => 400_000,
63
+ 'gpt-5.1-codex-max' => 400_000,
64
+ 'gpt-5.1-codex' => 400_000,
65
+ 'gpt-5.1-codex-mini' => 400_000,
66
+ 'gpt-5.1-chat' => 128_000,
67
+ 'gpt-5' => 400_000,
68
+ 'gpt-5-pro' => 400_000,
69
+ 'gpt-5-mini' => 400_000,
70
+ 'gpt-5-nano' => 400_000,
49
71
  'gpt-4o' => 128_000,
50
72
  'gpt-4o-mini' => 128_000,
51
73
  'gpt-4o-2024-05-13' => 128_000,
@@ -67,6 +89,17 @@ module RubyLLM
67
89
 
68
90
  # Max output tokens by model
69
91
  MAX_OUTPUT_TOKENS = {
92
+ 'gpt-5.5' => 128_000,
93
+ 'gpt-5.2' => 128_000,
94
+ 'gpt-5.1' => 128_000,
95
+ 'gpt-5.1-codex-max' => 128_000,
96
+ 'gpt-5.1-codex' => 128_000,
97
+ 'gpt-5.1-codex-mini' => 128_000,
98
+ 'gpt-5.1-chat' => 16_384,
99
+ 'gpt-5' => 128_000,
100
+ 'gpt-5-pro' => 128_000,
101
+ 'gpt-5-mini' => 128_000,
102
+ 'gpt-5-nano' => 128_000,
70
103
  'gpt-4o' => 16_384,
71
104
  'gpt-4o-mini' => 16_384,
72
105
  'gpt-4o-2024-05-13' => 4_096,
@@ -178,6 +211,10 @@ module RubyLLM
178
211
 
179
212
  def model_family(model_id)
180
213
  case model_id
214
+ when /^gpt-5\.5/ then 'gpt-5.5'
215
+ when /^gpt-5\.2/ then 'gpt-5.2'
216
+ when /^gpt-5\.1/ then 'gpt-5.1'
217
+ when /^gpt-5/ then 'gpt-5'
181
218
  when /^gpt-4\.1/ then 'gpt-4.1'
182
219
  when /^gpt-4o-mini/ then 'gpt-4o-mini'
183
220
  when /^gpt-4o/ then 'gpt-4o'
@@ -20,9 +20,12 @@ module RubyLLM
20
20
 
21
21
  instructions = system_messages.map { |m| extract_text_content(m.content) }.join("\n\n")
22
22
 
23
+ last_response_id = extract_last_response_id(messages)
24
+ input_messages = unchained_messages(non_system_messages, last_response_id)
25
+
23
26
  payload = {
24
27
  model: model.id,
25
- input: format_input(non_system_messages),
28
+ input: format_input(input_messages),
26
29
  stream: stream
27
30
  }
28
31
 
@@ -30,8 +33,6 @@ module RubyLLM
30
33
  payload[:temperature] = temperature unless temperature.nil?
31
34
  apply_tools(payload, tools, tool_prefs)
32
35
  payload[:text] = build_schema_format(schema) if schema
33
-
34
- last_response_id = extract_last_response_id(messages)
35
36
  payload[:previous_response_id] = last_response_id if last_response_id
36
37
 
37
38
  payload
@@ -85,6 +86,21 @@ module RubyLLM
85
86
  .last
86
87
  end
87
88
 
89
+ # When chaining via previous_response_id, the API expects only the new
90
+ # items in `input` -- the rest already lives in the server-side response
91
+ # chain. Sending the full history every turn appends it to that chain
92
+ # and causes O(N^2) input_tokens growth. See issue #10.
93
+ def unchained_messages(messages, last_response_id)
94
+ return messages unless last_response_id
95
+
96
+ anchor = messages.rindex do |m|
97
+ m.role == :assistant && m.respond_to?(:response_id) && m.response_id == last_response_id
98
+ end
99
+ return messages unless anchor
100
+
101
+ messages[(anchor + 1)..] || []
102
+ end
103
+
88
104
  def parse_completion_response(response)
89
105
  data = response.body
90
106
  return if data.nil? || data.empty?
@@ -114,6 +130,7 @@ module RubyLLM
114
130
  cache_creation_tokens: 0,
115
131
  model_id: data['model'],
116
132
  response_id: data['id'],
133
+ built_in_tool_events: BuiltInTools.extract_events(output),
117
134
  raw: response
118
135
  )
119
136
  end
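Note on `unchained_messages` in the hunks above: a small standalone sketch of the slicing behaviour. `Message` is a stand-in `Struct` rather than `RubyLLM::Message`, and the `respond_to?` guard from the real method is dropped because every stand-in message has a `response_id` field.

```ruby
# Stand-in for RubyLLM::Message with the gem's response_id extension.
Message = Struct.new(:role, :content, :response_id, keyword_init: true)

# Return only the messages that follow the assistant message carrying
# last_response_id; fall back to the full list when there is no anchor.
def unchained_messages(messages, last_response_id)
  return messages unless last_response_id

  anchor = messages.rindex do |m|
    m.role == :assistant && m.response_id == last_response_id
  end
  return messages unless anchor

  messages[(anchor + 1)..] || []
end

history = [
  Message.new(role: :user, content: 'My name is Alice.'),
  Message.new(role: :assistant, content: 'Nice to meet you.', response_id: 'resp_1'),
  Message.new(role: :user, content: "What's my name?")
]

p unchained_messages(history, 'resp_1').map(&:content)
p unchained_messages(history, nil).size
```

With a known `response_id` only the final user message is sent; with no `response_id`, or one that matches no assistant message, the full history is sent as before.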
@@ -0,0 +1,46 @@
+ # frozen_string_literal: true
+
+ module RubyLLM
+   module Providers
+     class OpenAIResponses
+       # Extends RubyLLM::Chat to fire on_tool_call / on_tool_result for built-in
+       # server-side tools (web_search, code_interpreter, file_search, etc.)
+       # carried on the assistant message. This lets users observe built-in tool
+       # activity through the same callback API as locally executed function
+       # tools (issue #1).
+       module ChatExtension
+         def add_message(message_or_attributes)
+           message = super
+           dispatch_built_in_tool_events(message) if dispatch_built_in_tool_events?(message)
+           message
+         end
+
+         private
+
+         def dispatch_built_in_tool_events?(message)
+           message.respond_to?(:built_in_tool_events) &&
+             message.built_in_tool_events &&
+             !message.built_in_tool_events.empty?
+         end
+
+         def dispatch_built_in_tool_events(message)
+           message.built_in_tool_events.each do |event|
+             fire_callback(:before_tool_call, :tool_call, event[:tool_call])
+             fire_callback(:after_tool_result, :tool_result, event[:result])
+           end
+         end
+
+         # Mirrors RubyLLM::Chat#run_callbacks (private since 1.13): dispatches
+         # through the new @callbacks array API if present, and always falls
+         # back to the legacy @on hash so older ruby_llm versions still work.
+         def fire_callback(new_name, legacy_name, *args)
+           callbacks = instance_variable_defined?(:@callbacks) ? @callbacks : nil
+           callbacks[new_name]&.each { |cb| cb.call(*args) } if callbacks
+           @on[legacy_name]&.call(*args)
+         end
+       end
+     end
+   end
+ end
+
+ RubyLLM::Chat.prepend(RubyLLM::Providers::OpenAIResponses::ChatExtension)
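Note on the `prepend` hook above: a minimal sketch of the pattern, using a stand-in `Chat` class instead of `RubyLLM::Chat`. The `ToolEventHook` module, the hash-shaped messages, and the single `@on[:tool_call]` slot are simplifications assumed for illustration only.

```ruby
# Stand-in for RubyLLM::Chat; only the pieces the hook touches.
class Chat
  attr_reader :messages

  def initialize
    @messages = []
    @on = {}
  end

  def on_tool_call(&block)
    @on[:tool_call] = block
    self
  end

  def add_message(message)
    @messages << message
    message
  end
end

# Prepended module: runs the original add_message via super, then fires
# the callback once per built-in tool event carried on the message.
module ToolEventHook
  def add_message(message)
    added = super
    Array(added[:built_in_tool_events]).each do |event|
      @on[:tool_call]&.call(event[:tool_call])
    end
    added
  end
end

Chat.prepend(ToolEventHook)

seen = []
chat = Chat.new.on_tool_call { |tc| seen << tc }
chat.add_message({ content: 'done', built_in_tool_events: [{ tool_call: 'web_search' }] })
chat.add_message({ content: 'plain' })
p seen
```

Because the module is prepended, its `add_message` sits in front of the class's own method in the lookup chain, so `super` reaches the original implementation and the hook can run afterwards without aliasing.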
@@ -34,6 +34,18 @@ module RubyLLM
34
34
 
35
35
  payload
36
36
  end
37
+
38
+ # URL for explicit Responses API compaction.
39
+ # @return [String] The URL path
40
+ def compact_url
41
+ 'responses/compact'
42
+ end
43
+
44
+ # URL for Responses API input token counting.
45
+ # @return [String] The URL path
46
+ def input_tokens_url
47
+ 'responses/input_tokens'
48
+ end
37
49
  end
38
50
  end
39
51
  end
@@ -4,8 +4,10 @@ module RubyLLM
4
4
  module Providers
5
5
  class OpenAIResponses
6
6
  # Extends RubyLLM::Message to support response_id for stateful conversations
7
+ # and built_in_tool_events for server-side tool calls (web_search,
8
+ # code_interpreter, etc.) that should fire on_tool_call/on_tool_result.
7
9
  module MessageExtension
8
- attr_accessor :response_id
10
+ attr_accessor :response_id, :built_in_tool_events
9
11
 
10
12
  def self.included(base)
11
13
  base.class_eval do
@@ -14,6 +16,7 @@ module RubyLLM
14
16
  define_method(:initialize) do |options = {}|
15
17
  original_initialize(options)
16
18
  @response_id = options[:response_id]
19
+ @built_in_tool_events = options[:built_in_tool_events]
17
20
  end
18
21
 
19
22
  alias_method :original_to_h, :to_h
@@ -4,9 +4,27 @@ module RubyLLM
4
4
  module Providers
5
5
  class OpenAIResponses
6
6
  # Registers OpenAI Responses API models with RubyLLM
7
- # Models updated January 2026 based on OpenAI documentation
7
+ # Models updated May 2026 based on OpenAI documentation
8
8
  module ModelRegistry
9
9
  MODELS = [
10
+ # ===================
11
+ # GPT-5.5 Series (Latest flagship - May 2026)
12
+ # ===================
13
+ {
14
+ id: 'gpt-5.5',
15
+ name: 'GPT-5.5',
16
+ provider: 'openai_responses',
17
+ family: 'gpt-5.5',
18
+ context_window: 1_050_000,
19
+ max_output_tokens: 128_000,
20
+ modalities: { input: %w[text image], output: ['text'] },
21
+ capabilities: %w[
22
+ streaming function_calling structured_output vision reasoning
23
+ web_search file_search image_generation code_interpreter shell
24
+ apply_patch computer_use mcp
25
+ ]
26
+ },
27
+
10
28
  # ===================
11
29
  # GPT-5.2 Series (Latest flagship - December 2025)
12
30
  # ===================
@@ -0,0 +1,39 @@
+ # frozen_string_literal: true
+
+ module RubyLLM
+   module Providers
+     class OpenAIResponses
+       # Extends RubyLLM::StreamAccumulator to carry built_in_tool_events from
+       # chunks through to the final assembled Message. Without this the
+       # accumulator drops everything off the Chunk it does not know about.
+       module StreamAccumulatorExtension
+         def add(chunk)
+           super
+           events = chunk_built_in_events(chunk)
+           return if events.nil? || events.empty?
+
+           @built_in_tool_events ||= []
+           @built_in_tool_events.concat(events)
+         end
+
+         def to_message(response)
+           message = super
+           if @built_in_tool_events && !@built_in_tool_events.empty? && message.respond_to?(:built_in_tool_events=)
+             message.built_in_tool_events = @built_in_tool_events
+           end
+           message
+         end
+
+         private
+
+         def chunk_built_in_events(chunk)
+           return nil unless chunk.respond_to?(:built_in_tool_events)
+
+           chunk.built_in_tool_events
+         end
+       end
+     end
+   end
+ end
+
+ RubyLLM::StreamAccumulator.prepend(RubyLLM::Providers::OpenAIResponses::StreamAccumulatorExtension)
@@ -34,10 +34,14 @@ module RubyLLM
34
34
  )
35
35
 
36
36
  when 'response.completed'
37
- # Final response with usage stats
37
+ # Final response with usage stats and any server-side built-in
38
+ # tool activity (web_search_call, code_interpreter_call, etc.) that
39
+ # the model executed. StreamAccumulatorExtension forwards
40
+ # built_in_tool_events onto the assembled Message.
38
41
  response_data = data['response'] || {}
39
42
  usage = response_data['usage'] || {}
40
43
  cached_tokens = usage.dig('input_tokens_details', 'cached_tokens')
44
+ built_in_events = BuiltInTools.extract_events(response_data['output'] || [])
41
45
 
42
46
  Chunk.new(
43
47
  role: :assistant,
@@ -47,7 +51,8 @@ module RubyLLM
47
51
  cached_tokens: cached_tokens,
48
52
  cache_creation_tokens: 0,
49
53
  model_id: response_data['model'],
50
- response_id: response_data['id']
54
+ response_id: response_data['id'],
55
+ built_in_tool_events: built_in_events.empty? ? nil : built_in_events
51
56
  )
52
57
 
53
58
  when 'response.output_item.added'
@@ -17,7 +17,8 @@ module RubyLLM
17
17
 
18
18
  # Built-in tool type constants
19
19
  BUILT_IN_TOOLS = {
20
- web_search: { type: 'web_search_preview' },
20
+ web_search: { type: 'web_search' },
21
+ web_search_preview: { type: 'web_search_preview' },
21
22
  file_search: ->(vector_store_ids) { { type: 'file_search', vector_store_ids: vector_store_ids } },
22
23
  code_interpreter: { type: 'code_interpreter', container: { type: 'auto' } },
23
24
  image_generation: { type: 'image_generation' },
@@ -149,9 +150,17 @@ module RubyLLM
149
150
  end
150
151
 
151
152
  # Helper to create built-in tool configurations
152
- def web_search_tool(search_context_size: nil)
153
+ def web_search_tool(search_context_size: nil, user_location: nil, preview: false)
154
+ tool = { type: preview ? 'web_search_preview' : 'web_search' }
155
+ tool[:search_context_size] = search_context_size if search_context_size
156
+ tool[:user_location] = user_location if user_location
157
+ tool
158
+ end
159
+
160
+ def web_search_preview_tool(search_context_size: nil, user_location: nil)
153
161
  tool = { type: 'web_search_preview' }
154
162
  tool[:search_context_size] = search_context_size if search_context_size
163
+ tool[:user_location] = user_location if user_location
155
164
  tool
156
165
  end
157
166
 
@@ -24,6 +24,7 @@ module RubyLLM
24
24
  class WebSocket # rubocop:disable Metrics/ClassLength
25
25
  WEBSOCKET_PATH = '/v1/responses'
26
26
  KNOWN_PARAMS = %i[store metadata compact_threshold context_management].freeze
27
+ RESPONSE_TIMEOUT = :response_timeout
27
28
 
28
29
  attr_reader :last_response_id
29
30
 
@@ -32,13 +33,16 @@ module RubyLLM
32
33
  # @param organization_id [String, nil] OpenAI organization ID
33
34
  # @param project_id [String, nil] OpenAI project ID
34
35
  # @param client_class [#connect, nil] WebSocket client class (for testing)
36
+ # @param response_timeout [Numeric] seconds to wait for a response event
37
+ # rubocop:disable Metrics/ParameterLists
35
38
  def initialize(api_key:, api_base: 'https://api.openai.com/v1', organization_id: nil, project_id: nil,
36
- client_class: nil)
39
+ client_class: nil, response_timeout: 60)
37
40
  @api_key = api_key
38
41
  @api_base = api_base
39
42
  @organization_id = organization_id
40
43
  @project_id = project_id
41
44
  @client_class = client_class
45
+ @response_timeout = response_timeout
42
46
 
43
47
  @ws = nil
44
48
  @mutex = Mutex.new
@@ -47,6 +51,7 @@ module RubyLLM
47
51
  @last_response_id = nil
48
52
  @message_queue = nil
49
53
  end
54
+ # rubocop:enable Metrics/ParameterLists
50
55
 
51
56
  # Open the WebSocket connection. Blocks until the connection is established.
52
57
  # @param timeout [Numeric] seconds to wait for the connection (default: 10)
@@ -57,6 +62,10 @@ module RubyLLM
57
62
 
58
63
  ready = Queue.new
59
64
  error_holder = []
65
+ # websocket-client-simple invokes on() blocks with instance_exec, so any
66
+ # @ivar reference inside resolves to the underlying client, not us.
67
+ # Capture self as a local so the handlers can call back into this object.
68
+ owner = self
60
69
 
61
70
  @ws = client_class.connect(build_ws_url, headers: build_headers)
62
71
 
@@ -64,20 +73,11 @@ module RubyLLM
64
73
 
65
74
  @ws.on(:error) do |e|
66
75
  error_holder << e
67
- ready.push(:error) unless @connected
76
+ ready.push(:error) unless owner.connected?
68
77
  end
69
78
 
70
- @ws.on(:close) do
71
- @mutex.synchronize do
72
- @connected = false
73
- @message_queue&.push(nil)
74
- end
75
- end
76
-
77
- @ws.on(:message) do |msg|
78
- q = @mutex.synchronize { @message_queue }
79
- q&.push(msg.data)
80
- end
79
+ @ws.on(:close) { owner.__send__(:handle_close) }
80
+ @ws.on(:message) { |msg| owner.__send__(:handle_message, msg.data) }
81
81
 
82
82
  result = pop_with_timeout(ready, timeout)
83
83
  if result == :error || result.nil?
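Note on the `owner = self` change in the hunk above: the sketch below reproduces the pitfall with a toy `Emitter` that runs handlers through `instance_exec`, as the comment in the diff describes for websocket-client-simple. `Emitter`, `Owner`, and the `:broken` / `:fixed` event names are hypothetical and exist only for this illustration.

```ruby
# Minimal emitter that runs handlers with instance_exec, so `self` (and
# therefore every @ivar) inside a handler belongs to the emitter.
class Emitter
  def initialize
    @handlers = {}
  end

  def on(event, &block)
    @handlers[event] = block
  end

  def emit(event, *args)
    instance_exec(*args, &@handlers[event])
  end
end

class Owner
  attr_reader :received

  def initialize(emitter)
    @received = []
    owner = self # a local variable survives instance_exec; self does not

    # Here @received resolves against the Emitter, where it is nil,
    # so the safe-navigation call silently drops the message.
    emitter.on(:broken) { |msg| @received&.push(msg) }

    # The captured local still points at this Owner instance.
    emitter.on(:fixed) { |msg| owner.__send__(:handle_message, msg) }
  end

  private

  def handle_message(msg)
    @received << msg
  end
end

emitter = Emitter.new
owner = Owner.new(emitter)
emitter.emit(:broken, 'dropped')
emitter.emit(:fixed, 'delivered')
p owner.received
```

Only the message sent through the `:fixed` handler arrives. The `:broken` handler fails without raising, which matches the "silently dropped every incoming frame" symptom in the changelog.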
@@ -103,7 +103,7 @@ module RubyLLM
103
103
  queue = Queue.new
104
104
  @mutex.synchronize { @message_queue = queue }
105
105
 
106
- envelope = { type: 'response.create', response: payload.except(:stream) }
106
+ envelope = { type: 'response.create' }.merge(payload.except(:stream))
107
107
  send_json(envelope)
108
108
  accumulate_response(queue, &)
109
109
  ensure
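Note on the envelope change in the hunk above: a short sketch comparing the two shapes. The `payload` hash is a hypothetical example of what the provider might build, not a captured request.

```ruby
require 'json'

# Hypothetical payload as it might be built for an HTTP request.
payload = { model: 'gpt-5.5', input: 'Hello!', stream: true }

# 0.5.4 shape: fields nested under :response, so the top level has no :model.
nested = { type: 'response.create', response: payload.except(:stream) }

# 0.6.0 shape: fields merged into the top level next to :type.
flat = { type: 'response.create' }.merge(payload.except(:stream))

puts JSON.generate(nested)
puts JSON.generate(flat)
```

One consequence of using `merge` is that a `:type` key inside the payload would overwrite `'response.create'`. The sketch assumes, as the diff appears to, that payloads never carry such a key.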
@@ -144,10 +144,7 @@ module RubyLLM
144
144
  queue = Queue.new
145
145
  @mutex.synchronize { @message_queue = queue }
146
146
 
147
- payload = {
148
- type: 'response.create',
149
- response: { model: model, generate: false }.merge(extra)
150
- }
147
+ payload = { type: 'response.create', model: model, generate: false }.merge(extra)
151
148
 
152
149
  send_json(payload)
153
150
 
@@ -196,6 +193,18 @@ module RubyLLM
196
193
 
197
194
  private
198
195
 
196
+ def handle_close
197
+ @mutex.synchronize do
198
+ @connected = false
199
+ @message_queue&.push(nil)
200
+ end
201
+ end
202
+
203
+ def handle_message(data)
204
+ q = @mutex.synchronize { @message_queue }
205
+ q&.push(data)
206
+ end
207
+
199
208
  def resolve_client_class
200
209
  require 'websocket-client-simple'
201
210
  ::WebSocket::Client::Simple
@@ -213,8 +222,7 @@ module RubyLLM
213
222
 
214
223
  def build_headers
215
224
  headers = {
216
- 'Authorization' => "Bearer #{@api_key}",
217
- 'OpenAI-Beta' => 'responses.websocket=v1'
225
+ 'Authorization' => "Bearer #{@api_key}"
218
226
  }
219
227
  headers['OpenAI-Organization'] = @organization_id if @organization_id
220
228
  headers['OpenAI-Project'] = @project_id if @project_id
@@ -241,9 +249,14 @@ module RubyLLM
241
249
 
242
250
  def accumulate_response(queue, &block)
243
251
  accumulator = StreamAccumulator.new
252
+ built_in_events = []
244
253
 
245
254
  loop do
246
- raw = queue.pop
255
+ raw = pop_response_event(queue)
256
+ if raw == RESPONSE_TIMEOUT
257
+ raise ConnectionError, "Timed out waiting for WebSocket response after #{@response_timeout} seconds"
258
+ end
259
+
247
260
  break if raw.nil?
248
261
 
249
262
  data = JSON.parse(raw)
@@ -253,14 +266,18 @@ module RubyLLM
253
266
  accumulator.add(chunk)
254
267
  block&.call(chunk)
255
268
 
256
- if event_type == 'response.completed'
257
- track_response_id(data)
258
- break
259
- end
269
+ next unless event_type == 'response.completed'
270
+
271
+ track_response_id(data)
272
+ built_in_events.concat(BuiltInTools.extract_events(data.dig('response', 'output') || []))
273
+ break
260
274
  end
261
275
 
262
276
  message = accumulator.to_message(nil)
263
277
  message.response_id = @last_response_id
278
+ if message.respond_to?(:built_in_tool_events=) && built_in_events.any?
279
+ message.built_in_tool_events = built_in_events
280
+ end
264
281
  message
265
282
  end
266
283
 
@@ -290,6 +307,12 @@ module RubyLLM
290
307
  rescue Timeout::Error
291
308
  nil
292
309
  end
310
+
311
+ def pop_response_event(queue)
312
+ Timeout.timeout(@response_timeout) { queue.pop }
313
+ rescue Timeout::Error
314
+ RESPONSE_TIMEOUT
315
+ end
293
316
  end
294
317
  end
295
318
  end
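Note on `pop_response_event` in the hunks above: a standalone sketch of the sentinel approach. The method takes the timeout as an argument instead of reading `@response_timeout`, and the queue contents are made up. It relies only on the standard library's `Timeout.timeout` and `Queue`.

```ruby
require 'timeout'

RESPONSE_TIMEOUT = :response_timeout

# Returns the next queue item, or the sentinel if nothing arrives in time.
def pop_response_event(queue, timeout_seconds)
  Timeout.timeout(timeout_seconds) { queue.pop }
rescue Timeout::Error
  RESPONSE_TIMEOUT
end

queue = Queue.new
queue.push('{"type":"response.completed"}')
queue.push(nil) # the close handler pushes nil to signal end of stream

p pop_response_event(queue, 0.1) # the queued JSON string
p pop_response_event(queue, 0.1) # nil, meaning the socket closed
p pop_response_event(queue, 0.1) # the sentinel, meaning nothing arrived
```

A distinct sentinel matters because `nil` already means "connection closed" in this code path. Returning `nil` on timeout would make a stalled stream look like a clean close instead of raising `ConnectionError`.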
@@ -16,7 +16,7 @@ module RubyLLM
16
16
  @config.openai_api_base || 'https://api.openai.com/v1'
17
17
  end
18
18
 
19
- # Override to support WebSocket transport via with_params(transport: :websocket)
19
+ # Override to support WebSocket transport via with_params(transport: :websocket).
20
20
  # rubocop:disable Metrics/ParameterLists
21
21
  def complete(messages, tools:, temperature:, model:, params: {}, headers: {},
22
22
  schema: nil, thinking: nil, tool_prefs: nil, &block)
@@ -90,6 +90,26 @@ module RubyLLM
90
90
  end
91
91
  end
92
92
 
93
+ # Run an explicit compaction pass over a response input.
94
+ # @param model [String] Model ID used for compaction
95
+ # @param input [String, Array<Hash>] Response input items to compact
96
+ # @param params [Hash] Additional Responses API parameters
97
+ # @return [Hash] Compacted response data
98
+ def compact_response(model:, input:, **params)
99
+ response = @connection.post(Compaction.compact_url, { model: model, input: input }.merge(params))
100
+ response.body
101
+ end
102
+
103
+ # Count input tokens for a Responses API request without creating a response.
104
+ # @param model [String] Model ID used for tokenization
105
+ # @param input [String, Array<Hash>] Response input
106
+ # @param params [Hash] Additional Responses API parameters
107
+ # @return [Hash] Token count data
108
+ def count_input_tokens(model:, input:, **params)
109
+ response = @connection.post(Compaction.input_tokens_url, { model: model, input: input }.merge(params))
110
+ response.body
111
+ end
112
+
93
113
  # --- Container Management ---
94
114
 
95
115
  # Create a new container
@@ -22,6 +22,8 @@ require_relative 'ruby_llm/providers/openai_responses/containers'
22
22
  require_relative 'ruby_llm/providers/openai_responses/batches'
23
23
  require_relative 'ruby_llm/providers/openai_responses/batch'
24
24
  require_relative 'ruby_llm/providers/openai_responses/message_extension'
25
+ require_relative 'ruby_llm/providers/openai_responses/stream_accumulator_extension'
26
+ require_relative 'ruby_llm/providers/openai_responses/chat_extension'
25
27
  require_relative 'ruby_llm/providers/openai_responses/model_registry'
26
28
  require_relative 'ruby_llm/providers/openai_responses/active_record_extension'
27
29
  require_relative 'ruby_llm/providers/openai_responses/web_socket'
@@ -39,7 +41,7 @@ RubyLLM::Providers::OpenAIResponses::ModelRegistry.register_all!
39
41
  module RubyLLM
40
42
  # ResponsesAPI namespace for direct access to helpers and version
41
43
  module ResponsesAPI
42
- VERSION = '0.5.4'
44
+ VERSION = '0.6.0'
43
45
 
44
46
  # Shorthand access to built-in tool helpers
45
47
  BuiltInTools = Providers::OpenAIResponses::BuiltInTools
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: ruby_llm-responses_api
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.5.4
4
+ version: 0.6.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Chris Hasinski
@@ -15,14 +15,14 @@ dependencies:
15
15
  requirements:
16
16
  - - ">="
17
17
  - !ruby/object:Gem::Version
18
- version: '1.0'
18
+ version: '1.13'
19
19
  type: :runtime
20
20
  prerelease: false
21
21
  version_requirements: !ruby/object:Gem::Requirement
22
22
  requirements:
23
23
  - - ">="
24
24
  - !ruby/object:Gem::Version
25
- version: '1.0'
25
+ version: '1.13'
26
26
  - !ruby/object:Gem::Dependency
27
27
  name: activerecord
28
28
  requirement: !ruby/object:Gem::Requirement
@@ -158,6 +158,7 @@ files:
158
158
  - lib/ruby_llm/providers/openai_responses/built_in_tools.rb
159
159
  - lib/ruby_llm/providers/openai_responses/capabilities.rb
160
160
  - lib/ruby_llm/providers/openai_responses/chat.rb
161
+ - lib/ruby_llm/providers/openai_responses/chat_extension.rb
161
162
  - lib/ruby_llm/providers/openai_responses/compaction.rb
162
163
  - lib/ruby_llm/providers/openai_responses/containers.rb
163
164
  - lib/ruby_llm/providers/openai_responses/media.rb
@@ -165,6 +166,7 @@ files:
165
166
  - lib/ruby_llm/providers/openai_responses/model_registry.rb
166
167
  - lib/ruby_llm/providers/openai_responses/models.rb
167
168
  - lib/ruby_llm/providers/openai_responses/state.rb
169
+ - lib/ruby_llm/providers/openai_responses/stream_accumulator_extension.rb
168
170
  - lib/ruby_llm/providers/openai_responses/streaming.rb
169
171
  - lib/ruby_llm/providers/openai_responses/tools.rb
170
172
  - lib/ruby_llm/providers/openai_responses/web_socket.rb