ruby_llm-responses_api 0.3.1 → 0.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: ea0573fba4558602d3dfae81efc02d6164a7fe30d05ac277302de4fe2ef91ba7
-   data.tar.gz: d1f6077b07879b754c2ea7c2bb9577473beec9059d0389a541891c389f37af79
+   metadata.gz: c2d9ce65eebe6420f01878669d81f90f999b738158b17eaa558dd6c88226c2c2
+   data.tar.gz: c432ef2dfcebb290debbbc5ac5e72038081f54fc054065da1bee09465ba99ba0
  SHA512:
-   metadata.gz: e407cfce50dd85f7ab4c2e35ca6dc0406efd26fd069238b9e6def78a725e9d3a4d62736328da0e3e8d3b8dbbb671f78bf13f632d645a5c9029521d45a54251b8
-   data.tar.gz: e96c18c0c53c7a6f5482246b990d78a4d3f9e868cf1818c990e2a4d447a4ec6b9f26a5b8925024c1f331a18d45a9b640ca7393e7d01156b38bbed5946acbd260
+   metadata.gz: 39bbbb38a8b7183ff501d092eab938f0ab6572129ca3cd518057daa04b21117ea38eb8c19ab1a6755036b41f683f3124a1c0209ee91c5f547fe12590b673bbf3
+   data.tar.gz: 74346a9093b98f079b02deffc3d9f8cbe9b8bf681d33a843d4bd130d099976f78d66a61a6c2b5032d31a84190e25b7509fe4e83f39fa278be2f74f5980568544
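> Editor's note (not part of the diff): a `.gem` archive bundles `metadata.gz`, `data.tar.gz`, and `checksums.yaml.gz`, and the YAML above records SHA256/SHA512 digests of the first two. A minimal verification sketch, assuming the 0.4.1 gem has already been fetched and unpacked into the current directory:

```ruby
# Sketch: recompute the digests listed in checksums.yaml for a locally
# unpacked gem, e.g. after `gem fetch ruby_llm-responses_api -v 0.4.1`
# followed by `tar -xf ruby_llm-responses_api-0.4.1.gem`.
require 'digest'

%w[metadata.gz data.tar.gz].each do |name|
  puts "SHA256 #{name}: #{Digest::SHA256.file(name).hexdigest}"
  puts "SHA512 #{name}: #{Digest::SHA512.file(name).hexdigest}"
end
```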
data/CHANGELOG.md CHANGED
@@ -5,6 +5,30 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [0.4.1] - 2026-02-24
+
+ ### Added
+
+ - `chat.with_params(transport: :websocket)` integration with standard `chat.ask` interface
+ - `WebSocket#call` for accepting pre-built payloads from the provider
+
+ ### Fixed
+
+ - WebSocket responses now preserve token counts from `StreamAccumulator`
+
+ ## [0.4.0] - 2026-02-24
+
+ ### Added
+
+ - **WebSocket mode** for lower-latency agentic workflows with persistent `wss://` connections
+ - `RubyLLM::ResponsesAPI::WebSocket` standalone class
+ - Streamed responses via `create_response` with block
+ - Automatic `previous_response_id` chaining across turns
+ - `warmup` for server-side model weight caching (`generate: false`)
+ - Thread-safe with one-at-a-time response constraint
+ - Supports all existing helpers: `State`, `Compaction`, `Tools`
+ - Soft dependency on `websocket-client-simple` (lazy require with clear error)
+
  ## [0.3.1] - 2026-02-18

  ### Fixed
data/README.md CHANGED
@@ -259,6 +259,53 @@ image_results = RubyLLM::ResponsesAPI::BuiltInTools.parse_image_generation_resu
  citations = RubyLLM::ResponsesAPI::BuiltInTools.extract_citations(message_content)
  ```

+ ## WebSocket Mode
+
+ For agentic workflows with many tool-call round trips, WebSocket mode provides lower latency by maintaining a persistent connection instead of HTTP requests per turn.
+
+ Requires the `websocket-client-simple` gem:
+
+ ```ruby
+ gem 'websocket-client-simple'
+ ```
+
+ ### Usage
+
+ Just add `transport: :websocket` to your params -- the standard `chat.ask` API works as-is:
+
+ ```ruby
+ chat = RubyLLM.chat(model: 'gpt-4o', provider: :openai_responses)
+ chat.with_params(transport: :websocket)
+
+ chat.ask("Hello!")
+ chat.ask("What's 2+2?") # reuses the same WebSocket connection
+ ```
+
+ Streaming works the same way:
+
+ ```ruby
+ chat.ask("Tell me a story") { |chunk| print chunk.content }
+ ```
+
+ ### Direct WebSocket access
+
+ For advanced use cases (raw Responses API format, warmup, explicit connection management):
+
+ ```ruby
+ ws = RubyLLM::ResponsesAPI::WebSocket.new(api_key: ENV['OPENAI_API_KEY'])
+ ws.connect
+
+ ws.create_response(
+   model: 'gpt-4o',
+   input: [{ type: 'message', role: 'user', content: 'Hello!' }]
+ ) { |chunk| print chunk.content }
+
+ # Pre-cache model weights
+ ws.warmup(model: 'gpt-4o')
+
+ ws.disconnect
+ ```
+
  ## Why Use the Responses API?

  - **Built-in tools** - Web search, code execution, file search, shell, apply patch without custom implementation
@@ -266,6 +313,7 @@ citations = RubyLLM::ResponsesAPI::BuiltInTools.extract_citations(message_c
  - **Simpler multi-turn** - No need to send full message history on each request
  - **Server-side compaction** - Run multi-hour agent sessions without hitting context limits
  - **Containers** - Persistent execution environments with networking and file management
+ - **WebSocket mode** - Lower-latency persistent connections for agentic tool-call loops

  ## License

data/lib/ruby_llm/providers/openai_responses/web_socket.rb ADDED
@@ -0,0 +1,296 @@
+ # frozen_string_literal: true
+
+ require 'timeout'
+
+ module RubyLLM
+   module Providers
+     class OpenAIResponses
+       # WebSocket transport for the OpenAI Responses API.
+       # Provides lower-latency agentic workflows by maintaining a persistent
+       # wss:// connection instead of HTTP requests per turn.
+       #
+       # Requires the `websocket-client-simple` gem (soft dependency).
+       #
+       # Integrated usage (recommended):
+       #   chat = RubyLLM.chat(model: 'gpt-4o', provider: :openai_responses)
+       #   chat.with_params(transport: :websocket)
+       #   chat.ask("Hello!")
+       #
+       # Standalone usage (advanced):
+       #   ws = RubyLLM::ResponsesAPI::WebSocket.new(api_key: ENV['OPENAI_API_KEY'])
+       #   ws.connect
+       #   ws.create_response(model: 'gpt-4o', input: [...]) { |chunk| ... }
+       #   ws.disconnect
+       class WebSocket
+         WEBSOCKET_PATH = '/v1/responses'
+         KNOWN_PARAMS = %i[store metadata compact_threshold context_management].freeze
+
+         attr_reader :last_response_id
+
+         # @param api_key [String] OpenAI API key
+         # @param api_base [String] API base URL (https scheme; converted to wss)
+         # @param organization_id [String, nil] OpenAI organization ID
+         # @param project_id [String, nil] OpenAI project ID
+         # @param client_class [#connect, nil] WebSocket client class (for testing)
+         def initialize(api_key:, api_base: 'https://api.openai.com/v1', organization_id: nil, project_id: nil,
+                        client_class: nil)
+           @api_key = api_key
+           @api_base = api_base
+           @organization_id = organization_id
+           @project_id = project_id
+           @client_class = client_class
+
+           @ws = nil
+           @mutex = Mutex.new
+           @connected = false
+           @in_flight = false
+           @last_response_id = nil
+           @message_queue = nil
+         end
+
+         # Open the WebSocket connection. Blocks until the connection is established.
+         # @param timeout [Numeric] seconds to wait for the connection (default: 10)
+         # @raise [ConnectionError] if the connection cannot be established
+         # @return [self]
+         def connect(timeout: 10)
+           client_class = @client_class || resolve_client_class
+
+           ready = Queue.new
+           error_holder = []
+
+           @ws = client_class.connect(build_ws_url, headers: build_headers)
+
+           @ws.on(:open) { ready.push(:ok) }
+
+           @ws.on(:error) do |e|
+             error_holder << e
+             ready.push(:error) unless @connected
+           end
+
+           @ws.on(:close) do
+             @mutex.synchronize do
+               @connected = false
+               @message_queue&.push(nil)
+             end
+           end
+
+           @ws.on(:message) do |msg|
+             q = @mutex.synchronize { @message_queue }
+             q&.push(msg.data)
+           end
+
+           result = pop_with_timeout(ready, timeout)
+           if result == :error || result.nil?
+             err = error_holder.first
+             raise ConnectionError, "WebSocket connection failed: #{err&.message || 'timeout'}"
+           end
+
+           @mutex.synchronize { @connected = true }
+           self
+         end
+
+         # Send a pre-built payload over WebSocket, streaming chunks via block.
+         # This is the integration point for Provider#complete -- it accepts the
+         # same payload hash that render_payload returns.
+         #
+         # @param payload [Hash] Responses API payload (model, input, tools, etc.)
+         # @yield [RubyLLM::Chunk] each streamed chunk
+         # @return [RubyLLM::Message] the assembled final message
+         def call(payload, &block)
+           ensure_connected!
+           acquire_flight!
+
+           queue = Queue.new
+           @mutex.synchronize { @message_queue = queue }
+
+           envelope = { type: 'response.create', response: payload.except(:stream) }
+           send_json(envelope)
+           accumulate_response(queue, &block)
+         ensure
+           @mutex.synchronize { @message_queue = nil }
+           release_flight!
+         end
+
+         # Send a response.create request using raw Responses API format.
+         # Useful for standalone usage outside the RubyLLM chat interface.
+         #
+         # @param model [String] model ID
+         # @param input [Array<Hash>] input items in Responses API format
+         # @param tools [Array<Hash>, nil] tool definitions
+         # @param previous_response_id [String, nil] chain to a prior response
+         # @param instructions [String, nil] system/developer instructions
+         # @param extra [Hash] additional fields forwarded to the API
+         # @yield [RubyLLM::Chunk] each streamed chunk
+         # @return [RubyLLM::Message] the assembled final message
+         def create_response(model:, input:, tools: nil, previous_response_id: nil, instructions: nil, **extra, &block)
+           payload = build_standalone_payload(
+             model: model, input: input, tools: tools,
+             previous_response_id: previous_response_id,
+             instructions: instructions, **extra
+           )
+
+           call(payload, &block)
+         end
+
+         # Warm up the connection by sending a response.create with generate: false.
+         # Caches model weights server-side without generating output.
+         # @param model [String] model ID
+         # @param extra [Hash] additional fields
+         # @return [void]
+         def warmup(model:, **extra)
+           ensure_connected!
+           acquire_flight!
+
+           queue = Queue.new
+           @mutex.synchronize { @message_queue = queue }
+
+           payload = {
+             type: 'response.create',
+             response: { model: model, generate: false }.merge(extra)
+           }
+
+           send_json(payload)
+
+           loop do
+             data = queue.pop
+             break if data.nil?
+
+             parsed = JSON.parse(data)
+             event_type = parsed['type']
+
+             if event_type == 'error'
+               error_msg = parsed.dig('error', 'message') || 'Warmup error'
+               raise ResponseError, error_msg
+             end
+
+             break if event_type == 'response.completed'
+           end
+         ensure
+           @mutex.synchronize { @message_queue = nil }
+           release_flight!
+         end
+
+         # Disconnect the WebSocket.
+         # @return [void]
+         def disconnect
+           @ws&.close
+           @mutex.synchronize { @connected = false }
+         end
+
+         # @return [Boolean]
+         def connected?
+           @mutex.synchronize { @connected }
+         end
+
+         # Close and reopen the connection.
+         # @return [self]
+         def reconnect(timeout: 10)
+           disconnect
+           connect(timeout: timeout)
+         end
+
+         # Custom error types
+         class ConnectionError < StandardError; end
+         class ConcurrencyError < StandardError; end
+         class ResponseError < StandardError; end
+
+         private
+
+         def resolve_client_class
+           require 'websocket-client-simple'
+           ::WebSocket::Client::Simple
+         rescue LoadError
+           raise LoadError,
+                 'The websocket-client-simple gem is required for WebSocket mode. ' \
+                 "Add `gem 'websocket-client-simple'` to your Gemfile."
+         end
+
+         def build_ws_url
+           base = @api_base.sub(%r{/v1\z}, '')
+           host = base.sub(%r{\Ahttps?://}, '')
+           "wss://#{host}#{WEBSOCKET_PATH}"
+         end
+
+         def build_headers
+           headers = {
+             'Authorization' => "Bearer #{@api_key}",
+             'OpenAI-Beta' => 'responses.websocket=v1'
+           }
+           headers['OpenAI-Organization'] = @organization_id if @organization_id
+           headers['OpenAI-Project'] = @project_id if @project_id
+           headers
+         end
+
+         def build_standalone_payload(model:, input:, tools: nil, previous_response_id: nil, instructions: nil, **extra)
+           prev_id = previous_response_id || @last_response_id
+           response = { model: model, input: input }
+           response[:tools] = tools.map { |t| Tools.tool_for(t) } if tools&.any?
+           response[:previous_response_id] = prev_id if prev_id
+           response[:instructions] = instructions if instructions
+
+           State.apply_state_params(response, extra)
+           Compaction.apply_compaction(response, extra)
+
+           forwarded = extra.reject { |k, _| KNOWN_PARAMS.include?(k) }
+           response.merge(forwarded)
+         end
+
+         def send_json(payload)
+           @ws.send(JSON.generate(payload))
+         end
+
+         def accumulate_response(queue, &block)
+           accumulator = StreamAccumulator.new
+
+           loop do
+             raw = queue.pop
+             break if raw.nil?
+
+             data = JSON.parse(raw)
+             event_type = data['type']
+
+             chunk = Streaming.build_chunk(data)
+             accumulator.add(chunk)
+             block&.call(chunk)
+
+             if event_type == 'response.completed'
+               track_response_id(data)
+               break
+             end
+           end
+
+           message = accumulator.to_message(nil)
+           message.response_id = @last_response_id
+           message
+         end
+
+         def track_response_id(data)
+           resp_id = data.dig('response', 'id')
+           @mutex.synchronize { @last_response_id = resp_id } if resp_id
+         end
+
+         def ensure_connected!
+           raise ConnectionError, 'WebSocket is not connected. Call #connect first.' unless connected?
+         end
+
+         def acquire_flight!
+           @mutex.synchronize do
+             raise ConcurrencyError, 'Another response is already in flight.' if @in_flight
+
+             @in_flight = true
+           end
+         end
+
+         def release_flight!
+           @mutex.synchronize { @in_flight = false }
+         end
+
+         def pop_with_timeout(queue, seconds)
+           Timeout.timeout(seconds) { queue.pop }
+         rescue Timeout::Error
+           nil
+         end
+       end
+     end
+   end
+ end
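> Editor's note (not part of the diff): the multi-turn behaviour of the class above follows from `track_response_id` capturing the id of each `response.completed` event and `build_standalone_payload` falling back to `last_response_id` when `previous_response_id` is not given. A standalone sketch of that flow; the model name and prompts are placeholders:

```ruby
# Sketch of standalone chaining with the WebSocket class above.
# The second create_response omits previous_response_id, so the class
# reuses last_response_id captured from the first response.completed event.
ws = RubyLLM::ResponsesAPI::WebSocket.new(api_key: ENV['OPENAI_API_KEY'])
ws.connect
ws.warmup(model: 'gpt-4o') # optional: cache model weights before the first turn

ws.create_response(
  model: 'gpt-4o',
  input: [{ type: 'message', role: 'user', content: 'Pick a random city.' }]
)

ws.create_response(
  model: 'gpt-4o',
  input: [{ type: 'message', role: 'user', content: 'Describe it in one sentence.' }]
) { |chunk| print chunk.content }

ws.disconnect
```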
@@ -16,6 +16,16 @@ module RubyLLM
    @config.openai_api_base || 'https://api.openai.com/v1'
  end

+ # Override to support WebSocket transport via with_params(transport: :websocket)
+ def complete(messages, tools:, temperature:, model:, params: {}, headers: {}, schema: nil, thinking: nil, &block) # rubocop:disable Metrics/ParameterLists
+   if params[:transport]&.to_sym == :websocket
+     ws_complete(messages, tools: tools, temperature: temperature, model: model,
+                 params: params.except(:transport), schema: schema, thinking: thinking, &block)
+   else
+     super
+   end
+ end
+
  def headers
    {
      'Authorization' => "Bearer #{@config.openai_api_key}",
@@ -137,6 +147,35 @@ module RubyLLM

  private

+ def ws_complete(messages, tools:, temperature:, model:, params:, schema:, thinking:, &block)
+   normalized_temperature = maybe_normalize_temperature(temperature, model)
+
+   payload = Utils.deep_merge(
+     render_payload(
+       messages,
+       tools: tools,
+       temperature: normalized_temperature,
+       model: model,
+       stream: true,
+       schema: schema,
+       thinking: thinking
+     ),
+     params
+   )
+
+   ws_connection.connect unless ws_connection.connected?
+   ws_connection.call(payload, &block)
+ end
+
+ def ws_connection
+   @ws_connection ||= WebSocket.new(
+     api_key: @config.openai_api_key,
+     api_base: api_base,
+     organization_id: @config.openai_organization_id,
+     project_id: @config.openai_project_id
+   )
+ end
+
  # DELETE request via the underlying Faraday connection
  # RubyLLM::Connection only exposes get/post, so we use Faraday directly
  def delete_request(url)
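> Editor's note (not part of the diff): in the override above, `complete` strips `:transport` before delegating, and `ws_complete` deep-merges the remaining `params` into the rendered payload before handing it to `WebSocket#call`, so other `with_params` values still reach the Responses API over the socket. A hedged sketch, where `store: false` is just one example of a pass-through field:

```ruby
# Sketch: :transport only selects the WebSocket path and is removed from
# the payload; any other with_params values (store: false here) are
# deep-merged into the response.create payload sent over the socket.
chat = RubyLLM.chat(model: 'gpt-4o', provider: :openai_responses)
chat.with_params(transport: :websocket, store: false)
chat.ask('One-line summary, please.')
```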
@@ -22,6 +22,7 @@ require_relative 'ruby_llm/providers/openai_responses/containers'
  require_relative 'ruby_llm/providers/openai_responses/message_extension'
  require_relative 'ruby_llm/providers/openai_responses/model_registry'
  require_relative 'ruby_llm/providers/openai_responses/active_record_extension'
+ require_relative 'ruby_llm/providers/openai_responses/web_socket'

  # Include all modules in the provider class
  require_relative 'ruby_llm/providers/openai_responses'
@@ -36,7 +37,7 @@ RubyLLM::Providers::OpenAIResponses::ModelRegistry.register_all!
  module RubyLLM
    # ResponsesAPI namespace for direct access to helpers and version
    module ResponsesAPI
-     VERSION = '0.3.1'
+     VERSION = '0.4.1'

      # Shorthand access to built-in tool helpers
      BuiltInTools = Providers::OpenAIResponses::BuiltInTools
@@ -44,5 +45,6 @@ module RubyLLM
      Background = Providers::OpenAIResponses::Background
      Compaction = Providers::OpenAIResponses::Compaction
      Containers = Providers::OpenAIResponses::Containers
+     WebSocket = Providers::OpenAIResponses::WebSocket
    end
  end
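> Editor's note (not part of the diff): with the new require and the `RubyLLM::ResponsesAPI::WebSocket` shorthand wired up above, the agentic use case the changelog highlights (many tool-call round trips over one connection) looks roughly like the sketch below. The `Weather` tool is hypothetical, and the `RubyLLM::Tool` DSL (`description`, `param`, `execute`, `with_tool`) is assumed from the upstream ruby_llm gem; only `with_params(transport: :websocket)` comes from this release.

```ruby
# Illustrative end-to-end sketch: a tool-call loop where every round trip
# reuses the persistent wss:// connection instead of a new HTTP request.
require 'ruby_llm'
require 'rubyllm_responses_api'

class Weather < RubyLLM::Tool # hypothetical tool for the sketch
  description 'Returns the current temperature for a city'
  param :city, desc: 'City name'

  def execute(city:)
    { city: city, temp_c: 21 } # stubbed result
  end
end

chat = RubyLLM.chat(model: 'gpt-4o', provider: :openai_responses)
chat.with_params(transport: :websocket)
chat.with_tool(Weather)

puts chat.ask('Is it warmer in Lisbon or Oslo right now?').content
```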
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: ruby_llm-responses_api
  version: !ruby/object:Gem::Version
-   version: 0.3.1
+   version: 0.4.1
  platform: ruby
  authors:
  - Chris Hasinski
@@ -121,6 +121,20 @@ dependencies:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.0'
+ - !ruby/object:Gem::Dependency
+   name: websocket-client-simple
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.8'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.8'
  description: A RubyLLM provider that implements OpenAI's Responses API, providing
    access to built-in tools (web search, code interpreter, file search, shell, apply
    patch), stateful conversations, server-side compaction, containers API, background
@@ -151,6 +165,7 @@ files:
  - lib/ruby_llm/providers/openai_responses/state.rb
  - lib/ruby_llm/providers/openai_responses/streaming.rb
  - lib/ruby_llm/providers/openai_responses/tools.rb
+ - lib/ruby_llm/providers/openai_responses/web_socket.rb
  - lib/rubyllm_responses_api.rb
  homepage: https://github.com/khasinski/ruby_llm-responses_api
  licenses:
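> Editor's note (not part of the diff): the metadata above declares `websocket-client-simple` only as a development dependency, matching the changelog's "soft dependency" note, so an application that opts into WebSocket mode adds the gem itself. A Gemfile sketch; the version constraints mirror this diff and are shown as examples:

```ruby
# Gemfile sketch for an application opting into WebSocket mode.
source 'https://rubygems.org'

gem 'ruby_llm'
gem 'ruby_llm-responses_api', '~> 0.4.1'
gem 'websocket-client-simple', '~> 0.8' # needed only for transport: :websocket
```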