ruby_llm-responses_api 0.3.1 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: ea0573fba4558602d3dfae81efc02d6164a7fe30d05ac277302de4fe2ef91ba7
-  data.tar.gz: d1f6077b07879b754c2ea7c2bb9577473beec9059d0389a541891c389f37af79
+  metadata.gz: 7ca7cab6681d016096c3c578e5cb0c74f21af60ec03cc4fd8667263a95cd97ce
+  data.tar.gz: ed3d4931a835334aba4c351da61f4293464cfbeb3a3fb8399468b0a3665c962c
 SHA512:
-  metadata.gz: e407cfce50dd85f7ab4c2e35ca6dc0406efd26fd069238b9e6def78a725e9d3a4d62736328da0e3e8d3b8dbbb671f78bf13f632d645a5c9029521d45a54251b8
-  data.tar.gz: e96c18c0c53c7a6f5482246b990d78a4d3f9e868cf1818c990e2a4d447a4ec6b9f26a5b8925024c1f331a18d45a9b640ca7393e7d01156b38bbed5946acbd260
+  metadata.gz: 4ab75bc29fe723177cd82c988b89f298e367e363d9224998bf3cde0372eb94f153804b6ffc3f8ac75032a137c1ec7fe1d065ca7d7d8452dadabc0d27d24abfa9
+  data.tar.gz: e4b6f9837af18c683392a3436942e4aed6e03d245ab1ae0070b95c20111ec0aae01c35f44fd929808fe75c926257b2e2a0218b527f9f12744e8003d7decc6df4
data/CHANGELOG.md CHANGED
@@ -5,6 +5,19 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.4.0] - 2026-02-24
+
+### Added
+
+- **WebSocket mode** for lower-latency agentic workflows with persistent `wss://` connections
+- `RubyLLM::ResponsesAPI::WebSocket` standalone class
+- Streamed responses via `create_response` with block
+- Automatic `previous_response_id` chaining across turns
+- `warmup` for server-side model weight caching (`generate: false`)
+- Thread-safe with one-at-a-time response constraint
+- Supports all existing helpers: `State`, `Compaction`, `Tools`
+- Soft dependency on `websocket-client-simple` (lazy require with clear error)
+
 ## [0.3.1] - 2026-02-18
 
 ### Fixed
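The changelog entry "Automatic `previous_response_id` chaining across turns" amounts to remembering each completed response's id and using it as the default `previous_response_id` for the next request. A minimal plain-Ruby sketch of that bookkeeping; `ChainTracker` is a hypothetical illustration, not a class shipped by the gem:

```ruby
# Hypothetical sketch of automatic previous_response_id chaining.
class ChainTracker
  attr_reader :last_response_id

  def initialize
    @last_response_id = nil
  end

  # Build a request payload, chaining to the prior response unless the
  # caller supplies an explicit previous_response_id.
  def build_request(model:, input:, previous_response_id: nil)
    prev = previous_response_id || @last_response_id
    payload = { model: model, input: input }
    payload[:previous_response_id] = prev if prev
    payload
  end

  # Record the id from a completed response for the next turn.
  def record(response_id)
    @last_response_id = response_id
  end
end

tracker = ChainTracker.new
first = tracker.build_request(model: 'gpt-4o', input: [])
tracker.record('resp_123')
second = tracker.build_request(model: 'gpt-4o', input: [])
first.key?(:previous_response_id) # => false
second[:previous_response_id]     # => "resp_123"
```

An explicit `previous_response_id:` argument wins over the tracked id, which matches the override behavior described for `create_response`.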
data/README.md CHANGED
@@ -259,6 +259,72 @@ image_results = RubyLLM::ResponsesAPI::BuiltInTools.parse_image_generation_resu
 citations = RubyLLM::ResponsesAPI::BuiltInTools.extract_citations(message_content)
 ```
 
+## WebSocket Mode
+
+For agentic workflows with many tool-call round trips, WebSocket mode provides lower latency by maintaining a persistent connection instead of HTTP requests per turn.
+
+Requires the `websocket-client-simple` gem:
+
+```ruby
+gem 'websocket-client-simple'
+```
+
+### Basic usage
+
+```ruby
+ws = RubyLLM::ResponsesAPI::WebSocket.new(api_key: ENV['OPENAI_API_KEY'])
+ws.connect
+
+# Stream a response
+message = ws.create_response(
+  model: 'gpt-4o',
+  input: [{ type: 'message', role: 'user', content: 'Hello!' }]
+) do |chunk|
+  print chunk.content if chunk.content
+end
+
+puts "\n#{message.content}"
+```
+
+### Multi-turn conversations
+
+`previous_response_id` is tracked automatically across turns:
+
+```ruby
+ws.create_response(model: 'gpt-4o', input: [
+  { type: 'message', role: 'user', content: 'My name is Alice.' }
+])
+
+ws.create_response(model: 'gpt-4o', input: [
+  { type: 'message', role: 'user', content: "What's my name?" }
+])
+# => "Alice" (auto-chained via previous_response_id)
+```
+
+### With tools
+
+```ruby
+ws.create_response(
+  model: 'gpt-4o',
+  input: [{ type: 'message', role: 'user', content: 'Search for Ruby 3.4 release notes' }],
+  tools: [{ type: 'web_search_preview' }]
+)
+```
+
+### Warmup
+
+Pre-cache model weights without generating output:
+
+```ruby
+ws.warmup(model: 'gpt-4o')
+```
+
+### Cleanup
+
+```ruby
+ws.disconnect
+```
+
 ## Why Use the Responses API?
 
 - **Built-in tools** - Web search, code execution, file search, shell, apply patch without custom implementation
@@ -266,6 +332,7 @@ citations = RubyLLM::ResponsesAPI::BuiltInTools.extract_citations(message_c
 - **Simpler multi-turn** - No need to send full message history on each request
 - **Server-side compaction** - Run multi-hour agent sessions without hitting context limits
 - **Containers** - Persistent execution environments with networking and file management
+- **WebSocket mode** - Lower-latency persistent connections for agentic tool-call loops
 
 ## License
 
data/lib/ruby_llm/providers/openai_responses/web_socket.rb ADDED
@@ -0,0 +1,292 @@
+# frozen_string_literal: true
+
+require 'timeout'
+
+module RubyLLM
+  module Providers
+    class OpenAIResponses
+      # WebSocket transport for the OpenAI Responses API.
+      # Provides lower-latency agentic workflows by maintaining a persistent
+      # wss:// connection instead of HTTP requests per turn.
+      #
+      # Requires the `websocket-client-simple` gem (soft dependency).
+      #
+      # Usage:
+      #   ws = RubyLLM::ResponsesAPI::WebSocket.new(api_key: ENV['OPENAI_API_KEY'])
+      #   ws.connect
+      #
+      #   ws.create_response(model: 'gpt-4o', input: [{ type: 'message', role: 'user', content: 'Hi' }]) do |chunk|
+      #     print chunk.content if chunk.content
+      #   end
+      #
+      #   ws.disconnect
+      class WebSocket
+        WEBSOCKET_PATH = '/v1/responses'
+        KNOWN_PARAMS = %i[store metadata compact_threshold context_management].freeze
+
+        attr_reader :last_response_id
+
+        # @param api_key [String] OpenAI API key
+        # @param api_base [String] API base URL (https scheme; converted to wss)
+        # @param organization_id [String, nil] OpenAI organization ID
+        # @param project_id [String, nil] OpenAI project ID
+        # @param client_class [#connect, nil] WebSocket client class (for testing)
+        def initialize(api_key:, api_base: 'https://api.openai.com/v1', organization_id: nil, project_id: nil,
+                       client_class: nil)
+          @api_key = api_key
+          @api_base = api_base
+          @organization_id = organization_id
+          @project_id = project_id
+          @client_class = client_class
+
+          @ws = nil
+          @mutex = Mutex.new
+          @connected = false
+          @in_flight = false
+          @last_response_id = nil
+          @message_queue = nil
+        end
+
+        # Open the WebSocket connection. Blocks until the connection is established.
+        # @param timeout [Numeric] seconds to wait for the connection (default: 10)
+        # @raise [ConnectionError] if the connection cannot be established
+        # @return [self]
+        def connect(timeout: 10)
+          client_class = @client_class || resolve_client_class
+
+          ready = Queue.new
+          error_holder = []
+
+          @ws = client_class.connect(build_ws_url, headers: build_headers)
+
+          @ws.on(:open) { ready.push(:ok) }
+
+          @ws.on(:error) do |e|
+            error_holder << e
+            ready.push(:error) unless @connected
+          end
+
+          @ws.on(:close) do
+            @mutex.synchronize do
+              @connected = false
+              @message_queue&.push(nil)
+            end
+          end
+
+          # Route all messages to the current queue (swapped per request)
+          @ws.on(:message) do |msg|
+            q = @mutex.synchronize { @message_queue }
+            q&.push(msg.data)
+          end
+
+          result = pop_with_timeout(ready, timeout)
+          if result == :error || result.nil?
+            err = error_holder.first
+            raise ConnectionError, "WebSocket connection failed: #{err&.message || 'timeout'}"
+          end
+
+          @mutex.synchronize { @connected = true }
+          self
+        end
+
+        # Send a response.create request and stream chunks via block.
+        # @param model [String] model ID
+        # @param input [Array<Hash>] input items in Responses API format
+        # @param tools [Array<Hash>, nil] tool definitions
+        # @param previous_response_id [String, nil] chain to a prior response
+        # @param instructions [String, nil] system/developer instructions
+        # @param extra [Hash] additional top-level fields forwarded to the API
+        # @yield [RubyLLM::Chunk] each streamed chunk
+        # @return [RubyLLM::Message] the assembled final message
+        # @raise [ConcurrencyError] if another response is already in flight
+        # @raise [ConnectionError] if not connected
+        def create_response(model:, input:, tools: nil, previous_response_id: nil, instructions: nil, **extra, &block)
+          ensure_connected!
+          acquire_flight!
+
+          queue = Queue.new
+          @mutex.synchronize { @message_queue = queue }
+
+          payload = build_payload(
+            model: model, input: input, tools: tools,
+            previous_response_id: previous_response_id,
+            instructions: instructions, **extra
+          )
+
+          send_json(payload)
+          accumulate_response(queue, &block)
+        ensure
+          @mutex.synchronize { @message_queue = nil }
+          release_flight!
+        end
+
+        # Warm up the connection by sending a response.create with generate: false.
+        # Caches model weights server-side without generating output.
+        # @param model [String] model ID
+        # @param extra [Hash] additional fields
+        # @return [void]
+        def warmup(model:, **extra)
+          ensure_connected!
+          acquire_flight!
+
+          queue = Queue.new
+          @mutex.synchronize { @message_queue = queue }
+
+          payload = {
+            type: 'response.create',
+            response: { model: model, generate: false }.merge(extra)
+          }
+
+          send_json(payload)
+
+          loop do
+            data = queue.pop
+            break if data.nil?
+
+            parsed = JSON.parse(data)
+            event_type = parsed['type']
+
+            if event_type == 'error'
+              error_msg = parsed.dig('error', 'message') || 'Warmup error'
+              raise ResponseError, error_msg
+            end
+
+            break if event_type == 'response.completed'
+          end
+        ensure
+          @mutex.synchronize { @message_queue = nil }
+          release_flight!
+        end
+
+        # Disconnect the WebSocket.
+        # @return [void]
+        def disconnect
+          @ws&.close
+          @mutex.synchronize { @connected = false }
+        end
+
+        # @return [Boolean]
+        def connected?
+          @mutex.synchronize { @connected }
+        end
+
+        # Close and reopen the connection.
+        # @return [self]
+        def reconnect(timeout: 10)
+          disconnect
+          connect(timeout: timeout)
+        end
+
+        # Custom error types
+        class ConnectionError < StandardError; end
+        class ConcurrencyError < StandardError; end
+        class ResponseError < StandardError; end
+
+        private
+
+        def resolve_client_class
+          require 'websocket-client-simple'
+          ::WebSocket::Client::Simple
+        rescue LoadError
+          raise LoadError,
+                'The websocket-client-simple gem is required for WebSocket mode. ' \
+                "Add `gem 'websocket-client-simple'` to your Gemfile."
+        end
+
+        def build_ws_url
+          base = @api_base.sub(%r{/v1\z}, '')
+          host = base.sub(%r{\Ahttps?://}, '')
+          "wss://#{host}#{WEBSOCKET_PATH}"
+        end
+
+        def build_headers
+          headers = {
+            'Authorization' => "Bearer #{@api_key}",
+            'OpenAI-Beta' => 'responses.websocket=v1'
+          }
+          headers['OpenAI-Organization'] = @organization_id if @organization_id
+          headers['OpenAI-Project'] = @project_id if @project_id
+          headers
+        end
+
+        def build_payload(model:, input:, tools: nil, previous_response_id: nil, instructions: nil, **extra)
+          prev_id = previous_response_id || @last_response_id
+          response = { model: model, input: input }
+          response[:tools] = tools.map { |t| Tools.tool_for(t) } if tools&.any?
+          response[:previous_response_id] = prev_id if prev_id
+          response[:instructions] = instructions if instructions
+
+          State.apply_state_params(response, extra)
+          Compaction.apply_compaction(response, extra)
+
+          forwarded = extra.reject { |k, _| KNOWN_PARAMS.include?(k) }
+          { type: 'response.create', response: response.merge(forwarded) }
+        end
+
+        def send_json(payload)
+          @ws.send(JSON.generate(payload))
+        end
+
+        def accumulate_response(queue, &block)
+          accumulator = StreamAccumulator.new
+
+          loop do
+            raw = queue.pop
+            break if raw.nil?
+
+            data = JSON.parse(raw)
+            event_type = data['type']
+
+            chunk = Streaming.build_chunk(data)
+            accumulator.add(chunk)
+            block&.call(chunk)
+
+            if event_type == 'response.completed'
+              track_response_id(data)
+              break
+            end
+          end
+
+          build_final_message(accumulator)
+        end
+
+        def track_response_id(data)
+          resp_id = data.dig('response', 'id')
+          @mutex.synchronize { @last_response_id = resp_id } if resp_id
+        end
+
+        def build_final_message(accumulator)
+          Message.new(
+            role: :assistant,
+            content: accumulator.content,
+            tool_calls: accumulator.tool_calls.empty? ? nil : accumulator.tool_calls,
+            model_id: accumulator.model_id,
+            response_id: @last_response_id
+          )
+        end
+
+        def ensure_connected!
+          raise ConnectionError, 'WebSocket is not connected. Call #connect first.' unless connected?
+        end
+
+        def acquire_flight!
+          @mutex.synchronize do
+            raise ConcurrencyError, 'Another response is already in flight.' if @in_flight
+
+            @in_flight = true
+          end
+        end
+
+        def release_flight!
+          @mutex.synchronize { @in_flight = false }
+        end
+
+        def pop_with_timeout(queue, seconds)
+          Timeout.timeout(seconds) { queue.pop }
+        rescue Timeout::Error
+          nil
+        end
+      end
+    end
+  end
+end
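The new `web_socket.rb` derives its `wss://` endpoint from the configured https API base. A standalone sketch of that transformation, mirroring the `build_ws_url` logic above; `ws_url` is an illustrative helper, not part of the gem's public API:

```ruby
# Illustrative sketch: derive a wss:// endpoint from an https API base.
WS_PATH = '/v1/responses'

def ws_url(api_base, path = WS_PATH)
  base = api_base.sub(%r{/v1\z}, '')   # drop a trailing /v1 segment
  host = base.sub(%r{\Ahttps?://}, '') # strip the http(s) scheme
  "wss://#{host}#{path}"               # reattach the path under wss://
end

ws_url('https://api.openai.com/v1')
# => "wss://api.openai.com/v1/responses"
```

Note that any port in the base URL survives the scheme strip, so a base like `http://localhost:8080` maps to `wss://localhost:8080/v1/responses`.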
data/lib/rubyllm_responses_api.rb CHANGED
@@ -22,6 +22,7 @@ require_relative 'ruby_llm/providers/openai_responses/containers'
 require_relative 'ruby_llm/providers/openai_responses/message_extension'
 require_relative 'ruby_llm/providers/openai_responses/model_registry'
 require_relative 'ruby_llm/providers/openai_responses/active_record_extension'
+require_relative 'ruby_llm/providers/openai_responses/web_socket'
 
 # Include all modules in the provider class
 require_relative 'ruby_llm/providers/openai_responses'
@@ -36,7 +37,7 @@ RubyLLM::Providers::OpenAIResponses::ModelRegistry.register_all!
 module RubyLLM
   # ResponsesAPI namespace for direct access to helpers and version
   module ResponsesAPI
-    VERSION = '0.3.1'
+    VERSION = '0.4.0'
 
     # Shorthand access to built-in tool helpers
     BuiltInTools = Providers::OpenAIResponses::BuiltInTools
@@ -44,5 +45,6 @@ module RubyLLM
     Background = Providers::OpenAIResponses::Background
     Compaction = Providers::OpenAIResponses::Compaction
     Containers = Providers::OpenAIResponses::Containers
+    WebSocket = Providers::OpenAIResponses::WebSocket
   end
 end
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: ruby_llm-responses_api
 version: !ruby/object:Gem::Version
-  version: 0.3.1
+  version: 0.4.0
 platform: ruby
 authors:
 - Chris Hasinski
@@ -121,6 +121,20 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '3.0'
+- !ruby/object:Gem::Dependency
+  name: websocket-client-simple
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '0.8'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '0.8'
 description: A RubyLLM provider that implements OpenAI's Responses API, providing
   access to built-in tools (web search, code interpreter, file search, shell, apply
   patch), stateful conversations, server-side compaction, containers API, background
@@ -151,6 +165,7 @@ files:
 - lib/ruby_llm/providers/openai_responses/state.rb
 - lib/ruby_llm/providers/openai_responses/streaming.rb
 - lib/ruby_llm/providers/openai_responses/tools.rb
+- lib/ruby_llm/providers/openai_responses/web_socket.rb
 - lib/rubyllm_responses_api.rb
 homepage: https://github.com/khasinski/ruby_llm-responses_api
 licenses:
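A note on the transport internals added in this release: `WebSocket#connect` blocks on a `Queue` until the `:open` callback fires, with a `Timeout` guard. That handshake pattern can be sketched standalone in plain Ruby (no WebSocket involved; the thread below simulates the event callback):

```ruby
require 'timeout'

# Block on a queue until a callback pushes a value; return nil on timeout.
def pop_with_timeout(queue, seconds)
  Timeout.timeout(seconds) { queue.pop }
rescue Timeout::Error
  nil
end

ready = Queue.new
Thread.new do
  sleep 0.05      # simulates network latency before the :open event
  ready.push(:ok) # the on(:open) handler signals readiness
end

ok = pop_with_timeout(ready, 5)              # => :ok
timed_out = pop_with_timeout(Queue.new, 0.1) # => nil (nothing ever arrives)
```

Returning `nil` rather than raising lets the caller fold the timeout case into its own error (`ConnectionError` with a `'timeout'` message, in the gem's case).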