open_router_enhanced 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (47)
  1. checksums.yaml +7 -0
  2. data/.env.example +1 -0
  3. data/.rspec +3 -0
  4. data/.rubocop.yml +13 -0
  5. data/.rubocop_todo.yml +130 -0
  6. data/.ruby-version +1 -0
  7. data/CHANGELOG.md +41 -0
  8. data/CODE_OF_CONDUCT.md +84 -0
  9. data/CONTRIBUTING.md +384 -0
  10. data/Gemfile +22 -0
  11. data/Gemfile.lock +138 -0
  12. data/LICENSE.txt +21 -0
  13. data/MIGRATION.md +556 -0
  14. data/README.md +1660 -0
  15. data/Rakefile +334 -0
  16. data/SECURITY.md +150 -0
  17. data/VCR_CONFIGURATION.md +80 -0
  18. data/docs/model_selection.md +637 -0
  19. data/docs/observability.md +430 -0
  20. data/docs/prompt_templates.md +422 -0
  21. data/docs/streaming.md +467 -0
  22. data/docs/structured_outputs.md +466 -0
  23. data/docs/tools.md +1016 -0
  24. data/examples/basic_completion.rb +122 -0
  25. data/examples/model_selection_example.rb +141 -0
  26. data/examples/observability_example.rb +199 -0
  27. data/examples/prompt_template_example.rb +184 -0
  28. data/examples/smart_completion_example.rb +89 -0
  29. data/examples/streaming_example.rb +176 -0
  30. data/examples/structured_outputs_example.rb +191 -0
  31. data/examples/tool_calling_example.rb +149 -0
  32. data/lib/open_router/client.rb +552 -0
  33. data/lib/open_router/http.rb +118 -0
  34. data/lib/open_router/json_healer.rb +263 -0
  35. data/lib/open_router/model_registry.rb +378 -0
  36. data/lib/open_router/model_selector.rb +462 -0
  37. data/lib/open_router/prompt_template.rb +290 -0
  38. data/lib/open_router/response.rb +371 -0
  39. data/lib/open_router/schema.rb +288 -0
  40. data/lib/open_router/streaming_client.rb +210 -0
  41. data/lib/open_router/tool.rb +221 -0
  42. data/lib/open_router/tool_call.rb +180 -0
  43. data/lib/open_router/usage_tracker.rb +277 -0
  44. data/lib/open_router/version.rb +5 -0
  45. data/lib/open_router.rb +123 -0
  46. data/sig/open_router.rbs +20 -0
  47. metadata +186 -0
data/docs/streaming.md ADDED
@@ -0,0 +1,467 @@
+ # Streaming Client
+
+ The OpenRouter gem provides an enhanced streaming client that offers real-time response streaming with comprehensive callback support and automatic response reconstruction. This is ideal for applications that need to display responses as they're generated or process large responses efficiently.
+
+ ## Quick Start
+
+ ```ruby
+ require 'open_router'
+
+ # Create streaming client
+ streaming_client = OpenRouter::StreamingClient.new(
+   access_token: ENV["OPENROUTER_API_KEY"]
+ )
+
+ # Basic streaming
+ response = streaming_client.stream_complete(
+   [{ role: "user", content: "Write a short story about a robot" }],
+   model: "openai/gpt-4o-mini",
+   accumulate_response: true
+ )
+
+ puts response.content # Complete response after streaming
+ ```
+
+ ## Streaming with Callbacks
+
+ The streaming client supports extensive callback events for monitoring and custom processing.
+
+ ### Available Streaming Events
+
+ - `:on_start` - Triggered when streaming begins
+ - `:on_chunk` - Triggered for each content chunk
+ - `:on_tool_call_chunk` - Triggered for tool call chunks
+ - `:on_finish` - Triggered when streaming completes
+ - `:on_error` - Triggered on errors
+
+ ### Basic Callback Setup
+
+ ```ruby
+ streaming_client = OpenRouter::StreamingClient.new
+
+ # Set up callbacks
+ streaming_client
+   .on_stream(:on_start) do |data|
+     puts "Starting request to #{data[:model]}"
+     puts "Sending #{data[:messages].size} messages"
+   end
+   .on_stream(:on_chunk) do |chunk|
+     print chunk.content if chunk.content
+   end
+   .on_stream(:on_finish) do |response|
+     puts "\nCompleted!"
+     puts "Total tokens: #{response.total_tokens}"
+     puts "Cost: $#{response.cost_estimate}"
+   end
+   .on_stream(:on_error) do |error|
+     puts "Error: #{error.message}"
+   end
+
+ # Stream the request
+ streaming_client.stream_complete(
+   [{ role: "user", content: "Tell me about quantum computing" }],
+   model: "anthropic/claude-3-5-sonnet"
+ )
+ ```
+
+ ## Streaming with Tool Calls
+
+ The streaming client fully supports tool calling with real-time notifications.
+
+ ```ruby
+ # Define a tool
+ weather_tool = OpenRouter::Tool.define do
+   name "get_weather"
+   description "Get current weather for a location"
+   parameters do
+     string :location, required: true, description: "City name"
+     string :units, enum: ["celsius", "fahrenheit"], default: "celsius"
+   end
+ end
+
+ # Set up tool call monitoring
+ streaming_client.on_stream(:on_tool_call_chunk) do |chunk|
+   chunk.tool_calls.each do |tool_call|
+     puts "Tool call: #{tool_call.name}"
+     puts "Arguments: #{tool_call.arguments}"
+   end
+ end
+
+ # Stream with tools
+ response = streaming_client.stream_complete(
+   [{ role: "user", content: "What's the weather in Tokyo and London?" }],
+   model: "anthropic/claude-3-5-sonnet",
+   tools: [weather_tool],
+   accumulate_response: true
+ )
+
+ # Handle tool calls after streaming
+ if response.has_tool_calls?
+   response.tool_calls.each do |tool_call|
+     case tool_call.name
+     when "get_weather"
+       weather_data = fetch_weather(tool_call.arguments["location"])
+       puts "Weather: #{weather_data}"
+     end
+   end
+ end
+ ```
+
+ ## Advanced Streaming Patterns
+
+ ### Real-time Processing
+
+ Process chunks immediately without accumulating the full response:
+
+ ```ruby
+ streaming_client.stream_complete(
+   messages,
+   model: "openai/gpt-4o-mini",
+   accumulate_response: false # Don't store full response
+ ) do |chunk|
+   # Process each chunk immediately
+   if chunk.content
+     # Send to real-time display
+     websocket.send(chunk.content)
+
+     # Log to database
+     log_chunk(chunk.content, timestamp: Time.now)
+
+     # Trigger real-time analytics
+     update_metrics(chunk)
+   end
+ end
+ ```
+
+ ### Error Handling and Fallbacks
+
+ Implement robust error handling with automatic fallbacks:
+
+ ```ruby
+ streaming_client.on_stream(:on_error) do |error|
+   logger.error "Streaming failed: #{error.message}"
+
+   # Implement fallback to non-streaming
+   fallback_client = OpenRouter::Client.new
+   fallback_response = fallback_client.complete(
+     messages,
+     model: "openai/gpt-4o-mini"
+   )
+
+   # Process fallback response
+   process_complete_response(fallback_response)
+ end
+ ```
+
+ ### Performance Monitoring
+
+ Monitor streaming performance in real-time:
+
+ ```ruby
+ start_time = nil
+ token_count = 0
+
+ streaming_client
+   .on_stream(:on_start) { |data| start_time = Time.now }
+   .on_stream(:on_chunk) do |chunk|
+     if chunk.usage
+       token_count = chunk.usage["total_tokens"] || token_count
+       elapsed = Time.now - start_time
+       tokens_per_second = token_count / elapsed
+       puts "Speed: #{tokens_per_second.round(2)} tokens/sec"
+     end
+   end
+   .on_stream(:on_finish) do |response|
+     total_time = Time.now - start_time
+     final_tps = response.total_tokens / total_time
+     puts "Final speed: #{final_tps.round(2)} tokens/sec"
+
+     # Log performance metrics
+     log_performance({
+       model: response.model,
+       tokens: response.total_tokens,
+       duration: total_time,
+       tokens_per_second: final_tps
+     })
+   end
+ ```
+
+ ## Response Accumulation
+
+ The streaming client can automatically accumulate responses for you:
+
+ ```ruby
+ # Accumulate full response (default)
+ response = streaming_client.stream_complete(
+   messages,
+   accumulate_response: true # Default behavior
+ )
+
+ # Access complete response
+ puts response.content
+ puts response.total_tokens
+ puts response.cost_estimate
+
+ # Don't accumulate (memory efficient for large responses)
+ streaming_client.stream_complete(
+   messages,
+   accumulate_response: false
+ ) do |chunk|
+   # Process each chunk as it arrives
+   process_chunk_immediately(chunk)
+ end
+ ```
+
+ ## Structured Outputs with Streaming
+
+ Streaming works seamlessly with structured outputs:
+
+ ```ruby
+ # Define schema
+ user_schema = OpenRouter::Schema.define("user") do
+   string :name, required: true
+   integer :age, required: true
+   string :email, required: true
+ end
+
+ # Stream with structured output
+ response = streaming_client.stream_complete(
+   [{ role: "user", content: "Create a user: John Doe, 30, john@example.com" }],
+   model: "openai/gpt-4o",
+   response_format: user_schema,
+   accumulate_response: true
+ )
+
+ # Access structured output after streaming
+ user_data = response.structured_output
+ puts "User: #{user_data['name']}, Age: #{user_data['age']}"
+ ```
+
+ ## Configuration Options
+
+ The streaming client accepts all the same configuration options as the regular client:
+
+ ```ruby
+ streaming_client = OpenRouter::StreamingClient.new(
+   access_token: ENV["OPENROUTER_API_KEY"],
+   request_timeout: 60, # Shorter timeout for streaming
+   site_name: "My App",
+   site_url: "https://myapp.com",
+   track_usage: true # Enable usage tracking
+ )
+
+ # Configure healing for streaming
+ OpenRouter.configure do |config|
+   config.auto_heal_responses = true
+   config.healer_model = "openai/gpt-4o-mini"
+ end
+ ```
+
+ ## Memory Management
+
+ For long-running applications, manage memory efficiently:
+
+ ```ruby
+ # Process large batches with memory management
+ messages_batch.each_slice(10) do |batch_slice|
+   batch_slice.each do |messages|
+     streaming_client.stream_complete(
+       messages,
+       accumulate_response: false # Don't store in memory
+     ) do |chunk|
+       # Process and discard immediately
+       process_and_save_chunk(chunk)
+     end
+   end
+
+   # Encourage garbage collection between batches
+   GC.start
+ end
+ ```
+
+ ## Integration Patterns
+
+ ### WebSocket Integration
+
+ ```ruby
+ class StreamingController
+   def stream_chat
+     streaming_client = OpenRouter::StreamingClient.new
+
+     streaming_client.on_stream(:on_chunk) do |chunk|
+       if chunk.content
+         ActionCable.server.broadcast(
+           "chat_#{session_id}",
+           { type: 'chunk', content: chunk.content }
+         )
+       end
+     end
+
+     streaming_client.on_stream(:on_finish) do |response|
+       ActionCable.server.broadcast(
+         "chat_#{session_id}",
+         { type: 'complete', total_tokens: response.total_tokens }
+       )
+     end
+
+     streaming_client.stream_complete(messages, model: model)
+   end
+ end
+ ```
+
+ ### Background Job Integration
+
+ ```ruby
+ class StreamingChatJob < ApplicationJob
+   def perform(user_id, messages, model)
+     streaming_client = OpenRouter::StreamingClient.new
+
+     streaming_client.on_stream(:on_chunk) do |chunk|
+       # Broadcast to user's channel
+       ActionCable.server.broadcast(
+         "user_#{user_id}",
+         { chunk: chunk.content }
+       )
+     end
+
+     streaming_client.on_stream(:on_finish) do |response|
+       # Save complete response to database
+       ChatMessage.create!(
+         user_id: user_id,
+         content: response.content,
+         token_count: response.total_tokens,
+         cost: response.cost_estimate
+       )
+     end
+
+     streaming_client.stream_complete(messages, model: model)
+   end
+ end
+ ```
+
+ ## Comparison: Streaming vs Regular Client
+
+ | Feature | Streaming Client | Regular Client |
+ |---------|-----------------|----------------|
+ | Response Time | Real-time chunks | Complete response at end |
+ | Memory Usage | Lower (optional accumulation) | Higher (full response) |
+ | User Experience | Immediate feedback | Wait for completion |
+ | Error Handling | Mid-stream error handling | End-of-request errors |
+ | Tool Calls | Real-time notifications | Post-completion processing |
+ | Complexity | Higher (callbacks) | Lower (simple request/response) |
+
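+ To make the comparison concrete, here is the same request issued both ways. This is a minimal sketch built from the calls shown elsewhere in this guide:
+
+ ```ruby
+ messages = [{ role: "user", content: "Explain TCP slow start in one paragraph" }]
+
+ # Regular client: blocks until the full response is ready
+ client = OpenRouter::Client.new(access_token: ENV["OPENROUTER_API_KEY"])
+ response = client.complete(messages, model: "openai/gpt-4o-mini")
+ puts response.content
+
+ # Streaming client: handles chunks as they arrive
+ streaming_client = OpenRouter::StreamingClient.new(access_token: ENV["OPENROUTER_API_KEY"])
+ streaming_client.stream_complete(messages, model: "openai/gpt-4o-mini") do |chunk|
+   print chunk.content if chunk.content
+ end
+ ```
+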
+ ## Best Practices
+
+ ### When to Use Streaming
+
+ - **Long responses**: Stories, articles, detailed explanations
+ - **Real-time applications**: Chat interfaces, live content generation
+ - **Memory-constrained environments**: Processing large responses
+ - **User experience**: Showing progress to users
+
+ ### When to Use Regular Client
+
+ - **Short responses**: Quick questions, simple completions
+ - **Batch processing**: Processing many requests sequentially
+ - **Simple integrations**: When callbacks add unnecessary complexity
+ - **Structured outputs**: When you need the complete JSON before processing
+
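+ One way to encode this guidance is a small dispatcher. The helper below is hypothetical (not part of the gem) and simply routes long-form requests to the streaming client:
+
+ ```ruby
+ # Hypothetical routing helper: stream long-form requests, use the
+ # regular client for short completions.
+ def complete_with_best_client(messages, model:, long_form: false)
+   if long_form
+     OpenRouter::StreamingClient.new.stream_complete(messages, model: model) do |chunk|
+       print chunk.content if chunk.content
+     end
+   else
+     OpenRouter::Client.new.complete(messages, model: model)
+   end
+ end
+
+ complete_with_best_client(
+   [{ role: "user", content: "Write a 2,000-word essay on batteries" }],
+   model: "openai/gpt-4o-mini",
+   long_form: true
+ )
+ ```
+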
+ ### Error Handling
+
+ Always implement comprehensive error handling:
+
+ ```ruby
+ streaming_client.on_stream(:on_error) do |error|
+   case error
+   when OpenRouter::ServerError
+     # API errors - might be transient
+     retry_with_backoff
+   when Faraday::TimeoutError
+     # Network timeout - try a different model
+     fallback_to_faster_model
+   else
+     # Unknown error - log and fail gracefully
+     logger.error "Streaming error: #{error.message}"
+     send_error_to_user
+   end
+ end
+ ```
+
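+ The `retry_with_backoff` and `fallback_to_faster_model` calls above are placeholders for your own logic. As an illustration, a minimal exponential-backoff helper might look like this (a sketch, not part of the gem):
+
+ ```ruby
+ # Hypothetical helper: retry a block with exponential backoff on server errors.
+ def retry_with_backoff(max_attempts: 3, base_delay: 1)
+   attempts = 0
+   begin
+     attempts += 1
+     yield
+   rescue OpenRouter::ServerError
+     raise if attempts >= max_attempts
+     sleep(base_delay * (2**(attempts - 1))) # 1s, 2s, 4s, ...
+     retry
+   end
+ end
+
+ # Usage: wrap the streaming call itself rather than retrying inside a callback
+ retry_with_backoff do
+   streaming_client.stream_complete(messages, model: "openai/gpt-4o-mini")
+ end
+ ```
+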
+ ### Performance Optimization
+
+ ```ruby
+ # Use connection pooling for high-throughput applications
+ streaming_client = OpenRouter::StreamingClient.new do |config|
+   config.faraday do |f|
+     f.adapter :net_http_persistent, pool_size: 10
+   end
+ end
+
+ # Monitor performance
+ streaming_client.on_stream(:on_finish) do |response|
+   if response.response_time > 5000 # 5 seconds
+     logger.warn "Slow streaming response: #{response.response_time}ms"
+   end
+ end
+ ```
+
+ ## Troubleshooting
+
+ ### Common Issues
+
+ #### Connection Timeouts
+
+ ```ruby
+ # Problem: streaming connections time out
+ streaming_client = OpenRouter::StreamingClient.new(
+   request_timeout: 300 # Increase timeout for long responses
+ )
+
+ # Or handle timeouts gracefully
+ streaming_client.on_stream(:on_error) do |error|
+   if error.is_a?(Faraday::TimeoutError)
+     puts "Request timed out, falling back to regular client"
+     fallback_response = regular_client.complete(messages, model: model)
+   end
+ end
+ ```
+
+ #### Memory Leaks
+
+ ```ruby
+ # Problem: memory usage grows over time
+ # Solution: use non-accumulating streaming
+ streaming_client.stream_complete(
+   messages,
+   accumulate_response: false # Don't store full response
+ ) do |chunk|
+   process_chunk_immediately(chunk)
+ end
+
+ # Or reset callbacks periodically
+ streaming_client.clear_callbacks if request_count % 1000 == 0
+ ```
+
+ #### Missing Chunks
+
+ ```ruby
+ # Problem: some chunks appear empty
+ # Solution: check for the different content types
+ streaming_client.on_stream(:on_chunk) do |chunk|
+   if chunk.content && !chunk.content.empty?
+     process_content(chunk.content)
+   elsif chunk.tool_calls && !chunk.tool_calls.empty?
+     process_tool_calls(chunk.tool_calls)
+   elsif chunk.usage
+     update_usage_metrics(chunk.usage)
+   end
+ end
+ ```
+
+ ### Best Practices
+
+ 1. **Always handle errors**: Implement comprehensive error handling
+ 2. **Set appropriate timeouts**: Balance responsiveness with reliability
+ 3. **Use non-accumulating mode for large responses**: Avoid memory issues
+ 4. **Monitor performance**: Track tokens per second and response times
+ 5. **Implement fallbacks**: Have a backup plan for streaming failures
+ 6. **Clean up resources**: Clear callbacks and reset trackers periodically (see the sketch below)
+
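+ For item 6, a long-lived worker might reset its callbacks on a fixed cadence. This sketch builds on the `clear_callbacks` call shown under Troubleshooting; the counter and re-registration helper are assumptions:
+
+ ```ruby
+ request_count = 0
+
+ jobs.each do |messages| # hypothetical queue of message arrays
+   streaming_client.stream_complete(messages, model: "openai/gpt-4o-mini")
+   request_count += 1
+
+   # Drop accumulated callbacks every 1,000 requests, then re-register
+   if (request_count % 1_000).zero?
+     streaming_client.clear_callbacks
+     register_callbacks(streaming_client) # hypothetical re-setup helper
+   end
+ end
+ ```
+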
+ The streaming client provides a powerful foundation for building responsive AI applications that can process and display results in real time while maintaining full compatibility with all of OpenRouter's advanced features.