ruby_llm 0.1.0.pre42 → 0.1.0.pre44

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 2dac5b1b15a5d840d74ac63af791044a26669e0d41b0512d37236530cdd89693
- data.tar.gz: b737c0061790066680424051afd3b9ea75d9d63f014d1684261dde81c1bedf37
+ metadata.gz: 519b87700f2ba3d5aeb8ea3a87d0881f4ef58f974991647b7af4c9f138cbf3d2
+ data.tar.gz: 553b6441ad59fe5efe6d51374103690101c2cb8d072f91bf2a850d84b4934601
  SHA512:
- metadata.gz: a90362b3d2dbe5b9f5c54851d94646273b343a32eef452528903026ea2e1add335a105273b0efc17a5ebde1da15421d894029ebe401e94c511531a1e8d2f706b
- data.tar.gz: 504e5dd4e3692db83b6b9589e26e33f553865234b59264d05aec1f495e403e8e28eb2b7dfc2b2258047df07dd04f3740005b44dee8a874927b12a55f045fae6e
+ metadata.gz: e52cb96479a0f4443521bf3b5b2e98c882fb1c404e3d1527c74f638137bf56c2cb9442cc6e30ef197157fbdaa7dc189b7f6331a31e6f31ddfa28a926035d6578
+ data.tar.gz: b1d7948851e6c1ffa5257e214f01e6d6efaf64d32d98dcc66ec8f5d6776df232a92b3b26ff1de78889f4195ff27e41dc7375a4b2530dc1b97edc8fd57bb2c835
data/.rspec_status CHANGED
@@ -35,7 +35,7 @@ example_id | status | run_time |
  ./spec/ruby_llm/embeddings_spec.rb[1:1:2:1] | passed | 0.65614 seconds |
  ./spec/ruby_llm/embeddings_spec.rb[1:1:2:2] | passed | 2.16 seconds |
  ./spec/ruby_llm/error_handling_spec.rb[1:1] | passed | 0.29366 seconds |
- ./spec/ruby_llm/image_generation_spec.rb[1:1:1] | passed | 11.61 seconds |
- ./spec/ruby_llm/image_generation_spec.rb[1:1:2] | passed | 17.63 seconds |
- ./spec/ruby_llm/image_generation_spec.rb[1:1:3] | passed | 8.77 seconds |
- ./spec/ruby_llm/image_generation_spec.rb[1:1:4] | passed | 0.00319 seconds |
+ ./spec/ruby_llm/image_generation_spec.rb[1:1:1] | passed | 24.16 seconds |
+ ./spec/ruby_llm/image_generation_spec.rb[1:1:2] | passed | 14.81 seconds |
+ ./spec/ruby_llm/image_generation_spec.rb[1:1:3] | passed | 9.17 seconds |
+ ./spec/ruby_llm/image_generation_spec.rb[1:1:4] | passed | 0.00083 seconds |
data/README.md CHANGED
@@ -1,6 +1,6 @@
  # RubyLLM

- A delightful Ruby way to work with AI. Chat in text, analyze and generate images, understand audio, and use tools through a unified interface to OpenAI, Anthropic, Google, and DeepSeek. Built for developer happiness with automatic token counting, proper streaming, and Rails integration. No wrapping your head around multiple APIs - just clean Ruby code that works.
+ A delightful Ruby way to work with AI. No configuration madness, no complex callbacks, no handler hell – just beautiful, expressive Ruby code.

  <p align="center">
  <img src="https://upload.wikimedia.org/wikipedia/commons/4/4d/OpenAI_Logo.svg" alt="OpenAI" height="40" width="120">
@@ -15,462 +15,164 @@ A delightful Ruby way to work with AI. Chat in text, analyze and generate images
  <p align="center">
  <a href="https://badge.fury.io/rb/ruby_llm"><img src="https://badge.fury.io/rb/ruby_llm.svg" alt="Gem Version" /></a>
  <a href="https://github.com/testdouble/standard"><img src="https://img.shields.io/badge/code_style-standard-brightgreen.svg" alt="Ruby Style Guide" /></a>
- <a href="https://rubygems.org/gems/ruby_llm"><img alt="Gem Total Downloads" src="https://img.shields.io/gem/dt/ruby_llm"></a>
+ <a href="https://rubygems.org/gems/ruby_llm"><img alt="Gem Downloads" src="https://img.shields.io/gem/dt/ruby_llm"></a>
  <a href="https://codecov.io/gh/crmne/ruby_llm"><img src="https://codecov.io/gh/crmne/ruby_llm/branch/main/graph/badge.svg" alt="codecov" /></a>
  </p>

  🤺 Battle tested at [💎 Chat with Work](https://chatwithwork.com)

- ## Features
+ ## The problem with AI libraries

- - 💎 **Beautiful Chat Interface** - Converse with AI models as easily as `RubyLLM.chat.ask "teach me Ruby"`
- - 🎵 **Audio Analysis** - Get audio transcription and understanding with `chat.ask "what's said here?", with: { audio: "clip.wav" }`
- - 👁️ **Vision Understanding** - Let AIs analyze images with a simple `chat.ask "what's in this?", with: { image: "photo.jpg" }`
- - 🌊 **Streaming** - Real-time responses with proper Ruby streaming with `chat.ask "hello" do |chunk| puts chunk.content end`
- - 📄 **PDF Analysis** - Analyze PDF documents directly with `chat.ask "What's in this?", with: { pdf: "document.pdf" }`
- - 🚂 **Rails Integration** - Persist chats and messages with ActiveRecord with `acts_as_{chat|message|tool_call}`
- - 🛠️ **Tool Support** - Give AIs access to your Ruby code with `chat.with_tool(Calculator).ask "what's 2+2?"`
- - 🎨 **Paint with AI** - Create images as easily as `RubyLLM.paint "a sunset over mountains"`
- - 📊 **Embeddings** - Generate vector embeddings for your text with `RubyLLM.embed "hello"`
- - 🔄 **Multi-Provider Support** - Works with OpenAI, Anthropic, Google, and DeepSeek
- - 🎯 **Token Tracking** - Automatic usage tracking across providers
+ Every AI provider comes with its own client library, its own response format, its own conventions for streaming, and its own way of handling errors. Want to use multiple providers? Prepare to juggle incompatible APIs and bloated dependencies.

- ## Installation
-
- Add it to your Gemfile:
-
- ```ruby
- gem 'ruby_llm'
- ```
-
- Or install it yourself:
-
- ```bash
- gem install ruby_llm
- ```
+ RubyLLM fixes all that. One beautiful API for everything. One consistent format. Minimal dependencies — just Faraday and Zeitwerk. Because working with AI should be a joy, not a chore.

- ## Configuration
+ ## What makes it great

  ```ruby
- require 'ruby_llm'
-
- # Configure your API keys
- RubyLLM.configure do |config|
- config.openai_api_key = ENV['OPENAI_API_KEY']
- config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
- config.gemini_api_key = ENV['GEMINI_API_KEY']
- config.deepseek_api_key = ENV['DEEPSEEK_API_KEY']
- end
- ```
-
- ## Quick Start
-
- RubyLLM makes it dead simple to start chatting with AI models:
-
- ```ruby
- # Start a conversation
+ # Just ask questions
  chat = RubyLLM.chat
  chat.ask "What's the best way to learn Ruby?"
- ```

- ## Available Models
+ # Analyze images
+ chat.ask "What's in this image?", with: { image: "ruby_conf.jpg" }

- RubyLLM gives you access to the latest models from multiple providers:
+ # Analyze audio recordings
+ chat.ask "Describe this meeting", with: { audio: "meeting.wav" }

- ```ruby
- # List all available models
- RubyLLM.models.all
-
- # Get models by type
- chat_models = RubyLLM.models.chat_models
- embedding_models = RubyLLM.models.embedding_models
- audio_models = RubyLLM.models.audio_models
- image_models = RubyLLM.models.image_models
- ```
-
- ## Having a Conversation
+ # Analyze documents
+ chat.ask "Summarize this document", with: { pdf: "contract.pdf" }

- Conversations are simple and natural:
-
- ```ruby
- chat = RubyLLM.chat model: 'gemini-2.0-flash'
+ # Generate images
+ RubyLLM.paint "a sunset over mountains in watercolor style"

- # Ask questions
- response = chat.ask "What's your favorite Ruby feature?"
+ # Create vector embeddings
+ RubyLLM.embed "Ruby is elegant and expressive"

- # Multi-turn conversations just work
- chat.ask "Can you elaborate on that?"
- chat.ask "How does that compare to Python?"
-
- # Stream responses as they come
- chat.ask "Tell me a story about a Ruby programmer" do |chunk|
- print chunk.content
- end
-
- # Ask about images
- chat.ask "What do you see in this image?", with: { image: "ruby_logo.png" }
-
- # Get analysis of audio content
- chat.ask "What's being said in this recording?", with: { audio: "meeting.wav" }
-
- # Combine multiple pieces of content
- chat.ask "Compare these diagrams", with: { image: ["diagram1.png", "diagram2.png"] }
-
- # Ask about PDFs
-
- chat = RubyLLM.chat(model: 'claude-3-7-sonnet-20250219')
- chat.ask "Summarize this research paper", with: { pdf: "research.pdf" }
-
- # Multiple PDFs work too
- chat.ask "Compare these contracts", with: { pdf: ["contract1.pdf", "contract2.pdf"] }
-
- # Check token usage
- last_message = chat.messages.last
- puts "Conversation used #{last_message.input_tokens} input tokens and #{last_message.output_tokens} output tokens"
- ```
-
- You can provide content as local files or URLs - RubyLLM handles the rest. Vision and audio capabilities are available with compatible models. The API stays clean and consistent whether you're working with text, images, or audio.
-
- ## Image Generation
-
- Want to create AI-generated images? RubyLLM makes it super simple:
-
- ```ruby
- # Paint a picture!
- image = RubyLLM.paint "a starry night over San Francisco in Van Gogh's style"
- image.url # => "https://..."
- image.revised_prompt # Shows how DALL-E interpreted your prompt
-
- # Choose size and model
- image = RubyLLM.paint(
- "a cyberpunk cityscape at sunset",
- model: "dall-e-3",
- size: "1792x1024"
- )
-
- # Set your default model
- RubyLLM.configure do |config|
- config.default_image_model = "dall-e-3"
- end
- ```
-
- RubyLLM automatically handles all the complexities of the DALL-E API, token/credit management, and error handling, so you can focus on being creative.
-
- ## Text Embeddings
-
- Need vector embeddings for your text? RubyLLM makes it simple:
-
- ```ruby
- # Get embeddings with the default model
- RubyLLM.embed "Hello, world!"
-
- # Use a specific model
- RubyLLM.embed "Ruby is awesome!", model: "text-embedding-004"
-
- # Process multiple texts at once
- RubyLLM.embed([
- "First document",
- "Second document",
- "Third document"
- ])
-
- # Configure the default model
- RubyLLM.configure do |config|
- config.default_embedding_model = 'text-embedding-3-large'
- end
- ```
-
- ## Using Tools
-
- Give your AI assistants access to your Ruby code by creating tool classes that do one thing well:
-
- ```ruby
+ # Let AI use your code
  class Calculator < RubyLLM::Tool
- description "Performs arithmetic calculations"
-
- param :expression,
- type: :string,
- desc: "A mathematical expression to evaluate (e.g. '2 + 2')"
+ description "Performs calculations"
+ param :expression, type: :string, desc: "Math expression to evaluate"

  def execute(expression:)
  eval(expression).to_s
  end
  end

- class Search < RubyLLM::Tool
- description "Searches documents by similarity"
-
- param :query,
- desc: "The search query"
-
- param :limit,
- type: :integer,
- desc: "Number of results to return",
- required: false
-
- def initialize(repo:)
- @repo = repo
- end
-
- def execute(query:, limit: 5)
- @repo.similarity_search(query, limit:)
- end
- end
+ chat.with_tool(Calculator).ask "What's 123 * 456?"
  ```

- Then use them in your conversations:
+ ## Installation

  ```ruby
- # Simple tools just work
- chat = RubyLLM.chat.with_tool Calculator
-
- # Tools with dependencies are just regular Ruby objects
- search = Search.new repo: Document
- chat.with_tools search, Calculator
-
- # Configure as needed
- chat.with_model('claude-3-5-sonnet-20241022')
- .with_temperature(0.9)
+ # In your Gemfile
+ gem 'ruby_llm'

- chat.ask "What's 2+2?"
- # => "Let me calculate that for you. The result is 4."
+ # Then run
+ bundle install

- chat.ask "Find documents about Ruby performance"
- # => "I found these relevant documents about Ruby performance..."
+ # Or install it yourself
+ gem install ruby_llm
  ```

- Need to debug a tool? RubyLLM automatically logs all tool calls:
+ Configure with your API keys:

  ```ruby
- ENV['RUBY_LLM_DEBUG'] = 'true'
-
- chat.ask "What's 123 * 456?"
- # D, -- RubyLLM: Tool calculator called with: {"expression" => "123 * 456"}
- # D, -- RubyLLM: Tool calculator returned: "56088"
+ RubyLLM.configure do |config|
+ config.openai_api_key = ENV['OPENAI_API_KEY']
+ config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
+ config.gemini_api_key = ENV['GEMINI_API_KEY']
+ config.deepseek_api_key = ENV['DEEPSEEK_API_KEY'] # Optional
+ end
  ```

- ## Error Handling
-
- RubyLLM wraps provider errors in clear Ruby exceptions:
+ ## Have great conversations

  ```ruby
- begin
- chat = RubyLLM.chat
- chat.ask "Hello world!"
- rescue RubyLLM::UnauthorizedError
- puts "Check your API credentials"
- rescue RubyLLM::BadRequestError => e
- puts "Something went wrong: #{e.message}"
- rescue RubyLLM::PaymentRequiredError
- puts "Time to top up your API credits"
- rescue RubyLLM::ServiceUnavailableError
- puts "API service is temporarily down"
- end
- ```
+ # Start a chat with the default model (GPT-4o-mini)
+ chat = RubyLLM.chat

- ## Rails Integration
+ # Or specify what you want
+ chat = RubyLLM.chat(model: 'claude-3-7-sonnet-20250219')

- RubyLLM comes with built-in Rails support that makes it dead simple to persist your chats and messages. Just create your tables and hook it up:
+ # Simple questions just work
+ chat.ask "What's the difference between attr_reader and attr_accessor?"

- ```ruby
- # db/migrate/YYYYMMDDHHMMSS_create_chats.rb
- class CreateChats < ActiveRecord::Migration[8.0]
- def change
- create_table :chats do |t|
- t.string :model_id
- t.timestamps
- end
- end
- end
+ # Multi-turn conversations are seamless
+ chat.ask "Could you give me an example?"

- # db/migrate/YYYYMMDDHHMMSS_create_messages.rb
- class CreateMessages < ActiveRecord::Migration[8.0]
- def change
- create_table :messages do |t|
- t.references :chat, null: false
- t.string :role
- t.text :content
- t.string :model_id
- t.integer :input_tokens
- t.integer :output_tokens
- t.references :tool_call
- t.timestamps
- end
- end
+ # Stream responses in real-time
+ chat.ask "Tell me a story about a Ruby programmer" do |chunk|
+ print chunk.content
  end

- # db/migrate/YYYYMMDDHHMMSS_create_tool_calls.rb
- class CreateToolCalls < ActiveRecord::Migration[8.0]
- def change
- create_table :tool_calls do |t|
- t.references :message, null: false
- t.string :tool_call_id, null: false
- t.string :name, null: false
- t.jsonb :arguments, default: {}
- t.timestamps
- end
-
- add_index :tool_calls, :tool_call_id
- end
- end
+ # Understand content in multiple forms
+ chat.ask "Compare these diagrams", with: { image: ["diagram1.png", "diagram2.png"] }
+ chat.ask "Summarize this document", with: { pdf: "contract.pdf" }
+ chat.ask "What's being said?", with: { audio: "meeting.wav" }
+
+ # Need a different model mid-conversation? No problem
+ chat.with_model('gemini-2.0-flash').ask "What's your favorite algorithm?"
  ```

- Then in your models:
+ ## Rails integration that makes sense

  ```ruby
+ # app/models/chat.rb
  class Chat < ApplicationRecord
  acts_as_chat

- # Optional: Add Turbo Streams support
+ # Works great with Turbo
  broadcasts_to ->(chat) { "chat_#{chat.id}" }
  end

+ # app/models/message.rb
  class Message < ApplicationRecord
  acts_as_message
  end

+ # app/models/tool_call.rb
  class ToolCall < ApplicationRecord
  acts_as_tool_call
  end
- ```
-
- That's it! Now you can use chats straight from your models:
-
- ```ruby
- # Create a new chat
- chat = Chat.create! model_id: "gpt-4o-mini"
-
- # Ask questions - messages are automatically saved
- chat.ask "What's the weather in Paris?"
-
- # Stream responses in real-time
- chat.ask "Tell me a story" do |chunk|
- broadcast_chunk chunk
- end

- # Everything is persisted automatically
- chat.messages.each do |message|
- case message.role
- when :user
- puts "User: #{message.content}"
- when :assistant
- puts "Assistant: #{message.content}"
- end
+ # In your controller
+ chat = Chat.create!(model_id: "gpt-4o-mini")
+ chat.ask("What's your favorite Ruby gem?") do |chunk|
+ Turbo::StreamsChannel.broadcast_append_to(
+ chat,
+ target: "response",
+ partial: "messages/chunk",
+ locals: { chunk: chunk }
+ )
  end
- ```
-
- ### Real-time Updates with Hotwire
-
- The Rails integration works great with Hotwire out of the box:
-
- ```ruby
- # app/controllers/chats_controller.rb
- class ChatsController < ApplicationController
- def show
- @chat = Chat.find(params[:id])
- end
-
- def ask
- @chat = Chat.find(params[:id])
- @chat.ask(params[:message]) do |chunk|
- Turbo::StreamsChannel.broadcast_append_to(
- @chat,
- target: "messages",
- partial: "messages/chunk",
- locals: { chunk: chunk }
- )
- end
- end
- end
-
- # app/views/chats/show.html.erb
- <%= turbo_stream_from @chat %>

- <div id="messages">
- <%= render @chat.messages %>
- </div>
-
- <%= form_with(url: ask_chat_path(@chat), local: false) do |f| %>
- <%= f.text_area :message %>
- <%= f.submit "Send" %>
- <% end %>
+ # That's it - chat history is automatically saved
  ```

- ### Background Jobs
-
- The persistence works seamlessly with background jobs:
-
- ```ruby
- class ChatJob < ApplicationJob
- def perform(chat_id, message)
- chat = Chat.find chat_id
-
- chat.ask(message) do |chunk|
- # Optional: Broadcast chunks for real-time updates
- Turbo::StreamsChannel.broadcast_append_to(
- chat,
- target: "messages",
- partial: "messages/chunk",
- locals: { chunk: chunk }
- )
- end
- end
- end
- ```
-
- ### Using Tools
-
- Tools work just like they do in regular RubyLLM chats:
+ ## Creating tools is a breeze

  ```ruby
- class WeatherTool < RubyLLM::Tool
- description "Gets current weather for a location"
+ class Search < RubyLLM::Tool
+ description "Searches a knowledge base"

- param :location,
- type: :string,
- desc: "City name or coordinates"
+ param :query, desc: "The search query"
+ param :limit, type: :integer, desc: "Max results", required: false

- def execute(location:)
- # Fetch weather data...
- { temperature: 22, conditions: "Sunny" }
+ def execute(query:, limit: 5)
+ # Your search logic here
+ Document.search(query).limit(limit).map(&:title)
  end
  end

- # Use tools with your persisted chats
- chat = Chat.create! model_id: "deepseek-reasoner"
- chat.chat.with_tool WeatherTool.new
-
- # Ask about weather - tool usage is automatically saved
- chat.ask "What's the weather in Paris?"
-
- # Tool calls and results are persisted as messages
- pp chat.messages.map(&:role)
- #=> [:user, :assistant, :tool, :assistant]
+ # Let the AI use it
+ chat.with_tool(Search).ask "Find documents about Ruby 3.3 features"
  ```

- ## Provider Comparison
-
- | Feature | OpenAI | Anthropic | Google | DeepSeek |
- |---------|--------|-----------|--------|----------|
- | Chat | ✅ GPT-4o, GPT-3.5 | ✅ Claude 3.7, 3.5, 3 | ✅ Gemini 2.0, 1.5 | ✅ DeepSeek Chat, Reasoner |
- | Vision | ✅ GPT-4o, GPT-4 | ✅ All Claude 3 models | ✅ Gemini 2.0, 1.5 | ❌ |
- | Audio | ✅ GPT-4o-audio, Whisper | ❌ | ✅ Gemini models | ❌ |
- | PDF Analysis | ❌ | ✅ All Claude 3 models | ✅ Gemini models | ❌ |
- | Function Calling | ✅ Most models | ✅ Claude 3 models | ✅ Gemini models (except Lite) | ✅ |
- | JSON Mode | ✅ Most recent models | ✅ Claude 3 models | ✅ Gemini models | ❌ |
- | Image Generation | ✅ DALL-E 3 | ❌ | ✅ Imagen | ❌ |
- | Embeddings | ✅ text-embedding-3 | ❌ | ✅ text-embedding-004 | ❌ |
- | Context Size | ⭐ Up to 200K (o1) | ⭐ 200K tokens | ⭐ Up to 2M tokens | 64K tokens |
- | Streaming | ✅ | ✅ | ✅ | ✅ |
-
- ## Development
-
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `bin/console` for an interactive prompt.
-
- ## Contributing
+ ## Learn more

- Bug reports and pull requests are welcome on GitHub at https://github.com/crmne/ruby_llm.
+ Check out the guides at https://rubyllm.com for deeper dives into conversations with tools, streaming responses, embedding generations, and more.

  ## License

- Released under the MIT License. See [LICENSE](LICENSE) for details.
+ Released under the MIT License.
@@ -3,16 +3,39 @@
  module RubyLLM
  # Represents a generated image from an AI model.
  # Provides an interface to image generation capabilities
- # from providers like DALL-E.
+ # from providers like DALL-E and Gemini's Imagen.
  class Image
- attr_reader :url, :revised_prompt, :model_id
+ attr_reader :url, :data, :mime_type, :revised_prompt, :model_id

- def initialize(url:, revised_prompt: nil, model_id: nil)
+ def initialize(url: nil, data: nil, mime_type: nil, revised_prompt: nil, model_id: nil)
  @url = url
+ @data = data
+ @mime_type = mime_type
  @revised_prompt = revised_prompt
  @model_id = model_id
  end

+ def base64?
+ !@data.nil?
+ end
+
+ # Returns the raw binary image data regardless of source
+ def to_blob
+ if base64?
+ Base64.decode64(@data)
+ else
+ # Use Faraday instead of URI.open for better security
+ response = Faraday.get(@url)
+ response.body
+ end
+ end
+
+ # Saves the image to a file path
+ def save(path)
+ File.binwrite(File.expand_path(path), to_blob)
+ path
+ end
+

  def self.paint(prompt, model: nil, size: '1024x1024')
  model_id = model || RubyLLM.config.default_image_model
  Models.find(model_id) # Validate model exists
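
Taken together, these additions let callers treat URL-backed images (DALL-E) and base64-backed images (Imagen) uniformly. A minimal usage sketch based on the methods above; the prompt and output filename are illustrative:

```ruby
image = RubyLLM.paint "a sunset over mountains"

image.base64?            # true when the provider returned raw data, false for URLs
image.mime_type          # e.g. "image/png" when the provider reports it
image.save("sunset.png") # decodes or downloads as needed, then writes the file
blob = image.to_blob     # raw bytes, handy for ActiveStorage or IO pipelines
```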
@@ -158,6 +158,16 @@ module RubyLLM
  end
  end

+ def parse_data_uri(uri)
+ if uri&.start_with?('data:')
+ match = uri.match(/\Adata:([^;]+);base64,(.+)\z/)
+ return { mime_type: match[1], data: match[2] } if match
+ end
+
+ # If it's not a data URI, return nil
+ nil
+ end
+

  class << self
  def extended(base)
  base.extend(Methods)
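
The helper splits a `data:` URI into its mime type and base64 payload, and returns `nil` for anything else. Hypothetical calls illustrating both branches:

```ruby
parse_data_uri("data:image/png;base64,iVBORw0KGgo=")
# => { mime_type: "image/png", data: "iVBORw0KGgo=" }

parse_data_uri("https://example.com/image.png")
# => nil (not a data URI)
```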
@@ -34,10 +34,13 @@ module RubyLLM
  source = part[:source]

  if source.start_with?('http')
- # For URLs
+ # For URLs - add "type": "url" here
  {
  type: 'document',
- source: { url: source }
+ source: {
+ type: 'url', # This line is missing in the current implementation
+ url: source
+ }
  }
  else
  # For local files
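
The practical effect is that URL-based document sources now carry the `type` discriminator that Anthropic's Messages API expects. A sketch of the payload the new branch builds, with an illustrative URL:

```ruby
{
  type: 'document',
  source: {
    type: 'url',                          # discriminator for URL sources
    url: 'https://example.com/paper.pdf'  # illustrative URL
  }
}
```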
@@ -37,12 +37,13 @@ module RubyLLM
  raise Error, 'Unexpected response format from Gemini image generation API'
  end

- # Handle response with base64 encoded image data
- image_url = "data:#{image_data['mimeType'] || 'image/png'};base64,#{image_data['bytesBase64Encoded']}"
+ # Extract mime type and base64 data
+ mime_type = image_data['mimeType'] || 'image/png'
+ base64_data = image_data['bytesBase64Encoded']
+
  Image.new(
- url: image_url,
- revised_prompt: '', # Imagen doesn't return revised prompts
- model_id: ''
+ data: base64_data,
+ mime_type: mime_type
  )
  end
  end
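
With this change an Imagen result carries its raw base64 payload and mime type instead of a synthetic `data:` URL, so the new `Image` helpers apply directly. A hedged sketch; the model id is illustrative:

```ruby
image = RubyLLM.paint "a watercolor city", model: "imagen-3.0-generate-001" # illustrative model id
image.base64?          # => true for Gemini's Imagen
image.save("city.png") # decodes the base64 payload and writes it out
```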
@@ -26,6 +26,7 @@ module RubyLLM

  Image.new(
  url: image_data['url'],
+ mime_type: 'image/png', # DALL-E typically returns PNGs
  revised_prompt: image_data['revised_prompt'],
  model_id: data['model']
  )
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module RubyLLM
- VERSION = '0.1.0.pre42'
+ VERSION = '0.1.0.pre44'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: ruby_llm
  version: !ruby/object:Gem::Version
- version: 0.1.0.pre42
+ version: 0.1.0.pre44
  platform: ruby
  authors:
  - Carmine Paolino
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2025-02-26 00:00:00.000000000 Z
+ date: 2025-02-28 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: event_stream_parser