ruby_llm 0.1.0.pre42 → 0.1.0.pre44

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 2dac5b1b15a5d840d74ac63af791044a26669e0d41b0512d37236530cdd89693
- data.tar.gz: b737c0061790066680424051afd3b9ea75d9d63f014d1684261dde81c1bedf37
+ metadata.gz: 519b87700f2ba3d5aeb8ea3a87d0881f4ef58f974991647b7af4c9f138cbf3d2
+ data.tar.gz: 553b6441ad59fe5efe6d51374103690101c2cb8d072f91bf2a850d84b4934601
  SHA512:
- metadata.gz: a90362b3d2dbe5b9f5c54851d94646273b343a32eef452528903026ea2e1add335a105273b0efc17a5ebde1da15421d894029ebe401e94c511531a1e8d2f706b
- data.tar.gz: 504e5dd4e3692db83b6b9589e26e33f553865234b59264d05aec1f495e403e8e28eb2b7dfc2b2258047df07dd04f3740005b44dee8a874927b12a55f045fae6e
+ metadata.gz: e52cb96479a0f4443521bf3b5b2e98c882fb1c404e3d1527c74f638137bf56c2cb9442cc6e30ef197157fbdaa7dc189b7f6331a31e6f31ddfa28a926035d6578
+ data.tar.gz: b1d7948851e6c1ffa5257e214f01e6d6efaf64d32d98dcc66ec8f5d6776df232a92b3b26ff1de78889f4195ff27e41dc7375a4b2530dc1b97edc8fd57bb2c835
data/.rspec_status CHANGED
@@ -35,7 +35,7 @@ example_id | status | run_time |
  ./spec/ruby_llm/embeddings_spec.rb[1:1:2:1] | passed | 0.65614 seconds |
  ./spec/ruby_llm/embeddings_spec.rb[1:1:2:2] | passed | 2.16 seconds |
  ./spec/ruby_llm/error_handling_spec.rb[1:1] | passed | 0.29366 seconds |
- ./spec/ruby_llm/image_generation_spec.rb[1:1:1] | passed | 11.61 seconds |
- ./spec/ruby_llm/image_generation_spec.rb[1:1:2] | passed | 17.63 seconds |
- ./spec/ruby_llm/image_generation_spec.rb[1:1:3] | passed | 8.77 seconds |
- ./spec/ruby_llm/image_generation_spec.rb[1:1:4] | passed | 0.00319 seconds |
+ ./spec/ruby_llm/image_generation_spec.rb[1:1:1] | passed | 24.16 seconds |
+ ./spec/ruby_llm/image_generation_spec.rb[1:1:2] | passed | 14.81 seconds |
+ ./spec/ruby_llm/image_generation_spec.rb[1:1:3] | passed | 9.17 seconds |
+ ./spec/ruby_llm/image_generation_spec.rb[1:1:4] | passed | 0.00083 seconds |
data/README.md CHANGED
@@ -1,6 +1,6 @@
  # RubyLLM

- A delightful Ruby way to work with AI. Chat in text, analyze and generate images, understand audio, and use tools through a unified interface to OpenAI, Anthropic, Google, and DeepSeek. Built for developer happiness with automatic token counting, proper streaming, and Rails integration. No wrapping your head around multiple APIs - just clean Ruby code that works.
+ A delightful Ruby way to work with AI. No configuration madness, no complex callbacks, no handler hell – just beautiful, expressive Ruby code.

  <p align="center">
  <img src="https://upload.wikimedia.org/wikipedia/commons/4/4d/OpenAI_Logo.svg" alt="OpenAI" height="40" width="120">
@@ -15,462 +15,164 @@ A delightful Ruby way to work with AI. Chat in text, analyze and generate images
  <p align="center">
  <a href="https://badge.fury.io/rb/ruby_llm"><img src="https://badge.fury.io/rb/ruby_llm.svg" alt="Gem Version" /></a>
  <a href="https://github.com/testdouble/standard"><img src="https://img.shields.io/badge/code_style-standard-brightgreen.svg" alt="Ruby Style Guide" /></a>
- <a href="https://rubygems.org/gems/ruby_llm"><img alt="Gem Total Downloads" src="https://img.shields.io/gem/dt/ruby_llm"></a>
+ <a href="https://rubygems.org/gems/ruby_llm"><img alt="Gem Downloads" src="https://img.shields.io/gem/dt/ruby_llm"></a>
  <a href="https://codecov.io/gh/crmne/ruby_llm"><img src="https://codecov.io/gh/crmne/ruby_llm/branch/main/graph/badge.svg" alt="codecov" /></a>
  </p>

  🤺 Battle tested at [💎 Chat with Work](https://chatwithwork.com)

- ## Features
+ ## The problem with AI libraries

- - 💎 **Beautiful Chat Interface** - Converse with AI models as easily as `RubyLLM.chat.ask "teach me Ruby"`
- - 🎵 **Audio Analysis** - Get audio transcription and understanding with `chat.ask "what's said here?", with: { audio: "clip.wav" }`
- - 👁️ **Vision Understanding** - Let AIs analyze images with a simple `chat.ask "what's in this?", with: { image: "photo.jpg" }`
- - 🌊 **Streaming** - Real-time responses with proper Ruby streaming with `chat.ask "hello" do |chunk| puts chunk.content end`
- - 📄 **PDF Analysis** - Analyze PDF documents directly with `chat.ask "What's in this?", with: { pdf: "document.pdf" }`
- - 🚂 **Rails Integration** - Persist chats and messages with ActiveRecord with `acts_as_{chat|message|tool_call}`
- - 🛠️ **Tool Support** - Give AIs access to your Ruby code with `chat.with_tool(Calculator).ask "what's 2+2?"`
- - 🎨 **Paint with AI** - Create images as easily as `RubyLLM.paint "a sunset over mountains"`
- - 📊 **Embeddings** - Generate vector embeddings for your text with `RubyLLM.embed "hello"`
- - 🔄 **Multi-Provider Support** - Works with OpenAI, Anthropic, Google, and DeepSeek
- - 🎯 **Token Tracking** - Automatic usage tracking across providers
+ Every AI provider comes with its own client library, its own response format, its own conventions for streaming, and its own way of handling errors. Want to use multiple providers? Prepare to juggle incompatible APIs and bloated dependencies.

- ## Installation
-
- Add it to your Gemfile:
-
- ```ruby
- gem 'ruby_llm'
- ```
-
- Or install it yourself:
-
- ```bash
- gem install ruby_llm
- ```
+ RubyLLM fixes all that. One beautiful API for everything. One consistent format. Minimal dependencies — just Faraday and Zeitwerk. Because working with AI should be a joy, not a chore.

- ## Configuration
+ ## What makes it great

  ```ruby
- require 'ruby_llm'
-
- # Configure your API keys
- RubyLLM.configure do |config|
- config.openai_api_key = ENV['OPENAI_API_KEY']
- config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
- config.gemini_api_key = ENV['GEMINI_API_KEY']
- config.deepseek_api_key = ENV['DEEPSEEK_API_KEY']
- end
- ```
-
- ## Quick Start
-
- RubyLLM makes it dead simple to start chatting with AI models:
-
- ```ruby
- # Start a conversation
+ # Just ask questions
  chat = RubyLLM.chat
  chat.ask "What's the best way to learn Ruby?"
- ```

- ## Available Models
+ # Analyze images
+ chat.ask "What's in this image?", with: { image: "ruby_conf.jpg" }

- RubyLLM gives you access to the latest models from multiple providers:
+ # Analyze audio recordings
+ chat.ask "Describe this meeting", with: { audio: "meeting.wav" }

- ```ruby
- # List all available models
- RubyLLM.models.all
-
- # Get models by type
- chat_models = RubyLLM.models.chat_models
- embedding_models = RubyLLM.models.embedding_models
- audio_models = RubyLLM.models.audio_models
- image_models = RubyLLM.models.image_models
- ```
-
- ## Having a Conversation
+ # Analyze documents
+ chat.ask "Summarize this document", with: { pdf: "contract.pdf" }

- Conversations are simple and natural:
-
- ```ruby
- chat = RubyLLM.chat model: 'gemini-2.0-flash'
+ # Generate images
+ RubyLLM.paint "a sunset over mountains in watercolor style"

- # Ask questions
- response = chat.ask "What's your favorite Ruby feature?"
+ # Create vector embeddings
+ RubyLLM.embed "Ruby is elegant and expressive"

- # Multi-turn conversations just work
- chat.ask "Can you elaborate on that?"
- chat.ask "How does that compare to Python?"
-
- # Stream responses as they come
- chat.ask "Tell me a story about a Ruby programmer" do |chunk|
- print chunk.content
- end
-
- # Ask about images
- chat.ask "What do you see in this image?", with: { image: "ruby_logo.png" }
-
- # Get analysis of audio content
- chat.ask "What's being said in this recording?", with: { audio: "meeting.wav" }
-
- # Combine multiple pieces of content
- chat.ask "Compare these diagrams", with: { image: ["diagram1.png", "diagram2.png"] }
-
- # Ask about PDFs
-
- chat = RubyLLM.chat(model: 'claude-3-7-sonnet-20250219')
- chat.ask "Summarize this research paper", with: { pdf: "research.pdf" }
-
- # Multiple PDFs work too
- chat.ask "Compare these contracts", with: { pdf: ["contract1.pdf", "contract2.pdf"] }
-
- # Check token usage
- last_message = chat.messages.last
- puts "Conversation used #{last_message.input_tokens} input tokens and #{last_message.output_tokens} output tokens"
- ```
-
- You can provide content as local files or URLs - RubyLLM handles the rest. Vision and audio capabilities are available with compatible models. The API stays clean and consistent whether you're working with text, images, or audio.
-
- ## Image Generation
-
- Want to create AI-generated images? RubyLLM makes it super simple:
-
- ```ruby
- # Paint a picture!
- image = RubyLLM.paint "a starry night over San Francisco in Van Gogh's style"
- image.url # => "https://..."
- image.revised_prompt # Shows how DALL-E interpreted your prompt
-
- # Choose size and model
- image = RubyLLM.paint(
- "a cyberpunk cityscape at sunset",
- model: "dall-e-3",
- size: "1792x1024"
- )
-
- # Set your default model
- RubyLLM.configure do |config|
- config.default_image_model = "dall-e-3"
- end
- ```
-
- RubyLLM automatically handles all the complexities of the DALL-E API, token/credit management, and error handling, so you can focus on being creative.
-
- ## Text Embeddings
-
- Need vector embeddings for your text? RubyLLM makes it simple:
-
- ```ruby
- # Get embeddings with the default model
- RubyLLM.embed "Hello, world!"
-
- # Use a specific model
- RubyLLM.embed "Ruby is awesome!", model: "text-embedding-004"
-
- # Process multiple texts at once
- RubyLLM.embed([
- "First document",
- "Second document",
- "Third document"
- ])
-
- # Configure the default model
- RubyLLM.configure do |config|
- config.default_embedding_model = 'text-embedding-3-large'
- end
- ```
-
- ## Using Tools
-
- Give your AI assistants access to your Ruby code by creating tool classes that do one thing well:
-
- ```ruby
+ # Let AI use your code
  class Calculator < RubyLLM::Tool
- description "Performs arithmetic calculations"
-
- param :expression,
- type: :string,
- desc: "A mathematical expression to evaluate (e.g. '2 + 2')"
+ description "Performs calculations"
+ param :expression, type: :string, desc: "Math expression to evaluate"

  def execute(expression:)
  eval(expression).to_s
  end
  end

- class Search < RubyLLM::Tool
- description "Searches documents by similarity"
-
- param :query,
- desc: "The search query"
-
- param :limit,
- type: :integer,
- desc: "Number of results to return",
- required: false
-
- def initialize(repo:)
- @repo = repo
- end
-
- def execute(query:, limit: 5)
- @repo.similarity_search(query, limit:)
- end
- end
+ chat.with_tool(Calculator).ask "What's 123 * 456?"
  ```

- Then use them in your conversations:
+ ## Installation

  ```ruby
- # Simple tools just work
- chat = RubyLLM.chat.with_tool Calculator
-
- # Tools with dependencies are just regular Ruby objects
- search = Search.new repo: Document
- chat.with_tools search, Calculator
-
- # Configure as needed
- chat.with_model('claude-3-5-sonnet-20241022')
- .with_temperature(0.9)
+ # In your Gemfile
+ gem 'ruby_llm'

- chat.ask "What's 2+2?"
- # => "Let me calculate that for you. The result is 4."
+ # Then run
+ bundle install

- chat.ask "Find documents about Ruby performance"
- # => "I found these relevant documents about Ruby performance..."
+ # Or install it yourself
+ gem install ruby_llm
  ```

- Need to debug a tool? RubyLLM automatically logs all tool calls:
+ Configure with your API keys:

  ```ruby
- ENV['RUBY_LLM_DEBUG'] = 'true'
-
- chat.ask "What's 123 * 456?"
- # D, -- RubyLLM: Tool calculator called with: {"expression" => "123 * 456"}
- # D, -- RubyLLM: Tool calculator returned: "56088"
+ RubyLLM.configure do |config|
+ config.openai_api_key = ENV['OPENAI_API_KEY']
+ config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
+ config.gemini_api_key = ENV['GEMINI_API_KEY']
+ config.deepseek_api_key = ENV['DEEPSEEK_API_KEY'] # Optional
+ end
  ```

- ## Error Handling
-
- RubyLLM wraps provider errors in clear Ruby exceptions:
+ ## Have great conversations

  ```ruby
- begin
- chat = RubyLLM.chat
- chat.ask "Hello world!"
- rescue RubyLLM::UnauthorizedError
- puts "Check your API credentials"
- rescue RubyLLM::BadRequestError => e
- puts "Something went wrong: #{e.message}"
- rescue RubyLLM::PaymentRequiredError
- puts "Time to top up your API credits"
- rescue RubyLLM::ServiceUnavailableError
- puts "API service is temporarily down"
- end
- ```
+ # Start a chat with the default model (GPT-4o-mini)
+ chat = RubyLLM.chat

- ## Rails Integration
+ # Or specify what you want
+ chat = RubyLLM.chat(model: 'claude-3-7-sonnet-20250219')

- RubyLLM comes with built-in Rails support that makes it dead simple to persist your chats and messages. Just create your tables and hook it up:
+ # Simple questions just work
+ chat.ask "What's the difference between attr_reader and attr_accessor?"

- ```ruby
- # db/migrate/YYYYMMDDHHMMSS_create_chats.rb
- class CreateChats < ActiveRecord::Migration[8.0]
- def change
- create_table :chats do |t|
- t.string :model_id
- t.timestamps
- end
- end
- end
+ # Multi-turn conversations are seamless
+ chat.ask "Could you give me an example?"

- # db/migrate/YYYYMMDDHHMMSS_create_messages.rb
- class CreateMessages < ActiveRecord::Migration[8.0]
- def change
- create_table :messages do |t|
- t.references :chat, null: false
- t.string :role
- t.text :content
- t.string :model_id
- t.integer :input_tokens
- t.integer :output_tokens
- t.references :tool_call
- t.timestamps
- end
- end
+ # Stream responses in real-time
+ chat.ask "Tell me a story about a Ruby programmer" do |chunk|
+ print chunk.content
  end

- # db/migrate/YYYYMMDDHHMMSS_create_tool_calls.rb
- class CreateToolCalls < ActiveRecord::Migration[8.0]
- def change
- create_table :tool_calls do |t|
- t.references :message, null: false
- t.string :tool_call_id, null: false
- t.string :name, null: false
- t.jsonb :arguments, default: {}
- t.timestamps
- end
-
- add_index :tool_calls, :tool_call_id
- end
- end
+ # Understand content in multiple forms
+ chat.ask "Compare these diagrams", with: { image: ["diagram1.png", "diagram2.png"] }
+ chat.ask "Summarize this document", with: { pdf: "contract.pdf" }
+ chat.ask "What's being said?", with: { audio: "meeting.wav" }
+
+ # Need a different model mid-conversation? No problem
+ chat.with_model('gemini-2.0-flash').ask "What's your favorite algorithm?"
  ```

- Then in your models:
+ ## Rails integration that makes sense

  ```ruby
+ # app/models/chat.rb
  class Chat < ApplicationRecord
  acts_as_chat

- # Optional: Add Turbo Streams support
+ # Works great with Turbo
  broadcasts_to ->(chat) { "chat_#{chat.id}" }
  end

+ # app/models/message.rb
  class Message < ApplicationRecord
  acts_as_message
  end

+ # app/models/tool_call.rb
  class ToolCall < ApplicationRecord
  acts_as_tool_call
  end
- ```
-
- That's it! Now you can use chats straight from your models:
-
- ```ruby
- # Create a new chat
- chat = Chat.create! model_id: "gpt-4o-mini"
-
- # Ask questions - messages are automatically saved
- chat.ask "What's the weather in Paris?"
-
- # Stream responses in real-time
- chat.ask "Tell me a story" do |chunk|
- broadcast_chunk chunk
- end

- # Everything is persisted automatically
- chat.messages.each do |message|
- case message.role
- when :user
- puts "User: #{message.content}"
- when :assistant
- puts "Assistant: #{message.content}"
- end
+ # In your controller
+ chat = Chat.create!(model_id: "gpt-4o-mini")
+ chat.ask("What's your favorite Ruby gem?") do |chunk|
+ Turbo::StreamsChannel.broadcast_append_to(
+ chat,
+ target: "response",
+ partial: "messages/chunk",
+ locals: { chunk: chunk }
+ )
  end
- ```
-
- ### Real-time Updates with Hotwire
-
- The Rails integration works great with Hotwire out of the box:
-
- ```ruby
- # app/controllers/chats_controller.rb
- class ChatsController < ApplicationController
- def show
- @chat = Chat.find(params[:id])
- end
-
- def ask
- @chat = Chat.find(params[:id])
- @chat.ask(params[:message]) do |chunk|
- Turbo::StreamsChannel.broadcast_append_to(
- @chat,
- target: "messages",
- partial: "messages/chunk",
- locals: { chunk: chunk }
- )
- end
- end
- end
-
- # app/views/chats/show.html.erb
- <%= turbo_stream_from @chat %>

- <div id="messages">
- <%= render @chat.messages %>
- </div>
-
- <%= form_with(url: ask_chat_path(@chat), local: false) do |f| %>
- <%= f.text_area :message %>
- <%= f.submit "Send" %>
- <% end %>
+ # That's it - chat history is automatically saved
  ```

- ### Background Jobs
-
- The persistence works seamlessly with background jobs:
-
- ```ruby
- class ChatJob < ApplicationJob
- def perform(chat_id, message)
- chat = Chat.find chat_id
-
- chat.ask(message) do |chunk|
- # Optional: Broadcast chunks for real-time updates
- Turbo::StreamsChannel.broadcast_append_to(
- chat,
- target: "messages",
- partial: "messages/chunk",
- locals: { chunk: chunk }
- )
- end
- end
- end
- ```
-
- ### Using Tools
-
- Tools work just like they do in regular RubyLLM chats:
+ ## Creating tools is a breeze

  ```ruby
- class WeatherTool < RubyLLM::Tool
- description "Gets current weather for a location"
+ class Search < RubyLLM::Tool
+ description "Searches a knowledge base"

- param :location,
- type: :string,
- desc: "City name or coordinates"
+ param :query, desc: "The search query"
+ param :limit, type: :integer, desc: "Max results", required: false

- def execute(location:)
- # Fetch weather data...
- { temperature: 22, conditions: "Sunny" }
+ def execute(query:, limit: 5)
+ # Your search logic here
+ Document.search(query).limit(limit).map(&:title)
  end
  end

- # Use tools with your persisted chats
- chat = Chat.create! model_id: "deepseek-reasoner"
- chat.chat.with_tool WeatherTool.new
-
- # Ask about weather - tool usage is automatically saved
- chat.ask "What's the weather in Paris?"
-
- # Tool calls and results are persisted as messages
- pp chat.messages.map(&:role)
- #=> [:user, :assistant, :tool, :assistant]
+ # Let the AI use it
+ chat.with_tool(Search).ask "Find documents about Ruby 3.3 features"
  ```

- ## Provider Comparison
-
- | Feature | OpenAI | Anthropic | Google | DeepSeek |
- |---------|--------|-----------|--------|----------|
- | Chat | ✅ GPT-4o, GPT-3.5 | ✅ Claude 3.7, 3.5, 3 | ✅ Gemini 2.0, 1.5 | ✅ DeepSeek Chat, Reasoner |
- | Vision | ✅ GPT-4o, GPT-4 | ✅ All Claude 3 models | ✅ Gemini 2.0, 1.5 | ❌ |
- | Audio | ✅ GPT-4o-audio, Whisper | ❌ | ✅ Gemini models | ❌ |
- | PDF Analysis | ❌ | ✅ All Claude 3 models | ✅ Gemini models | ❌ |
- | Function Calling | ✅ Most models | ✅ Claude 3 models | ✅ Gemini models (except Lite) | ✅ |
- | JSON Mode | ✅ Most recent models | ✅ Claude 3 models | ✅ Gemini models | ❌ |
- | Image Generation | ✅ DALL-E 3 | ❌ | ✅ Imagen | ❌ |
- | Embeddings | ✅ text-embedding-3 | ❌ | ✅ text-embedding-004 | ❌ |
- | Context Size | ⭐ Up to 200K (o1) | ⭐ 200K tokens | ⭐ Up to 2M tokens | 64K tokens |
- | Streaming | ✅ | ✅ | ✅ | ✅ |
-
- ## Development
-
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `bin/console` for an interactive prompt.
-
- ## Contributing
+ ## Learn more

- Bug reports and pull requests are welcome on GitHub at https://github.com/crmne/ruby_llm.
+ Check out the guides at https://rubyllm.com for deeper dives into conversations with tools, streaming responses, embedding generations, and more.

  ## License

- Released under the MIT License. See [LICENSE](LICENSE) for details.
+ Released under the MIT License.
@@ -3,16 +3,39 @@
  module RubyLLM
  # Represents a generated image from an AI model.
  # Provides an interface to image generation capabilities
- # from providers like DALL-E.
+ # from providers like DALL-E and Gemini's Imagen.
  class Image
- attr_reader :url, :revised_prompt, :model_id
+ attr_reader :url, :data, :mime_type, :revised_prompt, :model_id

- def initialize(url:, revised_prompt: nil, model_id: nil)
+ def initialize(url: nil, data: nil, mime_type: nil, revised_prompt: nil, model_id: nil)
  @url = url
+ @data = data
+ @mime_type = mime_type
  @revised_prompt = revised_prompt
  @model_id = model_id
  end

+ def base64?
+ !@data.nil?
+ end
+
+ # Returns the raw binary image data regardless of source
+ def to_blob
+ if base64?
+ Base64.decode64(@data)
+ else
+ # Use Faraday instead of URI.open for better security
+ response = Faraday.get(@url)
+ response.body
+ end
+ end
+
+ # Saves the image to a file path
+ def save(path)
+ File.binwrite(File.expand_path(path), to_blob)
+ path
+ end
+

  def self.paint(prompt, model: nil, size: '1024x1024')
  model_id = model || RubyLLM.config.default_image_model
  Models.find(model_id) # Validate model exists
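
Taken together, these additions let callers treat URL-backed images (DALL-E) and base64-backed images (Imagen) uniformly. A minimal usage sketch based on the methods above; the prompt and output filename are illustrative:

```ruby
image = RubyLLM.paint "a sunset over mountains"

image.base64?            # true when the provider returned raw data, false for URLs
image.mime_type          # e.g. "image/png" when the provider reports it
image.save("sunset.png") # decodes or downloads as needed, then writes the file
blob = image.to_blob     # raw bytes, handy for ActiveStorage or IO pipelines
```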
@@ -158,6 +158,16 @@ module RubyLLM
  end
  end

+ def parse_data_uri(uri)
+ if uri&.start_with?('data:')
+ match = uri.match(/\Adata:([^;]+);base64,(.+)\z/)
+ return { mime_type: match[1], data: match[2] } if match
+ end
+
+ # If it's not a data URI, return nil
+ nil
+ end
+

  class << self
  def extended(base)
  base.extend(Methods)
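
The helper splits a `data:` URI into its mime type and base64 payload, and returns `nil` for anything else. Hypothetical calls illustrating both branches:

```ruby
parse_data_uri("data:image/png;base64,iVBORw0KGgo=")
# => { mime_type: "image/png", data: "iVBORw0KGgo=" }

parse_data_uri("https://example.com/image.png")
# => nil (not a data URI)
```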
@@ -34,10 +34,13 @@ module RubyLLM
  source = part[:source]

  if source.start_with?('http')
- # For URLs
+ # For URLs - add "type": "url" here
  {
  type: 'document',
- source: { url: source }
+ source: {
+ type: 'url', # This line is missing in the current implementation
+ url: source
+ }
  }
  else
  # For local files
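
The practical effect is that URL-based document sources now carry the `type` discriminator that Anthropic's Messages API expects. A sketch of the payload the new branch builds, with an illustrative URL:

```ruby
{
  type: 'document',
  source: {
    type: 'url',                          # discriminator for URL sources
    url: 'https://example.com/paper.pdf'  # illustrative URL
  }
}
```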
@@ -37,12 +37,13 @@ module RubyLLM
  raise Error, 'Unexpected response format from Gemini image generation API'
  end

- # Handle response with base64 encoded image data
- image_url = "data:#{image_data['mimeType'] || 'image/png'};base64,#{image_data['bytesBase64Encoded']}"
+ # Extract mime type and base64 data
+ mime_type = image_data['mimeType'] || 'image/png'
+ base64_data = image_data['bytesBase64Encoded']
+
  Image.new(
- url: image_url,
- revised_prompt: '', # Imagen doesn't return revised prompts
- model_id: ''
+ data: base64_data,
+ mime_type: mime_type
  )
  end
  end
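
With this change an Imagen result carries its raw base64 payload and mime type instead of a synthetic `data:` URL, so the new `Image` helpers apply directly. A hedged sketch; the model id is illustrative:

```ruby
image = RubyLLM.paint "a watercolor city", model: "imagen-3.0-generate-001" # illustrative model id
image.base64?          # => true for Gemini's Imagen
image.save("city.png") # decodes the base64 payload and writes it out
```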
@@ -26,6 +26,7 @@ module RubyLLM

  Image.new(
  url: image_data['url'],
+ mime_type: 'image/png', # DALL-E typically returns PNGs
  revised_prompt: image_data['revised_prompt'],
  model_id: data['model']
  )
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module RubyLLM
- VERSION = '0.1.0.pre42'
+ VERSION = '0.1.0.pre44'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: ruby_llm
  version: !ruby/object:Gem::Version
- version: 0.1.0.pre42
+ version: 0.1.0.pre44
  platform: ruby
  authors:
  - Carmine Paolino
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2025-02-26 00:00:00.000000000 Z
+ date: 2025-02-28 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: event_stream_parser