RubyGems - ruby-gemini-api - Versions diffs - 0.1.2 → 0.1.4 - Mend

ruby-gemini-api 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 36071e1dcc5fb406f9f49318cc2cffdb0e4159a0811c74da0aebd48293d9ef02
-  data.tar.gz: cdccba0ec67eb7fa2cd69ffcb8ddcd79c7056d6a636dcb3ccd0d05af14f5acbd
+  metadata.gz: c80fbf2cb7142ab3ff7d6a17d8b5f1e43960ec8ed398ee7f4c35286ed72ce962
+  data.tar.gz: 261f1a1e04757b93aac9c8a42263e355758bba4ec512f6c9dd408b63146e17fe
 SHA512:
-  metadata.gz: e50e268d2c015f45bf6db61d07e5b124aa07b967a278999aa00c29f760e1c41d1465af8582055764fc64454156f15130f498318cdc264cd78d342dc28cec4030
-  data.tar.gz: 21a68ce8106b7b2d0e5e9468d0464f91a18b62cbfad7c6c6924280d58bf525f9b4ac7e87ffe0e06076bd03b9f2b77fe0b6e35b9ba93bdfaedfbcc65afc6abc40
+  metadata.gz: 780a9684677d9bfdd8945727c9cd00e1e0ecc9e43a63a038da17a18aa8e524e43f601946bd1fbc00c9406e4a45b80d59fedfdfc85e190b973cedd9c2340a5be4
+  data.tar.gz: 399dff8bc6f6693b6267412b2fee067269125ea9b16ffd105d94b4ac9154ca1ade4246b6d3ca9aa9e7cb689e3f2d4a12ea771d4fc688de24e36688205675ab0f

data/CHANGELOG.md CHANGED

@@ -7,4 +7,10 @@
 - Changed generate_contents to accept temperature parameter
 ## [0.1.2] - 2025-07-10
-- Add function calling
+- Add function calling
+## [0.1.3] - 2025-10-09
+- Add support for multi-image input
+## [0.1.4] - 2025-11-08
+- Add support for grounding search

data/README.md CHANGED

@@ -1,4 +1,4 @@
-[README ‐ 日本語](https://github.com/rira100000000/ruby-gemini-api/wiki/README-%E2%80%90-%E6%97%A5%E6%9C%AC%E8%AA%9E)
+[README ‐ 日本語](https://github.com/rira100000000/ruby-gemini-api/blob/main/README_ja.md)
 # Ruby-Gemini-API
 A Ruby client library for Google's Gemini API. This gem provides a simple, intuitive interface for interacting with Gemini's generative AI capabilities, following patterns similar to other AI client libraries.
@@ -18,7 +18,11 @@ This project is inspired by and pays homage to [ruby-openai](https://github.com/
 - Document processing (PDFs and other formats)
 - Context caching for efficient processing
-### How to use Function Calling
+### Function Calling
+This library provides an intuitive DSL to define tools for function calling, making it easy to describe your functions to the Gemini model.
+#### Basic Usage
 ```ruby
 require 'gemini'
@@ -26,45 +30,67 @@ require 'gemini'
 # Initialize Gemini client
 client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
-# Define function declarations for Function Calling
-# Note: Use camelCase (functionDeclarations) for Gemini API compatibility
-tools = [
-  {
-    functionDeclarations: [
-      {
-        name: "get_current_weather",
-        description: "Get the current weather information",
-        parameters: {
-          type: "object",
-          properties: {
-            location: {
-              type: "string",
-              description: "City name, e.g., Tokyo"
-            }
-          },
-          required: ["location"]
-        }
-      }
-    ]
-  }
-]
+# Define tools using the ToolDefinition DSL
+tools = Gemini::ToolDefinition.new do
+  function :get_current_weather, description: "Get the current weather information" do
+    property :location, type: :string, description: "City name, e.g., Tokyo", required: true
+  end
+end
 # User prompt
 user_prompt = "Tell me the current weather in Tokyo."
-# Send request with Function Calling tools
+# Send request with the defined tools
 response = client.generate_content(
   user_prompt,
-  model: "gemini-2.0-flash",
+  model: "gemini-1.5-flash", # Or any model that supports function calling
   tools: tools
 )
 # Parse function call from the response
 unless response.function_calls.empty?
   function_call = response.function_calls.first
-  puts "Function name to call: #{function_call["name"]}"
-  puts "Function arguments: #{function_call["args"]}"
+  puts "Function to call: #{function_call['name']}"
+  puts "Arguments: #{function_call['args']}"
+end
+```
+#### Advanced Tool Management
+You can define multiple functions, add them dynamically, combine tool sets, and manage them easily.
+```ruby
+# Define a set of weather tools
+weather_tools = Gemini::ToolDefinition.new do
+  function :get_current_weather, description: "Get the current weather" do
+    property :location, type: :string, description: "City name", required: true
+  end
+end
+# Define another set of stock-related tools
+stock_tools = Gemini::ToolDefinition.new do
+  function :get_stock_price, description: "Get the stock price for a symbol" do
+    property :ticker, type: :string, description: "Stock ticker symbol", required: true
+  end
 end
+# Combine tool sets using the + operator
+all_tools = weather_tools + stock_tools
+puts "Combined functions: #{all_tools.list_functions}"
+# => Combined functions: [:get_current_weather, :get_stock_price]
+# Add a new function later
+all_tools.add_function :send_email, description: "Send an email" do
+  property :to, type: :string, required: true
+  property :body, type: :string, required: true
+end
+puts "After adding a function: #{all_tools.list_functions}"
+# => After adding a function: [:get_current_weather, :get_stock_price, :send_email]
+# Delete a function
+all_tools.delete_function(:get_stock_price)
+puts "After deleting a function: #{all_tools.list_functions}"
+# => After deleting a function: [:get_current_weather, :send_email]
 ```
 ## Installation
@@ -246,6 +272,79 @@ client.files.delete(name: file_name)
 For more examples, check out the `demo/vision_demo.rb` and `demo/file_vision_demo.rb` files included with the gem.
+### Grounding with Google Search
+You can use Gemini API's Google Search grounding feature to retrieve real-time information.
+#### Basic Usage
+```ruby
+require 'gemini'
+client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
+# Use Google Search to get real-time information
+response = client.generate_content(
+  "Who won the euro 2024?",
+  model: "gemini-2.0-flash-lite",
+  tools: [{ google_search: {} }]
+)
+if response.success?
+  puts response.text
+  # Check grounding information
+  if response.grounded?
+    puts "\nSource references:"
+    response.grounding_chunks.each do |chunk|
+      if chunk['web']
+        puts "- #{chunk['web']['title']}"
+        puts "  #{chunk['web']['uri']}"
+      end
+    end
+  end
+end
+```
+#### Checking Grounding Metadata
+```ruby
+# Check if response is grounded
+if response.grounded?
+  # Get full grounding metadata
+  metadata = response.grounding_metadata
+  # Get source chunks (references)
+  chunks = response.grounding_chunks
+  # Get search entry point
+  entry_point = response.search_entry_point
+end
+```
+#### Example with Different Topics
+```ruby
+response = client.generate_content(
+  "What are the latest AI developments in 2024?",
+  model: "gemini-2.0-flash-lite",
+  tools: [{ google_search: {} }]
+)
+if response.success? && response.grounded?
+  puts response.text
+  puts "\nSources: #{response.grounding_chunks.length} references"
+end
+```
+#### Demo Application
+You can find a grounding search demo in:
+```bash
+ruby demo/grounding_search_demo_ja.rb
+```
 ### Image Generation
 ```ruby
@@ -253,11 +352,11 @@ require 'gemini'
 client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
-# Generate an image using Gemini 2.0
+# Generate an image using Gemini 2.5
 response = client.images.generate(
   parameters: {
     prompt: "A beautiful sunset over the ocean with sailing boats",
-    model: "gemini-2.0-flash-exp-image-generation",
+    model: "gemini-2.5-flash-image-preview",
     size: "16:9"
   }
 )
@@ -272,6 +371,72 @@ else
 end
 ```
+#### Image Generation with Multiple Input Images
+You can generate new images by combining or editing multiple input images:
+```ruby
+require 'gemini'
+client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
+# Generate a new image using multiple input images
+response = client.images.generate(
+  parameters: {
+    prompt: "Combine these two images to create a single artistic composition",
+    image_paths: ["path/to/image1.jpg", "path/to/image2.png"],
+    model: "gemini-2.5-flash-image-preview",
+    temperature: 0.7
+  }
+)
+# Save the generated image
+if response.success? && response.images.any?
+  response.save_image("combined_image.png")
+  puts "Combined image saved"
+end
+```
+You can also use file objects:
+```ruby
+# Using file objects
+File.open("image1.jpg", "rb") do |file1|
+  File.open("image2.png", "rb") do |file2|
+    response = client.images.generate(
+      parameters: {
+        prompt: "Combine these images together",
+        images: [file1, file2],
+        model: "gemini-2.5-flash-image-preview"
+      }
+    )
+    if response.success? && response.images.any?
+      response.save_image("result.png")
+    end
+  end
+end
+```
+Base64-encoded image data is also supported:
+```ruby
+require 'base64'
+# Base64-encoded image data
+base64_data1 = Base64.strict_encode64(File.binread("image1.jpg"))
+base64_data2 = Base64.strict_encode64(File.binread("image2.png"))
+response = client.images.generate(
+  parameters: {
+    prompt: "Merge these images together",
+    image_base64s: [base64_data1, base64_data2],
+    mime_types: ["image/jpeg", "image/png"],
+    model: "gemini-2.5-flash-image-preview"
+  }
+)
+```
 You can also use Imagen 3 model (Note: This feature is not fully tested yet):
 ```ruby
@@ -293,7 +458,7 @@ if response.success? && !response.images.empty?
 end
 ```
-For a complete example, check out the `demo/image_generation_demo.rb` file included with the gem.
+For complete examples, check out the `demo/image_generation_demo.rb` and `demo/multi_image_generation_demo.rb` files included with the gem.
 ### Audio Transcription
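
Note on the README change above: the removed lines built the `functionDeclarations` hash by hand, while the added lines describe the same function through a block DSL. The sketch below shows one way such a DSL could produce that hash. `MiniToolDefinition` is a hypothetical class written for illustration. It is not the gem's `Gemini::ToolDefinition`, whose internals do not appear in this diff.

```ruby
# Hypothetical illustration only; NOT the gem's Gemini::ToolDefinition.
# It assumes the DSL ends up producing the functionDeclarations hash
# that the 0.1.2 README wrote out by hand.
class MiniToolDefinition
  def initialize(&block)
    @functions = {}
    instance_eval(&block) if block
  end

  # Declare a function; the nested block declares its properties.
  def function(name, description:, &block)
    @current = {
      name: name.to_s,
      description: description,
      parameters: { type: "object", properties: {}, required: [] }
    }
    instance_eval(&block) if block
    @functions[name] = @current
    @current = nil
  end

  # Declare one parameter of the function currently being defined.
  def property(name, type:, description: nil, required: false)
    schema = { type: type.to_s }
    schema[:description] = description if description
    @current[:parameters][:properties][name] = schema
    @current[:parameters][:required] << name.to_s if required
  end

  def list_functions
    @functions.keys
  end

  # Same overall shape as the hand-written hash in the removed README lines.
  def to_h
    { functionDeclarations: @functions.values }
  end
end

tools = MiniToolDefinition.new do
  function :get_current_weather, description: "Get the current weather information" do
    property :location, type: :string, description: "City name, e.g., Tokyo", required: true
  end
end

p tools.to_h
```

Using `instance_eval` keeps the block body free of an explicit receiver, which is what lets `function` and `property` read like declarations.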

data/lib/gemini/images.rb CHANGED

@@ -4,43 +4,279 @@ module Gemini
       @client = client
     end
-    # 画像を生成するメインメソッド
+    # Main method to generate images
     def generate(parameters: {})
       prompt = parameters[:prompt]
       raise ArgumentError, "prompt parameter is required" unless prompt
-      # モデルの決定（デフォルトはGemini 2.0）
-      model = parameters[:model] || "gemini-2.0-flash-exp-image-generation"
+      model = parameters[:model] || "gemini-2.5-flash-image-preview"
-      # モデルに応じた画像生成処理
+      # Image editing mode if input images are provided (supports single/multiple images)
+      if has_input_images?(parameters)
+        return generate_with_images(prompt, model, parameters)
+      end
+      # Image generation process based on model
       if model.start_with?("imagen")
-        # Imagen 3を使用
+        # Use Imagen 3
         response = imagen_generate(prompt, parameters)
       else
-        # Gemini 2.0を使用
+        # Use Gemini 2.0
         response = gemini_generate(prompt, parameters)
       end
-      # レスポンスをラップして返す
+      # Wrap and return response
       Gemini::Response.new(response)
     end
     private
+    # Check if input images exist (supports single/multiple images)
+    def has_input_images?(parameters)
+      # Single image parameters
+      single_image = parameters[:image] || parameters[:image_path] || parameters[:image_base64]
+      # Multiple image parameters
+      multiple_images = parameters[:images] || parameters[:image_paths] || parameters[:image_base64s]
+      single_image || multiple_images
+    end
+    # Image generation with image+text (supports single/multiple images)
+    def generate_with_images(prompt, model, parameters)
+      # Process image data (supports single/multiple images)
+      image_parts = process_input_images(parameters)
+      # Build content parts (place text first, then images)
+      parts = [{ "text" => prompt }] + image_parts
+      # Build generation config
+      generation_config = {
+        "responseModalities" => ["Image"]
+      }
+      # Add temperature setting if provided
+      if parameters[:temperature]
+        generation_config["temperature"] = parameters[:temperature]
+      end
+      # Build request parameters
+      request_params = {
+        "contents" => [{
+          "parts" => parts
+        }],
+        "generationConfig" => generation_config
+      }
+      # Merge other parameters (specify keys to exclude)
+      excluded_keys = [:prompt, :image, :image_path, :image_base64, :images, :image_paths, :image_base64s, :model, :temperature]
+      parameters.each do |key, value|
+        next if excluded_keys.include?(key)
+        request_params[key.to_s] = value
+      end
+      # API call
+      response = @client.json_post(
+        path: "models/#{model}:generateContent",
+        parameters: request_params
+      )
+      Gemini::Response.new(response)
+    end
-    # Gemini 2.0モデルを使用した画像生成
+    # Image generation with image+text (kept for backward compatibility)
+    def generate_with_image(prompt, model, parameters)
+      generate_with_images(prompt, model, parameters)
+    end
+    # Process input images (supports single/multiple images)
+    def process_input_images(parameters)
+      image_parts = []
+      # Process multiple images
+      if parameters[:images] || parameters[:image_paths] || parameters[:image_base64s]
+        # Multiple file objects
+        if parameters[:images]
+          parameters[:images].each_with_index do |image, index|
+            if image.respond_to?(:read)
+              image_data = process_image_io(image)
+              image_parts << create_image_part(image_data)
+            else
+              raise ArgumentError, "Invalid image at index #{index}. Expected file object."
+            end
+          end
+        end
+        # Multiple file paths
+        if parameters[:image_paths]
+          parameters[:image_paths].each_with_index do |path, index|
+            image_data = process_image_file(path)
+            image_parts << create_image_part(image_data)
+          end
+        end
+        # Multiple Base64 data
+        if parameters[:image_base64s]
+          mime_types = parameters[:mime_types] || Array.new(parameters[:image_base64s].size, "image/jpeg")
+          parameters[:image_base64s].each_with_index do |base64_data, index|
+            image_data = {
+              data: base64_data,
+              mime_type: mime_types[index] || "image/jpeg"
+            }
+            image_parts << create_image_part(image_data)
+          end
+        end
+      else
+        # Process single image (for backward compatibility)
+        image_data = process_single_input_image(parameters)
+        image_parts << create_image_part(image_data)
+      end
+      image_parts
+    end
+    # Process single input image (for backward compatibility)
+    def process_single_input_image(parameters)
+      if parameters[:image_base64]
+        # When Base64 data is provided directly
+        {
+          data: parameters[:image_base64],
+          mime_type: parameters[:mime_type] || "image/jpeg"
+        }
+      elsif parameters[:image_path]
+        # When file path is provided
+        process_image_file(parameters[:image_path])
+      elsif parameters[:image]
+        # When file object is provided
+        if parameters[:image].respond_to?(:read)
+          process_image_io(parameters[:image])
+        else
+          raise ArgumentError, "Invalid image parameter. Expected file path, file object, or base64 data."
+        end
+      else
+        raise ArgumentError, "No image data provided"
+      end
+    end
+    # Create API part from image data
+    def create_image_part(image_data)
+      {
+        "inline_data" => {
+          "mime_type" => image_data[:mime_type],
+          "data" => image_data[:data]
+        }
+      }
+    end
+    # Process input image (old method - kept for backward compatibility)
+    def process_input_image(parameters)
+      process_single_input_image(parameters)
+    end
+    # Process image from file path (newly added)
+    def process_image_file(file_path)
+      raise ArgumentError, "File does not exist: #{file_path}" unless File.exist?(file_path)
+      require 'base64'
+      # Determine MIME type
+      mime_type = determine_image_mime_type(file_path)
+      # Read file and encode as Base64
+      file_data = File.binread(file_path)
+      base64_data = Base64.strict_encode64(file_data)
+      {
+        data: base64_data,
+        mime_type: mime_type
+      }
+    end
+    # Process image from IO object (newly added)
+    def process_image_io(image_io)
+      require 'base64'
+      # Move to beginning of file
+      image_io.rewind if image_io.respond_to?(:rewind)
+      # Read data
+      file_data = image_io.read
+      # Determine MIME type (use file path if available, otherwise infer from content)
+      mime_type = if image_io.respond_to?(:path) && image_io.path
+                    determine_image_mime_type(image_io.path)
+                  else
+                    determine_mime_type_from_content(file_data)
+                  end
+      # Base64 encode
+      base64_data = Base64.strict_encode64(file_data)
+      {
+        data: base64_data,
+        mime_type: mime_type
+      }
+    end
+    # Determine image MIME type from file path (newly added)
+    def determine_image_mime_type(file_path)
+      ext = File.extname(file_path).downcase
+      case ext
+      when ".jpg", ".jpeg"
+        "image/jpeg"
+      when ".png"
+        "image/png"
+      when ".gif"
+        "image/gif"
+      when ".webp"
+        "image/webp"
+      when ".bmp"
+        "image/bmp"
+      when ".tiff", ".tif"
+        "image/tiff"
+      else
+        # Default to JPEG
+        "image/jpeg"
+      end
+    end
+    # Determine MIME type from file content (newly added)
+    def determine_mime_type_from_content(data)
+      return "image/jpeg" if data.nil? || data.empty?
+      # Check file header
+      header = data[0, 8].bytes
+      case
+      when header[0..1] == [0xFF, 0xD8]
+        "image/jpeg"
+      when header[0..7] == [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]
+        "image/png"
+      when header[0..2] == [0x47, 0x49, 0x46]
+        "image/gif"
+      when header[0..3] == [0x52, 0x49, 0x46, 0x46] && data[8..11].bytes == [0x57, 0x45, 0x42, 0x50]
+        "image/webp"
+      when header[0..1] == [0x42, 0x4D]
+        "image/bmp"
+      else
+        # Default to JPEG
+        "image/jpeg"
+      end
+    end
+    # Image generation using Gemini 2.5 model (original code unchanged)
     def gemini_generate(prompt, parameters)
-      # パラメータの準備
-      model = parameters[:model] || "gemini-2.0-flash-exp-image-generation"
+      # Prepare parameters
+      model = parameters[:model] || "gemini-2.5-flash-image-preview"
-      # サイズパラメータの処理（現在はGemini APIでは使用しない）
+      # Process size parameter (currently not used in Gemini API)
       # aspect_ratio = process_size_parameter(parameters[:size])
-      # 生成設定の構築
+      # Build generation config
       generation_config = {
-        "responseModalities" => ["Text", "Image"]
+        "responseModalities" => ["Image"]  # Image output only, even for text-only image generation
       }
-      # リクエストパラメータの構築
+      # Build request parameters
       request_params = {
         "contents" => [{
           "parts" => [
@@ -50,29 +286,29 @@ module Gemini
         "generationConfig" => generation_config
       }
-      # API呼び出し
+      # API call
       @client.json_post(
         path: "models/#{model}:generateContent",
         parameters: request_params
       )
     end
-    # Imagen 3モデルを使用した画像生成
+    # Image generation using Imagen 3 model (original code unchanged)
     def imagen_generate(prompt, parameters)
-      # モデル名の取得（デフォルトはImagen 3の標準モデル）
+      # Get model name (default is Imagen 3 standard model)
       model = parameters[:model] || "imagen-3.0-generate-002"
-      # サイズパラメータからアスペクト比を取得
+      # Get aspect ratio from size parameter
       aspect_ratio = process_size_parameter(parameters[:size])
-      # 画像生成数の設定
+      # Set number of images to generate
       sample_count = parameters[:n] || parameters[:sample_count] || 1
-      sample_count = [[sample_count.to_i, 1].max, 4].min # 1〜4の範囲に制限
+      sample_count = [[sample_count.to_i, 1].max, 4].min # Limit to range 1-4
-      # 人物生成の設定
+      # Set person generation setting
       person_generation = parameters[:person_generation] || "ALLOW_ADULT"
-      # リクエストパラメータの構築
+      # Build request parameters
       request_params = {
         "instances" => [
           {
@@ -84,20 +320,20 @@ module Gemini
         }
       }
-      # アスペクト比が指定されている場合は追加
+      # Add aspect ratio if specified
       request_params["parameters"]["aspectRatio"] = aspect_ratio if aspect_ratio
-      # 人物生成設定を追加
+      # Add person generation setting
       request_params["parameters"]["personGeneration"] = person_generation
-      # API呼び出し
+      # API call
       @client.json_post(
         path: "models/#{model}:predict",
         parameters: request_params
       )
     end
-    # サイズパラメータからアスペクト比を決定
+    # Determine aspect ratio from size parameter (original code unchanged)
     def process_size_parameter(size)
       return nil unless size
@@ -115,7 +351,7 @@ module Gemini
       when "1:1", "3:4", "4:3", "9:16", "16:9"
         size.to_s
       else
-        "1:1" # デフォルト
+        "1:1" # Default
       end
     end
   end
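
Note on the images.rb change above: the new code path sniffs a MIME type from the leading bytes, wraps each image as an `inline_data` part, and places the text part first. The sketch below reproduces that flow on in-memory bytes without any API call. The helper names (`sniff_image_mime`, `inline_image_part`, `build_image_request`) are hypothetical and not part of the gem. Unlike the diffed method, the sketch compares the WebP marker through array slicing, so an input shorter than 12 bytes falls through to the default rather than depending on `data[8..11]` being non-nil.

```ruby
require "base64"

# Hypothetical helpers for illustration; they are not part of the gem.

# Guess an image MIME type from magic bytes, defaulting to JPEG.
def sniff_image_mime(data, fallback: "image/jpeg")
  return fallback if data.nil? || data.empty?

  bytes = data.byteslice(0, 12).bytes
  if bytes[0, 2] == [0xFF, 0xD8]
    "image/jpeg"
  elsif bytes[0, 8] == [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]
    "image/png"
  elsif bytes[0, 3] == [0x47, 0x49, 0x46]
    "image/gif"
  elsif bytes[0, 4] == [0x52, 0x49, 0x46, 0x46] && bytes[8, 4] == [0x57, 0x45, 0x42, 0x50]
    "image/webp"
  elsif bytes[0, 2] == [0x42, 0x4D]
    "image/bmp"
  else
    fallback
  end
end

# Wrap raw image bytes as an inline_data part, as create_image_part does.
def inline_image_part(bytes)
  {
    "inline_data" => {
      "mime_type" => sniff_image_mime(bytes),
      "data" => Base64.strict_encode64(bytes)
    }
  }
end

# Assemble the request body: text part first, then one part per image.
def build_image_request(prompt, images, temperature: nil)
  config = { "responseModalities" => ["Image"] }
  config["temperature"] = temperature if temperature
  {
    "contents" => [{ "parts" => [{ "text" => prompt }] + images.map { |b| inline_image_part(b) } }],
    "generationConfig" => config
  }
end

png_header = "\x89PNG\r\n\x1A\n".b
gif_header = "GIF89a".b
request = build_image_request("Combine these images", [png_header, gif_header], temperature: 0.7)
puts request["contents"][0]["parts"].length
```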

data/lib/gemini/response.rb CHANGED

@@ -99,6 +99,27 @@ module Gemini
     def safety_blocked?
       finish_reason == "SAFETY"
     end
+    # Get grounding metadata (for Google Search grounding)
+    def grounding_metadata
+      first_candidate&.dig("groundingMetadata")
+    end
+    # Check if response has grounding metadata
+    def grounded?
+      !grounding_metadata.nil? && !grounding_metadata.empty?
+    end
+    # Get grounding chunks (source references)
+    def grounding_chunks
+      grounding_metadata&.dig("groundingChunks") || []
+    end
+    # Get search entry point URL (if available)
+    def search_entry_point
+      grounding_metadata&.dig("searchEntryPoint", "renderedContent")
+    end
     # Get token usage information
     def usage
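
Note on the response.rb change above: the four new accessors are thin `dig` wrappers over the first candidate's `groundingMetadata`. The sketch below mirrors them on a plain Hash so the nil handling can be checked in isolation. `GroundedResponse` and the sample payload are hypothetical stand-ins, not the gem's `Gemini::Response` or real API output. As far as the Gemini API documentation describes it, `renderedContent` is an embeddable HTML snippet rather than a URL, so treat the "URL" wording in the diffed comment with some caution.

```ruby
# Hypothetical stand-in for Gemini::Response, for illustration only.
GroundedResponse = Struct.new(:raw) do
  def first_candidate
    raw&.dig("candidates", 0)
  end

  def grounding_metadata
    first_candidate&.dig("groundingMetadata")
  end

  # True only when metadata is present and non-empty.
  def grounded?
    !grounding_metadata.nil? && !grounding_metadata.empty?
  end

  # Always returns an Array, so callers can iterate without nil checks.
  def grounding_chunks
    grounding_metadata&.dig("groundingChunks") || []
  end

  def search_entry_point
    grounding_metadata&.dig("searchEntryPoint", "renderedContent")
  end
end

# Made-up payload following the key names used in the diff.
grounded = GroundedResponse.new({
  "candidates" => [{
    "groundingMetadata" => {
      "groundingChunks" => [{ "web" => { "title" => "Example", "uri" => "https://example.com" } }],
      "searchEntryPoint" => { "renderedContent" => "<div>search suggestions</div>" }
    }
  }]
})
ungrounded = GroundedResponse.new({ "candidates" => [{}] })

grounded.grounding_chunks.each do |chunk|
  puts "#{chunk['web']['title']}: #{chunk['web']['uri']}" if chunk["web"]
end
puts ungrounded.grounded?
```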

data/lib/gemini/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Gemini
-  VERSION = "0.1.2"
+  VERSION = "0.1.4"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-gemini-api
 version: !ruby/object:Gem::Version
-  version: 0.1.2
+  version: 0.1.4
 platform: ruby
 authors:
 - rira100000000
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-07-10 00:00:00.000000000 Z
+date: 2025-11-07 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: faraday