dspy 0.19.1 → 0.20.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d76e6cfecee3a85c7d8cd389b39eb0401e2051644e0f3ada98ab03645414f35e
- data.tar.gz: fe078c869c9ee45237916dd10bd12a5d9bb80ea629c1bb6b0a29ae6ff5249b73
+ metadata.gz: 3324f833b373e826df1dcdf3f2c3f46011e192e9a30fdf38d7db1b9cc51a7950
+ data.tar.gz: f6159081bc6429d0c57e6275111d2672bf984f8500be7756afbf8bb17599dd19
  SHA512:
- metadata.gz: 56565be338d06d517daa931d7ebfb32f259b92d9b8a55140783bcf34f635ace57213909d74003ad33b6e8761d5c1fae950f1f6332ee55131d8a7ffd5c4a58183
- data.tar.gz: 3e36f838fbba8e06428397611f1f2e2476f789c7f4714a98ce2198f38bf5632a0f4e12cf5be56282c48d31071c9b784959f0095f27bc8f42c44472ad43111d41
+ metadata.gz: cd81596af2ea0d734550b0827e18b71d4abdd1e3a4c92efb014af665d17bb2a37fb36b747757eb3ad4377d89a435b81bbaa4477ce0fd22b8ed88319ebc8b1f5b
+ data.tar.gz: d65e68af35f7ced5e97524dd85f1c590900f33b720038f25bbc22995031d10cc5115bdb1b160b153e5fbaa5bf2cb3809dde0520f6bd99a7ac348c0231bf88afa
data/README.md CHANGED
@@ -2,12 +2,17 @@
 
  [![Gem Version](https://img.shields.io/gem/v/dspy)](https://rubygems.org/gems/dspy)
  [![Total Downloads](https://img.shields.io/gem/dt/dspy)](https://rubygems.org/gems/dspy)
+ [![Build Status](https://img.shields.io/github/actions/workflow/status/vicentereig/dspy.rb/ruby.yml?branch=main&label=build)](https://github.com/vicentereig/dspy.rb/actions/workflows/ruby.yml)
+ [![Documentation](https://img.shields.io/badge/docs-vicentereig.github.io%2Fdspy.rb-blue)](https://vicentereig.github.io/dspy.rb/)
 
  **Build reliable LLM applications in Ruby using composable, type-safe modules.**
 
- DSPy.rb brings structured LLM programming to Ruby developers. Instead of wrestling with prompt strings and parsing responses, you define typed signatures and compose them into pipelines that just work.
+ DSPy.rb brings structured LLM programming to Ruby developers. Instead of wrestling with prompt strings and parsing
+ responses, you define typed signatures and compose them into pipelines that just work.
 
- Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular signatures and let the framework handle the messy details.
+ Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you
+ the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular
+ signatures and let the framework handle the messy details.
 
  The result? LLM applications that actually scale and don't break when you sneeze.
 
@@ -54,18 +59,18 @@ puts result.confidence # => 0.85
  ## What You Get
 
  **Core Building Blocks:**
- - **Signatures** - Define input/output schemas using Sorbet types
- - **Predict** - Basic LLM completion with structured data
- - **Chain of Thought** - Step-by-step reasoning for complex problems
- - **ReAct** - Tool-using agents with basic tool integration
+ - **Signatures** - Define input/output schemas using Sorbet types with T::Enum and union type support
+ - **Predict** - LLM completion with structured data extraction and multimodal support
+ - **Chain of Thought** - Step-by-step reasoning for complex problems with automatic prompt optimization
+ - **ReAct** - Tool-using agents with type-safe tool definitions and error recovery
  - **CodeAct** - Dynamic code execution agents for programming tasks
- - **Manual Composition** - Combine multiple LLM calls into workflows
+ - **Module Composition** - Combine multiple LLM calls into production-ready workflows
 
  **Optimization & Evaluation:**
  - **Prompt Objects** - Manipulate prompts as first-class objects instead of strings
  - **Typed Examples** - Type-safe training data with automatic validation
- - **Evaluation Framework** - Basic testing with simple metrics
- - **Basic Optimization** - Simple prompt optimization techniques
+ - **Evaluation Framework** - Advanced metrics beyond simple accuracy with error-resilient pipelines
+ - **MIPROv2 Optimization** - Automatic prompt optimization with storage and persistence
 
  **Production Features:**
  - **Reliable JSON Extraction** - Native OpenAI structured outputs, Anthropic extraction patterns, and automatic strategy selection with fallback
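For context on the building blocks listed above, here is a minimal sketch of the signature-and-predict pattern the README describes (the `result.confidence` fragment in the hunk headers comes from an example of this shape). It follows DSPy.rb's documented API; the class, field, and model names below are illustrative, not taken from this diff.

```ruby
require 'dspy'

# Illustrative signature: typed inputs and outputs instead of a prompt string.
class ClassifySentiment < DSPy::Signature
  description "Classify the sentiment of a sentence."

  input do
    const :sentence, String
  end

  output do
    const :sentiment, String   # e.g. "positive", "negative", "neutral"
    const :confidence, Float
  end
end

# Assumed provider string and environment variable; any configured LM works the same way.
DSPy.configure do |config|
  config.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
end

classify = DSPy::Predict.new(ClassifySentiment)
result = classify.call(sentence: 'This library makes LLM code feel like Ruby again.')
puts result.sentiment
puts result.confidence # => e.g. 0.85
```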
@@ -78,28 +83,64 @@ puts result.confidence # => 0.85
 
  **Developer Experience:**
  - LLM provider support using official Ruby clients:
- - [OpenAI Ruby](https://github.com/openai/openai-ruby)
- - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby)
- - [Ollama](https://ollama.com/) via OpenAI compatibility layer
- - Runtime type checking with [Sorbet](https://sorbet.org/)
+ - [OpenAI Ruby](https://github.com/openai/openai-ruby) with vision model support
+ - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby) with multimodal capabilities
+ - [Ollama](https://ollama.com/) via OpenAI compatibility layer for local models
+ - **Multimodal Support** - Complete image analysis with DSPy::Image, type-safe bounding boxes, vision-capable models
+ - Runtime type checking with [Sorbet](https://sorbet.org/) including T::Enum and union types
  - Type-safe tool definitions for ReAct agents
  - Comprehensive instrumentation and observability
 
  ## Development Status
 
- DSPy.rb is actively developed and approaching stability at **v0.15.4**. The core framework is production-ready with comprehensive documentation, but I'm battle-testing features through the 0.x series before committing to a stable v1.0 API.
+ DSPy.rb is actively developed and approaching stability. The core framework is production-ready with
+ comprehensive documentation, but I'm battle-testing features through the 0.x series before committing
+ to a stable v1.0 API.
 
  Real-world usage feedback is invaluable - if you encounter issues or have suggestions, please open a GitHub issue!
 
+ ## Documentation
+
+ 📖 **[Complete Documentation Website](https://vicentereig.github.io/dspy.rb/)**
+
+ ### LLM-Friendly Documentation
+
+ For LLMs and AI assistants working with DSPy.rb:
+ - **[llms.txt](https://vicentereig.github.io/dspy.rb/llms.txt)** - Concise reference optimized for LLMs
+ - **[llms-full.txt](https://vicentereig.github.io/dspy.rb/llms-full.txt)** - Comprehensive API documentation
+
+ ### Getting Started
+ - **[Installation & Setup](docs/src/getting-started/installation.md)** - Detailed installation and configuration
+ - **[Quick Start Guide](docs/src/getting-started/quick-start.md)** - Your first DSPy programs
+ - **[Core Concepts](docs/src/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
+
+ ### Core Features
+ - **[Signatures & Types](docs/src/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
+ - **[Predictors](docs/src/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
+ - **[Modules & Pipelines](docs/src/core-concepts/modules.md)** - Compose complex multi-stage workflows
+ - **[Multimodal Support](docs/src/core-concepts/multimodal.md)** - Image analysis with vision-capable models
+ - **[Examples & Validation](docs/src/core-concepts/examples.md)** - Type-safe training data
+
+ ### Optimization
+ - **[Evaluation Framework](docs/src/optimization/evaluation.md)** - Advanced metrics beyond simple accuracy
+ - **[Prompt Optimization](docs/src/optimization/prompt-optimization.md)** - Manipulate prompts as objects
+ - **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Automatic optimization algorithms
+
+ ### Production Features
+ - **[Storage System](docs/src/production/storage.md)** - Persistence and optimization result storage
+ - **[Observability](docs/src/production/observability.md)** - Zero-config Langfuse integration and structured logging
+
+ ### Advanced Usage
+ - **[Complex Types](docs/src/advanced/complex-types.md)** - Sorbet type integration with automatic coercion for structs, enums, and arrays
+ - **[Manual Pipelines](docs/src/advanced/pipelines.md)** - Manual module composition patterns
+ - **[RAG Patterns](docs/src/advanced/rag.md)** - Manual RAG implementation with external services
+ - **[Custom Metrics](docs/src/advanced/custom-metrics.md)** - Proc-based evaluation logic
+
  ## Quick Start
 
  ### Installation
 
- ```ruby
- gem 'dspy', '~> 0.15'
- ```
-
- Or add to your Gemfile:
+ Add to your Gemfile:
 
  ```ruby
  gem 'dspy'
@@ -135,68 +176,51 @@ sudo apt-get install cmake
 
  **Note**: The `polars-df` gem compilation can take 15-20 minutes. Pre-built binaries are available for most platforms, so compilation is only needed if a pre-built binary isn't available for your system.
 
- ## Documentation
-
- 📖 **[Complete Documentation Website](https://vicentereig.github.io/dspy.rb/)**
-
- ### LLM-Friendly Documentation
-
- For LLMs and AI assistants working with DSPy.rb:
- - **[llms.txt](https://vicentereig.github.io/dspy.rb/llms.txt)** - Concise reference optimized for LLMs
- - **[llms-full.txt](https://vicentereig.github.io/dspy.rb/llms-full.txt)** - Comprehensive API documentation
-
- ### Getting Started
- - **[Installation & Setup](docs/src/getting-started/installation.md)** - Detailed installation and configuration
- - **[Quick Start Guide](docs/src/getting-started/quick-start.md)** - Your first DSPy programs
- - **[Core Concepts](docs/src/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
-
- ### Core Features
- - **[Signatures & Types](docs/src/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
- - **[Predictors](docs/src/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
- - **[Modules & Pipelines](docs/src/core-concepts/modules.md)** - Compose complex multi-stage workflows
- - **[Examples & Validation](docs/src/core-concepts/examples.md)** - Type-safe training data
-
- ### Optimization
- - **[Evaluation Framework](docs/src/optimization/evaluation.md)** - Basic testing with simple metrics
- - **[Prompt Optimization](docs/src/optimization/prompt-optimization.md)** - Manipulate prompts as objects
- - **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Basic automatic optimization
-
- ### Production Features
- - **[Storage System](docs/src/production/storage.md)** - Basic file-based persistence
- - **[Observability](docs/src/production/observability.md)** - Zero-config Langfuse integration and structured logging
-
- ### Advanced Usage
- - **[Complex Types](docs/src/advanced/complex-types.md)** - Sorbet type integration with automatic coercion for structs, enums, and arrays
- - **[Manual Pipelines](docs/src/advanced/pipelines.md)** - Manual module composition patterns
- - **[RAG Patterns](docs/src/advanced/rag.md)** - Manual RAG implementation with external services
- - **[Custom Metrics](docs/src/advanced/custom-metrics.md)** - Proc-based evaluation logic
-
  ## Recent Achievements
 
  DSPy.rb has rapidly evolved from experimental to production-ready:
 
- - ✅ **JSON Parsing Reliability** (v0.8.0) - Native OpenAI structured outputs, strategy selection, retry logic
- - ✅ **Type-Safe Strategy Configuration** (v0.9.0) - Provider-optimized automatic strategy selection
- - ✅ **Documentation Website** (v0.6.4) - Comprehensive docs at [vicentereig.github.io/dspy.rb](https://vicentereig.github.io/dspy.rb)
+ ### Foundation
+ - ✅ **JSON Parsing Reliability** - Native OpenAI structured outputs, strategy selection, retry logic
+ - ✅ **Type-Safe Strategy Configuration** - Provider-optimized automatic strategy selection
+ - ✅ **Core Module System** - Predict, ChainOfThought, ReAct, CodeAct with type safety
  - ✅ **Production Observability** - OpenTelemetry, New Relic, and Langfuse integration
  - ✅ **Optimization Framework** - MIPROv2 algorithm with storage & persistence
- - ✅ **Core Module System** - Predict, ChainOfThought, ReAct, CodeAct with type safety
 
- ## Roadmap - Battle-Testing Toward v1.0
+ ### Recent Advances
+ - ✅ **Comprehensive Multimodal Framework** - Complete image analysis with `DSPy::Image`, type-safe bounding boxes, vision model integration
+ - ✅ **Advanced Type System** - `T::Enum` integration, union types for agentic workflows, complex type coercion
+ - ✅ **Production-Ready Evaluation** - Multi-factor metrics beyond accuracy, error-resilient evaluation pipelines
+ - ✅ **Documentation Ecosystem** - `llms.txt` for AI assistants, ADRs, blog articles, comprehensive examples
+ - ✅ **API Maturation** - Simplified idiomatic patterns, better error handling, production-proven designs
 
- DSPy.rb is currently at **v0.15.2** and approaching stability. I'm focusing on real-world usage and refinement through the 0.16+ series before committing to a stable v1.0 API.
+ ## Roadmap - Production Battle-Testing Toward v1.0
+
+ DSPy.rb has transitioned from **feature building** to **production validation**. The core framework is
+ feature-complete and stable - now I'm focusing on real-world usage patterns, performance optimization,
+ and ecosystem integration.
 
  **Current Focus Areas:**
- - ✅ **Ollama Support** - Local model integration (completed in v0.15.0)
- - **Agentic Memory** - Persistent agent state management with Memory module
- - 🚧 **Google Gemini Support** - Integration with Gemini models (#52)
- - 🚧 **Context Engineering** - Advanced prompt optimization techniques
- - 🚧 **MCP Support** - Model Context Protocol integration
- - 🚧 **Additional Optimizer Support** - Expanding teleprompt capabilities
- - 🚧 **Performance Optimization** - Based on production usage patterns
+
+ ### Production Readiness
+ - 🚧 **Production Patterns** - Real-world usage validation and performance optimization
+ - 🚧 **Ruby Ecosystem Integration** - Rails integration, Sidekiq compatibility, deployment patterns
+ - 🚧 **Scale Testing** - High-volume usage, memory management, connection pooling
+ - 🚧 **Error Recovery** - Robust failure handling patterns for production environments
+
+ ### Ecosystem Expansion
+ - 🚧 **Model Context Protocol (MCP)** - Integration with MCP ecosystem
+ - 🚧 **Additional Provider Support** - Azure OpenAI, local models beyond Ollama
+ - 🚧 **Tool Ecosystem** - Expanded tool integrations for ReAct agents
+
+ ### Community & Adoption
+ - 🚧 **Community Examples** - Real-world applications and case studies
+ - 🚧 **Contributor Experience** - Making it easier to contribute and extend
+ - 🚧 **Performance Benchmarks** - Comparative analysis vs other frameworks
 
  **v1.0 Philosophy:**
- v1.0 will be released after extensive production battle-testing, not after checking off features. This ensures a stable, reliable API backed by real-world validation.
+ v1.0 will be released after extensive production battle-testing, not after checking off features.
+ The API is already stable - v1.0 represents confidence in production reliability backed by real-world validation.
 
  ## License
 
@@ -25,6 +25,7 @@ module DSPy
  @signature_class = enhanced_signature
  end
 
+
  # Override prompt-based methods to maintain ChainOfThought behavior
  sig { override.params(new_prompt: Prompt).returns(ChainOfThought) }
  def with_prompt(new_prompt)
data/lib/dspy/code_act.rb CHANGED
@@ -129,8 +129,16 @@ module DSPy
  # Use the enhanced output struct with CodeAct fields
  @output_struct_class = enhanced_output_struct
 
+ # Store original signature name
+ @original_signature_name = signature_class.name
+
  class << self
- attr_reader :input_struct_class, :output_struct_class
+ attr_reader :input_struct_class, :output_struct_class, :original_signature_name
+
+ # Override name to return the original signature name
+ def name
+ @original_signature_name || super
+ end
  end
  end
 
@@ -140,8 +148,6 @@ module DSPy
 
  sig { params(kwargs: T.untyped).returns(T.untyped).override }
  def forward(**kwargs)
- lm = config.lm || DSPy.config.lm
-
  # Validate input and serialize all fields as task context
  input_struct = @original_signature_class.input_struct_class.new(**kwargs)
  task = DSPy::TypeSerializer.serialize(input_struct).to_json
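Why the `name` override in the first hunk above matters: the enhanced output struct is created with `Class.new`, and anonymous classes return `nil` from `.name`, which breaks logging and serialization that reads the signature name. A plain-Ruby illustration of the same technique (the signature name here is hypothetical):

```ruby
# Anonymous classes have no name by default.
enhanced = Class.new
enhanced.name # => nil

# Mirror the override: keep the wrapped signature's name and fall back to super.
enhanced.instance_variable_set(:@original_signature_name, 'SolveMathProblem')
enhanced.define_singleton_method(:name) do
  @original_signature_name || super()
end
enhanced.name # => "SolveMathProblem"
```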
data/lib/dspy/evaluate.rb CHANGED
@@ -49,7 +49,7 @@ module DSPy
  def to_h
  {
  example: @example,
- prediction: @prediction,
+ prediction: @prediction.respond_to?(:to_h) ? @prediction.to_h : @prediction,
  trace: @trace,
  metrics: @metrics,
  passed: @passed
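The effect of the new guard: any prediction object that can serialize itself to a hash now does so before landing in the result hash, while plain values pass through unchanged. A core-Ruby illustration (the struct below is hypothetical, not DSPy's actual prediction class):

```ruby
Prediction = Struct.new(:answer, :confidence, keyword_init: true)

prediction = Prediction.new(answer: '42', confidence: 0.9)
prediction.respond_to?(:to_h) ? prediction.to_h : prediction
# => { answer: "42", confidence: 0.9 }

raw = 'unparsed model output'
raw.respond_to?(:to_h) ? raw.to_h : raw
# => "unparsed model output"
```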
data/lib/dspy/image.rb CHANGED
@@ -19,6 +19,10 @@ module DSPy
  'anthropic' => {
  sources: %w[base64 data],
  parameters: []
+ },
+ 'gemini' => {
+ sources: %w[base64 data], # Gemini supports inline base64 data, not URLs
+ parameters: []
  }
  }.freeze
 
@@ -99,6 +103,27 @@ module DSPy
  end
  end
 
+ def to_gemini_format
+ if url
+ # Gemini requires base64 for inline data, URLs not supported for inline_data
+ raise NotImplementedError, "URL fetching for Gemini not yet implemented. Use base64 or data instead."
+ elsif base64
+ {
+ inline_data: {
+ mime_type: content_type,
+ data: base64
+ }
+ }
+ elsif data
+ {
+ inline_data: {
+ mime_type: content_type,
+ data: to_base64
+ }
+ }
+ end
+ end
+
  def to_base64
  return base64 if base64
  return Base64.strict_encode64(data.pack('C*')) if data
@@ -139,6 +164,11 @@ module DSPy
  raise DSPy::LM::IncompatibleImageFeatureError,
  "Anthropic doesn't support image URLs. Please provide base64 or raw data instead."
  end
+ when 'gemini'
+ if current_source == 'url'
+ raise DSPy::LM::IncompatibleImageFeatureError,
+ "Gemini doesn't support image URLs for inline data. Please provide base64 or raw data instead."
+ end
  end
  end
 
@@ -148,6 +178,9 @@ module DSPy
  when 'anthropic'
  raise DSPy::LM::IncompatibleImageFeatureError,
  "Anthropic doesn't support the 'detail' parameter. This feature is OpenAI-specific."
+ when 'gemini'
+ raise DSPy::LM::IncompatibleImageFeatureError,
+ "Gemini doesn't support the 'detail' parameter. This feature is OpenAI-specific."
  end
  end
  end
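To make the new Gemini path concrete, here is a sketch of what `to_gemini_format` returns for a base64-backed image. The `DSPy::Image.new` keyword arguments are assumed from the reader methods used above (`url`, `base64`, `data`, `content_type`); they are not spelled out in this diff.

```ruby
require 'base64'
require 'dspy'

# Assumed constructor keywords; the file name is illustrative.
image = DSPy::Image.new(
  base64: Base64.strict_encode64(File.binread('chart.png')),
  content_type: 'image/png'
)

image.validate_for_provider!('gemini') # passes: base64 is an allowed source for Gemini
image.to_gemini_format
# => { inline_data: { mime_type: 'image/png', data: '<base64 string>' } }

# A URL-backed image would instead raise DSPy::LM::IncompatibleImageFeatureError,
# matching the 'gemini' branch added above.
```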
@@ -8,7 +8,8 @@ module DSPy
  ADAPTER_MAP = {
  'openai' => 'OpenAIAdapter',
  'anthropic' => 'AnthropicAdapter',
- 'ollama' => 'OllamaAdapter'
+ 'ollama' => 'OllamaAdapter',
+ 'gemini' => 'GeminiAdapter'
  }.freeze
 
  class << self
@@ -0,0 +1,189 @@
+ # frozen_string_literal: true
+
+ require 'gemini-ai'
+ require 'json'
+ require_relative '../vision_models'
+
+ module DSPy
+ class LM
+ class GeminiAdapter < Adapter
+ def initialize(model:, api_key:)
+ super
+ validate_api_key!(api_key, 'gemini')
+
+ @client = Gemini.new(
+ credentials: {
+ service: 'generative-language-api',
+ api_key: api_key
+ },
+ options: {
+ model: model,
+ server_sent_events: true
+ }
+ )
+ end
+
+ def chat(messages:, signature: nil, **extra_params, &block)
+ normalized_messages = normalize_messages(messages)
+
+ # Validate vision support if images are present
+ if contains_images?(normalized_messages)
+ VisionModels.validate_vision_support!('gemini', model)
+ # Convert messages to Gemini format with proper image handling
+ normalized_messages = format_multimodal_messages(normalized_messages)
+ end
+
+ # Convert DSPy message format to Gemini format
+ gemini_messages = convert_messages_to_gemini_format(normalized_messages)
+
+ request_params = {
+ contents: gemini_messages
+ }.merge(extra_params)
+
+ begin
+ # Always use streaming
+ content = ""
+ final_response_data = nil
+
+ @client.stream_generate_content(request_params) do |chunk|
+ # Handle case where chunk might be a string (from SSE VCR)
+ if chunk.is_a?(String)
+ begin
+ chunk = JSON.parse(chunk)
+ rescue JSON::ParserError => e
+ raise AdapterError, "Failed to parse Gemini streaming response: #{e.message}"
+ end
+ end
+
+ # Extract content from chunks
+ if chunk.dig('candidates', 0, 'content', 'parts')
+ chunk_text = extract_text_from_parts(chunk.dig('candidates', 0, 'content', 'parts'))
+ content += chunk_text
+
+ # Call block only if provided (for real streaming)
+ block.call(chunk) if block_given?
+ end
+
+ # Store final response data (usage, metadata) from last chunk
+ if chunk['usageMetadata'] || chunk.dig('candidates', 0, 'finishReason')
+ final_response_data = chunk
+ end
+ end
+
+ # Extract usage information from final chunk
+ usage_data = final_response_data&.dig('usageMetadata')
+ usage_struct = usage_data ? UsageFactory.create('gemini', usage_data) : nil
+
+ # Create metadata from final chunk
+ metadata = {
+ provider: 'gemini',
+ model: model,
+ finish_reason: final_response_data&.dig('candidates', 0, 'finishReason'),
+ safety_ratings: final_response_data&.dig('candidates', 0, 'safetyRatings'),
+ streaming: block_given?
+ }
+
+ # Create typed metadata
+ typed_metadata = ResponseMetadataFactory.create('gemini', metadata)
+
+ Response.new(
+ content: content,
+ usage: usage_struct,
+ metadata: typed_metadata
+ )
+ rescue => e
+ handle_gemini_error(e)
+ end
+ end
+
+ private
+
+ # Convert DSPy message format to Gemini format
+ def convert_messages_to_gemini_format(messages)
+ # Gemini expects contents array with role and parts
+ messages.map do |msg|
+ role = case msg[:role]
+ when 'system'
+ 'user' # Gemini doesn't have explicit system role, merge with user
+ when 'assistant'
+ 'model'
+ else
+ msg[:role]
+ end
+
+ if msg[:content].is_a?(Array)
+ # Multimodal content
+ parts = msg[:content].map do |item|
+ case item[:type]
+ when 'text'
+ { text: item[:text] }
+ when 'image'
+ item[:image].to_gemini_format
+ else
+ item
+ end
+ end
+
+ { role: role, parts: parts }
+ else
+ # Text-only content
+ { role: role, parts: [{ text: msg[:content] }] }
+ end
+ end
+ end
+
+ # Extract text content from Gemini parts array
+ def extract_text_from_parts(parts)
+ return "" unless parts.is_a?(Array)
+
+ parts.map { |part| part['text'] }.compact.join
+ end
+
+ # Format multimodal messages for Gemini
+ def format_multimodal_messages(messages)
+ messages.map do |msg|
+ if msg[:content].is_a?(Array)
+ # Convert multimodal content to Gemini format
+ formatted_content = msg[:content].map do |item|
+ case item[:type]
+ when 'text'
+ { type: 'text', text: item[:text] }
+ when 'image'
+ # Validate image compatibility before formatting
+ item[:image].validate_for_provider!('gemini')
+ item[:image].to_gemini_format
+ else
+ item
+ end
+ end
+
+ {
+ role: msg[:role],
+ content: formatted_content
+ }
+ else
+ msg
+ end
+ end
+ end
+
+ # Handle Gemini-specific errors
+ def handle_gemini_error(error)
+ error_msg = error.message.to_s
+
+ if error_msg.include?('API_KEY') || error_msg.include?('status 400') || error_msg.include?('status 401') || error_msg.include?('status 403')
+ raise AdapterError, "Gemini authentication failed: #{error_msg}. Check your API key."
+ elsif error_msg.include?('RATE_LIMIT') || error_msg.downcase.include?('quota') || error_msg.include?('status 429')
+ raise AdapterError, "Gemini rate limit exceeded: #{error_msg}. Please wait and try again."
+ elsif error_msg.include?('SAFETY') || error_msg.include?('blocked')
+ raise AdapterError, "Gemini content was blocked by safety filters: #{error_msg}"
+ elsif error_msg.include?('image') || error_msg.include?('media')
+ raise AdapterError, "Gemini image processing failed: #{error_msg}. Ensure your image is a valid format and under size limits."
+ else
+ # Generic error handling
+ raise AdapterError, "Gemini adapter error: #{error_msg}"
+ end
+ end
+ end
+ end
+ end
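End to end, the new adapter is reachable through the `'gemini'` key added to `ADAPTER_MAP` above. A minimal smoke test of the class as it appears in this diff might look like the following; the model id and environment variable name are assumptions, not part of the diff.

```ruby
require 'dspy'

# Direct instantiation mirrors initialize(model:, api_key:) above.
adapter = DSPy::LM::GeminiAdapter.new(
  model: 'gemini-1.5-flash',           # assumed model id
  api_key: ENV.fetch('GEMINI_API_KEY') # assumed environment variable
)

# chat(messages:) takes DSPy's normalized role/content hashes; system messages are
# folded into the user role and assistant messages become 'model' internally.
response = adapter.chat(messages: [
  { role: 'system', content: 'Answer in one short sentence.' },
  { role: 'user',   content: 'What is DSPy.rb?' }
])

puts response.content
puts response.usage&.total_tokens
puts response.metadata.finish_reason
```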
@@ -84,13 +84,42 @@ module DSPy
  end
  end
 
+ # Gemini-specific metadata with additional fields
+ class GeminiResponseMetadata < T::Struct
+ extend T::Sig
+
+ const :provider, String
+ const :model, String
+ const :response_id, T.nilable(String), default: nil
+ const :created, T.nilable(Integer), default: nil
+ const :structured_output, T.nilable(T::Boolean), default: nil
+ const :finish_reason, T.nilable(String), default: nil
+ const :safety_ratings, T.nilable(T::Array[T::Hash[String, T.untyped]]), default: nil
+ const :streaming, T.nilable(T::Boolean), default: nil
+
+ sig { returns(T::Hash[Symbol, T.untyped]) }
+ def to_h
+ hash = {
+ provider: provider,
+ model: model
+ }
+ hash[:response_id] = response_id if response_id
+ hash[:created] = created if created
+ hash[:structured_output] = structured_output unless structured_output.nil?
+ hash[:finish_reason] = finish_reason if finish_reason
+ hash[:safety_ratings] = safety_ratings if safety_ratings
+ hash[:streaming] = streaming unless streaming.nil?
+ hash
+ end
+ end
+
  # Normalized response format for all LM providers
  class Response < T::Struct
  extend T::Sig
 
  const :content, String
  const :usage, T.nilable(T.any(Usage, OpenAIUsage)), default: nil
- const :metadata, T.any(ResponseMetadata, OpenAIResponseMetadata, AnthropicResponseMetadata, T::Hash[Symbol, T.untyped])
+ const :metadata, T.any(ResponseMetadata, OpenAIResponseMetadata, AnthropicResponseMetadata, GeminiResponseMetadata, T::Hash[Symbol, T.untyped])
 
  sig { returns(String) }
  def to_s
@@ -112,7 +141,7 @@ module DSPy
  module ResponseMetadataFactory
  extend T::Sig
 
- sig { params(provider: String, metadata: T.nilable(T::Hash[Symbol, T.untyped])).returns(T.any(ResponseMetadata, OpenAIResponseMetadata, AnthropicResponseMetadata)) }
+ sig { params(provider: String, metadata: T.nilable(T::Hash[Symbol, T.untyped])).returns(T.any(ResponseMetadata, OpenAIResponseMetadata, AnthropicResponseMetadata, GeminiResponseMetadata)) }
  def self.create(provider, metadata)
  # Handle nil metadata
  metadata ||= {}
@@ -123,7 +152,7 @@ module DSPy
  # Extract common fields
  common_fields = {
  provider: provider,
- model: metadata[:model] || 'unknown',
+ model: metadata[:model],
  response_id: metadata[:response_id] || metadata[:id],
  created: metadata[:created],
  structured_output: metadata[:structured_output]
@@ -143,6 +172,13 @@ module DSPy
  stop_sequence: metadata[:stop_sequence]&.to_s,
  tool_calls: metadata[:tool_calls]
  )
+ when 'gemini'
+ GeminiResponseMetadata.new(
+ **common_fields,
+ finish_reason: metadata[:finish_reason]&.to_s,
+ safety_ratings: metadata[:safety_ratings],
+ streaming: metadata[:streaming]
+ )
  else
  ResponseMetadata.new(**common_fields)
  end
@@ -151,7 +187,7 @@ module DSPy
  # Fallback to basic metadata
  ResponseMetadata.new(
  provider: provider,
- model: metadata[:model] || 'unknown'
+ model: metadata[:model]
  )
  end
  end
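For reference, `to_h` on the new struct omits nil fields, so a streamed Gemini response serializes to just the populated keys. The namespace is assumed to be `DSPy::LM` (alongside the other metadata structs) and the values are illustrative:

```ruby
meta = DSPy::LM::GeminiResponseMetadata.new(
  provider: 'gemini',
  model: 'gemini-1.5-flash',
  finish_reason: 'STOP',
  streaming: true
)

meta.to_h
# => { provider: "gemini", model: "gemini-1.5-flash", finish_reason: "STOP", streaming: true }
# response_id, created, structured_output, and safety_ratings are nil, so they are omitted.
```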
data/lib/dspy/lm/usage.rb CHANGED
@@ -72,6 +72,8 @@ module DSPy
  create_openai_usage(normalized)
  when 'anthropic'
  create_anthropic_usage(normalized)
+ when 'gemini'
+ create_gemini_usage(normalized)
  else
  create_generic_usage(normalized)
  end
@@ -136,6 +138,23 @@ module DSPy
  nil
  end
 
+ sig { params(data: T::Hash[Symbol, T.untyped]).returns(T.nilable(Usage)) }
+ def self.create_gemini_usage(data)
+ # Gemini uses promptTokenCount/candidatesTokenCount/totalTokenCount
+ input_tokens = data[:promptTokenCount] || data[:input_tokens] || 0
+ output_tokens = data[:candidatesTokenCount] || data[:output_tokens] || 0
+ total_tokens = data[:totalTokenCount] || data[:total_tokens] || (input_tokens + output_tokens)
+
+ Usage.new(
+ input_tokens: input_tokens,
+ output_tokens: output_tokens,
+ total_tokens: total_tokens
+ )
+ rescue => e
+ DSPy.logger.debug("Failed to create Gemini usage: #{e.message}")
+ nil
+ end
+
  sig { params(data: T::Hash[Symbol, T.untyped]).returns(T.nilable(Usage)) }
  def self.create_generic_usage(data)
  # Generic fallback
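The mapping above translates Gemini's camelCase usage fields into the shared `Usage` struct. A worked example with invented token counts, following the `when 'gemini'` branch added earlier:

```ruby
# Symbol-keyed usage metadata as create_gemini_usage receives it; counts are made up.
data = { promptTokenCount: 12, candidatesTokenCount: 34, totalTokenCount: 46 }

input_tokens  = data[:promptTokenCount]     || data[:input_tokens]  || 0      # => 12
output_tokens = data[:candidatesTokenCount] || data[:output_tokens] || 0      # => 34
total_tokens  = data[:totalTokenCount]      || (input_tokens + output_tokens) # => 46

# These feed Usage.new(input_tokens:, output_tokens:, total_tokens:),
# so usage.total_tokens reports 46 for this response.
```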