dspy 0.19.1 → 0.20.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d76e6cfecee3a85c7d8cd389b39eb0401e2051644e0f3ada98ab03645414f35e
- data.tar.gz: fe078c869c9ee45237916dd10bd12a5d9bb80ea629c1bb6b0a29ae6ff5249b73
+ metadata.gz: 3324f833b373e826df1dcdf3f2c3f46011e192e9a30fdf38d7db1b9cc51a7950
+ data.tar.gz: f6159081bc6429d0c57e6275111d2672bf984f8500be7756afbf8bb17599dd19
  SHA512:
- metadata.gz: 56565be338d06d517daa931d7ebfb32f259b92d9b8a55140783bcf34f635ace57213909d74003ad33b6e8761d5c1fae950f1f6332ee55131d8a7ffd5c4a58183
- data.tar.gz: 3e36f838fbba8e06428397611f1f2e2476f789c7f4714a98ce2198f38bf5632a0f4e12cf5be56282c48d31071c9b784959f0095f27bc8f42c44472ad43111d41
+ metadata.gz: cd81596af2ea0d734550b0827e18b71d4abdd1e3a4c92efb014af665d17bb2a37fb36b747757eb3ad4377d89a435b81bbaa4477ce0fd22b8ed88319ebc8b1f5b
+ data.tar.gz: d65e68af35f7ced5e97524dd85f1c590900f33b720038f25bbc22995031d10cc5115bdb1b160b153e5fbaa5bf2cb3809dde0520f6bd99a7ac348c0231bf88afa
data/README.md CHANGED
@@ -2,12 +2,17 @@
 
  [![Gem Version](https://img.shields.io/gem/v/dspy)](https://rubygems.org/gems/dspy)
  [![Total Downloads](https://img.shields.io/gem/dt/dspy)](https://rubygems.org/gems/dspy)
+ [![Build Status](https://img.shields.io/github/actions/workflow/status/vicentereig/dspy.rb/ruby.yml?branch=main&label=build)](https://github.com/vicentereig/dspy.rb/actions/workflows/ruby.yml)
+ [![Documentation](https://img.shields.io/badge/docs-vicentereig.github.io%2Fdspy.rb-blue)](https://vicentereig.github.io/dspy.rb/)
 
  **Build reliable LLM applications in Ruby using composable, type-safe modules.**
 
- DSPy.rb brings structured LLM programming to Ruby developers. Instead of wrestling with prompt strings and parsing responses, you define typed signatures and compose them into pipelines that just work.
+ DSPy.rb brings structured LLM programming to Ruby developers. Instead of wrestling with prompt strings and parsing
+ responses, you define typed signatures and compose them into pipelines that just work.
 
- Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular signatures and let the framework handle the messy details.
+ Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you
+ the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular
+ signatures and let the framework handle the messy details.
 
  The result? LLM applications that actually scale and don't break when you sneeze.
 
@@ -54,18 +59,18 @@ puts result.confidence # => 0.85
  ## What You Get
 
  **Core Building Blocks:**
- - **Signatures** - Define input/output schemas using Sorbet types
- - **Predict** - Basic LLM completion with structured data
- - **Chain of Thought** - Step-by-step reasoning for complex problems
- - **ReAct** - Tool-using agents with basic tool integration
+ - **Signatures** - Define input/output schemas using Sorbet types with T::Enum and union type support
+ - **Predict** - LLM completion with structured data extraction and multimodal support
+ - **Chain of Thought** - Step-by-step reasoning for complex problems with automatic prompt optimization
+ - **ReAct** - Tool-using agents with type-safe tool definitions and error recovery
  - **CodeAct** - Dynamic code execution agents for programming tasks
- - **Manual Composition** - Combine multiple LLM calls into workflows
+ - **Module Composition** - Combine multiple LLM calls into production-ready workflows
 
  **Optimization & Evaluation:**
  - **Prompt Objects** - Manipulate prompts as first-class objects instead of strings
  - **Typed Examples** - Type-safe training data with automatic validation
- - **Evaluation Framework** - Basic testing with simple metrics
- - **Basic Optimization** - Simple prompt optimization techniques
+ - **Evaluation Framework** - Advanced metrics beyond simple accuracy with error-resilient pipelines
+ - **MIPROv2 Optimization** - Automatic prompt optimization with storage and persistence
 
  **Production Features:**
  - **Reliable JSON Extraction** - Native OpenAI structured outputs, Anthropic extraction patterns, and automatic strategy selection with fallback
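For context on the building blocks listed above, here is a minimal sketch of the signature-and-predict pattern the README describes (the `result.confidence` fragment in the hunk headers comes from an example of this shape). It follows DSPy.rb's documented API; the class, field, and model names below are illustrative, not taken from this diff.

```ruby
require 'dspy'

# Illustrative signature: typed inputs and outputs instead of a prompt string.
class ClassifySentiment < DSPy::Signature
  description "Classify the sentiment of a sentence."

  input do
    const :sentence, String
  end

  output do
    const :sentiment, String   # e.g. "positive", "negative", "neutral"
    const :confidence, Float
  end
end

# Assumed provider string and environment variable; any configured LM works the same way.
DSPy.configure do |config|
  config.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
end

classify = DSPy::Predict.new(ClassifySentiment)
result = classify.call(sentence: 'This library makes LLM code feel like Ruby again.')
puts result.sentiment
puts result.confidence # => e.g. 0.85
```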
@@ -78,28 +83,64 @@ puts result.confidence # => 0.85
 
  **Developer Experience:**
  - LLM provider support using official Ruby clients:
- - [OpenAI Ruby](https://github.com/openai/openai-ruby)
- - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby)
- - [Ollama](https://ollama.com/) via OpenAI compatibility layer
- - Runtime type checking with [Sorbet](https://sorbet.org/)
+ - [OpenAI Ruby](https://github.com/openai/openai-ruby) with vision model support
+ - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby) with multimodal capabilities
+ - [Ollama](https://ollama.com/) via OpenAI compatibility layer for local models
+ - **Multimodal Support** - Complete image analysis with DSPy::Image, type-safe bounding boxes, vision-capable models
+ - Runtime type checking with [Sorbet](https://sorbet.org/) including T::Enum and union types
  - Type-safe tool definitions for ReAct agents
  - Comprehensive instrumentation and observability
 
  ## Development Status
 
- DSPy.rb is actively developed and approaching stability at **v0.15.4**. The core framework is production-ready with comprehensive documentation, but I'm battle-testing features through the 0.x series before committing to a stable v1.0 API.
+ DSPy.rb is actively developed and approaching stability. The core framework is production-ready with
+ comprehensive documentation, but I'm battle-testing features through the 0.x series before committing
+ to a stable v1.0 API.
 
  Real-world usage feedback is invaluable - if you encounter issues or have suggestions, please open a GitHub issue!
 
+ ## Documentation
+
+ 📖 **[Complete Documentation Website](https://vicentereig.github.io/dspy.rb/)**
+
+ ### LLM-Friendly Documentation
+
+ For LLMs and AI assistants working with DSPy.rb:
+ - **[llms.txt](https://vicentereig.github.io/dspy.rb/llms.txt)** - Concise reference optimized for LLMs
+ - **[llms-full.txt](https://vicentereig.github.io/dspy.rb/llms-full.txt)** - Comprehensive API documentation
+
+ ### Getting Started
+ - **[Installation & Setup](docs/src/getting-started/installation.md)** - Detailed installation and configuration
+ - **[Quick Start Guide](docs/src/getting-started/quick-start.md)** - Your first DSPy programs
+ - **[Core Concepts](docs/src/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
+
+ ### Core Features
+ - **[Signatures & Types](docs/src/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
+ - **[Predictors](docs/src/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
+ - **[Modules & Pipelines](docs/src/core-concepts/modules.md)** - Compose complex multi-stage workflows
+ - **[Multimodal Support](docs/src/core-concepts/multimodal.md)** - Image analysis with vision-capable models
+ - **[Examples & Validation](docs/src/core-concepts/examples.md)** - Type-safe training data
+
+ ### Optimization
+ - **[Evaluation Framework](docs/src/optimization/evaluation.md)** - Advanced metrics beyond simple accuracy
+ - **[Prompt Optimization](docs/src/optimization/prompt-optimization.md)** - Manipulate prompts as objects
+ - **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Automatic optimization algorithms
+
+ ### Production Features
+ - **[Storage System](docs/src/production/storage.md)** - Persistence and optimization result storage
+ - **[Observability](docs/src/production/observability.md)** - Zero-config Langfuse integration and structured logging
+
+ ### Advanced Usage
+ - **[Complex Types](docs/src/advanced/complex-types.md)** - Sorbet type integration with automatic coercion for structs, enums, and arrays
+ - **[Manual Pipelines](docs/src/advanced/pipelines.md)** - Manual module composition patterns
+ - **[RAG Patterns](docs/src/advanced/rag.md)** - Manual RAG implementation with external services
+ - **[Custom Metrics](docs/src/advanced/custom-metrics.md)** - Proc-based evaluation logic
+
  ## Quick Start
 
  ### Installation
 
- ```ruby
- gem 'dspy', '~> 0.15'
- ```
-
- Or add to your Gemfile:
+ Add to your Gemfile:
 
  ```ruby
  gem 'dspy'
@@ -135,68 +176,51 @@ sudo apt-get install cmake
 
  **Note**: The `polars-df` gem compilation can take 15-20 minutes. Pre-built binaries are available for most platforms, so compilation is only needed if a pre-built binary isn't available for your system.
 
- ## Documentation
-
- 📖 **[Complete Documentation Website](https://vicentereig.github.io/dspy.rb/)**
-
- ### LLM-Friendly Documentation
-
- For LLMs and AI assistants working with DSPy.rb:
- - **[llms.txt](https://vicentereig.github.io/dspy.rb/llms.txt)** - Concise reference optimized for LLMs
- - **[llms-full.txt](https://vicentereig.github.io/dspy.rb/llms-full.txt)** - Comprehensive API documentation
-
- ### Getting Started
- - **[Installation & Setup](docs/src/getting-started/installation.md)** - Detailed installation and configuration
- - **[Quick Start Guide](docs/src/getting-started/quick-start.md)** - Your first DSPy programs
- - **[Core Concepts](docs/src/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
-
- ### Core Features
- - **[Signatures & Types](docs/src/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
- - **[Predictors](docs/src/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
- - **[Modules & Pipelines](docs/src/core-concepts/modules.md)** - Compose complex multi-stage workflows
- - **[Examples & Validation](docs/src/core-concepts/examples.md)** - Type-safe training data
-
- ### Optimization
- - **[Evaluation Framework](docs/src/optimization/evaluation.md)** - Basic testing with simple metrics
- - **[Prompt Optimization](docs/src/optimization/prompt-optimization.md)** - Manipulate prompts as objects
- - **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Basic automatic optimization
-
- ### Production Features
- - **[Storage System](docs/src/production/storage.md)** - Basic file-based persistence
- - **[Observability](docs/src/production/observability.md)** - Zero-config Langfuse integration and structured logging
-
- ### Advanced Usage
- - **[Complex Types](docs/src/advanced/complex-types.md)** - Sorbet type integration with automatic coercion for structs, enums, and arrays
- - **[Manual Pipelines](docs/src/advanced/pipelines.md)** - Manual module composition patterns
- - **[RAG Patterns](docs/src/advanced/rag.md)** - Manual RAG implementation with external services
- - **[Custom Metrics](docs/src/advanced/custom-metrics.md)** - Proc-based evaluation logic
-
  ## Recent Achievements
 
  DSPy.rb has rapidly evolved from experimental to production-ready:
 
- - ✅ **JSON Parsing Reliability** (v0.8.0) - Native OpenAI structured outputs, strategy selection, retry logic
- - ✅ **Type-Safe Strategy Configuration** (v0.9.0) - Provider-optimized automatic strategy selection
- - ✅ **Documentation Website** (v0.6.4) - Comprehensive docs at [vicentereig.github.io/dspy.rb](https://vicentereig.github.io/dspy.rb)
+ ### Foundation
+ - ✅ **JSON Parsing Reliability** - Native OpenAI structured outputs, strategy selection, retry logic
+ - ✅ **Type-Safe Strategy Configuration** - Provider-optimized automatic strategy selection
+ - ✅ **Core Module System** - Predict, ChainOfThought, ReAct, CodeAct with type safety
  - ✅ **Production Observability** - OpenTelemetry, New Relic, and Langfuse integration
  - ✅ **Optimization Framework** - MIPROv2 algorithm with storage & persistence
- - ✅ **Core Module System** - Predict, ChainOfThought, ReAct, CodeAct with type safety
 
- ## Roadmap - Battle-Testing Toward v1.0
+ ### Recent Advances
+ - ✅ **Comprehensive Multimodal Framework** - Complete image analysis with `DSPy::Image`, type-safe bounding boxes, vision model integration
+ - ✅ **Advanced Type System** - `T::Enum` integration, union types for agentic workflows, complex type coercion
+ - ✅ **Production-Ready Evaluation** - Multi-factor metrics beyond accuracy, error-resilient evaluation pipelines
+ - ✅ **Documentation Ecosystem** - `llms.txt` for AI assistants, ADRs, blog articles, comprehensive examples
+ - ✅ **API Maturation** - Simplified idiomatic patterns, better error handling, production-proven designs
 
- DSPy.rb is currently at **v0.15.2** and approaching stability. I'm focusing on real-world usage and refinement through the 0.16+ series before committing to a stable v1.0 API.
+ ## Roadmap - Production Battle-Testing Toward v1.0
+
+ DSPy.rb has transitioned from **feature building** to **production validation**. The core framework is
+ feature-complete and stable - now I'm focusing on real-world usage patterns, performance optimization,
+ and ecosystem integration.
 
  **Current Focus Areas:**
- - ✅ **Ollama Support** - Local model integration (completed in v0.15.0)
- - **Agentic Memory** - Persistent agent state management with Memory module
- - 🚧 **Google Gemini Support** - Integration with Gemini models (#52)
- - 🚧 **Context Engineering** - Advanced prompt optimization techniques
- - 🚧 **MCP Support** - Model Context Protocol integration
- - 🚧 **Additional Optimizer Support** - Expanding teleprompt capabilities
- - 🚧 **Performance Optimization** - Based on production usage patterns
+
+ ### Production Readiness
+ - 🚧 **Production Patterns** - Real-world usage validation and performance optimization
+ - 🚧 **Ruby Ecosystem Integration** - Rails integration, Sidekiq compatibility, deployment patterns
+ - 🚧 **Scale Testing** - High-volume usage, memory management, connection pooling
+ - 🚧 **Error Recovery** - Robust failure handling patterns for production environments
+
+ ### Ecosystem Expansion
+ - 🚧 **Model Context Protocol (MCP)** - Integration with MCP ecosystem
+ - 🚧 **Additional Provider Support** - Azure OpenAI, local models beyond Ollama
+ - 🚧 **Tool Ecosystem** - Expanded tool integrations for ReAct agents
+
+ ### Community & Adoption
+ - 🚧 **Community Examples** - Real-world applications and case studies
+ - 🚧 **Contributor Experience** - Making it easier to contribute and extend
+ - 🚧 **Performance Benchmarks** - Comparative analysis vs other frameworks
 
  **v1.0 Philosophy:**
- v1.0 will be released after extensive production battle-testing, not after checking off features. This ensures a stable, reliable API backed by real-world validation.
+ v1.0 will be released after extensive production battle-testing, not after checking off features.
+ The API is already stable - v1.0 represents confidence in production reliability backed by real-world validation.
 
  ## License
 
@@ -25,6 +25,7 @@ module DSPy
  @signature_class = enhanced_signature
  end
 
+
  # Override prompt-based methods to maintain ChainOfThought behavior
  sig { override.params(new_prompt: Prompt).returns(ChainOfThought) }
  def with_prompt(new_prompt)
data/lib/dspy/code_act.rb CHANGED
@@ -129,8 +129,16 @@ module DSPy
  # Use the enhanced output struct with CodeAct fields
  @output_struct_class = enhanced_output_struct
 
+ # Store original signature name
+ @original_signature_name = signature_class.name
+
  class << self
- attr_reader :input_struct_class, :output_struct_class
+ attr_reader :input_struct_class, :output_struct_class, :original_signature_name
+
+ # Override name to return the original signature name
+ def name
+ @original_signature_name || super
+ end
  end
  end
 
@@ -140,8 +148,6 @@ module DSPy
 
  sig { params(kwargs: T.untyped).returns(T.untyped).override }
  def forward(**kwargs)
- lm = config.lm || DSPy.config.lm
-
  # Validate input and serialize all fields as task context
  input_struct = @original_signature_class.input_struct_class.new(**kwargs)
  task = DSPy::TypeSerializer.serialize(input_struct).to_json
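Why the `name` override in the first hunk above matters: the enhanced output struct is created with `Class.new`, and anonymous classes return `nil` from `.name`, which breaks logging and serialization that reads the signature name. A plain-Ruby illustration of the same technique (the signature name here is hypothetical):

```ruby
# Anonymous classes have no name by default.
enhanced = Class.new
enhanced.name # => nil

# Mirror the override: keep the wrapped signature's name and fall back to super.
enhanced.instance_variable_set(:@original_signature_name, 'SolveMathProblem')
enhanced.define_singleton_method(:name) do
  @original_signature_name || super()
end
enhanced.name # => "SolveMathProblem"
```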
data/lib/dspy/evaluate.rb CHANGED
@@ -49,7 +49,7 @@ module DSPy
  def to_h
  {
  example: @example,
- prediction: @prediction,
+ prediction: @prediction.respond_to?(:to_h) ? @prediction.to_h : @prediction,
  trace: @trace,
  metrics: @metrics,
  passed: @passed
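The effect of the new guard: any prediction object that can serialize itself to a hash now does so before landing in the result hash, while plain values pass through unchanged. A core-Ruby illustration (the struct below is hypothetical, not DSPy's actual prediction class):

```ruby
Prediction = Struct.new(:answer, :confidence, keyword_init: true)

prediction = Prediction.new(answer: '42', confidence: 0.9)
prediction.respond_to?(:to_h) ? prediction.to_h : prediction
# => { answer: "42", confidence: 0.9 }

raw = 'unparsed model output'
raw.respond_to?(:to_h) ? raw.to_h : raw
# => "unparsed model output"
```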
data/lib/dspy/image.rb CHANGED
@@ -19,6 +19,10 @@ module DSPy
  'anthropic' => {
  sources: %w[base64 data],
  parameters: []
+ },
+ 'gemini' => {
+ sources: %w[base64 data], # Gemini supports inline base64 data, not URLs
+ parameters: []
  }
  }.freeze
 
@@ -99,6 +103,27 @@ module DSPy
  end
  end
 
+ def to_gemini_format
+ if url
+ # Gemini requires base64 for inline data, URLs not supported for inline_data
+ raise NotImplementedError, "URL fetching for Gemini not yet implemented. Use base64 or data instead."
+ elsif base64
+ {
+ inline_data: {
+ mime_type: content_type,
+ data: base64
+ }
+ }
+ elsif data
+ {
+ inline_data: {
+ mime_type: content_type,
+ data: to_base64
+ }
+ }
+ end
+ end
+
  def to_base64
  return base64 if base64
  return Base64.strict_encode64(data.pack('C*')) if data
@@ -139,6 +164,11 @@ module DSPy
  raise DSPy::LM::IncompatibleImageFeatureError,
  "Anthropic doesn't support image URLs. Please provide base64 or raw data instead."
  end
+ when 'gemini'
+ if current_source == 'url'
+ raise DSPy::LM::IncompatibleImageFeatureError,
+ "Gemini doesn't support image URLs for inline data. Please provide base64 or raw data instead."
+ end
  end
  end
 
@@ -148,6 +178,9 @@ module DSPy
  when 'anthropic'
  raise DSPy::LM::IncompatibleImageFeatureError,
  "Anthropic doesn't support the 'detail' parameter. This feature is OpenAI-specific."
+ when 'gemini'
+ raise DSPy::LM::IncompatibleImageFeatureError,
+ "Gemini doesn't support the 'detail' parameter. This feature is OpenAI-specific."
  end
  end
  end
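To make the new Gemini path concrete, here is a sketch of what `to_gemini_format` returns for a base64-backed image. The `DSPy::Image.new` keyword arguments are assumed from the reader methods used above (`url`, `base64`, `data`, `content_type`); they are not spelled out in this diff.

```ruby
require 'base64'
require 'dspy'

# Assumed constructor keywords; the file name is illustrative.
image = DSPy::Image.new(
  base64: Base64.strict_encode64(File.binread('chart.png')),
  content_type: 'image/png'
)

image.validate_for_provider!('gemini') # passes: base64 is an allowed source for Gemini
image.to_gemini_format
# => { inline_data: { mime_type: 'image/png', data: '<base64 string>' } }

# A URL-backed image would instead raise DSPy::LM::IncompatibleImageFeatureError,
# matching the 'gemini' branch added above.
```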
@@ -8,7 +8,8 @@ module DSPy
  ADAPTER_MAP = {
  'openai' => 'OpenAIAdapter',
  'anthropic' => 'AnthropicAdapter',
- 'ollama' => 'OllamaAdapter'
+ 'ollama' => 'OllamaAdapter',
+ 'gemini' => 'GeminiAdapter'
  }.freeze
 
  class << self
@@ -0,0 +1,189 @@
+ # frozen_string_literal: true
+
+ require 'gemini-ai'
+ require 'json'
+ require_relative '../vision_models'
+
+ module DSPy
+ class LM
+ class GeminiAdapter < Adapter
+ def initialize(model:, api_key:)
+ super
+ validate_api_key!(api_key, 'gemini')
+
+ @client = Gemini.new(
+ credentials: {
+ service: 'generative-language-api',
+ api_key: api_key
+ },
+ options: {
+ model: model,
+ server_sent_events: true
+ }
+ )
+ end
+
+ def chat(messages:, signature: nil, **extra_params, &block)
+ normalized_messages = normalize_messages(messages)
+
+ # Validate vision support if images are present
+ if contains_images?(normalized_messages)
+ VisionModels.validate_vision_support!('gemini', model)
+ # Convert messages to Gemini format with proper image handling
+ normalized_messages = format_multimodal_messages(normalized_messages)
+ end
+
+ # Convert DSPy message format to Gemini format
+ gemini_messages = convert_messages_to_gemini_format(normalized_messages)
+
+ request_params = {
+ contents: gemini_messages
+ }.merge(extra_params)
+
+ begin
+ # Always use streaming
+ content = ""
+ final_response_data = nil
+
+ @client.stream_generate_content(request_params) do |chunk|
+ # Handle case where chunk might be a string (from SSE VCR)
+ if chunk.is_a?(String)
+ begin
+ chunk = JSON.parse(chunk)
+ rescue JSON::ParserError => e
+ raise AdapterError, "Failed to parse Gemini streaming response: #{e.message}"
+ end
+ end
+
+ # Extract content from chunks
+ if chunk.dig('candidates', 0, 'content', 'parts')
+ chunk_text = extract_text_from_parts(chunk.dig('candidates', 0, 'content', 'parts'))
+ content += chunk_text
+
+ # Call block only if provided (for real streaming)
+ block.call(chunk) if block_given?
+ end
+
+ # Store final response data (usage, metadata) from last chunk
+ if chunk['usageMetadata'] || chunk.dig('candidates', 0, 'finishReason')
+ final_response_data = chunk
+ end
+ end
+
+ # Extract usage information from final chunk
+ usage_data = final_response_data&.dig('usageMetadata')
+ usage_struct = usage_data ? UsageFactory.create('gemini', usage_data) : nil
+
+ # Create metadata from final chunk
+ metadata = {
+ provider: 'gemini',
+ model: model,
+ finish_reason: final_response_data&.dig('candidates', 0, 'finishReason'),
+ safety_ratings: final_response_data&.dig('candidates', 0, 'safetyRatings'),
+ streaming: block_given?
+ }
+
+ # Create typed metadata
+ typed_metadata = ResponseMetadataFactory.create('gemini', metadata)
+
+ Response.new(
+ content: content,
+ usage: usage_struct,
+ metadata: typed_metadata
+ )
+ rescue => e
+ handle_gemini_error(e)
+ end
+ end
+
+ private
+
+ # Convert DSPy message format to Gemini format
+ def convert_messages_to_gemini_format(messages)
+ # Gemini expects contents array with role and parts
+ messages.map do |msg|
+ role = case msg[:role]
+ when 'system'
+ 'user' # Gemini doesn't have explicit system role, merge with user
+ when 'assistant'
+ 'model'
+ else
+ msg[:role]
+ end
+
+ if msg[:content].is_a?(Array)
+ # Multimodal content
+ parts = msg[:content].map do |item|
+ case item[:type]
+ when 'text'
+ { text: item[:text] }
+ when 'image'
+ item[:image].to_gemini_format
+ else
+ item
+ end
+ end
+
+ { role: role, parts: parts }
+ else
+ # Text-only content
+ { role: role, parts: [{ text: msg[:content] }] }
+ end
+ end
+ end
+
+ # Extract text content from Gemini parts array
+ def extract_text_from_parts(parts)
+ return "" unless parts.is_a?(Array)
+
+ parts.map { |part| part['text'] }.compact.join
+ end
+
+ # Format multimodal messages for Gemini
+ def format_multimodal_messages(messages)
+ messages.map do |msg|
+ if msg[:content].is_a?(Array)
+ # Convert multimodal content to Gemini format
+ formatted_content = msg[:content].map do |item|
+ case item[:type]
+ when 'text'
+ { type: 'text', text: item[:text] }
+ when 'image'
+ # Validate image compatibility before formatting
+ item[:image].validate_for_provider!('gemini')
+ item[:image].to_gemini_format
+ else
+ item
+ end
+ end
+
+ {
+ role: msg[:role],
+ content: formatted_content
+ }
+ else
+ msg
+ end
+ end
+ end
+
+ # Handle Gemini-specific errors
+ def handle_gemini_error(error)
+ error_msg = error.message.to_s
+
+ if error_msg.include?('API_KEY') || error_msg.include?('status 400') || error_msg.include?('status 401') || error_msg.include?('status 403')
+ raise AdapterError, "Gemini authentication failed: #{error_msg}. Check your API key."
+ elsif error_msg.include?('RATE_LIMIT') || error_msg.downcase.include?('quota') || error_msg.include?('status 429')
+ raise AdapterError, "Gemini rate limit exceeded: #{error_msg}. Please wait and try again."
+ elsif error_msg.include?('SAFETY') || error_msg.include?('blocked')
+ raise AdapterError, "Gemini content was blocked by safety filters: #{error_msg}"
+ elsif error_msg.include?('image') || error_msg.include?('media')
+ raise AdapterError, "Gemini image processing failed: #{error_msg}. Ensure your image is a valid format and under size limits."
+ else
+ # Generic error handling
+ raise AdapterError, "Gemini adapter error: #{error_msg}"
+ end
+ end
+ end
+ end
+ end
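End to end, the new adapter is reachable through the `'gemini'` key added to `ADAPTER_MAP` above. A minimal smoke test of the class as it appears in this diff might look like the following; the model id and environment variable name are assumptions, not part of the diff.

```ruby
require 'dspy'

# Direct instantiation mirrors initialize(model:, api_key:) above.
adapter = DSPy::LM::GeminiAdapter.new(
  model: 'gemini-1.5-flash',           # assumed model id
  api_key: ENV.fetch('GEMINI_API_KEY') # assumed environment variable
)

# chat(messages:) takes DSPy's normalized role/content hashes; system messages are
# folded into the user role and assistant messages become 'model' internally.
response = adapter.chat(messages: [
  { role: 'system', content: 'Answer in one short sentence.' },
  { role: 'user',   content: 'What is DSPy.rb?' }
])

puts response.content
puts response.usage&.total_tokens
puts response.metadata.finish_reason
```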
@@ -84,13 +84,42 @@ module DSPy
  end
  end
 
+ # Gemini-specific metadata with additional fields
+ class GeminiResponseMetadata < T::Struct
+ extend T::Sig
+
+ const :provider, String
+ const :model, String
+ const :response_id, T.nilable(String), default: nil
+ const :created, T.nilable(Integer), default: nil
+ const :structured_output, T.nilable(T::Boolean), default: nil
+ const :finish_reason, T.nilable(String), default: nil
+ const :safety_ratings, T.nilable(T::Array[T::Hash[String, T.untyped]]), default: nil
+ const :streaming, T.nilable(T::Boolean), default: nil
+
+ sig { returns(T::Hash[Symbol, T.untyped]) }
+ def to_h
+ hash = {
+ provider: provider,
+ model: model
+ }
+ hash[:response_id] = response_id if response_id
+ hash[:created] = created if created
+ hash[:structured_output] = structured_output unless structured_output.nil?
+ hash[:finish_reason] = finish_reason if finish_reason
+ hash[:safety_ratings] = safety_ratings if safety_ratings
+ hash[:streaming] = streaming unless streaming.nil?
+ hash
+ end
+ end
+
  # Normalized response format for all LM providers
  class Response < T::Struct
  extend T::Sig
 
  const :content, String
  const :usage, T.nilable(T.any(Usage, OpenAIUsage)), default: nil
- const :metadata, T.any(ResponseMetadata, OpenAIResponseMetadata, AnthropicResponseMetadata, T::Hash[Symbol, T.untyped])
+ const :metadata, T.any(ResponseMetadata, OpenAIResponseMetadata, AnthropicResponseMetadata, GeminiResponseMetadata, T::Hash[Symbol, T.untyped])
 
  sig { returns(String) }
  def to_s
@@ -112,7 +141,7 @@ module DSPy
  module ResponseMetadataFactory
  extend T::Sig
 
- sig { params(provider: String, metadata: T.nilable(T::Hash[Symbol, T.untyped])).returns(T.any(ResponseMetadata, OpenAIResponseMetadata, AnthropicResponseMetadata)) }
+ sig { params(provider: String, metadata: T.nilable(T::Hash[Symbol, T.untyped])).returns(T.any(ResponseMetadata, OpenAIResponseMetadata, AnthropicResponseMetadata, GeminiResponseMetadata)) }
  def self.create(provider, metadata)
  # Handle nil metadata
  metadata ||= {}
@@ -123,7 +152,7 @@ module DSPy
  # Extract common fields
  common_fields = {
  provider: provider,
- model: metadata[:model] || 'unknown',
+ model: metadata[:model],
  response_id: metadata[:response_id] || metadata[:id],
  created: metadata[:created],
  structured_output: metadata[:structured_output]
@@ -143,6 +172,13 @@ module DSPy
  stop_sequence: metadata[:stop_sequence]&.to_s,
  tool_calls: metadata[:tool_calls]
  )
+ when 'gemini'
+ GeminiResponseMetadata.new(
+ **common_fields,
+ finish_reason: metadata[:finish_reason]&.to_s,
+ safety_ratings: metadata[:safety_ratings],
+ streaming: metadata[:streaming]
+ )
  else
  ResponseMetadata.new(**common_fields)
  end
@@ -151,7 +187,7 @@ module DSPy
  # Fallback to basic metadata
  ResponseMetadata.new(
  provider: provider,
- model: metadata[:model] || 'unknown'
+ model: metadata[:model]
  )
  end
  end
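For reference, `to_h` on the new struct omits nil fields, so a streamed Gemini response serializes to just the populated keys. The namespace is assumed to be `DSPy::LM` (alongside the other metadata structs) and the values are illustrative:

```ruby
meta = DSPy::LM::GeminiResponseMetadata.new(
  provider: 'gemini',
  model: 'gemini-1.5-flash',
  finish_reason: 'STOP',
  streaming: true
)

meta.to_h
# => { provider: "gemini", model: "gemini-1.5-flash", finish_reason: "STOP", streaming: true }
# response_id, created, structured_output, and safety_ratings are nil, so they are omitted.
```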
data/lib/dspy/lm/usage.rb CHANGED
@@ -72,6 +72,8 @@ module DSPy
  create_openai_usage(normalized)
  when 'anthropic'
  create_anthropic_usage(normalized)
+ when 'gemini'
+ create_gemini_usage(normalized)
  else
  create_generic_usage(normalized)
  end
@@ -136,6 +138,23 @@ module DSPy
  nil
  end
 
+ sig { params(data: T::Hash[Symbol, T.untyped]).returns(T.nilable(Usage)) }
+ def self.create_gemini_usage(data)
+ # Gemini uses promptTokenCount/candidatesTokenCount/totalTokenCount
+ input_tokens = data[:promptTokenCount] || data[:input_tokens] || 0
+ output_tokens = data[:candidatesTokenCount] || data[:output_tokens] || 0
+ total_tokens = data[:totalTokenCount] || data[:total_tokens] || (input_tokens + output_tokens)
+
+ Usage.new(
+ input_tokens: input_tokens,
+ output_tokens: output_tokens,
+ total_tokens: total_tokens
+ )
+ rescue => e
+ DSPy.logger.debug("Failed to create Gemini usage: #{e.message}")
+ nil
+ end
+
  sig { params(data: T::Hash[Symbol, T.untyped]).returns(T.nilable(Usage)) }
  def self.create_generic_usage(data)
  # Generic fallback
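The mapping above translates Gemini's camelCase usage fields into the shared `Usage` struct. A worked example with invented token counts, following the `when 'gemini'` branch added earlier:

```ruby
# Symbol-keyed usage metadata as create_gemini_usage receives it; counts are made up.
data = { promptTokenCount: 12, candidatesTokenCount: 34, totalTokenCount: 46 }

input_tokens  = data[:promptTokenCount]     || data[:input_tokens]  || 0      # => 12
output_tokens = data[:candidatesTokenCount] || data[:output_tokens] || 0      # => 34
total_tokens  = data[:totalTokenCount]      || (input_tokens + output_tokens) # => 46

# These feed Usage.new(input_tokens:, output_tokens:, total_tokens:),
# so usage.total_tokens reports 46 for this response.
```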