llm_conductor 1.1.0 → 1.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 77bf332827087c373c19d318c72b21532974ae510b6b7958b8a5d77fc3e85833
- data.tar.gz: 365717c91358d62d4a44264087d6500829d53ca1a9d1dc4550890282152ad75b
+ metadata.gz: 0ca46783dd713d49b3292342d83f5adde4a0da684e4365004651464e6ac630bb
+ data.tar.gz: 5f693e2e4d8da70bebe5faf174a71e1880a705eb7dc6468d4fdf6774b8f2e9f3
  SHA512:
- metadata.gz: 547cb6b093784591aa0944e36cab482a380a2519fd73324a5dcefaae38fb4eade745f564ada7daee05cd7359515022f49492fee330fc0eee4e592eaf2e5d37bf
- data.tar.gz: 1118f23b62a725962c7576ba368b5ddadd5fee7f5244f5aea1405d52bacd56708a14cbe50d833afc8e81e6916be8db5c50bb424c238f134957b7965272504944
+ metadata.gz: 70ccb3ae2317588199a2820f1da19188e5ada27da19052de9cc964aeac775bbdd8feae350c958e797676eb8cbaf6549966591ad1299a1088ceeb8cfcbf70dc35
+ data.tar.gz: d24c1ff9423b009b2d50a35d6b122817174d5871813db037ee15378938cfd88aaf9dbc4765325bd65cf99d9fbbbeaae2a2d5ad926677efc6b36187209ddccf09
data/.rubocop.yml CHANGED
@@ -33,6 +33,7 @@ Metrics/MethodLength:
    Max: 15
    Exclude:
      - 'lib/llm_conductor/prompts.rb'
+     - 'lib/llm_conductor/clients/openrouter_client.rb'
 
  RSpec/ExampleLength:
    Enabled: false
@@ -89,14 +90,17 @@ Metrics/BlockLength:
  Metrics/AbcSize:
    Exclude:
      - 'lib/llm_conductor/prompts.rb'
+     - 'lib/llm_conductor/clients/openrouter_client.rb'
 
  Metrics/CyclomaticComplexity:
    Exclude:
      - 'lib/llm_conductor/prompts.rb'
+     - 'lib/llm_conductor/clients/openrouter_client.rb'
 
  Metrics/PerceivedComplexity:
    Exclude:
      - 'lib/llm_conductor/prompts.rb'
+     - 'lib/llm_conductor/clients/openrouter_client.rb'
 
  Layout/LineLength:
    Max: 120
data/README.md CHANGED
@@ -1,11 +1,12 @@
  # LLM Conductor
 
- A powerful Ruby gem from [Ekohe](https://ekohe.com) for orchestrating multiple Language Model providers with a unified, modern interface. LLM Conductor provides seamless integration with OpenAI GPT, Anthropic Claude, Google Gemini, Groq, and Ollama with advanced prompt management, data building patterns, and comprehensive response handling.
+ A powerful Ruby gem from [Ekohe](https://ekohe.com) for orchestrating multiple Language Model providers with a unified, modern interface. LLM Conductor provides seamless integration with OpenAI GPT, Anthropic Claude, Google Gemini, Groq, Ollama, and OpenRouter with advanced prompt management, data building patterns, vision/multimodal support, and comprehensive response handling.
 
  ## Features
 
- 🚀 **Multi-Provider Support** - OpenAI GPT, Anthropic Claude, Google Gemini, Groq, and Ollama with automatic vendor detection
+ 🚀 **Multi-Provider Support** - OpenAI GPT, Anthropic Claude, Google Gemini, Groq, Ollama, and OpenRouter with automatic vendor detection
  🎯 **Unified Modern API** - Simple `LlmConductor.generate()` interface with rich Response objects
+ 🖼️ **Vision/Multimodal Support** - Send images alongside text prompts for vision-enabled models (OpenRouter)
  📝 **Advanced Prompt Management** - Registrable prompt classes with inheritance and templating
  🏗️ **Data Builder Pattern** - Structured data preparation for complex LLM inputs
  ⚡ **Smart Configuration** - Rails-style configuration with environment variable support
@@ -114,6 +115,11 @@ LlmConductor.configure do |config|
      base_url: ENV['OLLAMA_ADDRESS'] || 'http://localhost:11434'
    )
 
+   config.openrouter(
+     api_key: ENV['OPENROUTER_API_KEY'],
+     uri_base: 'https://openrouter.ai/api/v1' # Optional, this is the default
+   )
+
    # Optional: Configure custom logger
    config.logger = Logger.new($stdout) # Log to stdout
    config.logger = Logger.new('log/llm_conductor.log') # Log to file
@@ -153,6 +159,7 @@ The gem automatically detects these environment variables:
  - `GEMINI_API_KEY` - Google Gemini API key
  - `GROQ_API_KEY` - Groq API key
  - `OLLAMA_ADDRESS` - Ollama server address
+ - `OPENROUTER_API_KEY` - OpenRouter API key
 
  ## Supported Providers & Models
 
@@ -223,6 +230,85 @@ response = LlmConductor.generate(
  )
  ```
 
+ ### OpenRouter (Access to Multiple Providers)
+ OpenRouter provides unified access to various LLM providers with automatic routing. It also supports vision/multimodal models with automatic retry logic for handling intermittent availability issues.
+
+ **Vision-capable models:**
+ - `nvidia/nemotron-nano-12b-v2-vl:free` - **FREE** 12B vision model (may need retries)
+ - `openai/gpt-4o-mini` - Fast and reliable
+ - `google/gemini-flash-1.5` - Fast vision processing
+ - `anthropic/claude-3.5-sonnet` - High quality analysis
+ - `openai/gpt-4o` - Best quality (higher cost)
+
+ **Note:** Free-tier models may experience intermittent 502 errors. The client includes automatic retry logic with exponential backoff (up to 5 retries) to handle these transient failures.
+
+ ```ruby
+ # Text-only request
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: 'Your prompt here'
+ )
+
+ # Vision/multimodal request with single image
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: {
+     text: 'What is in this image?',
+     images: 'https://example.com/image.jpg'
+   }
+ )
+
+ # Vision request with multiple images
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: {
+     text: 'Compare these images',
+     images: [
+       'https://example.com/image1.jpg',
+       'https://example.com/image2.jpg'
+     ]
+   }
+ )
+
+ # Vision request with detail level
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: {
+     text: 'Describe this image in detail',
+     images: [
+       { url: 'https://example.com/image.jpg', detail: 'high' }
+     ]
+   }
+ )
+
+ # Advanced: Raw array format (OpenAI-compatible)
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: [
+     { type: 'text', text: 'What is in this image?' },
+     { type: 'image_url', image_url: { url: 'https://example.com/image.jpg' } }
+   ]
+ )
+ ```
+
+ **Reliability:** The OpenRouter client includes intelligent retry logic:
+ - Automatically retries on 502 errors (up to 5 attempts)
+ - Exponential backoff: 2s, 4s, 8s, 16s, 32s
+ - Transparent to your code - works seamlessly
+ - Enable logging to see retry attempts:
+
+ ```ruby
+ LlmConductor.configure do |config|
+   config.logger = Logger.new($stdout)
+   config.logger.level = Logger::INFO
+ end
+ ```
+
  ### Vendor Detection
 
  The gem automatically detects the appropriate provider based on model names:
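The backoff schedule mentioned in the README changes above (2s, 4s, 8s, 16s, 32s over up to five attempts) follows directly from doubling the wait on each retry. A minimal standalone sketch; the `backoff_schedule` helper name is hypothetical, for illustration only:

```ruby
# Exponential backoff schedule for 502 retries, as described in the README:
# the wait time doubles on each attempt (2**attempt seconds).
MAX_RETRIES = 5

def backoff_schedule(max_retries = MAX_RETRIES)
  (1..max_retries).map { |attempt| 2**attempt }
end

puts backoff_schedule.inspect # => [2, 4, 8, 16, 32]
```

Summing the schedule shows a request can spend up to 62 seconds sleeping before the final failure is raised, which is worth keeping in mind when calling free-tier models from latency-sensitive code.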
data/VISION_USAGE.md ADDED
@@ -0,0 +1,233 @@
+ # Vision/Multimodal Usage Guide
+
+ This guide explains how to use vision/multimodal capabilities with the OpenRouter client in LLM Conductor.
+
+ ## Quick Start
+
+ ```ruby
+ require 'llm_conductor'
+
+ # Configure
+ LlmConductor.configure do |config|
+   config.openrouter(api_key: ENV['OPENROUTER_API_KEY'])
+ end
+
+ # Analyze an image
+ response = LlmConductor.generate(
+   model: 'openai/gpt-4o-mini',
+   vendor: :openrouter,
+   prompt: {
+     text: 'What is in this image?',
+     images: 'https://example.com/image.jpg'
+   }
+ )
+
+ puts response.output
+ ```
+
+ ## Recommended Models
+
+ For vision tasks via OpenRouter, these models work reliably:
+
+ - **`openai/gpt-4o-mini`** - Fast, reliable, good balance of cost/quality ✅
+ - **`google/gemini-flash-1.5`** - Fast vision processing
+ - **`anthropic/claude-3.5-sonnet`** - High quality analysis
+ - **`openai/gpt-4o`** - Best quality (higher cost)
+
+ ## Usage Formats
+
+ ### 1. Single Image (Simple Format)
+
+ ```ruby
+ response = LlmConductor.generate(
+   model: 'openai/gpt-4o-mini',
+   vendor: :openrouter,
+   prompt: {
+     text: 'Describe this image',
+     images: 'https://example.com/image.jpg'
+   }
+ )
+ ```
+
+ ### 2. Multiple Images
+
+ ```ruby
+ response = LlmConductor.generate(
+   model: 'openai/gpt-4o-mini',
+   vendor: :openrouter,
+   prompt: {
+     text: 'Compare these images',
+     images: [
+       'https://example.com/image1.jpg',
+       'https://example.com/image2.jpg',
+       'https://example.com/image3.jpg'
+     ]
+   }
+ )
+ ```
+
+ ### 3. Image with Detail Level
+
+ For high-resolution images, specify the detail level:
+
+ ```ruby
+ response = LlmConductor.generate(
+   model: 'openai/gpt-4o-mini',
+   vendor: :openrouter,
+   prompt: {
+     text: 'Analyze this image in detail',
+     images: [
+       { url: 'https://example.com/hires-image.jpg', detail: 'high' }
+     ]
+   }
+ )
+ ```
+
+ Detail levels:
+ - `'high'` - Better for detailed analysis (uses more tokens)
+ - `'low'` - Faster, cheaper (default if not specified)
+ - `'auto'` - Let the model decide
+
+ ### 4. Raw Format (Advanced)
+
+ For maximum control, use the OpenAI-compatible array format:
+
+ ```ruby
+ response = LlmConductor.generate(
+   model: 'openai/gpt-4o-mini',
+   vendor: :openrouter,
+   prompt: [
+     { type: 'text', text: 'What is in this image?' },
+     { type: 'image_url', image_url: { url: 'https://example.com/image.jpg' } },
+     { type: 'text', text: 'Describe it in detail.' }
+   ]
+ )
+ ```
+
+ ## Text-Only Requests (Backward Compatible)
+
+ The client still supports regular text-only requests:
+
+ ```ruby
+ response = LlmConductor.generate(
+   model: 'openai/gpt-4o-mini',
+   vendor: :openrouter,
+   prompt: 'What is the capital of France?'
+ )
+ ```
+
+ ## Image URL Requirements
+
+ - Images must be publicly accessible URLs
+ - Supported formats: JPEG, PNG, GIF, WebP
+ - Maximum file size depends on the model
+ - Use HTTPS URLs when possible
+
+ ## Error Handling
+
+ ```ruby
+ response = LlmConductor.generate(
+   model: 'openai/gpt-4o-mini',
+   vendor: :openrouter,
+   prompt: {
+     text: 'Analyze this',
+     images: 'https://example.com/image.jpg'
+   }
+ )
+
+ if response.success?
+   puts response.output
+ else
+   puts "Error: #{response.metadata[:error]}"
+ end
+ ```
+
+ ## Testing in Development
+
+ ### Interactive Console
+
+ ```bash
+ ./bin/console
+ ```
+
+ Then:
+
+ ```ruby
+ LlmConductor.configure do |config|
+   config.openrouter(api_key: 'your-key')
+ end
+
+ response = LlmConductor.generate(
+   model: 'openai/gpt-4o-mini',
+   vendor: :openrouter,
+   prompt: {
+     text: 'What is this?',
+     images: 'https://example.com/image.jpg'
+   }
+ )
+ ```
+
+ ### Run Examples
+
+ ```bash
+ export OPENROUTER_API_KEY='your-key'
+ ruby examples/openrouter_vision_usage.rb
+ ```
+
+ ## Token Counting
+
+ Token counting for multimodal requests counts only the text portion. Image tokens vary by:
+ - Image size
+ - Detail level specified
+ - Model being used
+
+ The gem provides an approximation based on text tokens. For precise billing, check the OpenRouter dashboard.
+
+ ## Common Issues
+
+ ### 502 Server Error
+
+ If you get a 502 error:
+ - The model might be unavailable
+ - Try a different model (e.g., switch to `openai/gpt-4o-mini`)
+ - Free-tier models may be overloaded
+
+ ### "No implicit conversion of Hash into String"
+
+ This bug is fixed in the current version; make sure you're running the latest release of the gem.
+
+ ### Image Not Loading
+
+ - Verify the URL is publicly accessible
+ - Check that the image format is supported
+ - Try a smaller image size
+
+ ## Cost Considerations
+
+ Vision models are more expensive than text-only models. Costs vary by:
+
+ - **Model choice**: GPT-4o > GPT-4o-mini > Gemini Flash
+ - **Detail level**: `high` uses more tokens than `low`
+ - **Image count**: Each image adds to the cost
+ - **Image size**: Larger images may use more tokens
+
+ For development, use:
+ - `openai/gpt-4o-mini` for cost-effective testing
+ - `detail: 'low'` for quick analysis
+ - Single images when possible
+
+ For production:
+ - Use `openai/gpt-4o` for best quality
+ - Use `detail: 'high'` when needed
+ - Monitor costs via OpenRouter dashboard
+
+ ## Examples
+
+ See `examples/openrouter_vision_usage.rb` for complete working examples.
+
+ ## Further Reading
+
+ - [OpenRouter Documentation](https://openrouter.ai/docs)
+ - [OpenAI Vision API Reference](https://platform.openai.com/docs/guides/vision)
+ - [Anthropic Claude Vision](https://docs.anthropic.com/claude/docs/vision)
+
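The "Token Counting" approximation described in the guide above can be sketched standalone. The `extract_text`/`approx_tokens` helpers and the 4-characters-per-token divisor are assumptions for illustration, not the gem's actual tokenizer; the gem only promises that the text portion of a multimodal prompt is what gets counted:

```ruby
# Pull the countable text out of a prompt in any of the three accepted
# shapes: String, { text:, images: } Hash, or a raw content-part Array.
def extract_text(prompt)
  case prompt
  when String
    prompt
  when Hash
    prompt[:text] || prompt['text'] || ''
  when Array
    prompt.select { |part| part[:type] == 'text' }
          .map { |part| part[:text] }
          .join(' ')
  else
    prompt.to_s
  end
end

# Naive token estimate: ~4 characters per token (assumed heuristic).
def approx_tokens(prompt)
  (extract_text(prompt).length / 4.0).ceil
end

puts approx_tokens(text: 'What is in this image?', images: 'https://example.com/image.jpg') # => 6
```

Note that the image URL contributes nothing to the estimate, which is exactly why the guide defers to the OpenRouter dashboard for precise billing.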
data/examples/openrouter_vision_usage.rb ADDED
@@ -0,0 +1,108 @@
+ #!/usr/bin/env ruby
+ # frozen_string_literal: true
+
+ # Example of OpenRouter vision/multimodal usage
+ require_relative '../lib/llm_conductor'
+
+ # Configure OpenRouter
+ LlmConductor.configure do |config|
+   config.openrouter(
+     api_key: ENV['OPENROUTER_API_KEY']
+   )
+ end
+
+ # Example 1: Simple text-only request (backward compatible)
+ puts '=== Example 1: Text-only request ==='
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free', # Free vision-capable model
+   vendor: :openrouter,
+   prompt: 'What is the capital of France?'
+ )
+ puts response.output
+ puts "Tokens used: #{response.total_tokens}\n\n"
+
+ # Example 2: Vision request with a single image
+ puts '=== Example 2: Single image analysis ==='
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: {
+     text: 'What is in this image?',
+     images: 'https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg'
+   }
+ )
+ puts response.output
+ puts "Tokens used: #{response.total_tokens}\n\n"
+
+ # Example 3: Vision request with multiple images
+ puts '=== Example 3: Multiple images comparison ==='
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: {
+     text: 'Compare these two images and describe the differences.',
+     images: [
+       'https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg',
+       'https://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Placeholder_view_vector.svg/681px-Placeholder_view_vector.svg.png'
+     ]
+   }
+ )
+ puts response.output
+ puts "Tokens used: #{response.total_tokens}\n\n"
+
+ # Example 4: Image with detail level specification
+ puts '=== Example 4: Image with detail level ==='
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: {
+     text: 'Describe this image in detail.',
+     images: [
+       {
+         url: 'https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg',
+         detail: 'high'
+       }
+     ]
+   }
+ )
+ puts response.output
+ puts "Tokens used: #{response.total_tokens}\n\n"
+
+ # Example 5: Using raw array format (advanced)
+ puts '=== Example 5: Raw array format ==='
+ response = LlmConductor.generate(
+   model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+   vendor: :openrouter,
+   prompt: [
+     { type: 'text', text: 'What is in this image?' },
+     {
+       type: 'image_url',
+       image_url: {
+         url: 'https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg'
+       }
+     }
+   ]
+ )
+ puts response.output
+ puts "Tokens used: #{response.total_tokens}\n\n"
+
+ # Example 6: Error handling
+ puts '=== Example 6: Error handling ==='
+ begin
+   response = LlmConductor.generate(
+     model: 'nvidia/nemotron-nano-12b-v2-vl:free',
+     vendor: :openrouter,
+     prompt: {
+       text: 'Analyze this image',
+       images: 'invalid-url'
+     }
+   )
+
+   if response.success?
+     puts response.output
+   else
+     puts "Error: #{response.metadata[:error]}"
+   end
+ rescue StandardError => e
+   puts "Exception: #{e.message}"
+ end
data/lib/llm_conductor/clients/groq_client.rb CHANGED
@@ -7,10 +7,9 @@ module LlmConductor
      private
 
      def generate_content(prompt)
-       client.chat(
-         messages: [{ role: 'user', content: prompt }],
-         model:
-       ).dig('choices', 0, 'message', 'content')
+       # Groq::Client.chat expects messages as positional arg, not keyword arg
+       messages = [{ role: 'user', content: prompt }]
+       client.chat(messages, model_id: model)['content']
      end
 
      def client
data/lib/llm_conductor/clients/openrouter_client.rb CHANGED
@@ -3,17 +3,122 @@
  module LlmConductor
    module Clients
      # OpenRouter client implementation for accessing various LLM providers through OpenRouter API
+     # Supports both text-only and multimodal (vision) requests
      class OpenrouterClient < BaseClient
        private
 
+       # Override token calculation to handle multimodal content
+       def calculate_tokens(content)
+         case content
+         when String
+           super(content)
+         when Hash
+           # For multimodal content, count tokens only for text part
+           # Note: This is an approximation as images have variable token counts
+           text = content[:text] || content['text'] || ''
+           super(text)
+         when Array
+           # For pre-formatted arrays, extract and count text parts
+           text_parts = content.select { |part| part[:type] == 'text' || part['type'] == 'text' }
+                               .map { |part| part[:text] || part['text'] || '' }
+                               .join(' ')
+           super(text_parts)
+         else
+           super(content.to_s)
+         end
+       end
+
        def generate_content(prompt)
-         client.chat(
-           parameters: {
-             model:,
-             messages: [{ role: 'user', content: prompt }],
-             provider: { sort: 'throughput' }
+         content = format_content(prompt)
+
+         # Retry logic for transient 502 errors (common with free-tier models)
+         # Free-tier vision models can be slow/overloaded, so we use more retries
+         max_retries = 5
+         retry_count = 0
+
+         begin
+           client.chat(
+             parameters: {
+               model:,
+               messages: [{ role: 'user', content: }],
+               provider: { sort: 'throughput' }
+             }
+           ).dig('choices', 0, 'message', 'content')
+         rescue Faraday::ServerError => e
+           retry_count += 1
+
+           # Log retry attempts if logger is configured
+           configuration.logger&.warn(
+             "OpenRouter API error (attempt #{retry_count}/#{max_retries}): #{e.message}"
+           )
+
+           raise unless e.response[:status] == 502 && retry_count < max_retries
+
+           wait_time = 2**retry_count # Exponential backoff: 2, 4, 8, 16, 32 seconds
+           configuration.logger&.info("Retrying in #{wait_time}s...")
+           sleep(wait_time)
+           retry
+         end
+       end
+
+       # Format content based on whether it's a simple string or multimodal content
+       # @param prompt [String, Hash, Array] The prompt content
+       # @return [String, Array] Formatted content for the API
+       def format_content(prompt)
+         case prompt
+         when Hash
+           # Handle hash with text and/or images
+           format_multimodal_hash(prompt)
+         when Array
+           # Already formatted as array of content parts
+           prompt
+         else
+           # Simple string prompt
+           prompt.to_s
+         end
+       end
+
+       # Format a hash containing text and/or images into multimodal content array
+       # @param prompt_hash [Hash] Hash with :text and/or :images keys
+       # @return [Array] Array of content parts for the API
+       def format_multimodal_hash(prompt_hash)
+         content_parts = []
+
+         # Add text part if present
+         if prompt_hash[:text] || prompt_hash['text']
+           text = prompt_hash[:text] || prompt_hash['text']
+           content_parts << { type: 'text', text: }
+         end
+
+         # Add image parts if present
+         images = prompt_hash[:images] || prompt_hash['images'] || []
+         images = [images] unless images.is_a?(Array)
+
+         images.each do |image|
+           content_parts << format_image_part(image)
+         end
+
+         content_parts
+       end
+
+       # Format an image into the appropriate API structure
+       # @param image [String, Hash] Image URL or hash with url/detail keys
+       # @return [Hash] Formatted image part for the API
+       def format_image_part(image)
+         case image
+         when String
+           # Simple URL string
+           { type: 'image_url', image_url: { url: image } }
+         when Hash
+           # Hash with url and optional detail level
+           {
+             type: 'image_url',
+             image_url: {
+               url: image[:url] || image['url'],
+               detail: image[:detail] || image['detail']
+             }.compact
            }
-         ).dig('choices', 0, 'message', 'content')
+         end
        end
 
        def client
@@ -21,7 +126,7 @@ module LlmConductor
        config = LlmConductor.configuration.provider_config(:openrouter)
        OpenAI::Client.new(
          access_token: config[:api_key],
-         uri_base: config[:uri_base] || 'https://openrouter.ai/api/'
+         uri_base: config[:uri_base] || 'https://openrouter.ai/api/v1'
        )
      end
    end
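The multimodal formatting added to the client above can be exercised in isolation. The sketch below is a condensed restatement of the diff's `format_multimodal_hash`/`format_image_part` helpers, simplified to symbol keys only, to show the content array the client ultimately sends to the API:

```ruby
# Condensed, symbol-key-only restatement of the client's helpers (illustrative).
def format_image_part(image)
  case image
  when String
    # Bare URL string
    { type: 'image_url', image_url: { url: image } }
  when Hash
    # URL plus optional detail level; compact drops a nil detail
    { type: 'image_url', image_url: { url: image[:url], detail: image[:detail] }.compact }
  end
end

def format_multimodal_hash(prompt_hash)
  parts = []
  parts << { type: 'text', text: prompt_hash[:text] } if prompt_hash[:text]
  images = prompt_hash[:images] || []
  images = [images] unless images.is_a?(Array)
  images.each { |img| parts << format_image_part(img) }
  parts
end

result = format_multimodal_hash(
  text: 'What is in this image?',
  images: [{ url: 'https://example.com/image.jpg', detail: 'high' }]
)
puts result.inspect
```

The output is the same `[{ type: 'text', ... }, { type: 'image_url', ... }]` shape shown in the README's "raw array format" example, which is why passing a pre-built array bypasses these helpers entirely.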
data/lib/llm_conductor/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module LlmConductor
-   VERSION = '1.1.0'
+   VERSION = '1.1.2'
  end
metadata CHANGED
@@ -1,13 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: llm_conductor
  version: !ruby/object:Gem::Version
-   version: 1.1.0
+   version: 1.1.2
  platform: ruby
  authors:
  - Ben Zheng
  bindir: exe
  cert_chain: []
- date: 2025-10-15 00:00:00.000000000 Z
+ date: 2025-10-29 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activesupport
@@ -152,10 +152,12 @@ files:
  - LICENSE
  - README.md
  - Rakefile
+ - VISION_USAGE.md
  - config/initializers/llm_conductor.rb
  - examples/data_builder_usage.rb
  - examples/gemini_usage.rb
  - examples/groq_usage.rb
+ - examples/openrouter_vision_usage.rb
  - examples/prompt_registration.rb
  - examples/rag_usage.rb
  - examples/simple_usage.rb