llm_conductor 1.4.1 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/docs/README.md ADDED
@@ -0,0 +1,42 @@
1
+ # LLM Conductor Documentation
2
+
3
+ Welcome to the LLM Conductor documentation. This directory contains detailed guides for advanced features.
4
+
5
+ ## Guides
6
+
7
+ ### [Custom Parameters](custom-parameters.md)
8
+ Learn how to fine-tune LLM generation with parameters like `temperature`, `top_p`, and more. Includes:
9
+ - Quick reference for common parameters
10
+ - Temperature guidelines
11
+ - Provider-specific parameters
12
+ - Best practices and use cases
13
+
14
+ - **Currently supported**: Ollama
15
+ - **Coming soon**: OpenAI, Anthropic, Gemini, Groq, OpenRouter, Z.ai
16
+
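A minimal sketch of what the guide covers (passing `params` through `LlmConductor.generate`, here with an assumed local `llama2` model served by Ollama):

```ruby
require 'llm_conductor'

# Low temperature keeps the output focused and repeatable.
response = LlmConductor.generate(
  model: 'llama2',          # assumes a local Ollama model named 'llama2'
  prompt: 'Summarize the benefits of unit testing in two sentences.',
  vendor: :ollama,
  params: { temperature: 0.3 }
)

puts response.output
```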
17
+ ### [Vision Support](vision-support.md)
18
+ Complete guide to using vision/multimodal capabilities. Includes:
19
+ - Sending images with text prompts
20
+ - Multiple image handling
21
+ - Provider-specific formats
22
+ - Base64 encoded images
23
+ - Best practices for vision tasks
24
+
25
+ **Supported providers**: OpenAI (GPT-4o), Anthropic (Claude 3), Gemini, OpenRouter, Z.ai (GLM-4.5V)
26
+
27
+ ## Quick Links
28
+
29
+ - [Main README](../README.md) - Getting started and basic usage
30
+ - [Examples](../examples/) - Working code examples
31
+ - [API Reference](https://rubydoc.info/gems/llm_conductor) - Full API documentation
32
+
33
+ ## Contributing
34
+
35
+ Found an issue or want to improve documentation? Please contribute:
36
+
37
+ 1. Fork the repository
38
+ 2. Make your changes
39
+ 3. Submit a pull request
40
+
41
+ See [Contributing Guidelines](../README.md#contributing) for more details.
42
+
data/docs/custom-parameters.md ADDED
@@ -0,0 +1,352 @@
1
+ # Custom Parameters Guide
2
+
3
+ Fine-tune LLM generation behavior with parameters like `temperature`, `top_p`, and more.
4
+
5
+ ## 🚀 Quick Reference
6
+
7
+ ### Temperature Guide
8
+ | Value | Behavior | Use Case |
9
+ |-------|----------|----------|
10
+ | 0.0 | Deterministic | Testing, data extraction |
11
+ | 0.3 | Very focused | Factual Q&A, summaries |
12
+ | 0.7 | **Balanced (recommended)** | General purpose |
13
+ | 0.9 | Creative | Stories, brainstorming |
14
+
15
+ ### Common Patterns
16
+
17
+ ```ruby
18
+ # Deterministic (testing)
19
+ params: { temperature: 0.0, seed: 42 }
20
+
21
+ # Creative writing
22
+ params: { temperature: 0.9, top_p: 0.95, repeat_penalty: 1.2 }
23
+
24
+ # Factual/precise
25
+ params: { temperature: 0.3, top_p: 0.85 }
26
+ ```
27
+
28
+ ### Provider Support
29
+ | Provider | Status |
30
+ |----------|--------|
31
+ | Ollama | ✅ Supported |
32
+ | OpenAI, Anthropic, Gemini, etc. | 🔜 Coming soon |
33
+
34
+ ---
35
+
36
+ ## Overview
37
+
38
+ Custom parameters allow you to control various aspects of LLM generation:
39
+ - **Temperature**: Controls randomness (0.0 = deterministic, higher = more creative)
40
+ - **Top-p/Top-k**: Controls diversity via nucleus/top-k sampling
41
+ - **Max tokens**: Limits the length of generated responses
42
+ - **And more**: Each provider supports different parameters
43
+
44
+ ## Quick Start
45
+
46
+ ```ruby
47
+ require 'llm_conductor'
48
+
49
+ # Generate with custom temperature
50
+ response = LlmConductor.generate(
51
+ model: 'llama2',
52
+ prompt: 'Write a creative story.',
53
+ vendor: :ollama,
54
+ params: { temperature: 0.9 }
55
+ )
56
+ ```
57
+
58
+ ## Usage Examples
59
+
60
+ ### 1. Simple Prompt with Parameters
61
+
62
+ ```ruby
63
+ # Low temperature for focused, deterministic output
64
+ response = LlmConductor.generate(
65
+ model: 'llama2',
66
+ prompt: 'What is 2 + 2?',
67
+ vendor: :ollama,
68
+ params: { temperature: 0.0 }
69
+ )
70
+
71
+ # High temperature for creative output
72
+ response = LlmConductor.generate(
73
+ model: 'llama2',
74
+ prompt: 'Write a poem about the ocean.',
75
+ vendor: :ollama,
76
+ params: { temperature: 0.9 }
77
+ )
78
+ ```
79
+
80
+ ### 2. Multiple Parameters
81
+
82
+ ```ruby
83
+ response = LlmConductor.generate(
84
+ model: 'llama2',
85
+ prompt: 'Explain quantum computing.',
86
+ vendor: :ollama,
87
+ params: {
88
+ temperature: 0.7,
89
+ top_p: 0.9,
90
+ top_k: 40,
91
+ num_predict: 200, # Max tokens
92
+ repeat_penalty: 1.1 # Penalize repetition
93
+ }
94
+ )
95
+ ```
96
+
97
+ ### 3. Using build_client with Parameters
98
+
99
+ ```ruby
100
+ # Create a client with custom parameters
101
+ client = LlmConductor.build_client(
102
+ model: 'llama2',
103
+ type: :custom,
104
+ vendor: :ollama,
105
+ params: {
106
+ temperature: 0.3,
107
+ repeat_penalty: 1.2
108
+ }
109
+ )
110
+
111
+ # Use the client
112
+ response = client.generate_simple(
113
+ prompt: 'List 5 benefits of exercise.'
114
+ )
115
+ ```
116
+
117
+ ### 4. Template-Based Generation with Parameters
118
+
119
+ ```ruby
120
+ # Using params with template-based generation
121
+ response = LlmConductor.generate(
122
+ model: 'llama2',
123
+ type: :summarize_text,
124
+ data: {
125
+ content: 'Long article text here...',
126
+ max_length: 100
127
+ },
128
+ vendor: :ollama,
129
+ params: {
130
+ temperature: 0.5,
131
+ num_predict: 150
132
+ }
133
+ )
134
+ ```
135
+
136
+ ## Ollama Parameters Reference
137
+
138
+ Below are common parameters supported by Ollama. For a complete list, see the [Ollama documentation](https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values).
139
+
140
+ ### Core Parameters
141
+
142
+ | Parameter | Type | Default | Description |
143
+ |-----------|------|---------|-------------|
144
+ | `temperature` | Float | 0.8 | Controls randomness. 0.0 = deterministic, 2.0 = very random |
145
+ | `top_p` | Float | 0.9 | Nucleus sampling. Controls diversity (0.0-1.0) |
146
+ | `top_k` | Integer | 40 | Top-k sampling. Limits vocabulary to top K tokens |
147
+ | `num_predict` | Integer | 128 | Maximum number of tokens to generate |
148
+ | `repeat_penalty` | Float | 1.1 | Penalizes repetition. 1.0 = no penalty |
149
+
150
+ ### Advanced Parameters
151
+
152
+ | Parameter | Type | Description |
153
+ |-----------|------|-------------|
154
+ | `seed` | Integer | Random seed for reproducibility |
155
+ | `stop` | Array of Strings | Stop sequences that end generation |
156
+ | `tfs_z` | Float | Tail-free sampling parameter |
157
+ | `num_ctx` | Integer | Context window size |
158
+ | `num_gpu` | Integer | Number of layers to offload to GPU |
159
+ | `num_thread` | Integer | Number of threads to use |
160
+ | `repeat_last_n` | Integer | How far back to look when applying the repetition penalty |
161
+ | `mirostat` | Integer | Enable Mirostat sampling (0, 1, or 2) |
162
+ | `mirostat_tau` | Float | Mirostat target entropy |
163
+ | `mirostat_eta` | Float | Mirostat learning rate |
164
+
165
+ ## Common Use Cases
166
+
167
+ ### Deterministic Output (Testing, Structured Data)
168
+
169
+ For consistent, reproducible results:
170
+
171
+ ```ruby
172
+ response = LlmConductor.generate(
173
+ model: 'llama2',
174
+ prompt: 'Extract the email addresses from this text...',
175
+ vendor: :ollama,
176
+ params: {
177
+ temperature: 0.0,
178
+ seed: 42 # Optional: ensures reproducibility
179
+ }
180
+ )
181
+ ```
182
+
183
+ ### Creative Writing
184
+
185
+ For more varied, creative responses:
186
+
187
+ ```ruby
188
+ response = LlmConductor.generate(
189
+ model: 'llama2',
190
+ prompt: 'Write a short science fiction story.',
191
+ vendor: :ollama,
192
+ params: {
193
+ temperature: 0.9,
194
+ top_p: 0.95,
195
+ repeat_penalty: 1.2
196
+ }
197
+ )
198
+ ```
199
+
200
+ ### Balanced (General Purpose)
201
+
202
+ Good middle ground for most tasks:
203
+
204
+ ```ruby
205
+ response = LlmConductor.generate(
206
+ model: 'llama2',
207
+ prompt: 'Explain how photosynthesis works.',
208
+ vendor: :ollama,
209
+ params: {
210
+ temperature: 0.7,
211
+ top_p: 0.9
212
+ }
213
+ )
214
+ ```
215
+
216
+ ### Long-Form Content
217
+
218
+ For generating longer responses:
219
+
220
+ ```ruby
221
+ response = LlmConductor.generate(
222
+ model: 'llama2',
223
+ prompt: 'Write a detailed guide on...',
224
+ vendor: :ollama,
225
+ params: {
226
+ temperature: 0.8,
227
+ num_predict: 1000, # Allow up to 1000 tokens
228
+ repeat_penalty: 1.1
229
+ }
230
+ )
231
+ ```
232
+
233
+ ## Best Practices
234
+
235
+ ### 1. Temperature Guidelines
236
+
237
+ - **0.0-0.3**: Deterministic, focused, factual responses
238
+ - Use for: Data extraction, structured output, factual questions
239
+ - **0.4-0.7**: Balanced responses with some variation
240
+ - Use for: General Q&A, summaries, explanations
241
+ - **0.8-1.2**: Creative, diverse responses
242
+ - Use for: Creative writing, brainstorming, storytelling
243
+ - **1.3+**: Very random, experimental
244
+ - Use with caution: May produce incoherent output
245
+
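One way to keep these ranges consistent across a codebase is a small lookup of presets (a hypothetical helper, not part of the gem):

```ruby
# Hypothetical presets reflecting the guidelines above.
TEMPERATURE_PRESETS = {
  extraction: 0.0, # deterministic, structured output
  factual:    0.3, # focused Q&A, summaries
  general:    0.7, # balanced default
  creative:   0.9  # stories, brainstorming
}.freeze

response = LlmConductor.generate(
  model: 'llama2',
  prompt: 'Summarize this article...',
  vendor: :ollama,
  params: { temperature: TEMPERATURE_PRESETS[:factual] }
)
```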
246
+ ### 2. Combining Parameters
247
+
248
+ Temperature and top_p work together:
249
+
250
+ ```ruby
251
+ # Conservative: Focused but with some diversity
252
+ params: { temperature: 0.5, top_p: 0.9 }
253
+
254
+ # Balanced: Good for most use cases
255
+ params: { temperature: 0.7, top_p: 0.9, top_k: 40 }
256
+
257
+ # Creative: Maximum diversity
258
+ params: { temperature: 1.0, top_p: 0.95 }
259
+ ```
260
+
261
+ ### 3. Reproducibility
262
+
263
+ For testing or debugging, use a fixed seed:
264
+
265
+ ```ruby
266
+ params: {
267
+ temperature: 0.5,
268
+ seed: 12345 # Same seed + same params = same output
269
+ }
270
+ ```
271
+
272
+ ### 4. Performance Tuning
273
+
274
+ ```ruby
275
+ # Optimize for speed
276
+ params: {
277
+ num_predict: 100, # Limit output length
278
+ num_thread: 8 # Use more CPU threads
279
+ }
280
+
281
+ # Optimize for quality
282
+ params: {
283
+ num_ctx: 4096, # Larger context window
284
+ repeat_penalty: 1.2 # Reduce repetition
285
+ }
286
+ ```
287
+
288
+ ## Configuration
289
+
290
+ Setting default parameters at the configuration level is a planned enhancement:
291
+
292
+ ```ruby
293
+ # Coming soon - configuration-level defaults
294
+ LlmConductor.configure do |config|
295
+ config.ollama(
296
+ base_url: 'http://localhost:11434',
297
+ default_params: { temperature: 0.7, top_p: 0.9 }
298
+ )
299
+ end
300
+ ```
301
+
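Until configuration-level defaults are available, building one client with `params` and reusing it gives a similar effect:

```ruby
# Build a client once with preferred defaults, then reuse it for multiple calls.
client = LlmConductor.build_client(
  model: 'llama2',
  type: :custom,
  vendor: :ollama,
  params: { temperature: 0.7, top_p: 0.9 }
)

# Every call through this client applies the same parameter defaults.
puts client.generate_simple(prompt: 'Explain HTTP caching briefly.').output
puts client.generate_simple(prompt: 'Explain DNS resolution briefly.').output
```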
302
+ ## Parameter Validation
303
+
304
+ The gem passes parameters directly to the underlying provider. Invalid parameters will either:
305
+ - Be silently ignored by the provider (most common), or
306
+ - Cause the provider API to return an error
307
+
308
+ Always refer to your provider's documentation for supported parameters.
309
+
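For example, a quick sketch of the "ignored" case (assuming Ollama simply drops the unrecognized key rather than raising):

```ruby
# `made_up_param` is not a real Ollama option; most providers just ignore it.
response = LlmConductor.generate(
  model: 'llama2',
  prompt: 'Say hello.',
  vendor: :ollama,
  params: { temperature: 0.7, made_up_param: 123 }
)

puts response.success? # the request still succeeds; the unknown key has no effect
```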
310
+ ## Future Provider Support
311
+
312
+ Currently, custom parameters are fully supported for:
313
+ - ✅ **Ollama**
314
+
315
+ Coming soon:
316
+ - 🔜 OpenAI (GPT)
317
+ - 🔜 Anthropic (Claude)
318
+ - 🔜 Google (Gemini)
319
+ - 🔜 Groq
320
+ - 🔜 OpenRouter
321
+ - 🔜 Z.ai
322
+
323
+ ## Troubleshooting
324
+
325
+ ### Parameters Not Working
326
+
327
+ 1. Check parameter spelling (case-sensitive)
328
+ 2. Verify your provider supports the parameter
329
+ 3. Check parameter value types (integer vs float vs string)
330
+
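For point 3, the expected types for common Ollama parameters look like this (illustrative values only):

```ruby
params: {
  temperature: 0.7,      # Float
  top_p: 0.9,            # Float
  top_k: 40,             # Integer
  num_predict: 200,      # Integer
  stop: ["\n\n", 'END']  # Array of Strings
}
# e.g. temperature: '0.7' (a String) may be ignored or rejected by the provider
```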
331
+ ### Unexpected Output
332
+
333
+ 1. Try lowering `temperature` for more consistent results
334
+ 2. Adjust `top_p` and `top_k` for better quality
335
+ 3. Increase `repeat_penalty` if output is too repetitive
336
+
337
+ ### Performance Issues
338
+
339
+ 1. Reduce `num_predict` to limit output length
340
+ 2. Adjust `num_thread` based on your CPU
341
+ 3. Use `num_gpu` to offload processing to GPU
342
+
343
+ ## Examples
344
+
345
+ See the complete example file: [examples/ollama_params_usage.rb](../examples/ollama_params_usage.rb)
346
+
347
+ ## Resources
348
+
349
+ - [Ollama Parameters Documentation](https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values)
350
+ - [Temperature in Language Models](https://docs.cohere.com/docs/temperature)
351
+ - [Nucleus Sampling (Top-p)](https://arxiv.org/abs/1904.09751)
352
+
data/examples/ollama_params_usage.rb ADDED
@@ -0,0 +1,99 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ # Example demonstrating how to use custom parameters with Ollama client
5
+ require_relative '../lib/llm_conductor'
6
+
7
+ # Configure Ollama (optional if using default localhost)
8
+ LlmConductor.configure do |config|
9
+ config.ollama(base_url: ENV.fetch('OLLAMA_BASE_URL', 'http://localhost:11434'))
10
+ end
11
+
12
+ puts '=== Example 1: Using temperature parameter ==='
13
+ # Generate with custom temperature
14
+ response = LlmConductor.generate(
15
+ model: 'llama2',
16
+ prompt: 'Write a creative story about a robot learning to paint.',
17
+ vendor: :ollama,
18
+ params: { temperature: 0.9 }
19
+ )
20
+
21
+ puts response.output
22
+ puts "Tokens used: #{response.total_tokens}"
23
+ puts "\n"
24
+
25
+ puts '=== Example 2: Using multiple parameters ==='
26
+ # Generate with multiple custom parameters
27
+ response2 = LlmConductor.generate(
28
+ model: 'llama2',
29
+ prompt: 'Explain the concept of artificial intelligence in simple terms.',
30
+ vendor: :ollama,
31
+ params: {
32
+ temperature: 0.7,
33
+ top_p: 0.9,
34
+ top_k: 40,
35
+ num_predict: 200 # Max tokens to generate
36
+ }
37
+ )
38
+
39
+ puts response2.output
40
+ puts "Input tokens: #{response2.input_tokens}"
41
+ puts "Output tokens: #{response2.output_tokens}"
42
+ puts "\n"
43
+
44
+ puts '=== Example 3: Using params with build_client ==='
45
+ # You can also use params when building a client directly
46
+ client = LlmConductor.build_client(
47
+ model: 'llama2',
48
+ type: :custom,
49
+ vendor: :ollama,
50
+ params: {
51
+ temperature: 0.3, # Lower temperature for more focused output
52
+ repeat_penalty: 1.1
53
+ }
54
+ )
55
+
56
+ response3 = client.generate_simple(
57
+ prompt: 'List 5 benefits of regular exercise.'
58
+ )
59
+
60
+ puts response3.output
61
+ puts "Success: #{response3.success?}"
62
+ puts "\n"
63
+
64
+ puts '=== Example 4: Low temperature for deterministic output ==='
65
+ # Use low temperature for more deterministic results
66
+ response4 = LlmConductor.generate(
67
+ model: 'llama2',
68
+ prompt: 'What is 2 + 2?',
69
+ vendor: :ollama,
70
+ params: { temperature: 0.0 }
71
+ )
72
+
73
+ puts response4.output
74
+ puts "\n"
75
+
76
+ puts '=== Available Ollama Parameters ==='
77
+ puts <<~PARAMS
78
+ Common parameters you can use with Ollama:
79
+
80
+ - temperature: Controls randomness (0.0 to 2.0, default: 0.8)
81
+ Lower = more focused and deterministic
82
+ Higher = more random and creative
83
+
84
+ - top_p: Nucleus sampling (0.0 to 1.0, default: 0.9)
85
+ Controls diversity via nucleus sampling
86
+
87
+ - top_k: Top-k sampling (default: 40)
88
+ Limits vocabulary to top K tokens
89
+
90
+ - num_predict: Maximum tokens to generate (default: 128)
91
+
92
+ - repeat_penalty: Penalizes repetition (default: 1.1)
93
+
94
+ - seed: Random seed for reproducibility
95
+
96
+ - stop: Stop sequences (array of strings)
97
+
98
+ For more parameters, see: https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values
99
+ PARAMS
@@ -3,10 +3,10 @@
3
3
  module LlmConductor
4
4
  # Factory class for creating appropriate LLM client instances based on model and vendor
5
5
  class ClientFactory
6
- def self.build(model:, type:, vendor: nil)
6
+ def self.build(model:, type:, vendor: nil, params: {})
7
7
  vendor ||= determine_vendor(model)
8
8
  client_class = client_class_for_vendor(vendor)
9
- client_class.new(model:, type:)
9
+ client_class.new(model:, type:, params:)
10
10
  end
11
11
 
12
12
  def self.client_class_for_vendor(vendor)
@@ -11,11 +11,12 @@ module LlmConductor
11
11
  class BaseClient
12
12
  include Prompts
13
13
 
14
- attr_reader :model, :type
14
+ attr_reader :model, :type, :params
15
15
 
16
- def initialize(model:, type:)
16
+ def initialize(model:, type:, params: {})
17
17
  @model = model
18
18
  @type = type
19
+ @params = params
19
20
  end
20
21
 
21
22
  def generate(data:)
@@ -7,7 +7,8 @@ module LlmConductor
7
7
  private
8
8
 
9
9
  def generate_content(prompt)
10
- client.generate({ model:, prompt:, stream: false }).first['response']
10
+ request_params = { model:, prompt:, stream: false }.merge(params)
11
+ client.generate(request_params).first['response']
11
12
  end
12
13
 
13
14
  def client
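Because the caller's `params` are merged last, they are appended to (and can override) the base request keys; an illustrative example of plain `Hash#merge`, not code from the gem:

```ruby
# Keys from the argument hash win on conflict, so user params take precedence.
base = { model: 'llama2', prompt: 'Hi', stream: false }
base.merge(temperature: 0.9, num_predict: 50)
# => { model: 'llama2', prompt: 'Hi', stream: false, temperature: 0.9, num_predict: 50 }
```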
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module LlmConductor
4
- VERSION = '1.4.1'
4
+ VERSION = '1.5.0'
5
5
  end
data/lib/llm_conductor.rb CHANGED
@@ -24,35 +24,37 @@ module LlmConductor
24
24
  class Error < StandardError; end
25
25
 
26
26
  # Main entry point for creating LLM clients
27
- def self.build_client(model:, type:, vendor: nil)
28
- ClientFactory.build(model:, type:, vendor:)
27
+ def self.build_client(model:, type:, vendor: nil, params: {})
28
+ ClientFactory.build(model:, type:, vendor:, params:)
29
29
  end
30
30
 
31
31
  # Unified generate method supporting both simple prompts and legacy template-based generation
32
- def self.generate(model: nil, prompt: nil, type: nil, data: nil, vendor: nil)
32
+ # rubocop:disable Metrics/ParameterLists
33
+ def self.generate(model: nil, prompt: nil, type: nil, data: nil, vendor: nil, params: {})
33
34
  if prompt && !type && !data
34
- generate_simple_prompt(model:, prompt:, vendor:)
35
+ generate_simple_prompt(model:, prompt:, vendor:, params:)
35
36
  elsif type && data && !prompt
36
- generate_with_template(model:, type:, data:, vendor:)
37
+ generate_with_template(model:, type:, data:, vendor:, params:)
37
38
  else
38
39
  raise ArgumentError,
39
40
  "Invalid arguments. Use either: generate(prompt: 'text') or generate(type: :custom, data: {...})"
40
41
  end
41
42
  end
43
+ # rubocop:enable Metrics/ParameterLists
42
44
 
43
45
  class << self
44
46
  private
45
47
 
46
- def generate_simple_prompt(model:, prompt:, vendor:)
48
+ def generate_simple_prompt(model:, prompt:, vendor:, params:)
47
49
  model ||= configuration.default_model
48
50
  vendor ||= ClientFactory.determine_vendor(model)
49
51
  client_class = client_class_for_vendor(vendor)
50
- client = client_class.new(model:, type: :direct)
52
+ client = client_class.new(model:, type: :direct, params:)
51
53
  client.generate_simple(prompt:)
52
54
  end
53
55
 
54
- def generate_with_template(model:, type:, data:, vendor:)
55
- client = build_client(model:, type:, vendor:)
56
+ def generate_with_template(model:, type:, data:, vendor:, params:)
57
+ client = build_client(model:, type:, vendor:, params:)
56
58
  client.generate(data:)
57
59
  end
58
60
 
metadata CHANGED
@@ -1,13 +1,13 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: llm_conductor
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.4.1
4
+ version: 1.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ben Zheng
8
8
  bindir: exe
9
9
  cert_chain: []
10
- date: 2025-12-05 00:00:00.000000000 Z
10
+ date: 2025-12-12 00:00:00.000000000 Z
11
11
  dependencies:
12
12
  - !ruby/object:Gem::Dependency
13
13
  name: activesupport
@@ -152,14 +152,17 @@ files:
152
152
  - LICENSE
153
153
  - README.md
154
154
  - Rakefile
155
- - VISION_USAGE.md
156
155
  - config/initializers/llm_conductor.rb
156
+ - docs/README.md
157
+ - docs/custom-parameters.md
158
+ - docs/vision-support.md
157
159
  - examples/claude_vision_usage.rb
158
160
  - examples/data_builder_usage.rb
159
161
  - examples/gemini_usage.rb
160
162
  - examples/gemini_vision_usage.rb
161
163
  - examples/gpt_vision_usage.rb
162
164
  - examples/groq_usage.rb
165
+ - examples/ollama_params_usage.rb
163
166
  - examples/openrouter_vision_usage.rb
164
167
  - examples/prompt_registration.rb
165
168
  - examples/rag_usage.rb
File without changes