llm_conductor 1.4.1 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/docs/README.md ADDED
@@ -0,0 +1,42 @@
1
+ # LLM Conductor Documentation
2
+
3
+ Welcome to the LLM Conductor documentation. This directory contains detailed guides for advanced features.
4
+
5
+ ## Guides
6
+
7
+ ### [Custom Parameters](custom-parameters.md)
8
+ Learn how to fine-tune LLM generation with parameters like `temperature`, `top_p`, and more. Includes:
9
+ - Quick reference for common parameters
10
+ - Temperature guidelines
11
+ - Provider-specific parameters
12
+ - Best practices and use cases
13
+
14
+ - **Currently supported**: Ollama
15
+ - **Coming soon**: OpenAI, Anthropic, Gemini, Groq, OpenRouter, Z.ai
16
+
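A minimal sketch of what the guide covers (passing `params` through `LlmConductor.generate`, here with an assumed local `llama2` model served by Ollama):

```ruby
require 'llm_conductor'

# Low temperature keeps the output focused and repeatable.
response = LlmConductor.generate(
  model: 'llama2',          # assumes a local Ollama model named 'llama2'
  prompt: 'Summarize the benefits of unit testing in two sentences.',
  vendor: :ollama,
  params: { temperature: 0.3 }
)

puts response.output
```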
17
+ ### [Vision Support](vision-support.md)
18
+ Complete guide to using vision/multimodal capabilities. Includes:
19
+ - Sending images with text prompts
20
+ - Multiple image handling
21
+ - Provider-specific formats
22
+ - Base64 encoded images
23
+ - Best practices for vision tasks
24
+
25
+ **Supported providers**: OpenAI (GPT-4o), Anthropic (Claude 3), Gemini, OpenRouter, Z.ai (GLM-4.5V)
26
+
27
+ ## Quick Links
28
+
29
+ - [Main README](../README.md) - Getting started and basic usage
30
+ - [Examples](../examples/) - Working code examples
31
+ - [API Reference](https://rubydoc.info/gems/llm_conductor) - Full API documentation
32
+
33
+ ## Contributing
34
+
35
+ Found an issue or want to improve documentation? Please contribute:
36
+
37
+ 1. Fork the repository
38
+ 2. Make your changes
39
+ 3. Submit a pull request
40
+
41
+ See [Contributing Guidelines](../README.md#contributing) for more details.
42
+
data/docs/custom-parameters.md ADDED
@@ -0,0 +1,352 @@
1
+ # Custom Parameters Guide
2
+
3
+ Fine-tune LLM generation behavior with parameters like `temperature`, `top_p`, and more.
4
+
5
+ ## 🚀 Quick Reference
6
+
7
+ ### Temperature Guide
8
+ | Value | Behavior | Use Case |
9
+ |-------|----------|----------|
10
+ | 0.0 | Deterministic | Testing, data extraction |
11
+ | 0.3 | Very focused | Factual Q&A, summaries |
12
+ | 0.7 | **Balanced (recommended)** | General purpose |
13
+ | 0.9 | Creative | Stories, brainstorming |
14
+
15
+ ### Common Patterns
16
+
17
+ ```ruby
18
+ # Deterministic (testing)
19
+ params: { temperature: 0.0, seed: 42 }
20
+
21
+ # Creative writing
22
+ params: { temperature: 0.9, top_p: 0.95, repeat_penalty: 1.2 }
23
+
24
+ # Factual/precise
25
+ params: { temperature: 0.3, top_p: 0.85 }
26
+ ```
27
+
28
+ ### Provider Support
29
+ | Provider | Status |
30
+ |----------|--------|
31
+ | Ollama | ✅ Supported |
32
+ | OpenAI, Anthropic, Gemini, etc. | 🔜 Coming soon |
33
+
34
+ ---
35
+
36
+ ## Overview
37
+
38
+ Custom parameters allow you to control various aspects of LLM generation:
39
+ - **Temperature**: Controls randomness (0.0 = deterministic, higher = more creative)
40
+ - **Top-p/Top-k**: Controls diversity via nucleus/top-k sampling
41
+ - **Max tokens**: Limits the length of generated responses
42
+ - **And more**: Each provider supports different parameters
43
+
44
+ ## Quick Start
45
+
46
+ ```ruby
47
+ require 'llm_conductor'
48
+
49
+ # Generate with custom temperature
50
+ response = LlmConductor.generate(
51
+ model: 'llama2',
52
+ prompt: 'Write a creative story.',
53
+ vendor: :ollama,
54
+ params: { temperature: 0.9 }
55
+ )
56
+ ```
57
+
58
+ ## Usage Examples
59
+
60
+ ### 1. Simple Prompt with Parameters
61
+
62
+ ```ruby
63
+ # Low temperature for focused, deterministic output
64
+ response = LlmConductor.generate(
65
+ model: 'llama2',
66
+ prompt: 'What is 2 + 2?',
67
+ vendor: :ollama,
68
+ params: { temperature: 0.0 }
69
+ )
70
+
71
+ # High temperature for creative output
72
+ response = LlmConductor.generate(
73
+ model: 'llama2',
74
+ prompt: 'Write a poem about the ocean.',
75
+ vendor: :ollama,
76
+ params: { temperature: 0.9 }
77
+ )
78
+ ```
79
+
80
+ ### 2. Multiple Parameters
81
+
82
+ ```ruby
83
+ response = LlmConductor.generate(
84
+ model: 'llama2',
85
+ prompt: 'Explain quantum computing.',
86
+ vendor: :ollama,
87
+ params: {
88
+ temperature: 0.7,
89
+ top_p: 0.9,
90
+ top_k: 40,
91
+ num_predict: 200, # Max tokens
92
+ repeat_penalty: 1.1 # Penalize repetition
93
+ }
94
+ )
95
+ ```
96
+
97
+ ### 3. Using build_client with Parameters
98
+
99
+ ```ruby
100
+ # Create a client with custom parameters
101
+ client = LlmConductor.build_client(
102
+ model: 'llama2',
103
+ type: :custom,
104
+ vendor: :ollama,
105
+ params: {
106
+ temperature: 0.3,
107
+ repeat_penalty: 1.2
108
+ }
109
+ )
110
+
111
+ # Use the client
112
+ response = client.generate_simple(
113
+ prompt: 'List 5 benefits of exercise.'
114
+ )
115
+ ```
116
+
117
+ ### 4. Template-Based Generation with Parameters
118
+
119
+ ```ruby
120
+ # Using params with template-based generation
121
+ response = LlmConductor.generate(
122
+ model: 'llama2',
123
+ type: :summarize_text,
124
+ data: {
125
+ content: 'Long article text here...',
126
+ max_length: 100
127
+ },
128
+ vendor: :ollama,
129
+ params: {
130
+ temperature: 0.5,
131
+ num_predict: 150
132
+ }
133
+ )
134
+ ```
135
+
136
+ ## Ollama Parameters Reference
137
+
138
+ Below are common parameters supported by Ollama. For a complete list, see the [Ollama documentation](https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values).
139
+
140
+ ### Core Parameters
141
+
142
+ | Parameter | Type | Default | Description |
143
+ |-----------|------|---------|-------------|
144
+ | `temperature` | Float | 0.8 | Controls randomness. 0.0 = deterministic, 2.0 = very random |
145
+ | `top_p` | Float | 0.9 | Nucleus sampling. Controls diversity (0.0-1.0) |
146
+ | `top_k` | Integer | 40 | Top-k sampling. Limits vocabulary to top K tokens |
147
+ | `num_predict` | Integer | 128 | Maximum number of tokens to generate |
148
+ | `repeat_penalty` | Float | 1.1 | Penalizes repetition. 1.0 = no penalty |
149
+
150
+ ### Advanced Parameters
151
+
152
+ | Parameter | Type | Description |
153
+ |-----------|------|-------------|
154
+ | `seed` | Integer | Random seed for reproducibility |
155
+ | `stop` | Array of Strings | Stop sequences that end generation |
156
+ | `tfs_z` | Float | Tail-free sampling parameter |
157
+ | `num_ctx` | Integer | Context window size |
158
+ | `num_gpu` | Integer | Number of layers to offload to GPU |
159
+ | `num_thread` | Integer | Number of threads to use |
160
+ | `repeat_last_n` | Integer | How far back to look when applying the repetition penalty |
161
+ | `mirostat` | Integer | Enable Mirostat sampling (0, 1, or 2) |
162
+ | `mirostat_tau` | Float | Mirostat target entropy |
163
+ | `mirostat_eta` | Float | Mirostat learning rate |
164
+
165
+ ## Common Use Cases
166
+
167
+ ### Deterministic Output (Testing, Structured Data)
168
+
169
+ For consistent, reproducible results:
170
+
171
+ ```ruby
172
+ response = LlmConductor.generate(
173
+ model: 'llama2',
174
+ prompt: 'Extract the email addresses from this text...',
175
+ vendor: :ollama,
176
+ params: {
177
+ temperature: 0.0,
178
+ seed: 42 # Optional: ensures reproducibility
179
+ }
180
+ )
181
+ ```
182
+
183
+ ### Creative Writing
184
+
185
+ For more varied, creative responses:
186
+
187
+ ```ruby
188
+ response = LlmConductor.generate(
189
+ model: 'llama2',
190
+ prompt: 'Write a short science fiction story.',
191
+ vendor: :ollama,
192
+ params: {
193
+ temperature: 0.9,
194
+ top_p: 0.95,
195
+ repeat_penalty: 1.2
196
+ }
197
+ )
198
+ ```
199
+
200
+ ### Balanced (General Purpose)
201
+
202
+ Good middle ground for most tasks:
203
+
204
+ ```ruby
205
+ response = LlmConductor.generate(
206
+ model: 'llama2',
207
+ prompt: 'Explain how photosynthesis works.',
208
+ vendor: :ollama,
209
+ params: {
210
+ temperature: 0.7,
211
+ top_p: 0.9
212
+ }
213
+ )
214
+ ```
215
+
216
+ ### Long-Form Content
217
+
218
+ For generating longer responses:
219
+
220
+ ```ruby
221
+ response = LlmConductor.generate(
222
+ model: 'llama2',
223
+ prompt: 'Write a detailed guide on...',
224
+ vendor: :ollama,
225
+ params: {
226
+ temperature: 0.8,
227
+ num_predict: 1000, # Allow up to 1000 tokens
228
+ repeat_penalty: 1.1
229
+ }
230
+ )
231
+ ```
232
+
233
+ ## Best Practices
234
+
235
+ ### 1. Temperature Guidelines
236
+
237
+ - **0.0-0.3**: Deterministic, focused, factual responses
238
+ - Use for: Data extraction, structured output, factual questions
239
+ - **0.4-0.7**: Balanced responses with some variation
240
+ - Use for: General Q&A, summaries, explanations
241
+ - **0.8-1.2**: Creative, diverse responses
242
+ - Use for: Creative writing, brainstorming, storytelling
243
+ - **1.3+**: Very random, experimental
244
+ - Use with caution: May produce incoherent output
245
+
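One way to keep these ranges consistent across a codebase is a small lookup of presets (a hypothetical helper, not part of the gem):

```ruby
# Hypothetical presets reflecting the guidelines above.
TEMPERATURE_PRESETS = {
  extraction: 0.0, # deterministic, structured output
  factual:    0.3, # focused Q&A, summaries
  general:    0.7, # balanced default
  creative:   0.9  # stories, brainstorming
}.freeze

response = LlmConductor.generate(
  model: 'llama2',
  prompt: 'Summarize this article...',
  vendor: :ollama,
  params: { temperature: TEMPERATURE_PRESETS[:factual] }
)
```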
246
+ ### 2. Combining Parameters
247
+
248
+ Temperature and top_p work together:
249
+
250
+ ```ruby
251
+ # Conservative: Focused but with some diversity
252
+ params: { temperature: 0.5, top_p: 0.9 }
253
+
254
+ # Balanced: Good for most use cases
255
+ params: { temperature: 0.7, top_p: 0.9, top_k: 40 }
256
+
257
+ # Creative: Maximum diversity
258
+ params: { temperature: 1.0, top_p: 0.95 }
259
+ ```
260
+
261
+ ### 3. Reproducibility
262
+
263
+ For testing or debugging, use a fixed seed:
264
+
265
+ ```ruby
266
+ params: {
267
+ temperature: 0.5,
268
+ seed: 12345 # Same seed + same params = same output
269
+ }
270
+ ```
271
+
272
+ ### 4. Performance Tuning
273
+
274
+ ```ruby
275
+ # Optimize for speed
276
+ params: {
277
+ num_predict: 100, # Limit output length
278
+ num_thread: 8 # Use more CPU threads
279
+ }
280
+
281
+ # Optimize for quality
282
+ params: {
283
+ num_ctx: 4096, # Larger context window
284
+ repeat_penalty: 1.2 # Reduce repetition
285
+ }
286
+ ```
287
+
288
+ ## Configuration
289
+
290
+ Setting default parameters at the configuration level is a planned enhancement:
291
+
292
+ ```ruby
293
+ # Coming soon - configuration-level defaults
294
+ LlmConductor.configure do |config|
295
+ config.ollama(
296
+ base_url: 'http://localhost:11434',
297
+ default_params: { temperature: 0.7, top_p: 0.9 }
298
+ )
299
+ end
300
+ ```
301
+
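Until configuration-level defaults are available, building one client with `params` and reusing it gives a similar effect:

```ruby
# Build a client once with preferred defaults, then reuse it for multiple calls.
client = LlmConductor.build_client(
  model: 'llama2',
  type: :custom,
  vendor: :ollama,
  params: { temperature: 0.7, top_p: 0.9 }
)

# Every call through this client applies the same parameter defaults.
puts client.generate_simple(prompt: 'Explain HTTP caching briefly.').output
puts client.generate_simple(prompt: 'Explain DNS resolution briefly.').output
```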
302
+ ## Parameter Validation
303
+
304
+ The gem passes parameters directly to the underlying provider. Invalid parameters will either:
305
+ - Be silently ignored by the provider (most common), or
306
+ - Cause the provider API to return an error
307
+
308
+ Always refer to your provider's documentation for supported parameters.
309
+
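For example, a quick sketch of the "ignored" case (assuming Ollama simply drops the unrecognized key rather than raising):

```ruby
# `made_up_param` is not a real Ollama option; most providers just ignore it.
response = LlmConductor.generate(
  model: 'llama2',
  prompt: 'Say hello.',
  vendor: :ollama,
  params: { temperature: 0.7, made_up_param: 123 }
)

puts response.success? # the request still succeeds; the unknown key has no effect
```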
310
+ ## Future Provider Support
311
+
312
+ Currently, custom parameters are fully supported for:
313
+ - ✅ **Ollama**
314
+
315
+ Coming soon:
316
+ - 🔜 OpenAI (GPT)
317
+ - 🔜 Anthropic (Claude)
318
+ - 🔜 Google (Gemini)
319
+ - 🔜 Groq
320
+ - 🔜 OpenRouter
321
+ - 🔜 Z.ai
322
+
323
+ ## Troubleshooting
324
+
325
+ ### Parameters Not Working
326
+
327
+ 1. Check parameter spelling (case-sensitive)
328
+ 2. Verify your provider supports the parameter
329
+ 3. Check parameter value types (integer vs float vs string)
330
+
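For point 3, the expected types for common Ollama parameters look like this (illustrative values only):

```ruby
params: {
  temperature: 0.7,      # Float
  top_p: 0.9,            # Float
  top_k: 40,             # Integer
  num_predict: 200,      # Integer
  stop: ["\n\n", 'END']  # Array of Strings
}
# e.g. temperature: '0.7' (a String) may be ignored or rejected by the provider
```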
331
+ ### Unexpected Output
332
+
333
+ 1. Try lowering `temperature` for more consistent results
334
+ 2. Adjust `top_p` and `top_k` for better quality
335
+ 3. Increase `repeat_penalty` if output is too repetitive
336
+
337
+ ### Performance Issues
338
+
339
+ 1. Reduce `num_predict` to limit output length
340
+ 2. Adjust `num_thread` based on your CPU
341
+ 3. Use `num_gpu` to offload processing to GPU
342
+
343
+ ## Examples
344
+
345
+ See the complete example file: [examples/ollama_params_usage.rb](../examples/ollama_params_usage.rb)
346
+
347
+ ## Resources
348
+
349
+ - [Ollama Parameters Documentation](https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values)
350
+ - [Temperature in Language Models](https://docs.cohere.com/docs/temperature)
351
+ - [Nucleus Sampling (Top-p)](https://arxiv.org/abs/1904.09751)
352
+
data/examples/ollama_params_usage.rb ADDED
@@ -0,0 +1,99 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ # Example demonstrating how to use custom parameters with Ollama client
5
+ require_relative '../lib/llm_conductor'
6
+
7
+ # Configure Ollama (optional if using default localhost)
8
+ LlmConductor.configure do |config|
9
+ config.ollama(base_url: ENV.fetch('OLLAMA_BASE_URL', 'http://localhost:11434'))
10
+ end
11
+
12
+ puts '=== Example 1: Using temperature parameter ==='
13
+ # Generate with custom temperature
14
+ response = LlmConductor.generate(
15
+ model: 'llama2',
16
+ prompt: 'Write a creative story about a robot learning to paint.',
17
+ vendor: :ollama,
18
+ params: { temperature: 0.9 }
19
+ )
20
+
21
+ puts response.output
22
+ puts "Tokens used: #{response.total_tokens}"
23
+ puts "\n"
24
+
25
+ puts '=== Example 2: Using multiple parameters ==='
26
+ # Generate with multiple custom parameters
27
+ response2 = LlmConductor.generate(
28
+ model: 'llama2',
29
+ prompt: 'Explain the concept of artificial intelligence in simple terms.',
30
+ vendor: :ollama,
31
+ params: {
32
+ temperature: 0.7,
33
+ top_p: 0.9,
34
+ top_k: 40,
35
+ num_predict: 200 # Max tokens to generate
36
+ }
37
+ )
38
+
39
+ puts response2.output
40
+ puts "Input tokens: #{response2.input_tokens}"
41
+ puts "Output tokens: #{response2.output_tokens}"
42
+ puts "\n"
43
+
44
+ puts '=== Example 3: Using params with build_client ==='
45
+ # You can also use params when building a client directly
46
+ client = LlmConductor.build_client(
47
+ model: 'llama2',
48
+ type: :custom,
49
+ vendor: :ollama,
50
+ params: {
51
+ temperature: 0.3, # Lower temperature for more focused output
52
+ repeat_penalty: 1.1
53
+ }
54
+ )
55
+
56
+ response3 = client.generate_simple(
57
+ prompt: 'List 5 benefits of regular exercise.'
58
+ )
59
+
60
+ puts response3.output
61
+ puts "Success: #{response3.success?}"
62
+ puts "\n"
63
+
64
+ puts '=== Example 4: Low temperature for deterministic output ==='
65
+ # Use low temperature for more deterministic results
66
+ response4 = LlmConductor.generate(
67
+ model: 'llama2',
68
+ prompt: 'What is 2 + 2?',
69
+ vendor: :ollama,
70
+ params: { temperature: 0.0 }
71
+ )
72
+
73
+ puts response4.output
74
+ puts "\n"
75
+
76
+ puts '=== Available Ollama Parameters ==='
77
+ puts <<~PARAMS
78
+ Common parameters you can use with Ollama:
79
+
80
+ - temperature: Controls randomness (0.0 to 2.0, default: 0.8)
81
+ Lower = more focused and deterministic
82
+ Higher = more random and creative
83
+
84
+ - top_p: Nucleus sampling (0.0 to 1.0, default: 0.9)
85
+ Controls diversity via nucleus sampling
86
+
87
+ - top_k: Top-k sampling (default: 40)
88
+ Limits vocabulary to top K tokens
89
+
90
+ - num_predict: Maximum tokens to generate (default: 128)
91
+
92
+ - repeat_penalty: Penalizes repetition (default: 1.1)
93
+
94
+ - seed: Random seed for reproducibility
95
+
96
+ - stop: Stop sequences (array of strings)
97
+
98
+ For more parameters, see: https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values
99
+ PARAMS
@@ -3,10 +3,10 @@
3
3
  module LlmConductor
4
4
  # Factory class for creating appropriate LLM client instances based on model and vendor
5
5
  class ClientFactory
6
- def self.build(model:, type:, vendor: nil)
6
+ def self.build(model:, type:, vendor: nil, params: {})
7
7
  vendor ||= determine_vendor(model)
8
8
  client_class = client_class_for_vendor(vendor)
9
- client_class.new(model:, type:)
9
+ client_class.new(model:, type:, params:)
10
10
  end
11
11
 
12
12
  def self.client_class_for_vendor(vendor)
@@ -11,11 +11,12 @@ module LlmConductor
11
11
  class BaseClient
12
12
  include Prompts
13
13
 
14
- attr_reader :model, :type
14
+ attr_reader :model, :type, :params
15
15
 
16
- def initialize(model:, type:)
16
+ def initialize(model:, type:, params: {})
17
17
  @model = model
18
18
  @type = type
19
+ @params = params
19
20
  end
20
21
 
21
22
  def generate(data:)
@@ -7,7 +7,8 @@ module LlmConductor
7
7
  private
8
8
 
9
9
  def generate_content(prompt)
10
- client.generate({ model:, prompt:, stream: false }).first['response']
10
+ request_params = { model:, prompt:, stream: false }.merge(params)
11
+ client.generate(request_params).first['response']
11
12
  end
12
13
 
13
14
  def client
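Because the caller's `params` are merged last, they are appended to (and can override) the base request keys; an illustrative example of plain `Hash#merge`, not code from the gem:

```ruby
# Keys from the argument hash win on conflict, so user params take precedence.
base = { model: 'llama2', prompt: 'Hi', stream: false }
base.merge(temperature: 0.9, num_predict: 50)
# => { model: 'llama2', prompt: 'Hi', stream: false, temperature: 0.9, num_predict: 50 }
```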
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module LlmConductor
4
- VERSION = '1.4.1'
4
+ VERSION = '1.5.0'
5
5
  end
data/lib/llm_conductor.rb CHANGED
@@ -24,35 +24,37 @@ module LlmConductor
24
24
  class Error < StandardError; end
25
25
 
26
26
  # Main entry point for creating LLM clients
27
- def self.build_client(model:, type:, vendor: nil)
28
- ClientFactory.build(model:, type:, vendor:)
27
+ def self.build_client(model:, type:, vendor: nil, params: {})
28
+ ClientFactory.build(model:, type:, vendor:, params:)
29
29
  end
30
30
 
31
31
  # Unified generate method supporting both simple prompts and legacy template-based generation
32
- def self.generate(model: nil, prompt: nil, type: nil, data: nil, vendor: nil)
32
+ # rubocop:disable Metrics/ParameterLists
33
+ def self.generate(model: nil, prompt: nil, type: nil, data: nil, vendor: nil, params: {})
33
34
  if prompt && !type && !data
34
- generate_simple_prompt(model:, prompt:, vendor:)
35
+ generate_simple_prompt(model:, prompt:, vendor:, params:)
35
36
  elsif type && data && !prompt
36
- generate_with_template(model:, type:, data:, vendor:)
37
+ generate_with_template(model:, type:, data:, vendor:, params:)
37
38
  else
38
39
  raise ArgumentError,
39
40
  "Invalid arguments. Use either: generate(prompt: 'text') or generate(type: :custom, data: {...})"
40
41
  end
41
42
  end
43
+ # rubocop:enable Metrics/ParameterLists
42
44
 
43
45
  class << self
44
46
  private
45
47
 
46
- def generate_simple_prompt(model:, prompt:, vendor:)
48
+ def generate_simple_prompt(model:, prompt:, vendor:, params:)
47
49
  model ||= configuration.default_model
48
50
  vendor ||= ClientFactory.determine_vendor(model)
49
51
  client_class = client_class_for_vendor(vendor)
50
- client = client_class.new(model:, type: :direct)
52
+ client = client_class.new(model:, type: :direct, params:)
51
53
  client.generate_simple(prompt:)
52
54
  end
53
55
 
54
- def generate_with_template(model:, type:, data:, vendor:)
55
- client = build_client(model:, type:, vendor:)
56
+ def generate_with_template(model:, type:, data:, vendor:, params:)
57
+ client = build_client(model:, type:, vendor:, params:)
56
58
  client.generate(data:)
57
59
  end
58
60
 
metadata CHANGED
@@ -1,13 +1,13 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: llm_conductor
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.4.1
4
+ version: 1.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ben Zheng
8
8
  bindir: exe
9
9
  cert_chain: []
10
- date: 2025-12-05 00:00:00.000000000 Z
10
+ date: 2025-12-12 00:00:00.000000000 Z
11
11
  dependencies:
12
12
  - !ruby/object:Gem::Dependency
13
13
  name: activesupport
@@ -152,14 +152,17 @@ files:
152
152
  - LICENSE
153
153
  - README.md
154
154
  - Rakefile
155
- - VISION_USAGE.md
156
155
  - config/initializers/llm_conductor.rb
156
+ - docs/README.md
157
+ - docs/custom-parameters.md
158
+ - docs/vision-support.md
157
159
  - examples/claude_vision_usage.rb
158
160
  - examples/data_builder_usage.rb
159
161
  - examples/gemini_usage.rb
160
162
  - examples/gemini_vision_usage.rb
161
163
  - examples/gpt_vision_usage.rb
162
164
  - examples/groq_usage.rb
165
+ - examples/ollama_params_usage.rb
163
166
  - examples/openrouter_vision_usage.rb
164
167
  - examples/prompt_registration.rb
165
168
  - examples/rag_usage.rb
File without changes