dspy-code_act 0.29.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: b678ef45c8a1fef48c4144a37ab650d16c5b80f0d32df00cc88e5dd6e1929184
4
+ data.tar.gz: 0e7c257bd9eac064e4be71fbc83f79fc306da6526ce938fd465d6bc0cb4303ce
5
+ SHA512:
6
+ metadata.gz: 529801091e676b46348fa21e5635cd8ca4fd35a5aada9d08e459768e122d876f0b657765ffc4d62429b7876545bd6de9d32fb8c3b59bba238a8f97b585fa6d53
7
+ data.tar.gz: 25eccebf8fc46fdd5a91ecf1d7c5226d91ab63902f64441359376830828a0229b2844e9202ca0b048abb2aeaa690e20144743300f33cdf5339ebe6c7ce79a1b6
data/LICENSE ADDED
@@ -0,0 +1,45 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Vicente Services SL
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
22
+
23
+ This project is a Ruby port of the original Python [DSPy library](https://github.com/stanfordnlp/dspy), which is licensed under the MIT License:
24
+
25
+ MIT License
26
+
27
+ Copyright (c) 2023 Stanford Future Data Systems
28
+
29
+ Permission is hereby granted, free of charge, to any person obtaining a copy
30
+ of this software and associated documentation files (the "Software"), to deal
31
+ in the Software without restriction, including without limitation the rights
32
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
33
+ copies of the Software, and to permit persons to whom the Software is
34
+ furnished to do so, subject to the following conditions:
35
+
36
+ The above copyright notice and this permission notice shall be included in all
37
+ copies or substantial portions of the Software.
38
+
39
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
40
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
41
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
42
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
43
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
44
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
45
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,249 @@
1
+ # DSPy.rb
2
+
3
+ [![Gem Version](https://img.shields.io/gem/v/dspy)](https://rubygems.org/gems/dspy)
4
+ [![Total Downloads](https://img.shields.io/gem/dt/dspy)](https://rubygems.org/gems/dspy)
5
+ [![Build Status](https://img.shields.io/github/actions/workflow/status/vicentereig/dspy.rb/ruby.yml?branch=main&label=build)](https://github.com/vicentereig/dspy.rb/actions/workflows/ruby.yml)
6
+ [![Documentation](https://img.shields.io/badge/docs-vicentereig.github.io%2Fdspy.rb-blue)](https://vicentereig.github.io/dspy.rb/)
7
+
8
+ > [!NOTE]
9
+ > The core Prompt Engineering Framework is production-ready with
10
+ > comprehensive documentation. I am focusing now on educational content on systematic Prompt Optimization and Context Engineering.
11
+ > Your feedback is invaluable. If you encounter issues, please open an [issue](https://github.com/vicentereig/dspy.rb/issues). If you have suggestions, open a [new thread](https://github.com/vicentereig/dspy.rb/discussions).
12
+ >
13
+ > If you want to contribute, feel free to reach out to me to coordinate efforts: hey at vicente.services
14
+ >
15
+ > And, yes, this is 100% a legit project. :)
16
+
17
+
18
+ **Build reliable LLM applications in idiomatic Ruby using composable, type-safe modules.**
19
+
20
+ The Ruby framework for programming with large language models. DSPy.rb brings structured LLM programming, programmatic Prompt Engineering, and Context Engineering to Ruby developers.
21
+ Instead of wrestling with prompt strings and parsing responses, you define typed signatures using idiomatic Ruby to compose and decompose AI Workflows and AI Agents.
22
+
23
+ **Prompts are just Functions.** Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you
24
+ the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular
25
+ signatures and let the framework handle the messy details.
26
+
27
+ DSPy.rb is an idiomatic Ruby surgical port of Stanford's [DSPy framework](https://github.com/stanfordnlp/dspy). While implementing
28
+ the core concepts of signatures, predictors, and the main optimization algorithms from the original Python library, DSPy.rb embraces Ruby
29
+ conventions and adds Ruby-specific innovations like a Sorbet-based type system, ReAct loops, and production-ready integrations like non-blocking OpenTelemetry instrumentation.
30
+
31
+ **What do you get?** Ruby LLM applications that actually scale and don't break when you sneeze.
32
+
33
+ Check the [examples](examples/) and take them for a spin!
34
+
35
+ ## Your First DSPy Program
36
+ ### Installation
37
+
38
+ Add to your Gemfile:
39
+
40
+ ```ruby
41
+ gem 'dspy'
42
+ ```
43
+
44
+ and
45
+
46
+ ```bash
47
+ bundle install
48
+ ```
49
+ ### Your First Reliable Predictor
50
+
51
+ ```ruby
52
+
53
+ # Configure DSPy globally to use your preferred LLM - you can override this at the instance level.
54
+ DSPy.configure do |c|
55
+ c.lm = DSPy::LM.new('openai/gpt-4o-mini',
56
+ api_key: ENV['OPENAI_API_KEY'],
57
+ structured_outputs: true) # Enable OpenAI's native JSON mode
58
+ end
59
+
60
+ # Define a signature for sentiment classification - instead of writing a full prompt!
61
+ class Classify < DSPy::Signature
62
+ description "Classify sentiment of a given sentence." # sets the goal of the underlying prompt
63
+
64
+ class Sentiment < T::Enum
65
+ enums do
66
+ Positive = new('positive')
67
+ Negative = new('negative')
68
+ Neutral = new('neutral')
69
+ end
70
+ end
71
+
72
+ # Structured Inputs: makes sure you are sending only valid prompt inputs to your model
73
+ input do
74
+ const :sentence, String, description: 'The sentence to analyze'
75
+ end
76
+
77
+ # Structured Outputs: your predictor will validate the output of the model too.
78
+ output do
79
+ const :sentiment, Sentiment, description: 'The sentiment of the sentence'
80
+ const :confidence, Float, description: 'A number between 0.0 and 1.0'
81
+ end
82
+ end
83
+
84
+ # Wire it to the simplest prompting technique - a Predict.
85
+ classify = DSPy::Predict.new(Classify)
86
+ # It may raise an error if you pass invalid inputs or the LLM returns invalid outputs.
87
+ result = classify.call(sentence: "This book was super fun to read!")
88
+
89
+ puts result.sentiment # => #<Sentiment::Positive>
90
+ puts result.confidence # => 0.85
91
+ ```
92
+
93
+ ### Access to 200+ Models Across 5 Providers
94
+
95
+ DSPy.rb provides unified access to major LLM providers with provider-specific optimizations:
96
+
97
+ ```ruby
98
+ # OpenAI (GPT-4, GPT-4o, GPT-4o-mini, GPT-5, etc.)
99
+ DSPy.configure do |c|
100
+ c.lm = DSPy::LM.new('openai/gpt-4o-mini',
101
+ api_key: ENV['OPENAI_API_KEY'],
102
+ structured_outputs: true) # Native JSON mode
103
+ end
104
+
105
+ # Google Gemini (Gemini 1.5 Pro, Flash, Gemini 2.0, etc.)
106
+ DSPy.configure do |c|
107
+ c.lm = DSPy::LM.new('gemini/gemini-2.5-flash',
108
+ api_key: ENV['GEMINI_API_KEY'],
109
+ structured_outputs: true) # Native structured outputs
110
+ end
111
+
112
+ # Anthropic Claude (Claude 3.5, Claude 4, etc.)
113
+ DSPy.configure do |c|
114
+ c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-5-20250929',
115
+ api_key: ENV['ANTHROPIC_API_KEY'],
116
+ structured_outputs: true) # Tool-based extraction (default)
117
+ end
118
+
119
+ # Ollama - Run any local model (Llama, Mistral, Gemma, etc.)
120
+ DSPy.configure do |c|
121
+ c.lm = DSPy::LM.new('ollama/llama3.2') # Free, runs locally, no API key needed
122
+ end
123
+
124
+ # OpenRouter - Access to 200+ models from multiple providers
125
+ DSPy.configure do |c|
126
+ c.lm = DSPy::LM.new('openrouter/deepseek/deepseek-chat-v3.1:free',
127
+ api_key: ENV['OPENROUTER_API_KEY'])
128
+ end
129
+ ```
130
+
131
+ ## What You Get
132
+
133
+ **Developer Experience:**
134
+ - LLM provider support using official Ruby clients:
135
+ - [OpenAI Ruby](https://github.com/openai/openai-ruby) with vision model support
136
+ - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby) with multimodal capabilities
137
+ - [Google Gemini API](https://ai.google.dev/) with native structured outputs
138
+ - [Ollama](https://ollama.com/) via OpenAI compatibility layer for local models
139
+ - **Multimodal Support** - Complete image analysis with DSPy::Image, type-safe bounding boxes, vision-capable models
140
+ - Runtime type checking with [Sorbet](https://sorbet.org/) including T::Enum and union types
141
+ - Type-safe tool definitions for ReAct agents
142
+ - Comprehensive instrumentation and observability
143
+
144
+ **Core Building Blocks:**
145
+ - **Signatures** - Define input/output schemas using Sorbet types with T::Enum and union type support
146
+ - **Predict** - LLM completion with structured data extraction and multimodal support
147
+ - **Chain of Thought** - Step-by-step reasoning for complex problems with automatic prompt optimization
148
+ - **ReAct** - Tool-using agents with type-safe tool definitions and error recovery
149
+ - **Module Composition** - Combine multiple LLM calls into production-ready workflows
150
+
151
+ **Optimization & Evaluation:**
152
+ - **Prompt Objects** - Manipulate prompts as first-class objects instead of strings
153
+ - **Typed Examples** - Type-safe training data with automatic validation
154
+ - **Evaluation Framework** - Advanced metrics beyond simple accuracy with error-resilient pipelines
155
+ - **MIPROv2 Optimization** - Advanced Bayesian optimization with Gaussian Processes, multiple optimization strategies, auto-config presets, and storage persistence
156
+
157
+ **Production Features:**
158
+ - **Reliable JSON Extraction** - Native structured outputs for OpenAI and Gemini, Anthropic tool-based extraction, and automatic strategy selection with fallback
159
+ - **Type-Safe Configuration** - Strategy enums with automatic provider optimization (Strict/Compatible modes)
160
+ - **Smart Retry Logic** - Progressive fallback with exponential backoff for handling transient failures
161
+ - **Zero-Config Langfuse Integration** - Set env vars and get automatic OpenTelemetry traces in Langfuse
162
+ - **Performance Caching** - Schema and capability caching for faster repeated operations
163
+ - **File-based Storage** - Optimization result persistence with versioning
164
+ - **Structured Logging** - JSON and key=value formats with span tracking
165
+
166
+ ## Recent Achievements
167
+
168
+ DSPy.rb has rapidly evolved from experimental to production-ready:
169
+
170
+ ### Foundation
171
+ - ✅ **JSON Parsing Reliability** - Native OpenAI structured outputs with adaptive retry logic and schema-aware fallbacks
172
+ - ✅ **Type-Safe Strategy Configuration** - Provider-optimized strategy selection and enum-backed optimizer presets
173
+ - ✅ **Core Module System** - Predict, ChainOfThought, ReAct with type safety (add `dspy-code_act` for Think-Code-Observe agents)
174
+ - ✅ **Production Observability** - OpenTelemetry, New Relic, and Langfuse integration
175
+ - ✅ **Advanced Optimization** - MIPROv2 with Bayesian optimization, Gaussian Processes, and multi-mode search
176
+
177
+ ### Recent Advances
178
+ - ✅ **MIPROv2 ADE Integrity (v0.29.1)** - Stratified train/val/test splits, honest precision accounting, and enum-driven `--auto` presets with integration coverage
179
+ - ✅ **Instruction Deduplication (v0.29.1)** - Candidate generation now filters repeated programs so optimization logs highlight unique strategies
180
+ - ✅ **GEPA Teleprompter (v0.29.0)** - Genetic-Pareto reflective prompt evolution with merge proposer scheduling, reflective mutation, and ADE demo parity
181
+ - ✅ **Optimizer Utilities Parity (v0.29.0)** - Bootstrap strategies, dataset summaries, and Layer 3 utilities unlock multi-predictor programs on Ruby
182
+ - ✅ **Observability Hardening (v0.29.0)** - OTLP exporter runs on a single-thread executor preventing frozen SSL contexts without blocking spans
183
+ - ✅ **Documentation Refresh (v0.29.x)** - New GEPA guide plus ADE optimization docs covering presets, stratified splits, and error-handling defaults
184
+
185
+ **Current Focus Areas:**
186
+
187
+ ### Production Readiness
188
+ - 🚧 **Production Patterns** - Real-world usage validation and performance optimization
189
+ - 🚧 **Ruby Ecosystem Integration** - Rails integration, Sidekiq compatibility, deployment patterns
190
+
191
+ ### Community & Adoption
192
+ - 🚧 **Community Examples** - Real-world applications and case studies
193
+ - 🚧 **Contributor Experience** - Making it easier to contribute and extend
194
+ - 🚧 **Performance Benchmarks** - Comparative analysis vs other frameworks
195
+
196
+ **v1.0 Philosophy:**
197
+ v1.0 will be released after extensive production battle-testing, not after checking off features.
198
+ The API is already stable - v1.0 represents confidence in production reliability backed by real-world validation.
199
+
200
+
201
+ ## Documentation
202
+
203
+ 📖 **[Complete Documentation Website](https://vicentereig.github.io/dspy.rb/)**
204
+
205
+ ### LLM-Friendly Documentation
206
+
207
+ For LLMs and AI assistants working with DSPy.rb:
208
+ - **[llms.txt](https://vicentereig.github.io/dspy.rb/llms.txt)** - Concise reference optimized for LLMs
209
+ - **[llms-full.txt](https://vicentereig.github.io/dspy.rb/llms-full.txt)** - Comprehensive API documentation
210
+
211
+ ### Getting Started
212
+ - **[Installation & Setup](docs/src/getting-started/installation.md)** - Detailed installation and configuration
213
+ - **[Quick Start Guide](docs/src/getting-started/quick-start.md)** - Your first DSPy programs
214
+ - **[Core Concepts](docs/src/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
215
+
216
+ ### Prompt Engineering
217
+ - **[Signatures & Types](docs/src/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
218
+ - **[Predictors](docs/src/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
219
+ - **[Modules & Pipelines](docs/src/core-concepts/modules.md)** - Compose complex multi-stage workflows
220
+ - **[Multimodal Support](docs/src/core-concepts/multimodal.md)** - Image analysis with vision-capable models
221
+ - **[Examples & Validation](docs/src/core-concepts/examples.md)** - Type-safe training data
222
+ - **[Rich Types](docs/src/advanced/complex-types.md)** - Sorbet type integration with automatic coercion for structs, enums, and arrays
223
+ - **[Composable Pipelines](docs/src/advanced/pipelines.md)** - Manual module composition patterns
224
+
225
+ ### Prompt Optimization
226
+ - **[Evaluation Framework](docs/src/optimization/evaluation.md)** - Advanced metrics beyond simple accuracy
227
+ - **[Prompt Optimization](docs/src/optimization/prompt-optimization.md)** - Manipulate prompts as objects
228
+ - **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Advanced Bayesian optimization with Gaussian Processes
229
+ - **[GEPA Optimizer](docs/src/optimization/gepa.md)** *(beta)* - Reflective mutation with optional reflection LMs
230
+
231
+ ### Context Engineering
232
+ - **[Tools](docs/src/core-concepts/toolsets.md)** - Tool-wielding agents.
233
+ - **[Agentic Memory](docs/src/core-concepts/memory.md)** - Memory Tools & Agentic Loops
234
+ - **[RAG Patterns](docs/src/advanced/rag.md)** - Manual RAG implementation with external services
235
+
236
+ ### Production Features
237
+ - **[Observability](docs/src/production/observability.md)** - Zero-config Langfuse integration with a dedicated export worker that never blocks your LLMs
238
+ - **[Storage System](docs/src/production/storage.md)** - Persistence and optimization result storage
239
+ - **[Custom Metrics](docs/src/advanced/custom-metrics.md)** - Proc-based evaluation logic
240
+
241
+
242
+
243
+
244
+
245
+
246
+
247
+
248
+ ## License
249
+ This project is licensed under the MIT License.
@@ -0,0 +1,152 @@
1
+ # CodeAct: Dynamic Code Generation for DSPy.rb
2
+
3
+ CodeAct is a DSPy.rb module that enables agents to write and execute Ruby code dynamically. Unlike ReAct agents that rely on predefined tools, CodeAct generates tailored Ruby code on the fly to solve complex tasks.
4
+
5
+ ## When to Use CodeAct
6
+
7
+ - Choose CodeAct when you need creative problem solving, custom data transformations, or bespoke algorithms.
8
+ - Prefer ReAct when you have well-defined tools, must call external services, or need stricter safety guarantees.
9
+
10
+ ## Quick Start
11
+
12
+ ```ruby
13
+ require 'dspy'
14
+ require 'dspy/code_act'
15
+
16
+ DSPy.configure do |config|
17
+ config.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV.fetch('OPENAI_API_KEY'))
18
+ end
19
+
20
+ agent = DSPy::CodeAct.new
21
+
22
+ result = agent.forward(
23
+ task: "Calculate the Fibonacci sequence up to the 10th number",
24
+ context: "You have access to standard Ruby libraries"
25
+ )
26
+
27
+ puts result.final_answer
28
+ # => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
29
+
30
+ result.history.each do |step|
31
+ puts "Step #{step.step}"
32
+ puts "Thought: #{step.thought}"
33
+ puts "Code: #{step.ruby_code}"
34
+ puts "Result: #{step.execution_result}"
35
+ end
36
+ ```
37
+
38
+ ## Advanced Usage
39
+
40
+ ### Custom Execution Context
41
+
42
+ Provide structured data or helper methods with the `context` argument so generated code can reference them:
43
+
44
+ ```ruby
45
+ sales_data = {
46
+ "January" => [100, 150, 200],
47
+ "February" => [120, 180, 190],
48
+ "March" => [140, 210, 220]
49
+ }
50
+
51
+ agent = DSPy::CodeAct.new
52
+
53
+ result = agent.forward(
54
+ task: "Calculate the average sales for each month and report the best performer",
55
+ context: "You have access to `sales_data`, a hash keyed by month with numeric arrays",
56
+ data: { sales_data: sales_data }
57
+ )
58
+
59
+ puts result.final_answer
60
+ # => "March is the best performing month with an average of 190.0"
61
+ ```
62
+
63
+ ### Arbitrary Data Processing
64
+
65
+ ```ruby
66
+ csv_content = <<~CSV
67
+ name,email,department
68
+ John Doe,john@gmail.com,Engineering
69
+ Jane Smith,jane@company.com,Marketing
70
+ Bob Johnson,bob@gmail.com,Sales
71
+ CSV
72
+
73
+ result = agent.forward(
74
+ task: "Parse the CSV data and list gmail.com addresses",
75
+ context: "CSV data is available in `csv_content`",
76
+ data: { csv_content: csv_content }
77
+ )
78
+ ```
79
+
80
+ ## Safety Checklist
81
+
82
+ Executing arbitrary Ruby code is powerful but risky. Consider:
83
+
84
+ 1. **Sandboxing & Timeouts** – wrap execution in `Timeout.timeout` and restrict accessible objects.
85
+ 2. **Input Sanitization** – scrub user input to remove shell-outs and dangerous methods.
86
+ 3. **Resource Monitoring** – track memory and CPU usage to abort runaway code.
87
+
88
+ ```ruby
89
+ require 'timeout'
+
+ class SafeCodeAct < DSPy::CodeAct
90
+ def execute_code(ruby_code)
91
+ Timeout.timeout(5) { super }
92
+ rescue Timeout::Error
93
+ "Code execution timed out"
94
+ end
95
+ end
96
+ ```
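Item 2 of the checklist can be approximated with a denylist check that runs before any generated code is evaluated. The sketch below is an illustration under stated assumptions, not part of `dspy-code_act`: `DANGEROUS_PATTERNS`, `matched_patterns`, and `unsafe_code?` are hypothetical names, and a regex denylist is easy to bypass, so treat it as a first filter rather than a sandbox.

```ruby
# Hypothetical denylist of constructs that generated code should not contain.
DANGEROUS_PATTERNS = [
  /`[^`]*`/,                                             # backtick shell-outs
  /%x[\{\(\[]/,                                          # %x{...} shell-outs
  /\b(?:system|exec|spawn|fork)\b/,                      # process creation
  /\b(?:File|Dir|IO|Kernel|Process)\b/,                  # filesystem and process constants
  /\b(?:eval|instance_eval|class_eval|send|__send__)\b/  # dynamic evaluation and dispatch
].freeze

# Returns the patterns that matched, so callers can report why code was rejected.
def matched_patterns(ruby_code)
  DANGEROUS_PATTERNS.select { |pattern| pattern.match?(ruby_code) }
end

# True when at least one denylisted construct appears in the code.
def unsafe_code?(ruby_code)
  !matched_patterns(ruby_code).empty?
end

puts unsafe_code?("[1, 2, 3].sum")
puts unsafe_code?("system('rm -rf /')")
puts unsafe_code?("File.read('/etc/passwd')")
```

A check like this could run inside an overridden execution method before delegating to `super`, alongside the timeout shown above.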
97
+
98
+ ## Example: Sales Analysis Pipeline
99
+
100
+ ```ruby
101
+ class SalesAnalyzer < DSPy::Module
102
+ def initialize
103
+ @agent = DSPy::CodeAct.new
104
+ end
105
+
106
+ def analyze_trends(sales_data)
107
+ result = @agent.forward(
108
+ task: <<~TASK,
109
+ Analyze the sales data to:
110
+ 1. Compute month-over-month growth
111
+ 2. Identify seasonal patterns
112
+ 3. Predict next month's sales with simple linear regression
113
+ TASK
114
+ context: "You have access to `sales_data` and standard Ruby libraries",
115
+ data: { sales_data: sales_data }
116
+ )
117
+
118
+ {
119
+ analysis: result.final_answer,
120
+ code_steps: result.history.map(&:ruby_code),
121
+ execution_time: result.metadata[:total_time]
122
+ }
123
+ end
124
+ end
125
+ ```
126
+
127
+ ## Debugging
128
+
129
+ Enable structured logging to observe each iteration:
130
+
131
+ ```ruby
132
+ DSPy.configure do |config|
133
+ config.logger = Dry.Logger(:dspy, formatter: :json) do |logger|
134
+ logger.add_backend(level: :debug, stream: $stdout)
135
+ end
136
+ end
137
+ ```
138
+
139
+ Inspect `result.history` entry-by-entry to review generated code, observations, and errors.
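As a rough sketch of that inspection, the snippet below filters history entries for failed steps. It runs standalone by using a plain `Struct` as a stand-in for the gem's history entries. The field names mirror `CodeActHistoryEntry` (`step`, `thought`, `ruby_code`, `execution_result`, `error_message`), while `HistoryEntry`, `failed_steps`, and the sample data are hypothetical. The packaged source serializes entries with `to_h` before returning them, so with the real gem the entries may be hashes, in which case `entry[:error_message]` would replace the accessor calls.

```ruby
# Stand-in for the gem's history entries; field names follow CodeActHistoryEntry.
HistoryEntry = Struct.new(:step, :thought, :ruby_code, :execution_result, :error_message,
                          keyword_init: true)

# Sample data invented for illustration only.
history = [
  HistoryEntry.new(step: 1, thought: "Try integer division", ruby_code: "10 / 0",
                   execution_result: nil, error_message: "divided by 0"),
  HistoryEntry.new(step: 2, thought: "Guard against zero", ruby_code: "10 / 2",
                   execution_result: "5", error_message: "")
]

# An empty error_message means the step succeeded.
def failed_steps(history)
  history.reject { |entry| entry.error_message.to_s.empty? }
end

failed_steps(history).each do |entry|
  puts "Step #{entry.step} failed: #{entry.error_message}"
  puts "  Code: #{entry.ruby_code}"
end
```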
140
+
141
+ ## Limitations & Roadmap
142
+
143
+ - Basic sandboxing—do not run untrusted input without additional guards.
144
+ - No support for external gem loading during execution.
145
+ - Future roadmap targets hardened sandboxing, async execution, and richer explanations.
146
+
147
+ ## Best Practices
148
+
149
+ 1. Start with simple prompts before moving to complex tasks.
150
+ 2. Provide precise context and structured data for better code generation.
151
+ 3. Always validate outputs and enforce timeouts or resource caps.
152
+ 4. Capture telemetry (Observability, Langfuse) for production usage.
@@ -0,0 +1,7 @@
1
+ # frozen_string_literal: true
2
+
3
+ module DSPy
4
+ module CodeActVersion
5
+ VERSION = DSPy::VERSION
6
+ end
7
+ end
@@ -0,0 +1,486 @@
1
+ # typed: strict
2
+ # frozen_string_literal: true
3
+
4
+ require 'sorbet-runtime'
5
+ require 'dspy' unless defined?(DSPy)
6
+ require 'dspy/predict'
7
+ require 'dspy/signature'
8
+ require 'json'
9
+ require 'stringio'
10
+ require 'dspy/mixins/struct_builder'
11
+ require 'dspy/type_serializer'
12
+
13
+ module DSPy
14
+ # Define a simple struct for CodeAct history entries with proper type annotations
15
+ class CodeActHistoryEntry < T::Struct
16
+ const :step, Integer
17
+ prop :thought, T.nilable(String)
18
+ prop :ruby_code, T.nilable(String)
19
+ prop :execution_result, T.nilable(String)
20
+ prop :error_message, String
21
+
22
+ # Custom serialization to ensure compatibility with the rest of the code
23
+ def to_h
24
+ {
25
+ step: step,
26
+ thought: thought,
27
+ ruby_code: ruby_code,
28
+ execution_result: execution_result,
29
+ error_message: error_message
30
+ }.compact
31
+ end
32
+ end
33
+
34
+ # Defines the signature for Ruby code generation
35
+ class RubyCodeGeneration < DSPy::Signature
36
+ description "Generate Ruby code to solve the given task."
37
+
38
+ input do
39
+ const :task, String,
40
+ description: "JSON representation of all input fields for the task"
41
+ const :context, String,
42
+ description: "Available variables and previous results from code execution history"
43
+ const :history, T::Array[CodeActHistoryEntry],
44
+ description: "Previous thoughts and code executions with their results. Use this to understand what has been tried and what variables are available."
45
+ end
46
+
47
+ output do
48
+ const :thought, String,
49
+ description: "Reasoning about the approach to solve the task with Ruby code"
50
+ const :ruby_code, String,
51
+ description: "Ruby code to execute. This should be valid Ruby code that can be evaluated safely. Avoid system calls, file operations, or other potentially dangerous operations."
52
+ const :explanation, String,
53
+ description: "Brief explanation of what the code does and why this approach was chosen"
54
+ end
55
+ end
56
+
57
+ class CodeActNextStep < T::Enum
58
+ enums do
59
+ Continue = new("continue")
60
+ Finish = new("finish")
61
+ end
62
+ end
63
+
64
+ # Defines the signature for processing code execution results
65
+ class RubyCodeObservation < DSPy::Signature
66
+ description "Process the result of Ruby code execution and decide what to do next."
67
+
68
+ input do
69
+ const :task, String,
70
+ description: "JSON representation of all input fields for the task"
71
+ const :history, T::Array[CodeActHistoryEntry],
72
+ description: "Previous thoughts, code executions, and their results"
73
+ const :execution_result, T.nilable(String),
74
+ description: "The result from executing the Ruby code"
75
+ const :error_message, String,
76
+ description: "Error message if the code execution failed (empty string if no error)"
77
+ end
78
+
79
+ output do
80
+ const :observation, String,
81
+ description: "Analysis of the execution result and what it means for solving the task"
82
+ const :next_step, CodeActNextStep,
83
+ description: "What to do next: '#{CodeActNextStep::Continue}' to continue with more code or '#{CodeActNextStep::Finish}' if the task is complete"
84
+ const :final_answer, T.nilable(String),
85
+ description: "If next_step is 'finish', provide the final answer to the task based on the execution results"
86
+ end
87
+ end
88
+
89
+ # CodeAct Agent using Think-Code-Observe pattern
90
+ class CodeAct < Predict
91
+ extend T::Sig
92
+ include Mixins::StructBuilder
93
+
94
+ sig { returns(T.class_of(DSPy::Signature)) }
95
+ attr_reader :original_signature_class
96
+
97
+ sig { returns(T.class_of(T::Struct)) }
98
+ attr_reader :enhanced_output_struct
99
+
100
+ sig { returns(Integer) }
101
+ attr_reader :max_iterations
102
+
103
+ sig { returns(T::Hash[Symbol, T.untyped]) }
104
+ attr_reader :execution_context
105
+
106
+ sig { params(signature_class: T.class_of(DSPy::Signature), max_iterations: Integer).void }
107
+ def initialize(signature_class, max_iterations: 10)
108
+ @original_signature_class = signature_class
109
+ @max_iterations = max_iterations
110
+ @execution_context = T.let({}, T::Hash[Symbol, T.untyped])
111
+
112
+ # Create code generator using Predict to preserve field descriptions
113
+ @code_generator = T.let(DSPy::Predict.new(RubyCodeGeneration), DSPy::Predict)
114
+
115
+ # Create observation processor using Predict to preserve field descriptions
116
+ @observation_processor = T.let(DSPy::Predict.new(RubyCodeObservation), DSPy::Predict)
117
+
118
+ # Create enhanced output struct with CodeAct fields
119
+ @enhanced_output_struct = create_enhanced_output_struct(signature_class)
120
+ enhanced_output_struct = @enhanced_output_struct
121
+
122
+ # Create enhanced signature class
123
+ enhanced_signature = Class.new(DSPy::Signature) do
124
+ # Set the description
125
+ description signature_class.description
126
+
127
+ # Use the same input struct
128
+ @input_struct_class = signature_class.input_struct_class
129
+
130
+ # Use the enhanced output struct with CodeAct fields
131
+ @output_struct_class = enhanced_output_struct
132
+
133
+ # Store original signature name
134
+ @original_signature_name = signature_class.name
135
+
136
+ class << self
137
+ attr_reader :input_struct_class, :output_struct_class, :original_signature_name
138
+
139
+ # Override name to return the original signature name
140
+ def name
141
+ @original_signature_name || super
142
+ end
143
+ end
144
+ end
145
+
146
+ # Call parent constructor with enhanced signature
147
+ super(enhanced_signature)
148
+ end
149
+
150
+ sig { override.returns(T::Array[[String, DSPy::Module]]) }
151
+ def named_predictors
152
+ pairs = T.let([], T::Array[[String, DSPy::Module]])
153
+ pairs << ["code_generator", @code_generator]
154
+ pairs << ["observation_processor", @observation_processor]
155
+ pairs
156
+ end
157
+
158
+ sig { override.returns(T::Array[DSPy::Module]) }
159
+ def predictors
160
+ named_predictors.map { |(_, predictor)| predictor }
161
+ end
162
+
163
+ sig { params(kwargs: T.untyped).returns(T.untyped).override }
164
+ def forward(**kwargs)
165
+ # Validate input and serialize all fields as task context
166
+ input_struct = @original_signature_class.input_struct_class.new(**kwargs)
167
+ task = DSPy::TypeSerializer.serialize(input_struct).to_json
168
+
169
+ # Execute CodeAct reasoning loop
170
+ reasoning_result = execute_codeact_reasoning_loop(task)
171
+
172
+ # Create enhanced output with all CodeAct data
173
+ create_enhanced_result(kwargs, reasoning_result)
174
+ end
175
+
176
+ private
177
+
178
+ # Executes the main CodeAct reasoning loop (Think-Code-Observe)
179
+ sig { params(task: String).returns(T::Hash[Symbol, T.untyped]) }
180
+ def execute_codeact_reasoning_loop(task)
181
+ history = T.let([], T::Array[CodeActHistoryEntry])
182
+ final_answer = T.let(nil, T.nilable(String))
183
+ iterations_count = 0
184
+ context = ""
185
+
186
+ while should_continue_iteration?(iterations_count, final_answer)
187
+ iterations_count += 1
188
+
189
+ iteration_result = execute_single_iteration(
190
+ task, history, context, iterations_count
191
+ )
192
+
193
+ if iteration_result[:should_finish]
194
+ final_answer = iteration_result[:final_answer]
195
+ break
196
+ end
197
+
198
+ history = iteration_result[:history]
199
+ context = iteration_result[:context]
200
+ end
201
+
202
+ handle_max_iterations_if_needed(iterations_count, final_answer, history)
203
+
204
+ {
205
+ history: history,
206
+ iterations: iterations_count,
207
+ final_answer: final_answer || default_no_answer_message,
208
+ execution_context: @execution_context
209
+ }
210
+ end
211
+
212
+ # Executes a single iteration of the Think-Code-Observe loop
213
+ sig { params(task: String, history: T::Array[CodeActHistoryEntry], context: String, iteration: Integer).returns(T::Hash[Symbol, T.untyped]) }
214
+ def execute_single_iteration(task, history, context, iteration)
215
+ DSPy::Context.with_span(
216
+ operation: 'codeact.iteration',
217
+ 'dspy.module' => 'CodeAct',
218
+ 'codeact.iteration' => iteration,
219
+ 'codeact.max_iterations' => @max_iterations,
220
+ 'codeact.history_length' => history.length
221
+ ) do
222
+ execution_state = execute_think_code_step(task, context, history, iteration)
223
+
224
+ observation_decision = process_observation_and_decide_next_step(
225
+ task, execution_state[:history], execution_state[:execution_result],
226
+ execution_state[:error_message], iteration
227
+ )
228
+
229
+ if observation_decision[:should_finish]
230
+ return { should_finish: true, final_answer: observation_decision[:final_answer] }
231
+ end
232
+
233
+ finalize_iteration(execution_state, iteration)
234
+ end
235
+ end
236
+
237
+ # Executes the Think-Code step: generates code and executes it
238
+ sig { params(task: String, context: String, history: T::Array[CodeActHistoryEntry], iteration: Integer).returns(T::Hash[Symbol, T.untyped]) }
239
+ def execute_think_code_step(task, context, history, iteration)
240
+ code_obj = @code_generator.forward(
241
+ task: task,
242
+ context: context.empty? ? "No previous context available." : context,
243
+ history: history
244
+ )
245
+
246
+ execution_result, error_message = execute_ruby_code_with_instrumentation(
247
+ code_obj.ruby_code, iteration
248
+ )
249
+
250
+ history << create_history_entry(
251
+ iteration, code_obj.thought, code_obj.ruby_code,
252
+ execution_result, error_message
253
+ )
254
+
255
+ {
256
+ history: history,
257
+ thought: code_obj.thought,
258
+ ruby_code: code_obj.ruby_code,
259
+ execution_result: execution_result,
260
+ error_message: error_message
261
+ }
262
+ end
263
+
264
+ # Finalizes iteration by updating context and emitting events
265
+ sig { params(execution_state: T::Hash[Symbol, T.untyped], iteration: Integer).returns(T::Hash[Symbol, T.untyped]) }
266
+ def finalize_iteration(execution_state, iteration)
267
+ new_context = build_context_from_history(execution_state[:history])
268
+
269
+ emit_iteration_complete_event(
270
+ iteration, execution_state[:thought], execution_state[:ruby_code],
271
+ execution_state[:execution_result], execution_state[:error_message]
272
+ )
273
+
274
+ {
275
+ should_finish: false,
276
+ history: execution_state[:history],
277
+ context: new_context
278
+ }
279
+ end
280
+
281
+ # Creates enhanced output struct with CodeAct-specific fields
282
+ sig { params(signature_class: T.class_of(DSPy::Signature)).returns(T.class_of(T::Struct)) }
283
+ def create_enhanced_output_struct(signature_class)
284
+ input_props = signature_class.input_struct_class.props
285
+ output_props = signature_class.output_struct_class.props
286
+
287
+ build_enhanced_struct(
288
+ { input: input_props, output: output_props },
289
+ {
290
+ history: [T::Array[T::Hash[Symbol, T.untyped]], "CodeAct execution history"],
291
+ iterations: [Integer, "Number of iterations executed"],
292
+ execution_context: [T::Hash[Symbol, T.untyped], "Variables and context from code execution"]
293
+ }
294
+ )
295
+ end
296
+
297
+ # Creates enhanced result struct
298
+ sig { params(input_kwargs: T::Hash[Symbol, T.untyped], reasoning_result: T::Hash[Symbol, T.untyped]).returns(T.untyped) }
299
+ def create_enhanced_result(input_kwargs, reasoning_result)
300
+ output_field_name = @original_signature_class.output_struct_class.props.keys.first
301
+
302
+ output_data = input_kwargs.merge({
303
+ history: reasoning_result[:history].map(&:to_h),
304
+ iterations: reasoning_result[:iterations],
305
+ execution_context: reasoning_result[:execution_context]
306
+ })
307
+ output_data[output_field_name] = reasoning_result[:final_answer]
308
+
309
+ @enhanced_output_struct.new(**output_data)
310
+ end
311
+
312
+ # Helper methods for CodeAct logic
313
+ sig { params(iterations_count: Integer, final_answer: T.nilable(String)).returns(T::Boolean) }
314
+ def should_continue_iteration?(iterations_count, final_answer)
315
+ final_answer.nil? && (@max_iterations.nil? || iterations_count < @max_iterations)
316
+ end
317
+
318
+ sig { params(ruby_code: String, iteration: Integer).returns([T.nilable(String), String]) }
319
+ def execute_ruby_code_with_instrumentation(ruby_code, iteration)
320
+ DSPy::Context.with_span(
321
+ operation: 'codeact.code_execution',
322
+ 'dspy.module' => 'CodeAct',
323
+ 'codeact.iteration' => iteration,
324
+ 'code.length' => ruby_code.length
325
+ ) do
326
+ execute_ruby_code_safely(ruby_code)
327
+ end
328
+ end
329
+
330
+ sig { params(step: Integer, thought: String, ruby_code: String, execution_result: T.nilable(String), error_message: String).returns(CodeActHistoryEntry) }
331
+ def create_history_entry(step, thought, ruby_code, execution_result, error_message)
332
+ CodeActHistoryEntry.new(
333
+ step: step,
334
+ thought: thought,
335
+ ruby_code: ruby_code,
336
+ execution_result: execution_result,
337
+ error_message: error_message
338
+ )
339
+ end
340
+
341
+ sig { params(task: String, history: T::Array[CodeActHistoryEntry], execution_result: T.nilable(String), error_message: String, iteration: Integer).returns(T::Hash[Symbol, T.untyped]) }
342
+ def process_observation_and_decide_next_step(task, history, execution_result, error_message, iteration)
343
+ observation_result = @observation_processor.forward(
344
+ task: task,
345
+ history: history,
346
+ execution_result: execution_result,
347
+ error_message: error_message
348
+ )
349
+
350
+ return { should_finish: false } unless observation_result.next_step == CodeActNextStep::Finish
351
+
352
+ final_answer = observation_result.final_answer || execution_result || "Task completed"
353
+
354
+ { should_finish: true, final_answer: final_answer }
355
+ end
356
+
357
+ sig { params(history: T::Array[CodeActHistoryEntry]).returns(String) }
358
+ def build_context_from_history(history)
359
+ context_parts = []
360
+
361
+ history.each do |entry|
362
+ if entry.execution_result && !entry.execution_result.empty?
363
+ context_parts << "Step #{entry.step} result: #{entry.execution_result}"
364
+ end
365
+ end
366
+
367
+ context_parts.join("\n")
368
+ end
369
+
370
+ sig { params(iteration: Integer, thought: String, ruby_code: String, execution_result: T.nilable(String), error_message: T.nilable(String)).void }
371
+ def emit_iteration_complete_event(iteration, thought, ruby_code, execution_result, error_message)
372
+ DSPy.event('codeact.iteration_complete', {
373
+ 'codeact.iteration' => iteration,
374
+ 'codeact.thought' => thought,
375
+ 'codeact.ruby_code' => ruby_code,
376
+ 'codeact.execution_result' => execution_result,
377
+ 'codeact.error_message' => error_message,
378
+ 'codeact.success' => error_message.nil? || error_message.empty?
379
+ })
380
+ end
381
+
382
+ sig { params(iterations_count: Integer, final_answer: T.nilable(String), history: T::Array[CodeActHistoryEntry]).void }
383
+ def handle_max_iterations_if_needed(iterations_count, final_answer, history)
384
+ if @max_iterations && iterations_count >= @max_iterations && final_answer.nil?
385
+ DSPy.event('codeact.max_iterations', {
386
+ 'codeact.iteration_count' => iterations_count,
387
+ 'codeact.max_iterations' => @max_iterations,
388
+ 'codeact.final_history_length' => history.length
389
+ })
390
+ end
391
+ end
392
+
393
+ sig { returns(String) }
394
+ def default_no_answer_message
395
+ "No solution reached within #{@max_iterations} iterations"
396
+ end
397
+
398
+ # Ruby code execution with stdout capture and error handling - not sandboxed yet (placeholder for now)
399
+ sig { params(ruby_code: String).returns([T.nilable(String), String]) }
400
+ def execute_ruby_code_safely(ruby_code)
401
+ # TODO: Implement proper sandboxing in Phase 2
402
+ # For now, use basic eval with error handling
403
+ original_stdout = nil
404
+ captured_output = nil
405
+
406
+ begin
407
+ # Capture stdout to get print/puts output
408
+ original_stdout = $stdout
409
+ captured_output = StringIO.new
410
+ $stdout = captured_output
411
+
412
+ result = eval(ruby_code, binding)
413
+
414
+ # Get the captured output
415
+ output = captured_output.string
416
+
417
+ # If there's captured output, use it, otherwise use the eval result
418
+ final_result = output.empty? ? result.to_s : output.chomp
419
+
420
+ [final_result, ""]
421
+ rescue SyntaxError => e
422
+ [nil, "Error: #{e.message}"]
423
+ rescue => e
424
+ [nil, "Error: #{e.message}"]
425
+ ensure
426
+ $stdout = original_stdout if original_stdout
427
+ end
428
+ end
429
+
430
+ sig { params(output: T.untyped).void }
431
+ def validate_output_schema!(output)
432
+ # Validate that output is an instance of the enhanced output struct
433
+ unless output.is_a?(@enhanced_output_struct)
434
+ raise "Output must be an instance of #{@enhanced_output_struct}, got #{output.class}"
435
+ end
436
+
437
+ # Validate original signature output fields are present
438
+ @original_signature_class.output_struct_class.props.each do |field_name, _prop|
439
+ unless output.respond_to?(field_name)
440
+ raise "Missing required field: #{field_name}"
441
+ end
442
+ end
443
+
444
+ # Validate CodeAct-specific fields
445
+ unless output.respond_to?(:history) && output.history.is_a?(Array)
446
+ raise "Missing or invalid history field"
447
+ end
448
+
449
+ unless output.respond_to?(:iterations) && output.iterations.is_a?(Integer)
450
+ raise "Missing or invalid iterations field"
451
+ end
452
+
453
+ unless output.respond_to?(:execution_context) && output.execution_context.is_a?(Hash)
454
+ raise "Missing or invalid execution_context field"
455
+ end
456
+ end
457
+
458
+ sig { returns(T::Hash[Symbol, T.untyped]) }
459
+ def generate_example_output
460
+ # Create a base example structure
461
+ example = {}
462
+
463
+ # Add CodeAct-specific example data
464
+ example[:history] = [
465
+ {
466
+ step: 1,
467
+ thought: "I need to write Ruby code to solve this task...",
468
+ ruby_code: "result = 2 + 2",
469
+ execution_result: "4",
470
+ error_message: nil
471
+ }
472
+ ]
473
+ example[:iterations] = 1
474
+ example[:execution_context] = { result: 4 }
475
+ example
476
+ end
477
+ end
478
+ end
479
+
480
+ require_relative 'code_act/version'
481
+
482
+ module DSPy
483
+ class CodeAct
484
+ VERSION = DSPy::CodeActVersion::VERSION unless const_defined?(:VERSION)
485
+ end
486
+ end
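Note on the execution step in lib/dspy/code_act.rb: execute_ruby_code_safely evaluates the generated code with eval while $stdout is redirected to a StringIO, and returns a [result, error_message] pair in which an empty error string signals success. The following is a minimal standalone sketch of that pattern, not the gem's implementation. The names run_snippet and success? are hypothetical and exist only for this illustration. As the TODO in the source says, plain eval is not a sandbox.

```ruby
require 'stringio'

# Hypothetical helper (not part of dspy-code_act): evaluates a snippet while
# capturing anything written to $stdout, and returns [result, error_message].
def run_snippet(ruby_code)
  original_stdout = $stdout
  captured = StringIO.new
  $stdout = captured

  value = eval(ruby_code, binding)
  output = captured.string

  # Prefer printed output; fall back to the value of the last expression.
  [output.empty? ? value.to_s : output.chomp, ""]
rescue SyntaxError, StandardError => e
  # SyntaxError is a ScriptError, not a StandardError, so it is listed explicitly.
  [nil, "Error: #{e.message}"]
ensure
  # Always restore the real stdout, even when the snippet raised.
  $stdout = original_stdout
end

# Hypothetical predicate: nil or an empty error string both count as success.
def success?(error_message)
  error_message.nil? || error_message.empty?
end

printed  = run_snippet('puts "hello"')
returned = run_snippet('2 + 2')
failed   = run_snippet('1 / 0')
```

Here printed holds the captured output, returned holds the stringified value of the expression, and failed carries nil plus the error text. Checking for an empty string as well as nil matters because the success path returns "" rather than nil.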
metadata ADDED
@@ -0,0 +1,62 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: dspy-code_act
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.29.1
5
+ platform: ruby
6
+ authors:
7
+ - Vicente Reig Rincón de Arellano
8
+ bindir: bin
9
+ cert_chain: []
10
+ date: 2025-10-24 00:00:00.000000000 Z
11
+ dependencies:
12
+ - !ruby/object:Gem::Dependency
13
+ name: dspy
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - '='
17
+ - !ruby/object:Gem::Version
18
+ version: 0.29.1
19
+ type: :runtime
20
+ prerelease: false
21
+ version_requirements: !ruby/object:Gem::Requirement
22
+ requirements:
23
+ - - '='
24
+ - !ruby/object:Gem::Version
25
+ version: 0.29.1
26
+ description: CodeAct provides Think-Code-Observe agents that synthesize and execute
27
+ Ruby code dynamically. Ship DSPy.rb workflows that write custom Ruby code while
28
+ tracking execution history, observations, and safety signals.
29
+ email:
30
+ - hey@vicente.services
31
+ executables: []
32
+ extensions: []
33
+ extra_rdoc_files: []
34
+ files:
35
+ - LICENSE
36
+ - README.md
37
+ - lib/dspy/code_act.rb
38
+ - lib/dspy/code_act/README.md
39
+ - lib/dspy/code_act/version.rb
40
+ homepage: https://github.com/vicentereig/dspy.rb
41
+ licenses:
42
+ - MIT
43
+ metadata:
44
+ github_repo: git@github.com:vicentereig/dspy.rb
45
+ rdoc_options: []
46
+ require_paths:
47
+ - lib
48
+ required_ruby_version: !ruby/object:Gem::Requirement
49
+ requirements:
50
+ - - ">="
51
+ - !ruby/object:Gem::Version
52
+ version: 3.3.0
53
+ required_rubygems_version: !ruby/object:Gem::Requirement
54
+ requirements:
55
+ - - ">="
56
+ - !ruby/object:Gem::Version
57
+ version: '0'
58
+ requirements: []
59
+ rubygems_version: 3.6.5
60
+ specification_version: 4
61
+ summary: Dynamic code generation agents for DSPy.rb.
62
+ test_files: []
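Note on the dependency constraints in the metadata above: the gem pins dspy with an exact '=' requirement on 0.29.1 and requires Ruby '>= 3.3.0'. A short sketch of how RubyGems evaluates such constraints, using the standard Gem::Requirement and Gem::Version classes. The variable names are illustrative only.

```ruby
require 'rubygems'

# Exact pin, as declared for the dspy runtime dependency.
dspy_requirement = Gem::Requirement.new('= 0.29.1')
exact_match = dspy_requirement.satisfied_by?(Gem::Version.new('0.29.1'))
patch_bump  = dspy_requirement.satisfied_by?(Gem::Version.new('0.29.2'))

# Lower bound, as declared for required_ruby_version.
ruby_requirement = Gem::Requirement.new('>= 3.3.0')
ruby_ok  = ruby_requirement.satisfied_by?(Gem::Version.new('3.3.0'))
ruby_old = ruby_requirement.satisfied_by?(Gem::Version.new('3.2.9'))
```

With an exact pin, only dspy 0.29.1 satisfies the requirement, so the two gems have to be upgraded together.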