RubyGems - dspy - Versions diffs - 0.25.1 → 0.26.0 - Mend

dspy 0.25.1 → 0.26.0

Files changed (6)

checksums.yaml +4 -4
data/README.md +9 -10
data/lib/dspy/optimizers/gaussian_process.rb +141 -0
data/lib/dspy/teleprompt/mipro_v2.rb +157 -60
data/lib/dspy/version.rb +1 -1
metadata +6 -5

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 55addf122534bacff753f272a385ddb035e66322ed63ecc5bc27ce3a2bd4ea03
-  data.tar.gz: b73d12f9f560dcaf60fdfac9640bf6a422bb0b912912bf805dfb183058e94f81
+  metadata.gz: 573bfc89c0ca4d6c58a3d68fa96e0b0e8fc1d8bc2c38eec9c67e245a51527532
+  data.tar.gz: a2b34e8127abafbdec6842c3db475e1abbfbcf06c0ca9bfcd3fa7ea8299d0d0f
 SHA512:
-  metadata.gz: 9ffff85304ccbf2b72143e878e134637b97db118ac7b69e91597664d24e1a4fad63685219844d75c91c5e859dc7080a39052ac5baa68df520f1fb6731f35aece
-  data.tar.gz: f845270b9fbe9ff81fbed8519517cf174add60eb629d26702d8411077347a4dd1b9206bbf6b1fd7b4b4b7be260b35f48e2dfaf3951ae9d0e87b6d1aad7ea6b3e
+  metadata.gz: 88d9b7c29cf89d386ada79f8696f561b94087458715813b45ce2be18947163e840deabfe681d0f7594023fd09c84c1ff3746c87d19c2ebc8043103f5f98cbdc7
+  data.tar.gz: 5e5d518266d7dafe64abf3f3d6b083c6017feb2564743053c23bf9b6baa5a32be53eab299a57814b70ce05d236e3f5d659158679914ac097fd7fcc589b4ddb04

data/README.md CHANGED

@@ -73,7 +73,7 @@ puts result.confidence   # => 0.85
 - **Prompt Objects** - Manipulate prompts as first-class objects instead of strings
 - **Typed Examples** - Type-safe training data with automatic validation
 - **Evaluation Framework** - Advanced metrics beyond simple accuracy with error-resilient pipelines
-- **MIPROv2 Optimization** - Automatic prompt optimization with storage and persistence
+- **MIPROv2 Optimization** - Advanced Bayesian optimization with Gaussian Processes, multiple optimization strategies, and storage persistence
 - **GEPA Optimization** - Genetic-Pareto optimization for multi-objective prompt improvement
 **Production Features:**
@@ -128,7 +128,7 @@ For LLMs and AI assistants working with DSPy.rb:
 ### Optimization
 - **[Evaluation Framework](docs/src/optimization/evaluation.md)** - Advanced metrics beyond simple accuracy
 - **[Prompt Optimization](docs/src/optimization/prompt-optimization.md)** - Manipulate prompts as objects
-- **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Automatic optimization algorithms
+- **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Advanced Bayesian optimization with Gaussian Processes
 - **[GEPA Optimizer](docs/src/optimization/gepa.md)** - Genetic-Pareto optimization for multi-objective prompt optimization
 ### Production Features
@@ -159,7 +159,7 @@ bundle install
 #### System Dependencies for Ubuntu/Pop!_OS
-If you need to compile the `polars-df` dependency from source (used for data processing in evaluations), install these system packages:
+If you need to compile the `numo-narray` dependency from source (used for numerical computing in Bayesian optimization), install these system packages:
 ```bash
 # Update package list
@@ -171,15 +171,14 @@ sudo apt-get install ruby-full ruby-dev
 # Install essential build tools
 sudo apt-get install build-essential
-# Install Rust and Cargo (required for polars-df compilation)
-curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
-source $HOME/.cargo/env
+# Install BLAS and LAPACK libraries (required for numo-narray)
+sudo apt-get install libopenblas-dev liblapack-dev
-# Install CMake (often needed for Rust projects)
-sudo apt-get install cmake
+# Install additional development libraries
+sudo apt-get install libffi-dev libssl-dev
 ```
-**Note**: The `polars-df` gem compilation can take 15-20 minutes. Pre-built binaries are available for most platforms, so compilation is only needed if a pre-built binary isn't available for your system.
+**Note**: The `numo-narray` gem typically compiles quickly (1-2 minutes). Pre-built binaries are available for most platforms, so compilation is only needed if a pre-built binary isn't available for your system.
 ## Recent Achievements
@@ -190,7 +189,7 @@ DSPy.rb has rapidly evolved from experimental to production-ready:
 - ✅ **Type-Safe Strategy Configuration** - Provider-optimized automatic strategy selection
 - ✅ **Core Module System** - Predict, ChainOfThought, ReAct, CodeAct with type safety
 - ✅ **Production Observability** - OpenTelemetry, New Relic, and Langfuse integration
-- ✅ **Optimization Framework** - MIPROv2 algorithm with storage & persistence
+- ✅ **Advanced Optimization** - MIPROv2 with Bayesian optimization, Gaussian Processes, and multiple strategies
 ### Recent Advances
 - ✅ **Enhanced Langfuse Integration (v0.25.0)** - Comprehensive OpenTelemetry span reporting with proper input/output, hierarchical nesting, accurate timing, and observation types

data/lib/dspy/optimizers/gaussian_process.rb ADDED

@@ -0,0 +1,141 @@
+# typed: strict
+# frozen_string_literal: true
+require 'numo/narray'
+require 'sorbet-runtime'
+module DSPy
+  module Optimizers
+    # Pure Ruby Gaussian Process implementation for Bayesian optimization
+    # No external LAPACK/BLAS dependencies required
+    class GaussianProcess
+      extend T::Sig
+      sig { params(length_scale: Float, signal_variance: Float, noise_variance: Float).void }
+      def initialize(length_scale: 1.0, signal_variance: 1.0, noise_variance: 1e-6)
+        @length_scale = length_scale
+        @signal_variance = signal_variance
+        @noise_variance = noise_variance
+        @fitted = T.let(false, T::Boolean)
+      end
+      sig { params(x1: T::Array[T::Array[Float]], x2: T::Array[T::Array[Float]]).returns(Numo::DFloat) }
+      def rbf_kernel(x1, x2)
+        # Convert to Numo arrays
+        x1_array = Numo::DFloat[*x1]
+        x2_array = Numo::DFloat[*x2]
+        # Compute squared Euclidean distances manually
+        n1, n2 = x1_array.shape[0], x2_array.shape[0]
+        sqdist = Numo::DFloat.zeros(n1, n2)
+        (0...n1).each do |i|
+          (0...n2).each do |j|
+            diff = x1_array[i, true] - x2_array[j, true]
+            sqdist[i, j] = (diff ** 2).sum
+          end
+        end
+        # RBF kernel: σ² * exp(-0.5 * d² / ℓ²)
+        @signal_variance * Numo::NMath.exp(-0.5 * sqdist / (@length_scale ** 2))
+      end
+      sig { params(x_train: T::Array[T::Array[Float]], y_train: T::Array[Float]).void }
+      def fit(x_train, y_train)
+        @x_train = x_train
+        @y_train = Numo::DFloat[*y_train]
+        # Compute kernel matrix
+        k_matrix = rbf_kernel(x_train, x_train)
+        # Add noise to diagonal for numerical stability
+        n = k_matrix.shape[0]
+        (0...n).each { |i| k_matrix[i, i] += @noise_variance }
+        # Store inverted kernel matrix using simple LU decomposition
+        @k_inv = matrix_inverse(k_matrix)
+        @alpha = @k_inv.dot(@y_train)
+        @fitted = true
+      end
+      sig { params(x_test: T::Array[T::Array[Float]], return_std: T::Boolean).returns(T.any(Numo::DFloat, [Numo::DFloat, Numo::DFloat])) }
+      def predict(x_test, return_std: false)
+        raise "Gaussian Process not fitted" unless @fitted
+        # Kernel between training and test points
+        k_star = rbf_kernel(T.must(@x_train), x_test)
+        # Predictive mean
+        mean = k_star.transpose.dot(@alpha)
+        return mean unless return_std
+        # Predictive variance (simplified for small matrices)
+        k_star_star = rbf_kernel(x_test, x_test)
+        var_matrix = k_star_star - k_star.transpose.dot(@k_inv).dot(k_star)
+        var = var_matrix.diagonal
+        # Ensure positive variance (element-wise maximum)
+        var = var.map { |v| [v, 1e-12].max }
+        std = Numo::NMath.sqrt(var)
+        [mean, std]
+      end
+      private
+      sig { returns(T.nilable(T::Array[T::Array[Float]])) }
+      attr_reader :x_train
+      sig { returns(T.nilable(Numo::DFloat)) }
+      attr_reader :y_train, :k_inv, :alpha
+      # Simple matrix inversion using Gauss-Jordan elimination
+      # Only suitable for small matrices (< 100x100)
+      sig { params(matrix: Numo::DFloat).returns(Numo::DFloat) }
+      def matrix_inverse(matrix)
+        n = matrix.shape[0]
+        raise "Matrix must be square" unless matrix.shape[0] == matrix.shape[1]
+        # Create augmented matrix [A|I]
+        augmented = Numo::DFloat.zeros(n, 2*n)
+        augmented[true, 0...n] = matrix.copy
+        (0...n).each { |i| augmented[i, n+i] = 1.0 }
+        # Gauss-Jordan elimination
+        (0...n).each do |i|
+          # Find pivot
+          max_row = i
+          (i+1...n).each do |k|
+            if augmented[k, i].abs > augmented[max_row, i].abs
+              max_row = k
+            end
+          end
+          # Swap rows if needed
+          if max_row != i
+            temp = augmented[i, true].copy
+            augmented[i, true] = augmented[max_row, true]
+            augmented[max_row, true] = temp
+          end
+          # Make diagonal element 1
+          pivot = augmented[i, i]
+          raise "Matrix is singular" if pivot.abs < 1e-12
+          augmented[i, true] /= pivot
+          # Eliminate column
+          (0...n).each do |j|
+            next if i == j
+            factor = augmented[j, i]
+            augmented[j, true] -= factor * augmented[i, true]
+          end
+        end
+        # Extract inverse matrix
+        augmented[true, n...2*n]
+      end
+    end
+  end
+end
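
The posterior computation in the added `GaussianProcess` class can be followed without `numo-narray` by sketching it over plain Ruby arrays. The block below is an illustrative sketch, not the gem's code: the helper names `rbf`, `solve`, and `gp_predict` are hypothetical, and it solves the linear systems by Gaussian elimination with back-substitution instead of forming an explicit inverse as `matrix_inverse` does.

```ruby
# Illustrative sketch only: GP posterior mean/std with an RBF kernel,
# using plain Ruby arrays. Helper names are hypothetical (not in the dspy gem).

# RBF kernel between two points: sigma^2 * exp(-0.5 * ||a - b||^2 / l^2)
def rbf(a, b, length_scale: 1.0, signal_variance: 1.0)
  sqdist = a.zip(b).sum { |ai, bi| (ai - bi)**2 }
  signal_variance * Math.exp(-0.5 * sqdist / length_scale**2)
end

# Solve A x = b by Gaussian elimination with partial pivoting.
# Works on a copy, so the caller's matrix is left untouched.
def solve(a, b)
  n = a.size
  m = a.each_with_index.map { |row, i| row.dup << b[i] } # augmented [A|b]
  n.times do |i|
    pivot_row = (i...n).max_by { |r| m[r][i].abs }
    m[i], m[pivot_row] = m[pivot_row], m[i]
    raise "Matrix is singular" if m[i][i].abs < 1e-12
    ((i + 1)...n).each do |r|
      factor = m[r][i] / m[i][i]
      (i..n).each { |c| m[r][c] -= factor * m[i][c] }
    end
  end
  # Back-substitution on the upper-triangular system
  x = Array.new(n, 0.0)
  (n - 1).downto(0) do |i|
    tail = ((i + 1)...n).sum { |c| m[i][c] * x[c] }
    x[i] = (m[i][n] - tail) / m[i][i]
  end
  x
end

# Returns one [mean, std] pair per test point.
def gp_predict(x_train, y_train, x_test, noise_variance: 1e-6)
  n = x_train.size
  # Kernel matrix with noise added to the diagonal
  k = Array.new(n) do |i|
    Array.new(n) { |j| rbf(x_train[i], x_train[j]) + (i == j ? noise_variance : 0.0) }
  end
  alpha = solve(k, y_train)
  x_test.map do |xs|
    k_star = x_train.map { |xt| rbf(xt, xs) }
    mean = k_star.zip(alpha).sum { |ks, al| ks * al }
    v = solve(k, k_star)
    var = rbf(xs, xs) - k_star.zip(v).sum { |ks, vi| ks * vi }
    [mean, Math.sqrt([var, 1e-12].max)] # clamp variance as the diff does
  end
end

gp_predict([[0.0], [1.0], [2.0]], [0.0, 1.0, 0.5], [[1.0], [10.0]]).each do |mean, std|
  puts format("mean=%.4f std=%.4f", mean, std)
end
```

In this sketch a test point that coincides with a training point gets a mean close to the observed target and a small standard deviation, while a point far from all training data falls back to the prior (mean near 0, standard deviation near the square root of the signal variance).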

data/lib/dspy/teleprompt/mipro_v2.rb CHANGED

@@ -5,9 +5,28 @@ require 'sorbet-runtime'
 require_relative 'teleprompter'
 require_relative 'utils'
 require_relative '../propose/grounded_proposer'
+require_relative '../optimizers/gaussian_process'
 module DSPy
   module Teleprompt
+    # Enum for candidate configuration types
+    class CandidateType < T::Enum
+      enums do
+        Baseline = new("baseline")
+        InstructionOnly = new("instruction_only")
+        FewShotOnly = new("few_shot_only")
+        Combined = new("combined")
+      end
+    end
+    # Enum for optimization strategies
+    class OptimizationStrategy < T::Enum
+      enums do
+        Greedy = new("greedy")
+        Adaptive = new("adaptive")
+        Bayesian = new("bayesian")
+      end
+    end
     # MIPROv2: Multi-prompt Instruction Proposal with Retrieval Optimization
     # State-of-the-art prompt optimization combining bootstrap sampling,
     # instruction generation, and Bayesian optimization
@@ -141,40 +160,38 @@ module DSPy
       # Candidate configuration for optimization trials
       class CandidateConfig
         extend T::Sig
+        include Dry::Configurable
-        sig { returns(String) }
-        attr_reader :instruction
-        sig { returns(T::Array[T.untyped]) }
-        attr_reader :few_shot_examples
-        sig { returns(T::Hash[Symbol, T.untyped]) }
-        attr_reader :metadata
+        # Configuration settings
+        setting :instruction, default: ""
+        setting :few_shot_examples, default: []
+        setting :type, default: CandidateType::Baseline
+        setting :metadata, default: {}
         sig { returns(String) }
-        attr_reader :config_id
-        sig do
-          params(
-            instruction: String,
-            few_shot_examples: T::Array[T.untyped],
-            metadata: T::Hash[Symbol, T.untyped]
-          ).void
+        def config_id
+          @config_id ||= generate_config_id
         end
-        def initialize(instruction:, few_shot_examples:, metadata: {})
-          @instruction = instruction
-          @few_shot_examples = few_shot_examples
-          @metadata = metadata.freeze
+        sig { void }
+        def finalize!
+          # Freeze settings after configuration to prevent mutation
+          config.instruction = config.instruction.freeze
+          config.few_shot_examples = config.few_shot_examples.freeze
+          config.metadata = config.metadata.freeze
+          # Generate ID after finalization
           @config_id = generate_config_id
         end
         sig { returns(T::Hash[Symbol, T.untyped]) }
         def to_h
           {
-            instruction: @instruction,
-            few_shot_examples: @few_shot_examples.size,
-            metadata: @metadata,
-            config_id: @config_id
+            instruction: config.instruction,
+            few_shot_examples: config.few_shot_examples.size,
+            type: config.type.serialize,
+            metadata: config.metadata,
+            config_id: config_id
           }
         end
@@ -182,7 +199,7 @@ module DSPy
         sig { returns(String) }
         def generate_config_id
-          content = "#{@instruction}_#{@few_shot_examples.size}_#{@metadata.hash}"
+          content = "#{config.instruction}_#{config.few_shot_examples.size}_#{config.type.serialize}_#{config.metadata.hash}"
           Digest::SHA256.hexdigest(content)[0, 12]
         end
       end
@@ -263,6 +280,7 @@ module DSPy
       end
       def initialize(metric: nil, config: nil)
         @mipro_config = config || MIPROv2Config.new
+        # Call parent teleprompter initializer, which handles dry-configurable internally
         super(metric: metric, config: @mipro_config)
         @proposer = DSPy::Propose::GroundedProposer.new(config: @mipro_config.proposer_config)
@@ -416,8 +434,8 @@ module DSPy
           emit_event('trial_start', {
             trial_number: trials_completed,
             candidate_id: candidate.config_id,
-            instruction_preview: candidate.instruction[0, 50],
-            num_few_shot: candidate.few_shot_examples.size
+            instruction_preview: candidate.config.instruction[0, 50],
+            num_few_shot: candidate.config.few_shot_examples.size
           })
           begin
@@ -482,28 +500,31 @@ module DSPy
         candidates = []
         # Base configuration (no modifications)
-        candidates << CandidateConfig.new(
-          instruction: "",
-          few_shot_examples: [],
-          metadata: { type: "baseline" }
-        )
+        candidates << create_candidate_config do |config|
+          config.instruction = ""
+          config.few_shot_examples = []
+          config.type = CandidateType::Baseline
+          config.metadata = {}
+        end
         # Instruction-only candidates
         proposal_result.candidate_instructions.each_with_index do |instruction, idx|
-          candidates << CandidateConfig.new(
-            instruction: instruction,
-            few_shot_examples: [],
-            metadata: { type: "instruction_only", proposal_rank: idx }
-          )
+          candidates << create_candidate_config do |config|
+            config.instruction = instruction
+            config.few_shot_examples = []
+            config.type = CandidateType::InstructionOnly
+            config.metadata = { proposal_rank: idx }
+          end
         end
         # Few-shot only candidates
         bootstrap_result.candidate_sets.each_with_index do |candidate_set, idx|
-          candidates << CandidateConfig.new(
-            instruction: "",
-            few_shot_examples: candidate_set,
-            metadata: { type: "few_shot_only", bootstrap_rank: idx }
-          )
+          candidates << create_candidate_config do |config|
+            config.instruction = ""
+            config.few_shot_examples = candidate_set
+            config.type = CandidateType::FewShotOnly
+            config.metadata = { bootstrap_rank: idx }
+          end
         end
         # Combined candidates (instruction + few-shot)
@@ -512,15 +533,15 @@ module DSPy
         top_instructions.each_with_index do |instruction, i_idx|
           top_bootstrap_sets.each_with_index do |candidate_set, b_idx|
-            candidates << CandidateConfig.new(
-              instruction: instruction,
-              few_shot_examples: candidate_set,
-              metadata: {
-                type: "combined",
+            candidates << create_candidate_config do |config|
+              config.instruction = instruction
+              config.few_shot_examples = candidate_set
+              config.type = CandidateType::Combined
+              config.metadata = {
                 instruction_rank: i_idx,
                 bootstrap_rank: b_idx
               }
-            )
+            end
           end
         end
@@ -624,9 +645,85 @@ module DSPy
         ).returns(CandidateConfig)
       end
       def select_candidate_bayesian(candidates, state, trial_idx)
-        # For now, use adaptive selection with Bayesian-inspired exploration
-        # In a full implementation, this would use Gaussian processes or similar
-        select_candidate_adaptive(candidates, state, trial_idx)
+        # Need at least 3 observations to fit GP, otherwise fall back to adaptive
+        return select_candidate_adaptive(candidates, state, trial_idx) if state[:scores].size < 3
+        # Get scored candidates for training the GP
+        scored_candidates = candidates.select { |c| state[:scores].key?(c.config_id) }
+        return select_candidate_adaptive(candidates, state, trial_idx) if scored_candidates.size < 3
+        begin
+          # Encode candidates as numerical features
+          all_candidate_features = encode_candidates_for_gp(candidates)
+          scored_features = encode_candidates_for_gp(scored_candidates)
+          scored_targets = scored_candidates.map { |c| state[:scores][c.config_id].to_f }
+          # Train Gaussian Process
+          gp = DSPy::Optimizers::GaussianProcess.new(
+            length_scale: 1.0,
+            signal_variance: 1.0,
+            noise_variance: 0.01
+          )
+          gp.fit(scored_features, scored_targets)
+          # Predict mean and uncertainty for all candidates
+          means, stds = gp.predict(all_candidate_features, return_std: true)
+          # Upper Confidence Bound (UCB) acquisition function
+          kappa = 2.0 * Math.sqrt(Math.log(trial_idx + 1))  # Exploration parameter
+          acquisition_scores = means.to_a.zip(stds.to_a).map { |m, s| m + kappa * s }
+          # Select candidate with highest acquisition score
+          best_idx = acquisition_scores.each_with_index.max_by { |score, _| score }[1]
+          candidates[best_idx]
+        rescue => e
+          # If GP fails for any reason, fall back to adaptive selection
+          DSPy.logger.warn("Bayesian optimization failed: #{e.message}. Falling back to adaptive selection.")
+          select_candidate_adaptive(candidates, state, trial_idx)
+        end
+      end
+      private
+      # Helper method to create CandidateConfig with dry-configurable syntax
+      sig do
+        params(
+          block: T.proc.params(config: Dry::Configurable::Config).void
+        ).returns(CandidateConfig)
+      end
+      def create_candidate_config(&block)
+        candidate = CandidateConfig.new
+        candidate.configure(&block)
+        candidate.finalize!
+        candidate
+      end
+      # Encode candidates as numerical features for Gaussian Process
+      sig { params(candidates: T::Array[CandidateConfig]).returns(T::Array[T::Array[Float]]) }
+      def encode_candidates_for_gp(candidates)
+        # Simple encoding: use hash of config as features
+        # In practice, this could be more sophisticated (e.g., instruction embeddings)
+        candidates.map do |candidate|
+          # Create deterministic numerical features from the candidate config
+          config_hash = candidate.config_id.hash.abs
+          # Extract multiple features to create a feature vector
+          features = []
+          features << (config_hash % 1000).to_f / 1000.0  # Feature 1: hash mod 1000, normalized
+          features << ((config_hash / 1000) % 1000).to_f / 1000.0  # Feature 2: different part of hash
+          features << ((config_hash / 1_000_000) % 1000).to_f / 1000.0  # Feature 3: high bits
+          # Add instruction length if available
+          instruction = candidate.config.instruction
+          if instruction && !instruction.empty?
+            features << [instruction.length.to_f / 100.0, 2.0].min  # Instruction length, capped at 200 chars
+          else
+            features << 0.5  # Default value
+          end
+          features
+        end
       end
       # Evaluate a candidate configuration
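
The Upper Confidence Bound step in `select_candidate_bayesian` above can be isolated as a small function over already-computed GP predictions. This is a minimal sketch under that assumption; `ucb_select` is a hypothetical helper name, not a method of the gem.

```ruby
# Illustrative sketch of UCB selection over precomputed GP means and stds.
# `ucb_select` is a hypothetical helper, not part of the dspy gem.
def ucb_select(means, stds, trial_idx)
  # Same exploration schedule as the diff: kappa grows slowly with the trial index
  kappa = 2.0 * Math.sqrt(Math.log(trial_idx + 1))
  scores = means.zip(stds).map { |m, s| m + kappa * s }
  best_idx = scores.each_with_index.max_by { |score, _| score }[1]
  [best_idx, scores]
end

means = [0.6, 0.5, 0.4]
stds  = [0.0, 0.3, 0.05]

idx_first, = ucb_select(means, stds, 0)
idx_later, = ucb_select(means, stds, 5)
puts "trial 0 picks candidate #{idx_first}, trial 5 picks candidate #{idx_later}"
```

With this schedule `kappa` is 0 at `trial_idx = 0` (since `log(1) = 0`), so the choice is purely by predicted mean; as `trial_idx` grows, candidates with higher predictive uncertainty gain weight.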
@@ -656,13 +753,13 @@ module DSPy
         modified_program = program
         # Apply instruction if provided
-        if !candidate.instruction.empty? && program.respond_to?(:with_instruction)
-          modified_program = modified_program.with_instruction(candidate.instruction)
+        if !candidate.config.instruction.empty? && program.respond_to?(:with_instruction)
+          modified_program = modified_program.with_instruction(candidate.config.instruction)
         end
         # Apply few-shot examples if provided
-        if candidate.few_shot_examples.any? && program.respond_to?(:with_examples)
-          few_shot_examples = candidate.few_shot_examples.map do |example|
+        if candidate.config.few_shot_examples.any? && program.respond_to?(:with_examples)
+          few_shot_examples = candidate.config.few_shot_examples.map do |example|
             DSPy::FewShotExample.new(
               input: example.input_values,
               output: example.expected_values,
@@ -715,8 +812,8 @@ module DSPy
       sig { params(candidate: CandidateConfig).returns(Float) }
       def calculate_diversity_score(candidate)
         # Simple diversity metric based on instruction length and few-shot count
-        instruction_diversity = candidate.instruction.length / 200.0
-        few_shot_diversity = candidate.few_shot_examples.size / 10.0
+        instruction_diversity = candidate.config.instruction.length / 200.0
+        few_shot_diversity = candidate.config.few_shot_examples.size / 10.0
         [instruction_diversity + few_shot_diversity, 1.0].min
       end
@@ -747,9 +844,9 @@ module DSPy
         metadata = {
           optimizer: "MIPROv2",
           auto_mode: infer_auto_mode,
-          best_instruction: best_candidate&.instruction || "",
-          best_few_shot_count: best_candidate&.few_shot_examples&.size || 0,
-          best_candidate_type: best_candidate&.metadata&.fetch(:type, "unknown"),
+          best_instruction: best_candidate&.config&.instruction || "",
+          best_few_shot_count: best_candidate&.config&.few_shot_examples&.size || 0,
+          best_candidate_type: best_candidate&.config&.type&.serialize || "unknown",
           optimization_timestamp: Time.now.iso8601
         }
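
One detail of `encode_candidates_for_gp` worth noting: it derives features from `candidate.config_id.hash`, and Ruby's `String#hash` is seeded per process, so those feature values are stable within one optimization run but generally differ between runs. If run-to-run stability were wanted, one option (an assumption of this note, not something the diff does) is to read the hex `config_id` as an integer directly. The helper name `stable_features` below is hypothetical.

```ruby
require 'digest'

# Hypothetical alternative encoding: derive features from the hex config id
# itself, so values do not depend on the per-process String#hash seed.
def stable_features(config_id, instruction)
  h = config_id.to_i(16) # 12 hex characters -> 48-bit integer
  features = [
    (h % 1000) / 1000.0,
    ((h / 1000) % 1000) / 1000.0,
    ((h / 1_000_000) % 1000) / 1000.0
  ]
  # Same instruction-length feature as the diff: length / 100, capped at 2.0
  features << (instruction.empty? ? 0.5 : [instruction.length / 100.0, 2.0].min)
  features
end

# Build an id the same way the diff does: first 12 hex chars of a SHA-256 digest.
config_id = Digest::SHA256.hexdigest("Answer concisely._0_instruction_only_0")[0, 12]
p stable_features(config_id, "Answer concisely.")
```

The three hash-derived features remain arbitrary coordinates either way, so candidates that are similar in content are not necessarily close in feature space; the source comment itself suggests instruction embeddings as a more informative encoding.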

data/lib/dspy/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module DSPy
-  VERSION = "0.25.1"
+  VERSION = "0.26.0"
 end

metadata CHANGED

@@ -1,13 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: dspy
 version: !ruby/object:Gem::Version
-  version: 0.25.1
+  version: 0.26.0
 platform: ruby
 authors:
 - Vicente Reig Rincón de Arellano
 bindir: bin
 cert_chain: []
-date: 2025-09-08 00:00:00.000000000 Z
+date: 2025-09-09 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: dry-configurable
@@ -122,19 +122,19 @@ dependencies:
       - !ruby/object:Gem::Version
         version: '0.3'
 - !ruby/object:Gem::Dependency
-  name: polars-df
+  name: numo-narray
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.20.0
+        version: '0.9'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.20.0
+        version: '0.9'
 - !ruby/object:Gem::Dependency
   name: informers
   requirement: !ruby/object:Gem::Requirement
@@ -239,6 +239,7 @@ files:
 - lib/dspy/module.rb
 - lib/dspy/observability.rb
 - lib/dspy/observability/async_span_processor.rb
+- lib/dspy/optimizers/gaussian_process.rb
 - lib/dspy/predict.rb
 - lib/dspy/prediction.rb
 - lib/dspy/prompt.rb