RubyGems - dspy - Versions diffs - 0.3.1 → 0.4.0 - Mend

dspy 0.3.1 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (29)

checksums.yaml +4 -4
data/README.md +69 -382
data/lib/dspy/chain_of_thought.rb +57 -0
data/lib/dspy/evaluate.rb +554 -0
data/lib/dspy/example.rb +203 -0
data/lib/dspy/few_shot_example.rb +81 -0
data/lib/dspy/instrumentation.rb +97 -8
data/lib/dspy/lm/adapter_factory.rb +6 -8
data/lib/dspy/lm.rb +5 -7
data/lib/dspy/predict.rb +32 -34
data/lib/dspy/prompt.rb +222 -0
data/lib/dspy/propose/grounded_proposer.rb +560 -0
data/lib/dspy/registry/registry_manager.rb +504 -0
data/lib/dspy/registry/signature_registry.rb +725 -0
data/lib/dspy/storage/program_storage.rb +442 -0
data/lib/dspy/storage/storage_manager.rb +331 -0
data/lib/dspy/subscribers/langfuse_subscriber.rb +669 -0
data/lib/dspy/subscribers/logger_subscriber.rb +120 -0
data/lib/dspy/subscribers/newrelic_subscriber.rb +686 -0
data/lib/dspy/subscribers/otel_subscriber.rb +538 -0
data/lib/dspy/teleprompt/data_handler.rb +107 -0
data/lib/dspy/teleprompt/mipro_v2.rb +790 -0
data/lib/dspy/teleprompt/simple_optimizer.rb +497 -0
data/lib/dspy/teleprompt/teleprompter.rb +336 -0
data/lib/dspy/teleprompt/utils.rb +380 -0
data/lib/dspy/version.rb +5 -0
data/lib/dspy.rb +16 -0
metadata +29 -12
data/lib/dspy/lm/adapters/ruby_llm_adapter.rb +0 -81

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 64e8b7011ea06273772d2ef8a985d61aa1ee30d5d6fb3c559dc22ed81e345b16
-  data.tar.gz: a16fab394ee1db1bcaddc0baaa3636590a2ca00c1ce60eb7dc1a00355750009f
+  metadata.gz: 06ba0ad132367bef01b9dccd24cac12433eaed02c47b96ae78b460370b21c85b
+  data.tar.gz: 86baa5b7e136c1a0527e880915a9dfd34ad6093927ad4979f45a0dc44b3bbd9c
 SHA512:
-  metadata.gz: ce4ab780cce89c2c3680e6c5703e853bbd901ac206993af1cd0660cc25e9796ef1dd5adb488d575bbc97ea75a71b563e82eb4cc521b5c84291ff9b1e106216e1
-  data.tar.gz: be313a08f282eb7a08638879298742baf5ec26bcb48b946ac068892c2ad542003d99df575ecc85bb5220c637368f02adf042164831cf5d48fd7139bb3f5424a7
+  metadata.gz: f0e499582d6a3593b3e71b2bb587db6b8599fa1d50cd265cb14a9ac9eb4ea72fc8d5bfa580c67a8f8c5065eef29ef67b26010b316fb07a87e10011a5be5cd7d5
+  data.tar.gz: 765749d9edd708965a61ecdb71c20150c934fd252b2e67913acb41bc7c1f8c33ccf96709a96530c9120dab51643997690aa073dcb3e36425eaa3ed91739895ad
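
The `checksums.yaml` hunk above records SHA256 and SHA512 digests for the two archive members of the gem, `metadata.gz` and `data.tar.gz`. A minimal sketch of how such digests could be checked is shown here. The helper name `checksum_mismatches` and the stand-in byte strings are assumptions for illustration, not part of RubyGems or dspy; a real check would read the members from the downloaded `.gem` archive and the expected values from its `checksums.yaml`.

```ruby
require "digest"

# Map the algorithm names used in checksums.yaml to Ruby's digest classes.
ALGORITHMS = {
  "SHA256" => Digest::SHA256,
  "SHA512" => Digest::SHA512
}.freeze

# Hypothetical helper (not a RubyGems API).
# members:  { "metadata.gz" => raw_bytes, "data.tar.gz" => raw_bytes }
# expected: { "SHA256" => { "metadata.gz" => hex, ... }, "SHA512" => { ... } }
# Returns a list of [algorithm, member_name] pairs whose digest does not match.
def checksum_mismatches(members, expected)
  mismatches = []
  expected.each do |algorithm, digests|
    digest_class = ALGORITHMS.fetch(algorithm)
    digests.each do |name, expected_hex|
      actual_hex = digest_class.hexdigest(members.fetch(name))
      mismatches << [algorithm, name] unless actual_hex == expected_hex
    end
  end
  mismatches
end

# Stand-in data; real bytes would come from the .gem archive members.
members = {
  "metadata.gz" => "stand-in metadata bytes",
  "data.tar.gz" => "stand-in data bytes"
}

# Build an expected table in the same shape as checksums.yaml.
expected = {
  "SHA256" => members.transform_values { |bytes| Digest::SHA256.hexdigest(bytes) },
  "SHA512" => members.transform_values { |bytes| Digest::SHA512.hexdigest(bytes) }
}

puts checksum_mismatches(members, expected).empty?
```

Because every digest in the hunk changed between 0.3.1 and 0.4.0, a check like this only makes sense against the `checksums.yaml` that ships with the same version as the archive being inspected.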

data/README.md CHANGED

@@ -2,14 +2,9 @@
 **Build reliable LLM applications in Ruby using composable, type-safe modules.**
-DSPy.rb brings structured LLM programming to Ruby developers.
-Instead of wrestling with prompt strings and parsing responses,
-you define typed signatures and compose them into pipelines that just work.
+DSPy.rb brings structured LLM programming to Ruby developers. Instead of wrestling with prompt strings and parsing responses, you define typed signatures and compose them into pipelines that just work.
-Traditional prompting is like writing code with string concatenation: it works until
-it doesn't. DSPy.rb brings you the programming approach pioneered
-by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define
-modular signatures and let the framework handle the messy details.
+Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular signatures and let the framework handle the messy details.
 The result? LLM applications that actually scale and don't break when you sneeze.
@@ -22,33 +17,44 @@ The result? LLM applications that actually scale and don't break when you sneeze
 - **ReAct** - Tool-using agents that can actually get things done
 - **RAG** - Context-enriched responses from your data
 - **Multi-stage Pipelines** - Compose multiple LLM calls into workflows
-- OpenAI and Anthropic support via [Ruby LLM](https://github.com/crmne/ruby_llm)
+**Optimization & Evaluation:**
+- **Prompt Objects** - Manipulate prompts as first-class objects instead of strings
+- **Typed Examples** - Type-safe training data with automatic validation
+- **Evaluation Framework** - Systematic testing with built-in metrics
+- **MIPROv2 Optimizer** - State-of-the-art automatic prompt optimization
+- **Simple Optimizer** - Random/grid search for quick experimentation
+**Production Features:**
+- **Storage System** - Persistent optimization result storage with search and filtering
+- **Registry System** - Version control for optimized signatures with deployment tracking
+- **Multi-Platform Observability** - OpenTelemetry, New Relic, and Langfuse integration
+- **Auto-deployment** - Intelligent deployment based on performance improvements
+- **Rollback Protection** - Automatic rollback on performance degradation
+**Developer Experience:**
+- LLM provider support using official Ruby clients:
+  - [OpenAI Ruby](https://github.com/openai/openai-ruby)
+  - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby)
 - Runtime type checking with [Sorbet](https://sorbet.org/)
 - Type-safe tool definitions for ReAct agents
+- Comprehensive instrumentation and observability
 ## Fair Warning
-This is fresh off the oven and evolving fast.
-I'm actively building this as a Ruby port of the [DSPy library](https://dspy.ai/).
-If you hit bugs or want to contribute, just email me directly!
+This is fresh off the oven and evolving fast. I'm actively building this as a Ruby port of the [DSPy library](https://dspy.ai/). If you hit bugs or want to contribute, just email me directly!
-## What's Next
-These are my goals to release v1.0.
+## Quick Start
-- Solidify prompt optimization
-- OTel Integration
-- Ollama support
-## Installation
+### Installation
 Skip the gem for now - install straight from this repo while I prep the first release:
 ```ruby
 gem 'dspy', github: 'vicentereig/dspy.rb'
 ```
-## Usage Examples
-### Simple Prediction
+### Your First DSPy Program
 ```ruby
 # Define a signature for sentiment classification
@@ -80,380 +86,61 @@ end
 # Create the predictor and run inference
 classify = DSPy::Predict.new(Classify)
-result = classify.call(sentence: "This book was super fun to read, though not the last chapter.")
+result = classify.call(sentence: "This book was super fun to read!")
-# result is a properly typed T::Struct instance
 puts result.sentiment    # => #<Sentiment::Positive>
 puts result.confidence   # => 0.85
 ```
-### Chain of Thought Reasoning
-```ruby
-class AnswerPredictor < DSPy::Signature
-  description "Provides a concise answer to the question"
-  input do
-    const :question, String
-  end
-  output do
-    const :answer, String
-  end
-end
-# Chain of thought automatically adds a 'reasoning' field to the output
-qa_cot = DSPy::ChainOfThought.new(AnswerPredictor)
-result = qa_cot.call(question: "Two dice are tossed. What is the probability that the sum equals two?")
-puts result.reasoning  # => "There is only one way to get a sum of 2..."
-puts result.answer     # => "1/36"
-```
-### ReAct Agents with Tools
-```ruby
-class DeepQA < DSPy::Signature
-  description "Answer questions with consideration for the context"
-  input do
-    const :question, String
-  end
-  output do
-    const :answer, String
-  end
-end
-# Define tools for the agent
-class CalculatorTool < DSPy::Tools::Base
-  tool_name 'calculator'
-  tool_description 'Performs basic arithmetic operations'
-  sig { params(operation: String, num1: Float, num2: Float).returns(T.any(Float, String)) }
-  def call(operation:, num1:, num2:)
-    case operation.downcase
-    when 'add' then num1 + num2
-    when 'subtract' then num1 - num2
-    when 'multiply' then num1 * num2
-    when 'divide'
-      return "Error: Cannot divide by zero" if num2 == 0
-      num1 / num2
-    else
-      "Error: Unknown operation '#{operation}'. Use add, subtract, multiply, or divide"
-    end
-  end
-# Create ReAct agent with tools
-agent = DSPy::ReAct.new(DeepQA, tools: [CalculatorTool.new])
-# Run the agent
-result = agent.forward(question: "What is 42 plus 58?")
-puts result.answer # => "100"
-puts result.history # => Array of reasoning steps and tool calls
-```
-### Multi-stage Pipelines
-Outline the sections of an article and draft them out.
-```ruby
-# write an article!
-drafter = ArticleDrafter.new
-article = drafter.forward(topic: "The impact of AI on software development") # { title: '....', sections: [{content: '....'}]}
-class Outline < DSPy::Signature
-  description "Outline a thorough overview of a topic."
-  input do
-    const :topic, String
-  end
-  output do
-    const :title, String
-    const :sections, T::Array[String]
-  end
-end
-class DraftSection < DSPy::Signature
-  description "Draft a section of an article"
-  input do
-    const :topic, String
-    const :title, String
-    const :section, String
-  end
-  output do
-    const :content, String
-  end
-end
-class ArticleDrafter < DSPy::Module
-  def initialize
-    @build_outline = DSPy::ChainOfThought.new(Outline)
-    @draft_section = DSPy::ChainOfThought.new(DraftSection)
-  end
-  def forward(topic:)
-    outline = @build_outline.call(topic: topic)
-    sections = outline.sections.map do |section|
-      @draft_section.call(
-        topic: topic,
-        title: outline.title,
-        section: section
-      )
-    end
-    {
-      title: outline.title,
-      sections: sections.map(&:content)
-    }
-  end
-end
-```
-## Working with Complex Types
-### Enums
-```ruby
-class Color < T::Enum
-  enums do
-    Red = new
-    Green = new
-    Blue = new
-  end
-end
-class ColorSignature < DSPy::Signature
-  description "Identify the dominant color in a description"
-  input do
-    const :description, String,
-      description: 'Description of an object or scene'
-  end
-  output do
-    const :color, Color,
-      description: 'The dominant color (Red, Green, or Blue)'
-  end
-end
-predictor = DSPy::Predict.new(ColorSignature)
-result = predictor.call(description: "A red apple on a wooden table")
-puts result.color  # => #<Color::Red>
-```
-### Optional Fields and Defaults
-```ruby
-class AnalysisSignature < DSPy::Signature
-  description "Analyze text with optional metadata"
-  input do
-    const :text, String,
-      description: 'Text to analyze'
-    const :include_metadata, T::Boolean,
-      description: 'Whether to include metadata in analysis',
-      default: false
-  end
-  output do
-    const :summary, String,
-      description: 'Summary of the text'
-    const :word_count, Integer,
-      description: 'Number of words (optional)',
-      default: 0
-  end
-end
-```
-## Advanced Usage Patterns
-### Multi-stage Pipelines
+## Documentation
-```ruby
-class TopicSignature < DSPy::Signature
-  description "Extract main topic from text"
-  input do
-    const :content, String,
-      description: 'Text content to analyze'
-  end
-  output do
-    const :topic, String,
-      description: 'Main topic of the content'
-  end
-end
+### Getting Started
+- **[Installation & Setup](docs/getting-started/installation.md)** - Detailed installation and configuration
+- **[Quick Start Guide](docs/getting-started/quick-start.md)** - Your first DSPy programs
+- **[Core Concepts](docs/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
-class SummarySignature < DSPy::Signature
-  description "Create summary focusing on specific topic"
-  input do
-    const :content, String,
-      description: 'Original text content'
-    const :topic, String,
-      description: 'Topic to focus on'
-  end
-  output do
-    const :summary, String,
-      description: 'Topic-focused summary'
-  end
-end
+### Core Features
+- **[Signatures & Types](docs/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
+- **[Predictors](docs/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
+- **[Modules & Pipelines](docs/core-concepts/modules.md)** - Compose complex multi-stage workflows
+- **[Examples & Validation](docs/core-concepts/examples.md)** - Type-safe training data
-class ArticlePipeline < DSPy::Signature
-  extend T::Sig
-  def initialize
-    @topic_extractor = DSPy::Predict.new(TopicSignature)
-    @summarizer = DSPy::ChainOfThought.new(SummarySignature)
-  end
-  sig { params(content: String).returns(T.untyped) }
-  def forward(content:)
-    # Extract topic
-    topic_result = @topic_extractor.call(content: content)
-    # Create focused summary
-    summary_result = @summarizer.call(
-      content: content,
-      topic: topic_result.topic
-    )
-    {
-      topic: topic_result.topic,
-      summary: summary_result.summary,
-      reasoning: summary_result.reasoning
-    }
-  end
-end
+### Optimization
+- **[Evaluation Framework](docs/optimization/evaluation.md)** - Systematic testing with metrics
+- **[Prompt Optimization](docs/optimization/prompt-optimization.md)** - Manipulate prompts as objects
+- **[MIPROv2 Optimizer](docs/optimization/miprov2.md)** - State-of-the-art automatic optimization
+- **[Simple Optimizer](docs/optimization/simple-optimizer.md)** - Quick experimentation with random/grid search
-# Usage
-pipeline = ArticlePipeline.new
-result = pipeline.call(content: "Long article content...")
-```
+### Production Features
+- **[Storage System](docs/enterprise/storage.md)** - Persist and search optimization results
+- **[Registry & Versions](docs/enterprise/registry.md)** - Version control with deployment tracking
+- **[Observability](docs/enterprise/observability.md)** - Multi-platform monitoring and metrics
-### Retrieval Augmented Generation
+### Advanced Usage
+- **[Complex Types](docs/advanced/complex-types.md)** - Enums, optional fields, and defaults
+- **[Multi-stage Pipelines](docs/advanced/pipelines.md)** - Advanced composition patterns
+- **[RAG Implementation](docs/advanced/rag.md)** - Retrieval Augmented Generation
+- **[Custom Metrics](docs/advanced/custom-metrics.md)** - Domain-specific evaluation logic
-```ruby
-class ContextualQA < DSPy::Signature
-  description "Answer questions using relevant context"
-  input do
-    const :question, String,
-      description: 'The question to answer'
-    const :context, T::Array[String],
-      description: 'Relevant context passages'
-  end
-  output do
-    const :answer, String,
-      description: 'Answer based on the provided context'
-    const :confidence, Float,
-      description: 'Confidence in the answer (0.0 to 1.0)'
-  end
-end
-# Usage with retriever
-retriever = YourRetrieverClass.new
-qa = DSPy::ChainOfThought.new(ContextualQA)
-question = "What is the capital of France?"
-context = retriever.retrieve(question)  # Returns array of strings
-result = qa.call(question: question, context: context)
-puts result.reasoning   # Step-by-step reasoning
-puts result.answer      # "Paris"
-puts result.confidence  # 0.95
-```
-## Instrumentation & Observability
-DSPy.rb includes built-in instrumentation that captures detailed events and
-performance metrics from your LLM operations. Perfect for monitoring your
-applications and integrating with observability tools.
-### Available Events
-Subscribe to these events to monitor different aspects of your LLM operations:
-| Event Name | Triggered When | Key Payload Fields |
-|------------|----------------|-------------------|
-| `dspy.lm.request` | LLM API request lifecycle | `gen_ai_system`, `model`, `provider`, `duration_ms`, `status` |
-| `dspy.lm.tokens` | Token usage tracking | `tokens_input`, `tokens_output`, `tokens_total` |
-| `dspy.predict` | Prediction operations | `signature_class`, `input_size`, `duration_ms`, `status` |
-| `dspy.chain_of_thought` | CoT reasoning | `signature_class`, `model`, `duration_ms`, `status` |
-| `dspy.react` | Agent operations | `max_iterations`, `tools_used`, `duration_ms`, `status` |
-| `dspy.react.tool_call` | Tool execution | `tool_name`, `tool_input`, `tool_output`, `duration_ms` |
-### Event Payloads
-The instrumentation emits events with structured payloads you can process:
-```ruby
-# Example event payload for dspy.predict
-{
-  signature_class: "QuestionAnswering",
-  model: "gpt-4o-mini",
-  provider: "openai",
-  input_size: 45,
-  duration_ms: 1234.56,
-  cpu_time_ms: 89.12,
-  status: "success",
-  timestamp: "2024-01-15T10:30:00Z"
-}
-# Example token usage payload
-{
-  tokens_input: 150,
-  tokens_output: 45,
-  tokens_total: 195,
-  gen_ai_system: "openai",
-  signature_class: "QuestionAnswering"
-}
-```
-Events are emitted via dry-monitor notifications, giving you flexibility to
-process them however you need - logging, metrics, alerts, or custom monitoring.
-### Token Tracking
-Token usage is extracted from actual API responses (OpenAI and Anthropic only),
-giving you precise cost tracking:
-```ruby
-# Token events include:
-{
-  tokens_input: 150,     # From API response
-  tokens_output: 45,     # From API response
-  tokens_total: 195,     # From API response
-  gen_ai_system: "openai",
-  gen_ai_request_model: "gpt-4o-mini"
-}
-```
-### Integration with Monitoring Tools
-Subscribe to events for custom processing:
+## What's Next
-```ruby
-# Subscribe to all LM events
-DSPy::Instrumentation.subscribe('dspy.lm.*') do |event|
-  puts "#{event.id}: #{event.payload[:duration_ms]}ms"
-end
+These are my goals to release v1.0.
-# Subscribe to specific events
-DSPy::Instrumentation.subscribe('dspy.predict') do |event|
-  MyMetrics.histogram('dspy.predict.duration', event.payload[:duration_ms])
-end
-```
+- ✅ Prompt objects foundation - *Done*
+- ✅ Evaluation framework - *Done*
+- ✅ Teleprompter base classes - *Done*
+- ✅ MIPROv2 optimization algorithm - *Done*
+- ✅ Storage & persistence system - *Done*
+- ✅ Registry & version management - *Done*
+- ✅ OpenTelemetry integration - *Done*
+- ✅ New Relic integration - *Done*
+- ✅ Langfuse integration - *Done*
+- 🚧 Ollama support
+- Context Engineering (see recent research: [How Contexts Fail](https://www.dbreunig.com/2025/06/22/how-contexts-fail-and-how-to-fix-them.html), [How to Fix Your Context](https://www.dbreunig.com/2025/06/26/how-to-fix-your-context.html), [Context Engineering](https://simonwillison.net/2025/Jun/27/context-engineering/))
+- Agentic Memory support
+- MCP Support
+- Documentation website
+- Performance benchmarks
 ## License

data/lib/dspy/chain_of_thought.rb CHANGED

@@ -53,6 +53,63 @@ module DSPy
       @signature_class = enhanced_signature
     end
+    # Override prompt-based methods to maintain ChainOfThought behavior
+    sig { override.params(new_prompt: Prompt).returns(ChainOfThought) }
+    def with_prompt(new_prompt)
+      # Create a new ChainOfThought with the same original signature
+      instance = self.class.new(@original_signature)
+      # Ensure the instruction includes "Think step by step" if not already present
+      enhanced_instruction = if new_prompt.instruction.include?("Think step by step")
+                               new_prompt.instruction
+                             else
+                               "#{new_prompt.instruction} Think step by step."
+                             end
+      # Create enhanced prompt with ChainOfThought-specific schemas
+      enhanced_prompt = Prompt.new(
+        instruction: enhanced_instruction,
+        input_schema: @signature_class.input_json_schema,
+        output_schema: @signature_class.output_json_schema,
+        few_shot_examples: new_prompt.few_shot_examples,
+        signature_class_name: @signature_class.name
+      )
+      instance.instance_variable_set(:@prompt, enhanced_prompt)
+      instance
+    end
+    sig { override.params(instruction: String).returns(ChainOfThought) }
+    def with_instruction(instruction)
+      # Ensure ChainOfThought behavior is preserved
+      cot_instruction = instruction.include?("Think step by step") ? instruction : "#{instruction} Think step by step."
+      super(cot_instruction)
+    end
+    sig { override.params(examples: T::Array[FewShotExample]).returns(ChainOfThought) }
+    def with_examples(examples)
+      # Convert examples to include reasoning if they don't have it
+      enhanced_examples = examples.map do |example|
+        if example.reasoning.nil? || example.reasoning.empty?
+          # Try to extract reasoning from the output if it contains a reasoning field
+          reasoning = example.output[:reasoning] || "Step by step reasoning for this example."
+          DSPy::FewShotExample.new(
+            input: example.input,
+            output: example.output,
+            reasoning: reasoning
+          )
+        else
+          example
+        end
+      end
+      super(enhanced_examples)
+    end
+    # Access to the original signature for optimization
+    sig { returns(T.class_of(DSPy::Signature)) }
+    attr_reader :original_signature
     # Override forward_untyped to add ChainOfThought-specific instrumentation
     sig { override.params(input_values: T.untyped).returns(T.untyped) }
     def forward_untyped(**input_values)