dspy 0.3.1 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 64e8b7011ea06273772d2ef8a985d61aa1ee30d5d6fb3c559dc22ed81e345b16
- data.tar.gz: a16fab394ee1db1bcaddc0baaa3636590a2ca00c1ce60eb7dc1a00355750009f
+ metadata.gz: 06ba0ad132367bef01b9dccd24cac12433eaed02c47b96ae78b460370b21c85b
+ data.tar.gz: 86baa5b7e136c1a0527e880915a9dfd34ad6093927ad4979f45a0dc44b3bbd9c
  SHA512:
- metadata.gz: ce4ab780cce89c2c3680e6c5703e853bbd901ac206993af1cd0660cc25e9796ef1dd5adb488d575bbc97ea75a71b563e82eb4cc521b5c84291ff9b1e106216e1
- data.tar.gz: be313a08f282eb7a08638879298742baf5ec26bcb48b946ac068892c2ad542003d99df575ecc85bb5220c637368f02adf042164831cf5d48fd7139bb3f5424a7
+ metadata.gz: f0e499582d6a3593b3e71b2bb587db6b8599fa1d50cd265cb14a9ac9eb4ea72fc8d5bfa580c67a8f8c5065eef29ef67b26010b316fb07a87e10011a5be5cd7d5
+ data.tar.gz: 765749d9edd708965a61ecdb71c20150c934fd252b2e67913acb41bc7c1f8c33ccf96709a96530c9120dab51643997690aa073dcb3e36425eaa3ed91739895ad
data/README.md CHANGED
@@ -2,14 +2,9 @@
 
  **Build reliable LLM applications in Ruby using composable, type-safe modules.**
 
- DSPy.rb brings structured LLM programming to Ruby developers.
- Instead of wrestling with prompt strings and parsing responses,
- you define typed signatures and compose them into pipelines that just work.
+ DSPy.rb brings structured LLM programming to Ruby developers. Instead of wrestling with prompt strings and parsing responses, you define typed signatures and compose them into pipelines that just work.
 
- Traditional prompting is like writing code with string concatenation: it works until
- it doesn't. DSPy.rb brings you the programming approach pioneered
- by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define
- modular signatures and let the framework handle the messy details.
+ Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular signatures and let the framework handle the messy details.
 
  The result? LLM applications that actually scale and don't break when you sneeze.
 
@@ -22,33 +17,44 @@ The result? LLM applications that actually scale and don't break when you sneeze
  - **ReAct** - Tool-using agents that can actually get things done
  - **RAG** - Context-enriched responses from your data
  - **Multi-stage Pipelines** - Compose multiple LLM calls into workflows
- - OpenAI and Anthropic support via [Ruby LLM](https://github.com/crmne/ruby_llm)
+
+ **Optimization & Evaluation:**
+ - **Prompt Objects** - Manipulate prompts as first-class objects instead of strings
+ - **Typed Examples** - Type-safe training data with automatic validation
+ - **Evaluation Framework** - Systematic testing with built-in metrics
+ - **MIPROv2 Optimizer** - State-of-the-art automatic prompt optimization
+ - **Simple Optimizer** - Random/grid search for quick experimentation
+
+ **Production Features:**
+ - **Storage System** - Persistent optimization result storage with search and filtering
+ - **Registry System** - Version control for optimized signatures with deployment tracking
+ - **Multi-Platform Observability** - OpenTelemetry, New Relic, and Langfuse integration
+ - **Auto-deployment** - Intelligent deployment based on performance improvements
+ - **Rollback Protection** - Automatic rollback on performance degradation
+
+ **Developer Experience:**
+ - LLM provider support using official Ruby clients:
+   - [OpenAI Ruby](https://github.com/openai/openai-ruby)
+   - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby)
  - Runtime type checking with [Sorbet](https://sorbet.org/)
  - Type-safe tool definitions for ReAct agents
+ - Comprehensive instrumentation and observability
 
  ## Fair Warning
 
- This is fresh off the oven and evolving fast.
- I'm actively building this as a Ruby port of the [DSPy library](https://dspy.ai/).
- If you hit bugs or want to contribute, just email me directly!
+ This is fresh off the oven and evolving fast. I'm actively building this as a Ruby port of the [DSPy library](https://dspy.ai/). If you hit bugs or want to contribute, just email me directly!
 
- ## What's Next
- These are my goals to release v1.0.
+ ## Quick Start
 
- - Solidify prompt optimization
- - OTel Integration
- - Ollama support
-
- ## Installation
+ ### Installation
 
  Skip the gem for now - install straight from this repo while I prep the first release:
+
  ```ruby
  gem 'dspy', github: 'vicentereig/dspy.rb'
  ```
 
- ## Usage Examples
-
- ### Simple Prediction
+ ### Your First DSPy Program
 
  ```ruby
  # Define a signature for sentiment classification
@@ -80,380 +86,61 @@ end
 
  # Create the predictor and run inference
  classify = DSPy::Predict.new(Classify)
- result = classify.call(sentence: "This book was super fun to read, though not the last chapter.")
+ result = classify.call(sentence: "This book was super fun to read!")
 
- # result is a properly typed T::Struct instance
  puts result.sentiment # => #<Sentiment::Positive>
  puts result.confidence # => 0.85
  ```
 
- ### Chain of Thought Reasoning
-
- ```ruby
- class AnswerPredictor < DSPy::Signature
-   description "Provides a concise answer to the question"
-
-   input do
-     const :question, String
-   end
-
-   output do
-     const :answer, String
-   end
- end
-
- # Chain of thought automatically adds a 'reasoning' field to the output
- qa_cot = DSPy::ChainOfThought.new(AnswerPredictor)
- result = qa_cot.call(question: "Two dice are tossed. What is the probability that the sum equals two?")
-
- puts result.reasoning # => "There is only one way to get a sum of 2..."
- puts result.answer # => "1/36"
- ```
-
- ### ReAct Agents with Tools
-
- ```ruby
-
- class DeepQA < DSPy::Signature
-   description "Answer questions with consideration for the context"
-
-   input do
-     const :question, String
-   end
-
-   output do
-     const :answer, String
-   end
- end
-
- # Define tools for the agent
- class CalculatorTool < DSPy::Tools::Base
-
-   tool_name 'calculator'
-   tool_description 'Performs basic arithmetic operations'
-
-   sig { params(operation: String, num1: Float, num2: Float).returns(T.any(Float, String)) }
-   def call(operation:, num1:, num2:)
-     case operation.downcase
-     when 'add' then num1 + num2
-     when 'subtract' then num1 - num2
-     when 'multiply' then num1 * num2
-     when 'divide'
-       return "Error: Cannot divide by zero" if num2 == 0
-       num1 / num2
-     else
-       "Error: Unknown operation '#{operation}'. Use add, subtract, multiply, or divide"
-     end
-   end
-
- # Create ReAct agent with tools
- agent = DSPy::ReAct.new(DeepQA, tools: [CalculatorTool.new])
-
- # Run the agent
- result = agent.forward(question: "What is 42 plus 58?")
- puts result.answer # => "100"
- puts result.history # => Array of reasoning steps and tool calls
- ```
-
- ### Multi-stage Pipelines
- Outline the sections of an article and draft them out.
-
- ```ruby
-
- # write an article!
- drafter = ArticleDrafter.new
- article = drafter.forward(topic: "The impact of AI on software development") # { title: '....', sections: [{content: '....'}]}
-
- class Outline < DSPy::Signature
-   description "Outline a thorough overview of a topic."
-
-   input do
-     const :topic, String
-   end
-
-   output do
-     const :title, String
-     const :sections, T::Array[String]
-   end
- end
-
- class DraftSection < DSPy::Signature
-   description "Draft a section of an article"
-
-   input do
-     const :topic, String
-     const :title, String
-     const :section, String
-   end
-
-   output do
-     const :content, String
-   end
- end
-
- class ArticleDrafter < DSPy::Module
-   def initialize
-     @build_outline = DSPy::ChainOfThought.new(Outline)
-     @draft_section = DSPy::ChainOfThought.new(DraftSection)
-   end
-
-   def forward(topic:)
-     outline = @build_outline.call(topic: topic)
-
-     sections = outline.sections.map do |section|
-       @draft_section.call(
-         topic: topic,
-         title: outline.title,
-         section: section
-       )
-     end
-
-     {
-       title: outline.title,
-       sections: sections.map(&:content)
-     }
-   end
- end
-
- ```
-
- ## Working with Complex Types
-
- ### Enums
-
- ```ruby
- class Color < T::Enum
-   enums do
-     Red = new
-     Green = new
-     Blue = new
-   end
- end
-
- class ColorSignature < DSPy::Signature
-   description "Identify the dominant color in a description"
-
-   input do
-     const :description, String,
-       description: 'Description of an object or scene'
-   end
-
-   output do
-     const :color, Color,
-       description: 'The dominant color (Red, Green, or Blue)'
-   end
- end
-
- predictor = DSPy::Predict.new(ColorSignature)
- result = predictor.call(description: "A red apple on a wooden table")
- puts result.color # => #<Color::Red>
- ```
-
- ### Optional Fields and Defaults
-
- ```ruby
- class AnalysisSignature < DSPy::Signature
-   description "Analyze text with optional metadata"
-
-   input do
-     const :text, String,
-       description: 'Text to analyze'
-     const :include_metadata, T::Boolean,
-       description: 'Whether to include metadata in analysis',
-       default: false
-   end
-
-   output do
-     const :summary, String,
-       description: 'Summary of the text'
-     const :word_count, Integer,
-       description: 'Number of words (optional)',
-       default: 0
-   end
- end
- ```
-
- ## Advanced Usage Patterns
-
- ### Multi-stage Pipelines
+ ## Documentation
 
- ```ruby
- class TopicSignature < DSPy::Signature
-   description "Extract main topic from text"
-
-   input do
-     const :content, String,
-       description: 'Text content to analyze'
-   end
-
-   output do
-     const :topic, String,
-       description: 'Main topic of the content'
-   end
- end
+ ### Getting Started
+ - **[Installation & Setup](docs/getting-started/installation.md)** - Detailed installation and configuration
+ - **[Quick Start Guide](docs/getting-started/quick-start.md)** - Your first DSPy programs
+ - **[Core Concepts](docs/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
 
- class SummarySignature < DSPy::Signature
-   description "Create summary focusing on specific topic"
-
-   input do
-     const :content, String,
-       description: 'Original text content'
-     const :topic, String,
-       description: 'Topic to focus on'
-   end
-
-   output do
-     const :summary, String,
-       description: 'Topic-focused summary'
-   end
- end
+ ### Core Features
+ - **[Signatures & Types](docs/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
+ - **[Predictors](docs/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
+ - **[Modules & Pipelines](docs/core-concepts/modules.md)** - Compose complex multi-stage workflows
+ - **[Examples & Validation](docs/core-concepts/examples.md)** - Type-safe training data
 
- class ArticlePipeline < DSPy::Signature
-   extend T::Sig
-
-   def initialize
-     @topic_extractor = DSPy::Predict.new(TopicSignature)
-     @summarizer = DSPy::ChainOfThought.new(SummarySignature)
-   end
-
-   sig { params(content: String).returns(T.untyped) }
-   def forward(content:)
-     # Extract topic
-     topic_result = @topic_extractor.call(content: content)
-
-     # Create focused summary
-     summary_result = @summarizer.call(
-       content: content,
-       topic: topic_result.topic
-     )
-
-     {
-       topic: topic_result.topic,
-       summary: summary_result.summary,
-       reasoning: summary_result.reasoning
-     }
-   end
- end
+ ### Optimization
+ - **[Evaluation Framework](docs/optimization/evaluation.md)** - Systematic testing with metrics
+ - **[Prompt Optimization](docs/optimization/prompt-optimization.md)** - Manipulate prompts as objects
+ - **[MIPROv2 Optimizer](docs/optimization/miprov2.md)** - State-of-the-art automatic optimization
+ - **[Simple Optimizer](docs/optimization/simple-optimizer.md)** - Quick experimentation with random/grid search
 
- # Usage
- pipeline = ArticlePipeline.new
- result = pipeline.call(content: "Long article content...")
- ```
+ ### Production Features
+ - **[Storage System](docs/enterprise/storage.md)** - Persist and search optimization results
+ - **[Registry & Versions](docs/enterprise/registry.md)** - Version control with deployment tracking
+ - **[Observability](docs/enterprise/observability.md)** - Multi-platform monitoring and metrics
 
- ### Retrieval Augmented Generation
+ ### Advanced Usage
+ - **[Complex Types](docs/advanced/complex-types.md)** - Enums, optional fields, and defaults
+ - **[Multi-stage Pipelines](docs/advanced/pipelines.md)** - Advanced composition patterns
+ - **[RAG Implementation](docs/advanced/rag.md)** - Retrieval Augmented Generation
+ - **[Custom Metrics](docs/advanced/custom-metrics.md)** - Domain-specific evaluation logic
 
- ```ruby
- class ContextualQA < DSPy::Signature
-   description "Answer questions using relevant context"
-
-   input do
-     const :question, String,
-       description: 'The question to answer'
-     const :context, T::Array[String],
-       description: 'Relevant context passages'
-   end
-
-   output do
-     const :answer, String,
-       description: 'Answer based on the provided context'
-     const :confidence, Float,
-       description: 'Confidence in the answer (0.0 to 1.0)'
-   end
- end
-
- # Usage with retriever
- retriever = YourRetrieverClass.new
- qa = DSPy::ChainOfThought.new(ContextualQA)
-
- question = "What is the capital of France?"
- context = retriever.retrieve(question) # Returns array of strings
-
- result = qa.call(question: question, context: context)
- puts result.reasoning # Step-by-step reasoning
- puts result.answer # "Paris"
- puts result.confidence # 0.95
- ```
-
- ## Instrumentation & Observability
-
- DSPy.rb includes built-in instrumentation that captures detailed events and
- performance metrics from your LLM operations. Perfect for monitoring your
- applications and integrating with observability tools.
-
- ### Available Events
-
- Subscribe to these events to monitor different aspects of your LLM operations:
-
- | Event Name | Triggered When | Key Payload Fields |
- |------------|----------------|-------------------|
- | `dspy.lm.request` | LLM API request lifecycle | `gen_ai_system`, `model`, `provider`, `duration_ms`, `status` |
- | `dspy.lm.tokens` | Token usage tracking | `tokens_input`, `tokens_output`, `tokens_total` |
- | `dspy.predict` | Prediction operations | `signature_class`, `input_size`, `duration_ms`, `status` |
- | `dspy.chain_of_thought` | CoT reasoning | `signature_class`, `model`, `duration_ms`, `status` |
- | `dspy.react` | Agent operations | `max_iterations`, `tools_used`, `duration_ms`, `status` |
- | `dspy.react.tool_call` | Tool execution | `tool_name`, `tool_input`, `tool_output`, `duration_ms` |
-
- ### Event Payloads
-
- The instrumentation emits events with structured payloads you can process:
-
- ```ruby
- # Example event payload for dspy.predict
- {
-   signature_class: "QuestionAnswering",
-   model: "gpt-4o-mini",
-   provider: "openai",
-   input_size: 45,
-   duration_ms: 1234.56,
-   cpu_time_ms: 89.12,
-   status: "success",
-   timestamp: "2024-01-15T10:30:00Z"
- }
-
- # Example token usage payload
- {
-   tokens_input: 150,
-   tokens_output: 45,
-   tokens_total: 195,
-   gen_ai_system: "openai",
-   signature_class: "QuestionAnswering"
- }
- ```
-
- Events are emitted via dry-monitor notifications, giving you flexibility to
- process them however you need - logging, metrics, alerts, or custom monitoring.
-
- ### Token Tracking
-
- Token usage is extracted from actual API responses (OpenAI and Anthropic only),
- giving you precise cost tracking:
-
- ```ruby
- # Token events include:
- {
-   tokens_input: 150, # From API response
-   tokens_output: 45, # From API response
-   tokens_total: 195, # From API response
-   gen_ai_system: "openai",
-   gen_ai_request_model: "gpt-4o-mini"
- }
- ```
-
- ### Integration with Monitoring Tools
-
- Subscribe to events for custom processing:
+ ## What's Next
 
- ```ruby
- # Subscribe to all LM events
- DSPy::Instrumentation.subscribe('dspy.lm.*') do |event|
-   puts "#{event.id}: #{event.payload[:duration_ms]}ms"
- end
+ These are my goals to release v1.0.
 
- # Subscribe to specific events
- DSPy::Instrumentation.subscribe('dspy.predict') do |event|
-   MyMetrics.histogram('dspy.predict.duration', event.payload[:duration_ms])
- end
- ```
+ - Prompt objects foundation - *Done*
+ - Evaluation framework - *Done*
+ - ✅ Teleprompter base classes - *Done*
+ - ✅ MIPROv2 optimization algorithm - *Done*
+ - ✅ Storage & persistence system - *Done*
+ - ✅ Registry & version management - *Done*
+ - ✅ OpenTelemetry integration - *Done*
+ - ✅ New Relic integration - *Done*
+ - ✅ Langfuse integration - *Done*
+ - 🚧 Ollama support
+ - Context Engineering (see recent research: [How Contexts Fail](https://www.dbreunig.com/2025/06/22/how-contexts-fail-and-how-to-fix-them.html), [How to Fix Your Context](https://www.dbreunig.com/2025/06/26/how-to-fix-your-context.html), [Context Engineering](https://simonwillison.net/2025/Jun/27/context-engineering/))
+ - Agentic Memory support
+ - MCP Support
+ - Documentation website
+ - Performance benchmarks
 
  ## License
 
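Reviewer note: the ChainOfThought hunk that follows adds `with_instruction` and `with_examples` overrides. The sketch below isolates the two rules they apply, as read from the added lines: the "Think step by step" suffix is appended only when it is missing, and examples without reasoning get one backfilled. `Example`, `ensure_cot`, and `backfill_reasoning` are hypothetical stand-ins written for illustration; they are not part of the gem and do not use `DSPy::FewShotExample` or `DSPy::Prompt`.

```ruby
# Hypothetical stand-in for a few-shot example; NOT the gem's DSPy::FewShotExample.
Example = Struct.new(:input, :output, :reasoning, keyword_init: true)

COT_MARKER = "Think step by step"
PLACEHOLDER_REASONING = "Step by step reasoning for this example."

# Append the marker only when the instruction does not already contain it,
# so applying the rule twice gives the same string as applying it once.
def ensure_cot(instruction)
  instruction.include?(COT_MARKER) ? instruction : "#{instruction} #{COT_MARKER}."
end

# Give every example a reasoning string: keep an existing one, otherwise
# reuse output[:reasoning], otherwise fall back to the placeholder text.
def backfill_reasoning(examples)
  examples.map do |example|
    next example unless example.reasoning.nil? || example.reasoning.empty?

    Example.new(
      input: example.input,
      output: example.output,
      reasoning: example.output[:reasoning] || PLACEHOLDER_REASONING
    )
  end
end

once  = ensure_cot("Answer the question.")
twice = ensure_cot(once)
puts once           # Answer the question. Think step by step.
puts once == twice  # true

filled = backfill_reasoning([Example.new(input: { q: "2+2?" }, output: { answer: "4" })])
puts filled.first.reasoning # Step by step reasoning for this example.
```

One consequence worth noting when reviewing the hunk: an example with no reasoning of its own ends up carrying the generic placeholder sentence, so optimizers that score few-shot examples may see identical reasoning text across many examples.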
@@ -53,6 +53,63 @@ module DSPy
        @signature_class = enhanced_signature
      end
 
+     # Override prompt-based methods to maintain ChainOfThought behavior
+     sig { override.params(new_prompt: Prompt).returns(ChainOfThought) }
+     def with_prompt(new_prompt)
+       # Create a new ChainOfThought with the same original signature
+       instance = self.class.new(@original_signature)
+
+       # Ensure the instruction includes "Think step by step" if not already present
+       enhanced_instruction = if new_prompt.instruction.include?("Think step by step")
+         new_prompt.instruction
+       else
+         "#{new_prompt.instruction} Think step by step."
+       end
+
+       # Create enhanced prompt with ChainOfThought-specific schemas
+       enhanced_prompt = Prompt.new(
+         instruction: enhanced_instruction,
+         input_schema: @signature_class.input_json_schema,
+         output_schema: @signature_class.output_json_schema,
+         few_shot_examples: new_prompt.few_shot_examples,
+         signature_class_name: @signature_class.name
+       )
+
+       instance.instance_variable_set(:@prompt, enhanced_prompt)
+       instance
+     end
+
+     sig { override.params(instruction: String).returns(ChainOfThought) }
+     def with_instruction(instruction)
+       # Ensure ChainOfThought behavior is preserved
+       cot_instruction = instruction.include?("Think step by step") ? instruction : "#{instruction} Think step by step."
+       super(cot_instruction)
+     end
+
+     sig { override.params(examples: T::Array[FewShotExample]).returns(ChainOfThought) }
+     def with_examples(examples)
+       # Convert examples to include reasoning if they don't have it
+       enhanced_examples = examples.map do |example|
+         if example.reasoning.nil? || example.reasoning.empty?
+           # Try to extract reasoning from the output if it contains a reasoning field
+           reasoning = example.output[:reasoning] || "Step by step reasoning for this example."
+           DSPy::FewShotExample.new(
+             input: example.input,
+             output: example.output,
+             reasoning: reasoning
+           )
+         else
+           example
+         end
+       end
+
+       super(enhanced_examples)
+     end
+
+     # Access to the original signature for optimization
+     sig { returns(T.class_of(DSPy::Signature)) }
+     attr_reader :original_signature
+
      # Override forward_untyped to add ChainOfThought-specific instrumentation
      sig { override.params(input_values: T.untyped).returns(T.untyped) }
      def forward_untyped(**input_values)