dspy-datasets 0.29.1 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: c8de3f972de17ce584e6f1f8f7eec8084b6d24c3517fd14001d58d12537b98d1
- data.tar.gz: f47577ccf5b0826387bfb991d3f6372f9a41cccef7e1d9f3583030a0b5a4c61e
+ metadata.gz: f279306528c5bfcfaabda4c088e8d9f909c0da3ec1d259c2ef2ffcf0daf64476
+ data.tar.gz: b07a55b2949e177d5dd8756211831099abf5aa1fbd942dc3d99c635410bc4bee
  SHA512:
- metadata.gz: e02a16d9b3321c2841d052e1c69fa91106cbbbeb8b44394f1c41052b01936a2757cb94b26c0292309effe724477eae12487ce6a9ac85b6bd10c1bd12f13a9798
- data.tar.gz: 9ac56b72949104a5bb5d998768f419283b9a47b00653f54f84e99de492c987cdc548f060faa84855db4a8a54f2c231524f3ebf5269e774d8e10888d3fbdcabbf
+ metadata.gz: 507c2dd52c37d081bd452529ad88973b3f4821bb42a50214eeb8cd99fd7c6a8aec5497354ccf2a379e284196bfdf27ad44332f02bb326c637a10c2c8203a4fdf
+ data.tar.gz: 100a758e3242f4dd0788f349323e688bc00240c9b4a6095f4066c2bb6f9955e09de9c0231e0656ef31a05dc1ec88e51b90e815037b3e5cf254da4d4d3c9642b4
data/README.md CHANGED
@@ -5,26 +5,79 @@
  [![Build Status](https://img.shields.io/github/actions/workflow/status/vicentereig/dspy.rb/ruby.yml?branch=main&label=build)](https://github.com/vicentereig/dspy.rb/actions/workflows/ruby.yml)
  [![Documentation](https://img.shields.io/badge/docs-vicentereig.github.io%2Fdspy.rb-blue)](https://vicentereig.github.io/dspy.rb/)
 
+ > [!NOTE]
+ > The core Prompt Engineering Framework is production-ready with
+ > comprehensive documentation. I am now focusing on educational content on systematic Prompt Optimization and Context Engineering.
+ > Your feedback is invaluable. If you encounter issues, please open an [issue](https://github.com/vicentereig/dspy.rb/issues). If you have suggestions, open a [new thread](https://github.com/vicentereig/dspy.rb/discussions).
+ >
+ > If you want to contribute, feel free to reach out to me to coordinate efforts: hey at vicente.services
+ >
+ > And, yes, this is 100% a legit project. :)
+
+
  **Build reliable LLM applications in idiomatic Ruby using composable, type-safe modules.**
 
- The Ruby framework for programming with large language models. DSPy.rb brings structured LLM programming to Ruby developers. Instead of wrestling with prompt strings and parsing responses, you define typed signatures using idiomatic Ruby to compose and decompose AI Worklows and AI Agents.
+ The Ruby framework for programming with large language models. DSPy.rb brings structured LLM programming, programmatic Prompt Engineering, and Context Engineering to Ruby developers.
+ Instead of wrestling with prompt strings and parsing responses, you define typed signatures using idiomatic Ruby to compose and decompose AI Workflows and AI Agents.
 
  **Prompts are just Functions.** Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you
  the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular
  signatures and let the framework handle the messy details.
 
  DSPy.rb is an idiomatic Ruby surgical port of Stanford's [DSPy framework](https://github.com/stanfordnlp/dspy). While implementing
- the core concepts of signatures, predictors, and optimization from the original Python library, DSPy.rb embraces Ruby
- conventions and adds Ruby-specific innovations like CodeAct agents and enhanced production instrumentation.
+ the core concepts of signatures, predictors, and the main optimization algorithms from the original Python library, DSPy.rb embraces Ruby
+ conventions and adds Ruby-specific innovations like a Sorbet-based type system, ReAct loops, and production-ready integrations such as non-blocking OpenTelemetry instrumentation.
 
- The result? LLM applications that actually scale and don't break when you sneeze.
+ **What do you get?** Ruby LLM applications that actually scale and don't break when you sneeze.
+
+ Check the [examples](examples/) and take them for a spin!
 
  ## Your First DSPy Program
+ ### Installation
+
+ Add to your Gemfile:
+
+ ```ruby
+ gem 'dspy'
+ ```
+
+ and then run:
+
+ ```bash
+ bundle install
+ ```
+
+ ### Optional Sibling Gems
+
+ DSPy.rb ships multiple gems from this monorepo so you only install what you need. Add these alongside `dspy`:
+
+ | Gem | Description | Status |
+ | --- | --- | --- |
+ | `dspy-schema` | Exposes `DSPy::TypeSystem::SorbetJsonSchema` for downstream reuse. | **Stable** (v1.0.0) |
+ | `dspy-code_act` | Think-Code-Observe agents that synthesize and execute Ruby safely. | Preview (0.x) |
+ | `dspy-datasets` | Dataset helpers plus Parquet/Polars tooling for richer evaluation corpora. | Preview (0.x) |
+ | `dspy-evals` | High-throughput evaluation harness with metrics, callbacks, and regression fixtures. | Preview (0.x) |
+ | `dspy-miprov2` | Bayesian optimization + Gaussian Process backend for the MIPROv2 teleprompter. | Preview (0.x) |
+ | `dspy-gepa` | `DSPy::Teleprompt::GEPA`, reflection loops, experiment tracking, telemetry adapters. | Preview (mirrors `dspy` version) |
+ | `gepa` | GEPA optimizer core (Pareto engine, telemetry, reflective proposer). | Preview (mirrors `dspy` version) |
+ | `dspy-o11y` | Core observability APIs: `DSPy::Observability`, async span processor, observation types. | **Stable** (v1.0.0) |
+ | `dspy-o11y-langfuse` | Auto-configures DSPy observability to stream spans to Langfuse via OTLP. | **Stable** (v1.0.0) |
+
+ Set the matching `DSPY_WITH_*` environment variables (see `Gemfile`) to include or exclude each sibling gem when running Bundler locally (for example `DSPY_WITH_GEPA=1` or `DSPY_WITH_O11Y_LANGFUSE=1`). Refer to `docs/core-concepts/dependency-tree.md` for the full dependency map and roadmap.
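For example, a Gemfile that pulls in a couple of the optional siblings alongside the core gem might look like the sketch below (editor's illustration, not part of this diff; gem names come from the table above, and version pinning is omitted):

```ruby
# Gemfile - core framework plus optional sibling gems (illustrative)
source 'https://rubygems.org'

gem 'dspy'                # core prompt-engineering framework
gem 'dspy-datasets'       # dataset helpers (preview)
gem 'dspy-evals'          # evaluation harness (preview)
gem 'dspy-o11y'           # observability core (stable)
gem 'dspy-o11y-langfuse'  # stream spans to Langfuse via OTLP (stable)
```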
+ ### Your First Reliable Predictor
 
  ```ruby
- # Define a signature for sentiment classification
+
+ # Configure DSPy globally to use your fave LLM - you can override this at the instance level.
+ DSPy.configure do |c|
+   c.lm = DSPy::LM.new('openai/gpt-4o-mini',
+                       api_key: ENV['OPENAI_API_KEY'],
+                       structured_outputs: true) # Enable OpenAI's native JSON mode
+ end
+
+ # Define a signature for sentiment classification - instead of writing a full prompt!
  class Classify < DSPy::Signature
-   description "Classify sentiment of a given sentence."
+   description "Classify sentiment of a given sentence." # sets the goal of the underlying prompt
 
    class Sentiment < T::Enum
      enums do
@@ -33,26 +86,22 @@ class Classify < DSPy::Signature
        Neutral = new('neutral')
      end
    end
-
+
+   # Structured Inputs: ensures you send only valid prompt inputs to your model
    input do
-     const :sentence, String
+     const :sentence, String, description: 'The sentence to analyze'
    end
 
+   # Structured Outputs: your predictor validates the model's output too.
    output do
-     const :sentiment, Sentiment
-     const :confidence, Float
+     const :sentiment, Sentiment, description: 'The sentiment of the sentence'
+     const :confidence, Float, description: 'A number between 0.0 and 1.0'
    end
  end
 
- # Configure DSPy with your LLM
- DSPy.configure do |c|
-   c.lm = DSPy::LM.new('openai/gpt-4o-mini',
-                       api_key: ENV['OPENAI_API_KEY'],
-                       structured_outputs: true) # Enable OpenAI's native JSON mode
- end
-
- # Create the predictor and run inference
+ # Wire it to the simplest prompting technique - a Predict.
  classify = DSPy::Predict.new(Classify)
+ # It may raise an error if you mess up the inputs or your LLM messes up the outputs.
  result = classify.call(sentence: "This book was super fun to read!")
 
  puts result.sentiment # => #<Sentiment::Positive>
@@ -99,12 +148,22 @@ end
 
  ## What You Get
 
+ **Developer Experience:**
+ - LLM provider support using official Ruby clients:
+   - [OpenAI Ruby](https://github.com/openai/openai-ruby) with vision model support
+   - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby) with multimodal capabilities
+   - [Google Gemini API](https://ai.google.dev/) with native structured outputs
+   - [Ollama](https://ollama.com/) via OpenAI compatibility layer for local models
+ - **Multimodal Support** - Complete image analysis with DSPy::Image, type-safe bounding boxes, vision-capable models
+ - Runtime type checking with [Sorbet](https://sorbet.org/) including T::Enum and union types
+ - Type-safe tool definitions for ReAct agents
+ - Comprehensive instrumentation and observability
+
  **Core Building Blocks:**
  - **Signatures** - Define input/output schemas using Sorbet types with T::Enum and union type support
  - **Predict** - LLM completion with structured data extraction and multimodal support
  - **Chain of Thought** - Step-by-step reasoning for complex problems with automatic prompt optimization
  - **ReAct** - Tool-using agents with type-safe tool definitions and error recovery
- - **CodeAct** - Dynamic code execution agents for programming tasks
  - **Module Composition** - Combine multiple LLM calls into production-ready workflows
 
  **Optimization & Evaluation:**
@@ -122,24 +181,40 @@ end
  - **File-based Storage** - Optimization result persistence with versioning
  - **Structured Logging** - JSON and key=value formats with span tracking
 
- **Developer Experience:**
- - LLM provider support using official Ruby clients:
-   - [OpenAI Ruby](https://github.com/openai/openai-ruby) with vision model support
-   - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby) with multimodal capabilities
-   - [Google Gemini API](https://ai.google.dev/) with native structured outputs
-   - [Ollama](https://ollama.com/) via OpenAI compatibility layer for local models
- - **Multimodal Support** - Complete image analysis with DSPy::Image, type-safe bounding boxes, vision-capable models
- - Runtime type checking with [Sorbet](https://sorbet.org/) including T::Enum and union types
- - Type-safe tool definitions for ReAct agents
- - Comprehensive instrumentation and observability
+ ## Recent Achievements
 
- ## Development Status
+ DSPy.rb has rapidly evolved from experimental to production-ready:
 
- DSPy.rb is actively developed and approaching stability. The core framework is production-ready with
- comprehensive documentation, but I'm battle-testing features through the 0.x series before committing
- to a stable v1.0 API.
+ ### Foundation
+ - ✅ **JSON Parsing Reliability** - Native OpenAI structured outputs with adaptive retry logic and schema-aware fallbacks
+ - ✅ **Type-Safe Strategy Configuration** - Provider-optimized strategy selection and enum-backed optimizer presets
+ - ✅ **Core Module System** - Predict, ChainOfThought, ReAct with type safety (add `dspy-code_act` for Think-Code-Observe agents)
+ - ✅ **Production Observability** - OpenTelemetry, New Relic, and Langfuse integration
+ - ✅ **Advanced Optimization** - MIPROv2 with Bayesian optimization, Gaussian Processes, and multi-mode search
+
+ ### Recent Advances
+ - ✅ **MIPROv2 ADE Integrity (v0.29.1)** - Stratified train/val/test splits, honest precision accounting, and enum-driven `--auto` presets with integration coverage
+ - ✅ **Instruction Deduplication (v0.29.1)** - Candidate generation now filters repeated programs so optimization logs highlight unique strategies
+ - ✅ **GEPA Teleprompter (v0.29.0)** - Genetic-Pareto reflective prompt evolution with merge proposer scheduling, reflective mutation, and ADE demo parity
+ - ✅ **Optimizer Utilities Parity (v0.29.0)** - Bootstrap strategies, dataset summaries, and Layer 3 utilities unlock multi-predictor programs on Ruby
+ - ✅ **Observability Hardening (v0.29.0)** - OTLP exporter runs on a single-thread executor, preventing frozen SSL contexts without blocking spans
+ - ✅ **Documentation Refresh (v0.29.x)** - New GEPA guide plus ADE optimization docs covering presets, stratified splits, and error-handling defaults
+
+ **Current Focus Areas:**
+
+ ### Production Readiness
+ - 🚧 **Production Patterns** - Real-world usage validation and performance optimization
+ - 🚧 **Ruby Ecosystem Integration** - Rails integration, Sidekiq compatibility, deployment patterns
+
+ ### Community & Adoption
+ - 🚧 **Community Examples** - Real-world applications and case studies
+ - 🚧 **Contributor Experience** - Making it easier to contribute and extend
+ - 🚧 **Performance Benchmarks** - Comparative analysis vs other frameworks
+
+ **v1.0 Philosophy:**
+ v1.0 will be released after extensive production battle-testing, not after checking off features.
+ The API is already stable - v1.0 represents confidence in production reliability backed by real-world validation.
 
- Real-world usage feedback is invaluable - if you encounter issues or have suggestions, please open a GitHub issue!
 
  ## Documentation
 
@@ -156,92 +231,37 @@ For LLMs and AI assistants working with DSPy.rb:
  - **[Quick Start Guide](docs/src/getting-started/quick-start.md)** - Your first DSPy programs
  - **[Core Concepts](docs/src/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
 
- ### Core Features
+ ### Prompt Engineering
  - **[Signatures & Types](docs/src/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
  - **[Predictors](docs/src/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
  - **[Modules & Pipelines](docs/src/core-concepts/modules.md)** - Compose complex multi-stage workflows
  - **[Multimodal Support](docs/src/core-concepts/multimodal.md)** - Image analysis with vision-capable models
  - **[Examples & Validation](docs/src/core-concepts/examples.md)** - Type-safe training data
+ - **[Rich Types](docs/src/advanced/complex-types.md)** - Sorbet type integration with automatic coercion for structs, enums, and arrays
+ - **[Composable Pipelines](docs/src/advanced/pipelines.md)** - Manual module composition patterns
 
- ### Optimization
+ ### Prompt Optimization
  - **[Evaluation Framework](docs/src/optimization/evaluation.md)** - Advanced metrics beyond simple accuracy
  - **[Prompt Optimization](docs/src/optimization/prompt-optimization.md)** - Manipulate prompts as objects
  - **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Advanced Bayesian optimization with Gaussian Processes
  - **[GEPA Optimizer](docs/src/optimization/gepa.md)** *(beta)* - Reflective mutation with optional reflection LMs
 
- ### Production Features
- - **[Storage System](docs/src/production/storage.md)** - Persistence and optimization result storage
- - **[Observability](docs/src/production/observability.md)** - Zero-config Langfuse integration with a dedicated export worker that never blocks your LLMs
-
- ### Advanced Usage
- - **[Complex Types](docs/src/advanced/complex-types.md)** - Sorbet type integration with automatic coercion for structs, enums, and arrays
- - **[Manual Pipelines](docs/src/advanced/pipelines.md)** - Manual module composition patterns
+ ### Context Engineering
+ - **[Tools](docs/src/core-concepts/toolsets.md)** - Tool-wielding agents
+ - **[Agentic Memory](docs/src/core-concepts/memory.md)** - Memory Tools & Agentic Loops
  - **[RAG Patterns](docs/src/advanced/rag.md)** - Manual RAG implementation with external services
- - **[Custom Metrics](docs/src/advanced/custom-metrics.md)** - Proc-based evaluation logic
-
- ## Quick Start
-
- ### Installation
-
- Add to your Gemfile:
-
- ```ruby
- gem 'dspy'
- ```
-
- Then run:
-
- ```bash
- bundle install
- ```
-
- ## Recent Achievements
-
- DSPy.rb has rapidly evolved from experimental to production-ready:
-
- ### Foundation
- - ✅ **JSON Parsing Reliability** - Native OpenAI structured outputs, strategy selection, retry logic
- - ✅ **Type-Safe Strategy Configuration** - Provider-optimized automatic strategy selection
- - ✅ **Core Module System** - Predict, ChainOfThought, ReAct, CodeAct with type safety
- - ✅ **Production Observability** - OpenTelemetry, New Relic, and Langfuse integration
- - ✅ **Advanced Optimization** - MIPROv2 with Bayesian optimization, Gaussian Processes, and multiple strategies
 
- ### Recent Advances
- - **Enhanced Langfuse Integration (v0.25.0)** - Comprehensive OpenTelemetry span reporting with proper input/output, hierarchical nesting, accurate timing, and observation types
- - **Comprehensive Multimodal Framework** - Complete image analysis with `DSPy::Image`, type-safe bounding boxes, vision model integration
- - **Advanced Type System** - `T::Enum` integration, union types for agentic workflows, complex type coercion
- - ✅ **Production-Ready Evaluation** - Multi-factor metrics beyond accuracy, error-resilient evaluation pipelines
- - ✅ **Documentation Ecosystem** - `llms.txt` for AI assistants, ADRs, blog articles, comprehensive examples
- - ✅ **API Maturation** - Simplified idiomatic patterns, better error handling, production-proven designs
+ ### Production Features
+ - **[Observability](docs/src/production/observability.md)** - Zero-config Langfuse integration with a dedicated export worker that never blocks your LLMs
+ - **[Storage System](docs/src/production/storage.md)** - Persistence and optimization result storage
+ - **[Custom Metrics](docs/src/advanced/custom-metrics.md)** - Proc-based evaluation logic
 
- ## Roadmap - Production Battle-Testing Toward v1.0
 
- DSPy.rb has transitioned from **feature building** to **production validation**. The core framework is
- feature-complete and stable - now I'm focusing on real-world usage patterns, performance optimization,
- and ecosystem integration.
 
- **Current Focus Areas:**
 
- ### Production Readiness
- - 🚧 **Production Patterns** - Real-world usage validation and performance optimization
- - 🚧 **Ruby Ecosystem Integration** - Rails integration, Sidekiq compatibility, deployment patterns
- - 🚧 **Scale Testing** - High-volume usage, memory management, connection pooling
- - 🚧 **Error Recovery** - Robust failure handling patterns for production environments
 
- ### Ecosystem Expansion
- - 🚧 **Model Context Protocol (MCP)** - Integration with MCP ecosystem
- - 🚧 **Additional Provider Support** - Azure OpenAI, local models beyond Ollama
- - 🚧 **Tool Ecosystem** - Expanded tool integrations for ReAct agents
 
- ### Community & Adoption
- - 🚧 **Community Examples** - Real-world applications and case studies
- - 🚧 **Contributor Experience** - Making it easier to contribute and extend
- - 🚧 **Performance Benchmarks** - Comparative analysis vs other frameworks
 
- **v1.0 Philosophy:**
- v1.0 will be released after extensive production battle-testing, not after checking off features.
- The API is already stable - v1.0 represents confidence in production reliability backed by real-world validation.
 
  ## License
-
  This project is licensed under the MIT License.
data/lib/dspy/datasets/hotpot_qa.rb ADDED
@@ -0,0 +1,182 @@
+ # frozen_string_literal: true
+
+ require 'set'
+ require_relative 'info'
+ require_relative 'loaders'
+
+ module DSPy
+   module Datasets
+     # Ruby implementation of the HotPotQA dataset loader backed by Hugging Face parquet files.
+     # Provides convenience helpers to create train/dev/test splits matching the Python DSPy defaults.
+     class HotPotQA
+       DATASET_INFO = DatasetInfo.new(
+         id: 'hotpotqa/hotpot_qa/fullwiki',
+         name: 'HotPotQA (FullWiki)',
+         provider: 'huggingface',
+         splits: %w[train validation],
+         features: {
+           'id' => { 'type' => 'string' },
+           'question' => { 'type' => 'string' },
+           'answer' => { 'type' => 'string' },
+           'level' => { 'type' => 'string' },
+           'type' => { 'type' => 'string' },
+           'supporting_facts' => { 'type' => 'list' },
+           'context' => { 'type' => 'list' }
+         },
+         loader: :huggingface_parquet,
+         loader_options: {
+           dataset: ['hotpotqa/hotpot_qa', 'hotpot_qa'],
+           config: 'fullwiki'
+         },
+         metadata: {
+           description: 'HotPotQA FullWiki split filtered to hard examples. Train split is further divided into train/dev (75/25) matching Python DSPy defaults. Supports dataset rename on Hugging Face.',
+           homepage: 'https://huggingface.co/datasets/hotpot_qa',
+           approx_row_count: 112_000
+         }
+       ).freeze
+
+       DEFAULT_KEEP_DETAILS = :dev_titles
+
+       attr_reader :train_size, :dev_size, :test_size
+
+       def initialize(
+         only_hard_examples: true,
+         keep_details: DEFAULT_KEEP_DETAILS,
+         unofficial_dev: true,
+         train_seed: 0,
+         train_size: nil,
+         dev_size: nil,
+         test_size: nil,
+         cache_dir: nil
+       )
+         raise ArgumentError, 'only_hard_examples must be true' unless only_hard_examples
+
+         @keep_details = keep_details
+         @unofficial_dev = unofficial_dev
+         @train_seed = train_seed
+         @train_size = train_size
+         @dev_size = dev_size
+         @test_size = test_size
+         @cache_dir = cache_dir
+         @loaded = false
+       end
+
+       def train
+         ensure_loaded
+         subset(@train_examples, train_size)
+       end
+
+       def dev
+         ensure_loaded
+         subset(@dev_examples, dev_size)
+       end
+
+       def test
+         ensure_loaded
+         subset(@test_examples, test_size)
+       end
+
+       def context_lookup
+         ensure_loaded
+         @context_lookup ||= begin
+           all_examples = @train_examples + @dev_examples + @test_examples
+           all_examples.each_with_object({}) do |example, memo|
+             memo[example[:question]] = example[:context] || []
+           end
+         end
+       end
+
+       private
+
+       attr_reader :keep_details, :unofficial_dev, :train_seed, :cache_dir
+
+       def ensure_loaded
+         return if @loaded
+
+         load_data
+         @loaded = true
+       end
+
+       def subset(examples, limit)
+         return examples unless limit
+
+         examples.first(limit)
+       end
+
+       def load_data
+         train_rows = collect_rows(split: 'train')
+         shuffled = train_rows.shuffle(random: Random.new(train_seed))
+         split_point = (shuffled.length * 0.75).floor
+
+         @train_examples = shuffled.first(split_point)
+         @dev_examples = unofficial_dev ? shuffled.drop(split_point) : []
+
+         if keep_details == DEFAULT_KEEP_DETAILS
+           @train_examples.each { |example| example.delete(:gold_titles) }
+         end
+
+         @test_examples = collect_rows(split: 'validation')
+       end
+
+       def collect_rows(split:)
+         loader = Loaders.build(DATASET_INFO, split: split, cache_dir: cache_dir)
+         examples = []
+
+         loader.each_row do |row|
+           next unless row['level'] == 'hard'
+
+           examples << transform_row(row)
+         end
+
+         examples
+       end
+
+       def transform_row(row)
+         example = {
+           id: row['id'],
+           question: row['question'],
+           answer: row['answer'],
+           type: row['type'],
+           context: normalize_context(row['context']),
+           gold_titles: extract_gold_titles(row['supporting_facts'])
+         }
+
+         example.delete(:context) unless example[:context]&.any?
+         example.delete(:gold_titles) if example[:gold_titles].empty?
+         example
+       end
+
+       def normalize_context(raw_context)
+         return [] unless raw_context.respond_to?(:map)
+
+         raw_context.map do |pair|
+           if pair.is_a?(Array) && pair.size == 2
+             title, sentences = pair
+             sentences_text = if sentences.is_a?(Array)
+                                sentences.join(' ')
+                              else
+                                sentences.to_s
+                              end
+             "#{title}: #{sentences_text}".strip
+           else
+             pair.to_s
+           end
+         end
+       end
+
+       def extract_gold_titles(supporting_facts)
+         case supporting_facts
+         when Hash
+           titles = supporting_facts['title'] || supporting_facts[:title]
+           Array(titles).to_set
+         when Array
+           supporting_facts.each_with_object(Set.new) do |fact, memo|
+             memo << (fact.is_a?(Array) ? fact[0] : fact)
+           end
+         else
+           Set.new
+         end
+       end
+     end
+   end
+ end
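For orientation, here is a minimal usage sketch of the loader added above (editor's illustration, not part of this diff; it assumes `dspy-datasets` is installed and the Hugging Face parquet endpoint is reachable, and the split sizes are arbitrary):

```ruby
require 'dspy/datasets'

# Downloads the FullWiki parquet files, keeps only 'hard' rows, and splits
# the train rows 75/25 into train/dev, mirroring the Python DSPy defaults.
hotpot = DSPy::Datasets::HotPotQA.new(train_size: 16, dev_size: 8, test_size: 8)

hotpot.train.first  # => { id: ..., question: ..., answer: ..., type: ... }

# Map a question back to its supporting context passages.
hotpot.context_lookup[hotpot.train.first[:question]]
```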
@@ -52,24 +52,20 @@ module DSPy
 
      def parquet_files
        @parquet_files ||= begin
-         uri = URI("#{BASE_URL}/parquet")
-         params = {
-           dataset: info.loader_options.fetch(:dataset),
-           config: info.loader_options.fetch(:config),
-           split: split
-         }
-         uri.query = URI.encode_www_form(params)
-
-         response = http_get(uri)
-         unless response.is_a?(Net::HTTPSuccess)
-           raise DatasetError, "Failed to fetch parquet manifest: #{response.code}"
+         datasets = Array(info.loader_options.fetch(:dataset))
+         last_error = nil
+
+         datasets.each do |dataset_name|
+           begin
+             files = fetch_parquet_files(dataset_name)
+             return files unless files.empty?
+             last_error = DatasetError.new("No parquet files available for #{dataset_name} (#{split})")
+           rescue DatasetError => e
+             last_error = e
+           end
          end
 
-         body = JSON.parse(response.body)
-         files = body.fetch('parquet_files', [])
-         raise DatasetError, "No parquet files available for #{info.id} (#{split})" if files.empty?
-
-         files
+         raise(last_error || DatasetError.new("Failed to fetch parquet manifest for #{info.id} (#{split})"))
        end
      end
 
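The rewritten `parquet_files` walks each candidate dataset name in order (the dataset was renamed on the Hub, so both the new and the legacy name are listed) and remembers the most recent failure so the final error stays informative. The same pattern in isolation (a standalone sketch; the names and the `fetch` callable are hypothetical):

```ruby
# Try candidates in preference order; re-raise the most recent error if all fail.
def first_nonempty_manifest(candidates, fetch)
  last_error = nil
  candidates.each do |name|
    files = fetch.call(name)      # assumed to raise StandardError on HTTP failure
    return files unless files.empty?
    last_error = StandardError.new("no parquet files for #{name}")
  rescue StandardError => e
    last_error = e                # remember why this candidate failed
  end
  raise(last_error || StandardError.new('no candidate dataset name succeeded'))
end

# e.g. first_nonempty_manifest(['hotpotqa/hotpot_qa', 'hotpot_qa'], fetch)
```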
@@ -82,6 +78,24 @@ module DSPy
        path
      end
 
+     def fetch_parquet_files(dataset_name)
+       uri = URI("#{BASE_URL}/parquet")
+       params = {
+         dataset: dataset_name,
+         config: info.loader_options.fetch(:config),
+         split: split
+       }
+       uri.query = URI.encode_www_form(params)
+
+       response = http_get(uri)
+       unless response.is_a?(Net::HTTPSuccess)
+         raise DatasetError, "Failed to fetch parquet manifest: #{response.code}"
+       end
+
+       body = JSON.parse(response.body)
+       body.fetch('parquet_files', [])
+     end
+
      def cache_dir
        @cache_dir ||= File.join(cache_root, split)
      end
@@ -102,31 +116,50 @@ module DSPy
      end
 
      def http_get(uri)
-       Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
-         request = Net::HTTP::Get.new(uri)
-         http.request(request)
-       end
+       perform_request_with_redirects(uri)
      end
 
      def download_file(url, destination)
-       uri = URI(url)
+       fetch_with_redirects(URI(url)) do |response|
+         File.binwrite(destination, response.body)
+       end
+     rescue => e
+       File.delete(destination) if File.exist?(destination)
+       raise
+     end
+
+     MAX_REDIRECTS = 5
+
+     def perform_request_with_redirects(uri, limit = MAX_REDIRECTS)
+       raise DownloadError, 'Too many HTTP redirects' if limit <= 0
+
        Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
          request = Net::HTTP::Get.new(uri)
-         http.request(request) do |response|
-           unless response.is_a?(Net::HTTPSuccess)
-             raise DownloadError, "Failed to download parquet file: #{response.code}"
-           end
+         response = http.request(request)
 
-           File.open(destination, 'wb') do |file|
-             response.read_body do |chunk|
-               file.write(chunk)
-             end
-           end
+         if response.is_a?(Net::HTTPRedirection)
+           location = response['location']
+           raise DownloadError, 'Redirect without location header' unless location
+
+           new_uri = URI(location)
+           new_uri = uri + location if new_uri.relative?
+           return perform_request_with_redirects(new_uri, limit - 1)
          end
+
+         response
        end
-     rescue => e
-       File.delete(destination) if File.exist?(destination)
-       raise
+     end
+
+     def fetch_with_redirects(uri, limit = MAX_REDIRECTS, &block)
+       response = perform_request_with_redirects(uri, limit)
+
+       unless response.is_a?(Net::HTTPSuccess)
+         message = response ? "Failed to download parquet file: #{response.code}" : 'Failed to download parquet file'
+         raise DownloadError, message
+       end
+
+       return yield response if block_given?
+       response
      end
    end
  end
@@ -28,6 +28,28 @@ module DSPy
          homepage: 'https://huggingface.co/datasets/ade-benchmark-corpus/ade_corpus_v2',
          approx_row_count: 23516
        }
+     ),
+     DatasetInfo.new(
+       id: 'hotpot_qa/fullwiki',
+       name: 'HotPotQA (FullWiki)',
+       provider: 'huggingface',
+       splits: %w[train validation],
+       features: {
+         'id' => { 'type' => 'string' },
+         'question' => { 'type' => 'string' },
+         'answer' => { 'type' => 'string' },
+         'level' => { 'type' => 'string' }
+       },
+       loader: :huggingface_parquet,
+       loader_options: {
+         dataset: 'hotpot_qa',
+         config: 'fullwiki'
+       },
+       metadata: {
+         description: 'HotPotQA FullWiki configuration. The DSPy::Datasets::HotPotQA helper further filters to hard examples and produces train/dev/test splits.',
+         homepage: 'https://huggingface.co/datasets/hotpot_qa',
+         approx_row_count: 112_000
+       }
      )
    ].freeze
  end
data/lib/dspy/datasets/version.rb CHANGED
@@ -2,6 +2,6 @@
 
  module DSPy
    module Datasets
-     VERSION = DSPy::VERSION
+     VERSION = '1.0.0'
    end
  end
data/lib/dspy/datasets.rb CHANGED
@@ -7,6 +7,7 @@ require_relative 'datasets/manifest'
  require_relative 'datasets/loaders'
  require_relative 'datasets/hugging_face/api'
  require_relative 'datasets/ade'
+ require_relative 'datasets/hotpot_qa'
 
  module DSPy
    module Datasets
metadata CHANGED
@@ -1,13 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: dspy-datasets
  version: !ruby/object:Gem::Version
-   version: 0.29.1
+   version: 1.0.0
  platform: ruby
  authors:
  - Vicente Reig Rincón de Arellano
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2025-10-20 00:00:00.000000000 Z
+ date: 2025-10-25 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: dspy
@@ -15,14 +16,14 @@ dependencies:
    requirements:
    - - '='
      - !ruby/object:Gem::Version
-       version: 0.29.1
+       version: 0.30.0
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - '='
        - !ruby/object:Gem::Version
-         version: 0.29.1
+         version: 0.30.0
  - !ruby/object:Gem::Dependency
    name: red-parquet
    requirement: !ruby/object:Gem::Requirement
@@ -51,6 +52,7 @@ files:
  - lib/dspy/datasets/ade.rb
  - lib/dspy/datasets/dataset.rb
  - lib/dspy/datasets/errors.rb
+ - lib/dspy/datasets/hotpot_qa.rb
  - lib/dspy/datasets/hugging_face/api.rb
  - lib/dspy/datasets/info.rb
  - lib/dspy/datasets/loaders.rb
@@ -62,6 +64,7 @@ licenses:
  - MIT
  metadata:
    github_repo: git@github.com:vicentereig/dspy.rb
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -76,7 +79,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
    version: '0'
  requirements: []
- rubygems_version: 3.6.5
+ rubygems_version: 3.0.3.1
+ signing_key:
  specification_version: 4
  summary: Curated datasets and loaders for DSPy.rb.
  test_files: []