dspy-schema 1.0.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: bf1a5bda9480f0a5fd8e00f80949debb6bcd99e0ee014e767dc5ed7ac02440f1
+   data.tar.gz: a7c318b77fef71636125958148203960d9737897f1eed656471fbe03dd43c406
+ SHA512:
+   metadata.gz: 8bf44b28876109783dc9570919f2a99f1a48a3b96823aec379f2a10b48ea84d281be226afb9a450a119a81f1721e9e37a990b5edf38b0aadc643f2cf94f49915
+   data.tar.gz: 3a34993c5c3252c63642a6892eed12e172d139dc834e53d04e7b79424e21f5f0d3ae245ef479dd6b013adec699f4d206daccf29864837db99231a853afc7f9c6
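The digests above can be re-checked locally once `metadata.gz` and `data.tar.gz` have been extracted from the `.gem` archive. The following is a minimal sketch under that assumption: `checksums_match?` is a hypothetical helper name, and the demo hashes a throwaway temp file because the real archive members are not part of this page.

```ruby
require 'digest'
require 'tempfile'
require 'yaml'

# Hypothetical helper: compare a file's SHA256 and SHA512 digests with the
# values recorded for `entry` in a checksums.yaml-style hash.
def checksums_match?(path, checksums, entry)
  data = File.binread(path)
  checksums.fetch("SHA256").fetch(entry) == Digest::SHA256.hexdigest(data) &&
    checksums.fetch("SHA512").fetch(entry) == Digest::SHA512.hexdigest(data)
end

payload = "example payload" # stand-in for the real archive bytes

# Build a checksums document with the same shape as the one in this diff.
checksums = YAML.safe_load(<<~YAML)
  SHA256:
    data.tar.gz: "#{Digest::SHA256.hexdigest(payload)}"
  SHA512:
    data.tar.gz: "#{Digest::SHA512.hexdigest(payload)}"
YAML

result = Tempfile.create("data.tar.gz") do |file|
  file.binmode
  file.write(payload)
  file.flush
  checksums_match?(file.path, checksums, "data.tar.gz")
end

puts result
```

Checking both digests mirrors the two sections of `checksums.yaml`; a mismatch in either one makes the helper return `false`.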
data/LICENSE ADDED
@@ -0,0 +1,45 @@
+ MIT License
+
+ Copyright (c) 2025 Vicente Services SL
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ This project is a Ruby port of the original Python [DSPy library](https://github.com/stanfordnlp/dspy), which is licensed under the MIT License:
+
+ MIT License
+
+ Copyright (c) 2023 Stanford Future Data Systems
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,264 @@
+ # DSPy.rb
+
+ [![Gem Version](https://img.shields.io/gem/v/dspy)](https://rubygems.org/gems/dspy)
+ [![Total Downloads](https://img.shields.io/gem/dt/dspy)](https://rubygems.org/gems/dspy)
+ [![Build Status](https://img.shields.io/github/actions/workflow/status/vicentereig/dspy.rb/ruby.yml?branch=main&label=build)](https://github.com/vicentereig/dspy.rb/actions/workflows/ruby.yml)
+ [![Documentation](https://img.shields.io/badge/docs-vicentereig.github.io%2Fdspy.rb-blue)](https://vicentereig.github.io/dspy.rb/)
+
+ > [!NOTE]
+ > The core Prompt Engineering Framework is production-ready with
+ > comprehensive documentation. I am now focusing on educational content about systematic Prompt Optimization and Context Engineering.
+ > Your feedback is invaluable. If you encounter issues, please open an [issue](https://github.com/vicentereig/dspy.rb/issues). If you have suggestions, open a [new thread](https://github.com/vicentereig/dspy.rb/discussions).
+ >
+ > If you want to contribute, feel free to reach out to me to coordinate efforts: hey at vicente.services
+ >
+ > And, yes, this is 100% a legit project. :)
+
+
+ **Build reliable LLM applications in idiomatic Ruby using composable, type-safe modules.**
+
+ The Ruby framework for programming with large language models. DSPy.rb brings structured LLM programming, programmatic Prompt Engineering, and Context Engineering to Ruby developers.
+ Instead of wrestling with prompt strings and parsing responses, you define typed signatures using idiomatic Ruby to compose and decompose AI Workflows and AI Agents.
+
+ **Prompts are just functions.** Traditional prompting is like writing code with string concatenation: it works until it doesn't. DSPy.rb brings you
+ the programming approach pioneered by [dspy.ai](https://dspy.ai/): instead of crafting fragile prompts, you define modular
+ signatures and let the framework handle the messy details.
+
+ DSPy.rb is an idiomatic Ruby port of Stanford's [DSPy framework](https://github.com/stanfordnlp/dspy). While implementing
+ the core concepts of signatures, predictors, and the main optimization algorithms from the original Python library, DSPy.rb embraces Ruby
+ conventions and adds Ruby-specific innovations like a Sorbet-based type system, ReAct loops, and production-ready integrations like non-blocking OpenTelemetry instrumentation.
+
+ **What do you get?** Ruby LLM applications that actually scale and don't break when you sneeze.
+
+ Check the [examples](examples/) and take them for a spin!
+
+ ## Your First DSPy Program
+ ### Installation
+
+ Add to your Gemfile:
+
+ ```ruby
+ gem 'dspy'
+ ```
+
+ and
+
+ ```bash
+ bundle install
+ ```
+
+ ### Optional Sibling Gems
+
+ DSPy.rb ships multiple gems from this monorepo so you only install what you need. Add these alongside `dspy`:
+
+ | Gem | Description |
+ | --- | --- |
+ | `dspy-schema` | Exposes `DSPy::TypeSystem::SorbetJsonSchema` so other projects (e.g., exa-ruby) can convert Sorbet types to JSON Schema without pulling the full DSPy stack. |
+ | `dspy-code_act` | Think-Code-Observe agents that can synthesize and execute Ruby code safely. |
+ | `dspy-datasets` | Dataset helpers plus Parquet/Polars tooling for richer evaluation corpora. |
+ | `dspy-evals` | High-throughput evaluation harness with metrics, callbacks, and regression fixtures. |
+ | `dspy-miprov2` | Bayesian optimization + Gaussian Process backend for the MIPROv2 teleprompter. |
+ | `gepa` | GEPA optimizer core (Pareto engine, telemetry, reflective proposer) shared with `dspy-gepa`. |
+
+ Set the matching `DSPY_WITH_*` environment variables (see `Gemfile`) to include or exclude each sibling gem when running Bundler locally.
+ ### Your First Reliable Predictor
+
+ ```ruby
+
+ # Configure DSPy globally to use your fave LLM - you can override this at the instance level.
+ DSPy.configure do |c|
+   c.lm = DSPy::LM.new('openai/gpt-4o-mini',
+                       api_key: ENV['OPENAI_API_KEY'],
+                       structured_outputs: true) # Enable OpenAI's native JSON mode
+ end
+
+ # Define a signature for sentiment classification - instead of writing a full prompt!
+ class Classify < DSPy::Signature
+   description "Classify sentiment of a given sentence." # sets the goal of the underlying prompt
+
+   class Sentiment < T::Enum
+     enums do
+       Positive = new('positive')
+       Negative = new('negative')
+       Neutral = new('neutral')
+     end
+   end
+
+   # Structured Inputs: makes sure you are sending only valid prompt inputs to your model
+   input do
+     const :sentence, String, description: 'The sentence to analyze'
+   end
+
+   # Structured Outputs: your predictor will validate the output of the model too.
+   output do
+     const :sentiment, Sentiment, description: 'The sentiment of the sentence'
+     const :confidence, Float, description: 'A number between 0.0 and 1.0'
+   end
+ end
+
+ # Wire it to the simplest prompting technique - a Predict.
+ classify = DSPy::Predict.new(Classify)
+ # It may raise an error if you mess up the inputs or your LLM messes up the outputs.
+ result = classify.call(sentence: "This book was super fun to read!")
+
+ puts result.sentiment # => #<Sentiment::Positive>
+ puts result.confidence # => 0.85
+ ```
+
+ ### Access to 200+ Models Across 5 Providers
+
+ DSPy.rb provides unified access to major LLM providers with provider-specific optimizations:
+
+ ```ruby
+ # OpenAI (GPT-4, GPT-4o, GPT-4o-mini, GPT-5, etc.)
+ DSPy.configure do |c|
+   c.lm = DSPy::LM.new('openai/gpt-4o-mini',
+                       api_key: ENV['OPENAI_API_KEY'],
+                       structured_outputs: true) # Native JSON mode
+ end
+
+ # Google Gemini (Gemini 1.5 Pro, Flash, Gemini 2.0, etc.)
+ DSPy.configure do |c|
+   c.lm = DSPy::LM.new('gemini/gemini-2.5-flash',
+                       api_key: ENV['GEMINI_API_KEY'],
+                       structured_outputs: true) # Native structured outputs
+ end
+
+ # Anthropic Claude (Claude 3.5, Claude 4, etc.)
+ DSPy.configure do |c|
+   c.lm = DSPy::LM.new('anthropic/claude-sonnet-4-5-20250929',
+                       api_key: ENV['ANTHROPIC_API_KEY'],
+                       structured_outputs: true) # Tool-based extraction (default)
+ end
+
+ # Ollama - Run any local model (Llama, Mistral, Gemma, etc.)
+ DSPy.configure do |c|
+   c.lm = DSPy::LM.new('ollama/llama3.2') # Free, runs locally, no API key needed
+ end
+
+ # OpenRouter - Access to 200+ models from multiple providers
+ DSPy.configure do |c|
+   c.lm = DSPy::LM.new('openrouter/deepseek/deepseek-chat-v3.1:free',
+                       api_key: ENV['OPENROUTER_API_KEY'])
+ end
+ ```
+
+ ## What You Get
+
+ **Developer Experience:**
+ - LLM provider support using official Ruby clients:
+   - [OpenAI Ruby](https://github.com/openai/openai-ruby) with vision model support
+   - [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby) with multimodal capabilities
+   - [Google Gemini API](https://ai.google.dev/) with native structured outputs
+   - [Ollama](https://ollama.com/) via OpenAI compatibility layer for local models
+ - **Multimodal Support** - Complete image analysis with DSPy::Image, type-safe bounding boxes, vision-capable models
+ - Runtime type checking with [Sorbet](https://sorbet.org/) including T::Enum and union types
+ - Type-safe tool definitions for ReAct agents
+ - Comprehensive instrumentation and observability
+
+ **Core Building Blocks:**
+ - **Signatures** - Define input/output schemas using Sorbet types with T::Enum and union type support
+ - **Predict** - LLM completion with structured data extraction and multimodal support
+ - **Chain of Thought** - Step-by-step reasoning for complex problems with automatic prompt optimization
+ - **ReAct** - Tool-using agents with type-safe tool definitions and error recovery
+ - **Module Composition** - Combine multiple LLM calls into production-ready workflows
+
+ **Optimization & Evaluation:**
+ - **Prompt Objects** - Manipulate prompts as first-class objects instead of strings
+ - **Typed Examples** - Type-safe training data with automatic validation
+ - **Evaluation Framework** - Advanced metrics beyond simple accuracy with error-resilient pipelines
+ - **MIPROv2 Optimization** - Advanced Bayesian optimization with Gaussian Processes, multiple optimization strategies, auto-config presets, and storage persistence
+
+ **Production Features:**
+ - **Reliable JSON Extraction** - Native structured outputs for OpenAI and Gemini, Anthropic tool-based extraction, and automatic strategy selection with fallback
+ - **Type-Safe Configuration** - Strategy enums with automatic provider optimization (Strict/Compatible modes)
+ - **Smart Retry Logic** - Progressive fallback with exponential backoff for handling transient failures
+ - **Zero-Config Langfuse Integration** - Set env vars and get automatic OpenTelemetry traces in Langfuse
+ - **Performance Caching** - Schema and capability caching for faster repeated operations
+ - **File-based Storage** - Optimization result persistence with versioning
+ - **Structured Logging** - JSON and key=value formats with span tracking
+
+ ## Recent Achievements
+
+ DSPy.rb has rapidly evolved from experimental to production-ready:
+
+ ### Foundation
+ - ✅ **JSON Parsing Reliability** - Native OpenAI structured outputs with adaptive retry logic and schema-aware fallbacks
+ - ✅ **Type-Safe Strategy Configuration** - Provider-optimized strategy selection and enum-backed optimizer presets
+ - ✅ **Core Module System** - Predict, ChainOfThought, ReAct with type safety (add `dspy-code_act` for Think-Code-Observe agents)
+ - ✅ **Production Observability** - OpenTelemetry, New Relic, and Langfuse integration
+ - ✅ **Advanced Optimization** - MIPROv2 with Bayesian optimization, Gaussian Processes, and multi-mode search
+
+ ### Recent Advances
+ - ✅ **MIPROv2 ADE Integrity (v0.29.1)** - Stratified train/val/test splits, honest precision accounting, and enum-driven `--auto` presets with integration coverage
+ - ✅ **Instruction Deduplication (v0.29.1)** - Candidate generation now filters repeated programs so optimization logs highlight unique strategies
+ - ✅ **GEPA Teleprompter (v0.29.0)** - Genetic-Pareto reflective prompt evolution with merge proposer scheduling, reflective mutation, and ADE demo parity
+ - ✅ **Optimizer Utilities Parity (v0.29.0)** - Bootstrap strategies, dataset summaries, and Layer 3 utilities unlock multi-predictor programs on Ruby
+ - ✅ **Observability Hardening (v0.29.0)** - OTLP exporter runs on a single-thread executor preventing frozen SSL contexts without blocking spans
+ - ✅ **Documentation Refresh (v0.29.x)** - New GEPA guide plus ADE optimization docs covering presets, stratified splits, and error-handling defaults
+
+ **Current Focus Areas:**
+
+ ### Production Readiness
+ - 🚧 **Production Patterns** - Real-world usage validation and performance optimization
+ - 🚧 **Ruby Ecosystem Integration** - Rails integration, Sidekiq compatibility, deployment patterns
+
+ ### Community & Adoption
+ - 🚧 **Community Examples** - Real-world applications and case studies
+ - 🚧 **Contributor Experience** - Making it easier to contribute and extend
+ - 🚧 **Performance Benchmarks** - Comparative analysis vs other frameworks
+
+ **v1.0 Philosophy:**
+ v1.0 will be released after extensive production battle-testing, not after checking off features.
+ The API is already stable - v1.0 represents confidence in production reliability backed by real-world validation.
+
+
+ ## Documentation
+
+ 📖 **[Complete Documentation Website](https://vicentereig.github.io/dspy.rb/)**
+
+ ### LLM-Friendly Documentation
+
+ For LLMs and AI assistants working with DSPy.rb:
+ - **[llms.txt](https://vicentereig.github.io/dspy.rb/llms.txt)** - Concise reference optimized for LLMs
+ - **[llms-full.txt](https://vicentereig.github.io/dspy.rb/llms-full.txt)** - Comprehensive API documentation
+
+ ### Getting Started
+ - **[Installation & Setup](docs/src/getting-started/installation.md)** - Detailed installation and configuration
+ - **[Quick Start Guide](docs/src/getting-started/quick-start.md)** - Your first DSPy programs
+ - **[Core Concepts](docs/src/getting-started/core-concepts.md)** - Understanding signatures, predictors, and modules
+
+ ### Prompt Engineering
+ - **[Signatures & Types](docs/src/core-concepts/signatures.md)** - Define typed interfaces for LLM operations
+ - **[Predictors](docs/src/core-concepts/predictors.md)** - Predict, ChainOfThought, ReAct, and more
+ - **[Modules & Pipelines](docs/src/core-concepts/modules.md)** - Compose complex multi-stage workflows
+ - **[Multimodal Support](docs/src/core-concepts/multimodal.md)** - Image analysis with vision-capable models
+ - **[Examples & Validation](docs/src/core-concepts/examples.md)** - Type-safe training data
+ - **[Rich Types](docs/src/advanced/complex-types.md)** - Sorbet type integration with automatic coercion for structs, enums, and arrays
+ - **[Composable Pipelines](docs/src/advanced/pipelines.md)** - Manual module composition patterns
+
+ ### Prompt Optimization
+ - **[Evaluation Framework](docs/src/optimization/evaluation.md)** - Advanced metrics beyond simple accuracy
+ - **[Prompt Optimization](docs/src/optimization/prompt-optimization.md)** - Manipulate prompts as objects
+ - **[MIPROv2 Optimizer](docs/src/optimization/miprov2.md)** - Advanced Bayesian optimization with Gaussian Processes
+ - **[GEPA Optimizer](docs/src/optimization/gepa.md)** *(beta)* - Reflective mutation with optional reflection LMs
+
+ ### Context Engineering
+ - **[Tools](docs/src/core-concepts/toolsets.md)** - Tool-wielding agents
+ - **[Agentic Memory](docs/src/core-concepts/memory.md)** - Memory Tools & Agentic Loops
+ - **[RAG Patterns](docs/src/advanced/rag.md)** - Manual RAG implementation with external services
+
+ ### Production Features
+ - **[Observability](docs/src/production/observability.md)** - Zero-config Langfuse integration with a dedicated export worker that never blocks your LLMs
+ - **[Storage System](docs/src/production/storage.md)** - Persistence and optimization result storage
+ - **[Custom Metrics](docs/src/advanced/custom-metrics.md)** - Proc-based evaluation logic
+
+
+
+
+
+
+
+
+ ## License
+ This project is licensed under the MIT License.
data/lib/dspy/schema/sorbet_json_schema.rb ADDED
@@ -0,0 +1,302 @@
+ # typed: strict
+ # frozen_string_literal: true
+
+ require 'set'
+ require 'sorbet-runtime'
+
+ module DSPy
+   module TypeSystem
+     # Unified module for converting Sorbet types to JSON Schema
+     # Extracted from Signature class to ensure consistency across Tools, Toolsets, and Signatures
+     module SorbetJsonSchema
+       extend T::Sig
+       extend T::Helpers
+
+       # Convert a Sorbet type to JSON Schema format
+       sig { params(type: T.untyped, visited: T.nilable(T::Set[T.untyped])).returns(T::Hash[Symbol, T.untyped]) }
+       def self.type_to_json_schema(type, visited = nil)
+         visited ||= Set.new
+
+         # Handle T::Boolean type alias first
+         if type == T::Boolean
+           return { type: "boolean" }
+         end
+
+         # Handle type aliases by resolving to their underlying type
+         if type.is_a?(T::Private::Types::TypeAlias)
+           return self.type_to_json_schema(type.aliased_type, visited)
+         end
+
+         # Handle raw class types first
+         if type.is_a?(Class)
+           if type < T::Enum
+             # Get all enum values
+             values = type.values.map(&:serialize)
+             { type: "string", enum: values }
+           elsif type == String
+             { type: "string" }
+           elsif type == Integer
+             { type: "integer" }
+           elsif type == Float
+             { type: "number" }
+           elsif type == Numeric
+             { type: "number" }
+           elsif type == Date
+             { type: "string", format: "date" }
+           elsif type == DateTime
+             { type: "string", format: "date-time" }
+           elsif type == Time
+             { type: "string", format: "date-time" }
+           elsif [TrueClass, FalseClass].include?(type)
+             { type: "boolean" }
+           elsif type < T::Struct
+             # Handle custom T::Struct classes by generating nested object schema
+             # Check for recursion
+             if visited.include?(type)
+               # Return a reference to avoid infinite recursion
+               {
+                 "$ref" => "#/definitions/#{type.name.split('::').last}",
+                 description: "Recursive reference to #{type.name}"
+               }
+             else
+               self.generate_struct_schema(type, visited)
+             end
+           else
+             { type: "string" } # Default fallback
+           end
+         elsif type.is_a?(T::Types::Simple)
+           case type.raw_type.to_s
+           when "String"
+             { type: "string" }
+           when "Integer"
+             { type: "integer" }
+           when "Float"
+             { type: "number" }
+           when "Numeric"
+             { type: "number" }
+           when "Date"
+             { type: "string", format: "date" }
+           when "DateTime"
+             { type: "string", format: "date-time" }
+           when "Time"
+             { type: "string", format: "date-time" }
+           when "TrueClass", "FalseClass"
+             { type: "boolean" }
+           when "T::Boolean"
+             { type: "boolean" }
+           else
+             # Check if it's an enum
+             if type.raw_type < T::Enum
+               # Get all enum values
+               values = type.raw_type.values.map(&:serialize)
+               { type: "string", enum: values }
+             elsif type.raw_type < T::Struct
+               # Handle custom T::Struct classes
+               if visited.include?(type.raw_type)
+                 {
+                   "$ref" => "#/definitions/#{type.raw_type.name.split('::').last}",
+                   description: "Recursive reference to #{type.raw_type.name}"
+                 }
+               else
+                 generate_struct_schema(type.raw_type, visited)
+               end
+             else
+               { type: "string" } # Default fallback
+             end
+           end
+         elsif type.is_a?(T::Types::TypedArray)
+           # Handle arrays properly with nested item type
+           {
+             type: "array",
+             items: self.type_to_json_schema(type.type, visited)
+           }
+         elsif type.is_a?(T::Types::TypedHash)
+           # Handle hashes as objects with additionalProperties
+           # TypedHash has keys and values methods to access its key and value types
+           key_schema = self.type_to_json_schema(type.keys, visited)
+           value_schema = self.type_to_json_schema(type.values, visited)
+
+           # Create a more descriptive schema for nested structures
+           {
+             type: "object",
+             propertyNames: key_schema, # Describe key constraints
+             additionalProperties: value_schema,
+             # Add a more explicit description of the expected structure
+             description: "A mapping where keys are #{key_schema[:type]}s and values are #{value_schema[:description] || value_schema[:type]}s"
+           }
+         elsif type.is_a?(T::Types::FixedHash)
+           # Handle fixed hashes (from type aliases like { "key" => Type })
+           properties = {}
+           required = []
+
+           type.types.each do |key, value_type|
+             properties[key] = self.type_to_json_schema(value_type, visited)
+             required << key
+           end
+
+           {
+             type: "object",
+             properties: properties,
+             required: required,
+             additionalProperties: false
+           }
+         elsif type.class.name == "T::Private::Types::SimplePairUnion"
+           # Handle T.nilable types (T::Private::Types::SimplePairUnion)
+           # This is the actual implementation of T.nilable(SomeType)
+           has_nil = type.respond_to?(:types) && type.types.any? do |t|
+             (t.respond_to?(:raw_type) && t.raw_type == NilClass) ||
+               (t.respond_to?(:name) && t.name == "NilClass")
+           end
+
+           if has_nil
+             # Find the non-nil type
+             non_nil_type = type.types.find do |t|
+               !(t.respond_to?(:raw_type) && t.raw_type == NilClass) &&
+                 !(t.respond_to?(:name) && t.name == "NilClass")
+             end
+
+             if non_nil_type
+               base_schema = self.type_to_json_schema(non_nil_type, visited)
+               if base_schema[:type].is_a?(String)
+                 # Convert single type to array with null
+                 { type: [base_schema[:type], "null"] }.merge(base_schema.except(:type))
+               else
+                 # For complex schemas, use anyOf to allow null
+                 { anyOf: [base_schema, { type: "null" }] }
+               end
+             else
+               { type: "string" } # Fallback
+             end
+           else
+             # Not nilable SimplePairUnion - this is a regular T.any() union
+             # Generate oneOf schema for all types
+             if type.respond_to?(:types) && type.types.length > 1
+               {
+                 oneOf: type.types.map { |t| self.type_to_json_schema(t, visited) },
+                 description: "Union of multiple types"
+               }
+             else
+               # Single type or fallback
+               first_type = type.respond_to?(:types) ? type.types.first : type
+               self.type_to_json_schema(first_type, visited)
+             end
+           end
+         elsif type.is_a?(T::Types::Union)
+           # Check if this is a nilable type (contains NilClass)
+           is_nilable = type.types.any? { |t| t == T::Utils.coerce(NilClass) }
+           non_nil_types = type.types.reject { |t| t == T::Utils.coerce(NilClass) }
+
+           # Special case: check if we have TrueClass + FalseClass (T.nilable(T::Boolean))
+           if non_nil_types.size == 2 && is_nilable
+             true_class_type = non_nil_types.find { |t| t.respond_to?(:raw_type) && t.raw_type == TrueClass }
+             false_class_type = non_nil_types.find { |t| t.respond_to?(:raw_type) && t.raw_type == FalseClass }
+
+             if true_class_type && false_class_type
+               # This is T.nilable(T::Boolean) - treat as nilable boolean
+               return { type: ["boolean", "null"] }
+             end
+           end
+
+           if non_nil_types.size == 1 && is_nilable
+             # This is T.nilable(SomeType) - generate proper schema with null allowed
+             base_schema = self.type_to_json_schema(non_nil_types.first, visited)
+             if base_schema[:type].is_a?(String)
+               # Convert single type to array with null
+               { type: [base_schema[:type], "null"] }.merge(base_schema.except(:type))
+             else
+               # For complex schemas, use anyOf to allow null
+               { anyOf: [base_schema, { type: "null" }] }
+             end
+           elsif non_nil_types.size == 1
+             # Non-nilable single type union (shouldn't happen in practice)
+             self.type_to_json_schema(non_nil_types.first, visited)
+           elsif non_nil_types.size > 1
+             # Handle complex unions with oneOf for better JSON schema compliance
+             base_schema = {
+               oneOf: non_nil_types.map { |t| self.type_to_json_schema(t, visited) },
+               description: "Union of multiple types"
+             }
+             if is_nilable
+               # Add null as an option for complex nilable unions
+               base_schema[:oneOf] << { type: "null" }
+             end
+             base_schema
+           else
+             { type: "string" } # Fallback for complex unions
+           end
+         elsif type.is_a?(T::Types::ClassOf)
+           # Handle T.class_of() types
+           {
+             type: "string",
+             description: "Class name (T.class_of type)"
+           }
+         else
+           { type: "string" } # Default fallback
+         end
+       end
+
+       # Generate JSON schema for custom T::Struct classes
+       sig { params(struct_class: T.class_of(T::Struct), visited: T.nilable(T::Set[T.untyped])).returns(T::Hash[Symbol, T.untyped]) }
+       def self.generate_struct_schema(struct_class, visited = nil)
+         visited ||= Set.new
+
+         return { type: "string", description: "Struct (schema introspection not available)" } unless struct_class.respond_to?(:props)
+
+         # Add this struct to visited set to detect recursion
+         visited.add(struct_class)
+
+         properties = {}
+         required = []
+
+         # Check if struct already has a _type field
+         if struct_class.props.key?(:_type)
+           raise DSPy::ValidationError, "_type field conflict: #{struct_class.name} already has a _type field defined. " \
+             "DSPy uses _type for automatic type detection in union types."
+         end
+
+         # Add automatic _type field for type detection
+         properties[:_type] = {
+           type: "string",
+           const: struct_class.name.split('::').last # Use the simple class name
+         }
+         required << "_type"
+
+         struct_class.props.each do |prop_name, prop_info|
+           prop_type = prop_info[:type_object] || prop_info[:type]
+           properties[prop_name] = self.type_to_json_schema(prop_type, visited)
+
+           # A field is required if it's not fully optional
+           # fully_optional is true for nilable prop fields
+           # immutable const fields are required unless nilable
+           unless prop_info[:fully_optional]
+             required << prop_name.to_s
+           end
+         end
+
+         # Remove this struct from visited set after processing
+         visited.delete(struct_class)
+
+         {
+           type: "object",
+           properties: properties,
+           required: required,
+           description: "#{struct_class.name} struct"
+         }
+       end
+
+       private
+
+       # Extensions to Hash for Rails-like except method if not available
+       # This ensures compatibility with the original code
+       unless Hash.method_defined?(:except)
+         Hash.class_eval do
+           def except(*keys)
+             dup.tap do |hash|
+               keys.each { |key| hash.delete(key) }
+             end
+           end
+         end
+       end
+     end
+   end
+ end
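Two ideas in the converter above are easier to see in isolation: how a nilable type widens a schema (`type: [..., "null"]` for simple schemas, `anyOf` otherwise) and how the `visited` set turns a self-referential struct into a `$ref`. The sketch below is a hypothetical miniature, not the gem's code: `MiniSchema` and its plain-Ruby type descriptions are assumptions chosen so it runs without `sorbet-runtime`, and it marks every property as required for brevity.

```ruby
require 'set'

# Hypothetical stand-ins for Sorbet types:
#   :string / :integer            scalar types
#   [:nilable, inner]             plays the role of T.nilable(inner)
#   { name: "X", props: { ... } } plays the role of a T::Struct subclass
module MiniSchema
  SCALARS = { string: "string", integer: "integer" }.freeze

  def self.convert(type, visited = Set.new)
    case type
    when Symbol
      { type: SCALARS.fetch(type) }
    when Array
      base = convert(type.last, visited)
      if base[:type].is_a?(String)
        # Simple schema: widen the type to a two-element list.
        { type: [base[:type], "null"] }.merge(base.except(:type))
      else
        # Schema without a plain type (for example a $ref): wrap it in anyOf.
        { anyOf: [base, { type: "null" }] }
      end
    when Hash
      # A struct that is already being expanded becomes a reference.
      return { "$ref" => "#/definitions/#{type[:name]}" } if visited.include?(type[:name])

      visited.add(type[:name])
      props = type[:props].transform_values { |t| convert(t, visited) }
      visited.delete(type[:name])
      { type: "object", properties: props, required: type[:props].keys.map(&:to_s) }
    end
  end
end

# A linked-list style node whose next_node field refers back to the node itself.
node = { name: "Node", props: { value: :integer } }
node[:props][:next_node] = [:nilable, node]

schema = MiniSchema.convert(node)
puts schema.inspect
```

Removing the struct from `visited` after its properties are processed, as the gem does, means only true cycles become references; the same struct appearing twice in sibling positions is expanded both times.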
data/lib/dspy/schema/version.rb ADDED
@@ -0,0 +1,7 @@
+ # frozen_string_literal: true
+
+ module DSPy
+   module Schema
+     VERSION = "1.0.0"
+   end
+ end
data/lib/dspy/schema.rb ADDED
@@ -0,0 +1,4 @@
+ # frozen_string_literal: true
+
+ require_relative "schema/version"
+ require_relative "schema/sorbet_json_schema"
metadata ADDED
@@ -0,0 +1,61 @@
+ --- !ruby/object:Gem::Specification
+ name: dspy-schema
+ version: !ruby/object:Gem::Version
+   version: 1.0.0
+ platform: ruby
+ authors:
+ - Vicente Reig Rincón de Arellano
+ bindir: bin
+ cert_chain: []
+ date: 2025-10-25 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: sorbet-runtime
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.5.0
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.5.0
+ description: Provides DSPy::TypeSystem::SorbetJsonSchema without requiring the full
+   DSPy stack, enabling reuse in sibling gems and downstream projects.
+ email:
+ - hey@vicente.services
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - LICENSE
+ - README.md
+ - lib/dspy/schema.rb
+ - lib/dspy/schema/sorbet_json_schema.rb
+ - lib/dspy/schema/version.rb
+ homepage: https://github.com/vicentereig/dspy.rb
+ licenses:
+ - MIT
+ metadata:
+   github_repo: git@github.com:vicentereig/dspy.rb
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: 3.3.0
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubygems_version: 3.6.5
+ specification_version: 4
+ summary: Sorbet to JSON Schema conversion utilities reused by DSPy.rb.
+ test_files: []
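The version constraints in this gemspec (`sorbet-runtime >= 0.5.0`, Ruby `>= 3.3.0`) can be evaluated with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. A minimal sketch follows; the candidate version strings are made-up inputs for illustration, not versions taken from this diff.

```ruby
require 'rubygems'

# Constraints copied from the metadata above.
sorbet_requirement = Gem::Requirement.new(">= 0.5.0")
ruby_requirement   = Gem::Requirement.new(">= 3.3.0")

# Hypothetical candidate versions, chosen only to exercise both outcomes.
sorbet_ok = sorbet_requirement.satisfied_by?(Gem::Version.new("0.5.11000"))
ruby_32   = ruby_requirement.satisfied_by?(Gem::Version.new("3.2.2"))

# The interpreter running this script, checked against the gem's minimum.
ruby_current = ruby_requirement.satisfied_by?(Gem::Version.new(RUBY_VERSION))

puts "sorbet-runtime 0.5.11000 allowed: #{sorbet_ok}"
puts "Ruby 3.2.2 allowed: #{ruby_32}"
puts "Current Ruby #{RUBY_VERSION} allowed: #{ruby_current}"
```

Because the minimum Ruby version is 3.3.0, `Hash#except` (built in since Ruby 3.0) is always available, so the fallback definition at the end of `sorbet_json_schema.rb` should never be triggered on a supported interpreter.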