RubyGems - ruby_llm-contract - Versions diffs - 0.3.0 → 0.3.7 - Mend

ruby_llm-contract 0.3.0 → 0.3.7

Files changed (22)

checksums.yaml +4 -4
data/CHANGELOG.md +54 -0
data/Gemfile.lock +2 -2
data/README.md +1 -1
data/lib/ruby_llm/contract/adapters/ruby_llm.rb +3 -3
data/lib/ruby_llm/contract/concerns/eval_host.rb +1 -0
data/lib/ruby_llm/contract/contract/schema_validator.rb +70 -3
data/lib/ruby_llm/contract/eval/baseline_diff.rb +15 -3
data/lib/ruby_llm/contract/eval/eval_definition.rb +24 -4
data/lib/ruby_llm/contract/eval/report.rb +1 -1
data/lib/ruby_llm/contract/eval/runner.rb +2 -1
data/lib/ruby_llm/contract/eval/trait_evaluator.rb +11 -2
data/lib/ruby_llm/contract/pipeline/result.rb +1 -1
data/lib/ruby_llm/contract/prompt/builder.rb +6 -3
data/lib/ruby_llm/contract/step/base.rb +4 -3
data/lib/ruby_llm/contract/step/limit_checker.rb +1 -1
data/lib/ruby_llm/contract/step/retry_policy.rb +1 -1
data/lib/ruby_llm/contract/step/runner.rb +7 -1
data/lib/ruby_llm/contract/version.rb +1 -1
data/lib/ruby_llm/contract.rb +19 -0
data/ruby_llm-contract.gemspec +5 -3
metadata +6 -4

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: b032109a7818caa3f68cae651f9f99210765d4257825f52a332944a6120ad522
-  data.tar.gz: 8f4c1bb95cbcf79236723e100becf8c8f2b87061bd7c29827152e4d716a99ce3
+  metadata.gz: dee963c252704634b8b9452e4e0460561e7795385e2dc59f4d5cc089a16d9210
+  data.tar.gz: ce289e0f1dee22a75d7079b28775c6dd0e5d85b01a54e5a97e4f47b40c2f5741
 SHA512:
-  metadata.gz: e84f8e58367e2eae1ea6a0a712e125be6b3edb361ce6feca984c659f15ca11ce658143adf7fdfcd09f5c1ff57d09fad31e431320f780dd08da7ab7499dd9b961
-  data.tar.gz: 29c98d8fb09a92df1a88136d7c67094784fdf2ae01ae9ec1aaa3fc5f1cd589fd27c7139c84663ba9e49c89e5537f98480eb451076c8a00dffcccfc3bf062f5d8
+  metadata.gz: d10ff4021462051d80cb5205174a24f9c5093ee096fc5add7d5bfacc88fb936a364d474871c05d87dad404ffc9577c998e7a1ae73cc8a8e0a5868e7cef629c83
+  data.tar.gz: 914a370baf65d5e8fc62f78a22e3bc6ee9eba83b78257ac95b87c8d5965ae23e54dbb7a66de7b2b6c7dc3c848a513be22c2e37e76445d9094dd576f3d3867215

data/CHANGELOG.md CHANGED

@@ -1,5 +1,59 @@
 # Changelog
+## 0.3.7 (2026-03-24)
+- **Trait missing key = error** — `expected_traits: { title: 0..5 }` on output `{}` now fails instead of silently passing.
+- **nil input in dynamic prompts** — `run(nil)` with `prompt { |input| ... }` correctly passes nil to block.
+- **Defensive sample pre-validation** — `sample_response` uses the same parser as runtime (handles code fences, BOM, prose around JSON).
+- **Baseline diff excludes skipped** — self-compare with skipped cases no longer shows artificial score delta.
+- **Zeitwerk eval/ ignore** — `eager_load_contract_dirs!` ignores `eval/` subdirs before eager load.
+## 0.3.6 (2026-03-24)
+- **Recursive array/object validation** — nested arrays (`array of array of string`) validated recursively. Object items validated even without `:properties` (e.g. `additionalProperties: false`).
+- **Deep symbolize in sample pre-validation** — array samples with string keys (`[{"name" => "Alice"}]`) correctly symbolized before schema validation.
+## 0.3.5 (2026-03-24)
+- **String constraints in SchemaValidator** — `minLength`/`maxLength` enforced for root and nested strings.
+- **Array item validation** — scalar items (string, integer) validated against items schema type and constraints.
+- **Non-JSON sample_response fails fast** — `sample_response("hello")` with object schema raises ArgumentError at definition time instead of silently passing.
+- **`max_tokens` in KNOWN_CONTEXT_KEYS** — no more spurious "Unknown context keys" warning.
+- **Duplicate models deduplicated** — `compare_models(models: ["m", "m"])` runs model once.
+## 0.3.4 (2026-03-24)
+- **SchemaValidator validates non-object roots** — boolean, integer, number, array root schemas now enforce type, min/max, enum, minItems/maxItems. Previously only object schemas were validated.
+- **Removed passing cases = regression** — `regressed?` returns true when baseline had passing cases that are now missing. Prevents gate bypass by deleting eval cases.
+- **JSON string sample_response fixed** — `sample_response('{"name":"Alice"}')` correctly parsed for pre-validation instead of double-encoding.
+- **`context[:max_tokens]` forwarded** — overrides step's `max_output` for adapter call AND budget precheck.
+## 0.3.3 (2026-03-23)
+- **Skipped cases visible in regression diff** — baseline PASS → current SKIP now detected as regression by `without_regressions` and `fail_on_regression`.
+- **Skip only on missing adapter** — eval runner no longer masks evaluator errors as SKIP. Only "No adapter configured" triggers skip.
+- **Array/Hash sample pre-validation** — `sample_response([{...}])` correctly validated against schema instead of silently skipping.
+- **`assume_model_exists: false` forwarded** — boolean `false` no longer dropped by truthiness check in adapter options.
+- **Duplicate case names caught at definition** — `add_case`/`verify` with same name raises immediately, not at run time.
+## 0.3.2 (2026-03-23)
+- **Array response preserved** — `Adapters::RubyLLM` no longer stringifies Array content. Steps with `output_type Array` work correctly.
+- **Falsy prompt input** — `run(false)` and `build_messages(false)` pass `false` to dynamic prompt blocks instead of falling back to `instance_eval`.
+- **`retry_on` flatten** — `retry_on([:a, :b])` no longer wraps in nested array.
+- **Builder reset** — `Prompt::Builder` resets nodes on each build (no accumulation on reuse).
+- **Pipeline false output** — `output: false` no longer shows "(no output)" in pretty_print.
+## 0.3.1 (2026-03-23)
+Fixes from persona_tool production deployment (4 services migrated).
+- **Proc/Lambda in `expected_traits`** — `expected_traits: { score: ->(v) { v > 3 } }` now works.
+- **Zeitwerk eager-load** — `load_evals!` eager-loads `app/contracts/` and `app/steps/` before loading eval files. Fixes uninitialized constant errors in Rake tasks.
+- **Falsy values** — `expected: false`, `input: false`, `sample_response(nil)` all handled correctly.
+- **Context key forwarding** — `provider:` and `assume_model_exists:` forwarded to adapter. `schema:` and `max_tokens:` are step-level only (no split-brain).
+- **Deep-freeze immutability** — constructors never mutate caller's data.
 ## 0.3.0 (2026-03-23)
 Baseline regression detection — know when quality drops before users do.

data/Gemfile.lock CHANGED

@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    ruby_llm-contract (0.3.0)
+    ruby_llm-contract (0.3.7)
       dry-types (~> 1.7)
       ruby_llm (~> 1.0)
       ruby_llm-schema (~> 0.3)
@@ -165,7 +165,7 @@ CHECKSUMS
   rubocop-ast (1.49.1) sha256=4412f3ee70f6fe4546cc489548e0f6fcf76cafcfa80fa03af67098ffed755035
   ruby-progressbar (1.13.0) sha256=80fc9c47a9b640d6834e0dc7b3c94c9df37f08cb072b7761e4a71e22cff29b33
   ruby_llm (1.14.0) sha256=57c6f7034fc4a44504ea137d70f853b07824f1c1cdbe774ab3ab3522e7098deb
-  ruby_llm-contract (0.3.0)
+  ruby_llm-contract (0.3.7)
   ruby_llm-schema (0.3.0) sha256=a591edc5ca1b7f0304f0e2261de61ba4b3bea17be09f5cf7558153adfda3dec6
   unicode-display_width (3.2.0) sha256=0cdd96b5681a5949cdbc2c55e7b420facae74c4aaf9a9815eee1087cb1853c42
   unicode-emoji (4.2.0) sha256=519e69150f75652e40bf736106cfbc8f0f73aa3fb6a65afe62fefa7f80b0f80f

data/README.md CHANGED

@@ -6,7 +6,7 @@ Companion gem for [ruby_llm](https://github.com/crmne/ruby_llm).
 ## The problem
-You call an LLM. It returns bad JSON, wrong values, or costs 4x more than it should. You switch models and quality drops silently. You have no data to decide which model to use.
+Which model should you use? The expensive one is accurate but costs 4x more. The cheap one is fast but hallucinates on edge cases. You tweak a prompt — did accuracy improve or drop? You have no data. Just gut feeling.
 ## The fix

data/lib/ruby_llm/contract/adapters/ruby_llm.rb CHANGED

@@ -43,8 +43,8 @@ module RubyLLM
         def chat_constructor_options(options)
           opts = { model: options[:model] }
-          opts[:provider] = options[:provider] if options[:provider]
-          opts[:assume_model_exists] = options[:assume_model_exists] if options[:assume_model_exists]
+          opts[:provider] = options[:provider] if options.key?(:provider)
+          opts[:assume_model_exists] = options[:assume_model_exists] if options.key?(:assume_model_exists)
           opts
         end
@@ -57,7 +57,7 @@ module RubyLLM
         def build_response(response)
           content = response.content
-          content = content.to_s unless content.is_a?(Hash)
+          content = content.to_s unless content.is_a?(Hash) || content.is_a?(Array)
           Response.new(
             content: content,
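
The move from a truthiness guard to `key?` matters when a caller passes an explicit `false`. A minimal sketch of the difference follows; `truthy_options` and `keyed_options` are hypothetical helpers written for illustration, not methods of the gem.

```ruby
# Hypothetical helpers contrasting the two guard styles; not the gem's API.

# Old style: the value itself is the guard, so `false` is silently dropped.
def truthy_options(options)
  opts = { model: options[:model] }
  opts[:assume_model_exists] = options[:assume_model_exists] if options[:assume_model_exists]
  opts
end

# New style: presence of the key is the guard, so `false` is forwarded.
def keyed_options(options)
  opts = { model: options[:model] }
  opts[:assume_model_exists] = options[:assume_model_exists] if options.key?(:assume_model_exists)
  opts
end

input = { model: "some-model", assume_model_exists: false }
truthy = truthy_options(input)
keyed = keyed_options(input)
```

With the truthiness guard, the caller's `false` never reaches the result hash; with `key?`, it does.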

data/lib/ruby_llm/contract/concerns/eval_host.rb CHANGED

@@ -46,6 +46,7 @@ module RubyLLM
         def compare_models(eval_name, models:, context: {})
           context ||= {}
+          models = models.uniq
           reports = models.each_with_object({}) do |model, hash|
             model_context = deep_dup_context(context).merge(model: model)
             hash[model] = run_single_eval(eval_name, model_context)

data/lib/ruby_llm/contract/contract/schema_validator.rb CHANGED

@@ -40,10 +40,77 @@ module RubyLLM
       def validate_non_hash_output
         expected_type = @json_schema[:type]&.to_s
         if expected_type == "object" || @json_schema.key?(:properties)
-          ["expected object, got #{@output.class}"]
-        else
-          []
+          return ["expected object, got #{@output.class}"]
+        end
+        errors = []
+        validate_type_match(errors, @output, expected_type, "root") if expected_type
+        validate_constraints(errors, @output, @json_schema, "root")
+        if expected_type == "array" && @output.is_a?(Array) && @json_schema[:items]
+          validate_array_items(errors, @output, @json_schema[:items], "")
+        end
+        errors
+      end
+      def validate_array_items(errors, array, items_schema, prefix)
+        array.each_with_index do |item, i|
+          item_prefix = "#{prefix}[#{i}]"
+          validate_value(errors, item, items_schema, item_prefix)
+        end
+      end
+      def validate_value(errors, value, schema, prefix)
+        value_type = schema[:type]&.to_s
+        validate_type_match(errors, value, value_type, prefix) if value_type
+        validate_constraints(errors, value, schema, prefix)
+        if value.is_a?(Hash) && (schema.key?(:properties) || value_type == "object")
+          validate_object(value, schema, prefix: prefix)
+          errors.concat(@errors)
+          @errors = []
+        elsif value.is_a?(Array) && schema[:items]
+          validate_array_items(errors, value, schema[:items], prefix)
+        end
+      end
+      def validate_type_match(errors, value, expected_type, prefix)
+        valid = case expected_type
+                when "string" then value.is_a?(String)
+                when "integer" then value.is_a?(Integer)
+                when "number" then value.is_a?(Numeric)
+                when "boolean" then value.is_a?(TrueClass) || value.is_a?(FalseClass)
+                when "array" then value.is_a?(Array)
+                else true
+                end
+        errors << "#{prefix}: expected #{expected_type}, got #{value.class}" unless valid
+      end
+      def validate_constraints(errors, value, schema, prefix)
+        if schema[:minimum] && value.is_a?(Numeric) && value < schema[:minimum]
+          errors << "#{prefix}: #{value} is less than minimum #{schema[:minimum]}"
+        end
+        if schema[:maximum] && value.is_a?(Numeric) && value > schema[:maximum]
+          errors << "#{prefix}: #{value} is greater than maximum #{schema[:maximum]}"
+        end
+        if schema[:enum] && !schema[:enum].include?(value)
+          errors << "#{prefix}: #{value.inspect} is not in enum #{schema[:enum].inspect}"
+        end
+        if schema[:minItems] && value.is_a?(Array) && value.length < schema[:minItems]
+          errors << "#{prefix}: array has #{value.length} items, minimum #{schema[:minItems]}"
+        end
+        if schema[:maxItems] && value.is_a?(Array) && value.length > schema[:maxItems]
+          errors << "#{prefix}: array has #{value.length} items, maximum #{schema[:maxItems]}"
+        end
+        if schema[:minLength] && value.is_a?(String) && value.length < schema[:minLength]
+          errors << "#{prefix}: string length #{value.length} is less than minLength #{schema[:minLength]}"
+        end
+        if schema[:maxLength] && value.is_a?(String) && value.length > schema[:maxLength]
+          errors << "#{prefix}: string length #{value.length} is greater than maxLength #{schema[:maxLength]}"
         end
       end
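
The recursion added here (type check, then constraints, then descent into array items) can be illustrated with a standalone sketch. `sketch_validate` is a hypothetical function, covers only a subset of the constraints in the hunk, and uses a `root` path prefix of its own choosing; it is not the gem's `SchemaValidator`.

```ruby
# Standalone sketch of recursive schema checking; names are illustrative
# and are not the gem's API.
def sketch_type_ok?(value, type)
  case type
  when "string" then value.is_a?(String)
  when "integer" then value.is_a?(Integer)
  when "number" then value.is_a?(Numeric)
  when "boolean" then value == true || value == false
  when "array" then value.is_a?(Array)
  else true # unknown or missing type: nothing to check
  end
end

def sketch_validate(value, schema, path = "root", errors = [])
  type = schema[:type]&.to_s
  errors << "#{path}: expected #{type}, got #{value.class}" unless sketch_type_ok?(value, type)

  min_length = schema[:minLength]
  if min_length && value.is_a?(String) && value.length < min_length
    errors << "#{path}: shorter than minLength #{min_length}"
  end

  min_items = schema[:minItems]
  if min_items && value.is_a?(Array) && value.length < min_items
    errors << "#{path}: fewer than minItems #{min_items}"
  end

  # Recurse into array items so nested arrays are checked at every level.
  if value.is_a?(Array) && schema[:items]
    value.each_with_index do |item, i|
      sketch_validate(item, schema[:items], "#{path}[#{i}]", errors)
    end
  end

  errors
end

schema = {
  type: "array",
  minItems: 1,
  items: { type: "array", items: { type: "string", minLength: 2 } }
}
ok_errors = sketch_validate([["ab", "cd"]], schema)
bad_errors = sketch_validate([["ab", "c", 3]], schema)
```

In the second call, the short string and the integer are each reported once, with a path that names the offending element.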

data/lib/ruby_llm/contract/eval/baseline_diff.rb CHANGED

@@ -9,8 +9,8 @@ module RubyLLM
         def initialize(baseline_cases:, current_cases:)
           @baseline = index_by_name(baseline_cases)
           @current = index_by_name(current_cases)
-          @baseline_score = baseline_cases.empty? ? 0.0 : baseline_cases.sum { |c| c[:score] } / baseline_cases.length
-          @current_score = current_cases.empty? ? 0.0 : current_cases.sum { |c| c[:score] } / current_cases.length
+          @baseline_score = compute_score(baseline_cases)
+          @current_score = compute_score(current_cases)
           freeze
         end
@@ -48,7 +48,11 @@ module RubyLLM
         end
         def regressed?
-          regressions.any?
+          regressions.any? || removed_passing_cases.any?
+        end
+        def removed_passing_cases
+          removed_cases.select { |name| @baseline[name]&.dig(:passed) }
         end
         def improved?
@@ -74,6 +78,14 @@ module RubyLLM
         private
+        def compute_score(cases)
+          # Exclude skipped cases from score (consistent with Report#score)
+          evaluated = cases.reject { |c| c[:details]&.start_with?("skipped:") }
+          return 0.0 if evaluated.empty?
+          evaluated.sum { |c| c[:score] } / evaluated.length
+        end
         def index_by_name(cases)
           cases.each_with_object({}) { |c, h| h[c[:name]] = c }
         end
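
A small sketch of the two behaviors in this file: excluding skipped cases from the average, and treating a removed passing case as a regression signal. The helper names and the sample case data (including a score of `0.0` on the skipped case) are assumptions made for illustration; only the `"skipped:"` prefix on `:details` and the `:passed` key come from the diff.

```ruby
# Illustrative helpers; not the gem's API.
def naive_score(cases)
  return 0.0 if cases.empty?

  cases.sum { |c| c[:score] } / cases.length
end

def sketch_score(cases)
  # Skipped cases are identified by a details string starting with "skipped:".
  evaluated = cases.reject { |c| c[:details]&.start_with?("skipped:") }
  return 0.0 if evaluated.empty?

  evaluated.sum { |c| c[:score] } / evaluated.length
end

cases = [
  { name: "a", score: 1.0, passed: true, details: nil },
  { name: "b", score: 0.0, passed: false, details: "skipped: No adapter configured" }
]
naive = naive_score(cases)
filtered = sketch_score(cases)

# Removed passing cases: present and passing in the baseline, absent now.
baseline = { "a" => { passed: true }, "b" => { passed: false } }
current_names = ["b"]
removed_passing = (baseline.keys - current_names).select { |name| baseline[name]&.dig(:passed) }
```

Under these assumptions the naive average is dragged down by the skipped case, while the filtered average reflects only the evaluated one.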

data/lib/ruby_llm/contract/eval/eval_definition.rb CHANGED

@@ -34,6 +34,7 @@ module RubyLLM
         def add_case(description, input: nil, expected: nil, expected_traits: nil, evaluator: nil)
           case_input = input.nil? ? @default_input : input
           raise ArgumentError, "add_case requires input (set default_input or pass input:)" if case_input.nil?
+          validate_unique_case_name!(description)
           @cases << {
             name: description,
@@ -52,6 +53,7 @@ module RubyLLM
           expected_or_proc = expect unless expect.nil?
           case_input = input.nil? ? @default_input : input
           validate_verify_args!(expected_or_proc, case_input)
+          validate_unique_case_name!(description)
           evaluator = expected_or_proc.is_a?(::Proc) ? expected_or_proc : nil
@@ -85,6 +87,12 @@ module RubyLLM
           [{ name: "contract check", input: @default_input, expected: nil, evaluator: nil }]
         end
+        def validate_unique_case_name!(name)
+          return unless @cases.any? { |c| c[:name] == name }
+          raise ArgumentError, "Duplicate case name '#{name}' in eval '#{@name}'. Case names must be unique."
+        end
         def validate_verify_args!(expected_or_proc, case_input)
           raise ArgumentError, "verify requires either a positional argument or expect: keyword" if expected_or_proc.nil?
           raise ArgumentError, "verify requires input (set default_input or pass input:)" if case_input.nil?
@@ -98,15 +106,27 @@ module RubyLLM
           return if errors.empty?
           raise ArgumentError, "sample_response does not satisfy step schema: #{errors.join(", ")}"
-        rescue JSON::ParserError
-          # Not JSON -- skip pre-validation
+        rescue JSON::ParserError, RubyLLM::Contract::ParseError => e
+          raise ArgumentError, "sample_response is not valid JSON: #{e.message}"
         end
         def validate_sample_against_schema(schema)
-          response_hash = @sample_response.is_a?(Hash) ? @sample_response : JSON.parse(@sample_response.to_s)
-          symbolized = Parser.symbolize_keys(response_hash)
+          parsed = case @sample_response
+                   when Hash, Array then @sample_response
+                   when String then Parser.parse(@sample_response, strategy: :json)
+                   else @sample_response
+                   end
+          symbolized = deep_symbolize(parsed)
           SchemaValidator.validate(symbolized, schema)
         end
+        def deep_symbolize(obj)
+          case obj
+          when Hash then Parser.symbolize_keys(obj)
+          when Array then obj.map { |item| deep_symbolize(item) }
+          else obj
+          end
+        end
       end
     end
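
The idea behind `deep_symbolize` is that an array sample with string-keyed hashes must be symbolized at every level before schema validation. A self-contained sketch follows. `sketch_deep_symbolize` is hypothetical and recurses into hash values itself; whether the gem's `Parser.symbolize_keys` recurses is not visible in this diff.

```ruby
require "json"

# Illustrative helper; not the gem's API. Recursion into hash values is an
# assumption of this sketch.
def sketch_deep_symbolize(obj)
  case obj
  when Hash
    obj.each_with_object({}) { |(k, v), h| h[k.to_sym] = sketch_deep_symbolize(v) }
  when Array
    obj.map { |item| sketch_deep_symbolize(item) }
  else
    obj
  end
end

parsed = JSON.parse('[{"name":"Alice","tags":[{"id":1}]}]')
symbolized = sketch_deep_symbolize(parsed)
```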
   end

data/lib/ruby_llm/contract/eval/report.rb CHANGED

@@ -97,7 +97,7 @@ module RubyLLM
           validate_baseline!(baseline_data)
           BaselineDiff.new(
             baseline_cases: baseline_data[:cases],
-            current_cases: evaluated_results.map { |r| serialize_case(r) }
+            current_cases: results.map { |r| serialize_case(r) }
           )
         end

data/lib/ruby_llm/contract/eval/runner.rb CHANGED

@@ -32,7 +32,8 @@ module RubyLLM
           build_case_result(test_case, step_result, eval_result)
         rescue RubyLLM::Contract::Error => e
-          # No adapter configured — skip this case (offline mode without sample_response)
+          raise unless e.message.include?("No adapter configured")
           skipped_result(test_case, e.message)
         end
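
The pattern in this hunk is a selective rescue: one specific error message becomes a skip, and every other error of the same class is re-raised. A minimal sketch, assuming a hypothetical `SketchError` class and `sketch_run_case` helper:

```ruby
# Hypothetical error class and runner; not the gem's API.
class SketchError < StandardError; end

def sketch_run_case
  yield
rescue SketchError => e
  # Only the "no adapter" condition is treated as a skip; anything else
  # propagates so that evaluator bugs are not masked.
  raise unless e.message.include?("No adapter configured")

  { status: :skipped, reason: e.message }
end

skipped = sketch_run_case { raise SketchError, "No adapter configured" }

reraised = begin
  sketch_run_case { raise SketchError, "evaluator blew up" }
  false
rescue SketchError
  true
end
```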

data/lib/ruby_llm/contract/eval/trait_evaluator.rb CHANGED

@@ -19,13 +19,18 @@ module RubyLLM
         end
         def check_trait(output, key, expectation, errors)
-          value = output.is_a?(Hash) ? output[key] : nil
-          error_msg = trait_error(key, value, expectation)
+          unless output.is_a?(Hash) && output.key?(key)
+            errors << "#{key}: missing key"
+            return
+          end
+          error_msg = trait_error(key, output[key], expectation)
           errors << error_msg if error_msg
         end
         def trait_error(key, value, expectation)
           case expectation
+          when ::Proc
+            trait_proc_error(key, value, expectation)
           when ::Regexp
             trait_regexp_error(key, value, expectation)
           when Range
@@ -56,6 +61,10 @@ module RubyLLM
           "#{key}: expected falsy, got #{value.inspect}" if value
         end
+        def trait_proc_error(key, value, expectation)
+          "#{key}: trait check failed, got #{value.inspect}" unless expectation.call(value)
+        end
         def trait_equality_error(key, value, expectation)
           "#{key}: expected #{expectation.inspect}, got #{value.inspect}" unless value == expectation
         end
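
Two changes meet in this file: a missing key is now reported as an error rather than read as `nil`, and a `Proc` expectation is called with the value. The sketch below covers only those two paths plus plain equality; `sketch_check_trait` is a hypothetical helper, and the gem's Regexp and Range handling is deliberately left out because its exact semantics are not shown in this diff.

```ruby
# Illustrative helper; not the gem's API. Returns an error string or nil.
def sketch_check_trait(output, key, expectation)
  # A missing key is an error in its own right, distinct from a nil value.
  return "#{key}: missing key" unless output.is_a?(Hash) && output.key?(key)

  value = output[key]
  passed = expectation.is_a?(Proc) ? expectation.call(value) : value == expectation
  passed ? nil : "#{key}: trait check failed, got #{value.inspect}"
end

missing = sketch_check_trait({}, :title, "x")
present_nil = sketch_check_trait({ title: nil }, :title, nil)
proc_pass = sketch_check_trait({ score: 5 }, :score, ->(v) { v > 3 })
proc_fail = sketch_check_trait({ score: 2 }, :score, ->(v) { v > 3 })
```

The `present_nil` case shows why `key?` is used: a key that is present with a `nil` value is still compared against the expectation instead of being flagged as missing.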

data/lib/ruby_llm/contract/pipeline/result.rb CHANGED

@@ -116,7 +116,7 @@ module RubyLLM
         end
         def format_output(output)
-          return ["(no output)"] unless output
+          return ["(no output)"] if output.nil?
           pairs = output.is_a?(Hash) ? output : { value: output }
           pairs.map do |key, val|

data/lib/ruby_llm/contract/prompt/builder.rb CHANGED

@@ -4,13 +4,16 @@ module RubyLLM
   module Contract
     module Prompt
       class Builder
+        NOT_PROVIDED = Object.new.freeze
         def initialize(block)
           @block = block
           @nodes = []
         end
-        def build(input = nil)
-          if input && @block.arity >= 1
+        def build(input = NOT_PROVIDED)
+          @nodes = []
+          if input != NOT_PROVIDED && @block.arity >= 1
             instance_exec(input, &@block)
           else
             instance_eval(&@block)
@@ -38,7 +41,7 @@ module RubyLLM
           @nodes << Nodes::SectionNode.new(name, text)
         end
-        def self.build(input: nil, &block)
+        def self.build(input: NOT_PROVIDED, &block)
           new(block).build(input)
         end
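
The `NOT_PROVIDED` sentinel lets the builder tell "no argument given" apart from a legitimate `nil` or `false` input. A stripped-down sketch of the pattern follows; `SketchBuilder` is hypothetical, returns the block's value instead of collecting nodes, and uses `equal?` for the identity check where the diff uses `!=`.

```ruby
# Hypothetical class illustrating the sentinel-default pattern; not the gem's Builder.
class SketchBuilder
  # A unique frozen object that no caller can pass by accident.
  NOT_PROVIDED = Object.new.freeze

  def initialize(block)
    @block = block
  end

  def build(input = NOT_PROVIDED)
    if !NOT_PROVIDED.equal?(input) && @block.arity >= 1
      # An input was given (possibly nil or false): pass it to the block.
      instance_exec(input, &@block)
    else
      instance_eval(&@block)
    end
  end
end

dynamic = SketchBuilder.new(proc { |input| "got #{input.inspect}" })
static = SketchBuilder.new(proc { "static prompt" })

with_nil = dynamic.build(nil)
with_false = dynamic.build(false)
static_result = static.build("ignored")
```

With a `nil` default and a truthiness check, both `build(nil)` and `build(false)` would have fallen through to the `instance_eval` branch.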
       end

data/lib/ruby_llm/contract/step/base.rb CHANGED

@@ -58,7 +58,7 @@ module RubyLLM
             end
           end
-          KNOWN_CONTEXT_KEYS = %i[adapter model temperature provider assume_model_exists].freeze
+          KNOWN_CONTEXT_KEYS = %i[adapter model temperature max_tokens provider assume_model_exists].freeze
           def run(input, context: {})
             context = (context || {}).transform_keys { |k| k.respond_to?(:to_sym) ? k.to_sym : k }
@@ -68,7 +68,7 @@ module RubyLLM
             policy = retry_policy
             ctx_temp = context[:temperature]
-            extra = context.slice(:provider, :assume_model_exists)
+            extra = context.slice(:provider, :assume_model_exists, :max_tokens)
             result = if policy
                        run_with_retry(input, adapter: adapter, default_model: default_model,
                                       policy: policy, context_temperature: ctx_temp, extra_options: extra)
@@ -82,7 +82,8 @@ module RubyLLM
           def build_messages(input)
             dynamic = prompt.arity >= 1
-            ast = Prompt::Builder.build(input: dynamic ? input : nil, &prompt)
+            builder_input = dynamic ? input : Prompt::Builder::NOT_PROVIDED
+            ast = Prompt::Builder.build(input: builder_input, &prompt)
             variables = dynamic ? {} : { input: input }
             variables.merge!(input.transform_keys(&:to_sym)) if !dynamic && input.is_a?(Hash)
             Prompt::Renderer.render(ast, variables: variables)

data/lib/ruby_llm/contract/step/limit_checker.rb CHANGED

@@ -29,7 +29,7 @@ module RubyLLM
         end
         def append_cost_error(estimated, errors)
-          estimated_output = @max_output || 0
+          estimated_output = effective_max_output || 0
           estimated_cost = CostCalculator.calculate(
             model_name: @model,
             usage: { input_tokens: estimated, output_tokens: estimated_output }

data/lib/ruby_llm/contract/step/retry_policy.rb CHANGED

@@ -39,7 +39,7 @@ module RubyLLM
         end
         def retry_on(*statuses)
-          @retryable_statuses = statuses
+          @retryable_statuses = statuses.flatten
         end
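
Because `retry_on` takes a splat, passing an array wraps it in another array unless the result is flattened. A short sketch with hypothetical free-standing helpers (the gem's method lives on its retry policy object):

```ruby
# Illustrative helpers; not the gem's API.
def splat_only(*statuses)
  statuses
end

def splat_flattened(*statuses)
  statuses.flatten
end

nested = splat_only([:a, :b])
from_array = splat_flattened([:a, :b])
from_varargs = splat_flattened(:a, :b)
```

After the change, both call styles yield the same flat list of statuses.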
         def retryable?(result)

data/lib/ruby_llm/contract/step/runner.rb CHANGED

@@ -83,14 +83,20 @@ module RubyLLM
         end
         def build_adapter_options
+          effective_max_tokens = @extra_options[:max_tokens] || @max_output
           { model: @model }.tap do |opts|
             opts[:schema] = @output_schema if @output_schema
-            opts[:max_tokens] = @max_output if @max_output
+            opts[:max_tokens] = effective_max_tokens if effective_max_tokens
             opts[:temperature] = @temperature if @temperature
             @extra_options.each { |k, v| opts[k] = v unless opts.key?(k) }
           end
         end
+        def effective_max_output
+          @extra_options[:max_tokens] || @max_output
+        end
         def build_error_result(error_result, messages)
           Result.new(
             status: error_result.status,
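
The precedence introduced here is simple: a `max_tokens` value from the call context wins over the step's `max_output`, and the same resolved value feeds both the adapter options and the budget precheck. A sketch with a hypothetical helper taking both inputs as arguments:

```ruby
# Illustrative helper; the gem reads these values from instance variables.
def sketch_effective_max_tokens(extra_options, max_output)
  extra_options[:max_tokens] || max_output
end

overridden = sketch_effective_max_tokens({ max_tokens: 256 }, 1024)
defaulted = sketch_effective_max_tokens({}, 1024)
unset = sketch_effective_max_tokens({}, nil)
```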

data/lib/ruby_llm/contract/version.rb CHANGED

@@ -2,6 +2,6 @@
 module RubyLLM
   module Contract
-    VERSION = "0.3.0"
+    VERSION = "0.3.7"
   end
 end

data/lib/ruby_llm/contract.rb CHANGED

@@ -51,6 +51,10 @@ module RubyLLM
         return if dirs.empty?
+        # In Rails, eager-load parent directories so contract classes
+        # are available when eval files reference them.
+        eager_load_contract_dirs! if defined?(::Rails)
         # Clear file-sourced evals ONCE, then load ALL dirs.
         Thread.current[:ruby_llm_contract_reloading] = true
         eval_hosts.each do |host|
@@ -79,6 +83,21 @@ module RubyLLM
         @eval_hosts || []
       end
+      def eager_load_contract_dirs!
+        %w[app/contracts app/steps].each do |path|
+          full = ::Rails.root.join(path)
+          next unless full.exist?
+          # Ignore eval/ subdirs — they don't define Zeitwerk-compatible
+          # constants and are loaded separately by load_evals!
+          eval_dir = full.join("eval")
+          ::Rails.autoloaders.main.ignore(eval_dir.to_s) if eval_dir.exist?
+          ::Rails.autoloaders.main.eager_load_dir(full.to_s)
+        rescue StandardError
+          nil
+        end
+      end
       def auto_create_adapter!
         require "ruby_llm"
         configuration.default_adapter = Adapters::RubyLLM.new

data/ruby_llm-contract.gemspec CHANGED

@@ -7,9 +7,10 @@ Gem::Specification.new do |spec|
   spec.version = RubyLLM::Contract::VERSION
   spec.authors = ["Justyna"]
-  spec.summary = "Contract-first LLM step execution for RubyLLM"
-  spec.description = "Turn RubyLLM calls into contracted, validated, testable steps with schema enforcement, " \
-                     "retry with model escalation, and eval."
+  spec.summary = "Know which LLM model to use, what it costs, and when accuracy drops"
+  spec.description = "Compare LLM models by accuracy and cost. Regression-test prompts in CI. " \
+                     "Start on nano, auto-escalate to bigger models when quality drops. " \
+                     "Companion gem for ruby_llm."
   spec.homepage = "https://github.com/justi/ruby_llm-contract"
   spec.license = "MIT"
   spec.required_ruby_version = ">= 3.2.0"
@@ -17,6 +18,7 @@ Gem::Specification.new do |spec|
   spec.metadata["homepage_uri"] = spec.homepage
   spec.metadata["source_code_uri"] = spec.homepage
   spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
+  spec.metadata["documentation_uri"] = "#{spec.homepage}#readme"
   spec.metadata["rubygems_mfa_required"] = "true"
   spec.files = Dir.chdir(__dir__) do

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: ruby_llm-contract
 version: !ruby/object:Gem::Version
-  version: 0.3.0
+  version: 0.3.7
 platform: ruby
 authors:
 - Justyna
@@ -51,8 +51,9 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '0.3'
-description: Turn RubyLLM calls into contracted, validated, testable steps with schema
-  enforcement, retry with model escalation, and eval.
+description: Compare LLM models by accuracy and cost. Regression-test prompts in CI.
+  Start on nano, auto-escalate to bigger models when quality drops. Companion gem
+  for ruby_llm.
 executables: []
 extensions: []
 extra_rdoc_files: []
@@ -154,6 +155,7 @@ metadata:
   homepage_uri: https://github.com/justi/ruby_llm-contract
   source_code_uri: https://github.com/justi/ruby_llm-contract
   changelog_uri: https://github.com/justi/ruby_llm-contract/blob/main/CHANGELOG.md
+  documentation_uri: https://github.com/justi/ruby_llm-contract#readme
   rubygems_mfa_required: 'true'
 rdoc_options: []
 require_paths:
@@ -171,5 +173,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 rubygems_version: 3.6.7
 specification_version: 4
-summary: Contract-first LLM step execution for RubyLLM
+summary: Know which LLM model to use, what it costs, and when accuracy drops
 test_files: []