ruby_llm-contract 0.7.3 → 0.8.0 (RubyGems version diff, via Mend)

Files changed (13)

checksums.yaml +4 -4
data/CHANGELOG.md +36 -0
data/Gemfile.lock +3 -3
data/README.md +24 -5
data/lib/ruby_llm/contract/adapters/ruby_llm.rb +22 -1
data/lib/ruby_llm/contract/cost_calculator.rb +39 -0
data/lib/ruby_llm/contract/step/base.rb +18 -1
data/lib/ruby_llm/contract/step/dsl.rb +38 -0
data/lib/ruby_llm/contract/step/limit_checker.rb +2 -2
data/lib/ruby_llm/contract/token_estimator.rb +20 -3
data/lib/ruby_llm/contract/version.rb +1 -1
data/ruby_llm-contract.gemspec +6 -5
metadata +8 -7

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: c9b5adda942a6e1f3393d2d05e8226d210b25088c65c2c20160b2ffc6a493533
-  data.tar.gz: 91110ddd5d2fc39a8e3f6cd9ec912c707ceef7da2ee40e8805598a787ae1c067
+  metadata.gz: 33f6a8a686f7f20791904c4fbfacd19f6ea5b8bad428c374ec14b7e33521354d
+  data.tar.gz: a2b2f7d9ff1e6cd69b39d55a3809b1babbd82a5d42afe3c567733076f03fa317
 SHA512:
-  metadata.gz: 543779ceacb617909e7e45355473fb300d9a5f81206ee51019d4f868b3ea66d43272b04be78aa95d0a014ea16619ab57c933eb58f1d17d8fa3845a957d6c750a
-  data.tar.gz: eea6b49a1cb01de6b501bddec6fcfcccbc221a5448a72a435f4c04e80131a7994757b3c337390e1b6212a2709ae082a5d55c82158535a5789e606b7ef13d27f9
+  metadata.gz: '0895986db9cde7d26d2e91ffc7c6469b7d34df299a361148c9ee339dbf1dc61539e44adf250c00c06383717ed2e47ff250d4490d1cce43c3cdf8c3169529fba5'
+  data.tar.gz: ed35a4b4cc9ab1afd46c427468dcb33844d8d54207531f98a9f1d775004efc5a1e19d64fb22c64b20da81750165a3a979f4de4ecaa46aff4121eb7ba80a27ed2

data/CHANGELOG.md CHANGED

@@ -1,5 +1,41 @@
 # Changelog
+## 0.8.0 (2026-04-26)
+Narrative repositioning + small API additions. Internal architecture unchanged: no `Step::Base` refactor, no breaking changes to existing DSL.
+### Added
+- **`thinking(effort:, budget:)` class macro on `Step::Base`** — mirrors `RubyLLM::Agent.thinking` signature exactly. Stored as `{ effort:, budget: }` hash; reader returns the hash; supports `:default` reset semantics; superclass inheritance like `model`/`temperature`. The convenience alias `reasoning_effort(:low)` is implemented as `thinking(effort: :low)` — single normalized state, not separate ivar.
+- **Adapter wiring for `with_thinking`** — when `thinking` is set on the Step class, OR when `reasoning_effort:` is passed through context, OR when an attempt config in `retry_policy escalate(...)` carries `reasoning_effort:`, the RubyLLM adapter resolves the effective `{ effort:, budget: }` hash and forwards it via `chat.with_thinking(**)` — provider-agnostic (supports OpenAI `reasoning_effort` AND Anthropic extended-thinking budget). Precedence: per-attempt / context `reasoning_effort` overrides class-level `thinking[:effort]`; budget is taken from class-level `thinking[:budget]`. **Behavioural change vs 0.7.x**: `reasoning_effort` is now forwarded via `with_thinking` instead of `with_params`. Same wire-level OpenAI parameter; provider-agnostic Anthropic support is now automatic.
+### Dependencies
+- **`ruby_llm` constraint bumped from `~> 1.0` to `~> 1.12`** — `Chat#with_thinking` is the canonical path for reasoning effort + extended thinking; it shipped in RubyLLM 1.12. Adopters on `ruby_llm < 1.12` need to bump RubyLLM before upgrading this gem to 0.8.0.
+### Changed
+- **Tagline + README opening** — repositioned around "Contracts + Evals for RubyLLM". New "Relation to RubyLLM::Agent" section explicitly frames Step as a sibling abstraction (same niche as Agent, wider contract), not an alternative or foundation. README does not claim "Step uses Agent under the hood" — current call path is `Step → Runner → Adapters::RubyLLM → RubyLLM.chat` directly.
+- **`TokenEstimator` documented as heuristic** — module docstring expanded with explicit "±30% accuracy" framing. Refusal messages from `LimitChecker` now include `(heuristic ±30%)` suffix so adopters know the pre-flight number is estimated, not measured. RubyLLM 1.14 also has no pre-flight tokenizer; `RubyLLM::Tokens` is post-hoc only.
+- **`CostCalculator` repositioned in docs** — module narrative reframed from "cost calculator" to "fine-tune pricing registry + lookup with fallback chain". Math methods (`compute_cost`, `token_cost`, etc.) were already private; this release makes the docs match. Public API surface unchanged: `register_model`, `unregister_model`, `reset_custom_models!`, `calculate`.
+- **`output_schema` reframed in docs** — described as "wrapper around `RubyLLM::Schema` + client-side validation step", not a standalone feature. The schema language is identical to what `RubyLLM::Agent.schema` accepts; the difference is what wraps it.
+- **README retry framing** — `retry_policy escalate(...)` (model escalation on validation failure) is the marketed default. `retry_policy attempts: N` (same-model retry) stays in the API for backward compat and niche cases (subjective criteria, multi-step pipelines, weaker models) but is no longer marketed as a recommended default. Empirical basis: four small experiments across PDF quiz generation, GSM8K math (n=30 + n=120), and multi-constraint schedule generation found no useful lift for nano-class models on tasks with clear correctness criteria.
+### Documentation
+- **New disambiguation paragraphs** in `prompt_ast.md` (`Step.input_type` vs `RubyLLM::Agent.inputs`; `Prompt::Builder` multi-role DSL vs Agent ERB single-string template loader), `testing.md` (`Step.observe` vs `Chat#on_end_message` / `on_tool_call`), `output_schema.md` (relation to `Agent.schema`), and `optimizing_retry_policy.md` (orthogonal model + thinking dimensions).
+- **`getting_started.md` refusal message example** updated to include the new `(heuristic ±30%)` suffix.
+### Issues closed
+- **#11** (Optimizer is blind to same-model attempts) — closed after empirical experiments. `attempts: N` retry stays in API; not marketed as a default.
+- **#6** (Production cost reporting) — already implemented in 0.7.x; close confirmed.
+### Not in this release (deferred)
+- `output_schema` Proc form for runtime-input-aware schemas (parity with `Agent.schema` Proc form). Additive, low-risk; deferred to 0.9 to keep 0.8 scope tight.
+- H4 (Step composing `RubyLLM::Agent` internally as config holder) — verified feasible but ROI insufficient for current adopter base; trigger-based revisit, no calendar commitment.
 ## 0.7.3 (2026-04-24)
 Adoption-friction release. No runtime behavior changes — every delta is in `docs/`, `examples/`, or `spec/integration/` (plus the `version.rb` / Gemfile.lock bumps). Upgrading from 0.7.2 picks up the expanded guide set, the new runnable showcases, and one extra integration spec.

data/Gemfile.lock CHANGED

@@ -1,9 +1,9 @@
 PATH
   remote: .
   specs:
-    ruby_llm-contract (0.7.3)
+    ruby_llm-contract (0.8.0)
       dry-types (~> 1.7)
-      ruby_llm (~> 1.0)
+      ruby_llm (~> 1.12)
       ruby_llm-schema (~> 0.3)
 GEM
@@ -258,7 +258,7 @@ CHECKSUMS
   rubocop-ast (1.49.1) sha256=4412f3ee70f6fe4546cc489548e0f6fcf76cafcfa80fa03af67098ffed755035
   ruby-progressbar (1.13.0) sha256=80fc9c47a9b640d6834e0dc7b3c94c9df37f08cb072b7761e4a71e22cff29b33
   ruby_llm (1.14.0) sha256=57c6f7034fc4a44504ea137d70f853b07824f1c1cdbe774ab3ab3522e7098deb
-  ruby_llm-contract (0.7.3)
+  ruby_llm-contract (0.8.0)
   ruby_llm-schema (0.3.0) sha256=a591edc5ca1b7f0304f0e2261de61ba4b3bea17be09f5cf7558153adfda3dec6
   ruby_parser (3.22.0) sha256=1eb4937cd9eb220aa2d194e352a24dba90aef00751e24c8dfffdb14000f15d23
   rubycritic (4.12.0) sha256=024fed90fe656fa939f6ea80aab17569699ac3863d0b52fd72cb99892247abc8

data/README.md CHANGED

@@ -1,8 +1,10 @@
 # ruby_llm-contract
-**Validate and retry LLM outputs for [ruby_llm](https://github.com/crmne/ruby_llm).** Describe the answer you expect (JSON schema + business rules). If the model returns something that doesn't match, retry — optionally falling back to a stronger model — until it passes or you hit the budget.
+**Contracts + Evals for [ruby_llm](https://github.com/crmne/ruby_llm).**
-`ruby_llm` handles the HTTP side (rate limits, timeouts, streaming, tool calls, embeddings). This gem handles what the model *returned*: schema validation, business rules, retry with model fallback, datasets, regression tests.
+Your eval passed. Prod broke anyway? This gem wraps `RubyLLM::Chat` with input/output contracts, business-rule validation, retry with model escalation on validation failure, pre-flight cost ceilings, and an evaluation framework — so a flaky cheap-model call escalates to a stronger model instead of shipping garbage to your user.
+`ruby_llm` handles the HTTP side (rate limits, timeouts, streaming, tool calls, embeddings). This gem handles what the model *returned*: schema validation, business rules, model escalation on failed validation, datasets, regression tests.
 ## Install
@@ -65,9 +67,26 @@ Everything below is optional — the example above is a complete step. Reach for
 - **[CI regression gates](docs/guide/getting_started.md)** — `define_eval` + `save_baseline!` + `pass_eval(...).without_regressions` blocks CI when accuracy drops on a model update or prompt tweak.
 - **[Find the cheapest viable fallback list](docs/guide/optimizing_retry_policy.md)** — `Step.recommend("regression", candidates: [...], min_score: 0.95)` returns the cheapest list of models that still passes your evals. `production_mode:` measures retry-aware cost.
 - **[A/B test prompts](docs/guide/eval_first.md)** — `SummarizeArticleV2.compare_with(SummarizeArticleV1, eval: "regression")` reports whether the new prompt is safe to ship.
-- **[Budget caps](docs/guide/getting_started.md)** — `max_cost`, `max_input`, `max_output` refuse the request before calling the API when an estimate exceeds the limit.
+- **[Budget caps](docs/guide/getting_started.md)** — `max_cost`, `max_input`, `max_output` refuse the request before calling the API when a heuristic estimate (~±30% accuracy) exceeds the limit.
+- **[Reasoning effort / thinking config](docs/guide/optimizing_retry_policy.md)** — `thinking effort: :low` (or alias `reasoning_effort :low`) on the Step class; mirrors `RubyLLM::Agent.thinking` and forwards through `Chat#with_thinking`.
+Also supports [multi-step pipelines](docs/guide/pipeline.md) with fail-fast and `retry_policy attempts: N` for niche cases (we measured this empirically — for `gpt-4.1-nano` / `gpt-5-nano` on tasks with clear correctness criteria, same-model retry rarely helps; `escalate(model_2)` is the strategy that moves the needle, see [optimizing_retry_policy.md](docs/guide/optimizing_retry_policy.md)).
+## Relation to `RubyLLM::Agent`
+`RubyLLM::Agent` (since RubyLLM 1.12) and `Step::Base` here target the **same niche**: reusable, class-based prompts. They are siblings, not foundation-and-roof.
+| What you write | Where it lives |
+|---|---|
+| `model`, `temperature`, `schema`, `instructions`, `tools`, `thinking` | covered by both — same idea, different DSL surface |
+| `validate :rule do |out| ... end` business invariants | only here |
+| `retry_policy escalate(...)` model escalation on validation failure | only here (different from RubyLLM's network-level retry) |
+| `max_cost` / `max_input` / `max_output` pre-flight refusal | only here |
+| `define_eval` + baseline regression + `compare_models` + `optimize_retry_policy` | only here (RubyLLM does not ship an eval framework) |
+| Pipeline composition with `step SomeStep, as: :alias` | only here (RubyLLM intentionally leaves workflows as plain Ruby) |
+| `around_call`, named `observe` hooks with pass/fail in trace | only here |
-Also supports [multi-step pipelines](docs/guide/pipeline.md) with fail-fast and [best-effort retries without fallback](docs/guide/best_practices.md) (`retry_policy attempts: 3` for sampling variance).
+`Step::Base` does NOT use `Agent` internally today — both call into `RubyLLM::Chat` directly. The two abstractions can coexist on the same project: use `Agent` for prompt-only reuse, use `Step` when you need any of the contract-layer features above. The retry-strategy framing here (favouring `escalate(...)` over same-model `attempts: N`) is grounded in an empirical comparison; `attempts: N` stays in the API for niche cases.
 ## Docs
@@ -89,7 +108,7 @@ Also supports [multi-step pipelines](docs/guide/pipeline.md) with fail-fast and
 ## Roadmap
-Latest: **v0.7.2** — terminal output labels and guides aligned with the fallback narrative; `output_schema.md` DSL bug fix. See [CHANGELOG](CHANGELOG.md) for history.
+Latest: **v0.8.0** — tagline + narrative repositioning around "Contracts + Evals for RubyLLM", `thinking` / `reasoning_effort` class macro, TokenEstimator labelled as heuristic, CostCalculator repositioned. See [CHANGELOG](CHANGELOG.md) for history.
 ## License

data/lib/ruby_llm/contract/adapters/ruby_llm.rb CHANGED

@@ -52,12 +52,33 @@ module RubyLLM
           CHAT_OPTION_METHODS.each do |key, method_name|
             chat.public_send(method_name, options[key]) if options[key]
           end
+          # Resolve thinking config from BOTH sources, with `:reasoning_effort`
+          # taking precedence over `:thinking[:effort]`. This is the per-attempt
+          # override path used by `retry_policy { escalate({model:, reasoning_effort:}) }`
+          # — the attempt-specific effort must win over the class-level default.
+          # Forwarded provider-agnostically via `chat.with_thinking(**)` —
+          # available since RubyLLM 1.12 (gemspec enforces this minimum).
+          thinking_config = resolve_thinking_config(options)
+          chat.with_thinking(**thinking_config) if thinking_config
+          # `with_params` carries only raw passthroughs (currently `max_tokens`).
+          # `reasoning_effort` is no longer forwarded here — it goes through
+          # `with_thinking` above, which is the canonical RubyLLM API.
           params = {}
           params[:max_tokens] = options[:max_tokens] if options[:max_tokens]
-          params[:reasoning_effort] = options[:reasoning_effort] if options[:reasoning_effort]
           chat.with_params(**params) if params.any?
         end
+        # Returns merged `{ effort:, budget: }` or nil. `options[:reasoning_effort]`
+        # overrides any inherited `options[:thinking][:effort]`; budget is
+        # taken from `options[:thinking][:budget]` only.
+        def resolve_thinking_config(options)
+          base = options[:thinking].is_a?(Hash) ? options[:thinking].dup : {}
+          base[:effort] = options[:reasoning_effort] if options[:reasoning_effort]
+          base.empty? ? nil : base
+        end
         def build_response(response)
           content = response.content
           content = content.to_s unless content.is_a?(Hash) || content.is_a?(Array)
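
The merge precedence in `resolve_thinking_config` can be exercised in isolation. A minimal sketch: the method body is copied from the hunk above, while the `ThinkingMergeSketch` wrapper module and the sample option hashes are assumptions made for illustration and are not part of the gem.

```ruby
# Standalone copy of the merge logic from the hunk above.
# `ThinkingMergeSketch` is a hypothetical wrapper, not a gem constant.
module ThinkingMergeSketch
  def self.resolve_thinking_config(options)
    # Start from the class-level hash (duplicated so the caller's hash is untouched).
    base = options[:thinking].is_a?(Hash) ? options[:thinking].dup : {}
    # A per-attempt / context effort overrides the class-level effort.
    base[:effort] = options[:reasoning_effort] if options[:reasoning_effort]
    base.empty? ? nil : base
  end
end

class_level = { effort: :low, budget: 2_000 }

# Per-attempt effort wins; the class-level budget survives.
escalated = ThinkingMergeSketch.resolve_thinking_config(
  thinking: class_level, reasoning_effort: :high
)

# Class-level config alone passes through unchanged.
inherited = ThinkingMergeSketch.resolve_thinking_config(thinking: class_level)

# Neither source set: nil, so the adapter would skip `with_thinking`.
unset = ThinkingMergeSketch.resolve_thinking_config({})

p escalated # {effort: :high, budget: 2000}
p inherited # {effort: :low, budget: 2000}
p unset     # nil
```

Because the class-level hash is duplicated before the override is applied, an escalated attempt does not mutate the default seen by later attempts.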

data/lib/ruby_llm/contract/cost_calculator.rb CHANGED

@@ -2,6 +2,29 @@
 module RubyLLM
   module Contract
+    # Pricing lookup for `max_cost` budget gating + retry usage aggregation.
+    #
+    # **What this module does (public surface):**
+    #
+    # 1. **Fine-tune / custom-model pricing registry** — `register_model`
+    #    fills the gap left by RubyLLM 1.14's models.json: there is no
+    #    upstream `RubyLLM::Models.register` API, so fine-tuned models
+    #    (e.g. `ft:gpt-4o-custom`) need their pricing supplied locally.
+    # 2. **Lookup with fallback chain** — `calculate(model_name:, usage:)`
+    #    checks the custom registry first, falls back to
+    #    `RubyLLM.models.find(model_name)`, returns `nil` on miss.
+    #
+    # **What this module is NOT:**
+    #
+    # - Not a "cost calculator" feature — the math itself
+    #   (`tokens × price_per_million / 1_000_000`) is trivial and lives
+    #   in `private_class_method :compute_cost` for internal use only.
+    # - Not a substitute for RubyLLM's pricing data — for any model in
+    #   `RubyLLM.models`, this module simply queries it.
+    #
+    # The reason this module exists at all is the registry + retry usage
+    # aggregation across attempts (the latter sits in `Step::RetryExecutor`,
+    # which calls `calculate` per attempt and sums; not in this module).
     module CostCalculator
       # Simple struct for custom-registered model pricing
       RegisteredModel = Struct.new(:input_price_per_million, :output_price_per_million, keyword_init: true)
@@ -9,6 +32,8 @@ module RubyLLM
       @custom_models = {}
       # Register pricing for custom or fine-tuned models not in the RubyLLM registry.
+      # This is the gem's primary value-add for cost computation; everything
+      # else falls back to RubyLLM's own model registry.
       #
       #   CostCalculator.register_model("ft:gpt-4o-custom",
       #     input_per_1m: 3.0, output_per_1m: 6.0)
@@ -33,6 +58,20 @@ module RubyLLM
         @custom_models.clear
       end
+      # Look up cost for a single model + usage hash.
+      # Returns nil if model is unknown (custom registry miss + RubyLLM miss),
+      # so callers can decide whether to refuse the call or proceed (see
+      # `on_unknown_pricing:` step option for the budget-gating policy).
+      #
+      #   CostCalculator.calculate(
+      #     model_name: "gpt-4o-mini",
+      #     usage: { input_tokens: 1_500, output_tokens: 800 }
+      #   )
+      #   # => 0.00069 (or nil if model not registered)
+      #
+      # Math is intentionally simple and private — this method is the
+      # primary public entry point. Aggregating across retry attempts is
+      # done in `Step::RetryExecutor`, not here.
       def self.calculate(model_name:, usage:)
         return nil unless model_name && usage.is_a?(Hash)
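
The docstring describes the math as `tokens * price_per_million / 1_000_000` behind a registry lookup that returns `nil` on a miss. A rough sketch of that shape follows. Everything named here (`SketchPricing`, `CUSTOM_PRICES`, `sketch_cost`) is hypothetical, the prices are the sample values from the `register_model` comment, and the fallback to RubyLLM's own registry is deliberately left out.

```ruby
# Hypothetical stand-ins: `SketchPricing`, `CUSTOM_PRICES`, and `sketch_cost`
# are illustration names, not constants or methods from the gem.
SketchPricing = Struct.new(:input_price_per_million, :output_price_per_million, keyword_init: true)

CUSTOM_PRICES = {
  "ft:gpt-4o-custom" => SketchPricing.new(input_price_per_million: 3.0, output_price_per_million: 6.0)
}.freeze

def sketch_cost(model_name:, usage:)
  return nil unless model_name && usage.is_a?(Hash)

  pricing = CUSTOM_PRICES[model_name]
  # Assumption: the real lookup would consult RubyLLM's registry before giving up.
  return nil unless pricing

  input_cost  = usage.fetch(:input_tokens, 0) * pricing.input_price_per_million / 1_000_000
  output_cost = usage.fetch(:output_tokens, 0) * pricing.output_price_per_million / 1_000_000
  input_cost + output_cost
end

usage = { input_tokens: 1_500, output_tokens: 800 }

known   = sketch_cost(model_name: "ft:gpt-4o-custom", usage: usage)
unknown = sketch_cost(model_name: "not-registered", usage: usage)

puts known.round(6) # 0.0093 (0.0045 input + 0.0048 output)
p unknown           # nil
```

Returning `nil` instead of `0.0` for an unknown model keeps "free" and "unpriced" distinguishable, which is what lets a caller choose between refusing and proceeding.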

data/lib/ruby_llm/contract/step/base.rb CHANGED

@@ -159,10 +159,27 @@ module RubyLLM
           def runtime_settings(context)
             policy = context.key?(:retry_policy_override) ? context[:retry_policy_override] : retry_policy
+            extra = context.slice(:provider, :assume_model_exists, :max_tokens, :reasoning_effort)
+            # Always pass the class-level `thinking` config to the adapter when
+            # set, so fields like `budget` survive a per-call `reasoning_effort`
+            # override. The adapter's `resolve_thinking_config` merges
+            # `reasoning_effort` over `thinking[:effort]` while keeping the
+            # rest of the hash intact.
+            #
+            # `reasoning_effort` is also seeded into extra_options for
+            # backward compat with eval_host / production_mode paths that
+            # read it from there — but only when the caller did not already
+            # provide one in context.
+            if respond_to?(:thinking) && thinking
+              extra[:thinking] = thinking
+              extra[:reasoning_effort] = thinking[:effort] if !extra.key?(:reasoning_effort) && thinking[:effort]
+            end
             {
               model: context[:model] || model || RubyLLM::Contract.configuration.default_model,
               temperature: context[:temperature],
-              extra_options: context.slice(:provider, :assume_model_exists, :max_tokens, :reasoning_effort),
+              extra_options: extra,
               policy: policy
             }
           end
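
The seeding rule in `runtime_settings` (class-level effort fills in only when the caller supplied none) can be checked in isolation. This is a sketch under assumptions: `sketch_extra_options` is a hypothetical helper that takes the context and the class-level `thinking` hash as plain arguments, and it is not the gem's method.

```ruby
# `sketch_extra_options` is a hypothetical helper mirroring the hunk above.
def sketch_extra_options(context, thinking)
  extra = context.slice(:provider, :assume_model_exists, :max_tokens, :reasoning_effort)
  if thinking
    # The full hash always travels along, so `budget` reaches the adapter.
    extra[:thinking] = thinking
    # Seed the effort only when the caller did not pass one in context.
    extra[:reasoning_effort] = thinking[:effort] if !extra.key?(:reasoning_effort) && thinking[:effort]
  end
  extra
end

class_thinking = { effort: :low, budget: 2_000 }

seeded      = sketch_extra_options({ max_tokens: 500 }, class_thinking)
overridden  = sketch_extra_options({ reasoning_effort: :high }, class_thinking)
budget_only = sketch_extra_options({}, { budget: 2_000 })

p seeded[:reasoning_effort]              # :low
p overridden[:reasoning_effort]          # :high
p budget_only.key?(:reasoning_effort)    # false
```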

data/lib/ruby_llm/contract/step/dsl.rb CHANGED

@@ -200,6 +200,44 @@ module RubyLLM
           superclass.temperature if superclass.respond_to?(:temperature)
         end
+        def thinking(effort: nil, budget: nil)
+          if effort == :default
+            @thinking = nil
+            @thinking_explicitly_unset = true
+            return nil
+          end
+          if effort || budget
+            @thinking_explicitly_unset = false
+            return @thinking = { effort: effort, budget: budget }.compact
+          end
+          return @thinking if defined?(@thinking) && !@thinking_explicitly_unset
+          return nil if @thinking_explicitly_unset
+          superclass.thinking if superclass.respond_to?(:thinking)
+        end
+        def reasoning_effort(value = nil)
+          return (thinking && thinking[:effort]) if value.nil?
+          # Alias is scoped to the effort dimension only. `:default` on the
+          # alias clears effort but PRESERVES any previously-set budget — the
+          # name does not suggest "wipe the whole thinking config." Use the
+          # full `thinking(effort: :default)` to clear everything.
+          if value == :default
+            current_budget = thinking && thinking[:budget]
+            if current_budget
+              @thinking_explicitly_unset = false
+              @thinking = { budget: current_budget }
+              return nil
+            end
+            return thinking(effort: :default)
+          end
+          thinking(effort: value)
+        end
         def around_call(&block)
           if block
             return @around_call = block
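
The inheritance and `:default` reset semantics of the two macros are easier to see when run. In the sketch below the method bodies are copied from the hunk above (comments trimmed); the host module `ThinkingDslSketch` and the five sample classes are assumptions for illustration and do not exist in the gem.

```ruby
# Hypothetical host module; both method bodies are copied from the hunk above.
module ThinkingDslSketch
  def thinking(effort: nil, budget: nil)
    if effort == :default
      @thinking = nil
      @thinking_explicitly_unset = true
      return nil
    end

    if effort || budget
      @thinking_explicitly_unset = false
      return @thinking = { effort: effort, budget: budget }.compact
    end

    return @thinking if defined?(@thinking) && !@thinking_explicitly_unset
    return nil if @thinking_explicitly_unset

    superclass.thinking if superclass.respond_to?(:thinking)
  end

  def reasoning_effort(value = nil)
    return (thinking && thinking[:effort]) if value.nil?

    if value == :default
      current_budget = thinking && thinking[:budget]
      if current_budget
        @thinking_explicitly_unset = false
        @thinking = { budget: current_budget }
        return nil
      end
      return thinking(effort: :default)
    end

    thinking(effort: value)
  end
end

class ParentStep
  extend ThinkingDslSketch
  thinking effort: :low, budget: 1_000
end

class InheritingStep < ParentStep; end # reads the parent's hash

class ClearedStep < ParentStep
  thinking effort: :default            # wipes effort and budget
end

class EffortClearedStep < ParentStep
  reasoning_effort :default            # drops effort, keeps the inherited budget
end

class OverridingStep < ParentStep
  reasoning_effort :high               # stores a fresh hash; budget is not merged in
end

p InheritingStep.thinking    # {effort: :low, budget: 1000}
p ClearedStep.thinking       # nil
p EffortClearedStep.thinking # {budget: 1000}
p OverridingStep.thinking    # {effort: :high}
```

One consequence visible in the last class: setting only the effort on a subclass stores a new hash rather than merging with the parent's, so an inherited `budget` has to be restated if it should carry over.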

data/lib/ruby_llm/contract/step/limit_checker.rb CHANGED

@@ -22,7 +22,7 @@ module RubyLLM
         def collect_limit_errors(estimated)
           errors = []
           if max_input && estimated > max_input
-            errors << "Input token limit exceeded: estimated #{estimated} tokens, max #{max_input}"
+            errors << "Input token limit exceeded: estimated #{estimated} tokens (heuristic ±30%), max #{max_input}"
           end
           append_cost_error(estimated, errors) if max_cost
           errors
@@ -46,7 +46,7 @@ module RubyLLM
             handle_unknown_pricing(errors)
           elsif estimated_cost > max_cost
             errors << "Cost limit exceeded: estimated $#{format("%.6f", estimated_cost)} " \
-                      "(#{estimated} input + #{estimated_output} output tokens), " \
+                      "(#{estimated} input + #{estimated_output} output tokens, heuristic ±30%), " \
                       "max $#{format("%.6f", max_cost)}"
           end
         end

data/lib/ruby_llm/contract/token_estimator.rb CHANGED

@@ -2,12 +2,29 @@
 module RubyLLM
   module Contract
+    # Pre-flight token estimation for `max_input` / `max_cost` budget gating.
+    #
+    # IMPORTANT — heuristic only. This is NOT an accurate tokenizer.
+    # The estimate uses a fixed `length / CHARS_PER_TOKEN` ratio:
+    #
+    #   - Accurate to ±30% for English prose with mainstream OpenAI / Anthropic models
+    #   - Worse for non-English text, code, structured data, and unusual scripts
+    #   - Useless for models with very different tokenizers (e.g. some open-source models)
+    #
+    # RubyLLM 1.14 ships no pre-flight tokenizer either; once the API call
+    # returns, `RubyLLM::Tokens` provides accurate counts from provider usage
+    # data. This estimator is for the *pre-flight refusal* path only — its job
+    # is to answer "is this call almost certainly within budget?" with enough
+    # accuracy that runaway prompts get caught, while accepting that the
+    # boundary cases will be wrong.
+    #
+    # Refusal messages from `LimitChecker` carry an "(heuristic)" suffix so
+    # adopters know the number is estimated, not measured.
     module TokenEstimator
-      # Heuristic: ~4 characters per token for English text.
-      # This is a rough estimate — actual tokenization varies by model and content.
-      # Intentionally conservative (overestimates slightly) to avoid surprise costs.
       CHARS_PER_TOKEN = 4
+      # Heuristic estimate. Returns an integer token count.
+      # See module docstring for accuracy caveats.
       def self.estimate(messages)
         return 0 unless messages.is_a?(Array)

data/lib/ruby_llm/contract/version.rb CHANGED

@@ -2,6 +2,6 @@
 module RubyLLM
   module Contract
-    VERSION = "0.7.3"
+    VERSION = "0.8.0"
   end
 end

data/ruby_llm-contract.gemspec CHANGED

@@ -7,10 +7,11 @@ Gem::Specification.new do |spec|
   spec.version = RubyLLM::Contract::VERSION
   spec.authors = ["Justyna"]
-  spec.summary = "Know which LLM model to use, what it costs, and when accuracy drops"
-  spec.description = "Compare LLM models by accuracy and cost. Regression-test prompts in CI. " \
-                     "Start on nano, auto-escalate to bigger models when quality drops. " \
-                     "Companion gem for ruby_llm."
+  spec.summary = "Contracts + Evals for ruby_llm"
+  spec.description = "Wraps RubyLLM::Chat with input/output contracts, business-rule validation, " \
+                     "retry with model escalation on validation failure, pre-flight cost ceilings, " \
+                     "and an evaluation framework. Sibling abstraction to RubyLLM::Agent — same " \
+                     "niche (reusable class-based prompts), wider contract."
   spec.homepage = "https://github.com/justi/ruby_llm-contract"
   spec.license = "MIT"
   spec.required_ruby_version = ">= 3.2.0"
@@ -30,6 +31,6 @@ Gem::Specification.new do |spec|
   spec.require_paths = ["lib"]
   spec.add_dependency "dry-types", "~> 1.7"
-  spec.add_dependency "ruby_llm", "~> 1.0"
+  spec.add_dependency "ruby_llm", "~> 1.12"
   spec.add_dependency "ruby_llm-schema", "~> 0.3"
 end
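
RubyGems reads the pessimistic constraint `~> 1.12` as `>= 1.12` and `< 2.0`, so the bump drops 1.0 through 1.11 while still admitting later 1.x releases. This can be checked with RubyGems' own `Gem::Requirement`; the version strings below are sample inputs chosen for illustration.

```ruby
require "rubygems"

old_requirement = Gem::Requirement.new("~> 1.0")
new_requirement = Gem::Requirement.new("~> 1.12")

# Sample versions on either side of the new lower bound and the 2.0 ceiling.
results = %w[1.11.0 1.12.0 1.14.0 2.0.0].to_h do |version|
  parsed = Gem::Version.new(version)
  [version, { old: old_requirement.satisfied_by?(parsed), new: new_requirement.satisfied_by?(parsed) }]
end

results.each { |version, allowed| puts "#{version}: #{allowed}" }
```

Only `1.11.0` changes status between the two constraints: it satisfied `~> 1.0` and no longer satisfies `~> 1.12`.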

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: ruby_llm-contract
 version: !ruby/object:Gem::Version
-  version: 0.7.3
+  version: 0.8.0
 platform: ruby
 authors:
 - Justyna
@@ -29,14 +29,14 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '1.0'
+        version: '1.12'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '1.0'
+        version: '1.12'
 - !ruby/object:Gem::Dependency
   name: ruby_llm-schema
   requirement: !ruby/object:Gem::Requirement
@@ -51,9 +51,10 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '0.3'
-description: Compare LLM models by accuracy and cost. Regression-test prompts in CI.
-  Start on nano, auto-escalate to bigger models when quality drops. Companion gem
-  for ruby_llm.
+description: Wraps RubyLLM::Chat with input/output contracts, business-rule validation,
+  retry with model escalation on validation failure, pre-flight cost ceilings, and
+  an evaluation framework. Sibling abstraction to RubyLLM::Agent — same niche (reusable
+  class-based prompts), wider contract.
 executables: []
 extensions: []
 extra_rdoc_files: []
@@ -205,5 +206,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 rubygems_version: 3.6.7
 specification_version: 4
-summary: Know which LLM model to use, what it costs, and when accuracy drops
+summary: Contracts + Evals for ruby_llm
 test_files: []