ruby_llm-contract 0.7.3 → 0.8.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: c9b5adda942a6e1f3393d2d05e8226d210b25088c65c2c20160b2ffc6a493533
-   data.tar.gz: 91110ddd5d2fc39a8e3f6cd9ec912c707ceef7da2ee40e8805598a787ae1c067
+   metadata.gz: 33f6a8a686f7f20791904c4fbfacd19f6ea5b8bad428c374ec14b7e33521354d
+   data.tar.gz: a2b2f7d9ff1e6cd69b39d55a3809b1babbd82a5d42afe3c567733076f03fa317
  SHA512:
-   metadata.gz: 543779ceacb617909e7e45355473fb300d9a5f81206ee51019d4f868b3ea66d43272b04be78aa95d0a014ea16619ab57c933eb58f1d17d8fa3845a957d6c750a
-   data.tar.gz: eea6b49a1cb01de6b501bddec6fcfcccbc221a5448a72a435f4c04e80131a7994757b3c337390e1b6212a2709ae082a5d55c82158535a5789e606b7ef13d27f9
+   metadata.gz: '0895986db9cde7d26d2e91ffc7c6469b7d34df299a361148c9ee339dbf1dc61539e44adf250c00c06383717ed2e47ff250d4490d1cce43c3cdf8c3169529fba5'
+   data.tar.gz: ed35a4b4cc9ab1afd46c427468dcb33844d8d54207531f98a9f1d775004efc5a1e19d64fb22c64b20da81750165a3a979f4de4ecaa46aff4121eb7ba80a27ed2
data/CHANGELOG.md CHANGED
@@ -1,5 +1,41 @@
  # Changelog
 
+ ## 0.8.0 (2026-04-26)
+
+ Narrative repositioning + small API additions. Internal architecture unchanged: no `Step::Base` refactor, no breaking changes to existing DSL.
+
+ ### Added
+
+ - **`thinking(effort:, budget:)` class macro on `Step::Base`** — mirrors `RubyLLM::Agent.thinking` signature exactly. Stored as `{ effort:, budget: }` hash; reader returns the hash; supports `:default` reset semantics; superclass inheritance like `model`/`temperature`. The convenience alias `reasoning_effort(:low)` is implemented as `thinking(effort: :low)` — single normalized state, not separate ivar.
+ - **Adapter wiring for `with_thinking`** — when `thinking` is set on the Step class, OR when `reasoning_effort:` is passed through context, OR when an attempt config in `retry_policy escalate(...)` carries `reasoning_effort:`, the RubyLLM adapter resolves the effective `{ effort:, budget: }` hash and forwards it via `chat.with_thinking(**)` — provider-agnostic (supports OpenAI `reasoning_effort` AND Anthropic extended-thinking budget). Precedence: per-attempt / context `reasoning_effort` overrides class-level `thinking[:effort]`; budget is taken from class-level `thinking[:budget]`. **Behavioural change vs 0.7.x**: `reasoning_effort` is now forwarded via `with_thinking` instead of `with_params`. Same wire-level OpenAI parameter; provider-agnostic Anthropic support is now automatic.
+
+ ### Dependencies
+
+ - **`ruby_llm` constraint bumped from `~> 1.0` to `~> 1.12`** — `Chat#with_thinking` is the canonical path for reasoning effort + extended thinking; it shipped in RubyLLM 1.12. Adopters on `ruby_llm < 1.12` need to bump RubyLLM before upgrading this gem to 0.8.0.
+
+ ### Changed
+
+ - **Tagline + README opening** — repositioned around "Contracts + Evals for RubyLLM". New "Relation to RubyLLM::Agent" section explicitly frames Step as a sibling abstraction (same niche as Agent, wider contract), not an alternative or foundation. README does not claim "Step uses Agent under the hood" — current call path is `Step → Runner → Adapters::RubyLLM → RubyLLM.chat` directly.
+ - **`TokenEstimator` documented as heuristic** — module docstring expanded with explicit "±30% accuracy" framing. Refusal messages from `LimitChecker` now include `(heuristic ±30%)` suffix so adopters know the pre-flight number is estimated, not measured. RubyLLM 1.14 also has no pre-flight tokenizer; `RubyLLM::Tokens` is post-hoc only.
+ - **`CostCalculator` repositioned in docs** — module narrative reframed from "cost calculator" to "fine-tune pricing registry + lookup with fallback chain". Math methods (`compute_cost`, `token_cost`, etc.) were already private; this release makes the docs match. Public API surface unchanged: `register_model`, `unregister_model`, `reset_custom_models!`, `calculate`.
+ - **`output_schema` reframed in docs** — described as "wrapper around `RubyLLM::Schema` + client-side validation step", not a standalone feature. The schema language is identical to what `RubyLLM::Agent.schema` accepts; the difference is what wraps it.
+ - **README retry framing** — `retry_policy escalate(...)` (model escalation on validation failure) is the marketed default. `retry_policy attempts: N` (same-model retry) stays in the API for backward compat and niche cases (subjective criteria, multi-step pipelines, weaker models) but is no longer marketed as a recommended default. Empirical basis: four small experiments across PDF quiz generation, GSM8K math (n=30 + n=120), and multi-constraint schedule generation found no useful lift for nano-class models on tasks with clear correctness criteria.
+
+ ### Documentation
+
+ - **New disambiguation paragraphs** in `prompt_ast.md` (`Step.input_type` vs `RubyLLM::Agent.inputs`; `Prompt::Builder` multi-role DSL vs Agent ERB single-string template loader), `testing.md` (`Step.observe` vs `Chat#on_end_message` / `on_tool_call`), `output_schema.md` (relation to `Agent.schema`), and `optimizing_retry_policy.md` (orthogonal model + thinking dimensions).
+ - **`getting_started.md` refusal message example** updated to include the new `(heuristic ±30%)` suffix.
+
+ ### Issues closed
+
+ - **#11** (Optimizer is blind to same-model attempts) — closed after empirical experiments. `attempts: N` retry stays in API; not marketed as a default.
+ - **#6** (Production cost reporting) — already implemented in 0.7.x; close confirmed.
+
+ ### Not in this release (deferred)
+
+ - `output_schema` Proc form for runtime-input-aware schemas (parity with `Agent.schema` Proc form). Additive, low-risk; deferred to 0.9 to keep 0.8 scope tight.
+ - H4 (Step composing `RubyLLM::Agent` internally as config holder) — verified feasible but ROI insufficient for current adopter base; trigger-based revisit, no calendar commitment.
+
  ## 0.7.3 (2026-04-24)
 
  Adoption-friction release. No runtime behavior changes — every delta is in `docs/`, `examples/`, or `spec/integration/` (plus the `version.rb` / Gemfile.lock bumps). Upgrading from 0.7.2 picks up the expanded guide set, the new runnable showcases, and one extra integration spec.
data/Gemfile.lock CHANGED
@@ -1,9 +1,9 @@
  PATH
    remote: .
    specs:
-     ruby_llm-contract (0.7.3)
+     ruby_llm-contract (0.8.0)
        dry-types (~> 1.7)
-       ruby_llm (~> 1.0)
+       ruby_llm (~> 1.12)
        ruby_llm-schema (~> 0.3)
 
  GEM
@@ -258,7 +258,7 @@ CHECKSUMS
    rubocop-ast (1.49.1) sha256=4412f3ee70f6fe4546cc489548e0f6fcf76cafcfa80fa03af67098ffed755035
    ruby-progressbar (1.13.0) sha256=80fc9c47a9b640d6834e0dc7b3c94c9df37f08cb072b7761e4a71e22cff29b33
    ruby_llm (1.14.0) sha256=57c6f7034fc4a44504ea137d70f853b07824f1c1cdbe774ab3ab3522e7098deb
-   ruby_llm-contract (0.7.3)
+   ruby_llm-contract (0.8.0)
    ruby_llm-schema (0.3.0) sha256=a591edc5ca1b7f0304f0e2261de61ba4b3bea17be09f5cf7558153adfda3dec6
    ruby_parser (3.22.0) sha256=1eb4937cd9eb220aa2d194e352a24dba90aef00751e24c8dfffdb14000f15d23
    rubycritic (4.12.0) sha256=024fed90fe656fa939f6ea80aab17569699ac3863d0b52fd72cb99892247abc8
data/README.md CHANGED
@@ -1,8 +1,10 @@
  # ruby_llm-contract
 
- **Validate and retry LLM outputs for [ruby_llm](https://github.com/crmne/ruby_llm).** Describe the answer you expect (JSON schema + business rules). If the model returns something that doesn't match, retry — optionally falling back to a stronger model — until it passes or you hit the budget.
+ **Contracts + Evals for [ruby_llm](https://github.com/crmne/ruby_llm).**
 
- `ruby_llm` handles the HTTP side (rate limits, timeouts, streaming, tool calls, embeddings). This gem handles what the model *returned*: schema validation, business rules, retry with model fallback, datasets, regression tests.
+ Your eval passed. Prod broke anyway? This gem wraps `RubyLLM::Chat` with input/output contracts, business-rule validation, retry with model escalation on validation failure, pre-flight cost ceilings, and an evaluation framework — so a flaky cheap-model call escalates to a stronger model instead of shipping garbage to your user.
+
+ `ruby_llm` handles the HTTP side (rate limits, timeouts, streaming, tool calls, embeddings). This gem handles what the model *returned*: schema validation, business rules, model escalation on failed validation, datasets, regression tests.
 
  ## Install
 
@@ -65,9 +67,26 @@ Everything below is optional — the example above is a complete step. Reach for
  - **[CI regression gates](docs/guide/getting_started.md)** — `define_eval` + `save_baseline!` + `pass_eval(...).without_regressions` blocks CI when accuracy drops on a model update or prompt tweak.
  - **[Find the cheapest viable fallback list](docs/guide/optimizing_retry_policy.md)** — `Step.recommend("regression", candidates: [...], min_score: 0.95)` returns the cheapest list of models that still passes your evals. `production_mode:` measures retry-aware cost.
  - **[A/B test prompts](docs/guide/eval_first.md)** — `SummarizeArticleV2.compare_with(SummarizeArticleV1, eval: "regression")` reports whether the new prompt is safe to ship.
- - **[Budget caps](docs/guide/getting_started.md)** — `max_cost`, `max_input`, `max_output` refuse the request before calling the API when an estimate exceeds the limit.
+ - **[Budget caps](docs/guide/getting_started.md)** — `max_cost`, `max_input`, `max_output` refuse the request before calling the API when a heuristic estimate (~±30% accuracy) exceeds the limit.
+ - **[Reasoning effort / thinking config](docs/guide/optimizing_retry_policy.md)** — `thinking effort: :low` (or alias `reasoning_effort :low`) on the Step class; mirrors `RubyLLM::Agent.thinking` and forwards through `Chat#with_thinking`.
+
+ Also supports [multi-step pipelines](docs/guide/pipeline.md) with fail-fast and `retry_policy attempts: N` for niche cases (we measured this empirically — for `gpt-4.1-nano` / `gpt-5-nano` on tasks with clear correctness criteria, same-model retry rarely helps; `escalate(model_2)` is the strategy that moves the needle, see [optimizing_retry_policy.md](docs/guide/optimizing_retry_policy.md)).
+
+ ## Relation to `RubyLLM::Agent`
+
+ `RubyLLM::Agent` (since RubyLLM 1.12) and `Step::Base` here target the **same niche**: reusable, class-based prompts. They are siblings, not foundation-and-roof.
+
+ | What you write | Where it lives |
+ |---|---|
+ | `model`, `temperature`, `schema`, `instructions`, `tools`, `thinking` | covered by both — same idea, different DSL surface |
+ | `validate :rule do |out| ... end` business invariants | only here |
+ | `retry_policy escalate(...)` model escalation on validation failure | only here (different from RubyLLM's network-level retry) |
+ | `max_cost` / `max_input` / `max_output` pre-flight refusal | only here |
+ | `define_eval` + baseline regression + `compare_models` + `optimize_retry_policy` | only here (RubyLLM does not ship an eval framework) |
+ | Pipeline composition with `step SomeStep, as: :alias` | only here (RubyLLM intentionally leaves workflows as plain Ruby) |
+ | `around_call`, named `observe` hooks with pass/fail in trace | only here |
 
- Also supports [multi-step pipelines](docs/guide/pipeline.md) with fail-fast and [best-effort retries without fallback](docs/guide/best_practices.md) (`retry_policy attempts: 3` for sampling variance).
+ `Step::Base` does NOT use `Agent` internally today — both call into `RubyLLM::Chat` directly. The two abstractions can coexist on the same project: use `Agent` for prompt-only reuse, use `Step` when you need any of the contract-layer features above. The retry-strategy framing here (favouring `escalate(...)` over same-model `attempts: N`) is grounded in an empirical comparison; `attempts: N` stays in the API for niche cases.
 
  ## Docs
 
@@ -89,7 +108,7 @@ Also supports [multi-step pipelines](docs/guide/pipeline.md) with fail-fast and
 
  ## Roadmap
 
- Latest: **v0.7.2** — terminal output labels and guides aligned with the fallback narrative; `output_schema.md` DSL bug fix. See [CHANGELOG](CHANGELOG.md) for history.
+ Latest: **v0.8.0** — tagline + narrative repositioning around "Contracts + Evals for RubyLLM", `thinking` / `reasoning_effort` class macro, TokenEstimator labelled as heuristic, CostCalculator repositioned. See [CHANGELOG](CHANGELOG.md) for history.
 
  ## License
 
@@ -52,12 +52,33 @@ module RubyLLM
            CHAT_OPTION_METHODS.each do |key, method_name|
              chat.public_send(method_name, options[key]) if options[key]
            end
+
+           # Resolve thinking config from BOTH sources, with `:reasoning_effort`
+           # taking precedence over `:thinking[:effort]`. This is the per-attempt
+           # override path used by `retry_policy { escalate({model:, reasoning_effort:}) }`
+           # — the attempt-specific effort must win over the class-level default.
+           # Forwarded provider-agnostically via `chat.with_thinking(**)` —
+           # available since RubyLLM 1.12 (gemspec enforces this minimum).
+           thinking_config = resolve_thinking_config(options)
+           chat.with_thinking(**thinking_config) if thinking_config
+
+           # `with_params` carries only raw passthroughs (currently `max_tokens`).
+           # `reasoning_effort` is no longer forwarded here — it goes through
+           # `with_thinking` above, which is the canonical RubyLLM API.
            params = {}
            params[:max_tokens] = options[:max_tokens] if options[:max_tokens]
-           params[:reasoning_effort] = options[:reasoning_effort] if options[:reasoning_effort]
            chat.with_params(**params) if params.any?
          end
 
+         # Returns merged `{ effort:, budget: }` or nil. `options[:reasoning_effort]`
+         # overrides any inherited `options[:thinking][:effort]`; budget is
+         # taken from `options[:thinking][:budget]` only.
+         def resolve_thinking_config(options)
+           base = options[:thinking].is_a?(Hash) ? options[:thinking].dup : {}
+           base[:effort] = options[:reasoning_effort] if options[:reasoning_effort]
+           base.empty? ? nil : base
+         end
+
          def build_response(response)
            content = response.content
            content = content.to_s unless content.is_a?(Hash) || content.is_a?(Array)
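The precedence rule in `resolve_thinking_config` can be checked in isolation. The snippet below is a minimal sketch that copies the method body from the hunk above into a plain top-level method; the option values (`:low`, `:high`, `2_000`) are made up for illustration, and nothing here calls RubyLLM.

```ruby
# Standalone copy of the merge rule, for illustration only. In the gem this is
# an adapter method; here it is a plain top-level method.
def resolve_thinking_config(options)
  base = options[:thinking].is_a?(Hash) ? options[:thinking].dup : {}
  base[:effort] = options[:reasoning_effort] if options[:reasoning_effort]
  base.empty? ? nil : base
end

# Hypothetical class-level config.
class_level = { effort: :low, budget: 2_000 }

inherited  = resolve_thinking_config({ thinking: class_level })
# => { effort: :low, budget: 2000 }
overridden = resolve_thinking_config({ thinking: class_level, reasoning_effort: :high })
# => { effort: :high, budget: 2000 }  (effort replaced, budget kept)
effort_only = resolve_thinking_config({ reasoning_effort: :medium })
# => { effort: :medium }
nothing_set = resolve_thinking_config({})
# => nil, so `with_thinking` would not be called
```

Because the class-level hash is duplicated before the merge, a per-attempt override does not leak back into the class-level default.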
@@ -2,6 +2,29 @@
 
  module RubyLLM
    module Contract
+     # Pricing lookup for `max_cost` budget gating + retry usage aggregation.
+     #
+     # **What this module does (public surface):**
+     #
+     # 1. **Fine-tune / custom-model pricing registry** — `register_model`
+     #    fills the gap left by RubyLLM 1.14's models.json: there is no
+     #    upstream `RubyLLM::Models.register` API, so fine-tuned models
+     #    (e.g. `ft:gpt-4o-custom`) need their pricing supplied locally.
+     # 2. **Lookup with fallback chain** — `calculate(model_name:, usage:)`
+     #    checks the custom registry first, falls back to
+     #    `RubyLLM.models.find(model_name)`, returns `nil` on miss.
+     #
+     # **What this module is NOT:**
+     #
+     # - Not a "cost calculator" feature — the math itself
+     #   (`tokens × price_per_million / 1_000_000`) is trivial and lives
+     #   in `private_class_method :compute_cost` for internal use only.
+     # - Not a substitute for RubyLLM's pricing data — for any model in
+     #   `RubyLLM.models`, this module simply queries it.
+     #
+     # The reason this module exists at all is the registry + retry usage
+     # aggregation across attempts (the latter sits in `Step::RetryExecutor`,
+     # which calls `calculate` per attempt and sums; not in this module).
      module CostCalculator
        # Simple struct for custom-registered model pricing
        RegisteredModel = Struct.new(:input_price_per_million, :output_price_per_million, keyword_init: true)
@@ -9,6 +32,8 @@ module RubyLLM
        @custom_models = {}
 
        # Register pricing for custom or fine-tuned models not in the RubyLLM registry.
+       # This is the gem's primary value-add for cost computation; everything
+       # else falls back to RubyLLM's own model registry.
        #
        #   CostCalculator.register_model("ft:gpt-4o-custom",
        #     input_per_1m: 3.0, output_per_1m: 6.0)
@@ -33,6 +58,20 @@ module RubyLLM
          @custom_models.clear
        end
 
+       # Look up cost for a single model + usage hash.
+       # Returns nil if model is unknown (custom registry miss + RubyLLM miss),
+       # so callers can decide whether to refuse the call or proceed (see
+       # `on_unknown_pricing:` step option for the budget-gating policy).
+       #
+       #   CostCalculator.calculate(
+       #     model_name: "gpt-4o-mini",
+       #     usage: { input_tokens: 1_500, output_tokens: 800 }
+       #   )
+       #   # => 0.00069 (or nil if model not registered)
+       #
+       # Math is intentionally simple and private — this method is the
+       # primary public entry point. Aggregating across retry attempts is
+       # done in `Step::RetryExecutor`, not here.
        def self.calculate(model_name:, usage:)
          return nil unless model_name && usage.is_a?(Hash)
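The "custom registry first, upstream registry second, `nil` on miss" chain described in the docstring can be sketched without the gem. Everything below is a hypothetical stand-in: `Price`, `PricingLookup`, the `upstream` hash (standing in for RubyLLM's model registry) and all prices are assumptions made for the example, not the gem's implementation.

```ruby
# Hypothetical pricing record; prices are per one million tokens.
Price = Struct.new(:input_per_million, :output_per_million, keyword_init: true)

# Hypothetical stand-in for the lookup chain, not the gem's CostCalculator.
class PricingLookup
  def initialize(upstream)
    @custom = {}
    @upstream = upstream # plain Hash standing in for the upstream model registry
  end

  def register_model(name, input_per_1m:, output_per_1m:)
    @custom[name] = Price.new(input_per_million: input_per_1m, output_per_million: output_per_1m)
  end

  # Custom registry wins; otherwise fall back to upstream; nil when both miss.
  def calculate(model_name:, usage:)
    return nil unless model_name && usage.is_a?(Hash)

    price = @custom[model_name] || @upstream[model_name]
    return nil unless price

    token_cost(usage[:input_tokens], price.input_per_million) +
      token_cost(usage[:output_tokens], price.output_per_million)
  end

  private

  # tokens x price_per_million / 1_000_000
  def token_cost(tokens, price_per_million)
    tokens.to_i * price_per_million / 1_000_000.0
  end
end

upstream = { "example-small" => Price.new(input_per_million: 1.0, output_per_million: 2.0) }
lookup = PricingLookup.new(upstream)
lookup.register_model("ft:gpt-4o-custom", input_per_1m: 3.0, output_per_1m: 6.0)

custom_cost   = lookup.calculate(model_name: "ft:gpt-4o-custom",
                                 usage: { input_tokens: 1_000_000, output_tokens: 500_000 })
# => 6.0 (3.0 for input + 3.0 for output)
upstream_cost = lookup.calculate(model_name: "example-small",
                                 usage: { input_tokens: 2_000_000, output_tokens: 0 })
# => 2.0
unknown_cost  = lookup.calculate(model_name: "no-such-model",
                                 usage: { input_tokens: 10, output_tokens: 10 })
# => nil, leaving the refuse-or-proceed decision to the caller
```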
38
77
 
@@ -159,10 +159,27 @@ module RubyLLM
 
          def runtime_settings(context)
            policy = context.key?(:retry_policy_override) ? context[:retry_policy_override] : retry_policy
+           extra = context.slice(:provider, :assume_model_exists, :max_tokens, :reasoning_effort)
+
+           # Always pass the class-level `thinking` config to the adapter when
+           # set, so fields like `budget` survive a per-call `reasoning_effort`
+           # override. The adapter's `resolve_thinking_config` merges
+           # `reasoning_effort` over `thinking[:effort]` while keeping the
+           # rest of the hash intact.
+           #
+           # `reasoning_effort` is also seeded into extra_options for
+           # backward compat with eval_host / production_mode paths that
+           # read it from there — but only when the caller did not already
+           # provide one in context.
+           if respond_to?(:thinking) && thinking
+             extra[:thinking] = thinking
+             extra[:reasoning_effort] = thinking[:effort] if !extra.key?(:reasoning_effort) && thinking[:effort]
+           end
+
            {
              model: context[:model] || model || RubyLLM::Contract.configuration.default_model,
              temperature: context[:temperature],
-             extra_options: context.slice(:provider, :assume_model_exists, :max_tokens, :reasoning_effort),
+             extra_options: extra,
              policy: policy
            }
          end
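How `extra_options` is seeded can be traced with a small standalone function. This is a sketch under assumptions: `build_extra_options` is a hypothetical helper that takes the class-level `thinking` hash as an argument instead of reading it from the Step class, and the context values are invented.

```ruby
# Hypothetical helper mirroring the seeding logic in `runtime_settings`.
def build_extra_options(context, thinking)
  extra = context.slice(:provider, :assume_model_exists, :max_tokens, :reasoning_effort)

  if thinking
    extra[:thinking] = thinking
    # Seed the effort only when the caller did not pass one in context.
    extra[:reasoning_effort] = thinking[:effort] if !extra.key?(:reasoning_effort) && thinking[:effort]
  end

  extra
end

class_thinking = { effort: :low, budget: 1_000 }

seeded = build_extra_options({ max_tokens: 100, unrelated: true }, class_thinking)
# => { max_tokens: 100, thinking: { effort: :low, budget: 1000 }, reasoning_effort: :low }

caller_wins = build_extra_options({ reasoning_effort: :high }, class_thinking)
# => { reasoning_effort: :high, thinking: { effort: :low, budget: 1000 } }

budget_only = build_extra_options({}, { budget: 1_000 })
# => { thinking: { budget: 1000 } }  (no effort to seed)

no_thinking = build_extra_options({ provider: :openai }, nil)
# => { provider: :openai }
```

In the `caller_wins` case both values travel to the adapter, which is where the context-level effort is merged over the class-level one while the budget is kept.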
@@ -200,6 +200,44 @@ module RubyLLM
            superclass.temperature if superclass.respond_to?(:temperature)
          end
 
+         def thinking(effort: nil, budget: nil)
+           if effort == :default
+             @thinking = nil
+             @thinking_explicitly_unset = true
+             return nil
+           end
+
+           if effort || budget
+             @thinking_explicitly_unset = false
+             return @thinking = { effort: effort, budget: budget }.compact
+           end
+
+           return @thinking if defined?(@thinking) && !@thinking_explicitly_unset
+           return nil if @thinking_explicitly_unset
+
+           superclass.thinking if superclass.respond_to?(:thinking)
+         end
+
+         def reasoning_effort(value = nil)
+           return (thinking && thinking[:effort]) if value.nil?
+
+           # Alias is scoped to the effort dimension only. `:default` on the
+           # alias clears effort but PRESERVES any previously-set budget — the
+           # name does not suggest "wipe the whole thinking config." Use the
+           # full `thinking(effort: :default)` to clear everything.
+           if value == :default
+             current_budget = thinking && thinking[:budget]
+             if current_budget
+               @thinking_explicitly_unset = false
+               @thinking = { budget: current_budget }
+               return nil
+             end
+             return thinking(effort: :default)
+           end
+
+           thinking(effort: value)
+         end
+
          def around_call(&block)
            if block
              return @around_call = block
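The inheritance and `:default` reset semantics of the two macros are easier to follow when run. The sketch below copies the method bodies from the hunk above into a hypothetical `ThinkingDsl` module and extends a few made-up classes with it; the module name, the class names and the effort and budget values are assumptions for illustration only.

```ruby
# Illustrative copy of the class macros, wrapped in a hypothetical module so
# the snippet runs on its own.
module ThinkingDsl
  def thinking(effort: nil, budget: nil)
    if effort == :default
      @thinking = nil
      @thinking_explicitly_unset = true
      return nil
    end

    if effort || budget
      @thinking_explicitly_unset = false
      return @thinking = { effort: effort, budget: budget }.compact
    end

    return @thinking if defined?(@thinking) && !@thinking_explicitly_unset
    return nil if @thinking_explicitly_unset

    superclass.thinking if superclass.respond_to?(:thinking)
  end

  def reasoning_effort(value = nil)
    return (thinking && thinking[:effort]) if value.nil?

    if value == :default
      current_budget = thinking && thinking[:budget]
      if current_budget
        @thinking_explicitly_unset = false
        @thinking = { budget: current_budget }
        return nil
      end
      return thinking(effort: :default)
    end

    thinking(effort: value)
  end
end

class ParentStep
  extend ThinkingDsl
  thinking effort: :low, budget: 1_000
end

class InheritingStep < ParentStep; end # reads the parent's config

class ResetStep < ParentStep
  thinking effort: :default # clears everything for this subclass
end

class AliasStep < ParentStep
  reasoning_effort :high # stores a fresh { effort: :high } on this subclass
end

class BudgetOnlyStep < ParentStep
  thinking effort: :high, budget: 500
  reasoning_effort :default # clears effort, keeps the budget
end

InheritingStep.thinking  # => { effort: :low, budget: 1000 }
ResetStep.thinking       # => nil
AliasStep.thinking       # => { effort: :high }
BudgetOnlyStep.thinking  # => { budget: 500 }
```

One thing this copy makes visible: calling the alias with a concrete value stores a new hash on the subclass, so in this sketch the parent's `budget` is not merged into `AliasStep.thinking`.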
@@ -22,7 +22,7 @@ module RubyLLM
          def collect_limit_errors(estimated)
            errors = []
            if max_input && estimated > max_input
-             errors << "Input token limit exceeded: estimated #{estimated} tokens, max #{max_input}"
+             errors << "Input token limit exceeded: estimated #{estimated} tokens (heuristic ±30%), max #{max_input}"
            end
            append_cost_error(estimated, errors) if max_cost
            errors
@@ -46,7 +46,7 @@ module RubyLLM
              handle_unknown_pricing(errors)
            elsif estimated_cost > max_cost
              errors << "Cost limit exceeded: estimated $#{format("%.6f", estimated_cost)} " \
-                       "(#{estimated} input + #{estimated_output} output tokens), " \
+                       "(#{estimated} input + #{estimated_output} output tokens, heuristic ±30%), " \
                        "max $#{format("%.6f", max_cost)}"
            end
          end
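To see what a refusal message looks like with the new suffix, the string template from the hunk above can be exercised directly. `cost_limit_error` is a hypothetical free-standing helper and the numbers are invented; only the message format is taken from the diff.

```ruby
# Hypothetical helper that reuses the message template shown in the diff.
def cost_limit_error(estimated_cost:, input_tokens:, output_tokens:, max_cost:)
  return nil unless estimated_cost > max_cost

  "Cost limit exceeded: estimated $#{format('%.6f', estimated_cost)} " \
    "(#{input_tokens} input + #{output_tokens} output tokens, heuristic ±30%), " \
    "max $#{format('%.6f', max_cost)}"
end

message = cost_limit_error(estimated_cost: 0.0125, input_tokens: 1_500,
                           output_tokens: 800, max_cost: 0.01)
# => "Cost limit exceeded: estimated $0.012500 (1500 input + 800 output tokens, heuristic ±30%), max $0.010000"

within_budget = cost_limit_error(estimated_cost: 0.005, input_tokens: 600,
                                 output_tokens: 300, max_cost: 0.01)
# => nil
```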
@@ -2,12 +2,29 @@
 
  module RubyLLM
    module Contract
+     # Pre-flight token estimation for `max_input` / `max_cost` budget gating.
+     #
+     # IMPORTANT — heuristic only. This is NOT an accurate tokenizer.
+     # The estimate uses a fixed `length / CHARS_PER_TOKEN` ratio:
+     #
+     # - Accurate to ±30% for English prose with mainstream OpenAI / Anthropic models
+     # - Worse for non-English text, code, structured data, and unusual scripts
+     # - Useless for models with very different tokenizers (e.g. some open-source models)
+     #
+     # RubyLLM 1.14 ships no pre-flight tokenizer either; once the API call
+     # returns, `RubyLLM::Tokens` provides accurate counts from provider usage
+     # data. This estimator is for the *pre-flight refusal* path only — its job
+     # is to answer "is this call almost certainly within budget?" with enough
+     # accuracy that runaway prompts get caught, while accepting that the
+     # boundary cases will be wrong.
+     #
+     # Refusal messages from `LimitChecker` carry an "(heuristic)" suffix so
+     # adopters know the number is estimated, not measured.
      module TokenEstimator
-       # Heuristic: ~4 characters per token for English text.
-       # This is a rough estimate — actual tokenization varies by model and content.
-       # Intentionally conservative (overestimates slightly) to avoid surprise costs.
        CHARS_PER_TOKEN = 4
 
+       # Heuristic estimate. Returns an integer token count.
+       # See module docstring for accuracy caveats.
        def self.estimate(messages)
          return 0 unless messages.is_a?(Array)
@@ -2,6 +2,6 @@
 
  module RubyLLM
    module Contract
-     VERSION = "0.7.3"
+     VERSION = "0.8.0"
    end
  end
@@ -7,10 +7,11 @@ Gem::Specification.new do |spec|
    spec.version = RubyLLM::Contract::VERSION
    spec.authors = ["Justyna"]
 
-   spec.summary = "Know which LLM model to use, what it costs, and when accuracy drops"
-   spec.description = "Compare LLM models by accuracy and cost. Regression-test prompts in CI. " \
-                      "Start on nano, auto-escalate to bigger models when quality drops. " \
-                      "Companion gem for ruby_llm."
+   spec.summary = "Contracts + Evals for ruby_llm"
+   spec.description = "Wraps RubyLLM::Chat with input/output contracts, business-rule validation, " \
+                      "retry with model escalation on validation failure, pre-flight cost ceilings, " \
+                      "and an evaluation framework. Sibling abstraction to RubyLLM::Agent — same " \
+                      "niche (reusable class-based prompts), wider contract."
    spec.homepage = "https://github.com/justi/ruby_llm-contract"
    spec.license = "MIT"
    spec.required_ruby_version = ">= 3.2.0"
@@ -30,6 +31,6 @@ Gem::Specification.new do |spec|
    spec.require_paths = ["lib"]
 
    spec.add_dependency "dry-types", "~> 1.7"
-   spec.add_dependency "ruby_llm", "~> 1.0"
+   spec.add_dependency "ruby_llm", "~> 1.12"
    spec.add_dependency "ruby_llm-schema", "~> 0.3"
  end
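What the bump from `~> 1.0` to `~> 1.12` admits can be checked with RubyGems' own requirement classes. The version numbers tested below are arbitrary examples, apart from 1.14.0, which is the version resolved in the Gemfile.lock hunk.

```ruby
require "rubygems"

old_requirement = Gem::Requirement.new("~> 1.0")
new_requirement = Gem::Requirement.new("~> 1.12")

# "~> 1.12" means ">= 1.12" and "< 2.0".
locked   = new_requirement.satisfied_by?(Gem::Version.new("1.14.0")) # => true
minimum  = new_requirement.satisfied_by?(Gem::Version.new("1.12.0")) # => true
too_old  = new_requirement.satisfied_by?(Gem::Version.new("1.11.9")) # => false
next_maj = new_requirement.satisfied_by?(Gem::Version.new("2.0.0"))  # => false

# The same 1.11.9 was acceptable under the old constraint.
was_ok   = old_requirement.satisfied_by?(Gem::Version.new("1.11.9")) # => true
```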
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: ruby_llm-contract
  version: !ruby/object:Gem::Version
-   version: 0.7.3
+   version: 0.8.0
  platform: ruby
  authors:
  - Justyna
@@ -29,14 +29,14 @@ dependencies:
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '1.0'
+         version: '1.12'
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '1.0'
+         version: '1.12'
  - !ruby/object:Gem::Dependency
    name: ruby_llm-schema
    requirement: !ruby/object:Gem::Requirement
@@ -51,9 +51,10 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: '0.3'
- description: Compare LLM models by accuracy and cost. Regression-test prompts in CI.
-   Start on nano, auto-escalate to bigger models when quality drops. Companion gem
-   for ruby_llm.
+ description: Wraps RubyLLM::Chat with input/output contracts, business-rule validation,
+   retry with model escalation on validation failure, pre-flight cost ceilings, and
+   an evaluation framework. Sibling abstraction to RubyLLM::Agent — same niche (reusable
+   class-based prompts), wider contract.
  executables: []
  extensions: []
  extra_rdoc_files: []
@@ -205,5 +206,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  requirements: []
  rubygems_version: 3.6.7
  specification_version: 4
- summary: Know which LLM model to use, what it costs, and when accuracy drops
+ summary: Contracts + Evals for ruby_llm
  test_files: []