ruby_llm-contract 0.7.3 → 0.8.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +36 -0
- data/Gemfile.lock +3 -3
- data/README.md +24 -5
- data/lib/ruby_llm/contract/adapters/ruby_llm.rb +22 -1
- data/lib/ruby_llm/contract/cost_calculator.rb +39 -0
- data/lib/ruby_llm/contract/step/base.rb +18 -1
- data/lib/ruby_llm/contract/step/dsl.rb +38 -0
- data/lib/ruby_llm/contract/step/limit_checker.rb +2 -2
- data/lib/ruby_llm/contract/token_estimator.rb +20 -3
- data/lib/ruby_llm/contract/version.rb +1 -1
- data/ruby_llm-contract.gemspec +6 -5
- metadata +8 -7
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 33f6a8a686f7f20791904c4fbfacd19f6ea5b8bad428c374ec14b7e33521354d
|
|
4
|
+
data.tar.gz: a2b2f7d9ff1e6cd69b39d55a3809b1babbd82a5d42afe3c567733076f03fa317
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: '0895986db9cde7d26d2e91ffc7c6469b7d34df299a361148c9ee339dbf1dc61539e44adf250c00c06383717ed2e47ff250d4490d1cce43c3cdf8c3169529fba5'
|
|
7
|
+
data.tar.gz: ed35a4b4cc9ab1afd46c427468dcb33844d8d54207531f98a9f1d775004efc5a1e19d64fb22c64b20da81750165a3a979f4de4ecaa46aff4121eb7ba80a27ed2
|
data/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,41 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## 0.8.0 (2026-04-26)
|
|
4
|
+
|
|
5
|
+
Narrative repositioning + small API additions. Internal architecture unchanged: no `Step::Base` refactor, no breaking changes to existing DSL.
|
|
6
|
+
|
|
7
|
+
### Added
|
|
8
|
+
|
|
9
|
+
- **`thinking(effort:, budget:)` class macro on `Step::Base`** — mirrors `RubyLLM::Agent.thinking` signature exactly. Stored as `{ effort:, budget: }` hash; reader returns the hash; supports `:default` reset semantics; superclass inheritance like `model`/`temperature`. The convenience alias `reasoning_effort(:low)` is implemented as `thinking(effort: :low)` — single normalized state, not separate ivar.
|
|
10
|
+
- **Adapter wiring for `with_thinking`** — when `thinking` is set on the Step class, OR when `reasoning_effort:` is passed through context, OR when an attempt config in `retry_policy escalate(...)` carries `reasoning_effort:`, the RubyLLM adapter resolves the effective `{ effort:, budget: }` hash and forwards it via `chat.with_thinking(**)` — provider-agnostic (supports OpenAI `reasoning_effort` AND Anthropic extended-thinking budget). Precedence: per-attempt / context `reasoning_effort` overrides class-level `thinking[:effort]`; budget is taken from class-level `thinking[:budget]`. **Behavioural change vs 0.7.x**: `reasoning_effort` is now forwarded via `with_thinking` instead of `with_params`. Same wire-level OpenAI parameter; provider-agnostic Anthropic support is now automatic.
|
|
11
|
+
|
|
12
|
+
### Dependencies
|
|
13
|
+
|
|
14
|
+
- **`ruby_llm` constraint bumped from `~> 1.0` to `~> 1.12`** — `Chat#with_thinking` is the canonical path for reasoning effort + extended thinking; it shipped in RubyLLM 1.12. Adopters on `ruby_llm < 1.12` need to bump RubyLLM before upgrading this gem to 0.8.0.
|
|
15
|
+
|
|
16
|
+
### Changed
|
|
17
|
+
|
|
18
|
+
- **Tagline + README opening** — repositioned around "Contracts + Evals for RubyLLM". New "Relation to RubyLLM::Agent" section explicitly frames Step as a sibling abstraction (same niche as Agent, wider contract), not an alternative or foundation. README does not claim "Step uses Agent under the hood" — current call path is `Step → Runner → Adapters::RubyLLM → RubyLLM.chat` directly.
|
|
19
|
+
- **`TokenEstimator` documented as heuristic** — module docstring expanded with explicit "±30% accuracy" framing. Refusal messages from `LimitChecker` now include `(heuristic ±30%)` suffix so adopters know the pre-flight number is estimated, not measured. RubyLLM 1.14 also has no pre-flight tokenizer; `RubyLLM::Tokens` is post-hoc only.
|
|
20
|
+
- **`CostCalculator` repositioned in docs** — module narrative reframed from "cost calculator" to "fine-tune pricing registry + lookup with fallback chain". Math methods (`compute_cost`, `token_cost`, etc.) were already private; this release makes the docs match. Public API surface unchanged: `register_model`, `unregister_model`, `reset_custom_models!`, `calculate`.
|
|
21
|
+
- **`output_schema` reframed in docs** — described as "wrapper around `RubyLLM::Schema` + client-side validation step", not a standalone feature. The schema language is identical to what `RubyLLM::Agent.schema` accepts; the difference is what wraps it.
|
|
22
|
+
- **README retry framing** — `retry_policy escalate(...)` (model escalation on validation failure) is the marketed default. `retry_policy attempts: N` (same-model retry) stays in the API for backward compat and niche cases (subjective criteria, multi-step pipelines, weaker models) but is no longer marketed as a recommended default. Empirical basis: four small experiments across PDF quiz generation, GSM8K math (n=30 + n=120), and multi-constraint schedule generation found no useful lift for nano-class models on tasks with clear correctness criteria.
|
|
23
|
+
|
|
24
|
+
### Documentation
|
|
25
|
+
|
|
26
|
+
- **New disambiguation paragraphs** in `prompt_ast.md` (`Step.input_type` vs `RubyLLM::Agent.inputs`; `Prompt::Builder` multi-role DSL vs Agent ERB single-string template loader), `testing.md` (`Step.observe` vs `Chat#on_end_message` / `on_tool_call`), `output_schema.md` (relation to `Agent.schema`), and `optimizing_retry_policy.md` (orthogonal model + thinking dimensions).
|
|
27
|
+
- **`getting_started.md` refusal message example** updated to include the new `(heuristic ±30%)` suffix.
|
|
28
|
+
|
|
29
|
+
### Issues closed
|
|
30
|
+
|
|
31
|
+
- **#11** (Optimizer is blind to same-model attempts) — closed after empirical experiments. `attempts: N` retry stays in API; not marketed as a default.
|
|
32
|
+
- **#6** (Production cost reporting) — already implemented in 0.7.x; close confirmed.
|
|
33
|
+
|
|
34
|
+
### Not in this release (deferred)
|
|
35
|
+
|
|
36
|
+
- `output_schema` Proc form for runtime-input-aware schemas (parity with `Agent.schema` Proc form). Additive, low-risk; deferred to 0.9 to keep 0.8 scope tight.
|
|
37
|
+
- H4 (Step composing `RubyLLM::Agent` internally as config holder) — verified feasible but ROI insufficient for current adopter base; trigger-based revisit, no calendar commitment.
|
|
38
|
+
|
|
3
39
|
## 0.7.3 (2026-04-24)
|
|
4
40
|
|
|
5
41
|
Adoption-friction release. No runtime behavior changes — every delta is in `docs/`, `examples/`, or `spec/integration/` (plus the `version.rb` / Gemfile.lock bumps). Upgrading from 0.7.2 picks up the expanded guide set, the new runnable showcases, and one extra integration spec.
|
data/Gemfile.lock
CHANGED
|
@@ -1,9 +1,9 @@
|
|
|
1
1
|
PATH
|
|
2
2
|
remote: .
|
|
3
3
|
specs:
|
|
4
|
-
ruby_llm-contract (0.
|
|
4
|
+
ruby_llm-contract (0.8.0)
|
|
5
5
|
dry-types (~> 1.7)
|
|
6
|
-
ruby_llm (~> 1.
|
|
6
|
+
ruby_llm (~> 1.12)
|
|
7
7
|
ruby_llm-schema (~> 0.3)
|
|
8
8
|
|
|
9
9
|
GEM
|
|
@@ -258,7 +258,7 @@ CHECKSUMS
|
|
|
258
258
|
rubocop-ast (1.49.1) sha256=4412f3ee70f6fe4546cc489548e0f6fcf76cafcfa80fa03af67098ffed755035
|
|
259
259
|
ruby-progressbar (1.13.0) sha256=80fc9c47a9b640d6834e0dc7b3c94c9df37f08cb072b7761e4a71e22cff29b33
|
|
260
260
|
ruby_llm (1.14.0) sha256=57c6f7034fc4a44504ea137d70f853b07824f1c1cdbe774ab3ab3522e7098deb
|
|
261
|
-
ruby_llm-contract (0.
|
|
261
|
+
ruby_llm-contract (0.8.0)
|
|
262
262
|
ruby_llm-schema (0.3.0) sha256=a591edc5ca1b7f0304f0e2261de61ba4b3bea17be09f5cf7558153adfda3dec6
|
|
263
263
|
ruby_parser (3.22.0) sha256=1eb4937cd9eb220aa2d194e352a24dba90aef00751e24c8dfffdb14000f15d23
|
|
264
264
|
rubycritic (4.12.0) sha256=024fed90fe656fa939f6ea80aab17569699ac3863d0b52fd72cb99892247abc8
|
data/README.md
CHANGED
|
@@ -1,8 +1,10 @@
|
|
|
1
1
|
# ruby_llm-contract
|
|
2
2
|
|
|
3
|
-
**
|
|
3
|
+
**Contracts + Evals for [ruby_llm](https://github.com/crmne/ruby_llm).**
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Your eval passed. Prod broke anyway? This gem wraps `RubyLLM::Chat` with input/output contracts, business-rule validation, retry with model escalation on validation failure, pre-flight cost ceilings, and an evaluation framework — so a flaky cheap-model call escalates to a stronger model instead of shipping garbage to your user.
|
|
6
|
+
|
|
7
|
+
`ruby_llm` handles the HTTP side (rate limits, timeouts, streaming, tool calls, embeddings). This gem handles what the model *returned*: schema validation, business rules, model escalation on failed validation, datasets, regression tests.
|
|
6
8
|
|
|
7
9
|
## Install
|
|
8
10
|
|
|
@@ -65,9 +67,26 @@ Everything below is optional — the example above is a complete step. Reach for
|
|
|
65
67
|
- **[CI regression gates](docs/guide/getting_started.md)** — `define_eval` + `save_baseline!` + `pass_eval(...).without_regressions` blocks CI when accuracy drops on a model update or prompt tweak.
|
|
66
68
|
- **[Find the cheapest viable fallback list](docs/guide/optimizing_retry_policy.md)** — `Step.recommend("regression", candidates: [...], min_score: 0.95)` returns the cheapest list of models that still passes your evals. `production_mode:` measures retry-aware cost.
|
|
67
69
|
- **[A/B test prompts](docs/guide/eval_first.md)** — `SummarizeArticleV2.compare_with(SummarizeArticleV1, eval: "regression")` reports whether the new prompt is safe to ship.
|
|
68
|
-
- **[Budget caps](docs/guide/getting_started.md)** — `max_cost`, `max_input`, `max_output` refuse the request before calling the API when
|
|
70
|
+
- **[Budget caps](docs/guide/getting_started.md)** — `max_cost`, `max_input`, `max_output` refuse the request before calling the API when a heuristic estimate (~±30% accuracy) exceeds the limit.
|
|
71
|
+
- **[Reasoning effort / thinking config](docs/guide/optimizing_retry_policy.md)** — `thinking effort: :low` (or alias `reasoning_effort :low`) on the Step class; mirrors `RubyLLM::Agent.thinking` and forwards through `Chat#with_thinking`.
|
|
72
|
+
|
|
73
|
+
Also supports [multi-step pipelines](docs/guide/pipeline.md) with fail-fast and `retry_policy attempts: N` for niche cases (we measured this empirically — for `gpt-4.1-nano` / `gpt-5-nano` on tasks with clear correctness criteria, same-model retry rarely helps; `escalate(model_2)` is the strategy that moves the needle, see [optimizing_retry_policy.md](docs/guide/optimizing_retry_policy.md)).
|
|
74
|
+
|
|
75
|
+
## Relation to `RubyLLM::Agent`
|
|
76
|
+
|
|
77
|
+
`RubyLLM::Agent` (since RubyLLM 1.12) and `Step::Base` here target the **same niche**: reusable, class-based prompts. They are siblings, not foundation-and-roof.
|
|
78
|
+
|
|
79
|
+
| What you write | Where it lives |
|
|
80
|
+
|---|---|
|
|
81
|
+
| `model`, `temperature`, `schema`, `instructions`, `tools`, `thinking` | covered by both — same idea, different DSL surface |
|
|
82
|
+
| `validate :rule do |out| ... end` business invariants | only here |
|
|
83
|
+
| `retry_policy escalate(...)` model escalation on validation failure | only here (different from RubyLLM's network-level retry) |
|
|
84
|
+
| `max_cost` / `max_input` / `max_output` pre-flight refusal | only here |
|
|
85
|
+
| `define_eval` + baseline regression + `compare_models` + `optimize_retry_policy` | only here (RubyLLM does not ship an eval framework) |
|
|
86
|
+
| Pipeline composition with `step SomeStep, as: :alias` | only here (RubyLLM intentionally leaves workflows as plain Ruby) |
|
|
87
|
+
| `around_call`, named `observe` hooks with pass/fail in trace | only here |
|
|
69
88
|
|
|
70
|
-
|
|
89
|
+
`Step::Base` does NOT use `Agent` internally today — both call into `RubyLLM::Chat` directly. The two abstractions can coexist on the same project: use `Agent` for prompt-only reuse, use `Step` when you need any of the contract-layer features above. The retry-strategy framing here (favouring `escalate(...)` over same-model `attempts: N`) is grounded in an empirical comparison; `attempts: N` stays in the API for niche cases.
|
|
71
90
|
|
|
72
91
|
## Docs
|
|
73
92
|
|
|
@@ -89,7 +108,7 @@ Also supports [multi-step pipelines](docs/guide/pipeline.md) with fail-fast and
|
|
|
89
108
|
|
|
90
109
|
## Roadmap
|
|
91
110
|
|
|
92
|
-
Latest: **v0.
|
|
111
|
+
Latest: **v0.8.0** — tagline + narrative repositioning around "Contracts + Evals for RubyLLM", `thinking` / `reasoning_effort` class macro, TokenEstimator labelled as heuristic, CostCalculator repositioned. See [CHANGELOG](CHANGELOG.md) for history.
|
|
93
112
|
|
|
94
113
|
## License
|
|
95
114
|
|
data/lib/ruby_llm/contract/adapters/ruby_llm.rb
CHANGED
|
@@ -52,12 +52,33 @@ module RubyLLM
|
|
|
52
52
|
CHAT_OPTION_METHODS.each do |key, method_name|
|
|
53
53
|
chat.public_send(method_name, options[key]) if options[key]
|
|
54
54
|
end
|
|
55
|
+
|
|
56
|
+
# Resolve thinking config from BOTH sources, with `:reasoning_effort`
|
|
57
|
+
# taking precedence over `:thinking[:effort]`. This is the per-attempt
|
|
58
|
+
# override path used by `retry_policy { escalate({model:, reasoning_effort:}) }`
|
|
59
|
+
# — the attempt-specific effort must win over the class-level default.
|
|
60
|
+
# Forwarded provider-agnostically via `chat.with_thinking(**)` —
|
|
61
|
+
# available since RubyLLM 1.12 (gemspec enforces this minimum).
|
|
62
|
+
thinking_config = resolve_thinking_config(options)
|
|
63
|
+
chat.with_thinking(**thinking_config) if thinking_config
|
|
64
|
+
|
|
65
|
+
# `with_params` carries only raw passthroughs (currently `max_tokens`).
|
|
66
|
+
# `reasoning_effort` is no longer forwarded here — it goes through
|
|
67
|
+
# `with_thinking` above, which is the canonical RubyLLM API.
|
|
55
68
|
params = {}
|
|
56
69
|
params[:max_tokens] = options[:max_tokens] if options[:max_tokens]
|
|
57
|
-
params[:reasoning_effort] = options[:reasoning_effort] if options[:reasoning_effort]
|
|
58
70
|
chat.with_params(**params) if params.any?
|
|
59
71
|
end
|
|
60
72
|
|
|
73
|
+
# Returns merged `{ effort:, budget: }` or nil. `options[:reasoning_effort]`
|
|
74
|
+
# overrides any inherited `options[:thinking][:effort]`; budget is
|
|
75
|
+
# taken from `options[:thinking][:budget]` only.
|
|
76
|
+
def resolve_thinking_config(options)
|
|
77
|
+
base = options[:thinking].is_a?(Hash) ? options[:thinking].dup : {}
|
|
78
|
+
base[:effort] = options[:reasoning_effort] if options[:reasoning_effort]
|
|
79
|
+
base.empty? ? nil : base
|
|
80
|
+
end
|
|
81
|
+
|
|
61
82
|
def build_response(response)
|
|
62
83
|
content = response.content
|
|
63
84
|
content = content.to_s unless content.is_a?(Hash) || content.is_a?(Array)
|
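The merge rule in the hunk above can be tried in isolation. The sketch below copies `resolve_thinking_config` into a standalone script; it loads neither RubyLLM nor this gem, and the option hashes passed to it are made-up illustrations rather than values taken from the package.

```ruby
# Standalone copy of the merge rule shown in the hunk above, for illustration only.
# Nothing here loads ruby_llm or ruby_llm-contract.
def resolve_thinking_config(options)
  base = options[:thinking].is_a?(Hash) ? options[:thinking].dup : {}
  base[:effort] = options[:reasoning_effort] if options[:reasoning_effort]
  base.empty? ? nil : base
end

# Hypothetical class-level config: low effort with a token budget.
class_level = { effort: :low, budget: 2_000 }

# A per-attempt reasoning_effort replaces the effort but keeps the budget.
escalated = resolve_thinking_config(thinking: class_level, reasoning_effort: :high)

# Without an override, the class-level hash passes through unchanged.
inherited = resolve_thinking_config(thinking: class_level)

# With neither source set, the result is nil, so with_thinking would be skipped.
unset = resolve_thinking_config({})

p escalated
p inherited
p unset
```

Because the method works on a `dup` of `options[:thinking]`, the class-level hash itself is not mutated by a per-attempt override.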
data/lib/ruby_llm/contract/cost_calculator.rb
CHANGED
|
@@ -2,6 +2,29 @@
|
|
|
2
2
|
|
|
3
3
|
module RubyLLM
|
|
4
4
|
module Contract
|
|
5
|
+
# Pricing lookup for `max_cost` budget gating + retry usage aggregation.
|
|
6
|
+
#
|
|
7
|
+
# **What this module does (public surface):**
|
|
8
|
+
#
|
|
9
|
+
# 1. **Fine-tune / custom-model pricing registry** — `register_model`
|
|
10
|
+
# fills the gap left by RubyLLM 1.14's models.json: there is no
|
|
11
|
+
# upstream `RubyLLM::Models.register` API, so fine-tuned models
|
|
12
|
+
# (e.g. `ft:gpt-4o-custom`) need their pricing supplied locally.
|
|
13
|
+
# 2. **Lookup with fallback chain** — `calculate(model_name:, usage:)`
|
|
14
|
+
# checks the custom registry first, falls back to
|
|
15
|
+
# `RubyLLM.models.find(model_name)`, returns `nil` on miss.
|
|
16
|
+
#
|
|
17
|
+
# **What this module is NOT:**
|
|
18
|
+
#
|
|
19
|
+
# - Not a "cost calculator" feature — the math itself
|
|
20
|
+
# (`tokens × price_per_million / 1_000_000`) is trivial and lives
|
|
21
|
+
# in `private_class_method :compute_cost` for internal use only.
|
|
22
|
+
# - Not a substitute for RubyLLM's pricing data — for any model in
|
|
23
|
+
# `RubyLLM.models`, this module simply queries it.
|
|
24
|
+
#
|
|
25
|
+
# The reason this module exists at all is the registry + retry usage
|
|
26
|
+
# aggregation across attempts (the latter sits in `Step::RetryExecutor`,
|
|
27
|
+
# which calls `calculate` per attempt and sums; not in this module).
|
|
5
28
|
module CostCalculator
|
|
6
29
|
# Simple struct for custom-registered model pricing
|
|
7
30
|
RegisteredModel = Struct.new(:input_price_per_million, :output_price_per_million, keyword_init: true)
|
|
@@ -9,6 +32,8 @@ module RubyLLM
|
|
|
9
32
|
@custom_models = {}
|
|
10
33
|
|
|
11
34
|
# Register pricing for custom or fine-tuned models not in the RubyLLM registry.
|
|
35
|
+
# This is the gem's primary value-add for cost computation; everything
|
|
36
|
+
# else falls back to RubyLLM's own model registry.
|
|
12
37
|
#
|
|
13
38
|
# CostCalculator.register_model("ft:gpt-4o-custom",
|
|
14
39
|
# input_per_1m: 3.0, output_per_1m: 6.0)
|
|
@@ -33,6 +58,20 @@ module RubyLLM
|
|
|
33
58
|
@custom_models.clear
|
|
34
59
|
end
|
|
35
60
|
|
|
61
|
+
# Look up cost for a single model + usage hash.
|
|
62
|
+
# Returns nil if model is unknown (custom registry miss + RubyLLM miss),
|
|
63
|
+
# so callers can decide whether to refuse the call or proceed (see
|
|
64
|
+
# `on_unknown_pricing:` step option for the budget-gating policy).
|
|
65
|
+
#
|
|
66
|
+
# CostCalculator.calculate(
|
|
67
|
+
# model_name: "gpt-4o-mini",
|
|
68
|
+
# usage: { input_tokens: 1_500, output_tokens: 800 }
|
|
69
|
+
# )
|
|
70
|
+
# # => 0.00069 (or nil if model not registered)
|
|
71
|
+
#
|
|
72
|
+
# Math is intentionally simple and private — this method is the
|
|
73
|
+
# primary public entry point. Aggregating across retry attempts is
|
|
74
|
+
# done in `Step::RetryExecutor`, not here.
|
|
36
75
|
def self.calculate(model_name:, usage:)
|
|
37
76
|
return nil unless model_name && usage.is_a?(Hash)
|
|
38
77
|
|
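The docstrings above describe a lookup chain (custom registry, then the upstream registry, then `nil`) and a per-million pricing formula. A minimal sketch of that idea follows. It is not the gem's `CostCalculator`: the module name `PricingSketch`, the `UPSTREAM_PRICES` hash that stands in for `RubyLLM.models`, and the placeholder prices are all assumptions made for this example. Only the `ft:gpt-4o-custom` prices (3.0 / 6.0 per million) come from the comment in the diff.

```ruby
# Illustrative stand-in, NOT the gem's CostCalculator. The module name, the
# UPSTREAM_PRICES hash, and its placeholder prices are assumptions.
module PricingSketch
  Price = Struct.new(:input_per_1m, :output_per_1m, keyword_init: true)

  # Stand-in for an upstream model registry (placeholder prices, not real ones).
  UPSTREAM_PRICES = {
    "example-model" => Price.new(input_per_1m: 1.0, output_per_1m: 2.0)
  }.freeze

  @custom = {}

  def self.register_model(name, input_per_1m:, output_per_1m:)
    @custom[name] = Price.new(input_per_1m: input_per_1m, output_per_1m: output_per_1m)
  end

  # Custom registry first, then the upstream stand-in, then nil on a miss.
  def self.calculate(model_name:, usage:)
    return nil unless model_name && usage.is_a?(Hash)

    price = @custom[model_name] || UPSTREAM_PRICES[model_name]
    return nil unless price

    # tokens * price_per_million / 1_000_000, summed over input and output.
    input_cost  = usage.fetch(:input_tokens, 0) * price.input_per_1m
    output_cost = usage.fetch(:output_tokens, 0) * price.output_per_1m
    (input_cost + output_cost) / 1_000_000.0
  end
end

PricingSketch.register_model("ft:gpt-4o-custom", input_per_1m: 3.0, output_per_1m: 6.0)
usage = { input_tokens: 1_500, output_tokens: 800 }

p PricingSketch.calculate(model_name: "ft:gpt-4o-custom", usage: usage)
p PricingSketch.calculate(model_name: "unknown-model", usage: usage)
```

For the registered fine-tune the arithmetic is (1,500 × 3.0 + 800 × 6.0) / 1,000,000 = 9,300 / 1,000,000 = 0.0093 dollars. The unknown model yields `nil`, which leaves the refuse-or-proceed decision to the caller.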
data/lib/ruby_llm/contract/step/base.rb
CHANGED
|
@@ -159,10 +159,27 @@ module RubyLLM
|
|
|
159
159
|
|
|
160
160
|
def runtime_settings(context)
|
|
161
161
|
policy = context.key?(:retry_policy_override) ? context[:retry_policy_override] : retry_policy
|
|
162
|
+
extra = context.slice(:provider, :assume_model_exists, :max_tokens, :reasoning_effort)
|
|
163
|
+
|
|
164
|
+
# Always pass the class-level `thinking` config to the adapter when
|
|
165
|
+
# set, so fields like `budget` survive a per-call `reasoning_effort`
|
|
166
|
+
# override. The adapter's `resolve_thinking_config` merges
|
|
167
|
+
# `reasoning_effort` over `thinking[:effort]` while keeping the
|
|
168
|
+
# rest of the hash intact.
|
|
169
|
+
#
|
|
170
|
+
# `reasoning_effort` is also seeded into extra_options for
|
|
171
|
+
# backward compat with eval_host / production_mode paths that
|
|
172
|
+
# read it from there — but only when the caller did not already
|
|
173
|
+
# provide one in context.
|
|
174
|
+
if respond_to?(:thinking) && thinking
|
|
175
|
+
extra[:thinking] = thinking
|
|
176
|
+
extra[:reasoning_effort] = thinking[:effort] if !extra.key?(:reasoning_effort) && thinking[:effort]
|
|
177
|
+
end
|
|
178
|
+
|
|
162
179
|
{
|
|
163
180
|
model: context[:model] || model || RubyLLM::Contract.configuration.default_model,
|
|
164
181
|
temperature: context[:temperature],
|
|
165
|
-
extra_options:
|
|
182
|
+
extra_options: extra,
|
|
166
183
|
policy: policy
|
|
167
184
|
}
|
|
168
185
|
end
|
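To see what ends up in `extra_options` for different callers, the logic of the hunk above can be pulled into a free-standing function. This is a sketch: `build_extra_options` and its `class_thinking` argument are hypothetical names that stand in for the instance method and the class-level `thinking` reader, and the sample contexts are invented.

```ruby
# Standalone rendering of the extra-options logic from the hunk above.
# `class_thinking` stands in for the Step class's `thinking` reader.
def build_extra_options(context, class_thinking)
  extra = context.slice(:provider, :assume_model_exists, :max_tokens, :reasoning_effort)
  if class_thinking
    extra[:thinking] = class_thinking
    # Seed reasoning_effort from the class only when the caller gave none.
    if !extra.key?(:reasoning_effort) && class_thinking[:effort]
      extra[:reasoning_effort] = class_thinking[:effort]
    end
  end
  extra
end

thinking = { effort: :low, budget: 1_024 }

# No effort in context: the class-level effort is seeded.
p build_extra_options({ max_tokens: 500 }, thinking)

# Effort in context: it is kept, and unrelated keys are dropped by slice.
p build_extra_options({ reasoning_effort: :high, unrelated: true }, thinking)

# No class-level thinking: only the whitelisted context keys survive.
p build_extra_options({ max_tokens: 500 }, nil)
```

In the second call the `:thinking` hash still carries the budget, which is what lets the adapter merge a per-call effort over the class-level config without losing it.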
data/lib/ruby_llm/contract/step/dsl.rb
CHANGED
|
@@ -200,6 +200,44 @@ module RubyLLM
|
|
|
200
200
|
superclass.temperature if superclass.respond_to?(:temperature)
|
|
201
201
|
end
|
|
202
202
|
|
|
203
|
+
def thinking(effort: nil, budget: nil)
|
|
204
|
+
if effort == :default
|
|
205
|
+
@thinking = nil
|
|
206
|
+
@thinking_explicitly_unset = true
|
|
207
|
+
return nil
|
|
208
|
+
end
|
|
209
|
+
|
|
210
|
+
if effort || budget
|
|
211
|
+
@thinking_explicitly_unset = false
|
|
212
|
+
return @thinking = { effort: effort, budget: budget }.compact
|
|
213
|
+
end
|
|
214
|
+
|
|
215
|
+
return @thinking if defined?(@thinking) && !@thinking_explicitly_unset
|
|
216
|
+
return nil if @thinking_explicitly_unset
|
|
217
|
+
|
|
218
|
+
superclass.thinking if superclass.respond_to?(:thinking)
|
|
219
|
+
end
|
|
220
|
+
|
|
221
|
+
def reasoning_effort(value = nil)
|
|
222
|
+
return (thinking && thinking[:effort]) if value.nil?
|
|
223
|
+
|
|
224
|
+
# Alias is scoped to the effort dimension only. `:default` on the
|
|
225
|
+
# alias clears effort but PRESERVES any previously-set budget — the
|
|
226
|
+
# name does not suggest "wipe the whole thinking config." Use the
|
|
227
|
+
# full `thinking(effort: :default)` to clear everything.
|
|
228
|
+
if value == :default
|
|
229
|
+
current_budget = thinking && thinking[:budget]
|
|
230
|
+
if current_budget
|
|
231
|
+
@thinking_explicitly_unset = false
|
|
232
|
+
@thinking = { budget: current_budget }
|
|
233
|
+
return nil
|
|
234
|
+
end
|
|
235
|
+
return thinking(effort: :default)
|
|
236
|
+
end
|
|
237
|
+
|
|
238
|
+
thinking(effort: value)
|
|
239
|
+
end
|
|
240
|
+
|
|
203
241
|
def around_call(&block)
|
|
204
242
|
if block
|
|
205
243
|
return @around_call = block
|
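The inheritance and `:default` reset behaviour of the two macros is easier to follow when run. The sketch below lifts both methods from the hunk above into a module that can be extended without the gem. `ThinkingMacros` and the `*Step` classes are hypothetical names used only for this example.

```ruby
# The two macros from the hunk above, lifted into a module so they run
# without the gem. ThinkingMacros and the *Step classes are hypothetical.
module ThinkingMacros
  def thinking(effort: nil, budget: nil)
    if effort == :default
      @thinking = nil
      @thinking_explicitly_unset = true
      return nil
    end

    if effort || budget
      @thinking_explicitly_unset = false
      return @thinking = { effort: effort, budget: budget }.compact
    end

    return @thinking if defined?(@thinking) && !@thinking_explicitly_unset
    return nil if @thinking_explicitly_unset

    superclass.thinking if superclass.respond_to?(:thinking)
  end

  def reasoning_effort(value = nil)
    return (thinking && thinking[:effort]) if value.nil?

    if value == :default
      current_budget = thinking && thinking[:budget]
      if current_budget
        @thinking_explicitly_unset = false
        @thinking = { budget: current_budget }
        return nil
      end
      return thinking(effort: :default)
    end

    thinking(effort: value)
  end
end

class BaseStep
  extend ThinkingMacros
  thinking effort: :low, budget: 1_000
end

class InheritingStep < BaseStep; end       # inherits the full hash

class ResetStep < BaseStep                 # clears everything
  thinking effort: :default
end

class BudgetOnlyStep < BaseStep            # clears effort, keeps budget
  reasoning_effort :default
end

class HighEffortStep < BaseStep            # stores effort only
  reasoning_effort :high
end

[InheritingStep, ResetStep, BudgetOnlyStep, HighEffortStep].each do |klass|
  puts "#{klass}: #{klass.thinking.inspect}"
end
```

One detail worth noticing in the code as written: `reasoning_effort :high` on a subclass goes through `thinking(effort: :high)`, so the subclass stores `{ effort: :high }` and does not carry over the parent's budget. Only the `:default` branch of the alias preserves an inherited budget.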
data/lib/ruby_llm/contract/step/limit_checker.rb
CHANGED
|
@@ -22,7 +22,7 @@ module RubyLLM
|
|
|
22
22
|
def collect_limit_errors(estimated)
|
|
23
23
|
errors = []
|
|
24
24
|
if max_input && estimated > max_input
|
|
25
|
-
errors << "Input token limit exceeded: estimated #{estimated} tokens, max #{max_input}"
|
|
25
|
+
errors << "Input token limit exceeded: estimated #{estimated} tokens (heuristic ±30%), max #{max_input}"
|
|
26
26
|
end
|
|
27
27
|
append_cost_error(estimated, errors) if max_cost
|
|
28
28
|
errors
|
|
@@ -46,7 +46,7 @@ module RubyLLM
|
|
|
46
46
|
handle_unknown_pricing(errors)
|
|
47
47
|
elsif estimated_cost > max_cost
|
|
48
48
|
errors << "Cost limit exceeded: estimated $#{format("%.6f", estimated_cost)} " \
|
|
49
|
-
"(#{estimated} input + #{estimated_output} output tokens), " \
|
|
49
|
+
"(#{estimated} input + #{estimated_output} output tokens, heuristic ±30%), " \
|
|
50
50
|
"max $#{format("%.6f", max_cost)}"
|
|
51
51
|
end
|
|
52
52
|
end
|
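For a quick look at what adopters will see after this change, the two refusal strings can be rebuilt with sample numbers. The helper names and the values below are illustrative and not part of the gem; only the string templates are taken from the hunks above.

```ruby
# Rebuilds the two refusal strings from the hunks above with sample numbers.
# The method names and the sample values are illustrative, not gem API.
def input_limit_message(estimated, max_input)
  "Input token limit exceeded: estimated #{estimated} tokens (heuristic ±30%), max #{max_input}"
end

def cost_limit_message(estimated, estimated_output, estimated_cost, max_cost)
  "Cost limit exceeded: estimated $#{format("%.6f", estimated_cost)} " \
    "(#{estimated} input + #{estimated_output} output tokens, heuristic ±30%), " \
    "max $#{format("%.6f", max_cost)}"
end

puts input_limit_message(1_200, 1_000)
puts cost_limit_message(1_200, 400, 0.0123, 0.01)
```

Log parsers or specs that match the 0.7.x wording exactly would need updating, since the suffix sits in the middle of both messages rather than at the end.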
data/lib/ruby_llm/contract/token_estimator.rb
CHANGED
|
@@ -2,12 +2,29 @@
|
|
|
2
2
|
|
|
3
3
|
module RubyLLM
|
|
4
4
|
module Contract
|
|
5
|
+
# Pre-flight token estimation for `max_input` / `max_cost` budget gating.
|
|
6
|
+
#
|
|
7
|
+
# IMPORTANT — heuristic only. This is NOT an accurate tokenizer.
|
|
8
|
+
# The estimate uses a fixed `length / CHARS_PER_TOKEN` ratio:
|
|
9
|
+
#
|
|
10
|
+
# - Accurate to ±30% for English prose with mainstream OpenAI / Anthropic models
|
|
11
|
+
# - Worse for non-English text, code, structured data, and unusual scripts
|
|
12
|
+
# - Useless for models with very different tokenizers (e.g. some open-source models)
|
|
13
|
+
#
|
|
14
|
+
# RubyLLM 1.14 ships no pre-flight tokenizer either; once the API call
|
|
15
|
+
# returns, `RubyLLM::Tokens` provides accurate counts from provider usage
|
|
16
|
+
# data. This estimator is for the *pre-flight refusal* path only — its job
|
|
17
|
+
# is to answer "is this call almost certainly within budget?" with enough
|
|
18
|
+
# accuracy that runaway prompts get caught, while accepting that the
|
|
19
|
+
# boundary cases will be wrong.
|
|
20
|
+
#
|
|
21
|
+
# Refusal messages from `LimitChecker` carry an "(heuristic)" suffix so
|
|
22
|
+
# adopters know the number is estimated, not measured.
|
|
5
23
|
module TokenEstimator
|
|
6
|
-
# Heuristic: ~4 characters per token for English text.
|
|
7
|
-
# This is a rough estimate — actual tokenization varies by model and content.
|
|
8
|
-
# Intentionally conservative (overestimates slightly) to avoid surprise costs.
|
|
9
24
|
CHARS_PER_TOKEN = 4
|
|
10
25
|
|
|
26
|
+
# Heuristic estimate. Returns an integer token count.
|
|
27
|
+
# See module docstring for accuracy caveats.
|
|
11
28
|
def self.estimate(messages)
|
|
12
29
|
return 0 unless messages.is_a?(Array)
|
|
13
30
|
|
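The docstring above describes a fixed `length / CHARS_PER_TOKEN` ratio with roughly ±30% accuracy. The sketch below shows what such a heuristic and its error band look like in practice. It is a guess at the shape, not the gem's `TokenEstimator` body, because the diff only shows the constant and the `Array` guard. The assumptions are that each message is a hash with a `:content` string and that the division rounds up; `TokenEstimateSketch` and `band` are hypothetical names.

```ruby
# A guess at the shape of the heuristic, NOT the gem's TokenEstimator body.
# Assumptions: each message is a hash with a :content string, and the
# division rounds up.
module TokenEstimateSketch
  CHARS_PER_TOKEN = 4

  def self.estimate(messages)
    return 0 unless messages.is_a?(Array)

    chars = messages.sum { |message| message[:content].to_s.length }
    (chars.to_f / CHARS_PER_TOKEN).ceil
  end

  # The ±30% band from the docstring, in integer arithmetic.
  def self.band(estimate)
    [estimate * 70 / 100, estimate * 130 / 100]
  end
end

messages = [
  { role: :system, content: "a" * 600 },
  { role: :user, content: "b" * 400 }
]

estimate = TokenEstimateSketch.estimate(messages)
low, high = TokenEstimateSketch.band(estimate)
puts "estimate=#{estimate}, plausible range #{low}..#{high}"
```

With 1,000 characters the estimate is 250 tokens and the band is 175 to 325. A `max_input` of 300 would therefore admit this request even though the real count could exceed it, which is the boundary-case inaccuracy the docstring accepts.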
data/ruby_llm-contract.gemspec
CHANGED
|
@@ -7,10 +7,11 @@ Gem::Specification.new do |spec|
|
|
|
7
7
|
spec.version = RubyLLM::Contract::VERSION
|
|
8
8
|
spec.authors = ["Justyna"]
|
|
9
9
|
|
|
10
|
-
spec.summary = "
|
|
11
|
-
spec.description = "
|
|
12
|
-
"
|
|
13
|
-
"
|
|
10
|
+
spec.summary = "Contracts + Evals for ruby_llm"
|
|
11
|
+
spec.description = "Wraps RubyLLM::Chat with input/output contracts, business-rule validation, " \
|
|
12
|
+
"retry with model escalation on validation failure, pre-flight cost ceilings, " \
|
|
13
|
+
"and an evaluation framework. Sibling abstraction to RubyLLM::Agent — same " \
|
|
14
|
+
"niche (reusable class-based prompts), wider contract."
|
|
14
15
|
spec.homepage = "https://github.com/justi/ruby_llm-contract"
|
|
15
16
|
spec.license = "MIT"
|
|
16
17
|
spec.required_ruby_version = ">= 3.2.0"
|
|
@@ -30,6 +31,6 @@ Gem::Specification.new do |spec|
|
|
|
30
31
|
spec.require_paths = ["lib"]
|
|
31
32
|
|
|
32
33
|
spec.add_dependency "dry-types", "~> 1.7"
|
|
33
|
-
spec.add_dependency "ruby_llm", "~> 1.
|
|
34
|
+
spec.add_dependency "ruby_llm", "~> 1.12"
|
|
34
35
|
spec.add_dependency "ruby_llm-schema", "~> 0.3"
|
|
35
36
|
end
|
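The practical effect of moving the constraint to `~> 1.12` can be checked with RubyGems' own requirement class. The version strings below are sample inputs chosen for illustration, not a list of actual `ruby_llm` releases.

```ruby
require "rubygems" # normally loaded already; explicit for clarity

requirement = Gem::Requirement.new("~> 1.12")

# Sample version strings for illustration only.
%w[1.11.2 1.12.0 1.14.0 2.0.0].each do |version|
  ok = requirement.satisfied_by?(Gem::Version.new(version))
  puts "ruby_llm #{version}: #{ok ? "allowed" : "rejected"}"
end
```

The pessimistic operator with two segments means `>= 1.12` and `< 2.0`, so anything on the 1.11 line or below is rejected at resolution time and has to be upgraded before this gem can move to 0.8.0.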
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: ruby_llm-contract
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.
|
|
4
|
+
version: 0.8.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Justyna
|
|
@@ -29,14 +29,14 @@ dependencies:
|
|
|
29
29
|
requirements:
|
|
30
30
|
- - "~>"
|
|
31
31
|
- !ruby/object:Gem::Version
|
|
32
|
-
version: '1.
|
|
32
|
+
version: '1.12'
|
|
33
33
|
type: :runtime
|
|
34
34
|
prerelease: false
|
|
35
35
|
version_requirements: !ruby/object:Gem::Requirement
|
|
36
36
|
requirements:
|
|
37
37
|
- - "~>"
|
|
38
38
|
- !ruby/object:Gem::Version
|
|
39
|
-
version: '1.
|
|
39
|
+
version: '1.12'
|
|
40
40
|
- !ruby/object:Gem::Dependency
|
|
41
41
|
name: ruby_llm-schema
|
|
42
42
|
requirement: !ruby/object:Gem::Requirement
|
|
@@ -51,9 +51,10 @@ dependencies:
|
|
|
51
51
|
- - "~>"
|
|
52
52
|
- !ruby/object:Gem::Version
|
|
53
53
|
version: '0.3'
|
|
54
|
-
description:
|
|
55
|
-
|
|
56
|
-
|
|
54
|
+
description: Wraps RubyLLM::Chat with input/output contracts, business-rule validation,
|
|
55
|
+
retry with model escalation on validation failure, pre-flight cost ceilings, and
|
|
56
|
+
an evaluation framework. Sibling abstraction to RubyLLM::Agent — same niche (reusable
|
|
57
|
+
class-based prompts), wider contract.
|
|
57
58
|
executables: []
|
|
58
59
|
extensions: []
|
|
59
60
|
extra_rdoc_files: []
|
|
@@ -205,5 +206,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
|
205
206
|
requirements: []
|
|
206
207
|
rubygems_version: 3.6.7
|
|
207
208
|
specification_version: 4
|
|
208
|
-
summary:
|
|
209
|
+
summary: Contracts + Evals for ruby_llm
|
|
209
210
|
test_files: []
|