open_router_enhanced 2.0.1 → 2.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -0
- data/Gemfile.lock +1 -1
- data/README.md +90 -0
- data/Rakefile +24 -14
- data/docs/superpowers/plans/2026-06-27-openrouter-routing-features.md +913 -0
- data/docs/superpowers/specs/2026-06-27-openrouter-routing-features-design.md +179 -0
- data/examples/dynamic_model_switching_example.rb +0 -0
- data/examples/model_selection_example.rb +0 -0
- data/examples/prompt_template_example.rb +0 -0
- data/examples/real_world_schemas_example.rb +0 -0
- data/examples/responses_api_example.rb +0 -0
- data/examples/smart_completion_example.rb +0 -0
- data/examples/structured_outputs_example.rb +0 -0
- data/examples/tool_calling_example.rb +0 -0
- data/examples/tool_loop_example.rb +0 -0
- data/lib/open_router/callbacks.rb +50 -0
- data/lib/open_router/client.rb +12 -576
- data/lib/open_router/json_healer.rb +1 -1
- data/lib/open_router/model_registry.rb +24 -6
- data/lib/open_router/model_selector.rb +7 -7
- data/lib/open_router/parameter_builder.rb +120 -0
- data/lib/open_router/request_handler.rb +98 -0
- data/lib/open_router/response.rb +13 -120
- data/lib/open_router/response_parsing.rb +107 -0
- data/lib/open_router/routing.rb +80 -0
- data/lib/open_router/streaming_client.rb +1 -1
- data/lib/open_router/subagent_tool.rb +51 -0
- data/lib/open_router/tool_serializer.rb +164 -0
- data/lib/open_router/version.rb +1 -1
- data/lib/open_router.rb +14 -0
- metadata +11 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b6c9c14171242103eaeab8219521180242f9e0ee0968c739f8db2148a17423a5
+  data.tar.gz: '0906f33ab027e8cbf17ff60ab120ec65de39ec689c6e9a48025fb8df679f7d55'
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4cd127d4d6889e281e88e3f044f2444bd32e46f7ac4797d5c786b9a3fc5a8f792baf445c8dea481ab5018fae72646a221da05266bcb4134266735546e35a428d
+  data.tar.gz: a5b80c88d5f2228f1891409d2edb335a1a2c0294b94ffc69fb2730d0a854f7010c5eece8ac026eb6ba99a60df583634f42c77e18aada4b0399886b6a744a3488
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,23 @@
 ## [Unreleased]
 
+## [2.2.0] - 2026-06-28
+
+### Added
+
+- **`Routing` mixin** (`OpenRouter::Routing`) included in `Client`, providing two new meta-routing methods:
+  - `pareto_complete(messages, min_coding_score: nil, **opts)` — routes to the cheapest model meeting a configurable quality bar via OpenRouter's Pareto Code Router (`openrouter/pareto-code`). `min_coding_score` is validated to `0.0–1.0`.
+  - `fuse(messages, analysis_models: nil, judge: nil, preset: nil, max_tool_calls: nil, **opts)` — fans a prompt out to a panel of models and synthesises one answer via OpenRouter's Fusion router (`openrouter/fusion`). `analysis_models` (1–8) and `max_tool_calls` (1–16) are validated.
+- **`SubagentTool`** (`OpenRouter::SubagentTool`) — wraps OpenRouter's `openrouter:subagent` server tool so an orchestrator model can delegate self-contained subtasks to a cheaper worker model mid-generation. Constructor: `model:` (required worker model) plus optional `instructions:`, `max_completion_tokens:`, `temperature:`, and `reasoning:`. Pass it via the normal `tools:` array to `complete`.
+- **`Response#selected_model`** — alias for `#model`; returns the concrete model OpenRouter resolved for routing responses (e.g. Pareto, Auto, Fusion).
+
+### Changed
+
+- Capability warning / strict-mode guards now exempt all `openrouter/`-prefixed meta-models (previously only `openrouter/auto` was exempt); this prevents spurious warnings or `CapabilityError` when using `pareto_complete` or `fuse` with tools or structured outputs.
+
+### Notes
+
+- These three OpenRouter platform features are still evolving server-side. The gem builds and validates the requests; routing/synthesis/delegation behaviour is performed by OpenRouter. Fusion fans out to every panel model plus a judge, so it costs roughly 4–5× a single completion. `pareto_complete` may resolve to a reasoning model that consumes a small `max_tokens` budget entirely on reasoning (returning `nil` content with `finish_reason: "length"`) — budget `max_tokens` accordingly.
+
 ## [2.0.0] - 2025-12-28
 
 ### Overview
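The changelog note about reasoning models spending a small `max_tokens` budget entirely on reasoning suggests a simple retry pattern. The sketch below is not part of the gem: `FakeResponse`, `budget_exhausted?`, and `complete_with_growing_budget` are hypothetical names, and the block stands in for a real call such as `client.pareto_complete(messages, max_tokens: budget)`. It assumes the response object exposes `content` and `finish_reason` readers, which this diff does not confirm.

```ruby
# Hypothetical stand-in for a response object; the gem's Response is only
# assumed to expose similar readers.
FakeResponse = Struct.new(:content, :finish_reason, keyword_init: true)

# True when the model ran out of tokens before producing visible content.
def budget_exhausted?(response)
  response.content.nil? && response.finish_reason == "length"
end

# Yields a growing max_tokens budget until the block returns a response with
# content, or the ceiling is reached. Returns the last response either way.
def complete_with_growing_budget(start: 512, ceiling: 8192)
  budget = start
  loop do
    response = yield(budget)
    return response unless budget_exhausted?(response)
    return response if budget >= ceiling

    budget = [budget * 2, ceiling].min
  end
end

# Demo with a fake model that needs at least 2000 tokens to answer.
attempts = []
result = complete_with_growing_budget(start: 512, ceiling: 4096) do |budget|
  attempts << budget
  if budget >= 2000
    FakeResponse.new(content: "done", finish_reason: "stop")
  else
    FakeResponse.new(content: nil, finish_reason: "length")
  end
end

puts attempts.inspect
puts result.content
```

Doubling keeps the number of retries small, but each failed attempt is still billed, so starting with a realistic budget is usually cheaper than relying on retries.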
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -45,6 +45,7 @@ The [OpenRouter API](https://openrouter.ai/docs) is a single unified interface f
 - [Tool Calling](#tool-calling)
 - [Structured Outputs](#structured-outputs)
 - [Smart Model Selection](#smart-model-selection)
+- [Routing (Pareto & Fusion)](#routing-pareto--fusion)
 - [Prompt Templates](#prompt-templates)
 - [Streaming](#streaming)
 - [Usage Tracking](#usage-tracking)
@@ -383,6 +384,95 @@ models = OpenRouter::ModelSelector.new
 
 **[Complete Model Selection Documentation](docs/model_selection.md)**
 
+### Routing (Pareto & Fusion)
+
+OpenRouter offers two meta-routing modes that automatically pick or synthesize answers across models.
+
+#### Pareto Code Router
+
+Routes each request to the cheapest model that meets a configurable quality bar — useful when you want cost-optimised code completions without picking a specific model.
+
+```ruby
+# Cheapest model meeting default quality threshold
+response = client.pareto_complete([
+  { role: "user", content: "Write a binary search in Ruby" }
+])
+
+# Require a higher quality bar (0.0–1.0, higher = better)
+response = client.pareto_complete(
+  [{ role: "user", content: "Implement a red-black tree" }],
+  min_coding_score: 0.8,
+  max_tokens: 1000
+)
+
+# Which model actually answered?
+puts response.selected_model # => "anthropic/claude-3.5-haiku"
+puts response.content
+```
+
+#### Fusion Router
+
+Fans a prompt out to a panel of models in parallel, then synthesises one answer with a judge model. Costs roughly 4–5× a single completion but can outperform any individual model.
+
+```ruby
+# Default panel (OpenRouter chooses)
+response = client.fuse([
+  { role: "user", content: "What is the best approach to distributed consensus?" }
+])
+
+# Custom panel + explicit judge
+response = client.fuse(
+  [{ role: "user", content: "Review this architecture" }],
+  analysis_models: [
+    "anthropic/claude-3.5-sonnet",
+    "openai/gpt-4o",
+    "google/gemini-2.0-flash-001"
+  ],
+  judge: "anthropic/claude-opus-4-5",
+  max_tokens: 2000
+)
+
+# Curated preset panels
+response = client.fuse(messages, preset: "general-budget")
+
+# selected_model reports the synthesis/judge model that produced the answer,
+# e.g. "anthropic/claude-opus-4-5" — not the "openrouter/fusion" router alias.
+puts response.selected_model
+puts response.content
+```
+
+> **Note:** Fusion fans out to every panel model plus a judge, so it costs roughly 4–5× a single completion. `min_coding_score` for Pareto is validated to `0.0–1.0`; `analysis_models` (1–8) and `max_tool_calls` (1–16) for Fusion are validated client-side.
+
+#### `SubagentTool`
+
+Wraps OpenRouter's built-in `openrouter:subagent` server tool so an LLM can spawn its own sub-completions during a tool-calling loop.
+
+```ruby
+subagent = OpenRouter::SubagentTool.new(
+  model: "anthropic/claude-3.5-haiku", # required: the cheaper worker model
+  instructions: "Complete the task exactly as described. Be concise.", # optional
+  max_completion_tokens: 512 # optional (also: temperature:, reasoning:)
+)
+
+response = client.complete(
+  [{ role: "user", content: "Summarize the attached changelog into release notes." }],
+  model: "openai/gpt-4o",
+  tools: [subagent],
+  tool_choice: "auto"
+)
+```
+
+> The orchestrator decides whether to delegate. The gem's job is to build and send a valid `openrouter:subagent` tool; OpenRouter runs the worker server-side and feeds its result back into the orchestrator's generation.
+
+#### `Response#selected_model`
+
+All routing methods (`complete`, `pareto_complete`, `fuse`) return a `Response` object. Use `#selected_model` (alias for `#model`) to see which model OpenRouter ultimately used:
+
+```ruby
+response = client.pareto_complete(messages)
+puts response.selected_model # e.g. "mistralai/codestral-2501"
+```
+
 ### Prompt Templates
 
 Create reusable, parameterized prompts with variable interpolation.
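The README note says `min_coding_score`, `analysis_models`, and `max_tool_calls` are validated client-side. The gem's actual validation code lives in `lib/open_router/routing.rb`, which this diff does not show, so the following is only a minimal sketch of what such range checks could look like. The module name `RoutingValidationSketch`, the method names, and the error messages are all hypothetical.

```ruby
# Hypothetical helper module; names and messages are illustrative only and
# are not taken from the gem.
module RoutingValidationSketch
  module_function

  # min_coding_score is optional, but when given must lie in 0.0..1.0.
  def validate_min_coding_score!(score)
    return if score.nil?
    return if score.is_a?(Numeric) && (0.0..1.0).cover?(score)

    raise ArgumentError, "min_coding_score must be between 0.0 and 1.0, got #{score.inspect}"
  end

  # analysis_models is optional, but when given must list 1 to 8 models.
  def validate_analysis_models!(models)
    return if models.nil?
    return if models.is_a?(Array) && (1..8).cover?(models.size)

    raise ArgumentError, "analysis_models must contain 1 to 8 models"
  end

  # max_tool_calls is optional, but when given must be an integer in 1..16.
  def validate_max_tool_calls!(count)
    return if count.nil?
    return if count.is_a?(Integer) && (1..16).cover?(count)

    raise ArgumentError, "max_tool_calls must be an integer between 1 and 16"
  end
end

RoutingValidationSketch.validate_min_coding_score!(0.8)            # passes
RoutingValidationSketch.validate_analysis_models!(%w[a/one b/two]) # passes

begin
  RoutingValidationSketch.validate_max_tool_calls!(17)
rescue ArgumentError => e
  puts e.message
end
```

Failing fast with `ArgumentError` before any HTTP request is sent avoids paying for a round trip that the server would reject anyway.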
data/Rakefile
CHANGED
@@ -30,6 +30,16 @@ task ci: %i[spec_all rubocop]
 
 # Model exploration tasks
 namespace :models do
+  desc "Fetch fresh model data from OpenRouter API and update local cache"
+  task :update do
+    require_relative "lib/open_router"
+
+    print "Fetching models from OpenRouter API..."
+    OpenRouter::ModelRegistry.refresh!
+    count = OpenRouter::ModelRegistry.all_models.size
+    puts " done. #{count} models cached."
+  end
+
   desc "Display summary of available models"
   task :summary do
     require_relative "lib/open_router"
@@ -59,18 +69,18 @@ namespace :models do
     end
 
     # Cost analysis
-    input_costs = models.values.map { |spec| spec[:
-    output_costs = models.values.map { |spec| spec[:
+    input_costs = models.values.map { |spec| spec[:cost_per_token][:input] }.compact.sort
+    output_costs = models.values.map { |spec| spec[:cost_per_token][:output] }.compact.sort
 
-    puts "\n💰 Cost Analysis (per
+    puts "\n💰 Cost Analysis (per million tokens):"
     puts " Input tokens:"
-    puts " Min: $#{format("%.
-    puts " Max: $#{format("%.
-    puts " Median: $#{format("%.
+    puts " Min: $#{format("%.4f", input_costs.min * 1_000_000)}"
+    puts " Max: $#{format("%.4f", input_costs.max * 1_000_000)}"
+    puts " Median: $#{format("%.4f", input_costs[input_costs.size / 2] * 1_000_000)}"
     puts " Output tokens:"
-    puts " Min: $#{format("%.
-    puts " Max: $#{format("%.
-    puts " Median: $#{format("%.
+    puts " Min: $#{format("%.4f", output_costs.min * 1_000_000)}"
+    puts " Max: $#{format("%.4f", output_costs.max * 1_000_000)}"
+    puts " Median: $#{format("%.4f", output_costs[output_costs.size / 2] * 1_000_000)}"
 
     # Context length analysis
     context_lengths = models.values.map { |spec| spec[:context_length] }.compact.sort
@@ -269,8 +279,8 @@ namespace :models do
   def self.display_model_info(model_id, specs, index)
     puts "#{(index + 1).to_s.rjust(3)}. #{model_id}"
     puts " Name: #{specs[:name]}" if specs[:name]
-
-
+    cpm = OpenRouter::ModelRegistry.cost_per_million(model_id)
+    puts " Cost: $#{format("%.4f", cpm[:input])}/M input, $#{format("%.4f", cpm[:output])}/M output"
     puts " Context: #{format_number_with_commas(specs[:context_length])} tokens"
     puts " Capabilities: #{specs[:capabilities].join(", ")}"
     puts " Tier: #{specs[:performance_tier]}"
@@ -318,17 +328,17 @@ namespace :models do
   def self.sort_by_strategy(candidates, strategy)
     case strategy
     when :cost
-      candidates.sort_by { |_, specs| specs[:
+      candidates.sort_by { |_, specs| specs[:cost_per_token][:input] }
     when :performance
       candidates.sort_by do |_, specs|
-        [specs[:performance_tier] == :premium ? 0 : 1, specs[:
+        [specs[:performance_tier] == :premium ? 0 : 1, specs[:cost_per_token][:input]]
       end
     when :latest
       candidates.sort_by { |_, specs| -(specs[:created_at] || 0).to_i }
     when :context
       candidates.sort_by { |_, specs| -(specs[:context_length] || 0).to_i }
     else
-      candidates.sort_by { |_, specs| specs[:cost_per_token][:input] }
+      candidates.sort_by { |_, specs| specs[:cost_per_token][:input] }
     end
   end
 end
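The updated cost analysis multiplies per-token prices by `1_000_000` and reads the median as `costs[costs.size / 2]`, which picks the upper of the two middle elements when the sorted list has an even length. The sketch below illustrates that conversion and contrasts the indexing approach with a median that averages the two middle values. The price list and the helper names `per_million` and `median` are hypothetical and not taken from the gem.

```ruby
# Hypothetical per-token prices in USD; not real model data.
PER_TOKEN_INPUT_COSTS = [0.000003, 0.00000015, 0.0000025, 0.00001].freeze

# Converts a per-token price to a per-million-token price.
def per_million(cost_per_token)
  cost_per_token * 1_000_000
end

# Median that averages the two middle values for even-sized lists.
def median(values)
  sorted = values.compact.sort
  return nil if sorted.empty?

  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

sorted_costs = PER_TOKEN_INPUT_COSTS.sort
upper_middle = sorted_costs[sorted_costs.size / 2] # what indexing at size / 2 selects

puts "Min:           $#{format('%.4f', per_million(sorted_costs.min))}"
puts "Max:           $#{format('%.4f', per_million(sorted_costs.max))}"
puts "Upper middle:  $#{format('%.4f', per_million(upper_middle))}"
puts "Median (avg):  $#{format('%.4f', per_million(median(PER_TOKEN_INPUT_COSTS)))}"
```

For a quick summary task the upper-middle value is usually close enough; the averaged median only matters when the two middle prices differ noticeably.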