open_router_enhanced 2.2.0 → 2.2.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +0 -18
- data/Gemfile.lock +1 -1
- data/README.md +0 -90
- data/lib/open_router/callbacks.rb +1 -1
- data/lib/open_router/client.rb +12 -4
- data/lib/open_router/completion_options.rb +15 -7
- data/lib/open_router/json_healer.rb +1 -1
- data/lib/open_router/request_handler.rb +5 -4
- data/lib/open_router/response.rb +16 -14
- data/lib/open_router/schema.rb +28 -12
- data/lib/open_router/streaming_client.rb +1 -1
- data/lib/open_router/tool_serializer.rb +27 -39
- data/lib/open_router/version.rb +1 -1
- data/lib/open_router.rb +7 -24
- metadata +2 -6
- data/docs/superpowers/plans/2026-06-27-openrouter-routing-features.md +0 -913
- data/docs/superpowers/specs/2026-06-27-openrouter-routing-features-design.md +0 -179
- data/lib/open_router/routing.rb +0 -80
- data/lib/open_router/subagent_tool.rb +0 -51
data/docs/superpowers/specs/2026-06-27-openrouter-routing-features-design.md
DELETED
@@ -1,179 +0,0 @@
-# Design: OpenRouter Routing & Delegation Features
-
-**Date:** 2026-06-27
-**Gem:** `open_router_enhanced`
-**Target version:** 2.2.0
-**Status:** Approved — ready for implementation plan
-
-## Goal
-
-Add first-class, validated, ergonomic access to three OpenRouter platform features
-that currently must be hand-rolled as raw `plugins:`/`tools:` hashes:
-
-1. **Fusion** (`openrouter/fusion`) — fan a prompt out to a panel of models in
-   parallel and synthesize one answer via a judge model.
-2. **Subagent server tool** (`openrouter:subagent`) — let an orchestrator model
-   delegate self-contained subtasks mid-generation to a cheaper worker model.
-3. **Pareto Code Router** (`openrouter/pareto-code`) — set `min_coding_score` and
-   route to the cheapest code-capable model clearing that bar.
-
-These features are **still evolving on OpenRouter's side**. The goal is reliable
-*access and use* with solid error handling — not a comprehensive DSL. A richer DSL
-may follow later; this design deliberately keeps the surface small and additive.
-
-## Non-goals (v1 / YAGNI)
-
-- No fluent/builder DSL (a hybrid helper+options surface was chosen; DSL may come later).
-- No special streaming handling — these flow through `complete`'s existing `stream:`
-  param as pass-through, not specially tested in v1.
-- No automatic fusion cost-guardrail. Fusion costs ~4–5× a single completion; this is
-  documented, and the existing `:after_response` callback already lets callers meter spend.
-- No typed parsing of the Fusion judge's internal JSON (consensus/contradictions/etc.) —
-  OpenRouter does not publish a stable caller-facing schema for it.
-
-## Architecture
-
-The gem already serializes all three features correctly through existing plumbing
-(`CompletionOptions` → `ParameterBuilder#prepare_base_parameters` → `plugins`/`tools`/
-`model`). Nothing in `complete()`'s core flow changes. The work is a thin ergonomic +
-validation layer, mirroring how `smart_complete` wraps `complete`.
-
-Two new units + a minimal response surface:
-
-### 1. `OpenRouter::Routing` (mixin, included in `Client`)
-
-File: `lib/open_router/routing.rb`. Included in `Client` alongside the existing mixins.
-
-```ruby
-FUSION_MODEL = "openrouter/fusion"
-PARETO_CODE_MODEL = "openrouter/pareto-code"
-
-# Fusion
-def fuse(messages, analysis_models: nil, judge: nil, preset: nil,
-         max_tool_calls: nil, **opts)
-  # validate, build plugin hash (compact), delegate to complete
-  # model: FUSION_MODEL, plugins: [{ id: "fusion", ... }.compact]
-end
-
-# Pareto code router
-def pareto_complete(messages, min_coding_score: nil, **opts)
-  # validate, build plugin, delegate
-  # model: PARETO_CODE_MODEL, plugins: [{ id: "pareto-router", min_coding_score: }.compact]
-end
-```
-
-- Both accept the full `**opts` / `CompletionOptions` surface, so `temperature:`,
-  `session_id:`, callbacks, etc. compose normally.
-- If a caller passes a `plugins:` of their own, the fusion/pareto plugin is **merged**
-  into it (not clobbered), de-duped by `id`.
-
-**Validation (fail-fast, raises `ArgumentError`):**
-- `analysis_models`: array of 1–8 model id strings when present.
-- `max_tool_calls`: integer 1–16 when present.
-- `min_coding_score`: numeric 0.0–1.0 when present.
-- `judge`/`preset`: passed through as strings/symbols; no hard validation (server-side,
-  in flux).
-
-### 2. `OpenRouter::SubagentTool < OpenRouter::Tool`
-
-File: `lib/open_router/subagent_tool.rb`. Subclasses `Tool` so it passes the existing
-`serialize_tools` `when Tool` branch, but overrides construction/validation/`to_h` to
-emit the server-tool shape instead of the `function` shape:
-
-```ruby
-sub = OpenRouter::SubagentTool.new(
-  model: "z-ai/glm-5.2",          # required; pins the worker model
-  instructions: "Be concise.",    # optional
-  max_completion_tokens: 1024,    # optional
-  temperature: 0.2,               # optional
-  reasoning: { effort: "low" })   # optional
-
-sub.to_h
-# => { type: "openrouter:subagent",
-#      parameters: { model:, instructions:, max_completion_tokens:, temperature:, reasoning: }.compact }
-```
-
-- `model:` is **required** → `ArgumentError` if missing/blank.
-- Used via the normal path: `client.complete(messages, model: orchestrator, tools: [sub])`.
-- `serialize_tools` needs no change if `SubagentTool` is a `Tool` subclass; we will add a
-  focused spec asserting it serializes correctly through `complete`.
-
-### 3. Response surface (minimal, cassette-driven)
-
-Small additions to `Response` (`lib/open_router/response.rb`):
-
-- `selected_model` — the concrete model the router actually resolved, read from the
-  response `model` field. Useful for pareto / auto / fusion ("which model answered?").
-- `router` and any subagent-delegation metadata — added **only if** the first real
-  cassette shows the API returns them. We do not invent fields the live API doesn't emit.
-
-## Error handling (explicit requirement)
-
-Because these endpoints are in flux, error paths are first-class and tested:
-
-- **Client-side validation errors** raise `ArgumentError` *before* any HTTP call
-  (bad `min_coding_score`, empty/oversized `analysis_models`, missing subagent `model`).
-- **API rejections** (e.g. a 400 when a plugin field name has drifted server-side) surface
-  through the gem's existing `ServerError`/error handling and `:on_error` callback,
-  unchanged. We add a VCR spec that records a real error response (e.g. an invalid
-  `min_coding_score` or malformed plugin) and asserts the gem raises/propagates cleanly
-  rather than returning a malformed `Response`.
-- **Subagent runtime errors**: the server tool may return
-  `{ "status": "error", "task_name": ..., "error": ... }`. The orchestrator's final
-  message still returns normally; we assert the completion succeeds and document that
-  per-delegation errors are surfaced in the model's own output (not raised), since the
-  server tool runs server-side.
-- **Fusion partial-panel failure**: documented as handled by OpenRouter server-side
-  (judge synthesizes from whoever succeeded); no special client handling. Covered by the
-  happy-path cassette returning a synthesized answer.
-
-## Testing strategy (TDD, red → green, per feature)
-
-For **each** feature, two layers:
-
-**(a) Unit / contract spec** (mocks `post`, asserts the exact `parameters:` hash built) —
-the "internal public method spec," modeled on the existing `#complete` spec in
-`spec/open_router_spec.rb`:
-- `spec/routing_spec.rb` — `fuse` and `pareto_complete` build correct params; validation
-  raises on bad input; plugins merge rather than clobber.
-- `spec/subagent_tool_spec.rb` — `SubagentTool#to_h` shape; required-model validation;
-  serializes correctly when passed to `complete` (mocked `post`).
-
-**(b) VCR integration spec** (real cassette against the live API):
-- `spec/vcr/fusion_spec.rb`, `spec/vcr/pareto_spec.rb`, `spec/vcr/subagent_spec.rb`
-- One happy-path cassette each + at least one **error cassette** (e.g. invalid score / bad
-  plugin field) to lock real error behavior.
-- Recorded with `VCR_RECORD_NEW=1` (never delete-and-rerecord — per project memory on the
-  shared mutable on-disk model cache).
-- **Use cheap models** for all recordings (e.g. budget panel members and a budget judge for
-  fusion; a cheap worker for subagent). Never the most expensive frontier models.
-
-The integration record run is the source of truth that pins the doc-flagged uncertain
-field names (`analysis_models` vs `models`, `min_coding_score` placement, subagent result
-payload shape). If the live API disagrees with the docs, the unit spec's expected hash is
-corrected to match reality — the cassette wins.
-
-## Files touched
-
-New:
-- `lib/open_router/routing.rb`
-- `lib/open_router/subagent_tool.rb`
-- `spec/routing_spec.rb`, `spec/subagent_tool_spec.rb`
-- `spec/vcr/fusion_spec.rb`, `spec/vcr/pareto_spec.rb`, `spec/vcr/subagent_spec.rb`
-- new cassettes under `spec/fixtures/vcr_cassettes/`
-
-Modified:
-- `lib/open_router/client.rb` — `require_relative` + `include OpenRouter::Routing`
-- `lib/open_router/response.rb` — `selected_model` (+ cassette-driven extras)
-- `lib/open_router.rb` — `require` the new files if not autoloaded
-- `lib/open_router/version.rb` — bump to 2.2.0
-- docs (README / feature docs) — usage + cost notes (after green)
-
-## Acceptance criteria
-
-- `fuse`, `pareto_complete`, and `SubagentTool` work end-to-end against real recorded
-  cassettes using cheap models.
-- Unit specs assert exact request shapes and validation behavior.
-- Error paths (validation + real API error) are tested and behave predictably.
-- Full suite green in CI (`:none` record mode replays cassettes).
-- Version bumped to 2.2.0; docs updated.
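Reviewer note: the three request shapes the deleted design document describes can be sketched as plain hashes, which makes it easier to see what the removed helpers were meant to send. This is a minimal sketch, not the gem's code. The module and method names (`RequestShapeSketch`, `fusion_body`, `pareto_body`, `subagent_body`) and the `example/...` model ids are hypothetical, the top-level `messages` key is assumed from the usual chat-completions layout, and the plugin field names follow the design text, which itself flags them as unconfirmed against the live API.

```ruby
require "json"

# Hypothetical helpers that only build request bodies; nothing is sent.
module RequestShapeSketch
  module_function

  # Fusion: one plugin entry carrying the panel and (optionally) the judge.
  # The deleted routing.rb sends the judge under the plugin key `model`.
  def fusion_body(messages, analysis_models: nil, judge: nil, preset: nil, max_tool_calls: nil)
    plugin = { id: "fusion", analysis_models: analysis_models, model: judge,
               preset: preset, max_tool_calls: max_tool_calls }.compact
    { model: "openrouter/fusion", messages: messages, plugins: [plugin] }
  end

  # Pareto code router: the score is omitted entirely when not given.
  def pareto_body(messages, min_coding_score: nil)
    plugin = { id: "pareto-router", min_coding_score: min_coding_score }.compact
    { model: "openrouter/pareto-code", messages: messages, plugins: [plugin] }
  end

  # Subagent: a server tool entry in `tools`, not a plugin.
  def subagent_body(messages, orchestrator:, worker:)
    tool = { type: "openrouter:subagent", parameters: { model: worker } }
    { model: orchestrator, messages: messages, tools: [tool] }
  end
end

messages = [{ role: "user", content: "Summarize this diff." }]
puts JSON.pretty_generate(RequestShapeSketch.pareto_body(messages, min_coding_score: 0.8))
```

Keeping the builders free of HTTP calls mirrors the "unit / contract spec" layer in the design: the exact hash can be asserted without a network round trip, and a recorded cassette can later correct any field name that turns out to differ.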
data/lib/open_router/routing.rb
DELETED
@@ -1,80 +0,0 @@
-# frozen_string_literal: true
-
-module OpenRouter
-  # Mixin providing ergonomic access to OpenRouter router/meta-model features
-  # (Fusion, Pareto Code Router). Builds the right model alias + plugin config
-  # and delegates to Client#complete.
-  module Routing
-    FUSION_MODEL = "openrouter/fusion"
-    PARETO_CODE_MODEL = "openrouter/pareto-code"
-
-    # Route to the cheapest code-capable model meeting a quality bar.
-    #
-    # @param min_coding_score [Float, nil] 0.0–1.0 (1.0 = best). Optional.
-    def pareto_complete(messages, min_coding_score: nil, **opts)
-      validate_min_coding_score!(min_coding_score)
-
-      plugin = { id: "pareto-router" }
-      plugin[:min_coding_score] = min_coding_score unless min_coding_score.nil?
-
-      kwargs = merge_plugin(opts, plugin)
-      complete(messages, model: PARETO_CODE_MODEL, **kwargs)
-    end
-
-    # Fan a prompt out to a panel of models and synthesize one answer.
-    # NOTE: Fusion costs ~4–5x a single completion (panel calls + judge).
-    #
-    # @param analysis_models [Array<String>, nil] 1–8 panel model ids.
-    # @param judge [String, nil] synthesis model id (defaults to the fusion model server-side).
-    # @param preset [String, Symbol, nil] curated panel slug (e.g. "general-budget").
-    # @param max_tool_calls [Integer, nil] 1–16.
-    def fuse(messages, analysis_models: nil, judge: nil, preset: nil, max_tool_calls: nil, **opts)
-      validate_analysis_models!(analysis_models)
-      validate_max_tool_calls!(max_tool_calls)
-
-      plugin = {
-        id: "fusion",
-        analysis_models: analysis_models,
-        model: judge, # OpenRouter Fusion plugin field is 'model', not 'judge'
-        preset: preset&.to_s,
-        max_tool_calls: max_tool_calls
-      }.compact
-
-      kwargs = merge_plugin(opts, plugin)
-      complete(messages, model: FUSION_MODEL, **kwargs)
-    end
-
-    private
-
-    def validate_min_coding_score!(score)
-      return if score.nil?
-
-      return if score.is_a?(Numeric) && score >= 0.0 && score <= 1.0
-
-      raise ArgumentError, "min_coding_score must be a number between 0.0 and 1.0 (got #{score.inspect})"
-    end
-
-    def validate_analysis_models!(models)
-      return if models.nil?
-
-      return if models.is_a?(Array) && (1..8).cover?(models.size) && models.all? { |m| m.is_a?(String) && !m.strip.empty? }
-
-      raise ArgumentError, "analysis_models must be an array of 1–8 model id strings (got #{models.inspect})"
-    end
-
-    def validate_max_tool_calls!(value)
-      return if value.nil?
-
-      return if value.is_a?(Integer) && (1..16).cover?(value)
-
-      raise ArgumentError, "max_tool_calls must be an integer between 1 and 16 (got #{value.inspect})"
-    end
-
-    # Merge a router plugin into any caller-supplied plugins, de-duped by :id.
-    def merge_plugin(opts, plugin)
-      existing = Array(opts[:plugins]).map { |p| p.transform_keys(&:to_sym) }
-      existing = existing.reject { |p| p[:id].to_s == plugin[:id].to_s }
-      opts.merge(plugins: existing + [plugin])
-    end
-  end
-end
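Reviewer note: the merge-and-de-dup step in the removed `merge_plugin` is the least obvious part of `routing.rb`, so a standalone copy may help when checking callers that relied on it. The sketch below lifts the same three lines into a free function (`merge_plugin_sketch`, a hypothetical name) so it runs without the gem. The `"web"` plugin id and the option values are illustrative only.

```ruby
# Standalone copy of the merge step, for illustration outside the mixin.
def merge_plugin_sketch(opts, plugin)
  # Normalize caller-supplied plugin hashes to symbol keys.
  existing = Array(opts[:plugins]).map { |p| p.transform_keys(&:to_sym) }
  # Drop any caller entry with the same id as the router plugin.
  existing = existing.reject { |p| p[:id].to_s == plugin[:id].to_s }
  # Return a new options hash; the router plugin goes last.
  opts.merge(plugins: existing + [plugin])
end

caller_opts = {
  temperature: 0.2,
  plugins: [
    { "id" => "web" },                              # string keys become symbols
    { id: "pareto-router", min_coding_score: 0.1 }  # same id, so it is replaced
  ]
}

merged = merge_plugin_sketch(caller_opts, { id: "pareto-router", min_coding_score: 0.8 })
p merged[:plugins]
```

Two properties are worth noting. `Hash#merge` returns a new hash, so the caller's options are not mutated. And `Array()` applied to a single Hash yields key/value pairs rather than a one-element array, so this sketch, like the code it copies, expects `plugins:` to be passed as an array.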
data/lib/open_router/subagent_tool.rb
DELETED
@@ -1,51 +0,0 @@
-# frozen_string_literal: true
-
-require_relative "tool"
-
-module OpenRouter
-  # Represents the `openrouter:subagent` server tool, which lets an orchestrator
-  # model delegate self-contained subtasks to a cheaper worker model mid-generation.
-  #
-  # Unlike a function Tool, it serializes to the server-tool shape:
-  #   { type: "openrouter:subagent", parameters: { model:, instructions:, ... } }
-  #
-  # @example
-  #   sub = OpenRouter::SubagentTool.new(model: "z-ai/glm-5.2", instructions: "Be concise.")
-  #   client.complete(messages, model: "anthropic/claude-3.5-sonnet", tools: [sub])
-  class SubagentTool < Tool
-    SERVER_TOOL_TYPE = "openrouter:subagent"
-
-    # We deliberately do not call super: Tool#initialize expects a function
-    # definition with a name/description and validates it, neither of which a
-    # server tool has. The server-tool shape is built directly here instead.
-    def initialize(model:, instructions: nil, max_completion_tokens: nil, # rubocop:disable Lint/MissingSuper
-                   temperature: nil, reasoning: nil)
-      raise ArgumentError, "model is required for SubagentTool" if model.nil? || model.to_s.strip.empty?
-
-      @type = SERVER_TOOL_TYPE
-      @parameters_config = {
-        model: model,
-        instructions: instructions,
-        max_completion_tokens: max_completion_tokens,
-        temperature: temperature,
-        reasoning: reasoning
-      }.compact
-    end
-
-    def to_h
-      { type: @type, parameters: @parameters_config }
-    end
-
-    def name
-      @type
-    end
-
-    def description
-      "OpenRouter subagent server tool (worker: #{@parameters_config[:model]})"
-    end
-
-    def parameters
-      nil
-    end
-  end
-end
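Reviewer note: for anyone who needs the server-tool shape after this removal, the serialization in the deleted class is small enough to reproduce. The following is a minimal stand-in, assuming no `Tool` superclass is available; `SubagentToolSketch` is a hypothetical name, and passing a plain hash in `tools:` is an assumption about what callers would do, not something this diff confirms.

```ruby
require "json"

# Stand-in for the removed class, without the Tool superclass, so the
# snippet runs without the gem installed. The class name is hypothetical.
class SubagentToolSketch
  SERVER_TOOL_TYPE = "openrouter:subagent"

  def initialize(model:, instructions: nil, max_completion_tokens: nil,
                 temperature: nil, reasoning: nil)
    # Fail fast, before any request is built, when the worker model is blank.
    raise ArgumentError, "model is required" if model.nil? || model.to_s.strip.empty?

    # Unset options are dropped so they never reach the request body.
    @parameters_config = {
      model: model,
      instructions: instructions,
      max_completion_tokens: max_completion_tokens,
      temperature: temperature,
      reasoning: reasoning
    }.compact
  end

  # Server-tool shape: a type plus parameters, with no `function` key.
  def to_h
    { type: SERVER_TOOL_TYPE, parameters: @parameters_config }
  end
end

sub = SubagentToolSketch.new(model: "z-ai/glm-5.2", instructions: "Be concise.")
puts JSON.generate(sub.to_h)

begin
  SubagentToolSketch.new(model: "  ")
rescue ArgumentError => e
  warn "rejected before any HTTP call: #{e.message}"
end
```

The validation runs in the constructor, which matches the design's requirement that a missing worker model raises `ArgumentError` before any HTTP call is made.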