legion-llm 0.9.22 → 0.9.28

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 428a14e141f5cbbb278e05f49fd198ef13f6e789727037c90154a855b76a8b34
4
- data.tar.gz: 8dc2aea0cd776675aad1c8ff198b35f0eba573e4a37c6e2bcdc0b6dfbbb7210b
3
+ metadata.gz: 178958a3403cbac0fad20d83f2726914d420137db2a1c340c33c4c7305457fcd
4
+ data.tar.gz: df951b9e05e0a0bfaff3701b6a3c5bd8452edea2298fe91e6a98165ce96961d1
5
5
  SHA512:
6
- metadata.gz: 3b9f1b9fae5371eefcbfbc89262bfa422e23df0e2e52d56735c5f3af9912b7245883ae4864568b6d8828e2dfdc3ab8c3d9fd4f125f662d6f8ae51602976d9952
7
- data.tar.gz: dcdbf11006d26b929779bdb0e2ae8a541225b3a62c820dd027ef6801198a5056eb1d9a93e7cd504846b11b2196c187d7a415e263592864c3eae5ace4153b31ee
6
+ metadata.gz: 7ff1622a50cdafc4d09577e8dc5f9e90632f7f93e528f41c1b630ce5daeeeceab6c91cbf5d8ec92183e0b6f27c17a62d09934b6bb0a413da9577cea5a47c942f
7
+ data.tar.gz: 49651eb56bbe046223674a626ce4339363ba10550793c1d95ca46d4bf2c4b7f2eb430397b7f0d91368791cecb683ae2919f8627edb45d88d5c847f4d0cb4ee12
data/CHANGELOG.md CHANGED
@@ -1,5 +1,84 @@
1
1
  # Legion LLM Changelog
2
2
 
3
+ ## [0.9.28] - 2026-05-15
4
+
5
+ ### Added
6
+ - API: `/api/llm/models` now surfaces a static `LegionIO` model (`id: legionio`) as the default auto-routing placeholder.
7
+
8
+ ### Changed
9
+ - Routing: `model: "legionio"` clears explicit provider/model/instance/tier routing and sends the request through the router chain using the configured default intent.
10
+ - Routing: default tier priority now includes `direct` between `local` and `fleet`, and discovery-generated rule scores honor `routing.tier_priority`.
11
+
12
+ ### Fixed
13
+ - Prompt dispatch: provider-inferable model-only calls such as `gpt-5.4` infer the provider instead of pairing the model with `llm.default_provider`.
14
+ - Executor: provider-tier lookup failures are logged and return nil instead of silently defaulting to `:cloud`.
15
+ - LexLLMAdapter: optional content-block accessor fallbacks now capture and debug-log probe errors instead of bare-rescuing them.
16
+ - Auto routing: unresolved `legionio` requests now raise a clear provider error instead of falling back to configured defaults.
17
+ - Routing: model-only requests stay on provider inference while explicit provider/instance/tier requests still get registry defaults without requiring rule routing.
18
+
19
+ ## [0.9.27] - 2026-05-15
20
+
21
+ ### Fixed
22
+ - Router/Executor: provider-scoped instance resolution no longer applies a global `llm.default_instance` to models inferred for another provider; invalid explicit instances now fall back to that provider's registered default instance instead of dispatching to an unregistered `provider/instance` pair.
23
+
24
+ ## [0.9.26] - 2026-05-15
25
+
26
+ ### Fixed
27
+ - Discovery: `detect_embedding_from_registry` no longer sets `@can_embed = true` when no model is resolvable — adds `first_embedding_model_for(provider, instance)` as a third fallback scanning the discovered model catalog; returns `false` (allowing legacy probe to run) when all three sources yield nothing (#121)
28
+ - RagContext: `positive_integer` no longer raises `TypeError` when `value` is nil or an empty string — adds empty-string guard before `Kernel#Integer()` call so GAIA advisory `context_window: nil` does not abort the inference pipeline (#122)
29
+ - LexLLMAdapter: `text_part_content` now handles Anthropic-style `[{type:"text", text:"…"}]` content block arrays — flattens them to plain text instead of calling `.to_s` on the array, preventing Ruby array literals from leaking into provider prompts (#123)
30
+ - Embeddings/Discovery: `embedding_config_value` and `embedding_settings` now accept the deprecated plural `"embeddings"` key alongside the canonical singular `"embedding"` key, emitting a deprecation warning; fixes silent misconfiguration when users follow doc examples that used the plural spelling (#124)
31
+
32
+ ## [0.9.25] - 2026-05-14
33
+
34
+ ### Added
35
+ - Router: `TIER_RANK` constant — ordered quality ranking of tiers (local → direct → fleet → openai_compat → cloud → frontier)
36
+ - Router: `explicit_resolution` promoted to public — callable directly from executor without `send`
37
+ - Router: `chain_from_defaults` appends all registered fallback providers after the primary so the chain has real alternatives to escalate to (previously single-entry when a default provider was configured)
38
+ - Executor: `run_escalation_resolution` extracted from escalation loop — encapsulates per-attempt dispatch, error rescue, and `tried[]` tracking
39
+ - Executor: `skip_same_tier!` — on `ContextOverflow`, immediately skips all remaining same-tier candidates and routes to a higher-tier provider with a larger context window
40
+ - Executor: lateral vs. escalation move classification in per-attempt log line (`move=lateral` for same-tier, `move=escalation` for higher-tier)
41
+
42
+ ### Fixed
43
+ - Router: `explicit_resolution` handles nil `provider` and nil `tier` without raising `NoMethodError`
44
+ - Executor: `build_fallback_resolutions` sorts lateral alternatives (same-tier) before escalation candidates (higher-tier) — tries other instances at the same tier before promoting to a more expensive one
45
+ - Executor: deduplication in escalation loop is fully safe — `tried` entry is recorded on all rescue paths and on quality failure
46
+ - EscalationChain: `padded_resolutions` no longer pads the list by repeating the last resolution — only real distinct options are tried
47
+
48
+ ## [0.9.24] - 2026-05-14
49
+
50
+ ### Fixed
51
+ - API: `instance` from POST body was silently dropped — never forwarded into routing hash
52
+ - Executor: Gaia advisory tier assignment no longer overrides explicit `provider`+`instance` from caller
53
+ - Executor: `instance` now passed through `routing_resolution_for` to `Router.resolve`/`resolve_chain`
54
+ - Executor: `build_default_escalation_chain` now passes resolved provider/instance/model — previously ignored them and built a full auto chain, routing to vllm/fleet instead of the requested provider
55
+ - Router: `resolve`/`resolve_chain` accept `instance:` param; short-circuit to `explicit_resolution` when `provider` or `instance` is set (not just `tier`)
56
+ - Router: `explicit_resolution` honors caller-supplied instance instead of always pulling from registry; infers tier from `PROVIDER_TIER` when not explicitly given
57
+
58
+ ## [0.9.23] - 2026-05-13
59
+
60
+ ### Added
61
+ - Router: `registry_entry_for_provider` for explicit provider model resolution
62
+ - Router: model denylist (`deny_model`, `model_denied?`, `excluded_by_denial?`) — config errors auto-deny models
63
+ - Executor: config error detection (`CONFIG_ERROR_PATTERNS`) — prevents circuit breaker trips on auth/validation errors
64
+ - Executor: step timing hash on response (`metrics.timing`, `metrics.latency_legionio_ms`)
65
+ - API: `/api/llm/inference` response includes `provider`, `instance`, `tier`, `metrics`
66
+ - API: `/api/llm/providers` surfaces `source` and `credential_fingerprint`
67
+ - Inventory: provider-scoped queries skip unrelated providers
68
+ - Metering: disk-based JSONL spool when transport unavailable (was dropping events)
69
+ - Discovery: `report_discovery_failure` reports connection failures to health tracker
70
+ - Providers: `enabled: false` instances not registered; `default_model` in metadata
71
+
72
+ ### Changed
73
+ - Router: tier-aware model fallback — global default no longer bleeds across providers
74
+ - Inventory: single-source offerings (native_provider preferred over discovery to eliminate duplicates)
75
+ - Inventory: dedup normalizes `"default"` instance name
76
+ - Discovery: concise connection error log (no stacktrace for unreachable providers)
77
+ - Settings: removed `claude` from `native_providers` list
78
+
79
+ ### Fixed
80
+ - Cache spec rewritten to use real `Legion::Cache` instead of fragile stubs
81
+
3
82
  ## [0.9.22] - 2026-05-12
4
83
 
5
84
  ### Added
data/CLAUDE.md CHANGED
@@ -745,6 +745,26 @@ These rules are enforced across all legion-llm code. Violations will be caught i
745
745
  - **Advanced signals**: Budget tracking, GPU utilization monitoring, per-tenant spend limits
746
746
  - **Fleet auto-scaling**: Dynamic worker pool sizing based on queue depth and latency
747
747
 
748
+ ## Provider Registration & Model Resolution
749
+
750
+ - `discover_instances` in each lex-llm-* must include `default_model` in returned config — it flows to registry metadata via `instance_metadata` in `call/providers.rb`
751
+ - Router resolves models via: `registry_entry_for_provider(provider)` → `registry_default_model(entry)` → `metadata[:default_model]`
752
+ - `enabled: false` on an instance config prevents registration — checked in `register_provider_instance`
753
+ - `PROVIDER_DEFAULT_MODEL` does NOT belong in legion-llm — each provider owns its default in its own extension
754
+ - Inventory calls `native_provider_offerings` (full metadata) and excludes `discovery_offerings` for providers with native adapters
755
+
756
+ ## Metering Spool
757
+
758
+ - Events spool to `~/.legionio/data/spool/metering/events.jsonl` when AMQP transport is unavailable
759
+ - Thread-safe (SPOOL_MUTEX), capped at `settings[:metering][:spool][:max_events]` (default 10K)
760
+ - `flush_spool` publishes spooled events when transport reconnects; `lex-llm-ledger` actor triggers it
761
+
762
+ ## Health Tracker
763
+
764
+ - `deny_model(provider:, model:, instance:)` — permanently excludes a model from routing (in-memory, until restart)
765
+ - Config errors (ValidationException, AccessDenied, marketplace) trigger deny instead of circuit breaker
766
+ - Discovery connection failures report `:error` to health tracker — circuit opens after threshold
767
+
748
768
  ---
749
769
 
750
770
  **Maintained By**: Matthew Iverson (@Esity)
@@ -498,6 +498,26 @@ module Legion
498
498
 
499
499
  nil
500
500
  end
501
+
502
+ define_method(:build_response_metrics) do |pipeline_response|
503
+ routing = pipeline_response.routing || {}
504
+ timestamps = pipeline_response.timestamps || {}
505
+ metrics = {}
506
+
507
+ if (latency = routing[:latency_ms])
508
+ metrics[:latency_ms] = latency
509
+ end
510
+
511
+ step_timings = timestamps[:step_timings]
512
+ if step_timings.is_a?(Hash) && step_timings.any?
513
+ metrics[:timing] = step_timings
514
+ total = step_timings[:total].to_i
515
+ external = step_timings[:provider_call].to_i + step_timings[:tool_calls].to_i
516
+ metrics[:latency_legionio_ms] = total - external if total.positive?
517
+ end
518
+
519
+ metrics.empty? ? nil : metrics
520
+ end
501
521
  end
502
522
 
503
523
  log.debug('[llm][api][helpers] shared helpers registered')
@@ -108,7 +108,7 @@ module Legion
108
108
  id: request_id,
109
109
  messages: messages,
110
110
  system: body[:system],
111
- routing: { provider: provider, model: model },
111
+ routing: { provider: provider, model: model, instance: body[:instance] }.compact,
112
112
  tools: tool_declarations,
113
113
  caller: effective_caller,
114
114
  conversation_id: conversation_id,
@@ -184,11 +184,15 @@ module Legion
184
184
  request_id: request_id,
185
185
  content: full_text,
186
186
  model: (routing[:model] || routing['model']).to_s,
187
+ provider: (routing[:provider] || routing['provider'])&.to_s,
188
+ instance: (routing[:instance] || routing['instance'])&.to_s,
189
+ tier: (routing[:tier] || routing['tier'])&.to_s,
187
190
  input_tokens: token_value(tokens, :input),
188
191
  output_tokens: token_value(tokens, :output),
189
192
  tool_calls: extract_tool_calls(pipeline_response),
190
- conversation_id: pipeline_response.conversation_id
191
- }
193
+ conversation_id: pipeline_response.conversation_id,
194
+ metrics: build_response_metrics(pipeline_response)
195
+ }.compact
192
196
  done_payload[:thinking] = pipeline_response.thinking if include_thinking && pipeline_response.thinking
193
197
  emit_sse_event(out, 'done', {
194
198
  **done_payload
@@ -237,11 +241,16 @@ module Legion
237
241
  tool_calls: tool_calls,
238
242
  stop_reason: pipeline_response.stop&.dig(:reason)&.to_s,
239
243
  model: (routing[:model] || routing['model']).to_s,
244
+ provider: (routing[:provider] || routing['provider'])&.to_s,
245
+ instance: (routing[:instance] || routing['instance'])&.to_s,
246
+ tier: (routing[:tier] || routing['tier'])&.to_s,
240
247
  input_tokens: token_value(tokens, :input),
241
248
  output_tokens: token_value(tokens, :output),
242
- conversation_id: pipeline_response.conversation_id
249
+ conversation_id: pipeline_response.conversation_id,
250
+ metrics: build_response_metrics(pipeline_response)
243
251
  }
244
252
  payload[:thinking] = pipeline_response.thinking if include_thinking && pipeline_response.thinking
253
+ payload.compact!
245
254
  json_response(payload, status_code: 200)
246
255
  end
247
256
  rescue Legion::LLM::AuthError => e
@@ -9,6 +9,11 @@ module Legion
9
9
  module Models
10
10
  extend Legion::Logging::Helper
11
11
 
12
+ AUTO_ROUTING_MODEL_ID = 'legionio'
13
+ AUTO_ROUTING_MODEL_DISPLAY = 'LegionIO'
14
+ AUTO_ROUTING_OFFERING_ID = 'legionio:auto:inference:legionio'
15
+ AUTO_ROUTING_CAPABILITIES = %w[auto_routing chat completion json_schema tools].freeze
16
+
12
17
  def self.registered(app)
13
18
  log.debug('[llm][api][models] registering model inventory routes')
14
19
 
@@ -18,6 +23,7 @@ module Legion
18
23
 
19
24
  filters = Legion::LLM::API::Native::Models.request_filters(params)
20
25
  offerings = Legion::LLM::Inventory.offerings(filters)
26
+ offerings = Legion::LLM::API::Native::Models.with_auto_routing_offering(offerings, filters)
21
27
 
22
28
  json_response({
23
29
  models: Legion::LLM::API::Native::Models.model_summaries(offerings),
@@ -34,7 +40,9 @@ module Legion
34
40
  log.debug("[llm][api][models] action=get_model id=#{model_id}")
35
41
  require_llm!
36
42
 
37
- offerings = Legion::LLM::Inventory.offerings(model: model_id)
43
+ filters = { model: model_id }
44
+ offerings = Legion::LLM::Inventory.offerings(filters)
45
+ offerings = Legion::LLM::API::Native::Models.with_auto_routing_offering(offerings, filters)
38
46
  halt json_error('model_not_found', "Model '#{model_id}' not found", status_code: 404) unless offerings.any?
39
47
 
40
48
  json_response({
@@ -84,11 +92,11 @@ module Legion
84
92
  summaries = offerings.group_by { |offering| offering[:model] }.map do |model, rows|
85
93
  summarize_model(model, rows)
86
94
  end
87
- summaries.sort_by { |model| model[:id] }
95
+ summaries.sort_by { |model| [auto_routing_model?(model[:id]) ? 0 : 1, model[:id]] }
88
96
  end
89
97
 
90
98
  def self.summarize_model(model, offerings)
91
- {
99
+ summary = {
92
100
  id: model.to_s,
93
101
  types: offerings.map { |offering| offering[:type].to_s }.uniq.sort,
94
102
  providers: offerings.map { |offering| offering[:provider_family] }.uniq.sort,
@@ -99,6 +107,12 @@ module Legion
99
107
  max_context: offerings.filter_map { |offering| offering.dig(:limits, :context_window) }.max,
100
108
  enabled: offerings.any? { |offering| offering[:enabled] != false }
101
109
  }
110
+ if auto_routing_model?(model)
111
+ summary[:display_name] = AUTO_ROUTING_MODEL_DISPLAY
112
+ summary[:auto_route] = true
113
+ summary[:default] = true
114
+ end
115
+ summary
102
116
  end
103
117
 
104
118
  def self.summary(offerings)
@@ -110,6 +124,64 @@ module Legion
110
124
  .transform_values(&:size)
111
125
  }
112
126
  end
127
+
128
+ def self.with_auto_routing_offering(offerings, filters = {})
129
+ return offerings unless auto_routing_offering_matches?(filters)
130
+ return offerings if offerings.any? { |offering| auto_routing_model?(offering[:model]) }
131
+
132
+ [auto_routing_offering, *offerings]
133
+ end
134
+
135
+ def self.auto_routing_offering
136
+ {
137
+ id: AUTO_ROUTING_OFFERING_ID,
138
+ offering_id: AUTO_ROUTING_OFFERING_ID,
139
+ model: AUTO_ROUTING_MODEL_ID,
140
+ display_name: AUTO_ROUTING_MODEL_DISPLAY,
141
+ model_family: 'legionio',
142
+ canonical_model_alias: AUTO_ROUTING_MODEL_ID,
143
+ type: :inference,
144
+ provider_family: 'legionio',
145
+ provider_instance: 'auto',
146
+ instance_id: 'auto',
147
+ tier: :auto,
148
+ transport: :internal,
149
+ enabled: true,
150
+ capabilities: AUTO_ROUTING_CAPABILITIES,
151
+ limits: {},
152
+ health: { circuit_state: 'available' },
153
+ metadata: { auto_route: true, placeholder: true, display_name: AUTO_ROUTING_MODEL_DISPLAY },
154
+ routing_metadata: { strategy: 'auto' },
155
+ source: 'static'
156
+ }
157
+ end
158
+
159
+ def self.auto_routing_offering_matches?(filters)
160
+ normalized = request_filters(filters)
161
+ type = normalized[:type]
162
+ return false if type && !type.to_s.empty? && type.to_s != 'inference' && type.to_s != 'chat'
163
+
164
+ provider = normalized[:provider]
165
+ return false if provider && !provider.to_s.empty? && !%w[legionio auto].include?(provider.to_s.downcase)
166
+
167
+ instance = normalized[:instance_id]
168
+ return false if instance && !instance.to_s.empty? && !%w[auto legionio].include?(instance.to_s.downcase)
169
+
170
+ model = normalized[:model] || normalized[:offering_id]
171
+ return false if model && !model.to_s.empty? && !auto_routing_model?(model) && model.to_s != AUTO_ROUTING_OFFERING_ID
172
+
173
+ family = normalized[:model_family]
174
+ return false if family && !family.to_s.empty? && family.to_s.downcase != 'legionio'
175
+
176
+ capability = normalized[:capability]
177
+ return false if capability && !AUTO_ROUTING_CAPABILITIES.include?(capability.to_s)
178
+
179
+ true
180
+ end
181
+
182
+ def self.auto_routing_model?(model)
183
+ model.to_s.strip.downcase == AUTO_ROUTING_MODEL_ID
184
+ end
113
185
  end
114
186
  end
115
187
  end
@@ -87,7 +87,7 @@ module Legion
87
87
  provider_key = entry[:provider].to_sym
88
88
  instance_key = entry[:instance].to_sym
89
89
 
90
- {
90
+ result = {
91
91
  provider: entry[:provider].to_s,
92
92
  instance: entry[:instance].to_s,
93
93
  tier: entry.dig(:metadata, :tier)&.to_s,
@@ -102,6 +102,9 @@ module Legion
102
102
  end,
103
103
  native: true
104
104
  }
105
+ result[:source] = entry.dig(:metadata, :source) if entry.dig(:metadata, :source)
106
+ result[:credential_fingerprint] = entry.dig(:metadata, :credential_fingerprint) if entry.dig(:metadata, :credential_fingerprint)
107
+ result
105
108
  end
106
109
  end
107
110
  end
@@ -232,7 +232,8 @@ module Legion
232
232
  return 'unknown' unless tracker
233
233
 
234
234
  tracker.circuit_state(provider_name.to_sym, instance: instance_name.to_sym).to_s
235
- rescue StandardError
235
+ rescue StandardError => e
236
+ log.debug "[llm][tiers] action=offering_instance_health provider=#{provider_name} instance=#{instance_name} error=#{e.class} — #{e.message}"
236
237
  'unknown'
237
238
  end
238
239
  end
@@ -122,12 +122,21 @@ module Legion
122
122
 
123
123
  def resolve_provider
124
124
  LLM.embedding_provider ||
125
- Legion::LLM::Settings.value(:embedding, :provider)&.to_sym
125
+ embedding_config_value(:provider)&.to_sym
126
126
  end
127
127
 
128
128
  def resolve_model
129
129
  LLM.embedding_model ||
130
- Legion::LLM::Settings.value(:embedding, :default_model)
130
+ embedding_config_value(:default_model)
131
+ end
132
+
133
+ def embedding_config_value(key)
134
+ v = Legion::LLM::Settings.value(:embedding, key)
135
+ return v unless v.nil?
136
+
137
+ plural = Legion::LLM::Settings.value(:embeddings, key)
138
+ log.warn "[llm][embeddings] settings key \"embeddings.#{key}\" (plural) is deprecated — rename to \"embedding.#{key}\"" unless plural.nil?
139
+ plural
131
140
  end
132
141
 
133
142
  def coerce_text(value)
@@ -239,12 +239,49 @@ module Legion
239
239
  end
240
240
 
241
241
  def text_part_content(part)
242
- return unless part.respond_to?(:transform_keys)
242
+ return part if part.is_a?(String)
243
243
 
244
- normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
245
- return unless normalized[:type].to_s == 'text'
244
+ if part.respond_to?(:transform_keys)
245
+ normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
246
+ return unless normalized[:type].to_s == 'text'
246
247
 
247
- normalized[:text].to_s
248
+ return normalized[:text].to_s
249
+ end
250
+
251
+ # Data structs expose named readers (type/text) without necessarily implementing [].
252
+ # Try named accessor path first; fall through to [] / fetch for plain hashes/structs.
253
+ if part.respond_to?(:type) || part.respond_to?(:text)
254
+ type = (part.respond_to?(:type) ? part.type.to_s : '')
255
+ text = part.respond_to?(:text) ? part.text : nil
256
+ return text.to_s if type == 'text' || (type.empty? && !text.nil?)
257
+
258
+ return nil
259
+ end
260
+
261
+ return unless part.respond_to?(:[]) || part.respond_to?(:fetch)
262
+
263
+ type = (defined_method_access(part, :type) || '').to_s
264
+ text = defined_method_access(part, :text)
265
+ text.to_s if type == 'text' || (type.empty? && !text.nil?)
266
+ end
267
+
268
+ def defined_method_access(obj, key)
269
+ # Prefer named accessor (covers Data structs like Types::ContentBlock).
270
+ key_sym = key.respond_to?(:to_sym) ? key.to_sym : key
271
+ return obj.public_send(key_sym) if obj.respond_to?(key_sym)
272
+
273
+ str_key = key.to_s
274
+ obj[key]
275
+ rescue TypeError, NoMethodError, KeyError => e
276
+ log.debug "[llm][adapter] action=defined_method_access key=#{key} class=#{obj.class} " \
277
+ "fallback=string_key error=#{e.class}: #{e.message}"
278
+ begin
279
+ obj[str_key]
280
+ rescue TypeError, NoMethodError, KeyError => fallback_error
281
+ log.debug "[llm][adapter] action=defined_method_access key=#{key} class=#{obj.class} " \
282
+ "fallback=none error=#{fallback_error.class}: #{fallback_error.message}"
283
+ nil
284
+ end
248
285
  end
249
286
 
250
287
  def normalize_message_tool_calls(tool_calls)
@@ -52,7 +52,8 @@ module Legion
52
52
  Legion::Extensions::Llm.constants(false).filter_map do |const_name|
53
53
  mod = Legion::Extensions::Llm.const_get(const_name, false)
54
54
  provider_module?(mod) ? mod : nil
55
- rescue NameError
55
+ rescue NameError => e
56
+ log.debug "[llm][providers] action=discover_provider_modules const=#{const_name} error=#{e.class} — #{e.message}"
56
57
  nil
57
58
  end
58
59
  end
@@ -80,6 +81,8 @@ module Legion
80
81
 
81
82
  def register_provider_instance(provider_module, family, aliases, instance_id, config)
82
83
  normalized_config = normalize_instance_config(config)
84
+ return if normalized_config[:enabled] == false
85
+
83
86
  registry_config = adapter_instance_config(normalized_config, instance_id)
84
87
  metadata = instance_metadata(normalized_config)
85
88
  adapter = Call::LexLLMAdapter.new(family, provider_module.provider_class, instance_config: registry_config)
@@ -107,14 +110,19 @@ module Legion
107
110
  end
108
111
 
109
112
  def instance_metadata(config)
110
- { tier: config[:tier], capabilities: config[:capabilities] || [] }
113
+ meta = { tier: config[:tier], capabilities: config[:capabilities] || [] }
114
+ meta[:default_model] = config[:default_model] if config[:default_model]
115
+ meta[:source] = config[:source] if config[:source]
116
+ meta[:credential_fingerprint] = config[:credential_fingerprint] if config[:credential_fingerprint]
117
+ meta
111
118
  end
112
119
 
113
120
  def safe_provider_family(provider_module)
114
121
  return nil unless provider_module&.const_defined?(:PROVIDER_FAMILY, false)
115
122
 
116
123
  provider_module::PROVIDER_FAMILY
117
- rescue StandardError
124
+ rescue StandardError => e
125
+ log.debug "[llm][providers] action=safe_provider_family error=#{e.class} — #{e.message}"
118
126
  nil
119
127
  end
120
128
 
@@ -9,9 +9,15 @@ module Legion
9
9
  class Curator
10
10
  include Legion::Logging::Helper
11
11
 
12
- CURATED_KEY = :__curated__
13
- THINKING_OPEN = '<thinking>'
14
- THINKING_CLOSE = '</thinking>'
12
+ CURATED_KEY = :__curated__
13
+
14
+ # All known provider thinking tag variants.
15
+ # Anthropic: <thinking>…</thinking>
16
+ # DeepSeek / Qwen / Ollama / vLLM inline: <think>…</think>
17
+ THINKING_TAG_PAIRS = [
18
+ ['<thinking>', '</thinking>'],
19
+ ['<think>', '</think>']
20
+ ].freeze
15
21
 
16
22
  def initialize(conversation_id:)
17
23
  @conversation_id = conversation_id
@@ -76,6 +82,8 @@ module Legion
76
82
  return msg if content.length <= max_chars
77
83
 
78
84
  summary = heuristic_tool_summary(content, tool_name_from(msg))
85
+ log.debug "[llm][curator] action=distill_tool_result conversation_id=#{@conversation_id} " \
86
+ "original_chars=#{content.length} summary_chars=#{summary.length}"
79
87
  msg.merge(content: summary, curated: true, original_content: content)
80
88
  end
81
89
 
@@ -89,6 +97,8 @@ module Legion
89
97
 
90
98
  return msg if stripped == content || stripped.empty?
91
99
 
100
+ log.debug "[llm][curator] action=strip_thinking conversation_id=#{@conversation_id} " \
101
+ "original_chars=#{content.length} stripped_chars=#{stripped.length}"
92
102
  msg.merge(content: stripped, curated: true, original_content: content)
93
103
  end
94
104
 
@@ -192,18 +202,27 @@ module Legion
192
202
  end
193
203
 
194
204
  def strip_thinking_tags(text)
195
- result = +''
205
+ result = text
206
+ THINKING_TAG_PAIRS.each do |open_tag, close_tag|
207
+ result = strip_tag_pair(result, open_tag, close_tag)
208
+ end
209
+ result
210
+ end
211
+
212
+ def strip_tag_pair(text, open_tag, close_tag)
213
+ out = +''
196
214
  pos = 0
197
215
  while pos < text.length
198
- open_idx = text.index(THINKING_OPEN, pos)
216
+ open_idx = text.index(open_tag, pos)
199
217
  break unless open_idx
200
218
 
201
- result << text[pos...open_idx]
202
- close_idx = text.index(THINKING_CLOSE, open_idx + THINKING_OPEN.length)
203
- pos = close_idx ? close_idx + THINKING_CLOSE.length : text.length
219
+ out << text[pos...open_idx]
220
+ close_idx = text.index(close_tag, open_idx + open_tag.length)
221
+ pos = close_idx ? close_idx + close_tag.length : text.length
204
222
  end
205
- result << text[pos..] if pos < text.length
206
- result
223
+ out << text[pos..] if pos < text.length
224
+ # Strip any unclosed open tag left at the end (provider died mid-stream).
225
+ out.sub(/#{Regexp.escape(open_tag)}.*\z/m, '').strip
207
226
  end
208
227
 
209
228
  def curate_message(msg, assistant_response)
@@ -427,7 +446,8 @@ module Legion
427
446
  def curated_payload(entry)
428
447
  parsed = Legion::JSON.parse(entry[:content].to_s)
429
448
  parsed.is_a?(Hash) ? parsed : {}
430
- rescue Legion::JSON::ParseError
449
+ rescue Legion::JSON::ParseError => e
450
+ log.debug "[llm][curator] action=curated_payload conversation_id=#{@conversation_id} error=#{e.class} — #{e.message}"
431
451
  {}
432
452
  end
433
453
 
@@ -26,7 +26,7 @@ module Legion
26
26
  anthropic: :frontier
27
27
  }.freeze
28
28
 
29
- TIER_WEIGHT = { local: 100, fleet: 80, cloud: 60, frontier: 40 }.freeze
29
+ DEFAULT_TIER_PRIORITY = %i[local direct fleet openai_compat cloud frontier].freeze
30
30
 
31
31
  module_function
32
32
 
@@ -50,7 +50,7 @@ module Legion
50
50
  extract_field(model_data, 'tier')&.to_sym ||
51
51
  tier
52
52
  capability = embedding_model?(model_data) ? :embed : :chat
53
- priority = (TIER_WEIGHT[model_tier] || 80) - order
53
+ priority = tier_weight(model_tier) - order
54
54
  rules << build_rule(provider, instance_id, model_data, capability, model_tier, priority)
55
55
  rules << build_rule(provider, instance_id, model_data, :stream, model_tier, priority) if capability == :chat
56
56
  order += 1
@@ -91,7 +91,7 @@ module Legion
91
91
  next unless default_model
92
92
 
93
93
  model_data = { name: default_model }
94
- priority = TIER_WEIGHT[tier] || 40
94
+ priority = tier_weight(tier)
95
95
  rules << build_rule(provider_name, :default, model_data, :chat, tier, priority)
96
96
  rules << build_rule(provider_name, :default, model_data, :stream, tier, priority)
97
97
  end
@@ -136,6 +136,26 @@ module Legion
136
136
  model_data[field] || model_data[field.to_s]
137
137
  end
138
138
 
139
+ def tier_weight(tier)
140
+ tier_sym = tier.respond_to?(:to_sym) ? tier.to_sym : tier
141
+ index = tier_priority.index(tier_sym)
142
+ return 0 unless index
143
+
144
+ (tier_priority.length - index) * 100
145
+ end
146
+
147
+ def tier_priority
148
+ configured = Legion::LLM::Settings.value(:routing, :tier_priority, default: DEFAULT_TIER_PRIORITY)
149
+ normalized = Array(configured).filter_map do |tier|
150
+ tier.to_sym if tier.respond_to?(:to_sym)
151
+ end
152
+ normalized = DEFAULT_TIER_PRIORITY if normalized.empty?
153
+ (normalized + DEFAULT_TIER_PRIORITY).uniq
154
+ rescue StandardError => e
155
+ handle_exception(e, level: :warn, handled: true, operation: 'rule_generator.tier_priority')
156
+ DEFAULT_TIER_PRIORITY
157
+ end
158
+
139
159
  def extension_providers
140
160
  ext = Legion::Settings[:extensions]
141
161
  return ext[:llm] if ext.is_a?(Hash) && ext[:llm].is_a?(Hash)