legion-llm 0.8.29 → 0.8.32

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. checksums.yaml +4 -4
  2. data/.gitignore +1 -0
  3. data/CHANGELOG.md +22 -0
  4. data/README.md +4 -1
  5. data/legion-llm.gemspec +1 -0
  6. data/lib/legion/llm/api/native/helpers.rb +81 -9
  7. data/lib/legion/llm/api/native/inference.rb +5 -2
  8. data/lib/legion/llm/call/embeddings.rb +14 -1
  9. data/lib/legion/llm/call/providers.rb +47 -28
  10. data/lib/legion/llm/call/structured_output.rb +15 -3
  11. data/lib/legion/llm/inference/executor.rb +37 -32
  12. data/lib/legion/llm/router.rb +1 -1
  13. data/lib/legion/llm/settings.rb +3 -2
  14. data/lib/legion/llm/tools/adapter.rb +15 -0
  15. data/lib/legion/llm/version.rb +1 -1
  16. metadata +15 -24
  17. data/docs/2026-03-23-pipeline-gap-analysis.md +0 -203
  18. data/docs/example_settings.json +0 -16
  19. data/docs/examples/anthropic_request.json +0 -108
  20. data/docs/examples/anthropic_response.json +0 -90
  21. data/docs/examples/azure_ai_request.json +0 -103
  22. data/docs/examples/azure_ai_response.json +0 -91
  23. data/docs/examples/bedrock_request.json +0 -127
  24. data/docs/examples/bedrock_response.json +0 -93
  25. data/docs/examples/gemini_request.json +0 -127
  26. data/docs/examples/gemini_response.json +0 -109
  27. data/docs/examples/openai_request.json +0 -100
  28. data/docs/examples/openai_response.json +0 -77
  29. data/docs/examples/xai_request.json +0 -93
  30. data/docs/examples/xai_response.json +0 -48
  31. data/docs/gas-apollo-idea.md +0 -528
  32. data/docs/generation-augmented-storage.md +0 -135
  33. data/docs/llm-schema-spec.md +0 -2816
  34. data/docs/plans/2026-03-15-ollama-discovery-design.md +0 -164
  35. data/docs/plans/2026-03-15-ollama-discovery-implementation.md +0 -1147
  36. data/docs/routing-reenvisioned.md +0 -861
  37. data/docs/superpowers/plans/2026-04-15-sticky-runners-tool-history.md +0 -1866
  38. data/docs/superpowers/specs/2026-04-15-sticky-runners-tool-history-design.md +0 -713
  39. data/legion-llm-0.3.20.gem +0 -0
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4ec77012ba08ec5ed5cb8fd544fca1a28ee8993b5136852f647be4f7e8725309
- data.tar.gz: ed78d65c4966669e008c853c89983a0baa1f1fd8f28a3510bc8461544f9ead5c
+ metadata.gz: f73cca457b602ab79cd3964a15333afe338a7ed5ef18a556f0e502da7817b4ef
+ data.tar.gz: 706498c91924640dc31bf43e814a04f2880acb2930cf30abdf144c28cb02fd01
  SHA512:
- metadata.gz: 2a33fd3d2b5dcd7c36e11ef5d1715d03f71256c1826967cf987e1665443bed0123d9473e178f034e9aa6a6f62ab72b8a1608b6b187ee3554882d8aacc98ded04
- data.tar.gz: 3c701ef336fbb0695819860bf3f68c108887d040ccd960dddb944ffc3fcfabc2ab52dc3d0194384530756bf6b8891d0f6d091bbae7b4cdd9db62024f7bd9874e
+ metadata.gz: 3b499fde4636085676157fdad1576ba60ed0149f2e9d126289b38399b4232ce64ed47e66250c7bf0dd01c87e15afff675c0d3f0ddc5d6d152ec17640af615fd2
+ data.tar.gz: 5b26a4426f630a3e36dedd6bdeacae7b6a2283696f4a9771d5458bdd2c8735380984ee983a10c9483ea2cb373eaa5a4e54c0cc92cac5d1a0b04f5835a7920850
data/.gitignore CHANGED
@@ -1,6 +1,7 @@
  /.bundle/
  /.yardoc
  /Gemfile.lock
+ *.gem
  /_yardoc/
  /coverage/
  /doc/
data/CHANGELOG.md CHANGED
@@ -1,5 +1,27 @@
  # Legion LLM Changelog

+ ## [0.8.32] - 2026-04-27
+
+ ### Fixed
+ - Embedding calls now return a clear unavailable-provider error when no embedding provider is configured or detected, preventing RubyLLM from implicitly selecting a chat/default provider.
+
+ ## [0.8.31] - 2026-04-27
+
+ ### Fixed
+ - Embedding calls no longer inherit the chat `llm.default_provider`, preventing vLLM or other chat defaults from receiving embedding traffic unless explicitly configured for embeddings. Fixes #104
+
+ ## [0.8.30] - 2026-04-27
+
+ ### Fixed
+ - Structured output parsing now strips markdown code fences before JSON parse, including retry responses from models that keep returning fenced JSON.
+ - LLM tool adapter dispatch now symbolizes JSON/string-keyed tool arguments before invoking Ruby keyword-argument tool classes.
+ - Default routing chains now honor explicit `default_provider` / `default_model` before auto-enabled local providers, preventing Ollama defaults from overriding a configured Bedrock default.
+ - Provider credential setup now resolves `env://` placeholders consistently for Bedrock SigV4, Anthropic, OpenAI, Gemini, Azure, and vLLM, and unresolved placeholder arrays no longer auto-enable hosted providers.
+ - Native `/api/llm/inference` responses now flatten structured provider content blocks into plain text for both streaming SSE deltas and non-streaming JSON responses, preventing Anthropic/Bedrock-style block arrays from being stored and replayed as nested JSON-looking assistant replies.
+ - Native `/api/llm/inference` streaming now emits `thinking-delta` SSE events for provider reasoning chunks without appending those chunks to final assistant content.
+ - Native `file_read` client tools now extract text from PDFs via `pdf-reader` and return a clear unsupported-binary message for non-text binary files.
+ - Local providers now cap automatically injected registry tools with `llm.tool_trigger.local_tool_limit`, prioritizing trigger-matched tools before always-loaded tools for Ollama/vLLM requests.
+
  ## [0.8.29] - 2026-04-27

  ### Added
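The 0.8.30 tool-argument fix listed above can be sketched in isolation. This is a hedged illustration, not the gem's adapter code (the real change lives in data/lib/legion/llm/tools/adapter.rb, which this diff lists but does not show); `symbolize_tool_args` and `file_read_tool` are hypothetical names:

```ruby
# Hypothetical sketch of the 0.8.30 fix: tool arguments decoded from provider
# JSON arrive with string keys, but Ruby keyword-argument tool methods need
# symbol keys, so the adapter symbolizes them before dispatch.
def symbolize_tool_args(args)
  args.to_h { |key, value| [key.to_sym, value] }
end

def file_read_tool(path:, max_bytes: nil)
  "reading #{path} (max_bytes=#{max_bytes.inspect})"
end

raw_args = { 'path' => '/tmp/example.txt' } # as decoded from the tool-call JSON
file_read_tool(**symbolize_tool_args(raw_args))
# => "reading /tmp/example.txt (max_bytes=nil)"
```

Without the symbolization step, `file_read_tool(**raw_args)` raises `ArgumentError` on Ruby 3, since string keys cannot satisfy keyword parameters.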
data/README.md CHANGED
@@ -2,7 +2,7 @@

  LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension. Exposes OpenAI- and Anthropic-compatible API endpoints so external tools can point at the Legion daemon and just work.

- **Version**: 0.8.0
+ **Version**: 0.8.30

  ## Installation

@@ -60,6 +60,7 @@ Requests flow through the full Inference pipeline — routing, metering, audit,
  Both formats supported with correct SSE shapes:
  - **OpenAI**: `data: {"choices":[{"delta":{"content":"..."}}]}` chunks, terminated by `data: [DONE]`
  - **Anthropic**: Typed events — `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`
+ - **Native**: `/api/llm/inference` streams `text-delta`, `thinking-delta`, tool lifecycle events, and a final `done` event. Structured provider content blocks are flattened to plain text in both streaming and non-streaming native responses so `content` remains a string for daemon clients.

  ### API Authentication

@@ -851,6 +852,8 @@ No code changes are needed in consumers immediately. The aliases will be maintai
  | Azure AI | `azure` | `vault://`, `env://`, or direct | Azure OpenAI endpoint; `api_base` + `api_key` or `auth_token` |
  | Ollama | `ollama` | Local, no credentials needed | Local inference |

+ `env://NAME` credential placeholders resolve at provider configuration time, including array fallbacks such as `["env://OPENAI_API_KEY", "env://CODEX_API_KEY"]`. Unresolved placeholders do not auto-enable hosted providers.
+
  ## Integration with LegionIO

  legion-llm follows the standard core gem lifecycle:
data/legion-llm.gemspec CHANGED
@@ -35,6 +35,7 @@ Gem::Specification.new do |spec|
  spec.add_dependency 'lex-gemini'
  spec.add_dependency 'lex-knowledge'
  spec.add_dependency 'lex-openai'
+ spec.add_dependency 'pdf-reader'
  spec.add_dependency 'ruby_llm', '~> 1.13'
  spec.add_dependency 'tzinfo', '>= 2.0'
  end
data/lib/legion/llm/api/native/helpers.rb CHANGED
@@ -10,12 +10,14 @@ module Legion
  module API
  module Native
  module ClientToolMethods
+ include Legion::Logging::Helper
+
  private

  def log_tool(level, ref, status, **details)
  parts = ["[tool][#{ref}] #{status}"]
  details.each { |k, v| parts << "#{k}=#{v}" }
- Legion::Logging.send(level, parts.join(' '))
+ log.public_send(level, parts.join(' '))
  end

  def summarize_tool_arg_keys(kwargs)
@@ -37,7 +39,7 @@ module Legion
  end
  end

- def dispatch_client_tool(ref, **kwargs) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
+ def dispatch_client_tool(ref, **kwargs) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/MethodLength,Metrics/PerceivedComplexity
  case ref
  when 'sh'
  cmd = kwargs[:command] || kwargs[:cmd] || kwargs.values.first.to_s
@@ -45,7 +47,7 @@ module Legion
  "exit=#{status.exitstatus}\n#{output}"
  when 'file_read'
  path = kwargs[:path] || kwargs[:file_path] || kwargs.values.first.to_s
- ::File.exist?(path) ? ::File.read(path, encoding: 'utf-8') : "File not found: #{path}"
+ read_client_file(path)
  when 'file_write'
  path = kwargs[:path] || kwargs[:file_path]
  content = kwargs[:content] || kwargs[:contents]
@@ -82,6 +84,7 @@ module Legion
  max_length ? content[0, max_length] : content
  rescue LoadError => e
  missing = e.respond_to?(:path) && e.path ? e.path : 'legion/cli/chat/web_fetch'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.web_fetch', missing: missing)
  "web_fetch is unavailable: missing optional dependency #{missing}"
  end
  when 'web_search'
@@ -93,6 +96,7 @@ module Legion
  results[:results].map { |r| "### #{r[:title]}\n#{r[:url]}\n#{r[:snippet]}" }.join("\n\n")
  rescue LoadError => e
  missing = e.respond_to?(:path) && e.path ? e.path : 'legion/cli/chat/web_search'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.web_search', missing: missing)
  "web_search is unavailable: missing optional dependency #{missing}"
  end
  else
@@ -100,6 +104,51 @@ module Legion
  end
  end

+ def read_client_file(path)
+ return "File not found: #{path}" unless ::File.exist?(path)
+
+ return read_pdf_text(path) if pdf_file?(path)
+
+ content = ::File.binread(path)
+ return 'Binary file detected, cannot read as text.' if binary_content?(content)
+
+ content.force_encoding('UTF-8')
+ content
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.file_read', path: path)
+ "file_read error: #{e.message}"
+ end
+
+ def pdf_file?(path)
+ ::File.extname(path).casecmp('.pdf').zero? || ::File.binread(path, 5) == '%PDF-'
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_sniff', path: path)
+ false
+ end
+
+ def read_pdf_text(path)
+ require 'pdf-reader' unless defined?(::PDF::Reader)
+
+ reader = ::PDF::Reader.new(path)
+ text = reader.pages.map(&:text).join("\n\n").strip
+ text.empty? ? 'PDF contained no extractable text.' : text
+ rescue LoadError => e
+ missing = e.respond_to?(:path) && e.path ? e.path : 'pdf-reader'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_extract', missing: missing)
+ 'PDF text extraction unavailable: missing pdf-reader gem.'
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_extract', path: path)
+ "PDF text extraction failed: #{e.message}"
+ end
+
+ def binary_content?(content)
+ return true if content.include?("\x00")
+
+ sample = content.byteslice(0, 4096).to_s
+ sample.force_encoding('UTF-8')
+ !sample.valid_encoding?
+ end
+
  def notify_tool_event(type, ref, **data)
  handler = Thread.current[:legion_tool_event_handler]
  return unless handler
@@ -257,13 +306,14 @@ module Legion
  rescue StandardError => e
  ms = begin
  ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - t0) * 1000).round(1)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.api.client_tool.duration_measurement', tool_ref: tool_ref)
  nil
  end
  log_tool(:error, tool_ref, 'failed', duration_ms: ms, error: e.message)
  notify_tool_event(:tool_error, tool_ref, error: e.message)
- Legion::Logging.log_exception(e, payload_summary: "client tool #{tool_ref} failed",
- component_type: :api)
+ handle_exception(e, level: :error, handled: true, operation: "llm.api.client_tool.#{tool_ref}")
  "Tool error: #{e.message}"
  end
  end
@@ -287,6 +337,25 @@ module Legion
  end
  end

+ define_method(:extract_text_content) do |content|
+ case content
+ when nil
+ ''
+ when String
+ content
+ when Array
+ content.filter_map { |entry| extract_text_content(entry) }.join
+ when Hash
+ type = content[:type] || content['type']
+ return '' unless type.nil? || type.to_s == 'text'
+
+ text = content.key?(:text) || content.key?('text') ? (content[:text] || content['text']) : (content[:content] || content['content'])
+ extract_text_content(text)
+ else
+ content.to_s
+ end
+ end
+
  define_method(:emit_sse_event) do |stream, event_name, payload|
  level = event_name == 'text-delta' ? :debug : :info
  log.send(level, "[sse][emit] event=#{event_name} keys=#{payload.is_a?(Hash) ? payload.keys.join(',') : 'n/a'}")
@@ -333,7 +402,8 @@

  kerb = begin
  Legion::Settings.dig(:kerberos, :username)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.kerberos_username')
  nil
  end
  return "user:#{kerb}" if kerb.is_a?(String) && !kerb.empty?
@@ -354,14 +424,16 @@
  define_method(:resolve_requested_by) do |rack_env, identity_string|
  hostname = begin
  Legion::Settings[:client][:hostname]
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.client_hostname')
  Socket.gethostname
  end
  username = identity_string.delete_prefix('user:')

  kerb = begin
  Legion::Settings.dig(:kerberos, :username)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.requested_by_kerberos')
  nil
  end
  if kerb.is_a?(String) && !kerb.empty?
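The `extract_text_content` helper added in this file is the core of the block-flattening fix. A standalone copy of the same logic (plain method form rather than `define_method`, for illustration only) shows how mixed provider content collapses to one string:

```ruby
# Standalone copy of the flattening logic above: provider content may be a
# plain string, an array of content blocks, or a { type:, text: } hash; only
# text blocks contribute to the flattened result.
def extract_text_content(content)
  case content
  when nil then ''
  when String then content
  when Array then content.filter_map { |entry| extract_text_content(entry) }.join
  when Hash
    type = content[:type] || content['type']
    return '' unless type.nil? || type.to_s == 'text'

    text = content.key?(:text) || content.key?('text') ? (content[:text] || content['text']) : (content[:content] || content['content'])
    extract_text_content(text)
  else
    content.to_s
  end
end

blocks = [
  { type: 'text', text: 'Hello ' },
  { type: 'tool_use', name: 'sh' }, # non-text block, dropped
  { 'type' => 'text', 'text' => 'world' }
]
extract_text_content(blocks)
# => "Hello world"
```

This is what keeps Anthropic/Bedrock-style block arrays from being stored and replayed as nested JSON-looking assistant replies.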
data/lib/legion/llm/api/native/inference.rb CHANGED
@@ -150,7 +150,10 @@ module Legion
  }

  pipeline_response = executor.call_stream do |chunk|
- text = chunk.respond_to?(:content) ? chunk.content.to_s : chunk.to_s
+ thinking = extract_text_content(chunk.thinking) if chunk.respond_to?(:thinking)
+ emit_sse_event(out, 'thinking-delta', { delta: thinking }) unless thinking.to_s.empty?
+
+ text = extract_text_content(chunk.respond_to?(:content) ? chunk.content : chunk)
  next if text.empty?

  full_text << text
@@ -195,7 +198,7 @@ module Legion
  exec_ms = ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - exec_t0) * 1000).round
  log.debug("[llm][api][inference] action=executor_call duration_ms=#{exec_ms} request_id=#{request_id}")
  raw_msg = pipeline_response.message
- content = raw_msg.is_a?(Hash) ? (raw_msg[:content] || raw_msg['content']) : raw_msg.to_s
+ content = extract_text_content(raw_msg.is_a?(Hash) ? (raw_msg[:content] || raw_msg['content']) : raw_msg)
  routing = pipeline_response.routing || {}
  tokens = pipeline_response.tokens || {}
  tool_calls = extract_tool_calls(pipeline_response)
data/lib/legion/llm/call/embeddings.rb CHANGED
@@ -14,6 +14,7 @@ module Legion
  return { vector: nil, model: model, provider: provider, error: 'LLM not started' } unless LLM.started?

  provider ||= resolve_provider
+ return embedding_unavailable_result(model, provider) unless provider
  return { vector: nil, model: model, provider: provider, error: "provider #{provider} is disabled" } if provider_disabled?(provider)

  model ||= resolve_model(provider)
@@ -41,6 +42,8 @@ module Legion
  return texts.map { |_| { vector: nil, error: 'LLM not started' } } unless LLM.started?

  provider ||= resolve_provider
+ return unavailable_batch_result(texts, provider, model) unless provider
+
  disabled_result = disabled_batch_result(texts, provider, model)
  return disabled_result if disabled_result

@@ -74,6 +77,16 @@
  end
  end

+ def embedding_unavailable_result(model, provider)
+ { vector: nil, model: model, provider: provider, error: 'No embedding provider configured' }
+ end
+
+ def unavailable_batch_result(texts, provider, model)
+ texts.each_with_index.map do |_, i|
+ embedding_unavailable_result(model, provider).merge(dimensions: 0, index: i)
+ end
+ end
+
  def provider_disabled?(provider)
  return false unless provider

@@ -176,7 +189,7 @@ module Legion
  configured = embedding_settings[:provider]
  return configured&.to_sym if configured

- Legion::Settings.dig(:llm, :default_provider)&.to_sym
+ nil
  rescue StandardError => e
  handle_exception(e, level: :debug, operation: 'llm.embeddings.resolve_provider')
  nil
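The new guard methods return structured error hashes instead of raising. Copied standalone, the batch shape produced when no embedding provider is configured looks like this:

```ruby
# Standalone copy of the unavailable-provider result builders above; each
# batch entry carries a nil vector plus positional metadata.
def embedding_unavailable_result(model, provider)
  { vector: nil, model: model, provider: provider, error: 'No embedding provider configured' }
end

def unavailable_batch_result(texts, provider, model)
  texts.each_with_index.map do |_, i|
    embedding_unavailable_result(model, provider).merge(dimensions: 0, index: i)
  end
end

unavailable_batch_result(%w[alpha beta], nil, nil)
# => two hashes, each with vector: nil, dimensions: 0, and index 0 and 1
```

Callers get one result per input text, so batch consumers can keep indexing by position even on failure.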
data/lib/legion/llm/call/providers.rb CHANGED
@@ -82,12 +82,16 @@ module Legion
  usable_setting?(config[:api_key])
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.credential_available_for', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.credential_available_for', provider: provider)
  false
  end

  def usable_setting?(value)
- !Call::ClaudeConfigLoader.resolve_setting_reference(value).nil?
+ !resolve_credential_value(value).nil?
+ end
+
+ def resolve_credential_value(value)
+ Call::ClaudeConfigLoader.resolve_setting_reference(value)
  end

  def env_present?(key)
@@ -104,7 +108,7 @@
  Socket.tcp(addr, port, connect_timeout: 1).close
  true
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.ollama_running', base_url: url)
+ handle_exception(e, level: :warn, operation: 'llm.providers.ollama_running', base_url: url)
  false
  end

@@ -120,7 +124,7 @@
  end.get('/health')
  response.success?
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.vllm_running', base_url: url)
+ handle_exception(e, level: :warn, operation: 'llm.providers.vllm_running', base_url: url)
  false
  end

@@ -139,9 +143,15 @@
  end

  def configure_bedrock(config)
- has_sigv4 = usable_setting?(config[:api_key]) && usable_setting?(config[:secret_key])
- has_bearer = Call::ClaudeConfigLoader.resolve_setting_reference(config[:bearer_token])
+ resolved_api_key = resolve_credential_value(config[:api_key])
+ resolved_secret_key = resolve_credential_value(config[:secret_key])
+ resolved_session_token = resolve_credential_value(config[:session_token])
+ has_sigv4 = resolved_api_key && resolved_secret_key
+ has_bearer = resolve_credential_value(config[:bearer_token])
  config[:bearer_token] = has_bearer if has_bearer
+ config[:api_key] = resolved_api_key if resolved_api_key
+ config[:secret_key] = resolved_secret_key if resolved_secret_key
+ config[:session_token] = resolved_session_token if resolved_session_token

  unless has_sigv4 || has_bearer
  broker_creds = resolve_broker_aws_credentials
@@ -176,7 +186,7 @@

  def configure_anthropic(config)
  api_key = resolve_broker_credential(:anthropic) ||
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]) ||
+ resolve_credential_value(config[:api_key]) ||
  ENV.fetch('ANTHROPIC_API_KEY', nil)
  return unless api_key

@@ -189,7 +199,7 @@

  def configure_openai(config)
  api_key = resolve_broker_credential(:openai) ||
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]) ||
+ resolve_credential_value(config[:api_key]) ||
  ENV.fetch('OPENAI_API_KEY', nil) ||
  ENV.fetch('CODEX_API_KEY', nil)
  return unless api_key
@@ -202,7 +212,9 @@
  end

  def configure_gemini(config)
- api_key = resolve_broker_credential(:gemini) || config[:api_key]
+ api_key = resolve_broker_credential(:gemini) ||
+ resolve_credential_value(config[:api_key]) ||
+ ENV.fetch('GEMINI_API_KEY', nil)
  return unless api_key

  RubyLLM.configure do |c|
@@ -214,8 +226,8 @@

  def configure_azure(config)
  api_base = config[:api_base]
- api_key = resolve_broker_credential(:azure) || config[:api_key]
- auth_token = config[:auth_token]
+ api_key = resolve_broker_credential(:azure) || resolve_credential_value(config[:api_key])
+ auth_token = resolve_credential_value(config[:auth_token])
  return unless api_base && (api_key || auth_token)

  RubyLLM.configure do |c|
@@ -235,9 +247,10 @@

  def configure_vllm(config)
  base_url = config[:base_url] || 'http://localhost:8000/v1'
+ api_key = resolve_credential_value(config[:api_key])
  RubyLLM.configure do |c|
  c.vllm_api_base = base_url
- c.vllm_api_key = config[:api_key] if config[:api_key]
+ c.vllm_api_key = api_key if api_key
  end
  log.info "[llm][providers] configured vllm base_url=#{base_url.inspect}"
  end
@@ -294,22 +307,22 @@
  case provider
  when :bedrock
  candidates = []
- resolved_bearer = Call::ClaudeConfigLoader.resolve_setting_reference(config[:bearer_token])
+ resolved_bearer = resolve_credential_value(config[:bearer_token])
  bearer_env = ENV.fetch('AWS_BEARER_TOKEN_BEDROCK', nil)
  claude_bearer = Call::ClaudeConfigLoader.bedrock_bearer_token
  candidates += [resolved_bearer, bearer_env, claude_bearer].compact.uniq.map { |t| { bearer_token: t } }
- api_key = Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key])
- secret = Call::ClaudeConfigLoader.resolve_setting_reference(config[:secret_key])
+ api_key = resolve_credential_value(config[:api_key])
+ secret = resolve_credential_value(config[:secret_key])
  candidates << { api_key: api_key, secret_key: secret } if api_key && secret
  candidates
  when :anthropic
  [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('ANTHROPIC_API_KEY', nil)
  ].compact.uniq.map { |k| { api_key: k } }
  when :openai
  keys = [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('OPENAI_API_KEY', nil),
  ENV.fetch('CODEX_API_KEY', nil),
  Call::CodexConfigLoader.read_token
@@ -317,14 +330,14 @@
  keys.map { |k| { api_key: k } }
  when :gemini
  [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('GEMINI_API_KEY', nil)
  ].compact.uniq.map { |k| { api_key: k } }
  else
  []
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.collect_credential_candidates', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.collect_credential_candidates', provider: provider)
  []
  end

@@ -383,7 +396,7 @@
  end
  rescue StandardError => e
  log.warn "[llm][providers] health_check failed provider=#{provider} error=#{e.class}"
- handle_exception(e, level: :debug, operation: 'llm.providers.attempt_provider_call', provider: provider, model: model)
+ handle_exception(e, level: :warn, operation: 'llm.providers.attempt_provider_call', provider: provider, model: model)
  false
  end

@@ -398,19 +411,25 @@
  return :ok if model_ids.any? { |id| id.include?(target_model) || target_model.include?(id) }

  :model_missing
- rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError
+ rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_model_list.auth', provider: provider)
  :auth_error
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.probe_via_model_list', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.probe_via_model_list', provider: provider)
  probe_via_chat(provider, target_model)
  end

  def probe_via_chat(provider, model)
  RubyLLM.chat(model: model, provider: provider).ask('Respond with only the word: pong')
  :ok
- rescue RubyLLM::ModelNotFoundError
+ rescue RubyLLM::ModelNotFoundError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_chat.model_missing', provider: provider, model: model)
  :model_missing
- rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError
+ rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_chat.auth', provider: provider, model: model)
  :auth_error
  end

@@ -433,7 +452,7 @@
  ok = attempt_provider_call(:openai, openai_config[:default_model])
  openai_config[:enabled] = false unless ok
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.recover_openai_with_codex')
+ handle_exception(e, level: :warn, operation: 'llm.providers.recover_openai_with_codex')
  end

  def auto_register_providers
@@ -485,7 +504,7 @@

  Legion::Identity::Broker.token_for(provider_name)
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: "llm.providers.broker_resolve.#{provider_name}")
+ handle_exception(e, level: :warn, operation: "llm.providers.broker_resolve.#{provider_name}")
  nil
  end

@@ -497,7 +516,7 @@

  nil
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.broker_resolve.aws')
+ handle_exception(e, level: :warn, operation: 'llm.providers.broker_resolve.aws')
  nil
  end

@@ -512,7 +531,7 @@
  !Legion::Identity::Broker.token_for(provider).nil?
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.broker_credential_available', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.broker_credential_available', provider: provider)
  false
  end

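The repeated `resolve_credential_value` calls above delegate to `Call::ClaudeConfigLoader.resolve_setting_reference`, which this diff does not show. A hedged standalone sketch of the `env://` behavior described in the README addition; first-resolvable-wins for array fallbacks is an assumption, and `vault://` and other schemes are not modeled:

```ruby
# Hypothetical sketch of env:// placeholder resolution; the gem's real
# resolver also handles vault:// references and other schemes not shown here.
def resolve_credential_value(value)
  case value
  when Array
    # Assumed fallback semantics: the first entry that resolves wins.
    value.filter_map { |entry| resolve_credential_value(entry) }.first
  when %r{\Aenv://(.+)\z}
    resolved = ENV.fetch(Regexp.last_match(1), nil)
    resolved unless resolved.to_s.empty?
  else
    value # direct values pass through unchanged
  end
end

ENV['EXAMPLE_OPENAI_KEY'] = 'sk-demo'
resolve_credential_value(['env://MISSING_EXAMPLE_KEY', 'env://EXAMPLE_OPENAI_KEY'])
# => "sk-demo"
```

Returning `nil` for unresolved placeholders is what lets `usable_setting?` report the credential as absent, so hosted providers are not auto-enabled on placeholder-only configs.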
data/lib/legion/llm/call/structured_output.rb CHANGED
@@ -15,7 +15,7 @@ module Legion
  result = call_with_schema(messages, schema, model, provider: provider, **)
  log.info "[llm][structured_output] model=#{model} provider=#{provider} valid=true"

- content = result.respond_to?(:content) ? result.content : result[:content]
+ content = strip_markdown_fences(result.respond_to?(:content) ? result.content : result[:content])
  raw_model = result.respond_to?(:model_id) ? result.model_id : result[:model]

  parsed = Legion::JSON.load(content)
@@ -52,7 +52,7 @@
  if retry_enabled? && attempt < max_retries
  retry_with_instruction(messages, schema, model, provider: provider, attempt: attempt + 1, **opts)
  else
- raw = result.respond_to?(:content) ? result&.content : result&.dig(:content)
+ raw = strip_markdown_fences(result.respond_to?(:content) ? result&.content : result&.dig(:content))
  { data: nil, error: "JSON parse failed: #{error.message}", raw: raw, valid: false }
  end
  end
@@ -64,7 +64,7 @@
  model: model, provider: provider, intent: nil, tier: nil,
  message: user_content, **opts.except(:attempt))

- retry_content = result.respond_to?(:content) ? result.content : result[:content]
+ retry_content = strip_markdown_fences(result.respond_to?(:content) ? result.content : result[:content])
  retry_model = result.respond_to?(:model_id) ? result.model_id : result[:model]

  parsed = Legion::JSON.load(retry_content)
@@ -83,6 +83,18 @@
  parts.join("\n\n")
  end

+ def strip_markdown_fences(text)
+ return text unless text.is_a?(String)
+
+ stripped = text.strip
+ return stripped unless stripped.start_with?('```')
+
+ stripped
+ .sub(/\A`{3,}[[:space:]]*(?:json)?[[:space:]]*\n?/i, '')
+ .sub(/\n?[[:space:]]*`{3,}\z/, '')
+ .strip
+ end
+
  def supports_response_format?(model)
  SCHEMA_CAPABLE_MODELS.any? { |m| model.to_s.include?(m) }
  end
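A standalone copy of the new `strip_markdown_fences` method, exercised against a typical fenced model reply:

```ruby
# Standalone copy of strip_markdown_fences above: removes a leading plain or
# json-labeled backtick fence and a trailing fence so the remainder can be
# handed to the JSON parser.
def strip_markdown_fences(text)
  return text unless text.is_a?(String)

  stripped = text.strip
  return stripped unless stripped.start_with?('```')

  stripped
    .sub(/\A`{3,}[[:space:]]*(?:json)?[[:space:]]*\n?/i, '')
    .sub(/\n?[[:space:]]*`{3,}\z/, '')
    .strip
end

fenced = "```json\n{\"status\": \"ok\"}\n```"
strip_markdown_fences(fenced)
# => "{\"status\": \"ok\"}"
```

Non-string and unfenced inputs pass through unchanged, so the method is safe to apply unconditionally before every `Legion::JSON.load` call shown above.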