legion-llm 0.8.29 → 0.8.30

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +12 -0
  3. data/README.md +4 -1
  4. data/legion-llm.gemspec +1 -0
  5. data/lib/legion/llm/api/native/helpers.rb +81 -9
  6. data/lib/legion/llm/api/native/inference.rb +5 -2
  7. data/lib/legion/llm/call/providers.rb +47 -28
  8. data/lib/legion/llm/call/structured_output.rb +15 -3
  9. data/lib/legion/llm/inference/executor.rb +37 -32
  10. data/lib/legion/llm/router.rb +1 -1
  11. data/lib/legion/llm/settings.rb +3 -2
  12. data/lib/legion/llm/tools/adapter.rb +15 -0
  13. data/lib/legion/llm/version.rb +1 -1
  14. metadata +15 -24
  15. data/docs/2026-03-23-pipeline-gap-analysis.md +0 -203
  16. data/docs/example_settings.json +0 -16
  17. data/docs/examples/anthropic_request.json +0 -108
  18. data/docs/examples/anthropic_response.json +0 -90
  19. data/docs/examples/azure_ai_request.json +0 -103
  20. data/docs/examples/azure_ai_response.json +0 -91
  21. data/docs/examples/bedrock_request.json +0 -127
  22. data/docs/examples/bedrock_response.json +0 -93
  23. data/docs/examples/gemini_request.json +0 -127
  24. data/docs/examples/gemini_response.json +0 -109
  25. data/docs/examples/openai_request.json +0 -100
  26. data/docs/examples/openai_response.json +0 -77
  27. data/docs/examples/xai_request.json +0 -93
  28. data/docs/examples/xai_response.json +0 -48
  29. data/docs/gas-apollo-idea.md +0 -528
  30. data/docs/generation-augmented-storage.md +0 -135
  31. data/docs/llm-schema-spec.md +0 -2816
  32. data/docs/plans/2026-03-15-ollama-discovery-design.md +0 -164
  33. data/docs/plans/2026-03-15-ollama-discovery-implementation.md +0 -1147
  34. data/docs/routing-reenvisioned.md +0 -861
  35. data/docs/superpowers/plans/2026-04-15-sticky-runners-tool-history.md +0 -1866
  36. data/docs/superpowers/specs/2026-04-15-sticky-runners-tool-history-design.md +0 -713
  37. data/legion-llm-0.3.20.gem +0 -0
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4ec77012ba08ec5ed5cb8fd544fca1a28ee8993b5136852f647be4f7e8725309
- data.tar.gz: ed78d65c4966669e008c853c89983a0baa1f1fd8f28a3510bc8461544f9ead5c
+ metadata.gz: d560457b6321f55371b3dd14d546c8c23d11485b2b3ba5dec218cea028d50399
+ data.tar.gz: 8ee76eba6bf57f592f9d372fec7f8c5372d97c220869f988abe9e1521e766b7b
  SHA512:
- metadata.gz: 2a33fd3d2b5dcd7c36e11ef5d1715d03f71256c1826967cf987e1665443bed0123d9473e178f034e9aa6a6f62ab72b8a1608b6b187ee3554882d8aacc98ded04
- data.tar.gz: 3c701ef336fbb0695819860bf3f68c108887d040ccd960dddb944ffc3fcfabc2ab52dc3d0194384530756bf6b8891d0f6d091bbae7b4cdd9db62024f7bd9874e
+ metadata.gz: 0b81ee44a4f57a8ec0e9eb8aef043bdc543217baade9ff1b2772a46056ebc7a486d4e9acfded4fb05cedf341583cb8aaaa9091bd243c4f26eac9ff494ada3d01
+ data.tar.gz: 2ebf5a44aa5588a635c75ef6dc0b68c6554905d343b125cd944c1f5032399c4c714cad8953030e53ba3bb92efe0233371d5c2ebad12eebc5b4a1e2887a0e7c35
data/CHANGELOG.md CHANGED
@@ -1,5 +1,17 @@
  # Legion LLM Changelog
 
+ ## [0.8.30] - 2026-04-27
+
+ ### Fixed
+ - Structured output parsing now strips markdown code fences before JSON parse, including retry responses from models that keep returning fenced JSON.
+ - LLM tool adapter dispatch now symbolizes JSON/string-keyed tool arguments before invoking Ruby keyword-argument tool classes.
+ - Default routing chains now honor explicit `default_provider` / `default_model` before auto-enabled local providers, preventing Ollama defaults from overriding a configured Bedrock default.
+ - Provider credential setup now resolves `env://` placeholders consistently for Bedrock SigV4, Anthropic, OpenAI, Gemini, Azure, and vLLM, and unresolved placeholder arrays no longer auto-enable hosted providers.
+ - Native `/api/llm/inference` responses now flatten structured provider content blocks into plain text for both streaming SSE deltas and non-streaming JSON responses, preventing Anthropic/Bedrock-style block arrays from being stored and replayed as nested JSON-looking assistant replies.
+ - Native `/api/llm/inference` streaming now emits `thinking-delta` SSE events for provider reasoning chunks without appending those chunks to final assistant content.
+ - Native `file_read` client tools now extract text from PDFs via `pdf-reader` and return a clear unsupported-binary message for non-text binary files.
+ - Local providers now cap automatically injected registry tools with `llm.tool_trigger.local_tool_limit`, prioritizing trigger-matched tools before always-loaded tools for Ollama/vLLM requests.
+
  ## [0.8.29] - 2026-04-27
 
  ### Added
data/README.md CHANGED
@@ -2,7 +2,7 @@
 
  LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension. Exposes OpenAI- and Anthropic-compatible API endpoints so external tools can point at the Legion daemon and just work.
 
- **Version**: 0.8.0
+ **Version**: 0.8.30
 
  ## Installation
 
@@ -60,6 +60,7 @@ Requests flow through the full Inference pipeline — routing, metering, audit,
  Both formats supported with correct SSE shapes:
  - **OpenAI**: `data: {"choices":[{"delta":{"content":"..."}}]}` chunks, terminated by `data: [DONE]`
  - **Anthropic**: Typed events — `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`
+ - **Native**: `/api/llm/inference` streams `text-delta`, `thinking-delta`, tool lifecycle events, and a final `done` event. Structured provider content blocks are flattened to plain text in both streaming and non-streaming native responses so `content` remains a string for daemon clients.
 
  ### API Authentication
 
@@ -851,6 +852,8 @@ No code changes are needed in consumers immediately. The aliases will be maintai
  | Azure AI | `azure` | `vault://`, `env://`, or direct | Azure OpenAI endpoint; `api_base` + `api_key` or `auth_token` |
  | Ollama | `ollama` | Local, no credentials needed | Local inference |
 
+ `env://NAME` credential placeholders resolve at provider configuration time, including array fallbacks such as `["env://OPENAI_API_KEY", "env://CODEX_API_KEY"]`. Unresolved placeholders do not auto-enable hosted providers.
+
  ## Integration with LegionIO
 
  legion-llm follows the standard core gem lifecycle:
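The `env://` fallback behavior documented above can be sketched in isolation. This is a hypothetical mini-resolver, not the gem's actual `Call::ClaudeConfigLoader.resolve_setting_reference` implementation; the method name and the blank-string handling are assumptions:

```ruby
# Hypothetical sketch of env:// placeholder resolution with array fallbacks.
# Arrays are tried in order; the first placeholder that resolves wins.
def resolve_env_placeholder(value)
  if value.is_a?(Array)
    value.each do |candidate|
      resolved = resolve_env_placeholder(candidate)
      return resolved if resolved
    end
    return nil
  end
  return value unless value.is_a?(String) && value.start_with?('env://')

  # Unset or empty variables resolve to nil, so callers can treat the
  # credential as absent instead of enabling a provider with a bad key.
  resolved = ENV.fetch(value.delete_prefix('env://'), nil)
  resolved && !resolved.empty? ? resolved : nil
end
```

A `nil` result for every candidate is what lets provider setup skip auto-enabling a hosted provider whose placeholders never resolved.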
data/legion-llm.gemspec CHANGED
@@ -35,6 +35,7 @@ Gem::Specification.new do |spec|
  spec.add_dependency 'lex-gemini'
  spec.add_dependency 'lex-knowledge'
  spec.add_dependency 'lex-openai'
+ spec.add_dependency 'pdf-reader'
  spec.add_dependency 'ruby_llm', '~> 1.13'
  spec.add_dependency 'tzinfo', '>= 2.0'
  end
data/lib/legion/llm/api/native/helpers.rb CHANGED
@@ -10,12 +10,14 @@ module Legion
  module API
  module Native
  module ClientToolMethods
+ include Legion::Logging::Helper
+
  private
 
  def log_tool(level, ref, status, **details)
  parts = ["[tool][#{ref}] #{status}"]
  details.each { |k, v| parts << "#{k}=#{v}" }
- Legion::Logging.send(level, parts.join(' '))
+ log.public_send(level, parts.join(' '))
  end
 
  def summarize_tool_arg_keys(kwargs)
@@ -37,7 +39,7 @@ module Legion
  end
  end
 
- def dispatch_client_tool(ref, **kwargs) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
+ def dispatch_client_tool(ref, **kwargs) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/MethodLength,Metrics/PerceivedComplexity
  case ref
  when 'sh'
  cmd = kwargs[:command] || kwargs[:cmd] || kwargs.values.first.to_s
@@ -45,7 +47,7 @@ module Legion
  "exit=#{status.exitstatus}\n#{output}"
  when 'file_read'
  path = kwargs[:path] || kwargs[:file_path] || kwargs.values.first.to_s
- ::File.exist?(path) ? ::File.read(path, encoding: 'utf-8') : "File not found: #{path}"
+ read_client_file(path)
  when 'file_write'
  path = kwargs[:path] || kwargs[:file_path]
  content = kwargs[:content] || kwargs[:contents]
@@ -82,6 +84,7 @@ module Legion
  max_length ? content[0, max_length] : content
  rescue LoadError => e
  missing = e.respond_to?(:path) && e.path ? e.path : 'legion/cli/chat/web_fetch'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.web_fetch', missing: missing)
  "web_fetch is unavailable: missing optional dependency #{missing}"
  end
  when 'web_search'
@@ -93,6 +96,7 @@ module Legion
  results[:results].map { |r| "### #{r[:title]}\n#{r[:url]}\n#{r[:snippet]}" }.join("\n\n")
  rescue LoadError => e
  missing = e.respond_to?(:path) && e.path ? e.path : 'legion/cli/chat/web_search'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.web_search', missing: missing)
  "web_search is unavailable: missing optional dependency #{missing}"
  end
  else
@@ -100,6 +104,51 @@ module Legion
  end
  end
 
+ def read_client_file(path)
+ return "File not found: #{path}" unless ::File.exist?(path)
+
+ return read_pdf_text(path) if pdf_file?(path)
+
+ content = ::File.binread(path)
+ return 'Binary file detected, cannot read as text.' if binary_content?(content)
+
+ content.force_encoding('UTF-8')
+ content
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.file_read', path: path)
+ "file_read error: #{e.message}"
+ end
+
+ def pdf_file?(path)
+ ::File.extname(path).casecmp('.pdf').zero? || ::File.binread(path, 5) == '%PDF-'
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_sniff', path: path)
+ false
+ end
+
+ def read_pdf_text(path)
+ require 'pdf-reader' unless defined?(::PDF::Reader)
+
+ reader = ::PDF::Reader.new(path)
+ text = reader.pages.map(&:text).join("\n\n").strip
+ text.empty? ? 'PDF contained no extractable text.' : text
+ rescue LoadError => e
+ missing = e.respond_to?(:path) && e.path ? e.path : 'pdf-reader'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_extract', missing: missing)
+ 'PDF text extraction unavailable: missing pdf-reader gem.'
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_extract', path: path)
+ "PDF text extraction failed: #{e.message}"
+ end
+
+ def binary_content?(content)
+ return true if content.include?("\x00")
+
+ sample = content.byteslice(0, 4096).to_s
+ sample.force_encoding('UTF-8')
+ !sample.valid_encoding?
+ end
+
  def notify_tool_event(type, ref, **data)
  handler = Thread.current[:legion_tool_event_handler]
  return unless handler
@@ -257,13 +306,14 @@ module Legion
  rescue StandardError => e
  ms = begin
  ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - t0) * 1000).round(1)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.api.client_tool.duration_measurement', tool_ref: tool_ref)
  nil
  end
  log_tool(:error, tool_ref, 'failed', duration_ms: ms, error: e.message)
  notify_tool_event(:tool_error, tool_ref, error: e.message)
- Legion::Logging.log_exception(e, payload_summary: "client tool #{tool_ref} failed",
- component_type: :api)
+ handle_exception(e, level: :error, handled: true, operation: "llm.api.client_tool.#{tool_ref}")
  "Tool error: #{e.message}"
  end
  end
@@ -287,6 +337,25 @@ module Legion
  end
  end
 
+ define_method(:extract_text_content) do |content|
+ case content
+ when nil
+ ''
+ when String
+ content
+ when Array
+ content.filter_map { |entry| extract_text_content(entry) }.join
+ when Hash
+ type = content[:type] || content['type']
+ return '' unless type.nil? || type.to_s == 'text'
+
+ text = content.key?(:text) || content.key?('text') ? (content[:text] || content['text']) : (content[:content] || content['content'])
+ extract_text_content(text)
+ else
+ content.to_s
+ end
+ end
+
  define_method(:emit_sse_event) do |stream, event_name, payload|
  level = event_name == 'text-delta' ? :debug : :info
  log.send(level, "[sse][emit] event=#{event_name} keys=#{payload.is_a?(Hash) ? payload.keys.join(',') : 'n/a'}")
@@ -333,7 +402,8 @@ module Legion
 
  kerb = begin
  Legion::Settings.dig(:kerberos, :username)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.kerberos_username')
  nil
  end
  return "user:#{kerb}" if kerb.is_a?(String) && !kerb.empty?
@@ -354,14 +424,16 @@ module Legion
  define_method(:resolve_requested_by) do |rack_env, identity_string|
  hostname = begin
  Legion::Settings[:client][:hostname]
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.client_hostname')
  Socket.gethostname
  end
  username = identity_string.delete_prefix('user:')
 
  kerb = begin
  Legion::Settings.dig(:kerberos, :username)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.requested_by_kerberos')
  nil
  end
  if kerb.is_a?(String) && !kerb.empty?
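The `extract_text_content` helper added above can be exercised on its own. This standalone mirror of that logic (renamed `flatten_content` so it is not mistaken for the gem's API) shows how an Anthropic/Bedrock-style content block array collapses to a plain string while non-text blocks such as `thinking` are dropped:

```ruby
# Standalone mirror of the content-block flattening logic from helpers.rb.
def flatten_content(content)
  case content
  when nil then ''
  when String then content
  when Array then content.filter_map { |entry| flatten_content(entry) }.join
  when Hash
    # Only text-typed (or untyped) blocks contribute to the flattened string.
    type = content[:type] || content['type']
    return '' unless type.nil? || type.to_s == 'text'

    text = content.key?(:text) || content.key?('text') ? (content[:text] || content['text']) : (content[:content] || content['content'])
    flatten_content(text)
  else content.to_s
  end
end
```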
data/lib/legion/llm/api/native/inference.rb CHANGED
@@ -150,7 +150,10 @@ module Legion
  }
 
  pipeline_response = executor.call_stream do |chunk|
- text = chunk.respond_to?(:content) ? chunk.content.to_s : chunk.to_s
+ thinking = extract_text_content(chunk.thinking) if chunk.respond_to?(:thinking)
+ emit_sse_event(out, 'thinking-delta', { delta: thinking }) unless thinking.to_s.empty?
+
+ text = extract_text_content(chunk.respond_to?(:content) ? chunk.content : chunk)
  next if text.empty?
 
  full_text << text
@@ -195,7 +198,7 @@ module Legion
  exec_ms = ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - exec_t0) * 1000).round
  log.debug("[llm][api][inference] action=executor_call duration_ms=#{exec_ms} request_id=#{request_id}")
  raw_msg = pipeline_response.message
- content = raw_msg.is_a?(Hash) ? (raw_msg[:content] || raw_msg['content']) : raw_msg.to_s
+ content = extract_text_content(raw_msg.is_a?(Hash) ? (raw_msg[:content] || raw_msg['content']) : raw_msg)
  routing = pipeline_response.routing || {}
  tokens = pipeline_response.tokens || {}
  tool_calls = extract_tool_calls(pipeline_response)
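The streaming change above routes reasoning chunks to `thinking-delta` events while only text chunks accumulate into the final reply. A rough standalone sketch of that split, with `Chunk` standing in for the ruby_llm streaming chunk object (names and event shape are illustrative, not the gem's exact SSE payloads):

```ruby
# Hypothetical stand-in for a streaming chunk carrying text and/or reasoning.
Chunk = Struct.new(:content, :thinking, keyword_init: true)

# Emit thinking as its own event; append only text to the final content.
def route_chunk(chunk, events, full_text)
  thinking = chunk.thinking.to_s
  events << ['thinking-delta', thinking] unless thinking.empty?

  text = chunk.content.to_s
  return if text.empty?

  full_text << text
  events << ['text-delta', text]
end
```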
data/lib/legion/llm/call/providers.rb CHANGED
@@ -82,12 +82,16 @@ module Legion
  usable_setting?(config[:api_key])
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.credential_available_for', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.credential_available_for', provider: provider)
  false
  end
 
  def usable_setting?(value)
- !Call::ClaudeConfigLoader.resolve_setting_reference(value).nil?
+ !resolve_credential_value(value).nil?
+ end
+
+ def resolve_credential_value(value)
+ Call::ClaudeConfigLoader.resolve_setting_reference(value)
  end
 
  def env_present?(key)
@@ -104,7 +108,7 @@ module Legion
  Socket.tcp(addr, port, connect_timeout: 1).close
  true
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.ollama_running', base_url: url)
+ handle_exception(e, level: :warn, operation: 'llm.providers.ollama_running', base_url: url)
  false
  end
 
@@ -120,7 +124,7 @@ module Legion
  end.get('/health')
  response.success?
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.vllm_running', base_url: url)
+ handle_exception(e, level: :warn, operation: 'llm.providers.vllm_running', base_url: url)
  false
  end
 
@@ -139,9 +143,15 @@ module Legion
  end
 
  def configure_bedrock(config)
- has_sigv4 = usable_setting?(config[:api_key]) && usable_setting?(config[:secret_key])
- has_bearer = Call::ClaudeConfigLoader.resolve_setting_reference(config[:bearer_token])
+ resolved_api_key = resolve_credential_value(config[:api_key])
+ resolved_secret_key = resolve_credential_value(config[:secret_key])
+ resolved_session_token = resolve_credential_value(config[:session_token])
+ has_sigv4 = resolved_api_key && resolved_secret_key
+ has_bearer = resolve_credential_value(config[:bearer_token])
  config[:bearer_token] = has_bearer if has_bearer
+ config[:api_key] = resolved_api_key if resolved_api_key
+ config[:secret_key] = resolved_secret_key if resolved_secret_key
+ config[:session_token] = resolved_session_token if resolved_session_token
 
  unless has_sigv4 || has_bearer
  broker_creds = resolve_broker_aws_credentials
@@ -176,7 +186,7 @@ module Legion
 
  def configure_anthropic(config)
  api_key = resolve_broker_credential(:anthropic) ||
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]) ||
+ resolve_credential_value(config[:api_key]) ||
  ENV.fetch('ANTHROPIC_API_KEY', nil)
  return unless api_key
 
@@ -189,7 +199,7 @@ module Legion
 
  def configure_openai(config)
  api_key = resolve_broker_credential(:openai) ||
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]) ||
+ resolve_credential_value(config[:api_key]) ||
  ENV.fetch('OPENAI_API_KEY', nil) ||
  ENV.fetch('CODEX_API_KEY', nil)
  return unless api_key
@@ -202,7 +212,9 @@ module Legion
  end
  end
 
  def configure_gemini(config)
- api_key = resolve_broker_credential(:gemini) || config[:api_key]
+ api_key = resolve_broker_credential(:gemini) ||
+ resolve_credential_value(config[:api_key]) ||
+ ENV.fetch('GEMINI_API_KEY', nil)
  return unless api_key
 
  RubyLLM.configure do |c|
@@ -214,8 +226,8 @@ module Legion
 
  def configure_azure(config)
  api_base = config[:api_base]
- api_key = resolve_broker_credential(:azure) || config[:api_key]
- auth_token = config[:auth_token]
+ api_key = resolve_broker_credential(:azure) || resolve_credential_value(config[:api_key])
+ auth_token = resolve_credential_value(config[:auth_token])
  return unless api_base && (api_key || auth_token)
 
  RubyLLM.configure do |c|
@@ -235,9 +247,10 @@ module Legion
 
  def configure_vllm(config)
  base_url = config[:base_url] || 'http://localhost:8000/v1'
+ api_key = resolve_credential_value(config[:api_key])
  RubyLLM.configure do |c|
  c.vllm_api_base = base_url
- c.vllm_api_key = config[:api_key] if config[:api_key]
+ c.vllm_api_key = api_key if api_key
  end
  log.info "[llm][providers] configured vllm base_url=#{base_url.inspect}"
  end
@@ -294,22 +307,22 @@ module Legion
  case provider
  when :bedrock
  candidates = []
- resolved_bearer = Call::ClaudeConfigLoader.resolve_setting_reference(config[:bearer_token])
+ resolved_bearer = resolve_credential_value(config[:bearer_token])
  bearer_env = ENV.fetch('AWS_BEARER_TOKEN_BEDROCK', nil)
  claude_bearer = Call::ClaudeConfigLoader.bedrock_bearer_token
  candidates += [resolved_bearer, bearer_env, claude_bearer].compact.uniq.map { |t| { bearer_token: t } }
- api_key = Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key])
- secret = Call::ClaudeConfigLoader.resolve_setting_reference(config[:secret_key])
+ api_key = resolve_credential_value(config[:api_key])
+ secret = resolve_credential_value(config[:secret_key])
  candidates << { api_key: api_key, secret_key: secret } if api_key && secret
  candidates
  when :anthropic
  [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('ANTHROPIC_API_KEY', nil)
  ].compact.uniq.map { |k| { api_key: k } }
  when :openai
  keys = [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('OPENAI_API_KEY', nil),
  ENV.fetch('CODEX_API_KEY', nil),
  Call::CodexConfigLoader.read_token
@@ -317,14 +330,14 @@ module Legion
  keys.map { |k| { api_key: k } }
  when :gemini
  [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('GEMINI_API_KEY', nil)
  ].compact.uniq.map { |k| { api_key: k } }
  else
  []
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.collect_credential_candidates', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.collect_credential_candidates', provider: provider)
  []
  end
 
@@ -383,7 +396,7 @@ module Legion
  end
  rescue StandardError => e
  log.warn "[llm][providers] health_check failed provider=#{provider} error=#{e.class}"
- handle_exception(e, level: :debug, operation: 'llm.providers.attempt_provider_call', provider: provider, model: model)
+ handle_exception(e, level: :warn, operation: 'llm.providers.attempt_provider_call', provider: provider, model: model)
  false
  end
 
@@ -398,19 +411,25 @@ module Legion
  return :ok if model_ids.any? { |id| id.include?(target_model) || target_model.include?(id) }
 
  :model_missing
- rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError
+ rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_model_list.auth', provider: provider)
  :auth_error
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.probe_via_model_list', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.probe_via_model_list', provider: provider)
  probe_via_chat(provider, target_model)
  end
 
  def probe_via_chat(provider, model)
  RubyLLM.chat(model: model, provider: provider).ask('Respond with only the word: pong')
  :ok
- rescue RubyLLM::ModelNotFoundError
+ rescue RubyLLM::ModelNotFoundError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_chat.model_missing', provider: provider, model: model)
  :model_missing
- rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError
+ rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_chat.auth', provider: provider, model: model)
  :auth_error
  end
 
@@ -433,7 +452,7 @@ module Legion
  ok = attempt_provider_call(:openai, openai_config[:default_model])
  openai_config[:enabled] = false unless ok
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.recover_openai_with_codex')
+ handle_exception(e, level: :warn, operation: 'llm.providers.recover_openai_with_codex')
  end
 
  def auto_register_providers
@@ -485,7 +504,7 @@ module Legion
 
  Legion::Identity::Broker.token_for(provider_name)
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: "llm.providers.broker_resolve.#{provider_name}")
+ handle_exception(e, level: :warn, operation: "llm.providers.broker_resolve.#{provider_name}")
  nil
  end
 
@@ -497,7 +516,7 @@ module Legion
 
  nil
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.broker_resolve.aws')
+ handle_exception(e, level: :warn, operation: 'llm.providers.broker_resolve.aws')
  nil
  end
 
@@ -512,7 +531,7 @@ module Legion
  !Legion::Identity::Broker.token_for(provider).nil?
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.broker_credential_available', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.broker_credential_available', provider: provider)
  false
  end
 
data/lib/legion/llm/call/structured_output.rb CHANGED
@@ -15,7 +15,7 @@ module Legion
  result = call_with_schema(messages, schema, model, provider: provider, **)
  log.info "[llm][structured_output] model=#{model} provider=#{provider} valid=true"
 
- content = result.respond_to?(:content) ? result.content : result[:content]
+ content = strip_markdown_fences(result.respond_to?(:content) ? result.content : result[:content])
  raw_model = result.respond_to?(:model_id) ? result.model_id : result[:model]
 
  parsed = Legion::JSON.load(content)
@@ -52,7 +52,7 @@ module Legion
  if retry_enabled? && attempt < max_retries
  retry_with_instruction(messages, schema, model, provider: provider, attempt: attempt + 1, **opts)
  else
- raw = result.respond_to?(:content) ? result&.content : result&.dig(:content)
+ raw = strip_markdown_fences(result.respond_to?(:content) ? result&.content : result&.dig(:content))
  { data: nil, error: "JSON parse failed: #{error.message}", raw: raw, valid: false }
  end
  end
@@ -64,7 +64,7 @@ module Legion
  model: model, provider: provider, intent: nil, tier: nil,
  message: user_content, **opts.except(:attempt))
 
- retry_content = result.respond_to?(:content) ? result.content : result[:content]
+ retry_content = strip_markdown_fences(result.respond_to?(:content) ? result.content : result[:content])
  retry_model = result.respond_to?(:model_id) ? result.model_id : result[:model]
 
  parsed = Legion::JSON.load(retry_content)
@@ -83,6 +83,18 @@ module Legion
  parts.join("\n\n")
  end
 
+ def strip_markdown_fences(text)
+ return text unless text.is_a?(String)
+
+ stripped = text.strip
+ return stripped unless stripped.start_with?('```')
+
+ stripped
+ .sub(/\A`{3,}[[:space:]]*(?:json)?[[:space:]]*\n?/i, '')
+ .sub(/\n?[[:space:]]*`{3,}\z/, '')
+ .strip
+ end
+
  def supports_response_format?(model)
  SCHEMA_CAPABLE_MODELS.any? { |m| model.to_s.include?(m) }
  end
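The `strip_markdown_fences` helper added above is self-contained and easy to verify in isolation; the copy below is extracted verbatim from the hunk so fenced JSON replies parse cleanly:

```ruby
# Verbatim copy of the fence-stripping helper from structured_output.rb.
# Leading fences (with an optional json language tag) and trailing fences
# are removed; unfenced input is only whitespace-stripped.
def strip_markdown_fences(text)
  return text unless text.is_a?(String)

  stripped = text.strip
  return stripped unless stripped.start_with?('```')

  stripped
    .sub(/\A`{3,}[[:space:]]*(?:json)?[[:space:]]*\n?/i, '')
    .sub(/\n?[[:space:]]*`{3,}\z/, '')
    .strip
end
```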
data/lib/legion/llm/inference/executor.rb CHANGED
@@ -102,30 +102,15 @@ module Legion
 
  injected_names = []
 
- # Always-loaded tools — inject all unconditionally
- ::Legion::Tools::Registry.tools.each do |tool_class|
- adapter = ToolAdapter.new(tool_class)
- @injected_tool_map[adapter.name] = tool_class
- session.with_tool(adapter)
- injected_names << adapter.name
- rescue StandardError => e
- @warnings << "Failed to inject always tool: #{e.message}"
- handle_exception(e, level: :warn, operation: 'llm.pipeline.inject_always_tool')
- end
+ always_tools = Array(::Legion::Tools::Registry.tools)
+ triggered_tools = @triggered_tools.any? ? Array(@triggered_tools) : []
+ inject_limit = registry_tool_limit
+ prioritized_tools = local_provider? ? triggered_tools + always_tools : always_tools + triggered_tools
 
- # Trigger-matched tools — inject tools surfaced by trigger word matching
- if @triggered_tools.any?
- @triggered_tools.each do |tool_class|
- adapter = ToolAdapter.new(tool_class)
- next if injected_names.include?(adapter.name)
-
- @injected_tool_map[adapter.name] = tool_class
- session.with_tool(adapter)
- injected_names << adapter.name
- rescue StandardError => e
- @warnings << "Failed to inject triggered tool: #{e.message}"
- handle_exception(e, level: :warn, operation: 'llm.pipeline.inject_triggered_tool')
- end
+ prioritized_tools.each do |tool_class|
+ break if inject_limit && injected_names.size >= inject_limit
+
+ inject_tool_class(session, tool_class, injected_names, operation: 'llm.pipeline.inject_registry_tool')
  end
 
  # Requested deferred tools — inject only if explicitly requested
@@ -133,15 +118,9 @@ module Legion
  requested = requested_deferred_tool_names
  if requested.any?
  deferred.each do |tool_class|
- adapter = ToolAdapter.new(tool_class)
- next unless requested.include?(adapter.name)
-
- @injected_tool_map[adapter.name] = tool_class
- session.with_tool(adapter)
- injected_names << adapter.name
- rescue StandardError => e
- @warnings << "Failed to inject deferred tool: #{e.message}"
- handle_exception(e, level: :warn, operation: 'llm.pipeline.inject_deferred_tool')
+ inject_tool_class(session, tool_class, injected_names, operation: 'llm.pipeline.inject_deferred_tool') do |adapter|
+ requested.include?(adapter.name)
+ end
  end
  end
 
@@ -149,6 +128,7 @@ module Legion
  "[llm][tools] inject request_id=#{@request.id} " \
  "always=#{::Legion::Tools::Registry.tools.size} " \
  "triggered=#{@triggered_tools.size} " \
+ "limit=#{inject_limit || 'none'} " \
  "deferred_available=#{deferred.size} " \
  "requested_deferred=#{requested.size} " \
  "injected=#{injected_names.size} names=#{injected_names.first(25).join(',')}"
@@ -158,6 +138,31 @@ module Legion
  handle_exception(e, level: :warn, operation: 'llm.pipeline.inject_tools')
  end
 
+ def inject_tool_class(session, tool_class, injected_names, operation:)
+ adapter = ToolAdapter.new(tool_class)
+ return if injected_names.include?(adapter.name)
+ return if block_given? && !yield(adapter)
+
+ @injected_tool_map[adapter.name] = tool_class
+ session.with_tool(adapter)
+ injected_names << adapter.name
+ rescue StandardError => e
+ @warnings << "Failed to inject tool: #{e.message}"
+ handle_exception(e, level: :warn, operation: operation)
+ end
+
+ def registry_tool_limit
+ return nil unless local_provider?
+
+ raw_limit = (Legion::LLM.settings[:tool_trigger] || {})[:local_tool_limit]
+ limit = raw_limit.to_i
+ limit.positive? ? limit : nil
+ end
+
+ def local_provider?
+ %i[ollama vllm].include?(@resolved_provider&.to_sym)
+ end
+
  # Backwards compatibility alias
  alias inject_discovered_tools inject_registry_tools
 
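The prioritization and cap behavior above reduces to a small ordering function. A sketch with hypothetical names (`tools_to_inject` is not part of the gem): for local providers, trigger-matched tools come first and the deduplicated list is truncated to the limit; hosted providers keep the always-loaded tools first with no cap.

```ruby
# Hypothetical reduction of the local-provider tool cap: order, dedupe, truncate.
def tools_to_inject(always:, triggered:, local:, limit:)
  ordered = local ? triggered + always : always + triggered
  deduped = ordered.uniq
  local && limit ? deduped.first(limit) : deduped
end
```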
data/lib/legion/llm/router.rb CHANGED
@@ -347,7 +347,7 @@ module Legion
  end
 
  def chain_from_defaults(model, provider, max)
- if provider || model
+ if provider || model || default_settings_provider || default_settings_model
  p = (provider || default_settings_provider)&.to_sym
  res = Resolution.new(tier: PROVIDER_TIER.fetch(p, :frontier),
  provider: p || :anthropic,
data/lib/legion/llm/settings.rb CHANGED
@@ -284,8 +284,9 @@ module Legion
 
  def self.tool_trigger_defaults
  {
- scan_depth: 10,
- tool_limit: 25
+ scan_depth: 10,
+ tool_limit: 25,
+ local_tool_limit: 10
  }
  end