legion-llm 0.8.29 → 0.8.30

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +12 -0
  3. data/README.md +4 -1
  4. data/legion-llm.gemspec +1 -0
  5. data/lib/legion/llm/api/native/helpers.rb +81 -9
  6. data/lib/legion/llm/api/native/inference.rb +5 -2
  7. data/lib/legion/llm/call/providers.rb +47 -28
  8. data/lib/legion/llm/call/structured_output.rb +15 -3
  9. data/lib/legion/llm/inference/executor.rb +37 -32
  10. data/lib/legion/llm/router.rb +1 -1
  11. data/lib/legion/llm/settings.rb +3 -2
  12. data/lib/legion/llm/tools/adapter.rb +15 -0
  13. data/lib/legion/llm/version.rb +1 -1
  14. metadata +15 -24
  15. data/docs/2026-03-23-pipeline-gap-analysis.md +0 -203
  16. data/docs/example_settings.json +0 -16
  17. data/docs/examples/anthropic_request.json +0 -108
  18. data/docs/examples/anthropic_response.json +0 -90
  19. data/docs/examples/azure_ai_request.json +0 -103
  20. data/docs/examples/azure_ai_response.json +0 -91
  21. data/docs/examples/bedrock_request.json +0 -127
  22. data/docs/examples/bedrock_response.json +0 -93
  23. data/docs/examples/gemini_request.json +0 -127
  24. data/docs/examples/gemini_response.json +0 -109
  25. data/docs/examples/openai_request.json +0 -100
  26. data/docs/examples/openai_response.json +0 -77
  27. data/docs/examples/xai_request.json +0 -93
  28. data/docs/examples/xai_response.json +0 -48
  29. data/docs/gas-apollo-idea.md +0 -528
  30. data/docs/generation-augmented-storage.md +0 -135
  31. data/docs/llm-schema-spec.md +0 -2816
  32. data/docs/plans/2026-03-15-ollama-discovery-design.md +0 -164
  33. data/docs/plans/2026-03-15-ollama-discovery-implementation.md +0 -1147
  34. data/docs/routing-reenvisioned.md +0 -861
  35. data/docs/superpowers/plans/2026-04-15-sticky-runners-tool-history.md +0 -1866
  36. data/docs/superpowers/specs/2026-04-15-sticky-runners-tool-history-design.md +0 -713
  37. data/legion-llm-0.3.20.gem +0 -0
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4ec77012ba08ec5ed5cb8fd544fca1a28ee8993b5136852f647be4f7e8725309
- data.tar.gz: ed78d65c4966669e008c853c89983a0baa1f1fd8f28a3510bc8461544f9ead5c
+ metadata.gz: d560457b6321f55371b3dd14d546c8c23d11485b2b3ba5dec218cea028d50399
+ data.tar.gz: 8ee76eba6bf57f592f9d372fec7f8c5372d97c220869f988abe9e1521e766b7b
  SHA512:
- metadata.gz: 2a33fd3d2b5dcd7c36e11ef5d1715d03f71256c1826967cf987e1665443bed0123d9473e178f034e9aa6a6f62ab72b8a1608b6b187ee3554882d8aacc98ded04
- data.tar.gz: 3c701ef336fbb0695819860bf3f68c108887d040ccd960dddb944ffc3fcfabc2ab52dc3d0194384530756bf6b8891d0f6d091bbae7b4cdd9db62024f7bd9874e
+ metadata.gz: 0b81ee44a4f57a8ec0e9eb8aef043bdc543217baade9ff1b2772a46056ebc7a486d4e9acfded4fb05cedf341583cb8aaaa9091bd243c4f26eac9ff494ada3d01
+ data.tar.gz: 2ebf5a44aa5588a635c75ef6dc0b68c6554905d343b125cd944c1f5032399c4c714cad8953030e53ba3bb92efe0233371d5c2ebad12eebc5b4a1e2887a0e7c35
data/CHANGELOG.md CHANGED
@@ -1,5 +1,17 @@
  # Legion LLM Changelog
 
+ ## [0.8.30] - 2026-04-27
+
+ ### Fixed
+ - Structured output parsing now strips markdown code fences before JSON parse, including retry responses from models that keep returning fenced JSON.
+ - LLM tool adapter dispatch now symbolizes JSON/string-keyed tool arguments before invoking Ruby keyword-argument tool classes.
+ - Default routing chains now honor explicit `default_provider` / `default_model` before auto-enabled local providers, preventing Ollama defaults from overriding a configured Bedrock default.
+ - Provider credential setup now resolves `env://` placeholders consistently for Bedrock SigV4, Anthropic, OpenAI, Gemini, Azure, and vLLM, and unresolved placeholder arrays no longer auto-enable hosted providers.
+ - Native `/api/llm/inference` responses now flatten structured provider content blocks into plain text for both streaming SSE deltas and non-streaming JSON responses, preventing Anthropic/Bedrock-style block arrays from being stored and replayed as nested JSON-looking assistant replies.
+ - Native `/api/llm/inference` streaming now emits `thinking-delta` SSE events for provider reasoning chunks without appending those chunks to final assistant content.
+ - Native `file_read` client tools now extract text from PDFs via `pdf-reader` and return a clear unsupported-binary message for non-text binary files.
+ - Local providers now cap automatically injected registry tools with `llm.tool_trigger.local_tool_limit`, prioritizing trigger-matched tools before always-loaded tools for Ollama/vLLM requests.
+
  ## [0.8.29] - 2026-04-27
 
  ### Added
data/README.md CHANGED
@@ -2,7 +2,7 @@
 
  LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension. Exposes OpenAI- and Anthropic-compatible API endpoints so external tools can point at the Legion daemon and just work.
 
- **Version**: 0.8.0
+ **Version**: 0.8.30
 
  ## Installation
 
@@ -60,6 +60,7 @@ Requests flow through the full Inference pipeline — routing, metering, audit,
  Both formats supported with correct SSE shapes:
  - **OpenAI**: `data: {"choices":[{"delta":{"content":"..."}}]}` chunks, terminated by `data: [DONE]`
  - **Anthropic**: Typed events — `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`
+ - **Native**: `/api/llm/inference` streams `text-delta`, `thinking-delta`, tool lifecycle events, and a final `done` event. Structured provider content blocks are flattened to plain text in both streaming and non-streaming native responses so `content` remains a string for daemon clients.
 
  ### API Authentication
 
@@ -851,6 +852,8 @@ No code changes are needed in consumers immediately. The aliases will be maintai
  | Azure AI | `azure` | `vault://`, `env://`, or direct | Azure OpenAI endpoint; `api_base` + `api_key` or `auth_token` |
  | Ollama | `ollama` | Local, no credentials needed | Local inference |
 
+ `env://NAME` credential placeholders resolve at provider configuration time, including array fallbacks such as `["env://OPENAI_API_KEY", "env://CODEX_API_KEY"]`. Unresolved placeholders do not auto-enable hosted providers.
+
  ## Integration with LegionIO
 
  legion-llm follows the standard core gem lifecycle:
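The `env://` fallback behavior documented above can be sketched in isolation. This is a hypothetical mini-resolver, not the gem's actual `Call::ClaudeConfigLoader.resolve_setting_reference` implementation; the method name and the blank-string handling are assumptions:

```ruby
# Hypothetical sketch of env:// placeholder resolution with array fallbacks.
# Arrays are tried in order; the first placeholder that resolves wins.
def resolve_env_placeholder(value)
  if value.is_a?(Array)
    value.each do |candidate|
      resolved = resolve_env_placeholder(candidate)
      return resolved if resolved
    end
    return nil
  end
  return value unless value.is_a?(String) && value.start_with?('env://')

  # Unset or empty variables resolve to nil, so callers can treat the
  # credential as absent instead of enabling a provider with a bad key.
  resolved = ENV.fetch(value.delete_prefix('env://'), nil)
  resolved && !resolved.empty? ? resolved : nil
end
```

A `nil` result for every candidate is what lets provider setup skip auto-enabling a hosted provider whose placeholders never resolved.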
data/legion-llm.gemspec CHANGED
@@ -35,6 +35,7 @@ Gem::Specification.new do |spec|
  spec.add_dependency 'lex-gemini'
  spec.add_dependency 'lex-knowledge'
  spec.add_dependency 'lex-openai'
+ spec.add_dependency 'pdf-reader'
  spec.add_dependency 'ruby_llm', '~> 1.13'
  spec.add_dependency 'tzinfo', '>= 2.0'
  end
data/lib/legion/llm/api/native/helpers.rb CHANGED
@@ -10,12 +10,14 @@ module Legion
  module API
  module Native
  module ClientToolMethods
+ include Legion::Logging::Helper
+
  private
 
  def log_tool(level, ref, status, **details)
  parts = ["[tool][#{ref}] #{status}"]
  details.each { |k, v| parts << "#{k}=#{v}" }
- Legion::Logging.send(level, parts.join(' '))
+ log.public_send(level, parts.join(' '))
  end
 
  def summarize_tool_arg_keys(kwargs)
@@ -37,7 +39,7 @@ module Legion
  end
  end
 
- def dispatch_client_tool(ref, **kwargs) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
+ def dispatch_client_tool(ref, **kwargs) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/MethodLength,Metrics/PerceivedComplexity
  case ref
  when 'sh'
  cmd = kwargs[:command] || kwargs[:cmd] || kwargs.values.first.to_s
@@ -45,7 +47,7 @@ module Legion
  "exit=#{status.exitstatus}\n#{output}"
  when 'file_read'
  path = kwargs[:path] || kwargs[:file_path] || kwargs.values.first.to_s
- ::File.exist?(path) ? ::File.read(path, encoding: 'utf-8') : "File not found: #{path}"
+ read_client_file(path)
  when 'file_write'
  path = kwargs[:path] || kwargs[:file_path]
  content = kwargs[:content] || kwargs[:contents]
@@ -82,6 +84,7 @@ module Legion
  max_length ? content[0, max_length] : content
  rescue LoadError => e
  missing = e.respond_to?(:path) && e.path ? e.path : 'legion/cli/chat/web_fetch'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.web_fetch', missing: missing)
  "web_fetch is unavailable: missing optional dependency #{missing}"
  end
  when 'web_search'
@@ -93,6 +96,7 @@ module Legion
  results[:results].map { |r| "### #{r[:title]}\n#{r[:url]}\n#{r[:snippet]}" }.join("\n\n")
  rescue LoadError => e
  missing = e.respond_to?(:path) && e.path ? e.path : 'legion/cli/chat/web_search'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.web_search', missing: missing)
  "web_search is unavailable: missing optional dependency #{missing}"
  end
  else
@@ -100,6 +104,51 @@ module Legion
  end
  end
 
+ def read_client_file(path)
+ return "File not found: #{path}" unless ::File.exist?(path)
+
+ return read_pdf_text(path) if pdf_file?(path)
+
+ content = ::File.binread(path)
+ return 'Binary file detected, cannot read as text.' if binary_content?(content)
+
+ content.force_encoding('UTF-8')
+ content
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.file_read', path: path)
+ "file_read error: #{e.message}"
+ end
+
+ def pdf_file?(path)
+ ::File.extname(path).casecmp('.pdf').zero? || ::File.binread(path, 5) == '%PDF-'
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_sniff', path: path)
+ false
+ end
+
+ def read_pdf_text(path)
+ require 'pdf-reader' unless defined?(::PDF::Reader)
+
+ reader = ::PDF::Reader.new(path)
+ text = reader.pages.map(&:text).join("\n\n").strip
+ text.empty? ? 'PDF contained no extractable text.' : text
+ rescue LoadError => e
+ missing = e.respond_to?(:path) && e.path ? e.path : 'pdf-reader'
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_extract', missing: missing)
+ 'PDF text extraction unavailable: missing pdf-reader gem.'
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_extract', path: path)
+ "PDF text extraction failed: #{e.message}"
+ end
+
+ def binary_content?(content)
+ return true if content.include?("\x00")
+
+ sample = content.byteslice(0, 4096).to_s
+ sample.force_encoding('UTF-8')
+ !sample.valid_encoding?
+ end
+
  def notify_tool_event(type, ref, **data)
  handler = Thread.current[:legion_tool_event_handler]
  return unless handler
@@ -257,13 +306,14 @@ module Legion
  rescue StandardError => e
  ms = begin
  ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - t0) * 1000).round(1)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.api.client_tool.duration_measurement', tool_ref: tool_ref)
  nil
  end
  log_tool(:error, tool_ref, 'failed', duration_ms: ms, error: e.message)
  notify_tool_event(:tool_error, tool_ref, error: e.message)
- Legion::Logging.log_exception(e, payload_summary: "client tool #{tool_ref} failed",
- component_type: :api)
+ handle_exception(e, level: :error, handled: true, operation: "llm.api.client_tool.#{tool_ref}")
  "Tool error: #{e.message}"
  end
  end
@@ -287,6 +337,25 @@ module Legion
  end
  end
 
+ define_method(:extract_text_content) do |content|
+ case content
+ when nil
+ ''
+ when String
+ content
+ when Array
+ content.filter_map { |entry| extract_text_content(entry) }.join
+ when Hash
+ type = content[:type] || content['type']
+ return '' unless type.nil? || type.to_s == 'text'
+
+ text = content.key?(:text) || content.key?('text') ? (content[:text] || content['text']) : (content[:content] || content['content'])
+ extract_text_content(text)
+ else
+ content.to_s
+ end
+ end
+
  define_method(:emit_sse_event) do |stream, event_name, payload|
  level = event_name == 'text-delta' ? :debug : :info
  log.send(level, "[sse][emit] event=#{event_name} keys=#{payload.is_a?(Hash) ? payload.keys.join(',') : 'n/a'}")
@@ -333,7 +402,8 @@ module Legion
 
  kerb = begin
  Legion::Settings.dig(:kerberos, :username)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.kerberos_username')
  nil
  end
  return "user:#{kerb}" if kerb.is_a?(String) && !kerb.empty?
@@ -354,14 +424,16 @@ module Legion
  define_method(:resolve_requested_by) do |rack_env, identity_string|
  hostname = begin
  Legion::Settings[:client][:hostname]
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.client_hostname')
  Socket.gethostname
  end
  username = identity_string.delete_prefix('user:')
 
  kerb = begin
  Legion::Settings.dig(:kerberos, :username)
- rescue StandardError
+ rescue StandardError => e
+ handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.requested_by_kerberos')
  nil
  end
  if kerb.is_a?(String) && !kerb.empty?
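The `extract_text_content` helper added above can be exercised on its own. This standalone mirror of that logic (renamed `flatten_content` so it is not mistaken for the gem's API) shows how an Anthropic/Bedrock-style content block array collapses to a plain string while non-text blocks such as `thinking` are dropped:

```ruby
# Standalone mirror of the content-block flattening logic from helpers.rb.
def flatten_content(content)
  case content
  when nil then ''
  when String then content
  when Array then content.filter_map { |entry| flatten_content(entry) }.join
  when Hash
    # Only text-typed (or untyped) blocks contribute to the flattened string.
    type = content[:type] || content['type']
    return '' unless type.nil? || type.to_s == 'text'

    text = content.key?(:text) || content.key?('text') ? (content[:text] || content['text']) : (content[:content] || content['content'])
    flatten_content(text)
  else content.to_s
  end
end
```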
data/lib/legion/llm/api/native/inference.rb CHANGED
@@ -150,7 +150,10 @@ module Legion
  }
 
  pipeline_response = executor.call_stream do |chunk|
- text = chunk.respond_to?(:content) ? chunk.content.to_s : chunk.to_s
+ thinking = extract_text_content(chunk.thinking) if chunk.respond_to?(:thinking)
+ emit_sse_event(out, 'thinking-delta', { delta: thinking }) unless thinking.to_s.empty?
+
+ text = extract_text_content(chunk.respond_to?(:content) ? chunk.content : chunk)
  next if text.empty?
 
  full_text << text
@@ -195,7 +198,7 @@ module Legion
  exec_ms = ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - exec_t0) * 1000).round
  log.debug("[llm][api][inference] action=executor_call duration_ms=#{exec_ms} request_id=#{request_id}")
  raw_msg = pipeline_response.message
- content = raw_msg.is_a?(Hash) ? (raw_msg[:content] || raw_msg['content']) : raw_msg.to_s
+ content = extract_text_content(raw_msg.is_a?(Hash) ? (raw_msg[:content] || raw_msg['content']) : raw_msg)
  routing = pipeline_response.routing || {}
  tokens = pipeline_response.tokens || {}
  tool_calls = extract_tool_calls(pipeline_response)
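The streaming change above routes reasoning chunks to `thinking-delta` events while only text chunks accumulate into the final reply. A rough standalone sketch of that split, with `Chunk` standing in for the ruby_llm streaming chunk object (names and event shape are illustrative, not the gem's exact SSE payloads):

```ruby
# Hypothetical stand-in for a streaming chunk carrying text and/or reasoning.
Chunk = Struct.new(:content, :thinking, keyword_init: true)

# Emit thinking as its own event; append only text to the final content.
def route_chunk(chunk, events, full_text)
  thinking = chunk.thinking.to_s
  events << ['thinking-delta', thinking] unless thinking.empty?

  text = chunk.content.to_s
  return if text.empty?

  full_text << text
  events << ['text-delta', text]
end
```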
data/lib/legion/llm/call/providers.rb CHANGED
@@ -82,12 +82,16 @@ module Legion
  usable_setting?(config[:api_key])
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.credential_available_for', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.credential_available_for', provider: provider)
  false
  end
 
  def usable_setting?(value)
- !Call::ClaudeConfigLoader.resolve_setting_reference(value).nil?
+ !resolve_credential_value(value).nil?
+ end
+
+ def resolve_credential_value(value)
+ Call::ClaudeConfigLoader.resolve_setting_reference(value)
  end
 
  def env_present?(key)
@@ -104,7 +108,7 @@ module Legion
  Socket.tcp(addr, port, connect_timeout: 1).close
  true
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.ollama_running', base_url: url)
+ handle_exception(e, level: :warn, operation: 'llm.providers.ollama_running', base_url: url)
  false
  end
 
@@ -120,7 +124,7 @@ module Legion
  end.get('/health')
  response.success?
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.vllm_running', base_url: url)
+ handle_exception(e, level: :warn, operation: 'llm.providers.vllm_running', base_url: url)
  false
  end
 
@@ -139,9 +143,15 @@ module Legion
  end
 
  def configure_bedrock(config)
- has_sigv4 = usable_setting?(config[:api_key]) && usable_setting?(config[:secret_key])
- has_bearer = Call::ClaudeConfigLoader.resolve_setting_reference(config[:bearer_token])
+ resolved_api_key = resolve_credential_value(config[:api_key])
+ resolved_secret_key = resolve_credential_value(config[:secret_key])
+ resolved_session_token = resolve_credential_value(config[:session_token])
+ has_sigv4 = resolved_api_key && resolved_secret_key
+ has_bearer = resolve_credential_value(config[:bearer_token])
  config[:bearer_token] = has_bearer if has_bearer
+ config[:api_key] = resolved_api_key if resolved_api_key
+ config[:secret_key] = resolved_secret_key if resolved_secret_key
+ config[:session_token] = resolved_session_token if resolved_session_token
 
  unless has_sigv4 || has_bearer
  broker_creds = resolve_broker_aws_credentials
@@ -176,7 +186,7 @@ module Legion
 
  def configure_anthropic(config)
  api_key = resolve_broker_credential(:anthropic) ||
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]) ||
+ resolve_credential_value(config[:api_key]) ||
  ENV.fetch('ANTHROPIC_API_KEY', nil)
  return unless api_key
 
@@ -189,7 +199,7 @@ module Legion
 
  def configure_openai(config)
  api_key = resolve_broker_credential(:openai) ||
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]) ||
+ resolve_credential_value(config[:api_key]) ||
  ENV.fetch('OPENAI_API_KEY', nil) ||
  ENV.fetch('CODEX_API_KEY', nil)
  return unless api_key
@@ -202,7 +212,9 @@ module Legion
  end
  end
 
  def configure_gemini(config)
- api_key = resolve_broker_credential(:gemini) || config[:api_key]
+ api_key = resolve_broker_credential(:gemini) ||
+ resolve_credential_value(config[:api_key]) ||
+ ENV.fetch('GEMINI_API_KEY', nil)
  return unless api_key
 
  RubyLLM.configure do |c|
@@ -214,8 +226,8 @@ module Legion
 
  def configure_azure(config)
  api_base = config[:api_base]
- api_key = resolve_broker_credential(:azure) || config[:api_key]
- auth_token = config[:auth_token]
+ api_key = resolve_broker_credential(:azure) || resolve_credential_value(config[:api_key])
+ auth_token = resolve_credential_value(config[:auth_token])
  return unless api_base && (api_key || auth_token)
 
  RubyLLM.configure do |c|
@@ -235,9 +247,10 @@ module Legion
 
  def configure_vllm(config)
  base_url = config[:base_url] || 'http://localhost:8000/v1'
+ api_key = resolve_credential_value(config[:api_key])
  RubyLLM.configure do |c|
  c.vllm_api_base = base_url
- c.vllm_api_key = config[:api_key] if config[:api_key]
+ c.vllm_api_key = api_key if api_key
  end
  log.info "[llm][providers] configured vllm base_url=#{base_url.inspect}"
  end
@@ -294,22 +307,22 @@ module Legion
  case provider
  when :bedrock
  candidates = []
- resolved_bearer = Call::ClaudeConfigLoader.resolve_setting_reference(config[:bearer_token])
+ resolved_bearer = resolve_credential_value(config[:bearer_token])
  bearer_env = ENV.fetch('AWS_BEARER_TOKEN_BEDROCK', nil)
  claude_bearer = Call::ClaudeConfigLoader.bedrock_bearer_token
  candidates += [resolved_bearer, bearer_env, claude_bearer].compact.uniq.map { |t| { bearer_token: t } }
- api_key = Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key])
- secret = Call::ClaudeConfigLoader.resolve_setting_reference(config[:secret_key])
+ api_key = resolve_credential_value(config[:api_key])
+ secret = resolve_credential_value(config[:secret_key])
  candidates << { api_key: api_key, secret_key: secret } if api_key && secret
  candidates
  when :anthropic
  [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('ANTHROPIC_API_KEY', nil)
  ].compact.uniq.map { |k| { api_key: k } }
  when :openai
  keys = [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('OPENAI_API_KEY', nil),
  ENV.fetch('CODEX_API_KEY', nil),
  Call::CodexConfigLoader.read_token
@@ -317,14 +330,14 @@ module Legion
  keys.map { |k| { api_key: k } }
  when :gemini
  [
- Call::ClaudeConfigLoader.resolve_setting_reference(config[:api_key]),
+ resolve_credential_value(config[:api_key]),
  ENV.fetch('GEMINI_API_KEY', nil)
  ].compact.uniq.map { |k| { api_key: k } }
  else
  []
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.collect_credential_candidates', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.collect_credential_candidates', provider: provider)
  []
  end
 
@@ -383,7 +396,7 @@ module Legion
  end
  rescue StandardError => e
  log.warn "[llm][providers] health_check failed provider=#{provider} error=#{e.class}"
- handle_exception(e, level: :debug, operation: 'llm.providers.attempt_provider_call', provider: provider, model: model)
+ handle_exception(e, level: :warn, operation: 'llm.providers.attempt_provider_call', provider: provider, model: model)
  false
  end
 
@@ -398,19 +411,25 @@ module Legion
  return :ok if model_ids.any? { |id| id.include?(target_model) || target_model.include?(id) }
 
  :model_missing
- rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError
+ rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_model_list.auth', provider: provider)
  :auth_error
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.probe_via_model_list', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.probe_via_model_list', provider: provider)
  probe_via_chat(provider, target_model)
  end
 
  def probe_via_chat(provider, model)
  RubyLLM.chat(model: model, provider: provider).ask('Respond with only the word: pong')
  :ok
- rescue RubyLLM::ModelNotFoundError
+ rescue RubyLLM::ModelNotFoundError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_chat.model_missing', provider: provider, model: model)
  :model_missing
- rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError
+ rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError => e
+ handle_exception(e, level: :warn, handled: true,
+ operation: 'llm.providers.probe_via_chat.auth', provider: provider, model: model)
  :auth_error
  end
 
@@ -433,7 +452,7 @@ module Legion
  ok = attempt_provider_call(:openai, openai_config[:default_model])
  openai_config[:enabled] = false unless ok
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.recover_openai_with_codex')
+ handle_exception(e, level: :warn, operation: 'llm.providers.recover_openai_with_codex')
  end
 
  def auto_register_providers
@@ -485,7 +504,7 @@ module Legion
 
  Legion::Identity::Broker.token_for(provider_name)
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: "llm.providers.broker_resolve.#{provider_name}")
+ handle_exception(e, level: :warn, operation: "llm.providers.broker_resolve.#{provider_name}")
  nil
  end
 
@@ -497,7 +516,7 @@ module Legion
 
  nil
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.broker_resolve.aws')
+ handle_exception(e, level: :warn, operation: 'llm.providers.broker_resolve.aws')
  nil
  end
 
@@ -512,7 +531,7 @@ module Legion
  !Legion::Identity::Broker.token_for(provider).nil?
  end
  rescue StandardError => e
- handle_exception(e, level: :debug, operation: 'llm.providers.broker_credential_available', provider: provider)
+ handle_exception(e, level: :warn, operation: 'llm.providers.broker_credential_available', provider: provider)
  false
  end
 
data/lib/legion/llm/call/structured_output.rb CHANGED
@@ -15,7 +15,7 @@ module Legion
  result = call_with_schema(messages, schema, model, provider: provider, **)
  log.info "[llm][structured_output] model=#{model} provider=#{provider} valid=true"
 
- content = result.respond_to?(:content) ? result.content : result[:content]
+ content = strip_markdown_fences(result.respond_to?(:content) ? result.content : result[:content])
  raw_model = result.respond_to?(:model_id) ? result.model_id : result[:model]
 
  parsed = Legion::JSON.load(content)
@@ -52,7 +52,7 @@ module Legion
  if retry_enabled? && attempt < max_retries
  retry_with_instruction(messages, schema, model, provider: provider, attempt: attempt + 1, **opts)
  else
- raw = result.respond_to?(:content) ? result&.content : result&.dig(:content)
+ raw = strip_markdown_fences(result.respond_to?(:content) ? result&.content : result&.dig(:content))
  { data: nil, error: "JSON parse failed: #{error.message}", raw: raw, valid: false }
  end
  end
@@ -64,7 +64,7 @@ module Legion
  model: model, provider: provider, intent: nil, tier: nil,
  message: user_content, **opts.except(:attempt))
 
- retry_content = result.respond_to?(:content) ? result.content : result[:content]
+ retry_content = strip_markdown_fences(result.respond_to?(:content) ? result.content : result[:content])
  retry_model = result.respond_to?(:model_id) ? result.model_id : result[:model]
 
  parsed = Legion::JSON.load(retry_content)
@@ -83,6 +83,18 @@ module Legion
  parts.join("\n\n")
  end
 
+ def strip_markdown_fences(text)
+ return text unless text.is_a?(String)
+
+ stripped = text.strip
+ return stripped unless stripped.start_with?('```')
+
+ stripped
+ .sub(/\A`{3,}[[:space:]]*(?:json)?[[:space:]]*\n?/i, '')
+ .sub(/\n?[[:space:]]*`{3,}\z/, '')
+ .strip
+ end
+
  def supports_response_format?(model)
  SCHEMA_CAPABLE_MODELS.any? { |m| model.to_s.include?(m) }
  end
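The `strip_markdown_fences` helper added above is self-contained and easy to verify in isolation; the copy below is extracted verbatim from the hunk so fenced JSON replies parse cleanly:

```ruby
# Verbatim copy of the fence-stripping helper from structured_output.rb.
# Leading fences (with an optional json language tag) and trailing fences
# are removed; unfenced input is only whitespace-stripped.
def strip_markdown_fences(text)
  return text unless text.is_a?(String)

  stripped = text.strip
  return stripped unless stripped.start_with?('```')

  stripped
    .sub(/\A`{3,}[[:space:]]*(?:json)?[[:space:]]*\n?/i, '')
    .sub(/\n?[[:space:]]*`{3,}\z/, '')
    .strip
end
```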
data/lib/legion/llm/inference/executor.rb CHANGED
@@ -102,30 +102,15 @@ module Legion
 
  injected_names = []
 
- # Always-loaded tools — inject all unconditionally
- ::Legion::Tools::Registry.tools.each do |tool_class|
- adapter = ToolAdapter.new(tool_class)
- @injected_tool_map[adapter.name] = tool_class
- session.with_tool(adapter)
- injected_names << adapter.name
- rescue StandardError => e
- @warnings << "Failed to inject always tool: #{e.message}"
- handle_exception(e, level: :warn, operation: 'llm.pipeline.inject_always_tool')
- end
+ always_tools = Array(::Legion::Tools::Registry.tools)
+ triggered_tools = @triggered_tools.any? ? Array(@triggered_tools) : []
+ inject_limit = registry_tool_limit
+ prioritized_tools = local_provider? ? triggered_tools + always_tools : always_tools + triggered_tools
 
- # Trigger-matched tools — inject tools surfaced by trigger word matching
- if @triggered_tools.any?
- @triggered_tools.each do |tool_class|
- adapter = ToolAdapter.new(tool_class)
- next if injected_names.include?(adapter.name)
-
- @injected_tool_map[adapter.name] = tool_class
- session.with_tool(adapter)
- injected_names << adapter.name
- rescue StandardError => e
- @warnings << "Failed to inject triggered tool: #{e.message}"
- handle_exception(e, level: :warn, operation: 'llm.pipeline.inject_triggered_tool')
- end
+ prioritized_tools.each do |tool_class|
+ break if inject_limit && injected_names.size >= inject_limit
+
+ inject_tool_class(session, tool_class, injected_names, operation: 'llm.pipeline.inject_registry_tool')
  end
 
  # Requested deferred tools — inject only if explicitly requested
@@ -133,15 +118,9 @@ module Legion
  requested = requested_deferred_tool_names
  if requested.any?
  deferred.each do |tool_class|
- adapter = ToolAdapter.new(tool_class)
- next unless requested.include?(adapter.name)
-
- @injected_tool_map[adapter.name] = tool_class
- session.with_tool(adapter)
- injected_names << adapter.name
- rescue StandardError => e
- @warnings << "Failed to inject deferred tool: #{e.message}"
- handle_exception(e, level: :warn, operation: 'llm.pipeline.inject_deferred_tool')
+ inject_tool_class(session, tool_class, injected_names, operation: 'llm.pipeline.inject_deferred_tool') do |adapter|
+ requested.include?(adapter.name)
+ end
  end
  end
 
@@ -149,6 +128,7 @@ module Legion
  "[llm][tools] inject request_id=#{@request.id} " \
  "always=#{::Legion::Tools::Registry.tools.size} " \
  "triggered=#{@triggered_tools.size} " \
+ "limit=#{inject_limit || 'none'} " \
  "deferred_available=#{deferred.size} " \
  "requested_deferred=#{requested.size} " \
  "injected=#{injected_names.size} names=#{injected_names.first(25).join(',')}"
@@ -158,6 +138,31 @@ module Legion
  handle_exception(e, level: :warn, operation: 'llm.pipeline.inject_tools')
  end
 
+ def inject_tool_class(session, tool_class, injected_names, operation:)
+ adapter = ToolAdapter.new(tool_class)
+ return if injected_names.include?(adapter.name)
+ return if block_given? && !yield(adapter)
+
+ @injected_tool_map[adapter.name] = tool_class
+ session.with_tool(adapter)
+ injected_names << adapter.name
+ rescue StandardError => e
+ @warnings << "Failed to inject tool: #{e.message}"
+ handle_exception(e, level: :warn, operation: operation)
+ end
+
+ def registry_tool_limit
+ return nil unless local_provider?
+
+ raw_limit = (Legion::LLM.settings[:tool_trigger] || {})[:local_tool_limit]
+ limit = raw_limit.to_i
+ limit.positive? ? limit : nil
+ end
+
+ def local_provider?
+ %i[ollama vllm].include?(@resolved_provider&.to_sym)
+ end
+
  # Backwards compatibility alias
  alias inject_discovered_tools inject_registry_tools
 
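The prioritization and cap behavior above reduces to a small ordering function. A sketch with hypothetical names (`tools_to_inject` is not part of the gem): for local providers, trigger-matched tools come first and the deduplicated list is truncated to the limit; hosted providers keep the always-loaded tools first with no cap.

```ruby
# Hypothetical reduction of the local-provider tool cap: order, dedupe, truncate.
def tools_to_inject(always:, triggered:, local:, limit:)
  ordered = local ? triggered + always : always + triggered
  deduped = ordered.uniq
  local && limit ? deduped.first(limit) : deduped
end
```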
data/lib/legion/llm/router.rb CHANGED
@@ -347,7 +347,7 @@ module Legion
  end
 
  def chain_from_defaults(model, provider, max)
- if provider || model
+ if provider || model || default_settings_provider || default_settings_model
  p = (provider || default_settings_provider)&.to_sym
  res = Resolution.new(tier: PROVIDER_TIER.fetch(p, :frontier),
  provider: p || :anthropic,
data/lib/legion/llm/settings.rb CHANGED
@@ -284,8 +284,9 @@ module Legion
 
  def self.tool_trigger_defaults
  {
- scan_depth: 10,
- tool_limit: 25
+ scan_depth: 10,
+ tool_limit: 25,
+ local_tool_limit: 10
  }
  end