RubyGems - legion-llm - Versions diffs - 0.9.36 → 0.9.51 - Mend

legion-llm 0.9.36 → 0.9.51

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

checksums.yaml +4 -4
data/CHANGELOG.md +80 -0
data/legion-llm.gemspec +1 -0
data/lib/legion/llm/api/native/helpers.rb +50 -25
data/lib/legion/llm/api/native/inference.rb +48 -26
data/lib/legion/llm/api/openai/chat_completions.rb +15 -1
data/lib/legion/llm/api/openai/responses.rb +49 -5
data/lib/legion/llm/api/translators/openai_response.rb +45 -0
data/lib/legion/llm/call/dispatch.rb +4 -0
data/lib/legion/llm/call/lex_llm_adapter.rb +260 -1
data/lib/legion/llm/call/providers.rb +1 -1
data/lib/legion/llm/inference/executor.rb +127 -9
data/lib/legion/llm/inference/native_tool_loop.rb +18 -2
data/lib/legion/llm/inference/route_attempts.rb +35 -0
data/lib/legion/llm/settings.rb +16 -4
data/lib/legion/llm/tools/special.rb +9 -0
data/lib/legion/llm/version.rb +1 -1
metadata +15 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 93611da95712602a9f99e00c4b34523c23838a99d34c3c441ea6bef642231e3f
-  data.tar.gz: 6ced6ad0b6091c5a3d53702b867eea5f04d35199892338023aebb6bb452ed867
+  metadata.gz: e0cae7c608acb8fe3f09852e7833639c08998bb944b2df026d9c1999c03ff017
+  data.tar.gz: bbea922035cf6f38eb43139ea5d33deaca70cbbea8693edccb3811bdc5f43608
 SHA512:
-  metadata.gz: aa99ed858c6bef1fc214a45d4d59e51f1e9f0262f75dcdbd0f60645d59296edf6fa57e47dfa706dd0b06ec7c7f6dbf572f3832235d0d7125cd9992ec65aa6eee
-  data.tar.gz: dfe7e2db5cf883de39a5ac47438408a858372a52dd82230baa4a624e33e17b0558eb50359237345afa5b8a1df432b164149c3fce540304ac56ffbad888110c33
+  metadata.gz: cc620102bcfdbd73387ba3da2e31e80e4fd9c9b9fd3ceeb85b00417972deeda55bbc427702df3a86ec7a2d3f07be34f99383bf9141b79679e394e66c45eda7c1
+  data.tar.gz: 4f5b8e4739873d147be2ddfed81c6a04297a016f9c7c4c143b6ad0409f61a9f75c39a14c6de4b30f37119b4e5aa577e4faba813917da0a9c771c016500e079a2

data/CHANGELOG.md CHANGED

@@ -1,5 +1,85 @@
 # Legion LLM Changelog
+## [0.9.51] - 2026-05-23
+### Changed
+- Settings: `tool_trigger.client_tool_passthrough` now defaults to `false`; callers must opt in with request metadata or a settings override before non-executable client tools are passed through to providers.
+## [0.9.50] - 2026-05-23
+### Fixed
+- API: native `/api/llm/inference` now debug-logs the exact outward response payload for sync JSON responses and streaming `done` events, making client passthrough tool-call shape visible in runtime logs.
+## [0.9.49] - 2026-05-23
+### Fixed
+- API: native `/api/llm/inference` client tool requests now use the OpenAI Chat Completions `tool_calls` shape with `type: "function"` and `function.name` / JSON-string `function.arguments`, aligning native sync responses and streaming `done.tool_calls` with the OpenAI-compatible endpoints.
+## [0.9.48] - 2026-05-23
+### Fixed
+- API: OpenAI-compatible streaming now emits tool callbacks for both Chat Completions and Responses: `/v1/chat/completions` streams `delta.tool_calls` and finishes with `finish_reason: "tool_calls"`, while `/v1/responses` streams `function_call` output item events and includes them in `response.completed.output`.
+## [0.9.47] - 2026-05-22
+### Fixed
+- API: streaming client passthrough tool calls now stay only on the terminal `done.tool_calls` payload instead of emitting a live `tool-call` event that makes clients wait for an impossible same-stream `tool-result`.
+## [0.9.46] - 2026-05-22
+### Fixed
+- API: returned client passthrough tool calls now keep the existing streaming `tool-call` event name while carrying `clientPassthrough` and `requiresToolResult` metadata, preserving current client execution behavior.
+## [0.9.45] - 2026-05-22
+### Fixed
+- Tools: vLLM explicit tool-choice matching now uses tool-name boundaries, so paths like `/rubymine/...` no longer force the `ruby` tool when the user asked for `git`.
+## [0.9.44] - 2026-05-22
+### Fixed
+- API: streaming client passthrough tool calls now emit `client-tool-call` with explicit client execution metadata instead of a server-execution `tool-call`, preventing clients from waiting for an impossible same-stream server tool result.
+- Settings: default client passthrough blacklist now blocks computer-use session/control tools plus Aithena and cron plugin tools so provider calls do not hand off internal UI/plugin controls as executable client tools.
+## [0.9.43] - 2026-05-22
+### Fixed
+- Tools: responses that end on client passthrough after prior server-side tool execution now return the current passthrough tool call instead of replaying the earlier executed tool from pending history.
+## [0.9.42] - 2026-05-22
+### Fixed
+- Tools: native tool-loop follow-up provider calls now include a continuation instruction that tells models to make another available tool call instead of narrating intent after a failed or incomplete tool result.
+- API: streamed native tool results now include status and emit `tool-error` when the server-side tool execution failed.
+## [0.9.41] - 2026-05-22
+### Fixed
+- API: streaming inference `done` events now include `stop_reason` and `requires_tool_result` when client passthrough tool calls must be executed and submitted back by the caller, matching the sync response contract and preventing wrappers from treating tool-use turns as final completions.
+## [0.9.40] - 2026-05-22
+### Added
+- Settings: client tool passthrough blacklist now blocks Legion launcher-style client tools such as `legion`, `legionio`, `legionio do`, and `legionio/legion` by default, including sanitized tool-name variants.
+- Tools: client `python3` and `pip3` passthrough definitions deduplicate against Legion's native `python` and `pip` special tools when the managed Python runtime is injected.
+## [0.9.39] - 2026-05-22
+### Added
+- Settings: `client_tool_passthrough_whitelist` and `client_tool_passthrough_blacklist` now filter non-executable client tools before native provider dispatch; `client_tool_passthrough` defaults to enabled, explicit API true/false overrides are preserved, the default blacklist blocks `sudo`, `visudo`, and `su`, and the default whitelist is empty.
+## [0.9.38] - 2026-05-22
+### Fixed
+- API: OpenAI Responses upstream dispatch now preserves Responses `input_text` / `output_text` content parts instead of stringifying them through chat-message normalization.
+- Providers: native `:responses` dispatch is gated to providers or instances that explicitly support the Responses API, preventing non-Responses providers from receiving `/v1/responses` traffic just because the adapter has a helper method.
+- Packaging: `event_stream_parser` is now a direct runtime dependency because `legion-llm` requires it for Responses SSE parsing.
+## [0.9.37] - 2026-05-22
+### Changed
+- API: OpenAI Responses requests now dispatch to upstream `/v1/responses` through a native `:responses` provider capability instead of adapting Responses input through Chat Completions `stream_chat`, preserving upstream Responses streaming usage from `response.completed.response.usage`
 ## [0.9.36] - 2026-05-22
 ### Fixed
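
Note on the 0.9.45 entry: the matching code itself is not part of this diff, so the following is only a minimal sketch of what "tool-name boundaries" could mean. The helper name `mentioned_tool` and the exact boundary character class are assumptions made for illustration, not the gem's implementation.

```ruby
# Hypothetical helper, not taken from legion-llm.
# Returns the first tool name that appears in the text as a standalone token,
# so a path segment such as "/rubymine/..." does not count as a mention of "ruby".
def mentioned_tool(text, tool_names)
  tool_names.find do |name|
    # The name must not be glued to word characters, slashes, or hyphens on either side.
    pattern = /(?<![\w\/-])#{Regexp.escape(name)}(?![\w\/-])/i
    text.match?(pattern)
  end
end

prompt = 'use git to check /rubymine/project status'
puts mentioned_tool(prompt, %w[ruby git])
```

A plain substring check would pick `ruby` here because it occurs inside `/rubymine/`; the lookbehind and lookahead reject that occurrence and the search falls through to `git`.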

data/legion-llm.gemspec CHANGED

@@ -26,6 +26,7 @@ Gem::Specification.new do |spec|
   }
   spec.add_dependency 'concurrent-ruby'
+  spec.add_dependency 'event_stream_parser', '~> 1'
   spec.add_dependency 'faraday'
   spec.add_dependency 'legion-cache', '>= 1.4.2'
   spec.add_dependency 'legion-json', '>= 1.2.0'

data/lib/legion/llm/api/native/helpers.rb CHANGED

@@ -5,6 +5,7 @@ require 'open3'
 require 'time'
 require 'legion/cache/helper'
 require 'legion/logging/helper'
+require 'legion/llm/api/translators/openai_response'
 require 'legion/llm/publisher_identity'
 require 'legion/llm/types'
@@ -302,7 +303,7 @@ module Legion
                   name:        tname,
                   description: tdesc,
                   parameters:  tschema || {},
-                  source:      { type: :client, executable: false }
+                  source:      { type: :client, executable: false, raw_name: tname }
                 )
               rescue StandardError => e
                 handle_exception(e, level: :warn, handled: true, operation: "llm.api.build_client_tool_class.#{tname}")
@@ -310,16 +311,7 @@ module Legion
               end
               define_method(:extract_tool_calls) do |pipeline_response|
-                tools_data = pipeline_response.tools
-                return [] unless tools_data.is_a?(Array) && !tools_data.empty?
-                tools_data.map do |tc|
-                  {
-                    id:        tc.respond_to?(:id) ? tc.id : (tc[:id] || tc['id']),
-                    name:      tc.respond_to?(:name) ? tc.name : (tc[:name] || tc['name'] || tc.to_s),
-                    arguments: tc.respond_to?(:arguments) ? tc.arguments : (tc[:arguments] || tc['arguments'] || {})
-                  }
-                end
+                Legion::LLM::API::Translators::OpenAIResponse.build_tool_calls(pipeline_response)
               end
               define_method(:extract_text_content) do |content|
@@ -347,7 +339,46 @@ module Legion
                 stream << "event: #{event_name}\ndata: #{Legion::JSON.dump(payload)}\n\n"
               end
-              define_method(:emit_response_tool_call_events) do |stream, pipeline_response|
+              define_method(:log_native_inference_response) do |request_id:, conversation_id:, stream:, kind:, payload:|
+                log.debug(
+                  "[llm][api][inference] action=response_payload request_id=#{request_id || 'unknown'} " \
+                  "conversation_id=#{conversation_id || 'none'} stream=#{stream} kind=#{kind} " \
+                  "payload=#{Legion::JSON.dump(payload)}"
+                )
+              rescue StandardError => e
+                handle_exception(e, level: :debug, handled: true,
+                                    operation: 'llm.api.inference.response_payload_log',
+                                    request_id: request_id)
+              end
+              define_method(:returned_client_tool_call_payload) do |tool_call, tool_call_id, tool_name|
+                {
+                  toolCallId:         tool_call_id,
+                  toolName:           tool_name,
+                  args:               openai_tool_call_arguments(tool_call),
+                  clientPassthrough:  true,
+                  requiresToolResult: true,
+                  status:             'requires_client_execution',
+                  timestamp:          Time.now.utc.iso8601
+                }
+              end
+              define_method(:openai_tool_call_name) do |tool_call|
+                fn = tool_call[:function] || tool_call['function'] || {}
+                fn[:name] || fn['name'] || tool_call[:name] || tool_call['name']
+              end
+              define_method(:openai_tool_call_arguments) do |tool_call|
+                fn = tool_call[:function] || tool_call['function'] || {}
+                raw_args = fn[:arguments] || fn['arguments'] || tool_call[:arguments] || tool_call['arguments'] || {}
+                return raw_args unless raw_args.is_a?(String)
+                Legion::JSON.parse(raw_args, symbolize_names: true)
+              rescue StandardError
+                raw_args
+              end
+              define_method(:emit_response_tool_call_events) do |_stream, pipeline_response|
                 tool_calls = extract_tool_calls(pipeline_response)
                 return if tool_calls.empty?
@@ -359,7 +390,7 @@ module Legion
                   data[:tool_call_id] || data['tool_call_id']
                 end
-                emitted = 0
+                done_only = 0
                 skipped_timeline = 0
                 request_id = pipeline_response.respond_to?(:request_id) ? pipeline_response.request_id : 'unknown'
                 conversation_id = pipeline_response.respond_to?(:conversation_id) ? pipeline_response.conversation_id : 'none'
@@ -371,28 +402,22 @@ module Legion
                     next
                   end
-                  tool_name = tool_call[:name] || tool_call['name']
+                  tool_name = openai_tool_call_name(tool_call)
                   next if tool_name.to_s.empty?
                   log.info(
-                    "[llm][api][tools] action=returned_tool_call_sse request_id=#{request_id || 'unknown'} " \
+                    "[llm][api][tools] action=returned_tool_call_done_only request_id=#{request_id || 'unknown'} " \
                     "conversation_id=#{conversation_id || 'none'} tool_call_id=#{tool_call_id || 'none'} name=#{tool_name} " \
-                    "args_class=#{(tool_call[:arguments] || tool_call['arguments'] || {}).class}"
+                    "args_class=#{openai_tool_call_arguments(tool_call).class}"
                   )
-                  emit_sse_event(stream, 'tool-call', {
-                                   toolCallId: tool_call_id,
-                                   toolName:   tool_name,
-                                   args:       tool_call[:arguments] || tool_call['arguments'] || {},
-                                   timestamp:  Time.now.utc.iso8601
-                                 })
-                  emitted += 1
+                  done_only += 1
                 end
-                names = tool_calls.map { |tool_call| tool_call[:name] || tool_call['name'] }.compact
+                names = tool_calls.map { |tool_call| openai_tool_call_name(tool_call) }.compact
                 names = names.first(30).join(',') + (names.size > 30 ? ",+#{names.size - 30}more" : '')
                 log.info(
                   "[llm][api][tools] action=returned_tool_calls_complete request_id=#{request_id || 'unknown'} " \
-                  "conversation_id=#{conversation_id || 'none'} total=#{tool_calls.size} emitted=#{emitted} " \
+                  "conversation_id=#{conversation_id || 'none'} total=#{tool_calls.size} done_only=#{done_only} " \
                   "skipped_timeline=#{skipped_timeline} names=#{names.empty? ? 'none' : names}"
                 )
               end
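
Note on `openai_tool_call_name` / `openai_tool_call_arguments`: the helpers accept both the flat `{ name:, arguments: }` shape and the OpenAI `{ function: { name:, arguments: } }` shape, and fall back to the raw string when the arguments are not valid JSON. A standalone sketch of that behavior follows, assuming the stdlib `JSON` module in place of `Legion::JSON` and plain methods (`tool_call_name`, `tool_call_arguments`, both illustrative names) in place of `define_method`.

```ruby
require 'json'

# Illustrative stand-ins for the helpers in the hunk above.
def tool_call_name(tool_call)
  fn = tool_call[:function] || tool_call['function'] || {}
  fn[:name] || fn['name'] || tool_call[:name] || tool_call['name']
end

def tool_call_arguments(tool_call)
  fn = tool_call[:function] || tool_call['function'] || {}
  raw = fn[:arguments] || fn['arguments'] || tool_call[:arguments] || tool_call['arguments'] || {}
  # Hashes pass through untouched; only JSON strings are decoded.
  return raw unless raw.is_a?(String)

  JSON.parse(raw, symbolize_names: true)
rescue JSON::ParserError
  # Malformed JSON is returned as-is rather than raising.
  raw
end

openai_shape = { id: 'call_1', type: 'function', function: { name: 'git', arguments: '{"cmd":"status"}' } }
flat_shape   = { 'name' => 'ls', 'arguments' => { 'path' => '.' } }

p tool_call_name(openai_shape), tool_call_arguments(openai_shape)
p tool_call_name(flat_shape), tool_call_arguments(flat_shape)
```

The original rescues `StandardError`; the sketch narrows this to `JSON::ParserError` only because it uses the stdlib parser directly.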

data/lib/legion/llm/api/native/inference.rb CHANGED

@@ -28,7 +28,7 @@ module Legion
               conversation_id = body[:conversation_id]
               request_id      = body[:request_id] || SecureRandom.uuid
               include_thinking = body[:include_thinking] == true
-              client_tool_passthrough = body[:client_tool_passthrough] == true
+              client_tool_passthrough = body[:client_tool_passthrough] if [true, false].include?(body[:client_tool_passthrough])
               unless messages.is_a?(Array)
                 halt 400, { 'Content-Type' => 'application/json' },
@@ -105,7 +105,7 @@ module Legion
               extra = {}
               extra[:tier] = tier.to_sym if tier
               metadata = { requested_tools: requested_tools }
-              metadata[:client_tool_passthrough] = true if client_tool_passthrough
+              metadata[:client_tool_passthrough] = client_tool_passthrough unless client_tool_passthrough.nil?
               metadata[:client_tool_request_count] = tools.size if tools.any?
               pipeline_request = Legion::LLM::Inference::Request.build(
@@ -148,10 +148,12 @@ module Legion
                                        timestamp:  Time.now.utc.iso8601
                                      })
                     when :tool_result
-                      emit_sse_event(out, 'tool-result', {
+                      event_name = event[:status].to_s == 'error' ? 'tool-error' : 'tool-result'
+                      emit_sse_event(out, event_name, {
                                        toolCallId: event[:tool_call_id],
                                        toolName:   event[:tool_name],
                                        result:     event[:result],
+                                       status:     event[:status],
                                        timestamp:  Time.now.utc.iso8601
                                      })
                     when :tool_error
@@ -184,20 +186,31 @@ module Legion
                   routing = pipeline_response.routing || {}
                   tokens = pipeline_response.tokens || {}
+                  tool_calls = extract_tool_calls(pipeline_response)
+                  stop_reason = pipeline_response.stop&.dig(:reason)&.to_s
                   done_payload = {
-                    request_id:      request_id,
-                    content:         full_text,
-                    model:           (routing[:model] || routing['model']).to_s,
-                    provider:        (routing[:provider] || routing['provider'])&.to_s,
-                    instance:        (routing[:instance] || routing['instance'])&.to_s,
-                    tier:            (routing[:tier] || routing['tier'])&.to_s,
-                    input_tokens:    token_value(tokens, :input),
-                    output_tokens:   token_value(tokens, :output),
-                    tool_calls:      extract_tool_calls(pipeline_response),
-                    conversation_id: pipeline_response.conversation_id,
-                    metrics:         build_response_metrics(pipeline_response)
+                    request_id:           request_id,
+                    content:              full_text,
+                    model:                (routing[:model] || routing['model']).to_s,
+                    provider:             (routing[:provider] || routing['provider'])&.to_s,
+                    instance:             (routing[:instance] || routing['instance'])&.to_s,
+                    tier:                 (routing[:tier] || routing['tier'])&.to_s,
+                    input_tokens:         token_value(tokens, :input),
+                    output_tokens:        token_value(tokens, :output),
+                    tool_calls:           tool_calls,
+                    stop_reason:          stop_reason,
+                    requires_tool_result: stop_reason == 'tool_use' && tool_calls.any?,
+                    conversation_id:      pipeline_response.conversation_id,
+                    metrics:              build_response_metrics(pipeline_response)
                   }.compact
                   done_payload[:thinking] = pipeline_response.thinking if include_thinking && pipeline_response.thinking
+                  log_native_inference_response(
+                    request_id:      request_id,
+                    conversation_id: pipeline_response.conversation_id || conversation_id,
+                    stream:          true,
+                    kind:            'sse_done',
+                    payload:         done_payload
+                  )
                   emit_sse_event(out, 'done', {
                                    **done_payload
                                  })
@@ -227,6 +240,7 @@ module Legion
                 routing = pipeline_response.routing || {}
                 tokens = pipeline_response.tokens || {}
                 tool_calls = extract_tool_calls(pipeline_response)
+                stop_reason = pipeline_response.stop&.dig(:reason)&.to_s
                 log.info(
                   "[llm][api][inference] action=completed request_id=#{request_id} " \
@@ -240,21 +254,29 @@ module Legion
                 )
                 payload = {
-                  request_id:      request_id,
-                  content:         content,
-                  tool_calls:      tool_calls,
-                  stop_reason:     pipeline_response.stop&.dig(:reason)&.to_s,
-                  model:           (routing[:model] || routing['model']).to_s,
-                  provider:        (routing[:provider] || routing['provider'])&.to_s,
-                  instance:        (routing[:instance] || routing['instance'])&.to_s,
-                  tier:            (routing[:tier] || routing['tier'])&.to_s,
-                  input_tokens:    token_value(tokens, :input),
-                  output_tokens:   token_value(tokens, :output),
-                  conversation_id: pipeline_response.conversation_id,
-                  metrics:         build_response_metrics(pipeline_response)
+                  request_id:           request_id,
+                  content:              content,
+                  tool_calls:           tool_calls,
+                  stop_reason:          stop_reason,
+                  requires_tool_result: stop_reason == 'tool_use' && tool_calls.any?,
+                  model:                (routing[:model] || routing['model']).to_s,
+                  provider:             (routing[:provider] || routing['provider'])&.to_s,
+                  instance:             (routing[:instance] || routing['instance'])&.to_s,
+                  tier:                 (routing[:tier] || routing['tier'])&.to_s,
+                  input_tokens:         token_value(tokens, :input),
+                  output_tokens:        token_value(tokens, :output),
+                  conversation_id:      pipeline_response.conversation_id,
+                  metrics:              build_response_metrics(pipeline_response)
                 }
                 payload[:thinking] = pipeline_response.thinking if include_thinking && pipeline_response.thinking
                 payload.compact!
+                log_native_inference_response(
+                  request_id:      request_id,
+                  conversation_id: pipeline_response.conversation_id || conversation_id,
+                  stream:          false,
+                  kind:            'json_response',
+                  payload:         { data: payload }
+                )
                 json_response(payload, status_code: 200)
               end
             rescue Legion::LLM::AuthError => e
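
Note on the request parsing change: `client_tool_passthrough` moves from a boolean (`== true`) to a tri-state value, so an explicit `false` from the caller is forwarded in the metadata and only a missing or non-boolean value leaves the key unset. The sketch below isolates that logic together with the `requires_tool_result` rule; `passthrough_metadata` and `requires_tool_result?` are hypothetical names used here for illustration.

```ruby
# Hypothetical helpers mirroring the expressions in the hunks above.
def passthrough_metadata(body)
  value = body[:client_tool_passthrough]
  # Only literal true/false count; anything else leaves the flag as nil.
  flag = value if [true, false].include?(value)

  metadata = {}
  metadata[:client_tool_passthrough] = flag unless flag.nil?
  metadata
end

def requires_tool_result?(stop_reason, tool_calls)
  stop_reason == 'tool_use' && tool_calls.any?
end

p passthrough_metadata({ client_tool_passthrough: false })
p passthrough_metadata({})
p requires_tool_result?('tool_use', [{ id: 'call_1' }])
```

Leaving the key absent matters because the changelog says the settings default (now `false` as of 0.9.51) applies unless the caller opts in through request metadata or a settings override.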

data/lib/legion/llm/api/openai/chat_completions.rb CHANGED

@@ -67,8 +67,22 @@ module Legion
                   routing = pipeline_response.routing || {}
                   final_model = (routing[:model] || routing['model'] || model).to_s
+                  tool_calls = Legion::LLM::API::Translators::OpenAIResponse.build_tool_calls(pipeline_response)
+                  tool_calls.each_with_index do |tool_call, index|
+                    out << "data: #{Legion::JSON.dump(Legion::LLM::API::Translators::OpenAIResponse.format_stream_tool_call_chunk(
+                                                        tool_call,
+                                                        model:      final_model,
+                                                        request_id: request_id,
+                                                        index:      index
+                                                      ))}\n\n"
+                  end
                   done_chunk = Legion::LLM::API::Translators::OpenAIResponse.format_stream_chunk(
-                    nil, model: final_model, request_id: request_id, finish_reason: 'stop'
+                    nil,
+                    model:         final_model,
+                    request_id:    request_id,
+                    finish_reason: tool_calls.empty? ? 'stop' : 'tool_calls'
                   )
                   out << "data: #{Legion::JSON.dump(done_chunk)}\n\n"
                   out << "data: [DONE]\n\n"
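
Note on consuming the new stream: with this change a client of `/v1/chat/completions` sees one `delta.tool_calls` chunk per tool call, followed by a final chunk whose `finish_reason` is `"tool_calls"`. A minimal client-side sketch is shown below, assuming the SSE `data:` lines have already been parsed into hashes with string keys; `collect_tool_calls` is an illustrative name and is not part of the gem.

```ruby
# Hypothetical client-side collector for parsed `chat.completion.chunk` hashes.
def collect_tool_calls(chunks)
  calls = {}
  finish_reason = nil

  chunks.each do |chunk|
    choice = chunk.fetch('choices', []).first || {}
    finish_reason = choice['finish_reason'] || finish_reason

    Array(choice.dig('delta', 'tool_calls')).each do |tc|
      # Fragments are keyed by index, so split arguments would also be joined.
      entry = calls[tc['index']] ||= { 'id' => nil, 'name' => +'', 'arguments' => +'' }
      entry['id'] ||= tc['id']
      entry['name'] << tc.dig('function', 'name').to_s
      entry['arguments'] << tc.dig('function', 'arguments').to_s
    end
  end

  [calls.sort_by { |index, _| index }.map(&:last), finish_reason]
end

chunks = [
  { 'choices' => [{ 'index' => 0, 'finish_reason' => nil,
                    'delta' => { 'tool_calls' => [{ 'index' => 0, 'id' => 'call_a', 'type' => 'function',
                                                    'function' => { 'name' => 'git', 'arguments' => '{"cmd":"status"}' } }] } }] },
  { 'choices' => [{ 'index' => 0, 'delta' => {}, 'finish_reason' => 'tool_calls' }] }
]
calls, reason = collect_tool_calls(chunks)
p calls, reason
```

A client that treats every final chunk as a plain completion would drop these calls, which is why the `finish_reason` switch from `'stop'` to `'tool_calls'` is the part to check for.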

data/lib/legion/llm/api/openai/responses.rb CHANGED

@@ -76,13 +76,13 @@ module Legion
                         'X-Accel-Buffering' => 'no'
                 stream do |out|
-                  Responses.stream_response(out, executor, request_id: request_id, model: model)
+                  Responses.stream_response(out, executor, request_id: request_id, model: model, upstream_body: body)
                 rescue StandardError => e
                   handle_exception(e, level: :error, handled: false, operation: 'llm.api.openai.responses.stream', request_id: request_id)
                   out << "event: error\ndata: #{Legion::JSON.dump({ type: 'server_error', message: e.message })}\n\n"
                 end
               else
-                pipeline_response = executor.call
+                pipeline_response = executor.call_responses(body: body, stream: false)
                 response_body = Responses.format_response(pipeline_response, request_id: request_id, model: model)
                 log.info("[llm][api][openai][responses] action=complete request_id=#{request_id} model=#{response_body[:model]}")
@@ -179,7 +179,7 @@ module Legion
             }
           end
-          def self.stream_response(out, executor, request_id:, model:) # rubocop:disable Metrics/MethodLength
+          def self.stream_response(out, executor, request_id:, model:, upstream_body: nil) # rubocop:disable Metrics/MethodLength
             created_at = Time.now.to_i
             seq = 0
             in_progress_response = { id: request_id, object: 'response', created_at: created_at,
@@ -218,7 +218,7 @@ module Legion
             full_text = +''
-            pipeline_response = executor.call_stream do |chunk|
+            pipeline_response = call_streaming_executor(executor, upstream_body: upstream_body) do |chunk|
               text = chunk.respond_to?(:content) ? chunk.content.to_s : chunk.to_s
               next if text.empty?
@@ -237,6 +237,7 @@ module Legion
             tokens  = pipeline_response.tokens || {}
             resolved_model = (routing[:model] || routing['model'] || model).to_s
             usage = build_usage(tokens)
+            function_calls = build_output_tool_calls(pipeline_response)
             out << sse_event('response.output_text.done', {
                                type:            'response.output_text.done',
@@ -265,6 +266,41 @@ module Legion
                                item:            completed_item
                              })
+            function_calls.each_with_index do |function_call, index|
+              output_index = index + 1
+              in_progress_item = function_call.merge(status: 'in_progress', arguments: '')
+              out << sse_event('response.output_item.added', {
+                                 type:            'response.output_item.added',
+                                 sequence_number: seq += 1,
+                                 output_index:    output_index,
+                                 item:            in_progress_item
+                               })
+              out << sse_event('response.function_call_arguments.delta', {
+                                 type:            'response.function_call_arguments.delta',
+                                 sequence_number: seq += 1,
+                                 output_index:    output_index,
+                                 item_id:         function_call[:id],
+                                 delta:           function_call[:arguments]
+                               })
+              out << sse_event('response.function_call_arguments.done', {
+                                 type:            'response.function_call_arguments.done',
+                                 sequence_number: seq += 1,
+                                 output_index:    output_index,
+                                 item_id:         function_call[:id],
+                                 arguments:       function_call[:arguments]
+                               })
+              out << sse_event('response.output_item.done', {
+                                 type:            'response.output_item.done',
+                                 sequence_number: seq += 1,
+                                 output_index:    output_index,
+                                 item:            function_call
+                               })
+            end
             out << sse_event('response.completed', {
                                type:            'response.completed',
                                sequence_number: seq + 1,
@@ -274,7 +310,7 @@ module Legion
                                  created_at: created_at,
                                  status:     'completed',
                                  model:      resolved_model,
-                                 output:     [completed_item],
+                                 output:     [completed_item, *function_calls],
                                  usage:      usage
                                }
                              })
@@ -282,6 +318,14 @@ module Legion
             log.info("[llm][api][openai][responses] action=stream_complete request_id=#{request_id} model=#{resolved_model}")
           end
+          def self.call_streaming_executor(executor, upstream_body: nil, &)
+            if upstream_body && executor.respond_to?(:call_responses)
+              executor.call_responses(body: upstream_body, stream: true, &)
+            else
+              executor.call_stream(&)
+            end
+          end
           def self.sse_event(name, payload)
             "event: #{name}\ndata: #{Legion::JSON.dump(payload)}\n\n"
           end
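
Note on the function-call event sequence: for each function call the stream now emits four events in order (`output_item.added`, `function_call_arguments.delta`, `function_call_arguments.done`, `output_item.done`), each with an incremented sequence number. The sketch below rebuilds that ordering as data so it can be inspected without a stream. `function_call_events` is a hypothetical helper, and the sample item fields are assumptions, since `build_output_tool_calls` is not shown in this diff.

```ruby
# Hypothetical helper reproducing the per-call event order from the hunk above.
def function_call_events(function_call, output_index:, seq:)
  base = { output_index: output_index }
  [
    ['response.output_item.added',
     base.merge(item: function_call.merge(status: 'in_progress', arguments: ''))],
    ['response.function_call_arguments.delta',
     base.merge(item_id: function_call[:id], delta: function_call[:arguments])],
    ['response.function_call_arguments.done',
     base.merge(item_id: function_call[:id], arguments: function_call[:arguments])],
    ['response.output_item.done', base.merge(item: function_call)]
  ].each_with_index.map do |(name, payload), offset|
    # Sequence numbers continue from the caller's current counter.
    { type: name, sequence_number: seq + offset + 1 }.merge(payload)
  end
end

# Assumed item shape, for illustration only.
item = { id: 'fc_1', type: 'function_call', name: 'git', arguments: '{"cmd":"status"}', status: 'completed' }
events = function_call_events(item, output_index: 1, seq: 5)
events.each { |event| puts "#{event[:sequence_number]} #{event[:type]}" }
```

The whole argument string arrives in a single `delta` event here because the gateway already holds the complete call when it starts emitting, so the delta and done payloads carry the same value.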

data/lib/legion/llm/api/translators/openai_response.rb CHANGED

@@ -70,6 +70,51 @@ module Legion
             }
           end
+          def format_stream_tool_call_chunk(tool_call, model:, request_id:, index:)
+            fn = tool_call.is_a?(Hash) ? (tool_call[:function] || tool_call['function'] || {}) : {}
+            name = tool_call.respond_to?(:name) ? tool_call.name : (tool_call[:name] || tool_call['name'] || fn[:name] || fn['name'])
+            args = if tool_call.respond_to?(:arguments)
+                     tool_call.arguments
+                   else
+                     tool_call[:arguments] || tool_call['arguments'] || fn[:arguments] || fn['arguments'] || {}
+                   end
+            tc_id = tool_call.respond_to?(:id) ? tool_call.id : (tool_call[:id] || tool_call['id'] || "call_#{SecureRandom.hex(8)}")
+            format_stream_delta_chunk(
+              {
+                tool_calls: [
+                  {
+                    index:    index,
+                    id:       tc_id,
+                    type:     'function',
+                    function: {
+                      name:      name.to_s,
+                      arguments: args.is_a?(String) ? args : Legion::JSON.dump(args)
+                    }
+                  }
+                ]
+              },
+              model:      model,
+              request_id: request_id
+            )
+          end
+          def format_stream_delta_chunk(delta, model:, request_id:, finish_reason: nil)
+            {
+              id:      "chatcmpl-#{request_id.delete('-')}",
+              object:  'chat.completion.chunk',
+              created: Time.now.to_i,
+              model:   model.to_s,
+              choices: [
+                {
+                  index:         0,
+                  delta:         delta,
+                  finish_reason: finish_reason
+                }
+              ]
+            }
+          end
           def format_embeddings(vector, model:, input_text:, usage: nil)
             tokens = embedding_token_count(usage, input_text)

data/lib/legion/llm/call/dispatch.rb CHANGED

@@ -168,6 +168,7 @@ module Legion
         CAPABILITY_METHODS = {
           chat:         :chat,
           stream:       :stream,
+          responses:    :responses,
           embed:        :embed,
           image:        :image,
           count_tokens: :count_tokens
@@ -189,6 +190,9 @@ module Legion
           raise Legion::LLM::ProviderError, "unsupported capability: #{capability}" unless method_name
           ext = fetch_extension!(provider, instance: instance)
+          if ext.respond_to?(:supports?) && !ext.supports?(cap_sym)
+            raise Legion::LLM::ProviderError, "unsupported capability #{capability} for provider #{provider}"
+          end
           log.info("[llm][dispatch] capability=#{cap_sym} provider=#{provider} " \
                    "instance=#{instance || 'default'} model=#{model}")

data/lib/legion/llm/call/lex_llm_adapter.rb CHANGED

@@ -1,5 +1,6 @@
 # frozen_string_literal: true
+require 'event_stream_parser'
 require 'legion/logging/helper'
 module Legion
@@ -10,11 +11,13 @@ module Legion
         include Legion::Logging::Helper
         METADATA_KEYS = %i[tier capabilities enabled].freeze
+        RESPONSES_PROVIDER_FAMILIES = %i[openai vllm].freeze
         def initialize(provider_name, provider_class, instance_config: {})
           @provider_name = provider_name.to_sym
           @provider_class = provider_class
           @instance_config = instance_config
+          @capabilities = Array(instance_config[:capabilities] || instance_config['capabilities']).map(&:to_sym)
           @lex_llm_namespace = resolve_lex_llm_namespace
         end
@@ -58,6 +61,32 @@ module Legion
           end
         end
+        def responses(model:, body:, messages:, stream: false, **opts, &)
+          raise Legion::LLM::ProviderError, "Responses API dispatch is not supported for #{provider_name}" unless supports?(:responses)
+          payload = build_responses_payload(
+            body:     body,
+            model:    model,
+            messages: messages,
+            stream:   stream,
+            system:   opts[:system],
+            tools:    opts[:tools]
+          )
+          if stream
+            stream_responses_payload(payload, offering_metadata: opts[:offering_metadata], &)
+          else
+            response = provider.connection.post(responses_url, payload)
+            responses_hash_response(response.body, offering_metadata: opts[:offering_metadata])
+          end
+        end
+        def supports?(capability)
+          return true unless capability.to_sym == :responses
+          @capabilities.include?(:responses) || RESPONSES_PROVIDER_FAMILIES.include?(provider_name)
+        end
         def embed(model:, text:, dimensions: nil, **opts)
           model_info = model_info(model, offering_metadata: opts[:offering_metadata])
           response = provider.embed(
@@ -136,6 +165,236 @@ module Legion
           end
         end
+        def responses_url = '/v1/responses'
+        def build_responses_payload(body:, model:, messages:, stream:, system: nil, tools: nil)
+          payload = normalize_hash(body).dup
+          payload[:model] = model
+          payload[:stream] = stream
+          payload[:input] = responses_payload_input(payload, messages)
+          system_content = normalize_response_system(system)
+          payload[:instructions] = system_content if present_system?(system_content)
+          formatted_tools = responses_tools(tools)
+          payload[:tools] = formatted_tools if formatted_tools.any?
+          deep_compact(payload)
+        end
+        def responses_input(messages)
+          Array(messages).map do |message|
+            normalized = normalize_hash(message)
+            if normalized[:role].to_s == 'tool'
+              next({
+                type:    'function_call_output',
+                call_id: normalized[:tool_call_id].to_s,
+                output:  normalize_message_content(normalized[:content]).to_s
+              })
+            end
+            {
+              role:         normalized[:role]&.to_s || 'user',
+              content:      responses_message_content(normalized[:content]),
+              tool_call_id: normalized[:tool_call_id]
+            }.compact
+          end
+        end
+        def responses_payload_input(payload, messages)
+          return payload[:input] if payload.key?(:input)
+          return payload['input'] if payload.key?('input')
+          responses_input(messages)
+        end
+        def responses_message_content(content)
+          return content if content.nil? || content.is_a?(String)
+          if content.is_a?(Array)
+            parts = content.filter_map { |part| responses_content_part(part) }
+            return parts unless parts.empty?
+          end
+          text_part_content(content) || content.to_s
+        end
+        def responses_content_part(part)
+          return { type: 'input_text', text: part } if part.is_a?(String)
+          return part unless part.respond_to?(:transform_keys)
+          normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
+          type = normalized[:type].to_s
+          return { type: type, text: normalized[:text].to_s } if %w[input_text output_text text].include?(type)
+          part
+        end
+        def normalize_response_system(system)
+          return nil if system.nil?
+          return system[:content] || system['content'] if system.is_a?(Hash)
+          system.to_s
+        end
+        def responses_tools(tools)
+          normalize_tools(tools).values.map do |tool|
+            {
+              type:        'function',
+              name:        tool.name.to_s,
+              description: tool.description.to_s,
+              parameters:  tool.params_schema || { type: 'object', properties: {} }
+            }
+          end
+        end
+        def deep_compact(value)
+          case value
+          when Hash
+            value.each_with_object({}) do |(key, hash_value), compacted|
+              compact_value = deep_compact(hash_value)
+              compacted[key] = compact_value unless compact_value.nil?
+            end
+          when Array
+            value.map { |entry| deep_compact(entry) }.compact
+          else
+            value
+          end
+        end
+        def stream_responses_payload(payload, offering_metadata: nil, &block)
+          accumulator = build_responses_stream_accumulator
+          parser = EventStreamParser::Parser.new
+          response = provider.connection.post(responses_url, payload) do |req|
+            req.headers['Accept'] = 'text/event-stream'
+            attach_responses_stream_handler(req, parser, accumulator, block)
+          end
+          responses_stream_response(accumulator, response.body, offering_metadata: offering_metadata)
+        end
+        def build_responses_stream_accumulator
+          {
+            content:   +'',
+            model:     nil,
+            usage:     {},
+            completed: nil,
+            raw:       nil
+          }
+        end
+        def attach_responses_stream_handler(req, parser, accumulator, block)
+          handler = proc do |chunk, *_args|
+            parser.feed(chunk) do |_event, data|
+              handle_responses_stream_data(data, accumulator, block)
+            end
+          end
+          if req.options.respond_to?(:on_data=)
+            req.options.on_data = handler
+          else
+            req.options[:on_data] = handler
+          end
+        end
+        def handle_responses_stream_data(data, accumulator, block)
+          return if data == '[DONE]'
+          parsed = Legion::JSON.parse(data, symbolize_names: false)
+          return unless parsed.is_a?(Hash)
+          accumulator[:raw] = parsed
+          case parsed['type']
+          when 'response.output_text.delta'
+            accumulate_responses_text_delta(parsed, accumulator, block)
+          when 'response.completed'
+            response = parsed['response'] || {}
+            accumulator[:completed] = response
+            accumulator[:model] = response['model'] if response['model']
+            accumulator[:usage] = responses_usage(response['usage'])
+          end
+        end
+        def accumulate_responses_text_delta(parsed, accumulator, block)
+          delta = parsed['delta'].to_s
+          return if delta.empty?
+          accumulator[:content] << delta
+          block&.call(
+            lex_llm_namespace::Chunk.new(
+              role:     :assistant,
+              content:  delta,
+              model_id: parsed['model'],
+              raw:      parsed,
+              tokens:   nil
+            )
+          )
+        end
+        def responses_stream_response(accumulator, response_body, offering_metadata: nil)
+          completed = accumulator[:completed] || {}
+          content = accumulator[:content]
+          content = extract_responses_text(completed) if content.empty?
+          {
+            result:   content,
+            model:    accumulator[:model] || completed['model'],
+            usage:    accumulator[:usage],
+            metadata: response_metadata(completed.empty? ? response_body : completed, offering_metadata: offering_metadata)
+          }.compact
+        end
+        def responses_hash_response(body, offering_metadata: nil)
+          normalized = normalize_string_hash(body)
+          {
+            result:   extract_responses_text(normalized),
+            model:    normalized['model'],
+            usage:    responses_usage(normalized['usage']),
+            metadata: response_metadata(normalized, offering_metadata: offering_metadata)
+          }.compact
+        end
+        def normalize_string_hash(value)
+          return value.map { |entry| normalize_string_hash(entry) } if value.is_a?(Array)
+          return {} unless value.respond_to?(:each_pair)
+          value.each_with_object({}) do |(key, hash_value), normalized|
+            normalized[key.to_s] = normalize_string_hash_value(hash_value)
+          end
+        end
+        def normalize_string_hash_value(value)
+          return normalize_string_hash(value) if value.respond_to?(:each_pair)
+          return value.map { |entry| normalize_string_hash_value(entry) } if value.is_a?(Array)
+          value
+        end
+        def extract_responses_text(body)
+          return body['output_text'].to_s if body['output_text']
+          Array(body['output']).flat_map do |item|
+            Array(item['content']).filter_map do |content|
+              next unless %w[output_text text].include?(content['type'].to_s)
+              content['text']
+            end
+          end.join
+        end
+        def responses_usage(usage)
+          usage = normalize_string_hash(usage)
+          input = usage['input_tokens'] || usage['prompt_tokens']
+          output = usage['output_tokens'] || usage['completion_tokens']
+          {
+            input_tokens:       input.to_i,
+            output_tokens:      output.to_i,
+            cache_read_tokens:  usage.dig('input_tokens_details', 'cached_tokens').to_i,
+            cache_write_tokens: usage.dig('input_tokens_details', 'cache_creation_tokens').to_i
+          }
+        end
         def model_info(model, offering_metadata: nil)
           offering = normalize_offering_metadata(offering_metadata)
           lex_llm_namespace::Model::Info.new(
@@ -243,7 +502,7 @@ module Legion
           if part.respond_to?(:transform_keys)
             normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
-            return unless normalized[:type].to_s == 'text'
+            return unless %w[input_text output_text text].include?(normalized[:type].to_s)
             return normalized[:text].to_s
           end
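
Note on the non-streaming Responses path above: a minimal sketch of how a response body is reduced to text and usage counts. The two methods are simplified copies of `extract_responses_text` and `responses_usage` from the hunk (the key normalization step is dropped, so string keys are assumed), and the sample body is invented for illustration rather than captured from a real provider.

```ruby
# Simplified copies of the helpers in the diff; both expect string-keyed hashes.
def extract_responses_text(body)
  return body['output_text'].to_s if body['output_text']

  Array(body['output']).flat_map do |item|
    Array(item['content']).filter_map do |content|
      next unless %w[output_text text].include?(content['type'].to_s)

      content['text']
    end
  end.join
end

def responses_usage(usage)
  usage ||= {}
  {
    input_tokens:       (usage['input_tokens'] || usage['prompt_tokens']).to_i,
    output_tokens:      (usage['output_tokens'] || usage['completion_tokens']).to_i,
    cache_read_tokens:  usage.dig('input_tokens_details', 'cached_tokens').to_i,
    cache_write_tokens: usage.dig('input_tokens_details', 'cache_creation_tokens').to_i
  }
end

# Invented sample body, shaped like a Responses API result.
body = {
  'model'  => 'example-model',
  'output' => [
    { 'type' => 'reasoning', 'content' => [{ 'type' => 'summary_text', 'text' => 'skipped' }] },
    { 'type' => 'message', 'content' => [
      { 'type' => 'output_text', 'text' => 'Hello, ' },
      { 'type' => 'output_text', 'text' => 'world.' }
    ] }
  ],
  'usage'  => {
    'input_tokens'         => 12,
    'output_tokens'        => 4,
    'input_tokens_details' => { 'cached_tokens' => 8 }
  }
}

text  = extract_responses_text(body)
usage = responses_usage(body['usage'])
puts text
puts usage
```

Content parts whose type is neither `output_text` nor `text` are skipped, and missing usage fields collapse to zero because `nil.to_i` is `0`.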

data/lib/legion/llm/call/providers.rb CHANGED

@@ -104,7 +104,7 @@ module Legion
         end
         def adapter_instance_config(config, instance_id)
-          config.except(:tier, :capabilities).tap do |registry_config|
+          config.except(:tier).tap do |registry_config|
             registry_config[:instance_id] ||= instance_id
           end
         end
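
Note on why `:capabilities` is no longer stripped here: the adapter in this release reads `instance_config[:capabilities]` and consults it in `supports?`, and the dispatcher refuses capabilities the adapter does not support. The sketch below reproduces that gate in isolation. `FakeAdapter`, `dispatch`, and the local `ProviderError` are hypothetical stand-ins, and the provider names outside `openai` and `vllm` are examples only.

```ruby
# Stand-in for Legion::LLM::ProviderError.
class ProviderError < StandardError; end

CAPABILITY_METHODS = { chat: :chat, responses: :responses }.freeze

# Hypothetical adapter that answers supports? the way the adapter in this diff does:
# everything except :responses is treated as available.
class FakeAdapter
  RESPONSES_PROVIDER_FAMILIES = %i[openai vllm].freeze

  def initialize(provider_name, instance_config: {})
    @provider_name = provider_name.to_sym
    @capabilities = Array(instance_config[:capabilities]).map(&:to_sym)
  end

  def supports?(capability)
    return true unless capability.to_sym == :responses

    @capabilities.include?(:responses) || RESPONSES_PROVIDER_FAMILIES.include?(@provider_name)
  end

  def chat(**_opts)
    :chat_result
  end

  def responses(**_opts)
    :responses_result
  end
end

# Simplified version of the capability check added to the dispatcher.
def dispatch(ext, provider:, capability:, **opts)
  cap_sym = capability.to_sym
  method_name = CAPABILITY_METHODS[cap_sym]
  raise ProviderError, "unsupported capability: #{capability}" unless method_name
  if ext.respond_to?(:supports?) && !ext.supports?(cap_sym)
    raise ProviderError, "unsupported capability #{capability} for provider #{provider}"
  end

  ext.public_send(method_name, **opts)
end

openai     = FakeAdapter.new(:openai)
configured = FakeAdapter.new(:example, instance_config: { capabilities: ['responses'] })
plain      = FakeAdapter.new(:example)

puts dispatch(openai, provider: :openai, capability: :responses)
puts dispatch(configured, provider: :example, capability: :responses)
begin
  dispatch(plain, provider: :example, capability: :responses)
rescue ProviderError => e
  puts e.message
end
```

If the config still dropped `:capabilities`, the `configured` adapter in this sketch would behave like `plain` and the Responses call would be rejected.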

data/lib/legion/llm/inference/executor.rb CHANGED

@@ -69,7 +69,7 @@ module Legion
         ].freeze
         MAX_NATIVE_TOOL_ROUNDS = 200
-        ToolResultEvent = Struct.new(:result, :tool_call_id, :tool_name, :started_at, keyword_init: true)
+        ToolResultEvent = Struct.new(:result, :tool_call_id, :tool_name, :started_at, :status, keyword_init: true)
         ASYNC_THREAD_POOL = Concurrent::FixedThreadPool.new(4, fallback_policy: :caller_runs)
@@ -124,6 +124,14 @@ module Legion
           build_response
         end
+        def call_responses(body:, stream: false, &)
+          log.debug "[llm][executor] action=call_responses request_id=#{@request.id} profile=#{@profile} stream=#{stream}"
+          execute_pre_provider_steps
+          execute_provider_request_responses(body: body, stream: stream, &)
+          execute_post_provider_steps
+          build_response
+        end
         private
         def llm_setting(key, default = nil)
@@ -904,12 +912,29 @@ module Legion
             offering_id:       @resolved_offering_id,
             offering_metadata: @resolved_offering_metadata
           }
+          options[:system] = native_tool_loop_system(options[:system])
           options[:tools] = native_dispatch_tools if native_dispatch_tools.any?
           options[:tool_prefs] = native_tool_prefs if native_dispatch_tools.any? && native_tool_prefs
           options[:thinking] = native_dispatch_thinking if native_dispatch_thinking
           options.compact
         end
+        def native_tool_loop_system(system)
+          return system unless @native_tool_loop_round.to_i.positive? && native_dispatch_tools.any?
+          [system, native_tool_loop_continuation_prompt].compact.join("\n\n")
+        end
+        def native_tool_loop_continuation_prompt
+          <<~PROMPT.strip
+            Tool-use continuation rule:
+            - You just received tool results.
+            - If a tool failed or produced incomplete information and another available tool can continue the user's request, call that tool now.
+            - Do not say you will use a tool unless you are actually making the tool call in this response.
+            - Only provide a final answer when no further tool call is needed or possible.
+          PROMPT
+        end
         def native_dispatch_chat_options
           opts = { model: @resolved_model, provider: @resolved_provider }
           opts[:instance] = @resolved_instance if @resolved_instance
@@ -951,7 +976,42 @@ module Legion
             return value if [true, false].include?(value)
           end
-          Legion::LLM::Settings.value(:tool_trigger, :client_tool_passthrough) != false
+          Legion::LLM::Settings.value(:tool_trigger, :client_tool_passthrough) == true
+        end
+        def client_tool_passthrough_allowed?(definition)
+          names = client_tool_passthrough_name_variants(definition)
+          whitelist = client_tool_passthrough_list(:client_tool_passthrough_whitelist)
+          blacklist = client_tool_passthrough_list(:client_tool_passthrough_blacklist)
+          return false if whitelist.any? && !names.intersect?(whitelist)
+          return false if names.intersect?(blacklist)
+          true
+        end
+        def client_tool_passthrough_list(key)
+          defaults = {
+            client_tool_passthrough_whitelist: Legion::LLM::Settings::CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT,
+            client_tool_passthrough_blacklist: Legion::LLM::Settings::CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT
+          }
+          Array(Legion::LLM::Settings.value(:tool_trigger, key, default: defaults.fetch(key))).flat_map do |entry|
+            client_tool_policy_variants(entry)
+          end.uniq
+        end
+        def client_tool_passthrough_name_variants(definition)
+          source = definition.respond_to?(:source) ? definition.source : {}
+          raw_name = source[:raw_name] || source['raw_name'] if source.is_a?(Hash)
+          [definition.name, raw_name].compact.flat_map { |name| client_tool_policy_variants(name) }.uniq
+        end
+        def client_tool_policy_variants(value)
+          raw = value.to_s.strip.downcase
+          sanitized = Types::ToolDefinition.sanitize_tool_name(value).downcase
+          compact = raw.gsub(/[^a-z0-9]/, '')
+          [raw, sanitized, compact].reject(&:empty?).uniq
         end
         def non_executable_client_tool?(definition)
@@ -989,8 +1049,16 @@ module Legion
             )
             return
           end
+          if non_executable_client_tool?(definition) && !client_tool_passthrough_allowed?(definition)
+            log.info(
+              "[llm][tools][inject] action=client_tool_skipped request_id=#{request_log_value(:id, 'unknown')} " \
+              "conversation_id=#{request_log_value(:conversation_id, 'none') || 'none'} name=#{definition.name} " \
+              'reason=client_passthrough_policy'
+            )
+            return
+          end
           return if gaia_tool_suppressed?(definition.name)
-          return if definitions.any? { |existing| existing.name == definition.name }
+          return if native_tool_definition_duplicate?(definitions, definition)
           @injected_tool_map[definition.name] = definition.source[:tool_class] if definition.source[:tool_class]
           @native_tool_source_map[definition.name] = definition.source
@@ -1015,6 +1083,24 @@ module Legion
           handle_exception(e, level: :error, operation: 'llm.pipeline.native_registry_tools')
         end
+        def native_tool_definition_duplicate?(definitions, definition)
+          candidate_names = native_tool_definition_name_variants(definition)
+          definitions.any? do |existing|
+            native_tool_definition_name_variants(existing).intersect?(candidate_names)
+          end
+        end
+        def native_tool_definition_name_variants(definition)
+          variants = client_tool_passthrough_name_variants(definition)
+          source = definition.respond_to?(:source) ? definition.source : {}
+          source_type = nil
+          source_type = source[:type] || source['type'] if source.is_a?(Hash)
+          if source_type.respond_to?(:to_sym) && source_type.to_sym == :special
+            variants += Tools::Special.aliases_for(definition.name).flat_map { |name| client_tool_policy_variants(name) }
+          end
+          variants.uniq
+        end
         def add_settings_extensions_tool_definitions(definitions)
           existing_names = definitions.map(&:name)
           inject_limit = registry_tool_limit
@@ -1097,7 +1183,8 @@ module Legion
               result:       native_tool_result_content(result),
               tool_call_id: normalized_call[:id],
               tool_name:    normalized_call[:name],
-              started_at:   Thread.current[:legion_current_tool_started_at]
+              started_at:   Thread.current[:legion_current_tool_started_at],
+              status:       result[:status] || result['status']
             )
           )
           result
@@ -1339,6 +1426,30 @@ module Legion
           @raw_response = Call::NativeResponseAdapter.new(result)
         end
+        def execute_provider_request_responses(body:, stream:, &block)
+          @timestamps[:provider_start] = Time.now
+          @timeline.record(
+            category: :provider, key: 'provider:request_sent',
+            exchange_id: @exchange_id, direction: :outbound,
+            detail: "responses from #{@resolved_provider}",
+            from: 'pipeline', to: "provider:#{@resolved_provider}"
+          )
+          raise Legion::LLM::ProviderError, "Native provider not registered: #{@resolved_provider}" unless use_native_dispatch?(@resolved_provider)
+          result = dispatch_responses_request(
+            body:         body,
+            messages:     native_dispatch_messages,
+            stream:       stream,
+            stream_block: block
+          )
+          merge_response_offering_metadata(result[:metadata])
+          @raw_response = Call::NativeResponseAdapter.new(result)
+          @timestamps[:provider_end] = Time.now
+          record_provider_response
+        end
         def normalize_message_content(content)
           return content if content.nil? || content.is_a?(String)
           return content unless content.is_a?(Array)
@@ -1396,12 +1507,13 @@ module Legion
           started_at = tool_result.respond_to?(:started_at)   ? tool_result.started_at   : Thread.current[:legion_current_tool_started_at]
           finished_at = Time.now
           raw = tool_result.respond_to?(:result) ? tool_result.result : tool_result
+          status = tool_result.respond_to?(:status) ? tool_result.status : nil
           duration_ms = started_at ? ((finished_at - started_at) * 1000).round : nil
           result_str = (raw.is_a?(String) ? raw : raw.to_s)
           result_str = result_str.encode('UTF-8', invalid: :replace, undef: :replace, replace: '�') unless result_str.valid_encoding?
           result_str = result_str.delete("\x00")
-          is_error = raw.is_a?(Hash) && (raw[:error] || raw['error']) ? true : false
+          is_error = status.to_s == 'error' || (raw.is_a?(Hash) && (raw[:error] || raw['error']) ? true : false)
           @pending_tool_history_mutex.synchronize do
             entry = @pending_tool_history.find { |e| e[:tool_call_id] == tc_id && e[:result].nil? }
@@ -1425,7 +1537,7 @@ module Legion
           @tool_event_handler&.call(
             type: :tool_result, tool_call_id: tc_id, tool_name: tc_name,
-            result: result_str[0, 4096], result_size: result_str.bytesize,
+            result: result_str[0, 4096], result_size: result_str.bytesize, status: is_error ? :error : :success,
             started_at: started_at, finished_at: finished_at, duration_ms: duration_ms
           )
@@ -2016,16 +2128,22 @@ module Legion
         end
         def response_tool_calls
-          # Prefer typed ToolCall objects from pending history (already built during execution)
+          raw_tool_calls = @raw_response.respond_to?(:tool_calls) ? @raw_response.tool_calls : nil
+          return build_response_tool_calls(raw_tool_calls) if raw_tool_calls&.any?
+          # Fall back to typed ToolCall objects from pending history when the final
+          # model response completed after server-side tool execution.
           typed_from_history = @pending_tool_history
                                .filter_map { |entry| entry[:typed_call] }
           return typed_from_history if typed_from_history.any?
-          return [] unless @raw_response.respond_to?(:tool_calls) && @raw_response.tool_calls
+          []
+        end
+        def build_response_tool_calls(tool_calls)
           tool_timeline = build_tool_timeline_index
-          Array(@raw_response.tool_calls).map do |tool_call|
+          Array(tool_calls).map do |tool_call|
             tc_id   = tool_call[:id] || tool_call['id']
             tc_name = tool_call[:name] || tool_call['name']
             tc_args = tool_call[:arguments] || tool_call['arguments'] || {}
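
Note on the client tool passthrough policy added in this file: a minimal sketch of the whitelist and blacklist check using name variants. `Types::ToolDefinition.sanitize_tool_name` is not shown in this diff, so the `sanitize_tool_name` below is a hypothetical stand-in that only replaces characters outside `[a-zA-Z0-9_-]`; the real sanitizer may differ. `(a & b).any?` is used in place of `Array#intersect?` so the sketch also runs on Ruby versions older than 3.1.

```ruby
# Hypothetical stand-in for Types::ToolDefinition.sanitize_tool_name.
def sanitize_tool_name(value)
  value.to_s.strip.gsub(/[^a-zA-Z0-9_-]/, '_')
end

# Same three variants as client_tool_policy_variants in the diff:
# lowercased raw name, sanitized name, and alphanumeric-only form.
def policy_variants(value)
  raw = value.to_s.strip.downcase
  sanitized = sanitize_tool_name(value).downcase
  compact = raw.gsub(/[^a-z0-9]/, '')
  [raw, sanitized, compact].reject(&:empty?).uniq
end

def passthrough_allowed?(tool_names, whitelist: [], blacklist: [])
  names = Array(tool_names).flat_map { |name| policy_variants(name) }.uniq
  allow = whitelist.flat_map { |entry| policy_variants(entry) }.uniq
  deny  = blacklist.flat_map { |entry| policy_variants(entry) }.uniq

  # A non-empty whitelist must match; the blacklist always wins.
  return false if allow.any? && (names & allow).empty?
  return false if (names & deny).any?

  true
end

puts passthrough_allowed?('Sudo', blacklist: ['sudo'])
puts passthrough_allowed?('plugin:cron:create', blacklist: ['plugin__cron__create'])
puts passthrough_allowed?('read_file', blacklist: ['sudo'])
```

Comparing several variants means a differently punctuated spelling of a blacklisted name is still caught through its alphanumeric-only form.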

data/lib/legion/llm/inference/native_tool_loop.rb CHANGED

@@ -114,11 +114,27 @@ module Legion
           text = latest_user_text.to_s.downcase
           return if text.empty?
-          native_dispatch_tools.keys.map(&:to_s).find do |tool_name|
-            text.include?(tool_name.downcase)
+          native_dispatch_tools.keys.map(&:to_s).sort_by { |tool_name| -tool_name.length }.find do |tool_name|
+            explicit_tool_name_mentioned?(text, tool_name)
           end
         end
+        def explicit_tool_name_mentioned?(text, tool_name)
+          explicit_tool_name_candidates(tool_name).any? do |candidate|
+            text.match?(/(?<![[:alnum:]_-])#{Regexp.escape(candidate)}(?![[:alnum:]_-])/)
+          end
+        end
+        def explicit_tool_name_candidates(tool_name)
+          normalized_name = tool_name.to_s.downcase
+          [
+            normalized_name,
+            normalized_name.tr('_-', ' '),
+            normalized_name.tr('_', '-'),
+            normalized_name.tr('-', '_')
+          ].reject(&:empty?).uniq
+        end
         def latest_user_text
           message = Array(@request.messages).reverse.find do |msg|
             msg.is_a?(Hash) && (msg[:role] || msg['role']).to_s == 'user'
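
Note on the stricter tool-name matching above: the old code used a plain substring test, so a tool named `read` would match the word "thread". The sketch below lifts the two new helpers from the hunk into free functions and wraps them in a hypothetical `requested_tool` method that takes the tool names as an argument instead of reading `native_dispatch_tools`. The tool names and messages are invented.

```ruby
# Spelling variants of a tool name: as-is, with spaces, with hyphens, with underscores.
def explicit_tool_name_candidates(tool_name)
  normalized_name = tool_name.to_s.downcase
  [
    normalized_name,
    normalized_name.tr('_-', ' '),
    normalized_name.tr('_', '-'),
    normalized_name.tr('-', '_')
  ].reject(&:empty?).uniq
end

# A candidate counts only when it is not glued to a letter, digit, "_" or "-".
def explicit_tool_name_mentioned?(text, tool_name)
  explicit_tool_name_candidates(tool_name).any? do |candidate|
    text.match?(/(?<![[:alnum:]_-])#{Regexp.escape(candidate)}(?![[:alnum:]_-])/)
  end
end

# Hypothetical wrapper: longest names are tried first, as in the diff.
def requested_tool(text, tool_names)
  text = text.to_s.downcase
  return if text.empty?

  tool_names.map(&:to_s).sort_by { |name| -name.length }.find do |name|
    explicit_tool_name_mentioned?(text, name)
  end
end

p requested_tool('Please use read_file on notes.txt', %w[read read_file])
p requested_tool('Thread safety matters here', %w[read])
p requested_tool('Run the web search tool', %w[web_search])
```

Sorting by descending length only matters when more than one tool name would match the same message; the boundary lookarounds do most of the work.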

data/lib/legion/llm/inference/route_attempts.rb CHANGED

@@ -24,6 +24,41 @@ module Legion
           end
         end
+        def dispatch_responses_request(body:, messages:, stream:, stream_block: nil)
+          raise Legion::LLM::ProviderError, 'Responses API upstream dispatch is not supported for fleet providers' if fleet_dispatch?
+          idempotency_key = next_route_idempotency_key
+          result = Call::Dispatch.call(
+            provider:   @resolved_provider,
+            instance:   @resolved_instance,
+            capability: :responses,
+            model:      @resolved_model,
+            body:       body,
+            messages:   messages,
+            stream:     stream,
+            **native_dispatch_options,
+            &stream_block
+          )
+          record_route_attempt(
+            dispatch_path:   :direct,
+            operation:       :responses,
+            status:          :success,
+            idempotency_key: idempotency_key,
+            selected_lane:   nil
+          )
+          result
+        rescue StandardError => e
+          record_route_attempt(
+            dispatch_path:   :direct,
+            operation:       :responses,
+            status:          :failure,
+            idempotency_key: idempotency_key,
+            selected_lane:   nil,
+            failure_reason:  e.message
+          )
+          raise
+        end
         def dispatch_direct_request(capability:, operation:, messages:, stream_block: nil)
           idempotency_key = next_route_idempotency_key
           result = Call::Dispatch.call(
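
Note on the record-then-re-raise shape of `dispatch_responses_request` above: a minimal sketch of the pattern on its own. `ATTEMPTS`, the simplified `record_route_attempt`, and `dispatch_with_audit` are hypothetical stand-ins; the real method records more fields (idempotency key, selected lane) and calls `Call::Dispatch.call` rather than yielding.

```ruby
# Hypothetical in-memory log standing in for the gem's route-attempt recording.
ATTEMPTS = []

def record_route_attempt(**fields)
  ATTEMPTS << fields
end

# Record a success after the block returns, or a failure before re-raising.
def dispatch_with_audit(operation:)
  result = yield
  record_route_attempt(dispatch_path: :direct, operation: operation, status: :success)
  result
rescue StandardError => e
  record_route_attempt(
    dispatch_path:  :direct,
    operation:      operation,
    status:         :failure,
    failure_reason: e.message
  )
  raise
end

ok = dispatch_with_audit(operation: :responses) { { result: 'ok' } }

begin
  dispatch_with_audit(operation: :responses) { raise ArgumentError, 'upstream rejected request' }
rescue ArgumentError
  # The original error still reaches the caller after being recorded.
end

p ok
p ATTEMPTS
```

A bare `raise` inside `rescue` re-raises the same exception object, so callers see the provider's error unchanged.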

data/lib/legion/llm/settings.rb CHANGED

@@ -8,6 +8,16 @@ module Legion
     module Settings
       extend Legion::Logging::Helper
+      CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT = [
+        'sudo', 'visudo', 'su', 'legion', 'legionio', 'legionio do', 'legionio/legion',
+        'computer_use_session', 'computer_use_control', 'computer_use_session_info',
+        'computer_use_session_message', 'plugin__aithena__recall', 'plugin__aithena__remember',
+        'plugin__aithena__skill_search', 'plugin__aithena__skill_feedback', 'plugin__aithena__memory_stats',
+        'plugin__cron__create', 'plugin__cron__list', 'plugin__cron__get', 'plugin__cron__update',
+        'plugin__cron__delete', 'plugin__cron__get_history', 'plugin__cron__run_now', 'plugin__cron__stop'
+      ].freeze
+      CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT = [].freeze
       def self.default
         model_override = ENV.fetch('ANTHROPIC_MODEL', nil)
         {
@@ -482,10 +492,12 @@ module Legion
       def self.tool_trigger_defaults
         {
-          scan_depth:              10,
-          tool_limit:              25,
-          local_tool_limit:        100,
-          client_tool_passthrough: false
+          scan_depth:                        10,
+          tool_limit:                        25,
+          local_tool_limit:                  100,
+          client_tool_passthrough:           false,
+          client_tool_passthrough_whitelist: CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT.dup,
+          client_tool_passthrough_blacklist: CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT.dup
         }
       end
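
Note on the `.dup` calls in `tool_trigger_defaults` above: the constants are frozen, while each defaults hash gets its own unfrozen copy. The sketch below shows the effect with an abbreviated blacklist (only the first three entries from the diff); the `'example_tool'` entry is invented.

```ruby
# Abbreviated from the diff; the real constant lists more tool names.
CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT = %w[sudo visudo su].freeze
CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT = [].freeze

def tool_trigger_defaults
  {
    client_tool_passthrough:           false,
    client_tool_passthrough_whitelist: CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT.dup,
    client_tool_passthrough_blacklist: CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT.dup
  }
end

defaults = tool_trigger_defaults
# Allowed: Array#dup returns an unfrozen copy, so callers can extend their own list.
defaults[:client_tool_passthrough_blacklist] << 'example_tool'

p defaults[:client_tool_passthrough_blacklist]
p CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT
```

The copy is shallow, so the string elements are still shared with the constant; only the array itself is independent.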

data/lib/legion/llm/tools/special.rb CHANGED

@@ -19,6 +19,10 @@ module Legion
         LIST_ALL_TOOLS_NAME = 'legion_list_all_tools'
         DEFAULT_TIMEOUT_MS = 120_000
         MAX_TIMEOUT_MS = 600_000
+        TOOL_ALIASES = {
+          'python' => %w[python python3],
+          'pip'    => %w[pip pip3]
+        }.freeze
         PYTHON_PACKAGES = %w[
           python-pptx
           python-docx
@@ -60,6 +64,11 @@ module Legion
           { status: :error, result: e.message }
         end
+        def aliases_for(tool_name)
+          normalized = normalize_tool_name(tool_name)
+          TOOL_ALIASES.fetch(normalized, [normalized])
+        end
         def inventory
           {
             special_tools:             special_tool_summaries,
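
Note on `aliases_for` above: a minimal sketch of the alias lookup and of how it can feed a duplicate check like the one added in executor.rb. `normalize_tool_name` is not shown in this diff, so the version below (strip and downcase) is a hypothetical stand-in, and `duplicate_of_special?` is a simplified illustration rather than the gem's `native_tool_definition_duplicate?`.

```ruby
TOOL_ALIASES = {
  'python' => %w[python python3],
  'pip'    => %w[pip pip3]
}.freeze

# Hypothetical normalizer; the gem's implementation is not part of this diff.
def normalize_tool_name(tool_name)
  tool_name.to_s.strip.downcase
end

# Same lookup as the diff: unknown names fall back to a one-element list.
def aliases_for(tool_name)
  normalized = normalize_tool_name(tool_name)
  TOOL_ALIASES.fetch(normalized, [normalized])
end

# Simplified duplicate check: does the candidate collide with any alias
# of an already-registered special tool?
def duplicate_of_special?(special_names, candidate_name)
  candidate = normalize_tool_name(candidate_name)
  special_names.any? { |name| aliases_for(name).include?(candidate) }
end

p aliases_for('Python')
p aliases_for('ruby')
p duplicate_of_special?(%w[python], 'python3')
```

With this table the lookup is keyed by the canonical name only, so `aliases_for('python3')` returns just `['python3']` under the assumed normalizer.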

data/lib/legion/llm/version.rb CHANGED

@@ -2,6 +2,6 @@
 module Legion
   module LLM
-    VERSION = '0.9.36'
+    VERSION = '0.9.51'
   end
 end

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: legion-llm
 version: !ruby/object:Gem::Version
-  version: 0.9.36
+  version: 0.9.51
 platform: ruby
 authors:
 - Esity
@@ -23,6 +23,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '0'
+- !ruby/object:Gem::Dependency
+  name: event_stream_parser
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1'
 - !ruby/object:Gem::Dependency
   name: faraday
   requirement: !ruby/object:Gem::Requirement