legion-llm 0.9.36 → 0.9.51

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 93611da95712602a9f99e00c4b34523c23838a99d34c3c441ea6bef642231e3f
- data.tar.gz: 6ced6ad0b6091c5a3d53702b867eea5f04d35199892338023aebb6bb452ed867
+ metadata.gz: e0cae7c608acb8fe3f09852e7833639c08998bb944b2df026d9c1999c03ff017
+ data.tar.gz: bbea922035cf6f38eb43139ea5d33deaca70cbbea8693edccb3811bdc5f43608
  SHA512:
- metadata.gz: aa99ed858c6bef1fc214a45d4d59e51f1e9f0262f75dcdbd0f60645d59296edf6fa57e47dfa706dd0b06ec7c7f6dbf572f3832235d0d7125cd9992ec65aa6eee
- data.tar.gz: dfe7e2db5cf883de39a5ac47438408a858372a52dd82230baa4a624e33e17b0558eb50359237345afa5b8a1df432b164149c3fce540304ac56ffbad888110c33
+ metadata.gz: cc620102bcfdbd73387ba3da2e31e80e4fd9c9b9fd3ceeb85b00417972deeda55bbc427702df3a86ec7a2d3f07be34f99383bf9141b79679e394e66c45eda7c1
+ data.tar.gz: 4f5b8e4739873d147be2ddfed81c6a04297a016f9c7c4c143b6ad0409f61a9f75c39a14c6de4b30f37119b4e5aa577e4faba813917da0a9c771c016500e079a2
data/CHANGELOG.md CHANGED
@@ -1,5 +1,85 @@
  # Legion LLM Changelog
 
+ ## [0.9.51] - 2026-05-23
+
+ ### Changed
+ - Settings: `tool_trigger.client_tool_passthrough` now defaults to `false`; callers must opt in with request metadata or a settings override before non-executable client tools are passed through to providers.
+
+ ## [0.9.50] - 2026-05-23
+
+ ### Fixed
+ - API: native `/api/llm/inference` now debug-logs the exact outward response payload for sync JSON responses and streaming `done` events, making client passthrough tool-call shape visible in runtime logs.
+
+ ## [0.9.49] - 2026-05-23
+
+ ### Fixed
+ - API: native `/api/llm/inference` client tool requests now use the OpenAI Chat Completions `tool_calls` shape with `type: "function"` and `function.name` / JSON-string `function.arguments`, aligning native sync responses and streaming `done.tool_calls` with the OpenAI-compatible endpoints.
+
+ ## [0.9.48] - 2026-05-23
+
+ ### Fixed
+ - API: OpenAI-compatible streaming now emits tool callbacks for both Chat Completions and Responses: `/v1/chat/completions` streams `delta.tool_calls` and finishes with `finish_reason: "tool_calls"`, while `/v1/responses` streams `function_call` output item events and includes them in `response.completed.output`.
+
+ ## [0.9.47] - 2026-05-22
+
+ ### Fixed
+ - API: streaming client passthrough tool calls now stay only on the terminal `done.tool_calls` payload instead of emitting a live `tool-call` event that makes clients wait for an impossible same-stream `tool-result`.
+
+ ## [0.9.46] - 2026-05-22
+
+ ### Fixed
+ - API: returned client passthrough tool calls now keep the existing streaming `tool-call` event name while carrying `clientPassthrough` and `requiresToolResult` metadata, preserving current client execution behavior.
+
+ ## [0.9.45] - 2026-05-22
+
+ ### Fixed
+ - Tools: vLLM explicit tool-choice matching now uses tool-name boundaries, so paths like `/rubymine/...` no longer force the `ruby` tool when the user asked for `git`.
+
+ ## [0.9.44] - 2026-05-22
+
+ ### Fixed
+ - API: streaming client passthrough tool calls now emit `client-tool-call` with explicit client execution metadata instead of a server-execution `tool-call`, preventing clients from waiting for an impossible same-stream server tool result.
+ - Settings: default client passthrough blacklist now blocks computer-use session/control tools plus Aithena and cron plugin tools so provider calls do not hand off internal UI/plugin controls as executable client tools.
+
+ ## [0.9.43] - 2026-05-22
+
+ ### Fixed
+ - Tools: responses that end on client passthrough after prior server-side tool execution now return the current passthrough tool call instead of replaying the earlier executed tool from pending history.
+
+ ## [0.9.42] - 2026-05-22
+
+ ### Fixed
+ - Tools: native tool-loop follow-up provider calls now include a continuation instruction that tells models to make another available tool call instead of narrating intent after a failed or incomplete tool result.
+ - API: streamed native tool results now include status and emit `tool-error` when the server-side tool execution failed.
+
+ ## [0.9.41] - 2026-05-22
+
+ ### Fixed
+ - API: streaming inference `done` events now include `stop_reason` and `requires_tool_result` when client passthrough tool calls must be executed and submitted back by the caller, matching the sync response contract and preventing wrappers from treating tool-use turns as final completions.
+
+ ## [0.9.40] - 2026-05-22
+
+ ### Added
+ - Settings: client tool passthrough blacklist now blocks Legion launcher-style client tools such as `legion`, `legionio`, `legionio do`, and `legionio/legion` by default, including sanitized tool-name variants.
+ - Tools: client `python3` and `pip3` passthrough definitions deduplicate against Legion's native `python` and `pip` special tools when the managed Python runtime is injected.
+
+ ## [0.9.39] - 2026-05-22
+
+ ### Added
+ - Settings: `client_tool_passthrough_whitelist` and `client_tool_passthrough_blacklist` now filter non-executable client tools before native provider dispatch; `client_tool_passthrough` defaults to enabled, explicit API true/false overrides are preserved, the default blacklist blocks `sudo`, `visudo`, and `su`, and the default whitelist is empty.
+
+ ## [0.9.38] - 2026-05-22
+
+ ### Fixed
+ - API: OpenAI Responses upstream dispatch now preserves Responses `input_text` / `output_text` content parts instead of stringifying them through chat-message normalization.
+ - Providers: native `:responses` dispatch is gated to providers or instances that explicitly support the Responses API, preventing non-Responses providers from receiving `/v1/responses` traffic just because the adapter has a helper method.
+ - Packaging: `event_stream_parser` is now a direct runtime dependency because `legion-llm` requires it for Responses SSE parsing.
+
+ ## [0.9.37] - 2026-05-22
+
+ ### Changed
+ - API: OpenAI Responses requests now dispatch to upstream `/v1/responses` through a native `:responses` provider capability instead of adapting Responses input through Chat Completions `stream_chat`, preserving upstream Responses streaming usage from `response.completed.response.usage`
+
  ## [0.9.36] - 2026-05-22
 
  ### Fixed
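
Reviewer note on the 0.9.41 and 0.9.49 entries: the sketch below shows one way a caller might read a native `/api/llm/inference` payload now that `tool_calls` follow the Chat Completions shape. Only `stop_reason`, `requires_tool_result`, `type: "function"`, `function.name`, and the JSON-string `function.arguments` come from the changelog. The helper name `pending_client_tool_calls`, the top-level `id` key, and the sample values are assumptions made for illustration.

```ruby
require 'json'

# Hypothetical client-side helper: list the tool calls the caller still has
# to execute and submit back, based on the fields named in the changelog.
def pending_client_tool_calls(payload)
  return [] unless payload['requires_tool_result']

  Array(payload['tool_calls']).filter_map do |call|
    next unless call['type'] == 'function'

    fn = call['function'] || {}
    raw = fn['arguments']
    # Arguments are documented as a JSON string; tolerate a hash as well.
    args = raw.is_a?(String) ? JSON.parse(raw) : (raw || {})
    { id: call['id'], name: fn['name'], arguments: args }
  end
end

# Invented sample payload, shaped after the changelog description.
SAMPLE_DONE = {
  'stop_reason' => 'tool_use',
  'requires_tool_result' => true,
  'tool_calls' => [
    { 'id' => 'call_1', 'type' => 'function',
      'function' => { 'name' => 'git', 'arguments' => JSON.generate({ 'cmd' => 'status' }) } }
  ]
}.freeze
```

A wrapper built this way would treat the turn as unfinished whenever the returned list is non-empty, which is the failure mode the 0.9.41 entry describes.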
data/legion-llm.gemspec CHANGED
@@ -26,6 +26,7 @@ Gem::Specification.new do |spec|
  }
 
  spec.add_dependency 'concurrent-ruby'
+ spec.add_dependency 'event_stream_parser', '~> 1'
  spec.add_dependency 'faraday'
  spec.add_dependency 'legion-cache', '>= 1.4.2'
  spec.add_dependency 'legion-json', '>= 1.2.0'
@@ -5,6 +5,7 @@ require 'open3'
5
5
  require 'time'
6
6
  require 'legion/cache/helper'
7
7
  require 'legion/logging/helper'
8
+ require 'legion/llm/api/translators/openai_response'
8
9
  require 'legion/llm/publisher_identity'
9
10
  require 'legion/llm/types'
10
11
 
@@ -302,7 +303,7 @@ module Legion
302
303
  name: tname,
303
304
  description: tdesc,
304
305
  parameters: tschema || {},
305
- source: { type: :client, executable: false }
306
+ source: { type: :client, executable: false, raw_name: tname }
306
307
  )
307
308
  rescue StandardError => e
308
309
  handle_exception(e, level: :warn, handled: true, operation: "llm.api.build_client_tool_class.#{tname}")
@@ -310,16 +311,7 @@ module Legion
310
311
  end
311
312
 
312
313
  define_method(:extract_tool_calls) do |pipeline_response|
313
- tools_data = pipeline_response.tools
314
- return [] unless tools_data.is_a?(Array) && !tools_data.empty?
315
-
316
- tools_data.map do |tc|
317
- {
318
- id: tc.respond_to?(:id) ? tc.id : (tc[:id] || tc['id']),
319
- name: tc.respond_to?(:name) ? tc.name : (tc[:name] || tc['name'] || tc.to_s),
320
- arguments: tc.respond_to?(:arguments) ? tc.arguments : (tc[:arguments] || tc['arguments'] || {})
321
- }
322
- end
314
+ Legion::LLM::API::Translators::OpenAIResponse.build_tool_calls(pipeline_response)
323
315
  end
324
316
 
325
317
  define_method(:extract_text_content) do |content|
@@ -347,7 +339,46 @@ module Legion
347
339
  stream << "event: #{event_name}\ndata: #{Legion::JSON.dump(payload)}\n\n"
348
340
  end
349
341
 
350
- define_method(:emit_response_tool_call_events) do |stream, pipeline_response|
342
+ define_method(:log_native_inference_response) do |request_id:, conversation_id:, stream:, kind:, payload:|
343
+ log.debug(
344
+ "[llm][api][inference] action=response_payload request_id=#{request_id || 'unknown'} " \
345
+ "conversation_id=#{conversation_id || 'none'} stream=#{stream} kind=#{kind} " \
346
+ "payload=#{Legion::JSON.dump(payload)}"
347
+ )
348
+ rescue StandardError => e
349
+ handle_exception(e, level: :debug, handled: true,
350
+ operation: 'llm.api.inference.response_payload_log',
351
+ request_id: request_id)
352
+ end
353
+
354
+ define_method(:returned_client_tool_call_payload) do |tool_call, tool_call_id, tool_name|
355
+ {
356
+ toolCallId: tool_call_id,
357
+ toolName: tool_name,
358
+ args: openai_tool_call_arguments(tool_call),
359
+ clientPassthrough: true,
360
+ requiresToolResult: true,
361
+ status: 'requires_client_execution',
362
+ timestamp: Time.now.utc.iso8601
363
+ }
364
+ end
365
+
366
+ define_method(:openai_tool_call_name) do |tool_call|
367
+ fn = tool_call[:function] || tool_call['function'] || {}
368
+ fn[:name] || fn['name'] || tool_call[:name] || tool_call['name']
369
+ end
370
+
371
+ define_method(:openai_tool_call_arguments) do |tool_call|
372
+ fn = tool_call[:function] || tool_call['function'] || {}
373
+ raw_args = fn[:arguments] || fn['arguments'] || tool_call[:arguments] || tool_call['arguments'] || {}
374
+ return raw_args unless raw_args.is_a?(String)
375
+
376
+ Legion::JSON.parse(raw_args, symbolize_names: true)
377
+ rescue StandardError
378
+ raw_args
379
+ end
380
+
381
+ define_method(:emit_response_tool_call_events) do |_stream, pipeline_response|
351
382
  tool_calls = extract_tool_calls(pipeline_response)
352
383
  return if tool_calls.empty?
353
384
 
@@ -359,7 +390,7 @@ module Legion
359
390
  data[:tool_call_id] || data['tool_call_id']
360
391
  end
361
392
 
362
- emitted = 0
393
+ done_only = 0
363
394
  skipped_timeline = 0
364
395
  request_id = pipeline_response.respond_to?(:request_id) ? pipeline_response.request_id : 'unknown'
365
396
  conversation_id = pipeline_response.respond_to?(:conversation_id) ? pipeline_response.conversation_id : 'none'
@@ -371,28 +402,22 @@ module Legion
371
402
  next
372
403
  end
373
404
 
374
- tool_name = tool_call[:name] || tool_call['name']
405
+ tool_name = openai_tool_call_name(tool_call)
375
406
  next if tool_name.to_s.empty?
376
407
 
377
408
  log.info(
378
- "[llm][api][tools] action=returned_tool_call_sse request_id=#{request_id || 'unknown'} " \
409
+ "[llm][api][tools] action=returned_tool_call_done_only request_id=#{request_id || 'unknown'} " \
379
410
  "conversation_id=#{conversation_id || 'none'} tool_call_id=#{tool_call_id || 'none'} name=#{tool_name} " \
380
- "args_class=#{(tool_call[:arguments] || tool_call['arguments'] || {}).class}"
411
+ "args_class=#{openai_tool_call_arguments(tool_call).class}"
381
412
  )
382
- emit_sse_event(stream, 'tool-call', {
383
- toolCallId: tool_call_id,
384
- toolName: tool_name,
385
- args: tool_call[:arguments] || tool_call['arguments'] || {},
386
- timestamp: Time.now.utc.iso8601
387
- })
388
- emitted += 1
413
+ done_only += 1
389
414
  end
390
415
 
391
- names = tool_calls.map { |tool_call| tool_call[:name] || tool_call['name'] }.compact
416
+ names = tool_calls.map { |tool_call| openai_tool_call_name(tool_call) }.compact
392
417
  names = names.first(30).join(',') + (names.size > 30 ? ",+#{names.size - 30}more" : '')
393
418
  log.info(
394
419
  "[llm][api][tools] action=returned_tool_calls_complete request_id=#{request_id || 'unknown'} " \
395
- "conversation_id=#{conversation_id || 'none'} total=#{tool_calls.size} emitted=#{emitted} " \
420
+ "conversation_id=#{conversation_id || 'none'} total=#{tool_calls.size} done_only=#{done_only} " \
396
421
  "skipped_timeline=#{skipped_timeline} names=#{names.empty? ? 'none' : names}"
397
422
  )
398
423
  end
@@ -28,7 +28,7 @@ module Legion
28
28
  conversation_id = body[:conversation_id]
29
29
  request_id = body[:request_id] || SecureRandom.uuid
30
30
  include_thinking = body[:include_thinking] == true
31
- client_tool_passthrough = body[:client_tool_passthrough] == true
31
+ client_tool_passthrough = body[:client_tool_passthrough] if [true, false].include?(body[:client_tool_passthrough])
32
32
 
33
33
  unless messages.is_a?(Array)
34
34
  halt 400, { 'Content-Type' => 'application/json' },
@@ -105,7 +105,7 @@ module Legion
105
105
  extra = {}
106
106
  extra[:tier] = tier.to_sym if tier
107
107
  metadata = { requested_tools: requested_tools }
108
- metadata[:client_tool_passthrough] = true if client_tool_passthrough
108
+ metadata[:client_tool_passthrough] = client_tool_passthrough unless client_tool_passthrough.nil?
109
109
  metadata[:client_tool_request_count] = tools.size if tools.any?
110
110
 
111
111
  pipeline_request = Legion::LLM::Inference::Request.build(
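
Reviewer note: the two hunks above turn the request flag into a tri-state value. An explicit `true` or `false` is forwarded in the metadata, and anything else is left unset so that the settings default applies. The sketch below is a minimal model of that behavior. `SETTINGS_DEFAULT` is a hypothetical stand-in for `tool_trigger.client_tool_passthrough`, and `effective_passthrough` assumes an explicit request value wins over the default; the actual resolution code is not part of this diff excerpt.

```ruby
# Hypothetical stand-in for the settings default (false as of 0.9.51).
SETTINGS_DEFAULT = false

# Returns true/false only when the caller sent a real boolean. A missing key,
# nil, "true", or 1 all count as "not specified" and yield nil.
def explicit_passthrough(body)
  value = body[:client_tool_passthrough]
  value if [true, false].include?(value)
end

# Assumed resolution order: explicit request value first, then the default.
def effective_passthrough(body, default: SETTINGS_DEFAULT)
  explicit = explicit_passthrough(body)
  explicit.nil? ? default : explicit
end
```

The old `== true` comparison could not distinguish "caller said false" from "caller said nothing", which matters once the default can be overridden in either direction.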
@@ -148,10 +148,12 @@ module Legion
148
148
  timestamp: Time.now.utc.iso8601
149
149
  })
150
150
  when :tool_result
151
- emit_sse_event(out, 'tool-result', {
151
+ event_name = event[:status].to_s == 'error' ? 'tool-error' : 'tool-result'
152
+ emit_sse_event(out, event_name, {
152
153
  toolCallId: event[:tool_call_id],
153
154
  toolName: event[:tool_name],
154
155
  result: event[:result],
156
+ status: event[:status],
155
157
  timestamp: Time.now.utc.iso8601
156
158
  })
157
159
  when :tool_error
@@ -184,20 +186,31 @@ module Legion
184
186
 
185
187
  routing = pipeline_response.routing || {}
186
188
  tokens = pipeline_response.tokens || {}
189
+ tool_calls = extract_tool_calls(pipeline_response)
190
+ stop_reason = pipeline_response.stop&.dig(:reason)&.to_s
187
191
  done_payload = {
188
- request_id: request_id,
189
- content: full_text,
190
- model: (routing[:model] || routing['model']).to_s,
191
- provider: (routing[:provider] || routing['provider'])&.to_s,
192
- instance: (routing[:instance] || routing['instance'])&.to_s,
193
- tier: (routing[:tier] || routing['tier'])&.to_s,
194
- input_tokens: token_value(tokens, :input),
195
- output_tokens: token_value(tokens, :output),
196
- tool_calls: extract_tool_calls(pipeline_response),
197
- conversation_id: pipeline_response.conversation_id,
198
- metrics: build_response_metrics(pipeline_response)
192
+ request_id: request_id,
193
+ content: full_text,
194
+ model: (routing[:model] || routing['model']).to_s,
195
+ provider: (routing[:provider] || routing['provider'])&.to_s,
196
+ instance: (routing[:instance] || routing['instance'])&.to_s,
197
+ tier: (routing[:tier] || routing['tier'])&.to_s,
198
+ input_tokens: token_value(tokens, :input),
199
+ output_tokens: token_value(tokens, :output),
200
+ tool_calls: tool_calls,
201
+ stop_reason: stop_reason,
202
+ requires_tool_result: stop_reason == 'tool_use' && tool_calls.any?,
203
+ conversation_id: pipeline_response.conversation_id,
204
+ metrics: build_response_metrics(pipeline_response)
199
205
  }.compact
200
206
  done_payload[:thinking] = pipeline_response.thinking if include_thinking && pipeline_response.thinking
207
+ log_native_inference_response(
208
+ request_id: request_id,
209
+ conversation_id: pipeline_response.conversation_id || conversation_id,
210
+ stream: true,
211
+ kind: 'sse_done',
212
+ payload: done_payload
213
+ )
201
214
  emit_sse_event(out, 'done', {
202
215
  **done_payload
203
216
  })
@@ -227,6 +240,7 @@ module Legion
227
240
  routing = pipeline_response.routing || {}
228
241
  tokens = pipeline_response.tokens || {}
229
242
  tool_calls = extract_tool_calls(pipeline_response)
243
+ stop_reason = pipeline_response.stop&.dig(:reason)&.to_s
230
244
 
231
245
  log.info(
232
246
  "[llm][api][inference] action=completed request_id=#{request_id} " \
@@ -240,21 +254,29 @@ module Legion
240
254
  )
241
255
 
242
256
  payload = {
243
- request_id: request_id,
244
- content: content,
245
- tool_calls: tool_calls,
246
- stop_reason: pipeline_response.stop&.dig(:reason)&.to_s,
247
- model: (routing[:model] || routing['model']).to_s,
248
- provider: (routing[:provider] || routing['provider'])&.to_s,
249
- instance: (routing[:instance] || routing['instance'])&.to_s,
250
- tier: (routing[:tier] || routing['tier'])&.to_s,
251
- input_tokens: token_value(tokens, :input),
252
- output_tokens: token_value(tokens, :output),
253
- conversation_id: pipeline_response.conversation_id,
254
- metrics: build_response_metrics(pipeline_response)
257
+ request_id: request_id,
258
+ content: content,
259
+ tool_calls: tool_calls,
260
+ stop_reason: stop_reason,
261
+ requires_tool_result: stop_reason == 'tool_use' && tool_calls.any?,
262
+ model: (routing[:model] || routing['model']).to_s,
263
+ provider: (routing[:provider] || routing['provider'])&.to_s,
264
+ instance: (routing[:instance] || routing['instance'])&.to_s,
265
+ tier: (routing[:tier] || routing['tier'])&.to_s,
266
+ input_tokens: token_value(tokens, :input),
267
+ output_tokens: token_value(tokens, :output),
268
+ conversation_id: pipeline_response.conversation_id,
269
+ metrics: build_response_metrics(pipeline_response)
255
270
  }
256
271
  payload[:thinking] = pipeline_response.thinking if include_thinking && pipeline_response.thinking
257
272
  payload.compact!
273
+ log_native_inference_response(
274
+ request_id: request_id,
275
+ conversation_id: pipeline_response.conversation_id || conversation_id,
276
+ stream: false,
277
+ kind: 'json_response',
278
+ payload: { data: payload }
279
+ )
258
280
  json_response(payload, status_code: 200)
259
281
  end
260
282
  rescue Legion::LLM::AuthError => e
@@ -67,8 +67,22 @@ module Legion
67
67
 
68
68
  routing = pipeline_response.routing || {}
69
69
  final_model = (routing[:model] || routing['model'] || model).to_s
70
+ tool_calls = Legion::LLM::API::Translators::OpenAIResponse.build_tool_calls(pipeline_response)
71
+
72
+ tool_calls.each_with_index do |tool_call, index|
73
+ out << "data: #{Legion::JSON.dump(Legion::LLM::API::Translators::OpenAIResponse.format_stream_tool_call_chunk(
74
+ tool_call,
75
+ model: final_model,
76
+ request_id: request_id,
77
+ index: index
78
+ ))}\n\n"
79
+ end
80
+
70
81
  done_chunk = Legion::LLM::API::Translators::OpenAIResponse.format_stream_chunk(
71
- nil, model: final_model, request_id: request_id, finish_reason: 'stop'
82
+ nil,
83
+ model: final_model,
84
+ request_id: request_id,
85
+ finish_reason: tool_calls.empty? ? 'stop' : 'tool_calls'
72
86
  )
73
87
  out << "data: #{Legion::JSON.dump(done_chunk)}\n\n"
74
88
  out << "data: [DONE]\n\n"
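
Reviewer note: with this hunk, `/v1/chat/completions` streams one `delta.tool_calls` chunk per tool call and ends with `finish_reason: "tool_calls"` when any were returned. The sketch below shows how a client might collect them from the `data:` lines. The helper name `collect_stream_tool_calls` is hypothetical. Concatenating `function.arguments` fragments per `index` follows the general Chat Completions streaming convention, so the sketch also tolerates servers that split arguments across chunks, although this hunk sends each call whole.

```ruby
require 'json'

# Hypothetical client-side collector for a Chat Completions SSE stream.
def collect_stream_tool_calls(sse_lines)
  calls = {}
  finish_reason = nil

  sse_lines.each do |line|
    next unless line.start_with?('data: ')

    data = line.delete_prefix('data: ').strip
    break if data == '[DONE]'

    choice = JSON.parse(data).dig('choices', 0) || {}
    finish_reason = choice['finish_reason'] || finish_reason

    Array(choice.dig('delta', 'tool_calls')).each do |tc|
      # Group fragments by their index and append name/argument pieces.
      entry = calls[tc['index']] ||= { 'id' => nil, 'name' => +'', 'arguments' => +'' }
      entry['id'] ||= tc['id']
      entry['name'] << tc.dig('function', 'name').to_s
      entry['arguments'] << tc.dig('function', 'arguments').to_s
    end
  end

  { finish_reason: finish_reason, tool_calls: calls.sort_by(&:first).map(&:last) }
end
```

A caller would then parse each collected `arguments` string as JSON and run the tool only when `finish_reason` is `"tool_calls"`.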
@@ -76,13 +76,13 @@ module Legion
76
76
  'X-Accel-Buffering' => 'no'
77
77
 
78
78
  stream do |out|
79
- Responses.stream_response(out, executor, request_id: request_id, model: model)
79
+ Responses.stream_response(out, executor, request_id: request_id, model: model, upstream_body: body)
80
80
  rescue StandardError => e
81
81
  handle_exception(e, level: :error, handled: false, operation: 'llm.api.openai.responses.stream', request_id: request_id)
82
82
  out << "event: error\ndata: #{Legion::JSON.dump({ type: 'server_error', message: e.message })}\n\n"
83
83
  end
84
84
  else
85
- pipeline_response = executor.call
85
+ pipeline_response = executor.call_responses(body: body, stream: false)
86
86
  response_body = Responses.format_response(pipeline_response, request_id: request_id, model: model)
87
87
 
88
88
  log.info("[llm][api][openai][responses] action=complete request_id=#{request_id} model=#{response_body[:model]}")
@@ -179,7 +179,7 @@ module Legion
179
179
  }
180
180
  end
181
181
 
182
- def self.stream_response(out, executor, request_id:, model:) # rubocop:disable Metrics/MethodLength
182
+ def self.stream_response(out, executor, request_id:, model:, upstream_body: nil) # rubocop:disable Metrics/MethodLength
183
183
  created_at = Time.now.to_i
184
184
  seq = 0
185
185
  in_progress_response = { id: request_id, object: 'response', created_at: created_at,
@@ -218,7 +218,7 @@ module Legion
218
218
 
219
219
  full_text = +''
220
220
 
221
- pipeline_response = executor.call_stream do |chunk|
221
+ pipeline_response = call_streaming_executor(executor, upstream_body: upstream_body) do |chunk|
222
222
  text = chunk.respond_to?(:content) ? chunk.content.to_s : chunk.to_s
223
223
  next if text.empty?
224
224
 
@@ -237,6 +237,7 @@ module Legion
237
237
  tokens = pipeline_response.tokens || {}
238
238
  resolved_model = (routing[:model] || routing['model'] || model).to_s
239
239
  usage = build_usage(tokens)
240
+ function_calls = build_output_tool_calls(pipeline_response)
240
241
 
241
242
  out << sse_event('response.output_text.done', {
242
243
  type: 'response.output_text.done',
@@ -265,6 +266,41 @@ module Legion
265
266
  item: completed_item
266
267
  })
267
268
 
269
+ function_calls.each_with_index do |function_call, index|
270
+ output_index = index + 1
271
+ in_progress_item = function_call.merge(status: 'in_progress', arguments: '')
272
+
273
+ out << sse_event('response.output_item.added', {
274
+ type: 'response.output_item.added',
275
+ sequence_number: seq += 1,
276
+ output_index: output_index,
277
+ item: in_progress_item
278
+ })
279
+
280
+ out << sse_event('response.function_call_arguments.delta', {
281
+ type: 'response.function_call_arguments.delta',
282
+ sequence_number: seq += 1,
283
+ output_index: output_index,
284
+ item_id: function_call[:id],
285
+ delta: function_call[:arguments]
286
+ })
287
+
288
+ out << sse_event('response.function_call_arguments.done', {
289
+ type: 'response.function_call_arguments.done',
290
+ sequence_number: seq += 1,
291
+ output_index: output_index,
292
+ item_id: function_call[:id],
293
+ arguments: function_call[:arguments]
294
+ })
295
+
296
+ out << sse_event('response.output_item.done', {
297
+ type: 'response.output_item.done',
298
+ sequence_number: seq += 1,
299
+ output_index: output_index,
300
+ item: function_call
301
+ })
302
+ end
303
+
268
304
  out << sse_event('response.completed', {
269
305
  type: 'response.completed',
270
306
  sequence_number: seq + 1,
@@ -274,7 +310,7 @@ module Legion
274
310
  created_at: created_at,
275
311
  status: 'completed',
276
312
  model: resolved_model,
277
- output: [completed_item],
313
+ output: [completed_item, *function_calls],
278
314
  usage: usage
279
315
  }
280
316
  })
@@ -282,6 +318,14 @@ module Legion
282
318
  log.info("[llm][api][openai][responses] action=stream_complete request_id=#{request_id} model=#{resolved_model}")
283
319
  end
284
320
 
321
+ def self.call_streaming_executor(executor, upstream_body: nil, &)
322
+ if upstream_body && executor.respond_to?(:call_responses)
323
+ executor.call_responses(body: upstream_body, stream: true, &)
324
+ else
325
+ executor.call_stream(&)
326
+ end
327
+ end
328
+
285
329
  def self.sse_event(name, payload)
286
330
  "event: #{name}\ndata: #{Legion::JSON.dump(payload)}\n\n"
287
331
  end
@@ -70,6 +70,51 @@ module Legion
70
70
  }
71
71
  end
72
72
 
73
+ def format_stream_tool_call_chunk(tool_call, model:, request_id:, index:)
74
+ fn = tool_call.is_a?(Hash) ? (tool_call[:function] || tool_call['function'] || {}) : {}
75
+ name = tool_call.respond_to?(:name) ? tool_call.name : (tool_call[:name] || tool_call['name'] || fn[:name] || fn['name'])
76
+ args = if tool_call.respond_to?(:arguments)
77
+ tool_call.arguments
78
+ else
79
+ tool_call[:arguments] || tool_call['arguments'] || fn[:arguments] || fn['arguments'] || {}
80
+ end
81
+ tc_id = tool_call.respond_to?(:id) ? tool_call.id : (tool_call[:id] || tool_call['id'] || "call_#{SecureRandom.hex(8)}")
82
+
83
+ format_stream_delta_chunk(
84
+ {
85
+ tool_calls: [
86
+ {
87
+ index: index,
88
+ id: tc_id,
89
+ type: 'function',
90
+ function: {
91
+ name: name.to_s,
92
+ arguments: args.is_a?(String) ? args : Legion::JSON.dump(args)
93
+ }
94
+ }
95
+ ]
96
+ },
97
+ model: model,
98
+ request_id: request_id
99
+ )
100
+ end
101
+
102
+ def format_stream_delta_chunk(delta, model:, request_id:, finish_reason: nil)
103
+ {
104
+ id: "chatcmpl-#{request_id.delete('-')}",
105
+ object: 'chat.completion.chunk',
106
+ created: Time.now.to_i,
107
+ model: model.to_s,
108
+ choices: [
109
+ {
110
+ index: 0,
111
+ delta: delta,
112
+ finish_reason: finish_reason
113
+ }
114
+ ]
115
+ }
116
+ end
117
+
73
118
  def format_embeddings(vector, model:, input_text:, usage: nil)
74
119
  tokens = embedding_token_count(usage, input_text)
75
120
 
@@ -168,6 +168,7 @@ module Legion
168
168
  CAPABILITY_METHODS = {
169
169
  chat: :chat,
170
170
  stream: :stream,
171
+ responses: :responses,
171
172
  embed: :embed,
172
173
  image: :image,
173
174
  count_tokens: :count_tokens
@@ -189,6 +190,9 @@ module Legion
189
190
  raise Legion::LLM::ProviderError, "unsupported capability: #{capability}" unless method_name
190
191
 
191
192
  ext = fetch_extension!(provider, instance: instance)
193
+ if ext.respond_to?(:supports?) && !ext.supports?(cap_sym)
194
+ raise Legion::LLM::ProviderError, "unsupported capability #{capability} for provider #{provider}"
195
+ end
192
196
 
193
197
  log.info("[llm][dispatch] capability=#{cap_sym} provider=#{provider} " \
194
198
  "instance=#{instance || 'default'} model=#{model}")
@@ -1,5 +1,6 @@
1
1
  # frozen_string_literal: true
2
2
 
3
+ require 'event_stream_parser'
3
4
  require 'legion/logging/helper'
4
5
 
5
6
  module Legion
@@ -10,11 +11,13 @@ module Legion
10
11
  include Legion::Logging::Helper
11
12
 
12
13
  METADATA_KEYS = %i[tier capabilities enabled].freeze
14
+ RESPONSES_PROVIDER_FAMILIES = %i[openai vllm].freeze
13
15
 
14
16
  def initialize(provider_name, provider_class, instance_config: {})
15
17
  @provider_name = provider_name.to_sym
16
18
  @provider_class = provider_class
17
19
  @instance_config = instance_config
20
+ @capabilities = Array(instance_config[:capabilities] || instance_config['capabilities']).map(&:to_sym)
18
21
  @lex_llm_namespace = resolve_lex_llm_namespace
19
22
  end
20
23
 
@@ -58,6 +61,32 @@ module Legion
58
61
  end
59
62
  end
60
63
 
64
+ def responses(model:, body:, messages:, stream: false, **opts, &)
65
+ raise Legion::LLM::ProviderError, "Responses API dispatch is not supported for #{provider_name}" unless supports?(:responses)
66
+
67
+ payload = build_responses_payload(
68
+ body: body,
69
+ model: model,
70
+ messages: messages,
71
+ stream: stream,
72
+ system: opts[:system],
73
+ tools: opts[:tools]
74
+ )
75
+
76
+ if stream
77
+ stream_responses_payload(payload, offering_metadata: opts[:offering_metadata], &)
78
+ else
79
+ response = provider.connection.post(responses_url, payload)
80
+ responses_hash_response(response.body, offering_metadata: opts[:offering_metadata])
81
+ end
82
+ end
83
+
84
+ def supports?(capability)
85
+ return true unless capability.to_sym == :responses
86
+
87
+ @capabilities.include?(:responses) || RESPONSES_PROVIDER_FAMILIES.include?(provider_name)
88
+ end
89
+
61
90
  def embed(model:, text:, dimensions: nil, **opts)
62
91
  model_info = model_info(model, offering_metadata: opts[:offering_metadata])
63
92
  response = provider.embed(
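
Reviewer note: the `supports?` method added in this hunk, together with the dispatcher check added earlier in the diff, gates `:responses` traffic by provider family or by an explicit instance capability. The sketch below is a trimmed-down model of that gate. `SketchAdapter`, `dispatch!`, the local `ProviderError` class, and the `:other` provider name are hypothetical; the family list and the `supports?` logic mirror the added lines.

```ruby
# Local stand-in for Legion::LLM::ProviderError.
class ProviderError < StandardError; end

# Hypothetical minimal adapter: only the :responses capability is gated.
class SketchAdapter
  RESPONSES_PROVIDER_FAMILIES = %i[openai vllm].freeze

  attr_reader :provider_name

  def initialize(provider_name, instance_config: {})
    @provider_name = provider_name.to_sym
    @capabilities = Array(instance_config[:capabilities] || instance_config['capabilities']).map(&:to_sym)
  end

  def supports?(capability)
    return true unless capability.to_sym == :responses

    @capabilities.include?(:responses) || RESPONSES_PROVIDER_FAMILIES.include?(provider_name)
  end
end

# Mirrors the dispatcher guard: refuse before any upstream call is made.
def dispatch!(adapter, capability)
  if adapter.respond_to?(:supports?) && !adapter.supports?(capability)
    raise ProviderError, "unsupported capability #{capability} for provider #{adapter.provider_name}"
  end

  :dispatched
end
```

Under these assumptions, an instance outside the two listed families can still opt in by declaring `capabilities: ['responses']` in its instance configuration.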
@@ -136,6 +165,236 @@ module Legion
136
165
  end
137
166
  end
138
167
 
168
+ def responses_url = '/v1/responses'
169
+
170
+ def build_responses_payload(body:, model:, messages:, stream:, system: nil, tools: nil)
171
+ payload = normalize_hash(body).dup
172
+ payload[:model] = model
173
+ payload[:stream] = stream
174
+ payload[:input] = responses_payload_input(payload, messages)
175
+
176
+ system_content = normalize_response_system(system)
177
+ payload[:instructions] = system_content if present_system?(system_content)
178
+
179
+ formatted_tools = responses_tools(tools)
180
+ payload[:tools] = formatted_tools if formatted_tools.any?
181
+
182
+ deep_compact(payload)
183
+ end
184
+
185
+ def responses_input(messages)
186
+ Array(messages).map do |message|
187
+ normalized = normalize_hash(message)
188
+ if normalized[:role].to_s == 'tool'
189
+ next({
190
+ type: 'function_call_output',
191
+ call_id: normalized[:tool_call_id].to_s,
192
+ output: normalize_message_content(normalized[:content]).to_s
193
+ })
194
+ end
195
+
196
+ {
197
+ role: normalized[:role]&.to_s || 'user',
198
+ content: responses_message_content(normalized[:content]),
199
+ tool_call_id: normalized[:tool_call_id]
200
+ }.compact
201
+ end
202
+ end
203
+
204
+ def responses_payload_input(payload, messages)
205
+ return payload[:input] if payload.key?(:input)
206
+ return payload['input'] if payload.key?('input')
207
+
208
+ responses_input(messages)
209
+ end
210
+
211
+ def responses_message_content(content)
212
+ return content if content.nil? || content.is_a?(String)
213
+
214
+ if content.is_a?(Array)
215
+ parts = content.filter_map { |part| responses_content_part(part) }
216
+ return parts unless parts.empty?
217
+ end
218
+
219
+ text_part_content(content) || content.to_s
220
+ end
221
+
222
+ def responses_content_part(part)
223
+ return { type: 'input_text', text: part } if part.is_a?(String)
224
+ return part unless part.respond_to?(:transform_keys)
225
+
226
+ normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
227
+ type = normalized[:type].to_s
228
+ return { type: type, text: normalized[:text].to_s } if %w[input_text output_text text].include?(type)
229
+
230
+ part
231
+ end
232
+
233
+ def normalize_response_system(system)
234
+ return nil if system.nil?
235
+ return system[:content] || system['content'] if system.is_a?(Hash)
236
+
237
+ system.to_s
238
+ end
239
+
240
+ def responses_tools(tools)
241
+ normalize_tools(tools).values.map do |tool|
242
+ {
243
+ type: 'function',
244
+ name: tool.name.to_s,
245
+ description: tool.description.to_s,
246
+ parameters: tool.params_schema || { type: 'object', properties: {} }
247
+ }
248
+ end
249
+ end
250
+
251
+ def deep_compact(value)
252
+ case value
253
+ when Hash
254
+ value.each_with_object({}) do |(key, hash_value), compacted|
255
+ compact_value = deep_compact(hash_value)
256
+ compacted[key] = compact_value unless compact_value.nil?
257
+ end
258
+ when Array
259
+ value.map { |entry| deep_compact(entry) }.compact
260
+ else
261
+ value
262
+ end
263
+ end
264
+
265
+ def stream_responses_payload(payload, offering_metadata: nil, &block)
266
+ accumulator = build_responses_stream_accumulator
267
+ parser = EventStreamParser::Parser.new
268
+
269
+ response = provider.connection.post(responses_url, payload) do |req|
270
+ req.headers['Accept'] = 'text/event-stream'
271
+ attach_responses_stream_handler(req, parser, accumulator, block)
272
+ end
273
+
274
+ responses_stream_response(accumulator, response.body, offering_metadata: offering_metadata)
275
+ end
276
+
277
+ def build_responses_stream_accumulator
278
+ {
279
+ content: +'',
280
+ model: nil,
281
+ usage: {},
282
+ completed: nil,
283
+ raw: nil
284
+ }
285
+ end
286
+
287
+ def attach_responses_stream_handler(req, parser, accumulator, block)
288
+ handler = proc do |chunk, *_args|
289
+ parser.feed(chunk) do |_event, data|
290
+ handle_responses_stream_data(data, accumulator, block)
291
+ end
292
+ end
293
+
294
+ if req.options.respond_to?(:on_data=)
295
+ req.options.on_data = handler
296
+ else
297
+ req.options[:on_data] = handler
298
+ end
299
+ end
300
+
301
+ def handle_responses_stream_data(data, accumulator, block)
302
+ return if data == '[DONE]'
303
+
304
+ parsed = Legion::JSON.parse(data, symbolize_names: false)
305
+ return unless parsed.is_a?(Hash)
306
+
307
+ accumulator[:raw] = parsed
308
+ case parsed['type']
309
+ when 'response.output_text.delta'
310
+ accumulate_responses_text_delta(parsed, accumulator, block)
311
+ when 'response.completed'
312
+ response = parsed['response'] || {}
313
+ accumulator[:completed] = response
314
+ accumulator[:model] = response['model'] if response['model']
315
+ accumulator[:usage] = responses_usage(response['usage'])
316
+ end
317
+ end
318
+
319
+ def accumulate_responses_text_delta(parsed, accumulator, block)
320
+ delta = parsed['delta'].to_s
321
+ return if delta.empty?
322
+
323
+ accumulator[:content] << delta
324
+ block&.call(
325
+ lex_llm_namespace::Chunk.new(
326
+ role: :assistant,
327
+ content: delta,
328
+ model_id: parsed['model'],
329
+ raw: parsed,
330
+ tokens: nil
331
+ )
332
+ )
333
+ end
334
+
335
+ def responses_stream_response(accumulator, response_body, offering_metadata: nil)
336
+ completed = accumulator[:completed] || {}
337
+ content = accumulator[:content]
338
+ content = extract_responses_text(completed) if content.empty?
339
+
340
+ {
341
+ result: content,
342
+ model: accumulator[:model] || completed['model'],
343
+ usage: accumulator[:usage],
344
+ metadata: response_metadata(completed.empty? ? response_body : completed, offering_metadata: offering_metadata)
345
+ }.compact
346
+ end
347
+
348
+ def responses_hash_response(body, offering_metadata: nil)
349
+ normalized = normalize_string_hash(body)
350
+ {
351
+ result: extract_responses_text(normalized),
352
+ model: normalized['model'],
353
+ usage: responses_usage(normalized['usage']),
354
+ metadata: response_metadata(normalized, offering_metadata: offering_metadata)
355
+ }.compact
356
+ end
357
+
358
+ def normalize_string_hash(value)
359
+ return value.map { |entry| normalize_string_hash(entry) } if value.is_a?(Array)
360
+ return {} unless value.respond_to?(:each_pair)
361
+
362
+ value.each_with_object({}) do |(key, hash_value), normalized|
363
+ normalized[key.to_s] = normalize_string_hash_value(hash_value)
364
+ end
365
+ end
366
+
367
+ def normalize_string_hash_value(value)
368
+ return normalize_string_hash(value) if value.respond_to?(:each_pair)
369
+ return value.map { |entry| normalize_string_hash_value(entry) } if value.is_a?(Array)
370
+
371
+ value
372
+ end
373
+
374
+ def extract_responses_text(body)
375
+ return body['output_text'].to_s if body['output_text']
376
+
377
+ Array(body['output']).flat_map do |item|
378
+ Array(item['content']).filter_map do |content|
379
+ next unless %w[output_text text].include?(content['type'].to_s)
380
+
381
+ content['text']
382
+ end
383
+ end.join
384
+ end
385
+
386
+ def responses_usage(usage)
387
+ usage = normalize_string_hash(usage)
388
+ input = usage['input_tokens'] || usage['prompt_tokens']
389
+ output = usage['output_tokens'] || usage['completion_tokens']
390
+ {
391
+ input_tokens: input.to_i,
392
+ output_tokens: output.to_i,
393
+ cache_read_tokens: usage.dig('input_tokens_details', 'cached_tokens').to_i,
394
+ cache_write_tokens: usage.dig('input_tokens_details', 'cache_creation_tokens').to_i
395
+ }
396
+ end
397
+
139
398
  def model_info(model, offering_metadata: nil)
140
399
  offering = normalize_offering_metadata(offering_metadata)
141
400
  lex_llm_namespace::Model::Info.new(
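Reviewer note on the Responses helpers added in the hunk above: `deep_compact` and `extract_responses_text` are pure functions, so their behavior can be checked in isolation. The sketch below copies the two method bodies from the diff into top-level methods. The sample payloads are made up for illustration and are not taken from the gem's specs.

```ruby
# Standalone copies of two helpers from the hunk above. The method bodies
# follow the diff; the sample inputs further down are illustrative only.

# Recursively drops nil values from hashes and arrays. Note that `false`
# and empty containers are kept, since only nil is filtered.
def deep_compact(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, hash_value), compacted|
      compact_value = deep_compact(hash_value)
      compacted[key] = compact_value unless compact_value.nil?
    end
  when Array
    value.map { |entry| deep_compact(entry) }.compact
  else
    value
  end
end

# Prefers the top-level 'output_text' field; otherwise concatenates every
# content part typed 'output_text' or 'text' across all output items.
def extract_responses_text(body)
  return body['output_text'].to_s if body['output_text']

  Array(body['output']).flat_map do |item|
    Array(item['content']).filter_map do |content|
      next unless %w[output_text text].include?(content['type'].to_s)

      content['text']
    end
  end.join
end

# Hypothetical payload shaped like a Responses API body with string keys.
sample_body = {
  'output' => [
    { 'type' => 'reasoning', 'content' => nil },
    { 'type' => 'message',
      'content' => [
        { 'type' => 'output_text', 'text' => 'Hello, ' },
        { 'type' => 'refusal', 'refusal' => 'ignored' },
        { 'type' => 'text', 'text' => 'world' }
      ] }
  ]
}

p deep_compact({ a: nil, b: { c: nil, d: 1 }, e: [1, nil, { f: nil }] })
p extract_responses_text(sample_body)
```

One thing worth noticing is that `deep_compact` leaves an empty hash behind when all of a nested hash's values are nil, so a payload built from optional fields may still carry `{}` entries.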
@@ -243,7 +502,7 @@ module Legion
243
502
 
244
503
  if part.respond_to?(:transform_keys)
245
504
  normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
246
- return unless normalized[:type].to_s == 'text'
505
+ return unless %w[input_text output_text text].include?(normalized[:type].to_s)
247
506
 
248
507
  return normalized[:text].to_s
249
508
  end
@@ -104,7 +104,7 @@ module Legion
104
104
  end
105
105
 
106
106
  def adapter_instance_config(config, instance_id)
107
- config.except(:tier, :capabilities).tap do |registry_config|
107
+ config.except(:tier).tap do |registry_config|
108
108
  registry_config[:instance_id] ||= instance_id
109
109
  end
110
110
  end
@@ -69,7 +69,7 @@ module Legion
69
69
  ].freeze
70
70
 
71
71
  MAX_NATIVE_TOOL_ROUNDS = 200
72
- ToolResultEvent = Struct.new(:result, :tool_call_id, :tool_name, :started_at, keyword_init: true)
72
+ ToolResultEvent = Struct.new(:result, :tool_call_id, :tool_name, :started_at, :status, keyword_init: true)
73
73
 
74
74
  ASYNC_THREAD_POOL = Concurrent::FixedThreadPool.new(4, fallback_policy: :caller_runs)
75
75
 
@@ -124,6 +124,14 @@ module Legion
124
124
  build_response
125
125
  end
126
126
 
127
+ def call_responses(body:, stream: false, &)
128
+ log.debug "[llm][executor] action=call_responses request_id=#{@request.id} profile=#{@profile} stream=#{stream}"
129
+ execute_pre_provider_steps
130
+ execute_provider_request_responses(body: body, stream: stream, &)
131
+ execute_post_provider_steps
132
+ build_response
133
+ end
134
+
127
135
  private
128
136
 
129
137
  def llm_setting(key, default = nil)
@@ -904,12 +912,29 @@ module Legion
904
912
  offering_id: @resolved_offering_id,
905
913
  offering_metadata: @resolved_offering_metadata
906
914
  }
915
+ options[:system] = native_tool_loop_system(options[:system])
907
916
  options[:tools] = native_dispatch_tools if native_dispatch_tools.any?
908
917
  options[:tool_prefs] = native_tool_prefs if native_dispatch_tools.any? && native_tool_prefs
909
918
  options[:thinking] = native_dispatch_thinking if native_dispatch_thinking
910
919
  options.compact
911
920
  end
912
921
 
922
+ def native_tool_loop_system(system)
923
+ return system unless @native_tool_loop_round.to_i.positive? && native_dispatch_tools.any?
924
+
925
+ [system, native_tool_loop_continuation_prompt].compact.join("\n\n")
926
+ end
927
+
928
+ def native_tool_loop_continuation_prompt
929
+ <<~PROMPT.strip
930
+ Tool-use continuation rule:
931
+ - You just received tool results.
932
+ - If a tool failed or produced incomplete information and another available tool can continue the user's request, call that tool now.
933
+ - Do not say you will use a tool unless you are actually making the tool call in this response.
934
+ - Only provide a final answer when no further tool call is needed or possible.
935
+ PROMPT
936
+ end
937
+
913
938
  def native_dispatch_chat_options
914
939
  opts = { model: @resolved_model, provider: @resolved_provider }
915
940
  opts[:instance] = @resolved_instance if @resolved_instance
@@ -951,7 +976,42 @@ module Legion
951
976
  return value if [true, false].include?(value)
952
977
  end
953
978
 
954
- Legion::LLM::Settings.value(:tool_trigger, :client_tool_passthrough) != false
979
+ Legion::LLM::Settings.value(:tool_trigger, :client_tool_passthrough) == true
980
+ end
981
+
982
+ def client_tool_passthrough_allowed?(definition)
983
+ names = client_tool_passthrough_name_variants(definition)
984
+ whitelist = client_tool_passthrough_list(:client_tool_passthrough_whitelist)
985
+ blacklist = client_tool_passthrough_list(:client_tool_passthrough_blacklist)
986
+
987
+ return false if whitelist.any? && !names.intersect?(whitelist)
988
+ return false if names.intersect?(blacklist)
989
+
990
+ true
991
+ end
992
+
993
+ def client_tool_passthrough_list(key)
994
+ defaults = {
995
+ client_tool_passthrough_whitelist: Legion::LLM::Settings::CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT,
996
+ client_tool_passthrough_blacklist: Legion::LLM::Settings::CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT
997
+ }
998
+ Array(Legion::LLM::Settings.value(:tool_trigger, key, default: defaults.fetch(key))).flat_map do |entry|
999
+ client_tool_policy_variants(entry)
1000
+ end.uniq
1001
+ end
1002
+
1003
+ def client_tool_passthrough_name_variants(definition)
1004
+ source = definition.respond_to?(:source) ? definition.source : {}
1005
+ raw_name = source[:raw_name] || source['raw_name'] if source.is_a?(Hash)
1006
+ [definition.name, raw_name].compact.flat_map { |name| client_tool_policy_variants(name) }.uniq
1007
+ end
1008
+
1009
+ def client_tool_policy_variants(value)
1010
+ raw = value.to_s.strip.downcase
1011
+ sanitized = Types::ToolDefinition.sanitize_tool_name(value).downcase
1012
+ compact = raw.gsub(/[^a-z0-9]/, '')
1013
+
1014
+ [raw, sanitized, compact].reject(&:empty?).uniq
955
1015
  end
956
1016
 
957
1017
  def non_executable_client_tool?(definition)
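Reviewer note on the passthrough policy added in the hunk above: a tool name and every list entry are expanded into lowercase variants (raw, sanitized, alphanumeric-only), and the whitelist and blacklist are matched on any shared variant. The sketch below reproduces that logic as standalone methods. `sanitize_tool_name` here is a hypothetical stand-in, because the real `Types::ToolDefinition.sanitize_tool_name` is not shown in this diff, and `passthrough_allowed?` takes the lists as arguments instead of reading them from settings. It uses `&` rather than `Array#intersect?` so it also runs on Ruby versions older than 3.1.

```ruby
# Assumption: stand-in for Types::ToolDefinition.sanitize_tool_name. The real
# method may sanitize differently; this one replaces anything outside
# letters, digits, underscore and hyphen with an underscore.
def sanitize_tool_name(value)
  value.to_s.strip.gsub(/[^A-Za-z0-9_-]/, '_')
end

# Mirrors client_tool_policy_variants from the diff: raw lowercase name,
# sanitized lowercase name, and a form with only a-z and 0-9.
def policy_variants(value)
  raw = value.to_s.strip.downcase
  sanitized = sanitize_tool_name(value).downcase
  compact = raw.gsub(/[^a-z0-9]/, '')

  [raw, sanitized, compact].reject(&:empty?).uniq
end

# Mirrors client_tool_passthrough_allowed?: a non-empty whitelist must match,
# and a blacklist match always rejects.
def passthrough_allowed?(name, whitelist: [], blacklist: [])
  names = policy_variants(name)
  allow = whitelist.flat_map { |entry| policy_variants(entry) }.uniq
  deny = blacklist.flat_map { |entry| policy_variants(entry) }.uniq

  return false if allow.any? && (names & allow).empty?
  return false if (names & deny).any?

  true
end

p passthrough_allowed?('legionio-do', blacklist: ['legionio do'])
p passthrough_allowed?('read_file', blacklist: ['sudo'])
```

Because of the alphanumeric-only variant, a blacklist entry such as `legionio do` also rejects `legionio-do` and `LegionIO_Do`. This is broader than exact-name matching, which appears to be the intent for a deny list.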
@@ -989,8 +1049,16 @@ module Legion
989
1049
  )
990
1050
  return
991
1051
  end
1052
+ if non_executable_client_tool?(definition) && !client_tool_passthrough_allowed?(definition)
1053
+ log.info(
1054
+ "[llm][tools][inject] action=client_tool_skipped request_id=#{request_log_value(:id, 'unknown')} " \
1055
+ "conversation_id=#{request_log_value(:conversation_id, 'none') || 'none'} name=#{definition.name} " \
1056
+ 'reason=client_passthrough_policy'
1057
+ )
1058
+ return
1059
+ end
992
1060
  return if gaia_tool_suppressed?(definition.name)
993
- return if definitions.any? { |existing| existing.name == definition.name }
1061
+ return if native_tool_definition_duplicate?(definitions, definition)
994
1062
 
995
1063
  @injected_tool_map[definition.name] = definition.source[:tool_class] if definition.source[:tool_class]
996
1064
  @native_tool_source_map[definition.name] = definition.source
@@ -1015,6 +1083,24 @@ module Legion
1015
1083
  handle_exception(e, level: :error, operation: 'llm.pipeline.native_registry_tools')
1016
1084
  end
1017
1085
 
1086
+ def native_tool_definition_duplicate?(definitions, definition)
1087
+ candidate_names = native_tool_definition_name_variants(definition)
1088
+ definitions.any? do |existing|
1089
+ native_tool_definition_name_variants(existing).intersect?(candidate_names)
1090
+ end
1091
+ end
1092
+
1093
+ def native_tool_definition_name_variants(definition)
1094
+ variants = client_tool_passthrough_name_variants(definition)
1095
+ source = definition.respond_to?(:source) ? definition.source : {}
1096
+ source_type = nil
1097
+ source_type = source[:type] || source['type'] if source.is_a?(Hash)
1098
+ if source_type.respond_to?(:to_sym) && source_type.to_sym == :special
1099
+ variants += Tools::Special.aliases_for(definition.name).flat_map { |name| client_tool_policy_variants(name) }
1100
+ end
1101
+ variants.uniq
1102
+ end
1103
+
1018
1104
  def add_settings_extensions_tool_definitions(definitions)
1019
1105
  existing_names = definitions.map(&:name)
1020
1106
  inject_limit = registry_tool_limit
@@ -1097,7 +1183,8 @@ module Legion
1097
1183
  result: native_tool_result_content(result),
1098
1184
  tool_call_id: normalized_call[:id],
1099
1185
  tool_name: normalized_call[:name],
1100
- started_at: Thread.current[:legion_current_tool_started_at]
1186
+ started_at: Thread.current[:legion_current_tool_started_at],
1187
+ status: result[:status] || result['status']
1101
1188
  )
1102
1189
  )
1103
1190
  result
@@ -1339,6 +1426,30 @@ module Legion
1339
1426
  @raw_response = Call::NativeResponseAdapter.new(result)
1340
1427
  end
1341
1428
 
1429
+ def execute_provider_request_responses(body:, stream:, &block)
1430
+ @timestamps[:provider_start] = Time.now
1431
+ @timeline.record(
1432
+ category: :provider, key: 'provider:request_sent',
1433
+ exchange_id: @exchange_id, direction: :outbound,
1434
+ detail: "responses from #{@resolved_provider}",
1435
+ from: 'pipeline', to: "provider:#{@resolved_provider}"
1436
+ )
1437
+
1438
+ raise Legion::LLM::ProviderError, "Native provider not registered: #{@resolved_provider}" unless use_native_dispatch?(@resolved_provider)
1439
+
1440
+ result = dispatch_responses_request(
1441
+ body: body,
1442
+ messages: native_dispatch_messages,
1443
+ stream: stream,
1444
+ stream_block: block
1445
+ )
1446
+ merge_response_offering_metadata(result[:metadata])
1447
+ @raw_response = Call::NativeResponseAdapter.new(result)
1448
+
1449
+ @timestamps[:provider_end] = Time.now
1450
+ record_provider_response
1451
+ end
1452
+
1342
1453
  def normalize_message_content(content)
1343
1454
  return content if content.nil? || content.is_a?(String)
1344
1455
  return content unless content.is_a?(Array)
@@ -1396,12 +1507,13 @@ module Legion
1396
1507
  started_at = tool_result.respond_to?(:started_at) ? tool_result.started_at : Thread.current[:legion_current_tool_started_at]
1397
1508
  finished_at = Time.now
1398
1509
  raw = tool_result.respond_to?(:result) ? tool_result.result : tool_result
1510
+ status = tool_result.respond_to?(:status) ? tool_result.status : nil
1399
1511
  duration_ms = started_at ? ((finished_at - started_at) * 1000).round : nil
1400
1512
 
1401
1513
  result_str = (raw.is_a?(String) ? raw : raw.to_s)
1402
1514
  result_str = result_str.encode('UTF-8', invalid: :replace, undef: :replace, replace: '�') unless result_str.valid_encoding?
1403
1515
  result_str = result_str.delete("\x00")
1404
- is_error = raw.is_a?(Hash) && (raw[:error] || raw['error']) ? true : false
1516
+ is_error = status.to_s == 'error' || (raw.is_a?(Hash) && (raw[:error] || raw['error']) ? true : false)
1405
1517
 
1406
1518
  @pending_tool_history_mutex.synchronize do
1407
1519
  entry = @pending_tool_history.find { |e| e[:tool_call_id] == tc_id && e[:result].nil? }
@@ -1425,7 +1537,7 @@ module Legion
1425
1537
 
1426
1538
  @tool_event_handler&.call(
1427
1539
  type: :tool_result, tool_call_id: tc_id, tool_name: tc_name,
1428
- result: result_str[0, 4096], result_size: result_str.bytesize,
1540
+ result: result_str[0, 4096], result_size: result_str.bytesize, status: is_error ? :error : :success,
1429
1541
  started_at: started_at, finished_at: finished_at, duration_ms: duration_ms
1430
1542
  )
1431
1543
 
@@ -2016,16 +2128,22 @@ module Legion
2016
2128
  end
2017
2129
 
2018
2130
  def response_tool_calls
2019
- # Prefer typed ToolCall objects from pending history (already built during execution)
2131
+ raw_tool_calls = @raw_response.respond_to?(:tool_calls) ? @raw_response.tool_calls : nil
2132
+ return build_response_tool_calls(raw_tool_calls) if raw_tool_calls&.any?
2133
+
2134
+ # Fall back to typed ToolCall objects from pending history when the final
2135
+ # model response completed after server-side tool execution.
2020
2136
  typed_from_history = @pending_tool_history
2021
2137
  .filter_map { |entry| entry[:typed_call] }
2022
2138
  return typed_from_history if typed_from_history.any?
2023
2139
 
2024
- return [] unless @raw_response.respond_to?(:tool_calls) && @raw_response.tool_calls
2140
+ []
2141
+ end
2025
2142
 
2143
+ def build_response_tool_calls(tool_calls)
2026
2144
  tool_timeline = build_tool_timeline_index
2027
2145
 
2028
- Array(@raw_response.tool_calls).map do |tool_call|
2146
+ Array(tool_calls).map do |tool_call|
2029
2147
  tc_id = tool_call[:id] || tool_call['id']
2030
2148
  tc_name = tool_call[:name] || tool_call['name']
2031
2149
  tc_args = tool_call[:arguments] || tool_call['arguments'] || {}
@@ -114,11 +114,27 @@ module Legion
114
114
  text = latest_user_text.to_s.downcase
115
115
  return if text.empty?
116
116
 
117
- native_dispatch_tools.keys.map(&:to_s).find do |tool_name|
118
- text.include?(tool_name.downcase)
117
+ native_dispatch_tools.keys.map(&:to_s).sort_by { |tool_name| -tool_name.length }.find do |tool_name|
118
+ explicit_tool_name_mentioned?(text, tool_name)
119
119
  end
120
120
  end
121
121
 
122
+ def explicit_tool_name_mentioned?(text, tool_name)
123
+ explicit_tool_name_candidates(tool_name).any? do |candidate|
124
+ text.match?(/(?<![[:alnum:]_-])#{Regexp.escape(candidate)}(?![[:alnum:]_-])/)
125
+ end
126
+ end
127
+
128
+ def explicit_tool_name_candidates(tool_name)
129
+ normalized_name = tool_name.to_s.downcase
130
+ [
131
+ normalized_name,
132
+ normalized_name.tr('_-', ' '),
133
+ normalized_name.tr('_', '-'),
134
+ normalized_name.tr('-', '_')
135
+ ].reject(&:empty?).uniq
136
+ end
137
+
122
138
  def latest_user_text
123
139
  message = Array(@request.messages).reverse.find do |msg|
124
140
  msg.is_a?(Hash) && (msg[:role] || msg['role']).to_s == 'user'
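Reviewer note on the matching change in the hunk above: the old code used a plain substring check, so a tool named `grep` would match the word "grepping" and `read_file` would match inside `read_file_lines`. The new code sorts names longest first and requires that the match is not bordered by a letter, digit, underscore, or hyphen. The sketch below copies the two helper bodies from the diff. `first_mentioned_tool` is a hypothetical wrapper that takes the tool names as an argument, where the real method reads `native_dispatch_tools`.

```ruby
# Copied from the diff: spelling variants of a tool name that a user might
# type (underscores, hyphens, or spaces as separators).
def explicit_tool_name_candidates(tool_name)
  normalized_name = tool_name.to_s.downcase
  [
    normalized_name,
    normalized_name.tr('_-', ' '),
    normalized_name.tr('_', '-'),
    normalized_name.tr('-', '_')
  ].reject(&:empty?).uniq
end

# Copied from the diff: the lookbehind and lookahead reject matches that sit
# inside a longer identifier-like token.
def explicit_tool_name_mentioned?(text, tool_name)
  explicit_tool_name_candidates(tool_name).any? do |candidate|
    text.match?(/(?<![[:alnum:]_-])#{Regexp.escape(candidate)}(?![[:alnum:]_-])/)
  end
end

# Hypothetical wrapper for illustration: lowercases the text and tries the
# longest tool names first, as the changed method does.
def first_mentioned_tool(text, tool_names)
  lowered = text.to_s.downcase
  tool_names.map(&:to_s).sort_by { |name| -name.length }.find do |name|
    explicit_tool_name_mentioned?(lowered, name)
  end
end

p first_mentioned_tool('Please run read_file_lines on it', %w[read_file read_file_lines])
p first_mentioned_tool('The grepping was slow', %w[grep])
```

With these inputs the first call selects `read_file_lines` and the second finds no tool at all, whereas a substring check would have returned `grep` for the second.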
@@ -24,6 +24,41 @@ module Legion
24
24
  end
25
25
  end
26
26
 
27
+ def dispatch_responses_request(body:, messages:, stream:, stream_block: nil)
28
+ raise Legion::LLM::ProviderError, 'Responses API upstream dispatch is not supported for fleet providers' if fleet_dispatch?
29
+
30
+ idempotency_key = next_route_idempotency_key
31
+ result = Call::Dispatch.call(
32
+ provider: @resolved_provider,
33
+ instance: @resolved_instance,
34
+ capability: :responses,
35
+ model: @resolved_model,
36
+ body: body,
37
+ messages: messages,
38
+ stream: stream,
39
+ **native_dispatch_options,
40
+ &stream_block
41
+ )
42
+ record_route_attempt(
43
+ dispatch_path: :direct,
44
+ operation: :responses,
45
+ status: :success,
46
+ idempotency_key: idempotency_key,
47
+ selected_lane: nil
48
+ )
49
+ result
50
+ rescue StandardError => e
51
+ record_route_attempt(
52
+ dispatch_path: :direct,
53
+ operation: :responses,
54
+ status: :failure,
55
+ idempotency_key: idempotency_key,
56
+ selected_lane: nil,
57
+ failure_reason: e.message
58
+ )
59
+ raise
60
+ end
61
+
27
62
  def dispatch_direct_request(capability:, operation:, messages:, stream_block: nil)
28
63
  idempotency_key = next_route_idempotency_key
29
64
  result = Call::Dispatch.call(
@@ -8,6 +8,16 @@ module Legion
8
8
  module Settings
9
9
  extend Legion::Logging::Helper
10
10
 
11
+ CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT = [
12
+ 'sudo', 'visudo', 'su', 'legion', 'legionio', 'legionio do', 'legionio/legion',
13
+ 'computer_use_session', 'computer_use_control', 'computer_use_session_info',
14
+ 'computer_use_session_message', 'plugin__aithena__recall', 'plugin__aithena__remember',
15
+ 'plugin__aithena__skill_search', 'plugin__aithena__skill_feedback', 'plugin__aithena__memory_stats',
16
+ 'plugin__cron__create', 'plugin__cron__list', 'plugin__cron__get', 'plugin__cron__update',
17
+ 'plugin__cron__delete', 'plugin__cron__get_history', 'plugin__cron__run_now', 'plugin__cron__stop'
18
+ ].freeze
19
+ CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT = [].freeze
20
+
11
21
  def self.default
12
22
  model_override = ENV.fetch('ANTHROPIC_MODEL', nil)
13
23
  {
@@ -482,10 +492,12 @@ module Legion
482
492
 
483
493
  def self.tool_trigger_defaults
484
494
  {
485
- scan_depth: 10,
486
- tool_limit: 25,
487
- local_tool_limit: 100,
488
- client_tool_passthrough: false
495
+ scan_depth: 10,
496
+ tool_limit: 25,
497
+ local_tool_limit: 100,
498
+ client_tool_passthrough: false,
499
+ client_tool_passthrough_whitelist: CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT.dup,
500
+ client_tool_passthrough_blacklist: CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT.dup
489
501
  }
490
502
  end
491
503
 
@@ -19,6 +19,10 @@ module Legion
19
19
  LIST_ALL_TOOLS_NAME = 'legion_list_all_tools'
20
20
  DEFAULT_TIMEOUT_MS = 120_000
21
21
  MAX_TIMEOUT_MS = 600_000
22
+ TOOL_ALIASES = {
23
+ 'python' => %w[python python3],
24
+ 'pip' => %w[pip pip3]
25
+ }.freeze
22
26
  PYTHON_PACKAGES = %w[
23
27
  python-pptx
24
28
  python-docx
@@ -60,6 +64,11 @@ module Legion
60
64
  { status: :error, result: e.message }
61
65
  end
62
66
 
67
+ def aliases_for(tool_name)
68
+ normalized = normalize_tool_name(tool_name)
69
+ TOOL_ALIASES.fetch(normalized, [normalized])
70
+ end
71
+
63
72
  def inventory
64
73
  {
65
74
  special_tools: special_tool_summaries,
@@ -2,6 +2,6 @@
2
2
 
3
3
  module Legion
4
4
  module LLM
5
- VERSION = '0.9.36'
5
+ VERSION = '0.9.51'
6
6
  end
7
7
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: legion-llm
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.9.36
4
+ version: 0.9.51
5
5
  platform: ruby
6
6
  authors:
7
7
  - Esity
@@ -23,6 +23,20 @@ dependencies:
23
23
  - - ">="
24
24
  - !ruby/object:Gem::Version
25
25
  version: '0'
26
+ - !ruby/object:Gem::Dependency
27
+ name: event_stream_parser
28
+ requirement: !ruby/object:Gem::Requirement
29
+ requirements:
30
+ - - "~>"
31
+ - !ruby/object:Gem::Version
32
+ version: '1'
33
+ type: :runtime
34
+ prerelease: false
35
+ version_requirements: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - "~>"
38
+ - !ruby/object:Gem::Version
39
+ version: '1'
26
40
  - !ruby/object:Gem::Dependency
27
41
  name: faraday
28
42
  requirement: !ruby/object:Gem::Requirement