legion-llm 0.9.37 → 0.9.51

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 05ce805ec96361b4a033e7d14a5e9e49e80de7415c75b22fd2af44b41ae447e0
4
- data.tar.gz: b28f87cc01e43a8165c41d72b315373948e0465094bbb103402ea4e5f66d37bc
3
+ metadata.gz: e0cae7c608acb8fe3f09852e7833639c08998bb944b2df026d9c1999c03ff017
4
+ data.tar.gz: bbea922035cf6f38eb43139ea5d33deaca70cbbea8693edccb3811bdc5f43608
5
5
  SHA512:
6
- metadata.gz: 75a99d484b509a4f361b7fae7d3df534e17d6148a613e0e6c9c67b9a6315a73d3ec6830031cd840588c5feb5fd064433638b69191d89b3c0ebddcb9d333d0b62
7
- data.tar.gz: 9bc998ea9c5e12ec2f3b545bc0b24dc44258aa42fc92882217aae59b5ca7c3c45594888584444c41479d93bc2015b435bfa371440b2b429277b93c05dff7a3dc
6
+ metadata.gz: cc620102bcfdbd73387ba3da2e31e80e4fd9c9b9fd3ceeb85b00417972deeda55bbc427702df3a86ec7a2d3f07be34f99383bf9141b79679e394e66c45eda7c1
7
+ data.tar.gz: 4f5b8e4739873d147be2ddfed81c6a04297a016f9c7c4c143b6ad0409f61a9f75c39a14c6de4b30f37119b4e5aa577e4faba813917da0a9c771c016500e079a2
data/CHANGELOG.md CHANGED
@@ -1,5 +1,80 @@
1
1
  # Legion LLM Changelog
2
2
 
3
+ ## [0.9.51] - 2026-05-23
4
+
5
+ ### Changed
6
+ - Settings: `tool_trigger.client_tool_passthrough` now defaults to `false`; callers must opt in with request metadata or a settings override before non-executable client tools are passed through to providers.
7
+
8
+ ## [0.9.50] - 2026-05-23
9
+
10
+ ### Fixed
11
+ - API: native `/api/llm/inference` now debug-logs the exact outward response payload for sync JSON responses and streaming `done` events, making client passthrough tool-call shape visible in runtime logs.
12
+
13
+ ## [0.9.49] - 2026-05-23
14
+
15
+ ### Fixed
16
+ - API: native `/api/llm/inference` client tool requests now use the OpenAI Chat Completions `tool_calls` shape with `type: "function"` and `function.name` / JSON-string `function.arguments`, aligning native sync responses and streaming `done.tool_calls` with the OpenAI-compatible endpoints.
17
+
18
+ ## [0.9.48] - 2026-05-23
19
+
20
+ ### Fixed
21
+ - API: OpenAI-compatible streaming now emits tool callbacks for both Chat Completions and Responses: `/v1/chat/completions` streams `delta.tool_calls` and finishes with `finish_reason: "tool_calls"`, while `/v1/responses` streams `function_call` output item events and includes them in `response.completed.output`.
22
+
23
+ ## [0.9.47] - 2026-05-22
24
+
25
+ ### Fixed
26
+ - API: streaming client passthrough tool calls now stay only on the terminal `done.tool_calls` payload instead of emitting a live `tool-call` event that makes clients wait for an impossible same-stream `tool-result`.
27
+
28
+ ## [0.9.46] - 2026-05-22
29
+
30
+ ### Fixed
31
+ - API: returned client passthrough tool calls now keep the existing streaming `tool-call` event name while carrying `clientPassthrough` and `requiresToolResult` metadata, preserving current client execution behavior.
32
+
33
+ ## [0.9.45] - 2026-05-22
34
+
35
+ ### Fixed
36
+ - Tools: vLLM explicit tool-choice matching now uses tool-name boundaries, so paths like `/rubymine/...` no longer force the `ruby` tool when the user asked for `git`.
37
+
38
+ ## [0.9.44] - 2026-05-22
39
+
40
+ ### Fixed
41
+ - API: streaming client passthrough tool calls now emit `client-tool-call` with explicit client execution metadata instead of a server-execution `tool-call`, preventing clients from waiting for an impossible same-stream server tool result.
42
+ - Settings: default client passthrough blacklist now blocks computer-use session/control tools plus Aithena and cron plugin tools so provider calls do not hand off internal UI/plugin controls as executable client tools.
43
+
44
+ ## [0.9.43] - 2026-05-22
45
+
46
+ ### Fixed
47
+ - Tools: responses that end on client passthrough after prior server-side tool execution now return the current passthrough tool call instead of replaying the earlier executed tool from pending history.
48
+
49
+ ## [0.9.42] - 2026-05-22
50
+
51
+ ### Fixed
52
+ - Tools: native tool-loop follow-up provider calls now include a continuation instruction that tells models to make another available tool call instead of narrating intent after a failed or incomplete tool result.
53
+ - API: streamed native tool results now include status and emit `tool-error` when the server-side tool execution failed.
54
+
55
+ ## [0.9.41] - 2026-05-22
56
+
57
+ ### Fixed
58
+ - API: streaming inference `done` events now include `stop_reason` and `requires_tool_result` when client passthrough tool calls must be executed and submitted back by the caller, matching the sync response contract and preventing wrappers from treating tool-use turns as final completions.
59
+
60
+ ## [0.9.40] - 2026-05-22
61
+
62
+ ### Added
63
+ - Settings: client tool passthrough blacklist now blocks Legion launcher-style client tools such as `legion`, `legionio`, `legionio do`, and `legionio/legion` by default, including sanitized tool-name variants.
64
+ - Tools: client `python3` and `pip3` passthrough definitions deduplicate against Legion's native `python` and `pip` special tools when the managed Python runtime is injected.
65
+
66
+ ## [0.9.39] - 2026-05-22
67
+
68
+ ### Added
69
+ - Settings: `client_tool_passthrough_whitelist` and `client_tool_passthrough_blacklist` now filter non-executable client tools before native provider dispatch; `client_tool_passthrough` defaults to enabled, explicit API true/false overrides are preserved, the default blacklist blocks `sudo`, `visudo`, and `su`, and the default whitelist is empty.
70
+
71
+ ## [0.9.38] - 2026-05-22
72
+
73
+ ### Fixed
74
+ - API: OpenAI Responses upstream dispatch now preserves Responses `input_text` / `output_text` content parts instead of stringifying them through chat-message normalization.
75
+ - Providers: native `:responses` dispatch is gated to providers or instances that explicitly support the Responses API, preventing non-Responses providers from receiving `/v1/responses` traffic just because the adapter has a helper method.
76
+ - Packaging: `event_stream_parser` is now a direct runtime dependency because `legion-llm` requires it for Responses SSE parsing.
77
+
3
78
  ## [0.9.37] - 2026-05-22
4
79
 
5
80
  ### Changed
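Reviewer note on the 0.9.45 entry: the changelog says explicit tool-choice matching now respects tool-name boundaries, but the matcher itself does not appear in this diff. The sketch below shows one way such a check could work. The helper name `explicit_tool_matches` and the lookaround pattern are assumptions for illustration, not the gem's code.

```ruby
# Hypothetical helper: returns the tool names that appear in the text as
# whole tokens, so "ruby" inside "/rubymine/..." does not count as a match.
def explicit_tool_matches(text, tool_names)
  tool_names.select do |name|
    # Assumed boundary rule: the name must not be preceded or followed by a
    # letter, digit, or underscore.
    pattern = /(?<![A-Za-z0-9_])#{Regexp.escape(name)}(?![A-Za-z0-9_])/i
    text.match?(pattern)
  end
end

explicit_tool_matches('use git in /rubymine/project', %w[ruby git])
```

A plain substring check would have matched `ruby` here, which is the failure mode the entry describes.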
data/legion-llm.gemspec CHANGED
@@ -26,6 +26,7 @@ Gem::Specification.new do |spec|
26
26
  }
27
27
 
28
28
  spec.add_dependency 'concurrent-ruby'
29
+ spec.add_dependency 'event_stream_parser', '~> 1'
29
30
  spec.add_dependency 'faraday'
30
31
  spec.add_dependency 'legion-cache', '>= 1.4.2'
31
32
  spec.add_dependency 'legion-json', '>= 1.2.0'
@@ -5,6 +5,7 @@ require 'open3'
5
5
  require 'time'
6
6
  require 'legion/cache/helper'
7
7
  require 'legion/logging/helper'
8
+ require 'legion/llm/api/translators/openai_response'
8
9
  require 'legion/llm/publisher_identity'
9
10
  require 'legion/llm/types'
10
11
 
@@ -302,7 +303,7 @@ module Legion
302
303
  name: tname,
303
304
  description: tdesc,
304
305
  parameters: tschema || {},
305
- source: { type: :client, executable: false }
306
+ source: { type: :client, executable: false, raw_name: tname }
306
307
  )
307
308
  rescue StandardError => e
308
309
  handle_exception(e, level: :warn, handled: true, operation: "llm.api.build_client_tool_class.#{tname}")
@@ -310,16 +311,7 @@ module Legion
310
311
  end
311
312
 
312
313
  define_method(:extract_tool_calls) do |pipeline_response|
313
- tools_data = pipeline_response.tools
314
- return [] unless tools_data.is_a?(Array) && !tools_data.empty?
315
-
316
- tools_data.map do |tc|
317
- {
318
- id: tc.respond_to?(:id) ? tc.id : (tc[:id] || tc['id']),
319
- name: tc.respond_to?(:name) ? tc.name : (tc[:name] || tc['name'] || tc.to_s),
320
- arguments: tc.respond_to?(:arguments) ? tc.arguments : (tc[:arguments] || tc['arguments'] || {})
321
- }
322
- end
314
+ Legion::LLM::API::Translators::OpenAIResponse.build_tool_calls(pipeline_response)
323
315
  end
324
316
 
325
317
  define_method(:extract_text_content) do |content|
@@ -347,7 +339,46 @@ module Legion
347
339
  stream << "event: #{event_name}\ndata: #{Legion::JSON.dump(payload)}\n\n"
348
340
  end
349
341
 
350
- define_method(:emit_response_tool_call_events) do |stream, pipeline_response|
342
+ define_method(:log_native_inference_response) do |request_id:, conversation_id:, stream:, kind:, payload:|
343
+ log.debug(
344
+ "[llm][api][inference] action=response_payload request_id=#{request_id || 'unknown'} " \
345
+ "conversation_id=#{conversation_id || 'none'} stream=#{stream} kind=#{kind} " \
346
+ "payload=#{Legion::JSON.dump(payload)}"
347
+ )
348
+ rescue StandardError => e
349
+ handle_exception(e, level: :debug, handled: true,
350
+ operation: 'llm.api.inference.response_payload_log',
351
+ request_id: request_id)
352
+ end
353
+
354
+ define_method(:returned_client_tool_call_payload) do |tool_call, tool_call_id, tool_name|
355
+ {
356
+ toolCallId: tool_call_id,
357
+ toolName: tool_name,
358
+ args: openai_tool_call_arguments(tool_call),
359
+ clientPassthrough: true,
360
+ requiresToolResult: true,
361
+ status: 'requires_client_execution',
362
+ timestamp: Time.now.utc.iso8601
363
+ }
364
+ end
365
+
366
+ define_method(:openai_tool_call_name) do |tool_call|
367
+ fn = tool_call[:function] || tool_call['function'] || {}
368
+ fn[:name] || fn['name'] || tool_call[:name] || tool_call['name']
369
+ end
370
+
371
+ define_method(:openai_tool_call_arguments) do |tool_call|
372
+ fn = tool_call[:function] || tool_call['function'] || {}
373
+ raw_args = fn[:arguments] || fn['arguments'] || tool_call[:arguments] || tool_call['arguments'] || {}
374
+ return raw_args unless raw_args.is_a?(String)
375
+
376
+ Legion::JSON.parse(raw_args, symbolize_names: true)
377
+ rescue StandardError
378
+ raw_args
379
+ end
380
+
381
+ define_method(:emit_response_tool_call_events) do |_stream, pipeline_response|
351
382
  tool_calls = extract_tool_calls(pipeline_response)
352
383
  return if tool_calls.empty?
353
384
 
@@ -359,7 +390,7 @@ module Legion
359
390
  data[:tool_call_id] || data['tool_call_id']
360
391
  end
361
392
 
362
- emitted = 0
393
+ done_only = 0
363
394
  skipped_timeline = 0
364
395
  request_id = pipeline_response.respond_to?(:request_id) ? pipeline_response.request_id : 'unknown'
365
396
  conversation_id = pipeline_response.respond_to?(:conversation_id) ? pipeline_response.conversation_id : 'none'
@@ -371,28 +402,22 @@ module Legion
371
402
  next
372
403
  end
373
404
 
374
- tool_name = tool_call[:name] || tool_call['name']
405
+ tool_name = openai_tool_call_name(tool_call)
375
406
  next if tool_name.to_s.empty?
376
407
 
377
408
  log.info(
378
- "[llm][api][tools] action=returned_tool_call_sse request_id=#{request_id || 'unknown'} " \
409
+ "[llm][api][tools] action=returned_tool_call_done_only request_id=#{request_id || 'unknown'} " \
379
410
  "conversation_id=#{conversation_id || 'none'} tool_call_id=#{tool_call_id || 'none'} name=#{tool_name} " \
380
- "args_class=#{(tool_call[:arguments] || tool_call['arguments'] || {}).class}"
411
+ "args_class=#{openai_tool_call_arguments(tool_call).class}"
381
412
  )
382
- emit_sse_event(stream, 'tool-call', {
383
- toolCallId: tool_call_id,
384
- toolName: tool_name,
385
- args: tool_call[:arguments] || tool_call['arguments'] || {},
386
- timestamp: Time.now.utc.iso8601
387
- })
388
- emitted += 1
413
+ done_only += 1
389
414
  end
390
415
 
391
- names = tool_calls.map { |tool_call| tool_call[:name] || tool_call['name'] }.compact
416
+ names = tool_calls.map { |tool_call| openai_tool_call_name(tool_call) }.compact
392
417
  names = names.first(30).join(',') + (names.size > 30 ? ",+#{names.size - 30}more" : '')
393
418
  log.info(
394
419
  "[llm][api][tools] action=returned_tool_calls_complete request_id=#{request_id || 'unknown'} " \
395
- "conversation_id=#{conversation_id || 'none'} total=#{tool_calls.size} emitted=#{emitted} " \
420
+ "conversation_id=#{conversation_id || 'none'} total=#{tool_calls.size} done_only=#{done_only} " \
396
421
  "skipped_timeline=#{skipped_timeline} names=#{names.empty? ? 'none' : names}"
397
422
  )
398
423
  end
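Reviewer note: the new `openai_tool_call_name` and `openai_tool_call_arguments` helpers accept both the flat `{ name:, arguments: }` shape and the OpenAI `function` wrapper, and parse JSON-string arguments back into a hash. A standalone sketch of that normalization follows, using the standard `json` library in place of `Legion::JSON` (an assumption made so the example runs on its own).

```ruby
require 'json'

# Reads the tool name from either the OpenAI "function" wrapper or a flat hash.
def tool_call_name(tool_call)
  fn = tool_call[:function] || tool_call['function'] || {}
  fn[:name] || fn['name'] || tool_call[:name] || tool_call['name']
end

# Returns arguments as a hash when they arrive as a JSON string; falls back to
# the raw value when the string is not valid JSON.
def tool_call_arguments(tool_call)
  fn = tool_call[:function] || tool_call['function'] || {}
  raw = fn[:arguments] || fn['arguments'] || tool_call[:arguments] || tool_call['arguments'] || {}
  return raw unless raw.is_a?(String)

  JSON.parse(raw, symbolize_names: true)
rescue JSON::ParserError
  raw
end
```

Falling back to the raw string keeps logging and passthrough working even when a provider returns malformed argument JSON.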
@@ -28,7 +28,7 @@ module Legion
28
28
  conversation_id = body[:conversation_id]
29
29
  request_id = body[:request_id] || SecureRandom.uuid
30
30
  include_thinking = body[:include_thinking] == true
31
- client_tool_passthrough = body[:client_tool_passthrough] == true
31
+ client_tool_passthrough = body[:client_tool_passthrough] if [true, false].include?(body[:client_tool_passthrough])
32
32
 
33
33
  unless messages.is_a?(Array)
34
34
  halt 400, { 'Content-Type' => 'application/json' },
@@ -105,7 +105,7 @@ module Legion
105
105
  extra = {}
106
106
  extra[:tier] = tier.to_sym if tier
107
107
  metadata = { requested_tools: requested_tools }
108
- metadata[:client_tool_passthrough] = true if client_tool_passthrough
108
+ metadata[:client_tool_passthrough] = client_tool_passthrough unless client_tool_passthrough.nil?
109
109
  metadata[:client_tool_request_count] = tools.size if tools.any?
110
110
 
111
111
  pipeline_request = Legion::LLM::Inference::Request.build(
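Reviewer note: the two hunks above turn `client_tool_passthrough` into a tri-state value. Only a literal `true` or `false` in the request body is kept; anything else stays `nil` and is left out of the metadata so the settings default can apply. A minimal sketch, with hypothetical helper names:

```ruby
# Hypothetical helper: keeps explicit booleans, maps everything else to nil.
def parse_passthrough(body)
  value = body[:client_tool_passthrough]
  [true, false].include?(value) ? value : nil
end

# Hypothetical helper: only records the flag when the caller set it explicitly.
def build_metadata(requested_tools, passthrough)
  metadata = { requested_tools: requested_tools }
  metadata[:client_tool_passthrough] = passthrough unless passthrough.nil?
  metadata
end

build_metadata(['git'], parse_passthrough({ client_tool_passthrough: false }))
```

Before this change an explicit `false` was indistinguishable from an omitted key, so a caller could not opt out once the setting was enabled.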
@@ -148,10 +148,12 @@ module Legion
148
148
  timestamp: Time.now.utc.iso8601
149
149
  })
150
150
  when :tool_result
151
- emit_sse_event(out, 'tool-result', {
151
+ event_name = event[:status].to_s == 'error' ? 'tool-error' : 'tool-result'
152
+ emit_sse_event(out, event_name, {
152
153
  toolCallId: event[:tool_call_id],
153
154
  toolName: event[:tool_name],
154
155
  result: event[:result],
156
+ status: event[:status],
155
157
  timestamp: Time.now.utc.iso8601
156
158
  })
157
159
  when :tool_error
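Reviewer note: the `:tool_result` branch now picks the SSE event name from the result status and includes that status in the payload. The sketch below reproduces the framing with the standard library; the function names and the use of `JSON.generate` in place of `Legion::JSON.dump` are assumptions so the example is self-contained.

```ruby
require 'json'
require 'stringio'
require 'time'

# Writes one Server-Sent Event frame: an event line, a data line, a blank line.
def emit_sse_event(stream, event_name, payload)
  stream << "event: #{event_name}\ndata: #{JSON.generate(payload)}\n\n"
end

# Failed server-side tool executions are surfaced as "tool-error" events.
def emit_tool_result(stream, event)
  event_name = event[:status].to_s == 'error' ? 'tool-error' : 'tool-result'
  emit_sse_event(stream, event_name, {
                   toolCallId: event[:tool_call_id],
                   toolName: event[:tool_name],
                   result: event[:result],
                   status: event[:status],
                   timestamp: Time.now.utc.iso8601
                 })
end

out = StringIO.new
emit_tool_result(out, { tool_call_id: 'c1', tool_name: 'git', result: 'exit 1', status: 'error' })
```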
@@ -184,20 +186,31 @@ module Legion
184
186
 
185
187
  routing = pipeline_response.routing || {}
186
188
  tokens = pipeline_response.tokens || {}
189
+ tool_calls = extract_tool_calls(pipeline_response)
190
+ stop_reason = pipeline_response.stop&.dig(:reason)&.to_s
187
191
  done_payload = {
188
- request_id: request_id,
189
- content: full_text,
190
- model: (routing[:model] || routing['model']).to_s,
191
- provider: (routing[:provider] || routing['provider'])&.to_s,
192
- instance: (routing[:instance] || routing['instance'])&.to_s,
193
- tier: (routing[:tier] || routing['tier'])&.to_s,
194
- input_tokens: token_value(tokens, :input),
195
- output_tokens: token_value(tokens, :output),
196
- tool_calls: extract_tool_calls(pipeline_response),
197
- conversation_id: pipeline_response.conversation_id,
198
- metrics: build_response_metrics(pipeline_response)
192
+ request_id: request_id,
193
+ content: full_text,
194
+ model: (routing[:model] || routing['model']).to_s,
195
+ provider: (routing[:provider] || routing['provider'])&.to_s,
196
+ instance: (routing[:instance] || routing['instance'])&.to_s,
197
+ tier: (routing[:tier] || routing['tier'])&.to_s,
198
+ input_tokens: token_value(tokens, :input),
199
+ output_tokens: token_value(tokens, :output),
200
+ tool_calls: tool_calls,
201
+ stop_reason: stop_reason,
202
+ requires_tool_result: stop_reason == 'tool_use' && tool_calls.any?,
203
+ conversation_id: pipeline_response.conversation_id,
204
+ metrics: build_response_metrics(pipeline_response)
199
205
  }.compact
200
206
  done_payload[:thinking] = pipeline_response.thinking if include_thinking && pipeline_response.thinking
207
+ log_native_inference_response(
208
+ request_id: request_id,
209
+ conversation_id: pipeline_response.conversation_id || conversation_id,
210
+ stream: true,
211
+ kind: 'sse_done',
212
+ payload: done_payload
213
+ )
201
214
  emit_sse_event(out, 'done', {
202
215
  **done_payload
203
216
  })
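Reviewer note: the streaming `done` payload now carries `stop_reason` and `requires_tool_result`, matching the sync response. A sketch of the flag computation and of how a client wrapper might use it follows; `done_flags` and `final_completion?` are hypothetical names, and the client-side check is an assumption about consumer behaviour rather than something this diff contains.

```ruby
# Mirrors the expression in the diff: a tool-use stop with at least one
# returned tool call means the caller must execute it and submit the result.
def done_flags(stop_reason, tool_calls)
  {
    stop_reason: stop_reason,
    requires_tool_result: stop_reason == 'tool_use' && tool_calls.any?
  }.compact
end

# Hypothetical client-side check: a turn is final only when no tool result is owed.
def final_completion?(done_payload)
  done_payload[:requires_tool_result] != true
end

final_completion?(done_flags('tool_use', [{ id: 'call_1' }]))
```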
@@ -227,6 +240,7 @@ module Legion
227
240
  routing = pipeline_response.routing || {}
228
241
  tokens = pipeline_response.tokens || {}
229
242
  tool_calls = extract_tool_calls(pipeline_response)
243
+ stop_reason = pipeline_response.stop&.dig(:reason)&.to_s
230
244
 
231
245
  log.info(
232
246
  "[llm][api][inference] action=completed request_id=#{request_id} " \
@@ -240,21 +254,29 @@ module Legion
240
254
  )
241
255
 
242
256
  payload = {
243
- request_id: request_id,
244
- content: content,
245
- tool_calls: tool_calls,
246
- stop_reason: pipeline_response.stop&.dig(:reason)&.to_s,
247
- model: (routing[:model] || routing['model']).to_s,
248
- provider: (routing[:provider] || routing['provider'])&.to_s,
249
- instance: (routing[:instance] || routing['instance'])&.to_s,
250
- tier: (routing[:tier] || routing['tier'])&.to_s,
251
- input_tokens: token_value(tokens, :input),
252
- output_tokens: token_value(tokens, :output),
253
- conversation_id: pipeline_response.conversation_id,
254
- metrics: build_response_metrics(pipeline_response)
257
+ request_id: request_id,
258
+ content: content,
259
+ tool_calls: tool_calls,
260
+ stop_reason: stop_reason,
261
+ requires_tool_result: stop_reason == 'tool_use' && tool_calls.any?,
262
+ model: (routing[:model] || routing['model']).to_s,
263
+ provider: (routing[:provider] || routing['provider'])&.to_s,
264
+ instance: (routing[:instance] || routing['instance'])&.to_s,
265
+ tier: (routing[:tier] || routing['tier'])&.to_s,
266
+ input_tokens: token_value(tokens, :input),
267
+ output_tokens: token_value(tokens, :output),
268
+ conversation_id: pipeline_response.conversation_id,
269
+ metrics: build_response_metrics(pipeline_response)
255
270
  }
256
271
  payload[:thinking] = pipeline_response.thinking if include_thinking && pipeline_response.thinking
257
272
  payload.compact!
273
+ log_native_inference_response(
274
+ request_id: request_id,
275
+ conversation_id: pipeline_response.conversation_id || conversation_id,
276
+ stream: false,
277
+ kind: 'json_response',
278
+ payload: { data: payload }
279
+ )
258
280
  json_response(payload, status_code: 200)
259
281
  end
260
282
  rescue Legion::LLM::AuthError => e
@@ -67,8 +67,22 @@ module Legion
67
67
 
68
68
  routing = pipeline_response.routing || {}
69
69
  final_model = (routing[:model] || routing['model'] || model).to_s
70
+ tool_calls = Legion::LLM::API::Translators::OpenAIResponse.build_tool_calls(pipeline_response)
71
+
72
+ tool_calls.each_with_index do |tool_call, index|
73
+ out << "data: #{Legion::JSON.dump(Legion::LLM::API::Translators::OpenAIResponse.format_stream_tool_call_chunk(
74
+ tool_call,
75
+ model: final_model,
76
+ request_id: request_id,
77
+ index: index
78
+ ))}\n\n"
79
+ end
80
+
70
81
  done_chunk = Legion::LLM::API::Translators::OpenAIResponse.format_stream_chunk(
71
- nil, model: final_model, request_id: request_id, finish_reason: 'stop'
82
+ nil,
83
+ model: final_model,
84
+ request_id: request_id,
85
+ finish_reason: tool_calls.empty? ? 'stop' : 'tool_calls'
72
86
  )
73
87
  out << "data: #{Legion::JSON.dump(done_chunk)}\n\n"
74
88
  out << "data: [DONE]\n\n"
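Reviewer note: for `/v1/chat/completions` streaming, each returned tool call is now sent as a `delta.tool_calls` chunk and the terminal chunk switches `finish_reason` to `"tool_calls"`. The sketch below approximates that sequence with plain hashes and the standard `json` library. The helper names are hypothetical, and the final chunk uses an empty delta as a simplification of the translator's `format_stream_chunk(nil, ...)` call, whose output is not shown in this diff.

```ruby
require 'json'
require 'securerandom'

# Builds one chat.completion.chunk envelope around a delta.
def delta_chunk(delta, model:, request_id:, finish_reason: nil)
  {
    id: "chatcmpl-#{request_id.delete('-')}",
    object: 'chat.completion.chunk',
    created: Time.now.to_i,
    model: model.to_s,
    choices: [{ index: 0, delta: delta, finish_reason: finish_reason }]
  }
end

# Streams one chunk per tool call, then the terminal chunk and the [DONE] marker.
def stream_tool_calls(out, tool_calls, model:, request_id:)
  tool_calls.each_with_index do |tc, index|
    args = tc[:arguments]
    delta = { tool_calls: [{
      index: index,
      id: tc[:id] || "call_#{SecureRandom.hex(8)}",
      type: 'function',
      # OpenAI clients expect arguments as a JSON string, not an object.
      function: { name: tc[:name].to_s, arguments: args.is_a?(String) ? args : JSON.generate(args) }
    }] }
    out << "data: #{JSON.generate(delta_chunk(delta, model: model, request_id: request_id))}\n\n"
  end

  finish = tool_calls.empty? ? 'stop' : 'tool_calls'
  out << "data: #{JSON.generate(delta_chunk({}, model: model, request_id: request_id, finish_reason: finish))}\n\n"
  out << "data: [DONE]\n\n"
end
```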
@@ -237,6 +237,7 @@ module Legion
237
237
  tokens = pipeline_response.tokens || {}
238
238
  resolved_model = (routing[:model] || routing['model'] || model).to_s
239
239
  usage = build_usage(tokens)
240
+ function_calls = build_output_tool_calls(pipeline_response)
240
241
 
241
242
  out << sse_event('response.output_text.done', {
242
243
  type: 'response.output_text.done',
@@ -265,6 +266,41 @@ module Legion
265
266
  item: completed_item
266
267
  })
267
268
 
269
+ function_calls.each_with_index do |function_call, index|
270
+ output_index = index + 1
271
+ in_progress_item = function_call.merge(status: 'in_progress', arguments: '')
272
+
273
+ out << sse_event('response.output_item.added', {
274
+ type: 'response.output_item.added',
275
+ sequence_number: seq += 1,
276
+ output_index: output_index,
277
+ item: in_progress_item
278
+ })
279
+
280
+ out << sse_event('response.function_call_arguments.delta', {
281
+ type: 'response.function_call_arguments.delta',
282
+ sequence_number: seq += 1,
283
+ output_index: output_index,
284
+ item_id: function_call[:id],
285
+ delta: function_call[:arguments]
286
+ })
287
+
288
+ out << sse_event('response.function_call_arguments.done', {
289
+ type: 'response.function_call_arguments.done',
290
+ sequence_number: seq += 1,
291
+ output_index: output_index,
292
+ item_id: function_call[:id],
293
+ arguments: function_call[:arguments]
294
+ })
295
+
296
+ out << sse_event('response.output_item.done', {
297
+ type: 'response.output_item.done',
298
+ sequence_number: seq += 1,
299
+ output_index: output_index,
300
+ item: function_call
301
+ })
302
+ end
303
+
268
304
  out << sse_event('response.completed', {
269
305
  type: 'response.completed',
270
306
  sequence_number: seq + 1,
@@ -274,7 +310,7 @@ module Legion
274
310
  created_at: created_at,
275
311
  status: 'completed',
276
312
  model: resolved_model,
277
- output: [completed_item],
313
+ output: [completed_item, *function_calls],
278
314
  usage: usage
279
315
  }
280
316
  })
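Reviewer note: for `/v1/responses` streaming, each function call now produces four events in a fixed order and is appended to `response.completed.output` after the text item. The sketch below builds that event list as plain hashes so the ordering and sequence numbering are easy to inspect. `function_call_events` is a hypothetical helper; the shape of each `function_call` item is assumed to include `:id` and `:arguments`, as the diff's usage suggests.

```ruby
# Returns the per-call event sequence added in the diff, starting after `seq`.
def function_call_events(function_calls, seq:)
  events = []
  function_calls.each_with_index do |function_call, index|
    output_index = index + 1 # index 0 is the text output item

    seq += 1
    events << { type: 'response.output_item.added', sequence_number: seq, output_index: output_index,
                item: function_call.merge(status: 'in_progress', arguments: '') }
    seq += 1
    events << { type: 'response.function_call_arguments.delta', sequence_number: seq, output_index: output_index,
                item_id: function_call[:id], delta: function_call[:arguments] }
    seq += 1
    events << { type: 'response.function_call_arguments.done', sequence_number: seq, output_index: output_index,
                item_id: function_call[:id], arguments: function_call[:arguments] }
    seq += 1
    events << { type: 'response.output_item.done', sequence_number: seq, output_index: output_index,
                item: function_call }
  end
  events
end
```

The whole argument string is sent as a single delta because the tool call is only known once the pipeline response is complete.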
@@ -70,6 +70,51 @@ module Legion
70
70
  }
71
71
  end
72
72
 
73
+ def format_stream_tool_call_chunk(tool_call, model:, request_id:, index:)
74
+ fn = tool_call.is_a?(Hash) ? (tool_call[:function] || tool_call['function'] || {}) : {}
75
+ name = tool_call.respond_to?(:name) ? tool_call.name : (tool_call[:name] || tool_call['name'] || fn[:name] || fn['name'])
76
+ args = if tool_call.respond_to?(:arguments)
77
+ tool_call.arguments
78
+ else
79
+ tool_call[:arguments] || tool_call['arguments'] || fn[:arguments] || fn['arguments'] || {}
80
+ end
81
+ tc_id = tool_call.respond_to?(:id) ? tool_call.id : (tool_call[:id] || tool_call['id'] || "call_#{SecureRandom.hex(8)}")
82
+
83
+ format_stream_delta_chunk(
84
+ {
85
+ tool_calls: [
86
+ {
87
+ index: index,
88
+ id: tc_id,
89
+ type: 'function',
90
+ function: {
91
+ name: name.to_s,
92
+ arguments: args.is_a?(String) ? args : Legion::JSON.dump(args)
93
+ }
94
+ }
95
+ ]
96
+ },
97
+ model: model,
98
+ request_id: request_id
99
+ )
100
+ end
101
+
102
+ def format_stream_delta_chunk(delta, model:, request_id:, finish_reason: nil)
103
+ {
104
+ id: "chatcmpl-#{request_id.delete('-')}",
105
+ object: 'chat.completion.chunk',
106
+ created: Time.now.to_i,
107
+ model: model.to_s,
108
+ choices: [
109
+ {
110
+ index: 0,
111
+ delta: delta,
112
+ finish_reason: finish_reason
113
+ }
114
+ ]
115
+ }
116
+ end
117
+
73
118
  def format_embeddings(vector, model:, input_text:, usage: nil)
74
119
  tokens = embedding_token_count(usage, input_text)
75
120
 
@@ -190,6 +190,9 @@ module Legion
190
190
  raise Legion::LLM::ProviderError, "unsupported capability: #{capability}" unless method_name
191
191
 
192
192
  ext = fetch_extension!(provider, instance: instance)
193
+ if ext.respond_to?(:supports?) && !ext.supports?(cap_sym)
194
+ raise Legion::LLM::ProviderError, "unsupported capability #{capability} for provider #{provider}"
195
+ end
193
196
 
194
197
  log.info("[llm][dispatch] capability=#{cap_sym} provider=#{provider} " \
195
198
  "instance=#{instance || 'default'} model=#{model}")
@@ -11,11 +11,13 @@ module Legion
11
11
  include Legion::Logging::Helper
12
12
 
13
13
  METADATA_KEYS = %i[tier capabilities enabled].freeze
14
+ RESPONSES_PROVIDER_FAMILIES = %i[openai vllm].freeze
14
15
 
15
16
  def initialize(provider_name, provider_class, instance_config: {})
16
17
  @provider_name = provider_name.to_sym
17
18
  @provider_class = provider_class
18
19
  @instance_config = instance_config
20
+ @capabilities = Array(instance_config[:capabilities] || instance_config['capabilities']).map(&:to_sym)
19
21
  @lex_llm_namespace = resolve_lex_llm_namespace
20
22
  end
21
23
 
@@ -60,6 +62,8 @@ module Legion
60
62
  end
61
63
 
62
64
  def responses(model:, body:, messages:, stream: false, **opts, &)
65
+ raise Legion::LLM::ProviderError, "Responses API dispatch is not supported for #{provider_name}" unless supports?(:responses)
66
+
63
67
  payload = build_responses_payload(
64
68
  body: body,
65
69
  model: model,
@@ -77,6 +81,12 @@ module Legion
77
81
  end
78
82
  end
79
83
 
84
+ def supports?(capability)
85
+ return true unless capability.to_sym == :responses
86
+
87
+ @capabilities.include?(:responses) || RESPONSES_PROVIDER_FAMILIES.include?(provider_name)
88
+ end
89
+
80
90
  def embed(model:, text:, dimensions: nil, **opts)
81
91
  model_info = model_info(model, offering_metadata: opts[:offering_metadata])
82
92
  response = provider.embed(
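Reviewer note: `supports?` gates only the `:responses` capability. It passes when the instance config declares the capability or when the provider belongs to one of the listed families; every other capability is reported as supported. A standalone sketch follows. `AdapterSketch`, the local `ProviderError` class, and the provider name `:example_provider` are stand-ins for illustration.

```ruby
# Stand-in for Legion::LLM::ProviderError so the sketch is self-contained.
class ProviderError < StandardError; end

class AdapterSketch
  RESPONSES_PROVIDER_FAMILIES = %i[openai vllm].freeze

  attr_reader :provider_name

  def initialize(provider_name, instance_config: {})
    @provider_name = provider_name.to_sym
    # Capabilities may arrive with symbol or string keys and values.
    @capabilities = Array(instance_config[:capabilities] || instance_config['capabilities']).map(&:to_sym)
  end

  def supports?(capability)
    return true unless capability.to_sym == :responses

    @capabilities.include?(:responses) || RESPONSES_PROVIDER_FAMILIES.include?(provider_name)
  end

  # Refuses Responses traffic up front instead of forwarding it upstream.
  def responses(model:)
    raise ProviderError, "Responses API dispatch is not supported for #{provider_name}" unless supports?(:responses)

    { model: model, dispatched: true }
  end
end
```

Note that the related registry change stops stripping `:capabilities` from the instance config, which is what lets a per-instance opt-in reach the adapter.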
@@ -161,7 +171,7 @@ module Legion
161
171
  payload = normalize_hash(body).dup
162
172
  payload[:model] = model
163
173
  payload[:stream] = stream
164
- payload[:input] = responses_input(messages)
174
+ payload[:input] = responses_payload_input(payload, messages)
165
175
 
166
176
  system_content = normalize_response_system(system)
167
177
  payload[:instructions] = system_content if present_system?(system_content)
@@ -185,12 +195,41 @@ module Legion
185
195
 
186
196
  {
187
197
  role: normalized[:role]&.to_s || 'user',
188
- content: normalize_message_content(normalized[:content]).to_s,
198
+ content: responses_message_content(normalized[:content]),
189
199
  tool_call_id: normalized[:tool_call_id]
190
200
  }.compact
191
201
  end
192
202
  end
193
203
 
204
+ def responses_payload_input(payload, messages)
205
+ return payload[:input] if payload.key?(:input)
206
+ return payload['input'] if payload.key?('input')
207
+
208
+ responses_input(messages)
209
+ end
210
+
211
+ def responses_message_content(content)
212
+ return content if content.nil? || content.is_a?(String)
213
+
214
+ if content.is_a?(Array)
215
+ parts = content.filter_map { |part| responses_content_part(part) }
216
+ return parts unless parts.empty?
217
+ end
218
+
219
+ text_part_content(content) || content.to_s
220
+ end
221
+
222
+ def responses_content_part(part)
223
+ return { type: 'input_text', text: part } if part.is_a?(String)
224
+ return part unless part.respond_to?(:transform_keys)
225
+
226
+ normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
227
+ type = normalized[:type].to_s
228
+ return { type: type, text: normalized[:text].to_s } if %w[input_text output_text text].include?(type)
229
+
230
+ part
231
+ end
232
+
194
233
  def normalize_response_system(system)
195
234
  return nil if system.nil?
196
235
  return system[:content] || system['content'] if system.is_a?(Hash)
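Reviewer note: message content for Responses dispatch is no longer flattened to a string. Text-like parts are normalized to `{ type:, text: }`, bare strings become `input_text` parts, and any other part passes through unchanged. The sketch below restates that logic as free functions; the final fallback is simplified to `content.to_s` because `text_part_content` is defined elsewhere in the adapter.

```ruby
# Normalizes one content part; non-text parts are returned untouched.
def responses_content_part(part)
  return { type: 'input_text', text: part } if part.is_a?(String)
  return part unless part.respond_to?(:transform_keys)

  normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
  type = normalized[:type].to_s
  return { type: type, text: normalized[:text].to_s } if %w[input_text output_text text].include?(type)

  part
end

# Keeps strings and nil as they are, and preserves arrays of parts.
def responses_message_content(content)
  return content if content.nil? || content.is_a?(String)

  if content.is_a?(Array)
    parts = content.filter_map { |part| responses_content_part(part) }
    return parts unless parts.empty?
  end

  content.to_s # simplified fallback for this sketch
end

responses_message_content([{ 'type' => 'input_text', 'text' => 'hi' }, 'there'])
```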
@@ -463,7 +502,7 @@ module Legion
463
502
 
464
503
  if part.respond_to?(:transform_keys)
465
504
  normalized = part.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
466
- return unless normalized[:type].to_s == 'text'
505
+ return unless %w[input_text output_text text].include?(normalized[:type].to_s)
467
506
 
468
507
  return normalized[:text].to_s
469
508
  end
@@ -104,7 +104,7 @@ module Legion
104
104
  end
105
105
 
106
106
  def adapter_instance_config(config, instance_id)
107
- config.except(:tier, :capabilities).tap do |registry_config|
107
+ config.except(:tier).tap do |registry_config|
108
108
  registry_config[:instance_id] ||= instance_id
109
109
  end
110
110
  end
@@ -69,7 +69,7 @@ module Legion
69
69
  ].freeze
70
70
 
71
71
  MAX_NATIVE_TOOL_ROUNDS = 200
72
- ToolResultEvent = Struct.new(:result, :tool_call_id, :tool_name, :started_at, keyword_init: true)
72
+ ToolResultEvent = Struct.new(:result, :tool_call_id, :tool_name, :started_at, :status, keyword_init: true)
73
73
 
74
74
  ASYNC_THREAD_POOL = Concurrent::FixedThreadPool.new(4, fallback_policy: :caller_runs)
75
75
 
@@ -912,12 +912,29 @@ module Legion
912
912
  offering_id: @resolved_offering_id,
913
913
  offering_metadata: @resolved_offering_metadata
914
914
  }
915
+ options[:system] = native_tool_loop_system(options[:system])
915
916
  options[:tools] = native_dispatch_tools if native_dispatch_tools.any?
916
917
  options[:tool_prefs] = native_tool_prefs if native_dispatch_tools.any? && native_tool_prefs
917
918
  options[:thinking] = native_dispatch_thinking if native_dispatch_thinking
918
919
  options.compact
919
920
  end
920
921
 
922
+ def native_tool_loop_system(system)
923
+ return system unless @native_tool_loop_round.to_i.positive? && native_dispatch_tools.any?
924
+
925
+ [system, native_tool_loop_continuation_prompt].compact.join("\n\n")
926
+ end
927
+
928
+ def native_tool_loop_continuation_prompt
929
+ <<~PROMPT.strip
930
+ Tool-use continuation rule:
931
+ - You just received tool results.
932
+ - If a tool failed or produced incomplete information and another available tool can continue the user's request, call that tool now.
933
+ - Do not say you will use a tool unless you are actually making the tool call in this response.
934
+ - Only provide a final answer when no further tool call is needed or possible.
935
+ PROMPT
936
+ end
937
+
921
938
  def native_dispatch_chat_options
922
939
  opts = { model: @resolved_model, provider: @resolved_provider }
923
940
  opts[:instance] = @resolved_instance if @resolved_instance
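Reviewer note: the continuation prompt is appended to the system prompt only on follow-up rounds of the native tool loop and only when tools are available. The sketch below isolates that rule. The function takes the round and tool list as parameters instead of reading instance state, and the prompt text is shortened, both as simplifications for illustration.

```ruby
# Shortened stand-in for the continuation prompt shown in the diff.
CONTINUATION_PROMPT = <<~PROMPT.strip
  Tool-use continuation rule:
  - You just received tool results.
  - Only provide a final answer when no further tool call is needed or possible.
PROMPT

# Appends the continuation rule after the first round when tools can still be called.
def tool_loop_system(system, round:, tools:)
  return system unless round.to_i.positive? && tools.any?

  [system, CONTINUATION_PROMPT].compact.join("\n\n")
end

tool_loop_system('Be brief.', round: 1, tools: ['git'])
```

Because of `compact`, a request with no system prompt still receives the rule on follow-up rounds, without a leading blank paragraph.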
@@ -959,7 +976,42 @@ module Legion
959
976
  return value if [true, false].include?(value)
960
977
  end
961
978
 
962
- Legion::LLM::Settings.value(:tool_trigger, :client_tool_passthrough) != false
979
+ Legion::LLM::Settings.value(:tool_trigger, :client_tool_passthrough) == true
980
+ end
981
+
982
+ def client_tool_passthrough_allowed?(definition)
983
+ names = client_tool_passthrough_name_variants(definition)
984
+ whitelist = client_tool_passthrough_list(:client_tool_passthrough_whitelist)
985
+ blacklist = client_tool_passthrough_list(:client_tool_passthrough_blacklist)
986
+
987
+ return false if whitelist.any? && !names.intersect?(whitelist)
988
+ return false if names.intersect?(blacklist)
989
+
990
+ true
991
+ end
992
+
993
+ def client_tool_passthrough_list(key)
994
+ defaults = {
995
+ client_tool_passthrough_whitelist: Legion::LLM::Settings::CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT,
996
+ client_tool_passthrough_blacklist: Legion::LLM::Settings::CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT
997
+ }
998
+ Array(Legion::LLM::Settings.value(:tool_trigger, key, default: defaults.fetch(key))).flat_map do |entry|
999
+ client_tool_policy_variants(entry)
1000
+ end.uniq
1001
+ end
1002
+
1003
+ def client_tool_passthrough_name_variants(definition)
1004
+ source = definition.respond_to?(:source) ? definition.source : {}
1005
+ raw_name = source[:raw_name] || source['raw_name'] if source.is_a?(Hash)
1006
+ [definition.name, raw_name].compact.flat_map { |name| client_tool_policy_variants(name) }.uniq
1007
+ end
1008
+
1009
+ def client_tool_policy_variants(value)
1010
+ raw = value.to_s.strip.downcase
1011
+ sanitized = Types::ToolDefinition.sanitize_tool_name(value).downcase
1012
+ compact = raw.gsub(/[^a-z0-9]/, '')
1013
+
1014
+ [raw, sanitized, compact].reject(&:empty?).uniq
963
1015
  end
964
1016
 
965
1017
  def non_executable_client_tool?(definition)
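Reviewer note: the policy check compares name variants (raw, sanitized, and alphanumeric-only) of the tool against the same variants of each list entry. A non-empty whitelist must match, and the blacklist always wins. The sketch below reproduces that logic outside the pipeline. `sanitize_tool_name` here is an assumed stand-in, since the real `Types::ToolDefinition.sanitize_tool_name` rules are not shown in this diff, and `Array#&` is used instead of `intersect?` so the example also runs on older Rubies.

```ruby
# Assumption: replaces characters outside [A-Za-z0-9_-] with underscores.
def sanitize_tool_name(value)
  value.to_s.strip.gsub(/[^a-zA-Z0-9_-]/, '_')
end

# Produces the comparable forms of a tool name or list entry.
def policy_variants(value)
  raw = value.to_s.strip.downcase
  sanitized = sanitize_tool_name(value).downcase
  compact = raw.gsub(/[^a-z0-9]/, '')
  [raw, sanitized, compact].reject(&:empty?).uniq
end

# Whitelist (when non-empty) must match; any blacklist match rejects the tool.
def passthrough_allowed?(tool_name, whitelist: [], blacklist: [])
  names = policy_variants(tool_name)
  allow = whitelist.flat_map { |entry| policy_variants(entry) }.uniq
  deny = blacklist.flat_map { |entry| policy_variants(entry) }.uniq

  return false if allow.any? && (names & allow).empty?
  return false unless (names & deny).empty?

  true
end

passthrough_allowed?('legionio_do', blacklist: ['legionio do'])
```

Comparing variants on both sides is what lets a blacklist entry such as `legionio do` also block a client tool that arrives already sanitized as `legionio_do`.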
@@ -997,8 +1049,16 @@ module Legion
997
1049
  )
998
1050
  return
999
1051
  end
1052
+ if non_executable_client_tool?(definition) && !client_tool_passthrough_allowed?(definition)
1053
+ log.info(
1054
+ "[llm][tools][inject] action=client_tool_skipped request_id=#{request_log_value(:id, 'unknown')} " \
1055
+ "conversation_id=#{request_log_value(:conversation_id, 'none') || 'none'} name=#{definition.name} " \
1056
+ 'reason=client_passthrough_policy'
1057
+ )
1058
+ return
1059
+ end
1000
1060
  return if gaia_tool_suppressed?(definition.name)
1001
- return if definitions.any? { |existing| existing.name == definition.name }
1061
+ return if native_tool_definition_duplicate?(definitions, definition)
1002
1062
 
1003
1063
  @injected_tool_map[definition.name] = definition.source[:tool_class] if definition.source[:tool_class]
1004
1064
  @native_tool_source_map[definition.name] = definition.source
@@ -1023,6 +1083,24 @@ module Legion
1023
1083
  handle_exception(e, level: :error, operation: 'llm.pipeline.native_registry_tools')
1024
1084
  end
1025
1085
 
1086
+ def native_tool_definition_duplicate?(definitions, definition)
1087
+ candidate_names = native_tool_definition_name_variants(definition)
1088
+ definitions.any? do |existing|
1089
+ native_tool_definition_name_variants(existing).intersect?(candidate_names)
1090
+ end
1091
+ end
1092
+
1093
+ def native_tool_definition_name_variants(definition)
1094
+ variants = client_tool_passthrough_name_variants(definition)
1095
+ source = definition.respond_to?(:source) ? definition.source : {}
1096
+ source_type = nil
1097
+ source_type = source[:type] || source['type'] if source.is_a?(Hash)
1098
+ if source_type.respond_to?(:to_sym) && source_type.to_sym == :special
1099
+ variants += Tools::Special.aliases_for(definition.name).flat_map { |name| client_tool_policy_variants(name) }
1100
+ end
1101
+ variants.uniq
1102
+ end
1103
+
1026
1104
  def add_settings_extensions_tool_definitions(definitions)
1027
1105
  existing_names = definitions.map(&:name)
1028
1106
  inject_limit = registry_tool_limit
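
Note on the hunk above: duplicate detection changes from exact name equality to an overlap test between sets of name variants, with aliases added for `:special` tools. A rough sketch of that comparison follows. `ToolDefSketch`, `ALIASES_SKETCH`, and the simplified variant logic are illustrative assumptions; only `Array#intersect?` (Ruby 3.1+) is taken directly from the diff.

```ruby
# Illustrative stand-ins; the real definitions come from the tool registry.
ToolDefSketch = Struct.new(:name, :source)
ALIASES_SKETCH = { 'python' => %w[python python3] }.freeze

# Collect the definition name, its raw name if any, and aliases for
# special tools, then reduce each to lowercase and alphanumeric-only forms.
def name_variants_sketch(definition)
  names = [definition.name, definition.source[:raw_name]].compact
  names += ALIASES_SKETCH.fetch(definition.name, []) if definition.source[:type] == :special
  names.flat_map { |n| [n.downcase, n.downcase.gsub(/[^a-z0-9]/, '')] }.uniq
end

# A candidate is a duplicate when any of its variants overlaps an existing one.
def duplicate_sketch?(definitions, candidate)
  candidate_names = name_variants_sketch(candidate)
  definitions.any? { |existing| name_variants_sketch(existing).intersect?(candidate_names) }
end

existing = [ToolDefSketch.new('python', { type: :special }),
            ToolDefSketch.new('read_file', {})]

p duplicate_sketch?(existing, ToolDefSketch.new('Python3', {}))   # alias overlap
p duplicate_sketch?(existing, ToolDefSketch.new('read-file', {})) # punctuation-only difference
p duplicate_sketch?(existing, ToolDefSketch.new('ruby', {}))      # unrelated name
```

Under exact equality, `python3` and `read-file` would both have been injected alongside their existing counterparts.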
@@ -1105,7 +1183,8 @@ module Legion
1105
1183
  result: native_tool_result_content(result),
1106
1184
  tool_call_id: normalized_call[:id],
1107
1185
  tool_name: normalized_call[:name],
1108
- started_at: Thread.current[:legion_current_tool_started_at]
1186
+ started_at: Thread.current[:legion_current_tool_started_at],
1187
+ status: result[:status] || result['status']
1109
1188
  )
1110
1189
  )
1111
1190
  result
@@ -1428,12 +1507,13 @@ module Legion
1428
1507
  started_at = tool_result.respond_to?(:started_at) ? tool_result.started_at : Thread.current[:legion_current_tool_started_at]
1429
1508
  finished_at = Time.now
1430
1509
  raw = tool_result.respond_to?(:result) ? tool_result.result : tool_result
1510
+ status = tool_result.respond_to?(:status) ? tool_result.status : nil
1431
1511
  duration_ms = started_at ? ((finished_at - started_at) * 1000).round : nil
1432
1512
 
1433
1513
  result_str = (raw.is_a?(String) ? raw : raw.to_s)
1434
1514
  result_str = result_str.encode('UTF-8', invalid: :replace, undef: :replace, replace: '�') unless result_str.valid_encoding?
1435
1515
  result_str = result_str.delete("\x00")
1436
- is_error = raw.is_a?(Hash) && (raw[:error] || raw['error']) ? true : false
1516
+ is_error = status.to_s == 'error' || (raw.is_a?(Hash) && (raw[:error] || raw['error']) ? true : false)
1437
1517
 
1438
1518
  @pending_tool_history_mutex.synchronize do
1439
1519
  entry = @pending_tool_history.find { |e| e[:tool_call_id] == tc_id && e[:result].nil? }
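
Note on the hunk above: `is_error` now also honors an explicit `status` carried on the tool result, in addition to the earlier check for an `error` key on a Hash payload. A small sketch of the combined rule, with a hypothetical method name:

```ruby
# An explicit "error" status wins; otherwise fall back to the Hash check.
def tool_error_sketch?(raw, status)
  return true if status.to_s == 'error'

  raw.is_a?(Hash) && (raw[:error] || raw['error']) ? true : false
end

p tool_error_sketch?('connection refused', :error) # String payload, explicit status
p tool_error_sketch?({ 'error' => 'boom' }, nil)   # Hash payload, no status
p tool_error_sketch?('all good', nil)              # neither signal present
```

The first case is the one the old expression could not flag: a failure reported as a plain String has no `error` key to inspect.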
@@ -1457,7 +1537,7 @@ module Legion
1457
1537
 
1458
1538
  @tool_event_handler&.call(
1459
1539
  type: :tool_result, tool_call_id: tc_id, tool_name: tc_name,
1460
- result: result_str[0, 4096], result_size: result_str.bytesize,
1540
+ result: result_str[0, 4096], result_size: result_str.bytesize, status: is_error ? :error : :success,
1461
1541
  started_at: started_at, finished_at: finished_at, duration_ms: duration_ms
1462
1542
  )
1463
1543
 
@@ -2048,16 +2128,22 @@ module Legion
2048
2128
  end
2049
2129
 
2050
2130
  def response_tool_calls
2051
- # Prefer typed ToolCall objects from pending history (already built during execution)
2131
+ raw_tool_calls = @raw_response.respond_to?(:tool_calls) ? @raw_response.tool_calls : nil
2132
+ return build_response_tool_calls(raw_tool_calls) if raw_tool_calls&.any?
2133
+
2134
+ # Fall back to typed ToolCall objects from pending history when the final
2135
+ # model response completed after server-side tool execution.
2052
2136
  typed_from_history = @pending_tool_history
2053
2137
  .filter_map { |entry| entry[:typed_call] }
2054
2138
  return typed_from_history if typed_from_history.any?
2055
2139
 
2056
- return [] unless @raw_response.respond_to?(:tool_calls) && @raw_response.tool_calls
2140
+ []
2141
+ end
2057
2142
 
2143
+ def build_response_tool_calls(tool_calls)
2058
2144
  tool_timeline = build_tool_timeline_index
2059
2145
 
2060
- Array(@raw_response.tool_calls).map do |tool_call|
2146
+ Array(tool_calls).map do |tool_call|
2061
2147
  tc_id = tool_call[:id] || tool_call['id']
2062
2148
  tc_name = tool_call[:name] || tool_call['name']
2063
2149
  tc_args = tool_call[:arguments] || tool_call['arguments'] || {}
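
Note on the hunk above: `response_tool_calls` reverses its precedence. Tool calls on the raw provider response are now preferred, and the typed calls recorded in pending history serve as the fallback. The selection order can be sketched as below; the method name and the tagged return value are illustrative only.

```ruby
# Returns a [source, calls] pair so the chosen branch is visible.
def pick_tool_calls_sketch(raw_tool_calls, pending_history)
  # Safe navigation covers a nil list; an empty list also falls through.
  return [:raw, raw_tool_calls] if raw_tool_calls&.any?

  typed = pending_history.filter_map { |entry| entry[:typed_call] }
  return [:history, typed] if typed.any?

  [:none, []]
end

history = [{ typed_call: 'typed-1' }, { typed_call: nil }]

p pick_tool_calls_sketch([{ id: 'call_1', name: 'read_file' }], history)
p pick_tool_calls_sketch(nil, history)
p pick_tool_calls_sketch([], [])
```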
@@ -114,11 +114,27 @@ module Legion
114
114
  text = latest_user_text.to_s.downcase
115
115
  return if text.empty?
116
116
 
117
- native_dispatch_tools.keys.map(&:to_s).find do |tool_name|
118
- text.include?(tool_name.downcase)
117
+ native_dispatch_tools.keys.map(&:to_s).sort_by { |tool_name| -tool_name.length }.find do |tool_name|
118
+ explicit_tool_name_mentioned?(text, tool_name)
119
119
  end
120
120
  end
121
121
 
122
+ def explicit_tool_name_mentioned?(text, tool_name)
123
+ explicit_tool_name_candidates(tool_name).any? do |candidate|
124
+ text.match?(/(?<![[:alnum:]_-])#{Regexp.escape(candidate)}(?![[:alnum:]_-])/)
125
+ end
126
+ end
127
+
128
+ def explicit_tool_name_candidates(tool_name)
129
+ normalized_name = tool_name.to_s.downcase
130
+ [
131
+ normalized_name,
132
+ normalized_name.tr('_-', ' '),
133
+ normalized_name.tr('_', '-'),
134
+ normalized_name.tr('-', '_')
135
+ ].reject(&:empty?).uniq
136
+ end
137
+
122
138
  def latest_user_text
123
139
  message = Array(@request.messages).reverse.find do |msg|
124
140
  msg.is_a?(Hash) && (msg[:role] || msg['role']).to_s == 'user'
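
Note on the hunk above: the substring test `text.include?(tool_name.downcase)` is replaced by a boundary-aware match tried longest name first, so a short tool name no longer fires inside a longer word. The sketch below reuses the regular expression and candidate spellings from the diff; the `_sketch` method names and sample tool names are illustrative.

```ruby
# Spellings to look for: as-is, separators as spaces, "_" as "-", "-" as "_".
def candidates_sketch(tool_name)
  n = tool_name.to_s.downcase
  [n, n.tr('_-', ' '), n.tr('_', '-'), n.tr('-', '_')].reject(&:empty?).uniq
end

# A candidate counts only when it is not glued to a letter, digit, "_" or "-".
def mentioned_sketch?(text, tool_name)
  candidates_sketch(tool_name).any? do |candidate|
    text.match?(/(?<![[:alnum:]_-])#{Regexp.escape(candidate)}(?![[:alnum:]_-])/)
  end
end

# Longest names are tried first so a more specific tool wins over its prefix.
def first_mentioned_sketch(text, tool_names)
  lowered = text.downcase
  tool_names.sort_by { |name| -name.length }.find { |name| mentioned_sketch?(lowered, name) }
end

p first_mentioned_sketch('Please run pip install requests', %w[pip python])
p first_mentioned_sketch('Check the pipeline status', %w[pip])
p first_mentioned_sketch('Show computer use session info',
                         %w[computer_use_session computer_use_session_info])
```

The third call shows why the sort matters: with separators read as spaces, both names match the text, and only the ordering makes the longer one win.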
@@ -8,6 +8,16 @@ module Legion
8
8
  module Settings
9
9
  extend Legion::Logging::Helper
10
10
 
11
+ CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT = [
12
+ 'sudo', 'visudo', 'su', 'legion', 'legionio', 'legionio do', 'legionio/legion',
13
+ 'computer_use_session', 'computer_use_control', 'computer_use_session_info',
14
+ 'computer_use_session_message', 'plugin__aithena__recall', 'plugin__aithena__remember',
15
+ 'plugin__aithena__skill_search', 'plugin__aithena__skill_feedback', 'plugin__aithena__memory_stats',
16
+ 'plugin__cron__create', 'plugin__cron__list', 'plugin__cron__get', 'plugin__cron__update',
17
+ 'plugin__cron__delete', 'plugin__cron__get_history', 'plugin__cron__run_now', 'plugin__cron__stop'
18
+ ].freeze
19
+ CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT = [].freeze
20
+
11
21
  def self.default
12
22
  model_override = ENV.fetch('ANTHROPIC_MODEL', nil)
13
23
  {
@@ -482,10 +492,12 @@ module Legion
482
492
 
483
493
  def self.tool_trigger_defaults
484
494
  {
485
- scan_depth: 10,
486
- tool_limit: 25,
487
- local_tool_limit: 100,
488
- client_tool_passthrough: false
495
+ scan_depth: 10,
496
+ tool_limit: 25,
497
+ local_tool_limit: 100,
498
+ client_tool_passthrough: false,
499
+ client_tool_passthrough_whitelist: CLIENT_TOOL_PASSTHROUGH_WHITELIST_DEFAULT.dup,
500
+ client_tool_passthrough_blacklist: CLIENT_TOOL_PASSTHROUGH_BLACKLIST_DEFAULT.dup
489
501
  }
490
502
  end
491
503
 
@@ -19,6 +19,10 @@ module Legion
19
19
  LIST_ALL_TOOLS_NAME = 'legion_list_all_tools'
20
20
  DEFAULT_TIMEOUT_MS = 120_000
21
21
  MAX_TIMEOUT_MS = 600_000
22
+ TOOL_ALIASES = {
23
+ 'python' => %w[python python3],
24
+ 'pip' => %w[pip pip3]
25
+ }.freeze
22
26
  PYTHON_PACKAGES = %w[
23
27
  python-pptx
24
28
  python-docx
@@ -60,6 +64,11 @@ module Legion
60
64
  { status: :error, result: e.message }
61
65
  end
62
66
 
67
+ def aliases_for(tool_name)
68
+ normalized = normalize_tool_name(tool_name)
69
+ TOOL_ALIASES.fetch(normalized, [normalized])
70
+ end
71
+
63
72
  def inventory
64
73
  {
65
74
  special_tools: special_tool_summaries,
@@ -2,6 +2,6 @@
2
2
 
3
3
  module Legion
4
4
  module LLM
5
- VERSION = '0.9.37'
5
+ VERSION = '0.9.51'
6
6
  end
7
7
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: legion-llm
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.9.37
4
+ version: 0.9.51
5
5
  platform: ruby
6
6
  authors:
7
7
  - Esity
@@ -23,6 +23,20 @@ dependencies:
23
23
  - - ">="
24
24
  - !ruby/object:Gem::Version
25
25
  version: '0'
26
+ - !ruby/object:Gem::Dependency
27
+ name: event_stream_parser
28
+ requirement: !ruby/object:Gem::Requirement
29
+ requirements:
30
+ - - "~>"
31
+ - !ruby/object:Gem::Version
32
+ version: '1'
33
+ type: :runtime
34
+ prerelease: false
35
+ version_requirements: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - "~>"
38
+ - !ruby/object:Gem::Version
39
+ version: '1'
26
40
  - !ruby/object:Gem::Dependency
27
41
  name: faraday
28
42
  requirement: !ruby/object:Gem::Requirement