legion-llm 0.8.29 → 0.8.32
This diff shows the content of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.
- checksums.yaml +4 -4
- data/.gitignore +1 -0
- data/CHANGELOG.md +22 -0
- data/README.md +4 -1
- data/legion-llm.gemspec +1 -0
- data/lib/legion/llm/api/native/helpers.rb +81 -9
- data/lib/legion/llm/api/native/inference.rb +5 -2
- data/lib/legion/llm/call/embeddings.rb +14 -1
- data/lib/legion/llm/call/providers.rb +47 -28
- data/lib/legion/llm/call/structured_output.rb +15 -3
- data/lib/legion/llm/inference/executor.rb +37 -32
- data/lib/legion/llm/router.rb +1 -1
- data/lib/legion/llm/settings.rb +3 -2
- data/lib/legion/llm/tools/adapter.rb +15 -0
- data/lib/legion/llm/version.rb +1 -1
- metadata +15 -24
- data/docs/2026-03-23-pipeline-gap-analysis.md +0 -203
- data/docs/example_settings.json +0 -16
- data/docs/examples/anthropic_request.json +0 -108
- data/docs/examples/anthropic_response.json +0 -90
- data/docs/examples/azure_ai_request.json +0 -103
- data/docs/examples/azure_ai_response.json +0 -91
- data/docs/examples/bedrock_request.json +0 -127
- data/docs/examples/bedrock_response.json +0 -93
- data/docs/examples/gemini_request.json +0 -127
- data/docs/examples/gemini_response.json +0 -109
- data/docs/examples/openai_request.json +0 -100
- data/docs/examples/openai_response.json +0 -77
- data/docs/examples/xai_request.json +0 -93
- data/docs/examples/xai_response.json +0 -48
- data/docs/gas-apollo-idea.md +0 -528
- data/docs/generation-augmented-storage.md +0 -135
- data/docs/llm-schema-spec.md +0 -2816
- data/docs/plans/2026-03-15-ollama-discovery-design.md +0 -164
- data/docs/plans/2026-03-15-ollama-discovery-implementation.md +0 -1147
- data/docs/routing-reenvisioned.md +0 -861
- data/docs/superpowers/plans/2026-04-15-sticky-runners-tool-history.md +0 -1866
- data/docs/superpowers/specs/2026-04-15-sticky-runners-tool-history-design.md +0 -713
- data/legion-llm-0.3.20.gem +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f73cca457b602ab79cd3964a15333afe338a7ed5ef18a556f0e502da7817b4ef
+  data.tar.gz: 706498c91924640dc31bf43e814a04f2880acb2930cf30abdf144c28cb02fd01
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3b499fde4636085676157fdad1576ba60ed0149f2e9d126289b38399b4232ce64ed47e66250c7bf0dd01c87e15afff675c0d3f0ddc5d6d152ec17640af615fd2
+  data.tar.gz: 5b26a4426f630a3e36dedd6bdeacae7b6a2283696f4a9771d5458bdd2c8735380984ee983a10c9483ea2cb373eaa5a4e54c0cc92cac5d1a0b04f5835a7920850
data/.gitignore
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,27 @@
 # Legion LLM Changelog
 
+## [0.8.32] - 2026-04-27
+
+### Fixed
+- Embedding calls now return a clear unavailable-provider error when no embedding provider is configured or detected, preventing RubyLLM from implicitly selecting a chat/default provider.
+
+## [0.8.31] - 2026-04-27
+
+### Fixed
+- Embedding calls no longer inherit the chat `llm.default_provider`, preventing vLLM or other chat defaults from receiving embedding traffic unless explicitly configured for embeddings. Fixes #104
+
+## [0.8.30] - 2026-04-27
+
+### Fixed
+- Structured output parsing now strips markdown code fences before JSON parse, including retry responses from models that keep returning fenced JSON.
+- LLM tool adapter dispatch now symbolizes JSON/string-keyed tool arguments before invoking Ruby keyword-argument tool classes.
+- Default routing chains now honor explicit `default_provider` / `default_model` before auto-enabled local providers, preventing Ollama defaults from overriding a configured Bedrock default.
+- Provider credential setup now resolves `env://` placeholders consistently for Bedrock SigV4, Anthropic, OpenAI, Gemini, Azure, and vLLM, and unresolved placeholder arrays no longer auto-enable hosted providers.
+- Native `/api/llm/inference` responses now flatten structured provider content blocks into plain text for both streaming SSE deltas and non-streaming JSON responses, preventing Anthropic/Bedrock-style block arrays from being stored and replayed as nested JSON-looking assistant replies.
+- Native `/api/llm/inference` streaming now emits `thinking-delta` SSE events for provider reasoning chunks without appending those chunks to final assistant content.
+- Native `file_read` client tools now extract text from PDFs via `pdf-reader` and return a clear unsupported-binary message for non-text binary files.
+- Local providers now cap automatically injected registry tools with `llm.tool_trigger.local_tool_limit`, prioritizing trigger-matched tools before always-loaded tools for Ollama/vLLM requests.
+
 ## [0.8.29] - 2026-04-27
 
 ### Added
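The 0.8.30 note above about symbolizing JSON/string-keyed tool arguments can be sketched as follows. This is an illustrative pattern, not the gem's actual `tools/adapter.rb` code, and the helper name is hypothetical:

```ruby
# JSON.parse yields string keys, but keyword-argument tool classes
# (e.g. def call(path:, max_length: nil)) need symbol keys, so the
# arguments are symbolized once before dispatch.
def symbolize_tool_args(args)
  return {} unless args.is_a?(Hash)

  args.each_with_object({}) { |(key, value), out| out[key.to_sym] = value }
end

raw = { "path" => "/tmp/report.txt", "max_length" => 200 }
symbolize_tool_args(raw) # => { path: "/tmp/report.txt", max_length: 200 }
```

Without the conversion, splatting string keys into a keyword-only `call` raises an unknown-keyword error on Ruby 3.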
data/README.md
CHANGED
@@ -2,7 +2,7 @@
 
 LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension. Exposes OpenAI- and Anthropic-compatible API endpoints so external tools can point at the Legion daemon and just work.
 
-**Version**: 0.8.
+**Version**: 0.8.30
 
 ## Installation
 
@@ -60,6 +60,7 @@ Requests flow through the full Inference pipeline — routing, metering, audit,
 Both formats supported with correct SSE shapes:
 - **OpenAI**: `data: {"choices":[{"delta":{"content":"..."}}]}` chunks, terminated by `data: [DONE]`
 - **Anthropic**: Typed events — `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`
+- **Native**: `/api/llm/inference` streams `text-delta`, `thinking-delta`, tool lifecycle events, and a final `done` event. Structured provider content blocks are flattened to plain text in both streaming and non-streaming native responses so `content` remains a string for daemon clients.
 
 ### API Authentication
 
@@ -851,6 +852,8 @@ No code changes are needed in consumers immediately. The aliases will be maintai
 | Azure AI | `azure` | `vault://`, `env://`, or direct | Azure OpenAI endpoint; `api_base` + `api_key` or `auth_token` |
 | Ollama | `ollama` | Local, no credentials needed | Local inference |
 
+`env://NAME` credential placeholders resolve at provider configuration time, including array fallbacks such as `["env://OPENAI_API_KEY", "env://CODEX_API_KEY"]`. Unresolved placeholders do not auto-enable hosted providers.
+
 ## Integration with LegionIO
 
 legion-llm follows the standard core gem lifecycle:
data/legion-llm.gemspec
CHANGED
@@ -35,6 +35,7 @@ Gem::Specification.new do |spec|
   spec.add_dependency 'lex-gemini'
   spec.add_dependency 'lex-knowledge'
   spec.add_dependency 'lex-openai'
+  spec.add_dependency 'pdf-reader'
   spec.add_dependency 'ruby_llm', '~> 1.13'
   spec.add_dependency 'tzinfo', '>= 2.0'
 end
data/lib/legion/llm/api/native/helpers.rb
CHANGED
@@ -10,12 +10,14 @@ module Legion
   module API
     module Native
       module ClientToolMethods
+        include Legion::Logging::Helper
+
         private
 
         def log_tool(level, ref, status, **details)
           parts = ["[tool][#{ref}] #{status}"]
           details.each { |k, v| parts << "#{k}=#{v}" }
-
+          log.public_send(level, parts.join(' '))
         end
 
         def summarize_tool_arg_keys(kwargs)
@@ -37,7 +39,7 @@
   end
 end
 
-def dispatch_client_tool(ref, **kwargs) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity
+def dispatch_client_tool(ref, **kwargs) # rubocop:disable Metrics/AbcSize,Metrics/CyclomaticComplexity,Metrics/MethodLength,Metrics/PerceivedComplexity
   case ref
   when 'sh'
     cmd = kwargs[:command] || kwargs[:cmd] || kwargs.values.first.to_s
@@ -45,7 +47,7 @@
   "exit=#{status.exitstatus}\n#{output}"
 when 'file_read'
   path = kwargs[:path] || kwargs[:file_path] || kwargs.values.first.to_s
-
+  read_client_file(path)
 when 'file_write'
   path = kwargs[:path] || kwargs[:file_path]
   content = kwargs[:content] || kwargs[:contents]
@@ -82,6 +84,7 @@
     max_length ? content[0, max_length] : content
   rescue LoadError => e
     missing = e.respond_to?(:path) && e.path ? e.path : 'legion/cli/chat/web_fetch'
+    handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.web_fetch', missing: missing)
     "web_fetch is unavailable: missing optional dependency #{missing}"
   end
 when 'web_search'
@@ -93,6 +96,7 @@
     results[:results].map { |r| "### #{r[:title]}\n#{r[:url]}\n#{r[:snippet]}" }.join("\n\n")
   rescue LoadError => e
     missing = e.respond_to?(:path) && e.path ? e.path : 'legion/cli/chat/web_search'
+    handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.web_search', missing: missing)
     "web_search is unavailable: missing optional dependency #{missing}"
   end
 else
@@ -100,6 +104,51 @@
   end
 end
 
+def read_client_file(path)
+  return "File not found: #{path}" unless ::File.exist?(path)
+
+  return read_pdf_text(path) if pdf_file?(path)
+
+  content = ::File.binread(path)
+  return 'Binary file detected, cannot read as text.' if binary_content?(content)
+
+  content.force_encoding('UTF-8')
+  content
+rescue StandardError => e
+  handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.file_read', path: path)
+  "file_read error: #{e.message}"
+end
+
+def pdf_file?(path)
+  ::File.extname(path).casecmp('.pdf').zero? || ::File.binread(path, 5) == '%PDF-'
+rescue StandardError => e
+  handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_sniff', path: path)
+  false
+end
+
+def read_pdf_text(path)
+  require 'pdf-reader' unless defined?(::PDF::Reader)
+
+  reader = ::PDF::Reader.new(path)
+  text = reader.pages.map(&:text).join("\n\n").strip
+  text.empty? ? 'PDF contained no extractable text.' : text
+rescue LoadError => e
+  missing = e.respond_to?(:path) && e.path ? e.path : 'pdf-reader'
+  handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_extract', missing: missing)
+  'PDF text extraction unavailable: missing pdf-reader gem.'
+rescue StandardError => e
+  handle_exception(e, level: :warn, handled: true, operation: 'llm.api.client_tool.pdf_extract', path: path)
+  "PDF text extraction failed: #{e.message}"
+end
+
+def binary_content?(content)
+  return true if content.include?("\x00")
+
+  sample = content.byteslice(0, 4096).to_s
+  sample.force_encoding('UTF-8')
+  !sample.valid_encoding?
+end
+
 def notify_tool_event(type, ref, **data)
   handler = Thread.current[:legion_tool_event_handler]
   return unless handler
@@ -257,13 +306,14 @@
 rescue StandardError => e
   ms = begin
     ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - t0) * 1000).round(1)
-  rescue StandardError
+  rescue StandardError => e
+    handle_exception(e, level: :warn, handled: true,
+                     operation: 'llm.api.client_tool.duration_measurement', tool_ref: tool_ref)
     nil
   end
   log_tool(:error, tool_ref, 'failed', duration_ms: ms, error: e.message)
   notify_tool_event(:tool_error, tool_ref, error: e.message)
-
-                   component_type: :api)
+  handle_exception(e, level: :error, handled: true, operation: "llm.api.client_tool.#{tool_ref}")
   "Tool error: #{e.message}"
 end
 end
@@ -287,6 +337,25 @@
   end
 end
 
+define_method(:extract_text_content) do |content|
+  case content
+  when nil
+    ''
+  when String
+    content
+  when Array
+    content.filter_map { |entry| extract_text_content(entry) }.join
+  when Hash
+    type = content[:type] || content['type']
+    return '' unless type.nil? || type.to_s == 'text'
+
+    text = content.key?(:text) || content.key?('text') ? (content[:text] || content['text']) : (content[:content] || content['content'])
+    extract_text_content(text)
+  else
+    content.to_s
+  end
+end
+
 define_method(:emit_sse_event) do |stream, event_name, payload|
   level = event_name == 'text-delta' ? :debug : :info
   log.send(level, "[sse][emit] event=#{event_name} keys=#{payload.is_a?(Hash) ? payload.keys.join(',') : 'n/a'}")
@@ -333,7 +402,8 @@
 
 kerb = begin
   Legion::Settings.dig(:kerberos, :username)
-rescue StandardError
+rescue StandardError => e
+  handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.kerberos_username')
   nil
 end
 return "user:#{kerb}" if kerb.is_a?(String) && !kerb.empty?
@@ -354,14 +424,16 @@
 define_method(:resolve_requested_by) do |rack_env, identity_string|
   hostname = begin
     Legion::Settings[:client][:hostname]
-  rescue StandardError
+  rescue StandardError => e
+    handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.client_hostname')
     Socket.gethostname
   end
   username = identity_string.delete_prefix('user:')
 
   kerb = begin
     Legion::Settings.dig(:kerberos, :username)
-  rescue StandardError
+  rescue StandardError => e
+    handle_exception(e, level: :warn, handled: true, operation: 'llm.api.identity.requested_by_kerberos')
     nil
   end
   if kerb.is_a?(String) && !kerb.empty?
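The `extract_text_content` helper added in the hunk above recursively flattens provider content blocks into a plain string. The same logic as a standalone method, extracted here so the behaviour is easy to verify in isolation:

```ruby
# Flatten provider content (Anthropic/Bedrock-style arrays of typed
# hashes) into plain text, ignoring non-text blocks such as tool_use.
def extract_text_content(content)
  case content
  when nil then ''
  when String then content
  when Array then content.filter_map { |entry| extract_text_content(entry) }.join
  when Hash
    type = content[:type] || content['type']
    return '' unless type.nil? || type.to_s == 'text'

    text = content.key?(:text) || content.key?('text') ? (content[:text] || content['text']) : (content[:content] || content['content'])
    extract_text_content(text)
  else
    content.to_s
  end
end

blocks = [{ 'type' => 'text', 'text' => 'Hello ' },
          { 'type' => 'tool_use', 'name' => 'sh' },
          { 'type' => 'text', 'text' => 'world' }]
extract_text_content(blocks) # => "Hello world"
```

This is what keeps `content` a string for daemon clients instead of a nested JSON-looking block array.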
data/lib/legion/llm/api/native/inference.rb
CHANGED
@@ -150,7 +150,10 @@
 }
 
 pipeline_response = executor.call_stream do |chunk|
-
+  thinking = extract_text_content(chunk.thinking) if chunk.respond_to?(:thinking)
+  emit_sse_event(out, 'thinking-delta', { delta: thinking }) unless thinking.to_s.empty?
+
+  text = extract_text_content(chunk.respond_to?(:content) ? chunk.content : chunk)
   next if text.empty?
 
   full_text << text
@@ -195,7 +198,7 @@
 exec_ms = ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - exec_t0) * 1000).round
 log.debug("[llm][api][inference] action=executor_call duration_ms=#{exec_ms} request_id=#{request_id}")
 raw_msg = pipeline_response.message
-content = raw_msg.is_a?(Hash) ? (raw_msg[:content] || raw_msg['content']) : raw_msg
+content = extract_text_content(raw_msg.is_a?(Hash) ? (raw_msg[:content] || raw_msg['content']) : raw_msg)
 routing = pipeline_response.routing || {}
 tokens = pipeline_response.tokens || {}
 tool_calls = extract_tool_calls(pipeline_response)
data/lib/legion/llm/call/embeddings.rb
CHANGED
@@ -14,6 +14,7 @@
 return { vector: nil, model: model, provider: provider, error: 'LLM not started' } unless LLM.started?
 
 provider ||= resolve_provider
+return embedding_unavailable_result(model, provider) unless provider
 return { vector: nil, model: model, provider: provider, error: "provider #{provider} is disabled" } if provider_disabled?(provider)
 
 model ||= resolve_model(provider)
@@ -41,6 +42,8 @@
 return texts.map { |_| { vector: nil, error: 'LLM not started' } } unless LLM.started?
 
 provider ||= resolve_provider
+return unavailable_batch_result(texts, provider, model) unless provider
+
 disabled_result = disabled_batch_result(texts, provider, model)
 return disabled_result if disabled_result
@@ -74,6 +77,16 @@
   end
 end
 
+def embedding_unavailable_result(model, provider)
+  { vector: nil, model: model, provider: provider, error: 'No embedding provider configured' }
+end
+
+def unavailable_batch_result(texts, provider, model)
+  texts.each_with_index.map do |_, i|
+    embedding_unavailable_result(model, provider).merge(dimensions: 0, index: i)
+  end
+end
+
 def provider_disabled?(provider)
   return false unless provider
@@ -176,7 +189,7 @@
 configured = embedding_settings[:provider]
 return configured&.to_sym if configured
 
-
+nil
 rescue StandardError => e
   handle_exception(e, level: :debug, operation: 'llm.embeddings.resolve_provider')
   nil
data/lib/legion/llm/call/providers.rb
CHANGED
@@ -82,12 +82,16 @@
   usable_setting?(config[:api_key])
 end
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.credential_available_for', provider: provider)
   false
 end
 
 def usable_setting?(value)
-  !
+  !resolve_credential_value(value).nil?
+end
+
+def resolve_credential_value(value)
+  Call::ClaudeConfigLoader.resolve_setting_reference(value)
 end
 
 def env_present?(key)
@@ -104,7 +108,7 @@
 Socket.tcp(addr, port, connect_timeout: 1).close
 true
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.ollama_running', base_url: url)
   false
 end
 
@@ -120,7 +124,7 @@
 end.get('/health')
 response.success?
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.vllm_running', base_url: url)
   false
 end
 
@@ -139,9 +143,15 @@
 end
 
 def configure_bedrock(config)
-
-
+  resolved_api_key = resolve_credential_value(config[:api_key])
+  resolved_secret_key = resolve_credential_value(config[:secret_key])
+  resolved_session_token = resolve_credential_value(config[:session_token])
+  has_sigv4 = resolved_api_key && resolved_secret_key
+  has_bearer = resolve_credential_value(config[:bearer_token])
   config[:bearer_token] = has_bearer if has_bearer
+  config[:api_key] = resolved_api_key if resolved_api_key
+  config[:secret_key] = resolved_secret_key if resolved_secret_key
+  config[:session_token] = resolved_session_token if resolved_session_token
 
   unless has_sigv4 || has_bearer
     broker_creds = resolve_broker_aws_credentials
@@ -176,7 +186,7 @@
 
 def configure_anthropic(config)
   api_key = resolve_broker_credential(:anthropic) ||
-
+            resolve_credential_value(config[:api_key]) ||
             ENV.fetch('ANTHROPIC_API_KEY', nil)
   return unless api_key
 
@@ -189,7 +199,7 @@
 
 def configure_openai(config)
   api_key = resolve_broker_credential(:openai) ||
-
+            resolve_credential_value(config[:api_key]) ||
             ENV.fetch('OPENAI_API_KEY', nil) ||
             ENV.fetch('CODEX_API_KEY', nil)
   return unless api_key
@@ -202,7 +212,9 @@
 end
 
 def configure_gemini(config)
-  api_key = resolve_broker_credential(:gemini) ||
+  api_key = resolve_broker_credential(:gemini) ||
+            resolve_credential_value(config[:api_key]) ||
+            ENV.fetch('GEMINI_API_KEY', nil)
   return unless api_key
 
   RubyLLM.configure do |c|
@@ -214,8 +226,8 @@
 
 def configure_azure(config)
   api_base = config[:api_base]
-  api_key = resolve_broker_credential(:azure) || config[:api_key]
-  auth_token = config[:auth_token]
+  api_key = resolve_broker_credential(:azure) || resolve_credential_value(config[:api_key])
+  auth_token = resolve_credential_value(config[:auth_token])
   return unless api_base && (api_key || auth_token)
 
   RubyLLM.configure do |c|
@@ -235,9 +247,10 @@
 
 def configure_vllm(config)
   base_url = config[:base_url] || 'http://localhost:8000/v1'
+  api_key = resolve_credential_value(config[:api_key])
   RubyLLM.configure do |c|
     c.vllm_api_base = base_url
-    c.vllm_api_key =
+    c.vllm_api_key = api_key if api_key
   end
   log.info "[llm][providers] configured vllm base_url=#{base_url.inspect}"
 end
@@ -294,22 +307,22 @@
 case provider
 when :bedrock
   candidates = []
-  resolved_bearer =
+  resolved_bearer = resolve_credential_value(config[:bearer_token])
   bearer_env = ENV.fetch('AWS_BEARER_TOKEN_BEDROCK', nil)
   claude_bearer = Call::ClaudeConfigLoader.bedrock_bearer_token
   candidates += [resolved_bearer, bearer_env, claude_bearer].compact.uniq.map { |t| { bearer_token: t } }
-  api_key =
-  secret =
+  api_key = resolve_credential_value(config[:api_key])
+  secret = resolve_credential_value(config[:secret_key])
   candidates << { api_key: api_key, secret_key: secret } if api_key && secret
   candidates
 when :anthropic
   [
-
+    resolve_credential_value(config[:api_key]),
    ENV.fetch('ANTHROPIC_API_KEY', nil)
   ].compact.uniq.map { |k| { api_key: k } }
 when :openai
   keys = [
-
+    resolve_credential_value(config[:api_key]),
    ENV.fetch('OPENAI_API_KEY', nil),
    ENV.fetch('CODEX_API_KEY', nil),
    Call::CodexConfigLoader.read_token
@@ -317,14 +330,14 @@
   keys.map { |k| { api_key: k } }
 when :gemini
   [
-
+    resolve_credential_value(config[:api_key]),
    ENV.fetch('GEMINI_API_KEY', nil)
   ].compact.uniq.map { |k| { api_key: k } }
 else
   []
 end
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.collect_credential_candidates', provider: provider)
   []
 end
 
@@ -383,7 +396,7 @@
 end
 rescue StandardError => e
   log.warn "[llm][providers] health_check failed provider=#{provider} error=#{e.class}"
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.attempt_provider_call', provider: provider, model: model)
   false
 end
 
@@ -398,19 +411,25 @@
 return :ok if model_ids.any? { |id| id.include?(target_model) || target_model.include?(id) }
 
 :model_missing
-rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError
+rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError => e
+  handle_exception(e, level: :warn, handled: true,
+                   operation: 'llm.providers.probe_via_model_list.auth', provider: provider)
   :auth_error
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.probe_via_model_list', provider: provider)
   probe_via_chat(provider, target_model)
 end
 
 def probe_via_chat(provider, model)
   RubyLLM.chat(model: model, provider: provider).ask('Respond with only the word: pong')
   :ok
-rescue RubyLLM::ModelNotFoundError
+rescue RubyLLM::ModelNotFoundError => e
+  handle_exception(e, level: :warn, handled: true,
+                   operation: 'llm.providers.probe_via_chat.model_missing', provider: provider, model: model)
   :model_missing
-rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError
+rescue RubyLLM::UnauthorizedError, RubyLLM::ForbiddenError => e
+  handle_exception(e, level: :warn, handled: true,
+                   operation: 'llm.providers.probe_via_chat.auth', provider: provider, model: model)
   :auth_error
 end
 
@@ -433,7 +452,7 @@
 ok = attempt_provider_call(:openai, openai_config[:default_model])
 openai_config[:enabled] = false unless ok
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.recover_openai_with_codex')
 end
 
 def auto_register_providers
@@ -485,7 +504,7 @@
 
 Legion::Identity::Broker.token_for(provider_name)
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: "llm.providers.broker_resolve.#{provider_name}")
   nil
 end
 
@@ -497,7 +516,7 @@
 
 nil
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.broker_resolve.aws')
   nil
 end
 
@@ -512,7 +531,7 @@
   !Legion::Identity::Broker.token_for(provider).nil?
 end
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.providers.broker_credential_available', provider: provider)
   false
 end
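The credential hunks above route every configured key through `resolve_credential_value`, which delegates to `Call::ClaudeConfigLoader.resolve_setting_reference`. A standalone sketch of how `env://` placeholder resolution with array fallbacks could work; this is illustrative only, and `resolve_env_placeholder` is a hypothetical name, not the gem's loader:

```ruby
# Resolve an "env://NAME" placeholder (or an array of fallbacks) to a
# concrete value. Unresolved placeholders yield nil rather than the
# literal string, so they cannot accidentally enable a provider.
def resolve_env_placeholder(value)
  case value
  when Array
    # Try each fallback in order, returning the first that resolves.
    value.lazy.map { |v| resolve_env_placeholder(v) }.find { |v| v }
  when %r{\Aenv://(.+)\z}
    env_value = ENV[Regexp.last_match(1)]
    env_value unless env_value.to_s.empty?
  else
    value
  end
end

resolve_env_placeholder(['env://OPENAI_API_KEY', 'env://CODEX_API_KEY'])
```

Returning `nil` for unresolved placeholders is what lets `usable_setting?` above treat them as absent credentials.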
data/lib/legion/llm/call/structured_output.rb
CHANGED
@@ -15,7 +15,7 @@
 result = call_with_schema(messages, schema, model, provider: provider, **)
 log.info "[llm][structured_output] model=#{model} provider=#{provider} valid=true"
 
-content = result.respond_to?(:content) ? result.content : result[:content]
+content = strip_markdown_fences(result.respond_to?(:content) ? result.content : result[:content])
 raw_model = result.respond_to?(:model_id) ? result.model_id : result[:model]
 
 parsed = Legion::JSON.load(content)
@@ -52,7 +52,7 @@
 if retry_enabled? && attempt < max_retries
   retry_with_instruction(messages, schema, model, provider: provider, attempt: attempt + 1, **opts)
 else
-  raw = result.respond_to?(:content) ? result&.content : result&.dig(:content)
+  raw = strip_markdown_fences(result.respond_to?(:content) ? result&.content : result&.dig(:content))
   { data: nil, error: "JSON parse failed: #{error.message}", raw: raw, valid: false }
 end
 end
@@ -64,7 +64,7 @@
 model: model, provider: provider, intent: nil, tier: nil,
 message: user_content, **opts.except(:attempt))
 
-retry_content = result.respond_to?(:content) ? result.content : result[:content]
+retry_content = strip_markdown_fences(result.respond_to?(:content) ? result.content : result[:content])
 retry_model = result.respond_to?(:model_id) ? result.model_id : result[:model]
 
 parsed = Legion::JSON.load(retry_content)
@@ -83,6 +83,18 @@
   parts.join("\n\n")
 end
 
+def strip_markdown_fences(text)
+  return text unless text.is_a?(String)
+
+  stripped = text.strip
+  return stripped unless stripped.start_with?('```')
+
+  stripped
+    .sub(/\A`{3,}[[:space:]]*(?:json)?[[:space:]]*\n?/i, '')
+    .sub(/\n?[[:space:]]*`{3,}\z/, '')
+    .strip
+end
+
 def supports_response_format?(model)
   SCHEMA_CAPABLE_MODELS.any? { |m| model.to_s.include?(m) }
 end