lex-llm-azure-foundry 0.2.0 → 0.2.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 816c51165a46a9e5f15c3e535da382dc2459f6cca39b53644b60b2fb090196b9
-   data.tar.gz: 5ddf21d6404a9a255d4e3ef30c0b3a4c36989c7e217558b2a00da050f5149093
+   metadata.gz: dc801cd438178d5431250e6b2dbfb3d2b6dce8af6fb7a266cd2bf65eeb24b7a9
+   data.tar.gz: '04928abba688736869565f51365019a470980ddf4502d866d0a3f169bb30524b'
  SHA512:
-   metadata.gz: 266af948f19a52cf3b7612c43641daab824a16ffd0b1a06a344184ef5724b30b1b138b85a915459f70fe431c4df80e2e388f884c64e7e10300ecbc590afbef1d
-   data.tar.gz: bf925fbf1fe563c47f6e4adf244e6a2468c0c5d198f273515ba98d91fb54a7029dc0315e3520ceda8ef289fd15791358ed441766e4639b60ac708f14c1f43c88
+   metadata.gz: 93d13277c3dfbe8bdb683f623d65800e8bedc771541573684f7181ff81fa49fb37f0a7c228aa87bf9f380f5685682760ba1e5b91720fb6e434bb9a28ae3c606c
+   data.tar.gz: cca5a4ea13dde3029a3402cec8dd58ced8592572236d7646cd6162f27473a143f434838c7495d67ec366553b647b142d99981631a30a9136eed0785479fd4c29
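For context, these are the standard RubyGems per-artifact SHA256/SHA512 digests. A minimal sketch of how such values are produced with Ruby's stdlib (the input string is illustrative, not the real gem payload):

```ruby
require 'digest'

# Compute the two digests that checksums.yaml records for a gem artifact.
def artifact_checksums(bytes)
  {
    sha256: Digest::SHA256.hexdigest(bytes), # 64 hex characters
    sha512: Digest::SHA512.hexdigest(bytes)  # 128 hex characters
  }
end

sums = artifact_checksums('example gem payload')
puts sums[:sha256].length # 64
puts sums[:sha512].length # 128
```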
@@ -8,8 +8,20 @@ jobs:
    ci:
      uses: LegionIO/.github/.github/workflows/ci.yml@main
 
+   excluded-files:
+     uses: LegionIO/.github/.github/workflows/excluded-files.yml@main
+
+   security:
+     uses: LegionIO/.github/.github/workflows/security-scan.yml@main
+
+   version-changelog:
+     uses: LegionIO/.github/.github/workflows/version-changelog.yml@main
+
+   dependency-review:
+     uses: LegionIO/.github/.github/workflows/dependency-review.yml@main
+
    release:
-     needs: ci
+     needs: [ci, excluded-files, security]
      if: github.event_name == 'push' && github.ref == 'refs/heads/main'
      uses: LegionIO/.github/.github/workflows/release.yml@main
      secrets:
data/CHANGELOG.md CHANGED
@@ -1,5 +1,33 @@
  # Changelog
 
+ ## 0.2.5 - 2026-05-06
+
+ - Load provider-owned fleet actors through the LegionIO subscription base and the canonical Azure Foundry provider root.
+ - Keep fleet runners anchored on the provider root namespace so provider constants and instance discovery are always loaded.
+ - Preserve configured transport and tier metadata when Azure Foundry builds routing offerings.
+ - Gate release publishing on the shared security workflow.
+
+ ## 0.2.4 - 2026-05-06
+
+ - Use the shared `lex-llm` fleet provider responder helper for provider-owned fleet workers.
+ - Remove the runtime `legion-llm` dependency and require `lex-llm >= 0.4.3` for responder-side fleet execution.
+
+ ## 0.2.3 - 2026-05-06
+
+ - Remove require-time provider self-registration; `legion-llm` now owns adapter creation and registry writes from loaded provider discovery metadata.
+ - Bump dependency floors to `lex-llm >= 0.4.1` and `legion-llm >= 0.9.1`.
+
+ ## 0.2.2 - 2026-05-06
+
+ - Enforce the shared keyword-only `lex-llm` provider contract for chat, embeddings, and token counting.
+ - Move defaults back to `Legion::Extensions::Llm.provider_settings` with credentials/provider metadata under the default instance and instance-level fleet responder settings.
+ - Add provider-owned fleet responder actor and runner backed by `legion-llm` fleet policy execution.
+ - Bump the transport dependency floor to `legion-transport >= 1.4.14`.
+
+ ## 0.2.1 - 2026-05-03
+
+ - Normalize generic settings keys to Azure Foundry provider config keys during instance discovery.
+
  ## 0.2.0 - 2026-05-01
 
  - Add auto-discovery via CredentialSources and AutoRegistration from lex-llm 0.3.0
data/Gemfile CHANGED
@@ -4,6 +4,8 @@ source 'https://rubygems.org'
 
  group :test do
    llm_base_path = ENV.fetch('LEX_LLM_PATH', File.expand_path('../lex-llm', __dir__))
+   transport_path = ENV.fetch('LEGION_TRANSPORT_PATH', File.expand_path('../../legion-transport', __dir__))
+   gem 'legion-transport', path: transport_path if File.directory?(transport_path)
    gem 'lex-llm', path: llm_base_path if File.directory?(llm_base_path)
  end
 
data/README.md CHANGED
@@ -2,153 +2,179 @@
 
  LegionIO LLM provider extension for Azure AI Foundry Models and Azure OpenAI hosted deployments.
 
- This gem lives under `Legion::Extensions::Llm::AzureFoundry` and depends on `lex-llm >= 0.1.5` for shared provider-neutral routing, fleet, model-offering, readiness, canonical-alias, and schema primitives.
+ This gem lives under `Legion::Extensions::Llm::AzureFoundry`. It depends on `lex-llm >= 0.4.3` for provider contracts, routing metadata, registry publishing helpers, and provider-owned fleet request handling. It does not require or depend on `legion-llm` at runtime; Legion LLM orchestration can load this provider gem and consume its discovery metadata.
 
- Load it with `require 'legion/extensions/llm/azure_foundry'`.
+ Load it with:
+
+ ```ruby
+ require 'legion/extensions/llm/azure_foundry'
+ ```
 
  ## What It Provides
 
- - `Legion::Extensions::Llm::Provider` registration as `:azure_foundry`
+ - Provider family `:azure_foundry`
  - Azure AI Foundry model inference chat completions through `POST /models/chat/completions?api-version=...`
  - Azure AI Foundry model inference embeddings through `POST /models/embeddings?api-version=...`
- - Azure AI Foundry model info health check through `GET /models/info?api-version=...` when `live: true`
+ - Azure AI Foundry model info health checks through `GET /models/info?api-version=...` when `live: true`
  - Azure OpenAI v1-compatible endpoint support through `/openai/v1/chat/completions` and `/openai/v1/embeddings`
- - Deployment-name-preserving routing offerings for hosted Azure deployments
+ - Offline-first offering discovery from configured deployments
+ - Deployment-name-preserving routing metadata for hosted Azure deployments
  - Explicit `model_family` and `canonical_model_alias` metadata for deployments whose base model cannot be proven from Azure metadata
- - Offline-first discovery from configured deployments
- - Shared OpenAI-compatible request and response mapping via `Legion::Extensions::Llm::Provider::OpenAICompatible`
- - Conservative token-counting metadata when no portable Azure token-counting REST endpoint is configured
- - Best-effort `llm.registry` event publishing for readiness and model availability via AMQP when transport is available
+ - Shared OpenAI-compatible request and response mapping through `Legion::Extensions::Llm::Provider::OpenAICompatible`
+ - Shared registry availability publishing through `Legion::Extensions::Llm::RegistryPublisher` when transport is available
+ - Provider-owned fleet request handling through `Legion::Extensions::Llm::Fleet::ProviderResponder`
 
  ## Architecture
 
- ```
+ ```text
  Legion::Extensions::Llm::AzureFoundry
- ├── Provider                  # Azure AI Foundry and Azure OpenAI hosted provider surface
- │   └── Capabilities          # Capability predicates inferred from deployment metadata and model naming
- ├── RegistryPublisher         # Best-effort async publisher for llm.registry availability events
- ├── RegistryEventBuilder      # Builds sanitized lex-llm registry envelopes for provider state
- ├── Transport/
- │   ├── Messages::RegistryEvent   # AMQP message for llm.registry events
- │   └── Exchanges::LlmRegistry    # Topic exchange for provider availability events
- └── VERSION
+ |-- Provider                  # Azure AI Foundry and Azure OpenAI hosted provider surface
+ |   `-- Capabilities          # Capability predicates inferred from deployment metadata and model naming
+ |-- Actor::FleetWorker        # Subscription actor for provider-owned fleet requests
+ |-- Runners::FleetWorker      # Runner entrypoint that delegates to lex-llm ProviderResponder
+ `-- VERSION
  ```
 
+ `AzureFoundry.discover_instances` reads `extensions.llm.azure_foundry` settings and returns provider instance configs. The base Legion LLM runtime can use those configs to populate the provider registry and routing inventory; this gem does not write `legion-llm` registry state itself at require time.
+
  ## File Map
 
  | Path | Purpose |
  |------|---------|
- | `lib/legion/extensions/llm/azure_foundry.rb` | Entry point, provider registration, default settings |
- | `lib/legion/extensions/llm/azure_foundry/provider.rb` | Provider implementation with chat, stream, embed, health, readiness, discovery |
- | `lib/legion/extensions/llm/azure_foundry/registry_publisher.rb` | Async registry event publishing with transport guards |
- | `lib/legion/extensions/llm/azure_foundry/registry_event_builder.rb` | Sanitized registry envelope construction |
- | `lib/legion/extensions/llm/azure_foundry/transport/messages/registry_event.rb` | AMQP message class for registry events |
- | `lib/legion/extensions/llm/azure_foundry/transport/exchanges/llm_registry.rb` | Topic exchange definition for llm.registry |
+ | `lib/legion/extensions/llm/azure_foundry.rb` | Entry point, provider defaults, instance discovery, shared registry publisher |
+ | `lib/legion/extensions/llm/azure_foundry/provider.rb` | Provider implementation with chat, stream, embed, health, readiness, model listing, and offering discovery |
+ | `lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb` | Subscription actor gated by ProviderResponder fleet settings |
+ | `lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb` | Fleet request runner that delegates execution to `ProviderResponder.call` |
  | `lib/legion/extensions/llm/azure_foundry/version.rb` | `VERSION` constant |
 
- ## Observability
-
- Every class and module uses `Legion::Logging::Helper`:
-
- - **AzureFoundry** module: `extend Legion::Logging::Helper`
- - **Provider**: inherits `include Legion::Logging::Helper` from `Legion::Extensions::Llm::Provider`
- - **RegistryPublisher**: `include Legion::Logging::Helper`
- - **RegistryEventBuilder**: `include Legion::Logging::Helper`
+ ## Configuration
 
- All rescue blocks call `handle_exception(e, level:, handled:, operation:)` for structured exception reporting. Key actions emit info-level log lines including discover_offerings, health checks, readiness, model listing, chat, stream, embed, and registry publish operations.
+ Configured instances can be supplied through Legion settings under `extensions.llm.azure_foundry`. A top-level endpoint creates a `:settings` instance; entries under `instances` create named instances.
+
+ ```yaml
+ extensions:
+   llm:
+     azure_foundry:
+       endpoint: https://example.services.ai.azure.com
+       api_key: env://AZURE_INFERENCE_CREDENTIAL
+       bearer_token: env://AZURE_FOUNDRY_BEARER_TOKEN
+       api_version: 2024-05-01-preview
+       surface: model_inference
+       deployments:
+         - deployment: gpt-4o-prod
+           model_family: openai
+           canonical_model_alias: gpt-4o
+           usage_type: inference
+         - deployment: embedding-prod
+           model_family: openai
+           canonical_model_alias: text-embedding-3-small
+           usage_type: embedding
+       instances:
+         prod:
+           endpoint: https://prod.services.ai.azure.com
+           api_key: env://AZURE_INFERENCE_CREDENTIAL
+           api_version: 2024-05-01-preview
+           surface: model_inference
+           deployments:
+             - deployment: gpt-4o-prod
+               model_family: openai
+               canonical_model_alias: gpt-4o
+               usage_type: inference
+           fleet:
+             enabled: true
+             respond_to_requests: true
+             capabilities:
+               - chat
+               - stream_chat
+               - embed
+ ```
 
- ## API Contract
+ The provider also supports direct configuration through `Legion::Extensions::Llm.configure` for tests and embedded use:
 
- The implementation follows Microsoft Learn REST documentation for Azure AI Foundry Models:
+ ```ruby
+ Legion::Extensions::Llm.configure do |config|
+   config.azure_foundry_endpoint = ENV.fetch('AZURE_FOUNDRY_ENDPOINT')
+   config.azure_foundry_api_key = ENV['AZURE_INFERENCE_CREDENTIAL']
+   config.azure_foundry_bearer_token = ENV['AZURE_FOUNDRY_BEARER_TOKEN']
+   config.azure_foundry_api_version = '2024-05-01-preview'
+   config.azure_foundry_surface = :model_inference
+   config.azure_foundry_deployments = [
+     {
+       deployment: 'gpt-4o-prod',
+       model_family: :openai,
+       canonical_model_alias: 'gpt-4o',
+       usage_type: :inference
+     }
+   ]
+ end
+ ```
 
- - Azure AI Foundry model inference endpoints use deployment names as the request `model`.
- - The model inference endpoint supports chat completions and embeddings.
- - The documented model-info endpoint is used only for explicit live health checks.
- - Azure deployment metadata is not assumed to reliably prove base model family or version, so routing metadata should be configured explicitly.
+ Use `:openai_v1` when the endpoint should be treated as the OpenAI v1-compatible Azure route. The provider appends `/openai/v1` when the configured endpoint does not already include it.
 
- ## Defaults
+ ## Default Settings
 
  ```ruby
  Legion::Extensions::Llm::AzureFoundry.default_settings
  # {
+ #   enabled: true,
  #   provider_family: :azure_foundry,
- #   discovery: { enabled: true, live: false },
  #   instances: {
  #     default: {
- #       endpoint: "https://<resource>.services.ai.azure.com",
- #       api_version: "2024-05-01-preview",
- #       surface: :model_inference,
+ #       endpoint: nil,
  #       tier: :frontier,
  #       transport: :http,
  #       credentials: {
- #         api_key: "env://AZURE_INFERENCE_CREDENTIAL",
- #         bearer_token: "env://AZURE_FOUNDRY_BEARER_TOKEN",
- #         entra_scope: "https://cognitiveservices.azure.com/.default"
+ #         api_key: nil,
+ #         bearer_token: nil
+ #       },
+ #       provider: {
+ #         api_version: "2024-05-01-preview",
+ #         surface: nil,
+ #         deployments: []
  #       },
- #       deployments: [],
- #       usage: { inference: true, embedding: true, token_counting: false },
- #       limits: { concurrency: 4 }
+ #       usage: { inference: true, embedding: true, image: false },
+ #       limits: { concurrency: 4 },
+ #       fleet: {
+ #         enabled: false,
+ #         respond_to_requests: false,
+ #         capabilities: [:chat, :stream_chat, :embed],
+ #         lanes: [],
+ #         concurrency: 4,
+ #         queue_suffix: nil
+ #       }
  #     }
  #   }
  # }
  ```
 
- ## Configuration
-
- ```ruby
- Legion::Extensions::Llm.configure do |config|
-   config.azure_foundry_endpoint = ENV.fetch("AZURE_FOUNDRY_ENDPOINT")
-   config.azure_foundry_api_key = ENV["AZURE_INFERENCE_CREDENTIAL"]
-   config.azure_foundry_bearer_token = ENV["AZURE_FOUNDRY_BEARER_TOKEN"]
-   config.azure_foundry_api_version = "2024-05-01-preview"
-   config.azure_foundry_surface = :model_inference
-   config.azure_foundry_deployments = [
-     {
-       deployment: "gpt-4o-prod",
-       model_family: :openai,
-       canonical_model_alias: "gpt-4o",
-       usage_type: :inference
-     },
-     {
-       deployment: "mistral-large-prod",
-       model_family: :mistral,
-       canonical_model_alias: "mistral-large",
-       usage_type: :inference
-     },
-     {
-       deployment: "embedding-prod",
-       model_family: :openai,
-       canonical_model_alias: "text-embedding-3-small",
-       usage_type: :embedding
-     }
-   ]
- end
- ```
-
- Use `config.azure_foundry_surface = :openai_v1` when the target endpoint should be treated as the OpenAI v1-compatible Azure route. The provider appends `/openai/v1` when the configured endpoint does not already include it.
-
  ## Provider Methods
 
  ```ruby
  provider = Legion::Extensions::Llm::AzureFoundry.provider_class.new(Legion::Extensions::Llm.config)
 
  provider.discover_offerings(live: false)
- provider.offering_for(model: "gpt-4o-prod", model_family: :openai, canonical_model_alias: "gpt-4o")
+ provider.offering_for(model: 'gpt-4o-prod', model_family: :openai, canonical_model_alias: 'gpt-4o')
  provider.health(live: false)
  provider.readiness(live: false)
  provider.list_models
- provider.chat(messages, model: "gpt-4o-prod")
- provider.stream(messages, model: "gpt-4o-prod") { |chunk| puts chunk.content }
- provider.embed(["hello"], model: "embedding-prod")
- provider.count_tokens(messages, model: "gpt-4o-prod")
+ provider.chat(messages: messages, model: 'gpt-4o-prod')
+ provider.stream(messages: messages, model: 'gpt-4o-prod') { |chunk| puts chunk.content }
+ provider.embed(text: ['hello'], model: 'embedding-prod')
+ provider.count_tokens(messages: messages, model: 'gpt-4o-prod')
  ```
 
- `discover_offerings(live: false)` never calls Azure. It maps configured deployments into `Legion::Extensions::Llm::Routing::ModelOffering` values with `provider_family: :azure_foundry`.
+ `discover_offerings(live: false)` does not call Azure. It maps configured deployments into `Legion::Extensions::Llm::Routing::ModelOffering` values with `provider_family: :azure_foundry`.
 
  `health(live: true)` calls the documented model-info endpoint for the configured model-inference surface. Keep `live: false` for startup paths and tests that must not require Azure.
 
  `count_tokens` returns a structured unsupported result by default because the Microsoft REST contract used here does not define a portable token-counting endpoint across Azure AI Foundry deployments.
 
+ ## Fleet Responder
+
+ Provider instances can opt in to consuming Legion LLM fleet requests. The actor is enabled only when at least one discovered instance has `fleet.respond_to_requests: true`.
+
+ Fleet execution is delegated to `Legion::Extensions::Llm::Fleet::ProviderResponder` from `lex-llm`; this provider supplies the provider family, provider class, discovered instances, and delivery metadata.
+
  ## Routing Metadata
 
  Azure deployments are aliases. A deployment name can hide provider, model, and version details, so this extension preserves the deployment name as `model` and treats `canonical_model_alias` and `model_family` as routing metadata.
@@ -163,3 +189,7 @@ Supported `model_family` values are intentionally open-ended symbols, including:
  - `:microsoft`
 
  When `model_family` or `canonical_model_alias` is missing, offerings include `requires_explicit_model_metadata: true`.
+
+ ## Failure Behavior
+
+ Live discovery and health-check failures are reported with `handle_exception(e, level: :warn, handled: true, operation: ...)` before returning degraded metadata. Offline discovery, provider configuration, and fleet actor enablement should not require live Azure connectivity.
@@ -26,5 +26,6 @@ Gem::Specification.new do |spec|
    spec.add_dependency 'legion-json', '>= 1.2.1'
    spec.add_dependency 'legion-logging', '>= 1.3.2'
    spec.add_dependency 'legion-settings', '>= 1.3.14'
-   spec.add_dependency 'lex-llm', '>= 0.3.0'
+   spec.add_dependency 'legion-transport', '>= 1.4.14'
+   spec.add_dependency 'lex-llm', '>= 0.4.3'
  end
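The gemspec pins dependency floors with `>=`. How RubyGems evaluates such a floor can be checked directly with its own `Gem::Requirement` and `Gem::Version` classes (an illustrative check, not part of the gem):

```ruby
require 'rubygems' # Gem::Requirement and Gem::Version ship with Ruby

# A ">=" requirement admits the floor itself and anything newer.
floor = Gem::Requirement.new('>= 0.4.3')

puts floor.satisfied_by?(Gem::Version.new('0.4.3')) # true
puts floor.satisfied_by?(Gem::Version.new('0.5.0')) # true
puts floor.satisfied_by?(Gem::Version.new('0.4.1')) # false
```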
@@ -0,0 +1,43 @@
+ # frozen_string_literal: true
+
+ begin
+   require 'legion/extensions/actors/subscription'
+ rescue LoadError => e
+   warn(e.message) if $VERBOSE
+ end
+
+ unless defined?(Legion::Extensions::Actors::Subscription)
+   raise LoadError, 'LegionIO actor runtime is required for Azure Foundry fleet worker'
+ end
+
+ require 'legion/extensions/llm/azure_foundry'
+ require 'legion/extensions/llm/fleet/provider_responder'
+
+ module Legion
+   module Extensions
+     module Llm
+       module AzureFoundry
+         module Actor
+           # Subscription actor for Azure Foundry fleet request consumption.
+           class FleetWorker < Legion::Extensions::Actors::Subscription
+             def runner_class
+               'Legion::Extensions::Llm::AzureFoundry::Runners::FleetWorker'
+             end
+
+             def runner_function
+               'handle_fleet_request'
+             end
+
+             def use_runner?
+               false
+             end
+
+             def enabled?
+               Legion::Extensions::Llm::Fleet::ProviderResponder.enabled_for?(AzureFoundry.discover_instances)
+             end
+           end
+         end
+       end
+     end
+   end
+ end
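The actor's `enabled?` defers to `ProviderResponder.enabled_for?` in `lex-llm`, whose source is not part of this diff. Based only on the README text in this release ("enabled only when at least one discovered instance has `fleet.respond_to_requests: true`"), the gating predicate can be sketched roughly like this; the helper name and exact key handling here are assumptions, not the lex-llm implementation:

```ruby
# Hypothetical sketch: enable the fleet worker only when at least one
# discovered instance opts in to answering fleet requests.
def responder_enabled_for?(instances)
  return false unless instances.is_a?(Hash)

  instances.any? do |_name, config|
    fleet = config[:fleet] || config['fleet'] || {}
    fleet[:respond_to_requests] || fleet['respond_to_requests']
  end
end

puts responder_enabled_for?(prod: { fleet: { respond_to_requests: true } }) # true
puts responder_enabled_for?(prod: { fleet: { enabled: true } })             # false
```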
@@ -197,25 +197,49 @@ module Legion
            models
          end
 
-         def chat(messages, model:, temperature: nil, max_tokens: nil, tools: {}, tool_prefs: nil, params: {}) # rubocop:disable Metrics/ParameterLists
+         def chat(
+           messages:,
+           model:,
+           **options
+         )
            log.info { "chat request model=#{model} messages=#{messages.size}" }
-           complete(messages, tools:, temperature:, model: model_info(model, max_tokens:), params:, tool_prefs:)
+           complete(messages, tools: options.fetch(:tools, {}), temperature: options[:temperature],
+                    model: model_info(model, max_tokens: options[:max_tokens]),
+                    params: options.fetch(:params, {}), tool_prefs: options[:tool_prefs])
          end
 
-         def stream(messages, model:, temperature: nil, max_tokens: nil, tools: {}, tool_prefs: nil, params: {}, &) # rubocop:disable Metrics/ParameterLists
+         def stream(
+           messages:,
+           model:,
+           **options,
+           &
+         )
            log.info { "stream request model=#{model} messages=#{messages.size}" }
-           complete(messages, tools:, temperature:, model: model_info(model, max_tokens:), params:, tool_prefs:, &)
+           complete(messages, tools: options.fetch(:tools, {}), temperature: options[:temperature],
+                    model: model_info(model, max_tokens: options[:max_tokens]),
+                    params: options.fetch(:params, {}), tool_prefs: options[:tool_prefs], &)
          end
 
-         def embed(text, model:, dimensions: nil, input_type: nil)
+         def embed(
+           text:,
+           model:,
+           **options
+         )
            log.info { "embed request model=#{model}" }
-           payload = render_embedding_payload(text, model: model_id(model), dimensions:)
-           payload[:input_type] = input_type if input_type
+           payload = Utils.deep_merge(
+             render_embedding_payload(text, model: model_id(model), dimensions: options[:dimensions]),
+             options.fetch(:params, {})
+           )
+           payload[:input_type] = options[:input_type] if options[:input_type]
            response = connection.post(embedding_url(model:), payload)
            parse_embedding_response(response, model: model_id(model), text:)
          end
 
-         def count_tokens(messages, model:, **)
+         def count_tokens(
+           messages:,
+           model:,
+           **_provider_options
+         )
            {
              provider_family: :azure_foundry,
              model: model_id(model),
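The hunk above moves the provider contract to required keywords plus a `**options` splat, resolving optional values with `Hash#fetch` inside the body. A standalone sketch of that calling convention (the method body here just echoes the resolved options; it is illustrative, not the provider's logic):

```ruby
# Keyword-only contract sketch: required keywords, trailing **options splat,
# defaults resolved via options.fetch / options[] inside the method.
def chat(messages:, model:, **options)
  {
    model: model,
    message_count: messages.size,
    tools: options.fetch(:tools, {}),        # default when caller omits :tools
    temperature: options[:temperature],      # nil when omitted
    params: options.fetch(:params, {})
  }
end

result = chat(messages: [{ role: 'user', content: 'hi' }], model: 'gpt-4o-prod', temperature: 0.2)
puts result[:tools].inspect # {}
puts result[:temperature]   # 0.2
```

Collapsing the long positional/optional signature into `**options` also removes the RuboCop `Metrics/ParameterLists` offense the old signatures had to disable.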
@@ -295,8 +319,8 @@ module Legion
            Legion::Extensions::Llm::Routing::ModelOffering.new(
              provider_family: :azure_foundry,
              instance_id: instance_id,
-             transport: :http,
-             tier: :frontier,
+             transport: configured_transport(:http),
+             tier: configured_tier(:frontier),
              model: model,
              usage_type: usage_type.to_sym,
              capabilities: capabilities,
@@ -308,6 +332,14 @@ module Legion
            )
          end
 
+         def configured_transport(default)
+           config.respond_to?(:transport) ? config.transport : default
+         end
+
+         def configured_tier(default)
+           config.respond_to?(:tier) ? config.tier : default
+         end
+
          def with_live_metadata(offering)
            response = connection.get(models_url)
            metadata = offering.metadata.merge(model_info: response.body)
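`configured_transport` and `configured_tier` use a `respond_to?` guard so offerings keep configured values while still working with config objects that lack those readers. The same pattern can be exercised standalone (generalized here with `public_send`; the diff's helpers call the readers directly):

```ruby
require 'ostruct'

# respond_to? guard: use the configured reader when present,
# otherwise fall back to the supplied default.
def configured_value(config, reader, default)
  config.respond_to?(reader) ? config.public_send(reader) : default
end

full = OpenStruct.new(transport: :amqp, tier: :cloud)
bare = Object.new # exposes neither reader

puts configured_value(full, :transport, :http) # amqp
puts configured_value(bare, :transport, :http) # http
```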
@@ -0,0 +1,30 @@
+ # frozen_string_literal: true
+
+ require 'legion/extensions/llm/fleet/provider_responder'
+ require 'legion/extensions/llm/azure_foundry'
+
+ module Legion
+   module Extensions
+     module Llm
+       module AzureFoundry
+         module Runners
+           # Runner entrypoint for Azure Foundry fleet request execution.
+           module FleetWorker
+             module_function
+
+             def handle_fleet_request(payload, delivery: nil, properties: nil)
+               Legion::Extensions::Llm::Fleet::ProviderResponder.call(
+                 payload: payload,
+                 provider_family: AzureFoundry::PROVIDER_FAMILY,
+                 provider_class: AzureFoundry::Provider,
+                 provider_instances: -> { AzureFoundry.discover_instances },
+                 delivery: delivery,
+                 properties: properties
+               )
+             end
+           end
+         end
+       end
+     end
+   end
+ end
@@ -4,7 +4,7 @@ module Legion
    module Extensions
      module Llm
        module AzureFoundry
-         VERSION = '0.2.0'
+         VERSION = '0.2.5'
        end
      end
    end
@@ -16,21 +16,33 @@ module Legion
        PROVIDER_FAMILY = :azure_foundry
 
        def self.default_settings
-         {
-           enabled: false,
-           default_model: nil,
-           endpoint: nil,
-           api_key: nil,
-           bearer_token: nil,
-           api_version: '2024-05-01-preview',
-           surface: nil,
-           deployments: [],
-           model_whitelist: [],
-           model_blacklist: [],
-           model_cache_ttl: 3600,
-           tls: { enabled: false, verify: :peer },
-           instances: {}
-         }
+         ::Legion::Extensions::Llm.provider_settings(
+           family: PROVIDER_FAMILY,
+           instance: {
+             endpoint: nil,
+             tier: :frontier,
+             transport: :http,
+             credentials: {
+               api_key: nil,
+               bearer_token: nil
+             },
+             provider: {
+               api_version: Provider::DEFAULT_API_VERSION,
+               surface: nil,
+               deployments: []
+             },
+             usage: { inference: true, embedding: true, image: false },
+             limits: { concurrency: 4 },
+             fleet: {
+               enabled: false,
+               respond_to_requests: false,
+               capabilities: %i[chat stream_chat embed],
+               lanes: [],
+               concurrency: 4,
+               queue_suffix: nil
+             }
+           }
+         )
        end
 
        def self.provider_class
@@ -48,14 +60,15 @@ module Legion
          instances
        end
 
-       def self.discover_default_instance(instances)
+       def self.discover_default_instance(instances) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
          cfg = CredentialSources.setting(:extensions, :llm, :azure_foundry)
          return unless cfg.is_a?(Hash)
 
-         endpoint = cfg[:endpoint] || cfg['endpoint']
+         endpoint = cfg[:endpoint] || cfg['endpoint'] || cfg[:base_url] || cfg['base_url'] || cfg[:api_base] ||
+                    cfg['api_base']
          return if endpoint.nil? || endpoint.to_s.strip.empty?
 
-         instances[:settings] = cfg.except(:instances, 'instances').merge(tier: :cloud)
+         instances[:settings] = normalize_instance_config(cfg).merge(tier: :cloud)
        end
 
        def self.discover_named_instances(instances)
@@ -68,21 +81,35 @@ module Legion
          named.each { |name, config| add_named_instance(instances, name, config) }
        end
 
-       def self.add_named_instance(instances, name, config)
+       def self.add_named_instance(instances, name, config) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
          return unless config.is_a?(Hash)
 
-         endpoint = config[:endpoint] || config['endpoint']
+         endpoint = config[:endpoint] || config['endpoint'] || config[:base_url] || config['base_url'] ||
+                    config[:api_base] || config['api_base']
          return if endpoint.nil? || endpoint.to_s.strip.empty?
 
-         instances[name.to_sym] = config.merge(tier: :cloud)
+         instances[name.to_sym] = normalize_instance_config(config).merge(tier: :cloud)
        end
 
-       private_class_method :discover_default_instance, :discover_named_instances, :add_named_instance
+       def self.normalize_instance_config(config) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
+         normalized = config.to_h.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
+         normalized[:azure_foundry_endpoint] ||= normalized.delete(:endpoint)
+         normalized[:azure_foundry_endpoint] ||= normalized.delete(:base_url)
+         normalized[:azure_foundry_endpoint] ||= normalized.delete(:api_base)
+         normalized[:azure_foundry_api_key] ||= normalized.delete(:api_key)
+         normalized[:azure_foundry_bearer_token] ||= normalized.delete(:bearer_token)
+         normalized[:azure_foundry_api_version] ||= normalized.delete(:api_version)
+         normalized[:azure_foundry_surface] ||= normalized.delete(:surface)
+         normalized[:azure_foundry_deployments] ||= normalized.delete(:deployments)
+         normalized.compact.except(:instances)
+       end
+
+       private_class_method :discover_default_instance, :discover_named_instances, :add_named_instance,
+                            :normalize_instance_config
 
-       Legion::Extensions::Llm::Configuration.register_provider_options(Provider.configuration_options)
+       Legion::Extensions::Llm::Configuration.register_provider_options(Provider.configuration_options) if
+         Legion::Extensions::Llm::Configuration.respond_to?(:register_provider_options)
      end
    end
  end
-
- Legion::Extensions::Llm::AzureFoundry.register_discovered_instances
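`normalize_instance_config` is the 0.2.1 change that maps generic settings keys onto provider-prefixed ones. Its core moves are plain Hash operations and can be reproduced standalone (trimmed to two keys for brevity; `Hash#except` needs Ruby 3.0+):

```ruby
# Normalize generic instance keys to provider-prefixed keys: prefer an
# existing prefixed value, otherwise adopt the first generic alias present.
def normalize_instance_config(config)
  normalized = config.to_h.transform_keys { |k| k.respond_to?(:to_sym) ? k.to_sym : k }
  normalized[:azure_foundry_endpoint] ||= normalized.delete(:endpoint)
  normalized[:azure_foundry_endpoint] ||= normalized.delete(:base_url)
  normalized[:azure_foundry_api_key] ||= normalized.delete(:api_key)
  normalized.compact.except(:instances) # drop nil entries and nested instances
end

out = normalize_instance_config('endpoint' => 'https://x.example', :api_key => 'k', instances: {})
puts out[:azure_foundry_endpoint] # https://x.example
puts out.key?(:instances)         # false
```

One subtlety worth noting: because `h[k] ||= v` short-circuits when `h[k]` is already truthy, a later `normalized.delete(:base_url)` is not evaluated once the endpoint is set, so a leftover generic alias can survive in the hash; the diff's version shares this behavior.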
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: lex-llm-azure-foundry
  version: !ruby/object:Gem::Version
-   version: 0.2.0
+   version: 0.2.5
  platform: ruby
  authors:
  - LegionIO
@@ -51,20 +51,34 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: 1.3.14
+ - !ruby/object:Gem::Dependency
+   name: legion-transport
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 1.4.14
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 1.4.14
  - !ruby/object:Gem::Dependency
    name: lex-llm
    requirement: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: 0.3.0
+         version: 0.4.3
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: 0.3.0
+         version: 0.4.3
@@ -84,7 +98,9 @@ files:
  - README.md
  - lex-llm-azure-foundry.gemspec
  - lib/legion/extensions/llm/azure_foundry.rb
+ - lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb
  - lib/legion/extensions/llm/azure_foundry/provider.rb
+ - lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb
  - lib/legion/extensions/llm/azure_foundry/version.rb
  homepage: https://github.com/LegionIO/lex-llm-azure-foundry
  licenses: