lex-llm-azure-foundry 0.2.0 → 0.2.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.github/workflows/ci.yml +13 -1
- data/CHANGELOG.md +28 -0
- data/Gemfile +2 -0
- data/README.md +120 -90
- data/lex-llm-azure-foundry.gemspec +2 -1
- data/lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb +43 -0
- data/lib/legion/extensions/llm/azure_foundry/provider.rb +42 -10
- data/lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb +30 -0
- data/lib/legion/extensions/llm/azure_foundry/version.rb +1 -1
- data/lib/legion/extensions/llm/azure_foundry.rb +52 -25
- metadata +19 -3
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: dc801cd438178d5431250e6b2dbfb3d2b6dce8af6fb7a266cd2bf65eeb24b7a9
+  data.tar.gz: '04928abba688736869565f51365019a470980ddf4502d866d0a3f169bb30524b'
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 93d13277c3dfbe8bdb683f623d65800e8bedc771541573684f7181ff81fa49fb37f0a7c228aa87bf9f380f5685682760ba1e5b91720fb6e434bb9a28ae3c606c
+  data.tar.gz: cca5a4ea13dde3029a3402cec8dd58ced8592572236d7646cd6162f27473a143f434838c7495d67ec366553b647b142d99981631a30a9136eed0785479fd4c29

data/.github/workflows/ci.yml
CHANGED

@@ -8,8 +8,20 @@ jobs:
   ci:
     uses: LegionIO/.github/.github/workflows/ci.yml@main
 
+  excluded-files:
+    uses: LegionIO/.github/.github/workflows/excluded-files.yml@main
+
+  security:
+    uses: LegionIO/.github/.github/workflows/security-scan.yml@main
+
+  version-changelog:
+    uses: LegionIO/.github/.github/workflows/version-changelog.yml@main
+
+  dependency-review:
+    uses: LegionIO/.github/.github/workflows/dependency-review.yml@main
+
   release:
-    needs: ci
+    needs: [ci, excluded-files, security]
     if: github.event_name == 'push' && github.ref == 'refs/heads/main'
     uses: LegionIO/.github/.github/workflows/release.yml@main
     secrets:

data/CHANGELOG.md
CHANGED

@@ -1,5 +1,33 @@
 # Changelog
 
+## 0.2.5 - 2026-05-06
+
+- Load provider-owned fleet actors through the LegionIO subscription base and the canonical Azure Foundry provider root.
+- Keep fleet runners anchored on the provider root namespace so provider constants and instance discovery are always loaded.
+- Preserve configured transport and tier metadata when Azure Foundry builds routing offerings.
+- Gate release publishing on the shared security workflow.
+
+## 0.2.4 - 2026-05-06
+
+- Use the shared `lex-llm` fleet provider responder helper for provider-owned fleet workers.
+- Remove the runtime `legion-llm` dependency and require `lex-llm >= 0.4.3` for responder-side fleet execution.
+
+## 0.2.3 - 2026-05-06
+
+- Remove require-time provider self-registration; `legion-llm` now owns adapter creation and registry writes from loaded provider discovery metadata.
+- Bump dependency floors to `lex-llm >= 0.4.1` and `legion-llm >= 0.9.1`.
+
+## 0.2.2 - 2026-05-06
+
+- Enforce the shared keyword-only `lex-llm` provider contract for chat, embeddings, and token counting.
+- Move defaults back to `Legion::Extensions::Llm.provider_settings` with credentials/provider metadata under the default instance and instance-level fleet responder settings.
+- Add provider-owned fleet responder actor and runner backed by `legion-llm` fleet policy execution.
+- Bump the transport dependency floor to `legion-transport >= 1.4.14`.
+
+## 0.2.1 - 2026-05-03
+
+- Normalize generic settings keys to Azure Foundry provider config keys during instance discovery.
+
 ## 0.2.0 - 2026-05-01
 
 - Add auto-discovery via CredentialSources and AutoRegistration from lex-llm 0.3.0

data/Gemfile
CHANGED

@@ -4,6 +4,8 @@ source 'https://rubygems.org'
 
 group :test do
   llm_base_path = ENV.fetch('LEX_LLM_PATH', File.expand_path('../lex-llm', __dir__))
+  transport_path = ENV.fetch('LEGION_TRANSPORT_PATH', File.expand_path('../../legion-transport', __dir__))
+  gem 'legion-transport', path: transport_path if File.directory?(transport_path)
   gem 'lex-llm', path: llm_base_path if File.directory?(llm_base_path)
 end
 

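The Gemfile change above applies one pattern twice: resolve a sibling checkout path from an ENV override with a relative default, then only use it when the directory actually exists. A standalone sketch of that resolution logic; the helper name `local_gem_path` is mine, not the Gemfile's:

```ruby
# Resolve a local gem checkout path: prefer the ENV override, fall back to a
# path relative to this file, and return nil when the directory is absent so
# the caller can skip the `gem ..., path:` entry entirely.
def local_gem_path(env_key, default_relative, base = __dir__)
  path = ENV.fetch(env_key, File.expand_path(default_relative, base))
  File.directory?(path) ? path : nil
end

# With no ENV override and no sibling checkout, this returns nil.
puts local_gem_path('LEX_LLM_PATH', '../lex-llm').inspect
```

Guarding with `File.directory?` keeps the Gemfile usable both in CI (no sibling checkouts, registry gems win) and in local multi-repo development.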
data/README.md
CHANGED

@@ -2,153 +2,179 @@
 
 LegionIO LLM provider extension for Azure AI Foundry Models and Azure OpenAI hosted deployments.
 
-This gem lives under `Legion::Extensions::Llm::AzureFoundry
+This gem lives under `Legion::Extensions::Llm::AzureFoundry`. It depends on `lex-llm >= 0.4.3` for provider contracts, routing metadata, registry publishing helpers, and provider-owned fleet request handling. It does not require or depend on `legion-llm` at runtime; Legion LLM orchestration can load this provider gem and consume its discovery metadata.
 
-Load it with
+Load it with:
+
+```ruby
+require 'legion/extensions/llm/azure_foundry'
+```
 
 ## What It Provides
 
--
+- Provider family `:azure_foundry`
 - Azure AI Foundry model inference chat completions through `POST /models/chat/completions?api-version=...`
 - Azure AI Foundry model inference embeddings through `POST /models/embeddings?api-version=...`
-- Azure AI Foundry model info health
+- Azure AI Foundry model info health checks through `GET /models/info?api-version=...` when `live: true`
 - Azure OpenAI v1-compatible endpoint support through `/openai/v1/chat/completions` and `/openai/v1/embeddings`
--
+- Offline-first offering discovery from configured deployments
+- Deployment-name-preserving routing metadata for hosted Azure deployments
 - Explicit `model_family` and `canonical_model_alias` metadata for deployments whose base model cannot be proven from Azure metadata
--
-- Shared
--
-- Best-effort `llm.registry` event publishing for readiness and model availability via AMQP when transport is available
+- Shared OpenAI-compatible request and response mapping through `Legion::Extensions::Llm::Provider::OpenAICompatible`
+- Shared registry availability publishing through `Legion::Extensions::Llm::RegistryPublisher` when transport is available
+- Provider-owned fleet request handling through `Legion::Extensions::Llm::Fleet::ProviderResponder`
 
 ## Architecture
 
-```
+```text
 Legion::Extensions::Llm::AzureFoundry
-
-
-
-
-
-│ ├── Messages::RegistryEvent   # AMQP message for llm.registry events
-│ └── Exchanges::LlmRegistry    # Topic exchange for provider availability events
-└── VERSION
+|-- Provider                # Azure AI Foundry and Azure OpenAI hosted provider surface
+|   `-- Capabilities        # Capability predicates inferred from deployment metadata and model naming
+|-- Actor::FleetWorker      # Subscription actor for provider-owned fleet requests
+|-- Runners::FleetWorker    # Runner entrypoint that delegates to lex-llm ProviderResponder
+`-- VERSION
 ```
 
+`AzureFoundry.discover_instances` reads `extensions.llm.azure_foundry` settings and returns provider instance configs. The base Legion LLM runtime can use those configs to populate the provider registry and routing inventory; this gem does not write `legion-llm` registry state itself at require time.
+
 ## File Map
 
 | Path | Purpose |
 |------|---------|
-| `lib/legion/extensions/llm/azure_foundry.rb` | Entry point, provider
-| `lib/legion/extensions/llm/azure_foundry/provider.rb` | Provider implementation with chat, stream, embed, health, readiness, discovery |
-| `lib/legion/extensions/llm/azure_foundry/
-| `lib/legion/extensions/llm/azure_foundry/
-| `lib/legion/extensions/llm/azure_foundry/transport/messages/registry_event.rb` | AMQP message class for registry events |
-| `lib/legion/extensions/llm/azure_foundry/transport/exchanges/llm_registry.rb` | Topic exchange definition for llm.registry |
+| `lib/legion/extensions/llm/azure_foundry.rb` | Entry point, provider defaults, instance discovery, shared registry publisher |
+| `lib/legion/extensions/llm/azure_foundry/provider.rb` | Provider implementation with chat, stream, embed, health, readiness, model listing, and offering discovery |
+| `lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb` | Subscription actor gated by ProviderResponder fleet settings |
+| `lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb` | Fleet request runner that delegates execution to `ProviderResponder.call` |
 | `lib/legion/extensions/llm/azure_foundry/version.rb` | `VERSION` constant |
 
-##
-
-Every class and module uses `Legion::Logging::Helper`:
-
-- **AzureFoundry** module: `extend Legion::Logging::Helper`
-- **Provider**: inherits `include Legion::Logging::Helper` from `Legion::Extensions::Llm::Provider`
-- **RegistryPublisher**: `include Legion::Logging::Helper`
-- **RegistryEventBuilder**: `include Legion::Logging::Helper`
+## Configuration
 
-
+Configured instances can be supplied through Legion settings under `extensions.llm.azure_foundry`. A top-level endpoint creates a `:settings` instance; entries under `instances` create named instances.
+
+```yaml
+extensions:
+  llm:
+    azure_foundry:
+      endpoint: https://example.services.ai.azure.com
+      api_key: env://AZURE_INFERENCE_CREDENTIAL
+      bearer_token: env://AZURE_FOUNDRY_BEARER_TOKEN
+      api_version: 2024-05-01-preview
+      surface: model_inference
+      deployments:
+        - deployment: gpt-4o-prod
+          model_family: openai
+          canonical_model_alias: gpt-4o
+          usage_type: inference
+        - deployment: embedding-prod
+          model_family: openai
+          canonical_model_alias: text-embedding-3-small
+          usage_type: embedding
+      instances:
+        prod:
+          endpoint: https://prod.services.ai.azure.com
+          api_key: env://AZURE_INFERENCE_CREDENTIAL
+          api_version: 2024-05-01-preview
+          surface: model_inference
+          deployments:
+            - deployment: gpt-4o-prod
+              model_family: openai
+              canonical_model_alias: gpt-4o
+              usage_type: inference
+          fleet:
+            enabled: true
+            respond_to_requests: true
+            capabilities:
+              - chat
+              - stream_chat
+              - embed
+```
 
-
+The provider also supports direct configuration through `Legion::Extensions::Llm.configure` for tests and embedded use:
 
-
+```ruby
+Legion::Extensions::Llm.configure do |config|
+  config.azure_foundry_endpoint = ENV.fetch('AZURE_FOUNDRY_ENDPOINT')
+  config.azure_foundry_api_key = ENV['AZURE_INFERENCE_CREDENTIAL']
+  config.azure_foundry_bearer_token = ENV['AZURE_FOUNDRY_BEARER_TOKEN']
+  config.azure_foundry_api_version = '2024-05-01-preview'
+  config.azure_foundry_surface = :model_inference
+  config.azure_foundry_deployments = [
+    {
+      deployment: 'gpt-4o-prod',
+      model_family: :openai,
+      canonical_model_alias: 'gpt-4o',
+      usage_type: :inference
+    }
+  ]
+end
+```
 
-- Azure
-- The model inference endpoint supports chat completions and embeddings.
-- The documented model-info endpoint is used only for explicit live health checks.
-- Azure deployment metadata is not assumed to reliably prove base model family or version, so routing metadata should be configured explicitly.
+Use `:openai_v1` when the endpoint should be treated as the OpenAI v1-compatible Azure route. The provider appends `/openai/v1` when the configured endpoint does not already include it.
 
-##
+## Default Settings
 
 ```ruby
 Legion::Extensions::Llm::AzureFoundry.default_settings
 # {
+#   enabled: true,
 #   provider_family: :azure_foundry,
-#   discovery: { enabled: true, live: false },
 #   instances: {
 #     default: {
-#       endpoint:
-#       api_version: "2024-05-01-preview",
-#       surface: :model_inference,
+#       endpoint: nil,
 #       tier: :frontier,
 #       transport: :http,
 #       credentials: {
-#         api_key:
-#         bearer_token:
-#
+#         api_key: nil,
+#         bearer_token: nil
+#       },
+#       provider: {
+#         api_version: "2024-05-01-preview",
+#         surface: nil,
+#         deployments: []
 #       },
-#
-#
-#
+#       usage: { inference: true, embedding: true, image: false },
+#       limits: { concurrency: 4 },
+#       fleet: {
+#         enabled: false,
+#         respond_to_requests: false,
+#         capabilities: [:chat, :stream_chat, :embed],
+#         lanes: [],
+#         concurrency: 4,
+#         queue_suffix: nil
+#       }
 #     }
 #   }
 # }
 ```
 
-## Configuration
-
-```ruby
-Legion::Extensions::Llm.configure do |config|
-  config.azure_foundry_endpoint = ENV.fetch("AZURE_FOUNDRY_ENDPOINT")
-  config.azure_foundry_api_key = ENV["AZURE_INFERENCE_CREDENTIAL"]
-  config.azure_foundry_bearer_token = ENV["AZURE_FOUNDRY_BEARER_TOKEN"]
-  config.azure_foundry_api_version = "2024-05-01-preview"
-  config.azure_foundry_surface = :model_inference
-  config.azure_foundry_deployments = [
-    {
-      deployment: "gpt-4o-prod",
-      model_family: :openai,
-      canonical_model_alias: "gpt-4o",
-      usage_type: :inference
-    },
-    {
-      deployment: "mistral-large-prod",
-      model_family: :mistral,
-      canonical_model_alias: "mistral-large",
-      usage_type: :inference
-    },
-    {
-      deployment: "embedding-prod",
-      model_family: :openai,
-      canonical_model_alias: "text-embedding-3-small",
-      usage_type: :embedding
-    }
-  ]
-end
-```
-
-Use `config.azure_foundry_surface = :openai_v1` when the target endpoint should be treated as the OpenAI v1-compatible Azure route. The provider appends `/openai/v1` when the configured endpoint does not already include it.
-
 ## Provider Methods
 
 ```ruby
 provider = Legion::Extensions::Llm::AzureFoundry.provider_class.new(Legion::Extensions::Llm.config)
 
 provider.discover_offerings(live: false)
-provider.offering_for(model:
+provider.offering_for(model: 'gpt-4o-prod', model_family: :openai, canonical_model_alias: 'gpt-4o')
 provider.health(live: false)
 provider.readiness(live: false)
 provider.list_models
-provider.chat(messages, model:
-provider.stream(messages, model:
-provider.embed([
-provider.count_tokens(messages, model:
+provider.chat(messages: messages, model: 'gpt-4o-prod')
+provider.stream(messages: messages, model: 'gpt-4o-prod') { |chunk| puts chunk.content }
+provider.embed(text: ['hello'], model: 'embedding-prod')
+provider.count_tokens(messages: messages, model: 'gpt-4o-prod')
 ```
 
-`discover_offerings(live: false)`
+`discover_offerings(live: false)` does not call Azure. It maps configured deployments into `Legion::Extensions::Llm::Routing::ModelOffering` values with `provider_family: :azure_foundry`.
 
 `health(live: true)` calls the documented model-info endpoint for the configured model-inference surface. Keep `live: false` for startup paths and tests that must not require Azure.
 
 `count_tokens` returns a structured unsupported result by default because the Microsoft REST contract used here does not define a portable token-counting endpoint across Azure AI Foundry deployments.
 
+## Fleet Responder
+
+Provider instances can opt in to consuming Legion LLM fleet requests. The actor is enabled only when at least one discovered instance has `fleet.respond_to_requests: true`.
+
+Fleet execution is delegated to `Legion::Extensions::Llm::Fleet::ProviderResponder` from `lex-llm`; this provider supplies the provider family, provider class, discovered instances, and delivery metadata.
+
 ## Routing Metadata
 
 Azure deployments are aliases. A deployment name can hide provider, model, and version details, so this extension preserves the deployment name as `model` and treats `canonical_model_alias` and `model_family` as routing metadata.
@@ -163,3 +189,7 @@ Supported `model_family` values are intentionally open-ended symbols, including:
 - `:microsoft`
 
 When `model_family` or `canonical_model_alias` is missing, offerings include `requires_explicit_model_metadata: true`.
+
+## Failure Behavior
+
+Live discovery and health-check failures are reported with `handle_exception(e, level: :warn, handled: true, operation: ...)` before returning degraded metadata. Offline discovery, provider configuration, and fleet actor enablement should not require live Azure connectivity.

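The README's routing-metadata rule says offerings flag deployments that lack explicit model metadata. A hypothetical standalone sketch (not this gem's code) of how a deployment entry could map to that flag; the helper name `offering_metadata` is mine:

```ruby
# Map a configured deployment entry to routing metadata, preserving the
# deployment name as the model id and flagging entries whose base model
# cannot be resolved without explicit configuration.
def offering_metadata(deployment)
  missing = deployment[:model_family].nil? || deployment[:canonical_model_alias].nil?
  {
    model: deployment[:deployment],  # deployment name is kept as-is
    model_family: deployment[:model_family],
    canonical_model_alias: deployment[:canonical_model_alias],
    requires_explicit_model_metadata: missing
  }
end

full = offering_metadata(deployment: 'gpt-4o-prod', model_family: :openai,
                         canonical_model_alias: 'gpt-4o')
bare = offering_metadata(deployment: 'mystery-deploy')

puts full[:requires_explicit_model_metadata] # false
puts bare[:requires_explicit_model_metadata] # true
```

This mirrors the documented stance that Azure deployment names are opaque aliases: the router only trusts `model_family`/`canonical_model_alias` when the operator supplies them.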
data/lex-llm-azure-foundry.gemspec
CHANGED

@@ -26,5 +26,6 @@ Gem::Specification.new do |spec|
   spec.add_dependency 'legion-json', '>= 1.2.1'
   spec.add_dependency 'legion-logging', '>= 1.3.2'
   spec.add_dependency 'legion-settings', '>= 1.3.14'
-  spec.add_dependency '
+  spec.add_dependency 'legion-transport', '>= 1.4.14'
+  spec.add_dependency 'lex-llm', '>= 0.4.3'
 end

data/lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb
ADDED

@@ -0,0 +1,43 @@
+# frozen_string_literal: true
+
+begin
+  require 'legion/extensions/actors/subscription'
+rescue LoadError => e
+  warn(e.message) if $VERBOSE
+end
+
+unless defined?(Legion::Extensions::Actors::Subscription)
+  raise LoadError, 'LegionIO actor runtime is required for Azure Foundry fleet worker'
+end
+
+require 'legion/extensions/llm/azure_foundry'
+require 'legion/extensions/llm/fleet/provider_responder'
+
+module Legion
+  module Extensions
+    module Llm
+      module AzureFoundry
+        module Actor
+          # Subscription actor for Azure Foundry fleet request consumption.
+          class FleetWorker < Legion::Extensions::Actors::Subscription
+            def runner_class
+              'Legion::Extensions::Llm::AzureFoundry::Runners::FleetWorker'
+            end
+
+            def runner_function
+              'handle_fleet_request'
+            end
+
+            def use_runner?
+              false
+            end
+
+            def enabled?
+              Legion::Extensions::Llm::Fleet::ProviderResponder.enabled_for?(AzureFoundry.discover_instances)
+            end
+          end
+        end
+      end
+    end
+  end
+end

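The actor's `enabled?` defers to `ProviderResponder.enabled_for?` in `lex-llm`, whose exact logic is not shown in this diff. A minimal sketch of the gating it plausibly performs, assuming the instance shape from the README (`fleet.respond_to_requests` per instance); `fleet_enabled_for?` is my stand-in name:

```ruby
# Enable the fleet worker only when at least one discovered provider
# instance has opted in to responding to fleet requests.
def fleet_enabled_for?(instances)
  instances.any? do |_name, config|
    fleet = config[:fleet] || {}
    !!fleet[:respond_to_requests]
  end
end

instances = {
  default: { endpoint: 'https://example.services.ai.azure.com' },
  prod: { endpoint: 'https://prod.services.ai.azure.com',
          fleet: { enabled: true, respond_to_requests: true } }
}

puts fleet_enabled_for?(instances)                   # true
puts fleet_enabled_for?(default: { endpoint: 'x' })  # false
```

Keeping the gate in `enabled?` means the subscription actor never binds a queue on hosts whose settings do not opt in, without any require-time side effects.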
data/lib/legion/extensions/llm/azure_foundry/provider.rb
CHANGED

@@ -197,25 +197,49 @@ module Legion
             models
           end
 
-          def chat(
+          def chat(
+            messages:,
+            model:,
+            **options
+          )
             log.info { "chat request model=#{model} messages=#{messages.size}" }
-            complete(messages, tools
+            complete(messages, tools: options.fetch(:tools, {}), temperature: options[:temperature],
+                     model: model_info(model, max_tokens: options[:max_tokens]),
+                     params: options.fetch(:params, {}), tool_prefs: options[:tool_prefs])
           end
 
-          def stream(
+          def stream(
+            messages:,
+            model:,
+            **options,
+            &
+          )
             log.info { "stream request model=#{model} messages=#{messages.size}" }
-            complete(messages, tools
+            complete(messages, tools: options.fetch(:tools, {}), temperature: options[:temperature],
+                     model: model_info(model, max_tokens: options[:max_tokens]),
+                     params: options.fetch(:params, {}), tool_prefs: options[:tool_prefs], &)
           end
 
-          def embed(
+          def embed(
+            text:,
+            model:,
+            **options
+          )
             log.info { "embed request model=#{model}" }
-            payload =
-
+            payload = Utils.deep_merge(
+              render_embedding_payload(text, model: model_id(model), dimensions: options[:dimensions]),
+              options.fetch(:params, {})
+            )
+            payload[:input_type] = options[:input_type] if options[:input_type]
             response = connection.post(embedding_url(model:), payload)
             parse_embedding_response(response, model: model_id(model), text:)
           end
 
-          def count_tokens(
+          def count_tokens(
+            messages:,
+            model:,
+            **_provider_options
+          )
             {
               provider_family: :azure_foundry,
               model: model_id(model),
@@ -295,8 +319,8 @@ module Legion
             Legion::Extensions::Llm::Routing::ModelOffering.new(
               provider_family: :azure_foundry,
               instance_id: instance_id,
-              transport: :http,
-              tier: :frontier,
+              transport: configured_transport(:http),
+              tier: configured_tier(:frontier),
               model: model,
               usage_type: usage_type.to_sym,
               capabilities: capabilities,
@@ -308,6 +332,14 @@ module Legion
             )
           end
 
+          def configured_transport(default)
+            config.respond_to?(:transport) ? config.transport : default
+          end
+
+          def configured_tier(default)
+            config.respond_to?(:tier) ? config.tier : default
+          end
+
           def with_live_metadata(offering)
             response = connection.get(models_url)
             metadata = offering.metadata.merge(model_info: response.body)

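The provider diff enforces the keyword-only `lex-llm` contract: required `messages:`/`model:` keywords plus a `**options` splat, with `fetch`-based defaults for collection-valued options. A self-contained sketch of that signature shape (the returned hash stands in for building the real request; it is not the gem's internal API):

```ruby
# Keyword-only provider entrypoint: callers must name every argument, and
# optional knobs flow through **options with explicit defaults.
def chat(messages:, model:, **options)
  {
    model: model,
    message_count: messages.size,
    tools: options.fetch(:tools, {}),        # default to an empty tool map
    temperature: options[:temperature],      # nil when not supplied
    params: options.fetch(:params, {})
  }
end

result = chat(messages: [{ role: 'user', content: 'hi' }],
              model: 'gpt-4o-prod', temperature: 0.2)
puts result[:tools].inspect # {}
puts result[:temperature]   # 0.2
```

The keyword-only shape is what lets every provider gem in the family be called interchangeably by the shared router: there is no positional-argument ordering to get wrong across providers.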
data/lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb
ADDED

@@ -0,0 +1,30 @@
+# frozen_string_literal: true
+
+require 'legion/extensions/llm/fleet/provider_responder'
+require 'legion/extensions/llm/azure_foundry'
+
+module Legion
+  module Extensions
+    module Llm
+      module AzureFoundry
+        module Runners
+          # Runner entrypoint for Azure Foundry fleet request execution.
+          module FleetWorker
+            module_function
+
+            def handle_fleet_request(payload, delivery: nil, properties: nil)
+              Legion::Extensions::Llm::Fleet::ProviderResponder.call(
+                payload: payload,
+                provider_family: AzureFoundry::PROVIDER_FAMILY,
+                provider_class: AzureFoundry::Provider,
+                provider_instances: -> { AzureFoundry.discover_instances },
+                delivery: delivery,
+                properties: properties
+              )
+            end
+          end
+        end
+      end
+    end
+  end
+end

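The runner above is a thin delegation: it hands the responder a family, a provider class, and a lambda that defers instance discovery until the responder actually needs it. A sketch of that call shape with a stub responder (the real `ProviderResponder` lives in `lex-llm` and performs fleet policy execution; `StubResponder` here is purely illustrative):

```ruby
# Stand-in for lex-llm's ProviderResponder, accepting the same keyword shape
# the runner passes through.
module StubResponder
  def self.call(payload:, provider_family:, provider_instances:, delivery: nil, properties: nil)
    { family: provider_family, instances: provider_instances.call, payload: payload }
  end
end

PROVIDER_FAMILY = :azure_foundry
# Lambda defers discovery: settings are only read when the responder asks.
discover = -> { { default: { endpoint: 'https://example.services.ai.azure.com' } } }

result = StubResponder.call(
  payload: { op: :chat },
  provider_family: PROVIDER_FAMILY,
  provider_instances: discover
)
puts result[:family] # azure_foundry
```

Passing a lambda rather than a pre-computed hash means each fleet request sees current settings, which matters when instances are reconfigured while the worker stays up.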
data/lib/legion/extensions/llm/azure_foundry.rb
CHANGED

@@ -16,21 +16,33 @@ module Legion
        PROVIDER_FAMILY = :azure_foundry
 
        def self.default_settings
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+          ::Legion::Extensions::Llm.provider_settings(
+            family: PROVIDER_FAMILY,
+            instance: {
+              endpoint: nil,
+              tier: :frontier,
+              transport: :http,
+              credentials: {
+                api_key: nil,
+                bearer_token: nil
+              },
+              provider: {
+                api_version: Provider::DEFAULT_API_VERSION,
+                surface: nil,
+                deployments: []
+              },
+              usage: { inference: true, embedding: true, image: false },
+              limits: { concurrency: 4 },
+              fleet: {
+                enabled: false,
+                respond_to_requests: false,
+                capabilities: %i[chat stream_chat embed],
+                lanes: [],
+                concurrency: 4,
+                queue_suffix: nil
+              }
+            }
+          )
        end
 
        def self.provider_class
@@ -48,14 +60,15 @@ module Legion
          instances
        end
 
-        def self.discover_default_instance(instances)
+        def self.discover_default_instance(instances) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
          cfg = CredentialSources.setting(:extensions, :llm, :azure_foundry)
          return unless cfg.is_a?(Hash)
 
-          endpoint = cfg[:endpoint] || cfg['endpoint']
+          endpoint = cfg[:endpoint] || cfg['endpoint'] || cfg[:base_url] || cfg['base_url'] || cfg[:api_base] ||
+                     cfg['api_base']
          return if endpoint.nil? || endpoint.to_s.strip.empty?
 
-          instances[:settings] = cfg
+          instances[:settings] = normalize_instance_config(cfg).merge(tier: :cloud)
        end
 
        def self.discover_named_instances(instances)
@@ -68,21 +81,35 @@ module Legion
          named.each { |name, config| add_named_instance(instances, name, config) }
        end
 
-        def self.add_named_instance(instances, name, config)
+        def self.add_named_instance(instances, name, config) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
          return unless config.is_a?(Hash)
 
-          endpoint = config[:endpoint] || config['endpoint']
+          endpoint = config[:endpoint] || config['endpoint'] || config[:base_url] || config['base_url'] ||
+                     config[:api_base] || config['api_base']
          return if endpoint.nil? || endpoint.to_s.strip.empty?
 
-          instances[name.to_sym] = config.merge(tier: :cloud)
+          instances[name.to_sym] = normalize_instance_config(config).merge(tier: :cloud)
        end
 
-
+        def self.normalize_instance_config(config) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
+          normalized = config.to_h.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
+          normalized[:azure_foundry_endpoint] ||= normalized.delete(:endpoint)
+          normalized[:azure_foundry_endpoint] ||= normalized.delete(:base_url)
+          normalized[:azure_foundry_endpoint] ||= normalized.delete(:api_base)
+          normalized[:azure_foundry_api_key] ||= normalized.delete(:api_key)
+          normalized[:azure_foundry_bearer_token] ||= normalized.delete(:bearer_token)
+          normalized[:azure_foundry_api_version] ||= normalized.delete(:api_version)
+          normalized[:azure_foundry_surface] ||= normalized.delete(:surface)
+          normalized[:azure_foundry_deployments] ||= normalized.delete(:deployments)
+          normalized.compact.except(:instances)
+        end
+
+        private_class_method :discover_default_instance, :discover_named_instances, :add_named_instance,
+                             :normalize_instance_config
 
-        Legion::Extensions::Llm::Configuration.register_provider_options(Provider.configuration_options)
+        Legion::Extensions::Llm::Configuration.register_provider_options(Provider.configuration_options) if
+          Legion::Extensions::Llm::Configuration.respond_to?(:register_provider_options)
      end
    end
  end
end
-
-Legion::Extensions::Llm::AzureFoundry.register_discovered_instances

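The normalization added above renames generic settings keys (`endpoint`/`base_url`/`api_base`, `api_key`, and so on) to provider-prefixed config keys, drops `nil` values, and strips the nested `instances` map. A self-contained sketch that mirrors the diff's `normalize_instance_config` on a plain Hash, so its behavior can be run in isolation:

```ruby
# Normalize a raw settings hash: symbolize keys, fold the generic endpoint
# aliases into :azure_foundry_endpoint, prefix the remaining provider keys,
# then drop nil values and the nested :instances map.
def normalize_instance_config(config)
  normalized = config.to_h.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
  normalized[:azure_foundry_endpoint] ||= normalized.delete(:endpoint)
  normalized[:azure_foundry_endpoint] ||= normalized.delete(:base_url)
  normalized[:azure_foundry_endpoint] ||= normalized.delete(:api_base)
  normalized[:azure_foundry_api_key] ||= normalized.delete(:api_key)
  normalized[:azure_foundry_bearer_token] ||= normalized.delete(:bearer_token)
  normalized[:azure_foundry_api_version] ||= normalized.delete(:api_version)
  normalized[:azure_foundry_surface] ||= normalized.delete(:surface)
  normalized[:azure_foundry_deployments] ||= normalized.delete(:deployments)
  normalized.compact.except(:instances)  # Hash#except needs Ruby 3.0+
end

cfg = { 'base_url' => 'https://example.services.ai.azure.com',
        'api_key' => 'secret', 'instances' => {} }
out = normalize_instance_config(cfg)
puts out[:azure_foundry_endpoint] # https://example.services.ai.azure.com
puts out.key?(:instances)         # false
```

Note the `||=`/`delete` chain: the first alias that is present wins, later aliases are still deleted, and `compact` removes any prefixed key that ended up `nil`, so absent credentials do not leak empty keys into the instance config.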
metadata
CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: lex-llm-azure-foundry
 version: !ruby/object:Gem::Version
-  version: 0.2.
+  version: 0.2.5
 platform: ruby
 authors:
 - LegionIO
@@ -51,20 +51,34 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: 1.3.14
+- !ruby/object:Gem::Dependency
+  name: legion-transport
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.4.14
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.4.14
 - !ruby/object:Gem::Dependency
   name: lex-llm
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 0.3
+        version: 0.4.3
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 0.3
+        version: 0.4.3
 description: Azure AI Foundry and Azure OpenAI hosted provider integration for LegionIO
   LLM routing.
 email:
@@ -84,7 +98,9 @@ files:
 - README.md
 - lex-llm-azure-foundry.gemspec
 - lib/legion/extensions/llm/azure_foundry.rb
+- lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb
 - lib/legion/extensions/llm/azure_foundry/provider.rb
+- lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb
 - lib/legion/extensions/llm/azure_foundry/version.rb
 homepage: https://github.com/LegionIO/lex-llm-azure-foundry
 licenses:
