lex-llm-azure-foundry 0.2.0 → 0.2.6
- checksums.yaml +4 -4
- data/.github/workflows/ci.yml +13 -1
- data/CHANGELOG.md +36 -0
- data/Gemfile +2 -0
- data/README.md +120 -90
- data/lex-llm-azure-foundry.gemspec +2 -1
- data/lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb +43 -0
- data/lib/legion/extensions/llm/azure_foundry/provider.rb +53 -15
- data/lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb +30 -0
- data/lib/legion/extensions/llm/azure_foundry/version.rb +1 -1
- data/lib/legion/extensions/llm/azure_foundry.rb +52 -25
- metadata +19 -3
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: be5a0deef7be2d2f074ec98ebb5a77ec065d46029c7d4a902479bdfc2044240d
|
|
4
|
+
data.tar.gz: 565fe0757d88c7f8289c919d721db319303e25ef906f96982fd18179d1135845
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 3503088b0c52dedeb98dc49bdbf8006836e038a71c0b0600e5aaaa89302c0e10c4f9963de7d583445fc98910520bd1bed89793b9c9776b7e95cf0d0b6d3d9215
|
|
7
|
+
data.tar.gz: bc52d6be469507ffa3fd886ccab511c345d34c568432b2e94c8e6f89c302d597d12f1fffffa61e2e6206f25e149c0590180c2d40ff28d6a17ad5c9baec3071ec
|
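The new `checksums.yaml` entries are hex digests of the two archives packed inside the `.gem` file. A minimal sketch of how such a digest could be compared, assuming the archive bytes are already in memory; `digest_matches?` is a hypothetical helper and not part of RubyGems or this gem:

```ruby
require 'digest'

# Hypothetical helper: compare a blob's SHA256 hex digest with an expected value.
def digest_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# Expected value copied from the new SHA256 `data.tar.gz` entry in this diff.
expected_data_sha256 = '565fe0757d88c7f8289c919d721db319303e25ef906f96982fd18179d1135845'

# Placeholder bytes stand in for the real archive, so no match is expected here.
puts digest_matches?('placeholder bytes, not the real archive', expected_data_sha256)
```

In practice the bytes would come from reading `data.tar.gz` out of the unpacked gem; that step is left out of the sketch.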
data/.github/workflows/ci.yml
CHANGED
|
@@ -8,8 +8,20 @@ jobs:
|
|
|
8
8
|
ci:
|
|
9
9
|
uses: LegionIO/.github/.github/workflows/ci.yml@main
|
|
10
10
|
|
|
11
|
+
excluded-files:
|
|
12
|
+
uses: LegionIO/.github/.github/workflows/excluded-files.yml@main
|
|
13
|
+
|
|
14
|
+
security:
|
|
15
|
+
uses: LegionIO/.github/.github/workflows/security-scan.yml@main
|
|
16
|
+
|
|
17
|
+
version-changelog:
|
|
18
|
+
uses: LegionIO/.github/.github/workflows/version-changelog.yml@main
|
|
19
|
+
|
|
20
|
+
dependency-review:
|
|
21
|
+
uses: LegionIO/.github/.github/workflows/dependency-review.yml@main
|
|
22
|
+
|
|
11
23
|
release:
|
|
12
|
-
needs: ci
|
|
24
|
+
needs: [ci, excluded-files, security]
|
|
13
25
|
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
|
|
14
26
|
uses: LegionIO/.github/.github/workflows/release.yml@main
|
|
15
27
|
secrets:
|
data/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,41 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## 0.2.6 - 2026-05-21
|
|
4
|
+
|
|
5
|
+
- Add `default_transport`/`default_tier` class declarations, remove `configured_transport`/`configured_tier`
|
|
6
|
+
- Add `model_allowed?` filtering in `discover_offerings`
|
|
7
|
+
- Default tier set to :cloud
|
|
8
|
+
- Identity headers included via base provider
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
## 0.2.5 - 2026-05-06
|
|
12
|
+
|
|
13
|
+
- Load provider-owned fleet actors through the LegionIO subscription base and the canonical Azure Foundry provider root.
|
|
14
|
+
- Keep fleet runners anchored on the provider root namespace so provider constants and instance discovery are always loaded.
|
|
15
|
+
- Preserve configured transport and tier metadata when Azure Foundry builds routing offerings.
|
|
16
|
+
- Gate release publishing on the shared security workflow.
|
|
17
|
+
|
|
18
|
+
## 0.2.4 - 2026-05-06
|
|
19
|
+
|
|
20
|
+
- Use the shared `lex-llm` fleet provider responder helper for provider-owned fleet workers.
|
|
21
|
+
- Remove the runtime `legion-llm` dependency and require `lex-llm >= 0.4.3` for responder-side fleet execution.
|
|
22
|
+
|
|
23
|
+
## 0.2.3 - 2026-05-06
|
|
24
|
+
|
|
25
|
+
- Remove require-time provider self-registration; `legion-llm` now owns adapter creation and registry writes from loaded provider discovery metadata.
|
|
26
|
+
- Bump dependency floors to `lex-llm >= 0.4.1` and `legion-llm >= 0.9.1`.
|
|
27
|
+
|
|
28
|
+
## 0.2.2 - 2026-05-06
|
|
29
|
+
|
|
30
|
+
- Enforce the shared keyword-only `lex-llm` provider contract for chat, embeddings, and token counting.
|
|
31
|
+
- Move defaults back to `Legion::Extensions::Llm.provider_settings` with credentials/provider metadata under the default instance and instance-level fleet responder settings.
|
|
32
|
+
- Add provider-owned fleet responder actor and runner backed by `legion-llm` fleet policy execution.
|
|
33
|
+
- Bump the transport dependency floor to `legion-transport >= 1.4.14`.
|
|
34
|
+
|
|
35
|
+
## 0.2.1 - 2026-05-03
|
|
36
|
+
|
|
37
|
+
- Normalize generic settings keys to Azure Foundry provider config keys during instance discovery.
|
|
38
|
+
|
|
3
39
|
## 0.2.0 - 2026-05-01
|
|
4
40
|
|
|
5
41
|
- Add auto-discovery via CredentialSources and AutoRegistration from lex-llm 0.3.0
|
data/Gemfile
CHANGED
|
@@ -4,6 +4,8 @@ source 'https://rubygems.org'
|
|
|
4
4
|
|
|
5
5
|
group :test do
|
|
6
6
|
llm_base_path = ENV.fetch('LEX_LLM_PATH', File.expand_path('../lex-llm', __dir__))
|
|
7
|
+
transport_path = ENV.fetch('LEGION_TRANSPORT_PATH', File.expand_path('../../legion-transport', __dir__))
|
|
8
|
+
gem 'legion-transport', path: transport_path if File.directory?(transport_path)
|
|
7
9
|
gem 'lex-llm', path: llm_base_path if File.directory?(llm_base_path)
|
|
8
10
|
end
|
|
9
11
|
|
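The Gemfile change follows one pattern for both test dependencies: take a path from the environment, fall back to a sibling checkout, and only use it when the directory exists. The sketch below isolates that decision; `local_gem_path` and the injectable `env:` argument are hypothetical and exist only to make the logic easy to exercise:

```ruby
# Hypothetical helper mirroring the Gemfile pattern: prefer an environment
# override, fall back to a default path, and return nil when the directory
# is missing so the caller can use the released gem instead.
def local_gem_path(env_key, fallback, env: ENV)
  path = env.fetch(env_key, fallback)
  File.directory?(path) ? path : nil
end

path = local_gem_path('LEGION_TRANSPORT_PATH', File.expand_path('../legion-transport', Dir.pwd))
puts(path ? "using local checkout at #{path}" : 'falling back to the released gem')
```

Returning `nil` instead of raising keeps the Gemfile usable on machines that do not have the sibling repositories checked out.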
data/README.md
CHANGED
|
@@ -2,153 +2,179 @@
|
|
|
2
2
|
|
|
3
3
|
LegionIO LLM provider extension for Azure AI Foundry Models and Azure OpenAI hosted deployments.
|
|
4
4
|
|
|
5
|
-
This gem lives under `Legion::Extensions::Llm::AzureFoundry
|
|
5
|
+
This gem lives under `Legion::Extensions::Llm::AzureFoundry`. It depends on `lex-llm >= 0.4.3` for provider contracts, routing metadata, registry publishing helpers, and provider-owned fleet request handling. It does not require or depend on `legion-llm` at runtime; Legion LLM orchestration can load this provider gem and consume its discovery metadata.
|
|
6
6
|
|
|
7
|
-
Load it with
|
|
7
|
+
Load it with:
|
|
8
|
+
|
|
9
|
+
```ruby
|
|
10
|
+
require 'legion/extensions/llm/azure_foundry'
|
|
11
|
+
```
|
|
8
12
|
|
|
9
13
|
## What It Provides
|
|
10
14
|
|
|
11
|
-
-
|
|
15
|
+
- Provider family `:azure_foundry`
|
|
12
16
|
- Azure AI Foundry model inference chat completions through `POST /models/chat/completions?api-version=...`
|
|
13
17
|
- Azure AI Foundry model inference embeddings through `POST /models/embeddings?api-version=...`
|
|
14
|
-
- Azure AI Foundry model info health
|
|
18
|
+
- Azure AI Foundry model info health checks through `GET /models/info?api-version=...` when `live: true`
|
|
15
19
|
- Azure OpenAI v1-compatible endpoint support through `/openai/v1/chat/completions` and `/openai/v1/embeddings`
|
|
16
|
-
-
|
|
20
|
+
- Offline-first offering discovery from configured deployments
|
|
21
|
+
- Deployment-name-preserving routing metadata for hosted Azure deployments
|
|
17
22
|
- Explicit `model_family` and `canonical_model_alias` metadata for deployments whose base model cannot be proven from Azure metadata
|
|
18
|
-
-
|
|
19
|
-
- Shared
|
|
20
|
-
-
|
|
21
|
-
- Best-effort `llm.registry` event publishing for readiness and model availability via AMQP when transport is available
|
|
23
|
+
- Shared OpenAI-compatible request and response mapping through `Legion::Extensions::Llm::Provider::OpenAICompatible`
|
|
24
|
+
- Shared registry availability publishing through `Legion::Extensions::Llm::RegistryPublisher` when transport is available
|
|
25
|
+
- Provider-owned fleet request handling through `Legion::Extensions::Llm::Fleet::ProviderResponder`
|
|
22
26
|
|
|
23
27
|
## Architecture
|
|
24
28
|
|
|
25
|
-
```
|
|
29
|
+
```text
|
|
26
30
|
Legion::Extensions::Llm::AzureFoundry
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
│ ├── Messages::RegistryEvent # AMQP message for llm.registry events
|
|
33
|
-
│ └── Exchanges::LlmRegistry # Topic exchange for provider availability events
|
|
34
|
-
└── VERSION
|
|
31
|
+
|-- Provider # Azure AI Foundry and Azure OpenAI hosted provider surface
|
|
32
|
+
| `-- Capabilities # Capability predicates inferred from deployment metadata and model naming
|
|
33
|
+
|-- Actor::FleetWorker # Subscription actor for provider-owned fleet requests
|
|
34
|
+
|-- Runners::FleetWorker # Runner entrypoint that delegates to lex-llm ProviderResponder
|
|
35
|
+
`-- VERSION
|
|
35
36
|
```
|
|
36
37
|
|
|
38
|
+
`AzureFoundry.discover_instances` reads `extensions.llm.azure_foundry` settings and returns provider instance configs. The base Legion LLM runtime can use those configs to populate the provider registry and routing inventory; this gem does not write `legion-llm` registry state itself at require time.
|
|
39
|
+
|
|
37
40
|
## File Map
|
|
38
41
|
|
|
39
42
|
| Path | Purpose |
|
|
40
43
|
|------|---------|
|
|
41
|
-
| `lib/legion/extensions/llm/azure_foundry.rb` | Entry point, provider
|
|
42
|
-
| `lib/legion/extensions/llm/azure_foundry/provider.rb` | Provider implementation with chat, stream, embed, health, readiness, discovery |
|
|
43
|
-
| `lib/legion/extensions/llm/azure_foundry/
|
|
44
|
-
| `lib/legion/extensions/llm/azure_foundry/
|
|
45
|
-
| `lib/legion/extensions/llm/azure_foundry/transport/messages/registry_event.rb` | AMQP message class for registry events |
|
|
46
|
-
| `lib/legion/extensions/llm/azure_foundry/transport/exchanges/llm_registry.rb` | Topic exchange definition for llm.registry |
|
|
44
|
+
| `lib/legion/extensions/llm/azure_foundry.rb` | Entry point, provider defaults, instance discovery, shared registry publisher |
|
|
45
|
+
| `lib/legion/extensions/llm/azure_foundry/provider.rb` | Provider implementation with chat, stream, embed, health, readiness, model listing, and offering discovery |
|
|
46
|
+
| `lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb` | Subscription actor gated by ProviderResponder fleet settings |
|
|
47
|
+
| `lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb` | Fleet request runner that delegates execution to `ProviderResponder.call` |
|
|
47
48
|
| `lib/legion/extensions/llm/azure_foundry/version.rb` | `VERSION` constant |
|
|
48
49
|
|
|
49
|
-
##
|
|
50
|
-
|
|
51
|
-
Every class and module uses `Legion::Logging::Helper`:
|
|
52
|
-
|
|
53
|
-
- **AzureFoundry** module: `extend Legion::Logging::Helper`
|
|
54
|
-
- **Provider**: inherits `include Legion::Logging::Helper` from `Legion::Extensions::Llm::Provider`
|
|
55
|
-
- **RegistryPublisher**: `include Legion::Logging::Helper`
|
|
56
|
-
- **RegistryEventBuilder**: `include Legion::Logging::Helper`
|
|
50
|
+
## Configuration
|
|
57
51
|
|
|
58
|
-
|
|
52
|
+
Configured instances can be supplied through Legion settings under `extensions.llm.azure_foundry`. A top-level endpoint creates a `:settings` instance; entries under `instances` create named instances.
|
|
53
|
+
|
|
54
|
+
```yaml
|
|
55
|
+
extensions:
|
|
56
|
+
llm:
|
|
57
|
+
azure_foundry:
|
|
58
|
+
endpoint: https://example.services.ai.azure.com
|
|
59
|
+
api_key: env://AZURE_INFERENCE_CREDENTIAL
|
|
60
|
+
bearer_token: env://AZURE_FOUNDRY_BEARER_TOKEN
|
|
61
|
+
api_version: 2024-05-01-preview
|
|
62
|
+
surface: model_inference
|
|
63
|
+
deployments:
|
|
64
|
+
- deployment: gpt-4o-prod
|
|
65
|
+
model_family: openai
|
|
66
|
+
canonical_model_alias: gpt-4o
|
|
67
|
+
usage_type: inference
|
|
68
|
+
- deployment: embedding-prod
|
|
69
|
+
model_family: openai
|
|
70
|
+
canonical_model_alias: text-embedding-3-small
|
|
71
|
+
usage_type: embedding
|
|
72
|
+
instances:
|
|
73
|
+
prod:
|
|
74
|
+
endpoint: https://prod.services.ai.azure.com
|
|
75
|
+
api_key: env://AZURE_INFERENCE_CREDENTIAL
|
|
76
|
+
api_version: 2024-05-01-preview
|
|
77
|
+
surface: model_inference
|
|
78
|
+
deployments:
|
|
79
|
+
- deployment: gpt-4o-prod
|
|
80
|
+
model_family: openai
|
|
81
|
+
canonical_model_alias: gpt-4o
|
|
82
|
+
usage_type: inference
|
|
83
|
+
fleet:
|
|
84
|
+
enabled: true
|
|
85
|
+
respond_to_requests: true
|
|
86
|
+
capabilities:
|
|
87
|
+
- chat
|
|
88
|
+
- stream_chat
|
|
89
|
+
- embed
|
|
90
|
+
```
|
|
59
91
|
|
|
60
|
-
|
|
92
|
+
The provider also supports direct configuration through `Legion::Extensions::Llm.configure` for tests and embedded use:
|
|
61
93
|
|
|
62
|
-
|
|
94
|
+
```ruby
|
|
95
|
+
Legion::Extensions::Llm.configure do |config|
|
|
96
|
+
config.azure_foundry_endpoint = ENV.fetch('AZURE_FOUNDRY_ENDPOINT')
|
|
97
|
+
config.azure_foundry_api_key = ENV['AZURE_INFERENCE_CREDENTIAL']
|
|
98
|
+
config.azure_foundry_bearer_token = ENV['AZURE_FOUNDRY_BEARER_TOKEN']
|
|
99
|
+
config.azure_foundry_api_version = '2024-05-01-preview'
|
|
100
|
+
config.azure_foundry_surface = :model_inference
|
|
101
|
+
config.azure_foundry_deployments = [
|
|
102
|
+
{
|
|
103
|
+
deployment: 'gpt-4o-prod',
|
|
104
|
+
model_family: :openai,
|
|
105
|
+
canonical_model_alias: 'gpt-4o',
|
|
106
|
+
usage_type: :inference
|
|
107
|
+
}
|
|
108
|
+
]
|
|
109
|
+
end
|
|
110
|
+
```
|
|
63
111
|
|
|
64
|
-
- Azure
|
|
65
|
-
- The model inference endpoint supports chat completions and embeddings.
|
|
66
|
-
- The documented model-info endpoint is used only for explicit live health checks.
|
|
67
|
-
- Azure deployment metadata is not assumed to reliably prove base model family or version, so routing metadata should be configured explicitly.
|
|
112
|
+
Use `:openai_v1` when the endpoint should be treated as the OpenAI v1-compatible Azure route. The provider appends `/openai/v1` when the configured endpoint does not already include it.
|
|
68
113
|
|
|
69
|
-
##
|
|
114
|
+
## Default Settings
|
|
70
115
|
|
|
71
116
|
```ruby
|
|
72
117
|
Legion::Extensions::Llm::AzureFoundry.default_settings
|
|
73
118
|
# {
|
|
119
|
+
# enabled: true,
|
|
74
120
|
# provider_family: :azure_foundry,
|
|
75
|
-
# discovery: { enabled: true, live: false },
|
|
76
121
|
# instances: {
|
|
77
122
|
# default: {
|
|
78
|
-
# endpoint:
|
|
79
|
-
# api_version: "2024-05-01-preview",
|
|
80
|
-
# surface: :model_inference,
|
|
123
|
+
# endpoint: nil,
|
|
81
124
|
# tier: :frontier,
|
|
82
125
|
# transport: :http,
|
|
83
126
|
# credentials: {
|
|
84
|
-
# api_key:
|
|
85
|
-
# bearer_token:
|
|
86
|
-
#
|
|
127
|
+
# api_key: nil,
|
|
128
|
+
# bearer_token: nil
|
|
129
|
+
# },
|
|
130
|
+
# provider: {
|
|
131
|
+
# api_version: "2024-05-01-preview",
|
|
132
|
+
# surface: nil,
|
|
133
|
+
# deployments: []
|
|
87
134
|
# },
|
|
88
|
-
#
|
|
89
|
-
#
|
|
90
|
-
#
|
|
135
|
+
# usage: { inference: true, embedding: true, image: false },
|
|
136
|
+
# limits: { concurrency: 4 },
|
|
137
|
+
# fleet: {
|
|
138
|
+
# enabled: false,
|
|
139
|
+
# respond_to_requests: false,
|
|
140
|
+
# capabilities: [:chat, :stream_chat, :embed],
|
|
141
|
+
# lanes: [],
|
|
142
|
+
# concurrency: 4,
|
|
143
|
+
# queue_suffix: nil
|
|
144
|
+
# }
|
|
91
145
|
# }
|
|
92
146
|
# }
|
|
93
147
|
# }
|
|
94
148
|
```
|
|
95
149
|
|
|
96
|
-
## Configuration
|
|
97
|
-
|
|
98
|
-
```ruby
|
|
99
|
-
Legion::Extensions::Llm.configure do |config|
|
|
100
|
-
config.azure_foundry_endpoint = ENV.fetch("AZURE_FOUNDRY_ENDPOINT")
|
|
101
|
-
config.azure_foundry_api_key = ENV["AZURE_INFERENCE_CREDENTIAL"]
|
|
102
|
-
config.azure_foundry_bearer_token = ENV["AZURE_FOUNDRY_BEARER_TOKEN"]
|
|
103
|
-
config.azure_foundry_api_version = "2024-05-01-preview"
|
|
104
|
-
config.azure_foundry_surface = :model_inference
|
|
105
|
-
config.azure_foundry_deployments = [
|
|
106
|
-
{
|
|
107
|
-
deployment: "gpt-4o-prod",
|
|
108
|
-
model_family: :openai,
|
|
109
|
-
canonical_model_alias: "gpt-4o",
|
|
110
|
-
usage_type: :inference
|
|
111
|
-
},
|
|
112
|
-
{
|
|
113
|
-
deployment: "mistral-large-prod",
|
|
114
|
-
model_family: :mistral,
|
|
115
|
-
canonical_model_alias: "mistral-large",
|
|
116
|
-
usage_type: :inference
|
|
117
|
-
},
|
|
118
|
-
{
|
|
119
|
-
deployment: "embedding-prod",
|
|
120
|
-
model_family: :openai,
|
|
121
|
-
canonical_model_alias: "text-embedding-3-small",
|
|
122
|
-
usage_type: :embedding
|
|
123
|
-
}
|
|
124
|
-
]
|
|
125
|
-
end
|
|
126
|
-
```
|
|
127
|
-
|
|
128
|
-
Use `config.azure_foundry_surface = :openai_v1` when the target endpoint should be treated as the OpenAI v1-compatible Azure route. The provider appends `/openai/v1` when the configured endpoint does not already include it.
|
|
129
|
-
|
|
130
150
|
## Provider Methods
|
|
131
151
|
|
|
132
152
|
```ruby
|
|
133
153
|
provider = Legion::Extensions::Llm::AzureFoundry.provider_class.new(Legion::Extensions::Llm.config)
|
|
134
154
|
|
|
135
155
|
provider.discover_offerings(live: false)
|
|
136
|
-
provider.offering_for(model:
|
|
156
|
+
provider.offering_for(model: 'gpt-4o-prod', model_family: :openai, canonical_model_alias: 'gpt-4o')
|
|
137
157
|
provider.health(live: false)
|
|
138
158
|
provider.readiness(live: false)
|
|
139
159
|
provider.list_models
|
|
140
|
-
provider.chat(messages, model:
|
|
141
|
-
provider.stream(messages, model:
|
|
142
|
-
provider.embed([
|
|
143
|
-
provider.count_tokens(messages, model:
|
|
160
|
+
provider.chat(messages: messages, model: 'gpt-4o-prod')
|
|
161
|
+
provider.stream(messages: messages, model: 'gpt-4o-prod') { |chunk| puts chunk.content }
|
|
162
|
+
provider.embed(text: ['hello'], model: 'embedding-prod')
|
|
163
|
+
provider.count_tokens(messages: messages, model: 'gpt-4o-prod')
|
|
144
164
|
```
|
|
145
165
|
|
|
146
|
-
`discover_offerings(live: false)`
|
|
166
|
+
`discover_offerings(live: false)` does not call Azure. It maps configured deployments into `Legion::Extensions::Llm::Routing::ModelOffering` values with `provider_family: :azure_foundry`.
|
|
147
167
|
|
|
148
168
|
`health(live: true)` calls the documented model-info endpoint for the configured model-inference surface. Keep `live: false` for startup paths and tests that must not require Azure.
|
|
149
169
|
|
|
150
170
|
`count_tokens` returns a structured unsupported result by default because the Microsoft REST contract used here does not define a portable token-counting endpoint across Azure AI Foundry deployments.
|
|
151
171
|
|
|
172
|
+
## Fleet Responder
|
|
173
|
+
|
|
174
|
+
Provider instances can opt in to consuming Legion LLM fleet requests. The actor is enabled only when at least one discovered instance has `fleet.respond_to_requests: true`.
|
|
175
|
+
|
|
176
|
+
Fleet execution is delegated to `Legion::Extensions::Llm::Fleet::ProviderResponder` from `lex-llm`; this provider supplies the provider family, provider class, discovered instances, and delivery metadata.
|
|
177
|
+
|
|
152
178
|
## Routing Metadata
|
|
153
179
|
|
|
154
180
|
Azure deployments are aliases. A deployment name can hide provider, model, and version details, so this extension preserves the deployment name as `model` and treats `canonical_model_alias` and `model_family` as routing metadata.
|
|
@@ -163,3 +189,7 @@ Supported `model_family` values are intentionally open-ended symbols, including:
|
|
|
163
189
|
- `:microsoft`
|
|
164
190
|
|
|
165
191
|
When `model_family` or `canonical_model_alias` is missing, offerings include `requires_explicit_model_metadata: true`.
|
|
192
|
+
|
|
193
|
+
## Failure Behavior
|
|
194
|
+
|
|
195
|
+
Live discovery and health-check failures are reported with `handle_exception(e, level: :warn, handled: true, operation: ...)` before returning degraded metadata. Offline discovery, provider configuration, and fleet actor enablement should not require live Azure connectivity.
|
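The README states that the provider appends `/openai/v1` when the configured endpoint does not already include it. A minimal sketch of that rule, assuming simple string handling; `openai_v1_base` is a hypothetical name and the gem's actual implementation is not shown in this diff:

```ruby
# Hypothetical helper, not the gem's implementation: append the OpenAI
# v1-compatible path unless the endpoint already includes it.
def openai_v1_base(endpoint)
  base = endpoint.to_s.sub(%r{/+\z}, '') # drop trailing slashes first
  base.include?('/openai/v1') ? base : "#{base}/openai/v1"
end

puts openai_v1_base('https://example.openai.azure.com/')
puts openai_v1_base('https://example.openai.azure.com/openai/v1')
```

Both calls resolve to the same base URL, so a caller can pass either form of the endpoint.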
data/lex-llm-azure-foundry.gemspec
CHANGED
|
@@ -26,5 +26,6 @@ Gem::Specification.new do |spec|
|
|
|
26
26
|
spec.add_dependency 'legion-json', '>= 1.2.1'
|
|
27
27
|
spec.add_dependency 'legion-logging', '>= 1.3.2'
|
|
28
28
|
spec.add_dependency 'legion-settings', '>= 1.3.14'
|
|
29
|
-
spec.add_dependency '
|
|
29
|
+
spec.add_dependency 'legion-transport', '>= 1.4.14'
|
|
30
|
+
spec.add_dependency 'lex-llm', '>= 0.4.3'
|
|
30
31
|
end
|
data/lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb
ADDED
|
@@ -0,0 +1,43 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
begin
|
|
4
|
+
require 'legion/extensions/actors/subscription'
|
|
5
|
+
rescue LoadError => e
|
|
6
|
+
warn(e.message) if $VERBOSE
|
|
7
|
+
end
|
|
8
|
+
|
|
9
|
+
unless defined?(Legion::Extensions::Actors::Subscription)
|
|
10
|
+
raise LoadError, 'LegionIO actor runtime is required for Azure Foundry fleet worker'
|
|
11
|
+
end
|
|
12
|
+
|
|
13
|
+
require 'legion/extensions/llm/azure_foundry'
|
|
14
|
+
require 'legion/extensions/llm/fleet/provider_responder'
|
|
15
|
+
|
|
16
|
+
module Legion
|
|
17
|
+
module Extensions
|
|
18
|
+
module Llm
|
|
19
|
+
module AzureFoundry
|
|
20
|
+
module Actor
|
|
21
|
+
# Subscription actor for Azure Foundry fleet request consumption.
|
|
22
|
+
class FleetWorker < Legion::Extensions::Actors::Subscription
|
|
23
|
+
def runner_class
|
|
24
|
+
'Legion::Extensions::Llm::AzureFoundry::Runners::FleetWorker'
|
|
25
|
+
end
|
|
26
|
+
|
|
27
|
+
def runner_function
|
|
28
|
+
'handle_fleet_request'
|
|
29
|
+
end
|
|
30
|
+
|
|
31
|
+
def use_runner?
|
|
32
|
+
false
|
|
33
|
+
end
|
|
34
|
+
|
|
35
|
+
def enabled?
|
|
36
|
+
Legion::Extensions::Llm::Fleet::ProviderResponder.enabled_for?(AzureFoundry.discover_instances)
|
|
37
|
+
end
|
|
38
|
+
end
|
|
39
|
+
end
|
|
40
|
+
end
|
|
41
|
+
end
|
|
42
|
+
end
|
|
43
|
+
end
|
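The actor's `enabled?` delegates to `ProviderResponder.enabled_for?`, whose body lives in `lex-llm` and is not part of this diff. Going by the README wording (enabled when at least one discovered instance has `fleet.respond_to_requests: true`), the check might look roughly like the sketch below; `fleet_enabled_for?` is a hypothetical stand-in, and the real predicate may also consider other settings such as `fleet.enabled`:

```ruby
# Hypothetical stand-in for ProviderResponder.enabled_for?: true when any
# instance config opts in through fleet.respond_to_requests.
def fleet_enabled_for?(instances)
  instances.any? do |_name, config|
    fleet = config[:fleet] || config['fleet'] || {}
    (fleet[:respond_to_requests] || fleet['respond_to_requests']) == true
  end
end

instances = {
  settings: { fleet: { enabled: false, respond_to_requests: false } },
  prod: { fleet: { enabled: true, respond_to_requests: true } }
}
puts fleet_enabled_for?(instances)
```

Because the check only reads discovered settings, it needs no live Azure connectivity, which matches the README's failure-behavior note.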
data/lib/legion/extensions/llm/azure_foundry/provider.rb
CHANGED
|
@@ -18,6 +18,8 @@ module Legion
|
|
|
18
18
|
|
|
19
19
|
class << self
|
|
20
20
|
def slug = 'azure_foundry'
|
|
21
|
+
def default_transport = :http
|
|
22
|
+
def default_tier = :cloud
|
|
21
23
|
def configuration_requirements = %i[azure_foundry_endpoint]
|
|
22
24
|
|
|
23
25
|
def configuration_options
|
|
@@ -128,10 +130,10 @@ module Legion
|
|
|
128
130
|
end
|
|
129
131
|
|
|
130
132
|
def headers
|
|
131
|
-
{
|
|
133
|
+
identity_headers.merge({
|
|
132
134
|
'api-key' => config.azure_foundry_api_key,
|
|
133
135
|
'Authorization' => bearer_header
|
|
134
|
-
}.compact
|
|
136
|
+
}.compact)
|
|
135
137
|
end
|
|
136
138
|
|
|
137
139
|
def completion_url = path_for('chat/completions')
|
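The `headers` change wraps the existing hash in `identity_headers.merge(...)`. The sketch below reproduces only the hash mechanics with plain values; `build_headers` and the `X-Example-Identity` key are hypothetical, since the real `identity_headers` comes from the base provider and its contents are not shown here:

```ruby
# Hypothetical reproduction of the merge shape used in `headers`.
def build_headers(identity_headers, api_key:, bearer_header:)
  # `compact` runs on the inner hash only, so a nil api-key or Authorization
  # entry is dropped before merging; provider keys win on key collisions.
  identity_headers.merge({ 'api-key' => api_key, 'Authorization' => bearer_header }.compact)
end

p build_headers({ 'X-Example-Identity' => 'node-1' }, api_key: 'secret', bearer_header: nil)
```

One consequence of this shape is that `nil` values inside `identity_headers` itself would survive the merge, because `compact` is applied before it.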
|
@@ -143,10 +145,10 @@ module Legion
|
|
|
143
145
|
|
|
144
146
|
def discover_offerings(live: false, **filters)
|
|
145
147
|
log.info { "discovering offerings live=#{live} from #{api_base}" }
|
|
146
|
-
offerings =
|
|
147
|
-
return
|
|
148
|
+
offerings = filter_offerings(allowed_offerings, **filters)
|
|
149
|
+
return offerings unless live
|
|
148
150
|
|
|
149
|
-
|
|
151
|
+
offerings.map do |offering|
|
|
150
152
|
with_live_metadata(offering)
|
|
151
153
|
rescue StandardError => e
|
|
152
154
|
handle_exception(e, level: :warn, handled: true, operation: 'azure_foundry.discover_offerings')
|
|
@@ -197,25 +199,49 @@ module Legion
|
|
|
197
199
|
models
|
|
198
200
|
end
|
|
199
201
|
|
|
200
|
-
def chat(
|
|
202
|
+
def chat(
|
|
203
|
+
messages:,
|
|
204
|
+
model:,
|
|
205
|
+
**options
|
|
206
|
+
)
|
|
201
207
|
log.info { "chat request model=#{model} messages=#{messages.size}" }
|
|
202
|
-
complete(messages, tools
|
|
208
|
+
complete(messages, tools: options.fetch(:tools, {}), temperature: options[:temperature],
|
|
209
|
+
model: model_info(model, max_tokens: options[:max_tokens]),
|
|
210
|
+
params: options.fetch(:params, {}), tool_prefs: options[:tool_prefs])
|
|
203
211
|
end
|
|
204
212
|
|
|
205
|
-
def stream(
|
|
213
|
+
def stream(
|
|
214
|
+
messages:,
|
|
215
|
+
model:,
|
|
216
|
+
**options,
|
|
217
|
+
&
|
|
218
|
+
)
|
|
206
219
|
log.info { "stream request model=#{model} messages=#{messages.size}" }
|
|
207
|
-
complete(messages, tools
|
|
220
|
+
complete(messages, tools: options.fetch(:tools, {}), temperature: options[:temperature],
|
|
221
|
+
model: model_info(model, max_tokens: options[:max_tokens]),
|
|
222
|
+
params: options.fetch(:params, {}), tool_prefs: options[:tool_prefs], &)
|
|
208
223
|
end
|
|
209
224
|
|
|
210
|
-
def embed(
|
|
225
|
+
def embed(
|
|
226
|
+
text:,
|
|
227
|
+
model:,
|
|
228
|
+
**options
|
|
229
|
+
)
|
|
211
230
|
log.info { "embed request model=#{model}" }
|
|
212
|
-
payload =
|
|
213
|
-
|
|
231
|
+
payload = Utils.deep_merge(
|
|
232
|
+
render_embedding_payload(text, model: model_id(model), dimensions: options[:dimensions]),
|
|
233
|
+
options.fetch(:params, {})
|
|
234
|
+
)
|
|
235
|
+
payload[:input_type] = options[:input_type] if options[:input_type]
|
|
214
236
|
response = connection.post(embedding_url(model:), payload)
|
|
215
237
|
parse_embedding_response(response, model: model_id(model), text:)
|
|
216
238
|
end
|
|
217
239
|
|
|
218
|
-
def count_tokens(
|
|
240
|
+
def count_tokens(
|
|
241
|
+
messages:,
|
|
242
|
+
model:,
|
|
243
|
+
**_provider_options
|
|
244
|
+
)
|
|
219
245
|
{
|
|
220
246
|
provider_family: :azure_foundry,
|
|
221
247
|
model: model_id(model),
|
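The new `chat`, `stream`, `embed`, and `count_tokens` signatures take required keywords only, which is the keyword-only contract the changelog mentions for 0.2.2. A small stub shows what that means for callers; `StubProvider` is hypothetical and returns a plain hash rather than calling Azure:

```ruby
# Hypothetical stub with the same keyword-only shape as the new `chat` signature.
class StubProvider
  def chat(messages:, model:, **options)
    { model: model, count: messages.size, temperature: options[:temperature] }
  end
end

provider = StubProvider.new
messages = [{ role: 'user', content: 'hello' }]
p provider.chat(messages: messages, model: 'gpt-4o-prod', temperature: 0.2)

begin
  # Positional style, as in the removed README examples.
  provider.chat(messages, model: 'gpt-4o-prod')
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Callers that still pass `messages` positionally fail fast with `ArgumentError` instead of being silently misinterpreted.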
|
@@ -276,6 +302,18 @@ module Legion
|
|
|
276
302
|
self.class.normalize_deployments(config.azure_foundry_deployments)
|
|
277
303
|
end
|
|
278
304
|
|
|
305
|
+
def allowed_offerings
|
|
306
|
+
configured_deployments.filter_map do |deployment|
|
|
307
|
+
offering = offering_from_config(deployment)
|
|
308
|
+
next unless offering
|
|
309
|
+
|
|
310
|
+
mid = offering.respond_to?(:model) ? offering.model : (offering[:model] || deployment[:model])
|
|
311
|
+
next unless model_allowed?(mid.to_s)
|
|
312
|
+
|
|
313
|
+
offering
|
|
314
|
+
end
|
|
315
|
+
end
|
|
316
|
+
|
|
279
317
|
def offering_from_config(deployment)
|
|
280
318
|
deployment_name = value_for(deployment, :deployment) || value_for(deployment, :model)
|
|
281
319
|
return nil if deployment_name.to_s.empty?
|
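`allowed_offerings` uses `filter_map` to build offerings and drop the ones that `model_allowed?` rejects. The real `model_allowed?` is inherited from the base provider and its rules are not visible in this diff, so the sketch below substitutes a simple allowlist; every name in it is hypothetical:

```ruby
# Hypothetical allowlist predicate; an empty allowlist permits everything.
def model_allowed?(model_id, allowlist)
  allowlist.empty? || allowlist.include?(model_id)
end

# Mirrors the filter_map shape: skip deployments without a name, then skip
# models the predicate rejects, and keep the rest.
def allowed_models(deployments, allowlist)
  deployments.filter_map do |deployment|
    name = deployment[:deployment] || deployment[:model]
    next if name.to_s.empty?
    next unless model_allowed?(name.to_s, allowlist)

    name
  end
end

deployments = [{ deployment: 'gpt-4o-prod' }, { model: 'embedding-prod' }, { deployment: '' }]
p allowed_models(deployments, ['gpt-4o-prod'])
```

`filter_map` keeps this to a single pass: a bare `next` yields `nil`, which is discarded from the result.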
|
@@ -295,8 +333,8 @@ module Legion
|
|
|
295
333
|
Legion::Extensions::Llm::Routing::ModelOffering.new(
|
|
296
334
|
provider_family: :azure_foundry,
|
|
297
335
|
instance_id: instance_id,
|
|
298
|
-
transport:
|
|
299
|
-
tier:
|
|
336
|
+
transport: offering_transport,
|
|
337
|
+
tier: offering_tier,
|
|
300
338
|
model: model,
|
|
301
339
|
usage_type: usage_type.to_sym,
|
|
302
340
|
capabilities: capabilities,
|
data/lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb
ADDED
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require 'legion/extensions/llm/fleet/provider_responder'
|
|
4
|
+
require 'legion/extensions/llm/azure_foundry'
|
|
5
|
+
|
|
6
|
+
module Legion
|
|
7
|
+
module Extensions
|
|
8
|
+
module Llm
|
|
9
|
+
module AzureFoundry
|
|
10
|
+
module Runners
|
|
11
|
+
# Runner entrypoint for Azure Foundry fleet request execution.
|
|
12
|
+
module FleetWorker
|
|
13
|
+
module_function
|
|
14
|
+
|
|
15
|
+
def handle_fleet_request(payload, delivery: nil, properties: nil)
|
|
16
|
+
Legion::Extensions::Llm::Fleet::ProviderResponder.call(
|
|
17
|
+
payload: payload,
|
|
18
|
+
provider_family: AzureFoundry::PROVIDER_FAMILY,
|
|
19
|
+
provider_class: AzureFoundry::Provider,
|
|
20
|
+
provider_instances: -> { AzureFoundry.discover_instances },
|
|
21
|
+
delivery: delivery,
|
|
22
|
+
properties: properties
|
|
23
|
+
)
|
|
24
|
+
end
|
|
25
|
+
end
|
|
26
|
+
end
|
|
27
|
+
end
|
|
28
|
+
end
|
|
29
|
+
end
|
|
30
|
+
end
|
data/lib/legion/extensions/llm/azure_foundry.rb
CHANGED
|
@@ -16,21 +16,33 @@ module Legion
|
|
|
16
16
|
PROVIDER_FAMILY = :azure_foundry
|
|
17
17
|
|
|
18
18
|
def self.default_settings
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
19
|
+
::Legion::Extensions::Llm.provider_settings(
|
|
20
|
+
family: PROVIDER_FAMILY,
|
|
21
|
+
instance: {
|
|
22
|
+
endpoint: nil,
|
|
23
|
+
tier: :frontier,
|
|
24
|
+
transport: :http,
|
|
25
|
+
credentials: {
|
|
26
|
+
api_key: nil,
|
|
27
|
+
bearer_token: nil
|
|
28
|
+
},
|
|
29
|
+
provider: {
|
|
30
|
+
api_version: Provider::DEFAULT_API_VERSION,
|
|
31
|
+
surface: nil,
|
|
32
|
+
deployments: []
|
|
33
|
+
},
|
|
34
|
+
usage: { inference: true, embedding: true, image: false },
|
|
35
|
+
limits: { concurrency: 4 },
|
|
36
|
+
fleet: {
|
|
37
|
+
enabled: false,
|
|
38
|
+
respond_to_requests: false,
|
|
39
|
+
capabilities: %i[chat stream_chat embed],
|
|
40
|
+
lanes: [],
|
|
41
|
+
concurrency: 4,
|
|
42
|
+
queue_suffix: nil
|
|
43
|
+
}
|
|
44
|
+
}
|
|
45
|
+
)
|
|
34
46
|
end
|
|
35
47
|
|
|
36
48
|
def self.provider_class
|
|
@@ -48,14 +60,15 @@ module Legion
|
|
|
48
60
|
instances
|
|
49
61
|
end
|
|
50
62
|
|
|
51
|
-
def self.discover_default_instance(instances)
|
|
63
|
+
def self.discover_default_instance(instances) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
|
|
52
64
|
cfg = CredentialSources.setting(:extensions, :llm, :azure_foundry)
|
|
53
65
|
return unless cfg.is_a?(Hash)
|
|
54
66
|
|
|
55
|
-
endpoint = cfg[:endpoint] || cfg['endpoint']
|
|
67
|
+
endpoint = cfg[:endpoint] || cfg['endpoint'] || cfg[:base_url] || cfg['base_url'] || cfg[:api_base] ||
|
|
68
|
+
cfg['api_base']
|
|
56
69
|
return if endpoint.nil? || endpoint.to_s.strip.empty?
|
|
57
70
|
|
|
58
|
-
instances[:settings] = cfg
|
|
71
|
+
instances[:settings] = normalize_instance_config(cfg).merge(tier: :cloud)
|
|
59
72
|
end
|
|
60
73
|
|
|
61
74
|
def self.discover_named_instances(instances)
|
|
@@ -68,21 +81,35 @@ module Legion
|
|
|
68
81
|
named.each { |name, config| add_named_instance(instances, name, config) }
|
|
69
82
|
end
|
|
70
83
|
|
|
71
|
-
def self.add_named_instance(instances, name, config)
|
|
84
|
+
def self.add_named_instance(instances, name, config) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
|
|
72
85
|
return unless config.is_a?(Hash)
|
|
73
86
|
|
|
74
|
-
endpoint = config[:endpoint] || config['endpoint']
|
|
87
|
+
endpoint = config[:endpoint] || config['endpoint'] || config[:base_url] || config['base_url'] ||
|
|
88
|
+
config[:api_base] || config['api_base']
|
|
75
89
|
return if endpoint.nil? || endpoint.to_s.strip.empty?
|
|
76
90
|
|
|
77
|
-
instances[name.to_sym] = config.merge(tier: :cloud)
|
|
91
|
+
instances[name.to_sym] = normalize_instance_config(config).merge(tier: :cloud)
|
|
78
92
|
end
|
|
79
93
|
|
|
80
|
-
|
|
94
|
+
def self.normalize_instance_config(config) # rubocop:disable Metrics/AbcSize, Metrics/CyclomaticComplexity, Metrics/PerceivedComplexity
|
|
95
|
+
normalized = config.to_h.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
|
|
96
|
+
normalized[:azure_foundry_endpoint] ||= normalized.delete(:endpoint)
|
|
97
|
+
normalized[:azure_foundry_endpoint] ||= normalized.delete(:base_url)
|
|
98
|
+
normalized[:azure_foundry_endpoint] ||= normalized.delete(:api_base)
|
|
99
|
+
normalized[:azure_foundry_api_key] ||= normalized.delete(:api_key)
|
|
100
|
+
normalized[:azure_foundry_bearer_token] ||= normalized.delete(:bearer_token)
|
|
101
|
+
normalized[:azure_foundry_api_version] ||= normalized.delete(:api_version)
|
|
102
|
+
normalized[:azure_foundry_surface] ||= normalized.delete(:surface)
|
|
103
|
+
normalized[:azure_foundry_deployments] ||= normalized.delete(:deployments)
|
|
104
|
+
normalized.compact.except(:instances)
|
|
105
|
+
end
|
|
106
|
+
|
|
107
|
+
private_class_method :discover_default_instance, :discover_named_instances, :add_named_instance,
|
|
108
|
+
:normalize_instance_config
|
|
81
109
|
|
|
82
|
-
Legion::Extensions::Llm::Configuration.register_provider_options(Provider.configuration_options)
|
|
110
|
+
Legion::Extensions::Llm::Configuration.register_provider_options(Provider.configuration_options) if
|
|
111
|
+
Legion::Extensions::Llm::Configuration.respond_to?(:register_provider_options)
|
|
83
112
|
end
|
|
84
113
|
end
|
|
85
114
|
end
|
|
86
115
|
end
|
|
87
|
-
|
|
88
|
-
Legion::Extensions::Llm::AzureFoundry.register_discovered_instances
|
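`normalize_instance_config` maps generic settings keys onto the `azure_foundry_*` config keys. The standalone sketch below covers only the endpoint and API key aliases so the behavior can be tried outside the gem; `normalize_keys` is a hypothetical name, and `Hash#except` assumes Ruby 3.0 or newer:

```ruby
# Standalone sketch of the key normalization (endpoint and api_key only).
def normalize_keys(config)
  normalized = config.to_h.transform_keys { |key| key.respond_to?(:to_sym) ? key.to_sym : key }
  # The first alias that yields a value wins; `||=` skips the rest.
  %i[endpoint base_url api_base].each do |generic|
    normalized[:azure_foundry_endpoint] ||= normalized.delete(generic)
  end
  normalized[:azure_foundry_api_key] ||= normalized.delete(:api_key)
  normalized.compact.except(:instances)
end

p normalize_keys({ 'base_url' => 'https://example.services.ai.azure.com', 'api_key' => 'k', 'instances' => {} })
```

Because `||=` short-circuits once the prefixed key is set, a later alias is not deleted: a config that supplies both `endpoint` and `base_url` keeps `base_url` in the result alongside `azure_foundry_endpoint`.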
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: lex-llm-azure-foundry
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.2.
|
|
4
|
+
version: 0.2.6
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- LegionIO
|
|
@@ -51,20 +51,34 @@ dependencies:
|
|
|
51
51
|
- - ">="
|
|
52
52
|
- !ruby/object:Gem::Version
|
|
53
53
|
version: 1.3.14
|
|
54
|
+
- !ruby/object:Gem::Dependency
|
|
55
|
+
name: legion-transport
|
|
56
|
+
requirement: !ruby/object:Gem::Requirement
|
|
57
|
+
requirements:
|
|
58
|
+
- - ">="
|
|
59
|
+
- !ruby/object:Gem::Version
|
|
60
|
+
version: 1.4.14
|
|
61
|
+
type: :runtime
|
|
62
|
+
prerelease: false
|
|
63
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
64
|
+
requirements:
|
|
65
|
+
- - ">="
|
|
66
|
+
- !ruby/object:Gem::Version
|
|
67
|
+
version: 1.4.14
|
|
54
68
|
- !ruby/object:Gem::Dependency
|
|
55
69
|
name: lex-llm
|
|
56
70
|
requirement: !ruby/object:Gem::Requirement
|
|
57
71
|
requirements:
|
|
58
72
|
- - ">="
|
|
59
73
|
- !ruby/object:Gem::Version
|
|
60
|
-
version: 0.3
|
|
74
|
+
version: 0.4.3
|
|
61
75
|
type: :runtime
|
|
62
76
|
prerelease: false
|
|
63
77
|
version_requirements: !ruby/object:Gem::Requirement
|
|
64
78
|
requirements:
|
|
65
79
|
- - ">="
|
|
66
80
|
- !ruby/object:Gem::Version
|
|
67
|
-
version: 0.3
|
|
81
|
+
version: 0.4.3
|
|
68
82
|
description: Azure AI Foundry and Azure OpenAI hosted provider integration for LegionIO
|
|
69
83
|
LLM routing.
|
|
70
84
|
email:
|
|
@@ -84,7 +98,9 @@ files:
|
|
|
84
98
|
- README.md
|
|
85
99
|
- lex-llm-azure-foundry.gemspec
|
|
86
100
|
- lib/legion/extensions/llm/azure_foundry.rb
|
|
101
|
+
- lib/legion/extensions/llm/azure_foundry/actors/fleet_worker.rb
|
|
87
102
|
- lib/legion/extensions/llm/azure_foundry/provider.rb
|
|
103
|
+
- lib/legion/extensions/llm/azure_foundry/runners/fleet_worker.rb
|
|
88
104
|
- lib/legion/extensions/llm/azure_foundry/version.rb
|
|
89
105
|
homepage: https://github.com/LegionIO/lex-llm-azure-foundry
|
|
90
106
|
licenses:
|