legion-llm 0.3.1 → 0.3.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 9a39a5fe8483ddcd715dafd4d65dfe1f4457b90e5a39e62cfa2a32b6c68c8e0c
- data.tar.gz: 37ed9c3b024a1cb9cce7eed1e287512c91269ef99fbb7b54342de57ceae9a668
+ metadata.gz: 7faf26458139d4c0e585e5c30e42602e85402e20abcbe2cd73ef8449ae17f947
+ data.tar.gz: 73fecb93dbfd407891e64a4278c21c9e9f7fcc2af8313660de4e075a3195bbc7
  SHA512:
- metadata.gz: a19943d8d25665e16ae55dfe6c0e32bad0e834a3eed3c5e028c0c0db672d531ea21e6015e6137b1f7c0b57bb38e2677091cab7d48dc3a3169cf9273fe6e468e7
- data.tar.gz: b9bd3d4586e64b9f1866d7e276ecdb3969e2204c8428b37e08fd262ddaef77846c5d28f192174b2ed5787d1576431aae4ebe29e2087499f06e2d0f9393e293ff
+ metadata.gz: 067c7e99927b675df13517a6ca5aa12b494fdedd8e277e7a42ae060e7597033597fbd0f88f595d5368937ba4a00aa7f6e2e02c14e56d01763f4253eb1cd3f421
+ data.tar.gz: 9dc754975461db838b49d1f9826d54a80c7bdf106fa44fa6fe3d3693eefe70a1d503ed5025e551a458b367ab53580c67b7c9cf1fb9ee01e69cd5c2394181f150
data/CHANGELOG.md CHANGED
@@ -1,5 +1,23 @@
  # Legion LLM Changelog

+ ## [0.3.3] - 2026-03-17
+
+ ### Added
+ - `Router::GatewayInterceptor`: optional gateway routing mode for cloud-tier LLM calls
+ - Gateway settings: endpoint, API key, model policy per risk tier, fallback_to_direct
+ - Identity header builder: X-Agent-Id, X-Tenant-Id, X-AIRB-Project-Id, X-Risk-Tier
+ - Model selection policy: fnmatch-based allowlist per risk tier
+ - Wired gateway interceptor into `chat_single` for automatic cloud-tier interception
+
+ ## [0.3.2] - 2026-03-16
+
+ ### Added
+ - `Legion::LLM::Embeddings` module — structured wrapper around RubyLLM.embed with `generate`, `generate_batch`, `default_model`
+ - `Legion::LLM::ShadowEval` module — parallel evaluation on a cheaper model with configurable sample rate for quality comparison
+ - `Legion::LLM::StructuredOutput` module — JSON schema enforcement with native `response_format` for capable models and prompt-based fallback with retry logic
+ - `embed_batch` and `structured` convenience methods on `Legion::LLM`
+ - `Settings.dig` support in spec_helper for nested settings access in tests
+
  ## [0.3.1] - 2026-03-16

  ### Removed
data/CLAUDE.md CHANGED
@@ -38,6 +38,9 @@ Legion::LLM (lib/legion/llm.rb)
  │ └── System # Queries OS memory: macOS (vm_stat/sysctl), Linux (/proc/meminfo)
  ├── QualityChecker # Response quality heuristics (empty, too_short, repetition, json_parse, json_expected) + pluggable callable
  ├── EscalationHistory # Mixin for response objects: escalation_history, escalated?, final_resolution, escalation_chain
+ ├── Embeddings # Structured embedding wrapper: generate, generate_batch, default_model
+ ├── ShadowEval # Parallel shadow evaluation on cheaper models with sampling
+ ├── StructuredOutput # JSON schema enforcement with native response_format and prompt fallback
  ├── Router # Dynamic weighted routing engine
  │ ├── Resolution # Value object: tier, provider, model, rule name, metadata, compress_level
  │ ├── Rule # Routing rule: intent matching, schedule windows, constraints
@@ -278,7 +281,10 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
  | `lib/legion/llm/router/health_tracker.rb` | HealthTracker: circuit breaker, latency window, pluggable signal handlers |
  | `lib/legion/llm/discovery/ollama.rb` | Ollama /api/tags discovery with TTL cache |
  | `lib/legion/llm/discovery/system.rb` | OS memory introspection (macOS + Linux) with TTL cache |
- | `lib/legion/llm/version.rb` | Version constant (0.3.0) |
+ | `lib/legion/llm/embeddings.rb` | Embeddings module: generate, generate_batch, default_model |
+ | `lib/legion/llm/shadow_eval.rb` | Shadow evaluation: enabled?, should_sample?, evaluate, compare |
+ | `lib/legion/llm/structured_output.rb` | JSON schema enforcement with native response_format and prompt fallback |
+ | `lib/legion/llm/version.rb` | Version constant (0.3.2) |
  | `lib/legion/llm/quality_checker.rb` | QualityChecker module with QualityResult struct |
  | `lib/legion/llm/escalation_history.rb` | EscalationHistory mixin: `escalation_history`, `escalated?`, `final_resolution`, `escalation_chain` |
  | `lib/legion/llm/router/escalation_chain.rb` | EscalationChain value object |
@@ -306,6 +312,9 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
  | `spec/legion/llm/router/escalation_chain_spec.rb` | EscalationChain tests |
  | `spec/legion/llm/router/resolve_chain_spec.rb` | Router.resolve_chain tests |
  | `spec/legion/llm/transport/escalation_spec.rb` | Transport tests |
+ | `spec/legion/llm/embeddings_spec.rb` | Embeddings tests |
+ | `spec/legion/llm/shadow_eval_spec.rb` | ShadowEval tests |
+ | `spec/legion/llm/structured_output_spec.rb` | StructuredOutput tests |
  | `spec/spec_helper.rb` | Stubbed Legion::Logging and Legion::Settings for testing |

  ## Extension Integration
@@ -365,7 +374,7 @@ The legacy `vault_path` per-provider setting was removed in v0.3.1.
  Tests run without the full LegionIO stack. `spec/spec_helper.rb` stubs `Legion::Logging` and `Legion::Settings` with in-memory implementations. Each test resets settings to defaults via `before(:each)`.

  ```bash
- bundle exec rspec # 269 examples, 0 failures
+ bundle exec rspec # 287 examples, 0 failures
  bundle exec rubocop # 31 files, 0 offenses
  ```

data/README.md CHANGED
@@ -12,7 +12,7 @@ Or add to your Gemfile and `bundle install`.

  ## Configuration

- Add to your LegionIO settings directory:
+ Add to your LegionIO settings directory (e.g. `~/.legionio/settings/llm.json`):

  ```json
  {
@@ -23,14 +23,15 @@ Add to your LegionIO settings directory:
    "bedrock": {
      "enabled": true,
      "region": "us-east-2",
-     "vault_path": "legion/bedrock"
+     "bearer_token": ["vault://secret/data/llm/bedrock#bearer_token", "env://AWS_BEARER_TOKEN"]
    },
    "anthropic": {
      "enabled": false,
-     "vault_path": "legion/anthropic"
+     "api_key": "env://ANTHROPIC_API_KEY"
    },
    "openai": {
-     "enabled": false
+     "enabled": false,
+     "api_key": "env://OPENAI_API_KEY"
    },
    "ollama": {
      "enabled": false,
@@ -41,7 +42,7 @@ Add to your LegionIO settings directory:
  }
  ```

- Credentials are resolved from Vault automatically when `vault_path` is set and Legion::Crypt is connected.
+ Credentials are resolved automatically by the universal secret resolver in `legion-settings` (v1.3.0+). Use `vault://` URIs for Vault secrets, `env://` for environment variables, or plain strings for static values. Array values act as fallback chains — the first non-nil result wins.

  ### Provider Configuration

@@ -50,8 +51,7 @@ Each provider supports these common fields:
  | Field | Type | Description |
  |-------|------|-------------|
  | `enabled` | Boolean | Enable this provider (default: `false`) |
- | `api_key` | String | API key (resolved from Vault if `vault_path` set) |
- | `vault_path` | String | Vault secret path for credential resolution |
+ | `api_key` | String | API key (supports `vault://`, `env://`, or plain string) |

  Provider-specific fields:

@@ -60,19 +60,23 @@ Provider-specific fields:
  | **Bedrock** | `secret_key`, `session_token`, `region` (default: `us-east-2`), `bearer_token` (alternative to SigV4 — for AWS Identity Center/SSO) |
  | **Ollama** | `base_url` (default: `http://localhost:11434`) |

- ### Vault Credential Resolution
+ ### Credential Resolution

- When `vault_path` is set and `Legion::Crypt::Vault` is connected, credentials are fetched from Vault at startup. The secret keys map to provider fields automatically:
+ All credential fields support the universal `vault://` and `env://` URI schemes provided by `legion-settings`. Use array values for fallback chains:

- | Provider | Vault Key | Maps To |
- |----------|-----------|---------|
- | Bedrock | `access_key` / `aws_access_key_id` | `api_key` |
- | Bedrock | `secret_key` / `aws_secret_access_key` | `secret_key` |
- | Bedrock | `session_token` / `aws_session_token` | `session_token` |
- | Bedrock | `bearer_token` / `aws_bearer_token` | `bearer_token` (Identity Center/SSO) |
- | Anthropic / OpenAI / Gemini | `api_key` / `token` | `api_key` |
+ ```json
+ {
+   "bedrock": {
+     "enabled": true,
+     "api_key": ["vault://secret/data/llm/bedrock#access_key", "env://AWS_ACCESS_KEY_ID"],
+     "secret_key": ["vault://secret/data/llm/bedrock#secret_key", "env://AWS_SECRET_ACCESS_KEY"],
+     "bearer_token": ["vault://secret/data/llm/bedrock#bearer_token", "env://AWS_BEARER_TOKEN"],
+     "region": "us-east-2"
+   }
+ }
+ ```

- Direct configuration (setting `api_key` in settings) takes precedence over Vault-resolved values.
+ By the time `Legion::LLM.start` runs, all `vault://` and `env://` references have already been resolved to plain strings by `Legion::Settings.resolve_secrets!` (called in the boot sequence after `Legion::Crypt.start`). The `env://` scheme works even when Vault is not connected.

  ### Auto-Detection

@@ -91,7 +95,7 @@ If no `default_model` or `default_provider` is set, legion-llm auto-detects from
  ### Lifecycle

  ```ruby
- Legion::LLM.start # Configure providers, resolve Vault credentials, warm discovery caches, set defaults, ping provider
+ Legion::LLM.start # Configure providers, warm discovery caches, set defaults, ping provider
  Legion::LLM.shutdown # Mark disconnected, clean up
  Legion::LLM.started? # -> Boolean
  Legion::LLM.settings # -> Hash (current LLM settings)
@@ -556,10 +560,10 @@ end

  | Provider | Config Key | Credential Source | Notes |
  |----------|-----------|-------------------|-------|
- | AWS Bedrock | `bedrock` | Vault (`access_key`, `secret_key`) or direct | Default region: us-east-2 |
- | Anthropic | `anthropic` | Vault (`api_key`) or direct | Direct API access |
- | OpenAI | `openai` | Vault (`api_key`) or direct | GPT models |
- | Google Gemini | `gemini` | Vault (`api_key`) or direct | Gemini models |
+ | AWS Bedrock | `bedrock` | `vault://`, `env://`, or direct | Default region: us-east-2, SigV4 or Bearer Token auth |
+ | Anthropic | `anthropic` | `vault://`, `env://`, or direct | Direct API access |
+ | OpenAI | `openai` | `vault://`, `env://`, or direct | GPT models |
+ | Google Gemini | `gemini` | `vault://`, `env://`, or direct | Gemini models |
  | Ollama | `ollama` | Local, no credentials needed | Local inference |

  ## Integration with LegionIO
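The fallback-chain resolution described above can be sketched in plain Ruby. This is a hypothetical illustration, not the actual legion-settings resolver; `resolve_secret`, its lookup hashes, and the sample values are invented for the example:

```ruby
# Sketch of fallback-chain secret resolution (assumed behavior, not the
# legion-settings implementation). `env` and `vault` hashes stand in for
# real ENV and Vault lookups.
def resolve_secret(value, env: {}, vault: {})
  Array(value).lazy.map do |candidate|
    case candidate
    when %r{\Aenv://(.+)\z}   then env[Regexp.last_match(1)]
    when %r{\Avault://(.+)\z} then vault[Regexp.last_match(1)]
    else candidate # plain static string passes through unchanged
    end
  end.find { |resolved| !resolved.nil? } # first non-nil result wins
end

# Vault miss falls through to the env:// entry:
resolve_secret(['vault://secret/data/llm/bedrock#bearer_token', 'env://AWS_BEARER_TOKEN'],
               env: { 'AWS_BEARER_TOKEN' => 'abc123' }, vault: {}) # => "abc123"
```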
data/lib/legion/llm/embeddings.rb ADDED
@@ -0,0 +1,43 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module LLM
+     module Embeddings
+       class << self
+         def generate(text:, model: nil, dimensions: nil)
+           model ||= default_model
+           opts = { model: model }
+           opts[:dimensions] = dimensions if dimensions
+
+           response = RubyLLM.embed(text, **opts)
+           {
+             vector: response.vectors.first,
+             model: model,
+             dimensions: response.vectors.first&.size || 0,
+             tokens: response.input_tokens
+           }
+         rescue StandardError => e
+           Legion::Logging.warn "Embedding failed: #{e.message}" if defined?(Legion::Logging)
+           { vector: nil, model: model, error: e.message }
+         end
+
+         def generate_batch(texts:, model: nil, dimensions: nil)
+           model ||= default_model
+           opts = { model: model }
+           opts[:dimensions] = dimensions if dimensions
+
+           response = RubyLLM.embed(texts, **opts)
+           response.vectors.each_with_index.map do |vec, i|
+             { vector: vec, model: model, dimensions: vec&.size || 0, index: i }
+           end
+         rescue StandardError => e
+           texts.map { |_| { vector: nil, model: model, error: e.message } }
+         end
+
+         def default_model
+           Legion::Settings.dig(:llm, :embeddings, :default_model) || 'text-embedding-3-small'
+         end
+       end
+     end
+   end
+ end
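For illustration, the result hash that `Embeddings.generate` builds can be reproduced with a stubbed response object. `FakeResponse` and `build_result` are invented here; the real call goes through `RubyLLM.embed`:

```ruby
# FakeResponse stands in for the object RubyLLM.embed returns (assumed to
# expose `vectors` and `input_tokens`, per the code above).
FakeResponse = Struct.new(:vectors, :input_tokens)

# Mirrors the hash-building logic in Embeddings.generate.
def build_result(response, model)
  {
    vector: response.vectors.first,
    model: model,
    dimensions: response.vectors.first&.size || 0,
    tokens: response.input_tokens
  }
end

result = build_result(FakeResponse.new([[0.1, 0.2, 0.3]], 5), 'text-embedding-3-small')
# result[:dimensions] == 3, result[:tokens] == 5
```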
data/lib/legion/llm/router/gateway_interceptor.rb ADDED
@@ -0,0 +1,65 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module LLM
+     module Router
+       module GatewayInterceptor
+         module_function
+
+         def intercept(resolution, context: {})
+           return resolution unless gateway_enabled?
+           return resolution unless resolution&.tier == :cloud
+
+           model = resolution.model
+           risk_tier = context[:risk_tier]&.to_sym
+
+           unless model_allowed?(model, risk_tier)
+             Legion::Logging.warn "[llm] gateway policy blocked model=#{model} risk_tier=#{risk_tier}"
+             return nil
+           end
+
+           Resolution.new(
+             tier: :cloud,
+             provider: :gateway,
+             model: model,
+             rule: 'gateway_intercept',
+             metadata: { original_provider: resolution.provider }
+           )
+         end
+
+         def gateway_enabled?
+           settings = gateway_settings
+           settings[:enabled] == true && !settings[:endpoint].nil?
+         end
+
+         def model_allowed?(model, risk_tier)
+           return true unless risk_tier
+
+           allowlist = gateway_settings.dig(:model_policy, risk_tier)
+           return true unless allowlist.is_a?(Array) && !allowlist.empty?
+
+           allowlist.any? { |pattern| File.fnmatch?(pattern, model.to_s) }
+         end
+
+         def gateway_headers(context)
+           {
+             'X-Agent-Id' => context[:worker_id],
+             'X-Tenant-Id' => context[:tenant_id],
+             'X-AIRB-Project-Id' => context[:airb_project_id],
+             'X-Risk-Tier' => context[:risk_tier]&.to_s,
+             'X-Legion-Task-Id' => context[:task_id]&.to_s
+           }.compact
+         end
+
+         def gateway_settings
+           llm = Legion::Settings[:llm]
+           return {} unless llm.is_a?(Hash)
+
+           (llm[:gateway] || {}).transform_keys(&:to_sym)
+         rescue StandardError
+           {}
+         end
+       end
+     end
+   end
+ end
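The fnmatch allowlist check above is self-contained enough to demonstrate directly. In this minimal sketch, `allowed?` is a stand-in for `GatewayInterceptor.model_allowed?`, with the allowlist passed in rather than read from settings:

```ruby
# Glob-style model allowlist check, mirroring the File.fnmatch? logic above.
def allowed?(model, allowlist)
  return true if allowlist.nil? || allowlist.empty? # no policy => allow all

  allowlist.any? { |pattern| File.fnmatch?(pattern, model.to_s) }
end

allowed?('claude-4-sonnet', ['claude-*'])   # => true
allowed?('gpt-4o', ['claude-*', 'llama-*']) # => false
```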
data/lib/legion/llm/router.rb CHANGED
@@ -4,6 +4,7 @@ require_relative 'router/resolution'
  require_relative 'router/rule'
  require_relative 'router/health_tracker'
  require_relative 'router/escalation_chain'
+ require_relative 'router/gateway_interceptor'
  require_relative 'discovery/ollama'
  require_relative 'discovery/system'

data/lib/legion/llm/settings.rb CHANGED
@@ -11,7 +11,8 @@ module Legion
  default_provider: nil,
  providers: providers,
  routing: routing_defaults,
- discovery: discovery_defaults
+ discovery: discovery_defaults,
+ gateway: gateway_defaults
  }
  end

@@ -47,6 +48,18 @@ module Legion
  }
  end

+ def self.gateway_defaults
+   {
+     enabled: false,
+     endpoint: nil,
+     api_key: nil,
+     timeout_seconds: 30,
+     model_policy: {},
+     headers: {},
+     fallback_to_direct: true
+   }
+ end
+
  def self.providers
  {
  bedrock: {
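Given these defaults, a gateway section of the LLM settings might look like the following. The endpoint, key reference, and model patterns are purely illustrative:

```json
{
  "gateway": {
    "enabled": true,
    "endpoint": "https://llm-gateway.example.internal",
    "api_key": "env://GATEWAY_API_KEY",
    "timeout_seconds": 30,
    "model_policy": {
      "high": ["claude-4-*"],
      "low": ["*"]
    },
    "fallback_to_direct": true
  }
}
```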
data/lib/legion/llm/shadow_eval.rb ADDED
@@ -0,0 +1,49 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module LLM
+     module ShadowEval
+       class << self
+         def enabled?
+           Legion::Settings.dig(:llm, :shadow, :enabled) == true
+         end
+
+         def should_sample?
+           return false unless enabled?
+
+           rate = Legion::Settings.dig(:llm, :shadow, :sample_rate) || 0.1
+           rand < rate
+         end
+
+         def evaluate(primary_response:, messages: nil, shadow_model: nil) # rubocop:disable Lint/UnusedMethodArgument
+           shadow_model ||= Legion::Settings.dig(:llm, :shadow, :model) || 'gpt-4o-mini'
+
+           shadow_response = Legion::LLM.send(:chat_single,
+                                              model: shadow_model, provider: nil,
+                                              intent: nil, tier: nil,
+                                              skip_shadow: true)
+
+           comparison = compare(primary_response, shadow_response, shadow_model)
+           Legion::Events.emit('llm.shadow_eval', comparison) if defined?(Legion::Events)
+           comparison
+         rescue StandardError => e
+           { error: e.message, shadow_model: shadow_model }
+         end
+
+         def compare(primary, shadow, shadow_model)
+           primary_len = primary[:content]&.length || 0
+           shadow_len = shadow[:content]&.length || 0
+
+           {
+             primary_model: primary[:model],
+             shadow_model: shadow_model,
+             primary_tokens: primary[:usage],
+             shadow_tokens: shadow[:usage],
+             length_ratio: primary_len.zero? ? 0.0 : shadow_len.to_f / primary_len,
+             evaluated_at: Time.now.utc
+           }
+         end
+       end
+     end
+   end
+ end
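The `length_ratio` metric computed by `ShadowEval.compare` can be sketched in isolation. `length_ratio` here is an extracted stand-in, not part of the module's API:

```ruby
# Ratio of shadow response length to primary response length, with the same
# nil/zero guards as ShadowEval.compare above.
def length_ratio(primary_content, shadow_content)
  primary_len = primary_content&.length || 0
  shadow_len = shadow_content&.length || 0
  primary_len.zero? ? 0.0 : shadow_len.to_f / primary_len
end

length_ratio('four', 'to') # => 0.5
length_ratio(nil, 'text')  # => 0.0 (guards against an empty primary)
```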
data/lib/legion/llm/structured_output.rb ADDED
@@ -0,0 +1,74 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module LLM
+     module StructuredOutput
+       SCHEMA_CAPABLE_MODELS = %w[gpt-4o gpt-4o-mini gpt-4-turbo claude-3-5-sonnet claude-4-sonnet claude-4-opus].freeze
+
+       class << self
+         def generate(messages:, schema:, model: nil, **)
+           model ||= Legion::LLM.settings[:default_model]
+           result = call_with_schema(messages, schema, model, **)
+
+           parsed = Legion::JSON.load(result[:content])
+           { data: parsed, raw: result[:content], model: result[:model], valid: true }
+         rescue ::JSON::ParserError => e
+           handle_parse_error(e, messages, schema, model, result, **)
+         end
+
+         private
+
+         def call_with_schema(messages, schema, model, **opts)
+           if supports_response_format?(model)
+             Legion::LLM.send(:chat_single,
+                              model: model, provider: nil, intent: nil, tier: nil,
+                              response_format: { type: 'json_schema',
+                                                 json_schema: { name: 'response', schema: schema } },
+                              **opts.except(:attempt))
+           else
+             instruction = "You MUST respond with valid JSON matching this schema:\n" \
+                           "```json\n#{Legion::JSON.dump(schema)}\n```\n" \
+                           'Respond with ONLY the JSON object, no other text.'
+             augmented = [{ role: 'system', content: instruction }] + Array(messages)
+             Legion::LLM.send(:chat_single,
+                              model: model, provider: nil, intent: nil, tier: nil,
+                              messages: augmented, **opts.except(:attempt))
+           end
+         end
+
+         def handle_parse_error(error, messages, schema, model, result, **opts)
+           if retry_enabled? && (opts[:attempt] || 0) < max_retries
+             retry_with_instruction(messages, schema, model, attempt: (opts[:attempt] || 0) + 1, **opts)
+           else
+             { data: nil, error: "JSON parse failed: #{error.message}", raw: result&.dig(:content), valid: false }
+           end
+         end
+
+         def retry_with_instruction(messages, schema, model, **opts)
+           instruction = "Your previous response was not valid JSON. Respond with ONLY a valid JSON object matching this schema:\n#{Legion::JSON.dump(schema)}"
+           augmented = Array(messages) + [{ role: 'user', content: instruction }]
+           result = Legion::LLM.send(:chat_single,
+                                     model: model, provider: nil, intent: nil, tier: nil,
+                                     messages: augmented, **opts.except(:attempt))
+
+           parsed = Legion::JSON.load(result[:content])
+           { data: parsed, raw: result[:content], model: result[:model], valid: true, retried: true }
+         rescue StandardError => e
+           { data: nil, error: e.message, valid: false }
+         end
+
+         def supports_response_format?(model)
+           SCHEMA_CAPABLE_MODELS.any? { |m| model.to_s.include?(m) }
+         end
+
+         def retry_enabled?
+           Legion::Settings.dig(:llm, :structured_output, :retry_on_parse_failure) != false
+         end
+
+         def max_retries
+           Legion::Settings.dig(:llm, :structured_output, :max_retries) || 2
+         end
+       end
+     end
+   end
+ end
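The prompt-based fallback path can be sketched without the LLM stack: prepend a schema instruction, parse the reply, and return `valid: false` on parse failure. Everything here (`structured_fallback`, `fake_llm`) is invented for illustration, and stdlib `JSON` stands in for `Legion::JSON`:

```ruby
require 'json'

# Sketch of the prompt-based fallback: inject the schema as a system message,
# then parse the model's reply. `fake_llm` is a lambda standing in for the
# real chat_single call.
def structured_fallback(messages, schema, fake_llm)
  instruction = "You MUST respond with valid JSON matching this schema:\n#{JSON.dump(schema)}"
  augmented = [{ role: 'system', content: instruction }] + messages
  reply = fake_llm.call(augmented)
  { data: JSON.parse(reply), valid: true }
rescue JSON::ParserError => e
  { data: nil, error: e.message, valid: false }
end

schema = { 'type' => 'object', 'properties' => { 'ok' => { 'type' => 'boolean' } } }
good = structured_fallback([], schema, ->(_msgs) { '{"ok": true}' })
bad  = structured_fallback([], schema, ->(_msgs) { 'not json' })
# good[:valid] => true; bad[:valid] => false
```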
data/lib/legion/llm/version.rb CHANGED
@@ -2,6 +2,6 @@

  module Legion
    module LLM
-     VERSION = '0.3.1'
+     VERSION = '0.3.3'
    end
  end
data/lib/legion/llm.rb CHANGED
@@ -74,16 +74,30 @@ module Legion
  end
  end

- # Generate embeddings
+ # Generate embeddings via Embeddings module
  # @param text [String, Array<String>] text to embed
  # @param model [String] embedding model ID
- # @return [RubyLLM::Embedding]
- def embed(text, model: nil)
-   if model
-     RubyLLM.embed(text, model: model)
-   else
-     RubyLLM.embed(text)
-   end
+ # @return [Hash] { vector:, model:, dimensions:, tokens: }
+ def embed(text, **)
+   require 'legion/llm/embeddings'
+   Embeddings.generate(text: text, **)
+ end
+
+ # Batch embed multiple texts
+ # @param texts [Array<String>] texts to embed
+ # @return [Array<Hash>]
+ def embed_batch(texts, **)
+   require 'legion/llm/embeddings'
+   Embeddings.generate_batch(texts: texts, **)
+ end
+
+ # Generate structured JSON output from LLM
+ # @param messages [Array<Hash>] conversation messages
+ # @param schema [Hash] JSON schema to enforce
+ # @return [Hash] { data:, raw:, model:, valid: }
+ def structured(messages:, schema:, **)
+   require 'legion/llm/structured_output'
+   StructuredOutput.generate(messages: messages, schema: schema, **)
  end

  # Create a configured agent instance
@@ -100,6 +114,7 @@ module Legion
  if (intent || tier) && Router.routing_enabled?
    resolution = Router.resolve(intent: intent, tier: tier, model: model, provider: provider)
    if resolution
+     resolution = Router::GatewayInterceptor.intercept(resolution, context: kwargs.fetch(:context, {}))
      model = resolution.model
      provider = resolution.provider
    end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: legion-llm
  version: !ruby/object:Gem::Version
- version: 0.3.1
+ version: 0.3.3
  platform: ruby
  authors:
  - Esity
@@ -92,16 +92,20 @@ files:
  - lib/legion/llm/compressor.rb
  - lib/legion/llm/discovery/ollama.rb
  - lib/legion/llm/discovery/system.rb
+ - lib/legion/llm/embeddings.rb
  - lib/legion/llm/escalation_history.rb
  - lib/legion/llm/helpers/llm.rb
  - lib/legion/llm/providers.rb
  - lib/legion/llm/quality_checker.rb
  - lib/legion/llm/router.rb
  - lib/legion/llm/router/escalation_chain.rb
+ - lib/legion/llm/router/gateway_interceptor.rb
  - lib/legion/llm/router/health_tracker.rb
  - lib/legion/llm/router/resolution.rb
  - lib/legion/llm/router/rule.rb
  - lib/legion/llm/settings.rb
+ - lib/legion/llm/shadow_eval.rb
+ - lib/legion/llm/structured_output.rb
  - lib/legion/llm/transport/exchanges/escalation.rb
  - lib/legion/llm/transport/messages/escalation_event.rb
  - lib/legion/llm/version.rb