legion-llm 0.6.24 → 0.6.25

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: c7e4263174302a505c21078bbc343c36134c08e22f0ad2d2741e6e2b4b747327
-   data.tar.gz: ef99cbea7efe6f0e0c479586c73792d814328169ca1e0faaf2362da8cb140be1
+   metadata.gz: 5e329176f3f041bc5cade11008d7a932325df80d510142da79f362ab5bd27e5a
+   data.tar.gz: 46f1e1a12ebd64bafdbe0e4d4f799df4addf586e5c89dfd7824cef4694582681
  SHA512:
-   metadata.gz: bb18ac2c9d7cb8108edc71208dc97b4befaed9ab6cdbcb7c1f8662c00402c08619609e3ce5d372b9f4483175020ef2100ac5b86ae2d147216e755f4211173f41
-   data.tar.gz: e0f5b35fb908eff66faea9509f984fa141531c467db9ab1c8e2276f5e47dcd7a65f42e63e5004db4d27334867489c578a44c51b3779d9d954bd7d4028487e37e
+   metadata.gz: 464cf343578d082092d57ed8a8091724a3508925088f795b1c1a1724250ec8876bb18ae32b8957b738df7439a31d1ce436275274997925f7cdc8ebafcdf99080
+   data.tar.gz: 0be8ab0bd23eed8cf0320aed8d422f9d1a63937e888ece646ead1c7615add65c8f769f6da250859c8f8e32013bb7daffd3310a311df9d279339902568e4151fb
data/CHANGELOG.md CHANGED
@@ -1,5 +1,35 @@
1
1
  # Legion LLM Changelog
2
2
 
3
+ ## [0.6.25] - 2026-04-08
4
+
5
+ ### Added
6
+ - `Legion::LLM::Transport::Message` — LLM base message class with `message_context` propagation, LLM-specific headers (`x-legion-llm-provider`, `x-legion-llm-model`, `x-legion-llm-request-type`, `x-legion-llm-schema-version`), context header promotion, and `tracing_headers` stub for future OpenTelemetry integration
7
+ - `Legion::LLM::Fleet::Exchange` — declares `llm.request` topic exchange (source of truth for fleet routing)
8
+ - `Legion::LLM::Fleet::Request` — fleet inference request message with priority mapping, TTL-to-expiration conversion, and `req_` prefixed message IDs
9
+ - `Legion::LLM::Fleet::Response` — fleet inference response message with default-exchange publish override, Bunny error rescue, and `resp_` prefixed message IDs
10
+ - `Legion::LLM::Fleet::Error` — fleet error message with `ERROR_CODES` registry (12 codes), `x-legion-fleet-error` header, default-exchange publish override, and `err_` prefixed message IDs
11
+ - `Legion::LLM::Metering::Exchange` — declares `llm.metering` topic exchange
12
+ - `Legion::LLM::Metering::Event` — metering event message with tier header, `metering.<type>` routing keys, and `meter_` prefixed message IDs
13
+ - `Legion::LLM::Metering` module — `emit(event)` and `flush_spool` public API replacing gateway dependency for metering
14
+ - `Legion::LLM::Audit::Exchange` — declares `llm.audit` topic exchange (supersedes `Transport::Exchanges::Audit`)
15
+ - `Legion::LLM::Audit::PromptEvent` — prompt audit message (always encrypted) with classification, caller, retention, and tier headers
16
+ - `Legion::LLM::Audit::ToolEvent` — tool call audit message (always encrypted) with tool metadata headers
17
+ - `Legion::LLM::Audit` module — `emit_prompt(event)` and `emit_tools(event)` public API (no spool — audit data too sensitive for plaintext disk)
18
+ - `Fleet::Dispatcher.build_routing_key` — builds `llm.request.<provider>.<type>.<model>` routing keys with `:` to `.` sanitization
19
+ - `Fleet::Dispatcher` per-type timeout resolution (`embed: 10s`, `chat: 30s`, `generate: 30s`) from settings or `TIMEOUTS` constant
20
+ - `Fleet::Dispatcher` backwards-compatible shim supporting both old `(model:, messages:)` and new `(request:, message_context:)` dispatch signatures
21
+ - `Fleet::ReplyDispatcher.fulfill_return` — handles `basic.return` with `no_fleet_queue` error
22
+ - `Fleet::ReplyDispatcher.fulfill_nack` — handles `basic.nack` with `fleet_backpressure` error
23
+ - `Fleet::ReplyDispatcher` type-aware delivery dispatch — handles `llm.fleet.response`, `llm.fleet.error`, and legacy (no type) formats
24
+ - `routing.tier_priority` setting — default `[local, fleet, direct]` three-tier ordering
25
+ - `routing.tiers.fleet.timeouts` setting — per-request-type timeout configuration
26
+
27
+ ### Changed
28
+ - `Fleet::Dispatcher#publish_request` now uses `Fleet::Request` message class (falls back to gateway `InferenceRequest` when `Fleet::Request` unavailable)
29
+ - `Pipeline::Steps::Metering#publish_event` now delegates to `Legion::LLM::Metering.emit` instead of `Gateway::Transport::Messages::MeteringEvent`
30
+ - `Pipeline::AuditPublisher#publish` now delegates to `Legion::LLM::Audit.emit_prompt` instead of raw `Transport::Messages::AuditEvent`
31
+ - `routing.tiers.fleet.queue` default changed from `llm.inference` to `llm.request` (fleet exchange rename)
32
+
3
33
  ## [0.6.24] - 2026-04-08
4
34
 
5
35
  ### Added
data/CLAUDE.md CHANGED
@@ -8,7 +8,7 @@
8
8
  Core LegionIO gem providing LLM capabilities to all extensions. Wraps ruby_llm to provide a consistent interface for chat, embeddings, tool use, and agents across multiple providers (Bedrock, Anthropic, OpenAI, Gemini, Ollama). Includes a dynamic weighted routing engine that dispatches requests across local, fleet, and cloud tiers based on caller intent, priority rules, time schedules, cost multipliers, and real-time provider health.
9
9
 
10
10
  **GitHub**: https://github.com/LegionIO/legion-llm
11
- **Version**: 0.6.18
11
+ **Version**: 0.6.25
12
12
  **License**: Apache-2.0
13
13
 
14
14
  ## Architecture
@@ -69,9 +69,20 @@ Legion::LLM (lib/legion/llm.rb)
69
69
  │ McpToolAdapter renamed to ToolAdapter; McpToolAdapter kept as a backwards-compatible alias.
70
70
  ├── CostEstimator # Model cost estimation with fuzzy pricing (absorbed from lex-llm-gateway)
71
71
  ├── Fleet # Fleet RPC dispatch (absorbed from lex-llm-gateway)
72
- │ ├── Dispatcher # Fleet dispatch with timeout and availability checks
72
+ │ ├── Exchange # Declares `llm.request` topic exchange (source of truth)
73
+ │ ├── Request # Fleet inference request message (type: 'llm.fleet.request')
74
+ │ ├── Response # Fleet inference response message (type: 'llm.fleet.response', default exchange publish)
75
+ │ ├── Error # Fleet error message (type: 'llm.fleet.error', ERROR_CODES registry)
76
+ │ ├── Dispatcher # Fleet dispatch with timeout and routing key building
73
77
  │ ├── Handler # Fleet request handler for GPU worker nodes
74
- │ └── ReplyDispatcher # Correlation-based reply routing for fleet RPC
78
+ │ └── ReplyDispatcher # Correlation-based reply routing with type-aware dispatch, fulfill_return, fulfill_nack
79
+ ├── Metering # Metering event emission (replaces gateway dependency)
80
+ │ ├── Exchange # Declares `llm.metering` topic exchange
81
+ │ └── Event # Metering event message (type: 'llm.metering.event')
82
+ ├── Audit # Audit event emission (replaces gateway dependency)
83
+ │ ├── Exchange # Declares `llm.audit` topic exchange
84
+ │ ├── PromptEvent # Prompt audit message (type: 'llm.audit.prompt', always encrypted)
85
+ │ └── ToolEvent # Tool audit message (type: 'llm.audit.tool', always encrypted)
75
86
  └── Helpers::LLM # Extension helper mixin (llm_chat, llm_embed, llm_session, compress:)
76
87
  ```
77
88
 
@@ -181,6 +192,20 @@ Legion::LLM.chat(message:, escalate: true, max_escalations: 3, quality_check:) #
181
192
  Legion::LLM::EscalationExhausted # raised when all escalation attempts are exhausted
182
193
  Legion::LLM::Router.resolve_chain(intent:, tier:, max_escalations:) # -> EscalationChain
183
194
  Legion::LLM::QualityChecker.check(response, quality_threshold: 50, json_expected: false, quality_check: nil) # -> QualityResult
195
+
196
+ # Metering
197
+ Legion::LLM::Metering.emit(event_hash) # -> :published | :spooled | :dropped
198
+ Legion::LLM::Metering.flush_spool # -> Integer (count flushed)
199
+
200
+ # Audit
201
+ Legion::LLM::Audit.emit_prompt(event_hash) # -> :published | :dropped
202
+ Legion::LLM::Audit.emit_tools(event_hash) # -> :published | :dropped
203
+
204
+ # Fleet Dispatcher
205
+ Legion::LLM::Fleet::Dispatcher.dispatch(model:, messages:, **) # Old signature (backwards compat)
206
+ Legion::LLM::Fleet::Dispatcher.dispatch(request:, message_context:, routing_key:, **) # New signature
207
+ Legion::LLM::Fleet::Dispatcher.build_routing_key(provider:, request_type:, model:) # -> String
208
+ Legion::LLM::Fleet::Dispatcher.fleet_available? # -> Boolean
184
209
  ```
185
210
 
186
211
  ## Settings
@@ -347,10 +372,22 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
347
372
  | `lib/legion/llm/pipeline/steps/rag_guard.rb` | Pipeline::Steps::RagGuard: faithfulness check against retrieved RAG context |
348
373
  | `lib/legion/llm/pipeline/enrichment_injector.rb` | Pipeline::EnrichmentInjector: converts RAG/GAIA enrichments into system prompt |
349
374
  | `lib/legion/llm/cost_estimator.rb` | CostEstimator: model cost estimation with fuzzy pricing |
350
- | `lib/legion/llm/fleet.rb` | Fleet module: requires dispatcher, handler, reply_dispatcher |
351
- | `lib/legion/llm/fleet/dispatcher.rb` | Fleet::Dispatcher: fleet RPC dispatch |
375
+ | `lib/legion/llm/transport/message.rb` | LLM base message class: message_context propagation, LLM headers, envelope key stripping |
376
+ | `lib/legion/llm/fleet.rb` | Fleet module: requires exchange, request, response, error, dispatcher, handler, reply_dispatcher |
377
+ | `lib/legion/llm/fleet/exchange.rb` | Fleet::Exchange: declares `llm.request` topic exchange |
378
+ | `lib/legion/llm/fleet/request.rb` | Fleet::Request: fleet inference request with priority mapping, TTL conversion |
379
+ | `lib/legion/llm/fleet/response.rb` | Fleet::Response: fleet response with default-exchange publish |
380
+ | `lib/legion/llm/fleet/error.rb` | Fleet::Error: fleet error with ERROR_CODES registry, error headers |
381
+ | `lib/legion/llm/fleet/dispatcher.rb` | Fleet::Dispatcher: fleet RPC dispatch with routing key building, per-type timeouts |
352
382
  | `lib/legion/llm/fleet/handler.rb` | Fleet::Handler: fleet request handler |
353
- | `lib/legion/llm/fleet/reply_dispatcher.rb` | Fleet::ReplyDispatcher: correlation-based reply routing |
383
+ | `lib/legion/llm/fleet/reply_dispatcher.rb` | Fleet::ReplyDispatcher: type-aware reply routing, fulfill_return, fulfill_nack |
384
+ | `lib/legion/llm/metering.rb` | Metering module: emit, flush_spool public API |
385
+ | `lib/legion/llm/metering/exchange.rb` | Metering::Exchange: declares `llm.metering` topic exchange |
386
+ | `lib/legion/llm/metering/event.rb` | Metering::Event: metering event message with tier header |
387
+ | `lib/legion/llm/audit.rb` | Audit module: emit_prompt, emit_tools public API |
388
+ | `lib/legion/llm/audit/exchange.rb` | Audit::Exchange: declares `llm.audit` topic exchange |
389
+ | `lib/legion/llm/audit/prompt_event.rb` | Audit::PromptEvent: prompt audit with classification/caller/retention headers |
390
+ | `lib/legion/llm/audit/tool_event.rb` | Audit::ToolEvent: tool audit with tool metadata headers |
354
391
  | `lib/legion/llm/helpers/llm.rb` | Extension helper mixin: llm_chat (with compress:, escalate:, max_escalations:, quality_check:), llm_embed, llm_session |
355
392
  | `spec/legion/llm_spec.rb` | Tests: settings, lifecycle, providers, auto-config |
356
393
  | `spec/legion/llm/integration_spec.rb` | Tests: routing integration with chat() |
@@ -390,8 +427,20 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
390
427
  | `spec/legion/llm/pipeline/executor_spec.rb` | Tests: Executor pipeline execution, profile skipping |
391
428
  | `spec/legion/llm/pipeline/integration_spec.rb` | Tests: Pipeline integration with chat() dispatch |
392
429
  | `spec/legion/llm/pipeline/steps/metering_spec.rb` | Tests: Metering event building |
393
- | `spec/legion/llm/fleet/dispatcher_spec.rb` | Tests: Fleet dispatch, availability, timeout |
430
+ | `spec/legion/llm/transport/message_spec.rb` | Tests: LLM base message class |
431
+ | `spec/legion/llm/fleet/exchange_spec.rb` | Tests: fleet exchange declaration |
432
+ | `spec/legion/llm/fleet/request_spec.rb` | Tests: Fleet::Request message |
433
+ | `spec/legion/llm/fleet/response_spec.rb` | Tests: Fleet::Response message |
434
+ | `spec/legion/llm/fleet/error_spec.rb` | Tests: Fleet::Error message |
435
+ | `spec/legion/llm/fleet/dispatcher_spec.rb` | Tests: Fleet dispatch, routing keys, per-type timeouts, ReplyDispatcher |
394
436
  | `spec/legion/llm/fleet/handler_spec.rb` | Tests: Fleet handler, auth, response building |
437
+ | `spec/legion/llm/metering/exchange_spec.rb` | Tests: metering exchange |
438
+ | `spec/legion/llm/metering/event_spec.rb` | Tests: Metering::Event message |
439
+ | `spec/legion/llm/metering_spec.rb` | Tests: Metering emit/spool API |
440
+ | `spec/legion/llm/audit/exchange_spec.rb` | Tests: audit exchange |
441
+ | `spec/legion/llm/audit/prompt_event_spec.rb` | Tests: Audit::PromptEvent |
442
+ | `spec/legion/llm/audit/tool_event_spec.rb` | Tests: Audit::ToolEvent |
443
+ | `spec/legion/llm/audit_spec.rb` | Tests: Audit emit API |
395
444
  | `spec/legion/llm/pipeline/steps/rag_context_spec.rb` | Tests: RAG context strategy selection, Apollo retrieval, graceful degradation |
396
445
  | `spec/legion/llm/pipeline/steps/rag_guard_spec.rb` | Tests: RAG faithfulness checking |
397
446
  | `spec/legion/llm/pipeline/enrichment_injector_spec.rb` | Tests: enrichment injection into system prompt |
@@ -0,0 +1,12 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module LLM
+     module Audit
+       class Exchange < ::Legion::Transport::Exchange
+         def exchange_name = 'llm.audit'
+         def default_type = 'topic'
+       end
+     end
+   end
+ end
@@ -0,0 +1,56 @@
+ # frozen_string_literal: true
+
+ require_relative '../transport/message'
+
+ module Legion
+   module LLM
+     module Audit
+       class PromptEvent < Legion::LLM::Transport::Message
+         def type = 'llm.audit.prompt'
+         def exchange = Legion::LLM::Audit::Exchange
+         def routing_key = "audit.prompt.#{@options[:request_type]}"
+         def priority = 0
+         def encrypt? = true
+         def expiration = nil
+
+         def headers
+           super.merge(classification_headers).merge(caller_headers).merge(retention_headers).merge(tier_header)
+         end
+
+         private
+
+         def message_id_prefix = 'audit_prompt'
+
+         def classification_headers
+           cls = @options[:classification] || {}
+           h = {}
+           h['x-legion-classification'] = cls[:level].to_s if cls[:level]
+           h['x-legion-contains-phi'] = cls[:contains_phi].to_s unless cls[:contains_phi].nil?
+           h['x-legion-jurisdictions'] = Array(cls[:jurisdictions]).join(',') if cls[:jurisdictions]
+           h
+         end
+
+         def caller_headers
+           caller_info = @options.dig(:caller, :requested_by) || {}
+           h = {}
+           h['x-legion-caller-identity'] = caller_info[:identity].to_s if caller_info[:identity]
+           h['x-legion-caller-type'] = caller_info[:type].to_s if caller_info[:type]
+           h
+         end
+
+         def retention_headers
+           cls = @options[:classification] || {}
+           h = {}
+           h['x-legion-retention'] = cls[:retention].to_s if cls[:retention]
+           h
+         end
+
+         def tier_header
+           h = {}
+           h['x-legion-llm-tier'] = @options[:tier].to_s if @options[:tier]
+           h
+         end
+       end
+     end
+   end
+ end
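The header methods in `PromptEvent` only emit a header when the corresponding option is present, and `contains_phi` is checked against `nil` so that an explicit `false` still produces a header. A minimal standalone sketch of that behavior follows. `build_prompt_headers` is a hypothetical helper written for illustration (it is not part of the gem), it takes a plain hash where the class reads `@options`, and the sample values are made up.

```ruby
# Hypothetical helper mirroring the header logic of Audit::PromptEvent.
# It is NOT part of legion-llm; it takes a plain hash instead of @options.
def build_prompt_headers(options)
  cls = options[:classification] || {}
  caller_info = options.dig(:caller, :requested_by) || {}
  headers = {}

  headers['x-legion-classification'] = cls[:level].to_s if cls[:level]
  # contains_phi may legitimately be false, so test for nil rather than truthiness
  headers['x-legion-contains-phi'] = cls[:contains_phi].to_s unless cls[:contains_phi].nil?
  headers['x-legion-jurisdictions'] = Array(cls[:jurisdictions]).join(',') if cls[:jurisdictions]
  headers['x-legion-caller-identity'] = caller_info[:identity].to_s if caller_info[:identity]
  headers['x-legion-caller-type'] = caller_info[:type].to_s if caller_info[:type]
  headers['x-legion-retention'] = cls[:retention].to_s if cls[:retention]
  headers['x-legion-llm-tier'] = options[:tier].to_s if options[:tier]
  headers
end

# Sample input; every value here is an illustrative assumption.
headers = build_prompt_headers(
  classification: { level: :restricted, contains_phi: false, jurisdictions: %w[us eu] },
  caller: { requested_by: { identity: 'user:example', type: :human } },
  tier: :fleet
)
```

With this input the result contains `x-legion-contains-phi` set to `"false"` and no `x-legion-retention` key, since no retention value was supplied.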
@@ -0,0 +1,46 @@
+ # frozen_string_literal: true
+
+ require_relative '../transport/message'
+
+ module Legion
+   module LLM
+     module Audit
+       class ToolEvent < Legion::LLM::Transport::Message
+         def type = 'llm.audit.tool'
+         def exchange = Legion::LLM::Audit::Exchange
+         def routing_key = "audit.tool.#{@options[:tool_name]}"
+         def priority = 0
+         def encrypt? = true
+         def expiration = nil
+
+         def headers
+           super.merge(tool_headers).merge(classification_headers)
+         end
+
+         private
+
+         def message_id_prefix = 'audit_tool'
+
+         def tool_headers
+           tc = @options[:tool_call] || {}
+           src = tc[:source] || {}
+           h = {}
+           tool_name = tc[:name] || @options[:tool_name]
+           h['x-legion-tool-name'] = tool_name.to_s if tool_name
+           h['x-legion-tool-source-type'] = src[:type].to_s if src[:type]
+           h['x-legion-tool-source-server'] = src[:server].to_s if src[:server]
+           h['x-legion-tool-status'] = tc[:status].to_s if tc[:status]
+           h
+         end
+
+         def classification_headers
+           cls = @options[:classification] || {}
+           h = {}
+           h['x-legion-classification'] = cls[:level].to_s if cls[:level]
+           h['x-legion-contains-phi'] = cls[:contains_phi].to_s unless cls[:contains_phi].nil?
+           h
+         end
+       end
+     end
+   end
+ end
@@ -0,0 +1,53 @@
+ # frozen_string_literal: true
+
+ require 'legion/logging/helper'
+
+ if defined?(Legion::Transport::Message)
+   require_relative 'audit/exchange'
+   require_relative 'audit/prompt_event'
+   require_relative 'audit/tool_event'
+ end
+
+ module Legion
+   module LLM
+     module Audit
+       extend Legion::Logging::Helper
+
+       module_function
+
+       def emit_prompt(event)
+         if transport_connected? && defined?(Legion::LLM::Audit::PromptEvent)
+           Legion::LLM::Audit::PromptEvent.new(**event).publish
+           log.info('[llm][audit] published prompt audit')
+           :published
+         else
+           log.warn('[llm][audit] dropped prompt audit: transport unavailable')
+           :dropped
+         end
+       rescue StandardError => e
+         handle_exception(e, level: :warn, operation: 'llm.audit.emit_prompt')
+         :dropped
+       end
+
+       def emit_tools(event)
+         if transport_connected? && defined?(Legion::LLM::Audit::ToolEvent)
+           Legion::LLM::Audit::ToolEvent.new(**event).publish
+           log.info('[llm][audit] published tool audit')
+           :published
+         else
+           log.warn('[llm][audit] dropped tool audit: transport unavailable')
+           :dropped
+         end
+       rescue StandardError => e
+         handle_exception(e, level: :warn, operation: 'llm.audit.emit_tools')
+         :dropped
+       end
+
+       def transport_connected?
+         !!(defined?(Legion::Transport) &&
+           Legion::Transport.respond_to?(:connected?) &&
+           Legion::Transport.connected?)
+       end
+     end
+   end
+ end
@@ -7,18 +7,66 @@ module Legion
7
7
  module Fleet
8
8
  module Dispatcher
9
9
  DEFAULT_TIMEOUT = 30
10
+
11
+ TIMEOUTS = {
12
+ embed: 10,
13
+ chat: 30,
14
+ generate: 30,
15
+ default: 30
16
+ }.freeze
17
+
10
18
  extend Legion::Logging::Helper
11
19
 
12
20
  module_function
13
21
 
14
- def dispatch(model:, messages:, **opts)
15
- return error_result('fleet_unavailable') unless fleet_available?
22
+ # Backwards-compatible shim: supports old (model:, messages:) and new (request:, message_context:) callers
23
+ def dispatch(model: nil, messages: nil, request: nil, message_context: {}, routing_key: nil, reply_to: nil, **opts)
24
+ return error_result('fleet_unavailable', message_context: message_context) unless fleet_available?
25
+
26
+ # Old calling convention: build minimal params from model/messages
27
+ if request.nil? && (model || messages)
28
+ provider = opts[:provider] || 'ollama'
29
+ request_type = opts[:request_type] || 'chat'
30
+ routing_key ||= build_routing_key(provider: provider, request_type: request_type, model: model)
31
+ reply_to ||= ReplyDispatcher.agent_queue_name
32
+ correlation_id = publish_request(
33
+ routing_key: routing_key, reply_to: reply_to,
34
+ provider: provider, model: model, request_type: request_type,
35
+ messages: messages, message_context: message_context, **opts
36
+ )
37
+ timeout = resolve_timeout(request_type: request_type, override: opts[:timeout])
38
+ return wait_for_response(correlation_id, timeout: timeout, message_context: message_context)
39
+ end
40
+
41
+ # New calling convention
42
+ request_opts =
43
+ if request.respond_to?(:to_h)
44
+ request.to_h.transform_keys(&:to_sym)
45
+ else
46
+ {}
47
+ end
48
+ request_opts = request_opts.merge(opts)
49
+
50
+ provider = request_opts[:provider] || 'ollama'
51
+ request_type = request_opts[:request_type] || 'chat'
52
+ model = request_opts[:model]
53
+ routing_key ||= build_routing_key(provider: provider, request_type: request_type, model: model)
54
+ reply_to ||= ReplyDispatcher.agent_queue_name
55
+ correlation_id = publish_request(
56
+ routing_key: routing_key, reply_to: reply_to,
57
+ provider: provider, model: model, request_type: request_type,
58
+ message_context: message_context, **request_opts.except(:provider, :model, :request_type, :timeout)
59
+ )
60
+ timeout = resolve_timeout(request_type: request_type, override: request_opts[:timeout] || opts[:timeout])
61
+ wait_for_response(correlation_id, timeout: timeout, message_context: message_context)
62
+ end
16
63
 
17
- correlation_id = "fleet_#{SecureRandom.hex(12)}"
18
- publish_request(model: model, messages: messages, intent: opts[:intent],
19
- correlation_id: correlation_id, **opts.except(:intent, :timeout))
64
+ def build_routing_key(provider:, request_type:, model:)
65
+ "llm.request.#{provider}.#{request_type}.#{sanitize_model(model)}"
66
+ end
20
67
 
21
- wait_for_response(correlation_id, timeout: resolve_timeout(opts[:timeout]))
68
+ def sanitize_model(model)
69
+ model.to_s.gsub(':', '.')
22
70
  end
23
71
 
24
72
  def fleet_available?
@@ -48,10 +96,17 @@ module Legion
48
96
  routing.fetch(:use_fleet, true)
49
97
  end
50
98
 
51
- def resolve_timeout(override)
99
+ def resolve_timeout(request_type: :default, override: nil)
52
100
  return override if override
53
101
 
54
- return DEFAULT_TIMEOUT unless defined?(Legion::Settings)
102
+ configured = fleet_timeout_from_settings(request_type)
103
+ return configured if configured
104
+
105
+ TIMEOUTS[request_type.to_sym] || TIMEOUTS[:default]
106
+ end
107
+
108
+ def fleet_timeout_from_settings(request_type)
109
+ return unless defined?(Legion::Settings)
55
110
 
56
111
  settings = begin
57
112
  Legion::Settings[:llm]
@@ -59,35 +114,51 @@ module Legion
59
114
  handle_exception(e, level: :debug, operation: 'llm.fleet.dispatcher.resolve_timeout')
60
115
  nil
61
116
  end
62
- return DEFAULT_TIMEOUT unless settings.is_a?(Hash)
63
117
 
64
- settings.dig(:routing, :fleet, :timeout_seconds) || DEFAULT_TIMEOUT
118
+ return unless settings.is_a?(Hash)
119
+
120
+ routing = settings[:routing]
121
+ return unless routing.is_a?(Hash)
122
+
123
+ fleet_settings = routing.dig(:tiers, :fleet)
124
+ fleet_settings = routing[:fleet] unless fleet_settings.is_a?(Hash)
125
+ return unless fleet_settings.is_a?(Hash)
126
+
127
+ fleet_settings.dig(:timeouts, request_type.to_sym) || fleet_settings[:timeout_seconds]
65
128
  end
66
129
 
67
- def publish_request(**)
68
- return unless defined?(Legion::Extensions::LLM::Gateway::Transport::Messages::InferenceRequest)
130
+ def publish_request(**opts)
131
+ correlation_id = "req_#{SecureRandom.uuid}"
132
+ opts[:fleet_correlation_id] = correlation_id
133
+
134
+ if defined?(Legion::LLM::Fleet::Request)
135
+ Legion::LLM::Fleet::Request.new(**opts).publish
136
+ elsif defined?(Legion::Extensions::LLM::Gateway::Transport::Messages::InferenceRequest)
137
+ Legion::Extensions::LLM::Gateway::Transport::Messages::InferenceRequest.new(
138
+ reply_to: opts[:reply_to], **opts.except(:reply_to)
139
+ ).publish
140
+ end
69
141
 
70
- Legion::Extensions::LLM::Gateway::Transport::Messages::InferenceRequest.new(
71
- reply_to: ReplyDispatcher.agent_queue_name, **
72
- ).publish
142
+ correlation_id
73
143
  end
74
144
 
75
- def wait_for_response(correlation_id, timeout:)
145
+ def wait_for_response(correlation_id, timeout:, message_context: {})
76
146
  future = ReplyDispatcher.register(correlation_id)
77
147
  result = future.value!(timeout)
78
- result || timeout_result(correlation_id, timeout)
148
+ result || timeout_result(correlation_id, timeout, message_context: message_context)
79
149
  rescue Concurrent::CancelledOperationError
80
- timeout_result(correlation_id, timeout)
150
+ timeout_result(correlation_id, timeout, message_context: message_context)
81
151
  ensure
82
152
  ReplyDispatcher.deregister(correlation_id)
83
153
  end
84
154
 
85
- def timeout_result(correlation_id, timeout)
86
- { success: false, error: 'fleet_timeout', correlation_id: correlation_id, timeout: timeout }
155
+ def timeout_result(correlation_id, timeout, message_context: {})
156
+ { success: false, error: 'fleet_timeout', correlation_id: correlation_id,
157
+ timeout: timeout, message_context: message_context }
87
158
  end
88
159
 
89
- def error_result(reason)
90
- { success: false, error: reason }
160
+ def error_result(reason, message_context: {})
161
+ { success: false, error: reason, message_context: message_context }
91
162
  end
92
163
  end
93
164
  end
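The routing-key and timeout changes in `Fleet::Dispatcher` can be tried in isolation. The sketch below copies the `TIMEOUTS` values and the `:` to `.` sanitization from the diff, but it is a standalone approximation: `FLEET_TIMEOUTS` and the `fleet_settings:` parameter are names introduced here for illustration (the real method reads `Legion::Settings`), and the model name is an example value.

```ruby
# Standalone sketch of the routing-key and timeout logic in Fleet::Dispatcher.
# FLEET_TIMEOUTS copies the TIMEOUTS constant from the diff; the fleet_settings
# hash stands in for Legion::Settings[:llm][:routing][:tiers][:fleet] (assumption).
FLEET_TIMEOUTS = { embed: 10, chat: 30, generate: 30, default: 30 }.freeze

def build_routing_key(provider:, request_type:, model:)
  # ':' in the model name is replaced with '.'
  "llm.request.#{provider}.#{request_type}.#{model.to_s.gsub(':', '.')}"
end

def resolve_timeout(request_type:, fleet_settings: {}, override: nil)
  # Order of precedence: explicit override, per-type setting,
  # legacy timeout_seconds setting, then the built-in defaults.
  return override if override

  configured = fleet_settings.dig(:timeouts, request_type.to_sym) || fleet_settings[:timeout_seconds]
  configured || FLEET_TIMEOUTS[request_type.to_sym] || FLEET_TIMEOUTS[:default]
end

key = build_routing_key(provider: 'ollama', request_type: 'chat', model: 'llama3:8b')
# key == "llm.request.ollama.chat.llama3.8b"

embed_timeout = resolve_timeout(request_type: 'embed')
# embed_timeout == 10 (built-in default, no settings supplied)
```

Note that an explicit `override` wins over both the settings and the per-type defaults, which matches the `return override if override` guard in the diff.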
@@ -0,0 +1,61 @@
+ # frozen_string_literal: true
+
+ require_relative '../transport/message'
+
+ module Legion
+   module LLM
+     module Fleet
+       class Error < Legion::LLM::Transport::Message
+         ERROR_CODES = %w[
+           model_not_loaded ollama_unavailable inference_failed inference_timeout
+           invalid_token token_expired payload_too_large unsupported_type
+           unsupported_streaming no_fleet_queue fleet_backpressure fleet_timeout
+         ].freeze
+
+         def type = 'llm.fleet.error'
+         def routing_key = @options[:reply_to]
+         def priority = 0
+         def expiration = nil
+         def encrypt? = false
+
+         def headers
+           super.merge(error_headers).merge(tracing_headers)
+         end
+
+         # Same default-exchange override as Fleet::Response.
+         def publish(options = @options)
+           raise unless @valid
+
+           validate_payload_size
+           channel.default_exchange.publish(
+             encode_message,
+             routing_key: routing_key,
+             content_type: options[:content_type] || content_type,
+             content_encoding: options[:content_encoding] || content_encoding,
+             headers: headers,
+             type: type,
+             priority: priority,
+             message_id: message_id,
+             correlation_id: correlation_id,
+             app_id: app_id,
+             timestamp: timestamp
+           )
+         rescue Bunny::ConnectionClosedError, Bunny::ChannelAlreadyClosed,
+                Bunny::NetworkErrorWrapper, IOError, Timeout::Error => e
+           spool_message(e)
+         end
+
+         private
+
+         def message_id_prefix = 'err'
+
+         def error_headers
+           h = {}
+           code = @options.dig(:error, :code)
+           h['x-legion-fleet-error'] = code.to_s if code
+           h
+         end
+       end
+     end
+   end
+ end
@@ -0,0 +1,12 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module LLM
+     module Fleet
+       class Exchange < ::Legion::Transport::Exchange
+         def exchange_name = 'llm.request'
+         def default_type = 'topic'
+       end
+     end
+   end
+ end
@@ -34,11 +34,36 @@ module Legion
34
34
  future = @pending.delete(cid)
35
35
  return unless future
36
36
 
37
- future.fulfill(payload)
37
+ # Type-aware dispatch (new protocol) with fallback to legacy (no type)
38
+ case properties[:type]
39
+ when 'llm.fleet.error'
40
+ future.fulfill(normalize_error(payload))
41
+ else
42
+ # 'llm.fleet.response' or legacy (no type)
43
+ future.fulfill(payload)
44
+ end
38
45
  rescue StandardError => e
39
46
  handle_exception(e, level: :warn)
40
47
  end
41
48
 
49
+ def fulfill_return(correlation_id)
50
+ future = @pending.delete(correlation_id)
51
+ return unless future
52
+
53
+ future.fulfill({ success: false, error: 'no_fleet_queue' })
54
+ rescue StandardError => e
55
+ handle_exception(e, level: :warn, operation: 'llm.fleet.reply_dispatcher.fulfill_return')
56
+ end
57
+
58
+ def fulfill_nack(correlation_id)
59
+ future = @pending.delete(correlation_id)
60
+ return unless future
61
+
62
+ future.fulfill({ success: false, error: 'fleet_backpressure' })
63
+ rescue StandardError => e
64
+ handle_exception(e, level: :warn, operation: 'llm.fleet.reply_dispatcher.fulfill_nack')
65
+ end
66
+
42
67
  def agent_queue_name
43
68
  @agent_queue_name ||= "llm.fleet.reply.#{SecureRandom.hex(8)}"
44
69
  end
@@ -62,7 +87,10 @@ module Legion
62
87
  channel = Legion::Transport.connection.create_channel
63
88
  queue = channel.queue(agent_queue_name, auto_delete: true, durable: false)
64
89
  @consumer = queue.subscribe(manual_ack: false) do |_delivery, properties, body|
65
- props = { correlation_id: properties.correlation_id }
90
+ props = {
91
+ correlation_id: properties.correlation_id,
92
+ type: properties.type
93
+ }
66
94
  handle_delivery(body, props)
67
95
  end
68
96
  end
@@ -96,6 +124,16 @@ module Legion
96
124
  handle_exception(e, level: :debug)
97
125
  {}
98
126
  end
127
+
128
+ def normalize_error(payload)
129
+ error = payload[:error] || {}
130
+ {
131
+ success: false,
132
+ error: error.is_a?(Hash) ? error[:code] || error[:message] || 'fleet_error' : error.to_s,
133
+ message_context: payload[:message_context] || {},
134
+ raw_error: error
135
+ }
136
+ end
99
137
  end
100
138
  end
101
139
  end
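The type-aware dispatch in `ReplyDispatcher` reduces to one rule: deliveries typed `llm.fleet.error` are normalized into a flat failure hash, and everything else (including legacy deliveries with no type) is passed through unchanged. A minimal sketch under that reading follows. `result_for` is a hypothetical wrapper added for illustration, and the sample payloads are invented. `normalize_error` follows the method in the diff, with parentheses added around the hash branch for readability.

```ruby
# Follows ReplyDispatcher.normalize_error from the diff (parentheses added).
def normalize_error(payload)
  error = payload[:error] || {}
  {
    success: false,
    error: error.is_a?(Hash) ? (error[:code] || error[:message] || 'fleet_error') : error.to_s,
    message_context: payload[:message_context] || {},
    raw_error: error
  }
end

# Hypothetical wrapper (not in the gem): returns the value that would be
# handed to the pending future for a delivery of the given AMQP type.
def result_for(type, payload)
  case type
  when 'llm.fleet.error' then normalize_error(payload)
  else payload # 'llm.fleet.response' or legacy deliveries with no type
  end
end

structured = result_for('llm.fleet.error', { error: { code: 'model_not_loaded', message: 'not loaded' } })
# structured[:error] == 'model_not_loaded' (code takes precedence over message)

plain = result_for('llm.fleet.error', { error: 'boom' })
# plain[:error] == 'boom' (non-hash errors are stringified)

legacy = result_for(nil, { success: true, content: 'hi' })
# legacy is the original payload, untouched
```

One consequence worth noticing is that an error delivery with no `:error` key at all still resolves to `'fleet_error'`, so callers always get a non-empty error string.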
@@ -0,0 +1,30 @@
+ # frozen_string_literal: true
+
+ require_relative '../transport/message'
+
+ module Legion
+   module LLM
+     module Fleet
+       class Request < Legion::LLM::Transport::Message
+         PRIORITY_MAP = { critical: 9, high: 7, normal: 5, low: 2 }.freeze
+
+         def type = 'llm.fleet.request'
+         def exchange = Legion::LLM::Fleet::Exchange
+         def routing_key = @options[:routing_key]
+         def reply_to = @options[:reply_to]
+         def priority = map_priority(@options[:priority])
+         def expiration = @options[:ttl] ? (@options[:ttl] * 1000).to_s : super
+
+         private
+
+         def message_id_prefix = 'req'
+
+         def map_priority(val)
+           return val if val.is_a?(Integer)
+
+           PRIORITY_MAP.fetch(val, 5)
+         end
+       end
+     end
+   end
+ end
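The priority mapping and TTL conversion in `Fleet::Request` are small enough to check by hand. The sketch below copies `PRIORITY_MAP` and `map_priority` from the diff. `expiration_for` is a hypothetical stand-in for the `expiration` method: it returns `nil` when no TTL is given, whereas the real method falls back to `super`, whose behavior is not shown in this diff.

```ruby
# Copied from Fleet::Request in the diff.
PRIORITY_MAP = { critical: 9, high: 7, normal: 5, low: 2 }.freeze

def map_priority(val)
  return val if val.is_a?(Integer)

  PRIORITY_MAP.fetch(val, 5)
end

# Hypothetical stand-in for Request#expiration. The real method calls `super`
# when no TTL is set; nil is used here only to keep the sketch self-contained.
def expiration_for(ttl)
  # TTL in seconds becomes a string of milliseconds
  ttl ? (ttl * 1000).to_s : nil
end

high = map_priority(:high)        # 7
passthrough = map_priority(3)     # integers are returned as-is
from_string = map_priority('high') # 5: the map has symbol keys, so strings hit the default
thirty_seconds = expiration_for(30) # "30000"
```

Because `PRIORITY_MAP` is keyed by symbols, a caller passing `'high'` as a string silently gets the default priority of 5, so it may be worth symbolizing the value before it reaches the message.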
@@ -0,0 +1,49 @@
+ # frozen_string_literal: true
+
+ require_relative '../transport/message'
+
+ module Legion
+   module LLM
+     module Fleet
+       class Response < Legion::LLM::Transport::Message
+         def type = 'llm.fleet.response'
+         def routing_key = @options[:reply_to]
+         def priority = 0
+         def expiration = nil
+
+         def headers
+           super.merge(tracing_headers)
+         end
+
+         # Override publish to use the AMQP default exchange ('').
+         # The base class's publish calls exchange.publish(...), but the
+         # default exchange is accessed via channel.default_exchange in Bunny.
+         def publish(options = @options)
+           raise unless @valid
+
+           validate_payload_size
+           channel.default_exchange.publish(
+             encode_message,
+             routing_key: routing_key,
+             content_type: options[:content_type] || content_type,
+             content_encoding: options[:content_encoding] || content_encoding,
+             headers: headers,
+             type: type,
+             priority: priority,
+             message_id: message_id,
+             correlation_id: correlation_id,
+             app_id: app_id,
+             timestamp: timestamp
+           )
+         rescue Bunny::ConnectionClosedError, Bunny::ChannelAlreadyClosed,
+                Bunny::NetworkErrorWrapper, IOError, Timeout::Error => e
+           spool_message(e)
+         end
+
+         private
+
+         def message_id_prefix = 'resp'
+       end
+     end
+   end
+ end
@@ -1,5 +1,14 @@
  # frozen_string_literal: true

+ # Message classes require Legion::Transport::Message base class at load time.
+ # Only load when the transport gem is available.
+ if defined?(Legion::Transport::Message)
+   require_relative 'fleet/exchange'
+   require_relative 'fleet/request'
+   require_relative 'fleet/response'
+   require_relative 'fleet/error'
+ end
+
  require_relative 'fleet/dispatcher'
  require_relative 'fleet/handler'
  require_relative 'fleet/reply_dispatcher'
@@ -0,0 +1,32 @@
+ # frozen_string_literal: true
+
+ require_relative '../transport/message'
+
+ module Legion
+   module LLM
+     module Metering
+       class Event < Legion::LLM::Transport::Message
+         def type = 'llm.metering.event'
+         def exchange = Legion::LLM::Metering::Exchange
+         def routing_key = "metering.#{@options[:request_type]}"
+         def priority = 0
+         def encrypt? = false
+         def expiration = nil
+
+         def headers
+           super.merge(tier_header)
+         end
+
+         private
+
+         def message_id_prefix = 'meter'
+
+         def tier_header
+           h = {}
+           h['x-legion-llm-tier'] = @options[:tier].to_s if @options[:tier]
+           h
+         end
+       end
+     end
+   end
+ end
@@ -0,0 +1,12 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Legion
4
+ module LLM
5
+ module Metering
6
+ class Exchange < ::Legion::Transport::Exchange
7
+ def exchange_name = 'llm.metering'
8
+ def default_type = 'topic'
9
+ end
10
+ end
11
+ end
12
+ end
@@ -0,0 +1,63 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'legion/logging/helper'
4
+
5
+ if defined?(Legion::Transport::Message)
6
+ require_relative 'metering/exchange'
7
+ require_relative 'metering/event'
8
+ end
9
+
10
+ module Legion
11
+ module LLM
12
+ module Metering
13
+ extend Legion::Logging::Helper
14
+
15
+ module_function
16
+
17
+ def emit(event)
18
+ if transport_connected? && defined?(Legion::LLM::Metering::Event)
19
+ Legion::LLM::Metering::Event.new(**event).publish
20
+ log.info("[llm][metering] published provider=#{event[:provider]} model=#{event[:model_id]}")
21
+ :published
22
+ elsif spool_available?
23
+ spool_event(event)
24
+ log.info("[llm][metering] spooled provider=#{event[:provider]} model=#{event[:model_id]}")
25
+ :spooled
26
+ else
27
+ log.warn("[llm][metering] dropped provider=#{event[:provider]} model=#{event[:model_id]}")
28
+ :dropped
29
+ end
30
+ rescue StandardError => e
31
+ handle_exception(e, level: :warn, operation: 'llm.metering.emit')
32
+ :dropped
33
+ end
34
+
35
+ def flush_spool
36
+ return 0 unless spool_available? && transport_connected?
37
+
38
+ spool = Legion::Data::Spool.for(Legion::LLM)
39
+ flushed = spool.flush(:metering) { |event| emit(event) }
40
+ log.info("[llm][metering] spool_flushed count=#{flushed}")
41
+ flushed
42
+ rescue StandardError => e
43
+ handle_exception(e, level: :warn, operation: 'llm.metering.flush_spool')
44
+ 0
45
+ end
46
+
47
+ def transport_connected?
48
+ !!(defined?(Legion::Transport) &&
49
+ Legion::Transport.respond_to?(:connected?) &&
50
+ Legion::Transport.connected?)
51
+ end
52
+
53
+ def spool_available?
54
+ !!defined?(Legion::Data::Spool)
55
+ end
56
+
57
+ def spool_event(event)
58
+ spool = Legion::Data::Spool.for(Legion::LLM)
59
+ spool.write(:metering, event)
60
+ end
61
+ end
62
+ end
63
+ end
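Note on `Metering.emit` above: it degrades in three steps, publishing when the transport is connected, spooling when only the spool is available, and dropping otherwise. The standalone sketch below reproduces that ordering. `FakeSpool`, `emit_with_fallback`, and its `connected:`, `publisher:` and `spool:` arguments are hypothetical stand-ins for `Legion::Transport` and `Legion::Data::Spool`, not their real APIs.

```ruby
# frozen_string_literal: true

# Hypothetical in-memory stand-in for Legion::Data::Spool; not the real API.
class FakeSpool
  attr_reader :events

  def initialize
    @events = []
  end

  def write(_topic, event)
    @events << event
  end
end

# Mirrors the publish -> spool -> drop ordering of Metering.emit.
# `publisher` is any callable; `spool` may be nil when no spool is available.
def emit_with_fallback(event, connected:, publisher:, spool: nil)
  if connected
    publisher.call(event)
    :published
  elsif spool
    spool.write(:metering, event)
    :spooled
  else
    :dropped
  end
rescue StandardError
  # As in the diff, any failure is reported as a dropped event.
  :dropped
end

event = { provider: 'ollama', model_id: 'example-model', request_type: 'chat' }
spool = FakeSpool.new
sent = []

emit_with_fallback(event, connected: true, publisher: ->(e) { sent << e })                # => :published
emit_with_fallback(event, connected: false, publisher: ->(e) { sent << e }, spool: spool) # => :spooled
emit_with_fallback(event, connected: false, publisher: ->(e) { sent << e })               # => :dropped
```

Returning a status symbol instead of raising keeps metering failures from interrupting the caller, which matches the `rescue StandardError` branch in the diff.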
@@ -7,9 +7,6 @@ module Legion
7
7
  module AuditPublisher
8
8
  extend Legion::Logging::Helper
9
9
 
10
- EXCHANGE = 'llm.audit'
11
- ROUTING_KEY = 'llm.audit.complete'
12
-
13
10
  module_function
14
11
 
15
12
  def build_event(request:, response:)
@@ -29,30 +26,28 @@ module Legion
29
26
  messages: request.messages,
30
27
  response_content: response.message[:content],
31
28
  tools_used: response.tools,
32
- timestamp: Time.now
29
+ timestamp: Time.now,
30
+ request_type: request.respond_to?(:request_type) ? request.request_type : 'chat',
31
+ tier: response.routing.is_a?(Hash) ? response.routing[:tier] : nil,
32
+ message_context: build_message_context(request: request, response: response)
33
33
  }
34
34
  end
35
35
 
36
36
  def publish(request:, response:)
37
37
  event = build_event(request: request, response: response)
38
-
39
- begin
40
- if defined?(Legion::Transport)
41
- require 'legion/llm/transport/exchanges/audit'
42
- require 'legion/llm/transport/messages/audit_event'
43
- Legion::LLM::Transport::Messages::AuditEvent.new(**event).publish
44
- else
45
- log.debug('audit publish skipped: transport unavailable')
46
- end
47
- rescue StandardError => e
48
- handle_exception(e, level: :warn)
49
- end
50
-
38
+ Legion::LLM::Audit.emit_prompt(event)
51
39
  event
52
40
  rescue StandardError => e
53
41
  handle_exception(e, level: :warn)
54
42
  nil
55
43
  end
44
+
45
+ def build_message_context(response:, **)
46
+ {
47
+ request_id: response.request_id,
48
+ conversation_id: response.conversation_id
49
+ }.compact
50
+ end
56
51
  end
57
52
  end
58
53
  end
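Note on `build_message_context` above: it is called with both `request:` and `response:`, but the signature `(response:, **)` discards everything except the response, and `.compact` removes identifiers that are `nil`. A small sketch of that behavior, assuming a hypothetical `SketchResponse` struct in place of the real response object:

```ruby
# frozen_string_literal: true

# Hypothetical stand-in for the pipeline response object.
SketchResponse = Struct.new(:request_id, :conversation_id, keyword_init: true)

# Same shape as the method in the diff: extra keywords are accepted and ignored.
def build_message_context(response:, **)
  {
    request_id: response.request_id,
    conversation_id: response.conversation_id
  }.compact
end

full = SketchResponse.new(request_id: 'req_1', conversation_id: 'conv_1')
partial = SketchResponse.new(request_id: 'req_2', conversation_id: nil)

build_message_context(request: :unused, response: full)    # => { request_id: 'req_1', conversation_id: 'conv_1' }
build_message_context(request: :unused, response: partial) # => { request_id: 'req_2' }
```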
@@ -16,27 +16,11 @@ module Legion
16
16
  end
17
17
 
18
18
  def publish_or_spool(event)
19
- if transport_connected?
20
- publish_event(event)
21
- log.info("[llm][metering] published provider=#{event[:provider]} model=#{event[:model_id]}")
22
- :published
23
- elsif spool_available?
24
- spool_event(event)
25
- log.info("[llm][metering] spooled provider=#{event[:provider]} model=#{event[:model_id]}")
26
- :spooled
27
- else
28
- log.warn("[llm][metering] dropped provider=#{event[:provider]} model=#{event[:model_id]}")
29
- :dropped
30
- end
19
+ publish_event(event)
31
20
  end
32
21
 
33
22
  def flush_spool
34
- return 0 unless spool_available? && transport_connected?
35
-
36
- spool = Legion::Data::Spool.for(Legion::LLM)
37
- flushed = spool.flush(:metering) { |event| publish_event(event) }
38
- log.info("[llm][metering] spool_flushed count=#{flushed}")
39
- flushed
23
+ Legion::LLM::Metering.flush_spool
40
24
  end
41
25
 
42
26
  def identity_fields(opts)
@@ -68,25 +52,8 @@ module Legion
68
52
  }
69
53
  end
70
54
 
71
- def transport_connected?
72
- !!(defined?(Legion::Transport) &&
73
- Legion::Transport.respond_to?(:connected?) &&
74
- Legion::Transport.connected?)
75
- end
76
-
77
- def spool_available?
78
- !!defined?(Legion::Data::Spool)
79
- end
80
-
81
55
  def publish_event(event)
82
- return unless defined?(Legion::Extensions::LLM::Gateway::Transport::Messages::MeteringEvent)
83
-
84
- Legion::Extensions::LLM::Gateway::Transport::Messages::MeteringEvent.new(**event).publish
85
- end
86
-
87
- def spool_event(event)
88
- spool = Legion::Data::Spool.for(Legion::LLM)
89
- spool.write(:metering, event)
56
+ Legion::LLM::Metering.emit(event)
90
57
  end
91
58
  end
92
59
  end
@@ -83,10 +83,15 @@ module Legion
83
83
  def self.routing_defaults
84
84
  {
85
85
  enabled: false,
86
+ tier_priority: %w[local fleet direct],
86
87
  default_intent: { privacy: 'normal', capability: 'moderate', cost: 'normal' },
87
88
  tiers: {
88
89
  local: { provider: 'ollama' },
89
- fleet: { queue: 'llm.inference', timeout_seconds: 30 },
90
+ fleet: {
91
+ queue: 'llm.request',
92
+ timeout_seconds: 30,
93
+ timeouts: { embed: 10, chat: 30, generate: 30, default: 30 }
94
+ },
90
95
  cloud: { providers: %w[bedrock anthropic] }
91
96
  },
92
97
  health: {
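Note on the new `timeouts` hash above: this hunk does not show how the per-request-type values are consumed. One plausible lookup order, offered only as an assumption, is the request type first, then `:default`, then the existing `timeout_seconds`. `FLEET_DEFAULTS` and `fleet_timeout_for` are hypothetical names used for this sketch.

```ruby
# frozen_string_literal: true

# Fleet tier defaults copied from the hunk above.
FLEET_DEFAULTS = {
  queue: 'llm.request',
  timeout_seconds: 30,
  timeouts: { embed: 10, chat: 30, generate: 30, default: 30 }
}.freeze

# Hypothetical helper: per-type timeout, then :default, then timeout_seconds.
def fleet_timeout_for(request_type, settings = FLEET_DEFAULTS)
  timeouts = settings[:timeouts] || {}
  timeouts[request_type.to_sym] || timeouts[:default] || settings[:timeout_seconds]
end

fleet_timeout_for(:embed)  # => 10
fleet_timeout_for('chat')  # => 30
fleet_timeout_for(:rerank) # => 30 (falls back to :default)
```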
@@ -0,0 +1,82 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'securerandom'
4
+
5
+ module Legion
6
+ module LLM
7
+ module Transport
8
+ class Message < ::Legion::Transport::Message
9
+ # Keys stripped from the JSON body (in addition to base ENVELOPE_KEYS).
10
+ # Do NOT add keys already in ENVELOPE_KEYS (:routing_key, :reply_to, etc.).
11
+ # Do NOT add :request_type — metering/audit need it in the body.
12
+ # Do NOT add :message_context — it MUST appear in the body of all 6 messages.
13
+ LLM_ENVELOPE_KEYS = %i[
14
+ fleet_correlation_id provider model ttl
15
+ ].freeze
16
+
17
+ def message_context
18
+ @options[:message_context] || {}
19
+ end
20
+
21
+ def message
22
+ @options.except(*ENVELOPE_KEYS, *LLM_ENVELOPE_KEYS)
23
+ end
24
+
25
+ def message_id
26
+ @options[:message_id] || "#{message_id_prefix}_#{SecureRandom.uuid}"
27
+ end
28
+
29
+ # Fleet messages use :fleet_correlation_id to avoid collision with the
30
+ # base class's :correlation_id (which falls through to :parent_id/:task_id).
31
+ def correlation_id
32
+ @options[:fleet_correlation_id] || super
33
+ end
34
+
35
+ def app_id
36
+ @options[:app_id] || 'legion-llm'
37
+ end
38
+
39
+ def headers
40
+ super.merge(llm_headers).merge(context_headers).merge(tracing_headers)
41
+ end
42
+
43
+ # Subclasses override to inject OpenTelemetry span context.
44
+ # Stub returns empty hash until tracing integration is implemented.
45
+ def tracing_headers
46
+ {}
47
+ end
48
+
49
+ private
50
+
51
+ def message_id_prefix = 'msg'
52
+
53
+ def option_value(*keys)
54
+ keys.each do |key|
55
+ value = @options[key]
56
+ return value if value
57
+ end
58
+ nil
59
+ end
60
+
61
+ def llm_headers
62
+ h = {}
63
+ h['x-legion-llm-provider'] = @options[:provider].to_s if @options[:provider]
64
+ model_val = option_value(:model, :model_id)
65
+ h['x-legion-llm-model'] = model_val.to_s if model_val
66
+ h['x-legion-llm-request-type'] = @options[:request_type].to_s if @options[:request_type]
67
+ h['x-legion-llm-schema-version'] = '1.0.0'
68
+ h
69
+ end
70
+
71
+ def context_headers
72
+ ctx = message_context
73
+ h = {}
74
+ h['x-legion-llm-conversation-id'] = ctx[:conversation_id].to_s if ctx[:conversation_id]
75
+ h['x-legion-llm-message-id'] = ctx[:message_id].to_s if ctx[:message_id]
76
+ h['x-legion-llm-request-id'] = ctx[:request_id].to_s if ctx[:request_id]
77
+ h
78
+ end
79
+ end
80
+ end
81
+ end
82
+ end
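Note on `Legion::LLM::Transport::Message` above: envelope keys are stripped from the JSON body, while provider, model, request type and selected `message_context` identifiers are promoted to `x-legion-llm-*` headers. The sketch below shows that split without the Legion base class. `SketchMessage` is hypothetical, and its `ENVELOPE_KEYS` is an assumed subset, since the base class's actual list is not shown in this diff. `Hash#except` requires Ruby 3.0 or later.

```ruby
# frozen_string_literal: true

require 'securerandom'

# Standalone sketch; does not inherit from Legion::Transport::Message.
class SketchMessage
  # Assumed stand-in for the base class's ENVELOPE_KEYS (real contents not shown here).
  ENVELOPE_KEYS = %i[routing_key reply_to].freeze
  LLM_ENVELOPE_KEYS = %i[fleet_correlation_id provider model ttl].freeze

  def initialize(**options)
    @options = options
  end

  # Body: everything except envelope keys; :message_context and :request_type stay.
  def message
    @options.except(*ENVELOPE_KEYS, *LLM_ENVELOPE_KEYS)
  end

  def message_id
    @options[:message_id] || "msg_#{SecureRandom.uuid}"
  end

  # Headers: provider/model/request type plus promoted context identifiers.
  def headers
    ctx = @options[:message_context] || {}
    h = { 'x-legion-llm-schema-version' => '1.0.0' }
    h['x-legion-llm-provider'] = @options[:provider].to_s if @options[:provider]
    model = @options[:model] || @options[:model_id]
    h['x-legion-llm-model'] = model.to_s if model
    h['x-legion-llm-request-type'] = @options[:request_type].to_s if @options[:request_type]
    h['x-legion-llm-conversation-id'] = ctx[:conversation_id].to_s if ctx[:conversation_id]
    h['x-legion-llm-request-id'] = ctx[:request_id].to_s if ctx[:request_id]
    h
  end
end

msg = SketchMessage.new(
  provider: 'ollama', model: 'example-model', request_type: 'chat', ttl: 30,
  routing_key: 'example.key', prompt: 'hello',
  message_context: { conversation_id: 'conv_1', request_id: 'req_1' }
)
msg.message.keys # => [:request_type, :prompt, :message_context]
```

Keeping `:message_context` in the body while also copying its identifiers into headers lets consumers filter on headers without decoding the payload.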
@@ -2,6 +2,6 @@
2
2
 
3
3
  module Legion
4
4
  module LLM
5
- VERSION = '0.6.24'
5
+ VERSION = '0.6.25'
6
6
  end
7
7
  end
data/lib/legion/llm.rb CHANGED
@@ -23,6 +23,8 @@ require 'legion/llm/cache'
23
23
  require 'legion/llm/pipeline'
24
24
  require 'legion/llm/cost_estimator'
25
25
  require 'legion/llm/fleet'
26
+ require 'legion/llm/metering'
27
+ require 'legion/llm/audit'
26
28
  require_relative 'llm/provider_registry'
27
29
  require_relative 'llm/native_dispatch'
28
30
  require_relative 'llm/response_cache'
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: legion-llm
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.6.24
4
+ version: 0.6.25
5
5
  platform: ruby
6
6
  authors:
7
7
  - Esity
@@ -224,6 +224,10 @@ files:
224
224
  - legion-llm.gemspec
225
225
  - lib/legion/llm.rb
226
226
  - lib/legion/llm/arbitrage.rb
227
+ - lib/legion/llm/audit.rb
228
+ - lib/legion/llm/audit/exchange.rb
229
+ - lib/legion/llm/audit/prompt_event.rb
230
+ - lib/legion/llm/audit/tool_event.rb
227
231
  - lib/legion/llm/batch.rb
228
232
  - lib/legion/llm/bedrock_bearer_auth.rb
229
233
  - lib/legion/llm/cache.rb
@@ -245,8 +249,12 @@ files:
245
249
  - lib/legion/llm/escalation_tracker.rb
246
250
  - lib/legion/llm/fleet.rb
247
251
  - lib/legion/llm/fleet/dispatcher.rb
252
+ - lib/legion/llm/fleet/error.rb
253
+ - lib/legion/llm/fleet/exchange.rb
248
254
  - lib/legion/llm/fleet/handler.rb
249
255
  - lib/legion/llm/fleet/reply_dispatcher.rb
256
+ - lib/legion/llm/fleet/request.rb
257
+ - lib/legion/llm/fleet/response.rb
250
258
  - lib/legion/llm/helper.rb
251
259
  - lib/legion/llm/helpers/llm.rb
252
260
  - lib/legion/llm/hooks.rb
@@ -257,6 +265,9 @@ files:
257
265
  - lib/legion/llm/hooks/reciprocity.rb
258
266
  - lib/legion/llm/hooks/reflection.rb
259
267
  - lib/legion/llm/hooks/response_guard.rb
268
+ - lib/legion/llm/metering.rb
269
+ - lib/legion/llm/metering/event.rb
270
+ - lib/legion/llm/metering/exchange.rb
260
271
  - lib/legion/llm/native_dispatch.rb
261
272
  - lib/legion/llm/off_peak.rb
262
273
  - lib/legion/llm/override_confidence.rb
@@ -311,6 +322,7 @@ files:
311
322
  - lib/legion/llm/token_tracker.rb
312
323
  - lib/legion/llm/transport/exchanges/audit.rb
313
324
  - lib/legion/llm/transport/exchanges/escalation.rb
325
+ - lib/legion/llm/transport/message.rb
314
326
  - lib/legion/llm/transport/messages/audit_event.rb
315
327
  - lib/legion/llm/transport/messages/escalation_event.rb
316
328
  - lib/legion/llm/usage.rb