legion-llm 0.3.15 → 0.3.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 16ae90179fe84f5fdef3459c5463517048a71481e53c778c6be53ef8a0e4f078
-  data.tar.gz: 6bc1cdebbf9807443e057748abd11e6fb41694a68666bbc1da2dee3fa4ead10a
+  metadata.gz: 4f0427a4ddb7c7118e21cf0abd805ae05c994f78c35be322307858a9ad8d0b3c
+  data.tar.gz: b010081a1f8007df86babebd563d5fada164eeabcd573594e3aaa310317c1474
 SHA512:
-  metadata.gz: 1a5a14010e8b18f19f38d94a64ebfbc5a1d0f0ad589aa6c87d2155ee2705f1369c68e56f3697075bab2a4a43a2fc67cbf3797845f32a8214e156ca2003f64ccb
-  data.tar.gz: 6251af13334ead29cb2d53cd76137d3627d28f23fce0787816b037718611f1f9ca26bc4b48ea49c33ca0747e9a1c875c2641ae577f4e557b65f68b9d1adfe32b
+  metadata.gz: e882a711b7c9a56d03c0fc3f1753c47cf6c148c7312cb3d50302dcaeda1ca67e09bef67ed37f295480625f981b4e32b1d0a4e96736fa9a6f3ac61af8cd1b57ec
+  data.tar.gz: 59cb342d0af6c92e94caff375f557de3dac75413c19821f28d5c471f319d748cdcfff99ebd8328ea1fcf0e4dff98614cf262eba69037f5b7ce3d72f567bf2499
data/CHANGELOG.md CHANGED
@@ -1,5 +1,19 @@
 # Legion LLM Changelog
 
+## [0.3.17] - 2026-03-22
+
+### Added
+- `Legion::LLM::OffPeak` module for off-peak scheduling: `peak_hour?`, `should_defer?(priority:)`, `next_off_peak` — defers non-urgent LLM requests during configurable peak hours (default 14:00-22:00 UTC)
+- `Legion::LLM::CostTracker` module for per-request cost tracking: `record(model:, input_tokens:, output_tokens:)`, `summary(since:)` with by-model breakdown, configurable pricing table via settings, thread-safe accumulator
+
+## [0.3.16] - 2026-03-22
+
+### Fixed
+- `chat_single` now accepts and forwards the `message:` kwarg, calling `session.ask(message)` when present instead of returning a bare session object
+- `chat_direct` passes `message:` through to `chat_single` in the non-escalation branch
+- Add `FRAMEWORK_KEYS` constant to strip Runner.run metadata kwargs (`task_id`, `source`, `timestamp`, etc.) before passing to RubyLLM
+- Move `FRAMEWORK_KEYS` out of `private` scope (constants are not affected by `private` in Ruby)
+
 ## [0.3.15] - 2026-03-21
 
 ### Changed
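The 0.3.16 `chat_single` fix can be sketched in isolation. This is a minimal illustration, not the gem's actual implementation: `FakeSession` is a hypothetical stub standing in for a RubyLLM chat session, and the real method also performs routing and option filtering.

```ruby
# Sketch of the 0.3.16 behavior change: with no message, the bare session
# is returned (the only pre-fix behavior); with message:, the session is
# asked immediately and the answer is returned instead.
class FakeSession
  def ask(message)
    "answered: #{message}"
  end
end

def chat_single(message: nil, **_kwargs)
  session = FakeSession.new
  return session unless message

  session.ask(message)
end

puts chat_single.class             # bare session object
puts chat_single(message: 'ping')  # forwarded to session.ask
```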
data/CLAUDE.md CHANGED
@@ -8,7 +8,7 @@
 Core LegionIO gem providing LLM capabilities to all extensions. Wraps ruby_llm to provide a consistent interface for chat, embeddings, tool use, and agents across multiple providers (Bedrock, Anthropic, OpenAI, Gemini, Ollama). Includes a dynamic weighted routing engine that dispatches requests across local, fleet, and cloud tiers based on caller intent, priority rules, time schedules, cost multipliers, and real-time provider health.
 
 **GitHub**: https://github.com/LegionIO/legion-llm
-**Version**: 0.3.8
+**Version**: 0.3.15
 **License**: Apache-2.0
 
 ## Architecture
@@ -314,7 +314,7 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
 | `lib/legion/llm/embeddings.rb` | Embeddings module: generate, generate_batch, default_model |
 | `lib/legion/llm/shadow_eval.rb` | Shadow evaluation: enabled?, should_sample?, evaluate, compare |
 | `lib/legion/llm/structured_output.rb` | JSON schema enforcement with native response_format and prompt fallback |
-| `lib/legion/llm/version.rb` | Version constant (0.3.8) |
+| `lib/legion/llm/version.rb` | Version constant (0.3.15) |
 | `lib/legion/llm/quality_checker.rb` | QualityChecker module with QualityResult struct |
 | `lib/legion/llm/escalation_history.rb` | EscalationHistory mixin: `escalation_history`, `escalated?`, `final_resolution`, `escalation_chain` |
 | `lib/legion/llm/router/escalation_chain.rb` | EscalationChain value object |
data/CODEOWNERS CHANGED
@@ -1 +1,40 @@
+# Default owner — all files
 * @Esity
+
+# Core library code
+# lib/ @Esity @future-ai-team
+
+# Router (dynamic weighted routing, intent, escalation)
+# lib/legion/llm/router/ @Esity @future-ai-team
+# lib/legion/llm/router.rb @Esity @future-ai-team
+
+# Provider configuration
+# lib/legion/llm/providers.rb @Esity @future-ai-team
+
+# Discovery (Ollama, system memory)
+# lib/legion/llm/discovery/ @Esity @future-ai-team
+
+# Embeddings
+# lib/legion/llm/embeddings.rb @Esity @future-ai-team
+
+# Structured output and quality checking
+# lib/legion/llm/structured_output.rb @Esity @future-ai-team
+# lib/legion/llm/quality_checker.rb @Esity @future-ai-team
+
+# Compressor
+# lib/legion/llm/compressor.rb @Esity @future-ai-team
+
+# Transport (escalation events)
+# lib/legion/llm/transport/ @Esity @future-infra-team
+
+# Extension helper mixin
+# lib/legion/llm/helpers/ @Esity @future-core-team
+
+# Specs
+# spec/ @Esity @future-contributors
+
+# Documentation
+# *.md @Esity @future-docs-team
+
+# CI/CD
+# .github/ @Esity
data/README.md CHANGED
@@ -2,7 +2,7 @@
 
 LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension.
 
-**Version**: 0.3.8
+**Version**: 0.3.15
 
 ## Installation
 
data/lib/legion/llm/cost_tracker.rb ADDED
@@ -0,0 +1,95 @@
+# frozen_string_literal: true
+
+module Legion
+  module LLM
+    module CostTracker
+      # Default per-1M-token pricing in USD (input / output).
+      # Overridable via Legion::Settings[:'legion-llm'][:pricing].
+      DEFAULT_PRICING = {
+        'claude-sonnet-4-6' => { input: 3.0, output: 15.0 },
+        'claude-haiku-4-5' => { input: 0.80, output: 4.0 },
+        'claude-opus-4-6' => { input: 15.0, output: 75.0 },
+        'gpt-4o' => { input: 2.50, output: 10.0 },
+        'gpt-4o-mini' => { input: 0.15, output: 0.60 }
+      }.freeze
+
+      class << self
+        # Records a completed LLM request and calculates its cost.
+        #
+        # @param model [String] model identifier
+        # @param input_tokens [Integer] number of input tokens consumed
+        # @param output_tokens [Integer] number of output tokens produced
+        # @param provider [Symbol, nil] provider (informational)
+        # @return [Hash] the recorded entry
+        def record(model:, input_tokens:, output_tokens:, provider: nil)
+          pricing = pricing_for(model)
+          cost = (input_tokens * pricing[:input] / 1_000_000.0) +
+                 (output_tokens * pricing[:output] / 1_000_000.0)
+
+          entry = {
+            model: model,
+            provider: provider,
+            input_tokens: input_tokens,
+            output_tokens: output_tokens,
+            cost_usd: cost.round(6),
+            recorded_at: Time.now
+          }
+
+          records << entry
+          Legion::Logging.debug "[LLM::CostTracker] #{model}: #{input_tokens}+#{output_tokens} tokens = $#{cost.round(6)}"
+          entry
+        end
+
+        # Returns a cost summary, optionally filtered by a start time.
+        #
+        # @param since [Time, nil] include only records on or after this time
+        # @return [Hash] with :total_cost_usd, :total_requests, token totals, and :by_model breakdown
+        def summary(since: nil)
+          subset = since ? records.select { |r| r[:recorded_at] >= since } : records.dup
+
+          {
+            total_cost_usd: subset.sum { |r| r[:cost_usd] }.round(6),
+            total_requests: subset.size,
+            total_input_tokens: subset.sum { |r| r[:input_tokens] },
+            total_output_tokens: subset.sum { |r| r[:output_tokens] },
+            by_model: subset.group_by { |r| r[:model] }.transform_values do |rs|
+              {
+                cost_usd: rs.sum { |r| r[:cost_usd] }.round(6),
+                requests: rs.size
+              }
+            end
+          }
+        end
+
+        # Clears all recorded entries.
+        def clear
+          @records = []
+        end
+
+        # Returns pricing for a model, preferring settings-defined overrides.
+        #
+        # @param model [String] model identifier
+        # @return [Hash] with :input and :output keys (per-1M-token USD)
+        def pricing_for(model)
+          custom = settings_pricing
+          custom[model.to_s] || DEFAULT_PRICING[model.to_s] || { input: 5.0, output: 15.0 }
+        end
+
+        private
+
+        def records
+          @records ||= []
+        end
+
+        def settings_pricing
+          return {} unless defined?(Legion::Settings)
+
+          pricing = Legion::Settings.dig(:'legion-llm', :pricing)
+          pricing.is_a?(Hash) ? pricing : {}
+        rescue StandardError
+          {}
+        end
+      end
+    end
+  end
+end
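As a standalone illustration of the cost arithmetic in `record`/`pricing_for`, the sketch below mirrors two entries of `DEFAULT_PRICING` and the module's `{ input: 5.0, output: 15.0 }` fallback for unknown models; it is a self-contained approximation, not the gem's code.

```ruby
# Per-1M-token cost math, mirroring CostTracker.record above.
PRICING = {
  'gpt-4o-mini'      => { input: 0.15, output: 0.60 },
  'claude-haiku-4-5' => { input: 0.80, output: 4.0 }
}.freeze

def cost_usd(model:, input_tokens:, output_tokens:)
  # Unknown models fall back to the same default as pricing_for.
  p = PRICING.fetch(model, { input: 5.0, output: 15.0 })
  ((input_tokens * p[:input] / 1_000_000.0) +
   (output_tokens * p[:output] / 1_000_000.0)).round(6)
end

# 10k input + 2k output tokens on gpt-4o-mini:
puts cost_usd(model: 'gpt-4o-mini', input_tokens: 10_000, output_tokens: 2_000)
# => 0.0027
```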
data/lib/legion/llm/off_peak.rb ADDED
@@ -0,0 +1,44 @@
+# frozen_string_literal: true
+
+module Legion
+  module LLM
+    module OffPeak
+      # Peak hours in UTC: 14:00-22:00, end-exclusive (9 AM - 5 PM CT)
+      PEAK_HOURS = (14...22)
+
+      class << self
+        # Returns true if the given time falls within peak hours.
+        #
+        # @param time [Time] time to check (defaults to now)
+        # @return [Boolean]
+        def peak_hour?(time = Time.now.utc)
+          PEAK_HOURS.cover?(time.hour)
+        end
+
+        # Returns true when a non-urgent request should be deferred to off-peak.
+        #
+        # @param priority [Symbol] :urgent bypasses deferral; :normal and :low defer during peak
+        # @return [Boolean]
+        def should_defer?(priority: :normal)
+          return false if priority.to_sym == :urgent
+
+          peak_hour?
+        end
+
+        # Returns the next off-peak Time (UTC).
+        # If already off-peak, returns the current time.
+        # Off-peak begins when the peak window ends (22:00 UTC).
+        #
+        # @param time [Time] reference time (defaults to now)
+        # @return [Time]
+        def next_off_peak(time = Time.now.utc)
+          if time.hour < PEAK_HOURS.first || time.hour >= PEAK_HOURS.last
+            time
+          else
+            Time.utc(time.year, time.month, time.day, PEAK_HOURS.last, 0, 0)
+          end
+        end
+      end
+    end
+  end
+end
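The scheduling rules reduce to a few lines. This standalone sketch assumes the changelog's default 14:00-22:00 UTC window, treated end-exclusively so that 22:00 itself is off-peak; it is not the gem's code, just the same logic in miniature.

```ruby
PEAK = (14...22) # 14:00-22:00 UTC, end-exclusive

def peak_hour?(time)
  PEAK.cover?(time.hour)
end

# During peak, defer to the top of the hour the window ends;
# otherwise the reference time is already fine.
def next_off_peak(time)
  peak_hour?(time) ? Time.utc(time.year, time.month, time.day, PEAK.last) : time
end

noon = Time.utc(2026, 3, 22, 12)
busy = Time.utc(2026, 3, 22, 15, 30)
puts peak_hour?(noon)     # noon UTC is before the window
puts next_off_peak(busy)  # deferred to 22:00 UTC the same day
```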
data/lib/legion/llm/version.rb CHANGED
@@ -2,6 +2,6 @@
 
 module Legion
   module LLM
-    VERSION = '0.3.15'
+    VERSION = '0.3.17'
   end
 end
data/lib/legion/llm.rb CHANGED
@@ -15,6 +15,8 @@ require_relative 'llm/daemon_client'
 require_relative 'llm/arbitrage'
 require_relative 'llm/batch'
 require_relative 'llm/scheduling'
+require_relative 'llm/off_peak'
+require_relative 'llm/cost_tracker'
 
 begin
   require 'legion/extensions/llm/gateway'
@@ -124,7 +126,7 @@ module Legion
         )
       else
         chat_single(model: model, provider: provider, intent: intent, tier: tier,
-                    temperature: temperature, **kwargs)
+                    temperature: temperature, message: message, **kwargs)
       end
 
       if cache_key && result.is_a?(Hash)
@@ -185,6 +187,10 @@ module Legion
       agent_class.new(**)
     end
 
+    FRAMEWORK_KEYS = %i[request_id source timestamp datetime task_id parent_id master_id
+                        check_subtask generate_task catch_exceptions worker_id principal_id
+                        principal_type].freeze
+
     private
 
     def _dispatch_chat(model:, provider:, intent:, tier:, escalate:, max_escalations:, quality_check:, message:, **kwargs)
@@ -276,7 +282,7 @@ module Legion
       Legion::Extensions::LLM::Gateway::Runners::Inference.chat(**)
     end
 
-    def chat_single(model:, provider:, intent:, tier:, **kwargs)
+    def chat_single(model:, provider:, intent:, tier:, message: nil, **kwargs)
       if (intent || tier) && Router.routing_enabled?
         resolution = Router.resolve(intent: intent, tier: tier, model: model, provider: provider)
         if resolution
@@ -295,12 +301,15 @@ module Legion
       opts = {}
       opts[:model] = model if model
       opts[:provider] = provider if provider
-      opts.merge!(kwargs)
+      opts.merge!(kwargs.except(*FRAMEWORK_KEYS))
       opts.delete(:temperature) if opts[:temperature].nil?
 
       inject_anthropic_cache_control!(opts, provider)
 
-      RubyLLM.chat(**opts)
+      session = RubyLLM.chat(**opts)
+      return session unless message
+
+      session.ask(message)
     end
 
     def chat_with_escalation(model:, provider:, intent:, tier:, max_escalations:, quality_check:, message:, **kwargs)
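The `FRAMEWORK_KEYS` filter in `chat_single` relies on `Hash#except` (Ruby 3.0+). A standalone sketch of what survives the strip and reaches RubyLLM; the sample `kwargs` values are illustrative only.

```ruby
# Runner.run metadata kwargs are stripped; only model options survive.
# Constant copied from the diff above.
FRAMEWORK_KEYS = %i[request_id source timestamp datetime task_id parent_id master_id
                    check_subtask generate_task catch_exceptions worker_id principal_id
                    principal_type].freeze

kwargs = { task_id: 'abc-123', source: 'scheduler', temperature: 0.2, max_tokens: 512 }
opts = kwargs.except(*FRAMEWORK_KEYS) # Hash#except, Ruby 3.0+
puts opts.inspect # :task_id and :source are gone
```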
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: legion-llm
 version: !ruby/object:Gem::Version
-  version: 0.3.15
+  version: 0.3.17
 platform: ruby
 authors:
 - Esity
@@ -137,6 +137,7 @@ files:
 - lib/legion/llm/cache.rb
 - lib/legion/llm/claude_config_loader.rb
 - lib/legion/llm/compressor.rb
+- lib/legion/llm/cost_tracker.rb
 - lib/legion/llm/daemon_client.rb
 - lib/legion/llm/discovery/ollama.rb
 - lib/legion/llm/discovery/system.rb
@@ -146,6 +147,7 @@ files:
 - lib/legion/llm/hooks.rb
 - lib/legion/llm/hooks/rag_guard.rb
 - lib/legion/llm/hooks/response_guard.rb
+- lib/legion/llm/off_peak.rb
 - lib/legion/llm/providers.rb
 - lib/legion/llm/quality_checker.rb
 - lib/legion/llm/response_cache.rb