legion-llm 0.3.15 → 0.3.17 (RubyGems)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

checksums.yaml +4 -4
data/CHANGELOG.md +14 -0
data/CLAUDE.md +2 -2
data/CODEOWNERS +39 -0
data/README.md +1 -1
data/lib/legion/llm/cost_tracker.rb +95 -0
data/lib/legion/llm/off_peak.rb +44 -0
data/lib/legion/llm/version.rb +1 -1
data/lib/legion/llm.rb +13 -4
metadata +3 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 16ae90179fe84f5fdef3459c5463517048a71481e53c778c6be53ef8a0e4f078
-  data.tar.gz: 6bc1cdebbf9807443e057748abd11e6fb41694a68666bbc1da2dee3fa4ead10a
+  metadata.gz: 4f0427a4ddb7c7118e21cf0abd805ae05c994f78c35be322307858a9ad8d0b3c
+  data.tar.gz: b010081a1f8007df86babebd563d5fada164eeabcd573594e3aaa310317c1474
 SHA512:
-  metadata.gz: 1a5a14010e8b18f19f38d94a64ebfbc5a1d0f0ad589aa6c87d2155ee2705f1369c68e56f3697075bab2a4a43a2fc67cbf3797845f32a8214e156ca2003f64ccb
-  data.tar.gz: 6251af13334ead29cb2d53cd76137d3627d28f23fce0787816b037718611f1f9ca26bc4b48ea49c33ca0747e9a1c875c2641ae577f4e557b65f68b9d1adfe32b
+  metadata.gz: e882a711b7c9a56d03c0fc3f1753c47cf6c148c7312cb3d50302dcaeda1ca67e09bef67ed37f295480625f981b4e32b1d0a4e96736fa9a6f3ac61af8cd1b57ec
+  data.tar.gz: 59cb342d0af6c92e94caff375f557de3dac75413c19821f28d5c471f319d748cdcfff99ebd8328ea1fcf0e4dff98614cf262eba69037f5b7ce3d72f567bf2499

data/CHANGELOG.md CHANGED

@@ -1,5 +1,19 @@
 # Legion LLM Changelog
+## [0.3.17] - 2026-03-22
+### Added
+- `Legion::LLM::OffPeak` module for off-peak scheduling: `peak_hour?`, `should_defer?(priority:)`, `next_off_peak` — defers non-urgent LLM requests during configurable peak hours (default 14:00-22:00 UTC)
+- `Legion::LLM::CostTracker` module for per-request cost tracking: `record(model:, input_tokens:, output_tokens:)`, `summary(since:)` with by-model breakdown, configurable pricing table via settings, thread-safe accumulator
+## [0.3.16] - 2026-03-22
+### Fixed
+- `chat_single` now accepts and forwards `message:` kwarg, calling `session.ask(message)` when present instead of returning a bare session object
+- `chat_direct` passes `message:` through to `chat_single` in the non-escalation branch
+- Add `FRAMEWORK_KEYS` constant to strip Runner.run metadata kwargs (`task_id`, `source`, `timestamp`, etc.) before passing to RubyLLM
+- Move `FRAMEWORK_KEYS` out of `private` scope (constants are not affected by `private` in Ruby)
 ## [0.3.15] - 2026-03-21
 ### Changed
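
The 0.3.16 entry about moving `FRAMEWORK_KEYS` rests on a general Ruby rule: `private` changes method visibility only, so a constant defined below it is still reachable from outside. A minimal sketch of that rule, using a hypothetical `Demo` module that is not part of legion-llm:

```ruby
# Hypothetical module for illustration only; not legion-llm code.
module Demo
  private

  # `private` affects methods, not constants, so KEYS stays publicly readable.
  KEYS = %i[task_id source].freeze

  HIDDEN = :secret
  private_constant :HIDDEN # hiding a constant requires private_constant
end

keys_visible = Demo::KEYS == %i[task_id source]

hidden_raises = begin
  Demo::HIDDEN
  false
rescue NameError
  # Referencing a private constant from outside raises NameError.
  true
end
```

Placing the constant above `private`, as the 0.3.16 change does, has no effect on behavior; it only avoids implying that the constant is hidden.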

data/CLAUDE.md CHANGED

@@ -8,7 +8,7 @@
 Core LegionIO gem providing LLM capabilities to all extensions. Wraps ruby_llm to provide a consistent interface for chat, embeddings, tool use, and agents across multiple providers (Bedrock, Anthropic, OpenAI, Gemini, Ollama). Includes a dynamic weighted routing engine that dispatches requests across local, fleet, and cloud tiers based on caller intent, priority rules, time schedules, cost multipliers, and real-time provider health.
 **GitHub**: https://github.com/LegionIO/legion-llm
-**Version**: 0.3.8
+**Version**: 0.3.15
 **License**: Apache-2.0
 ## Architecture
@@ -314,7 +314,7 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
 | `lib/legion/llm/embeddings.rb` | Embeddings module: generate, generate_batch, default_model |
 | `lib/legion/llm/shadow_eval.rb` | Shadow evaluation: enabled?, should_sample?, evaluate, compare |
 | `lib/legion/llm/structured_output.rb` | JSON schema enforcement with native response_format and prompt fallback |
-| `lib/legion/llm/version.rb` | Version constant (0.3.8) |
+| `lib/legion/llm/version.rb` | Version constant (0.3.15) |
 | `lib/legion/llm/quality_checker.rb` | QualityChecker module with QualityResult struct |
 | `lib/legion/llm/escalation_history.rb` | EscalationHistory mixin: `escalation_history`, `escalated?`, `final_resolution`, `escalation_chain` |
 | `lib/legion/llm/router/escalation_chain.rb` | EscalationChain value object |

data/CODEOWNERS CHANGED

@@ -1 +1,40 @@
+# Default owner — all files
 * @Esity
+# Core library code
+# lib/ @Esity @future-ai-team
+# Router (dynamic weighted routing, intent, escalation)
+# lib/legion/llm/router/ @Esity @future-ai-team
+# lib/legion/llm/router.rb @Esity @future-ai-team
+# Provider configuration
+# lib/legion/llm/providers.rb @Esity @future-ai-team
+# Discovery (Ollama, system memory)
+# lib/legion/llm/discovery/ @Esity @future-ai-team
+# Embeddings
+# lib/legion/llm/embeddings.rb @Esity @future-ai-team
+# Structured output and quality checking
+# lib/legion/llm/structured_output.rb @Esity @future-ai-team
+# lib/legion/llm/quality_checker.rb @Esity @future-ai-team
+# Compressor
+# lib/legion/llm/compressor.rb @Esity @future-ai-team
+# Transport (escalation events)
+# lib/legion/llm/transport/ @Esity @future-infra-team
+# Extension helper mixin
+# lib/legion/llm/helpers/ @Esity @future-core-team
+# Specs
+# spec/ @Esity @future-contributors
+# Documentation
+# *.md @Esity @future-docs-team
+# CI/CD
+# .github/ @Esity

data/README.md CHANGED

@@ -2,7 +2,7 @@
 LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension.
-**Version**: 0.3.8
+**Version**: 0.3.15
 ## Installation

data/lib/legion/llm/cost_tracker.rb ADDED

@@ -0,0 +1,95 @@
+# frozen_string_literal: true
+module Legion
+  module LLM
+    module CostTracker
+      # Default per-1M-token pricing in USD (input / output).
+      # Overridable via Legion::Settings[:llm][:pricing].
+      DEFAULT_PRICING = {
+        'claude-sonnet-4-6' => { input: 3.0,  output: 15.0 },
+        'claude-haiku-4-5'  => { input: 0.80, output: 4.0  },
+        'claude-opus-4-6'   => { input: 15.0, output: 75.0 },
+        'gpt-4o'            => { input: 2.50, output: 10.0 },
+        'gpt-4o-mini'       => { input: 0.15, output: 0.60 }
+      }.freeze
+      class << self
+        # Records a completed LLM request and calculates its cost.
+        #
+        # @param model         [String]       model identifier
+        # @param input_tokens  [Integer]      number of input tokens consumed
+        # @param output_tokens [Integer]      number of output tokens produced
+        # @param provider      [Symbol, nil]  provider (informational)
+        # @return [Hash] the recorded entry
+        def record(model:, input_tokens:, output_tokens:, provider: nil)
+          pricing = pricing_for(model)
+          cost    = (input_tokens * pricing[:input] / 1_000_000.0) +
+                    (output_tokens * pricing[:output] / 1_000_000.0)
+          entry = {
+            model:         model,
+            provider:      provider,
+            input_tokens:  input_tokens,
+            output_tokens: output_tokens,
+            cost_usd:      cost.round(6),
+            recorded_at:   Time.now
+          }
+          records << entry
+          Legion::Logging.debug "[LLM::CostTracker] #{model}: #{input_tokens}+#{output_tokens} tokens = $#{cost.round(6)}"
+          entry
+        end
+        # Returns a cost summary, optionally filtered by a start time.
+        #
+        # @param since [Time, nil] include only records on or after this time
+        # @return [Hash] with :total_cost_usd, :total_requests, token totals, and :by_model breakdown
+        def summary(since: nil)
+          subset = since ? records.select { |r| r[:recorded_at] >= since } : records.dup
+          {
+            total_cost_usd:      subset.sum { |r| r[:cost_usd] }.round(6),
+            total_requests:      subset.size,
+            total_input_tokens:  subset.sum { |r| r[:input_tokens] },
+            total_output_tokens: subset.sum { |r| r[:output_tokens] },
+            by_model:            subset.group_by { |r| r[:model] }.transform_values do |rs|
+              {
+                cost_usd: rs.sum { |r| r[:cost_usd] }.round(6),
+                requests: rs.size
+              }
+            end
+          }
+        end
+        # Clears all recorded entries.
+        def clear
+          @records = []
+        end
+        # Returns pricing for a model, preferring settings-defined overrides.
+        #
+        # @param model [String] model identifier
+        # @return [Hash] with :input and :output keys (per-1M-token USD)
+        def pricing_for(model)
+          custom = settings_pricing
+          custom[model.to_s] || DEFAULT_PRICING[model.to_s] || { input: 5.0, output: 15.0 }
+        end
+        private
+        def records
+          @records ||= []
+        end
+        def settings_pricing
+          return {} unless defined?(Legion::Settings)
+          pricing = Legion::Settings.dig(:'legion-llm', :pricing)
+          pricing.is_a?(Hash) ? pricing : {}
+        rescue StandardError
+          {}
+        end
+      end
+    end
+  end
+end
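
The cost formula in `record` is per-1M-token pricing applied separately to input and output tokens. The changelog describes the accumulator as thread-safe, but no explicit lock is visible in this diff (`records` is a plain Array). The standalone sketch below reproduces the arithmetic and shows one way a lock could be added. `SketchTracker`, `PRICING`, and `FALLBACK` are hypothetical names, and the `Mutex` is an assumption rather than something the gem is shown to do.

```ruby
# Standalone sketch; SketchTracker is a hypothetical name, not the gem's API.
class SketchTracker
  # Per-1M-token USD prices, copied from two DEFAULT_PRICING rows in the diff.
  PRICING = {
    'claude-sonnet-4-6' => { input: 3.0, output: 15.0 },
    'gpt-4o-mini'       => { input: 0.15, output: 0.60 }
  }.freeze

  # Same fallback values that pricing_for uses for unknown models.
  FALLBACK = { input: 5.0, output: 15.0 }.freeze

  def initialize
    @records = []
    @lock = Mutex.new # assumption: explicit lock around the shared Array
  end

  def record(model:, input_tokens:, output_tokens:)
    price = PRICING.fetch(model.to_s, FALLBACK)
    cost  = (input_tokens * price[:input] / 1_000_000.0) +
            (output_tokens * price[:output] / 1_000_000.0)
    entry = { model: model, cost_usd: cost.round(6) }
    @lock.synchronize { @records << entry }
    entry
  end

  def total_requests
    @lock.synchronize { @records.size }
  end

  def total_cost_usd
    @lock.synchronize { @records.sum { |r| r[:cost_usd] } }.round(6)
  end
end

tracker = SketchTracker.new

# 1_200 * 3.0 / 1e6 + 350 * 15.0 / 1e6 = 0.0036 + 0.00525 = 0.00885 USD
entry = tracker.record(model: 'claude-sonnet-4-6', input_tokens: 1_200, output_tokens: 350)

# Concurrent writers; an unknown model falls back to 5.0 / 15.0 per 1M tokens.
threads = 4.times.map do
  Thread.new do
    50.times { tracker.record(model: 'unknown-model', input_tokens: 1_000, output_tokens: 1_000) }
  end
end
threads.each(&:join)
```

Each fallback request costs 0.005 + 0.015 = 0.02 USD, so the 200 concurrent records plus the first entry should total 4.00885 USD across 201 requests.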

data/lib/legion/llm/off_peak.rb ADDED

@@ -0,0 +1,44 @@
+# frozen_string_literal: true
+module Legion
+  module LLM
+    module OffPeak
+      # Peak hours in UTC: 14:00-22:00 (9 AM - 5 PM CT)
+      PEAK_HOURS = (14..22)
+      class << self
+        # Returns true if the given time falls within peak hours.
+        #
+        # @param time [Time] time to check (defaults to now)
+        # @return [Boolean]
+        def peak_hour?(time = Time.now.utc)
+          PEAK_HOURS.cover?(time.hour)
+        end
+        # Returns true when a non-urgent request should be deferred to off-peak.
+        #
+        # @param priority [Symbol] :urgent bypasses deferral; :normal and :low defer during peak
+        # @return [Boolean]
+        def should_defer?(priority: :normal)
+          return false if priority.to_sym == :urgent
+          peak_hour?
+        end
+        # Returns the next off-peak Time (UTC).
+        # If already off-peak, returns the current time.
+        # Off-peak begins at the hour after the peak window ends (23:00 UTC).
+        #
+        # @param time [Time] reference time (defaults to now)
+        # @return [Time]
+        def next_off_peak(time = Time.now.utc)
+          if time.hour < PEAK_HOURS.first || time.hour >= PEAK_HOURS.last
+            time
+          else
+            Time.utc(time.year, time.month, time.day, PEAK_HOURS.last, 0, 0)
+          end
+        end
+      end
+    end
+  end
+end
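
The two predicates in this file treat the 22:00 hour differently. `PEAK_HOURS.cover?(time.hour)` uses an inclusive range, so 22:30 UTC counts as peak, while the `time.hour >= PEAK_HOURS.last` branch in `next_off_peak` treats the same instant as already off-peak. The standalone sketch below mirrors both checks to show the boundary. `OffPeakSketch` and the exclusive range are hypothetical: the exclusive range is one possible way to match the documented 14:00-22:00 window, not something the gem is shown to do.

```ruby
# Standalone sketch mirroring the predicates in the diff; not the gem itself.
module OffPeakSketch
  INCLUSIVE_PEAK = (14..22)  # range as shipped: hour 22 is covered
  EXCLUSIVE_PEAK = (14...22) # assumption: ends peak at 22:00 sharp

  module_function

  # Same check as peak_hour? in the diff.
  def peak?(range, time)
    range.cover?(time.hour)
  end

  # Same branch condition as next_off_peak in the diff.
  def already_off_peak?(range, time)
    time.hour < range.first || time.hour >= range.last
  end
end

t = Time.utc(2026, 3, 22, 22, 30, 0)

shipped_peak     = OffPeakSketch.peak?(OffPeakSketch::INCLUSIVE_PEAK, t)
shipped_off_peak = OffPeakSketch.already_off_peak?(OffPeakSketch::INCLUSIVE_PEAK, t)

exclusive_peak     = OffPeakSketch.peak?(OffPeakSketch::EXCLUSIVE_PEAK, t)
exclusive_off_peak = OffPeakSketch.already_off_peak?(OffPeakSketch::EXCLUSIVE_PEAK, t)
```

With the inclusive range both `shipped_peak` and `shipped_off_peak` are true at 22:30, so `should_defer?` would defer a request that `next_off_peak` says can run immediately. With the exclusive range the two checks agree.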

data/lib/legion/llm/version.rb CHANGED

@@ -2,6 +2,6 @@
 module Legion
   module LLM
-    VERSION = '0.3.15'
+    VERSION = '0.3.17'
   end
 end

data/lib/legion/llm.rb CHANGED

@@ -15,6 +15,8 @@ require_relative 'llm/daemon_client'
 require_relative 'llm/arbitrage'
 require_relative 'llm/batch'
 require_relative 'llm/scheduling'
+require_relative 'llm/off_peak'
+require_relative 'llm/cost_tracker'
 begin
   require 'legion/extensions/llm/gateway'
@@ -124,7 +126,7 @@ module Legion
                    )
                  else
                    chat_single(model: model, provider: provider, intent: intent, tier: tier,
-                               temperature: temperature, **kwargs)
+                               temperature: temperature, message: message, **kwargs)
                  end
         if cache_key && result.is_a?(Hash)
@@ -185,6 +187,10 @@ module Legion
         agent_class.new(**)
       end
+      FRAMEWORK_KEYS = %i[request_id source timestamp datetime task_id parent_id master_id
+                          check_subtask generate_task catch_exceptions worker_id principal_id
+                          principal_type].freeze
       private
       def _dispatch_chat(model:, provider:, intent:, tier:, escalate:, max_escalations:, quality_check:, message:, **kwargs)
@@ -276,7 +282,7 @@ module Legion
         Legion::Extensions::LLM::Gateway::Runners::Inference.chat(**)
       end
-      def chat_single(model:, provider:, intent:, tier:, **kwargs)
+      def chat_single(model:, provider:, intent:, tier:, message: nil, **kwargs)
         if (intent || tier) && Router.routing_enabled?
           resolution = Router.resolve(intent: intent, tier: tier, model: model, provider: provider)
           if resolution
@@ -295,12 +301,15 @@ module Legion
         opts = {}
         opts[:model]    = model    if model
         opts[:provider] = provider if provider
-        opts.merge!(kwargs)
+        opts.merge!(kwargs.except(*FRAMEWORK_KEYS))
         opts.delete(:temperature) if opts[:temperature].nil?
         inject_anthropic_cache_control!(opts, provider)
-        RubyLLM.chat(**opts)
+        session = RubyLLM.chat(**opts)
+        return session unless message
+        session.ask(message)
       end
       def chat_with_escalation(model:, provider:, intent:, tier:, max_escalations:, quality_check:, message:, **kwargs)
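
The `chat_single` change combines two behaviors: framework metadata is stripped from the options with `kwargs.except(*FRAMEWORK_KEYS)`, and the method returns the session when no `message:` is given but the result of `session.ask(message)` otherwise. The standalone sketch below illustrates that control flow. `ChatSketch` and `FakeSession` are hypothetical stand-ins (no RubyLLM call is made), and `FRAMEWORK_KEYS` here is a shortened subset of the list in the diff.

```ruby
# Standalone sketch of the control flow; ChatSketch and FakeSession are hypothetical.
module ChatSketch
  # Shortened subset of the FRAMEWORK_KEYS list shown in the diff.
  FRAMEWORK_KEYS = %i[task_id source timestamp].freeze

  # Stand-in for a chat session object; records the options it was built with.
  FakeSession = Struct.new(:opts) do
    def ask(message)
      "reply to: #{message}"
    end
  end

  module_function

  def chat_single(model: nil, message: nil, **kwargs)
    opts = {}
    opts[:model] = model if model
    # Hash#except is core Ruby from 3.0 onward.
    opts.merge!(kwargs.except(*FRAMEWORK_KEYS))
    session = FakeSession.new(opts)
    return session unless message

    session.ask(message)
  end
end

# No message: the caller gets the session, with task_id stripped from its options.
session = ChatSketch.chat_single(model: 'gpt-4o', task_id: 42, temperature: 0.2)

# With a message: the caller gets the reply instead of the session.
reply = ChatSketch.chat_single(model: 'gpt-4o', message: 'hello', source: 'runner')
```

Keeping `message:` optional with a `nil` default preserves the earlier behavior for callers that expect a session object back.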

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: legion-llm
 version: !ruby/object:Gem::Version
-  version: 0.3.15
+  version: 0.3.17
 platform: ruby
 authors:
 - Esity
@@ -137,6 +137,7 @@ files:
 - lib/legion/llm/cache.rb
 - lib/legion/llm/claude_config_loader.rb
 - lib/legion/llm/compressor.rb
+- lib/legion/llm/cost_tracker.rb
 - lib/legion/llm/daemon_client.rb
 - lib/legion/llm/discovery/ollama.rb
 - lib/legion/llm/discovery/system.rb
@@ -146,6 +147,7 @@ files:
 - lib/legion/llm/hooks.rb
 - lib/legion/llm/hooks/rag_guard.rb
 - lib/legion/llm/hooks/response_guard.rb
+- lib/legion/llm/off_peak.rb
 - lib/legion/llm/providers.rb
 - lib/legion/llm/quality_checker.rb
 - lib/legion/llm/response_cache.rb