RubyGems - legion-llm - Versions diffs - 0.3.15 → 0.3.18 - Mend

legion-llm 0.3.15 → 0.3.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

checksums.yaml +4 -4
data/CHANGELOG.md +36 -0
data/CLAUDE.md +2 -2
data/CODEOWNERS +39 -0
data/README.md +1 -1
data/lib/legion/llm/arbitrage.rb +3 -1
data/lib/legion/llm/cache.rb +9 -3
data/lib/legion/llm/compressor.rb +2 -0
data/lib/legion/llm/cost_tracker.rb +95 -0
data/lib/legion/llm/daemon_client.rb +4 -0
data/lib/legion/llm/discovery/ollama.rb +4 -1
data/lib/legion/llm/discovery/system.rb +8 -4
data/lib/legion/llm/off_peak.rb +46 -0
data/lib/legion/llm/response_cache.rb +3 -0
data/lib/legion/llm/router/health_tracker.rb +13 -2
data/lib/legion/llm/router/rule.rb +32 -9
data/lib/legion/llm/router.rb +18 -2
data/lib/legion/llm/scheduling.rb +3 -1
data/lib/legion/llm/shadow_eval.rb +2 -0
data/lib/legion/llm/structured_output.rb +5 -2
data/lib/legion/llm/version.rb +1 -1
data/lib/legion/llm.rb +13 -4
metadata +3 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 16ae90179fe84f5fdef3459c5463517048a71481e53c778c6be53ef8a0e4f078
-  data.tar.gz: 6bc1cdebbf9807443e057748abd11e6fb41694a68666bbc1da2dee3fa4ead10a
+  metadata.gz: 0a19ae18f6bbb96680e6bfbcad1e81bd3e67c6b1be72e0615cf24852a69b2e59
+  data.tar.gz: f293c1bc52cb97652e545efb4877b32592d8aa0d96d190bef4f2624d6b277a5a
 SHA512:
-  metadata.gz: 1a5a14010e8b18f19f38d94a64ebfbc5a1d0f0ad589aa6c87d2155ee2705f1369c68e56f3697075bab2a4a43a2fc67cbf3797845f32a8214e156ca2003f64ccb
-  data.tar.gz: 6251af13334ead29cb2d53cd76137d3627d28f23fce0787816b037718611f1f9ca26bc4b48ea49c33ca0747e9a1c875c2641ae577f4e557b65f68b9d1adfe32b
+  metadata.gz: 488d6fec5178b75b4f48b9031cd3e172b9637d1044ca210912ff1bc0a5b91818d2fd379725426e8be58bfc30638f0cd6ef83a2f550e94e728f40b74b1a39a5ad
+  data.tar.gz: b738f793fc22c4dbf3400da0fb9c13d2bb427321fa1f1d8a594482a35239644d2e4d0fc96e9e08c295c95a2e3ff388e9f45b1cddf3e7db0f1bd00a4139c257ed
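
The four digests above cover the two archives packed inside the `.gem` file (`metadata.gz` and `data.tar.gz`). A minimal sketch of how such digests could be recomputed locally, assuming the archives have already been extracted; the `digests_for` helper is hypothetical and not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper: returns hex digests for a blob of bytes using the
# same two algorithms that checksums.yaml records.
def digests_for(bytes)
  {
    'SHA256' => Digest::SHA256.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Demonstration on an in-memory string. For a real check, pass
# File.binread('data.tar.gz') and compare with the values in checksums.yaml.
sums = digests_for('example bytes')
puts sums['SHA256'].length # 64 hex characters
puts sums['SHA512'].length # 128 hex characters
```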

data/CHANGELOG.md CHANGED

@@ -1,5 +1,41 @@
 # Legion LLM Changelog
+## [0.3.18] - 2026-03-22
+### Added
+- Logging across routing, health tracking, caching, and discovery subsystems
+- `Router.resolve`: `.info` on route decision (tier/provider/model/rule), `.debug` on candidate filtering counts, `.debug` when no rules match
+- `Router::HealthTracker`: `.warn` on circuit state transitions (closed->open, half_open->open, open->half_open, any->closed), `.debug` on latency penalty applied
+- `Router::Rule`: `.debug` on intent mismatch, schedule constraint rejections (valid_from, valid_until, hours, days)
+- `Cache`: `.debug` on cache miss and cache write, `.warn` on swallowed get/set errors
+- `ResponseCache`: `.warn` on spool overflow to disk, `.debug` on async poll status, `.warn` on fail_request
+- `DaemonClient`: `.warn` on mark_unhealthy, `.warn` on 403/429 responses, `.info` on health check result
+- `StructuredOutput`: `.warn` on JSON parse failure with attempt count, `.debug` when using prompt-based fallback
+- `Compressor`: `.debug` on compression applied (level, original length, compressed length)
+- `Discovery::Ollama`: `.warn` on HTTP failure, `.debug` on model list refresh with count
+- `Discovery::System`: `.warn` on system command failures (sysctl, vm_stat, /proc/meminfo)
+- `ShadowEval`: `.debug` on evaluation triggered, `.warn` on failure
+- `Scheduling`: `.debug` on defer decision
+- `OffPeak`: `.debug` on peak hour check result
+- `Arbitrage`: `.debug` on model selection result
+### Changed
+- `Router::Rule#within_schedule?` refactored to extract `schedule_rejection` helper (reduces cyclomatic complexity)
+## [0.3.17] - 2026-03-22
+### Added
+- `Legion::LLM::OffPeak` module for off-peak scheduling: `peak_hour?`, `should_defer?(priority:)`, `next_off_peak` — defers non-urgent LLM requests during configurable peak hours (default 14:00-22:00 UTC)
+- `Legion::LLM::CostTracker` module for per-request cost tracking: `record(model:, input_tokens:, output_tokens:)`, `summary(since:)` with by-model breakdown, configurable pricing table via settings, thread-safe accumulator
+## [0.3.16] - 2026-03-22
+### Fixed
+- `chat_single` now accepts and forwards `message:` kwarg, calling `session.ask(message)` when present instead of returning a bare session object
+- `chat_direct` passes `message:` through to `chat_single` in the non-escalation branch
+- Add `FRAMEWORK_KEYS` constant to strip Runner.run metadata kwargs (`task_id`, `source`, `timestamp`, etc.) before passing to RubyLLM
+- Move `FRAMEWORK_KEYS` out of `private` scope (constants are not affected by `private` in Ruby)
 ## [0.3.15] - 2026-03-21
 ### Changed

data/CLAUDE.md CHANGED

@@ -8,7 +8,7 @@
 Core LegionIO gem providing LLM capabilities to all extensions. Wraps ruby_llm to provide a consistent interface for chat, embeddings, tool use, and agents across multiple providers (Bedrock, Anthropic, OpenAI, Gemini, Ollama). Includes a dynamic weighted routing engine that dispatches requests across local, fleet, and cloud tiers based on caller intent, priority rules, time schedules, cost multipliers, and real-time provider health.
 **GitHub**: https://github.com/LegionIO/legion-llm
-**Version**: 0.3.8
+**Version**: 0.3.15
 **License**: Apache-2.0
 ## Architecture
@@ -314,7 +314,7 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
 | `lib/legion/llm/embeddings.rb` | Embeddings module: generate, generate_batch, default_model |
 | `lib/legion/llm/shadow_eval.rb` | Shadow evaluation: enabled?, should_sample?, evaluate, compare |
 | `lib/legion/llm/structured_output.rb` | JSON schema enforcement with native response_format and prompt fallback |
-| `lib/legion/llm/version.rb` | Version constant (0.3.8) |
+| `lib/legion/llm/version.rb` | Version constant (0.3.15) |
 | `lib/legion/llm/quality_checker.rb` | QualityChecker module with QualityResult struct |
 | `lib/legion/llm/escalation_history.rb` | EscalationHistory mixin: `escalation_history`, `escalated?`, `final_resolution`, `escalation_chain` |
 | `lib/legion/llm/router/escalation_chain.rb` | EscalationChain value object |

data/CODEOWNERS CHANGED

@@ -1 +1,40 @@
+# Default owner — all files
 * @Esity
+# Core library code
+# lib/ @Esity @future-ai-team
+# Router (dynamic weighted routing, intent, escalation)
+# lib/legion/llm/router/ @Esity @future-ai-team
+# lib/legion/llm/router.rb @Esity @future-ai-team
+# Provider configuration
+# lib/legion/llm/providers.rb @Esity @future-ai-team
+# Discovery (Ollama, system memory)
+# lib/legion/llm/discovery/ @Esity @future-ai-team
+# Embeddings
+# lib/legion/llm/embeddings.rb @Esity @future-ai-team
+# Structured output and quality checking
+# lib/legion/llm/structured_output.rb @Esity @future-ai-team
+# lib/legion/llm/quality_checker.rb @Esity @future-ai-team
+# Compressor
+# lib/legion/llm/compressor.rb @Esity @future-ai-team
+# Transport (escalation events)
+# lib/legion/llm/transport/ @Esity @future-infra-team
+# Extension helper mixin
+# lib/legion/llm/helpers/ @Esity @future-core-team
+# Specs
+# spec/ @Esity @future-contributors
+# Documentation
+# *.md @Esity @future-docs-team
+# CI/CD
+# .github/ @Esity

data/README.md CHANGED

@@ -2,7 +2,7 @@
 LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension.
-**Version**: 0.3.8
+**Version**: 0.3.15
 ## Installation

data/lib/legion/llm/arbitrage.rb CHANGED

@@ -56,7 +56,9 @@ module Legion
           return nil if scored.empty?
-          scored.min_by { |_model, cost| cost }&.first
+          selected = scored.min_by { |_model, cost| cost }&.first
+          Legion::Logging.debug("Arbitrage selected model=#{selected} capability=#{capability}") if defined?(Legion::Logging)
+          selected
         end
         # Returns the merged cost table: defaults overridden by any settings-defined entries.
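
The hunk above keeps the `min_by` selection but captures the result so it can be logged before being returned. A standalone sketch of the same shape, assuming a hypothetical `cheapest_model` function and a plain hash of model costs; the `defined?` guard makes the log call a no-op when `Legion::Logging` is not loaded.

```ruby
# Hypothetical stand-in: `scored` maps model name to cost.
def cheapest_model(scored)
  return nil if scored.empty?

  selected = scored.min_by { |_model, cost| cost }&.first
  # Same guard as in the diff: log only when the logging module is loaded.
  Legion::Logging.debug("Arbitrage selected model=#{selected}") if defined?(Legion::Logging)
  selected
end

scored = { 'model-a' => 0.9, 'model-b' => 0.2, 'model-c' => 0.5 }
puts cheapest_model(scored)     # model-b
puts cheapest_model({}).inspect # nil
```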

data/lib/legion/llm/cache.rb CHANGED

@@ -27,10 +27,14 @@ module Legion
         return nil unless available?
         raw = Legion::Cache.get(cache_key)
-        return nil if raw.nil?
+        if raw.nil?
+          Legion::Logging.debug("LLM cache miss key=#{cache_key}") if defined?(Legion::Logging)
+          return nil
+        end
         ::JSON.parse(raw, symbolize_names: true)
-      rescue StandardError
+      rescue StandardError => e
+        Legion::Logging.warn("LLM cache get error key=#{cache_key}: #{e.message}") if defined?(Legion::Logging)
         nil
       end
@@ -39,8 +43,10 @@ module Legion
         return false unless available?
         Legion::Cache.set(cache_key, ::JSON.dump(response), ttl)
+        Legion::Logging.debug("LLM cache write key=#{cache_key} ttl=#{ttl}") if defined?(Legion::Logging)
         true
-      rescue StandardError
+      rescue StandardError => e
+        Legion::Logging.warn("LLM cache set error key=#{cache_key}: #{e.message}") if defined?(Legion::Logging)
         false
       end
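
The cache hunks above separate two outcomes that previously looked the same to a caller: a plain miss (debug) and a swallowed error (warn). A minimal sketch of that distinction, assuming a hypothetical in-memory `STORE` in place of `Legion::Cache` and Ruby's built-in `warn` in place of `Legion::Logging.warn`.

```ruby
require 'json'

# Hypothetical in-memory store standing in for Legion::Cache.
STORE = {}

def cache_get(key)
  raw = STORE[key]
  return nil if raw.nil? # plain cache miss, nothing went wrong

  JSON.parse(raw, symbolize_names: true)
rescue StandardError => e
  # The error is still swallowed, but its message is no longer lost.
  warn("cache get error key=#{key}: #{e.message}")
  nil
end

STORE['good'] = JSON.dump({ answer: 42 })
STORE['bad']  = '{not json'

puts cache_get('good').inspect
puts cache_get('bad').inspect     # nil, with a warning on stderr
puts cache_get('missing').inspect # nil, silently
```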

data/lib/legion/llm/compressor.rb CHANGED

@@ -19,10 +19,12 @@ module Legion
         def compress(text, level: LIGHT)
           return text if text.nil? || text.empty? || level <= NONE
+          original_length = text.length
           segments = split_segments(text)
           result = segments.map { |seg| seg[:protected] ? seg[:text] : compress_prose(seg[:text], level) }.join
           result = collapse_whitespace(result) if level >= AGGRESSIVE
+          Legion::Logging.debug("Compressor applied level=#{level} original=#{original_length} compressed=#{result.length}") if defined?(Legion::Logging)
           result
         end

data/lib/legion/llm/cost_tracker.rb ADDED

@@ -0,0 +1,95 @@
+# frozen_string_literal: true
+module Legion
+  module LLM
+    module CostTracker
+      # Default per-1M-token pricing in USD (input / output).
+      # Overridable via Legion::Settings[:llm][:pricing].
+      DEFAULT_PRICING = {
+        'claude-sonnet-4-6' => { input: 3.0,  output: 15.0 },
+        'claude-haiku-4-5'  => { input: 0.80, output: 4.0  },
+        'claude-opus-4-6'   => { input: 15.0, output: 75.0 },
+        'gpt-4o'            => { input: 2.50, output: 10.0 },
+        'gpt-4o-mini'       => { input: 0.15, output: 0.60 }
+      }.freeze
+      class << self
+        # Records a completed LLM request and calculates its cost.
+        #
+        # @param model         [String]       model identifier
+        # @param input_tokens  [Integer]      number of input tokens consumed
+        # @param output_tokens [Integer]      number of output tokens produced
+        # @param provider      [Symbol, nil]  provider (informational)
+        # @return [Hash] the recorded entry
+        def record(model:, input_tokens:, output_tokens:, provider: nil)
+          pricing = pricing_for(model)
+          cost    = (input_tokens * pricing[:input] / 1_000_000.0) +
+                    (output_tokens * pricing[:output] / 1_000_000.0)
+          entry = {
+            model:         model,
+            provider:      provider,
+            input_tokens:  input_tokens,
+            output_tokens: output_tokens,
+            cost_usd:      cost.round(6),
+            recorded_at:   Time.now
+          }
+          records << entry
+          Legion::Logging.debug "[LLM::CostTracker] #{model}: #{input_tokens}+#{output_tokens} tokens = $#{cost.round(6)}"
+          entry
+        end
+        # Returns a cost summary, optionally filtered by a start time.
+        #
+        # @param since [Time, nil] include only records on or after this time
+        # @return [Hash] with :total_cost_usd, :total_requests, token totals, and :by_model breakdown
+        def summary(since: nil)
+          subset = since ? records.select { |r| r[:recorded_at] >= since } : records.dup
+          {
+            total_cost_usd:      subset.sum { |r| r[:cost_usd] }.round(6),
+            total_requests:      subset.size,
+            total_input_tokens:  subset.sum { |r| r[:input_tokens] },
+            total_output_tokens: subset.sum { |r| r[:output_tokens] },
+            by_model:            subset.group_by { |r| r[:model] }.transform_values do |rs|
+              {
+                cost_usd: rs.sum { |r| r[:cost_usd] }.round(6),
+                requests: rs.size
+              }
+            end
+          }
+        end
+        # Clears all recorded entries.
+        def clear
+          @records = []
+        end
+        # Returns pricing for a model, preferring settings-defined overrides.
+        #
+        # @param model [String] model identifier
+        # @return [Hash] with :input and :output keys (per-1M-token USD)
+        def pricing_for(model)
+          custom = settings_pricing
+          custom[model.to_s] || DEFAULT_PRICING[model.to_s] || { input: 5.0, output: 15.0 }
+        end
+        private
+        def records
+          @records ||= []
+        end
+        def settings_pricing
+          return {} unless defined?(Legion::Settings)
+          pricing = Legion::Settings.dig(:'legion-llm', :pricing)
+          pricing.is_a?(Hash) ? pricing : {}
+        rescue StandardError
+          {}
+        end
+      end
+    end
+  end
+end
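
The cost formula in `record` is tokens multiplied by the per-1M-token price, divided by 1,000,000, summed over input and output. The sketch below works one request through that formula and shows one way an accumulator could be made explicitly thread-safe. The `CostLedger` class and its `Mutex` are assumptions for illustration; the file above shows no locking of its own.

```ruby
# Hypothetical ledger; the class name and the Mutex are assumptions,
# not part of the released module.
class CostLedger
  def initialize(pricing)
    @pricing = pricing
    @records = []
    @lock    = Mutex.new
  end

  # Cost formula as in the diff: tokens * per-1M price / 1_000_000.
  def record(model:, input_tokens:, output_tokens:)
    price = @pricing.fetch(model)
    cost  = (input_tokens * price[:input] / 1_000_000.0) +
            (output_tokens * price[:output] / 1_000_000.0)
    entry = { model: model, cost_usd: cost.round(6) }
    @lock.synchronize { @records << entry }
    entry
  end

  def summary
    @lock.synchronize do
      { total_cost_usd: @records.sum { |r| r[:cost_usd] }.round(6),
        total_requests: @records.size }
    end
  end
end

ledger = CostLedger.new({ 'claude-sonnet-4-6' => { input: 3.0, output: 15.0 } })

# 1_000 input tokens and 500 output tokens:
# 1_000 * 3.0 / 1e6 + 500 * 15.0 / 1e6 = 0.003 + 0.0075 = 0.0105 USD
entry = ledger.record(model: 'claude-sonnet-4-6', input_tokens: 1_000, output_tokens: 500)

# Concurrent writers: 4 threads x 50 small requests each.
threads = 4.times.map do
  Thread.new do
    50.times { ledger.record(model: 'claude-sonnet-4-6', input_tokens: 10, output_tokens: 10) }
  end
end
threads.each(&:join)

puts ledger.summary[:total_requests] # 201
```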

data/lib/legion/llm/daemon_client.rb CHANGED

@@ -76,6 +76,7 @@ module Legion
         healthy = response.code == '200'
         @healthy           = healthy
         @health_checked_at = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC)
+        Legion::Logging.info("Daemon health check result=#{healthy ? 'healthy' : 'unhealthy'} url=#{daemon_url}") if defined?(Legion::Logging)
         healthy
       rescue StandardError
         mark_unhealthy
@@ -84,6 +85,7 @@ module Legion
       # Marks the daemon as unhealthy and records the timestamp.
       def mark_unhealthy
+        Legion::Logging.warn("Daemon marked unhealthy url=#{daemon_url}") if defined?(Legion::Logging)
         @healthy           = false
         @health_checked_at = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC)
       end
@@ -128,9 +130,11 @@ module Legion
           data = parsed.fetch(:data, {})
           { status: :accepted, request_id: data[:request_id], poll_key: data[:poll_key] }
         when 403
+          Legion::Logging.warn("Daemon returned 403 Denied url=#{daemon_url}") if defined?(Legion::Logging)
           { status: :denied, error: parsed.fetch(:error, parsed) }
         when 429
           retry_after = extract_retry_after(response, parsed)
+          Legion::Logging.warn("Daemon returned 429 RateLimited url=#{daemon_url} retry_after=#{retry_after}") if defined?(Legion::Logging)
           { status: :rate_limited, retry_after: retry_after }
         when 503
           { status: :unavailable }

data/lib/legion/llm/discovery/ollama.rb CHANGED

@@ -30,10 +30,13 @@ module Legion
             if response.success?
               parsed = ::JSON.parse(response.body)
               @models = parsed['models'] || []
+              Legion::Logging.debug("Discovery::Ollama model list refreshed count=#{@models.size}") if defined?(Legion::Logging)
             else
+              Legion::Logging.warn("Discovery::Ollama HTTP failure status=#{response.status}") if defined?(Legion::Logging)
               @models ||= []
             end
-          rescue StandardError
+          rescue StandardError => e
+            Legion::Logging.warn("Discovery::Ollama HTTP failure: #{e.message}") if defined?(Legion::Logging)
             @models ||= []
           ensure
             @last_refreshed_at = Time.now

data/lib/legion/llm/discovery/system.rb CHANGED

@@ -94,7 +94,8 @@ module Legion
           def fetch_macos_total
             raw = `sysctl -n hw.memsize`.strip.to_i
             @total_memory_mb = raw / 1024 / 1024
-          rescue StandardError
+          rescue StandardError => e
+            Legion::Logging.warn("Discovery::System sysctl command failed: #{e.message}") if defined?(Legion::Logging)
             @total_memory_mb = nil
           end
@@ -104,7 +105,8 @@ module Legion
             free     = vm_output[/Pages free:\s+(\d+)/, 1].to_i
             inactive = vm_output[/Pages inactive:\s+(\d+)/, 1].to_i
             @available_memory_mb = (free + inactive) * page_size / 1024 / 1024
-          rescue StandardError
+          rescue StandardError => e
+            Legion::Logging.warn("Discovery::System vm_stat command failed: #{e.message}") if defined?(Legion::Logging)
             @available_memory_mb = nil
           end
@@ -112,7 +114,8 @@ module Legion
             meminfo = File.read('/proc/meminfo')
             total_kb = meminfo[/MemTotal:\s+(\d+)/, 1].to_i
             @total_memory_mb = total_kb / 1024
-          rescue StandardError
+          rescue StandardError => e
+            Legion::Logging.warn("Discovery::System /proc/meminfo read failed: #{e.message}") if defined?(Legion::Logging)
             @total_memory_mb = nil
           end
@@ -121,7 +124,8 @@ module Legion
             free_kb     = meminfo[/MemFree:\s+(\d+)/, 1].to_i
             inactive_kb = meminfo[/Inactive:\s+(\d+)/, 1].to_i
             @available_memory_mb = (free_kb + inactive_kb) / 1024
-          rescue StandardError
+          rescue StandardError => e
+            Legion::Logging.warn("Discovery::System /proc/meminfo available read failed: #{e.message}") if defined?(Legion::Logging)
             @available_memory_mb = nil
           end
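
The Linux branches above extract kilobyte counts from `/proc/meminfo` with a regex capture and integer division. A standalone sketch of that parsing on a made-up sample (the numbers in `SAMPLE_MEMINFO` are invented, and the two function names are hypothetical). Because `nil.to_i` is `0`, a missing field yields 0 rather than raising, so the new `rescue` logging would mainly fire on read errors.

```ruby
# Invented sample in /proc/meminfo layout.
SAMPLE_MEMINFO = <<~TEXT
  MemTotal:       16384000 kB
  MemFree:         2048000 kB
  Inactive:        1024000 kB
TEXT

def total_memory_mb(meminfo)
  meminfo[/MemTotal:\s+(\d+)/, 1].to_i / 1024
end

def available_memory_mb(meminfo)
  free_kb     = meminfo[/MemFree:\s+(\d+)/, 1].to_i
  inactive_kb = meminfo[/Inactive:\s+(\d+)/, 1].to_i
  (free_kb + inactive_kb) / 1024
end

puts total_memory_mb(SAMPLE_MEMINFO)     # 16000
puts available_memory_mb(SAMPLE_MEMINFO) # 3000
puts total_memory_mb('')                 # 0, no exception
```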

data/lib/legion/llm/off_peak.rb ADDED

@@ -0,0 +1,46 @@
+# frozen_string_literal: true
+module Legion
+  module LLM
+    module OffPeak
+      # Peak hours in UTC: 14:00-22:00 (9 AM - 5 PM CT)
+      PEAK_HOURS = (14..22)
+      class << self
+        # Returns true if the given time falls within peak hours.
+        #
+        # @param time [Time] time to check (defaults to now)
+        # @return [Boolean]
+        def peak_hour?(time = Time.now.utc)
+          result = PEAK_HOURS.cover?(time.hour)
+          Legion::Logging.debug("OffPeak peak_hour check hour=#{time.hour} peak=#{result}") if defined?(Legion::Logging)
+          result
+        end
+        # Returns true when a non-urgent request should be deferred to off-peak.
+        #
+        # @param priority [Symbol] :urgent bypasses deferral; :normal and :low defer during peak
+        # @return [Boolean]
+        def should_defer?(priority: :normal)
+          return false if priority.to_sym == :urgent
+          peak_hour?
+        end
+        # Returns the next off-peak Time (UTC).
+        # If already off-peak, returns the current time.
+        # Off-peak begins at the hour after the peak window ends (23:00 UTC).
+        #
+        # @param time [Time] reference time (defaults to now)
+        # @return [Time]
+        def next_off_peak(time = Time.now.utc)
+          if time.hour < PEAK_HOURS.first || time.hour >= PEAK_HOURS.last
+            time
+          else
+            Time.utc(time.year, time.month, time.day, PEAK_HOURS.last, 0, 0)
+          end
+        end
+      end
+    end
+  end
+end
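
To see how the inclusive range `(14..22)` behaves at its upper edge, the two time checks can be copied into standalone functions (logging omitted, explicit `time` argument required). This is only a sketch for exploring boundary values, not a replacement for the module.

```ruby
PEAK_HOURS = (14..22) # same inclusive range as in the diff

def peak_hour?(time)
  PEAK_HOURS.cover?(time.hour)
end

def next_off_peak(time)
  if time.hour < PEAK_HOURS.first || time.hour >= PEAK_HOURS.last
    time
  else
    Time.utc(time.year, time.month, time.day, PEAK_HOURS.last, 0, 0)
  end
end

mid_peak = Time.utc(2026, 3, 22, 15, 30)
edge     = Time.utc(2026, 3, 22, 22, 30)

puts peak_hour?(mid_peak)        # true
puts next_off_peak(mid_peak)     # 2026-03-22 22:00:00 UTC
puts peak_hour?(edge)            # true: the range includes hour 22
puts next_off_peak(edge) == edge # true: hour 22 also counts as off-peak here
```

At 22:30 UTC the two functions disagree: `peak_hour?` still reports peak while `next_off_peak` returns the input unchanged. A caller that needs them to agree could use an exclusive range such as `(14...22)`; that is a suggestion, not something the diff does.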

data/lib/legion/llm/response_cache.rb CHANGED

@@ -26,6 +26,7 @@ module Legion
       # Writes error details and marks status as :error.
       def fail_request(request_id, code:, message:, ttl: DEFAULT_TTL)
+        Legion::Logging.warn("ResponseCache fail_request request_id=#{request_id} code=#{code} message=#{message}") if defined?(Legion::Logging)
         payload = ::JSON.dump({ code: code, message: message })
         cache_set(error_key(request_id), payload, ttl)
         cache_set(status_key(request_id), 'error', ttl)
@@ -69,6 +70,7 @@ module Legion
         loop do
           current = status(request_id)
+          Legion::Logging.debug("ResponseCache poll request_id=#{request_id} status=#{current}") if defined?(Legion::Logging)
           case current
           when :done
@@ -120,6 +122,7 @@ module Legion
       private_class_method def self.write_response(request_id, response_text, ttl)
         if response_text.bytesize > SPOOL_THRESHOLD
+          Legion::Logging.warn("ResponseCache spool overflow request_id=#{request_id} bytes=#{response_text.bytesize}") if defined?(Legion::Logging)
           FileUtils.mkdir_p(SPOOL_DIR)
           path = File.join(SPOOL_DIR, "#{request_id}.txt")
           File.write(path, response_text)

data/lib/legion/llm/router/health_tracker.rb CHANGED

@@ -49,7 +49,10 @@ module Legion
           if circuit[:state] == :open
             elapsed = Time.now - circuit[:opened_at]
-            return :half_open if elapsed >= @cooldown_seconds
+            if elapsed >= @cooldown_seconds
+              Legion::Logging.warn("Circuit open->half_open for provider=#{provider} (cooldown elapsed)") if defined?(Legion::Logging)
+              return :half_open
+            end
           end
           circuit[:state]
@@ -82,11 +85,13 @@ module Legion
             if circuit_state(provider) == :half_open
               circuit[:state]     = :open
               circuit[:opened_at] = Time.now
+              Legion::Logging.warn("Circuit half_open->open for provider=#{provider} (error during probe)") if defined?(Legion::Logging)
             else
               circuit[:failures] += 1.0
               if circuit[:failures] >= @failure_threshold
                 circuit[:state]     = :open
                 circuit[:opened_at] = Time.now
+                Legion::Logging.warn("Circuit closed->open for provider=#{provider} (failures=#{circuit[:failures]})") if defined?(Legion::Logging)
               end
             end
           end
@@ -94,10 +99,12 @@ module Legion
           register_handler(:success) do |payload|
             provider = payload[:provider]
             ensure_circuit(provider)
+            prev_state          = circuit_state(provider)
             circuit             = @circuits[provider]
             circuit[:failures]  = 0
             circuit[:state]     = :closed
             circuit[:opened_at] = nil
+            Legion::Logging.warn("Circuit #{prev_state}->closed for provider=#{provider}") if defined?(Legion::Logging) && prev_state != :closed
           end
           register_handler(:quality_failure) do |payload|
@@ -108,11 +115,13 @@ module Legion
             if circuit_state(provider) == :half_open
               circuit[:state]     = :open
               circuit[:opened_at] = Time.now
+              Legion::Logging.warn("Circuit half_open->open for provider=#{provider} (quality failure during probe)") if defined?(Legion::Logging)
             else
               circuit[:failures] += 0.5
               if circuit[:failures] >= @failure_threshold
                 circuit[:state]     = :open
                 circuit[:opened_at] = Time.now
+                Legion::Logging.warn("Circuit closed->open for provider=#{provider} (quality failures=#{circuit[:failures]})") if defined?(Legion::Logging)
               end
             end
           end
@@ -152,7 +161,9 @@ module Legion
           return 0 if avg <= LATENCY_THRESHOLD_MS
           multiplier = (avg / LATENCY_THRESHOLD_MS).floor
-          [LATENCY_PENALTY_STEP * multiplier, OPEN_PENALTY].max
+          penalty = [LATENCY_PENALTY_STEP * multiplier, OPEN_PENALTY].max
+          Legion::Logging.debug("Latency penalty applied to provider=#{provider} avg_ms=#{avg.round} penalty=#{penalty}") if defined?(Legion::Logging)
+          penalty
         end
       end
     end
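
The transitions that now emit warnings (closed->open, open->half_open, half_open->open, any->closed) form a small circuit-breaker state machine. A minimal single-provider sketch, assuming a hypothetical `Circuit` class with an injectable clock; the real tracker keeps one circuit per provider and dispatches through signal handlers, which is not reproduced here.

```ruby
# Hypothetical single-provider circuit; names and defaults are assumptions.
class Circuit
  attr_reader :failures

  def initialize(failure_threshold: 3, cooldown_seconds: 60)
    @failure_threshold = failure_threshold
    @cooldown_seconds  = cooldown_seconds
    @state     = :closed
    @failures  = 0.0
    @opened_at = nil
  end

  # An open circuit reads as :half_open once the cooldown has elapsed.
  def state(now = Time.now)
    return :half_open if @state == :open && (now - @opened_at) >= @cooldown_seconds

    @state
  end

  # weight 1.0 models an error, 0.5 a quality failure, as in the diff.
  def record_failure(weight = 1.0, now = Time.now)
    if state(now) == :half_open
      trip(now) # a failed probe re-opens immediately
    else
      @failures += weight
      trip(now) if @failures >= @failure_threshold
    end
  end

  def record_success
    @failures  = 0.0
    @state     = :closed
    @opened_at = nil
  end

  private

  def trip(now)
    @state     = :open
    @opened_at = now
  end
end

t0 = Time.utc(2026, 3, 22, 12, 0, 0)
c  = Circuit.new(failure_threshold: 2, cooldown_seconds: 60)
c.record_failure(1.0, t0)
c.record_failure(0.5, t0)
puts c.state(t0)      # closed (1.5 < 2)
c.record_failure(0.5, t0)
puts c.state(t0 + 30) # open
puts c.state(t0 + 60) # half_open
```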

data/lib/legion/llm/router/rule.rb CHANGED

@@ -39,9 +39,17 @@ module Legion
         def matches_intent?(intent)
           @conditions.all? do |key, value|
-            return false unless intent.key?(key)
+            unless intent.key?(key)
+              Legion::Logging.debug("Rule '#{@name}' rejected: missing intent key=#{key}") if defined?(Legion::Logging)
+              return false
+            end
-            intent[key].to_s == value.to_s
+            unless intent[key].to_s == value.to_s
+              Legion::Logging.debug("Rule '#{@name}' rejected: intent #{key}=#{intent[key]} != #{value}") if defined?(Legion::Logging)
+              return false
+            end
+            true
           end
         end
@@ -60,17 +68,32 @@ module Legion
           sched = @schedule.transform_keys(&:to_s)
           now = localize(now, sched['timezone'])
-          return false if sched['valid_from']  && now < Time.parse(sched['valid_from'])
-          return false if sched['valid_until'] && now > Time.parse(sched['valid_until'])
-          return false if sched['hours'] && !within_hours?(sched['hours'], now)
-          return false if sched['days']  && !on_allowed_day?(sched['days'], now)
-          true
+          schedule_rejection(sched, now).nil?
         end
         private
+        def schedule_rejection(sched, now)
+          if sched['valid_from'] && now < Time.parse(sched['valid_from'])
+            Legion::Logging.debug("Rule '#{@name}' rejected: before valid_from=#{sched['valid_from']}") if defined?(Legion::Logging)
+            return :valid_from
+          end
+          if sched['valid_until'] && now > Time.parse(sched['valid_until'])
+            Legion::Logging.debug("Rule '#{@name}' rejected: after valid_until=#{sched['valid_until']}") if defined?(Legion::Logging)
+            return :valid_until
+          end
+          if sched['hours'] && !within_hours?(sched['hours'], now)
+            Legion::Logging.debug("Rule '#{@name}' rejected: outside schedule hours=#{sched['hours']}") if defined?(Legion::Logging)
+            return :hours
+          end
+          if sched['days'] && !on_allowed_day?(sched['days'], now)
+            Legion::Logging.debug("Rule '#{@name}' rejected: outside schedule days=#{sched['days']}") if defined?(Legion::Logging)
+            return :days
+          end
+          nil
+        end
         def localize(time, timezone_name)
           return time unless timezone_name

data/lib/legion/llm/router.rb CHANGED

@@ -28,7 +28,17 @@ module Legion
           rules = load_rules
           candidates = select_candidates(rules, merged)
           best = pick_best(candidates)
-          best&.to_resolution
+          resolution = best&.to_resolution
+          if resolution
+            if defined?(Legion::Logging)
+              Legion::Logging.info("Routed to tier=#{resolution.tier} provider=#{resolution.provider} model=#{resolution.model} via rule='#{resolution.rule}'")
+            end
+          elsif defined?(Legion::Logging)
+            Legion::Logging.debug('Router: no rules matched, resolution is nil')
+          end
+          resolution
         end
         def resolve_chain(intent: nil, tier: nil, model: nil, provider: nil, max_escalations: nil)
@@ -100,6 +110,8 @@ module Legion
         end
         def select_candidates(rules, intent)
+          Legion::Logging.debug("Router: selecting candidates from #{rules.size} rules") if defined?(Legion::Logging)
           # 1. Collect constraints from constraint rules that match the intent
           constraints = rules
                         .select { |r| r.constraint && r.matches_intent?(intent) }
@@ -118,7 +130,11 @@ module Legion
           discovered = unconstrained.reject { |r| excluded_by_discovery?(r) }
           # 5. Filter by tier availability
-          discovered.select { |r| tier_available?(r.target[:tier] || r.target['tier']) }
+          final = discovered.select { |r| tier_available?(r.target[:tier] || r.target['tier']) }
+          Legion::Logging.debug("Router: #{final.size} candidates after filtering (started with #{rules.size})") if defined?(Legion::Logging)
+          final
         end
         def excluded_by_constraint?(rule, constraints)

data/lib/legion/llm/scheduling.rb CHANGED

@@ -24,7 +24,9 @@ module Legion
           return false unless enabled?
           return false if urgency.to_sym == :immediate
-          eligible_for_deferral?(intent.to_sym) && peak_hours?
+          result = eligible_for_deferral?(intent.to_sym) && peak_hours?
+          Legion::Logging.debug("Scheduling defer decision intent=#{intent} urgency=#{urgency} defer=#{result}") if defined?(Legion::Logging)
+          result
         end
         # Returns true if the current UTC hour falls within the configured peak window.

data/lib/legion/llm/shadow_eval.rb CHANGED

@@ -17,6 +17,7 @@ module Legion
         def evaluate(primary_response:, messages: nil, shadow_model: nil) # rubocop:disable Lint/UnusedMethodArgument
           shadow_model ||= Legion::Settings.dig(:llm, :shadow, :model) || 'gpt-4o-mini'
+          Legion::Logging.debug("ShadowEval triggered primary_model=#{primary_response[:model]} shadow_model=#{shadow_model}") if defined?(Legion::Logging)
           shadow_response = Legion::LLM.send(:chat_single,
                                              model: shadow_model, provider: nil,
@@ -27,6 +28,7 @@ module Legion
           Legion::Events.emit('llm.shadow_eval', comparison) if defined?(Legion::Events)
           comparison
         rescue StandardError => e
+          Legion::Logging.warn("ShadowEval failed shadow_model=#{shadow_model}: #{e.message}") if defined?(Legion::Logging)
           { error: e.message, shadow_model: shadow_model }
         end

data/lib/legion/llm/structured_output.rb CHANGED

@@ -26,6 +26,7 @@ module Legion
                                                 json_schema: { name: 'response', schema: schema } },
                              **opts.except(:attempt))
           else
+            Legion::Logging.debug("StructuredOutput using prompt-based fallback for model=#{model}") if defined?(Legion::Logging)
             instruction = "You MUST respond with valid JSON matching this schema:\n" \
                           "```json\n#{Legion::JSON.dump(schema)}\n```\n" \
                           'Respond with ONLY the JSON object, no other text.'
@@ -37,8 +38,10 @@ module Legion
         end
         def handle_parse_error(error, messages, schema, model, result, **opts)
-          if retry_enabled? && (opts[:attempt] || 0) < max_retries
-            retry_with_instruction(messages, schema, model, attempt: (opts[:attempt] || 0) + 1, **opts)
+          attempt = opts[:attempt] || 0
+          Legion::Logging.warn("StructuredOutput JSON parse failure attempt=#{attempt} model=#{model}: #{error.message}") if defined?(Legion::Logging)
+          if retry_enabled? && attempt < max_retries
+            retry_with_instruction(messages, schema, model, attempt: attempt + 1, **opts)
           else
             { data: nil, error: "JSON parse failed: #{error.message}", raw: result&.dig(:content), valid: false }
           end
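
`handle_parse_error` retries while `attempt < max_retries` and otherwise returns an invalid result carrying the parse error. A self-contained sketch of that retry budget, assuming a hypothetical `parse_with_retry` function where `responses` stands in for successive model replies; no model is called.

```ruby
require 'json'

# Hypothetical: responses[n] is the reply the model would give on attempt n.
def parse_with_retry(responses, max_retries: 2)
  attempt = 0
  begin
    raw = responses[attempt]
    { data: JSON.parse(raw, symbolize_names: true), valid: true, attempts: attempt + 1 }
  rescue JSON::ParserError, TypeError => e
    # Same budget check as the diff: retry only while attempt < max_retries.
    if attempt < max_retries
      attempt += 1
      retry
    end
    { data: nil, error: "JSON parse failed: #{e.message}", raw: raw, valid: false }
  end
end

ok  = parse_with_retry(['oops', '{"ok": true}'])
bad = parse_with_retry(%w[a b c], max_retries: 2)

puts ok[:valid]    # true, on the second attempt
puts bad[:valid]   # false, after three attempts
```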

data/lib/legion/llm/version.rb CHANGED

@@ -2,6 +2,6 @@
 module Legion
   module LLM
-    VERSION = '0.3.15'
+    VERSION = '0.3.18'
   end
 end

data/lib/legion/llm.rb CHANGED

@@ -15,6 +15,8 @@ require_relative 'llm/daemon_client'
 require_relative 'llm/arbitrage'
 require_relative 'llm/batch'
 require_relative 'llm/scheduling'
+require_relative 'llm/off_peak'
+require_relative 'llm/cost_tracker'
 begin
   require 'legion/extensions/llm/gateway'
@@ -124,7 +126,7 @@ module Legion
                    )
                  else
                    chat_single(model: model, provider: provider, intent: intent, tier: tier,
-                               temperature: temperature, **kwargs)
+                               temperature: temperature, message: message, **kwargs)
                  end
         if cache_key && result.is_a?(Hash)
@@ -185,6 +187,10 @@ module Legion
         agent_class.new(**)
       end
+      FRAMEWORK_KEYS = %i[request_id source timestamp datetime task_id parent_id master_id
+                          check_subtask generate_task catch_exceptions worker_id principal_id
+                          principal_type].freeze
       private
       def _dispatch_chat(model:, provider:, intent:, tier:, escalate:, max_escalations:, quality_check:, message:, **kwargs)
@@ -276,7 +282,7 @@ module Legion
         Legion::Extensions::LLM::Gateway::Runners::Inference.chat(**)
       end
-      def chat_single(model:, provider:, intent:, tier:, **kwargs)
+      def chat_single(model:, provider:, intent:, tier:, message: nil, **kwargs)
         if (intent || tier) && Router.routing_enabled?
           resolution = Router.resolve(intent: intent, tier: tier, model: model, provider: provider)
           if resolution
@@ -295,12 +301,15 @@ module Legion
         opts = {}
         opts[:model]    = model    if model
         opts[:provider] = provider if provider
-        opts.merge!(kwargs)
+        opts.merge!(kwargs.except(*FRAMEWORK_KEYS))
         opts.delete(:temperature) if opts[:temperature].nil?
         inject_anthropic_cache_control!(opts, provider)
-        RubyLLM.chat(**opts)
+        session = RubyLLM.chat(**opts)
+        return session unless message
+        session.ask(message)
       end
       def chat_with_escalation(model:, provider:, intent:, tier:, max_escalations:, quality_check:, message:, **kwargs)

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: legion-llm
 version: !ruby/object:Gem::Version
-  version: 0.3.15
+  version: 0.3.18
 platform: ruby
 authors:
 - Esity
@@ -137,6 +137,7 @@ files:
 - lib/legion/llm/cache.rb
 - lib/legion/llm/claude_config_loader.rb
 - lib/legion/llm/compressor.rb
+- lib/legion/llm/cost_tracker.rb
 - lib/legion/llm/daemon_client.rb
 - lib/legion/llm/discovery/ollama.rb
 - lib/legion/llm/discovery/system.rb
@@ -146,6 +147,7 @@ files:
 - lib/legion/llm/hooks.rb
 - lib/legion/llm/hooks/rag_guard.rb
 - lib/legion/llm/hooks/response_guard.rb
+- lib/legion/llm/off_peak.rb
 - lib/legion/llm/providers.rb
 - lib/legion/llm/quality_checker.rb
 - lib/legion/llm/response_cache.rb