legion-llm 0.3.5 → 0.3.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: e3e10dfcd60fe722290bec30017671fb261f3baf151b416c82defb082d4445f4
- data.tar.gz: 0c0b062649f8ede281fada374681d551d437126a2efa0e604a069610e14b7069
+ metadata.gz: '0914899eb9eee81b947d95d617a16ddf152fb74fa7afb4c1c1cfca74c9c8445d'
+ data.tar.gz: d4146a95967ceffca175c531fd412089f3a13df4b4b60964598e123115d3c19f
  SHA512:
- metadata.gz: eeb2cd074c2eb1c3b63ccb7644adbcf7cac6bab62f8d5cc966e318b2185267ab73fae920726b83e1f72cbf8753ba1245ab84ae914baa092aebdbef08e0548cd3
- data.tar.gz: ccc52360f869421100f0bbda503570168f2f7eb86c5fda9069e6ee29bdaa4c60553d202a1ad2e2109e98ba451b9abf9c00dc8210b50cf5d0a925192cada2ab9d
+ metadata.gz: 8d9fb16e659a4f24d6c01bb3b7caa96d6814980e5b9866fe8ccc293bae57121f8d21acc95efef98832b015875abebfe1ca2cbba63f825a43d64cb9feac82f9b2
+ data.tar.gz: 4e6788a7b28889ed80ec1701e5a45a05bcfe71914610b538fae2f68d3b16ac4942e8edd0abbf2414d3dd124edc109817ceef3390d22108c1c9899a82b6d93c55
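
The values in checksums.yaml are hex digests of the two archive members (`metadata.gz` and `data.tar.gz`) packed inside the `.gem` file. A minimal sketch of how such a digest could be recomputed and compared, assuming the member bytes have already been extracted. The `digest_matches?` helper and the file path in the usage comment are hypothetical and not part of RubyGems or legion-llm:

```ruby
require 'digest'

# Hypothetical helper: compare the digest of raw bytes against an expected hex string.
# `algorithm` is :sha256 or :sha512, matching the two sections of checksums.yaml.
def digest_matches?(bytes, expected_hex, algorithm: :sha256)
  klass = { sha256: Digest::SHA256, sha512: Digest::SHA512 }.fetch(algorithm)
  klass.hexdigest(bytes) == expected_hex.downcase
end

# Usage sketch (assumes data.tar.gz was extracted from the .gem archive first):
#   bytes = File.binread('data.tar.gz')
#   digest_matches?(bytes, expected_sha256_from_checksums_yaml)
```

Both digests change between releases whenever any packaged file changes, so a mismatch against the published values suggests the archive differs from what the registry served.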
data/CHANGELOG.md CHANGED
@@ -1,5 +1,10 @@
  # Legion LLM Changelog

+ ## [0.3.6] - 2026-03-18
+
+ ### Added
+ - Add `lex-claude`, `lex-gemini`, `lex-openai` as runtime dependencies (AI provider extensions)
+
  ## [0.3.5] - 2026-03-18

  ### Added
data/CLAUDE.md CHANGED
@@ -8,6 +8,7 @@
  Core LegionIO gem providing LLM capabilities to all extensions. Wraps ruby_llm to provide a consistent interface for chat, embeddings, tool use, and agents across multiple providers (Bedrock, Anthropic, OpenAI, Gemini, Ollama). Includes a dynamic weighted routing engine that dispatches requests across local, fleet, and cloud tiers based on caller intent, priority rules, time schedules, cost multipliers, and real-time provider health.

  **GitHub**: https://github.com/LegionIO/legion-llm
+ **Version**: 0.3.5
  **License**: Apache-2.0

  ## Architecture
@@ -61,8 +62,7 @@ Three-tier dispatch model. Local-first avoids unnecessary network hops; fleet of
  │ Zero network overhead, no Transport │
  │ │
  │ Tier 2: FLEET → Ollama on Mac Studios / GPU servers │
- │ Via Legion::Transport (AMQP) when local can't
- │ serve the model (Phase 2, not yet built) │
+ │ Via lex-llm-gateway RPC over AMQP
  │ │
  │ Tier 3: CLOUD → Bedrock / Anthropic / OpenAI / Gemini │
  │ Existing provider API calls │
@@ -87,6 +87,19 @@ Three-tier dispatch model. Local-first avoids unnecessary network hops; fleet of
  5. Return Resolution for highest-scoring candidate
  ```

+ ### Gateway Integration (lex-llm-gateway)
+
+ When `lex-llm-gateway` is installed, `chat`, `embed`, and `structured` automatically delegate to the gateway for metering and fleet dispatch. The gateway is loaded via `begin/rescue LoadError` — optional, not a hard dependency.
+
+ ```
+ Caller → Legion::LLM.chat(message:)
+   └─ gateway loaded? → Gateway::Runners::Inference.chat (meters, fleet dispatch)
+        └─ Legion::LLM.chat_direct (routing, escalation, RubyLLM)
+   └─ no gateway? → Legion::LLM.chat_direct (same path, no metering)
+ ```
+
+ The `_direct` variants (`chat_direct`, `embed_direct`, `structured_direct`) bypass gateway delegation. The gateway's `call_llm` uses these to avoid infinite recursion.
+
  ### Integration with LegionIO

  - **Service**: `setup_llm` called between data and supervision in startup sequence
@@ -94,6 +107,7 @@ Three-tier dispatch model. Local-first avoids unnecessary network hops; fleet of
  - **Helpers**: `Legion::Extensions::Helpers::LLM` auto-loaded when gem is present
  - **Readiness**: Registers as `:llm` in `Legion::Readiness`
  - **Shutdown**: `Legion::LLM.shutdown` called during service shutdown
+ - **Gateway**: `lex-llm-gateway` auto-loaded if present; provides metering and fleet RPC

  ## Dependencies

@@ -103,6 +117,7 @@ Three-tier dispatch model. Local-first avoids unnecessary network hops; fleet of
  | `tzinfo` (>= 2.0) | IANA timezone conversion for schedule windows |
  | `legion-logging` | Logging |
  | `legion-settings` | Configuration |
+ | `lex-llm-gateway` (optional) | Metering over RMQ, fleet RPC dispatch, disk spool — auto-loaded if present |

  ## Key Interfaces

@@ -113,11 +128,15 @@ Legion::LLM.shutdown # Cleanup
  Legion::LLM.started? # -> Boolean
  Legion::LLM.settings # -> Hash

- # Chat (with optional routing)
- Legion::LLM.chat(model:, provider:) # Direct (no routing)
+ # Chat (delegates to gateway when loaded, otherwise direct)
+ Legion::LLM.chat(message: 'hello', model:, provider:) # Gateway-metered if available
  Legion::LLM.chat(intent: { privacy: :strict }) # Intent-based routing
  Legion::LLM.chat(tier: :cloud, model: 'claude-sonnet-4-6') # Explicit tier override
- Legion::LLM.embed(text, model:) # Embeddings (no routing)
+ Legion::LLM.chat_direct(message:, model:, provider:) # Bypass gateway (no metering)
+ Legion::LLM.embed(text, model:) # Embeddings (gateway-metered)
+ Legion::LLM.embed_direct(text, model:) # Bypass gateway
+ Legion::LLM.structured(messages:, schema:) # Structured (gateway-metered)
+ Legion::LLM.structured_direct(messages:, schema:) # Bypass gateway
  Legion::LLM.agent(AgentClass) # Agent instance

  # Compressor
@@ -284,7 +303,7 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
  | `lib/legion/llm/embeddings.rb` | Embeddings module: generate, generate_batch, default_model |
  | `lib/legion/llm/shadow_eval.rb` | Shadow evaluation: enabled?, should_sample?, evaluate, compare |
  | `lib/legion/llm/structured_output.rb` | JSON schema enforcement with native response_format and prompt fallback |
- | `lib/legion/llm/version.rb` | Version constant (0.3.3) |
+ | `lib/legion/llm/version.rb` | Version constant (0.3.5) |
  | `lib/legion/llm/quality_checker.rb` | QualityChecker module with QualityResult struct |
  | `lib/legion/llm/escalation_history.rb` | EscalationHistory mixin: `escalation_history`, `escalated?`, `final_resolution`, `escalation_chain` |
  | `lib/legion/llm/router/escalation_chain.rb` | EscalationChain value object |
@@ -315,6 +334,7 @@ In-memory signal consumer with pluggable handlers. Adjusts effective priorities
  | `spec/legion/llm/embeddings_spec.rb` | Embeddings tests |
  | `spec/legion/llm/shadow_eval_spec.rb` | ShadowEval tests |
  | `spec/legion/llm/structured_output_spec.rb` | StructuredOutput tests |
+ | `spec/legion/llm/gateway_integration_spec.rb` | Tests: gateway delegation and _direct bypass |
  | `spec/spec_helper.rb` | Stubbed Legion::Logging and Legion::Settings for testing |

  ## Extension Integration
@@ -374,8 +394,8 @@ The legacy `vault_path` per-provider setting was removed in v0.3.1.
  Tests run without the full LegionIO stack. `spec/spec_helper.rb` stubs `Legion::Logging` and `Legion::Settings` with in-memory implementations. Each test resets settings to defaults via `before(:each)`.

  ```bash
- bundle exec rspec # 287 examples, 0 failures
- bundle exec rubocop # 31 files, 0 offenses
+ bundle exec rspec # 304 examples, 0 failures
+ bundle exec rubocop # 52 files, 0 offenses
  ```

  ## Design Documents
@@ -389,8 +409,8 @@ bundle exec rubocop # 31 files, 0 offenses

  ## Future (Not Yet Built)

- - **Fleet tier (Phase 2)**: `lex-llm-fleet` extension inference workers on Mac Studios / NVIDIA servers, dispatched via Legion::Transport AMQP queues
- - **Advanced signals (Phase 3)**: Budget tracking, lex-metering integration, GPU utilization monitoring
+ - **Advanced signals**: Budget tracking, GPU utilization monitoring, per-tenant spend limits
+ - **Fleet auto-scaling**: Dynamic worker pool sizing based on queue depth and latency

  ---

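
The Gateway Integration section added to CLAUDE.md describes an optional dependency loaded via `begin/rescue LoadError`, public methods that delegate to the gateway when it is present, and `_direct` variants that the gateway calls back into so the two never recurse. The sketch below illustrates that pattern only. `SketchLLM` and `SketchGateway` are hypothetical stand-ins, not the real `Legion::LLM` or `lex-llm-gateway` code, and the return values are placeholders:

```ruby
# Illustrative only: SketchLLM and SketchGateway are hypothetical stand-ins.
module SketchLLM
  class << self
    attr_accessor :gateway # nil when no gateway is registered

    # Optional dependency: a missing gem raises LoadError, which is rescued
    # so the caller simply continues without the gateway.
    def load_gateway(feature)
      require feature
      true
    rescue LoadError
      false
    end

    # Public entry point: delegate to the gateway when one is registered.
    def chat(message:)
      return gateway.chat(message: message) if gateway

      chat_direct(message: message)
    end

    # Bypass path: never consults the gateway, so the gateway can call it safely.
    def chat_direct(message:)
      "direct:#{message}"
    end
  end
end

# Hypothetical gateway: records a metering event, then calls the _direct variant.
module SketchGateway
  @metered = []

  class << self
    attr_reader :metered

    def chat(message:)
      @metered << message
      # Calling SketchLLM.chat here would re-enter the gateway and recurse forever.
      SketchLLM.chat_direct(message: message)
    end
  end
end
```

With no gateway registered, `SketchLLM.chat` falls through to `chat_direct`. After `SketchLLM.gateway = SketchGateway`, the same call is metered first and then served by the same direct path.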
data/Gemfile CHANGED
@@ -4,8 +4,6 @@ source 'https://rubygems.org'

  gemspec

- gem 'lex-llm-gateway', path: '../extensions-core/lex-llm-gateway' if File.directory?('../extensions-core/lex-llm-gateway')
-
  group :test do
  gem 'rake'
  gem 'rspec'
data/README.md CHANGED
@@ -2,6 +2,8 @@

  LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension.

+ **Version**: 0.3.5
+
  ## Installation

  ```ruby
@@ -599,7 +601,7 @@ bundle exec rspec
  Tests use stubbed `Legion::Logging` and `Legion::Settings` modules (no need for the full LegionIO stack):

  ```bash
- bundle exec rspec # Run all 269 tests
+ bundle exec rspec # Run all 304 tests
  bundle exec rubocop # Lint (0 offenses)
  bundle exec rspec spec/legion/llm_spec.rb # Run specific test file
  bundle exec rspec spec/legion/llm/router_spec.rb # Router tests only
data/legion-llm.gemspec CHANGED
@@ -27,6 +27,9 @@ Gem::Specification.new do |spec|

  spec.add_dependency 'legion-logging'
  spec.add_dependency 'legion-settings'
+ spec.add_dependency 'lex-claude'
+ spec.add_dependency 'lex-gemini'
+ spec.add_dependency 'lex-openai'
  spec.add_dependency 'ruby_llm', '>= 1.0'
  spec.add_dependency 'tzinfo', '>= 2.0'
  end
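
The three new `add_dependency` lines carry no version argument, which is why the generated gem metadata records them with a `>=` `'0'` requirement. A small sketch of that behaviour, using a throwaway specification (`sketch-gem` and the `runtime_requirements` helper are hypothetical and exist only for this illustration):

```ruby
require 'rubygems'

# Hypothetical helper: map each runtime dependency name to its requirement string.
def runtime_requirements(spec)
  spec.runtime_dependencies.to_h { |dep| [dep.name, dep.requirement.to_s] }
end

# Throwaway spec, used only to inspect how dependencies are recorded.
SKETCH_SPEC = Gem::Specification.new do |s|
  s.name    = 'sketch-gem'
  s.version = '0.0.1'
  s.summary = 'illustration'
  s.authors = ['example']
  s.add_dependency 'lex-claude'          # no version argument: any version satisfies it
  s.add_dependency 'ruby_llm', '>= 1.0'  # explicit lower bound
end
```

Calling `runtime_requirements(SKETCH_SPEC)` reports `lex-claude` with the requirement `>= 0` and `ruby_llm` with `>= 1.0`. Unconstrained dependencies resolve to whatever version the resolver picks at install time.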
data/lib/legion/llm/version.rb CHANGED
@@ -2,6 +2,6 @@

  module Legion
  module LLM
- VERSION = '0.3.5'
+ VERSION = '0.3.6'
  end
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: legion-llm
  version: !ruby/object:Gem::Version
- version: 0.3.5
+ version: 0.3.6
  platform: ruby
  authors:
  - Esity
@@ -37,6 +37,48 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
+ - !ruby/object:Gem::Dependency
+ name: lex-claude
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ - !ruby/object:Gem::Dependency
+ name: lex-gemini
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ - !ruby/object:Gem::Dependency
+ name: lex-openai
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
  - !ruby/object:Gem::Dependency
  name: ruby_llm
  requirement: !ruby/object:Gem::Requirement