lex-llm-gateway 0.2.0 → 0.2.3
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -0
- data/README.md +76 -1
- data/lex-llm-gateway.gemspec +8 -0
- data/lib/legion/extensions/llm/gateway/runners/inference.rb +14 -14
- data/lib/legion/extensions/llm/gateway/version.rb +1 -1
- data/lib/legion/extensions/llm/gateway.rb +1 -0
- metadata +99 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 113ab7ef4818f7904351d1c4ebf08bdfc8beabc3310c219d909443399d051e41
+  data.tar.gz: '0285316fbedb4b01afed93c25d9897b7723de482d06dd38bb85b2c6e59bcf2d8'
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: be41efa7a0e59477137225023730e271b04be3343cb178fd7110e863320f95d5ffeaafa4fc8d912056c78fd739e9ac0d6d2ec77b6d90c0d991b0df1622943e21
+  data.tar.gz: 985e1a3bc30d68657dacdc8f0df39bb5cd33d0cbc7192eb3336e99c7161d99e74af6909b99814b92ebd6084dc989ae5e0bb257c285482509be308cc43a5f9519
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,23 @@
 # Changelog
 
+## [0.2.3] - 2026-03-22
+
+### Changed
+- Add runtime deps: legion-cache >= 1.3.11, legion-crypt >= 1.4.9, legion-data >= 1.4.17, legion-json >= 1.2.1, legion-logging >= 1.3.2, legion-settings >= 1.3.14, legion-transport >= 1.3.9
+- Update spec_helper to require real sub-gem helpers and define Helpers::Lex stub with all 7 includes; require legion/transport for actor base class inheritance
+- Fix transport message specs: expect raise from new instead of validate (real Message#initialize calls validate)
+- Refactor Runners::Inference dispatch_chat and call_llm to resolve Metrics/ModuleLength and Metrics/MethodLength
+
+## [0.2.2] - 2026-03-22
+
+### Fixed
+- Replace bare `Process` with `::Process` in `Runners::Inference` (6 occurrences) to avoid resolving to `Legion::Process` instead of Ruby stdlib `::Process`, which caused a `NameError` and silently failed inference calls
+
+## [0.2.1] - 2026-03-20
+
+### Fixed
+- Add `Llm` constant alias for `LLM` so the framework can resolve `Legion::Extensions::Llm::Gateway` during extension discovery
+
 ## [0.2.0] - 2026-03-18
 
 ### Added
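Note on the 0.2.1 entry: the following is a minimal sketch of how a constant alias lets two spellings of a namespace resolve to the same module. The `Sketch` namespace and its `VERSION` value are hypothetical stand-ins, not the gem's code, and the idea that the framework looks the extension up under the `Llm` spelling is taken only from the changelog wording.

```ruby
# Hypothetical stand-in for the Legion::Extensions namespace.
module Sketch
  module Extensions
    module LLM
      module Gateway
        VERSION = '0.0.0-sketch' # placeholder value for the demo
      end
    end

    # The alias: Llm and LLM now name the same module object.
    Llm = LLM
  end
end

# A lookup that uses the "Llm" spelling now succeeds.
resolved = Object.const_get('Sketch::Extensions::Llm::Gateway')
puts resolved.equal?(Sketch::Extensions::LLM::Gateway) # prints true
puts resolved.name # prints Sketch::Extensions::LLM::Gateway
```

The alias does not copy anything: the module keeps the name it was first assigned, and both constants point at it.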
data/README.md
CHANGED
@@ -1,6 +1,6 @@
 # lex-llm-gateway
 
-LLM inference gateway for LegionIO. Provides centralized metering over RabbitMQ, fleet RPC dispatch to GPU workers, and local disk spooling for offline resilience.
+LLM inference gateway for [LegionIO](https://github.com/LegionIO/LegionIO). Provides centralized metering over RabbitMQ, fleet RPC dispatch to GPU workers, and local disk spooling for offline resilience.
 
 ## Installation
 
@@ -10,6 +10,81 @@ Add to your Gemfile:
 gem 'lex-llm-gateway'
 ```
 
+## Overview
+
+`lex-llm-gateway` wraps all LLM calls with automatic metering and fleet routing. It is designed for clusters with 100k+ edge nodes that cannot have direct database access.
+
+Three node roles:
+
+| Role | What It Does |
+|------|-------------|
+| **Publisher** (all nodes) | Calls `Inference.chat` which auto-meters to RMQ or disk spool |
+| **Fleet Worker** (GPU nodes) | Runs InferenceWorker actor, processes fleet requests |
+| **Metering Writer** (DB nodes) | Runs MeteringWriter actor, writes to `metering_records` |
+
+## Degradation Ladder
+
+```
+Full stack (transport + gateway + LLM + fleet)
+no transport -> spool to disk, flush when reconnected
+no gateway -> Legion::LLM direct (no metering)
+no fleet -> local/cloud only
+no cloud -> local LLM only
+no local -> error
+```
+
+## Runners
+
+- **Metering** - `build_event`, `publish_or_spool`, `flush_spool`
+- **Inference** - `chat`, `embed`, `structured` (all auto-metered)
+- **Fleet** - `dispatch` to GPU workers with timeout and JWT auth
+- **FleetHandler** - `handle_fleet_request` (validates JWT, calls local LLM)
+- **MeteringWriter** - `write_metering_record` (DB insert consumed from RMQ)
+
+## Standalone Client
+
+```ruby
+require 'legion/extensions/llm/gateway/client'
+
+client = Legion::Extensions::LLM::Gateway::Client.new
+result = client.chat(model: 'claude-opus-4-6', messages: [{ role: 'user', content: 'Hello' }])
+result[:success] # => true
+result[:response] # => "Hello! How can I help you?"
+```
+
+## Settings
+
+```json
+{
+  "llm": {
+    "routing": {
+      "use_fleet": true,
+      "fleet": {
+        "timeout_seconds": 30,
+        "require_auth": false
+      }
+    }
+  }
+}
+```
+
+## Requirements
+
+- Ruby >= 3.4
+- [LegionIO](https://github.com/LegionIO/LegionIO) framework
+- `legion-transport` (AMQP metering + inference queues)
+- `legion-crypt` (JWT signing for fleet auth, optional)
+- `legion-data` (MeteringWriter and disk spool, optional)
+- `legion-llm` (inference execution on fleet workers)
+
+## Development
+
+```bash
+bundle install
+bundle exec rspec # 199 examples, 0 failures
+bundle exec rubocop # 0 offenses
+```
+
 ## License
 
 MIT
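Note on the README's degradation ladder: one way to read it is as two independent decisions, where metering events go and which inference backends remain. The sketch below encodes that reading. `degradation_plan`, its keyword flags, and the returned symbols are all hypothetical names chosen for illustration, and the rule that fleet dispatch needs both the gateway and the transport is an assumption, not something the README states.

```ruby
# Hypothetical helper, not part of lex-llm-gateway: maps availability flags
# to the behaviour described by the degradation ladder.
def degradation_plan(transport: true, gateway: true, fleet: true, cloud: true, local: true)
  metering =
    if !gateway
      :none     # Legion::LLM is called directly, nothing is metered
    elsif transport
      :publish  # metering events go out over RabbitMQ
    else
      :spool    # events are written to disk and flushed on reconnect
    end

  backends = []
  # Assumption: fleet RPC is only reachable through the gateway and transport.
  backends << :fleet if gateway && transport && fleet
  backends << :cloud if cloud
  backends << :local if local

  { metering: metering, backends: backends.empty? ? :error : backends }
end

p degradation_plan
p degradation_plan(transport: false)
p degradation_plan(fleet: false, cloud: false, local: false)
```

Keeping the two decisions separate mirrors the ladder: losing the transport changes how metering is delivered, while losing fleet, cloud, or local only narrows where a request can run.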
data/lex-llm-gateway.gemspec
CHANGED
@@ -28,6 +28,14 @@ Gem::Specification.new do |spec|
   end
   spec.require_paths = ['lib']
 
+  spec.add_dependency 'legion-cache', '>= 1.3.11'
+  spec.add_dependency 'legion-crypt', '>= 1.4.9'
+  spec.add_dependency 'legion-data', '>= 1.4.17'
+  spec.add_dependency 'legion-json', '>= 1.2.1'
+  spec.add_dependency 'legion-logging', '>= 1.3.2'
+  spec.add_dependency 'legion-settings', '>= 1.3.14'
+  spec.add_dependency 'legion-transport', '>= 1.3.9'
+
   spec.add_development_dependency 'rake'
   spec.add_development_dependency 'rspec'
   spec.add_development_dependency 'rubocop'
data/lib/legion/extensions/llm/gateway/runners/inference.rb
CHANGED
@@ -9,28 +9,28 @@ module Legion
 module_function
 
 def chat(model: nil, provider: nil, **opts)
-  start_ms = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond)
+  start_ms = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC, :millisecond)
   response = dispatch_chat(model: model, provider: provider, **opts)
-  elapsed_ms = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond) - start_ms
+  elapsed_ms = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC, :millisecond) - start_ms
   meter_response(response, request_type: 'chat', provider: provider,
                  model_id: model, latency_ms: elapsed_ms, **opts.slice(:tier, :intent))
   response
 end
 
 def embed(text: nil, model: nil, provider: nil, **)
-  start_ms = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond)
+  start_ms = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC, :millisecond)
   response = call_llm(:embed, text: text, model: model, provider: provider, **)
-  elapsed_ms = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond) - start_ms
+  elapsed_ms = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC, :millisecond) - start_ms
   meter_response(response, request_type: 'embed', provider: provider, model_id: model,
                  latency_ms: elapsed_ms)
   response
 end
 
 def structured(messages: nil, schema: nil, model: nil, provider: nil, **)
-  start_ms = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond)
+  start_ms = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC, :millisecond)
   response = call_llm(:structured, messages: messages, schema: schema, model: model,
                       provider: provider, **)
-  elapsed_ms = Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond) - start_ms
+  elapsed_ms = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC, :millisecond) - start_ms
   meter_response(response, request_type: 'structured', provider: provider, model_id: model,
                  latency_ms: elapsed_ms)
   response
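Note on the `Process` to `::Process` change: Ruby resolves a bare constant through the enclosing lexical scopes before it reaches the top level, so a sibling module named `Process` shadows the stdlib one. The sketch below reproduces that effect in isolation. `Demo`, `Demo::Process`, and the two method names are hypothetical stand-ins; the changelog names `Legion::Process` as the real shadowing module.

```ruby
module Demo
  # Hypothetical stand-in for a framework module that happens to be
  # called Process. It has no clock constants or clock methods.
  module Process
  end

  module Runners
    module_function

    # Bare `Process` is found in the enclosing Demo scope first, so the
    # constant CLOCK_MONOTONIC is looked up on Demo::Process and is missing.
    def bare_lookup
      Process.clock_gettime(Process::CLOCK_MONOTONIC, :millisecond)
    rescue NameError => e
      e.class
    end

    # The leading :: pins the lookup to the top-level stdlib module.
    def rooted_lookup
      ::Process.clock_gettime(::Process::CLOCK_MONOTONIC, :millisecond)
    end
  end
end

p Demo::Runners.bare_lookup              # prints NameError
p Demo::Runners.rooted_lookup.is_a?(Integer) # prints true
```

The rooted form is also what makes the latency measurement in the hunk above dependable: `CLOCK_MONOTONIC` is unaffected by wall-clock adjustments, and the `:millisecond` unit returns an Integer that can be subtracted directly.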
@@ -38,10 +38,10 @@ module Legion
 
 def dispatch_chat(message: nil, model: nil, provider: nil, **opts)
   tier = opts[:tier]
-
+  Legion::Logging.debug "[Gateway::Inference] dispatch_chat tier=#{tier}" if defined?(Legion::Logging)
   if tier == 'fleet' && fleet_available?
     Fleet.dispatch(model: model, messages: [{ role: 'user', content: message }],
-                   intent: intent)
+                   intent: opts[:intent])
   else
     call_llm(:chat, message: message, model: model, provider: provider, **opts)
   end
@@ -53,14 +53,14 @@ module Legion
 end
 
 def call_llm(method_name, **)
-
+  unless defined?(Legion::LLM)
+    Legion::Logging.warn '[Gateway::Inference] Legion::LLM not defined' if defined?(Legion::Logging)
+    return { error: 'llm_not_available' }
+  end
 
   direct = :"#{method_name}_direct"
-
-
-  else
-    Legion::LLM.public_send(method_name, **)
-  end
+  target = Legion::LLM.respond_to?(direct) ? direct : method_name
+  Legion::LLM.public_send(target, **)
 end
 
 def meter_response(response, **)
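Note on the refactored `call_llm`: the new lines pick a `*_direct` variant when the backend responds to it and fall back to the plain method otherwise, then make a single `public_send` call. The sketch below isolates that selection. `FakeLLM` and `call_backend` are hypothetical names used only for this demo; the real code dispatches to `Legion::LLM`.

```ruby
# Hypothetical stand-in for Legion::LLM: it offers chat_direct but has no
# embed_direct, so both branches of the selection are exercised.
module FakeLLM
  module_function

  def chat(**opts)
    { via: :chat, opts: opts }
  end

  def chat_direct(**opts)
    { via: :chat_direct, opts: opts }
  end

  def embed(**opts)
    { via: :embed, opts: opts }
  end
end

# Same selection logic as the added lines in the hunk above, with the
# backend passed in explicitly so the sketch is self-contained.
def call_backend(backend, method_name, **opts)
  direct = :"#{method_name}_direct"
  target = backend.respond_to?(direct) ? direct : method_name
  backend.public_send(target, **opts)
end

p call_backend(FakeLLM, :chat, message: 'hi')[:via] # prints :chat_direct
p call_backend(FakeLLM, :embed, text: 'hi')[:via]   # prints :embed
```

Resolving the target symbol first and sending once keeps the method short, which fits the changelog's stated goal of satisfying the RuboCop length metrics.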
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: lex-llm-gateway
 version: !ruby/object:Gem::Version
-  version: 0.2.
+  version: 0.2.3
 platform: ruby
 authors:
 - Esity
@@ -9,6 +9,104 @@ bindir: bin
 cert_chain: []
 date: 1980-01-02 00:00:00.000000000 Z
 dependencies:
+- !ruby/object:Gem::Dependency
+  name: legion-cache
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.3.11
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.3.11
+- !ruby/object:Gem::Dependency
+  name: legion-crypt
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.4.9
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.4.9
+- !ruby/object:Gem::Dependency
+  name: legion-data
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.4.17
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.4.17
+- !ruby/object:Gem::Dependency
+  name: legion-json
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.2.1
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.2.1
+- !ruby/object:Gem::Dependency
+  name: legion-logging
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.3.2
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.3.2
+- !ruby/object:Gem::Dependency
+  name: legion-settings
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.3.14
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.3.14
+- !ruby/object:Gem::Dependency
+  name: legion-transport
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.3.9
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 1.3.9
 - !ruby/object:Gem::Dependency
   name: rake
   requirement: !ruby/object:Gem::Requirement