lex-ollama 0.3.2 → 0.3.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 8657f3e11e11fcd2ee34e12317bf7698bcfea1e907006c76f9a07326996c7a69
-  data.tar.gz: d1c1bbb05dc6a3a0071b4474a45a074b0d4929cd4b48b7488b5aa10539a9a6ee
+  metadata.gz: '086cd0d9744893c480b69c7b9fc6a10229af627344d316cafbccbd72d01cdafe'
+  data.tar.gz: e88f7893305401d36d8a3d48a7fb057e831fb6edeb245a5691c3d1e2527e1f1f
 SHA512:
-  metadata.gz: e2a8622a2914cdfbc04b365d1ca7a9e8d35b4daa656931fd23a6e010b25b5a8ed6699246bffdbe0bfe064757ad26cad8f3664fe8c1dd2d1c606220dc932af45f
-  data.tar.gz: ea975e9ac1c89621d41c274b6040bb92760672f9d4a2db223ea6c08371f7ed8e8a2dc810417b9b5b2beb9ce30fccb64544f24c11ae815422e5b99f5a43a48517
+  metadata.gz: f6154fee6005bb96262961342983f01c5d4e5f2747c2cd11a33afbebc1f5ffa00ea64e4f39ef63758620524a5c6690b1a81856c1658371acc040fdd2295c8731
+  data.tar.gz: 0dda701fb81e1f084bc3cc4404d8014d6e2b6ecc68153fcdb42009c10a337f4a078ff11890a4032d95d516a5c1bc5177f4a8e1492db4caea47e9da1641fe6669
data/CHANGELOG.md CHANGED
@@ -1,5 +1,14 @@
 # Changelog
 
+## [0.3.3] - 2026-04-16
+
+### Added
+- `Actor::ModelSync` — once actor; runs 5s after extension load; reads `legion.ollama.default_models` and `legion.ollama.s3` from settings; calls `import_from_s3` for any configured model not already present on disk; no-op if either setting is absent
+
+### Fixed
+- `Transport::Queues::ModelRequest` deleted — the framework auto-discovers every file in `transport/queues/` and calls `.new` with no arguments at startup, which crashed because `ModelRequest` required `request_type:` and `model:`; the queue definition is now an anonymous class created inline by `Actor::ModelWorker#build_queue_class`
+- `Actor::ModelWorker#queue` now returns a CLASS instead of an instance — `Subscription#initialize` calls `queue.new`, so returning an instance caused a silent `NoMethodError` on `NilClass#new`; the anonymous queue class has `queue_name`, `queue_options`, `dlx_enabled`, and `initialize` (exchange bind) defined inline via `define_method`
+
 ## [0.3.2] - 2026-04-08
 
 ### Changed
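
The two failure modes named under "Fixed" can be reproduced with plain Ruby stand-ins. This is only a sketch, not Legion code: `FakeQueue`, `ParametricQueue`, and the `*_outcome` helpers are hypothetical names, and the only assumptions taken from the changelog are that the framework calls `.new` with no arguments on discovered queue classes and that the subscription calls `.new` on whatever `queue` returns.

```ruby
# Hypothetical stand-in for a framework queue base class (not Legion::Transport::Queue).
class FakeQueue
  def initialize; end
end

# Same shape as the deleted class: required keyword arguments in initialize.
class ParametricQueue < FakeQueue
  def initialize(request_type:, model:)
    super()
    @request_type = request_type
    @model = model
  end
end

# Runs the block and returns :ok, or the class name of the error it raised.
def outcome
  yield
  :ok
rescue ArgumentError, NoMethodError => e
  e.class.name
end

# Auto-discovery style call: no arguments, so the required keywords are missing.
def discovery_outcome
  outcome { ParametricQueue.new }
end

# Subscription style call on an INSTANCE: instances do not respond to `new`.
def instance_outcome
  instance = ParametricQueue.new(request_type: 'chat', model: 'qwen3.5:27b')
  outcome { instance.new }
end

# Subscription style call on a CLASS whose initialize takes no arguments.
def class_outcome
  outcome { FakeQueue.new }
end
```

Under these assumptions, only a class with a zero-argument `initialize` satisfies both callers, which is consistent with the move to an anonymous class that captures its parameters in a closure.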
data/CLAUDE.md CHANGED
@@ -13,7 +13,7 @@ reporting, and **fleet queue subscription** for receiving routed LLM requests fr
 **GitHub**: https://github.com/LegionIO/lex-ollama
 **License**: MIT
 **Version**: 0.3.2
-**Specs**: 82 examples (12 spec files) — fleet additions add ~35 more
+**Specs**: 166 examples (17 spec files)
 
 ---
 
@@ -181,8 +181,9 @@ message_context:
 A subset (`conversation_id`, `message_id`, `request_id`) is promoted to AMQP headers
 (`x-legion-llm-conversation-id`, etc.) for filtering without body parsing.
 
-See: `docs/plans/2026-04-08-fleet-wire-protocol.md` for full AMQP property mapping,
-platform-wide standard, and per-message-type specifications.
+The wire protocol spec (AMQP property mapping, platform-wide standard, per-message-type
+specifications) was developed during the fleet design phase and is maintained in the
+legion-llm repository alongside the implementation.
 
 ---
 
@@ -210,4 +211,4 @@ bundle exec rubocop
 ---
 
 **Maintained By**: Matthew Iverson (@Esity)
-**Last Updated**: 2026-04-08
+**Last Updated**: 2026-04-10
data/README.md CHANGED
@@ -44,6 +44,36 @@ gem install lex-ollama
 ### Version
 - `server_version` - Retrieve the Ollama server version (GET /api/version)
 
+### Fleet Queue Subscription
+- `handle_request` - Dispatch inbound fleet AMQP messages to the appropriate runner (chat/embed/generate)
+
+When `Legion::Extensions::Core` is present, lex-ollama subscribes to model-scoped queues on the
+`llm.request` topic exchange, accepting routed LLM inference work from other Legion fleet members.
+
+Each configured `(type, model)` pair gets its own auto-delete queue with routing key
+`llm.request.ollama.<type>.<model>`. Multiple nodes serving the same model compete fairly
+via RabbitMQ round-robin with consumer priority.
+
+```yaml
+legion:
+  ollama:
+    host: "http://localhost:11434"
+    fleet:
+      consumer_priority: 10 # H100: 10, Mac Studio: 5, MacBook: 1
+      subscriptions:
+        - type: embed
+          model: nomic-embed-text
+        - type: chat
+          model: "qwen3.5:27b"
+```
+
+Fleet messages use the wire protocol defined in `legion-llm`: typed AMQP messages
+(`llm.fleet.request` / `llm.fleet.response` / `llm.fleet.error`) with `message_context`
+propagation for end-to-end tracing.
+
+Without `Legion::Extensions::Core`, the gem works as a pure HTTP client library with no
+AMQP dependency.
+
 ## Standalone Client
 
 ```ruby
@@ -85,21 +115,21 @@ Pull models from an internal S3 mirror instead of the public Ollama registry:
 client = Legion::Extensions::Ollama::Client.new
 
 # List available models in S3
-client.list_s3_models(bucket: 'legion', endpoint: 'https://mesh.s3api-core.optum.com')
+client.list_s3_models(bucket: 'legion', endpoint: 'https://s3.example.internal')
 
 # Import directly to filesystem (works without Ollama running)
 client.import_from_s3(model: 'llama3:latest', bucket: 'legion',
-                      endpoint: 'https://mesh.s3api-core.optum.com')
+                      endpoint: 'https://s3.example.internal')
 
 # Push through Ollama API (requires Ollama running)
 client.sync_from_s3(model: 'llama3:latest', bucket: 'legion',
-                    endpoint: 'https://mesh.s3api-core.optum.com')
+                    endpoint: 'https://s3.example.internal')
 
 # Provision fleet with default models
 client.import_default_models(
   default_models: %w[llama3:latest nomic-embed-text:latest],
   bucket: 'legion',
-  endpoint: 'https://mesh.s3api-core.optum.com'
+  endpoint: 'https://s3.example.internal'
 )
 ```
 
@@ -121,7 +151,7 @@ result[:usage] # => { input_tokens: 1, output_tokens: 5, total_duration: ..., .
 
 ## Version
 
-0.3.1
+0.3.2
 
 ## License
 
@@ -0,0 +1,90 @@
+# frozen_string_literal: true
+
+module Legion
+  module Extensions
+    module Ollama
+      module Actor
+        # Once actor — runs once shortly after extension load.
+        # Reads legion.ollama.s3 and legion.ollama.default_models from settings
+        # and calls import_from_s3 for any model not already present locally.
+        #
+        # Settings example:
+        #   {
+        #     "legion": {
+        #       "ollama": {
+        #         "s3": {
+        #           "bucket": "legion",
+        #           "prefix": "ollama/models",
+        #           "endpoint": "https://s3.example.internal"
+        #         },
+        #         "default_models": ["qwen3.5:4b", "nomic-embed-text:latest"]
+        #       }
+        #     }
+        #   }
+        class ModelSync < Legion::Extensions::Actors::Once
+          include Legion::Logging::Helper
+
+          # Run 5 seconds after extension load to allow the rest of startup to complete.
+          def delay
+            5.0
+          end
+
+          def use_runner?
+            false
+          end
+
+          def runner_class
+            self.class
+          end
+
+          def enabled?
+            return false unless defined?(Legion::Settings)
+
+            models = Legion::Settings.dig(:ollama, :default_models)
+            s3_cfg = Legion::Settings.dig(:ollama, :s3)
+            models.is_a?(Array) && !models.empty? && s3_cfg.is_a?(Hash) && s3_cfg[:bucket]
+          rescue StandardError => e
+            handle_exception(e, level: :warn, handled: true)
+            false
+          end
+
+          def manual
+            models = Legion::Settings.dig(:ollama, :default_models) || []
+            s3_cfg = Legion::Settings.dig(:ollama, :s3)
+            bucket = s3_cfg[:bucket]
+            s3_opts = s3_cfg.except(:bucket)
+
+            client = Object.new.extend(Legion::Extensions::Ollama::Runners::S3Models)
+            models_path = ENV.fetch('OLLAMA_MODELS', File.join(Dir.home, '.ollama', 'models'))
+
+            models.each do |model|
+              if model_present_locally?(model, models_path)
+                log.debug "[ModelSync] #{model} already present locally, skipping"
+                next
+              end
+
+              log.info "[ModelSync] importing #{model} from S3"
+              result = client.import_from_s3(model: model, bucket: bucket, models_path: models_path, **s3_opts)
+              if result[:status] == 200
+                log.info "[ModelSync] imported #{model} (blobs_downloaded=#{result[:blobs_downloaded]}, blobs_skipped=#{result[:blobs_skipped]})"
+              else
+                log.warn "[ModelSync] failed to import #{model}: #{result.inspect}"
+              end
+            rescue StandardError => e
+              handle_exception(e, level: :error, handled: true, model: model)
+            end
+          end
+
+          private
+
+          def model_present_locally?(model, models_path)
+            name, tag = model.split(':')
+            tag ||= 'latest'
+            manifest = File.join(models_path, 'manifests', 'registry.ollama.ai', 'library', name, tag)
+            File.exist?(manifest)
+          end
+        end
+      end
+    end
+  end
+end
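
To make the presence check in the added file concrete, the sketch below replays the same path logic as `model_present_locally?` against a temporary directory. `manifest_path` and `demo_presence` are hypothetical helper names introduced for illustration; the directory layout is copied from the method above and nothing here touches a real Ollama install.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: builds the same manifest path as model_present_locally?.
def manifest_path(model, models_path)
  name, tag = model.split(':')
  tag ||= 'latest' # an untagged model name falls back to the "latest" tag
  File.join(models_path, 'manifests', 'registry.ollama.ai', 'library', name, tag)
end

# A model counts as present when its manifest file exists.
def model_present_locally?(model, models_path)
  File.exist?(manifest_path(model, models_path))
end

# Writes one fake manifest into a temp dir and reports which lookups succeed.
def demo_presence
  Dir.mktmpdir do |dir|
    path = manifest_path('nomic-embed-text', dir)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, '{}')
    {
      untagged: model_present_locally?('nomic-embed-text', dir),
      latest: model_present_locally?('nomic-embed-text:latest', dir),
      other: model_present_locally?('qwen3.5:4b', dir)
    }
  end
end
```

Because the path always includes the `library` segment, a name containing a slash would also be resolved beneath `library/`; whether that matches the on-disk layout for such models is not something this diff confirms.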
@@ -75,10 +75,12 @@ module Legion
             base.merge(arguments: { 'x-priority' => consumer_priority })
           end
 
-          # Override queue to return a model-scoped queue bound with the precise
-          # routing key for this worker's (type, model) pair.
+          # Returns a queue CLASS (not instance) bound to the llm.request exchange
+          # with the routing key for this worker's (type, model) pair.
+          # The Subscription base class calls queue.new in initialize, so this must
+          # return a class, not an instance.
           def queue
-            @queue ||= build_and_bind_queue
+            @queue ||= build_queue_class
           end
 
           # Enrich every inbound message with the worker's own request_type and model
@@ -94,17 +96,22 @@ module Legion
 
           private
 
-          def build_and_bind_queue
+          def build_queue_class
             sanitised_model = @model_name.tr(':', '.')
             routing_key = "llm.request.ollama.#{@request_type}.#{sanitised_model}"
+            exchange_class = Transport::Exchanges::LlmRequest
 
-            queue_obj = Transport::Queues::ModelRequest.new(
-              request_type: @request_type,
-              model: @model_name
-            )
-            exchange_obj = Transport::Exchanges::LlmRequest.new
-            queue_obj.bind(exchange_obj, routing_key: routing_key)
-            queue_obj
+            Class.new(Legion::Transport::Queue) do
+              define_method(:queue_name) { routing_key }
+              define_method(:queue_options) do
+                { durable: false, auto_delete: true, arguments: { 'x-max-priority' => 10 } }
+              end
+              define_method(:dlx_enabled) { false }
+              define_method(:initialize) do
+                super()
+                bind(exchange_class.new, routing_key: routing_key)
+              end
+            end
           end
         end
       end
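
The anonymous-class pattern in this hunk can be exercised without the framework. The following is a sketch under stated assumptions: `FakeQueue`, `FakeExchange`, `FakeWorker`, and `subscribe` are hypothetical stand-ins, not Legion classes, and `bind` here only records its arguments instead of talking to a broker.

```ruby
# Hypothetical base class standing in for the framework queue.
class FakeQueue
  attr_reader :bound_to

  def initialize
    @bound_to = []
  end

  # Records the binding instead of declaring it on a broker.
  def bind(exchange, routing_key:)
    @bound_to << [exchange.class.name, routing_key]
  end
end

# Hypothetical exchange stand-in.
class FakeExchange; end

class FakeWorker
  def initialize(request_type, model_name)
    @request_type = request_type
    @model_name = model_name
  end

  # Memoised so every caller receives the same class object.
  def queue
    @queue ||= build_queue_class
  end

  private

  def build_queue_class
    # Colons in the model name become dots, as in the hunk above.
    routing_key = "llm.request.ollama.#{@request_type}.#{@model_name.tr(':', '.')}"
    exchange_class = FakeExchange

    # The block closes over routing_key and exchange_class,
    # so the resulting class needs no constructor arguments.
    Class.new(FakeQueue) do
      define_method(:queue_name) { routing_key }
      define_method(:initialize) do
        super() # explicit parentheses are required inside define_method
        bind(exchange_class.new, routing_key: routing_key)
      end
    end
  end
end

# Stand-in for the subscription contract: it instantiates whatever queue returns.
def subscribe(worker)
  worker.queue.new
end
```

For example, `subscribe(FakeWorker.new('embed', 'nomic-embed-text:latest')).queue_name` yields a name in which the tag separator has become a dot. Using `define_method` rather than `def` is what lets the methods see the local `routing_key`, since `def` opens a new scope.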
@@ -12,13 +12,8 @@ module Legion
       module Transport
         extend Legion::Extensions::Transport if Legion::Extensions.const_defined?(:Transport, false)
 
-        # All queue-to-exchange bindings are established dynamically by
-        # Actor::ModelWorker#build_and_bind_queue at subscription time.
-        # This file only needs to declare the exchange so topology/infra mode
-        # can introspect the full routing graph.
-        def self.additional_e_to_q
-          []
-        end
+        # All queue-to-exchange bindings for fleet queues are established dynamically by
+        # Actor::ModelWorker at subscription time via build_queue_class.
       end
     end
   end
@@ -3,7 +3,7 @@
 module Legion
   module Extensions
     module Ollama
-      VERSION = '0.3.2'
+      VERSION = '0.3.3'
     end
   end
 end
@@ -18,10 +18,10 @@ require 'legion/extensions/ollama/client'
 # so the gem still works as a standalone HTTP client without any AMQP runtime.
 if Legion::Extensions.const_defined?(:Core, false)
   require 'legion/extensions/ollama/transport/exchanges/llm_request'
-  require 'legion/extensions/ollama/transport/queues/model_request'
   require 'legion/extensions/ollama/transport/messages/llm_response'
   require 'legion/extensions/ollama/transport'
   require 'legion/extensions/ollama/actors/model_worker'
+  require 'legion/extensions/ollama/actors/model_sync'
 end
 
 module Legion
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: lex-ollama
 version: !ruby/object:Gem::Version
-  version: 0.3.2
+  version: 0.3.3
 platform: ruby
 authors:
 - Esity
@@ -56,6 +56,7 @@ files:
 - README.md
 - lex-ollama.gemspec
 - lib/legion/extensions/ollama.rb
+- lib/legion/extensions/ollama/actors/model_sync.rb
 - lib/legion/extensions/ollama/actors/model_worker.rb
 - lib/legion/extensions/ollama/client.rb
 - lib/legion/extensions/ollama/helpers/client.rb
@@ -72,7 +73,6 @@ files:
 - lib/legion/extensions/ollama/transport.rb
 - lib/legion/extensions/ollama/transport/exchanges/llm_request.rb
 - lib/legion/extensions/ollama/transport/messages/llm_response.rb
-- lib/legion/extensions/ollama/transport/queues/model_request.rb
 - lib/legion/extensions/ollama/version.rb
 homepage: https://github.com/LegionIO/lex-ollama
 licenses:
@@ -1,58 +0,0 @@
-# frozen_string_literal: true
-
-module Legion
-  module Extensions
-    module Ollama
-      module Transport
-        module Queues
-          # Parametric queue — one instance per (request_type, model) tuple.
-          #
-          # queue_name mirrors the routing key exactly so bindings are self-documenting
-          # in the RabbitMQ management UI, e.g.:
-          #   llm.request.ollama.embed.nomic-embed-text
-          #   llm.request.ollama.chat.qwen3.5.27b
-          #
-          # Queue strategy:
-          #   - classic (not quorum): quorum queues cannot be auto-delete
-          #   - auto_delete: true — queue deletes when last consumer disconnects + queue empties,
-          #     enabling basic.return feedback to publishers via mandatory: true
-          #   - x-max-priority: 10 — must be a queue argument at declaration time for classic
-          #     queues; policies handle max-length and overflow externally
-          class ModelRequest < Legion::Transport::Queue
-            def initialize(request_type:, model:, **)
-              @request_type = request_type.to_s
-              @model = sanitise_model(model)
-              super(**)
-            end
-
-            def queue_name
-              "llm.request.ollama.#{@request_type}.#{@model}"
-            end
-
-            def queue_options
-              {
-                durable: false,
-                auto_delete: true,
-                arguments: { 'x-max-priority' => 10 }
-              }
-            end
-
-            # Disable dead-letter exchange provisioning. The base class
-            # default_options always adds x-dead-letter-exchange when
-            # dlx_enabled returns true. Fleet queues are ephemeral
-            # (auto-delete) and must not provision persistent DLX queues.
-            def dlx_enabled
-              false
-            end
-
-            private
-
-            def sanitise_model(name)
-              name.to_s.tr(':', '.')
-            end
-          end
-        end
-      end
-    end
-  end
-end