lex-llm-ollama 0.2.13 → 0.2.14

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 8edb9398901726a46e399fd9a7ab11a9686c3cf76bd6cd7efc2cc48c89005230
4
- data.tar.gz: 89e29c003fac6d296e26f9214cde4f93c499cb5f8ccaa065d8270a587f6bb9f5
3
+ metadata.gz: 8bbec813c20e8b5c62b97209466439569cfcde42251acfe9e87f2bb0fce79e9d
4
+ data.tar.gz: 59822c6527476ec0000af57ec2a5884672d03065ca3212e0fb655faf809cc0da
5
5
  SHA512:
6
- metadata.gz: f9a7f5bb4fd18596bec2b301fd92ed49cf4c2fdb77d592d85ef54c834bef8f5a4770c3c7b773c8a50a84737595309439826e6a4c110cf62497c8a61242251edb
7
- data.tar.gz: 927d0afc66e6ef69fbf3754defdbb03e4ff6891d71046c4bf722d88cf5c56ec32b573fa4066ce9421c2a51e1b596378f9f8cd906df8f0ea37da9546bd1b610f2
6
+ metadata.gz: 6f41591f42a566ab7f3344e6d9963393db977dfea257b61c4a4aea798943804cc1db9f9525522cb690e5ab0448bc4b1451ee2ec3a67b615caccb049e84607bc4
7
+ data.tar.gz: effc50944c4583c1732ea4b23563c2f1ac00f660c0571e6c001403f3a140027303fe3bd77d2e0e64c61db39828a5230f9a35f30e7a1ab8663a3dd4e8b56bc185
data/CHANGELOG.md CHANGED
@@ -1,5 +1,10 @@
1
1
  # Changelog
2
2
 
3
+ ## 0.2.14 - 2026-06-05
4
+
5
+ - Verified specs and RuboCop compliance (52 examples, 0 failures; 15 files, 0 offenses)
6
+ - Updated README with comprehensive extension index covering architecture, classes, configuration, and usage
7
+
3
8
  ## 0.2.13 - 2026-06-02
4
9
 
5
10
  - **Scope discovery refresh to Ollama only** — `DiscoveryRefresh#manual` now calls `Discovery.refresh_discovered_models!(provider: :ollama)` instead of `Discovery.run`, which previously triggered model discovery for all registered providers (anthropic, bedrock, etc.) and caused cross-provider coupling
data/README.md CHANGED
@@ -2,38 +2,134 @@
2
2
 
3
3
  LegionIO LLM provider extension for [Ollama](https://ollama.ai).
4
4
 
5
- This gem lives under `Legion::Extensions::Llm::Ollama` and depends on `lex-llm >= 0.4.3` for shared provider-neutral routing, response normalization, fleet envelopes, responder execution, transport, and registry primitives. It does not carry a runtime `legion-llm` dependency; `legion-llm` owns higher-level routing and can discover this provider through normal extension loading.
5
+ This gem lives under `Legion::Extensions::Llm::Ollama` and depends on `lex-llm >= 0.4.3` for shared provider-neutral routing, response normalization, fleet envelopes, responder execution, transport, and registry primitives. It does not carry a runtime `legion-llm` dependency; `legion-llm` owns higher-level routing and discovers this provider through normal extension loading.
6
6
 
7
7
  Load it with `require 'legion/extensions/llm/ollama'`.
8
8
 
9
9
  ## What It Provides
10
10
 
11
- - Ollama-native chat requests through `POST /api/chat`
12
- - Streaming chat support
13
- - Model discovery through `GET /api/tags` with automatic embedding capability inference
14
- - Running model inspection through `GET /api/ps`
15
- - Model details through `POST /api/show`
16
- - Model download helper through `POST /api/pull`
17
- - Embeddings through `POST /api/embed`
18
- - Best-effort `llm.registry` availability events via the shared `Legion::Extensions::Llm::RegistryPublisher`
19
- - Local socket discovery plus configured instance discovery through the shared `lex-llm` credential sources
20
- - Provider-owned fleet response handling through `Legion::Extensions::Llm::Fleet::ProviderResponder`
21
- - Full `Legion::Logging::Helper` integration with structured `handle_exception` in every rescue block
11
+ | Feature | Endpoint | Provider Method |
12
+ |---------|----------|----------------|
13
+ | Chat completion | `POST /api/chat` | Inherited from `lex-llm` base provider |
14
+ | Streaming chat | `POST /api/chat` | `stream_response` |
15
+ | List models | `GET /api/tags` | `list_models` |
16
+ | Running models | `GET /api/ps` | `list_running_models` |
17
+ | Model details | `POST /api/show` | `show_model`, `fetch_model_detail` |
18
+ | Pull models | `POST /api/pull` | `pull_model` |
19
+ | Embeddings | `POST /api/embed` | Inherited from `lex-llm` base provider |
20
+ | Readiness check | `GET /api/version` | `readiness(live: false)` |
21
+
22
+ All responses pass through the shared `lex-llm` normalization layer: `Message`, `Chunk`, `Embedding`, and `Model::Info`.
23
+
24
+ ## File Index
25
+
26
+ ```
27
+ lib/
28
+   legion/extensions/llm/ollama.rb            # Extension entry point, instance discovery, default settings
29
+   legion/extensions/llm/ollama/provider.rb   # Provider: chat, stream, embed, models, offerings
30
+   legion/extensions/llm/ollama/version.rb    # VERSION constant
31
+   legion/extensions/llm/ollama/actors/
32
+     discovery_refresh.rb                     # Periodic model discovery actor (Every, 30min default)
33
+     fleet_worker.rb                          # Fleet request subscription actor (Subscription)
34
+   legion/extensions/llm/ollama/runners/
35
+     fleet_worker.rb                          # Fleet request execution runner (delegates to lex-llm)
36
+ ```
22
37
 
23
38
  ## Architecture
24
39
 
25
40
  ```
26
41
  Legion::Extensions::Llm::Ollama
27
- ├── Provider # Ollama provider (chat, stream, embed, models, readiness)
28
- ├── Actor::FleetWorker # Optional provider-owned fleet subscription actor
29
- ├── Runners::FleetWorker # Delegates fleet execution to lex-llm
30
- └── (shared from lex-llm)
31
-     ├── Fleet::ProviderResponder
32
-     ├── RegistryPublisher
33
-     ├── RegistryEventBuilder
34
-     └── Transport/
42
+ ├── Provider # Ollama provider implementation
43
+ │   ├── Capabilities              # Capability predicates (chat, streaming, vision, functions, embeddings)
44
+ │   ├── #render_payload           # Build Ollama chat payload from messages, tools, schema
45
+ │   ├── #stream_response          # NDJSON streaming via Faraday on_data
46
+ │   ├── #discover_offerings       # Build ModelOffering array from live/cached models
47
+ │   ├── #fetch_model_detail       # Call /api/show, extract context_window + capabilities
48
+ │   ├── #render_embedding_payload # Build Ollama embedding payload
49
+ │   └── (inherited from lex-llm)  # Chat, embedding, connection, registry helpers
50
+ ├── Actor::DiscoveryRefresh # Every actor; refreshes model list, repopulates auto rules
51
+ ├── Actor::FleetWorker # Subscription actor; gates on respond_to_requests
52
+ └── Runners::FleetWorker # Module function; delegates to ProviderResponder.call
53
+
54
+ Shared from lex-llm:
55
+ ├── Fleet::ProviderResponder # Fleet request execution harness
56
+ ├── RegistryPublisher # Publishes readiness + model events to llm.registry
57
+ ├── RegistryEventBuilder # Builds registry event payloads
58
+ ├── AutoRegistration # Self-registers discovered instances
59
+ └── CredentialSources # Socket probing + setting lookup for instance discovery
35
60
  ```
36
61
 
62
+ ## Key Classes
63
+
64
+ ### `Legion::Extensions::Llm::Ollama` (module)
65
+
66
+ - **`default_settings`** — Returns the full settings schema, built with the `provider_settings` helper from `lex-llm`.
67
+ - **`provider_class`** — Returns `Provider`.
68
+ - **`discover_instances`** — Probes the socket at `127.0.0.1:11434` and reads configured instances from settings.
69
+ - **`normalize_instance_config(config)`** — Normalizes `endpoint`/`api_base`/`ollama_api_base` aliases to `base_url`.
70
+ - **`registry_publisher`** — Lazily instantiated `RegistryPublisher` for the `:ollama` family.
71
+
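The socket probe described for `discover_instances` can be approximated with the standard library alone. The following is a minimal sketch, not the gem's implementation: the helper name `ollama_socket_reachable?` and the one-second timeout are assumptions made for illustration.

```ruby
require 'socket'

# Hypothetical helper (not part of lex-llm-ollama): returns true when a TCP
# connection to the given host and port can be opened within the timeout.
def ollama_socket_reachable?(host: '127.0.0.1', port: 11_434, timeout: 1)
  # Socket.tcp yields the connected socket and closes it when the block returns.
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Refused, unreachable, unresolvable, or timed out: treat as "no local instance".
  false
end
```

A discovery routine could call this helper first and only add the local `http://127.0.0.1:11434` instance when it returns `true`.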
72
+ ### `Provider`
73
+
74
+ Extends `Legion::Extensions::Llm::Provider`. Implements the Ollama-specific contract:
75
+
76
+ | Method | Purpose |
77
+ |--------|---------|
78
+ | `api_base` | Resolves base URL from `resolve_base_url`, settings, or default `127.0.0.1:11434` |
79
+ | `completion_url` | `/api/chat` |
80
+ | `stream_url` | `/api/chat` |
81
+ | `models_url` | `/api/tags` |
82
+ | `running_models_url` | `/api/ps` |
83
+ | `show_model_url` | `/api/show` |
84
+ | `embedding_url` | `/api/embed` |
85
+ | `pull_url` | `/api/pull` |
86
+ | `version_url` | `/api/version` |
87
+ | `list_running_models` | GET `/api/ps`, returns array of running model hashes |
88
+ | `readiness(live:)` | Checks Ollama version endpoint; publishes readiness event when `live: true` |
89
+ | `list_models` | GET `/api/tags`, parses and publishes model events via registry |
90
+ | `show_model(model)` | POST `/api/show`, returns raw model detail hash |
91
+ | `fetch_model_detail(model)` | Wraps `show_model`; extracts `context_window` and `capabilities` |
92
+ | `pull_model(model, stream:)` | POST `/api/pull` to download a model |
93
+ | `discover_offerings(live:)` | Builds `ModelOffering` array from live or cached models |
94
+ | `render_payload(...)` | Builds the Ollama chat request payload (JSON) from Legion messages, tools, and schema |
95
+ | `stream_response(conn, payload)` | Posts with Faraday `on_data` handler for NDJSON streaming |
96
+ | `parse_completion_response(resp)` | Normalizes Ollama chat response to `Legion::Extensions::Llm::Message` |
97
+ | `build_chunk(data)` | Normalizes a stream NDJSON line to `Legion::Extensions::Llm::Chunk` |
98
+ | `render_embedding_payload(text, model:, dimensions:)` | Builds embedding request body |
99
+ | `parse_embedding_response(resp, ...)` | Normalizes embedding response to `Legion::Extensions::Llm::Embedding` |
100
+
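NDJSON streaming means the HTTP body arrives in arbitrary chunks, and a JSON object is only complete once its trailing newline has been seen. The sketch below shows one way to buffer chunks and parse complete lines. It is an illustration under assumptions: `drain_ndjson` is a hypothetical name, and the real `stream_response` presumably feeds a similar routine from Faraday's `on_data` callback and passes each parsed hash to `build_chunk`.

```ruby
require 'json'

# Hypothetical helper: appends a raw chunk to the buffer, removes every
# complete line from it, and returns those lines parsed as JSON hashes.
# Any trailing partial line stays in the buffer for the next call.
def drain_ndjson(buffer, chunk)
  buffer << chunk
  parsed = []
  while (newline = buffer.index("\n"))
    line = buffer.slice!(0..newline).strip
    parsed << JSON.parse(line) unless line.empty?
  end
  parsed
end

buffer = +'' # unfrozen String shared across callback invocations
drain_ndjson(buffer, '{"message":{"content":"Hel')            # nothing complete yet
drain_ndjson(buffer, %(lo"},"done":false}\n{"done":true}\n))  # two complete objects
```

Keeping the buffer outside the callback is the important design point: a chunk boundary can fall in the middle of a JSON object, so parsing each chunk on its own would fail intermittently.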
101
+ ### `Capabilities` (module inside Provider)
102
+
103
+ Module functions providing capability predicates used during offering construction:
104
+
105
+ | Method | Always Returns |
106
+ |--------|---------------|
107
+ | `chat?(model)` | `true` |
108
+ | `streaming?(model)` | `true` |
109
+ | `vision?(model)` | `true` |
110
+ | `functions?(model)` | `true` |
111
+ | `embeddings?(model)` | `true` |
112
+
113
+ ### `CONTEXT_WINDOWS` (constant)
114
+
115
+ Static fallback map keyed by model name prefix (e.g., `'qwen3' => 128_000`). Used when `/api/show` is unavailable to infer context window. Covers qwen, llama, gemma, mistral, deepseek, phi, command-r, codellama, and embedding families.
116
+
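A prefix-keyed fallback like this is usually resolved by picking the longest key that the model name starts with. The sketch below illustrates that lookup under assumptions: the method name is hypothetical, only the `'qwen3' => 128_000` entry comes from the README, and the `'qwen'` entry is a placeholder value added to show why longest-prefix matching matters.

```ruby
# Hypothetical lookup; the real CONTEXT_WINDOWS table and its matching rule
# may differ. Only 'qwen3' => 128_000 is taken from the README.
def fallback_context_window(model, table: { 'qwen' => 32_000, 'qwen3' => 128_000 })
  # Prefer the most specific (longest) prefix that matches the model name.
  prefix = table.keys.select { |key| model.start_with?(key) }.max_by(&:length)
  prefix && table[prefix]
end

fallback_context_window('qwen3:8b')      # matches 'qwen3' rather than 'qwen'
fallback_context_window('unknown-model') # no prefix matches, so nil
```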
117
+ ### `Actor::DiscoveryRefresh`
118
+
119
+ An `Every` actor that runs every 30 minutes (configurable via `settings[:extensions][:llm][:ollama][:discovery_interval]`). On each tick:
120
+
121
+ 1. Calls `Legion::LLM::Discovery.refresh_discovered_models!(provider: :ollama)`
122
+ 2. Repopulates auto routing rules if `Legion::LLM::Router` is available
123
+ 3. Invalidates the offerings cache if `Legion::LLM::Inventory` is available
124
+
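The three steps above are optional integrations, so each one has to be guarded against the corresponding constant being absent. A minimal sketch of that guard pattern follows. Only `Discovery.refresh_discovered_models!(provider: :ollama)` is taken from the README; the method names `repopulate_auto_rules` and `invalidate_offerings_cache` are hypothetical stand-ins, as is the returned list of performed steps.

```ruby
# Sketch of one DiscoveryRefresh tick. Returns the steps that actually ran.
def discovery_refresh_tick
  steps = []
  return steps unless defined?(Legion::LLM::Discovery)

  # Step 1: refresh models for this provider only (call taken from the README).
  Legion::LLM::Discovery.refresh_discovered_models!(provider: :ollama)
  steps << :discovery

  # Step 2: hypothetical method name; runs only when the router is loaded.
  if defined?(Legion::LLM::Router) && Legion::LLM::Router.respond_to?(:repopulate_auto_rules)
    Legion::LLM::Router.repopulate_auto_rules
    steps << :auto_rules
  end

  # Step 3: hypothetical method name; runs only when the inventory is loaded.
  if defined?(Legion::LLM::Inventory) && Legion::LLM::Inventory.respond_to?(:invalidate_offerings_cache)
    Legion::LLM::Inventory.invalidate_offerings_cache
    steps << :offerings_cache
  end

  steps
end
```

Using `defined?` on the full constant path avoids a `NameError` when `legion-llm` is not loaded, which matches the README's statement that this gem carries no runtime `legion-llm` dependency.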
125
+ ### `Actor::FleetWorker`
126
+
127
+ A `Subscription` actor that starts only when at least one instance has `fleet.respond_to_requests: true`. Routes messages to the fleet worker runner.
128
+
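The start-up gate can be pictured as a single predicate over the discovered instances. This is a sketch under assumptions: the helper name is hypothetical, and the instances are assumed to be a hash of instance name to symbol-keyed config, which the README does not specify.

```ruby
# Hypothetical predicate: true when at least one instance opts in to
# answering fleet requests via fleet.respond_to_requests.
def fleet_worker_enabled?(instances)
  instances.any? { |_name, config| config.dig(:fleet, :respond_to_requests) == true }
end

fleet_worker_enabled?({ lab: { fleet: { respond_to_requests: true } } })
fleet_worker_enabled?({ lab: { base_url: 'http://ollama-lab:11434' } })
```

The first call opts in and the second does not; comparing against `true` means a missing or malformed `fleet` section keeps the actor off.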
129
+ ### `Runners::FleetWorker`
130
+
131
+ A module with `handle_fleet_request(payload, delivery:, properties:)`. Delegates to `Legion::Extensions::Llm::Fleet::ProviderResponder.call` with the Ollama provider family, provider class, and instance discovery lambda.
132
+
37
133
  ## Defaults
38
134
 
39
135
  ```ruby
@@ -65,21 +161,24 @@ Legion::Extensions::Llm::Ollama.default_settings
65
161
 
66
162
  ## Configuration
67
163
 
68
- `discover_instances` returns a local `http://127.0.0.1:11434` instance when the Ollama socket is reachable. Additional instances can be supplied under the shared LLM extension configuration and may use `base_url`, `endpoint`, `api_base`, or `ollama_api_base`; the extension normalizes those aliases to `base_url`.
164
+ ### Instance Discovery
165
+
166
+ `discover_instances` auto-detects a local instance when the socket at `127.0.0.1:11434` is reachable. Additional instances can be defined in settings using any of the recognized endpoint aliases (`base_url`, `endpoint`, `api_base`, `ollama_api_base`); the extension normalizes all to `base_url`.
69
167
 
70
168
  ```yaml
71
169
  extensions:
72
170
    llm:
73
171
      ollama:
172
+       discovery_interval: 1800 # DiscoveryRefresh actor interval (seconds)
74
173
        instances:
75
174
          lab:
76
175
            base_url: http://ollama-lab:11434
77
176
            default_model: qwen3.5:latest
78
177
  ```
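The alias handling can be sketched as follows. This is an illustration, not the gem's `normalize_instance_config`: the method name carries a `_sketch` suffix to mark it as hypothetical, and the precedence order among aliases (first match in the list wins) is an assumption.

```ruby
# Hypothetical sketch of endpoint-alias normalization.
def normalize_instance_config_sketch(config)
  aliases = %i[base_url endpoint api_base ollama_api_base] # precedence order is assumed
  symbolized = config.transform_keys(&:to_sym)
  base_url = aliases.filter_map { |key| symbolized[key] }.first
  # Drop every alias key, then re-add the chosen value under :base_url.
  rest = symbolized.reject { |key, _value| aliases.include?(key) }
  base_url ? rest.merge(base_url: base_url) : rest
end

normalize_instance_config_sketch({ 'endpoint' => 'http://ollama-lab:11434', default_model: 'qwen3.5:latest' })
# => { default_model: 'qwen3.5:latest', base_url: 'http://ollama-lab:11434' }
```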
79
178
 
80
- ## Fleet Responder
179
+ ### Fleet Responder
81
180
 
82
- Provider instances can opt in to consuming Legion LLM fleet requests. The provider-owned fleet actor only starts when at least one discovered instance enables `respond_to_requests`, and the runner delegates execution to the shared `lex-llm` responder helper.
181
+ Provider instances can opt in to consuming Legion LLM fleet requests. The fleet actor only starts when at least one instance enables `respond_to_requests`, and the runner delegates execution to the shared `lex-llm` responder helper.
83
182
 
84
183
  ```yaml
85
184
  extensions:
@@ -96,14 +195,76 @@ extensions:
96
195
  - embed
97
196
  ```
98
197
 
198
+ ## Ollama API Surface
199
+
200
+ | Legion Method | Ollama Route | HTTP Verb |
201
+ |---------------|-------------|-----------|
202
+ | Chat | `/api/chat` | POST |
203
+ | Stream chat | `/api/chat` | POST |
204
+ | List models | `/api/tags` | GET |
205
+ | Running models | `/api/ps` | GET |
206
+ | Model details | `/api/show` | POST |
207
+ | Pull model | `/api/pull` | POST |
208
+ | Embeddings | `/api/embed` | POST |
209
+ | Readiness | `/api/version` | GET |
210
+
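For orientation, the readiness route can be exercised without the gem at all. The sketch below uses only Ruby's standard library; `ollama_version` is a hypothetical helper, the two-second timeout is an assumption, and the code relies on Ollama's `/api/version` returning a JSON object with a `version` key.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: returns the Ollama version string, or nil when the
# server is unreachable or does not answer with a successful JSON response.
def ollama_version(base_url = 'http://127.0.0.1:11434', timeout: 2)
  uri = URI.join(base_url, '/api/version')
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https',
                                                 open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(uri.request_uri)
  end
  return nil unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body)['version']
rescue SystemCallError, SocketError, IOError, Timeout::Error, JSON::ParserError
  nil
end
```

A `nil` result corresponds to "not ready"; the gem's own `readiness` method additionally publishes a registry event when called with `live: true`, which this sketch does not attempt.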
211
+ ## Error Handling
212
+
213
+ Every rescue block uses `handle_exception` from `Legion::Logging::Helper` with explicit `level:`, `handled:`, and `operation:` parameters. Connection failures during `discover_offerings` are logged at warn level and return an empty array instead of raising.
214
+
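The "log and return an empty array" behavior can be sketched like this. Everything here is a stand-in: the local `handle_exception` only mimics the keyword parameters named above (the positional error argument is an assumption), and `discover_offerings_safely` is a hypothetical wrapper rather than the provider's method.

```ruby
# Stand-in for Legion::Logging::Helper#handle_exception; the real signature
# is assumed, not confirmed. Writes a single line to stderr.
def handle_exception(error, level:, handled:, operation:)
  warn("[#{level}] #{operation} failed (handled=#{handled}): #{error.class}: #{error.message}")
end

# Hypothetical wrapper: runs the block that fetches offerings and converts
# connection-level failures into an empty result instead of raising.
def discover_offerings_safely
  yield
rescue SystemCallError, IOError => e
  handle_exception(e, level: :warn, handled: true, operation: 'ollama.discover_offerings')
  []
end

discover_offerings_safely { raise Errno::ECONNREFUSED } # logs a warning, returns []
```

Returning `[]` keeps callers such as routing or inventory code free of provider-specific rescue logic when an Ollama instance is simply offline.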
215
+ ## Usage
216
+
217
+ ```ruby
218
+ require 'legion/extensions/llm/ollama'
219
+
220
+ # Access the module
221
+ Legion::Extensions::Llm::Ollama.discover_instances
222
+ Legion::Extensions::Llm::Ollama.default_settings
223
+
224
+ # Create a provider instance (usually done by lex-llm routing)
225
+ provider = Legion::Extensions::Llm::Ollama::Provider.new(config:)
226
+
227
+ # Discover offerings
228
+ provider.discover_offerings(live: true)
229
+
230
+ # Chat
231
+ result = provider.chat(messages: [...], model: 'llama3', temperature: 0.7)
232
+
233
+ # Stream chat
234
+ provider.stream_chat(messages: [...], model: 'llama3') do |chunk|
235
+   print chunk.content
236
+ end
237
+
238
+ # Embeddings
239
+ embeddings = provider.embed(text: 'Hello world', model: 'nomic-embed-text')
240
+ ```
241
+
242
+ ## Dependencies
243
+
244
+ | Gem | Minimum Version | Purpose |
245
+ |-----|----------------|---------|
246
+ | `lex-llm` | `>= 0.4.3` | Base provider contract, routing, fleet responder, registry, credential sources |
247
+ | `legion-transport` | `>= 1.4.14` | Faraday connection management |
248
+ | `legion-json` | — | JSON serialization (`Legion::JSON`) |
249
+ | `legion-logging` | — | Structured logging (`Legion::Logging::Helper`) |
250
+ | `legion-settings` | — | Configuration access |
251
+ | `legion-extensions` | — | Extension framework (`Core`, `Actors::Every`, `Actors::Subscription`) |
252
+
99
253
  ## Development
100
254
 
101
255
  ```bash
256
+ cd lex-llm-ollama # path to your local checkout
102
257
  bundle install
103
- bundle exec rspec --format json --out tmp/rspec_results.json --format progress --out tmp/rspec_progress.txt
258
+
259
+ # Run specs
260
+ bundle exec rspec
261
+
262
+ # Lint (auto-correct)
104
263
  bundle exec rubocop -A
105
264
  ```
106
265
 
266
+ Spec count: 52 examples across 7 spec files.
267
+
107
268
  ## License
108
269
 
109
270
  MIT
data/lib/legion/extensions/llm/ollama/version.rb CHANGED
@@ -4,7 +4,7 @@ module Legion
4
4
    module Extensions
5
5
      module Llm
6
6
        module Ollama
7
-         VERSION = '0.2.13'
7
+         VERSION = '0.2.14'
8
8
        end
9
9
      end
10
10
    end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: lex-llm-ollama
3
3
  version: !ruby/object:Gem::Version
4
-   version: 0.2.13
4
+   version: 0.2.14
5
5
  platform: ruby
6
6
  authors:
7
7
  - LegionIO