lex-llm-ollama 0.2.13 → 0.2.14
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -0
- data/README.md +185 -24
- data/lib/legion/extensions/llm/ollama/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 8bbec813c20e8b5c62b97209466439569cfcde42251acfe9e87f2bb0fce79e9d
|
|
4
|
+
data.tar.gz: 59822c6527476ec0000af57ec2a5884672d03065ca3212e0fb655faf809cc0da
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 6f41591f42a566ab7f3344e6d9963393db977dfea257b61c4a4aea798943804cc1db9f9525522cb690e5ab0448bc4b1451ee2ec3a67b615caccb049e84607bc4
|
|
7
|
+
data.tar.gz: effc50944c4583c1732ea4b23563c2f1ac00f660c0571e6c001403f3a140027303fe3bd77d2e0e64c61db39828a5230f9a35f30e7a1ab8663a3dd4e8b56bc185
|
data/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,10 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## 0.2.14 - 2026-06-05
|
|
4
|
+
|
|
5
|
+
- Verified specs and RuboCop compliance (52 examples, 0 failures; 15 files, 0 offenses)
|
|
6
|
+
- Updated README with comprehensive extension index covering architecture, classes, configuration, and usage
|
|
7
|
+
|
|
3
8
|
## 0.2.13 - 2026-06-02
|
|
4
9
|
|
|
5
10
|
- **Scope discovery refresh to Ollama only** — `DiscoveryRefresh#manual` now calls `Discovery.refresh_discovered_models!(provider: :ollama)` instead of `Discovery.run`, which previously triggered model discovery for all registered providers (anthropic, bedrock, etc.) and caused cross-provider coupling
|
data/README.md
CHANGED
|
@@ -2,38 +2,134 @@
|
|
|
2
2
|
|
|
3
3
|
LegionIO LLM provider extension for [Ollama](https://ollama.ai).
|
|
4
4
|
|
|
5
|
-
This gem lives under `Legion::Extensions::Llm::Ollama` and depends on `lex-llm >= 0.4.3` for shared provider-neutral routing, response normalization, fleet envelopes, responder execution, transport, and registry primitives. It does not carry a runtime `legion-llm` dependency; `legion-llm` owns higher-level routing and
|
|
5
|
+
This gem lives under `Legion::Extensions::Llm::Ollama` and depends on `lex-llm >= 0.4.3` for shared provider-neutral routing, response normalization, fleet envelopes, responder execution, transport, and registry primitives. It does not carry a runtime `legion-llm` dependency; `legion-llm` owns higher-level routing and discovers this provider through normal extension loading.
|
|
6
6
|
|
|
7
7
|
Load it with `require 'legion/extensions/llm/ollama'`.
|
|
8
8
|
|
|
9
9
|
## What It Provides
|
|
10
10
|
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
11
|
+
| Feature | Endpoint | Provider Method |
|
|
12
|
+
|---------|----------|----------------|
|
|
13
|
+
| Chat completion | `POST /api/chat` | Inherited from `Lex-llm` base provider |
|
|
14
|
+
| Streaming chat | `POST /api/chat` | `stream_response` |
|
|
15
|
+
| List models | `GET /api/tags` | `list_models` |
|
|
16
|
+
| Running models | `GET /api/ps` | `list_running_models` |
|
|
17
|
+
| Model details | `POST /api/show` | `show_model`, `fetch_model_detail` |
|
|
18
|
+
| Pull models | `POST /api/pull` | `pull_model` |
|
|
19
|
+
| Embeddings | `POST /api/embed` | Inherited from `Lex-llm` base provider |
|
|
20
|
+
| Readiness check | `GET /api/version` | `readiness(live: false)` |
|
|
21
|
+
|
|
22
|
+
All responses pass through the shared `Lex-llm` normalization layer: `Message`, `Chunk`, `Embedding`, and `Model::Info`.
|
|
23
|
+
|
|
24
|
+
## File Index
|
|
25
|
+
|
|
26
|
+
```
|
|
27
|
+
lib/
|
|
28
|
+
legion/extensions/llm/ollama.rb # Extension entry point, instance discovery, default settings
|
|
29
|
+
legion/extensions/llm/ollama/provider.rb # Provider — chat, stream, embed, models, offerings
|
|
30
|
+
legion/extensions/llm/ollama/version.rb # VERSION constant
|
|
31
|
+
legion/extensions/llm/ollama/actors/
|
|
32
|
+
discovery_refresh.rb # Periodic model discovery actor (Every, 30min default)
|
|
33
|
+
fleet_worker.rb # Fleet request subscription actor (Subscription)
|
|
34
|
+
legion/extensions/llm/ollama/runners/
|
|
35
|
+
fleet_worker.rb # Fleet request execution runner (delegates to lex-llm)
|
|
36
|
+
```
|
|
22
37
|
|
|
23
38
|
## Architecture
|
|
24
39
|
|
|
25
40
|
```
|
|
26
41
|
Legion::Extensions::Llm::Ollama
|
|
27
|
-
├── Provider
|
|
28
|
-
├──
|
|
29
|
-
├──
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
42
|
+
├── Provider # Ollama provider implementation
|
|
43
|
+
│ ├── Capabilities # Capability predicates (chat, streaming, vision, functions, embeddings)
|
|
44
|
+
│ ├── #render_payload # Build Ollama chat payload from messages, tools, schema
|
|
45
|
+
│ ├── #stream_response # NDJSON streaming via Faraday on_data
|
|
46
|
+
│ ├── #discover_offerings # Build ModelOffering array from live/cached models
|
|
47
|
+
│ ├── #fetch_model_detail # Call /api/show, extract context_window + capabilities
|
|
48
|
+
│ ├── #render_embedding_payload # Build Ollama embedding payload
|
|
49
|
+
│ └── (inherited from lex-llm) # Chat, embedding, connection, registry helpers
|
|
50
|
+
├── Actor::DiscoveryRefresh # Every actor; refreshes model list, repopulates auto rules
|
|
51
|
+
├── Actor::FleetWorker # Subscription actor; gates on respond_to_requests
|
|
52
|
+
└── Runners::FleetWorker # Module function; delegates to ProviderResponder.call
|
|
53
|
+
|
|
54
|
+
Shared from lex-llm:
|
|
55
|
+
├── Fleet::ProviderResponder # Fleet request execution harness
|
|
56
|
+
├── RegistryPublisher # Publishes readiness + model events to llm.registry
|
|
57
|
+
├── RegistryEventBuilder # Builds registry event payloads
|
|
58
|
+
├── AutoRegistration # Self-registers discovered instances
|
|
59
|
+
└── CredentialSources # Socket probing + setting lookup for instance discovery
|
|
35
60
|
```
|
|
36
61
|
|
|
62
|
+
## Key Classes
|
|
63
|
+
|
|
64
|
+
### `Legion::Extensions::Llm::Ollama` (module)
|
|
65
|
+
|
|
66
|
+
- **`default_settings`** — Returns the full settings schema via `Lex-llm.provider_settings`.
|
|
67
|
+
- **`provider_class`** — Returns `Provider`.
|
|
68
|
+
- **`discover_instances`** — Probes `127.0.0.1:11434` socket + reads configured instances from settings.
|
|
69
|
+
- **`normalize_instance_config(config)`** — Normalizes `endpoint`/`api_base`/`ollama_api_base` aliases to `base_url`.
|
|
70
|
+
- **`registry_publisher`** — Lazily instantiated `RegistryPublisher` for the `:ollama` family.
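
The alias handling described for `normalize_instance_config` can be pictured roughly as below. This is a minimal sketch, not the gem's code: the method name `normalize_base_url`, the constant `ENDPOINT_ALIASES`, the precedence order among aliases, and the use of symbol keys are all assumptions made for illustration.

```ruby
# Hypothetical sketch; names and alias precedence are assumptions.
ENDPOINT_ALIASES = %i[base_url endpoint api_base ollama_api_base].freeze

def normalize_base_url(config)
  # Accept string or symbol keys by converting everything to symbols.
  config = config.transform_keys(&:to_sym)
  # Take the first alias that carries a value, in the order listed above.
  url = ENDPOINT_ALIASES.filter_map { |key| config[key] }.first
  # Drop every alias key, then write the value back under :base_url only.
  normalized = config.reject { |key, _| ENDPOINT_ALIASES.include?(key) }
  normalized[:base_url] = url if url
  normalized
end
```

Collapsing the aliases at the edge means the rest of the code only ever needs to read one key.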
|
|
71
|
+
|
|
72
|
+
### `Provider`
|
|
73
|
+
|
|
74
|
+
Extends `Legion::Extensions::Llm::Provider`. Implements the Ollama-specific contract:
|
|
75
|
+
|
|
76
|
+
| Method | Purpose |
|
|
77
|
+
|--------|---------|
|
|
78
|
+
| `api_base` | Resolves base URL from `resolve_base_url`, settings, or default `127.0.0.1:11434` |
|
|
79
|
+
| `completion_url` | `/api/chat` |
|
|
80
|
+
| `stream_url` | `/api/chat` |
|
|
81
|
+
| `models_url` | `/api/tags` |
|
|
82
|
+
| `running_models_url` | `/api/ps` |
|
|
83
|
+
| `show_model_url` | `/api/show` |
|
|
84
|
+
| `embedding_url` | `/api/embed` |
|
|
85
|
+
| `pull_url` | `/api/pull` |
|
|
86
|
+
| `version_url` | `/api/version` |
|
|
87
|
+
| `list_running_models` | GET `/api/ps`, returns array of running model hashes |
|
|
88
|
+
| `readiness(live:)` | Checks Ollama version endpoint; publishes readiness event when `live: true` |
|
|
89
|
+
| `list_models` | GET `/api/tags`, parses and publishes model events via registry |
|
|
90
|
+
| `show_model(model)` | POST `/api/show`, returns raw model detail hash |
|
|
91
|
+
| `fetch_model_detail(model)` | Wraps `show_model`; extracts `context_window` and `capabilities` |
|
|
92
|
+
| `pull_model(model, stream:)` | POST `/api/pull` to download a model |
|
|
93
|
+
| `discover_offerings(live:)` | Builds `ModelOffering` array from live or cached models |
|
|
94
|
+
| `render_payload(...)` | Converts Legion messages/tools to Ollama NDJSON format |
|
|
95
|
+
| `stream_response(conn, payload)` | Posts with Faraday `on_data` handler for NDJSON streaming |
|
|
96
|
+
| `parse_completion_response(resp)` | Normalizes Ollama chat response to `Legion::Extensions::Llm::Message` |
|
|
97
|
+
| `build_chunk(data)` | Normalizes a stream NDJSON line to `Legion::Extensions::Llm::Chunk` |
|
|
98
|
+
| `render_embedding_payload(text, model:, dimensions:)` | Builds embedding request body |
|
|
99
|
+
| `parse_embedding_response(resp, ...)` | Normalizes embedding response to `Legion::Extensions::Llm::Embedding` |
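
For readers unfamiliar with NDJSON streaming, the sketch below shows the general buffering technique that a chunk handler such as Faraday's `on_data` callback typically needs: network chunks can end mid-line, so bytes are accumulated until a newline completes a JSON object. The class name `NdjsonBuffer` is hypothetical, and this is not the provider's actual `stream_response` or `build_chunk` code.

```ruby
require 'json'

# Hypothetical helper illustrating NDJSON line buffering.
class NdjsonBuffer
  def initialize
    @buffer = +''
  end

  # Append a raw chunk and yield each complete JSON object seen so far.
  # An incomplete trailing line stays in the buffer until more data arrives.
  def feed(chunk)
    @buffer << chunk
    while (newline = @buffer.index("\n"))
      line = @buffer.slice!(0..newline).strip
      yield JSON.parse(line) unless line.empty?
    end
  end
end
```

A handler would call `feed` once per received chunk and turn each yielded hash into a normalized chunk object.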
|
|
100
|
+
|
|
101
|
+
### `Capabilities` (module inside Provider)
|
|
102
|
+
|
|
103
|
+
Module functions providing capability predicates used during offering construction:
|
|
104
|
+
|
|
105
|
+
| Method | Always Returns |
|
|
106
|
+
|--------|---------------|
|
|
107
|
+
| `chat?(model)` | `true` |
|
|
108
|
+
| `streaming?(model)` | `true` |
|
|
109
|
+
| `vision?(model)` | `true` |
|
|
110
|
+
| `functions?(model)` | `true` |
|
|
111
|
+
| `embeddings?(model)` | `true` |
|
|
112
|
+
|
|
113
|
+
### `CONTEXT_WINDOWS` (constant)
|
|
114
|
+
|
|
115
|
+
Static fallback map keyed by model name prefix (e.g., `'qwen3' => 128_000`). Used when `/api/show` is unavailable to infer context window. Covers qwen, llama, gemma, mistral, deepseek, phi, command-r, codellama, and embedding families.
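
A prefix-keyed fallback of this kind could be looked up as sketched below. Only the `'qwen3' => 128_000` entry comes from the text above; the table name, the second entry, the method name, and the longest-prefix-wins rule are assumptions, and the gem's real lookup may differ.

```ruby
# Hypothetical table; only the 'qwen3' entry is taken from the README text.
SAMPLE_CONTEXT_WINDOWS = {
  'qwen3' => 128_000,
  'example-small' => 8_192 # made-up entry for illustration
}.freeze

def fallback_context_window(model_name, table = SAMPLE_CONTEXT_WINDOWS)
  # Prefer the longest matching prefix so a more specific family wins.
  prefix = table.keys.select { |key| model_name.start_with?(key) }.max_by(&:length)
  prefix && table[prefix]
end
```

Returning `nil` for an unknown family leaves the caller free to pick its own default.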
|
|
116
|
+
|
|
117
|
+
### `Actor::DiscoveryRefresh`
|
|
118
|
+
|
|
119
|
+
An `Every` actor that runs every 30 minutes (configurable via `settings[:extensions][:llm][:ollama][:discovery_interval]`). On each tick:
|
|
120
|
+
|
|
121
|
+
1. Calls `Legion::LLM::Discovery.refresh_discovered_models!(provider: :ollama)`
|
|
122
|
+
2. Repopulates auto routing rules if `Legion::LLM::Router` is available
|
|
123
|
+
3. Invalidates the offerings cache if `Legion::LLM::Inventory` is available
|
|
124
|
+
|
|
125
|
+
### `Actor::FleetWorker`
|
|
126
|
+
|
|
127
|
+
A `Subscription` actor that starts only when at least one instance has `fleet.respond_to_requests: true`. Routes messages to the fleet worker runner.
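
The start-up gate described above amounts to an "any instance opted in" check. A minimal sketch, assuming instances are held as a hash of name to config with a nested `fleet` section (the method name `fleet_enabled?` and the symbol-key layout are hypothetical):

```ruby
# Hypothetical gate; assumes instance configs use symbol keys.
def fleet_enabled?(instances)
  instances.any? { |_name, cfg| cfg.dig(:fleet, :respond_to_requests) == true }
end
```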
|
|
128
|
+
|
|
129
|
+
### `Runners::FleetWorker`
|
|
130
|
+
|
|
131
|
+
A module with `handle_fleet_request(payload, delivery:, properties:)`. Delegates to `Legion::Extensions::Llm::Fleet::ProviderResponder.call` with the Ollama provider family, provider class, and instance discovery lambda.
|
|
132
|
+
|
|
37
133
|
## Defaults
|
|
38
134
|
|
|
39
135
|
```ruby
|
|
@@ -65,21 +161,24 @@ Legion::Extensions::Llm::Ollama.default_settings
|
|
|
65
161
|
|
|
66
162
|
## Configuration
|
|
67
163
|
|
|
68
|
-
|
|
164
|
+
### Instance Discovery
|
|
165
|
+
|
|
166
|
+
`discover_instances` auto-detects a local instance when the socket at `127.0.0.1:11434` is reachable. Additional instances can be defined in settings using any of the recognized endpoint aliases (`base_url`, `endpoint`, `api_base`, `ollama_api_base`); the extension normalizes all to `base_url`.
|
|
69
167
|
|
|
70
168
|
```yaml
|
|
71
169
|
extensions:
|
|
72
170
|
llm:
|
|
73
171
|
ollama:
|
|
172
|
+
discovery_interval: 1800 # DiscoveryRefresh actor interval (seconds)
|
|
74
173
|
instances:
|
|
75
174
|
lab:
|
|
76
175
|
base_url: http://ollama-lab:11434
|
|
77
176
|
default_model: qwen3.5:latest
|
|
78
177
|
```
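
The local auto-detection mentioned above depends on checking whether a TCP port accepts connections. The sketch below shows one way to do that with Ruby's standard `socket` library. The method name `port_open?` and the 0.5 second timeout are assumptions; the gem itself relies on the shared `CredentialSources` helper, whose implementation is not shown here.

```ruby
require 'socket'

# Hypothetical probe: true if a TCP connection to host:port succeeds in time.
def port_open?(host, port, timeout: 0.5)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Refused, unreachable, unresolvable, or timed out: treat as not reachable.
  false
end
```

For example, `port_open?('127.0.0.1', 11434)` would report whether something is listening on the default Ollama port.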
|
|
79
178
|
|
|
80
|
-
|
|
179
|
+
### Fleet Responder
|
|
81
180
|
|
|
82
|
-
Provider instances can opt in to consuming Legion LLM fleet requests. The
|
|
181
|
+
Provider instances can opt in to consuming Legion LLM fleet requests. The fleet actor only starts when at least one instance enables `respond_to_requests`, and the runner delegates execution to the shared `lex-llm` responder helper.
|
|
83
182
|
|
|
84
183
|
```yaml
|
|
85
184
|
extensions:
|
|
@@ -96,14 +195,76 @@ extensions:
|
|
|
96
195
|
- embed
|
|
97
196
|
```
|
|
98
197
|
|
|
198
|
+
## Ollama API Surface
|
|
199
|
+
|
|
200
|
+
| Legion Method | Ollama Route | HTTP Verb |
|
|
201
|
+
|---------------|-------------|-----------|
|
|
202
|
+
| Chat | `/api/chat` | POST |
|
|
203
|
+
| Stream chat | `/api/chat` | POST |
|
|
204
|
+
| List models | `/api/tags` | GET |
|
|
205
|
+
| Running models | `/api/ps` | GET |
|
|
206
|
+
| Model details | `/api/show` | POST |
|
|
207
|
+
| Pull model | `/api/pull` | POST |
|
|
208
|
+
| Embeddings | `/api/embed` | POST |
|
|
209
|
+
| Readiness | `/api/version` | GET |
|
|
210
|
+
|
|
211
|
+
## Error Handling
|
|
212
|
+
|
|
213
|
+
Every rescue block uses `handle_exception` from `Legion::Logging::Helper` with explicit `level`, `handled:`, and `operation:` parameters. Connection failures during `discover_offerings` produce a warn-level log and return an empty array (never raise).
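
The "log and return an empty array" behaviour can be illustrated with the standard library alone. This sketch uses Ruby's `Logger` as a stand-in for `Legion::Logging::Helper`, whose exact `handle_exception` signature is not reproduced here; the method name `safe_offerings` is hypothetical.

```ruby
require 'logger'

# Hypothetical wrapper: run the block, and on failure log at warn level
# and return an empty array instead of raising.
def safe_offerings(logger: Logger.new($stderr))
  yield
rescue StandardError => e
  logger.warn("discover_offerings failed: #{e.class}: #{e.message}")
  []
end
```

Callers can then treat an unreachable Ollama instance the same as one with no models.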
|
|
214
|
+
|
|
215
|
+
## Usage
|
|
216
|
+
|
|
217
|
+
```ruby
|
|
218
|
+
require 'legion/extensions/llm/ollama'
|
|
219
|
+
|
|
220
|
+
# Access the module
|
|
221
|
+
Legion::Extensions::Llm::Ollama.discover_instances
|
|
222
|
+
Legion::Extensions::Llm::Ollama.default_settings
|
|
223
|
+
|
|
224
|
+
# Create a provider instance (usually done by lex-llm routing)
|
|
225
|
+
provider = Legion::Extensions::Llm::Ollama::Provider.new(config:)
|
|
226
|
+
|
|
227
|
+
# Discover offerings
|
|
228
|
+
provider.discover_offerings(live: true)
|
|
229
|
+
|
|
230
|
+
# Chat
|
|
231
|
+
result = provider.chat(messages: [...], model: 'llama3', temperature: 0.7)
|
|
232
|
+
|
|
233
|
+
# Stream chat
|
|
234
|
+
provider.stream_chat(messages: [...], model: 'llama3') do |chunk|
|
|
235
|
+
print chunk.content
|
|
236
|
+
end
|
|
237
|
+
|
|
238
|
+
# Embeddings
|
|
239
|
+
embeddings = provider.embed(text: "Hello world", model: 'nomic-embed-text')
|
|
240
|
+
```
|
|
241
|
+
|
|
242
|
+
## Dependencies
|
|
243
|
+
|
|
244
|
+
| Gem | Minimum Version | Purpose |
|
|
245
|
+
|-----|----------------|---------|
|
|
246
|
+
| `lex-llm` | `>= 0.4.3` | Base provider contract, routing, fleet responder, registry, credential sources |
|
|
247
|
+
| `legion-transport` | `>= 1.4.14` | Faraday connection management |
|
|
248
|
+
| `legion-json` | — | JSON serialization (`Legion::JSON`) |
|
|
249
|
+
| `legion-logging` | — | Structured logging (`Legion::Logging::Helper`) |
|
|
250
|
+
| `legion-settings` | — | Configuration access |
|
|
251
|
+
| `legion-extensions` | — | Extension framework (`Core`, `Actors::Every`, `Actors::Subscription`) |
|
|
252
|
+
|
|
99
253
|
## Development
|
|
100
254
|
|
|
101
255
|
```bash
|
|
256
|
+
cd /Users/matt.iverson@optum.com/rubymine/legion/extensions-ai/lex-llm-ollama
|
|
102
257
|
bundle install
|
|
103
|
-
|
|
258
|
+
|
|
259
|
+
# Run specs
|
|
260
|
+
bundle exec rspec
|
|
261
|
+
|
|
262
|
+
# Lint (auto-correct)
|
|
104
263
|
bundle exec rubocop -A
|
|
105
264
|
```
|
|
106
265
|
|
|
266
|
+
Spec count: 52 examples across 7 spec files.
|
|
267
|
+
|
|
107
268
|
## License
|
|
108
269
|
|
|
109
270
|
MIT
|