llm.rb 4.11.0 → 4.12.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +33 -1
- data/README.md +268 -191
- data/lib/llm/function/task.rb +7 -1
- data/lib/llm/function.rb +13 -2
- data/lib/llm/mcp/transport/http.rb +2 -1
- data/lib/llm/mcp/transport/stdio.rb +1 -0
- data/lib/llm/mcp.rb +2 -1
- data/lib/llm/provider.rb +3 -4
- data/lib/llm/providers/anthropic/request_adapter/completion.rb +8 -1
- data/lib/llm/providers/anthropic/response_adapter/completion.rb +7 -2
- data/lib/llm/providers/anthropic/stream_parser.rb +1 -1
- data/lib/llm/providers/anthropic/utils.rb +23 -0
- data/lib/llm/providers/anthropic.rb +11 -0
- data/lib/llm/stream/queue.rb +15 -2
- data/lib/llm/stream.rb +24 -10
- data/lib/llm/tracer/telemetry.rb +2 -2
- data/lib/llm/version.rb +1 -1
- data/llm.gemspec +7 -39
- metadata +9 -38
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 79d4a45ec25408e46451475575e917ef9d8579bec32f1a6a78bfed235e5ae212
+  data.tar.gz: fdeb12175be3ef87e411021444305b9e785a9bf2d055dfdc7bf718f5740623d8
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ea35b39b5476b75370485128dd8441e078bc7ac69236a7a50f4e32fb419f6fac5f7bb81faf3e029f28b788f4d69645e1b97e4126ea4f9fcc31f014921d2434a4
+  data.tar.gz: c73bbf806f5cef71bfadfc1368fbdbfe07bf37118df18ebec71f4914a27ae2a3858fa6a210ee4d7cdff8f672a14c59016604a72a0a90c611b37223c4652ee991
data/CHANGELOG.md
CHANGED
@@ -1,9 +1,41 @@
 # Changelog
 
-##
+## v4.12.0
+
+Changes since `v4.11.1`.
+
+This release expands advanced streaming and MCP execution while reframing
+llm.rb more clearly as a system integration layer for LLMs, tools, MCP
+sources, and application APIs.
+
+### Add
+
+- Add `persistent` as an alias for `persist!` on providers and MCP transports.
+- Add `LLM::Stream#on_tool_return` for observing completed streamed tool work.
+- Add `LLM::Function::Return#error?`.
+
+### Change
+
+- Expect advanced streaming callbacks to use `LLM::Stream` subclasses
+  instead of duck-typing them onto arbitrary objects. Basic `#<<`
+  streaming remains supported.
+
+### Fix
+
+- Fix Anthropic tools without params by always emitting `input_schema`.
+- Fix Anthropic tool-only responses to still produce an assistant message.
+- Fix Anthropic tool results to use the `user` role.
+- Fix Anthropic tool input normalization.
+
 ## v4.11.1
 
 Changes since `v4.11.0`.
 
+### Fix
+
+* Cast OpenTelemetry tool-related values to strings. <br>
+  Otherwise they're rejected by opentelemetry-sdk as invalid attributes.
+
 ## v4.11.0
 
 Changes since `v4.10.0`.
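The changelog's new `LLM::Function::Return#error?` can be sketched in isolation. The stand-in struct below mirrors the implementation visible in the `data/lib/llm/function.rb` hunk later in this diff; the sample ids and values are invented for illustration:

```ruby
# Stand-in mirroring LLM::Function::Return from the diff:
# a tool return counts as an error when its value is a Hash with error: true.
Return = Struct.new(:id, :name, :value) do
  def error?
    Hash === value && value[:error] == true
  end
end

ok  = Return.new("call_1", "system", {success: true})
err = Return.new("call_2", "system", {error: true, message: "command not found"})
[ok.error?, err.error?] # => [false, true]
```

Note that a non-Hash value (for example a plain String result) never reads as an error under this predicate.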
data/README.md
CHANGED
@@ -4,15 +4,16 @@
 <p align="center">
 <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
 <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-<a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.
+<a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.12.0-green.svg?" alt="Version"></a>
 </p>
 
 ## About
 
-llm.rb is a Ruby-centric
-
-
-
+llm.rb is a Ruby-centric system integration layer for building real
+LLM-powered systems. It connects LLMs to real systems by turning APIs into
+tools and unifying MCP, providers, and application logic into a single
+execution model. It is used in production systems integrating external and
+internal tools, including agents, MCP services, and OpenAPI-based APIs.
 
 Built for engineers who want to understand and control their LLM systems. No
 frameworks, no hidden magic — just composable primitives for building real
@@ -26,17 +27,22 @@ and capabilities of llm.rb.
 ## What Makes It Different
 
 Most LLM libraries stop at requests and responses. <br>
-llm.rb is built around the state and execution model
+llm.rb is built around the state and execution model behind them:
 
+- **A system layer, not just an API wrapper** <br>
+  llm.rb unifies LLMs, tools, MCP servers, and application APIs into a single execution model.
 - **Contexts are central** <br>
   They hold history, tools, schema, usage, cost, persistence, and execution state.
+- **Contexts can be serialized** <br>
+  A context can be serialized to JSON and stored on disk, in a database, in a
+  job queue, or anywhere else your application needs to persist state.
 - **Tool execution is explicit** <br>
   Run local, provider-native, and MCP tools sequentially or concurrently with threads, fibers, or async tasks.
 - **Run tools while streaming** <br>
   Start tool work while a response is still streaming instead of waiting for the turn to finish. <br>
-  This
+  This overlaps tool latency with model output and exposes streamed tool-call events for introspection, making it one of llm.rb's strongest execution features.
 - **HTTP MCP can reuse connections** <br>
-  Opt into persistent HTTP pooling for repeated remote MCP tool calls with `
+  Opt into persistent HTTP pooling for repeated remote MCP tool calls with `persistent`.
 - **One API across providers and capabilities** <br>
   The same model covers chat, files, images, audio, embeddings, vector stores, and more.
 - **Thread-safe where it matters** <br>
@@ -46,22 +52,48 @@ llm.rb is built around the state and execution model around them:
 - **Stdlib-only by default** <br>
   llm.rb runs on the Ruby standard library by default, with providers, optional features, and the model registry loaded only when you use them.
 
+## What llm.rb Enables
+
+llm.rb acts as the integration layer between LLMs, tools, and real systems.
+
+- Turn REST / OpenAPI APIs into LLM tools
+- Connect multiple MCP sources (Notion, internal services, etc.)
+- Build agents that operate across system boundaries
+- Orchestrate tools from multiple providers and protocols
+- Stream responses while executing tools concurrently
+- Treat LLMs as part of your architecture, not isolated calls
+
+Without llm.rb, providers, tool formats, and orchestration paths tend to stay
+fragmented. With llm.rb, they share a unified execution model with composable
+tools and a more consistent system architecture.
+
+## Real-World Usage
+
+llm.rb is used to integrate external MCP services such as Notion, internal APIs
+exposed via OpenAPI or `swagger.json`, and multiple tool sources into a unified
+execution model. Common usage patterns include combining multiple MCP sources,
+turning internal APIs into tools, and running those tools through the same
+context and provider flow.
+
+It supports multiple MCP sources, external SaaS integrations, internal APIs via
+OpenAPI, and multiple LLM providers simultaneously.
+
 ## Architecture & Execution Model
 
-llm.rb
+llm.rb sits at the center of the execution path, connecting tools, MCP
+sources, APIs, providers, and your application through explicit contexts:
 
 ```
-
-│
-
-│
-
-│
-
-
-
-└─────────────────────────────────────────┘
+External MCP        Internal MCP        OpenAPI / REST
+     │                   │                    │
+     └────────── Tools / MCP Layer ──────────┘
+                         │
+                  llm.rb Contexts
+                         │
+                   LLM Providers
+            (OpenAI, Anthropic, etc.)
+                         │
+                  Your Application
 ```
 
 ### Key Design Decisions
@@ -100,167 +132,150 @@ llm.rb provides a complete set of primitives for building LLM-powered systems:
 
 ## Quick Start
 
-
+These examples show individual features, but llm.rb is designed to combine
+them into full systems where LLMs, tools, and external services operate
+together.
 
-
-assistant turn is fully finished. That means tool latency can overlap with
-streaming output instead of happening strictly after it. If your model emits
-tool calls early, this can noticeably reduce end-to-end latency for real
-systems.
+#### Simple Streaming
 
-
-
+At the simplest level, any object that implements `#<<` can receive visible
+output as it arrives. This works with `$stdout`, `StringIO`, files, sockets,
+and other Ruby IO-style objects.
 
-For
+For more control, llm.rb also supports advanced streaming patterns through
+[`LLM::Stream`](lib/llm/stream.rb). See [Advanced Streaming](#advanced-streaming)
+for a structured callback-based example. Basic `#<<` streams only receive
+visible output chunks:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
 
-class System < LLM::Tool
-  name "system"
-  description "Run a shell command"
-  params { _1.object(command: _1.string.required) }
-
-  def call(command:)
-    {success: Kernel.system(command)}
-  end
-end
-
-class Stream < LLM::Stream
-  def on_content(content)
-    print content
-  end
-
-  def on_tool_call(tool, error)
-    queue << (error || tool.spawn(:thread))
-  end
-end
-
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream:
-
-
-ctx.talk(
+ctx = LLM::Context.new(llm, stream: $stdout)
+loop do
+  print "> "
+  ctx.talk(STDIN.gets || break)
+  puts
+end
 ```
 
-####
+#### Structured Outputs
 
-
-
-
-
-
-
-operations:
+The `LLM::Schema` system lets you define JSON schemas for structured outputs.
+Schemas can be defined as classes with `property` declarations or built
+programmatically using a fluent interface. When you pass a schema to a context,
+llm.rb adapts it into the provider's structured-output format when that
+provider supports one. The `content!` method then parses the assistant's JSON
+response into a Ruby object:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
+require "pp"
+
+class Report < LLM::Schema
+  property :category, Enum["performance", "security", "outage"], "Report category", required: true
+  property :summary, String, "Short summary", required: true
+  property :impact, OneOf[String, Integer], "Primary impact, as text or a count", required: true
+  property :services, Array[String], "Impacted services", required: true
+  property :timestamp, String, "When it happened", optional: true
+end
 
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm,
+ctx = LLM::Context.new(llm, schema: Report)
+res = ctx.talk("Structure this report: 'Database latency spiked at 10:42 UTC, causing 5% request timeouts for 12 minutes.'")
+pp res.content!
 
-#
-
-
+# {
+#   "category" => "performance",
+#   "summary" => "Database latency spiked, causing 5% request timeouts for 12 minutes.",
+#   "impact" => "5% request timeouts",
+#   "services" => ["Database"],
+#   "timestamp" => "2024-06-05T10:42:00Z"
+# }
 ```
 
-####
+#### Tool Calling
 
-llm.rb
-
-
-
-
-
-isolated to a single thread:
+Tools in llm.rb can be defined as classes inheriting from `LLM::Tool` or as
+closures using `LLM.function`. When the LLM requests a tool call, the context
+stores `Function` objects in `ctx.functions`. The `call()` method executes all
+pending functions and returns their results to the LLM. Tools describe
+structured parameters with JSON Schema and adapt those definitions to each
+provider's tool-calling format (OpenAI, Anthropic, Google, etc.):
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
 
-
-
+class System < LLM::Tool
+  name "system"
+  description "Run a shell command"
+  param :command, String, "Command to execute", required: true
 
-
-
-
-ctx.talk("List the directories in this project.")
-ctx.talk(ctx.call(:functions)) while ctx.functions.any?
-ensure
-  mcp.stop
+  def call(command:)
+    {success: system(command)}
+  end
 end
-```
-
-You can also connect to an MCP server over HTTP. This is useful when the
-server already runs remotely and exposes MCP through a URL instead of a local
-process. If you expect repeated tool calls, use `persist!` to reuse a
-process-wide HTTP connection pool. This requires the optional
-`net-http-persistent` gem:
-
-```ruby
-#!/usr/bin/env ruby
-require "llm"
 
 llm = LLM.openai(key: ENV["KEY"])
-
-
-
-).persist!
-
-begin
-  mcp.start
-  ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
-  ctx.talk("List the available GitHub MCP toolsets.")
-  ctx.talk(ctx.call(:functions)) while ctx.functions.any?
-ensure
-  mcp.stop
-end
+ctx = LLM::Context.new(llm, stream: $stdout, tools: [System])
+ctx.talk("Run `date`.")
+ctx.talk(ctx.call(:functions)) while ctx.functions.any?
 ```
 
-####
+#### Concurrent Tools
 
-
-
-
+llm.rb provides explicit concurrency control for tool execution. The
+`wait(:thread)` method spawns each pending function in its own thread and waits
+for all to complete. You can also use `:fiber` for cooperative multitasking or
+`:task` for async/await patterns (requires the `async` gem). The context
+automatically collects all results and reports them back to the LLM in a
+single turn, maintaining conversation flow while parallelizing independent
+operations:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
 
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: $stdout)
-
-
-
-
-end
+ctx = LLM::Context.new(llm, stream: $stdout, tools: [FetchWeather, FetchNews, FetchStock])
+
+# Execute multiple independent tools concurrently
+ctx.talk("Summarize the weather, headlines, and stock price.")
+ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
 ```
 
 #### Advanced Streaming
 
-
-structured streaming
+Use [`LLM::Stream`](lib/llm/stream.rb) when you want more than plain `#<<`
+output. It adds structured streaming callbacks for:
 
 - `on_content` for visible assistant output
 - `on_reasoning_content` for separate reasoning output
 - `on_tool_call` for streamed tool-call notifications
+- `on_tool_return` for completed tool execution
+
+Subclass [`LLM::Stream`](lib/llm/stream.rb) when you want callbacks like
+`on_reasoning_content`, `on_tool_call`, and `on_tool_return`, or helpers like
+`queue` and `wait`.
 
-
-
-
+Keep `on_content`, `on_reasoning_content`, and `on_tool_call` fast: they run
+inline with the streaming parser. `on_tool_return` is different: it runs later,
+when `wait` resolves queued streamed tool work.
 
 `on_tool_call` lets tools start before the model finishes its turn, for
 example with `tool.spawn(:thread)`, `tool.spawn(:fiber)`, or
-`tool.spawn(:task)`.
-
+`tool.spawn(:task)`. That can overlap tool latency with streaming output.
+`on_tool_return` is the place to react when that queued work completes, for
+example by updating progress UIs, logging tool latency, or changing visible
+state from "Running tool ..." to "Finished tool ...".
 
-If a stream cannot
-
-
-
-however it sees fit.
+If a stream cannot resolve a tool, `on_tool_call` receives `error` as an
+`LLM::Function::Return`. That keeps the session alive and leaves control in
+the callback: it can send `error`, spawn the tool when `error == nil`, or
+handle the situation however it sees fit.
 
 In normal use this should be rare, since `on_tool_call` is usually called with
 a resolved tool and `error == nil`. To resolve a tool call, the tool must be
@@ -274,25 +289,22 @@ require "llm"
 # Assume `System < LLM::Tool` is already defined.
 
 class Stream < LLM::Stream
-  attr_reader :content, :reasoning_content
-
-  def initialize
-    @content = +""
-    @reasoning_content = +""
-  end
-
   def on_content(content)
-
-    print content
+    $stdout << content
   end
 
   def on_reasoning_content(content)
-
+    $stderr << content
   end
 
   def on_tool_call(tool, error)
+    $stdout << "Running tool #{tool.name}\n"
     queue << (error || tool.spawn(:thread))
   end
+
+  def on_tool_return(tool, ret)
+    $stdout << (ret.error? ? "Tool #{tool.name} failed\n" : "Finished tool #{tool.name}\n")
+  end
 end
 
 llm = LLM.openai(key: ENV["KEY"])
@@ -304,69 +316,67 @@ while ctx.functions.any?
 end
 ```
 
-####
+#### MCP
 
-
-
-
-
-
-
+MCP is a first-class integration mechanism in llm.rb.
+
+MCP allows llm.rb to treat external services, internal APIs, and system
+capabilities as tools in a unified interface. This makes it possible to
+connect multiple MCP sources simultaneously and expose your own APIs as tools.
+
+In practice, this supports workflows such as external SaaS integrations,
+multiple MCP sources in the same context, and OpenAPI -> MCP -> tools
+pipelines for internal services.
+
+llm.rb integrates with the Model Context Protocol (MCP) to dynamically discover
+and use tools from external servers. This example starts a filesystem MCP
+server over stdio and makes its tools available to a context, enabling the LLM
+to interact with the local file system through a standardized interface.
+Use `LLM::MCP.stdio` or `LLM::MCP.http` when you want to make the transport
+explicit. Like `LLM::Context`, an MCP client is stateful and should remain
+isolated to a single thread:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
 
-
-
-description "Run a shell command"
-param :command, String, "Command to execute", required: true
+llm = LLM.openai(key: ENV["KEY"])
+mcp = LLM::MCP.stdio(argv: ["npx", "-y", "@modelcontextprotocol/server-filesystem", Dir.pwd])
 
-
-
-
+begin
+  mcp.start
+  ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
+  ctx.talk("List the directories in this project.")
+  ctx.talk(ctx.call(:functions)) while ctx.functions.any?
+ensure
+  mcp.stop
 end
-
-llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: $stdout, tools: [System])
-ctx.talk("Run `date`.")
-ctx.talk(ctx.call(:functions)) while ctx.functions.any?
 ```
 
-
-
-
-
-
-llm.rb adapts it into the provider's structured-output format when that
-provider supports one. The `content!` method then parses the assistant's JSON
-response into a Ruby object:
+You can also connect to an MCP server over HTTP. This is useful when the
+server already runs remotely and exposes MCP through a URL instead of a local
+process. If you expect repeated tool calls, use `persistent` to reuse a
+process-wide HTTP connection pool. This requires the optional
+`net-http-persistent` gem:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
-require "pp"
-
-class Report < LLM::Schema
-  property :category, Enum["performance", "security", "outage"], "Report category", required: true
-  property :summary, String, "Short summary", required: true
-  property :impact, OneOf[String, Integer], "Primary impact, as text or a count", required: true
-  property :services, Array[String], "Impacted services", required: true
-  property :timestamp, String, "When it happened", optional: true
-end
 
 llm = LLM.openai(key: ENV["KEY"])
-
-
-
+mcp = LLM::MCP.http(
+  url: "https://api.githubcopilot.com/mcp/",
+  headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
+).persistent
 
-
-
-
-
-
-
-
+begin
+  mcp.start
+  ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
+  ctx.talk("List the available GitHub MCP toolsets.")
+  ctx.talk(ctx.call(:functions)) while ctx.functions.any?
+ensure
+  mcp.stop
+end
 ```
 
 ## Providers
@@ -496,7 +506,7 @@ require "llm"
 LLM.json = :oj # Use Oj for faster JSON parsing
 
 # Enable HTTP connection pooling for high-throughput applications
-llm = LLM.openai(key: ENV["KEY"]).
+llm = LLM.openai(key: ENV["KEY"]).persistent # Uses net-http-persistent when available
 ```
 
 #### Model Registry
@@ -542,11 +552,11 @@ res = ctx.talk("What is the capital of France?")
 puts res.content
 ```
 
-#### Context Persistence
+#### Context Persistence: Vanilla
 
-Contexts can be serialized and restored across process boundaries.
-
-
+Contexts can be serialized and restored across process boundaries. A context
+can be serialized to JSON and stored on disk, in a database, in a job queue,
+or anywhere else your application needs to persist state:
 
 ```ruby
 #!/usr/bin/env ruby
@@ -556,12 +566,79 @@ llm = LLM.openai(key: ENV["KEY"])
 ctx = LLM::Context.new(llm)
 ctx.talk("Hello")
 ctx.talk("Remember that my favorite language is Ruby")
-
+
+# Serialize to a string when you want to store the context yourself,
+# for example in a database row or job payload.
+payload = ctx.to_json
 
 restored = LLM::Context.new(llm)
-restored.restore(
+restored.restore(string: payload)
 res = restored.talk("What is my favorite language?")
 puts res.content
+
+# You can also persist the same state to a file:
+ctx.save(path: "context.json")
+restored = LLM::Context.new(llm)
+restored.restore(path: "context.json")
+```
+
+#### Context Persistence: ActiveRecord (Rails)
+
+In a Rails application, you can also wrap persisted context state in an
+ActiveRecord model. A minimal schema would include a `snapshot` column for the
+serialized context payload (`jsonb` is recommended) and a `provider` column
+for the provider name:
+
+```ruby
+create_table :contexts do |t|
+  t.jsonb :snapshot
+  t.string :provider, null: false
+  t.timestamps
+end
+```
+
+For example:
+
+```ruby
+class Context < ApplicationRecord
+  def talk(...)
+    ctx.talk(...).tap { flush }
+  end
+
+  def wait(...)
+    ctx.wait(...).tap { flush }
+  end
+
+  def messages
+    ctx.messages
+  end
+
+  def model
+    ctx.model
+  end
+
+  def flush
+    update_column(:snapshot, ctx.to_json)
+  end
+
+  private
+
+  def ctx
+    @ctx ||= begin
+      ctx = LLM::Context.new(llm)
+      ctx.restore(string: snapshot) if snapshot
+      ctx
+    end
+  end
+
+  def llm
+    LLM.method(provider).call(key: ENV.fetch(key))
+  end
+
+  def key
+    "#{provider.upcase}_KEY"
+  end
+end
 ```
 
 #### Agents
data/lib/llm/function/task.rb
CHANGED
@@ -9,11 +9,17 @@ class LLM::Function
   # @return [Object]
   attr_reader :task
 
+  ##
+  # @return [LLM::Function, nil]
+  attr_reader :function
+
   ##
   # @param [Thread, Fiber, Async::Task] task
+  # @param [LLM::Function, nil] function
   # @return [LLM::Function::Task]
-  def initialize(task)
+  def initialize(task, function = nil)
     @task = task
+    @function = function
   end
 
   ##
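The hunk above threads the originating function through to the task wrapper, so code that later drains queued work can tell which tool produced each result. A minimal stand-in (the class name and constructor shape follow the diff; `:system_tool` and the thread body are invented):

```ruby
# Stand-in mirroring the diffed LLM::Function::Task: it now remembers
# which function spawned it, defaulting to nil for backwards compatibility.
class Task
  attr_reader :task, :function

  def initialize(task, function = nil)
    @task = task
    @function = function
  end
end

work = Task.new(Thread.new { 42 }, :system_tool)
work.task.join
work.function # identifies the spawning tool (here the placeholder :system_tool)
```

Because `function` defaults to `nil`, existing one-argument callers keep working unchanged.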
data/lib/llm/function.rb
CHANGED
@@ -41,6 +41,13 @@ class LLM::Function
   prepend LLM::Function::Tracing
 
   Return = Struct.new(:id, :name, :value) do
+    ##
+    # Returns true when the return value represents an error.
+    # @return [Boolean]
+    def error?
+      Hash === value && value[:error] == true
+    end
+
     ##
     # Returns a Hash representation of {LLM::Function::Return}
     # @return [Hash]

@@ -186,7 +193,7 @@ class LLM::Function
     else
       raise ArgumentError, "Unknown strategy: #{strategy.inspect}. Expected :thread, :task, or :fiber"
     end
-    Task.new(task)
+    Task.new(task, self)
   ensure
     @called = true
   end

@@ -233,7 +240,11 @@ class LLM::Function
     when "LLM::Google"
       {name: @name, description: @description, parameters: @params}.compact
     when "LLM::Anthropic"
-      {
+      {
+        name: @name,
+        description: @description,
+        input_schema: @params || {type: "object", properties: {}}
+      }.compact
     else
       format_openai(provider)
     end
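The Anthropic hunk above guarantees an `input_schema` key even for parameterless tools, which is what the changelog's "tools without params" fix refers to. The fallback expression can be exercised standalone; the hash shape is taken from the diff, while the wrapper method and sample inputs are hypothetical:

```ruby
# Sketch of the diffed fallback: when a tool declares no params,
# emit an empty-but-valid JSON Schema object instead of omitting the key.
def anthropic_format(name:, description:, params: nil)
  {
    name: name,
    description: description,
    input_schema: params || {type: "object", properties: {}}
  }.compact
end

anthropic_format(name: "system", description: "Run a shell command")
# => {name: "system", description: "Run a shell command",
#     input_schema: {type: "object", properties: {}}}
```

With `params` present, the declared schema passes through untouched; only the nil case takes the fallback.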
data/lib/llm/mcp/transport/http.rb
CHANGED

@@ -104,7 +104,7 @@ module LLM::MCP::Transport
   # Configures the transport to use a persistent HTTP connection pool
   # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
   # @example
-  #   mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).
+  #   mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persistent
   #   # do something with 'mcp'
   # @return [LLM::MCP::Transport::HTTP]
   def persist!

@@ -119,6 +119,7 @@ module LLM::MCP::Transport
     end
     self
   end
+  alias_method :persistent, :persist!
 
   private
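The `persistent` alias added here (and on `LLM::MCP` and `LLM::Provider` below) can be sketched with a toy class; the `Client` class and its body are invented, while the `alias_method` line matches the diff:

```ruby
# Sketch of the alias pattern from the diff: persistent and persist!
# are the same method, and both return self for chaining.
class Client
  def persist!
    @persistent = true
    self
  end
  alias_method :persistent, :persist!
end

client = Client.new
client.persistent.equal?(client) # => true, same chaining as persist!
```

The alias exists purely for readability at call sites (`LLM.openai(...).persistent` reads as configuration, not mutation); behavior is identical.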
data/lib/llm/mcp.rb
CHANGED
@@ -104,13 +104,14 @@ class LLM::MCP
   # Configures an HTTP MCP transport to use a persistent connection pool
   # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
   # @example
-  #   mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).
+  #   mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persistent
   #   # do something with 'mcp'
   # @return [LLM::MCP]
   def persist!
     transport.persist!
     self
   end
+  alias_method :persistent, :persist!
 
   ##
   # Returns the tools provided by the MCP process.
data/lib/llm/provider.rb CHANGED
@@ -308,7 +308,7 @@ class LLM::Provider
   # This method configures a provider to use a persistent connection pool
   # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
   # @example
-  #   llm = LLM.openai(key: ENV["KEY"]).
+  #   llm = LLM.openai(key: ENV["KEY"]).persistent
   #   # do something with 'llm'
   # @return [LLM::Provider]
   def persist!
@@ -317,14 +317,13 @@ class LLM::Provider
       tap { @client = client }
     end
   end
+  alias_method :persistent, :persist!

   ##
   # @param [Object] stream
   # @return [Boolean]
   def streamable?(stream)
-    stream.respond_to?(
-    stream.respond_to?(:on_reasoning_content) ||
-    stream.respond_to?(:<<)
+    LLM::Stream === stream || stream.respond_to?(:<<)
   end

   private
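The simplified `streamable?` check above replaces a chain of `respond_to?` probes with one class check plus one duck-type check. A minimal standalone sketch of the same pattern, where `Stream` is a stand-in class for the real `LLM::Stream`:

```ruby
# Stand-in for LLM::Stream; the real check uses the library's class.
class Stream; end

# A stream is accepted when it is a Stream (or subclass), or any
# object that implements #<< for receiving streamed chunks.
def streamable?(stream)
  Stream === stream || stream.respond_to?(:<<)
end

streamable?(Stream.new) # => true  (Stream instance)
streamable?(+"")        # => true  (String implements #<<)
streamable?(Object.new) # => false (neither)
```

Using `Module#===` keeps subclasses of the stream class valid, while the `#<<` fallback preserves support for plain sinks such as `String`, `Array`, or an `IO` object.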
@@ -28,12 +28,19 @@ module LLM::Anthropic::RequestAdapter

   def adapt_message
     if message.tool_call?
-      {role: message.role, content:
+      {role: message.role, content: adapt_tool_calls}
     else
       {role: message.role, content: adapt_content(content)}
     end
   end

+  def adapt_tool_calls
+    message.extra[:tool_calls].filter_map do |tool|
+      next unless tool[:id] && tool[:name]
+      {type: "tool_use", id: tool[:id], name: tool[:name], input: LLM::Anthropic.parse_tool_input(tool[:arguments])}
+    end
+  end
+
   ##
   # @param [String, URI] content
   # The content to format
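The new `adapt_tool_calls` method uses `filter_map` so that one pass both transforms each recorded tool call into an Anthropic-style `tool_use` block and drops malformed entries. A standalone sketch of that pattern, with hypothetical sample data:

```ruby
# Hypothetical recorded tool calls; the second lacks an id and
# should be skipped rather than sent to the API.
tool_calls = [
  {id: "call_1", name: "weather", arguments: {city: "Oslo"}},
  {id: nil, name: "broken", arguments: {}}
]

# filter_map: `next` (yielding nil) silently drops invalid entries,
# everything else is mapped to a tool_use content block.
blocks = tool_calls.filter_map do |tool|
  next unless tool[:id] && tool[:name]
  {type: "tool_use", id: tool[:id], name: tool[:name], input: tool[:arguments]}
end

blocks.size          # => 1
blocks.first[:type]  # => "tool_use"
```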
@@ -66,7 +66,8 @@ module LLM::Anthropic::ResponseAdapter
   private

   def adapt_choices
-    texts.map.with_index do |choice, index|
+    source = texts.empty? && tools.any? ? [{"text" => ""}] : texts
+    source.map.with_index do |choice, index|
       extra = {
         index:, response: self,
         tool_calls: adapt_tool_calls(tools), original_tool_calls: tools
@@ -77,7 +78,11 @@ module LLM::Anthropic::ResponseAdapter

   def adapt_tool_calls(tools)
     (tools || []).filter_map do |tool|
-      {
+      {
+        id: tool.id,
+        name: tool.name,
+        arguments: LLM::Anthropic.parse_tool_input(tool.input)
+      }
     end
   end

@@ -105,7 +105,7 @@ class LLM::Anthropic
     registered = LLM::Function.find_by_name(tool["name"])
     fn = (registered || LLM::Function.new(tool["name"])).dup.tap do |fn|
       fn.id = tool["id"]
-      fn.arguments = tool["input"]
+      fn.arguments = LLM::Anthropic.parse_tool_input(tool["input"])
     end
     [fn, (registered ? nil : @stream.tool_not_found(fn))]
   end
@@ -0,0 +1,23 @@
+# frozen_string_literal: true
+
+class LLM::Anthropic
+  module Utils
+    ##
+    # Normalizes Anthropic tool input to a Hash suitable for kwargs.
+    # @param input [Hash, String, nil]
+    # @return [Hash]
+    def parse_tool_input(input)
+      case input
+      when Hash then input
+      when String
+        parsed = LLM.json.load(input)
+        Hash === parsed ? parsed : {}
+      when nil then {}
+      else
+        input.respond_to?(:to_h) ? input.to_h : {}
+      end
+    rescue *LLM.json.parser_error
+      {}
+    end
+  end
+end
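The new `LLM::Anthropic::Utils#parse_tool_input` helper above centralizes tool-input normalization: whatever Anthropic sends back (a Hash, a JSON string, nil, or something hash-like), tool kwargs end up as a well-formed Hash. A standalone sketch of the same logic, substituting the stdlib `JSON` module for llm.rb's pluggable `LLM.json` backend:

```ruby
require "json"

# Normalize tool input to a Hash; invalid JSON and non-Hash scalars
# collapse to {} so downstream kwargs splatting never breaks.
def parse_tool_input(input)
  case input
  when Hash then input
  when String
    parsed = JSON.parse(input)
    Hash === parsed ? parsed : {}
  when nil then {}
  else
    input.respond_to?(:to_h) ? input.to_h : {}
  end
rescue JSON::ParserError
  {}
end

parse_tool_input('{"a": 1}')     # => {"a" => 1}
parse_tool_input("not json")     # => {}
parse_tool_input(nil)            # => {}
parse_tool_input([["k", "v"]])   # => {"k" => "v"} (via #to_h)
```

The method-level `rescue` mirrors the library's `rescue *LLM.json.parser_error` line, which splats a backend-specific list of parser error classes.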
@@ -14,6 +14,7 @@ module LLM
   #   ctx.talk ["Tell me about this photo", ctx.local_file("/images/photo.png")]
   #   ctx.messages.select(&:assistant?).each { print "[#{_1.role}]", _1.content, "\n" }
   class Anthropic < Provider
+    require_relative "anthropic/utils"
     require_relative "anthropic/error_handler"
     require_relative "anthropic/request_adapter"
     require_relative "anthropic/response_adapter"
@@ -21,6 +22,7 @@ module LLM
     require_relative "anthropic/models"
     require_relative "anthropic/files"
     include RequestAdapter
+    extend Utils

     HOST = "api.anthropic.com"

@@ -79,6 +81,15 @@ module LLM
       "assistant"
     end

+    ##
+    # Anthropic expects tool results to be sent as user messages
+    # containing `tool_result` content blocks rather than a distinct
+    # `tool` role.
+    # @return (see LLM::Provider#tool_role)
+    def tool_role
+      :user
+    end
+
     ##
     # Returns the default model for chat completions
     # @see https://docs.anthropic.com/en/docs/about-claude/models/all-models#model-comparison-table claude-sonnet-4-20250514
data/lib/llm/stream/queue.rb CHANGED
@@ -8,8 +8,10 @@ class LLM::Stream
   # returns an array of {LLM::Function::Return} values.
   class Queue
     ##
+    # @param [LLM::Stream] stream
     # @return [LLM::Stream::Queue]
-    def initialize
+    def initialize(stream)
+      @stream = stream
       @items = []
     end

@@ -39,13 +41,24 @@ class LLM::Stream
     # @return [Array<LLM::Function::Return>]
     def wait(strategy)
       returns, tasks = @items.shift(@items.length).partition { LLM::Function::Return === _1 }
-      case strategy
+      results = case strategy
       when :thread then LLM::Function::ThreadGroup.new(tasks).wait
       when :task then LLM::Function::TaskGroup.new(tasks).wait
       when :fiber then LLM::Function::FiberGroup.new(tasks).wait
       else raise ArgumentError, "Unknown strategy: #{strategy.inspect}. Expected :thread, :task, or :fiber"
       end
+      returns.concat fire_hooks(tasks, results)
     end
     alias_method :value, :wait
+
+    private
+
+    def fire_hooks(tasks, results)
+      results.each_with_index do |ret, idx|
+        tool = tasks[idx]&.function
+        @stream.on_tool_return(tool, ret) if tool
+      end
+      results
+    end
   end
 end
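The reworked `wait` above does three things: split already-completed returns from still-pending tasks, resolve the pending work with the chosen concurrency strategy, then fire a per-result hook before merging everything. A minimal standalone sketch of that flow, where `Return`, `Task`, and the hook block are simplified stand-ins for `LLM::Function::Return`, the task groups, and `LLM::Stream#on_tool_return`:

```ruby
Return = Struct.new(:value)

Task = Struct.new(:function) do
  # Stand-in for resolving queued tool work via a task group.
  def run
    Return.new("#{function} done")
  end
end

def wait(items, &on_tool_return)
  # Split completed returns from still-pending tasks.
  returns, tasks = items.partition { Return === _1 }
  # Resolve pending work (sequentially here, concurrently in llm.rb).
  results = tasks.map(&:run)
  # Fire the per-result hook before merging.
  results.each_with_index { |ret, idx| on_tool_return.call(tasks[idx].function, ret) }
  returns.concat(results)
end

fired = []
out = wait([Return.new("early"), Task.new("weather")]) { |tool, _ret| fired << tool }
out.map(&:value) # => ["early", "weather done"]
fired            # => ["weather"]
```

The hook fires only for work that was still pending at `wait` time, which matches `fire_hooks` iterating over `tasks` rather than over the already-completed `returns`.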
data/lib/llm/stream.rb CHANGED
@@ -5,20 +5,20 @@ module LLM
   # The {LLM::Stream LLM::Stream} class provides the callback interface for
   # streamed model output in llm.rb.
   #
-  # A stream object can be an instance of {LLM::Stream LLM::Stream}
-  # subclass that overrides the callbacks it needs
-  #
-  # helper for collecting asynchronous tool work started from a
-  # {#tool_not_found} returns an in-band tool error when a
-  # cannot be resolved.
+  # A stream object can be an instance of {LLM::Stream LLM::Stream} or a
+  # subclass that overrides the callbacks it needs. For basic streaming,
+  # llm.rb also accepts any object that implements `#<<`. {#queue} provides
+  # a small helper for collecting asynchronous tool work started from a
+  # callback, and {#tool_not_found} returns an in-band tool error when a
+  # streamed tool cannot be resolved.
   #
   # @note The `on_*` callbacks run inline with the streaming parser. They
   #   therefore block streaming progress and should generally return as
   #   quickly as possible.
   #
-  # The most common callback is {#on_content}, which also maps to {#<<}
-  #
-  #
+  # The most common callback is {#on_content}, which also maps to {#<<}.
+  # Providers may also call {#on_reasoning_content} and {#on_tool_call} when
+  # that data is available.
   class Stream
     require_relative "stream/queue"

@@ -26,7 +26,7 @@ module LLM
     # Returns a lazily-initialized queue for tool results or spawned work.
     # @return [LLM::Stream::Queue]
     def queue
-      @queue ||= Queue.new
+      @queue ||= Queue.new(self)
     end

     ##
@@ -79,6 +79,20 @@ module LLM
       nil
     end

+    ##
+    # Called when queued streamed tool work returns.
+    # @note This callback runs when {#wait} resolves work that was queued from
+    #   {#on_tool_call}, such as values returned by `tool.spawn(:thread)`,
+    #   `tool.spawn(:fiber)`, or `tool.spawn(:task)`.
+    # @param [LLM::Function] tool
+    #   The tool that returned.
+    # @param [LLM::Function::Return] ret
+    #   The completed tool return.
+    # @return [nil]
+    def on_tool_return(tool, ret)
+      nil
+    end
+
     # @endgroup

     # @group Error handlers
data/lib/llm/tracer/telemetry.rb CHANGED
@@ -126,7 +126,7 @@ module LLM
       "gen_ai.operation.name" => "execute_tool",
       "gen_ai.request.model" => model,
       "gen_ai.tool.call.id" => id,
-      "gen_ai.tool.name" => name,
+      "gen_ai.tool.name" => name&.to_s,
       "gen_ai.tool.call.arguments" => LLM.json.dump(arguments),
       "gen_ai.provider.name" => provider_name,
       "server.address" => provider_host,
@@ -145,7 +145,7 @@ module LLM
     return nil unless span
     attributes = {
       "gen_ai.tool.call.id" => result.id,
-      "gen_ai.tool.name" => result.name,
+      "gen_ai.tool.name" => result.name&.to_s,
       "gen_ai.tool.call.result" => LLM.json.dump(result.value)
     }.compact
     attributes.each { span.set_attribute(_1, _2) }
CHANGED
data/llm.gemspec
CHANGED
|
@@ -8,47 +8,15 @@ Gem::Specification.new do |spec|
|
|
|
8
8
|
spec.authors = ["Antar Azri", "0x1eef", "Christos Maris", "Rodrigo Serrano"]
|
|
9
9
|
spec.email = ["azantar@proton.me", "0x1eef@hardenedbsd.org"]
|
|
10
10
|
|
|
11
|
-
spec.summary =
|
|
12
|
-
llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
|
|
13
|
-
LLMs are part of your architecture, not just API calls. It gives you explicit
|
|
14
|
-
control over contexts, tools, concurrency, and providers, so you can compose
|
|
15
|
-
reliable, production-ready workflows without hidden abstractions.
|
|
16
|
-
SUMMARY
|
|
11
|
+
spec.summary = "System integration layer for LLMs, tools, MCP, and APIs in Ruby."
|
|
17
12
|
|
|
18
13
|
spec.description = <<~DESCRIPTION
|
|
19
|
-
llm.rb is a Ruby-centric
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
frameworks, no hidden magic — just composable primitives for building real
|
|
26
|
-
applications, from scripts to full systems like Relay.
|
|
27
|
-
|
|
28
|
-
## Key Features
|
|
29
|
-
|
|
30
|
-
- **Contexts are central** — Hold history, tools, schema, usage, cost, persistence, and execution state
|
|
31
|
-
- **Tool execution is explicit** — Run local, provider-native, and MCP tools sequentially or concurrently
|
|
32
|
-
- **One API across providers** — Unified interface for OpenAI, Anthropic, Google, xAI, zAI, DeepSeek, Ollama, and LlamaCpp
|
|
33
|
-
- **Thread-safe where it matters** — Providers are shareable, while contexts stay isolated and stateful
|
|
34
|
-
- **Production-ready** — Cost tracking, observability, persistence, and performance tuning built in
|
|
35
|
-
- **Stdlib-only by default** — Runs on Ruby standard library, with optional features loaded only when used
|
|
36
|
-
|
|
37
|
-
## Capabilities
|
|
38
|
-
|
|
39
|
-
- Chat & Contexts with persistence
|
|
40
|
-
- Streaming responses
|
|
41
|
-
- Tool calling with JSON Schema validation
|
|
42
|
-
- Concurrent execution (threads, fibers, async tasks)
|
|
43
|
-
- Agents with auto-execution
|
|
44
|
-
- Structured outputs
|
|
45
|
-
- MCP (Model Context Protocol) support
|
|
46
|
-
- Multimodal inputs (text, images, audio, documents)
|
|
47
|
-
- Audio generation, transcription, translation
|
|
48
|
-
- Image generation and editing
|
|
49
|
-
- Files API for document processing
|
|
50
|
-
- Embeddings and vector stores
|
|
51
|
-
- Local model registry for capabilities, limits, and pricing
|
|
14
|
+
llm.rb is a Ruby-centric system integration layer for building LLM-powered
|
|
15
|
+
systems. It connects LLMs to real systems by turning APIs into tools and
|
|
16
|
+
unifying MCP, providers, contexts, and application logic in one execution
|
|
17
|
+
model. It supports explicit tool orchestration, concurrent execution,
|
|
18
|
+
streaming, multiple MCP sources, and multiple LLM providers for production
|
|
19
|
+
systems that integrate external and internal services.
|
|
52
20
|
DESCRIPTION
|
|
53
21
|
|
|
54
22
|
spec.license = "0BSD"
|
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: llm.rb
 version: !ruby/object:Gem::Version
-  version: 4.11.0
+  version: 4.12.0
 platform: ruby
 authors:
 - Antar Azri
@@ -195,39 +195,12 @@ dependencies:
   - !ruby/object:Gem::Version
     version: '1.7'
 description: |
-  llm.rb is a Ruby-centric
-
-
-
-
-
-  frameworks, no hidden magic — just composable primitives for building real
-  applications, from scripts to full systems like Relay.
-
-  ## Key Features
-
-  - **Contexts are central** — Hold history, tools, schema, usage, cost, persistence, and execution state
-  - **Tool execution is explicit** — Run local, provider-native, and MCP tools sequentially or concurrently
-  - **One API across providers** — Unified interface for OpenAI, Anthropic, Google, xAI, zAI, DeepSeek, Ollama, and LlamaCpp
-  - **Thread-safe where it matters** — Providers are shareable, while contexts stay isolated and stateful
-  - **Production-ready** — Cost tracking, observability, persistence, and performance tuning built in
-  - **Stdlib-only by default** — Runs on Ruby standard library, with optional features loaded only when used
-
-  ## Capabilities
-
-  - Chat & Contexts with persistence
-  - Streaming responses
-  - Tool calling with JSON Schema validation
-  - Concurrent execution (threads, fibers, async tasks)
-  - Agents with auto-execution
-  - Structured outputs
-  - MCP (Model Context Protocol) support
-  - Multimodal inputs (text, images, audio, documents)
-  - Audio generation, transcription, translation
-  - Image generation and editing
-  - Files API for document processing
-  - Embeddings and vector stores
-  - Local model registry for capabilities, limits, and pricing
+  llm.rb is a Ruby-centric system integration layer for building LLM-powered
+  systems. It connects LLMs to real systems by turning APIs into tools and
+  unifying MCP, providers, contexts, and application logic in one execution
+  model. It supports explicit tool orchestration, concurrent execution,
+  streaming, multiple MCP sources, and multiple LLM providers for production
+  systems that integrate external and internal services.
 email:
 - azantar@proton.me
 - 0x1eef@hardenedbsd.org
@@ -300,6 +273,7 @@ files:
 - lib/llm/providers/anthropic/response_adapter/models.rb
 - lib/llm/providers/anthropic/response_adapter/web_search.rb
 - lib/llm/providers/anthropic/stream_parser.rb
+- lib/llm/providers/anthropic/utils.rb
 - lib/llm/providers/deepseek.rb
 - lib/llm/providers/deepseek/request_adapter.rb
 - lib/llm/providers/deepseek/request_adapter/completion.rb
@@ -417,8 +391,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 rubygems_version: 3.6.9
 specification_version: 4
-summary: llm.rb is a Ruby-centric toolkit for building real LLM-powered systems —
-  where LLMs are part of your architecture, not just API calls. It gives you explicit
-  control over contexts, tools, concurrency, and providers, so you can compose reliable,
-  production-ready workflows without hidden abstractions.
+summary: System integration layer for LLMs, tools, MCP, and APIs in Ruby.
 test_files: []