llm.rb 11.3.1 → 12.0.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +242 -1
- data/LICENSE +92 -17
- data/README.md +204 -623
- data/data/anthropic.json +433 -249
- data/data/bedrock.json +2097 -1055
- data/data/deepinfra.json +993 -0
- data/data/deepseek.json +53 -28
- data/data/google.json +389 -771
- data/data/openai.json +1053 -771
- data/data/xai.json +133 -292
- data/data/zai.json +249 -141
- data/lib/llm/active_record/acts_as_agent.rb +3 -41
- data/lib/llm/active_record/acts_as_llm.rb +18 -0
- data/lib/llm/active_record.rb +3 -3
- data/lib/llm/context.rb +9 -5
- data/lib/llm/contract/completion.rb +2 -2
- data/lib/llm/provider.rb +2 -2
- data/lib/llm/providers/deepinfra/audio.rb +66 -0
- data/lib/llm/providers/deepinfra/images.rb +90 -0
- data/lib/llm/providers/deepinfra/response_adapter.rb +36 -0
- data/lib/llm/providers/deepinfra.rb +100 -0
- data/lib/llm/providers/deepseek/images.rb +109 -0
- data/lib/llm/providers/deepseek/request_adapter.rb +32 -0
- data/lib/llm/providers/deepseek/response_adapter/image.rb +9 -0
- data/lib/llm/providers/deepseek/response_adapter.rb +29 -0
- data/lib/llm/providers/deepseek.rb +4 -2
- data/lib/llm/providers/google/request_adapter.rb +22 -5
- data/lib/llm/providers/google.rb +4 -4
- data/lib/llm/providers/openai/audio.rb +6 -2
- data/lib/llm/providers/openai/images.rb +9 -50
- data/lib/llm/providers/openai/request_adapter/respond.rb +38 -4
- data/lib/llm/providers/openai/response_adapter/audio.rb +5 -1
- data/lib/llm/providers/openai/response_adapter/completion.rb +1 -1
- data/lib/llm/providers/openai/response_adapter/image.rb +0 -4
- data/lib/llm/providers/openai/responses.rb +1 -0
- data/lib/llm/providers/openai/stream_parser.rb +5 -6
- data/lib/llm/providers/openai.rb +2 -2
- data/lib/llm/providers/xai/images.rb +49 -26
- data/lib/llm/providers/xai.rb +2 -2
- data/lib/llm/response.rb +10 -0
- data/lib/llm/schema/leaf.rb +7 -1
- data/lib/llm/schema/renderer.rb +121 -0
- data/lib/llm/schema.rb +30 -0
- data/lib/llm/sequel/agent.rb +2 -43
- data/lib/llm/sequel/plugin.rb +25 -7
- data/lib/llm/tracer/telemetry.rb +4 -6
- data/lib/llm/tracer.rb +9 -21
- data/lib/llm/transport/execution.rb +16 -1
- data/lib/llm/transport/net_http_adapter.rb +1 -1
- data/lib/llm/uridata.rb +16 -0
- data/lib/llm/version.rb +1 -1
- data/lib/llm.rb +9 -0
- data/llm.gemspec +5 -18
- data/resources/deepdive.md +798 -264
- metadata +15 -18
- data/lib/llm/tracer/langsmith.rb +0 -144
data/resources/deepdive.md
CHANGED
|
@@ -12,421 +12,955 @@
|
|
|
12
12
|
|
|
13
13
|
> A [r.uby.dev](https://r.uby.dev) project.
|
|
14
14
|
|
|
15
|
-
##
|
|
15
|
+
## Welcome
|
|
16
|
+
|
|
17
|
+
Welcome to the llm.rb deepdive. You are reading this document
|
|
18
|
+
in Markdown format. An optimized version exists
|
|
19
|
+
at [https://r.uby.dev/llm/deepdive](https://r.uby.dev/llm/deepdive)
|
|
20
|
+
and it is both easier to read and navigate.
|
|
21
|
+
|
|
22
|
+
This document is a continuation of the [homepage documentation](https://r.uby.dev/llm).
|
|
23
|
+
It assumes you are familiar with the basics already, and focuses on
|
|
24
|
+
features that didn't make it into the homepage documentation.
|
|
25
|
+
|
|
26
|
+
## Table of contents
|
|
27
|
+
|
|
28
|
+
- [Agents](#agents)
|
|
29
|
+
- [As a subclass](#as-a-subclass)
|
|
30
|
+
- [As an object](#as-an-object)
|
|
31
|
+
- [Skills](#skills)
|
|
32
|
+
- [SKILL.md](#skillmd)
|
|
33
|
+
- [Run it](#run-it)
|
|
34
|
+
- [MCP](#mcp)
|
|
35
|
+
- [stdio](#stdio)
|
|
36
|
+
- [http](#http)
|
|
37
|
+
- [A2A](#a2a)
|
|
38
|
+
- [rest](#rest)
|
|
39
|
+
- [jsonrpc](#jsonrpc)
|
|
40
|
+
- [Transports](#transports)
|
|
41
|
+
- [net/http](#nethttp)
|
|
42
|
+
- [net/http/persistent](#nethttppersistent)
|
|
43
|
+
- [curb](#curb)
|
|
44
|
+
- [Stream](#stream)
|
|
45
|
+
- [IO-like object](#io-like-object)
|
|
46
|
+
- [LLM::Stream](#llmstream)
|
|
47
|
+
- [ORM](#orm)
|
|
48
|
+
- [ActiveRecord](#activerecord)
|
|
49
|
+
- [Sequel](#sequel)
|
|
50
|
+
- [Schema](#schema)
|
|
51
|
+
- [Estimation](#estimation)
|
|
52
|
+
- [Cancellation](#cancellation)
|
|
53
|
+
- [Cancel a request](#cancel-a-request)
|
|
54
|
+
- [Tracer](#tracer)
|
|
55
|
+
- [Provider-wide tracer](#provider-wide-tracer)
|
|
56
|
+
- [Agent-local tracer](#agent-local-tracer)
|
|
57
|
+
- [Images](#images)
|
|
58
|
+
- [Generation](#generation)
|
|
59
|
+
- [Edits](#edits)
|
|
60
|
+
- [Audio](#audio)
|
|
61
|
+
- [text-to-speech](#text-to-speech)
|
|
62
|
+
- [speech-to-text](#speech-to-text)
|
|
63
|
+
- [translation](#translation)
|
|
64
|
+
|
|
65
|
+
## Agents
|
|
66
|
+
|
|
67
|
+
An agent is represented by the
|
|
68
|
+
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
|
|
69
|
+
class, and it is built on top of
|
|
70
|
+
[`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html) -
|
|
71
|
+
the heart of the runtime. An agent manages the tool loop automatically,
|
|
72
|
+
implements a tool loop guard for misbehaving models, and
|
|
73
|
+
it can use five different concurrency strategies to execute
|
|
74
|
+
tools.
|
|
75
|
+
|
|
76
|
+
An agent can be a subclass of
|
|
77
|
+
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html),
|
|
78
|
+
or a direct
|
|
79
|
+
instance of it. The subclass approach is useful when you
|
|
80
|
+
want reusable agents that can attach behavior (as methods)
|
|
81
|
+
to their own class.
|
|
82
|
+
|
|
83
|
+
#### As a subclass
|
|
84
|
+
|
|
85
|
+
A subclass of
|
|
86
|
+
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
|
|
87
|
+
can define its model, tools,
|
|
88
|
+
and other attributes at the class-level. All of these
|
|
89
|
+
attributes are optional, and they act as defaults that
|
|
90
|
+
can be overridden at the instance level.
|
|
91
|
+
|
|
92
|
+
The example uses the `:fork` concurrency model. It has
|
|
93
|
+
two primary benefits: tools are run in parallel, and in
|
|
94
|
+
a separate process with a separate memory address space.
|
|
95
|
+
|
|
96
|
+
The example purposefully demonstrates how the attributes
|
|
97
|
+
can be lazily defined with a block, or a Symbol that is
|
|
98
|
+
evaluated as an instance method on the subclass. It is
|
|
99
|
+
not strictly necessary, though, and the example would
|
|
100
|
+
be simpler without it.
|
|
16
101
|
|
|
17
|
-
|
|
18
|
-
|
|
102
|
+
```ruby
|
|
103
|
+
class Agent < LLM::Agent
|
|
104
|
+
model "deepseek-v4-pro"
|
|
105
|
+
tools { [DoResearch, FinalizeResearch, ActOnResearch] }
|
|
106
|
+
stream { $stdout }
|
|
107
|
+
tracer :set_tracer
|
|
108
|
+
concurrency :fork
|
|
109
|
+
|
|
110
|
+
def research!
|
|
111
|
+
talk "start the research"
|
|
112
|
+
end
|
|
19
113
|
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
114
|
+
private
|
|
115
|
+
|
|
116
|
+
def set_tracer
|
|
117
|
+
LLM::Tracer::Logger.new(llm, io: $stderr)
|
|
118
|
+
end
|
|
119
|
+
end
|
|
120
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
121
|
+
agent = Agent.new(llm).tap(&:research!)
|
|
122
|
+
agent.talk "How did the research go?"
|
|
123
|
+
```
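
The isolation that `:fork` provides can be seen with plain Ruby, outside of llm.rb. The following is a minimal sketch and not the library's implementation: it assumes a platform where `Kernel#fork` is available (so not Windows or JRuby), and the `counter` variable and the pipe exist only for illustration. The child process changes its own copy of `counter`, and the parent's copy stays untouched.

```ruby
# A plain-Ruby sketch of process isolation with fork.
# `counter` and the pipe are illustrative, not part of llm.rb.
counter = 0
reader, writer = IO.pipe

pid = fork do
  reader.close
  counter += 1                 # only changes the child's copy
  writer.write(counter.to_s)   # report the child's value to the parent
  writer.close
end

writer.close
child_value = reader.read.to_i
Process.wait(pid)

puts "child saw #{child_value}, parent still has #{counter}"
```

A tool that leaks memory or crashes in the child cannot corrupt the parent's state, which is the trade-off that makes a forked worker attractive despite the cost of starting a process.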
|
|
24
124
|
|
|
25
|
-
|
|
26
|
-
DeepSeek, xAI, Z.ai, AWS Bedrock, Ollama, and llama.cpp. ActiveRecord and
|
|
27
|
-
Sequel support are built in, along with concurrent tool execution through
|
|
28
|
-
threads, tasks, fibers, ractors, and fork.
|
|
125
|
+
#### As an object
|
|
29
126
|
|
|
30
|
-
|
|
127
|
+
The more direct, and sometimes more convenient, approach is to
|
|
128
|
+
create an instance of
|
|
129
|
+
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
|
|
130
|
+
directly. The same attributes can be provided as the
|
|
131
|
+
second argument given to
|
|
132
|
+
[`LLM::Agent.new`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html),
|
|
133
|
+
and the same lazy evaluation rules apply. This approach can be
|
|
134
|
+
great for prototyping quickly, and you can always turn to a
|
|
135
|
+
subclass later if that makes more sense.
|
|
31
136
|
|
|
32
|
-
```
|
|
33
|
-
|
|
137
|
+
```ruby
|
|
138
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
139
|
+
agent = LLM::Agent.new(llm, stream: $stdout)
|
|
140
|
+
agent.talk "Hello, fellow agent"
|
|
34
141
|
```
|
|
35
142
|
|
|
36
|
-
|
|
143
|
+
[Back to top](#table-of-contents)
|
|
37
144
|
|
|
38
|
-
|
|
145
|
+
## Tools
|
|
39
146
|
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
147
|
+
A tool extends the capabilities of a model. <br>
|
|
148
|
+
A tool is a subclass of
|
|
149
|
+
[`LLM::Tool`](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html)
|
|
150
|
+
that has a name,
|
|
151
|
+
a description, and an optional set of typed parameters.
|
|
152
|
+
|
|
153
|
+
A tool also has a method associated with it, and when the
|
|
154
|
+
model calls a tool it will do so through this method –
|
|
155
|
+
alongside any parameters the tool might have defined.
|
|
156
|
+
|
|
157
|
+
In other words, a tool provides a way for a model to
|
|
158
|
+
call a method you have written, and it returns a value
|
|
159
|
+
to the model that is considered the tool's response.
|
|
160
|
+
The model then proceeds to process the tool's response,
|
|
161
|
+
and then might generate its own response, or perhaps call
|
|
162
|
+
another tool.
|
|
163
|
+
|
|
164
|
+
#### LLM::Tool
|
|
165
|
+
|
|
166
|
+
A tool can be defined by subclassing
|
|
167
|
+
[`LLM::Tool`](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html)
|
|
168
|
+
with
|
|
169
|
+
a name, description, and optional set of parameters. The
|
|
170
|
+
tool name and description should be informative so the
|
|
171
|
+
model can understand what the tool does and how it can
|
|
172
|
+
serve a user's query.
|
|
44
173
|
|
|
45
174
|
```ruby
|
|
46
175
|
require "llm"
|
|
176
|
+
require "shellwords"
|
|
177
|
+
|
|
178
|
+
class Shell < LLM::Tool
|
|
179
|
+
name "shell"
|
|
180
|
+
description "execute a shell command"
|
|
181
|
+
parameter :name, String, "the command's name"
|
|
182
|
+
parameter :arguments, Array[String], "One or more arguments"
|
|
183
|
+
required %i[name]
|
|
184
|
+
defaults arguments: []
|
|
185
|
+
|
|
186
|
+
def call(name:, arguments:)
|
|
187
|
+
out = `#{name.shellescape} #{arguments.map(&:shellescape).join(" ")}`
|
|
188
|
+
{ok: $?.success?, out:}
|
|
189
|
+
end
|
|
190
|
+
end
|
|
47
191
|
|
|
48
|
-
llm = LLM.
|
|
49
|
-
agent = LLM::Agent.new(llm, stream: $stdout)
|
|
50
|
-
agent.talk "
|
|
192
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
193
|
+
agent = LLM::Agent.new(llm, tools: [Shell], stream: $stdout)
|
|
194
|
+
agent.talk "What files are in the current working directory?"
|
|
51
195
|
```
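
The escaping in the tool above relies on `String#shellescape`, which Ruby's standard `shellwords` library adds. The sketch below uses only that library to show why it matters when a model chooses the arguments: the argument values are made up for illustration, and the round trip through `Shellwords.split` confirms that each argument survives as a single word.

```ruby
require "shellwords"

# Hypothetical input, as a model might provide it to the tool.
name = "ls"
arguments = ["-l", "my file.txt", "; rm -rf /"]

# Escape every word so the shell cannot reinterpret spaces or `;`.
command = [name, *arguments].map(&:shellescape).join(" ")
puts command

# Splitting the escaped command gives back the original words.
words = Shellwords.split(command)
puts words.inspect
```

Without the escaping, the third argument would be parsed by the shell as a second command.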
|
|
52
196
|
|
|
53
|
-
####
|
|
197
|
+
#### Errors
|
|
198
|
+
|
|
199
|
+
Exceptions that might be raised by a tool are automatically
|
|
200
|
+
rescued and returned to the model as a structured error.
|
|
201
|
+
Otherwise, the conversation's history could be left
|
|
202
|
+
in an invalid state.
|
|
54
203
|
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
204
|
+
That's because a tool call must complete with a tool response,
|
|
205
|
+
which is the only valid response a model expects, so even in the
|
|
206
|
+
case of an error, something must be returned that communicates
|
|
207
|
+
what happened.
|
|
59
208
|
|
|
60
209
|
```ruby
|
|
61
|
-
|
|
210
|
+
class Error < LLM::Tool
|
|
211
|
+
name "error"
|
|
212
|
+
description "demo how errors are handled"
|
|
213
|
+
|
|
214
|
+
##
|
|
215
|
+
# Returns
|
|
216
|
+
# {error: true, kind: "RuntimeError", message: "boom"}
|
|
217
|
+
def call
|
|
218
|
+
raise "boom"
|
|
219
|
+
end
|
|
220
|
+
end
|
|
221
|
+
```
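
The idea of turning an exception into a tool response can be sketched in a few lines of plain Ruby. This is an assumption about the general technique and not llm.rb's actual code: `call_tool_safely` is a hypothetical helper, and the only detail taken from the example above is the shape of the returned Hash.

```ruby
# Hypothetical helper: run a tool body and always return a value.
def call_tool_safely
  yield
rescue StandardError => e
  # Same shape as the structured error shown in the example above.
  {error: true, kind: e.class.name, message: e.message}
end

result = call_tool_safely { raise "boom" }
p result
```

Because the helper always returns something, the tool call can always be paired with a tool response, even when the tool itself fails.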
|
|
62
222
|
|
|
63
|
-
|
|
64
|
-
agent = LLM::Agent.new(llm, stream: $stdout)
|
|
223
|
+
## Skills
|
|
65
224
|
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
225
|
+
The skill concept is borrowed from tools like Claude and
|
|
226
|
+
Codex, but llm.rb gives it a runtime of its own. A skill
|
|
227
|
+
is a directory with a `SKILL.md` file. That file contains
|
|
228
|
+
frontmatter where the skill's name, description, and tools
|
|
229
|
+
can be declared.
|
|
230
|
+
|
|
231
|
+
#### SKILL.md
|
|
232
|
+
|
|
233
|
+
The `SKILL.md` file can look like this. When a skill runs,
|
|
234
|
+
the runtime spawns a subagent with its own context window
|
|
235
|
+
and message history. Some context is inherited from the
|
|
236
|
+
parent agent, though.
|
|
237
|
+
|
|
238
|
+
By default the subagent can only access the tools declared
|
|
239
|
+
by the skill. The `inherit` directive lets it inherit the
|
|
240
|
+
parent agent's tools instead, including A2A and MCP tools.
|
|
241
|
+
|
|
242
|
+
```markdown
|
|
243
|
+
---
|
|
244
|
+
name: git-skill
|
|
245
|
+
description: reads my git history and writes a summary
|
|
246
|
+
tools: ['git-log', 'git-show', 'write-file']
|
|
247
|
+
---
|
|
248
|
+
|
|
249
|
+
## Task
|
|
250
|
+
|
|
251
|
+
Collect a log of recent history.
|
|
252
|
+
Analyze each commit.
|
|
253
|
+
Write a summary to summary.txt
|
|
71
254
|
```
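
To make the frontmatter format concrete, here is a minimal sketch that reads it with Ruby's standard `yaml` library. It is not how llm.rb parses a skill (that is not shown in this document); the splitting approach and the variable names are assumptions made for illustration.

```ruby
require "yaml"

skill_md = <<~MARKDOWN
  ---
  name: git-skill
  description: reads my git history and writes a summary
  tools: ['git-log', 'git-show', 'write-file']
  ---

  ## Task

  Collect a log of recent history.
MARKDOWN

# Split on the two `---` lines: nothing before, frontmatter, then the body.
_, frontmatter, body = skill_md.split(/^---[ \t]*$/, 3)
meta = YAML.safe_load(frontmatter)

puts meta["name"]
puts meta["tools"].inspect
puts body.strip
```

The body after the second `---` holds the instructions for the skill, and the frontmatter holds its name, description, and tools.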
|
|
72
255
|
|
|
73
|
-
####
|
|
256
|
+
#### Run it
|
|
257
|
+
|
|
258
|
+
Given the skill above, llm.rb only needs the path to the
|
|
259
|
+
directory that contains `SKILL.md`. Under the hood, a skill
|
|
260
|
+
is represented as a tool the model can call. That means
|
|
261
|
+
a skill can be called whenever it satisfies the user's
|
|
262
|
+
request – in the same way that a regular tool can.
|
|
74
263
|
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
<br>
|
|
78
|
-
It holds the same conversation state but leaves tool execution up to you.
|
|
79
|
-
Use it when you want to decide when and how tools run.
|
|
264
|
+
This feature also works with both the ActiveRecord and
|
|
265
|
+
Sequel integrations.
|
|
80
266
|
|
|
81
267
|
```ruby
|
|
82
268
|
require "llm"
|
|
83
269
|
|
|
84
|
-
llm = LLM.
|
|
85
|
-
|
|
86
|
-
|
|
270
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
271
|
+
agent = LLM::Agent.new(llm, skills: [__dir__])
|
|
272
|
+
agent.talk "run the git skill"
|
|
87
273
|
```
|
|
88
274
|
|
|
89
|
-
|
|
275
|
+
[Back to top](#table-of-contents)
|
|
276
|
+
|
|
277
|
+
## MCP
|
|
278
|
+
|
|
279
|
+
#### stdio
|
|
280
|
+
|
|
281
|
+
The stdio transport connects to an MCP server that is launched as a
|
|
282
|
+
separate process, and both its standard input and standard output
|
|
283
|
+
streams are used for communication. It is recommended but not
|
|
284
|
+
required to execute commands for a stdio transport over a
|
|
285
|
+
persistent session via the
|
|
286
|
+
[`LLM::MCP#session`](https://r.uby.dev/api-docs/llm.rb/LLM/MCP.html#session-instance_method)
|
|
287
|
+
method – otherwise
|
|
288
|
+
you could end up launching the same process multiple times.
|
|
90
289
|
|
|
91
290
|
```ruby
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
291
|
+
require "llm"
|
|
292
|
+
|
|
293
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
294
|
+
mcp = LLM::MCP.stdio(argv: ["npx", "-y", "@forgejo/mcp-server"])
|
|
295
|
+
agent = LLM::Agent.new(llm)
|
|
296
|
+
|
|
297
|
+
mcp.session do
|
|
298
|
+
agent.talk "What's happening on forgejo?", tools: mcp.tools
|
|
299
|
+
end
|
|
95
300
|
```
|
|
96
301
|
|
|
97
|
-
|
|
98
|
-
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html).
|
|
99
|
-
It does the same thing but manages the loop for you.
|
|
302
|
+
#### http
|
|
100
303
|
|
|
101
|
-
|
|
304
|
+
The http transport connects to an MCP server over HTTP, and unlike
|
|
305
|
+
the stdio transport, the MCP server does not have to be running
|
|
306
|
+
locally. Popular services like GitHub provide their own MCP server
|
|
307
|
+
over HTTP, and it is one of the most capable MCP servers I have
|
|
308
|
+
used.
|
|
102
309
|
|
|
103
|
-
|
|
310
|
+
Unlike the stdio transport,
|
|
311
|
+
[`LLM::MCP#session`](https://r.uby.dev/api-docs/llm.rb/LLM/MCP.html#session-instance_method)
|
|
312
|
+
carries little benefit for the http transport and it can be
|
|
313
|
+
omitted. It is recommended to consider the `net_http_persistent`
|
|
314
|
+
transport for MCP interactions that run over HTTP, otherwise
|
|
315
|
+
you could end up tearing down and setting up the same connection
|
|
316
|
+
multiple times.
|
|
104
317
|
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
318
|
+
```ruby
|
|
319
|
+
require "llm"
|
|
320
|
+
|
|
321
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
322
|
+
mcp = LLM::MCP.http(
|
|
323
|
+
url: "https://api.githubcopilot.com/mcp/",
|
|
324
|
+
headers: {
|
|
325
|
+
"Authorization" => "Bearer #{ENV.fetch('GITHUB_PAT')}"
|
|
326
|
+
},
|
|
327
|
+
transport: :net_http_persistent
|
|
328
|
+
)
|
|
329
|
+
agent = LLM::Agent.new(llm)
|
|
330
|
+
agent.talk "What's happening on GitHub?", tools: mcp.tools
|
|
331
|
+
```
|
|
332
|
+
|
|
333
|
+
[Back to top](#table-of-contents)
|
|
334
|
+
|
|
335
|
+
## A2A
|
|
336
|
+
|
|
337
|
+
#### rest
|
|
338
|
+
|
|
339
|
+
The rest transport communicates with other agents via A2A
|
|
340
|
+
endpoints that speak both HTTP and JSON. The skills advertised
|
|
341
|
+
by an agent become subclasses of
|
|
342
|
+
[`LLM::Tool`](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html)
|
|
343
|
+
that can be used by both
|
|
344
|
+
[`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html),
|
|
345
|
+
and [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
|
|
346
|
+
– similar to how MCP tools become subclasses of
|
|
347
|
+
[`LLM::Tool`](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html).
|
|
109
348
|
|
|
110
349
|
```ruby
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
def call(path:)
|
|
118
|
-
{contents: File.read(path)}
|
|
119
|
-
end
|
|
120
|
-
end
|
|
350
|
+
require "llm"
|
|
351
|
+
|
|
352
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
353
|
+
a2a = LLM::A2A.rest(url: "https://agent.example.com")
|
|
354
|
+
agent = LLM::Agent.new(llm, tools: a2a.skills)
|
|
355
|
+
agent.talk "What's happening, fellow agent?"
|
|
121
356
|
```
|
|
122
357
|
|
|
123
|
-
|
|
358
|
+
#### jsonrpc
|
|
359
|
+
|
|
360
|
+
The jsonrpc transport communicates with other agents via HTTP
|
|
361
|
+
and a protocol known as JSON-RPC. Sometimes an agent will
|
|
362
|
+
implement both transports, or just one of them. An agent's card, which
|
|
363
|
+
is represented by an instance of
|
|
364
|
+
[`LLM::A2A::Card`](https://r.uby.dev/api-docs/llm.rb/LLM/A2A/Card.html),
|
|
365
|
+
can be
|
|
366
|
+
used to discover available transports via the
|
|
367
|
+
[`LLM::A2A::Card#interfaces`](https://r.uby.dev/api-docs/llm.rb/LLM/A2A/Card.html#interfaces-instance_method)
|
|
368
|
+
method.
|
|
124
369
|
|
|
125
370
|
```ruby
|
|
126
|
-
|
|
127
|
-
|
|
371
|
+
require "llm"
|
|
372
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
373
|
+
a2a = LLM::A2A.jsonrpc(url: "https://agent.example.com")
|
|
374
|
+
agent = LLM::Agent.new(llm, tools: a2a.skills)
|
|
375
|
+
agent.talk "What's happening, fellow agent?"
|
|
128
376
|
```
|
|
129
377
|
|
|
130
|
-
[
|
|
131
|
-
|
|
132
|
-
|
|
378
|
+
[Back to top](#table-of-contents)
|
|
379
|
+
|
|
380
|
+
## Transports
|
|
133
381
|
|
|
134
|
-
|
|
382
|
+
The [`LLM::Provider`](https://r.uby.dev/api-docs/llm.rb/LLM/Provider.html),
|
|
383
|
+
[`LLM::MCP`](https://r.uby.dev/api-docs/llm.rb/LLM/MCP.html), and
|
|
384
|
+
[`LLM::A2A`](https://r.uby.dev/api-docs/llm.rb/LLM/A2A.html) classes
|
|
385
|
+
all accept a `transport` option that decides which library
|
|
386
|
+
will be used for HTTP communication. There are three options out
|
|
387
|
+
of the box:
|
|
388
|
+
[`net-http`](https://github.com/ruby/net-http),
|
|
389
|
+
[`net-http-persistent`](https://github.com/drbrain/net-http-persistent),
|
|
390
|
+
and [`curb`](https://github.com/taf2/curb).
|
|
135
391
|
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
392
|
+
#### net/http
|
|
393
|
+
|
|
394
|
+
The [`net/http`](https://github.com/ruby/net-http) transport is represented by the symbol `:net_http`. <br>
|
|
395
|
+
It is the default transport.
|
|
140
396
|
|
|
141
397
|
```ruby
|
|
142
|
-
|
|
143
|
-
model "gpt-5.4-mini"
|
|
144
|
-
tools ReadFile
|
|
145
|
-
concurrency :thread
|
|
146
|
-
end
|
|
398
|
+
require "llm"
|
|
147
399
|
|
|
148
|
-
llm = LLM.
|
|
149
|
-
|
|
150
|
-
|
|
400
|
+
llm = LLM.deepseek(key: "...", transport: :net_http)
|
|
401
|
+
mcp = LLM::MCP.http(url: "...", transport: :net_http)
|
|
402
|
+
a2a = LLM::A2A.rest(url: "...", transport: :net_http)
|
|
151
403
|
```
|
|
152
404
|
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
#### Schema
|
|
405
|
+
#### net/http/persistent
|
|
156
406
|
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
407
|
+
The [`net/http/persistent`](https://github.com/drbrain/net-http-persistent) transport is represented by the symbol `:net_http_persistent`. <br>
|
|
408
|
+
It maintains a connection pool so the cost of tearing down and
|
|
409
|
+
setting up a connection repeatedly is kept low, and it is built
|
|
410
|
+
on top of [`net/http`](https://github.com/ruby/net-http).
|
|
161
411
|
|
|
162
412
|
```ruby
|
|
163
|
-
|
|
164
|
-
property :category, Enum["performance", "security", "outage"]
|
|
165
|
-
property :summary, String, "Short summary"
|
|
166
|
-
property :services, Array[String], "Impacted services"
|
|
167
|
-
required %i[category summary services]
|
|
168
|
-
end
|
|
413
|
+
require "llm"
|
|
169
414
|
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
415
|
+
llm = LLM.deepseek(key: "...", transport: :net_http_persistent)
|
|
416
|
+
mcp = LLM::MCP.http(url: "...", transport: :net_http_persistent)
|
|
417
|
+
a2a = LLM::A2A.rest(url: "...", transport: :net_http_persistent)
|
|
173
418
|
```
|
|
174
419
|
|
|
175
|
-
|
|
420
|
+
#### curb
|
|
421
|
+
|
|
422
|
+
The [`curb`](https://github.com/taf2/curb) transport is represented by the symbol `:curb`. <br>
|
|
423
|
+
It provides bindings for libcurl – a widely used, highly portable
|
|
424
|
+
and feature-rich HTTP library written in C.
|
|
176
425
|
|
|
177
426
|
```ruby
|
|
178
|
-
|
|
179
|
-
category: LLM::Schema.new.string.enum("bug", "feature").required,
|
|
180
|
-
summary: LLM::Schema.new.string.required
|
|
181
|
-
)
|
|
427
|
+
require "llm"
|
|
182
428
|
|
|
183
|
-
|
|
184
|
-
|
|
185
|
-
|
|
429
|
+
llm = LLM.deepseek(key: "...", transport: :curb)
|
|
430
|
+
mcp = LLM::MCP.http(url: "...", transport: :curb)
|
|
431
|
+
a2a = LLM::A2A.rest(url: "...", transport: :curb)
|
|
186
432
|
```
|
|
187
433
|
|
|
188
|
-
|
|
434
|
+
[Back to top](#table-of-contents)
|
|
435
|
+
|
|
436
|
+
## Stream
|
|
437
|
+
|
|
438
|
+
#### IO-like object
|
|
189
439
|
|
|
190
|
-
|
|
440
|
+
Any object that implements the `#<<` method can receive
|
|
441
|
+
chunks from a stream. That includes objects like `$stdout`.
|
|
442
|
+
This form of streaming is simple and limited. It is the
|
|
443
|
+
equivalent of
|
|
444
|
+
[`LLM::Stream#on_content`](https://r.uby.dev/api-docs/llm.rb/LLM/Stream.html#on_content-instance_method),
|
|
445
|
+
and doesn't include
|
|
446
|
+
any of the other
|
|
447
|
+
[`LLM::Stream`](https://r.uby.dev/api-docs/llm.rb/LLM/Stream.html)
|
|
448
|
+
hooks.
|
|
191
449
|
|
|
192
|
-
|
|
193
|
-
|
|
194
|
-
|
|
195
|
-
|
|
196
|
-
|
|
450
|
+
```ruby
|
|
451
|
+
require "llm"
|
|
452
|
+
|
|
453
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
454
|
+
agent = LLM::Agent.new(llm, stream: $stdout)
|
|
455
|
+
agent.talk "hello world"
|
|
456
|
+
```
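
Since the only requirement is a `#<<` method, a small class of your own can stand in for `$stdout`. The sketch below is a hypothetical `ChunkBuffer` that collects chunks in memory; the chunks are fed in by hand here, and passing an instance as the `stream:` option is an assumption based on the description above rather than something this example exercises.

```ruby
# Hypothetical IO-like object: anything with `#<<` qualifies.
class ChunkBuffer
  attr_reader :chunks

  def initialize
    @chunks = []
  end

  # Return self so calls can be chained, like IO#<< does.
  def <<(chunk)
    @chunks << chunk
    self
  end

  def to_s
    @chunks.join
  end
end

buffer = ChunkBuffer.new
buffer << "hello" << " " << "world"
puts buffer.to_s
```

An object like this can be handy in tests, or when the streamed text should be forwarded somewhere other than a terminal.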
|
|
457
|
+
|
|
458
|
+
#### LLM::Stream
|
|
459
|
+
|
|
460
|
+
The [`LLM::Stream`](https://r.uby.dev/api-docs/llm.rb/LLM/Stream.html)
|
|
461
|
+
class provides many hooks that a subclass
|
|
462
|
+
can implement. They range from being notified when a tool call
|
|
463
|
+
starts to when a tool call finishes, or when a conversation is
|
|
464
|
+
due to be compacted because the context window exceeded a defined
|
|
465
|
+
limit. All these callbacks support a responsive user interface
|
|
466
|
+
where the user is always aware of what is happening behind the
|
|
467
|
+
scenes.
|
|
197
468
|
|
|
198
469
|
```ruby
|
|
199
|
-
class
|
|
470
|
+
class Stream < LLM::Stream
|
|
200
471
|
def on_content(content)
|
|
201
|
-
|
|
472
|
+
puts content
|
|
202
473
|
end
|
|
203
474
|
|
|
204
475
|
def on_reasoning_content(content)
|
|
205
|
-
|
|
476
|
+
puts content
|
|
206
477
|
end
|
|
207
|
-
end
|
|
208
478
|
|
|
209
|
-
|
|
210
|
-
|
|
211
|
-
|
|
479
|
+
def on_tool_call(tool, error)
|
|
480
|
+
# this callback can be used to either log a tool call,
|
|
481
|
+
# or execute a tool call during a stream.
|
|
482
|
+
end
|
|
483
|
+
|
|
484
|
+
def on_tool_return(tool, result)
|
|
485
|
+
end
|
|
486
|
+
|
|
487
|
+
def on_compaction(ctx, compactor)
|
|
488
|
+
# this callback is called *before* a compaction happens
|
|
489
|
+
end
|
|
490
|
+
|
|
491
|
+
def on_compaction_finish(ctx, compactor)
|
|
492
|
+
# this callback is called *after* a compaction happens
|
|
493
|
+
end
|
|
494
|
+
end
|
|
212
495
|
```
|
|
213
496
|
|
|
214
|
-
|
|
497
|
+
[Back to top](#table-of-contents)
|
|
215
498
|
|
|
216
|
-
|
|
499
|
+
## Serialization
|
|
217
500
|
|
|
218
|
-
|
|
219
|
-
|
|
220
|
-
|
|
221
|
-
|
|
222
|
-
on
|
|
501
|
+
The [`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html)
|
|
502
|
+
class can be serialized to JSON and stored in a string or on disk.
|
|
503
|
+
That is powerful because a context contains runtime state that can
|
|
504
|
+
be restored later, in a different process or even on a different
|
|
505
|
+
machine. And because an agent is implemented on top of
|
|
506
|
+
[`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html)
|
|
507
|
+
this feature works for [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html),
|
|
508
|
+
too.
|
|
223
509
|
|
|
224
|
-
|
|
225
|
-
---
|
|
226
|
-
name: release
|
|
227
|
-
description: Prepare a release
|
|
228
|
-
tools: ["search-docs", "git"]
|
|
229
|
-
---
|
|
510
|
+
#### Save to disk
|
|
230
511
|
|
|
231
|
-
|
|
512
|
+
The runtime can serialize its state to a string, a text file, or
|
|
513
|
+
a database column. The option that fits best depends on your application
|
|
514
|
+
and environment. Web applications might be more interested in the [ORM](#orm)
|
|
515
|
+
feature, which is built on top of the serialization feature.
|
|
232
516
|
|
|
233
|
-
|
|
517
|
+
```ruby
|
|
518
|
+
##
|
|
519
|
+
# Create a provider
|
|
520
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
521
|
+
|
|
522
|
+
##
|
|
523
|
+
# Save agent
|
|
524
|
+
agent1 = LLM::Agent.new(llm)
|
|
525
|
+
agent1.talk "remember my name is robert"
|
|
526
|
+
agent1.save(path: "agent.json")
|
|
527
|
+
|
|
528
|
+
##
|
|
529
|
+
# Restore agent
|
|
530
|
+
agent2 = LLM::Agent.new(llm, stream: $stdout)
|
|
531
|
+
agent2.restore(path: "agent.json")
|
|
532
|
+
agent2.talk "what's my name?"
|
|
234
533
|
```
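
The underlying idea is an ordinary JSON round trip. The sketch below shows it with the standard library only; the layout of the `state` Hash is invented for illustration and is not the format llm.rb writes to disk.

```ruby
require "json"
require "tempfile"

# Invented state layout, for illustration only.
state = {
  "messages" => [
    {"role" => "user", "content" => "remember my name is robert"}
  ]
}

# Save: serialize to JSON and write it to a file.
file = Tempfile.new(["agent", ".json"])
File.write(file.path, JSON.generate(state))

# Restore: read the file back, possibly in another process.
restored = JSON.parse(File.read(file.path))
puts restored["messages"].first["content"]
```

Because the saved form is plain JSON, the same string could be stored in a database column instead of a file.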
|
|
235
534
|
|
|
535
|
+
## ORM
|
|
536
|
+
|
|
537
|
+
Both ActiveRecord and Sequel have first-class support in the
|
|
538
|
+
llm.rb runtime. In both cases an ActiveRecord or Sequel model
|
|
539
|
+
can be turned into a model that has the same capabilities as
|
|
540
|
+
[`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html),
|
|
541
|
+
or [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html).
|
|
542
|
+
|
|
543
|
+
The main difference is that the runtime persists directly into
|
|
544
|
+
the database with no requirements beyond a single column on a
|
|
545
|
+
single row. That means it is usually trivial to turn an existing
|
|
546
|
+
model into an AI-aware model.
|
|
547
|
+
|
|
548
|
+
#### ActiveRecord
|
|
549
|
+
|
|
550
|
+
The ActiveRecord interface for
|
|
551
|
+
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
|
|
552
|
+
is
|
|
553
|
+
[`acts_as_agent`](https://r.uby.dev/api-docs/llm.rb/LLM/ActiveRecord/ActsAsAgent.html).
|
|
554
|
+
It yields an instance of
|
|
555
|
+
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html),
|
|
556
|
+
and that can be used
|
|
557
|
+
to configure the agent (e.g. which model, instructions, skills,
|
|
558
|
+
tools, etc).
|
|
559
|
+
|
|
560
|
+
An interesting option is the `format` option. It
|
|
561
|
+
defaults to `:string` but it can also be changed to `:json`
|
|
562
|
+
or `:jsonb` depending on the configuration and type of underlying
|
|
563
|
+
column. The JSONB column type is recommended.
|
|
564
|
+
|
|
236
565
|
```ruby
|
|
237
|
-
|
|
238
|
-
|
|
239
|
-
|
|
240
|
-
end
|
|
566
|
+
require "active_record"
|
|
567
|
+
require "llm"
|
|
568
|
+
require "llm/active_record"
|
|
241
569
|
|
|
242
|
-
|
|
243
|
-
|
|
244
|
-
|
|
570
|
+
class Agent < ApplicationRecord
|
|
571
|
+
acts_as_agent(format: :jsonb) do |agent|
|
|
572
|
+
agent.model "deepseek-v4-pro"
|
|
573
|
+
agent.instructions "solve the user's query"
|
|
574
|
+
agent.tools [Research, FinalizeResearch, ActOnResearch]
|
|
575
|
+
end
|
|
245
576
|
|
|
246
|
-
|
|
247
|
-
its allowed tools, and recent conversation context. Skills can also use
|
|
248
|
-
`tools: inherit` to run with the parent agent's full toolset.
|
|
577
|
+
private
|
|
249
578
|
|
|
250
|
-
##
|
|
579
|
+
##
|
|
580
|
+
# By convention, this method defines the provider
|
|
581
|
+
# for a model. If necessary, it can be renamed and
|
|
582
|
+
# configured via `provider: :your_method` instead.
|
|
583
|
+
def set_provider
|
|
584
|
+
LLM.deepseek(key: ENV["KEY"])
|
|
585
|
+
end
|
|
586
|
+
|
|
587
|
+
##
|
|
588
|
+
# By convention, this method should return what is
|
|
589
|
+
# given as the second argument to `LLM::Context` or
|
|
590
|
+
# `LLM::Agent`.
|
|
591
|
+
#
|
|
592
|
+
# Often, there is no need to set it, so it can be left
|
|
593
|
+
# undefined or it can be reassigned in the same way as
|
|
594
|
+
# `set_provider`. For example: `context: :your_method`
|
|
595
|
+
def set_context
|
|
596
|
+
{}
|
|
597
|
+
end
|
|
598
|
+
end
|
|
599
|
+
|
|
600
|
+
agent = Agent.create!
|
|
601
|
+
agent.talk "perform research"
|
|
602
|
+
```
|
|
251
603
|
|
|
252
|
-
####
|
|
604
|
+
#### Sequel
|
|
253
605
|
|
|
254
|
-
|
|
255
|
-
|
|
256
|
-
|
|
257
|
-
|
|
258
|
-
that speaks the Model Context Protocol.
|
|
606
|
+
The following is a Sequel equivalent to the ActiveRecord example,
|
|
607
|
+
but to keep it interesting and informative, this example also
|
|
608
|
+
configures a per-model tracer that logs to `$stdout`. It works the
|
|
609
|
+
same for ActiveRecord.
|
|
259
610
|
|
|
260
611
|
```ruby
|
|
612
|
+
require "sequel"
|
|
261
613
|
require "llm"
|
|
614
|
+
require "llm/sequel/plugin"
|
|
615
|
+
|
|
616
|
+
class Agent < Sequel::Model
|
|
617
|
+
plugin(:agent, format: :jsonb) do |agent|
|
|
618
|
+
agent.model "deepseek-v4-pro"
|
|
619
|
+
agent.instructions "solve the user's query"
|
|
620
|
+
agent.tools [Research, FinalizeResearch, ActOnResearch]
|
|
621
|
+
agent.tracer { LLM::Tracer::Logger.new(llm, io: $stdout) }
|
|
622
|
+
end
|
|
262
623
|
|
|
263
|
-
|
|
264
|
-
mcp = LLM::MCP.stdio(argv: ["ruby", "server.rb"])
|
|
624
|
+
private
|
|
265
625
|
|
|
266
|
-
|
|
267
|
-
|
|
268
|
-
|
|
626
|
+
def set_provider
|
|
627
|
+
LLM.deepseek(key: ENV["KEY"])
|
|
628
|
+
end
|
|
269
629
|
end
|
|
630
|
+
|
|
631
|
+
agent = Agent.create
|
|
632
|
+
agent.talk "perform research"
|
|
270
633
|
```
|
|
271
634
|
|
|
272
|
-
|
|
635
|
+
[Back to top](#table-of-contents)
|
|
636
|
+
|
|
637
|
+
## Schema
|
|
638
|
+
|
|
639
|
+
The [`LLM::Schema`](https://r.uby.dev/api-docs/llm.rb/LLM/Schema.html)
|
|
640
|
+
class can be subclassed to describe
|
|
641
|
+
the shape of a JSON object or objects that you expect
|
|
642
|
+
the model to respond with.
|
|
643
|
+
|
|
644
|
+
It can be useful for a wide range of use cases but the
|
|
645
|
+
most popular might be classification, data extraction,
|
|
646
|
+
and transferring structured data between different software
|
|
647
|
+
rather than blobs of text that a machine cannot easily parse
|
|
648
|
+
in a structured way.
|
|
273
649
|
|
|
274
|
-
|
|
275
|
-
|
|
650
|
+
#### Estimation
|
|
651
|
+
|
|
652
|
+
The following example asks the model to estimate the age
|
|
653
|
+
of a person in a photo. The model provides a structured response
|
|
654
|
+
that's represented by an instance of
|
|
655
|
+
[`LLM::Object`](https://r.uby.dev/api-docs/llm.rb/LLM/Object.html).
|
|
656
|
+
|
|
657
|
+
The object returned by
|
|
658
|
+
[`LLM::Response#content!`](https://r.uby.dev/api-docs/llm.rb/LLM/Contract/Completion.html#content!-instance_method)
|
|
659
|
+
has methods that can access the age, confidence, and comments
|
|
660
|
+
properties.
|
|
661
|
+
This approach can also work for extracting data or an analysis
|
|
662
|
+
from a PDF, and other file types.
|
|
276
663
|
|
|
277
664
|
```ruby
|
|
278
|
-
|
|
279
|
-
|
|
280
|
-
|
|
281
|
-
|
|
665
|
+
require "llm"
|
|
666
|
+
require "pp"
|
|
667
|
+
|
|
668
|
+
class Estimation < LLM::Schema
|
|
669
|
+
property :age, Integer, "The estimated age of the person"
|
|
670
|
+
property :confidence, Number, "Your confidence in the estimate"
|
|
671
|
+
property :applicable, Boolean, "True when the photo contains a person"
|
|
672
|
+
property :comments, String, "Any additional comments or input"
|
|
673
|
+
required %i[age confidence applicable comments]
|
|
674
|
+
end
|
|
282
675
|
|
|
283
|
-
|
|
284
|
-
agent
|
|
676
|
+
llm = LLM.openai(key: ENV["KEY"])
|
|
677
|
+
agent = LLM::Agent.new(llm, schema: Estimation)
|
|
678
|
+
res = agent.ask "Given this photo, provide an age estimate", with: "photo.jpg"
|
|
679
|
+
|
|
680
|
+
##
|
|
681
|
+
# Coerces the model's response from a JSON string
|
|
682
|
+
# to an instance of LLM::Object.
|
|
683
|
+
estimate = res.content!
|
|
684
|
+
|
|
685
|
+
##
|
|
686
|
+
# Let's print the estimate
|
|
687
|
+
if estimate.applicable
|
|
688
|
+
print "The person is approx ", estimate.age.to_s, " years old", "\n"
|
|
689
|
+
print "I have a confidence rating of ", estimate.confidence.to_s, "\n"
|
|
690
|
+
else
|
|
691
|
+
print "This photo is not applicable:", "\n"
|
|
692
|
+
print estimate.comments
|
|
693
|
+
end
|
|
285
694
|
```
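
To make the shape concrete, the sketch below uses only Ruby's standard library to turn a JSON string carrying the four `Estimation` properties into an object with reader methods. It illustrates the idea and is not how llm.rb implements `content!`; the sample JSON, the `Estimate` struct, and the `parse_estimate` helper are all hypothetical.

```ruby
require "json"

# Hypothetical JSON, shaped like the Estimation schema above.
SAMPLE_JSON = <<~JSON
  {"age": 34, "confidence": 0.8, "applicable": true, "comments": "Clear photo"}
JSON

# Hypothetical stand-in for the object a caller would read from.
Estimate = Struct.new(:age, :confidence, :applicable, :comments, keyword_init: true)

# Parses the JSON string and checks that every required property is present.
def parse_estimate(json)
  data = JSON.parse(json, symbolize_names: true)
  missing = Estimate.members - data.keys
  raise ArgumentError, "missing properties: #{missing.join(', ')}" unless missing.empty?
  Estimate.new(**data)
end

estimate = parse_estimate(SAMPLE_JSON)
puts "age=#{estimate.age} confidence=#{estimate.confidence}" if estimate.applicable
```

Checking required keys before building the object mirrors the role of `required` in the schema: a response that omits a property fails early instead of yielding `nil` later.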
|
|
286
695
|
|
|
287
|
-
|
|
696
|
+
[Back to top](#table-of-contents)
|
|
697
|
+
|
|
698
|
+
## Cancellation
|
|
288
699
|
|
|
289
|
-
####
|
|
700
|
+
#### Cancel a request
|
|
290
701
|
|
|
291
|
-
|
|
292
|
-
|
|
293
|
-
|
|
702
|
+
A common scenario when communicating with a model is to
|
|
703
|
+
cancel the request mid-stream. This could be done
|
|
704
|
+
for a number of different reasons, most often because the
|
|
705
|
+
user made a mistake, or the model is making a mistake and
|
|
706
|
+
the user wants to cancel the action.
|
|
294
707
|
|
|
295
|
-
|
|
708
|
+
The runtime has built-in support for cancellation. For
|
|
709
|
+
example it is possible to cancel a request on the main
|
|
710
|
+
thread from a secondary thread. A number of things happen
|
|
711
|
+
when a request is cancelled. First, the request is cancelled
|
|
712
|
+
at the transport level, and each transport handles it a little
|
|
713
|
+
differently. The net effect in every case is that the connection
|
|
714
|
+
is closed.
|
|
296
715
|
|
|
297
|
-
|
|
716
|
+
The runtime then notifies the rest of the system. For example,
|
|
717
|
+
if a tool was running, it will receive the `on_interrupt` / `on_cancel`
|
|
718
|
+
callback that lets the tool do any necessary cleanup, or execute its own
|
|
719
|
+
cancellation plan. Tools that were pending (not yet run but requested to
|
|
720
|
+
run) are cancelled through
|
|
721
|
+
[`LLM::Function#cancel`](https://r.uby.dev/api-docs/llm.rb/LLM/Function.html#cancel-instance_method).
|
|
298
722
|
|
|
299
723
|
```ruby
|
|
300
724
|
require "llm"
|
|
301
725
|
|
|
302
|
-
llm = LLM.
|
|
726
|
+
llm = LLM.deepseek(key: ENV["DEEPSEEK_SECRET"])
|
|
303
727
|
agent = LLM::Agent.new(llm)
|
|
304
|
-
|
|
728
|
+
queue = Queue.new
|
|
305
729
|
|
|
306
|
-
|
|
307
|
-
|
|
730
|
+
Thread.new do
|
|
731
|
+
queue.push(nil)
|
|
732
|
+
sleep(2)
|
|
733
|
+
agent.cancel!
|
|
734
|
+
end
|
|
308
735
|
|
|
309
|
-
|
|
310
|
-
|
|
311
|
-
|
|
312
|
-
|
|
736
|
+
begin
|
|
737
|
+
queue.pop
|
|
738
|
+
agent.talk "write me a very long poem", stream: $stdout
|
|
739
|
+
rescue LLM::Interrupt
|
|
740
|
+
puts "request cancelled!"
|
|
741
|
+
end
|
|
313
742
|
```
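
The general pattern of cancelling work on one thread from another can be sketched in plain Ruby. The example below is a cooperative version built on a shared flag. It is not llm.rb's implementation (which, per the text above, also closes the connection at the transport level), and every name in it (`Cancelled`, `CancelToken`, `run_stream`) is hypothetical.

```ruby
# Hypothetical names throughout; this is not llm.rb's implementation.
class Cancelled < StandardError; end

# A thread-safe flag that one thread sets and another thread checks.
class CancelToken
  def initialize
    @mutex = Mutex.new
    @cancelled = false
  end

  def cancel!
    @mutex.synchronize { @cancelled = true }
  end

  def cancelled?
    @mutex.synchronize { @cancelled }
  end
end

# Consumes chunks on the calling thread until the token is cancelled.
# After the third chunk it wakes a secondary thread, which cancels the token.
def run_stream(chunks)
  token = CancelToken.new
  wake = Queue.new
  ack = Queue.new
  canceller = Thread.new do
    wake.pop        # wait until the stream is under way (or the queue closes)
    token.cancel!
    ack.push(true)  # tell the main thread the flag is set
  end
  received = []
  chunks.each do |chunk|
    raise Cancelled if token.cancelled?
    received << chunk
    if received.size == 3
      wake.push(true)
      ack.pop
    end
  end
  [:finished, received]
rescue Cancelled
  [:cancelled, received]
ensure
  wake.close       # releases the secondary thread if it was never woken
  canceller.join
end

p run_stream(1..10) # => [:cancelled, [1, 2, 3]]
```

The two queues replace `sleep` so the hand-off between threads is deterministic. A real request would be interrupted at whatever point the secondary thread happens to call cancel.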
|
|
314
743
|
|
|
315
|
-
|
|
744
|
+
[Back to top](#table-of-contents)
|
|
316
745
|
|
|
317
|
-
|
|
318
|
-
wraps an agent directly on an ActiveRecord model.
|
|
319
|
-
<br>
|
|
320
|
-
Serialized state lives in a single `data` column while your application
|
|
321
|
-
controls provider, model, and tool configuration.
|
|
746
|
+
## Tracer
|
|
322
747
|
|
|
323
|
-
|
|
324
|
-
|
|
325
|
-
|
|
326
|
-
|
|
748
|
+
The runtime can be observed by subclasses of
|
|
749
|
+
[`LLM::Tracer`](https://r.uby.dev/api-docs/llm.rb/LLM/Tracer.html). <br>
|
|
750
|
+
The default tracers include a tracer that can write to standard
|
|
751
|
+
output
|
|
752
|
+
([`LLM::Tracer::Logger`](https://r.uby.dev/api-docs/llm.rb/LLM/Tracer/Logger.html)),
|
|
753
|
+
and a generic OpenTelemetry tracer that can export spans via OTLP
|
|
754
|
+
([`LLM::Tracer::Telemetry`](https://r.uby.dev/api-docs/llm.rb/LLM/Tracer/Telemetry.html)).
|
|
327
755
|
|
|
328
|
-
|
|
329
|
-
|
|
330
|
-
|
|
331
|
-
|
|
332
|
-
|
|
333
|
-
|
|
756
|
+
llm.rb has numerous hooks implemented throughout the runtime that
|
|
757
|
+
[`LLM::Tracer`](https://r.uby.dev/api-docs/llm.rb/LLM/Tracer.html)
|
|
758
|
+
subclasses can hook into, and the tracer is
|
|
759
|
+
purposefully designed to be extensible. The scope of a trace
|
|
760
|
+
can vary from an individual agent (an instance of
|
|
761
|
+
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)),
|
|
762
|
+
to every request a provider makes (an indirect instance of
|
|
763
|
+
[`LLM::Provider`](https://r.uby.dev/api-docs/llm.rb/LLM/Provider.html)).
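
The hook-based design described above can be pictured with a small, self-contained sketch. Everything in it is hypothetical: `SketchTracer`, `MemoryTracer`, `SketchRuntime`, and the hook names `on_request_start` and `on_request_finish` are invented for illustration and do not mirror the `LLM::Tracer` API, whose actual hooks are listed in the linked documentation.

```ruby
# Hypothetical classes and hook names, for illustration only.
# They do not mirror the LLM::Tracer API.
class SketchTracer
  # Base class: every hook is a no-op, so subclasses override only what they need.
  def on_request_start(name); end
  def on_request_finish(name, seconds); end
end

# A subclass that records the events it observes in memory.
class MemoryTracer < SketchTracer
  attr_reader :events

  def initialize
    @events = []
  end

  def on_request_start(name)
    @events << [:start, name]
  end

  def on_request_finish(name, _seconds)
    @events << [:finish, name]
  end
end

# A stand-in runtime that calls the hooks around each unit of work.
class SketchRuntime
  def initialize(tracer: SketchTracer.new)
    @tracer = tracer
  end

  def request(name)
    @tracer.on_request_start(name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @tracer.on_request_finish(name, elapsed)
    result
  end
end

tracer = MemoryTracer.new
runtime = SketchRuntime.new(tracer: tracer)
runtime.request("complete") { :ok }
p tracer.events # => [[:start, "complete"], [:finish, "complete"]]
```

Because the base class defines every hook as a no-op, the runtime can call hooks unconditionally, and a tracer's scope is decided simply by which object it is handed to.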
|
|
334
764
|
|
|
335
|
-
|
|
765
|
+
#### Provider-wide tracer
|
|
336
766
|
|
|
337
|
-
|
|
338
|
-
|
|
339
|
-
end
|
|
767
|
+
The following two examples demonstrate provider-wide tracers that
|
|
768
|
+
cover every request made for a single provider.
|
|
340
769
|
|
|
341
|
-
|
|
342
|
-
|
|
343
|
-
|
|
344
|
-
|
|
770
|
+
```ruby
|
|
771
|
+
##
|
|
772
|
+
# Provider-wide tracer
|
|
773
|
+
# Writes to $stdout
|
|
774
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
775
|
+
llm.tracer = LLM::Tracer::Logger.new(llm, io: $stdout)
|
|
345
776
|
|
|
346
|
-
|
|
347
|
-
|
|
777
|
+
##
|
|
778
|
+
# Provider-wide tracer
|
|
779
|
+
# Writes to deepseek.log
|
|
780
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
781
|
+
llm.tracer = LLM::Tracer::Logger.new(llm, path: "deepseek.log")
|
|
348
782
|
```
|
|
349
783
|
|
|
350
|
-
|
|
351
|
-
[`acts_as_llm`](https://r.uby.dev/api-docs/llm.rb/LLM/ActiveRecord/ActsAsLLM.html)
|
|
352
|
-
instead. It wraps
|
|
353
|
-
[`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html) with the
|
|
354
|
-
same persistence contract.
|
|
784
|
+
#### Agent-local tracer
|
|
355
785
|
|
|
356
|
-
|
|
786
|
+
The next two examples demonstrate a tracer that is local
|
|
787
|
+
to an agent.
|
|
357
788
|
|
|
358
|
-
|
|
789
|
+
```ruby
|
|
790
|
+
##
|
|
791
|
+
# Agent-local
|
|
792
|
+
# Writes to $stdout
|
|
793
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
794
|
+
agent = LLM::Agent.new(llm, tracer: LLM::Tracer::Logger.new(llm, io: $stdout))
|
|
795
|
+
|
|
796
|
+
##
|
|
797
|
+
# Agent-local
|
|
798
|
+
# Writes to deepseek-agent.log
|
|
799
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
800
|
+
agent = LLM::Agent.new(llm, tracer: LLM::Tracer::Logger.new(llm, path: "deepseek-agent.log"))
|
|
801
|
+
```
|
|
359
802
|
|
|
360
|
-
|
|
361
|
-
|
|
362
|
-
|
|
363
|
-
|
|
803
|
+
[Back to top](#table-of-contents)
|
|
804
|
+
|
|
805
|
+
## Images
|
|
806
|
+
|
|
807
|
+
The OpenAI, Google, xAI, DeepInfra, and DeepSeek providers have
|
|
808
|
+
built-in image generation capabilities. OpenAI, xAI, and DeepInfra
|
|
809
|
+
also support image edits. Google only supports image generation.
|
|
810
|
+
DeepSeek supports generation and edits too, but only through SVG
|
|
811
|
+
output rather than raster image models.
|
|
812
|
+
|
|
813
|
+
#### Generation
|
|
814
|
+
|
|
815
|
+
The [`LLM::Provider#images`](https://r.uby.dev/api-docs/llm.rb/LLM/Provider.html#images-instance_method)
|
|
816
|
+
method returns an Image
|
|
817
|
+
object that a subset of providers implement. At the
|
|
818
|
+
moment Google, xAI, OpenAI, DeepInfra, and DeepSeek have image
|
|
819
|
+
generation capabilities. DeepSeek is the odd one out: it generates
|
|
820
|
+
SVG documents rather than raster images.
|
|
364
821
|
|
|
365
822
|
```ruby
|
|
823
|
+
require "llm"
|
|
824
|
+
|
|
825
|
+
##
|
|
826
|
+
# Store dogrocket.png
|
|
366
827
|
llm = LLM.openai(key: ENV["KEY"])
|
|
367
|
-
res = llm.
|
|
368
|
-
|
|
369
|
-
puts res.embeddings.first.size
|
|
828
|
+
res = llm.images.create(prompt: "a dog on a rocket to the moon")
|
|
829
|
+
IO.copy_stream res.images[0], "dogrocket.png"
|
|
370
830
|
```
|
|
371
831
|
|
|
372
|
-
|
|
832
|
+
The API is the same across providers. <br>
|
|
833
|
+
For example – xAI:
|
|
373
834
|
|
|
374
835
|
```ruby
|
|
375
|
-
|
|
376
|
-
|
|
377
|
-
|
|
378
|
-
|
|
379
|
-
|
|
380
|
-
|
|
381
|
-
res = llm.
|
|
382
|
-
res.
|
|
836
|
+
require "llm"
|
|
837
|
+
|
|
838
|
+
##
|
|
839
|
+
# Store dogrocket.png
|
|
840
|
+
# Same API as OpenAI
|
|
841
|
+
llm = LLM.xai(key: ENV["KEY"])
|
|
842
|
+
res = llm.images.create(prompt: "a dog on a rocket to the moon")
|
|
843
|
+
IO.copy_stream res.images[0], "dogrocket.png"
|
|
383
844
|
```
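
Both examples finish with `IO.copy_stream`, a Ruby core method that reads from an IO-like source and writes to a destination IO or file path. The sketch below shows that save step in isolation, assuming the image object behaves like a `StringIO`; the byte string is a stand-in for image data and `save_image` is a hypothetical helper.

```ruby
require "stringio"
require "tmpdir"

# Copies an IO-like source to a path and returns the number of bytes written.
def save_image(source, path)
  IO.copy_stream(source, path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "image.bin")
  bytes = save_image(StringIO.new("not a real image"), path)
  puts "wrote #{bytes} bytes to #{File.basename(path)}"
end
```

Streaming the source to disk avoids building a second copy of the image in memory, which matters more as image sizes grow.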
|
|
384
845
|
|
|
385
|
-
|
|
846
|
+
#### Edits
|
|
386
847
|
|
|
387
|
-
|
|
848
|
+
OpenAI, xAI, and DeepInfra have the same interface for image edits. <br>
|
|
849
|
+
DeepSeek also supports edits, but only for SVG files. <br>
|
|
850
|
+
Google does not support image edits. <br>
|
|
388
851
|
|
|
389
|
-
|
|
390
|
-
|
|
391
|
-
|
|
392
|
-
|
|
852
|
+
```ruby
|
|
853
|
+
require "llm"
|
|
854
|
+
|
|
855
|
+
##
|
|
856
|
+
# Edit self.jpg and add a mustache
|
|
857
|
+
# Save to mustache.png
|
|
858
|
+
llm = LLM.openai(key: ENV["KEY"])
|
|
859
|
+
res = llm.images.edit(prompt: "add a mustache", image: "self.jpg")
|
|
860
|
+
IO.copy_stream res.images[0], "mustache.png"
|
|
861
|
+
```
|
|
862
|
+
|
|
863
|
+
#### DeepSeek
|
|
864
|
+
|
|
865
|
+
The DeepSeek provider does not provide an image generation model
|
|
866
|
+
but it is possible to ask a text-to-text model to produce
|
|
867
|
+
vector graphics (SVGs), and in that limited sense, it can become
|
|
868
|
+
a capable text-to-image model.
|
|
393
869
|
|
|
394
870
|
```ruby
|
|
395
|
-
|
|
396
|
-
|
|
397
|
-
|
|
398
|
-
|
|
399
|
-
|
|
871
|
+
require "llm"
|
|
872
|
+
|
|
873
|
+
##
|
|
874
|
+
# Edit rocket.svg and change its color
|
|
875
|
+
# Save to rocket-edited.svg
|
|
876
|
+
llm = LLM.deepseek(key: ENV["KEY"])
|
|
877
|
+
res = llm.images.edit(prompt: "make the rocket red", image: "rocket.svg")
|
|
878
|
+
IO.copy_stream res.images[0], "rocket-edited.svg"
|
|
400
879
|
```
|
|
401
880
|
|
|
402
|
-
|
|
403
|
-
|
|
881
|
+
An interesting property of the DeepSeek implementation is that
|
|
882
|
+
it can maintain a session that can perform multiple image generations
|
|
883
|
+
or edits rather than just one-shot generations.
|
|
884
|
+
|
|
885
|
+
It's possible because under the hood
|
|
886
|
+
[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
|
|
887
|
+
is attached to the
|
|
888
|
+
[`LLM::Response`](https://r.uby.dev/api-docs/llm.rb/LLM/Response.html)
|
|
889
|
+
object that is returned to the caller. So the response includes an
|
|
890
|
+
`agent` method, and it can be carried across multiple generations.
|
|
891
|
+
It is specific to this endpoint though. It works like this:
|
|
404
892
|
|
|
405
893
|
```ruby
|
|
406
|
-
|
|
407
|
-
|
|
894
|
+
require "llm"
|
|
895
|
+
|
|
896
|
+
llm = LLM.deepseek(key: ENV["DEEPSEEK_SECRET"])
|
|
897
|
+
agent = nil
|
|
898
|
+
loop do
|
|
899
|
+
print "> "
|
|
900
|
+
prompt = $stdin.gets
|
|
901
|
+
res = llm.images.create(prompt:, agent:)
|
|
902
|
+
agent = res.agent
|
|
903
|
+
IO.copy_stream res.images[0], "image.svg"
|
|
904
|
+
print "ok: saved image.svg", "\n"
|
|
905
|
+
end
|
|
408
906
|
```
|
|
409
907
|
|
|
410
|
-
|
|
908
|
+
[Back to top](#table-of-contents)
|
|
909
|
+
|
|
910
|
+
## Audio
|
|
411
911
|
|
|
412
|
-
|
|
912
|
+
The audio interface defined by llm.rb describes three methods,
|
|
913
|
+
although not every provider implements all of them. Generally
|
|
914
|
+
speaking, the audio interface is for text-to-speech and
|
|
915
|
+
speech-to-text models.
|
|
413
916
|
|
|
414
|
-
|
|
917
|
+
The following providers have audio support:
|
|
918
|
+
|
|
919
|
+
* OpenAI - full support
|
|
920
|
+
* Google - partial support
|
|
921
|
+
* DeepInfra - partial support
|
|
922
|
+
|
|
923
|
+
#### text-to-speech
|
|
924
|
+
|
|
925
|
+
The `create_speech` method generates an audio clip based
|
|
926
|
+
on the given input. This method returns a
|
|
927
|
+
[`LLM::URIData`](https://r.uby.dev/api-docs/llm.rb/LLM/URIData.html)
|
|
928
|
+
object. OpenAI and DeepInfra support this method.
|
|
415
929
|
|
|
416
930
|
```ruby
|
|
417
|
-
|
|
418
|
-
|
|
419
|
-
|
|
931
|
+
require "llm"
|
|
932
|
+
|
|
933
|
+
llm = LLM.openai(key: ENV["KEY"])
|
|
934
|
+
res = llm.audio.create_speech(input: "Hello world")
|
|
935
|
+
IO.copy_stream res.audio.decoded, "helloworld.mp3"
|
|
420
936
|
```
|
|
421
937
|
|
|
422
|
-
|
|
938
|
+
#### speech-to-text
|
|
423
939
|
|
|
424
|
-
|
|
940
|
+
The `create_transcription` method transcribes a given
|
|
941
|
+
audio clip as text. OpenAI, Google, and DeepInfra support
|
|
942
|
+
this method.
|
|
943
|
+
|
|
944
|
+
```ruby
|
|
945
|
+
require "llm"
|
|
946
|
+
|
|
947
|
+
llm = LLM.google(key: ENV["KEY"])
|
|
948
|
+
res = llm.audio.create_transcription(file: "helloworld.mp3")
|
|
949
|
+
res.text # => "Hello world"
|
|
950
|
+
```
|
|
425
951
|
|
|
426
|
-
|
|
427
|
-
|
|
952
|
+
#### translation
|
|
953
|
+
|
|
954
|
+
The `create_translation` method translates a given audio
|
|
955
|
+
clip, then transcribes it as text. OpenAI and Google
|
|
956
|
+
support this method.
|
|
957
|
+
|
|
958
|
+
```ruby
|
|
959
|
+
require "llm"
|
|
960
|
+
|
|
961
|
+
llm = LLM.google(key: ENV["KEY"])
|
|
962
|
+
res = llm.audio.create_translation(file: "bomdia.mp3")
|
|
963
|
+
res.text # => "Good day"
|
|
964
|
+
```
|
|
428
965
|
|
|
429
|
-
|
|
430
|
-
|---|---|---|
|
|
431
|
-
| [matz](https://r.uby.dev/matz/) | `ssh matz@r.uby.dev` | [mruby-llm](https://r.uby.dev/mruby-llm/) |
|
|
432
|
-
| [robert](https://4.4bsd.dev/robert) | `ssh robert@4.4bsd.dev` | [mruby-llm](https://r.uby.dev/mruby-llm/) |
|
|
966
|
+
[Back to top](#table-of-contents)
|