llm.rb 4.11.0 → 4.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a2af34506e099996b451951da8fb892ecdacebe9f29217bbf7a9e3ee3382d942
- data.tar.gz: f49edb6d166ae113618139f0b118f37acbbd001b9b256d76d5c66b2828915a88
+ metadata.gz: 79d4a45ec25408e46451475575e917ef9d8579bec32f1a6a78bfed235e5ae212
+ data.tar.gz: fdeb12175be3ef87e411021444305b9e785a9bf2d055dfdc7bf718f5740623d8
  SHA512:
- metadata.gz: 8dbdbde04bf04fd714ce5ab3689f078f6a77243853bdb7ea287124295b2a5b5878493a36e4ec0c703a10466306f13ca503de9132b2a8a31c2c39b2f721b1bf78
- data.tar.gz: 5bcb9be7c664bbee548cdc305878bc62fe1c8b5ab23d64630719084dab3581b8f4abf875a235a0e33ee05430cda8d69b0b6cc8fce538abafa4e8f85bbbbaead0
+ metadata.gz: ea35b39b5476b75370485128dd8441e078bc7ac69236a7a50f4e32fb419f6fac5f7bb81faf3e029f28b788f4d69645e1b97e4126ea4f9fcc31f014921d2434a4
+ data.tar.gz: c73bbf806f5cef71bfadfc1368fbdbfe07bf37118df18ebec71f4914a27ae2a3858fa6a210ee4d7cdff8f672a14c59016604a72a0a90c611b37223c4652ee991
data/CHANGELOG.md CHANGED
@@ -1,9 +1,41 @@
  # Changelog

- ## Unreleased
+ ## v4.12.0
+
+ Changes since `v4.11.1`.
+
+ This release expands advanced streaming and MCP execution while reframing
+ llm.rb more clearly as a system integration layer for LLMs, tools, MCP
+ sources, and application APIs.
+
+ ### Add
+
+ - Add `persistent` as an alias for `persist!` on providers and MCP transports.
+ - Add `LLM::Stream#on_tool_return` for observing completed streamed tool work.
+ - Add `LLM::Function::Return#error?`.
+
+ ### Change
+
+ - Expect advanced streaming callbacks to use `LLM::Stream` subclasses
+ instead of duck-typing them onto arbitrary objects. Basic `#<<`
+ streaming remains supported.
+
+ ### Fix
+
+ - Fix Anthropic tools without params by always emitting `input_schema`.
+ - Fix Anthropic tool-only responses to still produce an assistant message.
+ - Fix Anthropic tool results to use the `user` role.
+ - Fix Anthropic tool input normalization.
+
+ ## v4.11.1

  Changes since `v4.11.0`.

+ ### Fix
+
+ * Cast OpenTelemetry tool-related values to strings. <br>
+ Otherwise they're rejected by opentelemetry-sdk as invalid attributes.
+
  ## v4.11.0

  Changes since `v4.10.0`.
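The `persistent` alias listed above is a one-line change in practice. Here is a minimal sketch based on the examples that appear later in this diff; it assumes the optional `net-http-persistent` gem is installed and that `persistent` behaves exactly like the existing `persist!`:

```ruby
#!/usr/bin/env ruby
require "llm"

# `persistent` opts a provider into a process-wide HTTP connection pool.
llm = LLM.openai(key: ENV["KEY"]).persistent

# The same alias exists on HTTP MCP transports for repeated remote tool calls.
mcp = LLM::MCP.http(url: "https://example.com/mcp").persistent
```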
data/README.md CHANGED
@@ -4,15 +4,16 @@
  <p align="center">
  <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
  <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
- <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.11.0-green.svg?" alt="Version"></a>
+ <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.12.0-green.svg?" alt="Version"></a>
  </p>

  ## About

- llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
- LLMs are part of your architecture, not just API calls. It gives you explicit
- control over contexts, tools, concurrency, and providers, so you can compose
- reliable, production-ready workflows without hidden abstractions.
+ llm.rb is a Ruby-centric system integration layer for building real
+ LLM-powered systems. It connects LLMs to real systems by turning APIs into
+ tools and unifying MCP, providers, and application logic into a single
+ execution model. It is used in production systems integrating external and
+ internal tools, including agents, MCP services, and OpenAPI-based APIs.

  Built for engineers who want to understand and control their LLM systems. No
  frameworks, no hidden magic — just composable primitives for building real
@@ -26,17 +27,22 @@ and capabilities of llm.rb.
  ## What Makes It Different

  Most LLM libraries stop at requests and responses. <br>
- llm.rb is built around the state and execution model around them:
+ llm.rb is built around the state and execution model behind them:

+ - **A system layer, not just an API wrapper** <br>
+ llm.rb unifies LLMs, tools, MCP servers, and application APIs into a single execution model.
  - **Contexts are central** <br>
  They hold history, tools, schema, usage, cost, persistence, and execution state.
+ - **Contexts can be serialized** <br>
+ A context can be serialized to JSON and stored on disk, in a database, in a
+ job queue, or anywhere else your application needs to persist state.
  - **Tool execution is explicit** <br>
  Run local, provider-native, and MCP tools sequentially or concurrently with threads, fibers, or async tasks.
  - **Run tools while streaming** <br>
  Start tool work while a response is still streaming instead of waiting for the turn to finish. <br>
- This lets tool latency overlap with model output and is one of llm.rb's strongest execution features.
+ This overlaps tool latency with model output and exposes streamed tool-call events for introspection, making it one of llm.rb's strongest execution features.
  - **HTTP MCP can reuse connections** <br>
- Opt into persistent HTTP pooling for repeated remote MCP tool calls with `persist!`.
+ Opt into persistent HTTP pooling for repeated remote MCP tool calls with `persistent`.
  - **One API across providers and capabilities** <br>
  The same model covers chat, files, images, audio, embeddings, vector stores, and more.
  - **Thread-safe where it matters** <br>
@@ -46,22 +52,48 @@ llm.rb is built around the state and execution model around them:
  - **Stdlib-only by default** <br>
  llm.rb runs on the Ruby standard library by default, with providers, optional features, and the model registry loaded only when you use them.

+ ## What llm.rb Enables
+
+ llm.rb acts as the integration layer between LLMs, tools, and real systems.
+
+ - Turn REST / OpenAPI APIs into LLM tools
+ - Connect multiple MCP sources (Notion, internal services, etc.)
+ - Build agents that operate across system boundaries
+ - Orchestrate tools from multiple providers and protocols
+ - Stream responses while executing tools concurrently
+ - Treat LLMs as part of your architecture, not isolated calls
+
+ Without llm.rb, providers, tool formats, and orchestration paths tend to stay
+ fragmented. With llm.rb, they share a unified execution model with composable
+ tools and a more consistent system architecture.
+
+ ## Real-World Usage
+
+ llm.rb is used to integrate external MCP services such as Notion, internal APIs
+ exposed via OpenAPI or `swagger.json`, and multiple tool sources into a unified
+ execution model. Common usage patterns include combining multiple MCP sources,
+ turning internal APIs into tools, and running those tools through the same
+ context and provider flow.
+
+ It supports multiple MCP sources, external SaaS integrations, internal APIs via
+ OpenAPI, and multiple LLM providers simultaneously.
+
  ## Architecture & Execution Model

- llm.rb is built in layers, each providing explicit control:
+ llm.rb sits at the center of the execution path, connecting tools, MCP
+ sources, APIs, providers, and your application through explicit contexts:

  ```
- ┌─────────────────────────────────────────┐
- Your Application
- ├─────────────────────────────────────────┤
- Contexts & Agents │ ← Stateful workflows
- ├─────────────────────────────────────────┤
- Tools & Functions │ ← Concurrent execution
- ├─────────────────────────────────────────┤
- │ Unified Provider API (OpenAI, etc.) │ ← Provider abstraction
- ├─────────────────────────────────────────┤
- │ HTTP, JSON, Thread Safety │ ← Infrastructure
- └─────────────────────────────────────────┘
+ External MCP Internal MCP OpenAPI / REST
+
+ └────────── Tools / MCP Layer ──────────┘
+
+ llm.rb Contexts
+
+ LLM Providers
+ (OpenAI, Anthropic, etc.)
+
+ Your Application
  ```

  ### Key Design Decisions
@@ -100,167 +132,150 @@ llm.rb provides a complete set of primitives for building LLM-powered systems:
100
132
 
101
133
  ## Quick Start
102
134
 
103
- #### Run Tools While Streaming
135
+ These examples show individual features, but llm.rb is designed to combine
136
+ them into full systems where LLMs, tools, and external services operate
137
+ together.
104
138
 
105
- llm.rb can start tool execution from streamed tool-call events before the
106
- assistant turn is fully finished. That means tool latency can overlap with
107
- streaming output instead of happening strictly after it. If your model emits
108
- tool calls early, this can noticeably reduce end-to-end latency for real
109
- systems.
139
+ #### Simple Streaming
110
140
 
111
- This is different from plain concurrent tool execution. The tool starts while
112
- the response is still arriving, not after the turn has fully completed.
141
+ At the simplest level, any object that implements `#<<` can receive visible
142
+ output as it arrives. This works with `$stdout`, `StringIO`, files, sockets,
143
+ and other Ruby IO-style objects.
113
144
 
114
- For example:
145
+ For more control, llm.rb also supports advanced streaming patterns through
146
+ [`LLM::Stream`](lib/llm/stream.rb). See [Advanced Streaming](#advanced-streaming)
147
+ for a structured callback-based example. Basic `#<<` streams only receive
148
+ visible output chunks:
115
149
 
116
150
  ```ruby
117
151
  #!/usr/bin/env ruby
118
152
  require "llm"
119
153
 
120
- class System < LLM::Tool
121
- name "system"
122
- description "Run a shell command"
123
- params { _1.object(command: _1.string.required) }
124
-
125
- def call(command:)
126
- {success: Kernel.system(command)}
127
- end
128
- end
129
-
130
- class Stream < LLM::Stream
131
- def on_content(content)
132
- print content
133
- end
134
-
135
- def on_tool_call(tool, error)
136
- queue << (error || tool.spawn(:thread))
137
- end
138
- end
139
-
140
154
  llm = LLM.openai(key: ENV["KEY"])
141
- ctx = LLM::Context.new(llm, stream: Stream.new, tools: [System])
142
-
143
- ctx.talk("Run `date` and tell me what command you ran.")
144
- ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
155
+ ctx = LLM::Context.new(llm, stream: $stdout)
156
+ loop do
157
+ print "> "
158
+ ctx.talk(STDIN.gets || break)
159
+ puts
160
+ end
145
161
  ```
146
162
 
147
- #### Concurrent Tools
163
+ #### Structured Outputs
148
164
 
149
- llm.rb provides explicit concurrency control for tool execution. The
150
- `wait(:thread)` method spawns each pending function in its own thread and waits
151
- for all to complete. You can also use `:fiber` for cooperative multitasking or
152
- `:task` for async/await patterns (requires the `async` gem). The context
153
- automatically collects all results and reports them back to the LLM in a
154
- single turn, maintaining conversation flow while parallelizing independent
155
- operations:
165
+ The `LLM::Schema` system lets you define JSON schemas for structured outputs.
166
+ Schemas can be defined as classes with `property` declarations or built
167
+ programmatically using a fluent interface. When you pass a schema to a context,
168
+ llm.rb adapts it into the provider's structured-output format when that
169
+ provider supports one. The `content!` method then parses the assistant's JSON
170
+ response into a Ruby object:
156
171
 
157
172
  ```ruby
158
173
  #!/usr/bin/env ruby
159
174
  require "llm"
175
+ require "pp"
176
+
177
+ class Report < LLM::Schema
178
+ property :category, Enum["performance", "security", "outage"], "Report category", required: true
179
+ property :summary, String, "Short summary", required: true
180
+ property :impact, OneOf[String, Integer], "Primary impact, as text or a count", required: true
181
+ property :services, Array[String], "Impacted services", required: true
182
+ property :timestamp, String, "When it happened", optional: true
183
+ end
160
184
 
161
185
  llm = LLM.openai(key: ENV["KEY"])
162
- ctx = LLM::Context.new(llm, stream: $stdout, tools: [FetchWeather, FetchNews, FetchStock])
186
+ ctx = LLM::Context.new(llm, schema: Report)
187
+ res = ctx.talk("Structure this report: 'Database latency spiked at 10:42 UTC, causing 5% request timeouts for 12 minutes.'")
188
+ pp res.content!
163
189
 
164
- # Execute multiple independent tools concurrently
165
- ctx.talk("Summarize the weather, headlines, and stock price.")
166
- ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
190
+ # {
191
+ # "category" => "performance",
192
+ # "summary" => "Database latency spiked, causing 5% request timeouts for 12 minutes.",
193
+ # "impact" => "5% request timeouts",
194
+ # "services" => ["Database"],
195
+ # "timestamp" => "2024-06-05T10:42:00Z"
196
+ # }
167
197
  ```
168
198
 
169
- #### MCP
199
+ #### Tool Calling
170
200
 
171
- llm.rb integrates with the Model Context Protocol (MCP) to dynamically discover
172
- and use tools from external servers. This example starts a filesystem MCP
173
- server over stdio and makes its tools available to a context, enabling the LLM
174
- to interact with the local file system through a standardized interface.
175
- Use `LLM::MCP.stdio` or `LLM::MCP.http` when you want to make the transport
176
- explicit. Like `LLM::Context`, an MCP client is stateful and should remain
177
- isolated to a single thread:
201
+ Tools in llm.rb can be defined as classes inheriting from `LLM::Tool` or as
202
+ closures using `LLM.function`. When the LLM requests a tool call, the context
203
+ stores `Function` objects in `ctx.functions`. The `call()` method executes all
204
+ pending functions and returns their results to the LLM. Tools describe
205
+ structured parameters with JSON Schema and adapt those definitions to each
206
+ provider's tool-calling format (OpenAI, Anthropic, Google, etc.):
178
207
 
179
208
  ```ruby
180
209
  #!/usr/bin/env ruby
181
210
  require "llm"
182
211
 
183
- llm = LLM.openai(key: ENV["KEY"])
184
- mcp = LLM::MCP.stdio(argv: ["npx", "-y", "@modelcontextprotocol/server-filesystem", Dir.pwd])
212
+ class System < LLM::Tool
213
+ name "system"
214
+ description "Run a shell command"
215
+ param :command, String, "Command to execute", required: true
185
216
 
186
- begin
187
- mcp.start
188
- ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
189
- ctx.talk("List the directories in this project.")
190
- ctx.talk(ctx.call(:functions)) while ctx.functions.any?
191
- ensure
192
- mcp.stop
217
+ def call(command:)
218
+ {success: system(command)}
219
+ end
193
220
  end
194
- ```
195
-
196
- You can also connect to an MCP server over HTTP. This is useful when the
197
- server already runs remotely and exposes MCP through a URL instead of a local
198
- process. If you expect repeated tool calls, use `persist!` to reuse a
199
- process-wide HTTP connection pool. This requires the optional
200
- `net-http-persistent` gem:
201
-
202
- ```ruby
203
- #!/usr/bin/env ruby
204
- require "llm"
205
221
 
206
222
  llm = LLM.openai(key: ENV["KEY"])
207
- mcp = LLM::MCP.http(
208
- url: "https://api.githubcopilot.com/mcp/",
209
- headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
210
- ).persist!
211
-
212
- begin
213
- mcp.start
214
- ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
215
- ctx.talk("List the available GitHub MCP toolsets.")
216
- ctx.talk(ctx.call(:functions)) while ctx.functions.any?
217
- ensure
218
- mcp.stop
219
- end
223
+ ctx = LLM::Context.new(llm, stream: $stdout, tools: [System])
224
+ ctx.talk("Run `date`.")
225
+ ctx.talk(ctx.call(:functions)) while ctx.functions.any?
220
226
  ```
221
227
 
222
- #### Simple Streaming
228
+ #### Concurrent Tools
223
229
 
224
- At the simplest level, any object that implements `#<<` can receive visible
225
- output as it arrives. This works with `$stdout`, `StringIO`, files, sockets,
226
- and other Ruby IO-style objects:
230
+ llm.rb provides explicit concurrency control for tool execution. The
231
+ `wait(:thread)` method spawns each pending function in its own thread and waits
232
+ for all to complete. You can also use `:fiber` for cooperative multitasking or
233
+ `:task` for async/await patterns (requires the `async` gem). The context
234
+ automatically collects all results and reports them back to the LLM in a
235
+ single turn, maintaining conversation flow while parallelizing independent
236
+ operations:
227
237
 
228
238
  ```ruby
229
239
  #!/usr/bin/env ruby
230
240
  require "llm"
231
241
 
232
242
  llm = LLM.openai(key: ENV["KEY"])
233
- ctx = LLM::Context.new(llm, stream: $stdout)
234
- loop do
235
- print "> "
236
- ctx.talk(STDIN.gets || break)
237
- puts
238
- end
243
+ ctx = LLM::Context.new(llm, stream: $stdout, tools: [FetchWeather, FetchNews, FetchStock])
244
+
245
+ # Execute multiple independent tools concurrently
246
+ ctx.talk("Summarize the weather, headlines, and stock price.")
247
+ ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
239
248
  ```
240
249
 
241
250
  #### Advanced Streaming
242
251
 
243
- llm.rb also supports the [`LLM::Stream`](lib/llm/stream.rb) interface for
244
- structured streaming events:
252
+ Use [`LLM::Stream`](lib/llm/stream.rb) when you want more than plain `#<<`
253
+ output. It adds structured streaming callbacks for:
245
254
 
246
255
  - `on_content` for visible assistant output
247
256
  - `on_reasoning_content` for separate reasoning output
248
257
  - `on_tool_call` for streamed tool-call notifications
258
+ - `on_tool_return` for completed tool execution
259
+
260
+ Subclass [`LLM::Stream`](lib/llm/stream.rb) when you want callbacks like
261
+ `on_reasoning_content`, `on_tool_call`, and `on_tool_return`, or helpers like
262
+ `queue` and `wait`.
249
263
 
250
- Subclass [`LLM::Stream`](lib/llm/stream.rb) when you want features like
251
- `queue` and `wait`, or implement the same methods on your own object. Keep these
252
- callbacks fast: they run inline with the parser.
264
+ Keep `on_content`, `on_reasoning_content`, and `on_tool_call` fast: they run
265
+ inline with the streaming parser. `on_tool_return` is different: it runs later,
266
+ when `wait` resolves queued streamed tool work.
253
267
 
254
268
  `on_tool_call` lets tools start before the model finishes its turn, for
255
269
  example with `tool.spawn(:thread)`, `tool.spawn(:fiber)`, or
256
- `tool.spawn(:task)`. This is the mechanism behind running tools while
257
- streaming.
270
+ `tool.spawn(:task)`. That can overlap tool latency with streaming output.
271
+ `on_tool_return` is the place to react when that queued work completes, for
272
+ example by updating progress UIs, logging tool latency, or changing visible
273
+ state from "Running tool ..." to "Finished tool ...".
258
274
 
259
- If a stream cannot execute a tool, `error` is an `LLM::Function::Return` that
260
- communicates the failure back to the LLM. That lets the tool-call path recover
261
- and keeps the session alive. It also leaves control in the callback: it can
262
- send `error`, spawn the tool when `error == nil`, or handle the situation
263
- however it sees fit.
275
+ If a stream cannot resolve a tool, `on_tool_call` receives `error` as an
276
+ `LLM::Function::Return`. That keeps the session alive and leaves control in
277
+ the callback: it can send `error`, spawn the tool when `error == nil`, or
278
+ handle the situation however it sees fit.
264
279
 
265
280
  In normal use this should be rare, since `on_tool_call` is usually called with
266
281
  a resolved tool and `error == nil`. To resolve a tool call, the tool must be
@@ -274,25 +289,22 @@ require "llm"
274
289
  # Assume `System < LLM::Tool` is already defined.
275
290
 
276
291
  class Stream < LLM::Stream
277
- attr_reader :content, :reasoning_content
278
-
279
- def initialize
280
- @content = +""
281
- @reasoning_content = +""
282
- end
283
-
284
292
  def on_content(content)
285
- @content << content
286
- print content
293
+ $stdout << content
287
294
  end
288
295
 
289
296
  def on_reasoning_content(content)
290
- @reasoning_content << content
297
+ $stderr << content
291
298
  end
292
299
 
293
300
  def on_tool_call(tool, error)
301
+ $stdout << "Running tool #{tool.name}\n"
294
302
  queue << (error || tool.spawn(:thread))
295
303
  end
304
+
305
+ def on_tool_return(tool, ret)
306
+ $stdout << (ret.error? ? "Tool #{tool.name} failed\n" : "Finished tool #{tool.name}\n")
307
+ end
296
308
  end
297
309
 
298
310
  llm = LLM.openai(key: ENV["KEY"])
@@ -304,69 +316,67 @@ while ctx.functions.any?
304
316
  end
305
317
  ```
306
318
 
307
- #### Tool Calling
319
+ #### MCP
308
320
 
309
- Tools in llm.rb can be defined as classes inheriting from `LLM::Tool` or as
310
- closures using `LLM.function`. When the LLM requests a tool call, the context
311
- stores `Function` objects in `ctx.functions`. The `call()` method executes all
312
- pending functions and returns their results to the LLM. Tools describe
313
- structured parameters with JSON Schema and adapt those definitions to each
314
- provider's tool-calling format (OpenAI, Anthropic, Google, etc.):
321
+ MCP is a first-class integration mechanism in llm.rb.
322
+
323
+ MCP allows llm.rb to treat external services, internal APIs, and system
324
+ capabilities as tools in a unified interface. This makes it possible to
325
+ connect multiple MCP sources simultaneously and expose your own APIs as tools.
326
+
327
+ In practice, this supports workflows such as external SaaS integrations,
328
+ multiple MCP sources in the same context, and OpenAPI -> MCP -> tools
329
+ pipelines for internal services.
330
+
331
+ llm.rb integrates with the Model Context Protocol (MCP) to dynamically discover
332
+ and use tools from external servers. This example starts a filesystem MCP
333
+ server over stdio and makes its tools available to a context, enabling the LLM
334
+ to interact with the local file system through a standardized interface.
335
+ Use `LLM::MCP.stdio` or `LLM::MCP.http` when you want to make the transport
336
+ explicit. Like `LLM::Context`, an MCP client is stateful and should remain
337
+ isolated to a single thread:
315
338
 
316
339
  ```ruby
317
340
  #!/usr/bin/env ruby
318
341
  require "llm"
319
342
 
320
- class System < LLM::Tool
321
- name "system"
322
- description "Run a shell command"
323
- param :command, String, "Command to execute", required: true
343
+ llm = LLM.openai(key: ENV["KEY"])
344
+ mcp = LLM::MCP.stdio(argv: ["npx", "-y", "@modelcontextprotocol/server-filesystem", Dir.pwd])
324
345
 
325
- def call(command:)
326
- {success: system(command)}
327
- end
346
+ begin
347
+ mcp.start
348
+ ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
349
+ ctx.talk("List the directories in this project.")
350
+ ctx.talk(ctx.call(:functions)) while ctx.functions.any?
351
+ ensure
352
+ mcp.stop
328
353
  end
329
-
330
- llm = LLM.openai(key: ENV["KEY"])
331
- ctx = LLM::Context.new(llm, stream: $stdout, tools: [System])
332
- ctx.talk("Run `date`.")
333
- ctx.talk(ctx.call(:functions)) while ctx.functions.any?
334
354
  ```
335
355
 
336
- #### Structured Outputs
337
-
338
- The `LLM::Schema` system lets you define JSON schemas for structured outputs.
339
- Schemas can be defined as classes with `property` declarations or built
340
- programmatically using a fluent interface. When you pass a schema to a context,
341
- llm.rb adapts it into the provider's structured-output format when that
342
- provider supports one. The `content!` method then parses the assistant's JSON
343
- response into a Ruby object:
356
+ You can also connect to an MCP server over HTTP. This is useful when the
357
+ server already runs remotely and exposes MCP through a URL instead of a local
358
+ process. If you expect repeated tool calls, use `persistent` to reuse a
359
+ process-wide HTTP connection pool. This requires the optional
360
+ `net-http-persistent` gem:
344
361
 
345
362
  ```ruby
346
363
  #!/usr/bin/env ruby
347
364
  require "llm"
348
- require "pp"
349
-
350
- class Report < LLM::Schema
351
- property :category, Enum["performance", "security", "outage"], "Report category", required: true
352
- property :summary, String, "Short summary", required: true
353
- property :impact, OneOf[String, Integer], "Primary impact, as text or a count", required: true
354
- property :services, Array[String], "Impacted services", required: true
355
- property :timestamp, String, "When it happened", optional: true
356
- end
357
365
 
358
366
  llm = LLM.openai(key: ENV["KEY"])
359
- ctx = LLM::Context.new(llm, schema: Report)
360
- res = ctx.talk("Structure this report: 'Database latency spiked at 10:42 UTC, causing 5% request timeouts for 12 minutes.'")
361
- pp res.content!
367
+ mcp = LLM::MCP.http(
368
+ url: "https://api.githubcopilot.com/mcp/",
369
+ headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
370
+ ).persistent
362
371
 
363
- # {
364
- # "category" => "performance",
365
- # "summary" => "Database latency spiked, causing 5% request timeouts for 12 minutes.",
366
- # "impact" => "5% request timeouts",
367
- # "services" => ["Database"],
368
- # "timestamp" => "2024-06-05T10:42:00Z"
369
- # }
372
+ begin
373
+ mcp.start
374
+ ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
375
+ ctx.talk("List the available GitHub MCP toolsets.")
376
+ ctx.talk(ctx.call(:functions)) while ctx.functions.any?
377
+ ensure
378
+ mcp.stop
379
+ end
370
380
  ```
371
381
 
372
382
  ## Providers
@@ -496,7 +506,7 @@ require "llm"
496
506
  LLM.json = :oj # Use Oj for faster JSON parsing
497
507
 
498
508
  # Enable HTTP connection pooling for high-throughput applications
499
- llm = LLM.openai(key: ENV["KEY"]).persist! # Uses net-http-persistent when available
509
+ llm = LLM.openai(key: ENV["KEY"]).persistent # Uses net-http-persistent when available
500
510
  ```
501
511
 
502
512
  #### Model Registry
@@ -542,11 +552,11 @@ res = ctx.talk("What is the capital of France?")
  puts res.content
  ```

- #### Context Persistence
+ #### Context Persistence: Vanilla

- Contexts can be serialized and restored across process boundaries. This makes
- it possible to persist conversation state in a file, database, or queue and
- resume work later:
+ Contexts can be serialized and restored across process boundaries. A context
+ can be serialized to JSON and stored on disk, in a database, in a job queue,
+ or anywhere else your application needs to persist state:

  ```ruby
  #!/usr/bin/env ruby
@@ -556,12 +566,79 @@ llm = LLM.openai(key: ENV["KEY"])
  ctx = LLM::Context.new(llm)
  ctx.talk("Hello")
  ctx.talk("Remember that my favorite language is Ruby")
- ctx.save(path: "context.json")
+
+ # Serialize to a string when you want to store the context yourself,
+ # for example in a database row or job payload.
+ payload = ctx.to_json

  restored = LLM::Context.new(llm)
- restored.restore(path: "context.json")
+ restored.restore(string: payload)
  res = restored.talk("What is my favorite language?")
  puts res.content
+
+ # You can also persist the same state to a file:
+ ctx.save(path: "context.json")
+ restored = LLM::Context.new(llm)
+ restored.restore(path: "context.json")
+ ```
+
+ #### Context Persistence: ActiveRecord (Rails)
+
+ In a Rails application, you can also wrap persisted context state in an
+ ActiveRecord model. A minimal schema would include a `snapshot` column for the
+ serialized context payload (`jsonb` is recommended) and a `provider` column
+ for the provider name:
+
+ ```ruby
+ create_table :contexts do |t|
+ t.jsonb :snapshot
+ t.string :provider, null: false
+ t.timestamps
+ end
+ ```
+
+ For example:
+
+ ```ruby
+ class Context < ApplicationRecord
+ def talk(...)
+ ctx.talk(...).tap { flush }
+ end
+
+ def wait(...)
+ ctx.wait(...).tap { flush }
+ end
+
+ def messages
+ ctx.messages
+ end
+
+ def model
+ ctx.model
+ end
+
+ def flush
+ update_column(:snapshot, ctx.to_json)
+ end
+
+ private
+
+ def ctx
+ @ctx ||= begin
+ ctx = LLM::Context.new(llm)
+ ctx.restore(string: snapshot) if snapshot
+ ctx
+ end
+ end
+
+ def llm
+ LLM.method(provider).call(key: ENV.fetch(key))
+ end
+
+ def key
+ "#{provider.upcase}_KEY"
+ end
+ end
  ```
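A brief usage sketch for the ActiveRecord wrapper above; the `contexts` table, the `Context` model, and the `OPENAI_KEY` environment variable are assumptions carried over from that example rather than additional llm.rb API:

```ruby
# Hypothetical usage of the Context model sketched above. Every #talk
# re-serializes the underlying context into the `snapshot` column via #flush.
record = Context.create!(provider: "openai")
record.talk("Remember that my favorite language is Ruby")

# Later, possibly in another process, the same row resumes the conversation.
record = Context.find(record.id)
puts record.talk("What is my favorite language?").content
```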

  #### Agents
@@ -9,11 +9,17 @@ class LLM::Function
9
9
  # @return [Object]
10
10
  attr_reader :task
11
11
 
12
+ ##
13
+ # @return [LLM::Function, nil]
14
+ attr_reader :function
15
+
12
16
  ##
13
17
  # @param [Thread, Fiber, Async::Task] task
18
+ # @param [LLM::Function, nil] function
14
19
  # @return [LLM::Function::Task]
15
- def initialize(task)
20
+ def initialize(task, function = nil)
16
21
  @task = task
22
+ @function = function
17
23
  end
18
24
 
19
25
  ##
data/lib/llm/function.rb CHANGED
@@ -41,6 +41,13 @@ class LLM::Function
  prepend LLM::Function::Tracing

  Return = Struct.new(:id, :name, :value) do
+ ##
+ # Returns true when the return value represents an error.
+ # @return [Boolean]
+ def error?
+ Hash === value && value[:error] == true
+ end
+
  ##
  # Returns a Hash representation of {LLM::Function::Return}
  # @return [Hash]
@@ -186,7 +193,7 @@ class LLM::Function
  else
  raise ArgumentError, "Unknown strategy: #{strategy.inspect}. Expected :thread, :task, or :fiber"
  end
- Task.new(task)
+ Task.new(task, self)
  ensure
  @called = true
  end
@@ -233,7 +240,11 @@ class LLM::Function
  when "LLM::Google"
  {name: @name, description: @description, parameters: @params}.compact
  when "LLM::Anthropic"
- {name: @name, description: @description, input_schema: @params}.compact
+ {
+ name: @name,
+ description: @description,
+ input_schema: @params || {type: "object", properties: {}}
+ }.compact
  else
  format_openai(provider)
  end
@@ -104,7 +104,7 @@ module LLM::MCP::Transport
104
104
  # Configures the transport to use a persistent HTTP connection pool
105
105
  # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
106
106
  # @example
107
- # mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persist!
107
+ # mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persistent
108
108
  # # do something with 'mcp'
109
109
  # @return [LLM::MCP::Transport::HTTP]
110
110
  def persist!
@@ -119,6 +119,7 @@ module LLM::MCP::Transport
119
119
  end
120
120
  self
121
121
  end
122
+ alias_method :persistent, :persist!
122
123
 
123
124
  private
124
125
 
@@ -84,6 +84,7 @@ module LLM::MCP::Transport
84
84
  def persist!
85
85
  self
86
86
  end
87
+ alias_method :persistent, :persist!
87
88
 
88
89
  private
89
90
 
data/lib/llm/mcp.rb CHANGED
@@ -104,13 +104,14 @@ class LLM::MCP
104
104
  # Configures an HTTP MCP transport to use a persistent connection pool
105
105
  # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
106
106
  # @example
107
- # mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persist!
107
+ # mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persistent
108
108
  # # do something with 'mcp'
109
109
  # @return [LLM::MCP]
110
110
  def persist!
111
111
  transport.persist!
112
112
  self
113
113
  end
114
+ alias_method :persistent, :persist!
114
115
 
115
116
  ##
116
117
  # Returns the tools provided by the MCP process.
data/lib/llm/provider.rb CHANGED
@@ -308,7 +308,7 @@ class LLM::Provider
  # This method configures a provider to use a persistent connection pool
  # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
  # @example
- # llm = LLM.openai(key: ENV["KEY"]).persist!
+ # llm = LLM.openai(key: ENV["KEY"]).persistent
  # # do something with 'llm'
  # @return [LLM::Provider]
  def persist!
@@ -317,14 +317,13 @@ class LLM::Provider
  tap { @client = client }
  end
  end
+ alias_method :persistent, :persist!

  ##
  # @param [Object] stream
  # @return [Boolean]
  def streamable?(stream)
- stream.respond_to?(:on_content) ||
- stream.respond_to?(:on_reasoning_content) ||
- stream.respond_to?(:<<)
+ LLM::Stream === stream || stream.respond_to?(:<<)
  end

  private
@@ -28,12 +28,19 @@ module LLM::Anthropic::RequestAdapter
28
28
 
29
29
  def adapt_message
30
30
  if message.tool_call?
31
- {role: message.role, content: message.extra[:original_tool_calls]}
31
+ {role: message.role, content: adapt_tool_calls}
32
32
  else
33
33
  {role: message.role, content: adapt_content(content)}
34
34
  end
35
35
  end
36
36
 
37
+ def adapt_tool_calls
38
+ message.extra[:tool_calls].filter_map do |tool|
39
+ next unless tool[:id] && tool[:name]
40
+ {type: "tool_use", id: tool[:id], name: tool[:name], input: LLM::Anthropic.parse_tool_input(tool[:arguments])}
41
+ end
42
+ end
43
+
37
44
  ##
38
45
  # @param [String, URI] content
39
46
  # The content to format
@@ -66,7 +66,8 @@ module LLM::Anthropic::ResponseAdapter
66
66
  private
67
67
 
68
68
  def adapt_choices
69
- texts.map.with_index do |choice, index|
69
+ source = texts.empty? && tools.any? ? [{"text" => ""}] : texts
70
+ source.map.with_index do |choice, index|
70
71
  extra = {
71
72
  index:, response: self,
72
73
  tool_calls: adapt_tool_calls(tools), original_tool_calls: tools
@@ -77,7 +78,11 @@ module LLM::Anthropic::ResponseAdapter
77
78
 
78
79
  def adapt_tool_calls(tools)
79
80
  (tools || []).filter_map do |tool|
80
- {id: tool.id, name: tool.name, arguments: tool.input}
81
+ {
82
+ id: tool.id,
83
+ name: tool.name,
84
+ arguments: LLM::Anthropic.parse_tool_input(tool.input)
85
+ }
81
86
  end
82
87
  end
83
88
 
@@ -105,7 +105,7 @@ class LLM::Anthropic
105
105
  registered = LLM::Function.find_by_name(tool["name"])
106
106
  fn = (registered || LLM::Function.new(tool["name"])).dup.tap do |fn|
107
107
  fn.id = tool["id"]
108
- fn.arguments = tool["input"]
108
+ fn.arguments = LLM::Anthropic.parse_tool_input(tool["input"])
109
109
  end
110
110
  [fn, (registered ? nil : @stream.tool_not_found(fn))]
111
111
  end
data/lib/llm/providers/anthropic/utils.rb ADDED
@@ -0,0 +1,23 @@
+ # frozen_string_literal: true
+
+ class LLM::Anthropic
+ module Utils
+ ##
+ # Normalizes Anthropic tool input to a Hash suitable for kwargs.
+ # @param input [Hash, String, nil]
+ # @return [Hash]
+ def parse_tool_input(input)
+ case input
+ when Hash then input
+ when String
+ parsed = LLM.json.load(input)
+ Hash === parsed ? parsed : {}
+ when nil then {}
+ else
+ input.respond_to?(:to_h) ? input.to_h : {}
+ end
+ rescue *LLM.json.parser_error
+ {}
+ end
+ end
+ end
@@ -14,6 +14,7 @@ module LLM
14
14
  # ctx.talk ["Tell me about this photo", ctx.local_file("/images/photo.png")]
15
15
  # ctx.messages.select(&:assistant?).each { print "[#{_1.role}]", _1.content, "\n" }
16
16
  class Anthropic < Provider
17
+ require_relative "anthropic/utils"
17
18
  require_relative "anthropic/error_handler"
18
19
  require_relative "anthropic/request_adapter"
19
20
  require_relative "anthropic/response_adapter"
@@ -21,6 +22,7 @@ module LLM
21
22
  require_relative "anthropic/models"
22
23
  require_relative "anthropic/files"
23
24
  include RequestAdapter
25
+ extend Utils
24
26
 
25
27
  HOST = "api.anthropic.com"
26
28
 
@@ -79,6 +81,15 @@ module LLM
79
81
  "assistant"
80
82
  end
81
83
 
84
+ ##
85
+ # Anthropic expects tool results to be sent as user messages
86
+ # containing `tool_result` content blocks rather than a distinct
87
+ # `tool` role.
88
+ # @return (see LLM::Provider#tool_role)
89
+ def tool_role
90
+ :user
91
+ end
92
+
82
93
  ##
83
94
  # Returns the default model for chat completions
84
95
  # @see https://docs.anthropic.com/en/docs/about-claude/models/all-models#model-comparison-table claude-sonnet-4-20250514
data/lib/llm/stream/queue.rb CHANGED
@@ -8,8 +8,10 @@ class LLM::Stream
  # returns an array of {LLM::Function::Return} values.
  class Queue
  ##
+ # @param [LLM::Stream] stream
  # @return [LLM::Stream::Queue]
- def initialize
+ def initialize(stream)
+ @stream = stream
  @items = []
  end

@@ -39,13 +41,24 @@
  # @return [Array<LLM::Function::Return>]
  def wait(strategy)
  returns, tasks = @items.shift(@items.length).partition { LLM::Function::Return === _1 }
- returns.concat case strategy
+ results = case strategy
  when :thread then LLM::Function::ThreadGroup.new(tasks).wait
  when :task then LLM::Function::TaskGroup.new(tasks).wait
  when :fiber then LLM::Function::FiberGroup.new(tasks).wait
  else raise ArgumentError, "Unknown strategy: #{strategy.inspect}. Expected :thread, :task, or :fiber"
  end
+ returns.concat fire_hooks(tasks, results)
  end
  alias_method :value, :wait
+
+ private
+
+ def fire_hooks(tasks, results)
+ results.each_with_index do |ret, idx|
+ tool = tasks[idx]&.function
+ @stream.on_tool_return(tool, ret) if tool
+ end
+ results
+ end
  end
  end
data/lib/llm/stream.rb CHANGED
@@ -5,20 +5,20 @@ module LLM
  # The {LLM::Stream LLM::Stream} class provides the callback interface for
  # streamed model output in llm.rb.
  #
- # A stream object can be an instance of {LLM::Stream LLM::Stream}, a
- # subclass that overrides the callbacks it needs, or any other object that
- # implements some or all of the same interface. {#queue} provides a small
- # helper for collecting asynchronous tool work started from a callback, and
- # {#tool_not_found} returns an in-band tool error when a streamed tool
- # cannot be resolved.
+ # A stream object can be an instance of {LLM::Stream LLM::Stream} or a
+ # subclass that overrides the callbacks it needs. For basic streaming,
+ # llm.rb also accepts any object that implements `#<<`. {#queue} provides
+ # a small helper for collecting asynchronous tool work started from a
+ # callback, and {#tool_not_found} returns an in-band tool error when a
+ # streamed tool cannot be resolved.
  #
  # @note The `on_*` callbacks run inline with the streaming parser. They
  # therefore block streaming progress and should generally return as
  # quickly as possible.
  #
- # The most common callback is {#on_content}, which also maps to {#<<} for
- # compatibility with `StringIO`-style objects. Providers may also call
- # {#on_reasoning_content} and {#on_tool_call} when that data is available.
+ # The most common callback is {#on_content}, which also maps to {#<<}.
+ # Providers may also call {#on_reasoning_content} and {#on_tool_call} when
+ # that data is available.
  class Stream
  require_relative "stream/queue"

@@ -26,7 +26,7 @@ module LLM
  # Returns a lazily-initialized queue for tool results or spawned work.
  # @return [LLM::Stream::Queue]
  def queue
- @queue ||= Queue.new
+ @queue ||= Queue.new(self)
  end

  ##
@@ -79,6 +79,20 @@ module LLM
  nil
  end

+ ##
+ # Called when queued streamed tool work returns.
+ # @note This callback runs when {#wait} resolves work that was queued from
+ # {#on_tool_call}, such as values returned by `tool.spawn(:thread)`,
+ # `tool.spawn(:fiber)`, or `tool.spawn(:task)`.
+ # @param [LLM::Function] tool
+ # The tool that returned.
+ # @param [LLM::Function::Return] ret
+ # The completed tool return.
+ # @return [nil]
+ def on_tool_return(tool, ret)
+ nil
+ end
+
  # @endgroup

  # @group Error handlers
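As a usage sketch of the new callback pair documented above: the stream below measures how long each streamed tool call takes by recording a timestamp in `on_tool_call` and reading it back in `on_tool_return`. The custom `initialize` without `super` mirrors the README's earlier streaming example and is an assumption about the base class, not documented API:

```ruby
class TimingStream < LLM::Stream
  def initialize
    @started_at = {}
  end

  def on_content(content)
    $stdout << content
  end

  def on_tool_call(tool, error)
    # Record when the streamed tool call was spawned, keyed by tool call id.
    @started_at[tool.id] = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    queue << (error || tool.spawn(:thread))
  end

  def on_tool_return(tool, ret)
    # Runs later, when the queued tool work is resolved via wait.
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - @started_at.fetch(tool.id, 0.0)
    warn format("%s %s in %.2fs", tool.name, ret.error? ? "failed" : "finished", elapsed)
  end
end
```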
@@ -126,7 +126,7 @@ module LLM
126
126
  "gen_ai.operation.name" => "execute_tool",
127
127
  "gen_ai.request.model" => model,
128
128
  "gen_ai.tool.call.id" => id,
129
- "gen_ai.tool.name" => name,
129
+ "gen_ai.tool.name" => name&.to_s,
130
130
  "gen_ai.tool.call.arguments" => LLM.json.dump(arguments),
131
131
  "gen_ai.provider.name" => provider_name,
132
132
  "server.address" => provider_host,
@@ -145,7 +145,7 @@ module LLM
145
145
  return nil unless span
146
146
  attributes = {
147
147
  "gen_ai.tool.call.id" => result.id,
148
- "gen_ai.tool.name" => result.name,
148
+ "gen_ai.tool.name" => result.name&.to_s,
149
149
  "gen_ai.tool.call.result" => LLM.json.dump(result.value)
150
150
  }.compact
151
151
  attributes.each { span.set_attribute(_1, _2) }
data/lib/llm/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module LLM
- VERSION = "4.11.0"
+ VERSION = "4.12.0"
  end
data/llm.gemspec CHANGED
@@ -8,47 +8,15 @@ Gem::Specification.new do |spec|
  spec.authors = ["Antar Azri", "0x1eef", "Christos Maris", "Rodrigo Serrano"]
  spec.email = ["azantar@proton.me", "0x1eef@hardenedbsd.org"]

- spec.summary = <<~SUMMARY
- llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
- LLMs are part of your architecture, not just API calls. It gives you explicit
- control over contexts, tools, concurrency, and providers, so you can compose
- reliable, production-ready workflows without hidden abstractions.
- SUMMARY
+ spec.summary = "System integration layer for LLMs, tools, MCP, and APIs in Ruby."

  spec.description = <<~DESCRIPTION
- llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
- LLMs are part of your architecture, not just API calls. It gives you explicit
- control over contexts, tools, concurrency, and providers, so you can compose
- reliable, production-ready workflows without hidden abstractions.
-
- Built for engineers who want to understand and control their LLM systems. No
- frameworks, no hidden magic — just composable primitives for building real
- applications, from scripts to full systems like Relay.
-
- ## Key Features
-
- - **Contexts are central** — Hold history, tools, schema, usage, cost, persistence, and execution state
- - **Tool execution is explicit** — Run local, provider-native, and MCP tools sequentially or concurrently
- - **One API across providers** — Unified interface for OpenAI, Anthropic, Google, xAI, zAI, DeepSeek, Ollama, and LlamaCpp
- - **Thread-safe where it matters** — Providers are shareable, while contexts stay isolated and stateful
- - **Production-ready** — Cost tracking, observability, persistence, and performance tuning built in
- - **Stdlib-only by default** — Runs on Ruby standard library, with optional features loaded only when used
-
- ## Capabilities
-
- - Chat & Contexts with persistence
- - Streaming responses
- - Tool calling with JSON Schema validation
- - Concurrent execution (threads, fibers, async tasks)
- - Agents with auto-execution
- - Structured outputs
- - MCP (Model Context Protocol) support
- - Multimodal inputs (text, images, audio, documents)
- - Audio generation, transcription, translation
- - Image generation and editing
- - Files API for document processing
- - Embeddings and vector stores
- - Local model registry for capabilities, limits, and pricing
+ llm.rb is a Ruby-centric system integration layer for building LLM-powered
+ systems. It connects LLMs to real systems by turning APIs into tools and
+ unifying MCP, providers, contexts, and application logic in one execution
+ model. It supports explicit tool orchestration, concurrent execution,
+ streaming, multiple MCP sources, and multiple LLM providers for production
+ systems that integrate external and internal services.
  DESCRIPTION

  spec.license = "0BSD"
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: llm.rb
3
3
  version: !ruby/object:Gem::Version
4
- version: 4.11.0
4
+ version: 4.12.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Antar Azri
@@ -195,39 +195,12 @@ dependencies:
195
195
  - !ruby/object:Gem::Version
196
196
  version: '1.7'
197
197
  description: |
198
- llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
199
- LLMs are part of your architecture, not just API calls. It gives you explicit
200
- control over contexts, tools, concurrency, and providers, so you can compose
201
- reliable, production-ready workflows without hidden abstractions.
202
-
203
- Built for engineers who want to understand and control their LLM systems. No
204
- frameworks, no hidden magic — just composable primitives for building real
205
- applications, from scripts to full systems like Relay.
206
-
207
- ## Key Features
208
-
209
- - **Contexts are central** — Hold history, tools, schema, usage, cost, persistence, and execution state
210
- - **Tool execution is explicit** — Run local, provider-native, and MCP tools sequentially or concurrently
211
- - **One API across providers** — Unified interface for OpenAI, Anthropic, Google, xAI, zAI, DeepSeek, Ollama, and LlamaCpp
212
- - **Thread-safe where it matters** — Providers are shareable, while contexts stay isolated and stateful
213
- - **Production-ready** — Cost tracking, observability, persistence, and performance tuning built in
214
- - **Stdlib-only by default** — Runs on Ruby standard library, with optional features loaded only when used
215
-
216
- ## Capabilities
217
-
218
- - Chat & Contexts with persistence
219
- - Streaming responses
220
- - Tool calling with JSON Schema validation
221
- - Concurrent execution (threads, fibers, async tasks)
222
- - Agents with auto-execution
223
- - Structured outputs
224
- - MCP (Model Context Protocol) support
225
- - Multimodal inputs (text, images, audio, documents)
226
- - Audio generation, transcription, translation
227
- - Image generation and editing
228
- - Files API for document processing
229
- - Embeddings and vector stores
230
- - Local model registry for capabilities, limits, and pricing
198
+ llm.rb is a Ruby-centric system integration layer for building LLM-powered
199
+ systems. It connects LLMs to real systems by turning APIs into tools and
200
+ unifying MCP, providers, contexts, and application logic in one execution
201
+ model. It supports explicit tool orchestration, concurrent execution,
202
+ streaming, multiple MCP sources, and multiple LLM providers for production
203
+ systems that integrate external and internal services.
231
204
  email:
232
205
  - azantar@proton.me
233
206
  - 0x1eef@hardenedbsd.org
@@ -300,6 +273,7 @@ files:
300
273
  - lib/llm/providers/anthropic/response_adapter/models.rb
301
274
  - lib/llm/providers/anthropic/response_adapter/web_search.rb
302
275
  - lib/llm/providers/anthropic/stream_parser.rb
276
+ - lib/llm/providers/anthropic/utils.rb
303
277
  - lib/llm/providers/deepseek.rb
304
278
  - lib/llm/providers/deepseek/request_adapter.rb
305
279
  - lib/llm/providers/deepseek/request_adapter/completion.rb
@@ -417,8 +391,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
417
391
  requirements: []
418
392
  rubygems_version: 3.6.9
419
393
  specification_version: 4
420
- summary: llm.rb is a Ruby-centric toolkit for building real LLM-powered systems
421
- where LLMs are part of your architecture, not just API calls. It gives you explicit
422
- control over contexts, tools, concurrency, and providers, so you can compose reliable,
423
- production-ready workflows without hidden abstractions.
394
+ summary: System integration layer for LLMs, tools, MCP, and APIs in Ruby.
424
395
  test_files: []