llm.rb 4.11.0 → 4.11.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a2af34506e099996b451951da8fb892ecdacebe9f29217bbf7a9e3ee3382d942
- data.tar.gz: f49edb6d166ae113618139f0b118f37acbbd001b9b256d76d5c66b2828915a88
+ metadata.gz: f4c449483ce7a3b53411760d6376157fed3e23b4f013f23ae397255398bef368
+ data.tar.gz: a9a9c82b107cde72edfe6fe5f68ea7b1ea5e493314883d101c453a94db81b601
  SHA512:
- metadata.gz: 8dbdbde04bf04fd714ce5ab3689f078f6a77243853bdb7ea287124295b2a5b5878493a36e4ec0c703a10466306f13ca503de9132b2a8a31c2c39b2f721b1bf78
- data.tar.gz: 5bcb9be7c664bbee548cdc305878bc62fe1c8b5ab23d64630719084dab3581b8f4abf875a235a0e33ee05430cda8d69b0b6cc8fce538abafa4e8f85bbbbaead0
+ metadata.gz: 71a389b2fe654cfd053f45bd749c34b96c9d89ac60e984960f4a2720896588ba39056a3a92ab75a429572cd099961d9f3c02474f7dc43460b59866e41d8b5f28
+ data.tar.gz: 4532ec55176751b32ed21b281f2f71395dcd32cdf318973a751decf171af0a9e5f3f75b75871542c578fd9a2a134f8fc5cbf6a54b1df3b2dbe0c47745122b900
data/CHANGELOG.md CHANGED
@@ -2,9 +2,9 @@
 
  ## Unreleased
 
- Changes since `v4.11.0`.
+ Changes since `v4.11.1`.
 
- ## v4.11.0
+ ## v4.11.1
 
  Changes since `v4.10.0`.
 
data/README.md CHANGED
@@ -4,7 +4,7 @@
  <p align="center">
  <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
  <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
- <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.11.0-green.svg?" alt="Version"></a>
+ <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.11.1-green.svg?" alt="Version"></a>
  </p>
 
  ## About
@@ -30,11 +30,14 @@ llm.rb is built around the state and execution model around them:
  - **Contexts are central** <br>
  They hold history, tools, schema, usage, cost, persistence, and execution state.
+ - **Contexts can be serialized** <br>
+ A context can be serialized to JSON and stored on disk, in a database, in a
+ job queue, or anywhere else your application needs to persist state.
  - **Tool execution is explicit** <br>
  Run local, provider-native, and MCP tools sequentially or concurrently with threads, fibers, or async tasks.
  - **Run tools while streaming** <br>
  Start tool work while a response is still streaming instead of waiting for the turn to finish. <br>
- This lets tool latency overlap with model output and is one of llm.rb's strongest execution features.
+ This overlaps tool latency with model output and exposes streamed tool-call events for introspection, making it one of llm.rb's strongest execution features.
  - **HTTP MCP can reuse connections** <br>
  Opt into persistent HTTP pooling for repeated remote MCP tool calls with `persist!`.
  - **One API across providers and capabilities** <br>
@@ -100,142 +103,114 @@ llm.rb provides a complete set of primitives for building LLM-powered systems:
 
  ## Quick Start
 
- #### Run Tools While Streaming
-
- llm.rb can start tool execution from streamed tool-call events before the
- assistant turn is fully finished. That means tool latency can overlap with
- streaming output instead of happening strictly after it. If your model emits
- tool calls early, this can noticeably reduce end-to-end latency for real
- systems.
+ #### Simple Streaming
 
- This is different from plain concurrent tool execution. The tool starts while
- the response is still arriving, not after the turn has fully completed.
+ At the simplest level, any object that implements `#<<` can receive visible
+ output as it arrives. This works with `$stdout`, `StringIO`, files, sockets,
+ and other Ruby IO-style objects.
 
- For example:
+ For more control, llm.rb also supports advanced streaming patterns through
+ [`LLM::Stream`](lib/llm/stream.rb). See [Advanced Streaming](#advanced-streaming)
+ for a structured callback-based example:
 
  ```ruby
  #!/usr/bin/env ruby
  require "llm"
 
- class System < LLM::Tool
- name "system"
- description "Run a shell command"
- params { _1.object(command: _1.string.required) }
-
- def call(command:)
- {success: Kernel.system(command)}
- end
- end
-
- class Stream < LLM::Stream
- def on_content(content)
- print content
- end
-
- def on_tool_call(tool, error)
- queue << (error || tool.spawn(:thread))
- end
- end
-
  llm = LLM.openai(key: ENV["KEY"])
- ctx = LLM::Context.new(llm, stream: Stream.new, tools: [System])
-
- ctx.talk("Run `date` and tell me what command you ran.")
- ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
+ ctx = LLM::Context.new(llm, stream: $stdout)
+ loop do
+ print "> "
+ ctx.talk(STDIN.gets || break)
+ puts
+ end
  ```
 
- #### Concurrent Tools
+ #### Structured Outputs
 
- llm.rb provides explicit concurrency control for tool execution. The
- `wait(:thread)` method spawns each pending function in its own thread and waits
- for all to complete. You can also use `:fiber` for cooperative multitasking or
- `:task` for async/await patterns (requires the `async` gem). The context
- automatically collects all results and reports them back to the LLM in a
- single turn, maintaining conversation flow while parallelizing independent
- operations:
+ The `LLM::Schema` system lets you define JSON schemas for structured outputs.
+ Schemas can be defined as classes with `property` declarations or built
+ programmatically using a fluent interface. When you pass a schema to a context,
+ llm.rb adapts it into the provider's structured-output format when that
+ provider supports one. The `content!` method then parses the assistant's JSON
+ response into a Ruby object:
 
  ```ruby
  #!/usr/bin/env ruby
  require "llm"
+ require "pp"
+
+ class Report < LLM::Schema
+ property :category, Enum["performance", "security", "outage"], "Report category", required: true
+ property :summary, String, "Short summary", required: true
+ property :impact, OneOf[String, Integer], "Primary impact, as text or a count", required: true
+ property :services, Array[String], "Impacted services", required: true
+ property :timestamp, String, "When it happened", optional: true
+ end
 
  llm = LLM.openai(key: ENV["KEY"])
- ctx = LLM::Context.new(llm, stream: $stdout, tools: [FetchWeather, FetchNews, FetchStock])
+ ctx = LLM::Context.new(llm, schema: Report)
+ res = ctx.talk("Structure this report: 'Database latency spiked at 10:42 UTC, causing 5% request timeouts for 12 minutes.'")
+ pp res.content!
 
- # Execute multiple independent tools concurrently
- ctx.talk("Summarize the weather, headlines, and stock price.")
- ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
+ # {
+ # "category" => "performance",
+ # "summary" => "Database latency spiked, causing 5% request timeouts for 12 minutes.",
+ # "impact" => "5% request timeouts",
+ # "services" => ["Database"],
+ # "timestamp" => "2024-06-05T10:42:00Z"
+ # }
  ```
 
- #### MCP
+ #### Tool Calling
 
- llm.rb integrates with the Model Context Protocol (MCP) to dynamically discover
- and use tools from external servers. This example starts a filesystem MCP
- server over stdio and makes its tools available to a context, enabling the LLM
- to interact with the local file system through a standardized interface.
- Use `LLM::MCP.stdio` or `LLM::MCP.http` when you want to make the transport
- explicit. Like `LLM::Context`, an MCP client is stateful and should remain
- isolated to a single thread:
+ Tools in llm.rb can be defined as classes inheriting from `LLM::Tool` or as
+ closures using `LLM.function`. When the LLM requests a tool call, the context
+ stores `Function` objects in `ctx.functions`. The `call()` method executes all
+ pending functions and returns their results to the LLM. Tools describe
+ structured parameters with JSON Schema and adapt those definitions to each
+ provider's tool-calling format (OpenAI, Anthropic, Google, etc.):
 
  ```ruby
  #!/usr/bin/env ruby
  require "llm"
 
- llm = LLM.openai(key: ENV["KEY"])
- mcp = LLM::MCP.stdio(argv: ["npx", "-y", "@modelcontextprotocol/server-filesystem", Dir.pwd])
+ class System < LLM::Tool
+ name "system"
+ description "Run a shell command"
+ param :command, String, "Command to execute", required: true
 
- begin
- mcp.start
- ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
- ctx.talk("List the directories in this project.")
- ctx.talk(ctx.call(:functions)) while ctx.functions.any?
- ensure
- mcp.stop
+ def call(command:)
+ {success: system(command)}
+ end
  end
- ```
-
- You can also connect to an MCP server over HTTP. This is useful when the
- server already runs remotely and exposes MCP through a URL instead of a local
- process. If you expect repeated tool calls, use `persist!` to reuse a
- process-wide HTTP connection pool. This requires the optional
- `net-http-persistent` gem:
-
- ```ruby
- #!/usr/bin/env ruby
- require "llm"
 
  llm = LLM.openai(key: ENV["KEY"])
- mcp = LLM::MCP.http(
- url: "https://api.githubcopilot.com/mcp/",
- headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
- ).persist!
-
- begin
- mcp.start
- ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
- ctx.talk("List the available GitHub MCP toolsets.")
- ctx.talk(ctx.call(:functions)) while ctx.functions.any?
- ensure
- mcp.stop
- end
+ ctx = LLM::Context.new(llm, stream: $stdout, tools: [System])
+ ctx.talk("Run `date`.")
+ ctx.talk(ctx.call(:functions)) while ctx.functions.any?
  ```
 
- #### Simple Streaming
+ #### Concurrent Tools
 
- At the simplest level, any object that implements `#<<` can receive visible
- output as it arrives. This works with `$stdout`, `StringIO`, files, sockets,
- and other Ruby IO-style objects:
+ llm.rb provides explicit concurrency control for tool execution. The
+ `wait(:thread)` method spawns each pending function in its own thread and waits
+ for all to complete. You can also use `:fiber` for cooperative multitasking or
+ `:task` for async/await patterns (requires the `async` gem). The context
+ automatically collects all results and reports them back to the LLM in a
+ single turn, maintaining conversation flow while parallelizing independent
+ operations:
 
  ```ruby
  #!/usr/bin/env ruby
  require "llm"
 
  llm = LLM.openai(key: ENV["KEY"])
- ctx = LLM::Context.new(llm, stream: $stdout)
- loop do
- print "> "
- ctx.talk(STDIN.gets || break)
- puts
- end
+ ctx = LLM::Context.new(llm, stream: $stdout, tools: [FetchWeather, FetchNews, FetchStock])
+
+ # Execute multiple independent tools concurrently
+ ctx.talk("Summarize the weather, headlines, and stock price.")
+ ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
  ```
 
  #### Advanced Streaming
@@ -253,10 +228,11 @@ callbacks fast: they run inline with the parser.
 
  `on_tool_call` lets tools start before the model finishes its turn, for
  example with `tool.spawn(:thread)`, `tool.spawn(:fiber)`, or
- `tool.spawn(:task)`. This is the mechanism behind running tools while
- streaming.
+ `tool.spawn(:task)`. That can overlap tool latency with streaming output and
+ gives you a first-class place to observe and instrument tool-call execution as
+ it unfolds.
 
- If a stream cannot execute a tool, `error` is an `LLM::Function::Return` that
+ If a stream cannot resolve a tool, `error` is an `LLM::Function::Return` that
  communicates the failure back to the LLM. That lets the tool-call path recover
  and keeps the session alive. It also leaves control in the callback: it can
  send `error`, spawn the tool when `error == nil`, or handle the situation
@@ -304,69 +280,57 @@ while ctx.functions.any?
  end
  ```
 
- #### Tool Calling
+ #### MCP
 
- Tools in llm.rb can be defined as classes inheriting from `LLM::Tool` or as
- closures using `LLM.function`. When the LLM requests a tool call, the context
- stores `Function` objects in `ctx.functions`. The `call()` method executes all
- pending functions and returns their results to the LLM. Tools describe
- structured parameters with JSON Schema and adapt those definitions to each
- provider's tool-calling format (OpenAI, Anthropic, Google, etc.):
+ llm.rb integrates with the Model Context Protocol (MCP) to dynamically discover
+ and use tools from external servers. This example starts a filesystem MCP
+ server over stdio and makes its tools available to a context, enabling the LLM
+ to interact with the local file system through a standardized interface.
+ Use `LLM::MCP.stdio` or `LLM::MCP.http` when you want to make the transport
+ explicit. Like `LLM::Context`, an MCP client is stateful and should remain
+ isolated to a single thread:
 
  ```ruby
  #!/usr/bin/env ruby
  require "llm"
 
- class System < LLM::Tool
- name "system"
- description "Run a shell command"
- param :command, String, "Command to execute", required: true
+ llm = LLM.openai(key: ENV["KEY"])
+ mcp = LLM::MCP.stdio(argv: ["npx", "-y", "@modelcontextprotocol/server-filesystem", Dir.pwd])
 
- def call(command:)
- {success: system(command)}
- end
+ begin
+ mcp.start
+ ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
+ ctx.talk("List the directories in this project.")
+ ctx.talk(ctx.call(:functions)) while ctx.functions.any?
+ ensure
+ mcp.stop
  end
-
- llm = LLM.openai(key: ENV["KEY"])
- ctx = LLM::Context.new(llm, stream: $stdout, tools: [System])
- ctx.talk("Run `date`.")
- ctx.talk(ctx.call(:functions)) while ctx.functions.any?
  ```
 
- #### Structured Outputs
-
- The `LLM::Schema` system lets you define JSON schemas for structured outputs.
- Schemas can be defined as classes with `property` declarations or built
- programmatically using a fluent interface. When you pass a schema to a context,
- llm.rb adapts it into the provider's structured-output format when that
- provider supports one. The `content!` method then parses the assistant's JSON
- response into a Ruby object:
+ You can also connect to an MCP server over HTTP. This is useful when the
+ server already runs remotely and exposes MCP through a URL instead of a local
+ process. If you expect repeated tool calls, use `persist!` to reuse a
+ process-wide HTTP connection pool. This requires the optional
+ `net-http-persistent` gem:
 
  ```ruby
  #!/usr/bin/env ruby
  require "llm"
- require "pp"
-
- class Report < LLM::Schema
- property :category, Enum["performance", "security", "outage"], "Report category", required: true
- property :summary, String, "Short summary", required: true
- property :impact, OneOf[String, Integer], "Primary impact, as text or a count", required: true
- property :services, Array[String], "Impacted services", required: true
- property :timestamp, String, "When it happened", optional: true
- end
 
  llm = LLM.openai(key: ENV["KEY"])
- ctx = LLM::Context.new(llm, schema: Report)
- res = ctx.talk("Structure this report: 'Database latency spiked at 10:42 UTC, causing 5% request timeouts for 12 minutes.'")
- pp res.content!
+ mcp = LLM::MCP.http(
+ url: "https://api.githubcopilot.com/mcp/",
+ headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
+ ).persist!
 
- # {
- # "category" => "performance",
- # "summary" => "Database latency spiked, causing 5% request timeouts for 12 minutes.",
- # "impact" => "5% request timeouts",
- # "services" => ["Database"],
- # "timestamp" => "2024-06-05T10:42:00Z"
- # }
+ begin
+ mcp.start
+ ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
+ ctx.talk("List the available GitHub MCP toolsets.")
+ ctx.talk(ctx.call(:functions)) while ctx.functions.any?
+ ensure
+ mcp.stop
+ end
  ```
 
  ## Providers
@@ -542,11 +506,11 @@ res = ctx.talk("What is the capital of France?")
  puts res.content
  ```
 
- #### Context Persistence
+ #### Context Persistence: Vanilla
 
- Contexts can be serialized and restored across process boundaries. This makes
- it possible to persist conversation state in a file, database, or queue and
- resume work later:
+ Contexts can be serialized and restored across process boundaries. A context
+ can be serialized to JSON and stored on disk, in a database, in a job queue,
+ or anywhere else your application needs to persist state:
 
  ```ruby
  #!/usr/bin/env ruby
@@ -556,12 +520,79 @@ llm = LLM.openai(key: ENV["KEY"])
  ctx = LLM::Context.new(llm)
  ctx.talk("Hello")
  ctx.talk("Remember that my favorite language is Ruby")
- ctx.save(path: "context.json")
+
+ # Serialize to a string when you want to store the context yourself,
+ # for example in a database row or job payload.
+ payload = ctx.to_json
 
  restored = LLM::Context.new(llm)
- restored.restore(path: "context.json")
+ restored.restore(string: payload)
  res = restored.talk("What is my favorite language?")
  puts res.content
+
+ # You can also persist the same state to a file:
+ ctx.save(path: "context.json")
+ restored = LLM::Context.new(llm)
+ restored.restore(path: "context.json")
+ ```
+
+ #### Context Persistence: ActiveRecord (Rails)
+
+ In a Rails application, you can also wrap persisted context state in an
+ ActiveRecord model. A minimal schema would include a `snapshot` column for the
+ serialized context payload (`jsonb` is recommended) and a `provider` column
+ for the provider name:
+
+ ```ruby
+ create_table :contexts do |t|
+ t.jsonb :snapshot
+ t.string :provider, null: false
+ t.timestamps
+ end
+ ```
+
+ For example:
+
+ ```ruby
+ class Context < ApplicationRecord
+ def talk(...)
+ ctx.talk(...).tap { flush }
+ end
+
+ def wait(...)
+ ctx.wait(...).tap { flush }
+ end
+
+ def messages
+ ctx.messages
+ end
+
+ def model
+ ctx.model
+ end
+
+ def flush
+ update_column(:snapshot, ctx.to_json)
+ end
+
+ private
+
+ def ctx
+ @ctx ||= begin
+ ctx = LLM::Context.new(llm)
+ ctx.restore(string: snapshot) if snapshot
+ ctx
+ end
+ end
+
+ def llm
+ LLM.method(provider).call(key: ENV.fetch(key))
+ end
+
+ def key
+ "#{provider.upcase}_KEY"
+ end
+ end
  ```
 
  #### Agents
@@ -126,7 +126,7 @@ module LLM
  "gen_ai.operation.name" => "execute_tool",
  "gen_ai.request.model" => model,
  "gen_ai.tool.call.id" => id,
- "gen_ai.tool.name" => name,
+ "gen_ai.tool.name" => name&.to_s,
  "gen_ai.tool.call.arguments" => LLM.json.dump(arguments),
  "gen_ai.provider.name" => provider_name,
  "server.address" => provider_host,
@@ -145,7 +145,7 @@ module LLM
  return nil unless span
  attributes = {
  "gen_ai.tool.call.id" => result.id,
- "gen_ai.tool.name" => result.name,
+ "gen_ai.tool.name" => result.name&.to_s,
  "gen_ai.tool.call.result" => LLM.json.dump(result.value)
  }.compact
  attributes.each { span.set_attribute(_1, _2) }
data/lib/llm/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module LLM
- VERSION = "4.11.0"
+ VERSION = "4.11.1"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: llm.rb
  version: !ruby/object:Gem::Version
- version: 4.11.0
+ version: 4.11.1
  platform: ruby
  authors:
  - Antar Azri