llm.rb 1.0.1 → 2.0.0

This diff shows the content of publicly available package versions that have been released to one of the supported registries. It is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 207a44401195a654a57ebf8050d211fc2c1722420a647dedcc447031aead1451
- data.tar.gz: aa8abf7d5104d0a93033a43041057d1af96f8f45dde7f924243788cd0b14621e
+ metadata.gz: 51146c557539ccc945c508cb84f06a6cbbffc1157814a357bddc038ae46aba92
+ data.tar.gz: 7c178e4cacf30c0b9799421edaa3d1e380d83c8edb43d3d376f9c3798b6a3976
  SHA512:
- metadata.gz: 8faf81ef91911ecbcd81694232f6e314caed8dab2ca8b30a241d8e346c3a4a084f0d46bb9e5e94f191a3dc7676c1c865f7062d440498d94e2e0b30a57a5a3510
- data.tar.gz: 3ba71c5c46b5ebbec12d136f24cff08d0da11ea807b81db1319c31b1dae18cb6786ad8ed759f3333d7e2f0f1ee3c69ee7ae85cf95e6d82b3e5552f0d516a279e
+ metadata.gz: b70b4797379674007036ec14512817097292d65d640f104aa8b9c0040b4bd5734d414dfa3f2ef2d9e990639862e1cb214dabf3462d4c8d600746d758579c5db5
+ data.tar.gz: c294544c718c35617b66f3fd5e667b8b66861cab8144809662ded2d5793ef2eaa2e856c09f7166a7cc6b385a6facb3167a9295d90ae6676b75d4688edc70724a
data/README.md CHANGED
@@ -1,48 +1,13 @@
- > **⚠️ Maintenance Mode ⚠️** <br>
- > Please note that the primary author of llm.rb is pivoting away from
- > Ruby and towards [Golang](https://golang.org) for future projects.
- > Although llm.rb will be maintained for the foreseeable future it is not
- > where my primary interests are anymore. Thanks for understanding.
-
  ## About
 
  llm.rb is a zero-dependency Ruby toolkit for Large Language Models that
- includes OpenAI, Gemini, Anthropic, xAI (Grok), [zAI](https://z.ai), DeepSeek,
- Ollama, and LlamaCpp. The toolkit includes full support for chat, streaming,
- tool calling, audio, images, files, and structured outputs (JSON Schema).
+ includes OpenAI, Gemini, Anthropic, xAI (Grok), zAI, DeepSeek, Ollama,
+ and LlamaCpp. The toolkit includes full support for chat, streaming,
+ tool calling, audio, images, files, and structured outputs.
 
  ## Quick start
 
- #### Demo
-
- This cool demo writes a new [llm-shell](https://github.com/llmrb/llm-shell#readme) command
- with the help of [llm.rb](https://github.com/llmrb/llm#readme). <br> Similar-ish to
- GitHub Copilot but for the terminal.
-
- <details>
- <summary>Start demo</summary>
- <img src="https://github.com/llmrb/llm/blob/main/share/llm-shell/examples/demo.gif?raw=true" alt="llm-shell demo" />
- </details>
-
- #### Guides
-
- * [An introduction to RAG](https://0x1eef.github.io/posts/an-introduction-to-rag-with-llm.rb/) &ndash;
- a blog post that implements the RAG pattern
- * [How to estimate the age of a person in a photo](https://0x1eef.github.io/posts/age-estimation-with-llm.rb/) &ndash;
- a blog post that implements an age estimation tool
- * [How to edit an image with Gemini](https://0x1eef.github.io/posts/how-to-edit-images-with-gemini/) &ndash;
- a blog post that implements image editing with Gemini
- * [Fast sailing with persistent connections](https://0x1eef.github.io/posts/persistent-connections-with-llm.rb/) &ndash;
- a blog post that optimizes performance with a thread-safe connection pool
- * [How to build agents (with llm.rb)](https://0x1eef.github.io/posts/how-to-build-agents-with-llm.rb/) &ndash;
- a blog post that implements agentic behavior via tools
-
- #### Ecosystem
-
- * [llm-shell](https://github.com/llmrb/llm-shell) &ndash; a developer-oriented console for Large Language Model communication
- * [llm-spell](https://github.com/llmrb/llm-spell) &ndash; a utility that can correct spelling mistakes with a Large Language Model
-
- #### Show code
+ #### REPL
 
  A simple chatbot that maintains a conversation and streams
  responses in real-time:
@@ -55,18 +20,62 @@ llm = LLM.openai(key: ENV["KEY"])
  bot = LLM::Bot.new(llm, stream: $stdout)
  loop do
  print "> "
- input = $stdin.gets&.chomp || break
- bot.chat(input).flush
+ bot.chat($stdin.gets)
  print "\n"
  end
  ```
 
+ #### Build
+
+ We can send multiple messages at once by building a chain of messages:
+
+ ```ruby
+ #!/usr/bin/env ruby
+ require "llm"
+
+ llm = LLM.openai(key: ENV["KEY"])
+ bot = LLM::Bot.new(llm)
+ prompt = bot.build_prompt do
+ it.system "Your task is to answer all user queries"
+ it.user "What language should I learn next ?"
+ end
+
+ bot.chat(prompt)
+ bot.messages.each { print "[#{it.role}] ", it.content, "\n" }
+ ```
+
+ #### Images
+
+ We can generate an image on the fly and estimate how old the person
+ in the image is:
+
+ ```ruby
+ #!/usr/bin/env ruby
+ require "llm"
+
+ llm = LLM.openai(key: ENV["OPENAI_SECRET"])
+ schema = llm.schema.object(
+ age: llm.schema.integer.required.description("The age of the person in a photo"),
+ confidence: llm.schema.number.required.description("Model confidence (0.0 to 1.0)"),
+ notes: llm.schema.string.required.description("Model notes or caveats")
+ )
+
+ img = llm.images.create(prompt: "A man in his 30s")
+ bot = LLM::Bot.new(llm, schema:)
+ res = bot.chat bot.image_url(img.urls[0])
+ body = res.choices.find(&:assistant?).content!
+
+ print "age: ", body["age"], "\n"
+ print "confidence: ", body["confidence"], "\n"
+ print "notes: ", body["notes"], "\n"
+ ```
+
  ## Features
 
  #### General
  - ✅ A single unified interface for multiple providers
  - 📦 Zero dependencies outside Ruby's standard library
- - 🚀 Smart API design that minimizes the number of requests made
+ - 🚀 Simple, composable API
  - ♻️ Optional: per-provider, process-wide connection pool via net-http-persistent
 
  #### Chat, Agents
@@ -136,6 +145,7 @@ llm = LLM.openai(key: "yourapikey")
  llm = LLM.gemini(key: "yourapikey")
  llm = LLM.anthropic(key: "yourapikey")
  llm = LLM.xai(key: "yourapikey")
+ llm = LLM.zai(key: "yourapikey")
  llm = LLM.deepseek(key: "yourapikey")
 
  ##
@@ -179,18 +189,13 @@ ensure thread-safety.
 
  #### Completions
 
- > This example uses the stateless chat completions API that all
- > providers support. A similar example for OpenAI's stateful
- > responses API is available in the [docs/](https://0x1eef.github.io/x/llm.rb/file.OPENAI.html#responses)
- > directory.
-
  The following example creates an instance of
  [LLM::Bot](https://0x1eef.github.io/x/llm.rb/LLM/Bot.html)
- and enters into a conversation where messages are buffered and
- sent to the provider on-demand. The implementation is designed to
- buffer messages by waiting until an attempt to iterate over
- [LLM::Bot#messages](https://0x1eef.github.io/x/llm.rb/LLM/Bot.html#messages-instance_method)
- is made before sending a request to the LLM:
+ and enters into a conversation where each call to `bot.chat` immediately
+ sends a request to the provider, updates the conversation history, and
+ returns an [LLM::Response](https://0x1eef.github.io/x/llm.rb/LLM/Response.html).
+ The full conversation history is automatically included in
+ each subsequent request:
 
  ```ruby
  #!/usr/bin/env ruby
@@ -198,45 +203,42 @@ require "llm"
 
  llm = LLM.openai(key: ENV["KEY"])
  bot = LLM::Bot.new(llm)
- url = "https://en.wikipedia.org/wiki/Special:FilePath/Cognac_glass.jpg"
+ url = "https://upload.wikimedia.org/wikipedia/commons/c/c7/Lisc_lipy.jpg"
 
- bot.chat "Your task is to answer all user queries", role: :system
- bot.chat ["Tell me about this URL", URI(url)], role: :user
- bot.chat ["Tell me about this PDF", File.open("handbook.pdf", "rb")], role: :user
- bot.chat "Are the URL and PDF similar to each other?", role: :user
+ prompt = bot.build_prompt do
+ it.system "Your task is to answer all user queries"
+ it.user ["Tell me about this URL", bot.image_url(url)]
+ it.user ["Tell me about this PDF", bot.local_file("handbook.pdf")]
+ end
 
- # At this point, we execute a single request
- bot.messages.each { print "[#{_1.role}] ", _1.content, "\n" }
+ bot.chat(prompt)
+ bot.messages.each { print "[#{it.role}] ", it.content, "\n" }
  ```
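A minimal sketch of the immediate-send behavior described in the new prose, assuming `bot.chat` defaults to the user role and that the reply can be read from `res.choices` as in the other 2.0.0 examples:

```ruby
#!/usr/bin/env ruby
require "llm"

llm = LLM.openai(key: ENV["KEY"])
bot = LLM::Bot.new(llm)

# Each call sends a request right away and appends to the history
res1 = bot.chat "What is the capital of France?"
# The second request automatically carries the earlier exchange
res2 = bot.chat "And roughly how many people live there?"
print res2.choices.find(&:assistant?).content, "\n"
```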
 
  #### Streaming
 
- > There Is More Than One Way To Do It (TIMTOWTDI) when you are
- > using llm.rb &ndash; and this is especially true when it
- > comes to streaming. See the streaming documentation in
- > [docs/](https://0x1eef.github.io/x/llm.rb/file.STREAMING.html#scopes)
- > for more details.
-
  The following example streams the messages in a conversation
  as they are generated in real-time. The `stream` option can
- be set to an IO object, or the value `true` to enable streaming
- &ndash; and at the end of the request, `bot.chat` returns the
- same response as the non-streaming version which allows you
- to process a response in the same way:
+ be set to an IO object, or the value `true` to enable streaming.
+ When streaming, the `bot.chat` method will block until the entire
+ stream is received. At the end, it returns the `LLM::Response` object
+ containing the full aggregated content:
 
  ```ruby
  #!/usr/bin/env ruby
  require "llm"
 
  llm = LLM.openai(key: ENV["KEY"])
- bot = LLM::Bot.new(llm)
- url = "https://en.wikipedia.org/wiki/Special:FilePath/Cognac_glass.jpg"
- bot.chat(stream: $stdout) do |prompt|
- prompt.system "Your task is to answer all user queries"
- prompt.user ["Tell me about this URL", URI(url)]
- prompt.user ["Tell me about this PDF", File.open("handbook.pdf", "rb")]
- prompt.user "Are the URL and PDF similar to each other?"
- end.flush
+ bot = LLM::Bot.new(llm, stream: $stdout)
+ url = "https://upload.wikimedia.org/wikipedia/commons/c/c7/Lisc_lipy.jpg"
+
+ prompt = bot.build_prompt do
+ it.system "Your task is to answer all user queries"
+ it.user ["Tell me about this URL", bot.image_url(url)]
+ it.user ["Tell me about the PDF", bot.local_file("handbook.pdf")]
+ end
+
+ bot.chat(prompt)
  ```
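For comparison, a minimal sketch that passes `stream: true` instead of an IO object, on the stated assumption that `bot.chat` then blocks and returns the aggregated `LLM::Response`:

```ruby
#!/usr/bin/env ruby
require "llm"

llm = LLM.openai(key: ENV["KEY"])
# true enables streaming without echoing chunks to an IO object
bot = LLM::Bot.new(llm, stream: true)

res = bot.chat "Summarize the benefits of streaming responses"
# The response carries the full aggregated content once the stream ends
print res.choices.find(&:assistant?).content, "\n"
```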
 
  ### Schema
@@ -252,31 +254,28 @@ an LLM should emit, and the LLM will abide by the schema:
  #!/usr/bin/env ruby
  require "llm"
 
+ llm = LLM.openai(key: ENV["KEY"])
+
  ##
  # Objects
- llm = LLM.openai(key: ENV["KEY"])
  schema = llm.schema.object(probability: llm.schema.number.required)
  bot = LLM::Bot.new(llm, schema:)
  bot.chat "Does the earth orbit the sun?", role: :user
- bot.messages.find(&:assistant?).content! # => {probability: 1.0}
+ puts bot.messages.find(&:assistant?).content! # => {probability: 1.0}
 
  ##
  # Enums
  schema = llm.schema.object(fruit: llm.schema.string.enum("Apple", "Orange", "Pineapple"))
- bot = LLM::Bot.new(llm, schema:)
- bot.chat "Your favorite fruit is Pineapple", role: :system
+ bot = LLM::Bot.new(llm, schema:)
  bot.chat "What fruit is your favorite?", role: :user
- bot.messages.find(&:assistant?).content! # => {fruit: "Pineapple"}
+ puts bot.messages.find(&:assistant?).content! # => {fruit: "Pineapple"}
 
  ##
  # Arrays
  schema = llm.schema.object(answers: llm.schema.array(llm.schema.integer.required))
  bot = LLM::Bot.new(llm, schema:)
- bot.chat "Answer all of my questions", role: :system
- bot.chat "Tell me the answer to ((5 + 5) / 2)", role: :user
- bot.chat "Tell me the answer to ((5 + 5) / 2) * 2", role: :user
  bot.chat "Tell me the answer to ((5 + 5) / 2) * 2 + 1", role: :user
- bot.messages.find(&:assistant?).content! # => {answers: [5, 10, 11]}
+ puts bot.messages.find(&:assistant?).content! # => {answers: [11]}
  ```
 
  ### Tools
@@ -300,11 +299,9 @@ its surrounding scope, which can be useful in some situations.
 
  The
  [LLM::Bot#functions](https://0x1eef.github.io/x/llm.rb/LLM/Bot.html#functions-instance_method)
- method returns an array of functions that can be called after sending a message and
- it will only be populated if the LLM detects a function should be called. Each function
- corresponds to an element in the "tools" array. The array is emptied after a function call,
- and potentially repopulated on the next message:
-
+ method returns an array of functions that can be called after a `chat` interaction
+ if the LLM detects a function should be called. You would then typically call these
+ functions and send their results back to the LLM in a subsequent `chat` call:
 
  ```ruby
  #!/usr/bin/env ruby
@@ -360,7 +357,7 @@ require "llm"
  class System < LLM::Tool
  name "system"
  description "Run a shell command"
- params { |schema| schema.object(command: schema.string.required) }
+ param :command, String, "The command to execute", required: true
 
  def call(command:)
  ro, wo = IO.pipe
@@ -371,6 +368,7 @@ class System < LLM::Tool
  end
  end
 
+ llm = LLM.openai(key: ENV["KEY"])
  bot = LLM::Bot.new(llm, tools: [System])
  bot.chat "Your task is to run shell commands via a tool.", role: :system
 
@@ -385,46 +383,6 @@ bot.chat bot.functions.map(&:call) # report return value to the LLM
  # {stderr: "", stdout: "FreeBSD"}
  ```
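Pieced together from the hunks above, one full round trip in 2.0.0 might look like the sketch below; the guard around `bot.functions` is an assumption for the case where the model answers without requesting a tool:

```ruby
res = bot.chat "What is the name of your operating system?"
if bot.functions.any?
  # Execute each requested function and report the return values
  bot.chat bot.functions.map(&:call)
end
bot.messages.each { print "[#{it.role}] ", it.content, "\n" }
```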
 
- #### Server Tools
-
- The
- [LLM::Function](https://0x1eef.github.io/x/llm.rb/LLM/Function.html)
- and
- [LLM::Tool](https://0x1eef.github.io/x/llm.rb/LLM/Tool.html)
- classes can define a local function or tool that can be called by
- a provider on your behalf, and the
- [LLM::ServerTool](https://0x1eef.github.io/x/llm.rb/LLM/ServerTool.html)
- class represents a tool that is defined and implemented by a provider, and we can
- request that the provider call the tool on our behalf. That's the primary difference
- between a function implemented locally and a tool implemented by a provider. The
- available tools depend on the provider, and the following example uses the
- OpenAI provider to execute Python code on OpenAI's servers:
-
- ```ruby
- #!/usr/bin/env ruby
- require "llm"
-
- llm = LLM.openai(key: ENV["KEY"])
- res = llm.responses.create "Run: 'print(\"hello world\")'",
- tools: [llm.server_tool(:code_interpreter)]
- print res.output_text, "\n"
- ```
-
- #### Web Search
-
- A common tool among all providers is the ability to perform a web search, and
- the following example uses the OpenAI provider to search the web using the
- Web Search tool. This can also be done with the Anthropic and Gemini providers:
-
- ```ruby
- #!/usr/bin/env ruby
- require "llm"
-
- llm = LLM.openai(key: ENV["KEY"])
- res = llm.web_search(query: "summarize today's news")
- print res.output_text, "\n"
- ```
-
  ### Files
 
  #### Create
@@ -442,24 +400,32 @@ require "llm"
 
  llm = LLM.openai(key: ENV["KEY"])
  bot = LLM::Bot.new(llm)
- file = llm.files.create(file: "/books/goodread.pdf")
- bot.chat ["Tell me about this file", file]
- bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
+ file = llm.files.create(file: "/book.pdf")
+ res = bot.chat ["Tell me about this file", file]
+ res.choices.each { print "[#{it.role}] ", it.content, "\n" }
  ```
 
  ### Prompts
 
  #### Multimodal
 
- It is generally a given that an LLM will understand text but they can also
- understand and generate other types of media as well: audio, images, video,
- and even URLs. The object given as a prompt in llm.rb can be a string to
- represent text, a URI object to represent a URL, an LLM::Response object
- to represent a file stored with the LLM, and so on. These are objects you
- can throw at the prompt and have them be understood automatically.
+ While LLMs inherently understand text, they can also process and
+ generate other types of media such as audio, images, video, and
+ even URLs. To provide these multimodal inputs to the LLM, llm.rb
+ uses explicit tagging methods on the `LLM::Bot` instance.
+ These methods wrap your input into a special `LLM::Object`,
+ clearly indicating its type and intent to the underlying LLM
+ provider.
+
+ For instance, to specify an image URL, you would use
+ `bot.image_url`. For a local file, `bot.local_file`. For an
+ already uploaded file managed by the LLM provider's Files API,
+ `bot.remote_file`. This approach ensures clarity and allows
+ llm.rb to correctly format the input for each provider's
+ specific requirements.
 
- A prompt can also have multiple parts, and in that case, an array is given
- as a prompt. Each element is considered to be part of the prompt:
+ An array can be used for a prompt with multiple parts, where each
+ element contributes to the overall input:
 
  ```ruby
  #!/usr/bin/env ruby
@@ -467,16 +433,17 @@ require "llm"
 
  llm = LLM.openai(key: ENV["KEY"])
  bot = LLM::Bot.new(llm)
+ url = "https://upload.wikimedia.org/wikipedia/commons/c/c7/Lisc_lipy.jpg"
 
- bot.chat ["Tell me about this URL", URI("https://example.com/path/to/image.png")]
- [bot.messages.find(&:assistant?)].each { print "[#{_1.role}] ", _1.content, "\n" }
+ res1 = bot.chat ["Tell me about this URL", bot.image_url(url)]
+ res1.choices.each { print "[#{it.role}] ", it.content, "\n" }
 
- file = llm.files.create(file: "/books/goodread.pdf")
- bot.chat ["Tell me about this PDF", file]
- [bot.messages.find(&:assistant?)].each { print "[#{_1.role}] ", _1.content, "\n" }
+ file = llm.files.create(file: "/book.pdf")
+ res2 = bot.chat ["Tell me about this PDF", bot.remote_file(file)]
+ res2.choices.each { print "[#{it.role}] ", it.content, "\n" }
 
- bot.chat ["Tell me about this image", File.open("/images/nemothefish.png", "r")]
- [bot.messages.find(&:assistant?)].each { print "[#{_1.role}] ", _1.content, "\n" }
+ res3 = bot.chat ["Tell me about this image", bot.local_file("/puffy.png")]
+ res3.choices.each { print "[#{it.role}] ", it.content, "\n" }
  ```
 
  ### Audio
@@ -662,36 +629,10 @@ end
  # Select a model
  model = llm.models.all.find { |m| m.id == "gpt-3.5-turbo" }
  bot = LLM::Bot.new(llm, model: model.id)
- bot.chat "Hello #{model.id} :)"
- bot.messages.select(&:assistant?).each { print "[#{_1.role}] ", _1.content, "\n" }
+ res = bot.chat "Hello #{model.id} :)"
+ res.choices.each { print "[#{it.role}] ", it.content, "\n" }
  ```
 
- ## Reviews
-
- I supplied both Gemini and DeepSeek with the contents of [lib/](https://github.com/llmrb/llm/tree/main/lib)
- and [README.md](https://github.com/llmrb/llm#readme) via [llm-shell](https://github.com/llmrb/llm-shell#readme).
- Their feedback was way more positive than I could have imagined 😅 These are genuine responses though, with no
- special prompting or engineering. I just provided them with the source code and asked for their opinion.
-
- <details>
- <summary>Review by Gemini</summary>
- <img src="https://github.com/llmrb/llm/blob/main/share/llm-shell/examples/gemini.png?raw=true" alt="Gemini review" />
- </details>
-
- <details>
- <summary>Review by DeepSeek</summary>
- <img src="https://github.com/llmrb/llm/blob/main/share/llm-shell/examples/deepseek.png?raw=true" alt="DeepSeek review" />
- </details>
-
- ## Documentation
-
- ### API
-
- The README tries to provide a high-level overview of the library. For everything
- else there's the API reference. It covers classes and methods that the README glances
- over or doesn't cover at all. The API reference is available at
- [0x1eef.github.io/x/llm.rb](https://0x1eef.github.io/x/llm.rb).
-
  ## Install
 
  llm.rb can be installed via rubygems.org:
@@ -702,4 +643,4 @@ llm.rb can be installed via rubygems.org:
 
  [BSD Zero Clause](https://choosealicense.com/licenses/0bsd/)
  <br>
- See [LICENSE](./LICENSE)
+ See [LICENSE](./LICENSE)