llm.rb 6.1.0 → 7.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 57b39b3b4b79d1d9f8cfd10426ad233d698dd6e3ed84bfef887c8c63f543f40f
- data.tar.gz: 443ed7e2a04259c69d41b1da7a42e7637efaa4ab1075548706ce349bced7ed51
+ metadata.gz: 6c923952039095a2234eb1bd5c058a951b0d797d27577cdf7f679df59b49060b
+ data.tar.gz: 3667e0d79e44634f769dfced198dd07c1039f173cb43b72aab7d3204aa3638f8
  SHA512:
- metadata.gz: f8e53dc41eacf16cea35f64a6048aa77852fcf7a135676b2b9c02e37beff174b5a500948477c4f931ff0a71d20c4503ba3e9eef19358d3aaa204040e77fe14c5
- data.tar.gz: 358ce7f33d2dca51365f6581867006970fd66079dcaa189268e2deff2f297c89b8332fd11b714bedfd89124413b7a9e12fc09d928c2c28f2e9cb2368f2bc3e24
+ metadata.gz: 655d450b2ffeb71ed9564b7c5c23a2a86e9e385de9dc1abdac18588e460cffdecd1b2da1d5ef9fc162dc3f3286b7d2c979baec3953cd1ddbdab74d1ef5b87112
+ data.tar.gz: a044fedb675c4d92eff55c210d588b68b80c7e3967188674c2de4d8f6bc69d76e8f15c18f49fb54e09a8c93dff89074304d231609337bfa3bc79c96e1f3f576b
data/CHANGELOG.md CHANGED
@@ -1,5 +1,36 @@
  # Changelog
 
+ ## Unreleased
+
+ ## v7.0.0
+
+ Changes since `v6.1.0`.
+
+ This release turns agent tool-loop limit errors into in-band advisory
+ returns so the LLM can react to rate limits and continue the loop. It
+ adds `tool_attempts: nil` as a way to opt out of advisory tool-limit
+ returns entirely, and fixes the default provider HTTP path to keep
+ `net-http-persistent` optional when not explicitly enabled.
+
+ ### Breaking
+
+ * **Return in-band tool-loop limit errors from agents** <br>
+   Stop raising `LLM::ToolLoopError` when an agent exhausts its tool loop
+   attempt budget, and instead send advisory `LLM::Function::Return`
+   errors back through the model so the LLM can react to the rate limit
+   in-band and continue the loop.
+
+ * **Allow `tool_attempts: nil` to disable advisory tool-limit returns** <br>
+   Keep the default `tool_attempts` budget at `25`, but treat an explicit
+   `tool_attempts: nil` as an opt-out that disables advisory tool-limit
+   returns entirely.
+
+ ### Fix
+
+ * **Keep `net-http-persistent` optional on normal HTTP requests** <br>
+   Stop the default provider HTTP path from loading `net/http/persistent`
+   unless persistent transport support is explicitly enabled.
+
  ## v6.1.0
 
  Changes since `v6.0.0`.
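The `tool_attempts` semantics described in the v7.0.0 notes can be sketched as follows. This is a minimal standalone simulation, not the gem's public API: `resolve_tool_attempts` is a hypothetical helper that mirrors the option handling the release describes (a default budget of `25`, `Integer()` coercion of explicit values, and `nil` as a full opt-out of advisory tool-limit returns).

```ruby
# Hypothetical helper sketching the v7.0.0 tool_attempts handling:
# absent key => default budget of 25, explicit value => coerced via
# Integer(), explicit nil => opt out of advisory tool-limit returns.
def resolve_tool_attempts(params)
  max = params.key?(:tool_attempts) ? params.delete(:tool_attempts) : 25
  max = Integer(max) if max
  max
end

resolve_tool_attempts({})                   # => 25 (default budget)
resolve_tool_attempts({tool_attempts: 5})   # => 5
resolve_tool_attempts({tool_attempts: nil}) # => nil (advisory returns disabled)
```

Note that `params.key?` is used rather than `|| 25`, so an explicit `nil` is distinguishable from an absent key.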
data/README.md CHANGED
@@ -4,7 +4,7 @@
  <p align="center">
  <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
  <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
- <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-6.1.0-green.svg?" alt="Version"></a>
+ <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-7.0.0-green.svg?" alt="Version"></a>
  </p>
 
  ## About
@@ -370,6 +370,10 @@ worker.join
    or experimental `:ractor` support for class-based tools. MCP tools are not
    supported by the current `:ractor` mode, but mixed tool sets can still
    route MCP tools and local tools through different strategies at runtime.
+   By default, the tool attempt budget is `25`. When an agent exhausts that
+   budget, it sends advisory tool errors back through the model instead of
+   raising out of the runtime. Set `tool_attempts: nil` to disable that
+   advisory behavior.
  - **Tool calls have an explicit lifecycle** <br>
    A tool call can be executed, cancelled through
    [`LLM::Function#cancel`](https://0x1eef.github.io/x/llm.rb/LLM/Function.html#cancel-instance_method),
@@ -625,7 +629,7 @@ This example uses [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context
  [`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html) together so
  long-lived contexts can summarize older history and expose the lifecycle
  through stream hooks. This approach is inspired by General Intelligence
- Systems' [Brute](https://github.com/general-intelligence-systems/brute). The
+ Systems. The
  compactor can also use its own `model:` if you want summarization to run on a
  different model from the main context. `token_threshold:` accepts either a
  fixed token count or a percentage string like `"90%"`, which resolves
data/lib/llm/agent.rb CHANGED
@@ -19,6 +19,9 @@ module LLM
  # * The automatic tool loop enables the wrapped context's `guard` by default.
  #   The built-in {LLM::LoopGuard LLM::LoopGuard} detects repeated tool-call
  #   patterns and blocks stuck execution before more tool work is queued.
+ # * The default tool attempt budget is `25`. After that, the agent sends
+ #   advisory tool errors back through the model and keeps the loop in-band.
+ #   Set `tool_attempts: nil` to disable that advisory behavior.
  # * Tool loop execution can be configured with `concurrency :call`,
  #   `:thread`, `:task`, `:fiber`, `:ractor`, or a list of queued task
  #   types such as `[:thread, :ractor]`.
@@ -161,7 +164,10 @@ module LLM
  #
  # @param prompt (see LLM::Provider#complete)
  # @param [Hash] params The params passed to the provider, including optional :stream, :tools, :schema etc.
- # @option params [Integer] :tool_attempts The maxinum number of tool call iterations (default 25)
+ # @option params [Integer] :tool_attempts
+ #   The maximum number of tool call iterations before the agent sends
+ #   in-band advisory tool errors back through the model (default 25).
+ #   Set to `nil` to disable advisory tool-limit returns.
  # @return [LLM::Response] Returns the LLM's response for this turn.
  # @example
  #   llm = LLM.openai(key: ENV["KEY"])
@@ -180,7 +186,10 @@ module LLM
  # @note Not all LLM providers support this API
  # @param prompt (see LLM::Provider#complete)
  # @param [Hash] params The params passed to the provider, including optional :stream, :tools, :schema etc.
- # @option params [Integer] :tool_attempts The maxinum number of tool call iterations (default 25)
+ # @option params [Integer] :tool_attempts
+ #   The maximum number of tool call iterations before the agent sends
+ #   in-band advisory tool errors back through the model (default 25).
+ #   Set to `nil` to disable advisory tool-limit returns.
  # @return [LLM::Response] Returns the LLM's response for this turn.
  # @example
  #   llm = LLM.openai(key: ENV["KEY"])
@@ -393,20 +402,37 @@ module LLM
 
  def run_loop(method, prompt, params)
    loop = proc do
-     max = Integer(params.delete(:tool_attempts) || 25)
+     max = params.key?(:tool_attempts) ? params.delete(:tool_attempts) : 25
+     max = Integer(max) if max
      stream = params[:stream] || @ctx.params[:stream]
      stream.extra[:concurrency] = concurrency if LLM::Stream === stream
      res = @ctx.public_send(method, apply_instructions(prompt), params)
-     max.times do
+     loop do
        break if @ctx.functions.empty?
-       res = @ctx.public_send(method, call_functions, params)
+       if max
+         max.times do
+           break if @ctx.functions.empty?
+           res = @ctx.public_send(method, call_functions, params)
+         end
+         break if @ctx.functions.empty?
+         res = @ctx.public_send(method, @ctx.functions.map { rate_limit(_1) }, params)
+       else
+         res = @ctx.public_send(method, call_functions, params)
+       end
      end
-     raise LLM::ToolLoopError, "pending tool calls remain" unless @ctx.functions.empty?
      res
    end
    @tracer ? @llm.with_tracer(@tracer, &loop) : loop.call
  end
 
+ def rate_limit(function)
+   LLM::Function::Return.new(function.id, function.name, {
+     error: true,
+     type: LLM::ToolLoopError.name,
+     message: "tool loop rate limit reached"
+   })
+ end
+
  def resolve_option(option)
    Proc === option ? instance_exec(&option) : option
  end
data/lib/llm/compactor.rb CHANGED
@@ -5,8 +5,7 @@
  # smaller replacement message when a context grows too large.
  #
  # This work is directly inspired by the compaction approach developed by
- # General Intelligence Systems in
- # [Brute](https://github.com/general-intelligence-systems/brute).
+ # General Intelligence Systems.
  #
  # The compactor can also use a different model from the main context by
  # setting `model:` in the compactor config. Compaction thresholds are opt-in:
data/lib/llm/context.rb CHANGED
@@ -96,8 +96,7 @@ module LLM
  ##
  # Returns a context compactor
  # This feature is inspired by the compaction approach developed by
- # General Intelligence Systems in
- # [Brute](https://github.com/general-intelligence-systems/brute).
+ # General Intelligence Systems.
  # @return [LLM::Compactor]
  def compactor
    @compactor = LLM::Compactor.new(self, @compactor || {}) unless LLM::Compactor === @compactor
@@ -10,8 +10,7 @@
  #
  # {LLM::LoopGuard LLM::LoopGuard} detects when a context is repeating the same
  # tool-call pattern instead of making progress. It is directly inspired by
- # General Intelligence Systems' Brute runtime and its doom-loop detection
- # approach.
+ # General Intelligence Systems and its doom-loop detection approach.
  #
  # The public interface is intentionally small:
  # - `call(ctx)` returns `nil` when no intervention is needed
@@ -22,14 +21,6 @@
  # {LLM::Agent LLM::Agent} enables this guard by default through its wrapped
  # context.
  #
- # Brute is MIT licensed. The relevant license grant is:
- #
- #   Permission is hereby granted, free of charge, to any person obtaining a copy
- #   of this software and associated documentation files (the "Software"), to deal
- #   in the Software without restriction, including without limitation the rights
- #   to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- #   copies of the Software, and to permit persons to whom the Software is
- #   furnished to do so.
  class LLM::LoopGuard
  ##
  # The default number of repeated tool-call patterns required before
@@ -71,7 +71,7 @@ class LLM::Provider
  ##
  # @return [Boolean]
  def persistent?
-   !persistent_client.nil?
+   !@persistent_client.nil?
  end
 
  ##
data/lib/llm/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module LLM
-   VERSION = "6.1.0"
+   VERSION = "7.0.0"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: llm.rb
  version: !ruby/object:Gem::Version
-   version: 6.1.0
+   version: 7.0.0
  platform: ruby
  authors:
  - Antar Azri