llm.rb 6.1.0 → 7.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +31 -0
- data/README.md +6 -2
- data/lib/llm/agent.rb +32 -6
- data/lib/llm/compactor.rb +1 -2
- data/lib/llm/context.rb +1 -2
- data/lib/llm/loop_guard.rb +1 -10
- data/lib/llm/provider/transport/http.rb +1 -1
- data/lib/llm/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6c923952039095a2234eb1bd5c058a951b0d797d27577cdf7f679df59b49060b
+  data.tar.gz: 3667e0d79e44634f769dfced198dd07c1039f173cb43b72aab7d3204aa3638f8
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 655d450b2ffeb71ed9564b7c5c23a2a86e9e385de9dc1abdac18588e460cffdecd1b2da1d5ef9fc162dc3f3286b7d2c979baec3953cd1ddbdab74d1ef5b87112
+  data.tar.gz: a044fedb675c4d92eff55c210d588b68b80c7e3967188674c2de4d8f6bc69d76e8f15c18f49fb54e09a8c93dff89074304d231609337bfa3bc79c96e1f3f576b
data/CHANGELOG.md
CHANGED

@@ -1,5 +1,36 @@
 # Changelog
 
+## Unreleased
+
+## v7.0.0
+
+Changes since `v6.1.0`.
+
+This release turns agent tool-loop limit errors into in-band advisory
+returns so the LLM can react to rate limits and continue the loop. It
+adds `tool_attempts: nil` as a way to opt out of advisory tool-limit
+returns entirely, and fixes the default provider HTTP path to keep
+`net-http-persistent` optional when not explicitly enabled.
+
+### Breaking
+
+* **Return in-band tool-loop limit errors from agents** <br>
+  Stop raising `LLM::ToolLoopError` when an agent exhausts its tool loop
+  attempt budget, and instead send advisory `LLM::Function::Return`
+  errors back through the model so the LLM can react to the rate limit
+  in-band and continue the loop.
+
+* **Allow `tool_attempts: nil` to disable advisory tool-limit returns** <br>
+  Keep the default `tool_attempts` budget at `25`, but treat an explicit
+  `tool_attempts: nil` as an opt-out that disables advisory tool-limit
+  returns entirely.
+
+### Fix
+
+* **Keep `net-http-persistent` optional on normal HTTP requests** <br>
+  Stop the default provider HTTP path from loading `net/http/persistent`
+  unless persistent transport support is explicitly enabled.
+
 ## v6.1.0
 
 Changes since `v6.0.0`.
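The `tool_attempts` semantics described above can be sketched as a small helper. The method name `resolve_tool_attempts` is hypothetical; the body mirrors the option handling introduced in this release (an absent key falls back to `25`, an explicit `nil` opts out):

```ruby
# Hypothetical helper illustrating the v7.0.0 tool_attempts semantics:
# - key absent      -> default budget of 25
# - explicit value  -> coerced with Integer()
# - explicit nil    -> opt out of advisory tool-limit returns
def resolve_tool_attempts(params)
  max = params.key?(:tool_attempts) ? params.delete(:tool_attempts) : 25
  max = Integer(max) if max
  max
end

p resolve_tool_attempts({})                  # => 25
p resolve_tool_attempts(tool_attempts: 5)    # => 5
p resolve_tool_attempts(tool_attempts: nil)  # => nil
```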
data/README.md
CHANGED

@@ -4,7 +4,7 @@
 <p align="center">
   <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
   <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-
+  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-7.0.0-green.svg?" alt="Version"></a>
 </p>
 
 ## About

@@ -370,6 +370,10 @@ worker.join
   or experimental `:ractor` support for class-based tools. MCP tools are not
   supported by the current `:ractor` mode, but mixed tool sets can still
   route MCP tools and local tools through different strategies at runtime.
+  By default, the tool attempt budget is `25`. When an agent exhausts that
+  budget, it sends advisory tool errors back through the model instead of
+  raising out of the runtime. Set `tool_attempts: nil` to disable that
+  advisory behavior.
 - **Tool calls have an explicit lifecycle** <br>
   A tool call can be executed, cancelled through
   [`LLM::Function#cancel`](https://0x1eef.github.io/x/llm.rb/LLM/Function.html#cancel-instance_method),

@@ -625,7 +629,7 @@ This example uses [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context
 [`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html) together so
 long-lived contexts can summarize older history and expose the lifecycle
 through stream hooks. This approach is inspired by General Intelligence
-Systems
+Systems. The
 compactor can also use its own `model:` if you want summarization to run on a
 different model from the main context. `token_threshold:` accepts either a
 fixed token count or a percentage string like `"90%"`, which resolves
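The README hunk above says `token_threshold:` accepts a fixed token count or a percentage string like `"90%"`. A minimal sketch of that resolution, assuming the percentage resolves against a model's context window; the helper name and the exact resolution rule are assumptions for illustration, not llm.rb's code:

```ruby
# Hypothetical resolver: percentage strings scale against a context window
# (assumed target); plain Integers pass through unchanged.
def resolve_token_threshold(threshold, context_window)
  case threshold
  when /\A(\d+(?:\.\d+)?)%\z/
    # "90%" of a 128_000-token window -> 115_200 tokens
    (context_window * (Regexp.last_match(1).to_f / 100.0)).floor
  when Integer
    threshold
  else
    raise ArgumentError, "expected an Integer or a percentage string"
  end
end

p resolve_token_threshold("90%", 128_000)  # => 115200
p resolve_token_threshold(4096, 128_000)   # => 4096
```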
data/lib/llm/agent.rb
CHANGED

@@ -19,6 +19,9 @@ module LLM
  # * The automatic tool loop enables the wrapped context's `guard` by default.
  #   The built-in {LLM::LoopGuard LLM::LoopGuard} detects repeated tool-call
  #   patterns and blocks stuck execution before more tool work is queued.
+ # * The default tool attempt budget is `25`. After that, the agent sends
+ #   advisory tool errors back through the model and keeps the loop in-band.
+ #   Set `tool_attempts: nil` to disable that advisory behavior.
  # * Tool loop execution can be configured with `concurrency :call`,
  #   `:thread`, `:task`, `:fiber`, `:ractor`, or a list of queued task
  #   types such as `[:thread, :ractor]`.

@@ -161,7 +164,10 @@ module LLM
     #
     # @param prompt (see LLM::Provider#complete)
     # @param [Hash] params The params passed to the provider, including optional :stream, :tools, :schema etc.
-    # @option params [Integer] :tool_attempts
+    # @option params [Integer] :tool_attempts
+    #   The maximum number of tool call iterations before the agent sends
+    #   in-band advisory tool errors back through the model (default 25).
+    #   Set to `nil` to disable advisory tool-limit returns.
     # @return [LLM::Response] Returns the LLM's response for this turn.
     # @example
     #   llm = LLM.openai(key: ENV["KEY"])

@@ -180,7 +186,10 @@ module LLM
     # @note Not all LLM providers support this API
     # @param prompt (see LLM::Provider#complete)
     # @param [Hash] params The params passed to the provider, including optional :stream, :tools, :schema etc.
-    # @option params [Integer] :tool_attempts
+    # @option params [Integer] :tool_attempts
+    #   The maximum number of tool call iterations before the agent sends
+    #   in-band advisory tool errors back through the model (default 25).
+    #   Set to `nil` to disable advisory tool-limit returns.
     # @return [LLM::Response] Returns the LLM's response for this turn.
     # @example
     #   llm = LLM.openai(key: ENV["KEY"])

@@ -393,20 +402,37 @@ module LLM
 
     def run_loop(method, prompt, params)
       loop = proc do
-        max =
+        max = params.key?(:tool_attempts) ? params.delete(:tool_attempts) : 25
+        max = Integer(max) if max
         stream = params[:stream] || @ctx.params[:stream]
         stream.extra[:concurrency] = concurrency if LLM::Stream === stream
         res = @ctx.public_send(method, apply_instructions(prompt), params)
-
+        loop do
           break if @ctx.functions.empty?
-
+          if max
+            max.times do
+              break if @ctx.functions.empty?
+              res = @ctx.public_send(method, call_functions, params)
+            end
+            break if @ctx.functions.empty?
+            res = @ctx.public_send(method, @ctx.functions.map { rate_limit(_1) }, params)
+          else
+            res = @ctx.public_send(method, call_functions, params)
+          end
         end
-        raise LLM::ToolLoopError, "pending tool calls remain" unless @ctx.functions.empty?
         res
       end
       @tracer ? @llm.with_tracer(@tracer, &loop) : loop.call
     end
 
+    def rate_limit(function)
+      LLM::Function::Return.new(function.id, function.name, {
+        error: true,
+        type: LLM::ToolLoopError.name,
+        message: "tool loop rate limit reached"
+      })
+    end
+
     def resolve_option(option)
       Proc === option ? instance_exec(&option) : option
     end
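The control flow of the new `run_loop` can be simulated without a provider. The sketch below is a self-contained stand-in, not llm.rb's API: a plain hash plays the context, `AdvisoryReturn` plays `LLM::Function::Return`, and only the budget logic mirrors the diff above (execute up to `max` tool calls, then answer the rest with in-band advisory errors; `tool_attempts: nil` runs unbounded):

```ruby
# Illustrative stand-in for LLM::Function::Return (not the real class).
AdvisoryReturn = Struct.new(:error, :type, :message)

# Simulates the budgeted tool loop: ctx[:functions] holds pending tool calls,
# standing in for a model that keeps requesting tools every turn.
def run_tool_loop(ctx, tool_attempts: 25)
  max = tool_attempts
  max = Integer(max) if max
  calls = 0
  loop do
    break if ctx[:functions].empty?
    if max
      max.times do
        break if ctx[:functions].empty?
        calls += 1
        ctx[:functions].shift # simulate executing one tool call
      end
      break if ctx[:functions].empty?
      # Budget exhausted: answer remaining calls with advisory error returns
      advisories = ctx[:functions].map do
        AdvisoryReturn.new(true, "LLM::ToolLoopError", "tool loop rate limit reached")
      end
      ctx[:functions].clear
      return [calls, advisories]
    else
      calls += 1
      ctx[:functions].shift
    end
  end
  [calls, []]
end

ctx = { functions: Array.new(30) { |i| "tool_#{i}" } }
calls, advisories = run_tool_loop(ctx, tool_attempts: 25)
# 25 tool calls execute; the remaining 5 receive advisory error returns
```

In the real agent the advisory returns go back through the model so the LLM can react and continue the turn; the simulation stops there for brevity.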
data/lib/llm/compactor.rb
CHANGED

@@ -5,8 +5,7 @@
 # smaller replacement message when a context grows too large.
 #
 # This work is directly inspired by the compaction approach developed by
-# General Intelligence Systems
-# [Brute](https://github.com/general-intelligence-systems/brute).
+# General Intelligence Systems.
 #
 # The compactor can also use a different model from the main context by
 # setting `model:` in the compactor config. Compaction thresholds are opt-in:
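A minimal sketch of the compaction idea the docs above describe: older history is folded into one smaller replacement message while recent turns survive intact. `compact`, `keep_last:`, and the message shape are hypothetical stand-ins, not llm.rb's compactor API, and a real compactor would produce the summary with a model rather than a placeholder string:

```ruby
# Hypothetical compaction pass: fold all but the last keep_last messages
# into a single summary message.
def compact(messages, keep_last:)
  older, recent = messages[0...-keep_last], messages[-keep_last..]
  return messages if older.nil? || older.empty?
  # A real compactor would summarize `older` with an LLM; a placeholder
  # stands in here so the sketch stays self-contained.
  summary = { role: "system", content: "Summary of #{older.size} earlier messages" }
  [summary, *recent]
end

history = (1..10).map { |i| { role: "user", content: "turn #{i}" } }
compacted = compact(history, keep_last: 4)
# 10 messages become 1 summary message plus the last 4 turns
```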
data/lib/llm/context.rb
CHANGED

@@ -96,8 +96,7 @@ module LLM
     ##
     # Returns a context compactor
     # This feature is inspired by the compaction approach developed by
-    # General Intelligence Systems
-    # [Brute](https://github.com/general-intelligence-systems/brute).
+    # General Intelligence Systems.
     # @return [LLM::Compactor]
     def compactor
       @compactor = LLM::Compactor.new(self, @compactor || {}) unless LLM::Compactor === @compactor
data/lib/llm/loop_guard.rb
CHANGED

@@ -10,8 +10,7 @@
 #
 # {LLM::LoopGuard LLM::LoopGuard} detects when a context is repeating the same
 # tool-call pattern instead of making progress. It is directly inspired by
-# General Intelligence Systems
-# approach.
+# General Intelligence Systems and its doom-loop detection approach.
 #
 # The public interface is intentionally small:
 # - `call(ctx)` returns `nil` when no intervention is needed

@@ -22,14 +21,6 @@
 # {LLM::Agent LLM::Agent} enables this guard by default through its wrapped
 # context.
 #
-# Brute is MIT licensed. The relevant license grant is:
-#
-# Permission is hereby granted, free of charge, to any person obtaining a copy
-# of this software and associated documentation files (the "Software"), to deal
-# in the Software without restriction, including without limitation the rights
-# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-# copies of the Software, and to permit persons to whom the Software is
-# furnished to do so.
 class LLM::LoopGuard
   ##
   # The default number of repeated tool-call patterns required before
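The docs above give the guard's contract: `call` returns `nil` when no intervention is needed. A sketch of repeated-pattern detection under that contract follows; `SketchLoopGuard`, its threshold, and its signature format are all illustrative assumptions, not `LLM::LoopGuard`'s actual implementation:

```ruby
# Hypothetical doom-loop detector: intervene once the same tool-call
# signature repeats a threshold number of consecutive times.
class SketchLoopGuard
  DEFAULT_REPEATS = 3 # assumed threshold, for illustration only

  def initialize(repeats: DEFAULT_REPEATS)
    @repeats = repeats
    @history = []
  end

  # signature: e.g. the (tool name, arguments) pairs requested this turn.
  # Returns nil when no intervention is needed, a message otherwise.
  def call(signature)
    @history << signature
    recent = @history.last(@repeats)
    return nil unless recent.size == @repeats && recent.uniq.size == 1
    "repeated tool-call pattern detected: #{signature.inspect}"
  end
end

guard = SketchLoopGuard.new
guard.call([["search", "ruby"]]) # nil: no repetition yet
guard.call([["search", "ruby"]]) # nil
guard.call([["search", "ruby"]]) # intervention message on the third repeat
```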
data/lib/llm/version.rb
CHANGED