llm.rb 4.23.0 → 5.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 49ed8077a6283802d4141dcb9ec037c7fc46920ebd3273b30c55624b575f3156
-  data.tar.gz: e2289baf740ba9603ed1c308414e632ddda296356659c8714bf3a1744c216104
+  metadata.gz: 56ddedb75f6c791cc42bca736bc62360ba4850a3a204f9a82288e8c6ea977eeb
+  data.tar.gz: 3881b731dacd921e258eac954c4468d052e673e48ad53c63ae1a246973c84d33
 SHA512:
-  metadata.gz: b6b0d72baa785a6bf25cbfd3f2581d7f6a5850a0fa61dea29668596e19eb8a1142330f8acfea7f04a1bc76461c02c0af681588332d955aae2b5c6808f2fc0610
-  data.tar.gz: 836fc45489b9d86c7bde3ed2b94d2813be5bdaea1ebf7697f7e7eca5962f5374343e371188e40ced180ed50e053cd74ec8fcec8dea08c164291ee8577301f195
+  metadata.gz: a8838f57a1232afc42448d28a0f3f7b8907c2a527be284579b3af56e398edda50d7cc02a8dda2794c65096699831058494cccae3f8a116b538f76bc42127eba8
+  data.tar.gz: ef49e8046b4aab4e59b252ffdbf16135673d227b4124bf45bf5c31856c49822168182928b56fdc7428d2056dbae7c211fac0d6d8ef3eba48a7aca47420bb96e7
data/CHANGELOG.md CHANGED
@@ -2,8 +2,73 @@
 
 ## Unreleased
 
+Changes since `v5.1.0`.
+
+## v5.1.0
+
+Changes since `v5.0.0`.
+
+This release tightens streamed tool execution around the actual request-local
+runtime state. It fixes streamed resolution of per-request tools and makes
+that streamed path work cleanly with `LLM.function(...)`, MCP tools, bound
+tool instances, and normal tool classes.
+
+### Fix
+
+* **Resolve request-local tools during streaming** <br>
+  Resolve streamed tool calls through `LLM::Stream` request-local tools
+  before falling back to the global registry, so per-request tools and bound
+  tool instances work correctly during streaming.
+
+* **Support `LLM.function(...)` and MCP tools in streamed tool resolution** <br>
+  Let streamed tool resolution use the current request tool set, so
+  `LLM.function(...)`, MCP tools, bound tool instances, and normal
+  `LLM::Tool` classes all work through the same streamed tool path.
+
+## v5.0.0
+
 Changes since `v4.23.0`.
 
+This release expands llm.rb from an execution runtime into a more explicit
+supervision and transformation runtime. It adds context-level guards,
+transformers, and loop supervision through `LLM::LoopGuard`, while deepening
+long-lived context behavior through compaction, interruption hooks, and
+streamed `ctx.spawn(...)` tool execution.
+
+### Change
+
+* **Make compactor thresholds explicit** <br>
+  Require `message_threshold:` and `token_threshold:` to be opted into
+  explicitly, so `LLM::Compactor` only compacts automatically when one of
+  those thresholds is configured. Context-window-derived token limits can be
+  computed by the caller when needed.
+
+* **Allow assigning a compactor through `LLM::Context`** <br>
+  Let `LLM::Context` accept `ctx.compactor = ...` in addition to the
+  constructor `compactor:` option, so compactor config can be assigned or
+  replaced after context initialization.
+
+* **Mark compaction summaries in message metadata** <br>
+  Mark compaction summaries with `extra[:compaction]` and
+  `LLM::Message#compaction?`, so applications can detect or hide synthetic
+  summary messages in conversation history.
+
+* **Add cooperative tool interruption hooks** <br>
+  Let `ctx.interrupt!` notify queued tool work through `on_interrupt`, so
+  running tools can clean up cooperatively when a context is cancelled.
+
+* **Add `LLM::Context` guards** <br>
+  Add a new `guard` capability to `LLM::Context` so execution can be
+  supervised at the runtime level. The built-in `LLM::LoopGuard` detects
+  repeated tool-call patterns and stops stuck agentic loops through in-band
+  `LLM::GuardError` returns. `LLM::Agent` enables this guard by default.
+
+* **Add `LLM::Context` transformers** <br>
+  Add a new `transformer` capability to `LLM::Context` so prompts and params
+  can be rewritten before provider requests are sent. This makes it possible
+  to apply context-wide behaviors such as PII scrubbing or request-level
+  param injection without rewriting every `talk` and `respond` call site.
+
 ## v4.23.0
 
 Changes since `v4.22.0`.
@@ -18,8 +83,9 @@ OpenAI-compatible no-arg tool schemas for stricter providers such as xAI.
 * **Add `LLM::Compactor` for long-lived contexts** <br>
   Add built-in context compaction through `LLM::Compactor`, so older history
   can be summarized, retained windows can stay bounded, compaction can run on
-  its own `model:`, and `LLM::Stream` can observe the lifecycle through
-  `on_compaction` and `on_compaction_finish`.
+  its own `model:`, thresholds can be configured explicitly, and
+  `LLM::Stream` can observe the lifecycle through `on_compaction` and
+  `on_compaction_finish`.
 
 * **Allow bound tool instances in explicit tool lists** <br>
   Let explicit `tools:` arrays accept `LLM::Tool` instances such as
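For reference, the threshold change in v5.0.0 means compaction must now be opted into. A minimal sketch, where the threshold values are illustrative and only the option names come from this diff:

```ruby
llm = LLM.openai(key: ENV["KEY"])
# Without message_threshold: or token_threshold:, LLM::Compactor no
# longer compacts automatically in v5.x.
ctx = LLM::Context.new(llm, compactor: {
  message_threshold: 200,   # compact once history exceeds 200 messages
  token_threshold: 100_000, # or once usage exceeds 100k tokens
  retention_window: 8       # keep the 8 most recent messages verbatim
})
```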
data/README.md CHANGED
@@ -4,7 +4,7 @@
 <p align="center">
   <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
   <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.23.0-green.svg?" alt="Version"></a>
+  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-5.1.0-green.svg?" alt="Version"></a>
 </p>
 
 ## About
@@ -25,6 +25,7 @@ schemas, files, and persisted state, so real systems can be built out of one coh
 execution model instead of a pile of adapters.
 
 Want to see some code? Jump to [the examples](#examples) section. <br>
+Want to see an agentic framework built on top of llm.rb? Check out [general-intelligence-systems/brute](https://github.com/general-intelligence-systems/brute). <br>
 Want a taste of what llm.rb can build? See [the screencast](#screencast).
 
 ## Architecture
@@ -168,6 +169,58 @@ ctx = LLM::Context.new(
 )
 ```
 
+#### Guards
+
+Guards let llm.rb supervise agentic execution, not just run it.
+They live on [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html),
+can inspect the current runtime state, and can step in when a context is no
+longer making progress.
+
+[`LLM::LoopGuard`](https://0x1eef.github.io/x/llm.rb/LLM/LoopGuard.html) is
+the built-in implementation. It detects repeated tool-call patterns and
+blocks pending tool execution with in-band guarded tool errors instead of
+letting the loop keep spinning. [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html)
+enables that guard by default through its wrapped context.
+
+```ruby
+ctx = LLM::Context.new(llm)
+ctx.guard = MyGuard.new
+```
+
+#### Transformers
+
+Transformers let llm.rb rewrite outgoing prompts and params before a request
+is sent to the provider. They also live on
+[`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html), but
+they solve a different problem from guards: instead of blocking execution,
+they can normalize or scrub what gets sent.
+
+That makes them a good fit for things like PII scrubbing, prompt
+normalization, or request-level param injection. A transformer just needs to
+implement `call(ctx, prompt, params)` and return `[prompt, params]`.
+
+```ruby
+class ScrubPII
+  EMAIL = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b/i
+
+  def call(ctx, prompt, params)
+    [scrub(prompt), params]
+  end
+
+  private
+
+  def scrub(prompt)
+    case prompt
+    when String then prompt.gsub(EMAIL, "[REDACTED_EMAIL]")
+    else prompt
+    end
+  end
+end
+
+ctx = LLM::Context.new(llm)
+ctx.transformer = ScrubPII.new
+```
+
 #### LLM::Stream
 
 `LLM::Stream` is not just for printing tokens. It supports `on_content`,
@@ -179,7 +232,7 @@ execution path.
 ```ruby
 class Stream < LLM::Stream
   def on_tool_call(tool, error)
-    queue << tool.spawn(:thread)
+    queue << (error || ctx.spawn(tool, :thread))
   end
 
   def on_tool_return(tool, result)
@@ -468,7 +521,7 @@ class Stream < LLM::Stream
   def on_tool_call(tool, error)
     return queue << error if error
    $stdout << "\nRunning tool #{tool.name}...\n"
-    queue << tool.spawn(:thread)
+    queue << ctx.spawn(tool, :thread)
  end

  def on_tool_return(tool, result)
@@ -481,7 +534,8 @@ class Stream < LLM::Stream
 end
 
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: Stream.new, tools: [System])
+stream = Stream.new
+ctx = LLM::Context.new(llm, stream:, tools: [System])
 
 ctx.talk("Run `date` and `uname -a`.")
 ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
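The guard contract documented in this README is small enough to sketch end to end. The guard below is hypothetical (not part of the gem) and assumes only the documented protocol, `call(ctx)` returning `nil` or a warning string, plus the `ctx.functions` accessor used in the examples above:

```ruby
llm = LLM.openai(key: ENV["KEY"])

class ToolBudgetGuard
  def initialize(budget)
    @budget = budget
    @spent = 0
  end

  # Returning a string tells the context to block pending tool work
  # with in-band LLM::GuardError returns; returning nil continues.
  def call(ctx)
    @spent += ctx.functions.size
    return if @spent <= @budget
    "SYSTEM NOTICE: tool-call budget of #{@budget} exhausted; summarize and stop."
  end
end

ctx = LLM::Context.new(llm)
ctx.guard = ToolBudgetGuard.new(25)
```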
data/lib/llm/agent.rb CHANGED
@@ -16,6 +16,9 @@ module LLM
   # **Notes:**
   # * Instructions are injected once unless a system message is already present.
   # * An agent automatically executes tool loops (unlike {LLM::Context LLM::Context}).
+  # * The automatic tool loop enables the wrapped context's `guard` by default.
+  #   The built-in {LLM::LoopGuard LLM::LoopGuard} detects repeated tool-call
+  #   patterns and blocks stuck execution before more tool work is queued.
   # * Tool loop execution can be configured with `concurrency :call`,
   #   `:thread`, `:task`, `:fiber`, `:ractor`, or a list of queued task
   #   types such as `[:thread, :ractor]`.
@@ -128,7 +131,7 @@ module LLM
       defaults = {model: self.class.model, tools: self.class.tools, skills: self.class.skills, schema: self.class.schema}.compact
       @concurrency = params.delete(:concurrency) || self.class.concurrency
       @llm = llm
-      @ctx = LLM::Context.new(llm, defaults.merge(params))
+      @ctx = LLM::Context.new(llm, defaults.merge({guard: true}).merge(params))
     end
 
     ##
@@ -137,7 +140,7 @@ module LLM
     #
     # @param prompt (see LLM::Provider#complete)
     # @param [Hash] params The params passed to the provider, including optional :stream, :tools, :schema etc.
-    # @option params [Integer] :tool_attempts The maximum number of tool call iterations (default 10)
+    # @option params [Integer] :tool_attempts The maximum number of tool call iterations (default 25)
     # @return [LLM::Response] Returns the LLM's response for this turn.
     # @example
     #   llm = LLM.openai(key: ENV["KEY"])
@@ -145,14 +148,7 @@ module LLM
     #   response = agent.talk("Hello, what is your name?")
     #   puts response.choices[0].content
     def talk(prompt, params = {})
-      max = Integer(params.delete(:tool_attempts) || 10)
-      res = @ctx.talk(apply_instructions(prompt), params)
-      max.times do
-        break if @ctx.functions.empty?
-        res = @ctx.talk(call_functions, params)
-      end
-      raise LLM::ToolLoopError, "pending tool calls remain" unless @ctx.functions.empty?
-      res
+      run_loop(:talk, prompt, params)
     end
     alias_method :chat, :talk
 
@@ -163,7 +159,7 @@ module LLM
     # @note Not all LLM providers support this API
     # @param prompt (see LLM::Provider#complete)
     # @param [Hash] params The params passed to the provider, including optional :stream, :tools, :schema etc.
-    # @option params [Integer] :tool_attempts The maximum number of tool call iterations (default 10)
+    # @option params [Integer] :tool_attempts The maximum number of tool call iterations (default 25)
     # @return [LLM::Response] Returns the LLM's response for this turn.
     # @example
     #   llm = LLM.openai(key: ENV["KEY"])
@@ -171,14 +167,7 @@ module LLM
     #   res = agent.respond("What is the capital of France?")
     #   puts res.output_text
     def respond(prompt, params = {})
-      max = Integer(params.delete(:tool_attempts) || 10)
-      res = @ctx.respond(apply_instructions(prompt), params)
-      max.times do
-        break if @ctx.functions.empty?
-        res = @ctx.respond(call_functions, params)
-      end
-      raise LLM::ToolLoopError, "pending tool calls remain" unless @ctx.functions.empty?
-      res
+      run_loop(:respond, prompt, params)
     end
 
     ##
@@ -380,5 +369,16 @@ module LLM
       else raise ArgumentError, "Unknown concurrency: #{concurrency.inspect}. Expected :call, :thread, :task, :fiber, :ractor, or an array of queued task types"
       end
     end
+
+    def run_loop(method, prompt, params)
+      max = Integer(params.delete(:tool_attempts) || 25)
+      res = @ctx.public_send(method, apply_instructions(prompt), params)
+      max.times do
+        break if @ctx.functions.empty?
+        res = @ctx.public_send(method, call_functions, params)
+      end
+      raise LLM::ToolLoopError, "pending tool calls remain" unless @ctx.functions.empty?
+      res
+    end
   end
 end
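Since the constructor now builds the context with `defaults.merge({guard: true}).merge(params)`, caller params win over the default guard. A short sketch of what that merge order implies; the constructor shape is assumed from the diff, while the `Hash` config form is confirmed by `LLM::Context#guard`:

```ruby
agent = LLM::Agent.new(llm)                         # LoopGuard on by default
agent = LLM::Agent.new(llm, guard: false)           # opt out of the default guard
agent = LLM::Agent.new(llm, guard: {threshold: 5})  # tune the built-in LoopGuard
```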
data/lib/llm/compactor.rb CHANGED
@@ -9,14 +9,11 @@
 # [Brute](https://github.com/general-intelligence-systems/brute).
 #
 # The compactor can also use a different model from the main context by
-# setting `model:` in the compactor config. By default, `token_threshold` is
-# 10% less than the current context window, or `100_000` when the context
-# window is unknown. Set `message_threshold:` or `token_threshold:` to `nil`
-# to disable that constraint.
+# setting `model:` in the compactor config. Compaction thresholds are opt-in:
+# provide `message_threshold:` and/or `token_threshold:` to enable policy-
+# driven compaction.
 class LLM::Compactor
-  DEFAULT_TOKEN_THRESHOLD = 100_000
   DEFAULTS = {
-    message_threshold: 200,
     retention_window: 8,
     model: nil
   }.freeze
@@ -28,19 +25,17 @@
   ##
   # @param [LLM::Context] ctx
   # @param [Hash] config
-  # @option config [Integer] :token_threshold
-  #   Defaults to 10% less than the current context window, or `100_000` when
-  #   the context window is unknown. Set to `nil` to disable token-based
-  #   compaction.
-  # @option config [Integer] :message_threshold
-  #   Set to `nil` to disable message-count-based compaction.
+  # @option config [Integer, nil] :token_threshold
+  #   Enables token-based compaction.
+  # @option config [Integer, nil] :message_threshold
+  #   Enables message-count-based compaction.
   # @option config [Integer] :retention_window
   # @option config [String, nil] :model
   #   The model to use for the summarization request. Defaults to the current
   #   context model.
-  def initialize(ctx, **config)
+  def initialize(ctx, config = {})
     @ctx = ctx
-    @config = DEFAULTS.merge(token_threshold: default_token_threshold).merge(config)
+    @config = DEFAULTS.merge(config)
   end
 
   ##
@@ -71,7 +66,7 @@
     stream.on_compaction(ctx, self) if LLM::Stream === stream
     recent = retained_messages
     older = messages[0...(messages.size - recent.size)]
-    summary = LLM::Message.new(ctx.llm.user_role, "[Previous conversation summary]\n\n#{summarize(older)}")
+    summary = LLM::Message.new(ctx.llm.user_role, "[Previous conversation summary]\n\n#{summarize(older)}", {compaction: true})
     ctx.messages.replace([*ctx.messages.take_while(&:system?), summary, *recent])
     stream.on_compaction_finish(ctx, self) if LLM::Stream === stream
     summary
@@ -81,12 +76,6 @@
 
   attr_reader :ctx
 
-  def default_token_threshold
-    window = ctx.context_window
-    return DEFAULT_TOKEN_THRESHOLD if window.zero?
-    window - (window / 10)
-  end
-
   def retained_messages
     messages = ctx.messages.reject(&:system?)
     retention_window = [config[:retention_window], messages.size].min
@@ -39,7 +39,8 @@ class LLM::Context
     original_tool_calls = payload["original_tool_calls"]
     usage = payload["usage"]
     reasoning_content = payload["reasoning_content"]
-    extra = {tool_calls:, original_tool_calls:, tools: @params[:tools], usage:, reasoning_content:}.compact
+    compaction = payload["compaction"]
+    extra = {tool_calls:, original_tool_calls:, tools: @params[:tools], usage:, reasoning_content:, compaction:}.compact
     content = returns.nil? ? deserialize_content(payload["content"]) : returns
     LLM::Message.new(payload["role"], content, extra)
   end
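The removed `default_token_threshold` logic can be reproduced at the call site when a context-window-derived limit is still wanted. A sketch, assuming `ctx.context_window` remains available (only the compactor's private helper was deleted):

```ruby
# Mirror the old default: 10% below the context window, or 100_000
# when the window is unknown (reported as zero).
window = ctx.context_window
ctx.compactor = {
  token_threshold: window.zero? ? 100_000 : window - (window / 10)
}
```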
data/lib/llm/context.rb CHANGED
@@ -77,6 +77,8 @@ module LLM
       @llm = llm
       @mode = params.delete(:mode) || :completions
       @compactor = params.delete(:compactor)
+      @guard = params.delete(:guard)
+      @transformer = params.delete(:transformer)
       tools = [*params.delete(:tools), *load_skills(params.delete(:skills))]
       @params = {model: llm.default_model, schema: nil}.compact.merge!(params)
       @params[:tools] = tools unless tools.empty?
@@ -90,11 +92,73 @@ module LLM
     # [Brute](https://github.com/general-intelligence-systems/brute).
     # @return [LLM::Compactor]
     def compactor
-      @compactor = LLM::Compactor.new(self, **(@compactor || {})) unless LLM::Compactor === @compactor
+      @compactor = LLM::Compactor.new(self, @compactor || {}) unless LLM::Compactor === @compactor
       @compactor
     end
 
     ##
+    # Sets a context compactor or compactor config
+    # @param [LLM::Compactor, Hash, nil] compactor
+    # @return [LLM::Compactor, Hash, nil]
+    def compactor=(compactor)
+      @compactor = compactor
+    end
+
+    ##
+    # Returns a guard, if configured.
+    #
+    # Guards are context-level supervisors for agentic execution. A guard can
+    # inspect the runtime state and decide whether pending tool work should be
+    # blocked before the context keeps looping.
+    #
+    # The built-in implementation is {LLM::LoopGuard LLM::LoopGuard}, which
+    # detects repeated tool-call patterns and turns them into in-band
+    # {LLM::GuardError LLM::GuardError} tool returns.
+    #
+    # @return [#call, nil]
+    def guard
+      return if @guard.nil? || @guard == false
+      @guard = LLM::LoopGuard.new if @guard == true
+      @guard = LLM::LoopGuard.new(@guard) if Hash === @guard
+      @guard
+    end
+
+    ##
+    # Sets a guard or guard config.
+    #
+    # Guards must implement `call(ctx)` and return either `nil` or a warning
+    # string. Returning a warning tells the context to block pending tool work
+    # with guarded tool errors instead of continuing the loop.
+    #
+    # @param [#call, Hash, Boolean, nil] guard
+    # @return [#call, Hash, Boolean, nil]
+    def guard=(guard)
+      @guard = guard
+    end
+
+    ##
+    # Returns a transformer, if configured.
+    #
+    # Transformers can rewrite outgoing prompts and params before a request is
+    # sent to the provider.
+    #
+    # @return [#call, nil]
+    def transformer
+      @transformer
+    end
+
+    ##
+    # Sets a transformer.
+    #
+    # Transformers must implement `call(ctx, prompt, params)` and return a
+    # two-element array of `[prompt, params]`.
+    #
+    # @param [#call, nil] transformer
+    # @return [#call, nil]
+    def transformer=(transformer)
+      @transformer = transformer
+    end
+
     # Interact with the context via the chat completions API.
     # This method immediately sends a request to the LLM and returns the response.
     #
@@ -112,7 +176,8 @@ module LLM
       compactor.compact!(prompt) if compactor.compact?(prompt)
       params = params.merge(messages: @messages.to_a)
       params = @params.merge(params)
-      bind!(params[:stream], params[:model])
+      prompt, params = transform(prompt, params)
+      bind!(params[:stream], params[:model], params[:tools])
       res = @llm.complete(prompt, params)
       role = params[:role] || @llm.user_role
       role = @llm.tool_role if params[:role].nil? && [*prompt].grep(LLM::Function::Return).any?
@@ -139,7 +204,8 @@ module LLM
       @owner = Fiber.current
       compactor.compact!(prompt) if compactor.compact?(prompt)
       params = @params.merge(params)
-      bind!(params[:stream], params[:model])
+      prompt, params = transform(prompt, params)
+      bind!(params[:stream], params[:model], params[:tools])
       res_id = params[:store] == false ? nil : @messages.find(&:assistant?)&.response&.response_id
       params = params.merge(previous_response_id: res_id, input: @messages.to_a).compact
       res = @llm.responses.create(prompt, params)
@@ -183,11 +249,26 @@ module LLM
     # @return [Array<LLM::Function::Return>]
     def call(target)
       case target
-      when :functions then functions.call
+      when :functions then guarded_returns || functions.call
       else raise ArgumentError, "Unknown target: #{target.inspect}. Expected :functions"
       end
     end
 
+    ##
+    # Spawns a function through the context.
+    #
+    # When a guard is configured, this method can return an in-band guarded
+    # tool error instead of spawning work.
+    #
+    # @param [LLM::Function] function
+    # @param [Symbol] strategy
+    # @return [LLM::Function::Return, LLM::Function::Task]
+    def spawn(function, strategy)
+      warning = guard&.call(self)
+      return guarded_return_for(function, warning) if warning
+      function.spawn(strategy)
+    end
+
     ##
     # Returns tool returns accumulated in this context
     # @return [Array<LLM::Function::Return>]
@@ -216,10 +297,15 @@ module LLM
     def wait(strategy)
       stream = @params[:stream]
       if LLM::Stream === stream && !stream.queue.empty?
-        stream.wait(strategy)
+        @queue = stream.queue
+        @queue.wait(strategy)
       else
-        functions.wait(strategy)
+        return guarded_returns if guarded_returns
+        @queue = functions.spawn(strategy)
+        @queue.wait
       end
+    ensure
+      @queue = nil
     end
 
     ##
@@ -228,6 +314,7 @@ module LLM
     # @return [nil]
     def interrupt!
       llm.interrupt!(@owner)
+      queue&.interrupt!
     end
     alias_method :cancel!, :interrupt!
 
@@ -372,15 +459,42 @@ module LLM
 
     private
 
-    def bind!(stream, model)
+    def bind!(stream, model, tools)
       return unless LLM::Stream === stream
+      stream.extra[:ctx] = self
       stream.extra[:tracer] = tracer
       stream.extra[:model] = model
+      stream.extra[:tools] = tools
+    end
+
+    def queue
+      return @queue if @queue
+      stream = @params[:stream]
+      stream.queue if LLM::Stream === stream
     end
 
     def load_skills(skills)
       [*skills].map { LLM::Skill.load(_1).to_tool(self) }
     end
+
+    def guarded_returns
+      warning = guard&.call(self)
+      return unless warning
+      functions.map { guarded_return_for(_1, warning) }
+    end
+
+    def transform(prompt, params)
+      return [prompt, params] unless transformer
+      transformer.call(self, prompt, params)
+    end
+
+    def guarded_return_for(function, warning)
+      LLM::Function::Return.new(function.id, function.name, {
+        error: true,
+        type: LLM::GuardError.name,
+        message: warning
+      })
+    end
   end
 
   # Backward-compatible alias
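Beyond the PII example in the README, the transformer contract documented above also covers request-level param injection. A hypothetical transformer as a sketch; the `temperature` param is an assumption about the provider, while the `call(ctx, prompt, params)` shape comes from the docs above:

```ruby
class DefaultTemperature
  # Must return [prompt, params]; here only params are touched, and
  # explicit caller params still win over the injected default.
  def call(ctx, prompt, params)
    [prompt, {temperature: 0.2}.merge(params)]
  end
end

ctx.transformer = DefaultTemperature.new
```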
data/lib/llm/error.rb CHANGED
@@ -55,6 +55,10 @@ module LLM
   # When stuck in a tool call loop
   ToolLoopError = Class.new(Error)
 
+  ##
+  # When a guard blocks pending tool execution
+  GuardError = Class.new(Error)
+
   ##
   # When a request is interrupted
   Interrupt = Class.new(Error)
@@ -59,6 +59,14 @@ class LLM::Function
     @fibers.any?(&:alive?)
   end
 
+  ##
+  # @return [nil]
+  def interrupt!
+    @fibers.each(&:interrupt!)
+    nil
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # Waits for all fibers in the group to finish and returns
   # their {LLM::Function::Return} values.
@@ -26,6 +26,13 @@ class LLM::Function
     mailbox.alive?
   end
 
+  ##
+  # @return [nil]
+  def interrupt!
+    nil
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # @return [LLM::Function::Return]
   def wait
@@ -19,6 +19,14 @@ class LLM::Function
     @tasks.any?(&:alive?)
   end
 
+  ##
+  # @return [nil]
+  def interrupt!
+    @tasks.each(&:interrupt!)
+    nil
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # @return [Array<LLM::Function::Return>]
   def wait
@@ -29,6 +29,14 @@ class LLM::Function
     false
   end
 
+  ##
+  # @return [nil]
+  def interrupt!
+    function&.interrupt!
+    nil
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # @return [LLM::Function::Return]
   def wait
@@ -60,6 +60,14 @@ class LLM::Function
     @tasks.any?(&:alive?)
   end
 
+  ##
+  # @return [nil]
+  def interrupt!
+    @tasks.each(&:interrupt!)
+    nil
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # Waits for all tasks in the group to finish and returns
   # their {LLM::Function::Return} values.
@@ -65,6 +65,14 @@ class LLM::Function
     @threads.any?(&:alive?)
   end
 
+  ##
+  # @return [nil]
+  def interrupt!
+    @threads.each(&:interrupt!)
+    nil
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # Waits for all threads in the group to finish and returns
   # their {LLM::Function::Return} values.
data/lib/llm/function.rb CHANGED
@@ -62,6 +62,13 @@
   def to_json(...)
     LLM.json.dump(to_h, ...)
   end
+
+  ##
+  # @return [nil]
+  def interrupt!
+    nil
+  end
+  alias_method :cancel!, :interrupt!
 end
 
 ##
@@ -218,6 +225,18 @@
     @cancelled = true
   end
 
+  ##
+  # Notifies the function runner that the call was interrupted.
+  # This is cooperative and only applies to runners that implement
+  # `on_interrupt`.
+  # @return [nil]
+  def interrupt!
+    hook = %i[on_cancel on_interrupt].find { @runner.respond_to?(_1) }
+    @runner.public_send(hook) if hook
+    nil
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # Returns true when a function has been called
   # @return [Boolean]
data/lib/llm/loop_guard.rb ADDED
@@ -0,0 +1,117 @@
+# frozen_string_literal: true
+
+##
+# {LLM::LoopGuard LLM::LoopGuard} is the built-in implementation of
+# llm.rb's `guard` capability.
+#
+# A guard is a context-level supervisor for agentic execution. It can inspect
+# the current runtime state and return a warning string when pending tool work
+# should be blocked before the loop keeps going.
+#
+# {LLM::LoopGuard LLM::LoopGuard} detects when a context is repeating the same
+# tool-call pattern instead of making progress. It is directly inspired by
+# General Intelligence Systems' Brute runtime and its doom-loop detection
+# approach.
+#
+# The public interface is intentionally small:
+# - `call(ctx)` returns `nil` when no intervention is needed
+# - `call(ctx)` returns a warning string when pending tool execution should be blocked
+#
+# {LLM::Context LLM::Context} can use that warning to return in-band
+# {LLM::GuardError LLM::GuardError} tool errors, and
+# {LLM::Agent LLM::Agent} enables this guard by default through its wrapped
+# context.
+#
+# Brute is MIT licensed. The relevant license grant is:
+#
+#   Permission is hereby granted, free of charge, to any person obtaining a copy
+#   of this software and associated documentation files (the "Software"), to deal
+#   in the Software without restriction, including without limitation the rights
+#   to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+#   copies of the Software, and to permit persons to whom the Software is
+#   furnished to do so.
+class LLM::LoopGuard
+  ##
+  # The default number of repeated tool-call patterns required before
+  # the guard intervenes.
+  # @return [Integer]
+  DEFAULT_THRESHOLD = 3
+
+  ##
+  # Returns the repetition threshold.
+  # @return [Integer]
+  attr_reader :threshold
+
+  ##
+  # @param [Hash] config
+  # @option config [Integer] :threshold
+  #   How many repeated tool-call patterns must appear at the tail of the
+  #   sequence before the guard returns a warning.
+  def initialize(config = {})
+    @threshold = config.fetch(:threshold, DEFAULT_THRESHOLD)
+  end
+
+  ##
+  # Checks the current context for repeated tool-call patterns.
+  #
+  # This method inspects assistant tool calls only. It reduces each call to a
+  # `[tool_name, arguments]` signature and checks whether the tail of the
+  # sequence is repeating.
+  #
+  # @param [LLM::Context] ctx
+  # @return [String, nil]
+  #   Returns a warning string when pending tool execution should be blocked,
+  #   or `nil` when execution should continue.
+  def call(ctx)
+    repetitions = detect(ctx.messages.to_a)
+    repetitions ? warning(repetitions) : nil
+  end
+
+  private
+
+  def detect(messages)
+    signatures = extract_signatures(messages)
+    return if signatures.size < threshold
+    check_repeating_pattern(signatures)
+  end
+
+  def warning(repetitions)
+    <<~MSG
+      SYSTEM NOTICE: Repeated tool-call pattern detected - the same pattern has repeated #{repetitions} times.
+      You are stuck in a loop and not making progress. Stop and try a fundamentally different approach:
+      - Re-read the relevant context before retrying
+      - Try a different tool or strategy
+      - Break the problem into smaller steps
+      - If a tool keeps failing, investigate why before retrying
    MSG
+  end
+
+  def extract_signatures(messages)
+    messages
+      .select { _1.respond_to?(:functions) && _1.assistant? }
+      .flat_map { |message| message.functions.map { [_1.name.to_s, _1.arguments.to_s] } }
+  end
+
+  def check_repeating_pattern(sequence)
+    max_pattern_len = sequence.size / threshold
+    (1..max_pattern_len).each do |pattern_len|
+      count = count_tail_repetitions(sequence, pattern_len)
+      return count if count >= threshold
+    end
+    nil
+  end
+
+  def count_tail_repetitions(sequence, length)
+    return 0 if sequence.size < length
+    pattern = sequence.last(length)
+    count = 1
+    pos = sequence.size - length
+    while pos >= length
+      candidate = sequence[(pos - length)...pos]
+      break unless candidate == pattern
+      count += 1
+      pos -= length
+    end
+    count
+  end
+end
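A brief usage sketch of the class above. With the default threshold of 3, the guard fires once the tail of the tool-call signature sequence repeats three times; the `threshold:` value here is illustrative:

```ruby
llm = LLM.openai(key: ENV["KEY"])
ctx = LLM::Context.new(llm)
ctx.guard = LLM::LoopGuard.new(threshold: 5)

# Conceptually: with [["a", "{}"], ["b", "{}"]] * 3 as the signature
# sequence, count_tail_repetitions(seq, 2) counts the two-call tail
# three times, so call(ctx) returns the "SYSTEM NOTICE" warning string;
# otherwise it returns nil and execution continues.
warning = ctx.guard.call(ctx)
```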
data/lib/llm/message.rb CHANGED
@@ -34,6 +34,7 @@ module LLM
     # @return [Hash]
     def to_h
       {role:, content:, reasoning_content:,
+       compaction: extra.compaction,
        tools: extra.tool_calls,
        usage:,
        original_tool_calls: extra.original_tool_calls}.compact
@@ -74,6 +75,13 @@ module LLM
       extra.reasoning_content
     end
 
+    ##
+    # Returns true when a message was created by context compaction
+    # @return [Boolean]
+    def compaction?
+      !!extra.compaction
+    end
+
     ##
     # Returns true when a message contains an image URL
     # @return [Boolean]
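The new `compaction` metadata makes synthetic summaries easy to filter when rendering history. A one-line sketch using the predicate added above, together with the enumerable `ctx.messages` collection the compactor already relies on:

```ruby
# Hide compaction summaries from a rendered transcript.
visible = ctx.messages.reject(&:compaction?)
```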
@@ -105,7 +105,7 @@ class LLM::Anthropic
     end
 
     def resolve_tool(tool)
-      registered = LLM::Function.find_by_name(tool["name"])
+      registered = @stream.find_tool(tool["name"])
       fn = (registered || LLM::Function.new(tool["name"])).dup.tap do |fn|
         fn.id = tool["id"]
         fn.arguments = LLM::Anthropic.parse_tool_input(tool["input"])
@@ -153,7 +153,7 @@ class LLM::Google
 
     def resolve_tool(part, cindex, pindex)
       call = part["functionCall"]
-      registered = LLM::Function.find_by_name(call["name"])
+      registered = @stream.find_tool(call["name"])
       fn = (registered || LLM::Function.new(call["name"])).dup.tap do |fn|
         fn.id = LLM::Google.tool_id(part:, cindex:, pindex:)
         fn.arguments = call["args"]
@@ -269,7 +269,7 @@ class LLM::OpenAI
     # @group Resolvers
 
     def resolve_tool(tool, arguments)
-      registered = LLM::Function.find_by_name(tool["name"])
+      registered = @stream.find_tool(tool["name"])
       fn = (registered || LLM::Function.new(tool["name"])).dup.tap do |fn|
         fn.id = tool["call_id"]
         fn.arguments = arguments
@@ -185,7 +185,7 @@ class LLM::OpenAI
     end
 
     def resolve_tool(tool, function, arguments)
-      registered = LLM::Function.find_by_name(function["name"])
+      registered = @stream.find_tool(function["name"])
       fn = (registered || LLM::Function.new(function["name"])).dup.tap do |fn|
         fn.id = tool["id"]
         fn.arguments = arguments
@@ -31,6 +31,14 @@ class LLM::Stream
     @items.empty?
   end
 
+  ##
+  # @return [nil]
+  def interrupt!
+    @items.each(&:interrupt!)
+    nil
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # Waits for queued work to finish and returns function results.
   # @param [Symbol, Array<Symbol>] strategy
data/lib/llm/stream.rb CHANGED
@@ -30,6 +30,13 @@ module LLM
       @extra ||= LLM::Object.from({})
     end
 
+    ##
+    # Returns the current context, if one was attached to the stream.
+    # @return [LLM::Context, nil]
+    def ctx
+      extra[:ctx]
+    end
+
     ##
     # Returns a lazily-initialized queue for tool results or spawned work.
     # @return [LLM::Stream::Queue]
@@ -70,17 +77,18 @@ module LLM
     ##
     # Called when a streamed tool call has been fully constructed.
     # @note A stream implementation may start tool execution here, for
-    #   example by pushing `tool.spawn(:thread)`, `tool.spawn(:fiber)`, or
-    #   `tool.spawn(:task)` onto {#queue}. Mixed strategies can also be
-    #   selected per tool, such as `tool.mcp? ? tool.spawn(:task) :
-    #   tool.spawn(:ractor)`. When a streamed tool cannot be resolved, `error`
-    #   is passed as an {LLM::Function::Return}. It can be sent back to the
-    #   model, allowing the tool-call path to recover and the session to
-    #   continue. Tool resolution depends on
-    #   {LLM::Function.registry}, which includes {LLM::Tool LLM::Tool}
-    #   subclasses, including MCP tools, but not functions defined with
-    #   {LLM.function}. The current `:ractor` mode is for class-based tools
-    #   and does not support MCP tools.
+    #   example by pushing `ctx.spawn(tool, :thread)`,
+    #   `ctx.spawn(tool, :fiber)`, or `ctx.spawn(tool, :task)` onto {#queue}.
+    #   Mixed strategies can also be selected per tool, such as
+    #   `tool.mcp? ? ctx.spawn(tool, :task) : ctx.spawn(tool, :ractor)`.
+    #   When a streamed tool cannot be resolved, `error` is passed as an
+    #   {LLM::Function::Return}. It can be sent back to the model, allowing
+    #   the tool-call path to recover and the session to continue. Streamed
+    #   tool resolution now prefers the current request tools, so
+    #   {LLM.function}, MCP tools, bound tool instances, and normal
+    #   {LLM::Tool LLM::Tool} classes can all resolve through the same
+    #   request-local path. The current `:ractor` mode is for class-based
+    #   tools and does not support MCP tools.
     # @param [LLM::Function] tool
     #   The parsed tool call.
     # @param [LLM::Function::Return, nil] error
@@ -93,8 +101,8 @@ module LLM
     ##
     # Called when queued streamed tool work returns.
     # @note This callback runs when {#wait} resolves work that was queued from
-    #   {#on_tool_call}, such as values returned by `tool.spawn(:thread)`,
-    #   `tool.spawn(:fiber)`, or `tool.spawn(:task)`.
+    #   {#on_tool_call}, such as values returned by `ctx.spawn(tool, :thread)`,
+    #   `ctx.spawn(tool, :fiber)`, or `ctx.spawn(tool, :task)`.
     # @param [LLM::Function] tool
     #   The tool that returned.
     # @param [LLM::Function::Return] result
@@ -140,6 +148,34 @@ module LLM
       })
     end
 
+    ##
+    # Returns the tool definitions available for the current streamed request.
+    # This prefers request-local tools attached to the stream and falls back
+    # to the current context defaults when present.
+    # @return [Array<LLM::Function, LLM::Tool>]
+    def tools
+      extra[:tools] || ctx&.params&.dig(:tools) || []
+    end
+
+    ##
+    # Resolves a streamed tool call against the current request tools first,
+    # then falls back to the global function registry.
+    # @param [String] name
+    # @return [LLM::Function, nil]
+    def find_tool(name)
+      tool = tools.find do |candidate|
+        candidate_name =
+          if candidate.respond_to?(:function)
+            candidate.function.name
+          else
+            candidate.name
+          end
+        candidate_name.to_s == name.to_s
+      end
+      tool&.then { _1.respond_to?(:function) ? _1.function : _1 } ||
+        LLM::Function.find_by_name(name)
+    end
+
     # @endgroup
   end
 end
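To make the fallback chain in `find_tool` concrete, a short sketch; `System` is the tool class from the README example, and mapping the class to the string name "system" is an assumption for illustration:

```ruby
llm = LLM.openai(key: ENV["KEY"])
stream = LLM::Stream.new # or a subclass, as in the README examples
ctx = LLM::Context.new(llm, stream:, tools: [System])

# Resolution order inside find_tool("system"):
#   1. stream.extra[:tools]             -- request-local tools set by bind!
#   2. stream.ctx&.params&.dig(:tools)  -- the context's default tools
#   3. LLM::Function.find_by_name(name) -- global registry fallback
tool = stream.find_tool("system")
```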
data/lib/llm/tool.rb CHANGED
@@ -185,4 +185,18 @@
   def mcp?
     self.class.mcp?
   end
+
+  ##
+  # Called when an in-flight tool run is interrupted.
+  # Tools can override this to implement cooperative cleanup.
+  # @return [nil]
+  def on_interrupt
+  end
+
+  ##
+  # Called when an in-flight tool run is cancelled.
+  # @return [nil]
+  def on_cancel
+    on_interrupt
+  end
 end
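A hypothetical tool using the new hook; the `call` signature and the tool body are assumptions for illustration, and only `on_interrupt`/`on_cancel` come from the diff above:

```ruby
require "open-uri"

class Downloader < LLM::Tool
  def call(url:)
    @io = URI.open(url)
    @io.read
  ensure
    @io&.close
  end

  # Reached (via on_cancel) when ctx.interrupt! notifies this tool run;
  # cooperative: the tool decides how to unwind. IO#close is idempotent,
  # so double-closing here is safe.
  def on_interrupt
    @io&.close
  end
end
```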
data/lib/llm/version.rb CHANGED
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module LLM
-  VERSION = "4.23.0"
+  VERSION = "5.1.0"
 end
data/lib/llm.rb CHANGED
@@ -23,6 +23,7 @@ module LLM
   require_relative "llm/stream"
   require_relative "llm/provider"
   require_relative "llm/context"
+  require_relative "llm/loop_guard"
   require_relative "llm/agent"
   require_relative "llm/buffer"
   require_relative "llm/function"
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: llm.rb
 version: !ruby/object:Gem::Version
-  version: 4.23.0
+  version: 5.1.0
 platform: ruby
 authors:
 - Antar Azri
@@ -298,6 +298,7 @@ files:
 - lib/llm/function/thread_group.rb
 - lib/llm/function/tracing.rb
 - lib/llm/json_adapter.rb
+- lib/llm/loop_guard.rb
 - lib/llm/mcp.rb
 - lib/llm/mcp/command.rb
 - lib/llm/mcp/error.rb