llm.rb 4.11.1 → 4.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f4c449483ce7a3b53411760d6376157fed3e23b4f013f23ae397255398bef368
- data.tar.gz: a9a9c82b107cde72edfe6fe5f68ea7b1ea5e493314883d101c453a94db81b601
+ metadata.gz: 79d4a45ec25408e46451475575e917ef9d8579bec32f1a6a78bfed235e5ae212
+ data.tar.gz: fdeb12175be3ef87e411021444305b9e785a9bf2d055dfdc7bf718f5740623d8
  SHA512:
- metadata.gz: 71a389b2fe654cfd053f45bd749c34b96c9d89ac60e984960f4a2720896588ba39056a3a92ab75a429572cd099961d9f3c02474f7dc43460b59866e41d8b5f28
- data.tar.gz: 4532ec55176751b32ed21b281f2f71395dcd32cdf318973a751decf171af0a9e5f3f75b75871542c578fd9a2a134f8fc5cbf6a54b1df3b2dbe0c47745122b900
+ metadata.gz: ea35b39b5476b75370485128dd8441e078bc7ac69236a7a50f4e32fb419f6fac5f7bb81faf3e029f28b788f4d69645e1b97e4126ea4f9fcc31f014921d2434a4
+ data.tar.gz: c73bbf806f5cef71bfadfc1368fbdbfe07bf37118df18ebec71f4914a27ae2a3858fa6a210ee4d7cdff8f672a14c59016604a72a0a90c611b37223c4652ee991
data/CHANGELOG.md CHANGED
@@ -1,11 +1,43 @@
  # Changelog

- ## Unreleased
+ ## v4.12.0

  Changes since `v4.11.1`.

+ This release expands advanced streaming and MCP execution while reframing
+ llm.rb more clearly as a system integration layer for LLMs, tools, MCP
+ sources, and application APIs.
+
+ ### Add
+
+ - Add `persistent` as an alias for `persist!` on providers and MCP transports.
+ - Add `LLM::Stream#on_tool_return` for observing completed streamed tool work.
+ - Add `LLM::Function::Return#error?`.
+
+ ### Change
+
+ - Expect advanced streaming callbacks to use `LLM::Stream` subclasses
+ instead of duck-typing them onto arbitrary objects. Basic `#<<`
+ streaming remains supported.
+
+ ### Fix
+
+ - Fix Anthropic tools without params by always emitting `input_schema`.
+ - Fix Anthropic tool-only responses to still produce an assistant message.
+ - Fix Anthropic tool results to use the `user` role.
+ - Fix Anthropic tool input normalization.
+
  ## v4.11.1

+ Changes since `v4.11.0`.
+
+ ### Fix
+
+ * Cast OpenTelemetry tool-related values to strings. <br>
+ Otherwise they're rejected by opentelemetry-sdk as invalid attributes.
+
+ ## v4.11.0
+
  Changes since `v4.10.0`.

  ### Add
data/README.md CHANGED
@@ -4,15 +4,16 @@
  <p align="center">
  <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
  <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
- <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.11.1-green.svg?" alt="Version"></a>
+ <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.12.0-green.svg?" alt="Version"></a>
  </p>

  ## About

- llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
- LLMs are part of your architecture, not just API calls. It gives you explicit
- control over contexts, tools, concurrency, and providers, so you can compose
- reliable, production-ready workflows without hidden abstractions.
+ llm.rb is a Ruby-centric system integration layer for building real
+ LLM-powered systems. It connects LLMs to real systems by turning APIs into
+ tools and unifying MCP, providers, and application logic into a single
+ execution model. It is used in production systems integrating external and
+ internal tools, including agents, MCP services, and OpenAPI-based APIs.

  Built for engineers who want to understand and control their LLM systems. No
  frameworks, no hidden magic — just composable primitives for building real
@@ -26,8 +27,10 @@ and capabilities of llm.rb.
  ## What Makes It Different

  Most LLM libraries stop at requests and responses. <br>
- llm.rb is built around the state and execution model around them:
+ llm.rb is built around the state and execution model behind them:

+ - **A system layer, not just an API wrapper** <br>
+ llm.rb unifies LLMs, tools, MCP servers, and application APIs into a single execution model.
  - **Contexts are central** <br>
  They hold history, tools, schema, usage, cost, persistence, and execution state.
  - **Contexts can be serialized** <br>
@@ -39,7 +42,7 @@ llm.rb is built around the state and execution model around them:
  Start tool work while a response is still streaming instead of waiting for the turn to finish. <br>
  This overlaps tool latency with model output and exposes streamed tool-call events for introspection, making it one of llm.rb's strongest execution features.
  - **HTTP MCP can reuse connections** <br>
- Opt into persistent HTTP pooling for repeated remote MCP tool calls with `persist!`.
+ Opt into persistent HTTP pooling for repeated remote MCP tool calls with `persistent`.
  - **One API across providers and capabilities** <br>
  The same model covers chat, files, images, audio, embeddings, vector stores, and more.
  - **Thread-safe where it matters** <br>
@@ -49,22 +52,48 @@ llm.rb is built around the state and execution model around them:
  - **Stdlib-only by default** <br>
  llm.rb runs on the Ruby standard library by default, with providers, optional features, and the model registry loaded only when you use them.

+ ## What llm.rb Enables
+
+ llm.rb acts as the integration layer between LLMs, tools, and real systems.
+
+ - Turn REST / OpenAPI APIs into LLM tools
+ - Connect multiple MCP sources (Notion, internal services, etc.)
+ - Build agents that operate across system boundaries
+ - Orchestrate tools from multiple providers and protocols
+ - Stream responses while executing tools concurrently
+ - Treat LLMs as part of your architecture, not isolated calls
+
+ Without llm.rb, providers, tool formats, and orchestration paths tend to stay
+ fragmented. With llm.rb, they share a unified execution model with composable
+ tools and a more consistent system architecture.
+
+ ## Real-World Usage
+
+ llm.rb is used to integrate external MCP services such as Notion, internal APIs
+ exposed via OpenAPI or `swagger.json`, and multiple tool sources into a unified
+ execution model. Common usage patterns include combining multiple MCP sources,
+ turning internal APIs into tools, and running those tools through the same
+ context and provider flow.
+
+ It supports multiple MCP sources, external SaaS integrations, internal APIs via
+ OpenAPI, and multiple LLM providers simultaneously.
+
  ## Architecture & Execution Model

- llm.rb is built in layers, each providing explicit control:
+ llm.rb sits at the center of the execution path, connecting tools, MCP
+ sources, APIs, providers, and your application through explicit contexts:

  ```
- ┌─────────────────────────────────────────┐
- │ Your Application │
- ├─────────────────────────────────────────┤
- │ Contexts & Agents │ ← Stateful workflows
- ├─────────────────────────────────────────┤
- │ Tools & Functions │ ← Concurrent execution
- ├─────────────────────────────────────────┤
- │ Unified Provider API (OpenAI, etc.) │ ← Provider abstraction
- ├─────────────────────────────────────────┤
- │ HTTP, JSON, Thread Safety │ ← Infrastructure
- └─────────────────────────────────────────┘
+ External MCP Internal MCP OpenAPI / REST
+
+ └────────── Tools / MCP Layer ──────────┘
+
+ llm.rb Contexts
+
+ LLM Providers
+ (OpenAI, Anthropic, etc.)
+
+ Your Application
  ```

  ### Key Design Decisions
@@ -103,6 +132,10 @@ llm.rb provides a complete set of primitives for building LLM-powered systems:

  ## Quick Start

+ These examples show individual features, but llm.rb is designed to combine
+ them into full systems where LLMs, tools, and external services operate
+ together.
+
  #### Simple Streaming

  At the simplest level, any object that implements `#<<` can receive visible
@@ -111,7 +144,8 @@ and other Ruby IO-style objects.

  For more control, llm.rb also supports advanced streaming patterns through
  [`LLM::Stream`](lib/llm/stream.rb). See [Advanced Streaming](#advanced-streaming)
- for a structured callback-based example:
+ for a structured callback-based example. Basic `#<<` streams only receive
+ visible output chunks:

  ```ruby
  #!/usr/bin/env ruby
@@ -215,28 +249,33 @@ ctx.talk(ctx.wait(:thread)) while ctx.functions.any?

  #### Advanced Streaming

- llm.rb also supports the [`LLM::Stream`](lib/llm/stream.rb) interface for
- structured streaming events:
+ Use [`LLM::Stream`](lib/llm/stream.rb) when you want more than plain `#<<`
+ output. It adds structured streaming callbacks for:

  - `on_content` for visible assistant output
  - `on_reasoning_content` for separate reasoning output
  - `on_tool_call` for streamed tool-call notifications
+ - `on_tool_return` for completed tool execution
+
+ Subclass [`LLM::Stream`](lib/llm/stream.rb) when you want callbacks like
+ `on_reasoning_content`, `on_tool_call`, and `on_tool_return`, or helpers like
+ `queue` and `wait`.

- Subclass [`LLM::Stream`](lib/llm/stream.rb) when you want features like
- `queue` and `wait`, or implement the same methods on your own object. Keep these
- callbacks fast: they run inline with the parser.
+ Keep `on_content`, `on_reasoning_content`, and `on_tool_call` fast: they run
+ inline with the streaming parser. `on_tool_return` is different: it runs later,
+ when `wait` resolves queued streamed tool work.

  `on_tool_call` lets tools start before the model finishes its turn, for
  example with `tool.spawn(:thread)`, `tool.spawn(:fiber)`, or
- `tool.spawn(:task)`. That can overlap tool latency with streaming output and
- gives you a first-class place to observe and instrument tool-call execution as
- it unfolds.
+ `tool.spawn(:task)`. That can overlap tool latency with streaming output.
+ `on_tool_return` is the place to react when that queued work completes, for
+ example by updating progress UIs, logging tool latency, or changing visible
+ state from "Running tool ..." to "Finished tool ...".

- If a stream cannot resolve a tool, `error` is an `LLM::Function::Return` that
- communicates the failure back to the LLM. That lets the tool-call path recover
- and keeps the session alive. It also leaves control in the callback: it can
- send `error`, spawn the tool when `error == nil`, or handle the situation
- however it sees fit.
+ If a stream cannot resolve a tool, `on_tool_call` receives `error` as an
+ `LLM::Function::Return`. That keeps the session alive and leaves control in
+ the callback: it can send `error`, spawn the tool when `error == nil`, or
+ handle the situation however it sees fit.

  In normal use this should be rare, since `on_tool_call` is usually called with
  a resolved tool and `error == nil`. To resolve a tool call, the tool must be
@@ -250,25 +289,22 @@ require "llm"
  # Assume `System < LLM::Tool` is already defined.

  class Stream < LLM::Stream
- attr_reader :content, :reasoning_content
-
- def initialize
- @content = +""
- @reasoning_content = +""
- end
-
  def on_content(content)
- @content << content
- print content
+ $stdout << content
  end

  def on_reasoning_content(content)
- @reasoning_content << content
+ $stderr << content
  end

  def on_tool_call(tool, error)
+ $stdout << "Running tool #{tool.name}\n"
  queue << (error || tool.spawn(:thread))
  end
+
+ def on_tool_return(tool, ret)
+ $stdout << (ret.error? ? "Tool #{tool.name} failed\n" : "Finished tool #{tool.name}\n")
+ end
  end

  llm = LLM.openai(key: ENV["KEY"])
@@ -282,6 +318,16 @@ end

  #### MCP

+ MCP is a first-class integration mechanism in llm.rb.
+
+ MCP allows llm.rb to treat external services, internal APIs, and system
+ capabilities as tools in a unified interface. This makes it possible to
+ connect multiple MCP sources simultaneously and expose your own APIs as tools.
+
+ In practice, this supports workflows such as external SaaS integrations,
+ multiple MCP sources in the same context, and OpenAPI -> MCP -> tools
+ pipelines for internal services.
+
  llm.rb integrates with the Model Context Protocol (MCP) to dynamically discover
  and use tools from external servers. This example starts a filesystem MCP
  server over stdio and makes its tools available to a context, enabling the LLM
@@ -309,7 +355,7 @@ end

  You can also connect to an MCP server over HTTP. This is useful when the
  server already runs remotely and exposes MCP through a URL instead of a local
- process. If you expect repeated tool calls, use `persist!` to reuse a
+ process. If you expect repeated tool calls, use `persistent` to reuse a
  process-wide HTTP connection pool. This requires the optional
  `net-http-persistent` gem:

@@ -321,7 +367,7 @@ llm = LLM.openai(key: ENV["KEY"])
  mcp = LLM::MCP.http(
  url: "https://api.githubcopilot.com/mcp/",
  headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
- ).persist!
+ ).persistent

  begin
  mcp.start
@@ -460,7 +506,7 @@ require "llm"
  LLM.json = :oj # Use Oj for faster JSON parsing

  # Enable HTTP connection pooling for high-throughput applications
- llm = LLM.openai(key: ENV["KEY"]).persist! # Uses net-http-persistent when available
+ llm = LLM.openai(key: ENV["KEY"]).persistent # Uses net-http-persistent when available
  ```

  #### Model Registry
@@ -9,11 +9,17 @@ class LLM::Function
  # @return [Object]
  attr_reader :task

+ ##
+ # @return [LLM::Function, nil]
+ attr_reader :function
+
  ##
  # @param [Thread, Fiber, Async::Task] task
+ # @param [LLM::Function, nil] function
  # @return [LLM::Function::Task]
- def initialize(task)
+ def initialize(task, function = nil)
  @task = task
+ @function = function
  end

  ##
data/lib/llm/function.rb CHANGED
@@ -41,6 +41,13 @@ class LLM::Function
  prepend LLM::Function::Tracing

  Return = Struct.new(:id, :name, :value) do
+ ##
+ # Returns true when the return value represents an error.
+ # @return [Boolean]
+ def error?
+ Hash === value && value[:error] == true
+ end
+
  ##
  # Returns a Hash representation of {LLM::Function::Return}
  # @return [Hash]
@@ -186,7 +193,7 @@ class LLM::Function
  else
  raise ArgumentError, "Unknown strategy: #{strategy.inspect}. Expected :thread, :task, or :fiber"
  end
- Task.new(task)
+ Task.new(task, self)
  ensure
  @called = true
  end
@@ -233,7 +240,11 @@ class LLM::Function
  when "LLM::Google"
  {name: @name, description: @description, parameters: @params}.compact
  when "LLM::Anthropic"
- {name: @name, description: @description, input_schema: @params}.compact
+ {
+ name: @name,
+ description: @description,
+ input_schema: @params || {type: "object", properties: {}}
+ }.compact
  else
  format_openai(provider)
  end
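Two behaviors in the `LLM::Function` changes above are small enough to exercise standalone: the `Return#error?` predicate and the `input_schema` fallback for tools without params. The sketch below mirrors that logic in plain Ruby; the `anthropic_format` helper name is invented for this example and is not part of llm.rb.

```ruby
# Standalone sketch of the Return#error? predicate and the
# input_schema fallback shown in the diff above.
Return = Struct.new(:id, :name, :value) do
  # True when the tool's return value is a Hash flagged as an error.
  def error?
    Hash === value && value[:error] == true
  end
end

ok  = Return.new("call_1", "system", {output: "done"})
err = Return.new("call_2", "system", {error: true, message: "boom"})

# Hypothetical helper mirroring the Anthropic branch of LLM::Function#format:
# when a tool declares no params, emit an empty object schema so that
# `input_schema` is always present in the request payload.
def anthropic_format(name:, description: nil, params: nil)
  {
    name: name,
    description: description,
    input_schema: params || {type: "object", properties: {}}
  }.compact
end

p ok.error?   # => false
p err.error?  # => true
p anthropic_format(name: "system")
```

The `.compact` still drops a `nil` description, but `input_schema` can no longer be dropped, which is the point of the fix.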
@@ -104,7 +104,7 @@ module LLM::MCP::Transport
  # Configures the transport to use a persistent HTTP connection pool
  # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
  # @example
- # mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persist!
+ # mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persistent
  # # do something with 'mcp'
  # @return [LLM::MCP::Transport::HTTP]
  def persist!
@@ -119,6 +119,7 @@ module LLM::MCP::Transport
  end
  self
  end
+ alias_method :persistent, :persist!

  private

@@ -84,6 +84,7 @@ module LLM::MCP::Transport
  def persist!
  self
  end
+ alias_method :persistent, :persist!

  private

data/lib/llm/mcp.rb CHANGED
@@ -104,13 +104,14 @@ class LLM::MCP
  # Configures an HTTP MCP transport to use a persistent connection pool
  # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
  # @example
- # mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persist!
+ # mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persistent
  # # do something with 'mcp'
  # @return [LLM::MCP]
  def persist!
  transport.persist!
  self
  end
+ alias_method :persistent, :persist!

  ##
  # Returns the tools provided by the MCP process.
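The `persist!` → `persistent` alias added across providers and MCP transports, and the tightened `streamable?` check in the provider diff that follows, can be sketched with stand-in classes. Everything below is a simplified mock for illustration, not the real llm.rb implementation.

```ruby
module LLM
  class Stream; end

  class Provider
    # persist! returns self so calls can be chained; `persistent` is a
    # plain alias_method, so both names dispatch to the same code.
    def persist!
      @persistent = true
      self
    end
    alias_method :persistent, :persist!

    # The 4.12.0 check: accept LLM::Stream instances, or any object
    # that responds to #<< for basic streaming.
    def streamable?(stream)
      Stream === stream || stream.respond_to?(:<<)
    end
  end
end

provider = LLM::Provider.new
p provider.persistent.equal?(provider)   # => true
p provider.streamable?(LLM::Stream.new)  # => true
p provider.streamable?($stdout)          # => true
p provider.streamable?(Object.new)       # => false
```

Note what the new check drops: an arbitrary object that only defines `on_content` no longer counts as streamable; advanced callbacks now go through `LLM::Stream` subclasses.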
data/lib/llm/provider.rb CHANGED
@@ -308,7 +308,7 @@ class LLM::Provider
  # This method configures a provider to use a persistent connection pool
  # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
  # @example
- # llm = LLM.openai(key: ENV["KEY"]).persist!
+ # llm = LLM.openai(key: ENV["KEY"]).persistent
  # # do something with 'llm'
  # @return [LLM::Provider]
  def persist!
@@ -317,14 +317,13 @@ class LLM::Provider
  tap { @client = client }
  end
  end
+ alias_method :persistent, :persist!

  ##
  # @param [Object] stream
  # @return [Boolean]
  def streamable?(stream)
- stream.respond_to?(:on_content) ||
- stream.respond_to?(:on_reasoning_content) ||
- stream.respond_to?(:<<)
+ LLM::Stream === stream || stream.respond_to?(:<<)
  end

  private
@@ -28,12 +28,19 @@ module LLM::Anthropic::RequestAdapter

  def adapt_message
  if message.tool_call?
- {role: message.role, content: message.extra[:original_tool_calls]}
+ {role: message.role, content: adapt_tool_calls}
  else
  {role: message.role, content: adapt_content(content)}
  end
  end

+ def adapt_tool_calls
+ message.extra[:tool_calls].filter_map do |tool|
+ next unless tool[:id] && tool[:name]
+ {type: "tool_use", id: tool[:id], name: tool[:name], input: LLM::Anthropic.parse_tool_input(tool[:arguments])}
+ end
+ end
+
  ##
  # @param [String, URI] content
  # The content to format
@@ -66,7 +66,8 @@ module LLM::Anthropic::ResponseAdapter
  private

  def adapt_choices
- texts.map.with_index do |choice, index|
+ source = texts.empty? && tools.any? ? [{"text" => ""}] : texts
+ source.map.with_index do |choice, index|
  extra = {
  index:, response: self,
  tool_calls: adapt_tool_calls(tools), original_tool_calls: tools
@@ -77,7 +78,11 @@ module LLM::Anthropic::ResponseAdapter

  def adapt_tool_calls(tools)
  (tools || []).filter_map do |tool|
- {id: tool.id, name: tool.name, arguments: tool.input}
+ {
+ id: tool.id,
+ name: tool.name,
+ arguments: LLM::Anthropic.parse_tool_input(tool.input)
+ }
  end
  end

@@ -105,7 +105,7 @@ class LLM::Anthropic
  registered = LLM::Function.find_by_name(tool["name"])
  fn = (registered || LLM::Function.new(tool["name"])).dup.tap do |fn|
  fn.id = tool["id"]
- fn.arguments = tool["input"]
+ fn.arguments = LLM::Anthropic.parse_tool_input(tool["input"])
  end
  [fn, (registered ? nil : @stream.tool_not_found(fn))]
  end
@@ -0,0 +1,23 @@
+ # frozen_string_literal: true
+
+ class LLM::Anthropic
+ module Utils
+ ##
+ # Normalizes Anthropic tool input to a Hash suitable for kwargs.
+ # @param input [Hash, String, nil]
+ # @return [Hash]
+ def parse_tool_input(input)
+ case input
+ when Hash then input
+ when String
+ parsed = LLM.json.load(input)
+ Hash === parsed ? parsed : {}
+ when nil then {}
+ else
+ input.respond_to?(:to_h) ? input.to_h : {}
+ end
+ rescue *LLM.json.parser_error
+ {}
+ end
+ end
+ end
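The new `parse_tool_input` normalization can be exercised standalone. In this sketch, stdlib `JSON` stands in for llm.rb's pluggable `LLM.json` backend, and `symbolize_names: true` is a choice made here for the example, not necessarily the gem's behavior.

```ruby
# Standalone version of the normalization in the new utils.rb above:
# Hashes pass through, JSON strings are parsed, and anything that
# cannot become a Hash collapses to {} rather than raising.
require "json"

def parse_tool_input(input)
  case input
  when Hash then input
  when String
    parsed = JSON.parse(input, symbolize_names: true)
    Hash === parsed ? parsed : {}
  when nil then {}
  else
    input.respond_to?(:to_h) ? input.to_h : {}
  end
rescue JSON::ParserError
  {}
end

p parse_tool_input({path: "/tmp"})          # Hash passes through untouched
p parse_tool_input('{"path": "/tmp"}')      # JSON string becomes a Hash
p parse_tool_input("not json")              # => {}
p parse_tool_input('["not", "a", "hash"]')  # => {}
p parse_tool_input(nil)                     # => {}
```

The fallback-to-`{}` behavior is what makes the earlier stream-parser and adapter changes safe: malformed provider input becomes empty kwargs instead of an exception mid-stream.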
@@ -14,6 +14,7 @@ module LLM
  # ctx.talk ["Tell me about this photo", ctx.local_file("/images/photo.png")]
  # ctx.messages.select(&:assistant?).each { print "[#{_1.role}]", _1.content, "\n" }
  class Anthropic < Provider
+ require_relative "anthropic/utils"
  require_relative "anthropic/error_handler"
  require_relative "anthropic/request_adapter"
  require_relative "anthropic/response_adapter"
@@ -21,6 +22,7 @@ module LLM
  require_relative "anthropic/models"
  require_relative "anthropic/files"
  include RequestAdapter
+ extend Utils

  HOST = "api.anthropic.com"

@@ -79,6 +81,15 @@ module LLM
  "assistant"
  end

+ ##
+ # Anthropic expects tool results to be sent as user messages
+ # containing `tool_result` content blocks rather than a distinct
+ # `tool` role.
+ # @return (see LLM::Provider#tool_role)
+ def tool_role
+ :user
+ end
+
  ##
  # Returns the default model for chat completions
  # @see https://docs.anthropic.com/en/docs/about-claude/models/all-models#model-comparison-table claude-sonnet-4-20250514
@@ -8,8 +8,10 @@ class LLM::Stream
  # returns an array of {LLM::Function::Return} values.
  class Queue
  ##
+ # @param [LLM::Stream] stream
  # @return [LLM::Stream::Queue]
- def initialize
+ def initialize(stream)
+ @stream = stream
  @items = []
  end

@@ -39,13 +41,24 @@ class LLM::Stream
  # @return [Array<LLM::Function::Return>]
  def wait(strategy)
  returns, tasks = @items.shift(@items.length).partition { LLM::Function::Return === _1 }
- returns.concat case strategy
+ results = case strategy
  when :thread then LLM::Function::ThreadGroup.new(tasks).wait
  when :task then LLM::Function::TaskGroup.new(tasks).wait
  when :fiber then LLM::Function::FiberGroup.new(tasks).wait
  else raise ArgumentError, "Unknown strategy: #{strategy.inspect}. Expected :thread, :task, or :fiber"
  end
+ returns.concat fire_hooks(tasks, results)
  end
  alias_method :value, :wait
+
+ private
+
+ def fire_hooks(tasks, results)
+ results.each_with_index do |ret, idx|
+ tool = tasks[idx]&.function
+ @stream.on_tool_return(tool, ret) if tool
+ end
+ results
+ end
  end
  end
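The hook-firing order added to `Queue#wait` above can be illustrated with simplified stand-ins: resolve the queued tasks first, then pair each task's function with its result by index and invoke `on_tool_return` once per task. None of the classes below are the real llm.rb ones; the index pairing assumes, as `fire_hooks` itself does, that group waits return results in task order.

```ruby
# Simplified stand-ins (not the real llm.rb classes) for the new
# resolve-then-notify sequence in LLM::Stream::Queue#wait.
Task = Struct.new(:thread, :function) do
  def wait
    thread.value
  end
end

class DemoStream
  attr_reader :events

  def initialize
    @events = []
  end

  # Mirrors LLM::Stream#on_tool_return: observe completed tool work.
  def on_tool_return(tool, ret)
    @events << [tool, ret]
  end

  # Mirrors wait + fire_hooks: results arrive in task order, so the
  # task at each index supplies the function for that result.
  def wait(tasks)
    results = tasks.map(&:wait)
    results.each_with_index do |ret, idx|
      tool = tasks[idx]&.function
      on_tool_return(tool, ret) if tool
    end
    results
  end
end

stream = DemoStream.new
tasks = [
  Task.new(Thread.new { "ok" }, "tool_a"),
  Task.new(Thread.new { "err" }, "tool_b")
]
p stream.wait(tasks)  # => ["ok", "err"]
p stream.events       # => [["tool_a", "ok"], ["tool_b", "err"]]
```

This is also why `on_tool_return` is exempt from the "keep callbacks fast" rule in the README: it fires during `wait`, after the streaming parser has finished, not inline with it.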
data/lib/llm/stream.rb CHANGED
@@ -5,20 +5,20 @@ module LLM
  # The {LLM::Stream LLM::Stream} class provides the callback interface for
  # streamed model output in llm.rb.
  #
- # A stream object can be an instance of {LLM::Stream LLM::Stream}, a
- # subclass that overrides the callbacks it needs, or any other object that
- # implements some or all of the same interface. {#queue} provides a small
- # helper for collecting asynchronous tool work started from a callback, and
- # {#tool_not_found} returns an in-band tool error when a streamed tool
- # cannot be resolved.
+ # A stream object can be an instance of {LLM::Stream LLM::Stream} or a
+ # subclass that overrides the callbacks it needs. For basic streaming,
+ # llm.rb also accepts any object that implements `#<<`. {#queue} provides
+ # a small helper for collecting asynchronous tool work started from a
+ # callback, and {#tool_not_found} returns an in-band tool error when a
+ # streamed tool cannot be resolved.
  #
  # @note The `on_*` callbacks run inline with the streaming parser. They
  # therefore block streaming progress and should generally return as
  # quickly as possible.
  #
- # The most common callback is {#on_content}, which also maps to {#<<} for
- # compatibility with `StringIO`-style objects. Providers may also call
- # {#on_reasoning_content} and {#on_tool_call} when that data is available.
+ # The most common callback is {#on_content}, which also maps to {#<<}.
+ # Providers may also call {#on_reasoning_content} and {#on_tool_call} when
+ # that data is available.
  class Stream
  require_relative "stream/queue"

@@ -26,7 +26,7 @@ module LLM
  # Returns a lazily-initialized queue for tool results or spawned work.
  # @return [LLM::Stream::Queue]
  def queue
- @queue ||= Queue.new
+ @queue ||= Queue.new(self)
  end

  ##
@@ -79,6 +79,20 @@ module LLM
  nil
  end

+ ##
+ # Called when queued streamed tool work returns.
+ # @note This callback runs when {#wait} resolves work that was queued from
+ # {#on_tool_call}, such as values returned by `tool.spawn(:thread)`,
+ # `tool.spawn(:fiber)`, or `tool.spawn(:task)`.
+ # @param [LLM::Function] tool
+ # The tool that returned.
+ # @param [LLM::Function::Return] ret
+ # The completed tool return.
+ # @return [nil]
+ def on_tool_return(tool, ret)
+ nil
+ end
+
  # @endgroup

  # @group Error handlers
data/lib/llm/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module LLM
- VERSION = "4.11.1"
+ VERSION = "4.12.0"
  end
data/llm.gemspec CHANGED
@@ -8,47 +8,15 @@ Gem::Specification.new do |spec|
  spec.authors = ["Antar Azri", "0x1eef", "Christos Maris", "Rodrigo Serrano"]
  spec.email = ["azantar@proton.me", "0x1eef@hardenedbsd.org"]

- spec.summary = <<~SUMMARY
- llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
- LLMs are part of your architecture, not just API calls. It gives you explicit
- control over contexts, tools, concurrency, and providers, so you can compose
- reliable, production-ready workflows without hidden abstractions.
- SUMMARY
+ spec.summary = "System integration layer for LLMs, tools, MCP, and APIs in Ruby."

  spec.description = <<~DESCRIPTION
- llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
- LLMs are part of your architecture, not just API calls. It gives you explicit
- control over contexts, tools, concurrency, and providers, so you can compose
- reliable, production-ready workflows without hidden abstractions.
-
- Built for engineers who want to understand and control their LLM systems. No
- frameworks, no hidden magic — just composable primitives for building real
- applications, from scripts to full systems like Relay.
-
- ## Key Features
-
- - **Contexts are central** — Hold history, tools, schema, usage, cost, persistence, and execution state
- - **Tool execution is explicit** — Run local, provider-native, and MCP tools sequentially or concurrently
- - **One API across providers** — Unified interface for OpenAI, Anthropic, Google, xAI, zAI, DeepSeek, Ollama, and LlamaCpp
- - **Thread-safe where it matters** — Providers are shareable, while contexts stay isolated and stateful
- - **Production-ready** — Cost tracking, observability, persistence, and performance tuning built in
- - **Stdlib-only by default** — Runs on Ruby standard library, with optional features loaded only when used
-
- ## Capabilities
-
- - Chat & Contexts with persistence
- - Streaming responses
- - Tool calling with JSON Schema validation
- - Concurrent execution (threads, fibers, async tasks)
- - Agents with auto-execution
- - Structured outputs
- - MCP (Model Context Protocol) support
- - Multimodal inputs (text, images, audio, documents)
- - Audio generation, transcription, translation
- - Image generation and editing
- - Files API for document processing
- - Embeddings and vector stores
- - Local model registry for capabilities, limits, and pricing
+ llm.rb is a Ruby-centric system integration layer for building LLM-powered
+ systems. It connects LLMs to real systems by turning APIs into tools and
+ unifying MCP, providers, contexts, and application logic in one execution
+ model. It supports explicit tool orchestration, concurrent execution,
+ streaming, multiple MCP sources, and multiple LLM providers for production
+ systems that integrate external and internal services.
  DESCRIPTION

  spec.license = "0BSD"
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: llm.rb
  version: !ruby/object:Gem::Version
- version: 4.11.1
+ version: 4.12.0
  platform: ruby
  authors:
  - Antar Azri
@@ -195,39 +195,12 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '1.7'
  description: |
- llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
- LLMs are part of your architecture, not just API calls. It gives you explicit
- control over contexts, tools, concurrency, and providers, so you can compose
- reliable, production-ready workflows without hidden abstractions.
-
- Built for engineers who want to understand and control their LLM systems. No
- frameworks, no hidden magic — just composable primitives for building real
- applications, from scripts to full systems like Relay.
-
- ## Key Features
-
- - **Contexts are central** — Hold history, tools, schema, usage, cost, persistence, and execution state
- - **Tool execution is explicit** — Run local, provider-native, and MCP tools sequentially or concurrently
- - **One API across providers** — Unified interface for OpenAI, Anthropic, Google, xAI, zAI, DeepSeek, Ollama, and LlamaCpp
- - **Thread-safe where it matters** — Providers are shareable, while contexts stay isolated and stateful
- - **Production-ready** — Cost tracking, observability, persistence, and performance tuning built in
- - **Stdlib-only by default** — Runs on Ruby standard library, with optional features loaded only when used
-
- ## Capabilities
-
- - Chat & Contexts with persistence
- - Streaming responses
- - Tool calling with JSON Schema validation
- - Concurrent execution (threads, fibers, async tasks)
- - Agents with auto-execution
- - Structured outputs
- - MCP (Model Context Protocol) support
- - Multimodal inputs (text, images, audio, documents)
- - Audio generation, transcription, translation
- - Image generation and editing
- - Files API for document processing
- - Embeddings and vector stores
- - Local model registry for capabilities, limits, and pricing
+ llm.rb is a Ruby-centric system integration layer for building LLM-powered
+ systems. It connects LLMs to real systems by turning APIs into tools and
+ unifying MCP, providers, contexts, and application logic in one execution
+ model. It supports explicit tool orchestration, concurrent execution,
+ streaming, multiple MCP sources, and multiple LLM providers for production
+ systems that integrate external and internal services.
  email:
  - azantar@proton.me
  - 0x1eef@hardenedbsd.org
@@ -300,6 +273,7 @@ files:
  - lib/llm/providers/anthropic/response_adapter/models.rb
  - lib/llm/providers/anthropic/response_adapter/web_search.rb
  - lib/llm/providers/anthropic/stream_parser.rb
+ - lib/llm/providers/anthropic/utils.rb
  - lib/llm/providers/deepseek.rb
  - lib/llm/providers/deepseek/request_adapter.rb
  - lib/llm/providers/deepseek/request_adapter/completion.rb
@@ -417,8 +391,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  requirements: []
  rubygems_version: 3.6.9
  specification_version: 4
- summary: llm.rb is a Ruby-centric toolkit for building real LLM-powered systems
- where LLMs are part of your architecture, not just API calls. It gives you explicit
- control over contexts, tools, concurrency, and providers, so you can compose reliable,
- production-ready workflows without hidden abstractions.
+ summary: System integration layer for LLMs, tools, MCP, and APIs in Ruby.
  test_files: []