llm.rb 4.11.0 → 4.12.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +33 -1
- data/README.md +268 -191
- data/lib/llm/function/task.rb +7 -1
- data/lib/llm/function.rb +13 -2
- data/lib/llm/mcp/transport/http.rb +2 -1
- data/lib/llm/mcp/transport/stdio.rb +1 -0
- data/lib/llm/mcp.rb +2 -1
- data/lib/llm/provider.rb +3 -4
- data/lib/llm/providers/anthropic/request_adapter/completion.rb +8 -1
- data/lib/llm/providers/anthropic/response_adapter/completion.rb +7 -2
- data/lib/llm/providers/anthropic/stream_parser.rb +1 -1
- data/lib/llm/providers/anthropic/utils.rb +23 -0
- data/lib/llm/providers/anthropic.rb +11 -0
- data/lib/llm/stream/queue.rb +15 -2
- data/lib/llm/stream.rb +24 -10
- data/lib/llm/tracer/telemetry.rb +2 -2
- data/lib/llm/version.rb +1 -1
- data/llm.gemspec +7 -39
- metadata +9 -38
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 79d4a45ec25408e46451475575e917ef9d8579bec32f1a6a78bfed235e5ae212
+  data.tar.gz: fdeb12175be3ef87e411021444305b9e785a9bf2d055dfdc7bf718f5740623d8
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ea35b39b5476b75370485128dd8441e078bc7ac69236a7a50f4e32fb419f6fac5f7bb81faf3e029f28b788f4d69645e1b97e4126ea4f9fcc31f014921d2434a4
+  data.tar.gz: c73bbf806f5cef71bfadfc1368fbdbfe07bf37118df18ebec71f4914a27ae2a3858fa6a210ee4d7cdff8f672a14c59016604a72a0a90c611b37223c4652ee991
data/CHANGELOG.md
CHANGED
@@ -1,9 +1,41 @@
 # Changelog
 
-##
+## v4.12.0
+
+Changes since `v4.11.1`.
+
+This release expands advanced streaming and MCP execution while reframing
+llm.rb more clearly as a system integration layer for LLMs, tools, MCP
+sources, and application APIs.
+
+### Add
+
+- Add `persistent` as an alias for `persist!` on providers and MCP transports.
+- Add `LLM::Stream#on_tool_return` for observing completed streamed tool work.
+- Add `LLM::Function::Return#error?`.
+
+### Change
+
+- Expect advanced streaming callbacks to use `LLM::Stream` subclasses
+  instead of duck-typing them onto arbitrary objects. Basic `#<<`
+  streaming remains supported.
+
+### Fix
+
+- Fix Anthropic tools without params by always emitting `input_schema`.
+- Fix Anthropic tool-only responses to still produce an assistant message.
+- Fix Anthropic tool results to use the `user` role.
+- Fix Anthropic tool input normalization.
+
 ## v4.11.1
 
 Changes since `v4.11.0`.
 
+### Fix
+
+* Cast OpenTelemetry tool-related values to strings. <br>
+  Otherwise they're rejected by opentelemetry-sdk as invalid attributes.
+
 ## v4.11.0
 
 Changes since `v4.10.0`.
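The changelog's new `LLM::Function::Return#error?` can be sketched in isolation. The stand-in struct below mirrors the implementation visible in the `data/lib/llm/function.rb` hunk later in this diff; the sample ids and values are invented for illustration:

```ruby
# Stand-in mirroring LLM::Function::Return from the diff:
# a tool return counts as an error when its value is a Hash with error: true.
Return = Struct.new(:id, :name, :value) do
  def error?
    Hash === value && value[:error] == true
  end
end

ok  = Return.new("call_1", "system", {success: true})
err = Return.new("call_2", "system", {error: true, message: "command not found"})
[ok.error?, err.error?] # => [false, true]
```

Note that a non-Hash value (for example a plain String result) never reads as an error under this predicate.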
data/README.md
CHANGED
@@ -4,15 +4,16 @@
 <p align="center">
 <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
 <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-<a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.
+<a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.12.0-green.svg?" alt="Version"></a>
 </p>
 
 ## About
 
-llm.rb is a Ruby-centric
-
-
-
+llm.rb is a Ruby-centric system integration layer for building real
+LLM-powered systems. It connects LLMs to real systems by turning APIs into
+tools and unifying MCP, providers, and application logic into a single
+execution model. It is used in production systems integrating external and
+internal tools, including agents, MCP services, and OpenAPI-based APIs.
 
 Built for engineers who want to understand and control their LLM systems. No
 frameworks, no hidden magic — just composable primitives for building real
@@ -26,17 +27,22 @@ and capabilities of llm.rb.
 ## What Makes It Different
 
 Most LLM libraries stop at requests and responses. <br>
-llm.rb is built around the state and execution model
+llm.rb is built around the state and execution model behind them:
 
+- **A system layer, not just an API wrapper** <br>
+  llm.rb unifies LLMs, tools, MCP servers, and application APIs into a single execution model.
 - **Contexts are central** <br>
   They hold history, tools, schema, usage, cost, persistence, and execution state.
+- **Contexts can be serialized** <br>
+  A context can be serialized to JSON and stored on disk, in a database, in a
+  job queue, or anywhere else your application needs to persist state.
 - **Tool execution is explicit** <br>
   Run local, provider-native, and MCP tools sequentially or concurrently with threads, fibers, or async tasks.
 - **Run tools while streaming** <br>
   Start tool work while a response is still streaming instead of waiting for the turn to finish. <br>
-  This
+  This overlaps tool latency with model output and exposes streamed tool-call events for introspection, making it one of llm.rb's strongest execution features.
 - **HTTP MCP can reuse connections** <br>
-  Opt into persistent HTTP pooling for repeated remote MCP tool calls with `
+  Opt into persistent HTTP pooling for repeated remote MCP tool calls with `persistent`.
 - **One API across providers and capabilities** <br>
   The same model covers chat, files, images, audio, embeddings, vector stores, and more.
 - **Thread-safe where it matters** <br>
@@ -46,22 +52,48 @@ llm.rb is built around the state and execution model around them:
 - **Stdlib-only by default** <br>
   llm.rb runs on the Ruby standard library by default, with providers, optional features, and the model registry loaded only when you use them.
 
+## What llm.rb Enables
+
+llm.rb acts as the integration layer between LLMs, tools, and real systems.
+
+- Turn REST / OpenAPI APIs into LLM tools
+- Connect multiple MCP sources (Notion, internal services, etc.)
+- Build agents that operate across system boundaries
+- Orchestrate tools from multiple providers and protocols
+- Stream responses while executing tools concurrently
+- Treat LLMs as part of your architecture, not isolated calls
+
+Without llm.rb, providers, tool formats, and orchestration paths tend to stay
+fragmented. With llm.rb, they share a unified execution model with composable
+tools and a more consistent system architecture.
+
+## Real-World Usage
+
+llm.rb is used to integrate external MCP services such as Notion, internal APIs
+exposed via OpenAPI or `swagger.json`, and multiple tool sources into a unified
+execution model. Common usage patterns include combining multiple MCP sources,
+turning internal APIs into tools, and running those tools through the same
+context and provider flow.
+
+It supports multiple MCP sources, external SaaS integrations, internal APIs via
+OpenAPI, and multiple LLM providers simultaneously.
+
 ## Architecture & Execution Model
 
-llm.rb
+llm.rb sits at the center of the execution path, connecting tools, MCP
+sources, APIs, providers, and your application through explicit contexts:
 
 ```
-
-│
-
-│
-
-│
-
-
-
-└─────────────────────────────────────────┘
+External MCP        Internal MCP        OpenAPI / REST
+     │                   │                    │
+     └────────── Tools / MCP Layer ──────────┘
+                         │
+                  llm.rb Contexts
+                         │
+                   LLM Providers
+            (OpenAI, Anthropic, etc.)
+                         │
+                  Your Application
 ```
 
 ### Key Design Decisions
@@ -100,167 +132,150 @@ llm.rb provides a complete set of primitives for building LLM-powered systems:
 
 ## Quick Start
 
-
+These examples show individual features, but llm.rb is designed to combine
+them into full systems where LLMs, tools, and external services operate
+together.
 
-
-assistant turn is fully finished. That means tool latency can overlap with
-streaming output instead of happening strictly after it. If your model emits
-tool calls early, this can noticeably reduce end-to-end latency for real
-systems.
+#### Simple Streaming
 
-
-
+At the simplest level, any object that implements `#<<` can receive visible
+output as it arrives. This works with `$stdout`, `StringIO`, files, sockets,
+and other Ruby IO-style objects.
 
-For
+For more control, llm.rb also supports advanced streaming patterns through
+[`LLM::Stream`](lib/llm/stream.rb). See [Advanced Streaming](#advanced-streaming)
+for a structured callback-based example. Basic `#<<` streams only receive
+visible output chunks:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
 
-class System < LLM::Tool
-  name "system"
-  description "Run a shell command"
-  params { _1.object(command: _1.string.required) }
-
-  def call(command:)
-    {success: Kernel.system(command)}
-  end
-end
-
-class Stream < LLM::Stream
-  def on_content(content)
-    print content
-  end
-
-  def on_tool_call(tool, error)
-    queue << (error || tool.spawn(:thread))
-  end
-end
-
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream:
-
-
-ctx.talk(
+ctx = LLM::Context.new(llm, stream: $stdout)
+loop do
+  print "> "
+  ctx.talk(STDIN.gets || break)
+  puts
+end
 ```
 
-####
+#### Structured Outputs
 
-
-
-
-
-
-
-operations:
+The `LLM::Schema` system lets you define JSON schemas for structured outputs.
+Schemas can be defined as classes with `property` declarations or built
+programmatically using a fluent interface. When you pass a schema to a context,
+llm.rb adapts it into the provider's structured-output format when that
+provider supports one. The `content!` method then parses the assistant's JSON
+response into a Ruby object:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
+require "pp"
+
+class Report < LLM::Schema
+  property :category, Enum["performance", "security", "outage"], "Report category", required: true
+  property :summary, String, "Short summary", required: true
+  property :impact, OneOf[String, Integer], "Primary impact, as text or a count", required: true
+  property :services, Array[String], "Impacted services", required: true
+  property :timestamp, String, "When it happened", optional: true
+end
 
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm,
+ctx = LLM::Context.new(llm, schema: Report)
+res = ctx.talk("Structure this report: 'Database latency spiked at 10:42 UTC, causing 5% request timeouts for 12 minutes.'")
+pp res.content!
 
-#
-
-
+# {
+#   "category" => "performance",
+#   "summary" => "Database latency spiked, causing 5% request timeouts for 12 minutes.",
+#   "impact" => "5% request timeouts",
+#   "services" => ["Database"],
+#   "timestamp" => "2024-06-05T10:42:00Z"
+# }
 ```
 
-####
+#### Tool Calling
 
-llm.rb
-
-
-
-
-
-isolated to a single thread:
+Tools in llm.rb can be defined as classes inheriting from `LLM::Tool` or as
+closures using `LLM.function`. When the LLM requests a tool call, the context
+stores `Function` objects in `ctx.functions`. The `call()` method executes all
+pending functions and returns their results to the LLM. Tools describe
+structured parameters with JSON Schema and adapt those definitions to each
+provider's tool-calling format (OpenAI, Anthropic, Google, etc.):
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
 
-
-
+class System < LLM::Tool
+  name "system"
+  description "Run a shell command"
+  param :command, String, "Command to execute", required: true
 
-
-
-
-ctx.talk("List the directories in this project.")
-ctx.talk(ctx.call(:functions)) while ctx.functions.any?
-ensure
-  mcp.stop
+  def call(command:)
+    {success: system(command)}
+  end
 end
-```
-
-You can also connect to an MCP server over HTTP. This is useful when the
-server already runs remotely and exposes MCP through a URL instead of a local
-process. If you expect repeated tool calls, use `persist!` to reuse a
-process-wide HTTP connection pool. This requires the optional
-`net-http-persistent` gem:
-
-```ruby
-#!/usr/bin/env ruby
-require "llm"
 
 llm = LLM.openai(key: ENV["KEY"])
-
-
-
-).persist!
-
-begin
-  mcp.start
-  ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
-  ctx.talk("List the available GitHub MCP toolsets.")
-  ctx.talk(ctx.call(:functions)) while ctx.functions.any?
-ensure
-  mcp.stop
-end
+ctx = LLM::Context.new(llm, stream: $stdout, tools: [System])
+ctx.talk("Run `date`.")
+ctx.talk(ctx.call(:functions)) while ctx.functions.any?
 ```
 
-####
+#### Concurrent Tools
 
-
-
-
+llm.rb provides explicit concurrency control for tool execution. The
+`wait(:thread)` method spawns each pending function in its own thread and waits
+for all to complete. You can also use `:fiber` for cooperative multitasking or
+`:task` for async/await patterns (requires the `async` gem). The context
+automatically collects all results and reports them back to the LLM in a
+single turn, maintaining conversation flow while parallelizing independent
+operations:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
 
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: $stdout)
-
-
-
-
-end
+ctx = LLM::Context.new(llm, stream: $stdout, tools: [FetchWeather, FetchNews, FetchStock])
+
+# Execute multiple independent tools concurrently
+ctx.talk("Summarize the weather, headlines, and stock price.")
+ctx.talk(ctx.wait(:thread)) while ctx.functions.any?
 ```
 
 #### Advanced Streaming
 
-
-structured streaming
+Use [`LLM::Stream`](lib/llm/stream.rb) when you want more than plain `#<<`
+output. It adds structured streaming callbacks for:
 
 - `on_content` for visible assistant output
 - `on_reasoning_content` for separate reasoning output
 - `on_tool_call` for streamed tool-call notifications
+- `on_tool_return` for completed tool execution
+
+Subclass [`LLM::Stream`](lib/llm/stream.rb) when you want callbacks like
+`on_reasoning_content`, `on_tool_call`, and `on_tool_return`, or helpers like
+`queue` and `wait`.
 
-
-
-
+Keep `on_content`, `on_reasoning_content`, and `on_tool_call` fast: they run
+inline with the streaming parser. `on_tool_return` is different: it runs later,
+when `wait` resolves queued streamed tool work.
 
 `on_tool_call` lets tools start before the model finishes its turn, for
 example with `tool.spawn(:thread)`, `tool.spawn(:fiber)`, or
-`tool.spawn(:task)`.
-
+`tool.spawn(:task)`. That can overlap tool latency with streaming output.
+`on_tool_return` is the place to react when that queued work completes, for
+example by updating progress UIs, logging tool latency, or changing visible
+state from "Running tool ..." to "Finished tool ...".
 
-If a stream cannot
-
-
-
-however it sees fit.
+If a stream cannot resolve a tool, `on_tool_call` receives `error` as an
+`LLM::Function::Return`. That keeps the session alive and leaves control in
+the callback: it can send `error`, spawn the tool when `error == nil`, or
+handle the situation however it sees fit.
 
 In normal use this should be rare, since `on_tool_call` is usually called with
 a resolved tool and `error == nil`. To resolve a tool call, the tool must be
@@ -274,25 +289,22 @@ require "llm"
 # Assume `System < LLM::Tool` is already defined.
 
 class Stream < LLM::Stream
-  attr_reader :content, :reasoning_content
-
-  def initialize
-    @content = +""
-    @reasoning_content = +""
-  end
-
   def on_content(content)
-
-    print content
+    $stdout << content
   end
 
   def on_reasoning_content(content)
-
+    $stderr << content
   end
 
   def on_tool_call(tool, error)
+    $stdout << "Running tool #{tool.name}\n"
     queue << (error || tool.spawn(:thread))
   end
+
+  def on_tool_return(tool, ret)
+    $stdout << (ret.error? ? "Tool #{tool.name} failed\n" : "Finished tool #{tool.name}\n")
+  end
 end
 
 llm = LLM.openai(key: ENV["KEY"])
@@ -304,69 +316,67 @@ while ctx.functions.any?
 end
 ```
 
-####
+#### MCP
 
-
-
-
-
-
-
+MCP is a first-class integration mechanism in llm.rb.
+
+MCP allows llm.rb to treat external services, internal APIs, and system
+capabilities as tools in a unified interface. This makes it possible to
+connect multiple MCP sources simultaneously and expose your own APIs as tools.
+
+In practice, this supports workflows such as external SaaS integrations,
+multiple MCP sources in the same context, and OpenAPI -> MCP -> tools
+pipelines for internal services.
+
+llm.rb integrates with the Model Context Protocol (MCP) to dynamically discover
+and use tools from external servers. This example starts a filesystem MCP
+server over stdio and makes its tools available to a context, enabling the LLM
+to interact with the local file system through a standardized interface.
+Use `LLM::MCP.stdio` or `LLM::MCP.http` when you want to make the transport
+explicit. Like `LLM::Context`, an MCP client is stateful and should remain
+isolated to a single thread:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
 
-
-
-description "Run a shell command"
-param :command, String, "Command to execute", required: true
+llm = LLM.openai(key: ENV["KEY"])
+mcp = LLM::MCP.stdio(argv: ["npx", "-y", "@modelcontextprotocol/server-filesystem", Dir.pwd])
 
-
-
-
+begin
+  mcp.start
+  ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
+  ctx.talk("List the directories in this project.")
+  ctx.talk(ctx.call(:functions)) while ctx.functions.any?
+ensure
+  mcp.stop
 end
-
-llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: $stdout, tools: [System])
-ctx.talk("Run `date`.")
-ctx.talk(ctx.call(:functions)) while ctx.functions.any?
 ```
 
-
-
-
-
-
-llm.rb adapts it into the provider's structured-output format when that
-provider supports one. The `content!` method then parses the assistant's JSON
-response into a Ruby object:
+You can also connect to an MCP server over HTTP. This is useful when the
+server already runs remotely and exposes MCP through a URL instead of a local
+process. If you expect repeated tool calls, use `persistent` to reuse a
+process-wide HTTP connection pool. This requires the optional
+`net-http-persistent` gem:
 
 ```ruby
 #!/usr/bin/env ruby
 require "llm"
-require "pp"
-
-class Report < LLM::Schema
-  property :category, Enum["performance", "security", "outage"], "Report category", required: true
-  property :summary, String, "Short summary", required: true
-  property :impact, OneOf[String, Integer], "Primary impact, as text or a count", required: true
-  property :services, Array[String], "Impacted services", required: true
-  property :timestamp, String, "When it happened", optional: true
-end
 
 llm = LLM.openai(key: ENV["KEY"])
-
-
-
+mcp = LLM::MCP.http(
+  url: "https://api.githubcopilot.com/mcp/",
+  headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
+).persistent
 
-
-
-
-
-
-
-
+begin
+  mcp.start
+  ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
+  ctx.talk("List the available GitHub MCP toolsets.")
+  ctx.talk(ctx.call(:functions)) while ctx.functions.any?
+ensure
+  mcp.stop
+end
 ```
 
 ## Providers
@@ -496,7 +506,7 @@ require "llm"
 LLM.json = :oj # Use Oj for faster JSON parsing
 
 # Enable HTTP connection pooling for high-throughput applications
-llm = LLM.openai(key: ENV["KEY"]).
+llm = LLM.openai(key: ENV["KEY"]).persistent # Uses net-http-persistent when available
 ```
 
 #### Model Registry
@@ -542,11 +552,11 @@ res = ctx.talk("What is the capital of France?")
 puts res.content
 ```
 
-#### Context Persistence
+#### Context Persistence: Vanilla
 
-Contexts can be serialized and restored across process boundaries.
-
-
+Contexts can be serialized and restored across process boundaries. A context
+can be serialized to JSON and stored on disk, in a database, in a job queue,
+or anywhere else your application needs to persist state:
 
 ```ruby
 #!/usr/bin/env ruby
@@ -556,12 +566,79 @@ llm = LLM.openai(key: ENV["KEY"])
 ctx = LLM::Context.new(llm)
 ctx.talk("Hello")
 ctx.talk("Remember that my favorite language is Ruby")
-
+
+# Serialize to a string when you want to store the context yourself,
+# for example in a database row or job payload.
+payload = ctx.to_json
 
 restored = LLM::Context.new(llm)
-restored.restore(
+restored.restore(string: payload)
 res = restored.talk("What is my favorite language?")
 puts res.content
+
+# You can also persist the same state to a file:
+ctx.save(path: "context.json")
+restored = LLM::Context.new(llm)
+restored.restore(path: "context.json")
+```
+
+#### Context Persistence: ActiveRecord (Rails)
+
+In a Rails application, you can also wrap persisted context state in an
+ActiveRecord model. A minimal schema would include a `snapshot` column for the
+serialized context payload (`jsonb` is recommended) and a `provider` column
+for the provider name:
+
+```ruby
+create_table :contexts do |t|
+  t.jsonb :snapshot
+  t.string :provider, null: false
+  t.timestamps
+end
+```
+
+For example:
+
+```ruby
+class Context < ApplicationRecord
+  def talk(...)
+    ctx.talk(...).tap { flush }
+  end
+
+  def wait(...)
+    ctx.wait(...).tap { flush }
+  end
+
+  def messages
+    ctx.messages
+  end
+
+  def model
+    ctx.model
+  end
+
+  def flush
+    update_column(:snapshot, ctx.to_json)
+  end
+
+  private
+
+  def ctx
+    @ctx ||= begin
+      ctx = LLM::Context.new(llm)
+      ctx.restore(string: snapshot) if snapshot
+      ctx
+    end
+  end
+
+  def llm
+    LLM.method(provider).call(key: ENV.fetch(key))
+  end
+
+  def key
+    "#{provider.upcase}_KEY"
+  end
+end
 ```
 
 #### Agents
data/lib/llm/function/task.rb
CHANGED
@@ -9,11 +9,17 @@ class LLM::Function
   # @return [Object]
   attr_reader :task
 
+  ##
+  # @return [LLM::Function, nil]
+  attr_reader :function
+
   ##
   # @param [Thread, Fiber, Async::Task] task
+  # @param [LLM::Function, nil] function
   # @return [LLM::Function::Task]
-  def initialize(task)
+  def initialize(task, function = nil)
     @task = task
+    @function = function
   end
 
   ##
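The hunk above threads the originating function through to the task wrapper, so code that later drains queued work can tell which tool produced each result. A minimal stand-in (the class name and constructor shape follow the diff; `:system_tool` and the thread body are invented):

```ruby
# Stand-in mirroring the diffed LLM::Function::Task: it now remembers
# which function spawned it, defaulting to nil for backwards compatibility.
class Task
  attr_reader :task, :function

  def initialize(task, function = nil)
    @task = task
    @function = function
  end
end

work = Task.new(Thread.new { 42 }, :system_tool)
work.task.join
work.function # identifies the spawning tool (here the placeholder :system_tool)
```

Because `function` defaults to `nil`, existing one-argument callers keep working unchanged.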
data/lib/llm/function.rb
CHANGED
@@ -41,6 +41,13 @@ class LLM::Function
   prepend LLM::Function::Tracing
 
   Return = Struct.new(:id, :name, :value) do
+    ##
+    # Returns true when the return value represents an error.
+    # @return [Boolean]
+    def error?
+      Hash === value && value[:error] == true
+    end
+
     ##
     # Returns a Hash representation of {LLM::Function::Return}
     # @return [Hash]

@@ -186,7 +193,7 @@ class LLM::Function
     else
       raise ArgumentError, "Unknown strategy: #{strategy.inspect}. Expected :thread, :task, or :fiber"
     end
-    Task.new(task)
+    Task.new(task, self)
   ensure
     @called = true
   end

@@ -233,7 +240,11 @@ class LLM::Function
     when "LLM::Google"
       {name: @name, description: @description, parameters: @params}.compact
     when "LLM::Anthropic"
-      {
+      {
+        name: @name,
+        description: @description,
+        input_schema: @params || {type: "object", properties: {}}
+      }.compact
     else
       format_openai(provider)
     end
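The Anthropic hunk above guarantees an `input_schema` key even for parameterless tools, which is what the changelog's "tools without params" fix refers to. The fallback expression can be exercised standalone; the hash shape is taken from the diff, while the wrapper method and sample inputs are hypothetical:

```ruby
# Sketch of the diffed fallback: when a tool declares no params,
# emit an empty-but-valid JSON Schema object instead of omitting the key.
def anthropic_format(name:, description:, params: nil)
  {
    name: name,
    description: description,
    input_schema: params || {type: "object", properties: {}}
  }.compact
end

anthropic_format(name: "system", description: "Run a shell command")
# => {name: "system", description: "Run a shell command",
#     input_schema: {type: "object", properties: {}}}
```

With `params` present, the declared schema passes through untouched; only the nil case takes the fallback.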
data/lib/llm/mcp/transport/http.rb
CHANGED

@@ -104,7 +104,7 @@ module LLM::MCP::Transport
   # Configures the transport to use a persistent HTTP connection pool
   # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
   # @example
-  #   mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).
+  #   mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persistent
   #   # do something with 'mcp'
   # @return [LLM::MCP::Transport::HTTP]
   def persist!

@@ -119,6 +119,7 @@ module LLM::MCP::Transport
     end
     self
   end
+  alias_method :persistent, :persist!
 
   private
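The `persistent` alias added here (and on `LLM::MCP` and `LLM::Provider` below) can be sketched with a toy class; the `Client` class and its body are invented, while the `alias_method` line matches the diff:

```ruby
# Sketch of the alias pattern from the diff: persistent and persist!
# are the same method, and both return self for chaining.
class Client
  def persist!
    @persistent = true
    self
  end
  alias_method :persistent, :persist!
end

client = Client.new
client.persistent.equal?(client) # => true, same chaining as persist!
```

The alias exists purely for readability at call sites (`LLM.openai(...).persistent` reads as configuration, not mutation); behavior is identical.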
data/lib/llm/mcp.rb
CHANGED
@@ -104,13 +104,14 @@ class LLM::MCP
   # Configures an HTTP MCP transport to use a persistent connection pool
   # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
   # @example
-  #   mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).
+  #   mcp = LLM.mcp(http: {url: "https://example.com/mcp"}).persistent
   #   # do something with 'mcp'
   # @return [LLM::MCP]
   def persist!
     transport.persist!
     self
   end
+  alias_method :persistent, :persist!
 
   ##
   # Returns the tools provided by the MCP process.
data/lib/llm/provider.rb CHANGED
@@ -308,7 +308,7 @@ class LLM::Provider
   # This method configures a provider to use a persistent connection pool
   # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
   # @example
-  #   llm = LLM.openai(key: ENV["KEY"]).
+  #   llm = LLM.openai(key: ENV["KEY"]).persistent
   #   # do something with 'llm'
   # @return [LLM::Provider]
   def persist!
@@ -317,14 +317,13 @@ class LLM::Provider
       tap { @client = client }
     end
   end
+  alias_method :persistent, :persist!

   ##
   # @param [Object] stream
   # @return [Boolean]
   def streamable?(stream)
-    stream.respond_to?(
-    stream.respond_to?(:on_reasoning_content) ||
-    stream.respond_to?(:<<)
+    LLM::Stream === stream || stream.respond_to?(:<<)
   end

   private
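The simplified `streamable?` check above replaces a chain of `respond_to?` probes with one class check plus one duck-type check. A minimal standalone sketch of the same pattern, where `Stream` is a stand-in class for the real `LLM::Stream`:

```ruby
# Stand-in for LLM::Stream; the real check uses the library's class.
class Stream; end

# A stream is accepted when it is a Stream (or subclass), or any
# object that implements #<< for receiving streamed chunks.
def streamable?(stream)
  Stream === stream || stream.respond_to?(:<<)
end

streamable?(Stream.new) # => true  (Stream instance)
streamable?(+"")        # => true  (String implements #<<)
streamable?(Object.new) # => false (neither)
```

Using `Module#===` keeps subclasses of the stream class valid, while the `#<<` fallback preserves support for plain sinks such as `String`, `Array`, or an `IO` object.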
@@ -28,12 +28,19 @@ module LLM::Anthropic::RequestAdapter

   def adapt_message
     if message.tool_call?
-      {role: message.role, content:
+      {role: message.role, content: adapt_tool_calls}
     else
       {role: message.role, content: adapt_content(content)}
     end
   end

+  def adapt_tool_calls
+    message.extra[:tool_calls].filter_map do |tool|
+      next unless tool[:id] && tool[:name]
+      {type: "tool_use", id: tool[:id], name: tool[:name], input: LLM::Anthropic.parse_tool_input(tool[:arguments])}
+    end
+  end
+
   ##
   # @param [String, URI] content
   # The content to format
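The new `adapt_tool_calls` method uses `filter_map` so that one pass both transforms each recorded tool call into an Anthropic-style `tool_use` block and drops malformed entries. A standalone sketch of that pattern, with hypothetical sample data:

```ruby
# Hypothetical recorded tool calls; the second lacks an id and
# should be skipped rather than sent to the API.
tool_calls = [
  {id: "call_1", name: "weather", arguments: {city: "Oslo"}},
  {id: nil, name: "broken", arguments: {}}
]

# filter_map: `next` (yielding nil) silently drops invalid entries,
# everything else is mapped to a tool_use content block.
blocks = tool_calls.filter_map do |tool|
  next unless tool[:id] && tool[:name]
  {type: "tool_use", id: tool[:id], name: tool[:name], input: tool[:arguments]}
end

blocks.size          # => 1
blocks.first[:type]  # => "tool_use"
```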
@@ -66,7 +66,8 @@ module LLM::Anthropic::ResponseAdapter
   private

   def adapt_choices
-    texts.map.with_index do |choice, index|
+    source = texts.empty? && tools.any? ? [{"text" => ""}] : texts
+    source.map.with_index do |choice, index|
       extra = {
         index:, response: self,
         tool_calls: adapt_tool_calls(tools), original_tool_calls: tools
@@ -77,7 +78,11 @@ module LLM::Anthropic::ResponseAdapter

   def adapt_tool_calls(tools)
     (tools || []).filter_map do |tool|
-      {
+      {
+        id: tool.id,
+        name: tool.name,
+        arguments: LLM::Anthropic.parse_tool_input(tool.input)
+      }
     end
   end

@@ -105,7 +105,7 @@ class LLM::Anthropic
     registered = LLM::Function.find_by_name(tool["name"])
     fn = (registered || LLM::Function.new(tool["name"])).dup.tap do |fn|
       fn.id = tool["id"]
-      fn.arguments = tool["input"]
+      fn.arguments = LLM::Anthropic.parse_tool_input(tool["input"])
     end
     [fn, (registered ? nil : @stream.tool_not_found(fn))]
   end
@@ -0,0 +1,23 @@
+# frozen_string_literal: true
+
+class LLM::Anthropic
+  module Utils
+    ##
+    # Normalizes Anthropic tool input to a Hash suitable for kwargs.
+    # @param input [Hash, String, nil]
+    # @return [Hash]
+    def parse_tool_input(input)
+      case input
+      when Hash then input
+      when String
+        parsed = LLM.json.load(input)
+        Hash === parsed ? parsed : {}
+      when nil then {}
+      else
+        input.respond_to?(:to_h) ? input.to_h : {}
+      end
+    rescue *LLM.json.parser_error
+      {}
+    end
+  end
+end
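The new `LLM::Anthropic::Utils#parse_tool_input` helper above centralizes tool-input normalization: whatever Anthropic sends back (a Hash, a JSON string, nil, or something hash-like), tool kwargs end up as a well-formed Hash. A standalone sketch of the same logic, substituting the stdlib `JSON` module for llm.rb's pluggable `LLM.json` backend:

```ruby
require "json"

# Normalize tool input to a Hash; invalid JSON and non-Hash scalars
# collapse to {} so downstream kwargs splatting never breaks.
def parse_tool_input(input)
  case input
  when Hash then input
  when String
    parsed = JSON.parse(input)
    Hash === parsed ? parsed : {}
  when nil then {}
  else
    input.respond_to?(:to_h) ? input.to_h : {}
  end
rescue JSON::ParserError
  {}
end

parse_tool_input('{"a": 1}')     # => {"a" => 1}
parse_tool_input("not json")     # => {}
parse_tool_input(nil)            # => {}
parse_tool_input([["k", "v"]])   # => {"k" => "v"} (via #to_h)
```

The method-level `rescue` mirrors the library's `rescue *LLM.json.parser_error` line, which splats a backend-specific list of parser error classes.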
@@ -14,6 +14,7 @@ module LLM
   #   ctx.talk ["Tell me about this photo", ctx.local_file("/images/photo.png")]
   #   ctx.messages.select(&:assistant?).each { print "[#{_1.role}]", _1.content, "\n" }
   class Anthropic < Provider
+    require_relative "anthropic/utils"
     require_relative "anthropic/error_handler"
     require_relative "anthropic/request_adapter"
     require_relative "anthropic/response_adapter"
@@ -21,6 +22,7 @@ module LLM
     require_relative "anthropic/models"
     require_relative "anthropic/files"
     include RequestAdapter
+    extend Utils

     HOST = "api.anthropic.com"

@@ -79,6 +81,15 @@ module LLM
       "assistant"
     end

+    ##
+    # Anthropic expects tool results to be sent as user messages
+    # containing `tool_result` content blocks rather than a distinct
+    # `tool` role.
+    # @return (see LLM::Provider#tool_role)
+    def tool_role
+      :user
+    end
+
     ##
     # Returns the default model for chat completions
     # @see https://docs.anthropic.com/en/docs/about-claude/models/all-models#model-comparison-table claude-sonnet-4-20250514
data/lib/llm/stream/queue.rb CHANGED
@@ -8,8 +8,10 @@ class LLM::Stream
   # returns an array of {LLM::Function::Return} values.
   class Queue
     ##
+    # @param [LLM::Stream] stream
     # @return [LLM::Stream::Queue]
-    def initialize
+    def initialize(stream)
+      @stream = stream
       @items = []
     end

@@ -39,13 +41,24 @@ class LLM::Stream
     # @return [Array<LLM::Function::Return>]
     def wait(strategy)
       returns, tasks = @items.shift(@items.length).partition { LLM::Function::Return === _1 }
-      case strategy
+      results = case strategy
       when :thread then LLM::Function::ThreadGroup.new(tasks).wait
       when :task then LLM::Function::TaskGroup.new(tasks).wait
       when :fiber then LLM::Function::FiberGroup.new(tasks).wait
       else raise ArgumentError, "Unknown strategy: #{strategy.inspect}. Expected :thread, :task, or :fiber"
       end
+      returns.concat fire_hooks(tasks, results)
     end
     alias_method :value, :wait
+
+    private
+
+    def fire_hooks(tasks, results)
+      results.each_with_index do |ret, idx|
+        tool = tasks[idx]&.function
+        @stream.on_tool_return(tool, ret) if tool
+      end
+      results
+    end
   end
 end
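The reworked `wait` above does three things: split already-completed returns from still-pending tasks, resolve the pending work with the chosen concurrency strategy, then fire a per-result hook before merging everything. A minimal standalone sketch of that flow, where `Return`, `Task`, and the hook block are simplified stand-ins for `LLM::Function::Return`, the task groups, and `LLM::Stream#on_tool_return`:

```ruby
Return = Struct.new(:value)

Task = Struct.new(:function) do
  # Stand-in for resolving queued tool work via a task group.
  def run
    Return.new("#{function} done")
  end
end

def wait(items, &on_tool_return)
  # Split completed returns from still-pending tasks.
  returns, tasks = items.partition { Return === _1 }
  # Resolve pending work (sequentially here, concurrently in llm.rb).
  results = tasks.map(&:run)
  # Fire the per-result hook before merging.
  results.each_with_index { |ret, idx| on_tool_return.call(tasks[idx].function, ret) }
  returns.concat(results)
end

fired = []
out = wait([Return.new("early"), Task.new("weather")]) { |tool, _ret| fired << tool }
out.map(&:value) # => ["early", "weather done"]
fired            # => ["weather"]
```

The hook fires only for work that was still pending at `wait` time, which matches `fire_hooks` iterating over `tasks` rather than over the already-completed `returns`.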
data/lib/llm/stream.rb CHANGED
@@ -5,20 +5,20 @@ module LLM
   # The {LLM::Stream LLM::Stream} class provides the callback interface for
   # streamed model output in llm.rb.
   #
-  # A stream object can be an instance of {LLM::Stream LLM::Stream}
-  # subclass that overrides the callbacks it needs
-  #
-  # helper for collecting asynchronous tool work started from a
-  # {#tool_not_found} returns an in-band tool error when a
-  # cannot be resolved.
+  # A stream object can be an instance of {LLM::Stream LLM::Stream} or a
+  # subclass that overrides the callbacks it needs. For basic streaming,
+  # llm.rb also accepts any object that implements `#<<`. {#queue} provides
+  # a small helper for collecting asynchronous tool work started from a
+  # callback, and {#tool_not_found} returns an in-band tool error when a
+  # streamed tool cannot be resolved.
   #
   # @note The `on_*` callbacks run inline with the streaming parser. They
   #   therefore block streaming progress and should generally return as
   #   quickly as possible.
   #
-  # The most common callback is {#on_content}, which also maps to {#<<}
-  #
-  #
+  # The most common callback is {#on_content}, which also maps to {#<<}.
+  # Providers may also call {#on_reasoning_content} and {#on_tool_call} when
+  # that data is available.
   class Stream
     require_relative "stream/queue"

@@ -26,7 +26,7 @@ module LLM
     # Returns a lazily-initialized queue for tool results or spawned work.
     # @return [LLM::Stream::Queue]
     def queue
-      @queue ||= Queue.new
+      @queue ||= Queue.new(self)
     end

     ##
@@ -79,6 +79,20 @@ module LLM
       nil
     end

+    ##
+    # Called when queued streamed tool work returns.
+    # @note This callback runs when {#wait} resolves work that was queued from
+    #   {#on_tool_call}, such as values returned by `tool.spawn(:thread)`,
+    #   `tool.spawn(:fiber)`, or `tool.spawn(:task)`.
+    # @param [LLM::Function] tool
+    #   The tool that returned.
+    # @param [LLM::Function::Return] ret
+    #   The completed tool return.
+    # @return [nil]
+    def on_tool_return(tool, ret)
+      nil
+    end
+
     # @endgroup

     # @group Error handlers
data/lib/llm/tracer/telemetry.rb CHANGED
@@ -126,7 +126,7 @@ module LLM
       "gen_ai.operation.name" => "execute_tool",
       "gen_ai.request.model" => model,
       "gen_ai.tool.call.id" => id,
-      "gen_ai.tool.name" => name,
+      "gen_ai.tool.name" => name&.to_s,
       "gen_ai.tool.call.arguments" => LLM.json.dump(arguments),
       "gen_ai.provider.name" => provider_name,
       "server.address" => provider_host,
@@ -145,7 +145,7 @@ module LLM
     return nil unless span
     attributes = {
       "gen_ai.tool.call.id" => result.id,
-      "gen_ai.tool.name" => result.name,
+      "gen_ai.tool.name" => result.name&.to_s,
       "gen_ai.tool.call.result" => LLM.json.dump(result.value)
     }.compact
     attributes.each { span.set_attribute(_1, _2) }
CHANGED
data/llm.gemspec
CHANGED
|
@@ -8,47 +8,15 @@ Gem::Specification.new do |spec|
|
|
|
8
8
|
spec.authors = ["Antar Azri", "0x1eef", "Christos Maris", "Rodrigo Serrano"]
|
|
9
9
|
spec.email = ["azantar@proton.me", "0x1eef@hardenedbsd.org"]
|
|
10
10
|
|
|
11
|
-
spec.summary =
|
|
12
|
-
llm.rb is a Ruby-centric toolkit for building real LLM-powered systems — where
|
|
13
|
-
LLMs are part of your architecture, not just API calls. It gives you explicit
|
|
14
|
-
control over contexts, tools, concurrency, and providers, so you can compose
|
|
15
|
-
reliable, production-ready workflows without hidden abstractions.
|
|
16
|
-
SUMMARY
|
|
11
|
+
spec.summary = "System integration layer for LLMs, tools, MCP, and APIs in Ruby."
|
|
17
12
|
|
|
18
13
|
spec.description = <<~DESCRIPTION
|
|
19
|
-
llm.rb is a Ruby-centric
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
frameworks, no hidden magic — just composable primitives for building real
|
|
26
|
-
applications, from scripts to full systems like Relay.
|
|
27
|
-
|
|
28
|
-
## Key Features
|
|
29
|
-
|
|
30
|
-
- **Contexts are central** — Hold history, tools, schema, usage, cost, persistence, and execution state
|
|
31
|
-
- **Tool execution is explicit** — Run local, provider-native, and MCP tools sequentially or concurrently
|
|
32
|
-
- **One API across providers** — Unified interface for OpenAI, Anthropic, Google, xAI, zAI, DeepSeek, Ollama, and LlamaCpp
|
|
33
|
-
- **Thread-safe where it matters** — Providers are shareable, while contexts stay isolated and stateful
|
|
34
|
-
- **Production-ready** — Cost tracking, observability, persistence, and performance tuning built in
|
|
35
|
-
- **Stdlib-only by default** — Runs on Ruby standard library, with optional features loaded only when used
|
|
36
|
-
|
|
37
|
-
## Capabilities
|
|
38
|
-
|
|
39
|
-
- Chat & Contexts with persistence
|
|
40
|
-
- Streaming responses
|
|
41
|
-
- Tool calling with JSON Schema validation
|
|
42
|
-
- Concurrent execution (threads, fibers, async tasks)
|
|
43
|
-
- Agents with auto-execution
|
|
44
|
-
- Structured outputs
|
|
45
|
-
- MCP (Model Context Protocol) support
|
|
46
|
-
- Multimodal inputs (text, images, audio, documents)
|
|
47
|
-
- Audio generation, transcription, translation
|
|
48
|
-
- Image generation and editing
|
|
49
|
-
- Files API for document processing
|
|
50
|
-
- Embeddings and vector stores
|
|
51
|
-
- Local model registry for capabilities, limits, and pricing
|
|
14
|
+
llm.rb is a Ruby-centric system integration layer for building LLM-powered
|
|
15
|
+
systems. It connects LLMs to real systems by turning APIs into tools and
|
|
16
|
+
unifying MCP, providers, contexts, and application logic in one execution
|
|
17
|
+
model. It supports explicit tool orchestration, concurrent execution,
|
|
18
|
+
streaming, multiple MCP sources, and multiple LLM providers for production
|
|
19
|
+
systems that integrate external and internal services.
|
|
52
20
|
DESCRIPTION
|
|
53
21
|
|
|
54
22
|
spec.license = "0BSD"
|
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: llm.rb
 version: !ruby/object:Gem::Version
-  version: 4.11.0
+  version: 4.12.0
 platform: ruby
 authors:
 - Antar Azri
@@ -195,39 +195,12 @@ dependencies:
   - !ruby/object:Gem::Version
     version: '1.7'
 description: |
-  llm.rb is a Ruby-centric
-
-
-
-
-
-  frameworks, no hidden magic — just composable primitives for building real
-  applications, from scripts to full systems like Relay.
-
-  ## Key Features
-
-  - **Contexts are central** — Hold history, tools, schema, usage, cost, persistence, and execution state
-  - **Tool execution is explicit** — Run local, provider-native, and MCP tools sequentially or concurrently
-  - **One API across providers** — Unified interface for OpenAI, Anthropic, Google, xAI, zAI, DeepSeek, Ollama, and LlamaCpp
-  - **Thread-safe where it matters** — Providers are shareable, while contexts stay isolated and stateful
-  - **Production-ready** — Cost tracking, observability, persistence, and performance tuning built in
-  - **Stdlib-only by default** — Runs on Ruby standard library, with optional features loaded only when used
-
-  ## Capabilities
-
-  - Chat & Contexts with persistence
-  - Streaming responses
-  - Tool calling with JSON Schema validation
-  - Concurrent execution (threads, fibers, async tasks)
-  - Agents with auto-execution
-  - Structured outputs
-  - MCP (Model Context Protocol) support
-  - Multimodal inputs (text, images, audio, documents)
-  - Audio generation, transcription, translation
-  - Image generation and editing
-  - Files API for document processing
-  - Embeddings and vector stores
-  - Local model registry for capabilities, limits, and pricing
+  llm.rb is a Ruby-centric system integration layer for building LLM-powered
+  systems. It connects LLMs to real systems by turning APIs into tools and
+  unifying MCP, providers, contexts, and application logic in one execution
+  model. It supports explicit tool orchestration, concurrent execution,
+  streaming, multiple MCP sources, and multiple LLM providers for production
+  systems that integrate external and internal services.
 email:
 - azantar@proton.me
 - 0x1eef@hardenedbsd.org
@@ -300,6 +273,7 @@ files:
 - lib/llm/providers/anthropic/response_adapter/models.rb
 - lib/llm/providers/anthropic/response_adapter/web_search.rb
 - lib/llm/providers/anthropic/stream_parser.rb
+- lib/llm/providers/anthropic/utils.rb
 - lib/llm/providers/deepseek.rb
 - lib/llm/providers/deepseek/request_adapter.rb
 - lib/llm/providers/deepseek/request_adapter/completion.rb
@@ -417,8 +391,5 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 rubygems_version: 3.6.9
 specification_version: 4
-summary: llm.rb is a Ruby-centric toolkit for building real LLM-powered systems —
-  where LLMs are part of your architecture, not just API calls. It gives you explicit
-  control over contexts, tools, concurrency, and providers, so you can compose reliable,
-  production-ready workflows without hidden abstractions.
+summary: System integration layer for LLMs, tools, MCP, and APIs in Ruby.
 test_files: []