llm.rb 4.13.0 → 4.14.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 7847fee7ea1e63553ad5323750fc2e5ac1b4a9082c2f4c5aba71f4587440ea75
-  data.tar.gz: e63bdae085b2f0f606cbdb4633a7eff93fd6e2428fcb85ff5fe94fc78851bf5d
+  metadata.gz: ea1addf0bff644fa11e4f69a806f8ff5b7aa04fbbbc3f0592bd51b6ebc07f0f8
+  data.tar.gz: a3c846b9744e4ef230e2f23ed6ab42f6b4c84a0165b8bc066b7f6a003ee8fc00
 SHA512:
-  metadata.gz: b1c8d8600b3214da5613d152677d13fde796b42e6a29cf8af035e4ad5f28b7cea0466a375b9b444a748e9e063d2e6ad6720b653609cb2b7038e8040cd2b44e39
-  data.tar.gz: c76882f9cd5416312e26f4e25493403df8f9f8c61ee14cba5096383b449bd7a4ce8b9d70834d12176648c3d9206f0f555a1eec4b22bdb6426d88c0c36c8ed592
+  metadata.gz: 7387da06d824d42753ff30455b0e464b7ca6eaa43e9410ce814ad96451c5595154d1e721fb69c9edc0971208aaf8a011ce42078827b57971e0e7c0a66eb0db6e
+  data.tar.gz: 590442f434086b7215d664e6b5d474130499a14fba16810ff7e0b04878d25e46ca8983057af5fd9275d8415d95da6e1439b84388fa450b8c06bc7841c832a48e
data/CHANGELOG.md CHANGED
@@ -2,8 +2,54 @@
 
 ## Unreleased
 
+Changes since `v4.14.0`.
+
+## v4.14.0
+
 Changes since `v4.13.0`.
 
+This release adds request interruption for contexts, reworks provider
+HTTP internals for lower-overhead streaming, and fixes MCP clients so
+parallel tool calls can safely share one connection.
+
+### Add
+
+* **Add request interruption support** <br>
+  Add `LLM::Context#interrupt!`, `LLM::Context#cancel!`, and
+  `LLM::Interrupt` for interrupting in-flight provider requests,
+  inspired by Go's context cancellation.
+
+### Change
+
+* **Rework provider HTTP transport internals** <br>
+  Rework provider HTTP around `LLM::Provider::Transport::HTTP` with
+  explicit transient and persistent transport handling.
+
+* **Reduce SSE parser overhead** <br>
+  Dispatch raw parsed values to registered visitors instead of building
+  an `Event` object for every streamed line.
+
+* **Reduce provider streaming allocations** <br>
+  Decode streamed provider payloads directly in
+  `LLM::Provider::Transport::HTTP` before handing them to provider
+  parsers, which cuts allocation churn and gives a smaller streaming
+  speed bump.
+
+* **Reduce generic SSE parser allocations** <br>
+  Keep unread event-stream buffer data in place until compaction is
+  worthwhile, which lowers allocation churn in the remaining generic
+  SSE path.
+
+### Fix
+
+* **Support parallel MCP tool calls on one client** <br>
+  Route MCP responses by JSON-RPC id so concurrent tool calls can
+  share one client and transport without mismatching replies.
+
+* **Use explicit MCP non-blocking read errors** <br>
+  Use `IO::EAGAINWaitReadable` while continuing to retry on
+  `IO::WaitReadable`.
+
 ## v4.13.0
 
 Changes since `v4.12.0`.
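The `IO::EAGAINWaitReadable` fix noted above can be seen in plain Ruby: `IO::WaitReadable` is a module rather than an exception class, while `IO::EAGAINWaitReadable` is a concrete class that mixes it in, so existing `rescue IO::WaitReadable` retry loops keep working. A small, dependency-free sketch (not llm.rb code):

```ruby
# Reading from an empty pipe without blocking raises a concrete
# IO::WaitReadable-flavored exception rather than the bare module.
r, w = IO.pipe
err = begin
  r.read_nonblock(1) # nothing written yet, so this cannot succeed
  nil
rescue IO::WaitReadable => e
  e # on Linux this is IO::EAGAINWaitReadable
end
p err.is_a?(IO::WaitReadable) # => true
p err.is_a?(Exception)        # => true
w.close
r.close
```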
data/README.md CHANGED
@@ -4,7 +4,7 @@
 <p align="center">
   <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
   <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.13.0-green.svg?" alt="Version"></a>
+  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.14.0-green.svg?" alt="Version"></a>
 </p>
 
 ## About
@@ -28,18 +28,9 @@ so they compose naturally instead of becoming separate subsystems.
 
 ## Architecture
 
-```
-External MCP    Internal MCP    OpenAPI / REST
-     │               │                │
-     └────────── Tools / MCP Layer ───────┘
-
-          llm.rb Contexts
-
-           LLM Providers
-      (OpenAI, Anthropic, etc.)
-
-         Your Application
-```
+<p align="center">
+  <img src="https://github.com/llmrb/llm.rb/raw/main/resources/architecture.png" alt="llm.rb architecture" width="790">
+</p>
 
 ## Core Concept
 
@@ -74,6 +65,10 @@ same context object.
 - **Streaming and tool execution work together**
   Start tool work while output is still streaming so you can hide latency
   instead of waiting for turns to finish.
+- **Requests can be interrupted cleanly**
+  Stop in-flight provider work through the same runtime instead of treating
+  cancellation as a separate concern. `LLM::Context#cancel!` is inspired by
+  Go's context cancellation model.
 - **Concurrency is a first-class feature**
   Use threads, fibers, or async tasks without rewriting your tool layer.
 - **Advanced workloads are built in, not bolted on**
@@ -85,12 +80,23 @@ same context object.
 - **MCP is built in**
   Connect to MCP servers over stdio or HTTP without bolting on a separate
   integration stack.
+- **Provider support is broad**
+  Work with OpenAI, OpenAI-compatible endpoints, Anthropic, Google, DeepSeek,
+  Z.ai, xAI, llama.cpp, and Ollama through the same runtime.
 - **Tools are explicit**
   Run local tools, provider-native tools, and MCP tools through the same path
   with fewer special cases.
 - **Providers are normalized, not flattened**
   Share one API surface across providers without losing access to provider-
   specific capabilities where they matter.
+- **Responses keep a uniform shape**
+  Provider calls return
+  [`LLM::Response`](https://0x1eef.github.io/x/llm.rb/LLM/Response.html)
+  objects as a common base shape, then extend them with endpoint- or
+  provider-specific behavior when needed.
+- **Low-level access is still there**
+  Normalized responses still keep the raw `Net::HTTPResponse` available when
+  you need headers, status, or other HTTP details.
 - **Local model metadata is included**
   Model capabilities, pricing, and limits are available locally without extra
   API calls.
@@ -114,6 +120,7 @@ same context object.
 - **Chat & Contexts** — stateless and stateful interactions with persistence
 - **Context Serialization** — save and restore state across processes or time
 - **Streaming** — visible output, reasoning output, tool-call events
+- **Request Interruption** — stop in-flight provider work cleanly
 - **Tool Calling** — class-based tools and closure-based functions
 - **Run Tools While Streaming** — overlap model output with tool latency
 - **Concurrent Execution** — threads, async tasks, and fibers
data/lib/llm/context.rb CHANGED
@@ -62,6 +62,7 @@ module LLM
     @mode = params.delete(:mode) || :completions
     @params = {model: llm.default_model, schema: nil}.compact.merge!(params)
     @messages = LLM::Buffer.new(llm)
+    @owner = Fiber.current
   end
 
   ##
@@ -184,6 +185,15 @@ module LLM
     end
   end
 
+  ##
+  # Interrupt the active request, if any.
+  # This is inspired by Go's context cancellation model.
+  # @return [nil]
+  def interrupt!
+    llm.interrupt!(@owner)
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # Returns token usage accumulated in this context
   # @note
data/lib/llm/error.rb CHANGED
@@ -55,6 +55,10 @@ module LLM
   # When stuck in a tool call loop
   ToolLoopError = Class.new(Error)
 
+  ##
+  # When a request is interrupted
+  Interrupt = Class.new(Error)
+
   ##
   # When a tool call cannot be mapped to a local tool
   NoSuchToolError = Class.new(Error)
@@ -13,13 +13,15 @@ module LLM
 
   ##
   # "data:" event callback
-  # @param [LLM::EventStream::Event] event
+  # @param [LLM::EventStream::Event, String, nil] event
+  # @param [String, nil] chunk
   # @return [void]
-  def on_data(event)
-    return if event.end?
-    chunk = LLM.json.load(event.value)
-    return unless chunk
-    @parser.parse!(chunk)
+  def on_data(event, chunk = nil)
+    value = chunk ? event : event.value
+    return if value == "[DONE]"
+    payload = LLM.json.load(value)
+    return unless payload
+    @parser.parse!(payload)
   rescue *LLM.json.parser_error
   end
 
@@ -28,13 +30,15 @@
   # is received, regardless of whether it has
   # a field name or not. Primarily for ollama,
   # which does not emit Server-Sent Events (SSE).
-  # @param [LLM::EventStream::Event] event
+  # @param [LLM::EventStream::Event, String, nil] event
+  # @param [String, nil] chunk
   # @return [void]
-  def on_chunk(event)
-    return if event.end?
-    chunk = LLM.json.load(event.chunk)
-    return unless chunk
-    @parser.parse!(chunk)
+  def on_chunk(event, chunk = nil)
+    raw_chunk = chunk || event&.chunk || event
+    return if raw_chunk == "[DONE]"
+    payload = LLM.json.load(raw_chunk)
+    return unless payload
+    @parser.parse!(payload)
   rescue *LLM.json.parser_error
   end
 
@@ -4,8 +4,17 @@ module LLM::EventStream
   ##
   # @private
   class Event
-    FIELD_REGEXP = /[^:]+/
-    VALUE_REGEXP = /(?<=: ).+/
+    UNSET = Object.new.freeze
+
+    def self.parse(chunk)
+      newline = chunk.end_with?("\n") ? chunk.bytesize - 1 : chunk.bytesize
+      separator = chunk.index(":")
+      return [nil, nil] unless separator
+      field = chunk.byteslice(0, separator)
+      value_start = separator + (chunk.getbyte(separator + 1) == 32 ? 2 : 1)
+      value = value_start < newline ? chunk.byteslice(value_start, newline - value_start) : nil
+      [field, value]
+    end
 
     ##
     # Returns the field name
@@ -25,9 +34,10 @@
     ##
     # @param [String] chunk
     # @return [LLM::EventStream::Event]
-    def initialize(chunk)
-      @field = chunk[FIELD_REGEXP]
-      @value = chunk[VALUE_REGEXP]
+    def initialize(chunk, field: UNSET, value: UNSET)
+      @field, @value = self.class.parse(chunk) if field.equal?(UNSET) || value.equal?(UNSET)
+      @field = field unless field.equal?(UNSET)
+      @value = value unless value.equal?(UNSET)
       @chunk = chunk
     end
 
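The byte-offset split performed by `Event.parse` above can be exercised on its own; this standalone copy of the logic shows how `field` and `value` fall out of a raw SSE line (the helper name `parse_sse_line` is illustrative):

```ruby
# Split one "field: value\n" SSE line using byte offsets instead of
# regexps: find the ":" separator, skip one optional space after it,
# and exclude a trailing newline from the value.
def parse_sse_line(chunk)
  newline = chunk.end_with?("\n") ? chunk.bytesize - 1 : chunk.bytesize
  separator = chunk.index(":")
  return [nil, nil] unless separator
  field = chunk.byteslice(0, separator)
  value_start = separator + (chunk.getbyte(separator + 1) == 32 ? 2 : 1)
  value = value_start < newline ? chunk.byteslice(value_start, newline - value_start) : nil
  [field, value]
end

p parse_sse_line("data: hello\n") # => ["data", "hello"]
p parse_sse_line("event:ping\n")  # => ["event", "ping"]
p parse_sse_line("\n")            # => [nil, nil]
```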
@@ -4,6 +4,8 @@ module LLM::EventStream
   ##
   # @private
   class Parser
+    COMPACT_THRESHOLD = 4096
+
     ##
     # @return [LLM::EventStream::Parser]
     def initialize
@@ -42,7 +44,8 @@ module LLM::EventStream
     # Returns the internal buffer
     # @return [String]
     def body
-      @buffer.dup
+      return @buffer.dup if @cursor.zero?
+      @buffer.byteslice(@cursor, @buffer.bytesize - @cursor) || +""
     end
 
     ##
@@ -55,34 +58,46 @@ module LLM::EventStream
 
     private
 
-    def parse!(event)
-      event = Event.new(event)
-      dispatch(event)
+    def parse!(chunk)
+      field, value = Event.parse(chunk)
+      dispatch_visitors(field, value, chunk)
+      dispatch_callbacks(field, value, chunk)
+    end
+
+    def dispatch_visitors(field, value, chunk)
+      @visitors.each { dispatch_visitor(_1, field, value, chunk) }
     end
 
-    def dispatch(event)
-      @visitors.each { dispatch_visitor(_1, event) }
-      @events[event.field].each { _1.call(event) }
+    def dispatch_callbacks(field, value, chunk)
+      callbacks = @events[field]
+      return if callbacks.empty?
+      event = Event.new(chunk, field:, value:)
+      callbacks.each { _1.call(event) }
     end
 
-    def dispatch_visitor(visitor, event)
-      method = "on_#{event.field}"
+    def dispatch_visitor(visitor, field, value, chunk)
+      method = "on_#{field}"
       if visitor.respond_to?(method)
-        visitor.public_send(method, event)
+        visitor.public_send(method, value, chunk)
      elsif visitor.respond_to?("on_chunk")
-        visitor.on_chunk(event)
+        visitor.on_chunk(nil, chunk)
       end
     end
 
     def each_line
       while (newline = @buffer.index("\n", @cursor))
-        line = @buffer[@cursor..newline]
+        line = @buffer.byteslice(@cursor, newline - @cursor + 1)
         @cursor = newline + 1
         yield(line)
       end
       return if @cursor.zero?
-      @buffer = @buffer[@cursor..] || +""
-      @cursor = 0
+      if @cursor >= @buffer.bytesize
+        @buffer.clear
+        @cursor = 0
+      elsif @cursor >= COMPACT_THRESHOLD
+        @buffer = @buffer.byteslice(@cursor, @buffer.bytesize - @cursor) || +""
+        @cursor = 0
+      end
     end
   end
 end
@@ -74,7 +74,7 @@ class LLM::MCP
   #  The IO stream to read from (:stdout, :stderr)
   # @raise [LLM::Error]
   #  When the command is not running
-  # @raise [IO::WaitReadable]
+  # @raise [IO::EAGAINWaitReadable]
   #  When no complete message is available to read
   # @return [String]
   #  The next complete line from the specified IO stream
@@ -0,0 +1,23 @@
+# frozen_string_literal: true
+
+class LLM::MCP
+  ##
+  # A per-request mailbox for routing a JSON-RPC response back to the
+  # caller waiting on that request id.
+  class Mailbox
+    def initialize
+      @queue = Queue.new
+    end
+
+    def <<(message)
+      @queue << message
+      self
+    end
+
+    def pop
+      @queue.pop(true)
+    rescue ThreadError
+      nil
+    end
+  end
+end
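The `Mailbox` class above leans on `Queue#pop(true)`, the non-blocking form that raises `ThreadError` when the queue is empty; converting that to `nil` lets callers poll instead of block. A standalone copy of the behavior:

```ruby
# A thread-safe mailbox whose pop never blocks: an empty queue yields
# nil rather than suspending the calling thread.
class Mailbox
  def initialize
    @queue = Queue.new
  end

  def <<(message)
    @queue << message
    self
  end

  def pop
    @queue.pop(true)  # non_block = true
  rescue ThreadError  # raised when the queue is empty
    nil
  end
end

box = Mailbox.new
p box.pop          # => nil (empty, does not block)
box << {"id" => 1}
p box.pop          # => {"id"=>1}
```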
data/lib/llm/mcp/pipe.rb CHANGED
@@ -27,7 +27,7 @@ class LLM::MCP
 
   ##
   # Reads from the reader end without blocking.
-  # @raise [IO::WaitReadable]
+  # @raise [IO::EAGAINWaitReadable]
   #  When no data is available to read
   # @return [String]
   def read_nonblock(...)
@@ -0,0 +1,44 @@
+# frozen_string_literal: true
+
+class LLM::MCP
+  ##
+  # Coordinates shared access to a transport by routing JSON-RPC
+  # responses to the mailbox waiting on the matching request id.
+  class Router
+    def initialize
+      @request_id = -1
+      @pending = {}
+      @lock = Monitor.new
+      @writer = Monitor.new
+      @reader = Monitor.new
+    end
+
+    def register
+      @lock.synchronize do
+        @request_id += 1
+        mailbox = LLM::MCP::Mailbox.new
+        @pending[@request_id] = mailbox
+        [@request_id, mailbox]
+      end
+    end
+
+    def clear(id)
+      @lock.synchronize { @pending.delete(id) }
+    end
+
+    def read(transport)
+      @reader.synchronize { transport.read_nonblock }
+    end
+
+    def write(transport, message)
+      @writer.synchronize { transport.write(message) }
+    end
+
+    def route(response)
+      mailbox = @lock.synchronize { @pending[response["id"]] }
+      raise LLM::MCP::MismatchError.new(expected_id: nil, actual_id: response["id"]) unless mailbox
+      mailbox << response
+      nil
+    end
+  end
+end
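The routing idea above can be demonstrated without the gem: each caller registers for a fresh id, and replies are delivered to the matching mailbox even when they arrive out of order. A reduced sketch (`Queue` stands in for `LLM::MCP::Mailbox`):

```ruby
require "monitor"

# Route JSON-RPC responses by "id" to the mailbox registered for that
# request, so concurrent callers can share one connection safely.
class Router
  def initialize
    @request_id = -1
    @pending = {}
    @lock = Monitor.new
  end

  def register
    @lock.synchronize do
      @request_id += 1
      mailbox = Queue.new            # stands in for LLM::MCP::Mailbox
      @pending[@request_id] = mailbox
      [@request_id, mailbox]
    end
  end

  def route(response)
    mailbox = @lock.synchronize { @pending[response["id"]] }
    raise "no pending request for id=#{response["id"]}" unless mailbox
    mailbox << response
    nil
  end
end

router = Router.new
id_a, box_a = router.register
id_b, box_b = router.register
router.route({"id" => id_b, "result" => "second"}) # replies arrive out of order
router.route({"id" => id_a, "result" => "first"})
p box_a.pop["result"] # => "first"
p box_b.pop["result"] # => "second"
```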
data/lib/llm/mcp/rpc.rb CHANGED
@@ -27,13 +27,15 @@ class LLM::MCP
     def call(transport, method, params = {})
       message = {jsonrpc: "2.0", method:, params: default_params(method).merge(params)}
       if notification?(method)
-        transport.write(message)
-        nil
-      else
-        @request_id = (@request_id || -1) + 1
-        id = @request_id
-        transport.write(message.merge(id:))
-        recv(transport, id)
+        router.write(transport, message)
+        return nil
+      end
+      id, mailbox = router.register
+      begin
+        router.write(transport, message.merge(id:))
+        recv(transport, id, mailbox)
+      ensure
+        router.clear(id)
       end
     end
 
@@ -49,19 +51,12 @@
     # When the MCP process returns an error
     # @return [Object, nil]
     #  The result returned by the MCP process
-    def recv(transport, id)
+    def recv(transport, id, mailbox)
       poll(timeout:, ex: [IO::WaitReadable]) do
         loop do
-          res = transport.read_nonblock
-          if res["id"] == id && res["error"]
-            raise LLM::MCP::Error.from(response: res)
-          elsif res["id"] == id
-            break res["result"]
-          elsif res["method"]
-            next
-          elsif res.key?("id")
-            raise LLM::MCP::MismatchError.new(expected_id: id, actual_id: res["id"])
-          end
+          res = mailbox.pop
+          return handle_response(id, res) if res
+          route_response(router.read(transport), id)
         end
       end
     end
@@ -119,5 +114,21 @@
        sleep 0.05
      end
    end
+
+    def handle_response(id, res)
+      raise LLM::MCP::Error.from(response: res) if res["error"]
+      return res["result"] if res["id"] == id
+      raise LLM::MCP::MismatchError.new(expected_id: id, actual_id: res["id"])
+    end
+
+    def route_response(res, id)
+      return nil if res["method"]
+      return router.route(res) if res.key?("id")
+      raise LLM::MCP::MismatchError.new(expected_id: id, actual_id: nil)
+    end
+
+    def router
+      @router ||= LLM::MCP::Router.new
+    end
   end
 end
@@ -21,29 +21,31 @@ module LLM::MCP::Transport
 
     ##
     # Receives the SSE event name.
-    # @param [LLM::EventStream::Event] event
+    # @param [LLM::EventStream::Event, String, nil] event
+    # @param [String, nil] chunk
     #  The event stream event
     # @return [void]
-    def on_event(event)
-      @event = event.value
+    def on_event(event, chunk = nil)
+      @event = chunk ? event : event.value
     end
 
     ##
     # Receives one line of SSE data.
-    # @param [LLM::EventStream::Event] event
+    # @param [LLM::EventStream::Event, String, nil] event
+    # @param [String, nil] chunk
     #  The event stream event
     # @return [void]
-    def on_data(event)
-      @data << event.value.to_s
+    def on_data(event, chunk = nil)
+      @data << (chunk ? event : event.value).to_s
     end
 
     # The generic event stream parser dispatches one line at a time.
     # A blank line terminates the current SSE event.
-    # @param [LLM::EventStream::Event] event
+    # @param [LLM::EventStream::Event, String] event
     #  The event stream event
     # @return [void]
-    def on_chunk(event)
-      flush if event.chunk == "\n"
+    def on_chunk(event, chunk = nil)
+      flush if (chunk || event&.chunk || event) == "\n"
     end
 
     private
@@ -82,13 +82,13 @@ module LLM::MCP::Transport
     # Reads the next queued message without blocking.
     # @raise [LLM::MCP::Error]
     #  When the transport is not running
-    # @raise [IO::WaitReadable]
+    # @raise [IO::EAGAINWaitReadable]
     #  When no complete message is available to read
     # @return [Hash]
     def read_nonblock
       lock do
         raise LLM::MCP::Error, "MCP transport is not running" unless running?
-        raise IO::WaitReadable if @queue.empty?
+        raise IO::EAGAINWaitReadable, "no complete message available" if @queue.empty?
         @queue.shift
       end
     end
@@ -57,7 +57,7 @@ module LLM::MCP::Transport
     # Reads a message from the MCP process without blocking.
     # @raise [LLM::Error]
     #  When the transport is not running
-    # @raise [IO::WaitReadable]
+    # @raise [IO::EAGAINWaitReadable]
     #  When no complete message is available to read
     # @return [Hash]
     #  The next message from the MCP process
data/lib/llm/mcp.rb CHANGED
@@ -10,11 +10,14 @@
 # transports and focuses on discovering tools that can be used through
 # {LLM::Context LLM::Context} and {LLM::Agent LLM::Agent}.
 #
-# Like {LLM::Context LLM::Context}, an MCP client is stateful and is
-# expected to remain isolated to a single thread.
+# An MCP client is stateful. Coordinate lifecycle operations such as
+# {#start} and {#stop}; request methods can be issued concurrently and
+# responses are matched by JSON-RPC id.
 class LLM::MCP
   require_relative "mcp/error"
   require_relative "mcp/command"
+  require_relative "mcp/mailbox"
+  require_relative "mcp/router"
   require_relative "mcp/rpc"
   require_relative "mcp/pipe"
   require_relative "mcp/transport/http"
@@ -0,0 +1,115 @@
+# frozen_string_literal: true
+
+module LLM::Provider::Transport
+  class HTTP
+    ##
+    # Internal HTTP request execution methods for {LLM::Provider}.
+    #
+    # This module handles provider-side HTTP execution, response parsing,
+    # streaming, and request body setup through
+    # {LLM::Provider::Transport::HTTP}.
+    #
+    # @api private
+    module HTTP::Execution
+      private
+
+      ##
+      # Executes a HTTP request
+      # @param [Net::HTTPRequest] request
+      #  The request to send
+      # @param [Proc] b
+      #  A block to yield the response to (optional)
+      # @return [Net::HTTPResponse]
+      #  The response from the server
+      # @raise [LLM::Error::Unauthorized]
+      #  When authentication fails
+      # @raise [LLM::Error::RateLimit]
+      #  When the rate limit is exceeded
+      # @raise [LLM::Error]
+      #  When any other unsuccessful status code is returned
+      # @raise [SystemCallError]
+      #  When there is a network error at the operating system level
+      # @return [Net::HTTPResponse]
+      def execute(request:, operation:, stream: nil, stream_parser: self.stream_parser, model: nil, inputs: nil, &b)
+        owner = transport.request_owner
+        tracer = self.tracer
+        span = tracer.on_request_start(operation:, model:, inputs:)
+        res = transport.request(request, owner:) do |http|
+          perform_request(http, request, stream, stream_parser, &b)
+        end
+        [handle_response(res, tracer, span), span, tracer]
+      rescue *LLM::Provider::Transport::HTTP::Interruptible::INTERRUPT_ERRORS
+        raise LLM::Interrupt, "request interrupted" if transport.interrupted?(owner)
+        raise
+      end
+
+      ##
+      # Handles the response from a request
+      # @param [Net::HTTPResponse] res
+      #  The response to handle
+      # @param [Object, nil] span
+      #  The span
+      # @return [Net::HTTPResponse]
+      def handle_response(res, tracer, span)
+        case res
+        when Net::HTTPOK then res.body = parse_response(res)
+        else error_handler.new(tracer, span, res).raise_error!
+        end
+        res
+      end
+
+      ##
+      # Parse a HTTP response
+      # @param [Net::HTTPResponse] res
+      # @return [LLM::Object, String]
+      def parse_response(res)
+        case res["content-type"]
+        when %r{\Aapplication/json\s*} then LLM::Object.from(LLM.json.load(res.body))
+        else res.body
+        end
+      end
+
+      ##
+      # @param [Net::HTTPRequest] req
+      #  The request to set the body stream for
+      # @param [IO] io
+      #  The IO object to set as the body stream
+      # @return [void]
+      def set_body_stream(req, io)
+        req.body_stream = io
+        req["transfer-encoding"] = "chunked" unless req["content-length"]
+      end
+
+      ##
+      # Performs the request on the given HTTP connection.
+      # @param [Net::HTTP] http
+      # @param [Net::HTTPRequest] request
+      # @param [Object, nil] stream
+      # @param [Class] stream_parser
+      # @param [Proc, nil] b
+      # @return [Net::HTTPResponse]
+      def perform_request(http, request, stream, stream_parser, &b)
+        if stream
+          http.request(request) do |res|
+            if Net::HTTPSuccess === res
+              parser = StreamDecoder.new(stream_parser.new(stream))
+              res.read_body(parser)
+              body = parser.body
+              res.body = (Hash === body || Array === body) ? LLM::Object.from(body) : body
+            else
+              body = +""
+              res.read_body { body << _1 }
+              res.body = body
+            end
+          ensure
+            parser&.free
+          end
+        elsif b
+          http.request(request) { (Net::HTTPSuccess === _1) ? b.call(_1) : _1 }
+        else
+          http.request(request)
+        end
+      end
+    end
+  end
+end
@@ -0,0 +1,109 @@
+# frozen_string_literal: true
+
+class LLM::Provider
+  module Transport
+    class HTTP
+      ##
+      # Internal request interruption methods for
+      # {LLM::Provider::Transport::HTTP}.
+      #
+      # This module tracks active requests by execution owner and provides
+      # the logic used to interrupt an in-flight request by closing the
+      # active HTTP connection.
+      #
+      # @api private
+      module Interruptible
+        INTERRUPT_ERRORS = [::IOError, ::EOFError, Errno::EBADF].freeze
+        Request = Struct.new(:http, :connection, keyword_init: true)
+
+        ##
+        # Interrupt an active request, if any.
+        # @param [Fiber] owner
+        #  The execution owner whose request should be interrupted
+        # @return [nil]
+        def interrupt!(owner)
+          req = request_for(owner) or return
+          lock { (@interrupts ||= {})[owner] = true }
+          if persistent_http?(req.http)
+            close_socket(req.connection&.http)
+            req.http.finish(req.connection)
+          elsif transient_http?(req.http)
+            close_socket(req.http)
+            req.http.finish if req.http.active?
+          end
+        rescue *INTERRUPT_ERRORS
+          nil
+        end
+
+        private
+
+        ##
+        # Closes the active socket for a request, if present.
+        # @param [Net::HTTP, nil] http
+        # @return [nil]
+        def close_socket(http)
+          socket = http&.instance_variable_get(:@socket) or return
+          socket = socket.io if socket.respond_to?(:io)
+          socket.close
+        rescue *INTERRUPT_ERRORS
+          nil
+        end
+
+        ##
+        # Returns whether the active request is using a transient HTTP client.
+        # @param [Object, nil] http
+        # @return [Boolean]
+        def transient_http?(http)
+          Net::HTTP === http
+        end
+
+        ##
+        # Returns whether the active request is using a persistent HTTP client.
+        # @param [Object, nil] http
+        # @return [Boolean]
+        def persistent_http?(http)
+          defined?(Net::HTTP::Persistent) && Net::HTTP::Persistent === http
+        end
+
+        ##
+        # Returns the active request for an execution owner.
+        # @param [Fiber] owner
+        # @return [Request, nil]
+        def request_for(owner)
+          lock do
+            @requests ||= {}
+            @requests[owner]
+          end
+        end
+
+        ##
+        # Records an active request for an execution owner.
+        # @param [Request] req
+        # @param [Fiber] owner
+        # @return [Request]
+        def set_request(req, owner)
+          lock do
+            @requests ||= {}
+            @requests[owner] = req
+          end
+        end
+
+        ##
+        # Clears the active request for an execution owner.
+        # @param [Fiber] owner
+        # @return [Request, nil]
+        def clear_request(owner)
+          lock { @requests&.delete(owner) }
+        end
+
+        ##
+        # Returns whether an execution owner was interrupted.
+        # @param [Fiber] owner
+        # @return [Boolean, nil]
+        def interrupted?(owner)
+          lock { @interrupts&.delete(owner) }
+        end
+      end
+    end
+  end
+end
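The interruption strategy above, closing the active socket so a blocked read wakes with one of the `INTERRUPT_ERRORS` and is re-raised as a dedicated error, can be sketched with plain sockets and threads (names are illustrative, not llm.rb API):

```ruby
require "socket"

# Interrupt an in-flight blocking read by closing its socket from another
# thread; the blocked side translates the resulting IOError into a
# dedicated Interrupt error when the owner was flagged as interrupted.
Interrupt = Class.new(StandardError)
interrupts = {}

server = TCPServer.new("127.0.0.1", 0)
client = TCPSocket.new("127.0.0.1", server.addr[1])

worker = Thread.new do
  client.read(1) # blocks: the peer never writes
rescue IOError, EOFError, Errno::EBADF
  raise Interrupt, "request interrupted" if interrupts.delete(:owner)
  raise
end

sleep 0.1                 # let the worker block on the read
interrupts[:owner] = true # mark the owner as interrupted
client.close              # closing the socket wakes the blocked read

result = begin
  worker.join
  :finished
rescue Interrupt
  :interrupted
end
p result # => :interrupted
server.close
```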
@@ -0,0 +1,92 @@
+# frozen_string_literal: true
+
+module LLM::Provider::Transport
+  ##
+  # @private
+  class HTTP::StreamDecoder
+    ##
+    # @return [Object]
+    attr_reader :parser
+
+    ##
+    # @param [#parse!, #body] parser
+    # @return [LLM::Provider::Transport::HTTP::StreamDecoder]
+    def initialize(parser)
+      @buffer = +""
+      @cursor = 0
+      @data = []
+      @parser = parser
+    end
+
+    ##
+    # @param [String] chunk
+    # @return [void]
+    def <<(chunk)
+      @buffer << chunk
+      each_line { handle_line(_1) }
+    end
+
+    ##
+    # @return [Object]
+    def body
+      parser.body
+    end
+
+    ##
+    # @return [void]
+    def free
+      @buffer.clear
+      @cursor = 0
+      @data.clear
+      parser.free if parser.respond_to?(:free)
+    end
+
+    private
+
+    def handle_line(line)
+      if line == "\n" || line == "\r\n"
+        flush_sse_event
+      elsif line.start_with?("data:")
+        @data << field_value(line)
+      elsif line.start_with?("event:", "id:", "retry:", ":")
+      else
+        decode!(strip_newline(line))
+      end
+    end
+
+    def flush_sse_event
+      return if @data.empty?
+      decode!(@data.join("\n"))
+      @data.clear
+    end
+
+    def field_value(line)
+      value_start = line.getbyte(5) == 32 ? 6 : 5
+      strip_newline(line.byteslice(value_start..))
+    end
+
+    def strip_newline(line)
+      line = line.byteslice(0, line.bytesize - 1) if line.end_with?("\n")
+      line = line.byteslice(0, line.bytesize - 1) if line.end_with?("\r")
+      line
+    end
+
+    def decode!(payload)
+      return if payload.empty? || payload == "[DONE]"
+      chunk = LLM.json.load(payload)
+      parser.parse!(chunk) if chunk
+    rescue *LLM.json.parser_error
+    end
+
+    def each_line
+      while (newline = @buffer.index("\n", @cursor))
+        line = @buffer[@cursor..newline]
+        @cursor = newline + 1
+        yield(line)
+      end
+      return if @cursor.zero?
+      @buffer = @buffer[@cursor..] || +""
+      @cursor = 0
+    end
+  end
+end
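The cursor-based buffering used by `StreamDecoder#each_line` above can be isolated into a small class: chunks accumulate in one string, complete lines are emitted as they appear, and the consumed prefix is trimmed afterwards. A standalone sketch:

```ruby
# Accumulate arbitrary chunks and emit complete "\n"-terminated lines,
# tracking a cursor so the buffer is trimmed once per feed rather than
# reallocated on every line.
class LineBuffer
  def initialize
    @buffer = +""
    @cursor = 0
  end

  # Feed a chunk; returns the array of complete lines now available.
  def <<(chunk)
    @buffer << chunk
    lines = []
    while (newline = @buffer.index("\n", @cursor))
      lines << @buffer[@cursor..newline]
      @cursor = newline + 1
    end
    unless @cursor.zero?
      @buffer = @buffer[@cursor..] || +"" # drop the consumed prefix
      @cursor = 0
    end
    lines
  end
end

buf = LineBuffer.new
p buf << "data: {\"a\""  # => []  (no complete line yet)
p buf << ": 1}\ndata: "  # => ["data: {\"a\": 1}\n"]
p buf << "[DONE]\n"      # => ["data: [DONE]\n"]
```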
@@ -0,0 +1,144 @@
1
+ # frozen_string_literal: true
2
+
3
+ class LLM::Provider
4
+ module Transport
5
+ ##
6
+ # The {LLM::Provider::Transport::HTTP LLM::Provider::Transport::HTTP}
7
+ # class manages HTTP connections for {LLM::Provider}. It handles
8
+ # transient and persistent clients, tracks active requests by owner,
9
+ # and interrupts in-flight requests when needed.
10
+ #
11
+ # @api private
12
+ class HTTP
13
+ require_relative "http/stream_decoder"
14
+ require_relative "http/interruptible"
15
+
16
+ include Interruptible
17
+
18
+ ##
19
+ # @param [String] host
20
+ # @param [Integer] port
21
+ # @param [Integer] timeout
22
+ # @param [Boolean] ssl
23
+ # @param [Boolean] persistent
24
+ # @return [LLM::Provider::Transport::HTTP]
25
+ def initialize(host:, port:, timeout:, ssl:, persistent: false)
26
+ @host = host
27
+ @port = port
28
+ @timeout = timeout
29
+ @ssl = ssl
30
+ @base_uri = URI("#{ssl ? "https" : "http"}://#{host}:#{port}/")
31
+ @persistent_client = persistent ? persistent_client : nil
32
+ @monitor = Monitor.new
+ end
+
+ ##
+ # Interrupt an active request, if any.
+ # @param [Fiber] owner
+ # @return [nil]
+ def interrupt!(owner)
+ super
+ end
+
+ ##
+ # Returns whether an execution owner was interrupted.
+ # @param [Fiber] owner
+ # @return [Boolean, nil]
+ def interrupted?(owner)
+ super
+ end
+
+ ##
+ # Returns the current request owner.
+ # @return [Fiber]
+ def request_owner
+ Fiber.current
+ end
+
+ ##
+ # Configures the transport to use a persistent HTTP connection pool.
+ # @return [LLM::Provider::Transport::HTTP]
+ def persist!
+ client = persistent_client
+ lock do
+ @persistent_client = client
+ self
+ end
+ end
+ alias_method :persistent, :persist!
+
+ ##
+ # @return [Boolean]
+ def persistent?
+ !persistent_client.nil?
+ end
+
+ ##
+ # Performs a request on the current HTTP transport.
+ # @param [Net::HTTPRequest] request
+ # @param [Fiber] owner
+ # @yieldparam [Net::HTTP] http
+ # @return [Object]
+ def request(request, owner:, &)
+ if persistent?
+ request_persistent(request, owner, &)
+ else
+ request_transient(request, owner, &)
+ end
+ ensure
+ clear_request(owner)
+ end
+
+ ##
+ # @return [String]
+ def inspect
+ "#<#{self.class.name}:0x#{object_id.to_s(16)} @persistent=#{persistent?}>"
+ end
+
+ private
+
+ attr_reader :host, :port, :timeout, :ssl, :base_uri
+
+ def request_transient(request, owner, &)
+ http = transient_client
+ set_request(Request.new(http:), owner)
+ yield http
+ end
+
+ def request_persistent(request, owner, &)
+ persistent_client.connection_for(URI.join(base_uri, request.path)) do |connection|
+ set_request(Request.new(http: persistent_client, connection:), owner)
+ yield connection.http
+ end
+ end
+
+ def persistent_client
+ LLM.lock(:clients) do
+ if LLM.clients[client_id]
+ LLM.clients[client_id]
+ else
+ require "net/http/persistent" unless defined?(Net::HTTP::Persistent)
+ client = Net::HTTP::Persistent.new(name: self.class.name)
+ client.read_timeout = timeout
+ LLM.clients[client_id] = client
+ end
+ end
+ end
+
+ def transient_client
+ client = Net::HTTP.new(host, port)
+ client.read_timeout = timeout
+ client.use_ssl = ssl
+ client
+ end
+
+ def client_id
+ "#{host}:#{port}:#{timeout}:#{ssl}"
+ end
+
+ def lock(&)
+ @monitor.synchronize(&)
+ end
+ end
+ end
+ end
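The transport above caches one shared persistent client per `host:port:timeout:ssl` tuple, with a `Monitor` guarding the cache. A minimal standalone sketch of that cache-keying pattern in plain Ruby (the `ClientCache` name and `fetch` signature are hypothetical illustrations, not the gem's API):

```ruby
require "monitor"

# Hypothetical sketch: one shared client per connection-parameter tuple,
# built lazily on first use and guarded by a Monitor, mirroring how the
# transport keys persistent clients by "host:port:timeout:ssl".
class ClientCache
  def initialize
    @clients = {}
    @monitor = Monitor.new
  end

  # Returns the cached client for the given parameters, invoking the
  # block once to build it (the block stands in for
  # Net::HTTP::Persistent.new in the real transport).
  def fetch(host:, port:, timeout:, ssl:, &build)
    key = "#{host}:#{port}:#{timeout}:#{ssl}"
    @monitor.synchronize { @clients[key] ||= build.call }
  end
end

cache = ClientCache.new
a = cache.fetch(host: "api.openai.com", port: 443, timeout: 60, ssl: true) { Object.new }
b = cache.fetch(host: "api.openai.com", port: 443, timeout: 60, ssl: true) { Object.new }
# a and b are the same object: identical parameters share one client.
```

Keying on the full parameter tuple (rather than just the host) means two providers with different timeouts never share a connection whose read timeout surprises one of them.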
data/lib/llm/provider.rb CHANGED
@@ -7,14 +7,9 @@
  # @abstract
  class LLM::Provider
  require "net/http"
- require_relative "client"
- include LLM::Client
-
- @@clients = {}
-
- ##
- # @api private
- def self.clients = @@clients
+ require_relative "provider/transport/http"
+ require_relative "provider/transport/http/execution"
+ include Transport::HTTP::Execution

  ##
  # @param [String, nil] key
@@ -36,9 +31,9 @@ class LLM::Provider
  @port = port
  @timeout = timeout
  @ssl = ssl
- @client = persistent ? persistent_client : nil
  @base_uri = URI("#{ssl ? "https" : "http"}://#{host}:#{port}/")
  @headers = {"User-Agent" => "llm.rb v#{LLM::VERSION}"}
+ @transport = Transport::HTTP.new(host:, port:, timeout:, ssl:, persistent:)
  @monitor = Monitor.new
  end

@@ -47,7 +42,7 @@ class LLM::Provider
  # @return [String]
  # @note The secret key is redacted in inspect for security reasons
  def inspect
- "#<#{self.class.name}:0x#{object_id.to_s(16)} @key=[REDACTED] @client=#{@client.inspect} @tracer=#{tracer.inspect}>"
+ "#<#{self.class.name}:0x#{object_id.to_s(16)} @key=[REDACTED] @transport=#{transport.inspect} @tracer=#{tracer.inspect}>"
  end

  ##
@@ -312,13 +307,20 @@ class LLM::Provider
  # # do something with 'llm'
  # @return [LLM::Provider]
  def persist!
- client = persistent_client
- lock do
- tap { @client = client }
- end
+ transport.persist!
+ self
  end
  alias_method :persistent, :persist!

+ ##
+ # Interrupt the active request, if any.
+ # @param [Fiber] owner
+ # @return [nil]
+ def interrupt!(owner)
+ transport.interrupt!(owner)
+ end
+ alias_method :cancel!, :interrupt!
+
  ##
  # @param [Object] stream
  # @return [Boolean]
@@ -328,7 +330,7 @@

  private

- attr_reader :client, :base_uri, :host, :port, :timeout, :ssl
+ attr_reader :base_uri, :host, :port, :timeout, :ssl, :transport

  ##
  # The headers to include with a request
@@ -360,94 +362,6 @@
  raise NotImplementedError
  end

- ##
- # Executes a HTTP request
- # @param [Net::HTTPRequest] request
- # The request to send
- # @param [Proc] b
- # A block to yield the response to (optional)
- # @return [Net::HTTPResponse]
- # The response from the server
- # @raise [LLM::Error::Unauthorized]
- # When authentication fails
- # @raise [LLM::Error::RateLimit]
- # When the rate limit is exceeded
- # @raise [LLM::Error]
- # When any other unsuccessful status code is returned
- # @raise [SystemCallError]
- # When there is a network error at the operating system level
- # @return [Net::HTTPResponse]
- def execute(request:, operation:, stream: nil, stream_parser: self.stream_parser, model: nil, inputs: nil, &b)
- tracer = self.tracer
- span = tracer.on_request_start(operation:, model:, inputs:)
- http = client || transient_client
- args = (Net::HTTP === http) ? [request] : [URI.join(base_uri, request.path), request]
- res = if stream
- http.request(*args) do |res|
- if Net::HTTPSuccess === res
- handler = event_handler.new stream_parser.new(stream)
- parser = LLM::EventStream::Parser.new
- parser.register(handler)
- res.read_body(parser)
- # If the handler body is empty, the response was
- # most likely not streamed or parsing failed.
- # Preserve the raw body in that case so standard
- # JSON/error handling can parse it later.
- body = handler.body.empty? ? parser.body : handler.body
- res.body = Hash === body || Array === body ? LLM::Object.from(body) : body
- else
- body = +""
- res.read_body { body << _1 }
- res.body = body
- end
- ensure
- handler&.free
- parser&.free
- end
- else
- b ? http.request(*args) { (Net::HTTPSuccess === _1) ? b.call(_1) : _1 } :
- http.request(*args)
- end
- [handle_response(res, tracer, span), span, tracer]
- end
-
- ##
- # Handles the response from a request
- # @param [Net::HTTPResponse] res
- # The response to handle
- # @param [Object, nil] span
- # The span
- # @return [Net::HTTPResponse]
- def handle_response(res, tracer, span)
- case res
- when Net::HTTPOK then res.body = parse_response(res)
- else error_handler.new(tracer, span, res).raise_error!
- end
- res
- end
-
- ##
- # Parse a HTTP response
- # @param [Net::HTTPResponse] res
- # @return [LLM::Object, String]
- def parse_response(res)
- case res["content-type"]
- when %r|\Aapplication/json\s*| then LLM::Object.from(LLM.json.load(res.body))
- else res.body
- end
- end
-
- ##
- # @param [Net::HTTPRequest] req
- # The request to set the body stream for
- # @param [IO] io
- # The IO object to set as the body stream
- # @return [void]
- def set_body_stream(req, io)
- req.body_stream = io
- req["transfer-encoding"] = "chunked" unless req["content-length"]
- end
-
  ##
  # Resolves tools to their function representations
  # @param [Array<LLM::Function, LLM::Tool>] tools
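The new `interrupt!`/`cancel!` API is keyed by an execution owner (a `Fiber`), and the transport's `request` method clears the owner's slot in an `ensure` block. A standalone sketch of that owner-keyed bookkeeping in plain Ruby, under the assumption that interruption records a per-owner flag until the request slot is cleared (the `InterruptRegistry` name is hypothetical, not the gem's implementation):

```ruby
require "monitor"

# Hypothetical sketch: record which owners (Fibers) have been interrupted
# so an in-flight request can poll interrupted?(owner) and bail out early.
class InterruptRegistry
  def initialize
    @interrupted = {}
    @monitor = Monitor.new
  end

  # Mark the owner's active request as interrupted.
  def interrupt!(owner)
    @monitor.synchronize { @interrupted[owner] = true }
    nil
  end

  # True while the owner has been interrupted and not yet cleared.
  def interrupted?(owner)
    @monitor.synchronize { @interrupted[owner] == true }
  end

  # Mirrors the transport's ensure block: forget the owner's state
  # once the request finishes, interrupted or not.
  def clear_request(owner)
    @monitor.synchronize { @interrupted.delete(owner) }
    nil
  end
end

registry = InterruptRegistry.new
owner = Fiber.current
registry.interrupt!(owner)
registry.interrupted?(owner) # true while the flag is set
registry.clear_request(owner)
registry.interrupted?(owner) # false after the slot is cleared
```

Keying on `Fiber.current` (the `request_owner` in the diff) lets concurrent requests on fiber-based schedulers be cancelled individually without affecting each other.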
data/lib/llm/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module LLM
- VERSION = "4.13.0"
+ VERSION = "4.14.0"
  end
data/lib/llm.rb CHANGED
@@ -40,6 +40,14 @@ module LLM
  # Model registry
  @registry = {}

+ ##
+ # Shared HTTP clients used by providers.
+ @clients = {}
+
+ ##
+ # @api private
+ def self.clients = @clients
+
  ##
  # @param [Symbol, LLM::Provider] llm
  # The name of a provider, or an instance of LLM::Provider
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: llm.rb
  version: !ruby/object:Gem::Version
- version: 4.13.0
+ version: 4.14.0
  platform: ruby
  authors:
  - Antar Azri
@@ -231,7 +231,6 @@ files:
  - lib/llm/agent.rb
  - lib/llm/bot.rb
  - lib/llm/buffer.rb
- - lib/llm/client.rb
  - lib/llm/context.rb
  - lib/llm/context/deserializer.rb
  - lib/llm/contract.rb
@@ -255,7 +254,9 @@ files:
  - lib/llm/mcp.rb
  - lib/llm/mcp/command.rb
  - lib/llm/mcp/error.rb
+ - lib/llm/mcp/mailbox.rb
  - lib/llm/mcp/pipe.rb
+ - lib/llm/mcp/router.rb
  - lib/llm/mcp/rpc.rb
  - lib/llm/mcp/transport/http.rb
  - lib/llm/mcp/transport/http/event_handler.rb
@@ -270,6 +271,10 @@ files:
  - lib/llm/object/kernel.rb
  - lib/llm/prompt.rb
  - lib/llm/provider.rb
+ - lib/llm/provider/transport/http.rb
+ - lib/llm/provider/transport/http/execution.rb
+ - lib/llm/provider/transport/http/interruptible.rb
+ - lib/llm/provider/transport/http/stream_decoder.rb
  - lib/llm/providers/anthropic.rb
  - lib/llm/providers/anthropic/error_handler.rb
  - lib/llm/providers/anthropic/files.rb
data/lib/llm/client.rb DELETED
@@ -1,36 +0,0 @@
- # frozen_string_literal: true
-
- module LLM
- ##
- # @api private
- module Client
- private
-
- ##
- # @api private
- def persistent_client
- LLM.lock(:clients) do
- if clients[client_id]
- clients[client_id]
- else
- require "net/http/persistent" unless defined?(Net::HTTP::Persistent)
- client = Net::HTTP::Persistent.new(name: self.class.name)
- client.read_timeout = timeout
- clients[client_id] = client
- end
- end
- end
-
- ##
- # @api private
- def transient_client
- client = Net::HTTP.new(host, port)
- client.read_timeout = timeout
- client.use_ssl = ssl
- client
- end
-
- def client_id = "#{host}:#{port}:#{timeout}:#{ssl}"
- def clients = self.class.clients
- end
- end