llm.rb 4.13.0 → 4.14.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 7847fee7ea1e63553ad5323750fc2e5ac1b4a9082c2f4c5aba71f4587440ea75
-  data.tar.gz: e63bdae085b2f0f606cbdb4633a7eff93fd6e2428fcb85ff5fe94fc78851bf5d
+  metadata.gz: ea1addf0bff644fa11e4f69a806f8ff5b7aa04fbbbc3f0592bd51b6ebc07f0f8
+  data.tar.gz: a3c846b9744e4ef230e2f23ed6ab42f6b4c84a0165b8bc066b7f6a003ee8fc00
 SHA512:
-  metadata.gz: b1c8d8600b3214da5613d152677d13fde796b42e6a29cf8af035e4ad5f28b7cea0466a375b9b444a748e9e063d2e6ad6720b653609cb2b7038e8040cd2b44e39
-  data.tar.gz: c76882f9cd5416312e26f4e25493403df8f9f8c61ee14cba5096383b449bd7a4ce8b9d70834d12176648c3d9206f0f555a1eec4b22bdb6426d88c0c36c8ed592
+  metadata.gz: 7387da06d824d42753ff30455b0e464b7ca6eaa43e9410ce814ad96451c5595154d1e721fb69c9edc0971208aaf8a011ce42078827b57971e0e7c0a66eb0db6e
+  data.tar.gz: 590442f434086b7215d664e6b5d474130499a14fba16810ff7e0b04878d25e46ca8983057af5fd9275d8415d95da6e1439b84388fa450b8c06bc7841c832a48e
data/CHANGELOG.md CHANGED
@@ -2,8 +2,54 @@
 
 ## Unreleased
 
+Changes since `v4.14.0`.
+
+## v4.14.0
+
 Changes since `v4.13.0`.
 
+This release adds request interruption for contexts, reworks provider
+HTTP internals for lower-overhead streaming, and fixes MCP clients so
+parallel tool calls can safely share one connection.
+
+### Add
+
+* **Add request interruption support** <br>
+  Add `LLM::Context#interrupt!`, `LLM::Context#cancel!`, and
+  `LLM::Interrupt` for interrupting in-flight provider requests,
+  inspired by Go's context cancellation.
+
+### Change
+
+* **Rework provider HTTP transport internals** <br>
+  Rework provider HTTP around `LLM::Provider::Transport::HTTP` with
+  explicit transient and persistent transport handling.
+
+* **Reduce SSE parser overhead** <br>
+  Dispatch raw parsed values to registered visitors instead of building
+  an `Event` object for every streamed line.
+
+* **Reduce provider streaming allocations** <br>
+  Decode streamed provider payloads directly in
+  `LLM::Provider::Transport::HTTP` before handing them to provider
+  parsers, which cuts allocation churn and gives a smaller streaming
+  speed bump.
+
+* **Reduce generic SSE parser allocations** <br>
+  Keep unread event-stream buffer data in place until compaction is
+  worthwhile, which lowers allocation churn in the remaining generic
+  SSE path.
+
+### Fix
+
+* **Support parallel MCP tool calls on one client** <br>
+  Route MCP responses by JSON-RPC id so concurrent tool calls can
+  share one client and transport without mismatching replies.
+
+* **Use explicit MCP non-blocking read errors** <br>
+  Use `IO::EAGAINWaitReadable` while continuing to retry on
+  `IO::WaitReadable`.
+
 ## v4.13.0
 
 Changes since `v4.12.0`.
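The `IO::EAGAINWaitReadable` fix noted above can be seen in plain Ruby: `IO::WaitReadable` is a module rather than an exception class, while `IO::EAGAINWaitReadable` is a concrete class that mixes it in, so existing `rescue IO::WaitReadable` retry loops keep working. A small, dependency-free sketch (not llm.rb code):

```ruby
# Reading from an empty pipe without blocking raises a concrete
# IO::WaitReadable-flavored exception rather than the bare module.
r, w = IO.pipe
err = begin
  r.read_nonblock(1) # nothing written yet, so this cannot succeed
  nil
rescue IO::WaitReadable => e
  e # on Linux this is IO::EAGAINWaitReadable
end
p err.is_a?(IO::WaitReadable) # => true
p err.is_a?(Exception)        # => true
w.close
r.close
```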
data/README.md CHANGED
@@ -4,7 +4,7 @@
 <p align="center">
   <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
   <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.13.0-green.svg?" alt="Version"></a>
+  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-4.14.0-green.svg?" alt="Version"></a>
 </p>
 
 ## About
@@ -28,18 +28,9 @@ so they compose naturally instead of becoming separate subsystems.
 
 ## Architecture
 
-```
-External MCP    Internal MCP    OpenAPI / REST
-     │               │                │
-     └────────── Tools / MCP Layer ───────┘
-
-          llm.rb Contexts
-
-           LLM Providers
-      (OpenAI, Anthropic, etc.)
-
-         Your Application
-```
+<p align="center">
+  <img src="https://github.com/llmrb/llm.rb/raw/main/resources/architecture.png" alt="llm.rb architecture" width="790">
+</p>
 
 ## Core Concept
 
@@ -74,6 +65,10 @@ same context object.
 - **Streaming and tool execution work together**
   Start tool work while output is still streaming so you can hide latency
   instead of waiting for turns to finish.
+- **Requests can be interrupted cleanly**
+  Stop in-flight provider work through the same runtime instead of treating
+  cancellation as a separate concern. `LLM::Context#cancel!` is inspired by
+  Go's context cancellation model.
 - **Concurrency is a first-class feature**
   Use threads, fibers, or async tasks without rewriting your tool layer.
 - **Advanced workloads are built in, not bolted on**
@@ -85,12 +80,23 @@ same context object.
 - **MCP is built in**
   Connect to MCP servers over stdio or HTTP without bolting on a separate
   integration stack.
+- **Provider support is broad**
+  Work with OpenAI, OpenAI-compatible endpoints, Anthropic, Google, DeepSeek,
+  Z.ai, xAI, llama.cpp, and Ollama through the same runtime.
 - **Tools are explicit**
   Run local tools, provider-native tools, and MCP tools through the same path
   with fewer special cases.
 - **Providers are normalized, not flattened**
   Share one API surface across providers without losing access to provider-
   specific capabilities where they matter.
+- **Responses keep a uniform shape**
+  Provider calls return
+  [`LLM::Response`](https://0x1eef.github.io/x/llm.rb/LLM/Response.html)
+  objects as a common base shape, then extend them with endpoint- or
+  provider-specific behavior when needed.
+- **Low-level access is still there**
+  Normalized responses still keep the raw `Net::HTTPResponse` available when
+  you need headers, status, or other HTTP details.
 - **Local model metadata is included**
   Model capabilities, pricing, and limits are available locally without extra
   API calls.
@@ -114,6 +120,7 @@ same context object.
 - **Chat & Contexts** — stateless and stateful interactions with persistence
 - **Context Serialization** — save and restore state across processes or time
 - **Streaming** — visible output, reasoning output, tool-call events
+- **Request Interruption** — stop in-flight provider work cleanly
 - **Tool Calling** — class-based tools and closure-based functions
 - **Run Tools While Streaming** — overlap model output with tool latency
 - **Concurrent Execution** — threads, async tasks, and fibers
data/lib/llm/context.rb CHANGED
@@ -62,6 +62,7 @@ module LLM
     @mode = params.delete(:mode) || :completions
     @params = {model: llm.default_model, schema: nil}.compact.merge!(params)
     @messages = LLM::Buffer.new(llm)
+    @owner = Fiber.current
   end
 
   ##
@@ -184,6 +185,15 @@ module LLM
     end
   end
 
+  ##
+  # Interrupt the active request, if any.
+  # This is inspired by Go's context cancellation model.
+  # @return [nil]
+  def interrupt!
+    llm.interrupt!(@owner)
+  end
+  alias_method :cancel!, :interrupt!
+
   ##
   # Returns token usage accumulated in this context
   # @note
data/lib/llm/error.rb CHANGED
@@ -55,6 +55,10 @@ module LLM
   # When stuck in a tool call loop
   ToolLoopError = Class.new(Error)
 
+  ##
+  # When a request is interrupted
+  Interrupt = Class.new(Error)
+
   ##
   # When a tool call cannot be mapped to a local tool
   NoSuchToolError = Class.new(Error)
@@ -13,13 +13,15 @@ module LLM
 
   ##
   # "data:" event callback
-  # @param [LLM::EventStream::Event] event
+  # @param [LLM::EventStream::Event, String, nil] event
+  # @param [String, nil] chunk
   # @return [void]
-  def on_data(event)
-    return if event.end?
-    chunk = LLM.json.load(event.value)
-    return unless chunk
-    @parser.parse!(chunk)
+  def on_data(event, chunk = nil)
+    value = chunk ? event : event.value
+    return if value == "[DONE]"
+    payload = LLM.json.load(value)
+    return unless payload
+    @parser.parse!(payload)
   rescue *LLM.json.parser_error
   end
 
@@ -28,13 +30,15 @@
   # is received, regardless of whether it has
   # a field name or not. Primarily for ollama,
   # which does not emit Server-Sent Events (SSE).
-  # @param [LLM::EventStream::Event] event
+  # @param [LLM::EventStream::Event, String, nil] event
+  # @param [String, nil] chunk
   # @return [void]
-  def on_chunk(event)
-    return if event.end?
-    chunk = LLM.json.load(event.chunk)
-    return unless chunk
-    @parser.parse!(chunk)
+  def on_chunk(event, chunk = nil)
+    raw_chunk = chunk || event&.chunk || event
+    return if raw_chunk == "[DONE]"
+    payload = LLM.json.load(raw_chunk)
+    return unless payload
+    @parser.parse!(payload)
   rescue *LLM.json.parser_error
   end
 
@@ -4,8 +4,17 @@ module LLM::EventStream
   ##
   # @private
   class Event
-    FIELD_REGEXP = /[^:]+/
-    VALUE_REGEXP = /(?<=: ).+/
+    UNSET = Object.new.freeze
+
+    def self.parse(chunk)
+      newline = chunk.end_with?("\n") ? chunk.bytesize - 1 : chunk.bytesize
+      separator = chunk.index(":")
+      return [nil, nil] unless separator
+      field = chunk.byteslice(0, separator)
+      value_start = separator + (chunk.getbyte(separator + 1) == 32 ? 2 : 1)
+      value = value_start < newline ? chunk.byteslice(value_start, newline - value_start) : nil
+      [field, value]
+    end
 
     ##
     # Returns the field name
@@ -25,9 +34,10 @@
     ##
     # @param [String] chunk
     # @return [LLM::EventStream::Event]
-    def initialize(chunk)
-      @field = chunk[FIELD_REGEXP]
-      @value = chunk[VALUE_REGEXP]
+    def initialize(chunk, field: UNSET, value: UNSET)
+      @field, @value = self.class.parse(chunk) if field.equal?(UNSET) || value.equal?(UNSET)
+      @field = field unless field.equal?(UNSET)
+      @value = value unless value.equal?(UNSET)
       @chunk = chunk
     end
 
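The byte-offset split performed by `Event.parse` above can be exercised on its own; this standalone copy of the logic shows how `field` and `value` fall out of a raw SSE line (the helper name `parse_sse_line` is illustrative):

```ruby
# Split one "field: value\n" SSE line using byte offsets instead of
# regexps: find the ":" separator, skip one optional space after it,
# and exclude a trailing newline from the value.
def parse_sse_line(chunk)
  newline = chunk.end_with?("\n") ? chunk.bytesize - 1 : chunk.bytesize
  separator = chunk.index(":")
  return [nil, nil] unless separator
  field = chunk.byteslice(0, separator)
  value_start = separator + (chunk.getbyte(separator + 1) == 32 ? 2 : 1)
  value = value_start < newline ? chunk.byteslice(value_start, newline - value_start) : nil
  [field, value]
end

p parse_sse_line("data: hello\n") # => ["data", "hello"]
p parse_sse_line("event:ping\n")  # => ["event", "ping"]
p parse_sse_line("\n")            # => [nil, nil]
```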
@@ -4,6 +4,8 @@ module LLM::EventStream
   ##
   # @private
   class Parser
+    COMPACT_THRESHOLD = 4096
+
     ##
     # @return [LLM::EventStream::Parser]
     def initialize
@@ -42,7 +44,8 @@ module LLM::EventStream
     # Returns the internal buffer
     # @return [String]
     def body
-      @buffer.dup
+      return @buffer.dup if @cursor.zero?
+      @buffer.byteslice(@cursor, @buffer.bytesize - @cursor) || +""
     end
 
     ##
@@ -55,34 +58,46 @@ module LLM::EventStream
 
     private
 
-    def parse!(event)
-      event = Event.new(event)
-      dispatch(event)
+    def parse!(chunk)
+      field, value = Event.parse(chunk)
+      dispatch_visitors(field, value, chunk)
+      dispatch_callbacks(field, value, chunk)
+    end
+
+    def dispatch_visitors(field, value, chunk)
+      @visitors.each { dispatch_visitor(_1, field, value, chunk) }
     end
 
-    def dispatch(event)
-      @visitors.each { dispatch_visitor(_1, event) }
-      @events[event.field].each { _1.call(event) }
+    def dispatch_callbacks(field, value, chunk)
+      callbacks = @events[field]
+      return if callbacks.empty?
+      event = Event.new(chunk, field:, value:)
+      callbacks.each { _1.call(event) }
     end
 
-    def dispatch_visitor(visitor, event)
-      method = "on_#{event.field}"
+    def dispatch_visitor(visitor, field, value, chunk)
+      method = "on_#{field}"
       if visitor.respond_to?(method)
-        visitor.public_send(method, event)
+        visitor.public_send(method, value, chunk)
      elsif visitor.respond_to?("on_chunk")
-        visitor.on_chunk(event)
+        visitor.on_chunk(nil, chunk)
       end
     end
 
     def each_line
       while (newline = @buffer.index("\n", @cursor))
-        line = @buffer[@cursor..newline]
+        line = @buffer.byteslice(@cursor, newline - @cursor + 1)
         @cursor = newline + 1
         yield(line)
       end
       return if @cursor.zero?
-      @buffer = @buffer[@cursor..] || +""
-      @cursor = 0
+      if @cursor >= @buffer.bytesize
+        @buffer.clear
+        @cursor = 0
+      elsif @cursor >= COMPACT_THRESHOLD
+        @buffer = @buffer.byteslice(@cursor, @buffer.bytesize - @cursor) || +""
+        @cursor = 0
+      end
     end
   end
 end
@@ -74,7 +74,7 @@ class LLM::MCP
   #  The IO stream to read from (:stdout, :stderr)
   # @raise [LLM::Error]
   #  When the command is not running
-  # @raise [IO::WaitReadable]
+  # @raise [IO::EAGAINWaitReadable]
   #  When no complete message is available to read
   # @return [String]
   #  The next complete line from the specified IO stream
@@ -0,0 +1,23 @@
+# frozen_string_literal: true
+
+class LLM::MCP
+  ##
+  # A per-request mailbox for routing a JSON-RPC response back to the
+  # caller waiting on that request id.
+  class Mailbox
+    def initialize
+      @queue = Queue.new
+    end
+
+    def <<(message)
+      @queue << message
+      self
+    end
+
+    def pop
+      @queue.pop(true)
+    rescue ThreadError
+      nil
+    end
+  end
+end
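The `Mailbox` class above leans on `Queue#pop(true)`, the non-blocking form that raises `ThreadError` when the queue is empty; converting that to `nil` lets callers poll instead of block. A standalone copy of the behavior:

```ruby
# A thread-safe mailbox whose pop never blocks: an empty queue yields
# nil rather than suspending the calling thread.
class Mailbox
  def initialize
    @queue = Queue.new
  end

  def <<(message)
    @queue << message
    self
  end

  def pop
    @queue.pop(true)  # non_block = true
  rescue ThreadError  # raised when the queue is empty
    nil
  end
end

box = Mailbox.new
p box.pop          # => nil (empty, does not block)
box << {"id" => 1}
p box.pop          # => {"id"=>1}
```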
data/lib/llm/mcp/pipe.rb CHANGED
@@ -27,7 +27,7 @@ class LLM::MCP
 
   ##
   # Reads from the reader end without blocking.
-  # @raise [IO::WaitReadable]
+  # @raise [IO::EAGAINWaitReadable]
   #  When no data is available to read
   # @return [String]
   def read_nonblock(...)
@@ -0,0 +1,44 @@
+# frozen_string_literal: true
+
+class LLM::MCP
+  ##
+  # Coordinates shared access to a transport by routing JSON-RPC
+  # responses to the mailbox waiting on the matching request id.
+  class Router
+    def initialize
+      @request_id = -1
+      @pending = {}
+      @lock = Monitor.new
+      @writer = Monitor.new
+      @reader = Monitor.new
+    end
+
+    def register
+      @lock.synchronize do
+        @request_id += 1
+        mailbox = LLM::MCP::Mailbox.new
+        @pending[@request_id] = mailbox
+        [@request_id, mailbox]
+      end
+    end
+
+    def clear(id)
+      @lock.synchronize { @pending.delete(id) }
+    end
+
+    def read(transport)
+      @reader.synchronize { transport.read_nonblock }
+    end
+
+    def write(transport, message)
+      @writer.synchronize { transport.write(message) }
+    end
+
+    def route(response)
+      mailbox = @lock.synchronize { @pending[response["id"]] }
+      raise LLM::MCP::MismatchError.new(expected_id: nil, actual_id: response["id"]) unless mailbox
+      mailbox << response
+      nil
+    end
+  end
+end
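The routing idea above can be demonstrated without the gem: each caller registers for a fresh id, and replies are delivered to the matching mailbox even when they arrive out of order. A reduced sketch (`Queue` stands in for `LLM::MCP::Mailbox`):

```ruby
require "monitor"

# Route JSON-RPC responses by "id" to the mailbox registered for that
# request, so concurrent callers can share one connection safely.
class Router
  def initialize
    @request_id = -1
    @pending = {}
    @lock = Monitor.new
  end

  def register
    @lock.synchronize do
      @request_id += 1
      mailbox = Queue.new            # stands in for LLM::MCP::Mailbox
      @pending[@request_id] = mailbox
      [@request_id, mailbox]
    end
  end

  def route(response)
    mailbox = @lock.synchronize { @pending[response["id"]] }
    raise "no pending request for id=#{response["id"]}" unless mailbox
    mailbox << response
    nil
  end
end

router = Router.new
id_a, box_a = router.register
id_b, box_b = router.register
router.route({"id" => id_b, "result" => "second"}) # replies arrive out of order
router.route({"id" => id_a, "result" => "first"})
p box_a.pop["result"] # => "first"
p box_b.pop["result"] # => "second"
```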
data/lib/llm/mcp/rpc.rb CHANGED
@@ -27,13 +27,15 @@ class LLM::MCP
     def call(transport, method, params = {})
       message = {jsonrpc: "2.0", method:, params: default_params(method).merge(params)}
       if notification?(method)
-        transport.write(message)
-        nil
-      else
-        @request_id = (@request_id || -1) + 1
-        id = @request_id
-        transport.write(message.merge(id:))
-        recv(transport, id)
+        router.write(transport, message)
+        return nil
+      end
+      id, mailbox = router.register
+      begin
+        router.write(transport, message.merge(id:))
+        recv(transport, id, mailbox)
+      ensure
+        router.clear(id)
       end
     end
 
@@ -49,19 +51,12 @@
     # When the MCP process returns an error
     # @return [Object, nil]
     #  The result returned by the MCP process
-    def recv(transport, id)
+    def recv(transport, id, mailbox)
       poll(timeout:, ex: [IO::WaitReadable]) do
         loop do
-          res = transport.read_nonblock
-          if res["id"] == id && res["error"]
-            raise LLM::MCP::Error.from(response: res)
-          elsif res["id"] == id
-            break res["result"]
-          elsif res["method"]
-            next
-          elsif res.key?("id")
-            raise LLM::MCP::MismatchError.new(expected_id: id, actual_id: res["id"])
-          end
+          res = mailbox.pop
+          return handle_response(id, res) if res
+          route_response(router.read(transport), id)
         end
       end
     end
@@ -119,5 +114,21 @@
        sleep 0.05
      end
    end
+
+    def handle_response(id, res)
+      raise LLM::MCP::Error.from(response: res) if res["error"]
+      return res["result"] if res["id"] == id
+      raise LLM::MCP::MismatchError.new(expected_id: id, actual_id: res["id"])
+    end
+
+    def route_response(res, id)
+      return nil if res["method"]
+      return router.route(res) if res.key?("id")
+      raise LLM::MCP::MismatchError.new(expected_id: id, actual_id: nil)
+    end
+
+    def router
+      @router ||= LLM::MCP::Router.new
+    end
   end
 end
@@ -21,29 +21,31 @@ module LLM::MCP::Transport
 
     ##
     # Receives the SSE event name.
-    # @param [LLM::EventStream::Event] event
+    # @param [LLM::EventStream::Event, String, nil] event
+    # @param [String, nil] chunk
     #  The event stream event
     # @return [void]
-    def on_event(event)
-      @event = event.value
+    def on_event(event, chunk = nil)
+      @event = chunk ? event : event.value
     end
 
     ##
     # Receives one line of SSE data.
-    # @param [LLM::EventStream::Event] event
+    # @param [LLM::EventStream::Event, String, nil] event
+    # @param [String, nil] chunk
     #  The event stream event
     # @return [void]
-    def on_data(event)
-      @data << event.value.to_s
+    def on_data(event, chunk = nil)
+      @data << (chunk ? event : event.value).to_s
     end
 
     # The generic event stream parser dispatches one line at a time.
     # A blank line terminates the current SSE event.
-    # @param [LLM::EventStream::Event] event
+    # @param [LLM::EventStream::Event, String] event
     #  The event stream event
     # @return [void]
-    def on_chunk(event)
-      flush if event.chunk == "\n"
+    def on_chunk(event, chunk = nil)
+      flush if (chunk || event&.chunk || event) == "\n"
     end
 
     private
@@ -82,13 +82,13 @@ module LLM::MCP::Transport
     # Reads the next queued message without blocking.
     # @raise [LLM::MCP::Error]
     #  When the transport is not running
-    # @raise [IO::WaitReadable]
+    # @raise [IO::EAGAINWaitReadable]
     #  When no complete message is available to read
     # @return [Hash]
     def read_nonblock
       lock do
         raise LLM::MCP::Error, "MCP transport is not running" unless running?
-        raise IO::WaitReadable if @queue.empty?
+        raise IO::EAGAINWaitReadable, "no complete message available" if @queue.empty?
         @queue.shift
       end
     end
@@ -57,7 +57,7 @@ module LLM::MCP::Transport
     # Reads a message from the MCP process without blocking.
     # @raise [LLM::Error]
     #  When the transport is not running
-    # @raise [IO::WaitReadable]
+    # @raise [IO::EAGAINWaitReadable]
     #  When no complete message is available to read
     # @return [Hash]
     #  The next message from the MCP process
data/lib/llm/mcp.rb CHANGED
@@ -10,11 +10,14 @@
 # transports and focuses on discovering tools that can be used through
 # {LLM::Context LLM::Context} and {LLM::Agent LLM::Agent}.
 #
-# Like {LLM::Context LLM::Context}, an MCP client is stateful and is
-# expected to remain isolated to a single thread.
+# An MCP client is stateful. Coordinate lifecycle operations such as
+# {#start} and {#stop}; request methods can be issued concurrently and
+# responses are matched by JSON-RPC id.
 class LLM::MCP
   require_relative "mcp/error"
   require_relative "mcp/command"
+  require_relative "mcp/mailbox"
+  require_relative "mcp/router"
   require_relative "mcp/rpc"
   require_relative "mcp/pipe"
   require_relative "mcp/transport/http"
@@ -0,0 +1,115 @@
+# frozen_string_literal: true
+
+module LLM::Provider::Transport
+  class HTTP
+    ##
+    # Internal HTTP request execution methods for {LLM::Provider}.
+    #
+    # This module handles provider-side HTTP execution, response parsing,
+    # streaming, and request body setup through
+    # {LLM::Provider::Transport::HTTP}.
+    #
+    # @api private
+    module HTTP::Execution
+      private
+
+      ##
+      # Executes a HTTP request
+      # @param [Net::HTTPRequest] request
+      #  The request to send
+      # @param [Proc] b
+      #  A block to yield the response to (optional)
+      # @return [Net::HTTPResponse]
+      #  The response from the server
+      # @raise [LLM::Error::Unauthorized]
+      #  When authentication fails
+      # @raise [LLM::Error::RateLimit]
+      #  When the rate limit is exceeded
+      # @raise [LLM::Error]
+      #  When any other unsuccessful status code is returned
+      # @raise [SystemCallError]
+      #  When there is a network error at the operating system level
+      # @return [Net::HTTPResponse]
+      def execute(request:, operation:, stream: nil, stream_parser: self.stream_parser, model: nil, inputs: nil, &b)
+        owner = transport.request_owner
+        tracer = self.tracer
+        span = tracer.on_request_start(operation:, model:, inputs:)
+        res = transport.request(request, owner:) do |http|
+          perform_request(http, request, stream, stream_parser, &b)
+        end
+        [handle_response(res, tracer, span), span, tracer]
+      rescue *LLM::Provider::Transport::HTTP::Interruptible::INTERRUPT_ERRORS
+        raise LLM::Interrupt, "request interrupted" if transport.interrupted?(owner)
+        raise
+      end
+
+      ##
+      # Handles the response from a request
+      # @param [Net::HTTPResponse] res
+      #  The response to handle
+      # @param [Object, nil] span
+      #  The span
+      # @return [Net::HTTPResponse]
+      def handle_response(res, tracer, span)
+        case res
+        when Net::HTTPOK then res.body = parse_response(res)
+        else error_handler.new(tracer, span, res).raise_error!
+        end
+        res
+      end
+
+      ##
+      # Parse a HTTP response
+      # @param [Net::HTTPResponse] res
+      # @return [LLM::Object, String]
+      def parse_response(res)
+        case res["content-type"]
+        when %r{\Aapplication/json\s*} then LLM::Object.from(LLM.json.load(res.body))
+        else res.body
+        end
+      end
+
+      ##
+      # @param [Net::HTTPRequest] req
+      #  The request to set the body stream for
+      # @param [IO] io
+      #  The IO object to set as the body stream
+      # @return [void]
+      def set_body_stream(req, io)
+        req.body_stream = io
+        req["transfer-encoding"] = "chunked" unless req["content-length"]
+      end
+
+      ##
+      # Performs the request on the given HTTP connection.
+      # @param [Net::HTTP] http
+      # @param [Net::HTTPRequest] request
+      # @param [Object, nil] stream
+      # @param [Class] stream_parser
+      # @param [Proc, nil] b
+      # @return [Net::HTTPResponse]
+      def perform_request(http, request, stream, stream_parser, &b)
+        if stream
+          http.request(request) do |res|
+            if Net::HTTPSuccess === res
+              parser = StreamDecoder.new(stream_parser.new(stream))
+              res.read_body(parser)
+              body = parser.body
+              res.body = (Hash === body || Array === body) ? LLM::Object.from(body) : body
+            else
+              body = +""
+              res.read_body { body << _1 }
+              res.body = body
+            end
+          ensure
+            parser&.free
+          end
+        elsif b
+          http.request(request) { (Net::HTTPSuccess === _1) ? b.call(_1) : _1 }
+        else
+          http.request(request)
+        end
+      end
+    end
+  end
+end
@@ -0,0 +1,109 @@
+# frozen_string_literal: true
+
+class LLM::Provider
+  module Transport
+    class HTTP
+      ##
+      # Internal request interruption methods for
+      # {LLM::Provider::Transport::HTTP}.
+      #
+      # This module tracks active requests by execution owner and provides
+      # the logic used to interrupt an in-flight request by closing the
+      # active HTTP connection.
+      #
+      # @api private
+      module Interruptible
+        INTERRUPT_ERRORS = [::IOError, ::EOFError, Errno::EBADF].freeze
+        Request = Struct.new(:http, :connection, keyword_init: true)
+
+        ##
+        # Interrupt an active request, if any.
+        # @param [Fiber] owner
+        #  The execution owner whose request should be interrupted
+        # @return [nil]
+        def interrupt!(owner)
+          req = request_for(owner) or return
+          lock { (@interrupts ||= {})[owner] = true }
+          if persistent_http?(req.http)
+            close_socket(req.connection&.http)
+            req.http.finish(req.connection)
+          elsif transient_http?(req.http)
+            close_socket(req.http)
+            req.http.finish if req.http.active?
+          end
+        rescue *INTERRUPT_ERRORS
+          nil
+        end
+
+        private
+
+        ##
+        # Closes the active socket for a request, if present.
+        # @param [Net::HTTP, nil] http
+        # @return [nil]
+        def close_socket(http)
+          socket = http&.instance_variable_get(:@socket) or return
+          socket = socket.io if socket.respond_to?(:io)
+          socket.close
+        rescue *INTERRUPT_ERRORS
+          nil
+        end
+
+        ##
+        # Returns whether the active request is using a transient HTTP client.
+        # @param [Object, nil] http
+        # @return [Boolean]
+        def transient_http?(http)
+          Net::HTTP === http
+        end
+
+        ##
+        # Returns whether the active request is using a persistent HTTP client.
+        # @param [Object, nil] http
+        # @return [Boolean]
+        def persistent_http?(http)
+          defined?(Net::HTTP::Persistent) && Net::HTTP::Persistent === http
+        end
+
+        ##
+        # Returns the active request for an execution owner.
+        # @param [Fiber] owner
+        # @return [Request, nil]
+        def request_for(owner)
+          lock do
+            @requests ||= {}
+            @requests[owner]
+          end
+        end
+
+        ##
+        # Records an active request for an execution owner.
+        # @param [Request] req
+        # @param [Fiber] owner
+        # @return [Request]
+        def set_request(req, owner)
+          lock do
+            @requests ||= {}
+            @requests[owner] = req
+          end
+        end
+
+        ##
+        # Clears the active request for an execution owner.
+        # @param [Fiber] owner
+        # @return [Request, nil]
+        def clear_request(owner)
+          lock { @requests&.delete(owner) }
+        end
+
+        ##
+        # Returns whether an execution owner was interrupted.
+        # @param [Fiber] owner
+        # @return [Boolean, nil]
+        def interrupted?(owner)
+          lock { @interrupts&.delete(owner) }
+        end
+      end
+    end
+  end
+end
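The interruption strategy above, closing the active socket so a blocked read wakes with one of the `INTERRUPT_ERRORS` and is re-raised as a dedicated error, can be sketched with plain sockets and threads (names are illustrative, not llm.rb API):

```ruby
require "socket"

# Interrupt an in-flight blocking read by closing its socket from another
# thread; the blocked side translates the resulting IOError into a
# dedicated Interrupt error when the owner was flagged as interrupted.
Interrupt = Class.new(StandardError)
interrupts = {}

server = TCPServer.new("127.0.0.1", 0)
client = TCPSocket.new("127.0.0.1", server.addr[1])

worker = Thread.new do
  client.read(1) # blocks: the peer never writes
rescue IOError, EOFError, Errno::EBADF
  raise Interrupt, "request interrupted" if interrupts.delete(:owner)
  raise
end

sleep 0.1                 # let the worker block on the read
interrupts[:owner] = true # mark the owner as interrupted
client.close              # closing the socket wakes the blocked read

result = begin
  worker.join
  :finished
rescue Interrupt
  :interrupted
end
p result # => :interrupted
server.close
```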
@@ -0,0 +1,92 @@
+# frozen_string_literal: true
+
+module LLM::Provider::Transport
+  ##
+  # @private
+  class HTTP::StreamDecoder
+    ##
+    # @return [Object]
+    attr_reader :parser
+
+    ##
+    # @param [#parse!, #body] parser
+    # @return [LLM::Provider::Transport::HTTP::StreamDecoder]
+    def initialize(parser)
+      @buffer = +""
+      @cursor = 0
+      @data = []
+      @parser = parser
+    end
+
+    ##
+    # @param [String] chunk
+    # @return [void]
+    def <<(chunk)
+      @buffer << chunk
+      each_line { handle_line(_1) }
+    end
+
+    ##
+    # @return [Object]
+    def body
+      parser.body
+    end
+
+    ##
+    # @return [void]
+    def free
+      @buffer.clear
+      @cursor = 0
+      @data.clear
+      parser.free if parser.respond_to?(:free)
+    end
+
+    private
+
+    def handle_line(line)
+      if line == "\n" || line == "\r\n"
+        flush_sse_event
+      elsif line.start_with?("data:")
+        @data << field_value(line)
+      elsif line.start_with?("event:", "id:", "retry:", ":")
+      else
+        decode!(strip_newline(line))
+      end
+    end
+
+    def flush_sse_event
+      return if @data.empty?
+      decode!(@data.join("\n"))
+      @data.clear
+    end
+
+    def field_value(line)
+      value_start = line.getbyte(5) == 32 ? 6 : 5
+      strip_newline(line.byteslice(value_start..))
+    end
+
+    def strip_newline(line)
+      line = line.byteslice(0, line.bytesize - 1) if line.end_with?("\n")
+      line = line.byteslice(0, line.bytesize - 1) if line.end_with?("\r")
+      line
+    end
+
+    def decode!(payload)
+      return if payload.empty? || payload == "[DONE]"
+      chunk = LLM.json.load(payload)
+      parser.parse!(chunk) if chunk
+    rescue *LLM.json.parser_error
+    end
+
+    def each_line
+      while (newline = @buffer.index("\n", @cursor))
+        line = @buffer[@cursor..newline]
+        @cursor = newline + 1
+        yield(line)
+      end
+      return if @cursor.zero?
+      @buffer = @buffer[@cursor..] || +""
+      @cursor = 0
+    end
+  end
+end
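The cursor-based buffering used by `StreamDecoder#each_line` above can be isolated into a small class: chunks accumulate in one string, complete lines are emitted as they appear, and the consumed prefix is trimmed afterwards. A standalone sketch:

```ruby
# Accumulate arbitrary chunks and emit complete "\n"-terminated lines,
# tracking a cursor so the buffer is trimmed once per feed rather than
# reallocated on every line.
class LineBuffer
  def initialize
    @buffer = +""
    @cursor = 0
  end

  # Feed a chunk; returns the array of complete lines now available.
  def <<(chunk)
    @buffer << chunk
    lines = []
    while (newline = @buffer.index("\n", @cursor))
      lines << @buffer[@cursor..newline]
      @cursor = newline + 1
    end
    unless @cursor.zero?
      @buffer = @buffer[@cursor..] || +"" # drop the consumed prefix
      @cursor = 0
    end
    lines
  end
end

buf = LineBuffer.new
p buf << "data: {\"a\""  # => []  (no complete line yet)
p buf << ": 1}\ndata: "  # => ["data: {\"a\": 1}\n"]
p buf << "[DONE]\n"      # => ["data: [DONE]\n"]
```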
@@ -0,0 +1,144 @@
1
+ # frozen_string_literal: true
2
+
3
+ class LLM::Provider
4
+ module Transport
5
+ ##
6
+ # The {LLM::Provider::Transport::HTTP LLM::Provider::Transport::HTTP}
7
+ # class manages HTTP connections for {LLM::Provider}. It handles
8
+ # transient and persistent clients, tracks active requests by owner,
9
+ # and interrupts in-flight requests when needed.
10
+ #
11
+ # @api private
12
+ class HTTP
13
+ require_relative "http/stream_decoder"
14
+ require_relative "http/interruptible"
15
+
16
+ include Interruptible
17
+
18
+ ##
19
+ # @param [String] host
20
+ # @param [Integer] port
21
+ # @param [Integer] timeout
22
+ # @param [Boolean] ssl
23
+ # @param [Boolean] persistent
24
+ # @return [LLM::Provider::Transport::HTTP]
25
+ def initialize(host:, port:, timeout:, ssl:, persistent: false)
26
+ @host = host
27
+ @port = port
28
+ @timeout = timeout
29
+ @ssl = ssl
30
+ @base_uri = URI("#{ssl ? "https" : "http"}://#{host}:#{port}/")
31
+ @persistent_client = persistent ? persistent_client : nil
32
+ @monitor = Monitor.new
+ end
+
+ ##
+ # Interrupt an active request, if any.
+ # @param [Fiber] owner
+ # @return [nil]
+ def interrupt!(owner)
+ super
+ end
+
+ ##
+ # Returns whether an execution owner was interrupted.
+ # @param [Fiber] owner
+ # @return [Boolean, nil]
+ def interrupted?(owner)
+ super
+ end
+
+ ##
+ # Returns the current request owner.
+ # @return [Fiber]
+ def request_owner
+ Fiber.current
+ end
+
+ ##
+ # Configures the transport to use a persistent HTTP connection pool.
+ # @return [LLM::Provider::Transport::HTTP]
+ def persist!
+ client = persistent_client
+ lock do
+ @persistent_client = client
+ self
+ end
+ end
+ alias_method :persistent, :persist!
+
+ ##
+ # @return [Boolean]
+ def persistent?
+ !persistent_client.nil?
+ end
+
+ ##
+ # Performs a request on the current HTTP transport.
+ # @param [Net::HTTPRequest] request
+ # @param [Fiber] owner
+ # @yieldparam [Net::HTTP] http
+ # @return [Object]
+ def request(request, owner:, &)
+ if persistent?
+ request_persistent(request, owner, &)
+ else
+ request_transient(request, owner, &)
+ end
+ ensure
+ clear_request(owner)
+ end
+
+ ##
+ # @return [String]
+ def inspect
+ "#<#{self.class.name}:0x#{object_id.to_s(16)} @persistent=#{persistent?}>"
+ end
+
+ private
+
+ attr_reader :host, :port, :timeout, :ssl, :base_uri
+
+ def request_transient(request, owner, &)
+ http = transient_client
+ set_request(Request.new(http:), owner)
+ yield http
+ end
+
+ def request_persistent(request, owner, &)
+ persistent_client.connection_for(URI.join(base_uri, request.path)) do |connection|
+ set_request(Request.new(http: persistent_client, connection:), owner)
+ yield connection.http
+ end
+ end
+
+ def persistent_client
+ LLM.lock(:clients) do
+ if LLM.clients[client_id]
+ LLM.clients[client_id]
+ else
+ require "net/http/persistent" unless defined?(Net::HTTP::Persistent)
+ client = Net::HTTP::Persistent.new(name: self.class.name)
+ client.read_timeout = timeout
+ LLM.clients[client_id] = client
+ end
+ end
+ end
+
+ def transient_client
+ client = Net::HTTP.new(host, port)
+ client.read_timeout = timeout
+ client.use_ssl = ssl
+ client
+ end
+
+ def client_id
+ "#{host}:#{port}:#{timeout}:#{ssl}"
+ end
+
+ def lock(&)
+ @monitor.synchronize(&)
+ end
+ end
+ end
+ end
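The transport above caches one shared persistent client per `host:port:timeout:ssl` tuple, with a `Monitor` guarding the cache. A minimal standalone sketch of that cache-keying pattern in plain Ruby (the `ClientCache` name and `fetch` signature are hypothetical illustrations, not the gem's API):

```ruby
require "monitor"

# Hypothetical sketch: one shared client per connection-parameter tuple,
# built lazily on first use and guarded by a Monitor, mirroring how the
# transport keys persistent clients by "host:port:timeout:ssl".
class ClientCache
  def initialize
    @clients = {}
    @monitor = Monitor.new
  end

  # Returns the cached client for the given parameters, invoking the
  # block once to build it (the block stands in for
  # Net::HTTP::Persistent.new in the real transport).
  def fetch(host:, port:, timeout:, ssl:, &build)
    key = "#{host}:#{port}:#{timeout}:#{ssl}"
    @monitor.synchronize { @clients[key] ||= build.call }
  end
end

cache = ClientCache.new
a = cache.fetch(host: "api.openai.com", port: 443, timeout: 60, ssl: true) { Object.new }
b = cache.fetch(host: "api.openai.com", port: 443, timeout: 60, ssl: true) { Object.new }
# a and b are the same object: identical parameters share one client.
```

Keying on the full parameter tuple (rather than just the host) means two providers with different timeouts never share a connection whose read timeout surprises one of them.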
data/lib/llm/provider.rb CHANGED
@@ -7,14 +7,9 @@
  # @abstract
  class LLM::Provider
  require "net/http"
- require_relative "client"
- include LLM::Client
-
- @@clients = {}
-
- ##
- # @api private
- def self.clients = @@clients
+ require_relative "provider/transport/http"
+ require_relative "provider/transport/http/execution"
+ include Transport::HTTP::Execution

  ##
  # @param [String, nil] key
@@ -36,9 +31,9 @@ class LLM::Provider
  @port = port
  @timeout = timeout
  @ssl = ssl
- @client = persistent ? persistent_client : nil
  @base_uri = URI("#{ssl ? "https" : "http"}://#{host}:#{port}/")
  @headers = {"User-Agent" => "llm.rb v#{LLM::VERSION}"}
+ @transport = Transport::HTTP.new(host:, port:, timeout:, ssl:, persistent:)
  @monitor = Monitor.new
  end

@@ -47,7 +42,7 @@ class LLM::Provider
  # @return [String]
  # @note The secret key is redacted in inspect for security reasons
  def inspect
- "#<#{self.class.name}:0x#{object_id.to_s(16)} @key=[REDACTED] @client=#{@client.inspect} @tracer=#{tracer.inspect}>"
+ "#<#{self.class.name}:0x#{object_id.to_s(16)} @key=[REDACTED] @transport=#{transport.inspect} @tracer=#{tracer.inspect}>"
  end

  ##
@@ -312,13 +307,20 @@ class LLM::Provider
  # # do something with 'llm'
  # @return [LLM::Provider]
  def persist!
- client = persistent_client
- lock do
- tap { @client = client }
- end
+ transport.persist!
+ self
  end
  alias_method :persistent, :persist!

+ ##
+ # Interrupt the active request, if any.
+ # @param [Fiber] owner
+ # @return [nil]
+ def interrupt!(owner)
+ transport.interrupt!(owner)
+ end
+ alias_method :cancel!, :interrupt!
+
  ##
  # @param [Object] stream
  # @return [Boolean]
@@ -328,7 +330,7 @@

  private

- attr_reader :client, :base_uri, :host, :port, :timeout, :ssl
+ attr_reader :base_uri, :host, :port, :timeout, :ssl, :transport

  ##
  # The headers to include with a request
@@ -360,94 +362,6 @@
  raise NotImplementedError
  end

- ##
- # Executes a HTTP request
- # @param [Net::HTTPRequest] request
- # The request to send
- # @param [Proc] b
- # A block to yield the response to (optional)
- # @return [Net::HTTPResponse]
- # The response from the server
- # @raise [LLM::Error::Unauthorized]
- # When authentication fails
- # @raise [LLM::Error::RateLimit]
- # When the rate limit is exceeded
- # @raise [LLM::Error]
- # When any other unsuccessful status code is returned
- # @raise [SystemCallError]
- # When there is a network error at the operating system level
- # @return [Net::HTTPResponse]
- def execute(request:, operation:, stream: nil, stream_parser: self.stream_parser, model: nil, inputs: nil, &b)
- tracer = self.tracer
- span = tracer.on_request_start(operation:, model:, inputs:)
- http = client || transient_client
- args = (Net::HTTP === http) ? [request] : [URI.join(base_uri, request.path), request]
- res = if stream
- http.request(*args) do |res|
- if Net::HTTPSuccess === res
- handler = event_handler.new stream_parser.new(stream)
- parser = LLM::EventStream::Parser.new
- parser.register(handler)
- res.read_body(parser)
- # If the handler body is empty, the response was
- # most likely not streamed or parsing failed.
- # Preserve the raw body in that case so standard
- # JSON/error handling can parse it later.
- body = handler.body.empty? ? parser.body : handler.body
- res.body = Hash === body || Array === body ? LLM::Object.from(body) : body
- else
- body = +""
- res.read_body { body << _1 }
- res.body = body
- end
- ensure
- handler&.free
- parser&.free
- end
- else
- b ? http.request(*args) { (Net::HTTPSuccess === _1) ? b.call(_1) : _1 } :
- http.request(*args)
- end
- [handle_response(res, tracer, span), span, tracer]
- end
-
- ##
- # Handles the response from a request
- # @param [Net::HTTPResponse] res
- # The response to handle
- # @param [Object, nil] span
- # The span
- # @return [Net::HTTPResponse]
- def handle_response(res, tracer, span)
- case res
- when Net::HTTPOK then res.body = parse_response(res)
- else error_handler.new(tracer, span, res).raise_error!
- end
- res
- end
-
- ##
- # Parse a HTTP response
- # @param [Net::HTTPResponse] res
- # @return [LLM::Object, String]
- def parse_response(res)
- case res["content-type"]
- when %r|\Aapplication/json\s*| then LLM::Object.from(LLM.json.load(res.body))
- else res.body
- end
- end
-
- ##
- # @param [Net::HTTPRequest] req
- # The request to set the body stream for
- # @param [IO] io
- # The IO object to set as the body stream
- # @return [void]
- def set_body_stream(req, io)
- req.body_stream = io
- req["transfer-encoding"] = "chunked" unless req["content-length"]
- end
-
  ##
  # Resolves tools to their function representations
  # @param [Array<LLM::Function, LLM::Tool>] tools
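The new `interrupt!`/`cancel!` API is keyed by an execution owner (a `Fiber`), and the transport's `request` method clears the owner's slot in an `ensure` block. A standalone sketch of that owner-keyed bookkeeping in plain Ruby, under the assumption that interruption records a per-owner flag until the request slot is cleared (the `InterruptRegistry` name is hypothetical, not the gem's implementation):

```ruby
require "monitor"

# Hypothetical sketch: record which owners (Fibers) have been interrupted
# so an in-flight request can poll interrupted?(owner) and bail out early.
class InterruptRegistry
  def initialize
    @interrupted = {}
    @monitor = Monitor.new
  end

  # Mark the owner's active request as interrupted.
  def interrupt!(owner)
    @monitor.synchronize { @interrupted[owner] = true }
    nil
  end

  # True while the owner has been interrupted and not yet cleared.
  def interrupted?(owner)
    @monitor.synchronize { @interrupted[owner] == true }
  end

  # Mirrors the transport's ensure block: forget the owner's state
  # once the request finishes, interrupted or not.
  def clear_request(owner)
    @monitor.synchronize { @interrupted.delete(owner) }
    nil
  end
end

registry = InterruptRegistry.new
owner = Fiber.current
registry.interrupt!(owner)
registry.interrupted?(owner) # true while the flag is set
registry.clear_request(owner)
registry.interrupted?(owner) # false after the slot is cleared
```

Keying on `Fiber.current` (the `request_owner` in the diff) lets concurrent requests on fiber-based schedulers be cancelled individually without affecting each other.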
data/lib/llm/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module LLM
- VERSION = "4.13.0"
+ VERSION = "4.14.0"
  end
data/lib/llm.rb CHANGED
@@ -40,6 +40,14 @@ module LLM
  # Model registry
  @registry = {}

+ ##
+ # Shared HTTP clients used by providers.
+ @clients = {}
+
+ ##
+ # @api private
+ def self.clients = @clients
+
  ##
  # @param [Symbol, LLM::Provider] llm
  # The name of a provider, or an instance of LLM::Provider
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: llm.rb
  version: !ruby/object:Gem::Version
- version: 4.13.0
+ version: 4.14.0
  platform: ruby
  authors:
  - Antar Azri
@@ -231,7 +231,6 @@ files:
  - lib/llm/agent.rb
  - lib/llm/bot.rb
  - lib/llm/buffer.rb
- - lib/llm/client.rb
  - lib/llm/context.rb
  - lib/llm/context/deserializer.rb
  - lib/llm/contract.rb
@@ -255,7 +254,9 @@ files:
  - lib/llm/mcp.rb
  - lib/llm/mcp/command.rb
  - lib/llm/mcp/error.rb
+ - lib/llm/mcp/mailbox.rb
  - lib/llm/mcp/pipe.rb
+ - lib/llm/mcp/router.rb
  - lib/llm/mcp/rpc.rb
  - lib/llm/mcp/transport/http.rb
  - lib/llm/mcp/transport/http/event_handler.rb
@@ -270,6 +271,10 @@ files:
  - lib/llm/object/kernel.rb
  - lib/llm/prompt.rb
  - lib/llm/provider.rb
+ - lib/llm/provider/transport/http.rb
+ - lib/llm/provider/transport/http/execution.rb
+ - lib/llm/provider/transport/http/interruptible.rb
+ - lib/llm/provider/transport/http/stream_decoder.rb
  - lib/llm/providers/anthropic.rb
  - lib/llm/providers/anthropic/error_handler.rb
  - lib/llm/providers/anthropic/files.rb
data/lib/llm/client.rb DELETED
@@ -1,36 +0,0 @@
- # frozen_string_literal: true
-
- module LLM
- ##
- # @api private
- module Client
- private
-
- ##
- # @api private
- def persistent_client
- LLM.lock(:clients) do
- if clients[client_id]
- clients[client_id]
- else
- require "net/http/persistent" unless defined?(Net::HTTP::Persistent)
- client = Net::HTTP::Persistent.new(name: self.class.name)
- client.read_timeout = timeout
- clients[client_id] = client
- end
- end
- end
-
- ##
- # @api private
- def transient_client
- client = Net::HTTP.new(host, port)
- client.read_timeout = timeout
- client.use_ssl = ssl
- client
- end
-
- def client_id = "#{host}:#{port}:#{timeout}:#{ssl}"
- def clients = self.class.clients
- end
- end