RubyGems - llm.rb - Versions diffs - 5.1.0 → 5.2.1 - Mend

llm.rb 5.1.0 → 5.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

checksums.yaml +4 -4
data/CHANGELOG.md +60 -0
data/README.md +33 -12
data/data/deepseek.json +68 -0
data/data/google.json +26 -26
data/data/openai.json +55 -0
data/lib/llm/context.rb +6 -2
data/lib/llm/mcp.rb +15 -0
data/lib/llm/message.rb +14 -5
data/lib/llm/providers/deepseek/request_adapter/completion.rb +30 -7
data/lib/llm/providers/deepseek.rb +3 -3
data/lib/llm/version.rb +1 -1
metadata +1 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 56ddedb75f6c791cc42bca736bc62360ba4850a3a204f9a82288e8c6ea977eeb
-  data.tar.gz: 3881b731dacd921e258eac954c4468d052e673e48ad53c63ae1a246973c84d33
+  metadata.gz: 8f9bdef0c733225e44dcf39d75e3397974122bfeb5e705a0797067242fd5c966
+  data.tar.gz: 567cc793e1e095e481abf5ef797a6fcb26a04faeed91855c234e531b78e3544a
 SHA512:
-  metadata.gz: a8838f57a1232afc42448d28a0f3f7b8907c2a527be284579b3af56e398edda50d7cc02a8dda2794c65096699831058494cccae3f8a116b538f76bc42127eba8
-  data.tar.gz: ef49e8046b4aab4e59b252ffdbf16135673d227b4124bf45bf5c31856c49822168182928b56fdc7428d2056dbae7c211fac0d6d8ef3eba48a7aca47420bb96e7
+  metadata.gz: 3d7e026b308228787d2f6ead8de197f847b644f7ba5bd0a1d679270a66e5c0e48c74a7d9494a86613bd061a42c2fd56ff270f73f8f3a627bc213dd7d57de788d
+  data.tar.gz: 8586a02d0345e7259f80b32e688a0ad531e28e3d15c8519781647ee1f77f23b6ecdaa9b28ed40efdea138c138511c7c4f88986ed7d646fb562e9faf7e2db687f

data/CHANGELOG.md CHANGED

@@ -2,8 +2,68 @@
 ## Unreleased
+Changes since `v5.2.1`.
+## v5.2.1
+Changes since `v5.2.0`.
+This release tightens the streamed queue fix from `v5.2.0` for concurrent
+workloads. Request-local streams now stay bound long enough for `wait` to
+drain queued work and then clear cleanly so later waits fall back to the
+context's configured stream.
+### Fix
+* **Reset request-local streams after `wait` drains queued work** <br>
+  Keep per-call `stream:` bindings alive through `LLM::Context#wait` so
+  queued streamed tool work still resolves correctly, then clear the
+  request-local stream after the wait completes to avoid leaking it into
+  later turns.
+## v5.2.0
 Changes since `v5.1.0`.
+This release adds current DeepSeek V4 support through refreshed provider
+metadata, including `deepseek-v4-flash` and `deepseek-v4-pro`, while fixing
+request-local queue handling for concurrent streamed workloads so `wait` and
+interruption use the active per-call stream correctly.
+### Change
+* **Add `LLM::MCP#run` for scoped MCP client lifecycle** <br>
+  Add `LLM::MCP#run` so MCP clients can be started for the duration of a
+  block and then stopped automatically, which simplifies the usual
+  `start`/`stop` pattern in examples and application code.
+* **Refresh provider model metadata** <br>
+  Add current DeepSeek and OpenAI model metadata to `data/` and update the
+  Google Gemma model entry to match the current provider naming.
+### Fix
+* **Reject unsupported DeepSeek multimodal prompt objects early** <br>
+  Raise `LLM::PromptError` for `image_url`, `local_file`, and
+  `remote_file` in DeepSeek chat requests instead of sending invalid
+  OpenAI-compatible payloads that the provider rejects at runtime.
+* **Preserve DeepSeek reasoning content across tool turns** <br>
+  Replay `reasoning_content` when serializing prior assistant messages for
+  DeepSeek chat completions, so thinking-mode tool calls can continue into
+  follow-up requests without triggering invalid request errors.
+* **Default DeepSeek to `deepseek-v4-flash`** <br>
+  Change `LLM::DeepSeek#default_model` to `deepseek-v4-flash` so new
+  contexts and default provider usage align with the current preferred chat
+  model.
+* **Use per-call streams when waiting on streamed tool work** <br>
+  Track request-local streams bound through `talk(..., stream:)` and
+  `respond(..., stream:)` so `LLM::Context#wait` and interruption-aware
+  queue handling use the active stream instead of falling back to pending
+  function spawning.
 ## v5.1.0
 Changes since `v5.0.0`.

data/README.md CHANGED

@@ -4,7 +4,7 @@
 <p align="center">
   <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
   <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-5.1.0-green.svg?" alt="Version"></a>
+  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-5.2.1-green.svg?" alt="Version"></a>
 </p>
 ## About
@@ -261,13 +261,17 @@ Remote MCP tools and prompts are not bolted on as a separate integration
 stack. They adapt into the same tool and prompt path used by local tools,
 skills, contexts, and agents.
+Use `mcp.run do ... end` for scoped work where the client should start and
+stop around one block. Use `mcp.start` and `mcp.stop` directly when you need
+finer sequential control across several steps before shutting the client down.
 ```ruby
-begin
-  mcp = LLM::MCP.http(url: "https://api.githubcopilot.com/mcp/").persistent
-  mcp.start
+mcp = LLM::MCP.http(
+  url: "https://api.githubcopilot.com/mcp/",
+  headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
+).persistent
+mcp.run do
   ctx = LLM::Context.new(llm, tools: mcp.tools)
-ensure
-  mcp.stop
 end
 ```
@@ -281,12 +285,17 @@ Go's context package. In fact, llm.rb is heavily inspired by Go but with a Ruby
 twist.
 ```ruby
+require "llm"
+require "io/console"
+llm = LLM.openai(key: ENV["KEY"])
 ctx = LLM::Context.new(llm, stream: $stdout)
 worker = Thread.new do
   ctx.talk("Write a very long essay about network protocols.")
 rescue LLM::Interrupt
   puts "Request was interrupted!"
 end
 STDIN.getch
 ctx.interrupt!
 worker.join
@@ -615,9 +624,10 @@ require "io/console"
 llm = LLM.openai(key: ENV["KEY"])
 ctx = LLM::Context.new(llm, stream: $stdout)
 worker = Thread.new do
   ctx.talk("Write a very long essay about network protocols.")
+rescue LLM::Interrupt
+  puts "Request was interrupted!"
 end
 STDIN.getch
@@ -695,7 +705,7 @@ puts ticket.talk("How do I rotate my API key?").content
 #### MCP
-This example uses [`LLM::MCP`](https://0x1eef.github.io/x/llm.rb/LLM/MCP.html) over HTTP so remote GitHub MCP tools run through the same `LLM::Context` tool path as local tools. See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
+This example uses [`LLM::MCP`](https://0x1eef.github.io/x/llm.rb/LLM/MCP.html) over HTTP so remote GitHub MCP tools run through the same `LLM::Context` tool path as local tools. It expects a GitHub token in `ENV["GITHUB_PAT"]`. See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
 ```ruby
 require "llm"
@@ -707,13 +717,24 @@ mcp = LLM::MCP.http(
   headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
 ).persistent
-begin
-  mcp.start
+mcp.start
+ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
+ctx.talk("Pull information about my GitHub account.")
+ctx.talk(ctx.call(:functions)) while ctx.functions.any?
+mcp.stop
+```
+For scoped work, `mcp.run do ... end` is shorter and handles cleanup for you:
+```ruby
+mcp = LLM::MCP.http(
+  url: "https://api.githubcopilot.com/mcp/",
+  headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
+).persistent
+mcp.run do
   ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
   ctx.talk("Pull information about my GitHub account.")
   ctx.talk(ctx.call(:functions)) while ctx.functions.any?
-ensure
-  mcp.stop
 end
 ```

data/data/deepseek.json CHANGED

@@ -70,6 +70,74 @@
         "context": 128000,
         "output": 64000
       }
+    },
+    "deepseek-v4-flash": {
+      "id": "deepseek-v4-flash",
+      "name": "DeepSeek V4 Flash",
+      "family": "deepseek-flash",
+      "attachment": false,
+      "reasoning": true,
+      "tool_call": true,
+      "interleaved": {
+        "field": "reasoning_content"
+      },
+      "structured_output": true,
+      "temperature": true,
+      "knowledge": "2025-05",
+      "release_date": "2026-04-24",
+      "last_updated": "2026-04-24",
+      "modalities": {
+        "input": [
+          "text"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": true,
+      "cost": {
+        "input": 0.14,
+        "output": 0.28,
+        "cache_read": 0.028
+      },
+      "limit": {
+        "context": 1000000,
+        "output": 384000
+      }
+    },
+    "deepseek-v4-pro": {
+      "id": "deepseek-v4-pro",
+      "name": "DeepSeek V4 Pro",
+      "family": "deepseek-thinking",
+      "attachment": false,
+      "reasoning": true,
+      "tool_call": true,
+      "interleaved": {
+        "field": "reasoning_content"
+      },
+      "structured_output": true,
+      "temperature": true,
+      "knowledge": "2025-05",
+      "release_date": "2026-04-24",
+      "last_updated": "2026-04-24",
+      "modalities": {
+        "input": [
+          "text"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": true,
+      "cost": {
+        "input": 1.74,
+        "output": 3.48,
+        "cache_read": 0.145
+      },
+      "limit": {
+        "context": 1000000,
+        "output": 384000
+      }
     }
   }
 }
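
Reviewer note: the new entries list `text` as the only input modality, which lines up with the CHANGELOG entry about rejecting multimodal prompt objects for DeepSeek. The sketch below shows one way such metadata could be queried. It is only an illustration: the inline JSON is a trimmed copy of the `deepseek-v4-flash` values above, and the lookup is written with plain `JSON.parse` rather than any llm.rb API.

```ruby
require "json"

# Trimmed copy of the "deepseek-v4-flash" entry from the hunk above.
# Only the fields used in this sketch are included.
models = JSON.parse(<<~DATA)
  {
    "deepseek-v4-flash": {
      "attachment": false,
      "modalities": {"input": ["text"], "output": ["text"]},
      "limit": {"context": 1000000, "output": 384000}
    }
  }
DATA

model = models.fetch("deepseek-v4-flash")

# True only when the metadata lists "image" as an input modality.
accepts_images = model.dig("modalities", "input").include?("image")
context_limit = model.dig("limit", "context")

puts "accepts images: #{accepts_images}"
puts "context limit: #{context_limit}"
```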

data/data/google.json CHANGED

@@ -1058,6 +1058,32 @@
         "output": 8192
       }
     },
+    "gemma-4-26b-a4b-it": {
+      "id": "gemma-4-26b-a4b-it",
+      "name": "Gemma 4 26B",
+      "family": "gemma",
+      "attachment": false,
+      "reasoning": true,
+      "tool_call": true,
+      "structured_output": true,
+      "temperature": true,
+      "release_date": "2026-04-02",
+      "last_updated": "2026-04-02",
+      "modalities": {
+        "input": [
+          "text",
+          "image"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": true,
+      "limit": {
+        "context": 256000,
+        "output": 8192
+      }
+    },
     "gemini-2.5-flash-lite": {
       "id": "gemini-2.5-flash-lite",
       "name": "Gemini 2.5 Flash Lite",
@@ -1093,32 +1119,6 @@
         "output": 65536
       }
     },
-    "gemma-4-26b-it": {
-      "id": "gemma-4-26b-it",
-      "name": "Gemma 4 26B",
-      "family": "gemma",
-      "attachment": false,
-      "reasoning": true,
-      "tool_call": true,
-      "structured_output": true,
-      "temperature": true,
-      "release_date": "2026-04-02",
-      "last_updated": "2026-04-02",
-      "modalities": {
-        "input": [
-          "text",
-          "image"
-        ],
-        "output": [
-          "text"
-        ]
-      },
-      "open_weights": true,
-      "limit": {
-        "context": 256000,
-        "output": 8192
-      }
-    },
     "gemini-2.5-flash-image-preview": {
       "id": "gemini-2.5-flash-image-preview",
       "name": "Gemini 2.5 Flash Image (Preview)",

data/data/openai.json CHANGED

@@ -195,6 +195,61 @@
         "output": 16384
       }
     },
+    "gpt-5.5": {
+      "id": "gpt-5.5",
+      "name": "GPT-5.5",
+      "family": "gpt",
+      "attachment": true,
+      "reasoning": true,
+      "tool_call": true,
+      "structured_output": true,
+      "temperature": false,
+      "knowledge": "2025-12-01",
+      "release_date": "2026-04-23",
+      "last_updated": "2026-04-23",
+      "modalities": {
+        "input": [
+          "text",
+          "image",
+          "pdf"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": false,
+      "cost": {
+        "input": 5,
+        "output": 30,
+        "cache_read": 0.5,
+        "context_over_200k": {
+          "input": 10,
+          "output": 45,
+          "cache_read": 1
+        }
+      },
+      "limit": {
+        "context": 1050000,
+        "input": 920000,
+        "output": 130000
+      },
+      "experimental": {
+        "modes": {
+          "fast": {
+            "cost": {
+              "input": 12.5,
+              "output": 75,
+              "cache_read": 1.25
+            },
+            "provider": {
+              "body": {
+                "service_tier": "priority"
+              }
+            }
+          }
+        }
+      }
+    },
     "gpt-5-mini": {
       "id": "gpt-5-mini",
       "name": "GPT-5 Mini",

data/lib/llm/context.rb CHANGED

@@ -295,7 +295,6 @@ module LLM
     #  ractor work, in that order.
     # @return [Array<LLM::Function::Return>]
     def wait(strategy)
-      stream = @params[:stream]
       if LLM::Stream === stream && !stream.queue.empty?
         @queue = stream.queue
         @queue.wait(strategy)
@@ -306,6 +305,7 @@ module LLM
       end
     ensure
       @queue = nil
+      @stream = nil
     end
     ##
@@ -461,6 +461,7 @@ module LLM
     def bind!(stream, model, tools)
       return unless LLM::Stream === stream
+      @stream = stream
       stream.extra[:ctx] = self
       stream.extra[:tracer] = tracer
       stream.extra[:model] = model
@@ -469,10 +470,13 @@ module LLM
     def queue
       return @queue if @queue
-      stream = @params[:stream]
       stream.queue if LLM::Stream === stream
     end
+    def stream
+      @stream || @params[:stream]
+    end
     def load_skills(skills)
       [*skills].map { LLM::Skill.load(_1).to_tool(self) }
     end
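
Reviewer note: the hunks above add a private `stream` reader that prefers a request-local `@stream` over `@params[:stream]`, and `wait` now clears `@stream` in its `ensure` clause. A minimal sketch of that fallback follows. `SketchContext` is a hypothetical stand-in and not `LLM::Context`; it uses symbols in place of real stream objects and leaves out the queue handling entirely.

```ruby
# Hypothetical, simplified stand-in for the behaviour shown in the diff.
class SketchContext
  def initialize(params = {})
    @params = params
    @stream = nil
  end

  # Mirrors bind!: remember the stream passed for a single call.
  def bind!(stream)
    @stream = stream
  end

  # Mirrors the new reader: request-local stream first, configured one second.
  def stream
    @stream || @params[:stream]
  end

  # Mirrors wait: resolve the active stream, then drop the request-local binding.
  def wait
    stream
  ensure
    @stream = nil
  end
end

ctx = SketchContext.new(stream: :configured)
ctx.bind!(:per_call)
first = ctx.wait   # resolved while the per-call stream is still bound
second = ctx.wait  # the binding was cleared, so the configured stream is used
p [first, second]  # → [:per_call, :configured]
```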

data/lib/llm/mcp.rb CHANGED

@@ -103,6 +103,21 @@ class LLM::MCP
     nil
   end
+  ##
+  # Starts the MCP client for the duration of a block and then stops it.
+  # @yield Runs with the MCP client started
+  # @raise [LocalJumpError]
+  #  When called without a block
+  # @raise [StandardError]
+  #  Propagates errors raised by {#start}, the block itself, or {#stop}
+  # @return [void]
+  def run
+    start
+    yield
+  ensure
+    stop
+  end
   ##
   # Configures an HTTP MCP transport to use a persistent connection pool
   # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
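
Reviewer note: `run` is a method-level `ensure` around `start` and `yield`, so `stop` runs whether the block returns normally or raises. The sketch below reproduces that shape on a hypothetical `FakeClient` that only records lifecycle events. It is not `LLM::MCP` and opens no connection.

```ruby
# Hypothetical stand-in for an MCP client; it only records lifecycle events.
class FakeClient
  attr_reader :events

  def initialize
    @events = []
  end

  def start
    @events << :start
  end

  def stop
    @events << :stop
  end

  # Same shape as the method added in the diff: start, yield, always stop.
  def run
    start
    yield
  ensure
    stop
  end
end

client = FakeClient.new
begin
  client.run { raise "boom" }
rescue RuntimeError
  client.events << :rescued
end
p client.events # → [:start, :stop, :rescued]
```

Because the `ensure` clause covers the whole method body, `stop` is also called when `start` itself raises.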

data/lib/llm/message.rb CHANGED

@@ -33,11 +33,15 @@ module LLM
     # Returns a Hash representation of the message.
     # @return [Hash]
     def to_h
-      {role:, content:, reasoning_content:,
-       compaction: extra.compaction,
-       tools: extra.tool_calls,
-       usage:,
-       original_tool_calls: extra.original_tool_calls}.compact
+      {
+        role:,
+        content:,
+        reasoning_content:,
+        compaction: extra.compaction,
+        tools: extra.tool_calls&.map { LLM::Object === _1 ? _1.to_h : _1 },
+        usage:,
+        original_tool_calls: extra.original_tool_calls
+      }.compact.then { preserve_nil_content(_1) }
     end
     ##
@@ -208,6 +212,11 @@ module LLM
     private
+    def preserve_nil_content(hash)
+      hash[:content] = content if content.nil?
+      hash
+    end
     def tool_calls
       @tool_calls ||= LLM::Object.from(extra.tool_calls || [])
     end
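
Reviewer note: `Hash#compact` drops every key whose value is `nil`, including `content`. The new `preserve_nil_content` step puts `content: nil` back after compaction. The sketch below shows that interaction on a plain hash. The helper takes `content` as an explicit argument for illustration, whereas the method in the diff reads it from the message.

```ruby
# Illustrative variant of preserve_nil_content; content is passed explicitly here.
def preserve_nil_content(hash, content)
  hash[:content] = content if content.nil?
  hash
end

content = nil

# compact removes :content and :usage because both are nil.
compacted = {role: "assistant", content: content, usage: nil}.compact

# The follow-up step restores only the :content key.
restored = preserve_nil_content(compacted.dup, content)

puts "compacted keys: #{compacted.keys.inspect}"
puts "restored keys:  #{restored.keys.inspect}"
```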

data/lib/llm/providers/deepseek/request_adapter/completion.rb CHANGED

@@ -19,7 +19,7 @@ module LLM::DeepSeek::RequestAdapter
         if Hash === message
           {role: message[:role], content: adapt_content(message[:content])}
         elsif message.tool_call?
-          {role: message.role, content: nil, tool_calls: message.extra[:original_tool_calls]}
+          wrap(content: nil, tool_calls: message.extra[:original_tool_calls])
         else
           adapt_message
         end
@@ -30,25 +30,34 @@ module LLM::DeepSeek::RequestAdapter
     def adapt_content(content)
       case content
+      when LLM::Object
+        adapt_object(content)
       when String
-        content.to_s
+        [{type: :text, text: content.to_s}]
       when LLM::Message
         adapt_content(content.content)
       when LLM::Function::Return
         throw(:abort, {role: "tool", tool_call_id: content.id, content: LLM.json.dump(content.value)})
-      when LLM::Object
-        prompt_error!(content)
       else
         prompt_error!(content)
       end
     end
+    def adapt_object(object)
+      case object.kind
+      when :image_url, :local_file, :remote_file
+        prompt_error!(object)
+      else
+        prompt_error!(object)
+      end
+    end
     def adapt_message
       case content
       when Array
         adapt_array
       else
-        {role: message.role, content: adapt_content(content)}
+        wrap(content: adapt_content(content))
       end
     end
@@ -58,13 +67,13 @@ module LLM::DeepSeek::RequestAdapter
       elsif returns.any?
         returns.map { {role: "tool", tool_call_id: _1.id, content: LLM.json.dump(_1.value)} }
       else
-        {role: message.role, content: content.flat_map { adapt_content(_1) }}
+        wrap(content: content.flat_map { adapt_content(_1) })
       end
     end
     def prompt_error!(object)
       if LLM::Object === object
-        raise LLM::PromptError, "The given LLM::Object with kind '#{content.kind}' is not " \
+        raise LLM::PromptError, "The given LLM::Object with kind '#{object.kind}' is not " \
                                 "supported by the DeepSeek API"
       else
         raise LLM::PromptError, "The given object (an instance of #{object.class}) " \
@@ -72,8 +81,22 @@ module LLM::DeepSeek::RequestAdapter
       end
     end
+    def wrap(content:, tool_calls: nil)
+      {
+        role: message.role,
+        content:,
+        tool_calls: tool_calls&.map { LLM::Object === _1 ? _1.to_h : _1 },
+        reasoning_content: message.reasoning_content
+      }.compact.then { preserve_nil_content(_1) }
+    end
     def message = @message
     def content = message.content
     def returns = content.grep(LLM::Function::Return)
+    def preserve_nil_content(hash)
+      hash[:content] = content if content.nil?
+      hash
+    end
   end
 end
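
Reviewer note: after this change, string content is adapted into a one-element array of text parts, and every `LLM::Object` is routed to `prompt_error!`. The sketch below imitates that dispatch with hypothetical stand-ins: `PromptError` and `Part` are defined locally and are not the library's `LLM::PromptError` or `LLM::Object`.

```ruby
# Local stand-ins, defined only for this sketch.
class PromptError < StandardError; end
Part = Struct.new(:kind)

# Simplified dispatch in the spirit of adapt_content from the diff.
def adapt_content(content)
  case content
  when Part
    raise PromptError, "kind '#{content.kind}' is not supported"
  when String
    [{type: :text, text: content}]
  else
    raise PromptError, "an instance of #{content.class} is not supported"
  end
end

text_parts = adapt_content("Hello")
puts "text parts: #{text_parts.length}"

begin
  adapt_content(Part.new(:image_url))
rescue PromptError => e
  puts "rejected: #{e.message}"
end
```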

data/lib/llm/providers/deepseek.rb CHANGED

@@ -15,7 +15,7 @@ module LLM
   #
   #   llm = LLM.deepseek(key: ENV["KEY"])
   #   ctx = LLM::Context.new(llm)
-  #   ctx.talk ["Tell me about this photo", ctx.local_file("/images/photo.png")]
+  #   ctx.talk "Hello"
   #   ctx.messages.select(&:assistant?).each { print "[#{_1.role}]", _1.content, "\n" }
   class DeepSeek < OpenAI
     require_relative "deepseek/request_adapter"
@@ -73,10 +73,10 @@ module LLM
     ##
     # Returns the default model for chat completions
-    # @see https://api-docs.deepseek.com/quick_start/pricing deepseek-chat
+    # @see https://api-docs.deepseek.com/quick_start/pricing deepseek-v4-flash
     # @return [String]
     def default_model
-      "deepseek-chat"
+      "deepseek-v4-flash"
     end
   end
 end

data/lib/llm/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module LLM
-  VERSION = "5.1.0"
+  VERSION = "5.2.1"
 end

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: llm.rb
 version: !ruby/object:Gem::Version
-  version: 5.1.0
+  version: 5.2.1
 platform: ruby
 authors:
 - Antar Azri