llm.rb 5.4.0 → 6.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 0bd3ea0956fe1a9fa53bec3211dc4afe6f03c15fa67304ce5ba2c922d20abff1
4
- data.tar.gz: 1aa03e4fc3eafbbbf9367deb8714f844ccd41299c666595d1d081a2db4d9d42e
3
+ metadata.gz: 57b39b3b4b79d1d9f8cfd10426ad233d698dd6e3ed84bfef887c8c63f543f40f
4
+ data.tar.gz: 443ed7e2a04259c69d41b1da7a42e7637efaa4ab1075548706ce349bced7ed51
5
5
  SHA512:
6
- metadata.gz: 0e14b7cb29b5130b703c26369b6ecec106117e6a045bbcbeb79019c96814d9969e387c25992d2f3c96fd2ea43143ca4c48f00e91a00cc6cb3e20556145254d80
7
- data.tar.gz: 3d34913dba2eab22f6f794196d59791cbadfeb71b9b4aaf588d46607112c4a0a0f741a9e8c9d308afb86d5d678a753b8a551ca66b08351581677845012c8e583
6
+ metadata.gz: f8e53dc41eacf16cea35f64a6048aa77852fcf7a135676b2b9c02e37beff174b5a500948477c4f931ff0a71d20c4503ba3e9eef19358d3aaa204040e77fe14c5
7
+ data.tar.gz: 358ce7f33d2dca51365f6581867006970fd66079dcaa189268e2deff2f297c89b8332fd11b714bedfd89124413b7a9e12fc09d928c2c28f2e9cb2368f2bc3e24
data/CHANGELOG.md CHANGED
@@ -1,9 +1,95 @@
1
1
  # Changelog
2
2
 
3
- ## Unreleased
3
+ ## v6.1.0
4
+
5
+ Changes since `v6.0.0`.
6
+
7
+ This release tightens interrupt and compaction behavior for long-running
8
+ contexts. It adds `LLM::Buffer#rindex`, supports percentage-based token
9
+ thresholds in `LLM::Compactor`, tracks persisted compaction state through
10
+ context serialization, reliably interrupts Async-backed requests, preserves
11
+ valid tool-call history on cancellation, keeps concurrent skill tool loops
12
+ running on streamed agents, and returns zero-valued usage objects when no
13
+ provider usage has been recorded yet.
14
+
15
+ ### Change
16
+
17
+ * **Add `LLM::Buffer#rindex`** <br>
18
+ Add `LLM::Buffer#rindex` as a direct forward to the underlying message
19
+ array so callers can find the last matching message index through the
20
+ buffer API.
21
+
22
+ * **Support percentage compaction token thresholds** <br>
23
+ Let `LLM::Compactor` accept `token_threshold:` values like `"90%"` so
24
+ compaction can trigger at a percentage of the active model context
25
+ window.
26
+
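A minimal sketch of the two additions above, reusing the provider handle and compactor option names from the README examples later in this diff (`OPENAI_SECRET`, `retention_window:`); the `system?` predicate mirrors the one used by `LLM::Compactor`:

```ruby
require "llm"

llm = LLM.openai(key: ENV["OPENAI_SECRET"])

# Percentage-based compaction: "90%" resolves against the active model's
# context window instead of a fixed token count.
ctx = LLM::Context.new(
  llm,
  compactor: {
    token_threshold: "90%",
    retention_window: 8
  }
)

# LLM::Buffer#rindex forwards to the underlying message array and returns
# the index of the last matching message, or nil when none match.
last_system = ctx.messages.rindex { |message| message.system? }
```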
27
+ ### Fix
28
+
29
+ * **Interrupt Async-backed requests reliably** <br>
30
+ Track request ownership through the provider transport so contexts use
31
+ the active Async task when available, letting `ctx.interrupt!`
32
+ reliably cancel streamed requests under Async runtimes and surface
33
+ the cancellation as `LLM::Interrupt`.
34
+
35
+ * **Preserve valid tool-call history on cancellation** <br>
36
+ Append cancelled tool-return messages for unresolved tool calls during
37
+ `ctx.interrupt!` so follow-up provider requests do not fail with
38
+ invalid tool-call history after pending tool work is cancelled.
39
+
40
+ * **Preserve concurrent skill tool loops on streamed agents** <br>
41
+ Propagate the active agent concurrency through the effective request
42
+ stream so nested skill agents keep using queued `wait(...)` tool
43
+ execution instead of falling back to direct `:call` execution.
44
+
45
+ * **Track persisted compaction state on contexts** <br>
46
+ Mark contexts as compacted after `LLM::Compactor#compact!`, persist and
47
+ restore that state through context serialization, and clear it after the
48
+ next successful model response.
49
+
50
+ * **Return zero-valued usage objects from contexts** <br>
51
+ Make `LLM::Context#usage` consistently return an `LLM::Object`, using a
52
+ zero-valued usage object when no provider usage has been recorded yet.
53
+
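A minimal sketch of the interrupt and usage fixes above, assuming the `async` gem provides the runtime and that the context exposes a `talk`-style request method as in the ORM examples further down; the prompt and timing are illustrative:

```ruby
require "llm"
require "async"

llm = LLM.openai(key: ENV["OPENAI_SECRET"])
ctx = LLM::Context.new(llm, stream: $stdout)

# Before any provider response is recorded, usage is a zero-valued
# LLM::Object rather than nil.
ctx.usage.total_tokens # => 0

Async do |task|
  request = task.async do
    ctx.talk("Summarize a very long document")
  rescue LLM::Interrupt
    # The cancelled streamed request surfaces here; unresolved tool calls
    # are closed out so follow-up requests see valid tool-call history.
  end

  task.sleep 1
  ctx.interrupt!   # cancels the in-flight Async-backed request
  request.wait
end
```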
54
+ ## v6.0.0
4
55
 
5
56
  Changes since `v5.4.0`.
6
57
 
58
+ This release simplifies the ORM persistence contract around serialized
59
+ `data` state, removing the assumption of reserved `provider`, `model`, and
60
+ usage columns. Provider selection must now come from `provider:` hooks,
61
+ model defaults come from `context:` or agent DSL, and usage is read from the
62
+ serialized runtime state. Alongside this breaking change, Sequel JSON and
63
+ JSONB persistence is fixed, ractor-backed tools now fire tracer callbacks,
64
+ and `LLM::RactorError` is raised for unsupported ractor tool work.
65
+
66
+ ### Change
67
+
68
+ * **Simplify ORM persistence to serialized `data` state** <br>
69
+ Change the built-in ActiveRecord and Sequel wrappers to treat serialized
70
+ `data` as the persistence contract, instead of assuming reserved
71
+ `provider`, `model`, and usage columns. Provider selection must now come
72
+ from `provider:` hooks that resolve a real `LLM::Provider` instance, model
73
+ defaults come from `context:` or agent DSL, and `usage` is read from the
74
+ serialized runtime state.
75
+
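A minimal sketch of what the simplified contract looks like at the schema level, assuming an ActiveRecord migration; the table name and migration version are illustrative, and only the `data` column is part of the wrapper contract:

```ruby
# Only one serialized `data` column backs the wrapper state; reserved
# provider, model, and usage columns are no longer assumed.
class CreateContexts < ActiveRecord::Migration[7.1]
  def change
    create_table :contexts do |t|
      t.text :data       # may also be a JSON or JSONB column
      t.timestamps
    end
  end
end
```

Provider selection and model defaults then come from the `provider:` and `context:` hooks shown in the README examples further down in this diff.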
76
+ ### Fix
77
+
78
+ * **Fix Sequel JSON and JSONB persistence** <br>
79
+ Load Sequel PostgreSQL JSON support when `plugin :llm` is configured with
80
+ `format: :json` or `:jsonb`, and wrap structured payloads correctly so
81
+ persisted context state can be stored in PostgreSQL JSON columns.
82
+
83
+ * **Trace ractor-backed tool callbacks** <br>
84
+ Make tool tracers fire `on_tool_start` and `on_tool_finish` for
85
+ class-based `:ractor` execution too, so ractor-backed tool calls show up
86
+ in tracer callbacks like the other concurrent tool paths.
87
+
88
+ * **Raise `LLM::RactorError` for unsupported ractor tool work** <br>
89
+ Add `LLM::RactorError` and fail fast when `:ractor` execution is requested
90
+ for unsupported tool types such as skill-backed tools, instead of letting
91
+ deeper Ruby isolation errors leak out later in execution.
92
+
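A minimal sketch of the JSONB setup the Sequel fix targets, assuming a PostgreSQL database; the connection details are illustrative and the hook names follow the README examples in this diff:

```ruby
require "llm"
require "sequel"
require "sequel/plugins/llm"

DB = Sequel.connect(ENV["DATABASE_URL"]) # assumed to point at PostgreSQL

class Context < Sequel::Model
  # format: :jsonb loads Sequel's PostgreSQL JSON support and wraps the
  # structured payload so persisted state fits a jsonb `data` column.
  plugin :llm, format: :jsonb, provider: :set_provider, context: :set_context

  private

  def set_provider
    LLM.openai(key: ENV["OPENAI_SECRET"])
  end

  def set_context
    {model: "gpt-5.4-mini", mode: :responses, store: false}
  end
end
```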
7
93
  ## v5.4.0
8
94
 
9
95
  Changes since `v5.3.0`.
data/README.md CHANGED
@@ -4,7 +4,7 @@
4
4
  <p align="center">
5
5
  <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
6
6
  <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
7
- <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-5.4.0-green.svg?" alt="Version"></a>
7
+ <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-6.1.0-green.svg?" alt="Version"></a>
8
8
  </p>
9
9
 
10
10
  ## About
@@ -25,7 +25,6 @@ schemas, files, and persisted state, so real systems can be built out of one coh
25
25
  execution model instead of a pile of adapters.
26
26
 
27
27
  Want to see some code? Jump to [the examples](#examples) section. <br>
28
- Want to see an agentic framework built on top of llm.rb? Check out [general-intelligence-systems/brute](https://github.com/general-intelligence-systems/brute). <br>
29
28
  Want to see a self-hosted LLM environment built on llm.rb? Check out [Relay](https://github.com/llmrb/relay).
30
29
 
31
30
  ## Architecture
@@ -102,20 +101,26 @@ separate agent table or a second persistence layer.
102
101
 
103
102
  `acts_as_agent` extends a model with agent capabilities: the same runtime
104
103
  surface as [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html),
105
- because it actually wraps an `LLM::Agent`, plus persistence through a text,
106
- JSON, or JSONB-backed column on the same table.
104
+ because it actually wraps an `LLM::Agent`, plus persistence through one text,
105
+ JSON, or JSONB-backed `data` column on the same table. If your app also has
106
+ provider or model columns, provide them to llm.rb through `set_provider` and
107
+ `set_context`.
107
108
 
108
109
 
109
110
  ```ruby
110
111
  class Ticket < ApplicationRecord
111
- acts_as_agent provider: :set_provider
112
+ acts_as_agent provider: :set_provider, context: :set_context
112
113
  model "gpt-5.4-mini"
113
114
  instructions "You are a support assistant."
114
115
 
115
116
  private
116
117
 
117
118
  def set_provider
118
- { key: ENV["#{provider.upcase}_SECRET"], persistent: true }
119
+ LLM.openai(key: ENV["OPENAI_SECRET"])
120
+ end
121
+
122
+ def set_context
123
+ { mode: :responses, store: false }
119
124
  end
120
125
  end
121
126
  ```
@@ -158,12 +163,15 @@ and when a stream is present it emits `on_compaction` and
158
163
  `on_compaction_finish` through [`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html).
159
164
  The compactor can also use a different model from the main context, which is
160
165
  useful when you want summarization to run on a cheaper or faster model.
166
+ `token_threshold:` accepts either a fixed token count or a percentage string
167
+ like `"90%"`, which resolves against the active model context window and
168
+ triggers compaction once total token usage goes over that percentage.
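For instance, with a 200,000-token context window, `"90%"` resolves to a threshold of 180,000 tokens, so compaction runs once total token usage exceeds that figure.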
161
169
 
162
170
  ```ruby
163
171
  ctx = LLM::Context.new(
164
172
  llm,
165
173
  compactor: {
166
- message_threshold: 200,
174
+ token_threshold: "90%",
167
175
  retention_window: 8,
168
176
  model: "gpt-5.4-mini"
169
177
  }
@@ -303,7 +311,7 @@ finer sequential control across several steps before shutting the client down.
303
311
  ```ruby
304
312
  mcp = LLM::MCP.http(
305
313
  url: "https://api.githubcopilot.com/mcp/",
306
- headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
314
+ headers: {"Authorization" => "Bearer #{ENV["GITHUB_PAT"]}"}
307
315
  ).persistent
308
316
  mcp.run do
309
317
  ctx = LLM::Context.new(llm, tools: mcp.tools)
@@ -376,7 +384,8 @@ worker.join
376
384
  Use threads, fibers, async tasks, or experimental ractors without
377
385
  rewriting your tool layer. The current `:ractor` mode is for class-based
378
386
  tools and does not support MCP tools, but mixed workloads can branch on
379
- `tool.mcp?` and choose a supported strategy per tool. `:ractor` is
387
+ `tool.mcp?` and choose a supported strategy per tool. Class-based
388
+ `:ractor` tools still emit normal tool tracer callbacks. `:ractor` is
380
389
  especially useful for CPU-bound tools, while `:task`, `:fiber`, or
381
390
  `:thread` may be a better fit for I/O-bound work.
382
391
  - **Advanced workloads are built in, not bolted on** <br>
@@ -618,7 +627,10 @@ long-lived contexts can summarize older history and expose the lifecycle
618
627
  through stream hooks. This approach is inspired by General Intelligence
619
628
  Systems' [Brute](https://github.com/general-intelligence-systems/brute). The
620
629
  compactor can also use its own `model:` if you want summarization to run on a
621
- different model from the main context. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
630
+ different model from the main context. `token_threshold:` accepts either a
631
+ fixed token count or a percentage string like `"90%"`, which resolves
632
+ against the active model context window and triggers compaction once total
633
+ token usage goes over that percentage. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
622
634
 
623
635
  ```ruby
624
636
  require "llm"
@@ -638,7 +650,7 @@ ctx = LLM::Context.new(
638
650
  llm,
639
651
  stream: Stream.new,
640
652
  compactor: {
641
- message_threshold: 200,
653
+ token_threshold: "90%",
642
654
  retention_window: 8,
643
655
  model: "gpt-5.4-mini"
644
656
  }
@@ -696,7 +708,7 @@ worker.join
696
708
 
697
709
  #### Sequel (ORM)
698
710
 
699
- The `plugin :llm` integration wraps [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html) on a `Sequel::Model` and keeps tool execution explicit. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
711
+ The `plugin :llm` integration wraps [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html) on a `Sequel::Model` and keeps tool execution explicit. Like the ActiveRecord wrappers, its built-in persistence contract is the serialized `data` column, while `provider:` resolves a real `LLM::Provider` instance and `context:` injects defaults such as `model:`. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
700
712
 
701
713
  ```ruby
702
714
  require "llm"
@@ -705,10 +717,20 @@ require "sequel"
705
717
  require "sequel/plugins/llm"
706
718
 
707
719
  class Context < Sequel::Model
708
- plugin :llm, provider: -> { { key: ENV["#{provider.upcase}_SECRET"], persistent: true } }
720
+ plugin :llm, provider: :set_provider, context: :set_context
721
+
722
+ private
723
+
724
+ def set_provider
725
+ LLM.openai(key: ENV["OPENAI_SECRET"])
726
+ end
727
+
728
+ def set_context
729
+ {model: "gpt-5.4-mini", mode: :responses, store: false}
730
+ end
709
731
  end
710
732
 
711
- ctx = Context.create(provider: "openai", model: "gpt-5.4-mini")
733
+ ctx = Context.create
712
734
  ctx.talk("Remember that my favorite language is Ruby")
713
735
  puts ctx.talk("What is my favorite language?").content
714
736
  ```
@@ -716,36 +738,76 @@ puts ctx.talk("What is my favorite language?").content
716
738
  #### ActiveRecord (ORM): acts_as_llm
717
739
 
718
740
  The `acts_as_llm` method wraps [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html) and
719
- provides full control over tool execution. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
741
+ provides full control over tool execution. Its built-in persistence contract is
742
+ one serialized `data` column. If your app has provider, model, or usage
743
+ columns, provide them to llm.rb through `provider:` and `context:` instead of
744
+ relying on reserved wrapper columns.
745
+
746
+ See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
720
747
 
721
748
  ```ruby
722
749
  require "llm"
723
- require "net/http/persistent"
724
750
  require "active_record"
725
751
  require "llm/active_record"
726
752
 
727
753
  class Context < ApplicationRecord
728
- acts_as_llm provider: -> { { key: ENV["#{provider.upcase}_SECRET"], persistent: true } }
754
+ acts_as_llm provider: :set_provider, context: :set_context
755
+
756
+ private
757
+
758
+ def set_provider
759
+ LLM.openai(key: ENV["OPENAI_SECRET"])
760
+ end
761
+
762
+ def set_context
763
+ {model: "gpt-5.4-mini", mode: :responses, store: false}
764
+ end
729
765
  end
730
766
 
731
- ctx = Context.create!(provider: "openai", model: "gpt-5.4-mini")
767
+ ctx = Context.create!
732
768
  ctx.talk("Remember that my favorite language is Ruby")
733
769
  puts ctx.talk("What is my favorite language?").content
734
770
  ```
735
771
 
772
+ ```ruby
773
+ require "llm"
774
+ require "active_record"
775
+ require "llm/active_record"
776
+
777
+ class Context < ApplicationRecord
778
+ acts_as_llm provider: :set_provider, context: :set_context
779
+
780
+ # Optional application columns can still provide the provider and context.
781
+ # For example, `provider_name` and `model_name` can be normal columns.
782
+
783
+ private
784
+
785
+ def set_provider
786
+ LLM.public_send(provider_name, key: provider_key)
787
+ end
788
+
789
+ def set_context
790
+ {model: model_name, mode: :responses, store: false}
791
+ end
792
+ end
793
+ ```
794
+
736
795
  #### ActiveRecord (ORM): acts_as_agent
737
796
 
738
797
  The `acts_as_agent` method wraps [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) and
739
- manages tool execution for you. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
798
+ manages tool execution for you. Like `acts_as_llm`, its built-in persistence
799
+ contract is one serialized `data` column. If your app has provider or model
800
+ columns, provide them to llm.rb through your hooks and agent DSL.
801
+
802
+ See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
740
803
 
741
804
  ```ruby
742
805
  require "llm"
743
- require "net/http/persistent"
744
806
  require "active_record"
745
807
  require "llm/active_record"
746
808
 
747
809
  class Ticket < ApplicationRecord
748
- acts_as_agent provider: :set_provider
810
+ acts_as_agent provider: :set_provider, context: :set_context
749
811
  model "gpt-5.4-mini"
750
812
  instructions "You are a concise support assistant."
751
813
  tools SearchDocs, Escalate
@@ -754,14 +816,40 @@ class Ticket < ApplicationRecord
754
816
  private
755
817
 
756
818
  def set_provider
757
- { key: ENV["#{provider.upcase}_SECRET"], persistent: true }
819
+ LLM.openai(key: ENV["OPENAI_SECRET"])
820
+ end
821
+
822
+ def set_context
823
+ {mode: :responses, store: false}
758
824
  end
759
825
  end
760
826
 
761
- ticket = Ticket.create!(provider: "openai", model: "gpt-5.4-mini")
827
+ ticket = Ticket.create!
762
828
  puts ticket.talk("How do I rotate my API key?").content
763
829
  ```
764
830
 
831
+ ```ruby
832
+ require "llm"
833
+ require "active_record"
834
+ require "llm/active_record"
835
+
836
+ class Ticket < ApplicationRecord
837
+ acts_as_agent provider: :set_provider, context: :set_context
838
+ model "gpt-5.4-mini"
839
+ instructions "You are a concise support assistant."
840
+
841
+ private
842
+
843
+ def set_provider
844
+ LLM.public_send(provider_name, key: provider_key)
845
+ end
846
+
847
+ def set_context
848
+ {mode: :responses, store: false}
849
+ end
850
+ end
851
+ ```
852
+
765
853
  #### MCP
766
854
 
767
855
  This example uses [`LLM::MCP`](https://0x1eef.github.io/x/llm.rb/LLM/MCP.html) over HTTP so remote GitHub MCP tools run through the same `LLM::Context` tool path as local tools. It expects a GitHub token in `ENV["GITHUB_PAT"]`. See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
@@ -773,7 +861,7 @@ require "net/http/persistent"
773
861
  llm = LLM.openai(key: ENV["KEY"])
774
862
  mcp = LLM::MCP.http(
775
863
  url: "https://api.githubcopilot.com/mcp/",
776
- headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
864
+ headers: {"Authorization" => "Bearer #{ENV["GITHUB_PAT"]}"}
777
865
  ).persistent
778
866
 
779
867
  mcp.start
@@ -788,7 +876,7 @@ For scoped work, `mcp.run do ... end` is shorter and handles cleanup for you:
788
876
  ```ruby
789
877
  mcp = LLM::MCP.http(
790
878
  url: "https://api.githubcopilot.com/mcp/",
791
- headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
879
+ headers: {"Authorization" => "Bearer #{ENV["GITHUB_PAT"]}"}
792
880
  ).persistent
793
881
  mcp.run do
794
882
  ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
@@ -11,7 +11,6 @@ module LLM::ActiveRecord
11
11
  # class and forwarded to an internal agent subclass.
12
12
  module ActsAsAgent
13
13
  EMPTY_HASH = LLM::ActiveRecord::ActsAsLLM::EMPTY_HASH
14
- DEFAULT_USAGE_COLUMNS = LLM::ActiveRecord::ActsAsLLM::DEFAULT_USAGE_COLUMNS
15
14
  DEFAULTS = LLM::ActiveRecord::ActsAsLLM::DEFAULTS
16
15
  Utils = LLM::ActiveRecord::ActsAsLLM::Utils
17
16
 
@@ -58,8 +57,6 @@ module LLM::ActiveRecord
58
57
  # @param [Class] model
59
58
  # @return [void]
60
59
  def self.extended(model)
61
- options = model.llm_plugin_options
62
- model.validates options[:provider_column], options[:model_column], presence: true
63
60
  model.include LLM::ActiveRecord::ActsAsLLM::InstanceMethods unless model.ancestors.include?(LLM::ActiveRecord::ActsAsLLM::InstanceMethods)
64
61
  model.include InstanceMethods unless model.ancestors.include?(InstanceMethods)
65
62
  model.extend ClassMethods unless model.singleton_class.ancestors.include?(ClassMethods)
@@ -77,6 +74,8 @@ module LLM::ActiveRecord
77
74
  # @option options [Proc, Symbol, LLM::Tracer, nil] :tracer
78
75
  # Optional tracer, method name, or proc that resolves to one and is
79
76
  # assigned through `llm.tracer = ...` on the resolved provider.
77
+ # @option options [Proc, Symbol, LLM::Provider] :provider
78
+ # Must resolve to an `LLM::Provider` instance for the current record.
80
79
  # @yield
81
80
  # Evaluated in the model class after the wrapper is installed, so agent
82
81
  # DSL methods such as `model`, `tools`, `schema`, `instructions`, and
@@ -84,9 +83,8 @@ module LLM::ActiveRecord
84
83
  # @return [void]
85
84
  def acts_as_agent(options = EMPTY_HASH, &block)
86
85
  options = DEFAULTS.merge(options)
87
- usage_columns = DEFAULT_USAGE_COLUMNS.merge(options[:usage_columns] || EMPTY_HASH)
88
86
  class_attribute :llm_plugin_options, instance_accessor: false, default: DEFAULTS unless respond_to?(:llm_plugin_options)
89
- self.llm_plugin_options = options.merge(usage_columns: usage_columns.freeze).freeze
87
+ self.llm_plugin_options = options.freeze
90
88
  extend Hooks
91
89
  class_exec(&block) if block
92
90
  end
@@ -97,11 +95,8 @@ module LLM::ActiveRecord
97
95
  # @return [LLM::Provider]
98
96
  def llm
99
97
  options = self.class.llm_plugin_options
100
- columns = Utils.columns(options)
101
- provider = self[columns[:provider_column]]
102
- kwargs = Utils.resolve_options(self, options[:provider], ActsAsAgent::EMPTY_HASH)
103
98
  return @llm if @llm
104
- @llm = LLM.method(provider).call(**kwargs)
99
+ @llm = Utils.resolve_provider(self, options, ActsAsAgent::EMPTY_HASH)
105
100
  @llm.tracer = Utils.resolve_option(self, options[:tracer]) if options[:tracer]
106
101
  @llm
107
102
  end
@@ -113,10 +108,9 @@ module LLM::ActiveRecord
113
108
  def ctx
114
109
  @ctx ||= begin
115
110
  options = self.class.llm_plugin_options
116
- columns = Utils.columns(options)
117
111
  params = Utils.resolve_options(self, options[:context], ActsAsAgent::EMPTY_HASH).dup
118
- params[:model] ||= self[columns[:model_column]]
119
112
  ctx = self.class.agent.new(llm, params.compact)
113
+ columns = Utils.columns(options)
120
114
  data = self[columns[:data_column]]
121
115
  if data.nil? || data == ""
122
116
  ctx
@@ -17,19 +17,11 @@ module LLM::ActiveRecord
17
17
  # `tracer:` can also be configured as symbols that are called on the model.
18
18
  module ActsAsLLM
19
19
  EMPTY_HASH = {}.freeze
20
- DEFAULT_USAGE_COLUMNS = {
21
- input_tokens: :input_tokens,
22
- output_tokens: :output_tokens,
23
- total_tokens: :total_tokens
24
- }.freeze
25
20
  DEFAULTS = {
26
- provider_column: :provider,
27
- model_column: :model,
28
21
  data_column: :data,
29
22
  format: :string,
30
- usage_columns: DEFAULT_USAGE_COLUMNS,
31
23
  tracer: nil,
32
- provider: EMPTY_HASH,
24
+ provider: nil,
33
25
  context: EMPTY_HASH
34
26
  }.freeze
35
27
 
@@ -78,28 +70,26 @@ module LLM::ActiveRecord
78
70
  # Maps wrapper options onto the record's storage columns.
79
71
  # @return [Hash]
80
72
  def self.columns(options)
81
- usage_columns = options[:usage_columns]
82
73
  {
83
- provider_column: options[:provider_column],
84
- model_column: options[:model_column],
85
- data_column: options[:data_column],
86
- input_tokens: usage_columns[:input_tokens],
87
- output_tokens: usage_columns[:output_tokens],
88
- total_tokens: usage_columns[:total_tokens]
74
+ data_column: options[:data_column]
89
75
  }.freeze
90
76
  end
91
77
 
78
+ ##
79
+ # Resolves the provider runtime for a record.
80
+ # @return [LLM::Provider]
81
+ def self.resolve_provider(obj, options, empty_hash)
82
+ provider = resolve_option(obj, options[:provider])
83
+ return provider if LLM::Provider === provider
84
+ raise ArgumentError, "provider: must resolve to an LLM::Provider instance"
85
+ end
86
+
92
87
  ##
93
88
  # Persists the runtime state and usage columns back onto the record.
94
89
  # @return [void]
95
90
  def self.save(obj, ctx, options)
96
91
  columns = self.columns(options)
97
- obj.assign_attributes(
98
- columns[:data_column] => serialize_context(ctx, options[:format]),
99
- columns[:input_tokens] => ctx.usage.input_tokens,
100
- columns[:output_tokens] => ctx.usage.output_tokens,
101
- columns[:total_tokens] => ctx.usage.total_tokens
102
- )
92
+ obj.assign_attributes(columns[:data_column] => serialize_context(ctx, options[:format]))
103
93
  obj.save!
104
94
  end
105
95
  end
@@ -111,8 +101,6 @@ module LLM::ActiveRecord
111
101
  # @param [Class] model
112
102
  # @return [void]
113
103
  def self.extended(model)
114
- options = model.llm_plugin_options
115
- model.validates options[:provider_column], options[:model_column], presence: true
116
104
  model.include InstanceMethods unless model.ancestors.include?(InstanceMethods)
117
105
  end
118
106
  end
@@ -128,12 +116,13 @@ module LLM::ActiveRecord
128
116
  # @option options [Proc, Symbol, LLM::Tracer, nil] :tracer
129
117
  # Optional tracer, method name, or proc that resolves to one and is
130
118
  # assigned through `llm.tracer = ...` on the resolved provider.
119
+ # @option options [Proc, Symbol, LLM::Provider] :provider
120
+ # Must resolve to an `LLM::Provider` instance for the current record.
131
121
  # @return [void]
132
122
  def acts_as_llm(options = EMPTY_HASH)
133
123
  options = DEFAULTS.merge(options)
134
- usage_columns = DEFAULT_USAGE_COLUMNS.merge(options[:usage_columns] || EMPTY_HASH)
135
124
  class_attribute :llm_plugin_options, instance_accessor: false, default: DEFAULTS unless respond_to?(:llm_plugin_options)
136
- self.llm_plugin_options = options.merge(usage_columns: usage_columns.freeze).freeze
125
+ self.llm_plugin_options = options.freeze
137
126
  extend Hooks
138
127
  end
139
128
 
@@ -228,12 +217,7 @@ module LLM::ActiveRecord
228
217
  # Returns usage from the mapped usage columns.
229
218
  # @return [LLM::Object]
230
219
  def usage
231
- columns = Utils.columns(self.class.llm_plugin_options)
232
- LLM::Object.from(
233
- input_tokens: self[columns[:input_tokens]] || 0,
234
- output_tokens: self[columns[:output_tokens]] || 0,
235
- total_tokens: self[columns[:total_tokens]] || 0
236
- )
220
+ ctx.usage || LLM::Object.from(input_tokens: 0, output_tokens: 0, total_tokens: 0)
237
221
  end
238
222
 
239
223
  ##
@@ -285,11 +269,8 @@ module LLM::ActiveRecord
285
269
  # @return [LLM::Provider]
286
270
  def llm
287
271
  options = self.class.llm_plugin_options
288
- columns = Utils.columns(options)
289
- provider = self[columns[:provider_column]]
290
- kwargs = Utils.resolve_options(self, options[:provider], ActsAsLLM::EMPTY_HASH)
291
272
  return @llm if @llm
292
- @llm = LLM.method(provider).call(**kwargs)
273
+ @llm = Utils.resolve_provider(self, options, ActsAsLLM::EMPTY_HASH)
293
274
  @llm.tracer = Utils.resolve_option(self, options[:tracer]) if options[:tracer]
294
275
  @llm
295
276
  end
@@ -303,7 +284,6 @@ module LLM::ActiveRecord
303
284
  options = self.class.llm_plugin_options
304
285
  columns = Utils.columns(options)
305
286
  params = Utils.resolve_options(self, options[:context], ActsAsLLM::EMPTY_HASH).dup
306
- params[:model] ||= self[columns[:model_column]]
307
287
  ctx = LLM::Context.new(llm, params.compact)
308
288
  data = self[columns[:data_column]]
309
289
  if data.nil? || data == ""
data/lib/llm/agent.rb CHANGED
@@ -394,6 +394,8 @@ module LLM
394
394
  def run_loop(method, prompt, params)
395
395
  loop = proc do
396
396
  max = Integer(params.delete(:tool_attempts) || 25)
397
+ stream = params[:stream] || @ctx.params[:stream]
398
+ stream.extra[:concurrency] = concurrency if LLM::Stream === stream
397
399
  res = @ctx.public_send(method, apply_instructions(prompt), params)
398
400
  max.times do
399
401
  break if @ctx.functions.empty?
data/lib/llm/buffer.rb CHANGED
@@ -52,6 +52,14 @@ module LLM
52
52
  reverse_each.find(...)
53
53
  end
54
54
 
55
+ ##
56
+ # Returns the index of the last message matching the given block.
57
+ # @yield [LLM::Message]
58
+ # @return [Integer, nil]
59
+ def rindex(...)
60
+ @messages.rindex(...)
61
+ end
62
+
55
63
  ##
56
64
  # Returns the last message(s) in the buffer
57
65
  # @param [Integer, nil] n
data/lib/llm/compactor.rb CHANGED
@@ -11,7 +11,9 @@
11
11
  # The compactor can also use a different model from the main context by
12
12
  # setting `model:` in the compactor config. Compaction thresholds are opt-in:
13
13
  # provide `message_threshold:` and/or `token_threshold:` to enable policy-
14
- # driven compaction.
14
+ # driven compaction. `token_threshold:` accepts either an integer token count
15
+ # or a percentage string like `"90%"`, which resolves against the current
16
+ # model context window.
15
17
  class LLM::Compactor
16
18
  DEFAULTS = {
17
19
  retention_window: 8,
@@ -25,8 +27,11 @@ class LLM::Compactor
25
27
  ##
26
28
  # @param [LLM::Context] ctx
27
29
  # @param [Hash] config
28
- # @option config [Integer, nil] :token_threshold
29
- # Enables token-based compaction.
30
+ # @option config [Integer, String, nil] :token_threshold
31
+ # Enables token-based compaction. Integer values are treated as a fixed
32
+ # token count. Percentage strings like `"90%"` are resolved against
33
+ # {LLM::Context#context_window}; if the context window is unknown, the
34
+ # percentage threshold is treated as disabled.
30
35
  # @option config [Integer, nil] :message_threshold
31
36
  # Enables message-count-based compaction.
32
37
  # @option config [Integer] :retention_window
@@ -39,18 +44,22 @@ class LLM::Compactor
39
44
  end
40
45
 
41
46
  ##
42
- # Returns true when the context should be compacted
47
+ # Returns true when the context should be compacted.
48
+ #
49
+ # When `token_threshold:` is a percentage string such as `"90%"`, the
50
+ # threshold is resolved against the current context window and compared to
51
+ # the current total token usage.
43
52
  # @param [Object] prompt
44
53
  # The next prompt or turn input
45
54
  # @return [Boolean]
46
- def compact?(prompt = nil)
55
+ def compactable?(prompt = nil)
47
56
  return false if ctx.functions.any? || [*prompt].grep(LLM::Function::Return).any?
48
57
  messages = ctx.messages.reject(&:system?)
49
58
  return true if config[:message_threshold] && messages.size > config[:message_threshold]
50
- usage = ctx.usage
51
- return true if config[:token_threshold] && usage && usage.total_tokens > config[:token_threshold]
59
+ return true if token_threshold and ctx.usage.total_tokens > token_threshold
52
60
  false
53
61
  end
62
+ alias_method :compact?, :compactable?
54
63
 
55
64
  ##
56
65
  # Summarize older messages and replace them with a compact summary.
@@ -68,6 +77,7 @@ class LLM::Compactor
68
77
  older = messages[0...(messages.size - recent.size)]
69
78
  summary = LLM::Message.new(ctx.llm.user_role, "[Previous conversation summary]\n\n#{summarize(older)}", {compaction: true})
70
79
  ctx.messages.replace([*ctx.messages.take_while(&:system?), summary, *recent])
80
+ ctx.compacted = true
71
81
  stream.on_compaction_finish(ctx, self) if LLM::Stream === stream
72
82
  summary
73
83
  end
@@ -84,6 +94,15 @@ class LLM::Compactor
84
94
  messages[start..] || []
85
95
  end
86
96
 
97
+ def token_threshold
98
+ @token_threshold ||= begin
99
+ threshold = config[:token_threshold]
100
+ return threshold unless threshold.to_s.end_with?("%")
101
+ return if ctx.context_window <= 0
102
+ (ctx.context_window * threshold.delete_suffix("%").to_f / 100).floor
103
+ end
104
+ end
105
+
87
106
  def summarize(messages)
88
107
  model = config[:model] || ctx.params[:model] || ctx.llm.default_model
89
108
  ctx.llm.complete(summary_prompt(messages), model:).content