llm.rb 4.21.0 → 4.22.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +49 -0
- data/README.md +230 -58
- data/data/anthropic.json +35 -2
- data/data/google.json +7 -2
- data/data/openai.json +0 -30
- data/lib/llm/active_record/acts_as_agent.rb +11 -64
- data/lib/llm/active_record/acts_as_llm.rb +81 -61
- data/lib/llm/agent.rb +15 -3
- data/lib/llm/context.rb +8 -1
- data/lib/llm/sequel/agent.rb +4 -17
- data/lib/llm/sequel/plugin.rb +82 -60
- data/lib/llm/skill.rb +29 -14
- data/lib/llm/version.rb +1 -1
- data/llm.gemspec +3 -0
- metadata +43 -1
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 96698cb3af793b0bd83cae7635279cefbff24f86b11f59c9209edd76f76b757c
+  data.tar.gz: 389e4372ab3b4a2e90020e6e2e838b5a36516d5a5dd82a71243975dfe6f8f959
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6bd4fa02802333bbb925db2e513913bd1669e8a4d7c85d8cb76b88399e9b0e84bfd5ddf922c7816a2afd0c0d76d6a9f8c873702c789665dfe3205ada01d34203
+  data.tar.gz: 0d579386ead2158a4e7ad4991ff0c025758ac51624947d07e5d112779d46cb36bcabdd492ac20bbabc981b3e75e25300d04ba8b86808e4825b5c66e2186e52ae
data/CHANGELOG.md
CHANGED

@@ -2,8 +2,57 @@
 
 ## Unreleased
 
+Changes since `v4.22.0`.
+
+## v4.22.0
+
 Changes since `v4.21.0`.
 
+This release deepens the runtime shape of llm.rb. It reduces helper-method
+surface on persisted ORM models, expands real ORM coverage, and makes skills
+behave more like bounded sub-agents with inherited recent context and proper
+instruction injection.
+
+### Change
+
+* **Reduce ActiveRecord wrapper model surface** <br>
+  Move helper methods such as option resolution, column mapping,
+  serialization, and persistence into `Utils` for the ActiveRecord
+  wrappers so wrapped models include fewer internal helper methods.
+
+* **Reduce Sequel wrapper model surface** <br>
+  Move helper methods such as option resolution, column mapping,
+  serialization, and persistence into `Utils` for the Sequel wrappers
+  so wrapped models include fewer internal helper methods.
+
+* **Expand ORM integration coverage** <br>
+  Add broader ActiveRecord and Sequel coverage for persisted context and
+  agent wrappers, including real SQLite-backed records and cassette-backed
+  OpenAI persistence paths.
+
+* **Make skills inherit recent parent context** <br>
+  Run `LLM::Skill` with a curated slice of recent parent user and assistant
+  messages, prefixed with `Recent context:`, so skills behave more like
+  task-scoped sub-agents instead of instruction-only helpers.
+
+### Fix
+
+* **Fix Sequel `plugin :agent` load order** <br>
+  Require the shared Sequel plugin support from `LLM::Sequel::Agent` so
+  `plugin :agent` can load independently without raising
+  `uninitialized constant LLM::Sequel::Plugin`.
+
+* **Make skill execution inherit parent context request settings** <br>
+  Run `LLM::Skill` through a parent `LLM::Context` instead of a bare
+  provider so nested skill agents inherit context-level settings such as
+  `mode: :responses`, `store: false`, streaming, and other request defaults,
+  while still keeping skill-local tools and avoiding parent schemas.
+
+* **Keep agent instructions when history is preseeded** <br>
+  Inject `LLM::Agent` instructions once unless a system message is already
+  present, so agents and nested skills still get their instructions when
+  they start with inherited non-system context.
+
 ## v4.21.0
 
 Changes since `v4.20.2`.
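To make the skill-execution fix concrete: a skill's nested agent now runs with the parent context's request defaults rather than on a bare provider. A minimal sketch of the user-visible effect, assuming `LLM::Context.new` accepts `skills:` and `stream:` options (the `skills:` keyword is inferred from the `load_skills` helper in this diff, not documented here):

```ruby
require "llm"

llm = LLM.openai(key: ENV["KEY"])
# Request defaults set on the parent context, such as streaming, now carry
# over to the sub-agent a skill spawns instead of being dropped.
ctx = LLM::Context.new(llm, stream: $stdout, skills: ["./skills/release"])
ctx.talk("Use the release skill.")
```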
data/README.md
CHANGED

@@ -9,18 +9,9 @@
 
 ## About
 
-llm.rb is
+llm.rb is the most capable runtime for building AI systems in Ruby.
 <br>
 
-It is also the most capable AI Ruby runtime that exists _today_, and that claim is
-backed up by research. Maybe it won't always be true, and that would be good news too -
-because it would mean the Ruby ecosystem is getting stronger.
-
-llm.rb is not just an API wrapper: it gives you one runtime for providers,
-contexts, agents, tools, skills, MCP servers, streaming, schemas, files, and
-persisted state, so real systems can be built out of one coherent execution
-model instead of a pile of adapters.
-
 llm.rb is designed for Ruby, and although it works great in Rails, it is not tightly
 coupled to it. It runs on the standard library by default (zero dependencies),
 loads optional pieces only when needed, includes built-in ActiveRecord support through
@@ -29,6 +20,10 @@ loads optional pieces only when needed, includes built-in ActiveRecord support t
 long-lived, tool-capable, stateful AI workflows instead of just
 request/response helpers.
 
+It provides one runtime for providers, agents, tools, skills, MCP servers, streaming,
+schemas, files, and persisted state, so real systems can be built out of one coherent
+execution model instead of a pile of adapters.
+
 Want to see some code? Jump to [the examples](#examples) section. <br>
 Want a taste of what llm.rb can build? See [the screencast](#screencast).
 
@@ -53,6 +48,175 @@ It holds:
 Instead of switching abstractions for each feature, everything builds on the
 same context object.
 
+## Standout features
+
+The following list is **not exhaustive**, but it covers a lot of ground.
+
+#### Skills
+
+Skills are reusable, directory-backed capabilities loaded from `SKILL.md`.
+They run through the same runtime as tools, agents, and MCP. They do not
+require a second orchestration layer or a parallel abstraction. If you've
+used Claude or Codex, you know the general idea of skills, and llm.rb
+supports that same concept with the same execution model as the rest of the
+system.
+
+In llm.rb, a skill has frontmatter and instructions. The frontmatter can
+define `name`, `description`, and `tools`. The `tools` entries are tool names,
+and each name must resolve to a subclass of
+[`LLM::Tool`](https://0x1eef.github.io/x/llm.rb/LLM/Tool.html) that is already
+loaded in the runtime.
+
+If you want Claude/Codex-like skills that can drive scripts or shell
+commands, you would typically pair the skill with a tool that can execute
+system commands.
+
+```yaml
+---
+name: release
+description: Prepare a release
+tools:
+  - search_docs
+  - git
+---
+Review the release state, summarize what changed, and prepare the release.
+```
+
+```ruby
+class Agent < LLM::Agent
+  model "gpt-5.4-mini"
+  skills "./skills/release"
+end
+
+llm = LLM.openai(key: ENV["KEY"])
+Agent.new(llm, stream: $stdout).talk("Let's prepare the release!")
+```
+
+#### ORM
+
+Any ActiveRecord model or Sequel model can become an agent-capable model,
+including existing business and domain models, without forcing you into a
+separate agent table or a second persistence layer.
+
+`acts_as_agent` extends a model with agent capabilities: the same runtime
+surface as [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html),
+because it actually wraps an `LLM::Agent`, plus persistence through a text,
+JSON, or JSONB-backed column on the same table.
+
+```ruby
+class Ticket < ApplicationRecord
+  acts_as_agent provider: :set_provider
+  model "gpt-5.4-mini"
+  instructions "You are a support assistant."
+
+  private
+
+  def set_provider
+    { key: ENV["#{provider.upcase}_SECRET"], persistent: true }
+  end
+end
+```
+
+#### Agentic Patterns
+
+llm.rb is especially strong when you want to build agentic systems in a Ruby
+way. Agents can be ordinary application models with state, associations,
+tools, skills, and persistence, which makes it much easier to build systems
+where users have their own specialized agents instead of treating agents as
+something outside the app.
+
+That pattern works so well in llm.rb because
+[`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html),
+`acts_as_agent`, `plugin :agent`, skills, tools, and persisted runtime state
+all fit the same execution model. The runtime stays small enough that the
+main design work becomes application design, not orchestration glue.
+
+For a concrete example, see
+[How to build a platform of agents](https://0x1eef.github.io/posts/how-to-build-a-platform-of-agents).
+
+#### Persistence
+
+The same runtime can be serialized to disk, restored later, persisted in JSON
+or JSONB-backed ORM columns, resumed across process boundaries, or shared
+across long-lived workflows.
+
+```ruby
+ctx = LLM::Context.new(llm)
+ctx.talk("Remember that my favorite language is Ruby.")
+ctx.save(path: "context.json")
+```
+
+#### LLM::Stream
+
+`LLM::Stream` is not just for printing tokens. It supports `on_content`,
+`on_reasoning_content`, `on_tool_call`, and `on_tool_return`, which means
+visible output, reasoning output, and tool execution can all be driven through
+the same execution path.
+
+```ruby
+class Stream < LLM::Stream
+  def on_tool_call(tool, error)
+    queue << tool.spawn(:thread)
+  end
+
+  def on_tool_return(tool, result)
+    puts(result.value)
+  end
+end
+```
+
+#### Concurrency
+
+Tool execution can run sequentially with `:call` or concurrently through
+`:thread`, `:task`, `:fiber`, and experimental `:ractor`, without rewriting
+your tool layer.
+
+```ruby
+class Agent < LLM::Agent
+  model "gpt-5.4-mini"
+  tools FetchWeather, FetchNews, FetchStock
+  concurrency :thread
+end
+```
+
+#### MCP
+
+Remote MCP tools and prompts are not bolted on as a separate integration
+stack. They adapt into the same tool and prompt path used by local tools,
+skills, contexts, and agents.
+
+```ruby
+begin
+  mcp = LLM::MCP.http(url: "https://api.githubcopilot.com/mcp/").persistent
+  mcp.start
+  ctx = LLM::Context.new(llm, tools: mcp.tools)
+ensure
+  mcp.stop
+end
+```
+
+#### Cancellation
+
+Cancellation is one of the harder problems to get right, and while llm.rb
+makes it possible, it still requires careful engineering to use effectively.
+The point though is that it is possible to stop in-flight provider work cleanly
+through the same runtime, and the model used by llm.rb is directly inspired by
+Go's context package. In fact, llm.rb is heavily inspired by Go but with a Ruby
+twist.
+
+```ruby
+ctx = LLM::Context.new(llm, stream: $stdout)
+worker = Thread.new do
+  ctx.talk("Write a very long essay about network protocols.")
+rescue LLM::Interrupt
+  puts "Request was interrupted!"
+end
+STDIN.getch
+ctx.interrupt!
+worker.join
+```
+
 ## Differentiators
 
 ### Execution Model
@@ -137,11 +301,11 @@ same context object.
 - **Tools are explicit** <br>
   Run local tools, provider-native tools, and MCP tools through the same path
   with fewer special cases.
-- **Skills
+- **Skills become bounded runtime capabilities** <br>
   Point llm.rb at directories with a `SKILL.md`, resolve named tools through
-  the registry, and
-
-
+  the registry, and adapt each skill into its own callable capability through
+  the normal runtime. Unlike a generic skill-discovery tool, each skill runs
+  with its own bounded tool subset and behaves like a task-scoped sub-agent.
 - **Providers are normalized, not flattened** <br>
   Share one API surface across providers without losing access to provider-
   specific capabilities where they matter.
@@ -173,24 +337,31 @@ same context object.
 
 ## Capabilities
 
+Execution:
 - **Chat & Contexts** — stateless and stateful interactions with persistence
 - **Context Serialization** — save and restore state across processes or time
 - **Streaming** — visible output, reasoning output, tool-call events
 - **Request Interruption** — stop in-flight provider work cleanly
+- **Concurrent Execution** — threads, async tasks, and fibers
+
+Runtime Building Blocks:
 - **Tool Calling** — class-based tools and closure-based functions
 - **Run Tools While Streaming** — overlap model output with tool latency
-- **Concurrent Execution** — threads, async tasks, and fibers
 - **Agents** — reusable assistants with tool auto-execution
 - **Skills** — directory-backed capabilities loaded from `SKILL.md`
+- **MCP Support** — stdio and HTTP MCP clients with prompt and tool support
+
+Data and Structure:
 - **Structured Outputs** — JSON Schema-based responses
 - **Responses API** — stateful response workflows where providers support them
-- **MCP Support** — stdio and HTTP MCP clients with prompt and tool support
 - **Multimodal Inputs** — text, images, audio, documents, URLs
 - **Audio** — speech generation, transcription, translation
 - **Images** — generation and editing
 - **Files API** — upload and reference files in prompts
 - **Embeddings** — vector generation for search and RAG
 - **Vector Stores** — retrieval workflows
+
+Operations:
 - **Cost Tracking** — local cost estimation without extra API calls
 - **Observability** — tracing, logging, telemetry
 - **Model Registry** — local metadata for capabilities, limits, pricing
@@ -221,6 +392,44 @@ loop do
 end
 ```
 
+#### Agent
+
+This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) directly and lets the agent manage tool execution. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
+
+```ruby
+require "llm"
+
+class ShellAgent < LLM::Agent
+  model "gpt-5.4-mini"
+  instructions "You are a Linux system assistant."
+  tools Shell
+  concurrency :thread
+end
+
+llm = LLM.openai(key: ENV["KEY"])
+agent = ShellAgent.new(llm)
+puts agent.talk("What time is it on this system?").content
+```
+
+#### Skills
+
+This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) with directory-backed skills so `SKILL.md` capabilities run through the normal tool path. In llm.rb, a skill is exposed as a tool in the runtime. When that tool is called, it spawns a sub-agent with relevant context plus the instructions and tool subset declared in its own `SKILL.md`. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
+
+Each skill runs only with the tools declared in its own frontmatter.
+
+```ruby
+require "llm"
+
+class Agent < LLM::Agent
+  model "gpt-5.4-mini"
+  instructions "You are a concise release assistant."
+  skills "./skills/release", "./skills/review"
+end
+
+llm = LLM.openai(key: ENV["KEY"])
+puts Agent.new(llm).talk("Use the review skill.").content
+```
+
 #### Streaming
 
 This example uses [`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html) directly so visible output and tool execution can happen together. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
@@ -354,12 +563,11 @@ require "active_record"
 require "llm/active_record"
 
 class Ticket < ApplicationRecord
-  acts_as_agent provider: :set_provider
-
-
-
-
-end
+  acts_as_agent provider: :set_provider
+  model "gpt-5.4-mini"
+  instructions "You are a concise support assistant."
+  tools SearchDocs, Escalate
+  concurrency :thread
 
   private
 
@@ -372,42 +580,6 @@ ticket = Ticket.create!(provider: "openai", model: "gpt-5.4-mini")
 puts ticket.talk("How do I rotate my API key?").content
 ```
 
-#### Agent
-
-This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) directly and lets the agent manage tool execution. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
-
-```ruby
-require "llm"
-
-class ShellAgent < LLM::Agent
-  model "gpt-5.4-mini"
-  instructions "You are a Linux system assistant."
-  tools Shell
-  concurrency :thread
-end
-
-llm = LLM.openai(key: ENV["KEY"])
-agent = ShellAgent.new(llm)
-puts agent.talk("What time is it on this system?").content
-```
-
-#### Skills
-
-This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html) with directory-backed skills so `SKILL.md` capabilities run through the normal tool path. If you have used skills in Claude or Codex, this is the same kind of building block. <br> See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
-
-```ruby
-require "llm"
-
-class Agent < LLM::Agent
-  model "gpt-5.4-mini"
-  instructions "You are a concise release assistant."
-  skills "./skills/release", "./skills/review"
-end
-
-llm = LLM.openai(key: ENV["KEY"])
-puts Agent.new(llm).talk("Use the review skill.").content
-```
-
 #### MCP
 
 This example uses [`LLM::MCP`](https://0x1eef.github.io/x/llm.rb/LLM/MCP.html) over HTTP so remote GitHub MCP tools run through the same `LLM::Context` tool path as local tools. See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
data/data/anthropic.json
CHANGED

@@ -213,7 +213,7 @@
       "reasoning": true,
       "tool_call": true,
       "temperature": true,
-      "knowledge": "2025-08",
+      "knowledge": "2025-08-31",
       "release_date": "2026-02-17",
       "last_updated": "2026-03-13",
       "modalities": {
@@ -271,6 +271,39 @@
         "output": 32000
       }
     },
+    "claude-opus-4-7": {
+      "id": "claude-opus-4-7",
+      "name": "Claude Opus 4.7",
+      "family": "claude-opus",
+      "attachment": true,
+      "reasoning": true,
+      "tool_call": true,
+      "temperature": false,
+      "knowledge": "2026-01-31",
+      "release_date": "2026-04-16",
+      "last_updated": "2026-04-16",
+      "modalities": {
+        "input": [
+          "text",
+          "image",
+          "pdf"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": false,
+      "cost": {
+        "input": 5,
+        "output": 25,
+        "cache_read": 0.5,
+        "cache_write": 6.25
+      },
+      "limit": {
+        "context": 1000000,
+        "output": 128000
+      }
+    },
     "claude-3-haiku-20240307": {
       "id": "claude-3-haiku-20240307",
       "name": "Claude Haiku 3",
@@ -609,7 +642,7 @@
       "reasoning": true,
       "tool_call": true,
       "temperature": true,
-      "knowledge": "2025-05",
+      "knowledge": "2025-05-31",
       "release_date": "2026-02-05",
       "last_updated": "2026-03-13",
       "modalities": {
data/data/google.json
CHANGED

@@ -594,7 +594,12 @@
       "cost": {
         "input": 1.25,
         "output": 10,
-        "cache_read": 0.
+        "cache_read": 0.125,
+        "context_over_200k": {
+          "input": 2.5,
+          "output": 15,
+          "cache_read": 0.25
+        }
       },
       "limit": {
         "context": 1048576,
@@ -824,7 +829,7 @@
       "cost": {
         "input": 0.3,
         "output": 2.5,
-        "cache_read": 0.
+        "cache_read": 0.03,
         "input_audio": 1
       },
       "limit": {
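The new `context_over_200k` block encodes tiered pricing for long prompts. Assuming the registry's cost values are USD per million tokens and that the higher tier applies once input exceeds 200k tokens (both are conventions inferred from the data, not documented here), a rough estimate works out like this:

```ruby
# Hypothetical cost estimator over the registry entry above. Assumes costs
# are USD per million tokens and that "context_over_200k" rates apply when
# the prompt exceeds 200,000 tokens.
def estimate_cost(cost, input_tokens, output_tokens)
  tier = input_tokens > 200_000 ? cost.fetch("context_over_200k", cost) : cost
  (input_tokens * tier["input"] + output_tokens * tier["output"]) / 1_000_000.0
end

cost = {
  "input" => 1.25, "output" => 10, "cache_read" => 0.125,
  "context_over_200k" => {"input" => 2.5, "output" => 15, "cache_read" => 0.25}
}
estimate_cost(cost, 150_000, 2_000) # => 0.2075 (below the tier boundary)
estimate_cost(cost, 250_000, 2_000) # => 0.655  (above it)
```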
data/data/openai.json
CHANGED

@@ -1066,36 +1066,6 @@
         "output": 100000
       }
     },
-    "codex-mini-latest": {
-      "id": "codex-mini-latest",
-      "name": "Codex Mini",
-      "family": "gpt-codex-mini",
-      "attachment": true,
-      "reasoning": true,
-      "tool_call": true,
-      "temperature": false,
-      "knowledge": "2024-04",
-      "release_date": "2025-05-16",
-      "last_updated": "2025-05-16",
-      "modalities": {
-        "input": [
-          "text"
-        ],
-        "output": [
-          "text"
-        ]
-      },
-      "open_weights": false,
-      "cost": {
-        "input": 1.5,
-        "output": 6,
-        "cache_read": 0.375
-      },
-      "limit": {
-        "context": 200000,
-        "output": 100000
-      }
-    },
     "gpt-4": {
       "id": "gpt-4",
       "name": "GPT-4",
data/lib/llm/active_record/acts_as_agent.rb
CHANGED

@@ -13,6 +13,7 @@ module LLM::ActiveRecord
     EMPTY_HASH = LLM::ActiveRecord::ActsAsLLM::EMPTY_HASH
     DEFAULT_USAGE_COLUMNS = LLM::ActiveRecord::ActsAsLLM::DEFAULT_USAGE_COLUMNS
     DEFAULTS = LLM::ActiveRecord::ActsAsLLM::DEFAULTS
+    Utils = LLM::ActiveRecord::ActsAsLLM::Utils
 
     module ClassMethods
       def model(model = nil)
@@ -52,7 +53,7 @@
     # @param [Class] model
     # @return [void]
     def self.extended(model)
-      options = model.
+      options = model.llm_plugin_options
       model.validates options[:provider_column], options[:model_column], presence: true
       model.include LLM::ActiveRecord::ActsAsLLM::InstanceMethods unless model.ancestors.include?(LLM::ActiveRecord::ActsAsLLM::InstanceMethods)
       model.include InstanceMethods unless model.ancestors.include?(InstanceMethods)
@@ -79,8 +80,8 @@
     def acts_as_agent(options = EMPTY_HASH, &block)
       options = DEFAULTS.merge(options)
       usage_columns = DEFAULT_USAGE_COLUMNS.merge(options[:usage_columns] || EMPTY_HASH)
-      class_attribute :
-      self.
+      class_attribute :llm_plugin_options, instance_accessor: false, default: DEFAULTS unless respond_to?(:llm_plugin_options)
+      self.llm_plugin_options = options.merge(usage_columns: usage_columns.freeze).freeze
       extend Hooks
       class_exec(&block) if block
     end
@@ -90,12 +91,13 @@
     # Returns the resolved provider instance for this record.
     # @return [LLM::Provider]
     def llm
-      options = self.class.
+      options = self.class.llm_plugin_options
+      columns = Utils.columns(options)
       provider = self[columns[:provider_column]]
-      kwargs = resolve_options(options[:provider])
+      kwargs = Utils.resolve_options(self, options[:provider], ActsAsAgent::EMPTY_HASH)
       return @llm if @llm
       @llm = LLM.method(provider).call(**kwargs)
-      @llm.tracer = resolve_option(options[:tracer]) if options[:tracer]
+      @llm.tracer = Utils.resolve_option(self, options[:tracer]) if options[:tracer]
       @llm
     end
 
@@ -105,8 +107,9 @@
     # @return [LLM::Agent]
     def ctx
       @ctx ||= begin
-        options = self.class.
-
+        options = self.class.llm_plugin_options
+        columns = Utils.columns(options)
+        params = Utils.resolve_options(self, options[:context], ActsAsAgent::EMPTY_HASH).dup
         params[:model] ||= self[columns[:model_column]]
         ctx = self.class.agent.new(llm, params.compact)
         data = self[columns[:data_column]]
@@ -121,62 +124,6 @@
         end
       end
     end
-
-    ##
-    # @return [void]
-    def flush
-      attrs = {
-        columns[:data_column] => serialize_context(self.class.llm_agent_options[:format]),
-        columns[:input_tokens] => ctx.usage.input_tokens,
-        columns[:output_tokens] => ctx.usage.output_tokens,
-        columns[:total_tokens] => ctx.usage.total_tokens
-      }
-      assign_attributes(attrs)
-      save!
-    end
-
-    ##
-    # @return [Hash]
-    def resolve_option(option)
-      case option
-      when Proc then instance_exec(&option)
-      when Symbol then send(option)
-      when Hash then option.dup
-      else option
-      end
-    end
-
-    ##
-    # @return [Hash]
-    def resolve_options(option)
-      case option
-      when Proc, Symbol, Hash then resolve_option(option)
-      else ActsAsAgent::EMPTY_HASH.dup
-      end
-    end
-
-    def serialize_context(format)
-      case format
-      when :string then ctx.to_json
-      when :json, :jsonb then ctx.to_h
-      else raise ArgumentError, "Unknown format: #{format.inspect}"
-      end
-    end
-
-    def columns
-      @columns ||= begin
-        options = self.class.llm_agent_options
-        usage_columns = options[:usage_columns]
-        {
-          provider_column: options[:provider_column],
-          model_column: options[:model_column],
-          data_column: options[:data_column],
-          input_tokens: usage_columns[:input_tokens],
-          output_tokens: usage_columns[:output_tokens],
-          total_tokens: usage_columns[:total_tokens]
-        }.freeze
-      end
-    end
   end
 end
 end
data/lib/llm/active_record/acts_as_llm.rb
CHANGED

@@ -33,6 +33,77 @@ module LLM::ActiveRecord
       context: EMPTY_HASH
     }.freeze
 
+    ##
+    # Shared helper methods for the ORM wrapper.
+    #
+    # These utilities keep persistence plumbing out of the wrapped model's
+    # method namespace so the injected surface stays focused on the runtime
+    # API itself.
+    # @api private
+    module Utils
+      ##
+      # Resolves a single configured option against a model instance.
+      # @return [Object]
+      def self.resolve_option(obj, option)
+        case option
+        when Proc then obj.instance_exec(&option)
+        when Symbol then obj.send(option)
+        when Hash then option.dup
+        else option
+        end
+      end
+
+      ##
+      # Resolves hash-like wrapper options against a model instance.
+      # @return [Hash]
+      def self.resolve_options(obj, option, empty_hash)
+        case option
+        when Proc, Symbol, Hash then resolve_option(obj, option)
+        else empty_hash.dup
+        end
+      end
+
+      ##
+      # Serializes the runtime into the configured storage format.
+      # @return [String, Hash]
+      def self.serialize_context(ctx, format)
+        case format
+        when :string then ctx.to_json
+        when :json, :jsonb then ctx.to_h
+        else raise ArgumentError, "Unknown format: #{format.inspect}"
+        end
+      end
+
+      ##
+      # Maps wrapper options onto the record's storage columns.
+      # @return [Hash]
+      def self.columns(options)
+        usage_columns = options[:usage_columns]
+        {
+          provider_column: options[:provider_column],
+          model_column: options[:model_column],
+          data_column: options[:data_column],
+          input_tokens: usage_columns[:input_tokens],
+          output_tokens: usage_columns[:output_tokens],
+          total_tokens: usage_columns[:total_tokens]
+        }.freeze
+      end
+
+      ##
+      # Persists the runtime state and usage columns back onto the record.
+      # @return [void]
+      def self.save(obj, ctx, options)
+        columns = self.columns(options)
+        obj.assign_attributes(
+          columns[:data_column] => serialize_context(ctx, options[:format]),
+          columns[:input_tokens] => ctx.usage.input_tokens,
+          columns[:output_tokens] => ctx.usage.output_tokens,
+          columns[:total_tokens] => ctx.usage.total_tokens
+        )
+        obj.save!
+      end
+    end
+
     module Hooks
       ##
       # Called when hooks are extended onto an ActiveRecord model.
@@ -72,7 +143,8 @@
     # @see LLM::Context#talk
     # @return [LLM::Response]
     def talk(...)
-
+      options = self.class.llm_plugin_options
+      ctx.talk(...).tap { Utils.save(self, ctx, options) }
     end
 
     ##
@@ -80,7 +152,8 @@
     # @see LLM::Context#respond
     # @return [LLM::Response]
     def respond(...)
-
+      options = self.class.llm_plugin_options
+      ctx.respond(...).tap { Utils.save(self, ctx, options) }
     end
 
     ##
@@ -155,6 +228,7 @@
     # Returns usage from the mapped usage columns.
     # @return [LLM::Object]
     def usage
+      columns = Utils.columns(self.class.llm_plugin_options)
       LLM::Object.from(
         input_tokens: self[columns[:input_tokens]] || 0,
         output_tokens: self[columns[:output_tokens]] || 0,
@@ -211,11 +285,12 @@
     # @return [LLM::Provider]
     def llm
       options = self.class.llm_plugin_options
+      columns = Utils.columns(options)
       provider = self[columns[:provider_column]]
-      kwargs = resolve_options(options[:provider])
+      kwargs = Utils.resolve_options(self, options[:provider], ActsAsLLM::EMPTY_HASH)
       return @llm if @llm
       @llm = LLM.method(provider).call(**kwargs)
-      @llm.tracer = resolve_option(options[:tracer]) if options[:tracer]
+      @llm.tracer = Utils.resolve_option(self, options[:tracer]) if options[:tracer]
       @llm
     end
 
@@ -226,7 +301,8 @@
     def ctx
       @ctx ||= begin
         options = self.class.llm_plugin_options
-
+        columns = Utils.columns(options)
+        params = Utils.resolve_options(self, options[:context], ActsAsLLM::EMPTY_HASH).dup
         params[:model] ||= self[columns[:model_column]]
         ctx = LLM::Context.new(llm, params.compact)
         data = self[columns[:data_column]]
@@ -241,62 +317,6 @@
         end
       end
     end
-
-    ##
-    # @return [void]
-    def flush
-      attrs = {
-        columns[:data_column] => serialize_context(self.class.llm_plugin_options[:format]),
-        columns[:input_tokens] => ctx.usage.input_tokens,
-        columns[:output_tokens] => ctx.usage.output_tokens,
-        columns[:total_tokens] => ctx.usage.total_tokens
-      }
-      assign_attributes(attrs)
-      save!
-    end
-
-    ##
-    # @return [Hash]
-    def resolve_option(option)
-      case option
-      when Proc then instance_exec(&option)
-      when Symbol then send(option)
-      when Hash then option.dup
-      else option
-      end
-    end
-
-    ##
-    # @return [Hash]
-    def resolve_options(option)
-      case option
-      when Proc, Symbol, Hash then resolve_option(option)
-      else ActsAsLLM::EMPTY_HASH.dup
-      end
-    end
-
-    def serialize_context(format)
-      case format
-      when :string then ctx.to_json
-      when :json, :jsonb then ctx.to_h
-      else raise ArgumentError, "Unknown format: #{format.inspect}"
-      end
-    end
-
-    def columns
-      @columns ||= begin
-        options = self.class.llm_plugin_options
-        usage_columns = options[:usage_columns]
-        {
-          provider_column: options[:provider_column],
-          model_column: options[:model_column],
-          data_column: options[:data_column],
-          input_tokens: usage_columns[:input_tokens],
-          output_tokens: usage_columns[:output_tokens],
-          total_tokens: usage_columns[:total_tokens]
-        }.freeze
-      end
-    end
   end
 end
 end
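For a sense of what the relocated helpers do, `Utils.resolve_option` is the dispatcher behind options like `provider: :set_provider` in the README example: it evaluates a Proc against the record, calls a method named by a Symbol, or duplicates a literal Hash. A standalone restatement of its case analysis from the diff above, run against a plain object instead of an ActiveRecord model:

```ruby
# Same case analysis as Utils.resolve_option in the diff above.
def resolve_option(obj, option)
  case option
  when Proc then obj.instance_exec(&option)
  when Symbol then obj.send(option)
  when Hash then option.dup
  else option
  end
end

record_class = Struct.new(:provider) do
  def set_provider
    {key: "secret"}
  end
end
record = record_class.new("openai")

resolve_option(record, :set_provider)          # => {key: "secret"}
resolve_option(record, -> { {key: provider} }) # => {key: "openai"}
resolve_option(record, {key: "literal"})       # => {key: "literal"}
```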
data/lib/llm/agent.rb
CHANGED

@@ -14,7 +14,7 @@ module LLM
   # `respond`, instead of leaving tool loops to the caller.
   #
   # **Notes:**
-  # * Instructions are injected
+  # * Instructions are injected once unless a system message is already present.
   # * An agent automatically executes tool loops (unlike {LLM::Context LLM::Context}).
   # * Tool loop execution can be configured with `concurrency :call`,
   #   `:thread`, `:task`, `:fiber`, `:ractor`, or a list of queued task
@@ -349,16 +349,28 @@
       instr = self.class.instructions
       return new_prompt unless instr
       if LLM::Prompt === new_prompt
-        new_prompt.system(instr) if
+        new_prompt.system(instr) if inject_instructions?(new_prompt)
         new_prompt
       else
         prompt do
-          _1.system(instr) if
+          _1.system(instr) if inject_instructions?
           _1.user(new_prompt)
         end
       end
     end
 
+    ##
+    # Returns true when agent instructions should be injected for the turn.
+    # Instructions are injected once unless a system message is already
+    # present in the existing context or the prompt being sent.
+    # @param [LLM::Prompt, nil] prompt
+    # @return [Boolean]
+    def inject_instructions?(prompt = nil)
+      return false if @ctx.messages.any?(&:system?)
+      return true if prompt.nil?
+      !prompt.to_a.any?(&:system?)
+    end
+
     ##
     # @return [Array<LLM::Function::Return>]
     def call_functions
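The `inject_instructions?` predicate reads as a small decision table: never inject when a system message already exists, otherwise inject. A standalone restatement of the rules above, where `Msg` is a stand-in for `LLM::Message`:

```ruby
# Same rules as Agent#inject_instructions?, restated over plain structs.
Msg = Struct.new(:role) do
  def system? = role == :system
end

def inject?(history, prompt_messages = nil)
  return false if history.any?(&:system?)
  return true if prompt_messages.nil?
  !prompt_messages.any?(&:system?)
end

inject?([])                                    # => true  (fresh agent)
inject?([Msg.new(:user), Msg.new(:assistant)]) # => true  (preseeded, no system message)
inject?([Msg.new(:system)])                    # => false (instructions already injected)
inject?([], [Msg.new(:system)])                # => false (prompt supplies its own)
```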
data/lib/llm/context.rb
CHANGED

@@ -54,6 +54,13 @@ module LLM
     # @return [Symbol]
     attr_reader :mode
 
+    ##
+    # Returns the default params for this context
+    # @return [Hash]
+    def params
+      @params.dup
+    end
+
     ##
     # @param [LLM::Provider] llm
     #   A provider
@@ -350,7 +357,7 @@
     end
 
     def load_skills(skills)
-      [*skills].map { LLM::Skill.load(_1).to_tool(
+      [*skills].map { LLM::Skill.load(_1).to_tool(self) }
     end
   end
 
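The new `params` reader returns `@params.dup`, which is what lets `LLM::Skill#call` merge and filter a context's request defaults without mutating the context itself. A minimal sketch of that defensive-copy behavior (the `TinyContext` class is illustrative):

```ruby
# Why the dup matters: reshaping the returned hash leaves the context's
# own defaults untouched.
class TinyContext
  def initialize(**params)
    @params = params
  end

  def params
    @params.dup
  end
end

ctx = TinyContext.new(store: false, schema: :receipt)
derived = ctx.params.reject { |key, _| key == :schema }
derived    # => {store: false}
ctx.params # => {store: false, schema: :receipt}
```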
data/lib/llm/sequel/agent.rb
CHANGED

@@ -10,9 +10,11 @@ module LLM::Sequel
   # instructions, and concurrency are configured on the model class and
   # forwarded to an internal agent subclass.
   module Agent
+    require_relative "plugin"
     EMPTY_HASH = LLM::Sequel::Plugin::EMPTY_HASH
     DEFAULT_USAGE_COLUMNS = LLM::Sequel::Plugin::DEFAULT_USAGE_COLUMNS
     DEFAULTS = LLM::Sequel::Plugin::DEFAULTS
+    Utils = LLM::Sequel::Plugin::Utils
 
     def self.apply(model, **)
       model.extend ClassMethods
@@ -71,7 +73,8 @@
     def ctx
       @ctx ||= begin
         options = self.class.llm_plugin_options
-
+        columns = Agent::Utils.columns(options)
+        params = Agent::Utils.resolve_options(self, options[:context], Agent::EMPTY_HASH).dup
         params[:model] ||= self[columns[:model_column]]
         ctx = self.class.agent.new(llm, params.compact)
         data = self[columns[:data_column]]
@@ -86,22 +89,6 @@
       end
     end
     end
-
-    def resolve_option(option)
-      case option
-      when Proc then instance_exec(&option)
-      when Symbol then send(option)
-      when Hash then option.dup
-      else option
-      end
-    end
-
-    def resolve_options(option)
-      case option
-      when Proc, Symbol, Hash then resolve_option(option)
-      else Agent::EMPTY_HASH.dup
-      end
-    end
   end
 end
 end
data/lib/llm/sequel/plugin.rb
CHANGED

@@ -22,6 +22,76 @@ module LLM::Sequel
       output_tokens: :output_tokens,
       total_tokens: :total_tokens
     }.freeze
+
+    ##
+    # Shared helper methods for the ORM wrapper.
+    #
+    # These utilities keep persistence plumbing out of the wrapped model's
+    # method namespace so the injected surface stays focused on the runtime
+    # API itself.
+    # @api private
+    module Utils
+      ##
+      # Resolves a single configured option against a model instance.
+      # @return [Object]
+      def self.resolve_option(obj, option)
+        case option
+        when Proc then obj.instance_exec(&option)
+        when Symbol then obj.send(option)
+        when Hash then option.dup
+        else option
+        end
+      end
+
+      ##
+      # Resolves hash-like wrapper options against a model instance.
+      # @return [Hash]
+      def self.resolve_options(obj, option, empty_hash)
+        case option
+        when Proc, Symbol, Hash then resolve_option(obj, option)
+        else empty_hash.dup
+        end
+      end
+
+      ##
+      # Serializes the runtime into the configured storage format.
+      # @return [String, Hash]
+      def self.serialize_context(ctx, format)
+        case format
+        when :string then ctx.to_json
+        when :json, :jsonb then ctx.to_h
+        else raise ArgumentError, "Unknown format: #{format.inspect}"
+        end
+      end
+
+      ##
+      # Maps wrapper options onto the record's storage columns.
+      # @return [Hash]
+      def self.columns(options)
+        usage_columns = options[:usage_columns]
+        {
+          provider_column: options[:provider_column],
+          model_column: options[:model_column],
+          data_column: options[:data_column],
+          input_tokens: usage_columns[:input_tokens],
+          output_tokens: usage_columns[:output_tokens],
+          total_tokens: usage_columns[:total_tokens]
+        }.freeze
+      end
+
+      ##
+      # Persists the runtime state and usage columns back onto the record.
+      # @return [void]
+      def self.save(obj, ctx, options)
+        columns = self.columns(options)
+        obj.update(
+          columns[:data_column] => serialize_context(ctx, options[:format]),
+          columns[:input_tokens] => ctx.usage.input_tokens,
+          columns[:output_tokens] => ctx.usage.output_tokens,
+          columns[:total_tokens] => ctx.usage.total_tokens
+        )
+      end
+    end
     DEFAULTS = {
       provider_column: :provider,
       model_column: :model,
@@ -84,12 +154,15 @@
     end
 
   module Plugin::InstanceMethods
+    Utils = Plugin::Utils
+
     ##
     # Continues the stored context with new input and flushes it.
     # @see LLM::Context#talk
     # @return [LLM::Response]
     def talk(...)
-
+      options = self.class.llm_plugin_options
+      ctx.talk(...).tap { Utils.save(self, ctx, options) }
     end
 
     ##
@@ -97,7 +170,8 @@
     # @see LLM::Context#respond
     # @return [LLM::Response]
     def respond(...)
-
+      options = self.class.llm_plugin_options
+      ctx.respond(...).tap { Utils.save(self, ctx, options) }
     end
 
     ##
@@ -173,6 +247,7 @@
     # Returns usage from the mapped usage columns.
     # @return [LLM::Object]
     def usage
+      columns = Utils.columns(self.class.llm_plugin_options)
      LLM::Object.from(
         input_tokens: self[columns[:input_tokens]] || 0,
         output_tokens: self[columns[:output_tokens]] || 0,
@@ -229,11 +304,12 @@
     # @return [LLM::Provider]
     def llm
       options = self.class.llm_plugin_options
+      columns = Utils.columns(options)
       provider = self[columns[:provider_column]]
-      kwargs = resolve_options(options[:provider])
+      kwargs = Utils.resolve_options(self, options[:provider], Plugin::EMPTY_HASH)
       return @llm if @llm
       @llm = LLM.method(provider).call(**kwargs)
-      @llm.tracer = resolve_option(options[:tracer]) if options[:tracer]
+      @llm.tracer = Utils.resolve_option(self, options[:tracer]) if options[:tracer]
       @llm
     end
 
@@ -244,7 +320,8 @@
     def ctx
       @ctx ||= begin
         options = self.class.llm_plugin_options
-
+        columns = Utils.columns(options)
+        params = Utils.resolve_options(self, options[:context], Plugin::EMPTY_HASH).dup
         params[:model] ||= self[columns[:model_column]]
         ctx = LLM::Context.new(llm, params.compact)
         data = self[columns[:data_column]]
@@ -259,60 +336,5 @@
       end
     end
     end
-
-    ##
-    # @return [void]
-    def flush
-      options = self.class.llm_plugin_options
-      update({
-        columns[:data_column] => serialize_context(options[:format]),
-        columns[:input_tokens] => ctx.usage.input_tokens,
-        columns[:output_tokens] => ctx.usage.output_tokens,
-        columns[:total_tokens] => ctx.usage.total_tokens
-      })
-    end
-
-    ##
-    # @return [Hash]
-    def resolve_option(option)
-      case option
-      when Proc then instance_exec(&option)
-      when Symbol then send(option)
-      when Hash then option.dup
-      else option
-      end
-    end
-
-    ##
-    # @return [Hash]
-    def resolve_options(option)
-      case option
-      when Proc, Symbol, Hash then resolve_option(option)
-      else Plugin::EMPTY_HASH.dup
-      end
-    end
-
-    def serialize_context(format)
-      case format
-      when :string then ctx.to_json
-      when :json, :jsonb then ctx.to_h
-      else raise ArgumentError, "Unknown format: #{format.inspect}"
-      end
-    end
-
-    def columns
-      @columns ||= begin
-        options = self.class.llm_plugin_options
-        usage_columns = options[:usage_columns]
-        {
-          provider_column: options[:provider_column],
-          model_column: options[:model_column],
-          data_column: options[:data_column],
-          input_tokens: usage_columns[:input_tokens],
-          output_tokens: usage_columns[:output_tokens],
-          total_tokens: usage_columns[:total_tokens]
-        }.freeze
-      end
-    end
   end
 end
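With `flush` removed from the public surface, persistence now rides along every `talk`/`respond` call via `Utils.save`. A standalone restatement of that round trip, with illustrative stand-ins for the context and row (the `Fake*` classes are assumptions for the sketch, not part of the gem):

```ruby
# Same flow as the new talk/respond wrappers: run the turn, then write the
# serialized context plus usage counters back to the row in one update.
Usage = Struct.new(:input_tokens, :output_tokens, :total_tokens)
FakeCtx = Struct.new(:usage) do
  def to_json(*) = %({"messages":[]})
end
FakeRow = Struct.new(:attrs) do
  def update(hash) = attrs.merge!(hash)
end

def save(row, ctx, data_column: :data)
  row.update(
    data_column => ctx.to_json, # :string format; :json/:jsonb store a Hash
    :input_tokens => ctx.usage.input_tokens,
    :output_tokens => ctx.usage.output_tokens,
    :total_tokens => ctx.usage.total_tokens
  )
end

row = FakeRow.new({})
save(row, FakeCtx.new(Usage.new(10, 5, 15)))
row.attrs # => {:data=>"{\"messages\":[]}", :input_tokens=>10, :output_tokens=>5, :total_tokens=>15}
```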
data/lib/llm/skill.rb
CHANGED

@@ -45,6 +45,10 @@ module LLM
     # @return [Array<Class<LLM::Tool>>]
     attr_reader :tools
 
+    ##
+    # @param [String] path
+    #   The path to a directory
+    # @return [LLM::Skill]
     def initialize(path)
       @path = path.to_s
       @name = ::File.basename(@path)
@@ -65,40 +69,51 @@
 
     ##
     # Execute the skill by wrapping it in a small agent with the skill
-    # instructions. The
-    #
-    # @param [
+    # instructions. The context is bound explicitly by the caller so the
+    # nested agent can inherit context-level behavior such as streaming.
+    # @param [LLM::Context] ctx
     # @return [Hash]
-    def call(
-      instructions = self.instructions
-
+    def call(ctx)
+      instructions, tools = self.instructions, self.tools
+      params = ctx.params.merge(mode: ctx.mode).reject { [:tools, :schema].include?(_1) }
       agent = Class.new(LLM::Agent) do
-        instructions
+        instructions(instructions)
         tools(*tools)
-      end.new(llm)
-
+      end.new(ctx.llm, params)
+      agent.messages.concat(messages_for(ctx))
+      res = agent.talk("Solve the user's query.")
       {content: res.content}
     end
 
     ##
-    # Expose the skill as a normal LLM::Tool. The
+    # Expose the skill as a normal LLM::Tool. The context is bound explicitly
     # when the tool class is built.
-    # @param [LLM::
+    # @param [LLM::Context] ctx
     # @return [Class<LLM::Tool>]
-    def to_tool(
+    def to_tool(ctx)
       skill = self
       Class.new(LLM::Tool) do
         name skill.name
         description skill.description
 
-        define_method(:call) do
-          skill.call(
+        define_method(:call) do
+          skill.call(ctx)
         end
       end
     end
 
     private
 
+    def messages_for(ctx)
+      messages = ctx.messages
+        .to_a
+        .select { _1.user? || _1.assistant? }
+        .reject { _1.tool_call? || _1.tool_return? }
+        .last(8)
+      return messages if messages.empty?
+      [LLM::Message.new(:user, "Recent context:"), *messages]
+    end
+
     def parse(content)
       match = content.match(/\A---\s*\n(.*?)\n---\s*\n?(.*)\z/m)
       unless match
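The context slice a skill inherits is precisely bounded: only user and assistant messages, at most the last eight, prefixed with the `Recent context:` marker. A standalone restatement of `messages_for` from the diff above, over plain structs:

```ruby
# Same selection logic as Skill#messages_for in the diff above.
Turn = Struct.new(:role, :content) do
  def user? = role == :user
  def assistant? = role == :assistant
  def tool_call? = false
  def tool_return? = false
end

def messages_for(history)
  messages = history
    .select { _1.user? || _1.assistant? }
    .reject { _1.tool_call? || _1.tool_return? }
    .last(8)
  return messages if messages.empty?
  [Turn.new(:user, "Recent context:"), *messages]
end

history = (1..10).map { |i| Turn.new(i.odd? ? :user : :assistant, "turn #{i}") }
messages_for(history).map(&:content)
# => ["Recent context:", "turn 3", "turn 4", ..., "turn 10"]
```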
data/lib/llm/version.rb
CHANGED
data/llm.gemspec
CHANGED

@@ -54,4 +54,7 @@ Gem::Specification.new do |spec|
   spec.add_development_dependency "net-http-persistent", "~> 4.0"
   spec.add_development_dependency "opentelemetry-sdk", "~> 1.10"
   spec.add_development_dependency "logger", "~> 1.7"
+  spec.add_development_dependency "activerecord", "~> 8.0"
+  spec.add_development_dependency "sequel", "~> 5.0"
+  spec.add_development_dependency "sqlite3", "~> 2.0"
 end
metadata
CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: llm.rb
 version: !ruby/object:Gem::Version
-  version: 4.
+  version: 4.22.0
 platform: ruby
 authors:
 - Antar Azri
@@ -194,6 +194,48 @@ dependencies:
   - - "~>"
     - !ruby/object:Gem::Version
       version: '1.7'
+- !ruby/object:Gem::Dependency
+  name: activerecord
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '8.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '8.0'
+- !ruby/object:Gem::Dependency
+  name: sequel
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '5.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '5.0'
+- !ruby/object:Gem::Dependency
+  name: sqlite3
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '2.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '2.0'
 description: |
   llm.rb is a lightweight runtime for building capable AI systems in Ruby.
   It is not just an API wrapper. llm.rb gives you one runtime for providers,