RubyGems - llm.rb - Versions diffs - 11.3.1 → 12.0.0 - Mend

llm.rb 11.3.1 → 12.0.0

Files changed (57)

checksums.yaml +4 -4
data/CHANGELOG.md +242 -1
data/LICENSE +92 -17
data/README.md +204 -623
data/data/anthropic.json +433 -249
data/data/bedrock.json +2097 -1055
data/data/deepinfra.json +993 -0
data/data/deepseek.json +53 -28
data/data/google.json +389 -771
data/data/openai.json +1053 -771
data/data/xai.json +133 -292
data/data/zai.json +249 -141
data/lib/llm/active_record/acts_as_agent.rb +3 -41
data/lib/llm/active_record/acts_as_llm.rb +18 -0
data/lib/llm/active_record.rb +3 -3
data/lib/llm/context.rb +9 -5
data/lib/llm/contract/completion.rb +2 -2
data/lib/llm/provider.rb +2 -2
data/lib/llm/providers/deepinfra/audio.rb +66 -0
data/lib/llm/providers/deepinfra/images.rb +90 -0
data/lib/llm/providers/deepinfra/response_adapter.rb +36 -0
data/lib/llm/providers/deepinfra.rb +100 -0
data/lib/llm/providers/deepseek/images.rb +109 -0
data/lib/llm/providers/deepseek/request_adapter.rb +32 -0
data/lib/llm/providers/deepseek/response_adapter/image.rb +9 -0
data/lib/llm/providers/deepseek/response_adapter.rb +29 -0
data/lib/llm/providers/deepseek.rb +4 -2
data/lib/llm/providers/google/request_adapter.rb +22 -5
data/lib/llm/providers/google.rb +4 -4
data/lib/llm/providers/openai/audio.rb +6 -2
data/lib/llm/providers/openai/images.rb +9 -50
data/lib/llm/providers/openai/request_adapter/respond.rb +38 -4
data/lib/llm/providers/openai/response_adapter/audio.rb +5 -1
data/lib/llm/providers/openai/response_adapter/completion.rb +1 -1
data/lib/llm/providers/openai/response_adapter/image.rb +0 -4
data/lib/llm/providers/openai/responses.rb +1 -0
data/lib/llm/providers/openai/stream_parser.rb +5 -6
data/lib/llm/providers/openai.rb +2 -2
data/lib/llm/providers/xai/images.rb +49 -26
data/lib/llm/providers/xai.rb +2 -2
data/lib/llm/response.rb +10 -0
data/lib/llm/schema/leaf.rb +7 -1
data/lib/llm/schema/renderer.rb +121 -0
data/lib/llm/schema.rb +30 -0
data/lib/llm/sequel/agent.rb +2 -43
data/lib/llm/sequel/plugin.rb +25 -7
data/lib/llm/tracer/telemetry.rb +4 -6
data/lib/llm/tracer.rb +9 -21
data/lib/llm/transport/execution.rb +16 -1
data/lib/llm/transport/net_http_adapter.rb +1 -1
data/lib/llm/uridata.rb +16 -0
data/lib/llm/version.rb +1 -1
data/lib/llm.rb +9 -0
data/llm.gemspec +5 -18
data/resources/deepdive.md +798 -264
metadata +15 -18
data/lib/llm/tracer/langsmith.rb +0 -144

data/README.md CHANGED

@@ -12,113 +12,75 @@
 > A [r.uby.dev](https://r.uby.dev) project.
-Ruby's capable AI runtime.
+Welcome to the canonical llm.rb repository.
-It provides one Ruby interface for building with large language models:
-providers, agents, tools, skills, MCP, A2A, RAG, streaming, files, and
-persisted conversation state all share the same runtime.
+llm.rb is not a library, framework or toolkit but an advanced runtime
+for building highly capable AI applications on CRuby. By default
+it has zero runtime dependencies, although certain functionality &ndash;
+such as ActiveRecord support &ndash; requires optional dependencies
+that are opt-in.
-The gem runs on Ruby's standard library by default and loads optional
-integrations only when needed. It supports OpenAI, OpenAI-compatible
-endpoints, Anthropic, Google Gemini, DeepSeek, xAI, Z.ai, AWS Bedrock,
-Ollama, and llama.cpp, with built-in ActiveRecord and Sequel support.
+## Features
-## Services
+The runtime supports OpenAI, OpenAI-compatible endpoints, Anthropic, Google
+Gemini, DeepSeek, DeepInfra, xAI, Z.ai, AWS Bedrock, Ollama, and llama.cpp.
+It has first-class support for streaming, tool calls, MCP
+and A2A, embeddings, vector stores and the RAG pattern.
-llm.rb is a [r.uby.dev](https://r.uby.dev) project
-that is part of a growing family of AI-related
-projects that also includes publically accessible
-SSH services.
+There are multiple HTTP backends to choose from, tools can be run concurrently
+or in parallel via threads, async tasks, fibers, ractors, and fork, and it is
+also possible to make a tool call while the model is still streaming.
-#### matz - the mruby expert
+The runtime builds on top of three core concepts: providers, contexts, and agents,
+so once you learn the fundamentals, everything else falls into place naturally. And once
+you learn llm.rb, you will also be able to use <a href="https://r.uby.dev/mruby-llm">mruby-llm</a> and
+<a href="https://r.uby.dev/wasm-llm">wasm-llm</a> because the API is pretty much identical.
-> ssh matz@r.uby.dev
+## Install
-See [https://r.uby.dev/matz](https://r.uby.dev/matz) for more information.
-#### robert - the freebsd expert
-> ssh robert@4.4bsd.dev
-See [https://4.4bsd.dev/robert](https://4.4bsd.dev/robert) for more information.
+```bash
+gem install llm.rb
+```
 ## Quick start
-#### LLM::Context
-The
-[LLM::Context](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html)
-object is at the heart of the runtime. Almost all other features build
-on top of it. It is a low-level interface to a model, and requires tool
-execution to be managed manually. The
-[LLM::Agent](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
-class is almost the same as
-[LLM::Context](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html)
-but it manages tool execution for you - we'll cover agents next:
-```ruby
-require "llm"
-llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: $stdout)
-ctx.talk "Hello world"
-```
 #### LLM::Agent
-The
-[LLM::Agent](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
-object is implemented on top of
-[LLM::Context](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html).
-It provides the same interface, but manages tool execution for you. It
-also has builtin features such as a loop guard that detects repeated
-tool call patterns, and another guard that detects infinite tool call
-loops. Both guards advise the model to change course rather than raise
-an error:
+The [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html) class is the default high-level interface,
+and it is recommended for most use-cases. It manages tool execution
+automatically, guards against infinite loops, manages conversation
+state, and much more.
 ```ruby
 require "llm"
-llm = LLM.openai(key: ENV["KEY"])
+llm = LLM.deepseek(key: ENV["KEY"])
 agent = LLM::Agent.new(llm, stream: $stdout)
 agent.talk "Hello world"
 ```
-#### Agents (Advanced)
+#### LLM::Context
-An agent can be configured to require confirmation before a tool is
-executed. When a matching tool is called, llm.rb runs
-`on_tool_confirmation`. That callback must decide whether to cancel the
-tool call or approve it and execute it by calling
-`fn.spawn(strategy).wait`, and it must always return an instance of
-[`LLM::Function::Return`](https://r.uby.dev/api-docs/llm.rb/LLM/Function/Return.html):
+The [`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html) class is at the heart of the runtime
+and it is what [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html) uses under the hood.
+It requires that the tool call loop be managed manually -
+sometimes that can be useful, but usually for advanced use-cases.
+If you're new to llm.rb, try [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html) first.
 ```ruby
 require "llm"
-class Agent < LLM::Agent
-  tools DeleteFile
-  confirm "delete-file"
-  def on_tool_confirmation(fn, strategy)
-    path = fn.arguments.path
-    if path.start_with?("/tmp/")
-      fn.spawn(strategy).wait
-    else
-      fn.cancel(reason: "Deletion requires approval")
-    end
-  end
-end
-llm = LLM.openai(key: ENV["KEY"])
-Agent.new(llm, stream: $stdout).talk("Delete /tmp/example.txt.")
+llm = LLM.deepseek(key: ENV["KEY"])
+ctx = LLM::Context.new(llm, stream: $stdout)
+ctx.talk "Hello world"
 ```
-#### Tools
+#### LLM::Tool
-The
-[LLM::Tool](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html)
-class can be subclassed to implement your own tools that can extend the
-abilities of a model:
+Subclasses of [`LLM::Tool`](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html) are plain Ruby classes with
+an optional set of typed parameters. <br> The model can choose to
+call them on your behalf, and they're one of the most powerful features
+for extending the feature set or abilities of a model.
 ```ruby
 class ReadFile < LLM::Tool
@@ -128,629 +90,248 @@ class ReadFile < LLM::Tool
   required %i[path]
   def call(path:)
-    { contents: File.read(path) }
+    {contents: File.read(path)}
   end
 end
 ```
-#### MCP
-The
-[LLM::MCP](https://r.uby.dev/api-docs/llm.rb/LLM/MCP.html)
-object lets llm.rb use tools provided by an MCP server. Those tools are
-exposed through the same runtime as local tools, so you can pass them
-to either
-[LLM::Context](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html)
-or
-[LLM::Agent](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html).
-In this example, the MCP server runs over stdio and
-[LLM::Agent](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
-manages the tool loop. For **stdio**, `mcp.session` is the preferred
-pattern because it keeps one MCP session alive across discovery and
-tool calls:
-```ruby
-require "llm"
-llm = LLM.openai(key: ENV["KEY"])
-mcp = LLM::MCP.stdio(argv: ["ruby", "server.rb"])
-mcp.session do
-  agent = LLM::Agent.new(llm, stream: $stdout, tools: mcp.tools)
-  agent.talk "Use the available tools to inspect the environment."
-end
-```
-MCP can also be used without `session`. Although it works it is generally
-not recommended for the **stdio** transport because it is inefficient
-to start and stop a fresh MCP process for tool discovery and each tool
-call:
-```ruby
-require "llm"
-llm = LLM.openai(key: ENV["KEY"])
-mcp = LLM::MCP.stdio(argv: ["ruby", "server.rb"])
-agent = LLM::Agent.new(llm, tools: mcp.tools)
-agent.talk("Use the available tools to inspect the environment.")
-```
-The HTTP transport can be used with or without the `session` method,
-and unlike the stdio transport it can remain efficient without the
-`session` method through a persistent connection pool that is available
-through the
-[LLM::Transport.net_http_persistent](https://r.uby.dev/api-docs/llm.rb/LLM/Transport.html#method-c-net_http_persistent)
-transport:
-```ruby
-require "llm"
-llm = LLM.openai(key: ENV["KEY"])
-mcp = LLM::MCP.http(
-  url: "https://remote-mcp.example.com",
-  transport: :net_http_persistent
-)
-agent = LLM::Agent.new(llm, tools: mcp.tools)
-agent.talk("Use the available tools to inspect the environment.")
-```
-#### A2A (Agent 2 Agent)
-The
-[LLM::A2A](https://r.uby.dev/api-docs/llm.rb/LLM/A2A.html)
-object lets llm.rb use skills provided by a remote A2A agent. Those
-skills are exposed through the same runtime as local tools, so you can
-pass them to either
-[LLM::Context](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html)
-or
-[LLM::Agent](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html).
-Use remote skills as local tools:
-```ruby
-require "llm"
-a2a = LLM::A2A.rest(
-  url: "https://remote-agent.example.com",
-  headers: { "Authorization" => "Bearer token" }
-)
-llm = LLM.openai(key: ENV["KEY"])
-agent = LLM::Agent.new(llm, tools: a2a.skills)
-agent.talk "Analyze this CSV and summarize the trends."
-```
-Use persistent HTTP connections:
-```ruby
-require "llm"
-a2a = LLM::A2A.rest(
-  url: "https://remote-agent.example.com",
-  transport: :net_http_persistent
-)
-```
-For more on direct messaging, task operations, push notification
-configs, and JSON-RPC, see the
-[LLM::A2A API docs](https://r.uby.dev/api-docs/llm.rb/LLM/A2A.html).
-#### Transports
-Providers use Ruby's standard library Net::HTTP transport by default.
-You can opt into persistent Net::HTTP connections with `persistent: true`,
-or provide a transport shortcut when you want a different backend.
-`transport: :curb` uses libcurl through the optional `curb` gem.
-Custom transports can implement the
-[LLM::Transport](https://r.uby.dev/api-docs/llm.rb/LLM/Transport.html)
-interface and receive transport-agnostic
-[LLM::Transport::Request](https://r.uby.dev/api-docs/llm.rb/LLM/Transport/Request.html)
-objects from providers.
-```ruby
-require "llm"
-llm = LLM.openai(key: ENV["KEY"], persistent: true)
-llm = LLM.openai(key: ENV["KEY"], transport: :net_http_persistent)
-llm = LLM.openai(key: ENV["KEY"], transport: :curb)
-```
-#### Skills
-Skills are reusable instructions loaded from a `SKILL.md` directory. They let
-you package behavior and tool access together, and they plug into the
-same runtime as tools, agents, MCP, and A2A. When a skill runs, llm.rb
-spawns a subagent with the skill instructions, access to only the tools
-listed in the skill, and recent conversation context:
-```yaml
----
-name: release
-description: Prepare a release
-tools: ["search-docs", "git"]
----
-## Task
-Review the release state, summarize what changed, and prepare the release.
-```
-```ruby
-require "llm"
-class ReleaseAgent < LLM::Agent
-  model "gpt-5.4-mini"
-  skills "./skills/release"
-end
-llm = LLM.openai(key: ENV["KEY"])
-ReleaseAgent.new(llm, stream: $stdout).talk("Prepare the next release.")
-```
-A skill can also have its sub-agent inherit the parents tools through the
-`inherit` directive. The `inherit` directive has coverage for the "classic"
-tools (a subclass of [LLM::Tool](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html)),
-MCP tools, and A2A tools that a parent context or agent has access to:
-```yaml
----
-  name: release
-  description: Prepare a release
-  tools: inherit
----
-```
 #### LLM::Stream
-The
-[LLM::Stream](https://r.uby.dev/api-docs/llm.rb/LLM/Stream.html)
-object lets you observe output and runtime events as they happen. You
-can subclass it to handle streamed content in your own application:
-```ruby
-require "llm"
-class Stream < LLM::Stream
-  def on_content(content)
-    $stdout << content
-  end
-end
-llm = LLM.openai(key: ENV["KEY"])
-agent = LLM::Agent.new(llm, stream: Stream.new)
-agent.talk "Write a haiku about Ruby."
-```
-#### LLM::Stream (advanced)
-The
-[LLM::Stream](https://r.uby.dev/api-docs/llm.rb/LLM/Stream.html)
-object can also resolve tool calls while output is still streaming. In
-`on_tool_call`, you can spawn the tool, push the work onto the stream
-queue, and later drain it with `wait`:
+Streams can be simple IO objects or subclasses of
+[`LLM::Stream`](https://r.uby.dev/api-docs/llm.rb/LLM/Stream.html) with structured callbacks for content,
+reasoning, tool calls, tool returns, and compaction.
 ```ruby
-require "llm"
-class Stream < LLM::Stream
+class MyStream < LLM::Stream
   def on_content(content)
-    $stdout << content
+    print content
   end
-  def on_tool_call(tool, error)
-    return queue << error if error
-    queue << ctx.spawn(tool, :thread)
+  def on_reasoning_content(content)
+    warn content
   end
 end
-llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: Stream.new, tools: [ReadFile])
-ctx.talk "Read README.md and summarize the quick start."
-ctx.talk(ctx.wait) while ctx.functions?
-```
-#### Concurrency
-llm.rb can run tool work concurrently. This is useful when a model calls
-multiple tools and you want to resolve them in parallel instead of one
-at a time. On
-[LLM::Agent](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html),
-you can enable this with `concurrency`. Common options are `:call` for
-sequential execution, `:thread`, or `:task` for concurrent IO-bound work, and
-`:ractor` or `:fork` for more isolated CPU-bound work:
-```ruby
-require "llm"
-class Agent < LLM::Agent
-  model "gpt-5.4-mini"
-  tools ReadFile
-  concurrency :thread
-end
-llm = LLM.openai(key: ENV["KEY"])
-agent = Agent.new(llm, stream: $stdout)
-agent.talk "Read README.md and CHANGELOG.md and compare them."
-```
-#### Serialization
-The [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
-object can be serialized to JSON, which makes it suitable for storing
-in a file, a database column, or a Redis queue. The built-in
-ActiveRecord and Sequel plugins are built on top of the same underlying
-serialization feature:
-```ruby
-require "llm"
-llm = LLM.openai(key: ENV["KEY"])
-# Serialize an agent
-agent1 = LLM::Agent.new(llm)
-agent1.talk "Remember that my favorite language is Ruby"
-string = agent1.to_json
-# Restore an agent (from JSON)
-agent2 = LLM::Agent.new(llm, stream: $stdout)
-agent2.restore(string:)
-agent2.talk "What is my favorite language?"
-```
-#### ask
-[`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
-also provides `ask`, a convenience interface that is compatible with
-RubyLLM's `ask` method. It accepts a prompt, an optional `with:`
-attachment path or paths, an optional `stream:` target, and an optional
-block that chunks are yielded to. It returns an
-[`LLM::Response`](https://r.uby.dev/api-docs/llm.rb/LLM/Response.html),
-so use `.content` when you want the text directly:
-```ruby
-require "llm"
-llm = LLM.openai(key: ENV["KEY"])
-agent = LLM::Agent.new(llm)
-puts agent.ask("Hello world").content
-puts agent.ask("Summarize this document.", with: "README.md").content
-agent.ask("Stream this reply.") { $stdout << _1 }
-```
-## Installation
-```bash
-gem install llm.rb
+llm = LLM.deepseek(key: ENV["KEY"])
+agent = LLM::Agent.new(llm, stream: MyStream.new)
+agent.talk "Explain Ruby fibers."
 ```
-## Examples
+#### LLM::MCP
-#### REPL
-This example uses [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
-for an interactive REPL. <br> See the
-[deepdive (web)](https://r.uby.dev/llm/) or
-[deepdive (markdown)](resources/deepdive.md) for more examples.
+The Model Context Protocol (MCP) has first-class support
+in llm.rb. The stdio and http transports work out of the
+box. MCP tools are translated into subclasses of
+[`LLM::Tool`](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html) that can be used with [`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html)
+or [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html).
 ```ruby
 require "llm"
-llm = LLM.openai(key: ENV["KEY"])
-agent = LLM::Agent.new(llm, stream: $stdout)
-loop do
-  print "> "
-  agent.talk(STDIN.gets || break)
-  puts
-end
+llm   = LLM.deepseek(key: ENV["KEY"])
+mcp   = LLM::MCP.stdio(argv: ["ruby", "server.rb"])
+agent = LLM::Agent.new(llm, stream: $stdout, tools: mcp.tools)
+agent.talk "Run the tool"
 ```
-#### Multimodal: Local Files
-In llm.rb, a prompt can be a string, an [`LLM::Prompt`](https://r.uby.dev/api-docs/llm.rb/LLM/Prompt.html), or an array.
-When you use an array, each element can be plain text or a tagged object such as
-[`agent.image_url(...)`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html#image_url-instance_method),
-[`agent.local_file(...)`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html#local_file-instance_method),
-or [`agent.remote_file(...)`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html#remote_file-instance_method).
-Those tagged objects carry the metadata the provider adapter needs to turn one
-Ruby prompt into the provider-specific multimodal request schema.
+#### LLM::A2A
-If the model understands that file type, you can attach a local file directly
-with `agent.ask(..., with: path)` instead of uploading it first through a
-provider Files API. Under the hood, llm.rb tags the path as a
-[`agent.local_file(...)`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html#local_file-instance_method)
-object:
+The Agent 2 Agent (A2A) protocol has first-class support
+in llm.rb. The http and jsonrpc transports work out of the
+box. A2A skills are translated into subclasses of
+[`LLM::Tool`](https://r.uby.dev/api-docs/llm.rb/LLM/Tool.html) that can be used with [`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html)
+or [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html).
 ```ruby
 require "llm"
-llm = LLM.openai(key: ENV["KEY"])
-agent = LLM::Agent.new(llm)
-puts agent.ask("Summarize this document.", with: "README.md").content
+llm   = LLM.deepseek(key: ENV["KEY"])
+a2a   = LLM::A2A.rest(url: "https://remote-agent.example.com")
+agent = LLM::Agent.new(llm, stream: $stdout, tools: a2a.skills)
+agent.talk "Run the skill"
 ```
-#### Context Compaction
-This example uses [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html),
-[`LLM::Compactor`](https://r.uby.dev/api-docs/llm.rb/LLM/Compactor.html), and
-[`LLM::Stream`](https://r.uby.dev/api-docs/llm.rb/LLM/Stream.html) together so
-long-lived conversations can summarize older history and expose the lifecycle
-through stream hooks. This approach is inspired by General Intelligence
-Systems. The
-compactor can also use its own `model:` if you want summarization to run on a
-different model from the main conversation. `token_threshold:` accepts either a
-fixed token count or a percentage string like `"90%"`, which resolves
-against the active model context window and triggers compaction once total
-token usage goes over that percentage. See the
-[deepdive (web)](https://r.uby.dev/llm/) or
-[deepdive (markdown)](resources/deepdive.md) for more examples.
+#### RAG
-```ruby
-require "llm"
-class Stream < LLM::Stream
-  def on_compaction(ctx, compactor)
-    puts "Compacting #{ctx.messages.size} messages..."
-  end
+Most providers offer an embedding model that can be
+used for semantic search, or similarity search. An
+embedding model can generate embeddings that can then
+be stored in a database that is optimized for storing
+and querying vectors, such as SQLite's [sqlite-vec](https://github.com/asg017/sqlite-vec)
+or PostgreSQL's [pgvector](https://github.com/pgvector/pgvector).
-  def on_compaction_finish(ctx, compactor)
-    puts "Compacted to #{ctx.messages.size} messages."
-  end
-end
-llm = LLM.openai(key: ENV["KEY"])
-agent = LLM::Agent.new(
-  llm,
-  stream: Stream.new,
-  compactor: {
-    token_threshold: "90%",
-    retention_window: 8,
-    model: "gpt-5.4-mini"
-  }
-)
-```
-#### Reasoning
-This example uses [`LLM::Stream`](https://r.uby.dev/api-docs/llm.rb/LLM/Stream.html)
-with the OpenAI Responses API so reasoning output is streamed separately from
-visible assistant output. See the
-[deepdive (web)](https://r.uby.dev/llm/) or
-[deepdive (markdown)](resources/deepdive.md) for more examples.
-To use the Responses API (OpenAI-specific), initialize an agent with
-`mode: :responses` and keep using `talk` for turns.
+llm.rb also includes support for OpenAI's vector store API. It
+provides a vector database as an HTTP service but we won't cover
+that here.
 ```ruby
 require "llm"
-class Stream < LLM::Stream
-  def on_content(content)
-    $stdout << content
-  end
+llm  = LLM.openai(key: ENV["KEY"])
+body = "llm.rb is Ruby's capable AI runtime."
+embedding = llm.embed([body]).embeddings.first
-  def on_reasoning_content(content)
-    $stderr << content
-  end
-end
-llm = LLM.openai(key: ENV["KEY"])
-agent = LLM::Agent.new(
-  llm,
-  model: "gpt-5.4-mini",
-  mode: :responses,
-  reasoning: { effort: "medium" },
-  stream: Stream.new
+Document.create!(
+  title: "llm.rb",
+  body:,
+  embedding:,
 )
-agent.talk("Solve 17 * 19 and show your work.")
 ```
-#### Request Cancellation
-Need to cancel a stream? llm.rb has you covered through
-[`LLM::Agent#interrupt!`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html#interrupt-21-instance_method).
-<br> See the [deepdive (web)](https://r.uby.dev/llm/)
-or [deepdive (markdown)](resources/deepdive.md) for more examples.
-```ruby
-require "llm"
-require "io/console"
-llm = LLM.openai(key: ENV["KEY"])
-agent = LLM::Agent.new(llm, stream: $stdout)
-worker = Thread.new do
-  agent.talk("Write a very long essay about network protocols.")
-rescue LLM::Interrupt
-  puts "Request was interrupted!"
-end
-STDIN.getch
-agent.interrupt!
-worker.join
-```
+#### Concurrency
-#### Sequel (ORM)
+The runtime supports five different concurrency strategies that have
+different attributes. The choice between all of them often depends
+on the requirements of your application.
-The `plugin :llm` integration wraps
-[`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html) on a
-`Sequel::Model` and keeps tool execution explicit. Like the ActiveRecord
-wrappers, its built-in persistence contract is the serialized `data` column,
-while `provider:` resolves a real `LLM::Provider` instance and `context:`
-injects defaults such as `model:`. <br> See the
-[deepdive (web)](https://r.uby.dev/llm/) or
-[deepdive (markdown)](resources/deepdive.md) for more examples.
+IO-bound tools are a good fit for the `:task`, `:thread`,
+and `:fiber` strategies while true parallelism can be achieved
+with the `:fork` and `:ractor` strategies. The
+`:fork` strategy also provides a separate process that offers
+isolation from its parent.
 ```ruby
 require "llm"
-require "net/http/persistent"
-require "sequel"
-require "sequel/plugins/llm"
-class Context < Sequel::Model
-  plugin :llm, provider: :set_provider, context: :set_context
-  private
-  def set_provider
-    LLM.openai(key: ENV["OPENAI_SECRET"], persistent: true)
-  end
-  def set_context
-    { model: "gpt-5.4-mini", mode: :responses, store: false }
-  end
-end
-ctx = Context.create
-ctx.talk("Remember that my favorite language is Ruby")
-puts ctx.talk("What is my favorite language?").content
+llm   = LLM.deepseek(key: ENV["KEY"])
+tools = [FetchNews, FetchStocks, FetchFeeds]
+agent = LLM::Agent.new(llm, tools:, concurrency: :fork)
+agent.talk "Run the tools in parallel"
 ```
-#### ActiveRecord (ORM): acts_as_llm
+#### ORM
-The `acts_as_llm` method wraps [`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html) and
-provides full control over tool execution. Its built-in persistence contract is
-one serialized `data` column. If your app has provider, model, or usage
-columns, provide them to llm.rb through `provider:` and `context:` instead of
-relying on reserved wrapper columns.
+Because both [`LLM::Context`](https://r.uby.dev/api-docs/llm.rb/LLM/Context.html) and [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html)
+can be serialized to JSON and stored in a simple string, both ActiveRecord
+and Sequel support can be implemented within a single column on a single row.
-See the [deepdive (web)](https://r.uby.dev/llm/)
-or [deepdive (markdown)](resources/deepdive.md) for more examples.
+The runtime includes first-class support for both ActiveRecord *and* Sequel, and
+for both Rack-based applications *and* Rails-based applications. On databases
+where it is supported, such as PostgreSQL, the column can be optimized by using
+the `jsonb` type.
 ```ruby
-require "llm"
 require "active_record"
-require "llm/active_record"
-class Context < ApplicationRecord
-  acts_as_llm provider: :set_provider, context: :set_context
-  private
-  def set_provider
-    LLM.openai(key: ENV["OPENAI_SECRET"])
-  end
-  def set_context
-    { model: "gpt-5.4-mini", mode: :responses, store: false }
-  end
-end
-ctx = Context.create!
-ctx.talk("Remember that my favorite language is Ruby")
-puts ctx.talk("What is my favorite language?").content
-```
-```ruby
 require "llm"
-require "active_record"
 require "llm/active_record"
-class Context < ApplicationRecord
-  acts_as_llm provider: :set_provider, context: :set_context
-  # Optional application columns can still provide the provider and context.
-  # For example, `provider_name` and `model_name` can be normal columns.
-  private
-  def set_provider
-    LLM.public_send(provider_name, key: provider_key)
-  end
-  def set_context
-    { model: model_name, mode: :responses, store: false }
+class Agent < ApplicationRecord
+  acts_as_agent do |agent|
+    agent.model "deepseek-v4-pro"
+    agent.instructions "solve the user's query"
+    agent.tools [Research, FinalizeResearch, ActOnResearch]
   end
-end
-```
-#### ActiveRecord (ORM): acts_as_agent
-The `acts_as_agent` method wraps [`LLM::Agent`](https://r.uby.dev/api-docs/llm.rb/LLM/Agent.html) and
-manages tool execution for you. Like `acts_as_llm`, its built-in persistence
-contract is one serialized `data` column. If your app has provider or model
-columns, provide them to llm.rb through your hooks and agent DSL.
-See the [deepdive (web)](https://r.uby.dev/llm/)
-or [deepdive (markdown)](resources/deepdive.md) for more examples.
-```ruby
-require "llm"
-require "active_record"
-require "llm/active_record"
-class Ticket < ApplicationRecord
-  acts_as_agent provider: :set_provider, context: :set_context
-  model "gpt-5.4-mini"
-  instructions "You are a concise support assistant."
-  tools SearchDocs, Escalate
-  concurrency :thread
   private
+  # By convention, this method defines the provider for a model.
+  # If necessary, it can be renamed with: provider: :your_method.
   def set_provider
-    LLM.openai(key: ENV["OPENAI_SECRET"])
+    LLM.deepseek(key: ENV["KEY"])
   end
+  # By convention, this method returns the context options given
+  # to LLM::Context or LLM::Agent.
   def set_context
-    { mode: :responses, store: false }
+    {}
   end
 end
-ticket = Ticket.create!
-puts ticket.talk("How do I rotate my API key?").content
+agent = Agent.create!
+agent.talk "perform research"
 ```
-```ruby
-require "llm"
-require "active_record"
-require "llm/active_record"
+## FAQ
-class Ticket < ApplicationRecord
-  acts_as_agent provider: :set_provider, context: :set_context
-  model "gpt-5.4-mini"
-  instructions "You are a concise support assistant."
+<details>
+<summary>What providers does llm.rb support?</summary>
+<br>
+<p>
+China-based
-  private
+* DeepSeek
+* Z.ai
-  def set_provider
-    LLM.public_send(provider_name, key: provider_key)
-  end
+US-based
-  def set_context
-    { mode: :responses, store: false }
-  end
-end
-```
+* OpenAI
+* Google (Gemini)
+* xAI
+* AWS Bedrock
+* DeepInfra
+* Anthropic
-#### MCP
+Openweights
-This example uses [`LLM::MCP`](https://r.uby.dev/api-docs/llm.rb/LLM/MCP.html)
-over HTTP so remote GitHub MCP tools run through the same
-`LLM::Agent` tool path as local tools. It expects a GitHub token in
-`ENV["GITHUB_PAT"]`. See the
-[deepdive (web)](https://r.uby.dev/llm/) or
-[deepdive (markdown)](resources/deepdive.md) for more examples.
+* DeepSeek
+* Z.ai
+* DeepInfra
+* AWS Bedrock
-```ruby
-require "llm"
-require "net/http/persistent"
+Host your own
-llm = LLM.openai(key: ENV["KEY"], persistent: true)
-mcp = LLM::MCP.http(
-  url: "https://api.githubcopilot.com/mcp/",
-  headers: { "Authorization" => "Bearer " + ENV["GITHUB_PAT"].to_s },
-  persistent: true
-)
+* Ollama
+* llama.cpp
+</p>
+</details>
-agent = LLM::Agent.new(llm, stream: $stdout, tools: mcp.tools)
-agent.talk("Pull information about my GitHub account.")
-```
+<details>
+<summary>I have a limited budget. What should I do?</summary>
+<br>
+<p>
+There are a few options. The first option is to host
+your own model, and use the ollama or llamacpp
+providers. This can be difficult though because
+a capable model requires hardware that can
+match it. If you have the ability to self-host,
+this would be my first option.
+</p>
+<p>
+The second option is DeepSeek. <br>
+The deepseek-v4-flash model costs pennies to use. <br>
+And llm.rb has been optimized for deepseek. For example,
+DeepSeek does not have image generation capabilities
+but on the llm.rb runtime it does (vector graphics only,
+though).
+</p>
+<p>
+The same is true for structured outputs. DeepSeek does
+not support structured outputs in the same way as OpenAI or
+Google, but the llm.rb runtime makes it appear as
+though it does, through the `json_object` response
+type.
+</p>
+If you're on a budget, DeepSeek is hard to beat.
+</details>
+<details>
+<summary>Can I download llm.rb via a decentralized network?</summary>
+<br>
+You can!
+<br>
+We are on the <a href="https://radicle.network">radicle.network</a>
+<br>
+Every commit that lands on GitHub also lands on Radicle.
+<br>
+Our repository ID is z2PtfQ6dYwyYaW2aGrztG1sMyDmCE.
+<br>
+Browse on <a href="https://radicle.network/nodes/iris.radicle.network/z2PtfQ6dYwyYaW2aGrztG1sMyDmCE">the web</a>.
+</details>
+## Resources
+If you like what you've read so far, check out the [deepdive.md](https://r.uby.dev/llm/deepdive/)
+to learn more. Unfortunately it
+wasn't possible to cover every feature without the README becoming a small book.
+The [r.uby.dev](https://r.uby.dev) homepage also includes more learning material
+and resources.
 ## License
-[BSD Zero Clause](https://choosealicense.com/licenses/0bsd/)
+[Business Source License 1.1](./LICENSE)
+<br>
+Commercial production use requires a commercial license.
+<br>
+Each version converts to the [BSD Zero Clause](https://choosealicense.com/licenses/0bsd/)
+four years after its first public release.
 <br>
-See [LICENSE](./LICENSE)
+Contact [robert@r.uby.dev](mailto:robert@r.uby.dev) for a commercial license.
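
Note on the new RAG section in the README diff above: it stores an embedding per document but does not show the query side. The sketch below is not part of llm.rb or of this diff. It is a minimal, standard-library-only illustration of similarity search over stored embeddings, assuming embeddings are plain arrays of floats. The names `cosine_similarity`, `nearest`, and `DOCUMENTS` are hypothetical, and the three-dimensional vectors are made-up stand-ins for real embeddings. In production, a vector database such as sqlite-vec or pgvector would perform this ranking.

```ruby
# Hypothetical helper (not an llm.rb API): cosine similarity between
# two embedding vectors of equal length.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.size == b.size
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  # A zero vector has no direction, so treat it as dissimilar to everything.
  norm.zero? ? 0.0 : dot / norm
end

# Hypothetical in-memory stand-in for a table with an embedding column.
# Real embeddings have hundreds or thousands of dimensions.
DOCUMENTS = [
  {title: "llm.rb",  embedding: [0.9, 0.1, 0.0]},
  {title: "cooking", embedding: [0.0, 0.2, 0.9]},
  {title: "ruby",    embedding: [0.8, 0.3, 0.1]}
]

# Returns the titles of the `limit` documents closest to the query
# embedding, most similar first.
def nearest(query, documents, limit: 2)
  documents
    .max_by(limit) { |doc| cosine_similarity(query, doc[:embedding]) }
    .map { |doc| doc[:title] }
end

p nearest([1.0, 0.0, 0.0], DOCUMENTS)
```

In a real application the query vector would come from the same embedding model that produced the stored vectors, since similarity scores are only meaningful within one model's vector space.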