RubyGems - llm.rb - Versions diffs - 11.1.0 → 11.3.0 - Mend

llm.rb 11.1.0 → 11.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (54)

checksums.yaml +4 -4
data/CHANGELOG.md +141 -12
data/README.md +104 -69
data/lib/llm/a2a/transport/http.rb +9 -8
data/lib/llm/a2a.rb +14 -7
data/lib/llm/agent.rb +31 -7
data/lib/llm/context.rb +20 -6
data/lib/llm/error.rb +4 -0
data/lib/llm/function/array.rb +6 -0
data/lib/llm/function.rb +26 -0
data/lib/llm/json_adapter.rb +8 -2
data/lib/llm/mcp/transport/http.rb +7 -5
data/lib/llm/mcp.rb +6 -7
data/lib/llm/provider.rb +1 -18
data/lib/llm/providers/anthropic/error_handler.rb +2 -0
data/lib/llm/providers/anthropic/files.rb +6 -6
data/lib/llm/providers/anthropic/models.rb +1 -1
data/lib/llm/providers/anthropic.rb +1 -1
data/lib/llm/providers/bedrock/error_handler.rb +1 -1
data/lib/llm/providers/bedrock/models.rb +4 -4
data/lib/llm/providers/bedrock/signature.rb +3 -3
data/lib/llm/providers/bedrock.rb +1 -1
data/lib/llm/providers/google/error_handler.rb +2 -0
data/lib/llm/providers/google/files.rb +5 -5
data/lib/llm/providers/google/images.rb +1 -1
data/lib/llm/providers/google/models.rb +1 -1
data/lib/llm/providers/google.rb +2 -2
data/lib/llm/providers/ollama/error_handler.rb +2 -0
data/lib/llm/providers/ollama/models.rb +1 -1
data/lib/llm/providers/ollama.rb +2 -2
data/lib/llm/providers/openai/audio.rb +3 -3
data/lib/llm/providers/openai/error_handler.rb +2 -0
data/lib/llm/providers/openai/files.rb +5 -5
data/lib/llm/providers/openai/images.rb +3 -3
data/lib/llm/providers/openai/models.rb +1 -1
data/lib/llm/providers/openai/moderations.rb +1 -1
data/lib/llm/providers/openai/responses.rb +3 -3
data/lib/llm/providers/openai/vector_stores.rb +11 -11
data/lib/llm/providers/openai.rb +2 -2
data/lib/llm/skill.rb +1 -1
data/lib/llm/tool.rb +21 -0
data/lib/llm/transport/curb.rb +246 -0
data/lib/llm/transport/execution.rb +1 -1
data/lib/llm/transport/http.rb +9 -4
data/lib/llm/transport/net_http_adapter.rb +61 -0
data/lib/llm/transport/persistent_http.rb +10 -5
data/lib/llm/transport/request.rb +121 -0
data/lib/llm/transport/response/curb.rb +112 -0
data/lib/llm/transport/response.rb +1 -0
data/lib/llm/transport/utils.rb +42 -17
data/lib/llm/transport.rb +17 -45
data/lib/llm/version.rb +1 -1
data/llm.gemspec +6 -5
metadata +25 -8

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 5bb91948d8cfa006f7512dd0a4fa62f90b42360e3f11a57074870470fdc70d3f
-  data.tar.gz: 64b49f633318bc0439252cebca4c3886db4c7676f3cb9a78f4945eefe58b4356
+  metadata.gz: 314712380b36e57b1492cef3850f5c4c2397522b74d3cc913fc0d09a796d8973
+  data.tar.gz: aefda31d90067a0a49ada778c6658243595b6698cc11ecf342a11e26f69ad93b
 SHA512:
-  metadata.gz: c56b48185604b22c44f7b4697da56a5fda8359a69e110392abf2510b47a4dc9aedbe5c6a9e64e733fafb5672c34d4a0833448fab5ca9de52c453c8a906080174
-  data.tar.gz: a2a9241da0e8749569111573451c99c92ed3569fa7cee0c9178b8aa884a72b27a24f18e19566cef8a16c2e460420a392b83d441708da67a571a9e2648d4d81e2
+  metadata.gz: 3a998015696027d232e0865c60ff840d11155206b705443035f6af7dcbb18f52d0e82b019cc82379a7ca919b60e3e50bf4156c8c4388beb8ba47a5d57775354a
+  data.tar.gz: 83faf786980a3307a760aec9698e29129dd34f8a838fa8f596caa60498f029bf3b5b4ac9400dd662c70a67a8c1eb748266d89b777773c8185025c8b8c86754bd
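
The digests above can be checked against a downloaded copy of the gem. A `.gem` file is a tar archive that contains `metadata.gz` and `data.tar.gz`, so once it is unpacked (for example with `tar -xf`), a minimal sketch like the following compares a member's SHA256 digest with the value listed in `checksums.yaml`. The helper name `checksum_matches?` is hypothetical and is not part of RubyGems or llm.rb.

```ruby
require "digest"

# Hypothetical helper (not a RubyGems or llm.rb API): returns true when the
# SHA256 digest of the file at +path+ equals +expected_hex+.
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Example call, assuming metadata.gz was extracted into the current directory:
#   checksum_matches?(
#     "metadata.gz",
#     "314712380b36e57b1492cef3850f5c4c2397522b74d3cc913fc0d09a796d8973"
#   )
```

The same helper works for `data.tar.gz`; the SHA512 entries could be checked the same way with `Digest::SHA512`.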

data/CHANGELOG.md CHANGED

@@ -2,6 +2,135 @@
 ## Unreleased
+## v11.3.0
+Changes since `v11.2.0`.
+This release promotes `LLM::Agent` as the default high-level runtime,
+raises `LLM::NotFoundError` for provider 404 responses, and adds
+Symbol resolution to `LLM::Agent.confirm` and `LLM::Agent.skills` for
+dynamic tool confirmation and skill lists.
+### Add
+* **Raise `LLM::NotFoundError` for provider 404 responses** <br>
+  Raise `LLM::NotFoundError` when a provider returns HTTP 404. One
+  example is calling the embeddings API on DeepSeek
+  (`LLM.deepseek(...).embed(["foobar"])`), which returns 404 because
+  DeepSeek does not implement that endpoint.
+* **Add Symbol resolution to `LLM::Agent.confirm`** <br>
+  When `confirm` receives a single Symbol argument, it stores it
+  as-is instead of converting it to a string array. At initialization
+  time, `resolve_option` resolves the Symbol by calling the method
+  with that name on the agent instance, and the result is converted
+  to strings. This allows dynamic tool confirmation lists:
+      class MyAgent < LLM::Agent
+        confirm :tools_that_need_confirmation
+        def tools_that_need_confirmation
+          some_condition ? %w[delete destroy] : %w[delete]
+        end
+      end
+  Ported from llmrb/mruby-llm@89a232e3 and @2dd04e2d.
+  Extend the same pattern to `LLM::Agent.skills` so the skills DSL
+  accepts a Symbol that resolves through the agent instance at
+  initialization time.
+### Change
+* **Clarify `LLM::Agent` as the default high-level runtime** <br>
+  Document that `LLM::Context` remains at the heart of llm.rb, but
+  `LLM::Agent` is the better default unless an application needs advanced
+  manual tool loops. `LLM::Agent` manages the tool loop for callers and
+  enables guards against runaway or repeated tool-call loops.
+## v11.2.0
+Changes since `v11.1.0`.
+This release adds `LLM::Function#skill?` and `LLM::Tool#skill?` so
+callers can inspect whether a function or tool is backed by a skill.
+It introduces `LLM::Transport::Request` as a transport-agnostic request
+object so providers no longer depend directly on `Net::HTTP` request
+classes, and adds an optional Curb (libcurl) backend alongside symbolic
+transport shortcuts such as `transport: :curb`.
+MCP and A2A clients now accept `persistent: true` matching provider configuration.
+Several fixes land for tool return callback emission, function comparison by
+tool call ID, function array filtering, skill tool inheritance, and JSON generator
+state compatibility on Ruby 4.
+### Add
+* **Add `LLM::Function#skill?`** <br>
+  Add `skill?` to `LLM::Function` so callers can check whether a
+  function is backed by a skill tool.
+* **Add `LLM::Tool.skill?` and `LLM::Tool#skill?`** <br>
+  Add class-level `skill?` and instance-level `skill?` to
+  `LLM::Tool`, matching the existing `mcp?` and `a2a?` pattern.
+* **Add `LLM::Transport::Request`** <br>
+  Add `LLM::Transport::Request` as a transport-agnostic request object
+  and update providers to build requests without depending directly on
+  Net::HTTP request classes. The built-in Net::HTTP transports still
+  accept existing Net::HTTP request objects through a compatibility
+  bridge, while alternative transports can handle the generic request
+  shape directly.
+* **Add optional Curb transport support** <br>
+  Add `LLM::Transport::Curb`, an optional libcurl-backed transport
+  that can be selected with `transport: :curb`. Providers already
+  emit `LLM::Transport::Request` objects, so the Curb backend can
+  execute requests without routing through Net::HTTP.
+* **Add symbolic transport shortcuts** <br>
+  Allow providers, MCP HTTP clients, and A2A HTTP clients to accept
+  transport shortcuts such as `transport: :curb` and
+  `transport: :net_http_persistent`.
+* **Add persistent HTTP selection to MCP and A2A clients** <br>
+  Allow MCP and A2A HTTP clients to accept `persistent: true`, matching
+  provider configuration and selecting the persistent Net::HTTP
+  transport by default.
+### Fix
+* **Support JSON generation state on Ruby 4** <br>
+  Handle JSON generator state objects in the standard JSON adapter so
+  schema objects serialize correctly when Ruby 4 calls custom `to_json`
+  methods during provider request generation.
+* **Emit tool return callbacks for direct context waits** <br>
+  Emit `LLM::Stream#on_tool_return` when `LLM::Context#wait` executes
+  pending tool work directly instead of draining `LLM::Stream::Queue`.
+* **Emit confirmed tool return callbacks once** <br>
+  Emit `LLM::Stream#on_tool_return` for confirmed and cancelled tool
+  calls, and exclude confirmed functions from later waits so mixed
+  confirmed and unconfirmed tool batches do not execute confirmed tools
+  twice.
+* **Compare functions by tool call ID** <br>
+  Add `LLM::Function#==`, `#eql?`, and `#hash` so pending function
+  collections can compare tool calls by provider-assigned ID instead of
+  object identity.
+* **Preserve function array behavior after filtering** <br>
+  Preserve `LLM::Function::Array` behavior when subtracting function
+  arrays so filtered tool batches can still spawn through the normal
+  function array API.
+* **Prevent skills from inheriting skill-backed tools** <br>
+  Exclude skill-backed tools when a skill sub-agent uses `tools:
+  inherit`, preventing skills loaded through a parent context from
+  being recursively exposed to nested skill agents.
 ## v11.1.0
 Changes since `v11.0.0`.
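
The v11.3.0 entry on Symbol resolution for `LLM::Agent.confirm` describes a two-step pattern: the class-level DSL stores a lone Symbol as-is, and the instance resolves it at initialization by calling the method of that name. The following is a self-contained sketch of that pattern only. `SketchAgent` and its internals are hypothetical stand-ins, and apart from the `confirm` and `resolve_option` names taken from the changelog, nothing here is claimed to match llm.rb's actual implementation.

```ruby
# Hypothetical stand-in for an agent base class; not llm.rb's code.
class SketchAgent
  # Class-level DSL. A single Symbol is stored unchanged so it can be
  # resolved later; any other input is normalized to an array of strings.
  def self.confirm(*names)
    return @confirm if names.empty?
    lone_symbol = names.size == 1 && names.first.is_a?(Symbol)
    @confirm = lone_symbol ? names.first : names.flatten.map(&:to_s)
  end

  attr_reader :confirm

  def initialize
    @confirm = resolve_option(self.class.confirm)
  end

  private

  # A Symbol names an instance method: call it, then convert the result
  # to strings. Non-Symbol options are only converted.
  def resolve_option(option)
    value = option.is_a?(Symbol) ? send(option) : option
    Array(value).map(&:to_s)
  end
end

class MyAgent < SketchAgent
  confirm :tools_that_need_confirmation

  def initialize(strict:)
    @strict = strict
    super()
  end

  def tools_that_need_confirmation
    @strict ? %w[delete destroy] : %w[delete]
  end
end
```

Because resolution happens per instance, two agents of the same class can end up with different confirmation lists, which is the dynamic behavior the changelog entry describes.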
@@ -133,13 +262,13 @@ requests outside `#session`, `LLM::Function#def` as a short alias for
 * **Fix context and agent JSON serialization through `LLM.json`** <br>
   Fix `LLM::Context#to_json` and `LLM::Agent#to_json` to serialize
-  through `LLM.json.dump(...)` instead of plain `to_json`.
+  through `LLM.json.dump(...)` instead of plain `to_json`.
 * **Fix block-form ORM agent DSL forwarding** <br>
   Fix block-form `model { ... }`, `tools { ... }`, and
   `schema { ... }` declarations in the ActiveRecord and Sequel agent
   wrappers so persisted agent models configure the internal agent class
-  the same way as `LLM::Agent`.
+  the same way `LLM::Agent` does.
 * **Fix missing `skills` in ORM agent wrappers** <br>
   Fix the ActiveRecord and Sequel agent wrappers to expose `skills`, so
@@ -382,7 +511,7 @@ DSML tool-marker filtering in streamed text.
   blocks that Bedrock rejects.
 * **Suppress Bedrock DSML tool markers in streamed text** <br>
-  Filter `\"<｜DSML｜function_calls\"` markers out of streamed Bedrock
+  Filter `"\u003c\u003cDSML\u003efunction_calls\u003e\u003e"` markers out of streamed Bedrock
   assistant text so tool-call sentinels do not leak into user-visible
   output.
@@ -392,7 +521,7 @@ Changes since `v7.0.0`.
 This release adds Unix-fork concurrency for process-isolated tool
 execution, extends `LLM::Object` with `#merge` and `#delete`, and drops
-Ruby 3.2 support due to segfaults observed with the `:fork` path. It
+Ruby 3.2 support due to a segfault observed with the `:fork` path. It
 promotes `LLM::Pipe` to the top-level namespace and adds
 `persistent: true` on `LLM::MCP.http` for direct persistent transport
 configuration. `LLM::Function#runner` is exposed as public API, agent
@@ -533,7 +662,7 @@ provider usage has been recorded yet.
   buffer API.
 * **Support percentage compaction token thresholds** <br>
-  Let `LLM::Compactor` accept `token_threshold:` values like `\"90%\"` so
+  Let `LLM::Compactor` accept `token_threshold:` values like `"90%"` so
   compaction can trigger at a percentage of the active model context
   window.
@@ -692,7 +821,7 @@ interruption use the active per-call stream correctly.
 * **Refresh provider model metadata** <br>
   Add current DeepSeek and OpenAI model metadata to `data/` and update the
-  Google Gemma model entry to match the current provider naming.
+  Google Gemini model entry to match the current provider naming.
 ### Fix
@@ -1133,12 +1262,12 @@ Changes since `v4.14.0`.
   storage when Sequel JSON typecasting is enabled.
 * **Improve streaming parser performance** <br>
-  In the local replay-based `stream_parser` benchmark versus
-  `v4.14.0` (median of 20 samples, 5000 iterations), plain Ruby is a
+  In the local replay-based `stream_parser` benchmark versus `v4.14.0`
+  (median of 20 samples, 5000 iterations), plain Ruby is a
   small overall win: the generic eventstream path is about 0.4%
   faster, the OpenAI stream parser is about 0.5% faster, and the
   OpenAI Responses parser is about 1.6% faster, with unchanged
-  allocations. Under YJIT on the same benchmark, the generic
+  allocations. Under YJIT on the same benchmark harness, the generic
   eventstream path is about 0.9% faster and the OpenAI stream parser
   is about 0.4% faster, while the OpenAI Responses parser is about
   0.7% slower, also with unchanged allocations.
@@ -1180,7 +1309,7 @@ parallel tool calls can safely share one connection.
 * **Reduce provider streaming allocations** <br>
   Decode streamed provider payloads directly in
   `LLM::Provider::Transport::HTTP` before handing them to provider
-  parsers, which cuts allocation churn and gives a smaller streaming
+  parsers, which cuts allocation churn and gives a small streaming
   speed bump.
 * **Reduce generic SSE parser allocations** <br>
@@ -1316,7 +1445,7 @@ Changes since `v4.9.0`.
 - Add HTTP transport for MCP with `LLM::MCP::Transport::HTTP` for remote servers
 - Add JSON Schema union types (`any_of`, `all_of`, `one_of`) with parser integration
-- Add JSON Schema type array union support (e.g., `\"type\": [\"object\", \"null\"]`)
+- Add JSON Schema type array union support (e.g., `"type": ["object", "null"]`)
 - Add JSON Schema type inference from `const`, `enum`, or `default` fields
 ### Change
@@ -1417,7 +1546,7 @@ Notable merged work in this range includes:
 - `Add rack + websocket example (#130)`
 - `feat(gemspec): add changelog URI (#136)`
 - `feat(function): alias ThreadGroup#wait as ThreadGroup#value (#62)`
-- README and screencast refresh across `#66`, `#68`, `#71`, and
+- `README and screencast refresh across `#66`, `#68`, `#71`, and
   `#72`
 - `chore(bot): update deprecation warning from v5.0 to v6.0`
 - `fix(deepseek): tolerate malformed tool arguments`
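
Two of the v11.2.0 fixes above interact: comparing functions by tool call ID, and keeping `LLM::Function::Array` behavior after subtraction. The sketch below shows why both are needed for filtering a pending batch. `SketchFunction` and `SketchFunctionArray` are hypothetical classes written for illustration, and they make no claim about how llm.rb implements `LLM::Function` or `LLM::Function::Array`.

```ruby
# Hypothetical value object standing in for a pending tool call.
class SketchFunction
  attr_reader :id, :name

  def initialize(id:, name:)
    @id = id
    @name = name
  end

  # Two calls are equal when they share the provider-assigned ID,
  # regardless of object identity.
  def ==(other)
    other.is_a?(SketchFunction) && id == other.id
  end
  alias_method :eql?, :==

  # Array#- relies on #hash and #eql?, so both must agree with #==.
  def hash
    [self.class, id].hash
  end
end

# Array subclass whose subtraction returns the subclass rather than a
# plain Array, so subclass-specific methods remain available.
class SketchFunctionArray < Array
  def -(other)
    self.class.new(super)
  end
end

pending = SketchFunctionArray[
  SketchFunction.new(id: "call_1", name: "delete"),
  SketchFunction.new(id: "call_2", name: "read")
]
# A distinct object that carries the same ID as the first pending call.
confirmed = [SketchFunction.new(id: "call_1", name: "delete")]
remaining = pending - confirmed
```

Without ID-based equality the subtraction would remove nothing, since `confirmed` holds a different object; without the `-` override the result would be a plain `Array`.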

data/README.md CHANGED

@@ -11,7 +11,7 @@
     <img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License">
   </a>
   <a href="https://github.com/llmrb/llm.rb/tags">
-    <img src="https://img.shields.io/badge/version-11.1.0-green.svg?" alt="Version">
+    <img src="https://img.shields.io/badge/version-11.3.0-green.svg?" alt="Version">
   </a>
 </p>
@@ -30,10 +30,27 @@ also includes built-in ActiveRecord and Sequel support, plus concurrent
 tool execution through threads, tasks (via async gem), fibers, ractors,
 and fork (via xchan.rb gem).
-As a bonus, llm.rb is also available to embedded systems [via mruby](https://github.com/llmrb/mruby-llm#readme),
-to the browser and edge devices [via WebAssembly](https://github.com/llmrb/wasm-llm#readme),
-and has first-class [Rails support](https://github.com/llmrb/rails-llm#readme)
-via a separate gem.
+## Services
+The llm.rb runtime and its forks
+([mruby-llm](https://github.com/llmrb/mruby-llm),
+[wasm-llm](https://github.com/llmrb/wasm-llm))
+power a growing family of AI applications, and
+services. The following applications are publicly
+accessible over SSH and are free to try. No account
+required. Nothing to install.
+#### matz - the mruby expert
+> ssh matz@r.uby.dev
+See [https://r.uby.dev/matz](https://r.uby.dev/matz) for more information.
+#### robert - the freebsd expert
+> ssh robert@4.4bsd.dev
+See [https://4.4bsd.dev/robert](https://4.4bsd.dev/robert) for more information.
 ## Quick start
@@ -138,10 +155,10 @@ to either
 or
 [LLM::Agent](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html).
 In this example, the MCP server runs over stdio and
-[LLM::Context](https://0x1eef.github.io/x/llm.rb/LLM/Context.html)
-uses the same tool loop as local tools. For **stdio**, `mcp.session`
-is the preferred pattern because it keeps one MCP session alive across
-discovery and tool calls:
+[LLM::Agent](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html)
+manages the tool loop. For **stdio**, `mcp.session` is the preferred
+pattern because it keeps one MCP session alive across discovery and
+tool calls:
 ```ruby
 require "llm"
@@ -150,9 +167,8 @@ llm = LLM.openai(key: ENV["KEY"])
 mcp = LLM::MCP.stdio(argv: ["ruby", "server.rb"])
 mcp.session do
-  ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
-  ctx.talk "Use the available tools to inspect the environment."
-  ctx.talk(ctx.wait(:call)) while ctx.functions?
+  agent = LLM::Agent.new(llm, stream: $stdout, tools: mcp.tools)
+  agent.talk "Use the available tools to inspect the environment."
 end
 ```
@@ -167,15 +183,16 @@ require "llm"
 llm = LLM.openai(key: ENV["KEY"])
 mcp = LLM::MCP.stdio(argv: ["ruby", "server.rb"])
-ctx = LLM::Context.new(llm, tools: mcp.tools)
-ctx.talk("Use the available tools to inspect the environment.")
-ctx.talk(ctx.wait(:call)) while ctx.functions?
+agent = LLM::Agent.new(llm, tools: mcp.tools)
+agent.talk("Use the available tools to inspect the environment.")
 ```
 The HTTP transport can be used with or without the `session` method,
 and unlike the stdio transport it can remain efficient without the
 `session` method through a persistent connection pool that is available
-through the [LLM::Transport.net_http_persistent](https://0x1eef.github.io/x/llm.rb/LLM/Transport.html#method-c-net_http_persistent) transport:
+through the
+[LLM::Transport.net_http_persistent](https://0x1eef.github.io/x/llm.rb/LLM/Transport.html#method-c-net_http_persistent)
+transport:
 ```ruby
 require "llm"
@@ -183,12 +200,11 @@ require "llm"
 llm = LLM.openai(key: ENV["KEY"])
 mcp = LLM::MCP.http(
   url: "https://remote-mcp.example.com",
-  transport: LLM::Transport.net_http_persistent
+  transport: :net_http_persistent
 )
-ctx = LLM::Context.new(llm, tools: mcp.tools)
-ctx.talk("Use the available tools to inspect the environment.")
-ctx.talk(ctx.wait(:call)) while ctx.functions?
+agent = LLM::Agent.new(llm, tools: mcp.tools)
+agent.talk("Use the available tools to inspect the environment.")
 ```
 #### A2A (Agent 2 Agent)
@@ -212,9 +228,8 @@ a2a = LLM::A2A.rest(
   headers: {"Authorization" => "Bearer token"}
 )
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, tools: a2a.skills)
-ctx.talk "Analyze this CSV and summarize the trends."
-ctx.talk(ctx.wait(:call)) while ctx.functions?
+agent = LLM::Agent.new(llm, tools: a2a.skills)
+agent.talk "Analyze this CSV and summarize the trends."
 ```
 Use persistent HTTP connections:
@@ -224,7 +239,7 @@ require "llm"
 a2a = LLM::A2A.rest(
   url: "https://remote-agent.example.com",
-  transport: LLM::Transport.net_http_persistent
+  transport: :net_http_persistent
 )
 ```
@@ -232,6 +247,27 @@ For more on direct messaging, task operations, push notification
 configs, and JSON-RPC, see the
 [LLM::A2A API docs](https://0x1eef.github.io/x/llm.rb/LLM/A2A.html).
+#### Transports
+Providers use Ruby's standard library Net::HTTP transport by default.
+You can opt into persistent Net::HTTP connections with `persistent: true`,
+or provide a transport shortcut when you want a different backend.
+`transport: :curb` uses libcurl through the optional `curb` gem.
+Custom transports can implement the
+[LLM::Transport](https://0x1eef.github.io/x/llm.rb/LLM/Transport.html)
+interface and receive transport-agnostic
+[LLM::Transport::Request](https://0x1eef.github.io/x/llm.rb/LLM/Transport/Request.html)
+objects from providers.
+```ruby
+require "llm"
+llm = LLM.openai(key: ENV["KEY"], persistent: true)
+llm = LLM.openai(key: ENV["KEY"], transport: :net_http_persistent)
+llm = LLM.openai(key: ENV["KEY"], transport: :curb)
+```
 #### Skills
 Skills are reusable instructions loaded from a `SKILL.md` directory. They let
@@ -294,8 +330,8 @@ class Stream < LLM::Stream
 end
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: Stream.new)
-ctx.talk "Write a haiku about Ruby."
+agent = LLM::Agent.new(llm, stream: Stream.new)
+agent.talk "Write a haiku about Ruby."
 ```
 #### LLM::Stream (advanced)
@@ -352,30 +388,31 @@ agent.talk "Read README.md and CHANGELOG.md and compare them."
 #### Serialization
-The [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html)
+The [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html)
 object can be serialized to JSON, which makes it suitable for storing
 in a file, a database column, or a Redis queue. The built-in
-ActiveRecord and Sequel plugins are built on top of this feature:
+ActiveRecord and Sequel plugins are built on top of the same underlying
+serialization feature:
 ```ruby
 require "llm"
 llm = LLM.openai(key: ENV["KEY"])
-# Serialize a context
-ctx1 = LLM::Context.new(llm)
-ctx1.talk "Remember that my favorite language is Ruby"
-string = ctx1.to_json
+# Serialize an agent
+agent1 = LLM::Agent.new(llm)
+agent1.talk "Remember that my favorite language is Ruby"
+string = agent1.to_json
-# Restore a context (from JSON)
-ctx2 = LLM::Context.new(llm, stream: $stdout)
-ctx2.restore(string:)
-ctx2.talk "What is my favorite language?"
+# Restore an agent (from JSON)
+agent2 = LLM::Agent.new(llm, stream: $stdout)
+agent2.restore(string:)
+agent2.talk "What is my favorite language?"
 ```
 #### ask
-[`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html)
+[`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html)
 also provides `ask`, a convenience interface that is compatible with
 RubyLLM's `ask` method. It accepts a prompt, an optional `with:`
 attachment path or paths, an optional `stream:` target, and an optional
@@ -387,11 +424,11 @@ so use `.content` when you want the text directly:
 require "llm"
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm)
+agent = LLM::Agent.new(llm)
-puts ctx.ask("Hello world").content
-puts ctx.ask("Summarize this document.", with: "README.md").content
-ctx.ask("Stream this reply.") { $stdout << _1 }
+puts agent.ask("Hello world").content
+puts agent.ask("Summarize this document.", with: "README.md").content
+agent.ask("Stream this reply.") { $stdout << _1 }
 ```
 ## Installation
@@ -404,8 +441,8 @@ gem install llm.rb
 #### REPL
-This example uses [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html)
-directly for an interactive REPL. <br> See the
+This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html)
+for an interactive REPL. <br> See the
 [deepdive (web)](https://llmrb.github.io/llm.rb/) or
 [deepdive (markdown)](resources/deepdive.md) for more examples.
@@ -413,11 +450,11 @@ directly for an interactive REPL. <br> See the
 require "llm"
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: $stdout)
+agent = LLM::Agent.new(llm, stream: $stdout)
 loop do
   print "> "
-  ctx.talk(STDIN.gets || break)
+  agent.talk(STDIN.gets || break)
   puts
 end
 ```
@@ -426,36 +463,36 @@ end
 In llm.rb, a prompt can be a string, an [`LLM::Prompt`](https://0x1eef.github.io/x/llm.rb/LLM/Prompt.html), or an array.
 When you use an array, each element can be plain text or a tagged object such as
-[`ctx.image_url(...)`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html#image_url-instance_method),
-[`ctx.local_file(...)`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html#local_file-instance_method),
-or [`ctx.remote_file(...)`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html#remote_file-instance_method).
+[`agent.image_url(...)`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html#image_url-instance_method),
+[`agent.local_file(...)`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html#local_file-instance_method),
+or [`agent.remote_file(...)`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html#remote_file-instance_method).
 Those tagged objects carry the metadata the provider adapter needs to turn one
 Ruby prompt into the provider-specific multimodal request schema.
 If the model understands that file type, you can attach a local file directly
-with `ctx.ask(..., with: path)` instead of uploading it first through a
+with `agent.ask(..., with: path)` instead of uploading it first through a
 provider Files API. Under the hood, llm.rb tags the path as a
-[`ctx.local_file(...)`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html#local_file-instance_method)
+[`agent.local_file(...)`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html#local_file-instance_method)
 object:
 ```ruby
 require "llm"
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm)
-puts ctx.ask("Summarize this document.", with: "README.md").content
+agent = LLM::Agent.new(llm)
+puts agent.ask("Summarize this document.", with: "README.md").content
 ```
 #### Context Compaction
-This example uses [`LLM::Context`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html),
+This example uses [`LLM::Agent`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html),
 [`LLM::Compactor`](https://0x1eef.github.io/x/llm.rb/LLM/Compactor.html), and
 [`LLM::Stream`](https://0x1eef.github.io/x/llm.rb/LLM/Stream.html) together so
-long-lived contexts can summarize older history and expose the lifecycle
+long-lived conversations can summarize older history and expose the lifecycle
 through stream hooks. This approach is inspired by General Intelligence
 Systems. The
 compactor can also use its own `model:` if you want summarization to run on a
-different model from the main context. `token_threshold:` accepts either a
+different model from the main conversation. `token_threshold:` accepts either a
 fixed token count or a percentage string like `"90%"`, which resolves
 against the active model context window and triggers compaction once total
 token usage goes over that percentage. See the
@@ -476,7 +513,7 @@ class Stream < LLM::Stream
 end
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(
+agent = LLM::Agent.new(
   llm,
   stream: Stream.new,
   compactor: {
@@ -495,9 +532,8 @@ visible assistant output. See the
 [deepdive (web)](https://llmrb.github.io/llm.rb/) or
 [deepdive (markdown)](resources/deepdive.md) for more examples.
-To use the Responses API (OpenAI-specific), initialize a
-context or agent with `mode: :responses` and keep using
-`talk` for turns.
+To use the Responses API (OpenAI-specific), initialize an agent with
+`mode: :responses` and keep using `talk` for turns.
 ```ruby
 require "llm"
@@ -513,20 +549,20 @@ class Stream < LLM::Stream
 end
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(
+agent = LLM::Agent.new(
   llm,
   model: "gpt-5.4-mini",
   mode: :responses,
   reasoning: {effort: "medium"},
   stream: Stream.new
 )
-ctx.talk("Solve 17 * 19 and show your work.")
+agent.talk("Solve 17 * 19 and show your work.")
 ```
 #### Request Cancellation
 Need to cancel a stream? llm.rb has you covered through
-[`LLM::Context#interrupt!`](https://0x1eef.github.io/x/llm.rb/LLM/Context.html#interrupt-21-instance_method).
+[`LLM::Agent#interrupt!`](https://0x1eef.github.io/x/llm.rb/LLM/Agent.html#interrupt-21-instance_method).
 <br> See the [deepdive (web)](https://llmrb.github.io/llm.rb/)
 or [deepdive (markdown)](resources/deepdive.md) for more examples.
@@ -535,15 +571,15 @@ require "llm"
 require "io/console"
 llm = LLM.openai(key: ENV["KEY"])
-ctx = LLM::Context.new(llm, stream: $stdout)
+agent = LLM::Agent.new(llm, stream: $stdout)
 worker = Thread.new do
-  ctx.talk("Write a very long essay about network protocols.")
+  agent.talk("Write a very long essay about network protocols.")
 rescue LLM::Interrupt
   puts "Request was interrupted!"
 end
 STDIN.getch
-ctx.interrupt!
+agent.interrupt!
 worker.join
 ```
@@ -704,7 +740,7 @@ end
 This example uses [`LLM::MCP`](https://0x1eef.github.io/x/llm.rb/LLM/MCP.html)
 over HTTP so remote GitHub MCP tools run through the same
-`LLM::Context` tool path as local tools. It expects a GitHub token in
+`LLM::Agent` tool path as local tools. It expects a GitHub token in
 `ENV["GITHUB_PAT"]`. See the
 [deepdive (web)](https://llmrb.github.io/llm.rb/) or
 [deepdive (markdown)](resources/deepdive.md) for more examples.
@@ -720,9 +756,8 @@ mcp = LLM::MCP.http(
   persistent: true
 )
-ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
-ctx.talk("Pull information about my GitHub account.")
-ctx.talk(ctx.wait(:call)) while ctx.functions?
+agent = LLM::Agent.new(llm, stream: $stdout, tools: mcp.tools)
+agent.talk("Pull information about my GitHub account.")
 ```
 ## Resources