RubyGems - llm.rb - Versions diffs - 8.1.0 → 10.0.0 - Mend

llm.rb 8.1.0 → 10.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (86)

checksums.yaml +4 -4
data/CHANGELOG.md +196 -6
data/README.md +233 -518
data/data/anthropic.json +278 -258
data/data/bedrock.json +1288 -1561
data/data/deepseek.json +38 -38
data/data/google.json +656 -579
data/data/openai.json +860 -818
data/data/xai.json +243 -552
data/data/zai.json +168 -168
data/lib/llm/active_record/acts_as_agent.rb +5 -0
data/lib/llm/active_record/acts_as_llm.rb +7 -8
data/lib/llm/active_record.rb +1 -6
data/lib/llm/agent.rb +121 -82
data/lib/llm/context.rb +79 -74
data/lib/llm/contract/completion.rb +45 -0
data/lib/llm/cost.rb +81 -4
data/lib/llm/error.rb +1 -1
data/lib/llm/function/array.rb +8 -5
data/lib/llm/function/call_group.rb +39 -0
data/lib/llm/function/call_task.rb +46 -0
data/lib/llm/function/fork/task.rb +6 -0
data/lib/llm/function/ractor/task.rb +6 -0
data/lib/llm/function/task.rb +10 -0
data/lib/llm/function.rb +28 -1
data/lib/llm/mcp/transport/http.rb +26 -46
data/lib/llm/mcp/transport/stdio.rb +0 -8
data/lib/llm/mcp.rb +6 -23
data/lib/llm/provider.rb +30 -20
data/lib/llm/providers/anthropic/error_handler.rb +6 -7
data/lib/llm/providers/anthropic/files.rb +2 -2
data/lib/llm/providers/anthropic/response_adapter/completion.rb +30 -0
data/lib/llm/providers/anthropic/stream_parser.rb +2 -2
data/lib/llm/providers/anthropic.rb +1 -1
data/lib/llm/providers/bedrock/error_handler.rb +8 -9
data/lib/llm/providers/bedrock/models.rb +13 -13
data/lib/llm/providers/bedrock/response_adapter/completion.rb +30 -0
data/lib/llm/providers/bedrock/stream_parser.rb +2 -2
data/lib/llm/providers/bedrock.rb +1 -1
data/lib/llm/providers/google/error_handler.rb +6 -7
data/lib/llm/providers/google/files.rb +2 -4
data/lib/llm/providers/google/images.rb +1 -1
data/lib/llm/providers/google/models.rb +0 -2
data/lib/llm/providers/google/response_adapter/completion.rb +30 -0
data/lib/llm/providers/google/stream_parser.rb +2 -2
data/lib/llm/providers/google.rb +1 -1
data/lib/llm/providers/ollama/error_handler.rb +6 -7
data/lib/llm/providers/ollama/models.rb +0 -2
data/lib/llm/providers/ollama/response_adapter/completion.rb +30 -0
data/lib/llm/providers/ollama.rb +1 -1
data/lib/llm/providers/openai/audio.rb +3 -3
data/lib/llm/providers/openai/error_handler.rb +6 -7
data/lib/llm/providers/openai/files.rb +2 -2
data/lib/llm/providers/openai/images.rb +3 -3
data/lib/llm/providers/openai/models.rb +1 -1
data/lib/llm/providers/openai/response_adapter/completion.rb +42 -0
data/lib/llm/providers/openai/response_adapter/responds.rb +39 -0
data/lib/llm/providers/openai/responses/stream_parser.rb +2 -2
data/lib/llm/providers/openai/responses.rb +2 -2
data/lib/llm/providers/openai/stream_parser.rb +2 -2
data/lib/llm/providers/openai/vector_stores.rb +1 -1
data/lib/llm/providers/openai.rb +1 -1
data/lib/llm/response.rb +10 -8
data/lib/llm/schema.rb +11 -0
data/lib/llm/sequel/agent.rb +5 -0
data/lib/llm/sequel/plugin.rb +8 -14
data/lib/llm/stream/queue.rb +15 -42
data/lib/llm/stream.rb +15 -40
data/lib/llm/tool/param.rb +1 -8
data/lib/llm/transport/execution.rb +67 -0
data/lib/llm/transport/http.rb +134 -0
data/lib/llm/transport/persistent_http.rb +152 -0
data/lib/llm/transport/response/http.rb +113 -0
data/lib/llm/transport/response.rb +112 -0
data/lib/llm/{provider/transport/http → transport}/stream_decoder.rb +8 -4
data/lib/llm/transport.rb +139 -0
data/lib/llm/usage.rb +14 -5
data/lib/llm/utils.rb +24 -14
data/lib/llm/version.rb +1 -1
data/lib/llm.rb +3 -12
data/llm.gemspec +2 -16
metadata +13 -20
data/lib/llm/bot.rb +0 -3
data/lib/llm/provider/transport/http/execution.rb +0 -115
data/lib/llm/provider/transport/http/interruptible.rb +0 -114
data/lib/llm/provider/transport/http.rb +0 -145

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 8aa3ee461642fb157bece63a4ebe00ceda8ec66ce24df5c842efdcc176861a53
-  data.tar.gz: 2d26e36b812704a80e5c8ba4814cfbec770afd5694be71b69d7937422f9a642c
+  metadata.gz: 6ba756238fa72e58ba774567a0c8e2a6d7351cb6313f9c8c08cbdeec8ec9cfa4
+  data.tar.gz: cba8295670dab2843cec902ae97b7ae14e775359380ba401ca5a0066eb60ad0e
 SHA512:
-  metadata.gz: 3a30bf9d5309bf49c660137ed5e81b74f9b028f8846077f3db0b7c92745a5d96b16115db765a4bd1970ba0cbaaa7bd805e0a4a37c04c7e63aacdf3d019d268ec
-  data.tar.gz: 4e297d159dc459ee9ec228862f271b7a21be48ce06f092773c4b56d9cc007252b1cfeb66a119c7e14f3e683213e5923d34b0c256397b92cfa981cd47fe023008
+  metadata.gz: b8347b2adfe05a4700ec42e0ed5992a1332355bd20330590d8b3de214d980476a490855ff7e69b5b36c75f3684304c4ee61bdff9ecbcf8001f0b477b8010d064
+  data.tar.gz: a41512ffbc52b3665118161251441152389ca9daba1a6f4e010303490938dc33393da62f5e821521b2a9f4b45d85fd219b558fa7d2e185c24f43777d26e36a14
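
Review note: `checksums.yaml` records SHA256 and SHA512 digests for the `metadata.gz` and `data.tar.gz` members of the gem archive, so both digests change on every release. The sketch below shows one way to check such a file against the bytes of the archive members. The helper name `verify_checksums` and the in-memory demo data are assumptions made for illustration; this is not how RubyGems itself performs the check.

```ruby
require "digest"
require "yaml"

# Hypothetical helper (not part of RubyGems or llm.rb): compare the digests
# recorded in a checksums.yaml document against the bytes of each member.
# Returns a list of [algorithm, member_name] pairs that did not match.
def verify_checksums(checksums_yaml, members)
  algorithms = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }
  mismatches = []
  YAML.safe_load(checksums_yaml).each do |algorithm, entries|
    digest = algorithms.fetch(algorithm)
    entries.each do |name, expected|
      actual = digest.hexdigest(members.fetch(name))
      mismatches << [algorithm, name] unless actual == expected
    end
  end
  mismatches
end

# Demo with in-memory stand-ins for the real archive members.
members = {
  "metadata.gz" => "demo metadata bytes",
  "data.tar.gz" => "demo data bytes"
}
checksums_yaml = {
  "SHA256" => members.transform_values { |bytes| Digest::SHA256.hexdigest(bytes) },
  "SHA512" => members.transform_values { |bytes| Digest::SHA512.hexdigest(bytes) }
}.to_yaml

p verify_checksums(checksums_yaml, members)
# prints []

tampered = members.merge("data.tar.gz" => "modified bytes")
p verify_checksums(checksums_yaml, tampered)
# prints [["SHA256", "data.tar.gz"], ["SHA512", "data.tar.gz"]]
```

For a real gem, the member bytes would come from unpacking the `.gem` file, which is a tar archive that contains these two members alongside the checksums file.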

data/CHANGELOG.md CHANGED

@@ -2,6 +2,197 @@
 ## Unreleased
+## v10.0.0
+Changes since `v9.0.0`.
+This release unifies context turns under `#talk`, removes the
+deprecated `LLM::Bot` alias, and adds shared option resolution
+through `LLM::Utils`.
+Class-level agent tunables can now be resolved lazily via Proc,
+`Array[...]` schema/tool param types are supported, and a `key?`
+method has been added on providers.
+Agent tool confirmation hooks let selected tools be approved or
+cancelled before execution. Keep reading to learn more.
+### Breaking
+* **Unify context turns under `#talk`** <br>
+  Remove `LLM::Context#respond` and route responses-mode turns through
+  `LLM::Context#talk` with `mode: :responses` instead.
+* **Remove the `LLM::Bot` alias** <br>
+  Remove the backward-compatible `LLM::Bot` alias for `LLM::Context`.
+  Use `LLM::Context` directly instead.
+### Add
+* **Add shared option resolution through `LLM::Utils`** <br>
+  Add `LLM::Utils.resolve_option` for resolving configured values as
+  literals, procs, symbol-named methods, or duplicated hashes, and use
+  it in agent and ORM option resolution paths.
+* **Resolve all class-level agent tunables via Proc** <br>
+  Let `model`, `tools`, `skills`, `schema`, `stream`, and `tracer`
+  declared with a block be lazily evaluated against the agent instance
+  at initialization time, matching how `stream` and `tracer` already
+  worked.
+  Add `LLM::Agent#params` for direct access to the underlying context
+  parameters.
+  Ported from mruby-llm.
+* **Support `Array[...]` schema and tool param types** <br>
+  Let `LLM::Schema` properties and `LLM::Tool` params accept
+  `Array[...]` type declarations, including mixed item unions that are
+  serialized as `anyOf` array items.
+* **Add `LLM::Provider#key?`** <br>
+  Add `key?` to providers so callers can check whether a non-blank API
+  key has been configured.
+* **Add agent tool confirmation hooks** <br>
+  Add `LLM::Agent.confirm` and `LLM::Agent#on_tool_confirmation` so
+  selected tools can be approved or cancelled before execution. Pending
+  tool resolution now relies on `LLM::Context#functions` so confirmed
+  tools are not executed twice when mixed with unconfirmed tool calls.
+* **Add `LLM::Function#spawn(:call).wait`** <br>
+  Add task-shaped sequential execution support for direct
+  `LLM::Function#spawn(:call).wait`.
+### Fix
+* **Reduce private internal methods on `LLM::Stream`** <br>
+  Remove `tool_not_found` and `__tools__` from `LLM::Stream`. The
+  `__tools__` logic is inlined directly into `__find__` since that
+  was its only caller. The `tool_not_found` utility method was unused
+  externally and added unnecessary surface to LLM::Stream.
+  Ported from mruby-llm.
+## v9.0.0
+Changes since `v8.1.0`.
+This release deepens llm.rb's transport and cost-tracking surface. It
+replaces the old mutable `persist!` API with constructor-driven transport
+selection, removes `#call` from contexts and agents in favor of explicit
+`ctx.wait(:call)`, makes queued stream waits strategy-free, and deletes
+the unused `LLM::Utils` module.
+It adds cache read/write token tracking
+with corresponding cost components, audio and image token pricing,
+`LLM::Context#functions?` for queue-aware tool loops,
+`LLM::Agent.stream` DSL support, and exposes `#stream` readers on
+contexts and agents.
+The HTTP transport layer has been refactored around shared backends so
+providers, MCP, and custom transports all use the same normalized
+response interface.
+### Breaking
+* **Remove `#call` as a context and agent tool-loop API** <br>
+  Remove `LLM::Context#call(:functions)` and `LLM::Agent#call(:functions)`.
+  Tool loops should use `ctx.wait(:call)` or `agent.wait(:call)` instead.
+  The ActiveRecord and Sequel wrappers no longer expose `#call` passthroughs
+  for stored llm.rb contexts.
+* **Make HTTP transport selection constructor-driven** <br>
+  Remove public `persist!` and `.persistent` mutation APIs from
+  providers, transports, and MCP clients. Select persistent behavior at
+  construction time with `persistent: true`, `LLM::Transport.net_http`,
+  `LLM::Transport.net_http_persistent`, or an explicit `transport:`
+  override.
+* **Make queued stream waits strategy-free** <br>
+  Change `LLM::Stream::Queue#wait` to resolve queued work by the actual
+  task types already present in the queue instead of accepting an
+  external wait strategy. `LLM::Stream#wait(...)` remains compatible but
+  now ignores its arguments when delegating to the queue.
+* **Remove unused `LLM::Utils`** <br>
+  Delete the `LLM::Utils` module and remove its remaining unused
+  provider includes and top-level require.
+### Add
+* **Expose `#stream` readers on contexts and agents** <br>
+  Add public `LLM::Context#stream` and `LLM::Agent#stream` accessors so
+  callers can inspect the active stream object directly.
+* **Track cache read and write tokens in usage** <br>
+  Add `cache_read_tokens` and `cache_write_tokens` to `LLM::Usage` and
+  preserve them through completion usage adaptation and context usage
+  aggregation.
+* **Add `LLM::Context#functions?` for queue-aware tool loops** <br>
+  Add `functions?` to `LLM::Context` and the ActiveRecord and Sequel
+  wrappers so callers can detect pending tool work through either the
+  bound stream queue or unresolved functions, and update the docs to
+  prefer `while ctx.functions?` over `ctx.functions.any?` in tool-loop
+  examples.
+* **Add `:call` as a first-class wait strategy** <br>
+  Add `:call` to pending-function wait paths so `ctx.wait(:call)` can
+  prefer queued streamed work when present and otherwise fall back to
+  direct sequential function execution through `spawn(:call).wait`.
+* **Read provider cache usage into completion responses** <br>
+  Read cache read tokens from provider usage metadata, including OpenAI
+  `usage.prompt_tokens_details` and Anthropic
+  `usage.cache_read_input_tokens`. Read Anthropic cache write tokens
+  from `usage.cache_creation_input_tokens`, and expose explicit
+  zero-valued `cache_write_tokens` methods on providers that do not
+  report cache creation usage.
+* **Extend cost tracking with cache write pricing** <br>
+  Extend `LLM::Cost` with `cache_read_costs`, `cache_write_costs`, and
+  `reasoning_costs` alongside the existing `input_costs` and
+  `output_costs`. Add `#to_h` for structured cost insight and update
+  `ctx.cost` to calculate all available components from registry
+  pricing data.
+* **Price input and output audio separately** <br>
+  Track `input_audio_tokens` and `output_audio_tokens` in usage and
+  include `input_audio_costs` and `output_audio_costs` in `LLM::Cost`
+  so multimodal requests report accurate audio spend.
+* **Track image tokens in input cost reporting** <br>
+  Add `input_image_tokens` to usage and include `input_image_costs` in
+  `LLM::Cost` using the model's generic input rate so image-bearing
+  prompts report their input spend.
+* **Add `LLM::Agent.stream` DSL support** <br>
+  Let agents define a default `stream` through the class DSL, including
+  block-based stream construction so each agent instance can resolve its
+  stream the same way `tracer` does.
+### Change
+* **Refactor HTTP transports around shared backends** <br>
+  Split `Net::HTTP` and `Net::HTTP::Persistent` into separate
+  `LLM::Transport` implementations, move HTTP-specific request helpers
+  and response execution into the shared transport layer, and let MCP
+  HTTP wrap those transports instead of maintaining a separate
+  transient/persistent client split.
+* **Share transport overrides across providers and MCP** <br>
+  Let both provider construction and `LLM::MCP.http(...)` accept
+  `LLM::Transport` instances or classes as HTTP transport overrides, so
+  callers can reuse the same transport implementation across the
+  runtime.
+* **Let custom transports adapt their own response objects** <br>
+  Introduce a transport response interface so custom transports can
+  adapt backend-specific response objects to one normalized shape and
+  have them work with the existing provider execution and error-handling
+  code.
 ## v8.1.0
 Changes since `v8.0.0`.
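
Review note: the v10.0.0 entry in the hunk above describes `LLM::Utils.resolve_option` as resolving configured values "as literals, procs, symbol-named methods, or duplicated hashes". The stand-in below illustrates that kind of resolution in plain Ruby. It is an assumption-laden sketch, not the library's code: the two-argument signature, the use of `instance_exec`, and the fallback of treating an unknown symbol as a literal are all guesses, and `DemoAgent` is a hypothetical class.

```ruby
# Hypothetical stand-in for the behaviour the changelog describes.
# The real LLM::Utils.resolve_option may take different arguments.
def resolve_option(owner, value)
  case value
  when Proc
    # Evaluate the block lazily against the owning instance.
    owner.instance_exec(&value)
  when Symbol
    # Assumption: a symbol names a method on the owner when one exists.
    owner.respond_to?(value, true) ? owner.send(value) : value
  when Hash
    # Hand back a shallow copy so callers cannot mutate shared defaults.
    value.dup
  else
    value
  end
end

# Hypothetical owner object used only for this demo.
class DemoAgent
  def name
    "demo"
  end

  def default_model
    "demo-model"
  end
end

agent = DemoAgent.new
defaults = { temperature: 0 }

p resolve_option(agent, "literal")             # prints "literal"
p resolve_option(agent, proc { name.upcase })  # prints "DEMO"
p resolve_option(agent, :default_model)        # prints "demo-model"
p resolve_option(agent, defaults).equal?(defaults) # prints false
```

Duplicating hashes means that a class-level default can be shared by many instances without one instance's changes leaking into another.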
@@ -43,7 +234,7 @@ DSML tool-marker filtering in streamed text.
   blocks that Bedrock rejects.
 * **Suppress Bedrock DSML tool markers in streamed text** <br>
-  Filter `"<｜DSML｜function_calls"` markers out of streamed Bedrock
+  Filter `\"<｜DSML｜function_calls\"` markers out of streamed Bedrock
   assistant text so tool-call sentinels do not leak into user-visible
   output.
@@ -96,8 +287,7 @@ and `acts_as_agent`.
 * **Allow `persistent: true` on `LLM::MCP.http`** <br>
   Let `LLM::MCP.http(...)` enable persistent HTTP transport directly
-  through `persistent: true`, instead of requiring a separate
-  `.persistent` call after construction.
+  through `persistent: true` at construction time.
 * **Expose `LLM::Function#runner` as public API** <br>
   Promote the internal runner instantiation to a public `runner` method on
@@ -195,7 +385,7 @@ provider usage has been recorded yet.
   buffer API.
 * **Support percentage compaction token thresholds** <br>
-  Let `LLM::Compactor` accept `token_threshold:` values like `"90%"` so
+  Let `LLM::Compactor` accept `token_threshold:` values like `\"90%\"` so
   compaction can trigger at a percentage of the active model context
   window.
@@ -978,7 +1168,7 @@ Changes since `v4.9.0`.
 - Add HTTP transport for MCP with `LLM::MCP::Transport::HTTP` for remote servers
 - Add JSON Schema union types (`any_of`, `all_of`, `one_of`) with parser integration
-- Add JSON Schema type array union support (e.g., `"type\": [\"object\", \"null\"]`)
+- Add JSON Schema type array union support (e.g., `\"type\": [\"object\", \"null\"]`)
 - Add JSON Schema type inference from `const`, `enum`, or `default` fields
 ### Change
@@ -1079,7 +1269,7 @@ Notable merged work in this range includes:
 - `Add rack + websocket example (#130)`
 - `feat(gemspec): add changelog URI (#136)`
 - `feat(function): alias ThreadGroup#wait as ThreadGroup#value (#62)`
-- README and screencast refresh across `#66`, `#67`, `#68`, `#71`, and
+- README and screencast refresh across `#66`, `#68`, `#71`, and
   `#72`
 - `chore(bot): update deprecation warning from v5.0 to v6.0`
 - `fix(deepseek): tolerate malformed tool arguments`