RubyGems - llm.rb - Versions diffs - 8.0.0 → 9.0.0 - Mend

llm.rb 8.0.0 → 9.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (80)

checksums.yaml +4 -4
data/CHANGELOG.md +165 -2
data/README.md +161 -509
data/data/bedrock.json +2948 -0
data/data/deepseek.json +8 -8
data/data/openai.json +39 -2
data/data/xai.json +35 -0
data/data/zai.json +1 -1
data/lib/llm/active_record/acts_as_llm.rb +7 -8
data/lib/llm/agent.rb +36 -16
data/lib/llm/context.rb +30 -26
data/lib/llm/contract/completion.rb +45 -0
data/lib/llm/cost.rb +81 -4
data/lib/llm/error.rb +1 -1
data/lib/llm/function/array.rb +8 -5
data/lib/llm/function/call_group.rb +39 -0
data/lib/llm/function/fork/task.rb +6 -0
data/lib/llm/function/ractor/task.rb +6 -0
data/lib/llm/function/task.rb +10 -0
data/lib/llm/function.rb +1 -0
data/lib/llm/mcp/transport/http.rb +26 -46
data/lib/llm/mcp/transport/stdio.rb +0 -8
data/lib/llm/mcp.rb +6 -23
data/lib/llm/object.rb +8 -0
data/lib/llm/provider.rb +29 -19
data/lib/llm/providers/anthropic/error_handler.rb +6 -7
data/lib/llm/providers/anthropic/files.rb +2 -2
data/lib/llm/providers/anthropic/response_adapter/completion.rb +30 -0
data/lib/llm/providers/anthropic.rb +1 -1
data/lib/llm/providers/bedrock/error_handler.rb +79 -0
data/lib/llm/providers/bedrock/models.rb +109 -0
data/lib/llm/providers/bedrock/request_adapter/completion.rb +153 -0
data/lib/llm/providers/bedrock/request_adapter.rb +95 -0
data/lib/llm/providers/bedrock/response_adapter/completion.rb +173 -0
data/lib/llm/providers/bedrock/response_adapter/models.rb +34 -0
data/lib/llm/providers/bedrock/response_adapter.rb +40 -0
data/lib/llm/providers/bedrock/signature.rb +166 -0
data/lib/llm/providers/bedrock/stream_decoder.rb +140 -0
data/lib/llm/providers/bedrock/stream_parser.rb +201 -0
data/lib/llm/providers/bedrock.rb +272 -0
data/lib/llm/providers/google/error_handler.rb +6 -7
data/lib/llm/providers/google/files.rb +2 -4
data/lib/llm/providers/google/images.rb +1 -1
data/lib/llm/providers/google/models.rb +0 -2
data/lib/llm/providers/google/response_adapter/completion.rb +30 -0
data/lib/llm/providers/google.rb +1 -1
data/lib/llm/providers/ollama/error_handler.rb +6 -7
data/lib/llm/providers/ollama/models.rb +0 -2
data/lib/llm/providers/ollama/response_adapter/completion.rb +30 -0
data/lib/llm/providers/ollama.rb +1 -1
data/lib/llm/providers/openai/audio.rb +3 -3
data/lib/llm/providers/openai/error_handler.rb +6 -7
data/lib/llm/providers/openai/files.rb +2 -2
data/lib/llm/providers/openai/images.rb +3 -3
data/lib/llm/providers/openai/models.rb +1 -1
data/lib/llm/providers/openai/response_adapter/completion.rb +42 -0
data/lib/llm/providers/openai/response_adapter/responds.rb +39 -0
data/lib/llm/providers/openai/responses.rb +2 -2
data/lib/llm/providers/openai/vector_stores.rb +1 -1
data/lib/llm/providers/openai.rb +1 -1
data/lib/llm/response.rb +10 -8
data/lib/llm/sequel/plugin.rb +7 -8
data/lib/llm/stream/queue.rb +15 -42
data/lib/llm/stream.rb +4 -4
data/lib/llm/transport/execution.rb +67 -0
data/lib/llm/transport/http.rb +134 -0
data/lib/llm/transport/persistent_http.rb +152 -0
data/lib/llm/transport/response/http.rb +113 -0
data/lib/llm/transport/response.rb +112 -0
data/lib/llm/{provider/transport/http → transport}/stream_decoder.rb +8 -4
data/lib/llm/transport.rb +139 -0
data/lib/llm/usage.rb +14 -5
data/lib/llm/version.rb +1 -1
data/lib/llm.rb +10 -12
data/llm.gemspec +2 -16
metadata +23 -19
data/lib/llm/provider/transport/http/execution.rb +0 -115
data/lib/llm/provider/transport/http/interruptible.rb +0 -114
data/lib/llm/provider/transport/http.rb +0 -145
data/lib/llm/utils.rb +0 -19

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 4d726213f6b63342582738a133f7f82c1158934d6f25a48ae6b6c9e59a8f8262
-  data.tar.gz: 6288d177adc7a07a37368066329c882f746747d5bed9ffba7cb50d2bcbd1d98c
+  metadata.gz: 197ff330dc5e414f4f9291835fbcdeece4450ee3a8d3748e4f9cf28a46db07b1
+  data.tar.gz: 3020a4511134f292ed38c6fc826b157f05cc31c722e9fe52692b8b2f705551c7
 SHA512:
-  metadata.gz: 4ae089f4117dc384000a70500c40ebadf48f42d1bd820d0840568b3b31b0197e51c65e9f60fe65d0e75c23aa4c7eac977be928a38969580174169bd0efe39912
-  data.tar.gz: 9653135f93b9b2b722102f055dc961346949368dab161a3cff64e99ddfc6781933a94b527151da9a24ff39451814f76c5409389f91c3692852eb17bd5d3d11f9
+  metadata.gz: 41f733d7d5b8a329420497f85c289f070f9016cf4d1bfdf5c5e49e274714310f5285e5598d4d27f5daa84cf26e837f311541ac8526bd987d7c7d917eb60eca21
+  data.tar.gz: f751b3887bd380e8f911106bedf6a0c606bcdc813ea4d7b01f2f332311ddd974c6dcfe838c7db5446d1683844ffbf379b2f5c8ef4b94459bedefaafc70be2098

data/CHANGELOG.md CHANGED

@@ -2,6 +2,170 @@
 ## Unreleased
+## v9.0.0
+Changes since `v8.1.0`.
+This release deepens llm.rb's transport and cost-tracking surface. It
+replaces the old mutable `persist!` API with constructor-driven transport
+selection, removes `#call` from contexts and agents in favor of explicit
+`ctx.wait(:call)`, makes queued stream waits strategy-free, and deletes
+the unused `LLM::Utils` module.
+It adds cache read/write token tracking
+with corresponding cost components, audio and image token pricing,
+`LLM::Context#functions?` for queue-aware tool loops,
+`LLM::Agent.stream` DSL support, and exposes `#stream` readers on
+contexts and agents.
+The HTTP transport layer has been refactored around shared backends so
+providers, MCP, and custom transports all use the same normalized
+response interface.
+### Breaking
+* **Remove `#call` as a context and agent tool-loop API** <br>
+  Remove `LLM::Context#call(:functions)` and `LLM::Agent#call(:functions)`.
+  Tool loops should use `ctx.wait(:call)` or `agent.wait(:call)` instead.
+  The ActiveRecord and Sequel wrappers no longer expose `#call` passthroughs
+  for stored llm.rb contexts.
+* **Make HTTP transport selection constructor-driven** <br>
+  Remove public `persist!` and `.persistent` mutation APIs from
+  providers, transports, and MCP clients. Select persistent behavior at
+  construction time with `persistent: true`, `LLM::Transport.net_http`,
+  `LLM::Transport.net_http_persistent`, or an explicit `transport:`
+  override.
+* **Make queued stream waits strategy-free** <br>
+  Change `LLM::Stream::Queue#wait` to resolve queued work by the actual
+  task types already present in the queue instead of accepting an
+  external wait strategy. `LLM::Stream#wait(...)` remains compatible but
+  now ignores its arguments when delegating to the queue.
+* **Remove unused `LLM::Utils`** <br>
+  Delete the `LLM::Utils` module and remove its remaining unused
+  provider includes and top-level require.
+### Add
+* **Expose `#stream` readers on contexts and agents** <br>
+  Add public `LLM::Context#stream` and `LLM::Agent#stream` accessors so
+  callers can inspect the active stream object directly.
+* **Track cache read and write tokens in usage** <br>
+  Add `cache_read_tokens` and `cache_write_tokens` to `LLM::Usage` and
+  preserve them through completion usage adaptation and context usage
+  aggregation.
+* **Add `LLM::Context#functions?` for queue-aware tool loops** <br>
+  Add `functions?` to `LLM::Context` and the ActiveRecord and Sequel
+  wrappers so callers can detect pending tool work through either the
+  bound stream queue or unresolved functions, and update the docs to
+  prefer `while ctx.functions?` over `ctx.functions.any?` in tool-loop
+  examples.
+* **Add `:call` as a first-class wait strategy** <br>
+  Add `:call` to pending-function wait paths so `ctx.wait(:call)` can
+  prefer queued streamed work when present and otherwise fall back to
+  direct sequential function execution through `spawn(:call).wait`.
+* **Read provider cache usage into completion responses** <br>
+  Read cache read tokens from provider usage metadata, including OpenAI
+  `usage.prompt_tokens_details` and Anthropic
+  `usage.cache_read_input_tokens`. Read Anthropic cache write tokens
+  from `usage.cache_creation_input_tokens`, and expose explicit
+  zero-valued `cache_write_tokens` methods on providers that do not
+  report cache creation usage.
+* **Extend cost tracking with cache write pricing** <br>
+  Extend `LLM::Cost` with `cache_read_costs`, `cache_write_costs`, and
+  `reasoning_costs` alongside the existing `input_costs` and
+  `output_costs`. Add `#to_h` for structured cost insight and update
+  `ctx.cost` to calculate all available components from registry
+  pricing data.
+* **Price input and output audio separately** <br>
+  Track `input_audio_tokens` and `output_audio_tokens` in usage and
+  include `input_audio_costs` and `output_audio_costs` in `LLM::Cost`
+  so multimodal requests report accurate audio spend.
+* **Track image tokens in input cost reporting** <br>
+  Add `input_image_tokens` to usage and include `input_image_costs` in
+  `LLM::Cost` using the model's generic input rate so image-bearing
+  prompts report their input spend.
+* **Add `LLM::Agent.stream` DSL support** <br>
+  Let agents define a default `stream` through the class DSL, including
+  block-based stream construction so each agent instance can resolve its
+  stream the same way `tracer` does.
+### Change
+* **Refactor HTTP transports around shared backends** <br>
+  Split `Net::HTTP` and `Net::HTTP::Persistent` into separate
+  `LLM::Transport` implementations, move HTTP-specific request helpers
+  and response execution into the shared transport layer, and let MCP
+  HTTP wrap those transports instead of maintaining a separate
+  transient/persistent client split.
+* **Share transport overrides across providers and MCP** <br>
+  Let both provider construction and `LLM::MCP.http(...)` accept
+  `LLM::Transport` instances or classes as HTTP transport overrides, so
+  callers can reuse the same transport implementation across the
+  runtime.
+* **Let custom transports adapt their own response objects** <br>
+  Introduce a transport response interface so custom transports can
+  adapt backend-specific response objects to one normalized shape and
+  have them work with the existing provider execution and error-handling
+  code.
+## v8.1.0
+Changes since `v8.0.0`.
+This release adds Amazon Bedrock provider support through the Converse
+API, including AWS SigV4 request signing, event stream decoding,
+structured output through `schema:`, and a models.dev-backed registry.
+It exposes `llm.models.all` for Bedrock via the ListFoundationModels
+API and adds `LLM::Object#transform_values!` for in-place value
+transformation. Several Bedrock-specific fixes land as well, including
+response id exposure, blank text block suppression in tool turns, and
+DSML tool-marker filtering in streamed text.
+### Add
+* **Add AWS Bedrock provider support** <br>
+  Add `LLM.bedrock(...)` with Bedrock Converse chat support, AWS SigV4
+  request signing, Bedrock event stream decoding, structured output
+  support through `schema:`, and models.dev-backed `bedrock.json`
+  registry generation.
+* **Add AWS Bedrock Models endpoint support** <br>
+  Add `llm.models.all` for Bedrock via the ListFoundationModels API,
+  including SigV4 signing for the control-plane endpoint and normalized
+  `LLM::Model` collection responses.
+* **Add `LLM::Object#transform_values!`** <br>
+  Let `LLM::Object` transform stored values in place through
+  `#transform_values!`.
+### Fix
+* **Expose response ids on Bedrock completion responses** <br>
+  Read the Bedrock request id into `LLM::Response#id` for completion
+  responses adapted from the Converse API.
+* **Avoid blank assistant text blocks in Bedrock tool turns** <br>
+  Stop replaying assistant tool-call messages with empty text content
+  blocks that Bedrock rejects.
+* **Suppress Bedrock DSML tool markers in streamed text** <br>
+  Filter `"<｜DSML｜function_calls"` markers out of streamed Bedrock
+  assistant text so tool-call sentinels do not leak into user-visible
+  output.
 ## v8.0.0
 Changes since `v7.0.0`.
@@ -51,8 +215,7 @@ and `acts_as_agent`.
 * **Allow `persistent: true` on `LLM::MCP.http`** <br>
   Let `LLM::MCP.http(...)` enable persistent HTTP transport directly
-  through `persistent: true`, instead of requiring a separate
-  `.persistent` call after construction.
+  through `persistent: true` at construction time.
 * **Expose `LLM::Function#runner` as public API** <br>
   Promote the internal runner instantiation to a public `runner` method on