llm.rb 8.1.0 → 9.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (67)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +120 -2
  3. data/README.md +161 -514
  4. data/lib/llm/active_record/acts_as_llm.rb +7 -8
  5. data/lib/llm/agent.rb +36 -16
  6. data/lib/llm/context.rb +30 -26
  7. data/lib/llm/contract/completion.rb +45 -0
  8. data/lib/llm/cost.rb +81 -4
  9. data/lib/llm/error.rb +1 -1
  10. data/lib/llm/function/array.rb +8 -5
  11. data/lib/llm/function/call_group.rb +39 -0
  12. data/lib/llm/function/fork/task.rb +6 -0
  13. data/lib/llm/function/ractor/task.rb +6 -0
  14. data/lib/llm/function/task.rb +10 -0
  15. data/lib/llm/function.rb +1 -0
  16. data/lib/llm/mcp/transport/http.rb +26 -46
  17. data/lib/llm/mcp/transport/stdio.rb +0 -8
  18. data/lib/llm/mcp.rb +6 -23
  19. data/lib/llm/provider.rb +23 -20
  20. data/lib/llm/providers/anthropic/error_handler.rb +6 -7
  21. data/lib/llm/providers/anthropic/files.rb +2 -2
  22. data/lib/llm/providers/anthropic/response_adapter/completion.rb +30 -0
  23. data/lib/llm/providers/anthropic.rb +1 -1
  24. data/lib/llm/providers/bedrock/error_handler.rb +8 -9
  25. data/lib/llm/providers/bedrock/models.rb +13 -13
  26. data/lib/llm/providers/bedrock/response_adapter/completion.rb +30 -0
  27. data/lib/llm/providers/bedrock.rb +1 -1
  28. data/lib/llm/providers/google/error_handler.rb +6 -7
  29. data/lib/llm/providers/google/files.rb +2 -4
  30. data/lib/llm/providers/google/images.rb +1 -1
  31. data/lib/llm/providers/google/models.rb +0 -2
  32. data/lib/llm/providers/google/response_adapter/completion.rb +30 -0
  33. data/lib/llm/providers/google.rb +1 -1
  34. data/lib/llm/providers/ollama/error_handler.rb +6 -7
  35. data/lib/llm/providers/ollama/models.rb +0 -2
  36. data/lib/llm/providers/ollama/response_adapter/completion.rb +30 -0
  37. data/lib/llm/providers/ollama.rb +1 -1
  38. data/lib/llm/providers/openai/audio.rb +3 -3
  39. data/lib/llm/providers/openai/error_handler.rb +6 -7
  40. data/lib/llm/providers/openai/files.rb +2 -2
  41. data/lib/llm/providers/openai/images.rb +3 -3
  42. data/lib/llm/providers/openai/models.rb +1 -1
  43. data/lib/llm/providers/openai/response_adapter/completion.rb +42 -0
  44. data/lib/llm/providers/openai/response_adapter/responds.rb +39 -0
  45. data/lib/llm/providers/openai/responses.rb +2 -2
  46. data/lib/llm/providers/openai/vector_stores.rb +1 -1
  47. data/lib/llm/providers/openai.rb +1 -1
  48. data/lib/llm/response.rb +10 -8
  49. data/lib/llm/sequel/plugin.rb +7 -8
  50. data/lib/llm/stream/queue.rb +15 -42
  51. data/lib/llm/stream.rb +4 -4
  52. data/lib/llm/transport/execution.rb +67 -0
  53. data/lib/llm/transport/http.rb +134 -0
  54. data/lib/llm/transport/persistent_http.rb +152 -0
  55. data/lib/llm/transport/response/http.rb +113 -0
  56. data/lib/llm/transport/response.rb +112 -0
  57. data/lib/llm/{provider/transport/http → transport}/stream_decoder.rb +8 -4
  58. data/lib/llm/transport.rb +139 -0
  59. data/lib/llm/usage.rb +14 -5
  60. data/lib/llm/version.rb +1 -1
  61. data/lib/llm.rb +2 -12
  62. data/llm.gemspec +2 -16
  63. metadata +11 -19
  64. data/lib/llm/provider/transport/http/execution.rb +0 -115
  65. data/lib/llm/provider/transport/http/interruptible.rb +0 -114
  66. data/lib/llm/provider/transport/http.rb +0 -145
  67. data/lib/llm/utils.rb +0 -19
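The new `data/lib/llm/transport/*.rb` files listed above replace the old `provider/transport/http` tree with constructor-selected backends. A minimal sketch of that pattern, using stand-in classes rather than llm.rb's real transports (`FakeNetHTTP`, `FakeNetHTTPPersistent`, and `Client` are illustrative names only):

```ruby
# Sketch of constructor-driven transport selection, the pattern the new
# transport files implement. FakeNetHTTP and FakeNetHTTPPersistent are
# stand-ins for illustration, not llm.rb classes.
class FakeNetHTTP
  def persistent?
    false
  end
end

class FakeNetHTTPPersistent
  def persistent?
    true
  end
end

class Client
  attr_reader :transport

  # The backend is chosen once, at construction time; there is no
  # post-hoc persist!-style mutation. An explicit transport: override
  # wins over the persistent: flag.
  def initialize(persistent: false, transport: nil)
    @transport = transport || (persistent ? FakeNetHTTPPersistent : FakeNetHTTP).new
  end
end
```

With this shape, `Client.new(persistent: true).transport` is persistent from the start, and passing `transport:` lets callers reuse one backend across clients.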
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 8aa3ee461642fb157bece63a4ebe00ceda8ec66ce24df5c842efdcc176861a53
- data.tar.gz: 2d26e36b812704a80e5c8ba4814cfbec770afd5694be71b69d7937422f9a642c
+ metadata.gz: 197ff330dc5e414f4f9291835fbcdeece4450ee3a8d3748e4f9cf28a46db07b1
+ data.tar.gz: 3020a4511134f292ed38c6fc826b157f05cc31c722e9fe52692b8b2f705551c7
  SHA512:
- metadata.gz: 3a30bf9d5309bf49c660137ed5e81b74f9b028f8846077f3db0b7c92745a5d96b16115db765a4bd1970ba0cbaaa7bd805e0a4a37c04c7e63aacdf3d019d268ec
- data.tar.gz: 4e297d159dc459ee9ec228862f271b7a21be48ce06f092773c4b56d9cc007252b1cfeb66a119c7e14f3e683213e5923d34b0c256397b92cfa981cd47fe023008
+ metadata.gz: 41f733d7d5b8a329420497f85c289f070f9016cf4d1bfdf5c5e49e274714310f5285e5598d4d27f5daa84cf26e837f311541ac8526bd987d7c7d917eb60eca21
+ data.tar.gz: f751b3887bd380e8f911106bedf6a0c606bcdc813ea4d7b01f2f332311ddd974c6dcfe838c7db5446d1683844ffbf379b2f5c8ef4b94459bedefaafc70be2098
data/CHANGELOG.md CHANGED
@@ -2,6 +2,125 @@
 
  ## Unreleased
 
+ ## v9.0.0
+
+ Changes since `v8.1.0`.
+
+ This release deepens llm.rb's transport and cost-tracking surface. It
+ replaces the old mutable `persist!` API with constructor-driven transport
+ selection, removes `#call` from contexts and agents in favor of explicit
+ `ctx.wait(:call)`, makes queued stream waits strategy-free, and deletes
+ the unused `LLM::Utils` module.
+
+ It adds cache read/write token tracking
+ with corresponding cost components, audio and image token pricing,
+ `LLM::Context#functions?` for queue-aware tool loops,
+ `LLM::Agent.stream` DSL support, and exposes `#stream` readers on
+ contexts and agents.
+
+ The HTTP transport layer has been refactored around shared backends so
+ providers, MCP, and custom transports all use the same normalized
+ response interface.
+
+ ### Breaking
+
+ * **Remove `#call` as a context and agent tool-loop API** <br>
+   Remove `LLM::Context#call(:functions)` and `LLM::Agent#call(:functions)`.
+   Tool loops should use `ctx.wait(:call)` or `agent.wait(:call)` instead.
+   The ActiveRecord and Sequel wrappers no longer expose `#call` passthroughs
+   for stored llm.rb contexts.
+
+ * **Make HTTP transport selection constructor-driven** <br>
+   Remove public `persist!` and `.persistent` mutation APIs from
+   providers, transports, and MCP clients. Select persistent behavior at
+   construction time with `persistent: true`, `LLM::Transport.net_http`,
+   `LLM::Transport.net_http_persistent`, or an explicit `transport:`
+   override.
+
+ * **Make queued stream waits strategy-free** <br>
+   Change `LLM::Stream::Queue#wait` to resolve queued work by the actual
+   task types already present in the queue instead of accepting an
+   external wait strategy. `LLM::Stream#wait(...)` remains compatible but
+   now ignores its arguments when delegating to the queue.
+
+ * **Remove unused `LLM::Utils`** <br>
+   Delete the `LLM::Utils` module and remove its remaining unused
+   provider includes and top-level require.
+
+ ### Add
+
+ * **Expose `#stream` readers on contexts and agents** <br>
+   Add public `LLM::Context#stream` and `LLM::Agent#stream` accessors so
+   callers can inspect the active stream object directly.
+
+ * **Track cache read and write tokens in usage** <br>
+   Add `cache_read_tokens` and `cache_write_tokens` to `LLM::Usage` and
+   preserve them through completion usage adaptation and context usage
+   aggregation.
+
+ * **Add `LLM::Context#functions?` for queue-aware tool loops** <br>
+   Add `functions?` to `LLM::Context` and the ActiveRecord and Sequel
+   wrappers so callers can detect pending tool work through either the
+   bound stream queue or unresolved functions, and update the docs to
+   prefer `while ctx.functions?` over `ctx.functions.any?` in tool-loop
+   examples.
+
+ * **Add `:call` as a first-class wait strategy** <br>
+   Add `:call` to pending-function wait paths so `ctx.wait(:call)` can
+   prefer queued streamed work when present and otherwise fall back to
+   direct sequential function execution through `spawn(:call).wait`.
+
+ * **Read provider cache usage into completion responses** <br>
+   Read cache read tokens from provider usage metadata, including OpenAI
+   `usage.prompt_tokens_details` and Anthropic
+   `usage.cache_read_input_tokens`. Read Anthropic cache write tokens
+   from `usage.cache_creation_input_tokens`, and expose explicit
+   zero-valued `cache_write_tokens` methods on providers that do not
+   report cache creation usage.
+
+ * **Extend cost tracking with cache write pricing** <br>
+   Extend `LLM::Cost` with `cache_read_costs`, `cache_write_costs`, and
+   `reasoning_costs` alongside the existing `input_costs` and
+   `output_costs`. Add `#to_h` for structured cost insight and update
+   `ctx.cost` to calculate all available components from registry
+   pricing data.
+
+ * **Price input and output audio separately** <br>
+   Track `input_audio_tokens` and `output_audio_tokens` in usage and
+   include `input_audio_costs` and `output_audio_costs` in `LLM::Cost`
+   so multimodal requests report accurate audio spend.
+
+ * **Track image tokens in input cost reporting** <br>
+   Add `input_image_tokens` to usage and include `input_image_costs` in
+   `LLM::Cost` using the model's generic input rate so image-bearing
+   prompts report their input spend.
+
+ * **Add `LLM::Agent.stream` DSL support** <br>
+   Let agents define a default `stream` through the class DSL, including
+   block-based stream construction so each agent instance can resolve its
+   stream the same way `tracer` does.
+
+ ### Change
+
+ * **Refactor HTTP transports around shared backends** <br>
+   Split `Net::HTTP` and `Net::HTTP::Persistent` into separate
+   `LLM::Transport` implementations, move HTTP-specific request helpers
+   and response execution into the shared transport layer, and let MCP
+   HTTP wrap those transports instead of maintaining a separate
+   transient/persistent client split.
+
+ * **Share transport overrides across providers and MCP** <br>
+   Let both provider construction and `LLM::MCP.http(...)` accept
+   `LLM::Transport` instances or classes as HTTP transport overrides, so
+   callers can reuse the same transport implementation across the
+   runtime.
+
+ * **Let custom transports adapt their own response objects** <br>
+   Introduce a transport response interface so custom transports can
+   adapt backend-specific response objects to one normalized shape and
+   have them work with the existing provider execution and error-handling
+   code.
+
  ## v8.1.0
 
  Changes since `v8.0.0`.
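The tool-loop shape described in the v9.0.0 entry above (drive pending work with `while ctx.functions?` and `ctx.wait(:call)`) can be sketched with a hypothetical mock; the real `LLM::Context` resolves provider tool calls, which this stub does not:

```ruby
# Hypothetical mock of the v9 tool-loop shape: functions? reports
# pending tool work and wait(:call) resolves the next unit of it.
# This stub queues plain lambdas instead of real provider tool calls.
class MockContext
  def initialize(pending_calls)
    @pending = pending_calls
  end

  # True while any tool work remains unresolved.
  def functions?
    @pending.any?
  end

  # Resolve the next pending call; the real API also accepts other
  # wait strategies, which this mock ignores.
  def wait(_strategy)
    @pending.shift.call
  end
end

ctx = MockContext.new([-> { "weather: 21C" }, -> { "time: 14:00" }])
results = []
results << ctx.wait(:call) while ctx.functions?
```

The loop drains the queue in order and exits once `functions?` turns false, which is the pattern the changelog recommends over `ctx.functions.any?`.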
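The per-component cost breakdown the v9.0.0 entry describes (`input_costs`, `output_costs`, `cache_read_costs`, `cache_write_costs`, exposed via `#to_h`) can be illustrated with a self-contained sketch. The rates below are made-up USD-per-token numbers; llm.rb reads real pricing from its model registry, which this example does not touch:

```ruby
# Self-contained sketch of a per-component cost breakdown like the one
# the changelog describes. Rates are illustrative, not registry pricing.
Usage = Struct.new(:input_tokens, :output_tokens,
                   :cache_read_tokens, :cache_write_tokens,
                   keyword_init: true)

# Multiply each token counter by its per-token rate and return a
# structured hash, mirroring the shape of a Cost#to_h result.
def cost_to_h(usage, rates)
  {
    input_costs: usage.input_tokens * rates[:input],
    output_costs: usage.output_tokens * rates[:output],
    cache_read_costs: usage.cache_read_tokens * rates[:cache_read],
    cache_write_costs: usage.cache_write_tokens * rates[:cache_write]
  }
end

usage = Usage.new(input_tokens: 1_000, output_tokens: 500,
                  cache_read_tokens: 200, cache_write_tokens: 100)
rates = {input: 3e-6, output: 15e-6, cache_read: 3e-7, cache_write: 3.75e-6}
breakdown = cost_to_h(usage, rates)
```

Summing `breakdown.values` gives the total spend, while the hash keeps each component inspectable, which is the point of the structured `#to_h` form.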
@@ -96,8 +215,7 @@ and `acts_as_agent`.
 
  * **Allow `persistent: true` on `LLM::MCP.http`** <br>
    Let `LLM::MCP.http(...)` enable persistent HTTP transport directly
-   through `persistent: true`, instead of requiring a separate
-   `.persistent` call after construction.
+   through `persistent: true` at construction time.
 
  * **Expose `LLM::Function#runner` as public API** <br>
    Promote the internal runner instantiation to a public `runner` method on
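A minimal sketch of a class-level `stream` DSL like the `LLM::Agent.stream` support described in the v9.0.0 entry above, assuming it mirrors the `tracer` DSL (the class stores a block and each instance resolves its own stream from it); `MiniAgent` and `ChatAgent` are illustrative names, not llm.rb's implementation:

```ruby
# Illustrative class-level DSL: the class stores a stream-building block
# and each instance lazily resolves its own stream object from it.
class MiniAgent
  class << self
    attr_reader :stream_builder
  end

  # DSL entry point: capture the block for later, per-instance use.
  def self.stream(&block)
    @stream_builder = block
  end

  # Each instance memoizes its own stream; nil if no builder was given.
  def stream
    @stream ||= self.class.stream_builder&.call
  end
end

class ChatAgent < MiniAgent
  stream { [] } # block-based construction: each instance gets a fresh queue
end
```

Because the block runs once per instance, two `ChatAgent` instances receive distinct stream objects rather than sharing one class-level stream.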