llm.rb 8.1.0 → 9.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (67)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +120 -2
  3. data/README.md +161 -514
  4. data/lib/llm/active_record/acts_as_llm.rb +7 -8
  5. data/lib/llm/agent.rb +36 -16
  6. data/lib/llm/context.rb +30 -26
  7. data/lib/llm/contract/completion.rb +45 -0
  8. data/lib/llm/cost.rb +81 -4
  9. data/lib/llm/error.rb +1 -1
  10. data/lib/llm/function/array.rb +8 -5
  11. data/lib/llm/function/call_group.rb +39 -0
  12. data/lib/llm/function/fork/task.rb +6 -0
  13. data/lib/llm/function/ractor/task.rb +6 -0
  14. data/lib/llm/function/task.rb +10 -0
  15. data/lib/llm/function.rb +1 -0
  16. data/lib/llm/mcp/transport/http.rb +26 -46
  17. data/lib/llm/mcp/transport/stdio.rb +0 -8
  18. data/lib/llm/mcp.rb +6 -23
  19. data/lib/llm/provider.rb +23 -20
  20. data/lib/llm/providers/anthropic/error_handler.rb +6 -7
  21. data/lib/llm/providers/anthropic/files.rb +2 -2
  22. data/lib/llm/providers/anthropic/response_adapter/completion.rb +30 -0
  23. data/lib/llm/providers/anthropic.rb +1 -1
  24. data/lib/llm/providers/bedrock/error_handler.rb +8 -9
  25. data/lib/llm/providers/bedrock/models.rb +13 -13
  26. data/lib/llm/providers/bedrock/response_adapter/completion.rb +30 -0
  27. data/lib/llm/providers/bedrock.rb +1 -1
  28. data/lib/llm/providers/google/error_handler.rb +6 -7
  29. data/lib/llm/providers/google/files.rb +2 -4
  30. data/lib/llm/providers/google/images.rb +1 -1
  31. data/lib/llm/providers/google/models.rb +0 -2
  32. data/lib/llm/providers/google/response_adapter/completion.rb +30 -0
  33. data/lib/llm/providers/google.rb +1 -1
  34. data/lib/llm/providers/ollama/error_handler.rb +6 -7
  35. data/lib/llm/providers/ollama/models.rb +0 -2
  36. data/lib/llm/providers/ollama/response_adapter/completion.rb +30 -0
  37. data/lib/llm/providers/ollama.rb +1 -1
  38. data/lib/llm/providers/openai/audio.rb +3 -3
  39. data/lib/llm/providers/openai/error_handler.rb +6 -7
  40. data/lib/llm/providers/openai/files.rb +2 -2
  41. data/lib/llm/providers/openai/images.rb +3 -3
  42. data/lib/llm/providers/openai/models.rb +1 -1
  43. data/lib/llm/providers/openai/response_adapter/completion.rb +42 -0
  44. data/lib/llm/providers/openai/response_adapter/responds.rb +39 -0
  45. data/lib/llm/providers/openai/responses.rb +2 -2
  46. data/lib/llm/providers/openai/vector_stores.rb +1 -1
  47. data/lib/llm/providers/openai.rb +1 -1
  48. data/lib/llm/response.rb +10 -8
  49. data/lib/llm/sequel/plugin.rb +7 -8
  50. data/lib/llm/stream/queue.rb +15 -42
  51. data/lib/llm/stream.rb +4 -4
  52. data/lib/llm/transport/execution.rb +67 -0
  53. data/lib/llm/transport/http.rb +134 -0
  54. data/lib/llm/transport/persistent_http.rb +152 -0
  55. data/lib/llm/transport/response/http.rb +113 -0
  56. data/lib/llm/transport/response.rb +112 -0
  57. data/lib/llm/{provider/transport/http → transport}/stream_decoder.rb +8 -4
  58. data/lib/llm/transport.rb +139 -0
  59. data/lib/llm/usage.rb +14 -5
  60. data/lib/llm/version.rb +1 -1
  61. data/lib/llm.rb +2 -12
  62. data/llm.gemspec +2 -16
  63. metadata +11 -19
  64. data/lib/llm/provider/transport/http/execution.rb +0 -115
  65. data/lib/llm/provider/transport/http/interruptible.rb +0 -114
  66. data/lib/llm/provider/transport/http.rb +0 -145
  67. data/lib/llm/utils.rb +0 -19
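The new `data/lib/llm/transport/*.rb` files listed above replace the old `provider/transport/http` tree with constructor-selected backends. A minimal sketch of that pattern, using stand-in classes rather than llm.rb's real transports (`FakeNetHTTP`, `FakeNetHTTPPersistent`, and `Client` are illustrative names only):

```ruby
# Sketch of constructor-driven transport selection, the pattern the new
# transport files implement. FakeNetHTTP and FakeNetHTTPPersistent are
# stand-ins for illustration, not llm.rb classes.
class FakeNetHTTP
  def persistent?
    false
  end
end

class FakeNetHTTPPersistent
  def persistent?
    true
  end
end

class Client
  attr_reader :transport

  # The backend is chosen once, at construction time; there is no
  # post-hoc persist!-style mutation. An explicit transport: override
  # wins over the persistent: flag.
  def initialize(persistent: false, transport: nil)
    @transport = transport || (persistent ? FakeNetHTTPPersistent : FakeNetHTTP).new
  end
end
```

With this shape, `Client.new(persistent: true).transport` is persistent from the start, and passing `transport:` lets callers reuse one backend across clients.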
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 8aa3ee461642fb157bece63a4ebe00ceda8ec66ce24df5c842efdcc176861a53
- data.tar.gz: 2d26e36b812704a80e5c8ba4814cfbec770afd5694be71b69d7937422f9a642c
+ metadata.gz: 197ff330dc5e414f4f9291835fbcdeece4450ee3a8d3748e4f9cf28a46db07b1
+ data.tar.gz: 3020a4511134f292ed38c6fc826b157f05cc31c722e9fe52692b8b2f705551c7
  SHA512:
- metadata.gz: 3a30bf9d5309bf49c660137ed5e81b74f9b028f8846077f3db0b7c92745a5d96b16115db765a4bd1970ba0cbaaa7bd805e0a4a37c04c7e63aacdf3d019d268ec
- data.tar.gz: 4e297d159dc459ee9ec228862f271b7a21be48ce06f092773c4b56d9cc007252b1cfeb66a119c7e14f3e683213e5923d34b0c256397b92cfa981cd47fe023008
+ metadata.gz: 41f733d7d5b8a329420497f85c289f070f9016cf4d1bfdf5c5e49e274714310f5285e5598d4d27f5daa84cf26e837f311541ac8526bd987d7c7d917eb60eca21
+ data.tar.gz: f751b3887bd380e8f911106bedf6a0c606bcdc813ea4d7b01f2f332311ddd974c6dcfe838c7db5446d1683844ffbf379b2f5c8ef4b94459bedefaafc70be2098
data/CHANGELOG.md CHANGED
@@ -2,6 +2,125 @@
 
  ## Unreleased
 
+ ## v9.0.0
+
+ Changes since `v8.1.0`.
+
+ This release deepens llm.rb's transport and cost-tracking surface. It
+ replaces the old mutable `persist!` API with constructor-driven transport
+ selection, removes `#call` from contexts and agents in favor of explicit
+ `ctx.wait(:call)`, makes queued stream waits strategy-free, and deletes
+ the unused `LLM::Utils` module.
+
+ It adds cache read/write token tracking
+ with corresponding cost components, audio and image token pricing,
+ `LLM::Context#functions?` for queue-aware tool loops,
+ `LLM::Agent.stream` DSL support, and exposes `#stream` readers on
+ contexts and agents.
+
+ The HTTP transport layer has been refactored around shared backends so
+ providers, MCP, and custom transports all use the same normalized
+ response interface.
+
+ ### Breaking
+
+ * **Remove `#call` as a context and agent tool-loop API** <br>
+   Remove `LLM::Context#call(:functions)` and `LLM::Agent#call(:functions)`.
+   Tool loops should use `ctx.wait(:call)` or `agent.wait(:call)` instead.
+   The ActiveRecord and Sequel wrappers no longer expose `#call` passthroughs
+   for stored llm.rb contexts.
+
+ * **Make HTTP transport selection constructor-driven** <br>
+   Remove public `persist!` and `.persistent` mutation APIs from
+   providers, transports, and MCP clients. Select persistent behavior at
+   construction time with `persistent: true`, `LLM::Transport.net_http`,
+   `LLM::Transport.net_http_persistent`, or an explicit `transport:`
+   override.
+
+ * **Make queued stream waits strategy-free** <br>
+   Change `LLM::Stream::Queue#wait` to resolve queued work by the actual
+   task types already present in the queue instead of accepting an
+   external wait strategy. `LLM::Stream#wait(...)` remains compatible but
+   now ignores its arguments when delegating to the queue.
+
+ * **Remove unused `LLM::Utils`** <br>
+   Delete the `LLM::Utils` module and remove its remaining unused
+   provider includes and top-level require.
+
+ ### Add
+
+ * **Expose `#stream` readers on contexts and agents** <br>
+   Add public `LLM::Context#stream` and `LLM::Agent#stream` accessors so
+   callers can inspect the active stream object directly.
+
+ * **Track cache read and write tokens in usage** <br>
+   Add `cache_read_tokens` and `cache_write_tokens` to `LLM::Usage` and
+   preserve them through completion usage adaptation and context usage
+   aggregation.
+
+ * **Add `LLM::Context#functions?` for queue-aware tool loops** <br>
+   Add `functions?` to `LLM::Context` and the ActiveRecord and Sequel
+   wrappers so callers can detect pending tool work through either the
+   bound stream queue or unresolved functions, and update the docs to
+   prefer `while ctx.functions?` over `ctx.functions.any?` in tool-loop
+   examples.
+
+ * **Add `:call` as a first-class wait strategy** <br>
+   Add `:call` to pending-function wait paths so `ctx.wait(:call)` can
+   prefer queued streamed work when present and otherwise fall back to
+   direct sequential function execution through `spawn(:call).wait`.
+
+ * **Read provider cache usage into completion responses** <br>
+   Read cache read tokens from provider usage metadata, including OpenAI
+   `usage.prompt_tokens_details` and Anthropic
+   `usage.cache_read_input_tokens`. Read Anthropic cache write tokens
+   from `usage.cache_creation_input_tokens`, and expose explicit
+   zero-valued `cache_write_tokens` methods on providers that do not
+   report cache creation usage.
+
+ * **Extend cost tracking with cache write pricing** <br>
+   Extend `LLM::Cost` with `cache_read_costs`, `cache_write_costs`, and
+   `reasoning_costs` alongside the existing `input_costs` and
+   `output_costs`. Add `#to_h` for structured cost insight and update
+   `ctx.cost` to calculate all available components from registry
+   pricing data.
+
+ * **Price input and output audio separately** <br>
+   Track `input_audio_tokens` and `output_audio_tokens` in usage and
+   include `input_audio_costs` and `output_audio_costs` in `LLM::Cost`
+   so multimodal requests report accurate audio spend.
+
+ * **Track image tokens in input cost reporting** <br>
+   Add `input_image_tokens` to usage and include `input_image_costs` in
+   `LLM::Cost` using the model's generic input rate so image-bearing
+   prompts report their input spend.
+
+ * **Add `LLM::Agent.stream` DSL support** <br>
+   Let agents define a default `stream` through the class DSL, including
+   block-based stream construction so each agent instance can resolve its
+   stream the same way `tracer` does.
+
+ ### Change
+
+ * **Refactor HTTP transports around shared backends** <br>
+   Split `Net::HTTP` and `Net::HTTP::Persistent` into separate
+   `LLM::Transport` implementations, move HTTP-specific request helpers
+   and response execution into the shared transport layer, and let MCP
+   HTTP wrap those transports instead of maintaining a separate
+   transient/persistent client split.
+
+ * **Share transport overrides across providers and MCP** <br>
+   Let both provider construction and `LLM::MCP.http(...)` accept
+   `LLM::Transport` instances or classes as HTTP transport overrides, so
+   callers can reuse the same transport implementation across the
+   runtime.
+
+ * **Let custom transports adapt their own response objects** <br>
+   Introduce a transport response interface so custom transports can
+   adapt backend-specific response objects to one normalized shape and
+   have them work with the existing provider execution and error-handling
+   code.
+
  ## v8.1.0
 
  Changes since `v8.0.0`.
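The tool-loop shape described in the v9.0.0 entry above (drive pending work with `while ctx.functions?` and `ctx.wait(:call)`) can be sketched with a hypothetical mock; the real `LLM::Context` resolves provider tool calls, which this stub does not:

```ruby
# Hypothetical mock of the v9 tool-loop shape: functions? reports
# pending tool work and wait(:call) resolves the next unit of it.
# This stub queues plain lambdas instead of real provider tool calls.
class MockContext
  def initialize(pending_calls)
    @pending = pending_calls
  end

  # True while any tool work remains unresolved.
  def functions?
    @pending.any?
  end

  # Resolve the next pending call; the real API also accepts other
  # wait strategies, which this mock ignores.
  def wait(_strategy)
    @pending.shift.call
  end
end

ctx = MockContext.new([-> { "weather: 21C" }, -> { "time: 14:00" }])
results = []
results << ctx.wait(:call) while ctx.functions?
```

The loop drains the queue in order and exits once `functions?` turns false, which is the pattern the changelog recommends over `ctx.functions.any?`.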
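The per-component cost breakdown the v9.0.0 entry describes (`input_costs`, `output_costs`, `cache_read_costs`, `cache_write_costs`, exposed via `#to_h`) can be illustrated with a self-contained sketch. The rates below are made-up USD-per-token numbers; llm.rb reads real pricing from its model registry, which this example does not touch:

```ruby
# Self-contained sketch of a per-component cost breakdown like the one
# the changelog describes. Rates are illustrative, not registry pricing.
Usage = Struct.new(:input_tokens, :output_tokens,
                   :cache_read_tokens, :cache_write_tokens,
                   keyword_init: true)

# Multiply each token counter by its per-token rate and return a
# structured hash, mirroring the shape of a Cost#to_h result.
def cost_to_h(usage, rates)
  {
    input_costs: usage.input_tokens * rates[:input],
    output_costs: usage.output_tokens * rates[:output],
    cache_read_costs: usage.cache_read_tokens * rates[:cache_read],
    cache_write_costs: usage.cache_write_tokens * rates[:cache_write]
  }
end

usage = Usage.new(input_tokens: 1_000, output_tokens: 500,
                  cache_read_tokens: 200, cache_write_tokens: 100)
rates = {input: 3e-6, output: 15e-6, cache_read: 3e-7, cache_write: 3.75e-6}
breakdown = cost_to_h(usage, rates)
```

Summing `breakdown.values` gives the total spend, while the hash keeps each component inspectable, which is the point of the structured `#to_h` form.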
@@ -96,8 +215,7 @@ and `acts_as_agent`.
 
  * **Allow `persistent: true` on `LLM::MCP.http`** <br>
    Let `LLM::MCP.http(...)` enable persistent HTTP transport directly
-   through `persistent: true`, instead of requiring a separate
-   `.persistent` call after construction.
+   through `persistent: true` at construction time.
 
  * **Expose `LLM::Function#runner` as public API** <br>
    Promote the internal runner instantiation to a public `runner` method on
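A minimal sketch of a class-level `stream` DSL like the `LLM::Agent.stream` support described in the v9.0.0 entry above, assuming it mirrors the `tracer` DSL (the class stores a block and each instance resolves its own stream from it); `MiniAgent` and `ChatAgent` are illustrative names, not llm.rb's implementation:

```ruby
# Illustrative class-level DSL: the class stores a stream-building block
# and each instance lazily resolves its own stream object from it.
class MiniAgent
  class << self
    attr_reader :stream_builder
  end

  # DSL entry point: capture the block for later, per-instance use.
  def self.stream(&block)
    @stream_builder = block
  end

  # Each instance memoizes its own stream; nil if no builder was given.
  def stream
    @stream ||= self.class.stream_builder&.call
  end
end

class ChatAgent < MiniAgent
  stream { [] } # block-based construction: each instance gets a fresh queue
end
```

Because the block runs once per instance, two `ChatAgent` instances receive distinct stream objects rather than sharing one class-level stream.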