llm.rb 8.0.0 → 9.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (80)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +165 -2
  3. data/README.md +161 -509
  4. data/data/bedrock.json +2948 -0
  5. data/data/deepseek.json +8 -8
  6. data/data/openai.json +39 -2
  7. data/data/xai.json +35 -0
  8. data/data/zai.json +1 -1
  9. data/lib/llm/active_record/acts_as_llm.rb +7 -8
  10. data/lib/llm/agent.rb +36 -16
  11. data/lib/llm/context.rb +30 -26
  12. data/lib/llm/contract/completion.rb +45 -0
  13. data/lib/llm/cost.rb +81 -4
  14. data/lib/llm/error.rb +1 -1
  15. data/lib/llm/function/array.rb +8 -5
  16. data/lib/llm/function/call_group.rb +39 -0
  17. data/lib/llm/function/fork/task.rb +6 -0
  18. data/lib/llm/function/ractor/task.rb +6 -0
  19. data/lib/llm/function/task.rb +10 -0
  20. data/lib/llm/function.rb +1 -0
  21. data/lib/llm/mcp/transport/http.rb +26 -46
  22. data/lib/llm/mcp/transport/stdio.rb +0 -8
  23. data/lib/llm/mcp.rb +6 -23
  24. data/lib/llm/object.rb +8 -0
  25. data/lib/llm/provider.rb +29 -19
  26. data/lib/llm/providers/anthropic/error_handler.rb +6 -7
  27. data/lib/llm/providers/anthropic/files.rb +2 -2
  28. data/lib/llm/providers/anthropic/response_adapter/completion.rb +30 -0
  29. data/lib/llm/providers/anthropic.rb +1 -1
  30. data/lib/llm/providers/bedrock/error_handler.rb +79 -0
  31. data/lib/llm/providers/bedrock/models.rb +109 -0
  32. data/lib/llm/providers/bedrock/request_adapter/completion.rb +153 -0
  33. data/lib/llm/providers/bedrock/request_adapter.rb +95 -0
  34. data/lib/llm/providers/bedrock/response_adapter/completion.rb +173 -0
  35. data/lib/llm/providers/bedrock/response_adapter/models.rb +34 -0
  36. data/lib/llm/providers/bedrock/response_adapter.rb +40 -0
  37. data/lib/llm/providers/bedrock/signature.rb +166 -0
  38. data/lib/llm/providers/bedrock/stream_decoder.rb +140 -0
  39. data/lib/llm/providers/bedrock/stream_parser.rb +201 -0
  40. data/lib/llm/providers/bedrock.rb +272 -0
  41. data/lib/llm/providers/google/error_handler.rb +6 -7
  42. data/lib/llm/providers/google/files.rb +2 -4
  43. data/lib/llm/providers/google/images.rb +1 -1
  44. data/lib/llm/providers/google/models.rb +0 -2
  45. data/lib/llm/providers/google/response_adapter/completion.rb +30 -0
  46. data/lib/llm/providers/google.rb +1 -1
  47. data/lib/llm/providers/ollama/error_handler.rb +6 -7
  48. data/lib/llm/providers/ollama/models.rb +0 -2
  49. data/lib/llm/providers/ollama/response_adapter/completion.rb +30 -0
  50. data/lib/llm/providers/ollama.rb +1 -1
  51. data/lib/llm/providers/openai/audio.rb +3 -3
  52. data/lib/llm/providers/openai/error_handler.rb +6 -7
  53. data/lib/llm/providers/openai/files.rb +2 -2
  54. data/lib/llm/providers/openai/images.rb +3 -3
  55. data/lib/llm/providers/openai/models.rb +1 -1
  56. data/lib/llm/providers/openai/response_adapter/completion.rb +42 -0
  57. data/lib/llm/providers/openai/response_adapter/responds.rb +39 -0
  58. data/lib/llm/providers/openai/responses.rb +2 -2
  59. data/lib/llm/providers/openai/vector_stores.rb +1 -1
  60. data/lib/llm/providers/openai.rb +1 -1
  61. data/lib/llm/response.rb +10 -8
  62. data/lib/llm/sequel/plugin.rb +7 -8
  63. data/lib/llm/stream/queue.rb +15 -42
  64. data/lib/llm/stream.rb +4 -4
  65. data/lib/llm/transport/execution.rb +67 -0
  66. data/lib/llm/transport/http.rb +134 -0
  67. data/lib/llm/transport/persistent_http.rb +152 -0
  68. data/lib/llm/transport/response/http.rb +113 -0
  69. data/lib/llm/transport/response.rb +112 -0
  70. data/lib/llm/{provider/transport/http → transport}/stream_decoder.rb +8 -4
  71. data/lib/llm/transport.rb +139 -0
  72. data/lib/llm/usage.rb +14 -5
  73. data/lib/llm/version.rb +1 -1
  74. data/lib/llm.rb +10 -12
  75. data/llm.gemspec +2 -16
  76. metadata +23 -19
  77. data/lib/llm/provider/transport/http/execution.rb +0 -115
  78. data/lib/llm/provider/transport/http/interruptible.rb +0 -114
  79. data/lib/llm/provider/transport/http.rb +0 -145
  80. data/lib/llm/utils.rb +0 -19
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4d726213f6b63342582738a133f7f82c1158934d6f25a48ae6b6c9e59a8f8262
- data.tar.gz: 6288d177adc7a07a37368066329c882f746747d5bed9ffba7cb50d2bcbd1d98c
+ metadata.gz: 197ff330dc5e414f4f9291835fbcdeece4450ee3a8d3748e4f9cf28a46db07b1
+ data.tar.gz: 3020a4511134f292ed38c6fc826b157f05cc31c722e9fe52692b8b2f705551c7
  SHA512:
- metadata.gz: 4ae089f4117dc384000a70500c40ebadf48f42d1bd820d0840568b3b31b0197e51c65e9f60fe65d0e75c23aa4c7eac977be928a38969580174169bd0efe39912
- data.tar.gz: 9653135f93b9b2b722102f055dc961346949368dab161a3cff64e99ddfc6781933a94b527151da9a24ff39451814f76c5409389f91c3692852eb17bd5d3d11f9
+ metadata.gz: 41f733d7d5b8a329420497f85c289f070f9016cf4d1bfdf5c5e49e274714310f5285e5598d4d27f5daa84cf26e837f311541ac8526bd987d7c7d917eb60eca21
+ data.tar.gz: f751b3887bd380e8f911106bedf6a0c606bcdc813ea4d7b01f2f332311ddd974c6dcfe838c7db5446d1683844ffbf379b2f5c8ef4b94459bedefaafc70be2098
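The digests in `checksums.yaml` cover the `metadata.gz` and `data.tar.gz` members of the `.gem` archive. As a minimal sketch, assuming the archive has already been unpacked locally, a member can be checked against the published SHA256 value with Ruby's standard library alone. The helper name `digest_matches?` and the file path in the usage comment are hypothetical.

```ruby
require "digest"

# Hypothetical helper: compare the SHA256 of a file on disk
# with an expected hex digest (case-insensitive).
def digest_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Usage (assumes data.tar.gz was extracted from the 9.0.0 .gem archive):
# digest_matches?(
#   "data.tar.gz",
#   "3020a4511134f292ed38c6fc826b157f05cc31c722e9fe52692b8b2f705551c7"
# )
```

A mismatch only indicates that the local file differs from the one the registry recorded; it does not say which side is wrong.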
data/CHANGELOG.md CHANGED
@@ -2,6 +2,170 @@
 
  ## Unreleased
 
+ ## v9.0.0
+
+ Changes since `v8.1.0`.
+
+ This release deepens llm.rb's transport and cost-tracking surface. It
+ replaces the old mutable `persist!` API with constructor-driven transport
+ selection, removes `#call` from contexts and agents in favor of explicit
+ `ctx.wait(:call)`, makes queued stream waits strategy-free, and deletes
+ the unused `LLM::Utils` module.
+
+ It adds cache read/write token tracking
+ with corresponding cost components, audio and image token pricing,
+ `LLM::Context#functions?` for queue-aware tool loops,
+ `LLM::Agent.stream` DSL support, and exposes `#stream` readers on
+ contexts and agents.
+
+ The HTTP transport layer has been refactored around shared backends so
+ providers, MCP, and custom transports all use the same normalized
+ response interface.
+
+ ### Breaking
+
+ * **Remove `#call` as a context and agent tool-loop API** <br>
+ Remove `LLM::Context#call(:functions)` and `LLM::Agent#call(:functions)`.
+ Tool loops should use `ctx.wait(:call)` or `agent.wait(:call)` instead.
+ The ActiveRecord and Sequel wrappers no longer expose `#call` passthroughs
+ for stored llm.rb contexts.
+
+ * **Make HTTP transport selection constructor-driven** <br>
+ Remove public `persist!` and `.persistent` mutation APIs from
+ providers, transports, and MCP clients. Select persistent behavior at
+ construction time with `persistent: true`, `LLM::Transport.net_http`,
+ `LLM::Transport.net_http_persistent`, or an explicit `transport:`
+ override.
+
+ * **Make queued stream waits strategy-free** <br>
+ Change `LLM::Stream::Queue#wait` to resolve queued work by the actual
+ task types already present in the queue instead of accepting an
+ external wait strategy. `LLM::Stream#wait(...)` remains compatible but
+ now ignores its arguments when delegating to the queue.
+
+ * **Remove unused `LLM::Utils`** <br>
+ Delete the `LLM::Utils` module and remove its remaining unused
+ provider includes and top-level require.
+
+ ### Add
+
+ * **Expose `#stream` readers on contexts and agents** <br>
+ Add public `LLM::Context#stream` and `LLM::Agent#stream` accessors so
+ callers can inspect the active stream object directly.
+
+ * **Track cache read and write tokens in usage** <br>
+ Add `cache_read_tokens` and `cache_write_tokens` to `LLM::Usage` and
+ preserve them through completion usage adaptation and context usage
+ aggregation.
+
+ * **Add `LLM::Context#functions?` for queue-aware tool loops** <br>
+ Add `functions?` to `LLM::Context` and the ActiveRecord and Sequel
+ wrappers so callers can detect pending tool work through either the
+ bound stream queue or unresolved functions, and update the docs to
+ prefer `while ctx.functions?` over `ctx.functions.any?` in tool-loop
+ examples.
+
+ * **Add `:call` as a first-class wait strategy** <br>
+ Add `:call` to pending-function wait paths so `ctx.wait(:call)` can
+ prefer queued streamed work when present and otherwise fall back to
+ direct sequential function execution through `spawn(:call).wait`.
+
+ * **Read provider cache usage into completion responses** <br>
+ Read cache read tokens from provider usage metadata, including OpenAI
+ `usage.prompt_tokens_details` and Anthropic
+ `usage.cache_read_input_tokens`. Read Anthropic cache write tokens
+ from `usage.cache_creation_input_tokens`, and expose explicit
+ zero-valued `cache_write_tokens` methods on providers that do not
+ report cache creation usage.
+
+ * **Extend cost tracking with cache write pricing** <br>
+ Extend `LLM::Cost` with `cache_read_costs`, `cache_write_costs`, and
+ `reasoning_costs` alongside the existing `input_costs` and
+ `output_costs`. Add `#to_h` for structured cost insight and update
+ `ctx.cost` to calculate all available components from registry
+ pricing data.
+
+ * **Price input and output audio separately** <br>
+ Track `input_audio_tokens` and `output_audio_tokens` in usage and
+ include `input_audio_costs` and `output_audio_costs` in `LLM::Cost`
+ so multimodal requests report accurate audio spend.
+
+ * **Track image tokens in input cost reporting** <br>
+ Add `input_image_tokens` to usage and include `input_image_costs` in
+ `LLM::Cost` using the model's generic input rate so image-bearing
+ prompts report their input spend.
+
+ * **Add `LLM::Agent.stream` DSL support** <br>
+ Let agents define a default `stream` through the class DSL, including
+ block-based stream construction so each agent instance can resolve its
+ stream the same way `tracer` does.
+
+ ### Change
+
+ * **Refactor HTTP transports around shared backends** <br>
+ Split `Net::HTTP` and `Net::HTTP::Persistent` into separate
+ `LLM::Transport` implementations, move HTTP-specific request helpers
+ and response execution into the shared transport layer, and let MCP
+ HTTP wrap those transports instead of maintaining a separate
+ transient/persistent client split.
+
+ * **Share transport overrides across providers and MCP** <br>
+ Let both provider construction and `LLM::MCP.http(...)` accept
+ `LLM::Transport` instances or classes as HTTP transport overrides, so
+ callers can reuse the same transport implementation across the
+ runtime.
+
+ * **Let custom transports adapt their own response objects** <br>
+ Introduce a transport response interface so custom transports can
+ adapt backend-specific response objects to one normalized shape and
+ have them work with the existing provider execution and error-handling
+ code.
+
+ ## v8.1.0
+
+ Changes since `v8.0.0`.
+
+ This release adds Amazon Bedrock provider support through the Converse
+ API, including AWS SigV4 request signing, event stream decoding,
+ structured output through `schema:`, and a models.dev-backed registry.
+ It exposes `llm.models.all` for Bedrock via the ListFoundationModels
+ API and adds `LLM::Object#transform_values!` for in-place value
+ transformation. Several Bedrock-specific fixes land as well, including
+ response id exposure, blank text block suppression in tool turns, and
+ DSML tool-marker filtering in streamed text.
+
+ ### Add
+
+ * **Add AWS Bedrock provider support** <br>
+ Add `LLM.bedrock(...)` with Bedrock Converse chat support, AWS SigV4
+ request signing, Bedrock event stream decoding, structured output
+ support through `schema:`, and models.dev-backed `bedrock.json`
+ registry generation.
+
+ * **Add AWS Bedrock Models endpoint support** <br>
+ Add `llm.models.all` for Bedrock via the ListFoundationModels API,
+ including SigV4 signing for the control-plane endpoint and normalized
+ `LLM::Model` collection responses.
+
+ * **Add `LLM::Object#transform_values!`** <br>
+ Let `LLM::Object` transform stored values in place through
+ `#transform_values!`.
+
+ ### Fix
+
+ * **Expose response ids on Bedrock completion responses** <br>
+ Read the Bedrock request id into `LLM::Response#id` for completion
+ responses adapted from the Converse API.
+
+ * **Avoid blank assistant text blocks in Bedrock tool turns** <br>
+ Stop replaying assistant tool-call messages with empty text content
+ blocks that Bedrock rejects.
+
+ * **Suppress Bedrock DSML tool markers in streamed text** <br>
+ Filter `"<|DSML|function_calls"` markers out of streamed Bedrock
+ assistant text so tool-call sentinels do not leak into user-visible
+ output.
+
  ## v8.0.0
 
  Changes since `v7.0.0`.
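The v9.0.0 changelog entry above describes `LLM::Cost` as a set of per-component costs (input, output, cache read, cache write, and so on) derived from token usage and registry pricing. The following is a rough, self-contained sketch of that idea only; it is not llm.rb's implementation. The `price_components` helper, the per-million-token pricing shape, and the rates themselves are assumptions made for illustration.

```ruby
# Illustrative only: hypothetical helper, not part of llm.rb.
# usage:   token counts keyed by component
# pricing: assumed USD rates per one million tokens, keyed the same way
def price_components(usage, pricing)
  usage.each_with_object({}) do |(component, tokens), costs|
    # Components without a known rate are priced at zero.
    rate = pricing.fetch(component, 0)
    # Rational arithmetic avoids floating-point drift on small amounts.
    costs[component] = tokens * rate.to_r / 1_000_000
  end
end

# Made-up token counts and rates, chosen only to exercise the helper.
usage   = {input: 1_000, output: 500, cache_read: 2_000, cache_write: 0}
pricing = {input: 3, output: 15, cache_read: Rational(3, 10), cache_write: Rational(15, 4)}

costs = price_components(usage, pricing)
total = costs.values.sum
```

Keeping each component separate, rather than returning a single total, mirrors the changelog's point that a structured breakdown is what makes cache and multimodal spend visible.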
@@ -51,8 +215,7 @@ and `acts_as_agent`.
 
  * **Allow `persistent: true` on `LLM::MCP.http`** <br>
  Let `LLM::MCP.http(...)` enable persistent HTTP transport directly
- through `persistent: true`, instead of requiring a separate
- `.persistent` call after construction.
+ through `persistent: true` at construction time.
 
  * **Expose `LLM::Function#runner` as public API** <br>
  Promote the internal runner instantiation to a public `runner` method on