llm.rb 8.1.0 → 10.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (86)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +196 -6
  3. data/README.md +233 -518
  4. data/data/anthropic.json +278 -258
  5. data/data/bedrock.json +1288 -1561
  6. data/data/deepseek.json +38 -38
  7. data/data/google.json +656 -579
  8. data/data/openai.json +860 -818
  9. data/data/xai.json +243 -552
  10. data/data/zai.json +168 -168
  11. data/lib/llm/active_record/acts_as_agent.rb +5 -0
  12. data/lib/llm/active_record/acts_as_llm.rb +7 -8
  13. data/lib/llm/active_record.rb +1 -6
  14. data/lib/llm/agent.rb +121 -82
  15. data/lib/llm/context.rb +79 -74
  16. data/lib/llm/contract/completion.rb +45 -0
  17. data/lib/llm/cost.rb +81 -4
  18. data/lib/llm/error.rb +1 -1
  19. data/lib/llm/function/array.rb +8 -5
  20. data/lib/llm/function/call_group.rb +39 -0
  21. data/lib/llm/function/call_task.rb +46 -0
  22. data/lib/llm/function/fork/task.rb +6 -0
  23. data/lib/llm/function/ractor/task.rb +6 -0
  24. data/lib/llm/function/task.rb +10 -0
  25. data/lib/llm/function.rb +28 -1
  26. data/lib/llm/mcp/transport/http.rb +26 -46
  27. data/lib/llm/mcp/transport/stdio.rb +0 -8
  28. data/lib/llm/mcp.rb +6 -23
  29. data/lib/llm/provider.rb +30 -20
  30. data/lib/llm/providers/anthropic/error_handler.rb +6 -7
  31. data/lib/llm/providers/anthropic/files.rb +2 -2
  32. data/lib/llm/providers/anthropic/response_adapter/completion.rb +30 -0
  33. data/lib/llm/providers/anthropic/stream_parser.rb +2 -2
  34. data/lib/llm/providers/anthropic.rb +1 -1
  35. data/lib/llm/providers/bedrock/error_handler.rb +8 -9
  36. data/lib/llm/providers/bedrock/models.rb +13 -13
  37. data/lib/llm/providers/bedrock/response_adapter/completion.rb +30 -0
  38. data/lib/llm/providers/bedrock/stream_parser.rb +2 -2
  39. data/lib/llm/providers/bedrock.rb +1 -1
  40. data/lib/llm/providers/google/error_handler.rb +6 -7
  41. data/lib/llm/providers/google/files.rb +2 -4
  42. data/lib/llm/providers/google/images.rb +1 -1
  43. data/lib/llm/providers/google/models.rb +0 -2
  44. data/lib/llm/providers/google/response_adapter/completion.rb +30 -0
  45. data/lib/llm/providers/google/stream_parser.rb +2 -2
  46. data/lib/llm/providers/google.rb +1 -1
  47. data/lib/llm/providers/ollama/error_handler.rb +6 -7
  48. data/lib/llm/providers/ollama/models.rb +0 -2
  49. data/lib/llm/providers/ollama/response_adapter/completion.rb +30 -0
  50. data/lib/llm/providers/ollama.rb +1 -1
  51. data/lib/llm/providers/openai/audio.rb +3 -3
  52. data/lib/llm/providers/openai/error_handler.rb +6 -7
  53. data/lib/llm/providers/openai/files.rb +2 -2
  54. data/lib/llm/providers/openai/images.rb +3 -3
  55. data/lib/llm/providers/openai/models.rb +1 -1
  56. data/lib/llm/providers/openai/response_adapter/completion.rb +42 -0
  57. data/lib/llm/providers/openai/response_adapter/responds.rb +39 -0
  58. data/lib/llm/providers/openai/responses/stream_parser.rb +2 -2
  59. data/lib/llm/providers/openai/responses.rb +2 -2
  60. data/lib/llm/providers/openai/stream_parser.rb +2 -2
  61. data/lib/llm/providers/openai/vector_stores.rb +1 -1
  62. data/lib/llm/providers/openai.rb +1 -1
  63. data/lib/llm/response.rb +10 -8
  64. data/lib/llm/schema.rb +11 -0
  65. data/lib/llm/sequel/agent.rb +5 -0
  66. data/lib/llm/sequel/plugin.rb +8 -14
  67. data/lib/llm/stream/queue.rb +15 -42
  68. data/lib/llm/stream.rb +15 -40
  69. data/lib/llm/tool/param.rb +1 -8
  70. data/lib/llm/transport/execution.rb +67 -0
  71. data/lib/llm/transport/http.rb +134 -0
  72. data/lib/llm/transport/persistent_http.rb +152 -0
  73. data/lib/llm/transport/response/http.rb +113 -0
  74. data/lib/llm/transport/response.rb +112 -0
  75. data/lib/llm/{provider/transport/http → transport}/stream_decoder.rb +8 -4
  76. data/lib/llm/transport.rb +139 -0
  77. data/lib/llm/usage.rb +14 -5
  78. data/lib/llm/utils.rb +24 -14
  79. data/lib/llm/version.rb +1 -1
  80. data/lib/llm.rb +3 -12
  81. data/llm.gemspec +2 -16
  82. metadata +13 -20
  83. data/lib/llm/bot.rb +0 -3
  84. data/lib/llm/provider/transport/http/execution.rb +0 -115
  85. data/lib/llm/provider/transport/http/interruptible.rb +0 -114
  86. data/lib/llm/provider/transport/http.rb +0 -145
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 8aa3ee461642fb157bece63a4ebe00ceda8ec66ce24df5c842efdcc176861a53
- data.tar.gz: 2d26e36b812704a80e5c8ba4814cfbec770afd5694be71b69d7937422f9a642c
+ metadata.gz: 6ba756238fa72e58ba774567a0c8e2a6d7351cb6313f9c8c08cbdeec8ec9cfa4
+ data.tar.gz: cba8295670dab2843cec902ae97b7ae14e775359380ba401ca5a0066eb60ad0e
  SHA512:
- metadata.gz: 3a30bf9d5309bf49c660137ed5e81b74f9b028f8846077f3db0b7c92745a5d96b16115db765a4bd1970ba0cbaaa7bd805e0a4a37c04c7e63aacdf3d019d268ec
- data.tar.gz: 4e297d159dc459ee9ec228862f271b7a21be48ce06f092773c4b56d9cc007252b1cfeb66a119c7e14f3e683213e5923d34b0c256397b92cfa981cd47fe023008
+ metadata.gz: b8347b2adfe05a4700ec42e0ed5992a1332355bd20330590d8b3de214d980476a490855ff7e69b5b36c75f3684304c4ee61bdff9ecbcf8001f0b477b8010d064
+ data.tar.gz: a41512ffbc52b3665118161251441152389ca9daba1a6f4e010303490938dc33393da62f5e821521b2a9f4b45d85fd219b558fa7d2e185c24f43777d26e36a14
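The values in this hunk are SHA-256 and SHA-512 hex digests for the gem's `metadata.gz` and `data.tar.gz` members. As a rough sketch, a file can be checked against such an entry with Ruby's standard `digest` library. The helper name `checksum_matches?` and the throwaway file are illustrative assumptions and are not part of llm.rb or RubyGems.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: compare a file's digest with an expected hex string,
# such as a value listed under SHA256 or SHA512 in checksums.yaml.
def checksum_matches?(path, expected_hex, algorithm: Digest::SHA256)
  algorithm.file(path).hexdigest == expected_hex.strip.downcase
end

# Demonstration with a temporary file standing in for data.tar.gz.
Tempfile.create("data.tar.gz") do |file|
  file.write("example archive bytes")
  file.flush
  expected = Digest::SHA256.hexdigest("example archive bytes")
  puts checksum_matches?(file.path, expected)   # digest of the same bytes
  puts checksum_matches?(file.path, "0" * 64)   # deliberately wrong digest
end
```

Passing `algorithm: Digest::SHA512` would check the longer digests in the same way.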
data/CHANGELOG.md CHANGED
@@ -2,6 +2,197 @@
 
  ## Unreleased
 
+ ## v10.0.0
+
+ Changes since `v9.0.0`.
+
+ This release unifies context turns under `#talk`, removes the
+ deprecated `LLM::Bot` alias, and adds shared option resolution
+ through `LLM::Utils`.
+
+ Class-level agent tunables can now be resolved lazily via Proc,
+ `Array[...]` schema/tool param types are supported, and a `key?`
+ method has been added on providers.
+
+ Agent tool confirmation hooks let selected tools be approved or
+ cancelled before execution. Keep reading to learn more.
+
+ ### Breaking
+
+ * **Unify context turns under `#talk`** <br>
+ Remove `LLM::Context#respond` and route responses-mode turns through
+ `LLM::Context#talk` with `mode: :responses` instead.
+
+ * **Remove the `LLM::Bot` alias** <br>
+ Remove the backward-compatible `LLM::Bot` alias for `LLM::Context`.
+ Use `LLM::Context` directly instead.
+
+ ### Add
+
+ * **Add shared option resolution through `LLM::Utils`** <br>
+ Add `LLM::Utils.resolve_option` for resolving configured values as
+ literals, procs, symbol-named methods, or duplicated hashes, and use
+ it in agent and ORM option resolution paths.
+
+ * **Resolve all class-level agent tunables via Proc** <br>
+ Let `model`, `tools`, `skills`, `schema`, `stream`, and `tracer`
+ declared with a block be lazily evaluated against the agent instance
+ at initialization time, matching how `stream` and `tracer` already
+ worked.
+
+ Add `LLM::Agent#params` for direct access to the underlying context
+ parameters.
+
+ Ported from mruby-llm.
+
+ * **Support `Array[...]` schema and tool param types** <br>
+ Let `LLM::Schema` properties and `LLM::Tool` params accept
+ `Array[...]` type declarations, including mixed item unions that are
+ serialized as `anyOf` array items.
+
+ * **Add `LLM::Provider#key?`** <br>
+ Add `key?` to providers so callers can check whether a non-blank API
+ key has been configured.
+
+ * **Add agent tool confirmation hooks** <br>
+ Add `LLM::Agent.confirm` and `LLM::Agent#on_tool_confirmation` so
+ selected tools can be approved or cancelled before execution. Pending
+ tool resolution now relies on `LLM::Context#functions` so confirmed
+ tools are not executed twice when mixed with unconfirmed tool calls.
+
+ * **Add `LLM::Function#spawn(:call).wait`** <br>
+ Add task-shaped sequential execution support for direct
+ `LLM::Function#spawn(:call).wait`.
+
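The option-resolution entry in the list above names four kinds of configured value: literals, procs, symbol-named methods, and hashes that are duplicated. The sketch below illustrates that dispatch in plain Ruby. It is not the library's code: the `OptionSketch` module, the `(target, value)` argument order, and the fallback of treating an unknown symbol as a literal are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for the behaviour the changelog describes.
module OptionSketch
  def self.resolve_option(target, value)
    case value
    when Proc
      # Evaluate lazily with the target (e.g. an agent instance) as self.
      target.instance_exec(&value)
    when Symbol
      # Assumption: a symbol naming a method on the target is called;
      # any other symbol is returned unchanged as a literal.
      target.respond_to?(value, true) ? target.send(value) : value
    when Hash
      # Return a copy so callers cannot mutate the configured default.
      value.dup
    else
      value
    end
  end
end

# Illustrative target object; DemoAgent is not an llm.rb class.
class DemoAgent
  def default_model
    "demo-model"
  end

  def region
    "eu"
  end
end

agent = DemoAgent.new
p OptionSketch.resolve_option(agent, "literal")
p OptionSketch.resolve_option(agent, proc { "model-#{region}" })
p OptionSketch.resolve_option(agent, :default_model)
p OptionSketch.resolve_option(agent, { temperature: 0 })
```

Evaluating procs with `instance_exec` is one way to get the "lazily evaluated against the agent instance" behaviour that the tunables entry describes. Whether llm.rb does it this way is not shown in this diff.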
+ ### Fix
+
+ * **Reduce private internal methods on `LLM::Stream`** <br>
+ Remove `tool_not_found` and `__tools__` from `LLM::Stream`. The
+ `__tools__` logic is inlined directly into `__find__` since that
+ was its only caller. The `tool_not_found` utility method was unused
+ externally and added unnecessary surface to LLM::Stream.
+
+ Ported from mruby-llm.
+
+ ## v9.0.0
+
+ Changes since `v8.1.0`.
+
+ This release deepens llm.rb's transport and cost-tracking surface. It
+ replaces the old mutable `persist!` API with constructor-driven transport
+ selection, removes `#call` from contexts and agents in favor of explicit
+ `ctx.wait(:call)`, makes queued stream waits strategy-free, and deletes
+ the unused `LLM::Utils` module.
+
+ It adds cache read/write token tracking
+ with corresponding cost components, audio and image token pricing,
+ `LLM::Context#functions?` for queue-aware tool loops,
+ `LLM::Agent.stream` DSL support, and exposes `#stream` readers on
+ contexts and agents.
+
+ The HTTP transport layer has been refactored around shared backends so
+ providers, MCP, and custom transports all use the same normalized
+ response interface.
+
+ ### Breaking
+
+ * **Remove `#call` as a context and agent tool-loop API** <br>
+ Remove `LLM::Context#call(:functions)` and `LLM::Agent#call(:functions)`.
+ Tool loops should use `ctx.wait(:call)` or `agent.wait(:call)` instead.
+ The ActiveRecord and Sequel wrappers no longer expose `#call` passthroughs
+ for stored llm.rb contexts.
+
+ * **Make HTTP transport selection constructor-driven** <br>
+ Remove public `persist!` and `.persistent` mutation APIs from
+ providers, transports, and MCP clients. Select persistent behavior at
+ construction time with `persistent: true`, `LLM::Transport.net_http`,
+ `LLM::Transport.net_http_persistent`, or an explicit `transport:`
+ override.
+
+ * **Make queued stream waits strategy-free** <br>
+ Change `LLM::Stream::Queue#wait` to resolve queued work by the actual
+ task types already present in the queue instead of accepting an
+ external wait strategy. `LLM::Stream#wait(...)` remains compatible but
+ now ignores its arguments when delegating to the queue.
+
+ * **Remove unused `LLM::Utils`** <br>
+ Delete the `LLM::Utils` module and remove its remaining unused
+ provider includes and top-level require.
+
+ ### Add
+
+ * **Expose `#stream` readers on contexts and agents** <br>
+ Add public `LLM::Context#stream` and `LLM::Agent#stream` accessors so
+ callers can inspect the active stream object directly.
+
+ * **Track cache read and write tokens in usage** <br>
+ Add `cache_read_tokens` and `cache_write_tokens` to `LLM::Usage` and
+ preserve them through completion usage adaptation and context usage
+ aggregation.
+
+ * **Add `LLM::Context#functions?` for queue-aware tool loops** <br>
+ Add `functions?` to `LLM::Context` and the ActiveRecord and Sequel
+ wrappers so callers can detect pending tool work through either the
+ bound stream queue or unresolved functions, and update the docs to
+ prefer `while ctx.functions?` over `ctx.functions.any?` in tool-loop
+ examples.
+
+ * **Add `:call` as a first-class wait strategy** <br>
+ Add `:call` to pending-function wait paths so `ctx.wait(:call)` can
+ prefer queued streamed work when present and otherwise fall back to
+ direct sequential function execution through `spawn(:call).wait`.
+
+ * **Read provider cache usage into completion responses** <br>
+ Read cache read tokens from provider usage metadata, including OpenAI
+ `usage.prompt_tokens_details` and Anthropic
+ `usage.cache_read_input_tokens`. Read Anthropic cache write tokens
+ from `usage.cache_creation_input_tokens`, and expose explicit
+ zero-valued `cache_write_tokens` methods on providers that do not
+ report cache creation usage.
+
+ * **Extend cost tracking with cache write pricing** <br>
+ Extend `LLM::Cost` with `cache_read_costs`, `cache_write_costs`, and
+ `reasoning_costs` alongside the existing `input_costs` and
+ `output_costs`. Add `#to_h` for structured cost insight and update
+ `ctx.cost` to calculate all available components from registry
+ pricing data.
+
+ * **Price input and output audio separately** <br>
+ Track `input_audio_tokens` and `output_audio_tokens` in usage and
+ include `input_audio_costs` and `output_audio_costs` in `LLM::Cost`
+ so multimodal requests report accurate audio spend.
+
+ * **Track image tokens in input cost reporting** <br>
+ Add `input_image_tokens` to usage and include `input_image_costs` in
+ `LLM::Cost` using the model's generic input rate so image-bearing
+ prompts report their input spend.
+
+ * **Add `LLM::Agent.stream` DSL support** <br>
+ Let agents define a default `stream` through the class DSL, including
+ block-based stream construction so each agent instance can resolve its
+ stream the same way `tracer` does.
+
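The cost entries in the list above describe per-component costs derived from token counts and pricing data. As a rough illustration of that arithmetic only, the sketch below multiplies each token count by a per-million-token rate. The method `cost_breakdown`, the hash keys used for rates, and the rates themselves are hypothetical placeholders, not llm.rb's API and not real provider prices. Whether cached tokens are also counted inside input tokens varies by provider and is ignored here.

```ruby
# Hypothetical sketch: one cost per component, in the same currency unit
# as the rates, which are expressed per one million tokens.
def cost_breakdown(usage, rates)
  components = {
    input_costs: [:input_tokens, :input],
    output_costs: [:output_tokens, :output],
    cache_read_costs: [:cache_read_tokens, :cache_read],
    cache_write_costs: [:cache_write_tokens, :cache_write]
  }
  components.transform_values do |pair|
    token_key, rate_key = pair
    # Missing token counts or rates contribute zero.
    usage.fetch(token_key, 0) * rates.fetch(rate_key, 0.0) / 1_000_000.0
  end
end

# Illustrative numbers only.
usage = { input_tokens: 1_200, output_tokens: 300, cache_read_tokens: 800 }
rates = { input: 2.0, output: 8.0, cache_read: 0.5, cache_write: 2.5 }

breakdown = cost_breakdown(usage, rates)
puts breakdown
puts "total: #{breakdown.values.sum}"
```

Returning a plain hash mirrors the idea behind the `#to_h` addition: callers can inspect each component or sum them for a total.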
+ ### Change
+
+ * **Refactor HTTP transports around shared backends** <br>
+ Split `Net::HTTP` and `Net::HTTP::Persistent` into separate
+ `LLM::Transport` implementations, move HTTP-specific request helpers
+ and response execution into the shared transport layer, and let MCP
+ HTTP wrap those transports instead of maintaining a separate
+ transient/persistent client split.
+
+ * **Share transport overrides across providers and MCP** <br>
+ Let both provider construction and `LLM::MCP.http(...)` accept
+ `LLM::Transport` instances or classes as HTTP transport overrides, so
+ callers can reuse the same transport implementation across the
+ runtime.
+
+ * **Let custom transports adapt their own response objects** <br>
+ Introduce a transport response interface so custom transports can
+ adapt backend-specific response objects to one normalized shape and
+ have them work with the existing provider execution and error-handling
+ code.
+
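The last entry in the list above describes adapting backend-specific response objects to one normalized shape. The sketch below shows that adapter idea in plain Ruby. Every name in it (`NormalizedResponse`, `BackendReply`, `adapt`) is hypothetical; the file list shows `lib/llm/transport/response.rb` being added, but its actual methods do not appear in this diff.

```ruby
# Hypothetical normalized shape that downstream code could rely on.
NormalizedResponse = Struct.new(:status, :headers, :body, keyword_init: true) do
  def success?
    (200..299).cover?(status)
  end
end

# Made-up backend response, standing in for whatever a custom HTTP
# client returns (string status code, header pairs, raw payload).
BackendReply = Struct.new(:code, :header_pairs, :payload)

# Adapter: translate the backend's object into the normalized shape.
def adapt(reply)
  NormalizedResponse.new(
    status: Integer(reply.code),
    headers: reply.header_pairs.to_h { |name, value| [name.downcase, value] },
    body: reply.payload
  )
end

reply = BackendReply.new("200", [["Content-Type", "application/json"]], "{}")
response = adapt(reply)
puts response.status
puts response.headers["content-type"]
puts response.success?
```

With an adapter like this, code that handles errors or parses bodies only needs to know the normalized shape, which is the benefit the changelog entry describes.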
  ## v8.1.0
 
  Changes since `v8.0.0`.
@@ -43,7 +234,7 @@ DSML tool-marker filtering in streamed text.
  blocks that Bedrock rejects.
 
  * **Suppress Bedrock DSML tool markers in streamed text** <br>
- Filter `"<|DSML|function_calls"` markers out of streamed Bedrock
+ Filter `\"<|DSML|function_calls\"` markers out of streamed Bedrock
  assistant text so tool-call sentinels do not leak into user-visible
  output.
 
@@ -96,8 +287,7 @@ and `acts_as_agent`.
 
  * **Allow `persistent: true` on `LLM::MCP.http`** <br>
  Let `LLM::MCP.http(...)` enable persistent HTTP transport directly
- through `persistent: true`, instead of requiring a separate
- `.persistent` call after construction.
+ through `persistent: true` at construction time.
 
  * **Expose `LLM::Function#runner` as public API** <br>
  Promote the internal runner instantiation to a public `runner` method on
@@ -195,7 +385,7 @@ provider usage has been recorded yet.
  buffer API.
 
  * **Support percentage compaction token thresholds** <br>
- Let `LLM::Compactor` accept `token_threshold:` values like `"90%"` so
+ Let `LLM::Compactor` accept `token_threshold:` values like `\"90%\"` so
  compaction can trigger at a percentage of the active model context
  window.
 
@@ -978,7 +1168,7 @@ Changes since `v4.9.0`.
 
  - Add HTTP transport for MCP with `LLM::MCP::Transport::HTTP` for remote servers
  - Add JSON Schema union types (`any_of`, `all_of`, `one_of`) with parser integration
- - Add JSON Schema type array union support (e.g., `"type\": [\"object\", \"null\"]`)
+ - Add JSON Schema type array union support (e.g., `\"type\": [\"object\", \"null\"]`)
  - Add JSON Schema type inference from `const`, `enum`, or `default` fields
 
  ### Change
@@ -1079,7 +1269,7 @@ Notable merged work in this range includes:
  - `Add rack + websocket example (#130)`
  - `feat(gemspec): add changelog URI (#136)`
  - `feat(function): alias ThreadGroup#wait as ThreadGroup#value (#62)`
- - README and screencast refresh across `#66`, `#67`, `#68`, `#71`, and
+ - README and screencast refresh across `#66`, `#68`, `#71`, and
  `#72`
  - `chore(bot): update deprecation warning from v5.0 to v6.0`
  - `fix(deepseek): tolerate malformed tool arguments`