llm.rb 8.1.0 → 10.0.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +196 -6
- data/README.md +233 -518
- data/data/anthropic.json +278 -258
- data/data/bedrock.json +1288 -1561
- data/data/deepseek.json +38 -38
- data/data/google.json +656 -579
- data/data/openai.json +860 -818
- data/data/xai.json +243 -552
- data/data/zai.json +168 -168
- data/lib/llm/active_record/acts_as_agent.rb +5 -0
- data/lib/llm/active_record/acts_as_llm.rb +7 -8
- data/lib/llm/active_record.rb +1 -6
- data/lib/llm/agent.rb +121 -82
- data/lib/llm/context.rb +79 -74
- data/lib/llm/contract/completion.rb +45 -0
- data/lib/llm/cost.rb +81 -4
- data/lib/llm/error.rb +1 -1
- data/lib/llm/function/array.rb +8 -5
- data/lib/llm/function/call_group.rb +39 -0
- data/lib/llm/function/call_task.rb +46 -0
- data/lib/llm/function/fork/task.rb +6 -0
- data/lib/llm/function/ractor/task.rb +6 -0
- data/lib/llm/function/task.rb +10 -0
- data/lib/llm/function.rb +28 -1
- data/lib/llm/mcp/transport/http.rb +26 -46
- data/lib/llm/mcp/transport/stdio.rb +0 -8
- data/lib/llm/mcp.rb +6 -23
- data/lib/llm/provider.rb +30 -20
- data/lib/llm/providers/anthropic/error_handler.rb +6 -7
- data/lib/llm/providers/anthropic/files.rb +2 -2
- data/lib/llm/providers/anthropic/response_adapter/completion.rb +30 -0
- data/lib/llm/providers/anthropic/stream_parser.rb +2 -2
- data/lib/llm/providers/anthropic.rb +1 -1
- data/lib/llm/providers/bedrock/error_handler.rb +8 -9
- data/lib/llm/providers/bedrock/models.rb +13 -13
- data/lib/llm/providers/bedrock/response_adapter/completion.rb +30 -0
- data/lib/llm/providers/bedrock/stream_parser.rb +2 -2
- data/lib/llm/providers/bedrock.rb +1 -1
- data/lib/llm/providers/google/error_handler.rb +6 -7
- data/lib/llm/providers/google/files.rb +2 -4
- data/lib/llm/providers/google/images.rb +1 -1
- data/lib/llm/providers/google/models.rb +0 -2
- data/lib/llm/providers/google/response_adapter/completion.rb +30 -0
- data/lib/llm/providers/google/stream_parser.rb +2 -2
- data/lib/llm/providers/google.rb +1 -1
- data/lib/llm/providers/ollama/error_handler.rb +6 -7
- data/lib/llm/providers/ollama/models.rb +0 -2
- data/lib/llm/providers/ollama/response_adapter/completion.rb +30 -0
- data/lib/llm/providers/ollama.rb +1 -1
- data/lib/llm/providers/openai/audio.rb +3 -3
- data/lib/llm/providers/openai/error_handler.rb +6 -7
- data/lib/llm/providers/openai/files.rb +2 -2
- data/lib/llm/providers/openai/images.rb +3 -3
- data/lib/llm/providers/openai/models.rb +1 -1
- data/lib/llm/providers/openai/response_adapter/completion.rb +42 -0
- data/lib/llm/providers/openai/response_adapter/responds.rb +39 -0
- data/lib/llm/providers/openai/responses/stream_parser.rb +2 -2
- data/lib/llm/providers/openai/responses.rb +2 -2
- data/lib/llm/providers/openai/stream_parser.rb +2 -2
- data/lib/llm/providers/openai/vector_stores.rb +1 -1
- data/lib/llm/providers/openai.rb +1 -1
- data/lib/llm/response.rb +10 -8
- data/lib/llm/schema.rb +11 -0
- data/lib/llm/sequel/agent.rb +5 -0
- data/lib/llm/sequel/plugin.rb +8 -14
- data/lib/llm/stream/queue.rb +15 -42
- data/lib/llm/stream.rb +15 -40
- data/lib/llm/tool/param.rb +1 -8
- data/lib/llm/transport/execution.rb +67 -0
- data/lib/llm/transport/http.rb +134 -0
- data/lib/llm/transport/persistent_http.rb +152 -0
- data/lib/llm/transport/response/http.rb +113 -0
- data/lib/llm/transport/response.rb +112 -0
- data/lib/llm/{provider/transport/http → transport}/stream_decoder.rb +8 -4
- data/lib/llm/transport.rb +139 -0
- data/lib/llm/usage.rb +14 -5
- data/lib/llm/utils.rb +24 -14
- data/lib/llm/version.rb +1 -1
- data/lib/llm.rb +3 -12
- data/llm.gemspec +2 -16
- metadata +13 -20
- data/lib/llm/bot.rb +0 -3
- data/lib/llm/provider/transport/http/execution.rb +0 -115
- data/lib/llm/provider/transport/http/interruptible.rb +0 -114
- data/lib/llm/provider/transport/http.rb +0 -145
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 6ba756238fa72e58ba774567a0c8e2a6d7351cb6313f9c8c08cbdeec8ec9cfa4
+data.tar.gz: cba8295670dab2843cec902ae97b7ae14e775359380ba401ca5a0066eb60ad0e
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: b8347b2adfe05a4700ec42e0ed5992a1332355bd20330590d8b3de214d980476a490855ff7e69b5b36c75f3684304c4ee61bdff9ecbcf8001f0b477b8010d064
+data.tar.gz: a41512ffbc52b3665118161251441152389ca9daba1a6f4e010303490938dc33393da62f5e821521b2a9f4b45d85fd219b558fa7d2e185c24f43777d26e36a14
data/CHANGELOG.md CHANGED
@@ -2,6 +2,197 @@
 
 ## Unreleased
 
+## v10.0.0
+
+Changes since `v9.0.0`.
+
+This release unifies context turns under `#talk`, removes the
+deprecated `LLM::Bot` alias, and adds shared option resolution
+through `LLM::Utils`.
+
+Class-level agent tunables can now be resolved lazily via Proc,
+`Array[...]` schema/tool param types are supported, and a `key?`
+method has been added on providers.
+
+Agent tool confirmation hooks let selected tools be approved or
+cancelled before execution. Keep reading to learn more.
+
+### Breaking
+
+* **Unify context turns under `#talk`** <br>
+Remove `LLM::Context#respond` and route responses-mode turns through
+`LLM::Context#talk` with `mode: :responses` instead.
+
+* **Remove the `LLM::Bot` alias** <br>
+Remove the backward-compatible `LLM::Bot` alias for `LLM::Context`.
+Use `LLM::Context` directly instead.
+
+### Add
+
+* **Add shared option resolution through `LLM::Utils`** <br>
+Add `LLM::Utils.resolve_option` for resolving configured values as
+literals, procs, symbol-named methods, or duplicated hashes, and use
+it in agent and ORM option resolution paths.
+
+* **Resolve all class-level agent tunables via Proc** <br>
+Let `model`, `tools`, `skills`, `schema`, `stream`, and `tracer`
+declared with a block be lazily evaluated against the agent instance
+at initialization time, matching how `stream` and `tracer` already
+worked.
+
+Add `LLM::Agent#params` for direct access to the underlying context
+parameters.
+
+Ported from mruby-llm.
+
+* **Support `Array[...]` schema and tool param types** <br>
+Let `LLM::Schema` properties and `LLM::Tool` params accept
+`Array[...]` type declarations, including mixed item unions that are
+serialized as `anyOf` array items.
+
+* **Add `LLM::Provider#key?`** <br>
+Add `key?` to providers so callers can check whether a non-blank API
+key has been configured.
+
+* **Add agent tool confirmation hooks** <br>
+Add `LLM::Agent.confirm` and `LLM::Agent#on_tool_confirmation` so
+selected tools can be approved or cancelled before execution. Pending
+tool resolution now relies on `LLM::Context#functions` so confirmed
+tools are not executed twice when mixed with unconfirmed tool calls.
+
+* **Add `LLM::Function#spawn(:call).wait`** <br>
+Add task-shaped sequential execution support for direct
+`LLM::Function#spawn(:call).wait`.
+
+### Fix
+
+* **Reduce private internal methods on `LLM::Stream`** <br>
+Remove `tool_not_found` and `__tools__` from `LLM::Stream`. The
+`__tools__` logic is inlined directly into `__find__` since that
+was its only caller. The `tool_not_found` utility method was unused
+externally and added unnecessary surface to LLM::Stream.
+
+Ported from mruby-llm.
+
+## v9.0.0
+
+Changes since `v8.1.0`.
+
+This release deepens llm.rb's transport and cost-tracking surface. It
+replaces the old mutable `persist!` API with constructor-driven transport
+selection, removes `#call` from contexts and agents in favor of explicit
+`ctx.wait(:call)`, makes queued stream waits strategy-free, and deletes
+the unused `LLM::Utils` module.
+
+It adds cache read/write token tracking
+with corresponding cost components, audio and image token pricing,
+`LLM::Context#functions?` for queue-aware tool loops,
+`LLM::Agent.stream` DSL support, and exposes `#stream` readers on
+contexts and agents.
+
+The HTTP transport layer has been refactored around shared backends so
+providers, MCP, and custom transports all use the same normalized
+response interface.
+
+### Breaking
+
+* **Remove `#call` as a context and agent tool-loop API** <br>
+Remove `LLM::Context#call(:functions)` and `LLM::Agent#call(:functions)`.
+Tool loops should use `ctx.wait(:call)` or `agent.wait(:call)` instead.
+The ActiveRecord and Sequel wrappers no longer expose `#call` passthroughs
+for stored llm.rb contexts.
+
+* **Make HTTP transport selection constructor-driven** <br>
+Remove public `persist!` and `.persistent` mutation APIs from
+providers, transports, and MCP clients. Select persistent behavior at
+construction time with `persistent: true`, `LLM::Transport.net_http`,
+`LLM::Transport.net_http_persistent`, or an explicit `transport:`
+override.
+
+* **Make queued stream waits strategy-free** <br>
+Change `LLM::Stream::Queue#wait` to resolve queued work by the actual
+task types already present in the queue instead of accepting an
+external wait strategy. `LLM::Stream#wait(...)` remains compatible but
+now ignores its arguments when delegating to the queue.
+
+* **Remove unused `LLM::Utils`** <br>
+Delete the `LLM::Utils` module and remove its remaining unused
+provider includes and top-level require.
+
+### Add
+
+* **Expose `#stream` readers on contexts and agents** <br>
+Add public `LLM::Context#stream` and `LLM::Agent#stream` accessors so
+callers can inspect the active stream object directly.
+
+* **Track cache read and write tokens in usage** <br>
+Add `cache_read_tokens` and `cache_write_tokens` to `LLM::Usage` and
+preserve them through completion usage adaptation and context usage
+aggregation.
+
+* **Add `LLM::Context#functions?` for queue-aware tool loops** <br>
+Add `functions?` to `LLM::Context` and the ActiveRecord and Sequel
+wrappers so callers can detect pending tool work through either the
+bound stream queue or unresolved functions, and update the docs to
+prefer `while ctx.functions?` over `ctx.functions.any?` in tool-loop
+examples.
+
+* **Add `:call` as a first-class wait strategy** <br>
+Add `:call` to pending-function wait paths so `ctx.wait(:call)` can
+prefer queued streamed work when present and otherwise fall back to
+direct sequential function execution through `spawn(:call).wait`.
+
+* **Read provider cache usage into completion responses** <br>
+Read cache read tokens from provider usage metadata, including OpenAI
+`usage.prompt_tokens_details` and Anthropic
+`usage.cache_read_input_tokens`. Read Anthropic cache write tokens
+from `usage.cache_creation_input_tokens`, and expose explicit
+zero-valued `cache_write_tokens` methods on providers that do not
+report cache creation usage.
+
+* **Extend cost tracking with cache write pricing** <br>
+Extend `LLM::Cost` with `cache_read_costs`, `cache_write_costs`, and
+`reasoning_costs` alongside the existing `input_costs` and
+`output_costs`. Add `#to_h` for structured cost insight and update
+`ctx.cost` to calculate all available components from registry
+pricing data.
+
+* **Price input and output audio separately** <br>
+Track `input_audio_tokens` and `output_audio_tokens` in usage and
+include `input_audio_costs` and `output_audio_costs` in `LLM::Cost`
+so multimodal requests report accurate audio spend.
+
+* **Track image tokens in input cost reporting** <br>
+Add `input_image_tokens` to usage and include `input_image_costs` in
+`LLM::Cost` using the model's generic input rate so image-bearing
+prompts report their input spend.
+
+* **Add `LLM::Agent.stream` DSL support** <br>
+Let agents define a default `stream` through the class DSL, including
+block-based stream construction so each agent instance can resolve its
+stream the same way `tracer` does.
+
+### Change
+
+* **Refactor HTTP transports around shared backends** <br>
+Split `Net::HTTP` and `Net::HTTP::Persistent` into separate
+`LLM::Transport` implementations, move HTTP-specific request helpers
+and response execution into the shared transport layer, and let MCP
+HTTP wrap those transports instead of maintaining a separate
+transient/persistent client split.
+
+* **Share transport overrides across providers and MCP** <br>
+Let both provider construction and `LLM::MCP.http(...)` accept
+`LLM::Transport` instances or classes as HTTP transport overrides, so
+callers can reuse the same transport implementation across the
+runtime.
+
+* **Let custom transports adapt their own response objects** <br>
+Introduce a transport response interface so custom transports can
+adapt backend-specific response objects to one normalized shape and
+have them work with the existing provider execution and error-handling
+code.
+
 ## v8.1.0
 
 Changes since `v8.0.0`.
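Note on the `LLM::Utils.resolve_option` entry in the hunk above: the changelog names four kinds of configured value (literals, procs, symbol-named methods, and hashes that are duplicated), but the method body is not part of the diff shown here. The sketch below illustrates that style of resolution only. It is not the library's code: the module name `OptionSketch`, the `resolve(receiver, value)` signature, the `DemoAgent` class, and the choice to return a symbol unchanged when it names no method are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for the resolution style described in the
# v10.0.0 changelog entry. NOT the llm.rb implementation; names,
# argument order, and the symbol fallback are assumptions.
module OptionSketch
  module_function

  # Resolve +value+ against +receiver+:
  # * Proc   -> evaluated with the receiver as self
  # * Symbol -> sent to the receiver when it names a method, else kept
  # * Hash   -> shallow copy, so callers cannot mutate a shared default
  # * other  -> returned unchanged (a literal)
  def resolve(receiver, value)
    case value
    when Proc
      receiver.instance_exec(&value)
    when Symbol
      receiver.respond_to?(value, true) ? receiver.send(value) : value
    when Hash
      value.dup
    else
      value
    end
  end
end

# Made-up receiver used only to exercise the sketch.
class DemoAgent
  def default_model
    "demo-model"
  end
end

SHARED_PARAMS = {temperature: 0}.freeze

demo = DemoAgent.new
OptionSketch.resolve(demo, :default_model)             # => "demo-model"
OptionSketch.resolve(demo, proc { default_model.upcase }) # => "DEMO-MODEL"
OptionSketch.resolve(demo, SHARED_PARAMS).frozen?      # => false (a copy)
```

Duplicating hashes matters when the same class-level default is shared by many instances: each instance gets its own copy, so a per-instance mutation does not leak into the others. The copy here is shallow, which is also an assumption.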
@@ -43,7 +234,7 @@ DSML tool-marker filtering in streamed text.
 blocks that Bedrock rejects.
 
 * **Suppress Bedrock DSML tool markers in streamed text** <br>
-Filter
+Filter `\"<|DSML|function_calls\"` markers out of streamed Bedrock
 assistant text so tool-call sentinels do not leak into user-visible
 output.
 
@@ -96,8 +287,7 @@ and `acts_as_agent`.
 
 * **Allow `persistent: true` on `LLM::MCP.http`** <br>
 Let `LLM::MCP.http(...)` enable persistent HTTP transport directly
-through `persistent: true
-`.persistent` call after construction.
+through `persistent: true` at construction time.
 
 * **Expose `LLM::Function#runner` as public API** <br>
 Promote the internal runner instantiation to a public `runner` method on
@@ -195,7 +385,7 @@ provider usage has been recorded yet.
 buffer API.
 
 * **Support percentage compaction token thresholds** <br>
-Let `LLM::Compactor` accept `token_threshold:` values like
+Let `LLM::Compactor` accept `token_threshold:` values like `\"90%\"` so
 compaction can trigger at a percentage of the active model context
 window.
 
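Note on the percentage `token_threshold:` entry in the hunk above: the arithmetic it implies is small enough to sketch. The helper below shows one way a value such as `"90%"` could be turned into an absolute token count for a given context window. The name `threshold_tokens`, the accepted input forms, and rounding down are assumptions for illustration; how `LLM::Compactor` actually parses the value is not visible in this diff.

```ruby
# Hypothetical helper, not part of llm.rb: convert a threshold that is
# either an absolute Integer or a percentage String into a token count.
def threshold_tokens(threshold, context_window)
  case threshold
  when Integer
    # Absolute thresholds pass through unchanged.
    threshold
  when /\A(\d+(?:\.\d+)?)%\z/
    # Rational avoids float rounding surprises; round down to whole tokens.
    percent = Rational(Regexp.last_match(1))
    (context_window * percent / 100).floor
  else
    raise ArgumentError, "unsupported token threshold: #{threshold.inspect}"
  end
end

threshold_tokens("90%", 200_000) # => 180000
threshold_tokens(4096, 200_000)  # => 4096
```

With a percentage, the same configuration keeps working when the active model changes, because the absolute trigger point scales with that model's context window.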
@@ -978,7 +1168,7 @@ Changes since `v4.9.0`.
 
 - Add HTTP transport for MCP with `LLM::MCP::Transport::HTTP` for remote servers
 - Add JSON Schema union types (`any_of`, `all_of`, `one_of`) with parser integration
-- Add JSON Schema type array union support (e.g.,
+- Add JSON Schema type array union support (e.g., `\"type\": [\"object\", \"null\"]`)
 - Add JSON Schema type inference from `const`, `enum`, or `default` fields
 
 ### Change
@@ -1079,7 +1269,7 @@ Notable merged work in this range includes:
 - `Add rack + websocket example (#130)`
 - `feat(gemspec): add changelog URI (#136)`
 - `feat(function): alias ThreadGroup#wait as ThreadGroup#value (#62)`
-- README and screencast refresh across `#66`, `#
+- README and screencast refresh across `#66`, `#68`, `#71`, and
 `#72`
 - `chore(bot): update deprecation warning from v5.0 to v6.0`
 - `fix(deepseek): tolerate malformed tool arguments`