llm.rb 5.0.0 → 5.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +64 -0
- data/README.md +33 -12
- data/data/deepseek.json +68 -0
- data/data/google.json +26 -26
- data/data/openai.json +55 -0
- data/lib/llm/context.rb +9 -6
- data/lib/llm/mcp.rb +15 -0
- data/lib/llm/message.rb +14 -5
- data/lib/llm/providers/anthropic/stream_parser.rb +1 -1
- data/lib/llm/providers/deepseek/request_adapter/completion.rb +30 -7
- data/lib/llm/providers/deepseek.rb +3 -3
- data/lib/llm/providers/google/stream_parser.rb +1 -1
- data/lib/llm/providers/openai/responses/stream_parser.rb +1 -1
- data/lib/llm/providers/openai/stream_parser.rb +1 -1
- data/lib/llm/stream.rb +34 -6
- data/lib/llm/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 03ed8d289dc230fb6404f2fb3d1482401354f078b3502cd550949bcff48d97d2
+  data.tar.gz: 8b54acc8723263b5bf8c2d0025452e1448dfc66953a2c0d0c24c13e4d7b3343b
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b088838c5b1860e30413ba87e2c66dec393b3bff51e462e38af5bc1f13b746b7bdf5d103b67f949aa31a6bc6da280da3e170f876743f6286f8a5674f6cee42a6
+  data.tar.gz: 769fecd327298f7b17b731f181d3091194cddeb758e1723a20a5c789f4b0298ce9a5f5244aa3d4a807b8d8260d541e1286443ca9d841a11fd666cb354a7f893b
data/CHANGELOG.md
CHANGED

@@ -2,8 +2,72 @@
 
 ## Unreleased
 
+Changes since `v5.2.0`.
+
+## v5.2.0
+
+Changes since `v5.1.0`.
+
+This release adds current DeepSeek V4 support through refreshed provider
+metadata, including `deepseek-v4-flash` and `deepseek-v4-pro`, while fixing
+request-local queue handling for concurrent streamed workloads so `wait` and
+interruption use the active per-call stream correctly.
+
+### Change
+
+* **Add `LLM::MCP#run` for scoped MCP client lifecycle** <br>
+  Add `LLM::MCP#run` so MCP clients can be started for the duration of a
+  block and then stopped automatically, which simplifies the usual
+  `start`/`stop` pattern in examples and application code.
+
+* **Refresh provider model metadata** <br>
+  Add current DeepSeek and OpenAI model metadata to `data/` and update the
+  Google Gemma model entry to match the current provider naming.
+
+### Fix
+
+* **Reject unsupported DeepSeek multimodal prompt objects early** <br>
+  Raise `LLM::PromptError` for `image_url`, `local_file`, and
+  `remote_file` in DeepSeek chat requests instead of sending invalid
+  OpenAI-compatible payloads that the provider rejects at runtime.
+
+* **Preserve DeepSeek reasoning content across tool turns** <br>
+  Replay `reasoning_content` when serializing prior assistant messages for
+  DeepSeek chat completions, so thinking-mode tool calls can continue into
+  follow-up requests without triggering invalid request errors.
+
+* **Default DeepSeek to `deepseek-v4-flash`** <br>
+  Change `LLM::DeepSeek#default_model` to `deepseek-v4-flash` so new
+  contexts and default provider usage align with the current preferred chat
+  model.
+
+* **Use per-call streams when waiting on streamed tool work** <br>
+  Track request-local streams bound through `talk(..., stream:)` and
+  `respond(..., stream:)` so `LLM::Context#wait` and interruption-aware
+  queue handling use the active stream instead of falling back to pending
+  function spawning.
+
+## v5.1.0
+
 Changes since `v5.0.0`.
 
+This release tightens streamed tool execution around the actual request-local
+runtime state. It fixes streamed resolution of per-request tools and makes
+that streamed path work cleanly with `LLM.function(...)`, MCP tools, bound
+tool instances, and normal tool classes.
+
+### Fix
+
+* **Resolve request-local tools during streaming** <br>
+  Resolve streamed tool calls through `LLM::Stream` request-local tools
+  before falling back to the global registry, so per-request tools and bound
+  tool instances work correctly during streaming.
+
+* **Support `LLM.function(...)` and MCP tools in streamed tool resolution** <br>
+  Let streamed tool resolution use the current request tool set, so
+  `LLM.function(...)`, MCP tools, bound tool instances, and normal
+  `LLM::Tool` classes all work through the same streamed tool path.
+
 ## v5.0.0
 
 Changes since `v4.23.0`.
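The per-call stream entry above is easiest to read in code. A minimal sketch, assuming an OpenAI key in `ENV["KEY"]` and a prompt chosen only for illustration:

```ruby
require "llm"

# Sketch of the per-call stream binding described in the v5.2.0 notes.
llm = LLM.openai(key: ENV["KEY"])
ctx = LLM::Context.new(llm)

# The stream is bound to this request only; wait and interruption-aware
# queue handling now use this stream rather than a context-wide default.
ctx.talk("Write a short note about network protocols.", stream: $stdout)
```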
data/README.md
CHANGED

@@ -4,7 +4,7 @@
 <p align="center">
   <a href="https://0x1eef.github.io/x/llm.rb?rebuild=1"><img src="https://img.shields.io/badge/docs-0x1eef.github.io-blue.svg" alt="RubyDoc"></a>
   <a href="https://opensource.org/license/0bsd"><img src="https://img.shields.io/badge/License-0BSD-orange.svg?" alt="License"></a>
-  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-5.
+  <a href="https://github.com/llmrb/llm.rb/tags"><img src="https://img.shields.io/badge/version-5.2.0-green.svg?" alt="Version"></a>
 </p>
 
 ## About
@@ -261,13 +261,17 @@ Remote MCP tools and prompts are not bolted on as a separate integration
 stack. They adapt into the same tool and prompt path used by local tools,
 skills, contexts, and agents.
 
+Use `mcp.run do ... end` for scoped work where the client should start and
+stop around one block. Use `mcp.start` and `mcp.stop` directly when you need
+finer sequential control across several steps before shutting the client down.
+
 ```ruby
-
-
-
+mcp = LLM::MCP.http(
+  url: "https://api.githubcopilot.com/mcp/",
+  headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
+).persistent
+mcp.run do
   ctx = LLM::Context.new(llm, tools: mcp.tools)
-ensure
-  mcp.stop
 end
 ```
 
@@ -281,12 +285,17 @@ Go's context package. In fact, llm.rb is heavily inspired by Go but with a Ruby
 twist.
 
 ```ruby
+require "llm"
+require "io/console"
+
+llm = LLM.openai(key: ENV["KEY"])
 ctx = LLM::Context.new(llm, stream: $stdout)
 worker = Thread.new do
   ctx.talk("Write a very long essay about network protocols.")
 rescue LLM::Interrupt
   puts "Request was interrupted!"
 end
+
 STDIN.getch
 ctx.interrupt!
 worker.join
@@ -615,9 +624,10 @@ require "io/console"
 
 llm = LLM.openai(key: ENV["KEY"])
 ctx = LLM::Context.new(llm, stream: $stdout)
-
 worker = Thread.new do
   ctx.talk("Write a very long essay about network protocols.")
+rescue LLM::Interrupt
+  puts "Request was interrupted!"
 end
 
 STDIN.getch
@@ -695,7 +705,7 @@ puts ticket.talk("How do I rotate my API key?").content
 
 #### MCP
 
-This example uses [`LLM::MCP`](https://0x1eef.github.io/x/llm.rb/LLM/MCP.html) over HTTP so remote GitHub MCP tools run through the same `LLM::Context` tool path as local tools. See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
+This example uses [`LLM::MCP`](https://0x1eef.github.io/x/llm.rb/LLM/MCP.html) over HTTP so remote GitHub MCP tools run through the same `LLM::Context` tool path as local tools. It expects a GitHub token in `ENV["GITHUB_PAT"]`. See the [deepdive (web)](https://0x1eef.github.io/x/llm.rb/file.deepdive.html) or [deepdive (markdown)](resources/deepdive.md) for more examples.
 
 ```ruby
 require "llm"
@@ -707,13 +717,24 @@ mcp = LLM::MCP.http(
   headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
 ).persistent
 
-
-
+mcp.start
+ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
+ctx.talk("Pull information about my GitHub account.")
+ctx.talk(ctx.call(:functions)) while ctx.functions.any?
+mcp.stop
+```
+
+For scoped work, `mcp.run do ... end` is shorter and handles cleanup for you:
+
+```ruby
+mcp = LLM::MCP.http(
+  url: "https://api.githubcopilot.com/mcp/",
+  headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
+).persistent
+mcp.run do
   ctx = LLM::Context.new(llm, stream: $stdout, tools: mcp.tools)
   ctx.talk("Pull information about my GitHub account.")
   ctx.talk(ctx.call(:functions)) while ctx.functions.any?
-ensure
-  mcp.stop
 end
 ```
 
data/data/deepseek.json
CHANGED

@@ -70,6 +70,74 @@
         "context": 128000,
         "output": 64000
       }
+    },
+    "deepseek-v4-flash": {
+      "id": "deepseek-v4-flash",
+      "name": "DeepSeek V4 Flash",
+      "family": "deepseek-flash",
+      "attachment": false,
+      "reasoning": true,
+      "tool_call": true,
+      "interleaved": {
+        "field": "reasoning_content"
+      },
+      "structured_output": true,
+      "temperature": true,
+      "knowledge": "2025-05",
+      "release_date": "2026-04-24",
+      "last_updated": "2026-04-24",
+      "modalities": {
+        "input": [
+          "text"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": true,
+      "cost": {
+        "input": 0.14,
+        "output": 0.28,
+        "cache_read": 0.028
+      },
+      "limit": {
+        "context": 1000000,
+        "output": 384000
+      }
+    },
+    "deepseek-v4-pro": {
+      "id": "deepseek-v4-pro",
+      "name": "DeepSeek V4 Pro",
+      "family": "deepseek-thinking",
+      "attachment": false,
+      "reasoning": true,
+      "tool_call": true,
+      "interleaved": {
+        "field": "reasoning_content"
+      },
+      "structured_output": true,
+      "temperature": true,
+      "knowledge": "2025-05",
+      "release_date": "2026-04-24",
+      "last_updated": "2026-04-24",
+      "modalities": {
+        "input": [
+          "text"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": true,
+      "cost": {
+        "input": 1.74,
+        "output": 3.48,
+        "cache_read": 0.145
+      },
+      "limit": {
+        "context": 1000000,
+        "output": 384000
+      }
     }
   }
 }
data/data/google.json
CHANGED

@@ -1058,6 +1058,32 @@
         "output": 8192
       }
     },
+    "gemma-4-26b-a4b-it": {
+      "id": "gemma-4-26b-a4b-it",
+      "name": "Gemma 4 26B",
+      "family": "gemma",
+      "attachment": false,
+      "reasoning": true,
+      "tool_call": true,
+      "structured_output": true,
+      "temperature": true,
+      "release_date": "2026-04-02",
+      "last_updated": "2026-04-02",
+      "modalities": {
+        "input": [
+          "text",
+          "image"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": true,
+      "limit": {
+        "context": 256000,
+        "output": 8192
+      }
+    },
     "gemini-2.5-flash-lite": {
       "id": "gemini-2.5-flash-lite",
       "name": "Gemini 2.5 Flash Lite",
@@ -1093,32 +1119,6 @@
         "output": 65536
       }
     },
-    "gemma-4-26b-it": {
-      "id": "gemma-4-26b-it",
-      "name": "Gemma 4 26B",
-      "family": "gemma",
-      "attachment": false,
-      "reasoning": true,
-      "tool_call": true,
-      "structured_output": true,
-      "temperature": true,
-      "release_date": "2026-04-02",
-      "last_updated": "2026-04-02",
-      "modalities": {
-        "input": [
-          "text",
-          "image"
-        ],
-        "output": [
-          "text"
-        ]
-      },
-      "open_weights": true,
-      "limit": {
-        "context": 256000,
-        "output": 8192
-      }
-    },
     "gemini-2.5-flash-image-preview": {
       "id": "gemini-2.5-flash-image-preview",
       "name": "Gemini 2.5 Flash Image (Preview)",
data/data/openai.json
CHANGED

@@ -195,6 +195,61 @@
         "output": 16384
       }
     },
+    "gpt-5.5": {
+      "id": "gpt-5.5",
+      "name": "GPT-5.5",
+      "family": "gpt",
+      "attachment": true,
+      "reasoning": true,
+      "tool_call": true,
+      "structured_output": true,
+      "temperature": false,
+      "knowledge": "2025-12-01",
+      "release_date": "2026-04-23",
+      "last_updated": "2026-04-23",
+      "modalities": {
+        "input": [
+          "text",
+          "image",
+          "pdf"
+        ],
+        "output": [
+          "text"
+        ]
+      },
+      "open_weights": false,
+      "cost": {
+        "input": 5,
+        "output": 30,
+        "cache_read": 0.5,
+        "context_over_200k": {
+          "input": 10,
+          "output": 45,
+          "cache_read": 1
+        }
+      },
+      "limit": {
+        "context": 1050000,
+        "input": 920000,
+        "output": 130000
+      },
+      "experimental": {
+        "modes": {
+          "fast": {
+            "cost": {
+              "input": 12.5,
+              "output": 75,
+              "cache_read": 1.25
+            },
+            "provider": {
+              "body": {
+                "service_tier": "priority"
+              }
+            }
+          }
+        }
+      }
+    },
     "gpt-5-mini": {
       "id": "gpt-5-mini",
       "name": "GPT-5 Mini",
data/lib/llm/context.rb
CHANGED

@@ -177,7 +177,7 @@ module LLM
       params = params.merge(messages: @messages.to_a)
       params = @params.merge(params)
       prompt, params = transform(prompt, params)
-      bind!(params[:stream], params[:model])
+      bind!(params[:stream], params[:model], params[:tools])
       res = @llm.complete(prompt, params)
       role = params[:role] || @llm.user_role
       role = @llm.tool_role if params[:role].nil? && [*prompt].grep(LLM::Function::Return).any?
@@ -205,7 +205,7 @@ module LLM
       compactor.compact!(prompt) if compactor.compact?(prompt)
       params = @params.merge(params)
       prompt, params = transform(prompt, params)
-      bind!(params[:stream], params[:model])
+      bind!(params[:stream], params[:model], params[:tools])
       res_id = params[:store] == false ? nil : @messages.find(&:assistant?)&.response&.response_id
       params = params.merge(previous_response_id: res_id, input: @messages.to_a).compact
       res = @llm.responses.create(prompt, params)
@@ -295,7 +295,6 @@ module LLM
     # ractor work, in that order.
     # @return [Array<LLM::Function::Return>]
     def wait(strategy)
-      stream = @params[:stream]
       if LLM::Stream === stream && !stream.queue.empty?
         @queue = stream.queue
         @queue.wait(strategy)
@@ -459,19 +458,24 @@ module LLM
 
     private
 
-    def bind!(stream, model)
+    def bind!(stream, model, tools)
       return unless LLM::Stream === stream
+      @stream = stream
       stream.extra[:ctx] = self
       stream.extra[:tracer] = tracer
       stream.extra[:model] = model
+      stream.extra[:tools] = tools
     end
 
     def queue
       return @queue if @queue
-      stream = @params[:stream]
       stream.queue if LLM::Stream === stream
     end
 
+    def stream
+      @stream || @params[:stream]
+    end
+
     def load_skills(skills)
       [*skills].map { LLM::Skill.load(_1).to_tool(self) }
     end
@@ -494,7 +498,6 @@ module LLM
         message: warning
       })
     end
-
   end
 
   # Backward-compatible alias
data/lib/llm/mcp.rb
CHANGED

@@ -103,6 +103,21 @@ class LLM::MCP
     nil
   end
 
+  ##
+  # Starts the MCP client for the duration of a block and then stops it.
+  # @yield Runs with the MCP client started
+  # @raise [LocalJumpError]
+  #   When called without a block
+  # @raise [StandardError]
+  #   Propagates errors raised by {#start}, the block itself, or {#stop}
+  # @return [void]
+  def run
+    start
+    yield
+  ensure
+    stop
+  end
+
   ##
   # Configures an HTTP MCP transport to use a persistent connection pool
   # via the optional dependency [Net::HTTP::Persistent](https://github.com/drbrain/net-http-persistent)
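In use, `run` replaces the explicit `start`/`stop` pair. A sketch mirroring the README's GitHub MCP example; the URL, `GITHUB_PAT` token, and the `llm` instance are taken from that example and assumed to be set up already:

```ruby
mcp = LLM::MCP.http(
  url: "https://api.githubcopilot.com/mcp/",
  headers: {"Authorization" => "Bearer #{ENV.fetch("GITHUB_PAT")}"}
).persistent

mcp.run do
  # start has already run; stop runs in an ensure block even if this raises
  ctx = LLM::Context.new(llm, tools: mcp.tools)
  ctx.talk("Pull information about my GitHub account.")
end
```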
data/lib/llm/message.rb
CHANGED

@@ -33,11 +33,15 @@ module LLM
     # Returns a Hash representation of the message.
     # @return [Hash]
     def to_h
-      {
-
-
-
-
+      {
+        role:,
+        content:,
+        reasoning_content:,
+        compaction: extra.compaction,
+        tools: extra.tool_calls&.map { LLM::Object === _1 ? _1.to_h : _1 },
+        usage:,
+        original_tool_calls: extra.original_tool_calls
+      }.compact.then { preserve_nil_content(_1) }
     end
 
     ##
@@ -208,6 +212,11 @@ module LLM
 
     private
 
+    def preserve_nil_content(hash)
+      hash[:content] = content if content.nil?
+      hash
+    end
+
     def tool_calls
       @tool_calls ||= LLM::Object.from(extra.tool_calls || [])
     end
data/lib/llm/providers/anthropic/stream_parser.rb
CHANGED

@@ -105,7 +105,7 @@ class LLM::Anthropic
     end
 
     def resolve_tool(tool)
-      registered =
+      registered = @stream.find_tool(tool["name"])
      fn = (registered || LLM::Function.new(tool["name"])).dup.tap do |fn|
        fn.id = tool["id"]
        fn.arguments = LLM::Anthropic.parse_tool_input(tool["input"])
data/lib/llm/providers/deepseek/request_adapter/completion.rb
CHANGED

@@ -19,7 +19,7 @@ module LLM::DeepSeek::RequestAdapter
       if Hash === message
         {role: message[:role], content: adapt_content(message[:content])}
       elsif message.tool_call?
-
+        wrap(content: nil, tool_calls: message.extra[:original_tool_calls])
       else
         adapt_message
       end
@@ -30,25 +30,34 @@ module LLM::DeepSeek::RequestAdapter
 
     def adapt_content(content)
       case content
+      when LLM::Object
+        adapt_object(content)
       when String
-        content.to_s
+        [{type: :text, text: content.to_s}]
       when LLM::Message
        adapt_content(content.content)
       when LLM::Function::Return
        throw(:abort, {role: "tool", tool_call_id: content.id, content: LLM.json.dump(content.value)})
-      when LLM::Object
-        prompt_error!(content)
       else
        prompt_error!(content)
       end
     end
 
+    def adapt_object(object)
+      case object.kind
+      when :image_url, :local_file, :remote_file
+        prompt_error!(object)
+      else
+        prompt_error!(object)
+      end
+    end
+
     def adapt_message
       case content
       when Array
        adapt_array
       else
-
+        wrap(content: adapt_content(content))
       end
     end
 
@@ -58,13 +67,13 @@ module LLM::DeepSeek::RequestAdapter
       elsif returns.any?
        returns.map { {role: "tool", tool_call_id: _1.id, content: LLM.json.dump(_1.value)} }
       else
-
+        wrap(content: content.flat_map { adapt_content(_1) })
       end
     end
 
     def prompt_error!(object)
       if LLM::Object === object
-        raise LLM::PromptError, "The given LLM::Object with kind '#{
+        raise LLM::PromptError, "The given LLM::Object with kind '#{object.kind}' is not " \
          "supported by the DeepSeek API"
       else
        raise LLM::PromptError, "The given object (an instance of #{object.class}) " \
@@ -72,8 +81,22 @@ module LLM::DeepSeek::RequestAdapter
       end
     end
 
+    def wrap(content:, tool_calls: nil)
+      {
+        role: message.role,
+        content:,
+        tool_calls: tool_calls&.map { LLM::Object === _1 ? _1.to_h : _1 },
+        reasoning_content: message.reasoning_content
+      }.compact.then { preserve_nil_content(_1) }
+    end
+
     def message = @message
     def content = message.content
     def returns = content.grep(LLM::Function::Return)
+
+    def preserve_nil_content(hash)
+      hash[:content] = content if content.nil?
+      hash
+    end
   end
 end
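The practical effect of `wrap` is easiest to see as the hash it builds for a replayed assistant tool-call turn. A hedged sketch of that shape; the values are placeholders, only the keys come from the method above:

```ruby
# Illustrative only: the message shape wrap builds when replaying a prior
# assistant tool-call turn to the DeepSeek chat completions API.
{
  role: "assistant",
  content: nil,        # kept even when nil, via preserve_nil_content
  tool_calls: [],      # message.extra[:original_tool_calls], replayed as-is
  reasoning_content: "thinking-mode output from the previous turn"
}
```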
data/lib/llm/providers/deepseek.rb
CHANGED

@@ -15,7 +15,7 @@ module LLM
   #
   #   llm = LLM.deepseek(key: ENV["KEY"])
   #   ctx = LLM::Context.new(llm)
-  #   ctx.talk
+  #   ctx.talk "Hello"
   #   ctx.messages.select(&:assistant?).each { print "[#{_1.role}]", _1.content, "\n" }
   class DeepSeek < OpenAI
     require_relative "deepseek/request_adapter"
@@ -73,10 +73,10 @@
 
     ##
     # Returns the default model for chat completions
-    # @see https://api-docs.deepseek.com/quick_start/pricing deepseek-
+    # @see https://api-docs.deepseek.com/quick_start/pricing deepseek-v4-flash
     # @return [String]
     def default_model
-      "deepseek-
+      "deepseek-v4-flash"
     end
   end
 end
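With this change, contexts created without an explicit model target the new default. A minimal sketch, assuming a DeepSeek key in `ENV["KEY"]`:

```ruby
require "llm"

llm = LLM.deepseek(key: ENV["KEY"])
ctx = LLM::Context.new(llm)
# No :model is passed, so chat completions use default_model,
# which now returns "deepseek-v4-flash".
ctx.talk "Hello"
```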
data/lib/llm/providers/google/stream_parser.rb
CHANGED

@@ -153,7 +153,7 @@ class LLM::Google
 
     def resolve_tool(part, cindex, pindex)
       call = part["functionCall"]
-      registered =
+      registered = @stream.find_tool(call["name"])
      fn = (registered || LLM::Function.new(call["name"])).dup.tap do |fn|
        fn.id = LLM::Google.tool_id(part:, cindex:, pindex:)
        fn.arguments = call["args"]
data/lib/llm/providers/openai/responses/stream_parser.rb
CHANGED

@@ -269,7 +269,7 @@ class LLM::OpenAI
     # @group Resolvers
 
     def resolve_tool(tool, arguments)
-      registered =
+      registered = @stream.find_tool(tool["name"])
      fn = (registered || LLM::Function.new(tool["name"])).dup.tap do |fn|
        fn.id = tool["call_id"]
        fn.arguments = arguments
data/lib/llm/providers/openai/stream_parser.rb
CHANGED

@@ -185,7 +185,7 @@ class LLM::OpenAI
     end
 
     def resolve_tool(tool, function, arguments)
-      registered =
+      registered = @stream.find_tool(function["name"])
      fn = (registered || LLM::Function.new(function["name"])).dup.tap do |fn|
        fn.id = tool["id"]
        fn.arguments = arguments
data/lib/llm/stream.rb
CHANGED

@@ -83,12 +83,12 @@ module LLM
     # `tool.mcp? ? ctx.spawn(tool, :task) : ctx.spawn(tool, :ractor)`.
     # When a streamed tool cannot be resolved, `error` is passed as an
     # {LLM::Function::Return}. It can be sent back to the model, allowing
-    # the tool-call path to recover and the session to continue.
-    # resolution
-    # {LLM
-    #
-    #
-    # and does not support MCP tools.
+    # the tool-call path to recover and the session to continue. Streamed
+    # tool resolution now prefers the current request tools, so
+    # {LLM.function}, MCP tools, bound tool instances, and normal
+    # {LLM::Tool LLM::Tool} classes can all resolve through the same
+    # request-local path. The current `:ractor` mode is for class-based
+    # tools and does not support MCP tools.
     # @param [LLM::Function] tool
     #   The parsed tool call.
     # @param [LLM::Function::Return, nil] error
@@ -148,6 +148,34 @@ module LLM
       })
     end
 
+    ##
+    # Returns the tool definitions available for the current streamed request.
+    # This prefers request-local tools attached to the stream and falls back
+    # to the current context defaults when present.
+    # @return [Array<LLM::Function, LLM::Tool>]
+    def tools
+      extra[:tools] || ctx&.params&.dig(:tools) || []
+    end
+
+    ##
+    # Resolves a streamed tool call against the current request tools first,
+    # then falls back to the global function registry.
+    # @param [String] name
+    # @return [LLM::Function, nil]
+    def find_tool(name)
+      tool = tools.find do |candidate|
+        candidate_name =
+          if candidate.respond_to?(:function)
+            candidate.function.name
+          else
+            candidate.name
+          end
+        candidate_name.to_s == name.to_s
+      end
+      tool&.then { _1.respond_to?(:function) ? _1.function : _1 } ||
+        LLM::Function.find_by_name(name)
+    end
+
     # @endgroup
   end
 end
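The lookup order implemented by `find_tool` can be summarized in one helper. A hedged sketch; the helper name is hypothetical, and the `stream` argument is assumed to be an `LLM::Stream` already bound by `LLM::Context#bind!`:

```ruby
# Hypothetical helper illustrating the resolution order of find_tool above.
def resolve_streamed_call(stream, name)
  # 1. request-local tools: stream.extra[:tools], else the context's :tools params
  # 2. fallback: the global registry via LLM::Function.find_by_name(name)
  stream.find_tool(name)
end
```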
data/lib/llm/version.rb
CHANGED