ruby-pi 0.1.6 → 0.1.8
- checksums.yaml +4 -4
- data/CHANGELOG.md +40 -0
- data/lib/ruby_pi/agent/core.rb +6 -0
- data/lib/ruby_pi/agent/loop.rb +40 -25
- data/lib/ruby_pi/agent/state.rb +6 -0
- data/lib/ruby_pi/configuration.rb +50 -5
- data/lib/ruby_pi/context/compaction.rb +48 -46
- data/lib/ruby_pi/llm/anthropic.rb +26 -9
- data/lib/ruby_pi/llm/base_provider.rb +34 -2
- data/lib/ruby_pi/llm/fallback.rb +30 -9
- data/lib/ruby_pi/llm/gemini.rb +24 -10
- data/lib/ruby_pi/llm/openai.rb +17 -7
- data/lib/ruby_pi/llm/tool_call.rb +2 -0
- data/lib/ruby_pi/tools/definition.rb +39 -4
- data/lib/ruby_pi/tools/executor.rb +14 -6
- data/lib/ruby_pi/tools/schema.rb +10 -0
- data/lib/ruby_pi/version.rb +1 -1
- data/lib/ruby_pi.rb +7 -0
- metadata +17 -3
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: b0054adb6a0863a8f296917be736df0ebfd789aa7205589b82689199d4bf4c06
|
|
4
|
+
data.tar.gz: fc79dcc61dbefce874e609807d989cf2293b0ecb45a6aa036069b11038ac5c9a
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: c130ada9b7ed93f5c9a0d16596c1176fec258204be26af15c61db3c18effee94bc7a8a1783620397780b0e3501e660b4e8ff48d8463e4089067edfcbf3bf9b60
|
|
7
|
+
data.tar.gz: dc179fe40cb063c4321a1c7a1aff5abb7b441d5fd87ced19908f9875c1f5b26bf2e17555b44658f603b32627577fe99feb8f34d061ed7e53eaba3c28cecd8bbb
|
data/CHANGELOG.md
CHANGED
|
@@ -5,6 +5,46 @@ All notable changes to this project will be documented in this file.
|
|
|
5
5
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
6
6
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
7
|
|
|
8
|
+
## [0.1.8] - 2026-06-09
|
|
9
|
+
|
|
10
|
+
### Fixed (adversarial review round 6)
|
|
11
|
+
|
|
12
|
+
- **`Retry-After` header was parsed but never honored (High)**: On a 429, `handle_error_response` stored the server's `Retry-After` on `RateLimitError#retry_after`, but the retry loop in `BaseProvider#complete` always slept the local exponential backoff (capped at `retry_max_delay`) — hammering a server that asked for a longer cooldown until the retry budget burned out. The retry delay now prefers a positive `retry_after` (capped at `RETRY_AFTER_CEILING`, 60s); HTTP-date values (which parse to `0.0`) and absent headers fall through to the computed backoff
|
|
13
|
+
- **Parallel executor timeout/rejection Results reported `name: "unknown"` (High)**: `execute_parallel` hardcoded `"unknown"` in the timeout and rejected-future branches, so with several tools timing out concurrently, logs and `:tool_execution_end` subscribers could not tell which tool hung. Futures are now zipped with their originating calls and failure Results carry the real tool name; the timeout message matches sequential mode (`Tool 'x' timed out after Ns`). The rejected-future branch also no longer reports a misleading full-timeout `duration_ms` for what may have been an instant failure (now `0.0`)
|
|
14
|
+
- **Keyword-parameter tool blocks failed on every call (High)**: `Definition#call` passed a single positional Hash, so a block written `{ |content:, platform:| ... }` — the natural style given named schema parameters — raised `ArgumentError: missing keyword` on every invocation (surfacing as a confusing failed Result). `Definition` now detects keyword parameters at construction and splats the arguments hash to keywords; positional-Hash blocks are unchanged. Keyword blocks without `**rest` raise on unexpected keys — strict by design, since the keys come from the LLM
|
|
15
|
+
- **`:compaction` event was never emitted in production (Medium)**: `Compaction#emitter` defaults to nil and nothing ever assigned it, so the documented `agent.on(:compaction)` subscription silently never fired — the only place the emitter was set was the spec itself. `Loop#initialize` now wires its emitter into the compaction strategy (an explicitly preassigned emitter is left untouched)
|
|
16
|
+
- **Streaming chunks were never normalized to UTF-8 (Medium)**: Faraday delivers `on_data` chunks as ASCII-8BIT; appending a chunk to a UTF-8 SSE buffer already holding non-ASCII text raises `Encoding::CompatibilityError`, and yielded deltas could carry binary encoding into consumers' UTF-8 buffers. All three providers now buffer in BINARY and re-encode each complete SSE line to UTF-8 (with `scrub` guarding invalid bytes) before parsing, so `:text_delta` events are always valid UTF-8 — including multi-byte characters split across network chunks
|
|
17
|
+
- **Streaming fallback gave consumers no way to truncate partial primary output (Medium)**: If the primary streamed text and then died mid-stream, the fallback streamed a complete fresh response — a delta-appending consumer rendered `"<partial primary><full fallback>"` with no signal of how much to discard. The `:fallback_start` payload now includes `partial_output` (Boolean) and `partial_chars` (characters already yielded), so consumers can deterministically reset
|
|
18
|
+
- **Tool names were not validated against provider constraints (Medium)**: A tool named `send.email` registered fine and then 400'd on every Anthropic request with an opaque server error. `Definition` now validates names against `/\A[a-zA-Z0-9_-]{1,64}\z/` (the strictest provider constraint) and raises `ArgumentError` at definition time with a pointed message
|
|
19
|
+
- **`json` was used everywhere but never declared or required (Medium)**: `JSON.parse`/`JSON.generate` are called throughout the providers and agent loop, but the gem relied on Faraday's transitive `json` dependency and the entry point's single `require "json"` — loading `agent/loop.rb` in isolation raised `NameError`, contradicting the composability principle. The gemspec now declares `json >= 2.0` and every file referencing `JSON` requires it directly (pinned by a source-scan spec)
|
|
20
|
+
- **Configuration accepted negative retry/timeout values (Low)**: `max_retries = -1` silently disabled retries and a negative delay raised deep inside the retry loop's `sleep`. The numeric settings now have validated writers that raise `ArgumentError` at assignment time
|
|
21
|
+
- **Global configuration first-access race (Low)**: `@configuration ||= Configuration.new` was unsynchronized; two threads racing the first call could each construct a Configuration with one silently discarded. The configuration is now eagerly initialized at require time
|
|
22
|
+
- **`continue()` Result accounting documented (Docs)**: Each `run`/`continue` builds a fresh Loop, so the returned Result's `usage`/`tool_calls_made`/`turns` cover only that invocation while `messages` is cumulative — an undocumented asymmetry, now documented on `Core#continue`
|
|
23
|
+
- **Schema DSL documented as LLM-facing hints, not validation (Docs)**: Nothing validates model-supplied arguments against `tool.parameters` before invoking the block — `required`/`enum`/`minimum` constrain what the model is asked to produce, with no runtime enforcement or type coercion. This is deliberate (anti-framework), but the schema header now says so loudly and directs tool blocks to treat arguments as untrusted input
|
|
24
|
+
- **`State#add_message` unbounded growth documented (Docs)**: Long-lived agents calling `continue()` repeatedly accumulate messages linearly without compaction configured; documented on the method
|
|
25
|
+
- **CLAUDE.md module map corrected (Docs)**: The map referenced a nonexistent `agent/agent.rb`, omitted `core.rb`/`loop.rb`/`state.rb`/`events.rb`, hardcoded version `0.1.0`, and the extension example used the one-arg `|event|` block signature instead of the actual `|data, agent|`. All corrected
|
|
26
|
+
|
|
27
|
+
### Release-history note
|
|
28
|
+
|
|
29
|
+
- **`[0.1.4]` below was never actually released**: `lib/ruby_pi/version.rb` went from `0.1.3` directly to `0.1.5` — the round-2 fixes documented under 0.1.4 shipped without a version bump and were first published as part of 0.1.5. There is intentionally no `v0.1.4` git tag or gem. (Discovered during round 6; the entry is kept for historical accuracy of *what* changed.)
|
|
30
|
+
|
|
31
|
+
## [0.1.7] - 2026-05-28
|
|
32
|
+
|
|
33
|
+
### Fixed (adversarial review round 5)
|
|
34
|
+
|
|
35
|
+
- **Compaction produced an Anthropic-invalid leading `:assistant` message (Critical)**: The 0.1.6 orphan-`:tool` strip fixed tool-result splitting but left the summary-role logic (`first_preserved == :assistant ? :user : :assistant`) intact. Whenever the first preserved message was `:user` (multi-turn reuse) or the preserved window emptied out (all tool results), the summary became an `:assistant` message at the head of the conversation — which Anthropic rejects with HTTP 400 "first message must use the 'user' role". The summary is now **always** a `:user` message (valid as the first message and never overwriting the system prompt). When the first preserved message is itself `:user`, the summary is merged into it to avoid consecutive same-role messages; an empty preserved window yields a lone `:user` summary. Extracted into `Compaction#build_compacted_history`
|
|
36
|
+
- **Compaction dead "mirror case" branch removed (Minor)**: The 0.1.6 `if droppable.last … && preserved.first[:role] == :tool` block was unreachable — the preceding `while` loop guarantees `preserved.first` is never `:tool`. Removed it (the originating assistant message is already in droppable alongside its now-moved tool results, so the pair is never split), eliminating misleading dead code
|
|
37
|
+
- **Deterministic `ProviderError` was retried with backoff (Minor)**: 0.1.6 added `RubyPi::ProviderError` to the retryable set in `BaseProvider#complete`, but provider errors are overwhelmingly deterministic request-construction failures (missing `tool_call_id`, invalid tool-argument JSON) raised before any HTTP call — retrying only burned the backoff schedule before re-raising the identical error. `ProviderError` is no longer retried. Fallback failover is unaffected (it rescues the `RubyPi::Error` superclass)
|
|
38
|
+
- **Lifecycle hooks saw string-keyed tool arguments while events saw symbols (Minor)**: `before_tool_call`/`after_tool_call` received the raw `ToolCall` (string-keyed `arguments`) while the `:tool_execution_start` event and `tool_calls_made` carried symbol keys — so a hook and an event subscriber disagreed on the key type for the same call. `Loop#act` now rebuilds each `ToolCall` with symbol-keyed arguments up front, so hooks, events, `tool_calls_made`, and the tool block all observe the identical shape
|
|
39
|
+
- **Anthropic streaming `finish_reason` could be clobbered to nil (Minor)**: A trailing `message_delta` event without a `stop_reason` overwrote the previously captured value, yielding a `Response` with no `finish_reason`. The assignment is now guarded (`finish_reason = delta["stop_reason"] if delta["stop_reason"]`), matching the OpenAI/Gemini guards
|
|
40
|
+
- **Gemini `finishReason` assumed a String (Minor)**: `finishReason.downcase` would raise `NoMethodError` on a non-String payload mid-stream. Both the streaming and standard paths now coerce via `to_s` before `downcase`, and remain consistent with each other
|
|
41
|
+
- **Dead streamed-content accumulator removed (Cleanup)**: `Loop#think` accumulated `streamed_content` that was never read (the recorded assistant message uses `Response#content`); the `.clear` on `:fallback_start` was a no-op and its comment was inaccurate. Removed the local; the `:provider_fallback` event still fires
|
|
42
|
+
- **`Fallback` class docstring corrected (Docs)**: The class-level docstring still described the removed happy-path buffering ("the Fallback now buffers deltas… buffered deltas are discarded"), contradicting the real-time direct-streaming implementation. Updated to describe direct streaming plus the `:fallback_start` signal
|
|
43
|
+
|
|
44
|
+
### Investigated, no change
|
|
45
|
+
|
|
46
|
+
- **Streaming HTTP error bodies via `env.status`**: A prior review raised that streaming error responses might lose their body if Faraday's `on_data` callback received a nil `env.status`. Verified against the actual stack (faraday 2.14.1 / faraday-net_http 3.3.0): the net_http adapter calls `save_http_response` (which sets `env.status`) before `response.read_body` streams chunks, and `Env#stream_response` passes that same populated `env` to the user's `on_data` proc. `env.status` is therefore reliably available before the first chunk, so the existing `error_body` recovery works. No fix needed
|
|
47
|
+
|
|
8
48
|
## [0.1.6] - 2026-05-01
|
|
9
49
|
|
|
10
50
|
### Fixed (adversarial review round 4)
|
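As a rough illustration of the `Retry-After` entry in the 0.1.8 changelog above, the delay selection might look something like the sketch below. The 60s `RETRY_AFTER_CEILING`, the fallback to computed backoff, and the `retry_max_delay` cap come from the changelog text. The method body, the keyword arguments, the doubling backoff formula, and the stand-in `RateLimitError` struct are assumptions made for illustration, not the gem's actual code.

```ruby
# Hypothetical stand-in for RubyPi::RateLimitError; only #retry_after matters here.
RateLimitError = Struct.new(:retry_after)

RETRY_AFTER_CEILING = 60.0 # seconds, per the changelog entry

# Picks the delay before the next retry attempt.
# Assumption: the local backoff doubles per attempt, starting from base_delay.
def retry_delay_for(error, attempt, base_delay: 1.0, max_delay: 30.0)
  hint = error.respond_to?(:retry_after) ? error.retry_after.to_f : 0.0

  # A positive server hint wins, but it is capped so a huge value cannot stall the caller.
  return [hint, RETRY_AFTER_CEILING].min if hint.positive?

  # Absent header or HTTP-date value (parsed as 0.0): fall back to the local backoff.
  [base_delay * (2**(attempt - 1)), max_delay].min
end

retry_delay_for(RateLimitError.new(45), 1)   # server asked for 45s, so 45.0
retry_delay_for(RateLimitError.new(600), 1)  # capped at the ceiling, so 60.0
retry_delay_for(RateLimitError.new(0.0), 3)  # no usable hint, so 1.0 * 2**2 = 4.0
```

Capping the server hint is a trade-off: it keeps a misbehaving server from parking the caller for minutes, at the cost of possibly retrying earlier than the server asked.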
data/lib/ruby_pi/agent/core.rb
CHANGED
|
@@ -140,12 +140,18 @@ module RubyPi
|
|
|
140
140
|
# the existing conversation history and appends the new prompt before
|
|
141
141
|
# resuming the loop.
|
|
142
142
|
#
|
|
143
|
+
# NOTE on Result accounting: each run/continue builds a fresh Loop, so
|
|
144
|
+
# the returned Result's `usage`, `tool_calls_made`, and `turns` cover
|
|
145
|
+
# ONLY this invocation — while `messages` is cumulative across the whole
|
|
146
|
+
# conversation. Sum the per-call Results if you need session totals.
|
|
147
|
+
#
|
|
143
148
|
# Issue #16: Uses the encapsulated reset_iteration! method instead of
|
|
144
149
|
# the old approach that bypassed encapsulation
|
|
145
150
|
# and was fragile.
|
|
146
151
|
#
|
|
147
152
|
# @param prompt [String] the follow-up user message
|
|
148
153
|
# @return [RubyPi::Agent::Result] the outcome of the continued run
|
|
154
|
+
# (usage/tool_calls_made/turns are per-invocation; messages cumulative)
|
|
149
155
|
def continue(prompt)
|
|
150
156
|
@state.reset_iteration!
|
|
151
157
|
@state.add_message(role: :user, content: prompt)
|
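The NOTE added to `Core#continue` suggests summing per-call Results to get session totals. A minimal sketch of that, assuming `usage` is a Hash of integer token counts, could look like this. The `Result` struct here is a hypothetical stand-in, and the `:prompt_tokens` key is an assumption; only `:completion_tokens` appears elsewhere in this diff.

```ruby
# Hypothetical stand-in for RubyPi::Agent::Result with only the per-invocation fields.
Result = Struct.new(:usage, :turns, :tool_calls_made, keyword_init: true)

# Sums per-invocation accounting across several run/continue Results.
def session_totals(results)
  {
    # Merge the usage hashes, adding the counts for keys present in both.
    usage: results.map(&:usage).reduce({}) { |acc, u| acc.merge(u) { |_key, a, b| a + b } },
    turns: results.sum(&:turns),
    tool_calls: results.sum { |r| r.tool_calls_made.size }
  }
end

first = Result.new(
  usage: { prompt_tokens: 120, completion_tokens: 40 },
  turns: 2,
  tool_calls_made: [{ tool_name: "search" }]
)
second = Result.new(
  usage: { prompt_tokens: 200, completion_tokens: 55 },
  turns: 1,
  tool_calls_made: []
)

totals = session_totals([first, second])
# totals[:usage] equals { prompt_tokens: 320, completion_tokens: 95 }
# totals[:turns] equals 3 and totals[:tool_calls] equals 1
```

`messages` is deliberately left out of the sum: per the NOTE it is already cumulative, so the last Result carries the full history.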
data/lib/ruby_pi/agent/loop.rb
CHANGED
|
@@ -10,6 +10,8 @@
|
|
|
10
10
|
# is reached. It handles streaming, lifecycle events, compaction, and all
|
|
11
11
|
# pre/post tool call hooks.
|
|
12
12
|
|
|
13
|
+
require "json"
|
|
14
|
+
|
|
13
15
|
module RubyPi
|
|
14
16
|
module Agent
|
|
15
17
|
# Executes the think-act-observe cycle against a given State, emitting
|
|
@@ -55,6 +57,14 @@ module RubyPi
|
|
|
55
57
|
@state = state
|
|
56
58
|
@emitter = emitter
|
|
57
59
|
@compaction = compaction
|
|
60
|
+
# Wire the loop's emitter into the compaction strategy so the
|
|
61
|
+
# documented :compaction event actually reaches agent subscribers.
|
|
62
|
+
# Compaction#emitter defaults to nil and nothing else ever sets it —
|
|
63
|
+
# without this, `agent.on(:compaction)` never fires. An emitter that
|
|
64
|
+
# was already assigned explicitly is left untouched.
|
|
65
|
+
if @compaction.respond_to?(:emitter=) && @compaction.respond_to?(:emitter) && @compaction.emitter.nil?
|
|
66
|
+
@compaction.emitter = emitter
|
|
67
|
+
end
|
|
58
68
|
@execution_mode = execution_mode
|
|
59
69
|
@tool_timeout = tool_timeout
|
|
60
70
|
@tool_calls_made = []
|
|
@@ -145,17 +155,16 @@ module RubyPi
|
|
|
145
155
|
# Build tools array for the LLM
|
|
146
156
|
tools = build_tools_array
|
|
147
157
|
|
|
148
|
-
#
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
#
|
|
158
|
+
# Call the LLM with streaming. The recorded assistant message uses
|
|
159
|
+
# the returned Response#content (already the final, authoritative
|
|
160
|
+
# text), so there is no need to accumulate deltas here — we only
|
|
161
|
+
# re-emit them for subscribers.
|
|
152
162
|
response = @state.model.complete(
|
|
153
163
|
messages: messages,
|
|
154
164
|
tools: tools,
|
|
155
165
|
stream: true
|
|
156
166
|
) do |event|
|
|
157
167
|
if event.text_delta?
|
|
158
|
-
streamed_content << event.data.to_s
|
|
159
168
|
@emitter.emit(:text_delta, content: event.data)
|
|
160
169
|
elsif event.tool_call_delta?
|
|
161
170
|
# Emit tool call delta events so subscribers can observe partial
|
|
@@ -164,12 +173,11 @@ module RubyPi
|
|
|
164
173
|
@emitter.emit(:tool_call_delta, data: event.data)
|
|
165
174
|
elsif event.fallback_start?
|
|
166
175
|
# The primary LLM provider failed mid-stream and a Fallback
|
|
167
|
-
# provider is now taking over.
|
|
168
|
-
#
|
|
169
|
-
#
|
|
170
|
-
#
|
|
171
|
-
#
|
|
172
|
-
streamed_content.clear
|
|
176
|
+
# provider is now taking over. Surface a :provider_fallback event
|
|
177
|
+
# so subscribers can clear any UI state they rendered from the
|
|
178
|
+
# discarded primary deltas. The recorded response is unaffected:
|
|
179
|
+
# it comes from the fallback provider's returned Response#content,
|
|
180
|
+
# never from the failed primary's partial text.
|
|
173
181
|
@emitter.emit(:provider_fallback, **event.data)
|
|
174
182
|
end
|
|
175
183
|
end
|
|
@@ -208,32 +216,39 @@ module RubyPi
|
|
|
208
216
|
timeout: @tool_timeout
|
|
209
217
|
)
|
|
210
218
|
|
|
211
|
-
#
|
|
212
|
-
#
|
|
213
|
-
#
|
|
214
|
-
#
|
|
215
|
-
#
|
|
216
|
-
#
|
|
217
|
-
symbolized
|
|
218
|
-
|
|
219
|
+
# Normalize each tool call's arguments to symbol keys ONCE, up front,
|
|
220
|
+
# by rebuilding the ToolCall objects. Every downstream consumer — the
|
|
221
|
+
# executor (which invokes the tool block), the before/after_tool_call
|
|
222
|
+
# hooks (which receive the ToolCall directly), the emitted
|
|
223
|
+
# :tool_execution_start event, and the recorded `tool_calls_made`
|
|
224
|
+
# payload — then observes the identical symbol-keyed shape. Carrying
|
|
225
|
+
# the symbolized form on the ToolCall itself (rather than in a side
|
|
226
|
+
# array) is what keeps the hooks consistent with everything else;
|
|
227
|
+
# previously hooks saw raw string keys while events/records saw symbols.
|
|
228
|
+
tool_calls = response.tool_calls.map do |tc|
|
|
229
|
+
RubyPi::LLM::ToolCall.new(
|
|
230
|
+
id: tc.id,
|
|
231
|
+
name: tc.name,
|
|
232
|
+
arguments: RubyPi::Tools::Executor.deep_symbolize_keys(tc.arguments)
|
|
233
|
+
)
|
|
219
234
|
end
|
|
220
235
|
|
|
221
236
|
# Prepare call hashes for the executor
|
|
222
|
-
calls =
|
|
223
|
-
{ name: tc.name, arguments:
|
|
237
|
+
calls = tool_calls.map do |tc|
|
|
238
|
+
{ name: tc.name, arguments: tc.arguments }
|
|
224
239
|
end
|
|
225
240
|
|
|
226
241
|
# Fire before_tool_call hooks and emit start events
|
|
227
|
-
|
|
242
|
+
tool_calls.each do |tc|
|
|
228
243
|
@state.before_tool_call&.call(tc)
|
|
229
|
-
@emitter.emit(:tool_execution_start, tool_name: tc.name, arguments:
|
|
244
|
+
@emitter.emit(:tool_execution_start, tool_name: tc.name, arguments: tc.arguments)
|
|
230
245
|
end
|
|
231
246
|
|
|
232
247
|
# Execute all tool calls
|
|
233
248
|
results = executor.execute(calls)
|
|
234
249
|
|
|
235
250
|
# Fire after_tool_call hooks, emit end events, and add results to messages
|
|
236
|
-
|
|
251
|
+
tool_calls.each_with_index do |tc, idx|
|
|
237
252
|
result = results[idx]
|
|
238
253
|
|
|
239
254
|
@state.after_tool_call&.call(tc, result)
|
|
@@ -247,7 +262,7 @@ module RubyPi
|
|
|
247
262
|
# arguments so callers see the same shape the tool itself received.
|
|
248
263
|
@tool_calls_made << {
|
|
249
264
|
tool_name: tc.name,
|
|
250
|
-
arguments:
|
|
265
|
+
arguments: tc.arguments,
|
|
251
266
|
result: result.to_h
|
|
252
267
|
}
|
|
253
268
|
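The `Loop#act` change above leans on `RubyPi::Tools::Executor.deep_symbolize_keys`, whose body is not part of this diff. A hypothetical equivalent, written only to show what "symbol-keyed all the way down" means for nested tool arguments, might be:

```ruby
# Hypothetical equivalent of a deep key-symbolizing helper; not the gem's code.
def deep_symbolize_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, val), out|
      # Keys that cannot become symbols (e.g. integers) are kept as they are.
      sym = key.respond_to?(:to_sym) ? key.to_sym : key
      out[sym] = deep_symbolize_keys(val)
    end
  when Array
    value.map { |item| deep_symbolize_keys(item) }
  else
    value
  end
end

raw = { "content" => "hi", "options" => { "tags" => [{ "name" => "a" }] } }
symbolized = deep_symbolize_keys(raw)
# symbolized equals { content: "hi", options: { tags: [{ name: "a" }] } }
```

Doing this once and carrying the result on the rebuilt `ToolCall` is what lets hooks, events, and `tool_calls_made` agree on the key type.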
|
data/lib/ruby_pi/agent/state.rb
CHANGED
|
@@ -91,6 +91,12 @@ module RubyPi
|
|
|
91
91
|
|
|
92
92
|
# Appends a message to the conversation history.
|
|
93
93
|
#
|
|
94
|
+
# NOTE: history grows without bound — there is no built-in cap. Growth
|
|
95
|
+
# per run is limited by max_iterations, but long-lived agents that call
|
|
96
|
+
# continue() repeatedly (or use a high max_iterations with large tool
|
|
97
|
+
# outputs) accumulate messages linearly. Configure
|
|
98
|
+
# Agent.new(compaction: ...) to keep the context bounded.
|
|
99
|
+
#
|
|
94
100
|
# @param role [Symbol, String] the message role (:user, :assistant, :system, :tool)
|
|
95
101
|
# @param content [String, nil] the text content of the message
|
|
96
102
|
# @param options [Hash] additional fields (e.g., :tool_call_id, :tool_calls)
|
data/lib/ruby_pi/configuration.rb
CHANGED
|
@@ -37,19 +37,53 @@ module RubyPi
|
|
|
37
37
|
attr_accessor :openai_api_key
|
|
38
38
|
|
|
39
39
|
# @return [Integer] Maximum number of retry attempts for transient errors (default: 3)
|
|
40
|
-
|
|
40
|
+
attr_reader :max_retries
|
|
41
41
|
|
|
42
42
|
# @return [Float] Base delay in seconds for exponential backoff (default: 1.0)
|
|
43
|
-
|
|
43
|
+
attr_reader :retry_base_delay
|
|
44
44
|
|
|
45
45
|
# @return [Float] Maximum delay in seconds between retries (default: 30.0)
|
|
46
|
-
|
|
46
|
+
attr_reader :retry_max_delay
|
|
47
47
|
|
|
48
48
|
# @return [Integer] HTTP request timeout in seconds (default: 120)
|
|
49
|
-
|
|
49
|
+
attr_reader :request_timeout
|
|
50
50
|
|
|
51
51
|
# @return [Integer] HTTP connection open timeout in seconds (default: 10)
|
|
52
|
-
|
|
52
|
+
attr_reader :open_timeout
|
|
53
|
+
|
|
54
|
+
# Validated writers for numeric settings. A negative max_retries silently
|
|
55
|
+
# disables retries and a negative delay raises deep inside the retry
|
|
56
|
+
# loop's sleep — fail fast at assignment time instead, where the typo is.
|
|
57
|
+
|
|
58
|
+
# @param value [Integer] must be a non-negative integer
|
|
59
|
+
def max_retries=(value)
|
|
60
|
+
validate_numeric!(:max_retries, value)
|
|
61
|
+
@max_retries = value
|
|
62
|
+
end
|
|
63
|
+
|
|
64
|
+
# @param value [Numeric] must be non-negative
|
|
65
|
+
def retry_base_delay=(value)
|
|
66
|
+
validate_numeric!(:retry_base_delay, value)
|
|
67
|
+
@retry_base_delay = value
|
|
68
|
+
end
|
|
69
|
+
|
|
70
|
+
# @param value [Numeric] must be non-negative
|
|
71
|
+
def retry_max_delay=(value)
|
|
72
|
+
validate_numeric!(:retry_max_delay, value)
|
|
73
|
+
@retry_max_delay = value
|
|
74
|
+
end
|
|
75
|
+
|
|
76
|
+
# @param value [Numeric] must be non-negative
|
|
77
|
+
def request_timeout=(value)
|
|
78
|
+
validate_numeric!(:request_timeout, value)
|
|
79
|
+
@request_timeout = value
|
|
80
|
+
end
|
|
81
|
+
|
|
82
|
+
# @param value [Numeric] must be non-negative
|
|
83
|
+
def open_timeout=(value)
|
|
84
|
+
validate_numeric!(:open_timeout, value)
|
|
85
|
+
@open_timeout = value
|
|
86
|
+
end
|
|
53
87
|
|
|
54
88
|
# @return [String] Default model name for Gemini provider
|
|
55
89
|
attr_accessor :default_gemini_model
|
|
@@ -78,6 +112,17 @@ module RubyPi
|
|
|
78
112
|
|
|
79
113
|
private
|
|
80
114
|
|
|
115
|
+
# Raises unless the value is a non-negative Numeric.
|
|
116
|
+
#
|
|
117
|
+
# @param name [Symbol] the setting name (for the error message)
|
|
118
|
+
# @param value [Object] the value being assigned
|
|
119
|
+
# @raise [ArgumentError] if value is not a Numeric or is negative
|
|
120
|
+
def validate_numeric!(name, value)
|
|
121
|
+
return if value.is_a?(Numeric) && value >= 0
|
|
122
|
+
|
|
123
|
+
raise ArgumentError, "#{name} must be a non-negative number, got #{value.inspect}"
|
|
124
|
+
end
|
|
125
|
+
|
|
81
126
|
# Sets all configuration ivars to their default values. Called by both
|
|
82
127
|
# initialize and reset! to ensure consistent defaults without the
|
|
83
128
|
# anti-pattern of calling initialize from reset!.
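To make the fail-fast behavior of the validated writers concrete, here is a small self-contained sketch. `RetrySettings` is a hypothetical stand-in class with only two of the settings; the writer and `validate_numeric!` bodies mirror the hunk above.

```ruby
# Hypothetical stand-in mirroring the validated-writer pattern from the hunk above.
class RetrySettings
  attr_reader :max_retries, :retry_base_delay

  def initialize
    @max_retries = 3
    @retry_base_delay = 1.0
  end

  def max_retries=(value)
    validate_numeric!(:max_retries, value)
    @max_retries = value
  end

  def retry_base_delay=(value)
    validate_numeric!(:retry_base_delay, value)
    @retry_base_delay = value
  end

  private

  # Raises unless the value is a non-negative Numeric.
  def validate_numeric!(name, value)
    return if value.is_a?(Numeric) && value >= 0

    raise ArgumentError, "#{name} must be a non-negative number, got #{value.inspect}"
  end
end

settings = RetrySettings.new
settings.max_retries = 5 # accepted

begin
  settings.max_retries = -1
rescue ArgumentError => e
  message = e.message # "max_retries must be a non-negative number, got -1"
end
```

As written, the check accepts any non-negative `Numeric`, so `max_retries = 1.5` would also pass even though the doc comment asks for an integer.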
|
data/lib/ruby_pi/context/compaction.rb
CHANGED
|
@@ -87,34 +87,22 @@ module RubyPi
|
|
|
87
87
|
# call but keep the matching tool_result, the API rejects the
|
|
88
88
|
# request with "tool_result without preceding tool_use".
|
|
89
89
|
#
|
|
90
|
-
#
|
|
91
|
-
#
|
|
92
|
-
#
|
|
93
|
-
#
|
|
94
|
-
#
|
|
95
|
-
#
|
|
96
|
-
#
|
|
97
|
-
#
|
|
98
|
-
# that assistant message back into preserved so the pair
|
|
99
|
-
# stays intact.
|
|
100
|
-
#
|
|
101
|
-
# We apply (a) first: it's the common case (preserve_last_n=4 cuts
|
|
102
|
-
# mid-pair, leaving a stranded tool message). Then (b) catches the
|
|
103
|
-
# mirror case.
|
|
90
|
+
# When the boundary between droppable and preserved cuts mid-exchange,
|
|
91
|
+
# preserved can start with one or more orphan :tool messages whose
|
|
92
|
+
# matching assistant turn is in droppable. Strip those off the head of
|
|
93
|
+
# preserved and move them into droppable so they are summarized away
|
|
94
|
+
# rather than sent. Because the originating assistant message is older,
|
|
95
|
+
# it is already in droppable, so the pair stays together there — there
|
|
96
|
+
# is no mirror case to handle (once a tool result is moved across, its
|
|
97
|
+
# assistant is never left stranded on the preserved side).
|
|
104
98
|
while preserved.first && preserved.first[:role] == :tool
|
|
105
99
|
droppable << preserved.shift
|
|
106
100
|
end
|
|
107
101
|
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
preserved.first && preserved.first[:role] == :tool
|
|
113
|
-
preserved.unshift(droppable.pop)
|
|
114
|
-
end
|
|
115
|
-
|
|
116
|
-
# After the boundary fix-ups, droppable may have become empty.
|
|
117
|
-
return nil if droppable.empty?
|
|
102
|
+
# The orphan-strip only moves messages INTO droppable, so droppable
|
|
103
|
+
# cannot have shrunk; it is still non-empty here. preserved, however,
|
|
104
|
+
# may now be empty (the whole window was tool results) — the summary
|
|
105
|
+
# construction below handles that case.
|
|
118
106
|
|
|
119
107
|
# Generate a summary of the dropped messages
|
|
120
108
|
summary = summarize(droppable)
|
|
@@ -122,28 +110,42 @@ module RubyPi
|
|
|
122
110
|
# Emit compaction event if an emitter is available
|
|
123
111
|
@emitter&.emit(:compaction, dropped_count: droppable.size, summary: summary)
|
|
124
112
|
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
|
|
143
|
-
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
|
|
113
|
+
build_compacted_history(summary, preserved)
|
|
114
|
+
end
|
|
115
|
+
|
|
116
|
+
# Builds the compacted history: a summary message followed by the
|
|
117
|
+
# preserved tail.
|
|
118
|
+
#
|
|
119
|
+
# The summary becomes the FIRST message of the compacted history, so it
|
|
120
|
+
# must satisfy the strictest provider constraints (Anthropic):
|
|
121
|
+
# 1. The summary role MUST NOT be :system — that would overwrite the
|
|
122
|
+
# real system prompt on Anthropic, which promotes the last :system
|
|
123
|
+
# message to the top-level `system:` parameter.
|
|
124
|
+
# 2. The first message MUST use role :user.
|
|
125
|
+
# 3. Consecutive same-role messages are rejected.
|
|
126
|
+
#
|
|
127
|
+
# A :user summary satisfies (1) and (2). For (3): the orphan-strip above
|
|
128
|
+
# guarantees the first preserved message is :assistant, :user, or absent
|
|
129
|
+
# (never :tool). When it is :assistant or absent, a standalone :user
|
|
130
|
+
# summary alternates correctly. When it is :user, a separate :user
|
|
131
|
+
# summary would create two consecutive user messages, so we instead
|
|
132
|
+
# merge the summary text into that existing user message — keeping the
|
|
133
|
+
# first message a single :user message with no role collision.
|
|
134
|
+
#
|
|
135
|
+
# @param summary [String] the generated summary text
|
|
136
|
+
# @param preserved [Array<Hash>] the preserved tail of messages
|
|
137
|
+
# @return [Array<Hash>] the compacted history
|
|
138
|
+
def build_compacted_history(summary, preserved)
|
|
139
|
+
summary_text = "[Conversation Summary]\n#{summary}"
|
|
140
|
+
first_preserved = preserved.first
|
|
141
|
+
|
|
142
|
+
if first_preserved && first_preserved[:role] == :user
|
|
143
|
+
merged = first_preserved.dup
|
|
144
|
+
merged[:content] = "#{summary_text}\n\n#{first_preserved[:content]}"
|
|
145
|
+
[merged] + preserved.drop(1)
|
|
146
|
+
else
|
|
147
|
+
[{ role: :user, content: summary_text }] + preserved
|
|
148
|
+
end
|
|
147
149
|
end
|
|
148
150
|
|
|
149
151
|
# Estimates the total token count for a system prompt and message array
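The three cases handled by `build_compacted_history` are easier to see with concrete inputs. The function below is a standalone copy of the logic in the hunk above; the sample messages are made up for illustration.

```ruby
# Standalone copy of the summary-placement logic shown in the hunk above.
def build_compacted_history(summary, preserved)
  summary_text = "[Conversation Summary]\n#{summary}"
  first_preserved = preserved.first

  if first_preserved && first_preserved[:role] == :user
    # Merge into the existing user message to avoid two consecutive :user messages.
    merged = first_preserved.dup
    merged[:content] = "#{summary_text}\n\n#{first_preserved[:content]}"
    [merged] + preserved.drop(1)
  else
    # First preserved message is :assistant, or the preserved window is empty.
    [{ role: :user, content: summary_text }] + preserved
  end
end

user_first      = [{ role: :user, content: "next question" }, { role: :assistant, content: "answer" }]
assistant_first = [{ role: :assistant, content: "answer" }]

build_compacted_history("earlier chat", user_first).map { |m| m[:role] }      # [:user, :assistant]
build_compacted_history("earlier chat", assistant_first).map { |m| m[:role] } # [:user, :assistant]
build_compacted_history("earlier chat", []).map { |m| m[:role] }              # [:user]
```

In every case the history starts with a single `:user` message and never has two adjacent messages with the same role at the head, which is the Anthropic constraint the fix targets.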
|
data/lib/ruby_pi/llm/anthropic.rb
CHANGED
|
@@ -6,6 +6,8 @@
|
|
|
6
6
|
# the Anthropic Messages API for both synchronous and streaming completions,
|
|
7
7
|
# including tool_use block support.
|
|
8
8
|
|
|
9
|
+
require "json"
|
|
10
|
+
|
|
9
11
|
module RubyPi
|
|
10
12
|
module LLM
|
|
11
13
|
# Anthropic Claude provider implementation. Communicates with the Anthropic
|
|
@@ -370,12 +372,19 @@ module RubyPi
|
|
|
370
372
|
# process complete lines incrementally so that deltas reach the caller
|
|
371
373
|
# as soon as each SSE event is fully received — not after the entire
|
|
372
374
|
# response has been buffered.
|
|
373
|
-
|
|
375
|
+
#
|
|
376
|
+
# The buffer is BINARY because chunks arrive as ASCII-8BIT and may end
|
|
377
|
+
# mid-way through a multi-byte UTF-8 character; appending such a chunk
|
|
378
|
+
# to a UTF-8 buffer that already holds non-ASCII text raises
|
|
379
|
+
# Encoding::CompatibilityError. Each complete line is re-encoded to
|
|
380
|
+
# UTF-8 (and scrubbed) before parsing, so deltas reach the caller as
|
|
381
|
+
# valid UTF-8 strings.
|
|
382
|
+
sse_buffer = (+"").force_encoding(Encoding::BINARY)
|
|
374
383
|
response_status = nil
|
|
375
384
|
|
|
376
385
|
# Accumulate error response body separately so ApiError gets the
|
|
377
386
|
# full body even though on_data consumed the chunks.
|
|
378
|
-
error_body = +""
|
|
387
|
+
error_body = (+"").force_encoding(Encoding::BINARY)
|
|
379
388
|
|
|
380
389
|
response = with_transport_errors do
|
|
381
390
|
conn.post("/v1/messages") do |req|
|
|
@@ -394,14 +403,17 @@ module RubyPi
|
|
|
394
403
|
# calls on_data for error responses too, which would otherwise
|
|
395
404
|
# consume the body and leave response.body empty.
|
|
396
405
|
if response_status && response_status >= 400
|
|
397
|
-
error_body << chunk
|
|
406
|
+
error_body << chunk.b
|
|
398
407
|
next
|
|
399
408
|
end
|
|
400
409
|
|
|
401
|
-
sse_buffer << chunk
|
|
402
|
-
# Process all complete lines in the buffer
|
|
410
|
+
sse_buffer << chunk.b
|
|
411
|
+
# Process all complete lines in the buffer. A complete line holds
|
|
412
|
+
# complete UTF-8 sequences (multi-byte characters split across
|
|
413
|
+
# chunks are repaired by the buffering), so re-encode it to UTF-8
|
|
414
|
+
# here; scrub guards against a server sending invalid bytes.
|
|
403
415
|
while (line_end = sse_buffer.index("\n"))
|
|
404
|
-
line = sse_buffer.slice!(0, line_end + 1).strip
|
|
416
|
+
line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
|
|
405
417
|
next if line.empty?
|
|
406
418
|
next unless line.start_with?("data: ")
|
|
407
419
|
|
|
@@ -436,12 +448,12 @@ module RubyPi
|
|
|
436
448
|
unless response.success?
|
|
437
449
|
# Reconstruct the response body from what on_data accumulated
|
|
438
450
|
error_response = response
|
|
439
|
-
error_body_str = error_body.empty? ? response.body : error_body
|
|
451
|
+
error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub
|
|
440
452
|
handle_error_response(error_response, override_body: error_body_str)
|
|
441
453
|
end
|
|
442
454
|
|
|
443
455
|
# Process any remaining data in the buffer after the connection closes
|
|
444
|
-
sse_buffer.each_line do |line|
|
|
456
|
+
sse_buffer.force_encoding(Encoding::UTF_8).scrub.each_line do |line|
|
|
445
457
|
line = line.strip
|
|
446
458
|
next if line.empty?
|
|
447
459
|
next unless line.start_with?("data: ")
|
|
@@ -562,7 +574,12 @@ module RubyPi
|
|
|
562
574
|
|
|
563
575
|
when "message_delta"
|
|
564
576
|
delta = data["delta"] || {}
|
|
565
|
-
finish_reason
|
|
577
|
+
# Only overwrite finish_reason when this delta actually carries a
|
|
578
|
+
# stop_reason. Anthropic emits the stop_reason on a single
|
|
579
|
+
# message_delta near the end of the stream; a later message_delta
|
|
580
|
+
# without one must not clobber the captured value back to nil
|
|
581
|
+
# (which would yield a Response with no finish_reason).
|
|
582
|
+
finish_reason = delta["stop_reason"] if delta["stop_reason"]
|
|
566
583
|
if data.key?("usage")
|
|
567
584
|
usage_info = data["usage"]
|
|
568
585
|
usage_data[:completion_tokens] = usage_info["output_tokens"]
|
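The guard added in the `message_delta` branch can be illustrated in isolation. This is a minimal sketch, not the gem's parser: the event hashes are hypothetical and only mimic the keys the hunk reads, and the unguarded variant is an assumption included for contrast, since the removed line is truncated in this diff.

```ruby
# Hypothetical payloads shaped like the "message_delta" hashes read in the hunk.
events = [
  { "type" => "message_delta", "delta" => { "stop_reason" => "end_turn" } },
  { "type" => "message_delta", "delta" => {}, "usage" => { "output_tokens" => 42 } }
]

# Walks the events and returns the finish_reason left at the end of the stream.
# guarded: true mirrors the added line; guarded: false is an assumed
# unconditional assignment, shown only to demonstrate the clobbering.
def last_finish_reason(events, guarded:)
  finish_reason = nil
  events.each do |data|
    next unless data["type"] == "message_delta"

    delta = data["delta"] || {}
    if guarded
      finish_reason = delta["stop_reason"] if delta["stop_reason"]
    else
      finish_reason = delta["stop_reason"]
    end
  end
  finish_reason
end

unguarded = last_finish_reason(events, guarded: false)
guarded = last_finish_reason(events, guarded: true)
```

With the guard, the second delta (which carries only usage) leaves the captured `stop_reason` untouched.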
data/lib/ruby_pi/llm/base_provider.rb
CHANGED
|
@@ -78,14 +78,23 @@ module RubyPi
|
|
|
78
78
|
rescue RubyPi::AuthenticationError
|
|
79
79
|
# Authentication errors are not retryable — raise immediately
|
|
80
80
|
raise
|
|
81
|
-
rescue RubyPi::RateLimitError, RubyPi::ApiError, RubyPi::TimeoutError
|
|
81
|
+
rescue RubyPi::RateLimitError, RubyPi::ApiError, RubyPi::TimeoutError => e
|
|
82
|
+
# NOTE: RubyPi::ProviderError is intentionally NOT retried. Provider
|
|
83
|
+
# errors are overwhelmingly deterministic request-construction
|
|
84
|
+
# failures (missing tool_call_id, invalid tool-argument JSON, missing
|
|
85
|
+
# tool name) raised by build_request_body BEFORE any HTTP call. They
|
|
86
|
+
# produce the identical error on every attempt, so retrying only
|
|
87
|
+
# burns the backoff schedule before surfacing the same failure.
|
|
88
|
+
# Fallback wrappers still rescue RubyPi::Error (the ProviderError
|
|
89
|
+
# superclass), so provider failover is unaffected.
|
|
90
|
+
#
|
|
82
91
|
# Retry up to max_retries times AFTER the initial attempt.
|
|
83
92
|
# With max_retries: 3, attempt goes 1 (initial), 2, 3, 4 — the condition
|
|
84
93
|
# `attempt <= @max_retries` allows retries on attempts 1..3, so we get
|
|
85
94
|
# 3 retries + 1 initial = 4 total attempts. Previously used `< @max_retries`
|
|
86
95
|
# which was off-by-one (only 2 retries with max_retries: 3).
|
|
87
96
|
if attempt <= @max_retries
|
|
88
|
-
delay =
|
|
97
|
+
delay = retry_delay_for(e, attempt)
|
|
89
98
|
log_retry(attempt, delay, e)
|
|
90
99
|
sleep(delay)
|
|
91
100
|
retry
|
|
@@ -127,6 +136,29 @@ module RubyPi
|
|
|
127
136
|
raise RubyPi::AbstractMethodError, :perform_complete
|
|
128
137
|
end
|
|
129
138
|
|
|
139
|
+
# Maximum delay (seconds) honored from a server-provided Retry-After
|
|
140
|
+
# header. Caps pathological or misconfigured server values so a single
|
|
141
|
+
# 429 cannot stall the client indefinitely.
|
|
142
|
+
RETRY_AFTER_CEILING = 60.0
|
|
143
|
+
|
|
144
|
+
# Picks the delay before the next retry. A server-provided Retry-After
|
|
145
|
+
# on a 429 takes precedence over the local exponential backoff: the
|
|
146
|
+
# server knows its own cooldown window, and retrying earlier just burns
|
|
147
|
+
# the retry budget against guaranteed 429s. Retry-After parsed from an
|
|
148
|
+
# HTTP-date (rather than delta-seconds) arrives as 0.0 and falls through
|
|
149
|
+
# to the computed backoff.
|
|
150
|
+
#
|
|
151
|
+
# @param error [Exception] the error that triggered the retry
|
|
152
|
+
# @param attempt [Integer] the current attempt number (1-based)
|
|
153
|
+
# @return [Float] delay in seconds
|
|
154
|
+
def retry_delay_for(error, attempt)
|
|
155
|
+
if error.is_a?(RubyPi::RateLimitError) && error.retry_after&.positive?
|
|
156
|
+
[error.retry_after, RETRY_AFTER_CEILING].min
|
|
157
|
+
else
|
|
158
|
+
calculate_backoff(attempt)
|
|
159
|
+
end
|
|
160
|
+
end
|
|
161
|
+
|
|
130
162
|
# Calculates the backoff delay for a given retry attempt using
|
|
131
163
|
# exponential backoff with jitter.
|
|
132
164
|
#
|
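The delay selection described in the comments above can be sketched as a standalone script. `DemoRateLimitError` and `demo_backoff` are hypothetical stand-ins (the gem's own error class and `calculate_backoff` are not reproduced here), and the backoff omits jitter so the values are reproducible.

```ruby
# Hypothetical stand-in for RubyPi::RateLimitError; only retry_after matters here.
class DemoRateLimitError < StandardError
  attr_reader :retry_after

  def initialize(message = "rate limited", retry_after: nil)
    super(message)
    @retry_after = retry_after
  end
end

# Same ceiling value as in the hunk above.
RETRY_AFTER_CEILING = 60.0

# Assumed backoff for illustration: base * 2^(attempt - 1), no jitter.
def demo_backoff(attempt, base: 1.0)
  base * (2**(attempt - 1))
end

# Mirrors the branch structure of retry_delay_for in the hunk above.
def retry_delay_for(error, attempt)
  if error.is_a?(DemoRateLimitError) && error.retry_after&.positive?
    [error.retry_after, RETRY_AFTER_CEILING].min
  else
    demo_backoff(attempt)
  end
end

honored = retry_delay_for(DemoRateLimitError.new(retry_after: 5.0), 1)
capped = retry_delay_for(DemoRateLimitError.new(retry_after: 3600.0), 1)
http_date = retry_delay_for(DemoRateLimitError.new(retry_after: 0.0), 2)
missing = retry_delay_for(DemoRateLimitError.new, 1)
other = retry_delay_for(StandardError.new("timeout"), 3)
```

A positive `retry_after` wins but is capped at the ceiling; a value of `0.0` or `nil`, or any non-rate-limit error, falls through to the computed backoff.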
data/lib/ruby_pi/llm/fallback.rb
CHANGED
|
@@ -16,11 +16,14 @@ module RubyPi
|
|
|
16
16
|
# Authentication errors are NOT retried with the fallback since they
|
|
17
17
|
# indicate a configuration problem rather than a transient failure.
|
|
18
18
|
#
|
|
19
|
-
# Issue #23: When streaming,
|
|
20
|
-
#
|
|
21
|
-
#
|
|
22
|
-
#
|
|
23
|
-
#
|
|
19
|
+
# Issue #23 + Issue #12: When streaming, events flow from the primary
|
|
20
|
+
# provider directly to the consumer in real time (no buffering), preserving
|
|
21
|
+
# the streaming UX on the happy path. If the primary fails mid-stream, a
|
|
22
|
+
# :fallback_start StreamEvent is emitted before the fallback takes over, so
|
|
23
|
+
# the consumer can discard any partial output already rendered from the
|
|
24
|
+
# failed primary. (The agent loop translates :fallback_start into a
|
|
25
|
+
# :provider_fallback event; raw Fallback consumers should handle
|
|
26
|
+
# :fallback_start themselves.)
|
|
24
27
|
#
|
|
25
28
|
# @example Setting up a fallback chain
|
|
26
29
|
# primary = RubyPi::LLM.model(:gemini, "gemini-2.0-flash")
|
|
@@ -146,6 +149,19 @@ module RubyPi
|
|
|
146
149
|
# @yield [event] the consumer's streaming block
|
|
147
150
|
# @return [RubyPi::LLM::Response]
|
|
148
151
|
def perform_complete_with_streaming_fallback(messages:, tools:, &block)
|
|
152
|
+
# Count the characters of text already delivered to the consumer from
|
|
153
|
+
# the primary. If the primary fails mid-stream AFTER yielding text,
|
|
154
|
+
# the fallback streams a complete fresh response — a consumer that
|
|
155
|
+
# merely appends deltas would render the primary's partial text
|
|
156
|
+
# followed by the full fallback text. The :fallback_start payload
|
|
157
|
+
# carries partial_output/partial_chars so consumers can deterministically
|
|
158
|
+
# truncate what they already rendered.
|
|
159
|
+
partial_chars = 0
|
|
160
|
+
counting_block = proc do |event|
|
|
161
|
+
partial_chars += event.data.to_s.length if event.text_delta?
|
|
162
|
+
block.call(event)
|
|
163
|
+
end
|
|
164
|
+
|
|
149
165
|
begin
|
|
150
166
|
# Stream primary events directly to the consumer for real-time UX.
|
|
151
167
|
# No buffering — tokens appear immediately as they arrive.
|
|
@@ -153,7 +169,7 @@ module RubyPi
|
|
|
153
169
|
messages: messages,
|
|
154
170
|
tools: tools,
|
|
155
171
|
stream: true,
|
|
156
|
-
&
|
|
172
|
+
&counting_block
|
|
157
173
|
)
|
|
158
174
|
|
|
159
175
|
response
|
|
@@ -164,12 +180,17 @@ module RubyPi
|
|
|
164
180
|
log_fallback(e)
|
|
165
181
|
|
|
166
182
|
# Signal the consumer that the primary failed mid-stream and a
|
|
167
|
-
# fallback provider is taking over. Consumers
|
|
168
|
-
# to clear any partial output from the failed primary
|
|
183
|
+
# fallback provider is taking over. Consumers MUST use this event
|
|
184
|
+
# to clear any partial output from the failed primary:
|
|
185
|
+
# partial_output — true when the primary yielded any text deltas
|
|
186
|
+
# partial_chars — how many characters were yielded (truncate by
|
|
187
|
+
# this amount if appending to a shared buffer)
|
|
169
188
|
block.call(StreamEvent.new(type: :fallback_start, data: {
|
|
170
189
|
failed_provider: @primary.provider_name,
|
|
171
190
|
error: e.message,
|
|
172
|
-
fallback_provider: @fallback.provider_name
|
|
191
|
+
fallback_provider: @fallback.provider_name,
|
|
192
|
+
partial_output: partial_chars.positive?,
|
|
193
|
+
partial_chars: partial_chars
|
|
173
194
|
}))
|
|
174
195
|
|
|
175
196
|
# Stream directly from the fallback to the consumer's block.
|
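On the consumer side, the `partial_chars` payload could be used roughly as follows. This is a sketch under assumptions: `DemoEvent` is a hypothetical stand-in for the gem's `StreamEvent`, exposing only `type`, `data`, and `text_delta?` as used in the hunk above, and the provider names and error text are made up.

```ruby
# Hypothetical minimal event type for illustration only.
DemoEvent = Struct.new(:type, :data) do
  def text_delta?
    type == :text_delta
  end
end

# A shared buffer that already holds earlier output.
rendered = +"Q: hi\nA: "

on_event = proc do |event|
  if event.text_delta?
    rendered << event.data.to_s
  elsif event.type == :fallback_start && event.data[:partial_output]
    # Drop exactly the characters the failed primary contributed.
    count = event.data[:partial_chars]
    rendered.slice!(-count, count)
  end
end

# The primary streams two deltas, then fails mid-stream.
primary_deltas = ["Hel", "lo wo"]
primary_deltas.each { |text| on_event.call(DemoEvent.new(:text_delta, text)) }

on_event.call(DemoEvent.new(:fallback_start, {
  failed_provider: "primary",
  error: "connection reset",
  fallback_provider: "fallback",
  partial_output: true,
  partial_chars: primary_deltas.sum(&:length)
}))

# The fallback streams a complete fresh response.
on_event.call(DemoEvent.new(:text_delta, "Hello world"))
```

Truncating by character count keeps the earlier buffer content intact, which a blanket clear would not.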
data/lib/ruby_pi/llm/gemini.rb
CHANGED
|
@@ -6,6 +6,7 @@
|
|
|
6
6
|
# the Gemini REST API for both synchronous and streaming completions, including
|
|
7
7
|
# tool/function calling support.
|
|
8
8
|
|
|
9
|
+
require "json"
|
|
9
10
|
require "securerandom"
|
|
10
11
|
|
|
11
12
|
module RubyPi
|
|
@@ -305,9 +306,14 @@ module RubyPi
|
|
|
305
306
|
# which may split SSE events mid-line. We accumulate a line buffer and
|
|
306
307
|
# process complete lines incrementally so that deltas reach the caller
|
|
307
308
|
# as soon as each SSE event is fully received.
|
|
308
|
-
|
|
309
|
+
# BINARY buffer: chunks arrive as ASCII-8BIT and may end mid-way
|
|
310
|
+
# through a multi-byte UTF-8 character; appending such a chunk to a
|
|
311
|
+
# UTF-8 buffer holding non-ASCII text raises
|
|
312
|
+
# Encoding::CompatibilityError. Complete lines are re-encoded to
|
|
313
|
+
# UTF-8 (and scrubbed) before parsing.
|
|
314
|
+
sse_buffer = (+"").force_encoding(Encoding::BINARY)
|
|
309
315
|
response_status = nil
|
|
310
|
-
error_body = +""
|
|
316
|
+
error_body = (+"").force_encoding(Encoding::BINARY)
|
|
311
317
|
|
|
312
318
|
response = with_transport_errors do
|
|
313
319
|
conn.post(url) do |req|
|
|
@@ -324,14 +330,17 @@ module RubyPi
|
|
|
324
330
|
# If the HTTP status indicates an error, accumulate the body for
|
|
325
331
|
# the error handler instead of parsing it as SSE events.
|
|
326
332
|
if response_status && response_status >= 400
|
|
327
|
-
error_body << chunk
|
|
333
|
+
error_body << chunk.b
|
|
328
334
|
next
|
|
329
335
|
end
|
|
330
336
|
|
|
331
|
-
sse_buffer << chunk
|
|
332
|
-
# Process all complete lines in the buffer
|
|
337
|
+
sse_buffer << chunk.b
|
|
338
|
+
# Process all complete lines in the buffer. A complete line holds
|
|
339
|
+
# complete UTF-8 sequences (multi-byte characters split across
|
|
340
|
+
# chunks are repaired by the buffering), so re-encode it to UTF-8
|
|
341
|
+
# here; scrub guards against a server sending invalid bytes.
|
|
333
342
|
while (line_end = sse_buffer.index("\n"))
|
|
334
|
-
line = sse_buffer.slice!(0, line_end + 1).strip
|
|
343
|
+
line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
|
|
335
344
|
next if line.empty?
|
|
336
345
|
next unless line.start_with?("data: ")
|
|
337
346
|
|
|
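The buffering scheme in this hunk can be exercised without a network connection. The sketch below feeds the same loop two hand-made chunks that split a two-byte UTF-8 character; the payload and the split point are assumptions chosen for the demonstration.

```ruby
require "json"

# One SSE line whose "é" (bytes C3 A9 in UTF-8) is split across two chunks.
payload = %(data: {"text":"caf\u00E9"}\n)
bytes = payload.b
split_at = bytes.index("\xC3".b) + 1 # cut between the two bytes of "é"
chunks = [bytes[0, split_at], bytes[split_at..]]

# Same buffer setup and line loop as in the hunk above.
sse_buffer = (+"").force_encoding(Encoding::BINARY)
lines = []

chunks.each do |chunk|
  sse_buffer << chunk.b
  while (line_end = sse_buffer.index("\n"))
    line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
    next if line.empty?
    next unless line.start_with?("data: ")

    lines << line
  end
end

parsed = lines.map { |l| JSON.parse(l.delete_prefix("data: ")) }
```

The first chunk contains no newline, so nothing is decoded until the second chunk completes the line; only then is the line tagged as UTF-8, at which point the split character is whole again.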
@@ -375,8 +384,11 @@ module RubyPi
|
|
|
375
384
|
# Parse the actual finish reason from the streaming response
|
|
376
385
|
# instead of hardcoding "stop". Gemini sends finishReason in
|
|
377
386
|
# the candidate object (e.g., "STOP", "MAX_TOKENS", "SAFETY").
|
|
387
|
+
# Coerce via to_s before downcase so a non-String payload can
|
|
388
|
+
# never raise NoMethodError mid-stream (mirrors the &.to_s in
|
|
389
|
+
# the non-streaming parse path).
|
|
378
390
|
if candidate["finishReason"]
|
|
379
|
-
finish_reason = candidate["finishReason"].downcase
|
|
391
|
+
finish_reason = candidate["finishReason"].to_s.downcase
|
|
380
392
|
end
|
|
381
393
|
|
|
382
394
|
# Capture usage metadata if present
|
|
@@ -397,7 +409,7 @@ module RubyPi
|
|
|
397
409
|
# callback. Pass the accumulated error_body so ApiError carries the
|
|
398
410
|
# full server message instead of an empty body.
|
|
399
411
|
unless response.success?
|
|
400
|
-
error_body_str = error_body.empty? ? response.body : error_body
|
|
412
|
+
error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub
|
|
401
413
|
handle_error_response(response, override_body: error_body_str)
|
|
402
414
|
end
|
|
403
415
|
|
|
@@ -450,8 +462,10 @@ module RubyPi
|
|
|
450
462
|
}
|
|
451
463
|
end
|
|
452
464
|
|
|
453
|
-
# Map Gemini finish reason to normalized string
|
|
454
|
-
|
|
465
|
+
# Map Gemini finish reason to normalized string. to_s guards against
|
|
466
|
+
# a non-String payload (mirrors the streaming path); &. keeps a
|
|
467
|
+
# missing finishReason as nil.
|
|
468
|
+
finish_reason = candidate["finishReason"]&.to_s&.downcase
|
|
455
469
|
|
|
456
470
|
Response.new(
|
|
457
471
|
content: content,
|
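The `&.to_s&.downcase` chain in the non-streaming path behaves as follows on a few inputs. `normalize_finish_reason` is a hypothetical helper name used only for this sketch; the candidate hashes are made up.

```ruby
# Hypothetical helper wrapping the expression from the hunk above.
def normalize_finish_reason(candidate)
  candidate["finishReason"]&.to_s&.downcase
end

from_string = normalize_finish_reason({ "finishReason" => "MAX_TOKENS" })
from_integer = normalize_finish_reason({ "finishReason" => 2 })
from_missing = normalize_finish_reason({})
```

A String is downcased, a non-String is stringified first instead of raising `NoMethodError`, and a missing key stays `nil` because `&.` short-circuits.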
data/lib/ruby_pi/llm/openai.rb
CHANGED
|
@@ -6,6 +6,8 @@
|
|
|
6
6
|
# OpenAI Chat Completions API for both synchronous and streaming completions,
|
|
7
7
|
# including function/tool calling support.
|
|
8
8
|
|
|
9
|
+
require "json"
|
|
10
|
+
|
|
9
11
|
module RubyPi
|
|
10
12
|
module LLM
|
|
11
13
|
# OpenAI provider implementation. Communicates with the OpenAI Chat
|
|
@@ -318,9 +320,14 @@ module RubyPi
|
|
|
318
320
|
# which may split SSE events mid-line. We accumulate a line buffer and
|
|
319
321
|
# process complete lines incrementally so that deltas reach the caller
|
|
320
322
|
# as soon as each SSE event is fully received.
|
|
321
|
-
|
|
323
|
+
# BINARY buffer: chunks arrive as ASCII-8BIT and may end mid-way
|
|
324
|
+
# through a multi-byte UTF-8 character; appending such a chunk to a
|
|
325
|
+
# UTF-8 buffer holding non-ASCII text raises
|
|
326
|
+
# Encoding::CompatibilityError. Complete lines are re-encoded to
|
|
327
|
+
# UTF-8 (and scrubbed) before parsing.
|
|
328
|
+
sse_buffer = (+"").force_encoding(Encoding::BINARY)
|
|
322
329
|
response_status = nil
|
|
323
|
-
error_body = +""
|
|
330
|
+
error_body = (+"").force_encoding(Encoding::BINARY)
|
|
324
331
|
|
|
325
332
|
response = with_transport_errors do
|
|
326
333
|
conn.post("/v1/chat/completions") do |req|
|
|
@@ -337,14 +344,17 @@ module RubyPi
|
|
|
337
344
|
# If the HTTP status indicates an error, accumulate the body for
|
|
338
345
|
# the error handler instead of parsing it as SSE events.
|
|
339
346
|
if response_status && response_status >= 400
|
|
340
|
-
error_body << chunk
|
|
347
|
+
error_body << chunk.b
|
|
341
348
|
next
|
|
342
349
|
end
|
|
343
350
|
|
|
344
|
-
sse_buffer << chunk
|
|
345
|
-
# Process all complete lines in the buffer
|
|
351
|
+
sse_buffer << chunk.b
|
|
352
|
+
# Process all complete lines in the buffer. A complete line holds
|
|
353
|
+
# complete UTF-8 sequences (multi-byte characters split across
|
|
354
|
+
# chunks are repaired by the buffering), so re-encode it to UTF-8
|
|
355
|
+
# here; scrub guards against a server sending invalid bytes.
|
|
346
356
|
while (line_end = sse_buffer.index("\n"))
|
|
347
|
-
line = sse_buffer.slice!(0, line_end + 1).strip
|
|
357
|
+
line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
|
|
348
358
|
next if line.empty?
|
|
349
359
|
next unless line.start_with?("data: ")
|
|
350
360
|
|
|
@@ -419,7 +429,7 @@ module RubyPi
|
|
|
419
429
|
# callback. Pass the accumulated error_body so ApiError carries the
|
|
420
430
|
# full server message instead of an empty body.
|
|
421
431
|
unless response.success?
|
|
422
|
-
error_body_str = error_body.empty? ? response.body : error_body
|
|
432
|
+
error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub
|
|
423
433
|
handle_error_response(response, override_body: error_body_str)
|
|
424
434
|
end
|
|
425
435
|
|
data/lib/ruby_pi/tools/definition.rb
CHANGED
|
@@ -37,16 +37,32 @@ module RubyPi
|
|
|
37
37
|
# @return [Hash] A JSON Schema hash describing the tool's parameters.
|
|
38
38
|
attr_reader :parameters
|
|
39
39
|
|
|
40
|
+
# Tool names must satisfy the strictest provider constraint (Anthropic's
|
|
41
|
+
# ^[a-zA-Z0-9_-]{1,64}$). Without this guard, a name like "send.email"
|
|
42
|
+
# registers fine and then 400s on every API request with an opaque
|
|
43
|
+
# server-side validation error that doesn't point back to the tool.
|
|
44
|
+
NAME_FORMAT = /\A[a-zA-Z0-9_-]{1,64}\z/
|
|
45
|
+
|
|
40
46
|
# Creates a new tool definition.
|
|
41
47
|
#
|
|
42
|
-
# @param name [String, Symbol] Unique identifier for the tool.
|
|
48
|
+
# @param name [String, Symbol] Unique identifier for the tool. Must match
|
|
49
|
+
# NAME_FORMAT (letters, digits, underscore, hyphen; max 64 chars).
|
|
43
50
|
# @param description [String] What the tool does (shown to the LLM).
|
|
44
51
|
# @param category [Symbol, nil] Optional grouping category.
|
|
45
52
|
# @param parameters [Hash] JSON Schema hash for the tool's input parameters.
|
|
46
|
-
# @yield [Hash] Block that implements the tool logic. Receives a hash of
|
|
47
|
-
#
|
|
53
|
+
# @yield [Hash] Block that implements the tool logic. Receives a hash of
|
|
54
|
+
# symbol-keyed arguments, or keyword arguments if the block declares
|
|
55
|
+
# keyword parameters (see #call).
|
|
56
|
+
# @raise [ArgumentError] If name is missing or violates NAME_FORMAT,
|
|
57
|
+
# description is missing, or no block given.
|
|
48
58
|
def initialize(name:, description:, category: nil, parameters: {}, &block)
|
|
49
59
|
raise ArgumentError, "Tool name is required" if name.nil? || name.to_s.strip.empty?
|
|
60
|
+
unless name.to_s.match?(NAME_FORMAT)
|
|
61
|
+
raise ArgumentError,
|
|
62
|
+
"Tool name #{name.to_s.inspect} is invalid — provider APIs require " \
|
|
63
|
+
"names matching #{NAME_FORMAT.inspect} (letters, digits, underscore, " \
|
|
64
|
+
"hyphen; 1-64 characters)"
|
|
65
|
+
end
|
|
50
66
|
raise ArgumentError, "Tool description is required" if description.nil? || description.strip.empty?
|
|
51
67
|
raise ArgumentError, "Tool implementation block is required" unless block_given?
|
|
52
68
|
|
|
@@ -55,14 +71,33 @@ module RubyPi
|
|
|
55
71
|
@category = category&.to_sym
|
|
56
72
|
@parameters = parameters
|
|
57
73
|
@implementation = block
|
|
74
|
+
# On Ruby 3.x a positional Hash is never auto-splatted to keywords, so
|
|
75
|
+
# a block written `{ |content:, platform:| ... }` — the natural style
|
|
76
|
+
# given named schema parameters — would fail every call with
|
|
77
|
+
# "missing keyword". Detect keyword parameters once here and splat in
|
|
78
|
+
# #call accordingly.
|
|
79
|
+
@expects_keywords = block.parameters.any? { |type, _| %i[key keyreq keyrest].include?(type) }
|
|
58
80
|
end
|
|
59
81
|
|
|
60
82
|
# Invokes the tool with the given arguments.
|
|
61
83
|
#
|
|
84
|
+
# Blocks may be written either style:
|
|
85
|
+
# { |args| args[:content] } # single positional Hash
|
|
86
|
+
# { |content:, platform: "x"| ... } # keyword parameters
|
|
87
|
+
#
|
|
88
|
+
# When the block declares keyword parameters, the arguments hash is
|
|
89
|
+
# splatted to keywords. Note that a keyword-style block without **rest
|
|
90
|
+
# raises ArgumentError on unexpected keys — strict by design, since the
|
|
91
|
+
# keys come from the LLM.
|
|
92
|
+
#
|
|
62
93
|
# @param args [Hash] The arguments to pass to the tool implementation.
|
|
63
94
|
# @return [Object] Whatever the implementation block returns.
|
|
64
95
|
def call(args = {})
|
|
65
|
-
@
|
|
96
|
+
if @expects_keywords
|
|
97
|
+
@implementation.call(**args)
|
|
98
|
+
else
|
|
99
|
+
@implementation.call(args)
|
|
100
|
+
end
|
|
66
101
|
end
|
|
67
102
|
|
|
68
103
|
# Converts this tool definition to Google Gemini function declaration format.
|
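The name check and the keyword detection from the two hunks above can be tried together in a reduced form. `DemoTool` is a hypothetical class for illustration, not the gem's definition class; it keeps only the name validation, the `block.parameters` inspection, and the dispatch in `call`.

```ruby
# Hypothetical reduced tool class; mirrors only the pieces shown in the diff.
class DemoTool
  NAME_FORMAT = /\A[a-zA-Z0-9_-]{1,64}\z/

  def initialize(name, &block)
    raise ArgumentError, "invalid tool name #{name.to_s.inspect}" unless name.to_s.match?(NAME_FORMAT)
    raise ArgumentError, "implementation block is required" unless block

    @implementation = block
    # Detect keyword parameters once, as the constructor in the diff does.
    @expects_keywords = block.parameters.any? { |type, _| %i[key keyreq keyrest].include?(type) }
  end

  def call(args = {})
    @expects_keywords ? @implementation.call(**args) : @implementation.call(args)
  end
end

positional = DemoTool.new("echo") { |args| args[:content] }
keyword = DemoTool.new("post") { |content:, platform: "x"| "#{platform}: #{content}" }

positional_result = positional.call(content: "hi")
keyword_result = keyword.call(content: "hi")

bad_name_error = begin
  DemoTool.new("send.email") { |args| args }
  nil
rescue ArgumentError => e
  e
end

extra_key_error = begin
  keyword.call(content: "hi", extra: 1)
  nil
rescue ArgumentError => e
  e
end
```

Both block styles work through the same `call` entry point, while a dotted name is rejected at construction time and an unexpected key raises for a keyword-style block without `**rest`.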
data/lib/ruby_pi/tools/executor.rb
CHANGED
|
@@ -115,7 +115,12 @@ module RubyPi
|
|
|
115
115
|
end
|
|
116
116
|
|
|
117
117
|
# Collect results, respecting the configured timeout for each future.
|
|
118
|
-
|
|
118
|
+
# Zip each future with its originating call so failure Results carry
|
|
119
|
+
# the real tool name — with several tools timing out in parallel,
|
|
120
|
+
# "unknown" Results are indistinguishable in logs and extension events.
|
|
121
|
+
calls.zip(futures).map do |call, future|
|
|
122
|
+
tool_name = (call[:name] || call["name"]).to_s
|
|
123
|
+
|
|
119
124
|
# Issue #10: Wait for the future to complete, then check its state
|
|
120
125
|
# explicitly. Future#value returns nil both on timeout AND when the
|
|
121
126
|
# block legitimately returned nil, so we cannot use || to distinguish.
|
|
@@ -128,13 +133,16 @@ module RubyPi
|
|
|
128
133
|
else
|
|
129
134
|
# Future was rejected (raised an exception within the block).
|
|
130
135
|
# This shouldn't normally happen since execute_single rescues
|
|
131
|
-
# internally, but handle it defensively.
|
|
136
|
+
# internally, but handle it defensively. The actual run time is
|
|
137
|
+
# unknown here (the future failed at some point before the wait
|
|
138
|
+
# elapsed), so report 0.0 rather than a misleading full-timeout
|
|
139
|
+
# duration for what may have been an instant failure.
|
|
132
140
|
error = future.reason
|
|
133
141
|
Result.new(
|
|
134
|
-
name:
|
|
142
|
+
name: tool_name,
|
|
135
143
|
success: false,
|
|
136
144
|
error: "#{error.class}: #{error.message}",
|
|
137
|
-
duration_ms:
|
|
145
|
+
duration_ms: 0.0
|
|
138
146
|
)
|
|
139
147
|
end
|
|
140
148
|
else
|
|
@@ -147,9 +155,9 @@ module RubyPi
|
|
|
147
155
|
future.cancel if future.respond_to?(:cancel)
|
|
148
156
|
|
|
149
157
|
Result.new(
|
|
150
|
-
name:
|
|
158
|
+
name: tool_name,
|
|
151
159
|
success: false,
|
|
152
|
-
error: "Tool
|
|
160
|
+
error: "Tool '#{tool_name}' timed out after #{@timeout}s",
|
|
153
161
|
duration_ms: @timeout * 1000.0
|
|
154
162
|
)
|
|
155
163
|
end
|
data/lib/ruby_pi/tools/schema.rb
CHANGED
|
@@ -13,6 +13,16 @@
|
|
|
13
13
|
# flag consumed by `.object` to populate the top-level "required" array.
|
|
14
14
|
# It is stripped from the property's own schema hash before inclusion.
|
|
15
15
|
#
|
|
16
|
+
# IMPORTANT: Schemas are LLM-facing hints, NOT runtime input validation.
|
|
17
|
+
# Nothing in the execution pipeline validates the model's arguments against
|
|
18
|
+
# the schema before invoking the tool block: `required`, `enum`, `minimum`,
|
|
19
|
+
# and type declarations constrain what the model is *asked* to produce, but a
|
|
20
|
+
# misbehaving model can still omit required fields, send extra keys, or pass
|
|
21
|
+
# a String where an Integer is declared — no coercion is performed. Tool
|
|
22
|
+
# blocks should treat their arguments as untrusted input and validate or
|
|
23
|
+
# coerce what they depend on. (This is deliberate, per the anti-framework
|
|
24
|
+
# philosophy: validation policy belongs to the tool, not the harness.)
|
|
25
|
+
#
|
|
16
26
|
# Usage:
|
|
17
27
|
# schema = RubyPi::Schema.object(
|
|
18
28
|
# name: RubyPi::Schema.string("User's name", required: true),
|
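Because the schema is not enforced at runtime, a tool block may want to check its own arguments. The helper below is a hypothetical sketch of that idea; the argument names (`content`, `limit`), the default, and the allowed range are assumptions, not part of ruby-pi.

```ruby
# Hypothetical validation helper a tool block could call on its arguments.
def validate_post_args(args)
  content = args[:content]
  unless content.is_a?(String) && !content.strip.empty?
    raise ArgumentError, "content must be a non-empty String"
  end

  # Coerce "5" to 5; anything that is not integer-like becomes nil.
  limit = Integer(args.fetch(:limit, 10), exception: false)
  unless limit && (1..100).cover?(limit)
    raise ArgumentError, "limit must be an integer between 1 and 100"
  end

  { content: content, limit: limit }
end

coerced = validate_post_args(content: "hi", limit: "5")
defaulted = validate_post_args(content: "hi")

missing_content_error = begin
  validate_post_args(limit: 5)
  nil
rescue ArgumentError => e
  e
end

bad_limit_error = begin
  validate_post_args(content: "hi", limit: "abc")
  nil
rescue ArgumentError => e
  e
end
```

A String sent where an Integer was declared is coerced explicitly, and a missing required field fails with a message that points at the tool rather than surfacing later as a `nil` error.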
data/lib/ruby_pi/version.rb
CHANGED
data/lib/ruby_pi.rb
CHANGED
|
@@ -82,6 +82,13 @@ module RubyPi
|
|
|
82
82
|
end
|
|
83
83
|
end
|
|
84
84
|
|
|
85
|
+
# Eagerly initialize the global configuration at load time. The lazy
|
|
86
|
+
# `@configuration ||= ...` in .configuration is not synchronized; two
|
|
87
|
+
# threads hitting it concurrently on first access could each construct a
|
|
88
|
+
# Configuration, with one silently discarded. Initializing here (requires
|
|
89
|
+
# run single-threaded) removes the race without adding a mutex to every read.
|
|
90
|
+
@configuration = Configuration.new
|
|
91
|
+
|
|
85
92
|
# Namespace for large language model providers and related abstractions.
|
|
86
93
|
module LLM
|
|
87
94
|
class << self
|
metadata
CHANGED
|
@@ -1,13 +1,13 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: ruby-pi
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.1.
|
|
4
|
+
version: 0.1.8
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- RubyPi Contributors
|
|
8
8
|
bindir: bin
|
|
9
9
|
cert_chain: []
|
|
10
|
-
date:
|
|
10
|
+
date: 1980-01-02 00:00:00.000000000 Z
|
|
11
11
|
dependencies:
|
|
12
12
|
- !ruby/object:Gem::Dependency
|
|
13
13
|
name: faraday
|
|
@@ -51,6 +51,20 @@ dependencies:
|
|
|
51
51
|
- - "~>"
|
|
52
52
|
- !ruby/object:Gem::Version
|
|
53
53
|
version: '1.2'
|
|
54
|
+
- !ruby/object:Gem::Dependency
|
|
55
|
+
name: json
|
|
56
|
+
requirement: !ruby/object:Gem::Requirement
|
|
57
|
+
requirements:
|
|
58
|
+
- - ">="
|
|
59
|
+
- !ruby/object:Gem::Version
|
|
60
|
+
version: '2.0'
|
|
61
|
+
type: :runtime
|
|
62
|
+
prerelease: false
|
|
63
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
64
|
+
requirements:
|
|
65
|
+
- - ">="
|
|
66
|
+
- !ruby/object:Gem::Version
|
|
67
|
+
version: '2.0'
|
|
54
68
|
- !ruby/object:Gem::Dependency
|
|
55
69
|
name: rspec
|
|
56
70
|
requirement: !ruby/object:Gem::Requirement
|
|
@@ -157,7 +171,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
|
157
171
|
- !ruby/object:Gem::Version
|
|
158
172
|
version: '0'
|
|
159
173
|
requirements: []
|
|
160
|
-
rubygems_version: 3.6.
|
|
174
|
+
rubygems_version: 3.6.9
|
|
161
175
|
specification_version: 4
|
|
162
176
|
summary: AI agent harness for Ruby — build LLM agents with tool calling, streaming,
|
|
163
177
|
and a unified interface to OpenAI, Anthropic Claude, and Google Gemini.
|