ruby-pi 0.1.5 → 0.1.8
- checksums.yaml +4 -4
- data/CHANGELOG.md +51 -0
- data/lib/ruby_pi/agent/core.rb +6 -0
- data/lib/ruby_pi/agent/loop.rb +40 -25
- data/lib/ruby_pi/agent/state.rb +6 -0
- data/lib/ruby_pi/configuration.rb +50 -5
- data/lib/ruby_pi/context/compaction.rb +61 -24
- data/lib/ruby_pi/llm/anthropic.rb +38 -17
- data/lib/ruby_pi/llm/base_provider.rb +72 -1
- data/lib/ruby_pi/llm/fallback.rb +30 -9
- data/lib/ruby_pi/llm/gemini.rb +136 -37
- data/lib/ruby_pi/llm/openai.rb +53 -19
- data/lib/ruby_pi/llm/tool_call.rb +2 -0
- data/lib/ruby_pi/tools/definition.rb +39 -4
- data/lib/ruby_pi/tools/executor.rb +24 -7
- data/lib/ruby_pi/tools/schema.rb +10 -0
- data/lib/ruby_pi/version.rb +1 -1
- data/lib/ruby_pi.rb +7 -0
- metadata +15 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: b0054adb6a0863a8f296917be736df0ebfd789aa7205589b82689199d4bf4c06
|
|
4
|
+
data.tar.gz: fc79dcc61dbefce874e609807d989cf2293b0ecb45a6aa036069b11038ac5c9a
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: c130ada9b7ed93f5c9a0d16596c1176fec258204be26af15c61db3c18effee94bc7a8a1783620397780b0e3501e660b4e8ff48d8463e4089067edfcbf3bf9b60
|
|
7
|
+
data.tar.gz: dc179fe40cb063c4321a1c7a1aff5abb7b441d5fd87ced19908f9875c1f5b26bf2e17555b44658f603b32627577fe99feb8f34d061ed7e53eaba3c28cecd8bbb
|
data/CHANGELOG.md
CHANGED
|
@@ -5,6 +5,57 @@ All notable changes to this project will be documented in this file.
|
|
|
5
5
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
6
6
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
7
|
|
|
8
|
+
## [0.1.8] - 2026-06-09
|
|
9
|
+
|
|
10
|
+
### Fixed (adversarial review round 6)
|
|
11
|
+
|
|
12
|
+
- **`Retry-After` header was parsed but never honored (High)**: On a 429, `handle_error_response` stored the server's `Retry-After` on `RateLimitError#retry_after`, but the retry loop in `BaseProvider#complete` always slept the local exponential backoff (capped at `retry_max_delay`) — hammering a server that asked for a longer cooldown until the retry budget burned out. The retry delay now prefers a positive `retry_after` (capped at `RETRY_AFTER_CEILING`, 60s); HTTP-date values (which parse to `0.0`) and absent headers fall through to the computed backoff
|
|
13
|
+
- **Parallel executor timeout/rejection Results reported `name: "unknown"` (High)**: `execute_parallel` hardcoded `"unknown"` in the timeout and rejected-future branches, so with several tools timing out concurrently, logs and `:tool_execution_end` subscribers could not tell which tool hung. Futures are now zipped with their originating calls and failure Results carry the real tool name; the timeout message matches sequential mode (`Tool 'x' timed out after Ns`). The rejected-future branch also no longer reports a misleading full-timeout `duration_ms` for what may have been an instant failure (now `0.0`)
|
|
14
|
+
- **Keyword-parameter tool blocks failed on every call (High)**: `Definition#call` passed a single positional Hash, so a block written `{ |content:, platform:| ... }` — the natural style given named schema parameters — raised `ArgumentError: missing keyword` on every invocation (surfacing as a confusing failed Result). `Definition` now detects keyword parameters at construction and splats the arguments hash to keywords; positional-Hash blocks are unchanged. Keyword blocks without `**rest` raise on unexpected keys — strict by design, since the keys come from the LLM
|
|
15
|
+
- **`:compaction` event was never emitted in production (Medium)**: `Compaction#emitter` defaults to nil and nothing ever assigned it, so the documented `agent.on(:compaction)` subscription silently never fired — the only place the emitter was set was the spec itself. `Loop#initialize` now wires its emitter into the compaction strategy (an explicitly preassigned emitter is left untouched)
|
|
16
|
+
- **Streaming chunks were never normalized to UTF-8 (Medium)**: Faraday delivers `on_data` chunks as ASCII-8BIT; appending a chunk to a UTF-8 SSE buffer already holding non-ASCII text raises `Encoding::CompatibilityError`, and yielded deltas could carry binary encoding into consumers' UTF-8 buffers. All three providers now buffer in BINARY and re-encode each complete SSE line to UTF-8 (with `scrub` guarding invalid bytes) before parsing, so `:text_delta` events are always valid UTF-8 — including multi-byte characters split across network chunks
|
|
17
|
+
- **Streaming fallback gave consumers no way to truncate partial primary output (Medium)**: If the primary streamed text and then died mid-stream, the fallback streamed a complete fresh response — a delta-appending consumer rendered `"<partial primary><full fallback>"` with no signal of how much to discard. The `:fallback_start` payload now includes `partial_output` (Boolean) and `partial_chars` (characters already yielded), so consumers can deterministically reset
|
|
18
|
+
- **Tool names were not validated against provider constraints (Medium)**: A tool named `send.email` registered fine and then 400'd on every Anthropic request with an opaque server error. `Definition` now validates names against `/\A[a-zA-Z0-9_-]{1,64}\z/` (the strictest provider constraint) and raises `ArgumentError` at definition time with a pointed message
|
|
19
|
+
- **`json` was used everywhere but never declared or required (Medium)**: `JSON.parse`/`JSON.generate` are called throughout the providers and agent loop, but the gem relied on Faraday's transitive `json` dependency and the entry point's single `require "json"` — loading `agent/loop.rb` in isolation raised `NameError`, contradicting the composability principle. The gemspec now declares `json >= 2.0` and every file referencing `JSON` requires it directly (pinned by a source-scan spec)
|
|
20
|
+
- **Configuration accepted negative retry/timeout values (Low)**: `max_retries = -1` silently disabled retries and a negative delay raised deep inside the retry loop's `sleep`. The numeric settings now have validated writers that raise `ArgumentError` at assignment time
|
|
21
|
+
- **Global configuration first-access race (Low)**: `@configuration ||= Configuration.new` was unsynchronized; two threads racing the first call could each construct a Configuration with one silently discarded. The configuration is now eagerly initialized at require time
|
|
22
|
+
- **`continue()` Result accounting documented (Docs)**: Each `run`/`continue` builds a fresh Loop, so the returned Result's `usage`/`tool_calls_made`/`turns` cover only that invocation while `messages` is cumulative — an undocumented asymmetry, now documented on `Core#continue`
|
|
23
|
+
- **Schema DSL documented as LLM-facing hints, not validation (Docs)**: Nothing validates model-supplied arguments against `tool.parameters` before invoking the block — `required`/`enum`/`minimum` constrain what the model is asked to produce, with no runtime enforcement or type coercion. This is deliberate (anti-framework), but the schema header now says so loudly and directs tool blocks to treat arguments as untrusted input
|
|
24
|
+
- **`State#add_message` unbounded growth documented (Docs)**: Long-lived agents calling `continue()` repeatedly accumulate messages linearly without compaction configured; documented on the method
|
|
25
|
+
- **CLAUDE.md module map corrected (Docs)**: The map referenced a nonexistent `agent/agent.rb`, omitted `core.rb`/`loop.rb`/`state.rb`/`events.rb`, hardcoded version `0.1.0`, and the extension example used the one-arg `|event|` block signature instead of the actual `|data, agent|`. All corrected
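To make the `Retry-After` entry above concrete, here is a minimal sketch of the delay selection it describes. The method name `retry_delay`, its keyword arguments, and the exact backoff formula (`base_delay * 2**attempt`) are assumptions for illustration; only the 60s ceiling, the "positive value wins" rule, and the fall-through for `0.0`/absent headers come from the changelog.

```ruby
# Ceiling named in the changelog entry (60 seconds).
RETRY_AFTER_CEILING = 60.0

# Hypothetical helper: picks how long to sleep before the next retry.
#
# attempt     - zero-based retry attempt number
# retry_after - seconds parsed from the Retry-After header; nil when the
#               header is absent, 0.0 when it held an HTTP-date
# base_delay  - base for the exponential backoff (assumed formula)
# max_delay   - cap applied to the computed backoff
def retry_delay(attempt, retry_after:, base_delay: 1.0, max_delay: 30.0)
  if retry_after && retry_after.positive?
    # Honor the server's cooldown, but never wait longer than the ceiling.
    [retry_after, RETRY_AFTER_CEILING].min
  else
    # Absent header or unparseable HTTP-date: fall back to local backoff.
    [base_delay * (2**attempt), max_delay].min
  end
end

puts retry_delay(0, retry_after: 45.0)   # 45.0 (server value honored)
puts retry_delay(0, retry_after: 300.0)  # 60.0 (capped at the ceiling)
puts retry_delay(2, retry_after: 0.0)    # 4.0  (HTTP-date fell through)
puts retry_delay(2, retry_after: nil)    # 4.0  (no header)
```

Capping the server-supplied value keeps a misbehaving or hostile endpoint from stalling the caller indefinitely, which appears to be the motivation for the ceiling.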
|
|
26
|
+
|
|
27
|
+
### Release-history note
|
|
28
|
+
|
|
29
|
+
- **`[0.1.4]` below was never actually released**: `lib/ruby_pi/version.rb` went from `0.1.3` directly to `0.1.5` — the round-2 fixes documented under 0.1.4 shipped without a version bump and were first published as part of 0.1.5. There is intentionally no `v0.1.4` git tag or gem. (Discovered during round 6; the entry is kept for historical accuracy of *what* changed.)
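Returning to the streaming UTF-8 fix listed under 0.1.8: the approach of buffering in BINARY and re-encoding each complete line can be sketched as below. `SseLineBuffer` and `push` are hypothetical names, not the gem's API; the sketch only shows why a multi-byte character split across two chunks survives intact.

```ruby
# Hypothetical helper: accumulates raw network chunks as bytes and returns
# complete lines as valid UTF-8 strings.
class SseLineBuffer
  def initialize
    @buffer = String.new(encoding: Encoding::BINARY)
  end

  # Appends a raw chunk; returns every complete line received so far.
  def push(chunk)
    @buffer << chunk.b # byte-wise append never raises CompatibilityError
    lines = []
    while (idx = @buffer.index("\n".b))
      line = @buffer.slice!(0..idx)
      # Re-encode only complete lines; scrub replaces any invalid bytes.
      lines << line.chomp.force_encoding(Encoding::UTF_8).scrub
    end
    lines
  end
end

raw = "data: café\n".b        # "é" is the two bytes C3 A9
first = raw.byteslice(0, 10)  # ends after C3, mid-character
second = raw.byteslice(10, 2) # A9 followed by the newline

buf = SseLineBuffer.new
p buf.push(first)  # [] (no complete line yet)
p buf.push(second) # ["data: café"]

# The failure mode the changelog describes: a UTF-8 buffer holding
# non-ASCII text cannot take a binary chunk with high bytes.
begin
  (+"é") << "\xC3".b
rescue Encoding::CompatibilityError => e
  puts "raised: #{e.class}"
end
```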
|
|
30
|
+
|
|
31
|
+
## [0.1.7] - 2026-05-28
|
|
32
|
+
|
|
33
|
+
### Fixed (adversarial review round 5)
|
|
34
|
+
|
|
35
|
+
- **Compaction produced an Anthropic-invalid leading `:assistant` message (Critical)**: The 0.1.6 orphan-`:tool` strip fixed tool-result splitting but left the summary-role logic (`first_preserved == :assistant ? :user : :assistant`) intact. Whenever the first preserved message was `:user` (multi-turn reuse) or the preserved window emptied out (all tool results), the summary became an `:assistant` message at the head of the conversation — which Anthropic rejects with HTTP 400 "first message must use the 'user' role". The summary is now **always** a `:user` message (valid as the first message and never overwriting the system prompt). When the first preserved message is itself `:user`, the summary is merged into it to avoid consecutive same-role messages; an empty preserved window yields a lone `:user` summary. Extracted into `Compaction#build_compacted_history`
|
|
36
|
+
- **Compaction dead "mirror case" branch removed (Minor)**: The 0.1.6 `if droppable.last … && preserved.first[:role] == :tool` block was unreachable — the preceding `while` loop guarantees `preserved.first` is never `:tool`. Removed it (the originating assistant message is already in droppable alongside its now-moved tool results, so the pair is never split), eliminating misleading dead code
|
|
37
|
+
- **Deterministic `ProviderError` was retried with backoff (Minor)**: 0.1.6 added `RubyPi::ProviderError` to the retryable set in `BaseProvider#complete`, but provider errors are overwhelmingly deterministic request-construction failures (missing `tool_call_id`, invalid tool-argument JSON) raised before any HTTP call — retrying only burned the backoff schedule before re-raising the identical error. `ProviderError` is no longer retried. Fallback failover is unaffected (it rescues the `RubyPi::Error` superclass)
|
|
38
|
+
- **Lifecycle hooks saw string-keyed tool arguments while events saw symbols (Minor)**: `before_tool_call`/`after_tool_call` received the raw `ToolCall` (string-keyed `arguments`) while the `:tool_execution_start` event and `tool_calls_made` carried symbol keys — so a hook and an event subscriber disagreed on the key type for the same call. `Loop#act` now rebuilds each `ToolCall` with symbol-keyed arguments up front, so hooks, events, `tool_calls_made`, and the tool block all observe the identical shape
|
|
39
|
+
- **Anthropic streaming `finish_reason` could be clobbered to nil (Minor)**: A trailing `message_delta` event without a `stop_reason` overwrote the previously captured value, yielding a `Response` with no `finish_reason`. The assignment is now guarded (`finish_reason = delta["stop_reason"] if delta["stop_reason"]`), matching the OpenAI/Gemini guards
|
|
40
|
+
- **Gemini `finishReason` assumed a String (Minor)**: `finishReason.downcase` would raise `NoMethodError` on a non-String payload mid-stream. Both the streaming and standard paths now coerce via `to_s` before `downcase`, and remain consistent with each other
|
|
41
|
+
- **Dead streamed-content accumulator removed (Cleanup)**: `Loop#think` accumulated `streamed_content` that was never read (the recorded assistant message uses `Response#content`); the `.clear` on `:fallback_start` was a no-op and its comment was inaccurate. Removed the local; the `:provider_fallback` event still fires
|
|
42
|
+
- **`Fallback` class docstring corrected (Docs)**: The class-level docstring still described the removed happy-path buffering ("the Fallback now buffers deltas… buffered deltas are discarded"), contradicting the real-time direct-streaming implementation. Updated to describe direct streaming plus the `:fallback_start` signal
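The two `finish_reason` entries above come down to small guards. The sketch below is a simplified illustration: the method names are hypothetical, the delta hashes stand in for parsed `message_delta` payloads, and only the guarded assignment and the `to_s` coercion are taken from the changelog text.

```ruby
# Returns the finish reason after replaying a list of delta payloads.
# With guarded: false a trailing delta without "stop_reason" clobbers the
# captured value; with guarded: true it is left alone.
def final_stop_reason(deltas, guarded:)
  finish_reason = nil
  deltas.each do |delta|
    if guarded
      finish_reason = delta["stop_reason"] if delta["stop_reason"]
    else
      finish_reason = delta["stop_reason"]
    end
  end
  finish_reason
end

# Coerce before downcasing so a nil or non-String payload cannot raise.
def normalize_finish_reason(raw)
  raw.to_s.downcase
end

deltas = [{ "stop_reason" => "end_turn" }, {}]
p final_stop_reason(deltas, guarded: false) # nil
p final_stop_reason(deltas, guarded: true)  # "end_turn"
p normalize_finish_reason("STOP")           # "stop"
p normalize_finish_reason(nil)              # ""
```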
|
|
43
|
+
|
|
44
|
+
### Investigated, no change
|
|
45
|
+
|
|
46
|
+
- **Streaming HTTP error bodies via `env.status`**: A prior review raised that streaming error responses might lose their body if Faraday's `on_data` callback received a nil `env.status`. Verified against the actual stack (faraday 2.14.1 / faraday-net_http 3.3.0): the net_http adapter calls `save_http_response` (which sets `env.status`) before `response.read_body` streams chunks, and `Env#stream_response` passes that same populated `env` to the user's `on_data` proc. `env.status` is therefore reliably available before the first chunk, so the existing `error_body` recovery works. No fix needed
|
|
47
|
+
|
|
48
|
+
## [0.1.6] - 2026-05-01
|
|
49
|
+
|
|
50
|
+
### Fixed (adversarial review round 4)
|
|
51
|
+
|
|
52
|
+
- **Faraday transport errors leaked untyped, bypassed retry (Critical)**: `BaseProvider#complete` rescued only `RubyPi::*` errors, but providers never wrapped Faraday network exceptions. A `Faraday::TimeoutError`, `Faraday::ConnectionFailed`, or `Faraday::SSLError` propagated as the raw Faraday class — breaking the documented error hierarchy and skipping the retry loop entirely (the exact case retries exist for). Added `BaseProvider#with_transport_errors` which translates `Faraday::TimeoutError` → `RubyPi::TimeoutError` and `Faraday::ConnectionFailed`/`SSLError`/other `Faraday::Error` → `RubyPi::ApiError`. Wrapped every `conn.post` call in all three providers (standard and streaming paths). `RubyPi::ProviderError` is now also retryable
|
|
53
|
+
- **Gemini multi-turn tool use was broken (Critical)**: `Gemini#format_message` rendered assistant messages as text-only and silently dropped the `:tool_calls` field set by the agent loop. The next turn's `functionResponse` had no preceding `functionCall` to bind to, so Gemini rejected any conversation that included a tool call followed by a tool result. Assistant messages now emit one `functionCall` part per tool call (mirroring Anthropic's `tool_use` and OpenAI's `tool_calls` behavior). Empty text parts are also no longer emitted on tool-only assistant turns
|
|
54
|
+
- **Compaction split tool_use/tool_result pairs (Critical)**: When `preserve_last_n` cut between an assistant `tool_calls` message (in droppable) and its matching `:tool` result (in preserved), Anthropic and OpenAI rejected the conversation with "tool_result without preceding tool_use". Compaction now strips orphan `:tool` messages from the head of preserved (moves them into droppable so they're summarized away). Mirror case where preserved starts with a tool result whose assistant is the last droppable message also handled
|
|
55
|
+
- **`Tools::Executor` swallowed non-StandardError exceptions as nil success (Major)**: The worker thread rescued only `StandardError`. A tool block raising `Interrupt`, `SystemExit`, or any other `Exception` subclass left both `value` and `error` nil; the join then reported a *successful* `nil` result. Now rescues `Exception`, captures it as a failed `Result`. Worker thread also sets `report_on_exception = false` to avoid stderr spam
|
|
56
|
+
- **Gemini tool_call IDs collided across turns (Major)**: IDs were generated as `"gemini_#{accumulated_tool_calls.length}"` — every response restarted numbering at 0, so a multi-turn conversation produced multiple tool calls all named `"gemini_0"`. Any caller using ID as a hash key (observability, result correlation) saw collisions. IDs now use `SecureRandom.hex(8)` for global uniqueness across both standard and streaming responses
|
|
57
|
+
- **OpenAI passed malformed tool_call.arguments JSON verbatim (Minor)**: A non-JSON string in `tool_call.arguments` on an assistant message was forwarded unchanged to OpenAI, producing an opaque HTTP 400. Now validated up-front with `JSON.parse`; malformed input raises a typed `RubyPi::ProviderError` with the tool name and parse error before sending the request, matching Anthropic's input validation
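The up-front argument validation described in the last entry can be sketched as follows. `ProviderError` here is a local stand-in for `RubyPi::ProviderError`, and `validate_tool_arguments!` plus its message wording are hypothetical; the sketch only shows the pattern of parsing first and converting the parse failure into a typed error.

```ruby
require "json"

# Local stand-in so the sketch runs without the gem installed.
class ProviderError < StandardError; end

# Returns `raw` unchanged when it is valid JSON; otherwise raises a typed
# error naming the tool, before any HTTP request would be made.
def validate_tool_arguments!(tool_name, raw)
  JSON.parse(raw)
  raw
rescue JSON::ParserError => e
  raise ProviderError, "Tool '#{tool_name}' has malformed arguments JSON: #{e.message}"
end

puts validate_tool_arguments!("get_weather", '{"city":"Paris"}')

begin
  validate_tool_arguments!("get_weather", "{oops")
rescue ProviderError => e
  puts "rejected: #{e.class}"
end
```

Failing locally gives the caller the tool name and the parser's message, which an opaque HTTP 400 from the provider would not.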
|
|
58
|
+
|
|
8
59
|
## [0.1.5] - 2026-04-30
|
|
9
60
|
|
|
10
61
|
### Fixed (adversarial review round 3)
|
data/lib/ruby_pi/agent/core.rb
CHANGED
|
@@ -140,12 +140,18 @@ module RubyPi
|
|
|
140
140
|
# the existing conversation history and appends the new prompt before
|
|
141
141
|
# resuming the loop.
|
|
142
142
|
#
|
|
143
|
+
# NOTE on Result accounting: each run/continue builds a fresh Loop, so
|
|
144
|
+
# the returned Result's `usage`, `tool_calls_made`, and `turns` cover
|
|
145
|
+
# ONLY this invocation — while `messages` is cumulative across the whole
|
|
146
|
+
# conversation. Sum the per-call Results if you need session totals.
|
|
147
|
+
#
|
|
143
148
|
# Issue #16: Uses the encapsulated reset_iteration! method instead of
|
|
144
149
|
# the old approach that bypassed encapsulation
|
|
145
150
|
# and was fragile.
|
|
146
151
|
#
|
|
147
152
|
# @param prompt [String] the follow-up user message
|
|
148
153
|
# @return [RubyPi::Agent::Result] the outcome of the continued run
|
|
154
|
+
# (usage/tool_calls_made/turns are per-invocation; messages cumulative)
|
|
149
155
|
def continue(prompt)
|
|
150
156
|
@state.reset_iteration!
|
|
151
157
|
@state.add_message(role: :user, content: prompt)
|
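The new comment on `continue` suggests summing per-call Results to get session totals. A minimal sketch of that bookkeeping follows. `Result` below is a stand-in Struct, not the gem's class, and the shape of `usage` (a Hash of integer token counts) is an assumption; `session_totals` is a hypothetical helper.

```ruby
# Stand-in for RubyPi::Agent::Result. Field names follow the comment in the
# diff; the Hash-of-counts shape of `usage` is assumed for illustration.
Result = Struct.new(:usage, :turns, :tool_calls_made, :messages, keyword_init: true)

# Sums the per-invocation counters and takes `messages` from the most
# recent Result, since that field is already cumulative.
def session_totals(results)
  {
    turns: results.sum(&:turns),
    tool_calls_made: results.sum { |r| r.tool_calls_made.size },
    usage: results.map(&:usage).reduce({}) do |acc, usage|
      acc.merge(usage) { |_key, a, b| a + b }
    end,
    messages: results.last&.messages
  }
end

first = Result.new(
  usage: { input_tokens: 100, output_tokens: 20 },
  turns: 2,
  tool_calls_made: [{ tool_name: "search" }],
  messages: %w[m1 m2 m3]
)
second = Result.new(
  usage: { input_tokens: 150, output_tokens: 30 },
  turns: 1,
  tool_calls_made: [],
  messages: %w[m1 m2 m3 m4 m5]
)

p session_totals([first, second])
```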
data/lib/ruby_pi/agent/loop.rb
CHANGED
|
@@ -10,6 +10,8 @@
|
|
|
10
10
|
# is reached. It handles streaming, lifecycle events, compaction, and all
|
|
11
11
|
# pre/post tool call hooks.
|
|
12
12
|
|
|
13
|
+
require "json"
|
|
14
|
+
|
|
13
15
|
module RubyPi
|
|
14
16
|
module Agent
|
|
15
17
|
# Executes the think-act-observe cycle against a given State, emitting
|
|
@@ -55,6 +57,14 @@ module RubyPi
|
|
|
55
57
|
@state = state
|
|
56
58
|
@emitter = emitter
|
|
57
59
|
@compaction = compaction
|
|
60
|
+
# Wire the loop's emitter into the compaction strategy so the
|
|
61
|
+
# documented :compaction event actually reaches agent subscribers.
|
|
62
|
+
# Compaction#emitter defaults to nil and nothing else ever sets it —
|
|
63
|
+
# without this, `agent.on(:compaction)` never fires. An emitter that
|
|
64
|
+
# was already assigned explicitly is left untouched.
|
|
65
|
+
if @compaction.respond_to?(:emitter=) && @compaction.respond_to?(:emitter) && @compaction.emitter.nil?
|
|
66
|
+
@compaction.emitter = emitter
|
|
67
|
+
end
|
|
58
68
|
@execution_mode = execution_mode
|
|
59
69
|
@tool_timeout = tool_timeout
|
|
60
70
|
@tool_calls_made = []
|
|
@@ -145,17 +155,16 @@ module RubyPi
|
|
|
145
155
|
# Build tools array for the LLM
|
|
146
156
|
tools = build_tools_array
|
|
147
157
|
|
|
148
|
-
#
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
#
|
|
158
|
+
# Call the LLM with streaming. The recorded assistant message uses
|
|
159
|
+
# the returned Response#content (already the final, authoritative
|
|
160
|
+
# text), so there is no need to accumulate deltas here — we only
|
|
161
|
+
# re-emit them for subscribers.
|
|
152
162
|
response = @state.model.complete(
|
|
153
163
|
messages: messages,
|
|
154
164
|
tools: tools,
|
|
155
165
|
stream: true
|
|
156
166
|
) do |event|
|
|
157
167
|
if event.text_delta?
|
|
158
|
-
streamed_content << event.data.to_s
|
|
159
168
|
@emitter.emit(:text_delta, content: event.data)
|
|
160
169
|
elsif event.tool_call_delta?
|
|
161
170
|
# Emit tool call delta events so subscribers can observe partial
|
|
@@ -164,12 +173,11 @@ module RubyPi
|
|
|
164
173
|
@emitter.emit(:tool_call_delta, data: event.data)
|
|
165
174
|
elsif event.fallback_start?
|
|
166
175
|
# The primary LLM provider failed mid-stream and a Fallback
|
|
167
|
-
# provider is now taking over.
|
|
168
|
-
#
|
|
169
|
-
#
|
|
170
|
-
#
|
|
171
|
-
#
|
|
172
|
-
streamed_content.clear
|
|
176
|
+
# provider is now taking over. Surface a :provider_fallback event
|
|
177
|
+
# so subscribers can clear any UI state they rendered from the
|
|
178
|
+
# discarded primary deltas. The recorded response is unaffected:
|
|
179
|
+
# it comes from the fallback provider's returned Response#content,
|
|
180
|
+
# never from the failed primary's partial text.
|
|
173
181
|
@emitter.emit(:provider_fallback, **event.data)
|
|
174
182
|
end
|
|
175
183
|
end
|
|
@@ -208,32 +216,39 @@ module RubyPi
|
|
|
208
216
|
timeout: @tool_timeout
|
|
209
217
|
)
|
|
210
218
|
|
|
211
|
-
#
|
|
212
|
-
#
|
|
213
|
-
#
|
|
214
|
-
#
|
|
215
|
-
#
|
|
216
|
-
#
|
|
217
|
-
symbolized
|
|
218
|
-
|
|
219
|
+
# Normalize each tool call's arguments to symbol keys ONCE, up front,
|
|
220
|
+
# by rebuilding the ToolCall objects. Every downstream consumer — the
|
|
221
|
+
# executor (which invokes the tool block), the before/after_tool_call
|
|
222
|
+
# hooks (which receive the ToolCall directly), the emitted
|
|
223
|
+
# :tool_execution_start event, and the recorded `tool_calls_made`
|
|
224
|
+
# payload — then observes the identical symbol-keyed shape. Carrying
|
|
225
|
+
# the symbolized form on the ToolCall itself (rather than in a side
|
|
226
|
+
# array) is what keeps the hooks consistent with everything else;
|
|
227
|
+
# previously hooks saw raw string keys while events/records saw symbols.
|
|
228
|
+
tool_calls = response.tool_calls.map do |tc|
|
|
229
|
+
RubyPi::LLM::ToolCall.new(
|
|
230
|
+
id: tc.id,
|
|
231
|
+
name: tc.name,
|
|
232
|
+
arguments: RubyPi::Tools::Executor.deep_symbolize_keys(tc.arguments)
|
|
233
|
+
)
|
|
219
234
|
end
|
|
220
235
|
|
|
221
236
|
# Prepare call hashes for the executor
|
|
222
|
-
calls =
|
|
223
|
-
{ name: tc.name, arguments:
|
|
237
|
+
calls = tool_calls.map do |tc|
|
|
238
|
+
{ name: tc.name, arguments: tc.arguments }
|
|
224
239
|
end
|
|
225
240
|
|
|
226
241
|
# Fire before_tool_call hooks and emit start events
|
|
227
|
-
|
|
242
|
+
tool_calls.each do |tc|
|
|
228
243
|
@state.before_tool_call&.call(tc)
|
|
229
|
-
@emitter.emit(:tool_execution_start, tool_name: tc.name, arguments:
|
|
244
|
+
@emitter.emit(:tool_execution_start, tool_name: tc.name, arguments: tc.arguments)
|
|
230
245
|
end
|
|
231
246
|
|
|
232
247
|
# Execute all tool calls
|
|
233
248
|
results = executor.execute(calls)
|
|
234
249
|
|
|
235
250
|
# Fire after_tool_call hooks, emit end events, and add results to messages
|
|
236
|
-
|
|
251
|
+
tool_calls.each_with_index do |tc, idx|
|
|
237
252
|
result = results[idx]
|
|
238
253
|
|
|
239
254
|
@state.after_tool_call&.call(tc, result)
|
|
@@ -247,7 +262,7 @@ module RubyPi
|
|
|
247
262
|
# arguments so callers see the same shape the tool itself received.
|
|
248
263
|
@tool_calls_made << {
|
|
249
264
|
tool_name: tc.name,
|
|
250
|
-
arguments:
|
|
265
|
+
arguments: tc.arguments,
|
|
251
266
|
result: result.to_h
|
|
252
267
|
}
|
|
253
268
|
|
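The loop.rb hunks above call `RubyPi::Tools::Executor.deep_symbolize_keys`, whose body is not part of this diff. As a rough idea of what such a normalizer does, here is a minimal sketch written as a free-standing method; it is an assumption about the behavior, not the gem's implementation.

```ruby
# Recursively converts Hash keys to symbols, descending into nested Hashes
# and Arrays. Non-collection values are returned unchanged.
def deep_symbolize_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), out|
      new_key = key.respond_to?(:to_sym) ? key.to_sym : key
      out[new_key] = deep_symbolize_keys(inner)
    end
  when Array
    value.map { |item| deep_symbolize_keys(item) }
  else
    value
  end
end

raw_arguments = {
  "content" => "hello",
  "options" => { "platform" => "web", "tags" => [{ "name" => "x" }] }
}

p deep_symbolize_keys(raw_arguments)
```

Doing this once, on the `ToolCall` itself, is what lets hooks, events, and the tool block all observe the same key type.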
data/lib/ruby_pi/agent/state.rb
CHANGED
|
@@ -91,6 +91,12 @@ module RubyPi
|
|
|
91
91
|
|
|
92
92
|
# Appends a message to the conversation history.
|
|
93
93
|
#
|
|
94
|
+
# NOTE: history grows without bound — there is no built-in cap. Growth
|
|
95
|
+
# per run is limited by max_iterations, but long-lived agents that call
|
|
96
|
+
# continue() repeatedly (or use a high max_iterations with large tool
|
|
97
|
+
# outputs) accumulate messages linearly. Configure
|
|
98
|
+
# Agent.new(compaction: ...) to keep the context bounded.
|
|
99
|
+
#
|
|
94
100
|
# @param role [Symbol, String] the message role (:user, :assistant, :system, :tool)
|
|
95
101
|
# @param content [String, nil] the text content of the message
|
|
96
102
|
# @param options [Hash] additional fields (e.g., :tool_call_id, :tool_calls)
|
data/lib/ruby_pi/configuration.rb
CHANGED
|
@@ -37,19 +37,53 @@ module RubyPi
|
|
|
37
37
|
attr_accessor :openai_api_key
|
|
38
38
|
|
|
39
39
|
# @return [Integer] Maximum number of retry attempts for transient errors (default: 3)
|
|
40
|
-
|
|
40
|
+
attr_reader :max_retries
|
|
41
41
|
|
|
42
42
|
# @return [Float] Base delay in seconds for exponential backoff (default: 1.0)
|
|
43
|
-
|
|
43
|
+
attr_reader :retry_base_delay
|
|
44
44
|
|
|
45
45
|
# @return [Float] Maximum delay in seconds between retries (default: 30.0)
|
|
46
|
-
|
|
46
|
+
attr_reader :retry_max_delay
|
|
47
47
|
|
|
48
48
|
# @return [Integer] HTTP request timeout in seconds (default: 120)
|
|
49
|
-
|
|
49
|
+
attr_reader :request_timeout
|
|
50
50
|
|
|
51
51
|
# @return [Integer] HTTP connection open timeout in seconds (default: 10)
|
|
52
|
-
|
|
52
|
+
attr_reader :open_timeout
|
|
53
|
+
|
|
54
|
+
# Validated writers for numeric settings. A negative max_retries silently
|
|
55
|
+
# disables retries and a negative delay raises deep inside the retry
|
|
56
|
+
# loop's sleep — fail fast at assignment time instead, where the typo is.
|
|
57
|
+
|
|
58
|
+
# @param value [Integer] must be a non-negative integer
|
|
59
|
+
def max_retries=(value)
|
|
60
|
+
validate_numeric!(:max_retries, value)
|
|
61
|
+
@max_retries = value
|
|
62
|
+
end
|
|
63
|
+
|
|
64
|
+
# @param value [Numeric] must be non-negative
|
|
65
|
+
def retry_base_delay=(value)
|
|
66
|
+
validate_numeric!(:retry_base_delay, value)
|
|
67
|
+
@retry_base_delay = value
|
|
68
|
+
end
|
|
69
|
+
|
|
70
|
+
# @param value [Numeric] must be non-negative
|
|
71
|
+
def retry_max_delay=(value)
|
|
72
|
+
validate_numeric!(:retry_max_delay, value)
|
|
73
|
+
@retry_max_delay = value
|
|
74
|
+
end
|
|
75
|
+
|
|
76
|
+
# @param value [Numeric] must be non-negative
|
|
77
|
+
def request_timeout=(value)
|
|
78
|
+
validate_numeric!(:request_timeout, value)
|
|
79
|
+
@request_timeout = value
|
|
80
|
+
end
|
|
81
|
+
|
|
82
|
+
# @param value [Numeric] must be non-negative
|
|
83
|
+
def open_timeout=(value)
|
|
84
|
+
validate_numeric!(:open_timeout, value)
|
|
85
|
+
@open_timeout = value
|
|
86
|
+
end
|
|
53
87
|
|
|
54
88
|
# @return [String] Default model name for Gemini provider
|
|
55
89
|
attr_accessor :default_gemini_model
|
|
@@ -78,6 +112,17 @@ module RubyPi
|
|
|
78
112
|
|
|
79
113
|
private
|
|
80
114
|
|
|
115
|
+
# Raises unless the value is a non-negative Numeric.
|
|
116
|
+
#
|
|
117
|
+
# @param name [Symbol] the setting name (for the error message)
|
|
118
|
+
# @param value [Object] the value being assigned
|
|
119
|
+
# @raise [ArgumentError] if value is not a Numeric or is negative
|
|
120
|
+
def validate_numeric!(name, value)
|
|
121
|
+
return if value.is_a?(Numeric) && value >= 0
|
|
122
|
+
|
|
123
|
+
raise ArgumentError, "#{name} must be a non-negative number, got #{value.inspect}"
|
|
124
|
+
end
|
|
125
|
+
|
|
81
126
|
# Sets all configuration ivars to their default values. Called by both
|
|
82
127
|
# initialize and reset! to ensure consistent defaults without the
|
|
83
128
|
# anti-pattern of calling initialize from reset!.
|
data/lib/ruby_pi/context/compaction.rb
CHANGED
|
@@ -75,40 +75,77 @@ module RubyPi
|
|
|
75
75
|
|
|
76
76
|
# Split into messages to summarize and messages to keep
|
|
77
77
|
preserved_count = [@preserve_last_n, messages.size].min
|
|
78
|
-
droppable = messages[0...(messages.size - preserved_count)]
|
|
79
|
-
preserved = messages[(messages.size - preserved_count)..]
|
|
78
|
+
droppable = messages[0...(messages.size - preserved_count)].dup
|
|
79
|
+
preserved = messages[(messages.size - preserved_count)..].dup
|
|
80
80
|
|
|
81
81
|
# If there's nothing to drop, we can't compact further
|
|
82
82
|
return nil if droppable.empty?
|
|
83
83
|
|
|
84
|
+
# Anthropic and OpenAI both require every tool_result / tool message
|
|
85
|
+
# to reference a tool_use / tool_call from a preceding assistant
|
|
86
|
+
# message. If we summarize the assistant turn that originated a tool
|
|
87
|
+
# call but keep the matching tool_result, the API rejects the
|
|
88
|
+
# request with "tool_result without preceding tool_use".
|
|
89
|
+
#
|
|
90
|
+
# When the boundary between droppable and preserved cuts mid-exchange,
|
|
91
|
+
# preserved can start with one or more orphan :tool messages whose
|
|
92
|
+
# matching assistant turn is in droppable. Strip those off the head of
|
|
93
|
+
# preserved and move them into droppable so they are summarized away
|
|
94
|
+
# rather than sent. Because the originating assistant message is older,
|
|
95
|
+
# it is already in droppable, so the pair stays together there — there
|
|
96
|
+
# is no mirror case to handle (once a tool result is moved across, its
|
|
97
|
+
# assistant is never left stranded on the preserved side).
|
|
98
|
+
while preserved.first && preserved.first[:role] == :tool
|
|
99
|
+
droppable << preserved.shift
|
|
100
|
+
end
|
|
101
|
+
|
|
102
|
+
# The orphan-strip only moves messages INTO droppable, so droppable
|
|
103
|
+
# cannot have shrunk; it is still non-empty here. preserved, however,
|
|
104
|
+
# may now be empty (the whole window was tool results) — the summary
|
|
105
|
+
# construction below handles that case.
|
|
106
|
+
|
|
84
107
|
# Generate a summary of the dropped messages
|
|
85
108
|
summary = summarize(droppable)
|
|
86
109
|
|
|
87
110
|
# Emit compaction event if an emitter is available
|
|
88
111
|
@emitter&.emit(:compaction, dropped_count: droppable.size, summary: summary)
|
|
89
112
|
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
|
|
113
|
+
build_compacted_history(summary, preserved)
|
|
114
|
+
end
|
|
115
|
+
|
|
116
|
+
# Builds the compacted history: a summary message followed by the
|
|
117
|
+
# preserved tail.
|
|
118
|
+
#
|
|
119
|
+
# The summary becomes the FIRST message of the compacted history, so it
|
|
120
|
+
# must satisfy the strictest provider constraints (Anthropic):
|
|
121
|
+
# 1. The summary role MUST NOT be :system — that would overwrite the
|
|
122
|
+
# real system prompt on Anthropic, which promotes the last :system
|
|
123
|
+
# message to the top-level `system:` parameter.
|
|
124
|
+
# 2. The first message MUST use role :user.
|
|
125
|
+
# 3. Consecutive same-role messages are rejected.
|
|
126
|
+
#
|
|
127
|
+
# A :user summary satisfies (1) and (2). For (3): the orphan-strip above
|
|
128
|
+
# guarantees the first preserved message is :assistant, :user, or absent
|
|
129
|
+
# (never :tool). When it is :assistant or absent, a standalone :user
|
|
130
|
+
# summary alternates correctly. When it is :user, a separate :user
|
|
131
|
+
# summary would create two consecutive user messages, so we instead
|
|
132
|
+
# merge the summary text into that existing user message — keeping the
|
|
133
|
+
# first message a single :user message with no role collision.
|
|
134
|
+
#
|
|
135
|
+
# @param summary [String] the generated summary text
|
|
136
|
+
# @param preserved [Array<Hash>] the preserved tail of messages
|
|
137
|
+
# @return [Array<Hash>] the compacted history
|
|
138
|
+
def build_compacted_history(summary, preserved)
|
|
139
|
+
summary_text = "[Conversation Summary]\n#{summary}"
|
|
140
|
+
first_preserved = preserved.first
|
|
141
|
+
|
|
142
|
+
if first_preserved && first_preserved[:role] == :user
|
|
143
|
+
merged = first_preserved.dup
|
|
144
|
+
merged[:content] = "#{summary_text}\n\n#{first_preserved[:content]}"
|
|
145
|
+
[merged] + preserved.drop(1)
|
|
146
|
+
else
|
|
147
|
+
[{ role: :user, content: summary_text }] + preserved
|
|
148
|
+
end
|
|
112
149
|
end
|
|
113
150
|
|
|
114
151
|
# Estimates the total token count for a system prompt and message array
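As a worked example of the `build_compacted_history` method added in the hunk above, the sketch below copies its body into a free-standing method and runs the three cases its comment enumerates. The sample messages are invented for illustration.

```ruby
# Body copied from the diff above so the example runs on its own.
def build_compacted_history(summary, preserved)
  summary_text = "[Conversation Summary]\n#{summary}"
  first_preserved = preserved.first

  if first_preserved && first_preserved[:role] == :user
    # Merge into the existing user message to avoid two :user turns in a row.
    merged = first_preserved.dup
    merged[:content] = "#{summary_text}\n\n#{first_preserved[:content]}"
    [merged] + preserved.drop(1)
  else
    [{ role: :user, content: summary_text }] + preserved
  end
end

# Case 1: preserved window starts with :assistant, summary is prepended.
case1 = build_compacted_history("S", [{ role: :assistant, content: "Done." }])
p case1.map { |m| m[:role] } # [:user, :assistant]

# Case 2: preserved window starts with :user, summary is merged into it.
case2 = build_compacted_history(
  "S",
  [{ role: :user, content: "Next?" }, { role: :assistant, content: "Sure." }]
)
p case2.map { |m| m[:role] } # [:user, :assistant]
p case2.first[:content]      # "[Conversation Summary]\nS\n\nNext?"

# Case 3: preserved window is empty, yielding a lone :user summary.
p build_compacted_history("S", []) # [{role: :user, content: "[Conversation Summary]\nS"}] (inspect formatting varies by Ruby version)
```

In every case the first message is `:user` and no two adjacent messages share a role, which is the constraint the comment attributes to Anthropic.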
|
data/lib/ruby_pi/llm/anthropic.rb
CHANGED
|
@@ -6,6 +6,8 @@
|
|
|
6
6
|
# the Anthropic Messages API for both synchronous and streaming completions,
|
|
7
7
|
# including tool_use block support.
|
|
8
8
|
|
|
9
|
+
require "json"
|
|
10
|
+
|
|
9
11
|
module RubyPi
|
|
10
12
|
module LLM
|
|
11
13
|
# Anthropic Claude provider implementation. Communicates with the Anthropic
|
|
@@ -330,9 +332,11 @@ module RubyPi
|
|
|
330
332
|
headers: default_headers
|
|
331
333
|
)
|
|
332
334
|
|
|
333
|
-
response =
|
|
334
|
-
|
|
335
|
-
|
|
335
|
+
response = with_transport_errors do
|
|
336
|
+
conn.post("/v1/messages") do |req|
|
|
337
|
+
req.headers["Content-Type"] = "application/json"
|
|
338
|
+
req.body = JSON.generate(body)
|
|
339
|
+
end
|
|
336
340
|
end
|
|
337
341
|
|
|
338
342
|
handle_error_response(response) unless response.success?
|
|
@@ -368,18 +372,26 @@ module RubyPi
|
|
|
368
372
|
# process complete lines incrementally so that deltas reach the caller
|
|
369
373
|
# as soon as each SSE event is fully received — not after the entire
|
|
370
374
|
# response has been buffered.
|
|
371
|
-
|
|
375
|
+
#
|
|
376
|
+
# The buffer is BINARY because chunks arrive as ASCII-8BIT and may end
|
|
377
|
+
# mid-way through a multi-byte UTF-8 character; appending such a chunk
|
|
378
|
+
# to a UTF-8 buffer that already holds non-ASCII text raises
|
|
379
|
+
# Encoding::CompatibilityError. Each complete line is re-encoded to
|
|
380
|
+
# UTF-8 (and scrubbed) before parsing, so deltas reach the caller as
|
|
381
|
+
# valid UTF-8 strings.
|
|
382
|
+
sse_buffer = (+"").force_encoding(Encoding::BINARY)
|
|
372
383
|
response_status = nil
|
|
373
384
|
|
|
374
385
|
# Accumulate error response body separately so ApiError gets the
|
|
375
386
|
# full body even though on_data consumed the chunks.
|
|
376
|
-
error_body = +""
|
|
387
|
+
error_body = (+"").force_encoding(Encoding::BINARY)
|
|
377
388
|
|
|
378
|
-
response =
|
|
379
|
-
|
|
380
|
-
|
|
389
|
+
response = with_transport_errors do
|
|
390
|
+
conn.post("/v1/messages") do |req|
|
|
391
|
+
req.headers["Content-Type"] = "application/json"
|
|
392
|
+
req.body = JSON.generate(body)
|
|
381
393
|
|
|
382
|
-
|
|
394
|
+
# Use Faraday's on_data callback for real incremental streaming.
|
|
383
395
|
# Without this, Faraday buffers the entire response body before
|
|
384
396
|
# returning, which means no deltas reach the caller until the model
|
|
385
397
|
# finishes generating (fake streaming).
|
|
@@ -391,14 +403,17 @@ module RubyPi
|
|
|
391
403
|
# calls on_data for error responses too, which would otherwise
|
|
392
404
|
# consume the body and leave response.body empty.
|
|
393
405
|
if response_status && response_status >= 400
|
|
394
|
-
error_body << chunk
|
|
406
|
+
error_body << chunk.b
|
|
395
407
|
next
|
|
396
408
|
end
|
|
397
409
|
|
|
398
|
-
sse_buffer << chunk
|
|
399
|
-
# Process all complete lines in the buffer
|
|
410
|
+
sse_buffer << chunk.b
|
|
411
|
+
# Process all complete lines in the buffer. A complete line holds
|
|
412
|
+
# complete UTF-8 sequences (multi-byte characters split across
|
|
413
|
+
# chunks are repaired by the buffering), so re-encode it to UTF-8
|
|
414
|
+
# here; scrub guards against a server sending invalid bytes.
|
|
400
415
|
while (line_end = sse_buffer.index("\n"))
|
|
401
|
-
line = sse_buffer.slice!(0, line_end + 1).strip
|
|
416
|
+
line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
|
|
402
417
|
next if line.empty?
|
|
403
418
|
next unless line.start_with?("data: ")
|
|
404
419
|
|
|
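Note: the binary-buffer change above can be reproduced in isolation. The sketch below is an assumption-laden reduction, not the provider code: `drain_lines` and `utf8_append_fails?` are hypothetical helpers, and the two chunks are produced by splitting a sample SSE line in the middle of a two-byte character.

```ruby
# Append raw bytes to a BINARY buffer and re-encode only complete lines.
# `drain_lines` is a hypothetical helper modeled on the loop in the diff.
def drain_lines(buffer, chunk)
  buffer << chunk.b
  lines = []
  while (line_end = buffer.index("\n"))
    line = buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
    lines << line unless line.empty?
  end
  lines
end

# Shows the failure mode the comment describes: a UTF-8 buffer that already
# holds non-ASCII text cannot accept a binary chunk with a dangling lead byte.
def utf8_append_fails?
  utf8_buffer = +"data: é"
  utf8_buffer << "\xC3".b
  false
rescue Encoding::CompatibilityError
  true
end

bytes = "data: café\n".b
first_chunk = bytes.byteslice(0, 10)                    # ends on the first byte of "é"
second_chunk = bytes.byteslice(10, bytes.bytesize - 10) # second byte of "é" plus newline

buffer = (+"").force_encoding(Encoding::BINARY)
drain_lines(buffer, first_chunk) # no complete line yet
puts drain_lines(buffer, second_chunk).inspect
puts utf8_append_fails?
```

Because the split character is only decoded once the newline arrives, the line comes out as valid UTF-8 and `scrub` has nothing to replace.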
@@ -424,7 +439,8 @@ module RubyPi
|
|
|
424
439
|
finish_reason = stream_state[:finish_reason]
|
|
425
440
|
end
|
|
426
441
|
end
|
|
427
|
-
|
|
442
|
+
end # conn.post
|
|
443
|
+
end # with_transport_errors
|
|
428
444
|
|
|
429
445
|
# Check for HTTP errors. When on_data was active, the response body
|
|
430
446
|
# was consumed by the callback, so we pass the accumulated error_body
|
|
@@ -432,12 +448,12 @@ module RubyPi
|
|
|
432
448
|
unless response.success?
|
|
433
449
|
# Reconstruct the response body from what on_data accumulated
|
|
434
450
|
error_response = response
|
|
435
|
-
error_body_str = error_body.empty? ? response.body : error_body
|
|
451
|
+
error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub
|
|
436
452
|
handle_error_response(error_response, override_body: error_body_str)
|
|
437
453
|
end
|
|
438
454
|
|
|
439
455
|
# Process any remaining data in the buffer after the connection closes
|
|
440
|
-
sse_buffer.each_line do |line|
|
|
456
|
+
sse_buffer.force_encoding(Encoding::UTF_8).scrub.each_line do |line|
|
|
441
457
|
line = line.strip
|
|
442
458
|
next if line.empty?
|
|
443
459
|
next unless line.start_with?("data: ")
|
|
@@ -558,7 +574,12 @@ module RubyPi
|
|
|
558
574
|
|
|
559
575
|
when "message_delta"
|
|
560
576
|
delta = data["delta"] || {}
|
|
561
|
-
finish_reason
|
|
577
|
+
# Only overwrite finish_reason when this delta actually carries a
|
|
578
|
+
# stop_reason. Anthropic emits the stop_reason on a single
|
|
579
|
+
# message_delta near the end of the stream; a later message_delta
|
|
580
|
+
# without one must not clobber the captured value back to nil
|
|
581
|
+
# (which would yield a Response with no finish_reason).
|
|
582
|
+
finish_reason = delta["stop_reason"] if delta["stop_reason"]
|
|
562
583
|
if data.key?("usage")
|
|
563
584
|
usage_info = data["usage"]
|
|
564
585
|
usage_data[:completion_tokens] = usage_info["output_tokens"]
|
|
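Note: the guarded assignment can be illustrated with a reduced event loop. This is a sketch only: `final_stop_reason` is a hypothetical helper, and the event sequence is invented to model the case the comment describes (a later `message_delta` that carries no `stop_reason`).

```ruby
# Reduced model of the stream handler: only stop_reason handling is kept.
def final_stop_reason(events)
  finish_reason = nil
  events.each do |data|
    next unless data["type"] == "message_delta"

    delta = data["delta"] || {}
    # Guarded assignment: a delta without stop_reason leaves the value intact.
    finish_reason = delta["stop_reason"] if delta["stop_reason"]
  end
  finish_reason
end

events = [
  { "type" => "message_delta", "delta" => { "stop_reason" => "end_turn" } },
  { "type" => "message_delta", "delta" => {} }
]
puts final_stop_reason(events)
```

With an unconditional `finish_reason = delta["stop_reason"]`, the second event would reset the value to nil.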
@@ -79,13 +79,22 @@ module RubyPi
|
|
|
79
79
|
# Authentication errors are not retryable — raise immediately
|
|
80
80
|
raise
|
|
81
81
|
rescue RubyPi::RateLimitError, RubyPi::ApiError, RubyPi::TimeoutError => e
|
|
82
|
+
# NOTE: RubyPi::ProviderError is intentionally NOT retried. Provider
|
|
83
|
+
# errors are overwhelmingly deterministic request-construction
|
|
84
|
+
# failures (missing tool_call_id, invalid tool-argument JSON, missing
|
|
85
|
+
# tool name) raised by build_request_body BEFORE any HTTP call. They
|
|
86
|
+
# produce the identical error on every attempt, so retrying only
|
|
87
|
+
# burns the backoff schedule before surfacing the same failure.
|
|
88
|
+
# Fallback wrappers still rescue RubyPi::Error (the ProviderError
|
|
89
|
+
# superclass), so provider failover is unaffected.
|
|
90
|
+
#
|
|
82
91
|
# Retry up to max_retries times AFTER the initial attempt.
|
|
83
92
|
# With max_retries: 3, attempt goes 1 (initial), 2, 3, 4 — the condition
|
|
84
93
|
# `attempt <= @max_retries` allows retries on attempts 1..3, so we get
|
|
85
94
|
# 3 retries + 1 initial = 4 total attempts. Previously used `< @max_retries`
|
|
86
95
|
# which was off-by-one (only 2 retries with max_retries: 3).
|
|
87
96
|
if attempt <= @max_retries
|
|
88
|
-
delay =
|
|
97
|
+
delay = retry_delay_for(e, attempt)
|
|
89
98
|
log_retry(attempt, delay, e)
|
|
90
99
|
sleep(delay)
|
|
91
100
|
retry
|
|
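Note: the attempt arithmetic in the comment (3 retries plus 1 initial call) can be checked with a small loop. The sketch assumes a counter that is incremented before each call; `TransientError` and `call_with_retries` are hypothetical names, and the sleep and backoff are left out so it runs instantly.

```ruby
# Stand-in error class for a retryable failure (hypothetical).
class TransientError < StandardError; end

# Calls the block, retrying up to max_retries times after the first attempt.
def call_with_retries(max_retries)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue TransientError
    # `<=` permits a retry after attempts 1..max_retries.
    retry if attempt <= max_retries
    raise
  end
end

calls = 0
begin
  call_with_retries(3) do
    calls += 1
    raise TransientError, "always fails"
  end
rescue TransientError
  puts calls
end
```

Swapping `<=` for `<` in this loop stops one attempt earlier, which is the off-by-one the comment refers to.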
@@ -127,6 +136,29 @@ module RubyPi
|
|
|
127
136
|
raise RubyPi::AbstractMethodError, :perform_complete
|
|
128
137
|
end
|
|
129
138
|
|
|
139
|
+
# Maximum delay (seconds) honored from a server-provided Retry-After
|
|
140
|
+
# header. Caps pathological or misconfigured server values so a single
|
|
141
|
+
# 429 cannot stall the client indefinitely.
|
|
142
|
+
RETRY_AFTER_CEILING = 60.0
|
|
143
|
+
|
|
144
|
+
# Picks the delay before the next retry. A server-provided Retry-After
|
|
145
|
+
# on a 429 takes precedence over the local exponential backoff: the
|
|
146
|
+
# server knows its own cooldown window, and retrying earlier just burns
|
|
147
|
+
# the retry budget against guaranteed 429s. Retry-After parsed from an
|
|
148
|
+
# HTTP-date (rather than delta-seconds) arrives as 0.0 and falls through
|
|
149
|
+
# to the computed backoff.
|
|
150
|
+
#
|
|
151
|
+
# @param error [Exception] the error that triggered the retry
|
|
152
|
+
# @param attempt [Integer] the current attempt number (1-based)
|
|
153
|
+
# @return [Float] delay in seconds
|
|
154
|
+
def retry_delay_for(error, attempt)
|
|
155
|
+
if error.is_a?(RubyPi::RateLimitError) && error.retry_after&.positive?
|
|
156
|
+
[error.retry_after, RETRY_AFTER_CEILING].min
|
|
157
|
+
else
|
|
158
|
+
calculate_backoff(attempt)
|
|
159
|
+
end
|
|
160
|
+
end
|
|
161
|
+
|
|
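Note: the precedence rule in `retry_delay_for` can be exercised with stand-ins. In the sketch below, `RateLimitError` is a hypothetical class that mimics only the `retry_after` reader, and `calculate_backoff` is a simplified, jitter-free placeholder rather than the gem's implementation.

```ruby
RETRY_AFTER_CEILING = 60.0

# Hypothetical stand-in exposing only what retry_delay_for needs.
class RateLimitError < StandardError
  attr_reader :retry_after

  def initialize(message = "rate limited", retry_after: nil)
    super(message)
    @retry_after = retry_after
  end
end

# Placeholder backoff (assumption): 0.5s doubling per attempt, no jitter.
def calculate_backoff(attempt)
  0.5 * (2**(attempt - 1))
end

def retry_delay_for(error, attempt)
  if error.is_a?(RateLimitError) && error.retry_after&.positive?
    # Honor the server's value, capped at the ceiling.
    [error.retry_after, RETRY_AFTER_CEILING].min
  else
    # nil or 0.0 (e.g. an unparsed HTTP-date) falls back to local backoff.
    calculate_backoff(attempt)
  end
end

puts retry_delay_for(RateLimitError.new(retry_after: 5.0), 1)
puts retry_delay_for(RateLimitError.new(retry_after: 3600.0), 1)
puts retry_delay_for(RateLimitError.new(retry_after: 0.0), 2)
```

The safe-navigation call `retry_after&.positive?` is what lets a nil and a zero value take the same fallback path.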
130
162
|
# Calculates the backoff delay for a given retry attempt using
|
|
131
163
|
# exponential backoff with jitter.
|
|
132
164
|
#
|
|
@@ -178,6 +210,45 @@ module RubyPi
|
|
|
178
210
|
end
|
|
179
211
|
end
|
|
180
212
|
|
|
213
|
+
# Wraps an HTTP block, translating Faraday transport-level exceptions
|
|
214
|
+
# (DNS failures, connection resets, TLS handshake failures, read/write timeouts)
|
|
215
|
+
# into the RubyPi typed-error hierarchy so callers and the retry loop
|
|
216
|
+
# can rescue them uniformly.
|
|
217
|
+
#
|
|
218
|
+
# Without this wrapper, a `Faraday::TimeoutError` or
|
|
219
|
+
# `Faraday::ConnectionFailed` would propagate out of the provider as
|
|
220
|
+
# the raw Faraday class. That breaks two contracts:
|
|
221
|
+
# 1. The documented retry policy (BaseProvider#complete) only rescues
|
|
222
|
+
# RubyPi errors, so transport failures would not be retried —
|
|
223
|
+
# exactly the case retries exist for.
|
|
224
|
+
# 2. Callers `rescue RubyPi::TimeoutError` per the documented error
|
|
225
|
+
# hierarchy and would not catch real network timeouts.
|
|
226
|
+
#
|
|
227
|
+
# @yield the HTTP call to wrap
|
|
228
|
+
# @return [Object] whatever the block returns
|
|
229
|
+
# @raise [RubyPi::TimeoutError] on Faraday::TimeoutError
|
|
230
|
+
# @raise [RubyPi::ApiError] on connection failures, SSL errors, or
|
|
231
|
+
# any other Faraday::Error not otherwise classified
|
|
232
|
+
def with_transport_errors
|
|
233
|
+
yield
|
|
234
|
+
rescue Faraday::TimeoutError => e
|
|
235
|
+
raise RubyPi::TimeoutError, "#{provider_name} request timed out: #{e.message}"
|
|
236
|
+
rescue Faraday::ConnectionFailed, Faraday::SSLError => e
|
|
237
|
+
raise RubyPi::ApiError.new(
|
|
238
|
+
"#{provider_name} transport error: #{e.class}: #{e.message}",
|
|
239
|
+
status_code: nil,
|
|
240
|
+
response_body: nil
|
|
241
|
+
)
|
|
242
|
+
rescue Faraday::Error => e
|
|
243
|
+
# Catch-all for any other Faraday-level failure (parsing, adapter
|
|
244
|
+
# issues, etc.) so transport problems never leak provider internals.
|
|
245
|
+
raise RubyPi::ApiError.new(
|
|
246
|
+
"#{provider_name} HTTP client error: #{e.class}: #{e.message}",
|
|
247
|
+
status_code: nil,
|
|
248
|
+
response_body: nil
|
|
249
|
+
)
|
|
250
|
+
end
|
|
251
|
+
|
|
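Note: the translation pattern in `with_transport_errors` depends on rescue order, with the specific classes listed before the catch-all. The sketch below shows that ordering with stand-in classes so it runs without Faraday or the gem; everything under `HttpStub` and `AppErrors` is hypothetical, and the messages are simplified.

```ruby
# Stand-in for the HTTP client's error hierarchy (hypothetical).
module HttpStub
  class Error < StandardError; end
  class TimeoutError < Error; end
  class ConnectionFailed < Error; end
end

# Stand-in for the library's own typed errors (hypothetical).
module AppErrors
  class TimeoutError < StandardError; end
  class ApiError < StandardError; end
end

def with_transport_errors
  yield
rescue HttpStub::TimeoutError => e
  raise AppErrors::TimeoutError, "request timed out: #{e.message}"
rescue HttpStub::ConnectionFailed => e
  raise AppErrors::ApiError, "transport error: #{e.class}: #{e.message}"
rescue HttpStub::Error => e
  # Catch-all goes last, since the specific classes inherit from it.
  raise AppErrors::ApiError, "HTTP client error: #{e.class}: #{e.message}"
end

begin
  with_transport_errors { raise HttpStub::TimeoutError, "read timeout" }
rescue AppErrors::TimeoutError => e
  puts e.message
end
```

A caller that rescues only the library's own error classes now sees timeouts and connection failures too, which is the contract the comment block describes.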
181
252
|
# Handles HTTP error responses by raising the appropriate RubyPi error.
|
|
182
253
|
# When streaming with on_data, the response body is consumed by the
|
|
183
254
|
# callback and response.body may be empty. Pass override_body with the
|