ruby-pi 0.1.6 → 0.1.8

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: c78d37122ed67d80e61cf51b182dcd79a20a7efa77b503c8b0340963ad60b728
4
- data.tar.gz: e3b147cb2b01fe28ac15c2a65d6177156992be7560601886296b16941784ee08
3
+ metadata.gz: b0054adb6a0863a8f296917be736df0ebfd789aa7205589b82689199d4bf4c06
4
+ data.tar.gz: fc79dcc61dbefce874e609807d989cf2293b0ecb45a6aa036069b11038ac5c9a
5
5
  SHA512:
6
- metadata.gz: cbc0c9abddf98885bf1a22352a9cd09475c324f9aff4bcdff66ce3a6a87e06eb677ab045c966038666744cf9819d5114714c66ba5b7c676de5958d5d964a6242
7
- data.tar.gz: 3f9c28b1a30d0e3ad0f1badd391c95065adea822927c1d334dc5fc5c9867e658b43e339e9307dd3eba8dd5a534043c9fae3ea8d0384bfae8eb35a1a09356f035
6
+ metadata.gz: c130ada9b7ed93f5c9a0d16596c1176fec258204be26af15c61db3c18effee94bc7a8a1783620397780b0e3501e660b4e8ff48d8463e4089067edfcbf3bf9b60
7
+ data.tar.gz: dc179fe40cb063c4321a1c7a1aff5abb7b441d5fd87ced19908f9875c1f5b26bf2e17555b44658f603b32627577fe99feb8f34d061ed7e53eaba3c28cecd8bbb
data/CHANGELOG.md CHANGED
@@ -5,6 +5,46 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [0.1.8] - 2026-06-09
9
+
10
+ ### Fixed (adversarial review round 6)
11
+
12
+ - **`Retry-After` header was parsed but never honored (High)**: On a 429, `handle_error_response` stored the server's `Retry-After` on `RateLimitError#retry_after`, but the retry loop in `BaseProvider#complete` always slept the local exponential backoff (capped at `retry_max_delay`) — hammering a server that asked for a longer cooldown until the retry budget burned out. The retry delay now prefers a positive `retry_after` (capped at `RETRY_AFTER_CEILING`, 60s); HTTP-date values (which parse to `0.0`) and absent headers fall through to the computed backoff
13
+ - **Parallel executor timeout/rejection Results reported `name: "unknown"` (High)**: `execute_parallel` hardcoded `"unknown"` in the timeout and rejected-future branches, so with several tools timing out concurrently, logs and `:tool_execution_end` subscribers could not tell which tool hung. Futures are now zipped with their originating calls and failure Results carry the real tool name; the timeout message matches sequential mode (`Tool 'x' timed out after Ns`). The rejected-future branch also no longer reports a misleading full-timeout `duration_ms` for what may have been an instant failure (now `0.0`)
14
+ - **Keyword-parameter tool blocks failed on every call (High)**: `Definition#call` passed a single positional Hash, so a block written `{ |content:, platform:| ... }` — the natural style given named schema parameters — raised `ArgumentError: missing keyword` on every invocation (surfacing as a confusing failed Result). `Definition` now detects keyword parameters at construction and splats the arguments hash to keywords; positional-Hash blocks are unchanged. Keyword blocks without `**rest` raise on unexpected keys — strict by design, since the keys come from the LLM
15
+ - **`:compaction` event was never emitted in production (Medium)**: `Compaction#emitter` defaults to nil and nothing ever assigned it, so the documented `agent.on(:compaction)` subscription silently never fired — the only place the emitter was set was the spec itself. `Loop#initialize` now wires its emitter into the compaction strategy (an explicitly preassigned emitter is left untouched)
16
+ - **Streaming chunks were never normalized to UTF-8 (Medium)**: Faraday delivers `on_data` chunks as ASCII-8BIT; appending a chunk to a UTF-8 SSE buffer already holding non-ASCII text raises `Encoding::CompatibilityError`, and yielded deltas could carry binary encoding into consumers' UTF-8 buffers. All three providers now buffer in BINARY and re-encode each complete SSE line to UTF-8 (with `scrub` guarding invalid bytes) before parsing, so `:text_delta` events are always valid UTF-8 — including multi-byte characters split across network chunks
17
+ - **Streaming fallback gave consumers no way to truncate partial primary output (Medium)**: If the primary streamed text and then died mid-stream, the fallback streamed a complete fresh response — a delta-appending consumer rendered `"<partial primary><full fallback>"` with no signal of how much to discard. The `:fallback_start` payload now includes `partial_output` (Boolean) and `partial_chars` (characters already yielded), so consumers can deterministically reset
18
+ - **Tool names were not validated against provider constraints (Medium)**: A tool named `send.email` registered fine and then 400'd on every Anthropic request with an opaque server error. `Definition` now validates names against `/\A[a-zA-Z0-9_-]{1,64}\z/` (the strictest provider constraint) and raises `ArgumentError` at definition time with a pointed message
19
+ - **`json` was used everywhere but never declared or required (Medium)**: `JSON.parse`/`JSON.generate` are called throughout the providers and agent loop, but the gem relied on Faraday's transitive `json` dependency and the entry point's single `require "json"` — loading `agent/loop.rb` in isolation raised `NameError`, contradicting the composability principle. The gemspec now declares `json >= 2.0` and every file referencing `JSON` requires it directly (pinned by a source-scan spec)
20
+ - **Configuration accepted negative retry/timeout values (Low)**: `max_retries = -1` silently disabled retries and a negative delay raised deep inside the retry loop's `sleep`. The numeric settings now have validated writers that raise `ArgumentError` at assignment time
21
+ - **Global configuration first-access race (Low)**: `@configuration ||= Configuration.new` was unsynchronized; two threads racing the first call could each construct a Configuration with one silently discarded. The configuration is now eagerly initialized at require time
22
+ - **`continue()` Result accounting documented (Docs)**: Each `run`/`continue` builds a fresh Loop, so the returned Result's `usage`/`tool_calls_made`/`turns` cover only that invocation while `messages` is cumulative — an undocumented asymmetry, now documented on `Core#continue`
23
+ - **Schema DSL documented as LLM-facing hints, not validation (Docs)**: Nothing validates model-supplied arguments against `tool.parameters` before invoking the block — `required`/`enum`/`minimum` constrain what the model is asked to produce, with no runtime enforcement or type coercion. This is deliberate (anti-framework), but the schema header now says so loudly and directs tool blocks to treat arguments as untrusted input
24
+ - **`State#add_message` unbounded growth documented (Docs)**: Long-lived agents calling `continue()` repeatedly accumulate messages linearly without compaction configured; documented on the method
25
+ - **CLAUDE.md module map corrected (Docs)**: The map referenced a nonexistent `agent/agent.rb`, omitted `core.rb`/`loop.rb`/`state.rb`/`events.rb`, hardcoded version `0.1.0`, and the extension example used the one-arg `|event|` block signature instead of the actual `|data, agent|`. All corrected
26
+
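The two `Definition` fixes in the list above (keyword-block detection and tool-name validation) can be sketched roughly as follows. This is a minimal illustration only: `SketchDefinition` and its constants are hypothetical names, and the gem's real `Definition` internals are not part of this diff. The sketch relies on `Proc#parameters`, which reports `:keyreq`, `:key`, and `:keyrest` entries for keyword parameters.

```ruby
# Hypothetical stand-in for the gem's Definition; all names are illustrative.
class SketchDefinition
  NAME_PATTERN  = /\A[a-zA-Z0-9_-]{1,64}\z/ # pattern quoted in the changelog
  KEYWORD_KINDS = %i[keyreq key keyrest].freeze

  attr_reader :name

  def initialize(name, &block)
    # Fail at definition time rather than on the first provider request.
    unless name.to_s.match?(NAME_PATTERN)
      raise ArgumentError, "invalid tool name #{name.inspect}: must match #{NAME_PATTERN.inspect}"
    end

    @name = name.to_s
    @block = block
    # Detect keyword parameters once, at construction time.
    @keyword_block = block.parameters.any? { |kind, _| KEYWORD_KINDS.include?(kind) }
  end

  def call(arguments)
    if @keyword_block
      @block.call(**arguments) # symbol-keyed hash splatted to keywords
    else
      @block.call(arguments)   # positional-Hash blocks behave as before
    end
  end
end

keyword_tool = SketchDefinition.new("post_update") { |content:, platform:| "#{platform}: #{content}" }
hash_tool    = SketchDefinition.new("echo") { |args| args[:content] }
```

With these definitions, `keyword_tool.call(content: "hi", platform: "web")` and `hash_tool.call(content: "hi")` both succeed, while `SketchDefinition.new("send.email") { |a| a }` raises `ArgumentError` because `.` is outside the allowed character set.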
27
+ ### Release-history note
28
+
29
+ - **`[0.1.4]` below was never actually released**: `lib/ruby_pi/version.rb` went from `0.1.3` directly to `0.1.5` — the round-2 fixes documented under 0.1.4 shipped without a version bump and were first published as part of 0.1.5. There is intentionally no `v0.1.4` git tag or gem. (Discovered during round 6; the entry is kept for historical accuracy of *what* changed.)
30
+
31
+ ## [0.1.7] - 2026-05-28
32
+
33
+ ### Fixed (adversarial review round 5)
34
+
35
+ - **Compaction produced an Anthropic-invalid leading `:assistant` message (Critical)**: The 0.1.6 orphan-`:tool` strip fixed tool-result splitting but left the summary-role logic (`first_preserved == :assistant ? :user : :assistant`) intact. Whenever the first preserved message was `:user` (multi-turn reuse) or the preserved window emptied out (all tool results), the summary became an `:assistant` message at the head of the conversation — which Anthropic rejects with HTTP 400 "first message must use the 'user' role". The summary is now **always** a `:user` message (valid as the first message and never overwriting the system prompt). When the first preserved message is itself `:user`, the summary is merged into it to avoid consecutive same-role messages; an empty preserved window yields a lone `:user` summary. Extracted into `Compaction#build_compacted_history`
36
+ - **Compaction dead "mirror case" branch removed (Minor)**: The 0.1.6 `if droppable.last … && preserved.first[:role] == :tool` block was unreachable — the preceding `while` loop guarantees `preserved.first` is never `:tool`. Removed it (the originating assistant message is already in droppable alongside its now-moved tool results, so the pair is never split), eliminating misleading dead code
37
+ - **Deterministic `ProviderError` was retried with backoff (Minor)**: 0.1.6 added `RubyPi::ProviderError` to the retryable set in `BaseProvider#complete`, but provider errors are overwhelmingly deterministic request-construction failures (missing `tool_call_id`, invalid tool-argument JSON) raised before any HTTP call — retrying only burned the backoff schedule before re-raising the identical error. `ProviderError` is no longer retried. Fallback failover is unaffected (it rescues the `RubyPi::Error` superclass)
38
+ - **Lifecycle hooks saw string-keyed tool arguments while events saw symbols (Minor)**: `before_tool_call`/`after_tool_call` received the raw `ToolCall` (string-keyed `arguments`) while the `:tool_execution_start` event and `tool_calls_made` carried symbol keys — so a hook and an event subscriber disagreed on the key type for the same call. `Loop#act` now rebuilds each `ToolCall` with symbol-keyed arguments up front, so hooks, events, `tool_calls_made`, and the tool block all observe the identical shape
39
+ - **Anthropic streaming `finish_reason` could be clobbered to nil (Minor)**: A trailing `message_delta` event without a `stop_reason` overwrote the previously captured value, yielding a `Response` with no `finish_reason`. The assignment is now guarded (`finish_reason = delta["stop_reason"] if delta["stop_reason"]`), matching the OpenAI/Gemini guards
40
+ - **Gemini `finishReason` assumed a String (Minor)**: `finishReason.downcase` would raise `NoMethodError` on a non-String payload mid-stream. Both the streaming and standard paths now coerce via `to_s` before `downcase`, and remain consistent with each other
41
+ - **Dead streamed-content accumulator removed (Cleanup)**: `Loop#think` accumulated `streamed_content` that was never read (the recorded assistant message uses `Response#content`); the `.clear` on `:fallback_start` was a no-op and its comment was inaccurate. Removed the local; the `:provider_fallback` event still fires
42
+ - **`Fallback` class docstring corrected (Docs)**: The class-level docstring still described the removed happy-path buffering ("the Fallback now buffers deltas… buffered deltas are discarded"), contradicting the real-time direct-streaming implementation. Updated to describe direct streaming plus the `:fallback_start` signal
43
+
44
+ ### Investigated, no change
45
+
46
+ - **Streaming HTTP error bodies via `env.status`**: A prior review raised that streaming error responses might lose their body if Faraday's `on_data` callback received a nil `env.status`. Verified against the actual stack (faraday 2.14.1 / faraday-net_http 3.3.0): the net_http adapter calls `save_http_response` (which sets `env.status`) before `response.read_body` streams chunks, and `Env#stream_response` passes that same populated `env` to the user's `on_data` proc. `env.status` is therefore reliably available before the first chunk, so the existing `error_body` recovery works. No fix needed
47
+
8
48
  ## [0.1.6] - 2026-05-01
9
49
 
10
50
  ### Fixed (adversarial review round 4)
@@ -140,12 +140,18 @@ module RubyPi
140
140
  # the existing conversation history and appends the new prompt before
141
141
  # resuming the loop.
142
142
  #
143
+ # NOTE on Result accounting: each run/continue builds a fresh Loop, so
144
+ # the returned Result's `usage`, `tool_calls_made`, and `turns` cover
145
+ # ONLY this invocation — while `messages` is cumulative across the whole
146
+ # conversation. Sum the per-call Results if you need session totals.
147
+ #
143
148
  # Issue #16: Uses the encapsulated reset_iteration! method instead of
144
149
  # the old approach that bypassed encapsulation
145
150
  # and was fragile.
146
151
  #
147
152
  # @param prompt [String] the follow-up user message
148
153
  # @return [RubyPi::Agent::Result] the outcome of the continued run
154
+ # (usage/tool_calls_made/turns are per-invocation; messages cumulative)
149
155
  def continue(prompt)
150
156
  @state.reset_iteration!
151
157
  @state.add_message(role: :user, content: prompt)
@@ -10,6 +10,8 @@
10
10
  # is reached. It handles streaming, lifecycle events, compaction, and all
11
11
  # pre/post tool call hooks.
12
12
 
13
+ require "json"
14
+
13
15
  module RubyPi
14
16
  module Agent
15
17
  # Executes the think-act-observe cycle against a given State, emitting
@@ -55,6 +57,14 @@ module RubyPi
55
57
  @state = state
56
58
  @emitter = emitter
57
59
  @compaction = compaction
60
+ # Wire the loop's emitter into the compaction strategy so the
61
+ # documented :compaction event actually reaches agent subscribers.
62
+ # Compaction#emitter defaults to nil and nothing else ever sets it —
63
+ # without this, `agent.on(:compaction)` never fires. An emitter that
64
+ # was already assigned explicitly is left untouched.
65
+ if @compaction.respond_to?(:emitter=) && @compaction.respond_to?(:emitter) && @compaction.emitter.nil?
66
+ @compaction.emitter = emitter
67
+ end
58
68
  @execution_mode = execution_mode
59
69
  @tool_timeout = tool_timeout
60
70
  @tool_calls_made = []
@@ -145,17 +155,16 @@ module RubyPi
145
155
  # Build tools array for the LLM
146
156
  tools = build_tools_array
147
157
 
148
- # Accumulate streamed content
149
- streamed_content = +""
150
-
151
- # Call the LLM with streaming
158
+ # Call the LLM with streaming. The recorded assistant message uses
159
+ # the returned Response#content (already the final, authoritative
160
+ # text), so there is no need to accumulate deltas here — we only
161
+ # re-emit them for subscribers.
152
162
  response = @state.model.complete(
153
163
  messages: messages,
154
164
  tools: tools,
155
165
  stream: true
156
166
  ) do |event|
157
167
  if event.text_delta?
158
- streamed_content << event.data.to_s
159
168
  @emitter.emit(:text_delta, content: event.data)
160
169
  elsif event.tool_call_delta?
161
170
  # Emit tool call delta events so subscribers can observe partial
@@ -164,12 +173,11 @@ module RubyPi
164
173
  @emitter.emit(:tool_call_delta, data: event.data)
165
174
  elsif event.fallback_start?
166
175
  # The primary LLM provider failed mid-stream and a Fallback
167
- # provider is now taking over. Discard the partial text we
168
- # accumulated from the failed primary so the agent's recorded
169
- # response reflects only the fallback's output, and surface a
170
- # :provider_fallback event so subscribers can clear any UI
171
- # state they rendered from the discarded primary deltas.
172
- streamed_content.clear
176
+ # provider is now taking over. Surface a :provider_fallback event
177
+ # so subscribers can clear any UI state they rendered from the
178
+ # discarded primary deltas. The recorded response is unaffected:
179
+ # it comes from the fallback provider's returned Response#content,
180
+ # never from the failed primary's partial text.
173
181
  @emitter.emit(:provider_fallback, **event.data)
174
182
  end
175
183
  end
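On the subscriber side, the 0.1.8 changelog entry says the fallback payload carries `partial_output` and `partial_chars`. A delta-appending consumer could use them along these lines. This is a sketch under assumptions: `TranscriptBuffer` and its method names are hypothetical, and it assumes `partial_chars` counts exactly the characters the failed primary had already yielded.

```ruby
# Hypothetical delta-appending consumer; not part of the gem.
class TranscriptBuffer
  attr_reader :text

  def initialize
    @text = +""
  end

  # Would be wired to the :text_delta event (payload key :content, as emitted above).
  def on_text_delta(data)
    @text << data[:content].to_s
  end

  # Would be wired to the :provider_fallback event. Drops the characters
  # already rendered from the failed primary, if any.
  def on_provider_fallback(data)
    return unless data[:partial_output]

    keep = [@text.length - data[:partial_chars].to_i, 0].max
    @text = @text[0, keep]
  end
end

buffer = TranscriptBuffer.new
buffer.on_text_delta(content: "Earlier turn. ")
buffer.on_text_delta(content: "Partial prim")                        # 12 characters from the primary
buffer.on_provider_fallback(partial_output: true, partial_chars: 12) # primary died mid-stream
buffer.on_text_delta(content: "Full fallback answer.")
```

After this sequence `buffer.text` is `"Earlier turn. Full fallback answer."`: the partial primary output is gone and the fallback's text follows the earlier content directly.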
@@ -208,32 +216,39 @@ module RubyPi
208
216
  timeout: @tool_timeout
209
217
  )
210
218
 
211
- # Symbolize the JSON-parsed (string-keyed) tool_call arguments once,
212
- # up front. Both the executor (which actually invokes the tool block)
213
- # and the recorded `tool_calls_made` payload use this symbol-keyed
214
- # form, keeping a single consistent shape across the pipeline rather
215
- # than mixing string keys (raw from JSON) and symbol keys (post-
216
- # symbolize) in different places.
217
- symbolized = response.tool_calls.map do |tc|
218
- RubyPi::Tools::Executor.deep_symbolize_keys(tc.arguments)
219
+ # Normalize each tool call's arguments to symbol keys ONCE, up front,
220
+ # by rebuilding the ToolCall objects. Every downstream consumer (the
221
+ # executor that invokes the tool block, the before/after_tool_call
222
+ # hooks that receive the ToolCall directly, the emitted
223
+ # :tool_execution_start event, and the recorded `tool_calls_made`
224
+ # payload) then observes the identical symbol-keyed shape. Carrying
225
+ # the symbolized form on the ToolCall itself (rather than in a side
226
+ # array) is what keeps the hooks consistent with everything else;
227
+ # previously hooks saw raw string keys while events/records saw symbols.
228
+ tool_calls = response.tool_calls.map do |tc|
229
+ RubyPi::LLM::ToolCall.new(
230
+ id: tc.id,
231
+ name: tc.name,
232
+ arguments: RubyPi::Tools::Executor.deep_symbolize_keys(tc.arguments)
233
+ )
219
234
  end
220
235
 
221
236
  # Prepare call hashes for the executor
222
- calls = response.tool_calls.each_with_index.map do |tc, idx|
223
- { name: tc.name, arguments: symbolized[idx] }
237
+ calls = tool_calls.map do |tc|
238
+ { name: tc.name, arguments: tc.arguments }
224
239
  end
225
240
 
226
241
  # Fire before_tool_call hooks and emit start events
227
- response.tool_calls.each_with_index do |tc, idx|
242
+ tool_calls.each do |tc|
228
243
  @state.before_tool_call&.call(tc)
229
- @emitter.emit(:tool_execution_start, tool_name: tc.name, arguments: symbolized[idx])
244
+ @emitter.emit(:tool_execution_start, tool_name: tc.name, arguments: tc.arguments)
230
245
  end
231
246
 
232
247
  # Execute all tool calls
233
248
  results = executor.execute(calls)
234
249
 
235
250
  # Fire after_tool_call hooks, emit end events, and add results to messages
236
- response.tool_calls.each_with_index do |tc, idx|
251
+ tool_calls.each_with_index do |tc, idx|
237
252
  result = results[idx]
238
253
 
239
254
  @state.after_tool_call&.call(tc, result)
@@ -247,7 +262,7 @@ module RubyPi
247
262
  # arguments so callers see the same shape the tool itself received.
248
263
  @tool_calls_made << {
249
264
  tool_name: tc.name,
250
- arguments: symbolized[idx],
265
+ arguments: tc.arguments,
251
266
  result: result.to_h
252
267
  }
253
268
 
@@ -91,6 +91,12 @@ module RubyPi
91
91
 
92
92
  # Appends a message to the conversation history.
93
93
  #
94
+ # NOTE: history grows without bound — there is no built-in cap. Growth
95
+ # per run is limited by max_iterations, but long-lived agents that call
96
+ # continue() repeatedly (or use a high max_iterations with large tool
97
+ # outputs) accumulate messages linearly. Configure
98
+ # Agent.new(compaction: ...) to keep the context bounded.
99
+ #
94
100
  # @param role [Symbol, String] the message role (:user, :assistant, :system, :tool)
95
101
  # @param content [String, nil] the text content of the message
96
102
  # @param options [Hash] additional fields (e.g., :tool_call_id, :tool_calls)
@@ -37,19 +37,53 @@ module RubyPi
37
37
  attr_accessor :openai_api_key
38
38
 
39
39
  # @return [Integer] Maximum number of retry attempts for transient errors (default: 3)
40
- attr_accessor :max_retries
40
+ attr_reader :max_retries
41
41
 
42
42
  # @return [Float] Base delay in seconds for exponential backoff (default: 1.0)
43
- attr_accessor :retry_base_delay
43
+ attr_reader :retry_base_delay
44
44
 
45
45
  # @return [Float] Maximum delay in seconds between retries (default: 30.0)
46
- attr_accessor :retry_max_delay
46
+ attr_reader :retry_max_delay
47
47
 
48
48
  # @return [Integer] HTTP request timeout in seconds (default: 120)
49
- attr_accessor :request_timeout
49
+ attr_reader :request_timeout
50
50
 
51
51
  # @return [Integer] HTTP connection open timeout in seconds (default: 10)
52
- attr_accessor :open_timeout
52
+ attr_reader :open_timeout
53
+
54
+ # Validated writers for numeric settings. A negative max_retries silently
55
+ # disables retries and a negative delay raises deep inside the retry
56
+ # loop's sleep — fail fast at assignment time instead, where the typo is.
57
+
58
+ # @param value [Integer] must be a non-negative integer
59
+ def max_retries=(value)
60
+ validate_numeric!(:max_retries, value)
61
+ @max_retries = value
62
+ end
63
+
64
+ # @param value [Numeric] must be non-negative
65
+ def retry_base_delay=(value)
66
+ validate_numeric!(:retry_base_delay, value)
67
+ @retry_base_delay = value
68
+ end
69
+
70
+ # @param value [Numeric] must be non-negative
71
+ def retry_max_delay=(value)
72
+ validate_numeric!(:retry_max_delay, value)
73
+ @retry_max_delay = value
74
+ end
75
+
76
+ # @param value [Numeric] must be non-negative
77
+ def request_timeout=(value)
78
+ validate_numeric!(:request_timeout, value)
79
+ @request_timeout = value
80
+ end
81
+
82
+ # @param value [Numeric] must be non-negative
83
+ def open_timeout=(value)
84
+ validate_numeric!(:open_timeout, value)
85
+ @open_timeout = value
86
+ end
53
87
 
54
88
  # @return [String] Default model name for Gemini provider
55
89
  attr_accessor :default_gemini_model
@@ -78,6 +112,17 @@ module RubyPi
78
112
 
79
113
  private
80
114
 
115
+ # Raises unless the value is a non-negative Numeric.
116
+ #
117
+ # @param name [Symbol] the setting name (for the error message)
118
+ # @param value [Object] the value being assigned
119
+ # @raise [ArgumentError] if value is not a Numeric or is negative
120
+ def validate_numeric!(name, value)
121
+ return if value.is_a?(Numeric) && value >= 0
122
+
123
+ raise ArgumentError, "#{name} must be a non-negative number, got #{value.inspect}"
124
+ end
125
+
81
126
  # Sets all configuration ivars to their default values. Called by both
82
127
  # initialize and reset! to ensure consistent defaults without the
83
128
  # anti-pattern of calling initialize from reset!.
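To see the fail-fast behaviour of the validated writers in isolation, here is a reduced sketch. `SketchConfiguration` is a hypothetical single-setting class; `validate_numeric!` is copied from the hunk above.

```ruby
# Hypothetical reduced configuration with a single validated setting.
class SketchConfiguration
  attr_reader :max_retries

  def initialize
    @max_retries = 3
  end

  def max_retries=(value)
    validate_numeric!(:max_retries, value)
    @max_retries = value
  end

  private

  # Same check as in the diff: any non-negative Numeric passes.
  def validate_numeric!(name, value)
    return if value.is_a?(Numeric) && value >= 0

    raise ArgumentError, "#{name} must be a non-negative number, got #{value.inspect}"
  end
end

config = SketchConfiguration.new
config.max_retries = 5

# A negative value is rejected at assignment time and leaves the old value in place.
error_message =
  begin
    config.max_retries = -1
    nil
  rescue ArgumentError => e
    e.message
  end
```

Here `error_message` is `"max_retries must be a non-negative number, got -1"` and `config.max_retries` is still `5`. Note that the check as written accepts any non-negative `Numeric`, so a Float such as `2.5` would also pass for `max_retries` even though its `@param` tag says Integer.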
@@ -87,34 +87,22 @@ module RubyPi
87
87
  # call but keep the matching tool_result, the API rejects the
88
88
  # request with "tool_result without preceding tool_use".
89
89
  #
90
- # The boundary between droppable and preserved can split a tool
91
- # exchange in two ways:
92
- # (a) preserved starts with one or more :tool messages whose
93
- # matching assistant turn is in droppable. Strip those
94
- # orphan tool messages from the head of preserved (move
95
- # them into droppable so they are summarized, not sent).
96
- # (b) the last droppable message is an :assistant with tool_calls,
97
- # but its matching :tool result(s) are in preserved. Pull
98
- # that assistant message back into preserved so the pair
99
- # stays intact.
100
- #
101
- # We apply (a) first: it's the common case (preserve_last_n=4 cuts
102
- # mid-pair, leaving a stranded tool message). Then (b) catches the
103
- # mirror case.
90
+ # When the boundary between droppable and preserved cuts mid-exchange,
91
+ # preserved can start with one or more orphan :tool messages whose
92
+ # matching assistant turn is in droppable. Strip those off the head of
93
+ # preserved and move them into droppable so they are summarized away
94
+ # rather than sent. Because the originating assistant message is older,
95
+ # it is already in droppable, so the pair stays together there — there
96
+ # is no mirror case to handle (once a tool result is moved across, its
97
+ # assistant is never left stranded on the preserved side).
104
98
  while preserved.first && preserved.first[:role] == :tool
105
99
  droppable << preserved.shift
106
100
  end
107
101
 
108
- if droppable.last &&
109
- droppable.last[:role] == :assistant &&
110
- droppable.last[:tool_calls].is_a?(Array) &&
111
- !droppable.last[:tool_calls].empty? &&
112
- preserved.first && preserved.first[:role] == :tool
113
- preserved.unshift(droppable.pop)
114
- end
115
-
116
- # After the boundary fix-ups, droppable may have become empty.
117
- return nil if droppable.empty?
102
+ # The orphan-strip only moves messages INTO droppable, so droppable
103
+ # cannot have shrunk; it is still non-empty here. preserved, however,
104
+ # may now be empty (the whole window was tool results) — the summary
105
+ # construction below handles that case.
118
106
 
119
107
  # Generate a summary of the dropped messages
120
108
  summary = summarize(droppable)
@@ -122,28 +110,42 @@ module RubyPi
122
110
  # Emit compaction event if an emitter is available
123
111
  @emitter&.emit(:compaction, dropped_count: droppable.size, summary: summary)
124
112
 
125
- # Build the compacted history: summary message + preserved.
126
- #
127
- # The summary role MUST NOT be :system (that would overwrite the real
128
- # system prompt on Anthropic, which extracts the last :system message
129
- # as the top-level `system:` parameter).
130
- #
131
- # The summary role must also NOT match the role of the first preserved
132
- # message: consecutive same-role messages are rejected by Anthropic.
133
- # We pick :user when the next preserved message is :assistant, and
134
- # :assistant otherwise (covers :user, :tool, and an empty preserved).
135
- # On Anthropic, :tool messages become role :user with tool_result
136
- # blocks, so :assistant is the safe choice when the next message is
137
- # :tool too.
138
- first_preserved_role = preserved.first&.dig(:role)
139
- summary_role = first_preserved_role == :assistant ? :user : :assistant
140
-
141
- summary_message = {
142
- role: summary_role,
143
- content: "[Conversation Summary]\n#{summary}"
144
- }
145
-
146
- [summary_message] + preserved
113
+ build_compacted_history(summary, preserved)
114
+ end
115
+
116
+ # Builds the compacted history: a summary message followed by the
117
+ # preserved tail.
118
+ #
119
+ # The summary becomes the FIRST message of the compacted history, so it
120
+ # must satisfy the strictest provider constraints (Anthropic):
121
+ # 1. The summary role MUST NOT be :system; that would overwrite the
122
+ # real system prompt on Anthropic, which promotes the last :system
123
+ # message to the top-level `system:` parameter.
124
+ # 2. The first message MUST use role :user.
125
+ # 3. Consecutive same-role messages are rejected.
126
+ #
127
+ # A :user summary satisfies (1) and (2). For (3): the orphan-strip above
128
+ # guarantees the first preserved message is :assistant, :user, or absent
129
+ # (never :tool). When it is :assistant or absent, a standalone :user
130
+ # summary alternates correctly. When it is :user, a separate :user
131
+ # summary would create two consecutive user messages, so we instead
132
+ # merge the summary text into that existing user message — keeping the
133
+ # first message a single :user message with no role collision.
134
+ #
135
+ # @param summary [String] the generated summary text
136
+ # @param preserved [Array<Hash>] the preserved tail of messages
137
+ # @return [Array<Hash>] the compacted history
138
+ def build_compacted_history(summary, preserved)
139
+ summary_text = "[Conversation Summary]\n#{summary}"
140
+ first_preserved = preserved.first
141
+
142
+ if first_preserved && first_preserved[:role] == :user
143
+ merged = first_preserved.dup
144
+ merged[:content] = "#{summary_text}\n\n#{first_preserved[:content]}"
145
+ [merged] + preserved.drop(1)
146
+ else
147
+ [{ role: :user, content: summary_text }] + preserved
148
+ end
147
149
  end
148
150
 
149
151
  # Estimates the total token count for a system prompt and message array
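The three cases handled by `build_compacted_history` can be exercised with a standalone copy of the method. The body below is taken from the hunk above; only the sample messages are made up for illustration.

```ruby
# Standalone copy of the method from the hunk above, for illustration only.
def build_compacted_history(summary, preserved)
  summary_text = "[Conversation Summary]\n#{summary}"
  first_preserved = preserved.first

  if first_preserved && first_preserved[:role] == :user
    # Merge into the existing user message to avoid two consecutive :user turns.
    merged = first_preserved.dup
    merged[:content] = "#{summary_text}\n\n#{first_preserved[:content]}"
    [merged] + preserved.drop(1)
  else
    # First preserved message is :assistant, or the window is empty.
    [{ role: :user, content: summary_text }] + preserved
  end
end

user_first = build_compacted_history(
  "S", [{ role: :user, content: "next?" }, { role: :assistant, content: "ok" }]
)
assistant_first = build_compacted_history("S", [{ role: :assistant, content: "ok" }])
empty_window    = build_compacted_history("S", [])
```

In all three results the first message has role `:user`. `user_first` keeps two messages, with the summary prepended to `"next?"`; `assistant_first` gains a separate leading summary message; `empty_window` is a single `:user` summary.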
@@ -6,6 +6,8 @@
6
6
  # the Anthropic Messages API for both synchronous and streaming completions,
7
7
  # including tool_use block support.
8
8
 
9
+ require "json"
10
+
9
11
  module RubyPi
10
12
  module LLM
11
13
  # Anthropic Claude provider implementation. Communicates with the Anthropic
@@ -370,12 +372,19 @@ module RubyPi
370
372
  # process complete lines incrementally so that deltas reach the caller
371
373
  # as soon as each SSE event is fully received — not after the entire
372
374
  # response has been buffered.
373
- sse_buffer = +""
375
+ #
376
+ # The buffer is BINARY because chunks arrive as ASCII-8BIT and may end
377
+ # mid-way through a multi-byte UTF-8 character; appending such a chunk
378
+ # to a UTF-8 buffer that already holds non-ASCII text raises
379
+ # Encoding::CompatibilityError. Each complete line is re-encoded to
380
+ # UTF-8 (and scrubbed) before parsing, so deltas reach the caller as
381
+ # valid UTF-8 strings.
382
+ sse_buffer = (+"").force_encoding(Encoding::BINARY)
374
383
  response_status = nil
375
384
 
376
385
  # Accumulate error response body separately so ApiError gets the
377
386
  # full body even though on_data consumed the chunks.
378
- error_body = +""
387
+ error_body = (+"").force_encoding(Encoding::BINARY)
379
388
 
380
389
  response = with_transport_errors do
381
390
  conn.post("/v1/messages") do |req|
@@ -394,14 +403,17 @@ module RubyPi
394
403
  # calls on_data for error responses too, which would otherwise
395
404
  # consume the body and leave response.body empty.
396
405
  if response_status && response_status >= 400
397
- error_body << chunk
406
+ error_body << chunk.b
398
407
  next
399
408
  end
400
409
 
401
- sse_buffer << chunk
402
- # Process all complete lines in the buffer
410
+ sse_buffer << chunk.b
411
+ # Process all complete lines in the buffer. A complete line holds
412
+ # complete UTF-8 sequences (multi-byte characters split across
413
+ # chunks are repaired by the buffering), so re-encode it to UTF-8
414
+ # here; scrub guards against a server sending invalid bytes.
403
415
  while (line_end = sse_buffer.index("\n"))
404
- line = sse_buffer.slice!(0, line_end + 1).strip
416
+ line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
405
417
  next if line.empty?
406
418
  next unless line.start_with?("data: ")
407
419
 
@@ -436,12 +448,12 @@ module RubyPi
436
448
  unless response.success?
437
449
  # Reconstruct the response body from what on_data accumulated
438
450
  error_response = response
439
- error_body_str = error_body.empty? ? response.body : error_body
451
+ error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub
440
452
  handle_error_response(error_response, override_body: error_body_str)
441
453
  end
442
454
 
443
455
  # Process any remaining data in the buffer after the connection closes
444
- sse_buffer.each_line do |line|
456
+ sse_buffer.force_encoding(Encoding::UTF_8).scrub.each_line do |line|
445
457
  line = line.strip
446
458
  next if line.empty?
447
459
  next unless line.start_with?("data: ")
@@ -562,7 +574,12 @@ module RubyPi
562
574
 
563
575
  when "message_delta"
564
576
  delta = data["delta"] || {}
565
- finish_reason = delta["stop_reason"]
577
+ # Only overwrite finish_reason when this delta actually carries a
578
+ # stop_reason. Anthropic emits the stop_reason on a single
579
+ # message_delta near the end of the stream; a later message_delta
580
+ # without one must not clobber the captured value back to nil
581
+ # (which would yield a Response with no finish_reason).
582
+ finish_reason = delta["stop_reason"] if delta["stop_reason"]
566
583
  if data.key?("usage")
567
584
  usage_info = data["usage"]
568
585
  usage_data[:completion_tokens] = usage_info["output_tokens"]
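The binary-buffer approach in the streaming hunks above can be reproduced without any network. The sketch below simulates two ASCII-8BIT chunks that split a multi-byte character. The chunk contents are made up, while the buffer handling mirrors the lines in the diff.

```ruby
# Simulated ASCII-8BIT network chunks; the two bytes of "é" (0xC3 0xA9)
# are deliberately split across the chunk boundary.
payload = "data: caf\u00E9\n".b
chunks = [payload.byteslice(0, 10), payload.byteslice(10, payload.bytesize - 10)]

# The failure mode: appending a binary chunk containing a high byte to a
# UTF-8 buffer that already holds non-ASCII text.
utf8_buffer = +"caf\u00E9"
append_failed =
  begin
    utf8_buffer << "\xC3".b
    false
  rescue Encoding::CompatibilityError
    true
  end

# The fix: buffer raw bytes, and re-encode only complete lines.
sse_buffer = (+"").force_encoding(Encoding::BINARY)
lines = []

chunks.each do |chunk|
  sse_buffer << chunk.b
  while (line_end = sse_buffer.index("\n"))
    line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
    lines << line unless line.empty?
  end
end
```

After the loop, `lines` holds one valid UTF-8 string, `"data: café"`, even though neither chunk on its own contained a complete `é`.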
@@ -78,14 +78,23 @@ module RubyPi
  rescue RubyPi::AuthenticationError
  # Authentication errors are not retryable — raise immediately
  raise
- rescue RubyPi::RateLimitError, RubyPi::ApiError, RubyPi::TimeoutError, RubyPi::ProviderError => e
+ rescue RubyPi::RateLimitError, RubyPi::ApiError, RubyPi::TimeoutError => e
+ # NOTE: RubyPi::ProviderError is intentionally NOT retried. Provider
+ # errors are overwhelmingly deterministic request-construction
+ # failures (missing tool_call_id, invalid tool-argument JSON, missing
+ # tool name) raised by build_request_body BEFORE any HTTP call. They
+ # produce the identical error on every attempt, so retrying only
+ # burns the backoff schedule before surfacing the same failure.
+ # Fallback wrappers still rescue RubyPi::Error (the ProviderError
+ # superclass), so provider failover is unaffected.
+ #
  # Retry up to max_retries times AFTER the initial attempt.
  # With max_retries: 3, attempt goes 1 (initial), 2, 3, 4 — the condition
  # `attempt <= @max_retries` allows retries on attempts 1..3, so we get
  # 3 retries + 1 initial = 4 total attempts. Previously used `< @max_retries`
  # which was off-by-one (only 2 retries with max_retries: 3).
  if attempt <= @max_retries
- delay = calculate_backoff(attempt)
+ delay = retry_delay_for(e, attempt)
  log_retry(attempt, delay, e)
  sleep(delay)
  retry
@@ -127,6 +136,29 @@ module RubyPi
  raise RubyPi::AbstractMethodError, :perform_complete
  end

+ # Maximum delay (seconds) honored from a server-provided Retry-After
+ # header. Caps pathological or misconfigured server values so a single
+ # 429 cannot stall the client indefinitely.
+ RETRY_AFTER_CEILING = 60.0
+
+ # Picks the delay before the next retry. A server-provided Retry-After
+ # on a 429 takes precedence over the local exponential backoff: the
+ # server knows its own cooldown window, and retrying earlier just burns
+ # the retry budget against guaranteed 429s. Retry-After parsed from an
+ # HTTP-date (rather than delta-seconds) arrives as 0.0 and falls through
+ # to the computed backoff.
+ #
+ # @param error [Exception] the error that triggered the retry
+ # @param attempt [Integer] the current attempt number (1-based)
+ # @return [Float] delay in seconds
+ def retry_delay_for(error, attempt)
+ if error.is_a?(RubyPi::RateLimitError) && error.retry_after&.positive?
+ [error.retry_after, RETRY_AFTER_CEILING].min
+ else
+ calculate_backoff(attempt)
+ end
+ end
+
  # Calculates the backoff delay for a given retry attempt using
  # exponential backoff with jitter.
  #
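The delay selection added in this hunk can be exercised without the gem. The sketch below is an approximation under stated assumptions: `DemoRateLimitError` stands in for `RubyPi::RateLimitError`, and `demo_backoff` is a hypothetical jitter-free exponential backoff, since the real `calculate_backoff` formula is not shown in this diff.

```ruby
# Stand-in for RubyPi::RateLimitError (assumption: it exposes #retry_after).
class DemoRateLimitError < StandardError
  attr_reader :retry_after

  def initialize(message = "rate limited", retry_after: nil)
    super(message)
    @retry_after = retry_after
  end
end

DEMO_RETRY_AFTER_CEILING = 60.0

# Hypothetical backoff without jitter, so the results are deterministic.
def demo_backoff(attempt, base: 1.0, max: 30.0)
  [base * (2**(attempt - 1)), max].min
end

# Mirrors the shape of retry_delay_for: a positive Retry-After wins
# (capped at the ceiling); anything else falls through to the backoff.
def demo_retry_delay(error, attempt)
  if error.is_a?(DemoRateLimitError) && error.retry_after&.positive?
    [error.retry_after, DEMO_RETRY_AFTER_CEILING].min
  else
    demo_backoff(attempt)
  end
end

honored  = demo_retry_delay(DemoRateLimitError.new(retry_after: 12.0), 1)
capped   = demo_retry_delay(DemoRateLimitError.new(retry_after: 300.0), 1)
fallback = demo_retry_delay(DemoRateLimitError.new(retry_after: 0.0), 2)
other    = demo_retry_delay(StandardError.new("timeout"), 3)
```

The `0.0` case corresponds to the HTTP-date situation described in the comment: the value is not positive, so the computed backoff is used instead.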
@@ -16,11 +16,14 @@ module RubyPi
  # Authentication errors are NOT retried with the fallback since they
  # indicate a configuration problem rather than a transient failure.
  #
- # Issue #23: When streaming, the Fallback now buffers deltas from the
- # primary provider. If the primary fails mid-stream, the buffered deltas
- # are discarded and the fallback provider streams fresh from the start.
- # This prevents the consumer from seeing partial output from the primary
- # concatenated with the complete output from the fallback.
+ # Issue #23 + Issue #12: When streaming, events flow from the primary
+ # provider directly to the consumer in real time (no buffering), preserving
+ # the streaming UX on the happy path. If the primary fails mid-stream, a
+ # :fallback_start StreamEvent is emitted before the fallback takes over, so
+ # the consumer can discard any partial output already rendered from the
+ # failed primary. (The agent loop translates :fallback_start into a
+ # :provider_fallback event; raw Fallback consumers should handle
+ # :fallback_start themselves.)
  #
  # @example Setting up a fallback chain
  # primary = RubyPi::LLM.model(:gemini, "gemini-2.0-flash")
@@ -146,6 +149,19 @@ module RubyPi
  # @yield [event] the consumer's streaming block
  # @return [RubyPi::LLM::Response]
  def perform_complete_with_streaming_fallback(messages:, tools:, &block)
+ # Count the characters of text already delivered to the consumer from
+ # the primary. If the primary fails mid-stream AFTER yielding text,
+ # the fallback streams a complete fresh response — a consumer that
+ # merely appends deltas would render the primary's partial text
+ # followed by the full fallback text. The :fallback_start payload
+ # carries partial_output/partial_chars so consumers can deterministically
+ # truncate what they already rendered.
+ partial_chars = 0
+ counting_block = proc do |event|
+ partial_chars += event.data.to_s.length if event.text_delta?
+ block.call(event)
+ end
+
  begin
  # Stream primary events directly to the consumer for real-time UX.
  # No buffering — tokens appear immediately as they arrive.
@@ -153,7 +169,7 @@ module RubyPi
  messages: messages,
  tools: tools,
  stream: true,
- &block
+ &counting_block
  )

  response
@@ -164,12 +180,17 @@ module RubyPi
  log_fallback(e)

  # Signal the consumer that the primary failed mid-stream and a
- # fallback provider is taking over. Consumers should use this event
- # to clear any partial output from the failed primary.
+ # fallback provider is taking over. Consumers MUST use this event
+ # to clear any partial output from the failed primary:
+ # partial_output — true when the primary yielded any text deltas
+ # partial_chars — how many characters were yielded (truncate by
+ # this amount if appending to a shared buffer)
  block.call(StreamEvent.new(type: :fallback_start, data: {
  failed_provider: @primary.provider_name,
  error: e.message,
- fallback_provider: @fallback.provider_name
+ fallback_provider: @fallback.provider_name,
+ partial_output: partial_chars.positive?,
+ partial_chars: partial_chars
  }))

  # Stream directly from the fallback to the consumer's block.
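A consumer-side view of the `:fallback_start` payload might look like the sketch below. It is illustrative only: `DemoEvent` is a stand-in Struct rather than the gem's `StreamEvent`, the `:text_delta` type symbol is an assumption, and `render_stream` is a hypothetical renderer that appends text deltas to a single string.

```ruby
# Stand-in for a stream event (assumption: text deltas carry their text in
# #data, and :fallback_start carries a Hash with the keys from the hunk above).
DemoEvent = Struct.new(:type, :data, keyword_init: true)

# Hypothetical consumer: appends text deltas and, on :fallback_start,
# truncates exactly the characters the failed primary already delivered.
def render_stream(events)
  rendered = +""
  events.each do |event|
    case event.type
    when :text_delta
      rendered << event.data.to_s
    when :fallback_start
      next unless event.data[:partial_output]

      drop = [event.data[:partial_chars].to_i, rendered.length].min
      rendered.slice!(rendered.length - drop, drop) if drop.positive?
    end
  end
  rendered
end

events = [
  DemoEvent.new(type: :text_delta, data: "Partial ans"),
  DemoEvent.new(type: :fallback_start,
                data: { partial_output: true, partial_chars: 11 }),
  DemoEvent.new(type: :text_delta, data: "Full answer")
]
output = render_stream(events)
```

Truncating by `partial_chars` rather than clearing the whole buffer matters when the buffer also holds earlier content that did not come from the failed primary.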
@@ -6,6 +6,7 @@
  # the Gemini REST API for both synchronous and streaming completions, including
  # tool/function calling support.

+ require "json"
  require "securerandom"

  module RubyPi
@@ -305,9 +306,14 @@ module RubyPi
  # which may split SSE events mid-line. We accumulate a line buffer and
  # process complete lines incrementally so that deltas reach the caller
  # as soon as each SSE event is fully received.
- sse_buffer = +""
+ # BINARY buffer: chunks arrive as ASCII-8BIT and may end mid-way
+ # through a multi-byte UTF-8 character; appending such a chunk to a
+ # UTF-8 buffer holding non-ASCII text raises
+ # Encoding::CompatibilityError. Complete lines are re-encoded to
+ # UTF-8 (and scrubbed) before parsing.
+ sse_buffer = (+"").force_encoding(Encoding::BINARY)
  response_status = nil
- error_body = +""
+ error_body = (+"").force_encoding(Encoding::BINARY)

  response = with_transport_errors do
  conn.post(url) do |req|
@@ -324,14 +330,17 @@ module RubyPi
  # If the HTTP status indicates an error, accumulate the body for
  # the error handler instead of parsing it as SSE events.
  if response_status && response_status >= 400
- error_body << chunk
+ error_body << chunk.b
  next
  end

- sse_buffer << chunk
- # Process all complete lines in the buffer
+ sse_buffer << chunk.b
+ # Process all complete lines in the buffer. A complete line holds
+ # complete UTF-8 sequences (multi-byte characters split across
+ # chunks are repaired by the buffering), so re-encode it to UTF-8
+ # here; scrub guards against a server sending invalid bytes.
  while (line_end = sse_buffer.index("\n"))
- line = sse_buffer.slice!(0, line_end + 1).strip
+ line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
  next if line.empty?
  next unless line.start_with?("data: ")
 
@@ -375,8 +384,11 @@ module RubyPi
  # Parse the actual finish reason from the streaming response
  # instead of hardcoding "stop". Gemini sends finishReason in
  # the candidate object (e.g., "STOP", "MAX_TOKENS", "SAFETY").
+ # Coerce via to_s before downcase so a non-String payload can
+ # never raise NoMethodError mid-stream (mirrors the &.to_s in
+ # the non-streaming parse path).
  if candidate["finishReason"]
- finish_reason = candidate["finishReason"].downcase
+ finish_reason = candidate["finishReason"].to_s.downcase
  end

  # Capture usage metadata if present
@@ -397,7 +409,7 @@ module RubyPi
  # callback. Pass the accumulated error_body so ApiError carries the
  # full server message instead of an empty body.
  unless response.success?
- error_body_str = error_body.empty? ? response.body : error_body
+ error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub
  handle_error_response(response, override_body: error_body_str)
  end
 
@@ -450,8 +462,10 @@ module RubyPi
  }
  end

- # Map Gemini finish reason to normalized string
- finish_reason = candidate["finishReason"]&.downcase
+ # Map Gemini finish reason to normalized string. to_s guards against
+ # a non-String payload (mirrors the streaming path); &. keeps a
+ # missing finishReason as nil.
+ finish_reason = candidate["finishReason"]&.to_s&.downcase

  Response.new(
  content: content,
@@ -6,6 +6,8 @@
  # OpenAI Chat Completions API for both synchronous and streaming completions,
  # including function/tool calling support.

+ require "json"
+
  module RubyPi
  module LLM
  # OpenAI provider implementation. Communicates with the OpenAI Chat
@@ -318,9 +320,14 @@ module RubyPi
  # which may split SSE events mid-line. We accumulate a line buffer and
  # process complete lines incrementally so that deltas reach the caller
  # as soon as each SSE event is fully received.
- sse_buffer = +""
+ # BINARY buffer: chunks arrive as ASCII-8BIT and may end mid-way
+ # through a multi-byte UTF-8 character; appending such a chunk to a
+ # UTF-8 buffer holding non-ASCII text raises
+ # Encoding::CompatibilityError. Complete lines are re-encoded to
+ # UTF-8 (and scrubbed) before parsing.
+ sse_buffer = (+"").force_encoding(Encoding::BINARY)
  response_status = nil
- error_body = +""
+ error_body = (+"").force_encoding(Encoding::BINARY)

  response = with_transport_errors do
  conn.post("/v1/chat/completions") do |req|
@@ -337,14 +344,17 @@ module RubyPi
  # If the HTTP status indicates an error, accumulate the body for
  # the error handler instead of parsing it as SSE events.
  if response_status && response_status >= 400
- error_body << chunk
+ error_body << chunk.b
  next
  end

- sse_buffer << chunk
- # Process all complete lines in the buffer
+ sse_buffer << chunk.b
+ # Process all complete lines in the buffer. A complete line holds
+ # complete UTF-8 sequences (multi-byte characters split across
+ # chunks are repaired by the buffering), so re-encode it to UTF-8
+ # here; scrub guards against a server sending invalid bytes.
  while (line_end = sse_buffer.index("\n"))
- line = sse_buffer.slice!(0, line_end + 1).strip
+ line = sse_buffer.slice!(0, line_end + 1).force_encoding(Encoding::UTF_8).scrub.strip
  next if line.empty?
  next unless line.start_with?("data: ")
 
@@ -419,7 +429,7 @@ module RubyPi
  # callback. Pass the accumulated error_body so ApiError carries the
  # full server message instead of an empty body.
  unless response.success?
- error_body_str = error_body.empty? ? response.body : error_body
+ error_body_str = error_body.empty? ? response.body : error_body.force_encoding(Encoding::UTF_8).scrub
  handle_error_response(response, override_body: error_body_str)
  end
 
@@ -6,6 +6,8 @@
  # decides to invoke a tool, it returns one or more ToolCall objects describing
  # which function to call and with what arguments.

+ require "json"
+
  module RubyPi
  module LLM
  # A tool call extracted from an LLM response. Contains the unique call ID,
@@ -37,16 +37,32 @@ module RubyPi
  # @return [Hash] A JSON Schema hash describing the tool's parameters.
  attr_reader :parameters

+ # Tool names must satisfy the strictest provider constraint (Anthropic's
+ # ^[a-zA-Z0-9_-]{1,64}$). Without this guard, a name like "send.email"
+ # registers fine and then 400s on every API request with an opaque
+ # server-side validation error that doesn't point back to the tool.
+ NAME_FORMAT = /\A[a-zA-Z0-9_-]{1,64}\z/
+
  # Creates a new tool definition.
  #
- # @param name [String, Symbol] Unique identifier for the tool.
+ # @param name [String, Symbol] Unique identifier for the tool. Must match
+ # NAME_FORMAT (letters, digits, underscore, hyphen; max 64 chars).
  # @param description [String] What the tool does (shown to the LLM).
  # @param category [Symbol, nil] Optional grouping category.
  # @param parameters [Hash] JSON Schema hash for the tool's input parameters.
- # @yield [Hash] Block that implements the tool logic. Receives a hash of arguments.
- # @raise [ArgumentError] If name or description is missing, or no block given.
+ # @yield [Hash] Block that implements the tool logic. Receives a hash of
+ # symbol-keyed arguments, or keyword arguments if the block declares
+ # keyword parameters (see #call).
+ # @raise [ArgumentError] If name is missing or violates NAME_FORMAT,
+ # description is missing, or no block given.
  def initialize(name:, description:, category: nil, parameters: {}, &block)
  raise ArgumentError, "Tool name is required" if name.nil? || name.to_s.strip.empty?
+ unless name.to_s.match?(NAME_FORMAT)
+ raise ArgumentError,
+ "Tool name #{name.to_s.inspect} is invalid — provider APIs require " \
+ "names matching #{NAME_FORMAT.inspect} (letters, digits, underscore, " \
+ "hyphen; 1-64 characters)"
+ end
  raise ArgumentError, "Tool description is required" if description.nil? || description.strip.empty?
  raise ArgumentError, "Tool implementation block is required" unless block_given?
 
@@ -55,14 +71,33 @@ module RubyPi
  @category = category&.to_sym
  @parameters = parameters
  @implementation = block
+ # On Ruby 3.x a positional Hash is never auto-splatted to keywords, so
+ # a block written `{ |content:, platform:| ... }` — the natural style
+ # given named schema parameters — would fail every call with
+ # "missing keyword". Detect keyword parameters once here and splat in
+ # #call accordingly.
+ @expects_keywords = block.parameters.any? { |type, _| %i[key keyreq keyrest].include?(type) }
  end

  # Invokes the tool with the given arguments.
  #
+ # Blocks may be written either style:
+ # { |args| args[:content] } # single positional Hash
+ # { |content:, platform: "x"| ... } # keyword parameters
+ #
+ # When the block declares keyword parameters, the arguments hash is
+ # splatted to keywords. Note that a keyword-style block without **rest
+ # raises ArgumentError on unexpected keys — strict by design, since the
+ # keys come from the LLM.
+ #
  # @param args [Hash] The arguments to pass to the tool implementation.
  # @return [Object] Whatever the implementation block returns.
  def call(args = {})
- @implementation.call(args)
+ if @expects_keywords
+ @implementation.call(**args)
+ else
+ @implementation.call(args)
+ end
  end

  # Converts this tool definition to Google Gemini function declaration format.
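The two changes to the tool definition class (the name guard and the keyword-aware dispatch) can be reproduced in a reduced form. `DemoTool` below is a hypothetical stand-in that keeps only those two behaviors; it omits description, category, and parameters, and its error message differs from the gem's.

```ruby
# Hypothetical, reduced stand-in for the tool definition class.
class DemoTool
  NAME_FORMAT = /\A[a-zA-Z0-9_-]{1,64}\z/

  def initialize(name:, &block)
    unless name.to_s.match?(NAME_FORMAT)
      raise ArgumentError, "invalid tool name #{name.to_s.inspect}"
    end
    raise ArgumentError, "implementation block is required" unless block

    @implementation = block
    # Proc#parameters reports :keyreq, :key, or :keyrest for keyword params.
    @expects_keywords = block.parameters.any? do |type, _|
      %i[key keyreq keyrest].include?(type)
    end
  end

  def call(args = {})
    if @expects_keywords
      @implementation.call(**args) # splat the Hash into keywords
    else
      @implementation.call(args)   # pass the Hash positionally
    end
  end
end

positional = DemoTool.new(name: "echo") { |args| args[:content] }
keyword = DemoTool.new(name: "post") do |content:, platform: "x"|
  "#{platform}: #{content}"
end

positional_result = positional.call({ content: "hello" })
keyword_result = keyword.call({ content: "hi" })
```

Detecting the block style once in the constructor keeps `#call` cheap, and the splat requires symbol keys, which matches the symbol-keyed arguments described in the `@yield` documentation above.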
@@ -115,7 +115,12 @@ module RubyPi
  end

  # Collect results, respecting the configured timeout for each future.
- futures.map do |future|
+ # Zip each future with its originating call so failure Results carry
+ # the real tool name — with several tools timing out in parallel,
+ # "unknown" Results are indistinguishable in logs and extension events.
+ calls.zip(futures).map do |call, future|
+ tool_name = (call[:name] || call["name"]).to_s
+
  # Issue #10: Wait for the future to complete, then check its state
  # explicitly. Future#value returns nil both on timeout AND when the
  # block legitimately returned nil, so we cannot use || to distinguish.
@@ -128,13 +133,16 @@ module RubyPi
  else
  # Future was rejected (raised an exception within the block).
  # This shouldn't normally happen since execute_single rescues
- # internally, but handle it defensively.
+ # internally, but handle it defensively. The actual run time is
+ # unknown here (the future failed at some point before the wait
+ # elapsed), so report 0.0 rather than a misleading full-timeout
+ # duration for what may have been an instant failure.
  error = future.reason
  Result.new(
- name: "unknown",
+ name: tool_name,
  success: false,
  error: "#{error.class}: #{error.message}",
- duration_ms: @timeout * 1000.0
+ duration_ms: 0.0
  )
  end
  else
@@ -147,9 +155,9 @@ module RubyPi
  future.cancel if future.respond_to?(:cancel)

  Result.new(
- name: "unknown",
+ name: tool_name,
  success: false,
- error: "Tool execution timed out after #{@timeout}s",
+ error: "Tool '#{tool_name}' timed out after #{@timeout}s",
  duration_ms: @timeout * 1000.0
  )
  end
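The zip-with-originating-call idea from the executor hunks can be illustrated with plain stdlib threads. This is a sketch of the same technique, not the gem's executor: it uses `Thread#join(timeout)` instead of futures, and `DemoResult` and `collect_results` are hypothetical names.

```ruby
# Hypothetical result type for the sketch.
DemoResult = Struct.new(:name, :success, :error, keyword_init: true)

# Runs each call in its own thread, then pairs every thread with the call
# that spawned it so timeout results can name the tool.
def collect_results(calls, timeout:)
  threads = calls.map { |call| Thread.new { call[:work].call } }

  calls.zip(threads).map do |call, thread|
    tool_name = (call[:name] || call["name"]).to_s

    if thread.join(timeout) # returns nil when the wait times out
      DemoResult.new(name: tool_name, success: true, error: nil)
    else
      thread.kill
      DemoResult.new(
        name: tool_name,
        success: false,
        error: "Tool '#{tool_name}' timed out after #{timeout}s"
      )
    end
  end
end

calls = [
  { name: "fast_tool", work: -> { :ok } },
  { name: "slow_tool", work: -> { sleep 2 } }
]
results = collect_results(calls, timeout: 0.1)
```

Since `zip` preserves order, each result lines up with its call even when several tools time out in the same batch.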
@@ -13,6 +13,16 @@
  # flag consumed by `.object` to populate the top-level "required" array.
  # It is stripped from the property's own schema hash before inclusion.
  #
+ # IMPORTANT: Schemas are LLM-facing hints, NOT runtime input validation.
+ # Nothing in the execution pipeline validates the model's arguments against
+ # the schema before invoking the tool block: `required`, `enum`, `minimum`,
+ # and type declarations constrain what the model is *asked* to produce, but a
+ # misbehaving model can still omit required fields, send extra keys, or pass
+ # a String where an Integer is declared — no coercion is performed. Tool
+ # blocks should treat their arguments as untrusted input and validate or
+ # coerce what they depend on. (This is deliberate, per the anti-framework
+ # philosophy: validation policy belongs to the tool, not the harness.)
+ #
  # Usage:
  # schema = RubyPi::Schema.object(
  # name: RubyPi::Schema.string("User's name", required: true),
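Since the schema comment above says arguments reach the tool block unvalidated, a tool body would typically check what it depends on. The following is a minimal sketch of that idea; `coerce_count`, the `:count` key, and the 1..10 range are hypothetical and not taken from the gem.

```ruby
# Hypothetical validation helper a tool block might call on its arguments.
# Treats the Hash as untrusted input coming from the model.
def coerce_count(args)
  raw = args[:count]
  raise ArgumentError, "count is required" if raw.nil?

  # Integer(..., exception: false) returns nil instead of raising on bad input.
  count = Integer(raw, exception: false)
  raise ArgumentError, "count must be an integer" if count.nil?
  raise ArgumentError, "count must be between 1 and 10" unless (1..10).cover?(count)

  count
end

from_string = coerce_count({ count: "3" }) # model sent a String for an Integer
from_integer = coerce_count({ count: 7 })
```

Raising `ArgumentError` here is only one possible policy; a tool could equally return an error message for the model to act on.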
@@ -7,5 +7,5 @@
 
  module RubyPi
  # The current version of the RubyPi gem, following Semantic Versioning.
- VERSION = "0.1.6"
+ VERSION = "0.1.8"
  end
data/lib/ruby_pi.rb CHANGED
@@ -82,6 +82,13 @@ module RubyPi
  end
  end

+ # Eagerly initialize the global configuration at load time. The lazy
+ # `@configuration ||= ...` in .configuration is not synchronized; two
+ # threads hitting it concurrently on first access could each construct a
+ # Configuration, with one silently discarded. Initializing here (requires
+ # run single-threaded) removes the race without adding a mutex to every read.
+ @configuration = Configuration.new
+
  # Namespace for large language model providers and related abstractions.
  module LLM
  class << self
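The eager-initialization pattern from this hunk can be shown on a toy module. `DemoConfig` and its `Configuration` class are hypothetical; the point is only that the module-level assignment runs once at load time, so the lazy `||=` in the reader never has to construct anything under concurrency.

```ruby
# Hypothetical module mirroring the shape of the change above.
module DemoConfig
  class Configuration
    attr_accessor :max_retries

    def initialize
      @max_retries = 3
    end
  end

  class << self
    # The lazy fallback stays in place, but it is never the first writer.
    def configuration
      @configuration ||= Configuration.new
    end
  end

  # Runs while the file is being loaded, before any worker threads exist.
  @configuration = Configuration.new
end

object_ids = Array.new(8) do
  Thread.new { DemoConfig.configuration.object_id }
end.map(&:value).uniq
```

Every thread observes the same instance because the instance variable is already set when the first reader arrives.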
metadata CHANGED
@@ -1,13 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: ruby-pi
  version: !ruby/object:Gem::Version
- version: 0.1.6
+ version: 0.1.8
  platform: ruby
  authors:
  - RubyPi Contributors
  bindir: bin
  cert_chain: []
- date: 2026-05-01 00:00:00.000000000 Z
+ date: 1980-01-02 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: faraday
@@ -51,6 +51,20 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: '1.2'
+ - !ruby/object:Gem::Dependency
+ name: json
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '2.0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '2.0'
  - !ruby/object:Gem::Dependency
  name: rspec
  requirement: !ruby/object:Gem::Requirement
@@ -157,7 +171,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.6.2
+ rubygems_version: 3.6.9
  specification_version: 4
  summary: AI agent harness for Ruby — build LLM agents with tool calling, streaming,
  and a unified interface to OpenAI, Anthropic Claude, and Google Gemini.