phronomy 0.9.0 → 0.9.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: d91e0fb85732153a69d268b41bdfe865791dd8f007e8bed983269284478af002
4
- data.tar.gz: c334678280139ac7934b6804b06e282051218472985c022823d26913a3f64905
3
+ metadata.gz: 7e84ccabf84c48e16cdb968c1f7b69f2348b24a70e477aa39bbbe1244d34edfc
4
+ data.tar.gz: f31dc2d1c4ed4bb7717e88278f1ced3debd0177f1f7a8042b170421a5d8e7493
5
5
  SHA512:
6
- metadata.gz: 393567f7c01633ea20160101705b0fde21ddd009a4950f1cb44a106285500b90a3bec88d4c9681cebb7656d0529c09c9e7c52da42e3e12f103231423921b43aa
7
- data.tar.gz: 03f5d2e764df9d3becb782ecdec0bf42f03b0f3fc7414efaad2334fe1d047443ef3180e1993244cad92c305607113d0afe2915caa6ff53d14c05c779a61f6b4b
6
+ metadata.gz: 1c1ab4d05c27930b84abbad09f5c59027f9bfcddf9a89aa485608afdcd22ba50fcf971c2185a815206edfc37b29abb0fa99b7f80a8fa3f436c1d6a97b5ad38e4
7
+ data.tar.gz: 04016a561705ff24c4a6b9f8bb3d6918c303071f7bf97d94d70313b95f796ae561fee29fad9e7e620928655bf7e2007751cfa217bd973d83d2ad4d26d9754e3e
data/CHANGELOG.md CHANGED
@@ -9,6 +9,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
9
9
 
10
10
  ## [Unreleased]
11
11
 
12
+ ---
13
+
14
+ ## [0.9.1] - 2026-06-06
15
+
12
16
  ### Added
13
17
 
14
18
  - **`Phronomy::Diagnostics` and `SchedulerReentrancyError`** (#278, #279):
@@ -174,10 +178,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
174
178
  tasks are treated the same as errors and follow the existing `on_error:` policy (`:raise`
175
179
  or `:skip`).
176
180
 
177
- - **MCP `HttpTransport` custom authentication headers** (#144): `McpTool.from_server` now
178
- accepts `headers: {}`, forwarded all the way to `HttpTransport#initialize`. Arbitrary
179
- headers (e.g. `Authorization: Bearer …`) are injected into every JSON-RPC request,
180
- enabling use of MCP servers that require bearer tokens or API keys.
181
+ - **MCP `HttpTransport` custom authentication headers** (#144): `Phronomy::Tools::Mcp::HttpTransport#initialize`
182
+ now accepts `headers: {}`. Arbitrary headers (e.g. `Authorization: Bearer …`) are injected
183
+ into every JSON-RPC request, enabling use of MCP servers that require bearer tokens or
184
+ API keys. Threading `headers:` through `Mcp.from_server` is tracked in issue #144 and
185
+ pending in PR #151.
181
186
 
182
187
  - **`StdioTransport` — `env:`, `cwd:`, and `startup_timeout:` options** (#145):
183
188
  Three new keyword arguments are now accepted when constructing a `StdioTransport` (and
@@ -226,8 +231,39 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
226
231
  `dispatch_parallel` and `fan_out` accept `cancellation_token:` and automatically
227
232
  inject it into every worker task's config unless the task already supplies its own.
228
233
 
234
+ ### Added (post-v0.9.0)
235
+
236
+ - **`Phronomy::Agent::CheckpointStore` — idempotency store for HITL resume** (post-v0.9.0):
237
+ New in-memory store tracks consumed checkpoint IDs. Calling `Agent::Base#resume` twice
238
+ with the same checkpoint raises `Phronomy::CheckpointAlreadyResumedError` instead of
239
+ silently re-executing the approved tool. Custom stores can be injected via
240
+ `agent.checkpoint_store = MyRedis::CheckpointStore.new`. Duck-type contract:
241
+ `consumed?(id)`, `consume!(id)`, and optionally `cleanup!(id)` / `clear!`.
242
+
243
+ - **`checkpoint_id`, `agent_class`, `requested_at` on `Checkpoint`; `Agent::Base.resume` class method** (post-v0.9.0):
244
+ `Checkpoint` now carries a UUID `checkpoint_id` (idempotency key), `agent_class`
245
+ (fully-qualified class name), and `requested_at` (UTC timestamp). The new class-level
246
+ `Agent::Base.resume(checkpoint, approved:)` method instantiates the correct agent class
247
+ automatically and delegates to `#resume`, simplifying job-queue resume flows.
248
+
249
+ - **`CheckpointStore#cleanup!` and `#clear!`** (post-v0.9.0):
250
+ Optional methods on the `CheckpointStore` duck-type contract. `cleanup!(checkpoint_id)`
251
+ removes a single checkpoint entry; `clear!` wipes all tracking state.
252
+
229
253
  ### Removed
230
254
 
255
+ - **`Phronomy::ReactAgent` class removed** (post-v0.9.0):
256
+ Use `Phronomy::Agent::Base` directly. `ReactAgent` had no distinct public API beyond
257
+ `Agent::Base` and was not listed in the stability table.
258
+
259
+ - **`Phronomy::Agent::FSM` class removed** (post-v0.9.0, internal):
260
+ The agent invocation path is now unified through `Agent::Base#invoke` with inline logic.
261
+ No public API impact.
262
+
263
+ - **`Phronomy::Agent::Lifecycle::FSMSession` and `::PhaseMachineBuilder` moved to `Workflow` namespace** (post-v0.9.0, internal):
264
+ These internal classes now live at `Phronomy::Workflow::FSMSession` and
265
+ `Phronomy::Workflow::PhaseMachineBuilder`. No public API impact.
266
+
231
267
  - **BREAKING: `Agent::Base#run_as_child` drops `&result_writer` block parameter** (#265):
232
268
  The optional block form `run_as_child(input, ctx: ctx) { |r| ctx.answer = r[:output] }`
233
269
  is no longer supported. The result is now delivered **exclusively** as the
data/README.md CHANGED
@@ -76,6 +76,7 @@ It provides composable building blocks — Workflows, Agents, Tools, Guardrails,
76
76
  | **Agent::TeamCoordinator** — Agent teams pattern: LLM coordinator + stateful workers with sequential task assignment (worker-local message history persisted across tasks) | Beta |
77
77
  | **Agent::SharedState** — Shared state pattern: peer agents collaborate via a shared KnowledgeStore; `member` DSL with per-agent instructions and `coordination` team protocol | Experimental |
78
78
  | **`ScopePolicy`** — Configurable policy callable that maps (tool, scope, agent) to `:allow`/`:approve`/`:reject`; default policy auto-routes high-risk scopes through the approval gate | Experimental |
79
+ | **HITL Checkpoint/Resume** — `Agent::Base#invoke` returns `{ suspended: true, checkpoint: Checkpoint }` when an approval-required tool is encountered without a synchronous handler; `Agent::Base#resume(checkpoint, approved:)` resumes execution; `Agent::Base.resume(checkpoint, approved:)` (class-level) resolves the agent class automatically; `Checkpoint#to_h` / `Checkpoint.from_h` for serialization; `Agent::Base#checkpoint_store=` for custom idempotency backends; `CheckpointAlreadyResumedError` raised on duplicate resume | Experimental |
79
80
 
80
81
  > **Public API boundary**: The tables above are the complete list of classes, modules, and features
81
82
  > intended for gem consumers. Every entry has an associated stability label.
@@ -1,6 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  require "securerandom"
4
+ require_relative "checkpoint_store"
4
5
  require_relative "concerns/retryable"
5
6
  require_relative "concerns/guardrailable"
6
7
  require_relative "concerns/before_completion"
@@ -374,6 +375,27 @@ module Phronomy
374
375
  @context_overhead = val.to_i
375
376
  end
376
377
  end
378
+
379
+ # Resumes a suspended invocation identified by +checkpoint+ without
380
+ # requiring the original agent instance to be kept in memory.
381
+ #
382
+ # Validates that the checkpoint was created by this agent class, then
383
+ # instantiates a fresh agent and delegates to {Suspendable#resume}.
384
+ #
385
+ # @param checkpoint [Phronomy::Agent::Checkpoint]
386
+ # @param approved [Boolean] +true+ to execute the pending tool; +false+ to deny
387
+ # @param config [Hash] same runtime options as {#invoke}
388
+ # @return [Hash] same shape as {#invoke} — may contain +suspended: true+ if
389
+ # another approval-required tool is encountered during continuation
390
+ # @raise [ArgumentError] when +checkpoint.agent_class+ does not match this class
391
+ # @api public
392
+ def resume(checkpoint, approved:, config: {})
393
+ if checkpoint.agent_class && checkpoint.agent_class != name
394
+ raise ArgumentError,
395
+ "checkpoint belongs to #{checkpoint.agent_class}, cannot resume with #{name}"
396
+ end
397
+ new.resume(checkpoint, approved: approved, config: config)
398
+ end
377
399
  end
378
400
 
379
401
  # Registers an anonymous handoff tool class on this agent instance.
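Reviewer note on the class-level `resume` hunk above: as shown, the method compares `checkpoint.agent_class` with the receiver's `name` and raises `ArgumentError` on a mismatch. A job worker that holds only a deserialized checkpoint would therefore presumably need to look up the recorded class before calling it. The sketch below illustrates that flow with stand-in classes; `SketchCheckpoint`, `SketchAgent`, `ReviewAgent`, and `resume_from_job` are hypothetical names and not part of phronomy.

```ruby
# Stand-ins for illustration only; none of these classes come from phronomy.
SketchCheckpoint = Struct.new(:agent_class, :checkpoint_id, keyword_init: true)

class SketchAgent
  # Mirrors the guard shown in the hunk: refuse checkpoints created by another class.
  def self.resume(checkpoint, approved:, config: {})
    if checkpoint.agent_class && checkpoint.agent_class != name
      raise ArgumentError,
        "checkpoint belongs to #{checkpoint.agent_class}, cannot resume with #{name}"
    end
    new.resume(checkpoint, approved: approved, config: config)
  end

  # Placeholder for the instance-level resume; returns a result-shaped hash.
  def resume(checkpoint, approved:, config: {})
    {output: approved ? "ran tool" : "denied", suspended: false}
  end
end

class ReviewAgent < SketchAgent; end

# Hypothetical job-side helper: resolve the class recorded on the checkpoint,
# then call the class-level resume on that class.
def resume_from_job(checkpoint, approved:)
  agent_class = Object.const_get(checkpoint.agent_class)
  agent_class.resume(checkpoint, approved: approved)
end

checkpoint = SketchCheckpoint.new(agent_class: "ReviewAgent", checkpoint_id: "abc")
result = resume_from_job(checkpoint, approved: true)
```

In a real worker the class name should probably be checked against an allowlist before it reaches `const_get`, since it comes from serialized data.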
@@ -442,12 +464,35 @@ module Phronomy
442
464
  if invocation_context
443
465
  thread_id, config = _apply_invocation_context(thread_id, config, invocation_context)
444
466
  end
445
- if Phronomy.configuration.event_loop
446
- _invoke_via_event_loop(input, messages: messages, thread_id: thread_id, config: config)
447
- else
448
- _check_scheduler_reentrancy
449
- invoke_async(input, messages: messages, thread_id: thread_id, config: config).await
467
+ _check_scheduler_reentrancy
468
+
469
+ timeout_sec = self.class.invoke_timeout
470
+ unless timeout_sec
471
+ return invoke_async(input, messages: messages, thread_id: thread_id, config: config).await
472
+ end
473
+
474
+ # invoke_timeout: create a CancellationScope with deadline, pass its token
475
+ # to the async invocation, and use scope.pop_queue so the calling thread
476
+ # unblocks as soon as either the result arrives or the deadline fires.
477
+ scope = Phronomy::Concurrency::CancellationScope.new(parent_token: config[:cancellation_token])
478
+ scope.deadline_in(timeout_sec)
479
+ effective_config = config.merge(cancellation_token: scope.token)
480
+ task = invoke_async(input, messages: messages, thread_id: thread_id, config: effective_config)
481
+
482
+ # Bridge the task result to an AsyncQueue so scope.pop_queue can observe the deadline.
483
+ completion_queue = Phronomy::Concurrency::AsyncQueue.new
484
+ Phronomy::Runtime.instance.spawn(name: "invoke-timeout-bridge:#{(self.class.name || "agent").downcase}") do
485
+ completion_queue.push(task.await)
486
+ rescue => e
487
+ completion_queue.push(e)
488
+ end
489
+
490
+ result = scope.pop_queue(completion_queue) do
491
+ raise Phronomy::TimeoutError,
492
+ "Agent #{self.class.name} invoke timed out after #{timeout_sec}s"
450
493
  end
494
+ raise result if result.is_a?(Exception)
495
+ result
451
496
  end
452
497
 
453
498
  # Invokes this agent asynchronously and returns a {Phronomy::Task}.
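Reviewer note on the `invoke_timeout` hunk above: the new code bridges the async task into a queue, waits on that queue with a deadline, and re-raises any exception that was pushed instead of a result. The sketch below is a plain-Ruby analogue of that pattern, not the gem's implementation. `Thread` and `Queue` stand in for the task, `AsyncQueue`, and `CancellationScope`, and `call_with_deadline` and `SketchTimeoutError` are hypothetical names.

```ruby
# Plain-Ruby analogue of the bridge pattern; no phronomy classes are used here.
class SketchTimeoutError < StandardError; end

def call_with_deadline(timeout_sec, &work)
  completion_queue = Queue.new

  # Bridge: push either the result or the raised error onto the queue.
  worker = Thread.new do
    completion_queue.push(work.call)
  rescue => e
    completion_queue.push(e)
  end

  # Thread#join returns nil when the limit elapses before the thread finishes.
  unless worker.join(timeout_sec)
    worker.kill
    raise SketchTimeoutError, "work timed out after #{timeout_sec}s"
  end

  # Errors travel through the queue as values and are re-raised on the caller side.
  result = completion_queue.pop
  raise result if result.is_a?(Exception)
  result
end

fast = call_with_deadline(1) { 21 * 2 }
```

One difference worth noting: the sketch kills the worker thread on timeout, whereas the hunk passes a cancellation token so the invocation can stop cooperatively.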
@@ -522,15 +567,18 @@ module Phronomy
522
567
  "Enable with: Phronomy.configure { |c| c.event_loop = true }"
523
568
  end
524
569
 
525
- fsm = Agent::FSM.new(
526
- agent: self,
527
- input: input,
528
- messages: messages,
529
- thread_id: "#{ctx.thread_id}_agent_#{SecureRandom.uuid}",
530
- config: config,
531
- parent_id: ctx.thread_id
532
- )
533
- Phronomy::EventLoop.instance.enqueue_child(fsm)
570
+ parent_id = ctx.thread_id
571
+ thread_id = "#{parent_id}_agent_#{SecureRandom.uuid}"
572
+ Phronomy::Runtime.instance.spawn(name: "agent-child:#{thread_id}") do
573
+ result = _invoke_impl(input, messages: messages, thread_id: thread_id, config: config)
574
+ Phronomy::EventLoop.instance.post(
575
+ Phronomy::Event.new(type: :child_completed, target_id: parent_id, payload: result)
576
+ )
577
+ rescue => e
578
+ Phronomy::EventLoop.instance.post(
579
+ Phronomy::Event.new(type: :child_failed, target_id: parent_id, payload: e)
580
+ )
581
+ end
534
582
  nil
535
583
  end
536
584
 
@@ -539,8 +587,8 @@ module Phronomy
539
587
  #
540
588
  # Events emitted (in order):
541
589
  # :token — each content delta from the LLM
542
- # :tool_call — when the LLM requests a tool (ReactAgent subclasses only)
543
- # :tool_result — after a tool completes (ReactAgent subclasses only)
590
+ # :tool_call — when the LLM requests a tool
591
+ # :tool_result — after a tool completes
544
592
  # :done — final event carrying output, messages, and usage
545
593
  # :error — if an unrecoverable error occurs
546
594
  #
@@ -587,42 +635,6 @@ module Phronomy
587
635
  [effective_thread_id, effective_config]
588
636
  end
589
637
 
590
- def _invoke_via_event_loop(input, messages:, thread_id:, config:)
591
- if Phronomy::EventLoop.current?
592
- raise Phronomy::Error,
593
- "Cannot call Agent#invoke (EventLoop mode) from within an EventLoop " \
594
- "entry action. Use agent.run_as_child(input, ctx: ctx) instead."
595
- end
596
-
597
- timeout_sec = self.class.invoke_timeout
598
- effective_config, scope = if timeout_sec
599
- s = Phronomy::Concurrency::CancellationScope.new(parent_token: config[:cancellation_token])
600
- s.deadline_in(timeout_sec)
601
- [config.merge(cancellation_token: s.token), s]
602
- else
603
- [config, nil]
604
- end
605
-
606
- fsm = Agent::FSM.new(
607
- agent: self,
608
- input: input,
609
- messages: messages,
610
- thread_id: thread_id || SecureRandom.uuid,
611
- config: effective_config
612
- )
613
- completion_queue = Phronomy::EventLoop.instance.register(fsm)
614
- result = if scope
615
- scope.pop_queue(completion_queue) do
616
- raise Phronomy::TimeoutError,
617
- "Agent #{self.class.name} invoke timed out after #{timeout_sec}s"
618
- end
619
- else
620
- completion_queue.pop
621
- end
622
- raise result if result.is_a?(Exception)
623
- result
624
- end
625
-
626
638
  def _check_scheduler_reentrancy
627
639
  return unless Phronomy::Task.current
628
640
 
@@ -851,12 +863,30 @@ module Phronomy
851
863
  # wrap it in a retry loop without duplicating the LLM interaction logic.
852
864
  def invoke_once(input, messages: [], thread_id: nil, config: {})
853
865
  trace("agent.invoke", input: input, **_build_caller_meta(config)) do |_span|
854
- Agent::InvocationPipeline.new(self).run(
866
+ run_input_guardrails!(input)
867
+
868
+ user_message = extract_message(input)
869
+ chat = build_chat
870
+ context = build_context(
855
871
  input,
856
- messages: messages,
857
- thread_id: thread_id,
858
- config: config
872
+ messages: messages, thread_id: thread_id, config: config,
873
+ budget: build_token_budget, instruction: build_instructions(input),
874
+ tools: self.class.tools + _handoff_tools
859
875
  )
876
+ _apply_context_to_chat(chat, context)
877
+
878
+ run_before_completion_hooks!(chat, config)
879
+ _register_suspension_hook!(chat)
880
+ check_cancellation!(config, "invocation cancelled before LLM call")
881
+
882
+ result, usage = _complete_with_suspension_guard(
883
+ chat, user_message, config,
884
+ thread_id: thread_id, original_input: input
885
+ )
886
+ next [result, usage] if result[:suspended]
887
+
888
+ run_output_guardrails!(result[:output])
889
+ [result, usage]
860
890
  end
861
891
  end
862
892
 
@@ -877,6 +907,36 @@ module Phronomy
877
907
  context[:messages].each { |msg| chat.messages << msg }
878
908
  end
879
909
 
910
+ # Submits the LLM call via LLMAdapter and handles SuspendSignal.
911
+ # Sets/clears the chat cancellation token around the call so that
912
+ # ParallelToolChat can observe cancellation without Thread.current.
913
+ # Returns [result_hash, usage_or_nil].
914
+ def _complete_with_suspension_guard(chat, user_message, config, thread_id:, original_input:)
915
+ chat.cancellation_token = config[:cancellation_token] if chat.respond_to?(:cancellation_token=)
916
+ begin
917
+ adapter = Phronomy.configuration.llm_adapter
918
+ response = adapter.complete_async(chat, user_message, config: config).await
919
+ rescue SuspendSignal => signal
920
+ checkpoint = Checkpoint.new(
921
+ checkpoint_id: SecureRandom.uuid,
922
+ agent_class: self.class.name,
923
+ requested_at: Time.now.utc,
924
+ thread_id: thread_id,
925
+ original_input: original_input,
926
+ messages: chat.messages.dup,
927
+ pending_tool_name: signal.tool_name,
928
+ pending_tool_args: signal.args,
929
+ pending_tool_call_id: signal.tool_call_id
930
+ )
931
+ return [{output: nil, suspended: true, checkpoint: checkpoint, messages: chat.messages}, nil]
932
+ ensure
933
+ chat.cancellation_token = nil if chat.respond_to?(:cancellation_token=)
934
+ end
935
+ output = response.content
936
+ usage = Phronomy::TokenUsage.from_tokens(response.tokens)
937
+ [{output: output, messages: chat.messages, usage: usage}, usage]
938
+ end
939
+
880
940
  def _drain_stream(chat, user_message, config, &block)
881
941
  adapter = Phronomy.configuration.llm_adapter
882
942
  chunk_queue = Phronomy::Concurrency::AsyncQueue.new(max_size: Phronomy.configuration.stream_queue_max_size)
@@ -920,12 +980,12 @@ module Phronomy
920
980
  end
921
981
 
922
982
  # Returns the chat class to instantiate for this invocation.
923
- # When EventLoop mode is enabled ({Phronomy.configuration.event_loop}),
983
+ # When {Phronomy.configuration.parallel_tool_execution} is true,
924
984
  # returns {ParallelToolChat} so that concurrent tool dispatch is enabled.
925
985
  # Falls back to +nil+ otherwise, signalling {#build_chat} to use the
926
986
  # standard +RubyLLM.chat+ factory.
927
987
  def build_chat_class
928
- Phronomy.configuration.event_loop ? Phronomy::MultiAgent::ParallelToolChat : nil
988
+ Phronomy.configuration.parallel_tool_execution ? Phronomy::MultiAgent::ParallelToolChat : nil
929
989
  end
930
990
 
931
991
  def build_chat
@@ -1,5 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
+ require "securerandom"
4
+
3
5
  module Phronomy
4
6
  module Agent
5
7
  # Encapsulates the suspended state of an agent invocation.
@@ -19,6 +21,18 @@ module Phronomy
19
21
  # end
20
22
  # puts result[:output]
21
23
  class Checkpoint
24
+ # @return [String] a globally unique identifier for this checkpoint;
25
+ # used as an idempotency key when guarding against duplicate resumes
26
+ attr_reader :checkpoint_id
27
+
28
+ # @return [String, nil] the fully-qualified name of the agent class that
29
+ # created this checkpoint (e.g. +"MyApp::ReviewAgent"+); used by the
30
+ # class-level +resume+ method to validate the correct agent is used
31
+ attr_reader :agent_class
32
+
33
+ # @return [Time] the UTC timestamp when this checkpoint was created
34
+ attr_reader :requested_at
35
+
22
36
  # @return [String, nil] the thread_id from the invocation config
23
37
  attr_reader :thread_id
24
38
 
@@ -41,6 +55,9 @@ module Phronomy
41
55
  # inject the tool result message on resume)
42
56
  attr_reader :pending_tool_call_id
43
57
 
58
+ # @param checkpoint_id [String] unique identifier; defaults to a new UUID
59
+ # @param agent_class [String, nil] fully-qualified agent class name
60
+ # @param requested_at [Time] when the checkpoint was created; defaults to +Time.now.utc+
44
61
  # @param thread_id [String, nil]
45
62
  # @param original_input [String, Hash] the input passed to the original #invoke call
46
63
  # @param messages [Array<RubyLLM::Message>]
@@ -48,7 +65,11 @@ module Phronomy
48
65
  # @param pending_tool_args [Hash]
49
66
  # @param pending_tool_call_id [String]
50
67
  # @api public
51
- def initialize(thread_id:, original_input:, messages:, pending_tool_name:, pending_tool_args:, pending_tool_call_id:)
68
+ def initialize(thread_id:, original_input:, messages:, pending_tool_name:, pending_tool_args:, pending_tool_call_id:,
69
+ checkpoint_id: SecureRandom.uuid, agent_class: nil, requested_at: Time.now.utc)
70
+ @checkpoint_id = checkpoint_id
71
+ @agent_class = agent_class
72
+ @requested_at = requested_at
52
73
  @thread_id = thread_id
53
74
  @original_input = original_input
54
75
  @messages = messages.dup.freeze
@@ -71,6 +92,9 @@ module Phronomy
71
92
  # @api public
72
93
  def to_h
73
94
  {
95
+ checkpoint_id: @checkpoint_id,
96
+ agent_class: @agent_class,
97
+ requested_at: @requested_at&.iso8601,
74
98
  thread_id: @thread_id,
75
99
  original_input: @original_input,
76
100
  messages: @messages.map { |m| serialize_message(m) },
@@ -99,7 +123,12 @@ module Phronomy
99
123
  end
100
124
  }
101
125
  messages = Array(h[:messages]).map { |m| deserialize_message(m) }
126
+ requested_at_raw = h[:requested_at]
127
+ requested_at = requested_at_raw ? Time.parse(requested_at_raw.to_s).utc : nil
102
128
  new(
129
+ checkpoint_id: h[:checkpoint_id]&.to_s || SecureRandom.uuid,
130
+ agent_class: h[:agent_class]&.to_s,
131
+ requested_at: requested_at || Time.now.utc,
103
132
  thread_id: h[:thread_id],
104
133
  original_input: h[:original_input],
105
134
  messages: messages,
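Reviewer note on the `to_h` / `from_h` hunks above: `requested_at` is written with `iso8601` and restored with `Time.parse(...).utc`. The standalone sketch below walks through that round trip using only the Ruby standard library. The `payload` hash is a hypothetical stand-in for the serialized checkpoint. It suggests that a restored timestamp is accurate to the whole second, because `iso8601` emits no fractional seconds by default.

```ruby
require "time"          # Time#iso8601 and Time.parse
require "securerandom"

created_at = Time.now.utc
payload = {
  checkpoint_id: SecureRandom.uuid,
  requested_at: created_at.iso8601 # no fractional seconds by default
}

# Restore the timestamp the same way the from_h hunk does.
restored_at = Time.parse(payload[:requested_at].to_s).utc

# Sub-second precision is gone, so compare at whole-second resolution.
same_second = restored_at.to_i == created_at.to_i
```

Callers that compare `requested_at` before and after serialization may want to compare at second resolution for this reason.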
@@ -0,0 +1,97 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Phronomy
4
+ module Agent
5
+ # Default in-memory idempotency store for {Checkpoint} resume operations.
6
+ #
7
+ # Tracks consumed checkpoint IDs so that calling {Agent::Base#resume} twice
8
+ # with the same checkpoint raises {Phronomy::CheckpointAlreadyResumedError}
9
+ # instead of silently executing the approved tool a second time.
10
+ #
11
+ # This implementation is *not thread-safe*. It assumes a single agent instance
12
+ # is accessed from only one thread at a time, which is the expected usage pattern.
13
+ # Agent instances themselves are not thread-safe (state like +@messages+, +@config+
14
+ # is not protected), so concurrent calls to the same agent instance are unsupported.
15
+ #
16
+ # Each agent instance gets its own store by default, so no sharing occurs unless
17
+ # the caller explicitly assigns the same store object to multiple agents.
18
+ #
19
+ # For distributed environments (multiple processes or background jobs), swap this
20
+ # for a custom implementation backed by Redis, ActiveRecord, or another shared store.
21
+ # *Your custom store implementation is responsible for ensuring thread-safety* if
22
+ # your application shares the same store instance across multiple threads.
23
+ #
24
+ # @example Plugging in a custom store
25
+ # agent = MyAgent.new
26
+ # agent.checkpoint_store = MyRedis::CheckpointStore.new
27
+ #
28
+ # @example Duck-type contract required by any replacement
29
+ # # consumed?(checkpoint_id) => Boolean
30
+ # # consume!(checkpoint_id) => void; raises CheckpointAlreadyResumedError if duplicate
31
+ # # cleanup!(checkpoint_id) => void (optional); removes tracking for the checkpoint
32
+ # # clear! => void (optional); removes all tracked checkpoints
33
+ #
34
+ # @api public
35
+ class CheckpointStore
36
+ def initialize
37
+ @consumed = Set.new
38
+ end
39
+
40
+ # Returns +true+ if the given checkpoint ID has already been consumed.
41
+ #
42
+ # @param checkpoint_id [String]
43
+ # @return [Boolean]
44
+ # @api public
45
+ def consumed?(checkpoint_id)
46
+ @consumed.include?(checkpoint_id)
47
+ end
48
+
49
+ # Marks +checkpoint_id+ as consumed, or raises if it was already consumed.
50
+ #
51
+ # @param checkpoint_id [String]
52
+ # @raise [Phronomy::CheckpointAlreadyResumedError]
53
+ # @return [void]
54
+ # @api public
55
+ def consume!(checkpoint_id)
56
+ if @consumed.include?(checkpoint_id)
57
+ raise Phronomy::CheckpointAlreadyResumedError,
58
+ "checkpoint #{checkpoint_id} has already been resumed"
59
+ end
60
+ @consumed.add(checkpoint_id)
61
+ nil
62
+ end
63
+
64
+ # Removes tracking for a specific checkpoint ID.
65
+ #
66
+ # Use this to explicitly discard a checkpoint when the application
67
+ # determines it is no longer needed (e.g., user abandons an approval
68
+ # workflow).
69
+ #
70
+ # This method is optional in the duck-type contract. Custom store
71
+ # implementations may choose not to implement it.
72
+ #
73
+ # @param checkpoint_id [String]
74
+ # @return [void]
75
+ # @api public
76
+ def cleanup!(checkpoint_id)
77
+ @consumed.delete(checkpoint_id)
78
+ nil
79
+ end
80
+
81
+ # Removes all tracked checkpoint IDs.
82
+ #
83
+ # Use this for test cleanup, periodic maintenance, or application
84
+ # shutdown.
85
+ #
86
+ # This method is optional in the duck-type contract. Custom store
87
+ # implementations may choose not to implement it.
88
+ #
89
+ # @return [void]
90
+ # @api public
91
+ def clear!
92
+ @consumed.clear
93
+ nil
94
+ end
95
+ end
96
+ end
97
+ end
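Reviewer note on the new `CheckpointStore` file above: its comments state that the default store is not thread-safe and that custom stores own their thread-safety. The sketch below shows one way a replacement could meet the same duck-type contract under a `Mutex`. It is an illustration under assumptions, not code from the gem: `SynchronizedCheckpointStore` is hypothetical, and `SketchAlreadyResumedError` stands in for `Phronomy::CheckpointAlreadyResumedError` so the example runs on its own.

```ruby
require "set"

# Stand-in for Phronomy::CheckpointAlreadyResumedError so the sketch runs alone.
class SketchAlreadyResumedError < StandardError; end

# Hypothetical store: same duck-type contract, guarded by a Mutex.
class SynchronizedCheckpointStore
  def initialize
    @consumed = Set.new
    @lock = Mutex.new
  end

  def consumed?(checkpoint_id)
    @lock.synchronize { @consumed.include?(checkpoint_id) }
  end

  # Check-and-mark happens under one lock so two threads cannot both succeed.
  def consume!(checkpoint_id)
    @lock.synchronize do
      # Set#add? returns nil when the element was already present.
      unless @consumed.add?(checkpoint_id)
        raise SketchAlreadyResumedError,
          "checkpoint #{checkpoint_id} has already been resumed"
      end
    end
    nil
  end

  def cleanup!(checkpoint_id)
    @lock.synchronize { @consumed.delete(checkpoint_id) }
    nil
  end

  def clear!
    @lock.synchronize { @consumed.clear }
    nil
  end
end

# Eight threads race to consume the same checkpoint ID.
store = SynchronizedCheckpointStore.new
threads = Array.new(8) do
  Thread.new do
    store.consume!("cp-1")
    :resumed
  rescue SketchAlreadyResumedError
    :duplicate
  end
end
outcomes = threads.map(&:value)
```

A store shared across processes would need the same atomic check-and-mark from its backend rather than from an in-process lock.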
@@ -49,7 +49,7 @@ module Phronomy
49
49
 
50
50
  private
51
51
 
52
- # Retry loop for #invoke. Separated so that ReactAgent can override #invoke_once.
52
+ # Retry loop for #invoke.
53
53
  def _invoke_impl(input, messages: [], thread_id: nil, config: {})
54
54
  # Fail fast when the token is already cancelled before any LLM call.
55
55
  if (token = config[:cancellation_token]) && token.cancelled?
@@ -1,5 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
+ require "securerandom"
4
+
3
5
  module Phronomy
4
6
  module Agent
5
7
  module Concerns
@@ -47,6 +49,23 @@ module Phronomy
47
49
  @scope_policy = policy
48
50
  end
49
51
 
52
+ # Sets the idempotency store used to guard against duplicate resumes.
53
+ #
54
+ # The store must respond to:
55
+ # - +consumed?(checkpoint_id)+ ⇒ Boolean
56
+ # - +consume!(checkpoint_id)+ ⇒ void; raises {Phronomy::CheckpointAlreadyResumedError} on duplicate
57
+ #
58
+ # Defaults to a per-instance {Phronomy::Agent::CheckpointStore} (in-memory, not thread-safe).
59
+ # Assign a shared persistent store when resuming across processes (e.g. Redis-backed).
60
+ # Custom stores are responsible for ensuring thread-safety if shared across threads.
61
+ #
62
+ # @param store [#consumed?, #consume!]
63
+ # @return [void]
64
+ # @api public
65
+ def checkpoint_store=(store)
66
+ @checkpoint_store = store
67
+ end
68
+
50
69
  # Resumes a previously suspended invocation from a {Phronomy::Agent::Checkpoint}.
51
70
  #
52
71
  # This method reconstructs the conversation state captured at suspension
@@ -59,9 +78,14 @@ module Phronomy
59
78
  # to inject a denial message and let the LLM handle it gracefully
60
79
  # @param config [Hash] same runtime options as #invoke
61
80
  # @return [Hash] +{ output: String, suspended: false, messages: Array, usage: Phronomy::TokenUsage }+
81
+ # or +{ output: nil, suspended: true, checkpoint: Phronomy::Agent::Checkpoint, messages: Array }+
82
+ # when a second approval-required tool is encountered during continuation
62
83
  # @raise [Phronomy::GuardrailError] when an output guardrail rejects the value
84
+ # @raise [Phronomy::CheckpointAlreadyResumedError] when the checkpoint has already been consumed
63
85
  # @api private
64
86
  def resume(checkpoint, approved:, config: {})
87
+ # Guard against duplicate resumes using the idempotency store.
88
+ _checkpoint_store.consume!(checkpoint.checkpoint_id)
65
89
  # Build a fresh chat with all tools registered.
66
90
  chat = build_chat
67
91
 
@@ -91,8 +115,30 @@ module Phronomy
91
115
  tool_call_id: checkpoint.pending_tool_call_id
92
116
  )
93
117
 
94
- # Continue the React loop.
95
- response = chat.complete
118
+ # Re-register the suspension hook so that any further requires_approval
119
+ # tools encountered during continuation are intercepted rather than
120
+ # executed without approval (cascading / chained approval scenario).
121
+ _register_suspension_hook!(chat)
122
+
123
+ # Continue the LLM loop. Rescue SuspendSignal so that a second
124
+ # approval-required tool produces a new checkpoint instead of running
125
+ # without consent.
126
+ begin
127
+ response = chat.complete
128
+ rescue SuspendSignal => signal
129
+ new_checkpoint = Checkpoint.new(
130
+ checkpoint_id: SecureRandom.uuid,
131
+ agent_class: self.class.name,
132
+ requested_at: Time.now.utc,
133
+ thread_id: checkpoint.thread_id,
134
+ original_input: checkpoint.original_input,
135
+ messages: chat.messages.dup,
136
+ pending_tool_name: signal.tool_name,
137
+ pending_tool_args: signal.args,
138
+ pending_tool_call_id: signal.tool_call_id
139
+ )
140
+ return {output: nil, suspended: true, checkpoint: new_checkpoint, messages: chat.messages}
141
+ end
96
142
 
97
143
  output = response.content
98
144
  usage = Phronomy::TokenUsage.from_tokens(response.tokens)
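Reviewer note on the `resume` hunk above: a continuation can now suspend again and return a new checkpoint with its own `checkpoint_id`, while the checkpoint that was just resumed has been consumed. Callers therefore likely need a loop that always resumes with the latest checkpoint. The sketch below shows that calling pattern against a stand-in agent. `ChainAgent` and `ChainCheckpoint` are hypothetical and only imitate the result shape described in the diff.

```ruby
require "securerandom"

# Stand-ins for illustration; the real agent and checkpoint come from phronomy.
ChainCheckpoint = Struct.new(:checkpoint_id, :pending_tool_name, :remaining, keyword_init: true)

class ChainAgent
  def initialize(tools)
    @tools = tools
  end

  # Suspends at the first approval-required tool.
  def invoke(_input)
    suspend_or_finish(@tools)
  end

  # Each resume handles one pending tool, then may suspend again on the next.
  def resume(checkpoint, approved:)
    suspend_or_finish(checkpoint.remaining)
  end

  private

  def suspend_or_finish(tools)
    return {output: "done", suspended: false} if tools.empty?

    checkpoint = ChainCheckpoint.new(
      checkpoint_id: SecureRandom.uuid,
      pending_tool_name: tools.first,
      remaining: tools.drop(1)
    )
    {output: nil, suspended: true, checkpoint: checkpoint}
  end
end

agent = ChainAgent.new(%w[delete_file send_email])
approvals = []
result = agent.invoke("clean up and notify")

# Keep resuming with the newest checkpoint until the run completes.
while result[:suspended]
  checkpoint = result[:checkpoint]
  approvals << checkpoint.pending_tool_name
  result = agent.resume(checkpoint, approved: true) # a real flow would ask a human here
end
```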
@@ -129,6 +175,15 @@ module Phronomy
129
175
  end
130
176
  end
131
177
  end
178
+
179
+ # Returns the checkpoint idempotency store for this instance, lazily
180
+ # initialising a default in-memory {Phronomy::Agent::CheckpointStore}.
181
+ #
182
+ # @return [#consumed?, #consume!]
183
+ # @api private
184
+ def _checkpoint_store
185
+ @checkpoint_store ||= CheckpointStore.new
186
+ end
132
187
  end
133
188
  end
134
189
  end
@@ -33,6 +33,18 @@ module Phronomy
33
33
  # @see Phronomy::EventLoop
34
34
  attr_accessor :event_loop
35
35
 
36
+ # When true, agent LLM calls use {Phronomy::MultiAgent::ParallelToolChat}
37
+ # for concurrent tool dispatch within a single agent turn.
38
+ # Defaults to false.
39
+ #
40
+ # Previously, this was automatically enabled when +event_loop+ was true.
41
+ # As of Phase 3, +parallel_tool_execution+ is a separate setting that must
42
+ # be explicitly enabled.
43
+ # @example
44
+ # Phronomy.configure { |c| c.parallel_tool_execution = true }
45
+ # @return [Boolean]
46
+ attr_accessor :parallel_tool_execution
47
+
36
48
  # When true, user input and LLM output are recorded in trace spans.
37
49
  # Defaults to false; set to true only in environments where PII capture is acceptable.
38
50
  # Set to false in privacy-sensitive environments to prevent PII from reaching
@@ -186,6 +198,7 @@ module Phronomy
186
198
  @tracer = Phronomy::Tracing::NullTracer.new
187
199
  @trace_pii = false
188
200
  @event_loop = false
201
+ @parallel_tool_execution = false
189
202
  @event_loop_stop_grace_seconds = 5
190
203
  @llm_adapter = Phronomy::LLMAdapter::RubyLLM.new
191
204
  @backpressure = :wait
@@ -129,7 +129,7 @@ module Phronomy
129
129
  # (WorkflowContext) once the workflow finishes or halts. If an error occurred,
130
130
  # the popped value will be an Exception — callers are responsible for re-raising it.
131
131
  #
132
- # @param fsm_session [Phronomy::Agent::Lifecycle::FSMSession]
132
+ # @param fsm_session [Phronomy::Workflow::FSMSession]
133
133
  # @return [Phronomy::Concurrency::AsyncQueue] resolves to final/halted context, or an Exception
134
134
  # @api private
135
135
  def register(fsm_session)
@@ -150,23 +150,6 @@ module Phronomy
150
150
  completion_queue
151
151
  end
152
152
 
153
- # Enqueues an {AgentFSM} as a fire-and-forget child session.
154
- #
155
- # Unlike {#register}, this method:
156
- # - Is safe to call from the EventLoop thread (entry actions).
157
- # - Does NOT block — no completion queue is created.
158
- # - Delegates `:finished`/`:error` cleanup to the EventLoop via posted events.
159
- #
160
- # @param agent_fsm [Phronomy::Agent::FSM]
161
- # @return [nil]
162
- # @api private
163
- def enqueue_child(agent_fsm)
164
- @queue.push([Event.new(type: :start, target_id: agent_fsm.id,
165
- payload: {session: agent_fsm, completion: nil}),
166
- Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)])
167
- nil
168
- end
169
-
170
153
  # Posts an event to the loop. Safe to call from any thread (including IO threads).
171
154
  # The current monotonic clock time is recorded so that the EventLoop can
172
155
  # measure the dispatch lag when it dequeues the event.