RubyGems - phronomy - Versions diffs - 0.9.0 → 0.9.1 - Mend

phronomy 0.9.0 → 0.9.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

checksums.yaml +4 -4
data/CHANGELOG.md +40 -4
data/README.md +1 -0
data/lib/phronomy/agent/base.rb +118 -58
data/lib/phronomy/agent/checkpoint.rb +30 -1
data/lib/phronomy/agent/checkpoint_store.rb +97 -0
data/lib/phronomy/agent/concerns/retryable.rb +1 -1
data/lib/phronomy/agent/concerns/suspendable.rb +57 -2
data/lib/phronomy/configuration.rb +13 -0
data/lib/phronomy/event_loop.rb +1 -18
data/lib/phronomy/tools/agent.rb +2 -3
data/lib/phronomy/version.rb +1 -1
data/lib/phronomy/workflow/fsm_session.rb +249 -0
data/lib/phronomy/workflow/phase_machine_builder.rb +247 -0
data/lib/phronomy/workflow_runner.rb +2 -2
data/lib/phronomy.rb +8 -2
data/scripts/api_snapshot.rb +0 -1
metadata +5 -7
data/lib/phronomy/agent/fsm.rb +0 -157
data/lib/phronomy/agent/invocation_pipeline.rb +0 -108
data/lib/phronomy/agent/lifecycle/fsm_session.rb +0 -251
data/lib/phronomy/agent/lifecycle/phase_machine_builder.rb +0 -249
data/lib/phronomy/agent/react_agent.rb +0 -205

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: d91e0fb85732153a69d268b41bdfe865791dd8f007e8bed983269284478af002
-  data.tar.gz: c334678280139ac7934b6804b06e282051218472985c022823d26913a3f64905
+  metadata.gz: 7e84ccabf84c48e16cdb968c1f7b69f2348b24a70e477aa39bbbe1244d34edfc
+  data.tar.gz: f31dc2d1c4ed4bb7717e88278f1ced3debd0177f1f7a8042b170421a5d8e7493
 SHA512:
-  metadata.gz: 393567f7c01633ea20160101705b0fde21ddd009a4950f1cb44a106285500b90a3bec88d4c9681cebb7656d0529c09c9e7c52da42e3e12f103231423921b43aa
-  data.tar.gz: 03f5d2e764df9d3becb782ecdec0bf42f03b0f3fc7414efaad2334fe1d047443ef3180e1993244cad92c305607113d0afe2915caa6ff53d14c05c779a61f6b4b
+  metadata.gz: 1c1ab4d05c27930b84abbad09f5c59027f9bfcddf9a89aa485608afdcd22ba50fcf971c2185a815206edfc37b29abb0fa99b7f80a8fa3f436c1d6a97b5ad38e4
+  data.tar.gz: 04016a561705ff24c4a6b9f8bb3d6918c303071f7bf97d94d70313b95f796ae561fee29fad9e7e620928655bf7e2007751cfa217bd973d83d2ad4d26d9754e3e

data/CHANGELOG.md CHANGED

@@ -9,6 +9,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+---
+## [0.9.1] - 2026-06-06
 ### Added
 - **`Phronomy::Diagnostics` and `SchedulerReentrancyError`** (#278, #279):
@@ -174,10 +178,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   tasks are treated the same as errors and follow the existing `on_error:` policy (`:raise`
   or `:skip`).
-- **MCP `HttpTransport` custom authentication headers** (#144): `McpTool.from_server` now
-  accepts `headers: {}`, forwarded all the way to `HttpTransport#initialize`. Arbitrary
-  headers (e.g. `Authorization: Bearer …`) are injected into every JSON-RPC request,
-  enabling use of MCP servers that require bearer tokens or API keys.
+- **MCP `HttpTransport` custom authentication headers** (#144): `Phronomy::Tools::Mcp::HttpTransport#initialize`
+  now accepts `headers: {}`. Arbitrary headers (e.g. `Authorization: Bearer …`) are injected
+  into every JSON-RPC request, enabling use of MCP servers that require bearer tokens or
+  API keys. Threading `headers:` through `Mcp.from_server` is tracked in issue #144 and
+  pending in PR #151.
 - **`StdioTransport` — `env:`, `cwd:`, and `startup_timeout:` options** (#145):
   Three new keyword arguments are now accepted when constructing a `StdioTransport` (and
@@ -226,8 +231,39 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   `dispatch_parallel` and `fan_out` accept `cancellation_token:` and automatically
   inject it into every worker task's config unless the task already supplies its own.
+### Added (post-v0.9.0)
+- **`Phronomy::Agent::CheckpointStore` — idempotency store for HITL resume** (post-v0.9.0):
+  New in-memory store tracks consumed checkpoint IDs. Calling `Agent::Base#resume` twice
+  with the same checkpoint raises `Phronomy::CheckpointAlreadyResumedError` instead of
+  silently re-executing the approved tool. Custom stores can be injected via
+  `agent.checkpoint_store = MyRedis::CheckpointStore.new`. Duck-type contract:
+  `consumed?(id)`, `consume!(id)`, and optionally `cleanup!(id)` / `clear!`.
+- **`checkpoint_id`, `agent_class`, `requested_at` on `Checkpoint`; `Agent::Base.resume` class method** (post-v0.9.0):
+  `Checkpoint` now carries a UUID `checkpoint_id` (idempotency key), `agent_class`
+  (fully-qualified class name), and `requested_at` (UTC timestamp). The new class-level
+  `Agent::Base.resume(checkpoint, approved:)` method instantiates the correct agent class
+  automatically and delegates to `#resume`, simplifying job-queue resume flows.
+- **`CheckpointStore#cleanup!` and `#clear!`** (post-v0.9.0):
+  Optional methods on the `CheckpointStore` duck-type contract. `cleanup!(checkpoint_id)`
+  removes a single checkpoint entry; `clear!` wipes all tracking state.
 ### Removed
+- **`Phronomy::ReactAgent` class removed** (post-v0.9.0):
+  Use `Phronomy::Agent::Base` directly. `ReactAgent` had no distinct public API beyond
+  `Agent::Base` and was not listed in the stability table.
+- **`Phronomy::Agent::FSM` class removed** (post-v0.9.0, internal):
+  The agent invocation path is now unified through `Agent::Base#invoke` with inline logic.
+  No public API impact.
+- **`Phronomy::Agent::Lifecycle::FSMSession` and `::PhaseMachineBuilder` moved to `Workflow` namespace** (post-v0.9.0, internal):
+  These internal classes now live at `Phronomy::Workflow::FSMSession` and
+  `Phronomy::Workflow::PhaseMachineBuilder`. No public API impact.
 - **BREAKING: `Agent::Base#run_as_child` drops `&result_writer` block parameter** (#265):
   The optional block form `run_as_child(input, ctx: ctx) { |r| ctx.answer = r[:output] }`
   is no longer supported. The result is now delivered **exclusively** as the

data/README.md CHANGED

@@ -76,6 +76,7 @@ It provides composable building blocks — Workflows, Agents, Tools, Guardrails,
 | **Agent::TeamCoordinator** — Agent teams pattern: LLM coordinator + stateful workers with sequential task assignment (worker-local message history persisted across tasks) | Beta |
 | **Agent::SharedState** — Shared state pattern: peer agents collaborate via a shared KnowledgeStore; `member` DSL with per-agent instructions and `coordination` team protocol | Experimental |
 | **`ScopePolicy`** — Configurable policy callable that maps (tool, scope, agent) to `:allow`/`:approve`/`:reject`; default policy auto-routes high-risk scopes through the approval gate | Experimental |
+| **HITL Checkpoint/Resume** — `Agent::Base#invoke` returns `{ suspended: true, checkpoint: Checkpoint }` when an approval-required tool is encountered without a synchronous handler; `Agent::Base#resume(checkpoint, approved:)` resumes execution; `Agent::Base.resume(checkpoint, approved:)` (class-level) resolves the agent class automatically; `Checkpoint#to_h` / `Checkpoint.from_h` for serialization; `Agent::Base#checkpoint_store=` for custom idempotency backends; `CheckpointAlreadyResumedError` raised on duplicate resume | Experimental |
 > **Public API boundary**: The tables above are the complete list of classes, modules, and features
 > intended for gem consumers. Every entry has an associated stability label.
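
A background job that holds only a serialized checkpoint could pick the agent class from `checkpoint.agent_class` before calling the class-level `resume`. The sketch below is a minimal illustration, assuming the caller does that lookup itself with `Object.const_get`. `FakeCheckpoint`, `ReviewAgent`, and `resume_from_job` are hypothetical stand-ins and are not part of phronomy.

```ruby
# Hypothetical stand-in for a deserialized checkpoint; not a phronomy class.
FakeCheckpoint = Struct.new(:agent_class, :checkpoint_id, keyword_init: true)

# Hypothetical agent with the same class-level resume shape as in this diff:
# validate that the checkpoint belongs to this class, then delegate to an instance.
class ReviewAgent
  def self.resume(checkpoint, approved:)
    if checkpoint.agent_class && checkpoint.agent_class != name
      raise ArgumentError,
        "checkpoint belongs to #{checkpoint.agent_class}, cannot resume with #{name}"
    end
    new.resume(checkpoint, approved: approved)
  end

  # Placeholder continuation; a real agent would call the LLM here.
  def resume(_checkpoint, approved:)
    {output: approved ? "approved" : "denied", suspended: false}
  end
end

# Hypothetical job entry point: resolve the class by name, then resume.
def resume_from_job(checkpoint, approved:)
  klass = Object.const_get(checkpoint.agent_class)
  klass.resume(checkpoint, approved: approved)
end
```

If checkpoints can come from an untrusted source, checking `agent_class` against an allowlist of known agent names before calling `Object.const_get` is the safer choice.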

data/lib/phronomy/agent/base.rb CHANGED

@@ -1,6 +1,7 @@
 # frozen_string_literal: true
 require "securerandom"
+require_relative "checkpoint_store"
 require_relative "concerns/retryable"
 require_relative "concerns/guardrailable"
 require_relative "concerns/before_completion"
@@ -374,6 +375,27 @@ module Phronomy
             @context_overhead = val.to_i
           end
         end
+        # Resumes a suspended invocation identified by +checkpoint+ without
+        # requiring the original agent instance to be kept in memory.
+        #
+        # Validates that the checkpoint was created by this agent class, then
+        # instantiates a fresh agent and delegates to {Suspendable#resume}.
+        #
+        # @param checkpoint [Phronomy::Agent::Checkpoint]
+        # @param approved   [Boolean] +true+ to execute the pending tool; +false+ to deny
+        # @param config     [Hash] same runtime options as {#invoke}
+        # @return [Hash] same shape as {#invoke} — may contain +suspended: true+ if
+        #   another approval-required tool is encountered during continuation
+        # @raise [ArgumentError] when +checkpoint.agent_class+ does not match this class
+        # @api public
+        def resume(checkpoint, approved:, config: {})
+          if checkpoint.agent_class && checkpoint.agent_class != name
+            raise ArgumentError,
+              "checkpoint belongs to #{checkpoint.agent_class}, cannot resume with #{name}"
+          end
+          new.resume(checkpoint, approved: approved, config: config)
+        end
       end
       # Registers an anonymous handoff tool class on this agent instance.
@@ -442,12 +464,35 @@ module Phronomy
         if invocation_context
           thread_id, config = _apply_invocation_context(thread_id, config, invocation_context)
         end
-        if Phronomy.configuration.event_loop
-          _invoke_via_event_loop(input, messages: messages, thread_id: thread_id, config: config)
-        else
-          _check_scheduler_reentrancy
-          invoke_async(input, messages: messages, thread_id: thread_id, config: config).await
+        _check_scheduler_reentrancy
+        timeout_sec = self.class.invoke_timeout
+        unless timeout_sec
+          return invoke_async(input, messages: messages, thread_id: thread_id, config: config).await
+        end
+        # invoke_timeout: create a CancellationScope with deadline, pass its token
+        # to the async invocation, and use scope.pop_queue so the calling thread
+        # unblocks as soon as either the result arrives or the deadline fires.
+        scope = Phronomy::Concurrency::CancellationScope.new(parent_token: config[:cancellation_token])
+        scope.deadline_in(timeout_sec)
+        effective_config = config.merge(cancellation_token: scope.token)
+        task = invoke_async(input, messages: messages, thread_id: thread_id, config: effective_config)
+        # Bridge the task result to an AsyncQueue so scope.pop_queue can observe the deadline.
+        completion_queue = Phronomy::Concurrency::AsyncQueue.new
+        Phronomy::Runtime.instance.spawn(name: "invoke-timeout-bridge:#{(self.class.name || "agent").downcase}") do
+          completion_queue.push(task.await)
+        rescue => e
+          completion_queue.push(e)
+        end
+        result = scope.pop_queue(completion_queue) do
+          raise Phronomy::TimeoutError,
+            "Agent #{self.class.name} invoke timed out after #{timeout_sec}s"
         end
+        raise result if result.is_a?(Exception)
+        result
       end
       # Invokes this agent asynchronously and returns a {Phronomy::Task}.
@@ -522,15 +567,18 @@ module Phronomy
             "Enable with: Phronomy.configure { |c| c.event_loop = true }"
         end
-        fsm = Agent::FSM.new(
-          agent: self,
-          input: input,
-          messages: messages,
-          thread_id: "#{ctx.thread_id}_agent_#{SecureRandom.uuid}",
-          config: config,
-          parent_id: ctx.thread_id
-        )
-        Phronomy::EventLoop.instance.enqueue_child(fsm)
+        parent_id = ctx.thread_id
+        thread_id = "#{parent_id}_agent_#{SecureRandom.uuid}"
+        Phronomy::Runtime.instance.spawn(name: "agent-child:#{thread_id}") do
+          result = _invoke_impl(input, messages: messages, thread_id: thread_id, config: config)
+          Phronomy::EventLoop.instance.post(
+            Phronomy::Event.new(type: :child_completed, target_id: parent_id, payload: result)
+          )
+        rescue => e
+          Phronomy::EventLoop.instance.post(
+            Phronomy::Event.new(type: :child_failed, target_id: parent_id, payload: e)
+          )
+        end
         nil
       end
@@ -539,8 +587,8 @@ module Phronomy
       #
       # Events emitted (in order):
       #   :token       — each content delta from the LLM
-      #   :tool_call   — when the LLM requests a tool (ReactAgent subclasses only)
-      #   :tool_result — after a tool completes (ReactAgent subclasses only)
+      #   :tool_call   — when the LLM requests a tool
+      #   :tool_result — after a tool completes
       #   :done        — final event carrying output, messages, and usage
       #   :error       — if an unrecoverable error occurs
       #
@@ -587,42 +635,6 @@ module Phronomy
         [effective_thread_id, effective_config]
       end
-      def _invoke_via_event_loop(input, messages:, thread_id:, config:)
-        if Phronomy::EventLoop.current?
-          raise Phronomy::Error,
-            "Cannot call Agent#invoke (EventLoop mode) from within an EventLoop " \
-            "entry action. Use agent.run_as_child(input, ctx: ctx) instead."
-        end
-        timeout_sec = self.class.invoke_timeout
-        effective_config, scope = if timeout_sec
-          s = Phronomy::Concurrency::CancellationScope.new(parent_token: config[:cancellation_token])
-          s.deadline_in(timeout_sec)
-          [config.merge(cancellation_token: s.token), s]
-        else
-          [config, nil]
-        end
-        fsm = Agent::FSM.new(
-          agent: self,
-          input: input,
-          messages: messages,
-          thread_id: thread_id || SecureRandom.uuid,
-          config: effective_config
-        )
-        completion_queue = Phronomy::EventLoop.instance.register(fsm)
-        result = if scope
-          scope.pop_queue(completion_queue) do
-            raise Phronomy::TimeoutError,
-              "Agent #{self.class.name} invoke timed out after #{timeout_sec}s"
-          end
-        else
-          completion_queue.pop
-        end
-        raise result if result.is_a?(Exception)
-        result
-      end
       def _check_scheduler_reentrancy
         return unless Phronomy::Task.current
@@ -851,12 +863,30 @@ module Phronomy
       # wrap it in a retry loop without duplicating the LLM interaction logic.
       def invoke_once(input, messages: [], thread_id: nil, config: {})
         trace("agent.invoke", input: input, **_build_caller_meta(config)) do |_span|
-          Agent::InvocationPipeline.new(self).run(
+          run_input_guardrails!(input)
+          user_message = extract_message(input)
+          chat = build_chat
+          context = build_context(
             input,
-            messages: messages,
-            thread_id: thread_id,
-            config: config
+            messages: messages, thread_id: thread_id, config: config,
+            budget: build_token_budget, instruction: build_instructions(input),
+            tools: self.class.tools + _handoff_tools
           )
+          _apply_context_to_chat(chat, context)
+          run_before_completion_hooks!(chat, config)
+          _register_suspension_hook!(chat)
+          check_cancellation!(config, "invocation cancelled before LLM call")
+          result, usage = _complete_with_suspension_guard(
+            chat, user_message, config,
+            thread_id: thread_id, original_input: input
+          )
+          next [result, usage] if result[:suspended]
+          run_output_guardrails!(result[:output])
+          [result, usage]
         end
       end
@@ -877,6 +907,36 @@ module Phronomy
         context[:messages].each { |msg| chat.messages << msg }
       end
+      # Submits the LLM call via LLMAdapter and handles SuspendSignal.
+      # Sets/clears the chat cancellation token around the call so that
+      # ParallelToolChat can observe cancellation without Thread.current.
+      # Returns [result_hash, usage_or_nil].
+      def _complete_with_suspension_guard(chat, user_message, config, thread_id:, original_input:)
+        chat.cancellation_token = config[:cancellation_token] if chat.respond_to?(:cancellation_token=)
+        begin
+          adapter = Phronomy.configuration.llm_adapter
+          response = adapter.complete_async(chat, user_message, config: config).await
+        rescue SuspendSignal => signal
+          checkpoint = Checkpoint.new(
+            checkpoint_id: SecureRandom.uuid,
+            agent_class: self.class.name,
+            requested_at: Time.now.utc,
+            thread_id: thread_id,
+            original_input: original_input,
+            messages: chat.messages.dup,
+            pending_tool_name: signal.tool_name,
+            pending_tool_args: signal.args,
+            pending_tool_call_id: signal.tool_call_id
+          )
+          return [{output: nil, suspended: true, checkpoint: checkpoint, messages: chat.messages}, nil]
+        ensure
+          chat.cancellation_token = nil if chat.respond_to?(:cancellation_token=)
+        end
+        output = response.content
+        usage = Phronomy::TokenUsage.from_tokens(response.tokens)
+        [{output: output, messages: chat.messages, usage: usage}, usage]
+      end
       def _drain_stream(chat, user_message, config, &block)
         adapter = Phronomy.configuration.llm_adapter
         chunk_queue = Phronomy::Concurrency::AsyncQueue.new(max_size: Phronomy.configuration.stream_queue_max_size)
@@ -920,12 +980,12 @@ module Phronomy
       end
       # Returns the chat class to instantiate for this invocation.
-      # When EventLoop mode is enabled ({Phronomy.configuration.event_loop}),
+      # When {Phronomy.configuration.parallel_tool_execution} is true,
       # returns {ParallelToolChat} so that concurrent tool dispatch is enabled.
       # Falls back to +nil+ otherwise, signalling {#build_chat} to use the
       # standard +RubyLLM.chat+ factory.
       def build_chat_class
-        Phronomy.configuration.event_loop ? Phronomy::MultiAgent::ParallelToolChat : nil
+        Phronomy.configuration.parallel_tool_execution ? Phronomy::MultiAgent::ParallelToolChat : nil
       end
       def build_chat
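
The `invoke_timeout` branch of `invoke` in this file bridges an asynchronous result into a queue and waits on it with a deadline, passing errors through the queue as values. A rough stdlib-only sketch of that pattern follows. It uses plain threads and `Thread#join` with a limit instead of phronomy's `CancellationScope` and `AsyncQueue`, and `call_with_deadline` is a hypothetical helper name.

```ruby
require "timeout"

# Hypothetical helper: runs the block on a worker thread and waits at most
# timeout_sec seconds for it to finish.
def call_with_deadline(timeout_sec)
  completion_queue = Queue.new
  worker = Thread.new do
    completion_queue.push(yield)
  rescue => e
    # Errors travel through the queue as values and are re-raised by the caller.
    completion_queue.push(e)
  end

  # Thread#join returns nil when the limit elapses before the thread finishes.
  unless worker.join(timeout_sec)
    worker.kill
    raise Timeout::Error, "call timed out after #{timeout_sec}s"
  end

  result = completion_queue.pop
  raise result if result.is_a?(Exception)
  result
end
```

Killing the worker is a blunt substitute for cooperative cancellation; the diff instead hands a cancellation token to the async invocation so the work can stop itself.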

data/lib/phronomy/agent/checkpoint.rb CHANGED

@@ -1,5 +1,7 @@
 # frozen_string_literal: true
+require "securerandom"
 module Phronomy
   module Agent
     # Encapsulates the suspended state of an agent invocation.
@@ -19,6 +21,18 @@ module Phronomy
     #   end
     #   puts result[:output]
     class Checkpoint
+      # @return [String] a globally unique identifier for this checkpoint;
+      #   used as an idempotency key when guarding against duplicate resumes
+      attr_reader :checkpoint_id
+      # @return [String, nil] the fully-qualified name of the agent class that
+      #   created this checkpoint (e.g. +"MyApp::ReviewAgent"+); used by the
+      #   class-level +resume+ method to validate the correct agent is used
+      attr_reader :agent_class
+      # @return [Time] the UTC timestamp when this checkpoint was created
+      attr_reader :requested_at
       # @return [String, nil] the thread_id from the invocation config
       attr_reader :thread_id
@@ -41,6 +55,9 @@ module Phronomy
       #   inject the tool result message on resume)
       attr_reader :pending_tool_call_id
+      # @param checkpoint_id        [String] unique identifier; defaults to a new UUID
+      # @param agent_class           [String, nil] fully-qualified agent class name
+      # @param requested_at          [Time] when the checkpoint was created; defaults to +Time.now.utc+
       # @param thread_id            [String, nil]
       # @param original_input       [String, Hash] the input passed to the original #invoke call
       # @param messages             [Array<RubyLLM::Message>]
@@ -48,7 +65,11 @@ module Phronomy
       # @param pending_tool_args    [Hash]
       # @param pending_tool_call_id [String]
       # @api public
-      def initialize(thread_id:, original_input:, messages:, pending_tool_name:, pending_tool_args:, pending_tool_call_id:)
+      def initialize(thread_id:, original_input:, messages:, pending_tool_name:, pending_tool_args:, pending_tool_call_id:,
+        checkpoint_id: SecureRandom.uuid, agent_class: nil, requested_at: Time.now.utc)
+        @checkpoint_id = checkpoint_id
+        @agent_class = agent_class
+        @requested_at = requested_at
         @thread_id = thread_id
         @original_input = original_input
         @messages = messages.dup.freeze
@@ -71,6 +92,9 @@ module Phronomy
       # @api public
       def to_h
         {
+          checkpoint_id: @checkpoint_id,
+          agent_class: @agent_class,
+          requested_at: @requested_at&.iso8601,
           thread_id: @thread_id,
           original_input: @original_input,
           messages: @messages.map { |m| serialize_message(m) },
@@ -99,7 +123,12 @@ module Phronomy
           end
         }
         messages = Array(h[:messages]).map { |m| deserialize_message(m) }
+        requested_at_raw = h[:requested_at]
+        requested_at = requested_at_raw ? Time.parse(requested_at_raw.to_s).utc : nil
         new(
+          checkpoint_id: h[:checkpoint_id]&.to_s || SecureRandom.uuid,
+          agent_class: h[:agent_class]&.to_s,
+          requested_at: requested_at || Time.now.utc,
           thread_id: h[:thread_id],
           original_input: h[:original_input],
           messages: messages,
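
`to_h` writes `requested_at` with `Time#iso8601` and `from_h` reads it back with `Time.parse`. The stdlib-only illustration below is independent of phronomy and uses an arbitrary timestamp. It shows that this round trip keeps whole seconds only, because `iso8601` is called without a fraction-digits argument.

```ruby
require "time"

# Arbitrary UTC timestamp with a 250 ms fractional part.
original = Time.at(1_780_000_000, 250, :millisecond).utc

# Same calls as the serialization path: iso8601 out, Time.parse back in.
serialized = original.iso8601
restored = Time.parse(serialized).utc

# Fractional seconds are dropped, so the values differ below one second.
subsec_lost = restored != original
same_second = restored.to_i == original.to_i
still_utc = restored.utc?
```

Callers that compare `requested_at` before and after serialization may therefore want to compare at second precision.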

data/lib/phronomy/agent/checkpoint_store.rb ADDED

@@ -0,0 +1,97 @@
+# frozen_string_literal: true
+module Phronomy
+  module Agent
+    # Default in-memory idempotency store for {Checkpoint} resume operations.
+    #
+    # Tracks consumed checkpoint IDs so that calling {Agent::Base#resume} twice
+    # with the same checkpoint raises {Phronomy::CheckpointAlreadyResumedError}
+    # instead of silently executing the approved tool a second time.
+    #
+    # This implementation is *not thread-safe*. It assumes a single agent instance
+    # is accessed from only one thread at a time, which is the expected usage pattern.
+    # Agent instances themselves are not thread-safe (state like +@messages+, +@config+
+    # is not protected), so concurrent calls to the same agent instance are unsupported.
+    #
+    # Each agent instance gets its own store by default, so no sharing occurs unless
+    # the caller explicitly assigns the same store object to multiple agents.
+    #
+    # For distributed environments (multiple processes or background jobs), swap this
+    # for a custom implementation backed by Redis, ActiveRecord, or another shared store.
+    # *Your custom store implementation is responsible for ensuring thread-safety* if
+    # your application shares the same store instance across multiple threads.
+    #
+    # @example Plugging in a custom store
+    #   agent = MyAgent.new
+    #   agent.checkpoint_store = MyRedis::CheckpointStore.new
+    #
+    # @example Duck-type contract required by any replacement
+    #   # consumed?(checkpoint_id) => Boolean
+    #   # consume!(checkpoint_id)  => void; raises CheckpointAlreadyResumedError if duplicate
+    #   # cleanup!(checkpoint_id)  => void (optional); removes tracking for the checkpoint
+    #   # clear!                   => void (optional); removes all tracked checkpoints
+    #
+    # @api public
+    class CheckpointStore
+      def initialize
+        @consumed = Set.new
+      end
+      # Returns +true+ if the given checkpoint ID has already been consumed.
+      #
+      # @param checkpoint_id [String]
+      # @return [Boolean]
+      # @api public
+      def consumed?(checkpoint_id)
+        @consumed.include?(checkpoint_id)
+      end
+      # Marks +checkpoint_id+ as consumed, or raises if it was already consumed.
+      #
+      # @param checkpoint_id [String]
+      # @raise [Phronomy::CheckpointAlreadyResumedError]
+      # @return [void]
+      # @api public
+      def consume!(checkpoint_id)
+        if @consumed.include?(checkpoint_id)
+          raise Phronomy::CheckpointAlreadyResumedError,
+            "checkpoint #{checkpoint_id} has already been resumed"
+        end
+        @consumed.add(checkpoint_id)
+        nil
+      end
+      # Removes tracking for a specific checkpoint ID.
+      #
+      # Use this to explicitly discard a checkpoint when the application
+      # determines it is no longer needed (e.g., user abandons an approval
+      # workflow).
+      #
+      # This method is optional in the duck-type contract. Custom store
+      # implementations may choose not to implement it.
+      #
+      # @param checkpoint_id [String]
+      # @return [void]
+      # @api public
+      def cleanup!(checkpoint_id)
+        @consumed.delete(checkpoint_id)
+        nil
+      end
+      # Removes all tracked checkpoint IDs.
+      #
+      # Use this for test cleanup, periodic maintenance, or application
+      # shutdown.
+      #
+      # This method is optional in the duck-type contract. Custom store
+      # implementations may choose not to implement it.
+      #
+      # @return [void]
+      # @api public
+      def clear!
+        @consumed.clear
+        nil
+      end
+    end
+  end
+end
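
The comments in this file leave thread-safety to custom stores. As a minimal sketch of the duck-type contract (`consumed?`, `consume!`, and the optional `cleanup!` / `clear!`), the store below guards a `Set` with a `Mutex`. `SynchronizedCheckpointStore` is a hypothetical name, and the error class is stubbed only so the example runs without the gem installed.

```ruby
require "set"

# Stand-in so the sketch runs without the gem; phronomy defines the real class.
module Phronomy
  unless const_defined?(:CheckpointAlreadyResumedError)
    class CheckpointAlreadyResumedError < StandardError; end
  end
end

# Hypothetical in-process store that is safe to share across threads.
class SynchronizedCheckpointStore
  def initialize
    @consumed = Set.new
    @lock = Mutex.new
  end

  def consumed?(checkpoint_id)
    @lock.synchronize { @consumed.include?(checkpoint_id) }
  end

  # Check and insert happen under one lock, so only one caller can win.
  def consume!(checkpoint_id)
    @lock.synchronize do
      # Set#add? returns nil when the element is already present.
      unless @consumed.add?(checkpoint_id)
        raise Phronomy::CheckpointAlreadyResumedError,
          "checkpoint #{checkpoint_id} has already been resumed"
      end
    end
    nil
  end

  def cleanup!(checkpoint_id)
    @lock.synchronize { @consumed.delete(checkpoint_id) }
    nil
  end

  def clear!
    @lock.synchronize { @consumed.clear }
    nil
  end
end
```

A store like this still lives in one process. Resuming across processes or job workers needs a shared backend with an atomic "set if absent" operation, as the file's comments suggest.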

data/lib/phronomy/agent/concerns/retryable.rb CHANGED

@@ -49,7 +49,7 @@ module Phronomy
         private
-        # Retry loop for #invoke. Separated so that ReactAgent can override #invoke_once.
+        # Retry loop for #invoke.
         def _invoke_impl(input, messages: [], thread_id: nil, config: {})
           # Fail fast when the token is already cancelled before any LLM call.
           if (token = config[:cancellation_token]) && token.cancelled?

data/lib/phronomy/agent/concerns/suspendable.rb CHANGED

@@ -1,5 +1,7 @@
 # frozen_string_literal: true
+require "securerandom"
 module Phronomy
   module Agent
     module Concerns
@@ -47,6 +49,23 @@ module Phronomy
           @scope_policy = policy
         end
+        # Sets the idempotency store used to guard against duplicate resumes.
+        #
+        # The store must respond to:
+        # - +consumed?(checkpoint_id)+ ⇒ Boolean
+        # - +consume!(checkpoint_id)+  ⇒ void; raises {Phronomy::CheckpointAlreadyResumedError} on duplicate
+        #
+        # Defaults to a per-instance {Phronomy::Agent::CheckpointStore} (in-memory, not thread-safe).
+        # Assign a shared persistent store when resuming across processes (e.g. Redis-backed).
+        # Custom stores are responsible for ensuring thread-safety if shared across threads.
+        #
+        # @param store [#consumed?, #consume!]
+        # @return [void]
+        # @api public
+        def checkpoint_store=(store)
+          @checkpoint_store = store
+        end
         # Resumes a previously suspended invocation from a {Phronomy::Agent::Checkpoint}.
         #
         # This method reconstructs the conversation state captured at suspension
@@ -59,9 +78,14 @@ module Phronomy
         #   to inject a denial message and let the LLM handle it gracefully
         # @param config     [Hash] same runtime options as #invoke
         # @return [Hash] +{ output: String, suspended: false, messages: Array, usage: Phronomy::TokenUsage }+
+        #   or +{ output: nil, suspended: true, checkpoint: Phronomy::Agent::Checkpoint, messages: Array }+
+        #   when a second approval-required tool is encountered during continuation
         # @raise [Phronomy::GuardrailError] when an output guardrail rejects the value
+        # @raise [Phronomy::CheckpointAlreadyResumedError] when the checkpoint has already been consumed
         # @api private
         def resume(checkpoint, approved:, config: {})
+          # Guard against duplicate resumes using the idempotency store.
+          _checkpoint_store.consume!(checkpoint.checkpoint_id)
           # Build a fresh chat with all tools registered.
           chat = build_chat
@@ -91,8 +115,30 @@ module Phronomy
             tool_call_id: checkpoint.pending_tool_call_id
           )
-          # Continue the React loop.
-          response = chat.complete
+          # Re-register the suspension hook so that any further requires_approval
+          # tools encountered during continuation are intercepted rather than
+          # executed without approval (cascading / chained approval scenario).
+          _register_suspension_hook!(chat)
+          # Continue the LLM loop. Rescue SuspendSignal so that a second
+          # approval-required tool produces a new checkpoint instead of running
+          # without consent.
+          begin
+            response = chat.complete
+          rescue SuspendSignal => signal
+            new_checkpoint = Checkpoint.new(
+              checkpoint_id: SecureRandom.uuid,
+              agent_class: self.class.name,
+              requested_at: Time.now.utc,
+              thread_id: checkpoint.thread_id,
+              original_input: checkpoint.original_input,
+              messages: chat.messages.dup,
+              pending_tool_name: signal.tool_name,
+              pending_tool_args: signal.args,
+              pending_tool_call_id: signal.tool_call_id
+            )
+            return {output: nil, suspended: true, checkpoint: new_checkpoint, messages: chat.messages}
+          end
           output = response.content
           usage = Phronomy::TokenUsage.from_tokens(response.tokens)
@@ -129,6 +175,15 @@ module Phronomy
             end
           end
         end
+        # Returns the checkpoint idempotency store for this instance, lazily
+        # initialising a default in-memory {Phronomy::Agent::CheckpointStore}.
+        #
+        # @return [#consumed?, #consume!]
+        # @api private
+        def _checkpoint_store
+          @checkpoint_store ||= CheckpointStore.new
+        end
       end
     end
   end
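
Because `resume` can now return another `suspended: true` result with a new checkpoint, a caller has to loop until the result is no longer suspended. The sketch below shows one way to drive that loop. `FakeAgent` and `run_with_approvals` are hypothetical stand-ins that only imitate the result shape from this diff; they are not phronomy APIs, and the checkpoint here is a plain Hash.

```ruby
# Hypothetical stand-in that suspends once per pending tool, then finishes.
class FakeAgent
  def initialize(pending_tools)
    @pending = pending_tools.dup
  end

  def invoke(_input)
    next_result
  end

  def resume(_checkpoint, approved:)
    next_result
  end

  private

  # Returns a suspended result while tools remain, a final result otherwise.
  def next_result
    tool = @pending.shift
    return {output: "done", suspended: false} unless tool
    {output: nil, suspended: true, checkpoint: {pending_tool_name: tool}}
  end
end

# Keeps resuming until the result is no longer suspended. The block decides
# each approval and receives the checkpoint of the pending tool.
def run_with_approvals(agent, input)
  result = agent.invoke(input)
  reviewed = []
  while result[:suspended]
    checkpoint = result[:checkpoint]
    reviewed << checkpoint[:pending_tool_name]
    result = agent.resume(checkpoint, approved: yield(checkpoint))
  end
  [result, reviewed]
end
```

For example, `run_with_approvals(FakeAgent.new(%w[delete_file send_email]), "clean up") { |cp| true }` asks for two approvals before returning the final result.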

data/lib/phronomy/configuration.rb CHANGED

@@ -33,6 +33,18 @@ module Phronomy
     # @see Phronomy::EventLoop
     attr_accessor :event_loop
+    # When true, agent LLM calls use {Phronomy::MultiAgent::ParallelToolChat}
+    # for concurrent tool dispatch within a single agent turn.
+    # Defaults to false.
+    #
+    # Previously, this was automatically enabled when +event_loop+ was true.
+    # As of Phase 3, +parallel_tool_execution+ is a separate setting that must
+    # be explicitly enabled.
+    # @example
+    #   Phronomy.configure { |c| c.parallel_tool_execution = true }
+    # @return [Boolean]
+    attr_accessor :parallel_tool_execution
     # When true, user input and LLM output are recorded in trace spans.
     # Defaults to false; set to true only in environments where PII capture is acceptable.
     # Set to false in privacy-sensitive environments to prevent PII from reaching
@@ -186,6 +198,7 @@ module Phronomy
       @tracer = Phronomy::Tracing::NullTracer.new
       @trace_pii = false
       @event_loop = false
+      @parallel_tool_execution = false
       @event_loop_stop_grace_seconds = 5
       @llm_adapter = Phronomy::LLMAdapter::RubyLLM.new
       @backpressure = :wait

data/lib/phronomy/event_loop.rb CHANGED

@@ -129,7 +129,7 @@ module Phronomy
     # (WorkflowContext) once the workflow finishes or halts. If an error occurred,
     # the popped value will be an Exception — callers are responsible for re-raising it.
     #
-    # @param fsm_session [Phronomy::Agent::Lifecycle::FSMSession]
+    # @param fsm_session [Phronomy::Workflow::FSMSession]
     # @return [Phronomy::Concurrency::AsyncQueue] resolves to final/halted context, or an Exception
     # @api private
     def register(fsm_session)
@@ -150,23 +150,6 @@ module Phronomy
       completion_queue
     end
-    # Enqueues an {AgentFSM} as a fire-and-forget child session.
-    #
-    # Unlike {#register}, this method:
-    # - Is safe to call from the EventLoop thread (entry actions).
-    # - Does NOT block — no completion queue is created.
-    # - Delegates `:finished`/`:error` cleanup to the EventLoop via posted events.
-    #
-    # @param agent_fsm [Phronomy::Agent::FSM]
-    # @return [nil]
-    # @api private
-    def enqueue_child(agent_fsm)
-      @queue.push([Event.new(type: :start, target_id: agent_fsm.id,
-        payload: {session: agent_fsm, completion: nil}),
-        Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)])
-      nil
-    end
     # Posts an event to the loop. Safe to call from any thread (including IO threads).
     # The current monotonic clock time is recorded so that the EventLoop can
     # measure the dispatch lag when it dequeues the event.