RubyGems: phronomy 0.1.3 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (44)

checksums.yaml +4 -4
data/CHANGELOG.md +56 -0
data/README.md +49 -38
data/docs/trustworthy_ai_enhancements.md +4 -4
data/lib/generators/phronomy/install/templates/create_phronomy_messages.rb.tt +1 -1
data/lib/phronomy/actor.rb +68 -0
data/lib/phronomy/agent/base.rb +125 -91
data/lib/phronomy/agent/handoff.rb +2 -2
data/lib/phronomy/agent/react_agent.rb +51 -33
data/lib/phronomy/context/assembler.rb +11 -3
data/lib/phronomy/context/compaction_context.rb +1 -3
data/lib/phronomy/context/context_version_cache.rb +7 -16
data/lib/phronomy/eval/runner.rb +39 -11
data/lib/phronomy/guardrail/builtin/pii_pattern_detector.rb +47 -3
data/lib/phronomy/memory/compression/summary.rb +4 -3
data/lib/phronomy/memory/compression/tool_output_pruner.rb +11 -6
data/lib/phronomy/memory/conversation_manager.rb +25 -16
data/lib/phronomy/memory/retrieval/semantic.rb +21 -5
data/lib/phronomy/memory/storage/active_record.rb +32 -10
data/lib/phronomy/memory/storage/base.rb +22 -0
data/lib/phronomy/memory/storage/in_memory.rb +65 -26
data/lib/phronomy/state_store/active_record.rb +1 -1
data/lib/phronomy/state_store/base.rb +14 -16
data/lib/phronomy/state_store/in_memory.rb +23 -9
data/lib/phronomy/state_store/redis.rb +1 -1
data/lib/phronomy/thread_actor_registry.rb +52 -0
data/lib/phronomy/tool/base.rb +9 -2
data/lib/phronomy/tool/mcp_tool.rb +28 -4
data/lib/phronomy/tracing/base.rb +0 -2
data/lib/phronomy/tracing/langfuse_tracer.rb +24 -6
data/lib/phronomy/tracing/null_tracer.rb +6 -3
data/lib/phronomy/trust_pipeline.rb +60 -52
data/lib/phronomy/vector_store/redis_search.rb +28 -23
data/lib/phronomy/version.rb +1 -1
data/lib/phronomy/workflow.rb +281 -0
data/lib/phronomy/workflow_context.rb +119 -0
data/lib/phronomy/workflow_runner.rb +262 -0
data/lib/phronomy.rb +30 -34
metadata +25 -10
data/lib/phronomy/graph/compiled_graph.rb +0 -183
data/lib/phronomy/graph/parallel_node.rb +0 -193
data/lib/phronomy/graph/state.rb +0 -105
data/lib/phronomy/graph/state_graph.rb +0 -148
data/lib/phronomy/graph.rb +0 -13

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 04a7eceda662bfc638c3ec07ac161299b2bb08863e18e9ba03e7a3226165a921
-  data.tar.gz: 9938ace6e4a7250c08f733339af4de14c7d9ff62ff7c52a27eb954c0700d18f4
+  metadata.gz: c0aaadfedad1ee8b4afa1efb2e205a626a20cd636f268e711dce29128c84b6fe
+  data.tar.gz: f781f66ae3d570caca771d2b5579874169acef5eeed7f6cf99474a4c82fd077a
 SHA512:
-  metadata.gz: 1dc032c438a407a751b5c74fd76d172796e852a3add651c1ca6bb210991d20305a1cc609fe157e48026293084943714569219d359794c9bc7fde0dc396ed16d1
-  data.tar.gz: c1b52dcfa5196b92b72641f8d980856e5e827d61ff0e8afc61f5664f0556e0dbb0b047a7755d2f5f148525774c885ead4319510c32d7f2699fb4601c38d12ecd
+  metadata.gz: 73793efee9cf2c0cb81828fc86927181edca2b4d3fa830a6cefa176b30fd70c57c75e71173d57b51d385377ec3795f1d8cb42e25c3650be40eff543fc2f856fd
+  data.tar.gz: b7dd9cc538e478774d46cf7823e4f5e52b599a02fdd1edf300439287dabfe2fc9a90e93f50647fffa246f7f3c90bd4abf6cf258e9723af1222f836d89125dc59

data/CHANGELOG.md ADDED

@@ -0,0 +1,56 @@
+# Changelog
+All notable changes to this project will be documented in this file.
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+---
+## [Unreleased]
+### Added
+- **`Phronomy::Graph::Context`** module — canonical module for defining workflow
+  context classes (replaces the removed `Phronomy::Graph::State`).
+- **`Phronomy::Graph.register_context_class`** — registers context classes for
+  deserialization from external stores (Redis, DB).
+- **`Phronomy::Workflow.define`** DSL — primary high-level API for declaring
+  stateful workflows (`state`, `wait_state`, `event`, `after`, `initial`).
+- **`Phronomy::Graph::WorkflowRunner`** — state-machine execution engine backing
+  the Workflow DSL. Replaces the removed `CompiledGraph`.
+- **`app.send_event(event, config:)`** — event-driven resume for workflows halted
+  at a `wait_state`.
+- **`state.halted?`** — returns `true` when the workflow is paused at a `wait_state`.
+- **`state.phase`** — single source of truth for execution state.
+### Removed
+- `Phronomy::Graph::StateGraph` / `CompiledGraph` — use `Phronomy::Workflow.define`.
+- `Phronomy::Graph::State` — use `Phronomy::Graph::Context`.
+- `Phronomy::Graph.register_state_class` — use `register_context_class`.
+- `state.current_nodes` / `state.halted_before` — use `state.phase` / `state.halted?`.
+- `compiled.interrupt_before` / `compiled.interrupt_after` — use `wait_state` + `event`.
+- `compiled.resume` — use `app.send_event`.
+---
+## [0.2.0] - 2026-05-13
+### Added
+- `Phronomy::Graph::WorkflowRunner` — state_machines-based execution engine
+  (introduced as the internal successor to `CompiledGraph`).
+- `state.phase` — single source of truth for graph execution state (replaces
+  `current_nodes` + `halted_before` dual attributes).
+- `state.halted?` — returns `true` when the graph is paused.
+- `CompiledGraph#add_wait_state` — declared a named wait state that halts
+  automatically when reached (later superseded by `wait_state` DSL in `Workflow.define`).
+- `CompiledGraph#send_event(state:, event:, input: nil)` — event-driven resume API
+  (later superseded by `app.send_event`).
+### Removed
+- `ParallelNode` and `add_parallel_node` DSL. Use `Thread.new` or
+  `Concurrent::Future` at the application level instead.
+- `Phronomy::Graph::TimeoutError` (was only used by `ParallelNode`).
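
Note on the `ParallelNode` removal: the 0.2.0 entry points callers at `Thread.new` instead. The block below is a minimal sketch of one way to do that with core Ruby only. `fan_out`, the branch names, and the sample lambdas are hypothetical and are not part of phronomy. It assumes the caller wants behaviour close to the old `timeout:` and `on_error: :best_effort` options, so a branch that raises or misses the deadline is simply left out of the merged result.

```ruby
# Hypothetical helper (not part of phronomy): run independent branches on
# plain Ruby threads and merge whatever finishes within the deadline.
def fan_out(branches, timeout:)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  threads = branches.transform_values do |work|
    Thread.new do
      Thread.current.report_on_exception = false # failures are handled below
      work.call
    end
  end

  threads.each_with_object({}) do |(name, thread), merged|
    remaining = [deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC), 0].max
    begin
      # Thread#join returns nil on timeout and re-raises the thread's exception.
      if thread.join(remaining)
        merged[name] = thread.value
      else
        thread.kill # abandon the slow branch
      end
    rescue StandardError
      # Best effort: skip the failed branch.
    end
  end
end

result = fan_out(
  {
    summary: -> { "short summary" },
    tags:    -> { %w[ruby agents] },
    slow:    -> { sleep 5; :never_merged },
    broken:  -> { raise "boom" }
  },
  timeout: 0.5
)
```

Each branch returns its own value and the merge happens on the calling thread, so no hash is shared between threads while they run.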

data/README.md CHANGED

@@ -1,12 +1,12 @@
 # Phronomy
 **Phronomy** is a Ruby AI agent framework inspired by open-source AI agent frameworks.
-It provides composable building blocks — Graphs, Agents, and Memory — all powered by [RubyLLM](https://github.com/crmne/ruby_llm) for LLM abstraction.
+It provides composable building blocks — Workflows, Agents, and Memory — all powered by [RubyLLM](https://github.com/crmne/ruby_llm) for LLM abstraction.
 ## Features
-- **Graph** — Build stateful, branching agent workflows with interrupt/resume support
-- **Graph Parallel Node** — Execute independent graph branches concurrently with configurable merge and error policies
+- **Workflow** — Build stateful, branching agent workflows with wait_state/send_event support
+- **Workflow Parallel Node** — Execute independent workflow branches concurrently using application-level Ruby threads
 - **Agent** — ReAct-style tool-calling agents with memory and guardrails
 - **Before-Completion Hook** — Three-tier (global / class / instance) LLM parameter injection before each chat request
 - **Memory** — Window, summary, ActiveRecord-backed, semantic, and composite conversation memory
@@ -70,31 +70,39 @@ result = ResearchAgent.new.invoke("What happened in AI research this week?")
 puts result[:output]
 ```
-### Graph — Stateful workflow with interrupt/resume
+### Workflow — Stateful workflow with wait_state/send_event
 ```ruby
-class ReviewState
-  include Phronomy::Graph::State
+class ReviewContext
+  include Phronomy::WorkflowContext
   field :draft,    type: :replace
   field :feedback, type: :replace
   field :approved, type: :replace, default: false
 end
-graph = Phronomy::Graph::StateGraph.new(ReviewState)
-graph.add_node(:write)    { |s| { draft: Writer.call(s) } }
-graph.add_node(:review)   { |s| { feedback: Reviewer.call(s.draft) } }
-graph.add_node(:finalize) { |s| { approved: true } }
-graph.add_edge(:write, :review)
-graph.add_edge(:review, :finalize)
-graph.set_entry_point(:write)
-# Register an interrupt callback before the :finalize node
-graph.interrupt_before(:finalize) do |state|
-  puts "Draft ready for human review: #{state.draft}"
+app = Phronomy::Workflow.define(ReviewContext) do
+  initial :write
+  state     :write,    action: ->(s) { s.merge(draft: Writer.call(s)) }
+  state     :review,   action: ->(s) { s.merge(feedback: Reviewer.call(s.draft)) }
+  wait_state :awaiting_approval           # halts here for human decision
+  state     :finalize, action: ->(s) { s.merge(approved: true) }
+  after :write,    to: :review
+  after :review,   to: :awaiting_approval
+  after :finalize, to: :__finish__
+  event :approve, from: :awaiting_approval, to: :finalize
+  event :reject,  from: :awaiting_approval, to: :write
 end
-compiled = graph.compile
-compiled.invoke({ draft: "" }, config: { thread_id: "doc-1" })
+Phronomy.configure { |c| c.default_state_store = Phronomy::StateStore::InMemory.new }
+# First run — halts at :awaiting_approval
+state = app.invoke({ draft: "" }, config: { thread_id: "doc-1" })
+puts "Halted: #{state.halted?}"   # => true
+puts "Draft: #{state.draft}"
+# Resume after human approval
+final = app.send_event(:approve, config: { thread_id: "doc-1" })
+puts "Approved: #{final.approved}"  # => true
 ```
 ### Multi-Agent — Agent-as-Tool pattern
@@ -238,29 +246,32 @@ result.citations.each do |c|
 end
 ```
-### Graph Parallel Node — Concurrent branches
+### Workflow Parallel Node — Concurrent branches
+Phronomy does not provide a built-in parallel abstraction. Use application-level Ruby threads inside a `state` action:
 ```ruby
-class MyState
-  include Phronomy::Graph::State
+class EnrichContext
+  include Phronomy::WorkflowContext
   field :summary, type: :replace
-  field :tags,    type: :append,  default: -> { [] }
+  field :tags,    type: :append, default: -> { [] }
 end
-graph = Phronomy::Graph::StateGraph.new(MyState)
-graph.add_parallel_node(
-  :enrich,
-  ->(s) { { summary: Summarizer.call(s) } },
-  ->(s) { { tags:    Tagger.call(s) } },
-  timeout:  10,
-  on_error: :best_effort
-)
+app = Phronomy::Workflow.define(EnrichContext) do
+  initial :enrich
+  state :enrich, action: ->(s) do
+    results = {}
+    threads = [
+      Thread.new { results[:summary] = Summarizer.call(s) },
+      Thread.new { results[:tags]    = Tagger.call(s) }
+    ]
+    threads.each { |t| t.join(10) }  # 10-second timeout
+    s.merge(summary: results[:summary], tags: Array(results[:tags]))
+  end
+  after :enrich, to: :__finish__
+end
-graph.set_entry_point(:enrich)
-graph.add_edge(:enrich, Phronomy::Graph::StateGraph::FINISH)
-app = graph.compile
-app.invoke({}, config: { thread_id: "t1" })
+state = app.invoke({}, config: { thread_id: "t1" })
 ```
 ### Output Parser — Structured LLM responses
@@ -483,8 +494,8 @@ bundle exec ruby NN_example_name/run.rb
 |---|-----------|----------------------|
 | 01 | `01_basic_chain/` | PromptTemplate → LLMChain pipeline |
 | 02 | `02_react_agent/` | ReAct tool-calling agent |
-| 03 | `03_state_graph/` | Stateful graph with interrupt/resume |
-| 04 | `04_interrupt_resume/` | Human-in-the-loop interrupt and resume |
+| 03 | `03_state_graph/` | Stateful workflow with wait_state/send_event |
+| 04 | `04_interrupt_resume/` | Human-in-the-loop wait_state and resume |
 | 05 | `05_multi_agent/` | Multi-agent coordination via Agent-as-Tool |
 | 06 | `06_guardrails/` | Input/output guardrails |
 | 07 | `07_tracing/` | Custom observability with Langfuse tracer |
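
Note on the `wait_state` / `send_event` flow in the README hunk above: the toy model below sketches the general idea of halting at a named state, persisting the run under a `thread_id`, and resuming on an event. It is an illustration only. `ToyWorkflow`, its keyword arguments, and the `:finish` marker are hypothetical, and nothing here reflects how phronomy's `WorkflowRunner` is actually implemented.

```ruby
# Toy model of the halt/resume flow; NOT phronomy's implementation or API.
class ToyWorkflow
  def initialize(initial:, actions:, after:, wait_states:, events:)
    @initial = initial
    @actions = actions          # phase => ->(data) { new_data }
    @after = after              # phase => next phase
    @wait_states = wait_states  # phases that halt the run
    @events = events            # [event, from_phase] => to_phase
    @store = {}                 # thread_id => [phase, data]
  end

  # Start a run and advance until a wait state or :finish is reached.
  def invoke(data, thread_id:)
    run(@initial, data, thread_id)
  end

  # Resume a halted run by firing an event from its saved phase.
  def send_event(event, thread_id:)
    phase, data = @store.fetch(thread_id)
    run(@events.fetch([event, phase]), data, thread_id)
  end

  private

  def run(phase, data, thread_id)
    until phase == :finish || @wait_states.include?(phase)
      data = @actions.fetch(phase).call(data)
      phase = @after.fetch(phase)
    end
    @store[thread_id] = [phase, data]
    { phase: phase, halted: @wait_states.include?(phase), data: data }
  end
end

flow = ToyWorkflow.new(
  initial: :write,
  actions: {
    write:    ->(d) { d.merge(draft: "v1") },
    finalize: ->(d) { d.merge(approved: true) }
  },
  after: { write: :awaiting_approval, finalize: :finish },
  wait_states: [:awaiting_approval],
  events: { [:approve, :awaiting_approval] => :finalize }
)

halted = flow.invoke({}, thread_id: "doc-1")
final  = flow.send_event(:approve, thread_id: "doc-1")
```

In this sketch the first call stops at `:awaiting_approval` with `halted: true`, and the second call picks the saved data back up and runs to `:finish`.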

data/docs/trustworthy_ai_enhancements.md CHANGED

@@ -49,7 +49,7 @@ cannot be delegated to the LLM must be enforced by phronomy or the application l
 | Layer | Responsibility | Status |
 |---|---|---|
 | LLM | Basic harmful-content avoidance (RLHF) | Model-dependent, not guaranteed |
-| **phronomy** | Intervention points, iteration limits, approval gates | ✅ `interrupt_before/after`, `requires_approval`, `max_iterations` — see `lib/phronomy/graph/compiled_graph.rb`, `lib/phronomy/agent/base.rb` |
+| **phronomy** | Intervention points, iteration limits, approval gates | ✅ `wait_state`/`send_event`, `requires_approval`, `max_iterations` — see `lib/phronomy/workflow.rb`, `lib/phronomy/agent/base.rb` |
 | **phronomy** | Built-in guardrails (PII, prompt injection) | ❌ Not implemented — **planned (Feature A)** |
 | Application | Concrete guardrail logic, approval workflows | Application responsibility |
@@ -82,7 +82,7 @@ cannot be delegated to the LLM must be enforced by phronomy or the application l
 | Layer | Responsibility | Status |
 |---|---|---|
 | LLM | Chain-of-thought generation | Prompt-dependent |
-| **phronomy** | Processing step recording via Graph and Tracing | ✅ Partial — `StateGraph`, `Tracing` |
+| **phronomy** | Processing step recording via Graph and Tracing | ✅ Partial — `Workflow`/`WorkflowRunner`, `Tracing` |
 | Application | Explanation UI, CoT prompt design | Application responsibility |
 **Planned work:** None in this iteration.
@@ -184,8 +184,8 @@ attributed to users or sessions, which is a requirement for accountability under
 NIST AI RMF 3.4.
 **Design:**
-- `Agent::Base#invoke` and `CompiledGraph#invoke` already accept `config: {}` — see
-  `lib/phronomy/agent/base.rb:367` and `lib/phronomy/graph/compiled_graph.rb`.
+- `Agent::Base#invoke` and `WorkflowRunner#invoke` already accept `config: {}` — see
+  `lib/phronomy/agent/base.rb` and `lib/phronomy/graph/workflow_runner.rb`.
 - Add two new optional keys to `config:`:
   - `user_id:` (String | nil) — caller identity
   - `session_id:` (String | nil) — session / request identity

data/lib/generators/phronomy/install/templates/create_phronomy_messages.rb.tt CHANGED

@@ -3,7 +3,7 @@ class CreatePhronomyMessages < ActiveRecord::Migration[<%= ActiveRecord::Migrati
     create_table :phronomy_messages do |t|
       t.string :thread_id,       null: false
       t.string :role,            null: false
-      t.text   :content,         null: false
+      t.text   :content
       t.text   :tool_calls_json
       t.string :model_id
       t.timestamps

data/lib/phronomy/actor.rb ADDED

@@ -0,0 +1,68 @@
+# frozen_string_literal: true
+module Phronomy
+  # Lightweight synchronous actor backed by a dedicated +Thread+ and a +Queue+.
+  #
+  # A caller submits work via {#call}, which blocks until the actor's thread
+  # finishes executing the block and then returns the result (or re-raises any
+  # exception that occurred inside the actor).
+  #
+  # === Reentrant safety
+  #
+  # If {#call} is invoked from within the actor's own thread (i.e. from inside
+  # a block that is already executing on this actor), the block is executed
+  # directly in the current thread instead of being pushed onto the queue.
+  # This prevents deadlocks in deeply nested call paths without requiring
+  # callers to track whether they are already "inside" the actor.
+  #
+  # === Usage
+  #
+  #   actor = Phronomy::Actor.new
+  #   result = actor.call { expensive_operation() }   # blocks caller; runs on actor thread
+  #   actor.stop                                       # graceful shutdown
+  class Actor
+    def initialize
+      @queue = Queue.new
+      @thread = Thread.new do
+        loop do
+          task = @queue.pop
+          break if task == :stop
+          task.call
+        end
+      end
+    end
+    # Run +block+ on the actor's thread and return its result.
+    #
+    # If the current thread is already the actor's thread (reentrant call),
+    # the block is executed inline to prevent deadlocks.
+    #
+    # Any exception raised inside the block is captured and re-raised in the
+    # calling thread.
+    #
+    # @yield block to execute on the actor's thread
+    # @return the return value of the block
+    def call(&block)
+      return block.call if Thread.current == @thread
+      done = Queue.new
+      @queue.push(-> {
+        begin
+          done.push([true, block.call])
+        rescue => e
+          done.push([false, e])
+        end
+      })
+      success, value = done.pop
+      raise value unless success
+      value
+    end
+    # Send a +:stop+ sentinel to gracefully terminate the actor's thread.
+    # Pending tasks already in the queue will still be processed before
+    # the thread exits.
+    def stop
+      @queue.push(:stop)
+    end
+  end
+end
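
Note on the behaviour documented in `actor.rb`: the sketch below exercises the three properties the comments describe (work runs on the actor's thread, nested calls run inline, exceptions reach the caller). So that it runs without the gem installed, it defines `SketchActor`, a hypothetical stand-in that mirrors the `Phronomy::Actor` code in this diff. The sample blocks are illustrative only.

```ruby
# Stand-in that mirrors the Phronomy::Actor code shown in the diff above,
# defined locally so the example runs without the gem.
class SketchActor
  def initialize
    @queue = Queue.new
    @thread = Thread.new do
      loop do
        task = @queue.pop
        break if task == :stop
        task.call
      end
    end
  end

  def call(&block)
    return block.call if Thread.current == @thread # reentrant: run inline

    done = Queue.new
    @queue.push(lambda {
      begin
        done.push([true, block.call])
      rescue => e
        done.push([false, e])
      end
    })
    success, value = done.pop
    raise value unless success
    value
  end

  def stop
    @queue.push(:stop)
  end
end

actor = SketchActor.new
caller_thread = Thread.current

# Work runs on the actor's thread, not the caller's.
ran_elsewhere = actor.call { Thread.current != caller_thread }

# A nested call from inside the actor runs inline instead of deadlocking.
nested = actor.call { actor.call { 21 } * 2 }

# Exceptions raised on the actor thread surface in the caller.
error = begin
  actor.call { raise ArgumentError, "bad input" }
rescue ArgumentError => e
  e.message
end

actor.stop
```

Two things follow from reading the code as written, and both are worth checking against the released gem. The bare `rescue => e` only captures `StandardError`, so any other exception would end the actor's thread without answering the caller. A `call` made after `stop` would also wait on a queue that nothing drains.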

data/lib/phronomy/agent/base.rb CHANGED

@@ -412,21 +412,8 @@ module Phronomy
       #   result = MyAgent.new.invoke("What is Ruby?")
       #   puts result[:output]
       def invoke(input, config: {})
-        policy = self.class._retry_policy
-        attempt = 0
-        begin
-          invoke_once(input, config: config)
-        rescue Phronomy::GuardrailError
-          raise
-        rescue
-          if policy && attempt < policy[:times]
-            wait = compute_agent_retry_wait(policy[:wait], policy[:base], attempt)
-            self.class._sleep_proc.call(wait) if wait > 0
-            attempt += 1
-            retry
-          end
-          raise
-        end
+        thread_id = config[:thread_id]
+        _run_in_thread_actor(thread_id) { _invoke_impl(input, config: config) }
       end
       # Streaming version of #invoke. Yields {Phronomy::Agent::StreamEvent} objects
@@ -446,82 +433,8 @@ module Phronomy
       def stream(input, config: {}, &block)
         return invoke(input, config: config) unless block
-        run_input_guardrails!(input)
-        memory = config[:memory]
         thread_id = config[:thread_id]
-        chat = build_chat
-        user_message = extract_message(input)
-        budget = build_token_budget
-        # Assemble context via Assembler (same as invoke_once).
-        assembler = Context::Assembler.new(budget: budget)
-        system_msg = build_instructions(input)
-        assembler.add_instruction(system_msg) if system_msg
-        Array(config[:knowledge_sources]).each do |ks|
-          ks.fetch(query: user_message).each do |chunk|
-            assembler.add_knowledge(chunk[:content], type: chunk[:type], source: chunk[:source])
-          end
-        end
-        if memory && thread_id
-          msgs = load_from_memory(memory, thread_id: thread_id, query: user_message)
-          message_elements = build_message_elements(msgs)
-          # Run on_trim: app may call ctx.remove(seqs) to drop messages this turn.
-          if (trim_cb = self.class._on_trim_callback)
-            trim_ctx = Context::TrimContext.new(message_elements: message_elements, budget: budget)
-            trim_cb.call(trim_ctx)
-            message_elements = trim_ctx.message_elements
-          end
-          # Run on_compaction_trigger → on_compact pipeline before calling the LLM.
-          if (trigger_cb = self.class._on_compaction_trigger_callback)
-            trigger_ctx = Context::TriggerContext.new(message_elements: message_elements, budget: budget)
-            if trigger_cb.call(trigger_ctx)
-              if (compact_cb = self.class._on_compact_callback)
-                compact_ctx = Context::CompactionContext.new(
-                  message_elements: message_elements,
-                  budget: budget,
-                  thread_id: thread_id,
-                  memory: memory
-                )
-                compact_cb.call(compact_ctx)
-                message_elements = build_message_elements(compact_ctx.result_messages)
-              end
-            end
-          end
-          assembler.add_messages(message_elements.map { |e| e[:message] })
-        end
-        context = assembler.build
-        apply_instructions(chat, context[:system]) if context[:system]
-        context[:messages].each { |msg| chat.messages << msg }
-        # Wire per-event callbacks to yield StreamEvents.
-        chat.on_tool_call { |tool_call| block.call(StreamEvent.new(type: :tool_call, payload: {tool_call: tool_call})) }
-        chat.on_tool_result { |tool_result| block.call(StreamEvent.new(type: :tool_result, payload: {tool_result: tool_result})) }
-        # Run before_completion hooks (global → class → instance) before the LLM call.
-        run_before_completion_hooks!(chat, config)
-        response = chat.ask(user_message) do |chunk|
-          block.call(StreamEvent.new(type: :token, payload: {content: chunk.content}))
-        end
-        save_to_memory(memory, thread_id: thread_id, messages: chat.messages) if memory && thread_id
-        output = response.content
-        usage = Phronomy::TokenUsage.from_tokens(response.tokens)
-        run_output_guardrails!(output)
-        result = {output: output, messages: chat.messages, usage: usage}
-        block.call(StreamEvent.new(type: :done, payload: result))
-        result
+        _run_in_thread_actor(thread_id) { _stream_impl(input, config: config, &block) }
       rescue => e
         block&.call(StreamEvent.new(type: :error, payload: {error: e}))
         raise
@@ -561,8 +474,127 @@ module Phronomy
         self
       end
+      # Returns the {Context::ContextVersionCache} for the current thread.
+      # @api private
+      def context_version_cache
+        (Thread.current[:phronomy_context_version_caches] ||= {})[object_id]
+      end
       private
+      # Retry loop for #invoke. Separated so that ReactAgent can override #invoke_once.
+      def _invoke_impl(input, config: {})
+        policy = self.class._retry_policy
+        attempt = 0
+        begin
+          invoke_once(input, config: config)
+        rescue Phronomy::GuardrailError
+          raise
+        rescue
+          if policy && attempt < policy[:times]
+            wait = compute_agent_retry_wait(policy[:wait], policy[:base], attempt)
+            self.class._sleep_proc.call(wait) if wait > 0
+            attempt += 1
+            retry
+          end
+          raise
+        end
+      end
+      # Streaming implementation for #stream.
+      def _stream_impl(input, config: {}, &block)
+        caller_meta = {}
+        caller_meta[:user_id] = config[:user_id] if config[:user_id]
+        caller_meta[:session_id] = config[:session_id] if config[:session_id]
+        trace("agent.invoke", input: input, **caller_meta) do |_span|
+          run_input_guardrails!(input)
+          memory = config[:memory]
+          thread_id = config[:thread_id]
+          chat = build_chat
+          user_message = extract_message(input)
+          budget = build_token_budget
+          # Assemble context via Assembler (same as invoke_once).
+          assembler = Context::Assembler.new(budget: budget)
+          system_msg = build_instructions(input)
+          assembler.add_instruction(system_msg) if system_msg
+          Array(config[:knowledge_sources]).each do |ks|
+            ks.fetch(query: user_message).each do |chunk|
+              assembler.add_knowledge(chunk[:content], type: chunk[:type], source: chunk[:source])
+            end
+          end
+          if memory && thread_id
+            msgs = load_from_memory(memory, thread_id: thread_id, query: user_message)
+            message_elements = build_message_elements(msgs)
+            # Run on_trim: app may call ctx.remove(seqs) to drop messages this turn.
+            if (trim_cb = self.class._on_trim_callback)
+              trim_ctx = Context::TrimContext.new(message_elements: message_elements, budget: budget)
+              trim_cb.call(trim_ctx)
+              message_elements = trim_ctx.message_elements
+            end
+            # Run on_compaction_trigger → on_compact pipeline before calling the LLM.
+            if (trigger_cb = self.class._on_compaction_trigger_callback)
+              trigger_ctx = Context::TriggerContext.new(message_elements: message_elements, budget: budget)
+              if trigger_cb.call(trigger_ctx)
+                if (compact_cb = self.class._on_compact_callback)
+                  compact_ctx = Context::CompactionContext.new(
+                    message_elements: message_elements,
+                    budget: budget,
+                    thread_id: thread_id,
+                    memory: memory
+                  )
+                  compact_cb.call(compact_ctx)
+                  message_elements = build_message_elements(compact_ctx.result_messages)
+                end
+              end
+            end
+            assembler.add_messages(message_elements.map { |e| e[:message] })
+          end
+          context = assembler.build
+          apply_instructions(chat, context[:system]) if context[:system]
+          context[:messages].each { |msg| chat.messages << msg }
+          # Wire per-event callbacks to yield StreamEvents.
+          chat.before_tool_call { |tool_call| block.call(StreamEvent.new(type: :tool_call, payload: {tool_call: tool_call})) }
+          chat.after_tool_result { |tool_result| block.call(StreamEvent.new(type: :tool_result, payload: {tool_result: tool_result})) }
+          # Run before_completion hooks (global → class → instance) before the LLM call.
+          run_before_completion_hooks!(chat, config)
+          response = chat.ask(user_message) do |chunk|
+            block.call(StreamEvent.new(type: :token, payload: {content: chunk.content}))
+          end
+          save_to_memory(memory, thread_id: thread_id, messages: chat.messages) if memory && thread_id
+          output = response.content
+          usage = Phronomy::TokenUsage.from_tokens(response.tokens)
+          run_output_guardrails!(output)
+          result = {output: output, messages: chat.messages, usage: usage}
+          block.call(StreamEvent.new(type: :done, payload: result))
+          [result, usage]
+        end
+      end
+      # Runs +block+ inside the {Phronomy::ThreadActorRegistry} Actor for
+      # +thread_id+. When +thread_id+ is nil the block executes on the calling thread.
+      def _run_in_thread_actor(thread_id, &block)
+        return block.call unless thread_id
+        Phronomy::ThreadActorRegistry.for(thread_id).call(&block)
+      end
       # Performs a single (non-retried) invocation. Extracted so that #invoke can
       # wrap it in a retry loop without duplicating the LLM interaction logic.
       def invoke_once(input, config: {})
@@ -784,7 +816,9 @@ module Phronomy
           [instruction.to_s, *static_chunks.map { |c| c[:content] }].join("\0")
         )
-        cache = (@_context_version_cache ||= Context::ContextVersionCache.new)
+        agent_id = object_id
+        cache = (Thread.current[:phronomy_context_version_caches] ||= {})[agent_id] ||=
+          Context::ContextVersionCache.new
         unless cache.valid?(fingerprint)
           parts = [instruction]
           static_chunks.each do |chunk|
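
Note on the last hunk: the context version cache moves from an instance variable to a hash stored on `Thread.current` and keyed by the agent's `object_id`. The sketch below shows what that lookup shape implies. `CachedThing`, the `:sketch_caches` key, and the `Object.new` cache value are hypothetical stand-ins for the agent, the gem's key, and `Context::ContextVersionCache`.

```ruby
class CachedThing
  # Same lookup shape as the diff: one hash per thread, one entry per object.
  def cache
    (Thread.current[:sketch_caches] ||= {})[object_id] ||= Object.new
  end

  # Read-only lookup, shaped like the new context_version_cache reader.
  def peek
    (Thread.current[:sketch_caches] ||= {})[object_id]
  end
end

thing = CachedThing.new

main_cache         = thing.cache                      # built on the main thread
same_thread_again  = thing.cache                      # reused on the same thread
other_thread_cache = Thread.new { thing.cache }.value # separate cache elsewhere
peek_from_fresh    = Thread.new { thing.peek }.value  # nothing built there yet
```

Because calls that carry a `thread_id` run on that thread's actor, the cache in this pattern would live on the actor's thread, and a reader invoked from another thread would see `nil`. `Thread#[]` is also fiber-local in Ruby, so a different Fiber on the same thread gets its own hash as well.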

data/lib/phronomy/agent/handoff.rb CHANGED

@@ -25,10 +25,10 @@ module Phronomy
       def initialize(target_agent:, description: nil)
         @target_agent = target_agent
         klass_name = target_agent.class.name&.split("::")&.last || "Agent"
-        @tool_name = "transfer_to_#{snake_case(klass_name)}"
-        @description = description || "Transfer the conversation to #{klass_name}."
         # Use a UUID so that two handoffs targeting the same class remain distinct.
         @uuid = SecureRandom.uuid
+        @tool_name = "transfer_to_#{snake_case(klass_name)}_#{@uuid.delete("-")[0, 8]}"
+        @description = description || "Transfer the conversation to #{klass_name}."
       end
       # Builds an anonymous Phronomy::Tool::Base subclass for this handoff.
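
Note on the reordering in `handoff.rb`: the tool name now embeds the first eight hex characters of the UUID, so `@uuid` has to be assigned before `@tool_name`. The sketch below reproduces the naming expression in isolation. `handoff_tool_name` is a hypothetical wrapper, and `snake_case` here is an assumed stand-in because the gem's own helper is not shown in this diff.

```ruby
require "securerandom"

# Hypothetical stand-in for the gem's snake_case helper, which this diff does not show.
def snake_case(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Mirrors the new naming expression from the diff.
def handoff_tool_name(klass_name, uuid = SecureRandom.uuid)
  "transfer_to_#{snake_case(klass_name)}_#{uuid.delete("-")[0, 8]}"
end

first  = handoff_tool_name("BillingAgent")
second = handoff_tool_name("BillingAgent")
fixed  = handoff_tool_name("BillingAgent", "123e4567-e89b-12d3-a456-426614174000")
```

With the fixed UUID the name is `transfer_to_billing_agent_123e4567`. Two handoffs to the same class get different suffixes, so their tool names no longer collide, but the names also change on every construction and so cannot be hard-coded by callers.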