phronomy 0.7.0 → 0.7.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (70)
  1. checksums.yaml +4 -4
  2. data/.mutant.yml +8 -7
  3. data/CHANGELOG.md +151 -1
  4. data/README.md +155 -32
  5. data/Rakefile +33 -0
  6. data/benchmark/baseline.json +1 -1
  7. data/benchmark/bench_regression.rb +1 -0
  8. data/docs/decisions/004-invoke-timeout-is-not-cancellation.md +24 -0
  9. data/docs/decisions/006-no-built-in-guardrails.md +20 -2
  10. data/docs/decisions/010-cooperative-first-concurrency.md +248 -0
  11. data/lib/phronomy/agent/base.rb +250 -65
  12. data/lib/phronomy/agent/concerns/suspendable.rb +15 -0
  13. data/lib/phronomy/agent/fsm.rb +41 -64
  14. data/lib/phronomy/agent/orchestrator.rb +146 -121
  15. data/lib/phronomy/agent/parallel_tool_chat.rb +79 -22
  16. data/lib/phronomy/agent/react_agent.rb +8 -0
  17. data/lib/phronomy/async_queue.rb +155 -0
  18. data/lib/phronomy/blocking_adapter_pool.rb +435 -0
  19. data/lib/phronomy/cancellation_scope.rb +123 -0
  20. data/lib/phronomy/cancellation_token.rb +43 -2
  21. data/lib/phronomy/concurrency_gate.rb +155 -0
  22. data/lib/phronomy/configuration.rb +142 -0
  23. data/lib/phronomy/deadline.rb +63 -0
  24. data/lib/phronomy/diagnostics.rb +62 -0
  25. data/lib/phronomy/embeddings/base.rb +17 -0
  26. data/lib/phronomy/eval/runner.rb +9 -9
  27. data/lib/phronomy/event_loop.rb +181 -43
  28. data/lib/phronomy/fsm_session.rb +50 -4
  29. data/lib/phronomy/guardrail/prompt_injection_guardrail.rb +58 -0
  30. data/lib/phronomy/invocation_context.rb +152 -0
  31. data/lib/phronomy/knowledge_source/base.rb +18 -0
  32. data/lib/phronomy/llm_adapter/base.rb +104 -0
  33. data/lib/phronomy/llm_adapter/ruby_llm.rb +41 -0
  34. data/lib/phronomy/llm_adapter.rb +20 -0
  35. data/lib/phronomy/metrics.rb +38 -0
  36. data/lib/phronomy/runtime/deterministic_scheduler.rb +412 -0
  37. data/lib/phronomy/runtime/fake_scheduler.rb +165 -0
  38. data/lib/phronomy/runtime/gate_registry.rb +52 -0
  39. data/lib/phronomy/runtime/pool_registry.rb +57 -0
  40. data/lib/phronomy/runtime/runtime_metrics.rb +117 -0
  41. data/lib/phronomy/runtime/scheduler.rb +98 -0
  42. data/lib/phronomy/runtime/scheduler_timer_adapter.rb +79 -0
  43. data/lib/phronomy/runtime/task_registry.rb +48 -0
  44. data/lib/phronomy/runtime/thread_scheduler.rb +30 -0
  45. data/lib/phronomy/runtime/timer_queue.rb +106 -0
  46. data/lib/phronomy/runtime/timer_service.rb +42 -0
  47. data/lib/phronomy/runtime.rb +374 -0
  48. data/lib/phronomy/task/backend.rb +80 -0
  49. data/lib/phronomy/task/fiber_backend.rb +157 -0
  50. data/lib/phronomy/task/immediate_backend.rb +89 -0
  51. data/lib/phronomy/task/thread_backend.rb +84 -0
  52. data/lib/phronomy/task.rb +275 -0
  53. data/lib/phronomy/task_group.rb +265 -0
  54. data/lib/phronomy/testing/fake_clock.rb +109 -0
  55. data/lib/phronomy/testing/fake_scheduler.rb +104 -0
  56. data/lib/phronomy/testing/scheduler_helpers.rb +59 -0
  57. data/lib/phronomy/testing.rb +12 -0
  58. data/lib/phronomy/tool/base.rb +110 -2
  59. data/lib/phronomy/tool/mcp_tool.rb +47 -16
  60. data/lib/phronomy/tool/scope_policy.rb +50 -0
  61. data/lib/phronomy/tool_executor.rb +106 -0
  62. data/lib/phronomy/tracing/open_telemetry_tracer.rb +34 -0
  63. data/lib/phronomy/vector_store/async_backend.rb +110 -0
  64. data/lib/phronomy/vector_store/base.rb +7 -0
  65. data/lib/phronomy/version.rb +1 -1
  66. data/lib/phronomy/workflow.rb +52 -5
  67. data/lib/phronomy/workflow_context.rb +29 -2
  68. data/lib/phronomy/workflow_runner.rb +74 -3
  69. data/lib/phronomy.rb +42 -0
  70. metadata +40 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: fbca82a7a23706719deda2e827af5a9b342c9b388d700929ca9eca19a531a2c9
4
- data.tar.gz: b9727e2010acefc14738dbd5b71b5ea06b10f5ebd994a858dfbf1133e15ed003
3
+ metadata.gz: d9ae370d656048e38f700b6bced931fe249f731cea819ab94691eb4bcf6ef43c
4
+ data.tar.gz: 97d01ca3475f547a41397d1dad2ddb8ccaa10f6466d5a75c3f79e6875a7af0c6
5
5
  SHA512:
6
- metadata.gz: c33ee2c26a4b6e3d0470f4d95e30b04e9bc228cc87bdd807db1569c707888817f2d904f085822f91e94441e5c21e95ab348f0d7bc56b1ef87c72770ed4976e1d
7
- data.tar.gz: 706148e7047ab570ca7d69f735f5767c2a983cf35312a70732723595ea80ed3b5f3efaaddd59d393f5680ec91c36c6ca3e01a70dcb671faa940ae9211df59b5a
6
+ metadata.gz: d3ab9ebd145e1ed706ad1741a2e3184c412aa8fd0eac32c95eb0b4a1ef87af38ae73eb5b4205b7f2894dd228929130c9a7569d24a1d7a571a5aa3ec5a68a4172
7
+ data.tar.gz: efa88afdbaa2f3d8fc38ee7cbc7044711479490546a888d44540f3b6bae6da60a3a3e64cfbbef455d65f78bab64dd9a68056e4c9f7ac7a360d512179364c8b23
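Note on using the digests above: a minimal sketch of how one might compare a downloaded artifact against the `+` values, using Ruby's standard `digest` library. The helper name `digest_matches?` and the in-memory payload are illustrative assumptions, not part of the gem or of RubyGems tooling. In practice the bytes would be read from the `metadata.gz` or `data.tar.gz` entry of the unpacked `.gem` archive.

```ruby
require "digest"

# Hypothetical helper: compares the digest of `bytes` with an expected hex
# string such as the ones listed under SHA256 / SHA512 in checksums.yaml.
def digest_matches?(bytes, expected_hex, algorithm: :sha256)
  klass = case algorithm
          when :sha256 then Digest::SHA256
          when :sha512 then Digest::SHA512
          else raise ArgumentError, "unsupported algorithm: #{algorithm}"
          end
  klass.hexdigest(bytes) == expected_hex.strip.downcase
end

# Illustration with an in-memory payload instead of a real gem entry.
payload  = "example bytes"
expected = Digest::SHA256.hexdigest(payload)

puts digest_matches?(payload, expected)        # same bytes as the digest was computed from
puts digest_matches?(payload + "!", expected)  # altered bytes
```

Checking both the SHA256 and the SHA512 entry only requires calling the helper twice with the matching `algorithm:` value.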
data/.mutant.yml CHANGED
@@ -12,10 +12,11 @@ includes:
12
12
  requires:
13
13
  - phronomy
14
14
 
15
- subjects:
16
- - Phronomy::WorkflowContext
17
- - Phronomy::WorkflowRunner
18
- - Phronomy::Tool::Base
19
- - Phronomy::Context::TokenBudget
20
- - Phronomy::Context::TokenEstimator
21
- - Phronomy::VectorStore::InMemory
15
+ matcher:
16
+ subjects:
17
+ - Phronomy::WorkflowContext
18
+ - Phronomy::WorkflowRunner
19
+ - Phronomy::Tool::Base
20
+ - Phronomy::Context::TokenBudget
21
+ - Phronomy::Context::TokenEstimator
22
+ - Phronomy::VectorStore::InMemory
data/CHANGELOG.md CHANGED
@@ -11,6 +11,104 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
11
11
 
12
12
  ### Added
13
13
 
14
+ - **`Phronomy::Diagnostics` and `SchedulerReentrancyError`** (#278, #279):
15
+ `Phronomy::Diagnostics` exposes a snapshot of current scheduler state
16
+ (`pending_count`, `active_tasks`, `pool_utilization`, etc.) for debugging and
17
+ monitoring. `SchedulerReentrancyError` is raised when a scheduler operation is
18
+ attempted from within a scheduler callback, preventing deadlocks.
19
+ `Phronomy.configure { |c| c.scheduler_debug = true }` enables verbose scheduler
20
+ logging.
21
+
22
+ - **`task_id` / `parent_task_id` on `InvocationContext`** (#277):
23
+ Every task spawned via `Task.spawn` now carries a `task_id` (a random UUID) and
24
+ an optional `parent_task_id`. These fields enable hierarchical task-tree tracing
25
+ and are forwarded automatically by `TaskGroup`.
26
+
27
+ - **`Phronomy::Metrics` — task-centric observability snapshot** (#276):
28
+ `Phronomy::Metrics.snapshot` returns a hash with scheduler statistics:
29
+ `tasks_started`, `tasks_completed`, `tasks_failed`, `pool_queue_depth`, and
30
+ `pool_active_threads`. Intended for metrics export and health-check endpoints.
31
+
32
+ - **`Phronomy::Testing::FakeClock` and `FakeScheduler`** (#273):
33
+ Two test helpers for deterministic concurrency testing.
34
+ `FakeClock` exposes `advance(seconds)` to control the passage of time without
35
+ sleeping. `FakeScheduler` replaces the real scheduler in specs, providing
36
+ synchronous execution and `flush` / `drain` helpers to drive task completion.
37
+
38
+ - **`ScopePolicy` and approval gate integration** (#270):
39
+ `Phronomy::Tool::ScopePolicy` is a callable that maps `(tool_class, scope, agent)`
40
+ to `:allow`, `:approve`, or `:reject`. The default policy (`ScopePolicy::DEFAULT`)
41
+ automatically routes tools declaring high-risk scopes (`:write`, `:admin`,
42
+ `:external_network`, `:filesystem`, `:process`, `:external_process`) through the
43
+ existing approval gate; tools with `scope :read_only` or no scope are allowed
44
+ unconditionally. Per-agent policy overrides are available via
45
+ `agent.scope_policy = my_policy`.
46
+ **Behaviour change**: tools with the above scopes that previously executed without
47
+ an approval handler will now be **rejected** unless an approval handler is
48
+ registered or the agent uses a custom permissive policy.
49
+
50
+ - **`PromptInjectionGuardrail`, `Tool::Base#redact_params`, and `#max_result_size`** (#271):
51
+ `Phronomy::Guardrail::PromptInjectionGuardrail` is a built-in `InputGuardrail`
52
+ subclass that detects prompt-injection patterns in user input.
53
+ `Tool::Base.redact_params(*names)` marks parameter names as sensitive; their
54
+ values are replaced with `"[REDACTED]"` in log and trace output.
55
+ `Tool::Base.max_result_size(n)` sets a per-tool character limit; results
56
+ exceeding the limit are truncated and a warning is logged. The global fallback is
57
+ `Phronomy.configure { |c| c.tool_result_max_size = n }` (default: no limit).
58
+
59
+ - **`execution_mode` DSL on `Tool::Base`** (#263):
60
+ `Tool::Base.execution_mode` accepts `:cooperative`, `:blocking_io` (default),
61
+ `:cpu_bound`, or `:external_process`. Tools marked `:blocking_io` (the default)
62
+ are dispatched through `BlockingAdapterPool` when a `Runtime` is available,
63
+ keeping the scheduler thread unblocked. Tools marked `:cooperative` are called
64
+ directly on the scheduler thread (suitable for pure in-memory operations).
65
+
66
+ - **`invoke_async` and `call_async` — async entry points** (#262):
67
+ `Agent::Base#invoke_async(input, **opts)` returns a `Phronomy::Task` wrapping
68
+ `#invoke`. `Workflow#invoke_async(input, config:)` does the same for workflows.
69
+ `Tool::Base#call_async(args, cancellation_token:)` returns a `Task` wrapping
70
+ `#call`. All three are backward-compatible with existing synchronous callers.
71
+
72
+ - **`LLMAdapter` abstraction** (#266):
73
+ `Phronomy::LLMAdapter::Base` decouples the agent pipeline from RubyLLM.
74
+ `Phronomy::LLMAdapter::RubyLLM` (registered by default) wraps the existing
75
+ integration. Custom adapters can be registered via
76
+ `Phronomy.configure { |c| c.llm_adapter = MyAdapter }` for testing or
77
+ alternative LLM backends.
78
+
79
+ - **`BlockingAdapterPool` backpressure limits** (#268):
80
+ `BlockingAdapterPool` now enforces configurable `pool_size` (default: 10) and
81
+ `queue_size` (default: 100) limits. Tasks submitted when the queue is full raise
82
+ `Phronomy::BackpressureError` immediately instead of growing the queue without
83
+ bound.
84
+
85
+ - **Cooperative scheduler fairness** (#269):
86
+ The scheduler measures per-task lag and emits starvation and dispatch warnings
87
+ via `Phronomy.configuration.logger` when tasks wait longer than configured
88
+ thresholds. Configurable via `scheduler_starvation_warn_ms` and
89
+ `scheduler_dispatch_warn_ms`.
90
+
91
+ - **Workflow entry actions awaitable with Task** (#264):
92
+ Entry action lambdas may now return a `Phronomy::Task`. The FSMSession awaits
93
+ the task on a background thread and posts `:action_completed` (with the resulting
94
+ `WorkflowContext`) or `:state_completed` back to the EventLoop without blocking
95
+ it. Backward-compatible: lambdas that return a `WorkflowContext` or `nil`
96
+ continue to work as before.
97
+
98
+ - **`Task`, `TaskGroup`, `AsyncQueue`, `Deadline`, `InvocationContext`, `Runtime` concurrency abstractions** (#255):
99
+ Six new concurrency primitives form the foundation of the async execution layer.
100
+ `Task` wraps a callable with cancellation, timeout (`Deadline`), and context
101
+ propagation (`InvocationContext`). `TaskGroup` runs tasks concurrently and waits
102
+ for all to finish (or the first failure). `AsyncQueue` is a bounded, cancellable
103
+ queue. `Runtime` is the top-level façade that resolves a `BlockingAdapterPool`
104
+ and provides `blocking_io { }` and `cpu_bound { }` dispatch helpers.
105
+
106
+ - **`BlockingAdapterPool`** (#256):
107
+ A bounded thread pool that isolates blocking I/O (LLM calls, database queries,
108
+ HTTP requests) from the cooperative scheduler thread. Default pool size is 10
109
+ threads with a queue depth of 100. Replaces direct `Thread.new` calls in core
110
+ agent and tool paths.
111
+
14
112
  - **`VectorStore#size` — document count for all backends, contract coverage for RedisSearch and Pgvector** (#240):
15
113
  `VectorStore::Base` gains `#size` as an abstract method; `InMemory`, `RedisSearch`,
16
114
  and `Pgvector` all implement it. `RedisSearch#size` queries `FT.INFO num_docs`;
@@ -128,9 +226,52 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
128
226
  `dispatch_parallel` and `fan_out` accept `cancellation_token:` and automatically
129
227
  inject it into every worker task's config unless the task already supplies its own.
130
228
 
229
+ ### Removed
230
+
231
+ - **BREAKING: `Agent::Base#run_as_child` drops `&result_writer` block parameter** (#265):
232
+ The optional block form `run_as_child(input, ctx: ctx) { |r| ctx.answer = r[:output] }`
233
+ is no longer supported. The result is now delivered **exclusively** as the
234
+ `:child_completed` event payload `{ output:, messages:, usage: }`. The parent
235
+ Workflow task is the sole owner of the `WorkflowContext`; no background thread
236
+ writes to it directly. Callers that were using the block to write back into the
237
+ context must update their workflow design (e.g. read the result in the target
238
+ state's entry action after the transition, or store output through an external
239
+ shared resource if needed).
240
+
241
+ - **BREAKING (internal): `AgentFSM#initialize` drops `result_writer:` keyword** (#265):
242
+ Direct callers of `AgentFSM.new(result_writer: ...)` must remove that keyword.
243
+ This class is considered internal; gem consumers should use `run_as_child` instead.
244
+
131
245
  ### Changed
132
246
 
133
- - **`CancellationToken` checked at granular checkpoints** (#223):
247
+ - **`AgentFSM`, `ParallelToolChat`, and `Orchestrator` use `Task`/`TaskGroup` instead of bare `Thread.new`** (#257, #258, #259):
248
+ All three components now spawn async work through the `Task` and `TaskGroup`
249
+ abstractions. This enables cancellation propagation, context threading, and
250
+ `BlockingAdapterPool` routing. No public API changes; behaviour is equivalent.
251
+
252
+ - **`Thread.current[:phronomy_*]` context propagation replaced with explicit `InvocationContext`** (#260):
253
+ Thread-local keys `phronomy_event_loop_thread`, `phronomy_cancellation_token`,
254
+ and `phronomy_context_version_caches` are no longer used as the primary
255
+ propagation channel. `InvocationContext` is threaded explicitly through call
256
+ stacks. Importantly, `Tool::Base#call` no longer falls back to
257
+ `Thread.current[:phronomy_cancellation_token]`; cancellation is only observed
258
+ when the caller passes `cancellation_token:` explicitly (or when
259
+ `ParallelToolChat` injects it). Tools that relied on the thread-local fallback
260
+ must be updated.
261
+
262
+ - **`Timeout.timeout` removed from core paths; replaced with `CancellationScope`** (#261):
263
+ `Agent::Base#invoke` and `McpTool::StdioTransport` no longer use `Timeout.timeout`
264
+ (which is unsafe with `Thread.new` and `ensure` blocks). A `CancellationScope`
265
+ with `deadline_in(seconds)` provides equivalent semantics without the thread-
266
+ interruption hazards. `ScopeTimeoutError < TimeoutError` is raised on expiry.
267
+
268
+ - **RAG/VectorStore blocking I/O placed behind `BlockingAdapterPool` async boundary** (#267):
269
+ `KnowledgeSource#fetch` and all three `VectorStore` backends now execute their
270
+ blocking I/O through `Runtime#blocking_io` when a `Runtime` is present. Callers
271
+ in a synchronous context see no change; callers in an EventLoop context benefit
272
+ from non-blocking scheduler behaviour.
273
+
274
+
134
275
  The cancellation token (passed via `config: { cancellation_token: token }`) is
135
276
  now checked at multiple additional points beyond the initial LLM call boundary:
136
277
  before each `KnowledgeSource#fetch` in `build_context` (RAG phase); after each
@@ -195,6 +336,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
195
336
 
196
337
  ### Fixed
197
338
 
339
+ - **`tool_name` preserved in `Orchestrator#prepare_tool_class` anonymous subclass wrapper**:
340
+ When `Orchestrator#prepare_tool_class` wrapped a subagent tool in an anonymous
341
+ subclass (`Class.new(prepared)`), the class-level instance variable `@tool_name`
342
+ was not inherited, causing the wrapper's `tool_name` to return `nil`. RubyLLM
343
+ then registered the tool under a `nil` key, making it unreachable when the LLM
344
+ called it by name. The fix captures the effective name before subclassing and
345
+ calls `tool_name effective_name` explicitly inside the anonymous class body —
346
+ the same pattern already used by the approval-gate wrapper.
347
+
198
348
  - **`EventLoop#start` is now idempotent; stale `:__stop__` sentinel race fixed** (#203):
199
349
  Calling `start` on an already-running `EventLoop` is now a no-op. Fixed a race condition
200
350
  where `stop` setting `@running = false` before the worker thread was scheduled left the
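Note on the backpressure entries in the CHANGELOG hunks above: they describe `BlockingAdapterPool` as a bounded thread pool (`pool_size` default 10, `queue_size` default 100) that raises `Phronomy::BackpressureError` when the queue is full. The sketch below illustrates that general idea with Ruby's standard `SizedQueue`. It is not Phronomy's implementation; `TinyPool` and `QueueFullError` are hypothetical names chosen for this example.

```ruby
# Hypothetical names throughout; this is not Phronomy's BlockingAdapterPool.
class QueueFullError < StandardError; end

class TinyPool
  def initialize(pool_size:, queue_size:)
    @queue = SizedQueue.new(queue_size)
    @workers = Array.new(pool_size) do
      Thread.new do
        # Queue#pop returns nil once the queue is closed and drained.
        # A job that raises would end its worker; a real pool would rescue.
        while (job = @queue.pop)
          job.call
        end
      end
    end
  end

  # Non-blocking submit: fail fast instead of waiting for a free slot.
  def submit(&job)
    @queue.push(job, true)
    self
  rescue ThreadError
    raise QueueFullError, "queue is full"
  end

  def shutdown
    @queue.close
    @workers.each(&:join)
  end
end

pool     = TinyPool.new(pool_size: 1, queue_size: 1)
started  = Queue.new
release  = Queue.new
results  = Queue.new
rejected = false

pool.submit { started << true; release.pop; results << :first }
started.pop                         # wait until the single worker is busy
pool.submit { results << :second }  # occupies the only queue slot

begin
  pool.submit { results << :third } # no worker free, no slot free
rescue QueueFullError => e
  rejected = true
  puts "rejected: #{e.message}"
end

release << true
pool.shutdown
```

Rejecting at submit time keeps memory bounded and pushes the decision (retry, shed load, fail the request) back to the caller, which matches the "raise immediately instead of growing the queue without bound" wording in the changelog.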
data/README.md CHANGED
@@ -22,31 +22,80 @@ It provides composable building blocks — Workflows, Agents, Tools, Guardrails,
22
22
  > **Note**: The `main` branch contains unreleased development work. Pin to a released gem
23
23
  > version (`gem "phronomy", "~> 0.x"`) for stability in production.
24
24
 
25
+ **Core building blocks**
26
+
25
27
  | Feature | Stability |
26
28
  |---|---|
27
29
  | **Workflow** — Stateful, branching workflows with wait_state/send_event | Stable |
28
- | **Workflow EventLoop Mode** — Opt-in event-driven execution: `Phronomy.configure { \|c\| c.event_loop = true }` | Experimental |
29
- | **Agent EventLoop Mode** — `Agent#invoke` (non-blocking via EventLoop), `Agent#run_as_child` (child-FSM pattern for Workflow integration), parallel tool dispatch via `ParallelToolChat` | Experimental |
30
- | **Workflow Parallel Node** — Concurrent branches via application-level threads | Beta |
30
+ | **Workflow action_timeout** — Per-state `action_timeout:` keyword on `state` DSL; cancels Task-returning entry actions that exceed the limit and raises `Phronomy::ActionTimeoutError` | Beta |
31
31
  | **Agent** — ReAct-style tool-calling agents with guardrails and conversation history | Stable |
32
32
  | **Before-Completion Hook** — Three-tier LLM parameter injection | Stable |
33
33
  | **Context Management** — Token budget calculation, estimation, and pruning | Stable |
34
- | **Knowledge/RAG** — Retrieval sources with pluggable loaders, splitters, and vector stores; `static_knowledge_refresh!` for runtime cache invalidation | Beta |
35
- | **`VectorStore#size`** — Returns document count for all three backends (InMemory, RedisSearch, Pgvector) | Beta |
36
- | **Multi-agent** — Agent-as-Tool pattern and hub-and-spoke handoff routing | Beta |
37
- | **GeneratorVerifier** — Generator-Verifier loop with injectable prompt builders/parsers | Beta |
38
- | **Agent::Orchestrator** — Parallel subagent dispatch, fan-out, and `subagent` DSL | Beta |
39
- | **Agent::TeamCoordinator** — Agent teams pattern: LLM coordinator + stateful workers with sequential task assignment (worker-local message history persisted across tasks) | Beta |
40
- | **Agent::SharedState** — Shared state pattern: peer agents collaborate via a shared KnowledgeStore; `member` DSL with per-agent instructions and `coordination` team protocol | Experimental |
41
34
  | **Guardrails** — Input/output validation with custom `InputGuardrail`/`OutputGuardrail` | Beta |
35
+ | **`PromptInjectionGuardrail`** — Built-in `InputGuardrail` subclass that detects prompt-injection patterns; usable standalone or as part of a guardrail chain | Beta |
36
+ | **`Tool::Base.redact_params` / `.max_result_size`** — Class-level DSL: `redact_params` masks parameter values in log/trace output; `max_result_size` truncates oversized tool results before they reach the LLM | Beta |
42
37
  | **Output Parser** — JSON and Struct-mapped parsers for structured LLM responses | Stable |
43
38
  | **Eval Framework** — Dataset-driven evaluation with multiple scorer types | Beta |
44
39
  | **Tracing** — Pluggable span-based observability | Stable |
45
- | **MCP Tool** — Model Context Protocol server integration | Beta |
46
40
  | **Error Taxonomy** — `RateLimitError`, `AuthenticationError`, `ContextLengthError`, `TransportError` (subclasses of `Phronomy::Error`) raised at the agent retry boundary | Beta |
47
- | **`Phronomy.with_configuration` / `Phronomy.reset_runtime!`** — Scoped configuration override and full runtime reset for test isolation | Beta |
41
+
42
+ **Knowledge and integration**
43
+
44
+ | Feature | Stability |
45
+ |---|---|
46
+ | **Knowledge/RAG** — Retrieval sources with pluggable loaders, splitters, and vector stores; `static_knowledge_refresh!` for runtime cache invalidation | Beta |
47
+ | **`VectorStore#size`** — Returns document count for all three backends (InMemory, RedisSearch, Pgvector) | Beta |
48
+ | **`VectorStore::AsyncBackend` mixin** — Pluggable async interface for `VectorStore`; default pool-backed implementations for `search_async`, `add_async`, `remove_async`, `clear_async`; backends with native async drivers override individual methods to bypass `BlockingAdapterPool` entirely; all existing backends remain unchanged | Beta |
49
+ | **Parallel RAG multi-source fetch** — `Agent#build_context` fetches all `knowledge_sources` concurrently via `TaskGroup`; `config[:rag_failure_policy]` `:skip` (default) silently ignores failed sources so the agent answers with partial context, `:fail` surfaces the first error; per-source latency is emitted to `Phronomy.configuration.logger` at debug level | Beta |
50
+ | **MCP Tool** — Model Context Protocol server integration | Beta |
51
+
52
+ **Execution and reliability**
53
+
54
+ | Feature | Stability |
55
+ |---|---|
56
+ | **Workflow EventLoop Mode** — Opt-in event-driven execution: `Phronomy.configure { \|c\| c.event_loop = true }` | Experimental |
57
+ | **Agent EventLoop Mode** — `Agent#invoke` (non-blocking via EventLoop), `Agent#run_as_child` (child-FSM pattern for Workflow integration), parallel tool dispatch via `ParallelToolChat` | Experimental |
58
+ | **`invoke_async` / `call_async`** — `Agent::Base#invoke_async` and `Workflow#invoke_async` return a `Task`; `Tool::Base#call_async` similarly; compatible with EventLoop and standalone contexts | Experimental |
48
59
  | **CancellationToken** — Cooperative cancellation via `cancel!`/`cancelled?`/`raise_if_cancelled!`; `timeout_after(seconds)` for monotonic-clock deadlines; optional `deadline:` (wall-clock) for backward compatibility; passed as `config: { cancellation_token: token }` to agents and `dispatch_parallel`; injected into `tool.execute` when the method declares a `cancellation_token:` keyword | Experimental |
49
60
  | **`dispatch_parallel` / `fan_out` `force_kill:` option** — `force_kill: false` (default) leaves timed-out workers running and raises `TimeoutError` immediately; `force_kill: true` restores the old `Thread#kill` behaviour with a `logger.warn` | Beta |
61
+ | **`execution_mode` DSL on `Tool::Base`** — Declares how a tool's `execute` should be dispatched: `:cooperative` (same scheduler thread), `:blocking_io` (default; offloaded to `BlockingAdapterPool`), `:cpu_bound`, `:external_process` | Experimental |
62
+ | **`invocation_context:` keyword on `Agent#invoke` / `Workflow#invoke`** — Pass a `Phronomy::InvocationContext` directly; `thread_id`, `cancellation_token`, and `deadline`-based timeout are derived from it; `task_id` / `parent_task_id` appear in trace spans automatically; `config:` keys remain supported as backward-compat aliases | Beta |
63
+ | **`ConcurrencyGate` — unified backpressure** — Counting semaphore that enforces per-resource concurrency caps (`max_concurrent_agent_tasks`, `max_concurrent_tool_tasks`, `max_concurrent_workflow_tasks`, `max_concurrent_llm_calls`, `max_concurrent_rag_fetches`, `max_concurrent_vector_searches`); configured via `Phronomy.configure`; backpressure behaviour follows the global `backpressure` setting (`:wait`, `:raise`/`:reject`, `:timeout`); `nil` cap = unlimited (default) | Beta |
64
+ | **Cooperative scheduler yield points** — `Runtime#yield` (cooperative yield; yields the current task's time slice); `Runtime#yield_if_needed(every: N)` (thread-local counter, yields every N calls); CPU-bound detection when `blocking_detect_threshold_ms` is set (warns and increments `non_yield_threshold_violation_count` when a task runs longer than the threshold without yielding); `starvation_threshold_ms` configuration field (default: 50ms) | Beta |
65
+ | **`Phronomy::Metrics`** — `Phronomy::Metrics.snapshot` returns task-tree and pool counters; task-centric keys: `active_agent_tasks`, `active_tool_tasks`, `active_workflow_tasks`, `active_rag_tasks`, `active_llm_tasks`, `task_wait_time_p50_ms`, `task_wait_time_p95_ms`, `task_run_time_p50_ms`, `task_run_time_p95_ms`, `cancelled_tasks`, `failed_tasks`, `non_yield_threshold_violation_count`; pool/event-loop keys remain for backward compatibility; `Runtime#task_snapshot` exposes task-centric metrics directly | Beta |
66
+ | **`Phronomy.with_configuration` / `Phronomy.reset_runtime!`** — Scoped configuration override and full runtime reset for test isolation | Beta |
67
+
68
+ **Agent patterns**
69
+
70
+ | Feature | Stability |
71
+ |---|---|
72
+ | **Workflow parallel pattern** — Concurrent branches via application-level threads (no built-in parallel primitive; see the Workflow section for the recommended pattern) | Beta |
73
+ | **Multi-agent** — Agent-as-Tool pattern and hub-and-spoke handoff routing | Beta |
74
+ | **GeneratorVerifier** — Generator-Verifier loop with injectable prompt builders/parsers | Beta |
75
+ | **Agent::Orchestrator** — Parallel subagent dispatch, fan-out, and `subagent` DSL | Beta |
76
+ | **Agent::TeamCoordinator** — Agent teams pattern: LLM coordinator + stateful workers with sequential task assignment (worker-local message history persisted across tasks) | Beta |
77
+ | **Agent::SharedState** — Shared state pattern: peer agents collaborate via a shared KnowledgeStore; `member` DSL with per-agent instructions and `coordination` team protocol | Experimental |
78
+ | **`ScopePolicy`** — Configurable policy callable that maps (tool, scope, agent) to `:allow`/`:approve`/`:reject`; default policy auto-routes high-risk scopes through the approval gate | Experimental |
79
+
80
+ > **Public API boundary**: The tables above are the complete list of classes, modules, and features
81
+ > intended for gem consumers. Every entry has an associated stability label.
82
+ > All other classes, modules, and methods — including everything in the
83
+ > [Advanced / Internal APIs](#advanced--internal-apis) section below — are
84
+ > marked `@api private` in source and may change without notice. Do not
85
+ > depend on internal APIs in application code.
86
+
87
+ ## Advanced / Internal APIs
88
+
89
+ The APIs listed below are intended for advanced use cases, framework internals, and test infrastructure. Typical application code does not need to interact with them directly.
90
+
91
+ > These APIs are subject to change without the same backwards-compatibility guarantees as the stable public API.
92
+
93
+ | Feature | Stability |
94
+ |---|---|
95
+ | **`Phronomy::Diagnostics`** — Snapshot of scheduler internals for debug/monitoring; `SchedulerReentrancyError` raised on invalid re-entrant scheduler use; `Runtime.in_scheduler_context?` returns `true` when called from inside a scheduler task | Experimental |
96
+ | **`Phronomy::Testing::FakeClock` / `FakeScheduler` / `SchedulerHelpers`** — Test helpers for deterministic concurrency specs: `FakeClock#advance(seconds)` controls time; `FakeScheduler` runs tasks synchronously and records `event_log`; `FakeScheduler#assert_order` / `#assert_cancelled` for ordering assertions; `FakeClock#advance_to_next_timer` fires the next pending callback; `Testing::SchedulerHelpers#with_fake_scheduler` replaces the global Runtime for the duration of a block | Beta |
97
+ | **`Configuration#runtime_backend`** — `:thread` (default, one OS thread per task), `:immediate` (tests — tasks run synchronously, no extra threads), `:fiber` (**EXPERIMENTAL** — experimental validation backend only: runs tasks as Ruby Fibers on a cooperative scheduler to verify that framework components are truly non-blocking; **not for production use** and not a planned production replacement for `:thread`; no preemptive scheduling will be added). `:cooperative` is a **deprecated alias** for `:immediate` — do not use in new code | Beta |
98
+ | **`Configuration#strict_runtime_guards`** — When `true`, calling `Agent#invoke` from inside a scheduler task raises `SchedulerReentrancyError`; when `false` (default) a warning is logged instead | Beta |
50
99
 
51
100
  ## Installation
52
101
 
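Note on the `ScopePolicy` row above: the README and CHANGELOG describe it as a callable that maps `(tool_class, scope, agent)` to `:allow`, `:approve`, or `:reject`. The sketch below shows what such a callable could look like as plain Ruby lambdas. It does not load the gem; the lambda names are hypothetical, the scope list is copied from the changelog entry, and whether a bare lambda is accepted by `agent.scope_policy =` is an assumption based on the word "callable".

```ruby
# Scopes the changelog lists as high-risk for the default policy.
high_risk_scopes = %i[write admin external_network
                      filesystem process external_process].freeze

# Hypothetical stand-in for the documented default behaviour:
# high-risk scopes need approval, everything else is allowed.
default_like_policy = lambda do |_tool_class, scope, _agent|
  high_risk_scopes.include?(scope) ? :approve : :allow
end

# Hypothetical stricter policy: refuse process-spawning tools outright
# and defer every other decision to the policy above.
strict_policy = lambda do |tool_class, scope, agent|
  next :reject if %i[process external_process].include?(scope)

  default_like_policy.call(tool_class, scope, agent)
end

# Object and nil stand in for a tool class and an agent.
[:read_only, nil, :write, :process].each do |scope|
  puts "#{scope.inspect}: #{strict_policy.call(Object, scope, nil)}"
end
```

Composing policies by delegation, as `strict_policy` does, keeps the override small and leaves the default decision table in one place.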
@@ -150,13 +199,16 @@ puts "Approved: #{final.approved}" # => true
150
199
  ```
151
200
 
152
201
  In EventLoop mode (`c.event_loop = true`), `Agent#run_as_child` spawns a child agent
153
- asynchronously. When the child succeeds, `:child_completed` is dispatched; when it fails,
154
- `:child_failed` is dispatched. Always declare both transitions to avoid a stuck workflow:
202
+ asynchronously. When the child succeeds, `:child_completed` is dispatched with the result
203
+ `{ output:, messages:, usage: }` as its payload; when it fails, `:child_failed` is
204
+ dispatched. Always declare both transitions to avoid a stuck workflow:
155
205
 
156
206
  ```ruby
157
- # EventLoop mode: workflow that runs an agent as a child FSM
207
+ # EventLoop mode: workflow that runs an agent as a child FSM.
208
+ # The result { output:, messages:, usage: } arrives as the :child_completed event
209
+ # payload — write it back to the context in the target state's entry action.
158
210
  entry :run_agent, ->(ctx) {
159
- MyAgent.new.run_as_child(ctx.query, ctx: ctx) { |r| ctx.answer = r[:output] }
211
+ MyAgent.new.run_as_child(ctx.query, ctx: ctx)
160
212
  }
161
213
  transition from: :run_agent, on: :child_completed, to: :done
162
214
  transition from: :run_agent, on: :child_failed, to: :handle_error
@@ -222,10 +274,11 @@ rescue Phronomy::GuardrailError => e
222
274
  end
223
275
  ```
224
276
 
225
- > **Limitations:** Phronomy ships no built-in guardrail implementations. There is no
226
- > built-in prompt injection detector, PII scanner, or content classifier. All guardrail
227
- > logic must be implemented by the application. Reference implementations for common
228
- > patterns are available in `phronomy-examples` (example 06).
277
+ > **Note:** Phronomy includes `PromptInjectionGuardrail`, a built-in pattern-based
278
+ > input guardrail that detects common injection patterns (see the feature table above).
279
+ > PII scanning and content classification are **not** provided by the framework;
280
+ > that logic must be implemented by the application. Reference implementations for
281
+ > common patterns are available in `phronomy-examples` (example 06).
229
282
 
230
283
  ### Knowledge/RAG — Context injection and vector retrieval
231
284
 
@@ -407,9 +460,11 @@ class MyOrchestrator < Phronomy::Agent::Orchestrator
407
460
  end
408
461
  ```
409
462
 
410
- ### Workflow Parallel Node — Concurrent branches
463
+ ### Workflow parallel pattern — Concurrent branches
411
464
 
412
- Phronomy does not provide a built-in parallel abstraction. Use application-level Ruby threads inside a `state` action:
465
+ Phronomy does not provide a dedicated parallel-node primitive. The recommended
466
+ pattern for concurrent branches is to use application-level Ruby threads inside
467
+ a `state` action:
413
468
 
414
469
  ```ruby
415
470
  class EnrichContext
@@ -426,9 +481,9 @@ app = Phronomy::Workflow.define(EnrichContext) do
426
481
  summary: Thread.new { Summarizer.call(s) },
427
482
  tags: Thread.new { Tagger.call(s) }
428
483
  }
429
- # For production use, wrap with Timeout.timeout to avoid unbounded waits:
430
- # require "timeout"
431
- # Timeout.timeout(30) { threads.each_value(&:join) }
484
+ # For bounded waits, use Thread#join(timeout_seconds); nil means timed out — handle explicitly.
485
+ # Do not use Timeout.timeout or Thread#kill — both inject async exceptions that bypass cleanup.
486
+ # Prefer CancellationToken for cooperative cancellation of Phronomy-managed tasks.
432
487
  threads.each_value(&:join)
433
488
  s.merge(summary: threads[:summary].value, tags: Array(threads[:tags].value))
434
489
  end
@@ -535,6 +590,8 @@ Phronomy.configure do |c|
535
590
  c.trace_pii = false # default; set to true only when trace data contains no PII
536
591
  c.logger = nil # optional; any object responding to #warn (e.g. Rails.logger)
537
592
  c.event_loop_stop_grace_seconds = 5 # seconds to wait for sessions to drain on EventLoop#stop(drain: true)
593
+ c.runtime_backend = :thread # :thread (default); :immediate (tests, synchronous); :fiber (experimental validation only); :cooperative (deprecated alias for :immediate)
594
+ c.strict_runtime_guards = false # when true, raises on invoke-inside-task
538
595
  end
539
596
  ```
540
597
 
@@ -546,6 +603,66 @@ end
546
603
  > The default is `false` (PII protection enabled). Set to `true` only when
547
604
  > trace data does not contain sensitive information.
548
605
 
606
+ ## Sync vs Async API
607
+
608
+ Phronomy provides both synchronous and asynchronous invocation APIs.
609
+ Understanding when to use each prevents scheduler stalls and hidden deadlocks.
610
+
611
+ | Context | Recommended API |
612
+ |---------|----------------|
613
+ | Top-level application code, Rails controller, background job | `agent.invoke(input)` — blocks the calling thread until done |
614
+ | Inside a `Runtime#spawn` block, `TaskGroup`, Workflow action, Tool `execute` | `agent.invoke_async(input).await` — non-blocking within the scheduler |
615
+
616
+ ### Why this matters
617
+
618
+ `invoke` is a synchronous wrapper that calls `invoke_async` and then _blocks_ the calling
619
+ thread until the task completes. When called from **inside** an active scheduler task, the
620
+ calling task blocks the scheduler thread, preventing other tasks from making progress — a
621
+ hidden deadlock when all scheduler threads are occupied.
622
+
623
+ ### Runtime guard
624
+
625
+ Phronomy detects this pattern automatically:
626
+
627
+ ```ruby
628
+ # Default (soft mode): logs a warning and continues
629
+ Phronomy.configure { |c| c.strict_runtime_guards = false }
630
+
631
+ # Strict mode: raises SchedulerReentrancyError immediately
632
+ Phronomy.configure { |c| c.strict_runtime_guards = true }
633
+ ```
634
+
635
+ You can also query the current context directly:
636
+
637
+ ```ruby
638
+ Phronomy::Runtime.in_scheduler_context? # => true if called from inside a task
639
+ ```
640
+
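As a rough illustration of how such a guard can work, the sketch below marks the current execution context while a task body runs and checks that marker at a blocking entry point. Every name here (`GuardSketch`, `run_task`, `blocking_call`, `ReentrancyError`) is hypothetical and chosen for the example; it is not Phronomy's implementation, only a minimal model of the soft/strict behaviour described above.

```ruby
# Minimal model of a reentrancy guard. All names are hypothetical.
module GuardSketch
  class ReentrancyError < StandardError; end

  KEY = :guard_sketch_in_task

  class << self
    # strict: raise instead of warn. logger: any object responding to #warn.
    attr_accessor :strict, :logger

    # Runs the block as a "task", marking the current context for its duration.
    # Thread.current[] storage is local to the current fiber.
    def run_task
      previous = Thread.current[KEY]
      Thread.current[KEY] = true
      yield
    ensure
      Thread.current[KEY] = previous
    end

    def in_task?
      Thread.current[KEY] == true
    end

    # A blocking entry point that checks the marker before doing its work.
    def blocking_call
      if in_task?
        raise ReentrancyError, "blocking call inside a task" if strict
        logger&.warn("blocking call inside a task")
      end
      yield
    end
  end
end
```

With `GuardSketch.strict = true`, calling `GuardSketch.blocking_call { ... }` inside `GuardSketch.run_task { ... }` raises; with `strict = false` it only warns through the configured logger and then runs the block.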
641
+ ### Migration: invoke → invoke_async
642
+
643
+ ```ruby
644
+ # Before (blocks scheduler if called from inside a task)
645
+ result = my_agent.invoke("Hello")
646
+
647
+ # After (safe inside tasks and TaskGroups)
648
+ result = my_agent.invoke_async("Hello").await
649
+ ```
650
+
651
+ ### :immediate backend (synchronous / test mode)
652
+
653
+ The `:immediate` backend runs tasks synchronously using `FakeScheduler`
654
+ (backed by `Task::ImmediateBackend`). Blocking I/O is isolated in `BlockingAdapterPool`.
655
+ To switch back to the default thread-per-task backend:
656
+
657
+ ```ruby
658
+ Phronomy.configure { |c| c.runtime_backend = :thread }
659
+ # or per-example using SchedulerHelpers:
660
+ include Phronomy::Testing::SchedulerHelpers
661
+ with_fake_scheduler do |sched|
662
+ # all spawns run synchronously; sched.event_log records every lifecycle event
663
+ end
664
+ ```
665
+
549
666
  ## Context Management
550
667
 
551
668
  Phronomy includes a context window management layer. When model metadata is
@@ -583,7 +700,7 @@ class MyAgent < Phronomy::Agent::Base
583
700
  max_output_tokens 4096 # override max_output_tokens from registry
584
701
  context_overhead 600 # extra reservation for system prompt + tools
585
702
  invoke_timeout 30 # raise Phronomy::TimeoutError after 30 s (wait timeout, not cancellation)
586
- max_parallel_tools 4 # cap concurrent tool-call threads (default: 10)
703
+ max_parallel_tools 4 # cap concurrent tool executions (default: 10)
587
704
  end
588
705
  ```
589
706
 
@@ -624,9 +741,13 @@ blocks always execute.
624
741
  > - Any external I/O (database query, vector search, HTTP request) inside those calls
625
742
  >
626
743
  > For deep in-flight safety, complement `CancellationToken` with per-source or
627
- > per-tool timeouts (e.g. `Net::HTTP#read_timeout`, `Timeout.timeout`, connection
628
- > pool limits). Ruby's GVL prevents fully preemptive cancellation without
629
- > `Thread#kill`, which Phronomy avoids by default due to resource safety concerns.
744
+ > per-tool timeouts. Prefer library-native timeouts such as `Net::HTTP#read_timeout`,
745
+ > database `statement_timeout`, or Redis client timeouts; these signal the I/O layer
746
+ > to abort cleanly. Avoid `Timeout.timeout` unless you understand its async-exception
747
+ > risks: it injects `Timeout::Error` at an arbitrary execution point (the same
748
+ > mechanism as `Thread#kill`), which Phronomy avoids by default due to resource
749
+ > safety concerns. Ruby's GVL prevents fully preemptive cancellation without such
750
+ > risky interruption.
630
751
 
631
752
  ```ruby
632
753
  token = Phronomy::CancellationToken.new
@@ -740,9 +861,11 @@ span attributes by default (`trace_pii: false`). To include full content in trac
740
861
  Phronomy configuration. Evaluate whether your tracing backend (OTLP collector, Jaeger,
741
862
  Honeycomb, etc.) meets your data-retention and privacy requirements.
742
863
 
743
- **Prompt injection** — Phronomy provides no built-in prompt injection detection.
744
- Applications that process untrusted user input should implement their own input
745
- guardrails (see the Guardrails section above).
864
+ **Prompt injection** — Phronomy provides `PromptInjectionGuardrail`, a built-in
865
+ pattern-based input guardrail that detects common injection patterns (ignore/override
866
+ instructions, role-switching phrases, etc.). It is a useful starting point, not a
867
+ comprehensive defence; applications processing untrusted input should layer additional
868
+ custom guardrails as needed (see the Guardrails section above).
746
869
 
747
870
  **Tool and MCP security** — Tools can perform real-world side effects (database
748
871
  writes, API calls, file deletion). Treat tool execution as a privileged operation:
data/Rakefile CHANGED
@@ -7,4 +7,37 @@ RSpec::Core::RakeTask.new(:spec)
7
7
 
8
8
  require "standard/rake"
9
9
 
10
+ # Verify that @api private classes do not leak into the public YARD output.
11
+ # Any class or module without @api private that ends up in the public doc must
12
+ # have a corresponding entry in the Features table in README.md.
13
+ #
14
+ # Usage: bundle exec rake yard_check
15
+ desc "Build YARD docs excluding @api private items and check for undocumented public APIs"
16
+ task :yard_check do
17
+ require "yard"
18
+ YARD::Registry.clear
19
+ YARD.parse(Dir["lib/**/*.rb"])
20
+
21
+ undocumented = []
22
+ YARD::Registry.all(:class, :module).each do |obj|
23
+ next if obj.visibility == :private
24
+ next if obj.tag(:api)&.text == "private"
25
+ next if obj.docstring.blank?
26
+
27
+ # Classes/modules with no docstring that are not @api private are worth
28
+ # noting, but only raise on truly undocumented public objects.
29
+ if obj.docstring.empty?
30
+ undocumented << obj.path
31
+ end
32
+ end
33
+
34
+ unless undocumented.empty?
35
+ warn "The following public classes/modules have no YARD documentation:\n" \
36
+ " #{undocumented.join("\n ")}\n" \
37
+ "Either add a docstring or mark them @api private."
38
+ exit 1
39
+ end
40
+ puts "yard_check passed — no undocumented public classes/modules found."
41
+ end
42
+
10
43
  task default: %i[spec standard]
@@ -2,7 +2,7 @@
2
2
  "workflow_context_merge": 124364.81010472385,
3
3
  "workflow_define": 2179.945274115319,
4
4
  "tool_params_schema_definition": 19534379.159046534,
5
- "dispatch_parallel_10": 1483.2255243486482,
5
+ "dispatch_parallel_10": 886.0,
6
6
  "cancellation_token_cancelled": 4335060.97443425,
7
7
  "cancellation_token_raise_if_cancelled_noop": 3566903.189098373,
8
8
  "trim_context_remove_2000": 1761.5700678986254
@@ -91,6 +91,7 @@ stub_agent_class = Class.new(Phronomy::Agent::Base) do
91
91
  define_method(:invoke) do |_input, messages: [], thread_id: nil, config: {}|
92
92
  {output: "stub", messages: []}
93
93
  end
94
+ define_method(:invoke_async) { |input, **_kw| Phronomy::Runtime.instance.spawn(name: "bench-stub") { invoke(input) } }
94
95
  end
95
96
 
96
97
  orchestrator_class = Class.new(Phronomy::Agent::Orchestrator)
@@ -49,3 +49,27 @@ transport layer participation.
49
49
  - Users who expect "cancel" semantics from a timeout will be surprised.
50
50
  - Proper cancellation requires the `CancellationToken` feature (#216), which
51
51
  has not yet been implemented.
52
+
53
+ ## Extension: PendingOperation#await cooperative cancellation semantics
54
+
55
+ `BlockingAdapterPool::PendingOperation#await` also supports both `timeout:` and
56
+ `cancellation_token:` parameters. The same non-preemptive rule applies here,
57
+ consistent with ADR-010 (cooperative-first, non-preemptive concurrency model):
58
+
59
+ 1. **No forcible thread termination.** When a `cancellation_token` is cancelled,
60
+ `CancellationError` is raised to the `await` caller; when the timeout fires,
61
+ `TimeoutError` is raised instead. In both cases, the underlying worker thread
62
+ is **not** killed. The worker runs its block to natural completion.
63
+ 2. **Cooperative, not preemptive.** Cancellation takes effect only at `await`
64
+ call sites or at explicit `token.check!` checkpoints inside the submitted
65
+ block. Code that ignores the token will not be interrupted.
66
+ 3. **Timeout scope.** `timeout:` at `await` time is measured from the moment
67
+ `await` is called. If both submit-time and await-time timeouts are provided,
68
+ the earlier deadline wins.
69
+ 4. **Error propagation.** `CancellationError` (or `TimeoutError`) is raised to
70
+ the `await` caller; the submitter is responsible for handling it.
71
+
72
+ These semantics are identical in spirit to the `invoke_timeout` decision above:
73
+ the framework exposes a *wait* boundary, not a hard-kill boundary. Safe resource
74
+ cleanup is the caller's responsibility.
75
+
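The "wait boundary, not hard-kill boundary" rule above can be modelled with plain Ruby threads. The sketch below is not `BlockingAdapterPool`'s code: `await_result`, `WaitTimeout`, and `monotonic_now` are hypothetical names, and the only library behaviour relied on is that `Thread#join(limit)` returns `nil` when the limit elapses. It shows the earlier of two deadlines winning and the worker running to completion after the waiter has given up.

```ruby
# Hypothetical helper names; a model of the semantics, not the real API.
class WaitTimeout < StandardError; end

# Monotonic clock reading in seconds.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Waits for the worker until the earlier of the two deadlines, then raises.
# The worker thread is never killed.
def await_result(worker, submit_deadline: nil, await_timeout: nil)
  await_deadline = await_timeout && monotonic_now + await_timeout
  deadline = [submit_deadline, await_deadline].compact.min
  remaining = deadline && [deadline - monotonic_now, 0].max
  # Thread#join returns nil on timeout and the thread itself otherwise.
  joined = remaining ? worker.join(remaining) : worker.join
  raise WaitTimeout, "gave up waiting; worker still running" if joined.nil?
  worker.value
end

finished = Queue.new
worker = Thread.new do
  sleep 0.2
  finished << :done
  :result
end

begin
  # The await-time timeout (0.05 s) is earlier than the submit-time deadline.
  await_result(worker, submit_deadline: monotonic_now + 5, await_timeout: 0.05)
rescue WaitTimeout => e
  puts e.message
end

# The worker was not interrupted: it still signals completion and returns.
puts finished.pop
puts worker.value
```

Because the waiter only stops waiting, any cleanup of resources held by the worker has to happen inside the worker's own block, which matches the caller-responsibility note above.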
@@ -1,8 +1,8 @@
1
- # ADR-006: Built-in Guardrail Implementations Are Not Shipped
1
+ # ADR-006: Minimal Built-in Guardrail Implementations
2
2
 
3
3
  ## Status
4
4
 
5
- Accepted
5
+ Amended (see Amendment section below)
6
6
 
7
7
  ## Context
8
8
 
@@ -46,3 +46,21 @@ Users are responsible for implementing domain-specific guardrail logic.
46
46
  **Negative / Tradeoffs:**
47
47
  - Users must implement their own guardrails from scratch. Providing a cookbook
48
48
  of example patterns in the README partially mitigates this.
49
+
50
+ ## Amendment — `PromptInjectionGuardrail` Added
51
+
52
+ After the original decision was accepted, `Guardrail::PromptInjectionGuardrail`
53
+ was introduced as the **one exception** to the "no built-ins" rule.
54
+
55
+ **Rationale for the exception:**
56
+ - Prompt injection patterns are broadly applicable across almost all LLM
57
+ applications regardless of domain, unlike PII patterns which are locale-specific.
58
+ - A lightweight, pure-regex implementation has no third-party dependency and
59
+ adds negligible gem weight.
60
+ - It serves as a documented reference implementation that users can subclass with
61
+ `extra_patterns:` to extend.
62
+
63
+ **Scope of the exception:**
64
+ Only prompt-injection detection is provided as a built-in. PII scanning,
65
+ content classification, and toxic-content filtering remain out of scope per the
66
+ original decision.
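
For readers unfamiliar with the pure-regex approach the amendment describes, a minimal sketch follows. It is not the source of `Guardrail::PromptInjectionGuardrail`: the class name `PatternGuardSketch`, the `check!` method, the `Violation` error, and the three patterns are all assumptions made for illustration. Only the idea of an `extra_patterns:` extension point is taken from the text above.

```ruby
# Hypothetical class and patterns; a model of a pattern-based input check.
class PatternGuardSketch
  class Violation < StandardError; end

  DEFAULT_PATTERNS = [
    /ignore\s+(all\s+)?(previous|prior)\s+instructions/i,
    /disregard\s+(the\s+)?system\s+prompt/i,
    /you\s+are\s+now\s+/i
  ].freeze

  # extra_patterns: additional Regexp objects appended to the defaults.
  def initialize(extra_patterns: [])
    @patterns = DEFAULT_PATTERNS + extra_patterns
  end

  # Returns the input unchanged, or raises Violation on the first match.
  def check!(input)
    hit = @patterns.find { |pattern| pattern.match?(input) }
    raise Violation, "input matched #{hit.inspect}" if hit
    input
  end
end
```

A caller might build `PatternGuardSketch.new(extra_patterns: [/reveal\s+your\s+prompt/i])` and run `check!` on user input before it reaches the model. A regex list of this kind catches only the phrasings it enumerates, which is why the README describes the built-in as a starting point.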