phronomy 0.6.0 → 0.7.1 (RubyGems)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (143)

checksums.yaml +4 -4
data/.mutant.yml +22 -0
data/CHANGELOG.md +488 -0
data/CONTRIBUTING.md +102 -0
data/README.md +374 -36
data/RELEASE_CHECKLIST.md +86 -0
data/Rakefile +33 -0
data/SECURITY.md +80 -0
data/benchmark/baseline.json +9 -0
data/benchmark/bench_agent_invoke.rb +105 -0
data/benchmark/bench_context_assembler.rb +46 -0
data/benchmark/bench_regression.rb +172 -0
data/benchmark/bench_token_estimator.rb +44 -0
data/benchmark/bench_tool_schema.rb +69 -0
data/benchmark/bench_vector_store.rb +39 -0
data/benchmark/bench_workflow.rb +55 -0
data/benchmark/run_all.rb +118 -0
data/docs/decisions/001-rubyllm-as-provider-layer.md +42 -0
data/docs/decisions/002-workflow-context-immutability.md +42 -0
data/docs/decisions/003-event-loop-singleton.md +48 -0
data/docs/decisions/004-invoke-timeout-is-not-cancellation.md +75 -0
data/docs/decisions/005-static-knowledge-class-level-cache.md +45 -0
data/docs/decisions/006-no-built-in-guardrails.md +66 -0
data/docs/decisions/007-mcp-is-beta-stability.md +51 -0
data/docs/decisions/008-orchestrator-uses-os-threads.md +52 -0
data/docs/decisions/009-state-store-abstraction.md +141 -0
data/docs/decisions/010-cooperative-first-concurrency.md +248 -0
data/lib/phronomy/agent/base.rb +416 -49
data/lib/phronomy/agent/before_completion_context.rb +1 -0
data/lib/phronomy/agent/checkpoint.rb +1 -0
data/lib/phronomy/agent/concerns/before_completion.rb +6 -0
data/lib/phronomy/agent/concerns/error_translation.rb +45 -0
data/lib/phronomy/agent/concerns/guardrailable.rb +3 -0
data/lib/phronomy/agent/concerns/retryable.rb +12 -1
data/lib/phronomy/agent/concerns/suspendable.rb +19 -0
data/lib/phronomy/agent/fsm.rb +44 -52
data/lib/phronomy/agent/handoff.rb +3 -0
data/lib/phronomy/agent/orchestrator.rb +191 -54
data/lib/phronomy/agent/parallel_tool_chat.rb +87 -13
data/lib/phronomy/agent/react_agent.rb +16 -6
data/lib/phronomy/agent/runner.rb +2 -0
data/lib/phronomy/agent/shared_state.rb +11 -0
data/lib/phronomy/agent/suspend_signal.rb +2 -0
data/lib/phronomy/agent/team_coordinator.rb +17 -5
data/lib/phronomy/async_queue.rb +155 -0
data/lib/phronomy/blocking_adapter_pool.rb +435 -0
data/lib/phronomy/cancellation_scope.rb +123 -0
data/lib/phronomy/cancellation_token.rb +133 -0
data/lib/phronomy/concurrency_gate.rb +155 -0
data/lib/phronomy/configuration.rb +168 -2
data/lib/phronomy/context/assembler.rb +6 -0
data/lib/phronomy/context/compaction_context.rb +2 -0
data/lib/phronomy/context/context_version_cache.rb +2 -0
data/lib/phronomy/context/token_budget.rb +3 -0
data/lib/phronomy/context/token_estimator.rb +9 -2
data/lib/phronomy/context/trigger_context.rb +1 -0
data/lib/phronomy/context/trim_context.rb +4 -0
data/lib/phronomy/deadline.rb +63 -0
data/lib/phronomy/diagnostics.rb +62 -0
data/lib/phronomy/embeddings/base.rb +22 -2
data/lib/phronomy/embeddings/ruby_llm_embeddings.rb +6 -2
data/lib/phronomy/eval/comparison.rb +2 -0
data/lib/phronomy/eval/dataset.rb +4 -0
data/lib/phronomy/eval/metrics.rb +6 -0
data/lib/phronomy/eval/runner.rb +11 -9
data/lib/phronomy/eval/scorer/base.rb +1 -0
data/lib/phronomy/eval/scorer/exact_match.rb +2 -0
data/lib/phronomy/eval/scorer/includes_scorer.rb +2 -0
data/lib/phronomy/eval/scorer/llm_judge.rb +2 -0
data/lib/phronomy/event_loop.rb +275 -30
data/lib/phronomy/fsm_session.rb +57 -4
data/lib/phronomy/generator_verifier.rb +2 -0
data/lib/phronomy/guardrail/base.rb +3 -0
data/lib/phronomy/guardrail/prompt_injection_guardrail.rb +58 -0
data/lib/phronomy/invocation_context.rb +152 -0
data/lib/phronomy/knowledge_source/base.rb +24 -2
data/lib/phronomy/knowledge_source/entity_knowledge.rb +7 -2
data/lib/phronomy/knowledge_source/rag_knowledge.rb +8 -4
data/lib/phronomy/knowledge_source/static_knowledge.rb +7 -2
data/lib/phronomy/llm_adapter/base.rb +104 -0
data/lib/phronomy/llm_adapter/ruby_llm.rb +41 -0
data/lib/phronomy/llm_adapter.rb +20 -0
data/lib/phronomy/loader/base.rb +1 -0
data/lib/phronomy/loader/csv_loader.rb +2 -0
data/lib/phronomy/loader/markdown_loader.rb +2 -0
data/lib/phronomy/loader/plain_text_loader.rb +1 -0
data/lib/phronomy/metrics.rb +38 -0
data/lib/phronomy/output_parser/base.rb +1 -0
data/lib/phronomy/output_parser/json_parser.rb +22 -3
data/lib/phronomy/output_parser/structured_parser.rb +2 -0
data/lib/phronomy/prompt_template.rb +5 -0
data/lib/phronomy/runnable.rb +20 -3
data/lib/phronomy/runtime/deterministic_scheduler.rb +412 -0
data/lib/phronomy/runtime/fake_scheduler.rb +165 -0
data/lib/phronomy/runtime/gate_registry.rb +52 -0
data/lib/phronomy/runtime/pool_registry.rb +57 -0
data/lib/phronomy/runtime/runtime_metrics.rb +117 -0
data/lib/phronomy/runtime/scheduler.rb +98 -0
data/lib/phronomy/runtime/scheduler_timer_adapter.rb +79 -0
data/lib/phronomy/runtime/task_registry.rb +48 -0
data/lib/phronomy/runtime/thread_scheduler.rb +30 -0
data/lib/phronomy/runtime/timer_queue.rb +106 -0
data/lib/phronomy/runtime/timer_service.rb +42 -0
data/lib/phronomy/runtime.rb +374 -0
data/lib/phronomy/splitter/base.rb +2 -0
data/lib/phronomy/splitter/fixed_size_splitter.rb +2 -0
data/lib/phronomy/splitter/recursive_splitter.rb +2 -0
data/lib/phronomy/state_store/base.rb +48 -0
data/lib/phronomy/state_store/in_memory.rb +62 -0
data/lib/phronomy/task/backend.rb +80 -0
data/lib/phronomy/task/fiber_backend.rb +157 -0
data/lib/phronomy/task/immediate_backend.rb +89 -0
data/lib/phronomy/task/thread_backend.rb +84 -0
data/lib/phronomy/task.rb +275 -0
data/lib/phronomy/task_group.rb +265 -0
data/lib/phronomy/testing/fake_clock.rb +109 -0
data/lib/phronomy/testing/fake_scheduler.rb +104 -0
data/lib/phronomy/testing/scheduler_helpers.rb +59 -0
data/lib/phronomy/testing.rb +12 -0
data/lib/phronomy/tool/agent_tool.rb +1 -0
data/lib/phronomy/tool/base.rb +298 -28
data/lib/phronomy/tool/mcp_tool.rb +103 -17
data/lib/phronomy/tool/scope_policy.rb +50 -0
data/lib/phronomy/tool_executor.rb +106 -0
data/lib/phronomy/tracing/base.rb +3 -0
data/lib/phronomy/tracing/langfuse_tracer.rb +2 -0
data/lib/phronomy/tracing/open_telemetry_tracer.rb +36 -0
data/lib/phronomy/vector_store/async_backend.rb +110 -0
data/lib/phronomy/vector_store/base.rb +40 -7
data/lib/phronomy/vector_store/in_memory.rb +16 -7
data/lib/phronomy/vector_store/pgvector.rb +40 -9
data/lib/phronomy/vector_store/redis_search.rb +29 -8
data/lib/phronomy/version.rb +1 -1
data/lib/phronomy/workflow.rb +147 -11
data/lib/phronomy/workflow_context.rb +83 -6
data/lib/phronomy/workflow_runner.rb +106 -7
data/lib/phronomy.rb +112 -1
data/scripts/api_snapshot.rb +91 -0
data/scripts/check_api_annotations.rb +68 -0
data/scripts/check_private_enforcement.rb +93 -0
data/scripts/check_readme_runnable.rb +98 -0
data/scripts/run_mutation.sh +46 -0
metadata +83 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 81df7b877b08caffbfdafb9ab1f1c186739a04ef643a14e7b457be805c8b2b9d
-  data.tar.gz: c0fd0ffad64df476c21e0205926df15589c0e654fed9675a6e8aef3589636f1c
+  metadata.gz: d9ae370d656048e38f700b6bced931fe249f731cea819ab94691eb4bcf6ef43c
+  data.tar.gz: 97d01ca3475f547a41397d1dad2ddb8ccaa10f6466d5a75c3f79e6875a7af0c6
 SHA512:
-  metadata.gz: cb22a0d7f3edba46a46e9614f4cdad1641941164a641e17c1b3aa24ed07a3d7fb88b408304f1e9c5eaceac02ef8a1fa8503cfb0cffac3ae86b1dd9786756f5ac
-  data.tar.gz: 4be7f67215d0b3b8381508f9ccf062fbfc8f41bb7a8a76299e2642634e78421c8ad5fcc551170db4e739c3db7e1cb8fd69ffad6982f50cdba8375f2237aa5ce9
+  metadata.gz: d3ab9ebd145e1ed706ad1741a2e3184c412aa8fd0eac32c95eb0b4a1ef87af38ae73eb5b4205b7f2894dd228929130c9a7569d24a1d7a571a5aa3ec5a68a4172
+  data.tar.gz: efa88afdbaa2f3d8fc38ee7cbc7044711479490546a888d44540f3b6bae6da60a3a3e64cfbbef455d65f78bab64dd9a68056e4c9f7ac7a360d512179364c8b23
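
The SHA256 and SHA512 entries above are digests of the two archive members (`metadata.gz` and `data.tar.gz`) packed inside the `.gem` file. A minimal sketch of how such a digest could be recomputed and compared, assuming the member has already been extracted to disk. The helper name `sha256_matches?` is hypothetical and not part of RubyGems or phronomy, and the demonstration uses a throwaway file rather than a real gem member:

```ruby
require "digest"
require "tempfile"

# Hypothetical helper (not a RubyGems API): returns true when the file's
# SHA256 hex digest equals the expected value, compared case-insensitively.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Demonstration with a temporary file standing in for an extracted member.
Tempfile.create("member") do |f|
  f.write("abc")
  f.flush
  expected = Digest::SHA256.hexdigest("abc")
  puts sha256_matches?(f.path, expected)   # digest recomputed from the file matches
  puts sha256_matches?(f.path, "0" * 64)   # an arbitrary wrong digest does not
end
```

For a real check, the expected value would be the `data.tar.gz` line from the `SHA256:` section, and `Digest::SHA512` could be substituted for the `SHA512:` section.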

data/.mutant.yml ADDED

@@ -0,0 +1,22 @@
+---
+# Mutant configuration for Phronomy (opensource project)
+# See: https://github.com/mbj/mutant
+usage: opensource
+integration: rspec
+includes:
+  - lib
+requires:
+  - phronomy
+matcher:
+  subjects:
+    - Phronomy::WorkflowContext
+    - Phronomy::WorkflowRunner
+    - Phronomy::Tool::Base
+    - Phronomy::Context::TokenBudget
+    - Phronomy::Context::TokenEstimator
+    - Phronomy::VectorStore::InMemory

data/CHANGELOG.md CHANGED

@@ -9,6 +9,494 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+### Added
+- **`Phronomy::Diagnostics` and `SchedulerReentrancyError`** (#278, #279):
+  `Phronomy::Diagnostics` exposes a snapshot of current scheduler state
+  (`pending_count`, `active_tasks`, `pool_utilization`, etc.) for debugging and
+  monitoring. `SchedulerReentrancyError` is raised when a scheduler operation is
+  attempted from within a scheduler callback, preventing deadlocks.
+  `Phronomy.configure { |c| c.scheduler_debug = true }` enables verbose scheduler
+  logging.
+- **`task_id` / `parent_task_id` on `InvocationContext`** (#277):
+  Every task spawned via `Task.spawn` now carries a `task_id` (a random UUID) and
+  an optional `parent_task_id`. These fields enable hierarchical task-tree tracing
+  and are forwarded automatically by `TaskGroup`.
+- **`Phronomy::Metrics` — task-centric observability snapshot** (#276):
+  `Phronomy::Metrics.snapshot` returns a hash with scheduler statistics:
+  `tasks_started`, `tasks_completed`, `tasks_failed`, `pool_queue_depth`, and
+  `pool_active_threads`. Intended for metrics export and health-check endpoints.
+- **`Phronomy::Testing::FakeClock` and `FakeScheduler`** (#273):
+  Two test helpers for deterministic concurrency testing.
+  `FakeClock` exposes `advance(seconds)` to control the passage of time without
+  sleeping. `FakeScheduler` replaces the real scheduler in specs, providing
+  synchronous execution and `flush` / `drain` helpers to drive task completion.
+- **`ScopePolicy` and approval gate integration** (#270):
+  `Phronomy::Tool::ScopePolicy` is a callable that maps `(tool_class, scope, agent)`
+  to `:allow`, `:approve`, or `:reject`. The default policy (`ScopePolicy::DEFAULT`)
+  automatically routes tools declaring high-risk scopes (`:write`, `:admin`,
+  `:external_network`, `:filesystem`, `:process`, `:external_process`) through the
+  existing approval gate; tools with `scope :read_only` or no scope are allowed
+  unconditionally. Per-agent policy overrides are available via
+  `agent.scope_policy = my_policy`.
+  **Behaviour change**: tools with the above scopes that previously executed without
+  an approval handler will now be **rejected** unless an approval handler is
+  registered or the agent uses a custom permissive policy.
+- **`PromptInjectionGuardrail`, `Tool::Base#redact_params`, and `#max_result_size`** (#271):
+  `Phronomy::Guardrail::PromptInjectionGuardrail` is a built-in `InputGuardrail`
+  subclass that detects prompt-injection patterns in user input.
+  `Tool::Base.redact_params(*names)` marks parameter names as sensitive; their
+  values are replaced with `"[REDACTED]"` in log and trace output.
+  `Tool::Base.max_result_size(n)` sets a per-tool character limit; results
+  exceeding the limit are truncated and a warning is logged. The global fallback is
+  `Phronomy.configure { |c| c.tool_result_max_size = n }` (default: no limit).
+- **`execution_mode` DSL on `Tool::Base`** (#263):
+  `Tool::Base.execution_mode` accepts `:cooperative`, `:blocking_io` (default),
+  `:cpu_bound`, or `:external_process`. Tools marked `:blocking_io` (the default)
+  are dispatched through `BlockingAdapterPool` when a `Runtime` is available,
+  keeping the scheduler thread unblocked. Tools marked `:cooperative` are called
+  directly on the scheduler thread (suitable for pure in-memory operations).
+- **`invoke_async` and `call_async` — async entry points** (#262):
+  `Agent::Base#invoke_async(input, **opts)` returns a `Phronomy::Task` wrapping
+  `#invoke`. `Workflow#invoke_async(input, config:)` does the same for workflows.
+  `Tool::Base#call_async(args, cancellation_token:)` returns a `Task` wrapping
+  `#call`. All three are backward-compatible with existing synchronous callers.
+- **`LLMAdapter` abstraction** (#266):
+  `Phronomy::LLMAdapter::Base` decouples the agent pipeline from RubyLLM.
+  `Phronomy::LLMAdapter::RubyLLM` (registered by default) wraps the existing
+  integration. Custom adapters can be registered via
+  `Phronomy.configure { |c| c.llm_adapter = MyAdapter }` for testing or
+  alternative LLM backends.
+- **`BlockingAdapterPool` backpressure limits** (#268):
+  `BlockingAdapterPool` now enforces configurable `pool_size` (default: 10) and
+  `queue_size` (default: 100) limits. Tasks submitted when the queue is full raise
+  `Phronomy::BackpressureError` immediately instead of growing the queue without
+  bound.
+- **Cooperative scheduler fairness** (#269):
+  The scheduler measures per-task lag and emits starvation and dispatch warnings
+  via `Phronomy.configuration.logger` when tasks wait longer than configured
+  thresholds. Configurable via `scheduler_starvation_warn_ms` and
+  `scheduler_dispatch_warn_ms`.
+- **Workflow entry actions awaitable with Task** (#264):
+  Entry action lambdas may now return a `Phronomy::Task`. The FSMSession awaits
+  the task on a background thread and posts `:action_completed` (with the resulting
+  `WorkflowContext`) or `:state_completed` back to the EventLoop without blocking
+  it. Backward-compatible: lambdas that return a `WorkflowContext` or `nil`
+  continue to work as before.
+- **`Task`, `TaskGroup`, `AsyncQueue`, `Deadline`, `InvocationContext`, `Runtime` concurrency abstractions** (#255):
+  Six new concurrency primitives form the foundation of the async execution layer.
+  `Task` wraps a callable with cancellation, timeout (`Deadline`), and context
+  propagation (`InvocationContext`). `TaskGroup` runs tasks concurrently and waits
+  for all to finish (or the first failure). `AsyncQueue` is a bounded, cancellable
+  queue. `Runtime` is the top-level façade that resolves a `BlockingAdapterPool`
+  and provides `blocking_io { }` and `cpu_bound { }` dispatch helpers.
+- **`BlockingAdapterPool`** (#256):
+  A bounded thread pool that isolates blocking I/O (LLM calls, database queries,
+  HTTP requests) from the cooperative scheduler thread. Default pool size is 10
+  threads with a queue depth of 100. Replaces direct `Thread.new` calls in core
+  agent and tool paths.
+- **`VectorStore#size` — document count for all backends, contract coverage for RedisSearch and Pgvector** (#240):
+  `VectorStore::Base` gains `#size` as an abstract method; `InMemory`, `RedisSearch`,
+  and `Pgvector` all implement it. `RedisSearch#size` queries `FT.INFO num_docs`;
+  `Pgvector#size` delegates to `model_class.count`. The `a_vector_store` shared example
+  is applied to RedisSearch and Pgvector (nightly real-backend CI); unit specs add a
+  skip-guarded `it_behaves_like` reference and dedicated `#size` unit tests.
+  `empty_store` override hook added to the shared example for real-backend callers.
+- **`force_kill: false` default in `dispatch_parallel`, `fan_out`, and `EventLoop#stop`** (#235):
+  Thread#kill is now opt-in. The default `force_kill: false` leaves timed-out workers
+  running and raises `TimeoutError` immediately, avoiding the risk of interrupted
+  `ensure` blocks or corrupted database transactions. Pass `force_kill: true` to
+  restore the previous behaviour (with a `logger.warn` to make it visible).
+  `EventLoop#stop` gains the same keyword and returns `:timeout` instead of
+  `:force_killed` when `force_kill: false` and the thread is still alive.
+- **Public API compatibility snapshot spec** (#236):
+  `spec/phronomy/public_api_spec.rb` enumerates expected public methods for every
+  `Stable`-tagged constant. The spec runs as part of the default RSpec suite; any
+  accidental removal or rename of a listed method now fails CI immediately.
+- **Nightly real-backend CI split into three independent job groups** (#238):
+  The nightly workflow (`nightly.yml`) now has three separately skippable jobs:
+  `real-backend-redis` (Redis Stack), `real-backend-pgvector` (PostgreSQL + pgvector),
+  and `real-backend-otel` (OpenTelemetry in-process SDK exporter). Each job runs only
+  the relevant spec with `--tag real_backend:<backend>`. The existing `redis_search_spec`
+  and `pgvector_spec` gain the `real_backend:` metadata tag. A new `otel_spec.rb`
+  verifies span emission, attribute attachment, and error recording via
+  `InMemorySpanExporter`.
+- **`CancellationToken#raise_if_cancelled!` — convenience cancellation check** (#234):
+  New instance method that raises `Phronomy::CancellationError` when the token is
+  cancelled, or returns `nil` otherwise. Replaces the `if cancelled? then raise`
+  pattern inside tools, RAG loaders, and hooks.
+- **Tool cooperative cancellation via `cancellation_token:` keyword** (#234):
+  `Tool::Base#call` now injects `Thread.current[:phronomy_cancellation_token]` as
+  `cancellation_token:` into `execute` when the method declares that keyword. Existing
+  tools without the keyword continue to work unchanged. Tool authors can opt in:
+  `def execute(query:, cancellation_token: nil)`.
+- **`CancellationToken.timeout_after` — monotonic-clock deadline** (#225):
+  New `CancellationToken.timeout_after(seconds)` class method creates a token that
+  becomes cancelled after the specified number of seconds, measured with
+  `Process::CLOCK_MONOTONIC` (immune to NTP/DST drift). The existing `deadline:`
+  keyword for wall-clock deadlines remains supported for backward compatibility.
+- **`EventLoop#stop` — drain mode and cooperative shutdown** (#233):
+  `EventLoop#stop` now accepts a `drain: true` keyword (default: `false`). When
+  set, the loop waits up to `Phronomy.configuration.event_loop_stop_grace_seconds`
+  (default: 5 s, configurable) for in-flight FSM sessions to complete before
+  joining threads. New sessions submitted while shutdown is pending are rejected
+  immediately with `Phronomy::CancellationError`. A new
+  `event_loop_stop_grace_seconds` configuration attribute is available on
+  `Phronomy::Configuration`.
+- **`invoke_timeout` DSL and `Phronomy::TimeoutError`**: Agents can declare a per-invoke
+  timeout in seconds via `invoke_timeout N` in the class body. Exceeding the timeout raises
+  `Phronomy::TimeoutError` (a subclass of `Phronomy::Error`). The default remains unlimited.
+- **`dispatch_parallel` / `fan_out` — per-call `timeout:` option** (#133): Both methods now
+  accept `timeout: nil` (default, unlimited) or a positive `Numeric` in seconds. Timed-out
+  tasks are treated the same as errors and follow the existing `on_error:` policy (`:raise`
+  or `:skip`).
+- **MCP `HttpTransport` custom authentication headers** (#144): `McpTool.from_server` now
+  accepts `headers: {}`, forwarded all the way to `HttpTransport#initialize`. Arbitrary
+  headers (e.g. `Authorization: Bearer …`) are injected into every JSON-RPC request,
+  enabling use of MCP servers that require bearer tokens or API keys.
+- **`StdioTransport` — `env:`, `cwd:`, and `startup_timeout:` options** (#145):
+  Three new keyword arguments are now accepted when constructing a `StdioTransport` (and
+  therefore via `McpTool.from_server`): `env: {}` merges extra variables into the child
+  process environment; `cwd: nil` sets the working directory; `startup_timeout: 5` limits
+  how long to wait for the child process to become ready.
+- **Workflow DSL validates graph structure at build time** (#124): `Phronomy::Workflow.define`
+  now raises `ArgumentError` immediately for hard structural errors (no states declared,
+  transitions referencing undefined targets). Unreachable states emit a warning but do not
+  raise. Errors surface at load time rather than at the first `invoke`.
+- **Expanded error taxonomy** (#149): Five new subclasses of `Phronomy::Error` are now
+  available: `TransportError` (MCP or LLM network-layer failure; subclasses are
+  `RateLimitError` for HTTP 429 and `AuthenticationError` for HTTP 401/403),
+  `ContextLengthError` (prompt exceeds model context window), and
+  `CancellationError` (explicit invocation cancellation, distinct from the
+  deadline-exceeded `TimeoutError`). All five are defined as subclasses of
+  `Phronomy::Error` so application code can rescue them uniformly.
+- **`Agent::Base.static_knowledge_refresh!`** (#164): New class-level method that clears the
+  cached `static_knowledge` chunks so the next `invoke` re-fetches from all registered
+  sources. Essential for long-running processes (web servers, job workers) where knowledge
+  sources may be updated at runtime without a process restart.
+- **`Phronomy::Configuration#logger`** (#158): New optional configuration attribute. Any
+  object responding to `#warn` (e.g. `Rails.logger`) can be assigned. Framework diagnostic
+  messages — starting with the unreachable-state warning from `Workflow.define` — are routed
+  through this logger instead of writing directly to `$stderr` via `Kernel#warn`.
+- **`Phronomy.with_configuration` and `Phronomy.reset_runtime!`** (#206): Two new class
+  methods for runtime isolation. `with_configuration` yields the current `Configuration`
+  object and restores the original after the block — even on exception — enabling per-request
+  overrides and scoped test configuration. `reset_runtime!` stops any running `EventLoop`,
+  clears its singleton, and resets configuration to defaults; intended for test suites to
+  ensure clean state between examples. `spec_helper.rb` now calls `reset_runtime!` in an
+  `after(:each)` hook automatically.
+- **`CancellationToken` — cooperative cancellation for agent invocations** (#216):
+  New class `Phronomy::CancellationToken` enables cooperative cancellation without
+  `Thread#kill`. Tokens are passed via `config: { cancellation_token: token }`.
+  `cancel!` marks the token (thread-safe via Mutex); `cancelled?` returns `true`
+  once cancelled or once an optional `deadline: Time` has passed. Agents check the
+  token in `_invoke_impl` (fail-fast before any LLM call) and again immediately
+  before `chat.ask`. `CancellationError` is never retried by the retry policy.
+  `dispatch_parallel` and `fan_out` accept `cancellation_token:` and automatically
+  inject it into every worker task's config unless the task already supplies its own.
+### Removed
+- **BREAKING: `Agent::Base#run_as_child` drops `&result_writer` block parameter** (#265):
+  The optional block form `run_as_child(input, ctx: ctx) { |r| ctx.answer = r[:output] }`
+  is no longer supported. The result is now delivered **exclusively** as the
+  `:child_completed` event payload `{ output:, messages:, usage: }`. The parent
+  Workflow task is the sole owner of the `WorkflowContext`; no background thread
+  writes to it directly. Callers that were using the block to write back into the
+  context must update their workflow design (e.g. read the result in the target
+  state's entry action after the transition, or store output through an external
+  shared resource if needed).
+- **BREAKING (internal): `AgentFSM#initialize` drops `result_writer:` keyword** (#265):
+  Direct callers of `AgentFSM.new(result_writer: ...)` must remove that keyword.
+  This class is considered internal; gem consumers should use `run_as_child` instead.
+### Changed
+- **`AgentFSM`, `ParallelToolChat`, and `Orchestrator` use `Task`/`TaskGroup` instead of bare `Thread.new`** (#257, #258, #259):
+  All three components now spawn async work through the `Task` and `TaskGroup`
+  abstractions. This enables cancellation propagation, context threading, and
+  `BlockingAdapterPool` routing. No public API changes; behaviour is equivalent.
+- **`Thread.current[:phronomy_*]` context propagation replaced with explicit `InvocationContext`** (#260):
+  Thread-local keys `phronomy_event_loop_thread`, `phronomy_cancellation_token`,
+  and `phronomy_context_version_caches` are no longer used as the primary
+  propagation channel. `InvocationContext` is threaded explicitly through call
+  stacks. Importantly, `Tool::Base#call` no longer falls back to
+  `Thread.current[:phronomy_cancellation_token]`; cancellation is only observed
+  when the caller passes `cancellation_token:` explicitly (or when
+  `ParallelToolChat` injects it). Tools that relied on the thread-local fallback
+  must be updated.
+- **`Timeout.timeout` removed from core paths; replaced with `CancellationScope`** (#261):
+  `Agent::Base#invoke` and `McpTool::StdioTransport` no longer use `Timeout.timeout`
+  (which is unsafe with `Thread.new` and `ensure` blocks). A `CancellationScope`
+  with `deadline_in(seconds)` provides equivalent semantics without the thread-
+  interruption hazards. `ScopeTimeoutError < TimeoutError` is raised on expiry.
+- **RAG/VectorStore blocking I/O placed behind `BlockingAdapterPool` async boundary** (#267):
+  `KnowledgeSource#fetch` and all three `VectorStore` backends now execute their
+  blocking I/O through `Runtime#blocking_io` when a `Runtime` is present. Callers
+  in a synchronous context see no change; callers in an EventLoop context benefit
+  from non-blocking scheduler behaviour.
+  The cancellation token (passed via `config: { cancellation_token: token }`) is
+  now checked at multiple additional points beyond the initial LLM call boundary:
+  before each `KnowledgeSource#fetch` in `build_context` (RAG phase); after each
+  streaming chunk in `_stream_impl`; before each tool-call batch in
+  `ParallelToolChat`; and after each `before_completion` hook. This ensures that
+  long-running retrieval, streaming, and tool-dispatch phases respect cancellation
+  with minimal latency.
+- **`Agent::Orchestrator` uses `CancellationToken` for internal stop flag** (#224):
+  The boolean stop flag in `Orchestrator` is replaced with an internal
+  `CancellationToken`. FSM session loops perform cooperative cancellation checks
+  via `cancelled?`; `Thread#kill` is retained only as a last resort after
+  cooperative shutdown.
+- **Error taxonomy classes are now raised at the retry boundary** (#204): The classes
+  `Phronomy::RateLimitError`, `Phronomy::AuthenticationError`, `Phronomy::ContextLengthError`,
+  and `Phronomy::TransportError` (introduced in #149) are now actually raised when the
+  corresponding `RubyLLM` exceptions occur. A new internal `ErrorTranslation` concern wraps
+  the retry exhaust path and maps `RubyLLM::*` exceptions to their Phronomy counterparts,
+  preserving the original exception as `#cause`. **Migration**: callers rescuing
+  `RubyLLM::RateLimitError` (or other `RubyLLM::*` errors) directly should migrate to
+  `rescue Phronomy::RateLimitError` / `Phronomy::TransportError` etc.
+- **`Orchestrator#bounded_map` uses cooperative cancellation before force-kill** (#203):
+  Workers now check a shared `cancelled` flag at each loop iteration and stop picking up new
+  tasks once the timeout deadline passes. A 0.5 s grace period is given to in-flight workers
+  before `Thread#kill` is used as a last resort. `EventLoop#stop` similarly logs a warning
+  via `Phronomy.configuration.logger` when force-kill is triggered.
+- **`Orchestrator#bounded_map` timeout deadline uses monotonic clock** (#209): Replaced
+  `Time.now` deadline arithmetic with `Process.clock_gettime(Process::CLOCK_MONOTONIC)` to
+  avoid sensitivity to NTP adjustments, DST transitions, and system-clock changes that could
+  inflate or deflate effective timeouts.
+- **`EventLoop` warns on events for unknown `target_id`**: When the event loop receives an
+  event whose `target_id` does not match any registered session, a warning is emitted instead
+  of silently discarding the event.
+- **`VectorStore#search` validates `k` is a positive integer**: All three backends
+  (`InMemory`, `RedisSearch`, `Pgvector`) now raise `ArgumentError` immediately when `k` is
+  not a positive integer, providing a clear error instead of a silent empty result or an
+  obscure database error.
+- **`max_parallel_tools` DSL**: Agents can cap the number of concurrent tool-call threads
+  with `max_parallel_tools N` in the class body. Useful for rate-limiting external API calls.
+  The default is **10** (inheriting from `Base`); set explicitly to raise or lower the cap.
+- **`max_parallel_tools` and `invoke_timeout` DSL argument validation** (#152): Both setters
+  now raise `ArgumentError` at class-definition time if the supplied value is invalid
+  (`max_parallel_tools` requires an `Integer >= 1`; `invoke_timeout` requires a positive
+  `Numeric`), surfacing configuration mistakes immediately.
+- **`on_error :suppress` — canonical alias for `:return_empty`** (#165): `:suppress` is the
+  new preferred name for the error-suppression behaviour in `Tool::Base`. `:return_empty`
+  continues to function but emits a deprecation warning and will be removed in a future major
+  release. Migrate by replacing `on_error :return_empty` with `on_error :suppress`.
+- **Tool nested object properties injected into JSON Schema** (#162): `Tool::Base#params_schema`
+  now recursively serialises nested `:object` param specs (including `enum` constraints and
+  further nesting) into the JSON Schema `properties` structure forwarded to the LLM,
+  enabling accurate structured argument generation for complex tool parameters.
+### Fixed
+- **`tool_name` preserved in `Orchestrator#prepare_tool_class` anonymous subclass wrapper**:
+  When `Orchestrator#prepare_tool_class` wrapped a subagent tool in an anonymous
+  subclass (`Class.new(prepared)`), the class-level instance variable `@tool_name`
+  was not inherited, causing the wrapper's `tool_name` to return `nil`. RubyLLM
+  then registered the tool under a `nil` key, making it unreachable when the LLM
+  called it by name. The fix captures the effective name before subclassing and
+  calls `tool_name effective_name` explicitly inside the anonymous class body —
+  the same pattern already used by the approval-gate wrapper.
+- **`EventLoop#start` is now idempotent; stale `:__stop__` sentinel race fixed** (#203):
+  Calling `start` on an already-running `EventLoop` is now a no-op. Fixed a race condition
+  where `stop` setting `@running = false` before the worker thread was scheduled left the
+  `:__stop__` sentinel unconsumed in the queue; a subsequent `start` would then immediately
+  terminate the new thread upon popping the stale sentinel. The sentinel is now treated as a
+  pure unblock signal for `queue.pop` (`next` instead of `break`) — loop termination is
+  driven solely by `@running`.
+- **`trace_pii: false` now redacts both input and output**: Previously only the user input
+  was redacted when `trace_pii` was `false`; LLM responses and tool results were still
+  forwarded to the tracing backend unredacted. Both sides are now replaced with `[REDACTED]`.
+- **`StdioTransport` — `read_timeout` prevents indefinite blocking**: A configurable
+  `read_timeout` (default 30 s) is now enforced on MCP stdio reads. A silent child process
+  could previously block the calling thread forever.
+- **MCP schema `required` and `enum` constraints propagated to `param` DSL**:
+  `McpTool.from_server` now copies `required` and `enum` constraints from the MCP JSON Schema
+  into the generated `param` declarations so downstream validation sees them.
+- **`FSMSession` notifies parent when child `AgentFSM` fails**: An unhandled error in a child
+  `AgentFSM` now correctly notifies the parent `FSMSession`, preventing it from waiting
+  indefinitely for a completion event that will never arrive.
+- **`WorkflowContext.field` rejects plain `Array` or `Hash` defaults**: Passing a plain `Array`
+  or `Hash` as a field default now raises `ArgumentError` at class-definition time,
+  preventing accidental state sharing across workflow invocations. Other mutable objects
+  are not checked. Wrap collection defaults in a Proc: `default: -> { [] }`.
+- **Tool aliases inherited by `Agent` subclasses**: `tool_aliases` declared in a parent
+  `Agent::Base` subclass are now correctly merged into subclasses rather than being silently
+  dropped.
+- **`ReactAgent` output selection skips tool-role messages**: The final output selection
+  logic no longer misidentifies `tool`-role messages as the assistant response, fixing
+  spurious tool-call JSON appearing in `result[:output]`.
+- **Thread-local context cache cleaned up after each `invoke`** (#128): `Agent::Base#invoke`
+  previously leaked thread-local context cache entries after each call, causing stale cache
+  hits in long-lived threads. The cache is now cleared in an `ensure` block.
+- **Unknown tool parameters are rejected** (#130): `Tool::Base#call` now raises
+  `ArgumentError` when keyword arguments not declared via the `param` DSL are passed, instead
+  of forwarding them silently to `execute`.
+- **`EventLoop#stop` uses cooperative shutdown instead of `Thread#kill`** (#135):
+  `Thread#kill` can terminate the worker at an arbitrary point, including partway through an
+  `ensure` block, and is unsafe. The event loop now sets a sentinel flag and joins the worker
+  thread, allowing it to flush pending events before termination.
+- **`Orchestrator` propagates parent `config` and `thread_id` to sub-agents** (#132):
+  Sub-agents spawned via `dispatch` or `dispatch_parallel` now inherit the caller's `config`
+  hash and `thread_id`, enabling correct memory isolation and distributed tracing in
+  multi-agent pipelines.
+- **`Agent::Base` caches `static_knowledge` fetch at the class level** (#127): The RAG
+  knowledge fetch was re-executed on every `invoke`. The result is now memoized at the class
+  level (`@static_knowledge_chunks ||= ...`), eliminating redundant vector-store queries.
+  The cache is **not** invalidated automatically when source content changes; call
+  `static_knowledge_refresh!` explicitly to force a reload.
+- **`WorkflowContext#initialize` raises on unknown field keys** (#121): Passing an
+  unrecognised key to `WorkflowContext.new` was silently ignored. The constructor now raises
+  `ArgumentError`, surfacing typos and API mismatches immediately.
+- **`WorkflowContext#merge` raises `ArgumentError` for unknown field keys** (#154): Passing
+  an unrecognised key to `WorkflowContext#merge` was silently ignored. The method now raises
+  `ArgumentError`, matching the guard added to `#initialize` in #121.
+- **`WorkflowContext#deep_dup_value` rescues `TypeError` for non-dupable objects** (#156):
+  Objects that raise `TypeError` from `#dup` (e.g. `Method`, frozen `Proc`, `Integer`,
+  `Symbol`) are now returned as-is instead of crashing.
+- **`Workflow.define` raises for undefined `from:` state in transitions** (#157): Transitions
+  that reference a `from:` state not declared in the DSL now raise `ArgumentError` at
+  build time, complementing the existing check for undefined `to:` targets.
+- **`Workflow.define` unreachable-state warning routes through configured logger** (#158):
+  The diagnostic warning for unreachable states now uses `Phronomy.configuration.logger`
+  when set, falling back to `Kernel#warn`. Previously the warning always went to `$stderr`.
+- **`require "set"` added to `workflow.rb`** (#159): Eliminates an implicit dependency on
+  `Set` being pre-loaded by another gem.
+- **`Tool::Base#validate_nested_object` rejects undeclared extra keys** (#166): Keys present
+  in the LLM-supplied hash but absent from the tool's nested `param` schema now produce a
+  validation error rather than being silently forwarded.
+- **`WorkflowContext#merge` deep-copies unchanged fields** (#123): Fields absent from the
+  `merge` argument were previously shared by reference with the original context, allowing
+  one branch to mutate another branch's state. All fields are now independently copied.
+- **Robust metadata parsing in `VectorStore::Pgvector#search`** (#139): Metadata stored as a
+  PostgreSQL JSON string is now parsed correctly regardless of whether the database driver
+  returns a `String` or an already-decoded `Hash`.
+- **`OutputParser::JsonParser` tries all fenced code blocks before falling back** (#146):
+  The parser now scans every fenced block in the LLM response (in order) and returns the
+  first one that parses as valid JSON, rather than only checking the first block. This
+  improves reliability with models that include prose before the JSON block.
+- **`on_error: :return_empty` emits a warning and returns a descriptive string** (#147):
+  Errors in tools that declare `on_error :return_empty` are now reported via `warn` before the
+  tool returns. The placeholder string includes the tool name and a brief reason, making
+  silent failures easier to diagnose.
+- **`context_version_cache` accessible after `invoke` completes**: The thread-local cache is
+  cleared in `invoke`'s `ensure` block, which caused `context_version_cache` to return `nil`
+  immediately after every call. The value is now persisted in `@last_context_version_cache`
+  so it remains readable post-invoke.
+- **`WorkflowContext` field type `:merge` comment corrected**: The inline comment incorrectly
+  described `:merge` as a deep-merge. It performs a shallow merge (`Hash#merge`). The comment
+  has been updated.
+- **`WorkflowContext` return value from entry actions now adopted in EventLoop mode** (#107):
+  `FSMSession` previously discarded the `WorkflowContext` returned by entry action callables,
+  causing `s.merge(...)` updates to be silently lost when `event_loop = true`. The context is
+  now correctly propagated, bringing EventLoop semantics in line with the synchronous
+  `WorkflowRunner`. Regression tests added in `spec/phronomy/fsm_session_spec.rb` (unit)
+  and `spec/integration/workflow_spec.rb` (integration, both sync and EventLoop paths).
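The `EventLoop#start` entry above (#203) treats the stop sentinel as a pure unblock signal and lets `@running` alone decide termination. A minimal sketch of that pattern, assuming a plain `Queue`-backed worker thread. `SketchEventLoop`, `post`, and `handled` are hypothetical names for illustration and are not phronomy's implementation:

```ruby
# Hypothetical sketch of a cooperative event loop; not phronomy's EventLoop.
class SketchEventLoop
  STOP = :__stop__

  attr_reader :handled

  def initialize
    @queue = Queue.new
    @mutex = Mutex.new
    @running = false
    @thread = nil
    @handled = []
  end

  # Idempotent: calling start while already running is a no-op (returns false).
  def start
    @mutex.synchronize do
      return false if @running
      @running = true
      @thread = Thread.new { run }
    end
    true
  end

  def post(event)
    @queue << event
  end

  # Cooperative shutdown: clear the flag, unblock queue.pop, then join.
  def stop
    thread = nil
    @mutex.synchronize do
      return false unless @running
      @running = false
      @queue << STOP
      thread = @thread
      @thread = nil
    end
    thread.join
    true
  end

  private

  def run
    while @running
      event = @queue.pop
      # The sentinel only wakes the thread; a stale one left over from an
      # earlier stop is skipped instead of terminating the new worker.
      next if event == STOP
      @handled << event
    end
  end
end
```

With `break` in place of `next`, a sentinel left in the queue by an earlier `stop` would end a freshly started worker on its first `pop`, which is the race the entry describes.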
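The `WorkflowContext.field` entry above explains why plain `Array` or `Hash` defaults are rejected. The sketch below illustrates the underlying hazard and the Proc-based alternative using a stand-in class. `SketchContext` and `OrderContext` are hypothetical and only mimic the behaviour described in the entry; phronomy's own class may differ in detail:

```ruby
# Hypothetical stand-in for a field DSL; not phronomy's WorkflowContext.
class SketchContext
  def self.fields
    @fields ||= {}
  end

  def self.field(name, default: nil)
    # A literal collection would be one object shared by every instance.
    if default.is_a?(Array) || default.is_a?(Hash)
      raise ArgumentError,
        "field :#{name}: wrap collection defaults in a Proc, e.g. default: -> { [] }"
    end
    fields[name] = default
    attr_reader name
  end

  def initialize
    self.class.fields.each do |name, default|
      # A callable default is evaluated per instance, yielding a fresh object.
      value = default.respond_to?(:call) ? default.call : default
      instance_variable_set(:"@#{name}", value)
    end
  end
end

class OrderContext < SketchContext
  field :items, default: -> { [] }
  field :status, default: :pending
end

first = OrderContext.new
second = OrderContext.new
first.items << :widget # mutates only first's array
```

Because the Proc runs once per instance, `first.items` and `second.items` are distinct arrays, so mutating one invocation's state cannot leak into another.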
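The `OutputParser::JsonParser` entry above (#146) describes scanning every fenced block in order and returning the first one that parses as JSON. A rough sketch of that strategy using only the standard library. `first_json_block` is a hypothetical helper, and parsing the whole response as the final fallback is an assumption, since the entry does not say what the fallback is:

```ruby
require "json"

# Hypothetical helper; not phronomy's OutputParser::JsonParser.
FENCE_MARK = "`" * 3
# Opening fence with an optional language tag, then the body up to the next fence.
FENCED_BLOCK = /#{FENCE_MARK}[ \t]*\w*[ \t]*\r?\n(.*?)#{FENCE_MARK}/m

def first_json_block(text)
  text.scan(FENCED_BLOCK).flatten.each do |body|
    begin
      return JSON.parse(body)
    rescue JSON::ParserError
      next # not JSON (e.g. a code sample); try the following block
    end
  end
  # Assumed fallback: treat the entire response as JSON.
  JSON.parse(text)
end
```

A response whose first fenced block is a code sample and whose second is the JSON payload yields the payload; a response with no fences at all goes straight to the fallback.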
+### Documentation
+- **`trace_pii = false` description corrected** (#153): The inline comment and README Note
+  now correctly state that both the input and the output are redacted.
+- **`invoke_timeout` is a wait timeout, not cancellation** (#163): The YARD comment now
+  explicitly documents that the background agent thread and in-flight LLM/tool calls are
+  **not** interrupted when the timeout fires. Only the caller receives `TimeoutError`.
+- **`context_version_cache` thread-safety limitation documented** (#161): A NOTE in the YARD
+  comment explains that the per-instance cache is not thread-safe when the same agent
+  instance is shared across threads.
+- **`trace_pii` option documented in README**: The `trace_pii:` configuration key and its
+  behaviour (default `false`, redacts input and output in trace records) are now described in
+  the Configuration section of the README.
+- **CJK token under-count warning in `TokenEstimator`**: A note in both the source and README
+  explains that the byte-based heuristic under-counts CJK characters by roughly 3×. Users
+  processing Chinese, Japanese, or Korean content should apply a correction factor or use a
+  model-specific tokenizer.
+- **Stability labels, `reset_configuration!` caveat, CI, and gemspec** (#140 / #141 / #142 / #143 / #148 / #150):
+  README stability table revised for several APIs. `Phronomy.reset_configuration!` now carries
+  a warning that it is intended for test isolation only. Gemspec upper bounds added for
+  `ruby_llm` and `pg`. `ruby head` added to the CI test matrix. README API smoke tests added.
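The CJK note above suggests applying a correction factor to the estimate. One possible shape for such a correction, assuming the caller already has a base estimate from some estimator. `cjk_corrected_estimate` and its `factor:` parameter are hypothetical, and scaling linearly by the share of CJK characters is an illustrative choice rather than anything phronomy prescribes:

```ruby
# Hypothetical correction helper; not part of phronomy's TokenEstimator.
CJK_CHAR = /[\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}]/

# base_estimate: token count from any existing estimator.
# factor: assumed under-count multiplier for pure CJK text (roughly 3 per the note).
def cjk_corrected_estimate(text, base_estimate, factor: 3.0)
  return base_estimate if text.empty?

  cjk_ratio = text.scan(CJK_CHAR).size.to_f / text.length
  # Interpolate between 1x (no CJK characters) and factor (all CJK characters).
  (base_estimate * (1 + (factor - 1) * cjk_ratio)).ceil
end
```

Text with no CJK characters keeps its base estimate, while fully CJK text is scaled by the whole factor. A model-specific tokenizer remains the more accurate option.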
+---
+## [0.6.0] - 2026-05-21
 ### Removed
 - **`Phronomy::Guardrail::Builtin` module removed**: `PromptInjectionDetector`

data/CONTRIBUTING.md ADDED

@@ -0,0 +1,102 @@
+# Contributing to phronomy
+Thank you for your interest in contributing!
+---
+## Development Setup
+```bash
+git clone https://github.com/Raizo-TCS/phronomy.git
+cd phronomy
+bundle install
+```
+Run the test suite:
+```bash
+bundle exec rspec --format documentation
+bundle exec rspec --tag integration
+```
+Run the linter:
+```bash
+bundle exec standardrb
+```
+Check that no Japanese characters appear in source files:
+```bash
+ruby scripts/check_japanese.rb
+```
+---
+## Code Style
+- All source files under `lib/` and `spec/` must begin with `# frozen_string_literal: true`.
+- All comments, error messages (`raise`), and YARD documentation inside source files must be in **English**.
+- Follow [Ruby Standard Style](https://github.com/standardrb/standard) (`standardrb`).
+---
+## Public API Changes
+When adding, removing, or renaming a public method or class:
+1. Update the stability table in `README.md`.
+2. Add or update `@api private` YARD annotations for internal APIs.
+3. Regenerate the API compatibility snapshot:
+   ```bash
+   bundle exec ruby scripts/api_snapshot.rb --write
+   ```
+---
+## Architecture Decision Records
+Key design decisions are documented as ADRs in
+[docs/decisions/](docs/decisions/). Read these before making significant changes
+to the threading model, caching strategy, or public API shape.
+---
+## Mutation Testing
+Phronomy uses [mutant](https://github.com/mbj/mutant) to verify that each test
+actually detects real code changes. Mutation tests are **not** part of the
+required CI gate because they are slow; they run nightly via `.github/workflows/nightly-mutation.yml`.
+### Run mutation tests locally
+```bash
+# All subjects defined in .mutant.yml
+bash scripts/run_mutation.sh
+# Single subject
+bash scripts/run_mutation.sh "Phronomy::WorkflowContext"
+```
+### Coverage targets
+| Subject | Baseline | Target |
+|---|---|---|
+| `Phronomy::WorkflowContext` | 84.85% | ≥ 80% |
+| `Phronomy::WorkflowRunner` | — | ≥ 80% |
+| `Phronomy::Tool::Base` | 55.74% | ≥ 80% |
+| `Phronomy::Context::TokenBudget` | — | ≥ 80% |
+| `Phronomy::VectorStore::InMemory` | — | ≥ 80% |
+When you add or modify tests for a covered subject, run mutation tests to confirm
+the score does not regress.
+---
+## Releasing
+See [RELEASE_CHECKLIST.md](RELEASE_CHECKLIST.md) for the full pre-release quality
+gate and step-by-step release instructions.
+**Never run `gem push` directly.** Releases are published via the GitHub Actions
+`release.yml` workflow.