RubyGems - agentic - Versions diffs - 0.1.0 → 0.2.0 - Mend

agentic 0.1.0 → 0.2.0

Files changed (130)

checksums.yaml +4 -4
data/.agentic.yml +2 -0
data/.architecture/decisions/ArchitecturalFeatureBuilder.md +136 -0
data/.architecture/decisions/ArchitectureConsiderations.md +200 -0
data/.architecture/decisions/adr_001_observer_pattern_implementation.md +196 -0
data/.architecture/decisions/adr_002_plan_orchestrator.md +320 -0
data/.architecture/decisions/adr_003_plan_orchestrator_interface.md +179 -0
data/.architecture/decisions/adrs/ADR-001-dependency-management.md +147 -0
data/.architecture/decisions/adrs/ADR-002-system-boundaries.md +162 -0
data/.architecture/decisions/adrs/ADR-003-content-safety.md +158 -0
data/.architecture/decisions/adrs/ADR-004-agent-permissions.md +161 -0
data/.architecture/decisions/adrs/ADR-005-adaptation-engine.md +127 -0
data/.architecture/decisions/adrs/ADR-006-extension-system.md +273 -0
data/.architecture/decisions/adrs/ADR-007-learning-system.md +156 -0
data/.architecture/decisions/adrs/ADR-008-prompt-generation.md +325 -0
data/.architecture/decisions/adrs/ADR-009-task-failure-handling.md +353 -0
data/.architecture/decisions/adrs/ADR-010-task-input-handling.md +251 -0
data/.architecture/decisions/adrs/ADR-011-task-observable-pattern.md +391 -0
data/.architecture/decisions/adrs/ADR-012-task-output-handling.md +205 -0
data/.architecture/decisions/adrs/ADR-013-architecture-alignment.md +211 -0
data/.architecture/decisions/adrs/ADR-014-agent-capability-registry.md +80 -0
data/.architecture/decisions/adrs/ADR-015-persistent-agent-store.md +100 -0
data/.architecture/decisions/adrs/ADR-016-agent-assembly-engine.md +117 -0
data/.architecture/decisions/adrs/ADR-017-streaming-observability.md +171 -0
data/.architecture/decisions/capability_tools_distinction.md +150 -0
data/.architecture/decisions/cli_command_structure.md +61 -0
data/.architecture/implementation/agent_self_assembly_implementation.md +267 -0
data/.architecture/implementation/agent_self_assembly_summary.md +138 -0
data/.architecture/members.yml +187 -0
data/.architecture/planning/self_implementation_exercise.md +295 -0
data/.architecture/planning/session_compaction_rule.md +43 -0
data/.architecture/planning/streaming_observability_feature.md +223 -0
data/.architecture/principles.md +151 -0
data/.architecture/recalibration/0-2-0.md +92 -0
data/.architecture/recalibration/agent_self_assembly.md +238 -0
data/.architecture/recalibration/cli_command_structure.md +91 -0
data/.architecture/recalibration/implementation_roadmap_0-2-0.md +301 -0
data/.architecture/recalibration/progress_tracking_0-2-0.md +114 -0
data/.architecture/recalibration_process.md +127 -0
data/.architecture/reviews/0-2-0.md +181 -0
data/.architecture/reviews/cli_command_duplication.md +98 -0
data/.architecture/templates/adr.md +105 -0
data/.architecture/templates/implementation_roadmap.md +125 -0
data/.architecture/templates/progress_tracking.md +89 -0
data/.architecture/templates/recalibration_plan.md +70 -0
data/.architecture/templates/version_comparison.md +124 -0
data/.claude/settings.local.json +13 -0
data/.claude-sessions/001-task-class-architecture-implementation.md +129 -0
data/.claude-sessions/002-plan-orchestrator-interface-review.md +105 -0
data/.claude-sessions/architecture-governance-implementation.md +37 -0
data/.claude-sessions/architecture-review-session.md +27 -0
data/ArchitecturalFeatureBuilder.md +136 -0
data/ArchitectureConsiderations.md +229 -0
data/CHANGELOG.md +57 -2
data/CLAUDE.md +111 -0
data/CONTRIBUTING.md +286 -0
data/MAINTAINING.md +301 -0
data/README.md +582 -28
data/docs/agent_capabilities_api.md +259 -0
data/docs/artifact_extension_points.md +757 -0
data/docs/artifact_generation_architecture.md +323 -0
data/docs/artifact_implementation_plan.md +596 -0
data/docs/artifact_integration_points.md +345 -0
data/docs/artifact_verification_strategies.md +581 -0
data/docs/streaming_observability_architecture.md +510 -0
data/exe/agentic +6 -1
data/lefthook.yml +5 -0
data/lib/agentic/adaptation_engine.rb +124 -0
data/lib/agentic/agent.rb +181 -4
data/lib/agentic/agent_assembly_engine.rb +442 -0
data/lib/agentic/agent_capability_registry.rb +260 -0
data/lib/agentic/agent_config.rb +63 -0
data/lib/agentic/agent_specification.rb +46 -0
data/lib/agentic/capabilities/examples.rb +530 -0
data/lib/agentic/capabilities.rb +14 -0
data/lib/agentic/capability_provider.rb +146 -0
data/lib/agentic/capability_specification.rb +118 -0
data/lib/agentic/cli/agent.rb +31 -0
data/lib/agentic/cli/capabilities.rb +191 -0
data/lib/agentic/cli/config.rb +134 -0
data/lib/agentic/cli/execution_observer.rb +796 -0
data/lib/agentic/cli.rb +1068 -0
data/lib/agentic/default_agent_provider.rb +35 -0
data/lib/agentic/errors/llm_error.rb +184 -0
data/lib/agentic/execution_plan.rb +53 -0
data/lib/agentic/execution_result.rb +91 -0
data/lib/agentic/expected_answer_format.rb +46 -0
data/lib/agentic/extension/domain_adapter.rb +109 -0
data/lib/agentic/extension/plugin_manager.rb +163 -0
data/lib/agentic/extension/protocol_handler.rb +116 -0
data/lib/agentic/extension.rb +45 -0
data/lib/agentic/factory_methods.rb +9 -1
data/lib/agentic/generation_stats.rb +61 -0
data/lib/agentic/learning/README.md +84 -0
data/lib/agentic/learning/capability_optimizer.rb +613 -0
data/lib/agentic/learning/execution_history_store.rb +251 -0
data/lib/agentic/learning/pattern_recognizer.rb +500 -0
data/lib/agentic/learning/strategy_optimizer.rb +706 -0
data/lib/agentic/learning.rb +131 -0
data/lib/agentic/llm_assisted_composition_strategy.rb +188 -0
data/lib/agentic/llm_client.rb +215 -15
data/lib/agentic/llm_config.rb +65 -1
data/lib/agentic/llm_response.rb +163 -0
data/lib/agentic/logger.rb +1 -1
data/lib/agentic/observable.rb +51 -0
data/lib/agentic/persistent_agent_store.rb +385 -0
data/lib/agentic/plan_execution_result.rb +129 -0
data/lib/agentic/plan_orchestrator.rb +464 -0
data/lib/agentic/plan_orchestrator_config.rb +57 -0
data/lib/agentic/retry_config.rb +63 -0
data/lib/agentic/retry_handler.rb +125 -0
data/lib/agentic/structured_outputs.rb +1 -1
data/lib/agentic/task.rb +193 -0
data/lib/agentic/task_definition.rb +39 -0
data/lib/agentic/task_execution_result.rb +92 -0
data/lib/agentic/task_failure.rb +66 -0
data/lib/agentic/task_output_schemas.rb +112 -0
data/lib/agentic/task_planner.rb +54 -19
data/lib/agentic/task_result.rb +48 -0
data/lib/agentic/ui.rb +244 -0
data/lib/agentic/verification/critic_framework.rb +116 -0
data/lib/agentic/verification/llm_verification_strategy.rb +60 -0
data/lib/agentic/verification/schema_verification_strategy.rb +47 -0
data/lib/agentic/verification/verification_hub.rb +62 -0
data/lib/agentic/verification/verification_result.rb +50 -0
data/lib/agentic/verification/verification_strategy.rb +26 -0
data/lib/agentic/version.rb +1 -1
data/lib/agentic.rb +74 -2
data/plugins/README.md +41 -0
metadata +245 -6

data/.architecture/decisions/adr_002_plan_orchestrator.md ADDED

@@ -0,0 +1,320 @@
+# ADR 002: Plan Orchestrator Implementation with Async
+## Status
+Proposed
+## Context
+The Agentic framework requires a component to manage the execution of tasks in a plan, handling dependencies, orchestrating execution flow, and managing task state transitions. Having already implemented the Observable pattern for task state notification, we need to evaluate whether to continue with that approach or to use a more comprehensive concurrency solution for our PlanOrchestrator.
+After reviewing the available options, we're considering whether the [socketry/async](https://github.com/socketry/async) gem would provide a better foundation for task orchestration than our current Observable pattern approach.
+## Decision Drivers
+1. **Execution Flexibility**: Support for various execution patterns (sequential, parallel, conditional)
+2. **Dependency Management**: Ability to handle complex task dependencies
+3. **Failure Handling**: Graceful management of task failures with appropriate recovery strategies
+4. **Extensibility**: Support for custom execution strategies and plugins
+5. **Observability**: Comprehensive metrics and visibility into execution state
+6. **Human Intervention**: Clear points for human involvement when needed
+7. **State Management**: Maintaining consistent state across the execution lifecycle
+8. **Concurrency Model**: Efficient and predictable concurrency
+9. **Resource Management**: Proper handling of system resources
+## Options Considered
+### 1. Observer Pattern-Based Orchestrator (Current Approach)
+**Description**:
+- Continue using our current Observer pattern implementation
+- Design PlanOrchestrator as an observer of task state changes
+- React to events (task completion, failure) to drive execution
+**Pros**:
+- Already implemented and tested
+- No additional dependencies
+- Simple to understand
+- Loose coupling between tasks and orchestrator
+**Cons**:
+- Limited concurrency support - not optimized for parallel execution
+- Manual thread management required for true parallelism
+- No built-in limiting of concurrency
+- Potential for race conditions in concurrent scenarios
+### 2. Async Gem-Based Orchestrator
+**Description**:
+- Use socketry/async as the foundation for task orchestration
+- Leverage Async's fiber-based concurrency model
+- Use Async barriers and semaphores to manage task completion and concurrency
+**Pros**:
+- Fiber-based concurrency model is lightweight and efficient
+- Built-in mechanisms for limiting concurrency (semaphores)
+- Better support for parallel execution
+- Task parent-child relationships handle cleanup automatically
+- More idiomatic for concurrent Ruby code
+- Support for graceful termination
+**Cons**:
+- Adds an external dependency
+- Different mental model than our current implementation
+- May require refactoring of existing task code
+- Learning curve for developers unfamiliar with Async
+## Decision
+We will adopt the **Async Gem-Based Orchestrator** approach (Option 2) for our PlanOrchestrator implementation. While the Observer pattern has served us well for simple task state notification, the Async gem provides a more comprehensive solution for managing concurrent task execution with better control over resource usage, concurrency limits, and error handling.
+### Implementation Approach
+The PlanOrchestrator will leverage Async's capabilities while maintaining our existing task lifecycle and interface. We'll implement it as follows:
+```ruby
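+require "async"
+require "async/barrier"
+require "async/semaphore"
+require "securerandom"
+require "set"
+module Agentic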
+  class PlanOrchestrator
+    attr_reader :plan_id, :tasks, :execution_state, :results
+    def initialize(plan_id: SecureRandom.uuid, concurrency_limit: 10)
+      @plan_id = plan_id
+      @tasks = {}
+      @dependencies = {}
+      @results = {}
+      @execution_state = {
+        pending: Set.new,
+        in_progress: Set.new,
+        completed: Set.new,
+        failed: Set.new
+      }
+      @concurrency_limit = concurrency_limit
+    end
+    def add_task(task, dependencies = [])
+      task_id = task.id
+      @tasks[task_id] = task
+      @dependencies[task_id] = Array(dependencies)
+      @execution_state[:pending].add(task_id)
+    end
+    def execute_plan(agent_provider)
+      Async do |reactor|
+        barrier = Async::Barrier.new
+        semaphore = Async::Semaphore.new(@concurrency_limit, parent: barrier)
+        # Start with tasks that have no dependencies
+        eligible_tasks = find_eligible_tasks
+        # Initial execution of eligible tasks
+        eligible_tasks.each do |task_id|
+          schedule_task(task_id, agent_provider, semaphore, barrier)
+        end
+        # Wait for all tasks to complete
+        barrier.wait
+      end
+      # Return execution results
+      {
+        plan_id: @plan_id,
+        status: overall_status,
+        tasks: @tasks.transform_values(&:to_h),
+        results: @results
+      }
+    end
+    private
+    def schedule_task(task_id, agent_provider, semaphore, barrier)
+      return unless @execution_state[:pending].include?(task_id)
+      # Move to in_progress state
+      @execution_state[:pending].delete(task_id)
+      @execution_state[:in_progress].add(task_id)
+      task = @tasks[task_id]
+      # Schedule task execution with the semaphore
+      semaphore.async do
+        begin
+          agent = agent_provider.get_agent_for_task(task)
+          result = task.perform(agent)
+          # Record result and update state
+          if result.successful?
+            @execution_state[:in_progress].delete(task_id)
+            @execution_state[:completed].add(task_id)
+            @results[task_id] = {
+              status: :completed,
+              output: result.output
+            }
+            # Find and schedule dependent tasks
+            schedule_dependent_tasks(task_id, agent_provider, semaphore, barrier)
+          else
+            @execution_state[:in_progress].delete(task_id)
+            @execution_state[:failed].add(task_id)
+            @results[task_id] = {
+              status: :failed,
+              failure: result.failure&.to_h
+            }
+            # Handle failure based on policy
+            handle_task_failure(task, result.failure, agent_provider, semaphore, barrier)
+          end
+        rescue => e
+          # Handle unexpected errors
+          @execution_state[:in_progress].delete(task_id)
+          @execution_state[:failed].add(task_id)
+          @results[task_id] = {
+            status: :failed,
+            failure: TaskFailure.from_exception(e).to_h
+          }
+          Agentic.logger.error("Unexpected error in task #{task_id}: #{e.message}")
+        end
+      end
+    end
+    def schedule_dependent_tasks(completed_task_id, agent_provider, semaphore, barrier)
+      # Find tasks that depend on the completed task
+      dependent_tasks = @dependencies.select do |task_id, deps|
+        deps.include?(completed_task_id) && @execution_state[:pending].include?(task_id)
+      end.keys
+      # For each dependent task, check if all dependencies are satisfied
+      dependent_tasks.each do |task_id|
+        deps = @dependencies[task_id]
+        all_deps_satisfied = deps.all? do |dep_id|
+          @execution_state[:completed].include?(dep_id)
+        end
+        if all_deps_satisfied
+          schedule_task(task_id, agent_provider, semaphore, barrier)
+        end
+      end
+    end
+    def handle_task_failure(task, failure, agent_provider, semaphore, barrier)
+      # Implement different strategies based on failure type
+      case failure.type
+      when "TimeoutError"
+        # Maybe retry with extended timeout
+        Agentic.logger.info("Task #{task.id} failed with timeout, retrying...")
+        retry_task(task, agent_provider, semaphore, barrier)
+      when "AuthenticationError"
+        # Maybe request new credentials
+        Agentic.logger.warn("Task #{task.id} failed with authentication error, intervention required")
+        request_human_intervention(task, failure)
+      else
+        # Apply general failure policy
+        Agentic.logger.error("Task #{task.id} failed: #{failure.message}")
+      end
+    end
+    def retry_task(task, agent_provider, semaphore, barrier, max_retries = 3)
+      # Check if the task can be retried
+      return unless task.status == :failed
+      return if task.retry_count && task.retry_count >= max_retries
+      # Increment retry count
+      task.retry_count ||= 0
+      task.retry_count += 1
+      # Put task back in pending state
+      @execution_state[:failed].delete(task.id)
+      @execution_state[:pending].add(task.id)
+      # Schedule retrying the task
+      schedule_task(task.id, agent_provider, semaphore, barrier)
+    end
+    def request_human_intervention(task, failure)
+      # This would integrate with the yet-to-be-implemented human intervention system
+      Agentic.logger.warn("Human intervention requested for task #{task.id}: #{failure.message}")
+    end
+    def find_eligible_tasks
+      @dependencies.select do |task_id, deps|
+        deps.empty? && @execution_state[:pending].include?(task_id)
+      end.keys
+    end
+    def overall_status
+      if @execution_state[:failed].any?
+        :partial_failure
+      elsif @execution_state[:pending].empty? && @execution_state[:in_progress].empty?
+        :completed
+      else
+        :in_progress
+      end
+    end
+  end
+end
+```
+## Integration with Existing Task System
+To integrate with the Async-based PlanOrchestrator, we'll need to make the following adjustments to our Task implementation:
+1. **Minimal Interface Changes**: We'll maintain the current Task interface while ensuring it's compatible with Async's concurrency model.
+2. **Task Result Handling**: We'll continue using our TaskResult approach for communicating execution outcomes.
+3. **Observer Pattern Coexistence**: We'll maintain the Observer pattern for task state notification, which can coexist with the Async-based execution model, allowing components that don't need concurrent execution to still observe task state.
+## Consequences
+### Positive
+1. **Better Concurrency**: Fiber-based concurrency offers a more efficient and scalable model for task execution.
+2. **Resource Management**: Built-in semaphores prevent resource exhaustion by limiting concurrent tasks.
+3. **Task Lifecycle**: Parent-child relationships in Async tasks handle cleanup and termination automatically.
+4. **Simplified Orchestration**: The complexity of managing concurrent execution is largely handled by the Async gem.
+5. **Graceful Termination**: Better support for stopping and cleaning up tasks during termination.
+### Negative
+1. **New Dependency**: Adds a dependency on the Async gem.
+2. **Learning Curve**: Team members will need to understand Async's concurrency model.
+3. **Integration Effort**: Requires careful integration with our existing Observer pattern.
+### Neutral
+1. **Performance Characteristics**: Performance is expected to improve, but the actual gains still need to be measured.
+2. **API Evolution**: The Async gem is actively developed, which may introduce API changes over time.
+## Implementation Notes
+1. **Gradual Transition**:
+   - Start by making the PlanOrchestrator Async-based without requiring tasks to change
+   - Later, consider deeper integration where tasks themselves leverage Async
+2. **Testing Strategy**:
+   - Create dedicated tests for the Async-based PlanOrchestrator
+   - Ensure existing tests still pass with the new implementation
+   - Test concurrency limits and behavior under high load
+3. **Monitoring and Metrics**:
+   - Add instrumentation for tracking task execution performance
+   - Measure and compare against the previous Observer-based approach
+4. **Error Handling**:
+   - Ensure proper propagation of errors from Async tasks
+   - Maintain our existing error context information
+## Alternative Paths
+If the Async approach proves problematic, we can:
+1. Revert to our Observer pattern implementation
+2. Consider other concurrency frameworks like concurrent-ruby
+3. Implement a hybrid approach that uses Observer pattern for state notification and a simpler execution model
+## References
+- [Socketry Async GitHub Repository](https://github.com/socketry/async)
+- [Async Best Practices](https://socketry.github.io/async/guides/best-practices/index)
+- [Asynchronous Tasks Guide](https://socketry.github.io/async/guides/asynchronous-tasks/index.html)
+- [Observer Pattern Implementation (ADR-001)](file:///Users/valentinostoll/src/agentic/.architecture-review/adr_001_observer_pattern_implementation.md)
+- [Task Failure Handling Architecture](file:///Users/valentinostoll/src/agentic/.architecture-review/task_failure_handling.md)
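
The scheduling rules in ADR 002 (a task becomes eligible once every dependency has completed, and dependents are scheduled only after a success) can be explored without Async. The following is a minimal synchronous sketch, not the gem's implementation: `MiniOrchestrator` and its block-based tasks are hypothetical names introduced here, and concurrency, retries, and `TaskFailure` are deliberately left out.

```ruby
require "set"

# Hypothetical, synchronous model of the ADR's scheduling rules (no Async).
# Each task is a block that returns true on success and false on failure.
class MiniOrchestrator
  attr_reader :state

  def initialize
    @tasks = {}
    @dependencies = {}
    @state = {pending: Set.new, completed: Set.new, failed: Set.new}
  end

  # Registers a task block together with the ids it depends on.
  def add_task(id, dependencies = [], &work)
    @tasks[id] = work
    @dependencies[id] = Array(dependencies)
    @state[:pending].add(id)
  end

  # Repeatedly runs every pending task whose dependencies have all completed.
  # Stops when no pending task is eligible any more.
  def run
    loop do
      eligible = @state[:pending].select do |id|
        @dependencies[id].all? { |dep| @state[:completed].include?(dep) }
      end
      break if eligible.empty?

      eligible.each do |id|
        @state[:pending].delete(id)
        outcome = @tasks[id].call ? :completed : :failed
        @state[outcome].add(id)
      end
    end
    overall_status
  end

  # Same precedence as the ADR: any failure wins over completion.
  def overall_status
    return :partial_failure if @state[:failed].any?

    @state[:pending].empty? ? :completed : :in_progress
  end
end
```

In a chain where `report` depends on `parse` and `parse` depends on `fetch`, a failing `parse` means `report` never becomes eligible and stays pending, so the plan ends as `:partial_failure`. The ADR code behaves the same way, because `schedule_dependent_tasks` is called only on the success branch.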

data/.architecture/decisions/adr_003_plan_orchestrator_interface.md ADDED

@@ -0,0 +1,179 @@
+# ADR 003: Plan Orchestrator Interface Design
+## Status
+Proposed
+## Context
+The `PlanOrchestrator` class in the Agentic framework is responsible for managing the execution of tasks, handling dependencies, and tracking task state throughout the execution lifecycle. During testing, we encountered a tension between proper encapsulation of implementation details and the need for effective testing.
+Prior to this change, several methods were marked as `private` but were being accessed in tests using `send(:method_name)`. This approach is generally considered a testing anti-pattern as it couples tests to implementation details rather than observable behavior. To address this issue, we needed to decide whether to make these methods public or restructure our testing approach.
+The methods in question were:
+1. `all_dependencies_met?` - Checks if dependencies for a task are satisfied
+2. `find_eligible_tasks` - Identifies tasks eligible for execution
+3. `overall_status` - Determines the current status of the plan
+## Decision Drivers
+1. **Encapsulation**: Maintaining a clean separation between public interface and implementation details
+2. **Testability**: Enabling effective testing without violating encapsulation principles
+3. **API Design**: Creating a coherent and intuitive public API
+4. **Future Compatibility**: Ensuring changes don't restrict future refactoring options
+5. **Code Clarity**: Providing clear boundaries between public and private concerns
+## Options Considered
+### 1. Make the methods public
+**Description**:
+- Move the three methods from the private section to the public interface
+- Update tests to use direct calls instead of `send(:method_name)`
+**Pros**:
+- Immediately solves the testing issue
+- Simple change with minimal code modification
+- No additional dependencies or structures required
+**Cons**:
+- Exposes implementation details that may not belong in the public API
+- Could lead to inappropriate coupling to these methods by client code
+- May restrict future refactoring by creating contract obligations
+- Violates the principle of minimizing public interfaces
+### 2. Create test-specific interfaces or subclasses
+**Description**:
+- Create testing-specific subclasses that expose private methods for testing
+- Keep production code properly encapsulated
+**Pros**:
+- Maintains encapsulation in production code
+- Explicitly separates test-specific access from production interfaces
+- Preserves future refactoring flexibility
+**Cons**:
+- Introduces additional complexity and indirection
+- Requires maintaining test-specific classes
+- May still lead to tests that are coupled to implementation details
+### 3. Refactor to introduce proper abstractions
+**Description**:
+- Identify the underlying architectural concerns represented by these methods
+- Extract these concerns into appropriate abstractions (e.g., a dependency resolver, eligibility provider, status reporter)
+- Make these new abstractions testable components in their own right
+**Pros**:
+- Results in better separation of concerns and cohesion
+- Creates properly designed abstractions rather than exposing implementation details
+- Improves overall system architecture
+- Provides truly unit-testable components
+**Cons**:
+- Requires significant refactoring
+- More time-consuming implementation
+- May require changes to multiple components and tests
+### 4. Use alternative testing approaches
+**Description**:
+- Instead of testing these methods directly, test their observable effects
+- Focus on behavior verification rather than state verification
+- Use integration tests rather than unit tests for orchestration logic
+**Pros**:
+- Avoids coupling tests to implementation details
+- Tests what matters: the observable behavior
+- More resilient to refactoring
+**Cons**:
+- May require more complex test setups
+- Could be more difficult to diagnose test failures
+- Might not provide sufficient test coverage for complex logic
+## Decision
+For immediate pragmatic reasons, we have chosen **Option 1: Make the methods public**. However, we acknowledge that this is a compromise that introduces architectural debt, and we should plan to implement **Option 3: Refactor to introduce proper abstractions** in the future.
+This decision takes into account the immediate need to fix the testing approach while balancing architectural concerns. By making these methods public now, we allow tests to function correctly without using `send(:method_name)`, but we recognize the need for a better long-term solution.
+## Consequences
+### Positive
+1. **Improved Testability**: Tests no longer need to use `send(:method_name)`, making them more straightforward and less brittle.
+2. **Explicit Contract**: The contract of these methods is now explicitly part of the public interface, providing clarity on their expected behavior.
+3. **Documentation Visibility**: The methods' YARD comments now appear in the public API documentation, making their purpose clear.
+### Negative
+1. **Expanded Public Interface**: The public API surface is now larger, potentially making the class harder to understand and use correctly.
+2. **Exposed Implementation Details**: Internal orchestration concepts are now exposed, potentially creating inappropriate dependencies.
+3. **Future Constraints**: These methods must now be maintained as part of the public contract, limiting future refactoring options.
+4. **Design Tension**: The current design violates the principle of minimal public interfaces and proper encapsulation.
+### Neutral
+1. **Method Semantics**: The methods themselves are well-named and their behavior is clear, so even as public methods they are unlikely to cause confusion.
+2. **Documentation**: The methods already had good documentation, so making them public required no additional documentation effort.
+## Implementation Notes
+1. These methods were moved from the private section to the public section:
+```ruby
+# Checks if all dependencies for a task are met
+# @param task_id [String] ID of the task to check
+# @return [Boolean] True if all dependencies are met, false otherwise
+def all_dependencies_met?(task_id)
+  deps = @dependencies[task_id] || []
+  deps.all? do |dep_id|
+    @execution_state[:completed].include?(dep_id)
+  end
+end
+# Finds tasks that are eligible for execution (have no dependencies)
+# @return [Array<String>] IDs of eligible tasks
+def find_eligible_tasks
+  @dependencies.select do |task_id, deps|
+    deps.empty? && @execution_state[:pending].include?(task_id)
+  end.keys
+end
+# Determines the overall status of the plan
+# @return [Symbol] The overall status (:completed, :in_progress, or :partial_failure)
+def overall_status
+  if @execution_state[:failed].any?
+    :partial_failure
+  elsif @execution_state[:pending].empty? && @execution_state[:in_progress].empty?
+    :completed
+  else
+    :in_progress
+  end
+end
+```
+2. Tests were updated to access these methods directly rather than using `send(:method_name)`.
+3. Care was taken to ensure all tests still pass with this change.
+## Future Work
+While this change resolves the immediate issue, several architectural improvements should be considered for future work:
+1. **Task Dependency Resolution**: Extract dependency management into a dedicated component that manages the relationships between tasks and determines eligibility.
+2. **Plan Status Management**: Create a dedicated component for tracking and reporting on plan status, allowing for more sophisticated status reporting.
+3. **Execution State Management**: Consider extracting the state transition logic into a dedicated state management component.
+4. **Test Strategy Review**: Review and potentially revise the testing strategy to focus more on behavior verification rather than state verification.
+## References
+- [Tell Don't Ask Principle](https://martinfowler.com/bliki/TellDontAsk.html)
+- [Law of Demeter](https://en.wikipedia.org/wiki/Law_of_Demeter)
+- [Testing Anti-Patterns: Reaching into Private State](https://blog.thecodewhisperer.com/permalink/getting-your-tests-to-tell-you-when-theyre-asking-for-too-much)
+- [ADR 002: Plan Orchestrator Implementation with Async](file:///Users/valentinostoll/src/agentic/.architecture-review/adr_002_plan_orchestrator.md)
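
Future Work item 2 in ADR 003 (a dedicated component for plan status) could start as small as the sketch below. `PlanStatus` is a hypothetical name, not a class in the gem. It assumes the same `execution_state` hash of sets that `PlanOrchestrator` keeps and reproduces the branching of `overall_status`, so the logic can be unit-tested without widening the orchestrator's public interface.

```ruby
require "set"

# Hypothetical value object (assumed name, not part of the agentic gem).
# Wraps an execution-state hash holding :pending, :in_progress, and :failed sets.
class PlanStatus
  def initialize(execution_state)
    @state = execution_state
  end

  # Same precedence as overall_status in the ADR: failure first, then completion.
  def to_sym
    return :partial_failure if @state.fetch(:failed).any?
    return :completed if @state.fetch(:pending).empty? && @state.fetch(:in_progress).empty?

    :in_progress
  end
end
```

An orchestrator could then expose a single `status` reader that returns `PlanStatus.new(@execution_state).to_sym`, which keeps the branching logic in one small, separately tested class.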

data/.architecture/decisions/adrs/ADR-001-dependency-management.md ADDED

@@ -0,0 +1,147 @@
+# ADR-001: Dependency Management for Tasks
+## Status
+Draft
+## Context
+The architectural review of version 0.2.0 identified that task dependencies are currently handled directly within the PlanOrchestrator without a proper abstraction. This has led to several issues:
+1. The PlanOrchestrator has multiple responsibilities (dependency management, execution flow control, result collection)
+2. Dependency validation and cycle detection are mixed with execution logic
+3. Testing dependency-related logic requires testing the entire orchestration flow
+4. Extending dependency capabilities requires modifying the core orchestration code
+As the system grows, this tight coupling will increasingly limit flexibility and maintainability.
+## Decision Drivers
+* Separation of concerns: Each component should have a single responsibility
+* Testability: Dependency management should be testable in isolation
+* Extensibility: Support for new dependency types and validation rules
+* Maintainability: Reduce complexity in the PlanOrchestrator
+* Performance: Efficient dependency resolution and validation
+## Decision
+Create a dedicated `DependencyGraph` class responsible for managing task dependencies separate from execution orchestration.
+**Architectural Components Affected:**
+* PlanOrchestrator (modified to delegate dependency management)
+* DependencyGraph (new component)
+* ExecutionPlan (potentially modified to include dependency metadata)
+* Task (potentially modified to expose dependency information)
+**Interface Changes:**
+* New public DependencyGraph class with methods for:
+  - Adding dependencies between tasks
+  - Validating the dependency graph (e.g., cycle detection)
+  - Computing execution order based on dependencies
+  - Determining which tasks are ready to execute
+  - Updating graph state when tasks complete
+## Consequences
+### Positive
+* Clearer separation of concerns with single-responsibility components
+* Improved testability for dependency-related logic
+* Easier extension of dependency types (hard dependencies, soft dependencies, optional dependencies)
+* Reduced complexity in PlanOrchestrator
+* Potential for more sophisticated dependency resolution algorithms
+* Clearer visualization of task dependencies for debugging and monitoring
+### Negative
+* Initial development overhead to extract and refactor the dependency logic
+* Need for careful migration to avoid breaking existing users' code
+* Potential for slight performance overhead from additional abstraction layer
+* More classes and interfaces to understand for new developers
+### Neutral
+* Possible need for additional configuration options for dependency behavior
+* Shift in responsibility for dependency validation from execution time to plan time
+## Implementation
+**Phase 1: Internal Abstraction**
+* Create the DependencyGraph class with core functionality
+* Modify PlanOrchestrator to use DependencyGraph internally
+* Add comprehensive tests for DependencyGraph
+* Maintain existing public interfaces to ensure backward compatibility
+**Phase 2: Public API Extension**
+* Expose DependencyGraph as a public API for advanced use cases
+* Add feature flag to control use of the new implementation
+* Provide documentation and examples for the new API
+* Create migration guide for users of custom orchestration extensions
+**Phase 3: Enhanced Capabilities**
+* Add support for different dependency types
+* Implement visualization tools for dependency graphs
+* Create utilities for working with complex dependency scenarios
+## Alternatives Considered
+### Alternative 1: Enhanced PlanOrchestrator without extraction
+**Pros:**
+* Less initial refactoring work
+* No additional abstraction layer
+* Potentially simpler for basic use cases
+**Cons:**
+* Continues to mix concerns in a single component
+* Harder to test in isolation
+* Increasingly complex as new features are added
+* Limited extensibility for different dependency types
+### Alternative 2: Task-based dependency management
+**Pros:**
+* More distributed approach with tasks knowing their dependencies
+* Potentially more intuitive for simple use cases
+* Easier local reasoning about individual task dependencies
+**Cons:**
+* Harder to validate global properties (e.g., cycles in the dependency graph)
+* More complex to determine execution order
+* Limited visibility into the complete dependency structure
+* Potential for redundant dependency checking
+### Alternative 3: Event-based dependency resolution
+**Pros:**
+* Greater decoupling between tasks
+* Support for dynamic dependencies that change during execution
+* Potentially more flexible for complex workflows
+**Cons:**
+* More complex to reason about and debug
+* Harder to validate before execution
+* Potential performance overhead from event processing
+* Steeper learning curve for users
+## Validation
+**Acceptance Criteria:**
+- [ ] All existing dependency functionality works with the new implementation
+- [ ] PlanOrchestrator delegates all dependency management to DependencyGraph
+- [ ] Cycle detection and validation occur at appropriate times
+- [ ] Test coverage for DependencyGraph exceeds 90%
+- [ ] No performance regression in standard benchmark tests
+- [ ] Documentation and examples exist for the new API
+**Testing Approach:**
+* Unit tests for DependencyGraph in isolation
+* Integration tests with PlanOrchestrator
+* Performance benchmarks comparing old and new implementations
+* Edge case testing for complex dependency scenarios
+* Migration tests for existing code
+## References
+* [Architectural Review 0.2.0](../../../.architecture/reviews/0-2-0.md)
+* [Implementation Roadmap](../../../.architecture/recalibration/implementation_roadmap_0-2-0.md)
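
The responsibilities listed under Interface Changes in ADR-001 (adding dependencies, cycle detection, execution order, readiness, and state updates) can be illustrated with a small standalone class. This is a sketch under assumptions: the ADR names the `DependencyGraph` class but not its methods, so `add_task`, `execution_order`, `validate!`, `ready_tasks`, and `mark_completed` are hypothetical. Ordering uses Kahn's algorithm, which is one reasonable choice rather than the one the gem is known to use.

```ruby
require "set"

# Hypothetical sketch of the DependencyGraph proposed in ADR-001.
# All method names are illustrative assumptions, not the gem's API.
class DependencyGraph
  class CycleError < StandardError; end

  def initialize
    @deps = {}            # task_id => Set of prerequisite task ids
    @completed = Set.new  # ids of tasks that have finished successfully
  end

  # Registers a task and its prerequisites; prerequisites become nodes too.
  def add_task(task_id, depends_on: [])
    (@deps[task_id] ||= Set.new).merge(depends_on)
    depends_on.each { |dep| @deps[dep] ||= Set.new }
    self
  end

  # Kahn's algorithm: returns ids in a valid execution order,
  # or raises CycleError when no remaining task is free of prerequisites.
  def execution_order
    remaining = @deps.transform_values(&:dup)
    order = []
    until remaining.empty?
      ready = remaining.select { |_, deps| deps.empty? }.keys
      raise CycleError, "cycle among #{remaining.keys.inspect}" if ready.empty?

      ready.each { |id| remaining.delete(id) }
      remaining.each_value { |deps| deps.subtract(ready) }
      order.concat(ready)
    end
    order
  end

  # Plan-time validation: true when the graph is acyclic, raises otherwise.
  def validate!
    execution_order
    true
  end

  # Tasks that are not completed and whose prerequisites are all completed.
  def ready_tasks
    @deps.select { |id, deps| !@completed.include?(id) && deps.subset?(@completed) }.keys
  end

  # Records a successful task so its dependents can become ready.
  def mark_completed(task_id)
    @completed.add(task_id)
    self
  end
end
```

Because validation is a separate call, cycles can be rejected at plan time before any task runs, which matches the shift from execution-time to plan-time validation noted under Neutral consequences.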