agentic 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (130)
  1. checksums.yaml +4 -4
  2. data/.agentic.yml +2 -0
  3. data/.architecture/decisions/ArchitecturalFeatureBuilder.md +136 -0
  4. data/.architecture/decisions/ArchitectureConsiderations.md +200 -0
  5. data/.architecture/decisions/adr_001_observer_pattern_implementation.md +196 -0
  6. data/.architecture/decisions/adr_002_plan_orchestrator.md +320 -0
  7. data/.architecture/decisions/adr_003_plan_orchestrator_interface.md +179 -0
  8. data/.architecture/decisions/adrs/ADR-001-dependency-management.md +147 -0
  9. data/.architecture/decisions/adrs/ADR-002-system-boundaries.md +162 -0
  10. data/.architecture/decisions/adrs/ADR-003-content-safety.md +158 -0
  11. data/.architecture/decisions/adrs/ADR-004-agent-permissions.md +161 -0
  12. data/.architecture/decisions/adrs/ADR-005-adaptation-engine.md +127 -0
  13. data/.architecture/decisions/adrs/ADR-006-extension-system.md +273 -0
  14. data/.architecture/decisions/adrs/ADR-007-learning-system.md +156 -0
  15. data/.architecture/decisions/adrs/ADR-008-prompt-generation.md +325 -0
  16. data/.architecture/decisions/adrs/ADR-009-task-failure-handling.md +353 -0
  17. data/.architecture/decisions/adrs/ADR-010-task-input-handling.md +251 -0
  18. data/.architecture/decisions/adrs/ADR-011-task-observable-pattern.md +391 -0
  19. data/.architecture/decisions/adrs/ADR-012-task-output-handling.md +205 -0
  20. data/.architecture/decisions/adrs/ADR-013-architecture-alignment.md +211 -0
  21. data/.architecture/decisions/adrs/ADR-014-agent-capability-registry.md +80 -0
  22. data/.architecture/decisions/adrs/ADR-015-persistent-agent-store.md +100 -0
  23. data/.architecture/decisions/adrs/ADR-016-agent-assembly-engine.md +117 -0
  24. data/.architecture/decisions/adrs/ADR-017-streaming-observability.md +171 -0
  25. data/.architecture/decisions/capability_tools_distinction.md +150 -0
  26. data/.architecture/decisions/cli_command_structure.md +61 -0
  27. data/.architecture/implementation/agent_self_assembly_implementation.md +267 -0
  28. data/.architecture/implementation/agent_self_assembly_summary.md +138 -0
  29. data/.architecture/members.yml +187 -0
  30. data/.architecture/planning/self_implementation_exercise.md +295 -0
  31. data/.architecture/planning/session_compaction_rule.md +43 -0
  32. data/.architecture/planning/streaming_observability_feature.md +223 -0
  33. data/.architecture/principles.md +151 -0
  34. data/.architecture/recalibration/0-2-0.md +92 -0
  35. data/.architecture/recalibration/agent_self_assembly.md +238 -0
  36. data/.architecture/recalibration/cli_command_structure.md +91 -0
  37. data/.architecture/recalibration/implementation_roadmap_0-2-0.md +301 -0
  38. data/.architecture/recalibration/progress_tracking_0-2-0.md +114 -0
  39. data/.architecture/recalibration_process.md +127 -0
  40. data/.architecture/reviews/0-2-0.md +181 -0
  41. data/.architecture/reviews/cli_command_duplication.md +98 -0
  42. data/.architecture/templates/adr.md +105 -0
  43. data/.architecture/templates/implementation_roadmap.md +125 -0
  44. data/.architecture/templates/progress_tracking.md +89 -0
  45. data/.architecture/templates/recalibration_plan.md +70 -0
  46. data/.architecture/templates/version_comparison.md +124 -0
  47. data/.claude/settings.local.json +13 -0
  48. data/.claude-sessions/001-task-class-architecture-implementation.md +129 -0
  49. data/.claude-sessions/002-plan-orchestrator-interface-review.md +105 -0
  50. data/.claude-sessions/architecture-governance-implementation.md +37 -0
  51. data/.claude-sessions/architecture-review-session.md +27 -0
  52. data/ArchitecturalFeatureBuilder.md +136 -0
  53. data/ArchitectureConsiderations.md +229 -0
  54. data/CHANGELOG.md +57 -2
  55. data/CLAUDE.md +111 -0
  56. data/CONTRIBUTING.md +286 -0
  57. data/MAINTAINING.md +301 -0
  58. data/README.md +582 -28
  59. data/docs/agent_capabilities_api.md +259 -0
  60. data/docs/artifact_extension_points.md +757 -0
  61. data/docs/artifact_generation_architecture.md +323 -0
  62. data/docs/artifact_implementation_plan.md +596 -0
  63. data/docs/artifact_integration_points.md +345 -0
  64. data/docs/artifact_verification_strategies.md +581 -0
  65. data/docs/streaming_observability_architecture.md +510 -0
  66. data/exe/agentic +6 -1
  67. data/lefthook.yml +5 -0
  68. data/lib/agentic/adaptation_engine.rb +124 -0
  69. data/lib/agentic/agent.rb +181 -4
  70. data/lib/agentic/agent_assembly_engine.rb +442 -0
  71. data/lib/agentic/agent_capability_registry.rb +260 -0
  72. data/lib/agentic/agent_config.rb +63 -0
  73. data/lib/agentic/agent_specification.rb +46 -0
  74. data/lib/agentic/capabilities/examples.rb +530 -0
  75. data/lib/agentic/capabilities.rb +14 -0
  76. data/lib/agentic/capability_provider.rb +146 -0
  77. data/lib/agentic/capability_specification.rb +118 -0
  78. data/lib/agentic/cli/agent.rb +31 -0
  79. data/lib/agentic/cli/capabilities.rb +191 -0
  80. data/lib/agentic/cli/config.rb +134 -0
  81. data/lib/agentic/cli/execution_observer.rb +796 -0
  82. data/lib/agentic/cli.rb +1068 -0
  83. data/lib/agentic/default_agent_provider.rb +35 -0
  84. data/lib/agentic/errors/llm_error.rb +184 -0
  85. data/lib/agentic/execution_plan.rb +53 -0
  86. data/lib/agentic/execution_result.rb +91 -0
  87. data/lib/agentic/expected_answer_format.rb +46 -0
  88. data/lib/agentic/extension/domain_adapter.rb +109 -0
  89. data/lib/agentic/extension/plugin_manager.rb +163 -0
  90. data/lib/agentic/extension/protocol_handler.rb +116 -0
  91. data/lib/agentic/extension.rb +45 -0
  92. data/lib/agentic/factory_methods.rb +9 -1
  93. data/lib/agentic/generation_stats.rb +61 -0
  94. data/lib/agentic/learning/README.md +84 -0
  95. data/lib/agentic/learning/capability_optimizer.rb +613 -0
  96. data/lib/agentic/learning/execution_history_store.rb +251 -0
  97. data/lib/agentic/learning/pattern_recognizer.rb +500 -0
  98. data/lib/agentic/learning/strategy_optimizer.rb +706 -0
  99. data/lib/agentic/learning.rb +131 -0
  100. data/lib/agentic/llm_assisted_composition_strategy.rb +188 -0
  101. data/lib/agentic/llm_client.rb +215 -15
  102. data/lib/agentic/llm_config.rb +65 -1
  103. data/lib/agentic/llm_response.rb +163 -0
  104. data/lib/agentic/logger.rb +1 -1
  105. data/lib/agentic/observable.rb +51 -0
  106. data/lib/agentic/persistent_agent_store.rb +385 -0
  107. data/lib/agentic/plan_execution_result.rb +129 -0
  108. data/lib/agentic/plan_orchestrator.rb +464 -0
  109. data/lib/agentic/plan_orchestrator_config.rb +57 -0
  110. data/lib/agentic/retry_config.rb +63 -0
  111. data/lib/agentic/retry_handler.rb +125 -0
  112. data/lib/agentic/structured_outputs.rb +1 -1
  113. data/lib/agentic/task.rb +193 -0
  114. data/lib/agentic/task_definition.rb +39 -0
  115. data/lib/agentic/task_execution_result.rb +92 -0
  116. data/lib/agentic/task_failure.rb +66 -0
  117. data/lib/agentic/task_output_schemas.rb +112 -0
  118. data/lib/agentic/task_planner.rb +54 -19
  119. data/lib/agentic/task_result.rb +48 -0
  120. data/lib/agentic/ui.rb +244 -0
  121. data/lib/agentic/verification/critic_framework.rb +116 -0
  122. data/lib/agentic/verification/llm_verification_strategy.rb +60 -0
  123. data/lib/agentic/verification/schema_verification_strategy.rb +47 -0
  124. data/lib/agentic/verification/verification_hub.rb +62 -0
  125. data/lib/agentic/verification/verification_result.rb +50 -0
  126. data/lib/agentic/verification/verification_strategy.rb +26 -0
  127. data/lib/agentic/version.rb +1 -1
  128. data/lib/agentic.rb +74 -2
  129. data/plugins/README.md +41 -0
  130. metadata +245 -6
@@ -0,0 +1,320 @@
1
+ # ADR 002: Plan Orchestrator Implementation with Async
2
+
3
+ ## Status
4
+
5
+ Proposed
6
+
7
+ ## Context
8
+
9
+ The Agentic framework requires a component to manage the execution of tasks in a plan, handling dependencies, orchestrating execution flow, and managing task state transitions. Having already implemented the Observable pattern for task state notification, we need to evaluate whether to continue with that approach or to use a more comprehensive concurrency solution for our PlanOrchestrator.
10
+
11
+ After reviewing the available options, we're considering whether the [socketry/async](https://github.com/socketry/async) gem would provide a better foundation for task orchestration than our current Observable pattern approach.
12
+
13
+ ## Decision Drivers
14
+
15
+ 1. **Execution Flexibility**: Support for various execution patterns (sequential, parallel, conditional)
16
+ 2. **Dependency Management**: Ability to handle complex task dependencies
17
+ 3. **Failure Handling**: Graceful management of task failures with appropriate recovery strategies
18
+ 4. **Extensibility**: Support for custom execution strategies and plugins
19
+ 5. **Observability**: Comprehensive metrics and visibility into execution state
20
+ 6. **Human Intervention**: Clear points for human involvement when needed
21
+ 7. **State Management**: Maintaining consistent state across the execution lifecycle
22
+ 8. **Concurrency Model**: Efficient and predictable concurrency
23
+ 9. **Resource Management**: Proper handling of system resources
24
+
25
+ ## Options Considered
26
+
27
+ ### 1. Observer Pattern-Based Orchestrator (Current Approach)
28
+
29
+ **Description**:
30
+ - Continue using our current Observer pattern implementation
31
+ - Design PlanOrchestrator as an observer of task state changes
32
+ - React to events (task completion, failure) to drive execution
33
+
34
+ **Pros**:
35
+ - Already implemented and tested
36
+ - No additional dependencies
37
+ - Simple to understand
38
+ - Loose coupling between tasks and orchestrator
39
+
40
+ **Cons**:
41
+ - Limited concurrency support - not optimized for parallel execution
42
+ - Manual thread management required for true parallelism
43
+ - No built-in limiting of concurrency
44
+ - Potential for race conditions in concurrent scenarios
45
+
46
+ ### 2. Async Gem-Based Orchestrator
47
+
48
+ **Description**:
49
+ - Use socketry/async as the foundation for task orchestration
50
+ - Leverage Async's fiber-based concurrency model
51
+ - Use Async barriers and semaphores to manage task completion and concurrency
52
+
53
+ **Pros**:
54
+ - Fiber-based concurrency model is lightweight and efficient
55
+ - Built-in mechanisms for limiting concurrency (semaphores)
56
+ - Better support for parallel execution
57
+ - Task parent-child relationships handle cleanup automatically
58
+ - More idiomatic for concurrent Ruby code
59
+ - Support for graceful termination
60
+
61
+ **Cons**:
62
+ - Adds an external dependency
63
+ - Different mental model than our current implementation
64
+ - May require refactoring of existing task code
65
+ - Learning curve for developers unfamiliar with Async
66
+
67
+ ## Decision
68
+
69
+ We will adopt the **Async Gem-Based Orchestrator** approach (Option 2) for our PlanOrchestrator implementation. While the Observer pattern has served us well for simple task state notification, the Async gem provides a more comprehensive solution for managing concurrent task execution with better control over resource usage, concurrency limits, and error handling.
70
+
71
+ ### Implementation Approach
72
+
73
+ The PlanOrchestrator will leverage Async's capabilities while maintaining our existing task lifecycle and interface. We'll implement it as follows:
74
+
75
+ ```ruby
76
+ module Agentic
77
+   class PlanOrchestrator
78
+     attr_reader :plan_id, :tasks, :execution_state, :results
79
+
80
+     def initialize(plan_id: SecureRandom.uuid, concurrency_limit: 10)
81
+       @plan_id = plan_id
82
+       @tasks = {}
83
+       @dependencies = {}
84
+       @results = {}
85
+       @execution_state = {
86
+         pending: Set.new,
87
+         in_progress: Set.new,
88
+         completed: Set.new,
89
+         failed: Set.new
90
+       }
91
+       @concurrency_limit = concurrency_limit
92
+     end
93
+
94
+     def add_task(task, dependencies = [])
95
+       task_id = task.id
96
+       @tasks[task_id] = task
97
+       @dependencies[task_id] = Array(dependencies)
98
+       @execution_state[:pending].add(task_id)
99
+     end
100
+
101
+     def execute_plan(agent_provider)
102
+       Async do |reactor|
103
+         barrier = Async::Barrier.new
104
+         semaphore = Async::Semaphore.new(@concurrency_limit, parent: barrier)
105
+
106
+         # Start with tasks that have no dependencies
107
+         eligible_tasks = find_eligible_tasks
108
+
109
+         # Initial execution of eligible tasks
110
+         eligible_tasks.each do |task_id|
111
+           schedule_task(task_id, agent_provider, semaphore, barrier)
112
+         end
113
+
114
+         # Wait for all tasks to complete
115
+         barrier.wait
116
+       end.wait # Block until the plan finishes, even when called inside a running reactor
117
+
118
+       # Return execution results
119
+       {
120
+         plan_id: @plan_id,
121
+         status: overall_status,
122
+         tasks: @tasks.transform_values(&:to_h),
123
+         results: @results
124
+       }
125
+     end
126
+
127
+     private
128
+
129
+     def schedule_task(task_id, agent_provider, semaphore, barrier)
130
+       return unless @execution_state[:pending].include?(task_id)
131
+
132
+       # Move to in_progress state
133
+       @execution_state[:pending].delete(task_id)
134
+       @execution_state[:in_progress].add(task_id)
135
+       task = @tasks[task_id]
136
+
137
+       # Schedule task execution with the semaphore
138
+       semaphore.async do
139
+         begin
140
+           agent = agent_provider.get_agent_for_task(task)
141
+           result = task.perform(agent)
142
+
143
+           # Record result and update state
144
+           if result.successful?
145
+             @execution_state[:in_progress].delete(task_id)
146
+             @execution_state[:completed].add(task_id)
147
+             @results[task_id] = {
148
+               status: :completed,
149
+               output: result.output
150
+             }
151
+
152
+             # Find and schedule dependent tasks
153
+             schedule_dependent_tasks(task_id, agent_provider, semaphore, barrier)
154
+           else
155
+             @execution_state[:in_progress].delete(task_id)
156
+             @execution_state[:failed].add(task_id)
157
+             @results[task_id] = {
158
+               status: :failed,
159
+               failure: result.failure&.to_h
160
+             }
161
+
162
+             # Handle failure based on policy
163
+             handle_task_failure(task, result.failure, agent_provider, semaphore, barrier)
164
+           end
165
+         rescue => e
166
+           # Handle unexpected errors
167
+           @execution_state[:in_progress].delete(task_id)
168
+           @execution_state[:failed].add(task_id)
169
+           @results[task_id] = {
170
+             status: :failed,
171
+             failure: TaskFailure.from_exception(e).to_h
172
+           }
173
+
174
+           Agentic.logger.error("Unexpected error in task #{task_id}: #{e.message}")
175
+         end
176
+       end
177
+     end
178
+
179
+     def schedule_dependent_tasks(completed_task_id, agent_provider, semaphore, barrier)
180
+       # Find tasks that depend on the completed task
181
+       dependent_tasks = @dependencies.select do |task_id, deps|
182
+         deps.include?(completed_task_id) && @execution_state[:pending].include?(task_id)
183
+       end.keys
184
+
185
+       # For each dependent task, check if all dependencies are satisfied
186
+       dependent_tasks.each do |task_id|
187
+         deps = @dependencies[task_id]
188
+         all_deps_satisfied = deps.all? do |dep_id|
189
+           @execution_state[:completed].include?(dep_id)
190
+         end
191
+
192
+         if all_deps_satisfied
193
+           schedule_task(task_id, agent_provider, semaphore, barrier)
194
+         end
195
+       end
196
+     end
197
+
198
+     def handle_task_failure(task, failure, agent_provider, semaphore, barrier)
199
+       # Implement different strategies based on failure type
200
+       case failure.type
201
+       when "TimeoutError"
202
+         # Maybe retry with extended timeout
203
+         Agentic.logger.info("Task #{task.id} failed with timeout, retrying...")
204
+         retry_task(task, agent_provider, semaphore, barrier)
205
+       when "AuthenticationError"
206
+         # Maybe request new credentials
207
+         Agentic.logger.warn("Task #{task.id} failed with authentication error, intervention required")
208
+         request_human_intervention(task, failure)
209
+       else
210
+         # Apply general failure policy
211
+         Agentic.logger.error("Task #{task.id} failed: #{failure.message}")
212
+       end
213
+     end
214
+
215
+     def retry_task(task, agent_provider, semaphore, barrier, max_retries = 3)
216
+       # Check if the task can be retried
217
+       return unless task.status == :failed
218
+       return if task.retry_count && task.retry_count >= max_retries
219
+
220
+       # Increment retry count
221
+       task.retry_count ||= 0
222
+       task.retry_count += 1
223
+
224
+       # Put task back in pending state
225
+       @execution_state[:failed].delete(task.id)
226
+       @execution_state[:pending].add(task.id)
227
+
228
+       # Schedule retrying the task
229
+       schedule_task(task.id, agent_provider, semaphore, barrier)
230
+     end
231
+
232
+     def request_human_intervention(task, failure)
233
+       # This would integrate with the yet-to-be-implemented human intervention system
234
+       Agentic.logger.warn("Human intervention requested for task #{task.id}: #{failure.message}")
235
+     end
236
+
237
+     def find_eligible_tasks
238
+       @dependencies.select do |task_id, deps|
239
+         deps.empty? && @execution_state[:pending].include?(task_id)
240
+       end.keys
241
+     end
242
+
243
+     def overall_status
244
+       if @execution_state[:failed].any?
245
+         :partial_failure
246
+       elsif @execution_state[:pending].empty? && @execution_state[:in_progress].empty?
247
+         :completed
248
+       else
249
+         :in_progress
250
+       end
251
+     end
252
+   end
253
+ end
254
+ ```
255
+
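The dependency-driven scheduling in the orchestrator can be hard to follow through the concurrency plumbing. The sketch below is a single-threaded illustration in plain Ruby, without the Async gem. It assumes every task succeeds, and `execution_waves` is a hypothetical helper that is not part of Agentic. The real orchestrator schedules each dependent as soon as its last dependency finishes rather than in lockstep waves, so this only shows the eligibility rule.

```ruby
# Hypothetical helper (not part of Agentic): groups task IDs into "waves".
# Wave 0 holds tasks with no dependencies; each later wave holds tasks whose
# dependencies all completed in earlier waves. Assumes every task succeeds.
def execution_waves(dependencies)
  pending = dependencies.keys
  completed = []
  waves = []

  until pending.empty?
    ready = pending.select do |id|
      dependencies[id].all? { |dep| completed.include?(dep) }
    end
    # Tasks remain but none is ready: a cycle or an unknown dependency.
    break if ready.empty?

    waves << ready
    completed.concat(ready)
    pending -= ready
  end

  { waves: waves, unscheduled: pending }
end

# Illustrative plan: task ID => IDs it depends on.
plan = {
  "fetch" => [],
  "parse" => ["fetch"],
  "lint" => ["fetch"],
  "report" => ["parse", "lint"]
}
result = execution_waves(plan)
# result[:waves] is [["fetch"], ["parse", "lint"], ["report"]]
```

Tasks left in `unscheduled` correspond to entries that would stay in the `pending` set forever in the orchestrator, for example when two tasks depend on each other.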
256
+ ## Integration with Existing Task System
257
+
258
+ To integrate with the Async-based PlanOrchestrator, we'll need to make the following adjustments to our Task implementation:
259
+
260
+ 1. **Minimal Interface Changes**: We'll maintain the current Task interface while ensuring it's compatible with Async's concurrency model.
261
+
262
+ 2. **Task Result Handling**: We'll continue using our TaskResult approach for communicating execution outcomes.
263
+
264
+ 3. **Observer Pattern Coexistence**: We'll maintain the Observer pattern for task state notification, which can coexist with the Async-based execution model, allowing components that don't need concurrent execution to still observe task state.
265
+
266
+ ## Consequences
267
+
268
+ ### Positive
269
+
270
+ 1. **Better Concurrency**: Fiber-based concurrency offers a more efficient and scalable model for task execution.
271
+ 2. **Resource Management**: Built-in semaphores prevent resource exhaustion by limiting concurrent tasks.
272
+ 3. **Task Lifecycle**: Parent-child relationships in Async tasks handle cleanup and termination automatically.
273
+ 4. **Simplified Orchestration**: The complexity of managing concurrent execution is largely handled by the Async gem.
274
+ 5. **Graceful Termination**: Better support for stopping and cleaning up tasks during termination.
275
+
276
+ ### Negative
277
+
278
+ 1. **New Dependency**: Adds a dependency on the Async gem.
279
+ 2. **Learning Curve**: Team members will need to understand Async's concurrency model.
280
+ 3. **Integration Effort**: Requires careful integration with our existing Observer pattern.
281
+
282
+ ### Neutral
283
+
284
+ 1. **Performance Characteristics**: While expected to be better, actual performance improvements need to be measured.
285
+ 2. **API Evolution**: The Async gem is actively developed, which may introduce API changes over time.
286
+
287
+ ## Implementation Notes
288
+
289
+ 1. **Gradual Transition**:
290
+ - Start by making the PlanOrchestrator Async-based without requiring tasks to change
291
+ - Later, consider deeper integration where tasks themselves leverage Async
292
+
293
+ 2. **Testing Strategy**:
294
+ - Create dedicated tests for the Async-based PlanOrchestrator
295
+ - Ensure existing tests still pass with the new implementation
296
+ - Test concurrency limits and behavior under high load
297
+
298
+ 3. **Monitoring and Metrics**:
299
+ - Add instrumentation for tracking task execution performance
300
+ - Measure and compare against the previous Observer-based approach
301
+
302
+ 4. **Error Handling**:
303
+ - Ensure proper propagation of errors from Async tasks
304
+ - Maintain our existing error context information
305
+
306
+ ## Alternative Paths
307
+
308
+ If the Async approach proves problematic, we can:
309
+
310
+ 1. Revert to our Observer pattern implementation
311
+ 2. Consider other concurrency frameworks like concurrent-ruby
312
+ 3. Implement a hybrid approach that uses Observer pattern for state notification and a simpler execution model
313
+
314
+ ## References
315
+
316
+ - [Socketry Async GitHub Repository](https://github.com/socketry/async)
317
+ - [Async Best Practices](https://socketry.github.io/async/guides/best-practices/index)
318
+ - [Asynchronous Tasks Guide](https://socketry.github.io/async/guides/asynchronous-tasks/index.html)
319
+ - [Observer Pattern Implementation (ADR-001)](adr_001_observer_pattern_implementation.md)
320
+ - [Task Failure Handling Architecture](file:///Users/valentinostoll/src/agentic/.architecture-review/task_failure_handling.md)
@@ -0,0 +1,179 @@
1
+ # ADR 003: Plan Orchestrator Interface Design
2
+
3
+ ## Status
4
+
5
+ Proposed
6
+
7
+ ## Context
8
+
9
+ The `PlanOrchestrator` class in the Agentic framework is responsible for managing the execution of tasks, handling dependencies, and tracking task state throughout the execution lifecycle. During testing, we encountered a tension between proper encapsulation of implementation details and the need for effective testing.
10
+
11
+ Prior to this change, several methods were marked as `private` but were being accessed in tests using `send(:method_name)`. This approach is generally considered a testing anti-pattern as it couples tests to implementation details rather than observable behavior. To address this issue, we needed to decide whether to make these methods public or restructure our testing approach.
12
+
13
+ The methods in question were:
14
+ 1. `all_dependencies_met?` - Checks if dependencies for a task are satisfied
15
+ 2. `find_eligible_tasks` - Identifies tasks eligible for execution
16
+ 3. `overall_status` - Determines the current status of the plan
17
+
18
+ ## Decision Drivers
19
+
20
+ 1. **Encapsulation**: Maintaining a clean separation between public interface and implementation details
21
+ 2. **Testability**: Enabling effective testing without violating encapsulation principles
22
+ 3. **API Design**: Creating a coherent and intuitive public API
23
+ 4. **Future Compatibility**: Ensuring changes don't restrict future refactoring options
24
+ 5. **Code Clarity**: Providing clear boundaries between public and private concerns
25
+
26
+ ## Options Considered
27
+
28
+ ### 1. Make the methods public
29
+
30
+ **Description**:
31
+ - Move the three methods from the private section to the public interface
32
+ - Update tests to use direct calls instead of `send(:method_name)`
33
+
34
+ **Pros**:
35
+ - Immediately solves the testing issue
36
+ - Simple change with minimal code modification
37
+ - No additional dependencies or structures required
38
+
39
+ **Cons**:
40
+ - Exposes implementation details that may not belong in the public API
41
+ - Could lead to inappropriate coupling to these methods by client code
42
+ - May restrict future refactoring by creating contract obligations
43
+ - Violates the principle of minimizing public interfaces
44
+
45
+ ### 2. Create test-specific interfaces or subclasses
46
+
47
+ **Description**:
48
+ - Create testing-specific subclasses that expose private methods for testing
49
+ - Keep production code properly encapsulated
50
+
51
+ **Pros**:
52
+ - Maintains encapsulation in production code
53
+ - Explicitly separates test-specific access from production interfaces
54
+ - Preserves future refactoring flexibility
55
+
56
+ **Cons**:
57
+ - Introduces additional complexity and indirection
58
+ - Requires maintaining test-specific classes
59
+ - May still lead to tests that are coupled to implementation details
60
+
61
+ ### 3. Refactor to introduce proper abstractions
62
+
63
+ **Description**:
64
+ - Identify the underlying architectural concerns represented by these methods
65
+ - Extract these concerns into appropriate abstractions (e.g., a dependency resolver, eligibility provider, status reporter)
66
+ - Make these new abstractions testable components in their own right
67
+
68
+ **Pros**:
69
+ - Results in better separation of concerns and cohesion
70
+ - Creates properly designed abstractions rather than exposing implementation details
71
+ - Improves overall system architecture
72
+ - Provides truly unit-testable components
73
+
74
+ **Cons**:
75
+ - Requires significant refactoring
76
+ - More time-consuming implementation
77
+ - May require changes to multiple components and tests
78
+
79
+ ### 4. Use alternative testing approaches
80
+
81
+ **Description**:
82
+ - Instead of testing these methods directly, test their observable effects
83
+ - Focus on behavior verification rather than state verification
84
+ - Use integration tests rather than unit tests for orchestration logic
85
+
86
+ **Pros**:
87
+ - Avoids coupling tests to implementation details
88
+ - Tests what matters: the observable behavior
89
+ - More resilient to refactoring
90
+
91
+ **Cons**:
92
+ - May require more complex test setups
93
+ - Could be more difficult to diagnose test failures
94
+ - Might not provide sufficient test coverage for complex logic
95
+
96
+ ## Decision
97
+
98
+ For immediate pragmatic reasons, we have chosen **Option 1: Make the methods public**. However, we acknowledge that this is a compromise that introduces architectural debt, and we should plan to implement **Option 3: Refactor to introduce proper abstractions** in the future.
99
+
100
+ This decision takes into account the immediate need to fix the testing approach while balancing architectural concerns. By making these methods public now, we allow tests to function correctly without using `send(:method_name)`, but we recognize the need for a better long-term solution.
101
+
102
+ ## Consequences
103
+
104
+ ### Positive
105
+
106
+ 1. **Improved Testability**: Tests no longer need to use `send(:method_name)`, making them more straightforward and less brittle.
107
+ 2. **Explicit Contract**: The contract of these methods is now explicitly part of the public interface, providing clarity on their expected behavior.
108
+ 3. **Documentation Visibility**: The methods now have yard-doc comments visible in the public API documentation, making their purpose clear.
109
+
110
+ ### Negative
111
+
112
+ 1. **Expanded Public Interface**: The public API surface is now larger, potentially making the class harder to understand and use correctly.
113
+ 2. **Exposed Implementation Details**: Internal orchestration concepts are now exposed, potentially creating inappropriate dependencies.
114
+ 3. **Future Constraints**: These methods must now be maintained as part of the public contract, limiting future refactoring options.
115
+ 4. **Design Tension**: The current design violates the principle of minimal public interfaces and proper encapsulation.
116
+
117
+ ### Neutral
118
+
119
+ 1. **Method Semantics**: The methods themselves are well-named and their behavior is clear, so even as public methods they are unlikely to cause confusion.
120
+ 2. **Documentation**: The methods already had good documentation, so making them public required no additional documentation effort.
121
+
122
+ ## Implementation Notes
123
+
124
+ 1. These methods were moved from the private section to the public section:
125
+
126
+ ```ruby
127
+ # Checks if all dependencies for a task are met
128
+ # @param task_id [String] ID of the task to check
129
+ # @return [Boolean] True if all dependencies are met, false otherwise
130
+ def all_dependencies_met?(task_id)
131
+   deps = @dependencies[task_id] || []
132
+   deps.all? do |dep_id|
133
+     @execution_state[:completed].include?(dep_id)
134
+   end
135
+ end
136
+
137
+ # Finds tasks that are eligible for execution (have no dependencies)
138
+ # @return [Array<String>] IDs of eligible tasks
139
+ def find_eligible_tasks
140
+   @dependencies.select do |task_id, deps|
141
+     deps.empty? && @execution_state[:pending].include?(task_id)
142
+   end.keys
143
+ end
144
+
145
+ # Determines the overall status of the plan
146
+ # @return [Symbol] The overall status (:completed, :in_progress, or :partial_failure)
147
+ def overall_status
148
+   if @execution_state[:failed].any?
149
+     :partial_failure
150
+   elsif @execution_state[:pending].empty? && @execution_state[:in_progress].empty?
151
+     :completed
152
+   else
153
+     :in_progress
154
+   end
155
+ end
156
+ ```
157
+
158
+ 2. Tests were updated to access these methods directly rather than using `send(:method_name)`.
159
+
160
+ 3. Care was taken to ensure all tests still pass with this change.
161
+
162
+ ## Future Work
163
+
164
+ While this change resolves the immediate issue, several architectural improvements should be considered for future work:
165
+
166
+ 1. **Task Dependency Resolution**: Extract dependency management into a dedicated component that manages the relationships between tasks and determines eligibility.
167
+
168
+ 2. **Plan Status Management**: Create a dedicated component for tracking and reporting on plan status, allowing for more sophisticated status reporting.
169
+
170
+ 3. **Execution State Management**: Consider extracting the state transition logic into a dedicated state management component.
171
+
172
+ 4. **Test Strategy Review**: Review and potentially revise the testing strategy to focus more on behavior verification rather than state verification.
173
+
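As a rough illustration of the "Plan Status Management" item, the status rules could move into a small object that only reads the execution-state sets. This is a minimal sketch under that assumption: `PlanStatusReporter` is a hypothetical name, not a class in Agentic, and the state hash mirrors the `@execution_state` layout from ADR 002.

```ruby
require "set"

# Hypothetical status reporter extracted from PlanOrchestrator (not part of Agentic).
# It only reads the execution-state sets, so it can be unit tested in isolation.
class PlanStatusReporter
  def initialize(execution_state)
    @state = execution_state
  end

  # Applies the same rules as the overall_status method shown in this ADR.
  def overall_status
    if @state[:failed].any?
      :partial_failure
    elsif @state[:pending].empty? && @state[:in_progress].empty?
      :completed
    else
      :in_progress
    end
  end
end

state = {
  pending: Set.new,
  in_progress: Set["task-2"],
  completed: Set["task-1"],
  failed: Set.new
}
reporter = PlanStatusReporter.new(state)
status_while_running = reporter.overall_status # :in_progress

# Simulate task-2 finishing successfully.
state[:in_progress].delete("task-2")
state[:completed].add("task-2")
status_when_done = reporter.overall_status # :completed
```

With a component like this, tests can cover the status rules directly, and `PlanOrchestrator` would not need to expose `overall_status` purely for testing.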
174
+ ## References
175
+
176
+ - [Tell Don't Ask Principle](https://martinfowler.com/bliki/TellDontAsk.html)
177
+ - [Law of Demeter](https://en.wikipedia.org/wiki/Law_of_Demeter)
178
+ - [Testing Anti-Patterns: Reaching into Private State](https://blog.thecodewhisperer.com/permalink/getting-your-tests-to-tell-you-when-theyre-asking-for-too-much)
179
+ - [ADR 002: Plan Orchestrator Implementation with Async](adr_002_plan_orchestrator.md)
@@ -0,0 +1,147 @@
1
+ # ADR-001: Dependency Management for Tasks
2
+
3
+ ## Status
4
+
5
+ Draft
6
+
7
+ ## Context
8
+
9
+ The architectural review of version 0.2.0 identified that task dependencies are currently handled directly within the PlanOrchestrator without a proper abstraction. This has led to several issues:
10
+
11
+ 1. The PlanOrchestrator has multiple responsibilities (dependency management, execution flow control, result collection)
12
+ 2. Dependency validation and cycle detection are mixed with execution logic
13
+ 3. Testing dependency-related logic requires testing the entire orchestration flow
14
+ 4. Extending dependency capabilities requires modifying the core orchestration code
15
+
16
+ As the system grows, this tight coupling will increasingly limit flexibility and maintainability.
17
+
18
+ ## Decision Drivers
19
+
20
+ * Separation of concerns: Each component should have a single responsibility
21
+ * Testability: Dependency management should be testable in isolation
22
+ * Extensibility: Support for new dependency types and validation rules
23
+ * Maintainability: Reduce complexity in the PlanOrchestrator
24
+ * Performance: Efficient dependency resolution and validation
25
+
26
+ ## Decision
27
+
28
+ Create a dedicated `DependencyGraph` class responsible for managing task dependencies separate from execution orchestration.
29
+
30
+ **Architectural Components Affected:**
31
+ * PlanOrchestrator (modified to delegate dependency management)
32
+ * DependencyGraph (new component)
33
+ * ExecutionPlan (potentially modified to include dependency metadata)
34
+ * Task (potentially modified to expose dependency information)
35
+
36
+ **Interface Changes:**
37
+ * New public DependencyGraph class with methods for:
38
+ - Adding dependencies between tasks
39
+ - Validating the dependency graph (e.g., cycle detection)
40
+ - Computing execution order based on dependencies
41
+ - Determining which tasks are ready to execute
42
+ - Updating graph state when tasks complete
43
+
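The interface listed above could look roughly like the following Ruby sketch. It is illustrative only: the method names (`add_task`, `validate!`, `execution_order`, `ready_tasks`, `complete`) and the `CycleError` class are assumptions made for this example, not the gem's actual API. Ordering uses a simple layered topological sort; Ruby's standard `TSort` module would be a reasonable alternative.

```ruby
require "set"

# Hypothetical sketch of the DependencyGraph described in this ADR.
# All names here are illustrative assumptions, not the gem's real interface.
class DependencyGraph
  class CycleError < StandardError; end

  def initialize
    @dependencies = {} # task id => array of prerequisite task ids
    @completed = Set.new
  end

  # Adds a task and the tasks it depends on; prerequisites become nodes too.
  def add_task(task_id, depends_on: [])
    @dependencies[task_id] = (@dependencies[task_id] || []) | depends_on
    depends_on.each { |dep| @dependencies[dep] ||= [] }
    self
  end

  # Computes an execution order in which prerequisites precede dependents.
  # Raises CycleError when no such order exists.
  def execution_order
    remaining = @dependencies.dup
    order = []
    until remaining.empty?
      # Tasks whose prerequisites have all been placed in the order already.
      ready = remaining.select { |_, deps| deps.all? { |dep| order.include?(dep) } }.keys
      raise CycleError, "dependency cycle among #{remaining.keys.inspect}" if ready.empty?

      order.concat(ready)
      ready.each { |task_id| remaining.delete(task_id) }
    end
    order
  end

  # Validates the graph at plan time (cycle detection).
  def validate!
    execution_order
    true
  end

  # Tasks that are not completed and whose prerequisites are all completed.
  def ready_tasks
    @dependencies.keys.select do |task_id|
      !@completed.include?(task_id) &&
        @dependencies[task_id].all? { |dep| @completed.include?(dep) }
    end
  end

  # Updates graph state when a task completes.
  def complete(task_id)
    @completed << task_id
    self
  end
end

graph = DependencyGraph.new
graph.add_task(:fetch)
graph.add_task(:parse, depends_on: [:fetch])
graph.add_task(:report, depends_on: [:parse])
graph.validate!
graph.execution_order # prerequisites come before the tasks that need them
```

Keeping the graph free of any execution logic is what allows it to be unit tested without running the orchestrator.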
44
+ ## Consequences
45
+
46
+ ### Positive
47
+
48
+ * Clearer separation of concerns with single-responsibility components
49
+ * Improved testability for dependency-related logic
50
+ * Easier extension of dependency types (hard dependencies, soft dependencies, optional dependencies)
51
+ * Reduced complexity in PlanOrchestrator
52
+ * Potential for more sophisticated dependency resolution algorithms
53
+ * Clearer visualization of task dependencies for debugging and monitoring
54
+
55
+ ### Negative
56
+
57
+ * Initial development overhead to extract and refactor the dependency logic
58
+ * Need for careful migration to avoid breaking existing users' code
59
+ * Potential for slight performance overhead from the additional abstraction layer
60
+ * More classes and interfaces for new developers to understand
61
+
62
+ ### Neutral
63
+
64
+ * Possible need for additional configuration options for dependency behavior
65
+ * Shift in responsibility for dependency validation from execution time to plan time
66
+
67
+ ## Implementation
68
+
69
+ **Phase 1: Internal Abstraction**
70
+ * Create the DependencyGraph class with core functionality
71
+ * Modify PlanOrchestrator to use DependencyGraph internally
72
+ * Add comprehensive tests for DependencyGraph
73
+ * Maintain existing public interfaces to ensure backward compatibility
74
+
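Phase 1 can be pictured as an orchestrator that keeps its public methods while handing dependency bookkeeping to an injected graph. The sketch below is a simplified, synchronous illustration under stated assumptions: `SketchOrchestrator`, `SketchDependencyGraph`, and their method names are hypothetical, and the real PlanOrchestrator (which ADR 002 describes as async) may differ substantially.

```ruby
require "set"

# Hypothetical minimal graph: only what the orchestrator sketch needs.
class SketchDependencyGraph
  def initialize
    @dependencies = {} # task id => array of prerequisite task ids
    @completed = Set.new
  end

  def add_task(id, depends_on: [])
    @dependencies[id] = depends_on.dup
  end

  def ready_tasks
    @dependencies.keys.select do |id|
      !@completed.include?(id) &&
        @dependencies[id].all? { |dep| @completed.include?(dep) }
    end
  end

  def complete(id)
    @completed << id
  end
end

# Hypothetical orchestrator: callers still use add_task and execute,
# while all dependency decisions are delegated to the graph.
class SketchOrchestrator
  def initialize(graph: SketchDependencyGraph.new)
    @graph = graph
    @work = {} # task id => block that performs the task
  end

  def add_task(id, depends_on: [], &work)
    @work[id] = work
    @graph.add_task(id, depends_on: depends_on)
    self
  end

  # Runs tasks synchronously in dependency order; returns results by task id.
  def execute
    results = {}
    until (ready = @graph.ready_tasks).empty?
      ready.each do |id|
        results[id] = @work.fetch(id).call(results)
        @graph.complete(id)
      end
    end
    # Anything left over is blocked by a cycle or an unknown prerequisite.
    unfinished = @work.keys - results.keys
    raise "unresolvable dependencies: #{unfinished.inspect}" unless unfinished.empty?

    results
  end
end

orchestrator = SketchOrchestrator.new
orchestrator.add_task(:fetch) { |_results| 2 }
orchestrator.add_task(:double, depends_on: [:fetch]) { |results| results[:fetch] * 2 }
orchestrator.execute # results keyed by task id
```

Because the graph is passed in through the constructor, tests can substitute a stub graph, and a later phase can swap implementations without touching callers.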
75
+ **Phase 2: Public API Extension**
76
+ * Expose DependencyGraph as a public API for advanced use cases
77
+ * Add feature flag to control use of the new implementation
78
+ * Provide documentation and examples for the new API
79
+ * Create migration guide for users of custom orchestration extensions
80
+
81
+ **Phase 3: Enhanced Capabilities**
82
+ * Add support for different dependency types
83
+ * Implement visualization tools for dependency graphs
84
+ * Create utilities for working with complex dependency scenarios
85
+
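The ADR does not define what the different dependency types mean. As one possible reading, the Ruby sketch below assumes that a hard dependency requires its prerequisite to succeed, while a soft dependency only requires it to have finished, whatever the outcome. The class name, method names, and the `:hard`/`:soft` semantics are all assumptions for illustration.

```ruby
# Hypothetical sketch of typed dependencies; the :hard and :soft semantics
# are assumed for this example and are not specified by the ADR.
class TypedDependencyGraph
  Edge = Struct.new(:prerequisite, :type)

  TYPES = %i[hard soft].freeze

  def initialize
    @edges = Hash.new { |hash, id| hash[id] = [] } # task id => incoming edges
    @outcomes = {} # task id => :succeeded or :failed
  end

  # Declares that `task` depends on `on` with the given dependency type.
  def add_dependency(task, on:, type: :hard)
    raise ArgumentError, "unknown dependency type: #{type}" unless TYPES.include?(type)

    @edges[on] # registers the prerequisite as a node
    @edges[task] << Edge.new(on, type)
    self
  end

  # Records how a task finished.
  def record(task, outcome)
    @outcomes[task] = outcome
    self
  end

  # Tasks that have not finished and whose dependencies are all satisfied.
  def ready_tasks
    @edges.select do |task, edges|
      !@outcomes.key?(task) && edges.all? { |edge| satisfied?(edge) }
    end.keys
  end

  private

  # Hard: prerequisite must have succeeded. Soft: it must merely have finished.
  def satisfied?(edge)
    outcome = @outcomes[edge.prerequisite]
    edge.type == :hard ? outcome == :succeeded : !outcome.nil?
  end
end

graph = TypedDependencyGraph.new
graph.add_dependency(:summarize, on: :fetch, type: :hard)
graph.add_dependency(:notify, on: :fetch, type: :soft)
graph.record(:fetch, :failed)
graph.ready_tasks # only the softly dependent task can still run
```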
86
+ ## Alternatives Considered
87
+
88
+ ### Alternative 1: Enhanced PlanOrchestrator without extraction
89
+
90
+ **Pros:**
91
+ * Less initial refactoring work
92
+ * No additional abstraction layer
93
+ * Potentially simpler for basic use cases
94
+
95
+ **Cons:**
96
+ * Continues to mix concerns in a single component
97
+ * Harder to test in isolation
98
+ * Increasingly complex as new features are added
99
+ * Limited extensibility for different dependency types
100
+
101
+ ### Alternative 2: Task-based dependency management
102
+
103
+ **Pros:**
104
+ * More distributed approach with tasks knowing their dependencies
105
+ * Potentially more intuitive for simple use cases
106
+ * Easier local reasoning about individual task dependencies
107
+
108
+ **Cons:**
109
+ * Harder to validate global properties (e.g., cycles in the dependency graph)
110
+ * More complex to determine execution order
111
+ * Limited visibility into the complete dependency structure
112
+ * Potential for redundant dependency checking
113
+
114
+ ### Alternative 3: Event-based dependency resolution
115
+
116
+ **Pros:**
117
+ * Greater decoupling between tasks
118
+ * Support for dynamic dependencies that change during execution
119
+ * Potentially more flexible for complex workflows
120
+
121
+ **Cons:**
122
+ * More complex to reason about and debug
123
+ * Harder to validate before execution
124
+ * Potential performance overhead from event processing
125
+ * Steeper learning curve for users
126
+
127
+ ## Validation
128
+
129
+ **Acceptance Criteria:**
130
+ - [ ] All existing dependency functionality works with the new implementation
131
+ - [ ] PlanOrchestrator delegates all dependency management to DependencyGraph
132
+ - [ ] Cycle detection and validation occur at plan time, before any task executes
133
+ - [ ] Test coverage for DependencyGraph exceeds 90%
134
+ - [ ] No performance regression in standard benchmark tests
135
+ - [ ] Documentation and examples exist for the new API
136
+
137
+ **Testing Approach:**
138
+ * Unit tests for DependencyGraph in isolation
139
+ * Integration tests with PlanOrchestrator
140
+ * Performance benchmarks comparing old and new implementations
141
+ * Edge case testing for complex dependency scenarios
142
+ * Migration tests for existing code
143
+
144
+ ## References
145
+
146
+ * [Architectural Review 0.2.0](../../../.architecture/reviews/0-2-0.md)
147
+ * [Implementation Roadmap](../../../.architecture/recalibration/implementation_roadmap_0-2-0.md)