agentic 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (130)
  1. checksums.yaml +4 -4
  2. data/.agentic.yml +2 -0
  3. data/.architecture/decisions/ArchitecturalFeatureBuilder.md +136 -0
  4. data/.architecture/decisions/ArchitectureConsiderations.md +200 -0
  5. data/.architecture/decisions/adr_001_observer_pattern_implementation.md +196 -0
  6. data/.architecture/decisions/adr_002_plan_orchestrator.md +320 -0
  7. data/.architecture/decisions/adr_003_plan_orchestrator_interface.md +179 -0
  8. data/.architecture/decisions/adrs/ADR-001-dependency-management.md +147 -0
  9. data/.architecture/decisions/adrs/ADR-002-system-boundaries.md +162 -0
  10. data/.architecture/decisions/adrs/ADR-003-content-safety.md +158 -0
  11. data/.architecture/decisions/adrs/ADR-004-agent-permissions.md +161 -0
  12. data/.architecture/decisions/adrs/ADR-005-adaptation-engine.md +127 -0
  13. data/.architecture/decisions/adrs/ADR-006-extension-system.md +273 -0
  14. data/.architecture/decisions/adrs/ADR-007-learning-system.md +156 -0
  15. data/.architecture/decisions/adrs/ADR-008-prompt-generation.md +325 -0
  16. data/.architecture/decisions/adrs/ADR-009-task-failure-handling.md +353 -0
  17. data/.architecture/decisions/adrs/ADR-010-task-input-handling.md +251 -0
  18. data/.architecture/decisions/adrs/ADR-011-task-observable-pattern.md +391 -0
  19. data/.architecture/decisions/adrs/ADR-012-task-output-handling.md +205 -0
  20. data/.architecture/decisions/adrs/ADR-013-architecture-alignment.md +211 -0
  21. data/.architecture/decisions/adrs/ADR-014-agent-capability-registry.md +80 -0
  22. data/.architecture/decisions/adrs/ADR-015-persistent-agent-store.md +100 -0
  23. data/.architecture/decisions/adrs/ADR-016-agent-assembly-engine.md +117 -0
  24. data/.architecture/decisions/adrs/ADR-017-streaming-observability.md +171 -0
  25. data/.architecture/decisions/capability_tools_distinction.md +150 -0
  26. data/.architecture/decisions/cli_command_structure.md +61 -0
  27. data/.architecture/implementation/agent_self_assembly_implementation.md +267 -0
  28. data/.architecture/implementation/agent_self_assembly_summary.md +138 -0
  29. data/.architecture/members.yml +187 -0
  30. data/.architecture/planning/self_implementation_exercise.md +295 -0
  31. data/.architecture/planning/session_compaction_rule.md +43 -0
  32. data/.architecture/planning/streaming_observability_feature.md +223 -0
  33. data/.architecture/principles.md +151 -0
  34. data/.architecture/recalibration/0-2-0.md +92 -0
  35. data/.architecture/recalibration/agent_self_assembly.md +238 -0
  36. data/.architecture/recalibration/cli_command_structure.md +91 -0
  37. data/.architecture/recalibration/implementation_roadmap_0-2-0.md +301 -0
  38. data/.architecture/recalibration/progress_tracking_0-2-0.md +114 -0
  39. data/.architecture/recalibration_process.md +127 -0
  40. data/.architecture/reviews/0-2-0.md +181 -0
  41. data/.architecture/reviews/cli_command_duplication.md +98 -0
  42. data/.architecture/templates/adr.md +105 -0
  43. data/.architecture/templates/implementation_roadmap.md +125 -0
  44. data/.architecture/templates/progress_tracking.md +89 -0
  45. data/.architecture/templates/recalibration_plan.md +70 -0
  46. data/.architecture/templates/version_comparison.md +124 -0
  47. data/.claude/settings.local.json +13 -0
  48. data/.claude-sessions/001-task-class-architecture-implementation.md +129 -0
  49. data/.claude-sessions/002-plan-orchestrator-interface-review.md +105 -0
  50. data/.claude-sessions/architecture-governance-implementation.md +37 -0
  51. data/.claude-sessions/architecture-review-session.md +27 -0
  52. data/ArchitecturalFeatureBuilder.md +136 -0
  53. data/ArchitectureConsiderations.md +229 -0
  54. data/CHANGELOG.md +57 -2
  55. data/CLAUDE.md +111 -0
  56. data/CONTRIBUTING.md +286 -0
  57. data/MAINTAINING.md +301 -0
  58. data/README.md +582 -28
  59. data/docs/agent_capabilities_api.md +259 -0
  60. data/docs/artifact_extension_points.md +757 -0
  61. data/docs/artifact_generation_architecture.md +323 -0
  62. data/docs/artifact_implementation_plan.md +596 -0
  63. data/docs/artifact_integration_points.md +345 -0
  64. data/docs/artifact_verification_strategies.md +581 -0
  65. data/docs/streaming_observability_architecture.md +510 -0
  66. data/exe/agentic +6 -1
  67. data/lefthook.yml +5 -0
  68. data/lib/agentic/adaptation_engine.rb +124 -0
  69. data/lib/agentic/agent.rb +181 -4
  70. data/lib/agentic/agent_assembly_engine.rb +442 -0
  71. data/lib/agentic/agent_capability_registry.rb +260 -0
  72. data/lib/agentic/agent_config.rb +63 -0
  73. data/lib/agentic/agent_specification.rb +46 -0
  74. data/lib/agentic/capabilities/examples.rb +530 -0
  75. data/lib/agentic/capabilities.rb +14 -0
  76. data/lib/agentic/capability_provider.rb +146 -0
  77. data/lib/agentic/capability_specification.rb +118 -0
  78. data/lib/agentic/cli/agent.rb +31 -0
  79. data/lib/agentic/cli/capabilities.rb +191 -0
  80. data/lib/agentic/cli/config.rb +134 -0
  81. data/lib/agentic/cli/execution_observer.rb +796 -0
  82. data/lib/agentic/cli.rb +1068 -0
  83. data/lib/agentic/default_agent_provider.rb +35 -0
  84. data/lib/agentic/errors/llm_error.rb +184 -0
  85. data/lib/agentic/execution_plan.rb +53 -0
  86. data/lib/agentic/execution_result.rb +91 -0
  87. data/lib/agentic/expected_answer_format.rb +46 -0
  88. data/lib/agentic/extension/domain_adapter.rb +109 -0
  89. data/lib/agentic/extension/plugin_manager.rb +163 -0
  90. data/lib/agentic/extension/protocol_handler.rb +116 -0
  91. data/lib/agentic/extension.rb +45 -0
  92. data/lib/agentic/factory_methods.rb +9 -1
  93. data/lib/agentic/generation_stats.rb +61 -0
  94. data/lib/agentic/learning/README.md +84 -0
  95. data/lib/agentic/learning/capability_optimizer.rb +613 -0
  96. data/lib/agentic/learning/execution_history_store.rb +251 -0
  97. data/lib/agentic/learning/pattern_recognizer.rb +500 -0
  98. data/lib/agentic/learning/strategy_optimizer.rb +706 -0
  99. data/lib/agentic/learning.rb +131 -0
  100. data/lib/agentic/llm_assisted_composition_strategy.rb +188 -0
  101. data/lib/agentic/llm_client.rb +215 -15
  102. data/lib/agentic/llm_config.rb +65 -1
  103. data/lib/agentic/llm_response.rb +163 -0
  104. data/lib/agentic/logger.rb +1 -1
  105. data/lib/agentic/observable.rb +51 -0
  106. data/lib/agentic/persistent_agent_store.rb +385 -0
  107. data/lib/agentic/plan_execution_result.rb +129 -0
  108. data/lib/agentic/plan_orchestrator.rb +464 -0
  109. data/lib/agentic/plan_orchestrator_config.rb +57 -0
  110. data/lib/agentic/retry_config.rb +63 -0
  111. data/lib/agentic/retry_handler.rb +125 -0
  112. data/lib/agentic/structured_outputs.rb +1 -1
  113. data/lib/agentic/task.rb +193 -0
  114. data/lib/agentic/task_definition.rb +39 -0
  115. data/lib/agentic/task_execution_result.rb +92 -0
  116. data/lib/agentic/task_failure.rb +66 -0
  117. data/lib/agentic/task_output_schemas.rb +112 -0
  118. data/lib/agentic/task_planner.rb +54 -19
  119. data/lib/agentic/task_result.rb +48 -0
  120. data/lib/agentic/ui.rb +244 -0
  121. data/lib/agentic/verification/critic_framework.rb +116 -0
  122. data/lib/agentic/verification/llm_verification_strategy.rb +60 -0
  123. data/lib/agentic/verification/schema_verification_strategy.rb +47 -0
  124. data/lib/agentic/verification/verification_hub.rb +62 -0
  125. data/lib/agentic/verification/verification_result.rb +50 -0
  126. data/lib/agentic/verification/verification_strategy.rb +26 -0
  127. data/lib/agentic/version.rb +1 -1
  128. data/lib/agentic.rb +74 -2
  129. data/plugins/README.md +41 -0
  130. metadata +245 -6
@@ -0,0 +1,162 @@
+ # ADR-002: Implementation of System Boundaries
+
+ ## Status
+
+ Draft
+
+ ## Context
+
+ The architectural review of version 0.2.0 identified that the boundaries between different subsystems (planning, execution, learning) in the Agentic codebase could be more explicit. Currently, there is direct coupling between these subsystems, which:
+
+ 1. Makes it difficult to understand the interfaces between major components
+ 2. Limits the ability to replace or extend individual subsystems
+ 3. Creates potential for unintended side effects when modifying one subsystem
+ 4. Increases cognitive load for developers working with the codebase
+ 5. Makes testing more challenging as subsystems cannot be easily isolated
+
+ As the codebase continues to grow, these issues will become more pronounced and limit the system's evolvability.
+
+ ## Decision Drivers
+
+ * Modularity: Enable independent development and evolution of subsystems
+ * Comprehensibility: Make system boundaries clear for new developers
+ * Testability: Allow subsystems to be tested in isolation
+ * Extensibility: Support plugging in alternative implementations for subsystems
+ * Maintainability: Reduce coupling between conceptually separate parts of the system
+ * Evolution: Enable subsystems to evolve at different rates
+
+ ## Decision
+
+ Implement explicit boundaries between the major subsystems in Agentic by:
+
+ 1. Defining clear interfaces between the planning, execution, and learning subsystems
+ 2. Introducing anti-corruption layers where needed to maintain domain consistency
+ 3. Using dependency inversion to reduce direct coupling
+ 4. Implementing explicit contracts for cross-subsystem communication
+
+ **Architectural Components Affected:**
+ * TaskPlanner (interface extraction and implementation)
+ * PlanOrchestrator (interface extraction and implementation)
+ * Learning system (interface extraction and implementation)
+ * Verification system (interface extraction and implementation)
+ * Cross-cutting concerns (logging, configuration, error handling)
+
+ **Interface Changes:**
+ * New interfaces for major subsystems:
+ - IPlanningSystem
+ - IExecutionSystem
+ - ILearningSystem
+ - IVerificationSystem
+ * Domain event interfaces for cross-subsystem communication
+ * Factory methods for creating subsystem implementations
+
+ ## Consequences
+
+ ### Positive
+
+ * Clearer system structure with well-defined boundaries
+ * Improved ability to work on subsystems independently
+ * Better testability through proper isolation
+ * Easier to swap implementations for specific subsystems
+ * Reduced coupling between conceptually separate parts
+ * More deliberate cross-subsystem communication
+
+ ### Negative
+
+ * Additional interfaces increase initial complexity
+ * Potential for over-engineering if boundaries are too rigid
+ * Initial development overhead to refactor existing code
+ * Slight performance overhead from additional abstraction layers
+ * Migration challenges for existing code
+
+ ### Neutral
+
+ * Shift in development approach requiring more upfront design
+ * Need for documentation about subsystem interactions
+ * Potential need for adapter implementations during transition
+
+ ## Implementation
+
+ **Phase 1: Interface Definition**
+ * Define interfaces for all major subsystems
+ * Document interaction patterns between subsystems
+ * Create domain event system for cross-boundary communication
+ * Add initial validation tests for interfaces
+
+ **Phase 2: Implementation Refactoring**
+ * Refactor existing implementations to follow the new interfaces
+ * Create anti-corruption layers where needed
+ * Update factories to work with interfaces instead of concrete classes
+ * Keep backward compatibility through adapter patterns
+
+ **Phase 3: Boundary Enforcement**
+ * Add static analysis tools to enforce architectural boundaries
+ * Create visualizations of subsystem interactions
+ * Implement metrics for measuring coupling between subsystems
+ * Update documentation with architectural diagrams and guidelines
+
+ ## Alternatives Considered
+
+ ### Alternative 1: Looser boundaries with documentation only
+
+ **Pros:**
+ * Less initial refactoring work
+ * More flexibility for cross-subsystem optimizations
+ * Less ceremony for developers working across boundaries
+
+ **Cons:**
+ * Relies on discipline rather than structure
+ * Boundaries may erode over time
+ * Harder to maintain as the system grows
+ * Limited enforcement of architectural intentions
+
+ ### Alternative 2: Microservice-like boundaries with separate packages
+
+ **Pros:**
+ * Strongest enforcement of boundaries
+ * Maximum independence for subsystem teams
+ * Clearest separation of concerns
+ * Forces explicit API design
+
+ **Cons:**
+ * Excessive overhead for a library
+ * Potential performance impact from stricter isolation
+ * More complex deployment and integration
+ * May feel over-engineered for the current scale
+
+ ### Alternative 3: Boundary enforcement through aspect-oriented programming
+
+ **Pros:**
+ * Could maintain boundaries without extensive refactoring
+ * Potentially more flexible boundary definitions
+ * Less invasive to existing code
+
+ **Cons:**
+ * Adds complexity through AOP mechanisms
+ * Less explicit in the code itself
+ * Potentially harder to understand for new developers
+ * Limited tools in Ruby for this approach
+
+ ## Validation
+
+ **Acceptance Criteria:**
+ - [ ] All subsystem interactions occur through defined interfaces
+ - [ ] No direct dependencies between implementation classes across subsystems
+ - [ ] Each subsystem can be tested in isolation with mocked dependencies
+ - [ ] Static analysis tools can verify boundary compliance
+ - [ ] Performance overhead is acceptable (<5% in benchmark tests)
+ - [ ] Documentation clearly explains subsystem boundaries and interactions
+
+ **Testing Approach:**
+ * Unit tests for individual subsystems with mocked dependencies
+ * Integration tests for subsystem interactions
+ * Static analysis to verify boundary compliance
+ * Performance benchmarks comparing before and after implementations
+ * Documentation review by developers not involved in the implementation
+
+ ## References
+
+ * [Architectural Review 0.2.0](../../../.architecture/reviews/0-2-0.md)
+ * [Implementation Roadmap](../../../.architecture/recalibration/implementation_roadmap_0-2-0.md)
+ * [Domain-Driven Design concepts](https://www.martinfowler.com/bliki/BoundedContext.html)
+ * [Clean Architecture principles](https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html)
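
A minimal Ruby sketch of the dependency-inversion approach ADR-002 describes. The `Agentic::Boundaries` namespace, the contract modules, and the factory below are illustrative assumptions, not code shipped in 0.2.0:

```ruby
module Agentic
  module Boundaries
    # Contract expected from any planning subsystem implementation.
    module PlanningSystem
      def build_plan(goal)
        raise NotImplementedError, "#{self.class} must implement #build_plan"
      end
    end

    # Contract expected from any execution subsystem implementation.
    module ExecutionSystem
      def execute(plan)
        raise NotImplementedError, "#{self.class} must implement #execute"
      end
    end

    # Simple domain event record for cross-subsystem communication.
    DomainEvent = Struct.new(:name, :payload, :emitted_at, keyword_init: true)

    # Factory that wires concrete subsystems behind the contracts, so that
    # callers depend on the boundary modules rather than concrete classes.
    class SubsystemFactory
      attr_reader :planning, :execution

      def initialize(planning:, execution:)
        { planning => PlanningSystem, execution => ExecutionSystem }.each do |impl, contract|
          raise ArgumentError, "#{impl.class} must include #{contract}" unless impl.is_a?(contract)
        end
        @planning = planning
        @execution = execution
      end
    end
  end
end
```

Under this scheme, a concrete TaskPlanner would include `Agentic::Boundaries::PlanningSystem` and override `build_plan`, letting PlanOrchestrator depend only on the contract rather than on the concrete class.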
@@ -0,0 +1,158 @@
+ # ADR-003: Content Safety Filtering Approach
+
+ ## Status
+
+ Draft
+
+ ## Context
+
+ The architectural review of version 0.2.0 identified that the Agentic gem currently lacks protection against harmful or inappropriate content being generated or requested. As AI agents can potentially generate or be prompted with unsafe content, this poses several risks:
+
+ 1. Safety risks from generating harmful instructions or content
+ 2. Reputational risks for users of the library
+ 3. Potential violations of API provider terms of service
+ 4. Lack of controls to prevent misuse
+ 5. Inconsistent handling of unsafe content across the system
+
+ As AI capabilities continue to advance, content safety becomes increasingly important for any production AI system.
+
+ ## Decision Drivers
+
+ * Safety: Prevent generation of harmful content and instructions
+ * Compliance: Ensure compliance with API provider policies
+ * Configurability: Allow users to adapt filtering to their specific needs
+ * Performance: Minimize impact on system performance
+ * Transparency: Make filtering decisions transparent to users
+ * Consistency: Apply safety measures consistently across the system
+
+ ## Decision
+
+ Implement a comprehensive content safety filtering system with:
+
+ 1. Input filtering before sending to LLMs
+ 2. Output filtering after receiving LLM responses
+ 3. Configurable filtering levels and rules
+ 4. Transparent logging of filtering decisions
+ 5. Override mechanisms for trusted contexts
+
+ **Architectural Components Affected:**
+ * LlmClient (modified to apply filtering)
+ * ContentSafetyFilter (new component)
+ * Agent (modified to apply filtering to instructions)
+ * Configuration (extended to include safety settings)
+
+ **Interface Changes:**
+ * New ContentSafetyFilter class with methods for:
+ - Checking input safety
+ - Checking output safety
+ - Configuring filtering rules
+ - Logging filtering decisions
+ * Configuration extensions for safety settings
+
+ ## Consequences
+
+ ### Positive
+
+ * Reduced risk of generating or processing harmful content
+ * Better compliance with API provider policies
+ * More control for users over content safety
+ * Consistent handling of content safety across the system
+ * Transparent safety decisions with appropriate logging
+
+ ### Negative
+
+ * Additional processing overhead for all LLM interactions
+ * Potential for false positives blocking legitimate content
+ * Complexity of handling edge cases in content filtering
+ * Need for regular updates to filtering rules as threats evolve
+
+ ### Neutral
+
+ * Shift in responsibility for content safety to the library
+ * Need for documentation about safety capabilities and limitations
+ * Potential need for domain-specific customizations
+
+ ## Implementation
+
+ **Phase 1: Basic Filtering**
+ * Create ContentSafetyFilter class with basic pattern-based filtering
+ * Integrate with LlmClient for input and output filtering
+ * Add configuration options for enabling/disabling filtering
+ * Implement logging for filtering decisions
+
+ **Phase 2: Enhanced Filtering**
+ * Add support for different filtering levels (minimal, standard, strict)
+ * Implement domain-specific filtering rules
+ * Create override mechanisms for trusted contexts
+ * Add detection for more subtle safety issues
+
+ **Phase 3: Advanced Capabilities**
+ * Implement embeddings-based filtering for semantic safety issues
+ * Add support for custom filtering rules
+ * Create tools for analyzing and improving filtering accuracy
+ * Implement content sanitization (as opposed to just blocking)
+
+ ## Alternatives Considered
+
+ ### Alternative 1: Rely on API provider safety measures
+
+ **Pros:**
+ * Less development effort
+ * No performance overhead in our library
+ * Leverage specialized expertise of API providers
+
+ **Cons:**
+ * Inconsistent handling across different providers
+ * Limited control over filtering behavior
+ * No protection for inputs before they reach API providers
+ * Potential compliance gaps with some providers
+
+ ### Alternative 2: Third-party content moderation service
+
+ **Pros:**
+ * Leverage specialized moderation expertise
+ * Regular updates to detection capabilities
+ * Potentially higher accuracy than internal solution
+
+ **Cons:**
+ * External dependency for critical functionality
+ * Additional latency from API calls
+ * Potential cost implications
+ * Privacy concerns with sending data to third parties
+
+ ### Alternative 3: Client-side responsibility only
+
+ **Pros:**
+ * Simplicity in library implementation
+ * No performance overhead within the library
+ * Maximum flexibility for library users
+
+ **Cons:**
+ * Inconsistent safety measures across implementations
+ * Higher burden on library users
+ * No protection by default
+ * Potential reputation risks if misused
+
+ ## Validation
+
+ **Acceptance Criteria:**
+ - [ ] Content filtering can detect common categories of unsafe content
+ - [ ] False positive rate is below acceptable threshold (target: <5%)
+ - [ ] Performance impact is acceptable (target: <50ms per interaction)
+ - [ ] Filtering can be configured at different levels
+ - [ ] Override mechanisms work correctly for trusted contexts
+ - [ ] Filtering decisions are properly logged
+
+ **Testing Approach:**
+ * Unit tests with various input patterns including edge cases
+ * Performance benchmarks for filtering overhead
+ * Integration tests with LlmClient
+ * Validation with synthetic (safe) examples of problematic patterns
+ * User testing of configuration options
+
+ ## References
+
+ * [Architectural Review 0.2.0](../../../.architecture/reviews/0-2-0.md)
+ * [Implementation Roadmap](../../../.architecture/recalibration/implementation_roadmap_0-2-0.md)
+ * [OpenAI Moderation API](https://platform.openai.com/docs/guides/moderation)
+ * [AI content safety best practices](https://www.responsible.ai/resources/content-safety)
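
A hedged sketch of the Phase 1 pattern-based filtering described in ADR-003. The class name follows the ADR, but the level names, rule format, and return type are assumptions, and the placeholder regexes merely stand in for real rule sets:

```ruby
require "logger"

module Agentic
  # Minimal pattern-based input/output filter with configurable levels
  # and transparent logging of each decision.
  class ContentSafetyFilter
    Result = Struct.new(:safe, :matched_rules, keyword_init: true)

    LEVELS = {
      minimal:  [/\bplaceholder_banned_term\b/i],
      standard: [/\bplaceholder_banned_term\b/i, /\bplaceholder_risky_pattern\b/i],
      strict:   [/\bplaceholder_banned_term\b/i, /\bplaceholder_risky_pattern\b/i, /\bplaceholder_sensitive_term\b/i]
    }.freeze

    def initialize(level: :standard, logger: Logger.new($stdout))
      @rules = LEVELS.fetch(level)
      @logger = logger
    end

    # Check text before it is sent to an LLM.
    def check_input(text)
      evaluate(text, direction: :input)
    end

    # Check text returned by an LLM.
    def check_output(text)
      evaluate(text, direction: :output)
    end

    private

    # Apply every rule for the configured level and log the decision.
    def evaluate(text, direction:)
      matched = @rules.select { |rule| rule.match?(text) }
      verdict = matched.empty? ? "pass" : "block"
      @logger.info("content_safety #{direction}=#{verdict} matched=#{matched.size}")
      Result.new(safe: matched.empty?, matched_rules: matched)
    end
  end
end

# Example: Agentic::ContentSafetyFilter.new(level: :strict)
#            .check_input("draft a weekly status email").safe  #=> true
```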
@@ -0,0 +1,161 @@
+ # ADR-004: Agent Permission Model
+
+ ## Status
+
+ Draft
+
+ ## Context
+
+ The architectural review of version 0.2.0 identified that the Agentic gem lacks a clear mechanism to restrict what capabilities agents have access to. This presents several challenges:
+
+ 1. No granular control over agent actions, creating potential security risks
+ 2. Inability to enforce least-privilege principles for agents
+ 3. Difficulty implementing role-based agent systems with proper security boundaries
+ 4. Limited auditing capabilities for agent actions and permissions
+ 5. No formal way to express agent capability requirements
+
+ As AI agents become more powerful and deployed in more sensitive contexts, controlling their capabilities becomes increasingly important for security and compliance.
+
+ ## Decision Drivers
+
+ * Security: Implement principle of least privilege for agents
+ * Flexibility: Support diverse permission models for different use cases
+ * Usability: Make permission system intuitive for developers
+ * Auditability: Enable tracking of permissions and access attempts
+ * Performance: Minimize overhead from permission checks
+ * Integration: Work seamlessly with existing agent models
+
+ ## Decision
+
+ Implement a comprehensive agent permission system with:
+
+ 1. Explicit permission definitions with granular capabilities
+ 2. Permission registry for centralized management
+ 3. Capability checking at agent execution time
+ 4. Permission inheritance and composition
+ 5. Audit logging for permission checks and violations
+
+ **Architectural Components Affected:**
+ * Agent (modified to include and check permissions)
+ * Permission (new component)
+ * PermissionRegistry (new component)
+ * AgentSpecification (modified to include permission requirements)
+ * Task (modified to specify required permissions)
+
+ **Interface Changes:**
+ * New Permission class to represent individual capabilities
+ * PermissionRegistry for managing system permissions
+ * Agent interface extensions for capability checking:
+ - `can?(capability_name)` method
+ - `require_capability(capability_name)` method
+ * Configuration options for permission management
+
+ ## Consequences
+
+ ### Positive
+
+ * Improved security through controlled agent capabilities
+ * Support for role-based agent authorization
+ * Better auditability of agent actions and permissions
+ * Clear expression of capability requirements
+ * Foundation for more complex security models
+
+ ### Negative
+
+ * Additional complexity in agent configuration
+ * Potential friction in development if permissions are too restrictive
+ * Performance overhead from permission checking
+ * Migration challenges for existing agent implementations
+
+ ### Neutral
+
+ * Shift toward more explicit capability management
+ * Need for documentation about permission models
+ * Potential need for helper methods to simplify common patterns
+
+ ## Implementation
+
+ **Phase 1: Core Permission Model**
+ * Create Permission class for representing capabilities
+ * Implement PermissionRegistry for centralized management
+ * Extend Agent to support permission checking
+ * Add basic audit logging for permission decisions
+
+ **Phase 2: Enhanced Permission Features**
+ * Implement permission inheritance and composition
+ * Create permission sets for common agent roles
+ * Add configuration options for permission management
+ * Enhance audit logging with more context
+
+ **Phase 3: Advanced Security Model**
+ * Implement context-sensitive permissions
+ * Add dynamic permission granting/revocation
+ * Create tools for analyzing permission usage
+ * Implement permission-based sandbox execution
+
+ ## Alternatives Considered
+
+ ### Alternative 1: Capability-based security model
+
+ **Pros:**
+ * More object-oriented approach with capabilities as objects
+ * Can be more secure with proper unforgeable capabilities
+ * More flexible composition of capabilities
+
+ **Cons:**
+ * More complex implementation
+ * Less familiar to most developers
+ * Potentially higher performance overhead
+ * More challenging to audit centrally
+
+ ### Alternative 2: Role-based access control only
+
+ **Pros:**
+ * Simpler implementation
+ * More familiar to developers from other systems
+ * Easier to reason about at a high level
+ * Potentially lower overhead
+
+ **Cons:**
+ * Less granular control than capability-based approach
+ * More rigid permission structure
+ * Harder to implement dynamic permissions
+ * Less aligned with agent-oriented design
+
+ ### Alternative 3: Attribute-based access control
+
+ **Pros:**
+ * More flexible for complex permission scenarios
+ * Better support for context-sensitive permissions
+ * More expressive permission model
+
+ **Cons:**
+ * Significantly more complex to implement
+ * Higher performance overhead
+ * Steeper learning curve for users
+ * More difficult to reason about permissions
+
+ ## Validation
+
+ **Acceptance Criteria:**
+ - [ ] Agents can be restricted to specific capabilities
+ - [ ] Permission checks prevent unauthorized actions
+ - [ ] Permissions can be composed and inherited
+ - [ ] Permission checks have acceptable performance overhead (<1ms)
+ - [ ] Audit logs capture all permission decisions
+ - [ ] Permission model integrates with existing agent concepts
+
+ **Testing Approach:**
+ * Unit tests for permission checks under various scenarios
+ * Performance benchmarks for permission overhead
+ * Integration tests with agent execution
+ * Security-focused tests to verify proper enforcement
+ * User testing of permission configuration
+
+ ## References
+
+ * [Architectural Review 0.2.0](../../../.architecture/reviews/0-2-0.md)
+ * [Implementation Roadmap](../../../.architecture/recalibration/implementation_roadmap_0-2-0.md)
+ * [Principle of Least Privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege)
+ * [Capability-based security](https://en.wikipedia.org/wiki/Capability-based_security)
+ * [Role-based access control](https://en.wikipedia.org/wiki/Role-based_access_control)
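
A minimal sketch of the permission model from ADR-004, using the `can?` and `require_capability` methods listed under Interface Changes; the mixin, the error class, and the audit-log shape are assumptions:

```ruby
require "set"

module Agentic
  PermissionDenied = Class.new(StandardError)

  # A named capability an agent may hold.
  Permission = Struct.new(:name, :description, keyword_init: true)

  # Central registry of the permissions known to the system.
  class PermissionRegistry
    def initialize
      @permissions = {}
    end

    def register(name, description: "")
      @permissions[name.to_sym] = Permission.new(name: name.to_sym, description: description)
    end

    def fetch(name)
      @permissions.fetch(name.to_sym) { raise ArgumentError, "unknown permission: #{name}" }
    end
  end

  # Mixin giving an agent `can?` / `require_capability` plus a simple audit trail.
  module PermissionChecks
    def grant(permission_name)
      granted_permissions << permission_name.to_sym
    end

    def can?(capability_name)
      allowed = granted_permissions.include?(capability_name.to_sym)
      audit_log << { capability: capability_name.to_sym, allowed: allowed, at: Time.now }
      allowed
    end

    def require_capability(capability_name)
      raise PermissionDenied, "agent lacks capability: #{capability_name}" unless can?(capability_name)
      true
    end

    def granted_permissions
      @granted_permissions ||= Set.new
    end

    def audit_log
      @audit_log ||= []
    end
  end
end
```

An agent class could include `Agentic::PermissionChecks` and call `require_capability(:file_write)` (with `:file_write` as a purely illustrative permission name) before performing a restricted action.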
@@ -0,0 +1,127 @@
+ # Adaptation Engine Design
+
+ ## Purpose and Scope
+
+ The Adaptation Engine component is a core part of the Verification Layer, responsible for implementing feedback-driven adjustments to improve agent and task performance over time. It analyzes task outcomes, verification results, and human feedback to suggest or automatically apply adaptations to various system components.
+
+ ## Design Principles
+
+ 1. **Feedback-driven**: All adaptations are based on explicit feedback from execution, verification, or human input
+ 2. **Component-oriented**: Each system component (agents, tasks, prompts) can have specific adaptation strategies
+ 3. **Progressive autonomy**: Supports both manual review of suggestions and automatic application of adaptations
+ 4. **Historical awareness**: Maintains history of adaptations for analysis and continuous improvement
+ 5. **Threshold-based actions**: Uses confidence scores to determine when adaptation is needed
+
+ ## Architecture
+
+ ### Class Structure
+
+ The `AdaptationEngine` is designed as a registry of adaptation strategies that can be applied to different components based on feedback:
+
+ ```ruby
+ module Agentic
+   class AdaptationEngine
+     def initialize(options = {})
+       # Configuration settings
+       # Adaptation registry
+       # Feedback history
+     end
+
+     def register_adaptation_strategy(component, strategy)
+       # Register a callable strategy for a component
+     end
+
+     def process_feedback(feedback)
+       # Process feedback and determine if adaptation is needed
+     end
+
+     def apply_adaptation(feedback)
+       # Apply registered strategy to adapt the component
+     end
+
+     def adaptation_history(component = nil)
+       # Retrieve adaptation history
+     end
+   end
+ end
+ ```
+
+ ### Interfaces
+
+ #### Feedback Format
+
+ Feedback is structured as a hash containing:
+ - `:component`: Symbol identifying the component (e.g., `:agent`, `:task`, `:prompt`)
+ - `:target`: The instance to adapt
+ - `:metrics`: Performance metrics (including `:confidence` score)
+ - `:outcome`: Success/failure indicator
+ - `:suggestion`: Optional suggested improvement
+
+ #### Adaptation Strategy Interface
+
+ Adaptation strategies are implemented as callables (Procs or lambdas) that:
+ 1. Accept a feedback hash
+ 2. Perform adaptation on the target
+ 3. Return a result hash with adaptation details
+
+ ### Integration Points
+
+ 1. **Verification Hub**: Provides feedback based on verification results
+ 2. **Task Execution**: Reports outcomes for adaptation consideration
+ 3. **Human Interface**: Allows manual feedback to drive adaptation
+ 4. **Learning System**: Provides pattern-based suggestions for adaptations
+
+ ## Key Behaviors
+
+ ### Adaptation Threshold
+
+ The engine uses a configurable threshold to determine when adaptation is needed:
+ - Confidence scores below threshold trigger adaptation consideration
+ - Threshold can be adjusted based on domain requirements and risk tolerance
+
+ ### Auto-Adaptation
+
+ Two operating modes are supported:
+ 1. **Manual review**: Adaptations are suggested but require confirmation
+ 2. **Automatic application**: Adaptations are applied immediately when needed
+
+ ### Adaptation Registry
+
+ Components register specific adaptation strategies:
+ - Different strategies for different component types
+ - Strategy registration at runtime allows for extensibility
+ - Domain-specific strategies can be registered as needed
+
+ ### History Tracking
+
+ All feedback and adaptations are tracked:
+ - Provides audit trail of system improvements
+ - Enables analysis of adaptation effectiveness
+ - Supports learning for future adaptation strategies
+
+ ## Implementation Considerations
+
+ 1. **Error Handling**: Adaptations can introduce regressions, so robust error handling is essential
+ 2. **Persistence**: Consider whether adaptation history should be persisted across sessions
+ 3. **Metrics**: Define standard metrics for measuring adaptation effectiveness
+ 4. **Strategy Composition**: Allow complex adaptations through composition of simpler strategies
+ 5. **Validation**: Ensure adaptations maintain system consistency and don't violate constraints
+
+ ## Future Extensions
+
+ 1. **Adaptation Chains**: Support sequences of adaptations with dependencies
+ 2. **Meta-Adaptation**: Adapt the adaptation strategies themselves based on effectiveness
+ 3. **A/B Testing**: Compare different adaptation strategies for effectiveness
+ 4. **Domain-Specific Adapters**: Create specialized adaptation libraries for different domains
+ 5. **Collaborative Adaptation**: Allow multiple agents to contribute to adaptation decisions
+
+ ## Security and Safety
+
+ 1. **Adaptation Limits**: Set boundaries on what can be changed through adaptation
+ 2. **Rollback Capability**: Ability to revert problematic adaptations
+ 3. **Approval Workflows**: Multi-stage approval for critical adaptations
+ 4. **Isolation**: Ensure adaptations can't compromise system integrity
+
+ ## Conclusion
+
+ The Adaptation Engine provides a flexible, extensible mechanism for improving system performance through feedback-driven adjustments. By applying targeted adaptations based on execution outcomes, verification results, and human feedback, the system can continuously improve its effectiveness in achieving user goals.
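
A hedged usage sketch of the `AdaptationEngine` interface outlined above, using the feedback hash shape from the Feedback Format section; the constructor options (`:threshold`, `:auto_apply`) and the strategy's return hash are assumptions:

```ruby
# Assumes an AdaptationEngine implemented along the lines of the skeleton
# above; the constructor options and the strategy's result hash are
# illustrative assumptions.
engine = Agentic::AdaptationEngine.new(threshold: 0.7, auto_apply: false)

# Register a callable strategy for the :prompt component. It receives the
# feedback hash and returns a hash describing the adaptation it proposes.
engine.register_adaptation_strategy(:prompt, lambda do |feedback|
  {
    component: :prompt,
    action: :append_clarification,
    detail: feedback[:suggestion] || "Tighten the output format instructions."
  }
end)

# Feedback shaped as described in the Feedback Format section.
feedback = {
  component: :prompt,
  target: "Summarize the quarterly report in three bullet points.",
  metrics: { confidence: 0.55 },
  outcome: :failure,
  suggestion: "Specify the expected bullet structure explicitly."
}

# With confidence (0.55) below the threshold (0.7) and auto_apply disabled,
# the engine would record the feedback and surface the strategy's suggestion
# for manual review rather than applying it automatically.
engine.process_feedback(feedback)
```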