agentic 0.1.0 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.agentic.yml +2 -0
- data/.architecture/decisions/ArchitecturalFeatureBuilder.md +136 -0
- data/.architecture/decisions/ArchitectureConsiderations.md +200 -0
- data/.architecture/decisions/adr_001_observer_pattern_implementation.md +196 -0
- data/.architecture/decisions/adr_002_plan_orchestrator.md +320 -0
- data/.architecture/decisions/adr_003_plan_orchestrator_interface.md +179 -0
- data/.architecture/decisions/adrs/ADR-001-dependency-management.md +147 -0
- data/.architecture/decisions/adrs/ADR-002-system-boundaries.md +162 -0
- data/.architecture/decisions/adrs/ADR-003-content-safety.md +158 -0
- data/.architecture/decisions/adrs/ADR-004-agent-permissions.md +161 -0
- data/.architecture/decisions/adrs/ADR-005-adaptation-engine.md +127 -0
- data/.architecture/decisions/adrs/ADR-006-extension-system.md +273 -0
- data/.architecture/decisions/adrs/ADR-007-learning-system.md +156 -0
- data/.architecture/decisions/adrs/ADR-008-prompt-generation.md +325 -0
- data/.architecture/decisions/adrs/ADR-009-task-failure-handling.md +353 -0
- data/.architecture/decisions/adrs/ADR-010-task-input-handling.md +251 -0
- data/.architecture/decisions/adrs/ADR-011-task-observable-pattern.md +391 -0
- data/.architecture/decisions/adrs/ADR-012-task-output-handling.md +205 -0
- data/.architecture/decisions/adrs/ADR-013-architecture-alignment.md +211 -0
- data/.architecture/decisions/adrs/ADR-014-agent-capability-registry.md +80 -0
- data/.architecture/decisions/adrs/ADR-015-persistent-agent-store.md +100 -0
- data/.architecture/decisions/adrs/ADR-016-agent-assembly-engine.md +117 -0
- data/.architecture/decisions/adrs/ADR-017-streaming-observability.md +171 -0
- data/.architecture/decisions/capability_tools_distinction.md +150 -0
- data/.architecture/decisions/cli_command_structure.md +61 -0
- data/.architecture/implementation/agent_self_assembly_implementation.md +267 -0
- data/.architecture/implementation/agent_self_assembly_summary.md +138 -0
- data/.architecture/members.yml +187 -0
- data/.architecture/planning/self_implementation_exercise.md +295 -0
- data/.architecture/planning/session_compaction_rule.md +43 -0
- data/.architecture/planning/streaming_observability_feature.md +223 -0
- data/.architecture/principles.md +151 -0
- data/.architecture/recalibration/0-2-0.md +92 -0
- data/.architecture/recalibration/agent_self_assembly.md +238 -0
- data/.architecture/recalibration/cli_command_structure.md +91 -0
- data/.architecture/recalibration/implementation_roadmap_0-2-0.md +301 -0
- data/.architecture/recalibration/progress_tracking_0-2-0.md +114 -0
- data/.architecture/recalibration_process.md +127 -0
- data/.architecture/reviews/0-2-0.md +181 -0
- data/.architecture/reviews/cli_command_duplication.md +98 -0
- data/.architecture/templates/adr.md +105 -0
- data/.architecture/templates/implementation_roadmap.md +125 -0
- data/.architecture/templates/progress_tracking.md +89 -0
- data/.architecture/templates/recalibration_plan.md +70 -0
- data/.architecture/templates/version_comparison.md +124 -0
- data/.claude/settings.local.json +13 -0
- data/.claude-sessions/001-task-class-architecture-implementation.md +129 -0
- data/.claude-sessions/002-plan-orchestrator-interface-review.md +105 -0
- data/.claude-sessions/architecture-governance-implementation.md +37 -0
- data/.claude-sessions/architecture-review-session.md +27 -0
- data/ArchitecturalFeatureBuilder.md +136 -0
- data/ArchitectureConsiderations.md +229 -0
- data/CHANGELOG.md +57 -2
- data/CLAUDE.md +111 -0
- data/CONTRIBUTING.md +286 -0
- data/MAINTAINING.md +301 -0
- data/README.md +582 -28
- data/docs/agent_capabilities_api.md +259 -0
- data/docs/artifact_extension_points.md +757 -0
- data/docs/artifact_generation_architecture.md +323 -0
- data/docs/artifact_implementation_plan.md +596 -0
- data/docs/artifact_integration_points.md +345 -0
- data/docs/artifact_verification_strategies.md +581 -0
- data/docs/streaming_observability_architecture.md +510 -0
- data/exe/agentic +6 -1
- data/lefthook.yml +5 -0
- data/lib/agentic/adaptation_engine.rb +124 -0
- data/lib/agentic/agent.rb +181 -4
- data/lib/agentic/agent_assembly_engine.rb +442 -0
- data/lib/agentic/agent_capability_registry.rb +260 -0
- data/lib/agentic/agent_config.rb +63 -0
- data/lib/agentic/agent_specification.rb +46 -0
- data/lib/agentic/capabilities/examples.rb +530 -0
- data/lib/agentic/capabilities.rb +14 -0
- data/lib/agentic/capability_provider.rb +146 -0
- data/lib/agentic/capability_specification.rb +118 -0
- data/lib/agentic/cli/agent.rb +31 -0
- data/lib/agentic/cli/capabilities.rb +191 -0
- data/lib/agentic/cli/config.rb +134 -0
- data/lib/agentic/cli/execution_observer.rb +796 -0
- data/lib/agentic/cli.rb +1068 -0
- data/lib/agentic/default_agent_provider.rb +35 -0
- data/lib/agentic/errors/llm_error.rb +184 -0
- data/lib/agentic/execution_plan.rb +53 -0
- data/lib/agentic/execution_result.rb +91 -0
- data/lib/agentic/expected_answer_format.rb +46 -0
- data/lib/agentic/extension/domain_adapter.rb +109 -0
- data/lib/agentic/extension/plugin_manager.rb +163 -0
- data/lib/agentic/extension/protocol_handler.rb +116 -0
- data/lib/agentic/extension.rb +45 -0
- data/lib/agentic/factory_methods.rb +9 -1
- data/lib/agentic/generation_stats.rb +61 -0
- data/lib/agentic/learning/README.md +84 -0
- data/lib/agentic/learning/capability_optimizer.rb +613 -0
- data/lib/agentic/learning/execution_history_store.rb +251 -0
- data/lib/agentic/learning/pattern_recognizer.rb +500 -0
- data/lib/agentic/learning/strategy_optimizer.rb +706 -0
- data/lib/agentic/learning.rb +131 -0
- data/lib/agentic/llm_assisted_composition_strategy.rb +188 -0
- data/lib/agentic/llm_client.rb +215 -15
- data/lib/agentic/llm_config.rb +65 -1
- data/lib/agentic/llm_response.rb +163 -0
- data/lib/agentic/logger.rb +1 -1
- data/lib/agentic/observable.rb +51 -0
- data/lib/agentic/persistent_agent_store.rb +385 -0
- data/lib/agentic/plan_execution_result.rb +129 -0
- data/lib/agentic/plan_orchestrator.rb +464 -0
- data/lib/agentic/plan_orchestrator_config.rb +57 -0
- data/lib/agentic/retry_config.rb +63 -0
- data/lib/agentic/retry_handler.rb +125 -0
- data/lib/agentic/structured_outputs.rb +1 -1
- data/lib/agentic/task.rb +193 -0
- data/lib/agentic/task_definition.rb +39 -0
- data/lib/agentic/task_execution_result.rb +92 -0
- data/lib/agentic/task_failure.rb +66 -0
- data/lib/agentic/task_output_schemas.rb +112 -0
- data/lib/agentic/task_planner.rb +54 -19
- data/lib/agentic/task_result.rb +48 -0
- data/lib/agentic/ui.rb +244 -0
- data/lib/agentic/verification/critic_framework.rb +116 -0
- data/lib/agentic/verification/llm_verification_strategy.rb +60 -0
- data/lib/agentic/verification/schema_verification_strategy.rb +47 -0
- data/lib/agentic/verification/verification_hub.rb +62 -0
- data/lib/agentic/verification/verification_result.rb +50 -0
- data/lib/agentic/verification/verification_strategy.rb +26 -0
- data/lib/agentic/version.rb +1 -1
- data/lib/agentic.rb +74 -2
- data/plugins/README.md +41 -0
- metadata +245 -6
@@ -0,0 +1,162 @@
|
|
1
|
+
# ADR-002: Implementation of System Boundaries
|
2
|
+
|
3
|
+
## Status
|
4
|
+
|
5
|
+
Draft
|
6
|
+
|
7
|
+
## Context
|
8
|
+
|
9
|
+
The architectural review of version 0.2.0 identified that the boundaries between different subsystems (planning, execution, learning) in the Agentic codebase could be more explicit. Currently, there is direct coupling between these subsystems, which:
|
10
|
+
|
11
|
+
1. Makes it difficult to understand the interfaces between major components
|
12
|
+
2. Limits the ability to replace or extend individual subsystems
|
13
|
+
3. Creates potential for unintended side effects when modifying one subsystem
|
14
|
+
4. Increases cognitive load for developers working with the codebase
|
15
|
+
5. Makes testing more challenging as subsystems cannot be easily isolated
|
16
|
+
|
17
|
+
As the codebase continues to grow, these issues will become more pronounced and limit the system's evolvability.
|
18
|
+
|
19
|
+
## Decision Drivers
|
20
|
+
|
21
|
+
* Modularity: Enable independent development and evolution of subsystems
|
22
|
+
* Comprehensibility: Make system boundaries clear for new developers
|
23
|
+
* Testability: Allow subsystems to be tested in isolation
|
24
|
+
* Extensibility: Support plugging in alternative implementations for subsystems
|
25
|
+
* Maintainability: Reduce coupling between conceptually separate parts of the system
|
26
|
+
* Evolution: Enable subsystems to evolve at different rates
|
27
|
+
|
28
|
+
## Decision
|
29
|
+
|
30
|
+
Implement explicit boundaries between the major subsystems in Agentic by:
|
31
|
+
|
32
|
+
1. Defining clear interfaces between the planning, execution, and learning subsystems
|
33
|
+
2. Introducing anti-corruption layers where needed to maintain domain consistency
|
34
|
+
3. Using dependency inversion to reduce direct coupling
|
35
|
+
4. Implementing explicit contracts for cross-subsystem communication
|
36
|
+
|
37
|
+
**Architectural Components Affected:**
|
38
|
+
* TaskPlanner (interface extraction and implementation)
|
39
|
+
* PlanOrchestrator (interface extraction and implementation)
|
40
|
+
* Learning system (interface extraction and implementation)
|
41
|
+
* Verification system (interface extraction and implementation)
|
42
|
+
* Cross-cutting concerns (logging, configuration, error handling)
|
43
|
+
|
44
|
+
**Interface Changes:**
|
45
|
+
* New interfaces for major subsystems:
  - IPlanningSystem
  - IExecutionSystem
  - ILearningSystem
  - IVerificationSystem
|
50
|
+
* Domain event interfaces for cross-subsystem communication
|
51
|
+
* Factory methods for creating subsystem implementations
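
Ruby has no formal interface construct, so one way to read the interface list above is as modules that define a contract and raise until it is implemented. The sketch below is only an illustration of that reading combined with dependency inversion. The `plan` method, `StaticPlanner`, and `Runner` are hypothetical names, not taken from the Agentic codebase; only `IPlanningSystem` appears in this ADR.

```ruby
# Hypothetical sketch: every name except IPlanningSystem is illustrative only.
module IPlanningSystem
  # Contract: return an array of task descriptions for the given goal.
  def plan(goal)
    raise NotImplementedError, "#{self.class} must implement #plan"
  end
end

# One possible implementation of the planning contract.
class StaticPlanner
  include IPlanningSystem

  def plan(goal)
    ["research #{goal}", "summarize #{goal}"]
  end
end

# The execution side depends on the interface, not on a concrete planner.
class Runner
  def initialize(planner)
    unless planner.is_a?(IPlanningSystem)
      raise ArgumentError, "planner must include IPlanningSystem"
    end
    @planner = planner
  end

  def run(goal)
    @planner.plan(goal).map { |task| "done: #{task}" }
  end
end

runner = Runner.new(StaticPlanner.new)
results = runner.run("boundaries")
```

Because `Runner` only checks for the module, a test can pass in any stub that includes `IPlanningSystem`, which is the isolation property the acceptance criteria ask for.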
|
52
|
+
|
53
|
+
## Consequences
|
54
|
+
|
55
|
+
### Positive
|
56
|
+
|
57
|
+
* Clearer system structure with well-defined boundaries
|
58
|
+
* Improved ability to work on subsystems independently
|
59
|
+
* Better testability through proper isolation
|
60
|
+
* Easier to swap implementations for specific subsystems
|
61
|
+
* Reduced coupling between conceptually separate parts
|
62
|
+
* More deliberate cross-subsystem communication
|
63
|
+
|
64
|
+
### Negative
|
65
|
+
|
66
|
+
* Additional interfaces increase initial complexity
|
67
|
+
* Potential for over-engineering if boundaries are too rigid
|
68
|
+
* Initial development overhead to refactor existing code
|
69
|
+
* Slight performance overhead from additional abstraction layers
|
70
|
+
* Migration challenges for existing code
|
71
|
+
|
72
|
+
### Neutral
|
73
|
+
|
74
|
+
* Shift in development approach requiring more upfront design
|
75
|
+
* Need for documentation about subsystem interactions
|
76
|
+
* Potential need for adapter implementations during transition
|
77
|
+
|
78
|
+
## Implementation
|
79
|
+
|
80
|
+
**Phase 1: Interface Definition**
|
81
|
+
* Define interfaces for all major subsystems
|
82
|
+
* Document interaction patterns between subsystems
|
83
|
+
* Create domain event system for cross-boundary communication
|
84
|
+
* Add initial validation tests for interfaces
|
85
|
+
|
86
|
+
**Phase 2: Implementation Refactoring**
|
87
|
+
* Refactor existing implementations to follow the new interfaces
|
88
|
+
* Create anti-corruption layers where needed
|
89
|
+
* Update factories to work with interfaces instead of concrete classes
|
90
|
+
* Keep backward compatibility through adapter patterns
|
91
|
+
|
92
|
+
**Phase 3: Boundary Enforcement**
|
93
|
+
* Add static analysis tools to enforce architectural boundaries
|
94
|
+
* Create visualizations of subsystem interactions
|
95
|
+
* Implement metrics for measuring coupling between subsystems
|
96
|
+
* Update documentation with architectural diagrams and guidelines
|
97
|
+
|
98
|
+
## Alternatives Considered
|
99
|
+
|
100
|
+
### Alternative 1: Looser boundaries with documentation only
|
101
|
+
|
102
|
+
**Pros:**
|
103
|
+
* Less initial refactoring work
|
104
|
+
* More flexibility for cross-subsystem optimizations
|
105
|
+
* Less ceremony for developers working across boundaries
|
106
|
+
|
107
|
+
**Cons:**
|
108
|
+
* Relies on discipline rather than structure
|
109
|
+
* Boundaries may erode over time
|
110
|
+
* Harder to maintain as the system grows
|
111
|
+
* Limited enforcement of architectural intentions
|
112
|
+
|
113
|
+
### Alternative 2: Microservice-like boundaries with separate packages
|
114
|
+
|
115
|
+
**Pros:**
|
116
|
+
* Strongest enforcement of boundaries
|
117
|
+
* Maximum independence for subsystem teams
|
118
|
+
* Clearest separation of concerns
|
119
|
+
* Forces explicit API design
|
120
|
+
|
121
|
+
**Cons:**
|
122
|
+
* Excessive overhead for a library
|
123
|
+
* Potential performance impact from stricter isolation
|
124
|
+
* More complex deployment and integration
|
125
|
+
* May feel over-engineered for the current scale
|
126
|
+
|
127
|
+
### Alternative 3: Boundary enforcement through aspect-oriented programming
|
128
|
+
|
129
|
+
**Pros:**
|
130
|
+
* Could maintain boundaries without extensive refactoring
|
131
|
+
* Potentially more flexible boundary definitions
|
132
|
+
* Less invasive to existing code
|
133
|
+
|
134
|
+
**Cons:**
|
135
|
+
* Adds complexity through AOP mechanisms
|
136
|
+
* Less explicit in the code itself
|
137
|
+
* Potentially harder to understand for new developers
|
138
|
+
* Limited tools in Ruby for this approach
|
139
|
+
|
140
|
+
## Validation
|
141
|
+
|
142
|
+
**Acceptance Criteria:**
|
143
|
+
- [ ] All subsystem interactions occur through defined interfaces
|
144
|
+
- [ ] No direct dependencies between implementation classes across subsystems
|
145
|
+
- [ ] Each subsystem can be tested in isolation with mocked dependencies
|
146
|
+
- [ ] Static analysis tools can verify boundary compliance
|
147
|
+
- [ ] Performance overhead is acceptable (<5% in benchmark tests)
|
148
|
+
- [ ] Documentation clearly explains subsystem boundaries and interactions
|
149
|
+
|
150
|
+
**Testing Approach:**
|
151
|
+
* Unit tests for individual subsystems with mocked dependencies
|
152
|
+
* Integration tests for subsystem interactions
|
153
|
+
* Static analysis to verify boundary compliance
|
154
|
+
* Performance benchmarks comparing before and after implementations
|
155
|
+
* Documentation review by developers not involved in the implementation
|
156
|
+
|
157
|
+
## References
|
158
|
+
|
159
|
+
* [Architectural Review 0.2.0](../../../.architecture/reviews/0-2-0.md)
|
160
|
+
* [Implementation Roadmap](../../../.architecture/recalibration/implementation_roadmap_0-2-0.md)
|
161
|
+
* [Domain-Driven Design concepts](https://www.martinfowler.com/bliki/BoundedContext.html)
|
162
|
+
* [Clean Architecture principles](https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html)
|
@@ -0,0 +1,158 @@
|
|
1
|
+
# ADR-003: Content Safety Filtering Approach
|
2
|
+
|
3
|
+
## Status
|
4
|
+
|
5
|
+
Draft
|
6
|
+
|
7
|
+
## Context
|
8
|
+
|
9
|
+
The architectural review of version 0.2.0 identified that the Agentic gem currently lacks protection against harmful or inappropriate content being generated or requested. As AI agents can potentially generate or be prompted with unsafe content, this poses several risks:
|
10
|
+
|
11
|
+
1. Safety risks from generating harmful instructions or content
|
12
|
+
2. Reputational risks for users of the library
|
13
|
+
3. Potential violations of API provider terms of service
|
14
|
+
4. Lack of controls to prevent misuse
|
15
|
+
5. Inconsistent handling of unsafe content across the system
|
16
|
+
|
17
|
+
As AI capabilities continue to advance, content safety becomes increasingly important for any production AI system.
|
18
|
+
|
19
|
+
## Decision Drivers
|
20
|
+
|
21
|
+
* Safety: Prevent generation of harmful content and instructions
|
22
|
+
* Compliance: Ensure compliance with API provider policies
|
23
|
+
* Configurability: Allow users to adapt filtering to their specific needs
|
24
|
+
* Performance: Minimize impact on system performance
|
25
|
+
* Transparency: Make filtering decisions transparent to users
|
26
|
+
* Consistency: Apply safety measures consistently across the system
|
27
|
+
|
28
|
+
## Decision
|
29
|
+
|
30
|
+
Implement a comprehensive content safety filtering system with:
|
31
|
+
|
32
|
+
1. Input filtering before sending to LLMs
|
33
|
+
2. Output filtering after receiving LLM responses
|
34
|
+
3. Configurable filtering levels and rules
|
35
|
+
4. Transparent logging of filtering decisions
|
36
|
+
5. Override mechanisms for trusted contexts
|
37
|
+
|
38
|
+
**Architectural Components Affected:**
|
39
|
+
* LlmClient (modified to apply filtering)
|
40
|
+
* ContentSafetyFilter (new component)
|
41
|
+
* Agent (modified to apply filtering to instructions)
|
42
|
+
* Configuration (extended to include safety settings)
|
43
|
+
|
44
|
+
**Interface Changes:**
|
45
|
+
* New ContentSafetyFilter class with methods for:
  - Checking input safety
  - Checking output safety
  - Configuring filtering rules
  - Logging filtering decisions
|
50
|
+
* Configuration extensions for safety settings
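
To make the four responsibilities above concrete, here is a minimal pattern-based sketch along the lines of Phase 1. The ADR names the `ContentSafetyFilter` class but not its methods, so `check_input`, `check_output`, `add_rule`, `decisions`, and the `Decision` struct are all assumptions made for illustration.

```ruby
# Hypothetical sketch of a pattern-based filter; method names are assumptions.
class ContentSafetyFilter
  Decision = Struct.new(:direction, :allowed, :matched_rule)

  attr_reader :decisions

  def initialize(rules = {})
    @rules = rules.dup # rule name => Regexp
    @decisions = []    # in-memory log of filtering decisions
  end

  # Configure filtering: register or replace a named pattern.
  def add_rule(name, pattern)
    @rules[name] = pattern
  end

  def check_input(text)
    check(:input, text)
  end

  def check_output(text)
    check(:output, text)
  end

  private

  # Records and returns a Decision; content is allowed when no rule matches.
  def check(direction, text)
    name, _pattern = @rules.find { |_, pattern| pattern.match?(text) }
    decision = Decision.new(direction, name.nil?, name)
    @decisions << decision
    decision
  end
end

filter = ContentSafetyFilter.new
filter.add_rule(:blocked_term, /forbidden-example/i)
safe = filter.check_input("Summarize this article")
unsafe = filter.check_output("This contains a FORBIDDEN-EXAMPLE phrase")
```

Returning a decision object rather than a bare boolean keeps the matched rule available for the transparent logging the ADR calls for.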
|
51
|
+
|
52
|
+
## Consequences
|
53
|
+
|
54
|
+
### Positive
|
55
|
+
|
56
|
+
* Reduced risk of generating or processing harmful content
|
57
|
+
* Better compliance with API provider policies
|
58
|
+
* More control for users over content safety
|
59
|
+
* Consistent handling of content safety across the system
|
60
|
+
* Transparent safety decisions with appropriate logging
|
61
|
+
|
62
|
+
### Negative
|
63
|
+
|
64
|
+
* Additional processing overhead for all LLM interactions
|
65
|
+
* Potential for false positives blocking legitimate content
|
66
|
+
* Complexity of handling edge cases in content filtering
|
67
|
+
* Need for regular updates to filtering rules as threats evolve
|
68
|
+
|
69
|
+
### Neutral
|
70
|
+
|
71
|
+
* Shift in responsibility for content safety to the library
|
72
|
+
* Need for documentation about safety capabilities and limitations
|
73
|
+
* Potential need for domain-specific customizations
|
74
|
+
|
75
|
+
## Implementation
|
76
|
+
|
77
|
+
**Phase 1: Basic Filtering**
|
78
|
+
* Create ContentSafetyFilter class with basic pattern-based filtering
|
79
|
+
* Integrate with LlmClient for input and output filtering
|
80
|
+
* Add configuration options for enabling/disabling filtering
|
81
|
+
* Implement logging for filtering decisions
|
82
|
+
|
83
|
+
**Phase 2: Enhanced Filtering**
|
84
|
+
* Add support for different filtering levels (minimal, standard, strict)
|
85
|
+
* Implement domain-specific filtering rules
|
86
|
+
* Create override mechanisms for trusted contexts
|
87
|
+
* Add detection for more subtle safety issues
|
88
|
+
|
89
|
+
**Phase 3: Advanced Capabilities**
|
90
|
+
* Implement embeddings-based filtering for semantic safety issues
|
91
|
+
* Add support for custom filtering rules
|
92
|
+
* Create tools for analyzing and improving filtering accuracy
|
93
|
+
* Implement content sanitization (as opposed to just blocking)
|
94
|
+
|
95
|
+
## Alternatives Considered
|
96
|
+
|
97
|
+
### Alternative 1: Rely on API provider safety measures
|
98
|
+
|
99
|
+
**Pros:**
|
100
|
+
* Less development effort
|
101
|
+
* No performance overhead in our library
|
102
|
+
* Leverage specialized expertise of API providers
|
103
|
+
|
104
|
+
**Cons:**
|
105
|
+
* Inconsistent handling across different providers
|
106
|
+
* Limited control over filtering behavior
|
107
|
+
* No protection for inputs before they reach API providers
|
108
|
+
* Potential compliance gaps with some providers
|
109
|
+
|
110
|
+
### Alternative 2: Third-party content moderation service
|
111
|
+
|
112
|
+
**Pros:**
|
113
|
+
* Leverage specialized moderation expertise
|
114
|
+
* Regular updates to detection capabilities
|
115
|
+
* Potentially higher accuracy than internal solution
|
116
|
+
|
117
|
+
**Cons:**
|
118
|
+
* External dependency for critical functionality
|
119
|
+
* Additional latency from API calls
|
120
|
+
* Potential cost implications
|
121
|
+
* Privacy concerns with sending data to third parties
|
122
|
+
|
123
|
+
### Alternative 3: Client-side responsibility only
|
124
|
+
|
125
|
+
**Pros:**
|
126
|
+
* Simplicity in library implementation
|
127
|
+
* No performance overhead within the library
|
128
|
+
* Maximum flexibility for library users
|
129
|
+
|
130
|
+
**Cons:**
|
131
|
+
* Inconsistent safety measures across implementations
|
132
|
+
* Higher burden on library users
|
133
|
+
* No protection by default
|
134
|
+
* Potential reputation risks if misused
|
135
|
+
|
136
|
+
## Validation
|
137
|
+
|
138
|
+
**Acceptance Criteria:**
|
139
|
+
- [ ] Content filtering can detect common categories of unsafe content
|
140
|
+
- [ ] False positive rate is below acceptable threshold (target: <5%)
|
141
|
+
- [ ] Performance impact is acceptable (target: <50ms per interaction)
|
142
|
+
- [ ] Filtering can be configured at different levels
|
143
|
+
- [ ] Override mechanisms work correctly for trusted contexts
|
144
|
+
- [ ] Filtering decisions are properly logged
|
145
|
+
|
146
|
+
**Testing Approach:**
|
147
|
+
* Unit tests with various input patterns including edge cases
|
148
|
+
* Performance benchmarks for filtering overhead
|
149
|
+
* Integration tests with LlmClient
|
150
|
+
* Validation with synthetic (safe) examples of problematic patterns
|
151
|
+
* User testing of configuration options
|
152
|
+
|
153
|
+
## References
|
154
|
+
|
155
|
+
* [Architectural Review 0.2.0](../../../.architecture/reviews/0-2-0.md)
|
156
|
+
* [Implementation Roadmap](../../../.architecture/recalibration/implementation_roadmap_0-2-0.md)
|
157
|
+
* [OpenAI Moderation API](https://platform.openai.com/docs/guides/moderation)
|
158
|
+
* [AI content safety best practices](https://www.responsible.ai/resources/content-safety)
|
@@ -0,0 +1,161 @@
|
|
1
|
+
# ADR-004: Agent Permission Model
|
2
|
+
|
3
|
+
## Status
|
4
|
+
|
5
|
+
Draft
|
6
|
+
|
7
|
+
## Context
|
8
|
+
|
9
|
+
The architectural review of version 0.2.0 identified that the Agentic gem lacks a clear mechanism to restrict what capabilities agents have access to. This presents several challenges:
|
10
|
+
|
11
|
+
1. No granular control over agent actions, creating potential security risks
|
12
|
+
2. Inability to enforce least-privilege principles for agents
|
13
|
+
3. Difficulty implementing role-based agent systems with proper security boundaries
|
14
|
+
4. Limited auditing capabilities for agent actions and permissions
|
15
|
+
5. No formal way to express agent capability requirements
|
16
|
+
|
17
|
+
As AI agents become more powerful and deployed in more sensitive contexts, controlling their capabilities becomes increasingly important for security and compliance.
|
18
|
+
|
19
|
+
## Decision Drivers
|
20
|
+
|
21
|
+
* Security: Implement principle of least privilege for agents
|
22
|
+
* Flexibility: Support diverse permission models for different use cases
|
23
|
+
* Usability: Make permission system intuitive for developers
|
24
|
+
* Auditability: Enable tracking of permissions and access attempts
|
25
|
+
* Performance: Minimize overhead from permission checks
|
26
|
+
* Integration: Work seamlessly with existing agent models
|
27
|
+
|
28
|
+
## Decision
|
29
|
+
|
30
|
+
Implement a comprehensive agent permission system with:
|
31
|
+
|
32
|
+
1. Explicit permission definitions with granular capabilities
|
33
|
+
2. Permission registry for centralized management
|
34
|
+
3. Capability checking at agent execution time
|
35
|
+
4. Permission inheritance and composition
|
36
|
+
5. Audit logging for permission checks and violations
|
37
|
+
|
38
|
+
**Architectural Components Affected:**
|
39
|
+
* Agent (modified to include and check permissions)
|
40
|
+
* Permission (new component)
|
41
|
+
* PermissionRegistry (new component)
|
42
|
+
* AgentSpecification (modified to include permission requirements)
|
43
|
+
* Task (modified to specify required permissions)
|
44
|
+
|
45
|
+
**Interface Changes:**
|
46
|
+
* New Permission class to represent individual capabilities
|
47
|
+
* PermissionRegistry for managing system permissions
|
48
|
+
* Agent interface extensions for capability checking:
  - `can?(capability_name)` method
  - `require_capability(capability_name)` method
|
51
|
+
* Configuration options for permission management
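
The two agent methods listed above could behave roughly as follows. This is a sketch under assumptions: `SketchAgent` and `PermissionDeniedError` are hypothetical names, and storing capabilities in a `Set` is an illustrative choice that stands in for the proposed `Permission` and `PermissionRegistry` components.

```ruby
require "set"

# Hypothetical sketch; only can? and require_capability come from the ADR.
class PermissionDeniedError < StandardError; end

class SketchAgent
  def initialize(name, capabilities = [])
    @name = name
    @capabilities = Set.new(capabilities.map(&:to_sym))
  end

  # True when the agent has been granted the named capability.
  def can?(capability_name)
    @capabilities.include?(capability_name.to_sym)
  end

  # Returns true when the capability was granted, raises otherwise.
  def require_capability(capability_name)
    return true if can?(capability_name)

    raise PermissionDeniedError, "#{@name} lacks capability #{capability_name}"
  end
end

agent = SketchAgent.new("researcher", [:web_search])
```

A query method plus a raising guard gives callers both styles: branch on `can?` when a fallback exists, or call `require_capability` at the top of an action to fail fast.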
|
52
|
+
|
53
|
+
## Consequences
|
54
|
+
|
55
|
+
### Positive
|
56
|
+
|
57
|
+
* Improved security through controlled agent capabilities
|
58
|
+
* Support for role-based agent authorization
|
59
|
+
* Better auditability of agent actions and permissions
|
60
|
+
* Clear expression of capability requirements
|
61
|
+
* Foundation for more complex security models
|
62
|
+
|
63
|
+
### Negative
|
64
|
+
|
65
|
+
* Additional complexity in agent configuration
|
66
|
+
* Potential friction in development if permissions are too restrictive
|
67
|
+
* Performance overhead from permission checking
|
68
|
+
* Migration challenges for existing agent implementations
|
69
|
+
|
70
|
+
### Neutral
|
71
|
+
|
72
|
+
* Shift toward more explicit capability management
|
73
|
+
* Need for documentation about permission models
|
74
|
+
* Potential need for helper methods to simplify common patterns
|
75
|
+
|
76
|
+
## Implementation
|
77
|
+
|
78
|
+
**Phase 1: Core Permission Model**
|
79
|
+
* Create Permission class for representing capabilities
|
80
|
+
* Implement PermissionRegistry for centralized management
|
81
|
+
* Extend Agent to support permission checking
|
82
|
+
* Add basic audit logging for permission decisions
|
83
|
+
|
84
|
+
**Phase 2: Enhanced Permission Features**
|
85
|
+
* Implement permission inheritance and composition
|
86
|
+
* Create permission sets for common agent roles
|
87
|
+
* Add configuration options for permission management
|
88
|
+
* Enhance audit logging with more context
|
89
|
+
|
90
|
+
**Phase 3: Advanced Security Model**
|
91
|
+
* Implement context-sensitive permissions
|
92
|
+
* Add dynamic permission granting/revocation
|
93
|
+
* Create tools for analyzing permission usage
|
94
|
+
* Implement permission-based sandbox execution
|
95
|
+
|
96
|
+
## Alternatives Considered
|
97
|
+
|
98
|
+
### Alternative 1: Capability-based security model
|
99
|
+
|
100
|
+
**Pros:**
|
101
|
+
* More object-oriented approach with capabilities as objects
|
102
|
+
* Can be more secure with proper unforgeable capabilities
|
103
|
+
* More flexible composition of capabilities
|
104
|
+
|
105
|
+
**Cons:**
|
106
|
+
* More complex implementation
|
107
|
+
* Less familiar to most developers
|
108
|
+
* Potentially higher performance overhead
|
109
|
+
* More challenging to audit centrally
|
110
|
+
|
111
|
+
### Alternative 2: Role-based access control only
|
112
|
+
|
113
|
+
**Pros:**
|
114
|
+
* Simpler implementation
|
115
|
+
* More familiar to developers from other systems
|
116
|
+
* Easier to reason about at a high level
|
117
|
+
* Potentially lower overhead
|
118
|
+
|
119
|
+
**Cons:**
|
120
|
+
* Less granular control than capability-based approach
|
121
|
+
* More rigid permission structure
|
122
|
+
* Harder to implement dynamic permissions
|
123
|
+
* Less aligned with agent-oriented design
|
124
|
+
|
125
|
+
### Alternative 3: Attribute-based access control
|
126
|
+
|
127
|
+
**Pros:**
|
128
|
+
* More flexible for complex permission scenarios
|
129
|
+
* Better support for context-sensitive permissions
|
130
|
+
* More expressive permission model
|
131
|
+
|
132
|
+
**Cons:**
|
133
|
+
* Significantly more complex to implement
|
134
|
+
* Higher performance overhead
|
135
|
+
* Steeper learning curve for users
|
136
|
+
* More difficult to reason about permissions
|
137
|
+
|
138
|
+
## Validation
|
139
|
+
|
140
|
+
**Acceptance Criteria:**
|
141
|
+
- [ ] Agents can be restricted to specific capabilities
|
142
|
+
- [ ] Permission checks prevent unauthorized actions
|
143
|
+
- [ ] Permissions can be composed and inherited
|
144
|
+
- [ ] Permission checks have acceptable performance overhead (<1ms)
|
145
|
+
- [ ] Audit logs capture all permission decisions
|
146
|
+
- [ ] Permission model integrates with existing agent concepts
|
147
|
+
|
148
|
+
**Testing Approach:**
|
149
|
+
* Unit tests for permission checks under various scenarios
|
150
|
+
* Performance benchmarks for permission overhead
|
151
|
+
* Integration tests with agent execution
|
152
|
+
* Security-focused tests to verify proper enforcement
|
153
|
+
* User testing of permission configuration
|
154
|
+
|
155
|
+
## References
|
156
|
+
|
157
|
+
* [Architectural Review 0.2.0](../../../.architecture/reviews/0-2-0.md)
|
158
|
+
* [Implementation Roadmap](../../../.architecture/recalibration/implementation_roadmap_0-2-0.md)
|
159
|
+
* [Principle of Least Privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege)
|
160
|
+
* [Capability-based security](https://en.wikipedia.org/wiki/Capability-based_security)
|
161
|
+
* [Role-based access control](https://en.wikipedia.org/wiki/Role-based_access_control)
|
@@ -0,0 +1,127 @@
|
|
1
|
+
# Adaptation Engine Design
|
2
|
+
|
3
|
+
## Purpose and Scope
|
4
|
+
|
5
|
+
The Adaptation Engine component is a core part of the Verification Layer, responsible for implementing feedback-driven adjustments to improve agent and task performance over time. It analyzes task outcomes, verification results, and human feedback to suggest or automatically apply adaptations to various system components.
|
6
|
+
|
7
|
+
## Design Principles
|
8
|
+
|
9
|
+
1. **Feedback-driven**: All adaptations are based on explicit feedback from execution, verification, or human input
|
10
|
+
2. **Component-oriented**: Each system component (agents, tasks, prompts) can have specific adaptation strategies
|
11
|
+
3. **Progressive autonomy**: Supports both manual review of suggestions and automatic application of adaptations
|
12
|
+
4. **Historical awareness**: Maintains history of adaptations for analysis and continuous improvement
|
13
|
+
5. **Threshold-based actions**: Uses confidence scores to determine when adaptation is needed
|
14
|
+
|
15
|
+
## Architecture
|
16
|
+
|
17
|
+
### Class Structure
|
18
|
+
|
19
|
+
The `AdaptationEngine` is designed as a registry of adaptation strategies that can be applied to different components based on feedback:
|
20
|
+
|
21
|
+
```ruby
module Agentic
  class AdaptationEngine
    def initialize(options = {})
      # Configuration settings
      # Adaptation registry
      # Feedback history
    end

    def register_adaptation_strategy(component, strategy)
      # Register a callable strategy for a component
    end

    def process_feedback(feedback)
      # Process feedback and determine if adaptation is needed
    end

    def apply_adaptation(feedback)
      # Apply registered strategy to adapt the component
    end

    def adaptation_history(component = nil)
      # Retrieve adaptation history
    end
  end
end
```
|
48
|
+
|
49
|
+
### Interfaces
|
50
|
+
|
51
|
+
#### Feedback Format
|
52
|
+
|
53
|
+
Feedback is structured as a hash containing:
|
54
|
+
- `:component`: Symbol identifying the component (e.g., `:agent`, `:task`, `:prompt`)
|
55
|
+
- `:target`: The instance to adapt
|
56
|
+
- `:metrics`: Performance metrics (including `:confidence` score)
|
57
|
+
- `:outcome`: Success/failure indicator
|
58
|
+
- `:suggestion`: Optional suggested improvement
|
59
|
+
|
60
|
+
#### Adaptation Strategy Interface
|
61
|
+
|
62
|
+
Adaptation strategies are implemented as callables (Procs or lambdas) that:
|
63
|
+
1. Accept a feedback hash
|
64
|
+
2. Perform adaptation on the target
|
65
|
+
3. Return a result hash with adaptation details
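
As an illustration of the two interfaces just described, the sketch below builds a feedback hash and a lambda strategy that consumes it. The feedback keys follow the format above; the shape of the target (a hash with a `:text` key) and the result keys (`:adapted`, `:component`, `:note`) are assumptions, since the design does not fix them.

```ruby
# Hypothetical strategy: the target shape and result keys are assumptions.
prompt_strategy = lambda do |feedback|
  target = feedback[:target]
  suggestion = feedback[:suggestion]
  # Append the suggested improvement to the prompt text, if one was given.
  target[:text] = "#{target[:text]} #{suggestion}".strip if suggestion
  { adapted: !suggestion.nil?, component: feedback[:component], note: suggestion }
end

feedback = {
  component: :prompt,
  target: { text: "Summarize the report." },
  metrics: { confidence: 0.4 },
  outcome: :failure,
  suggestion: "Limit the summary to three sentences."
}

result = prompt_strategy.call(feedback)
```

Any object that responds to `call` fits the same slot, so a strategy can start as a lambda and later move into a class without changing how it is registered.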
|
66
|
+
|
67
|
+
### Integration Points
|
68
|
+
|
69
|
+
1. **Verification Hub**: Provides feedback based on verification results
|
70
|
+
2. **Task Execution**: Reports outcomes for adaptation consideration
|
71
|
+
3. **Human Interface**: Allows manual feedback to drive adaptation
|
72
|
+
4. **Learning System**: Provides pattern-based suggestions for adaptations
|
73
|
+
|
74
|
+
## Key Behaviors
|
75
|
+
|
76
|
+
### Adaptation Threshold
|
77
|
+
|
78
|
+
The engine uses a configurable threshold to determine when adaptation is needed:
|
79
|
+
- Confidence scores below threshold trigger adaptation consideration
|
80
|
+
- Threshold can be adjusted based on domain requirements and risk tolerance
|
81
|
+
|
82
|
+
### Auto-Adaptation
|
83
|
+
|
84
|
+
Two operating modes are supported:
|
85
|
+
1. **Manual review**: Adaptations are suggested but require confirmation
|
86
|
+
2. **Automatic application**: Adaptations are applied immediately when needed
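
The threshold check and the two operating modes can be combined in a few lines. The following is a sketch only, not the shipped `Agentic::AdaptationEngine`: the option names `:adaptation_threshold` and `:auto_adapt`, the default of 0.7, and the returned status symbols are all assumptions.

```ruby
# Hypothetical sketch: option names, defaults, and statuses are assumptions.
class SketchAdaptationEngine
  def initialize(options = {})
    @threshold = options.fetch(:adaptation_threshold, 0.7)
    @auto_adapt = options.fetch(:auto_adapt, false)
    @strategies = {}
    @history = []
  end

  def register_adaptation_strategy(component, strategy)
    @strategies[component] = strategy
  end

  # Adapts only when confidence falls below the threshold and a strategy exists.
  def process_feedback(feedback)
    confidence = feedback.dig(:metrics, :confidence) || 1.0
    strategy = @strategies[feedback[:component]]
    status =
      if confidence >= @threshold || strategy.nil?
        :no_adaptation
      elsif @auto_adapt
        strategy.call(feedback)
        :adapted
      else
        :suggested # manual review: record the suggestion, change nothing
      end
    @history << { component: feedback[:component], status: status }
    status
  end

  def adaptation_history(component = nil)
    component ? @history.select { |h| h[:component] == component } : @history.dup
  end
end

bump_retries = ->(fb) { fb[:target][:retries] += 1 }
low = { component: :task, target: { retries: 0 }, metrics: { confidence: 0.5 } }

manual = SketchAdaptationEngine.new(adaptation_threshold: 0.7)
manual.register_adaptation_strategy(:task, bump_retries)
manual_status = manual.process_feedback(low)
retries_after_manual = low[:target][:retries]

auto = SketchAdaptationEngine.new(adaptation_threshold: 0.7, auto_adapt: true)
auto.register_adaptation_strategy(:task, bump_retries)
auto_status = auto.process_feedback(low)
```

In manual mode the target is left untouched and the status is only recorded, which keeps a human in the loop until the strategy has earned enough trust to run automatically.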
|
87
|
+
|
88
|
+
### Adaptation Registry
|
89
|
+
|
90
|
+
Components register specific adaptation strategies:
|
91
|
+
- Different strategies for different component types
|
92
|
+
- Strategy registration at runtime allows for extensibility
|
93
|
+
- Domain-specific strategies can be registered as needed
|
94
|
+
|
95
|
+
### History Tracking
|
96
|
+
|
97
|
+
All feedback and adaptations are tracked:
|
98
|
+
- Provides audit trail of system improvements
|
99
|
+
- Enables analysis of adaptation effectiveness
|
100
|
+
- Supports learning for future adaptation strategies
|
101
|
+
|
102
|
+
## Implementation Considerations
|
103
|
+
|
104
|
+
1. **Error Handling**: Adaptations could potentially create regression issues, so robust error handling is essential
|
105
|
+
2. **Persistence**: Consider whether adaptation history should be persisted across sessions
|
106
|
+
3. **Metrics**: Define standard metrics for measuring adaptation effectiveness
|
107
|
+
4. **Strategy Composition**: Allow complex adaptations through composition of simpler strategies
|
108
|
+
5. **Validation**: Ensure adaptations maintain system consistency and don't violate constraints
|
109
|
+
|
110
|
+
## Future Extensions
|
111
|
+
|
112
|
+
1. **Adaptation Chains**: Support sequences of adaptations with dependencies
|
113
|
+
2. **Meta-Adaptation**: Adapt the adaptation strategies themselves based on effectiveness
|
114
|
+
3. **A/B Testing**: Compare different adaptation strategies for effectiveness
|
115
|
+
4. **Domain-Specific Adapters**: Create specialized adaptation libraries for different domains
|
116
|
+
5. **Collaborative Adaptation**: Allow multiple agents to contribute to adaptation decisions
|
117
|
+
|
118
|
+
## Security and Safety
|
119
|
+
|
120
|
+
1. **Adaptation Limits**: Set boundaries on what can be changed through adaptation
|
121
|
+
2. **Rollback Capability**: Ability to revert problematic adaptations
|
122
|
+
3. **Approval Workflows**: Multi-stage approval for critical adaptations
|
123
|
+
4. **Isolation**: Ensure adaptations can't compromise system integrity
|
124
|
+
|
125
|
+
## Conclusion
|
126
|
+
|
127
|
+
The Adaptation Engine provides a flexible, extensible mechanism for improving system performance through feedback-driven adjustments. By applying targeted adaptations based on execution outcomes, verification results, and human feedback, the system can continuously improve its effectiveness in achieving user goals.
|