agentic-code 0.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,230 @@
1
+ # Test Strategy: ROI-Based Selection
2
+
3
+ ## Core Principle: Maximum Coverage, Minimum Tests
4
+
5
+ **Philosophy**: 10 reliable tests > 100 unmaintained tests
6
+
7
+ Quality over quantity - focus resources on high-value tests that provide maximum coverage with minimum maintenance burden.
8
+
9
+ ## ROI Calculation Framework
10
+
11
+ ### ROI Formula
12
+
13
+ ```
14
+ ROI Score = (Business Value × User Frequency + Legal Requirement × 10 + Defect Detection)
15
+ / (Creation Cost + Execution Cost + Maintenance Cost)
16
+ ```
17
+
18
+ ### Value Components
19
+
20
+ **Business Value** (0-10 scale):
21
+ - 10: Revenue-critical (payment processing, checkout)
22
+ - 8-9: Core business features (user registration, data persistence)
23
+ - 5-7: Important secondary features (search, filtering)
24
+ - 2-4: Nice-to-have features (UI enhancements)
25
+ - 0-1: Cosmetic features
26
+
27
+ **User Frequency** (0-10 scale):
28
+ - 10: Every user, every session (authentication)
29
+ - 8-9: >80% of users regularly
30
+ - 5-7: 50-80% of users occasionally
31
+ - 2-4: <50% of users rarely
32
+ - 0-1: Edge case users only
33
+
34
+ **Legal Requirement** (boolean → 0 or 1):
35
+ - 1: Legally mandated (GDPR compliance, data protection)
36
+ - 0: Not legally required
37
+
38
+ **Defect Detection** (0-10 scale):
39
+ - 10: High likelihood of catching critical bugs
40
+ - 5-7: Moderate likelihood of catching bugs
41
+ - 0-4: Low likelihood (simple logic, well-tested patterns)
42
+
43
+ ### Cost Components
44
+
45
+ **Test Level Cost Table**:
46
+
47
+ | Test Level | Creation Cost | Execution Cost | Maintenance Cost | Total Cost |
48
+ |-------------|---------------|----------------|------------------|------------|
49
+ | Unit | 1 | 1 | 1 | 3 |
50
+ | Integration | 3 | 5 | 3 | 11 |
51
+ | E2E | 10 | 20 | 8 | 38 |
52
+
53
+ **Cost Rationale**:
54
+ - **Unit Tests**: Fast to write, fast to run, rarely break from refactoring
55
+ - **Integration Tests**: Moderate setup, slower execution, moderate maintenance
56
+ - **E2E Tests**: Complex setup, very slow execution, high brittleness (more than 12x the total cost of a unit test; see the sketch below)
57
+
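+ The formula and cost table above can be sketched directly in code. A minimal TypeScript sketch (the `Candidate` shape and `LEVEL_COST` constant are illustrative names, not part of any existing tooling):
+
+ ```
+ type TestLevel = 'unit' | 'integration' | 'e2e';
+
+ // Total cost per level, taken from the cost table above
+ const LEVEL_COST: Record<TestLevel, number> = { unit: 3, integration: 11, e2e: 38 };
+
+ interface Candidate {
+   level: TestLevel;
+   businessValue: number;   // 0-10
+   userFrequency: number;   // 0-10
+   legalRequirement: 0 | 1; // legally mandated or not
+   defectDetection: number; // 0-10
+ }
+
+ // ROI = (Business Value x User Frequency + Legal Requirement x 10 + Defect Detection) / Total Cost
+ function roiScore(c: Candidate): number {
+   const value = c.businessValue * c.userFrequency + c.legalRequirement * 10 + c.defectDetection;
+   return value / LEVEL_COST[c.level];
+ }
+
+ // Example 1 below: roiScore({ level: 'integration', businessValue: 10, userFrequency: 9,
+ //                             legalRequirement: 0, defectDetection: 8 }) = 98 / 11 = 8.9
+ ```
+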
58
+ ### ROI Calculation Examples
59
+
60
+ **Example 1: Payment Processing Integration Test**
61
+ ```
62
+ Business Value: 10 (revenue-critical)
63
+ User Frequency: 9 (90% of users)
64
+ Legal Requirement: 0
65
+ Defect Detection: 8 (high complexity)
66
+
67
+ ROI = (10 × 9 + 0 + 8) / 11 = 98 / 11 = 8.9
68
+ Decision: HIGH ROI → Generate this test
69
+ ```
70
+
71
+ **Example 2: UI Theme Toggle E2E Test**
72
+ ```
73
+ Business Value: 2 (cosmetic feature)
74
+ User Frequency: 5 (50% of users)
75
+ Legal Requirement: 0
76
+ Defect Detection: 3 (simple logic)
77
+
78
+ ROI = (2 × 5 + 0 + 3) / 38 = 13 / 38 = 0.34
79
+ Decision: LOW ROI → Skip this E2E test (consider unit test instead)
80
+ ```
81
+
82
+ **Example 3: GDPR Data Deletion E2E Test**
83
+ ```
84
+ Business Value: 8 (critical compliance)
85
+ User Frequency: 1 (rare user action)
86
+ Legal Requirement: 1 (legally mandated)
87
+ Defect Detection: 9 (high consequences if broken)
88
+
89
+ ROI = (8 × 1 + 1 × 10 + 9) / 38 = 27 / 38 = 0.71
90
+ Decision: MEDIUM ROI → Generate (legal requirement justifies cost)
91
+ ```
92
+
93
+ ## Critical User Journey Definition
94
+
95
+ The following areas receive HIGH priority regardless of the strict ROI calculation:
96
+
97
+ ### Mandatory Coverage Areas
98
+
99
+ 1. **Revenue-Impacting Flows**
100
+ - Payment processing end-to-end
101
+ - Checkout and order completion
102
+ - Subscription management
103
+ - Purchase confirmation and receipts
104
+
105
+ 2. **Legally Required Flows**
106
+ - GDPR data deletion/export
107
+ - User consent management
108
+ - Data protection compliance
109
+ - Regulatory audit trails
110
+
111
+ 3. **High-Frequency Core Functionality**
112
+ - User authentication/authorization (>80% of users)
113
+ - Core CRUD operations for primary entities
114
+ - Critical business workflows
115
+ - Data integrity for primary data models
116
+
117
+ **Budget Exception**: Critical User Journeys may exceed standard budget limits with explicit justification.
118
+
119
+ ## Test Selection Guidelines
120
+
121
+ ### Selection Thresholds
122
+
123
+ **Integration Tests**:
124
+ - ROI > 3.0: Strong candidate
125
+ - ROI 1.5-3.0: Consider based on available budget
126
+ - ROI < 1.5: Skip or convert to unit test
127
+
128
+ **E2E Tests**:
129
+ - ROI > 2.0: Strong candidate
130
+ - ROI 1.0-2.0: Consider if Critical User Journey
131
+ - ROI < 1.0: Skip (too expensive relative to value)
132
+
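+ These thresholds are approximate. A small sketch of how they might be encoded (the `Decision` labels are illustrative):
+
+ ```
+ type Decision = 'generate' | 'consider' | 'skip';
+
+ function selectionDecision(level: 'integration' | 'e2e', roi: number): Decision {
+   if (level === 'integration') {
+     if (roi > 3.0) return 'generate';
+     if (roi >= 1.5) return 'consider'; // depends on available budget
+     return 'skip';                     // or convert to a unit test
+   }
+   if (roi > 2.0) return 'generate';
+   if (roi >= 1.0) return 'consider';   // only if it is a Critical User Journey
+   return 'skip';                       // too expensive relative to value
+ }
+ ```
+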
133
+ ### Push-Down Analysis
134
+
135
+ Before generating a higher-level test, ask:
136
+
137
+ 1. **Can this be unit-tested?**
138
+ - YES → Generate unit test instead
139
+ - NO → Continue to integration test consideration
140
+
141
+ 2. **Is it already covered by an integration test?**
142
+ - YES → Don't create E2E version
143
+ - NO → Consider E2E test if ROI justifies
144
+
145
+ **Example**:
146
+ - "Tax calculation accuracy" → Unit test (pure logic)
147
+ - "Tax applied to order total" → Integration test (multiple components)
148
+ - "User sees correct tax in checkout flow" → E2E test only if Critical User Journey
149
+
150
+ ## Deduplication Strategy
151
+
152
+ Before generating any test:
153
+
154
+ 1. **Search existing test suite** for similar coverage
155
+ 2. **Check for overlapping scenarios** at different test levels
156
+ 3. **Identify redundant verifications** already covered elsewhere
157
+
158
+ **Decision Matrix**:
159
+ ```
160
+ Existing coverage found?
161
+ → Full coverage: Skip new test
162
+ → Partial coverage: Extend existing test
163
+ → No coverage: Generate new test
164
+ ```
165
+
166
+ ## Application in Test Generation
167
+
168
+ ### Phase 1: Candidate Enumeration
169
+ - List all possible test scenarios
170
+ - Assign ROI metadata to each candidate
171
+
172
+ ### Phase 2: ROI-Based Selection
173
+ 1. Calculate ROI for each candidate
174
+ 2. Apply deduplication checks
175
+ 3. Apply push-down analysis
176
+ 4. Sort by ROI (descending)
177
+
178
+ ### Phase 3: Budget Enforcement
179
+ - Select top N tests within budget limits
180
+ - Document budget usage
181
+ - Report selection rationale
182
+
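+ A compact sketch of Phases 2 and 3 combined, assuming candidates have already passed the deduplication and push-down checks and been scored (the names and the cost-unit budget convention are illustrative):
+
+ ```
+ interface ScoredCandidate { name: string; cost: number; roi: number; }
+
+ // Sort by ROI (descending), then take candidates until the budget is exhausted
+ function selectWithinBudget(candidates: ScoredCandidate[], budget: number): ScoredCandidate[] {
+   const selected: ScoredCandidate[] = [];
+   let spent = 0;
+   for (const c of [...candidates].sort((a, b) => b.roi - a.roi)) {
+     if (spent + c.cost <= budget) {
+       selected.push(c);
+       spent += c.cost;
+     }
+   }
+   return selected;
+ }
+ ```
+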
183
+ **See**: `.agents/tasks/acceptance-test-generation.md` for the detailed implementation process
184
+
185
+ ## Continuous Improvement
186
+
187
+ ### Metrics to Track
188
+
189
+ 1. **Selection Rate**: Tests generated / Total candidates
190
+ - Target: 25-35% (indicates effective filtering)
191
+
192
+ 2. **Average ROI**: Average ROI of generated tests
193
+ - Target: >3.0 for integration, >1.5 for E2E
194
+
195
+ 3. **Budget Utilization**: Actual tests / Budget limit
196
+ - Target: 80-100% (full utilization of valuable test slots)
197
+
198
+ 4. **Defect Detection Rate**: Bugs caught / Total tests
199
+ - Track over time to validate ROI predictions
200
+
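+ As a sketch, these four metrics can be derived from a single selection run (the `SelectionRun` field names are illustrative):
+
+ ```
+ interface SelectionRun {
+   totalCandidates: number;
+   generated: { roi: number }[];
+   budgetLimit: number;
+   bugsCaught: number;
+ }
+
+ function trackingMetrics(run: SelectionRun) {
+   const rois = run.generated.map(t => t.roi);
+   return {
+     selectionRate: run.generated.length / run.totalCandidates,        // target: 25-35%
+     averageRoi: rois.reduce((a, b) => a + b, 0) / (rois.length || 1),
+     budgetUtilization: run.generated.length / run.budgetLimit,        // target: 80-100%
+     defectDetectionRate: run.bugsCaught / (run.generated.length || 1),
+   };
+ }
+ ```
+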
201
+ ### Calibration
202
+
203
+ Periodically review:
204
+ - Are high-ROI tests actually catching bugs?
205
+ - Are cost estimates accurate?
206
+ - Do business value ratings align with stakeholder priorities?
207
+
208
+ **Adjust formula weights based on empirical data**
209
+
210
+ ## Anti-Patterns to Avoid
211
+
212
+ ❌ **Gaming the System**:
213
+ - Inflating business value scores to justify favorite tests
214
+ - Ignoring ROI when it contradicts intuition
215
+ - Cherry-picking ROI calculation only for preferred tests
216
+
217
+ ✅ **Proper Usage**:
218
+ - Apply ROI calculation consistently to all candidates
219
+ - Document justification when overriding ROI decisions
220
+ - Use empirical data to calibrate scores over time
221
+
222
+ ❌ **Analysis Paralysis**:
223
+ - Spending excessive time on precise ROI calculations
224
+ - Debating single-point differences in scores
225
+ - Treating ROI as exact science rather than decision aid
226
+
227
+ ✅ **Practical Application**:
228
+ - Use ROI for relative prioritization, not absolute precision
229
+ - Focus on order-of-magnitude differences (8.9 vs 0.34)
230
+ - Make quick decisions for obvious high/low ROI cases
@@ -0,0 +1,117 @@
1
+ # General Development Rules
2
+
3
+ ## Basic Principles
4
+
5
+ ✅ **Aggressive Refactoring**
6
+ - Continuously improve code structure and readability
7
+ - Make code changes in small, safe steps
8
+ - Prioritize maintainability over initial implementation speed
9
+
10
+ ❌ **Unused "Just in Case" Code** - YAGNI principle
11
+ - Don't write code for hypothetical future requirements
12
+ - Delete unused functions, variables, and imports immediately
13
+ - Keep codebase lean and focused on current needs
14
+
15
+ ## Comment Writing Rules
16
+
17
+ - **Function Description Focus**: Describe what the code "does", not how it works
18
+ - **No Historical Information**: Do not record development history in comments
19
+ - **Timeless**: Write only content that remains valid whenever read
20
+ - **Conciseness**: Keep explanations to necessary minimum
21
+ - **Explain "Why"**: Comments should explain reasoning, not implementation details
22
+
23
+ ## Function Design
24
+
25
+ **Parameter Management**
26
+ - **0-2 parameters maximum**: Use structured data (object/struct/dict) for 3+ parameters
27
+ ```
28
+ ✅ Good: createUser({name, email, role})
29
+ ❌ Avoid: createUser(name, email, role, department, startDate)
30
+ ```
31
+ *Note: Use your language's idiomatic approach for grouping parameters*
32
+
33
+ **Dependency Injection**
34
+ - **Inject external dependencies explicitly**: Ensure testability and modularity
35
+ - Pass dependencies as parameters (functions, constructors, or other language-appropriate mechanisms)
36
+ - Avoid global state, direct instantiation, or implicit dependencies
37
+ - Prefer interfaces/contracts over concrete implementations where applicable
38
+
39
+ ## Error Handling
40
+
41
+ **Absolute Rule**: Error suppression prohibited. All errors must have log output and appropriate handling.
42
+
43
+ **Layer-Specific Error Handling**
44
+ - **Presentation Layer**: Convert errors to user-friendly messages, log excluding sensitive information
45
+ - **Business Layer**: Detect business rule violations, propagate domain-specific errors
46
+ - **Data Layer**: Convert technical errors to domain errors
47
+
48
+ **Structured Logging and Sensitive Information Protection**
49
+ Never include sensitive information in logs:
50
+ - Passwords, tokens, API keys, secrets
51
+ - Credit card numbers, personal identification numbers
52
+ - Any personally identifiable information (PII)
53
+
54
+ **Asynchronous Error Handling**
55
+ - Use appropriate error handling mechanisms for your language
56
+ - Always log and appropriately propagate errors
57
+ - Set up global error handlers where applicable
58
+
59
+ ## Clean Code Principles
60
+
61
+ ✅ **Recommended Practices**
62
+ - Delete unused code immediately
63
+ - Remove debug statements and temporary logging
64
+ - Use meaningful variable and function names
65
+ - Keep functions small and focused on single responsibility
66
+
67
+ ❌ **Avoid These Practices**
68
+ - Commented-out code (use version control for history)
69
+ - Magic numbers without explanation
70
+ - Deep nesting (prefer early returns)
71
+ - Functions that do multiple unrelated things
72
+
73
+ ## Refactoring Techniques
74
+
75
+ **Basic Policy**
76
+ - **Small Steps**: Maintain always-working state through gradual improvements
77
+ - **Safe Changes**: Minimize the scope of changes at once
78
+ - **Behavior Guarantee**: Ensure existing behavior remains unchanged while proceeding
79
+
80
+ **Implementation Procedure**
81
+ 1. Understand Current State
82
+ 2. Make Gradual Changes
83
+ 3. Verify Behavior
84
+ 4. Final Validation
85
+
86
+ **Priority Order**
87
+ 1. Duplicate Code Removal
88
+ 2. Large Function Division
89
+ 3. Complex Conditional Branch Simplification
90
+ 4. Architecture Improvement
91
+
92
+ ## Performance Considerations
93
+
94
+ **General Principles**
95
+ - Measure before optimizing (avoid premature optimization)
96
+ - Focus on algorithmic complexity over micro-optimizations
97
+ - Consider memory usage, especially with large datasets
98
+ - Use appropriate data structures for the use case
99
+
100
+ **Resource Management**
101
+ - Properly close files, connections, and other resources
102
+ - Be mindful of memory leaks in long-running applications
103
+ - Use efficient algorithms for data processing
104
+
105
+ ## Code Organization
106
+
107
+ **File Structure**
108
+ - Group related functionality together
109
+ - Separate concerns (business logic, data access, presentation)
110
+ - Use consistent naming conventions throughout the project
111
+ - Keep configuration separate from business logic
112
+
113
+ **Modularity**
114
+ - Write small, focused modules/functions
115
+ - Minimize dependencies between modules
116
+ - Use clear interfaces between components
117
+ - Follow single responsibility principle
@@ -0,0 +1,257 @@
1
+ # General Testing Rules
2
+
3
+ ## TDD Process [MANDATORY for all code changes]
4
+
5
+ **Execute this process for every code change:**
6
+
7
+ ### RED Phase
8
+ 1. Write test that defines expected behavior
9
+ 2. Run test
10
+ 3. Confirm test FAILS (if it passes, the test is wrong)
11
+
12
+ ### GREEN Phase
13
+ 1. Write MINIMAL code to make test pass
14
+ 2. Run test
15
+ 3. Confirm test PASSES
16
+
17
+ ### REFACTOR Phase
18
+ 1. Improve code quality
19
+ 2. Run test
20
+ 3. Confirm test STILL PASSES
21
+
22
+ ### VERIFY Phase [MANDATORY - 0 ERRORS REQUIRED]
23
+ 1. Execute ALL quality check commands for your language/project
24
+ 2. Fix any errors until ALL commands pass with 0 errors
25
+ 3. Confirm no regressions
26
+ 4. ENFORCEMENT: Cannot proceed with ANY errors or warnings
27
+
28
+ **Exceptions (no TDD required):**
29
+ - Pure configuration files
30
+ - Documentation only
31
+ - Emergency fixes (but add tests immediately after)
32
+
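+ A minimal sketch of one RED → GREEN cycle, assuming a Vitest-style runner (the `applyDiscount` function is illustrative and shown in the same block for brevity):
+
+ ```
+ import { describe, expect, it } from 'vitest';
+
+ // GREEN phase: the minimal implementation (would normally live in its own module)
+ export function applyDiscount(amount: number, rate: number): number {
+   return amount * (1 - rate);
+ }
+
+ // RED phase: this test is written first and fails while applyDiscount is unimplemented
+ describe('applyDiscount', () => {
+   it('applies a 50% discount to the given amount', () => {
+     expect(applyDiscount(200, 0.5)).toBe(100);
+   });
+ });
+
+ // REFACTOR phase: improve structure, re-run the test, then run all quality checks (VERIFY)
+ ```
+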
33
+ ## Basic Testing Policy
34
+
35
+ ### Quality Requirements
36
+ - **Coverage**: Unit test coverage must be 80% or higher
37
+ - **Independence**: Each test can run independently
38
+ - **Reproducibility**: Tests are environment-independent
39
+
40
+ ### Coverage Requirements
41
+ **Mandatory**: Unit test coverage must be 80% or higher
42
+ **Metrics**: Statements, Branches, Functions, Lines
43
+
44
+ ### Test Types and Scope
45
+ 1. **Unit Tests**
46
+ - Verify behavior of individual units (functions, modules, or components)
47
+ - Mock all external dependencies
48
+ - Fast execution (milliseconds)
49
+
50
+ 2. **Integration Tests**
51
+ - Verify coordination between multiple components
52
+ - Use actual dependencies when appropriate
53
+ - Test real system interactions
54
+
55
+ 3. **E2E Tests (End-to-End Tests)**
56
+ - Verify complete user workflows across entire system
57
+ - Test real-world scenarios with all components integrated
58
+ - Validate system behavior from user perspective
59
+ - Ensure business requirements are met end-to-end
60
+
61
+ 4. **Cross-functional Verification in E2E Tests** [MANDATORY for feature modifications]
62
+
63
+ **Purpose**: Prevent regression and ensure existing features remain stable when introducing new features or modifications.
64
+
65
+ **When Required**:
66
+ - Adding new features that interact with existing components
67
+ - Modifying core business logic or workflows
68
+ - Changing shared resources or data structures
69
+ - Updating APIs or integration points
70
+
71
+ **Integration Point Analysis**:
72
+ - **High Impact**: Changes to core process flows, breaking changes, or workflow modifications
73
+ - Mandatory comprehensive E2E test coverage
74
+ - Full regression test suite required
75
+ - Performance benchmarking before/after
76
+
77
+ - **Medium Impact**: Data usage modifications, shared state changes, or new dependencies
78
+ - Integration tests minimum requirement
79
+ - Targeted E2E tests for affected workflows
80
+ - Edge case coverage mandatory
81
+
82
+ - **Low Impact**: Read-only operations, logging additions, or monitoring hooks
83
+ - Unit test coverage sufficient
84
+ - Smoke tests for integration points
85
+
86
+ **Verification Pattern**:
87
+ 1. **Establish Baseline**
88
+ - Test and document existing feature behavior
89
+ - Capture performance metrics
90
+ - Record expected outputs
91
+
92
+ 2. **Apply Changes**
93
+ - Deploy or enable new feature
94
+ - Maintain feature flags for rollback capability
95
+
96
+ 3. **Verify Existing Features**
97
+ - Confirm existing features still function correctly
98
+ - Compare against baseline metrics
99
+ - Validate no unexpected side effects
100
+
101
+ 4. **Measure Performance** (NOT long-term stability tests)
102
+ - Response times within acceptable limits (project-specific)
103
+ - Resource usage remains stable
104
+ - No memory leaks or degradation
105
+
106
+ **Success Criteria**:
107
+ - Zero breaking changes in existing workflows
108
+ - Performance degradation within project-defined acceptable limits
109
+ - No new errors in previously stable features
110
+ - All integration points maintain expected contracts
111
+ - Backward compatibility preserved where required
112
+
113
+ **Documentation Requirements**:
114
+ - Map all integration points in Design Doc
115
+ - Document test coverage for each impact level
116
+
117
+ ## Test Design Principles
118
+
119
+ ### Test Case Structure
120
+ - Tests consist of three stages: **Setup, Execute, Verify** (also known as Arrange-Act-Assert or Given-When-Then)
121
+ - Clear naming that shows purpose of each test
122
+ - One test case verifies only one behavior
123
+ - Test names should describe expected behavior, not implementation
124
+
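+ A short sketch of the three-stage structure, assuming a Vitest-style runner (`cartTotal` is illustrative):
+
+ ```
+ import { expect, it } from 'vitest';
+
+ // Function under test (illustrative)
+ function cartTotal(items: { price: number }[]): number {
+   return items.reduce((sum, item) => sum + item.price, 0);
+ }
+
+ it('returns the sum of item prices as the cart total', () => {
+   // Setup (Arrange)
+   const items = [{ price: 30 }, { price: 12 }];
+
+   // Execute (Act)
+   const total = cartTotal(items);
+
+   // Verify (Assert): one behavior, one literal expectation
+   expect(total).toBe(42);
+ });
+ ```
+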
125
+ ### Test Data Management
126
+ - Manage test data in dedicated directories
127
+ - Define test-specific configuration values
128
+ - Always mock sensitive information (passwords, tokens, API keys)
129
+ - Keep test data minimal, using only data directly related to test case verification
130
+
131
+ ### Test Independence
132
+ - Each test should be able to run in isolation
133
+ - Tests should not depend on execution order
134
+ - Clean up test state after each test
135
+ - Avoid shared mutable state between tests
136
+
137
+ ## Mock and Stub Usage Policy
138
+
139
+ ✅ **Recommended: Mock external dependencies in unit tests**
140
+ - Benefit: Ensures test independence and reproducibility
141
+ - Practice: Mock databases, APIs, file systems, and other external dependencies
142
+ - Use framework-appropriate mocking tools
143
+
144
+ ❌ **Avoid: Actual external connections in unit tests**
145
+ - Reason: Slows tests down and causes environment-dependent problems
146
+ - Exception: Integration tests that specifically test external integration
147
+
148
+ ### Mock Decision Criteria
149
+ | Mock Characteristics | Recommended Approach |
150
+ |---------------------|-----------------|
151
+ | **Simple and stable** | Consolidate in common helpers |
152
+ | **Complex or frequently changing** | Individual implementation |
153
+ | **Duplicated in 3+ places** | Consider consolidation |
154
+ | **Test-specific logic** | Individual implementation |
155
+
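+ A sketch of this policy in practice, mocking only the external dependency while exercising real internal logic (assumes a Vitest-style `vi.fn`; `MailClient` and `notifyUser` are illustrative):
+
+ ```
+ import { expect, it, vi } from 'vitest';
+
+ interface MailClient { send(to: string, body: string): Promise<void>; }
+
+ // Real internal logic; only the external mail API gets mocked
+ async function notifyUser(mail: MailClient, email: string): Promise<boolean> {
+   await mail.send(email, 'Your order has shipped');
+   return true;
+ }
+
+ it('sends a shipping notification to the user', async () => {
+   const mail: MailClient = { send: vi.fn().mockResolvedValue(undefined) };
+
+   const result = await notifyUser(mail, 'user@example.com');
+
+   expect(result).toBe(true);
+   expect(mail.send).toHaveBeenCalledWith('user@example.com', 'Your order has shipped');
+ });
+ ```
+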
156
+ ## Test Granularity Principles
157
+
158
+ ### Core Principle: Observable Behavior Only
159
+ **MUST Test**:
160
+ - Public APIs and contracts
161
+ - Return values and outputs
162
+ - Exceptions and error conditions
163
+ - External calls and side effects
164
+ - Persisted state changes
165
+
166
+ **MUST NOT Test**:
167
+ - Internal implementation details not exposed publicly
168
+ - Internal state that's not observable from outside
169
+ - Algorithm implementation details
170
+ - Framework/library internals
171
+
172
+ ### Test Failure Response Decision Criteria
173
+
174
+ **Fix tests when:**
175
+ - Expected values are wrong
176
+ - Tests reference non-existent features
177
+ - Tests depend on implementation details
178
+ - Tests were written only for coverage
179
+
180
+ **Fix implementation when:**
181
+ - Tests represent valid specifications
182
+ - Business logic requirements have changed
183
+ - Important edge cases are failing
184
+
185
+ **When in doubt**: Confirm with stakeholders or domain experts
186
+
187
+ ## Test Implementation Best Practices
188
+
189
+ ### Naming Conventions
190
+ - Test files: Follow your language/framework conventions
191
+ - Test suites: Names describing target features or situations
192
+ - Test cases: Names describing expected behavior (not implementation)
193
+
194
+ ### Test Code Quality Rules
195
+
196
+ ✅ **Recommended: Keep all tests always active**
197
+ - Benefit: Guarantees test suite completeness
198
+ - Practice: Fix problematic tests and activate them
199
+
200
+ ❌ **Avoid: Skipping or commenting out tests**
201
+ - Reason: Creates test gaps and incomplete quality checks
202
+ - Solution: Either fix the test or delete it completely if it is truly unnecessary
203
+
204
+ ## Test Quality Criteria [MANDATORY]
205
+
206
+ 1. **Boundary coverage**: Include empty/zero/max/error cases with happy paths
207
+ 2. **Literal expectations**: Use literal values in assertions, not computed expressions
208
+ 3. **Result verification**: Assert return values and state, not call order
209
+ 4. **Meaningful assertions**: Every test must have at least one assertion
210
+ 5. **Mock external I/O only**: Mock DB/API/filesystem, use real internal utilities
211
+
212
+ ### Test Helper Guidelines
213
+
214
+ **Basic Principles**
215
+ Test helpers should reduce duplication and improve maintainability.
216
+
217
+ **Usage Examples**
218
+ - Builder patterns for test data creation (see the sketch below)
219
+ - Custom assertions for domain-specific validation
220
+ - Shared setup/teardown utilities
221
+ - Common mock configurations
222
+
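+ A small builder-style helper as a sketch (the `User` shape and defaults are illustrative):
+
+ ```
+ interface User { id: string; name: string; role: 'admin' | 'member'; active: boolean; }
+
+ // Sensible defaults so each test states only the fields it actually cares about
+ function buildUser(overrides: Partial<User> = {}): User {
+   return { id: 'user-1', name: 'Test User', role: 'member', active: true, ...overrides };
+ }
+
+ // Usage in a test: only the relevant field is specified
+ const adminUser = buildUser({ role: 'admin' });
+ ```
+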
223
+ ## Quality Check Commands [MANDATORY for VERIFY phase]
224
+
225
+ **Execute quality checks appropriate for your language and project setup:**
226
+
227
+ ### Required Quality Checks
228
+ Your project MUST have mechanisms to verify:
229
+
230
+ 1. **All Tests Pass**
231
+ - Unit tests execute successfully
232
+ - Integration tests (if applicable) pass
233
+ - No test failures or errors
234
+
235
+ 2. **Code Builds Successfully**
236
+ - Compilation succeeds (for compiled languages)
237
+ - No build errors or warnings
238
+
239
+ 3. **Code Style Compliance**
240
+ - Linting rules are satisfied
241
+ - Formatting standards are met
242
+ - Style guide adherence verified
243
+
244
+ 4. **Type Safety** (for typed languages)
245
+ - Type checking passes
246
+ - No type errors or warnings
247
+
248
+ ### Implementation Guidelines
249
+ - **Identify Your Tools**: Use your project's existing quality tools (test runners, linters, formatters, type checkers)
250
+ - **Zero Error Policy**: ALL quality checks must pass with 0 errors before task completion
251
+ - **Document Execution**: Note which quality checks were run and their results
252
+ - **Project-Specific**: Adapt to your specific language, framework, and tooling setup
253
+
254
+ ### ENFORCEMENT
255
+ - Cannot proceed with task completion if ANY quality check fails
256
+ - Must fix all errors and warnings before marking task complete
257
+ - If your project lacks certain quality tools, establish them or document the gap