agentic-code 0.5.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.agents/context-maps/task-rule-matrix.yaml +248 -0
- package/.agents/rules/contextual/architecture/implementation-approach.md +202 -0
- package/.agents/rules/core/ai-development-guide.md +272 -0
- package/.agents/rules/core/documentation-criteria.md +184 -0
- package/.agents/rules/core/integration-e2e-testing.md +76 -0
- package/.agents/rules/core/metacognition.md +153 -0
- package/.agents/rules/core/testing-strategy.md +230 -0
- package/.agents/rules/language/general/rules.md +117 -0
- package/.agents/rules/language/general/testing.md +257 -0
- package/.agents/rules/language/typescript/rules.md +178 -0
- package/.agents/rules/language/typescript/testing.md +284 -0
- package/.agents/tasks/acceptance-test-generation.md +461 -0
- package/.agents/tasks/code-review.md +207 -0
- package/.agents/tasks/implementation.md +199 -0
- package/.agents/tasks/integration-test-review.md +132 -0
- package/.agents/tasks/prd-creation.md +336 -0
- package/.agents/tasks/quality-assurance.md +219 -0
- package/.agents/tasks/task-analysis.md +263 -0
- package/.agents/tasks/technical-design.md +432 -0
- package/.agents/tasks/technical-document-review.md +254 -0
- package/.agents/tasks/work-planning.md +239 -0
- package/.agents/workflows/agentic-coding.md +333 -0
- package/AGENTS.md +156 -0
- package/LICENSE +21 -0
- package/README.md +268 -0
- package/bin/cli.js +117 -0
- package/package.json +45 -0
- package/scripts/setup.js +82 -0
@@ -0,0 +1,230 @@
# Test Strategy: ROI-Based Selection

## Core Principle: Maximum Coverage, Minimum Tests

**Philosophy**: 10 reliable tests > 100 unmaintained tests

Quality over quantity - focus resources on high-value tests that provide maximum coverage with minimum maintenance burden.

## ROI Calculation Framework

### ROI Formula

```
ROI Score = (Business Value × User Frequency + Legal Requirement × 10 + Defect Detection)
            / (Creation Cost + Execution Cost + Maintenance Cost)
```

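**Illustrative sketch** (hypothetical names, not part of this package): the formula above expressed in TypeScript, with the cost terms taken from the cost table below.

```typescript
// Hypothetical sketch of the ROI formula above (illustrative names only).
interface RoiInput {
  businessValue: number;   // 0-10
  userFrequency: number;   // 0-10
  legalRequirement: 0 | 1; // legally mandated -> 1, otherwise 0
  defectDetection: number; // 0-10
  creationCost: number;    // from the cost table below
  executionCost: number;
  maintenanceCost: number;
}

function roiScore(c: RoiInput): number {
  const value =
    c.businessValue * c.userFrequency + c.legalRequirement * 10 + c.defectDetection;
  const cost = c.creationCost + c.executionCost + c.maintenanceCost;
  return value / cost;
}

// Example 1 below (payment processing, integration-level costs 3 + 5 + 3):
// roiScore({ businessValue: 10, userFrequency: 9, legalRequirement: 0,
//   defectDetection: 8, creationCost: 3, executionCost: 5, maintenanceCost: 3 })
// => 98 / 11 ≈ 8.9
```
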
### Value Components

**Business Value** (0-10 scale):
- 10: Revenue-critical (payment processing, checkout)
- 8-9: Core business features (user registration, data persistence)
- 5-7: Important secondary features (search, filtering)
- 2-4: Nice-to-have features (UI enhancements)
- 0-1: Cosmetic features

**User Frequency** (0-10 scale):
- 10: Every user, every session (authentication)
- 8-9: >80% of users regularly
- 5-7: 50-80% of users occasionally
- 2-4: <50% of users rarely
- 0-1: Edge case users only

**Legal Requirement** (boolean → 0 or 1):
- 1: Legally mandated (GDPR compliance, data protection)
- 0: Not legally required

**Defect Detection** (0-10 scale):
- 10: High likelihood of catching critical bugs
- 5-7: Moderate likelihood of catching bugs
- 0-4: Low likelihood (simple logic, well-tested patterns)

### Cost Components

**Test Level Cost Table**:

| Test Level | Creation Cost | Execution Cost | Maintenance Cost | Total Cost |
|-------------|---------------|----------------|------------------|------------|
| Unit | 1 | 1 | 1 | 3 |
| Integration | 3 | 5 | 3 | 11 |
| E2E | 10 | 20 | 8 | 38 |

**Cost Rationale**:
- **Unit Tests**: Fast to write, fast to run, rarely break from refactoring
- **Integration Tests**: Moderate setup, slower execution, moderate maintenance
- **E2E Tests**: Complex setup, very slow execution, high brittleness (12x more expensive than unit tests)

### ROI Calculation Examples

**Example 1: Payment Processing Integration Test**
```
Business Value: 10 (revenue-critical)
User Frequency: 9 (90% of users)
Legal Requirement: 0
Defect Detection: 8 (high complexity)

ROI = (10 × 9 + 0 + 8) / 11 = 98 / 11 = 8.9
Decision: HIGH ROI → Generate this test
```

**Example 2: UI Theme Toggle E2E Test**
```
Business Value: 2 (cosmetic feature)
User Frequency: 5 (50% of users)
Legal Requirement: 0
Defect Detection: 3 (simple logic)

ROI = (2 × 5 + 0 + 3) / 38 = 13 / 38 = 0.34
Decision: LOW ROI → Skip this E2E test (consider unit test instead)
```

**Example 3: GDPR Data Deletion E2E Test**
```
Business Value: 8 (critical compliance)
User Frequency: 1 (rare user action)
Legal Requirement: 1 (legally mandated)
Defect Detection: 9 (high consequences if broken)

ROI = (8 × 1 + 1 × 10 + 9) / 38 = 27 / 38 = 0.71
Decision: MEDIUM ROI → Generate (legal requirement justifies cost)
```

## Critical User Journey Definition

Tests with HIGH priority regardless of strict ROI calculation:

### Mandatory Coverage Areas

1. **Revenue-Impacting Flows**
   - Payment processing end-to-end
   - Checkout and order completion
   - Subscription management
   - Purchase confirmation and receipts

2. **Legally Required Flows**
   - GDPR data deletion/export
   - User consent management
   - Data protection compliance
   - Regulatory audit trails

3. **High-Frequency Core Functionality**
   - User authentication/authorization (>80% of users)
   - Core CRUD operations for primary entities
   - Critical business workflows
   - Data integrity for primary data models

**Budget Exception**: Critical User Journeys may exceed standard budget limits with explicit justification.

## Test Selection Guidelines

### Selection Thresholds

**Integration Tests**:
- ROI > 3.0: Strong candidate
- ROI 1.5-3.0: Consider based on available budget
- ROI < 1.5: Skip or convert to unit test

**E2E Tests**:
- ROI > 2.0: Strong candidate
- ROI 1.0-2.0: Consider if Critical User Journey
- ROI < 1.0: Skip (too expensive relative to value)

### Push-Down Analysis

Before generating a higher-level test, ask:

1. **Can this be unit-tested?**
   - YES → Generate unit test instead
   - NO → Continue to integration test consideration

2. **Already covered by integration test?**
   - YES → Don't create E2E version
   - NO → Consider E2E test if ROI justifies

**Example**:
- "Tax calculation accuracy" → Unit test (pure logic)
- "Tax applied to order total" → Integration test (multiple components)
- "User sees correct tax in checkout flow" → E2E test only if Critical User Journey

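**Illustrative sketch** (hypothetical `chooseTestLevel` helper): a rough mapping of the push-down questions above; the E2E threshold echoes the Selection Thresholds section and is illustrative, not normative.

```typescript
// Hypothetical sketch of the push-down questions above.
type TestLevel = 'unit' | 'integration' | 'e2e' | 'skip';

interface PushDownInput {
  unitTestable: boolean;          // can the behavior be verified as isolated logic?
  coveredByIntegration: boolean;  // does an existing integration test verify it?
  isCriticalUserJourney: boolean; // see Critical User Journey Definition above
  roi: number;                    // from the ROI formula
}

function chooseTestLevel(q: PushDownInput): TestLevel {
  // 1. Can this be unit-tested? -> generate a unit test instead
  if (q.unitTestable) return 'unit';
  // 2. Already covered by an integration test? -> don't create an E2E version
  if (q.coveredByIntegration) return 'skip';
  // Otherwise escalate to E2E only when the journey or the ROI justifies the cost
  if (q.isCriticalUserJourney || q.roi > 2.0) return 'e2e';
  return 'integration';
}
```
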
## Deduplication Strategy

Before generating any test:

1. **Search existing test suite** for similar coverage
2. **Check for overlapping scenarios** at different test levels
3. **Identify redundant verifications** already covered elsewhere

**Decision Matrix**:
```
Existing coverage found?
  → Full coverage: Skip new test
  → Partial coverage: Extend existing test
  → No coverage: Generate new test
```

## Application in Test Generation

### Phase 1: Candidate Enumeration
- List all possible test scenarios
- Assign ROI metadata to each candidate

### Phase 2: ROI-Based Selection
1. Calculate ROI for each candidate
2. Apply deduplication checks
3. Apply push-down analysis
4. Sort by ROI (descending)

### Phase 3: Budget Enforcement
- Select top N tests within budget limits
- Document budget usage
- Report selection rationale

**See**: `.agents/tasks/acceptance-test-generation.md` for detailed implementation process

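**Illustrative sketch** (hypothetical `TestCandidate` shape): Phases 2 and 3 condensed into a single selection function.

```typescript
// Hypothetical sketch of ROI-based selection under a budget (illustrative names only).
interface TestCandidate {
  name: string;
  roi: number;         // from the ROI formula
  duplicate: boolean;  // flagged by the deduplication checks
  pushedDown: boolean; // true when push-down analysis chose a cheaper level
}

function selectTests(candidates: TestCandidate[], budget: number): TestCandidate[] {
  return candidates
    .filter((c) => !c.duplicate && !c.pushedDown) // Phase 2: dedup + push-down
    .sort((a, b) => b.roi - a.roi)                // Phase 2: sort by ROI, descending
    .slice(0, budget);                            // Phase 3: keep top N within budget
}

// Example: with a budget of 2, the payment test (8.9) and the GDPR test (0.71)
// would be kept ahead of the theme-toggle test (0.34).
```
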
## Continuous Improvement

### Metrics to Track

1. **Selection Rate**: Tests generated / Total candidates
   - Target: 25-35% (indicates effective filtering)

2. **Average ROI**: Average ROI of generated tests
   - Target: >3.0 for integration, >1.5 for E2E

3. **Budget Utilization**: Actual tests / Budget limit
   - Target: 80-100% (full utilization of valuable test slots)

4. **Defect Detection Rate**: Bugs caught / Total tests
   - Track over time to validate ROI predictions

### Calibration

Periodically review:
- Are high-ROI tests actually catching bugs?
- Are cost estimates accurate?
- Do business value ratings align with stakeholder priorities?

**Adjust formula weights based on empirical data**

## Anti-Patterns to Avoid

❌ **Gaming the System**:
- Inflating business value scores to justify favorite tests
- Ignoring ROI when it contradicts intuition
- Cherry-picking ROI calculation only for preferred tests

✅ **Proper Usage**:
- Apply ROI calculation consistently to all candidates
- Document justification when overriding ROI decisions
- Use empirical data to calibrate scores over time

❌ **Analysis Paralysis**:
- Spending excessive time on precise ROI calculations
- Debating single-point differences in scores
- Treating ROI as exact science rather than decision aid

✅ **Practical Application**:
- Use ROI for relative prioritization, not absolute precision
- Focus on order-of-magnitude differences (8.9 vs 0.34)
- Make quick decisions for obvious high/low ROI cases
@@ -0,0 +1,117 @@
# General Development Rules

## Basic Principles

✅ **Aggressive Refactoring**
- Continuously improve code structure and readability
- Make code changes in small, safe steps
- Prioritize maintainability over initial implementation speed

❌ **Unused "Just in Case" Code** - YAGNI principle
- Don't write code for hypothetical future requirements
- Delete unused functions, variables, and imports immediately
- Keep codebase lean and focused on current needs

## Comment Writing Rules

- **Function Description Focus**: Describe what the code "does", not how it works
- **No Historical Information**: Do not record development history in comments
- **Timeless**: Write only content that remains valid whenever read
- **Conciseness**: Keep explanations to the necessary minimum
- **Explain "Why"**: Comments should explain reasoning, not implementation details

## Function Design

**Parameter Management**
- **0-2 parameters maximum**: Use structured data (object/struct/dict) for 3+ parameters
```
✅ Good: createUser({name, email, role})
❌ Avoid: createUser(name, email, role, department, startDate)
```
*Note: Use your language's idiomatic approach for grouping parameters*

**Dependency Injection**
- **Inject external dependencies explicitly**: Ensure testability and modularity
- Pass dependencies as parameters (functions, constructors, or other language-appropriate mechanisms)
- Avoid global state, direct instantiation, or implicit dependencies
- Prefer interfaces/contracts over concrete implementations where applicable

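**Illustrative sketch** (hypothetical `UserRepository`/`UserService` names, not part of this package): constructor injection behind an interface in TypeScript.

```typescript
// Illustrative sketch: inject the dependency via an interface instead of
// instantiating or importing a concrete client inside the service.
interface UserRepository {
  findById(id: string): Promise<{ id: string; name: string } | null>;
}

class UserService {
  // The repository is passed in, so tests can substitute an in-memory fake.
  constructor(private readonly repository: UserRepository) {}

  async getUserName(id: string): Promise<string> {
    const user = await this.repository.findById(id);
    if (!user) throw new Error(`User not found: ${id}`);
    return user.name;
  }
}

// Production wires in a real implementation; tests pass a stub:
// new UserService({ findById: async () => ({ id: '1', name: 'Ada' }) });
```
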
## Error Handling

**Absolute Rule**: Error suppression is prohibited. All errors must have log output and appropriate handling.

**Layer-Specific Error Handling**
- **Presentation Layer**: Convert errors to user-friendly messages, log excluding sensitive information
- **Business Layer**: Detect business rule violations, propagate domain-specific errors
- **Data Layer**: Convert technical errors to domain errors

**Structured Logging and Sensitive Information Protection**
Never include sensitive information in logs:
- Passwords, tokens, API keys, secrets
- Credit card numbers, personal identification numbers
- Any personally identifiable information (PII)

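**Illustrative sketch** (hypothetical names): a data-layer function that converts a technical failure into a domain error and logs only non-sensitive, structured context.

```typescript
// Illustrative sketch: convert a technical error to a domain error and log
// structured, non-sensitive context only.
class OrderNotFoundError extends Error {
  constructor(public readonly orderId: string) {
    super(`Order not found: ${orderId}`);
  }
}

async function loadOrder(
  orderId: string,
  db: { query(sql: string, params: string[]): Promise<unknown[]> },
  log: (entry: Record<string, unknown>) => void,
): Promise<unknown> {
  try {
    const rows = await db.query('SELECT * FROM orders WHERE id = $1', [orderId]);
    if (rows.length === 0) throw new OrderNotFoundError(orderId);
    return rows[0];
  } catch (error) {
    // Log the failure with safe context (no credentials, tokens, or PII),
    // then propagate a domain-level error instead of the raw driver error.
    log({ event: 'order.load_failed', orderId, reason: (error as Error).message });
    throw error instanceof OrderNotFoundError
      ? error
      : new Error(`Failed to load order ${orderId}`);
  }
}
```
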
**Asynchronous Error Handling**
- Use appropriate error handling mechanisms for your language
- Always log and appropriately propagate errors
- Set up global error handlers where applicable

## Clean Code Principles

✅ **Recommended Practices**
- Delete unused code immediately
- Remove debug statements and temporary logging
- Use meaningful variable and function names
- Keep functions small and focused on single responsibility

❌ **Avoid These Practices**
- Commented-out code (use version control for history)
- Magic numbers without explanation
- Deep nesting (prefer early returns)
- Functions that do multiple unrelated things

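**Illustrative sketch**: the "prefer early returns" point above, shown as the same checks written with deep nesting and with early returns (hypothetical example).

```typescript
// Illustrative only: deep nesting vs. early returns for the same checks.
interface Order { items: string[]; paid: boolean; }

// ❌ Deeply nested version
function shipOrderNested(order: Order | null): string {
  if (order) {
    if (order.paid) {
      if (order.items.length > 0) {
        return 'shipped';
      } else {
        return 'nothing to ship';
      }
    } else {
      return 'awaiting payment';
    }
  } else {
    return 'no order';
  }
}

// ✅ Early-return version: each guard exits immediately
function shipOrder(order: Order | null): string {
  if (!order) return 'no order';
  if (!order.paid) return 'awaiting payment';
  if (order.items.length === 0) return 'nothing to ship';
  return 'shipped';
}
```
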
## Refactoring Techniques

**Basic Policy**
- **Small Steps**: Maintain always-working state through gradual improvements
- **Safe Changes**: Minimize the scope of changes at once
- **Behavior Guarantee**: Ensure existing behavior remains unchanged while proceeding

**Implementation Procedure**
1. Understand Current State
2. Make Gradual Changes
3. Verify Behavior
4. Final Validation

**Priority Order**
1. Duplicate Code Removal
2. Large Function Division
3. Complex Conditional Branch Simplification
4. Architecture Improvement

## Performance Considerations

**General Principles**
- Measure before optimizing (avoid premature optimization)
- Focus on algorithmic complexity over micro-optimizations
- Consider memory usage, especially with large datasets
- Use appropriate data structures for the use case

**Resource Management**
- Properly close files, connections, and other resources
- Be mindful of memory leaks in long-running applications
- Use efficient algorithms for data processing

## Code Organization

**File Structure**
- Group related functionality together
- Separate concerns (business logic, data access, presentation)
- Use consistent naming conventions throughout the project
- Keep configuration separate from business logic

**Modularity**
- Write small, focused modules/functions
- Minimize dependencies between modules
- Use clear interfaces between components
- Follow single responsibility principle
@@ -0,0 +1,257 @@
# General Testing Rules

## TDD Process [MANDATORY for all code changes]

**Execute this process for every code change:**

### RED Phase
1. Write test that defines expected behavior
2. Run test
3. Confirm test FAILS (if it passes, the test is wrong)

### GREEN Phase
1. Write MINIMAL code to make test pass
2. Run test
3. Confirm test PASSES

### REFACTOR Phase
1. Improve code quality
2. Run test
3. Confirm test STILL PASSES

### VERIFY Phase [MANDATORY - 0 ERRORS REQUIRED]
1. Execute ALL quality check commands for your language/project
2. Fix any errors until ALL commands pass with 0 errors
3. Confirm no regressions
4. ENFORCEMENT: Cannot proceed with ANY errors or warnings

**Exceptions (no TDD required):**
- Pure configuration files
- Documentation only
- Emergency fixes (but add tests immediately after)

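**Illustrative sketch** (assumes a Vitest-style runner; `slugify` is a made-up example): one RED → GREEN pass, with the test written and seen to fail before the minimal implementation exists.

```typescript
// Illustrative only; assumes Vitest is available in the project.
import { describe, it, expect } from 'vitest';

// GREEN: the minimal implementation written after the test below failed.
export function slugify(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, '-');
}

// RED first: this test is written before slugify exists, run, and confirmed to fail.
describe('slugify', () => {
  it('converts a title to a lowercase, dash-separated slug', () => {
    expect(slugify('  Hello World  ')).toBe('hello-world');
  });
});
```
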
## Basic Testing Policy

### Quality Requirements
- **Coverage**: Unit test coverage must be 80% or higher
- **Independence**: Each test can run independently
- **Reproducibility**: Tests are environment-independent

### Coverage Requirements
**Mandatory**: Unit test coverage must be 80% or higher
**Metrics**: Statements, Branches, Functions, Lines

### Test Types and Scope
1. **Unit Tests**
   - Verify behavior of individual units (functions, modules, or components)
   - Mock all external dependencies
   - Fast execution (milliseconds)

2. **Integration Tests**
   - Verify coordination between multiple components
   - Use actual dependencies when appropriate
   - Test real system interactions

3. **E2E Tests (End-to-End Tests)**
   - Verify complete user workflows across entire system
   - Test real-world scenarios with all components integrated
   - Validate system behavior from user perspective
   - Ensure business requirements are met end-to-end

4. **Cross-functional Verification in E2E Tests** [MANDATORY for feature modifications]

**Purpose**: Prevent regression and ensure existing features remain stable when introducing new features or modifications.

**When Required**:
- Adding new features that interact with existing components
- Modifying core business logic or workflows
- Changing shared resources or data structures
- Updating APIs or integration points

**Integration Point Analysis**:
- **High Impact**: Changes to core process flows, breaking changes, or workflow modifications
  - Mandatory comprehensive E2E test coverage
  - Full regression test suite required
  - Performance benchmarking before/after

- **Medium Impact**: Data usage modifications, shared state changes, or new dependencies
  - Integration tests minimum requirement
  - Targeted E2E tests for affected workflows
  - Edge case coverage mandatory

- **Low Impact**: Read-only operations, logging additions, or monitoring hooks
  - Unit test coverage sufficient
  - Smoke tests for integration points

**Verification Pattern**:
1. **Establish Baseline**
   - Test and document existing feature behavior
   - Capture performance metrics
   - Record expected outputs

2. **Apply Changes**
   - Deploy or enable new feature
   - Maintain feature flags for rollback capability

3. **Verify Existing Features**
   - Confirm existing features still function correctly
   - Compare against baseline metrics
   - Validate no unexpected side effects

4. **Measure Performance** (NOT long-term stability tests)
   - Response times within acceptable limits (project-specific)
   - Resource usage remains stable
   - No memory leaks or degradation

**Success Criteria**:
- Zero breaking changes in existing workflows
- Performance degradation within project-defined acceptable limits
- No new errors in previously stable features
- All integration points maintain expected contracts
- Backward compatibility preserved where required

**Documentation Requirements**:
- Map all integration points in Design Doc
- Document test coverage for each impact level

## Test Design Principles

### Test Case Structure
- Tests consist of three stages: **Setup, Execute, Verify** (also known as Arrange-Act-Assert or Given-When-Then)
- Clear naming that shows purpose of each test
- One test case verifies only one behavior
- Test names should describe expected behavior, not implementation

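**Illustrative sketch** (assumes a Vitest-style runner; `calculateTotal` is a made-up example): the Setup / Execute / Verify structure with one behavior per test.

```typescript
// Illustrative only; assumes Vitest is available in the project.
import { describe, it, expect } from 'vitest';

function calculateTotal(prices: number[], taxRate: number): number {
  const subtotal = prices.reduce((sum, p) => sum + p, 0);
  return subtotal * (1 + taxRate);
}

describe('calculateTotal', () => {
  it('applies the tax rate to the sum of item prices', () => {
    // Setup (Arrange)
    const prices = [100, 50];
    const taxRate = 0.1;

    // Execute (Act)
    const total = calculateTotal(prices, taxRate);

    // Verify (Assert): one behavior, literal expected value
    expect(total).toBeCloseTo(165);
  });
});
```
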
### Test Data Management
- Manage test data in dedicated directories
- Define test-specific configuration values
- Always mock sensitive information (passwords, tokens, API keys)
- Keep test data minimal, using only data directly related to test case verification

### Test Independence
- Each test should be able to run in isolation
- Tests should not depend on execution order
- Clean up test state after each test
- Avoid shared mutable state between tests

## Mock and Stub Usage Policy

✅ **Recommended: Mock external dependencies in unit tests**
- Merit: Ensures test independence and reproducibility
- Practice: Mock databases, APIs, file systems, and other external dependencies
- Use framework-appropriate mocking tools

❌ **Avoid: Actual external connections in unit tests**
- Reason: Slows test speed and causes environment-dependent problems
- Exception: Integration tests that specifically test external integration

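**Illustrative sketch** (assumes Vitest; names are hypothetical): a unit test that mocks an injected repository instead of connecting to a real database.

```typescript
// Illustrative only; assumes Vitest. The external dependency is injected,
// so the unit test substitutes a mock instead of hitting a real database.
import { describe, it, expect, vi } from 'vitest';

interface UserRepository {
  findById(id: string): Promise<{ id: string; name: string } | null>;
}

class UserService {
  constructor(private readonly repository: UserRepository) {}
  async getUserName(id: string): Promise<string> {
    const user = await this.repository.findById(id);
    if (!user) throw new Error(`User not found: ${id}`);
    return user.name;
  }
}

describe('UserService', () => {
  it('returns the name of an existing user', async () => {
    // The mock stands in for the database-backed repository.
    const repository = {
      findById: vi.fn().mockResolvedValue({ id: '42', name: 'Ada' }),
    };
    const service = new UserService(repository);

    await expect(service.getUserName('42')).resolves.toBe('Ada');
    expect(repository.findById).toHaveBeenCalledWith('42');
  });
});
```
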
### Mock Decision Criteria
| Mock Characteristics | Response Policy |
|---------------------|-----------------|
| **Simple and stable** | Consolidate in common helpers |
| **Complex or frequently changing** | Individual implementation |
| **Duplicated in 3+ places** | Consider consolidation |
| **Test-specific logic** | Individual implementation |

## Test Granularity Principles

### Core Principle: Observable Behavior Only
**MUST Test**:
- Public APIs and contracts
- Return values and outputs
- Exceptions and error conditions
- External calls and side effects
- Persisted state changes

**MUST NOT Test**:
- Internal implementation details not exposed publicly
- Internal state that's not observable from outside
- Algorithm implementation details
- Framework/library internals

### Test Failure Response Decision Criteria

**Fix tests when:**
- Expected values are wrong
- Tests reference non-existent features
- Tests depend on implementation details
- Tests were written only for coverage

**Fix implementation when:**
- Tests represent valid specifications
- Business logic requirements have changed
- Important edge cases are failing

**When in doubt**: Confirm with stakeholders or domain experts

## Test Implementation Best Practices

### Naming Conventions
- Test files: Follow your language/framework conventions
- Test suites: Names describing target features or situations
- Test cases: Names describing expected behavior (not implementation)

### Test Code Quality Rules

✅ **Recommended: Keep all tests always active**
- Merit: Guarantees test suite completeness
- Practice: Fix problematic tests and activate them

❌ **Avoid: Skipping or commenting out tests**
- Reason: Creates test gaps and incomplete quality checks
- Solution: Either fix the test or delete it entirely if truly unnecessary

## Test Quality Criteria [MANDATORY]

1. **Boundary coverage**: Include empty/zero/max/error cases with happy paths
2. **Literal expectations**: Use literal values in assertions, not computed expressions
3. **Result verification**: Assert return values and state, not call order
4. **Meaningful assertions**: Every test must have at least one assertion
5. **Mock external I/O only**: Mock DB/API/filesystem, use real internal utilities

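**Illustrative sketch** (assumes Vitest; `applyDiscount` is a made-up example): criteria 1 and 2 above, with boundary cases beside the happy path and literal expected values rather than re-computed expressions.

```typescript
// Illustrative only; assumes Vitest.
import { describe, it, expect } from 'vitest';

function applyDiscount(price: number, percent: number): number {
  if (percent < 0 || percent > 100) throw new Error('Invalid discount');
  return price - (price * percent) / 100;
}

describe('applyDiscount', () => {
  it('applies a normal discount', () => {
    expect(applyDiscount(200, 25)).toBe(150); // literal expectation, not 200 * 0.75
  });

  it('handles the zero and maximum boundaries', () => {
    expect(applyDiscount(200, 0)).toBe(200);
    expect(applyDiscount(200, 100)).toBe(0);
  });

  it('rejects an out-of-range discount', () => {
    expect(() => applyDiscount(200, 101)).toThrow('Invalid discount');
  });
});
```
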
### Test Helper Guidelines

**Basic Principles**
Test helpers should reduce duplication and improve maintainability.

**Usage Examples**
- Builder patterns for test data creation
- Custom assertions for domain-specific validation
- Shared setup/teardown utilities
- Common mock configurations

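**Illustrative sketch** (hypothetical names): a builder for test data, so each test states only the fields it cares about.

```typescript
// Illustrative only: a small builder so tests override just the relevant fields.
interface User {
  id: string;
  name: string;
  role: 'admin' | 'member';
}

class UserBuilder {
  private user: User = { id: 'user-1', name: 'Test User', role: 'member' };

  withRole(role: User['role']): this {
    this.user.role = role;
    return this;
  }

  withName(name: string): this {
    this.user.name = name;
    return this;
  }

  build(): User {
    return { ...this.user };
  }
}

// Usage in a test: const admin = new UserBuilder().withRole('admin').build();
```
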
## Quality Check Commands [MANDATORY for VERIFY phase]

**Execute quality checks appropriate for your language and project setup:**

### Required Quality Checks
Your project MUST have mechanisms to verify:

1. **All Tests Pass**
   - Unit tests execute successfully
   - Integration tests (if applicable) pass
   - No test failures or errors

2. **Code Builds Successfully**
   - Compilation succeeds (for compiled languages)
   - No build errors or warnings

3. **Code Style Compliance**
   - Linting rules are satisfied
   - Formatting standards are met
   - Style guide adherence verified

4. **Type Safety** (for typed languages)
   - Type checking passes
   - No type errors or warnings

### Implementation Guidelines
- **Identify Your Tools**: Use your project's existing quality tools (test runners, linters, formatters, type checkers)
- **Zero Error Policy**: ALL quality checks must pass with 0 errors before task completion
- **Document Execution**: Note which quality checks were run and their results
- **Project-Specific**: Adapt to your specific language, framework, and tooling setup

### ENFORCEMENT
- Cannot proceed with task completion if ANY quality check fails
- Must fix all errors and warnings before marking task complete
- If your project lacks certain quality tools, establish them or document the gap