@intentsolutionsio/chaos-engineering-toolkit 1.0.0 → 1.0.2
package/README.md
CHANGED
@@ -11,12 +11,14 @@ Chaos testing for resilience with failure injection, latency simulation, and sys
 ## Usage
 
 The chaos engineering agent activates automatically when discussing:
+
 - System resilience testing
 - Failure injection strategies
 - Chaos experiments (GameDays)
 - Recovery mechanism validation
 
 Or invoke directly in conversation:
+
 ```
 "Help me design a chaos experiment to test our payment service resilience"
 ```
package/agents/chaos-engineer.md
CHANGED
@@ -17,6 +17,7 @@ You are a chaos engineering specialist focused on testing system resilience thro
 ## When to Activate
 
 Activate when users need to:
+
 - Test system resilience and fault tolerance
 - Design chaos experiments (GameDays)
 - Implement failure injection strategies
@@ -27,7 +28,9 @@ Activate when users need to:
 ## Your Approach
 
 ### 1. Identify Critical Paths
+
 Analyze system architecture to identify:
+
 - Single points of failure
 - Critical dependencies
 - High-value user flows
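
The critical-path step in the hunk above can be illustrated with a minimal sketch. It assumes a hypothetical service inventory (`replicas`) and a map of user flows to the services they touch (`flows`); neither name comes from the package, and replica count is only a crude proxy for a single point of failure.

```python
from typing import Dict, List, Set


def single_points_of_failure(
    replicas: Dict[str, int],
    flows: Dict[str, List[str]],
) -> Dict[str, Set[str]]:
    """Map each single-replica service to the user flows that depend on it."""
    result: Dict[str, Set[str]] = {}
    for flow, services in flows.items():
        for service in services:
            # Assumption: a service missing from `replicas` has one replica.
            if replicas.get(service, 1) <= 1:
                result.setdefault(service, set()).add(flow)
    return result


# Hypothetical inventory, for illustration only.
replicas = {"api": 3, "payments-db": 1, "cache": 2, "auth": 1}
flows = {
    "checkout": ["api", "auth", "payments-db"],
    "browse": ["api", "cache"],
}
spofs = single_points_of_failure(replicas, flows)
```

Services that show up in `spofs` for a high-value flow are reasonable first candidates for a failure-injection experiment.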
@@ -67,6 +70,7 @@ Create experiments following the scientific method:
 ### 3. Implement Failure Injection
 
 Provide specific implementation for tools like:
+
 - **Chaos Monkey** (random instance termination)
 - **Latency Monkey** (network delays)
 - **Chaos Mesh** (Kubernetes chaos)
@@ -99,6 +103,7 @@ EOF
 ### 5. Analyze Results
 
 Generate reports showing:
+
 - System behavior during failure
 - Recovery time and patterns
 - SLO violations
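
As a rough illustration of the report items listed in this hunk, the sketch below derives recovery time and SLO violations from a list of time-ordered samples. The `Sample` type, the field names, and the 500 ms latency SLO are assumptions made for the example and are not taken from the package.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    t: float           # seconds since the experiment started
    latency_ms: float  # observed request latency
    healthy: bool      # whether the health check passed


def recovery_time(samples: List[Sample], fault_at: float) -> Optional[float]:
    """Seconds from fault injection to the first healthy sample that
    follows an unhealthy one. Returns None if the system never recovered
    (or never became unhealthy). Assumes samples are sorted by time."""
    seen_unhealthy = False
    for s in samples:
        if s.t < fault_at:
            continue
        if not s.healthy:
            seen_unhealthy = True
        elif seen_unhealthy:
            return s.t - fault_at
    return None


def slo_violations(samples: List[Sample], latency_slo_ms: float) -> List[Sample]:
    """Samples whose latency exceeded the SLO threshold."""
    return [s for s in samples if s.latency_ms > latency_slo_ms]


# Hypothetical measurements: the fault is injected at t=10.
samples = [
    Sample(0, 100, True),
    Sample(10, 900, False),
    Sample(20, 1200, False),
    Sample(30, 300, True),
    Sample(40, 110, True),
]
recovered_after = recovery_time(samples, fault_at=10.0)
violations = slo_violations(samples, latency_slo_ms=500)
```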
@@ -153,6 +158,7 @@ Generate reports showing:
 ## Chaos Patterns
 
 ### Network Chaos
+
 - Latency injection
 - Packet loss
 - Connection termination
@@ -160,12 +166,14 @@ Generate reports showing:
 - Bandwidth limits
 
 ### Resource Chaos
+
 - CPU saturation
 - Memory exhaustion
 - Disk I/O limits
 - Connection pool exhaustion
 
 ### Application Chaos
+
 - Process termination
 - Dependency failures
 - Configuration errors
@@ -173,6 +181,7 @@ Generate reports showing:
 - Corrupt data
 
 ### Infrastructure Chaos
+
 - Instance termination
 - AZ failures
 - Region outages
@@ -182,6 +191,7 @@ Generate reports showing:
 ## Safety Guidelines
 
 Always ensure:
+
 1. **Gradual rollout**: Start with 1% traffic, increase slowly
 2. **Clear abort conditions**: Define when to stop experiment
 3. **Monitoring in place**: Track all critical metrics
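
The three guidelines visible in this hunk can be sketched together as a ramp loop. This is a minimal illustration under stated assumptions: `run_gradual_rollout`, the step percentages, the error-rate metric, and the 5% abort threshold are all hypothetical and not part of the package.

```python
from typing import Callable, Iterable, Optional


def run_gradual_rollout(
    steps: Iterable[float],
    measure_error_rate: Callable[[float], float],
    abort_threshold: float = 0.05,
) -> Optional[float]:
    """Ramp fault injection through `steps` (percent of traffic).

    Returns the percentage at which the experiment was aborted, or None
    if every step stayed at or below the abort threshold.
    """
    for percent in steps:
        # Monitoring in place: read the critical metric at this exposure.
        error_rate = measure_error_rate(percent)
        # Clear abort condition: stop as soon as the threshold is crossed.
        if error_rate > abort_threshold:
            return percent
    return None


def fake_error_rate(percent: float) -> float:
    """Hypothetical metric source: error rate grows with exposure."""
    return percent / 400.0


# Gradual rollout: start at 1% of traffic and increase slowly.
aborted_at = run_gradual_rollout([1, 5, 25, 50, 100], fake_error_rate)
```

In a real experiment, `measure_error_rate` would query the monitoring system after applying the fault at the given exposure, and an abort would also trigger rollback of the injected fault.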
package/package.json
CHANGED
@@ -1,16 +1,20 @@
 ---
 name: running-chaos-tests
-description:
-
+description: 'Execute chaos engineering experiments to test system resilience.
+
 Use when performing specialized testing.
+
 Trigger with phrases like "run chaos tests", "test resilience", or "inject failures".
 
+'
 allowed-tools: Read, Write, Edit, Grep, Glob, Bash(test:chaos-*)
 version: 1.0.0
 author: Jeremy Longshore <jeremy@intentsolutions.io>
 license: MIT
-
-
+tags:
+- testing
+- chaos-tests
+compatibility: Designed for Claude Code, also compatible with Codex and OpenClaw
 ---
 # Chaos Engineering Toolkit
 
@@ -80,6 +84,7 @@ Execute controlled chaos engineering experiments to test system resilience, faul
 ## Examples
 
 **toxiproxy network latency injection:**
+
 ```bash
 set -euo pipefail
 # Create a proxy for the database connection
@@ -96,6 +101,7 @@ toxiproxy-cli toxic remove postgres_proxy -n latency_downstream
 ```
 
 **Kubernetes pod kill experiment (Litmus Chaos):**
+
 ```yaml
 apiVersion: litmuschaos.io/v1alpha1
 kind: ChaosEngine
@@ -120,6 +126,7 @@ spec:
 ```
 
 **Custom chaos script (process kill and verify recovery):**
+
 ```bash
 #!/bin/bash
 set -euo pipefail
@@ -152,4 +159,4 @@ done
 - Litmus Chaos: https://litmuschaos.io/
 - Chaos Mesh (Kubernetes): https://chaos-mesh.org/
 - Pumba (Docker chaos): https://github.com/alexei-led/pumba
-- Netflix Chaos Engineering:
+- Netflix Chaos Engineering: