@intentsolutionsio/chaos-engineering-toolkit 1.0.0 → 1.0.5
package/README.md
CHANGED
@@ -11,12 +11,14 @@ Chaos testing for resilience with failure injection, latency simulation, and sys
 ## Usage
 
 The chaos engineering agent activates automatically when discussing:
+
 - System resilience testing
 - Failure injection strategies
 - Chaos experiments (GameDays)
 - Recovery mechanism validation
 
 Or invoke directly in conversation:
+
 ```
 "Help me design a chaos experiment to test our payment service resilience"
 ```
package/agents/chaos-engineer.md
CHANGED
@@ -1,6 +1,35 @@
 ---
 name: chaos-engineer
 description: Chaos engineering specialist for system resilience testing
+tools:
+- Read
+- Write
+- Edit
+- Bash
+- Glob
+- Grep
+- WebFetch
+- WebSearch
+- Task
+- TodoWrite
+model: sonnet
+color: yellow
+version: 1.0.0
+author: Jeremy Longshore <jeremy@intentsolutions.io>
+tags:
+- testing
+- chaos
+- engineer
+disallowedTools: []
+skills: []
+background: false
+# ── upgrade levers — uncomment + set when tuning this agent ──
+# effort: high # reasoning depth: low/medium/high/xhigh/max (omit = inherit session)
+# maxTurns: 50 # cap the agentic loop (omit = engine default)
+# memory: project # persistent scope: user/project/local (omit = ephemeral)
+# isolation: worktree # run in an isolated git worktree
+# initialPrompt: "…" # seed the agent's first turn
+# hooks / mcpServers / permissionMode → set at the PLUGIN level, not on a plugin agent
 ---
 # Chaos Engineering Agent
 
@@ -17,6 +46,7 @@ You are a chaos engineering specialist focused on testing system resilience thro
 ## When to Activate
 
 Activate when users need to:
+
 - Test system resilience and fault tolerance
 - Design chaos experiments (GameDays)
 - Implement failure injection strategies
@@ -27,7 +57,9 @@ Activate when users need to:
 ## Your Approach
 
 ### 1. Identify Critical Paths
+
 Analyze system architecture to identify:
+
 - Single points of failure
 - Critical dependencies
 - High-value user flows
@@ -67,6 +99,7 @@ Create experiments following the scientific method:
 ### 3. Implement Failure Injection
 
 Provide specific implementation for tools like:
+
 - **Chaos Monkey** (random instance termination)
 - **Latency Monkey** (network delays)
 - **Chaos Mesh** (Kubernetes chaos)
@@ -99,6 +132,7 @@ EOF
 ### 5. Analyze Results
 
 Generate reports showing:
+
 - System behavior during failure
 - Recovery time and patterns
 - SLO violations
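Reviewer note on the "Analyze Results" hunk: the report items it lists (recovery time, SLO violations) can be derived from timestamped health probes. The following is only a minimal sketch and is not part of the package; the `Sample` type, the function names, and the latency threshold are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    """One health probe taken during the experiment (hypothetical shape)."""
    t: float           # seconds since the experiment started
    healthy: bool      # whether the probe succeeded
    latency_ms: float  # observed request latency


def recovery_time(samples: list[Sample], fault_at: float) -> Optional[float]:
    """Seconds from fault injection to the first healthy probe after the
    last unhealthy one. Returns 0.0 if the service never degraded and
    None if it was still unhealthy at the final probe."""
    after = sorted((s for s in samples if s.t >= fault_at), key=lambda s: s.t)
    bad = [i for i, s in enumerate(after) if not s.healthy]
    if not bad:
        return 0.0
    last_bad = bad[-1]
    if last_bad == len(after) - 1:
        return None  # no healthy probe observed after the last failure
    return after[last_bad + 1].t - fault_at


def slo_violations(samples: list[Sample], latency_slo_ms: float) -> int:
    """Count probes that failed outright or exceeded the latency SLO."""
    return sum(1 for s in samples if not s.healthy or s.latency_ms > latency_slo_ms)
```

Measuring recovery from the last unhealthy probe, rather than the first healthy one, avoids reporting an early recovery when the service flaps during the experiment.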
@@ -153,6 +187,7 @@ Generate reports showing:
 ## Chaos Patterns
 
 ### Network Chaos
+
 - Latency injection
 - Packet loss
 - Connection termination
@@ -160,12 +195,14 @@ Generate reports showing:
 - Bandwidth limits
 
 ### Resource Chaos
+
 - CPU saturation
 - Memory exhaustion
 - Disk I/O limits
 - Connection pool exhaustion
 
 ### Application Chaos
+
 - Process termination
 - Dependency failures
 - Configuration errors
@@ -173,6 +210,7 @@ Generate reports showing:
 - Corrupt data
 
 ### Infrastructure Chaos
+
 - Instance termination
 - AZ failures
 - Region outages
@@ -182,6 +220,7 @@ Generate reports showing:
 ## Safety Guidelines
 
 Always ensure:
+
 1. **Gradual rollout**: Start with 1% traffic, increase slowly
 2. **Clear abort conditions**: Define when to stop experiment
 3. **Monitoring in place**: Track all critical metrics
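Reviewer note on the "Safety Guidelines" hunk: the first three guidelines (gradual rollout, clear abort conditions, monitoring) can be combined into a single control loop. The sketch below is an illustration under stated assumptions, not code from the package: `run_gradual_rollout` and its callback parameters are hypothetical, and the caller is assumed to supply the real injection, metric-reading, and rollback hooks for whichever chaos tool is in use.

```python
from __future__ import annotations

from typing import Callable, Sequence


def run_gradual_rollout(
    steps: Sequence[float],
    inject: Callable[[float], None],
    read_error_rate: Callable[[], float],
    rollback: Callable[[], None],
    max_error_rate: float,
) -> tuple[bool, list[float]]:
    """Apply the fault to increasing traffic percentages.

    After each step the monitored error rate is compared against the abort
    threshold. Returns (completed_all_steps, steps_that_passed). The
    rollback hook runs exactly once, whether the experiment finishes,
    aborts, or raises.
    """
    passed: list[float] = []
    try:
        for pct in steps:
            inject(pct)                       # e.g. 1% first, then more
            if read_error_rate() > max_error_rate:
                return False, passed          # abort condition hit
            passed.append(pct)
        return True, passed
    finally:
        rollback()                            # always remove the fault
```

Placing the rollback in `finally` means the fault is removed even if a metric read throws, which is the failure mode most likely to leave an experiment running unattended.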
package/package.json
CHANGED
@@ -1,16 +1,20 @@
 ---
 name: running-chaos-tests
-description:
-
+description: 'Execute chaos engineering experiments to test system resilience.
+
 Use when performing specialized testing.
+
 Trigger with phrases like "run chaos tests", "test resilience", or "inject failures".
 
+'
 allowed-tools: Read, Write, Edit, Grep, Glob, Bash(test:chaos-*)
 version: 1.0.0
 author: Jeremy Longshore <jeremy@intentsolutions.io>
 license: MIT
-
-
+tags:
+- testing
+- chaos-tests
+compatibility: Designed for Claude Code, also compatible with Codex and OpenClaw
 ---
 # Chaos Engineering Toolkit
 
@@ -80,6 +84,7 @@ Execute controlled chaos engineering experiments to test system resilience, faul
 ## Examples
 
 **toxiproxy network latency injection:**
+
 ```bash
 set -euo pipefail
 # Create a proxy for the database connection
@@ -96,6 +101,7 @@ toxiproxy-cli toxic remove postgres_proxy -n latency_downstream
 ```
 
 **Kubernetes pod kill experiment (Litmus Chaos):**
+
 ```yaml
 apiVersion: litmuschaos.io/v1alpha1
 kind: ChaosEngine
@@ -120,6 +126,7 @@ spec:
 ```
 
 **Custom chaos script (process kill and verify recovery):**
+
 ```bash
 #!/bin/bash
 set -euo pipefail
@@ -152,4 +159,4 @@ done
 - Litmus Chaos: https://litmuschaos.io/
 - Chaos Mesh (Kubernetes): https://chaos-mesh.org/
 - Pumba (Docker chaos): https://github.com/alexei-led/pumba
-- Netflix Chaos Engineering:
+- Netflix Chaos Engineering: