async-service-chaos_kitty 0.0.1
- checksums.yaml +7 -0
- data/architecture.md +153 -0
- data/lib/async/service/chaos_kitty/chaos_controller.rb +113 -0
- data/lib/async/service/chaos_kitty/client.rb +22 -0
- data/lib/async/service/chaos_kitty/endpoint.rb +20 -0
- data/lib/async/service/chaos_kitty/floop.rb +104 -0
- data/lib/async/service/chaos_kitty/hairball.rb +101 -0
- data/lib/async/service/chaos_kitty/loop.rb +39 -0
- data/lib/async/service/chaos_kitty/scratch.rb +98 -0
- data/lib/async/service/chaos_kitty/server.rb +119 -0
- data/lib/async/service/chaos_kitty/version.rb +12 -0
- data/lib/async/service/chaos_kitty/victim_controller.rb +93 -0
- data/lib/async/service/chaos_kitty/worker.rb +57 -0
- data/lib/async/service/chaos_kitty/yowl.rb +106 -0
- data/lib/async/service/chaos_kitty/zoomies.rb +101 -0
- data/lib/async/service/chaos_kitty.rb +17 -0
- data/license.md +21 -0
- data/readme.md +150 -0
- data/releases.md +19 -0
- metadata +100 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: f3ea2295d1cc85dec15cac5131cc571726ecbc1b1ca17ece44598d47f66a6feb
+  data.tar.gz: 35bc5bb1c4eaf3f889413525653c931a115f71b07170ea8cd6733500207f4f83
+SHA512:
+  metadata.gz: b41e245341e3fb7140592b7782aa8561648555db543c241e26afb0a054d0029bb99efb8b0ab0fea42ef664504a65d7003517e8bc0ccdf2212b8f4579bdc6a721
+  data.tar.gz: bcaa93b95324b9c6461f8abfebabc1fe55e2255d839d7db9cd0a7c2cbe58a23bd98162a339bfa932203812e32845db6a083c4101d5a764a57a381ffdbcf14ce7
|
data/architecture.md
ADDED
@@ -0,0 +1,153 @@
+# Architecture
+
+This document describes the architecture of async-service-chaos_kitty, which follows the same pattern as async-service-supervisor.
+
+## Overview
+
+ChaosKitty is a chaos monkey system for testing service resilience. It uses a client-server architecture where workers connect to a central chaos server, and various chaos operations are unleashed on the connected workers.
+
+## Components
+
+### Core Components
+
+#### Server (`server.rb`)
+The main chaos server that:
+- Accepts connections from workers (victims)
+- Manages chaos operations
+- Coordinates between chaos operations and connected victims
+- Tracks all connected victims via controllers
+
+#### Worker (`worker.rb`)
+A worker process that:
+- Connects to the chaos server
+- Registers itself as a victim
+- Exposes victim controller methods for chaos operations
+- Runs the main application logic
+
+#### Client (`client.rb`)
+Base client class for connecting to the chaos server. Extended by Worker.
+
+### Controllers
+
+#### ChaosController (`chaos_controller.rb`)
+Server-side controller that:
+- Manages victim registration
+- Provides access to victim proxies
+- Handles status queries
+- Tracks victim metadata (ID, process ID, connection)
+
+#### VictimController (`victim_controller.rb`)
+Client-side controller that:
+- Exposes methods that can be invoked by chaos operations
+- Implements chaos actions: delay, raise_error, allocate_memory, cpu_spin, trigger_gc
+- Logs chaos events
+
+### Chaos Operations
+
+All chaos operations follow the same pattern:
+- `register(chaos_controller)`: Called when a new victim connects
+- `remove(chaos_controller)`: Called when a victim disconnects
+- `status()`: Returns current status
+- `run()`: Starts the chaos operation loop
+
+#### Hairball (`hairball.rb`)
+Causes random delays and blocking operations.
+
+**Parameters:**
+- `interval`: How often to check for chaos opportunities
+- `probability`: Chance of causing chaos (0.0-1.0)
+- `min_delay`: Minimum delay duration
+- `max_delay`: Maximum delay duration
+
+**Effect:** Calls `victim.delay(duration:)` on random victims
+
+#### Scratch (`scratch.rb`)
+Randomly terminates victim processes.
+
+**Parameters:**
+- `interval`: How often to check for chaos opportunities
+- `probability`: Chance of causing chaos (0.0-1.0)
+- `signal`: Signal to send to process
+
+**Effect:** Sends signal to victim's process ID
+
+#### Floop (`floop.rb`)
+Creates random memory spikes.
+
+**Parameters:**
+- `interval`: How often to check for chaos opportunities
+- `probability`: Chance of causing chaos (0.0-1.0)
+- `min_size_mb`: Minimum memory allocation
+- `max_size_mb`: Maximum memory allocation
+- `hold_duration`: How long to hold the allocation
+
+**Effect:** Calls `victim.allocate_memory(size_mb:, hold_duration:)`
+
+#### Zoomies (`zoomies.rb`)
+Generates random CPU spikes.
+
+**Parameters:**
+- `interval`: How often to check for chaos opportunities
+- `probability`: Chance of causing chaos (0.0-1.0)
+- `min_duration`: Minimum CPU spin duration
+- `max_duration`: Maximum CPU spin duration
+
+**Effect:** Calls `victim.cpu_spin(duration:)`
+
+#### Yowl (`yowl.rb`)
+Raises random exceptions.
+
+**Parameters:**
+- `interval`: How often to check for chaos opportunities
+- `probability`: Chance of causing chaos (0.0-1.0)
+- `messages`: Array of possible error messages
+
+**Effect:** Calls `victim.raise_error(message:)`
+
+## Communication Flow
+
+1. **Worker Startup:**
+   - Worker creates a connection to chaos server
+   - Worker creates VictimController and binds it
+   - Worker calls `chaos.register(victim_proxy, process_id:)`
+   - Server allocates ID and calls `chaos_operation.register()` for each operation
+
+2. **Chaos Execution:**
+   - Chaos operation runs in a loop at specified interval
+   - On each iteration, randomly selects a victim
+   - Checks probability to determine if chaos should occur
+   - Invokes remote method on victim via proxy
+   - Victim controller executes the chaos action
+
+3. **Worker Shutdown:**
+   - Connection closes
+   - Server calls `chaos_operation.remove()` for each operation
+   - Controller is removed from tracking
+
+## IPC Mechanism
+
+- Uses Unix domain sockets for inter-process communication
+- Default socket path: `chaos_kitty.ipc`
+- Built on async-bus for RPC capabilities
+- Supports multi-hop forwarding for proxy calls
+
+## Threading Model
+
+- Built on Async framework for cooperative concurrency
+- Each chaos operation runs in its own Async task
+- Connection handling is concurrent
+- Chaos operations execute independently
+
+## Comparison with async-service-supervisor
+
+| Supervisor | ChaosKitty | Purpose |
+|------------|------------|---------|
+| Monitor | Chaos Operation | Watches/affects workers |
+| MemoryMonitor | Floop | Memory-related |
+| ProcessMonitor | Scratch | Process-related |
+| Worker | Worker | Connects to server |
+| SupervisorController | ChaosController | Server-side RPC |
+| WorkerController | VictimController | Client-side RPC |
+| Monitors health | Causes chaos | Core function |
+
+Both systems share the same architectural pattern but serve opposite purposes: one monitors and maintains health, the other intentionally causes problems to test resilience.
|
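The operation pattern described in `architecture.md` (`register`, `remove`, `status`, and a periodic select-then-roll step) can be sketched without the Async or async-bus dependencies. Everything named here is hypothetical and for illustration only: `SketchOperation`, `FakeVictim`, the `tick` method, and the injectable `rng` do not appear in the released gem.

```ruby
require "set"

# Hypothetical stand-in for a registered victim; records the calls it receives.
FakeVictim = Struct.new(:id, :calls) do
	def delay(duration:)
		calls << duration
	end
end

# Hypothetical chaos operation following the register/remove/status pattern.
class SketchOperation
	def initialize(probability: 0.3, rng: Random.new)
		@probability = probability
		@rng = rng
		# Identity comparison, so two victims with equal fields stay distinct.
		@victims = Set.new.compare_by_identity
	end
	
	def register(victim)
		@victims.add(victim)
	end
	
	def remove(victim)
		@victims.delete(victim)
	end
	
	def status
		{sketch: {victims: @victims.size, probability: @probability}}
	end
	
	# One loop iteration: pick a victim, check the probability, then act.
	# Returns the affected victim, or nil when nothing happened.
	def tick
		victim = @victims.to_a.sample(random: @rng)
		return nil unless victim
		return nil unless @rng.rand < @probability
		
		victim.delay(duration: 1.0)
		victim
	end
end
```

With `probability: 1.0` every `tick` affects a registered victim, and with `probability: 0.0` none does, which makes the gate easy to exercise in a unit test.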
data/lib/async/service/chaos_kitty/chaos_controller.rb
ADDED
@@ -0,0 +1,113 @@
+# frozen_string_literal: true
+
+# Released under the MIT License.
+# Copyright, 2026, by Samuel Williams.
+
+require "async/bus/controller"
+
+module Async
+	module Service
+		module ChaosKitty
+			# Controller for chaos operations.
+			#
+			# Handles registration of victims, victim lookup, and status queries.
+			class ChaosController < Async::Bus::Controller
+				def initialize(server, connection)
+					@server = server
+					@connection = connection
+
+					@id = nil
+					@process_id = nil
+					@victim = nil
+				end
+
+				# @attribute [Server] The server instance.
+				attr :server
+
+				# @attribute [Connection] The connection instance.
+				attr :connection
+
+				# @attribute [Integer] The ID assigned to this victim.
+				attr :id
+
+				# @attribute [Integer] The process ID of the victim.
+				attr :process_id
+
+				# @attribute [Proxy] The proxy to the victim controller.
+				attr :victim
+
+				# Register a victim connection with the chaos server.
+				#
+				# Allocates a unique sequential ID, stores the victim controller proxy,
+				# and notifies all chaos operations of the new connection.
+				#
+				# @parameter victim [Proxy] The proxy to the victim controller.
+				# @parameter process_id [Integer] The process ID of the victim.
+				# @returns [Integer] The connection ID assigned to the victim.
+				def register(victim, process_id:)
+					raise RuntimeError, "Already registered" if @id
+
+					@id = @server.next_id
+					@process_id = process_id
+					@victim = victim
+
+					@server.add(self)
+
+					return @id
+				end
+
+				# Get a victim controller proxy by connection ID.
+				#
+				# Returns a proxy to the victim controller that can be used to invoke
+				# operations directly on the victim. The proxy uses multi-hop forwarding
+				# to route calls through the chaos server to the victim.
+				#
+				# @parameter id [Integer] The ID of the victim.
+				# @returns [Proxy] A proxy to the victim controller.
+				# @raises [ArgumentError] If the connection ID is not found.
+				def [](id)
+					unless id
+						raise ArgumentError, "Missing 'id' parameter"
+					end
+
+					chaos_controller = @server.controllers[id]
+
+					unless chaos_controller
+						raise ArgumentError, "Connection not found: #{id}"
+					end
+
+					victim = chaos_controller.victim
+
+					unless victim
+						raise ArgumentError, "Victim controller not found for connection: #{id}"
+					end
+
+					return victim
+				end
+
+				# List all registered victim IDs.
+				#
+				# @returns [Array(Integer)] An array of IDs for all registered victims.
+				def keys
+					@server.controllers.keys
+				end
+
+				# Query the status of the chaos server and all connected victims.
+				#
+				# Returns an array of status information from each chaos operation.
+				# Each chaos operation provides its own status representation.
+				#
+				# @returns [Array] An array of status information from each chaos operation.
+				def status
+					@server.chaos_operations.map do |chaos|
+						begin
+							chaos.status
+						rescue => error
+							error
+						end
+					end.compact
+				end
+			end
+		end
+	end
+end
|
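The registration and lookup logic in `chaos_controller.rb` depends on three server methods (`next_id`, `add`, `controllers`) whose implementations live in `server.rb`, which is not shown at this point in the diff. The sketch below substitutes a hypothetical in-memory `FakeServer` and a hypothetical `SketchController` that does not inherit from `Async::Bus::Controller`; it only illustrates the ID allocation, the double-registration guard, and the lookup error path.

```ruby
# Hypothetical in-memory server exposing only what the controller calls.
# The real Server class may behave differently.
class FakeServer
	attr_reader :controllers
	
	def initialize
		@controllers = {}
		@next_id = 0
	end
	
	# Sequential IDs starting at 1 (an assumption for this sketch).
	def next_id
		@next_id += 1
	end
	
	def add(controller)
		@controllers[controller.id] = controller
	end
end

# Hypothetical controller mirroring the register and lookup logic.
class SketchController
	attr_reader :id, :process_id, :victim
	
	def initialize(server)
		@server = server
		@id = nil
	end
	
	def register(victim, process_id:)
		raise RuntimeError, "Already registered" if @id
		
		@id = @server.next_id
		@process_id = process_id
		@victim = victim
		@server.add(self)
		
		@id
	end
	
	# Any registered controller can look up any other victim by ID.
	def [](id)
		controller = @server.controllers[id]
		raise ArgumentError, "Connection not found: #{id}" unless controller
		
		controller.victim
	end
end
```

Because lookup goes through the shared server table, a controller for one connection can return the victim registered on another, which is the property the multi-hop forwarding comment relies on.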
data/lib/async/service/chaos_kitty/client.rb
ADDED
@@ -0,0 +1,22 @@
+# frozen_string_literal: true
+
+# Released under the MIT License.
+# Copyright, 2026, by Samuel Williams.
+
+require "async/bus/client"
+
+module Async
+	module Service
+		module ChaosKitty
+			# A client provides a mechanism to connect to a chaos server in order to execute operations.
+			class Client < Async::Bus::Client
+				# Initialize a new client.
+				#
+				# @parameter endpoint [IO::Endpoint] The chaos endpoint to connect to.
+				def initialize(endpoint: ChaosKitty.endpoint, **options)
+					super(endpoint, **options)
+				end
+			end
+		end
+	end
+end
|
data/lib/async/service/chaos_kitty/endpoint.rb
ADDED
@@ -0,0 +1,20 @@
+# frozen_string_literal: true
+
+# Released under the MIT License.
+# Copyright, 2026, by Samuel Williams.
+
+require "io/endpoint/unix_endpoint"
+
+module Async
+	module Service
+		module ChaosKitty
+			# Get the chaos kitty IPC endpoint.
+			#
+			# @parameter path [String] The path for the Unix socket (default: "chaos_kitty.ipc").
+			# @returns [IO::Endpoint] The Unix socket endpoint.
+			def self.endpoint(path = "chaos_kitty.ipc")
+				::IO::Endpoint.unix(path)
+			end
+		end
+	end
+end
|
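`endpoint.rb` wraps a filesystem path in a Unix socket endpoint. The sketch below shows the underlying transport using only Ruby's standard `socket` library. It does not use io-endpoint or async-bus, and `unix_round_trip` is a hypothetical helper written for this note.

```ruby
require "socket"
require "tmpdir"

# Send one line over a Unix domain socket at `path` and return the reply.
# The server side is a single-shot echo, purely for illustration.
def unix_round_trip(path, message)
	# Binding creates the socket file at `path` and starts listening.
	server = UNIXServer.new(path)
	
	responder = Thread.new do
		peer = server.accept
		peer.puts("echo: #{peer.gets.chomp}")
		peer.close
	end
	
	client = UNIXSocket.new(path)
	client.puts(message)
	reply = client.gets.chomp
	client.close
	
	responder.join
	reply
ensure
	# Closing the server does not remove the socket file, so unlink it.
	server&.close
	File.unlink(path) if File.exist?(path)
end
```

Calling `unix_round_trip(File.join(Dir.mktmpdir, "chaos_kitty.ipc"), "meow")` returns `"echo: meow"`. Since the gem's default path `chaos_kitty.ipc` is relative, the server and its workers presumably need the same working directory or an explicit shared path.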
data/lib/async/service/chaos_kitty/floop.rb
ADDED
@@ -0,0 +1,104 @@
+# frozen_string_literal: true
+
+# Released under the MIT License.
+# Copyright, 2026, by Samuel Williams.
+
+require "set"
+require_relative "loop"
+
+module Async
+	module Service
+		module ChaosKitty
+			# Floop causes random memory spikes in victim processes.
+			#
+			# Like a cat flopping over dramatically, this chaos operation randomly
+			# allocates large amounts of memory to test memory handling and limits.
+			class Floop
+				# Create a new floop chaos operation.
+				#
+				# @parameter interval [Integer] How often to check for chaos opportunities.
+				# @parameter probability [Float] Probability (0.0 to 1.0) of causing chaos on each check.
+				# @parameter min_size_mb [Integer] Minimum memory allocation in megabytes.
+				# @parameter max_size_mb [Integer] Maximum memory allocation in megabytes.
+				# @parameter hold_duration [Numeric] How long to hold the allocation.
+				def initialize(interval: 30, probability: 0.2, min_size_mb: 10, max_size_mb: 100, hold_duration: 2)
+					@interval = interval
+					@probability = probability
+					@min_size_mb = min_size_mb
+					@max_size_mb = max_size_mb
+					@hold_duration = hold_duration
+					@victims = Set.new.compare_by_identity
+				end
+
+				# @attribute [Set] The set of registered victims.
+				attr_reader :victims
+
+				# Register a victim with the floop chaos.
+				#
+				# @parameter chaos_controller [ChaosController] The chaos controller for the victim.
+				def register(chaos_controller)
+					Console.debug(self, "😺 Registering victim for floop chaos.", id: chaos_controller.id)
+					@victims.add(chaos_controller)
+				end
+
+				# Remove a victim from the floop chaos.
+				#
+				# @parameter chaos_controller [ChaosController] The chaos controller for the victim.
+				def remove(chaos_controller)
+					@victims.delete(chaos_controller)
+				end
+
+				# Get status for the floop chaos.
+				#
+				# @returns [Hash] Status including victim count and configuration.
+				def status
+					{
+						floop: {
+							victims: @victims.size,
+							probability: @probability,
+							size_range_mb: [@min_size_mb, @max_size_mb],
+							hold_duration: @hold_duration
+						}
+					}
+				end
+
+				# Unleash a floop on a random victim.
+				def unleash_floop
+					return if @victims.empty?
+
+					# Pick a random victim
+					victim = @victims.to_a.sample
+					return unless victim
+
+					# Check probability
+					return unless rand < @probability
+
+					# Calculate random size
+					size_mb = @min_size_mb + rand(@max_size_mb - @min_size_mb)
+
+					Console.info(self, "😾 *FLOOP* Memory spike incoming!", id: victim.id, size_mb: size_mb)
+
+					begin
+						victim_proxy = victim.connection[:victim]
+						if victim_proxy
+							victim_proxy.allocate_memory(size_mb: size_mb, hold_duration: @hold_duration)
+						end
+					rescue => error
+						Console.error(self, "Failed to unleash floop!", id: victim.id, exception: error)
+					end
+				end
+
+				# Run the floop chaos operation.
+				#
+				# @returns [Async::Task] The task that is running the floop chaos.
+				def run
+					Async do
+						Loop.run(interval: @interval) do
+							unleash_floop
+						end
+					end
+				end
+			end
+		end
+	end
+end
|
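The size calculation in `unleash_floop` uses `Kernel#rand` with an integer argument, which has two edge behaviours worth knowing when choosing `min_size_mb` and `max_size_mb`. The sketch below isolates that expression. Both helper names are hypothetical, and `inclusive_size_mb` is an alternative shown for comparison, not something the gem does.

```ruby
# Mirrors `@min_size_mb + rand(@max_size_mb - @min_size_mb)`.
# For max > min this yields an Integer in min...max (max is never returned).
# For max == min, `rand(0)` returns a Float in 0.0...1.0, so the result
# is a Float slightly above min rather than min itself.
def exclusive_size_mb(min, max)
	min + rand(max - min)
end

# Hypothetical alternative: an inclusive Integer range, which also
# returns exactly min when min == max.
def inclusive_size_mb(min, max)
	rand(min..max)
end
```

With the defaults (`min_size_mb: 10`, `max_size_mb: 100`) the released expression produces sizes from 10 to 99 MB.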
data/lib/async/service/chaos_kitty/hairball.rb
ADDED
@@ -0,0 +1,101 @@
+# frozen_string_literal: true
+
+# Released under the MIT License.
+# Copyright, 2026, by Samuel Williams.
+
+require "set"
+require_relative "loop"
+
+module Async
+	module Service
+		module ChaosKitty
+			# Hairball causes random delays and blocking in victim processes.
+			#
+			# Like a cat hacking up a hairball, this chaos operation randomly
+			# blocks victims, simulating slow responses or stuck operations.
+			class Hairball
+				# Create a new hairball chaos operation.
+				#
+				# @parameter interval [Integer] How often to check for chaos opportunities.
+				# @parameter probability [Float] Probability (0.0 to 1.0) of causing chaos on each check.
+				# @parameter min_delay [Numeric] Minimum delay duration in seconds.
+				# @parameter max_delay [Numeric] Maximum delay duration in seconds.
+				def initialize(interval: 30, probability: 0.3, min_delay: 0.5, max_delay: 5.0)
+					@interval = interval
+					@probability = probability
+					@min_delay = min_delay
+					@max_delay = max_delay
+					@victims = Set.new.compare_by_identity
+				end
+
+				# @attribute [Set] The set of registered victims.
+				attr_reader :victims
+
+				# Register a victim with the hairball chaos.
+				#
+				# @parameter chaos_controller [ChaosController] The chaos controller for the victim.
+				def register(chaos_controller)
+					Console.debug(self, "😺 Registering victim for hairball chaos.", id: chaos_controller.id)
+					@victims.add(chaos_controller)
+				end
+
+				# Remove a victim from the hairball chaos.
+				#
+				# @parameter chaos_controller [ChaosController] The chaos controller for the victim.
+				def remove(chaos_controller)
+					@victims.delete(chaos_controller)
+				end
+
+				# Get status for the hairball chaos.
+				#
+				# @returns [Hash] Status including victim count and configuration.
+				def status
+					{
+						hairball: {
+							victims: @victims.size,
+							probability: @probability,
+							delay_range: [@min_delay, @max_delay]
+						}
+					}
+				end
+
+				# Unleash a hairball on a random victim.
+				def unleash_hairball
+					return if @victims.empty?
+
+					# Pick a random victim
+					victim = @victims.to_a.sample
+					return unless victim
+
+					# Check probability
+					return unless rand < @probability
+
+					# Calculate random delay
+					delay = @min_delay + rand * (@max_delay - @min_delay)
+
+					Console.info(self, "😾 *HACK* *HACK* Hairball time!", id: victim.id, delay: delay)
+
+					begin
+						victim_proxy = victim.connection[:victim]
+						if victim_proxy
+							victim_proxy.delay(duration: delay)
+						end
+					rescue => error
+						Console.error(self, "Failed to unleash hairball!", id: victim.id, exception: error)
+					end
+				end
+
+				# Run the hairball chaos operation.
+				#
+				# @returns [Async::Task] The task that is running the hairball chaos.
+				def run
+					Async do
+						Loop.run(interval: @interval) do
+							unleash_hairball
+						end
+					end
+				end
+			end
+		end
+	end
+end
|
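Two quantities follow directly from the Hairball parameters: the distribution of each delay and the expected number of hairballs per hour. The sketch below works both out in plain Ruby. The helper names and the injectable `rng` are hypothetical, and the hourly estimate assumes at least one victim is registered for the whole hour and that the loop fires once per interval.

```ruby
# Mirrors `@min_delay + rand * (@max_delay - @min_delay)`: a uniform
# Float in min_delay...max_delay. The generator is injectable so that
# a seeded Random gives reproducible values.
def random_delay(min_delay, max_delay, rng: Random.new)
	min_delay + rng.rand * (max_delay - min_delay)
end

# Expected chaos events per hour: one probability check per interval.
def expected_events_per_hour(interval:, probability:)
	(3600.0 / interval) * probability
end
```

With the defaults (`interval: 30`, `probability: 0.3`) this works out to 120 checks per hour and about 36 hairballs per hour across all victims, each lasting between 0.5 and 5.0 seconds.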
data/lib/async/service/chaos_kitty/loop.rb
ADDED
@@ -0,0 +1,39 @@
+# frozen_string_literal: true
+
+# Released under the MIT License.
+# Copyright, 2026, by Samuel Williams.
+
+module Async
+	module Service
+		module ChaosKitty
+			# A helper for running loops at aligned intervals.
+			module Loop
+				# A robust loop that executes a block at aligned intervals.
+				#
+				# The alignment is modulo the current clock in seconds.
+				#
+				# If an error occurs during the execution of the block, it is logged and the loop continues.
+				#
+				# @parameter interval [Integer] The interval in seconds between executions of the block.
+				def self.run(interval: 60, &block)
+					while true
+						# Compute the wait time to the next interval:
+						wait = interval - (Time.now.to_f % interval)
+						if wait.positive?
+							# Sleep until the next interval boundary:
+							sleep(wait)
+						end
+
+						begin
+							yield
+						rescue => error
+							Console.error(self, "Loop error:", error)
+						end
+					end
+				end
+			end
+
+			private_constant :Loop
+		end
+	end
+end
|
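The alignment arithmetic in `Loop.run` is easier to check with the clock passed in as an argument. The sketch below extracts the one expression into a hypothetical helper, `wait_until_boundary`, which is not part of the gem.

```ruby
# Seconds to wait from `now` (epoch seconds) until the next multiple of
# `interval`. Mirrors `interval - (Time.now.to_f % interval)`.
def wait_until_boundary(now, interval)
	interval - (now % interval)
end
```

For example, at `now = 125.0` with `interval = 60` the wait is `55.0` seconds, landing on 180. At exactly `now = 120.0` the wait is a full `60.0` seconds. For a positive interval the result always falls in the range (0, interval], so the `wait.positive?` guard in the released code should always pass, and operations with the same interval wake at the same wall-clock boundaries.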
data/lib/async/service/chaos_kitty/scratch.rb
ADDED
@@ -0,0 +1,98 @@
+# frozen_string_literal: true
+
+# Released under the MIT License.
+# Copyright, 2026, by Samuel Williams.
+
+require "set"
+require_relative "loop"
+
+module Async
+	module Service
+		module ChaosKitty
+			# Scratch randomly kills victim processes.
+			#
+			# Like a cat scratching furniture, this chaos operation randomly
+			# terminates victim processes to test resilience and recovery.
+			class Scratch
+				# Create a new scratch chaos operation.
+				#
+				# @parameter interval [Integer] How often to check for chaos opportunities.
+				# @parameter probability [Float] Probability (0.0 to 1.0) of causing chaos on each check.
+				# @parameter signal [Symbol] The signal to send when scratching.
+				def initialize(interval: 60, probability: 0.1, signal: :TERM)
+					@interval = interval
+					@probability = probability
+					@signal = signal
+					@victims = Set.new.compare_by_identity
+				end
+
+				# @attribute [Set] The set of registered victims.
+				attr_reader :victims
+
+				# Register a victim with the scratch chaos.
+				#
+				# @parameter chaos_controller [ChaosController] The chaos controller for the victim.
+				def register(chaos_controller)
+					Console.debug(self, "😺 Registering victim for scratch chaos.", id: chaos_controller.id)
+					@victims.add(chaos_controller)
+				end
+
+				# Remove a victim from the scratch chaos.
+				#
+				# @parameter chaos_controller [ChaosController] The chaos controller for the victim.
+				def remove(chaos_controller)
+					@victims.delete(chaos_controller)
+				end
+
+				# Get status for the scratch chaos.
+				#
+				# @returns [Hash] Status including victim count and configuration.
+				def status
+					{
+						scratch: {
+							victims: @victims.size,
+							probability: @probability,
+							signal: @signal
+						}
+					}
+				end
+
+				# Unleash a scratch on a random victim.
+				def unleash_scratch
+					return if @victims.empty?
+
+					# Pick a random victim
+					victim = @victims.to_a.sample
+					return unless victim
+
+					# Check probability
+					return unless rand < @probability
+
+					process_id = victim.process_id
+					return unless process_id
+
+					Console.info(self, "😾 *SCRATCH* Taking down a victim!", id: victim.id, process_id: process_id, signal: @signal)
+
+					begin
+						Process.kill(@signal, process_id)
+					rescue Errno::ESRCH
+						Console.warn(self, "Process already gone!", process_id: process_id)
+					rescue => error
+						Console.error(self, "Failed to scratch victim!", process_id: process_id, exception: error)
+					end
+				end
+
+				# Run the scratch chaos operation.
+				#
+				# @returns [Async::Task] The task that is running the scratch chaos.
+				def run
+					Async do
+						Loop.run(interval: @interval) do
+							unleash_scratch
+						end
+					end
+				end
+			end
+		end
+	end
+end
|
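The signal handling in `unleash_scratch` can be tried in isolation against a throwaway child process. The sketch below assumes a POSIX system; the `scratch` helper and its `:sent`/`:gone` return values are hypothetical and exist only to make the two outcomes observable.

```ruby
# Send `signal` to `process_id` and report whether the process existed.
# Mirrors the `Process.kill` call and the `Errno::ESRCH` rescue above.
def scratch(process_id, signal: :TERM)
	Process.kill(signal, process_id)
	:sent
rescue Errno::ESRCH
	# No such process: it has already exited and been reaped.
	:gone
end
```

A quick check, assuming a `sleep` executable is on the `PATH`: start a child with `Process.spawn("sleep", "30")`, call `scratch(pid)` to get `:sent`, reap it with `Process.wait2(pid)` and confirm `status.signaled?`, then call `scratch(pid)` again to get `:gone`. Other failures, such as `Errno::EPERM` when the target belongs to another user, are not rescued by this helper, whereas the released code logs them through its generic `rescue`.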