breaker_machines 0.2.0 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/README.md +75 -58
- data/lib/breaker_machines/dsl.rb +11 -8
- data/lib/breaker_machines/errors.rb +10 -0
- data/lib/breaker_machines/storage/base.rb +5 -0
- data/lib/breaker_machines/storage/bucket_memory.rb +10 -0
- data/lib/breaker_machines/storage/cache.rb +7 -0
- data/lib/breaker_machines/storage/fallback_chain.rb +308 -0
- data/lib/breaker_machines/storage/memory.rb +10 -0
- data/lib/breaker_machines/storage/null.rb +9 -0
- data/lib/breaker_machines/version.rb +1 -1
- data/sig/README.md +3 -3
- metadata +2 -1
checksums.yaml
CHANGED
```diff
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4e76d6f4335010b14f4ee48ac1366c060bd6b5feaad379896d857856c3dc6a2d
+  data.tar.gz: d7f48bec133630d584387aaa56df9c3cd3884e8f9771353e8df7e214b8df7431
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f98f52400de806bb0df4784d0635560764600ba812b5bc7e99c5eebc8f4ab1d884890d08ca399e458d1f5ca4a1b9192be46562e9b3c66aa311923f0a59f58604
+  data.tar.gz: c5349911ce027af37b0c41557feb9c0126a1d6d11539650e48f8151855bd7512a859711f0335e9934c18abd879ce28b39adde778bc85d751cdd89728a3c768af
```
data/README.md
CHANGED
```diff
@@ -1,5 +1,7 @@
 # BreakerMachines
 
+> The circuit breaker that went where no Ruby has gone before! ⭐
+
 A battle-tested Ruby implementation of the Circuit Breaker pattern, built on `state_machines` for reliable distributed systems protection.
 
 ## Quick Start
```
````diff
@@ -26,6 +28,18 @@ class PaymentService
 end
 ```
 
+## A Message to the Resistance
+
+So AI took your job while you were waiting for Fireship to drop the next JavaScript framework?
+
+Welcome to April 2005—when Git was born, branches were just `master`, and nobody cared about your pronouns. This is the pattern your company's distributed systems desperately need, explained in a way that won't make you fall asleep and impulse-buy developer swag just to feel something.
+
+Still reading? Good. Because in space, nobody can hear you scream about microservices. It's all just patterns and pain.
+
+### The Pattern They Don't Want You to Know
+
+Built on the battle-tested `state_machines` gem, because I don't reinvent wheels here—I stop them from catching fire and burning down your entire infrastructure.
+
 ## Features
 
 - **Thread-safe** circuit breaker implementation
````
````diff
@@ -60,86 +74,89 @@ Built on the battle-tested `state_machines` gem, BreakerMachines provides produc
 
 See [Why I Open Sourced This](docs/WHY_OPEN_SOURCE.md) for the full story.
 
-##
+## Chapter 1: The Year is 2025 (Stardate 2025.186)
 
-
-
+The Resistance huddles in the server rooms, the last bastion against the cascade failures. Outside, the microservices burn. Redis Ship Com is down. PostgreSQL Life Support is flatlining.
+
+And somewhere in the darkness, a junior developer is about to write:
 
 ```ruby
-
-
-
-
+def fetch_user_data
+  retry_count = 0
+  begin
+    @redis.get(user_id)
+  rescue => e
+    retry_count += 1
+    retry if retry_count < Float::INFINITY # "It'll work eventually"
   end
 end
 ```
 
-
-Configure automatic failover across multiple service endpoints:
+"This," whispers the grizzled ops engineer, "is how civilizations fall."
 
-
-circuit :multi_region do
-  backends [
-    -> { fetch_from_primary },
-    -> { fetch_from_secondary },
-    -> { fetch_from_tertiary }
-  ]
-end
-```
+## The Hidden State Machine
 
-
-Open circuits based on error rates instead of absolute counts:
+They built this on `state_machines` because sometimes, Resistance, you need a tank, not another JavaScript framework.
 
-
-circuit :high_traffic do
-  threshold failure_rate: 0.5, minimum_calls: 10, within: 60
-end
-```
+See the [Circuit Breaker State Machine diagram](docs/DIAGRAMS.md#the-circuit-breaker-state-machine) for a visual representation of hope, despair, and the eternal cycle of production failures.
 
-
-Create circuit breakers at runtime for webhook delivery, API proxies, or per-tenant isolation:
+## What You Think You're Doing vs Reality
 
-
-
-include BreakerMachines::DSL
+### You Think: "I'm implementing retry logic for resilience!"
+### Reality: You're DDOSing your own infrastructure
 
-
-  threshold failures: 3, within: 1.minute
-  fallback { |error| { delivered: false, error: error.message } }
-end
+See [The Retry Death Spiral diagram](docs/DIAGRAMS.md#the-retry-death-spiral) to understand how your well-intentioned retries become a self-inflicted distributed denial of service attack.
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+## Advanced Features
+
+- **Hedged Requests** - Reduce latency with duplicate requests
+- **Multiple Backends** - Automatic failover across endpoints
+- **Percentage-Based Thresholds** - Open on error rates, not just counts
+- **Dynamic Circuit Breakers** - Runtime creation with templates
+- **Apocalypse-Resistant Storage** - Cascading fallbacks when Redis dies
+- **Custom Storage Backends** - SysV semaphores, distributed locks, etc.
+
+See [Advanced Patterns](docs/ADVANCED_PATTERNS.md) for detailed examples and implementation guides.
+
+## A Word from the RMNS Atlas Monkey
+
+*The Universal Commentary Engine crackles to life:*
+
+"In space, nobody can hear your pronouns. But they can hear your services failing.
 
-
+The universe doesn't care about your bootcamp certificate or your Medium articles about 'Why I Switched to Rust.' It cares about one thing:
 
-
-
-
-
-
+Does your system stay up when Redis has a bad day?
+
+If not, welcome to the Resistance. We have circuit breakers.
+
+Remember: The pattern isn't about preventing failures—it's about failing fast, failing smart, and living to deploy another day.
+
+As I always say when contemplating the void: 'It's better to break a circuit than to break production.'"
+
+*— Universal Commentary Engine, Log Entry 42*
+
+## Contributing to the Resistance
+
+1. Fork it (like it's 2005)
+2. Create your feature branch (`git checkout -b feature/save-the-fleet`)
+3. Commit your changes (`git commit -am 'Add quantum circuit breaker'`)
+4. Push to the branch (`git push origin feature/save-the-fleet`)
+5. Create a new Pull Request (and wait for the Council of Elders to review)
 
 ## License
 
 MIT License. See [LICENSE](LICENSE) file for details.
 
+## Acknowledgments
+
+- The `state_machines` gem - The reliable engine under our hood
+- Every service that ever timed out - You taught me well
+- The RMNS Atlas Monkey - For philosophical guidance
+- The Resistance - For never giving up
+
 ## Author
 
 Built with ❤️ and ☕ by the Resistance against cascading failures.
 
-
-
-*Remember: Without circuit breakers, even AI can enter infinite loops of existential confusion. Don't let your services have an existential crisis.*
+**Remember: In space, nobody can hear your Redis timeout. But they can feel your circuit breaker failing over to localhost.**
````
data/lib/breaker_machines/dsl.rb
CHANGED
```diff
@@ -328,18 +328,21 @@ module BreakerMachines
         @config[:half_open_calls] = count
       end
 
-      def storage(backend, **)
+      def storage(backend, **options)
         @config[:storage] = case backend
                             when :memory
-                              Storage::Memory.new(**)
+                              Storage::Memory.new(**options)
                             when :bucket_memory
-                              Storage::BucketMemory.new(**)
+                              Storage::BucketMemory.new(**options)
                             when :cache
-                              Storage::Cache.new(**)
-                            when :
-                              Storage::
+                              Storage::Cache.new(**options)
+                            when :null
+                              Storage::Null.new(**options)
+                            when :fallback_chain
+                              config = options.is_a?(Proc) ? options.call(timeout: 5) : options
+                              Storage::FallbackChain.new(config)
                             when Class
-                              backend.new(**)
+                              backend.new(**options)
                             else
                               backend
                             end
```
```diff
@@ -413,7 +416,7 @@ module BreakerMachines
         @config[:exceptions] = exceptions
       end
 
-      def fiber_safe(enabled
+      def fiber_safe(enabled = true) # rubocop:disable Style/OptionalBooleanParameter
         @config[:fiber_safe] = enabled
       end
 
```
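The rewritten `storage` DSL method above now dispatches on `:null` and `:fallback_chain` as well. A minimal standalone sketch of that dispatch, with stub classes standing in for the real `BreakerMachines::Storage` backends (the `StubStorage` module and `build_storage` helper are illustrative, not the gem's API):

```ruby
# Stub backends standing in for BreakerMachines::Storage classes (illustrative only)
module StubStorage
  class Memory;       def initialize(**opts); end; end
  class BucketMemory; def initialize(**opts); end; end
  class Cache;        def initialize(**opts); end; end
  class Null;         def initialize(**opts); end; end

  class FallbackChain
    attr_reader :config
    def initialize(config)
      @config = config
    end
  end
end

# Mirrors the case/when dispatch added to the DSL's #storage method above
def build_storage(backend, **options)
  case backend
  when :memory         then StubStorage::Memory.new(**options)
  when :bucket_memory  then StubStorage::BucketMemory.new(**options)
  when :cache          then StubStorage::Cache.new(**options)
  when :null           then StubStorage::Null.new(**options)
  when :fallback_chain then StubStorage::FallbackChain.new(options)
  when Class           then backend.new(**options)
  else backend
  end
end

chain = build_storage(:fallback_chain, backend: :memory, timeout: 5)
```

Note that the `:fallback_chain` branch hands the whole options hash to `FallbackChain`, while the other branches splat options into the backend constructor; a pre-built instance (the `else` branch) passes through untouched.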
|
@@ -28,6 +28,16 @@ module BreakerMachines
|
|
28
28
|
class ConfigurationError < Error; end
|
29
29
|
class StorageError < Error; end
|
30
30
|
|
31
|
+
# Raised when storage backend operation times out
|
32
|
+
class StorageTimeoutError < StorageError
|
33
|
+
attr_reader :timeout_ms
|
34
|
+
|
35
|
+
def initialize(message, timeout_ms = nil)
|
36
|
+
@timeout_ms = timeout_ms
|
37
|
+
super(message)
|
38
|
+
end
|
39
|
+
end
|
40
|
+
|
31
41
|
# Raised when circuit rejects call due to bulkhead limit
|
32
42
|
class CircuitBulkheadError < Error
|
33
43
|
attr_reader :circuit_name, :max_concurrent
|
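The new error class is small enough to sketch in isolation. Here `Error` is assumed to descend from `StandardError` (the hunk only shows `StorageError < Error`):

```ruby
module BreakerMachines
  class Error < StandardError; end # assumed base class, per the existing hierarchy
  class StorageError < Error; end

  # Raised when a storage backend operation exceeds its timeout budget
  class StorageTimeoutError < StorageError
    attr_reader :timeout_ms

    def initialize(message, timeout_ms = nil)
      @timeout_ms = timeout_ms
      super(message)
    end
  end
end

begin
  raise BreakerMachines::StorageTimeoutError.new('redis read timed out', 50)
rescue BreakerMachines::StorageError => e
  # Rescuing the parent class also catches timeouts; timeout_ms carries the budget
  puts "#{e.message} (budget: #{e.timeout_ms}ms)"
end
```

Because `StorageTimeoutError < StorageError`, any existing `rescue StorageError` handlers keep working unchanged after the upgrade.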
data/lib/breaker_machines/storage/base.rb
CHANGED
```diff
@@ -42,6 +42,11 @@ module BreakerMachines
       def clear_all
         raise NotImplementedError
       end
+
+      # Timeout handling - each backend must implement its own timeout strategy
+      def with_timeout(timeout_ms)
+        raise NotImplementedError, "#{self.class} must implement #with_timeout to handle #{timeout_ms}ms timeouts"
+      end
     end
   end
 end
```
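The contract above leaves the timeout strategy to each backend; subclasses whose operations never block can simply yield, which is exactly what the in-memory backends in this diff do. A minimal sketch (the class names here are illustrative, not the gem's):

```ruby
class StorageBase
  # Mirrors Base#with_timeout from the hunk above: subclasses must opt in
  def with_timeout(timeout_ms)
    raise NotImplementedError, "#{self.class} must implement #with_timeout to handle #{timeout_ms}ms timeouts"
  end
end

class InstantBackend < StorageBase
  # In-process operations return immediately, so the timeout is a no-op
  # (the same strategy Memory and BucketMemory adopt in this release)
  def with_timeout(_timeout_ms)
    yield
  end
end
```

A network-backed subclass would instead enforce the budget at the driver level (connection and read timeouts), since wrapping IO in `Timeout.timeout` is what the Cache backend's comments warn against.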
data/lib/breaker_machines/storage/bucket_memory.rb
CHANGED
```diff
@@ -7,6 +7,10 @@ module BreakerMachines
   module Storage
     # Efficient bucket-based memory storage implementation
     # Uses fixed-size circular buffers for constant-time event counting
+    #
+    # WARNING: This storage backend is NOT compatible with DRb (distributed Ruby)
+    # environments as memory is not shared between processes. Use Cache backend
+    # with an external cache store (Redis, Memcached) for distributed setups.
     class BucketMemory < Base
       BUCKET_SIZE = 1 # 1 second per bucket
 
@@ -158,6 +162,12 @@ module BreakerMachines
      def monotonic_time
        Process.clock_gettime(Process::CLOCK_MONOTONIC)
      end
+
+      def with_timeout(_timeout_ms)
+        # BucketMemory operations should be instant, but we'll still respect the timeout
+        # This is more for consistency and to catch any potential deadlocks
+        yield
+      end
     end
   end
 end
```
data/lib/breaker_machines/storage/cache.rb
CHANGED
```diff
@@ -94,6 +94,13 @@ module BreakerMachines
         events.last(limit)
       end
 
+      def with_timeout(_timeout_ms)
+        # Rails cache operations should rely on their own underlying timeouts
+        # Using Ruby's Timeout.timeout is dangerous and can cause deadlocks
+        # For Redis cache stores, configure connect_timeout and read_timeout instead
+        yield
+      end
+
       private
 
       def increment_counter(key)
```
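The comments above push timeout handling down to the cache store itself. For a Redis-backed Rails cache that means setting driver-level timeouts where the store is configured; a sketch using `RedisCacheStore`'s documented options (the values are illustrative, not recommendations):

```ruby
# config/environments/production.rb (illustrative values)
config.cache_store = :redis_cache_store, {
  url: ENV['REDIS_URL'],
  connect_timeout: 0.5, # seconds to establish a connection
  read_timeout: 0.2,    # seconds per read before giving up
  write_timeout: 0.2    # seconds per write before giving up
}
```

With short driver timeouts, a slow Redis surfaces quickly as a `StorageError` and lets a fallback chain move on instead of stalling every circuit-breaker bookkeeping call.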
data/lib/breaker_machines/storage/fallback_chain.rb
ADDED
```diff
@@ -0,0 +1,308 @@
+# frozen_string_literal: true
+
+module BreakerMachines
+  module Storage
+    # Apocalypse-resistant storage backend that tries multiple storage backends in sequence
+    # Falls back to the next storage backend when the current one times out or fails
+    #
+    # NOTE: For DRb (distributed Ruby) environments, only :cache backend with external
+    # cache stores (Redis, Memcached) will work properly. Memory-based backends (:memory,
+    # :bucket_memory) are incompatible with DRb as they don't share state between processes.
+    class FallbackChain < Base
+      attr_reader :storage_configs, :storage_instances, :unhealthy_until, :circuit_breaker_threshold, :circuit_breaker_timeout
+
+      def initialize(storage_configs, **)
+        super(**)
+        @storage_configs = normalize_storage_configs(storage_configs)
+        @storage_instances = {}
+        @unhealthy_until = {}
+        @circuit_breaker_threshold = 3 # After 3 failures, mark backend as unhealthy
+        @circuit_breaker_timeout = 30 # Keep marked as unhealthy for 30 seconds
+        validate_configs!
+      end
+
+      def get_status(circuit_name)
+        execute_with_fallback(:get_status, circuit_name)
+      end
+
+      def set_status(circuit_name, status, opened_at = nil)
+        execute_with_fallback(:set_status, circuit_name, status, opened_at)
+      end
+
+      def record_success(circuit_name, duration)
+        execute_with_fallback(:record_success, circuit_name, duration)
+      end
+
+      def record_failure(circuit_name, duration)
+        execute_with_fallback(:record_failure, circuit_name, duration)
+      end
+
+      def success_count(circuit_name, window_seconds)
+        execute_with_fallback(:success_count, circuit_name, window_seconds)
+      end
+
+      def failure_count(circuit_name, window_seconds)
+        execute_with_fallback(:failure_count, circuit_name, window_seconds)
+      end
+
+      def clear(circuit_name)
+        execute_with_fallback(:clear, circuit_name)
+      end
+
+      def clear_all
+        execute_with_fallback(:clear_all)
+      end
+
+      def record_event_with_details(circuit_name, type, duration, error: nil, new_state: nil)
+        execute_with_fallback(:record_event_with_details, circuit_name, type, duration, error: error,
+                                                                                        new_state: new_state)
+      end
+
+      def event_log(circuit_name, limit)
+        execute_with_fallback(:event_log, circuit_name, limit)
+      end
+
+      def with_timeout(_timeout_ms)
+        # FallbackChain doesn't use timeout directly - each backend handles its own
+        yield
+      end
+
+      def cleanup!
+        storage_instances.each_value do |instance|
+          instance.clear_all if instance.respond_to?(:clear_all)
+        end
+        storage_instances.clear
+        @backend_failures&.clear
+        unhealthy_until.clear
+      end
+
+      private
+
+      def execute_with_fallback(method, *args, **kwargs)
+        chain_started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+        attempted_backends = []
+
+        storage_configs.each_with_index do |config, index|
+          attempted_backends << config[:backend]
+
+          if backend_unhealthy?(config[:backend])
+            emit_backend_skipped_notification(config[:backend], method, index)
+            next
+          end
+
+          begin
+            backend = get_backend_instance(config[:backend])
+            started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+
+            result = backend.with_timeout(config[:timeout]) do
+              if kwargs.any?
+                backend.send(method, *args, **kwargs)
+              else
+                backend.send(method, *args)
+              end
+            end
+
+            # Success - emit success notification and reset failure count
+            duration_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at) * 1000).round(2)
+            emit_operation_success_notification(config[:backend], method, duration_ms, index)
+            reset_backend_failures(config[:backend])
+
+            # Emit chain success notification
+            chain_duration_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - chain_started_at) * 1000).round(2)
+            emit_chain_success_notification(method, attempted_backends, config[:backend], chain_duration_ms)
+
+            return result
+          rescue BreakerMachines::StorageTimeoutError, BreakerMachines::StorageError, StandardError => e
+            duration_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at) * 1000).round(2)
+
+            # Record the failure
+            record_backend_failure(config[:backend], e, duration_ms)
+
+            # Emit notification about the fallback
+            emit_fallback_notification(config[:backend], e, duration_ms, index)
+
+            # If this is the last backend, re-raise the error
+            raise e if index == storage_configs.size - 1
+
+            # Continue to next backend
+            next
+          end
+        end
+
+        # If we get here, all backends were unhealthy
+        chain_duration_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - chain_started_at) * 1000).round(2)
+        emit_chain_failure_notification(method, attempted_backends, chain_duration_ms)
+        raise BreakerMachines::StorageError, 'All storage backends are unhealthy'
+      end
+
+      def get_backend_instance(backend_type)
+        storage_instances[backend_type] ||= create_backend_instance(backend_type)
+      end
+
+      def create_backend_instance(backend_type)
+        case backend_type
+        when :memory
+          Memory.new
+        when :bucket_memory
+          BucketMemory.new
+        when :cache
+          Cache.new
+        when :null
+          Null.new
+        else
+          # Allow custom backend classes
+          raise ConfigurationError, "Unknown storage backend: #{backend_type}" unless backend_type.is_a?(Class)
+
+          backend_type.new
+
+        end
+      end
+
+      def backend_unhealthy?(backend_type)
+        unhealthy_until_time = unhealthy_until[backend_type]
+        return false unless unhealthy_until_time
+
+        if Process.clock_gettime(Process::CLOCK_MONOTONIC) > unhealthy_until_time
+          unhealthy_until.delete(backend_type)
+          false
+        else
+          true
+        end
+      end
+
+      def record_backend_failure(backend_type, error, duration_ms)
+        @backend_failures ||= {}
+        @backend_failures[backend_type] ||= []
+        @backend_failures[backend_type] << {
+          error: error,
+          duration_ms: duration_ms,
+          timestamp: Process.clock_gettime(Process::CLOCK_MONOTONIC)
+        }
+
+        # Keep only recent failures (last 60 seconds)
+        cutoff = Process.clock_gettime(Process::CLOCK_MONOTONIC) - 60
+        @backend_failures[backend_type].reject! { |f| f[:timestamp] < cutoff }
+
+        # Mark as unhealthy if too many failures
+        return unless @backend_failures[backend_type].size >= circuit_breaker_threshold
+
+        unhealthy_until[backend_type] = Process.clock_gettime(Process::CLOCK_MONOTONIC) + circuit_breaker_timeout
+        emit_backend_health_change_notification(backend_type, :healthy, :unhealthy, @backend_failures[backend_type].size)
+      rescue StandardError => e
+        # Don't let failure recording cause the whole chain to hang
+        Rails.logger&.error("FallbackChain: Failed to record backend failure: #{e.message}")
+      end
+
+      def reset_backend_failures(backend_type)
+        was_unhealthy = unhealthy_until.key?(backend_type)
+        @backend_failures&.delete(backend_type)
+        unhealthy_until.delete(backend_type)
+
+        if was_unhealthy
+          emit_backend_health_change_notification(backend_type, :unhealthy, :healthy, 0)
+        end
+      end
+
+      def emit_fallback_notification(backend_type, error, duration_ms, backend_index)
+        ActiveSupport::Notifications.instrument(
+          'storage_fallback.breaker_machines',
+          backend: backend_type,
+          error_class: error.class.name,
+          error_message: error.message,
+          duration_ms: duration_ms,
+          backend_index: backend_index,
+          next_backend: storage_configs[backend_index + 1]&.dig(:backend)
+        )
+      end
+
+      def emit_operation_success_notification(backend_type, method, duration_ms, backend_index)
+        ActiveSupport::Notifications.instrument(
+          'storage_operation.breaker_machines',
+          backend: backend_type,
+          operation: method,
+          duration_ms: duration_ms,
+          backend_index: backend_index,
+          success: true
+        )
+      end
+
+      def emit_backend_skipped_notification(backend_type, method, backend_index)
+        ActiveSupport::Notifications.instrument(
+          'storage_backend_skipped.breaker_machines',
+          backend: backend_type,
+          operation: method,
+          backend_index: backend_index,
+          reason: 'unhealthy',
+          unhealthy_until: unhealthy_until[backend_type]
+        )
+      end
+
+      def emit_backend_health_change_notification(backend_type, previous_state, new_state, failure_count)
+        ActiveSupport::Notifications.instrument(
+          'storage_backend_health.breaker_machines',
+          backend: backend_type,
+          previous_state: previous_state,
+          new_state: new_state,
+          failure_count: failure_count,
+          threshold: circuit_breaker_threshold,
+          recovery_time: new_state == :unhealthy ? unhealthy_until[backend_type] : nil
+        )
+      end
+
+      def emit_chain_success_notification(method, attempted_backends, successful_backend, duration_ms)
+        ActiveSupport::Notifications.instrument(
+          'storage_chain_operation.breaker_machines',
+          operation: method,
+          attempted_backends: attempted_backends,
+          successful_backend: successful_backend,
+          duration_ms: duration_ms,
+          success: true,
+          fallback_count: attempted_backends.index(successful_backend)
+        )
+      end
+
+      def emit_chain_failure_notification(method, attempted_backends, duration_ms)
+        ActiveSupport::Notifications.instrument(
+          'storage_chain_operation.breaker_machines',
+          operation: method,
+          attempted_backends: attempted_backends,
+          successful_backend: nil,
+          duration_ms: duration_ms,
+          success: false,
+          fallback_count: attempted_backends.size
+        )
+      end
+
+      def normalize_storage_configs(configs)
+        return configs if configs.is_a?(Array)
+
+        # Convert hash format to array format
+        unless configs.is_a?(Hash)
+          raise ConfigurationError, "Storage configs must be Array or Hash, got: #{configs.class}"
+        end
+
+        configs.map do |_key, value|
+          if value.is_a?(Hash)
+            value
+          else
+            { backend: value, timeout: 5 }
+          end
+        end
+      end
+
+      def validate_configs!
+        raise ConfigurationError, 'Storage configs cannot be empty' if storage_configs.empty?
+
+        storage_configs.each_with_index do |config, index|
+          unless config.is_a?(Hash) && config[:backend] && config[:timeout]
+            raise ConfigurationError, "Invalid storage config at index #{index}: #{config}"
+          end
+
+          unless config[:timeout].is_a?(Numeric) && config[:timeout].positive?
+            raise ConfigurationError, "Timeout must be a positive number, got: #{config[:timeout]}"
+          end
+        end
+      end
+    end
+  end
+end
```
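The health bookkeeping in `record_backend_failure` / `backend_unhealthy?` above can be distilled into a standalone sketch: three failures inside the 60-second window mark a backend unhealthy for 30 seconds (thresholds taken from the initializer). The `HealthTracker` class and injected clock below are illustrative, not part of the gem:

```ruby
class HealthTracker
  THRESHOLD = 3   # failures before a backend is marked unhealthy
  WINDOW    = 60  # seconds of failure history to keep
  COOLDOWN  = 30  # seconds a backend stays unhealthy

  # The clock is injected so the logic can be tested without sleeping
  def initialize(clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @failures = Hash.new { |h, k| h[k] = [] }
    @unhealthy_until = {}
  end

  def record_failure(backend)
    now = @clock.call
    @failures[backend] << now
    @failures[backend].reject! { |t| t < now - WINDOW } # drop stale failures
    @unhealthy_until[backend] = now + COOLDOWN if @failures[backend].size >= THRESHOLD
  end

  def unhealthy?(backend)
    deadline = @unhealthy_until[backend]
    return false unless deadline

    if @clock.call > deadline
      @unhealthy_until.delete(backend) # cooldown elapsed: eligible again
      false
    else
      true
    end
  end
end
```

The payoff of this scheme is that `FallbackChain` skips an unhealthy backend outright, so a flapping Redis does not add its full timeout to every circuit-breaker operation while it recovers.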
data/lib/breaker_machines/storage/memory.rb
CHANGED
```diff
@@ -6,6 +6,10 @@ require 'concurrent/array'
 module BreakerMachines
   module Storage
     # High-performance in-memory storage backend with thread-safe operations
+    #
+    # WARNING: This storage backend is NOT compatible with DRb (distributed Ruby)
+    # environments as memory is not shared between processes. Use Cache backend
+    # with an external cache store (Redis, Memcached) for distributed setups.
     class Memory < Base
       def initialize(**options)
         super
@@ -87,6 +91,12 @@ module BreakerMachines
         events.last(limit).map(&:dup)
       end
 
+      def with_timeout(_timeout_ms)
+        # Memory operations should be instant, but we'll still respect the timeout
+        # This is more for consistency and to catch any potential deadlocks
+        yield
+      end
+
       private
 
       def record_event(circuit_name, type, duration)
```
|
|
40
40
|
def record_event_with_details(_circuit_name, _event_type, _duration, _metadata = {})
|
41
41
|
# No-op
|
42
42
|
end
|
43
|
+
|
44
|
+
def clear_all
|
45
|
+
# No-op
|
46
|
+
end
|
47
|
+
|
48
|
+
def with_timeout(_timeout_ms)
|
49
|
+
# Null storage always succeeds instantly - perfect for fail-open scenarios
|
50
|
+
yield
|
51
|
+
end
|
43
52
|
end
|
44
53
|
end
|
45
54
|
end
|
data/sig/README.md
CHANGED
````diff
@@ -24,7 +24,7 @@ To use these type signatures in your project:
 target :app do
   signature "sig"
   check "lib"
-
+
   library "breaker_machines"
 end
```
@@ -38,7 +38,7 @@ To use these type signatures in your project:
 
 ### Basic Circuit Usage
 ```ruby
-circuit = BreakerMachines::Circuit.new("api",
+circuit = BreakerMachines::Circuit.new("api",
   failure_threshold: 5,
   reset_timeout: 30
 )
@@ -50,7 +50,7 @@ result = circuit.call { api.fetch_data }
 ```ruby
 class MyService
   include BreakerMachines::DSL
-
+
   circuit :database do
     threshold failures: 10, within: 60
     reset_after 120
````

(The removed and added lines above appear identical because these hunks change only trailing whitespace, which the diff rendering does not preserve.)
metadata
CHANGED
```diff
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: breaker_machines
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.3.0
 platform: ruby
 authors:
 - Abdelkader Boudih
@@ -125,6 +125,7 @@ files:
 - lib/breaker_machines/storage/base.rb
 - lib/breaker_machines/storage/bucket_memory.rb
 - lib/breaker_machines/storage/cache.rb
+- lib/breaker_machines/storage/fallback_chain.rb
 - lib/breaker_machines/storage/memory.rb
 - lib/breaker_machines/storage/null.rb
 - lib/breaker_machines/version.rb
```