semian 0.27.0 → 0.28.0
- checksums.yaml +4 -4
- data/README.md +75 -0
- data/lib/semian/adapter.rb +1 -1
- data/lib/semian/adaptive_circuit_breaker.rb +136 -0
- data/lib/semian/circuit_breaker.rb +44 -25
- data/lib/semian/circuit_breaker_behaviour.rb +64 -0
- data/lib/semian/configuration_validator.rb +52 -0
- data/lib/semian/dual_circuit_breaker.rb +165 -0
- data/lib/semian/mysql2.rb +2 -2
- data/lib/semian/net_http.rb +3 -3
- data/lib/semian/pid_controller.rb +217 -0
- data/lib/semian/pid_controller_thread.rb +72 -0
- data/lib/semian/protected_resource.rb +1 -1
- data/lib/semian/simple_exponential_smoother.rb +137 -0
- data/lib/semian/unprotected_resource.rb +3 -3
- data/lib/semian/version.rb +1 -1
- data/lib/semian.rb +78 -3
- metadata +8 -2
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c6423964e3bf474c3f1c6c31ab4c52a14fc28a35d51d91480c537f46f0c1e5f2
+  data.tar.gz: a7ec4ad1154ceef88bdb3dd2bbd40d419d8dc3162a81a17804afdd9573d380be
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ccc4f1217740efde74e84acf26c2f73de7ac9c088575ac03d7be5408bc1c2711cd95ffd27a9b524071db95de3ce24ad3673b5a4f960a4d96b6b6c1c083c85183
+  data.tar.gz: 1f94bc2139a6397ba5a963a230496ab694f98cb1ae29e42b69017972d2268afc97c8fdde0d49ffd8d6514036199c15a5d62f67439a6790ed9d4696771ddf7b8e
|
data/README.md
CHANGED
|
@@ -588,6 +588,10 @@ There are four configuration parameters for circuit breakers in Semian:
|
|
|
588
588
|
Defaults to `error_timeout` seconds if not set.
|
|
589
589
|
- **error_timeout**. The amount of time in seconds until trying to query the resource
|
|
590
590
|
again.
|
|
591
|
+
- **exponential_backoff_error_timeout**. If set to `true`, we will progress towards error_timeout exponentially, instead of committing to it directly.
|
|
592
|
+
This is useful to avoid rejecting too many requests if the dependency is not really degraded.
|
|
593
|
+
- **exponential_backoff_initial_timeout**. Where to start the exponential backoff towards `error_timeout` from. Defaults to 1 second.
|
|
594
|
+
- **exponential_backoff_multiplier**. The exponential multiplier to use during the exponential backoff towards the `error_timeout`. Defaults to 2.
|
|
591
595
|
- **error_threshold_timeout_enabled**. If set to false it will disable
|
|
592
596
|
the time window for evicting old exceptions. `error_timeout` is still used and
|
|
593
597
|
will reset the circuit. Defaults to `true` if not set.
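As a rough illustration of the exponential backoff options above, the sketch below lists the timeouts a circuit would use if it kept re-opening. It assumes each step multiplies the previous timeout by `exponential_backoff_multiplier` and caps the result at `error_timeout`. `backoff_schedule` is a hypothetical helper written for this example, not part of Semian.

```ruby
# Hypothetical helper: lists the timeouts used on successive openings,
# assuming each step multiplies the previous timeout and caps it at error_timeout.
def backoff_schedule(error_timeout:, initial_timeout: 1, multiplier: 2)
  schedule = [initial_timeout]
  while schedule.last < error_timeout
    schedule << [schedule.last * multiplier, error_timeout].min
  end
  schedule
end

p backoff_schedule(error_timeout: 10) # prints [1, 2, 4, 8, 10]
```

With the defaults (start at 1 second, multiplier 2), a 10-second `error_timeout` is only reached on the fifth consecutive opening under these assumptions.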
|
|
@@ -603,6 +607,77 @@ It is possible to disable Circuit Breaker with environment variable
|
|
|
603
607
|
For more information about configuring these parameters, please read
|
|
604
608
|
[this post](https://shopify.engineering/circuit-breaker-misconfigured).
|
|
605
609
|
|
|
610
|
+
#### Adaptive Circuit Breaker (Experimental)
|
|
611
|
+
|
|
612
|
+
Semian also includes an experimental adaptive circuit breaker that uses a [PID controller](https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller)
|
|
613
|
+
to dynamically adjust the rejection rate based on real-time error rates. Unlike the
|
|
614
|
+
traditional circuit breaker with fixed thresholds, the adaptive circuit breaker continuously
|
|
615
|
+
monitors error rates and adjusts its behavior accordingly.
|
|
616
|
+
|
|
617
|
+
##### How It Works
|
|
618
|
+
|
|
619
|
+
The adaptive circuit breaker has two components:
|
|
620
|
+
|
|
621
|
+
1. An ideal error rate estimator that determines when the service is starting to become unhealthy
|
|
622
|
+
2. A PID controller that opens the circuit fully or partially based on how bad the situation is.
|
|
623
|
+
|
|
624
|
+
The ideal error rate estimator uses a "simple exponential smoother", which means it simply takes the average error rate
|
|
625
|
+
that it observes as the ideal, with the following caveats:
|
|
626
|
+
|
|
627
|
+
1. It excludes observations that are too high from its calculations. For example, we know that a 20% error rate is an anomalous
|
|
628
|
+
observation so we ignore it.
|
|
629
|
+
1. It starts with an educated guess about the ideal error rate,
|
|
630
|
+
and then converges down quickly if it observes a lower error rate, and slowly if it observes a higher error rate.
|
|
631
|
+
1. After 30 minutes, it becomes more confident in its guess, and thus converges even more slowly in either direction.
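The first two caveats can be sketched as an asymmetric exponential smoother. This is a minimal illustration, not the gem's implementation: the class name `IdealErrorRateSketch` and the smoothing factors `alpha_down` and `alpha_up` are assumptions chosen for the example, and the 30-minute confidence change is left out.

```ruby
# Hypothetical sketch of an asymmetric exponential smoother; the smoothing
# factors below are illustrative assumptions, not Semian's actual values.
class IdealErrorRateSketch
  attr_reader :value

  def initialize(initial_value: 0.05, cap_value: 0.1, alpha_down: 0.5, alpha_up: 0.1)
    @value = initial_value
    @cap_value = cap_value
    @alpha_down = alpha_down # fast convergence towards lower error rates
    @alpha_up = alpha_up     # slow convergence towards higher error rates
  end

  # Folds one observed error rate into the estimate and returns the estimate.
  def add_observation(error_rate)
    return @value if error_rate > @cap_value # ignore anomalously high readings

    alpha = error_rate < @value ? @alpha_down : @alpha_up
    @value = alpha * error_rate + (1 - alpha) * @value
  end
end

estimator = IdealErrorRateSketch.new
estimator.add_observation(0.20) # above the cap, so the estimate stays at 0.05
estimator.add_observation(0.01) # lower than the estimate, so it drops quickly
estimator.add_observation(0.08) # higher than the estimate, so it rises slowly
```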
|
|
632
|
+
|
|
633
|
+
The PID controller uses the following equation to determine whether to open or close the circuit:
|
|
634
|
+
|
|
635
|
+
```
|
|
636
|
+
P = (error_rate - ideal_error_rate) - (1 - (error_rate - ideal_error_rate)) * rejection_rate
|
|
637
|
+
```
|
|
638
|
+
|
|
639
|
+
Or, more simply, if you define `delta_error = error_rate - ideal_error_rate` then:
|
|
640
|
+
|
|
641
|
+
```
|
|
642
|
+
P = delta_error - (1 - delta_error) * rejection_rate
|
|
643
|
+
```
|
|
644
|
+
|
|
645
|
+
In simple terms, this equation says: open more when the error rate is higher than the rejection rate,
|
|
646
|
+
and less when the opposite is true. The multiplier `(1 - delta_error)` is called the aggressiveness multiplier.
|
|
647
|
+
It allows the circuit to open more aggressively depending on how bad the situation is.
|
|
648
|
+
|
|
649
|
+
This P is fed into a typical PID equation, and is used to control the rejection rate of the circuit breaker.
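To make the equation concrete, the sketch below computes P as written above and feeds it into a textbook PID step. Only `p_value` follows the README directly. `next_rejection_rate` is a hypothetical helper: how the real controller combines the terms, scales by the time step, or applies the dead zone is not shown here, so treat that part as an assumption.

```ruby
# Proportional term, exactly as written in the README equation above.
def p_value(error_rate:, ideal_error_rate:, rejection_rate:)
  delta_error = error_rate - ideal_error_rate
  delta_error - (1 - delta_error) * rejection_rate
end

# Hypothetical textbook PID step (an assumption, not Semian's controller).
# Returns the new rejection rate and the updated integral.
def next_rejection_rate(rejection_rate:, p_term:, previous_p:, integral:,
                        kp: 1.0, ki: 0.2, kd: 0.0)
  integral = (integral + p_term).clamp(-10.0, 10.0) # caps guard against integral windup
  derivative = p_term - previous_p
  output = kp * p_term + ki * integral + kd * derivative
  [(rejection_rate + output).clamp(0.0, 1.0), integral]
end

p_term = p_value(error_rate: 0.30, ideal_error_rate: 0.05, rejection_rate: 0.0)
rate, integral = next_rejection_rate(rejection_rate: 0.0, p_term: p_term,
                                     previous_p: 0.0, integral: 0.0)
```

With a 30% error rate, a 5% ideal error rate, and nothing rejected yet, `delta_error` is 0.25 and P is 0.25. P reaches zero once the rejection rate equals `delta_error / (1 - delta_error)`, which is about 0.33 in this example.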
|
|
650
|
+
|
|
651
|
+
##### Adaptive Circuit Breaker Configuration
|
|
652
|
+
|
|
653
|
+
To enable the adaptive circuit breaker, simply set **adaptive_circuit_breaker** to true.
|
|
654
|
+
|
|
655
|
+
Example configuration:
|
|
656
|
+
```ruby
|
|
657
|
+
Semian.register(
|
|
658
|
+
:my_service,
|
|
659
|
+
adaptive_circuit_breaker: true, # Use adaptive instead of traditional
|
|
660
|
+
bulkhead: false # Can be combined with bulkhead
|
|
661
|
+
)
|
|
662
|
+
```
|
|
663
|
+
|
|
664
|
+
**Note**: When `adaptive_circuit_breaker: true` is set, traditional circuit breaker
|
|
665
|
+
parameters (`error_threshold`, `error_timeout`, etc.) are ignored.
|
|
666
|
+
|
|
667
|
+
|
|
668
|
+
We **_highly_** recommend just setting that configuration and not any other.
|
|
669
|
+
One of the main goals of the adaptive circuit breaker is that it "just works".
|
|
670
|
+
Configuring it might be difficult and not provide much value. That said, here are the configurations you can set:
|
|
671
|
+
* **kp:** The contribution of P in the PID equation. Increasing it means you react more quickly to the latest data. Defaults to 1.0
|
|
672
|
+
* **ki**: The contribution of the integral in the PID equation. Increasing it means adding more "memory", which is useful for ignoring noise. Defaults to 0.2
|
|
673
|
+
* **kd**: The contribution of the derivative in the PID equation. Its behaviour can be complex because of our complex P equation. Defaults to 0.0
|
|
674
|
+
* **integral_upper_cap**: Maximum value of the integral, which prevents integral windup. Defaults to 10.0
|
|
675
|
+
* **integral_lower_cap**: Minimum value of the integral, which prevents integral windup. Defaults to -10.0
|
|
676
|
+
* **window_size**: How many seconds of observations to take into account. Note that this is a sliding window with a 1-second sliding interval. To change the sliding interval, set the environment variable `SEMIAN_ADAPTIVE_CIRCUIT_BREAKER_SLIDING_INTERVAL` (shared among all adaptive circuit breakers). Defaults to 10 seconds
|
|
677
|
+
* **dead_zone_ratio**: An error percentage above the ideal_error_rate to ignore. This helps remove noise. Defaults to 0.25
|
|
678
|
+
* **initial_error_rate**: The guess to start with for the ideal error rate. Defaults to 0.05 (5%)
|
|
679
|
+
* **ideal_error_rate_estimator_cap_value**: The value above which we ignore observations for the ideal error rate. Defaults to 0.1 (10%)
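The interaction between `window_size` and the sliding interval can be pictured as a list of per-interval buckets. The sketch below is a hypothetical illustration (the class `SlidingErrorWindow` and its methods are invented for this example) and assumes the oldest bucket is dropped each time the window slides.

```ruby
# Hypothetical sliding window of per-interval buckets; names are illustrative.
class SlidingErrorWindow
  def initialize(window_size: 10, sliding_interval: 1)
    @max_buckets = (window_size / sliding_interval.to_f).ceil
    @buckets = [{ success: 0, error: 0 }]
  end

  # outcome is :success or :error
  def record(outcome)
    @buckets.last[outcome] += 1
  end

  # Called once per sliding interval: start a new bucket, drop the oldest.
  def slide
    @buckets << { success: 0, error: 0 }
    @buckets.shift while @buckets.size > @max_buckets
  end

  def error_rate
    errors = @buckets.sum { |bucket| bucket[:error] }
    total = errors + @buckets.sum { |bucket| bucket[:success] }
    total.zero? ? 0.0 : errors.to_f / total
  end
end

window = SlidingErrorWindow.new(window_size: 2, sliding_interval: 1)
window.record(:error)
window.slide
3.times { window.record(:success) }
window.error_rate # 1 error out of 4 requests
```

A larger `window_size` smooths the error rate over more buckets, at the cost of reacting more slowly to a sudden change.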
|
|
680
|
+
|
|
606
681
|
### Bulkheading
|
|
607
682
|
|
|
608
683
|
For some applications, circuit breakers are not enough. This is best illustrated
|
data/lib/semian/adapter.rb
CHANGED
|
@@ -45,7 +45,7 @@ module Semian
       end
     rescue ::Semian::OpenCircuitError => error
       last_error = semian_resource.circuit_breaker.last_error
-      message = "#{error.message} caused by #{last_error
+      message = "#{error.message} caused by #{last_error&.message}"
       last_error = nil unless last_error.is_a?(Exception) # Net::HTTPServerError is not an exception
       raise self.class::CircuitOpenError.new(semian_identifier, message), cause: last_error
     rescue ::Semian::BaseError => error
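The added line relies on Ruby's safe navigation operator (`&.`), which returns `nil` instead of raising when the receiver is `nil`. A small sketch of the effect, using a hypothetical helper `circuit_open_message` that is not part of Semian:

```ruby
# Builds the message the same way as the added line, using &. so that a
# missing last_error does not raise NoMethodError.
def circuit_open_message(error, last_error)
  "#{error.message} caused by #{last_error&.message}"
end

open_error = RuntimeError.new("circuit open")
puts circuit_open_message(open_error, IOError.new("connection reset"))
puts circuit_open_message(open_error, nil)
```

When `last_error` is `nil`, the interpolation yields an empty string, so the message ends with "caused by " followed by nothing.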
|
|
data/lib/semian/adaptive_circuit_breaker.rb
|
@@ -0,0 +1,136 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require_relative "circuit_breaker_behaviour"
|
|
4
|
+
require_relative "pid_controller_thread"
|
|
5
|
+
|
|
6
|
+
module Semian
|
|
7
|
+
# Adaptive Circuit Breaker that uses PID controller for dynamic rejection
|
|
8
|
+
class AdaptiveCircuitBreaker
|
|
9
|
+
include CircuitBreakerBehaviour
|
|
10
|
+
|
|
11
|
+
attr_reader :pid_controller, :update_thread, :sliding_interval, :pid_controller_thread, :stopped
|
|
12
|
+
|
|
13
|
+
@pid_controller_thread = nil
|
|
14
|
+
|
|
15
|
+
def initialize(name:, exceptions:, kp:, ki:, kd:, window_size:, initial_error_rate:, implementation:,
|
|
16
|
+
sliding_interval:, dead_zone_ratio:, ideal_error_rate_estimator_cap_value:, integral_upper_cap:,
|
|
17
|
+
integral_lower_cap:)
|
|
18
|
+
initialize_behaviour(name: name)
|
|
19
|
+
|
|
20
|
+
@exceptions = exceptions
|
|
21
|
+
@stopped = false
|
|
22
|
+
|
|
23
|
+
@pid_controller = implementation::PIDController.new(
|
|
24
|
+
kp: kp,
|
|
25
|
+
ki: ki,
|
|
26
|
+
kd: kd,
|
|
27
|
+
window_size: window_size,
|
|
28
|
+
implementation: implementation,
|
|
29
|
+
sliding_interval: sliding_interval,
|
|
30
|
+
initial_error_rate: initial_error_rate,
|
|
31
|
+
dead_zone_ratio: dead_zone_ratio,
|
|
32
|
+
ideal_error_rate_estimator_cap_value: ideal_error_rate_estimator_cap_value,
|
|
33
|
+
integral_upper_cap: integral_upper_cap,
|
|
34
|
+
integral_lower_cap: integral_lower_cap,
|
|
35
|
+
)
|
|
36
|
+
|
|
37
|
+
@pid_controller_thread = PIDControllerThread.instance.register_resource(self)
|
|
38
|
+
end
|
|
39
|
+
|
|
40
|
+
def acquire(resource = nil, scope: nil, adapter: nil, &block)
|
|
41
|
+
unless request_allowed?
|
|
42
|
+
mark_rejected(scope:, adapter:)
|
|
43
|
+
raise OpenCircuitError, "Rejected by adaptive circuit breaker"
|
|
44
|
+
end
|
|
45
|
+
|
|
46
|
+
result = nil
|
|
47
|
+
begin
|
|
48
|
+
result = block.call
|
|
49
|
+
rescue *@exceptions => error
|
|
50
|
+
if !error.respond_to?(:marks_semian_circuits?) || error.marks_semian_circuits?
|
|
51
|
+
mark_failed(error, scope:, adapter:)
|
|
52
|
+
end
|
|
53
|
+
raise error
|
|
54
|
+
else
|
|
55
|
+
mark_success(scope:, adapter:)
|
|
56
|
+
end
|
|
57
|
+
result
|
|
58
|
+
end
|
|
59
|
+
|
|
60
|
+
def reset(scope: nil, adapter: nil)
|
|
61
|
+
@last_error = nil
|
|
62
|
+
@pid_controller.reset
|
|
63
|
+
end
|
|
64
|
+
|
|
65
|
+
def stop
|
|
66
|
+
destroy
|
|
67
|
+
end
|
|
68
|
+
|
|
69
|
+
def destroy
|
|
70
|
+
@stopped = true
|
|
71
|
+
PIDControllerThread.instance.unregister_resource(self)
|
|
72
|
+
@pid_controller.reset
|
|
73
|
+
end
|
|
74
|
+
|
|
75
|
+
def metrics
|
|
76
|
+
@pid_controller.metrics
|
|
77
|
+
end
|
|
78
|
+
|
|
79
|
+
def open?
|
|
80
|
+
@pid_controller.rejection_rate == 1
|
|
81
|
+
end
|
|
82
|
+
|
|
83
|
+
def closed?
|
|
84
|
+
@pid_controller.rejection_rate == 0
|
|
85
|
+
end
|
|
86
|
+
|
|
87
|
+
# Compatibility with ProtectedResource - Adaptive circuit breaker does not have a half open state
|
|
88
|
+
def half_open?
|
|
89
|
+
!open? && !closed?
|
|
90
|
+
end
|
|
91
|
+
|
|
92
|
+
def mark_failed(error, scope: nil, adapter: nil)
|
|
93
|
+
@last_error = error
|
|
94
|
+
@pid_controller.record_request(:error)
|
|
95
|
+
end
|
|
96
|
+
|
|
97
|
+
def mark_success(scope: nil, adapter: nil)
|
|
98
|
+
@pid_controller.record_request(:success)
|
|
99
|
+
end
|
|
100
|
+
|
|
101
|
+
def mark_rejected(scope: nil, adapter: nil)
|
|
102
|
+
@pid_controller.record_request(:rejected)
|
|
103
|
+
end
|
|
104
|
+
|
|
105
|
+
def request_allowed?
|
|
106
|
+
!@pid_controller.should_reject?
|
|
107
|
+
end
|
|
108
|
+
|
|
109
|
+
def in_use?
|
|
110
|
+
true
|
|
111
|
+
end
|
|
112
|
+
|
|
113
|
+
def pid_controller_update
|
|
114
|
+
@pid_controller.update
|
|
115
|
+
notify_metrics_update(@pid_controller.metrics(full: false))
|
|
116
|
+
end
|
|
117
|
+
|
|
118
|
+
private
|
|
119
|
+
|
|
120
|
+
def notify_metrics_update(metrics)
|
|
121
|
+
Semian.notify(
|
|
122
|
+
:adaptive_update,
|
|
123
|
+
self,
|
|
124
|
+
nil,
|
|
125
|
+
nil,
|
|
126
|
+
rejection_rate: metrics[:rejection_rate],
|
|
127
|
+
error_rate: metrics[:error_rate],
|
|
128
|
+
ideal_error_rate: metrics[:ideal_error_rate],
|
|
129
|
+
p_value: metrics[:p_value],
|
|
130
|
+
integral: metrics[:integral],
|
|
131
|
+
derivative: metrics[:derivative],
|
|
132
|
+
previous_p_value: metrics[:previous_p_value],
|
|
133
|
+
)
|
|
134
|
+
end
|
|
135
|
+
end
|
|
136
|
+
end
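In the class above, `open?`, `closed?`, and `half_open?` are all derived from the controller's rejection rate. The sketch below mirrors that mapping with a stand-in controller. `FakeController` and `state_for` are hypothetical, and the probabilistic `should_reject?` is an assumption, since the PID controller's own implementation is not part of this file.

```ruby
# Stand-in for the PID controller: it only exposes a rejection rate and a
# hypothetical random rejection decision.
class FakeController
  attr_accessor :rejection_rate

  def initialize(rejection_rate, random: Random.new(42))
    @rejection_rate = rejection_rate
    @random = random
  end

  # Assumption: reject with probability equal to the rejection rate.
  def should_reject?
    @random.rand < @rejection_rate
  end
end

# Same comparisons as open?, closed?, and half_open? in the class above.
def state_for(controller)
  return :open if controller.rejection_rate == 1
  return :closed if controller.rejection_rate == 0

  :half_open
end

controller = FakeController.new(0.3)
rejected = 10_000.times.count { controller.should_reject? }
puts "state=#{state_for(controller)} rejected=#{rejected} of 10000"
```

Under this assumption, "half open" means that a fraction of requests is being rejected, rather than that a probe request is in flight as in the traditional breaker.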
|
|
data/lib/semian/circuit_breaker.rb
CHANGED
|
@@ -1,33 +1,43 @@
|
|
|
1
1
|
# frozen_string_literal: true
|
|
2
2
|
|
|
3
|
+
require_relative "circuit_breaker_behaviour"
|
|
4
|
+
|
|
3
5
|
module Semian
|
|
4
6
|
class CircuitBreaker # :nodoc:
|
|
7
|
+
include CircuitBreakerBehaviour
|
|
5
8
|
extend Forwardable
|
|
6
9
|
|
|
7
10
|
def_delegators :@state, :closed?, :open?, :half_open?
|
|
8
11
|
|
|
9
12
|
attr_reader(
|
|
10
|
-
:name,
|
|
11
13
|
:half_open_resource_timeout,
|
|
12
14
|
:error_timeout,
|
|
13
15
|
:state,
|
|
14
|
-
:last_error,
|
|
15
16
|
:error_threshold_timeout_enabled,
|
|
17
|
+
:exponential_backoff_error_timeout,
|
|
18
|
+
:exponential_backoff_initial_timeout,
|
|
19
|
+
:exponential_backoff_multiplier,
|
|
16
20
|
)
|
|
17
21
|
|
|
18
22
|
def initialize(name, exceptions:, success_threshold:, error_threshold:,
|
|
19
23
|
error_timeout:, implementation:, half_open_resource_timeout: nil,
|
|
20
24
|
error_threshold_timeout: nil, error_threshold_timeout_enabled: true,
|
|
21
|
-
lumping_interval: 0
|
|
22
|
-
|
|
25
|
+
lumping_interval: 0, exponential_backoff_error_timeout: false,
|
|
26
|
+
exponential_backoff_initial_timeout: 1, exponential_backoff_multiplier: 2)
|
|
27
|
+
initialize_behaviour(name: name)
|
|
28
|
+
|
|
29
|
+
@exceptions = exceptions
|
|
23
30
|
@success_count_threshold = success_threshold
|
|
24
31
|
@error_count_threshold = error_threshold
|
|
25
32
|
@error_threshold_timeout = error_threshold_timeout || error_timeout
|
|
26
33
|
@error_threshold_timeout_enabled = error_threshold_timeout_enabled.nil? ? true : error_threshold_timeout_enabled
|
|
27
34
|
@error_timeout = error_timeout
|
|
28
|
-
@exceptions = exceptions
|
|
29
35
|
@half_open_resource_timeout = half_open_resource_timeout
|
|
30
36
|
@lumping_interval = lumping_interval
|
|
37
|
+
@exponential_backoff_error_timeout = exponential_backoff_error_timeout
|
|
38
|
+
@exponential_backoff_initial_timeout = exponential_backoff_initial_timeout
|
|
39
|
+
@exponential_backoff_multiplier = exponential_backoff_multiplier
|
|
40
|
+
@current_error_timeout = exponential_backoff_error_timeout ? exponential_backoff_initial_timeout : error_timeout
|
|
31
41
|
|
|
32
42
|
@errors = implementation::SlidingWindow.new(max_size: @error_count_threshold)
|
|
33
43
|
@successes = implementation::Integer.new
|
|
@@ -36,8 +46,8 @@ module Semian
|
|
|
36
46
|
reset
|
|
37
47
|
end
|
|
38
48
|
|
|
39
|
-
def acquire(resource = nil, &block)
|
|
40
|
-
transition_to_half_open if transition_to_half_open?
|
|
49
|
+
def acquire(resource = nil, scope: nil, adapter: nil, &block)
|
|
50
|
+
transition_to_half_open(scope: scope, adapter: adapter) if transition_to_half_open?
|
|
41
51
|
|
|
42
52
|
raise OpenCircuitError unless request_allowed?
|
|
43
53
|
|
|
@@ -46,11 +56,11 @@ module Semian
|
|
|
46
56
|
result = maybe_with_half_open_resource_timeout(resource, &block)
|
|
47
57
|
rescue *@exceptions => error
|
|
48
58
|
if !error.respond_to?(:marks_semian_circuits?) || error.marks_semian_circuits?
|
|
49
|
-
mark_failed(error)
|
|
59
|
+
mark_failed(error, scope: scope, adapter: adapter)
|
|
50
60
|
end
|
|
51
61
|
raise error
|
|
52
62
|
else
|
|
53
|
-
mark_success
|
|
63
|
+
mark_success(scope: scope, adapter: adapter)
|
|
54
64
|
end
|
|
55
65
|
result
|
|
56
66
|
end
|
|
@@ -63,26 +73,26 @@ module Semian
|
|
|
63
73
|
closed? || half_open? || transition_to_half_open?
|
|
64
74
|
end
|
|
65
75
|
|
|
66
|
-
def mark_failed(error)
|
|
76
|
+
def mark_failed(error, scope: nil, adapter: nil)
|
|
67
77
|
push_error(error)
|
|
68
78
|
if closed?
|
|
69
|
-
transition_to_open if error_threshold_reached?
|
|
79
|
+
transition_to_open(scope: scope, adapter: adapter) if error_threshold_reached?
|
|
70
80
|
elsif half_open?
|
|
71
|
-
transition_to_open
|
|
81
|
+
transition_to_open(scope: scope, adapter: adapter)
|
|
72
82
|
end
|
|
73
83
|
end
|
|
74
84
|
|
|
75
|
-
def mark_success
|
|
85
|
+
def mark_success(scope: nil, adapter: nil)
|
|
76
86
|
return unless half_open?
|
|
77
87
|
|
|
78
88
|
@successes.increment
|
|
79
|
-
transition_to_close if success_threshold_reached?
|
|
89
|
+
transition_to_close(scope: scope, adapter: adapter) if success_threshold_reached?
|
|
80
90
|
end
|
|
81
91
|
|
|
82
|
-
def reset
|
|
92
|
+
def reset(scope: nil, adapter: nil)
|
|
83
93
|
@errors.clear
|
|
84
94
|
@successes.reset
|
|
85
|
-
transition_to_close
|
|
95
|
+
transition_to_close(scope: scope, adapter: adapter)
|
|
86
96
|
end
|
|
87
97
|
|
|
88
98
|
def destroy
|
|
@@ -97,24 +107,30 @@ module Semian
|
|
|
97
107
|
|
|
98
108
|
private
|
|
99
109
|
|
|
100
|
-
def transition_to_close
|
|
101
|
-
notify_state_transition(:closed)
|
|
110
|
+
def transition_to_close(scope: nil, adapter: nil)
|
|
111
|
+
notify_state_transition(:closed, scope: scope, adapter: adapter)
|
|
102
112
|
log_state_transition(:closed)
|
|
103
113
|
@state.close!
|
|
104
114
|
@errors.clear
|
|
115
|
+
# Reset exponential backoff when circuit closes
|
|
116
|
+
@current_error_timeout = @exponential_backoff_error_timeout ? @exponential_backoff_initial_timeout : @error_timeout
|
|
105
117
|
end
|
|
106
118
|
|
|
107
|
-
def transition_to_open
|
|
108
|
-
notify_state_transition(:open)
|
|
119
|
+
def transition_to_open(scope: nil, adapter: nil)
|
|
120
|
+
notify_state_transition(:open, scope: scope, adapter: adapter)
|
|
109
121
|
log_state_transition(:open)
|
|
110
122
|
@state.open!
|
|
111
123
|
end
|
|
112
124
|
|
|
113
|
-
def transition_to_half_open
|
|
114
|
-
notify_state_transition(:half_open)
|
|
125
|
+
def transition_to_half_open(scope: nil, adapter: nil)
|
|
126
|
+
notify_state_transition(:half_open, scope: scope, adapter: adapter)
|
|
115
127
|
log_state_transition(:half_open)
|
|
116
128
|
@state.half_open!
|
|
117
129
|
@successes.reset
|
|
130
|
+
# Multiply the backoff timeout when circuit opens (up to the max error_timeout)
|
|
131
|
+
if @exponential_backoff_error_timeout && @current_error_timeout < @error_timeout
|
|
132
|
+
@current_error_timeout = [@current_error_timeout * @exponential_backoff_multiplier, @error_timeout].min
|
|
133
|
+
end
|
|
118
134
|
end
|
|
119
135
|
|
|
120
136
|
def success_threshold_reached?
|
|
@@ -129,7 +145,7 @@ module Semian
|
|
|
129
145
|
last_error_time = @errors.last
|
|
130
146
|
return false unless last_error_time
|
|
131
147
|
|
|
132
|
-
last_error_time + @
|
|
148
|
+
last_error_time + @current_error_timeout < Process.clock_gettime(Process::CLOCK_MONOTONIC)
|
|
133
149
|
end
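The comparison in this hunk now uses `@current_error_timeout` against the monotonic clock. A standalone sketch of the same check follows. `error_timeout_expired?` is a hypothetical helper, and the `now:` keyword exists only so the example can be exercised with fixed values.

```ruby
# Hypothetical helper mirroring the comparison in the hunk above.
def error_timeout_expired?(last_error_time, current_error_timeout,
                           now: Process.clock_gettime(Process::CLOCK_MONOTONIC))
  return false unless last_error_time

  last_error_time + current_error_timeout < now
end

error_timeout_expired?(100.0, 5, now: 106.0) # => true
error_timeout_expired?(100.0, 5, now: 104.0) # => false
error_timeout_expired?(nil, 5)               # => false
```

A monotonic clock is a natural fit here because it is not affected by wall-clock adjustments, so a timeout cannot expire early or late after a system time change.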
|
|
134
150
|
|
|
135
151
|
def push_error(error)
|
|
@@ -153,6 +169,9 @@ module Semian
|
|
|
153
169
|
str += " success_count_threshold=#{@success_count_threshold}"
|
|
154
170
|
str += " error_count_threshold=#{@error_count_threshold}"
|
|
155
171
|
str += " error_timeout=#{@error_timeout} error_last_at=\"#{@errors.last}\""
|
|
172
|
+
if @exponential_backoff_error_timeout
|
|
173
|
+
str += " current_error_timeout=#{@current_error_timeout}"
|
|
174
|
+
end
|
|
156
175
|
str += " name=\"#{@name}\""
|
|
157
176
|
if new_state == :open && @last_error
|
|
158
177
|
str += " last_error_message=#{@last_error.message.inspect}"
|
|
@@ -161,8 +180,8 @@ module Semian
|
|
|
161
180
|
Semian.logger.info(str)
|
|
162
181
|
end
|
|
163
182
|
|
|
164
|
-
def notify_state_transition(new_state)
|
|
165
|
-
Semian.notify(:state_change, self,
|
|
183
|
+
def notify_state_transition(new_state, scope: nil, adapter: nil)
|
|
184
|
+
Semian.notify(:state_change, self, scope, adapter, state: new_state)
|
|
166
185
|
end
|
|
167
186
|
|
|
168
187
|
def maybe_with_half_open_resource_timeout(resource, &block)
|
|
data/lib/semian/circuit_breaker_behaviour.rb
|
@@ -0,0 +1,64 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module Semian
|
|
4
|
+
module CircuitBreakerBehaviour
|
|
5
|
+
attr_reader :name, :last_error
|
|
6
|
+
attr_accessor :exceptions
|
|
7
|
+
|
|
8
|
+
# Initialize common circuit breaker attributes
|
|
9
|
+
def initialize_behaviour(name:)
|
|
10
|
+
@name = name.to_sym
|
|
11
|
+
@last_error = nil
|
|
12
|
+
end
|
|
13
|
+
|
|
14
|
+
# Main method to execute a block with circuit breaker protection
|
|
15
|
+
def acquire(resource = nil, scope: nil, adapter: nil, &block)
|
|
16
|
+
raise NotImplementedError, "#{self.class} must implement #acquire"
|
|
17
|
+
end
|
|
18
|
+
|
|
19
|
+
# Reset the circuit breaker to its initial state
|
|
20
|
+
def reset(scope: nil, adapter: nil)
|
|
21
|
+
raise NotImplementedError, "#{self.class} must implement #reset"
|
|
22
|
+
end
|
|
23
|
+
|
|
24
|
+
# Clean up resources
|
|
25
|
+
def destroy
|
|
26
|
+
raise NotImplementedError, "#{self.class} must implement #destroy"
|
|
27
|
+
end
|
|
28
|
+
|
|
29
|
+
# Check if the circuit is open (rejecting requests)
|
|
30
|
+
def open?
|
|
31
|
+
raise NotImplementedError, "#{self.class} must implement #open?"
|
|
32
|
+
end
|
|
33
|
+
|
|
34
|
+
# Check if the circuit is closed (allowing requests)
|
|
35
|
+
def closed?
|
|
36
|
+
raise NotImplementedError, "#{self.class} must implement #closed?"
|
|
37
|
+
end
|
|
38
|
+
|
|
39
|
+
# Check if the circuit is half-open (testing if service recovered)
|
|
40
|
+
def half_open?
|
|
41
|
+
raise NotImplementedError, "#{self.class} must implement #half_open?"
|
|
42
|
+
end
|
|
43
|
+
|
|
44
|
+
# Check if requests are currently allowed
|
|
45
|
+
def request_allowed?
|
|
46
|
+
raise NotImplementedError, "#{self.class} must implement #request_allowed?"
|
|
47
|
+
end
|
|
48
|
+
|
|
49
|
+
# Mark a request as failed
|
|
50
|
+
def mark_failed(error, scope: nil, adapter: nil)
|
|
51
|
+
raise NotImplementedError, "#{self.class} must implement #mark_failed"
|
|
52
|
+
end
|
|
53
|
+
|
|
54
|
+
# Mark a request as successful
|
|
55
|
+
def mark_success(scope: nil, adapter: nil)
|
|
56
|
+
raise NotImplementedError, "#{self.class} must implement #mark_success"
|
|
57
|
+
end
|
|
58
|
+
|
|
59
|
+
# Check if the circuit breaker is actively tracking failures
|
|
60
|
+
def in_use?
|
|
61
|
+
raise NotImplementedError, "#{self.class} must implement #in_use?"
|
|
62
|
+
end
|
|
63
|
+
end
|
|
64
|
+
end
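The module above acts as an interface: every method raises `NotImplementedError` until the including class overrides it. A reduced sketch of the pattern, using a stand-in module and a hypothetical `AlwaysClosedBreaker` class so that it runs without Semian:

```ruby
# Stand-in for the shared module: default methods raise until overridden.
module BreakerBehaviourSketch
  attr_reader :name, :last_error

  def initialize_behaviour(name:)
    @name = name.to_sym
    @last_error = nil
  end

  def open?
    raise NotImplementedError, "#{self.class} must implement #open?"
  end

  def request_allowed?
    raise NotImplementedError, "#{self.class} must implement #request_allowed?"
  end
end

# Hypothetical breaker that never opens; it leaves #open? unimplemented on purpose.
class AlwaysClosedBreaker
  include BreakerBehaviourSketch

  def initialize(name)
    initialize_behaviour(name: name)
  end

  def request_allowed?
    true
  end
end

breaker = AlwaysClosedBreaker.new("db")
breaker.request_allowed? # => true
begin
  breaker.open?
rescue NotImplementedError => e
  puts e.message # prints "AlwaysClosedBreaker must implement #open?"
end
```

One detail to keep in mind: in Ruby, `NotImplementedError` descends from `ScriptError` rather than `StandardError`, so a bare `rescue` does not catch it and it has to be rescued explicitly.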
|
|
data/lib/semian/configuration_validator.rb
CHANGED
|
@@ -66,6 +66,7 @@ module Semian
|
|
|
66
66
|
def validate_circuit_breaker_configuration!
|
|
67
67
|
return if ENV.key?("SEMIAN_CIRCUIT_BREAKER_DISABLED")
|
|
68
68
|
return unless @configuration.fetch(:circuit_breaker, true)
|
|
69
|
+
return if @configuration[:adaptive_circuit_breaker] # Skip traditional validation if using adaptive
|
|
69
70
|
|
|
70
71
|
require_keys!([:success_threshold, :error_threshold, :error_timeout], @configuration)
|
|
71
72
|
validate_thresholds!
|
|
@@ -103,6 +104,7 @@ module Semian
|
|
|
103
104
|
error_threshold = @configuration[:error_threshold]
|
|
104
105
|
lumping_interval = @configuration[:lumping_interval]
|
|
105
106
|
half_open_resource_timeout = @configuration[:half_open_resource_timeout]
|
|
107
|
+
exponential_backoff_error_timeout = @configuration[:exponential_backoff_error_timeout]
|
|
106
108
|
|
|
107
109
|
unless error_timeout.is_a?(Numeric) && error_timeout > 0
|
|
108
110
|
err = "error_timeout must be a positive number, got #{error_timeout}"
|
|
@@ -174,6 +176,56 @@ module Semian
|
|
|
174
176
|
|
|
175
177
|
raise_or_log_validation_required!(err)
|
|
176
178
|
end
|
|
179
|
+
|
|
180
|
+
unless exponential_backoff_error_timeout.nil? || [true, false].include?(exponential_backoff_error_timeout)
|
|
181
|
+
err = "exponential_backoff_error_timeout must be a boolean, got #{exponential_backoff_error_timeout}"
|
|
182
|
+
err += hint_format("Use true to enable exponential backoff for error timeout. Use false to disable.")
|
|
183
|
+
|
|
184
|
+
raise_or_log_validation_required!(err)
|
|
185
|
+
end
|
|
186
|
+
|
|
187
|
+
# Validate exponential backoff initial timeout
|
|
188
|
+
exponential_backoff_initial_timeout = @configuration[:exponential_backoff_initial_timeout]
|
|
189
|
+
unless exponential_backoff_initial_timeout.nil? || (exponential_backoff_initial_timeout.is_a?(Numeric) && exponential_backoff_initial_timeout > 0)
|
|
190
|
+
err = "exponential_backoff_initial_timeout must be a positive number, got #{exponential_backoff_initial_timeout}"
|
|
191
|
+
err += hint_format("This is the initial timeout when exponential backoff is enabled. Must be less than error_timeout.")
|
|
192
|
+
|
|
193
|
+
raise_or_log_validation_required!(err)
|
|
194
|
+
end
|
|
195
|
+
|
|
196
|
+
# Validate exponential backoff multiplier
|
|
197
|
+
exponential_backoff_multiplier = @configuration[:exponential_backoff_multiplier]
|
|
198
|
+
unless exponential_backoff_multiplier.nil? || (exponential_backoff_multiplier.is_a?(Numeric) && exponential_backoff_multiplier > 1)
|
|
199
|
+
err = "exponential_backoff_multiplier must be a number greater than 1, got #{exponential_backoff_multiplier}"
|
|
200
|
+
err += hint_format("This is the factor by which the timeout increases on each subsequent opening. Common values are 2 (double) or 1.5.")
|
|
201
|
+
|
|
202
|
+
raise_or_log_validation_required!(err)
|
|
203
|
+
end
|
|
204
|
+
|
|
205
|
+
# Ensure exponential backoff parameters are only provided when exponential_backoff_error_timeout is true
|
|
206
|
+
unless exponential_backoff_error_timeout
|
|
207
|
+
if exponential_backoff_initial_timeout
|
|
208
|
+
err = "exponential_backoff_initial_timeout can only be specified when exponential_backoff_error_timeout is true"
|
|
209
|
+
err += hint_format("Set exponential_backoff_error_timeout: true to use exponential backoff features.")
|
|
210
|
+
|
|
211
|
+
raise_or_log_validation_required!(err)
|
|
212
|
+
end
|
|
213
|
+
|
|
214
|
+
if exponential_backoff_multiplier
|
|
215
|
+
err = "exponential_backoff_multiplier can only be specified when exponential_backoff_error_timeout is true"
|
|
216
|
+
err += hint_format("Set exponential_backoff_error_timeout: true to use exponential backoff features.")
|
|
217
|
+
|
|
218
|
+
raise_or_log_validation_required!(err)
|
|
219
|
+
end
|
|
220
|
+
end
|
|
221
|
+
|
|
222
|
+
# Ensure initial timeout is less than error_timeout when using exponential backoff
|
|
223
|
+
if exponential_backoff_error_timeout && exponential_backoff_initial_timeout && exponential_backoff_initial_timeout >= error_timeout
|
|
224
|
+
err = "exponential_backoff_initial_timeout (#{exponential_backoff_initial_timeout}) must be less than error_timeout (#{error_timeout})"
|
|
225
|
+
err += hint_format("The initial timeout should be smaller than the maximum timeout for exponential backoff to be effective.")
|
|
226
|
+
|
|
227
|
+
raise_or_log_validation_required!(err)
|
|
228
|
+
end
|
|
177
229
|
end
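The exponential backoff rules added to the validator can be summarized as a standalone function. This is a simplified sketch: `backoff_config_errors` is hypothetical, returns short messages instead of calling `raise_or_log_validation_required!`, and assumes `error_timeout` has already been validated as a positive number.

```ruby
# Hypothetical standalone version of the exponential backoff checks; it
# returns error messages instead of raising or logging.
def backoff_config_errors(config)
  enabled = config[:exponential_backoff_error_timeout]
  initial = config[:exponential_backoff_initial_timeout]
  multiplier = config[:exponential_backoff_multiplier]
  error_timeout = config[:error_timeout]
  errors = []

  unless enabled.nil? || [true, false].include?(enabled)
    errors << "exponential_backoff_error_timeout must be a boolean"
  end
  unless initial.nil? || (initial.is_a?(Numeric) && initial > 0)
    errors << "exponential_backoff_initial_timeout must be a positive number"
  end
  unless multiplier.nil? || (multiplier.is_a?(Numeric) && multiplier > 1)
    errors << "exponential_backoff_multiplier must be a number greater than 1"
  end

  # The tuning options are only accepted when the feature is switched on.
  unless enabled
    errors << "exponential_backoff_initial_timeout requires exponential_backoff_error_timeout: true" if initial
    errors << "exponential_backoff_multiplier requires exponential_backoff_error_timeout: true" if multiplier
  end

  # The starting timeout has to leave room to grow towards error_timeout.
  if enabled && initial.is_a?(Numeric) && initial >= error_timeout
    errors << "exponential_backoff_initial_timeout must be less than error_timeout"
  end

  errors
end

p backoff_config_errors(error_timeout: 10, exponential_backoff_multiplier: 2)
```

The call at the end reports one error, because a multiplier is given while `exponential_backoff_error_timeout` is not set.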
|
|
178
230
|
|
|
179
231
|
def validate_quota!(quota)
|