RubyGems - semian - Versions diffs - 0.27.1 → 0.28.0 - Mend

semian 0.27.1 → 0.28.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

checksums.yaml +4 -4
data/README.md +71 -0
data/lib/semian/adapter.rb +1 -1
data/lib/semian/adaptive_circuit_breaker.rb +136 -0
data/lib/semian/circuit_breaker.rb +25 -23
data/lib/semian/circuit_breaker_behaviour.rb +64 -0
data/lib/semian/configuration_validator.rb +1 -0
data/lib/semian/dual_circuit_breaker.rb +165 -0
data/lib/semian/mysql2.rb +2 -2
data/lib/semian/net_http.rb +3 -3
data/lib/semian/pid_controller.rb +217 -0
data/lib/semian/pid_controller_thread.rb +72 -0
data/lib/semian/protected_resource.rb +1 -1
data/lib/semian/simple_exponential_smoother.rb +137 -0
data/lib/semian/unprotected_resource.rb +3 -3
data/lib/semian/version.rb +1 -1
data/lib/semian.rb +64 -4
metadata +8 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: b84efc35cf9e47382fc7ac82d69ff0bfb55028eb79c8e4d76127b1e4ee8b7053
-  data.tar.gz: 0e16ab39a45133600134574abd65012dae026cfa1f1020695974edf46d00d41d
+  metadata.gz: c6423964e3bf474c3f1c6c31ab4c52a14fc28a35d51d91480c537f46f0c1e5f2
+  data.tar.gz: a7ec4ad1154ceef88bdb3dd2bbd40d419d8dc3162a81a17804afdd9573d380be
 SHA512:
-  metadata.gz: abb82539122c2b4ef05420bc996d67cbcd0da9cafd3c735cd85547dd3691af6a6e287f52034d9e2870195ee13708f94e3edf68ad11962e2ad0b6c9023e4243ec
-  data.tar.gz: 9041cb2e5834339f584c951558ea071dfd91afc1502971495ab55890963c0ce401ac5b7b1b51611f1d51a3d81618dafb2f2ab648ec8811d7f20f999face6653d
+  metadata.gz: ccc4f1217740efde74e84acf26c2f73de7ac9c088575ac03d7be5408bc1c2711cd95ffd27a9b524071db95de3ce24ad3673b5a4f960a4d96b6b6c1c083c85183
+  data.tar.gz: 1f94bc2139a6397ba5a963a230496ab694f98cb1ae29e42b69017972d2268afc97c8fdde0d49ffd8d6514036199c15a5d62f67439a6790ed9d4696771ddf7b8e

data/README.md CHANGED

@@ -607,6 +607,77 @@ It is possible to disable Circuit Breaker with environment variable
 For more information about configuring these parameters, please read
 [this post](https://shopify.engineering/circuit-breaker-misconfigured).
+#### Adaptive Circuit Breaker (Experimental)
+Semian also includes an experimental adaptive circuit breaker that uses a [PID controller](https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller)
+to dynamically adjust the rejection rate based on real-time error rates. Unlike the
+traditional circuit breaker with fixed thresholds, the adaptive circuit breaker continuously
+monitors error rates and adjusts its behavior accordingly.
+##### How It Works
+The adaptive circuit breaker has two components:
+1. An ideal error rate estimator that determines when the service is starting to become unhealthy
+2. A PID controller that opens the circuit fully or partially based on how bad the situation is.
+The ideal error rate estimator uses a "simple exponential smoother", which means it takes the average error rate
+that it observes as the ideal, with the following caveats:
+1. It ignores any observation that is too high. For example, we know that a 20% error rate is anomalous,
+so we ignore it.
+1. It starts with an educated guess about the ideal error rate,
+and then converges down quickly if it observes a lower error rate, and slowly if it observes a higher error rate.
+1. After 30 minutes, it becomes more confident in its guess, and thus converges even more slowly in either direction.
+The PID controller uses the following equation to determine whether to open or close the circuit:
+```
+P = (error_rate - ideal_error_rate) - (1 - (error_rate - ideal_error_rate)) * rejection_rate
+```
+Or, more simply, if you define `delta_error = error_rate - ideal_error_rate` then:
+```
+P = delta_error - (1 - delta_error) * rejection_rate
+```
+In simple terms, this equation says: open more when the error rate is higher than the rejection rate,
+and less when the opposite is true. The multiplier `(1 - delta_error)` is called the aggressiveness multiplier.
+It allows the circuit to open more aggressively depending on how bad the situation is.
+This P is fed into a typical PID equation, and is used to control the rejection rate of the circuit breaker.
+##### Adaptive Circuit Breaker Configuration
+To enable the adaptive circuit breaker, simply set **adaptive_circuit_breaker** to true.
+Example configuration:
+```ruby
+Semian.register(
+  :my_service,
+  adaptive_circuit_breaker: true,  # Use adaptive instead of traditional
+  bulkhead: false                   # Can be combined with bulkhead
+)
+```
+**Note**: When `adaptive_circuit_breaker: true` is set, traditional circuit breaker
+parameters (`error_threshold`, `error_timeout`, etc.) are ignored.
+We **_highly_** recommend just setting that configuration and not any other.
+One of the main goals of the adaptive circuit breaker is that it "just works".
+Configuring it might be difficult and not provide much value. That said, here are the configurations you can set:
+* **kp**: The contribution of P in the PID equation. Increasing it means you react more quickly to the latest data. Defaults to 1.0
+* **ki**: The contribution of the integral in the PID equation. Increasing it means adding more "memory", which is useful for ignoring noise. Defaults to 0.2
+* **kd**: The contribution of the derivative in the PID equation. Its behaviour can be complex because of our complex P equation. Defaults to 0.0
+* **integral_upper_cap**: Maximum value of the integral, prevents integral windup. Defaults to 10.0
+* **integral_lower_cap**: Minimum value of the integral, prevents integral windup. Defaults to -10.0
+* **window_size**: How many seconds of observations to take into account. Note that this is a sliding window with a sliding interval of 1 second. To control the sliding interval, set the environment variable SEMIAN_ADAPTIVE_CIRCUIT_BREAKER_SLIDING_INTERVAL (shared among all adaptive circuit breakers). Defaults to 10 seconds
+* **dead_zone_ratio**: An error percentage above the ideal_error_rate to ignore. This helps remove noise. Defaults to 0.25
+* **initial_error_rate**: The guess to start with for the ideal error rate. Defaults to 0.05 (5%)
+* **ideal_error_rate_estimator_cap_value**: The value above which we ignore observations for the ideal error rate. Defaults to 0.1 (10%)
 ### Bulkheading
 For some applications, circuit breakers are not enough. This is best illustrated
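
The proportional term described in the README hunk above can be worked through numerically. This is a minimal sketch rather than Semian's implementation: the method name `p_value` and the sample rates are hypothetical, and only the formula is taken from the README text.

```ruby
# Hypothetical helper illustrating the README formula:
#   P = delta_error - (1 - delta_error) * rejection_rate
def p_value(error_rate:, ideal_error_rate:, rejection_rate:)
  delta_error = error_rate - ideal_error_rate
  delta_error - (1 - delta_error) * rejection_rate
end

# Errors are above the ideal and nothing is rejected yet: P is positive.
opening = p_value(error_rate: 0.20, ideal_error_rate: 0.05, rejection_rate: 0.0)

# Same error rate, but half the traffic is already rejected: P is negative.
closing = p_value(error_rate: 0.20, ideal_error_rate: 0.05, rejection_rate: 0.5)

puts format("opening P = %.3f, closing P = %.3f", opening, closing)
```

With these sample numbers the first call yields roughly 0.15 and the second roughly -0.275, which shows the feedback effect: the same error rate produces a smaller (here negative) P once the rejection rate has risen.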

data/lib/semian/adapter.rb CHANGED

@@ -45,7 +45,7 @@ module Semian
       end
     rescue ::Semian::OpenCircuitError => error
       last_error = semian_resource.circuit_breaker.last_error
-      message = "#{error.message} caused by #{last_error.message}"
+      message = "#{error.message} caused by #{last_error&.message}"
       last_error = nil unless last_error.is_a?(Exception) # Net::HTTPServerError is not an exception
       raise self.class::CircuitOpenError.new(semian_identifier, message), cause: last_error
     rescue ::Semian::BaseError => error

data/lib/semian/adaptive_circuit_breaker.rb ADDED

@@ -0,0 +1,136 @@
+# frozen_string_literal: true
+require_relative "circuit_breaker_behaviour"
+require_relative "pid_controller_thread"
+module Semian
+  # Adaptive Circuit Breaker that uses PID controller for dynamic rejection
+  class AdaptiveCircuitBreaker
+    include CircuitBreakerBehaviour
+    attr_reader :pid_controller, :update_thread, :sliding_interval, :pid_controller_thread, :stopped
+    @pid_controller_thread = nil
+    def initialize(name:, exceptions:, kp:, ki:, kd:, window_size:, initial_error_rate:, implementation:,
+      sliding_interval:, dead_zone_ratio:, ideal_error_rate_estimator_cap_value:, integral_upper_cap:,
+      integral_lower_cap:)
+      initialize_behaviour(name: name)
+      @exceptions = exceptions
+      @stopped = false
+      @pid_controller = implementation::PIDController.new(
+        kp: kp,
+        ki: ki,
+        kd: kd,
+        window_size: window_size,
+        implementation: implementation,
+        sliding_interval: sliding_interval,
+        initial_error_rate: initial_error_rate,
+        dead_zone_ratio: dead_zone_ratio,
+        ideal_error_rate_estimator_cap_value: ideal_error_rate_estimator_cap_value,
+        integral_upper_cap: integral_upper_cap,
+        integral_lower_cap: integral_lower_cap,
+      )
+      @pid_controller_thread = PIDControllerThread.instance.register_resource(self)
+    end
+    def acquire(resource = nil, scope: nil, adapter: nil, &block)
+      unless request_allowed?
+        mark_rejected(scope:, adapter:)
+        raise OpenCircuitError, "Rejected by adaptive circuit breaker"
+      end
+      result = nil
+      begin
+        result = block.call
+      rescue *@exceptions => error
+        if !error.respond_to?(:marks_semian_circuits?) || error.marks_semian_circuits?
+          mark_failed(error, scope:, adapter:)
+        end
+        raise error
+      else
+        mark_success(scope:, adapter:)
+      end
+      result
+    end
+    def reset(scope: nil, adapter: nil)
+      @last_error = nil
+      @pid_controller.reset
+    end
+    def stop
+      destroy
+    end
+    def destroy
+      @stopped = true
+      PIDControllerThread.instance.unregister_resource(self)
+      @pid_controller.reset
+    end
+    def metrics
+      @pid_controller.metrics
+    end
+    def open?
+      @pid_controller.rejection_rate == 1
+    end
+    def closed?
+      @pid_controller.rejection_rate == 0
+    end
+    # Compatibility with ProtectedResource - Adaptive circuit breaker does not have a half open state
+    def half_open?
+      !open? && !closed?
+    end
+    def mark_failed(error, scope: nil, adapter: nil)
+      @last_error = error
+      @pid_controller.record_request(:error)
+    end
+    def mark_success(scope: nil, adapter: nil)
+      @pid_controller.record_request(:success)
+    end
+    def mark_rejected(scope: nil, adapter: nil)
+      @pid_controller.record_request(:rejected)
+    end
+    def request_allowed?
+      !@pid_controller.should_reject?
+    end
+    def in_use?
+      true
+    end
+    def pid_controller_update
+      @pid_controller.update
+      notify_metrics_update(@pid_controller.metrics(full: false))
+    end
+    private
+    def notify_metrics_update(metrics)
+      Semian.notify(
+        :adaptive_update,
+        self,
+        nil,
+        nil,
+        rejection_rate: metrics[:rejection_rate],
+        error_rate: metrics[:error_rate],
+        ideal_error_rate: metrics[:ideal_error_rate],
+        p_value: metrics[:p_value],
+        integral: metrics[:integral],
+        derivative: metrics[:derivative],
+        previous_p_value: metrics[:previous_p_value],
+      )
+    end
+  end
+end
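
The state predicates and the probabilistic rejection in the file above both derive from a single number, the rejection rate. The following sketch isolates that idea, assuming hypothetical stand-in methods (`breaker_state`, `should_reject?` with an injectable `rng`) that are not part of Semian.

```ruby
# Hypothetical stand-ins for illustration; not Semian's classes.

# Map a rejection rate onto the three state predicates.
def breaker_state(rejection_rate)
  if rejection_rate == 1
    :open
  elsif rejection_rate == 0
    :closed
  else
    :half_open
  end
end

# Reject a request with probability equal to the rejection rate.
# Random#rand returns a float in [0, 1), so a rate of 0.0 never rejects
# and a rate of 1.0 always rejects.
def should_reject?(rejection_rate, rng: Random.new)
  rng.rand < rejection_rate
end

puts breaker_state(0.3)
```

Under this mapping, any rejection rate strictly between 0 and 1 reports as half-open, even though the breaker is partially rejecting traffic rather than probing for recovery.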

data/lib/semian/circuit_breaker.rb CHANGED

@@ -1,17 +1,18 @@
 # frozen_string_literal: true
+require_relative "circuit_breaker_behaviour"
 module Semian
   class CircuitBreaker # :nodoc:
+    include CircuitBreakerBehaviour
     extend Forwardable
     def_delegators :@state, :closed?, :open?, :half_open?
     attr_reader(
-      :name,
       :half_open_resource_timeout,
       :error_timeout,
       :state,
-      :last_error,
       :error_threshold_timeout_enabled,
       :exponential_backoff_error_timeout,
       :exponential_backoff_initial_timeout,
@@ -23,13 +24,14 @@ module Semian
       error_threshold_timeout: nil, error_threshold_timeout_enabled: true,
       lumping_interval: 0, exponential_backoff_error_timeout: false,
       exponential_backoff_initial_timeout: 1, exponential_backoff_multiplier: 2)
-      @name = name.to_sym
+      initialize_behaviour(name: name)
+      @exceptions = exceptions
       @success_count_threshold = success_threshold
       @error_count_threshold = error_threshold
       @error_threshold_timeout = error_threshold_timeout || error_timeout
       @error_threshold_timeout_enabled = error_threshold_timeout_enabled.nil? ? true : error_threshold_timeout_enabled
       @error_timeout = error_timeout
-      @exceptions = exceptions
       @half_open_resource_timeout = half_open_resource_timeout
       @lumping_interval = lumping_interval
       @exponential_backoff_error_timeout = exponential_backoff_error_timeout
@@ -44,8 +46,8 @@ module Semian
       reset
     end
-    def acquire(resource = nil, &block)
-      transition_to_half_open if transition_to_half_open?
+    def acquire(resource = nil, scope: nil, adapter: nil, &block)
+      transition_to_half_open(scope: scope, adapter: adapter) if transition_to_half_open?
       raise OpenCircuitError unless request_allowed?
@@ -54,11 +56,11 @@ module Semian
         result = maybe_with_half_open_resource_timeout(resource, &block)
       rescue *@exceptions => error
         if !error.respond_to?(:marks_semian_circuits?) || error.marks_semian_circuits?
-          mark_failed(error)
+          mark_failed(error, scope: scope, adapter: adapter)
         end
         raise error
       else
-        mark_success
+        mark_success(scope: scope, adapter: adapter)
       end
       result
     end
@@ -71,26 +73,26 @@ module Semian
       closed? || half_open? || transition_to_half_open?
     end
-    def mark_failed(error)
+    def mark_failed(error, scope: nil, adapter: nil)
       push_error(error)
       if closed?
-        transition_to_open if error_threshold_reached?
+        transition_to_open(scope: scope, adapter: adapter) if error_threshold_reached?
       elsif half_open?
-        transition_to_open
+        transition_to_open(scope: scope, adapter: adapter)
       end
     end
-    def mark_success
+    def mark_success(scope: nil, adapter: nil)
       return unless half_open?
       @successes.increment
-      transition_to_close if success_threshold_reached?
+      transition_to_close(scope: scope, adapter: adapter) if success_threshold_reached?
     end
-    def reset
+    def reset(scope: nil, adapter: nil)
       @errors.clear
       @successes.reset
-      transition_to_close
+      transition_to_close(scope: scope, adapter: adapter)
     end
     def destroy
@@ -105,8 +107,8 @@ module Semian
     private
-    def transition_to_close
-      notify_state_transition(:closed)
+    def transition_to_close(scope: nil, adapter: nil)
+      notify_state_transition(:closed, scope: scope, adapter: adapter)
       log_state_transition(:closed)
       @state.close!
       @errors.clear
@@ -114,14 +116,14 @@ module Semian
       @current_error_timeout = @exponential_backoff_error_timeout ? @exponential_backoff_initial_timeout : @error_timeout
     end
-    def transition_to_open
-      notify_state_transition(:open)
+    def transition_to_open(scope: nil, adapter: nil)
+      notify_state_transition(:open, scope: scope, adapter: adapter)
       log_state_transition(:open)
       @state.open!
     end
-    def transition_to_half_open
-      notify_state_transition(:half_open)
+    def transition_to_half_open(scope: nil, adapter: nil)
+      notify_state_transition(:half_open, scope: scope, adapter: adapter)
       log_state_transition(:half_open)
       @state.half_open!
       @successes.reset
@@ -178,8 +180,8 @@ module Semian
       Semian.logger.info(str)
     end
-    def notify_state_transition(new_state)
-      Semian.notify(:state_change, self, nil, nil, state: new_state)
+    def notify_state_transition(new_state, scope: nil, adapter: nil)
+      Semian.notify(:state_change, self, scope, adapter, state: new_state)
     end
     def maybe_with_half_open_resource_timeout(resource, &block)

data/lib/semian/circuit_breaker_behaviour.rb ADDED

@@ -0,0 +1,64 @@
+# frozen_string_literal: true
+module Semian
+  module CircuitBreakerBehaviour
+    attr_reader :name, :last_error
+    attr_accessor :exceptions
+    # Initialize common circuit breaker attributes
+    def initialize_behaviour(name:)
+      @name = name.to_sym
+      @last_error = nil
+    end
+    # Main method to execute a block with circuit breaker protection
+    def acquire(resource = nil, scope: nil, adapter: nil, &block)
+      raise NotImplementedError, "#{self.class} must implement #acquire"
+    end
+    # Reset the circuit breaker to its initial state
+    def reset(scope: nil, adapter: nil)
+      raise NotImplementedError, "#{self.class} must implement #reset"
+    end
+    # Clean up resources
+    def destroy
+      raise NotImplementedError, "#{self.class} must implement #destroy"
+    end
+    # Check if the circuit is open (rejecting requests)
+    def open?
+      raise NotImplementedError, "#{self.class} must implement #open?"
+    end
+    # Check if the circuit is closed (allowing requests)
+    def closed?
+      raise NotImplementedError, "#{self.class} must implement #closed?"
+    end
+    # Check if the circuit is half-open (testing if service recovered)
+    def half_open?
+      raise NotImplementedError, "#{self.class} must implement #half_open?"
+    end
+    # Check if requests are currently allowed
+    def request_allowed?
+      raise NotImplementedError, "#{self.class} must implement #request_allowed?"
+    end
+    # Mark a request as failed
+    def mark_failed(error, scope: nil, adapter: nil)
+      raise NotImplementedError, "#{self.class} must implement #mark_failed"
+    end
+    # Mark a request as successful
+    def mark_success(scope: nil, adapter: nil)
+      raise NotImplementedError, "#{self.class} must implement #mark_success"
+    end
+    # Check if the circuit breaker is actively tracking failures
+    def in_use?
+      raise NotImplementedError, "#{self.class} must implement #in_use?"
+    end
+  end
+end
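
The module above acts as an interface: every method raises until the including class overrides it. A minimal sketch of that pattern follows, using hypothetical names (`BreakerContract`, `AlwaysClosed`, `Incomplete`) that do not appear in Semian.

```ruby
# Hypothetical module and classes, mirroring the pattern only.
module BreakerContract
  def open?
    raise NotImplementedError, "#{self.class} must implement #open?"
  end
end

# A class that fulfils the contract by overriding the method.
class AlwaysClosed
  include BreakerContract

  def open?
    false
  end
end

# A class that forgets to override it.
class Incomplete
  include BreakerContract
end

message = nil
begin
  Incomplete.new.open?
rescue NotImplementedError => e
  message = e.message
end

puts message
```

One detail worth keeping in mind with this pattern: in Ruby, `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue => e` does not catch it and the error has to be rescued by name.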

data/lib/semian/configuration_validator.rb CHANGED

@@ -66,6 +66,7 @@ module Semian
     def validate_circuit_breaker_configuration!
       return if ENV.key?("SEMIAN_CIRCUIT_BREAKER_DISABLED")
       return unless @configuration.fetch(:circuit_breaker, true)
+      return if @configuration[:adaptive_circuit_breaker] # Skip traditional validation if using adaptive
       require_keys!([:success_threshold, :error_threshold, :error_timeout], @configuration)
       validate_thresholds!

data/lib/semian/dual_circuit_breaker.rb ADDED

@@ -0,0 +1,165 @@
+# frozen_string_literal: true
+module Semian
+  # DualCircuitBreaker wraps both classic and adaptive circuit breakers,
+  # allowing runtime switching between them via a callable that determines which to use.
+  class DualCircuitBreaker
+    include CircuitBreakerBehaviour
+    # Module to synchronize mark_success and mark_failed calls between sibling circuit breakers
+    # and reduce code duplication
+    module SiblingSync
+      attr_writer :sibling
+      def mark_success(scope: nil, adapter: nil)
+        super
+        @sibling.method(:mark_success).super_method.call(scope:, adapter:)
+      end
+      def mark_failed(error, scope: nil, adapter: nil)
+        super
+        @sibling.method(:mark_failed).super_method.call(error, scope:, adapter:)
+      end
+    end
+    class ChildClassicCircuitBreaker < CircuitBreaker
+      include SiblingSync
+    end
+    class ChildAdaptiveCircuitBreaker < AdaptiveCircuitBreaker
+      include SiblingSync
+    end
+    attr_reader :classic_circuit_breaker, :adaptive_circuit_breaker, :active_circuit_breaker
+    # use_adaptive should be a callable (Proc/lambda) that returns true/false
+    # to determine which circuit breaker to use. If it returns true, use adaptive.
+    def initialize(name:, classic_circuit_breaker:, adaptive_circuit_breaker:)
+      initialize_behaviour(name: name)
+      @classic_circuit_breaker = classic_circuit_breaker
+      @adaptive_circuit_breaker = adaptive_circuit_breaker
+      @classic_circuit_breaker.sibling = @adaptive_circuit_breaker
+      @adaptive_circuit_breaker.sibling = @classic_circuit_breaker
+      @active_circuit_breaker = @classic_circuit_breaker
+    end
+    def self.adaptive_circuit_breaker_selector(selector) # rubocop:disable Style/ClassMethodsDefinitions
+      @@adaptive_circuit_breaker_selector = selector # rubocop:disable Style/ClassVars
+    end
+    def active_breaker_type
+      @active_circuit_breaker.is_a?(Semian::AdaptiveCircuitBreaker) ? :adaptive : :classic
+    end
+    def acquire(resource = nil, scope: nil, adapter: nil, &block)
+      # NOTE: This assignment is not thread-safe, but this is acceptable for now:
+      # - Each request gets its own decision based on the selector at that moment
+      # - The worst case is a brief inconsistency where a thread reads a stale value,
+      #    which just means it uses the previous circuit breaker type for that one request
+      old_type = active_breaker_type
+      @active_circuit_breaker = get_active_circuit_breaker(resource)
+      if old_type != active_breaker_type
+        Semian.notify(:circuit_breaker_mode_change, self, nil, nil, old_mode: old_type, new_mode: active_breaker_type)
+      end
+      @active_circuit_breaker.acquire(resource, scope:, adapter:, &block)
+    end
+    def open?
+      @active_circuit_breaker.open?
+    end
+    def closed?
+      @active_circuit_breaker.closed?
+    end
+    def half_open?
+      @active_circuit_breaker.half_open?
+    end
+    def request_allowed?
+      @active_circuit_breaker.request_allowed?
+    end
+    def mark_failed(error, scope: nil, adapter: nil)
+      @active_circuit_breaker&.mark_failed(error, scope: nil, adapter: nil)
+    end
+    def mark_success(scope: nil, adapter: nil)
+      @active_circuit_breaker&.mark_success(scope: nil, adapter: nil)
+    end
+    def stop
+      @adaptive_circuit_breaker&.stop
+    end
+    def reset(scope: nil, adapter: nil)
+      @classic_circuit_breaker&.reset(scope:, adapter:)
+      @adaptive_circuit_breaker&.reset(scope:, adapter:)
+    end
+    def destroy
+      @classic_circuit_breaker&.destroy
+      @adaptive_circuit_breaker&.destroy
+    end
+    def in_use?
+      @classic_circuit_breaker&.in_use? || @adaptive_circuit_breaker&.in_use?
+    end
+    def last_error
+      @active_circuit_breaker.last_error
+    end
+    def metrics
+      {
+        active: active_breaker_type,
+        classic: classic_metrics,
+        adaptive: adaptive_metrics,
+      }
+    end
+    private
+    def classic_metrics
+      return {} unless @classic_circuit_breaker
+      {
+        state: @classic_circuit_breaker.state&.value,
+        open: @classic_circuit_breaker.open?,
+        closed: @classic_circuit_breaker.closed?,
+        half_open: @classic_circuit_breaker.half_open?,
+      }
+    end
+    def adaptive_metrics
+      return {} unless @adaptive_circuit_breaker
+      @adaptive_circuit_breaker.metrics.merge(
+        open: @adaptive_circuit_breaker.open?,
+        closed: @adaptive_circuit_breaker.closed?,
+        half_open: @adaptive_circuit_breaker.half_open?,
+      )
+    end
+    def get_active_circuit_breaker(resource)
+      if use_adaptive?(resource)
+        @adaptive_circuit_breaker
+      else
+        @classic_circuit_breaker
+      end
+    end
+    def use_adaptive?(resource = nil)
+      return false unless defined?(@@adaptive_circuit_breaker_selector)
+      @@adaptive_circuit_breaker_selector.call(resource)
+    rescue => e
+      Semian.logger&.warn("[#{@name}] use_adaptive check failed: #{e.message}. Defaulting to classic circuit breaker.")
+      false
+    end
+  end
+end
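
The `SiblingSync` module above relies on `Method#super_method` to update the sibling without triggering the sibling's own sync hook, which would otherwise recurse back and forth. The sketch below reproduces that mechanism in isolation; `Counter` and `SyncedCounter` are hypothetical classes invented for the example.

```ruby
# Hypothetical base class standing in for a circuit breaker.
class Counter
  attr_reader :count

  def initialize
    @count = 0
  end

  def mark_success
    @count += 1
  end
end

# Same shape as the SiblingSync module in the diff, minus keyword arguments.
module SiblingSync
  attr_writer :sibling

  def mark_success
    super # run this object's own Counter#mark_success
    # method(:mark_success) resolves to SiblingSync#mark_success on the sibling;
    # super_method steps past it to Counter#mark_success, so no recursion occurs.
    @sibling.method(:mark_success).super_method.call
  end
end

class SyncedCounter < Counter
  include SiblingSync
end

a = SyncedCounter.new
b = SyncedCounter.new
a.sibling = b
b.sibling = a

a.mark_success
b.mark_success
puts "a=#{a.count} b=#{b.count}"
```

Each call increments both objects exactly once, so after one call on each side both counters hold 2.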

data/lib/semian/mysql2.rb CHANGED

@@ -126,11 +126,11 @@ module Semian
       acquire_semian_resource(adapter: :mysql, scope: :connection) { raw_connect(*args) }
     end
-    def acquire_semian_resource(**)
+    def acquire_semian_resource(adapter: nil, scope: nil, **)
       super
     rescue ::Mysql2::Error => error
       if error.is_a?(PingFailure) || (!error.is_a?(::Mysql2::SemianError) && error.message.match?(CONNECTION_ERROR))
-        semian_resource.mark_failed(error)
+        semian_resource.mark_failed(error, scope: scope, adapter: adapter)
         error.semian_identifier = semian_identifier
       end
       raise

data/lib/semian/net_http.rb CHANGED

@@ -106,7 +106,7 @@ module Semian
         return super if disabled?
         acquire_semian_resource(adapter: :http, scope: :query) do
-          handle_error_responses(super)
+          handle_error_responses(super, adapter: :http, scope: :query)
         end
       end
     end
@@ -126,9 +126,9 @@ module Semian
     private
-    def handle_error_responses(result)
+    def handle_error_responses(result, scope:, adapter:)
       if raw_semian_options.fetch(:open_circuit_server_errors, false)
-        semian_resource.mark_failed(result) if result.is_a?(::Net::HTTPServerError)
+        semian_resource.mark_failed(result, scope: scope, adapter: adapter) if result.is_a?(::Net::HTTPServerError)
       end
       result
     end

data/lib/semian/pid_controller.rb ADDED

@@ -0,0 +1,217 @@
+# frozen_string_literal: true
+require "thread"
+require_relative "simple_exponential_smoother"
+module Semian
+  module Simple
+    # PID Controller for adaptive circuit breaking
+    # Based on the error function:
+    # P = (error_rate - ideal_error_rate) - (1 - (error_rate - ideal_error_rate)) * rejection_rate
+    # Note: P increases when error_rate increases
+    #       P decreases when rejection_rate increases (providing feedback)
+    class PIDController
+      attr_reader :rejection_rate
+      def initialize(kp:, ki:, kd:, window_size:, sliding_interval:, implementation:, initial_error_rate:,
+        dead_zone_ratio:, ideal_error_rate_estimator_cap_value:, integral_upper_cap:, integral_lower_cap:)
+        @kp = kp
+        @ki = ki
+        @kd = kd
+        @dead_zone_ratio = dead_zone_ratio
+        @integral_upper_cap = integral_upper_cap
+        @integral_lower_cap = integral_lower_cap
+        @rejection_rate = 0.0
+        @integral = 0.0
+        @derivative = 0.0
+        @previous_p_value = 0.0
+        @last_ideal_error_rate = initial_error_rate
+        @window_size = window_size
+        @sliding_interval = sliding_interval
+        @smoother = SimpleExponentialSmoother.new(
+          cap_value: ideal_error_rate_estimator_cap_value,
+          initial_value: initial_error_rate,
+          observations_per_minute: 60 / sliding_interval,
+        )
+        @errors = implementation::SlidingWindow.new(max_size: 200 * window_size)
+        @successes = implementation::SlidingWindow.new(max_size: 200 * window_size)
+        @rejections = implementation::SlidingWindow.new(max_size: 200 * window_size)
+        @last_error_rate = 0.0
+        @last_p_value = 0.0
+      end
+      def record_request(outcome)
+        case outcome
+        when :error
+          @errors.push(current_time)
+        when :success
+          @successes.push(current_time)
+        when :rejected
+          @rejections.push(current_time)
+        end
+      end
+      def update
+        # Store the last window's P value so that we can serve it up in the metrics snapshots
+        @previous_p_value = @last_p_value
+        @last_error_rate = calculate_error_rate
+        store_error_rate(@last_error_rate)
+        dt = @sliding_interval
+        @last_p_value = calculate_p_value(@last_error_rate)
+        proportional = @kp * @last_p_value
+        @integral += @last_p_value * dt
+        integral = @ki * @integral
+        @derivative = @kd * (@last_p_value - @previous_p_value) / dt
+        # Calculate the control signal (change in rejection rate)
+        control_signal = proportional + integral + @derivative
+        # Calculate what the new rejection rate would be
+        new_rejection_rate = @rejection_rate + control_signal
+        # Update rejection rate (clamped between 0 and 1)
+        @rejection_rate = new_rejection_rate.clamp(0.0, 1.0)
+        @integral = @integral.clamp(@integral_lower_cap, @integral_upper_cap)
+        @rejection_rate
+      end
+      # Should we reject this request based on current rejection rate?
+      def should_reject?
+        rand < @rejection_rate
+      end
+      # Reset the controller state
+      def reset
+        @rejection_rate = 0.0
+        @integral = 0.0
+        @previous_p_value = 0.0
+        @derivative = 0.0
+        @last_p_value = 0.0
+        @errors.clear
+        @successes.clear
+        @rejections.clear
+        @last_error_rate = 0.0
+        @smoother.reset
+        @last_ideal_error_rate = @smoother.forecast
+      end
+      # Get current metrics for monitoring/debugging
+      def metrics(full: true)
+        result = {
+          rejection_rate: @rejection_rate,
+          error_rate: @last_error_rate,
+          ideal_error_rate: @last_ideal_error_rate,
+          dead_zone_ratio: @dead_zone_ratio,
+          p_value: @last_p_value,
+          previous_p_value: @previous_p_value,
+          integral: @integral,
+          derivative: @derivative,
+        }
+        if full
+          result[:smoother_state] = @smoother.state
+          result[:current_window_requests] = {
+            success: @successes.size,
+            error: @errors.size,
+            rejected: @rejections.size,
+          }
+        end
+        result
+      end
+      private
+      # Calculate the current P value with dead-zone noise suppression.
+      # The dead zone prevents the controller from reacting to small, noisy
+      # deviations from the ideal error rate. Only deviations exceeding
+      # ideal_error_rate * dead_zone_ratio trigger a response.
+      def calculate_p_value(current_error_rate)
+        @last_ideal_error_rate = calculate_ideal_error_rate
+        raw_delta = current_error_rate - @last_ideal_error_rate
+        dead_zone = @last_ideal_error_rate * @dead_zone_ratio
+        delta_error = if raw_delta <= 0
+          # Below or at ideal: pass through for recovery
+          raw_delta
+        elsif raw_delta <= dead_zone
+          # Within dead zone: suppress noise
+          0.0
+        else
+          # Above dead zone: full signal, dead zone only silences noise
+          raw_delta
+        end
+        delta_error - (1 - delta_error) * @rejection_rate
+      end
+      def calculate_error_rate
+        # Clean up old observations
+        current_timestamp = current_time
+        cutoff_time = current_timestamp - @window_size
+        @errors.reject! { |timestamp| timestamp < cutoff_time }
+        @successes.reject! { |timestamp| timestamp < cutoff_time }
+        @rejections.reject! { |timestamp| timestamp < cutoff_time }
+        total_requests = @successes.size + @errors.size
+        return 0.0 if total_requests == 0
+        @errors.size.to_f / total_requests
+      end
+      def store_error_rate(error_rate)
+        @smoother.add_observation(error_rate)
+      end
+      def calculate_ideal_error_rate
+        @smoother.forecast
+      end
+      def current_time
+        Process.clock_gettime(Process::CLOCK_MONOTONIC)
+      end
+    end
+  end
+  module ThreadSafe
+    # Thread-safe version of PIDController
+    class PIDController < Simple::PIDController
+      def initialize(**kwargs)
+        super(**kwargs)
+        @lock = Mutex.new
+      end
+      def record_request(outcome)
+        @lock.synchronize { super }
+      end
+      def update
+        @lock.synchronize { super }
+      end
+      def should_reject?
+        @lock.synchronize { super }
+      end
+      def reset
+        @lock.synchronize { super }
+      end
+      # NOTE: metrics, calculate_error_rate are not overridden
+      # to avoid deadlock. calculate_error_rate is private method
+      # only called internally from update (synchronized) and metrics (not synchronized).
+    end
+  end
+end
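
To make the arithmetic of `update` and the dead zone easier to follow, here is a condensed sketch of one controller tick as a pure function. It is an illustration under stated assumptions, not the gem's code: `State`, `dead_zoned_delta`, and `tick` are hypothetical names, the ideal error rate is passed in directly instead of coming from the smoother, and the gains default to the values listed in the README hunk.

```ruby
# Hypothetical condensed version of one controller tick; not Semian's class.
State = Struct.new(:rejection_rate, :integral, :previous_p)

# Deviations at or below the ideal pass through; small positive deviations
# inside the dead zone are suppressed; larger ones pass through unchanged.
def dead_zoned_delta(error_rate, ideal, dead_zone_ratio)
  raw = error_rate - ideal
  return raw if raw <= 0

  raw <= ideal * dead_zone_ratio ? 0.0 : raw
end

def tick(state, error_rate:, ideal:, kp: 1.0, ki: 0.2, kd: 0.0, dt: 1,
         dead_zone_ratio: 0.25, cap: 10.0)
  delta = dead_zoned_delta(error_rate, ideal, dead_zone_ratio)
  p_val = delta - (1 - delta) * state.rejection_rate
  integral = state.integral + p_val * dt
  signal = kp * p_val + ki * integral + kd * (p_val - state.previous_p) / dt
  State.new(
    (state.rejection_rate + signal).clamp(0.0, 1.0), # rejection rate stays in [0, 1]
    integral.clamp(-cap, cap),                       # cap the integral against windup
    p_val,
  )
end

start = State.new(0.0, 0.0, 0.0)
spike = tick(start, error_rate: 0.20, ideal: 0.05)
noise = tick(start, error_rate: 0.06, ideal: 0.05)
puts format("spike -> %.2f, noise -> %.2f", spike.rejection_rate, noise.rejection_rate)
```

With an ideal rate of 5%, a 20% error rate moves the rejection rate from 0 to about 0.18 in one tick (0.15 proportional plus 0.03 from the integral), while a 6% error rate falls inside the 1.25-point dead zone and leaves it at 0.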

data/lib/semian/pid_controller_thread.rb ADDED

@@ -0,0 +1,72 @@
+# frozen_string_literal: true
+require "singleton"
+require_relative "pid_controller"
+module Semian
+  class PIDControllerThread
+    include Singleton
+    def initialize
+      @stopped = true
+      @update_thread = nil
+      @circuit_breakers = Concurrent::Map.new
+      @sliding_interval = ENV.fetch("SEMIAN_ADAPTIVE_CIRCUIT_BREAKER_SLIDING_INTERVAL", 1).to_i
+    end
+    # As per the singleton pattern, this is called only once
+    def start
+      @stopped = false
+      update_proc = proc do
+        loop do
+          break if @stopped
+          wait_for_window
+          # Update PID controller state for each registered circuit breaker
+          @circuit_breakers.each do |_, circuit_breaker|
+            circuit_breaker.pid_controller_update
+          end
+        rescue => e
+          Semian.logger&.warn("[#{@name}] PID controller update thread error: #{e.message}")
+        end
+      end
+      @update_thread = Thread.new(&update_proc)
+    end
+    def stop
+      @stopped = true
+      @update_thread&.kill
+      @update_thread = nil
+    end
+    def register_resource(circuit_breaker)
+      # Track every registered circuit breaker in a Concurrent::Map
+      # Start the thread if it's not already running
+      if @circuit_breakers.empty? && @stopped
+        start
+      end
+      # Add the circuit breaker to the map
+      @circuit_breakers[circuit_breaker.name] = circuit_breaker
+      self
+    end
+    def unregister_resource(circuit_breaker)
+      # Remove the circuit breaker from the map
+      @circuit_breakers.delete(circuit_breaker.name)
+      # Stop the thread if there are no more circuit breakers
+      if @circuit_breakers.empty?
+        stop
+      end
+    end
+    def wait_for_window
+      Kernel.sleep(@sliding_interval)
+    end
+  end
+end
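
The lifecycle above (start the update thread on the first registration, stop it when the last circuit breaker is unregistered) can be sketched without Semian as follows. This is only an approximation under stated assumptions: `PeriodicUpdater` and `FakeBreaker` are hypothetical names, and a plain `Hash` guarded by a `Mutex` stands in for the `Concurrent::Map` used in the diff.

```ruby
# Hypothetical illustration of the register/unregister lifecycle.
class PeriodicUpdater
  def initialize(interval:)
    @interval = interval
    @targets = {}
    @lock = Mutex.new
    @thread = nil
  end

  # Add a target and lazily start the background thread.
  def register(target)
    @lock.synchronize do
      @targets[target.name] = target
      @thread ||= Thread.new { run }
    end
    self
  end

  # Remove a target and stop the thread once nothing is left to update.
  def unregister(target)
    @lock.synchronize do
      @targets.delete(target.name)
      if @targets.empty? && @thread
        @thread.kill
        @thread = nil
      end
    end
  end

  def running?
    @lock.synchronize { !@thread.nil? }
  end

  private

  # Sleep for one interval, then update a snapshot of the registered targets.
  def run
    loop do
      sleep(@interval)
      snapshot = @lock.synchronize { @targets.values }
      snapshot.each(&:update)
    rescue => e
      warn("update thread error: #{e.message}")
    end
  end
end

# Stand-in for a circuit breaker; update is a no-op here.
FakeBreaker = Struct.new(:name) do
  def update; end
end
```

Taking a snapshot under the lock and calling `update` outside it keeps a slow target from blocking `register` and `unregister`.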

data/lib/semian/protected_resource.rb CHANGED

@@ -48,7 +48,7 @@ module Semian
       if @circuit_breaker.nil?
         yield self
       else
-        @circuit_breaker.acquire(resource) do
+        @circuit_breaker.acquire(resource, scope: scope, adapter: adapter) do
           yield self
         end
       end

data/lib/semian/simple_exponential_smoother.rb ADDED

@@ -0,0 +1,137 @@
+# frozen_string_literal: true
+module Semian
+  # SimpleExponentialSmoother implements Simple Exponential Smoothing (SES) for forecasting
+  # a stable baseline error rate in adaptive circuit breakers.
+  #
+  # SES focuses on the level component only (no trend or seasonality), using the formula:
+  #   smoothed = alpha * value + (1 - alpha) * previous_smoothed
+  #
+  # Key characteristics:
+  # - Drops extreme values above cap to prevent outliers from distorting the forecast
+  # - Runs in two periods: low confidence (first 30 minutes) and high confidence (after 30 minutes)
+  # - During the low confidence period, we converge faster towards observed value than during the high confidence period
+  # - The choice of alphas follows the following criteria:
+  # - During low confidence:
+  #   - If we are observing 2x our current estimate, we need to converge towards it in 30 minutes
+  #   - If we are observing 0.5x our current estimate, we need to converge towards it in 5 minutes
+  # - During high confidence:
+  #   - If we are observing 2x our current estimate, we need to converge towards it in 1 hour
+  #   - If we are observing 0.5x our current estimate, we need to converge towards it in 10 minutes
+  # The following code snippet can be used to calculate the alphas:
+  # def find_alpha(name, start_point, multiplier, convergence_duration)
+  #   target = start_point * multiplier
+  #   desired_distance = 0.003
+  #   alpha_ceil = 0.5
+  #   alpha_floor = 0.0
+  #   alpha = 0.25
+  #   while true
+  #      smoothed_value = start_point
+  #      step_size = convergence_duration / 10
+  #      converged_too_fast = false
+  #      10.times do |step|
+  #          step_size.times do
+  #             smoothed_value = alpha * target + (1 - alpha) * smoothed_value
+  #          end
+  #          if step < 9 and (smoothed_value - target).abs < desired_distance
+  #             converged_too_fast = true
+  #          end
+  #      end
+  #
+  #      if converged_too_fast
+  #         alpha_ceil = alpha
+  #         alpha = (alpha + alpha_floor) / 2
+  #         next
+  #      end
+  #
+  #      if (smoothed_value - target).abs > desired_distance
+  #         alpha_floor = alpha
+  #         alpha =  (alpha + alpha_ceil) / 2
+  #         next
+  #      end
+  #
+  #      break
+  #   end
+  #
+  #   print "#{name} is #{alpha}\n"
+  # end
+  #
+  # initial_error_rate = 0.05
+  #
+  # find_alpha("low confidence upward convergence alpha", initial_error_rate, 2, 1800)
+  # find_alpha("low confidence downward convergence alpha", initial_error_rate, 0.5, 300)
+  # find_alpha("high confidence upward convergence alpha", initial_error_rate, 2, 3600)
+  # find_alpha("high confidence downward convergence alpha", initial_error_rate, 0.5, 600)
+  class SimpleExponentialSmoother
+    LOW_CONFIDENCE_ALPHA_UP = 0.0017
+    LOW_CONFIDENCE_ALPHA_DOWN = 0.078
+    HIGH_CONFIDENCE_ALPHA_UP = 0.0009
+    HIGH_CONFIDENCE_ALPHA_DOWN = 0.039
+    LOW_CONFIDENCE_THRESHOLD_MINUTES = 30
+    # Validate all alpha constants at class load time
+    [
+      LOW_CONFIDENCE_ALPHA_UP,
+      LOW_CONFIDENCE_ALPHA_DOWN,
+      HIGH_CONFIDENCE_ALPHA_UP,
+      HIGH_CONFIDENCE_ALPHA_DOWN,
+    ].each do |alpha|
+      if alpha <= 0 || alpha >= 0.5
+        raise ArgumentError, "alpha constant must be in range (0, 0.5), got: #{alpha}"
+      end
+    end
+    attr_reader :alpha, :cap_value, :initial_value, :smoothed_value, :observations_per_minute
+    def initialize(cap_value:, initial_value:, observations_per_minute:)
+      @alpha = LOW_CONFIDENCE_ALPHA_DOWN # Start with low confidence, converging down
+      @cap_value = cap_value
+      @initial_value = initial_value
+      @observations_per_minute = observations_per_minute
+      @smoothed_value = initial_value
+      @observation_count = 0
+    end
+    def add_observation(value)
+      raise ArgumentError, "value must be non-negative, got: #{value}" if value < 0
+      return @smoothed_value if value > cap_value
+      @observation_count += 1
+      low_confidence = @observation_count < (@observations_per_minute * LOW_CONFIDENCE_THRESHOLD_MINUTES)
+      converging_up = value > @smoothed_value
+      @alpha = if low_confidence
+        converging_up ? LOW_CONFIDENCE_ALPHA_UP : LOW_CONFIDENCE_ALPHA_DOWN
+      else
+        converging_up ? HIGH_CONFIDENCE_ALPHA_UP : HIGH_CONFIDENCE_ALPHA_DOWN
+      end
+      @smoothed_value = (@alpha * value) + ((1.0 - @alpha) * @smoothed_value)
+      @smoothed_value
+    end
+    def forecast
+      @smoothed_value
+    end
+    def state
+      {
+        smoothed_value: @smoothed_value,
+        alpha: @alpha,
+        cap_value: @cap_value,
+        initial_value: @initial_value,
+        observations_per_minute: @observations_per_minute,
+        observation_count: @observation_count,
+      }
+    end
+    def reset
+      @smoothed_value = initial_value
+      @observation_count = 0
+      @alpha = LOW_CONFIDENCE_ALPHA_DOWN
+      self
+    end
+  end
+end
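
To make the update rule concrete, here is a small worked sketch of one smoothing step with a direction-dependent alpha and a cap, using the low-confidence constants from the diff above. `ses_step` and `simulate` are hypothetical helpers written for this illustration, not Semian APIs, and the sketch makes no assumption about how many observations correspond to a minute.

```ruby
# Illustration only: these helpers are not part of Semian.
# The alpha values are the low-confidence constants shown in the diff.
ALPHA_UP = 0.0017
ALPHA_DOWN = 0.078

# One SES step: ignore values above the cap, otherwise blend the new value
# into the running estimate with an alpha chosen by direction.
def ses_step(smoothed, value, cap:)
  return smoothed if value > cap

  alpha = value > smoothed ? ALPHA_UP : ALPHA_DOWN
  (alpha * value) + ((1.0 - alpha) * smoothed)
end

# Feed the same observed value repeatedly and return the final estimate.
def simulate(start, observed, steps, cap:)
  steps.times.reduce(start) { |smoothed, _| ses_step(smoothed, observed, cap: cap) }
end

puts ses_step(0.05, 0.10, cap: 0.1)  # upward step, small alpha
puts ses_step(0.05, 0.025, cap: 0.1) # downward step, larger alpha
puts ses_step(0.05, 0.5, cap: 0.1)   # above the cap, estimate unchanged
puts simulate(0.05, 0.10, 1800, cap: 0.1)
```

Because the downward alpha is much larger than the upward one, the estimate falls quickly when error rates improve and rises slowly when they worsen.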

data/lib/semian/unprotected_resource.rb CHANGED

@@ -35,7 +35,7 @@ module Semian
       0
     end
-    def reset
+    def reset(**)
     end
     def open?
@@ -54,10 +54,10 @@ module Semian
       true
     end
-    def mark_failed(_error)
+    def mark_failed(_error, **)
     end
-    def mark_success
+    def mark_success(**)
     end
     def bulkhead

data/lib/semian/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Semian
-  VERSION = "0.27.1"
+  VERSION = "0.28.0"
 end

data/lib/semian.rb CHANGED

@@ -11,6 +11,8 @@ require "semian/instrumentable"
 require "semian/platform"
 require "semian/resource"
 require "semian/circuit_breaker"
+require "semian/adaptive_circuit_breaker"
+require "semian/dual_circuit_breaker"
 require "semian/protected_resource"
 require "semian/unprotected_resource"
 require "semian/simple_sliding_window"
@@ -197,7 +199,7 @@ module Semian
   # +exceptions+: An array of exception classes that should be accounted as resource errors. Default [].
   # (circuit breaker)
   #
-  # +exponential_backoff_error_timeout+: When set to true, instead of opening the circuit for the full
+  #   # +exponential_backoff_error_timeout+: When set to true, instead of opening the circuit for the full
   # error_timeout duration, it starts with a smaller timeout and increases exponentially on each subsequent
   # opening up to error_timeout. This helps avoid over-opening the circuit for temporary issues.
   # Default false. (circuit breaker)
@@ -209,6 +211,20 @@ module Semian
   # when exponential backoff is enabled. Only valid when exponential_backoff_error_timeout is true.
   # Default 2. (circuit breaker)
   #
+  # +adaptive_circuit_breaker+: Enable adaptive circuit breaker using PID controller. Default false.
+  # When enabled, this replaces the traditional circuit breaker with an adaptive version
+  # that dynamically adjusts rejection rates based on service health. (adaptive circuit breaker)
+  #
+  # +dual_circuit_breaker+: Enable dual circuit breaker mode where both legacy and adaptive
+  # circuit breakers are initialized. Default false. When enabled, both circuit breakers track
+  # requests, but only one is used for decision-making based on use_adaptive.
+  # (dual circuit breaker)
+  #
+  # +use_adaptive+: A callable (Proc/lambda) that returns true to use adaptive circuit breaker
+  # or false to use legacy. Only used when dual_circuit_breaker is enabled. Default: ->() { false }.
+  # Example: ->() { MyFeatureFlag.enabled?(:adaptive_circuit_breaker) }
+  # (dual circuit breaker)
+  #
   # Returns the registered resource.
   def register(name, **options)
     return UnprotectedResource.new(name) if ENV.key?("SEMIAN_DISABLED")
@@ -216,7 +232,14 @@ module Semian
     # Validate configuration before proceeding
     ConfigurationValidator.new(name, options).validate!
-    circuit_breaker = create_circuit_breaker(name, **options)
+    circuit_breaker = if options[:dual_circuit_breaker]
+      create_dual_circuit_breaker(name, **options)
+    elsif options[:adaptive_circuit_breaker]
+      create_adaptive_circuit_breaker(name, **options)
+    else
+      create_circuit_breaker(name, **options)
+    end
     bulkhead = create_bulkhead(name, **options)
     resources[name] = ProtectedResource.new(name, bulkhead, circuit_breaker)
@@ -312,12 +335,49 @@ module Semian
   private
-  def create_circuit_breaker(name, **options)
+  def create_dual_circuit_breaker(name, **options)
+    return if ENV.key?("SEMIAN_CIRCUIT_BREAKER_DISABLED")
+    classic_cb = create_circuit_breaker(name, is_child: true, **options)
+    adaptive_cb = create_adaptive_circuit_breaker(name, is_child: true, **options)
+    DualCircuitBreaker.new(
+      name: name,
+      classic_circuit_breaker: classic_cb,
+      adaptive_circuit_breaker: adaptive_cb,
+    )
+  end
+  def create_adaptive_circuit_breaker(name, is_child: false, **options)
+    return if ENV.key?("SEMIAN_CIRCUIT_BREAKER_DISABLED")
+    exceptions = options[:exceptions] || []
+    cls = is_child ? DualCircuitBreaker::ChildAdaptiveCircuitBreaker : AdaptiveCircuitBreaker
+    cls.new(
+      name: name,
+      exceptions: Array(exceptions) + [::Semian::BaseError],
+      kp: options[:kp] || 1.0,
+      ki: options[:ki] || 0.2,
+      kd: options[:kd] || 0.0,
+      window_size: options[:window_size] || 10,
+      initial_error_rate: options[:initial_error_rate] || 0.05,
+      dead_zone_ratio: options[:dead_zone_ratio] || 0.25,
+      # We use an environment variable for the sliding interval because it is shared among all circuit breakers
+      sliding_interval: ENV.fetch("SEMIAN_ADAPTIVE_CIRCUIT_BREAKER_SLIDING_INTERVAL", 1).to_i,
+      ideal_error_rate_estimator_cap_value: options[:ideal_error_rate_estimator_cap_value] || 0.1,
+      integral_upper_cap: options[:integral_upper_cap] || 10.0,
+      integral_lower_cap: options[:integral_lower_cap] || -10.0,
+      implementation: implementation(**options),
+    )
+  end
+  def create_circuit_breaker(name, is_child: false, **options)
     return if ENV.key?("SEMIAN_CIRCUIT_BREAKER_DISABLED")
     return unless options.fetch(:circuit_breaker, true)
     exceptions = options[:exceptions] || []
-    CircuitBreaker.new(
+    cls = is_child ? DualCircuitBreaker::ChildClassicCircuitBreaker : CircuitBreaker
+    cls.new(
       name,
       success_threshold: options[:success_threshold],
       error_threshold: options[:error_threshold],

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: semian
 version: !ruby/object:Gem::Version
-  version: 0.27.1
+  version: 0.28.0
 platform: ruby
 authors:
 - Scott Francis
@@ -51,13 +51,18 @@ files:
 - lib/semian/activerecord_postgresql_adapter.rb
 - lib/semian/activerecord_trilogy_adapter.rb
 - lib/semian/adapter.rb
+- lib/semian/adaptive_circuit_breaker.rb
 - lib/semian/circuit_breaker.rb
+- lib/semian/circuit_breaker_behaviour.rb
 - lib/semian/configuration_validator.rb
+- lib/semian/dual_circuit_breaker.rb
 - lib/semian/grpc.rb
 - lib/semian/instrumentable.rb
 - lib/semian/lru_hash.rb
 - lib/semian/mysql2.rb
 - lib/semian/net_http.rb
+- lib/semian/pid_controller.rb
+- lib/semian/pid_controller_thread.rb
 - lib/semian/platform.rb
 - lib/semian/protected_resource.rb
 - lib/semian/rails.rb
@@ -65,6 +70,7 @@ files:
 - lib/semian/redis/v5.rb
 - lib/semian/redis_client.rb
 - lib/semian/resource.rb
+- lib/semian/simple_exponential_smoother.rb
 - lib/semian/simple_integer.rb
 - lib/semian/simple_sliding_window.rb
 - lib/semian/simple_state.rb
@@ -94,7 +100,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 4.0.6
+rubygems_version: 4.0.8
 specification_version: 4
 summary: Bulkheading for Ruby with SysV semaphores
 test_files: []