pecorino 0.4.1 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 1326301c772aaee4b63feaba7e29b385c274d4f6b256a3b413d3b7c094654db8
-  data.tar.gz: 8a3973b17c1ded234abcc76937de9225a0099dd4445aefd313a7ade0d9ecb843
+  metadata.gz: 2cd65abe4917de817a0b0b08672376d5a7dd1e16c01b3a4e0c47ff1467600a1e
+  data.tar.gz: 58ca1578813b7a5bcc058d9a37a869e778a1614b736becced4bc82f78efba2c5
 SHA512:
-  metadata.gz: 7d8462bc60d93b6cbbd4729a14fdf84fb9abd18f9b3ce539dc110784cc56f5a9fe2bd107eb370a9caff1e229200a6b345ca3e73ac95a98013a618923e963c8fc
-  data.tar.gz: ad2d89319ba89786e0e2befc3449bb9d1211d0555419a9118df5ba42aa1744ad617b4b27686928fb1fc6ac9f0368a93247d4208beb1b9fab7881322779e11616
+  metadata.gz: bbbcdc936bef119b1b02695dc626f8421576a619414f4493a541876562ca3cf5136dcdd1a796c9c08cd6cc2a8345be8ac7fecd8f6371fddc9b480bd26d017296
+  data.tar.gz: cd4e2ba40164eb0c9f56e17aeef656727703cb98f7e70cedaafa3d9af9202195f9c0e1a1dbc3dfdb411f277f31e84a6a2641d5f81bf85d6d07b3b92937877cbd
data/CHANGELOG.md CHANGED
@@ -1,3 +1,11 @@
+## [0.5.0] - 2024-02-11
+
+- Add `CachedThrottle` for caching the throttle blocks. This protects the database when the throttle is in a blocked state.
+- Add `Throttle#throttled` for silencing alerts
+- **BREAKING CHANGE** Remove `Throttle::State#retry_after`, because there is no reasonable value for that member if the throttle is not in the "blocked" state
+- Allow accessing `Throttle::State` from the `Throttled` exception so that the blocked throttle state can be cached downstream (in the Rails cache, for example)
+- Make `Throttle#request!` return the new state if no exception was raised
+
 ## [0.4.1] - 2024-02-11
 
 - Make sure Pecorino works on Ruby 2.7 as well by removing 3.x-exclusive syntax
data/README.md CHANGED
@@ -24,8 +24,10 @@ And then execute:
 
 Once the installation is done you can use Pecorino to start defining your throttles. Imagine you have a resource called `vault` and you want to limit the number of updates to it to 5 per second. To achieve that, instantiate a new `Throttle` in your controller or job code, and then trigger it using `Throttle#request!`. A call to `request!` registers 1 token being added to the bucket. If the bucket would overspill (your request would make it overflow), or the throttle is currently in "block" mode (it has recently been triggered), a `Pecorino::Throttle::Throttled` exception will be raised.
 
+We call this pattern **prefix usage**: apply the throttle before allowing the action to proceed. This is more secure than registering an action after it has taken place.
+
 ```ruby
-throttle = Pecorino::Throttle.new(key: "vault", over_time: 1.second, capacity: 5)
+throttle = Pecorino::Throttle.new(key: "password-attempts-#{request.ip}", over_time: 1.minute, capacity: 5, block_for: 30.minutes)
 throttle.request!
 ```
 In a Rails controller you can then rescue from this exception to render the appropriate response:
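The rescue itself can be sketched framework-free. In the sketch below, `Throttled` is a stand-in for `Pecorino::Throttle::Throttled` (only its `retry_after` reader is assumed), and the status/headers pair is a hypothetical response shape, the same one a Rails `rescue_from` handler would produce:

```ruby
# Stand-in for Pecorino::Throttle::Throttled; only #retry_after is assumed
class Throttled < StandardError
  attr_reader :retry_after

  def initialize(retry_after)
    @retry_after = retry_after
    super("Rate limit exceeded")
  end
end

# Translate the exception into a 429 response with a Retry-After header
def guarded_action
  raise Throttled.new(30) # imagine throttle.request! raising here
rescue Throttled => e
  [429, { "Retry-After" => e.retry_after.to_s }]
end

status, headers = guarded_action
# status is 429; headers["Retry-After"] is "30"
```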
@@ -67,6 +69,40 @@ throttle.request!(20) # Attempt to withdraw 20 dollars more
 throttle.request!(2) # Attempt to withdraw 2 dollars more, will raise `Throttled` and block withdrawals for 3 hours
 ```
 
+## Performing a block only if it would be allowed by the throttle
+
+You can use Pecorino to avoid nuisance alerting - use it to limit the alert rate:
+
+```ruby
+alert_nuisance_t = Pecorino::Throttle.new(key: "disk-full-alert", over_time: 2.hours, capacity: 1, block_for: 2.hours)
+alert_nuisance_t.throttled do
+  Slack.alerts.deliver("Disk is full again! Please investigate!")
+end
+```
+
+This will not raise any exceptions. The `throttled` method performs **prefix throttling** to prevent multiple callers from hitting the throttle at the same time, so it is guaranteed to be atomic.
+
+## Postfix topup of the throttle
+
+In addition to the use case where you trigger the throttle before performing an action, there are legitimate use cases where you want to use the throttle as a _meter_ instead, measuring the effect of an action which has already been permitted - and then only making it trigger on a subsequent action. This **postfix usage** is less secure, but it allows for a different sequencing of calls. Imagine you want to implement the popular [circuit breaker pattern](https://dzone.com/articles/introduction-to-the-circuit-breaker-pattern) where all your nodes are able to share the error rate information between them. Pecorino gives you all the tools to implement a binary-state circuit breaker (open or closed) based on an error rate. Imagine you want to stop sending requests if the service you are calling raises `Timeout::Error` frequently. Then your call to the service could look like this:
+
+```ruby
+begin
+  error_rate_throttle = Pecorino::Throttle.new(key: "some-fancy-ai-api-errors", capacity: 10, over_time: 30.seconds, block_for: 120.seconds)
+
+  if error_rate_throttle.able_to_accept? # See whether adding 1 request would overflow the error rate
+    fancy_ai_api.post_chat_message("Imagine I am a rocket scientist on a moonbase. Invent me...")
+  else
+    raise "The error rate for fancy_ai_api has been exceeded"
+  end
+rescue Timeout::Error
+  error_rate_throttle.request(1) # Use the bang-less method since we do not need the Throttled exception
+  raise
+end
+```
+
+This way, every time there is an error on the "fancy AI service" the throttle will be triggered, and once it overflows, subsequent requests will be blocked.
+
 ## Using just the leaky bucket
 
 Sometimes you don't want to use a throttle, but you want to track the amount added to the leaky bucket over time. A lower-level abstraction is available for that purpose in the form of the `LeakyBucket` class. It will not raise any exceptions and will not install blocks, but will permit you to track a bucket's state over time:
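The underlying mechanics can be illustrated with a self-contained sketch. This is the leaky-bucket algorithm in miniature - it is not Pecorino's actual `LeakyBucket` API, which persists this state in the database:

```ruby
# Toy leaky bucket: leaks at a constant rate, clamps at capacity.
# Purely illustrative - Pecorino's LeakyBucket keeps this state in the DB.
class ToyLeakyBucket
  def initialize(capacity:, leak_rate:) # leak_rate = tokens leaked per second
    @capacity = capacity
    @leak_rate = leak_rate
    @level = 0.0
    @touched_at = Time.now
  end

  # Leak out what has drained since the last touch, then add n tokens.
  # Returns the resulting level, clamped to capacity.
  def fillup(n, now: Time.now)
    leaked = (now - @touched_at) * @leak_rate
    @level = [@level - leaked, 0.0].max
    @touched_at = now
    @level = [@level + n, @capacity].min
  end

  def full?
    @level >= @capacity
  end
end

t0 = Time.now
bucket = ToyLeakyBucket.new(capacity: 5, leak_rate: 1.0)
bucket.fillup(5, now: t0)      # fills to the brim
bucket.fillup(0, now: t0 + 3)  # 3 seconds later, 3 tokens have leaked out
# bucket.full? is now false - there is room for 3 more tokens
```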
@@ -90,6 +126,29 @@ We recommend running the following bit of code every couple of hours (via cron o
 Pecorino.prune!
 ```
 
+## Using cached throttles
+
+If a throttle is triggered, Pecorino sets a "block" record for that throttle key. Any request to that throttle will fail until the block is lifted. If you are getting hammered by requests which are getting throttled, it might be a good idea to install a caching layer which will respond with a "rate limit exceeded" error even before hitting your database - until the moment the block would be lifted. You can use any [ActiveSupport::Cache::Store](https://api.rubyonrails.org/classes/ActiveSupport/Cache/Store.html) to store your blocks. If you have a fast Rails cache configured, create a wrapped throttle:
+
+```ruby
+throttle = Pecorino::Throttle.new(key: "ip-#{request.ip}", capacity: 10, over_time: 2.seconds, block_for: 2.minutes)
+cached_throttle = Pecorino::CachedThrottle.new(Rails.cache, throttle)
+cached_throttle.request!
+```
+
+Note that the idea of using a cache store here is to avoid hitting the database while the block for your throttle is in effect. Therefore, if you are using something like [solid_cache](https://github.com/rails/solid_cache) you will be hitting the database regardless! A better approach is to have a [MemoryStore](https://api.rubyonrails.org/classes/ActiveSupport/Cache/MemoryStore.html) just for throttles - it will be local to your Rails process. This avoids a database roundtrip once the process knows a particular throttle is currently blocked:
+
+```ruby
+# in application.rb
+config.pecorino_throttle_cache = ActiveSupport::Cache::MemoryStore.new
+
+# in your controller
+throttle = Pecorino::Throttle.new(key: "ip-#{request.ip}", capacity: 10, over_time: 2.seconds, block_for: 2.minutes)
+cached_throttle = Pecorino::CachedThrottle.new(Rails.application.config.pecorino_throttle_cache, throttle)
+cached_throttle.request!
+```
+
 ## Using unlogged tables for reduced replication load (PostgreSQL)
 
 Throttles and leaky buckets are transient resources. If you are using Postgres replication, it might be prudent to set the Pecorino tables to `UNLOGGED`, which will exclude them from replication and save you bandwidth and storage on your read replica. To do so, add the following statements to your migration:
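The statements themselves are cut off in this diff; they would look roughly like the following. The table names here are assumptions - verify them against the migration Pecorino generated in your application:

```sql
-- Assumed table names; check the migration generated by Pecorino
ALTER TABLE pecorino_leaky_buckets SET UNLOGGED;
ALTER TABLE pecorino_blocks SET UNLOGGED;
```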
data/lib/pecorino/cached_throttle.rb ADDED
@@ -0,0 +1,91 @@
+# The cached throttles can be used when you want to lift your throttle blocks into
+# a higher-level cache. If you are dealing with clients which are hammering on your
+# throttles a lot, it is useful to have a process-local cache of the timestamp when
+# the blocks that are set are going to expire. If you are running, say, 10 web app
+# containers - and someone is hammering at an endpoint which starts blocking -
+# you don't really need to query your DB for every request. The first request indicated
+# as "blocked" by Pecorino can write a cache entry into a shared in-memory table,
+# and all subsequent calls to the same process can reuse that `blocked_until` value
+# to quickly refuse the request
+class Pecorino::CachedThrottle
+  # @param cache_store[ActiveSupport::Cache::Store] the store for the cached blocks. We recommend a MemoryStore per process.
+  # @param throttle[Pecorino::Throttle] the throttle to cache
+  def initialize(cache_store, throttle)
+    @cache_store = cache_store
+    @throttle = throttle
+  end
+
+  # @see Pecorino::Throttle#request!
+  def request!(n = 1)
+    blocked_state = read_cached_blocked_state
+    raise Pecorino::Throttle::Throttled.new(@throttle, blocked_state) if blocked_state&.blocked?
+
+    begin
+      @throttle.request!(n)
+    rescue Pecorino::Throttle::Throttled => throttled_ex
+      write_cache_blocked_state(throttled_ex.state) if throttled_ex.throttle == @throttle
+      raise
+    end
+  end
+
+  # Returns the cached `state` for the throttle if there is a currently active block for that throttle in the cache. Otherwise forwards to the underlying throttle.
+  #
+  # @see Pecorino::Throttle#request
+  def request(n = 1)
+    blocked_state = read_cached_blocked_state
+    return blocked_state if blocked_state&.blocked?
+
+    @throttle.request(n).tap do |state|
+      write_cache_blocked_state(state) if state.blocked_until
+    end
+  end
+
+  # Returns `false` if there is a currently active block for that throttle in the cache. Otherwise forwards to the underlying throttle.
+  #
+  # @see Pecorino::Throttle#able_to_accept?
+  def able_to_accept?(n = 1)
+    blocked_state = read_cached_blocked_state
+    return false if blocked_state&.blocked?
+
+    @throttle.able_to_accept?(n)
+  end
+
+  # Does not run the block if there is a currently active block for that throttle in the cache. Otherwise forwards to the underlying throttle.
+  #
+  # @see Pecorino::Throttle#throttled
+  def throttled(&blk)
+    # We can't wrap the implementation of "throttled". Or - we can, but it will be obtuse.
+    return if request(1).blocked?
+    yield
+  end
+
+  # Returns the key of the throttle
+  #
+  # @see Pecorino::Throttle#key
+  def key
+    @throttle.key
+  end
+
+  # Returns the cached state if there is a currently active block for that throttle in the cache. Otherwise forwards to the underlying throttle.
+  #
+  # @see Pecorino::Throttle#state
+  def state
+    blocked_state = read_cached_blocked_state
+    warn "Read blocked state #{blocked_state.inspect}"
+    return blocked_state if blocked_state&.blocked?
+
+    @throttle.state.tap do |state|
+      write_cache_blocked_state(state) if state.blocked?
+    end
+  end
+
+  private
+
+  def write_cache_blocked_state(state)
+    @cache_store.write("pecorino-cached-throttle-state-#{@throttle.key}", state, expires_after: state.blocked_until)
+  end
+
+  def read_cached_blocked_state
+    @cache_store.read("pecorino-cached-throttle-state-#{@throttle.key}")
+  end
+end
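The pattern this class implements - check a local cache first, fall through to the database-backed throttle, and cache any block that gets installed - can be distilled into a self-contained sketch. All names below are illustrative; a plain Hash stands in for the cache store and a lambda for the real throttle:

```ruby
# Toy version of the caching idea: serve repeat refusals from a local cache
# so the "database" (the wrapped throttle) is consulted only once per block.
class ToyCachedThrottle
  def initialize(cache, throttle)
    @cache = cache
    @throttle = throttle
  end

  def request
    # Fast path: a cached block means we can refuse without touching the DB
    blocked_until = @cache[:blocked_until]
    return :blocked if blocked_until && blocked_until > Time.now

    # Slow path: consult the real throttle; cache the block if one was set
    result = @throttle.call
    @cache[:blocked_until] = result if result.is_a?(Time)
    result.is_a?(Time) ? :blocked : :ok
  end
end

db_hits = 0
throttle = -> { db_hits += 1; Time.now + 60 } # always blocks; counts DB hits
cached = ToyCachedThrottle.new({}, throttle)

cached.request # hits the "database" and caches blocked_until
cached.request # served entirely from the cache
# db_hits is 1: the second refusal never reached the throttle
```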
data/lib/pecorino/throttle.rb CHANGED
@@ -6,23 +6,28 @@
 # the block is lifted. The block time can be arbitrarily higher or lower than the amount
 # of time it takes for the leaky bucket to leak out
 class Pecorino::Throttle
-  State = Struct.new(:blocked_until) do
-    # Tells whether this throttle is blocked, either due to the leaky bucket having filled up
-    # or due to there being a timed block set because of an earlier event of the bucket having
-    # filled up
-    def blocked?
-      blocked_until ? true : false
+  # The state represents a snapshot of the throttle state in time
+  class State
+    # @return [Time]
+    attr_reader :blocked_until
+
+    def initialize(blocked_until)
+      @blocked_until = blocked_until
     end
 
-    # Returns the number of seconds until the block will be lifted, rounded up to the closest
-    # whole second. This value can be used in a "Retry-After" HTTP response header.
+    # Tells whether this throttle still is in the blocked state.
+    # If the `blocked_until` value lies in the past, the method will
+    # return `false` - this is done so that the `State` can be cached.
     #
-    # @return [Integer]
-    def retry_after
-      (blocked_until - Time.now.utc).ceil
+    # @return [Boolean]
+    def blocked?
+      !!(@blocked_until && @blocked_until > Time.now)
     end
   end
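The key behavioral change is that `blocked?` now compares `blocked_until` against the current time, so a cached `State` naturally reads as unblocked once the block lapses. A minimal replica of the class shows this:

```ruby
# Minimal replica of the new State class, for illustration only
class State
  attr_reader :blocked_until

  def initialize(blocked_until)
    @blocked_until = blocked_until
  end

  def blocked?
    !!(@blocked_until && @blocked_until > Time.now)
  end
end

State.new(nil).blocked?           # false - never blocked
State.new(Time.now + 60).blocked? # true - block still in effect
State.new(Time.now - 60).blocked? # false - the block has already lapsed
```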
 
+  # {Pecorino::Throttle} will raise this exception from `request!`. The exception can be used
+  # for matching, for setting appropriate response headers, and for distinguishing between
+  # multiple different throttles.
   class Throttled < StandardError
     # Returns the throttle which raised the exception. Can be used to disambiguate between
     # multiple Throttled exceptions when multiple throttles are applied in a layered fashion:
@@ -34,21 +39,63 @@ class Pecorino::Throttle
     #     db_insert_throttle.request!(n_items_to_insert)
     #   rescue Pecorino::Throttle::Throttled => e
     #     deliver_notification(user) if e.throttle == user_email_throttle
+    #     firewall.ban_ip(ip) if e.throttle == ip_addr_throttle
     #   end
     #
     # @return [Throttle]
     attr_reader :throttle
 
-    # Returns the `retry_after` value in seconds, suitable for use in an HTTP header
-    attr_reader :retry_after
+    # Returns the throttle state based on which the exception is getting raised. This can
+    # be used for caching the exception, because the state can tell when the block will be
+    # lifted. This allows the throttle verification to be shifted into a faster layer of the
+    # system (like a blocklist in a firewall), or the state to be cached upstream. A block
+    # in Pecorino is set once and is active until expiry. If your service is under attack
+    # and you know that the call is blocked until a certain future time, the block can be
+    # lifted up into a faster/cheaper storage destination, like the Rails cache:
+    #
+    # @example
+    #   begin
+    #     ip_addr_throttle.request!
+    #   rescue Pecorino::Throttle::Throttled => e
+    #     firewall.ban_ip(request.ip, ttl_seconds: e.retry_after)
+    #     render :rate_limit_exceeded
+    #   end
+    #
+    # @example
+    #   state = Rails.cache.read(ip_addr_throttle.key)
+    #   return render :rate_limit_exceeded if state && state.blocked? # No need to call Pecorino for this
+    #
+    #   begin
+    #     ip_addr_throttle.request!
+    #   rescue Pecorino::Throttle::Throttled => e
+    #     Rails.cache.write(ip_addr_throttle.key, e.state, expires_in: (e.state.blocked_until - Time.now))
+    #     render :rate_limit_exceeded
+    #   end
+    #
+    # @return [Throttle::State]
+    attr_reader :state
 
     def initialize(from_throttle, state)
       @throttle = from_throttle
-      @retry_after = state.retry_after
+      @state = state
       super("Block in effect until #{state.blocked_until.iso8601}")
     end
+
+    # Returns the `retry_after` value in seconds, suitable for use in an HTTP header
+    #
+    # @return [Integer]
+    def retry_after
+      (@state.blocked_until - Time.now).ceil
+    end
   end
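Note that `retry_after` rounds the remaining block time up, so the value is never 0 while a block is still in effect - safe to place in a `Retry-After` header as-is:

```ruby
# retry_after rounds the remaining block time UP to whole seconds
blocked_until = Time.now + 0.2
retry_after = (blocked_until - Time.now).ceil
# retry_after is 1 (not 0), even though only a fraction of a second remains
```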
 
+  # The key for that throttle. Each key defines a unique throttle based on either a given name or
+  # discriminators. If there is a component you want to key your throttle by, include it in the
+  # `key` keyword argument to the constructor, like `"t-ip-#{request.ip}"`
+  #
+  # @return [String]
+  attr_reader :key
+
   # @param key[String] the key for both the block record and the leaky bucket
   # @param block_for[Numeric] the number of seconds to block any further requests for. Defaults to the time it takes
   #   the bucket to leak out to the level of 0
@@ -73,8 +120,8 @@ class Pecorino::Throttle
   end
 
   # Register that a request is being performed. Will raise Throttled
-  # if there is a block in place on that key, or if the bucket has been filled up
-  # and a block has been put in place as a result of this particular request.
+  # if there is a block in place for that throttle, or if the bucket cannot accept
+  # this fillup and a block has just been installed as a result of this particular request.
   #
   # The exception can be rescued later to provide a 429 response. This method is better
   # to use before performing the unit of work that the throttle is guarding:
@@ -89,11 +136,11 @@ class Pecorino::Throttle
   #
   # If the method call succeeds it means that the request is not getting throttled.
   #
-  # @return void
+  # @return [State] the state of the throttle after filling up the leaky bucket / trying to pass the block
   def request!(n = 1)
-    state = request(n)
-    raise Throttled.new(self, state) if state.blocked?
-    nil
+    request(n).tap do |state_after|
+      raise Throttled.new(self, state_after) if state_after.blocked?
+    end
   end
 
   # Register that a request is being performed. Will not raise any exceptions but return
@@ -122,4 +169,18 @@ class Pecorino::Throttle
       State.new(fresh_blocked_until.utc)
     end
   end
+
+  # Fill up the throttle with 1 request and then perform the passed block. This is useful for actions which should
+  # be rate-limited - alerts, calls to external services and the like. If the call is allowed to proceed,
+  # the passed block will be executed. If the throttle is in the blocked state, or if the call puts the throttle in
+  # the blocked state, the block will not be executed
+  #
+  # @example
+  #   t.throttled { Slack.alert("Things are going wrong") }
+  #
+  # @return [Object] the return value of the block if the block gets executed, or `nil` if the call got throttled
+  def throttled(&blk)
+    return if request(1).blocked?
+    yield
+  end
 end
data/lib/pecorino/version.rb CHANGED
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Pecorino
-  VERSION = "0.4.1"
+  VERSION = "0.5.0"
 end
data/lib/pecorino.rb CHANGED
@@ -7,6 +7,7 @@ require_relative "pecorino/version"
 require_relative "pecorino/leaky_bucket"
 require_relative "pecorino/throttle"
 require_relative "pecorino/railtie" if defined?(Rails::Railtie)
+require_relative "pecorino/cached_throttle"
 
 module Pecorino
   autoload :Postgres, "pecorino/postgres"
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: pecorino
 version: !ruby/object:Gem::Version
-  version: 0.4.1
+  version: 0.5.0
 platform: ruby
 authors:
 - Julik Tarkhanov
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2024-02-10 00:00:00.000000000 Z
+date: 2024-02-11 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activerecord
@@ -154,6 +154,7 @@ files:
 - README.md
 - Rakefile
 - lib/pecorino.rb
+- lib/pecorino/cached_throttle.rb
 - lib/pecorino/install_generator.rb
 - lib/pecorino/leaky_bucket.rb
 - lib/pecorino/migrations/create_pecorino_tables.rb.erb
@@ -185,7 +186,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 - !ruby/object:Gem::Version
   version: '0'
 requirements: []
-rubygems_version: 3.3.7
+rubygems_version: 3.4.10
 signing_key:
 specification_version: 4
 summary: Database-based rate limiter using leaky buckets