prorate 0.1.0 → 0.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: d846e16d888b11395f59a44e8d487e9c3cccf336
4
- data.tar.gz: 16051a7c67452164d40c3d8c785a26b973eaf54e
2
+ SHA256:
3
+ metadata.gz: bb2c78403fd3d37fd073ccf736618673e532c6fc53efd7e1c342c7edebc4037f
4
+ data.tar.gz: 497b74b1d07d1590e44f338f7150b6b75fe5ed154af82d052381915a2b174c69
5
5
  SHA512:
6
- metadata.gz: b241b77bf6ec18bd394a0360bebd6b9e10b3b0cde21f1642860b0425d33b8b3a50dbaba9a5fe4fd7b5595dd95a566dfe122939a94711b651afec125e3ecfcc94
7
- data.tar.gz: 1f110ef412d5c231f1eed8550f3957c4fc9797eef42f3a92de802b2be1d15b06c450b1913706c036dd30924925acfce410ec29adf495539ea0dd08763f904d50
6
+ metadata.gz: d2a262971d745073dfd385088d92bf40667fa8108e6c3b71982b17fad41d6ee94472e40f8189badfa6131ac853a0dfe381e9bfbef93a0b7bbd24c3f39339251f
7
+ data.tar.gz: 33f4f60558e7cee9fd671ebe48f4e35fc67e58c2ea5ad5cde5e40368b8b486e941eabb71ddf3dfaa87a9380689d31c059d56ea81cfa8860a955409d075840824
@@ -0,0 +1,2 @@
1
+ inherit_gem:
2
+ wetransfer_style: ruby/default.yml
@@ -1,5 +1,17 @@
1
- sudo: false
2
- language: ruby
3
1
  rvm:
4
- - 2.2.5
5
- before_install: gem install bundler -v 1.12.5
2
+ - 2.2
3
+ - 2.3
4
+ - 2.4
5
+ - 2.5
6
+ - 2.6
7
+ - 2.7
8
+
9
+ services:
10
+ - redis
11
+
12
+ dist: trusty # https://docs.travis-ci.com/user/trusty-ci-environment/
13
+ sudo: false
14
+ cache: bundler
15
+
16
+ script:
17
+ - bundle exec rake
@@ -0,0 +1,39 @@
1
+ # 0.7.0
2
+
3
+ * Add a naked `LeakyBucket` object which allows one to build sophisticated rate limiting relying
4
+ on the Ruby side of things more. It has fewer features than the `Throttle` but can be used for more
5
+ fine-grained control of the throttling. It also does not use exceptions for flow control.
6
+ The `Throttle` object used them because it should make the code abort *loudly* if a throttle is hit, but
7
+ when the objective is to measure instead, a smaller, less opinionated module can be more useful.
8
+ * Refactor the internals of the Throttle class so that it uses a default Logger, and document the arguments.
9
+ * Use fractional time measurement from Redis in Lua code. For our throttle to be precise we cannot really
10
+ limit ourselves to "anchored slots" on the start of a second, and we would be effectively doing that
11
+ with our previous setup.
12
+ * Fix the `redis` gem deprecation warnings when using `exists` - we will now use `exists?` if available.
13
+ * Remove dependency on the `ks` gem as we can use vanilla Structs or classes instead.
14
+
15
+ # 0.6.0
16
+
17
+ * Add `Throttle#status` method for retrieving the status of a throttle without placing any tokens
18
+ or raising any exceptions. This is useful for layered throttles.
19
+
20
+ # 0.5.0
21
+
22
+ * Allow setting the number of tokens to add to the bucket in `Throttle#throttle!` - this is useful because
23
+ sometimes a request effectively uses N of some resource in one go, and should thus cause a throttle
24
+ to fire without having to do repeated calls.
25
+
26
+ # 0.4.0
27
+
28
+ * When raising a `Throttled` exception, add the name of the throttle to it. This is useful when multiple
29
+ throttles are used together and one needs to find out which throttle has fired.
30
+ * Reformat code according to wetransfer_style and make it compulsory on CI
31
+
32
+ # 0.3.0
33
+
34
+ * Replace the Ruby implementation of the throttle with a Lua script which runs within Redis. This allows us
35
+ to do atomic gets+sets very rapidly.
36
+
37
+ # 0.1.0
38
+
39
+ * Initial release of Prorate
data/README.md CHANGED
@@ -1,7 +1,13 @@
1
1
  # Prorate
2
2
 
3
- Provides a low-level time-based throttle. Is mainly meant for situations where using something like Rack::Attack is not very
4
- useful since you need access to more variables.
3
+ Provides a low-level time-based throttle. It is mainly meant for situations where
4
+ using something like Rack::Attack is not very useful since you need access to
5
+ more variables. Under the hood, this uses a Lua script that implements the
6
+ [Leaky Bucket](https://en.wikipedia.org/wiki/Leaky_bucket) algorithm in a
7
+ single-threaded, race-condition-safe way.
8
+
9
+ [![Build Status](https://travis-ci.org/WeTransfer/prorate.svg?branch=master)](https://travis-ci.org/WeTransfer/prorate)
10
+ [![Gem Version](https://badge.fury.io/rb/prorate.svg)](https://badge.fury.io/rb/prorate)
5
11
 
6
12
  ## Installation
7
13
 
@@ -13,30 +19,137 @@ gem 'prorate'
13
19
 
14
20
  And then execute:
15
21
 
16
- $ bundle
22
+ ```shell
23
+ bundle install
24
+ ```
17
25
 
18
26
  Or install it yourself as:
19
27
 
20
- $ gem install prorate
28
+ ```shell
29
+ gem install prorate
30
+ ```
21
31
 
22
32
  ## Usage
23
33
 
34
+ The simplest mode of operation is throttling an endpoint, using the throttler
35
+ before the action happens.
36
+
24
37
  Within your Rails controller:
25
38
 
26
- throttle_args[:block_for] ||= throttle_args.fetch(:period)
27
- t = Prorate::Throttle.new(redis: Redis.new, logger: Rails.logger,
28
- name: "throttle-login-email", limit: 20, period: 5.seconds)
29
- # Add all the parameters that function as a discriminator
30
- t << request.ip
31
- t << params.require(:email)
32
- # ...and call the throttle! method
33
- t.throttle! # Will raise a Prorate::Throttled exception if the limit has been reached
39
+ ```ruby
40
+ t = Prorate::Throttle.new(
41
+ redis: Redis.new,
42
+ logger: Rails.logger,
43
+ name: "throttle-login-email",
44
+ limit: 20,
45
+ period: 5.seconds
46
+ )
47
+ # Add all the parameters that function as a discriminator.
48
+ t << request.ip << params.require(:email)
49
+ # ...and call the throttle! method
50
+ t.throttle! # Will raise a Prorate::Throttled exception if the limit has been reached
51
+ #
52
+ # Your regular action happens after this point
53
+ ```
54
+
55
+ To capture that exception, in the controller
56
+
57
+ ```ruby
58
+ rescue_from Prorate::Throttled do |e|
59
+ response.set_header('Retry-After', e.retry_in_seconds.to_s)
60
+ render nothing: true, status: 429
61
+ end
62
+ ```
63
+
64
+ ### Throttling and checking status
65
+
66
+ Finer-grained control can be achieved by combining throttling (see the previous
67
+ step) and - in subsequent calls - checking the status of the throttle before
68
+ invoking the throttle. **When you call `throttle!`, you add tokens to the leaky bucket.**
69
+
70
+ Let's say you have an endpoint that not only needs throttling, but you want to
71
+ ban [credential stuffers](https://en.wikipedia.org/wiki/Credential_stuffing)
72
+ outright. This is a multi-step process:
73
+
74
+ 1. Respond with a 429 if the discriminators of the request would land in an
75
+ already blocking 'credential-stuffing'-throttle
76
+ 1. Run your regular throttling
77
+ 1. Perform your sign in action
78
+ 1. If the sign in was unsuccessful, add the discriminators to the
79
+ 'credential-stuffing'-throttle
80
+
81
+ In your controller that would look like this:
82
+
83
+ ```ruby
84
+ t = Prorate::Throttle.new(
85
+ redis: Redis.new,
86
+ logger: Rails.logger,
87
+ name: "credential-stuffing",
88
+ limit: 20,
89
+ period: 20.minutes
90
+ )
91
+ # Add all the parameters that function as a discriminator.
92
+ t << request.ip
93
+ # And before anything else, check whether it is throttled
94
+ if t.status.throttled?
95
+ response.set_header('Retry-After', t.status.remaining_throttle_seconds.to_s)
96
+ render(nothing: true, status: 429) and return
97
+ end
98
+
99
+ # run your regular throttles for the endpoint
100
+ other_throttles.each(&:throttle!)
101
+ # Perform your sign-in logic...
102
+
103
+ user = YourSignInLogic.valid?(
104
+ email: params[:email],
105
+ password: params[:password]
106
+ )
107
+
108
+ # Add the request to the credential stuffing throttle if we didn't succeed
109
+ t.throttle! unless user
110
+
111
+ # the rest of your action
112
+ ```
34
113
 
35
114
  To capture that exception, in the controller
36
115
 
37
- rescue_from Prorate::Throttled do |e|
38
- render nothing: true, status: 429
39
- end
116
+ ```ruby
117
+ rescue_from Prorate::Throttled do |e|
118
+ response.set_header('Retry-After', e.retry_in_seconds.to_s)
119
+ render nothing: true, status: 429
120
+ end
121
+ ```
122
+
123
+ ## Using just the leaky bucket
124
+
125
+ There is also an object for using the heart of Prorate (the leaky bucket) without blocking or exceptions. This is useful
126
+ if you want to implement a more generic rate limiting solution and customise it in a fancier way. The leaky bucket on
127
+ its own provides only the following conveniences:
128
+
129
+ * Tracks the number of tokens added and the number of tokens that have leaked
130
+ * Tracks whether a specific token fillup has overflowed the bucket. This is only tracked momentarily, since the bucket immediately starts leaking again
131
+
132
+ The level and leak rate are computed and provided as Floats, rather than the Integers used in the Throttle class.
133
+ To use it, employ the `LeakyBucket` object:
134
+
135
+ ```ruby
136
+ # The leak_rate is in tokens per second
137
+ leaky_bucket = Prorate::LeakyBucket.new(redis: Redis.new, redis_key_prefix: "user123", leak_rate: 0.8, bucket_capacity: 2)
138
+ leaky_bucket.state.level #=> will return 0.0
139
+ leaky_bucket.state.full? #=> will return false
140
+ state_after_add = leaky_bucket.fillup(2) #=> returns a BucketState object
141
+ state_after_add.full? #=> will return true
142
+ state_after_add.level #=> will return 2.0
143
+ ```
144
+
145
+ ## Why Lua?
146
+
147
+ Prorate implements throttling using the "Leaky Bucket" algorithm, which is described extensively [here](https://github.com/WeTransfer/prorate/blob/master/lib/prorate/throttle.rb). The implementation uses a Lua script, because Lua is the only language available which runs _inside_ Redis. Thanks to the speed of Lua, the script runs fast enough to be applied on every throttle call.
148
+
149
+ Using a Lua script in Prorate helps us achieve the following guarantees:
150
+
151
+ - **The script will run atomically.** The script is evaluated as a single Redis command. This ensures that the commands in the Lua script will never be interleaved with another client: they will always execute together.
152
+ - **Any usages of time will use the Redis time.** Throttling requires a consistent and monotonic _time source_. The only monotonic and consistent time source usable in the context of Prorate is the `TIME` result of Redis itself. We are throttling requests from different machines, which will invariably have clock drift between them; using the Redis server's `TIME` keeps them consistent.
40
153
 
41
154
  ## Development
42
155
 
@@ -48,8 +161,6 @@ To install this gem onto your local machine, run `bundle exec rake install`. To
48
161
 
49
162
  Bug reports and pull requests are welcome on GitHub at https://github.com/WeTransfer/prorate.
50
163
 
51
-
52
164
  ## License
53
165
 
54
166
  The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
55
-
data/Rakefile CHANGED
@@ -1,6 +1,14 @@
1
1
  require "bundler/gem_tasks"
2
2
  require "rspec/core/rake_task"
3
+ require 'rubocop/rake_task'
4
+ require 'yard'
3
5
 
4
- RSpec::Core::RakeTask.new(:spec)
6
+ YARD::Rake::YardocTask.new(:doc) do |t|
7
+ # The dash has to be between the two to "divide" the source files and
8
+ # miscellaneous documentation files that contain no code
9
+ t.files = ['lib/**/*.rb', '-', 'LICENSE.txt', 'CHANGELOG.md']
10
+ end
5
11
 
6
- task :default => :spec
12
+ RSpec::Core::RakeTask.new(:spec)
13
+ RuboCop::RakeTask.new(:rubocop)
14
+ task default: [:spec, :rubocop]
@@ -1,6 +1,4 @@
1
1
  require "prorate/version"
2
- require "ks"
3
- require "logger"
4
2
  require "redis"
5
3
 
6
4
  module Prorate
@@ -0,0 +1,77 @@
1
+ -- Single threaded Leaky Bucket implementation (without blocking).
2
+ -- args: leak_rate, fillup, bucket_capacity. To just verify the state of the bucket, a fillup of 0 may be passed.
3
+ -- returns: the level of the bucket as a string-formatted float, plus a 0/1 flag telling whether the bucket is at capacity
4
+
5
+ -- this is required to be able to use TIME and writes; basically it lifts the script into IO
6
+ redis.replicate_commands()
7
+
8
+ -- Redis documentation recommends passing the keys separately so that Redis
9
+ -- can - in the future - verify that they live on the same shard of a cluster, and
10
+ -- raise an error if they are not. As far as can be understood this functionality is not
11
+ -- yet present, but if we can make a little effort to make ourselves more future proof
12
+ -- we should.
13
+ local bucket_level_key = KEYS[1]
14
+ local last_updated_key = KEYS[2]
15
+
16
+ local leak_rate = tonumber(ARGV[1])
17
+ local fillup = tonumber(ARGV[2]) -- How many tokens this call adds to the bucket.
18
+ local bucket_capacity = tonumber(ARGV[3]) -- How many tokens is the bucket allowed to contain
19
+
20
+ -- Compute the key TTL for the bucket. We are interested in how long it takes the bucket
21
+ -- to leak all the way to 0, as this is the time when the values stay relevant. We pad with 1 second
22
+ -- to have a little cushion.
23
+ local key_lifetime = math.ceil((bucket_capacity / leak_rate) + 1)
24
+
25
+ -- Take a timestamp
26
+ local redis_time = redis.call("TIME") -- Array of [seconds, microseconds]
27
+ local now = tonumber(redis_time[1]) + (tonumber(redis_time[2]) / 1000000)
28
+
29
+ -- get current bucket level. The throttle key might not exist yet in which
30
+ -- case we default to 0
31
+ local bucket_level = tonumber(redis.call("GET", bucket_level_key)) or 0
32
+
33
+ -- ...and then perform the leaky bucket fillup/leak. We need to do this also when the bucket has
34
+ -- just been created because the initial fillup to add might be so high that it will
35
+ -- immediately overflow the bucket and trigger the throttle, on the first call.
36
+ local last_updated = tonumber(redis.call("GET", last_updated_key)) or now -- use sensible default of 'now' if the key does not exist
37
+
38
+ -- Subtract the tokens that leaked since the last call, and add the fillup
39
+ local dt = now - last_updated
40
+ local new_bucket_level = bucket_level - (leak_rate * dt) + fillup
41
+
42
+ -- ...and clamp the resulting value between 0 and the bucket capacity
43
+ new_bucket_level = math.max(0, math.min(bucket_capacity, new_bucket_level))
44
+
45
+ -- Since we return the level as a string-formatted floating point number, some
46
+ -- precision is lost in formatting, so a full bucket may not read back as exactly
47
+ -- equal to the capacity. That bit of information is therefore preserved separately.
48
+ local at_capacity = 0
49
+ if new_bucket_level == bucket_capacity then
50
+ at_capacity = 1
51
+ end
52
+
53
+ -- If the initial level was 0 and the level after adding tokens is still 0, we
54
+ -- can avoid setting keys in Redis at all as this was only a level check.
55
+ if new_bucket_level == 0 and bucket_level == 0 then
56
+ return {"0.0", at_capacity}
57
+ end
58
+
59
+ -- Save the new bucket level
60
+ redis.call("SETEX", bucket_level_key, key_lifetime, new_bucket_level)
61
+
62
+ -- Record when we updated the bucket so that the amount of tokens leaked
63
+ -- can be correctly determined on the next invocation
64
+ redis.call("SETEX", last_updated_key, key_lifetime, now)
65
+
66
+ -- Most Redis adapters when used with the Lua interface truncate floats
67
+ -- to integers (at least in Python that is documented to be the case in
68
+ -- the Redis ebook here
69
+ -- https://redislabs.com/ebook/part-3-next-steps/chapter-11-scripting-redis-with-lua/11-1-adding-functionality-without-writing-c
70
+ -- We need access to the bucket level as a float value since our leak rate might as well be floating point, and to achieve that
71
+ -- we can go two ways. We can turn the float into a Lua string, and then parse it on the other side, or we can convert it to
72
+ -- a tuple of two integer values - one for the integer component and one for fraction.
73
+ -- Now, the unpleasant aspect is that when we do this we will lose precision - the number is not going to be
74
+ -- exactly equal to capacity, thus we lose the bit of information which tells us whether we filled up the bucket or not.
75
+ -- Also, the only moment we can register whether the bucket is at capacity is now - in this script - since
76
+ -- by the next call some tokens will have leaked.
77
+ return {string.format("%.9f", new_bucket_level), at_capacity}
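In Ruby terms, the arithmetic this script performs on each call can be sketched as follows. This is a simplified in-memory illustration only: the real state lives in Redis keys with a TTL, and `leaky_bucket_step` is a name invented for this example.

```ruby
# Simplified, in-memory sketch of the per-call arithmetic in leaky_bucket.lua.
# The real implementation stores the level and last-updated time in Redis keys.
def leaky_bucket_step(level:, last_updated:, now:, leak_rate:, fillup:, capacity:)
  # Subtract what leaked since the last update, then add the fillup tokens
  new_level = level - (leak_rate * (now - last_updated)) + fillup
  # Clamp between 0 and the capacity, noting whether we hit the cap
  new_level = new_level.clamp(0.0, capacity)
  [new_level, new_level == capacity]
end
```

For example, filling a bucket of capacity 2 with 2 tokens reports it as full, and after one second at a leak rate of 0.8 the level has dropped to 1.2.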
@@ -0,0 +1,134 @@
1
+ module Prorate
2
+
3
+ # This offers just the leaky bucket implementation with fill control, but without the timed lock.
4
+ # It does not raise any exceptions, it just tracks the state of a leaky bucket in Redis.
5
+ #
6
+ # Important differences from the more full-featured Throttle class are:
7
+ #
8
+ # * No logging (as most meaningful code lives in Lua anyway)
9
+ # * No timed block - if you need to keep track of timed blocking it can be done externally
10
+ # * Leak rate is specified directly in tokens per second, instead of specifying the block period.
11
+ # * The bucket level is stored and returned as a Float which allows for finer-grained measurement,
12
+ # but more importantly - makes testing from the outside easier.
13
+ #
14
+ # It does have a few downsides compared to the Throttle though:
15
+ #
16
+ # * Bucket is only full momentarily. On subsequent calls some tokens will leak already, so you either
17
+ # need to do delta checks on the value or rely on putting the token into the bucket.
18
+ class LeakyBucket
19
+ LUA_SCRIPT_CODE = File.read(File.join(__dir__, "leaky_bucket.lua"))
20
+ LUA_SCRIPT_HASH = Digest::SHA1.hexdigest(LUA_SCRIPT_CODE)
21
+
22
+ class BucketState < Struct.new(:level, :full)
23
+ # Returns the level of the bucket after the operation on the LeakyBucket
24
+ # object has taken place. There is a guarantee that no tokens have leaked
25
+ # from the bucket between the operation and the freezing of the BucketState
26
+ # struct.
27
+ #
28
+ # @!attribute [r] level
29
+ # @return [Float]
30
+
31
+ # Tells whether the bucket was detected to be full when the operation on
32
+ # the LeakyBucket was performed. There is a guarantee that no tokens have leaked
33
+ # from the bucket between the operation and the freezing of the BucketState
34
+ # struct.
35
+ #
36
+ # @!attribute [r] full
37
+ # @return [Boolean]
38
+
39
+ alias_method :full?, :full
40
+
41
+ # Returns the bucket level of the bucket state as a Float
42
+ #
43
+ # @return [Float]
44
+ def to_f
45
+ level.to_f
46
+ end
47
+
48
+ # Returns the bucket level of the bucket state rounded to an Integer
49
+ #
50
+ # @return [Integer]
51
+ def to_i
52
+ level.to_i
53
+ end
54
+ end
55
+
56
+ # Creates a new LeakyBucket. The object controls 2 keys in Redis: one
57
+ # for the last access time, and one for the contents of the key.
58
+ #
59
+ # @param redis_key_prefix[String] the prefix that is going to be used for keys.
60
+ # If your bucket is specific to a user, a browser or an IP address, you need to mix
61
+ # those values into the key prefix as appropriate.
62
+ # @param leak_rate[Float] the leak rate of the bucket, in tokens per second
63
+ # @param redis[Redis,#with] a Redis connection or a ConnectionPool instance
64
+ # if you are using the connection_pool gem. With a connection pool Prorate will
65
+ # checkout a connection using `#with` and check it in when it's done.
66
+ # @param bucket_capacity[Numeric] how many tokens is the bucket capped at.
67
+ # Filling up the bucket using `fillup()` will add to that number, but
68
+ # the bucket contents will then be capped at this value. So with
69
+ # bucket_capacity set to 12 and a `fillup(14)` the bucket will reach the level
70
+ # of 12, and will then immediately start leaking again.
71
+ def initialize(redis_key_prefix:, leak_rate:, redis:, bucket_capacity:)
72
+ @redis_key_prefix = redis_key_prefix
73
+ @redis = redis.respond_to?(:with) ? redis : NullPool.new(redis)
74
+ @leak_rate = leak_rate.to_f
75
+ @capacity = bucket_capacity.to_f
76
+ end
77
+
78
+ # Places `n_tokens` tokens in the bucket.
79
+ #
80
+ # @return [BucketState] the state of the bucket after the operation
81
+ def fillup(n_tokens)
82
+ run_lua_bucket_script(n_tokens.to_f)
83
+ end
84
+
85
+ # Returns the current state of the bucket, containing the level and whether the bucket is full
86
+ #
87
+ # @return [BucketState] the state of the bucket after the operation
88
+ def state
89
+ run_lua_bucket_script(0)
90
+ end
91
+
92
+ # Returns the Redis key for the leaky bucket itself
93
+ # Note that the key is not guaranteed to contain a value if the bucket has not been filled
94
+ # up recently.
95
+ #
96
+ # @return [String]
97
+ def leaky_bucket_key
98
+ "#{@redis_key_prefix}.leaky_bucket.bucket_level"
99
+ end
100
+
101
+ # Returns the Redis key under which the last updated time of the bucket gets stored.
102
+ # Note that the key is not guaranteed to contain a value if the bucket has not been filled
103
+ # up recently.
104
+ #
105
+ # @return [String]
106
+ def last_updated_key
107
+ "#{@redis_key_prefix}.leaky_bucket.last_updated"
108
+ end
109
+
110
+ private
111
+
112
+ def run_lua_bucket_script(n_tokens)
113
+ @redis.with do |r|
114
+ begin
115
+ # The script returns the bucket level as a string-formatted float, plus a flag
116
+ # telling whether the bucket was at capacity after the operation
117
+ level_str, is_full_int = r.evalsha(
118
+ LUA_SCRIPT_HASH,
119
+ keys: [leaky_bucket_key, last_updated_key], argv: [@leak_rate, n_tokens, @capacity])
120
+ BucketState.new(level_str.to_f, is_full_int == 1)
121
+ rescue Redis::CommandError => e
122
+ if e.message.include? "NOSCRIPT"
123
+ # The Redis server has never seen this script before. Needs to run only once in the entire lifetime
124
+ # of the Redis server, until the script changes - in which case it will be loaded under a different SHA
125
+ r.script(:load, LUA_SCRIPT_CODE)
126
+ retry
127
+ else
128
+ raise e
129
+ end
130
+ end
131
+ end
132
+ end
133
+ end
134
+ end
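The `evalsha`-then-`script(:load)` fallback in `run_lua_bucket_script` above can be illustrated with a stub connection. `FakeRedis` and `eval_cached` below are invented for this sketch; the real code talks to an actual Redis server and rescues `Redis::CommandError` from the redis gem.

```ruby
require 'digest'

# Hypothetical stand-in for a Redis connection. It mimics the server-side
# script cache: EVALSHA fails with NOSCRIPT until SCRIPT LOAD has been called.
class FakeRedis
  CommandError = Class.new(StandardError)

  def initialize
    @scripts = {} # SHA1 hex digest => script body, like the Redis script cache
  end

  def script(_subcommand, code)
    @scripts[Digest::SHA1.hexdigest(code)] = code
  end

  def evalsha(sha)
    raise CommandError, "NOSCRIPT No matching script" unless @scripts.key?(sha)
    :script_ran
  end
end

# Try the cached script first; upload it only when the server reports NOSCRIPT,
# then retry. After the first round-trip the script stays cached server-side.
def eval_cached(redis, code)
  sha = Digest::SHA1.hexdigest(code)
  begin
    redis.evalsha(sha)
  rescue FakeRedis::CommandError => e
    raise unless e.message.include?("NOSCRIPT")
    redis.script(:load, code)
    retry
  end
end
```

The upload therefore happens at most once per Redis server lifetime (per script version), which is why both Prorate classes precompute `LUA_SCRIPT_HASH` at load time.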
@@ -1,10 +1,15 @@
1
1
  module Prorate
2
2
  module NullLogger
3
3
  def self.debug(*); end
4
+
4
5
  def self.info(*); end
6
+
5
7
  def self.warn(*); end
8
+
6
9
  def self.error(*); end
10
+
7
11
  def self.fatal(*); end
12
+
8
13
  def self.unknown(*); end
9
14
  end
10
15
  end
@@ -1,5 +1,7 @@
1
1
  module Prorate
2
2
  class NullPool < Struct.new(:conn)
3
- def with; yield conn; end
3
+ def with
4
+ yield conn
5
+ end
4
6
  end
5
7
  end
@@ -0,0 +1,54 @@
1
+ -- Single threaded Leaky Bucket implementation.
2
+ -- args: key_base, leak_rate, max_bucket_capacity, block_duration, n_tokens
3
+ -- returns: an array of two integers, the first of which indicates the remaining block time.
4
+ -- if the block time is nonzero, the second integer is always zero. If the block time is zero,
5
+ -- the second integer indicates the level of the bucket
6
+
7
+ -- this is required to be able to use TIME and writes; basically it lifts the script into IO
8
+ redis.replicate_commands()
9
+ -- make some nicer looking variable names:
10
+ local retval = nil
11
+ local bucket_level_key = ARGV[1] .. ".bucket_level"
12
+ local last_updated_key = ARGV[1] .. ".last_updated"
13
+ local block_key = ARGV[1] .. ".block"
14
+ local max_bucket_capacity = tonumber(ARGV[2])
15
+ local leak_rate = tonumber(ARGV[3])
16
+ local block_duration = tonumber(ARGV[4])
17
+ local n_tokens = tonumber(ARGV[5]) -- How many tokens this call adds to the bucket. Defaults to 1
18
+
19
+ -- Take the Redis timestamp
20
+ local redis_time = redis.call("TIME") -- Array of [seconds, microseconds]
21
+ local now = tonumber(redis_time[1]) + (tonumber(redis_time[2]) / 1000000)
22
+ local key_lifetime = math.ceil(max_bucket_capacity / leak_rate)
23
+
24
+ local blocked_until = redis.call("GET", block_key)
25
+ if blocked_until then
26
+ return {(tonumber(blocked_until) - now), 0}
27
+ end
28
+
29
+ -- get current bucket level. The throttle key might not exist yet in which
30
+ -- case we default to 0
31
+ local bucket_level = tonumber(redis.call("GET", bucket_level_key)) or 0
32
+
33
+ -- ...and then perform the leaky bucket fillup/leak. We need to do this also when the bucket has
34
+ -- just been created because the initial n_tokens to add might be so high that it will
35
+ -- immediately overflow the bucket and trigger the throttle, on the first call.
36
+ local last_updated = tonumber(redis.call("GET", last_updated_key)) or now -- use sensible default of 'now' if the key does not exist
37
+ local new_bucket_level = math.max(0, bucket_level - (leak_rate * (now - last_updated)))
38
+
39
+ if (new_bucket_level + n_tokens) <= max_bucket_capacity then
40
+ new_bucket_level = math.max(0, new_bucket_level + n_tokens)
41
+ retval = {0, math.ceil(new_bucket_level)}
42
+ else
43
+ redis.call("SETEX", block_key, block_duration, now + block_duration)
44
+ retval = {block_duration, 0}
45
+ end
46
+
47
+ -- Save the new bucket level
48
+ redis.call("SETEX", bucket_level_key, key_lifetime, new_bucket_level)
49
+
50
+ -- Record when we updated the bucket so that the amount of tokens leaked
51
+ -- can be correctly determined on the next invocation
52
+ redis.call("SETEX", last_updated_key, key_lifetime, now)
53
+
54
+ return retval
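The decision this script makes, once no block is already in effect, can be sketched in Ruby. This is a simplified in-memory illustration; `rate_limit_step` is a name invented for this example, and the persisted block key and TTLs are left out.

```ruby
# Simplified sketch of the accept-or-block decision in rate_limit.lua.
# The real script persists the level, timestamp and block key in Redis.
def rate_limit_step(level:, elapsed:, leak_rate:, capacity:, block_duration:, n_tokens:)
  # Leak tokens for the elapsed time, never dropping below empty
  level = [0.0, level - (leak_rate * elapsed)].max
  if level + n_tokens <= capacity
    [0, (level + n_tokens).ceil] # accepted: no block, report the new level
  else
    [block_duration, 0]          # overflow: report the block duration
  end
end
```

Note the asymmetry the return value encodes: a nonzero first element means the caller is blocked, and only then is the reported level zeroed out.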
@@ -1,34 +1,160 @@
1
1
  require 'digest'
2
2
 
3
3
  module Prorate
4
- class Throttle < Ks.strict(:name, :limit, :period, :block_for, :redis, :logger)
5
- def initialize(*)
6
- super
4
+ class MisconfiguredThrottle < StandardError
5
+ end
6
+
7
+ class Throttle
8
+ LUA_SCRIPT_CODE = File.read(File.join(__dir__, "rate_limit.lua"))
9
+ LUA_SCRIPT_HASH = Digest::SHA1.hexdigest(LUA_SCRIPT_CODE)
10
+
11
+ attr_reader :name, :limit, :period, :block_for, :redis, :logger
12
+
13
+ def initialize(name:, limit:, period:, block_for:, redis:, logger: Prorate::NullLogger)
14
+ @name = name.to_s
7
15
  @discriminators = [name.to_s]
8
- self.redis = NullPool.new(redis) unless redis.respond_to?(:with)
16
+ @redis = redis.respond_to?(:with) ? redis : NullPool.new(redis)
17
+ @logger = logger
18
+ @block_for = block_for
19
+
20
+ raise MisconfiguredThrottle if (period <= 0) || (limit <= 0)
21
+
22
+ # Do not do type conversions here since we want to allow the caller to read
23
+ # those values back later
24
+ # (API contract which the previous implementation of Throttle already supported)
25
+ @limit = limit
26
+ @period = period
27
+
28
+ @leak_rate = limit.to_f / period # tokens per second;
9
29
  end
10
-
30
+
31
+ # Add a value that will be used to distinguish this throttle from others.
32
+ # It has to be something user- or connection-specific, and multiple
33
+ # discriminators can be combined:
34
+ #
35
+ # throttle << ip_address << user_agent_fingerprint
36
+ #
37
+ # @param discriminator[Object] a Ruby object that can be marshaled
38
+ # in an equivalent way between requests, using `Marshal.dump`
11
39
  def <<(discriminator)
12
40
  @discriminators << discriminator
13
41
  end
14
-
15
- def throttle!
42
+
43
+ # Applies the throttle and raises a {Throttled} exception if it has been triggered
44
+ #
45
+ # Accepts an optional number of tokens to put in the bucket (default is 1).
46
+ # The effect of `n_tokens:` set to 0 is a "ping".
47
+ # It makes sure the throttle keys in Redis get created and adjusts the
48
+ # last invoked time of the leaky bucket. Can be used when a throttle
49
+ # is applied in a "shadow" fashion. For example, imagine you
50
+ # have a cascade of throttles with the following block times:
51
+ #
52
+ # Throttle A: [-------]
53
+ # Throttle B: [----------]
54
+ #
55
+ # You apply Throttle A: and it fires, but when that happens you also
56
+ # want to enable a throttle that is applied to "repeat offenders" only -
57
+ # - for instance ones that probe for tokens and/or passwords.
58
+ #
59
+ # Throttle C: [-------------------------------]
60
+ #
61
+ # If your "Throttle A" fires, you can trigger Throttle C
62
+ #
63
+ # Throttle A: [-----|-]
64
+ # Throttle C: [-----|-------------------------]
65
+ #
66
+ # because you know that Throttle A has fired and thus Throttle C comes
67
+ # into effect. What you want to do, however, is to fire Throttle C
68
+ # even though Throttle A: would have unlatched, which would create this
69
+ # call sequence:
70
+ #
71
+ # Throttle A: [-------] *(A not triggered)
72
+ # Throttle C: [------------|------------------]
73
+ #
74
+ # To achieve that you can keep Throttle C alive using `throttle!(n_tokens: 0)`,
75
+ # on every check that touches Throttle A and/or Throttle C. It keeps the leaky bucket
76
+ # updated but does not add any tokens to it:
77
+ #
78
+ # Throttle A: [------] *(A not triggered since block period has ended)
79
+ # Throttle C: [-----------|(ping)------------------] C is still blocking
80
+ #
81
+ # So you can effectively "keep a throttle alive" without ever triggering it,
82
+ # or keep it alive in combination with other throttles.
83
+ #
84
+ # @param n_tokens[Integer] the number of tokens to put in the bucket. If you are
85
+ # using Prorate for rate limiting, and a single request is adding N objects to your
86
+ # database for example, you can "top up" the bucket with a set number of tokens
87
+ # with a arbitrary ratio - like 1 token per inserted row. Once the bucket fills up
88
+ # the Throttled exception is going to be raised. Defaults to 1.
89
+ def throttle!(n_tokens: 1)
90
+ @logger.debug { "Applying throttle counter %s" % @name }
91
+ remaining_block_time, bucket_level = run_lua_throttler(
92
+ identifier: identifier,
93
+ bucket_capacity: @limit,
94
+ leak_rate: @leak_rate,
95
+ block_for: @block_for,
96
+ n_tokens: n_tokens)
97
+
98
+ if remaining_block_time > 0
99
+ @logger.warn do
100
+ "Throttle %s exceeded limit of %d in %d seconds and is blocked for the next %d seconds" % [@name, @limit, @period, remaining_block_time]
101
+ end
102
+ raise ::Prorate::Throttled.new(@name, remaining_block_time)
103
+ end
104
+
105
+ @limit - bucket_level # Return how many calls remain
106
+ end
107
+
108
+ def status
109
+ redis_block_key = "#{identifier}.block"
110
+ @redis.with do |r|
111
+ is_blocked = redis_key_exists?(r, redis_block_key)
112
+ if is_blocked
113
+ remaining_seconds = r.get(redis_block_key).to_i - Time.now.to_i
114
+ Status.new(_is_throttled = true, remaining_seconds)
115
+ else
116
+ remaining_seconds = 0
117
+ Status.new(_is_throttled = false, remaining_seconds)
118
+ end
119
+ end
120
+ end
121
+
122
+ private
123
+
124
+ def identifier
16
125
  discriminator = Digest::SHA1.hexdigest(Marshal.dump(@discriminators))
17
- identifier = [name, discriminator].join(':')
18
-
19
- redis.with do |r|
20
- logger.info { "Checking throttle block %s" % name }
21
- raise Throttled.new(block_for) if Prorate::BlockFor.blocked?(id: identifier, redis: r)
22
-
23
- logger.info { "Applying throttle counter %s" % name }
24
- c = Prorate::Counter.new(redis: r, id: identifier, logger: logger, window_size: period)
25
- after_increment = c.incr
26
- if after_increment > limit
27
- logger.warn { "Throttle %s exceeded limit of %d at %d" % [name, limit, after_increment] }
28
- Prorate::BlockFor.block!(redis: r, id: identifier, duration: block_for)
29
- raise Throttled.new(period)
126
+ "#{@name}:#{discriminator}"
127
+ end
128
+
129
+ # redis-rb 4.2 started printing a warning for every single-argument use of `#exists`, because
130
+ # they intend to break compatibility in a future version (to return an integer instead of a
131
+ # boolean). The old behavior (returning a boolean) is available using the new `exists?` method.
132
+ def redis_key_exists?(redis, key)
133
+ return redis.exists?(key) if redis.respond_to?(:exists?)
134
+ redis.exists(key)
135
+ end
136
+
137
+ def run_lua_throttler(identifier:, bucket_capacity:, leak_rate:, block_for:, n_tokens:)
138
+ @redis.with do |redis|
139
+ begin
140
+ redis.evalsha(LUA_SCRIPT_HASH, [], [identifier, bucket_capacity, leak_rate, block_for, n_tokens])
141
+ rescue Redis::CommandError => e
142
+ if e.message.include? "NOSCRIPT"
143
+ # The Redis server has never seen this script before. Needs to run only once in the entire lifetime
144
+ # of the Redis server, until the script changes - in which case it will be loaded under a different SHA
145
+ redis.script(:load, LUA_SCRIPT_CODE)
146
+ retry
147
+ else
148
+ raise e
149
+ end
30
150
  end
31
151
  end
32
152
  end
153
+
154
+ class Status < Struct.new(:is_throttled, :remaining_throttle_seconds)
155
+ def throttled?
156
+ is_throttled
157
+ end
158
+ end
33
159
  end
34
160
  end
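The `run_lua_throttler` hunk above uses the standard EVALSHA-with-NOSCRIPT-fallback pattern: try the cached script by its SHA first, and if the server has never seen it, load it once and retry. A minimal self-contained sketch of the same pattern — the `FakeRedis` stub, the error class, and the script body are illustrative stand-ins so the example runs without a Redis server, not part of the gem:

```ruby
# Sketch of the EVALSHA / NOSCRIPT-retry pattern from throttle.rb.
# FakeRedis stands in for a real connection; CommandError mirrors the
# role of Redis::CommandError in redis-rb.
require 'digest'

class CommandError < StandardError; end

class FakeRedis
  def initialize
    @scripts = {} # sha -> script body, like the server-side script cache
  end

  def script(_load, code)
    sha = Digest::SHA1.hexdigest(code)
    @scripts[sha] = code
    sha
  end

  def evalsha(sha, _keys, argv)
    raise CommandError, "NOSCRIPT No matching script" unless @scripts.key?(sha)
    argv.first.to_i * 2 # stand-in for actually running the Lua script
  end
end

SCRIPT = "return ARGV[1] * 2" # illustrative script body
SHA = Digest::SHA1.hexdigest(SCRIPT)

def run_script(redis, n)
  redis.evalsha(SHA, [], [n])
rescue CommandError => e
  raise unless e.message.include?("NOSCRIPT")
  # First use on this server: load the script once, then retry the EVALSHA.
  redis.script(:load, SCRIPT)
  retry
end

p run_script(FakeRedis.new, 21) # => 42 (second attempt succeeds after SCRIPT LOAD)
```

The `retry` in the rescue clause re-runs the method body, so after the one-time `SCRIPT LOAD` every subsequent call hits the cached script directly.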
@@ -1,9 +1,20 @@
1
- module Prorate
2
- class Throttled < StandardError
3
- attr_reader :retry_in_seconds
4
- def initialize(try_again_in)
5
- @retry_in_seconds = try_again_in
6
- super("Throttled, please lower your temper and try again in %d seconds" % try_again_in)
7
- end
1
+ # The Throttled exception gets raised when a throttle is triggered.
2
+ #
3
+ # The exception carries additional attributes which can be used for
4
+ # error tracking and for creating a correct Retry-After HTTP header for
5
+ # a 429 response
6
+ class Prorate::Throttled < StandardError
7
+ # @attr [String] the name of the throttle (like "shpongs-per-ip").
8
+ # Can be used to detect which throttle has fired when multiple
9
+ # throttles are used within the same block.
10
+ attr_reader :throttle_name
11
+
12
+ # @attr [Integer] for how long the caller will be blocked, in seconds.
13
+ attr_reader :retry_in_seconds
14
+
15
+ def initialize(throttle_name, try_again_in)
16
+ @throttle_name = throttle_name
17
+ @retry_in_seconds = try_again_in
18
+ super("Throttled, please lower your temper and try again in #{retry_in_seconds} seconds")
8
19
  end
9
20
  end
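The class comment above says the exception's attributes exist to build a correct `Retry-After` header for a 429 response. A sketch of that mapping — `ThrottledStandIn` is a minimal local mirror of the attributes shown in the diff, and the rack-style triple is illustrative, so the snippet runs without the gem:

```ruby
# Minimal stand-in mirroring Prorate::Throttled's public attributes.
class ThrottledStandIn < StandardError
  attr_reader :throttle_name, :retry_in_seconds

  def initialize(throttle_name, try_again_in)
    @throttle_name = throttle_name
    @retry_in_seconds = try_again_in
    super("Throttled, try again in #{try_again_in} seconds")
  end
end

# Turn the exception into a 429 with a Retry-After header -- the use
# case the class comment describes.
def rate_limited_response(error)
  [429, { "Retry-After" => error.retry_in_seconds.to_s }, ["Too Many Requests"]]
end

begin
  raise ThrottledStandIn.new("logins-per-ip", 30)
rescue ThrottledStandIn => e
  status, headers, _body = rate_limited_response(e)
  puts "#{status} Retry-After: #{headers['Retry-After']} (#{e.throttle_name})"
end
```

`throttle_name` is what lets a single rescue block distinguish which of several throttles fired, as the `@attr` comment notes.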
@@ -1,3 +1,3 @@
1
1
  module Prorate
2
- VERSION = "0.1.0"
2
+ VERSION = "0.7.0"
3
3
  end
@@ -1,4 +1,4 @@
1
- # coding: utf-8
1
+
2
2
  lib = File.expand_path('../lib', __FILE__)
3
3
  $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
4
  require 'prorate/version'
@@ -27,10 +27,12 @@ Gem::Specification.new do |spec|
27
27
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
28
28
  spec.require_paths = ["lib"]
29
29
 
30
- spec.add_dependency "ks"
31
30
  spec.add_dependency "redis", ">= 2"
32
- spec.add_development_dependency "connection_pool", "~> 1"
33
- spec.add_development_dependency "bundler", "~> 1.12"
34
- spec.add_development_dependency "rake", "~> 10.0"
31
+ spec.add_development_dependency "connection_pool", "~> 2"
32
+ spec.add_development_dependency "bundler"
33
+ spec.add_development_dependency "rake", "~> 13.0"
35
34
  spec.add_development_dependency "rspec", "~> 3.0"
35
+ spec.add_development_dependency 'wetransfer_style', '0.6.5'
36
+ spec.add_development_dependency 'yard', '~> 0.9'
37
+ spec.add_development_dependency 'pry', '~> 0.13.1'
36
38
  end
@@ -0,0 +1,43 @@
1
+ # Runs a mild benchmark and prints out the average time a call to 'throttle!' takes.
2
+
3
+ require 'prorate'
4
+ require 'benchmark'
5
+ require 'redis'
6
+ require 'securerandom'
7
+
8
+ def average_ms(ary)
9
+ ary.map { |x| x * 1000 }.inject(0, &:+) / ary.length
10
+ end
11
+
12
+ r = Redis.new
13
+
14
+ logz = Logger.new(STDERR)
15
+ logz.level = Logger::FATAL # block out most stuff
16
+
17
+ times = []
18
+ 50.times do
19
+ times << Benchmark.realtime {
20
+ t = Prorate::Throttle.new(redis: r, logger: logz, name: "throttle-login-email", limit: 60, period: 30, block_for: 5)
21
+ # Add all the parameters that function as a discriminator
22
+ t << '127.0.2.1'
23
+ t << 'no_person@nowhere.com'
24
+ t.throttle!
25
+ }
26
+ end
27
+
28
+ puts average_ms times
29
+
30
+ times = []
31
+ 50.times do
32
+ email = SecureRandom.hex(20)
33
+ ip = SecureRandom.hex(10)
34
+ times << Benchmark.realtime {
35
+ t = Prorate::Throttle.new(redis: r, logger: logz, name: "throttle-login-email", limit: 30, period: 30, block_for: 5)
36
+ # Add all the parameters that function as a discriminator
37
+ t << ip
38
+ t << email
39
+ t.throttle!
40
+ }
41
+ end
42
+
43
+ puts average_ms times
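The `average_ms` helper in this script converts `Benchmark.realtime` samples (seconds, as Floats) into a mean in milliseconds. A standalone check of that arithmetic:

```ruby
require 'benchmark'

# Same helper as in the benchmark script: mean of the samples, in ms.
def average_ms(ary)
  ary.map { |x| x * 1000 }.inject(0, &:+) / ary.length
end

samples = [0.001, 0.002, 0.003] # seconds, as Benchmark.realtime returns them
puts average_ms(samples)        # mean of 1ms, 2ms, 3ms

# The same helper applied to real measurements:
timings = 5.times.map { Benchmark.realtime { 1000.times { |i| i * i } } }
puts format("avg: %.3f ms", average_ms(timings))
```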
@@ -0,0 +1,59 @@
1
+ # Runs a mild benchmark and prints out the average time a call to 'throttle!' takes.
2
+
3
+ require 'prorate'
4
+ require 'benchmark'
5
+ require 'redis'
6
+ require 'securerandom'
7
+
8
+ def average_ms(ary)
9
+ ary.map { |x| x * 1000 }.inject(0, &:+) / ary.length
10
+ end
11
+
12
+ r = Redis.new
13
+
14
+ # 4000000.times do
15
+ # random1 = SecureRandom.hex(10)
16
+ # random2 = SecureRandom.hex(10)
17
+ # r.set(random1,random2)
18
+ # end
19
+
20
+ logz = Logger.new(STDERR)
21
+ logz.level = Logger::FATAL # block out most stuff
22
+
23
+ times = []
24
+ 15.times do
25
+ id = SecureRandom.hex(10)
26
+ times << Benchmark.realtime {
27
+ r.evalsha('c95c5f1197cef04ec4afd7d64760f9175933e55a', [], [id, 120, 50, 10]) # values beyond 120 chosen more or less at random
28
+ }
29
+ end
30
+
31
+ puts average_ms times
32
+ def key_for_ts(ts)
33
+ "th:%s:%d" % [@id, ts]
34
+ end
35
+
36
+ times = []
37
+ 15.times do
38
+ sec, _ = r.time # Use Redis time instead of the system timestamp, so that all the nodes are consistent
39
+ ts = sec.to_i # All Redis results are strings
40
+ k = key_for_ts(ts)
41
+ times << Benchmark.realtime {
42
+ r.multi do |txn|
43
+ # Increment the counter
44
+ txn.incr(k)
45
+ txn.expire(k, 120)
46
+
47
+ span_start = ts - 120
48
+ span_end = ts + 1
49
+ possible_keys = (span_start..span_end).map { |prev_time| key_for_ts(prev_time) }
50
+
51
+ # Fetch all the counter values within the time window. Despite the fact that this
52
+ # will return thousands of elements for large sliding window sizes, the values are
53
+ # small and an MGET in Redis is pretty cheap, so perf should stay well within limits.
54
+ txn.mget(*possible_keys)
55
+ end
56
+ }
57
+ end
58
+
59
+ puts average_ms times
@@ -0,0 +1,6 @@
1
+ # Reloads the script into redis and prints out the SHA it can be called with
2
+ require 'redis'
3
+ r = Redis.new
4
+ script = File.read('../lib/prorate/rate_limit.lua')
5
+ sha = r.script(:load, script)
6
+ puts sha
metadata CHANGED
@@ -1,99 +1,127 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: prorate
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.7.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Julik Tarkhanov
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2017-02-14 00:00:00.000000000 Z
11
+ date: 2020-07-17 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
- name: ks
14
+ name: redis
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
17
  - - ">="
18
18
  - !ruby/object:Gem::Version
19
- version: '0'
19
+ version: '2'
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
24
  - - ">="
25
25
  - !ruby/object:Gem::Version
26
- version: '0'
26
+ version: '2'
27
27
  - !ruby/object:Gem::Dependency
28
- name: redis
28
+ name: connection_pool
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
- - - ">="
31
+ - - "~>"
32
32
  - !ruby/object:Gem::Version
33
33
  version: '2'
34
- type: :runtime
34
+ type: :development
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
- - - ">="
38
+ - - "~>"
39
39
  - !ruby/object:Gem::Version
40
40
  version: '2'
41
41
  - !ruby/object:Gem::Dependency
42
- name: connection_pool
42
+ name: bundler
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rake
43
57
  requirement: !ruby/object:Gem::Requirement
44
58
  requirements:
45
59
  - - "~>"
46
60
  - !ruby/object:Gem::Version
47
- version: '1'
61
+ version: '13.0'
48
62
  type: :development
49
63
  prerelease: false
50
64
  version_requirements: !ruby/object:Gem::Requirement
51
65
  requirements:
52
66
  - - "~>"
53
67
  - !ruby/object:Gem::Version
54
- version: '1'
68
+ version: '13.0'
55
69
  - !ruby/object:Gem::Dependency
56
- name: bundler
70
+ name: rspec
57
71
  requirement: !ruby/object:Gem::Requirement
58
72
  requirements:
59
73
  - - "~>"
60
74
  - !ruby/object:Gem::Version
61
- version: '1.12'
75
+ version: '3.0'
62
76
  type: :development
63
77
  prerelease: false
64
78
  version_requirements: !ruby/object:Gem::Requirement
65
79
  requirements:
66
80
  - - "~>"
67
81
  - !ruby/object:Gem::Version
68
- version: '1.12'
82
+ version: '3.0'
69
83
  - !ruby/object:Gem::Dependency
70
- name: rake
84
+ name: wetransfer_style
85
+ requirement: !ruby/object:Gem::Requirement
86
+ requirements:
87
+ - - '='
88
+ - !ruby/object:Gem::Version
89
+ version: 0.6.5
90
+ type: :development
91
+ prerelease: false
92
+ version_requirements: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - '='
95
+ - !ruby/object:Gem::Version
96
+ version: 0.6.5
97
+ - !ruby/object:Gem::Dependency
98
+ name: yard
71
99
  requirement: !ruby/object:Gem::Requirement
72
100
  requirements:
73
101
  - - "~>"
74
102
  - !ruby/object:Gem::Version
75
- version: '10.0'
103
+ version: '0.9'
76
104
  type: :development
77
105
  prerelease: false
78
106
  version_requirements: !ruby/object:Gem::Requirement
79
107
  requirements:
80
108
  - - "~>"
81
109
  - !ruby/object:Gem::Version
82
- version: '10.0'
110
+ version: '0.9'
83
111
  - !ruby/object:Gem::Dependency
84
- name: rspec
112
+ name: pry
85
113
  requirement: !ruby/object:Gem::Requirement
86
114
  requirements:
87
115
  - - "~>"
88
116
  - !ruby/object:Gem::Version
89
- version: '3.0'
117
+ version: 0.13.1
90
118
  type: :development
91
119
  prerelease: false
92
120
  version_requirements: !ruby/object:Gem::Requirement
93
121
  requirements:
94
122
  - - "~>"
95
123
  - !ruby/object:Gem::Version
96
- version: '3.0'
124
+ version: 0.13.1
97
125
  description: Can be used to implement all kinds of throttles
98
126
  email:
99
127
  - me@julik.nl
@@ -103,7 +131,9 @@ extra_rdoc_files: []
103
131
  files:
104
132
  - ".gitignore"
105
133
  - ".rspec"
134
+ - ".rubocop.yml"
106
135
  - ".travis.yml"
136
+ - CHANGELOG.md
107
137
  - Gemfile
108
138
  - LICENSE.txt
109
139
  - README.md
@@ -111,14 +141,18 @@ files:
111
141
  - bin/console
112
142
  - bin/setup
113
143
  - lib/prorate.rb
114
- - lib/prorate/block_for.rb
115
- - lib/prorate/counter.rb
144
+ - lib/prorate/leaky_bucket.lua
145
+ - lib/prorate/leaky_bucket.rb
116
146
  - lib/prorate/null_logger.rb
117
147
  - lib/prorate/null_pool.rb
148
+ - lib/prorate/rate_limit.lua
118
149
  - lib/prorate/throttle.rb
119
150
  - lib/prorate/throttled.rb
120
151
  - lib/prorate/version.rb
121
152
  - prorate.gemspec
153
+ - scripts/bm.rb
154
+ - scripts/bm_latency_lb_vs_mget.rb
155
+ - scripts/reload_lua.rb
122
156
  homepage: https://github.com/WeTransfer/prorate
123
157
  licenses:
124
158
  - MIT
@@ -139,8 +173,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
139
173
  - !ruby/object:Gem::Version
140
174
  version: '0'
141
175
  requirements: []
142
- rubyforge_project:
143
- rubygems_version: 2.4.5.1
176
+ rubygems_version: 3.0.3
144
177
  signing_key:
145
178
  specification_version: 4
146
179
  summary: Time-restricted rate limiter using Redis
@@ -1,13 +0,0 @@
1
- module Prorate
2
- module BlockFor
3
- def self.block!(redis:, id:, duration:)
4
- k = "bl:%s" % id
5
- redis.setex(k, duration.to_i, 1)
6
- end
7
-
8
- def self.blocked?(redis:, id:)
9
- k = "bl:%s" % id
10
- !!redis.get(k)
11
- end
12
- end
13
- end
@@ -1,53 +0,0 @@
1
- module Prorate
2
- # The counter implements a rolling window throttling mechanism. At each call to incr(), the Redis time
3
- # is obtained. A counter then gets set at the key corresponding to the timestamp of the request, with a
4
- # granularity of a second. If requests are done continuously and in large volume, the counter will therefore
5
- # create one key for each second of the given rolling window size. The counters per second are set to auto-expire
6
- # after the window lapses. When incr() is performed, all the per-second counters within the window are summed.
7
- class Counter
8
- def initialize(redis:, logger: NullLogger, id:, window_size:)
9
- @redis = redis
10
- @logger = logger
11
- @id = id
12
- @in_span_of_seconds = window_size.to_i.abs
13
- end
14
-
15
- # Increments the throttle counter for this identifier, and returns the total number of requests
16
- # performed so far within the given time span. The caller can then determine whether the request has
17
- # to be throttled or can be let through.
18
- def incr
19
- sec, _ = @redis.time # Use Redis time instead of the system timestamp, so that all the nodes are consistent
20
- ts = sec.to_i # All Redis results are strings
21
- k = key_for_ts(ts)
22
- # Do the Redis stuff in a transaction, and capture only the necessary values
23
- # (the result of MULTI is all the return values of each call in sequence)
24
- *_, done_last_second, _, counter_values = @redis.multi do |txn|
25
- # Increment the counter
26
- txn.incr(k)
27
- txn.expire(k, @in_span_of_seconds)
28
-
29
- span_start = ts - @in_span_of_seconds
30
- span_end = ts + 1
31
- possible_keys = (span_start..span_end).map{|prev_time| key_for_ts(prev_time) }
32
- @logger.debug { "%s: Scanning %d possible keys" % [@id, possible_keys.length] }
33
-
34
- # Fetch all the counter values within the time window. Despite the fact that this
35
- # will return thousands of elements for large sliding window sizes, the values are
36
- # small and an MGET in Redis is pretty cheap, so perf should stay well within limits.
37
- txn.mget(*possible_keys)
38
- end
39
-
40
- # Sum all the values. The empty keys return nils from MGET, which become 0 on to_i casts.
41
- total_requests_during_period = counter_values.map(&:to_i).inject(&:+)
42
- @logger.debug { "%s: %d reqs total during the last %d seconds" % [@id, total_requests_during_period, @in_span_of_seconds] }
43
-
44
- total_requests_during_period
45
- end
46
-
47
- private
48
-
49
- def key_for_ts(ts)
50
- "th:%s:%d" % [@id, ts]
51
- end
52
- end
53
- end
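The removed `Counter`'s comment describes the sliding-window bookkeeping: one counter per second, expired once it falls out of the window, all live counters summed on each increment. An in-memory sketch of approximately the same arithmetic with no Redis involved — the explicit `now` parameter is an assumption added for testability, and expiry is simulated with `delete_if` instead of `EXPIRE`:

```ruby
# In-memory sliding-window counter mirroring the removed Counter's logic:
# bump a per-second bucket, then sum every bucket still inside the window.
class WindowCounter
  def initialize(window_size:)
    @window = window_size.to_i.abs
    @buckets = Hash.new(0) # second timestamp -> hit count
  end

  # Returns the total number of hits within the last @window seconds,
  # including this one -- the same contract as Counter#incr.
  def incr(now = Time.now.to_i)
    @buckets[now] += 1
    # Drop buckets that have left the window (Redis did this via EXPIRE
    # on each per-second key).
    @buckets.delete_if { |ts, _| ts <= now - @window }
    @buckets.values.sum
  end
end

c = WindowCounter.new(window_size: 10)
3.times { c.incr(100) }  # three hits at t=100
p c.incr(105)            # => 4 (all four hits are inside the 10s window)
p c.incr(111)            # => 2 (the t=100 bucket has expired)
```

This is the trade-off the 0.7.0 release moves away from: the per-second-key approach needs one key per second of window plus an MGET over all of them, whereas the Lua leaky bucket keeps a single value per identifier.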