bot_challenge_page 0.1.0 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +73 -12
- data/app/controllers/bot_challenge_page/bot_challenge_page_controller.rb +16 -95
- data/app/controllers/concerns/bot_challenge_page/enforce_filter.rb +59 -0
- data/app/controllers/concerns/bot_challenge_page/rack_attack_init.rb +60 -0
- data/app/models/bot_challenge_page/config.rb +9 -1
- data/app/views/bot_challenge_page/_local_turnstile_script_tag.html.erb +15 -9
- data/app/views/bot_challenge_page/_turnstile_widget_placeholder.html.erb +2 -1
- data/app/views/bot_challenge_page/bot_challenge_page/challenge.html.erb +4 -4
- data/lib/bot_challenge_page/version.rb +1 -1
- data/lib/generators/bot_challenge_page/install_generator.rb +29 -7
- data/lib/generators/bot_challenge_page/templates/initializer.rb.erb +11 -5
- metadata +4 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: a1913d93cd52d599d33f7217bb3714fbf67c82d08a888a3025cc30e51c51e438
|
4
|
+
data.tar.gz: 6e06e420e625a069132ae89a3ef733c1b2c246878e771e3b459ea0cdc351d14e
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 11504f2622783e4ca4bfb06b981df28260c1903e04a8900fe1797d34cd05bc29bcf3b2a1a6b9c93b85ae0f3a4639ff115699d4a548f6912b839bee62147e595c
|
7
|
+
data.tar.gz: 229ed02d6173b651ffd3cbef26e6444a974a14a1e9e3390d76567774a2385ad38cb4e1c289a78dda2c66f57d0e6a54e375506c9826cdbd7df7f799a7b96f03ff
|
data/README.md
CHANGED
@@ -1,10 +1,10 @@
|
|
1
1
|
# BotChallengePage
|
2
2
|
|
3
|
-
[](https://github.com/samvera-labs/bot_challenge_page/actions/workflows/ci.yml)
|
3
|
+
[](https://github.com/samvera-labs/bot_challenge_page/actions/workflows/ci.yml) [](http://badge.fury.io/rb/bot_challenge_page)
|
4
4
|
|
5
|
-
BotChallengePage lets you protect certain routes in your Rails app with [CloudFlare Turnstile](https://www.cloudflare.com/application-services/products/turnstile/) "CAPTCHA alternative" bot detector. Rather than the typical form submission use case for Turnstile, the user will be redirected to an interstitial challenge page, and redirected back on success.
|
5
|
+
BotChallengePage lets you protect certain routes in your Rails app with [CloudFlare Turnstile](https://www.cloudflare.com/application-services/products/turnstile/) "CAPTCHA alternative" bot detector. Rather than the typical form submission use case for Turnstile, the user will be redirected to an interstitial challenge page, and automatically redirected back immediately on success.
|
6
6
|
|
7
|
-
The motivating use case is fairly dumb (probably AI-related) crawlers, rather than targeted attacks, although we have tried to pay attention to security. Many of our use cases were crawlers getting caught
|
7
|
+
The motivating use case is fairly dumb (probably AI-related) crawlers, rather than targeted attacks, although we have tried to pay attention to security. Many of our use cases were crawlers getting caught following every combination of voluminous facet values in search results in a near "infinite space", and causing us resource usage issues.
|
8
8
|
|
9
9
|

|
10
10
|
|
@@ -18,26 +18,39 @@ The motivating use case is fairly dumb (probably AI-related) crawlers, rather th
|
|
18
18
|
|
19
19
|
## Installation and Configuration
|
20
20
|
|
21
|
-
* Get a CloudFlare account and Turnstile widget set up, which should give you a turnstile `sitekey` and `secret_key` you will need later in configuration.
|
21
|
+
* Get a [CloudFlare account and Turnstile widget set up](https://www.cloudflare.com/application-services/products/turnstile/), which should give you a turnstile `sitekey` and `secret_key` you will need later in configuration.
|
22
22
|
|
23
23
|
* `bundle add bot_challenge_page`, `bundle install`
|
24
24
|
|
25
25
|
* Run the installer
|
26
26
|
* if you want to use rack-attack for some permissive pre-challenge rate, `rails g bot_challenge_page:install`
|
27
|
-
|
27
|
+
|
28
|
+
* If you do not want to use rack-attack and want challenge on FIRST request, `rails g bot_challenge_page:install --no-rack-attack`
|
29
|
+
|
30
|
+
* By default challenge pages are "inline" at protected URL. To redirect to a separate challenge page URL instead, `--redirect-for-challenge`
|
31
|
+
|
32
|
+
* If you are **not using rack-attack**, you need to add a before_action to the controller(s)
|
33
|
+
you'd like to protect, eg:
|
34
|
+
|
35
|
+
before_action only: :index do |controller|
|
36
|
+
BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller, immediate: true)
|
37
|
+
end
|
38
|
+
|
28
39
|
|
29
40
|
* Configure in the generated `./config/initializers/bot_challenge_page.rb`
|
30
41
|
* At a minimum you need to configure your Cloudflare Turnstile keys, and some paths to protect!
|
31
42
|
* Note that we can only protect GET paths, and also think about making sure you DON'T protect
|
32
43
|
any path your front-end needs JS `fetch` access to, as this would block it (at least
|
33
44
|
without custom front-end code we haven't really explored)
|
45
|
+
|
34
46
|
* If you are tempted to just protect `/` that may work, but it is worth thinking about any heartbeat paths, front-end requestable paths, or other machine-access-desired paths.
|
47
|
+
|
35
48
|
* Some other configuration options are offered -- more advanced/specialized ones are available that are not mentioned in generated config file, see [Config class](./app/models/bot_challenge_page/config.rb)
|
36
49
|
|
37
50
|
|
38
51
|
## Customize challenge page display
|
39
52
|
|
40
|
-
Some of the default challenge page html uses bootstrap alert classes. You may want to provide custom CSS if you aren't using bootstrap. You can see the default challenge page html at [challenge.html.erb](./app/views/bot_challenge_page/bot_challenge_page/challenge.html). You may wish to CSS-style other parts too!
|
53
|
+
Some of the default challenge page html uses bootstrap alert classes. You may want to provide custom CSS if you aren't using bootstrap. You can see the default challenge page html at [challenge.html.erb](./app/views/bot_challenge_page/bot_challenge_page/challenge.html.erb). You may wish to CSS-style other parts too!
|
41
54
|
|
42
55
|
You can customize all text via I18n, see keys in [bot_challenge_page.en.yml](./config/locales/bot_challenge_page.en.yml)
|
43
56
|
|
@@ -48,10 +61,41 @@ To customize the layout or challenge page HTML more further, you can use configu
|
|
48
61
|
```ruby
|
49
62
|
BotChallengePage::BotChallengePageController.bot_challenge_config.challenge_renderer = ->() {
|
50
63
|
render "my_local_view_folder/whatever", layout: "another_layout"
|
51
|
-
render layout: "another_layout" # default html but change layout. etc.
|
52
64
|
}
|
53
65
|
```
|
54
66
|
|
67
|
+
## Logging
|
68
|
+
|
69
|
+
By default we log when a challenge result is submitted to the back-end; you can find challenge passes or failures by searching your logs for `BotChallengePage`.
|
70
|
+
|
71
|
+
We do not log when a challenge is issued -- experience shows challenge issues far outnumber challenge results, and can fill up the logs too fast.
|
72
|
+
|
73
|
+
If you'd like to log or observe challenge issues, you can configure a proc that is executed
|
74
|
+
in the context of the controller, and is called when a page is blocked by a challenge.
|
75
|
+
|
76
|
+
```ruby
|
77
|
+
BotChallengePage::BotChallengePageController.bot_challenge_config.after_blocked = ->(_bot_challenge_class) {
|
78
|
+
logger.info("page blocked by challenge: #{request.url}")
|
79
|
+
}
|
80
|
+
```
|
81
|
+
|
82
|
+
Or, here's how I managed to get it in [lograge](https://github.com/roidrage/lograge), so a page blocked results in a `bot_chlng=true` param in a lograge line.
|
83
|
+
|
84
|
+
```ruby
|
85
|
+
BotChallengePage::BotChallengePageController.bot_challenge_config.after_blocked =
|
86
|
+
->(bot_detect_class) {
|
87
|
+
request.env["bot_detect.blocked_for_challenge"] = true
|
88
|
+
}
|
89
|
+
|
90
|
+
|
91
|
+
# production.rb
|
92
|
+
config.lograge.custom_payload do |controller|
|
93
|
+
{
|
94
|
+
bot_chlng: controller.request.env["bot_detect.blocked_for_challenge"]
|
95
|
+
}.compact
|
96
|
+
end
|
97
|
+
```
|
98
|
+
|
55
99
|
## Example possible Blacklight config
|
56
100
|
|
57
101
|
Many of us in my professional community use [blacklight](https://github.com/projectblacklight/blacklight). Here's a possible sample blacklight config to:
|
@@ -77,13 +121,12 @@ Rails.application.config.to_prepare do
|
|
77
121
|
BotChallengePage::BotChallengePageController.bot_challenge_config.rate_limit_count = 3
|
78
122
|
|
79
123
|
BotChallengePage::BotChallengePageController.allow_exempt = ->(controller) {
|
80
|
-
# Exempt any Catalog #facet action that looks like an ajax/fetch request, the
|
81
|
-
# ain't gonna work there, we just exempt it.
|
124
|
+
# Exempt any Catalog #facet or #range_limit action that looks like an ajax/fetch request; the challenge isn't going to work there, so we just exempt it.
|
82
125
|
#
|
83
126
|
# sec-fetch-dest is set to 'empty' by browser on fetch requests, to limit us further;
|
84
127
|
# sure an attacker could fake it, we don't mind if someone determined can avoid
|
85
|
-
#
|
86
|
-
( controller.params[:action]
|
128
|
+
# bot challenge on this one action
|
129
|
+
( controller.params[:action].in?(["facet", "range_limit"]) &&
|
87
130
|
controller.request.headers["sec-fetch-dest"] == "empty" &&
|
88
131
|
controller.kind_of?(CatalogController)
|
89
132
|
)
|
@@ -94,6 +137,20 @@ end
|
|
94
137
|
|
95
138
|
```
|
96
139
|
|
140
|
+
## Development and automated testing
|
141
|
+
|
142
|
+
All logic and config hangs off a controller, with the idea that you could sub-class the controller to override any functionality -- or even have multiple sub-classes in your app with different configuration or customized config. But this hasn't really been tested/fleshed out yet.
|
143
|
+
|
144
|
+
Run tests with `bundle exec rspec`.
|
145
|
+
|
146
|
+
We test with a checked-into-repo dummy app at `./spec/dummy`, and use [Appraisal](https://github.com/thoughtbot/appraisal) to test under different rails versions.
|
147
|
+
|
148
|
+
Locally one way to test with a specific rails version appraisal is `bundle exec appraisal rails-7.2 rspec`
|
149
|
+
|
150
|
+
If you make any changes to `Gemfile` you may need to run `bundle exec appraisal install` and commit changes.
|
151
|
+
|
152
|
+
**One reason tests are slow** is I think we're running system tests with real turnstile proof-of-work bot detection JS code? (Or is it, when we are using a CF turnstile testing key that always passes?). There aren't many tests so it's no big deal, but this is something that could potentially be investigated/optimized more.
|
153
|
+
|
97
154
|
## Possible future features?
|
98
155
|
|
99
156
|
* allow regex in default location_matcher? Easy to do if you want it, just say so.
|
@@ -114,6 +171,10 @@ The gem is available as open source under the terms of the [MIT License](https:/
|
|
114
171
|
|
115
172
|
* [Similar feature built into PHP VuFind app](https://github.com/vufind-org/vufind/pull/4079)
|
116
173
|
|
117
|
-
*
|
174
|
+
* [My own blog post about this approach](https://bibwild.wordpress.com/2025/01/16/using-cloudflare-turnstile-to-protect-certain-pages-on-a-rails-app/).
|
175
|
+
|
176
|
+
* Wow, only after I developed all this did I notice [rails-cloudflare-turnstile](https://github.com/instrumentl/rails-cloudflare-turnstile) which implements some pieces that could have been re-used here, but I feel good because we wanted these weird features. But if you want a much simpler, more straightforward Turnstile implementation for more standard use cases or your own different use cases, I'd go there.
|
118
177
|
|
119
178
|
* And yet another implementation in Rails that perhaps makes more assumptions about use cases, [turnstile-captcha](https://github.com/pfeiffer/turnstile-captcha). Haven't looked at it much.
|
179
|
+
|
180
|
+
|
@@ -11,119 +11,37 @@ require 'http'
|
|
11
11
|
#
|
12
12
|
module BotChallengePage
|
13
13
|
class BotChallengePageController < ::ApplicationController
|
14
|
+
include BotChallengePage::RackAttackInit
|
15
|
+
include BotChallengePage::EnforceFilter
|
16
|
+
|
14
17
|
# Config for bot detection is held in class object here -- idea is
|
15
18
|
# to support different controllers with different config protecting
|
16
19
|
# different paths in your app if you like, is why config is with controller
|
17
20
|
class_attribute :bot_challenge_config, default: ::BotChallengePage::Config.new
|
18
21
|
|
19
|
-
delegate :cf_turnstile_js_url, :cf_turnstile_sitekey, :still_around_delay_ms, to: :bot_challenge_config
|
20
|
-
helper_method :cf_turnstile_js_url, :cf_turnstile_sitekey, :still_around_delay_ms
|
21
|
-
|
22
22
|
SESSION_DATETIME_KEY = "t"
|
23
23
|
SESSION_IP_KEY = "i"
|
24
24
|
|
25
25
|
# for allowing unsubscribe for testing
|
26
26
|
class_attribute :_track_notification_subscription, instance_accessor: false
|
27
27
|
|
28
|
-
# perhaps in an initializer, and after changing any config, run:
|
29
|
-
#
|
30
|
-
# Rails.application.config.to_prepare do
|
31
|
-
# BotChallengePage::BotChallengePageController.rack_attack_init
|
32
|
-
# end
|
33
|
-
#
|
34
|
-
# Safe to call more than once if you change config and want to call again, say in testing.
|
35
|
-
def self.rack_attack_init
|
36
|
-
self._rack_attack_uninit # make it safe for calling multiple times
|
37
|
-
|
38
|
-
## Turnstile bot detection throttling
|
39
|
-
#
|
40
|
-
# for paths matched by `rate_limited_locations`, after over rate_limit count requests in rate_limit_period,
|
41
|
-
# token will be stored in rack env instructing challenge is required.
|
42
|
-
#
|
43
|
-
# For actual challenge, need before_action in controller.
|
44
|
-
#
|
45
|
-
# You could rate limit detect on wider paths than you actually challenge on, or the same. You probably
|
46
|
-
# don't want to rate-limit detect on narrower list of paths than you challenge on!
|
47
|
-
Rack::Attack.track("bot_detect/rate_exceeded/#{self.name}",
|
48
|
-
limit: self.bot_challenge_config.rate_limit_count,
|
49
|
-
period: self.bot_challenge_config.rate_limit_period) do |req|
|
50
|
-
if self.bot_challenge_config.enabled && self.bot_challenge_config.location_matcher.call(req, self.bot_challenge_config)
|
51
|
-
self.bot_challenge_config.rate_limit_discriminator.call(req, self.bot_challenge_config)
|
52
|
-
end
|
53
|
-
end
|
54
|
-
|
55
|
-
self._track_notification_subscription = ActiveSupport::Notifications.subscribe("track.rack_attack") do |_name, _start, _finish, request_id, payload|
|
56
|
-
rack_request = payload[:request]
|
57
|
-
rack_env = rack_request.env
|
58
|
-
match_name = rack_env["rack.attack.matched"] # name of rack-attack rule
|
59
|
-
#
|
60
|
-
if match_name == "bot_detect/rate_exceeded/#{self.name}"
|
61
|
-
match_data = rack_env["rack.attack.match_data"]
|
62
|
-
match_data_formatted = match_data.slice(:count, :limit, :period).map { |k, v| "#{k}=#{v}"}.join(" ")
|
63
|
-
discriminator = rack_env["rack.attack.match_discriminator"] # unique key for rate limit, usually includes ip
|
64
|
-
|
65
|
-
rack_env[self.bot_challenge_config.env_challenge_trigger_key] = true
|
66
|
-
end
|
67
|
-
end
|
68
|
-
end
|
69
|
-
|
70
|
-
def self._rack_attack_uninit
|
71
|
-
Rack::Attack.track("bot_detect/rate_exceeded/#{self.name}") {} # overwrite track name with empty proc
|
72
|
-
ActiveSupport::Notifications.unsubscribe(self._track_notification_subscription) if self._track_notification_subscription
|
73
|
-
self._track_notification_subscription = nil
|
74
|
-
end
|
75
|
-
|
76
|
-
# Usually in your ApplicationController,
|
77
|
-
#
|
78
|
-
# before_action { |controller| BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller) }
|
79
|
-
#
|
80
|
-
# @param immediate [Boolean] always force bot protection, ignore any allowed pre-challenge rate limit
|
81
|
-
def self.bot_challenge_enforce_filter(controller, immediate: false)
|
82
|
-
if self.bot_challenge_config.enabled &&
|
83
|
-
(controller.request.env[self.bot_challenge_config.env_challenge_trigger_key] || immediate) &&
|
84
|
-
! self._bot_detect_passed_good?(controller.request) &&
|
85
|
-
! controller.kind_of?(self) && # don't ever guard ourself, that'd be a mess!
|
86
|
-
! self.bot_challenge_config.allow_exempt.call(controller, self.bot_challenge_config)
|
87
|
-
|
88
|
-
# we can only do GET requests right now
|
89
|
-
if !controller.request.get?
|
90
|
-
Rails.logger.warn("#{self}: Asked to protect request we could not, unprotected: #{controller.request.method} #{controller.request.url}, (#{controller.request.remote_ip}, #{controller.request.user_agent})")
|
91
|
-
return
|
92
|
-
end
|
93
|
-
|
94
|
-
Rails.logger.info("#{self.name}: Cloudflare Turnstile challenge redirect: (#{controller.request.remote_ip}, #{controller.request.user_agent}): from #{controller.request.url}")
|
95
|
-
# status code temporary
|
96
|
-
controller.redirect_to controller.bot_detect_challenge_path(dest: controller.request.original_fullpath), status: 307
|
97
|
-
end
|
98
|
-
end
|
99
|
-
|
100
|
-
# Does the session already contain a bot detect pass that is good for this request
|
101
|
-
# Tie to IP address to prevent session replay shared among IPs
|
102
|
-
def self._bot_detect_passed_good?(request)
|
103
|
-
session_data = request.session[self.bot_challenge_config.session_passed_key]
|
104
|
-
|
105
|
-
return false unless session_data && session_data.kind_of?(Hash)
|
106
|
-
|
107
|
-
datetime = session_data[SESSION_DATETIME_KEY]
|
108
|
-
ip = session_data[SESSION_IP_KEY]
|
109
|
-
|
110
|
-
(ip == request.remote_ip) && (Time.now - Time.iso8601(datetime) < self.bot_challenge_config.session_passed_good_for )
|
111
|
-
end
|
112
|
-
|
113
28
|
|
29
|
+
# only used if config.redirect_for_challenge is true
|
114
30
|
def challenge
|
115
|
-
# possible custom render to choose layouts or templates, but
|
116
|
-
#
|
117
|
-
|
118
|
-
|
119
|
-
|
31
|
+
# possible custom render to choose layouts or templates, but
|
32
|
+
# default is what would be default template for this action
|
33
|
+
#
|
34
|
+
# We put it in instancevar as a hacky way of passing to template that can be fulfilled
|
35
|
+
# both here and in arbitrary controllers for direct render.
|
36
|
+
@bot_challenge_config = bot_challenge_config
|
37
|
+
instance_exec &self.bot_challenge_config.challenge_renderer
|
120
38
|
end
|
121
39
|
|
122
40
|
def verify_challenge
|
123
41
|
body = {
|
124
42
|
secret: self.bot_challenge_config.cf_turnstile_secret_key,
|
125
43
|
response: params["cf_turnstile_response"],
|
126
|
-
remoteip: request.remote_ip
|
44
|
+
remoteip: request.remote_ip,
|
127
45
|
}
|
128
46
|
|
129
47
|
http = HTTP.timeout(self.bot_challenge_config.cf_timeout)
|
@@ -146,7 +64,10 @@ module BotChallengePage
|
|
146
64
|
Rails.logger.warn("#{self.class.name}: Cloudflare Turnstile validation failed (#{request.remote_ip}, #{request.user_agent}): #{result}: #{params["dest"]}")
|
147
65
|
end
|
148
66
|
|
149
|
-
#
|
67
|
+
# add config needed by JS to result
|
68
|
+
result["redirect_for_challenge"] = self.bot_challenge_config.redirect_for_challenge
|
69
|
+
|
70
|
+
# and let's just return the whole thing to client? Is there anything confidential there?
|
150
71
|
render json: result
|
151
72
|
rescue HTTP::Error, JSON::ParserError => e
|
152
73
|
# probably an http timeout? or something weird.
|
@@ -0,0 +1,59 @@
|
|
1
|
+
module BotChallengePage
|
2
|
+
|
3
|
+
# Extracted to concern in separate file mostly for readability, not expected to be used
|
4
|
+
# anywhere but BotChallengePageController -- we hang all logic off controller to allow multiple
|
5
|
+
# controllers in an app, and over-ride in sub-classes.
|
6
|
+
module EnforceFilter
|
7
|
+
extend ActiveSupport::Concern
|
8
|
+
|
9
|
+
class_methods do
|
10
|
+
# Usually in your ApplicationController, unless using `immediate`.
|
11
|
+
#
|
12
|
+
# before_action { |controller| BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller) }
|
13
|
+
#
|
14
|
+
# @param immediate [Boolean] always force bot protection, ignore any allowed pre-challenge rate limit
|
15
|
+
def bot_challenge_enforce_filter(controller, immediate: false)
|
16
|
+
if self.bot_challenge_config.enabled &&
|
17
|
+
(controller.request.env[self.bot_challenge_config.env_challenge_trigger_key] || immediate) &&
|
18
|
+
! self._bot_detect_passed_good?(controller.request) &&
|
19
|
+
! controller.kind_of?(self) && # don't ever guard ourself, that'd be a mess!
|
20
|
+
! self.bot_challenge_config.allow_exempt.call(controller, self.bot_challenge_config)
|
21
|
+
|
22
|
+
# we can only do GET requests right now
|
23
|
+
if !controller.request.get?
|
24
|
+
Rails.logger.warn("#{self}: Asked to protect request we could not, unprotected: #{controller.request.method} #{controller.request.url}, (#{controller.request.remote_ip}, #{controller.request.user_agent})")
|
25
|
+
return
|
26
|
+
end
|
27
|
+
|
28
|
+
# Prevent caching of bot challenge page
|
29
|
+
controller.response.headers["Cache-Control"] = "no-store"
|
30
|
+
|
31
|
+
if self.bot_challenge_config.redirect_for_challenge
|
32
|
+
# status code temporary
|
33
|
+
controller.redirect_to controller.bot_detect_challenge_path(dest: controller.request.original_fullpath), status: 307
|
34
|
+
else
|
35
|
+
# hacky way to get config to view template in an arbitrary controller, good enough for now
|
36
|
+
controller.instance_variable_set("@bot_challenge_config", self.bot_challenge_config) unless controller.instance_variable_get("@bot_challenge_config")
|
37
|
+
controller.instance_exec &self.bot_challenge_config.challenge_renderer
|
38
|
+
end
|
39
|
+
|
40
|
+
# allow app to see and log if desired
|
41
|
+
controller.instance_exec(self, &self.bot_challenge_config.after_blocked)
|
42
|
+
end
|
43
|
+
end
|
44
|
+
|
45
|
+
# Does the session already contain a bot detect pass that is good for this request
|
46
|
+
# Tie to IP address to prevent session replay shared among IPs
|
47
|
+
def _bot_detect_passed_good?(request)
|
48
|
+
session_data = request.session[self.bot_challenge_config.session_passed_key]
|
49
|
+
|
50
|
+
return false unless session_data && session_data.kind_of?(Hash)
|
51
|
+
|
52
|
+
datetime = session_data[BotChallengePageController::SESSION_DATETIME_KEY]
|
53
|
+
ip = session_data[BotChallengePageController::SESSION_IP_KEY]
|
54
|
+
|
55
|
+
(ip == request.remote_ip) && (Time.now - Time.iso8601(datetime) < self.bot_challenge_config.session_passed_good_for )
|
56
|
+
end
|
57
|
+
end
|
58
|
+
end
|
59
|
+
end
|
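The session check in `_bot_detect_passed_good?` above can be sketched outside Rails as a plain function. This is only a minimal illustration, not the gem's code: the method name `challenge_pass_still_good?` and its `now:` keyword are hypothetical, and the `"t"` / `"i"` keys are assumed to mirror `SESSION_DATETIME_KEY` and `SESSION_IP_KEY` from the controller.

```ruby
require "time"

# Hypothetical stand-alone version of the pass check. A stored pass is
# accepted only if it came from the same IP and is younger than
# good_for_seconds.
def challenge_pass_still_good?(session_data, remote_ip, good_for_seconds, now: Time.now)
  return false unless session_data.is_a?(Hash)

  datetime = session_data["t"] # assumed SESSION_DATETIME_KEY
  ip = session_data["i"]       # assumed SESSION_IP_KEY
  return false unless datetime && ip

  ip == remote_ip && (now - Time.iso8601(datetime)) < good_for_seconds
end

pass = { "t" => Time.now.utc.iso8601, "i" => "203.0.113.7" }
puts challenge_pass_still_good?(pass, "203.0.113.7", 3600)
puts challenge_pass_still_good?(pass, "198.51.100.9", 3600)
```

Tying the pass to the IP address means a copied session cookie presented from another address would be challenged again.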
@@ -0,0 +1,60 @@
|
|
1
|
+
module BotChallengePage
|
2
|
+
|
3
|
+
# Extracted to concern in separate file mostly for readability, not expected to be used
|
4
|
+
# anywhere but BotChallengePageController -- we hang all logic off controller to allow multiple
|
5
|
+
# controllers in an app, and over-ride in sub-classes.
|
6
|
+
module RackAttackInit
|
7
|
+
extend ActiveSupport::Concern
|
8
|
+
|
9
|
+
|
10
|
+
class_methods do
|
11
|
+
# perhaps in an initializer, and after changing any config, run:
|
12
|
+
#
|
13
|
+
# Rails.application.config.to_prepare do
|
14
|
+
# BotChallengePage::BotChallengePageController.rack_attack_init
|
15
|
+
# end
|
16
|
+
#
|
17
|
+
# Safe to call more than once if you change config and want to call again, say in testing.
|
18
|
+
def rack_attack_init
|
19
|
+
self._rack_attack_uninit # make it safe for calling multiple times
|
20
|
+
|
21
|
+
## Turnstile bot detection throttling
|
22
|
+
#
|
23
|
+
# for paths matched by `rate_limited_locations`, after over rate_limit count requests in rate_limit_period,
|
24
|
+
# token will be stored in rack env instructing challenge is required.
|
25
|
+
#
|
26
|
+
# For actual challenge, need before_action in controller.
|
27
|
+
#
|
28
|
+
# You could rate limit detect on wider paths than you actually challenge on, or the same. You probably
|
29
|
+
# don't want to rate-limit detect on narrower list of paths than you challenge on!
|
30
|
+
Rack::Attack.track("bot_detect/rate_exceeded/#{self.name}",
|
31
|
+
limit: self.bot_challenge_config.rate_limit_count,
|
32
|
+
period: self.bot_challenge_config.rate_limit_period) do |req|
|
33
|
+
if self.bot_challenge_config.enabled && self.bot_challenge_config.location_matcher.call(req, self.bot_challenge_config)
|
34
|
+
self.bot_challenge_config.rate_limit_discriminator.call(req, self.bot_challenge_config)
|
35
|
+
end
|
36
|
+
end
|
37
|
+
|
38
|
+
self._track_notification_subscription = ActiveSupport::Notifications.subscribe("track.rack_attack") do |_name, _start, _finish, request_id, payload|
|
39
|
+
rack_request = payload[:request]
|
40
|
+
rack_env = rack_request.env
|
41
|
+
match_name = rack_env["rack.attack.matched"] # name of rack-attack rule
|
42
|
+
#
|
43
|
+
if match_name == "bot_detect/rate_exceeded/#{self.name}"
|
44
|
+
match_data = rack_env["rack.attack.match_data"]
|
45
|
+
match_data_formatted = match_data.slice(:count, :limit, :period).map { |k, v| "#{k}=#{v}"}.join(" ")
|
46
|
+
discriminator = rack_env["rack.attack.match_discriminator"] # unique key for rate limit, usually includes ip
|
47
|
+
|
48
|
+
rack_env[self.bot_challenge_config.env_challenge_trigger_key] = true
|
49
|
+
end
|
50
|
+
end
|
51
|
+
end
|
52
|
+
|
53
|
+
def _rack_attack_uninit
|
54
|
+
Rack::Attack.track("bot_detect/rate_exceeded/#{self.name}") {} # overwrite track name with empty proc
|
55
|
+
ActiveSupport::Notifications.unsubscribe(self._track_notification_subscription) if self._track_notification_subscription
|
56
|
+
self._track_notification_subscription = nil
|
57
|
+
end
|
58
|
+
end
|
59
|
+
end
|
60
|
+
end
|
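The comments in `rack_attack_init` describe a flag being set in the rack env once a client goes over `rate_limit_count` requests within `rate_limit_period`. A rough sketch of that idea follows, using a simplified fixed-window counter. It is an assumption for illustration only and not how rack-attack is implemented: the `WindowCounter` class and the `"bot_detect.should_challenge"` env key are hypothetical names.

```ruby
# Hypothetical fixed-window counter, for illustration only.
class WindowCounter
  def initialize(limit:, period:)
    @limit = limit
    @period = period
    @counts = Hash.new(0)
  end

  # Counts one request and returns true once the discriminator has made
  # more than `limit` requests within the current window.
  def exceeded?(discriminator, now: Time.now)
    window = now.to_i / @period
    @counts[[discriminator, window]] += 1
    @counts[[discriminator, window]] > @limit
  end
end

counter = WindowCounter.new(limit: 3, period: 60)
env = {}
4.times do
  # Hypothetical env key; the gem reads its key from env_challenge_trigger_key.
  env["bot_detect.should_challenge"] = true if counter.exceeded?("203.0.113.0/24")
end
```

The counter only marks the request; whether a challenge is actually shown is still decided later by the `before_action` filter.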
@@ -27,6 +27,10 @@ module BotChallengePage
|
|
27
27
|
end
|
28
28
|
end
|
29
29
|
|
30
|
+
# Should we redirect to a challenge page (true) or just display it inline
|
31
|
+
# with a 403 status (false)
|
32
|
+
attribute :redirect_for_challenge, default: false
|
33
|
+
|
30
34
|
attribute :enabled, default: false # Must set to true to turn on at all
|
31
35
|
|
32
36
|
attribute :cf_turnstile_sitekey, default: "1x00000000000000000000AA" # a testing key that always passes
|
@@ -55,7 +59,11 @@ module BotChallengePage
|
|
55
59
|
attribute :allow_exempt, default: ->(controller, config) { false }
|
56
60
|
|
57
61
|
# replace with say `->() { render layout: 'something' }`, or `render "somedir/some_template"`
|
58
|
-
attribute :challenge_renderer, default:
|
62
|
+
attribute :challenge_renderer, default: ->() {
|
63
|
+
render "bot_challenge_page/bot_challenge_page/challenge", status: 403
|
64
|
+
}
|
65
|
+
|
66
|
+
attribute :after_blocked, default: ->(bot_detect_class) {}
|
59
67
|
|
60
68
|
|
61
69
|
# rate limit per subnet, following lehigh's lead, although we use a smaller
|
@@ -1,6 +1,7 @@
|
|
1
|
+
<%# locals: (bot_challenge_config:) -%>
|
2
|
+
|
1
3
|
<%# we deliver our simple javascript as inline script to make deployment more
|
2
4
|
reliable without having to deal with different asset pipelines, and it's really a fine choice anyway %>
|
3
|
-
|
4
5
|
<script type="text/javascript">
|
5
6
|
async function turnstileCallback(token) {
|
6
7
|
try {
|
@@ -31,12 +32,6 @@
|
|
31
32
|
|
32
33
|
result = await response.json();
|
33
34
|
if (result["success"] == true) {
|
34
|
-
const dest = new URLSearchParams(window.location.search).get("dest");
|
35
|
-
// For security make sure it only has path and on
|
36
|
-
if (!dest.startsWith("/") || dest.startsWith("//")) {
|
37
|
-
throw new Error("illegal non-local redirect: " + dest);
|
38
|
-
}
|
39
|
-
|
40
35
|
// in case this page stays around (say it was a redirect to a media asset), let's add a failsafe message after
|
41
36
|
// a couple seconds.
|
42
37
|
const delay = document.querySelector("#botChallengePageStillAroundTemplate")?.getAttribute("data-still-around-delay-ms") || 1200;
|
@@ -44,8 +39,19 @@
|
|
44
39
|
_displayStillAroundNote()
|
45
40
|
}, delay);
|
46
41
|
|
47
|
-
|
48
|
-
|
42
|
+
if (result["redirect_for_challenge"] == true) {
|
43
|
+
const dest = new URLSearchParams(window.location.search).get("dest");
|
44
|
+
// For security make sure it only has path and on
|
45
|
+
if (!dest.startsWith("/") || dest.startsWith("//")) {
|
46
|
+
throw new Error("illegal non-local redirect: " + dest);
|
47
|
+
}
|
48
|
+
|
49
|
+
// replace the challenge page in history
|
50
|
+
window.location.replace(dest);
|
51
|
+
} else {
|
52
|
+
// just need to reload and now we'll get through
|
53
|
+
window.location.reload();
|
54
|
+
}
|
49
55
|
} else {
|
50
56
|
console.error("Turnstile response reported as failure: " + JSON.stringify(result))
|
51
57
|
_displayChallengeError();
|
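The `dest` check in the script above accepts only local paths before redirecting. The same rule is shown here as a small Ruby sketch, in case a server-side equivalent is wanted. The helper name `local_redirect_path?` is hypothetical and not part of the gem.

```ruby
# Hypothetical helper mirroring the JavaScript check: accept only paths
# that start with a single "/", rejecting absolute URLs and
# protocol-relative URLs such as "//example.org/x".
def local_redirect_path?(dest)
  return false unless dest.is_a?(String)

  dest.start_with?("/") && !dest.start_with?("//")
end

puts local_redirect_path?("/catalog?q=maps")
puts local_redirect_path?("//example.org/catalog")
```

Rejecting the `//` prefix matters because browsers treat it as a URL on another host with the current scheme.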
@@ -1,7 +1,7 @@
|
|
1
1
|
<div class="bot_challenge_page">
|
2
2
|
<h1 class="mb-4"><%= t('bot_challenge_page.title') %></h1>
|
3
3
|
|
4
|
-
<%= render "bot_challenge_page/turnstile_widget_placeholder" %>
|
4
|
+
<%= render "bot_challenge_page/turnstile_widget_placeholder", bot_challenge_config: @bot_challenge_config %>
|
5
5
|
|
6
6
|
<noscript>
|
7
7
|
<div class="alert alert-danger"><%= t('bot_challenge_page.noscript') %></div>
|
@@ -16,14 +16,14 @@
|
|
16
16
|
</div>
|
17
17
|
</template>
|
18
18
|
|
19
|
-
<template id="botChallengePageStillAroundTemplate" data-still_around_delay_ms="<%= still_around_delay_ms %>">
|
19
|
+
<template id="botChallengePageStillAroundTemplate" data-still_around_delay_ms="<%= @bot_challenge_config.still_around_delay_ms %>">
|
20
20
|
<div class="alert alert-info" role="alert">
|
21
21
|
<i class="fa fa-info-circle" aria-hidden="true"></i>
|
22
22
|
<%= t('bot_challenge_page.still_around') %>
|
23
23
|
</div>
|
24
24
|
</template>
|
25
25
|
|
26
|
-
<script src="<%= cf_turnstile_js_url %>" async defer></script>
|
26
|
+
<script src="<%= @bot_challenge_config.cf_turnstile_js_url %>" async defer></script>
|
27
27
|
|
28
|
-
<%= render "bot_challenge_page/local_turnstile_script_tag" %>
|
28
|
+
<%= render "bot_challenge_page/local_turnstile_script_tag", bot_challenge_config: @bot_challenge_config %>
|
29
29
|
</div>
|
@@ -3,19 +3,23 @@ module BotChallengePage
|
|
3
3
|
source_root File.expand_path("templates", __dir__)
|
4
4
|
|
5
5
|
class_option :'rack_attack', type: :boolean, default: true, desc: "Support rate-limit allowance configuration"
|
6
|
+
class_option :redirect_for_challenge, type: :boolean, default: false, desc: "Redirect to separate challenge page instead of inline challenge"
|
6
7
|
|
7
8
|
def generate_routes
|
8
|
-
route '
|
9
|
-
|
9
|
+
route 'post "/challenge", to: "bot_challenge_page/bot_challenge_page#verify_challenge", as: :bot_detect_challenge'
|
10
|
+
|
11
|
+
if options[:redirect_for_challenge]
|
12
|
+
route 'get "/challenge", to: "bot_challenge_page/bot_challenge_page#challenge"'
|
13
|
+
end
|
10
14
|
end
|
11
15
|
|
12
16
|
def add_before_filter_enforcement
|
17
|
+
# make the user do this themselves if they aren't using rack-attack, as it should
|
18
|
+
# only be on protected filters
|
19
|
+
return unless options[:rack_attack]
|
20
|
+
|
13
21
|
inject_into_class "app/controllers/application_controller.rb", "ApplicationController" do
|
14
|
-
filter_code =
|
15
|
-
"BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller)"
|
16
|
-
else
|
17
|
-
"BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller, immediate: true)"
|
18
|
-
end
|
22
|
+
filter_code = "BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller)"
|
19
23
|
|
20
24
|
<<-EOS
|
21
25
|
# This will only protect CONFIGURED routes, but also could be put on just certain
|
@@ -41,5 +45,23 @@ module BotChallengePage
|
|
41
45
|
template "initializer.rb.erb", "config/initializers/bot_challenge_page.rb"
|
42
46
|
end
|
43
47
|
|
48
|
+
def suggest_filter
|
49
|
+
unless options[:rack_attack]
|
50
|
+
instructions = <<~EOS
|
51
|
+
You must add before_action to protect controllers
|
52
|
+
|
53
|
+
Add, eg:
|
54
|
+
|
55
|
+
before_action only: :index do |controller|
|
56
|
+
BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller, immediate: true)
|
57
|
+
end
|
58
|
+
|
59
|
+
To desired controllers and/or ApplicationController
|
60
|
+
EOS
|
61
|
+
|
62
|
+
say_status("advise", instructions, :green)
|
63
|
+
end
|
64
|
+
end
|
65
|
+
|
44
66
|
end
|
45
67
|
end
|
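The generator hunk above always adds the `post "/challenge"` route and adds the `get "/challenge"` route only when `redirect_for_challenge` is set. A minimal plain-Ruby sketch of that branching follows; `challenge_routes` is a hypothetical helper written for illustration, not part of the gem, and it only returns the route strings rather than writing to `config/routes.rb`.

```ruby
# Hypothetical helper (not in the gem): lists the route lines the generator
# would request for a given value of the redirect_for_challenge option.
def challenge_routes(redirect_for_challenge: false)
  routes = [
    'post "/challenge", to: "bot_challenge_page/bot_challenge_page#verify_challenge", as: :bot_detect_challenge'
  ]

  # The separate challenge page is only routed when the option is enabled.
  if redirect_for_challenge
    routes << 'get "/challenge", to: "bot_challenge_page/bot_challenge_page#challenge"'
  end

  routes
end

puts challenge_routes(redirect_for_challenge: true)
```

With the default (`redirect_for_challenge: false`) only the verification endpoint is listed, which matches the inline-challenge behavior described by the option's `desc` string.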
data/lib/generators/bot_challenge_page/templates/initializer.rb.erb
CHANGED
@@ -3,9 +3,16 @@ Rails.application.config.to_prepare do
|
|
3
3
|
BotChallengePage::BotChallengePageController.bot_challenge_config.enabled = true
|
4
4
|
|
5
5
|
# Get from CloudFlare Turnstile: https://www.cloudflare.com/application-services/products/turnstile/
|
6
|
+
# Some testing keys are also available: https://developers.cloudflare.com/turnstile/troubleshooting/testing/
|
7
|
+
#
|
8
|
+
# Always pass testing sitekey: "1x00000000000000000000AA"
|
6
9
|
BotChallengePage::BotChallengePageController.bot_challenge_config.cf_turnstile_sitekey = "MUST GET"
|
10
|
+
# Always pass testing secret_key: "1x0000000000000000000000000000000AA"
|
7
11
|
BotChallengePage::BotChallengePageController.bot_challenge_config.cf_turnstile_secret_key = "MUST GET"
|
8
12
|
|
13
|
+
BotChallengePage::BotChallengePageController.bot_challenge_config.redirect_for_challenge = <%= options[:redirect_for_challenge] %>
|
14
|
+
|
15
|
+
<%- if options[:rack_attack] %>
|
9
16
|
# What paths do you want to protect?
|
10
17
|
#
|
11
18
|
# You can use path prefixes: "/catalog" or even "/"
|
@@ -22,17 +29,16 @@ Rails.application.config.to_prepare do
|
|
22
29
|
BotChallengePage::BotChallengePageController.bot_challenge_config.rate_limited_locations = [
|
23
30
|
]
|
24
31
|
|
25
|
-
# How long will a challenge success exempt a session from further challenges?
|
26
|
-
# BotChallengePage::BotChallengePageController.bot_challenge_config.session_passed_good_for = 36.hours
|
27
|
-
|
28
|
-
<%- if options[:rack_attack] %>
|
29
32
|
# allow rate_limit_count requests in rate_limit_period, before issuing challenge
|
30
33
|
BotChallengePage::BotChallengePageController.bot_challenge_config.rate_limit_period = 12.hour
|
31
34
|
BotChallengePage::BotChallengePageController.bot_challenge_config.rate_limit_count = 2
|
32
35
|
<% end -%>
|
33
36
|
|
37
|
+
# How long will a challenge success exempt a session from further challenges?
|
38
|
+
# BotChallengePage::BotChallengePageController.bot_challenge_config.session_passed_good_for = 36.hours
|
39
|
+
|
34
40
|
# Exempt some requests from bot challenge protection
|
35
|
-
# BotChallengePage::BotChallengePageController.allow_exempt = ->(controller) {
|
41
|
+
# BotChallengePage::BotChallengePageController.bot_challenge_config.allow_exempt = ->(controller) {
|
36
42
|
# # controller.params
|
37
43
|
# # controller.request
|
38
44
|
# # controller.session
|
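The initializer template above moves `allow_exempt` onto `bot_challenge_config` and shows it as a lambda that receives the controller. The sketch below illustrates how such a lambda could be evaluated. It runs outside Rails, so `SketchConfig`, `SketchRequest`, `SketchController`, and `sketch_exempt?` are hypothetical stand-ins, and the `/health` path is an invented example; how the gem itself invokes the lambda is not shown in this diff.

```ruby
# Hypothetical stand-ins so the sketch runs without Rails or the gem.
SketchConfig     = Struct.new(:enabled, :redirect_for_challenge, :allow_exempt, keyword_init: true)
SketchRequest    = Struct.new(:path, keyword_init: true)
SketchController = Struct.new(:request, :params, :session, keyword_init: true)

SKETCH_CONFIG = SketchConfig.new(enabled: true, redirect_for_challenge: false)

# Same shape as the commented-out example in the template: a lambda taking
# the controller. Exempting "/health" is an assumption made for illustration.
SKETCH_CONFIG.allow_exempt = ->(controller) {
  controller.request.path.start_with?("/health")
}

# Assumed evaluation logic: no lambda configured means nothing is exempt.
def sketch_exempt?(config, controller)
  return false unless config.allow_exempt

  !!config.allow_exempt.call(controller)
end

health  = SketchController.new(request: SketchRequest.new(path: "/health/ping"), params: {}, session: {})
catalog = SketchController.new(request: SketchRequest.new(path: "/catalog"), params: {}, session: {})

puts sketch_exempt?(SKETCH_CONFIG, health)
puts sketch_exempt?(SKETCH_CONFIG, catalog)
```

In a real application the equivalent assignment would go through `BotChallengePage::BotChallengePageController.bot_challenge_config.allow_exempt`, as the updated template comment shows.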
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bot_challenge_page
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.3.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Jonathan Rochkind
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2025-
|
11
|
+
date: 2025-03-19 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: appraisal
|
@@ -153,6 +153,8 @@ files:
|
|
153
153
|
- README.md
|
154
154
|
- Rakefile
|
155
155
|
- app/controllers/bot_challenge_page/bot_challenge_page_controller.rb
|
156
|
+
- app/controllers/concerns/bot_challenge_page/enforce_filter.rb
|
157
|
+
- app/controllers/concerns/bot_challenge_page/rack_attack_init.rb
|
156
158
|
- app/models/bot_challenge_page/config.rb
|
157
159
|
- app/views/bot_challenge_page/_local_turnstile_script_tag.html.erb
|
158
160
|
- app/views/bot_challenge_page/_turnstile_widget_placeholder.html.erb
|