bot_challenge_page 0.3.1 → 0.10.0
- checksums.yaml +4 -4
- data/README.md +96 -40
- data/app/controllers/bot_challenge_page/bot_challenge_page_controller.rb +9 -12
- data/app/controllers/concerns/bot_challenge_page/controller.rb +95 -0
- data/app/controllers/concerns/bot_challenge_page/{enforce_filter.rb → guard_action.rb} +11 -12
- data/app/views/bot_challenge_page/pow1/_pow1_placeholder.html.erb +9 -0
- data/app/views/bot_challenge_page/pow1/_pow1_script_tag.html.erb +121 -0
- data/lib/bot_challenge_page/config.rb +103 -0
- data/lib/bot_challenge_page/version.rb +1 -1
- data/lib/bot_challenge_page.rb +12 -1
- data/lib/generators/bot_challenge_page/install_generator.rb +2 -45
- data/lib/generators/bot_challenge_page/templates/initializer.rb.erb +43 -35
- metadata +9 -22
- data/app/controllers/concerns/bot_challenge_page/rack_attack_init.rb +0 -60
- data/app/models/bot_challenge_page/config.rb +0 -124
- data/app/models/bot_challenge_page/simple_pow1.rb +0 -90
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: cb4f4b1a29eae8a320d95d2279e21ab44ef8168be416136790c96d2150de1ac3
|
4
|
+
data.tar.gz: fa8fe854808413b95c4e2c4a5c4ecf4ed7b3826f77b487d46aee895f9a2d6735
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: '082b26cdef3709185fb7616d009798c26d244485ccb03e24593512f1af6481450413351356f044e380e953f65e28a6ac386ca490ba778fbea78c71109f2dc98d'
|
7
|
+
data.tar.gz: 805a64dfc7de28b6f27fdf0c4b64b18cba29e98ae59376bce6565e133b993246b3783312f65ad69d9c65df8ea31a4193f0dbe2cd5cb4006ca81264caa60464c7
|
data/README.md
CHANGED
@@ -2,16 +2,15 @@
|
|
2
2
|
|
3
3
|
[](https://github.com/samvera-labs/bot_challenge_page/actions/workflows/ci.yml) [](http://badge.fury.io/rb/bot_challenge_page)
|
4
4
|
|
5
|
-
BotChallengePage lets you protect certain routes in your Rails app with [CloudFlare Turnstile](https://www.cloudflare.com/application-services/products/turnstile/) "CAPTHCA alternate" bot detector. Rather than the typical form submission use case for Turnstile, the user will be redirected to an interstitial challenge page, and automatically redirected back immediately on success.
|
5
|
+
BotChallengePage lets you protect certain **GET** routes in your Rails app with the [Cloudflare Turnstile](https://www.cloudflare.com/application-services/products/turnstile/) "CAPTCHA alternative" bot detector. Rather than the typical form submission use case for Turnstile, the user will be redirected to an interstitial challenge page, and automatically redirected back immediately on success.
|
6
6
|
|
7
|
-
The motivating use case is fairly dumb (probably AI-related) crawlers, rather than targetted attacks, although we have tried to pay attention to security. Many of our use cases were crawlers getting caught following every combination of voluminous facet values in search results in a near "infinite space", and causing us resource usage issues.
|
7
|
+
The motivating use case is fairly dumb (probably AI-related) crawlers crawling search results pages, rather than targeted attacks, although we have tried to pay attention to security. Many of our use cases were crawlers getting caught following every combination of voluminous facet values in search results in a near "infinite space", causing us resource usage issues.
|
8
8
|
|
9
9
|

|
10
10
|
|
11
|
-
*
|
12
|
-
* Uses rack-attack to track rate, requires `Rails.cache` or `Rack::Attack.cache.store` to be set to a persistent shared high-performance cache, probably redis or memcached.
|
11
|
+
* Supports either an immediate bot challenge or, optionally, a rate limit that triggers a bot challenge once exceeded.
|
13
12
|
|
14
|
-
* Once a challenge is passed, the pass is stored in a cookie, and a challenge won't be redisplayed for a configurable amount of time, so long as cookie is present
|
13
|
+
* Once a challenge is passed, the pass is stored in a cookie, and a challenge won't be redisplayed for a configurable amount of time, so long as cookie is present, and client matches a configurable user-agent/IP address fingerprint.
|
15
14
|
|
16
15
|
* **Note:** User-agent does always need both cookies and javascript enabled to be able to pass challenge and get through!
|
17
16
|
|
@@ -23,30 +22,64 @@ The motivating use case is fairly dumb (probably AI-related) crawlers, rather th
|
|
23
22
|
* `bundle add bot_challenge_page`, `bundle install`
|
24
23
|
|
25
24
|
* Run the installer
|
26
|
-
*
|
25
|
+
* `rails g bot_challenge_page:install`
|
26
|
+
* This will add a line to your ApplicationController to include a mixin to provide a `bot_challenge` configuration method in your controllers
|
27
|
+
* And a template configuration page at `./config/initializers/bot_challenge_page.rb`
|
27
28
|
|
28
|
-
* If you do not want to use rack-attack and want challenge on FIRST request, `rails g bot_challenge_page:install --no-rack-attack`
|
29
29
|
|
30
|
-
* By default challenge pages are "inline" at protected URL. To redirect to a separate challenge page URL instead, `--redirect-for-challenge`
|
31
30
|
|
32
|
-
*
|
33
|
-
you
|
31
|
+
* Configure in the generated `./config/initializers/bot_challenge_page.rb`
|
32
|
+
* At a minimum you need to configure your Cloudflare Turnstile keys
|
34
33
|
|
35
|
-
|
36
|
-
BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller, immediate: true)
|
37
|
-
end
|
34
|
+
* Some other configuration options are offered. More advanced/specialized options are not mentioned in the generated config file; see the [Config class](./lib/bot_challenge_page/config.rb)
|
38
35
|
|
36
|
+
## Protect some paths
|
37
|
+
|
38
|
+
You can add `bot_challenge` to a controller to protect all actions in that controller with a bot challenge.
|
39
|
+
|
40
|
+
You can also use all the Rails `before_action` params to apply to only some actions or requests in that controller: `only` and `except` to specify actions; and `if` and `unless` to specify procs to filter individual requests.
|
39
41
|
|
40
|
-
* Configure in the generated `./config/initializers/bot_challenge_page.rb`
|
41
|
-
* At a minimum you need to configure your Cloudflare Turnstile keys, and some paths to protect!
|
42
42
|
* Note that we can only protect GET paths, and also think about making sure you DON'T protect
|
43
43
|
any path your front-end needs JS `fetch` access to, as this would block it (at least
|
44
44
|
without custom front-end code we haven't really explored)
|
45
45
|
|
46
|
-
* If you are tempted to just protect `/` that may work, but
|
46
|
+
* If you are tempted to just protect `/` that may work, but you may need to exclude heartbeat paths, front-end (AJAX) requestable paths, API endpoints, uptime checker requests, or other machine-access-desired paths. These may be good candidates for an `unless` parameter, or the `skip_when` configuration.
|
47
47
|
|
48
|
-
|
48
|
+
* The author is a librarian who believes maintaining machine access in general is a public good, and tries to limit access with a bot challenge to the minimum paths necessary for app sustainability.
|
49
|
+
|
50
|
+
* The default configuration only allows re-use of a 'pass' cookie from requests with same IP address subnet and user-agent-related headers. This can be customized.
|
51
|
+
|
52
|
+
```ruby
|
53
|
+
class WidgetController < ApplicationController
|
54
|
+
bot_challenge only: :index, unless: -> { headers['x-secret-code'] == "i_am_uptime-checker" }
|
55
|
+
end
|
56
|
+
```
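How a per-controller `unless:` proc and the global `skip_when` setting combine can be sketched in plain Ruby. This is a minimal illustration modeled on the `generated_unless` lambda in the gem's controller concern; `FakeController` and `combined_unless` are hypothetical names used only for this sketch, and no Rails is involved.

```ruby
# Hypothetical stand-in for a Rails controller instance; it only exposes headers.
class FakeController
  attr_reader :headers

  def initialize(headers = {})
    @headers = headers
  end
end

# Build one predicate from an optional `unless:` proc and a skip_when proc.
# Both are evaluated inside the controller instance via instance_exec, and the
# challenge is skipped if either one returns a truthy value.
def combined_unless(unless_arg, skip_when, config)
  lambda do
    (unless_arg && instance_exec(&unless_arg)) ||
      instance_exec(config, &skip_when)
  end
end

unless_arg = -> { headers["x-secret-code"] == "i_am_uptime-checker" }
skip_when  = ->(_config) { false }
predicate  = combined_unless(unless_arg, skip_when, nil)

uptime  = FakeController.new("x-secret-code" => "i_am_uptime-checker")
visitor = FakeController.new

puts uptime.instance_exec(&predicate)  # truthy: the challenge would be skipped
puts visitor.instance_exec(&predicate) # falsy: the challenge would apply
```

Because the two conditions are OR-ed, a request is exempt if either the controller-level or the app-wide rule matches.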
|
57
|
+
|
58
|
+
### Protect some paths with a rate limit
|
59
|
+
|
60
|
+
If you want to display a bot challenge only after some rate is reached, you will need some [Rails cache store configured](https://guides.rubyonrails.org/caching_with_rails.html#configuration) to keep track of the rate. You can configure the app-wide `config.cache_store`, or set a store for bot_challenge_page specifically in its config.
|
61
|
+
|
62
|
+
* Redis or Memcached are typical, but the `memory_store` cache can work if you don't mind your rate limits being only approximate -- they will reset on every web server process restart, and if you have more than one web server process they will each have their own rate limit.
|
63
|
+
|
64
|
+
You use the `after` and `within` arguments to `bot_challenge` to include a rate limit; both must be given together. `only`, `except`, `if`, and `unless` are still supported.
|
65
|
+
|
66
|
+
```ruby
|
67
|
+
class WidgetController < ApplicationController
|
68
|
+
bot_challenge after: 2, within: 3.hours, only: :index, if: -> { request_has_facet_limits? }
|
69
|
+
end
|
70
|
+
```
|
71
|
+
|
72
|
+
#### rate limit counters
|
73
|
+
|
74
|
+
By default, all `bot_challenge` directives share a rate limit counter. So if two different controllers have a `bot_challenge`, requests to either one add to the same counter for rate limit checks.
|
75
|
+
|
76
|
+
Which also means if you have more than one `bot_challenge` that can apply to the _same request_, it might get double-counted (or more-counted). (Also too many rate-limited bot_challenges applying to the same request could have performance implications).
|
49
77
|
|
78
|
+
To avoid this problem or achieve desired behavior, you can pass a `counter` string into `bot_challenge` to declare separate counters and decide which `bot_challenge` should or should not share counters.
|
79
|
+
|
80
|
+
The `counter` argument overlaps in use with, but has a distinct effect from, the `by` argument (or the `config.default_limit_by` setting). `by` determines how clients are identified to share a counter bucket; by default clients are bucketed by IP subnet, not just individual IP.
|
81
|
+
|
82
|
+
If `bot_challenge` does not apply to a given request because of `only`, `except`, `if`, `unless` or `config.skip_when`, that request **does not count toward the rate limit either**.
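The interaction between `counter` and `by` can be pictured through the cache key each request increments. The sketch below mirrors how the gem's controller concern appears to compose that key (a `bot_challenge` context, an optional counter name, then the client bucket); `rate_limit_cache_key` is a hypothetical helper name, not part of the gem.

```ruby
# Compose a rate-limit key: the optional counter name is appended to the
# "bot_challenge" context, and the client bucket (by default an IP subnet)
# is the last segment. nil segments are dropped.
def rate_limit_cache_key(by:, counter: nil, name: nil)
  context = ["bot_challenge", counter].compact.join(".")
  ["rate-limit", context, name, by].compact.join(":")
end

puts rate_limit_cache_key(by: "10.1.0.0")
# => rate-limit:bot_challenge:10.1.0.0
puts rate_limit_cache_key(by: "10.1.0.0", counter: "search")
# => rate-limit:bot_challenge.search:10.1.0.0
```

Two directives with the same `counter` and the same `by` result land on the same key and so share a count; changing either one separates them.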
|
50
83
|
|
51
84
|
## Customize challenge page display
|
52
85
|
|
@@ -101,42 +134,66 @@ end
|
|
101
134
|
Many of us in my professional community use [blacklight](https://github.com/projectblacklight/blacklight). Here's a possible sample blacklight config to:
|
102
135
|
|
103
136
|
* Protect default catalog controller, including search results and any other actions
|
104
|
-
*
|
105
|
-
*
|
137
|
+
* ONLY if the search includes a query string or facet limit -- allow unfiltered search, including pagination, without bot challenge.
|
138
|
+
* Even for queried or limited results, give an IP subnet 1 free search in a 36 hour period before being challenged
|
139
|
+
* For the action used for "facet… more" links and `blacklight_range_limit` that need to be XHR/JS-fetchable -- exempt from protection if the request is being made by a browser JS `fetch`, we just let those through. (Which means a determined attacker could do that on purpose, not defense against on purpose DDoS)
|
140
|
+
* Lets an uptime checker in based on a secret code in headers
|
141
|
+
|
142
|
+
|
106
143
|
|
107
144
|
```ruby
|
145
|
+
# ./config/initializers/bot_challenge_page.rb
|
108
146
|
Rails.application.config.to_prepare do
|
109
147
|
BotChallengePage::BotChallengePageController.bot_challenge_config.enabled = true
|
110
148
|
|
149
|
+
# Need to set store to a Rails cache store other than null store, if you want to track
|
150
|
+
# rate limits.
|
151
|
+
BotChallengePage::BotChallengePageController.bot_challenge_config.store = :redis_store
|
152
|
+
|
111
153
|
# Get from CloudFlare Turnstile: https://www.cloudflare.com/application-services/products/turnstile/
|
112
154
|
BotChallengePage::BotChallengePageController.bot_challenge_config.cf_turnstile_sitekey = "MUST GET"
|
113
155
|
BotChallengePage::BotChallengePageController.bot_challenge_config.cf_turnstile_secret_key = "MUST GET"
|
114
156
|
|
115
|
-
BotChallengePage::BotChallengePageController.bot_challenge_config.
|
116
|
-
|
117
|
-
|
118
|
-
|
119
|
-
|
120
|
-
|
121
|
-
BotChallengePage::BotChallengePageController.bot_challenge_config.rate_limit_count = 3
|
122
|
-
|
123
|
-
BotChallengePage::BotChallengePageController.allow_exempt = ->(controller) {
|
124
|
-
# Excempt any Catalog #facet or #range_limit action that looks like an ajax/fetch request, the # challenge isn't going to work there, we just exempt it.
|
125
|
-
#
|
126
|
-
# sec-fetch-dest is set to 'empty' by browser on fetch requests, to limit us further;
|
127
|
-
# sure an attacker could fake it, we don't mind if someone determined can avoid
|
128
|
-
# bot challenge on this one action
|
129
|
-
( controller.params[:action].in?(["facet", "range_limit"]) &&
|
130
|
-
controller.request.headers["sec-fetch-dest"] == "empty" &&
|
131
|
-
controller.kind_of?(CatalogController)
|
157
|
+
BotChallengePage::BotChallengePageController.bot_challenge_config.skip_when = ->(config) {
|
158
|
+
# Exempt honeybadger token to allow HB uptime checker in
|
159
|
+
# https://docs.honeybadger.io/guides/security/
|
160
|
+
(
|
161
|
+
ENV['HONEYBADGER_TOKEN'].present? &&
|
162
|
+
request.headers['Honeybadger-Token'] == ENV['HONEYBADGER_TOKEN']
|
132
163
|
)
|
133
164
|
}
|
134
165
|
|
135
|
-
BotChallengePage::BotChallengePageController.rack_attack_init
|
136
166
|
end
|
137
|
-
|
138
167
|
```
|
139
168
|
|
169
|
+
```ruby
|
170
|
+
# ./app/controllers/catalog_controller.rb
|
171
|
+
class CatalogController < ApplicationController
|
172
|
+
# from default blacklight first...
|
173
|
+
include Blacklight::Catalog
|
174
|
+
include BlacklightRangeLimit::ControllerOverride
|
175
|
+
|
176
|
+
# This should apply to all CatalogController sub-classes too, which include CollectionShowController and
|
177
|
+
# FeaturedTopicController. They all share a counter though.
|
178
|
+
#
|
179
|
+
# We let bots through if they have NO query params, we want to let collection/focus splash
|
180
|
+
# pages be indexed -- this will actually let bot paginate through entire results with
|
181
|
+
# no query/facets, which we seem to be able to tolerate.
|
182
|
+
#
|
183
|
+
bot_challenge after: 1, within: 12.hours,
|
184
|
+
if: -> {
|
185
|
+
has_search_parameters?
|
186
|
+
},
|
187
|
+
except: ["facet", "range_limit"]
|
188
|
+
|
189
|
+
# facet and range_limit both get challenged immediately, unless they are JS fetch,
|
190
|
+
# in which case they are let in freely.
|
191
|
+
bot_challenge only: ["facet", "range_limit"], unless: -> {
|
192
|
+
request.headers["sec-fetch-dest"] == "empty"
|
193
|
+
}
|
194
|
+
|
195
|
+
end
|
196
|
+
```
|
140
197
|
## Development and automated testing
|
141
198
|
|
142
199
|
All logic and config hangs off a controller, with the idea that you could sub-class the controller to override any functionality -- or even have multiple sub-classes in your app with different configuration or customized config. But this hasn't really been tested/fleshed out yet.
|
@@ -153,12 +210,11 @@ If you make any changes to `Gemfile` you may need to run `bundle exec appraisal
|
|
153
210
|
|
154
211
|
## Possible future features?
|
155
212
|
|
156
|
-
* allow regex in default location_matcher? Easy to do if you want it, just say so.
|
157
|
-
|
158
213
|
* We could support swap-in Turnstile-alternatives, like [hCaptcha](https://www.hcaptcha.com/), [Google reCAPTCHA v3](https://developers.google.com/recaptcha/docs/v3), or even open source proof of work implementations like [ALTCHA](https://altcha.org/docs/get-started/), [pow-bot-deterrent](https://github.com/sequentialread/pow-bot-deterrent), or [Friendly Captcha](https://github.com/FriendlyCaptcha/friendly-captcha-sdk). But the (free) cost/benefit of Turnstile is pretty good, so I don't myself have a lot of motivation to add this complexity.
|
159
214
|
|
160
215
|
* Something to make it easier to switch the challenge on only based on signals that server/app is under some defined heavy load?
|
161
216
|
|
217
|
+
* Use the in-development [bot auth](https://developers.cloudflare.com/bots/concepts/bot/verified-bots/web-bot-auth/) standard, to support allow-listing of specified auth'd bots.
|
162
218
|
|
163
219
|
## License
|
164
220
|
The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
|
@@ -11,20 +11,17 @@ require 'http'
|
|
11
11
|
#
|
12
12
|
module BotChallengePage
|
13
13
|
class BotChallengePageController < ::ApplicationController
|
14
|
-
include BotChallengePage::
|
15
|
-
include BotChallengePage::EnforceFilter
|
14
|
+
include BotChallengePage::GuardAction
|
16
15
|
|
17
|
-
#
|
18
|
-
#
|
19
|
-
# different
|
20
|
-
|
16
|
+
# We access all config at the controller level, intending to support
|
17
|
+
# a design of different controllers in the same app with different config, protecting
|
18
|
+
# different parts of your app.
|
19
|
+
#
|
20
|
+
# But most people won't use that, just default to a global config for simplicity.
|
21
|
+
class_attribute :bot_challenge_config, default: ::BotChallengePage.config
|
21
22
|
|
22
23
|
SESSION_DATETIME_KEY = "t"
|
23
|
-
|
24
|
-
|
25
|
-
# for allowing unsubscribe for testing
|
26
|
-
class_attribute :_track_notification_subscription, instance_accessor: false
|
27
|
-
|
24
|
+
SESSION_FINGERPRINT_KEY = "f"
|
28
25
|
|
29
26
|
# only used if config.redirect_for_challenge is true
|
30
27
|
def challenge
|
@@ -58,7 +55,7 @@ module BotChallengePage
|
|
58
55
|
Rails.logger.info("#{self.class.name}: Cloudflare Turnstile validation passed api (#{request.remote_ip}, #{request.user_agent}): #{params["dest"]}")
|
59
56
|
session[self.bot_challenge_config.session_passed_key] = {
|
60
57
|
SESSION_DATETIME_KEY => Time.now.utc.iso8601,
|
61
|
-
|
58
|
+
SESSION_FINGERPRINT_KEY => self.bot_challenge_config.session_valid_fingerprint.call(request)
|
62
59
|
}
|
63
60
|
else
|
64
61
|
Rails.logger.warn("#{self.class.name}: Cloudflare Turnstile validation failed (#{request.remote_ip}, #{request.user_agent}): #{result}: #{params["dest"]}")
|
@@ -0,0 +1,95 @@
|
|
1
|
+
module BotChallengePage
|
2
|
+
# To be included in app controllers to make `bot_challenge` class method macro available.
|
3
|
+
module Controller
|
4
|
+
extend ActiveSupport::Concern
|
5
|
+
|
6
|
+
class_methods do
|
7
|
+
def bot_challenge(challenge_controller: BotChallengePage::BotChallengePageController,
|
8
|
+
after:nil,
|
9
|
+
within:nil,
|
10
|
+
by: ->{
|
11
|
+
instance_exec(challenge_controller.bot_challenge_config, &challenge_controller.bot_challenge_config.default_limit_by)
|
12
|
+
},
|
13
|
+
store: nil,
|
14
|
+
counter: nil,
|
15
|
+
**before_action_options)
|
16
|
+
|
17
|
+
|
18
|
+
|
19
|
+
unless_arg = before_action_options.delete(:unless)
|
20
|
+
generated_unless = -> {
|
21
|
+
(unless_arg && instance_exec(&unless_arg)) ||
|
22
|
+
instance_exec(challenge_controller.bot_challenge_config, &challenge_controller.bot_challenge_config.skip_when)
|
23
|
+
}
|
24
|
+
|
25
|
+
if after
|
26
|
+
unless within
|
27
|
+
raise ArgumentError.new("either both or neither of `after` and `within` must be specified")
|
28
|
+
end
|
29
|
+
|
30
|
+
self._bot_challenge_rate_limit(to: after, within: within, by: by, store: store,
|
31
|
+
context: ["bot_challenge", counter].compact.join('.'),
|
32
|
+
with: ->{
|
33
|
+
challenge_controller.bot_challenge_guard_action(self)
|
34
|
+
},
|
35
|
+
unless: generated_unless,
|
36
|
+
**before_action_options)
|
37
|
+
else
|
38
|
+
before_action(unless: generated_unless, **before_action_options) do
|
39
|
+
ActiveSupport::Notifications.instrument("before_action.bot_challenge_page", request: request) do
|
40
|
+
challenge_controller.bot_challenge_guard_action(self)
|
41
|
+
end
|
42
|
+
end
|
43
|
+
end
|
44
|
+
end
|
45
|
+
|
46
|
+
|
47
|
+
# A copy-paste-customize of Rails rate_limit at
|
48
|
+
# https://github.com/rails/rails/blob/9a64857d7002554b0af94158de386def5bfef9d3/actionpack/lib/action_controller/metal/rate_limiting.rb#L55
|
49
|
+
#
|
50
|
+
# For two purposes:
|
51
|
+
#
|
52
|
+
# 1. Apply 'context' argument from https://github.com/rails/rails/pull/55299 (not merged when I write this)
|
53
|
+
#
|
54
|
+
# 2. Make 'store' defaults calculated _at execution time_ rather than definition time, which is
|
55
|
+
# convenient for being able to mock config in application tests.
|
56
|
+
#
|
57
|
+
def _bot_challenge_rate_limit(to:, within:, by: -> { request.remote_ip }, with: -> { head :too_many_requests }, store: nil, name: nil, context: nil,
|
58
|
+
challenge_controller: BotChallengePage::BotChallengePageController, # to get config for store default
|
59
|
+
**options)
|
60
|
+
before_action -> {
|
61
|
+
_bot_challenge_rate_limiting(to: to,
|
62
|
+
within: within,
|
63
|
+
by: by,
|
64
|
+
with: with,
|
65
|
+
store: store || challenge_controller.bot_challenge_config.store || cache_store,
|
66
|
+
name: name,
|
67
|
+
context: context)
|
68
|
+
}, **options
|
69
|
+
end
|
70
|
+
end
|
71
|
+
|
72
|
+
private
|
73
|
+
|
74
|
+
# See above at _bot_challenge_rate_limit
|
75
|
+
#
|
76
|
+
def _bot_challenge_rate_limiting(to:, within:, by:, with:, store:, name:, context:)
|
77
|
+
by = instance_exec(&by)
|
78
|
+
cache_key = ["rate-limit", context || controller_path, name, by].compact.join(":")
|
79
|
+
count = store.increment(cache_key, 1, expires_in: within)
|
80
|
+
if count && count > to
|
81
|
+
ActiveSupport::Notifications.instrument("rate_limit.action_controller",
|
82
|
+
request: request,
|
83
|
+
count: count,
|
84
|
+
to: to,
|
85
|
+
within: within,
|
86
|
+
by: by,
|
87
|
+
name: name,
|
88
|
+
context: context,
|
89
|
+
cache_key: cache_key) do
|
90
|
+
instance_exec(&with)
|
91
|
+
end
|
92
|
+
end
|
93
|
+
end
|
94
|
+
end
|
95
|
+
end
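The counting logic in `_bot_challenge_rate_limiting` above amounts to a fixed-window counter: increment a key that expires after `within`, and run the challenge once the count exceeds `to`. A minimal sketch of that idea in plain Ruby follows; `TinyStore` is a hypothetical in-memory stand-in for an ActiveSupport cache store, and `over_limit?` is an illustrative helper, neither being part of the gem or of Rails.

```ruby
# Hypothetical in-memory store that only implements an expiring `increment`.
class TinyStore
  def initialize(clock: -> { Time.now })
    @clock = clock
    @entries = {} # key => [count, expires_at]
  end

  # Increment the counter for key, starting a fresh window if none is live.
  def increment(key, amount = 1, expires_in:)
    now = @clock.call
    count, expires_at = @entries[key]
    if count.nil? || now >= expires_at
      count = 0
      expires_at = now + expires_in
    end
    count += amount
    @entries[key] = [count, expires_at]
    count
  end
end

# True when this request exceeds the allowance and should see a challenge.
def over_limit?(store, key, to:, within:)
  count = store.increment(key, 1, expires_in: within)
  !count.nil? && count > to
end

store = TinyStore.new
key = "rate-limit:bot_challenge:10.1.0.0"
p(3.times.map { over_limit?(store, key, to: 2, within: 3 * 60 * 60) })
# => [false, false, true]
```

With `to: 2`, the first two requests in a window pass freely and the third triggers the challenge, matching the `count > to` comparison in the diff.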
|
@@ -3,21 +3,18 @@ module BotChallengePage
|
|
3
3
|
# Extracted to concern in separate file mostly for readability, not expected to be used
|
4
4
|
# anywhere but BotChallengePageController -- we hang all logic off controller to allow multiple
|
5
5
|
# controllers in an app, and over-ride in sub-classes.
|
6
|
-
module
|
6
|
+
module GuardAction
|
7
7
|
extend ActiveSupport::Concern
|
8
8
|
|
9
9
|
class_methods do
|
10
|
-
#
|
10
|
+
# All the logic for enforcing bot challenge protection, usually in a before_filter
|
11
|
+
# of some kind, direct or rate_limit.
|
11
12
|
#
|
12
|
-
#
|
13
|
-
|
14
|
-
# @param immediate [Boolean] always force bot protection, ignore any allowed pre-challenge rate limit
|
15
|
-
def bot_challenge_enforce_filter(controller, immediate: false)
|
13
|
+
# Render challenge page when necessary, otherwise do nothing allowing ordinary rails render.
|
14
|
+
def bot_challenge_guard_action(controller)
|
16
15
|
if self.bot_challenge_config.enabled &&
|
17
|
-
(controller.request.env[self.bot_challenge_config.env_challenge_trigger_key] || immediate) &&
|
18
16
|
! self._bot_detect_passed_good?(controller.request) &&
|
19
|
-
! controller.kind_of?(self)
|
20
|
-
! self.bot_challenge_config.allow_exempt.call(controller, self.bot_challenge_config)
|
17
|
+
! controller.kind_of?(self) # don't ever guard ourself, that'd be a mess!
|
21
18
|
|
22
19
|
# we can only do GET requests right now
|
23
20
|
if !controller.request.get?
|
@@ -49,10 +46,12 @@ module BotChallengePage
|
|
49
46
|
|
50
47
|
return false unless session_data && session_data.kind_of?(Hash)
|
51
48
|
|
52
|
-
datetime = session_data[
|
53
|
-
|
49
|
+
datetime = session_data[self::SESSION_DATETIME_KEY]
|
50
|
+
|
51
|
+
fingerprint = session_data[self::SESSION_FINGERPRINT_KEY]
|
54
52
|
|
55
|
-
(
|
53
|
+
(Time.now - Time.iso8601(datetime) < self.bot_challenge_config.session_passed_good_for ) &&
|
54
|
+
fingerprint == self.bot_challenge_config.session_valid_fingerprint.call(request)
|
56
55
|
end
|
57
56
|
end
|
58
57
|
end
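The pass check added in this hunk combines two conditions: the stored timestamp must be recent enough, and the stored fingerprint must equal the one computed for the current request. A plain-Ruby sketch under stated assumptions: the `"t"` and `"f"` keys come from the constants in the diff, while `pass_still_good?`, its keyword arguments, and the extra nil guard are illustrative additions, not the gem's method.

```ruby
require "time"

# A stored pass is honored only if it is recent enough AND the fingerprint
# recorded when the challenge was passed matches the current request's.
def pass_still_good?(session_data, current_fingerprint, good_for_seconds:, now: Time.now)
  return false unless session_data.is_a?(Hash)

  datetime = session_data["t"]    # ISO 8601 timestamp of the pass
  fingerprint = session_data["f"] # fingerprint captured at pass time
  # Guard added for this sketch: treat incomplete data as no pass.
  return false unless datetime && fingerprint

  (now - Time.iso8601(datetime) < good_for_seconds) &&
    fingerprint == current_fingerprint
end

pass = { "t" => Time.utc(2025, 1, 1, 12).iso8601, "f" => "agent:10.1.2.0" }
now = Time.utc(2025, 1, 2)
puts pass_still_good?(pass, "agent:10.1.2.0", good_for_seconds: 24 * 3600, now: now)
puts pass_still_good?(pass, "agent:10.9.9.0", good_for_seconds: 24 * 3600, now: now)
```

The first call prints `true` (12 hours old, same fingerprint) and the second `false` (fingerprint changed), so a cookie copied to a different client would not carry the pass with it.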
|
@@ -0,0 +1,121 @@
|
|
1
|
+
<script type="text/javascript" async defer>
|
2
|
+
|
3
|
+
|
4
|
+
async function execute() {
|
5
|
+
try {
|
6
|
+
const placeholder = document.querySelector("*[data-bot-challenge='pow1']");
|
7
|
+
|
8
|
+
const challenge = placeholder.dataset.pow1Challenge;
|
9
|
+
const difficulty = placeholder.dataset.pow1Difficulty;
|
10
|
+
|
11
|
+
const solution_base64 = await process(challenge, difficulty);
|
12
|
+
|
13
|
+
const csrfToken = document.querySelector("[name='csrf-token']");
|
14
|
+
const response = await fetch('<%= bot_detect_challenge_path %>', {
|
15
|
+
method: 'POST',
|
16
|
+
headers: {
|
17
|
+
"X-CSRF-Token": csrfToken?.content,
|
18
|
+
"Content-Type": "application/json"
|
19
|
+
},
|
20
|
+
body: JSON.stringify({
|
21
|
+
pow1_solution_base64: solution_base64,
|
22
|
+
difficulty: difficulty
|
23
|
+
}),
|
24
|
+
});
|
25
|
+
|
26
|
+
processFetchResponse(response);
|
27
|
+
}
|
28
|
+
catch(error) {
|
29
|
+
console.error("Error processing turnstile challenge backend action: " + error);
|
30
|
+
_displayChallengeError();
|
31
|
+
}
|
32
|
+
}
|
33
|
+
|
34
|
+
// Returns a solution as base64 encoded uint8Array
|
35
|
+
async function process(hexChallenge, difficulty = 18) {
|
36
|
+
const uint8Challenge = uint8Array_fromHex(hexChallenge);
|
37
|
+
|
38
|
+
let solution;
|
39
|
+
let bitPrefixArr;
|
40
|
+
let solutionNum = 0;
|
41
|
+
let digestUint8array;
|
42
|
+
do {
|
43
|
+
solutionNum = solutionNum + 1;
|
44
|
+
solution = intToUInt8Array(solutionNum);
|
45
|
+
digestUint8array = await sha256(mergeUInt8Arrays(solution, uint8Challenge));
|
46
|
+
|
47
|
+
bitPrefixArr = getFirstXBits(digestUint8array, difficulty);
|
48
|
+
} while( !(bitPrefixArr.length == difficulty && bitPrefixArr.every(element => element === 0)) )
|
49
|
+
|
50
|
+
|
51
|
+
|
52
|
+
console.log("solution hex: " + uInt8Array_toHex(solution));
|
53
|
+
console.log("challenge hex: " + hexChallenge);
|
54
|
+
console.log("solution + challenge (hex): " + uInt8Array_toHex(mergeUInt8Arrays(solution, uint8Challenge)));
|
55
|
+
console.log("combined sha hex: " + uInt8Array_toHex(digestUint8array));
|
56
|
+
console.log("bit prefix arr: " + bitPrefixArr);
|
57
|
+
|
58
|
+
return uint8Array_toBase64(solution);
|
59
|
+
}
|
60
|
+
|
61
|
+
// Mostly for debugging
|
62
|
+
function uInt8Array_toHex(buffer) {
|
63
|
+
return Array.from(buffer)
|
64
|
+
.map(x => x.toString(16).padStart(2, '0'))
|
65
|
+
.join('');
|
66
|
+
}
|
67
|
+
|
68
|
+
// Uint8Array => Uint8Array
|
69
|
+
async function sha256(uint8_array) {
|
70
|
+
return new Uint8Array(await crypto.subtle.digest('SHA-256', uint8_array));
|
71
|
+
}
|
72
|
+
|
73
|
+
function intToUInt8Array(i) {
|
74
|
+
// Got me, but https://stackoverflow.com/a/78592211/307106
|
75
|
+
return new Uint8Array(new BigUint64Array([BigInt(i)]).buffer);
|
76
|
+
}
|
77
|
+
|
78
|
+
function mergeUInt8Arrays(a1, a2) {
|
79
|
+
// sum of individual array lengths
|
80
|
+
var mergedArray = new Uint8Array(a1.length + a2.length);
|
81
|
+
mergedArray.set(a1);
|
82
|
+
mergedArray.set(a2, a1.length);
|
83
|
+
return mergedArray;
|
84
|
+
}
|
85
|
+
|
86
|
+
function uint8Array_fromHex(hexString) {
|
87
|
+
// https://stackoverflow.com/a/50868276/307106
|
88
|
+
return Uint8Array.from(hexString.match(/.{1,2}/g).map((byte) => parseInt(byte, 16)));
|
89
|
+
}
|
90
|
+
|
91
|
+
// https://stackoverflow.com/a/66046176/307106
|
92
|
+
async function uint8Array_toBase64(buffer) {
|
93
|
+
// use a FileReader to generate a base64 data URI:
|
94
|
+
const base64url = await new Promise(r => {
|
95
|
+
const reader = new FileReader()
|
96
|
+
reader.onload = () => r(reader.result)
|
97
|
+
reader.readAsDataURL(new Blob([buffer]))
|
98
|
+
});
|
99
|
+
// remove the `data:...;base64,` part from the start
|
100
|
+
return base64url.slice(base64url.indexOf(',') + 1);
|
101
|
+
}
|
102
|
+
|
103
|
+
//https://stackoverflow.com/a/78592211/307106
|
104
|
+
function getFirstXBits(uint8Array, x) {
|
105
|
+
const numBytes = Math.ceil(x / 8);
|
106
|
+
const view = uint8Array.subarray(0, numBytes);
|
107
|
+
const bits = [];
|
108
|
+
let bitIndex = 0;
|
109
|
+
|
110
|
+
for (const byte of view) {
|
111
|
+
for (let i = 7; i >= 0 && bitIndex < x; i--) {
|
112
|
+
bits.push((byte >> i) & 1);
|
113
|
+
bitIndex++;
|
114
|
+
}
|
115
|
+
}
|
116
|
+
return bits;
|
117
|
+
}
|
118
|
+
|
119
|
+
<%= render partial: "bot_challenge_page/common_func", formats: [:js] %>
|
120
|
+
|
121
|
+
</script>
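The browser script above searches for a counter whose SHA-256 digest, taken over the counter bytes followed by the challenge bytes, begins with `difficulty` zero bits. The same check can be sketched in Ruby. This is an illustration only, not the gem's server-side code: the helper names are hypothetical, and the 8-byte little-endian encoding is an assumption about what `BigUint64Array` yields on common platforms.

```ruby
require "digest"

# Count the leading zero bits of a binary string.
def leading_zero_bits(bytes)
  count = 0
  bytes.each_byte do |byte|
    if byte.zero?
      count += 8
    else
      count += 8 - byte.bit_length
      break
    end
  end
  count
end

# Valid when SHA-256(solution + challenge) starts with `difficulty` zero bits.
def pow_solution_valid?(solution_bytes, hex_challenge, difficulty)
  challenge_bytes = [hex_challenge].pack("H*") # hex string -> raw bytes
  digest = Digest::SHA256.digest(solution_bytes + challenge_bytes)
  leading_zero_bits(digest) >= difficulty
end

# Brute-force search like the browser loop; assumes little-endian uint64 counters.
def solve_pow(hex_challenge, difficulty)
  n = 0
  loop do
    n += 1
    candidate = [n].pack("Q<")
    return candidate if pow_solution_valid?(candidate, hex_challenge, difficulty)
  end
end

solution = solve_pow("deadbeef", 8)
puts pow_solution_valid?(solution, "deadbeef", 8)
```

Each extra bit of difficulty doubles the expected number of hashes the client must compute, while verification stays a single hash.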
|
@@ -0,0 +1,103 @@
|
|
1
|
+
module BotChallengePage
|
2
|
+
class Config
|
3
|
+
# meh let's do a little accessor definition to make this value class more legible
|
4
|
+
|
5
|
+
# default can be a proc, in which case it really is a proc as a value for default,
|
6
|
+
# the value is the proc!
|
7
|
+
def self.attribute(name, default:nil)
|
8
|
+
attr_defaults[name] = default
|
9
|
+
self.attr_accessor name
|
10
|
+
end
|
11
|
+
|
12
|
+
class_attribute :attr_defaults, default: {}, instance_accessor: false
|
13
|
+
|
14
|
+
def initialize(**values)
|
15
|
+
self.class.attr_defaults.merge(values).each_pair do |key, value|
|
16
|
+
send("#{key}=", value)
|
17
|
+
end
|
18
|
+
end
|
19
|
+
|
20
|
+
# Should we redirect to a challenge page (true) or just display it inline
|
21
|
+
# with a 403 status (false)
|
22
|
+
attribute :redirect_for_challenge, default: false
|
23
|
+
|
24
|
+
attribute :enabled, default: true
|
25
|
+
|
26
|
+
# ActiveSupport::Cache::Store to use for rate info, if nil will use Controller #cache_store
|
27
|
+
attribute :store
|
28
|
+
|
29
|
+
attribute :cf_turnstile_sitekey, default: "1x00000000000000000000AA" # a testing key that always passes
|
30
|
+
attribute :cf_turnstile_secret_key, default: "1x0000000000000000000000000000000AA" # a testing key always passes
|
31
|
+
# Turnstile testing keys: https://developers.cloudflare.com/turnstile/troubleshooting/testing/
|
32
|
+
|
33
|
+
# how long is a challenge pass good for before re-challenge?
|
34
|
+
attribute :session_passed_good_for, default: 24.hours
|
35
|
+
|
36
|
+
|
37
|
+
# Executed inside a controller instance, to omit a request from bot challenge.
|
38
|
+
# Adds on to :unless arg.
|
39
|
+
attribute :skip_when, default: ->(config) { false }
|
40
|
+
|
41
|
+
# replace with say `->() { render layout: 'something' }`, or `render "somedir/some_template"`
|
42
|
+
attribute :challenge_renderer, default: ->() {
|
43
|
+
render "bot_challenge_page/bot_challenge_page/challenge", status: 403
|
44
|
+
}
|
45
|
+
|
46
|
+
attribute :after_blocked, default: ->(bot_detect_class) {}
|
47
|
+
|
48
|
+
|
49
|
+
# rate limit per subnet, follow lehigh's lead with
|
50
|
+
# subnet: /16 for IPv4 (x.y.*.*), and /64 for IPv6 (about the same size subnet for better or worse)
|
51
|
+
# https://git.drupalcode.org/project/turnstile_protect/-/blob/0dae9f95d48f9d8cae5a8e61e767c69f64490983/src/EventSubscriber/Challenge.php#L140-151
|
52
|
+
attribute :default_limit_by, default: (lambda do |config|
|
53
|
+
if request.remote_ip.index(":") # ipv6
|
54
|
+
IPAddr.new("#{request.remote_ip}/64").to_string
|
55
|
+
else
|
56
|
+
IPAddr.new("#{request.remote_ip}/16").to_string
|
57
|
+
end
|
58
|
+
rescue IPAddr::InvalidAddressError
|
59
|
+
request.remote_ip
|
60
|
+
end)
|
61
|
+
|
62
|
+
# fingerprint is taken when "pass" is stored in session. client
|
63
|
+
# fingerprint needs to be the same to use pass, or else it's rejected.
|
64
|
+
#
|
65
|
+
# Algorithm parts based on advice from Xe Iaso @ Anubis, with variations.
|
66
|
+
#
|
67
|
+
# Allow exact IP to change -- various IPv6 and NAT can make it -- but within limited
|
68
|
+
# subnet. But also force some other headers to match, which they should if it's the same
|
69
|
+
# user-agent, which it should be if it's re-using a cookie.
|
70
|
+
attribute :session_valid_fingerprint, default: ->(request) {
|
71
|
+
ip_subnet_base = if request.remote_ip.index(":") #ipv6
|
72
|
+
IPAddr.new("#{request.remote_ip}/64").to_string
|
73
|
+
else
|
74
|
+
IPAddr.new("#{request.remote_ip}/24").to_string
|
75
|
+
end
|
76
|
+
|
77
|
+
[
|
78
|
+
request.user_agent,
|
79
|
+
request.headers['sec-ch-ua-platform'],
|
80
|
+
request.headers['accept-encoding'],
|
81
|
+
ip_subnet_base
|
82
|
+
].join(":")
|
83
|
+
}
|
84
|
+
|
85
|
+
|
86
|
+
attribute :cf_turnstile_js_url, default: "https://challenges.cloudflare.com/turnstile/v0/api.js"
|
87
|
+
attribute :cf_turnstile_validation_url, default: "https://challenges.cloudflare.com/turnstile/v0/siteverify"
|
88
|
+
attribute :cf_timeout, default: 3 # max timeout seconds waiting on Cloudflare Turnstile API
|
89
|
+
|
90
|
+
# key stored in Rails session object when a passed challenge is confirmed
|
91
|
+
attribute :session_passed_key, default: "bot_detection-passed"
|
92
|
+
|
93
|
+
attribute :still_around_delay_ms, default: 1200
|
94
|
+
|
95
|
+
# make sure dup dups all attributes please
|
96
|
+
def initialize_dup(source)
|
97
|
+
self.class.attr_defaults.keys.each do |attr_key|
|
98
|
+
instance_variable_set("@#{attr_key}", instance_variable_get("@#{attr_key}").deep_dup)
|
99
|
+
super
|
100
|
+
end
|
101
|
+
end
|
102
|
+
end
|
103
|
+
end
|
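The subnet bucketing used by `default_limit_by` above can be tried outside Rails. The following is a minimal sketch, assuming only Ruby's standard `ipaddr` library; `subnet_bucket` is a hypothetical helper name, not part of the gem, and it takes a plain string where the gem reads `request.remote_ip`.

```ruby
require "ipaddr"

# Hypothetical helper (not part of the gem): map a client IP string to the
# subnet used as a rate-limit bucket key, /16 for IPv4 and /64 for IPv6.
def subnet_bucket(ip)
  prefix_length = ip.include?(":") ? 64 : 16
  # IPAddr masks off the host bits; to_string returns the uncompressed form.
  IPAddr.new("#{ip}/#{prefix_length}").to_string
rescue IPAddr::InvalidAddressError
  ip # fall back to the raw value when it cannot be parsed
end

puts subnet_bucket("192.168.45.67")        # 192.168.0.0
puts subnet_bucket("2001:db8:1:2:3:4:5:6") # 2001:0db8:0001:0002:0000:0000:0000:0000
```

Every client in the same /16 (or /64) shares one bucket, which is why switching to individual IPs would be considerably more permissive.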
data/lib/bot_challenge_page.rb
CHANGED
@@ -1,6 +1,17 @@
|
|
1
1
|
require "bot_challenge_page/version"
|
2
2
|
require "bot_challenge_page/engine"
|
3
|
+
require "bot_challenge_page/config"
|
3
4
|
|
4
5
|
module BotChallengePage
|
5
|
-
|
6
|
+
mattr_reader :config, default: ::BotChallengePage::Config.new
|
7
|
+
|
8
|
+
# Just a convenience to allow
|
9
|
+
#
|
10
|
+
# BotChallengePage.configure do |config|
|
11
|
+
# config.foo = "bar"
|
12
|
+
# end
|
13
|
+
#
|
14
|
+
def self.configure
|
15
|
+
yield config
|
16
|
+
end
|
6
17
|
end
|
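The `configure` convenience above is a common Ruby pattern: one shared config object, yielded to a block. A minimal plain-Ruby sketch follows, assuming no ActiveSupport (so `mattr_reader` is replaced by a memoized class method). `DemoChallenge`, its `Config` class, and the default values are hypothetical stand-ins, not the gem's real definitions.

```ruby
# Hypothetical stand-in for the gem's module, written without ActiveSupport.
module DemoChallenge
  class Config
    attr_accessor :enabled, :session_passed_key

    def initialize
      # Illustrative defaults only; the gem's actual defaults may differ.
      @enabled = false
      @session_passed_key = "bot_detection-passed"
    end
  end

  # One shared config object for the whole process.
  def self.config
    @config ||= Config.new
  end

  # Yield the shared config so callers can set several values in one block.
  def self.configure
    yield config
  end
end

DemoChallenge.configure do |config|
  config.enabled = true
end

puts DemoChallenge.config.enabled
```

Because the block receives the same object that `config` returns, settings made in an initializer are visible everywhere the module is referenced.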
data/lib/generators/bot_challenge_page/install_generator.rb
CHANGED
@@ -13,55 +13,12 @@ module BotChallengePage
|
|
13
13
|
end
|
14
14
|
end
|
15
15
|
|
16
|
-
def
|
17
|
-
|
18
|
-
# only be on protected filters
|
19
|
-
return unless options[:rack_attack]
|
20
|
-
|
21
|
-
inject_into_class "app/controllers/application_controller.rb", "ApplicationController" do
|
22
|
-
filter_code = "BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller)"
|
23
|
-
|
24
|
-
<<-EOS
|
25
|
-
# This will only protect CONFIGURED routes, but also could be put on just certain
|
26
|
-
# controllers, it does not need to be in ApplicationController
|
27
|
-
before_action do |controller|
|
28
|
-
#{filter_code}
|
29
|
-
end
|
30
|
-
|
31
|
-
EOS
|
32
|
-
end
|
33
|
-
end
|
34
|
-
|
35
|
-
def add_rack_attack_require_if_needed
|
36
|
-
if options[:rack_attack]
|
37
|
-
# since it's an intermediate dependency, we need to require it after rails
|
38
|
-
# so it will load its rails stuff
|
39
|
-
inject_into_file "config/application.rb", "\nrequire 'rack/attack'\n", after: /require.*rails\/[^\n]+\n/m
|
40
|
-
|
41
|
-
end
|
16
|
+
def add_controller_mixin
|
17
|
+
inject_into_class "app/controllers/application_controller.rb", "ApplicationController", " include BotChallengePage::Controller\n"
|
42
18
|
end
|
43
19
|
|
44
20
|
def copy_initializer_file
|
45
21
|
template "initializer.rb.erb", "config/initializers/bot_challenge_page.rb"
|
46
22
|
end
|
47
|
-
|
48
|
-
def suggest_filter
|
49
|
-
unless options[:rack_attack]
|
50
|
-
instructions = <<~EOS
|
51
|
-
You must add before_action to protect controllers
|
52
|
-
|
53
|
-
Add, eg:
|
54
|
-
|
55
|
-
before_action only: :index do |controller|
|
56
|
-
BotChallengePage::BotChallengePageController.bot_challenge_enforce_filter(controller, immediate: true)
|
57
|
-
end
|
58
|
-
|
59
|
-
To desired controllers and/or ApplicationController
|
60
|
-
EOS
|
61
|
-
|
62
|
-
say_status("advise", instructions, :green)
|
63
|
-
end
|
64
|
-
end
|
65
|
-
|
66
23
|
end
|
67
24
|
end
|
data/lib/generators/bot_challenge_page/templates/initializer.rb.erb
CHANGED
@@ -1,56 +1,64 @@
|
|
1
|
-
|
1
|
+
BotChallengePage.configure do |config|
|
2
2
|
|
3
|
-
|
3
|
+
# Can globally disable in configuration if desired
|
4
|
+
config.enabled = true
|
4
5
|
|
5
6
|
# Get from CloudFlare Turnstile: https://www.cloudflare.com/application-services/products/turnstile/
|
6
7
|
# Some testing keys are also available: https://developers.cloudflare.com/turnstile/troubleshooting/testing/
|
7
8
|
#
|
8
9
|
# Always pass testing sitekey: "1x00000000000000000000AA"
|
9
|
-
|
10
|
+
config.cf_turnstile_sitekey = "MUST GET"
|
10
11
|
# Always pass testing secret_key: "1x0000000000000000000000000000000AA"
|
11
|
-
|
12
|
+
config.cf_turnstile_secret_key = "MUST GET"
|
12
13
|
|
13
|
-
BotChallengePage::BotChallengePageController.bot_challenge_config.redirect_for_challenge = <%= options[:redirect_for_challenge] %>
|
14
14
|
|
15
|
-
<%- if options[:
|
16
|
-
|
17
|
-
|
18
|
-
|
15
|
+
<%- if options[:redirect_for_challenge] -%>
|
16
|
+
config.redirect_for_challenge = <%= options[:redirect_for_challenge] %>
|
17
|
+
<% end %>
|
18
|
+
|
19
|
+
# For rate-limiting, we need a rails cache store that keeps state, by default
|
20
|
+
# will use `config.action_controller.cache_store` or Rails.cache, but if you'd
|
21
|
+
# like to use a separate store database, e.g.:
|
22
|
+
# config.store = ActiveSupport::Cache::RedisCacheStore.new(url: "...")
|
23
|
+
|
24
|
+
# Filter to omit requests from bot challenge control, executed in controller instance context
|
19
25
|
#
|
20
|
-
#
|
26
|
+
# config.skip_when = ->(config) {
|
27
|
+
# # maybe you want to globally exempt a heartbeat path
|
28
|
+
# current_page?(rails_health_check_path) ||
|
21
29
|
#
|
22
|
-
#
|
23
|
-
#
|
30
|
+
# # Here's a way to identify browser `fetch` API requests; note
|
31
|
+
# # it can be faked by an "attacker" so you might not want to do this globally
|
32
|
+
# (request.headers["sec-fetch-dest"] == "empty") ||
|
24
33
|
#
|
25
|
-
#
|
26
|
-
#
|
27
|
-
#
|
34
|
+
# # Maybe you want to exempt an uptime checker or other trusted bot
|
35
|
+
# # based on a shared secret
|
36
|
+
# (headers["x-some-secret"] == "some_shared_secret")
|
37
|
+
# }
|
28
38
|
|
29
|
-
|
30
|
-
|
39
|
+
# Hook after a bot challenge is presented, for logging or other purposes
|
40
|
+
# config.after_blocked = ->(bot_challenge_controller) {
|
41
|
+
# }
|
31
42
|
|
32
|
-
# allow rate_limit_count requests in rate_limit_period, before issuing challenge
|
33
|
-
BotChallengePage::BotChallengePageController.bot_challenge_config.rate_limit_period = 12.hour
|
34
|
-
BotChallengePage::BotChallengePageController.bot_challenge_config.rate_limit_count = 2
|
35
|
-
<% end -%>
|
36
43
|
|
37
44
|
# How long will a challenge success exempt a session from further challenges?
|
38
|
-
#
|
45
|
+
# config.session_passed_good_for = 36.hours
|
39
46
|
|
40
|
-
# Exempt some requests from bot challenge protection
|
41
|
-
# BotChallengePage::BotChallengePageController.bot_challenge_config.allow_exempt = ->(controller) {
|
42
|
-
# # controller.params
|
43
|
-
# # controller.request
|
44
|
-
# # controller.session
|
45
47
|
|
46
|
-
#
|
47
|
-
#
|
48
|
-
#
|
49
|
-
#
|
48
|
+
# Functions like the Rails rate_limit `by` parameter, as a configured default.
|
49
|
+
# A discriminator or identifier in which a client's requests will be bucketed
|
50
|
+
# by rate limit. Normally this gem buckets by IP address subnets. Switching
|
51
|
+
# to individual IPs would be much more generous:
|
52
|
+
# config.default_limit_by = ->(config) {
|
53
|
+
# request.remote_ip
|
54
|
+
# }
|
50
55
|
|
51
|
-
#
|
56
|
+
# When a "pass" cookie is saved, a fingerprint value is stored with it,
|
57
|
+
# and subsequent uses of the pass need to have a request that matches
|
58
|
+
# fingerprint. By default we insist on IP subnet match, and same user-agent
|
59
|
+
# and other headers. But can be customized.
|
60
|
+
# config.session_valid_fingerprint = ->(request) {
|
61
|
+
# # whatever
|
62
|
+
# }
|
52
63
|
|
53
|
-
<%- if options[:rack_attack] %>
|
54
|
-
BotChallengePage::BotChallengePageController.rack_attack_init
|
55
|
-
<% end %>
|
56
64
|
end
|
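The fingerprint idea described in the initializer comments (subnet match plus matching headers) can be illustrated without a Rails request object. This is a minimal sketch, assuming the same /24 and /64 subnet sizes as the default shown earlier in this diff; `fingerprint` and its keyword arguments are hypothetical names standing in for values the gem reads from `request`.

```ruby
require "ipaddr"

# Hypothetical helper (not part of the gem): join a few request values and
# the client's subnet into one comparable fingerprint string.
def fingerprint(user_agent:, platform:, accept_encoding:, remote_ip:)
  prefix_length = remote_ip.include?(":") ? 64 : 24
  subnet = IPAddr.new("#{remote_ip}/#{prefix_length}").to_string
  [user_agent, platform, accept_encoding, subnet].join(":")
end

headers = { user_agent: "ExampleBrowser/1.0", platform: "Linux", accept_encoding: "gzip, br" }

stored      = fingerprint(**headers, remote_ip: "203.0.113.10")
same_subnet = fingerprint(**headers, remote_ip: "203.0.113.200")
elsewhere   = fingerprint(**headers, remote_ip: "198.51.100.7")

puts stored == same_subnet # true: the IP changed but stayed within the /24
puts stored == elsewhere   # false: a different subnet invalidates the pass
```

A custom `session_valid_fingerprint` lambda would follow the same shape: return any string that should stay stable for a legitimate client and change for a copied cookie.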
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bot_challenge_page
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.10.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Jonathan Rochkind
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2025-04
|
11
|
+
date: 2025-08-04 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: appraisal
|
@@ -30,14 +30,14 @@ dependencies:
|
|
30
30
|
requirements:
|
31
31
|
- - "~>"
|
32
32
|
- !ruby/object:Gem::Version
|
33
|
-
version: '
|
33
|
+
version: '8.0'
|
34
34
|
type: :development
|
35
35
|
prerelease: false
|
36
36
|
version_requirements: !ruby/object:Gem::Requirement
|
37
37
|
requirements:
|
38
38
|
- - "~>"
|
39
39
|
- !ruby/object:Gem::Version
|
40
|
-
version: '
|
40
|
+
version: '8.0'
|
41
41
|
- !ruby/object:Gem::Dependency
|
42
42
|
name: capybara
|
43
43
|
requirement: !ruby/object:Gem::Requirement
|
@@ -114,20 +114,6 @@ dependencies:
|
|
114
114
|
- - "<"
|
115
115
|
- !ruby/object:Gem::Version
|
116
116
|
version: '8.1'
|
117
|
-
- !ruby/object:Gem::Dependency
|
118
|
-
name: rack-attack
|
119
|
-
requirement: !ruby/object:Gem::Requirement
|
120
|
-
requirements:
|
121
|
-
- - "~>"
|
122
|
-
- !ruby/object:Gem::Version
|
123
|
-
version: '6.7'
|
124
|
-
type: :runtime
|
125
|
-
prerelease: false
|
126
|
-
version_requirements: !ruby/object:Gem::Requirement
|
127
|
-
requirements:
|
128
|
-
- - "~>"
|
129
|
-
- !ruby/object:Gem::Version
|
130
|
-
version: '6.7'
|
131
117
|
- !ruby/object:Gem::Dependency
|
132
118
|
name: http
|
133
119
|
requirement: !ruby/object:Gem::Requirement
|
@@ -153,17 +139,18 @@ files:
|
|
153
139
|
- README.md
|
154
140
|
- Rakefile
|
155
141
|
- app/controllers/bot_challenge_page/bot_challenge_page_controller.rb
|
156
|
-
- app/controllers/concerns/bot_challenge_page/
|
157
|
-
- app/controllers/concerns/bot_challenge_page/
|
158
|
-
- app/models/bot_challenge_page/config.rb
|
159
|
-
- app/models/bot_challenge_page/simple_pow1.rb
|
142
|
+
- app/controllers/concerns/bot_challenge_page/controller.rb
|
143
|
+
- app/controllers/concerns/bot_challenge_page/guard_action.rb
|
160
144
|
- app/models/bot_challenge_page/test.html
|
161
145
|
- app/views/bot_challenge_page/_local_turnstile_script_tag.html.erb
|
162
146
|
- app/views/bot_challenge_page/_turnstile_widget_placeholder.html.erb
|
163
147
|
- app/views/bot_challenge_page/bot_challenge_page/challenge.html.erb
|
148
|
+
- app/views/bot_challenge_page/pow1/_pow1_placeholder.html.erb
|
149
|
+
- app/views/bot_challenge_page/pow1/_pow1_script_tag.html.erb
|
164
150
|
- config/locales/bot_challenge_page.en.yml
|
165
151
|
- config/routes.rb
|
166
152
|
- lib/bot_challenge_page.rb
|
153
|
+
- lib/bot_challenge_page/config.rb
|
167
154
|
- lib/bot_challenge_page/engine.rb
|
168
155
|
- lib/bot_challenge_page/version.rb
|
169
156
|
- lib/generators/bot_challenge_page/install_generator.rb
|
data/app/controllers/concerns/bot_challenge_page/rack_attack_init.rb
DELETED
@@ -1,60 +0,0 @@
|
|
1
|
-
module BotChallengePage
|
2
|
-
|
3
|
-
# Extracted to concern in separate file mostly for readability, not expected to be used
|
4
|
-
# anywhere but BotChallengePageController -- we hang all logic off controller to allow multiple
|
5
|
-
# controllers in an app, and over-ride in sub-classes.
|
6
|
-
module RackAttackInit
|
7
|
-
extend ActiveSupport::Concern
|
8
|
-
|
9
|
-
|
10
|
-
class_methods do
|
11
|
-
# perhaps in an initializer, and after changing any config, run:
|
12
|
-
#
|
13
|
-
# Rails.application.config.to_prepare do
|
14
|
-
# BotChallengePage::BotChallengePageController.rack_attack_init
|
15
|
-
# end
|
16
|
-
#
|
17
|
-
# Safe to call more than once if you change config and want to call again, say in testing.
|
18
|
-
def rack_attack_init
|
19
|
-
self._rack_attack_uninit # make it safe for calling multiple times
|
20
|
-
|
21
|
-
## Turnstile bot detection throttling
|
22
|
-
#
|
23
|
-
# for paths matched by `rate_limited_locations`, after over rate_limit count requests in rate_limit_period,
|
24
|
-
# token will be stored in rack env instructing challenge is required.
|
25
|
-
#
|
26
|
-
# For actual challenge, need before_action in controller.
|
27
|
-
#
|
28
|
-
# You could rate limit detect on wider paths than you actually challenge on, or the same. You probably
|
29
|
-
# don't want to rate-limit detect on narrower list of paths than you challenge on!
|
30
|
-
Rack::Attack.track("bot_detect/rate_exceeded/#{self.name}",
|
31
|
-
limit: self.bot_challenge_config.rate_limit_count,
|
32
|
-
period: self.bot_challenge_config.rate_limit_period) do |req|
|
33
|
-
if self.bot_challenge_config.enabled && self.bot_challenge_config.location_matcher.call(req, self.bot_challenge_config)
|
34
|
-
self.bot_challenge_config.rate_limit_discriminator.call(req, self.bot_challenge_config)
|
35
|
-
end
|
36
|
-
end
|
37
|
-
|
38
|
-
self._track_notification_subscription = ActiveSupport::Notifications.subscribe("track.rack_attack") do |_name, _start, _finish, request_id, payload|
|
39
|
-
rack_request = payload[:request]
|
40
|
-
rack_env = rack_request.env
|
41
|
-
match_name = rack_env["rack.attack.matched"] # name of rack-attack rule
|
42
|
-
#
|
43
|
-
if match_name == "bot_detect/rate_exceeded/#{self.name}"
|
44
|
-
match_data = rack_env["rack.attack.match_data"]
|
45
|
-
match_data_formatted = match_data.slice(:count, :limit, :period).map { |k, v| "#{k}=#{v}"}.join(" ")
|
46
|
-
discriminator = rack_env["rack.attack.match_discriminator"] # unique key for rate limit, usually includes ip
|
47
|
-
|
48
|
-
rack_env[self.bot_challenge_config.env_challenge_trigger_key] = true
|
49
|
-
end
|
50
|
-
end
|
51
|
-
end
|
52
|
-
|
53
|
-
def _rack_attack_uninit
|
54
|
-
Rack::Attack.track("bot_detect/rate_exceeded/#{self.name}") {} # overwrite track name with empty proc
|
55
|
-
ActiveSupport::Notifications.unsubscribe(self._track_notification_subscription) if self._track_notification_subscription
|
56
|
-
self._track_notification_subscription = nil
|
57
|
-
end
|
58
|
-
end
|
59
|
-
end
|
60
|
-
end
|
data/app/models/bot_challenge_page/config.rb
DELETED
@@ -1,124 +0,0 @@
|
|
1
|
-
module BotChallengePage
|
2
|
-
class Config
|
3
|
-
# meh let's do a little accessor definition to make this value class more legible
|
4
|
-
|
5
|
-
# default can be a proc, in which case it really is a proc as a value for default,
|
6
|
-
# the value is the proc!
|
7
|
-
def self.attribute(name, default:nil)
|
8
|
-
attr_defaults[name] = default
|
9
|
-
self.attr_accessor name
|
10
|
-
end
|
11
|
-
|
12
|
-
class_attribute :attr_defaults, default: {}, instance_accessor: false
|
13
|
-
|
14
|
-
def initialize(**values)
|
15
|
-
self.class.attr_defaults.merge(values).each_pair do |key, value|
|
16
|
-
# super hacky way to execute any procs in the context of this config,
|
17
|
-
# so they can access other config values easily.
|
18
|
-
if value.kind_of?(Proc)
|
19
|
-
newval = lambda do |*args|
|
20
|
-
self.instance_exec(*args, &value)
|
21
|
-
end
|
22
|
-
else
|
23
|
-
newval = value
|
24
|
-
end
|
25
|
-
|
26
|
-
send("#{key}=", newval)
|
27
|
-
end
|
28
|
-
end
|
29
|
-
|
30
|
-
# Should we redirect to a challenge page (true) or just display it inline
|
31
|
-
# with a 403 status (false)
|
32
|
-
attribute :redirect_for_challenge, default: false
|
33
|
-
|
34
|
-
attribute :enabled, default: false # Must set to true to turn on at all
|
35
|
-
|
36
|
-
attribute :cf_turnstile_sitekey, default: "1x00000000000000000000AA" # a testing key that always passes
|
37
|
-
attribute :cf_turnstile_secret_key, default: "1x0000000000000000000000000000000AA" # a testing key always passes
|
38
|
-
# Turnstile testing keys: https://developers.cloudflare.com/turnstile/troubleshooting/testing/
|
39
|
-
|
40
|
-
# up to rate_limit_count requests in rate_limit_period before challenged
|
41
|
-
attribute :rate_limit_period, default: 12.hour
|
42
|
-
attribute :rate_limit_count, default: 10
|
43
|
-
|
44
|
-
# how long is a challenge pass good for before re-challenge?
|
45
|
-
attribute :session_passed_good_for, default: 24.hours
|
46
|
-
|
47
|
-
# An array, can be:
|
48
|
-
# * a string, path prefix
|
49
|
-
# * a hash of rails route-decoded params, like `{ controller: "something" }`,
|
50
|
-
# or `{ controller: "something", action: "index" }
|
51
|
-
# The hash is more expensive to check and uses some not-technically-public
|
52
|
-
# Rails api, but it's just so convenient.
|
53
|
-
#
|
54
|
-
# Used by default :location_matcher, if set custom may not be used
|
55
|
-
attribute :rate_limited_locations, default: []
|
56
|
-
|
57
|
-
# Executed at the _controller_ filter level, to last minute exempt certain
|
58
|
-
# actions from protection.
|
59
|
-
attribute :allow_exempt, default: ->(controller, config) { false }
|
60
|
-
|
61
|
-
# replace with say `->() { render layout: 'something' }`, or `render "somedir/some_template"`
|
62
|
-
attribute :challenge_renderer, default: ->() {
|
63
|
-
render "bot_challenge_page/bot_challenge_page/challenge", status: 403
|
64
|
-
}
|
65
|
-
|
66
|
-
attribute :after_blocked, default: ->(bot_detect_class) {}
|
67
|
-
|
68
|
-
|
69
|
-
# rate limit per subnet, following lehigh's lead, although we use a smaller
|
70
|
-
# subnet: /24 for IPv4, and /72 for IPv6
|
71
|
-
# https://git.drupalcode.org/project/turnstile_protect/-/blob/0dae9f95d48f9d8cae5a8e61e767c69f64490983/src/EventSubscriber/Challenge.php#L140-151
|
72
|
-
attribute :rate_limit_discriminator, default: (lambda do |req, config|
|
73
|
-
if req.ip.index(":") # ipv6
|
74
|
-
IPAddr.new("#{req.ip}/72").to_string
|
75
|
-
else
|
76
|
-
IPAddr.new("#{req.ip}/24").to_string
|
77
|
-
end
|
78
|
-
rescue IPAddr::InvalidAddressError
|
79
|
-
req.ip
|
80
|
-
end)
|
81
|
-
|
82
|
-
attribute :location_matcher, default: ->(rack_req, config) {
|
83
|
-
parsed_route = nil
|
84
|
-
config.rate_limited_locations.any? do |val|
|
85
|
-
case val
|
86
|
-
when Hash
|
87
|
-
begin
|
88
|
-
# #recognize_path may not technically be public API, and may be expensive, but
|
89
|
-
# no other way to do this, and it's mentioned in rack-attack:
|
90
|
-
# https://github.com/rack/rack-attack/blob/86650c4f7ea1af24fe4a89d3040e1309ee8a88bc/docs/advanced_configuration.md#match-actions-in-rails
|
91
|
-
# We do it lazily only if needed so if you don't want that don't use it.
|
92
|
-
parsed_route ||= rack_req.env["action_dispatch.routes"].recognize_path(rack_req.url, method: rack_req.request_method)
|
93
|
-
parsed_route && parsed_route >= val
|
94
|
-
rescue ActionController::RoutingError
|
95
|
-
false
|
96
|
-
end
|
97
|
-
when String
|
98
|
-
# string complete path at beginning, must end in ?, or end of string
|
99
|
-
/\A#{Regexp.escape val}(\/|\?|\Z)/ =~ rack_req.path
|
100
|
-
end
|
101
|
-
end
|
102
|
-
}
|
103
|
-
attribute :cf_turnstile_js_url, default: "https://challenges.cloudflare.com/turnstile/v0/api.js"
|
104
|
-
attribute :cf_turnstile_validation_url, default: "https://challenges.cloudflare.com/turnstile/v0/siteverify"
|
105
|
-
attribute :cf_timeout, default: 3 # max timeout seconds waiting on Cloudflare Turnstile API
|
106
|
-
|
107
|
-
|
108
|
-
# key stored in Rails session object when a passed challenge is confirmed
|
109
|
-
attribute :session_passed_key, default: "bot_detection-passed"
|
110
|
-
|
111
|
-
# key in rack env that says challenge is required
|
112
|
-
attribute :env_challenge_trigger_key, default: "bot_detect.should_challenge"
|
113
|
-
|
114
|
-
attribute :still_around_delay_ms, default: 1200
|
115
|
-
|
116
|
-
# make sure dup dups all attributes please
|
117
|
-
def initialize_dup(source)
|
118
|
-
self.class.attr_defaults.keys.each do |attr_key|
|
119
|
-
instance_variable_set("@#{attr_key}", instance_variable_get("@#{attr_key}").deep_dup)
|
120
|
-
super
|
121
|
-
end
|
122
|
-
end
|
123
|
-
end
|
124
|
-
end
|
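The string branch of the removed `location_matcher` treats each configured value as a complete path prefix: it must be followed by `/`, `?`, or the end of the path. A minimal sketch of that check in isolation follows; `path_prefix_match?` is a hypothetical helper name, and the regular expression is the one shown in the removed code.

```ruby
# Hypothetical helper (not part of the gem): true when `prefix` matches the
# start of `path` as a whole segment, followed by "/", "?", or end of string.
def path_prefix_match?(prefix, path)
  !!(/\A#{Regexp.escape(prefix)}(\/|\?|\Z)/ =~ path)
end

puts path_prefix_match?("/catalog", "/catalog")     # true
puts path_prefix_match?("/catalog", "/catalog/123") # true
puts path_prefix_match?("/catalog", "/catalogue")   # false
```

`Regexp.escape` keeps characters such as `.` in a configured path from being read as regular expression syntax.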
data/app/models/bot_challenge_page/simple_pow1.rb
DELETED
@@ -1,90 +0,0 @@
|
|
1
|
-
require 'digest'
|
2
|
-
require "base64"
|
3
|
-
|
4
|
-
module BotChallengePage
|
5
|
-
# A simple proof-of-work algorithm, that we can also do in javascript
|
6
|
-
#
|
7
|
-
# ## Algorithm
|
8
|
-
#
|
9
|
-
# We calculate a deterministic "challenge" based on a secret key (salt?), current time period,
|
10
|
-
# and the specific client request characteristics (prob just client IP).
|
11
|
-
#
|
12
|
-
# The client has to find a prefix that when prepended to the challenge yields a Sha256 hash
|
13
|
-
# that begins with a certain number of zeroes in the hex representation. The number of zeroes is the "difficulty".
|
14
|
-
# Each zero in hex rep is 4 bits.
|
15
|
-
#
|
16
|
-
# They send the prefix back to us as a solution, and we confirm that when prefixed to
|
17
|
-
# our challenge, and hashed, it has the required number of leading zeroes.
|
18
|
-
#
|
19
|
-
# (TODO: Leading zeroes in a hex representation or what?)
|
20
|
-
class SimplePow1
|
21
|
-
# how long is a challenge good for, it will really be good for somewhere between this and 2x this,
|
22
|
-
# since we always try previous challenge to avoid race condition on switch
|
23
|
-
CHALLENGE_PERIOD = 6.minutes
|
24
|
-
|
25
|
-
# how many leading 0 *BITS* -- solve time varies a LOT and expands RAPIDLY as we add bits, we don't totally know what we're doing
|
26
|
-
DEFAULT_DIFFICULTY = 18
|
27
|
-
|
28
|
-
DEFAULT_SECRET = ActiveSupport::KeyGenerator.new(Rails.application.config.secret_key_base).generate_key("BotChallengePage::SimplePow1")
|
29
|
-
|
30
|
-
attr_reader :client_id, :difficulty
|
31
|
-
|
32
|
-
def initialize(client_id:, secret: DEFAULT_SECRET, difficulty: DEFAULT_DIFFICULTY)
|
33
|
-
@client_id = client_id # usually client ip
|
34
|
-
@difficulty = difficulty
|
35
|
-
@secret = secret
|
36
|
-
end
|
37
|
-
|
38
|
-
# challenge is deterministic based on our secret, the current time, and the client_id
|
39
|
-
def challenge(for_time: Time.now.utc)
|
40
|
-
period_normalized_time = for_time - (for_time.to_i % CHALLENGE_PERIOD)
|
41
|
-
|
42
|
-
Digest::SHA256.hexdigest "#{period_normalized_time.to_s}_#{client_id.to_s}_#{@secret.to_s}"
|
43
|
-
end
|
44
|
-
|
45
|
-
def challenge_for_last_period
|
46
|
-
challenge(for_time: Time.now.utc - CHALLENGE_PERIOD)
|
47
|
-
end
|
48
|
-
|
49
|
-
def challenge_params
|
50
|
-
{
|
51
|
-
challenge: challenge,
|
52
|
-
difficulty: difficulty
|
53
|
-
}
|
54
|
-
end
|
55
|
-
|
56
|
-
# Check solution against current challenge, AND against the previous period's challenge,
|
57
|
-
# in case we just had a race condition, meaning the time a challenge is good for is actually
|
58
|
-
# min CHALLENGE_PERIOD and max 2 * CHALLENGE_PERIOD
|
59
|
-
#
|
60
|
-
# @param solution [String] *Base64-encoded data*, that when prefixed to the challenge,
|
61
|
-
# results in a sha256 digest with `difficulty` leading 0 bits.
|
62
|
-
#
|
63
|
-
def verify_solution(solution)
|
64
|
-
solution = Base64.decode64(solution)
|
65
|
-
|
66
|
-
verify_solution_for_challenge(solution, challenge) ||
|
67
|
-
verify_solution_for_challenge(solution, challenge(for_time: Time.now.utc - CHALLENGE_PERIOD))
|
68
|
-
end
|
69
|
-
|
70
|
-
# @param solution [String] actual data, **not** base64 encoded
|
71
|
-
#
|
72
|
-
def verify_solution_for_challenge(aSolution, aChallenge)
|
73
|
-
# there's prob a more efficient mathematical way to do this without converting
|
74
|
-
# to hex string, but this is what we've got.
|
75
|
-
bindigest = Digest::SHA256.digest(aSolution + aChallenge)
|
76
|
-
|
77
|
-
# hopefully we are not going to have a problem with endian-ness here. :(
|
78
|
-
|
79
|
-
bytes_required = (difficulty / 8) + 1
|
80
|
-
prefix_bytes = bindigest.byteslice(0, bytes_required).bytes
|
81
|
-
prefix_bits = prefix_bytes.collect do |byte|
|
82
|
-
reversed_bits = byte.digits(2)
|
83
|
-
reversed_bits.fill(0, reversed_bits.length..7).reverse
|
84
|
-
end.compact.join.slice(0, difficulty)
|
85
|
-
|
86
|
-
prefix_bits == ("0" * difficulty)
|
87
|
-
end
|
88
|
-
|
89
|
-
end
|
90
|
-
end
|
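The proof-of-work scheme described in the removed `SimplePow1` comments (find a prefix so that SHA256 of prefix plus challenge starts with `difficulty` zero bits) can be sketched in a few lines. This is an illustration only, assuming a decimal counter as the solution format and a low difficulty of 12 bits so it finishes quickly; `leading_zero_bits`, `valid_solution?`, and `solve` are hypothetical names, and the challenge string is made up.

```ruby
require "digest"

# Count the leading zero bits of a binary digest string.
def leading_zero_bits(binary_digest)
  count = 0
  binary_digest.each_byte do |byte|
    if byte.zero?
      count += 8
    else
      count += 8 - byte.bit_length # zero bits before the first set bit
      break
    end
  end
  count
end

# A solution is valid when SHA256(solution + challenge) has at least
# `difficulty` leading zero bits.
def valid_solution?(solution, challenge, difficulty)
  leading_zero_bits(Digest::SHA256.digest(solution + challenge)) >= difficulty
end

# Brute-force search, as a client would do; expected work doubles per bit.
def solve(challenge, difficulty)
  nonce = 0
  nonce += 1 until valid_solution?(nonce.to_s, challenge, difficulty)
  nonce.to_s
end

challenge = Digest::SHA256.hexdigest("example-period_203.0.113.10_example-secret")
solution = solve(challenge, 12)
puts valid_solution?(solution, challenge, 12) # true
```

Verification costs one hash while solving costs about 2^difficulty hashes on average, which is the asymmetry the scheme relies on.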