bot_challenge_page 0.2.0 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +38 -4
- data/app/controllers/bot_challenge_page/bot_challenge_page_controller.rb +13 -10
- data/app/controllers/concerns/bot_challenge_page/enforce_filter.rb +14 -3
- data/app/models/bot_challenge_page/config.rb +9 -1
- data/app/views/bot_challenge_page/_local_turnstile_script_tag.html.erb +15 -9
- data/app/views/bot_challenge_page/_turnstile_widget_placeholder.html.erb +2 -1
- data/app/views/bot_challenge_page/bot_challenge_page/challenge.html.erb +4 -4
- data/lib/bot_challenge_page/version.rb +1 -1
- data/lib/generators/bot_challenge_page/install_generator.rb +6 -2
- data/lib/generators/bot_challenge_page/templates/initializer.rb.erb +6 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a1913d93cd52d599d33f7217bb3714fbf67c82d08a888a3025cc30e51c51e438
+  data.tar.gz: 6e06e420e625a069132ae89a3ef733c1b2c246878e771e3b459ea0cdc351d14e
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 11504f2622783e4ca4bfb06b981df28260c1903e04a8900fe1797d34cd05bc29bcf3b2a1a6b9c93b85ae0f3a4639ff115699d4a548f6912b839bee62147e595c
+  data.tar.gz: 229ed02d6173b651ffd3cbef26e6444a974a14a1e9e3390d76567774a2385ad38cb4e1c289a78dda2c66f57d0e6a54e375506c9826cdbd7df7f799a7b96f03ff
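The digests above cover the two archive members named in `checksums.yaml`. As a rough sketch of how such an entry could be checked, assuming you already hold the member's bytes and the YAML text in memory (the helper name `checksum_ok?` and the sample data are hypothetical, not part of RubyGems or this gem):

```ruby
require "digest"
require "yaml"

# Hypothetical helper: compare the SHA256 of `bytes` with the digest recorded
# for `member_name` in a checksums.yaml document.
def checksum_ok?(checksums_yaml, member_name, bytes)
  expected = YAML.safe_load(checksums_yaml).dig("SHA256", member_name)
  !expected.nil? && Digest::SHA256.hexdigest(bytes) == expected
end

# Synthetic data standing in for a real data.tar.gz and its checksums.yaml.
sample_bytes = "example archive bytes"
sample_yaml = <<~YAML
  ---
  SHA256:
    data.tar.gz: #{Digest::SHA256.hexdigest(sample_bytes)}
YAML

puts checksum_ok?(sample_yaml, "data.tar.gz", sample_bytes)
```

The same comparison would apply to the `SHA512` entries with `Digest::SHA512`.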
data/README.md
CHANGED
@@ -27,6 +27,8 @@ The motivating use case is fairly dumb (probably AI-related) crawlers, rather th
 
 * If you do not want to use rack-attack and want challenge on FIRST request, `rails g bot_challenge_page:install --no-rack-attack`
 
+* By default challenge pages are "inline" at protected URL. To redirect to a separate challenge page URL instead, `--redirect-for-challenge`
+
 * If you are **not using rack-attack**, you need to add a before_action to the controller(s)
 you'd like to protect, eg:
 
@@ -59,10 +61,41 @@ To customize the layout or challenge page HTML more further, you can use configu
 ```ruby
 BotChallengePage::BotChallengePageController.bot_challenge_config.challenge_renderer = ()-> {
   render "my_local_view_folder/whatever", layout "another_layout"
-  render layout: "another_layout" # default html but change layout. etc.
 }
 ```
 
+## Logging
+
+By default we log when a challenge result is submitted to the back-end; you can find challenge passes or failures by searching your logs for `BotChallengePage`.
+
+We do not log when a challenge is issued -- experience shows challenge issues far outnumber challenge results, and can fill up the logs too fast.
+
+If you'd like to log or observe challenge issues, you can configure a proc that is executed
+in the context of the controller, and is called when a page is blocked by a challenge.
+
+```ruby
+BotChallengePage::BotChallengePageController.bot_challenge_config.after_blocked = (_bot_challenge_class)-> {
+  logger.info("page blocked by challenge: #{request.uri}")
+}
+```
+
+Or, here's how I managed to get it in [lograge](https://github.com/roidrage/lograge), so a page blocked results in a `bot_chlng=true` param in a lograge line.
+
+```ruby
+BotChallengePage::BotChallengePageController.bot_challenge_config.after_blocked =
+  ->(bot_detect_class) {
+    request.env["bot_detect.blocked_for_challenge"] = true
+  }
+
+
+# production.rb
+config.lograge.custom_payload do |controller|
+  {
+    bot_chlng: controller.request.env["bot_detect.blocked_for_challenge"]
+  }.compact
+end
+```
+
 ## Example possible Blacklight config
 
 Many of us in my professional community use [blacklight](https://github.com/projectblacklight/blacklight). Here's a possible sample blacklight config to:
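The Logging section added above describes `after_blocked` as a proc executed in the context of the controller. A minimal plain-Ruby sketch of that mechanism follows, with no Rails involved; `StubController`, `request_url`, and `log` are hypothetical stand-ins for a real controller, its request, and its logger. One detail worth noting: a lambda literal is written arrow first, `->(arg) { ... }`, and the `()-> {` and `(_bot_challenge_class)-> {` spellings in the README snippets do not appear to parse as Ruby.

```ruby
# Hypothetical stand-in for a Rails controller; only what the hook touches.
class StubController
  attr_reader :log

  def initialize(url)
    @url = url
    @log = []
  end

  def request_url
    @url
  end
end

# Arrow first, then the parameter list. The argument mirrors the value the
# gem passes to the hook; it is unused here.
after_blocked = ->(_bot_challenge_class) {
  log << "page blocked by challenge: #{request_url}"
}

controller = StubController.new("/catalog?q=maps")

# instance_exec evaluates the lambda with `self` set to the controller, so
# bare calls such as `log` and `request_url` resolve against it.
controller.instance_exec(:enforce_filter_stub, &after_blocked)

puts controller.log.first
```

Because the hook runs with the controller as `self`, anything the controller exposes (request, session, logger) is reachable from inside it without being passed in explicitly.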
@@ -88,13 +121,12 @@ Rails.application.config.to_prepare do
   BotChallengePage::BotChallengePageController.bot_challenge_config.rate_limit_count = 3
 
   BotChallengePage::BotChallengePageController.allow_exempt = ->(controller) {
-    # Excempt any Catalog #facet action that looks like an ajax/fetch request, the
-    # ain't gonna work there, we just exempt it.
+    # Excempt any Catalog #facet or #range_limit action that looks like an ajax/fetch request, the # challenge isn't going to work there, we just exempt it.
     #
     # sec-fetch-dest is set to 'empty' by browser on fetch requests, to limit us further;
     # sure an attacker could fake it, we don't mind if someone determined can avoid
     # bot challenge on this one action
-    ( controller.params[:action]
+    ( controller.params[:action].in?(["facet", "range_limit"]) &&
       controller.request.headers["sec-fetch-dest"] == "empty" &&
       controller.kind_of?(CatalogController)
     )
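The exemption predicate in the hunk above combines three conditions: the action name, the `sec-fetch-dest` header, and the controller class. A small sketch of the same logic in plain Ruby, using `Array#include?` in place of ActiveSupport's `in?`; the `CatalogController` and `StubRequest` definitions here are hypothetical stubs so the lambda can be exercised outside Rails.

```ruby
# Hypothetical stubs standing in for the Rails request and controller.
StubRequest = Struct.new(:headers)

class CatalogController
  attr_reader :params, :request

  def initialize(action:, sec_fetch_dest:)
    @params = { action: action }
    @request = StubRequest.new({ "sec-fetch-dest" => sec_fetch_dest })
  end
end

# Exempt only fetch-style requests to the facet and range_limit actions of
# the catalog controller; everything else still gets challenged.
allow_exempt = ->(controller) {
  ["facet", "range_limit"].include?(controller.params[:action]) &&
    controller.request.headers["sec-fetch-dest"] == "empty" &&
    controller.kind_of?(CatalogController)
}

puts allow_exempt.call(CatalogController.new(action: "facet", sec_fetch_dest: "empty"))
puts allow_exempt.call(CatalogController.new(action: "index", sec_fetch_dest: "empty"))
```

As the README comment says, the header can be faked, so this narrows the exemption rather than securing it.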
@@ -117,6 +149,8 @@ Locally one way to test with a specific rails version appraisal is `bundle exec
 
 If you make any changes to `Gemfile` you may need to run `bundle exec appraisal install` and commit changes.
 
+**One reason tests are slow** is I think we're running system tests with real turnstile proof-of-work bot detection JS code? (Or is it, when we are are using a CF turnstile testing key that always passes?). There aren't many tests so it's no big deal, but this is something that could be investigated/optmized more potentially.
+
 ## Possible future features?
 
 * allow regex in default location_matcher? Easy to do if you want it, just say so.
data/app/controllers/bot_challenge_page/bot_challenge_page_controller.rb
CHANGED
@@ -19,9 +19,6 @@ module BotChallengePage
     # different paths in your app if you like, is why config is with controller
     class_attribute :bot_challenge_config, default: ::BotChallengePage::Config.new
 
-    delegate :cf_turnstile_js_url, :cf_turnstile_sitekey, :still_around_delay_ms, to: :bot_challenge_config
-    helper_method :cf_turnstile_js_url, :cf_turnstile_sitekey, :still_around_delay_ms
-
     SESSION_DATETIME_KEY = "t"
     SESSION_IP_KEY = "i"
 
@@ -29,19 +26,22 @@ module BotChallengePage
     class_attribute :_track_notification_subscription, instance_accessor: false
 
 
+    # only used if config.redirect_for_challenge is true
     def challenge
-      # possible custom render to choose layouts or templates, but
-      #
-
-
-
+      # possible custom render to choose layouts or templates, but
+      # default is what would be default template for this action
+      #
+      # We put it in instancevar as a hacky way of passing to template that can be fulfilled
+      # both here and in arbitrary controllers for direct render.
+      @bot_challenge_config = bot_challenge_config
+      instance_exec &self.bot_challenge_config.challenge_renderer
     end
 
     def verify_challenge
       body = {
         secret: self.bot_challenge_config.cf_turnstile_secret_key,
         response: params["cf_turnstile_response"],
-        remoteip: request.remote_ip
+        remoteip: request.remote_ip,
       }
 
       http = HTTP.timeout(self.bot_challenge_config.cf_timeout)
@@ -64,7 +64,10 @@ module BotChallengePage
         Rails.logger.warn("#{self.class.name}: Cloudflare Turnstile validation failed (#{request.remote_ip}, #{request.user_agent}): #{result}: #{params["dest"]}")
       end
 
-      #
+      # add config needed by JS to result
+      result["redirect_for_challenge"] = self.bot_challenge_config.redirect_for_challenge
+
+      # and let's just return the whole thing to client? Is there anything confidential there?
       render json: result
     rescue HTTP::Error, JSON::ParserError => e
       # probably an http timeout? or something weird.
data/app/controllers/concerns/bot_challenge_page/enforce_filter.rb
CHANGED
@@ -25,9 +25,20 @@ module BotChallengePage
           return
         end
 
-
-
-
+        # Prevent caching of bot challenge page
+        controller.response.headers["Cache-Control"] = "no-store"
+
+        if self.bot_challenge_config.redirect_for_challenge
+          # status code temporary
+          controller.redirect_to controller.bot_detect_challenge_path(dest: controller.request.original_fullpath), status: 307
+        else
+          # hacky way to get config to view template in an arbitrary controller, good enough for now
+          controller.instance_variable_set("@bot_challenge_config", self.bot_challenge_config) unless controller.instance_variable_get("@bot_challenge_config")
+          controller.instance_exec &self.bot_challenge_config.challenge_renderer
+        end
+
+        # allow app to see and log if desired
+        controller.instance_exec(self, &self.bot_challenge_config.after_blocked)
       end
     end
 
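The filter change above picks between two responses: a 307 redirect to the challenge route carrying the original path in `dest`, or an inline challenge page with status 403, with `Cache-Control: no-store` set either way. A rough sketch of that decision as a pure function, so it can run without Rails; `challenge_response`, the returned hash shape, and the hard-coded `/challenge` path are illustrative assumptions, not the gem's API.

```ruby
require "uri"

# Hypothetical pure-function model of the branch in the filter: returns a
# description of the response instead of calling redirect_to or render.
def challenge_response(redirect_for_challenge:, original_fullpath:)
  headers = { "Cache-Control" => "no-store" } # challenge pages must not be cached

  if redirect_for_challenge
    # Carry the originally requested path so the client can return to it.
    location = "/challenge?" + URI.encode_www_form(dest: original_fullpath)
    { status: 307, headers: headers.merge("Location" => location) }
  else
    # Inline mode: the protected URL itself answers with the challenge page.
    { status: 403, headers: headers, body: :inline_challenge_page }
  end
end

redirected = challenge_response(redirect_for_challenge: true, original_fullpath: "/catalog?q=a b")
inline = challenge_response(redirect_for_challenge: false, original_fullpath: "/catalog?q=a b")

puts redirected[:headers]["Location"]
puts inline[:status]
```

In inline mode no `dest` parameter is needed, since the browser is already at the protected URL and a reload is enough once the challenge passes.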
data/app/models/bot_challenge_page/config.rb
CHANGED
@@ -27,6 +27,10 @@ module BotChallengePage
       end
     end
 
+    # Should we redirect to a challenge page (true) or just display it inline
+    # with a 403 status (false)
+    attribute :redirect_for_challenge, default: false
+
     attribute :enabled, default: false # Must set to true to turn on at all
 
     attribute :cf_turnstile_sitekey, default: "1x00000000000000000000AA" # a testing key that always passes
@@ -55,7 +59,11 @@ module BotChallengePage
     attribute :allow_exempt, default: ->(controller, config) { false }
 
     # replace with say `->() { render layout: 'something' }`, or `render "somedir/some_template"`
-    attribute :challenge_renderer, default:
+    attribute :challenge_renderer, default: ->() {
+      render "bot_challenge_page/bot_challenge_page/challenge", status: 403
+    }
+
+    attribute :after_blocked, default: ->(bot_detect_class) {}
 
 
     # rate limit per subnet, following lehigh's lead, although we use a smaller
data/app/views/bot_challenge_page/_local_turnstile_script_tag.html.erb
CHANGED
@@ -1,6 +1,7 @@
+<%# locals: (bot_challenge_config:) -%>
+
 <%# we deliver our simple javascript as inline script to make deployment more
 reliable without having to deal with different asset pipelines, and it's really a fine choice anyway %>
-
 <script type="text/javascript">
   async function turnstileCallback(token) {
     try {
@@ -31,12 +32,6 @@
 
       result = await response.json();
       if (result["success"] == true) {
-        const dest = new URLSearchParams(window.location.search).get("dest");
-        // For security make sure it only has path and on
-        if (!dest.startsWith("/") || dest.startsWith("//")) {
-          throw new Error("illegal non-local redirect: " + dest);
-        }
-
         // in case this page stays around, (say it was rediret to media asset), let's add a failsafe message after
         // a couple seconds.
         const delay = document.querySelector("#botChallengePageStillAroundTemplate")?.getAttribute("data-still-around-delay-ms") || 1200;
@@ -44,8 +39,19 @@
           _displayStillAroundNote()
         }, delay);
 
-
-
+        if (result["redirect_for_challenge"] == true) {
+          const dest = new URLSearchParams(window.location.search).get("dest");
+          // For security make sure it only has path and on
+          if (!dest.startsWith("/") || dest.startsWith("//")) {
+            throw new Error("illegal non-local redirect: " + dest);
+          }
+
+          // replace the challenge page in history
+          window.location.replace(dest);
+        } else {
+          // just need to reload and now we'll get through
+          window.location.reload();
+        }
       } else {
         console.error("Turnstile response reported as failure: " + JSON.stringify(result))
         _displayChallengeError();
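The script above only follows `dest` when it starts with a single `/`, which rejects absolute URLs and protocol-relative `//host` targets. The same rule is sketched below in Ruby, for example if one wanted to apply it server-side as well; `local_redirect?` is a hypothetical helper, not something the gem defines, and it mirrors the check as written without trying to cover other encodings.

```ruby
# Hypothetical helper mirroring the client-side check: accept only values
# that look like a local path ("/...") and not a protocol-relative URL ("//...").
def local_redirect?(dest)
  return false unless dest.is_a?(String) # a missing dest param is never valid

  dest.start_with?("/") && !dest.start_with?("//")
end

puts local_redirect?("/catalog?q=maps")
puts local_redirect?("//evil.example/path")
puts local_redirect?("https://evil.example/")
```

Unlike the JavaScript, this version returns false for a missing `dest` instead of raising, which is a design choice of the sketch rather than behavior taken from the gem.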
data/app/views/bot_challenge_page/bot_challenge_page/challenge.html.erb
CHANGED
@@ -1,7 +1,7 @@
 <div class="bot_challenge_page">
   <h1 class="mb-4"><%= t('bot_challenge_page.title') %></h1>
 
-  <%= render "bot_challenge_page/turnstile_widget_placeholder" %>
+  <%= render "bot_challenge_page/turnstile_widget_placeholder", bot_challenge_config: @bot_challenge_config %>
 
   <noscript>
     <div class="alert alert-danger"><%= t('bot_challenge_page.noscript') %></div>
@@ -16,14 +16,14 @@
     </div>
   </template>
 
-  <template id="botChallengePageStillAroundTemplate" data-still_around_delay_ms="<%= still_around_delay_ms %>">
+  <template id="botChallengePageStillAroundTemplate" data-still_around_delay_ms="<%= @bot_challenge_config.still_around_delay_ms %>">
     <div class="alert alert-info" role="alert">
       <i class="fa fa-info-circle" aria-hidden="true"></i>
       <%= t('bot_challenge_page.still_around') %>
     </div>
   </template>
 
-  <script src="<%= cf_turnstile_js_url %>" async defer></script>
+  <script src="<%= @bot_challenge_config.cf_turnstile_js_url %>" async defer></script>
 
-  <%= render "bot_challenge_page/local_turnstile_script_tag" %>
+  <%= render "bot_challenge_page/local_turnstile_script_tag", bot_challenge_config: @bot_challenge_config %>
 </div>
data/lib/generators/bot_challenge_page/install_generator.rb
CHANGED
@@ -3,10 +3,14 @@ module BotChallengePage
     source_root File.expand_path("templates", __dir__)
 
     class_option :'rack_attack', type: :boolean, default: true, desc: "Support rate-limit allowance configuration"
+    class_option :redirect_for_challenge, type: :boolean, default: false, desc: "Redirect to separate challenge page instead of inline challenge"
 
     def generate_routes
-      route '
-
+      route 'post "/challenge", to: "bot_challenge_page/bot_challenge_page#verify_challenge", as: :bot_detect_challenge'
+
+      if options[:redirect_for_challenge]
+        route 'get "/challenge", to: "bot_challenge_page/bot_challenge_page#challenge"'
+      end
     end
 
     def add_before_filter_enforcement
data/lib/generators/bot_challenge_page/templates/initializer.rb.erb
CHANGED
@@ -4,9 +4,14 @@ Rails.application.config.to_prepare do
 
   # Get from CloudFlare Turnstile: https://www.cloudflare.com/application-services/products/turnstile/
   # Some testing keys are also available: https://developers.cloudflare.com/turnstile/troubleshooting/testing/
+  #
+  # Always pass testing sitekey: "1x00000000000000000000AA"
   BotChallengePage::BotChallengePageController.bot_challenge_config.cf_turnstile_sitekey = "MUST GET"
+  # Always pass testing secret_key: "1x0000000000000000000000000000000AA"
   BotChallengePage::BotChallengePageController.bot_challenge_config.cf_turnstile_secret_key = "MUST GET"
 
+  BotChallengePage::BotChallengePageController.bot_challenge_config.redirect_for_challenge = <%= options[:redirect_for_challenge] %>
+
   <%- if options[:rack_attack] %>
   # What paths do you want to protect?
   #
@@ -33,7 +38,7 @@ Rails.application.config.to_prepare do
   # BotChallengePage::BotChallengePageController.bot_challenge_config.session_passed_good_for = 36.hours
 
   # Exempt some requests from bot challenge protection
-  # BotChallengePage::BotChallengePageController.allow_exempt = ->(controller) {
+  # BotChallengePage::BotChallengePageController.bot_challenge_config.allow_exempt = ->(controller) {
   #   # controller.params
   #   # controller.request
   #   # controller.session
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: bot_challenge_page
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.3.0
 platform: ruby
 authors:
 - Jonathan Rochkind
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-03-
+date: 2025-03-19 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: appraisal