bot_challenge_page 0.10.0 → 0.11.0
- checksums.yaml +4 -4
- data/README.md +20 -17
- data/app/controllers/concerns/bot_challenge_page/guard_action.rb +10 -0
- data/lib/bot_challenge_page/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3630f38384be644c246e841ca6d582b1e942b232ae46f5917eef1dbd2f65c288
+  data.tar.gz: 1e74969897aaa0fecc0bce2498c4abedf0049655f98ff5db07eb77c503e924ba
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 20e0594a23e03d59a1655136713f1a5cdccd6de8f644e41f300644eada64442d58855da93a907609c711092f407e9fb2d835dba2f635b5b07af3f6874857c24a
+  data.tar.gz: '03358909eb4c97a61a558e93a55b736cb91ddee70d9892765f3b00877acc5ef6a1ef725dfbd669b2ff7d9e31b6e629f62423028ca94d346795f98a1df8db646d'
data/README.md
CHANGED
@@ -31,7 +31,7 @@ The motivating use case is fairly dumb (probably AI-related) crawlers crawling s
 * Configure in the generated `./config/initializers/bot_challenge_page.rb`
 * At a minimum you need to configure your Cloudflare Turnstile keys

-* Some other configuration options are offered -- more advanced/specialized ones are available that are not mentioned in generated config file, see [Config class](./
+* Some other configuration options are offered -- more advanced/specialized ones are available that are not mentioned in generated config file, see [Config class](./lib/bot_challenge_page/config.rb)

 ## Protect some paths

@@ -39,15 +39,15 @@ You can add `bot_challenge` to a controller to protect all actions in that contr

 You can also use all the Rails `before_action` params to apply to only some actions or requests in that controller: `only` and `except` to specify actions; and `if` and `unless` to specify procs to filter individual requests.

-
-
-
+* Note that we can only protect GET paths, and also think about making sure you DON'T protect
+any path your front-end needs JS `fetch` access to, as this would block it (at least
+without custom front-end code we haven't really explored)

-
+* If you are tempted to just protect `/` that may work, but you may need to exclude hearbeat paths, front-end (AJAX) requestable paths, API endpoints, uptime checker requests, or other machine-access-desired paths. These may be good candidates for an `unless` parameter, or the `skip_when` configuration.

-
+* The author is a librarian who believes maintaining machine access in general is a public good, and tries to limit access with a bot challenge to the minimum paths necessary for app sustainability.

-
+* The default configuration only allows re-use of a 'pass' cookie from requests with same IP address subnet and user-agent-related headers. This can be customized.

 ```ruby
 class WidgetController < ApplicationController
@@ -92,7 +92,7 @@ The challenge page by default will be displayed in your app's default rails `lay
 To customize the layout or challenge page HTML more further, you can use configuration to supply a `render` method for the controller pointing to your own templates or other layouts. You will probably want to re-use the partials we use in our default template, for standard functionality. And you'll want to provide `<template>` elements with the same id's for those elements, but can put whatever you want inside the templates!

 ```ruby
-
+config.challenge_renderer = ()-> {
   render "my_local_view_folder/whatever", layout "another_layout"
 }
 ```
@@ -107,7 +107,7 @@ If you'd like to log or observe challenge issues, you can configure a proc that
 in the context of the controller, and is called when a page is blocked by a challenge.

 ```ruby
-
+config.after_blocked = (_bot_challenge_class)-> {
   logger.info("page blocked by challenge: #{request.uri}")
 }
 ```
@@ -115,7 +115,7 @@ BotChallengePage::BotChallengePageController.bot_challenge_config.after_blocked
 Or, here's how I managed to get it in [lograge](https://github.com/roidrage/lograge), so a page blocked results in a `bot_chlng=true` param in a lograge line.

 ```ruby
-
+config.after_blocked =
   ->(bot_detect_class) {
     request.env["bot_detect.blocked_for_challenge"] = true
   }
@@ -129,6 +129,8 @@ config.lograge.custom_payload do |controller|
 end
 ```

+Later, however, using similar mechanism, I actually suppressed logging of actions that resulted in bot challenges altogether -- they were exhausting my log platform quota.
+
 ## Example possible Blacklight config

 Many of us in my professional community use [blacklight](https://github.com/projectblacklight/blacklight). Here's a possible sample blacklight config to:
@@ -143,18 +145,18 @@ Many of us in my professional community use [blacklight](https://github.com/proj

 ```ruby
 # ./config/initializers/bot_challenge_page.rb
-
-
+BotChanngePage.configure do
+  config.enabled = true

   # Need to set store to a Rails cache store other than null store, if you want to track
-  # rate limits.
-
+  # rate limits. We chooes to use a different store than Rails.cache.
+  config.store = ActiveSupport::Cache::RedisCacheStore.new(url: $some_redis_url)

   # Get from CloudFlare Turnstile: https://www.cloudflare.com/application-services/products/turnstile/
-
-
+  config.cf_turnstile_sitekey = "MUST GET"
+  config.cf_turnstile_secret_key = "MUST GET"

-
+  config.skip_when = ->(config) {
     # Exempt honeybadger token to allow HB uptime checker in
     # https://docs.honeybadger.io/guides/security/
     (
@@ -193,6 +195,7 @@ class CatalogController < ApplicationController
 }

 end
+```

 ## Development and automated testing

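The README hunks above describe `challenge_renderer` and `after_blocked` as procs that run in the context of the controller. The `()-> {` and `(_bot_challenge_class)-> {` spellings in the added lines do not appear to be valid Ruby lambda literals; the literal form puts the parameter list after the arrow. Below is a minimal sketch of that form run through `instance_exec`. `FakeController`, `request_path`, and `log` are hypothetical stand-ins for illustration, not part of the gem or of Rails.

```ruby
# Hypothetical stand-in for a Rails controller; not part of the gem.
class FakeController
  attr_reader :log

  def initialize(path)
    @path = path
    @log = []
  end

  # Stand-in for Rails' `request`, reduced to a path string.
  def request_path
    @path
  end
end

# A lambda literal puts its parameter list after the arrow.
after_blocked = ->(_bot_challenge_class) {
  # `self` here is whatever object the proc is instance_exec'd against,
  # so `log` and `request_path` resolve to that object's methods.
  log << "page blocked by challenge: #{request_path}"
}

controller = FakeController.new("/catalog")
# Run the proc with `self` set to the controller, passing one argument.
controller.instance_exec(Object, &after_blocked)
```

Because the proc is evaluated against the controller instance, controller helpers are reachable inside it without being passed in explicitly.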
data/app/controllers/concerns/bot_challenge_page/guard_action.rb
CHANGED
@@ -31,6 +31,16 @@ module BotChallengePage
 else
   # hacky way to get config to view template in an arbitrary controller, good enough for now
   controller.instance_variable_set("@bot_challenge_config", self.bot_challenge_config) unless controller.instance_variable_get("@bot_challenge_config")
+
+  # set preload HTTP header with turnstile url for better page speed
+  # May or may not be one there already, we can always add on
+  preload_link_value = %Q{<#{self.bot_challenge_config.cf_turnstile_js_url}>; rel=preload; as=script}
+  if controller.headers["link"].present?
+    controller.headers["link"] += ",#{preload_link_value}"
+  else
+    controller.headers["link"] = "#{preload_link_value}"
+  end
+
   controller.instance_exec &self.bot_challenge_config.challenge_renderer
 end

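The hunk above appends a `rel=preload` entry to any existing `Link` response header, joining multiple values with a comma. A minimal standalone sketch of that append-or-set logic follows, using a plain Hash in place of the controller's headers. `add_preload_link` is a hypothetical helper name, and the script URL is an assumed value standing in for `cf_turnstile_js_url`, which this diff does not show.

```ruby
# Hypothetical helper, not part of the gem: sets or extends headers["link"]
# with a preload entry for the given script URL, then returns the hash.
def add_preload_link(headers, script_url)
  preload = "<#{script_url}>; rel=preload; as=script"
  existing = headers["link"]
  headers["link"] =
    if existing.nil? || existing.empty?
      preload                      # no Link header yet: set it
    else
      "#{existing},#{preload}"     # comma-join onto the existing value
    end
  headers
end

# Assumed script URL, used only for illustration.
url = "https://challenges.cloudflare.com/turnstile/v0/api.js"

fresh = add_preload_link({}, url)
combined = add_preload_link({ "link" => "</app.css>; rel=preload; as=style" }, url)
```

The plain `nil?`/`empty?` check stands in for Rails' `present?` so the sketch runs without ActiveSupport.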
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: bot_challenge_page
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.11.0
 platform: ruby
 authors:
 - Jonathan Rochkind
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2025-08-
+date: 2025-08-28 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: appraisal