selenium_tor 1.3.0 → 1.4.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +12 -2
- data/README.md +40 -38
- data/lib/tor/driver.rb +7 -2
- data/lib/tor/tor_process.rb +19 -15
- data/lib/tor/version.rb +1 -1
- metadata +2 -3
- data/lib/string_extensions.rb +0 -12
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 89802b75ed87505f2f9ba8671aeb649b66db4a7fd63a9f44f695acc3b7bcce18
+  data.tar.gz: fcaa6f71fbac4a9ba4a165ab6919c1c7160fb01b0d3d79a8b2c4db55bd65dec0
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 01fb59e8597bf62a649785f65ea8c616479384e8a18c376954981d7dedc978cb063dca1f8ce280a23dfbe1c05a0ca538e8115b3c6b4c5fd4a1c8cb56ebf173fd
+  data.tar.gz: 91cb32514aef461e29695b1e5e5470e439c8e0c407ec19229adb84b9207e7febc94cde429c235a809ad4db8cca37fb57c6f857d909b5589e408970bbbdadfb0e
data/CHANGELOG.md
CHANGED
@@ -1,10 +1,20 @@
 ## master (unreleased)
 
-## [1.
+## [1.4.0] - 2024-08-07
+
+### Breaking changes
+
+* only :socks_port and :control_port symbols are valid in :tor_opts
+
+### New features
+
+* add ability to set tor bootstrap timeout with tor_opts
 
 ### Bug fixes
 
-* [#
+* [#8](https://gitlab.com/matzfan/selenium-tor/-/issues/8)
+
+## [1.3.0] - 2024-08-05
 
 ### New features
 
data/README.md
CHANGED
@@ -5,8 +5,14 @@
 
 A Selenium extension for Tor Browser.
 ```ruby
-
-@driver.
+options = Selenium::WebDriver::Tor::Options.new
+@driver = Selenium::WebDriver.for :tor, options: options
+```
+Once you have a driver instance:
+```ruby
+@driver.get 'https://check.torproject.org'
+@driver.title
+# => Congratulations. This browser is configured to use Tor.
 ```
 
 ## \_why?
@@ -15,14 +21,6 @@ I can use Firefox with Selenium and set a SOCKS proxy to use the Tor network, so
 
 The above approach will hide your IP, but there is a good chance your browser's unique or near-unique fingerprint may be logged by site owners. Subsequent visits could identify you. A primary aim of this project is to enable Selenium to leverage Tor Browser's unique anonymity characteristics - in particular its resistance to browser fingerprinting. The aim is to ensure Selenium Tor site visits leave an identical fingerprint to the thousands of regular Tor Browser users.
 
-## Known issues
-
-Known issues are recorded [here](https://gitlab.com/matzfan/selenium-tor/-/issues).
-
-### Fingerprinting
-
-The gem uses Xvfb to allow Tor Browser to be manipulated headlessly. Xvfb in turn uses `llvmpipe` software acceleration which may lead to issues with WebGL fingerprinting - see [issue #7](https://gitlab.com/matzfan/selenium-tor/-/issues/7). If this is a problem we recommend using [VirtualGL](https://www.virtualgl.org) to force Xvfb to use your hardware driver instead. This is done by prepending your executable code with the command `vglrun`. For an example see the section below on testing. VirtualGL can be installed from a [package repo](https://virtualgl.org/Downloads/YUM).
-
 ## Installation
 
 Install the gem and add to the application's Gemfile by executing:
@@ -33,27 +31,41 @@ If bundler is not being used to manage dependencies, install the gem by executin
 
     $ gem install selenium_tor
 
+## Dependencies and configuration
+
+[Tor Browser](https://www.torproject.org/download).
+
+The shared libraries Tor Bowser requires will be available if you have Firefox or Firefox ESR installed on your system. If not, do `sudo apt install firefox-esr` or the equivalent for your package manager.
+
+As with Firefox browser, `geckodriver` needs to be installed and in your PATH.
+
+The gem needs to know the location of the Tor Browser Bundle (TBB). The Tor Browser download package archive must be extracted and the root TBB directory (named "tor-browser") placed somewhere on your system. By default it is assumed to be in the current user's HOME directory. An alternative location can be set via the env var `TOR_BROWSER_ROOT_DIR` - e.g. `export TOR_BROWSER_ROOT_DIR=/home/<user>/Downloads`. The Tor Browser binary location is *automatically set* by reference to this directory, so there is no need to do this:
+```ruby
+options = Selenium::WebDriver::Tor::Options.new
+options.binary = '/some/path/to/tor_firefox_binary' # UNNECESSARY
+```
+
+The location of the TBB is not expected to change during code execution.
+
+Tor Selenium is tested on **Linux only** right now.
+
 ## Usage
 ```ruby
 require 'selenium_tor'
 ```
 Tor Browser is based on Firefox, so for usage please read the Selenium [docs](https://www.selenium.dev/documentation/webdriver/browsers/firefox) for Firefox browser.
 
-
-```ruby
-options = Selenium::WebDriver::Tor::Options.new
-@driver = Selenium::WebDriver.for :tor, options: options
+If the tor network is inaccessible for any reason a `Selenium::WebDriver::Error::TimeoutError` will result.
 
-
-
-
-@driver.quit
-```
-If the network is inaccessible for any reason a `TorNetworkError` will result.
+A separate tor process is used for each driver. Failure to call `Driver#quit` after code execution may leave orphaned tor processes.
+
+### Tor options
 
-
+In addition to the regular Firefox options, a :tor_opts key may be passed to an instance of `Tor::Options` with a hash of tor options. All valid tor options are recognized - see `man tor`. For convenience, "SocksPort" and "ControlPort" options may be set using snake_case symbols - i.e. :socks_port and :control_port. Additionally, a :tor_opts timeout value may be set with the :timeout key. This overrides the default time allowed for the tor process to bootstrap (10 seconds).
 
-
+### Multiple driver instances (headless drivers only)
+
+Running multiple tor processes requires that each uses different ports for SocksPort (and ControlPort, if used). These must be passed using the `:tor_opts` key. An example using the [Parallel gem](https://rubygems.org/gems/parallel):
 ```ruby
 require 'parallel'
 
@@ -77,7 +89,7 @@ end
 ```
 ### Tor Browser specific functionality
 
-You can get and set the
+You can get and set the security level (shield icon in TB) as follows. 4 is 'Standard', 2 is 'Safer', 1 is 'Safest'.
 ```ruby
 @driver.security_level
 # => 4 - the default value
@@ -114,23 +126,13 @@ Tor::TBB_EXTENSIONS_DIR # path to the 'extensions' directory in the above
 Tor::TBB_VERSION # the version installed, e.g. "13.0.1", note: driver.capabilities.browser_version returns the Firefox version Tor Browser is based on
 ```
 
-##
-
-[Tor Browser](https://www.torproject.org/download).
-
-The shared libraries Tor Bowser requires will be available if you have Firefox or Firefox ESR installed on your system. If not, do `sudo apt install firefox-esr` or similar.
-
-As with Firefox browser, `geckodriver` needs to be installed and in your PATH.
+## Known issues
 
-
-```ruby
-options = Selenium::WebDriver::Tor::Options.new
-options.binary = '/some/path/to/tor_firefox_binary' # UNNECESSARY
-```
+Known issues are recorded [here](https://gitlab.com/matzfan/selenium-tor/-/issues).
 
-
+### Fingerprinting
 
-Tor
+The gem uses Xvfb to allow Tor Browser to be manipulated headlessly. Xvfb in turn uses `llvmpipe` software acceleration which may lead to issues with WebGL fingerprinting - see [issue #7](https://gitlab.com/matzfan/selenium-tor/-/issues/7). If this is a problem we recommend using [VirtualGL](https://www.virtualgl.org) to force Xvfb to use your hardware driver instead. This is done by prepending your executable code with the command `vglrun`. For an example see the section below on testing. VirtualGL can be installed from a [package repo](https://virtualgl.org/Downloads/YUM).
 
 ## Testing
 
@@ -140,7 +142,7 @@ Tests are run in the display set by the `DISPLAY` env var, usually :0 by default
 
     $ DISPLAY=:99 bundle exec rake
 
-If you find
+If you find driver instantiation failing with port bind failure error messages ( these include "Address already in use. Is Tor already running?") check you have no other tor processes running with conflicting ports. With timeout errors, you may also want to check the Tor network is actually up, it isn't always..
 
 If you wish run the `vglrun` WebGL fingerprint test install VirtualGL (see above, assumes you have the relevent drivers installed) and run the following command:
 
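The README's description of `:tor_opts` keys can be illustrated with a small standalone sketch. It copies the `transform_keys` expression that this release adds to `tor_process.rb` into a free-standing method purely for illustration; the sample option values (`9150`, `9151`, `'{us}'`) are assumptions, not values taken from the gem.

```ruby
# Standalone copy of the key-mapping expression from this diff (illustration only).
def map_opts_to_torrc_keys(opts)
  opts.transform_keys { |k| k.is_a?(String) ? k : k.to_s.split('_').map(&:capitalize).join }
end

# Hypothetical option values: the symbols are snake_case,
# the String key is already a raw torrc option name.
tor_opts = { socks_port: 9150, control_port: 9151, 'ExitNodes' => '{us}' }

mapped = map_opts_to_torrc_keys(tor_opts)
puts mapped.keys.join(', ')
```

Symbol keys come out as `SocksPort` and `ControlPort`, while String keys such as `ExitNodes` pass through unchanged, which appears to be the behaviour the changelog entry on valid symbols refers to.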
data/lib/tor/driver.rb
CHANGED
@@ -69,12 +69,17 @@ module Selenium
         end
 
         def create_tor_process_and_start_tor(opts)
-
-          @tor_process
+          timeout = opts.delete :timeout
+          @tor_process = tor_process(opts)
+          @tor_process.start_tor(timeout: timeout)
         rescue Tor::TorProcess::TorProcessError => e
           raise Error::WebDriverError, e
         end
 
+        def tor_process(opts)
+          TorProcess.new(@data_dir, opts || {})
+        end
+
         def domain
           URI(current_url).host&.match(/[^\.]+\.\w+$/)
         end
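One interaction in the hunk above may be worth checking: `opts.delete :timeout` returns `nil` when no `:timeout` key is present, and passing `timeout: nil` explicitly bypasses a keyword default such as `timeout: 10`. Whether this matters depends on code not shown in this diff (for example, whether the options object always supplies a timeout), so the sketch below only demonstrates the Ruby behaviour. The method names `start` and `start_with_fallback` are hypothetical stand-ins, not part of the gem.

```ruby
# Hypothetical stand-in for a method with a keyword default, like start_tor(timeout: 10).
def start(timeout: 10)
  timeout
end

# Hypothetical variant: drop a nil value so the keyword default still applies.
def start_with_fallback(opts)
  timeout = opts.delete(:timeout)
  start(**{ timeout: timeout }.compact)
end

opts = { socks_port: 9150 } # no :timeout key supplied
explicit_nil = start(timeout: opts.delete(:timeout))
fallback     = start_with_fallback({ socks_port: 9150 })

puts explicit_nil.inspect # nil: the default of 10 was not used
puts fallback             # 10
```

This matters because `Timeout.timeout(nil)` runs its block with no time limit, so a `nil` reaching `start_tor` would disable the bootstrap timeout rather than apply the 10-second default.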
data/lib/tor/tor_process.rb
CHANGED
@@ -2,15 +2,12 @@
 
 require 'timeout'
 require_relative 'torrc'
-require_relative '../string_extensions'
 
 module Selenium
   module WebDriver
     module Tor
       # Respresentation of a tor process
       class TorProcess
-        include StringExtensions
-
         class TorProcessError < StandardError; end
 
         BOOTSTRAP_SUCCESS_REGEX = /Bootstrapped 100% \(done\): Done$/
@@ -28,12 +25,15 @@ module Selenium
           @config ||= setup_config # Hash to store torrc config
         end
 
-        def start_tor
+        def start_tor(timeout: 10)
           r, io = IO.pipe
-          pid = Process.spawn
+          pid = Process.spawn tor_command, out: io, err: :out
           io.close
-          errors =
+          errors = parse_tor_bootstrap_errors_with_timeout(r, timeout: timeout)
           errors.empty? ? @pid = pid : raise(TorProcessError, "Tor failed to start with errors:\n\n#{errors}")
+        rescue Timeout::Error
+          Process.kill 'KILL', pid if pid
+          raise Error::TimeoutError, "Tor not bootstrapped after #{timeout} seconds"
         ensure
           r.close
         end
@@ -44,7 +44,7 @@ module Selenium
           @pid = nil
         end
 
-
+        private
 
         def valid_data_dir?(dir)
           msg = 'data_dir must exist and be a dir'
@@ -52,9 +52,7 @@ module Selenium
         end
 
         def map_opts_to_torrc_keys(opts)
-
-
-          opts.transform_keys { |k| k.to_s.split('_').map(&:capitalize).join }
+          opts.transform_keys { |k| k.is_a?(String) ? k : k.to_s.split('_').map(&:capitalize).join }
         end
 
         def setup_config
@@ -62,12 +60,18 @@ module Selenium
           @torrc.config
         end
 
-        def
+        def tor_command
+          "#{TBB_TOR_BINARY_PATH} -f #{@torrc.path}"
+        end
+
+        def parse_tor_bootstrap_errors_with_timeout(io, timeout:)
           lines = []
-
-
-
-
+          Timeout.timeout timeout do
+            io.each_line do |line|
+              lines << line
+              break lines.join if line.match BOOTSTRAP_FAIL_REGEX
+              break '' if line.match BOOTSTRAP_SUCCESS_REGEX
+            end
           end
         end
       end
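The bootstrap check introduced in this file reads the tor process output line by line inside a `Timeout.timeout` block. A minimal standalone sketch of that pattern follows, using an in-process pipe in place of a spawned tor process. The success pattern is the one shown in the diff; `FAIL_REGEX`, the method name `bootstrap_errors`, and the log lines are assumptions for illustration, since the definition of `BOOTSTRAP_FAIL_REGEX` does not appear in this diff.

```ruby
require 'timeout'

SUCCESS_REGEX = /Bootstrapped 100% \(done\): Done$/ # as shown in this diff
FAIL_REGEX    = /\[err\]/                           # hypothetical; the real pattern is not shown

# Returns '' on success, or the log collected so far when a failure line is seen.
# Raises Timeout::Error if neither pattern appears within `timeout` seconds.
def bootstrap_errors(io, timeout:)
  lines = []
  Timeout.timeout(timeout) do
    io.each_line do |line|
      lines << line
      break lines.join if line.match?(FAIL_REGEX)
      break '' if line.match?(SUCCESS_REGEX)
    end
  end
end

# Simulate log output with a pipe instead of spawning tor (illustrative lines).
reader, writer = IO.pipe
writer.puts '[notice] Bootstrapped 50%'
writer.puts '[notice] Bootstrapped 100% (done): Done'
writer.close
ok = bootstrap_errors(reader, timeout: 1)
reader.close

# A pipe that never produces a matching line runs into the timeout.
silent_reader, silent_writer = IO.pipe
timed_out =
  begin
    bootstrap_errors(silent_reader, timeout: 0.2)
    false
  rescue Timeout::Error
    true
  ensure
    silent_reader.close
    silent_writer.close
  end

puts "success returns empty string: #{ok == ''}"
puts "silent pipe timed out: #{timed_out}"
```

In the gem itself the `Timeout::Error` is rescued in `start_tor`, which kills the spawned process and re-raises as `Error::TimeoutError`; the sketch leaves that part out.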
data/lib/tor/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: selenium_tor
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.4.0
 platform: ruby
 authors:
 - MatzFan
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2024-08-
+date: 2024-08-07 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: selenium-webdriver
@@ -46,7 +46,6 @@ files:
|
|
46
46
|
- lib/options.rb
|
47
47
|
- lib/selenium_tor.rb
|
48
48
|
- lib/service.rb
|
49
|
-
- lib/string_extensions.rb
|
50
49
|
- lib/tor/driver.rb
|
51
50
|
- lib/tor/options.rb
|
52
51
|
- lib/tor/profile.rb
|