selenium_tor 1.3.0 → 1.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: fcf1656f54ce23992e788e76b1d6342157b6ba3996ec95f04c955a2857e9b2be
4
- data.tar.gz: a9ef924f85fdd1370638ff3c12f9009ec1b2ab1b65b5a872bb8bd3d3e3e6e4f5
3
+ metadata.gz: 89802b75ed87505f2f9ba8671aeb649b66db4a7fd63a9f44f695acc3b7bcce18
4
+ data.tar.gz: fcaa6f71fbac4a9ba4a165ab6919c1c7160fb01b0d3d79a8b2c4db55bd65dec0
5
5
  SHA512:
6
- metadata.gz: 68018127837d8e59dd60647d9a0d60da4af363da8fad7104ac6ed52bf15844d7ead4d4cbc81f1b2403e64ad2fdda55f488ce725ca71b23c618686fb3ed5dbde3
7
- data.tar.gz: 3a6bd3db60f2a9de62e232774edabb11e4ccd7bb6e48158c4b6709339f8c964e56f4817f78d471d8ae57a970fec25b325a76a4c3779a3ee67aee8bdc2ad5250f
6
+ metadata.gz: 01fb59e8597bf62a649785f65ea8c616479384e8a18c376954981d7dedc978cb063dca1f8ce280a23dfbe1c05a0ca538e8115b3c6b4c5fd4a1c8cb56ebf173fd
7
+ data.tar.gz: 91cb32514aef461e29695b1e5e5470e439c8e0c407ec19229adb84b9207e7febc94cde429c235a809ad4db8cca37fb57c6f857d909b5589e408970bbbdadfb0e
data/CHANGELOG.md CHANGED
@@ -1,10 +1,20 @@
1
1
  ## master (unreleased)
2
2
 
3
- ## [1.3.0] - 2024-08-05
3
+ ## [1.4.0] - 2024-08-07
4
+
5
+ ### Breaking changes
6
+
7
+ * only :socks_port and :control_port symbols are valid in :tor_opts
8
+
9
+ ### New features
10
+
11
+ * add ability to set tor bootstrap timeout with tor_opts
4
12
 
5
13
  ### Bug fixes
6
14
 
7
- * [#14](https://gitlab.com/matzfan/selenium-tor/-/issues/14)
15
+ * [#8](https://gitlab.com/matzfan/selenium-tor/-/issues/8)
16
+
17
+ ## [1.3.0] - 2024-08-05
8
18
 
9
19
  ### New features
10
20
 
data/README.md CHANGED
@@ -5,8 +5,14 @@
5
5
 
6
6
  A Selenium extension for Tor Browser.
7
7
  ```ruby
8
- @driver = Selenium::WebDriver.for :tor
9
- @driver.quit
8
+ options = Selenium::WebDriver::Tor::Options.new
9
+ @driver = Selenium::WebDriver.for :tor, options: options
10
+ ```
11
+ Once you have a driver instance:
12
+ ```ruby
13
+ @driver.get 'https://check.torproject.org'
14
+ @driver.title
15
+ # => Congratulations. This browser is configured to use Tor.
10
16
  ```
11
17
 
12
18
  ## \_why?
@@ -15,14 +21,6 @@ I can use Firefox with Selenium and set a SOCKS proxy to use the Tor network, so
15
21
 
16
22
  The above approach will hide your IP, but there is a good chance your browser's unique or near-unique fingerprint may be logged by site owners. Subsequent visits could identify you. A primary aim of this project is to enable Selenium to leverage Tor Browser's unique anonymity characteristics - in particular its resistance to browser fingerprinting. The aim is to ensure Selenium Tor site visits leave an identical fingerprint to the thousands of regular Tor Browser users.
17
23
 
18
- ## Known issues
19
-
20
- Known issues are recorded [here](https://gitlab.com/matzfan/selenium-tor/-/issues).
21
-
22
- ### Fingerprinting
23
-
24
- The gem uses Xvfb to allow Tor Browser to be manipulated headlessly. Xvfb in turn uses `llvmpipe` software acceleration which may lead to issues with WebGL fingerprinting - see [issue #7](https://gitlab.com/matzfan/selenium-tor/-/issues/7). If this is a problem we recommend using [VirtualGL](https://www.virtualgl.org) to force Xvfb to use your hardware driver instead. This is done by prepending your executable code with the command `vglrun`. For an example see the section below on testing. VirtualGL can be installed from a [package repo](https://virtualgl.org/Downloads/YUM).
25
-
26
24
  ## Installation
27
25
 
28
26
  Install the gem and add to the application's Gemfile by executing:
@@ -33,27 +31,41 @@ If bundler is not being used to manage dependencies, install the gem by executin
33
31
 
34
32
  $ gem install selenium_tor
35
33
 
34
+ ## Dependencies and configuration
35
+
36
+ [Tor Browser](https://www.torproject.org/download).
37
+
38
+ The shared libraries Tor Browser requires will be available if you have Firefox or Firefox ESR installed on your system. If not, do `sudo apt install firefox-esr` or the equivalent for your package manager.
39
+
40
+ As with Firefox browser, `geckodriver` needs to be installed and in your PATH.
41
+
42
+ The gem needs to know the location of the Tor Browser Bundle (TBB). The Tor Browser download package archive must be extracted and the root TBB directory (named "tor-browser") placed somewhere on your system. By default it is assumed to be in the current user's HOME directory. An alternative location can be set via the env var `TOR_BROWSER_ROOT_DIR` - e.g. `export TOR_BROWSER_ROOT_DIR=/home/<user>/Downloads`. The Tor Browser binary location is *automatically set* by reference to this directory, so there is no need to do this:
43
+ ```ruby
44
+ options = Selenium::WebDriver::Tor::Options.new
45
+ options.binary = '/some/path/to/tor_firefox_binary' # UNNECESSARY
46
+ ```
47
+
48
+ The location of the TBB is not expected to change during code execution.
49
+
50
+ Tor Selenium is tested on **Linux only** right now.
51
+
36
52
  ## Usage
37
53
  ```ruby
38
54
  require 'selenium_tor'
39
55
  ```
40
56
  Tor Browser is based on Firefox, so for usage please read the Selenium [docs](https://www.selenium.dev/documentation/webdriver/browsers/firefox) for Firefox browser.
41
57
 
42
- A driver is instantiated like this:
43
- ```ruby
44
- options = Selenium::WebDriver::Tor::Options.new
45
- @driver = Selenium::WebDriver.for :tor, options: options
58
+ If the tor network is inaccessible for any reason a `Selenium::WebDriver::Error::TimeoutError` will result.
46
59
 
47
- @driver.get 'https://check.torproject.org'
48
- @driver.title
49
- # => Congratulations. This browser is configured to use Tor.
50
- @driver.quit
51
- ```
52
- If the network is inaccessible for any reason a `TorNetworkError` will result.
60
+ A separate tor process is used for each driver. Failure to call `Driver#quit` after code execution may leave orphaned tor processes.
61
+
62
+ ### Tor options
53
63
 
54
- ### Multiple driver instances
64
+ In addition to the regular Firefox options, a :tor_opts key may be passed to an instance of `Tor::Options` with a hash of tor options. All valid tor options are recognized - see `man tor`. For convenience, "SocksPort" and "ControlPort" options may be set using snake_case symbols - i.e. :socks_port and :control_port. Additionally, a :tor_opts timeout value may be set with the :timeout key. This overrides the default time allowed for the tor process to bootstrap (10 seconds).
55
65
 
56
- Running multiple `tor` processes requires that each uses different ports for SocksPort and ControlPort. These and other valid tor options can be passed using the `:tor_opts` key. Recognized options are snake_case equivalents of the camel case options reconized by tor. For a list see `man tor`. An example using the [Parallel gem](https://rubygems.org/gems/parallel):
66
+ ### Multiple driver instances (headless drivers only)
67
+
68
+ Running multiple tor processes requires that each uses different ports for SocksPort (and ControlPort, if used). These must be passed using the `:tor_opts` key. An example using the [Parallel gem](https://rubygems.org/gems/parallel):
57
69
  ```ruby
58
70
  require 'parallel'
59
71
 
@@ -77,7 +89,7 @@ end
77
89
  ```
78
90
  ### Tor Browser specific functionality
79
91
 
80
- You can get and set the secuirty level (shield icon in TB) as follows. 4 is 'Standard', 2 is 'Safer', 1 is 'Safest'.
92
+ You can get and set the security level (shield icon in TB) as follows. 4 is 'Standard', 2 is 'Safer', 1 is 'Safest'.
81
93
  ```ruby
82
94
  @driver.security_level
83
95
  # => 4 - the default value
@@ -114,23 +126,13 @@ Tor::TBB_EXTENSIONS_DIR # path to the 'extensions' directory in the above
114
126
  Tor::TBB_VERSION # the version installed, e.g. "13.0.1", note: driver.capabilities.browser_version returns the Firefox version Tor Browser is based on
115
127
  ```
116
128
 
117
- ## Dependencies and configuration
118
-
119
- [Tor Browser](https://www.torproject.org/download).
120
-
121
- The shared libraries Tor Bowser requires will be available if you have Firefox or Firefox ESR installed on your system. If not, do `sudo apt install firefox-esr` or similar.
122
-
123
- As with Firefox browser, `geckodriver` needs to be installed and in your PATH.
129
+ ## Known issues
124
130
 
125
- The gem needs to know the location of the Tor Browser Bundle (TBB). The Tor Browser download package archive must be extracted and the root TBB directory (named "tor-browser") placed somewhere on your system. By default it is assumed to be in the current user's HOME directory. An alternative location can be set via the env var `TOR_BROWSER_ROOT_DIR` - e.g. `export TOR_BROWSER_ROOT_DIR=/home/<user>/Downloads`. The Tor Browser binary location is *automatically set* by reference to this directory, so there is no need to do this:
126
- ```ruby
127
- options = Selenium::WebDriver::Tor::Options.new
128
- options.binary = '/some/path/to/tor_firefox_binary' # UNNECESSARY
129
- ```
131
+ Known issues are recorded [here](https://gitlab.com/matzfan/selenium-tor/-/issues).
130
132
 
131
- The location of the TBB is not expected to change during code execution.
133
+ ### Fingerprinting
132
134
 
133
- Tor Selenium is tested on **Linux only** right now.
135
+ The gem uses Xvfb to allow Tor Browser to be manipulated headlessly. Xvfb in turn uses `llvmpipe` software acceleration which may lead to issues with WebGL fingerprinting - see [issue #7](https://gitlab.com/matzfan/selenium-tor/-/issues/7). If this is a problem we recommend using [VirtualGL](https://www.virtualgl.org) to force Xvfb to use your hardware driver instead. This is done by prepending your executable code with the command `vglrun`. For an example see the section below on testing. VirtualGL can be installed from a [package repo](https://virtualgl.org/Downloads/YUM).
134
136
 
135
137
  ## Testing
136
138
 
@@ -140,7 +142,7 @@ Tests are run in the display set by the `DISPLAY` env var, usually :0 by default
140
142
 
141
143
  $ DISPLAY=:99 bundle exec rake
142
144
 
143
- If you find tests are failing with `TorNetworkError` and a timeout message, check you have no other Tor Browser processes running. The processes to look for are either "firefox.real" or "firefox-esr". The latter may be legitimate if you are also running Firefox browser. You may also want to check the Tor network is actually up, it isn't always..
145
+ If you find driver instantiation failing with port bind failure error messages (these include "Address already in use. Is Tor already running?"), check you have no other tor processes running with conflicting ports. With timeout errors, you may also want to check the Tor network is actually up; it isn't always.
144
146
 
145
147
  If you wish to run the `vglrun` WebGL fingerprint test, install VirtualGL (see above; this assumes you have the relevant drivers installed) and run the following command:
146
148
 
data/lib/tor/driver.rb CHANGED
@@ -69,12 +69,17 @@ module Selenium
69
69
  end
70
70
 
71
71
  def create_tor_process_and_start_tor(opts)
72
- @tor_process = TorProcess.new(@data_dir, opts || {})
73
- @tor_process.start_tor
72
+ timeout = opts.delete :timeout
73
+ @tor_process = tor_process(opts)
74
+ @tor_process.start_tor(timeout: timeout)
74
75
  rescue Tor::TorProcess::TorProcessError => e
75
76
  raise Error::WebDriverError, e
76
77
  end
77
78
 
79
+ def tor_process(opts)
80
+ TorProcess.new(@data_dir, opts || {})
81
+ end
82
+
78
83
  def domain
79
84
  URI(current_url).host&.match(/[^\.]+\.\w+$/)
80
85
  end
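
A note on the `create_tor_process_and_start_tor` hunk above: as written, `opts.delete :timeout` runs before the `opts || {}` guard in `tor_process`, and when no `:timeout` key is present the call becomes `start_tor(timeout: nil)`, which replaces the default of 10 (`Timeout.timeout(nil)` runs its block with no limit). Whether `Tor::Options` always supplies a hash containing `:timeout` is not visible in this diff, so the following is only a sketch of one defensive alternative. `split_timeout` and `DEFAULT_BOOTSTRAP_TIMEOUT` are hypothetical names, not part of the gem.

```ruby
# Hypothetical constant: 10 seconds is the default stated in the README.
DEFAULT_BOOTSTRAP_TIMEOUT = 10

# Hypothetical helper, not part of selenium_tor.
# Returns [tor_opts_without_timeout, timeout] without mutating the caller's hash.
def split_timeout(opts)
  tor_opts = (opts || {}).dup
  timeout = tor_opts.delete(:timeout) || DEFAULT_BOOTSTRAP_TIMEOUT
  [tor_opts, timeout]
end

# A hash that carries a custom timeout alongside a tor option.
tor_opts, timeout = split_timeout({ socks_port: 9050, timeout: 30 })
p tor_opts
p timeout

# nil options fall back to an empty hash and the default timeout.
p split_timeout(nil)
```

Duplicating the hash keeps the caller's options intact, which matters if the same hash is reused to build several drivers.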
@@ -2,15 +2,12 @@
2
2
 
3
3
  require 'timeout'
4
4
  require_relative 'torrc'
5
- require_relative '../string_extensions'
6
5
 
7
6
  module Selenium
8
7
  module WebDriver
9
8
  module Tor
10
9
  # Representation of a tor process
11
10
  class TorProcess
12
- include StringExtensions
13
-
14
11
  class TorProcessError < StandardError; end
15
12
 
16
13
  BOOTSTRAP_SUCCESS_REGEX = /Bootstrapped 100% \(done\): Done$/
@@ -28,12 +25,15 @@ module Selenium
28
25
  @config ||= setup_config # Hash to store torrc config
29
26
  end
30
27
 
31
- def start_tor
28
+ def start_tor(timeout: 10)
32
29
  r, io = IO.pipe
33
- pid = Process.spawn "#{TBB_TOR_BINARY_PATH} -f #{@torrc.path}", out: io, err: :out
30
+ pid = Process.spawn tor_command, out: io, err: :out
34
31
  io.close
35
- errors = parse_tor_bootstrap_errors r
32
+ errors = parse_tor_bootstrap_errors_with_timeout(r, timeout: timeout)
36
33
  errors.empty? ? @pid = pid : raise(TorProcessError, "Tor failed to start with errors:\n\n#{errors}")
34
+ rescue Timeout::Error
35
+ Process.kill 'KILL', pid if pid
36
+ raise Error::TimeoutError, "Tor not bootstrapped after #{timeout} seconds"
37
37
  ensure
38
38
  r.close
39
39
  end
@@ -44,7 +44,7 @@ module Selenium
44
44
  @pid = nil
45
45
  end
46
46
 
47
- # private
47
+ private
48
48
 
49
49
  def valid_data_dir?(dir)
50
50
  msg = 'data_dir must exist and be a dir'
@@ -52,9 +52,7 @@ module Selenium
52
52
  end
53
53
 
54
54
  def map_opts_to_torrc_keys(opts)
55
- raise TorProcessError, 'Options keys must be snake case' unless opts.keys.map!(&:to_s).all?(&:snake_case?)
56
-
57
- opts.transform_keys { |k| k.to_s.split('_').map(&:capitalize).join }
55
+ opts.transform_keys { |k| k.is_a?(String) ? k : k.to_s.split('_').map(&:capitalize).join }
58
56
  end
59
57
 
60
58
  def setup_config
@@ -62,12 +60,18 @@ module Selenium
62
60
  @torrc.config
63
61
  end
64
62
 
65
- def parse_tor_bootstrap_errors(io)
63
+ def tor_command
64
+ "#{TBB_TOR_BINARY_PATH} -f #{@torrc.path}"
65
+ end
66
+
67
+ def parse_tor_bootstrap_errors_with_timeout(io, timeout:)
66
68
  lines = []
67
- io.each_line do |line|
68
- lines << line
69
- break lines.join if line.match BOOTSTRAP_FAIL_REGEX
70
- break '' if line.match BOOTSTRAP_SUCCESS_REGEX
69
+ Timeout.timeout timeout do
70
+ io.each_line do |line|
71
+ lines << line
72
+ break lines.join if line.match BOOTSTRAP_FAIL_REGEX
73
+ break '' if line.match BOOTSTRAP_SUCCESS_REGEX
74
+ end
71
75
  end
72
76
  end
73
77
  end
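
The timeout handling added to `start_tor` can be tried in isolation. Below is a minimal sketch, assuming only Ruby's standard `timeout` library and a plain `IO.pipe` standing in for tor's output; `bootstrapped?` is a hypothetical helper, and the log line is made up to match the success regex from the diff.

```ruby
require 'timeout'

# Success pattern copied from BOOTSTRAP_SUCCESS_REGEX in the diff above.
SUCCESS = /Bootstrapped 100% \(done\): Done$/

# Hypothetical helper: true once a success line is read, false at EOF,
# raises Timeout::Error if neither happens within `timeout` seconds.
def bootstrapped?(io, timeout:)
  Timeout.timeout(timeout) do
    io.each_line { |line| return true if line.match?(SUCCESS) }
  end
  false
end

# Case 1: the success line arrives, so the read returns immediately.
reader, writer = IO.pipe
writer.puts 'Aug 07 12:00:00.000 [notice] Bootstrapped 100% (done): Done'
writer.close
puts bootstrapped?(reader, timeout: 1)
reader.close

# Case 2: a pipe whose writer stays open and silent stands in for a stalled tor process.
reader, writer = IO.pipe
begin
  bootstrapped?(reader, timeout: 0.2)
rescue Timeout::Error
  puts 'timed out'
ensure
  writer.close
  reader.close
end
```

In the gem itself, the `rescue Timeout::Error` branch additionally kills the spawned tor process and re-raises as `Error::TimeoutError`; the sketch omits process handling to stay self-contained.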
data/lib/tor/version.rb CHANGED
@@ -3,7 +3,7 @@
3
3
  module Selenium
4
4
  module WebDriver
5
5
  module Tor
6
- VERSION = '1.3.0'
6
+ VERSION = '1.4.0'
7
7
  end
8
8
  end
9
9
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: selenium_tor
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.3.0
4
+ version: 1.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - MatzFan
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2024-08-05 00:00:00.000000000 Z
11
+ date: 2024-08-07 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: selenium-webdriver
@@ -46,7 +46,6 @@ files:
46
46
  - lib/options.rb
47
47
  - lib/selenium_tor.rb
48
48
  - lib/service.rb
49
- - lib/string_extensions.rb
50
49
  - lib/tor/driver.rb
51
50
  - lib/tor/options.rb
52
51
  - lib/tor/profile.rb
@@ -1,12 +0,0 @@
1
- # frozen_string_literal: true
2
-
3
- # patch String
4
- module StringExtensions
5
- def snake_case?
6
- match(/\A[a-z]+(_[a-z]+)?\z/)
7
- end
8
- end
9
-
10
- class String
11
- prepend StringExtensions
12
- end
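
For comparison, here is a sketch of what the deleted `snake_case?` check accepted next to the key mapping that replaces it in 1.4.0. The regex and the `transform_keys` block are copied from the hunks above; wrapping them in a top-level constant and method is an assumption made for illustration, and the option values are made up.

```ruby
# Regex from the deleted lib/string_extensions.rb: at most one underscore is allowed.
SNAKE_CASE = /\A[a-z]+(_[a-z]+)?\z/

# Same block as the 1.4.0 map_opts_to_torrc_keys, lifted out of TorProcess
# for illustration: String keys pass through, Symbol keys become CamelCase.
def map_opts_to_torrc_keys(opts)
  opts.transform_keys { |k| k.is_a?(String) ? k : k.to_s.split('_').map(&:capitalize).join }
end

p SNAKE_CASE.match?('socks_port')         # true
p SNAKE_CASE.match?('hidden_service_dir') # false: two underscores
p SNAKE_CASE.match?('SocksPort')          # false: not lower case

p map_opts_to_torrc_keys({ socks_port: 9050, 'DataDirectory' => '/tmp/tor' })
```

Under the 1.3.0 check, any option whose snake_case form needed more than one underscore would have been rejected by this regex; passing tor's own option names as strings avoids the conversion entirely.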