ferrum 0.12 → 0.13

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 9e4ca57662283c8c6b917aa3e2f1f06b154cb66ffdd8522756f55e09be84a423
-   data.tar.gz: 50d37d615399fe481634154d619d55872ee13b0cace511a1b147da38454247b5
+   metadata.gz: bde7c8e40700ace2d713cba69eee5828dcb888e5468c07a6b1e5e0d668e4c641
+   data.tar.gz: d52f7278dd76e670aa50721e6d70adc26297bbfb031ddac259dde5421c817a04
  SHA512:
-   metadata.gz: 9cc514740ae8041b61a184bdeee0a44ce26b276f36af0358c6a00524f8e13b6db9ddda48acf5781c4122acaa4980579cf06053ffb420c2013e0b44177eab2571
-   data.tar.gz: cc00b6d3cb440db600241156354dcd92a6bf67c0402b9e2881f32ea507549d80f69b51011bb246baf9faf95f33e5596933ce569cd9b03f41c7cc066d215b6073
+   metadata.gz: a7206d7a92d8483bd106fe262130492e7bf51e2db8b99f73c39ada0a674fb29c84bc948bdbb6f554b672ade9f4e3812a9158447b30d6f976cb4892b5e4e8df30
+   data.tar.gz: aef76b65c27dca2a5385d9881f9be8caccb85e804b31a26e421e35a43295baa3a5c818de856b49e3a7928139af4aeb588a5fae1cb67275440e5a0b3ddf97f76e
data/README.md CHANGED
@@ -154,7 +154,7 @@ Ferrum::Browser.new(options)
  * `:logger` (Object responding to `puts`) - When present, debug output is
  written to this object.
  * `:slowmo` (Integer | Float) - Set a delay in seconds to wait before sending command.
- Usefull companion of headless option, so that you have time to see changes.
+ Useful companion of headless option, so that you have time to see changes.
  * `:timeout` (Numeric) - The number of seconds we'll wait for a response when
  communicating with browser. Default is 5.
  * `:js_errors` (Boolean) - When true, JavaScript errors get re-raised in Ruby.
@@ -601,41 +601,36 @@ browser.go_to("https://github.com/") # => Ferrum::StatusError (Request to https:
 
  ## Proxy
 
- You can set a proxy with the `proxy` option.
+ You can set a proxy with a `:proxy` option:
 
  ```ruby
- browser = Ferrum::Browser.new(proxy: { host: "x.x.x.x", port: "8800" })
- browser = Ferrum::Browser.new(proxy: { host: "x.x.x.x", port: "8800", user: "user", pasword: "pa$$" })
+ browser = Ferrum::Browser.new(proxy: { host: "x.x.x.x", port: "8800", user: "user", password: "pa$$" })
  ```
 
- Chrome Devtools Protocol does not support changing proxies after the browser is launched. If you want to change proxies, you must restart your browser, which may not be convenient. There is a workaround. Ferrum provides a wrapper for a proxy server that can rotate proxies. We can run a proxy in the same process and rotate proxies inside this proxy server:
+ `:bypass` can specify semi-colon-separated list of hosts for which proxy shouldn't be used:
 
  ```ruby
- browser = Ferrum::Browser.new(proxy: { server: true })
+ browser = Ferrum::Browser.new(proxy: { host: "x.x.x.x", port: "8800", bypass: "*.google.com;*foo.com" })
+ ```
+
+ In general passing a proxy option when instantiating a browser results in a browser running with proxy command line
+ flags, so that it affects all pages and contexts. You can create a page in a new context which can use its own proxy
+ settings:
+
+ ```ruby
+ browser = Ferrum::Browser.new
 
- browser.proxy_server.rotate(host: "x.x.x.x", port: 31337, user: "user", password: "password")
- browser.create_page(new_context: true) do |page|
+ browser.create_page(proxy: { host: "x.x.x.x", port: 31337, user: "user", password: "password" }) do |page|
    page.go_to("https://api.ipify.org?format=json")
    page.body # => "x.x.x.x"
  end
 
- browser.proxy_server.rotate(host: "y.y.y.y", port: 31337, user: "user", password: "password")
- browser.create_page(new_context: true) do |page|
+ browser.create_page(proxy: { host: "y.y.y.y", port: 31337, user: "user", password: "password" }) do |page|
    page.go_to("https://api.ipify.org?format=json")
    page.body # => "y.y.y.y"
  end
  ```
 
- Make sure to create page in the new context, because Chrome doesn't break the connection with the proxy for `CONNECT`
- requests even if you close the page.
-
- You can specify semi-colon-separated list of hosts for which proxy shouldn't be used:
-
- ```ruby
- browser = Ferrum::Browser.new(proxy: { host: "x.x.x.x", port: "8800", bypass: "*.google.com;*foo.com" })
- browser = Ferrum::Browser.new(proxy: { server: true, bypass: "*.google.com;*foo.com" })
- ```
-
 
  ### Mouse
 
@@ -8,11 +8,11 @@ module Ferrum
      class Client
        INTERRUPTIONS = %w[Fetch.requestPaused Fetch.authRequired].freeze
 
-       def initialize(browser, ws_url, id_starts_with: 0)
-         @browser = browser
+       def initialize(ws_url, connectable, logger: nil, ws_max_receive_size: nil, id_starts_with: 0)
+         @connectable = connectable
          @command_id = id_starts_with
          @pendings = Concurrent::Hash.new
-         @ws = WebSocket.new(ws_url, @browser.ws_max_receive_size, @browser.logger)
+         @ws = WebSocket.new(ws_url, ws_max_receive_size, logger)
          @subscriber, @interrupter = Subscriber.build(2)
 
          @thread = Thread.new do
@@ -39,7 +39,7 @@ module Ferrum
          message = build_message(method, params)
          @pendings[message[:id]] = pending
          @ws.send_message(message)
-         data = pending.value!(@browser.timeout)
+         data = pending.value!(@connectable.timeout)
          @pendings.delete(message[:id])
 
          raise DeadBrowserError if data.nil? && @ws.messages.closed?
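
Note on the two hunks above: `Client` no longer needs a whole browser object. It takes the logger and the WebSocket size limit as keyword arguments and only asks its `connectable` for `timeout`, reading it on each command instead of caching it. A minimal sketch of that duck-typed contract follows; `Connectable` and `SketchClient` are hypothetical stand-ins written for this note, not Ferrum classes, and no WebSocket is opened.

```ruby
# Hypothetical stand-in for anything that can act as the client's `connectable`:
# the only method the hunk above calls on it is `timeout`.
Connectable = Struct.new(:timeout)

# Hypothetical, heavily reduced client: it keeps a reference to the connectable
# and looks the timeout up lazily, as `pending.value!(@connectable.timeout)` does.
class SketchClient
  def initialize(connectable)
    @connectable = connectable
  end

  # Timeout that would be applied to the next command.
  def wait_budget
    @connectable.timeout
  end
end

page = Connectable.new(5)
client = SketchClient.new(page)
puts client.wait_budget # 5

# A later change on the connectable is picked up by the next command.
page.timeout = 10
puts client.wait_budget # 10
```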
@@ -10,7 +10,7 @@ module Ferrum
        # Currently only these browsers support CDP:
        # https://github.com/cyrus-and/chrome-remote-interface#implementations
        def self.build(options, user_data_dir)
-         defaults = case options[:browser_name]
+         defaults = case options.browser_name
                     when :firefox
                       Options::Firefox.options
                     when :chrome, :opera, :edge, nil
@@ -29,14 +29,14 @@ module Ferrum
          @defaults = defaults
          @options = options
          @user_data_dir = user_data_dir
-         @path = options[:browser_path] || ENV.fetch("BROWSER_PATH", nil) || defaults.detect_path
+         @path = options.browser_path || ENV.fetch("BROWSER_PATH", nil) || defaults.detect_path
          raise BinaryNotFoundError, NOT_FOUND unless @path
 
          merge_options
        end
 
        def xvfb?
-         !!options[:xvfb]
+         !!options.xvfb
        end
 
        def to_a
@@ -47,9 +47,8 @@ module Ferrum
 
        def merge_options
          @flags = defaults.merge_required(@flags, options, @user_data_dir)
-         @flags = defaults.merge_default(@flags, options) unless options[:ignore_default_browser_options]
-
-         @flags.merge!(options.fetch(:browser_options, {}))
+         @flags = defaults.merge_default(@flags, options) unless options.ignore_default_browser_options
+         @flags.merge!(options.browser_options)
        end
      end
    end
@@ -4,11 +4,8 @@ require "singleton"
 
  module Ferrum
    class Browser
-     module Options
+     class Options
        class Base
-         BROWSER_HOST = "127.0.0.1"
-         BROWSER_PORT = "0"
-
          include Singleton
 
          def self.options
@@ -2,7 +2,7 @@
 
  module Ferrum
    class Browser
-     module Options
+     class Options
        class Chrome < Base
          DEFAULT_OPTIONS = {
            "headless" => nil,
@@ -59,17 +59,22 @@ module Ferrum
          }.freeze
 
          def merge_required(flags, options, user_data_dir)
-           port = options.fetch(:port, BROWSER_PORT)
-           host = options.fetch(:host, BROWSER_HOST)
-           flags.merge("remote-debugging-port" => port,
-                       "remote-debugging-address" => host,
-                       # Doesn't work on MacOS, so we need to set it by CDP
-                       "window-size" => options[:window_size]&.join(","),
-                       "user-data-dir" => user_data_dir)
+           flags = flags.merge("remote-debugging-port" => options.port,
+                               "remote-debugging-address" => options.host,
+                               # Doesn't work on MacOS, so we need to set it by CDP
+                               "window-size" => options.window_size&.join(","),
+                               "user-data-dir" => user_data_dir)
+
+           if options.proxy
+             flags.merge!("proxy-server" => "#{options.proxy[:host]}:#{options.proxy[:port]}")
+             flags.merge!("proxy-bypass-list" => options.proxy[:bypass]) if options.proxy[:bypass]
+           end
+
+           flags
          end
 
          def merge_default(flags, options)
-           defaults = except("headless", "disable-gpu") unless options.fetch(:headless, true)
+           defaults = except("headless", "disable-gpu") unless options.headless
 
            defaults ||= DEFAULT_OPTIONS
            defaults.merge(flags)
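
Note on the hunk above: the new branch in `merge_required` is where a browser-level `:proxy` option becomes `proxy-server` and `proxy-bypass-list` command line flags. The sketch below isolates just that step so it can be run without Ferrum. `proxy_flags` is a hypothetical helper name, and the sketch assumes the proxy is a plain Hash with symbol keys, as in the README examples. The hunk does not show where `:user` and `:password` are applied, so they are left out here.

```ruby
# Hypothetical helper mirroring the proxy branch of Chrome#merge_required.
# It returns a new flags Hash and leaves the input untouched.
def proxy_flags(flags, proxy)
  merged = flags.dup
  return merged unless proxy

  # host and port are joined with a colon, exactly as in the hunk above.
  merged["proxy-server"] = "#{proxy[:host]}:#{proxy[:port]}"
  # The bypass list is only added when it was given.
  merged["proxy-bypass-list"] = proxy[:bypass] if proxy[:bypass]
  merged
end

base = { "remote-debugging-port" => "0" }
with_proxy = proxy_flags(base, { host: "x.x.x.x", port: "8800", bypass: "*.google.com;*foo.com" })

puts with_proxy["proxy-server"]      # x.x.x.x:8800
puts with_proxy["proxy-bypass-list"] # *.google.com;*foo.com

# Without a proxy no proxy flags are added.
puts proxy_flags(base, nil).key?("proxy-server") # false
```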
@@ -2,7 +2,7 @@
 
  module Ferrum
    class Browser
-     module Options
+     class Options
        class Firefox < Base
          DEFAULT_OPTIONS = {
            "headless" => nil
@@ -23,14 +23,11 @@ module Ferrum
          }.freeze
 
          def merge_required(flags, options, user_data_dir)
-           port = options.fetch(:port, BROWSER_PORT)
-           host = options.fetch(:host, BROWSER_HOST)
-           flags.merge("remote-debugger" => "#{host}:#{port}",
-                       "profile" => user_data_dir)
+           flags.merge("remote-debugger" => "#{options.host}:#{options.port}", "profile" => user_data_dir)
          end
 
          def merge_default(flags, options)
-           defaults = except("headless") unless options.fetch(:headless, true)
+           defaults = except("headless") unless options.headless
 
            defaults ||= DEFAULT_OPTIONS
            defaults.merge(flags)
@@ -0,0 +1,84 @@
+ # frozen_string_literal: true
+
+ module Ferrum
+   class Browser
+     class Options
+       HEADLESS = true
+       BROWSER_PORT = "0"
+       BROWSER_HOST = "127.0.0.1"
+       WINDOW_SIZE = [1024, 768].freeze
+       BASE_URL_SCHEMA = %w[http https].freeze
+       DEFAULT_TIMEOUT = ENV.fetch("FERRUM_DEFAULT_TIMEOUT", 5).to_i
+       PROCESS_TIMEOUT = ENV.fetch("FERRUM_PROCESS_TIMEOUT", 10).to_i
+       DEBUG_MODE = !ENV.fetch("FERRUM_DEBUG", nil).nil?
+
+       attr_reader :window_size, :timeout, :logger, :ws_max_receive_size,
+                   :js_errors, :base_url, :slowmo, :pending_connection_errors,
+                   :url, :env, :process_timeout, :browser_name, :browser_path,
+                   :save_path, :extensions, :proxy, :port, :host, :headless,
+                   :ignore_default_browser_options, :browser_options, :xvfb
+
+       def initialize(options = nil)
+         @options = Hash(options&.dup)
+         @port = @options.fetch(:port, BROWSER_PORT)
+         @host = @options.fetch(:host, BROWSER_HOST)
+         @timeout = @options.fetch(:timeout, DEFAULT_TIMEOUT)
+         @window_size = @options.fetch(:window_size, WINDOW_SIZE)
+         @js_errors = @options.fetch(:js_errors, false)
+         @headless = @options.fetch(:headless, HEADLESS)
+         @pending_connection_errors = @options.fetch(:pending_connection_errors, true)
+         @process_timeout = @options.fetch(:process_timeout, PROCESS_TIMEOUT)
+         @browser_options = @options.fetch(:browser_options, {})
+         @slowmo = @options[:slowmo].to_f
+
+         @ws_max_receive_size, @env, @browser_name, @browser_path,
+           @save_path, @extensions, @ignore_default_browser_options, @xvfb = @options.values_at(
+             :ws_max_receive_size, :env, :browser_name, :browser_path, :save_path, :extensions,
+             :ignore_default_browser_options, :xvfb
+           )
+
+         @options[:window_size] = @window_size
+         @proxy = parse_proxy(@options[:proxy])
+         @logger = parse_logger(@options[:logger])
+         @base_url = parse_base_url(@options[:base_url]) if @options[:base_url]
+         @url = @options[:url].to_s if @options[:url]
+
+         @options.freeze
+         @browser_options.freeze
+       end
+
+       def to_h
+         @options
+       end
+
+       def parse_base_url(value)
+         parsed = Addressable::URI.parse(value)
+         unless BASE_URL_SCHEMA.include?(parsed&.normalized_scheme)
+           raise ArgumentError, "`base_url` should be absolute and include schema: #{BASE_URL_SCHEMA.join(' | ')}"
+         end
+
+         parsed
+       end
+
+       def parse_proxy(options)
+         return unless options
+
+         raise ArgumentError, "proxy options must be a Hash" unless options.is_a?(Hash)
+
+         if options[:host].nil? && options[:port].nil?
+           raise ArgumentError, "proxy options must be a Hash with at least :host | :port"
+         end
+
+         options
+       end
+
+       private
+
+       def parse_logger(logger)
+         return logger if logger
+
+         !logger && DEBUG_MODE ? $stdout.tap { |s| s.sync = true } : logger
+       end
+     end
+   end
+ end
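
Note on the new file above: `parse_proxy` rejects non-Hash values and Hashes that carry neither `:host` nor `:port`, and it accepts a Hash that has only one of the two. The sketch below copies the method body into a top-level method so the cases can be tried without Ferrum; the `rejected?` helper is hypothetical and exists only for this illustration.

```ruby
# Copy of Options#parse_proxy from the diff above, lifted to a top-level method.
def parse_proxy(options)
  return unless options

  raise ArgumentError, "proxy options must be a Hash" unless options.is_a?(Hash)

  if options[:host].nil? && options[:port].nil?
    raise ArgumentError, "proxy options must be a Hash with at least :host | :port"
  end

  options
end

# Hypothetical helper: true when parse_proxy raises ArgumentError for the value.
def rejected?(value)
  parse_proxy(value)
  false
rescue ArgumentError
  true
end

puts parse_proxy(nil).inspect                   # nil, no proxy configured
puts rejected?("x.x.x.x:8800")                  # true, not a Hash
puts rejected?({})                              # true, neither :host nor :port
puts rejected?({ user: "user" })                # true, credentials alone are not enough
puts rejected?({ host: "x.x.x.x" })             # false, one of the two keys passes the check
puts rejected?({ host: "x.x.x.x", port: 8800 }) # false
```

Because a Hash with only `:host` passes the check, the Chrome flag built from it would end in a bare colon (`"x.x.x.x:"`), so callers will usually want to pass both keys.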
@@ -15,7 +15,6 @@ module Ferrum
      class Process
        KILL_TIMEOUT = 2
        WAIT_KILLED = 0.05
-       PROCESS_TIMEOUT = ENV.fetch("FERRUM_PROCESS_TIMEOUT", 10).to_i
 
        attr_reader :host, :port, :ws_url, :pid, :command,
                    :default_user_agent, :browser_version, :protocol_version,
@@ -63,17 +62,17 @@ module Ferrum
        def initialize(options)
          @pid = @xvfb = @user_data_dir = nil
 
-         if options[:url]
-           url = URI.join(options[:url].to_s, "/json/version")
+         if options.url
+           url = URI.join(options.url, "/json/version")
            response = JSON.parse(::Net::HTTP.get(url))
            self.ws_url = response["webSocketDebuggerUrl"]
            parse_browser_versions
            return
          end
 
-         @logger = options[:logger]
-         @process_timeout = options.fetch(:process_timeout, PROCESS_TIMEOUT)
-         @env = Hash(options[:env])
+         @logger = options.logger
+         @process_timeout = options.process_timeout
+         @env = Hash(options.env)
 
          tmpdir = Dir.mktmpdir("ferrum_user_data_dir_")
          ObjectSpace.define_finalizer(self, self.class.directory_remover(tmpdir))
@@ -0,0 +1,71 @@
+ # frozen_string_literal: true
+
+ module Ferrum
+   class Browser
+     #
+     # The browser's version information returned by [Browser.getVersion].
+     #
+     # [Browser.getVersion]: https://chromedevtools.github.io/devtools-protocol/1-3/Browser/#method-getVersion
+     #
+     # @since 0.13
+     #
+     class VersionInfo
+       #
+       # Initializes the browser's version information.
+       #
+       # @param [Hash{String => Object}] properties
+       #   The object properties returned by [Browser.getVersion](https://chromedevtools.github.io/devtools-protocol/1-3/Browser/#method-getVersion).
+       #
+       # @api private
+       #
+       def initialize(properties)
+         @properties = properties
+       end
+
+       #
+       # The Chrome DevTools protocol version.
+       #
+       # @return [String]
+       #
+       def protocol_version
+         @properties["protocolVersion"]
+       end
+
+       #
+       # The Chrome version.
+       #
+       # @return [String]
+       #
+       def product
+         @properties["product"]
+       end
+
+       #
+       # The Chrome revision properties.
+       #
+       # @return [String]
+       #
+       def revision
+         @properties["revision"]
+       end
+
+       #
+       # The Chrome `User-Agent` string.
+       #
+       # @return [String]
+       #
+       def user_agent
+         @properties["userAgent"]
+       end
+
+       #
+       # The JavaScript engine version.
+       #
+       # @return [String]
+       #
+       def js_version
+         @properties["jsVersion"]
+       end
+     end
+   end
+ end
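
Note on the new file above: `VersionInfo` is a thin read-only wrapper over the Hash returned by `Browser.getVersion`, and every reader looks up a camelCase String key. The sketch below uses a trimmed copy of the class (two of the five readers) so it runs without Ferrum. The payload values are made up for illustration; only the key names come from the diff.

```ruby
# Trimmed copy of Ferrum::Browser::VersionInfo, kept at top level for the sketch.
class VersionInfo
  def initialize(properties)
    @properties = properties
  end

  def protocol_version
    @properties["protocolVersion"]
  end

  def product
    @properties["product"]
  end
end

# Illustrative payload with String keys, shaped like a Browser.getVersion result.
info = VersionInfo.new("protocolVersion" => "1.3", "product" => "HeadlessChrome/0.0.0.0")
puts info.protocol_version # 1.3
puts info.product          # HeadlessChrome/0.0.0.0

# The readers use String keys, so a Hash with Symbol keys yields nil.
symbol_keyed = VersionInfo.new({ protocolVersion: "1.3" })
puts symbol_keyed.protocol_version.inspect # nil
```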
@@ -16,7 +16,7 @@ module Ferrum
          @path = Binary.find("Xvfb")
          raise BinaryNotFoundError, NOT_FOUND unless @path
 
-         @screen_size = "#{options.fetch(:window_size, [1024, 768]).join('x')}x24"
+         @screen_size = "#{options.window_size.join('x')}x24"
          @display_id = (Time.now.to_f * 1000).to_i % 100_000_000
        end