ferrum 0.6 → 0.9

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 4f657c088b22a9ff5809b6ecd0d7005cd0c2252645b262e76faee1b6e3af94c8
4
- data.tar.gz: d22842188425c753e0a0e113be52e0176e8a7e1a97211c44bf3016e90aa0d1d4
3
+ metadata.gz: 8b4d6dc7aa1827fbf559e6025b82d29d15ed0e36a89793049266d8049fadabb9
4
+ data.tar.gz: 6fab0202e85a17971d613db12e37a7ef85325eaf23f718b6801812df565ac64c
5
5
  SHA512:
6
- metadata.gz: a519e80c6b1450ba85c555d4c5eb547ffb62f129e7873515ba2f5bad4726548289bcc3b4db95932873620e4b645f2f9b7f12c8bd0fdbe0ddada4f0faae080720
7
- data.tar.gz: 91a8f95433f9630dd0f2b36c42ea14bf28b49ee9504738c664399eeb7fc3d192bd26c14ae7de707dcaa8273ec65d7b1a8e9bcaeeff4287392a5e1b312ea03b0d
6
+ metadata.gz: fb109c1b65e73e8d0088734fa004fae3d15650121d01b44dc9086cdb193bb3c7f56c3ad911a2f6de5cd28522231b4e695b67eb515d90afe93b862eb64cfe7050
7
+ data.tar.gz: a4d5e4c192cbd634c640d86027f3a8faeaf42efcd8fbaa9b0e8c6e81c04aa9921b7f2353eb739c35850363f1ed58f84d5895714f52ecd333d9ebdb50c1de4031
data/README.md CHANGED
@@ -1,36 +1,81 @@
1
- # Ferrum - fearless Ruby Chrome driver
1
+ # Ferrum - high-level API to control Chrome in Ruby
2
2
 
3
- [![Build Status](https://travis-ci.org/route/ferrum.svg?branch=master)](https://travis-ci.org/route/ferrum)
3
+ [![Build Status](https://travis-ci.org/rubycdp/ferrum.svg?branch=master)](https://travis-ci.org/rubycdp/ferrum)
4
4
 
5
- <img align="right" width="95" height="95"
5
+ <img align="right"
6
+ width="320" height="241"
6
7
  alt="Ferrum logo"
7
- src="https://raw.githubusercontent.com/route/ferrum/master/logo.svg?sanitize=true">
8
+ src="https://raw.githubusercontent.com/rubycdp/ferrum/master/logo.svg?sanitize=true">
9
+
10
+ #### As simple as Puppeteer, though even simpler.
11
+
12
+ It is Ruby clean and high-level API to Chrome. Runs headless by default, but you
13
+ can configure it to run in a headful mode. All you need is Ruby and
14
+ [Chrome](https://www.google.com/chrome/) or
15
+ [Chromium](https://www.chromium.org/). Ferrum connects to the browser by [CDP
16
+ protocol](https://chromedevtools.github.io/devtools-protocol/) and there's _no_
17
+ Selenium/WebDriver/ChromeDriver dependency. The emphasis was made on a raw CDP
18
+ protocol because Chrome allows you to do so many things that are barely
19
+ supported by WebDriver because it should have consistent design with other
20
+ browsers.
21
+
22
+ * [Cuprite](https://github.com/rubycdp/cuprite) is a pure Ruby driver for
23
+ [Capybara](https://github.com/teamcapybara/capybara) based on Ferrum. If you are
24
+ going to crawl sites you better use Ferrum or
25
+ [Vessel](https://github.com/rubycdp/vessel) because you crawl, not test.
26
+
27
+ * [Vessel](https://github.com/rubycdp/vessel) high-level web crawling framework
28
+ based on Ferrum. It looks like [Scrapy](https://scrapy.org/) except that it uses
29
+ a real browser in order to grab data.
30
+
31
+ Web design by [Evrone](https://evrone.com/), what else
32
+ [we build with Ruby on Rails](https://evrone.com/ruby), what else
33
+ [we do at Evrone](https://evrone.com/cases#case-studies).
34
+
35
+ If you like this project, please consider to
36
+ _[become a backer](https://www.patreon.com/rubycdp_ferrum)_ on Patreon.
37
+
38
+
39
+ ## Index
40
+
41
+ * [Install](https://github.com/rubycdp/ferrum#install)
42
+ * [Examples](https://github.com/rubycdp/ferrum#examples)
43
+ * [Docker](https://github.com/rubycdp/ferrum#docker)
44
+ * [Customization](https://github.com/rubycdp/ferrum#customization)
45
+ * [Navigation](https://github.com/rubycdp/ferrum#navigation)
46
+ * [Finders](https://github.com/rubycdp/ferrum#finders)
47
+ * [Screenshots](https://github.com/rubycdp/ferrum#screenshots)
48
+ * [Network](https://github.com/rubycdp/ferrum#network)
49
+ * [Mouse](https://github.com/rubycdp/ferrum#mouse)
50
+ * [Keyboard](https://github.com/rubycdp/ferrum#keyboard)
51
+ * [Cookies](https://github.com/rubycdp/ferrum#cookies)
52
+ * [Headers](https://github.com/rubycdp/ferrum#headers)
53
+ * [JavaScript](https://github.com/rubycdp/ferrum#javascript)
54
+ * [Frames](https://github.com/rubycdp/ferrum#frames)
55
+ * [Frame](https://github.com/rubycdp/ferrum#frame)
56
+ * [Dialog](https://github.com/rubycdp/ferrum#dialog)
57
+ * [Thread safety](https://github.com/rubycdp/ferrum#thread-safety)
58
+ * [License](https://github.com/rubycdp/ferrum#license)
8
59
 
9
- As simple as Puppeteer, though even simpler.
10
-
11
- It is Ruby clean and high-level API to Chrome. Runs headless by default,
12
- but you can configure it to run in a non-headless mode. All you need is Ruby and
13
- Chrome/Chromium. Ferrum connects to the browser via DevTools Protocol.
14
-
15
- Relation to [Cuprite](https://github.com/machinio/cuprite). Cuprite used to have
16
- this code inside in one form or another but the thing is you don't need capybara
17
- if you are going to crawl sites. You crawl, not test. Besides that clean
18
- lightweight API to browser is what Ruby was missing, so here it comes.
19
60
 
20
61
  ## Install
21
62
 
22
63
  There's no official Chrome or Chromium package for Linux don't install it this
23
- way because it either will be outdated or unofficial, both are bad. Download it
24
- from official [source](https://www.chromium.org/getting-involved/download-chromium).
64
+ way because it's either outdated or unofficial, both are bad. Download it from
65
+ official [source](https://www.chromium.org/getting-involved/download-chromium).
25
66
  Chrome binary should be in the `PATH` or `BROWSER_PATH` or you can pass it as an
26
- option to browser instance `:browser_path`.
67
+ option to browser instance see `:browser_path` in
68
+ [Customization](https://github.com/rubycdp/ferrum#customization).
27
69
 
28
- Add this to your Gemfile:
70
+ Add this to your `Gemfile` and run `bundle install`.
29
71
 
30
72
  ``` ruby
31
73
  gem "ferrum"
32
74
  ```
33
75
 
76
+
77
+ ## Examples
78
+
34
79
  Navigate to a website and save a screenshot:
35
80
 
36
81
  ```ruby
@@ -45,9 +90,9 @@ Interact with a page:
45
90
  ```ruby
46
91
  browser = Ferrum::Browser.new
47
92
  browser.goto("https://google.com")
48
- input = browser.at_xpath("//div[@id='searchform']/form//input[@type='text']")
49
- input.focus.type("Ruby headless driver for Capybara", :Enter)
50
- browser.at_css("a > h3").text # => "machinio/cuprite: Headless Chrome driver for Capybara - GitHub"
93
+ input = browser.at_xpath("//input[@name='q']")
94
+ input.focus.type("Ruby headless driver for Chrome", :Enter)
95
+ browser.at_css("a > h3").text # => "rubycdp/ferrum: Ruby Chrome/Chromium driver - GitHub"
51
96
  browser.quit
52
97
  ```
53
98
 
@@ -82,7 +127,17 @@ browser.mouse
82
127
  browser.quit
83
128
  ```
84
129
 
85
- ## Customization ##
130
+
131
+ ## Docker
132
+
133
+ In docker as root you must pass the no-sandbox browser option:
134
+
135
+ ```ruby
136
+ Ferrum::Browser.new(browser_options: { 'no-sandbox': nil })
137
+ ```
138
+
139
+
140
+ ## Customization
86
141
 
87
142
  You can customize options with the following code in your test setup:
88
143
 
@@ -91,31 +146,40 @@ Ferrum::Browser.new(options)
91
146
  ```
92
147
 
93
148
  * options `Hash`
94
- * `:browser_path` (String) - Path to chrome binary, you can also set ENV
95
- variable as `BROWSER_PATH=some/path/chrome bundle exec rspec`.
96
149
  * `:headless` (Boolean) - Set browser as headless or not, `true` by default.
97
- * `:slowmo` (Integer | Float) - Set a delay to wait before sending command.
98
- Usefull companion of headless option, so that you have time to see changes.
150
+ * `:xvfb` (Boolean) - Run browser in a virtual framebuffer, `false` by default.
151
+ * `:window_size` (Array) - The dimensions of the browser window in which to
152
+ test, expressed as a 2-element array, e.g. [1024, 768]. Default: [1024, 768]
153
+ * `:extensions` (Array[String | Hash]) - An array of paths to files or JS
154
+ source code to be preloaded into the browser e.g.:
155
+ `["/path/to/script.js", { source: "window.secret = 'top'" }]`
99
156
  * `:logger` (Object responding to `puts`) - When present, debug output is
100
157
  written to this object.
158
+ * `:slowmo` (Integer | Float) - Set a delay to wait before sending command.
159
+ Usefull companion of headless option, so that you have time to see changes.
101
160
  * `:timeout` (Numeric) - The number of seconds we'll wait for a response when
102
161
  communicating with browser. Default is 5.
103
162
  * `:js_errors` (Boolean) - When true, JavaScript errors get re-raised in Ruby.
104
- * `:window_size` (Array) - The dimensions of the browser window in which to
105
- test, expressed as a 2-element array, e.g. [1024, 768]. Default: [1024, 768]
163
+ * `:browser_name` (Symbol) - `:chrome` by default, only experimental support
164
+ for `:firefox` for now.
165
+ * `:browser_path` (String) - Path to Chrome binary, you can also set ENV
166
+ variable as `BROWSER_PATH=some/path/chrome bundle exec rspec`.
106
167
  * `:browser_options` (Hash) - Additional command line options,
107
168
  [see them all](https://peter.sh/experiments/chromium-command-line-switches/)
108
169
  e.g. `{ "ignore-certificate-errors" => nil }`
109
- * `:extensions` (Array) - An array of JS files to be preloaded into the browser
170
+ * `:ignore_default_browser_options` (Boolean) - Ferrum has a number of default
171
+ options it passes to the browser, if you set this to `true` then only
172
+ options you put in `:browser_options` will be passed to the browser,
173
+ except required ones of course.
110
174
  * `:port` (Integer) - Remote debugging port for headless Chrome
111
175
  * `:host` (String) - Remote debugging address for headless Chrome
112
176
  * `:url` (String) - URL for a running instance of Chrome. If this is set, a
113
177
  browser process will not be spawned.
114
178
  * `:process_timeout` (Integer) - How long to wait for the Chrome process to
115
179
  respond on startup
116
-
117
-
118
- #### The API below is for master branch and a subject to change before 1.0
180
+ * `:ws_max_receive_size` (Integer) - How big messages to accept from Chrome
181
+ over the web socket, in bytes. Defaults to 64MB. Incoming messages larger
182
+ than this will cause a `Ferrum::DeadBrowserError`.
119
183
 
120
184
 
121
185
  ## Navigation
@@ -161,6 +225,15 @@ browser.goto("https://github.com/")
161
225
  browser.refresh
162
226
  ```
163
227
 
228
+ #### stop
229
+
230
+ Stop all navigations and loading pending resources on the page
231
+
232
+ ```ruby
233
+ browser.goto("https://github.com/")
234
+ browser.stop
235
+ ```
236
+
164
237
 
165
238
  ## Finders
166
239
 
@@ -340,6 +413,24 @@ browser.goto("https://github.com/")
340
413
  browser.network.status # => 200
341
414
  ```
342
415
 
416
+ #### wait_for_idle(\*\*options)
417
+
418
+ Waits for network idle or raises `Ferrum::TimeoutError` error
419
+
420
+ * options `Hash`
421
+ * :connections `Integer` how many connections are allowed for network to be
422
+ idling, `0` by default
423
+ * :duration `Float` sleep for given amount of time and check again, `0.05` by
424
+ default
425
+ * :timeout `Float` during what time we try to check idle, `browser.timeout`
426
+ by default
427
+
428
+ ```ruby
429
+ browser.goto("https://example.com/")
430
+ browser.at_xpath("//a[text() = 'No UI changes button']").click
431
+ browser.network.wait_for_idle
432
+ ```
433
+
343
434
  #### clear(type)
344
435
 
345
436
  Clear browser's cache or collected traffic.
@@ -515,6 +606,8 @@ Sets given values as cookie
515
606
  * :value `String`
516
607
  * :domain `String`
517
608
  * :expires `Integer`
609
+ * :samesite `String`
610
+ * :httponly `Boolean`
518
611
 
519
612
  ```ruby
520
613
  browser.cookies.set(name: "stealth", value: "omg", domain: "google.com") # => true
@@ -626,22 +719,163 @@ browser.add_script_tag(url: "http://example.com/stylesheet.css") # => true
626
719
 
627
720
  ```ruby
628
721
  browser.add_style_tag(content: "h1 { font-size: 40px; }") # => true
722
+
723
+ ```
724
+ #### bypass_csp(enabled) : `Boolean`
725
+
726
+ * enabled `Boolean`, `true` by default
727
+
728
+ ```ruby
729
+ browser.bypass_csp # => true
730
+ browser.goto("https://github.com/ruby-concurrency/concurrent-ruby/blob/master/docs-source/promises.in.md")
731
+ browser.refresh
732
+ browser.add_script_tag(content: "window.__injected = 42")
733
+ browser.evaluate("window.__injected") # => 42
629
734
  ```
630
735
 
631
736
 
632
737
  ## Frames
633
738
 
634
- #### frames
635
- #### main_frame
636
- #### frame_by
739
+ #### frames : `Array[Frame] | []`
740
+
741
+ Returns all the frames current page have.
742
+
743
+ ```ruby
744
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
745
+ browser.frames # =>
746
+ # [
747
+ # #<Ferrum::Frame @id="C6D104CE454A025FBCF22B98DE612B12" @parent_id=nil @name=nil @state=:stopped_loading @execution_id=1>,
748
+ # #<Ferrum::Frame @id="C09C4E4404314AAEAE85928EAC109A93" @parent_id="C6D104CE454A025FBCF22B98DE612B12" @state=:stopped_loading @execution_id=2>,
749
+ # #<Ferrum::Frame @id="2E9C7F476ED09D87A42F2FEE3C6FBC3C" @parent_id="C6D104CE454A025FBCF22B98DE612B12" @state=:stopped_loading @execution_id=3>,
750
+ # ...
751
+ # ]
752
+ ```
753
+
754
+ #### main_frame : `Frame`
755
+
756
+ Returns page's main frame, the top of the tree and the parent of all frames.
757
+
758
+ #### frame_by(\*\*options) : `Frame | nil`
759
+
760
+ Find frame by given options.
761
+
762
+ * options `Hash`
763
+ * :id `String` - Unique frame's id that browser provides
764
+ * :name `String` - Frame's name if there's one
765
+
766
+ ```ruby
767
+ browser.frame_by(id: "C6D104CE454A025FBCF22B98DE612B12")
768
+ ```
769
+
770
+
771
+ ## Frame
772
+
773
+ #### id : `String`
774
+
775
+ Frame's unique id.
637
776
 
638
- Play around inside given frame
777
+ #### parent_id : `String | nil`
778
+
779
+ Parent frame id if this one is nested in another one.
780
+
781
+ #### execution_id : `Integer`
782
+
783
+ Execution context id which is used by JS, each frame has it's own context in
784
+ which JS evaluates.
785
+
786
+ #### name : `String | nil`
787
+
788
+ If frame was given a name it should be here.
789
+
790
+ #### state : `Symbol | nil`
791
+
792
+ One of the states frame's in:
793
+
794
+ * `:started_loading`
795
+ * `:navigated`
796
+ * `:stopped_loading`
797
+
798
+ #### url : `String`
799
+
800
+ Returns current frame's location href.
801
+
802
+ ```ruby
803
+ browser.goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe")
804
+ frame = browser.frames[1]
805
+ frame.url # => https://interactive-examples.mdn.mozilla.net/pages/tabbed/iframe.html
806
+ ```
807
+
808
+ #### title
809
+
810
+ Returns current frame's title.
639
811
 
640
812
  ```ruby
641
813
  browser.goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe")
642
814
  frame = browser.frames[1]
643
- puts frame.title # => HTML Demo: <iframe>
644
- puts frame.url # => https://interactive-examples.mdn.mozilla.net/pages/tabbed/iframe.html
815
+ frame.title # => HTML Demo: <iframe>
816
+ ```
817
+
818
+ #### main? : `Boolean`
819
+
820
+ If current frame is the main frame of the page (top of the tree).
821
+
822
+ ```ruby
823
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
824
+ frame = browser.frame_by(id: "C09C4E4404314AAEAE85928EAC109A93")
825
+ frame.main? # => false
826
+ ```
827
+
828
+ #### current_url : `String`
829
+
830
+ Returns current frame's top window location href.
831
+
832
+ ```ruby
833
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
834
+ frame = browser.frame_by(id: "C09C4E4404314AAEAE85928EAC109A93")
835
+ frame.current_url # => "https://www.w3schools.com/tags/tag_frame.asp"
836
+ ```
837
+
838
+ #### current_title : `String`
839
+
840
+ Returns current frame's top window title.
841
+
842
+ ```ruby
843
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
844
+ frame = browser.frame_by(id: "C09C4E4404314AAEAE85928EAC109A93")
845
+ frame.current_title # => "HTML frame tag"
846
+ ```
847
+
848
+ #### body : `String`
849
+
850
+ Returns current frame's html.
851
+
852
+ ```ruby
853
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
854
+ frame = browser.frame_by(id: "C09C4E4404314AAEAE85928EAC109A93")
855
+ frame.body # => "<html><head></head><body></body></html>"
856
+ ```
857
+
858
+ #### doctype
859
+
860
+ Returns current frame's doctype.
861
+
862
+ ```ruby
863
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
864
+ browser.main_frame.doctype # => "<!DOCTYPE html>"
865
+ ```
866
+
867
+ #### set_content(html)
868
+
869
+ Sets a content of a given frame.
870
+
871
+ * html `String`
872
+
873
+ ```ruby
874
+ browser.goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe")
875
+ frame = browser.frames[1]
876
+ frame.body # <html lang="en"><head><style>body {transition: opacity ease-in 0.2s; }...
877
+ frame.set_content("<html><head></head><body><p>lol</p></body></html>")
878
+ frame.body # => <html><head></head><body><p>lol</p></body></html>
645
879
  ```
646
880
 
647
881
 
@@ -726,3 +960,27 @@ t2.join
726
960
 
727
961
  browser.quit
728
962
  ```
963
+
964
+
965
+ ## License
966
+
967
+ Copyright 2018-2020 Machinio
968
+
969
+ Permission is hereby granted, free of charge, to any person obtaining
970
+ a copy of this software and associated documentation files (the
971
+ "Software"), to deal in the Software without restriction, including
972
+ without limitation the rights to use, copy, modify, merge, publish,
973
+ distribute, sublicense, and/or sell copies of the Software, and to
974
+ permit persons to whom the Software is furnished to do so, subject to
975
+ the following conditions:
976
+
977
+ The above copyright notice and this permission notice shall be
978
+ included in all copies or substantial portions of the Software.
979
+
980
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
981
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
982
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
983
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
984
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
985
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
986
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -10,8 +10,14 @@ module Ferrum
10
10
  class NotImplementedError < Error; end
11
11
 
12
12
  class StatusError < Error
13
- def initialize(url)
14
- super("Request to #{url} failed to reach server, check DNS and/or server status")
13
+ def initialize(url, pendings = [])
14
+ message = if pendings.empty?
15
+ "Request to #{url} failed to reach server, check DNS and/or server status"
16
+ else
17
+ "Request to #{url} reached server, but there are still pending connections: #{pendings.join(', ')}"
18
+ end
19
+
20
+ super(message)
15
21
  end
16
22
  end
17
23
 
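Review note on the hunk above: `StatusError` now takes an optional list of pending connections and chooses its message accordingly. The following is a minimal standalone sketch of that behavior, not the gem's own file. It copies the new `initialize` body from the hunk into a hypothetical `FerrumSketch` namespace so it runs without Ferrum or Chrome installed, and the URLs are made up for illustration.

```ruby
# Standalone sketch: mirrors the StatusError change shown in the hunk above.
# FerrumSketch is a hypothetical namespace used so the gem is not required.
module FerrumSketch
  class Error < StandardError; end

  class StatusError < Error
    def initialize(url, pendings = [])
      # No pending connections: the request never reached the server.
      # Otherwise: the server answered, but some resources are still loading.
      message = if pendings.empty?
        "Request to #{url} failed to reach server, check DNS and/or server status"
      else
        "Request to #{url} reached server, but there are still pending connections: #{pendings.join(', ')}"
      end

      super(message)
    end
  end
end

# Illustrative URLs only.
unreachable = FerrumSketch::StatusError.new("https://example.com/")
still_loading = FerrumSketch::StatusError.new(
  "https://example.com/",
  ["https://example.com/a.js", "https://example.com/b.css"]
)

puts unreachable.message
puts still_loading.message
```

Because `pendings` defaults to an empty array, existing callers that pass only a URL keep getting the old message.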
@@ -30,12 +36,31 @@ module Ferrum
30
36
  end
31
37
  end
32
38
 
39
+ class ProcessTimeoutError < Error
40
+ def initialize(timeout)
41
+ super("Browser did not produce websocket url within #{timeout} seconds")
42
+ end
43
+ end
44
+
33
45
  class DeadBrowserError < Error
34
- def initialize(message = "Browser is dead")
46
+ def initialize(message = "Browser is dead or given window is closed")
35
47
  super
36
48
  end
37
49
  end
38
50
 
51
+ class NodeIsMovingError < Error
52
+ def initialize(node, prev, current)
53
+ @node, @prev, @current = node, prev, current
54
+ super(message)
55
+ end
56
+
57
+ def message
58
+ "#{@node.inspect} that you're trying to click is moving, hence " \
59
+ "we cannot. Previosuly it was at #{@prev.inspect} but now at " \
60
+ "#{@current.inspect}."
61
+ end
62
+ end
63
+
39
64
  class BrowserError < Error
40
65
  attr_reader :response
41
66
 
@@ -66,8 +91,8 @@ module Ferrum
66
91
  attr_reader :class_name, :message
67
92
 
68
93
  def initialize(response)
69
- super
70
94
  @class_name, @message = response.values_at("className", "description")
95
+ super(response.merge("message" => @message))
71
96
  end
72
97
  end
73
98
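
Review note on `NodeIsMovingError`, added earlier in this file: it assigns its instance variables before calling `super(message)`, so the overridden `message` method can already read them. Below is a minimal sketch of that pattern plus one way a caller might retry a click. `FerrumSketch`, the shortened message text, the symbol standing in for a node, and the retry loop are all assumptions for illustration and are not part of the gem.

```ruby
# Standalone sketch of the error pattern from the diff.
# FerrumSketch is a hypothetical namespace and the message text is simplified.
module FerrumSketch
  class Error < StandardError; end

  class NodeIsMovingError < Error
    def initialize(node, prev, current)
      # Instance variables must be set first: #message reads them.
      @node, @prev, @current = node, prev, current
      super(message)
    end

    def message
      "#{@node.inspect} is moving: was at #{@prev.inspect}, now at #{@current.inspect}"
    end
  end
end

error = FerrumSketch::NodeIsMovingError.new(:button, [10, 20], [10, 35])
puts error.message

# Hypothetical caller: retry while the node is still moving.
# The first two attempts simulate a moving node, the third succeeds.
attempts = 0
result = nil
begin
  attempts += 1
  raise FerrumSketch::NodeIsMovingError.new(:button, [10, 20], [10, 35]) if attempts < 3
  result = :clicked
rescue FerrumSketch::NodeIsMovingError
  retry
end

puts "clicked after #{attempts} attempts"
```

Real code would likely cap the number of retries or sleep between attempts. The loop here terminates only because the simulated node stops moving on the third try.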