ferrum 0.6 → 0.9

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4f657c088b22a9ff5809b6ecd0d7005cd0c2252645b262e76faee1b6e3af94c8
- data.tar.gz: d22842188425c753e0a0e113be52e0176e8a7e1a97211c44bf3016e90aa0d1d4
+ metadata.gz: 8b4d6dc7aa1827fbf559e6025b82d29d15ed0e36a89793049266d8049fadabb9
+ data.tar.gz: 6fab0202e85a17971d613db12e37a7ef85325eaf23f718b6801812df565ac64c
  SHA512:
- metadata.gz: a519e80c6b1450ba85c555d4c5eb547ffb62f129e7873515ba2f5bad4726548289bcc3b4db95932873620e4b645f2f9b7f12c8bd0fdbe0ddada4f0faae080720
- data.tar.gz: 91a8f95433f9630dd0f2b36c42ea14bf28b49ee9504738c664399eeb7fc3d192bd26c14ae7de707dcaa8273ec65d7b1a8e9bcaeeff4287392a5e1b312ea03b0d
+ metadata.gz: fb109c1b65e73e8d0088734fa004fae3d15650121d01b44dc9086cdb193bb3c7f56c3ad911a2f6de5cd28522231b4e695b67eb515d90afe93b862eb64cfe7050
+ data.tar.gz: a4d5e4c192cbd634c640d86027f3a8faeaf42efcd8fbaa9b0e8c6e81c04aa9921b7f2353eb739c35850363f1ed58f84d5895714f52ecd333d9ebdb50c1de4031
data/README.md CHANGED
@@ -1,36 +1,81 @@
1
- # Ferrum - fearless Ruby Chrome driver
1
+ # Ferrum - high-level API to control Chrome in Ruby
2
2
 
3
- [![Build Status](https://travis-ci.org/route/ferrum.svg?branch=master)](https://travis-ci.org/route/ferrum)
3
+ [![Build Status](https://travis-ci.org/rubycdp/ferrum.svg?branch=master)](https://travis-ci.org/rubycdp/ferrum)
4
4
 
5
- <img align="right" width="95" height="95"
5
+ <img align="right"
6
+ width="320" height="241"
6
7
  alt="Ferrum logo"
7
- src="https://raw.githubusercontent.com/route/ferrum/master/logo.svg?sanitize=true">
8
+ src="https://raw.githubusercontent.com/rubycdp/ferrum/master/logo.svg?sanitize=true">
9
+
10
+ #### As simple as Puppeteer, though even simpler.
11
+
12
+ It is a clean, high-level Ruby API to Chrome. Runs headless by default, but you
13
+ can configure it to run in a headful mode. All you need is Ruby and
14
+ [Chrome](https://www.google.com/chrome/) or
15
+ [Chromium](https://www.chromium.org/). Ferrum connects to the browser by [CDP
16
+ protocol](https://chromedevtools.github.io/devtools-protocol/) and there's _no_
17
+ Selenium/WebDriver/ChromeDriver dependency. The emphasis is on raw CDP
+ because Chrome allows you to do many things that WebDriver barely supports,
+ since WebDriver has to keep a consistent design across browsers.
21
+
22
+ * [Cuprite](https://github.com/rubycdp/cuprite) is a pure Ruby driver for
23
+ [Capybara](https://github.com/teamcapybara/capybara) based on Ferrum. If you are
24
+ going to crawl sites, you'd better use Ferrum or
25
+ [Vessel](https://github.com/rubycdp/vessel) because you crawl, not test.
26
+
27
+ * [Vessel](https://github.com/rubycdp/vessel) is a high-level web crawling framework
28
+ based on Ferrum. It looks like [Scrapy](https://scrapy.org/) except that it uses
29
+ a real browser in order to grab data.
30
+
31
+ Web design by [Evrone](https://evrone.com/), what else
32
+ [we build with Ruby on Rails](https://evrone.com/ruby), what else
33
+ [we do at Evrone](https://evrone.com/cases#case-studies).
34
+
35
+ If you like this project, please consider
+ _[becoming a backer](https://www.patreon.com/rubycdp_ferrum)_ on Patreon.
37
+
38
+
39
+ ## Index
40
+
41
+ * [Install](https://github.com/rubycdp/ferrum#install)
42
+ * [Examples](https://github.com/rubycdp/ferrum#examples)
43
+ * [Docker](https://github.com/rubycdp/ferrum#docker)
44
+ * [Customization](https://github.com/rubycdp/ferrum#customization)
45
+ * [Navigation](https://github.com/rubycdp/ferrum#navigation)
46
+ * [Finders](https://github.com/rubycdp/ferrum#finders)
47
+ * [Screenshots](https://github.com/rubycdp/ferrum#screenshots)
48
+ * [Network](https://github.com/rubycdp/ferrum#network)
49
+ * [Mouse](https://github.com/rubycdp/ferrum#mouse)
50
+ * [Keyboard](https://github.com/rubycdp/ferrum#keyboard)
51
+ * [Cookies](https://github.com/rubycdp/ferrum#cookies)
52
+ * [Headers](https://github.com/rubycdp/ferrum#headers)
53
+ * [JavaScript](https://github.com/rubycdp/ferrum#javascript)
54
+ * [Frames](https://github.com/rubycdp/ferrum#frames)
55
+ * [Frame](https://github.com/rubycdp/ferrum#frame)
56
+ * [Dialog](https://github.com/rubycdp/ferrum#dialog)
57
+ * [Thread safety](https://github.com/rubycdp/ferrum#thread-safety)
58
+ * [License](https://github.com/rubycdp/ferrum#license)
8
59
 
9
- As simple as Puppeteer, though even simpler.
10
-
11
- It is Ruby clean and high-level API to Chrome. Runs headless by default,
12
- but you can configure it to run in a non-headless mode. All you need is Ruby and
13
- Chrome/Chromium. Ferrum connects to the browser via DevTools Protocol.
14
-
15
- Relation to [Cuprite](https://github.com/machinio/cuprite). Cuprite used to have
16
- this code inside in one form or another but the thing is you don't need capybara
17
- if you are going to crawl sites. You crawl, not test. Besides that clean
18
- lightweight API to browser is what Ruby was missing, so here it comes.
19
60
 
20
61
  ## Install
21
62
 
22
63
  There's no official Chrome or Chromium package for Linux; don't install it this
23
- way because it either will be outdated or unofficial, both are bad. Download it
24
- from official [source](https://www.chromium.org/getting-involved/download-chromium).
64
+ way because it's either outdated or unofficial, both are bad. Download it from
65
+ official [source](https://www.chromium.org/getting-involved/download-chromium).
25
66
  Chrome binary should be in the `PATH` or `BROWSER_PATH` or you can pass it as an
26
- option to browser instance `:browser_path`.
67
+ option to the browser instance; see `:browser_path` in
68
+ [Customization](https://github.com/rubycdp/ferrum#customization).
27
69
 
28
- Add this to your Gemfile:
70
+ Add this to your `Gemfile` and run `bundle install`.
29
71
 
30
72
  ``` ruby
31
73
  gem "ferrum"
32
74
  ```
33
75
 
76
+
77
+ ## Examples
78
+
34
79
  Navigate to a website and save a screenshot:
35
80
 
36
81
  ```ruby
@@ -45,9 +90,9 @@ Interact with a page:
45
90
  ```ruby
46
91
  browser = Ferrum::Browser.new
47
92
  browser.goto("https://google.com")
48
- input = browser.at_xpath("//div[@id='searchform']/form//input[@type='text']")
49
- input.focus.type("Ruby headless driver for Capybara", :Enter)
50
- browser.at_css("a > h3").text # => "machinio/cuprite: Headless Chrome driver for Capybara - GitHub"
93
+ input = browser.at_xpath("//input[@name='q']")
94
+ input.focus.type("Ruby headless driver for Chrome", :Enter)
95
+ browser.at_css("a > h3").text # => "rubycdp/ferrum: Ruby Chrome/Chromium driver - GitHub"
51
96
  browser.quit
52
97
  ```
53
98
 
@@ -82,7 +127,17 @@ browser.mouse
82
127
  browser.quit
83
128
  ```
84
129
 
85
- ## Customization ##
130
+
131
+ ## Docker
132
+
133
+ In Docker, when running as root, you must pass the `no-sandbox` browser option:
134
+
135
+ ```ruby
136
+ Ferrum::Browser.new(browser_options: { 'no-sandbox': nil })
137
+ ```
138
+
139
+
140
+ ## Customization
86
141
 
87
142
  You can customize options with the following code in your test setup:
88
143
 
@@ -91,31 +146,40 @@ Ferrum::Browser.new(options)
91
146
  ```
92
147
 
93
148
  * options `Hash`
94
- * `:browser_path` (String) - Path to chrome binary, you can also set ENV
95
- variable as `BROWSER_PATH=some/path/chrome bundle exec rspec`.
96
149
  * `:headless` (Boolean) - Set browser as headless or not, `true` by default.
97
- * `:slowmo` (Integer | Float) - Set a delay to wait before sending command.
98
- Usefull companion of headless option, so that you have time to see changes.
150
+ * `:xvfb` (Boolean) - Run browser in a virtual framebuffer, `false` by default.
151
+ * `:window_size` (Array) - The dimensions of the browser window in which to
152
+ test, expressed as a 2-element array, e.g. [1024, 768]. Default: [1024, 768]
153
+ * `:extensions` (Array[String | Hash]) - An array of paths to files or JS
154
+ source code to be preloaded into the browser e.g.:
155
+ `["/path/to/script.js", { source: "window.secret = 'top'" }]`
99
156
  * `:logger` (Object responding to `puts`) - When present, debug output is
100
157
  written to this object.
158
+ * `:slowmo` (Integer | Float) - Set a delay to wait before sending command.
159
+ Useful companion of headless option, so that you have time to see changes.
101
160
  * `:timeout` (Numeric) - The number of seconds we'll wait for a response when
102
161
  communicating with browser. Default is 5.
103
162
  * `:js_errors` (Boolean) - When true, JavaScript errors get re-raised in Ruby.
104
- * `:window_size` (Array) - The dimensions of the browser window in which to
105
- test, expressed as a 2-element array, e.g. [1024, 768]. Default: [1024, 768]
163
+ * `:browser_name` (Symbol) - `:chrome` by default, only experimental support
164
+ for `:firefox` for now.
165
+ * `:browser_path` (String) - Path to Chrome binary, you can also set ENV
166
+ variable as `BROWSER_PATH=some/path/chrome bundle exec rspec`.
106
167
  * `:browser_options` (Hash) - Additional command line options,
107
168
  [see them all](https://peter.sh/experiments/chromium-command-line-switches/)
108
169
  e.g. `{ "ignore-certificate-errors" => nil }`
109
- * `:extensions` (Array) - An array of JS files to be preloaded into the browser
170
+ * `:ignore_default_browser_options` (Boolean) - Ferrum has a number of default
171
+ options it passes to the browser, if you set this to `true` then only
172
+ options you put in `:browser_options` will be passed to the browser,
173
+ except required ones of course.
110
174
  * `:port` (Integer) - Remote debugging port for headless Chrome
111
175
  * `:host` (String) - Remote debugging address for headless Chrome
112
176
  * `:url` (String) - URL for a running instance of Chrome. If this is set, a
113
177
  browser process will not be spawned.
114
178
  * `:process_timeout` (Integer) - How long to wait for the Chrome process to
115
179
  respond on startup
116
-
117
-
118
- #### The API below is for master branch and a subject to change before 1.0
180
+ * `:ws_max_receive_size` (Integer) - Maximum size, in bytes, of a message
+ accepted from Chrome over the WebSocket. Defaults to 64MB. Incoming messages larger
182
+ than this will cause a `Ferrum::DeadBrowserError`.
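
As a rough illustration of how a `:browser_options` hash relates to Chrome command-line switches, the sketch below maps each key to a `--key` or `--key=value` flag. The helper name `to_switches` is hypothetical and this is not Ferrum's own implementation; it only assumes that a `nil` value stands for a switch without an argument, as in the `ignore-certificate-errors` example above.

```ruby
# Hypothetical helper, not part of Ferrum: turns an options hash into
# Chrome-style command-line switches.
def to_switches(browser_options)
  browser_options.map do |name, value|
    # Assumption: a nil value means the switch takes no argument.
    value.nil? ? "--#{name}" : "--#{name}=#{value}"
  end
end

switches = to_switches({ "ignore-certificate-errors" => nil, "window-size" => "1024,768" })
puts switches
# --ignore-certificate-errors
# --window-size=1024,768
```

Symbol keys such as `'no-sandbox': nil` work the same way in this sketch, because string interpolation calls `to_s` on the key.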
119
183
 
120
184
 
121
185
  ## Navigation
@@ -161,6 +225,15 @@ browser.goto("https://github.com/")
161
225
  browser.refresh
162
226
  ```
163
227
 
228
+ #### stop
229
+
230
+ Stop all navigations and the loading of pending resources on the page
231
+
232
+ ```ruby
233
+ browser.goto("https://github.com/")
234
+ browser.stop
235
+ ```
236
+
164
237
 
165
238
  ## Finders
166
239
 
@@ -340,6 +413,24 @@ browser.goto("https://github.com/")
340
413
  browser.network.status # => 200
341
414
  ```
342
415
 
416
+ #### wait_for_idle(\*\*options)
417
+
418
+ Waits for the network to become idle or raises `Ferrum::TimeoutError`
419
+
420
+ * options `Hash`
421
+ * :connections `Integer` how many connections are allowed for network to be
422
+ idling, `0` by default
423
+ * :duration `Float` sleep for given amount of time and check again, `0.05` by
424
+ default
425
+ * :timeout `Float` how long to keep checking for idle, `browser.timeout`
+ by default
427
+
428
+ ```ruby
429
+ browser.goto("https://example.com/")
430
+ browser.at_xpath("//a[text() = 'No UI changes button']").click
431
+ browser.network.wait_for_idle
432
+ ```
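
To make the three options concrete, here is a minimal polling sketch that runs without a browser. It is not Ferrum's implementation: `wait_for_idle_sketch`, `SketchTimeoutError` and the `pending_count` callable are hypothetical names, and returning `true` is a choice made for the sketch only.

```ruby
# Stand-in for Ferrum::TimeoutError so the sketch runs without the gem.
class SketchTimeoutError < StandardError; end

# pending_count is any callable returning the number of in-flight
# connections (an assumption made for this sketch).
def wait_for_idle_sketch(pending_count, connections: 0, duration: 0.05, timeout: 5)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  # Idle means: no more than `connections` requests are still pending.
  until pending_count.call <= connections
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    raise SketchTimeoutError, "network is still busy after #{timeout}s" if elapsed > timeout

    sleep(duration) # wait a little, then check again
  end

  true
end

# Simulated traffic: three requests, one of which finishes on every check.
remaining = 3
pending = -> { remaining -= 1 if remaining > 0; remaining }
puts wait_for_idle_sketch(pending, duration: 0.01, timeout: 1)
# true
```

A larger `:connections` value lets the wait finish while some long-lived requests are still open.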
433
+
343
434
  #### clear(type)
344
435
 
345
436
  Clear browser's cache or collected traffic.
@@ -515,6 +606,8 @@ Sets given values as cookie
515
606
  * :value `String`
516
607
  * :domain `String`
517
608
  * :expires `Integer`
609
+ * :samesite `String`
610
+ * :httponly `Boolean`
518
611
 
519
612
  ```ruby
520
613
  browser.cookies.set(name: "stealth", value: "omg", domain: "google.com") # => true
@@ -626,22 +719,163 @@ browser.add_script_tag(url: "http://example.com/stylesheet.css") # => true
626
719
 
627
720
  ```ruby
628
721
  browser.add_style_tag(content: "h1 { font-size: 40px; }") # => true
722
+ ```
+
+ #### bypass_csp(enabled) : `Boolean`
725
+
726
+ * enabled `Boolean`, `true` by default
727
+
728
+ ```ruby
729
+ browser.bypass_csp # => true
730
+ browser.goto("https://github.com/ruby-concurrency/concurrent-ruby/blob/master/docs-source/promises.in.md")
731
+ browser.refresh
732
+ browser.add_script_tag(content: "window.__injected = 42")
733
+ browser.evaluate("window.__injected") # => 42
629
734
  ```
630
735
 
631
736
 
632
737
  ## Frames
633
738
 
634
- #### frames
635
- #### main_frame
636
- #### frame_by
739
+ #### frames : `Array[Frame] | []`
740
+
741
+ Returns all the frames the current page has.
742
+
743
+ ```ruby
744
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
745
+ browser.frames # =>
746
+ # [
747
+ # #<Ferrum::Frame @id="C6D104CE454A025FBCF22B98DE612B12" @parent_id=nil @name=nil @state=:stopped_loading @execution_id=1>,
748
+ # #<Ferrum::Frame @id="C09C4E4404314AAEAE85928EAC109A93" @parent_id="C6D104CE454A025FBCF22B98DE612B12" @state=:stopped_loading @execution_id=2>,
749
+ # #<Ferrum::Frame @id="2E9C7F476ED09D87A42F2FEE3C6FBC3C" @parent_id="C6D104CE454A025FBCF22B98DE612B12" @state=:stopped_loading @execution_id=3>,
750
+ # ...
751
+ # ]
752
+ ```
753
+
754
+ #### main_frame : `Frame`
755
+
756
+ Returns page's main frame, the top of the tree and the parent of all frames.
757
+
758
+ #### frame_by(\*\*options) : `Frame | nil`
759
+
760
+ Find frame by given options.
761
+
762
+ * options `Hash`
763
+ * :id `String` - Unique frame's id that browser provides
764
+ * :name `String` - Frame's name if there's one
765
+
766
+ ```ruby
767
+ browser.frame_by(id: "C6D104CE454A025FBCF22B98DE612B12")
768
+ ```
769
+
770
+
771
+ ## Frame
772
+
773
+ #### id : `String`
774
+
775
+ Frame's unique id.
637
776
 
638
- Play around inside given frame
777
+ #### parent_id : `String | nil`
778
+
779
+ Parent frame id if this one is nested in another one.
780
+
781
+ #### execution_id : `Integer`
782
+
783
+ Execution context id which is used by JS, each frame has its own context in
784
+ which JS evaluates.
785
+
786
+ #### name : `String | nil`
787
+
788
+ If frame was given a name it should be here.
789
+
790
+ #### state : `Symbol | nil`
791
+
792
+ One of the states the frame can be in:
793
+
794
+ * `:started_loading`
795
+ * `:navigated`
796
+ * `:stopped_loading`
797
+
798
+ #### url : `String`
799
+
800
+ Returns current frame's location href.
801
+
802
+ ```ruby
803
+ browser.goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe")
804
+ frame = browser.frames[1]
805
+ frame.url # => https://interactive-examples.mdn.mozilla.net/pages/tabbed/iframe.html
806
+ ```
807
+
808
+ #### title
809
+
810
+ Returns current frame's title.
639
811
 
640
812
  ```ruby
641
813
  browser.goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe")
642
814
  frame = browser.frames[1]
643
- puts frame.title # => HTML Demo: <iframe>
644
- puts frame.url # => https://interactive-examples.mdn.mozilla.net/pages/tabbed/iframe.html
815
+ frame.title # => HTML Demo: <iframe>
816
+ ```
817
+
818
+ #### main? : `Boolean`
819
+
820
+ Whether the current frame is the main frame of the page (top of the tree).
821
+
822
+ ```ruby
823
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
824
+ frame = browser.frame_by(id: "C09C4E4404314AAEAE85928EAC109A93")
825
+ frame.main? # => false
826
+ ```
827
+
828
+ #### current_url : `String`
829
+
830
+ Returns current frame's top window location href.
831
+
832
+ ```ruby
833
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
834
+ frame = browser.frame_by(id: "C09C4E4404314AAEAE85928EAC109A93")
835
+ frame.current_url # => "https://www.w3schools.com/tags/tag_frame.asp"
836
+ ```
837
+
838
+ #### current_title : `String`
839
+
840
+ Returns current frame's top window title.
841
+
842
+ ```ruby
843
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
844
+ frame = browser.frame_by(id: "C09C4E4404314AAEAE85928EAC109A93")
845
+ frame.current_title # => "HTML frame tag"
846
+ ```
847
+
848
+ #### body : `String`
849
+
850
+ Returns current frame's html.
851
+
852
+ ```ruby
853
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
854
+ frame = browser.frame_by(id: "C09C4E4404314AAEAE85928EAC109A93")
855
+ frame.body # => "<html><head></head><body></body></html>"
856
+ ```
857
+
858
+ #### doctype
859
+
860
+ Returns current frame's doctype.
861
+
862
+ ```ruby
863
+ browser.goto("https://www.w3schools.com/tags/tag_frame.asp")
864
+ browser.main_frame.doctype # => "<!DOCTYPE html>"
865
+ ```
866
+
867
+ #### set_content(html)
868
+
869
+ Sets the content of the given frame.
870
+
871
+ * html `String`
872
+
873
+ ```ruby
874
+ browser.goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe")
875
+ frame = browser.frames[1]
876
+ frame.body # => <html lang="en"><head><style>body {transition: opacity ease-in 0.2s; }...
877
+ frame.set_content("<html><head></head><body><p>lol</p></body></html>")
878
+ frame.body # => <html><head></head><body><p>lol</p></body></html>
645
879
  ```
646
880
 
647
881
 
@@ -726,3 +960,27 @@ t2.join
726
960
 
727
961
  browser.quit
728
962
  ```
963
+
964
+
965
+ ## License
966
+
967
+ Copyright 2018-2020 Machinio
968
+
969
+ Permission is hereby granted, free of charge, to any person obtaining
970
+ a copy of this software and associated documentation files (the
971
+ "Software"), to deal in the Software without restriction, including
972
+ without limitation the rights to use, copy, modify, merge, publish,
973
+ distribute, sublicense, and/or sell copies of the Software, and to
974
+ permit persons to whom the Software is furnished to do so, subject to
975
+ the following conditions:
976
+
977
+ The above copyright notice and this permission notice shall be
978
+ included in all copies or substantial portions of the Software.
979
+
980
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
981
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
982
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
983
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
984
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
985
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
986
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -10,8 +10,14 @@ module Ferrum
   class NotImplementedError < Error; end
 
   class StatusError < Error
-    def initialize(url)
-      super("Request to #{url} failed to reach server, check DNS and/or server status")
+    def initialize(url, pendings = [])
+      message = if pendings.empty?
+        "Request to #{url} failed to reach server, check DNS and/or server status"
+      else
+        "Request to #{url} reached server, but there are still pending connections: #{pendings.join(', ')}"
+      end
+
+      super(message)
     end
   end
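
The effect of the new `pendings` argument can be seen in a standalone sketch. The class body is copied from the hunk above; the `StatusErrorSketch` module and the `Error < StandardError` base class are assumptions added so the snippet runs without the gem.

```ruby
module StatusErrorSketch
  # Assumed base class; the hunk only shows that StatusError inherits from Error.
  class Error < StandardError; end

  class StatusError < Error
    def initialize(url, pendings = [])
      message = if pendings.empty?
        "Request to #{url} failed to reach server, check DNS and/or server status"
      else
        "Request to #{url} reached server, but there are still pending connections: #{pendings.join(', ')}"
      end

      super(message)
    end
  end
end

# Without pending connections the original 0.6 wording is kept.
puts StatusErrorSketch::StatusError.new("https://example.com/").message

# With pending connections the message lists them, comma-separated.
puts StatusErrorSketch::StatusError.new(
  "https://example.com/", ["https://example.com/app.js", "https://example.com/app.css"]
).message
```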
 
@@ -30,12 +36,31 @@ module Ferrum
     end
   end
 
+  class ProcessTimeoutError < Error
+    def initialize(timeout)
+      super("Browser did not produce websocket url within #{timeout} seconds")
+    end
+  end
+
   class DeadBrowserError < Error
-    def initialize(message = "Browser is dead")
+    def initialize(message = "Browser is dead or given window is closed")
       super
     end
   end
 
+  class NodeIsMovingError < Error
+    def initialize(node, prev, current)
+      @node, @prev, @current = node, prev, current
+      super(message)
+    end
+
+    def message
+      "#{@node.inspect} that you're trying to click is moving, hence " \
+      "we cannot. Previously it was at #{@prev.inspect} but now at " \
+      "#{@current.inspect}."
+    end
+  end
+
   class BrowserError < Error
     attr_reader :response
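
One detail of `NodeIsMovingError` is easy to miss: the instance variables are assigned before `super(message)` because the overridden `message` method reads them. A minimal standalone sketch of that pattern follows; the `MovingNodeSketch` module, the `Error < StandardError` base class and the sample arguments are assumptions for illustration only.

```ruby
module MovingNodeSketch
  # Assumed base class; the hunk only shows inheritance from Error.
  class Error < StandardError; end

  class NodeIsMovingError < Error
    def initialize(node, prev, current)
      # Assign first: #message below depends on these instance variables.
      @node, @prev, @current = node, prev, current
      super(message)
    end

    def message
      "#{@node.inspect} that you're trying to click is moving, hence " \
        "we cannot. Previously it was at #{@prev.inspect} but now at " \
        "#{@current.inspect}."
    end
  end
end

# Illustrative values: a node label and two [x, y] positions.
error = MovingNodeSketch::NodeIsMovingError.new("save-button", [10, 20], [10, 35])
puts error.message
# "save-button" that you're trying to click is moving, hence we cannot. Previously it was at [10, 20] but now at [10, 35].
```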
 
@@ -66,8 +91,8 @@ module Ferrum
     attr_reader :class_name, :message
 
     def initialize(response)
-      super
       @class_name, @message = response.values_at("className", "description")
+      super(response.merge("message" => @message))
     end
   end