metainspector 4.5.0 → 4.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: d8b2f4cf8526bd14a55d879334ff9bf14c95180f
4
- data.tar.gz: 0d39ceedb495d19a761fd7f6bcfdae767d1e1c26
3
+ metadata.gz: 6ee8411bb9ed926d53b27d0723f07ffb1ccf21d3
4
+ data.tar.gz: cb45f0aed52790578cc841f1caf35c933b290c80
5
5
  SHA512:
6
- metadata.gz: b9b8a345bb8f935bfe5a5fb74d4e86a92893c8d44066f87cdffbe029fc5746841c290c366fd94fc2a84edb73edbbf43c491189f6b82e754f4bc0c494eaed6591
7
- data.tar.gz: f559c11756c34406d5083a58c8cd80fdf5665fd79f4a204fdd22eee9eed6bbb65553c3a0b47f16962623ef4c4903adb74b6ac503893b0cb6ea66ee84513e85d4
6
+ metadata.gz: 7c64e3088e40204c2e32a6a00b2531828f5610fddbac8addb1f8032944562c25bd8fb7fb68c54149adc5cc279fbca2633901d4e927ff2a783c73df1eb083bb37
7
+ data.tar.gz: 72bf81478a0efd1cb57c22311aa13568f78cd57409f3dd54e3d32af46651e5cbed3b5d7e996d921b2b5de9d5454b92cda8d2bf271171aa97ff927dda44a7b1e4
data/CHANGELOG.md CHANGED
@@ -1,6 +1,19 @@
1
1
  # MetaInspector Changelog
2
2
 
3
- ## Changes in 4.5
3
+ ## [Changes in 4.6](https://github.com/jaimeiniesta/metainspector/compare/v4.5.0...v4.6.0)
4
+
5
+ Faraday can be passed options via `:faraday_options`. This is useful when we need to
6
+ customize the way we request the page, for example to disable SSL verification:
7
+
8
+ ```ruby
9
+ MetaInspector.new('https://example.com')
10
+ # Faraday::SSLError: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
11
+
12
+ MetaInspector.new('https://example.com', faraday_options: { ssl: { verify: false } })
13
+ # Now we can access the page
14
+ ```
15
+
16
+ ## [Changes in 4.5](https://github.com/jaimeiniesta/metainspector/compare/v4.4.0...v4.5.0)
4
17
 
5
18
  * The Document API now includes access to head/link elements
6
19
  * `page.head_links` returns an array of hashes of all head/links.
@@ -15,16 +28,16 @@
15
28
  * The images API has been extended:
16
29
  * `page.images.with_size` returns a sorted array (by descending area) of [image_url, width, height]
17
30
 
18
- ## Changes in 4.4
31
+ ## [Changes in 4.4](https://github.com/jaimeiniesta/metainspector/compare/v4.3.0...v4.4.0)
19
32
 
20
33
  The default headers now include `'Accept-Encoding' => 'identity'` to minimize trouble with servers that respond with malformed compressed responses, [as explained here](https://github.com/lostisland/faraday/issues/337).
21
34
 
22
- ## Changes in 4.3
35
+ ## [Changes in 4.3](https://github.com/jaimeiniesta/metainspector/compare/v4.2.0...v4.3.0)
23
36
 
24
37
  * The Document API has been extended with one new method `page.best_title` that returns the longest text available from a selection of candidates.
25
38
  * `to_hash` now includes `scheme`, `host`, `root_url`, `best_title` and `description`.
26
39
 
27
- ## Changes in 4.2
40
+ ## [Changes in 4.2](https://github.com/jaimeiniesta/metainspector/compare/v4.1.0...v4.2.0)
28
41
 
29
42
  * The images API has been extended, with two new methods:
30
43
 
@@ -33,11 +46,11 @@ The default headers now include `'Accept-Encoding' => 'identity'` to minimize tr
33
46
 
34
47
  * The criteria for `page.images.best` have changed slightly: we'll now return the largest image instead of the first image if no owner-suggested image is found.
35
48
 
36
- ## Changes in 4.1
49
+ ## [Changes in 4.1](https://github.com/jaimeiniesta/metainspector/compare/v4.0.0...v4.1.0)
37
50
 
38
51
  * Introduces the `:normalize_url` option, which allows disabling URL normalization.
39
52
 
40
- ## Changes in 4.0
53
+ ## [Changes in 4.0](https://github.com/jaimeiniesta/metainspector/compare/v3.0.0...v4.0.0)
41
54
 
42
55
  * The links API has been changed, now instead of `page.links`, `page.internal_links` and `page.external_links` we have:
43
56
 
@@ -56,7 +69,7 @@ page.links.external # Returns all external HTTP links found
56
69
 
57
70
  * You can now specify 2 different timeouts, `connection_timeout` and `read_timeout`, instead of the previous single `timeout`.
58
71
 
59
- ## Changes in 3.0
72
+ ## [Changes in 3.0](https://github.com/jaimeiniesta/metainspector/compare/v2.0.0...v3.0.0)
60
73
 
61
74
  * The redirect API has been changed, now the `:allow_redirections` option will expect only a boolean, which by default is `true`. That is, no more specifying `:safe`, `:unsafe` or `:all`.
62
75
  * We've dropped support for Ruby < 2.
data/README.md CHANGED
@@ -311,6 +311,21 @@ If you want to override the default headers then use the `headers` option:
311
311
  page = MetaInspector.new('example.com', :headers => {'User-Agent' => 'My custom User-Agent'})
312
312
  ```
313
313
 
314
+ ### Disabling SSL verification (or any other Faraday options)
315
+
316
+ Faraday can be passed options via `:faraday_options`.
317
+
318
+ This is useful when we need to
319
+ customize the way we request the page, for example to disable SSL verification:
320
+
321
+ ```ruby
322
+ MetaInspector.new('https://example.com')
323
+ # Faraday::SSLError: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
324
+
325
+ MetaInspector.new('https://example.com', faraday_options: { ssl: { verify: false } })
326
+ # Now we can access the page
327
+ ```
328
+
314
329
  ### HTML Content Only
315
330
 
316
331
  MetaInspector will try to parse all URLs by default. If you want to raise an exception when trying to parse a non-html URL (one that has a content-type different than text/html), you can state it like this:
@@ -18,6 +18,7 @@ module MetaInspector
18
18
  # Can be :warn, :raise or nil
19
19
  # * headers: object containing custom headers for the request
20
20
  # * normalize_url: true by default
21
+ # * faraday_options: an optional hash of options to pass to Faraday on the request
21
22
  def initialize(initial_url, options = {})
22
23
  options = defaults.merge(options)
23
24
  @connection_timeout = options[:connection_timeout]
@@ -31,6 +32,7 @@ module MetaInspector
31
32
  @warn_level = options[:warn_level]
32
33
  @exception_log = options[:exception_log] || MetaInspector::ExceptionLog.new(warn_level: warn_level)
33
34
  @normalize_url = options[:normalize_url]
35
+ @faraday_options = options[:faraday_options]
34
36
  @url = MetaInspector::URL.new(initial_url, exception_log: @exception_log,
35
37
  normalize: @normalize_url)
36
38
  @request = MetaInspector::Request.new(@url, allow_redirections: @allow_redirections,
@@ -38,7 +40,8 @@ module MetaInspector
38
40
  read_timeout: @read_timeout,
39
41
  retries: @retries,
40
42
  exception_log: @exception_log,
41
- headers: @headers) unless @document
43
+ headers: @headers,
44
+ faraday_options: @faraday_options) unless @document
42
45
  @parser = MetaInspector::Parser.new(self, exception_log: @exception_log,
43
46
  download_images: @download_images)
44
47
  end
@@ -17,6 +17,7 @@ module MetaInspector
17
17
  @retries = options[:retries]
18
18
  @exception_log = options[:exception_log]
19
19
  @headers = options[:headers]
20
+ @faraday_options = options[:faraday_options] || {}
20
21
 
21
22
  response # request early so we can fail early
22
23
  end
@@ -44,7 +45,9 @@ module MetaInspector
44
45
  private
45
46
 
46
47
  def fetch
47
- session = Faraday.new(:url => url) do |faraday|
48
+ @faraday_options.merge!(:url => url)
49
+
50
+ session = Faraday.new(@faraday_options) do |faraday|
48
51
  faraday.request :retry, max: @retries
49
52
 
50
53
  if @allow_redirections
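
A note on the `fetch` hunk above: `merge!` updates `@faraday_options` in place, and as far as this diff shows, that is the same hash object the caller passed as `faraday_options:`, so the caller's hash would also gain a `:url` key. The following is a minimal sketch in plain Ruby (no Faraday required) of a non-mutating variant. `build_connection_options` is a hypothetical helper introduced only for illustration; it is not part of MetaInspector.

```ruby
# Hypothetical helper (an assumption, not MetaInspector code): builds the hash
# that would be handed to Faraday.new without touching the caller's options.
def build_connection_options(faraday_options, url)
  # Hash#merge returns a new hash; keys in the argument win on conflict,
  # so the request URL always overrides any :url supplied by the caller.
  (faraday_options || {}).merge(url: url)
end

# Frozen to show that the helper never mutates its input.
user_options = { ssl: { verify: false } }.freeze

connection_options = build_connection_options(user_options, 'https://example.com')

# For contrast, merge! (as used in the hunk above) changes the receiver itself.
shared_options = { ssl: { verify: false } }
shared_options.merge!(url: 'https://example.com')

p connection_options
p user_options
p shared_options
```

With the non-mutating form, the same options hash can be reused across several requests without carrying over a `:url` from an earlier one. Whether that matters depends on how callers reuse their option hashes.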
@@ -1,3 +1,3 @@
1
1
  module MetaInspector
2
- VERSION = '4.5.0'
2
+ VERSION = '4.6.0'
3
3
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: metainspector
3
3
  version: !ruby/object:Gem::Version
4
- version: 4.5.0
4
+ version: 4.6.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Jaime Iniesta
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-05-29 00:00:00.000000000 Z
11
+ date: 2015-06-11 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: nokogiri