metainspector 4.5.0 → 4.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: d8b2f4cf8526bd14a55d879334ff9bf14c95180f
- data.tar.gz: 0d39ceedb495d19a761fd7f6bcfdae767d1e1c26
+ metadata.gz: 6ee8411bb9ed926d53b27d0723f07ffb1ccf21d3
+ data.tar.gz: cb45f0aed52790578cc841f1caf35c933b290c80
  SHA512:
- metadata.gz: b9b8a345bb8f935bfe5a5fb74d4e86a92893c8d44066f87cdffbe029fc5746841c290c366fd94fc2a84edb73edbbf43c491189f6b82e754f4bc0c494eaed6591
- data.tar.gz: f559c11756c34406d5083a58c8cd80fdf5665fd79f4a204fdd22eee9eed6bbb65553c3a0b47f16962623ef4c4903adb74b6ac503893b0cb6ea66ee84513e85d4
+ metadata.gz: 7c64e3088e40204c2e32a6a00b2531828f5610fddbac8addb1f8032944562c25bd8fb7fb68c54149adc5cc279fbca2633901d4e927ff2a783c73df1eb083bb37
+ data.tar.gz: 72bf81478a0efd1cb57c22311aa13568f78cd57409f3dd54e3d32af46651e5cbed3b5d7e996d921b2b5de9d5454b92cda8d2bf271171aa97ff927dda44a7b1e4
data/CHANGELOG.md CHANGED
@@ -1,6 +1,19 @@
  # MetaInspector Changelog
 
- ## Changes in 4.5
+ ## [Changes in 4.6](https://github.com/jaimeiniesta/metainspector/compare/v4.5.0...v4.6.0)
+
+ Faraday can be passed options via `:faraday_options`. This is useful when we need to
+ customize the way we request the page, for example to disable SSL verification:
+
+ ```ruby
+ MetaInspector.new('https://example.com')
+ # Faraday::SSLError: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
+
+ MetaInspector.new('https://example.com', faraday_options: { ssl: { verify: false } })
+ # Now we can access the page
+ ```
+
+ ## [Changes in 4.5](https://github.com/jaimeiniesta/metainspector/compare/v4.4.0...v4.5.0)
 
  * The Document API now includes access to head/link elements
  * `page.head_links` returns an array of hashes of all head/links.
@@ -15,16 +28,16 @@
  * The images API has been extended:
  * `page.images.with_size` returns a sorted array (by descending area) of [image_url, width, height]
 
- ## Changes in 4.4
+ ## [Changes in 4.4](https://github.com/jaimeiniesta/metainspector/compare/v4.3.0...v4.4.0)
 
  The default headers now include `'Accept-Encoding' => 'identity'` to minimize trouble with servers that respond with malformed compressed responses, [as explained here](https://github.com/lostisland/faraday/issues/337).
 
- ## Changes in 4.3
+ ## [Changes in 4.3](https://github.com/jaimeiniesta/metainspector/compare/v4.2.0...v4.3.0)
 
  * The Document API has been extended with one new method `page.best_title` that returns the longest text available from a selection of candidates.
  * `to_hash` now includes `scheme`, `host`, `root_url`, `best_title` and `description`.
 
- ## Changes in 4.2
+ ## [Changes in 4.2](https://github.com/jaimeiniesta/metainspector/compare/v4.1.0...v4.2.0)
 
  * The images API has been extended, with two new methods:
 
@@ -33,11 +46,11 @@ The default headers now include `'Accept-Encoding' => 'identity'` to minimize tr
 
  * The criteria for `page.images.best` has changed slightly, we'll now return the largest image instead of the first image if no owner-suggested image is found.
 
- ## Changes in 4.1
+ ## [Changes in 4.1](https://github.com/jaimeiniesta/metainspector/compare/v4.0.0...v4.1.0)
 
  * Introduces the `:normalize_url` option, which allows to disable URL normalization.
 
- ## Changes in 4.0
+ ## [Changes in 4.0](https://github.com/jaimeiniesta/metainspector/compare/v3.0.0...v4.0.0)
 
  * The links API has been changed, now instead of `page.links`, `page.internal_links` and `page.external_links` we have:
 
@@ -56,7 +69,7 @@ page.links.external # Returns all external HTTP links found
 
  * You can now specify 2 different timeouts, `connection_timeout` and `read_timeout`, instead of the previous single `timeout`.
 
- ## Changes in 3.0
+ ## [Changes in 3.0](https://github.com/jaimeiniesta/metainspector/compare/v2.0.0...v3.0.0)
 
  * The redirect API has been changed, now the `:allow_redirections` option will expect only a boolean, which by default is `true`. That is, no more specifying `:safe`, `:unsafe` or `:all`.
  * We've dropped support for Ruby < 2.
data/README.md CHANGED
@@ -311,6 +311,21 @@ If you want to override the default headers then use the `headers` option:
  page = MetaInspector.new('example.com', :headers => {'User-Agent' => 'My custom User-Agent'})
  ```
 
+ ### Disabling SSL verification (or any other Faraday options)
+
+ Faraday can be passed options via `:faraday_options`.
+
+ This is useful when we need to
+ customize the way we request the page, for example to disable SSL verification:
+
+ ```ruby
+ MetaInspector.new('https://example.com')
+ # Faraday::SSLError: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
+
+ MetaInspector.new('https://example.com', faraday_options: { ssl: { verify: false } })
+ # Now we can access the page
+ ```
+
  ### HTML Content Only
 
  MetaInspector will try to parse all URLs by default. If you want to raise an exception when trying to parse a non-html URL (one that has a content-type different than text/html), you can state it like this:
@@ -18,6 +18,7 @@ module MetaInspector
  # Can be :warn, :raise or nil
  # * headers: object containing custom headers for the request
  # * normalize_url: true by default
+ # * faraday_options: an optional hash of options to pass to Faraday on the request
  def initialize(initial_url, options = {})
  options = defaults.merge(options)
  @connection_timeout = options[:connection_timeout]
@@ -31,6 +32,7 @@ module MetaInspector
  @warn_level = options[:warn_level]
  @exception_log = options[:exception_log] || MetaInspector::ExceptionLog.new(warn_level: warn_level)
  @normalize_url = options[:normalize_url]
+ @faraday_options = options[:faraday_options]
  @url = MetaInspector::URL.new(initial_url, exception_log: @exception_log,
  normalize: @normalize_url)
  @request = MetaInspector::Request.new(@url, allow_redirections: @allow_redirections,
@@ -38,7 +40,8 @@ module MetaInspector
  read_timeout: @read_timeout,
  retries: @retries,
  exception_log: @exception_log,
- headers: @headers) unless @document
+ headers: @headers,
+ faraday_options: @faraday_options) unless @document
  @parser = MetaInspector::Parser.new(self, exception_log: @exception_log,
  download_images: @download_images)
  end
@@ -17,6 +17,7 @@ module MetaInspector
  @retries = options[:retries]
  @exception_log = options[:exception_log]
  @headers = options[:headers]
+ @faraday_options = options[:faraday_options] || {}
 
  response # request early so we can fail early
  end
@@ -44,7 +45,9 @@ module MetaInspector
  private
 
  def fetch
- session = Faraday.new(:url => url) do |faraday|
+ @faraday_options.merge!(:url => url)
+
+ session = Faraday.new(@faraday_options) do |faraday|
  faraday.request :retry, max: @retries
 
  if @allow_redirections
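A note on the `fetch` hunk above: `merge!` modifies `@faraday_options` in place, and since the initializer keeps the hash it was handed whenever one is given (`options[:faraday_options] || {}`), the `:url` key appears to end up in the very hash the caller supplied. The sketch below reproduces that behaviour with plain hashes only, without loading Faraday or metainspector. The helper names `build_in_place` and `build_copy` are hypothetical and do not exist in the gem; the non-mutating variant is one possible alternative, not what 4.6.0 ships.

```ruby
# Hypothetical helpers for illustration; neither name exists in metainspector.

# Mirrors the hunk: merge! adds :url to the same hash object it receives.
def build_in_place(faraday_options, url)
  faraday_options ||= {}
  faraday_options.merge!(url: url)
end

# Non-mutating alternative: merge returns a new hash and leaves the input alone.
def build_copy(faraday_options, url)
  (faraday_options || {}).merge(url: url)
end

caller_options = { ssl: { verify: false } }

# The copy carries :url, while the caller's hash is unchanged.
copy = build_copy(caller_options, 'https://example.com')
copy_left_input_untouched = !caller_options.key?(:url)

# The in-place version returns the caller's own hash, now with :url added.
in_place = build_in_place(caller_options, 'https://example.com')
in_place_changed_input = caller_options.key?(:url)

puts "copy left caller hash untouched: #{copy_left_input_untouched}"
puts "merge! changed caller hash:      #{in_place_changed_input}"
```

This matters mainly when one options hash is reused across several `MetaInspector.new` calls: with `merge!`, each call overwrites the `:url` entry in the shared hash, which is harmless as long as nothing else reads that hash.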
@@ -1,3 +1,3 @@
  module MetaInspector
- VERSION = '4.5.0'
+ VERSION = '4.6.0'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: metainspector
  version: !ruby/object:Gem::Version
- version: 4.5.0
+ version: 4.6.0
  platform: ruby
  authors:
  - Jaime Iniesta
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-05-29 00:00:00.000000000 Z
+ date: 2015-06-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: nokogiri