metainspector 4.7.0 → 4.7.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 329fa61a1fb5c278adb8c07c58536cc84e774f31
- data.tar.gz: 8c40e805a960cc5591bee0317307dad27fa46719
+ metadata.gz: 76c49fb7187563a3e9daff74a8f3416521251fa6
+ data.tar.gz: 0c67668db7ec465badcbb6666b4d3183b1f3e70e
  SHA512:
- metadata.gz: 0535544d927c37e963b567404764ff7af352c00afbfb896c44604e85d6e844155575d26b24a2234f478015f6f4aff9b75724d933c6cac9c3cceda6180545aa99
- data.tar.gz: 9f7fa08ed92dcf775df62d97ab3a9d989c04ad4ff9ffe0cbaf6f83baf3c5bd052072d61e2db6b66ebef2643ca5de9de79e6aa78575a87bf432a21a3cb676a093
+ metadata.gz: 164db60a1bf7139c1fa4f92ad459073df0b0f0b2adf6a1c48aba960afcbcdcd02c8cbc66d4283b3b1f4967d8a1915f9aca2cf303384507ff74571cfaf17bf0c7
+ data.tar.gz: 3326aa3962c7136033557c4398c38a2f442e04d39d14656f7e8195a4b7de16e4f5f5e914a2a48d054ca808b7eff748543b1535f607a096a0a84c7819110c5b28
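The entries in this hunk are hex digests of the gem's two inner archives. As a rough sketch of how such digests could be computed with Ruby's standard `digest` library: the helper name `checksums_for` and the sample bytes are hypothetical, and a real check would read `metadata.gz` and `data.tar.gz` from the unpacked `.gem` file instead.

```ruby
require 'digest'

# Hypothetical helper: returns the SHA1 and SHA512 hex digests of a byte
# string, mirroring the two sections of checksums.yaml.
def checksums_for(bytes)
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Illustrative input only; in practice this might be File.binread('metadata.gz').
sums = checksums_for('example bytes')

puts sums['SHA1'].length   # SHA1 hex digests are 40 characters long
puts sums['SHA512'].length # SHA512 hex digests are 128 characters long
```

Comparing the computed strings against the values recorded in `checksums.yaml` would then be a plain string equality check.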
@@ -1,5 +1,19 @@
  # MetaInspector Changelog
 
+ ## [Changes in 4.7](https://github.com/jaimeiniesta/metainspector/compare/v4.6.0...v4.7.1)
+
+ MetaInspector can be configured to use [Faraday::HttpCache](https://github.com/plataformatec/faraday-http-cache) to cache page responses. For that you should pass the `faraday_http_cache` option with at least the `:store` key, for example:
+
+ ```ruby
+ cache = ActiveSupport::Cache.lookup_store(:file_store, '/tmp/cache')
+ page = MetaInspector.new('http://example.com', faraday_http_cache: { store: cache })
+ ```
+
+ Bugfixes:
+
+ * Parsing of the document is done as soon as it is initialized (just like we do with the request), so
+ that parsing errors will be caught earlier.
+
  ## [Changes in 4.6](https://github.com/jaimeiniesta/metainspector/compare/v4.5.0...v4.6.0)
 
  Faraday can be passed options via `:faraday_options`. This is useful in cases where we need to
data/README.md CHANGED
@@ -393,6 +393,21 @@ You can also set the `warn_level: :store` option so that exceptions found will b
 
  You should avoid using the `:store` option, or use it wisely, as silencing errors can be problematic; it's always better to face the errors and treat them accordingly.
 
+ If you're using this exception store, you're advised to first initialize the document, check if it seems OK, and then proceed with the extractions, like this:
+
+ ```ruby
+ # This will fail because the URL will return a text/xml document
+ page = MetaInspector.new("http://example.com/rss",
+                          html_content_only: true,
+                          warn_level: :store)
+
+ if page.ok?
+   puts "TITLE: #{page.title}"
+ else
+   puts "There were some exceptions: #{page.exceptions}"
+ end
+ ```
+
  ## Examples
 
  You can find some sample scripts in the `examples` folder, including a basic scraping and a spider that will follow external links using a queue. What follows is an example of use from irb:
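The `warn_level: :store` advice added in this hunk can be illustrated without network access. The following is a minimal sketch, assuming a hypothetical `SketchExceptionStore` class written only for this illustration; it is not MetaInspector's own implementation, and only the `ok?` and `exceptions` names are taken from the README text.

```ruby
# Hypothetical stand-in for an exception store; not MetaInspector's own class.
class SketchExceptionStore
  attr_reader :exceptions

  def initialize(warn_level: :raise)
    @warn_level = warn_level
    @exceptions = []
  end

  # Records the error when storing, re-raises it otherwise.
  def <<(error)
    raise error unless @warn_level == :store

    @exceptions << error
  end

  # True while nothing has been stored.
  def ok?
    @exceptions.empty?
  end
end

store = SketchExceptionStore.new(warn_level: :store)
store << ArgumentError.new('unexpected content type: text/xml')

if store.ok?
  puts 'No exceptions stored'
else
  puts "There were some exceptions: #{store.exceptions.map(&:message)}"
end
```

Because stored errors stay silent until someone asks, checking `ok?` before reading any extracted values is what keeps a failure from going unnoticed.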
@@ -19,6 +19,8 @@ module MetaInspector
  @download_images = options[:download_images]
  @images_parser = MetaInspector::Parsers::ImagesParser.new(self, download_images: @download_images)
  @texts_parser = MetaInspector::Parsers::TextsParser.new(self)
+
+ parsed # parse early so we can fail early
  end
 
  extend Forwardable
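The effect of calling `parsed` at the end of the initializer can be sketched in isolation. This example assumes a hypothetical `SketchDocument` class and uses JSON from the standard library in place of HTML parsing, purely so that it runs without extra gems; it only illustrates the eager-parsing idea, not the gem's actual code.

```ruby
require 'json'

# Hypothetical document class, used only to illustrate eager parsing.
class SketchDocument
  def initialize(raw)
    @raw = raw
    parsed # parse early so we can fail early
  end

  # Memoized parse; raises JSON::ParserError on malformed input.
  def parsed
    @parsed ||= JSON.parse(@raw)
  end
end

begin
  SketchDocument.new('{ not valid json')
rescue JSON::ParserError => e
  puts "Failed at construction: #{e.class}"
end

doc = SketchDocument.new('{"title": "Example"}')
puts doc.parsed['title']
```

With a lazy `parsed`, the same error would only surface at the first accessor call, possibly far from the code that built the object.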
@@ -1,3 +1,3 @@
  module MetaInspector
- VERSION = '4.7.0'
+ VERSION = '4.7.1'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: metainspector
  version: !ruby/object:Gem::Version
- version: 4.7.0
+ version: 4.7.1
  platform: ruby
  authors:
  - Jaime Iniesta
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-10-21 00:00:00.000000000 Z
+ date: 2015-10-22 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: nokogiri