metainspector 4.7.0 → 4.7.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 329fa61a1fb5c278adb8c07c58536cc84e774f31
- data.tar.gz: 8c40e805a960cc5591bee0317307dad27fa46719
+ metadata.gz: 76c49fb7187563a3e9daff74a8f3416521251fa6
+ data.tar.gz: 0c67668db7ec465badcbb6666b4d3183b1f3e70e
  SHA512:
- metadata.gz: 0535544d927c37e963b567404764ff7af352c00afbfb896c44604e85d6e844155575d26b24a2234f478015f6f4aff9b75724d933c6cac9c3cceda6180545aa99
- data.tar.gz: 9f7fa08ed92dcf775df62d97ab3a9d989c04ad4ff9ffe0cbaf6f83baf3c5bd052072d61e2db6b66ebef2643ca5de9de79e6aa78575a87bf432a21a3cb676a093
+ metadata.gz: 164db60a1bf7139c1fa4f92ad459073df0b0f0b2adf6a1c48aba960afcbcdcd02c8cbc66d4283b3b1f4967d8a1915f9aca2cf303384507ff74571cfaf17bf0c7
+ data.tar.gz: 3326aa3962c7136033557c4398c38a2f442e04d39d14656f7e8195a4b7de16e4f5f5e914a2a48d054ca808b7eff748543b1535f607a096a0a84c7819110c5b28
@@ -1,5 +1,19 @@
  # MetaInpector Changelog

+ ## [Changes in 4.7](https://github.com/jaimeiniesta/metainspector/compare/v4.6.0...v4.7.1)
+
+ MetaInspector can be configured to use [Faraday::HttpCache](https://github.com/plataformatec/faraday-http-cache) to cache page responses. For that you should pass the `faraday_http_cache` option with at least the `:store` key, for example:
+
+ ```ruby
+ cache = ActiveSupport::Cache.lookup_store(:file_store, '/tmp/cache')
+ page = MetaInspector.new('http://example.com', faraday_http_cache: { store: cache })
+ ```
+
+ Bugfixes:
+
+ * Parsing of the document is done as soon as it is initialized (just like we do with the request), so
+ that parsing errors will be catched earlier.
+
  ## [Changes in 4.6](https://github.com/jaimeiniesta/metainspector/compare/v4.5.0...v4.6.0)

  Faraday can be passed options via `:faraday_options`. This is useful in cases where we need to
data/README.md CHANGED
@@ -393,6 +393,21 @@ You can also set the `warn_level: :store` option so that exceptions found will b

  You should avoid using the `:store` option, or use it wisely, as silencing errors can be problematic, it's always better to face the errors and treat them accordingly.

+ If you're using this exception store, you're advised to first initialize the document, check if it seems OK, and then proceed with the extractions, like this:
+
+ ```ruby
+ # This will fail because the URL will return a text/xml document
+ page = MetaInspector.new("http://example.com/rss",
+                          html_content_only: true,
+                          warn_level: :store )
+
+ if page.ok?
+   puts "TITLE: #{page.title}"
+ else
+   puts "There were some exceptions: #{page.exceptions}"
+ end
+ ```
+
  ## Examples

  You can find some sample scripts on the `examples` folder, including a basic scraping and a spider that will follow external links using a queue. What follows is an example of use from irb:
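The README hunk above describes a "store" warn level in which exceptions are collected instead of raised, then checked with `ok?`. As a rough, self-contained sketch of that pattern (the `ExceptionStore` class below is hypothetical and written only for illustration; it is not the gem's actual implementation):

```ruby
# Hypothetical, simplified exception log; the gem's real class may differ.
class ExceptionStore
  attr_reader :exceptions

  def initialize(warn_level: :raise)
    @warn_level = warn_level
    @exceptions = []
  end

  # Either re-raise immediately or keep the exception for later inspection.
  def <<(exception)
    case @warn_level
    when :raise then raise exception
    when :store then @exceptions << exception
    else raise ArgumentError, "unknown warn_level: #{@warn_level.inspect}"
    end
    self
  end

  # True while nothing has been recorded.
  def ok?
    @exceptions.empty?
  end
end

store = ExceptionStore.new(warn_level: :store)
store << RuntimeError.new('unexpected content type text/xml')

if store.ok?
  puts 'no exceptions recorded'
else
  puts "There were some exceptions: #{store.exceptions.map(&:message).join(', ')}"
end
```

With `:store`, the caller has to remember to check `ok?` before trusting any extracted data, which is why the README advises using it sparingly.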
@@ -19,6 +19,8 @@ module MetaInspector
    @download_images = options[:download_images]
    @images_parser = MetaInspector::Parsers::ImagesParser.new(self, download_images: @download_images)
    @texts_parser = MetaInspector::Parsers::TextsParser.new(self)
+
+   parsed # parse early so we can fail early
  end

  extend Forwardable
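The hunk above adds a bare `parsed` call at the end of the initializer so that parsing happens at construction time. A minimal sketch of why that changes where errors surface, assuming a memoized parse step (`LazyDocument`, `EagerDocument`, and `ParseError` are hypothetical names for illustration, not the gem's classes):

```ruby
# Hypothetical classes for illustration only; not MetaInspector's real internals.
class ParseError < StandardError; end

# Lazy variant: parsing is deferred until the first reader is called.
class LazyDocument
  def initialize(raw)
    @raw = raw
  end

  def title
    parsed[:title]
  end

  private

  # Memoized parse step; raises ParseError on blank input.
  def parsed
    @parsed ||= begin
      raise ParseError, 'empty document' if @raw.strip.empty?
      { title: @raw[/<title>(.*?)<\/title>/m, 1] }
    end
  end
end

# Eager variant: the constructor calls the memoized parse step right away,
# so a bad document fails in `new` rather than in a later reader.
class EagerDocument < LazyDocument
  def initialize(raw)
    super
    parsed # parse early so we can fail early
  end
end

LazyDocument.new('') # no error yet; it would only appear on the first `title` call

begin
  EagerDocument.new('') # raises during construction
rescue ParseError => e
  puts "eager failed at construction: #{e.message}"
end
```

Because the parse result is memoized, the early call costs nothing extra for callers who would have triggered parsing anyway.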
@@ -1,3 +1,3 @@
  module MetaInspector
-   VERSION = '4.7.0'
+   VERSION = '4.7.1'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: metainspector
  version: !ruby/object:Gem::Version
-   version: 4.7.0
+   version: 4.7.1
  platform: ruby
  authors:
  - Jaime Iniesta
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-10-21 00:00:00.000000000 Z
+ date: 2015-10-22 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: nokogiri