RubyGems - metainspector - Versions diffs - 1.9.0 → 1.9.1 - Mend

metainspector 1.9.0 → 1.9.1

Files changed (6)

data/README.rdoc +18 -12
data/lib/meta_inspector/scraper.rb +7 -5
data/lib/meta_inspector/version.rb +1 -1
data/lib/meta_inspector.rb +2 -2
data/meta_inspector.gemspec +1 -1
metadata +6 -7

data/README.rdoc CHANGED

@@ -14,16 +14,21 @@ This gem is tested on Ruby versions 1.8.7, 1.9.2 and 1.9.3.
 Initialize a scraper instance for an URL, like this:
-  page = MetaInspector::Scraper.new('http://pagerankalert.com')
+  page = MetaInspector::Scraper.new('http://w3clove.com')
 or, for short, a convenience alias is also available:
-  page = MetaInspector.new('http://pagerankalert.com')
+  page = MetaInspector.new('http://w3clove.com')
 If you don't include the scheme on the URL, http:// will be used
 by defaul:
-  page = MetaInspector.new('pagerankalert.com')
+  page = MetaInspector.new('w3clove.com')
+By default, MetaInspector times out after 20 seconds of waiting for a page to respond.
+You can set a different timeout with a second parameter, like this:
+  page = MetaInspector.new('w3clove.com', 5) # this would wait just 5 seconds to timeout
 Then you can see the scraped data like this:
@@ -58,7 +63,7 @@ Please notice that MetaInspector is case sensitive, so page.meta_Content_Type is
 You can also access most of the scraped data as a hash:
-  page.to_hash               # { "url"=>"http://pagerankalert.com", "title" => "PageRankAlert.com", ... }
+  page.to_hash               # { "url"=>"http://w3clove.com", "title" => "W3CLove :: site-wide markup validation tool", ... }
 The full scraped document if accessible from:
@@ -72,23 +77,23 @@ You can find some sample scripts on the samples folder, including a basic scrapi
   >> require 'metainspector'
   => true
-  >> page = MetaInspector.new('http://pagerankalert.com')
-  => #<MetaInspector:0x11330c0 @url="http://pagerankalert.com">
+  >> page = MetaInspector.new('http://w3clove.com')
+  => #<MetaInspector:0x11330c0 @url="http://w3clove.com">
   >> page.title
-  => "PageRankAlert.com :: Track your PageRank changes"
+  => "W3CLove :: site-wide markup validation tool"
   >> page.meta_description
-  => "Track your PageRank(TM) changes and receive alerts by email"
+  => "Site-wide markup validation tool. Validate the markup of your whole site with just one click."
   >> page.meta_keywords
-  => "pagerank, seo, optimization, google"
+  => "html, markup, validation, validator, tool, w3c, development, standards, free"
   >> page.links.size
-  => 8
+  => 15
-  >> page.links[5]
-  => "http://pagerankalert.posterous.com"
+  >> page.links[4]
+  => "/plans-and-pricing"
   >> page.document.class
   => String
@@ -103,6 +108,7 @@ You're welcome to fork this project and send pull requests. I want to thank spec
 * Ryan Romanchuk https://github.com/rromanchuk
 * Edmund Haselwanter https://github.com/ehaselwanter
 * Jonathan Hernández https://github.com/ionmx
+* Oriol Gual https://github.com/oriolgual
 = To Do
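
The README states that http:// is used when the URL has no scheme. A minimal sketch of that rule follows, mirroring the URI.parse check visible in scraper.rb. The helper name with_default_scheme is hypothetical and is not part of the gem.

```ruby
require 'uri'

# Hypothetical helper (not part of metainspector): prefixes http://
# when the given URL carries no scheme, otherwise returns it unchanged.
def with_default_scheme(url)
  URI.parse(url).scheme.nil? ? "http://#{url}" : url
end

bare = with_default_scheme('w3clove.com')          # no scheme given
full = with_default_scheme('https://w3clove.com')  # scheme already present

puts bare
puts full
```

An explicit scheme is kept as given, so only bare host names are rewritten.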

data/lib/meta_inspector/scraper.rb CHANGED

@@ -4,6 +4,7 @@ require 'open-uri'
 require 'nokogiri'
 require 'charguess'
 require 'hashie/rash'
+require 'timeout'
 # MetaInspector provides an easy way to scrape web pages and get its elements
 module MetaInspector
@@ -11,10 +12,11 @@ module MetaInspector
     attr_reader :url, :scheme
     # Initializes a new instance of MetaInspector, setting the URL to the one given
     # If no scheme given, set it to http:// by default
-    def initialize(url)
-      @url    = URI.parse(url).scheme.nil? ? 'http://' + url : url
-      @scheme = URI.parse(url).scheme || 'http'
-      @data   = Hashie::Rash.new('url' => @url)
+    def initialize(url, timeout = 20)
+      @url      = URI.parse(url).scheme.nil? ? 'http://' + url : url
+      @scheme   = URI.parse(url).scheme || 'http'
+      @timeout  = timeout
+      @data     = Hashie::Rash.new('url' => @url)
     end
     # Returns the parsed document title, from the content of the <title> tag.
@@ -92,7 +94,7 @@ module MetaInspector
     # Returns the original, unparsed document
     def document
-      @document ||= open(@url).read
+      @document ||= Timeout::timeout(@timeout) { open(@url).read }
       rescue SocketError
         warn 'MetaInspector exception: The url provided does not exist or is temporarily unavailable (socket error)'
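
The change above wraps the page fetch in a Timeout block. The sketch below shows that pattern in isolation, with a sleep standing in for the network read. The method names slow_fetch and fetch_with_timeout are hypothetical, and the rescue of Timeout::Error is an assumption: the hunk shown here only displays the SocketError rescue.

```ruby
require 'timeout'

# Hypothetical stand-in for a slow network read; it sleeps instead of fetching.
def slow_fetch(delay)
  sleep delay
  'response body'
end

# Runs the fetch under a time limit and returns nil if the limit expires.
# Rescuing Timeout::Error here is an assumption, not something the diff shows.
def fetch_with_timeout(limit, delay)
  Timeout.timeout(limit) { slow_fetch(delay) }
rescue Timeout::Error
  warn "timed out after #{limit} seconds"
  nil
end

fast = fetch_with_timeout(1, 0)    # finishes well inside the limit
slow = fetch_with_timeout(0.1, 1)  # exceeds the limit, so the block is interrupted
```

Without a rescue, an expired limit raises Timeout::Error to the caller, so code using a short timeout may want to handle that case.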

data/lib/meta_inspector/version.rb CHANGED

@@ -1,5 +1,5 @@
 # -*- encoding: utf-8 -*-
 module MetaInspector
-  VERSION = "1.9.0"
+  VERSION = "1.9.1"
 end

data/lib/meta_inspector.rb CHANGED

@@ -6,7 +6,7 @@ module MetaInspector
   extend self
   # Sugar method to be able to create a scraper in a shorter way
-  def new(url)
-    Scraper.new(url)
+  def new(url, timeout = 20)
+    Scraper.new(url, timeout)
   end
 end
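
The sugar method forwards the new optional timeout to the scraper and keeps 20 as the default in both places. A self-contained sketch of the same "extend self" pattern follows. The Inspector module and its stub Scraper class are hypothetical stand-ins, not the gem's code.

```ruby
module Inspector
  extend self

  # Hypothetical stub standing in for the real scraper class.
  class Scraper
    attr_reader :url, :timeout

    def initialize(url, timeout = 20)
      @url     = url
      @timeout = timeout
    end
  end

  # Module-level sugar: Inspector.new(...) builds a Scraper,
  # passing the optional timeout through.
  def new(url, timeout = 20)
    Scraper.new(url, timeout)
  end
end

default_page = Inspector.new('w3clove.com')     # uses the default timeout
quick_page   = Inspector.new('w3clove.com', 5)  # overrides it
```

Because the default is repeated in the wrapper and in the constructor, the two values need to be kept in sync if the default ever changes.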

data/meta_inspector.gemspec CHANGED

@@ -14,7 +14,7 @@ Gem::Specification.new do |gem|
   gem.require_paths = ["lib"]
   gem.version       = MetaInspector::VERSION
-  gem.add_dependency 'nokogiri', '1.5.3'
+  gem.add_dependency 'nokogiri', '~> 1.5'
   gem.add_dependency 'charguess', '1.3.20111021164500'
   gem.add_dependency 'rash', '0.3.2'
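
The nokogiri requirement moves from an exact pin to the pessimistic operator. As a rough illustration of what '~> 1.5' accepts, the sketch below checks a few version numbers with RubyGems' own requirement class. The version numbers are picked for illustration only and are not a list of actual nokogiri releases.

```ruby
require 'rubygems'

pessimistic = Gem::Requirement.new('~> 1.5')   # new constraint in 1.9.1
exact       = Gem::Requirement.new('= 1.5.3')  # old constraint in 1.9.0

# '~> 1.5' allows any 1.x at or above 1.5, but not 2.0.
accepted = %w[1.5.3 1.5.5 1.6.0].all? { |v| pessimistic.satisfied_by?(Gem::Version.new(v)) }
rejected = %w[1.4.9 2.0.0].none? { |v| pessimistic.satisfied_by?(Gem::Version.new(v)) }

# The exact pin matches only the one version.
exact_only = exact.satisfied_by?(Gem::Version.new('1.5.3')) &&
             !exact.satisfied_by?(Gem::Version.new('1.5.5'))

puts accepted, rejected, exact_only
```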

metadata CHANGED

@@ -1,13 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: metainspector
 version: !ruby/object:Gem::Version
-  hash: 51
+  hash: 49
   prerelease:
   segments:
   - 1
   - 9
-  - 0
-  version: 1.9.0
+  - 1
+  version: 1.9.1
 platform: ruby
 authors:
 - Jaime Iniesta
@@ -15,7 +15,7 @@ autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-06-03 00:00:00 Z
+date: 2012-07-11 00:00:00 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: nokogiri
@@ -23,14 +23,13 @@ dependencies:
   requirement: &id001 !ruby/object:Gem::Requirement
     none: false
     requirements:
-    - - "="
+    - - ~>
       - !ruby/object:Gem::Version
         hash: 5
         segments:
         - 1
         - 5
-        - 3
-        version: 1.5.3
+        version: "1.5"
   type: :runtime
   version_requirements: *id001
 - !ruby/object:Gem::Dependency