RubyGems: mechanizer 1.10 → 1.11

Files changed (6)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 454518f3e065bb85be179436c05269e29bdfbd861d1ea4251979544775ec79ad
-  data.tar.gz: c0c2e74dffb8fb7a69064bce076187fd145f3da0e8c5594638023d8452ea75d6
+  metadata.gz: ef9ac4cda832d55e693e46c28e9dd569cad29e096545338ea364bee468f7f02f
+  data.tar.gz: b58af71643298b06fb45ee3cbb2aaa2df3abc2579826e42c6dc197d739aa8943
 SHA512:
-  metadata.gz: 2afa734cf56b9d9997f8943dab38cbaec17deaaa9f6b7a792f28925ba8e16112aa618d1d2453e9d16d3ac3aadb0fdaa3a87ed77fc02d2f642b97ff9d077e08fe
-  data.tar.gz: a73b49229a1bc6ec421b085f1210a3785b911984f83f3ef99e9cd3e666a0ea7dd299f0609ac2e79c10c08604dbbea86fd3b56f51247ec38c07e59f086202e590
+  metadata.gz: e3034c83bdbb741c0a3f65ec798edf4daf7eff600a233ae6bc301e241ee85b774538c7725e51893375e390e9c17c120683dca1b22986aa1dcf237e6eb19b7901
+  data.tar.gz: d8ca61e01ce5c34fbb906aea820e16d947f79ebcf75bc64cddfc7bf1fe9dbaa744438cb507e9551eceda67f724e9ebb256b3285a481228b2f6b6f3ce5cb84312

data/README.md CHANGED

@@ -7,13 +7,15 @@
 Light, easy to use wrapper for Mechanize and NokoGiri.  No configuration or error handling to worry about.  Simply enter the target URL and Mechanizer scrapes the page for you to easily parse.
-#### Recommended Gems
+### Recommended Gems
 Note: URL MUST be in proper format and be valid, example:
 Correct: https://www.example.com
 Incorrect: www.example.com, example.com, https://example.com
-##### 1. If you need to pre-format your URLs, try using `CrmFormatter gem`
-##### 2. If you need to verify your URLs, try using `UrlVerifier gem`, which includes the `CrmFormatter gem` inside of it.
+1. If you need to pre-format your URLs, try using `CrmFormatter gem`
+2. If you need to verify your URLs, try using `UrlVerifier gem`, which includes the `CrmFormatter gem` inside of it.
 Then, feed the results from those gems into this gem.  The documentation below assumes the URLs are correctly formatted and have been verified before passing them through the `Mechanizer gem`.
@@ -35,14 +37,14 @@ Or install it yourself as:
 ## Usage
-#### 1. Instantiate & Pass URL
+### 1. Instantiate & Pass URL
 ```
 noko = Mechanizer::Noko.new
 noko_hash = noko.scrape({url: 'https://www.wikipedia.org'})
 ```
-#### 2. To Customize Timeout:
+### 2. To Customize Timeout:
 Default timeout is set to 60.  You can adjust that time or omit it if 60 is fine.
 ```
@@ -51,7 +53,7 @@ args = {url: 'https://www.wikipedia.org', timeout: 30}
 noko_hash = noko.scrape(args)
 ```
-#### 3. Noko Result in Hash Format
+### 3. Noko Result in Hash Format
 ```
 err_msg = noko_hash[:err_msg]
@@ -59,7 +61,7 @@ page = noko_hash[:page]
 texts_and_hrefs = noko_hash[:texts_and_hrefs]
 ```
-#### 4. Example Texts & Hrefs:
+### 4. Example Texts & Hrefs:
 ```
 texts_and_hrefs = [
@@ -73,17 +75,17 @@ texts_and_hrefs = [
 ]
 ```
-#### 5. Example Parsing Page:
+### 5. Example Parsing Page:
 There are several ways to parse and manipulate `noko_hash[:page]`.  Essentially, you can parse the page using its css classes and html tags.  You can use either or both together.  Some pages are very straight forward, but others can require a lot of skill.  Here is a good reference guide: [Nokogiri Tutorials](http://www.nokogiri.org/tutorials).  All Nokogiri methods are available through this wrapper.  This wrapper simply helps you avoid setting up, manages and reduces errors, and helps to automate your scraping process.
-##### For the Wikipedia URL in the example above, at the time of this README there is a group of icons on its homepage.  If you right-click on any of them you can inspect.  Look for any classes that interest you.  In this example, it's `.other-project`.  Simply paste it like below to get started.  Remember, there are several ways to do this, so read the docs and explore what's available.
+For the Wikipedia URL in the example above, at the time of this README there is a group of icons on its homepage.  If you right-click on any of them you can inspect.  Look for any classes that interest you.  In this example, it's `.other-project`.  Simply paste it like below to get started.  Remember, there are several ways to do this, so read the docs and explore what's available.
 ```
 other_projects = page.css('.other-project')&.text
 other_projects = other_projects.split("\n").reject(&:blank?)
 ```
-##### 6. Results from Parsing Page (from example 5):
+### 6. Results from Parsing Page (from example 5):
 ```
 other_projects = [
@@ -114,7 +116,7 @@ other_projects = [
 ]
 ```
-##### 7. Automating Your Scraping:
+### 7. Automating Your Scraping:
 You may wish to automate your scraping for various reasons including:
 * Verifing Inventory Items and Pricing (car dealers, retail, menus, etc.),

data/Rakefile CHANGED

@@ -26,6 +26,7 @@ end
 def run_mechanizer
   noko = Mechanizer::Noko.new
   args = {url: 'https://www.wikipedia.org', timeout: 30}
+  # args = {url: 'wikipedia', timeout: 30}
   noko_hash = noko.scrape(args)
   err_msg = noko_hash[:err_msg]
@@ -34,5 +35,4 @@ def run_mechanizer
   other_projects = page.css('.other-project')&.text
   other_projects = other_projects.split("\n").reject(&:blank?)
 end
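
Note on this hunk: the commented-out `args` line passes the bare string `'wikipedia'`, which looks like a manual way to exercise a non-absolute URL. A caller who wants to catch such input before scraping could use a check like the sketch below. It is an assumption for illustration, not part of the gem: `absolute_url?` is a hypothetical helper built on Ruby's standard `uri` library, and it only tests for a scheme, so it is looser than the README's `https://www.` convention.

```ruby
require 'uri'

# Hypothetical pre-check (not part of the mechanizer gem):
# returns true only when the string parses as a URI that has a scheme.
def absolute_url?(url)
  URI.parse(url).absolute?
rescue URI::InvalidURIError
  # Strings that cannot be parsed at all are treated as not absolute.
  false
end

puts absolute_url?('https://www.wikipedia.org') # true
puts absolute_url?('wikipedia')                 # false
```

Running a check of this kind first keeps obviously malformed input from reaching `noko.scrape` at all.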

data/lib/mechanizer/noko.rb CHANGED

@@ -5,7 +5,6 @@
 # require 'open-uri'
 # require 'whois'
 # require 'delayed_job'
-#
 # require 'timeout'
 # require 'net/ping'
@@ -72,7 +71,9 @@ module Mechanizer
     end
     def pre_noko_msg(url)
-      puts "\n\n#{'='*40}\nSCRAPING: #{url}\nMax Wait Set: #{@timeout} Seconds\n\n"
+      msg = "\n\n#{'='*40}\nSCRAPING: #{url}\nMax Wait Set: #{@timeout} Seconds\n\n"
+      puts msg
+      msg
     end
     def error_parser(err_msg)
@@ -86,6 +87,8 @@ module Mechanizer
         err_msg = "Error: TCP"
       elsif err_msg.include?("execution expired")
         err_msg = "Error: Runtime"
+      elsif err_msg.include?("absolute URL needed")
+        err_msg = "Error: URL Not Absolute"
       else
         err_msg = "Error: Undefined"
       end
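
Note on this file: the two hunks above make `pre_noko_msg` return its message and add a branch to `error_parser`. The sketch below mirrors both changes with standalone, hypothetical methods (`build_banner` and `classify_error` are illustrative names, not the gem's API). Only the branches visible in the diff are reproduced, the elided ones are left out, and the sample error string is an assumed input.

```ruby
# Hypothetical stand-in for pre_noko_msg: prints the banner and returns it.
def build_banner(url, timeout)
  msg = "\n\n#{'=' * 40}\nSCRAPING: #{url}\nMax Wait Set: #{timeout} Seconds\n\n"
  puts msg
  msg # puts alone returns nil, so the string is returned explicitly
end

# Hypothetical stand-in for error_parser, limited to the branches shown in the hunk.
def classify_error(err_msg)
  if err_msg.include?('execution expired')
    'Error: Runtime'
  elsif err_msg.include?('absolute URL needed')
    'Error: URL Not Absolute'
  else
    'Error: Undefined'
  end
end

banner = build_banner('https://www.wikipedia.org', 30)
puts classify_error('absolute URL needed (not wikipedia)') # assumed sample message
```

Returning the banner string makes the method's output inspectable by callers and specs, and the new branch gives non-absolute URLs their own label instead of falling through to `Error: Undefined`.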

data/lib/mechanizer/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Mechanizer
-  VERSION = "1.10"
+  VERSION = "1.11"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: mechanizer
 version: !ruby/object:Gem::Version
-  version: '1.10'
+  version: '1.11'
 platform: ruby
 authors:
 - Adam Booth
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2018-07-02 00:00:00.000000000 Z
+date: 2018-07-04 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activesupport