rubyretriever 0.0.11 → 0.0.12

Files changed (5)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: ffb93b0faa77d73f014f67be6dbb6320233a5497
-  data.tar.gz: 920547b074b92a01b164e2f27130010773a55e0b
+  metadata.gz: efc429906131b363741d6560e37cb095f905b48e
+  data.tar.gz: 85f320d55600f007315941b6c3213c8f04b70515
 SHA512:
-  metadata.gz: b3c36ff313a381ec3d1950abf1c148faed90aa99a0658741ab4533f15d6b6afd2e6dc95caa0be5afd231125099126c24aeee36b3a873c33d8a81c6f42dace510
-  data.tar.gz: 31bb5aa05f6354f083fae15b3351059d28073c449125081a2c043b7087340d48294541a11951cb2f13f67ec9b8944a029071bcf24e1421b358f64ad458a31d85
+  metadata.gz: 1cdeb51c607ee23b662128ae7b1071085314c9c04626fdfaf708ef9be7224e1bd83652e9bffb64175da480f7830af223a6e8a2a846cb429af3a4c58a71472941
+  data.tar.gz: 437ee738e18d69600897512e0dd047da23166b2c59ad5f70ae8336532ecfa73e85399e9092b5d7b11895ddc23cffd46dd9465c2c6859bbf0890c80a32e15218b
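
The checksums above change on every release because both `metadata.gz` and `data.tar.gz` are rebuilt. As a rough illustration of how such a file can be checked, the sketch below recomputes each listed digest and reports mismatches. The helper name `verify_checksums` and its arguments are hypothetical and not part of RubyGems or rubyretriever; it assumes the member bytes have already been read out of the `.gem` archive.

```ruby
require 'digest'
require 'yaml'

# Hypothetical helper (not a RubyGems API): compares every digest listed in a
# checksums.yaml document against the bytes of the matching archive member.
# `members` maps a member name (e.g. "metadata.gz") to its raw bytes.
# Returns a list of "ALGORITHM member" strings for every mismatch.
def verify_checksums(checksums_yaml, members)
  algorithms = { 'SHA1' => Digest::SHA1, 'SHA512' => Digest::SHA512 }
  mismatches = []

  YAML.safe_load(checksums_yaml).each do |algo_name, entries|
    digest_class = algorithms[algo_name]
    next unless digest_class # skip algorithms this sketch does not know

    entries.each do |member_name, expected_hex|
      data = members[member_name]
      actual_hex = data && digest_class.hexdigest(data)
      mismatches << "#{algo_name} #{member_name}" unless actual_hex == expected_hex.to_s
    end
  end

  mismatches
end
```

An empty result means every listed digest matched; a missing member counts as a mismatch.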

data/lib/retriever.rb CHANGED

@@ -10,7 +10,6 @@ require 'em-synchrony/fiber_iterator'
 require 'ruby-progressbar'
 require 'open-uri'
 require 'optparse'
-require 'uri'
 require 'csv'
 require 'bloomfilter-rb'
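
The diff does not say why `require 'uri'` was dropped. One plausible reading is that it was redundant: Ruby's `open-uri` library loads `uri` itself, so the `URI` module is still available after this change. A minimal check, using the URL from the readme example only as sample input:

```ruby
require 'open-uri' # open-uri requires 'uri' internally

# URI is usable without an explicit `require 'uri'`.
parsed = URI.parse('http://www.cnet.com/news/index.html')

puts parsed.host # prints "www.cnet.com"
puts parsed.path # prints "/news/index.html"
```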

data/lib/retriever/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Retriever
-  VERSION = '0.0.11'
+  VERSION = '0.0.12'
 end

data/readme.md CHANGED

@@ -1,37 +1,23 @@
-RubyRetriever  [![Gem Version](https://badge.fury.io/rb/rubyretriever.svg)](http://badge.fury.io/rb/rubyretriever)
+[RubyRetriever] (http://www.softwarebyjoe.com/rubyretriever/)  [![Gem Version](https://badge.fury.io/rb/rubyretriever.svg)](http://badge.fury.io/rb/rubyretriever)
 ==============
-Now an official RubyGem!
-```sh
-gem install rubyretriever
-```
-Update (5/26):
-Version 0.0.10 - fixes a bug that wouldn't allow sitemaps to write out to file correctly.
-Update (5/25):
- Version 0.0.6 - Switches to using a Bloom Filter to keep track of past 'visited pages'. I saw this in [Arachnid] (https://github.com/dchuk/Arachnid) and realized it's a much better idea for performance and implemented it immediately. Hat tip [dchuk] (https://github.com/dchuk/)
-About
-=====
+By Joe Norton
 RubyRetriever is a Web Crawler, Site Mapper, File Harvester & Autodownloader, and all around nice buddy to have around.
-Soon to add some high level scraping options.
 RubyRetriever uses aynchronous HTTP requests, thanks to eventmachine and Synchrony fibers, to crawl webpages *very quickly*.
-This is the 2nd or 3rd reincarnation of the RubyRetriever autodownloader project. It started out as a executable autodownloader, intended for malware research. From there it has morphed to become a more well-rounded web-crawler and general purpose file harvesting utility.
-RubyRetriever does NOT respect robots.txt, and RubyRetriever currently - by default - launches up to 10 parallel GET requests at once. This is a feature, do not abuse it. Use at own risk.
+RubyRetriever does NOT respect robots.txt, and RubyRetriever currently - by default - launches up to 10 parallel GET requests at once. This is a feature, do not abuse it. Use at own risk.
-HOW IT WORKS
+getting started
 -----------
+Install the gem
 ```sh
-gem install rubyretriever
-rr [MODE] [OPTIONS] Target_URL
+gem install rubyretriever
 ```
- **Site Mapper**
+ **Example: Sitemap mode**
 ```sh
 rr --sitemap --progress --limit 1000 --output cnet http://www.cnet.com
 ```
@@ -42,7 +28,7 @@ rr -s -p -l 1000 -o cnet http://www.cnet.com
 This would go to http://www.cnet.com and map it until it crawled a max of 1,000 pages, and then it would write it out to a csv named cnet.
- **File Harvesting**
+ **Example: File Harvesting mode**
 ```sh
 rr --files --ext pdf --progress --limit 1000 --output hubspot http://www.hubspot.com
 ```

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: rubyretriever
 version: !ruby/object:Gem::Version
-  version: 0.0.11
+  version: 0.0.12
 platform: ruby
 authors:
 - Joe Norton
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-05-25 00:00:00.000000000 Z
+date: 2014-05-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: em-synchrony
@@ -126,7 +126,7 @@ files:
 - readme.md
 - spec/retriever_spec.rb
 - spec/spec_helper.rb
-homepage: http://github.com/joenorton/rubyretriever
+homepage: http://www.softwarebyjoe.com/rubyretriever/
 licenses:
 - MIT
 metadata: {}