rubyretriever 1.4.0 → 1.4.1
- checksums.yaml +4 -4
- data/lib/retriever/version.rb +1 -1
- data/readme.md +3 -1
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 9b337916fddfc246a2b3cd73bdd40cfe13eccb9a
+  data.tar.gz: 908d0a752ab89adaa9c8530ae7803329c14591ce
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7bf4c80012e0232b6bea179b5e41ab36c8066e7b0bb6da39b0eaedd0a6b67a6b826754f6f4e94d9dd21287ebc05579ebc759b7887ed29218c82eccb68b31fd79
+  data.tar.gz: 2bc3158bf98beb2c2b701b7811b774c1ff32236899683b6486bf2d1680ddbb87e8ebbb7aaa27eb8aa968a20f63158640c9d45c1d38c992484a9920069b38b976
data/lib/retriever/version.rb
CHANGED
data/readme.md
CHANGED
@@ -8,7 +8,9 @@ RubyRetriever is a Web Crawler, Scraper & File Harvester. Available as a command
 
 RubyRetriever (RR) uses asynchronous HTTP requests via [Eventmachine](https://github.com/eventmachine/eventmachine) & [Synchrony](https://github.com/igrigorik/em-synchrony) to crawl webpages *very quickly*. RR also uses a Ruby implementation of the [bloomfilter](https://github.com/igrigorik/bloomfilter-rb) in order to keep track of pages it has already crawled in a memory efficient manner.
 
-**v1.
+**v1.4.1 Update (3/24/2016)** - Update gemfile & external dependency versioning
+
+**v1.4.0 Update (3/24/2016)** - Several bug fixes.
 
 **v1.3.0 Update (6/22/2014)** - The major change in this update is the new PageIterator class which adds functionality for library/script usage. Now you can run custom blocks against each page during the crawl. This update also includes more tests, and other code improvements to improve modularity and testability.
 
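The README text in this hunk mentions a bloom filter for remembering already-crawled pages. The sketch below illustrates the general idea in plain Ruby. It is not the bloomfilter-rb API and not RubyRetriever's actual code: the class name `TinyBloomFilter`, its parameters, and the SHA1-based hashing are all assumptions made for illustration.

```ruby
require 'digest'

# Illustrative bloom filter (hypothetical class, not bloomfilter-rb).
# It stores only a fixed-size bit array, so memory use does not grow
# with the number of URLs; the trade-off is occasional false positives.
class TinyBloomFilter
  def initialize(size_bits: 1024, hashes: 3)
    @size = size_bits
    @hashes = hashes
    @bits = Array.new(size_bits, false)
  end

  # Mark a key as seen by setting each of its bit positions.
  def add(key)
    positions(key).each { |i| @bits[i] = true }
    self
  end

  # True if the key may have been added; false means it definitely was not.
  def include?(key)
    positions(key).all? { |i| @bits[i] }
  end

  private

  # Derive one bit position per hash function by salting the key with a seed.
  def positions(key)
    (0...@hashes).map do |seed|
      Digest::SHA1.hexdigest("#{seed}:#{key}").to_i(16) % @size
    end
  end
end

seen = TinyBloomFilter.new
seen.add('http://example.com/')
seen.include?('http://example.com/') # true: added keys are always reported
```

A crawler could consult `include?` before fetching a URL and call `add` afterwards; a false positive would only cause a page to be skipped, never crawled twice.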