spidermech 0.0.1 → 0.0.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 637a38fca20a36c523b57ae807ffa30b1b9d63f3
- data.tar.gz: 9b417e4ce15fea26b72127c4b4b6624f0ac85847
+ metadata.gz: 470adf10493e1607b18798221f1d43a83ddfb06b
+ data.tar.gz: b4fed7921f1a540c49f11c405fe3c404ece38978
  SHA512:
- metadata.gz: 11c1e177153ab63db942a826ab877242997d9d22e2e8971cf87d391fc7969e00d82475f504724054c4722bbc981c8257150f4c75ba19d9bd59013b38f6bff6b3
- data.tar.gz: fed4c3d45c6949e190239e435eab66aa74ae5c13e613be34a72edf3eb16d669bf495095d0804c7c07d4ed94470d5db2228cc7806c3e3dfcb1e89d12e43f8a58f
+ metadata.gz: fb6ffec034eeb8fb1e019dd53553aabfacb5a1bf5111d7d308fe340986ee39eb8ea7558046e634c017d3b91641702085ceb47cc6cd30861ecd2f5fe70ec579f5
+ data.tar.gz: 5c752359c2b958aacf2d3d39fdfca31e0224f816024f6360ca352c6f80a9d2ba7d8ed416fb5e607af19867de45eda7ace5af437af49ca42e59b730f218a1d8e9
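
The checksums above record SHA1 and SHA512 digests for `metadata.gz` and `data.tar.gz`. As a rough sketch of how such values could be checked locally, the snippet below recomputes the digests with Ruby's standard `digest` library and compares them with the recorded ones. It assumes the gem archive has already been unpacked into a directory containing a plain-text `checksums.yaml` next to the two files; the directory layout and the `verify_checksums` helper are hypothetical and are not part of spidermech.

```ruby
require 'digest'
require 'yaml'

# Hypothetical helper: recompute each digest listed in checksums.yaml and
# report whether it matches the recorded value.
# `dir` is assumed to contain checksums.yaml, metadata.gz and data.tar.gz.
def verify_checksums(dir)
  expected = YAML.safe_load(File.read(File.join(dir, 'checksums.yaml')))
  algorithms = { 'SHA1' => Digest::SHA1, 'SHA512' => Digest::SHA512 }

  results = {}
  expected.each do |algo_name, files|
    digest_class = algorithms.fetch(algo_name)
    files.each do |file_name, recorded|
      actual = digest_class.file(File.join(dir, file_name)).hexdigest
      # Key each result by [algorithm, file] so mismatches are easy to locate.
      results[[algo_name, file_name]] = (actual == recorded.to_s)
    end
  end
  results
end
```

Calling `verify_checksums('path/to/unpacked_gem')` returns a hash mapping each `[algorithm, file]` pair to `true` or `false`.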
data/README.md CHANGED
@@ -1,12 +1,12 @@
- # Crawler
+ # SpiderMech
 
- TODO: Write a gem description
+ SpiderMech crawls a given domain, and reports on the pages linked to from given urls, and the assets that said page depends on.
 
  ## Installation
 
  Add this line to your application's Gemfile:
 
- gem 'crawler'
+ gem 'spidermech'
 
  And then execute:
 
@@ -18,16 +18,34 @@ Or install it yourself as:
 
  ## Gem Usage
 
- TODO: Write usage instructions here
+ require 'spidermech'
+ spider = SpiderMech.new 'http://google.com'
+ spider.run # returns the sitemap hash
+ spider.save_json # saves the sitemap hash as google.com.json
 
  ## Command Line Usage
 
  The gem provides a command line tool. You can invoke it via
 
- bundle exec crawl http://google.com
+ bundle exec spidermech http://google.com
 
  It will crawl the page and give you the appropriate output.
 
+ ## Sample Output
+
+ [{:url=>"http://localhost:8321",
+ :assets=>
+ {:scripts=>["https://ajax.googleapis.com/ajax/libs/jquery/1.11.0/jquery.min.js", "http://getbootstrap.com/dist/js/bootstrap.min.js"],
+
+ :images=>[],
+
+ :css=>
+ ["http://getbootstrap.com/dist/css/bootstrap.min.css", "http://getbootstrap.com/examples/starter-template/starter-template.css"]},
+
+ :links
+ =>["/", "/about.html", "/contact.html"]},
+ ]
+
  ## Contributing
 
  1. Fork it ( http://github.com/<my-github-username>/crawler/fork )
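
The sample output added to the README is an array of page hashes, each with `:url`, `:assets` (holding `:scripts`, `:images`, and `:css`) and `:links`. As an illustrative sketch only, the snippet below shows one way a caller might summarise data of that shape. It assumes `spider.run` returns exactly the structure shown in the sample; `summarize_sitemap` is a hypothetical helper and not part of the gem.

```ruby
# Hypothetical helper: count assets and links per page for a sitemap shaped
# like the README's sample output.
def summarize_sitemap(sitemap)
  sitemap.map do |page|
    assets = page.fetch(:assets, {})
    {
      url: page[:url],
      # Total of scripts, images, and css entries for this page.
      asset_count: assets.values.sum(&:length),
      link_count: page.fetch(:links, []).length
    }
  end
end

# Data copied from the sample output in the README diff.
sitemap = [
  {
    url: "http://localhost:8321",
    assets: {
      scripts: [
        "https://ajax.googleapis.com/ajax/libs/jquery/1.11.0/jquery.min.js",
        "http://getbootstrap.com/dist/js/bootstrap.min.js"
      ],
      images: [],
      css: [
        "http://getbootstrap.com/dist/css/bootstrap.min.css",
        "http://getbootstrap.com/examples/starter-template/starter-template.css"
      ]
    },
    links: ["/", "/about.html", "/contact.html"]
  }
]

summary = summarize_sitemap(sitemap)
```

For the sample page this gives four assets (two scripts, no images, two stylesheets) and three links.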
@@ -1,7 +1,6 @@
  require 'mechanize'
  require 'logger'
  require 'json'
- require 'pry'
 
  class SpiderMech
  attr_reader :queue
@@ -4,7 +4,7 @@ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
 
  Gem::Specification.new do |spec|
  spec.name = "spidermech"
- spec.version = '0.0.1'
+ spec.version = '0.0.2'
  spec.authors = ["Caleb Albritton"]
  spec.email = ["ithinkincode@gmail.com"]
  spec.summary = "Single URL crawler."
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: spidermech
  version: !ruby/object:Gem::Version
- version: 0.0.1
+ version: 0.0.2
  platform: ruby
  authors:
  - Caleb Albritton