site_checker 0.2.1 → 0.3.0

data/.travis.yml ADDED
@@ -0,0 +1,3 @@
+ rvm:
+ - 1.8.7
+ - 1.9.3
data/History.md CHANGED
@@ -1,3 +1,13 @@
+ ## [v0.3.0](https://github.com/ZsoltFabok/site_checker/compare/v0.2.1...v0.3.0)
+ ### Notes
+ * the root argument is no longer mandatory (Zsolt Fabok)
+
+ ### Other Changes
+ * you can use site_checker from the command line. Type site_checker -h for more details (Zsolt Fabok)
+
+ ### Fixes
+ * the library is ruby 1.8 compatible (Zsolt Fabok)
+
  ## [v0.2.1](https://github.com/ZsoltFabok/site_checker/compare/v0.2.0...v0.2.1)
  ### Other Changes
  * less strict gem dependencies (Zsolt Fabok)
data/README.md CHANGED
@@ -9,7 +9,7 @@ Site Checker is a simple ruby gem, which helps you check the integrity of your w

  ### Usage

- #### Default
+ #### In Test Code

  First, you have to load the `site_checker` by adding this line to the file where you would like to use it:

@@ -37,13 +37,13 @@ In case you don't want to use a DSL like API you can still do the following:
      puts SiteChecker.local_images.inspect
      puts SiteChecker.problems.inspect

- ### Using on Generated Content
+ ##### Using on Generated Content
  If you have a static website (e.g. generated by [octopress](https://github.com/imathis/octopress)) you can tell `site_checker` to use folders from the file system. With this approach, you don't need a webserver for verifying your website:

      check_site("./public", "./public")
      puts collected_problems.inspect

- ### Configuration
+ ##### Configuration
  You can instruct `site_checker` to ignore certain links:

      SiteChecker.configure do |config|
@@ -62,7 +62,7 @@ Too deep recursive calls may be expensive, so you can configure the maximum dept
        config.max_recursion_depth = 3
      end

- ### Examples
+ ##### Examples
  Make sure that there are no local dead links on the website (I'm using [rspec](https://github.com/rspec/rspec) syntax):

      before(:each) do
@@ -93,6 +93,24 @@ Check that all the local pages can be reached with maximum two steps:
        collected_local_pages.size.should eq @number_of_local_pages
      end

+ #### Command line
+ From version 0.3.0 the site checker can be used from the command line as well. Here is the list of the available options:
+
+     ~ % site_checker -h
+     Visits the <site_url> and prints out the list of those URLs which cannot be found
+
+     Usage: site_checker [options] <site_url>
+         -e, --visit-external-references  Visit external references (may take a bit longer)
+         -m, --max-recursion-depth N      Set the depth of the recursion
+         -r, --root URL                   The root URL of the path
+         -i, --ignore URL                 Ignore the provided URL (can be applied several times)
+         -p, --print-local-pages          Prints the list of the URLs of the collected local pages
+         -x, --print-remote-pages         Prints the list of the URLs of the collected remote pages
+         -y, --print-local-images         Prints the list of the URLs of the collected local images
+         -z, --print-remote-images        Prints the list of the URLs of the collected remote images
+         -h, --help                       Show a short description and this message
+         -v, --version                    Show version
+
  ### Troubleshooting
  #### undefined method 'new' for SiteChecker:Module
  This error occurs when the test code calls v0.1.1 methods, but a newer version of the gem has already been installed. Update your test code following the examples above.
data/bin/site_checker ADDED
@@ -0,0 +1,4 @@
+ #!/usr/bin/env ruby
+
+ require "site_checker"
+ SiteChecker::Cli.new.start
data/lib/site_checker/cli/cli.rb ADDED
@@ -0,0 +1,126 @@
+ require 'optparse'
+
+ module SiteChecker
+   class Cli
+     def start
+       begin
+         options = option_parser
+         configure_site_checker(options)
+         check_site(ARGV[0], options[:root])
+
+         print_problems_if_any
+
+         print(collected_local_pages, "Collected local pages:", options[:print_local_pages])
+         print(collected_remote_pages, "Collected remote pages:", options[:print_remote_pages])
+         print(collected_local_images, "Collected local images:", options[:print_local_images])
+         print(collected_remote_images, "Collected remote images:", options[:print_remote_images])
+
+       rescue Interrupt
+         puts "Error: Interrupted"
+       rescue SystemExit
+         puts
+       rescue Exception => e
+         puts "Error: #{e.message}"
+         puts
+       end
+     end
+
+     private
+     def option_parser
+       options, optparse = create_parser
+
+       begin
+         optparse.parse!
+         if ARGV.size != 1
+           raise OptionParser::MissingArgument.new("<site_url>")
+         end
+       rescue OptionParser::InvalidOption, OptionParser::MissingArgument, OptionParser::InvalidArgument
+         message = $!.to_s + "\n\n" + optparse.to_s
+         raise Exception.new(message)
+       end
+       options
+     end
+
+     def print(list, message, enabled)
+       if !list.empty? && enabled
+         puts message
+         list.sort.each do |entry|
+           puts " #{entry}"
+         end
+       end
+     end
+
+     def print_problems_if_any
+       if not collected_problems.empty?
+         puts "Collected problems:"
+         collected_problems.keys.sort.each do |parent|
+           puts " #{parent}"
+           collected_problems[parent].sort.each do |url|
+             puts " #{url}"
+           end
+         end
+       end
+     end
+
+     def configure_site_checker(options)
+       SiteChecker.configure do |config|
+         config.ignore_list = options[:ignore_list] if options[:ignore_list]
+         config.visit_references = options[:visit_references] if options[:visit_references]
+         config.max_recursion_depth = options[:max_recursion_depth] if options[:max_recursion_depth]
+       end
+     end
+
+     def create_parser
+       options = {}
+       optparse = OptionParser.new do |opts|
+         opts.banner = "Usage: site_checker [options] <site_url>"
+
+         opts.on("-e", "--visit-external-references", "Visit external references (may take a bit longer)") do |opt|
+           options[:visit_references] = opt
+         end
+
+         opts.on("-m", "--max-recursion-depth N", Integer, "Set the depth of the recursion") do |opt|
+           options[:max_recursion_depth] = opt
+         end
+
+         opts.on("-r", "--root URL", "The root URL of the path") do |opt|
+           options[:root] = opt
+         end
+
+         opts.on("-i", "--ignore URL", "Ignore the provided URL (can be applied several times)") do |opt|
+           options[:ignore_list] ||= []
+           options[:ignore_list] << opt
+         end
+
+         opts.on("-p","--print-local-pages", "Prints the list of the URLs of the collected local pages") do |opt|
+           options[:print_local_pages] = opt
+         end
+
+         opts.on("-x", "--print-remote-pages", "Prints the list of the URLs of the collected remote pages") do |opt|
+           options[:print_remote_pages] = opt
+         end
+
+         opts.on("-y", "--print-local-images", "Prints the list of the URLs of the collected local images") do |opt|
+           options[:print_local_images] = opt
+         end
+
+         opts.on("-z", "--print-remote-images", "Prints the list of the URLs of the collected remote images") do |opt|
+           options[:print_remote_images] = opt
+         end
+
+         opts.on_tail("-h", "--help", "Show a short description and this message") do
+           puts "Visits the <site_url> and prints out the list of those URLs which cannot be found"
+           puts
+           puts opts
+           exit
+         end
+
+         opts.on_tail("-v", "--version", "Show version") do
+           puts SiteChecker::VERSION
+           exit
+         end
+       end
+       [options, optparse]
+     end
+   end
+ end
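Note on the `-i` option above: `create_parser` relies on OptionParser calling the handler block once for every occurrence of a switch, which is why the ignore list can grow across repeated flags. A minimal sketch of that pattern in isolation follows. `parse_cli_options` is a hypothetical helper, not part of site_checker, and it parses an explicit array instead of `ARGV`.

```ruby
require 'optparse'

# Hypothetical helper (not part of site_checker): parses an argv array and
# returns the collected options plus the remaining positional arguments.
def parse_cli_options(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.on("-i", "--ignore URL", "Ignore the provided URL") do |url|
      # The block runs once per occurrence, so repeated flags accumulate.
      (options[:ignore_list] ||= []) << url
    end
    opts.on("-m", "--max-recursion-depth N", Integer, "Set the depth") do |n|
      # The Integer argument makes OptionParser convert the value.
      options[:max_recursion_depth] = n
    end
  end
  # parse! strips the recognised switches and returns what is left over.
  rest = parser.parse!(argv.dup)
  [options, rest]
end

options, rest = parse_cli_options(%w[-i /a --ignore /b -m 3 ./public])
puts options[:ignore_list].join(", ")
puts options[:max_recursion_depth]
puts rest.first
```

Working on a copy of the argument array keeps the sketch free of side effects; the gem itself parses `ARGV` in place and then reads `ARGV[0]` as the site URL.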
data/lib/site_checker/link.rb CHANGED
@@ -1,13 +1,13 @@
  module SiteChecker
    class Link
-     attr_accessor :url
+     attr_accessor :url
      attr_accessor :modified_url
-     attr_accessor :parent_url
-     attr_accessor :kind
-     attr_accessor :location
-     attr_accessor :problem
+     attr_accessor :parent_url
+     attr_accessor :kind
+     attr_accessor :location
+     attr_accessor :problem

-     def eql?(other)
+     def eql?(other)
        @modified_url.eql? other.modified_url
      end

@@ -20,13 +20,13 @@ module SiteChecker
      end

      def self.create(attrs)
-       link = Link.new
-       attrs.each do |key, value|
-         if self.instance_methods.include?("#{key}=".to_sym)
-           eval("link.#{key}=value")
-         end
-       end
-       link
+       link = Link.new
+       attrs.each do |key, value|
+         if self.instance_methods.map{|m| m.to_s}.include?("#{key}=")
+           eval("link.#{key}=value")
+         end
+       end
+       link
      end

      def parent_url=(parent_url)
@@ -40,19 +40,19 @@ module SiteChecker
      end

      def has_problem?
-       @problem != nil
+       @problem != nil
      end

      def local_page?
-       @location == :local && @kind == :page
+       @location == :local && @kind == :page
      end

      def local_image?
-       @location == :local && @kind == :image
+       @location == :local && @kind == :image
      end

      def anchor?
-       @kind == :anchor
+       @kind == :anchor
      end

      def anchor_ref?
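Note on the `Link.create` change: `Module#instance_methods` returns strings on Ruby 1.8 and symbols on 1.9, which is why the new line converts every name with `to_s` before comparing. One possible alternative, sketched below with a hypothetical `SketchLink` class, is to ask the object through `respond_to?` and assign with `send`. This is an illustration only and not what the gem does.

```ruby
# Hypothetical stand-in for SiteChecker::Link with just two attributes.
class SketchLink
  attr_accessor :url, :kind

  def self.create(attrs)
    link = new
    attrs.each do |key, value|
      setter = "#{key}="
      # respond_to? accepts a String or a Symbol on Ruby 1.8 and 1.9 alike,
      # so the list of instance methods never has to be converted.
      link.send(setter, value) if link.respond_to?(setter)
    end
    link
  end
end

# Unknown keys are skipped silently, as in the original create method.
link = SketchLink.create(:url => "/about", :kind => :page, :unknown => 1)
puts link.url
puts link.kind
```

Using `send` also removes the need for `eval`, at the cost of diverging from the code in this release.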
data/lib/site_checker/link_collector.rb CHANGED
@@ -9,10 +9,10 @@ module SiteChecker
        @max_recursion_depth ||= -1
      end

-     def check(url, root)
+     def check(url, root=nil)
        @links = {}
        @recursion_depth = 0
-       @root = root
+       @root = figure_out_root(url,root)

        @content_reader = get_content_reader

@@ -50,6 +50,18 @@ module SiteChecker
      end

      private
+     def figure_out_root(url, root)
+       unless root
+         url_uri = URI(url)
+         if url_uri.absolute?
+           root = "#{url_uri.scheme}://#{url_uri.host}"
+         else
+           root = url
+         end
+       end
+       root
+     end
+
      def get_content_reader
        if URI(@root).absolute?
          SiteChecker::IO::ContentFromWeb.new(@visit_references, @root)
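Note on `figure_out_root`: the fallback can be tried in isolation. The sketch below mirrors the logic of the hunk under a hypothetical name, `derive_root`, and is not the gem's code. Like the original, it builds the root from scheme and host only, so a port in the URL is not carried over into the derived root.

```ruby
require 'uri'

# Hypothetical stand-alone version of the root fallback added in this diff.
def derive_root(url, root = nil)
  # An explicitly passed root always wins.
  return root if root

  uri = URI(url)
  # Absolute URLs are reduced to scheme and host; anything else, such as a
  # folder on the file system, is used as the root unchanged.
  uri.absolute? ? "#{uri.scheme}://#{uri.host}" : url
end

puts derive_root("http://example.org/blog/index.html")
puts derive_root("./public")
puts derive_root("http://localhost:4000")
```

The last call shows the port behaviour mentioned above: the derived root contains the host name without `:4000`.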
data/lib/site_checker/version.rb ADDED
@@ -0,0 +1,3 @@
+ module SiteChecker
+   VERSION = '0.3.0'
+ end
data/lib/site_checker.rb CHANGED
@@ -7,6 +7,8 @@ require 'site_checker/parse/page'
  require 'site_checker/link'
  require 'site_checker/link_collector'
  require 'site_checker/dsl'
+ require 'site_checker/version'
+ require 'site_checker/cli/cli'

  module SiteChecker
    class << self
@@ -44,9 +46,9 @@ module SiteChecker
      # Recursively visits the provided url looking for reference problems.
      #
      # @param [String] url where the processing starts
-     # @param [String] root the root URL of the site
+     # @param [String] root (optional) the root URL of the site. If not provided then the method will use the url to figure it out.
      #
-     def check(url, root)
+     def check(url, root=nil)
        create_instance
        @link_collector.check(url, root)
      end
data/site_checker.gemspec CHANGED
@@ -1,7 +1,13 @@
  # -*- encoding: utf-8 -*-
+
+ lib = File.expand_path('../lib/', __FILE__)
+ $:.unshift lib unless $:.include?(lib)
+
+ require 'site_checker/version'
+
  Gem::Specification.new do |s|
    s.name = 'site_checker'
-   s.version = '0.2.1'
+   s.version = SiteChecker::VERSION
    s.date = '2012-12-31'
    s.summary = "site_checker-#{s.version}"
    s.description = "A simple tool for checking references on your website"
data/spec/integration_spec.rb CHANGED
@@ -188,4 +188,34 @@ describe "Integration" do
        SiteChecker.problems.should be_empty
      end
    end
+
+   describe "without root argument" do
+     before(:each) do
+       @test_url = "http://localhost:4000"
+     end
+
+     it "should check the link to an external page" do
+       content = "<html>text<a href=\"http://external.org/\"/></html>"
+       webmock(@test_url, 200, content)
+       webmock("http://external.org", 200, "")
+       SiteChecker.check(@test_url)
+       SiteChecker.remote_pages.should eql(["http://external.org/" ])
+       SiteChecker.problems.should be_empty
+     end
+
+     it "should report a problem for a local page with absolute path" do
+       content = "<html>text<a href=\"#{@test_url}/another\"/></html>"
+       webmock(@test_url, 200, content)
+       webmock("#{@test_url}/another", 200, "")
+       SiteChecker.check(@test_url)
+       SiteChecker.problems.should eql({@test_url => ["#{@test_url}/another (absolute path)"]})
+     end
+
+     it "should report a problem when the local image cannot be found" do
+       content = "<html>text<img src=\"/a.png\"/></html>"
+       filesystemmock("index.html", content)
+       SiteChecker.check(fs_test_path)
+       SiteChecker.problems.should eql({fs_test_path => ["/a.png (404 Not Found)"]})
+     end
+   end
  end
data/spec/site_checker/cli/cli_spec.rb ADDED
@@ -0,0 +1,74 @@
+ require 'spec_helper'
+ require 'site_checker/cli/cli_spec_helper'
+
+ describe "CLI" do
+   include CliSpecHelper
+
+   before(:each) do
+     @command = File.expand_path('../../../bin/site_checker', File.dirname(__FILE__))
+     clean_fs_test_path
+   end
+
+   context "argument check" do
+     it "help option should print a description and the list of available options" do
+       description = "Visits the <site_url> and prints out the list of those URLs which cannot be found\n\n"
+       output = exec(@command, "-h")
+       output2 = exec(@command, "--help")
+       output.should eql(output2)
+       output.should eql(description + option_list)
+     end
+
+     it "version option should print the current version" do
+       output = exec(@command, "-v")
+       output2 = exec(@command, "--version")
+       output.should eql(output2)
+       output.should eql(SiteChecker::VERSION + "\n")
+     end
+
+     it "missing number after the max-recursion-depth option should print error and the list of available options" do
+       message = "Error: missing argument: --max-recursion-depth\n\n"
+       output = exec(@command, "--max-recursion-depth")
+       output.should eql(message + option_list)
+     end
+
+     it "missing site_url should print error message" do
+       message = "Error: missing argument: <site_url>\n\n"
+       output = exec(@command, "")
+       output.should eql(message + option_list)
+     end
+   end
+
+   context "execution" do
+     it "should run a basic check without any problem" do
+       content = "<html>text<a href=\"/one-level-down\"/></html>"
+       filesystemmock("index.html", content)
+       filesystemmock("/one-level-down/index.html", content)
+
+       expected = "Collected local pages:\n /one-level-down\n test_data_public"
+
+       output = exec(@command, "--print-local-pages #{fs_test_path}")
+       output.should eql(expected)
+     end
+
+     it "should print out the collected problems" do
+       content = "<html>text<a href=\"/one-level-down\"/></html>"
+       filesystemmock("index.html", content)
+       filesystemmock("/two-levels-down/index.html", content)
+
+       expected = "Collected problems:\n test_data_public\n /one-level-down (404 Not Found)"
+
+       output = exec(@command, "#{fs_test_path}")
+       output.should eql(expected)
+     end
+
+     it "should ignore all the provided URLs" do
+       content = "<html>text<a href=\"/one-level-down\"/><a href=\"/two-levels-down\"/></html>"
+       filesystemmock("index.html", content)
+
+       expected = "Collected local pages:\n test_data_public"
+
+       output = exec(@command, "--print-local-pages --ignore /one-level-down --ignore /two-levels-down #{fs_test_path}")
+       output.should eql(expected)
+     end
+   end
+ end
data/spec/site_checker/cli/cli_spec_helper.rb ADDED
@@ -0,0 +1,25 @@
+ require 'site_checker/io/io_spec_helper'
+ require 'open3'
+
+ module CliSpecHelper
+   include IoSpecHelper
+
+   def exec(command, arguments)
+     stdin, stdout, stderr = Open3.popen3("#{command} #{arguments}")
+     stdout.readlines.map {|line| line.chomp}.join("\n")
+   end
+
+   def option_list
+     "Usage: site_checker [options] <site_url>\n" +
+     "    -e, --visit-external-references  Visit external references (may take a bit longer)\n" +
+     "    -m, --max-recursion-depth N      Set the depth of the recursion\n" +
+     "    -r, --root URL                   The root URL of the path\n" +
+     "    -i, --ignore URL                 Ignore the provided URL (can be applied several times)\n" +
+     "    -p, --print-local-pages          Prints the list of the URLs of the collected local pages\n" +
+     "    -x, --print-remote-pages         Prints the list of the URLs of the collected remote pages\n" +
+     "    -y, --print-local-images         Prints the list of the URLs of the collected local images\n" +
+     "    -z, --print-remote-images        Prints the list of the URLs of the collected remote images\n" +
+     "    -h, --help                       Show a short description and this message\n" +
+     "    -v, --version                    Show version\n"
+   end
+ end
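Note on the `exec` helper: it reads only stdout and does not close the pipes returned by `Open3.popen3`. A possible variant, assuming Ruby 1.9 or later (so it would not fit the 1.8.7 target of this release), uses `Open3.capture2`, which waits for the process and closes the pipes itself. `run_command` is a hypothetical name and the command being run is just the current Ruby interpreter.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical variant of the spec helper: returns the chomped stdout lines
# joined with "\n", plus a flag telling whether the command exited cleanly.
def run_command(command, arguments)
  # capture2 collects stdout and the exit status; stderr is not captured.
  output, status = Open3.capture2("#{command} #{arguments}")
  [output.lines.map { |line| line.chomp }.join("\n"), status.success?]
end

# RbConfig.ruby is the path of the interpreter running this script.
output, ok = run_command(RbConfig.ruby, "-e 'puts 1; puts 2'")
puts output
puts ok
```

Returning the exit status alongside the output would let a spec distinguish a failed run from an empty result, which the current helper cannot do.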
data/spec/site_checker/io/io_spec_helper.rb CHANGED
@@ -1,22 +1,22 @@
  module IoSpecHelper
-   def webmock(uri, status, content)
-     stub_request(:get, uri).
-       with(:headers => {'Accept'=>'*/*', 'User-Agent'=>'Ruby'}).
-       to_return(:status => status, :body => content)
-   end
+   def webmock(uri, status, content)
+     stub_request(:get, uri).
+       with(:headers => {'Accept'=>'*/*'}).\
+       to_return(:status => status, :body => content)
+   end

-   def filesystemmock(uri, content)
-     FileUtils.mkdir_p(File.dirname(File.join(fs_test_path, uri)))
-     File.open(File.join(fs_test_path, uri), "w") do |f|
-       f.write(content)
-     end
-   end
+   def filesystemmock(uri, content)
+     FileUtils.mkdir_p(File.dirname(File.join(fs_test_path, uri)))
+     File.open(File.join(fs_test_path, uri), "w") do |f|
+       f.write(content)
+     end
+   end

-   def clean_fs_test_path
-     FileUtils.rm_rf(fs_test_path)
-   end
+   def clean_fs_test_path
+     FileUtils.rm_rf(fs_test_path)
+   end

-   def fs_test_path
-     "test_data_public"
-   end
+   def fs_test_path
+     "test_data_public"
+   end
  end
data/spec/spec_helper.rb CHANGED
@@ -1,10 +1,16 @@
  require 'rspec'
- require 'debugger'
  require 'webmock/rspec'

+ begin
+   require "debugger"
+ rescue LoadError
+   # most probably using 1.8
+   require "ruby-debug"
+ end
+
  require File.expand_path('../../lib/site_checker', __FILE__)

  # common
  def create_link(url)
-   SiteChecker::Link.create({:url => url})
+   SiteChecker::Link.create({:url => url})
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: site_checker
  version: !ruby/object:Gem::Version
- version: 0.2.1
+ version: 0.3.0
  prerelease:
  platform: ruby
  authors:
@@ -93,27 +93,34 @@ dependencies:
  version: 1.5.6
  description: A simple tool for checking references on your website
  email: me@zsoltfabok.com
- executables: []
+ executables:
+ - site_checker
  extensions: []
  extra_rdoc_files: []
  files:
  - .rbenv-version
  - .rspec
+ - .travis.yml
  - History.md
  - LICENSE
  - README.md
+ - bin/site_checker
  - gem_tasks/rspec.rake
  - gem_tasks/yard.rake
  - lib/site_checker.rb
+ - lib/site_checker/cli/cli.rb
  - lib/site_checker/dsl.rb
  - lib/site_checker/io/content_from_file_system.rb
  - lib/site_checker/io/content_from_web.rb
  - lib/site_checker/link.rb
  - lib/site_checker/link_collector.rb
  - lib/site_checker/parse/page.rb
+ - lib/site_checker/version.rb
  - site_checker.gemspec
  - spec/dsl_spec.rb
  - spec/integration_spec.rb
+ - spec/site_checker/cli/cli_spec.rb
+ - spec/site_checker/cli/cli_spec_helper.rb
  - spec/site_checker/io/content_from_file_system_spec.rb
  - spec/site_checker/io/content_from_web_spec.rb
  - spec/site_checker/io/io_spec_helper.rb
@@ -137,7 +144,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
  version: '0'
  segments:
  - 0
- hash: 321001871174332709
+ hash: -2225611601008227703
  required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
@@ -146,16 +153,18 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  segments:
  - 0
- hash: 321001871174332709
+ hash: -2225611601008227703
  requirements: []
  rubyforge_project:
  rubygems_version: 1.8.23
  signing_key:
  specification_version: 3
- summary: site_checker-0.2.1
+ summary: site_checker-0.3.0
  test_files:
  - spec/dsl_spec.rb
  - spec/integration_spec.rb
+ - spec/site_checker/cli/cli_spec.rb
+ - spec/site_checker/cli/cli_spec_helper.rb
  - spec/site_checker/io/content_from_file_system_spec.rb
  - spec/site_checker/io/content_from_web_spec.rb
  - spec/site_checker/io/io_spec_helper.rb