site_checker 0.2.1 → 0.3.0

data/.travis.yml ADDED
@@ -0,0 +1,3 @@
+ rvm:
+ - 1.8.7
+ - 1.9.3
data/History.md CHANGED
@@ -1,3 +1,13 @@
+ ## [v0.3.0](https://github.com/ZsoltFabok/site_checker/compare/v0.2.1...v0.3.0)
+ ### Notes
+ * the root argument is no longer mandatory (Zsolt Fabok)
+
+ ### Other Changes
+ * you can use site_checker from the command line. Type site_checker -h for more details (Zsolt Fabok)
+
+ ### Fixes
+ * the library is ruby 1.8 compatible (Zsolt Fabok)
+
  ## [v0.2.1](https://github.com/ZsoltFabok/site_checker/compare/v0.2.0...v0.2.1)
  ### Other Changes
  * less strict gem dependencies (Zsolt Fabok)
data/README.md CHANGED
@@ -9,7 +9,7 @@ Site Checker is a simple ruby gem, which helps you check the integrity of your w
 
  ### Usage
 
- #### Default
+ #### In Test Code
 
  First, you have to load the `site_checker` by adding this line to the file where you would like to use it:
 
@@ -37,13 +37,13 @@ In case you don't want to use a DSL like API you can still do the following:
  puts SiteChecker.local_images.inspect
  puts SiteChecker.problems.inspect
 
- ### Using on Generated Content
+ ##### Using on Generated Content
  If you have a static website (e.g. generated by [octopress](https://github.com/imathis/octopress)) you can tell `site_checker` to use folders from the file system. With this approach, you don't need a webserver for verifying your website:
 
  check_site("./public", "./public")
  puts collected_problems.inspect
 
- ### Configuration
+ ##### Configuration
  You can instruct `site_checker` to ignore certain links:
 
  SiteChecker.configure do |config|
@@ -62,7 +62,7 @@ Too deep recursive calls may be expensive, so you can configure the maximum dept
  config.max_recursion_depth = 3
  end
 
- ### Examples
+ ##### Examples
  Make sure that there are no local dead links on the website (I'm using [rspec](https://github.com/rspec/rspec) syntax):
 
  before(:each) do
@@ -93,6 +93,24 @@ Check that all the local pages can be reached with maximum two steps:
  collected_local_pages.size.should eq @number_of_local_pages
  end
 
+ #### Command line
+ From version 0.3.0 the site checker can be used from the command line as well. Here is the list of the available options:
+
+ ~ % site_checker -h
+ Visits the <site_url> and prints out the list of those URLs which cannot be found
+
+ Usage: site_checker [options] <site_url>
+ -e, --visit-external-references Visit external references (may take a bit longer)
+ -m, --max-recursion-depth N Set the depth of the recursion
+ -r, --root URL The root URL of the path
+ -i, --ignore URL Ignore the provided URL (can be applied several times)
+ -p, --print-local-pages Prints the list of the URLs of the collected local pages
+ -x, --print-remote-pages Prints the list of the URLs of the collected remote pages
+ -y, --print-local-images Prints the list of the URLs of the collected local images
+ -z, --print-remote-images Prints the list of the URLs of the collected remote images
+ -h, --help Show a short description and this message
+ -v, --version Show version
+
  ### Troubleshooting
  #### undefined method 'new' for SiteChecker:Module
  This error occurs when the test code calls v0.1.1 methods, but a newer version of the gem has already been installed. Update your test code following the examples above.
data/bin/site_checker ADDED
@@ -0,0 +1,4 @@
+ #!/usr/bin/env ruby
+
+ require "site_checker"
+ SiteChecker::Cli.new.start
@@ -0,0 +1,126 @@
+ require 'optparse'
+
+ module SiteChecker
+ class Cli
+ def start
+ begin
+ options = option_parser
+ configure_site_checker(options)
+ check_site(ARGV[0], options[:root])
+
+ print_problems_if_any
+
+ print(collected_local_pages, "Collected local pages:", options[:print_local_pages])
+ print(collected_remote_pages, "Collected remote pages:", options[:print_remote_pages])
+ print(collected_local_images, "Collected local images:", options[:print_local_images])
+ print(collected_remote_images, "Collected remote images:", options[:print_remote_images])
+
+ rescue Interrupt
+ puts "Error: Interrupted"
+ rescue SystemExit
+ puts
+ rescue Exception => e
+ puts "Error: #{e.message}"
+ puts
+ end
+ end
+
+ private
+ def option_parser
+ options, optparse = create_parser
+
+ begin
+ optparse.parse!
+ if ARGV.size != 1
+ raise OptionParser::MissingArgument.new("<site_url>")
+ end
+ rescue OptionParser::InvalidOption, OptionParser::MissingArgument, OptionParser::InvalidArgument
+ message = $!.to_s + "\n\n" + optparse.to_s
+ raise Exception.new(message)
+ end
+ options
+ end
+
+ def print(list, message, enabled)
+ if !list.empty? && enabled
+ puts message
+ list.sort.each do |entry|
+ puts " #{entry}"
+ end
+ end
+ end
+
+ def print_problems_if_any
+ if not collected_problems.empty?
+ puts "Collected problems:"
+ collected_problems.keys.sort.each do |parent|
+ puts " #{parent}"
+ collected_problems[parent].sort.each do |url|
+ puts " #{url}"
+ end
+ end
+ end
+ end
+
+ def configure_site_checker(options)
+ SiteChecker.configure do |config|
+ config.ignore_list = options[:ignore_list] if options[:ignore_list]
+ config.visit_references = options[:visit_references] if options[:visit_references]
+ config.max_recursion_depth = options[:max_recursion_depth] if options[:max_recursion_depth]
+ end
+ end
+
+ def create_parser
+ options = {}
+ optparse = OptionParser.new do |opts|
+ opts.banner = "Usage: site_checker [options] <site_url>"
+
+ opts.on("-e", "--visit-external-references", "Visit external references (may take a bit longer)") do |opt|
+ options[:visit_references] = opt
+ end
+
+ opts.on("-m", "--max-recursion-depth N", Integer, "Set the depth of the recursion") do |opt|
+ options[:max_recursion_depth] = opt
+ end
+
+ opts.on("-r", "--root URL", "The root URL of the path") do |opt|
+ options[:root] = opt
+ end
+
+ opts.on("-i", "--ignore URL", "Ignore the provided URL (can be applied several times)") do |opt|
+ options[:ignore_list] ||= []
+ options[:ignore_list] << opt
+ end
+
+ opts.on("-p","--print-local-pages", "Prints the list of the URLs of the collected local pages") do |opt|
+ options[:print_local_pages] = opt
+ end
+
+ opts.on("-x", "--print-remote-pages", "Prints the list of the URLs of the collected remote pages") do |opt|
+ options[:print_remote_pages] = opt
+ end
+
+ opts.on("-y", "--print-local-images", "Prints the list of the URLs of the collected local images") do |opt|
+ options[:print_local_images] = opt
+ end
+
+ opts.on("-z", "--print-remote-images", "Prints the list of the URLs of the collected remote images") do |opt|
+ options[:print_remote_images] = opt
+ end
+
+ opts.on_tail("-h", "--help", "Show a short description and this message") do
+ puts "Visits the <site_url> and prints out the list of those URLs which cannot be found"
+ puts
+ puts opts
+ exit
+ end
+
+ opts.on_tail("-v", "--version", "Show version") do
+ puts SiteChecker::VERSION
+ exit
+ end
+ end
+ [options, optparse]
+ end
+ end
+ end
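Review note on the option handling above: `-i/--ignore` can be given several times because `OptionParser` calls the option's block once per occurrence, and `Integer` as the third argument converts the value of `-m`. The following is only a minimal sketch of that behaviour; `parse_cli_args` is a hypothetical helper and is not part of the gem.

```ruby
require 'optparse'

# Hypothetical helper (not part of site_checker): parses an argument array
# and returns the collected options plus the leftover positional arguments.
def parse_cli_args(argv)
  options = {}
  parser = OptionParser.new do |opts|
    # The block runs once per occurrence, so repeated flags accumulate.
    opts.on("-i", "--ignore URL", "Ignore the provided URL") do |url|
      (options[:ignore_list] ||= []) << url
    end
    # Passing Integer makes OptionParser convert (and validate) the value.
    opts.on("-m", "--max-recursion-depth N", Integer, "Set the depth of the recursion") do |n|
      options[:max_recursion_depth] = n
    end
  end
  # parse (without the bang) leaves argv untouched and returns the rest.
  rest = parser.parse(argv)
  [options, rest]
end

options, rest = parse_cli_args(%w[-i /a --ignore /b -m 3 ./public])
```

Unlike `Cli#option_parser`, the sketch takes the argument list as a parameter instead of reading `ARGV`, which makes it easier to exercise from a spec without spawning a process.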
@@ -1,13 +1,13 @@
  module SiteChecker
  class Link
- attr_accessor :url
+ attr_accessor :url
  attr_accessor :modified_url
- attr_accessor :parent_url
- attr_accessor :kind
- attr_accessor :location
- attr_accessor :problem
+ attr_accessor :parent_url
+ attr_accessor :kind
+ attr_accessor :location
+ attr_accessor :problem
 
- def eql?(other)
+ def eql?(other)
  @modified_url.eql? other.modified_url
  end
 
@@ -20,13 +20,13 @@ module SiteChecker
  end
 
  def self.create(attrs)
- link = Link.new
- attrs.each do |key, value|
- if self.instance_methods.include?("#{key}=".to_sym)
- eval("link.#{key}=value")
- end
- end
- link
+ link = Link.new
+ attrs.each do |key, value|
+ if self.instance_methods.map{|m| m.to_s}.include?("#{key}=")
+ eval("link.#{key}=value")
+ end
+ end
+ link
  end
 
  def parent_url=(parent_url)
@@ -40,19 +40,19 @@ module SiteChecker
  end
 
  def has_problem?
- @problem != nil
+ @problem != nil
  end
 
  def local_page?
- @location == :local && @kind == :page
+ @location == :local && @kind == :page
  end
 
  def local_image?
- @location == :local && @kind == :image
+ @location == :local && @kind == :image
  end
 
  def anchor?
- @kind == :anchor
+ @kind == :anchor
  end
 
  def anchor_ref?
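Review note on the `Link.create` change in this file: `instance_methods` returns strings on Ruby 1.8 and symbols on 1.9, which is why the new code maps every name through `to_s` before comparing. A possible alternative, sketched below under the assumption that only public setters matter, asks the object itself via `respond_to?` and calls the setter with `send`, which also avoids `eval`. The `Link` class here is a cut-down stand-in for illustration, not the gem's class.

```ruby
# Cut-down stand-in for SiteChecker::Link, for illustration only.
class Link
  attr_accessor :url, :kind

  def self.create(attrs)
    link = new
    attrs.each do |key, value|
      setter = "#{key}="
      # respond_to? accepts a String or a Symbol on both 1.8 and 1.9,
      # so no conversion of the method list is needed.
      link.send(setter, value) if link.respond_to?(setter)
    end
    link
  end
end

# Unknown keys are skipped instead of raising NoMethodError.
link = Link.create(:url => "/about", :kind => :page, :unknown => 1)
```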
@@ -9,10 +9,10 @@ module SiteChecker
  @max_recursion_depth ||= -1
  end
 
- def check(url, root)
+ def check(url, root=nil)
  @links = {}
  @recursion_depth = 0
- @root = root
+ @root = figure_out_root(url,root)
 
  @content_reader = get_content_reader
 
@@ -50,6 +50,18 @@ module SiteChecker
  end
 
  private
+ def figure_out_root(url, root)
+ unless root
+ url_uri = URI(url)
+ if url_uri.absolute?
+ root = "#{url_uri.scheme}://#{url_uri.host}"
+ else
+ root = url
+ end
+ end
+ root
+ end
+
  def get_content_reader
  if URI(@root).absolute?
  SiteChecker::IO::ContentFromWeb.new(@visit_references, @root)
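Review note on `figure_out_root`: the rule it adds can be tried in isolation. The sketch below restates the same logic as a stand-alone method; the name `derive_root` is hypothetical and the method is not part of the gem.

```ruby
require 'uri'

# Stand-alone restatement of the root-derivation rule; derive_root is a
# hypothetical name used only for this illustration.
def derive_root(url, root = nil)
  # An explicitly provided root always wins.
  return root if root
  uri = URI(url)
  # Absolute URLs contribute scheme and host; anything else (for example
  # a folder on the file system) is used as the root unchanged.
  uri.absolute? ? "#{uri.scheme}://#{uri.host}" : url
end

web_root  = derive_root("http://example.com/blog/index.html")
file_root = derive_root("./public")
port_root = derive_root("http://localhost:4000/page")
```

Because only `scheme` and `host` are interpolated, a port in the start URL is not carried into the derived root: in this sketch `port_root` is `http://localhost`. Whether that matters for the gem depends on how the content reader uses `@root`, which this diff does not show.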
@@ -0,0 +1,3 @@
+ module SiteChecker
+ VERSION = '0.3.0'
+ end
data/lib/site_checker.rb CHANGED
@@ -7,6 +7,8 @@ require 'site_checker/parse/page'
  require 'site_checker/link'
  require 'site_checker/link_collector'
  require 'site_checker/dsl'
+ require 'site_checker/version'
+ require 'site_checker/cli/cli'
 
  module SiteChecker
  class << self
@@ -44,9 +46,9 @@ module SiteChecker
  # Recursively visits the provided url looking for reference problems.
  #
  # @param [String] url where the processing starts
- # @param [String] root the root URL of the site
+ # @param [String] root (optional) the root URL of the site. If not provided then the method will use the url to figure it out.
  #
- def check(url, root)
+ def check(url, root=nil)
  create_instance
  @link_collector.check(url, root)
  end
data/site_checker.gemspec CHANGED
@@ -1,7 +1,13 @@
  # -*- encoding: utf-8 -*-
+
+ lib = File.expand_path('../lib/', __FILE__)
+ $:.unshift lib unless $:.include?(lib)
+
+ require 'site_checker/version'
+
  Gem::Specification.new do |s|
  s.name = 'site_checker'
- s.version = '0.2.1'
+ s.version = SiteChecker::VERSION
  s.date = '2012-12-31'
  s.summary = "site_checker-#{s.version}"
  s.description = "A simple tool for checking references on your website"
@@ -188,4 +188,34 @@ describe "Integration" do
  SiteChecker.problems.should be_empty
  end
  end
+
+ describe "without root argument" do
+ before(:each) do
+ @test_url = "http://localhost:4000"
+ end
+
+ it "should check the link to an external page" do
+ content = "<html>text<a href=\"http://external.org/\"/></html>"
+ webmock(@test_url, 200, content)
+ webmock("http://external.org", 200, "")
+ SiteChecker.check(@test_url)
+ SiteChecker.remote_pages.should eql(["http://external.org/" ])
+ SiteChecker.problems.should be_empty
+ end
+
+ it "should report a problem for a local page with absolute path" do
+ content = "<html>text<a href=\"#{@test_url}/another\"/></html>"
+ webmock(@test_url, 200, content)
+ webmock("#{@test_url}/another", 200, "")
+ SiteChecker.check(@test_url)
+ SiteChecker.problems.should eql({@test_url => ["#{@test_url}/another (absolute path)"]})
+ end
+
+ it "should report a problem when the local image cannot be found" do
+ content = "<html>text<img src=\"/a.png\"/></html>"
+ filesystemmock("index.html", content)
+ SiteChecker.check(fs_test_path)
+ SiteChecker.problems.should eql({fs_test_path => ["/a.png (404 Not Found)"]})
+ end
+ end
  end
@@ -0,0 +1,74 @@
+ require 'spec_helper'
+ require 'site_checker/cli/cli_spec_helper'
+
+ describe "CLI" do
+ include CliSpecHelper
+
+ before(:each) do
+ @command = File.expand_path('../../../bin/site_checker', File.dirname(__FILE__))
+ clean_fs_test_path
+ end
+
+ context "argument check" do
+ it "help option should print a description and the list of available options" do
+ description = "Visits the <site_url> and prints out the list of those URLs which cannot be found\n\n"
+ output = exec(@command, "-h")
+ output2 = exec(@command, "--help")
+ output.should eql(output2)
+ output.should eql(description + option_list)
+ end
+
+ it "version option should print the current version" do
+ output = exec(@command, "-v")
+ output2 = exec(@command, "--version")
+ output.should eql(output2)
+ output.should eql(SiteChecker::VERSION + "\n")
+ end
+
+ it "missing number after the max-recursion-depth option should print error and the list of available options" do
+ message = "Error: missing argument: --max-recursion-depth\n\n"
+ output = exec(@command, "--max-recursion-depth")
+ output.should eql(message + option_list)
+ end
+
+ it "missing site_url should print error message" do
+ message = "Error: missing argument: <site_url>\n\n"
+ output = exec(@command, "")
+ output.should eql(message + option_list)
+ end
+ end
+
+ context "execution" do
+ it "should run a basic check without any problem" do
+ content = "<html>text<a href=\"/one-level-down\"/></html>"
+ filesystemmock("index.html", content)
+ filesystemmock("/one-level-down/index.html", content)
+
+ expected = "Collected local pages:\n /one-level-down\n test_data_public"
+
+ output = exec(@command, "--print-local-pages #{fs_test_path}")
+ output.should eql(expected)
+ end
+
+ it "should print out the collected problems" do
+ content = "<html>text<a href=\"/one-level-down\"/></html>"
+ filesystemmock("index.html", content)
+ filesystemmock("/two-levels-down/index.html", content)
+
+ expected = "Collected problems:\n test_data_public\n /one-level-down (404 Not Found)"
+
+ output = exec(@command, "#{fs_test_path}")
+ output.should eql(expected)
+ end
+
+ it "should ignore all the provided URLs" do
+ content = "<html>text<a href=\"/one-level-down\"/><a href=\"/two-levels-down\"/></html>"
+ filesystemmock("index.html", content)
+
+ expected = "Collected local pages:\n test_data_public"
+
+ output = exec(@command, "--print-local-pages --ignore /one-level-down --ignore /two-levels-down #{fs_test_path}")
+ output.should eql(expected)
+ end
+ end
+ end
@@ -0,0 +1,25 @@
+ require 'site_checker/io/io_spec_helper'
+ require 'open3'
+
+ module CliSpecHelper
+ include IoSpecHelper
+
+ def exec(command, arguments)
+ stdin, stdout, stderr = Open3.popen3("#{command} #{arguments}")
+ stdout.readlines.map {|line| line.chomp}.join("\n")
+ end
+
+ def option_list
+ "Usage: site_checker [options] <site_url>\n" +
+ " -e, --visit-external-references Visit external references (may take a bit longer)\n" +
+ " -m, --max-recursion-depth N Set the depth of the recursion\n" +
+ " -r, --root URL The root URL of the path\n" +
+ " -i, --ignore URL Ignore the provided URL (can be applied several times)\n" +
+ " -p, --print-local-pages Prints the list of the URLs of the collected local pages\n" +
+ " -x, --print-remote-pages Prints the list of the URLs of the collected remote pages\n" +
+ " -y, --print-local-images Prints the list of the URLs of the collected local images\n" +
+ " -z, --print-remote-images Prints the list of the URLs of the collected remote images\n" +
+ " -h, --help Show a short description and this message\n" +
+ " -v, --version Show version\n"
+ end
+ end
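Review note on the `exec` helper above: `Open3.popen3` without a block leaves the three pipes open and does not wait for the child process. Where Ruby 1.8 support is not required for the specs, `Open3.capture2` might be a simpler fit, since it waits for the process and closes the pipes itself. The sketch below is an assumption about how such a helper could look; `run_and_capture` is a hypothetical name, and the child command is just the current Ruby interpreter printing two lines.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical variant of the spec helper: runs a command and returns its
# standard output with each line's trailing newline removed.
def run_and_capture(*command)
  # capture2 waits for the process to exit and closes the pipes for us.
  stdout, _status = Open3.capture2(*command)
  stdout.lines.map { |line| line.chomp }.join("\n")
end

# Passing the command as separate arguments bypasses the shell, so
# paths containing spaces need no quoting.
output = run_and_capture(RbConfig.ruby, "-e", "puts 'first'; puts 'second'")
```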
@@ -1,22 +1,22 @@
  module IoSpecHelper
- def webmock(uri, status, content)
- stub_request(:get, uri).
- with(:headers => {'Accept'=>'*/*', 'User-Agent'=>'Ruby'}).
- to_return(:status => status, :body => content)
- end
+ def webmock(uri, status, content)
+ stub_request(:get, uri).
+ with(:headers => {'Accept'=>'*/*'}).\
+ to_return(:status => status, :body => content)
+ end
 
- def filesystemmock(uri, content)
- FileUtils.mkdir_p(File.dirname(File.join(fs_test_path, uri)))
- File.open(File.join(fs_test_path, uri), "w") do |f|
- f.write(content)
- end
- end
+ def filesystemmock(uri, content)
+ FileUtils.mkdir_p(File.dirname(File.join(fs_test_path, uri)))
+ File.open(File.join(fs_test_path, uri), "w") do |f|
+ f.write(content)
+ end
+ end
 
- def clean_fs_test_path
- FileUtils.rm_rf(fs_test_path)
- end
+ def clean_fs_test_path
+ FileUtils.rm_rf(fs_test_path)
+ end
 
- def fs_test_path
- "test_data_public"
- end
+ def fs_test_path
+ "test_data_public"
+ end
  end
data/spec/spec_helper.rb CHANGED
@@ -1,10 +1,16 @@
  require 'rspec'
- require 'debugger'
  require 'webmock/rspec'
 
+ begin
+ require "debugger"
+ rescue LoadError
+ # most probably using 1.8
+ require "ruby-debug"
+ end
+
  require File.expand_path('../../lib/site_checker', __FILE__)
 
  # common
  def create_link(url)
- SiteChecker::Link.create({:url => url})
+ SiteChecker::Link.create({:url => url})
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: site_checker
  version: !ruby/object:Gem::Version
- version: 0.2.1
+ version: 0.3.0
  prerelease:
  platform: ruby
  authors:
@@ -93,27 +93,34 @@ dependencies:
  version: 1.5.6
  description: A simple tool for checking references on your website
  email: me@zsoltfabok.com
- executables: []
+ executables:
+ - site_checker
  extensions: []
  extra_rdoc_files: []
  files:
  - .rbenv-version
  - .rspec
+ - .travis.yml
  - History.md
  - LICENSE
  - README.md
+ - bin/site_checker
  - gem_tasks/rspec.rake
  - gem_tasks/yard.rake
  - lib/site_checker.rb
+ - lib/site_checker/cli/cli.rb
  - lib/site_checker/dsl.rb
  - lib/site_checker/io/content_from_file_system.rb
  - lib/site_checker/io/content_from_web.rb
  - lib/site_checker/link.rb
  - lib/site_checker/link_collector.rb
  - lib/site_checker/parse/page.rb
+ - lib/site_checker/version.rb
  - site_checker.gemspec
  - spec/dsl_spec.rb
  - spec/integration_spec.rb
+ - spec/site_checker/cli/cli_spec.rb
+ - spec/site_checker/cli/cli_spec_helper.rb
  - spec/site_checker/io/content_from_file_system_spec.rb
  - spec/site_checker/io/content_from_web_spec.rb
  - spec/site_checker/io/io_spec_helper.rb
@@ -137,7 +144,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
  version: '0'
  segments:
  - 0
- hash: 321001871174332709
+ hash: -2225611601008227703
  required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
@@ -146,16 +153,18 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  segments:
  - 0
- hash: 321001871174332709
+ hash: -2225611601008227703
  requirements: []
  rubyforge_project:
  rubygems_version: 1.8.23
  signing_key:
  specification_version: 3
- summary: site_checker-0.2.1
+ summary: site_checker-0.3.0
  test_files:
  - spec/dsl_spec.rb
  - spec/integration_spec.rb
+ - spec/site_checker/cli/cli_spec.rb
+ - spec/site_checker/cli/cli_spec_helper.rb
  - spec/site_checker/io/content_from_file_system_spec.rb
  - spec/site_checker/io/content_from_web_spec.rb
  - spec/site_checker/io/io_spec_helper.rb