sitemap_generator 0.2.0 → 0.2.1

data/README.md CHANGED
@@ -42,36 +42,39 @@ Other "difficult" Sitemap issues, solved by this plugin:
42
42
  Installation
43
43
  =======
44
44
 
45
- *As a gem*
45
+ **As a gem**
46
46
 
47
47
  1. Add the gem as a dependency in your config/environment.rb
48
- <code>config.gem 'sitemap_generator', :source => 'http://gemcutter.org'</code>
49
48
 
50
- 2. rake gems:install
49
+ <code>config.gem 'sitemap_generator', :lib => false, :source => 'http://gemcutter.org'</code>
51
50
 
52
- 3. Add the following line to your RAILS_ROOT/Rakefile
53
- <code>begin require 'sitemap_generator/tasks' rescue LoadError end</code>
51
+ 2. `$ rake gems:install`
54
52
 
55
- 4. `rake sitemap:install`
53
+ 3. Add the following line to your RAILS_ROOT/Rakefile
56
54
 
57
- *As a plugin*
55
+ <code>require 'sitemap_generator/tasks' rescue LoadError</code>
56
+
57
+ 4. `$ rake sitemap:install`
58
+
59
+ **As a plugin**
58
60
 
59
61
  1. Install plugin as normal
60
62
 
61
- <code>./script/plugin install git://github.com/adamsalter/sitemap_generator-plugin.git</code>
63
+ <code>$ ./script/plugin install git://github.com/adamsalter/sitemap_generator.git</code>
62
64
 
65
+ ----
63
66
 
64
67
  Installation should create a 'config/sitemap.rb' file which will contain your logic for generation of the Sitemap files. (If you want to recreate this file manually run `rake sitemap:install`)
65
68
 
66
69
  You can run `rake sitemap:refresh` as needed to create Sitemap files. This will also ping all the ['major'][sitemap_engines] search engines. (if you want to disable all non-essential output run the rake task thusly `rake -s sitemap:refresh`)
67
70
 
68
- Sitemaps with many urls (100,000+) take quite a long time to generate, so if you need to refresh your Sitemaps regularly you can set the rake task up as a cron job. Most cron agents will only send you an email if there is output from the cron task.
71
+ Sitemaps with many urls (100,000+) take quite a long time to generate, so if you need to refresh your Sitemaps regularly you can set the rake task up as a cron job. Most cron agents will only send you an email if there is output from the cron task.
69
72
 
70
73
  Optionally, you can add the following to your robots.txt file, so that robots can find the sitemap file.
71
74
 
72
- <code>Sitemap: &lt;hostname>/sitemap_index.xml.gz</code>
75
+ Sitemap: <hostname>/sitemap_index.xml.gz
73
76
 
74
- The robots.txt Sitemap URL should be the complete URL to the Sitemap Index, such as: `http://www.example.org/sitemap_index.xml.gz`
77
+ The robots.txt Sitemap URL should be the complete URL to the Sitemap Index, such as: `http://www.example.org/sitemap_index.xml.gz`
75
78
 
76
79
 
77
80
  Example 'config/sitemap.rb'
@@ -139,6 +142,15 @@ Known Bugs
139
142
  - There's no check on the size of a URL which [isn't supposed to exceed 2,048 bytes][sitemaps_xml].
140
143
  - Currently only supports one Sitemap Index file, which can contain 50,000 Sitemap files which can each contain 50,000 urls, so it _only_ supports up to 2,500,000,000 (2.5 billion) urls. I personally have no need of support for more urls, but plugin could be improved to support this.
141
144
 
145
+ Thanks (in no particular order)
146
+ ========
147
+
148
+ - [Karl Varga (aka Bear Grylls)](http://github.com/kjvarga)
149
+ - [Dan Pickett](http://github.com/dpickett)
150
+ - [Rob Biedenharn](http://github.com/rab)
151
+ - [Richie Vos](http://github.com/jerryvos)
152
+
153
+
142
154
  Follow me on:
143
155
  ---------
144
156
 
@@ -151,6 +163,6 @@ Copyright (c) 2009 Adam @ [Codebright.net][cb], released under the MIT license
151
163
  [sitemap_engines]:http://en.wikipedia.org/wiki/Sitemap_index "http://en.wikipedia.org/wiki/Sitemap_index"
152
164
  [sitemaps_org]:http://www.sitemaps.org/protocol.php "http://www.sitemaps.org/protocol.php"
153
165
  [sitemaps_xml]:http://www.sitemaps.org/protocol.php#xmlTagDefinitions "XML Tag Definitions"
154
- [sitemap_generator_usage]:http://wiki.github.com/adamsalter/sitemap_generator-plugin/sitemapgenerator-usage "http://wiki.github.com/adamsalter/sitemap_generator-plugin/sitemapgenerator-usage"
166
+ [sitemap_generator_usage]:http://wiki.github.com/adamsalter/sitemap_generator/sitemapgenerator-usage "http://wiki.github.com/adamsalter/sitemap_generator/sitemapgenerator-usage"
155
167
  [boost_juice]:http://www.boostjuice.com.au/ "Mmmm, sweet, sweet Boost Juice."
156
168
  [cb]:http://codebright.net "http://codebright.net"
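A note on step 3 of the gem instructions in this README diff: in Ruby the `rescue` modifier only handles `StandardError`, while `LoadError` descends from `ScriptError`, so `require '...' rescue LoadError` would not be expected to swallow a missing-gem error. A minimal sketch of the block form follows. The helper name `require_optional` is hypothetical and not part of sitemap_generator.

```ruby
# Hypothetical helper (not part of sitemap_generator): require a library if it
# is available and report whether it could be loaded.
def require_optional(path)
  require path
  true
rescue LoadError
  # LoadError is a ScriptError, so it has to be named explicitly here.
  # A bare `rescue` or the `expr rescue value` modifier would not catch it.
  false
end

# Rakefile usage (assumed): require_optional 'sitemap_generator/tasks'
puts require_optional('no_such_library_for_sitemap_example')
```

Written this way, `rake` should stay usable in an application where the gem has not been installed yet, which appears to be the intent of the README line.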
data/Rakefile CHANGED
@@ -5,14 +5,15 @@ begin
5
5
  require 'jeweler'
6
6
  Jeweler::Tasks.new do |s|
7
7
  s.name = "sitemap_generator"
8
- s.summary = %Q{This plugin enables 'enterprise-class' Google Sitemaps to be easily generated for a Rails site as a rake task}
9
- s.description = %Q{This plugin enables 'enterprise-class' Google Sitemaps to be easily generated for a Rails site as a rake task}
8
+ s.summary = %Q{Generate 'enterprise-class' Sitemaps for your Rails site using a simple 'Rails Routes'-like DSL and a single Rake task}
9
+ s.description = %Q{Install as a plugin or Gem to easily generate ['enterprise-class'][enterprise_class] Google Sitemaps for your Rails site, using a simple 'Rails Routes'-like DSL and a single rake task.}
10
10
  s.email = "adam.salter@codebright.net "
11
- s.homepage = "http://github.com/adamsalter/sitemap_generator-plugin"
11
+ s.homepage = "http://github.com/adamsalter/sitemap_generator"
12
12
  s.authors = ["Adam Salter"]
13
- s.files = FileList["[A-Z]*", "{bin,lib,rails,templates}/**/*"]
13
+ s.files = FileList["[A-Z]*", "{bin,lib,rails,templates,tasks}/**/*"]
14
14
  # s is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
15
15
  end
16
+ Jeweler::GemcutterTasks.new
16
17
  rescue LoadError
17
18
  puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
18
19
  end
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.2.0
1
+ 0.2.1
@@ -4,6 +4,16 @@ require 'sitemap_generator/link_set'
4
4
  require 'sitemap_generator/helper'
5
5
 
6
6
  module SitemapGenerator
7
+ class <<self
8
+ attr_accessor :root, :templates
9
+ end
10
+ self.root = File.expand_path(File.join(File.dirname(__FILE__), '../'))
11
+ self.templates = {
12
+ :sitemap_index => File.join(self.root, 'templates/sitemap_index.builder'),
13
+ :sitemap_xml => File.join(self.root, 'templates/xml_sitemap.builder'),
14
+ :sitemap_sample => File.join(self.root, 'templates/sitemap.rb'),
15
+ }
16
+
7
17
  Sitemap = LinkSet.new
8
18
  end
9
19
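The hunk above adds module-level `root` and `templates` accessors. As a rough, self-contained sketch of that pattern, the following uses a hypothetical module name, `TemplateRegistry`, standing in for `SitemapGenerator`; the paths only mirror the ones in the diff and are not checked for existence.

```ruby
# Hypothetical stand-in for the SitemapGenerator module shown in the diff.
module TemplateRegistry
  class << self
    # Defines TemplateRegistry.root / .root= and .templates / .templates=
    attr_accessor :root, :templates
  end

  # expand_path resolves '..' so the stored root is an absolute path.
  self.root = File.expand_path(File.join(File.dirname(__FILE__), '..'))

  # Template paths are built once, relative to the root.
  self.templates = {
    :sitemap_index => File.join(self.root, 'templates/sitemap_index.builder'),
    :sitemap_xml   => File.join(self.root, 'templates/xml_sitemap.builder')
  }
end

puts TemplateRegistry.templates[:sitemap_index]
```

Keeping the paths in one place means rake tasks and tests can look templates up by key instead of rebuilding relative paths from their own location.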
 
@@ -43,8 +43,8 @@ module SitemapGenerator
43
43
  puts "Successful ping of #{engine.to_s.titleize}" if verbose
44
44
  end
45
45
  rescue Timeout::Error, StandardError => e
46
- puts "Ping failed for #{engine.to_s.titleize}: #{e.inspect}"
47
- puts <<-END if engine == :yahoo
46
+ puts "Ping failed for #{engine.to_s.titleize}: #{e.inspect}" if verbose
47
+ puts <<-END if engine == :yahoo && verbose
48
48
  Yahoo requires an 'AppID' for more than one ping per "timeframe", you can either:
49
49
  - remove yahoo from the ping list (config/sitemap.rb):
50
50
  SitemapGenerator::Sitemap.yahoo_app_id = false
@@ -1,4 +1,3 @@
1
-
2
1
  module SitemapGenerator
3
2
  # Generator instances are used to build links.
4
3
  # The object passed to the add_links block in config/sitemap.rb is a Generator instance.
@@ -1 +1 @@
1
- load "#{File.dirname(__FILE__)}/../../tasks/sitemap_generator_tasks.rake"
1
+ load File.expand_path(File.join(File.dirname(__FILE__), '../../tasks/sitemap_generator_tasks.rake'))
data/rails/install.rb CHANGED
@@ -2,8 +2,7 @@
2
2
 
3
3
  # Copy sitemap_template.rb to config/sitemap.rb
4
4
  require 'fileutils'
5
- current_dir = File.dirname(__FILE__)
6
- sitemap_template = File.join(current_dir, 'templates/sitemap.rb')
5
+ sitemap_template = File.join(File.dirname(__FILE__), '../templates/sitemap.rb')
7
6
  new_sitemap = File.join(RAILS_ROOT, 'config/sitemap.rb')
8
7
  if File.exist?(new_sitemap)
9
8
  puts "already exists: config/sitemap.rb, file not copied"
@@ -0,0 +1,71 @@
1
+ require 'zlib'
2
+
3
+ namespace :sitemap do
4
+
5
+ desc "Install a default config/sitemap.rb file"
6
+ task :install do
7
+ load File.expand_path(File.join(File.dirname(__FILE__), "../rails/install.rb"))
8
+ end
9
+
10
+ desc "Delete all Sitemap files in public/ directory"
11
+ task :clean do
12
+ sitemap_files = Dir[File.join(RAILS_ROOT, 'public/sitemap*.xml.gz')]
13
+ FileUtils.rm sitemap_files
14
+ end
15
+
16
+ desc "Create Sitemap XML files in public/ directory"
17
+ desc "Create Sitemap XML files in public/ directory (rake -s for no output)"
18
+ task :refresh => ['sitemap:create'] do
19
+ ping_search_engines("sitemap_index.xml.gz")
20
+ end
21
+
22
+ desc "Create Sitemap XML files (don't ping search engines)"
23
+ task 'refresh:no_ping' => ['sitemap:create'] do
24
+ end
25
+
26
+ task :create => [:environment] do
27
+ include SitemapGenerator::Helper
28
+ include ActionView::Helpers::NumberHelper
29
+
30
+ start_time = Time.now
31
+
32
+ # update links from config/sitemap.rb
33
+ load_sitemap_rb
34
+
35
+ raise(ArgumentError, "Default hostname not defined") if SitemapGenerator::Sitemap.default_host.blank?
36
+
37
+ links_grps = SitemapGenerator::Sitemap.links.in_groups_of(50000, false)
38
+ raise(ArgumentError, "TOO MANY LINKS!! I really thought 2,500,000,000 links would be enough for anybody!") if links_grps.length > 50000
39
+
40
+ Rake::Task['sitemap:clean'].invoke
41
+
42
+ # render individual sitemaps
43
+ sitemap_files = []
44
+ links_grps.each_with_index do |links, index|
45
+ buffer = ''
46
+ xml = Builder::XmlMarkup.new(:target=>buffer)
47
+ eval(open(SitemapGenerator.templates[:sitemap_xml]).read, binding)
48
+ filename = File.join(RAILS_ROOT, "public/sitemap#{index+1}.xml.gz")
49
+ Zlib::GzipWriter.open(filename) do |gz|
50
+ gz.write buffer
51
+ end
52
+ puts "+ #{filename}" if verbose
53
+ puts "** Sitemap too big! The uncompressed size exceeds 10Mb" if (buffer.size > 10 * 1024 * 1024) && verbose
54
+ sitemap_files << filename
55
+ end
56
+
57
+ # render index
58
+ buffer = ''
59
+ xml = Builder::XmlMarkup.new(:target=>buffer)
60
+ eval(open(SitemapGenerator.templates[:sitemap_index]).read, binding)
61
+ filename = File.join(RAILS_ROOT, "public/sitemap_index.xml.gz")
62
+ Zlib::GzipWriter.open(filename) do |gz|
63
+ gz.write buffer
64
+ end
65
+ puts "+ #{filename}" if verbose
66
+ puts "** Sitemap Index too big! The uncompressed size exceeds 10Mb" if (buffer.size > 10 * 1024 * 1024) && verbose
67
+
68
+ stop_time = Time.now
69
+ puts "Sitemap stats: #{number_with_delimiter(SitemapGenerator::Sitemap.links.length)} links, " + ("%dm%02ds" % (stop_time - start_time).divmod(60)) if verbose
70
+ end
71
+ end
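The `sitemap:create` task above splits links into groups of 50,000 and gzips one file per group. The sketch below illustrates only that chunk-and-compress step with the standard library. It swaps ActiveSupport's `in_groups_of` for `each_slice`, writes newline-joined text as a placeholder for the Builder template output, and the helper name `write_sitemap_chunks` is hypothetical.

```ruby
require 'zlib'
require 'tmpdir'

MAX_LINKS_PER_FILE = 50_000 # links per sitemap file, as in the task above
MAX_SITEMAP_FILES  = 50_000 # sitemap files a single index may list
# 50_000 * 50_000 = 2_500_000_000 links at most for one index

# Hypothetical helper: write each group of links to sitemapN.xml.gz in dir
# and return the list of filenames written.
def write_sitemap_chunks(links, dir, per_file = MAX_LINKS_PER_FILE)
  groups = links.each_slice(per_file).to_a
  raise ArgumentError, 'too many links for one index' if groups.length > MAX_SITEMAP_FILES

  groups.each_with_index.map do |group, index|
    filename = File.join(dir, "sitemap#{index + 1}.xml.gz")
    # Placeholder body; the real task renders an XML template here.
    Zlib::GzipWriter.open(filename) { |gz| gz.write(group.join("\n")) }
    filename
  end
end

Dir.mktmpdir do |dir|
  files = write_sitemap_chunks(%w[/a /b /c /d /e], dir, 2)
  puts files.map { |f| File.basename(f) }.inspect
end
```

With a group size of 2, five links end up in three files, numbered from 1 in the same way as `sitemap#{index+1}.xml.gz` in the task.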
@@ -1,19 +1,83 @@
1
1
  require File.dirname(__FILE__) + '/test_helper'
2
2
 
3
3
  class SitemapGeneratorTest < Test::Unit::TestCase
4
- context "SitemapGenerator Rake Task" do
5
- setup do
6
- ::Rake::Task['sitemap:refresh'].invoke
4
+ context "SitemapGenerator Rake Tasks" do
5
+
6
+ context "when running the clean task" do
7
+ setup do
8
+ copy_sitemap_file_to_rails_app
9
+ FileUtils.touch(File.join(RAILS_ROOT, '/public/sitemap_index.xml.gz'))
10
+ Rake::Task['sitemap:clean'].invoke
11
+ end
12
+
13
+ should "the sitemap xml files be deleted" do
14
+ assert !File.exists?(File.join(RAILS_ROOT, '/public/sitemap_index.xml.gz'))
15
+ end
16
+ end
17
+
18
+ # For some reason I just couldn't get this to work! It seemed to delete the
19
+ # file before calling the second *should* assertion.
20
+ context "when installed to a clean Rails app" do
21
+ setup do
22
+ #delete_sitemap_file_from_rails_app
23
+ #Rake::Task['sitemap:install'].invoke
24
+ end
25
+
26
+ should "a sitemap.rb is created" do
27
+ #assert File.exists?(File.join(RAILS_ROOT, 'config/sitemap.rb'))
28
+ end
29
+
30
+ should "the sitemap.rb file matches the template" do
31
+ #assert identical_files?(File.join(RAILS_ROOT, 'config/sitemap.rb'), SitemapGenerator.templates[:sitemap_sample])
32
+ end
7
33
  end
8
34
 
9
- should "fail if hostname not defined" do
35
+ context "when installed multiple times" do
36
+ setup do
37
+ copy_sitemap_file_to_rails_app
38
+ Rake::Task['sitemap:install'].invoke
39
+ end
40
+
41
+ should "not overwrite existing sitemap.rb file" do
42
+ assert identical_files?(File.join(File.dirname(__FILE__), '/sitemap.file'), File.join(RAILS_ROOT, '/config/sitemap.rb'))
43
+ end
44
+ end
45
+
46
+ context "when sitemap generated" do
47
+ setup do
48
+ copy_sitemap_file_to_rails_app
49
+ Rake::Task['sitemap:refresh'].invoke
50
+ end
51
+
52
+ should "not create sitemap xml files" do
53
+ assert File.exists?(File.join(RAILS_ROOT, '/public/sitemap_index.xml.gz'))
54
+ assert File.exists?(File.join(RAILS_ROOT, '/public/sitemap1.xml.gz'))
55
+ end
10
56
  end
11
57
  end
12
-
58
+
13
59
  context "SitemapGenerator library" do
60
+ setup do
61
+ copy_sitemap_file_to_rails_app
62
+ end
63
+
14
64
  should "be have x elements" do
15
- assert_equal SitemapGenerator::Sitemap.links.size, 14
65
+ assert_equal 14, SitemapGenerator::Sitemap.links.size
16
66
  end
17
67
  end
68
+
69
+ def copy_sitemap_file_to_rails_app
70
+ FileUtils.cp(File.join(File.dirname(__FILE__), '/sitemap.file'), File.join(RAILS_ROOT, '/config/sitemap.rb'))
71
+ end
72
+
73
+ def delete_sitemap_file_from_rails_app
74
+ FileUtils.remove(File.join(RAILS_ROOT, '/config/sitemap.rb')) rescue nil
75
+ end
76
+
77
+ def identical_files?(first, second)
78
+ first = open(first, 'r').read
79
+ second = open(second, 'r').read
80
+ first == second
81
+ end
18
82
  end
19
83
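The `identical_files?` helper in the test above reads both files through `open(...).read`, which leaves the handles open until garbage collection. A possible alternative, sketched under the assumption that only content equality matters, is `File.read` or the standard library's `FileUtils.compare_file`. The `write_temp` helper and the sample contents are made up for illustration.

```ruby
require 'fileutils'
require 'tempfile'

# Variant of the test helper: File.read closes each file after reading it.
def identical_files?(first, second)
  File.read(first) == File.read(second)
end

# Hypothetical helper for this example only: write content to a temp file.
def write_temp(content)
  file = Tempfile.new('sitemap_example')
  file.write(content)
  file.close # flush to disk; the file stays until it is unlinked
  file
end

original = write_temp("add '/home'\n")
copy     = write_temp("add '/home'\n")
changed  = write_temp("add '/about'\n")

puts identical_files?(original.path, copy.path)
# FileUtils.compare_file performs the same content comparison.
puts FileUtils.compare_file(original.path, changed.path)
```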
 
data/test/test_helper.rb CHANGED
@@ -3,6 +3,7 @@ ENV['RAILS_ROOT'] ||= File.join(File.dirname(__FILE__), 'mock_app')
3
3
 
4
4
  require File.expand_path(File.join(ENV['RAILS_ROOT'], 'config', 'environment.rb'))
5
5
 
6
+ require 'fileutils'
6
7
  require 'rake'
7
8
  require 'shoulda'
8
9
 
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: sitemap_generator
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.0
4
+ version: 0.2.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Adam Salter
@@ -9,11 +9,11 @@ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
11
 
12
- date: 2009-11-06 00:00:00 +11:00
12
+ date: 2009-11-10 00:00:00 +08:00
13
13
  default_executable:
14
14
  dependencies: []
15
15
 
16
- description: This plugin enables 'enterprise-class' Google Sitemaps to be easily generated for a Rails site as a rake task
16
+ description: Install as a plugin or Gem to easily generate ['enterprise-class'][enterprise_class] Google Sitemaps for your Rails site, using a simple 'Rails Routes'-like DSL and a single rake task.
17
17
  email: "adam.salter@codebright.net "
18
18
  executables: []
19
19
 
@@ -34,11 +34,12 @@ files:
34
34
  - lib/sitemap_generator/tasks.rb
35
35
  - rails/install.rb
36
36
  - rails/uninstall.rb
37
+ - tasks/sitemap_generator_tasks.rake
37
38
  - templates/sitemap.rb
38
39
  - templates/sitemap_index.builder
39
40
  - templates/xml_sitemap.builder
40
41
  has_rdoc: true
41
- homepage: http://github.com/adamsalter/sitemap_generator-plugin
42
+ homepage: http://github.com/adamsalter/sitemap_generator
42
43
  licenses: []
43
44
 
44
45
  post_install_message:
@@ -64,7 +65,7 @@ rubyforge_project:
64
65
  rubygems_version: 1.3.5
65
66
  signing_key:
66
67
  specification_version: 3
67
- summary: This plugin enables 'enterprise-class' Google Sitemaps to be easily generated for a Rails site as a rake task
68
+ summary: Generate 'enterprise-class' Sitemaps for your Rails site using a simple 'Rails Routes'-like DSL and a single Rake task
68
69
  test_files:
69
70
  - test/mock_app/app/controllers/application_controller.rb
70
71
  - test/mock_app/app/controllers/contents_controller.rb