sitemap_generator 0.2.0 → 0.2.1
- data/README.md +24 -12
- data/Rakefile +5 -4
- data/VERSION +1 -1
- data/lib/sitemap_generator.rb +10 -0
- data/lib/sitemap_generator/helper.rb +2 -2
- data/lib/sitemap_generator/mapper.rb +0 -1
- data/lib/sitemap_generator/tasks.rb +1 -1
- data/rails/install.rb +1 -2
- data/tasks/sitemap_generator_tasks.rake +71 -0
- data/test/sitemap_generator_test.rb +70 -6
- data/test/test_helper.rb +1 -0
- metadata +6 -5
data/README.md
CHANGED
@@ -42,36 +42,39 @@ Other "difficult" Sitemap issues, solved by this plugin:
 Installation
 =======
 
-
+**As a gem**
 
 1. Add the gem as a dependency in your config/environment.rb
-<code>config.gem 'sitemap_generator', :source => 'http://gemcutter.org'</code>
 
-
+<code>config.gem 'sitemap_generator', :lib => false, :source => 'http://gemcutter.org'</code>
 
-
-<code>begin require 'sitemap_generator/tasks' rescue LoadError end</code>
+2. `$ rake gems:install`
 
-
+3. Add the following line to your RAILS_ROOT/Rakefile
 
-
+<code>require 'sitemap_generator/tasks' rescue LoadError</code>
+
+4. `$ rake sitemap:install`
+
+**As a plugin**
 
 1. Install plugin as normal
 
-<code
+<code>$ ./script/plugin install git://github.com/adamsalter/sitemap_generator.git</code>
 
+----
 
 Installation should create a 'config/sitemap.rb' file which will contain your logic for generation of the Sitemap files. (If you want to recreate this file manually run `rake sitemap:install`)
 
 You can run `rake sitemap:refresh` as needed to create Sitemap files. This will also ping all the ['major'][sitemap_engines] search engines. (if you want to disable all non-essential output run the rake task thusly `rake -s sitemap:refresh`)
 
-
+Sitemaps with many urls (100,000+) take quite a long time to generate, so if you need to refresh your Sitemaps regularly you can set the rake task up as a cron job. Most cron agents will only send you an email if there is output from the cron task.
 
 Optionally, you can add the following to your robots.txt file, so that robots can find the sitemap file.
 
-
+Sitemap: <hostname>/sitemap_index.xml.gz
 
-
+The robots.txt Sitemap URL should be the complete URL to the Sitemap Index, such as: `http://www.example.org/sitemap_index.xml.gz`
 
 
 Example 'config/sitemap.rb'
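As a rough illustration of the robots.txt note in the hunk above, the following Ruby sketch builds the `Sitemap:` line from a site URL. The helper name `robots_sitemap_line` is hypothetical and not part of the plugin; the only assumption taken from the README is that the index is served as `/sitemap_index.xml.gz` at the site root.

```ruby
require 'uri'

# Hypothetical helper (not part of sitemap_generator): builds the robots.txt
# "Sitemap:" line for a site, assuming the index lives at the site root.
def robots_sitemap_line(host)
  uri = URI.parse(host)
  # The README asks for a complete URL, so reject bare hostnames.
  raise ArgumentError, "host must be a full http(s) URL" unless uri.is_a?(URI::HTTP)
  "Sitemap: #{host.chomp('/')}/sitemap_index.xml.gz"
end

puts robots_sitemap_line('http://www.example.org')
```

A bare hostname such as `www.example.org` raises `ArgumentError` in this sketch, which mirrors the README's requirement that the URL be complete.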
@@ -139,6 +142,15 @@ Known Bugs
 - There's no check on the size of a URL which [isn't supposed to exceed 2,048 bytes][sitemaps_xml].
 - Currently only supports one Sitemap Index file, which can contain 50,000 Sitemap files which can each contain 50,000 urls, so it _only_ supports up to 2,500,000,000 (2.5 billion) urls. I personally have no need of support for more urls, but plugin could be improved to support this.
 
+Thanks (in no particular order)
+========
+
+- [Karl Varga (aka Bear Grylls)](http://github.com/kjvarga)
+- [Dan Pickett](http://github.com/dpickett)
+- [Rob Biedenharn](http://github.com/rab)
+- [Richie Vos](http://github.com/jerryvos)
+
+
 Follow me on:
 ---------
 
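The two limits named in the Known Bugs lines above can be checked with a few lines of Ruby. This is only a sketch: the constant names and `url_within_limit?` are hypothetical, and the plugin itself performs no URL size check, as the README says.

```ruby
# Limits quoted in the Known Bugs list of the README.
MAX_URL_BYTES          = 2_048
MAX_URLS_PER_SITEMAP   = 50_000
MAX_SITEMAPS_PER_INDEX = 50_000

# Hypothetical check (the plugin does not do this): is the URL short enough?
def url_within_limit?(url)
  url.bytesize <= MAX_URL_BYTES
end

# Total capacity of a single Sitemap Index file.
max_urls = MAX_URLS_PER_SITEMAP * MAX_SITEMAPS_PER_INDEX
puts max_urls                                                     # 2500000000
puts url_within_limit?('http://www.example.org/' + 'a' * 3_000)   # false
```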
@@ -151,6 +163,6 @@ Copyright (c) 2009 Adam @ [Codebright.net][cb], released under the MIT license
 [sitemap_engines]:http://en.wikipedia.org/wiki/Sitemap_index "http://en.wikipedia.org/wiki/Sitemap_index"
 [sitemaps_org]:http://www.sitemaps.org/protocol.php "http://www.sitemaps.org/protocol.php"
 [sitemaps_xml]:http://www.sitemaps.org/protocol.php#xmlTagDefinitions "XML Tag Definitions"
-[sitemap_generator_usage]:http://wiki.github.com/adamsalter/sitemap_generator
+[sitemap_generator_usage]:http://wiki.github.com/adamsalter/sitemap_generator/sitemapgenerator-usage "http://wiki.github.com/adamsalter/sitemap_generator/sitemapgenerator-usage"
 [boost_juice]:http://www.boostjuice.com.au/ "Mmmm, sweet, sweet Boost Juice."
 [cb]:http://codebright.net "http://codebright.net"
data/Rakefile
CHANGED
@@ -5,14 +5,15 @@ begin
   require 'jeweler'
   Jeweler::Tasks.new do |s|
     s.name = "sitemap_generator"
-    s.summary = %Q{
-    s.description = %Q{
+    s.summary = %Q{Generate 'enterprise-class' Sitemaps for your Rails site using a simple 'Rails Routes'-like DSL and a single Rake task}
+    s.description = %Q{Install as a plugin or Gem to easily generate ['enterprise-class'][enterprise_class] Google Sitemaps for your Rails site, using a simple 'Rails Routes'-like DSL and a single rake task.}
     s.email = "adam.salter@codebright.net "
-    s.homepage = "http://github.com/adamsalter/sitemap_generator
+    s.homepage = "http://github.com/adamsalter/sitemap_generator"
     s.authors = ["Adam Salter"]
-    s.files = FileList["[A-Z]*", "{bin,lib,rails,templates}/**/*"]
+    s.files = FileList["[A-Z]*", "{bin,lib,rails,templates,tasks}/**/*"]
     # s is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
   end
+  Jeweler::GemcutterTasks.new
 rescue LoadError
   puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
 end
data/VERSION
CHANGED
@@ -1 +1 @@
-0.2.
+0.2.1
data/lib/sitemap_generator.rb
CHANGED
@@ -4,6 +4,16 @@ require 'sitemap_generator/link_set'
 require 'sitemap_generator/helper'
 
 module SitemapGenerator
+  class <<self
+    attr_accessor :root, :templates
+  end
+  self.root = File.expand_path(File.join(File.dirname(__FILE__), '../'))
+  self.templates = {
+    :sitemap_index => File.join(self.root, 'templates/sitemap_index.builder'),
+    :sitemap_xml => File.join(self.root, 'templates/xml_sitemap.builder'),
+    :sitemap_sample => File.join(self.root, 'templates/sitemap.rb'),
+  }
+
   Sitemap = LinkSet.new
 end
 
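The hunk above stores the gem's root directory and template paths in module-level accessors, resolved relative to the source file rather than the current working directory. A minimal sketch of the same pattern follows; the module name `Demo` is a stand-in chosen for illustration and is not part of the gem.

```ruby
# Stand-in module (hypothetical) mirroring the pattern added in this diff:
# module-level accessors holding paths resolved relative to the source file.
module Demo
  class << self
    attr_accessor :root, :templates
  end

  # expand_path turns "<this file's dir>/../" into an absolute directory.
  self.root = File.expand_path(File.join(File.dirname(__FILE__), '../'))
  self.templates = {
    :sitemap_index => File.join(root, 'templates/sitemap_index.builder')
  }
end

puts Demo.root
puts Demo.templates[:sitemap_index]
```

Because the paths are computed once at load time, callers such as a rake task can look up `Demo.templates[:sitemap_index]` without knowing where the code is installed.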
data/lib/sitemap_generator/helper.rb
CHANGED
@@ -43,8 +43,8 @@ module SitemapGenerator
     puts "Successful ping of #{engine.to_s.titleize}" if verbose
   end
 rescue Timeout::Error, StandardError => e
-  puts "Ping failed for #{engine.to_s.titleize}: #{e.inspect}"
-  puts <<-END if engine == :yahoo
+  puts "Ping failed for #{engine.to_s.titleize}: #{e.inspect}" if verbose
+  puts <<-END if engine == :yahoo && verbose
     Yahoo requires an 'AppID' for more than one ping per "timeframe", you can either:
     - remove yahoo from the ping list (config/sitemap.rb):
       SitemapGenerator::Sitemap.yahoo_app_id = false
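The change above makes ping failures silent unless `verbose` is set, which fits the README's advice about running `rake -s` from cron. A minimal sketch of that guard follows, assuming a hypothetical `report_ping_failure` method that receives the flag explicitly; inside a rake task, `verbose` is supplied by Rake instead.

```ruby
require 'stringio'

# Hypothetical stand-in for the ping error reporting; `verbose` is passed in
# explicitly here, whereas inside a rake task it is provided by Rake itself.
def report_ping_failure(engine, error, verbose, out = $stdout)
  return unless verbose
  out.puts "Ping failed for #{engine}: #{error.inspect}"
end

quiet = StringIO.new
report_ping_failure(:yahoo, RuntimeError.new('timeout'), false, quiet)

loud = StringIO.new
report_ping_failure(:yahoo, RuntimeError.new('timeout'), true, loud)

puts quiet.string.empty?   # true: nothing is written when verbose is off
puts loud.string
```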
data/lib/sitemap_generator/tasks.rb
CHANGED
@@ -1 +1 @@
-load
+load File.expand_path(File.join(File.dirname(__FILE__), '../../tasks/sitemap_generator_tasks.rake'))
data/rails/install.rb
CHANGED
@@ -2,8 +2,7 @@
 
 # Copy sitemap_template.rb to config/sitemap.rb
 require 'fileutils'
-
-sitemap_template = File.join(current_dir, 'templates/sitemap.rb')
+sitemap_template = File.join(File.dirname(__FILE__), '../templates/sitemap.rb')
 new_sitemap = File.join(RAILS_ROOT, 'config/sitemap.rb')
 if File.exist?(new_sitemap)
   puts "already exists: config/sitemap.rb, file not copied"
data/tasks/sitemap_generator_tasks.rake
ADDED
@@ -0,0 +1,71 @@
+require 'zlib'
+
+namespace :sitemap do
+
+  desc "Install a default config/sitemap.rb file"
+  task :install do
+    load File.expand_path(File.join(File.dirname(__FILE__), "../rails/install.rb"))
+  end
+
+  desc "Delete all Sitemap files in public/ directory"
+  task :clean do
+    sitemap_files = Dir[File.join(RAILS_ROOT, 'public/sitemap*.xml.gz')]
+    FileUtils.rm sitemap_files
+  end
+
+  desc "Create Sitemap XML files in public/ directory"
+  desc "Create Sitemap XML files in public/ directory (rake -s for no output)"
+  task :refresh => ['sitemap:create'] do
+    ping_search_engines("sitemap_index.xml.gz")
+  end
+
+  desc "Create Sitemap XML files (don't ping search engines)"
+  task 'refresh:no_ping' => ['sitemap:create'] do
+  end
+
+  task :create => [:environment] do
+    include SitemapGenerator::Helper
+    include ActionView::Helpers::NumberHelper
+
+    start_time = Time.now
+
+    # update links from config/sitemap.rb
+    load_sitemap_rb
+
+    raise(ArgumentError, "Default hostname not defined") if SitemapGenerator::Sitemap.default_host.blank?
+
+    links_grps = SitemapGenerator::Sitemap.links.in_groups_of(50000, false)
+    raise(ArgumentError, "TOO MANY LINKS!! I really thought 2,500,000,000 links would be enough for anybody!") if links_grps.length > 50000
+
+    Rake::Task['sitemap:clean'].invoke
+
+    # render individual sitemaps
+    sitemap_files = []
+    links_grps.each_with_index do |links, index|
+      buffer = ''
+      xml = Builder::XmlMarkup.new(:target=>buffer)
+      eval(open(SitemapGenerator.templates[:sitemap_xml]).read, binding)
+      filename = File.join(RAILS_ROOT, "public/sitemap#{index+1}.xml.gz")
+      Zlib::GzipWriter.open(filename) do |gz|
+        gz.write buffer
+      end
+      puts "+ #{filename}" if verbose
+      puts "** Sitemap too big! The uncompressed size exceeds 10Mb" if (buffer.size > 10 * 1024 * 1024) && verbose
+      sitemap_files << filename
+    end
+
+    # render index
+    buffer = ''
+    xml = Builder::XmlMarkup.new(:target=>buffer)
+    eval(open(SitemapGenerator.templates[:sitemap_index]).read, binding)
+    filename = File.join(RAILS_ROOT, "public/sitemap_index.xml.gz")
+    Zlib::GzipWriter.open(filename) do |gz|
+      gz.write buffer
+    end
+    puts "+ #{filename}" if verbose
+    puts "** Sitemap Index too big! The uncompressed size exceeds 10Mb" if (buffer.size > 10 * 1024 * 1024) && verbose
+
+    stop_time = Time.now
+    puts "Sitemap stats: #{number_with_delimiter(SitemapGenerator::Sitemap.links.length)} links, " + ("%dm%02ds" % (stop_time - start_time).divmod(60)) if verbose
+  end
+end
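The `sitemap:create` task above does three things that can be tried outside Rails: it splits the links into groups of at most 50,000, writes each rendered buffer through `Zlib::GzipWriter`, and formats the elapsed time with `divmod(60)`. The sketch below reproduces those steps in plain Ruby. The method names `group_links`, `write_gzip` and `format_elapsed` are hypothetical, and `each_slice` is used as a stand-in for ActiveSupport's `in_groups_of(n, false)`.

```ruby
require 'zlib'
require 'tmpdir'

# Plain-Ruby stand-in for ActiveSupport's in_groups_of(n, false):
# split the links into chunks of at most `per_file` entries, without padding.
def group_links(links, per_file)
  links.each_slice(per_file).to_a
end

# Write one string to a gzip file, as the task does for each rendered sitemap.
def write_gzip(filename, data)
  Zlib::GzipWriter.open(filename) { |gz| gz.write(data) }
end

# Same elapsed-time formatting as the "Sitemap stats" line in the task.
def format_elapsed(seconds)
  "%dm%02ds" % seconds.divmod(60)
end

puts group_links((1..7).to_a, 3).inspect   # [[1, 2, 3], [4, 5, 6], [7]]

Dir.mktmpdir do |dir|
  path = File.join(dir, 'sitemap1.xml.gz')
  write_gzip(path, '<urlset/>')
  # Read the file back to confirm the round trip.
  puts Zlib::GzipReader.open(path) { |gz| gz.read }   # <urlset/>
end

puts format_elapsed(125)   # 2m05s
```

With 50,000 as the group size, the number of groups is also the number of sitemap files listed in the index, which is why the task raises once there are more than 50,000 groups.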
data/test/sitemap_generator_test.rb
CHANGED
@@ -1,19 +1,83 @@
 require File.dirname(__FILE__) + '/test_helper'
 
 class SitemapGeneratorTest < Test::Unit::TestCase
-  context "SitemapGenerator Rake
-
-
+  context "SitemapGenerator Rake Tasks" do
+
+    context "when running the clean task" do
+      setup do
+        copy_sitemap_file_to_rails_app
+        FileUtils.touch(File.join(RAILS_ROOT, '/public/sitemap_index.xml.gz'))
+        Rake::Task['sitemap:clean'].invoke
+      end
+
+      should "the sitemap xml files be deleted" do
+        assert !File.exists?(File.join(RAILS_ROOT, '/public/sitemap_index.xml.gz'))
+      end
+    end
+
+    # For some reason I just couldn't get this to work! It seemed to delete the
+    # file before calling the second *should* assertion.
+    context "when installed to a clean Rails app" do
+      setup do
+        #delete_sitemap_file_from_rails_app
+        #Rake::Task['sitemap:install'].invoke
+      end
+
+      should "a sitemap.rb is created" do
+        #assert File.exists?(File.join(RAILS_ROOT, 'config/sitemap.rb'))
+      end
+
+      should "the sitemap.rb file matches the template" do
+        #assert identical_files?(File.join(RAILS_ROOT, 'config/sitemap.rb'), SitemapGenerator.templates[:sitemap_sample])
+      end
     end
 
-
+    context "when installed multiple times" do
+      setup do
+        copy_sitemap_file_to_rails_app
+        Rake::Task['sitemap:install'].invoke
+      end
+
+      should "not overwrite existing sitemap.rb file" do
+        assert identical_files?(File.join(File.dirname(__FILE__), '/sitemap.file'), File.join(RAILS_ROOT, '/config/sitemap.rb'))
+      end
+    end
+
+    context "when sitemap generated" do
+      setup do
+        copy_sitemap_file_to_rails_app
+        Rake::Task['sitemap:refresh'].invoke
+      end
+
+      should "not create sitemap xml files" do
+        assert File.exists?(File.join(RAILS_ROOT, '/public/sitemap_index.xml.gz'))
+        assert File.exists?(File.join(RAILS_ROOT, '/public/sitemap1.xml.gz'))
+      end
     end
   end
-
+
   context "SitemapGenerator library" do
+    setup do
+      copy_sitemap_file_to_rails_app
+    end
+
     should "be have x elements" do
-      assert_equal SitemapGenerator::Sitemap.links.size
+      assert_equal 14, SitemapGenerator::Sitemap.links.size
     end
   end
+
+  def copy_sitemap_file_to_rails_app
+    FileUtils.cp(File.join(File.dirname(__FILE__), '/sitemap.file'), File.join(RAILS_ROOT, '/config/sitemap.rb'))
+  end
+
+  def delete_sitemap_file_from_rails_app
+    FileUtils.remove(File.join(RAILS_ROOT, '/config/sitemap.rb')) rescue nil
+  end
+
+  def identical_files?(first, second)
+    first = open(first, 'r').read
+    second = open(second, 'r').read
+    first == second
+  end
 end
 
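The `identical_files?` helper in the test above reads both files through `open(...).read`, which leaves the file handles open until they are garbage collected. One possible alternative, offered only as a sketch and not as the author's code, is to use the block form of `File.open` or the standard library's `FileUtils.compare_file`. The name `same_contents?` is hypothetical.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical variant of the test helper: the block form of File.open
# closes each handle as soon as the contents have been read.
def same_contents?(first, second)
  File.open(first, 'rb') { |f| f.read } == File.open(second, 'rb') { |f| f.read }
end

Dir.mktmpdir do |dir|
  a = File.join(dir, 'a.rb')
  b = File.join(dir, 'b.rb')
  File.open(a, 'w') { |f| f.write("# sitemap\n") }
  File.open(b, 'w') { |f| f.write("# sitemap\n") }

  puts same_contents?(a, b)           # true
  # FileUtils.compare_file does the same comparison from the standard library.
  puts FileUtils.compare_file(a, b)   # true
end
```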
data/test/test_helper.rb
CHANGED
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: sitemap_generator
 version: !ruby/object:Gem::Version
-  version: 0.2.
+  version: 0.2.1
 platform: ruby
 authors:
 - Adam Salter
@@ -9,11 +9,11 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2009-11-
+date: 2009-11-10 00:00:00 +08:00
 default_executable:
 dependencies: []
 
-description:
+description: Install as a plugin or Gem to easily generate ['enterprise-class'][enterprise_class] Google Sitemaps for your Rails site, using a simple 'Rails Routes'-like DSL and a single rake task.
 email: "adam.salter@codebright.net "
 executables: []
 
@@ -34,11 +34,12 @@ files:
 - lib/sitemap_generator/tasks.rb
 - rails/install.rb
 - rails/uninstall.rb
+- tasks/sitemap_generator_tasks.rake
 - templates/sitemap.rb
 - templates/sitemap_index.builder
 - templates/xml_sitemap.builder
 has_rdoc: true
-homepage: http://github.com/adamsalter/sitemap_generator
+homepage: http://github.com/adamsalter/sitemap_generator
 licenses: []
 
 post_install_message:
@@ -64,7 +65,7 @@ rubyforge_project:
 rubygems_version: 1.3.5
 signing_key:
 specification_version: 3
-summary:
+summary: Generate 'enterprise-class' Sitemaps for your Rails site using a simple 'Rails Routes'-like DSL and a single Rake task
 test_files:
 - test/mock_app/app/controllers/application_controller.rb
 - test/mock_app/app/controllers/contents_controller.rb