airblade-sitemap_generator 0.3.4
- data/MIT-LICENSE +20 -0
- data/README.md +232 -0
- data/Rakefile +114 -0
- data/VERSION +1 -0
- data/lib/sitemap_generator.rb +28 -0
- data/lib/sitemap_generator/builder.rb +9 -0
- data/lib/sitemap_generator/builder/helper.rb +10 -0
- data/lib/sitemap_generator/builder/sitemap_file.rb +124 -0
- data/lib/sitemap_generator/builder/sitemap_index_file.rb +33 -0
- data/lib/sitemap_generator/interpreter.rb +28 -0
- data/lib/sitemap_generator/link.rb +36 -0
- data/lib/sitemap_generator/link_set.rb +174 -0
- data/lib/sitemap_generator/mapper.rb +16 -0
- data/lib/sitemap_generator/railtie.rb +7 -0
- data/lib/sitemap_generator/tasks.rb +1 -0
- data/lib/sitemap_generator/templates.rb +41 -0
- data/lib/sitemap_generator/utilities.rb +54 -0
- data/rails/install.rb +2 -0
- data/rails/uninstall.rb +2 -0
- data/tasks/sitemap_generator_tasks.rake +31 -0
- data/templates/sitemap.rb +42 -0
- metadata +115 -0
data/MIT-LICENSE
ADDED
@@ -0,0 +1,20 @@
Copyright (c) 2009 [name of plugin creator]

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,232 @@
N.B. This fork is Ruby 1.8.6-compatible (and probably therefore not compatible with 1.8.7+).

<hr>

SitemapGenerator
================

A Rails 3-compatible gem/plugin to generate ['enterprise-class'][enterprise_class] Sitemaps using a familiar Rails Routes-like DSL. Sitemaps are readable by all search engines and adhere to the ['Sitemap protocol specification'][sitemap_protocol]. Automatically pings search engines to notify them of new sitemaps (including Google, Yahoo and Bing). Provides rake tasks to easily manage your sitemaps. Supports image sitemaps and handles millions of links.

Features
-------

- v0.2.6: **Support ['image sitemaps'][sitemap_images]**!
- v0.2.5: **Support Rails 3**!

- Adheres to the ['Sitemap protocol specification'][sitemap_protocol]
- Handles millions of links
- Automatic Gzip of Sitemap files
- Automatic ping of search engines to notify them of new sitemaps: Google, Yahoo, Bing, Ask, SitemapWriter
- Won't clobber your old sitemaps if the new one fails to generate
- Set the priority of links, change frequency etc
- You control which links are included
- You set the host name, so it doesn't matter if your application is in a subdirectory

Foreword
-------

Unfortunately, Adam Salter passed away in 2009. Those who knew him know what an amazing guy he was, and what an excellent Rails programmer he was. His passing is a great loss to the Rails community.

[Karl Varga](http://github.com/kjvarga) has taken over development of SitemapGenerator. The canonical repository is [http://github.com/kjvarga/sitemap_generator][canonical_repo]

Installation
=======

**Rails 3:**

1. Add the gem to your <tt>Gemfile</tt>

    <code>gem 'sitemap_generator'</code>

2. `$ rake sitemap:install`

**Rails 2.x: As a gem**

1. Add the gem as a dependency in your <tt>config/environment.rb</tt>

    <code>config.gem 'sitemap_generator', :lib => false</code>

2. `$ rake gems:install`

3. Add the following to your <tt>RAILS_ROOT/Rakefile</tt>

    <pre>begin
      require 'sitemap_generator/tasks'
    rescue Exception => e
      puts "Warning, couldn't load gem tasks: #{e.message}! Skipping..."
    end</pre>

4. `$ rake sitemap:install`

**Rails 2.x: As a plugin**

1. <code>$ ./script/plugin install git://github.com/kjvarga/sitemap_generator.git</code>

----

Installation creates a <tt>config/sitemap.rb</tt> file which will contain your logic for generating the Sitemap files. If you want to create this file manually run <code>rake sitemap:install</code>.

You can run <code>rake sitemap:refresh</code> as needed to create Sitemap files. This will also ping these ['major search engines'][sitemap_engines]: Google, Yahoo, Bing, Ask, SitemapWriter. If you want to disable all non-essential output run the rake task with <code>rake -s sitemap:refresh</code>.
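
As a rough illustration of what a ping involves, the sketch below builds the Google ping URL using the same pattern as the `:google` entry in this gem's `lib/sitemap_generator/link_set.rb`: the Sitemap Index URL is CGI-escaped and passed as a query parameter. The method name `google_ping_url` and the example host are assumptions made for illustration, and nothing is actually requested over the network.

```ruby
require 'cgi'

# Build the Google ping URL for a Sitemap Index URL.
# Illustrative helper only; it is not part of the gem's API.
def google_ping_url(sitemap_index_url)
  # Escape the index URL so it is safe to use as a query-string value.
  "http://www.google.com/webmasters/sitemaps/ping?sitemap=#{CGI.escape(sitemap_index_url)}"
end

puts google_ping_url('http://www.example.com/sitemap_index.xml.gz')
```

The other engines listed above follow the same idea with their own endpoint and parameter name.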

To keep your Sitemaps up-to-date, set up a cron job. Pass the <tt>-s</tt> option to the rake task to silence all but the most important output. If you're using Whenever, then your schedule would look something like:

    # config/schedule.rb
    every 1.day, :at => '5:00 am' do
      rake "-s sitemap:refresh"
    end

Optionally, you can add the following to your <code>public/robots.txt</code> file, so that robots can find the sitemap file:

    Sitemap: <hostname>/sitemap_index.xml.gz

The Sitemap URL in the robots file should be the complete URL to the Sitemap Index, such as <tt>http://www.example.org/sitemap_index.xml.gz</tt>


Example 'config/sitemap.rb'
==========

    # Set the host name for URL creation
    SitemapGenerator::Sitemap.default_host = "http://www.example.com"

    SitemapGenerator::Sitemap.add_links do |sitemap|
      # Put links creation logic here.
      #
      # The Root Path ('/') and Sitemap Index file are added automatically.
      # Links are added to the Sitemap output in the order they are specified.
      #
      # Usage: sitemap.add path, options
      #        (default options are used if you don't specify them)
      #
      # Defaults: :priority => 0.5, :changefreq => 'weekly',
      #           :lastmod => Time.now, :host => default_host


      # Examples:

      # add '/articles'
      sitemap.add articles_path, :priority => 0.7, :changefreq => 'daily'

      # add all individual articles
      Article.find(:all).each do |a|
        sitemap.add article_path(a), :lastmod => a.updated_at
      end

      # add merchant path
      sitemap.add '/purchase', :priority => 0.7, :host => "https://www.example.com"

      # add all individual news with images
      News.all.each do |n|
        sitemap.add news_path(n), :lastmod => n.updated_at, :images => n.images.collect { |r| { :loc => r.image.url, :title => r.image.name } }
      end

    end

    # Including Sitemaps from Rails Engines.
    #
    # These Sitemaps should be almost identical to a regular Sitemap file except
    # they needn't define their own SitemapGenerator::Sitemap.default_host since
    # they will undoubtedly share the host name of the application they belong to.
    #
    # As an example, say we have a Rails Engine in vendor/plugins/cadability_client
    # We can include its Sitemap here as follows:
    #
    file = File.join(Rails.root, 'vendor/plugins/cadability_client/config/sitemap.rb')
    eval(open(file).read, binding, file)

Raison d'être
-------

Most of the Sitemap plugins out there seem to try to recreate the Sitemap links by iterating the Rails routes. In some cases this is possible, but for a great deal of cases it isn't.

a) There are probably quite a few routes in your routes file that don't need inclusion in the Sitemap. (AJAX routes I'm looking at you.)

and

b) How would you infer the correct series of links for the following route?

    map.zipcode 'location/:state/:city/:zipcode', :controller => 'zipcode', :action => 'index'

Don't tell me it's trivial, because it isn't. It just looks trivial.

So my idea is to have another file similar to 'routes.rb' called 'sitemap.rb', where you can define what goes into the Sitemap.

Here's my solution:

    Zipcode.find(:all, :include => :city).each do |z|
      sitemap.add zipcode_path(:state => z.city.state, :city => z.city, :zipcode => z)
    end

Easy hey?

Other Sitemap settings for the link, like `lastmod`, `priority`, `changefreq` and `host` are entered automatically, although you can override them if you need to.
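
To make the defaults-and-overrides behaviour concrete, here is a minimal plain-Ruby sketch. It mirrors the default values documented in the example config (`:priority => 0.5`, `:changefreq => 'weekly'`, `:lastmod => Time.now`, `:host => default_host`), but the method `link_options` and the constant `DEFAULT_HOST` are hypothetical stand-ins. The gem itself relies on ActiveSupport's `reverse_merge!`, whereas this sketch uses `Hash#merge` so it runs without Rails.

```ruby
require 'uri'

# Hypothetical stand-in for SitemapGenerator::Sitemap.default_host.
DEFAULT_HOST = 'http://www.example.com'

# Return the full set of options for a link: defaults first, then any
# explicitly passed options on top. Illustrative only, not the gem's API.
def link_options(path, options = {})
  defaults = { :priority => 0.5, :changefreq => 'weekly',
               :lastmod => Time.now, :host => DEFAULT_HOST }
  merged = defaults.merge(options) # explicitly passed options win
  merged.merge(:path => path, :loc => URI.join(merged[:host], path).to_s)
end

link = link_options('/purchase', :priority => 0.7, :host => 'https://www.example.com')
puts link[:loc]        # overridden host is used for the full URL
puts link[:changefreq] # falls back to the default
```

Options you do not pass keep their default values, which is why most `sitemap.add` calls only need a path.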

Compatibility
=======

Tested and working on:

- **Rails** 3.0.0, sitemap_generator version >= 0.2.5
- **Rails** 1.x - 2.3.5
- **Ruby** 1.8.7, 1.9.1

Notes
=======

1) For large sitemaps it may be useful to split your generation into batches to avoid running out of memory. E.g.:

    # add movies
    Movie.find_in_batches(:batch_size => 1000) do |movies|
      movies.each do |movie|
        sitemap.add "/movies/show/#{movie.to_param}", :lastmod => movie.updated_at, :changefreq => 'weekly'
      end
    end

2) New Capistrano deploys will remove your Sitemap files, unless you run `rake sitemap:refresh`. The way around this is to create a cap task:

    after "deploy:update_code", "deploy:copy_old_sitemap"

    namespace :deploy do
      task :copy_old_sitemap do
        run "if [ -e #{previous_release}/public/sitemap_index.xml.gz ]; then cp #{previous_release}/public/sitemap* #{current_release}/public/; fi"
      end
    end

3) If generation of your sitemap fails for some reason, the old sitemap will remain in public/. This ensures that robots will always find a valid sitemap. Running silently (`rake -s sitemap:refresh`) and with email forwarding set up you'll only get an email if your sitemap fails to build, and no notification when everything is fine - which will be most of the time.
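
One common way to get this "old file survives a failed build" behaviour is to write to a temporary file and only move it into place once the content has been produced successfully. The sketch below shows that general technique; it is an assumption for illustration, not necessarily how this gem implements it, and the method name `write_atomically` is hypothetical.

```ruby
require 'fileutils'

# Write the block's return value to +final_path+, but only replace the
# existing file if the block completes without raising.
# Illustrative helper only; it is not part of the gem.
def write_atomically(final_path)
  tmp_path = "#{final_path}.tmp"
  File.open(tmp_path, 'wb') { |f| f.write(yield) }
  FileUtils.mv(tmp_path, final_path) # swap the new file into place
rescue StandardError
  FileUtils.rm_f(tmp_path) if tmp_path # clean up the partial file
  raise
end
```

If the block raises, the previous file at `final_path` is left untouched and the error is re-raised, so a cron job running with `-s` would still report the failure.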

Known Bugs
========

- There's no check on the size of a URL which [isn't supposed to exceed 2,048 bytes][sitemaps_xml].
- Currently only supports one Sitemap Index file, which can contain 50,000 Sitemap files which can each contain 50,000 urls, so it _only_ supports up to 2,500,000,000 (2.5 billion) urls. I personally have no need of support for more urls, but the plugin could be improved to support this.
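
The two limits above can be sketched in a few lines of Ruby. This is only an illustration of the missing URL-size check and of the capacity arithmetic: the constant and method names are hypothetical and not part of the gem, and the strict "fewer than 2,048" comparison is my reading of the sitemaps.org protocol rather than something this README states.

```ruby
# Limits as described in the Known Bugs list; names are illustrative only.
MAX_URL_BYTES   = 2_048
FILES_PER_INDEX = 50_000
LINKS_PER_FILE  = 50_000

# True when the URL is short enough to be listed in a sitemap.
# Assumes the limit is strict (fewer than 2,048 bytes).
def url_within_limit?(url)
  url.bytesize < MAX_URL_BYTES
end

# Total number of URLs reachable through a single Sitemap Index file.
def max_urls_per_index
  FILES_PER_INDEX * LINKS_PER_FILE
end

puts max_urls_per_index # prints 2500000000
```

A check like `url_within_limit?` could be applied to each link before it is added, skipping or reporting any URL that is too long.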

Wishlist & Coming Soon
========

- Support for generating sitemaps for sites with multiple domains. Sitemaps are generated into subdirectories and we use a Rack middleware to rewrite requests for sitemaps to the correct subdirectory based on the request host.
- I want to refactor the code because it has grown a lot. Part of this refactoring will include implementing some more checks to make sure we adhere to standards as well as making sure that the sitemaps are being generated as efficiently as possible.

  I'd like to simplify adding links to a sitemap. Right now it's all or nothing. I'd like to break it up so you can add batches.
- Auto coverage testing. Generate a report of broken URLs by checking the status codes of each page in the sitemap.

Thanks (in no particular order)
========

- [Alexandre Bini](http://github.com/alexandrebini) for image sitemaps
- [Dan Pickett](http://github.com/dpickett)
- [Rob Biedenharn](http://github.com/rab)
- [Richie Vos](http://github.com/jerryvos)
- [Adrian Mugnolo](http://github.com/xymbol)
- [Jason Weathered](http://github.com/jasoncodes)

Copyright (c) 2009 Karl Varga released under the MIT license

[canonical_repo]:http://github.com/kjvarga/sitemap_generator
[enterprise_class]:https://twitter.com/dhh/status/1631034662 "I use enterprise in the same sense the Phusion guys do - i.e. Enterprise Ruby. Please don't look down on my use of the word 'enterprise' to represent being a cut above. It doesn't mean you ever have to work for a company the size of IBM. Or constantly fight inertia, writing crappy software, adhering to change management practices and spending hours in meetings... Not that there's anything wrong with that - Wait, what?"
[sitemap_engines]:http://en.wikipedia.org/wiki/Sitemap_index "http://en.wikipedia.org/wiki/Sitemap_index"
[sitemaps_org]:http://www.sitemaps.org/protocol.php "http://www.sitemaps.org/protocol.php"
[sitemaps_xml]:http://www.sitemaps.org/protocol.php#xmlTagDefinitions "XML Tag Definitions"
[sitemap_generator_usage]:http://wiki.github.com/adamsalter/sitemap_generator/sitemapgenerator-usage "http://wiki.github.com/adamsalter/sitemap_generator/sitemapgenerator-usage"
[boost_juice]:http://www.boostjuice.com.au/ "Mmmm, sweet, sweet Boost Juice."
[cb]:http://codebright.net "http://codebright.net"
[sitemap_images]:http://www.google.com/support/webmasters/bin/answer.py?answer=178636
[sitemap_protocol]:http://sitemaps.org/protocol.php
data/Rakefile
ADDED
@@ -0,0 +1,114 @@
require 'rake'
require 'rake/rdoctask'
require 'rubygems'
gem 'rspec', '1.3.0'
require 'spec/rake/spectask'
gem 'nokogiri'

begin
  require 'jeweler'
  Jeweler::Tasks.new do |gem|
    gem.name = "airblade-sitemap_generator"
    gem.summary = %Q{Easily generate enterprise class Sitemaps for your Rails site using a familiar Rails Routes-like DSL}
    gem.description = %Q{A Rails 3-compatible gem/plugin to generate enterprise-class Sitemaps using a familiar Rails Routes-like DSL. Sitemaps are readable by all search engines and adhere to the Sitemap protocol specification. Automatically pings search engines to notify them of new sitemaps (including Google, Yahoo and Bing). Provides rake tasks to easily manage your sitemaps. Supports image sitemaps and handles millions of links.}
    gem.email = "boss@airbladesoftware.com"
    gem.homepage = "http://github.com/airblade/sitemap_generator"
    gem.authors = ["Adam Salter", "Karl Varga"]
    gem.files = FileList["[A-Z]*", "{bin,lib,rails,templates,tasks}/**/*"]
    gem.test_files = []
    gem.add_development_dependency "rspec"
    gem.add_development_dependency "nokogiri"
  end
  Jeweler::GemcutterTasks.new
rescue LoadError
  puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
end

#
# Helper methods
#
module Helpers
  extend self

  # Return a full local path to path fragment <tt>path</tt>
  def local_path(path)
    File.join(File.dirname(__FILE__), path)
  end

  # Copy all of the local files into <tt>path</tt> after completely cleaning it
  def prepare_path(path)
    rm_rf path
    mkdir_p path
    cp_r(FileList["[A-Z]*", "{bin,lib,rails,templates,tasks}"], path)
  end
end

#
# Tasks
#
task :default => :test

namespace :test do
  #desc "Test as a gem, plugin and Rails 3 gem"
  #task :all => ['test:gem', 'test:plugin']

  task :gem => ['test:prepare:gem', 'multi_spec']
  task :plugin => ['test:prepare:plugin', 'multi_spec']
  task :rails3 => ['test:prepare:rails3', 'multi_spec']

  task :multi_spec do
    Rake::Task['spec'].invoke
    Rake::Task['spec'].reenable
  end

  namespace :prepare do
    task :gem do
      ENV["SITEMAP_RAILS"] = 'gem'
      Helpers.prepare_path(Helpers.local_path('spec/mock_app_gem/vendor/gems/sitemap_generator-1.2.3'))
      rm_rf(Helpers.local_path('spec/mock_app_gem/public/sitemap*'))
    end

    task :plugin do
      ENV["SITEMAP_RAILS"] = 'plugin'
      Helpers.prepare_path(Helpers.local_path('spec/mock_app_plugin/vendor/plugins/sitemap_generator-1.2.3'))
      rm_rf(Helpers.local_path('spec/mock_app_plugin/public/sitemap*'))
    end

    task :rails3 do
      ENV["SITEMAP_RAILS"] = 'rails3'
      rm_rf(Helpers.local_path('spec/mock_rails3_gem/public/sitemap*'))
    end
  end
end

desc "Release a new patch version"
task :release_new_version do
  Rake::Task['version:bump:patch'].invoke
  Rake::Task['github:release'].invoke
  Rake::Task['git:release'].invoke
  Rake::Task['gemcutter:release'].invoke
end

desc "Run tests as a gem install"
task :test => ['test:gem']

Spec::Rake::SpecTask.new(:spec) do |spec|
  spec.libs << 'lib' << 'spec'
  spec.spec_files = FileList['spec/**/*_spec.rb']
end
task :spec => :check_dependencies

Spec::Rake::SpecTask.new(:rcov) do |spec|
  spec.libs << 'lib' << 'spec'
  spec.pattern = 'spec/**/*_spec.rb'
  spec.rcov = true
end

desc 'Generate documentation'
Rake::RDocTask.new(:rdoc) do |rdoc|
  rdoc.rdoc_dir = 'rdoc'
  rdoc.title = 'SitemapGenerator'
  rdoc.options << '--line-numbers' << '--inline-source'
  rdoc.rdoc_files.include('README.md')
  rdoc.rdoc_files.include('lib/**/*.rb')
end
data/VERSION
ADDED
@@ -0,0 +1 @@
0.3.4
data/lib/sitemap_generator.rb
ADDED
@@ -0,0 +1,28 @@
require 'sitemap_generator/builder'
require 'sitemap_generator/mapper'
require 'sitemap_generator/link'
require 'sitemap_generator/link_set'
require 'sitemap_generator/templates'
require 'sitemap_generator/utilities'
require 'sitemap_generator/railtie' if SitemapGenerator::Utilities.rails3?

require 'active_support/core_ext/numeric'

module SitemapGenerator
  silence_warnings do
    VERSION = File.read(File.dirname(__FILE__) + "/../VERSION").strip
    MAX_SITEMAP_FILES = 50_000 # max sitemap links per index file
    MAX_SITEMAP_LINKS = 50_000 # max links per sitemap
    MAX_SITEMAP_IMAGES = 1_000 # max images per url
    MAX_SITEMAP_FILESIZE = 10.megabytes # bytes

    Sitemap = LinkSet.new
  end

  class << self
    attr_accessor :root, :templates
  end

  self.root = File.expand_path(File.join(File.dirname(__FILE__), '../'))
  self.templates = SitemapGenerator::Templates.new(self.root)
end
data/lib/sitemap_generator/builder/sitemap_file.rb
ADDED
@@ -0,0 +1,124 @@
require 'sitemap_generator/builder/helper'
require 'builder'
require 'zlib'

module SitemapGenerator
  module Builder
    class SitemapFile
      include SitemapGenerator::Builder::Helper

      attr_accessor :sitemap_path, :public_path, :filesize, :link_count, :hostname

      # <tt>public_path</tt> full path of the directory to write sitemaps in.
      #   Usually your Rails <tt>public/</tt> directory.
      #
      # <tt>sitemap_path</tt> relative path including filename of the sitemap
      #   file relative to <tt>public_path</tt>
      #
      # <tt>hostname</tt> hostname including protocol to use in all links
      #   e.g. http://en.google.ca
      def initialize(public_path, sitemap_path, hostname)
        self.sitemap_path = sitemap_path
        self.public_path = public_path
        self.hostname = hostname
        self.link_count = 0

        @xml_content = '' # XML urlset content
        @xml_wrapper_start = <<-HTML
          <?xml version="1.0" encoding="UTF-8"?>
            <urlset
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
              xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
              xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            >
        HTML
        @xml_wrapper_start.gsub!(/\s+/, ' ').gsub!(/ *> */, '>').strip!
        @xml_wrapper_end = %q[</urlset>]
        self.filesize = @xml_wrapper_start.length + @xml_wrapper_end.length
      end

      def lastmod
        File.mtime(self.full_path) rescue nil
      end

      def empty?
        self.link_count == 0
      end

      def full_url
        URI.join(self.hostname, self.sitemap_path).to_s
      end

      def full_path
        @full_path ||= File.join(self.public_path, self.sitemap_path)
      end

      # Return a boolean indicating whether the sitemap file can fit another link
      # of <tt>bytes</tt> bytes in size.
      def file_can_fit?(bytes)
        (self.filesize + bytes) < SitemapGenerator::MAX_SITEMAP_FILESIZE && self.link_count < SitemapGenerator::MAX_SITEMAP_LINKS
      end

      # Add a link to the sitemap file and return a boolean indicating whether the
      # link was added.
      #
      # If a link cannot be added, the file is too large or the link limit has been reached.
      def add_link(link)
        xml = build_xml(::Builder::XmlMarkup.new, link)
        unless file_can_fit?(xml.length)
          self.finalize!
          return false
        end

        @xml_content << xml
        self.filesize += xml.length
        self.link_count += 1
        true
      end
      alias_method :<<, :add_link

      # Return XML as a String
      def build_xml(builder, link)
        builder.url do
          builder.loc link[:loc]
          builder.lastmod w3c_date(link[:lastmod]) if link[:lastmod]
          builder.changefreq link[:changefreq] if link[:changefreq]
          builder.priority link[:priority] if link[:priority]

          unless link[:images].blank?
            link[:images].each do |image|
              builder.image :image do
                builder.image :loc, image[:loc]
                builder.image :caption, image[:caption] if image[:caption]
                builder.image :geo_location, image[:geo_location] if image[:geo_location]
                builder.image :title, image[:title] if image[:title]
                builder.image :license, image[:license] if image[:license]
              end
            end
          end
        end
        builder << ''
      end

      # Insert the content into the XML "wrapper" and write and close the file.
      #
      # All the xml content in the instance is cleared, but attributes like
      # <tt>filesize</tt> are still available.
      def finalize!
        return if self.frozen?

        open(self.full_path, 'wb') do |file|
          gz = Zlib::GzipWriter.new(file)
          gz.write @xml_wrapper_start
          gz.write @xml_content
          gz.write @xml_wrapper_end
          gz.close
        end
        @xml_content = @xml_wrapper_start = @xml_wrapper_end = ''
        self.freeze
      end
    end
  end
end
data/lib/sitemap_generator/builder/sitemap_index_file.rb
ADDED
@@ -0,0 +1,33 @@
module SitemapGenerator
  module Builder
    class SitemapIndexFile < SitemapFile

      def initialize(*args)
        super(*args)

        @xml_content = '' # XML urlset content
        @xml_wrapper_start = <<-HTML
          <?xml version="1.0" encoding="UTF-8"?>
            <sitemapindex
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"
              xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            >
        HTML
        @xml_wrapper_start.gsub!(/\s+/, ' ').gsub!(/ *> */, '>').strip!
        @xml_wrapper_end = %q[</sitemapindex>]
        self.filesize = @xml_wrapper_start.length + @xml_wrapper_end.length
      end

      # Return XML as a String
      def build_xml(builder, link)
        builder.sitemap do
          builder.loc link[:loc]
          builder.lastmod w3c_date(link[:lastmod]) if link[:lastmod]
        end
        builder << ''
      end
    end
  end
end
data/lib/sitemap_generator/interpreter.rb
ADDED
@@ -0,0 +1,28 @@
module SitemapGenerator

  # Evaluate a sitemap config file within the context of a class that includes the
  # Rails URL helpers.
  class Interpreter

    if SitemapGenerator::Utilities.rails3?
      include ::Rails.application.routes.url_helpers
    else
      require 'action_controller'
      include ActionController::UrlWriter
    end

    def initialize(sitemap_config_file=nil)
      sitemap_config_file ||= File.join(::Rails.root, 'config/sitemap.rb')
      eval(open(sitemap_config_file).read)
    end

    # KJV do we need this? We should be using path_* helpers.
    # def self.default_url_options(options = nil)
    #   { :host => SitemapGenerator::Sitemap.default_host }
    # end

    def self.run
      new
    end
  end
end
data/lib/sitemap_generator/link.rb
ADDED
@@ -0,0 +1,36 @@
module SitemapGenerator
  module Link
    extend self

    # Return a Hash of options suitable to pass to a SitemapGenerator::Builder::SitemapFile instance.
    def generate(path, options = {})
      if path.is_a?(SitemapGenerator::Builder::SitemapFile)
        options.reverse_merge!(:host => path.hostname, :lastmod => path.lastmod)
        path = path.sitemap_path
      end

      options.assert_valid_keys(:priority, :changefreq, :lastmod, :host, :images)
      options.reverse_merge!(:priority => 0.5, :changefreq => 'weekly', :lastmod => Time.now, :host => Sitemap.default_host, :images => [])
      {
        :path => path,
        :priority => options[:priority],
        :changefreq => options[:changefreq],
        :lastmod => options[:lastmod],
        :host => options[:host],
        :loc => URI.join(options[:host], path).to_s,
        :images => prepare_images(options[:images], options[:host])
      }
    end

    # Return an Array of image option Hashes suitable to be parsed by SitemapGenerator::Builder::SitemapFile
    def prepare_images(images, host)
      images.delete_if { |key,value| key[:loc] == nil }
      images.each do |r|
        r.assert_valid_keys(:loc, :caption, :geo_location, :title, :license)
        r[:loc] = URI.join(host, r[:loc]).to_s
      end
      images[0..(SitemapGenerator::MAX_SITEMAP_IMAGES-1)]
    end
  end
end
@@ -0,0 +1,174 @@
|
|
1
|
+
require 'builder'
|
2
|
+
require 'action_view'
|
3
|
+
|
4
|
+
# A LinkSet provisions a bunch of links to sitemap files. It also writes the index file
|
5
|
+
# which lists all the sitemap files written.
|
6
|
+
module SitemapGenerator
|
7
|
+
class LinkSet
|
8
|
+
include ActionView::Helpers::NumberHelper # for number_with_delimiter
|
9
|
+
|
10
|
+
attr_accessor :default_host, :public_path, :sitemaps_path
|
11
|
+
attr_accessor :sitemap, :sitemaps, :sitemap_index
|
12
|
+
attr_accessor :verbose, :yahoo_app_id
|
13
|
+
|
14
|
+
# Evaluate the sitemap config file and write all sitemaps.
|
15
|
+
#
|
16
|
+
# This should be refactored so that we can have multiple instances
|
17
|
+
# of LinkSet.
|
18
|
+
def create
|
19
|
+
require 'sitemap_generator/interpreter'
|
20
|
+
|
21
|
+
self.public_path = File.join(::Rails.root, 'public/') if self.public_path.nil?
|
22
|
+
|
23
|
+
start_time = Time.now
|
24
|
+
SitemapGenerator::Interpreter.run
|
25
|
+
finalize!
|
26
|
+
end_time = Time.now
|
27
|
+
|
28
|
+
puts "\nSitemap stats: #{number_with_delimiter(self.link_count)} links / #{self.sitemaps.size} files / " + ("%dm%02ds" % (end_time - start_time).divmod(60)) if verbose
|
29
|
+
end
|
30
|
+
|
31
|
+
# <tt>public_path</tt> (optional) full path to the directory to write sitemaps in.
|
32
|
+
# Defaults to your Rails <tt>public/</tt> directory.
|
33
|
+
#
|
34
|
+
# <tt>sitemaps_path</tt> (optional) path fragment within public to write sitemaps
|
35
|
+
# to e.g. 'en/'. Sitemaps are written to <tt>public_path</tt> + <tt>sitemaps_path</tt>
|
36
|
+
#
|
37
|
+
# <tt>default_host</tt> hostname including protocol to use in all sitemap links
|
38
|
+
# e.g. http://en.google.ca
|
39
|
+
def initialize(public_path = nil, sitemaps_path = nil, default_host = nil)
|
40
|
+
self.default_host = default_host
|
41
|
+
self.public_path = public_path
|
42
|
+
self.sitemaps_path = sitemaps_path
|
43
|
+
|
44
|
+
# Completed sitemaps
|
45
|
+
self.sitemaps = []
|
46
|
+
end
|
47
|
+
|
48
|
+
def link_count
|
49
|
+
self.sitemaps.inject(0) { |link_count_sum, sitemap| link_count_sum + sitemap.link_count }
|
50
|
+
end
|
51
|
+
|
52
|
+
# Called within the user's eval'ed sitemap config file. Add links to sitemap files
|
53
|
+
# passing a block.
|
54
|
+
#
|
55
|
+
# TODO: Refactor. The call chain is confusing and convoluted here.
|
56
|
+
def add_links
|
57
|
+
raise ArgumentError, "Default hostname not set" if default_host.blank?
|
58
|
+
|
59
|
+
# I'd rather have these calls in <tt>create</tt> but we have to wait
|
60
|
+
# for <tt>default_host</tt> to be set by the user's sitemap config
|
61
|
+
new_sitemap
|
62
|
+
add_default_links
|
63
|
+
|
64
|
+
yield Mapper.new(self)
|
65
|
+
end
|
66
|
+
|
67
|
+
# Called from Mapper.
|
68
|
+
#
|
69
|
+
# Add a link to the current sitemap.
|
70
|
+
def add_link(link)
|
71
|
+
unless self.sitemap << link
|
72
|
+
new_sitemap
|
73
|
+
self.sitemap << link
|
74
|
+
end
|
75
|
+
end
|
76
|
+
+    # Add the current sitemap to the <tt>sitemaps</tt> Array and
+    # start a new sitemap.
+    #
+    # If the current sitemap is nil or empty it is not added.
+    def new_sitemap
+      unless self.sitemap_index
+        self.sitemap_index = SitemapGenerator::Builder::SitemapIndexFile.new(public_path, sitemap_index_path, default_host)
+      end
+
+      unless self.sitemap
+        self.sitemap = SitemapGenerator::Builder::SitemapFile.new(public_path, new_sitemap_path, default_host)
+      end
+
+      # Mark the sitemap as complete and add it to the sitemap index
+      unless self.sitemap.empty?
+        self.sitemap.finalize!
+        self.sitemap_index << Link.generate(self.sitemap)
+        self.sitemaps << self.sitemap
+        show_progress(self.sitemap) if verbose
+
+        self.sitemap = SitemapGenerator::Builder::SitemapFile.new(public_path, new_sitemap_path, default_host)
+      end
+    end
+
+    # Report progress line.
+    def show_progress(sitemap)
+      uncompressed_size = number_to_human_size(sitemap.filesize)
+      compressed_size = number_to_human_size(File.size?(sitemap.full_path))
+      puts "+ #{sitemap.sitemap_path} #{sitemap.link_count} links / #{uncompressed_size} / #{compressed_size} gzipped"
+    end
+
+    # Finalize all sitemap files
+    def finalize!
+      new_sitemap
+      self.sitemap_index.finalize!
+    end
+
+    # Ping search engines.
+    #
+    # @see http://en.wikipedia.org/wiki/Sitemap_index
+    def ping_search_engines
+      require 'open-uri'
+
+      sitemap_index_url = CGI.escape(self.sitemap_index.full_url)
+      search_engines = {
+        :google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=#{sitemap_index_url}",
+        :yahoo => "http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=#{sitemap_index_url}&appid=#{yahoo_app_id}",
+        :ask => "http://submissions.ask.com/ping?sitemap=#{sitemap_index_url}",
+        :bing => "http://www.bing.com/webmaster/ping.aspx?siteMap=#{sitemap_index_url}",
+        :sitemap_writer => "http://www.sitemapwriter.com/notify.php?crawler=all&url=#{sitemap_index_url}"
+      }
+
+      puts "\n" if verbose
+      search_engines.each do |engine, link|
+        next if engine == :yahoo && !self.yahoo_app_id
+        begin
+          open(link)
+          puts "Successful ping of #{engine.to_s.titleize}" if verbose
+        rescue Timeout::Error, StandardError => e
+          puts "Ping failed for #{engine.to_s.titleize}: #{e.inspect} (URL #{link})" if verbose
+        end
+      end
+
+      if !self.yahoo_app_id && verbose
+        puts "\n"
+        puts <<-END.gsub(/^\s+/, '')
+          To ping Yahoo you require a Yahoo AppID. Add it to your config/sitemap.rb with:
+
+          SitemapGenerator::Sitemap.yahoo_app_id = "my_app_id"
+
+          For more information see http://developer.yahoo.com/search/siteexplorer/V1/updateNotification.html
+        END
+      end
+    end
+
+    protected
+
+    def add_default_links
+      self.sitemap << Link.generate('/', :lastmod => Time.now, :changefreq => 'always', :priority => 1.0)
+      self.sitemap << Link.generate(self.sitemap_index, :lastmod => Time.now, :changefreq => 'always', :priority => 1.0)
+    end
+
+    # Return the current sitemap filename with index.
+    #
+    # The index depends on the length of the <tt>sitemaps</tt> array.
+    def new_sitemap_path
+      File.join(self.sitemaps_path || '', "sitemap#{self.sitemaps.length + 1}.xml.gz")
+    end
+
+    # Return the current sitemap index filename.
+    #
+    # At the moment we only support one index file which can link to
+    # up to 50,000 sitemap files.
+    def sitemap_index_path
+      File.join(self.sitemaps_path || '', 'sitemap_index.xml.gz')
+    end
+  end
+end
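The rollover behaviour of `add_link` and `new_sitemap` in the file above (append until the current sitemap refuses a link, archive it, then start a fresh one) can be sketched in isolation. This is a minimal illustration only: `TinySitemap`, `TinyLinkSet` and the `max_links` limit are hypothetical stand-ins, not the gem's `SitemapFile` or `LinkSet`, and the sitemap index is left out.

```ruby
# Hypothetical stand-in for a sitemap file with a fixed link limit.
class TinySitemap
  attr_reader :links

  def initialize(max_links)
    @max_links = max_links
    @links = []
  end

  # Append a link; return false (without adding it) when the file is full.
  def <<(link)
    return false if @links.length >= @max_links
    @links << link
    true
  end

  def empty?
    @links.empty?
  end
end

# Hypothetical stand-in for LinkSet that keeps only the rollover logic.
class TinyLinkSet
  attr_reader :sitemaps

  def initialize(max_links)
    @max_links = max_links
    @sitemaps = [] # completed sitemaps
    @sitemap = TinySitemap.new(max_links)
  end

  # Try the current sitemap first; if it is full, roll over and retry.
  def add_link(link)
    unless @sitemap << link
      new_sitemap
      @sitemap << link
    end
  end

  # Archive the current sitemap (unless it is empty) and start a new one.
  def new_sitemap
    return if @sitemap.empty?
    @sitemaps << @sitemap
    @sitemap = TinySitemap.new(@max_links)
  end

  # Archive whatever is left in the current sitemap.
  def finalize!
    new_sitemap
  end
end

set = TinyLinkSet.new(2)
%w[/a /b /c /d /e].each { |path| set.add_link(path) }
set.finalize!
puts set.sitemaps.map { |s| s.links.length }.inspect # prints [2, 2, 1]
```

Because an empty sitemap is never archived, calling `finalize!` twice in this sketch does not add an empty trailing file.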
@@ -0,0 +1,16 @@
+module SitemapGenerator
+  # Mapper instances are used to build links.
+  # The object passed to the add_links block in config/sitemap.rb is a Mapper instance.
+  class Mapper
+    attr_accessor :set
+
+    def initialize(set)
+      @set = set
+    end
+
+    def add(loc, options = {})
+      set.add_link Link.generate(loc, options)
+    end
+  end
+end
+
@@ -0,0 +1 @@
+load File.expand_path(File.join(File.dirname(__FILE__), '../../tasks/sitemap_generator_tasks.rake'))
@@ -0,0 +1,41 @@
+module SitemapGenerator
+  # Provide convenient access to template files. E.g.
+  #
+  #   SitemapGenerator.templates.sitemap_sample
+  #
+  # Lazy-load and cache for efficient access.
+  # Define an accessor method for each template file.
+  class Templates
+    FILES = {
+      :sitemap_sample => 'sitemap.rb',
+    }
+
+    # Dynamically define accessors for each key defined in <tt>FILES</tt>
+    attr_accessor *FILES.keys
+    FILES.keys.each do |name|
+      eval <<-END
+        define_method(:#{name}) do
+          @#{name} ||= read_template(:#{name})
+        end
+      END
+    end
+
+    def initialize(root = SitemapGenerator.root)
+      @root = root
+    end
+
+    # Return the full path to a template.
+    #
+    # <tt>template</tt> template symbol e.g. <tt>:sitemap_sample</tt>
+    def template_path(template)
+      File.join(@root, 'templates', self.class::FILES[template])
+    end
+
+    protected
+
+    # Read the template file and return its contents.
+    def read_template(template)
+      File.read(template_path(template))
+    end
+  end
+end
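The lazy-load-and-cache accessors in the class above can also be written without string `eval`. The sketch below is a hypothetical, simplified variant (`LazyTemplates` is not part of the gem) that uses `define_method` with a block and `instance_variable_get`/`instance_variable_set`; it reads from a temporary directory so it runs on its own.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical, simplified variant of the Templates class.
class LazyTemplates
  FILES = { :sitemap_sample => 'sitemap.rb' }

  # One reader per template; the file is read on first access and cached.
  FILES.each_key do |name|
    define_method(name) do
      ivar = "@#{name}"
      cached = instance_variable_get(ivar)
      cached || instance_variable_set(ivar, File.read(template_path(name)))
    end
  end

  def initialize(root)
    @root = root
  end

  # Full path to the template file for the given symbol.
  def template_path(template)
    File.join(@root, 'templates', FILES.fetch(template))
  end
end

# Build a throwaway template directory to exercise the accessor.
root = Dir.mktmpdir
FileUtils.mkdir_p(File.join(root, 'templates'))
path = File.join(root, 'templates', 'sitemap.rb')
File.write(path, "# sample\n")

templates = LazyTemplates.new(root)
first = templates.sitemap_sample
File.write(path, "# changed\n")     # later changes on disk are not picked up
second = templates.sitemap_sample   # served from the cache
FileUtils.remove_entry(root)
```

The second read returns the cached string even though the file changed, which is the trade-off of caching template contents per instance.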
@@ -0,0 +1,54 @@
+module SitemapGenerator
+  module Utilities
+    extend self
+
+    # Copy templates/sitemap.rb to config if not there yet.
+    def install_sitemap_rb(verbose=false)
+      if File.exist?(File.join(RAILS_ROOT, 'config/sitemap.rb'))
+        puts "already exists: config/sitemap.rb, file not copied" if verbose
+      else
+        FileUtils.cp(
+          SitemapGenerator.templates.template_path(:sitemap_sample),
+          File.join(RAILS_ROOT, 'config/sitemap.rb'))
+        puts "created: config/sitemap.rb" if verbose
+      end
+    end
+
+    # Remove config/sitemap.rb if it exists.
+    def uninstall_sitemap_rb
+      if File.exist?(File.join(RAILS_ROOT, 'config/sitemap.rb'))
+        FileUtils.rm(File.join(RAILS_ROOT, 'config/sitemap.rb'))
+      end
+    end
+
+    # Clean sitemap files in output directory.
+    def clean_files
+      FileUtils.rm(Dir[File.join(RAILS_ROOT, 'public/sitemap*.xml.gz')])
+    end
+
+    # Returns whether this environment is using ActionPack
+    # version 3.0.0 or greater.
+    #
+    # @return [Boolean]
+    def self.rails3?
+      # The ActionPack module is always loaded automatically in Rails >= 3
+      return false unless defined?(ActionPack) && defined?(ActionPack::VERSION)
+
+      version =
+        if defined?(ActionPack::VERSION::MAJOR)
+          ActionPack::VERSION::MAJOR
+        else
+          # Rails 1.2
+          ActionPack::VERSION::Major
+        end
+
+      # 3.0.0.beta1 acts more like ActionPack 2
+      # for purposes of this method
+      # (checking whether block helpers require = or -).
+      # This extra check can be removed when beta2 is out.
+      version >= 3 &&
+        !(defined?(ActionPack::VERSION::TINY) &&
+          ActionPack::VERSION::TINY == "0.beta")
+    end
+  end
+end
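The `rails3?` check above special-cases the beta release by inspecting `ActionPack::VERSION::TINY`. A different technique, shown here only as a sketch and not what the gem does, is to compare full version strings with `Gem::Version`, which orders prerelease versions before the corresponding release. The helper name `at_least_3_0_0?` is hypothetical.

```ruby
require 'rubygems'

# Hypothetical helper: true when the version string is a 3.0.0 release or later.
# Gem::Version sorts prereleases such as "3.0.0.beta1" before "3.0.0".
def at_least_3_0_0?(version_string)
  Gem::Version.new(version_string) >= Gem::Version.new("3.0.0")
end

puts at_least_3_0_0?("2.3.8")       # prints false
puts at_least_3_0_0?("3.0.0.beta1") # prints false
puts at_least_3_0_0?("3.0.0")       # prints true
```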
data/rails/install.rb
ADDED
data/rails/uninstall.rb
ADDED
@@ -0,0 +1,31 @@
+begin
+  require 'sitemap_generator'
+rescue LoadError, NameError
+  # Application should work without sitemap_generator
+end
+
+namespace :sitemap do
+  desc "Install a default config/sitemap.rb file"
+  task :install do
+    SitemapGenerator::Utilities.install_sitemap_rb(verbose)
+  end
+
+  desc "Delete all Sitemap files in public/ directory"
+  task :clean do
+    SitemapGenerator::Utilities.clean_files
+  end
+
+  desc "Create Sitemap XML files in public/ directory (rake -s for no output)"
+  task :refresh => ['sitemap:create'] do
+    SitemapGenerator::Sitemap.ping_search_engines
+  end
+
+  desc "Create Sitemap XML files (don't ping search engines)"
+  task 'refresh:no_ping' => ['sitemap:create']
+
+  task :create => [:environment] do
+    SitemapGenerator::Sitemap.verbose = verbose
+    SitemapGenerator::Sitemap.create
+  end
+end
+
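The `sitemap:refresh` task above ends by calling `ping_search_engines`, which escapes the sitemap index URL with `CGI.escape` before interpolating it into each ping URL. A minimal sketch of that escaping step follows; `ping_url` is a hypothetical helper, the index URL is an example value, and no HTTP request is made.

```ruby
require 'cgi'

# Hypothetical helper: append the escaped sitemap index URL to a ping endpoint.
def ping_url(endpoint, sitemap_index_url)
  "#{endpoint}#{CGI.escape(sitemap_index_url)}"
end

# Example value only; the real URL comes from the sitemap index file.
index_url = "http://www.example.com/sitemap_index.xml.gz"
url = ping_url("http://www.google.com/webmasters/sitemaps/ping?sitemap=", index_url)
puts url
# prints http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemap_index.xml.gz
```

Escaping matters because the index URL contains `:` and `/`, which would otherwise be ambiguous inside a query string value.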
@@ -0,0 +1,42 @@
+# Set the host name for URL creation
+SitemapGenerator::Sitemap.default_host = "http://www.example.com"
+
+SitemapGenerator::Sitemap.add_links do |sitemap|
+  # Put links creation logic here.
+  #
+  # The root path '/' and sitemap index file are added automatically.
+  # Links are added to the Sitemap in the order they are specified.
+  #
+  # Usage: sitemap.add path, options
+  #        (default options are used if you don't specify)
+  #
+  # Defaults: :priority => 0.5, :changefreq => 'weekly',
+  #           :lastmod => Time.now, :host => default_host
+
+
+  # Examples:
+
+  # add '/articles'
+  sitemap.add articles_path, :priority => 0.7, :changefreq => 'daily'
+
+  # add all individual articles
+  Article.find(:all).each do |a|
+    sitemap.add article_path(a), :lastmod => a.updated_at
+  end
+
+  # add merchant path
+  sitemap.add '/purchase', :priority => 0.7, :host => "https://www.example.com"
+
+end
+
+# Including Sitemaps from Rails Engines.
+#
+# These Sitemaps should be almost identical to a regular Sitemap file except
+# they needn't define their own SitemapGenerator::Sitemap.default_host since
+# they will undoubtedly share the host name of the application they belong to.
+#
+# As an example, say we have a Rails Engine in vendor/plugins/cadability_client
+# We can include its Sitemap here as follows:
+#
+# file = File.join(Rails.root, 'vendor/plugins/cadability_client/config/sitemap.rb')
+# eval(open(file).read, binding, file)
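The template above documents per-link defaults (`:priority => 0.5`, `:changefreq => 'weekly'`, `:lastmod => Time.now`, `:host => default_host`) that explicit options override. One way such merging could work is sketched below with `Hash#merge`. `DEFAULT_LINK_OPTIONS` and `link_options` are hypothetical names for illustration; the gem's own `Link.generate` is not shown in this part of the diff and may differ.

```ruby
# Hypothetical constant holding the static defaults listed in the template comment.
DEFAULT_LINK_OPTIONS = { :priority => 0.5, :changefreq => 'weekly' }

# Hypothetical helper (not the gem's API): caller options win over defaults.
def link_options(options = {}, default_host = "http://www.example.com")
  DEFAULT_LINK_OPTIONS
    .merge(:lastmod => Time.now, :host => default_host)
    .merge(options)
end

plain    = link_options
purchase = link_options(:priority => 0.7, :host => "https://www.example.com")

puts plain[:priority]      # prints 0.5
puts purchase[:priority]   # prints 0.7
puts purchase[:host]       # prints https://www.example.com
puts purchase[:changefreq] # prints weekly
```

`:lastmod` and `:host` are computed per call rather than stored in the constant, since both depend on when and where the link is generated.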
metadata
ADDED
@@ -0,0 +1,115 @@
+--- !ruby/object:Gem::Specification
+name: airblade-sitemap_generator
+version: !ruby/object:Gem::Version
+  hash: 27
+  prerelease: false
+  segments:
+  - 0
+  - 3
+  - 4
+  version: 0.3.4
+platform: ruby
+authors:
+- Adam Salter
+- Karl Varga
+autorequire:
+bindir: bin
+cert_chain: []
+
+date: 2010-06-28 00:00:00 +01:00
+default_executable:
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: rspec
+  prerelease: false
+  requirement: &id001 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        hash: 3
+        segments:
+        - 0
+        version: "0"
+  type: :development
+  version_requirements: *id001
+- !ruby/object:Gem::Dependency
+  name: nokogiri
+  prerelease: false
+  requirement: &id002 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        hash: 3
+        segments:
+        - 0
+        version: "0"
+  type: :development
+  version_requirements: *id002
+description: A Rails 3-compatible gem/plugin to generate enterprise-class Sitemaps using a familiar Rails Routes-like DSL. Sitemaps are readable by all search engines and adhere to the Sitemap protocol specification. Automatically pings search engines to notify them of new sitemaps (including Google, Yahoo and Bing). Provides rake tasks to easily manage your sitemaps. Supports image sitemaps and handles millions of links.
+email: boss@airbladesoftware.com
+executables: []
+
+extensions: []
+
+extra_rdoc_files:
+- README.md
+files:
+- MIT-LICENSE
+- README.md
+- Rakefile
+- VERSION
+- lib/sitemap_generator.rb
+- lib/sitemap_generator/builder.rb
+- lib/sitemap_generator/builder/helper.rb
+- lib/sitemap_generator/builder/sitemap_file.rb
+- lib/sitemap_generator/builder/sitemap_index_file.rb
+- lib/sitemap_generator/interpreter.rb
+- lib/sitemap_generator/link.rb
+- lib/sitemap_generator/link_set.rb
+- lib/sitemap_generator/mapper.rb
+- lib/sitemap_generator/railtie.rb
+- lib/sitemap_generator/tasks.rb
+- lib/sitemap_generator/templates.rb
+- lib/sitemap_generator/utilities.rb
+- rails/install.rb
+- rails/uninstall.rb
+- tasks/sitemap_generator_tasks.rake
+- templates/sitemap.rb
+has_rdoc: true
+homepage: http://github.com/airblade/sitemap_generator
+licenses: []
+
+post_install_message:
+rdoc_options:
+- --charset=UTF-8
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      hash: 3
+      segments:
+      - 0
+      version: "0"
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      hash: 3
+      segments:
+      - 0
+      version: "0"
+requirements: []
+
+rubyforge_project:
+rubygems_version: 1.3.7
+signing_key:
+specification_version: 3
+summary: Easily generate enterprise class Sitemaps for your Rails site using a familiar Rails Routes-like DSL
+test_files: []
+