alexrabarts-big_sitemap 0.1.3
- data/README.markdown +90 -0
- data/VERSION.yml +4 -0
- data/lib/big_sitemap.rb +200 -0
- data/test/big_sitemap_test.rb +177 -0
- data/test/fixtures/test_model.rb +18 -0
- data/test/test_helper.rb +11 -0
- metadata +79 -0
data/README.markdown
ADDED
@@ -0,0 +1,90 @@
# BigSitemap

## DESCRIPTION:

BigSitemap is a Sitemap generator specifically designed for large sites (although it works equally well with small sites). It splits large Sitemaps into multiple files, gzips the files to minimize bandwidth usage, batches database queries so it doesn't take your site down, can be set up with just a few lines of code and is compatible with just about any framework.

## INSTALL:

* Via git: git clone git://github.com/alexrabarts/big_sitemap.git
* Via gem: gem install alexrabarts-big_sitemap -s http://gems.github.com

## SYNOPSIS:

The minimum required to generate a sitemap is:

<pre>
sitemap = BigSitemap.new(:base_url => 'http://example.com')
sitemap.add(:model => MyModel, :path => 'my_controller')
sitemap.generate
</pre>

You can put this in a rake/thor task and create a cron job to run it periodically. It should be enough for most Rails/Merb applications.

Your models must provide either a <code>find_for_sitemap</code> or <code>all</code> class method that returns the instances that are to be included in the sitemap. Additionally, your models must provide a <code>count_for_sitemap</code> or <code>count</code> class method that returns a count of the instances to be included. If you're using ActiveRecord (Rails) or DataMapper then <code>all</code> and <code>count</code> are already provided and you don't need to do anything unless you want to include a subset of records. If you provide your own <code>find_for_sitemap</code> or <code>all</code> method then it should be able to handle the <code>:offset</code> and <code>:limit</code> options, in the same way that ActiveRecord and DataMapper handle them. This is especially important if you have more than 50,000 URLs.
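
As a rough illustration of the contract described above, here is a minimal sketch of a plain-Ruby model that supplies its own <code>find_for_sitemap</code> and <code>count_for_sitemap</code>. The <code>Article</code> class, its <code>RECORDS</code> constant and the "published" filter are hypothetical and stand in for a real database table; only the two class method names and the <code>:offset</code>/<code>:limit</code> options come from this README.

```ruby
# Hypothetical model for illustration; not part of BigSitemap.
class Article
  # In-memory stand-in for a database table (assumption).
  RECORDS = (1..25).map { |i| { :id => i, :published => i.odd? } }

  attr_reader :id

  def initialize(id)
    @id = id
  end

  class << self
    # Number of records that should appear in the sitemap.
    def count_for_sitemap
      published.size
    end

    # Returns one batch of records, honouring :offset and :limit
    # the way a database-backed finder would.
    def find_for_sitemap(options = {})
      offset = options[:offset] || 0
      limit  = options[:limit] || published.size
      (published[offset, limit] || []).map { |row| new(row[:id]) }
    end

    private

    # Only a subset of the rows is exposed to the sitemap.
    def published
      RECORDS.select { |row| row[:published] }
    end
  end
end
```

Because <code>Article</code> does not define <code>to_param</code>, URLs for it would be built from <code>id</code>.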

To generate the URLs, BigSitemap will combine the constructor arguments with the <code>to_param</code> method of each instance returned (provided by ActiveRecord but not DataMapper). If this method is not present, <code>id</code> will be used. The URL is constructed as:

<pre>
":base_url/:path/:to_param" # (if to_param exists)
":base_url/:path/:id" # (if to_param does not exist)
</pre>
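
The rule above can be sketched in a few lines of Ruby. The helper name <code>build_url</code> and the <code>Post</code> and <code>Page</code> structs are assumptions made up for this example; BigSitemap does the equivalent internally when it writes each <code>loc</code> element.

```ruby
# Illustrative helper only; build_url is not a BigSitemap method.
def build_url(base_url, path, record)
  # Prefer to_param when the record provides it, otherwise fall back to id.
  param = record.respond_to?(:to_param) ? record.to_param : record.id
  "#{base_url}/#{path}/#{param}"
end

# Hypothetical record with a to_param method (as ActiveRecord models have).
Post = Struct.new(:id, :slug) do
  def to_param
    slug
  end
end

# Hypothetical record with only an id.
Page = Struct.new(:id)

build_url('http://example.com', 'posts', Post.new(1, 'hello-world'))
# => "http://example.com/posts/hello-world"
build_url('http://example.com', 'pages', Page.new(42))
# => "http://example.com/pages/42"
```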

BigSitemap knows about the document root of Rails and Merb. If you are using another framework then you can specify the document root with the <code>:document_root</code> option. e.g.:

<pre>
BigSitemap.new(:base_url => 'http://example.com', :document_root => "#{FOO_ROOT}/httpdocs")
</pre>

By default, the sitemap files are created under <code>/sitemaps</code>. You can modify this with the <code>:path</code> option:

<pre>
BigSitemap.new(:base_url => 'http://example.com', :path => 'google-sitemaps') # places Sitemaps under /google-sitemaps
</pre>

Sitemaps will be split across several files if more than 50,000 records are returned. You can customize this limit with the <code>:max_per_sitemap</code> option:

<pre>
BigSitemap.new(:base_url => 'http://example.com', :max_per_sitemap => 1000) # Max of 1000 URLs per Sitemap
</pre>

The database is queried in batches to prevent large SQL select statements from locking the database for too long. By default, the batch size is 1001 (not 1000 due to an obscure bug in DataMapper that appears when an offset of 37000 is used). You can customize the batch size with the <code>:batch_size</code> option:

<pre>
BigSitemap.new(:base_url => 'http://example.com', :batch_size => 5000) # Database is queried in batches of 5,000
</pre>
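
To see how <code>:max_per_sitemap</code> and <code>:batch_size</code> interact, the sketch below plans the <code>:offset</code>/<code>:limit</code> pairs for one model. It is a simplified illustration, not the library's own code: <code>plan_batches</code> is a hypothetical helper, and BigSitemap itself derives the batch ranges differently (from a batches-per-sitemap ratio).

```ruby
# Hypothetical helper, not BigSitemap's internal algorithm.
# Returns one array per sitemap file; each entry is the :offset/:limit
# pair for a single database query.
def plan_batches(count, max_per_sitemap, batch_size)
  if batch_size > max_per_sitemap
    raise ArgumentError, 'batch_size must not exceed max_per_sitemap'
  end

  plan = []
  sitemap_start = 0
  while sitemap_start < count
    # Each sitemap file covers at most max_per_sitemap records.
    sitemap_end = [sitemap_start + max_per_sitemap, count].min
    batches = []
    offset = sitemap_start
    while offset < sitemap_end
      # The last batch of a file is shortened so it never overruns.
      limit = [batch_size, sitemap_end - offset].min
      batches << { :offset => offset, :limit => limit }
      offset += limit
    end
    plan << batches
    sitemap_start = sitemap_end
  end
  plan
end

plan = plan_batches(2500, 1000, 400)
plan.size # => 3 (sitemap files)
plan[2]   # => [{:offset=>2000, :limit=>400}, {:offset=>2400, :limit=>100}]
```

With 2,500 records, a cap of 1,000 URLs per file and batches of 400, the first two files each take three queries (400, 400 and 200 records) and the third takes two.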

Google, Yahoo!, MSN and Ask are pinged once the Sitemap files are generated. You can turn one or more of these off:

<pre>
BigSitemap.new(
  :base_url => 'http://example.com',
  :ping_google => false,
  :ping_yahoo => false,
  :ping_msn => false,
  :ping_ask => false
)
</pre>

You must provide an App ID in order to ping Yahoo! (more info at http://developer.yahoo.com/search/siteexplorer/V1/updateNotification.html):

<pre>
BigSitemap.new(:base_url => 'http://example.com', :yahoo_app_id => 'myYahooAppId') # Yahoo! will now be pinged
</pre>
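
The way the ping options combine can be sketched without touching the network. In the example below, <code>ping_urls</code> is a hypothetical helper that only returns the URLs that would be requested. The hosts and paths are copied from <code>lib/big_sitemap.rb</code> in this release and may no longer be live. The sketch also swaps in <code>URI.encode_www_form_component</code> for escaping, whereas the library uses <code>URI.escape</code>.

```ruby
require 'uri'

# Hypothetical helper: lists the ping URLs for a sitemap index without
# performing any HTTP requests.
def ping_urls(sitemap_url, options = {})
  escaped = URI.encode_www_form_component(sitemap_url)
  # Every ping option defaults to true when it is not given.
  enabled = lambda { |key| options[key].nil? ? true : options[key] }

  urls = []
  if enabled.call(:ping_google)
    urls << "http://www.google.com/webmasters/tools/ping?sitemap=#{escaped}"
  end
  # Yahoo! is only pinged when an App ID has been supplied.
  if enabled.call(:ping_yahoo) && options[:yahoo_app_id]
    urls << 'http://search.yahooapis.com/SiteExplorerService/V1/updateNotification' \
            "?appid=#{options[:yahoo_app_id]}&url=#{escaped}"
  end
  if enabled.call(:ping_msn)
    urls << "http://webmaster.live.com/ping.aspx?siteMap=#{escaped}"
  end
  if enabled.call(:ping_ask)
    urls << "http://submissions.ask.com/ping?sitemap=#{escaped}"
  end
  urls
end

ping_urls('http://example.com/sitemaps/sitemap_index.xml.gz',
          :ping_msn => false, :ping_ask => false)
# Only the Google URL is returned: Yahoo! is skipped because no App ID was given.
```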

## LIMITATIONS:

If your database is likely to shrink during the time it takes to create the sitemap then you might run into problems (the final, batched SQL select will overrun by setting a limit that is too large since it is calculated from the count, which is queried at the very beginning). Patches welcome!

## TODO

* Support for priority and changefreq (currently hard-coded to 'weekly')

## CREDITS

Thanks to Alastair Brunton and Harry Love, whose work provided a starting point for this library.
http://scoop.cheerfactory.co.uk/2008/02/26/google-sitemap-generator/

## COPYRIGHT

Copyright (c) 2009 Stateless Systems (http://statelesssystems.com). See LICENSE for details.
data/VERSION.yml
ADDED
data/lib/big_sitemap.rb
ADDED
@@ -0,0 +1,200 @@
require 'net/http'
require 'uri'
require 'zlib'
require 'builder'
require 'extlib'

class BigSitemap
  def initialize(options)
    document_root = options.delete(:document_root)

    if document_root.nil?
      if defined? RAILS_ROOT
        document_root = "#{RAILS_ROOT}/public"
      elsif defined? Merb
        document_root = "#{Merb.root}/public"
      end
    end

    raise ArgumentError, 'Document root must be specified with the :document_root option' if document_root.nil?

    @base_url = options.delete(:base_url)
    @max_per_sitemap = options.delete(:max_per_sitemap) || 50000
    @batch_size = options.delete(:batch_size) || 1001 # TODO: Set this to 1000 once DM offset 37000 bug is fixed
    @web_path = options.delete(:path) || 'sitemaps'
    @ping_google = options[:ping_google].nil? ? true : options.delete(:ping_google)
    @ping_yahoo = options[:ping_yahoo].nil? ? true : options.delete(:ping_yahoo)
    @yahoo_app_id = options.delete(:yahoo_app_id)
    @ping_msn = options[:ping_msn].nil? ? true : options.delete(:ping_msn)
    @ping_ask = options[:ping_ask].nil? ? true : options.delete(:ping_ask)
    @file_path = "#{document_root}/#{@web_path}"
    @sources = []

    raise ArgumentError, "Base URL must be specified with the :base_url option" if @base_url.nil?

    raise(
      ArgumentError,
      'Batch size (:batch_size) must be less than or equal to maximum URLs per sitemap (:max_per_sitemap)'
    ) if @batch_size > @max_per_sitemap

    unless File.exists? @file_path
      Dir.mkdir(@file_path)
    end
  end

  def add(options)
    raise ArgumentError, ':model and :path options must be provided' unless options[:model] && options[:path]
    @sources << options
  end

  def generate
    paths = []
    sitemaps = []

    @sources.each do |source|
      klass = source[:model]

      count_method = pick_method(klass, [:count_for_sitemap, :count])
      find_method = pick_method(klass, [:find_for_sitemap, :all])
      raise ArgumentError, "#{klass} must provide a count_for_sitemap class method" if count_method.nil?
      raise ArgumentError, "#{klass} must provide a find_for_sitemap class method" if find_method.nil?

      count = klass.send(count_method)
      num_sitemaps = 1
      num_batches = 1

      if count > @batch_size
        num_batches = (count.to_f / @batch_size.to_f).ceil
        num_sitemaps = (count.to_f / @max_per_sitemap.to_f).ceil
      end
      batches_per_sitemap = num_batches.to_f / num_sitemaps.to_f

      # Update the @sources hash so that the index file knows how many sitemaps to link to
      source[:num_sitemaps] = num_sitemaps

      for sitemap_num in 1..num_sitemaps
        # Work out the start and end batch numbers for this sitemap
        batch_num_start = sitemap_num == 1 ? 1 : ((sitemap_num * batches_per_sitemap).ceil - batches_per_sitemap + 1).to_i
        batch_num_end = (batch_num_start + [batches_per_sitemap, num_batches].min).floor - 1

        # Stream XML output to a file
        filename = "sitemap_#{Extlib::Inflection::underscore(klass.to_s)}"
        filename << "_#{sitemap_num}" if num_sitemaps > 1

        gz = gz_writer("#{filename}.xml.gz")

        xml = Builder::XmlMarkup.new(:target => gz)
        xml.instruct!
        xml.urlset(:xmlns => 'http://www.sitemaps.org/schemas/sitemap/0.9') do
          for batch_num in batch_num_start..batch_num_end
            offset = ((batch_num - 1) * @batch_size)
            limit = (count - offset) < @batch_size ? (count - offset - 1) : @batch_size
            find_options = num_batches > 1 ? {:limit => limit, :offset => offset} : {}

            klass.send(find_method, find_options).each do |r|
              last_mod_method = pick_method(
                r,
                [:updated_at, :updated_on, :updated, :created_at, :created_on, :created]
              )
              last_mod = last_mod_method.nil? ? Time.now : r.send(last_mod_method)

              param_method = pick_method(r, [:to_param, :id])
              raise ArgumentError, "#{klass} must provide a to_param instance method" if param_method.nil?

              path = {:url => "#{source[:path]}/#{r.send(param_method)}", :last_mod => last_mod}

              xml.url do
                xml.loc("#{@base_url}/#{path[:url]}")
                xml.lastmod(path[:last_mod].strftime('%Y-%m-%d')) unless path[:last_mod].nil?
                xml.changefreq('weekly')
              end
            end
          end
        end

        gz.close
      end

    end

    generate_sitemap_index
    ping_search_engines
  end

  private
  def pick_method(klass, candidates)
    method = nil
    candidates.each do |candidate|
      if klass.respond_to? candidate
        method = candidate
        break
      end
    end
    method
  end

  def gz_writer(filename)
    Zlib::GzipWriter.new(File.open("#{@file_path}/#{filename}", 'w+'))
  end

  def sitemap_index_filename
    'sitemap_index.xml.gz'
  end

  # Create a sitemap index document
  def generate_sitemap_index
    xml = ''
    builder = Builder::XmlMarkup.new(:target => xml)
    builder.instruct!
    builder.sitemapindex(:xmlns => 'http://www.sitemaps.org/schemas/sitemap/0.9') do
      @sources.each do |source|
        num_sitemaps = source[:num_sitemaps]
        for i in 1..num_sitemaps
          loc = "#{@base_url}/#{@web_path}/sitemap_#{Extlib::Inflection::underscore(source[:model].to_s)}"
          loc << "_#{i}" if num_sitemaps > 1
          loc << '.xml.gz'

          builder.sitemap do
            builder.loc(loc)
            builder.lastmod(Time.now.strftime('%Y-%m-%d'))
          end
        end
      end
    end

    gz = gz_writer(sitemap_index_filename)
    gz.write(xml)
    gz.close
  end

  def sitemap_uri
    URI.escape("#{@base_url}/#{@web_path}/#{sitemap_index_filename}")
  end

  # Notify Google of the new sitemap index file
  def ping_google
    Net::HTTP.get('www.google.com', "/webmasters/tools/ping?sitemap=#{sitemap_uri}")
  end

  # Notify Yahoo! of the new sitemap index file
  def ping_yahoo
    Net::HTTP.get('search.yahooapis.com', "/SiteExplorerService/V1/updateNotification?appid=#{@yahoo_app_id}&url=#{sitemap_uri}")
  end

  # Notify MSN of the new sitemap index file
  def ping_msn
    Net::HTTP.get('webmaster.live.com', "/ping.aspx?siteMap=#{sitemap_uri}")
  end

  # Notify Ask of the new sitemap index file
  def ping_ask
    Net::HTTP.get('submissions.ask.com', "/ping?sitemap=#{sitemap_uri}")
  end

  def ping_search_engines
    ping_google if @ping_google
    ping_yahoo if @ping_yahoo && @yahoo_app_id
    ping_msn if @ping_msn
    ping_ask if @ping_ask
  end
end
data/test/big_sitemap_test.rb
ADDED
@@ -0,0 +1,177 @@
require File.dirname(__FILE__) + '/test_helper'
require 'nokogiri'

class BigSitemapTest < Test::Unit::TestCase
  def setup
    delete_tmp_files
  end

  def teardown
    delete_tmp_files
  end

  should 'raise an error if the :base_url option is not specified' do
    assert_nothing_raised { BigSitemap.new(:base_url => 'http://example.com', :document_root => tmp_dir) }
    assert_raise(ArgumentError) { BigSitemap.new(:document_root => tmp_dir) }
  end

  should 'generate a sitemap index file' do
    generate_sitemap_files
    assert File.exists?(sitemaps_index_file)
  end

  should 'generate a single sitemap model file' do
    create_sitemap
    add_model
    @sitemap.generate
    assert File.exists?(single_sitemaps_model_file), "#{single_sitemaps_model_file} exists"
  end

  should 'generate exactly two sitemap model files' do
    generate_exactly_two_model_sitemap_files
    assert File.exists?(first_sitemaps_model_file), "#{first_sitemaps_model_file} exists"
    assert File.exists?(second_sitemaps_model_file), "#{second_sitemaps_model_file} exists"
    third_sitemaps_model_file = "#{sitemaps_dir}/sitemap_test_model_3.xml.gz"
    assert !File.exists?(third_sitemaps_model_file), "#{third_sitemaps_model_file} does not exist"
  end

  context 'Sitemap index file' do
    should 'contain one sitemapindex element' do
      generate_sitemap_files
      assert_equal 1, num_elements(sitemaps_index_file, 'sitemapindex')
    end

    should 'contain one sitemap element' do
      generate_sitemap_files
      assert_equal 1, num_elements(sitemaps_index_file, 'sitemap')
    end

    should 'contain one loc element' do
      generate_sitemap_files
      assert_equal 1, num_elements(sitemaps_index_file, 'loc')
    end

    should 'contain one lastmod element' do
      generate_sitemap_files
      assert_equal 1, num_elements(sitemaps_index_file, 'lastmod')
    end

    should 'contain two loc elements' do
      generate_exactly_two_model_sitemap_files
      assert_equal 2, num_elements(sitemaps_index_file, 'loc')
    end

    should 'contain two lastmod elements' do
      generate_exactly_two_model_sitemap_files
      assert_equal 2, num_elements(sitemaps_index_file, 'lastmod')
    end
  end

  context 'Sitemap model file' do
    should 'contain one urlset element' do
      generate_sitemap_files
      assert_equal 1, num_elements(single_sitemaps_model_file, 'urlset')
    end

    should 'contain several loc elements' do
      generate_sitemap_files
      assert_equal default_num_items, num_elements(single_sitemaps_model_file, 'loc')
    end

    should 'contain several lastmod elements' do
      generate_sitemap_files
      assert_equal default_num_items, num_elements(single_sitemaps_model_file, 'lastmod')
    end

    should 'contain several changefreq elements' do
      generate_sitemap_files
      assert_equal default_num_items, num_elements(single_sitemaps_model_file, 'changefreq')
    end

    should 'contain one loc element' do
      generate_exactly_two_model_sitemap_files
      assert_equal 1, num_elements(first_sitemaps_model_file, 'loc')
      assert_equal 1, num_elements(second_sitemaps_model_file, 'loc')
    end

    should 'contain one lastmod element' do
      generate_exactly_two_model_sitemap_files
      assert_equal 1, num_elements(first_sitemaps_model_file, 'lastmod')
      assert_equal 1, num_elements(second_sitemaps_model_file, 'lastmod')
    end

    should 'contain one changefreq element' do
      generate_exactly_two_model_sitemap_files
      assert_equal 1, num_elements(first_sitemaps_model_file, 'changefreq')
      assert_equal 1, num_elements(second_sitemaps_model_file, 'changefreq')
    end
  end

  private
  def delete_tmp_files
    FileUtils.rm_rf(sitemaps_dir)
  end

  def create_sitemap(options={})
    @sitemap = BigSitemap.new({
      :base_url => 'http://example.com',
      :document_root => tmp_dir,
      :update_google => false
    }.update(options))
  end

  def generate_sitemap_files
    create_sitemap
    add_model
    @sitemap.generate
  end

  def generate_exactly_two_model_sitemap_files
    create_sitemap(:max_per_sitemap => 1, :batch_size => 1)
    add_model(:num_items => 2)
    @sitemap.generate
  end

  def add_model(options={})
    num_items = options.delete(:num_items) || default_num_items
    TestModel.stubs(:num_items).returns(num_items)
    @sitemap.add({:model => TestModel, :path => 'test_controller'}.update(options))
  end

  def default_num_items
    10
  end

  def sitemaps_index_file
    "#{sitemaps_dir}/sitemap_index.xml.gz"
  end

  def single_sitemaps_model_file
    "#{sitemaps_dir}/sitemap_test_model.xml.gz"
  end

  def first_sitemaps_model_file
    "#{sitemaps_dir}/sitemap_test_model_1.xml.gz"
  end

  def second_sitemaps_model_file
    "#{sitemaps_dir}/sitemap_test_model_2.xml.gz"
  end

  def sitemaps_dir
    "#{tmp_dir}/sitemaps"
  end

  def tmp_dir
    '/tmp'
  end

  def ns
    {'s' => 'http://www.sitemaps.org/schemas/sitemap/0.9'}
  end

  def num_elements(filename, el)
    data = Nokogiri::XML.parse(Zlib::GzipReader.open(filename).read)
    data.search("//s:#{el}", ns).size
  end
end
data/test/fixtures/test_model.rb
ADDED
@@ -0,0 +1,18 @@
class TestModel
  def to_param
    object_id
  end

  class << self
    def count_for_sitemap
      self.find_for_sitemap.size
    end

    def find_for_sitemap(options={})
      instances = []
      num_times = options.delete(:limit) || self.num_items
      num_times.times { instances.push(self.new) }
      instances
    end
  end
end
data/test/test_helper.rb
ADDED
metadata
ADDED
@@ -0,0 +1,79 @@
--- !ruby/object:Gem::Specification
name: alexrabarts-big_sitemap
version: !ruby/object:Gem::Version
  version: 0.1.3
platform: ruby
authors:
- Alex Rabarts
autorequire:
bindir: bin
cert_chain: []

date: 2009-03-10 00:00:00 -07:00
default_executable:
dependencies:
- !ruby/object:Gem::Dependency
  name: builder
  type: :runtime
  version_requirement:
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 2.1.2
    version:
- !ruby/object:Gem::Dependency
  name: extlib
  type: :runtime
  version_requirement:
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 0.9.9
    version:
description: A Sitemap generator specifically designed for large sites (although it works equally well with small sites)
email: alexrabarts@gmail.com
executables: []

extensions: []

extra_rdoc_files: []

files:
- VERSION.yml
- README.markdown
- lib/big_sitemap.rb
- test/fixtures
- test/fixtures/test_model.rb
- test/big_sitemap_test.rb
- test/test_helper.rb
has_rdoc: true
homepage: http://github.com/alexrabarts/big_sitemap
post_install_message:
rdoc_options:
- --inline-source
- --charset=UTF-8
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: "0"
  version:
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: "0"
  version:
requirements: []

rubyforge_project:
rubygems_version: 1.2.0
signing_key:
specification_version: 2
summary: A Sitemap generator specifically designed for large sites (although it works equally well with small sites)
test_files: []
