sitemap_generator 3.1.1 → 3.2

@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: ./
3
3
  specs:
4
- sitemap_generator (3.1.1)
4
+ sitemap_generator (3.2)
5
5
  builder
6
6
 
7
7
  GEM
data/README.md CHANGED
@@ -1,98 +1,104 @@
1
- SitemapGenerator
2
- ================
1
+ # SitemapGenerator
3
2
 
4
3
  SitemapGenerator is the easiest way to generate Sitemaps in Ruby. Rails integration provides access to the Rails route helpers within your sitemap config file and automatically makes the rake tasks available to you. Or if you prefer to use another framework, you can! You can use the rake tasks provided or run your sitemap configs as plain ruby scripts.
5
4
 
6
5
  Sitemaps adhere to the [Sitemap 0.9 protocol][sitemap_protocol] specification.
7
6
 
8
- Features
9
- -------
7
+ ## Features
10
8
 
11
- - Framework agnostic
12
- - Supports [News sitemaps][sitemap_news], [Video sitemaps][sitemap_video], [Image sitemaps][sitemap_images], and [Geo sitemaps][sitemap_geo]
13
- - Supports read-only filesystems like Heroku via uploading to a remote host like Amazon S3
14
- - Compatible with Rails 2 & 3
15
- - Adheres to the [Sitemap 0.9 protocol][sitemap_protocol]
16
- - Handles millions of links
17
- - Automatically compresses your sitemaps
18
- - Notifies search engines (Google, Bing, Ask, SitemapWriter) of new sitemaps
19
- - Ensures your old sitemaps stay in place if the new sitemap fails to generate
20
- - Gives you complete control over your sitemaps and their content
9
+ * Framework agnostic
10
+ * Supports [News sitemaps][sitemap_news], [Video sitemaps][sitemap_video], [Image sitemaps][sitemap_images], [Geo sitemaps][sitemap_geo] and [Mobile sitemaps][sitemap_mobile]
11
+ * Supports read-only filesystems like Heroku via uploading to a remote host like Amazon S3
12
+ * Compatible with Rails 2 & 3
13
+ * Adheres to the [Sitemap 0.9 protocol][sitemap_protocol]
14
+ * Handles millions of links
15
+ * Automatically compresses your sitemaps
16
+ * Notifies search engines (Google, Bing, SitemapWriter) of new sitemaps
17
+ * Ensures your old sitemaps stay in place if the new sitemap fails to generate
18
+ * Gives you complete control over your sitemaps and their content
21
19
 
22
- Show Me
23
- -------
20
+
21
+ ### Show Me
24
22
 
25
23
  Install:
26
24
 
27
- gem install sitemap_generator
25
+ ```
26
+ gem install sitemap_generator
27
+ ```
28
28
 
29
29
  Create `sitemap.rb`:
30
30
 
31
- require 'rubygems'
32
- require 'sitemap_generator'
31
+ ```ruby
32
+ require 'rubygems'
33
+ require 'sitemap_generator'
33
34
 
34
- SitemapGenerator::Sitemap.default_host = 'http://example.com'
35
- SitemapGenerator::Sitemap.create do
36
- add '/home', :changefreq => 'daily', :priority => 0.9
37
- add '/contact_us', :changefreq => 'weekly'
38
- end
39
- SitemapGenerator::Sitemap.ping_search_engines # called for you when you use the rake task
35
+ SitemapGenerator::Sitemap.default_host = 'http://example.com'
36
+ SitemapGenerator::Sitemap.create do
37
+ add '/home', :changefreq => 'daily', :priority => 0.9
38
+ add '/contact_us', :changefreq => 'weekly'
39
+ end
40
+ SitemapGenerator::Sitemap.ping_search_engines # called for you when you use the rake task
41
+ ```
40
42
 
41
43
  Run it:
42
44
 
43
- ruby sitemap.rb
45
+ ```
46
+ ruby sitemap.rb
47
+ ```
44
48
 
45
49
  Output:
46
50
 
47
- In /Users/karl/projects/sitemap_generator-test/public/
48
- + sitemap1.xml.gz 4 links / 357 Bytes
49
- + sitemap_index.xml.gz 1 sitemaps / 228 Bytes
50
- Sitemap stats: 4 links / 1 sitemaps / 0m00s
51
+ ```
52
+ In /Users/karl/projects/sitemap_generator-test/public/
53
+ + sitemap1.xml.gz 3 links / 357 Bytes
54
+ + sitemap_index.xml.gz 1 sitemaps / 228 Bytes
55
+ Sitemap stats: 3 links / 1 sitemaps / 0m00s
56
+
57
+ Successful ping of Google
58
+ Successful ping of Bing
59
+ Successful ping of Sitemap Writer
60
+ ```
51
61
 
52
- Successful ping of Google
53
- Successful ping of Ask
54
- Successful ping of Bing
55
- Successful ping of Sitemap Writer
56
62
 
57
- Contribute
58
- -------
63
+ ## Contribute
59
64
 
60
65
  Does your website use SitemapGenerator to generate Sitemaps? Where would you be without Sitemaps? Probably still knocking rocks together. Consider donating to the project to keep it up-to-date and open source.
61
66
 
62
67
  <a href='http://www.pledgie.com/campaigns/15267'><img alt='Click here to lend your support to: SitemapGenerator and make a donation at www.pledgie.com !' src='http://pledgie.com/campaigns/15267.png?skin_name=chrome' border='0' /></a>
63
68
 
64
- Changelog
65
- -------
66
69
 
67
- - v3.1.1: Bugfix: Groups inherit current adapter
68
- - v3.1.0: Add `add_to_index` method to add links to the sitemap index. Add `sitemap` method for accessing the LinkSet instance from within `create()`. Don't modify options hashes passed to methods. Fix and improve `yield_sitemap` option handling.
69
- - **v3.0.0: Framework agnostic**; fix alignment in output, show directory sitemaps are being generated into, only show sitemap compressed file size; toggle output using VERBOSE environment variable; remove tasks/ directory because it's deprecated in Rails 2; Simplify dependencies.
70
- - v2.2.1: Support adding new search engines to ping and modifying the default search engines.
70
+ ## Changelog
71
+
72
+ * v3.2: **Support mobile tags**, **SitemapGenerator::S3Adapter** a simple S3 adapter which uses Fog and doesn't require CarrierWave; Remove Ask from the sitemap ping because the service has been shut down; [Turn off `include_index`][include_index_change] by default; Fix the news XML namespace; Only include autoplay attribute if present
73
+ * v3.1.1: Bugfix: Groups inherit current adapter
74
+ * v3.1.0: Add `add_to_index` method to add links to the sitemap index. Add `sitemap` method for accessing the LinkSet instance from within `create()`. Don't modify options hashes passed to methods. Fix and improve `yield_sitemap` option handling.
75
+ * **v3.0.0: Framework agnostic**; fix alignment in output, show directory sitemaps are being generated into, only show sitemap compressed file size; toggle output using VERBOSE environment variable; remove tasks/ directory because it's deprecated in Rails 2; Simplify dependencies.
76
+ * v2.2.1: Support adding new search engines to ping and modifying the default search engines.
71
77
  Allow the URL of the sitemap index to be passed as an argument to `ping_search_engines`. See **Pinging Search Engines**.
72
- - v2.1.8: Extend and improve Video Sitemap support. Include sitemap docs in the README, support all element attributes, properly format values.
73
- - v2.1.7: Improve format of float priorities; Remove Yahoo from ping - the Yahoo
78
+ * v2.1.8: Extend and improve Video Sitemap support. Include sitemap docs in the README, support all element attributes, properly format values.
79
+ * v2.1.7: Improve format of float priorities; Remove Yahoo from ping - the Yahoo
74
80
  service has been shut down.
75
- - v2.1.6: Fix the lastmod value on sitemap file links
76
- - v2.1.5: Fix verbose setting in the rake tasks; should default to true
77
- - v2.1.4: Allow special characters in URLs (don't use URI.join to construct URLs)
78
- - v2.1.3: Fix calling create with both `filename` and `sitemaps_namer` options
79
- - v2.1.2: Support multiple videos per url using the new `videos` option to `add()`.
80
- - v2.1.1: Support calling `create()` multiple times in a sitemap config. Support host names with path segments so you can use a `default_host` like `'http://mysite.com/subdirectory/'`. Turn off `include_index` when the `sitemaps_host` differs from `default_host`. Add docs about how to upload to remote hosts.
81
- - v2.1.0: [News sitemap][sitemap_news] support
82
- - v2.0.1.pre2: Fix uploading to the (bucket) root on a remote server
83
- - v2.0.1.pre1: Support read-only filesystems like Heroku by supporting uploading to remote host
84
- - v2.0.1: Minor improvements to verbose handling; prevent missing Timeout issue
85
- - **v2.0.0: Introducing a new simpler API, Sitemap Groups, Sitemap Namers and more!**
86
- - v1.5.0: New options `include_root`, `include_index`; Major testing & refactoring
87
- - v1.4.0: [Geo sitemap][geo_tags] support, multiple sitemap support via CONFIG_FILE rake option
88
- - v1.3.0: Support setting the sitemaps path
89
- - v1.2.0: Verified working with Rails 3 stable release
90
- - v1.1.0: [Video sitemap][sitemap_video] support
91
- - v0.2.6: [Image Sitemap][sitemap_images] support
92
- - v0.2.5: Rails 3 prerelease support (beta)
93
-
94
- Foreword
95
- -------
81
+ * v2.1.6: Fix the lastmod value on sitemap file links
82
+ * v2.1.5: Fix verbose setting in the rake tasks; should default to true
83
+ * v2.1.4: Allow special characters in URLs (don't use URI.join to construct URLs)
84
+ * v2.1.3: Fix calling create with both `filename` and `sitemaps_namer` options
85
+ * v2.1.2: Support multiple videos per url using the new `videos` option to `add()`.
86
+ * v2.1.1: Support calling `create()` multiple times in a sitemap config. Support host names with path segments so you can use a `default_host` like `'http://mysite.com/subdirectory/'`. Turn off `include_index` when the `sitemaps_host` differs from `default_host`. Add docs about how to upload to remote hosts.
87
+ * v2.1.0: [News sitemap][sitemap_news] support
88
+ * v2.0.1.pre2: Fix uploading to the (bucket) root on a remote server
89
+ * v2.0.1.pre1: Support read-only filesystems like Heroku by supporting uploading to remote host
90
+ * v2.0.1: Minor improvements to verbose handling; prevent missing Timeout issue
91
+ * **v2.0.0: Introducing a new simpler API, Sitemap Groups, Sitemap Namers and more!**
92
+ * v1.5.0: New options `include_root`, `include_index`; Major testing & refactoring
93
+ * v1.4.0: [Geo sitemap][geo_tags] support, multiple sitemap support via CONFIG_FILE rake option
94
+ * v1.3.0: Support setting the sitemaps path
95
+ * v1.2.0: Verified working with Rails 3 stable release
96
+ * v1.1.0: [Video sitemap][sitemap_video] support
97
+ * v0.2.6: [Image Sitemap][sitemap_images] support
98
+ * v0.2.5: Rails 3 prerelease support (beta)
99
+
100
+
101
+ ## Foreword
96
102
 
97
103
  Adam Salter first created SitemapGenerator while we were working together in Sydney, Australia. Unfortunately, he passed away in 2009. Since then I have taken over development of SitemapGenerator.
98
104
 
@@ -100,46 +106,53 @@ Those who knew him know what an amazing guy he was, and what an excellent Rails
100
106
 
101
107
  The canonical repository is now: [http://github.com/kjvarga/sitemap_generator][canonical_repo]
102
108
 
103
- Install
104
- =======
105
109
 
106
- Ruby
107
- -----
110
+ ## Install
111
+
112
+ ### Ruby
108
113
 
109
- gem install 'sitemap_generator'
114
+ ```
115
+ gem install 'sitemap_generator'
116
+ ```
110
117
 
111
118
  To use the rake tasks add the following to your `Rakefile`:
112
119
 
113
- require 'sitemap_generator/tasks'
120
+ ```ruby
121
+ require 'sitemap_generator/tasks'
122
+ ```
114
123
 
115
124
  The Rake tasks expect your sitemap to be at `config/sitemap.rb` but if you need to change that call like so: `rake sitemap:refresh CONFIG_FILE="path/to/sitemap.rb"`
116
125
 
117
- Rails
118
- -----
126
+ ### Rails
119
127
 
120
128
  Add the gem to your `Gemfile`:
121
129
 
122
- gem 'sitemap_generator'
130
+ ```ruby
131
+ gem 'sitemap_generator'
132
+ ```
123
133
 
124
134
  Alternatively, if you are not using a `Gemfile` add the gem to your `config/environment.rb` file config block:
125
135
 
126
- config.gem 'sitemap_generator'
136
+ ```ruby
137
+ config.gem 'sitemap_generator'
138
+ ```
139
+
127
140
 
128
141
  **Rails 1 or 2 only**, add the following code to your `Rakefile` to include the gem's Rake tasks in your project (Rails 3 does this for you automatically, so this step is not necessary):
129
142
 
130
- begin
131
- require 'sitemap_generator/tasks'
132
- rescue Exception => e
133
- puts "Warning, couldn't load gem tasks: #{e.message}! Skipping..."
134
- end
143
+ ```ruby
144
+ begin
145
+ require 'sitemap_generator/tasks'
146
+ rescue Exception => e
147
+ puts "Warning, couldn't load gem tasks: #{e.message}! Skipping..."
148
+ end
149
+ ```
135
150
 
136
- <i>If you would prefer to install as a plugin (deprecated) don't do any of the above. Simply run `script/plugin install git://github.com/kjvarga/sitemap_generator.git` from your application root directory.</i>
151
+ _If you would prefer to install as a plugin (deprecated) don't do any of the above. Simply run `script/plugin install git://github.com/kjvarga/sitemap_generator.git` from your application root directory._
137
152
 
138
- Getting Started
139
- ======
153
+ ## Getting Started
140
154
 
141
- Preventing Output
142
- -----
155
+ ### Preventing Output
143
156
 
144
157
  To disable all non-essential output set the environment variable `VERBOSE=false` when calling Rake or running your Ruby script.
145
158
 
@@ -147,65 +160,73 @@ Alternatively you can pass the `-s` option to Rake, for example `rake -s sitemap
147
160
 
148
161
  To disable output in-code use the following:
149
162
 
150
- SitemapGenerator.verbose = false
151
-
152
- Rake Tasks
153
- -----
154
-
155
- Run `rake sitemap:install` to create a `config/sitemap.rb` file which is your sitemap configuration and contains everything needed to build your sitemap. See **Sitemap Configuration** below for more information about how to define your sitemap.
163
+ ```ruby
164
+ SitemapGenerator.verbose = false
165
+ ```
156
166
 
157
- Run `rake sitemap:refresh` as needed to create or rebuild your sitemap files. Sitemaps are generated into the `public/` folder and by default are named `sitemap_index.xml.gz`, `sitemap1.xml.gz`, `sitemap2.xml.gz`, etc. As you can see they are automatically gzip compressed for you.
167
+ ### Rake Tasks
158
168
 
159
- `rake sitemap:refresh` will output information about each sitemap that is written including its location, how many links it contains and the size of the file.
169
+ * `rake sitemap:install` will create a `config/sitemap.rb` file which is your sitemap configuration and contains everything needed to build your sitemap. See **Sitemap Configuration** below for more information about how to define your sitemap.
170
+ * `rake sitemap:refresh` will create or rebuild your sitemap files as needed. Sitemaps are generated into the `public/` folder and by default are named `sitemap_index.xml.gz`, `sitemap1.xml.gz`, `sitemap2.xml.gz`, etc. As you can see they are automatically gzip compressed for you.
171
+ * `rake sitemap:refresh` will output information about each sitemap that is written including its location, how many links it contains and the size of the file.
160
172
 
161
173
 
162
- Pinging Search Engines
163
- -----
174
+ ### Pinging Search Engines
164
175
 
165
- Using `rake sitemap:refresh` will notify major search engines to let them know that a new sitemap is available (Google, Bing, Ask, SitemapWriter). To generate new sitemaps without notifying search engines (for example when running in a local environment) use `rake sitemap:refresh:no_ping`.
176
+ Using `rake sitemap:refresh` will notify major search engines to let them know that a new sitemap is available (Google, Bing, SitemapWriter). To generate new sitemaps without notifying search engines (for example when running in a local environment) use `rake sitemap:refresh:no_ping`.
166
177
 
167
178
  If you want to customize the hash of search engines you can access it at:
168
179
 
169
- SitemapGenerator::Sitemap.search_engines
180
+ ```ruby
181
+ SitemapGenerator::Sitemap.search_engines
182
+ ```
170
183
 
171
184
  Usually you would be adding a new search engine to ping. In this case you can modify the `search_engines` hash directly. This ensures that when `SitemapGenerator::Sitemap.ping_search_engines` is called your new search engine will be included.
172
185
 
173
186
  If you are calling `ping_search_engines` manually (for example if you have to wait some time or perform a custom action after your sitemaps have been regenerated) then you can pass your new search engine directly in the call as in the following example:
174
187
 
175
- SitemapGenerator::Sitemap.ping_search_engines(:newengine => 'http://newengine.com/ping?url=%s')
188
+ ```ruby
189
+ SitemapGenerator::Sitemap.ping_search_engines(:newengine => 'http://newengine.com/ping?url=%s')
190
+ ```
176
191
 
177
192
  The key gives the name of the search engine as a string or symbol and the value is the full URL to ping with a string interpolation that will be replaced by the CGI escaped sitemap index URL. If you have any literal percent characters in your URL you need to escape them with `%%`.
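
To illustrate the `%s` placeholder and the `%%` escape described above, here is a minimal sketch using only Ruby's standard library. `build_ping_url` is a hypothetical helper written for this example, not part of SitemapGenerator, and the gem's internal implementation may differ:

```ruby
require 'cgi'

# Hypothetical helper (not part of SitemapGenerator): expands a ping URL
# template by substituting the CGI escaped sitemap index URL for %s.
def build_ping_url(template, sitemap_index_url)
  template % CGI.escape(sitemap_index_url)
end

index_url = 'http://example.com/sitemap_index.xml.gz'

# A template with a single %s placeholder.
ping_url = build_ping_url('http://newengine.com/ping?url=%s', index_url)

# A template containing a literal percent sign, which must be written as %%.
literal = build_ping_url('http://newengine.com/ping?rate=100%%&url=%s', index_url)

puts ping_url
puts literal
```

A single unescaped `%` in the template would be read as the start of a format directive, which is why literal percent characters have to be doubled.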
178
193
 
179
194
  If you are calling `SitemapGenerator::Sitemap.ping_search_engines` from outside of your sitemap config file then you will need to set `SitemapGenerator::Sitemap.default_host` and any other options that you set in your sitemap config which affect the location of the sitemap index file. For example:
180
195
 
181
- SitemapGenerator::Sitemap.default_host = 'http://example.com'
182
- SitemapGenerator::Sitemap.ping_search_engines
196
+ ```ruby
197
+ SitemapGenerator::Sitemap.default_host = 'http://example.com'
198
+ SitemapGenerator::Sitemap.ping_search_engines
199
+ ```
183
200
 
184
201
  Alternatively you can pass in the full URL to your sitemap index in which case we would have just the following:
185
202
 
186
- SitemapGenerator::Sitemap.ping_search_engines('http://example.com/sitemap_index.xml.gz')
203
+ ```ruby
204
+ SitemapGenerator::Sitemap.ping_search_engines('http://example.com/sitemap_index.xml.gz')
205
+ ```
187
206
 
188
- Crontab
189
- -----
207
+ ### Crontab
190
208
 
191
209
  To keep your sitemaps up-to-date, set up a cron job. Make sure to pass the `-s` option to silence rake. That way you will only get email if the sitemap build fails.
192
210
 
193
211
  If you're using Whenever, your schedule would look something like this:
194
212
 
195
- # config/schedule.rb
196
- every 1.day, :at => '5:00 am' do
197
- rake "-s sitemap:refresh"
198
- end
213
+ ```ruby
214
+ # config/schedule.rb
215
+ every 1.day, :at => '5:00 am' do
216
+ rake "-s sitemap:refresh"
217
+ end
218
+ ```
199
219
 
200
- Robots.txt
201
- ----------
220
+
221
+ ### Robots.txt
202
222
 
203
223
  You should add the URL of the sitemap index file to `public/robots.txt` to help search engines find your sitemaps. The URL should be the complete URL to the sitemap index. For example:
204
224
 
205
- Sitemap: http://www.example.com/sitemap_index.xml.gz
225
+ ```
226
+ Sitemap: http://www.example.com/sitemap_index.xml.gz
227
+ ```
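
If you would rather script this step than edit `robots.txt` by hand, something along these lines could work. This is only a sketch: `ensure_sitemap_directive` is a hypothetical helper that is not provided by SitemapGenerator, and the example runs against a temporary directory rather than a real `public/` folder:

```ruby
require 'tmpdir'

# Hypothetical helper (not part of SitemapGenerator): appends a Sitemap
# directive to robots.txt unless an identical line is already present.
# Returns true if the file was changed, false otherwise.
def ensure_sitemap_directive(robots_path, sitemap_index_url)
  directive = "Sitemap: #{sitemap_index_url}"
  existing = File.exist?(robots_path) ? File.read(robots_path) : ''
  return false if existing.lines.any? { |line| line.strip == directive }

  File.open(robots_path, 'a') do |file|
    # Start on a fresh line if the existing content lacks a trailing newline.
    file.puts unless existing.empty? || existing.end_with?("\n")
    file.puts directive
  end
  true
end

# Demonstrate against a throwaway robots.txt in a temporary directory.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'robots.txt')
  File.write(path, "User-agent: *\nDisallow:")
  ensure_sitemap_directive(path, 'http://www.example.com/sitemap_index.xml.gz')
  puts File.read(path)
end
```

Calling the helper a second time with the same URL leaves the file untouched, so it is safe to run on every deployment.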
206
228
 
207
- Deployments & Capistrano
208
- ----------
229
+ ## Deployments & Capistrano
209
230
 
210
231
  To ensure that your application's sitemaps are available after a deployment you can do one of the following:
211
232
 
@@ -213,29 +234,37 @@ To ensure that your application's sitemaps are available after a deployment you
213
234
 
214
235
  You can set your sitemaps path to your shared directory using the `sitemaps_path` option. For example if we have a directory `public/shared/` that is shared by all deployments we can have our sitemaps generated into that directory by setting:
215
236
 
216
- SitemapGenerator::Sitemap.sitemaps_path = 'shared/'
237
+ ```ruby
238
+ SitemapGenerator::Sitemap.sitemaps_path = 'shared/'
239
+ ```
217
240
 
218
241
  2. **Copy the sitemaps from the previous deploy over to the new deploy:**
219
242
 
220
243
  (You will need to customize the task if you are using custom sitemap filenames or locations.)
221
244
 
222
- after "deploy:update_code", "deploy:copy_old_sitemap"
223
- namespace :deploy do
224
- task :copy_old_sitemap do
225
- run "if [ -e #{previous_release}/public/sitemap_index.xml.gz ]; then cp #{previous_release}/public/sitemap* #{current_release}/public/; fi"
226
- end
227
- end
228
-
245
+ ```ruby
246
+ after "deploy:update_code", "deploy:copy_old_sitemap"
247
+ namespace :deploy do
248
+ task :copy_old_sitemap do
249
+ run "if [ -e #{previous_release}/public/sitemap_index.xml.gz ]; then cp #{previous_release}/public/sitemap* #{current_release}/public/; fi"
250
+ end
251
+ end
252
+ ```
229
253
 
230
254
  3. **Regenerate your sitemaps after each deployment:**
231
255
 
232
- after "deploy", "refresh_sitemaps"
233
- task :refresh_sitemaps do
234
- run "cd #{latest_release} && RAILS_ENV=#{rails_env} rake sitemap:refresh"
235
- end
256
+ ```ruby
257
+ after "deploy", "refresh_sitemaps"
258
+ task :refresh_sitemaps do
259
+ run "cd #{latest_release} && RAILS_ENV=#{rails_env} rake sitemap:refresh"
260
+ end
261
+ ```
262
+
263
+ ### Upload Sitemaps to a Remote Host
236
264
 
237
- Upload Sitemaps to a Remote Host
238
- ----------
265
+ > SitemapGenerator::S3Adapter is a simple S3 adapter which was added in v3.2 which
266
+ > uses Fog and doesn't require CarrierWave. You can find a bit more information
267
+ > about it [on the wiki page][remote_hosts].
239
268
 
240
269
  Sometimes it is desirable to host your sitemap files on a remote server and point robots
241
270
  and search engines to the remote files. For example if you are using a host like Heroku
@@ -258,25 +287,29 @@ Sitemap Generator uses CarrierWave to support uploading to Amazon S3 store, Rack
258
287
 
259
288
  For Example:
260
289
 
261
- SitemapGenerator::Sitemap.default_host = "http://www.example.com"
262
- SitemapGenerator::Sitemap.sitemaps_host = "http://s3.amazonaws.com/sitemap-generator/"
263
- SitemapGenerator::Sitemap.public_path = 'tmp/'
264
- SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
265
- SitemapGenerator::Sitemap.adapter = SitemapGenerator::WaveAdapter.new
290
+ ```ruby
291
+ SitemapGenerator::Sitemap.default_host = "http://www.example.com"
292
+ SitemapGenerator::Sitemap.sitemaps_host = "http://s3.amazonaws.com/sitemap-generator/"
293
+ SitemapGenerator::Sitemap.public_path = 'tmp/'
294
+ SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
295
+ SitemapGenerator::Sitemap.adapter = SitemapGenerator::WaveAdapter.new
296
+ ```
266
297
 
267
298
  3. Update your `robots.txt` file to point robots to the remote sitemap index file, e.g:
268
299
 
269
- Sitemap: http://s3.amazonaws.com/sitemap-generator/sitemaps/sitemap_index.xml.gz
300
+ ```
301
+ Sitemap: http://s3.amazonaws.com/sitemap-generator/sitemaps/sitemap_index.xml.gz
302
+ ```
270
303
 
271
304
  You generate your sitemaps as usual using `rake sitemap:refresh`.
272
305
 
273
306
  Note that SitemapGenerator will automatically turn off `include_index` in this case because
274
307
  the `sitemaps_host` does not match the `default_host`. The link to the sitemap index file
275
308
  that would otherwise be included would point to a different host than the rest of the links
276
- in the sitemap, something that the sitemap rules forbid.
309
+ in the sitemap, something that the sitemap rules forbid. (Since version 3.2 this is no
310
+ longer an issue because [`include_index` is off by default][include_index_change].)
277
311
 
278
- Generating Multiple Sitemaps
279
- ----------
312
+ ### Generating Multiple Sitemaps
280
313
 
281
314
  Each call to `create` creates a new sitemap index and associated sitemaps. You can call `create` as many times as you want within your sitemap configuration.
282
315
 
@@ -285,73 +318,85 @@ overwrite each other. You can use the `filename`, `sitemaps_namer` and `sitemap
285
318
 
286
319
  In the following example we generate three sitemaps each in its own subdirectory:
287
320
 
288
- %w(google bing apple).each do |subdomain|
289
- SitemapGenerator::Sitemap.default_host = "https://#{subdomain}.mysite.com"
290
- SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/#{subdomain}"
291
- SitemapGenerator::Sitemap.create do
292
- add '/home'
293
- end
294
- end
321
+ ```ruby
322
+ %w(google bing apple).each do |subdomain|
323
+ SitemapGenerator::Sitemap.default_host = "https://#{subdomain}.mysite.com"
324
+ SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/#{subdomain}"
325
+ SitemapGenerator::Sitemap.create do
326
+ add '/home'
327
+ end
328
+ end
329
+ ```
295
330
 
296
331
  Outputs:
297
332
 
298
- + sitemaps/google/sitemap1.xml.gz 2 links / 822 Bytes / 328 Bytes gzipped
299
- + sitemaps/google/sitemap_index.xml.gz 1 sitemaps / 389 Bytes / 217 Bytes gzipped
300
- Sitemap stats: 2 links / 1 sitemaps / 0m00s
301
- + sitemaps/bing/sitemap1.xml.gz 2 links / 820 Bytes / 330 Bytes gzipped
302
- + sitemaps/bing/sitemap_index.xml.gz 1 sitemaps / 388 Bytes / 217 Bytes gzipped
303
- Sitemap stats: 2 links / 1 sitemaps / 0m00s
304
- + sitemaps/apple/sitemap1.xml.gz 2 links / 820 Bytes / 330 Bytes gzipped
305
- + sitemaps/apple/sitemap_index.xml.gz 1 sitemaps / 388 Bytes / 214 Bytes gzipped
306
- Sitemap stats: 2 links / 1 sitemaps / 0m00s
333
+ ```
334
+ + sitemaps/google/sitemap1.xml.gz 2 links / 822 Bytes / 328 Bytes gzipped
335
+ + sitemaps/google/sitemap_index.xml.gz 1 sitemaps / 389 Bytes / 217 Bytes gzipped
336
+ Sitemap stats: 2 links / 1 sitemaps / 0m00s
337
+ + sitemaps/bing/sitemap1.xml.gz 2 links / 820 Bytes / 330 Bytes gzipped
338
+ + sitemaps/bing/sitemap_index.xml.gz 1 sitemaps / 388 Bytes / 217 Bytes gzipped
339
+ Sitemap stats: 2 links / 1 sitemaps / 0m00s
340
+ + sitemaps/apple/sitemap1.xml.gz 2 links / 820 Bytes / 330 Bytes gzipped
341
+ + sitemaps/apple/sitemap_index.xml.gz 1 sitemaps / 388 Bytes / 214 Bytes gzipped
342
+ Sitemap stats: 2 links / 1 sitemaps / 0m00s
343
+ ```
307
344
 
308
345
  If you don't want to have to generate all the sitemaps at once, or you want to refresh some more often than others, you can split them up into their own configuration files. Using the above example we would have:
309
346
 
310
- # config/google_sitemap.rb
311
- SitemapGenerator::Sitemap.default_host = "https://google.mysite.com"
312
- SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/google"
313
- SitemapGenerator::Sitemap.create do
314
- add '/home'
315
- end
316
-
317
- # config/apple_sitemap.rb
318
- SitemapGenerator::Sitemap.default_host = "https://apple.mysite.com"
319
- SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/apple"
320
- SitemapGenerator::Sitemap.create do
321
- add '/home'
322
- end
323
-
324
- # config/bing_sitemap.rb
325
- SitemapGenerator::Sitemap.default_host = "https://bing.mysite.com"
326
- SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/bing"
327
- SitemapGenerator::Sitemap.create do
328
- add '/home'
329
- end
347
+ ```ruby
348
+ # config/google_sitemap.rb
349
+ SitemapGenerator::Sitemap.default_host = "https://google.mysite.com"
350
+ SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/google"
351
+ SitemapGenerator::Sitemap.create do
352
+ add '/home'
353
+ end
354
+
355
+ # config/apple_sitemap.rb
356
+ SitemapGenerator::Sitemap.default_host = "https://apple.mysite.com"
357
+ SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/apple"
358
+ SitemapGenerator::Sitemap.create do
359
+ add '/home'
360
+ end
361
+
362
+ # config/bing_sitemap.rb
363
+ SitemapGenerator::Sitemap.default_host = "https://bing.mysite.com"
364
+ SitemapGenerator::Sitemap.sitemaps_path = "sitemaps/bing"
365
+ SitemapGenerator::Sitemap.create do
366
+ add '/home'
367
+ end
368
+ ```
369
+
330
370
 
331
371
  To generate each one specify the configuration file to run by passing the `CONFIG_FILE` option to `rake sitemap:refresh`, e.g.:
332
372
 
333
- rake sitemap:refresh CONFIG_FILE="config/google_sitemap.rb"
334
- rake sitemap:refresh CONFIG_FILE="config/apple_sitemap.rb"
335
- rake sitemap:refresh CONFIG_FILE="config/bing_sitemap.rb"
373
+ ```
374
+ rake sitemap:refresh CONFIG_FILE="config/google_sitemap.rb"
375
+ rake sitemap:refresh CONFIG_FILE="config/apple_sitemap.rb"
376
+ rake sitemap:refresh CONFIG_FILE="config/bing_sitemap.rb"
377
+ ```
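
When there are many configuration files, the commands above can be generated rather than typed out. The following is a small sketch that only builds and prints the command strings. The file naming pattern is taken from the example above, and actually running the commands (for example with `system`) is left to you:

```ruby
# Build one `rake sitemap:refresh` command per configuration file.
# The config/<name>_sitemap.rb naming follows the example above.
names = %w(google apple bing)
configs = names.map { |name| "config/#{name}_sitemap.rb" }
commands = configs.map { |path| %(rake sitemap:refresh CONFIG_FILE="#{path}") }

commands.each { |command| puts command }
```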
336
378
 
337
- Sitemap Configuration
338
- ======
379
+ ## Sitemap Configuration
339
380
 
340
381
  A sitemap configuration file contains all the information needed to generate your sitemaps. By default SitemapGenerator looks for a configuration file in `config/sitemap.rb` - relative to your application root or the current working directory. (Run `rake sitemap:install` to have this file generated for you if you have not done so already.)
341
382
 
342
383
  If you want to use a non-standard configuration file, or have multiple configuration files, you can specify which one to run by passing the `CONFIG_FILE` option like so:
343
384
 
344
- rake sitemap:refresh CONFIG_FILE="config/geo_sitemap.rb"
385
+ ```
386
+ rake sitemap:refresh CONFIG_FILE="config/geo_sitemap.rb"
387
+ ```
388
+
345
389
 
346
- A Simple Example
347
- -------
390
+ ### A Simple Example
348
391
 
349
392
  So what does a sitemap configuration look like? Let's take a look at a simple example:
350
393
 
351
- SitemapGenerator::Sitemap.default_host = "http://www.example.com"
352
- SitemapGenerator::Sitemap.create do
353
- add '/welcome'
354
- end
394
+ ```ruby
395
+ SitemapGenerator::Sitemap.default_host = "http://www.example.com"
396
+ SitemapGenerator::Sitemap.create do
397
+ add '/welcome'
398
+ end
399
+ ```
355
400
 
356
401
  A few things to note:
357
402
 
@@ -362,63 +407,64 @@ A few things to note:
362
407
 
363
408
  Now let's see what is output when we run this configuration with `rake sitemap:refresh:no_ping`:
364
409
 
365
- + sitemap1.xml.gz 3 links / 923 Bytes / 329 Bytes gzipped
366
- + sitemap_index.xml.gz 1 sitemaps / 364 Bytes / 199 Bytes gzipped
367
- Sitemap stats: 3 links / 1 sitemaps / 0m00s
410
+ ```
411
+ + sitemap1.xml.gz 2 links / 923 Bytes / 329 Bytes gzipped
412
+ + sitemap_index.xml.gz 1 sitemaps / 364 Bytes / 199 Bytes gzipped
413
+ Sitemap stats: 2 links / 1 sitemaps / 0m00s
414
+ ```
368
415
 
369
- Weird! The sitemap has three links, even though only added one! This is because SitemapGenerator adds the root URL `/` and the URL of the sitemap index file to your sitemap by default. (You can change the default behaviour by setting the `include_root` or `include_index` option.)
416
+ Weird! The sitemap has two links, even though we only added one! This is because SitemapGenerator adds the root URL `/` by default. (Note that prior to version 3.2 the URL of the sitemap index file was also added to the sitemap by default but [this behaviour has been changed][include_index_change] because of Google complaining about nested indexing.) You can change the default behaviour by setting the `include_root` or `include_index` option.
370
417
 
371
418
  Now let's take a look at the files that were created. After uncompressing and XML-tidying the contents we have:
372
419
 
373
420
  * `public/sitemap_index.xml.gz`
374
421
 
375
- <?xml version="1.0" encoding="UTF-8"?>
376
- <sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
377
- <sitemap>
378
- <loc>http://www.example.com/sitemap1.xml.gz</loc>
379
- </sitemap>
380
- </sitemapindex>
422
+ ```xml
423
+ <?xml version="1.0" encoding="UTF-8"?>
424
+ <sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd">
425
+ <sitemap>
426
+ <loc>http://www.example.com/sitemap1.xml.gz</loc>
427
+ </sitemap>
428
+ </sitemapindex>
429
+ ```
381
430
 
382
431
  * `public/sitemap1.xml.gz`
383
432
 
384
- <?xml version="1.0" encoding="UTF-8"?>
385
- <urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" xmlns:geo="http://www.google.com/geo/schemas/sitemap/1.0" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
386
- <url>
387
- <loc>http://www.example.com/</loc>
388
- <lastmod>2011-05-21T00:03:38+00:00</lastmod>
389
- <changefreq>always</changefreq>
390
- <priority>1.0</priority>
391
- </url>
392
- <url>
393
- <loc>http://www.example.com/sitemap_index.xml.gz</loc>
394
- <lastmod>2011-05-21T00:03:38+00:00</lastmod>
395
- <changefreq>always</changefreq>
396
- <priority>1.0</priority>
397
- </url>
398
- <url>
399
- <loc>http://www.example.com/welcome</loc>
400
- <lastmod>2011-05-21T00:03:38+00:00</lastmod>
401
- <changefreq>weekly</changefreq>
402
- <priority>0.5</priority>
403
- </url>
404
- </urlset>
405
-
406
- The sitemaps conform to the [Sitemap 0.9 protocol][sitemap_protocol]. Notice the values for `priority` and `changefreq` on the root and sitemap index links, the ones that were added for us? The values tell us that these links are the highest priority and should be checked regularly because they are constantly changing. You can specify your own values for these options in your call to `add`.
407
-
408
- Adding Links
409
- ----------
433
+ ```xml
434
+ <?xml version="1.0" encoding="UTF-8"?>
435
+ <urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" xmlns:geo="http://www.google.com/geo/schemas/sitemap/1.0" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
436
+ <url>
437
+ <loc>http://www.example.com/</loc>
438
+ <lastmod>2011-05-21T00:03:38+00:00</lastmod>
439
+ <changefreq>always</changefreq>
440
+ <priority>1.0</priority>
441
+ </url>
442
+ <url>
443
+ <loc>http://www.example.com/welcome</loc>
444
+ <lastmod>2011-05-21T00:03:38+00:00</lastmod>
445
+ <changefreq>weekly</changefreq>
446
+ <priority>0.5</priority>
447
+ </url>
448
+ </urlset>
449
+ ```
450
+
451
+ The sitemaps conform to the [Sitemap 0.9 protocol][sitemap_protocol]. Notice the values for `priority` and `changefreq` on the root link, the one that was added for us? The values tell us that this link is the highest priority and should be checked regularly because it is constantly changing. You can specify your own values for these options in your call to `add`.
452
+
453
+ ### Adding Links
410
454
 
411
455
  You call `add` in the block passed to `create` to add a **path** to your sitemap. `add` takes a string path and optional hash of options, generates the URL and adds it to the sitemap. You only need to pass a **path** because the URL will be built for us using the `default_host` we specified. However, if we want to use a different host for a particular link, we can pass the `:host` option to `add`.
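
To make the path-plus-host idea concrete, here is a minimal sketch in plain Ruby. The `build_url` helper and the `DEFAULT_HOST` constant are hypothetical names used only for illustration; they are not part of SitemapGenerator, and the gem's own URL building may differ in detail.

```ruby
require 'uri'

# Hypothetical stand-in for the `default_host` setting.
DEFAULT_HOST = 'http://www.example.com'

# Hypothetical helper: joins a path onto a host, using the default host
# unless a :host option is given (mirroring the behaviour described above).
def build_url(path, options = {})
  host = options.fetch(:host, DEFAULT_HOST)
  URI.join(host, path).to_s
end

puts build_url('/contact_us')
puts build_url('/login', :host => 'https://securehost.com')
```

The first call falls back to the default host, while the second uses the host passed for that one link.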
412
456
 
413
457
  Let's see another example:
414
458
 
415
- SitemapGenerator::Sitemap.default_host = "http://www.example.com"
416
- SitemapGenerator::Sitemap.create do
417
- add '/contact_us'
418
- Content.find_each do |content|
419
- add content_path(content), :lastmod => content.updated_at
420
- end
421
- end
459
+ ```ruby
460
+ SitemapGenerator::Sitemap.default_host = "http://www.example.com"
461
+ SitemapGenerator::Sitemap.create do
462
+ add '/contact_us'
463
+ Content.find_each do |content|
464
+ add content_path(content), :lastmod => content.updated_at
465
+ end
466
+ end
467
+ ```
422
468
 
423
469
  In this example first we add the `/contact_us` page to the sitemap and then we iterate through the Content model's records adding each one to the sitemap using the `content_path` helper method to generate the path for each record.
424
470
 
@@ -428,9 +474,11 @@ In the example about we pass a `lastmod` (last modified) option with the value o
428
474
 
429
475
  Looking at the output from running this sitemap, we see that we have a few more links than before:
430
476
 
431
- + sitemap1.xml.gz 12 links / 2.3 KB / 365 Bytes gzipped
432
- + sitemap_index.xml.gz 1 sitemaps / 364 Bytes / 199 Bytes gzipped
433
- Sitemap stats: 12 links / 1 sitemaps / 0m00s
477
+ ```
478
+ + sitemap1.xml.gz 12 links / 2.3 KB / 365 Bytes gzipped
479
+ + sitemap_index.xml.gz 1 sitemaps / 364 Bytes / 199 Bytes gzipped
480
+ Sitemap stats: 12 links / 1 sitemaps / 0m00s
481
+ ```
434
482
 
435
483
  From this example we can see that:
436
484
 
@@ -446,28 +494,36 @@ You can read more about `add` in the [XML Specification](http://sitemaps.org/pro
446
494
 
447
495
  Indicates how often the content of the page changes. One of `'always'`, `'hourly'`, `'daily'`, `'weekly'`, `'monthly'`, `'yearly'` or `'never'`. Example:
448
496
 
449
- add '/contact_us', :changefreq => 'monthly'
497
+ ```ruby
498
+ add '/contact_us', :changefreq => 'monthly'
499
+ ```
450
500
 
451
501
  * `lastmod` - Default: `Time.now` (Time).
452
502
 
453
503
  The date and time of last modification. Example:
454
504
 
455
- add content_path(content), :lastmod => content.updated_at
505
+ ```ruby
506
+ add content_path(content), :lastmod => content.updated_at
507
+ ```
456
508
 
457
509
  * `host` - Default: `default_host` (String).
458
510
 
459
511
  Host to use when building the URL. Example:
460
512
 
461
- add '/login', :host => 'https://securehost.com'
513
+ ```ruby
514
+ add '/login', :host => 'https://securehost.com'
515
+ ```
462
516
 
463
517
  * `priority` - Default: `0.5` (Float).
464
518
 
465
519
  The priority of the URL relative to other URLs on a scale from 0 to 1. Example:
466
520
 
467
- add '/about', :priority => 0.75
521
+ ```ruby
522
+ add '/about', :priority => 0.75
523
+ ```
524
+
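
The options above can be thought of as a hash of defaults that each call to `add` may override. The sketch below is an illustration only: `link_options` and `CHANGEFREQ_VALUES` are hypothetical names, the `'weekly'` default for `changefreq` is assumed from the example output shown earlier, and the validation is an assumption about what a careful caller might check rather than a description of the gem's internals.

```ruby
# Allowed values for changefreq, as listed above.
CHANGEFREQ_VALUES = %w[always hourly daily weekly monthly yearly never].freeze

# Hypothetical helper: merge caller options over the documented defaults
# and reject values outside the documented ranges.
def link_options(options = {})
  defaults = { :changefreq => 'weekly', :priority => 0.5, :lastmod => Time.now }
  merged = defaults.merge(options)

  unless CHANGEFREQ_VALUES.include?(merged[:changefreq])
    raise ArgumentError, "unknown changefreq: #{merged[:changefreq].inspect}"
  end
  unless (0.0..1.0).cover?(merged[:priority])
    raise ArgumentError, 'priority must be between 0 and 1'
  end

  merged
end

about = link_options(:priority => 0.75)
puts about[:priority]
puts about[:changefreq]
```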
468
525
 
469
- Adding Links to the Sitemap Index
470
- ----------
526
+ ### Adding Links to the Sitemap Index
471
527
 
472
528
  Sometimes you may need to manually add some links to the sitemap index file. For example if you are generating your sitemaps incrementally you may want to create a sitemap index which includes the files which have already been generated. To achieve this you can use the `add_to_index` method which works exactly the same as the `add` method described above.
473
529
 
@@ -479,50 +535,56 @@ It supports the same options as `add`, namely:
479
535
 
480
536
  The value for `host` defaults to whatever you have set as your `sitemaps_host`. Remember that the `sitemaps_host` is the host where your sitemaps reside. If your sitemaps are on the same host as your `default_host`, then the value for `default_host` is used. Example:
481
537
 
482
- add_to_index '/mysitemap1.xml.gz', :host => 'http://sitemaphostingserver.com'
538
+ ```ruby
539
+ add_to_index '/mysitemap1.xml.gz', :host => 'http://sitemaphostingserver.com'
540
+ ```
483
541
 
484
542
  * `priority`
485
543
 
486
544
  An example:
487
545
 
488
- SitemapGenerator::Sitemap.default_host = "http://www.example.com"
489
- SitemapGenerator::Sitemap.create do
490
- add_to_index '/mysitemap1.xml.gz'
491
- add_to_index '/mysitemap2.xml.gz'
492
- ...
493
- end
546
+ ```ruby
547
+ SitemapGenerator::Sitemap.default_host = "http://www.example.com"
548
+ SitemapGenerator::Sitemap.create do
549
+ add_to_index '/mysitemap1.xml.gz'
550
+ add_to_index '/mysitemap2.xml.gz'
551
+ # ...
552
+ end
553
+ ```
494
554
 
495
- Accessing the LinkSet instance
496
- ----------
555
+ ### Accessing the LinkSet instance
497
556
 
498
557
  Sometimes you need to mess with the internals to do custom stuff. If you need access to the LinkSet instance from within `create()` you can use the `sitemap` method to do so.
499
558
 
500
559
  In this example, say we have already pre-generated three sitemap files: `sitemap1.xml.gz`, `sitemap2.xml.gz`, `sitemap3.xml.gz`. Now we want to start the sitemap generation at `sitemap4.xml.gz` and create a bunch of new sitemaps. There are a few ways we can do this, but this is an easy way:
501
560
 
502
- SitemapGenerator::Sitemap.default_host = "http://www.example.com"
503
- SitemapGenerator::Sitemap.create do
504
- 3.times do |i|
505
- add_to_index sitemap.sitemaps_namer.to_s
506
- sitemap.sitemaps_namer.next
507
- end
508
- add '/home'
509
- add '/another'
510
- end
561
+ ```ruby
562
+ SitemapGenerator::Sitemap.default_host = "http://www.example.com"
563
+ SitemapGenerator::Sitemap.create do
564
+ 3.times do |i|
565
+ add_to_index sitemap.sitemaps_namer.to_s
566
+ sitemap.sitemaps_namer.next
567
+ end
568
+ add '/home'
569
+ add '/another'
570
+ end
571
+ ```
511
572
 
512
573
  The output looks something like this:
513
574
 
514
- In /Users/karl/projects/sitemap_generator-test/public/
515
- + sitemap4.xml.gz 4 links / 347 Bytes
516
- + sitemap_index.xml.gz 4 sitemaps / 242 Bytes
517
- Sitemap stats: 4 links / 4 sitemaps / 0m00s
575
+ ```
576
+ In /Users/karl/projects/sitemap_generator-test/public/
577
+ + sitemap4.xml.gz 4 links / 347 Bytes
578
+ + sitemap_index.xml.gz 4 sitemaps / 242 Bytes
579
+ Sitemap stats: 4 links / 4 sitemaps / 0m00s
580
+ ```
518
581
 
519
- Speeding Things Up
520
- ----------
582
+ ### Speeding Things Up
521
583
 
522
584
  For large ActiveRecord collections with thousands of records it is advisable to iterate through them in batches to avoid loading all records into memory at once. For this reason in the example above we use `Content.find_each` which is a batched iterator available since Rails 2.3.2, rather than `Content.all`.
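
The shape of batched iteration can be sketched without Rails. In the example below a plain range stands in for a large table, `each_in_batches` is a hypothetical helper, and the batch size of 1000 is chosen for illustration; `find_each` does the equivalent work against the database.

```ruby
# Hypothetical helper: walk a collection in fixed-size slices so that only
# one batch is held at a time, yielding each record to the caller's block.
# Returns the number of batches processed.
def each_in_batches(source, batch_size = 1000)
  batch_count = 0
  source.each_slice(batch_size) do |batch|
    batch_count += 1
    batch.each { |record| yield record }
  end
  batch_count
end

seen = 0
batch_count = each_in_batches(1..2500) { |_id| seen += 1 }
puts "#{seen} records in #{batch_count} batches"
```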
523
585
 
524
- Customizing your Sitemaps
525
- =======
586
+
587
+ ## Customizing your Sitemaps
526
588
 
527
589
  SitemapGenerator supports a number of options which allow you to control every aspect of your sitemap generation. How they are named, where they are stored, the contents of the links and the location that the sitemaps will be hosted from can all be set.
528
590
 
@@ -530,34 +592,39 @@ The options can be set in the following ways.
530
592
 
531
593
  On `SitemapGenerator::Sitemap`:
532
594
 
533
- SitemapGenerator::Sitemap.default_host = 'http://example.com'
534
- SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
595
+ ```ruby
596
+ SitemapGenerator::Sitemap.default_host = 'http://example.com'
597
+ SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
598
+ ```
535
599
 
536
600
  These options will apply to all sitemaps. This is how you set most options.
537
601
 
538
602
  Passed as options in the call to `create`:
539
603
 
540
- SitemapGenerator::Sitemap.create(
541
- :default_host => 'http://example.com',
542
- :sitemaps_path => 'sitemaps/') do
543
- add '/home'
544
- end
604
+ ```ruby
605
+ SitemapGenerator::Sitemap.create(
606
+ :default_host => 'http://example.com',
607
+ :sitemaps_path => 'sitemaps/') do
608
+ add '/home'
609
+ end
610
+ ```
545
611
 
546
612
  This is useful if you are setting a lot of options.
547
613
 
548
614
  Finally, passed as options in a call to `group`:
549
615
 
550
- SitemapGenerator::Sitemap.create do
551
- group(:default_host => 'http://example.com',
552
- :sitemaps_path => 'sitemaps/') do
553
- add '/home'
554
- end
555
- end
616
+ ```ruby
617
+ SitemapGenerator::Sitemap.create do
618
+ group(:default_host => 'http://example.com',
619
+ :sitemaps_path => 'sitemaps/') do
620
+ add '/home'
621
+ end
622
+ end
623
+ ```
556
624
 
557
625
  The options passed to `group` only apply to the links and sitemaps generated in the group. Sitemap Groups are useful to group links into specific sitemaps, or to set options that you only want to apply to the links in that group.
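
One way to picture the three levels is as hashes layered on top of each other. This is a sketch under the assumption that more specific settings win over more general ones; `effective_options` and the sample values are hypothetical and do not reflect how the gem stores its options.

```ruby
# Hypothetical option layers, from most general to most specific.
global_options = { :default_host => 'http://example.com', :sitemaps_path => 'sitemaps/' }
create_options = { :sitemaps_path => 'maps/' }
group_options  = { :sitemaps_path => 'en/', :filename => :english }

# Later layers override earlier ones; keys they do not mention are inherited.
def effective_options(*layers)
  layers.reduce({}) { |merged, layer| merged.merge(layer) }
end

outside_group = effective_options(global_options, create_options)
inside_group  = effective_options(global_options, create_options, group_options)

puts outside_group[:sitemaps_path]
puts inside_group[:sitemaps_path]
```

Links added outside the group never see `:filename => :english`, which matches the statement that group options apply only within the group.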
558
626
 
559
- Sitemap Options
560
- -------
627
+ ### Sitemap Options
561
628
 
562
629
  The following options are supported:
563
630
 
@@ -565,7 +632,7 @@ The following options are supported:
565
632
 
566
633
  * `filename` - Symbol. The **base name for the files** that will be generated. The default value is `:sitemap`. This yields sitemaps with names like `sitemap1.xml.gz`, `sitemap2.xml.gz`, `sitemap3.xml.gz` etc, and a sitemap index named `sitemap_index.xml.gz`. If we now set the value to `:geo` the sitemaps would be named `geo1.xml.gz`, `geo2.xml.gz`, `geo3.xml.gz` etc, and the sitemap index would be named `geo_index.xml.gz`.
567
634
 
568
- * `include_index` - Boolean. Whether to **add a link to the sitemap index** to the current sitemap. This points search engines to your Sitemap Index to include it in the indexing of your site. Default is `true`. Turned off when `sitemaps_host` is set or within a `group()` block.
635
+ * `include_index` - Boolean. Whether to **add a link to the sitemap index** to the current sitemap. This points search engines to your Sitemap Index to include it in the indexing of your site. 2012-07: This is now turned off by default because Google may complain about there being 'Nested Sitemap indexes'. Default is `false`. Turned off when `sitemaps_host` is set or within a `group()` block.
569
636
 
570
637
  * `include_root` - Boolean. Whether to **add the root** url i.e. '/' to the current sitemap. Default is `true`. Turned off within a `group()` block.
571
638
 
@@ -588,8 +655,8 @@ different host than the rest of the links in the sitemap. Something that the si
588
655
  you can provide an instance of your own class to provide custom behavior. Your class must
589
656
  define a write method which takes a `SitemapGenerator::Location` and raw XML data.
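
A minimal sketch of such a class follows. Only the `write(location, raw_data)` contract comes from the description above; `InMemoryWriter` is a hypothetical name, and `FakeLocation` with its `path` accessor is a stand-in invented for this example, so check the real `SitemapGenerator::Location` for the accessors it actually provides.

```ruby
# Hypothetical stand-in for SitemapGenerator::Location, for illustration only.
FakeLocation = Struct.new(:path)

# Hypothetical custom writer: keeps generated sitemaps in memory
# instead of writing them to disk.
class InMemoryWriter
  attr_reader :files

  def initialize
    @files = {}
  end

  # Takes a location object and the raw XML data, as described above.
  def write(location, raw_data)
    @files[location.path] = raw_data
  end
end

writer = InMemoryWriter.new
writer.write(FakeLocation.new('public/sitemap1.xml.gz'), '<urlset></urlset>')
puts writer.files.keys.inspect
```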
590
657
 
591
- Sitemap Groups
592
- =======
658
+
659
+ ## Sitemap Groups
593
660
 
594
661
  Sitemap Groups is a powerful feature that is also very simple to use.
595
662
 
@@ -600,33 +667,36 @@ Sitemap Groups is a powerful feature that is also very simple to use.
600
667
  * Groups can handle any number of links.
601
668
  * Group sitemaps are finalized (written out) as they get full and at the end of each group.
602
669
 
603
- A Groups Example
604
- ----------------
670
+ ### A Groups Example
605
671
 
606
672
  When you create a new group you pass options which will apply only to that group. You pass a block to `group`. Inside your block you call `add` to add links to the group.
607
673
 
608
674
  Let's see an example that demonstrates a few interesting things about groups:
609
675
 
610
- SitemapGenerator::Sitemap.default_host = "http://www.example.com"
611
- SitemapGenerator::Sitemap.create do
612
- add '/rss'
676
+ ```ruby
677
+ SitemapGenerator::Sitemap.default_host = "http://www.example.com"
678
+ SitemapGenerator::Sitemap.create do
679
+ add '/rss'
613
680
 
614
- group(:sitemaps_path => 'en/', :filename => :english) do
615
- add '/home'
616
- end
681
+ group(:sitemaps_path => 'en/', :filename => :english) do
682
+ add '/home'
683
+ end
617
684
 
618
- group(:sitemaps_path => 'fr/', :filename => :french) do
619
- add '/maison'
620
- end
621
- end
685
+ group(:sitemaps_path => 'fr/', :filename => :french) do
686
+ add '/maison'
687
+ end
688
+ end
689
+ ```
622
690
 
623
691
  And the output from running the above:
624
692
 
625
- + en/english1.xml.gz 1 links / 612 Bytes / 296 Bytes gzipped
626
- + fr/french1.xml.gz 1 links / 614 Bytes / 298 Bytes gzipped
627
- + sitemap1.xml.gz 3 links / 919 Bytes / 328 Bytes gzipped
628
- + sitemap_index.xml.gz 3 sitemaps / 505 Bytes / 221 Bytes gzipped
629
- Sitemap stats: 5 links / 3 sitemaps / 0m00s
693
+ ```
694
+ + en/english1.xml.gz 1 links / 612 Bytes / 296 Bytes gzipped
695
+ + fr/french1.xml.gz 1 links / 614 Bytes / 298 Bytes gzipped
696
+ + sitemap1.xml.gz 3 links / 919 Bytes / 328 Bytes gzipped
697
+ + sitemap_index.xml.gz 3 sitemaps / 505 Bytes / 221 Bytes gzipped
698
+ Sitemap stats: 5 links / 3 sitemaps / 0m00s
699
+ ```
630
700
 
631
701
  So we have two sitemaps with one link each and one sitemap with three links. The sitemaps from the groups are easy to spot by their filenames. They are `english1.xml.gz` and `french1.xml.gz`. They contain only one link each because **`include_index` and `include_root` are set to `false` by default** in a group.
632
702
 
@@ -638,30 +708,31 @@ The options you use when creating your groups will determine which and how many
638
708
 
639
709
  If you have changed your sitemaps physical location in a group, then the default sitemap will not be used and it will be unaffected by the group. **Group sitemaps are finalized as they get full and at the end of each group.**
640
710
 
641
- Sitemap Extensions
642
- ===========
643
711
 
644
- News Sitemaps
645
- -----------
712
+ ## Sitemap Extensions
646
713
 
647
- A news item can be added to a sitemap URL by passing a `:news` hash to `add`. The hash must contain tags defined by the [News Sitemap][news_tags] specification.
648
-
649
- ### Example
714
+ ### News Sitemaps
650
715
 
651
- SitemapGenerator::Sitemap.create do
652
- add('/index.html', :news => {
653
- :publication_name => "Example",
654
- :publication_language => "en",
655
- :title => "My Article",
656
- :keywords => "my article, articles about myself",
657
- :stock_tickers => "SAO:PETR3",
658
- :publication_date => "2011-08-22",
659
- :access => "Subscription",
660
- :genres => "PressRelease"
661
- })
662
- end
716
+ A news item can be added to a sitemap URL by passing a `:news` hash to `add`. The hash must contain tags defined by the [News Sitemap][news_tags] specification.
663
717
 
664
- ### Supported options
718
+ #### Example
719
+
720
+ ```ruby
721
+ SitemapGenerator::Sitemap.create do
722
+ add('/index.html', :news => {
723
+ :publication_name => "Example",
724
+ :publication_language => "en",
725
+ :title => "My Article",
726
+ :keywords => "my article, articles about myself",
727
+ :stock_tickers => "SAO:PETR3",
728
+ :publication_date => "2011-08-22",
729
+ :access => "Subscription",
730
+ :genres => "PressRelease"
731
+ })
732
+ end
733
+ ```
734
+
735
+ #### Supported options
665
736
 
666
737
  * `publication_name`
667
738
  * `publication_language`
@@ -672,21 +743,22 @@ A news item can be added to a sitemap URL by passing a `:news` hash to `add`. T
672
743
  * `keywords`
673
744
  * `stock_tickers`
674
745
 
675
- * * *
676
- Image Sitemaps
677
- -----------
746
+
747
+ ### Image Sitemaps
678
748
 
679
749
  Images can be added to a sitemap URL by passing an `:images` array to `add`. Each item in the array must be a Hash containing tags defined by the [Image Sitemap][image_tags] specification.
680
750
 
681
- ### Example
751
+ #### Example
682
752
 
683
- SitemapGenerator::Sitemap.create do
684
- add('/index.html', :images => [{
685
- :loc => 'http://www.example.com/image.png',
686
- :title => 'Image' }])
687
- end
753
+ ```ruby
754
+ SitemapGenerator::Sitemap.create do
755
+ add('/index.html', :images => [{
756
+ :loc => 'http://www.example.com/image.png',
757
+ :title => 'Image' }])
758
+ end
759
+ ```
688
760
 
689
- ### Supported options
761
+ #### Supported options
690
762
 
691
763
  * `loc` Required, location of the image
692
764
  * `caption`
@@ -694,48 +766,50 @@ Images can be added to a sitemap URL by passing an `:images` array to `add`. Ea
694
766
  * `title`
695
767
  * `license`
696
768
 
697
- * * *
698
- Video Sitemaps
699
- -----------
769
+
770
+ ### Video Sitemaps
700
771
 
701
772
  A video can be added to a sitemap URL by passing a `:video` Hash to `add()`. The Hash can contain tags defined by the [Video Sitemap specification][video_tags].
702
773
 
703
774
  To add more than one video to a url, pass an array of video hashes using the `:videos` option.
704
775
 
705
- ### Example
776
+ #### Example
706
777
 
707
- add('/index.html', :video => {
708
- :thumbnail_loc => 'http://www.example.com/video1_thumbnail.png',
709
- :title => 'Title',
710
- :description => 'Description',
711
- :content_loc => 'http://www.example.com/cool_video.mpg',
712
- :tags => %w[one two three],
713
- :category => 'Category'
714
- })
778
+ ```ruby
779
+ add('/index.html', :video => {
780
+ :thumbnail_loc => 'http://www.example.com/video1_thumbnail.png',
781
+ :title => 'Title',
782
+ :description => 'Description',
783
+ :content_loc => 'http://www.example.com/cool_video.mpg',
784
+ :tags => %w[one two three],
785
+ :category => 'Category'
786
+ })
787
+ ```
715
788
 
716
- ### Supported options
789
+ #### Supported options
717
790
 
718
791
  * `:thumbnail_loc` - Required, string.
719
792
 
720
793
 
721
794
 
722
- Geo Sitemaps
723
- -----------
795
+ ### Geo Sitemaps
724
796
 
725
797
  Pages with geo data can be added by passing a `:geo` Hash to `add`. The Hash only supports one tag of `:format`. Google provides an [example of a geo sitemap link here][geo_tags]. Note that the sitemap does not actually contain your KML or GeoRSS. It merely links to a page that has this content.
726
798
 
727
- ### Example:
799
+ #### Example
728
800
 
729
- SitemapGenerator::Sitemap.create do
730
- add('/stores/1234.xml', :geo => { :format => 'kml' })
731
- end
801
+ ```ruby
802
+ SitemapGenerator::Sitemap.create do
803
+ add('/stores/1234.xml', :geo => { :format => 'kml' })
804
+ end
805
+ ```
732
806
 
733
- ### Supported options
807
+ #### Supported options
734
808
 
735
809
  * `format` Required, either 'kml' or 'georss'
736
810
 
737
- Raison d'être
738
- =======
811
+
812
+ ## Raison d'être
739
813
 
740
814
  Most of the Sitemap plugins out there seem to try to recreate the Sitemap links by iterating the Rails routes. In some cases this is possible, but for a great deal of cases it isn't.
741
815
 
@@ -745,7 +819,9 @@ and
745
819
 
746
820
  b) How would you infer the correct series of links for the following route?
747
821
 
748
- map.zipcode 'location/:state/:city/:zipcode', :controller => 'zipcode', :action => 'index'
822
+ ```ruby
823
+ map.zipcode 'location/:state/:city/:zipcode', :controller => 'zipcode', :action => 'index'
824
+ ```
749
825
 
750
826
  Don't tell me it's trivial, because it isn't. It just looks trivial.
751
827
 
@@ -753,45 +829,46 @@ So my idea is to have another file similar to 'routes.rb' called 'sitemap.rb', w
753
829
 
754
830
  Here's my solution:
755
831
 
756
- Zipcode.find(:all, :include => :city).each do |z|
757
- add zipcode_path(:state => z.city.state, :city => z.city, :zipcode => z)
758
- end
832
+ ```ruby
833
+ Zipcode.find(:all, :include => :city).each do |z|
834
+ add zipcode_path(:state => z.city.state, :city => z.city, :zipcode => z)
835
+ end
836
+ ```
759
837
 
760
838
  Easy hey?
761
839
 
762
- Compatibility
763
- =======
840
+ ## Compatibility
764
841
 
765
842
  Tested and working on:
766
843
 
767
- - **Rails** 3.0.0, 3.0.7
768
- - **Rails** 1.x - 2.3.8
769
- - **Ruby** 1.8.6, 1.8.7, 1.8.7 Enterprise Edition, 1.9.1, 1.9.2
844
+ * **Rails** 3.0.0, 3.0.7
845
+ * **Rails** 1.x - 2.3.8
846
+ * **Ruby** 1.8.6, 1.8.7, 1.8.7 Enterprise Edition, 1.9.1, 1.9.2
847
+
848
+
849
+ ## Known Bugs
850
+
851
+ * There's no check on the size of a URL which [isn't supposed to exceed 2,048 bytes][sitemaps_xml].
852
+ * Currently only supports one Sitemap Index file, which can contain 50,000 Sitemap files which can each contain 50,000 urls, so it _only_ supports up to 2,500,000,000 (2.5 billion) urls.
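
Until a size check exists, callers could guard against over-long URLs themselves. The sketch below uses the limits quoted above; `url_too_long?` and the constant names are hypothetical and are not provided by the gem.

```ruby
# Limits quoted in the list above.
MAX_URL_BYTES          = 2_048
MAX_SITEMAPS_PER_INDEX = 50_000
MAX_URLS_PER_SITEMAP   = 50_000

# Hypothetical guard a caller could run before passing a URL to `add`.
def url_too_long?(url)
  url.bytesize > MAX_URL_BYTES
end

# One index file times the per-sitemap limit gives the overall ceiling.
max_urls = MAX_SITEMAPS_PER_INDEX * MAX_URLS_PER_SITEMAP

puts url_too_long?('http://www.example.com/welcome')
puts max_urls
```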
770
853
 
771
- Known Bugs
772
- ========
773
854
 
774
- - There's no check on the size of a URL which [isn't supposed to exceed 2,048 bytes][sitemaps_xml].
775
- - Currently only supports one Sitemap Index file, which can contain 50,000 Sitemap files which can each contain 50,000 urls, so it _only_ supports up to 2,500,000,000 (2.5 billion) urls.
855
+ ## Wishlist & Coming Soon
776
856
 
777
- Wishlist & Coming Soon
778
- ========
857
+ * Rails framework agnosticism; support for other frameworks like Merb
779
858
 
780
- - Rails framework agnosticism; support for other frameworks like Merb
781
859
 
782
- Thanks (in no particular order)
783
- ========
860
+ ## Thanks (in no particular order)
784
861
 
785
- - [Rodrigo Flores](https://github.com/rodrigoflores) for News sitemaps
786
- - [Alex Soto](http://github.com/apsoto) for Video sitemaps
787
- - [Alexadre Bini](http://github.com/alexandrebini) for Image sitemaps
788
- - [Dan Pickett](http://github.com/dpickett)
789
- - [Rob Biedenharn](http://github.com/rab)
790
- - [Richie Vos](http://github.com/jerryvos)
791
- - [Adrian Mugnolo](http://github.com/xymbol)
792
- - [Jason Weathered](http://github.com/jasoncodes)
793
- - [Andy Stewart](http://github.com/airblade)
794
- - [Brian Armstrong](https://github.com/barmstrong) for Geo sitemaps
862
+ * [Rodrigo Flores](https://github.com/rodrigoflores) for News sitemaps
863
+ * [Alex Soto](http://github.com/apsoto) for Video sitemaps
864
+ * [Alexandre Bini](http://github.com/alexandrebini) for Image sitemaps
865
+ * [Dan Pickett](http://github.com/dpickett)
866
+ * [Rob Biedenharn](http://github.com/rab)
867
+ * [Richie Vos](http://github.com/jerryvos)
868
+ * [Adrian Mugnolo](http://github.com/xymbol)
869
+ * [Jason Weathered](http://github.com/jasoncodes)
870
+ * [Andy Stewart](http://github.com/airblade)
871
+ * [Brian Armstrong](https://github.com/barmstrong) for Geo sitemaps
795
872
 
796
873
  Copyright (c) 2009 Karl Varga released under the MIT license
797
874
 
@@ -804,9 +881,11 @@ Copyright (c) 2009 Karl Varga released under the MIT license
804
881
  [sitemap_video]:http://www.google.com/support/webmasters/bin/topic.py?topic=10079
805
882
  [sitemap_news]:http://www.google.com/support/webmasters/bin/topic.py?hl=en&topic=10078
806
883
  [sitemap_geo]:http://www.google.com/support/webmasters/bin/topic.py?hl=en&topic=14688
884
+ [sitemap_mobile]:http://support.google.com/webmasters/bin/answer.py?hl=en&answer=34648
807
885
  [sitemap_protocol]:http://sitemaps.org/protocol.php
808
886
  [video_tags]:http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=80472#4
809
887
  [image_tags]:http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=178636
810
888
  [geo_tags]:http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=94555
811
889
  [news_tags]:http://www.google.com/support/news_pub/bin/answer.py?answer=74288
812
890
  [remote_hosts]:https://github.com/kjvarga/sitemap_generator/wiki/Generate-Sitemaps-on-read-only-filesystems-like-Heroku
891
+ [include_index_change]:https://github.com/kjvarga/sitemap_generator/issues/70