sitemap_generator 2.1.8 → 2.2.1
- data/Gemfile.lock +1 -1
- data/README.md +37 -26
- data/VERSION +1 -1
- data/lib/sitemap_generator/link_set.rb +73 -17
- metadata +4 -4
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -26,6 +26,8 @@ Does your website use SitemapGenerator to generate Sitemaps? Where would you be
 Changelog
 -------
 
+- v2.2.1: Support adding new search engines to ping and modifying the default search engines.
+Allow the URL of the sitemap index to be passed as an argument to `ping_search_engines`. See **Pinging Search Engines**.
 - v2.1.8: Extend and improve Video Sitemap support. Include sitemap docs in the README, support all element attributes, properly format values.
 - v2.1.7: Improve format of float priorities; Remove Yahoo from ping - the Yahoo
 service has been shut down.
@@ -57,41 +59,29 @@ Those who knew him know what an amazing guy he was, and what an excellent Rails
 
 The canonical repository is now: [http://github.com/kjvarga/sitemap_generator][canonical_repo]
 
-Install
+Install
 =======
 
-Rails
-
+Rails
+-----
 
-Add the gem to your `
+Add the gem to your `Gemfile`:
 
     gem 'sitemap_generator'
 
-
-
-Rails 2 Gem
---------
-
-1. Follow the Rails 3 install if you are using a `Gemfile`.
-
-If you are not using a `Gemfile` add the gem to your `config/environment.rb` configuration block with:
-
-config.gem 'sitemap_generator'
+Alternatively, if you are not using a `Gemfile` add the gem to your `config/environment.rb` file config block:
 
-
+    config.gem 'sitemap_generator'
 
-2
+**Rails 1 or 2 only**, add the following code to your `Rakefile` to include the gem's Rake tasks in your project (Rails 3 does this for you automatically, so this step is not necessary):
 
-
-
-
-
-
-
-Rails 2 Plugin
------------
+    begin
+      require 'sitemap_generator/tasks'
+    rescue Exception => e
+      puts "Warning, couldn't load gem tasks: #{e.message}! Skipping..."
+    end
 
-
+<i>If you would prefer to install as a plugin (deprecated) don't do any of the above. Simply run `script/plugin install git://github.com/kjvarga/sitemap_generator.git` from your application root directory.</i>
 
 Getting Started
 ======
@@ -107,11 +97,32 @@ Run `rake sitemap:refresh` as needed to create or rebuild your sitemap files. S
 
 **To disable all non-essential output from `rake` run the tasks passing a `-s` option.** For example: `rake -s sitemap:refresh`.
 
-Search
+Pinging Search Engines
 -----
 
 Using `rake sitemap:refresh` will notify major search engines to let them know that a new sitemap is available (Google, Bing, Ask, SitemapWriter). To generate new sitemaps without notifying search engines (for example when running in a local environment) use `rake sitemap:refresh:no_ping`.
 
+If you want to customize the hash of search engines you can access it at:
+
+    SitemapGenerator::Sitemap.search_engines
+
+Usually you would be adding a new search engine to ping. In this case you can modify the `search_engines` hash directly. This ensures that when `SitemapGenerator::Sitemap.ping_search_engines` is called your new search engine will be included.
+
+If you are calling `ping_search_engines` manually (for example if you have to wait some time or perform a custom action after your sitemaps have been regenerated) then you can pass you new search engine directly in the call as in the following example:
+
+    SitemapGenerator::Sitemap.ping_search_engines(:newengine => 'http://newengine.com/ping?url=%s')
+
+The key gives the name of the search engine as a string or symbol and the value is the full URL to ping with a string interpolation that will be replaced by the CGI escaped sitemap index URL. If you have any literal percent characters in your URL you need to escape them with `%%`.
+
+If you are calling `SitemapGenerator::Sitemap.ping_search_engines` from outside of your sitemap config file then you will need to set `SitemapGenerator::Sitemap.default_host` and any other options that you set in your sitemap config which affect the location of the sitemap index file. For example:
+
+    SitemapGenerator::Sitemap.default_host = 'http://example.com'
+    SitemapGenerator::Sitemap.ping_search_engines
+
+Alternatively you can pass in the full URL to your sitemap index in which case we would have just the following:
+
+    SitemapGenerator::Sitemap.ping_search_engines('http://example.com/sitemap_index.xml.gz')
+
 Crontab
 -----
 
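The `%s` placeholder and `%%` escaping described in the README hunk above follow Ruby's ordinary format-string rules. The following is a minimal sketch of that expansion, assuming only the standard library; `build_ping_url` is a hypothetical helper written for illustration and is not part of the gem.

```ruby
require 'cgi'

# Hypothetical helper (not part of sitemap_generator): expand a ping URL
# template by substituting the CGI-escaped sitemap index URL for `%s`.
# A literal percent sign in the template must be written as `%%`.
def build_ping_url(template, sitemap_index_url)
  template % CGI.escape(sitemap_index_url)
end

index_url = 'http://example.com/sitemap_index.xml.gz'

puts build_ping_url('http://newengine.com/ping?url=%s', index_url)
puts build_ping_url('http://example.com/100%%/ping?url=%s', index_url)
```

The second call mirrors the `100%%` example in the new `ping_search_engines` documentation: the doubled percent sign collapses to a single `%` in the final URL.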
data/VERSION
CHANGED
@@ -1 +1 @@
-2.1
+2.2.1
data/lib/sitemap_generator/link_set.rb
CHANGED
@@ -82,14 +82,17 @@ module SitemapGenerator
     #
     # * <tt>:sitemaps_namer</tt> - A +SitemapNamer+ instance for generating the sitemap names.
     #
-    # * <tt
+    # * <tt>:include_index</tt> - Boolean. Whether to <b>add a link to the sitemap index<b>
     # to the current sitemap. This points search engines to your Sitemap Index to
     # include it in the indexing of your site. Default is `true`. Turned off when
     # `sitemaps_host` is set or within a `group()` block.
     #
-    # * <tt
+    # * <tt>:include_root</tt> - Boolean. Whether to **add the root** url i.e. '/' to the
     # current sitemap. Default is `true`. Turned off within a `group()` block.
     #
+    # * <tt>:search_engines</tt> - Hash. A hash of search engine names mapped to
+    # ping URLs. See ping_search_engines.
+    #
     # * <tt>:verbose</tt> - If +true+, output a summary line for each sitemap and sitemap
     # index that is created. Default is +false+.
     def initialize(options={})
@@ -97,7 +100,13 @@ module SitemapGenerator
         :include_root => true,
         :include_index => true,
         :filename => :sitemap,
-        :verbose => false
+        :verbose => false,
+        :search_engines => {
+          :google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=%s",
+          :ask => "http://submissions.ask.com/ping?sitemap=%s",
+          :bing => "http://www.bing.com/webmaster/ping.aspx?siteMap=%s",
+          :sitemap_writer => "http://www.sitemapwriter.com/notify.php?crawler=all&url=%s"
+        }
       })
       options.each_pair { |k, v| instance_variable_set("@#{k}".to_sym, v) }
 
@@ -182,23 +191,51 @@ module SitemapGenerator
       @group
     end
 
-    # Ping search engines.
+    # Ping search engines to notify them of updated sitemaps.
+    #
+    # Search engines are already notified for you if you run `rake sitemap:refresh`.
+    # If you want to ping search engines separately to your sitemap generation, run
+    # `rake sitemap:refresh:no_ping` and then run a rake task or script
+    # which calls this method as in the example below.
+    #
+    # == Arguments
+    # * sitemap_index_url - The full URL to your sitemap index file.
+    # If not provided the location is based on the `host` you have
+    # set and any other options like your `sitemaps_path`. The URL
+    # will be CGI escaped for you when included as part of the
+    # search engine ping URL.
+    #
+    # == Options
+    # A hash of one or more search engines to ping in addition to the
+    # default search engines. The key is the name of the search engine
+    # as a string or symbol and the value is the full URL to ping with
+    # a string interpolation that will be replaced by the CGI escaped sitemap
+    # index URL. If you have any literal percent characters in your URL you
+    # need to escape them with `%%`. For example if your sitemap index URL
+    # is `http://example.com/sitemap_index.xml.gz` and your
+    # ping url is `http://example.com/100%%/ping?url=%s`
+    # then the final URL that is pinged will be `http://example.com/100%/ping?url=http%3A%2F%2Fexample.com%2Fsitemap_index.xml.gz`
+    #
+    # == Examples
+    #
+    # Both of these examples will ping the default search engines in addition to `http://superengine.com/ping?url=http%3A%2F%2Fexample.com%2Fsitemap_index.xml.gz`
+    #
+    # SitemapGenerator::Sitemap.host('http://example.com/')
+    # SitemapGenerator::Sitemap.ping_search_engines(:super_engine => 'http://superengine.com/ping?url=%s')
+    #
+    # Is equivalent to:
     #
-    #
-    def ping_search_engines
+    # SitemapGenerator::Sitemap.ping_search_engines('http://example.com/sitemap_index.xml.gz', :super_engine => 'http://superengine.com/ping?url=%s')
+    def ping_search_engines(*args)
+      engines = args.last.is_a?(Hash) ? args.pop : {}
+      index_url = CGI.escape(args.shift || sitemap_index_url)
+
       require 'open-uri'
       require 'timeout'
 
-      sitemap_index_url = CGI.escape(sitemap_index.location.url)
-      search_engines = {
-        :google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=#{sitemap_index_url}",
-        :ask => "http://submissions.ask.com/ping?sitemap=#{sitemap_index_url}",
-        :bing => "http://www.bing.com/webmaster/ping.aspx?siteMap=#{sitemap_index_url}",
-        :sitemap_writer => "http://www.sitemapwriter.com/notify.php?crawler=all&url=#{sitemap_index_url}"
-      }
-
       puts "\n" if verbose
-      search_engines.each do |engine, link|
+      search_engines.merge(engines).each do |engine, link|
+        link = link % index_url
         begin
           Timeout::timeout(10) {
             open(link)
@@ -221,16 +258,21 @@ module SitemapGenerator
       @sitemaps_host || @default_host
     end
 
-    # Lazy-initialize a sitemap instance
+    # Lazy-initialize a sitemap instance and return it.
     def sitemap
       @sitemap ||= SitemapGenerator::Builder::SitemapFile.new(sitemap_location)
     end
 
-    # Lazy-initialize a sitemap index instance
+    # Lazy-initialize a sitemap index instance and return it.
     def sitemap_index
       @sitemap_index ||= SitemapGenerator::Builder::SitemapIndexFile.new(sitemap_index_location)
     end
 
+    # Return the full url to the sitemap index file.
+    def sitemap_index_url
+      sitemap_index.location.url
+    end
+
     def finalize!
       finalize_sitemap!
       finalize_sitemap_index!
@@ -414,6 +456,20 @@ module SitemapGenerator
       self.sitemap_index_namer = SitemapGenerator::SitemapIndexNamer.new("#{@filename}_index")
     end
 
+    # Set the search engines hash to a new hash of search engine names mapped to
+    # ping URLs (see ping_search_engines). If the value is nil it is converted
+    # to an empty hash.
+    # === Example
+    # <tt>search_engines = { :google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=%s" }</tt>
+    def search_engines=(value)
+      @search_engines = value || {}
+    end
+
+    # Return the hash of search engines.
+    def search_engines
+      @search_engines || {}
+    end
+
     # Set the namer to use when generating SitemapFiles (does not apply to the
     # SitemapIndexFile)
     def sitemaps_namer=(value)
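The new `ping_search_engines(*args)` signature accepts an optional leading sitemap index URL and an optional trailing hash of extra engines, then merges that hash over the defaults. The sketch below isolates that argument handling so it can be run without the gem or any network access. `default_engines`, `fallback_index_url` and `ping_urls` are hypothetical names introduced here; the real method opens each URL rather than returning them.

```ruby
require 'cgi'

# Assumption: a single default engine, copied from the diff, stands in
# for the gem's full default hash.
def default_engines
  { :google => 'http://www.google.com/webmasters/sitemaps/ping?sitemap=%s' }
end

# Assumption: stands in for the gem's computed sitemap_index_url.
def fallback_index_url
  'http://example.com/sitemap_index.xml.gz'
end

# Hypothetical stand-in for ping_search_engines: same argument parsing
# as the diff, but returns the expanded URLs instead of requesting them.
def ping_urls(*args)
  # A trailing Hash is treated as extra engines; otherwise there are none.
  engines = args.last.is_a?(Hash) ? args.pop : {}
  # The first remaining argument, if any, is the sitemap index URL.
  index_url = CGI.escape(args.shift || fallback_index_url)

  default_engines.merge(engines).each_with_object({}) do |(engine, link), urls|
    urls[engine] = link % index_url
  end
end

p ping_urls(:super_engine => 'http://superengine.com/ping?url=%s')
p ping_urls('http://example.com/sitemap_index.xml.gz',
            :super_engine => 'http://superengine.com/ping?url=%s')
```

With the fallback URL chosen above, both calls produce the same hash, which matches the "Is equivalent to" note in the method's documentation. Because `merge` lets later keys win, passing an engine under an existing key such as `:google` would override the default ping URL for that call.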
metadata
CHANGED
@@ -1,13 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: sitemap_generator
 version: !ruby/object:Gem::Version
-  hash:
+  hash: 5
   prerelease:
   segments:
   - 2
+  - 2
   - 1
-
-  version: 2.1.8
+  version: 2.2.1
 platform: ruby
 authors:
 - Karl Varga
@@ -16,7 +16,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date:
+date: 2012-01-07 00:00:00 -08:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency