sitemap_generator 2.1.8 → 2.2.1
- data/Gemfile.lock +1 -1
- data/README.md +37 -26
- data/VERSION +1 -1
- data/lib/sitemap_generator/link_set.rb +73 -17
- metadata +4 -4
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -26,6 +26,8 @@ Does your website use SitemapGenerator to generate Sitemaps? Where would you be
|
|
26
26
|
Changelog
|
27
27
|
-------
|
28
28
|
|
29
|
+
- v2.2.1: Support adding new search engines to ping and modifying the default search engines.
|
30
|
+
Allow the URL of the sitemap index to be passed as an argument to `ping_search_engines`. See **Pinging Search Engines**.
|
29
31
|
- v2.1.8: Extend and improve Video Sitemap support. Include sitemap docs in the README, support all element attributes, properly format values.
|
30
32
|
- v2.1.7: Improve format of float priorities; Remove Yahoo from ping - the Yahoo
|
31
33
|
service has been shut down.
|
@@ -57,41 +59,29 @@ Those who knew him know what an amazing guy he was, and what an excellent Rails
|
|
57
59
|
|
58
60
|
The canonical repository is now: [http://github.com/kjvarga/sitemap_generator][canonical_repo]
|
59
61
|
|
60
|
-
Install
|
62
|
+
Install
|
61
63
|
=======
|
62
64
|
|
63
|
-
Rails
|
64
|
-
|
65
|
+
Rails
|
66
|
+
-----
|
65
67
|
|
66
|
-
Add the gem to your `
|
68
|
+
Add the gem to your `Gemfile`:
|
67
69
|
|
68
70
|
gem 'sitemap_generator'
|
69
71
|
|
70
|
-
|
71
|
-
|
72
|
-
Rails 2 Gem
|
73
|
-
--------
|
74
|
-
|
75
|
-
1. Follow the Rails 3 install if you are using a `Gemfile`.
|
76
|
-
|
77
|
-
If you are not using a `Gemfile` add the gem to your `config/environment.rb` configuration block with:
|
78
|
-
|
79
|
-
config.gem 'sitemap_generator'
|
72
|
+
Alternatively, if you are not using a `Gemfile` add the gem to your `config/environment.rb` file config block:
|
80
73
|
|
81
|
-
|
74
|
+
config.gem 'sitemap_generator'
|
82
75
|
|
83
|
-
2
|
76
|
+
**Rails 1 or 2 only**, add the following code to your `Rakefile` to include the gem's Rake tasks in your project (Rails 3 does this for you automatically, so this step is not necessary):
|
84
77
|
|
85
|
-
|
86
|
-
|
87
|
-
|
88
|
-
|
89
|
-
|
90
|
-
|
91
|
-
Rails 2 Plugin
|
92
|
-
----------
|
78
|
+
begin
|
79
|
+
require 'sitemap_generator/tasks'
|
80
|
+
rescue Exception => e
|
81
|
+
puts "Warning, couldn't load gem tasks: #{e.message}! Skipping..."
|
82
|
+
end
|
93
83
|
|
94
|
-
|
84
|
+
<i>If you would prefer to install as a plugin (deprecated) don't do any of the above. Simply run `script/plugin install git://github.com/kjvarga/sitemap_generator.git` from your application root directory.</i>
|
95
85
|
|
96
86
|
Getting Started
|
97
87
|
======
|
@@ -107,11 +97,32 @@ Run `rake sitemap:refresh` as needed to create or rebuild your sitemap files. S
|
|
107
97
|
|
108
98
|
**To disable all non-essential output from `rake` run the tasks passing a `-s` option.** For example: `rake -s sitemap:refresh`.
|
109
99
|
|
110
|
-
Search
|
100
|
+
Pinging Search Engines
|
111
101
|
-----
|
112
102
|
|
113
103
|
Using `rake sitemap:refresh` will notify major search engines to let them know that a new sitemap is available (Google, Bing, Ask, SitemapWriter). To generate new sitemaps without notifying search engines (for example when running in a local environment) use `rake sitemap:refresh:no_ping`.
|
114
104
|
|
105
|
+
If you want to customize the hash of search engines you can access it at:
|
106
|
+
|
107
|
+
SitemapGenerator::Sitemap.search_engines
|
108
|
+
|
109
|
+
Usually you would be adding a new search engine to ping. In this case you can modify the `search_engines` hash directly. This ensures that when `SitemapGenerator::Sitemap.ping_search_engines` is called your new search engine will be included.
|
110
|
+
|
111
|
+
If you are calling `ping_search_engines` manually (for example if you have to wait some time or perform a custom action after your sitemaps have been regenerated) then you can pass your new search engine directly in the call, as in the following example:
|
112
|
+
|
113
|
+
SitemapGenerator::Sitemap.ping_search_engines(:newengine => 'http://newengine.com/ping?url=%s')
|
114
|
+
|
115
|
+
The key gives the name of the search engine as a string or symbol and the value is the full URL to ping with a string interpolation that will be replaced by the CGI escaped sitemap index URL. If you have any literal percent characters in your URL you need to escape them with `%%`.
|
116
|
+
|
117
|
+
If you are calling `SitemapGenerator::Sitemap.ping_search_engines` from outside of your sitemap config file then you will need to set `SitemapGenerator::Sitemap.default_host` and any other options that you set in your sitemap config which affect the location of the sitemap index file. For example:
|
118
|
+
|
119
|
+
SitemapGenerator::Sitemap.default_host = 'http://example.com'
|
120
|
+
SitemapGenerator::Sitemap.ping_search_engines
|
121
|
+
|
122
|
+
Alternatively you can pass in the full URL to your sitemap index in which case we would have just the following:
|
123
|
+
|
124
|
+
SitemapGenerator::Sitemap.ping_search_engines('http://example.com/sitemap_index.xml.gz')
|
125
|
+
|
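The ping URL templating described in the added README text can be sketched in plain Ruby. This is a minimal illustration, not code from the gem: `expand_ping_url` is a hypothetical helper name, and it assumes only what the README states, namely that `%s` is replaced by the CGI-escaped sitemap index URL and that literal percent characters are written as `%%`.

```ruby
require 'cgi'

# Hypothetical helper (not part of sitemap_generator): expands a ping URL
# template the way the README describes, by substituting the CGI-escaped
# sitemap index URL for the `%s` placeholder.
def expand_ping_url(template, sitemap_index_url)
  template % CGI.escape(sitemap_index_url)
end

index_url = 'http://example.com/sitemap_index.xml.gz'

# A plain template with a single `%s` placeholder.
puts expand_ping_url('http://newengine.com/ping?url=%s', index_url)

# A template with a literal percent sign, escaped as `%%`.
puts expand_ping_url('http://example.com/100%%/ping?url=%s', index_url)
```

An unescaped literal `%` in a template would be read by `String#%` as the start of a format directive, which is why the README asks for `%%`.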
115
126
|
Crontab
|
116
127
|
-----
|
117
128
|
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
2.1
|
1
|
+
2.2.1
|
data/lib/sitemap_generator/link_set.rb
CHANGED
@@ -82,14 +82,17 @@ module SitemapGenerator
|
|
82
82
|
#
|
83
83
|
# * <tt>:sitemaps_namer</tt> - A +SitemapNamer+ instance for generating the sitemap names.
|
84
84
|
#
|
85
|
-
# * <tt
|
85
|
+
# * <tt>:include_index</tt> - Boolean. Whether to <b>add a link to the sitemap index</b>
|
86
86
|
# to the current sitemap. This points search engines to your Sitemap Index to
|
87
87
|
# include it in the indexing of your site. Default is `true`. Turned off when
|
88
88
|
# `sitemaps_host` is set or within a `group()` block.
|
89
89
|
#
|
90
|
-
# * <tt
|
90
|
+
# * <tt>:include_root</tt> - Boolean. Whether to **add the root** url i.e. '/' to the
|
91
91
|
# current sitemap. Default is `true`. Turned off within a `group()` block.
|
92
92
|
#
|
93
|
+
# * <tt>:search_engines</tt> - Hash. A hash of search engine names mapped to
|
94
|
+
# ping URLs. See ping_search_engines.
|
95
|
+
#
|
93
96
|
# * <tt>:verbose</tt> - If +true+, output a summary line for each sitemap and sitemap
|
94
97
|
# index that is created. Default is +false+.
|
95
98
|
def initialize(options={})
|
@@ -97,7 +100,13 @@ module SitemapGenerator
|
|
97
100
|
:include_root => true,
|
98
101
|
:include_index => true,
|
99
102
|
:filename => :sitemap,
|
100
|
-
:verbose => false
|
103
|
+
:verbose => false,
|
104
|
+
:search_engines => {
|
105
|
+
:google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=%s",
|
106
|
+
:ask => "http://submissions.ask.com/ping?sitemap=%s",
|
107
|
+
:bing => "http://www.bing.com/webmaster/ping.aspx?siteMap=%s",
|
108
|
+
:sitemap_writer => "http://www.sitemapwriter.com/notify.php?crawler=all&url=%s"
|
109
|
+
}
|
101
110
|
})
|
102
111
|
options.each_pair { |k, v| instance_variable_set("@#{k}".to_sym, v) }
|
103
112
|
|
@@ -182,23 +191,51 @@ module SitemapGenerator
|
|
182
191
|
@group
|
183
192
|
end
|
184
193
|
|
185
|
-
# Ping search engines.
|
194
|
+
# Ping search engines to notify them of updated sitemaps.
|
195
|
+
#
|
196
|
+
# Search engines are already notified for you if you run `rake sitemap:refresh`.
|
197
|
+
# If you want to ping search engines separately to your sitemap generation, run
|
198
|
+
# `rake sitemap:refresh:no_ping` and then run a rake task or script
|
199
|
+
# which calls this method as in the example below.
|
200
|
+
#
|
201
|
+
# == Arguments
|
202
|
+
# * sitemap_index_url - The full URL to your sitemap index file.
|
203
|
+
# If not provided the location is based on the `host` you have
|
204
|
+
# set and any other options like your `sitemaps_path`. The URL
|
205
|
+
# will be CGI escaped for you when included as part of the
|
206
|
+
# search engine ping URL.
|
207
|
+
#
|
208
|
+
# == Options
|
209
|
+
# A hash of one or more search engines to ping in addition to the
|
210
|
+
# default search engines. The key is the name of the search engine
|
211
|
+
# as a string or symbol and the value is the full URL to ping with
|
212
|
+
# a string interpolation that will be replaced by the CGI escaped sitemap
|
213
|
+
# index URL. If you have any literal percent characters in your URL you
|
214
|
+
# need to escape them with `%%`. For example if your sitemap index URL
|
215
|
+
# is `http://example.com/sitemap_index.xml.gz` and your
|
216
|
+
# ping url is `http://example.com/100%%/ping?url=%s`
|
217
|
+
# then the final URL that is pinged will be `http://example.com/100%/ping?url=http%3A%2F%2Fexample.com%2Fsitemap_index.xml.gz`
|
218
|
+
#
|
219
|
+
# == Examples
|
220
|
+
#
|
221
|
+
# Both of these examples will ping the default search engines in addition to `http://superengine.com/ping?url=http%3A%2F%2Fexample.com%2Fsitemap_index.xml.gz`
|
222
|
+
#
|
223
|
+
# SitemapGenerator::Sitemap.host('http://example.com/')
|
224
|
+
# SitemapGenerator::Sitemap.ping_search_engines(:super_engine => 'http://superengine.com/ping?url=%s')
|
225
|
+
#
|
226
|
+
# Is equivalent to:
|
186
227
|
#
|
187
|
-
#
|
188
|
-
def ping_search_engines
|
228
|
+
# SitemapGenerator::Sitemap.ping_search_engines('http://example.com/sitemap_index.xml.gz', :super_engine => 'http://superengine.com/ping?url=%s')
|
229
|
+
def ping_search_engines(*args)
|
230
|
+
engines = args.last.is_a?(Hash) ? args.pop : {}
|
231
|
+
index_url = CGI.escape(args.shift || sitemap_index_url)
|
232
|
+
|
189
233
|
require 'open-uri'
|
190
234
|
require 'timeout'
|
191
235
|
|
192
|
-
sitemap_index_url = CGI.escape(sitemap_index.location.url)
|
193
|
-
search_engines = {
|
194
|
-
:google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=#{sitemap_index_url}",
|
195
|
-
:ask => "http://submissions.ask.com/ping?sitemap=#{sitemap_index_url}",
|
196
|
-
:bing => "http://www.bing.com/webmaster/ping.aspx?siteMap=#{sitemap_index_url}",
|
197
|
-
:sitemap_writer => "http://www.sitemapwriter.com/notify.php?crawler=all&url=#{sitemap_index_url}"
|
198
|
-
}
|
199
|
-
|
200
236
|
puts "\n" if verbose
|
201
|
-
search_engines.each do |engine, link|
|
237
|
+
search_engines.merge(engines).each do |engine, link|
|
238
|
+
link = link % index_url
|
202
239
|
begin
|
203
240
|
Timeout::timeout(10) {
|
204
241
|
open(link)
|
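The argument handling added to `ping_search_engines` in this hunk (an optional leading URL plus an optional trailing hash of extra engines) can be tried out without any network access. The sketch below mirrors the three lines of logic shown in the diff; `build_ping_links` and `DEFAULT_ENGINES` are hypothetical names introduced for illustration, and the default index URL is passed in explicitly because there is no `LinkSet` instance here.

```ruby
require 'cgi'

# Assumed stand-in for the gem's default `search_engines` hash (one entry only).
DEFAULT_ENGINES = {
  :google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=%s"
}

# Hypothetical helper: returns the final ping URLs instead of opening them.
def build_ping_links(default_index_url, defaults, *args)
  # A trailing Hash is treated as extra engines, as in the diff.
  engines = args.last.is_a?(Hash) ? args.pop : {}
  # Any remaining first argument is the sitemap index URL; otherwise fall back.
  index_url = CGI.escape(args.shift || default_index_url)
  defaults.merge(engines).each_with_object({}) do |(engine, link), out|
    out[engine] = link % index_url
  end
end

fallback = 'http://example.com/sitemap_index.xml.gz'
extra    = { :super_engine => 'http://superengine.com/ping?url=%s' }

# Extra engine only: the fallback index URL is used.
p build_ping_links(fallback, DEFAULT_ENGINES, extra)

# Explicit index URL plus extra engine.
p build_ping_links(fallback, DEFAULT_ENGINES, 'http://example.org/sitemap_index.xml.gz', extra)
```

Because `merge` returns a new hash, the extra engines passed in a call apply to that call only and leave the defaults unchanged.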
@@ -221,16 +258,21 @@ module SitemapGenerator
|
|
221
258
|
@sitemaps_host || @default_host
|
222
259
|
end
|
223
260
|
|
224
|
-
# Lazy-initialize a sitemap instance
|
261
|
+
# Lazy-initialize a sitemap instance and return it.
|
225
262
|
def sitemap
|
226
263
|
@sitemap ||= SitemapGenerator::Builder::SitemapFile.new(sitemap_location)
|
227
264
|
end
|
228
265
|
|
229
|
-
# Lazy-initialize a sitemap index instance
|
266
|
+
# Lazy-initialize a sitemap index instance and return it.
|
230
267
|
def sitemap_index
|
231
268
|
@sitemap_index ||= SitemapGenerator::Builder::SitemapIndexFile.new(sitemap_index_location)
|
232
269
|
end
|
233
270
|
|
271
|
+
# Return the full url to the sitemap index file.
|
272
|
+
def sitemap_index_url
|
273
|
+
sitemap_index.location.url
|
274
|
+
end
|
275
|
+
|
234
276
|
def finalize!
|
235
277
|
finalize_sitemap!
|
236
278
|
finalize_sitemap_index!
|
@@ -414,6 +456,20 @@ module SitemapGenerator
|
|
414
456
|
self.sitemap_index_namer = SitemapGenerator::SitemapIndexNamer.new("#{@filename}_index")
|
415
457
|
end
|
416
458
|
|
459
|
+
# Set the search engines hash to a new hash of search engine names mapped to
|
460
|
+
# ping URLs (see ping_search_engines). If the value is nil it is converted
|
461
|
+
# to an empty hash.
|
462
|
+
# === Example
|
463
|
+
# <tt>search_engines = { :google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=%s" }</tt>
|
464
|
+
def search_engines=(value)
|
465
|
+
@search_engines = value || {}
|
466
|
+
end
|
467
|
+
|
468
|
+
# Return the hash of search engines.
|
469
|
+
def search_engines
|
470
|
+
@search_engines || {}
|
471
|
+
end
|
472
|
+
|
417
473
|
# Set the namer to use when generating SitemapFiles (does not apply to the
|
418
474
|
# SitemapIndexFile)
|
419
475
|
def sitemaps_namer=(value)
|
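The nil-safe accessor pair added above can be exercised in isolation. This is a small sketch under stated assumptions: `EngineRegistry` is a hypothetical class that copies only the two methods from the diff, and unlike the gem's `LinkSet` it has no initializer that sets a default hash.

```ruby
# Hypothetical class reproducing only the accessor pair from the diff.
class EngineRegistry
  # A nil value is converted to an empty hash.
  def search_engines=(value)
    @search_engines = value || {}
  end

  # Always returns a hash, even if nothing has been assigned yet.
  def search_engines
    @search_engines || {}
  end
end

registry = EngineRegistry.new
registry.search_engines = nil
p registry.search_engines # an empty hash rather than nil

registry.search_engines = { :google => "http://www.google.com/webmasters/sitemaps/ping?sitemap=%s" }
# Modifying the stored hash directly adds an engine for later pings.
registry.search_engines[:newengine] = 'http://newengine.com/ping?url=%s'
p registry.search_engines.keys
```

One consequence of this shape: if nothing has ever been assigned, the getter returns a fresh empty hash each time, so direct modification only persists once a hash has been stored. In the gem the initializer supplies a default `:search_engines` hash, as the earlier hunk shows.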
metadata
CHANGED
@@ -1,13 +1,13 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: sitemap_generator
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
hash:
|
4
|
+
hash: 5
|
5
5
|
prerelease:
|
6
6
|
segments:
|
7
7
|
- 2
|
8
|
+
- 2
|
8
9
|
- 1
|
9
|
-
|
10
|
-
version: 2.1.8
|
10
|
+
version: 2.2.1
|
11
11
|
platform: ruby
|
12
12
|
authors:
|
13
13
|
- Karl Varga
|
@@ -16,7 +16,7 @@ autorequire:
|
|
16
16
|
bindir: bin
|
17
17
|
cert_chain: []
|
18
18
|
|
19
|
-
date:
|
19
|
+
date: 2012-01-07 00:00:00 -08:00
|
20
20
|
default_executable:
|
21
21
|
dependencies:
|
22
22
|
- !ruby/object:Gem::Dependency
|