sitemap_generator 5.3.1 → 6.1.1
- checksums.yaml +5 -13
- data/CHANGES.md +31 -0
- data/MIT-LICENSE +1 -1
- data/README.md +204 -223
- data/VERSION +1 -1
- data/lib/capistrano/tasks/sitemap_generator.cap +3 -3
- data/lib/sitemap_generator.rb +12 -10
- data/lib/sitemap_generator/adapters/aws_sdk_adapter.rb +44 -29
- data/lib/sitemap_generator/adapters/file_adapter.rb +1 -2
- data/lib/sitemap_generator/adapters/fog_adapter.rb +9 -5
- data/lib/sitemap_generator/adapters/google_storage_adapter.rb +37 -0
- data/lib/sitemap_generator/adapters/s3_adapter.rb +14 -11
- data/lib/sitemap_generator/adapters/wave_adapter.rb +4 -4
- data/lib/sitemap_generator/application.rb +6 -9
- data/lib/sitemap_generator/builder/sitemap_index_file.rb +1 -0
- data/lib/sitemap_generator/core_ext/big_decimal.rb +17 -7
- data/lib/sitemap_generator/interpreter.rb +6 -4
- data/lib/sitemap_generator/link_set.rb +8 -4
- data/lib/sitemap_generator/templates.rb +2 -2
- metadata +70 -28
checksums.yaml
CHANGED
@@ -1,15 +1,7 @@
 ---
-
-metadata.gz:
-
-data.tar.gz: !binary |-
-OTg0MzFiYjliNzgyYTU5MzU4NzMwMzkyODgzMWVjMWQ4YWJkYjhjYw==
+SHA256:
+metadata.gz: 4bad56ec68535f010d3e9b119f5da875f73f7686deb6f8893b96c945f389f80b
+data.tar.gz: 951764f5c067836cb0a1d63c6ea4d0e431a3bbbd029572c331b7293c0bb5e5c5
 SHA512:
-metadata.gz:
-
-ZjVlODkzNmZkMTZjOGZlNDk3NGY3ZGZiMjg0NzM2YjAwOWViYzQzNzRmNjQ3
-NDRlOWYyZTM4NmEyY2I5N2UxODVjMDc0MDFjODJjMDYxODMyMjE=
-data.tar.gz: !binary |-
-NWMyMTZjOWJiOWFlM2UzM2Y0MjRmOWZjOTAxMmJiMjc4MGU5OTEyZDFjYjk3
-MmY4YjMwYjczMTBkMDkwMGFiMjM2ODVmNDc2MTFiMzM0MDUwYTY4YWQ0ZjZi
-Y2QzNzg4YWVkOTMwNzVhOTVjYjkxYmQ5Yjg3YTRlYjNiZDE0NWM=
+metadata.gz: 372bc48c5160589d87ce184f7f749418e17d89812cdf978de33864636685ff4a11e18464b7f2556c095224526682c39b3526e5d1da91f1d718a2907ae31529a2
+data.tar.gz: 708074cc61da80d13270bf822c3b60a480354cd6d102cb8f714e22a06cac67a6a2b4a6962d5d7f3420e9a73c08e57475932a9720b7bf72411dcaed57617c35a0
data/CHANGES.md
CHANGED
@@ -1,3 +1,34 @@
+### 6.1.1
+
+* Resolve deprecation warning on using Kernel#open in Ruby 2.7 (use URI.open instead) [#342](https://github.com/kjvarga/sitemap_generator/pull/342)
+* Support S3 Endpoints for S3 Compliant Providers like DigitalOcean Spaces [#325](https://github.com/kjvarga/sitemap_generator/pull/325)
+
+### 6.1.0
+
+* Support uploading files to Google Cloud Storage [#326](https://github.com/kjvarga/sitemap_generator/pull/326) and [#340](https://github.com/kjvarga/sitemap_generator/pull/340)
+
+### 6.0.2
+
+* Resolve `BigDecimal.new is deprecated` warnings in Ruby 2.5 [#305](https://github.com/kjvarga/sitemap_generator/pull/305).
+* Resolve `instance variable not initialized`, `File.exists? is deprecated` and `'*' interpreted as argument prefix` warnings [#304](https://github.com/kjvarga/sitemap_generator/pull/304).
+
+### 6.0.1
+
+* Use `yaml_tag` instead of `yaml_as`, which was deprecated in Ruby 2.4, and removed in 2.5 [#298](https://github.com/kjvarga/sitemap_generator/pull/298).
+
+### 6.0.0
+
+*Backwards incompatible changes*
+
+* Adapters (AWS SDK, S3, Fog & Wave) no longer load their dependencies. It is up to the user
+to `require` the appropriate libraries for the adapter to work.
+* AwsSdkAdapter: Fixed [#279](https://github.com/kjvarga/sitemap_generator/issues/279) where sitemaps were incorrectly nested under a `sitemaps/` directory in S3
+* Stop supporting Ruby < 2.0, test with Ruby 2.4.
+
+*Other changes*
+
+* If Rails is defined but the application is not loaded, don't include the URL helpers.
+
 ### 5.3.1
 
 * Ensure files have 644 permissions when building to try to address issue [#264](https://github.com/kjvarga/sitemap_generator/issues/264)
|
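The 6.0.2 entry above mentions the `BigDecimal.new is deprecated` warning. The changelog does not show the patch itself, so the following is only a minimal sketch of the general pattern such a fix usually follows: calling the `Kernel#BigDecimal` conversion method instead of `BigDecimal.new`. The variable name `priority` is an illustrative assumption, not taken from the gem.

```ruby
require 'bigdecimal'

# Kernel#BigDecimal is the supported constructor; BigDecimal.new was
# deprecated and later removed from the bigdecimal library.
# `priority` is a hypothetical name used only for this illustration.
priority = BigDecimal('0.5')

# Print the value in plain (non-scientific) notation.
puts priority.to_s('F')
```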
data/MIT-LICENSE
CHANGED
data/README.md
CHANGED
@@ -9,7 +9,7 @@ Sitemaps adhere to the [Sitemap 0.9 protocol][sitemap_protocol] specification.
|
|
9
9
|
* Framework agnostic
|
10
10
|
* Supports [News sitemaps][sitemap_news], [Video sitemaps][sitemap_video], [Image sitemaps][sitemap_images], [Mobile sitemaps][sitemap_mobile], [PageMap sitemaps][sitemap_pagemap] and [Alternate Links][alternate_links]
|
11
11
|
* Supports read-only filesystems like Heroku via uploading to a remote host like Amazon S3
|
12
|
-
* Compatible with
|
12
|
+
* Compatible with all versions of Rails and Ruby
|
13
13
|
* Adheres to the [Sitemap 0.9 protocol][sitemap_protocol]
|
14
14
|
* Handles millions of links
|
15
15
|
* Customizable sitemap compression
|
@@ -20,7 +20,7 @@ Sitemaps adhere to the [Sitemap 0.9 protocol][sitemap_protocol] specification.
|
|
20
20
|
|
21
21
|
### Show Me
|
22
22
|
|
23
|
-
This is a simple standalone example. For Rails installation see the [Rails instructions](#rails) in the [Install](#
|
23
|
+
This is a simple standalone example. For Rails installation see the [Rails instructions](#rails) in the [Install](#installation) section.
|
24
24
|
|
25
25
|
Install:
|
26
26
|
|
@@ -59,69 +59,68 @@ Successful ping of Google
|
|
59
59
|
Successful ping of Bing
|
60
60
|
```
|
61
61
|
|
62
|
-
##
|
63
|
-
|
64
|
-
|
65
|
-
|
66
|
-
|
67
|
-
|
68
|
-
|
69
|
-
|
70
|
-
|
71
|
-
|
72
|
-
|
73
|
-
|
74
|
-
|
75
|
-
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
80
|
-
|
81
|
-
|
82
|
-
|
83
|
-
|
84
|
-
* [
|
85
|
-
|
86
|
-
|
87
|
-
|
88
|
-
|
89
|
-
|
90
|
-
|
91
|
-
|
92
|
-
|
93
|
-
|
94
|
-
|
95
|
-
|
96
|
-
|
97
|
-
|
98
|
-
|
99
|
-
|
100
|
-
|
101
|
-
|
102
|
-
|
103
|
-
|
104
|
-
|
105
|
-
|
106
|
-
|
107
|
-
|
108
|
-
|
109
|
-
|
110
|
-
|
111
|
-
|
112
|
-
|
113
|
-
|
114
|
-
|
115
|
-
|
116
|
-
|
117
|
-
|
118
|
-
|
119
|
-
|
120
|
-
|
121
|
-
|
122
|
-
|
123
|
-
|
124
|
-
* [Thanks (in no particular order)](#thanks-in-no-particular-order)
|
62
|
+
## Contents
|
63
|
+
|
64
|
+
* [Features](#features)
|
65
|
+
+ [Show Me](#show-me)
|
66
|
+
* [Contents](#contents)
|
67
|
+
* [Contribute](#contribute)
|
68
|
+
* [Foreword](#foreword)
|
69
|
+
* [Installation](#installation)
|
70
|
+
+ [Ruby](#ruby)
|
71
|
+
+ [Rails](#rails)
|
72
|
+
* [Getting Started](#getting-started)
|
73
|
+
+ [Preventing Output](#preventing-output)
|
74
|
+
+ [Rake Tasks](#rake-tasks)
|
75
|
+
+ [Pinging Search Engines](#pinging-search-engines)
|
76
|
+
+ [Crontab](#crontab)
|
77
|
+
+ [Robots.txt](#robotstxt)
|
78
|
+
+ [Ruby Modules](#ruby-modules)
|
79
|
+
+ [Deployments & Capistrano](#deployments--capistrano)
|
80
|
+
+ [Sitemaps with no Index File](#sitemaps-with-no-index-file)
|
81
|
+
+ [Upload Sitemaps to a Remote Host using Adapters](#upload-sitemaps-to-a-remote-host-using-adapters)
|
82
|
+
- [Supported Adapters](#supported-adapters)
|
83
|
+
* [`SitemapGenerator::FileAdapter`](#sitemapgeneratorfileadapter)
|
84
|
+
* [`SitemapGenerator::FogAdapter`](#sitemapgeneratorfogadapter)
|
85
|
+
* [`SitemapGenerator::S3Adapter`](#sitemapgenerators3adapter)
|
86
|
+
* [`SitemapGenerator::AwsSdkAdapter`](#sitemapgeneratorawssdkadapter)
|
87
|
+
* [`SitemapGenerator::WaveAdapter`](#sitemapgeneratorwaveadapter)
|
88
|
+
* [`SitemapGenerator::GoogleStorageAdapter`](#sitemapgeneratorgooglestorageadapter)
|
89
|
+
- [An Example of Using an Adapter](#an-example-of-using-an-adapter)
|
90
|
+
+ [Generating Multiple Sitemaps](#generating-multiple-sitemaps)
|
91
|
+
* [Sitemap Configuration](#sitemap-configuration)
|
92
|
+
+ [A Simple Example](#a-simple-example)
|
93
|
+
+ [Adding Links](#adding-links)
|
94
|
+
+ [Supported Options to `add`](#supported-options-to-add)
|
95
|
+
+ [Adding Links to the Sitemap Index](#adding-links-to-the-sitemap-index)
|
96
|
+
+ [Accessing the LinkSet instance](#accessing-the-linkset-instance)
|
97
|
+
+ [Speeding Things Up](#speeding-things-up)
|
98
|
+
* [Customizing your Sitemaps](#customizing-your-sitemaps)
|
99
|
+
+ [Sitemap Options](#sitemap-options)
|
100
|
+
* [Sitemap Groups](#sitemap-groups)
|
101
|
+
+ [A Groups Example](#a-groups-example)
|
102
|
+
+ [Using `group` without a block](#using-group-without-a-block)
|
103
|
+
* [Sitemap Extensions](#sitemap-extensions)
|
104
|
+
+ [News Sitemaps](#news-sitemaps)
|
105
|
+
- [Example](#example)
|
106
|
+
- [Supported options](#supported-options)
|
107
|
+
+ [Image Sitemaps](#image-sitemaps)
|
108
|
+
- [Example](#example-1)
|
109
|
+
- [Supported options](#supported-options-1)
|
110
|
+
+ [Video Sitemaps](#video-sitemaps)
|
111
|
+
- [Example](#example-2)
|
112
|
+
- [Supported options](#supported-options-2)
|
113
|
+
+ [PageMap Sitemaps](#pagemap-sitemaps)
|
114
|
+
- [Supported options](#supported-options-3)
|
115
|
+
- [Example:](#example)
|
116
|
+
+ [Alternate Links](#alternate-links)
|
117
|
+
- [Example](#example-3)
|
118
|
+
- [Supported options](#supported-options-4)
|
119
|
+
+ [Mobile Sitemaps](#-mobile-sitemaps)
|
120
|
+
- [Example](#example-4)
|
121
|
+
- [Supported options](#supported-options-5)
|
122
|
+
* [Compatibility](#compatibility)
|
123
|
+
* [Licence](#licence)
|
125
124
|
|
126
125
|
## Contribute
|
127
126
|
|
@@ -139,7 +138,7 @@ Those who knew him know what an amazing guy he was, and what an excellent Rails
|
|
139
138
|
The canonical repository is: [http://github.com/kjvarga/sitemap_generator][canonical_repo]
|
140
139
|
|
141
140
|
|
142
|
-
##
|
141
|
+
## Installation
|
143
142
|
|
144
143
|
### Ruby
|
145
144
|
|
@@ -157,7 +156,7 @@ The Rake tasks expect your sitemap to be at `config/sitemap.rb` but if you need
|
|
157
156
|
|
158
157
|
### Rails
|
159
158
|
|
160
|
-
SitemapGenerator works
|
159
|
+
SitemapGenerator works with all versions of Rails and has been tested in Rails 2, 3 and 4.
|
161
160
|
|
162
161
|
Add the gem to your `Gemfile`:
|
163
162
|
|
@@ -165,24 +164,13 @@ Add the gem to your `Gemfile`:
|
|
165
164
|
gem 'sitemap_generator'
|
166
165
|
```
|
167
166
|
|
168
|
-
Alternatively, if you are not using a `Gemfile` add the gem to your `config/
|
167
|
+
Alternatively, if you are not using a `Gemfile` add the gem to your `config/application.rb` file config block:
|
169
168
|
|
170
169
|
```ruby
|
171
170
|
config.gem 'sitemap_generator'
|
172
171
|
```
|
173
172
|
|
174
|
-
|
175
|
-
**Rails 1 or 2 only**, add the following code to your `Rakefile` to include the gem's Rake tasks in your project (Rails 3 does this for you automatically, so this step is not necessary):
|
176
|
-
|
177
|
-
```ruby
|
178
|
-
begin
|
179
|
-
require 'sitemap_generator/tasks'
|
180
|
-
rescue Exception => e
|
181
|
-
puts "Warning, couldn't load gem tasks: #{e.message}! Skipping..."
|
182
|
-
end
|
183
|
-
```
|
184
|
-
|
185
|
-
_If you would prefer to install as a plugin (deprecated) don't do any of the above. Simply run `script/plugin install git://github.com/kjvarga/sitemap_generator.git` from your application root directory._
|
173
|
+
Note: SitemapGenerator automatically loads its Rake tasks when used with Rails. You **do not need** to require the `sitemap_generator/tasks` file.
|
186
174
|
|
187
175
|
## Getting Started
|
188
176
|
|
@@ -198,14 +186,23 @@ SitemapGenerator.verbose = false
|
|
198
186
|
|
199
187
|
### Rake Tasks
|
200
188
|
|
201
|
-
* `rake sitemap:install` will create a `config/sitemap.rb` file which is your sitemap configuration
|
202
|
-
|
203
|
-
|
189
|
+
* `rake sitemap:install` will create a `config/sitemap.rb` file which is your sitemap configuration
|
190
|
+
and contains everything needed to build your sitemap. See
|
191
|
+
[**Sitemap Configuration**](#sitemap-configuration) below for more information about how to
|
192
|
+
define your sitemap.
|
193
|
+
|
194
|
+
* `rake sitemap:refresh` will create or rebuild your sitemap files as needed. Sitemaps are
|
195
|
+
generated into the `public/` folder and by default are named `sitemap.xml.gz`, `sitemap1.xml.gz`,
|
196
|
+
`sitemap2.xml.gz`, etc. As you can see, they are automatically GZip compressed for you. In this case,
|
197
|
+
`sitemap.xml.gz` is your sitemap "index" file.
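As a rough illustration of the default naming described above (the index first, then numbered files), the sequence can be sketched as follows. `sitemap_names` is a hypothetical helper written for this note; it is not part of the gem's API, and the gem's own namer classes are not shown in this diff.

```ruby
# Hypothetical helper: lists the default file names described above.
# The first file carries no number; later files are numbered from 1.
def sitemap_names(count, base: 'sitemap', extension: '.xml.gz')
  (0...count).map do |i|
    suffix = i.zero? ? '' : i.to_s
    "#{base}#{suffix}#{extension}"
  end
end

names = sitemap_names(3)
puts names
```

With a count of 3, the first name is the index file and the remaining two are the numbered sitemap files.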
|
204
198
|
|
199
|
+
`rake sitemap:refresh` will output information about each sitemap that is written including its
|
200
|
+
location, how many links it contains, and the size of the file.
|
205
201
|
|
206
202
|
### Pinging Search Engines
|
207
203
|
|
208
|
-
Using `rake sitemap:refresh` will notify
|
204
|
+
Using `rake sitemap:refresh` will notify Google and Bing to let them know that a new sitemap
|
205
|
+
is available. To generate new sitemaps without notifying search engines, use `rake sitemap:refresh:no_ping`.
|
209
206
|
|
210
207
|
If you want to customize the hash of search engines you can access it at:
|
211
208
|
|
@@ -213,24 +210,27 @@ If you want to customize the hash of search engines you can access it at:
|
|
213
210
|
SitemapGenerator::Sitemap.search_engines
|
214
211
|
```
|
215
212
|
|
216
|
-
Usually you would be adding a new search engine to ping. In this case you can modify
|
213
|
+
Usually you would be adding a new search engine to ping. In this case you can modify
|
214
|
+
the `search_engines` hash directly. This ensures that when
|
215
|
+
`SitemapGenerator::Sitemap.ping_search_engines` is called, your new search engine will be included.
|
217
216
|
|
218
|
-
If you are calling `ping_search_engines` manually
|
217
|
+
If you are calling `ping_search_engines` manually, then you can pass your new search engine
|
218
|
+
directly in the call, as in the following example:
|
219
219
|
|
220
220
|
```ruby
|
221
|
-
SitemapGenerator::Sitemap.ping_search_engines(:
|
221
|
+
SitemapGenerator::Sitemap.ping_search_engines(newengine: 'http://newengine.com/ping?url=%s')
|
222
222
|
```
|
223
223
|
|
224
|
-
The key gives the name of the search engine as a string or symbol and the value is the full URL to ping with a string interpolation that will be replaced by the CGI escaped sitemap index URL. If you have any literal percent characters in your URL you need to escape them with `%%`.
|
224
|
+
The key gives the name of the search engine, as a string or symbol, and the value is the full URL to ping, with a string interpolation that will be replaced by the CGI escaped sitemap index URL. If you have any literal percent characters in your URL you need to escape them with `%%`.
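The substitution described above can be sketched with Ruby's own `format` and `CGI.escape`. How the gem performs it internally is not shown in this diff, so treat this as one plausible reading rather than the gem's implementation; `ping_url` is a hypothetical helper name.

```ruby
require 'cgi'

# Hypothetical helper: builds the final ping URL from a template that
# contains one %s placeholder. Literal percent signs in the template
# must be written as %% so that format does not treat them as directives.
def ping_url(template, sitemap_index_url)
  format(template, CGI.escape(sitemap_index_url))
end

url = ping_url('http://newengine.com/ping?url=%s',
               'http://example.com/sitemap.xml.gz')
puts url
```

The `%%` rule matters because the whole template is treated as a format string: an unescaped `%` followed by other characters could be read as a directive and raise an error or produce the wrong URL.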
|
225
225
|
|
226
|
-
If you are calling `SitemapGenerator::Sitemap.ping_search_engines` from outside of your sitemap config file then you will need to set `SitemapGenerator::Sitemap.default_host` and any other options that you set in your sitemap config which affect the location of the sitemap index file. For example:
|
226
|
+
If you are calling `SitemapGenerator::Sitemap.ping_search_engines` from outside of your sitemap config file, then you will need to set `SitemapGenerator::Sitemap.default_host` and any other options that you set in your sitemap config which affect the location of the sitemap index file. For example:
|
227
227
|
|
228
228
|
```ruby
|
229
229
|
SitemapGenerator::Sitemap.default_host = 'http://example.com'
|
230
230
|
SitemapGenerator::Sitemap.ping_search_engines
|
231
231
|
```
|
232
232
|
|
233
|
-
Alternatively you can pass in the full URL to your sitemap index in which case we would have just the following:
|
233
|
+
Alternatively, you can pass in the full URL to your sitemap index, in which case we would have just the following:
|
234
234
|
|
235
235
|
```ruby
|
236
236
|
SitemapGenerator::Sitemap.ping_search_engines('http://example.com/sitemap.xml.gz')
|
@@ -249,7 +249,6 @@ every 1.day, :at => '5:00 am' do
|
|
249
249
|
end
|
250
250
|
```
|
251
251
|
|
252
|
-
|
253
252
|
### Robots.txt
|
254
253
|
|
255
254
|
You should add the URL of the sitemap index file to `public/robots.txt` to help search engines find your sitemaps. The URL should be the complete URL to the sitemap index. For example:
|
@@ -260,13 +259,15 @@ Sitemap: http://www.example.com/sitemap.xml.gz
|
|
260
259
|
|
261
260
|
### Ruby Modules
|
262
261
|
|
263
|
-
If you need to include a module (e.g. a rails helper) you
|
262
|
+
If you need to include a module (e.g. a rails helper), you must include it in the sitemap interpreter
|
263
|
+
class. The part of your sitemap configuration that defines your sitemaps is run within an instance
|
264
|
+
of the `SitemapGenerator::Interpreter`:
|
264
265
|
|
265
266
|
```ruby
|
266
267
|
SitemapGenerator::Interpreter.send :include, RoutingHelper
|
267
268
|
```
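The reason the include works is that the sitemap-defining part of the config is evaluated within an interpreter instance, so any instance method mixed into that class becomes callable there. A minimal sketch of the mechanism, using a hypothetical `MiniInterpreter` stand-in (not the gem's class) and a made-up `RoutingHelper#product_path`:

```ruby
# Made-up helper module, standing in for a Rails helper.
module RoutingHelper
  def product_path(id)
    "/products/#{id}"
  end
end

# Hypothetical stand-in for the interpreter: it evaluates a block in the
# context of its own instance, so `add` and any included helper methods
# resolve against that instance.
class MiniInterpreter
  attr_reader :links

  def initialize
    @links = []
  end

  def add(path)
    @links << path
  end

  def run(&block)
    instance_eval(&block)
    self
  end
end

MiniInterpreter.send :include, RoutingHelper

interpreter = MiniInterpreter.new.run do
  add product_path(42)
end

puts interpreter.links.inspect
```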
|
268
269
|
|
269
|
-
|
270
|
+
### Deployments & Capistrano
|
270
271
|
|
271
272
|
To include the capistrano tasks just add the following to your Capfile:
|
272
273
|
|
@@ -274,6 +275,12 @@ To include the capistrano tasks just add the following to your Capfile:
|
|
274
275
|
require 'capistrano/sitemap_generator'
|
275
276
|
```
|
276
277
|
|
278
|
+
Configurable options:
|
279
|
+
|
280
|
+
```ruby
|
281
|
+
set :sitemap_roles, :web # default
|
282
|
+
```
|
283
|
+
|
277
284
|
Available capistrano tasks:
|
278
285
|
|
279
286
|
```ruby
|
@@ -319,41 +326,111 @@ SitemapGenerator::Sitemap.create_index = :auto
|
|
319
326
|
|
320
327
|
_This section needs better documentation. Please consider contributing._
|
321
328
|
|
329
|
+
Sometimes it is desirable to host your sitemap files on a remote server, and point robots
|
330
|
+
and search engines to the remote files. For example, if you are using a host like Heroku,
|
331
|
+
which doesn't allow writing to the local filesystem. You still require *some* write access,
|
332
|
+
because the sitemap files need to be written out before uploading. So generally a host will
|
333
|
+
give you write access to a temporary directory. On Heroku this is `tmp/` within your application
|
334
|
+
directory.
|
335
|
+
|
322
336
|
#### Supported Adapters
|
323
|
-
* `SitemapGenerator::FileAdapter`
|
324
337
|
|
325
|
-
|
338
|
+
##### `SitemapGenerator::FileAdapter`
|
326
339
|
|
327
|
-
|
340
|
+
Standard adapter, writes out to a file.
|
328
341
|
|
329
|
-
|
342
|
+
##### `SitemapGenerator::FogAdapter`
|
330
343
|
|
331
|
-
|
344
|
+
Uses `Fog::Storage` to upload to any service supported by Fog.
|
332
345
|
|
333
|
-
|
346
|
+
You must `require 'fog'` in your sitemap config before using this adapter,
|
347
|
+
or `require` another library that defines `Fog::Storage`.
|
334
348
|
|
335
|
-
|
349
|
+
##### `SitemapGenerator::S3Adapter`
|
336
350
|
|
337
|
-
Uses `
|
351
|
+
Uses `Fog::Storage` to upload to Amazon S3 storage.
|
338
352
|
|
339
|
-
|
353
|
+
You must `require 'fog-aws'` in your sitemap config before using this adapter.
|
340
354
|
|
341
|
-
|
355
|
+
##### `SitemapGenerator::AwsSdkAdapter`
|
342
356
|
|
343
|
-
|
357
|
+
Uses `Aws::S3::Resource` to upload to Amazon S3 storage. Includes automatic detection of your AWS
|
358
|
+
credentials using `Aws::Credentials`.
|
344
359
|
|
345
|
-
|
346
|
-
|
347
|
-
|
348
|
-
|
349
|
-
|
350
|
-
|
360
|
+
You must `require 'aws-sdk-s3'` in your sitemap config before using this adapter,
|
361
|
+
or `require` another library that defines `Aws::S3::Resource` and `Aws::Credentials`.
|
362
|
+
|
363
|
+
An example of using this adapter in your sitemap configuration:
|
364
|
+
|
365
|
+
```ruby
|
366
|
+
SitemapGenerator::Sitemap.adapter = SitemapGenerator::AwsSdkAdapter.new('s3_bucket',
|
367
|
+
aws_access_key_id: 'AKIAI3SW5CRAZBL4WSTA',
|
368
|
+
aws_secret_access_key: 'asdfadsfdsafsadf',
|
369
|
+
aws_region: 'us-east-1'
|
370
|
+
)
|
371
|
+
```
|
372
|
+
|
373
|
+
##### `SitemapGenerator::AwsSdkAdapter (DigitalOcean Spaces)`
|
374
|
+
|
375
|
+
Uses `Aws::S3::Resource` to upload to an S3-compatible service, here DigitalOcean Spaces. Includes automatic detection of your AWS
|
376
|
+
credentials using `Aws::Credentials`.
|
377
|
+
|
378
|
+
You must `require 'aws-sdk-s3'` in your sitemap config before using this adapter,
|
379
|
+
or `require` another library that defines `Aws::S3::Resource` and `Aws::Credentials`.
|
351
380
|
|
352
|
-
|
381
|
+
An example of using this adapter in your sitemap configuration:
|
353
382
|
|
354
|
-
|
383
|
+
```ruby
|
384
|
+
SitemapGenerator::Sitemap.adapter = SitemapGenerator::AwsSdkAdapter.new('s3_bucket',
|
385
|
+
aws_access_key_id: 'AKIAI3SW5CRAZBL4WSTA',
|
386
|
+
aws_secret_access_key: 'asdfadsfdsafsadf',
|
387
|
+
aws_region: 'sfo2',
|
388
|
+
aws_endpoint: 'https://sfo2.digitaloceanspaces.com'
|
389
|
+
)
|
390
|
+
```
|
355
391
|
|
356
|
-
|
392
|
+
##### `SitemapGenerator::WaveAdapter`
|
393
|
+
|
394
|
+
Uses `CarrierWave::Uploader::Base` to upload to any service supported by CarrierWave, for example,
|
395
|
+
Amazon S3, Rackspace Cloud Files, and MongoDB's GridFS.
|
396
|
+
|
397
|
+
You must `require 'carrierwave'` in your sitemap config before using this adapter,
|
398
|
+
or `require` another library that defines `CarrierWave::Uploader::Base`.
|
399
|
+
|
400
|
+
Some documentation exists [on the wiki page][remote_hosts].
|
401
|
+
|
402
|
+
##### `SitemapGenerator::GoogleStorageAdapter`
|
403
|
+
|
404
|
+
Uses [`Google::Cloud::Storage`][google_cloud_storage_gem] to upload to Google Cloud storage.
|
405
|
+
|
406
|
+
You must `require 'google/cloud/storage'` in your sitemap config before using this adapter.
|
407
|
+
|
408
|
+
An example of using this adapter in your sitemap configuration with options:
|
409
|
+
|
410
|
+
```ruby
|
411
|
+
SitemapGenerator::Sitemap.adapter = SitemapGenerator::GoogleStorageAdapter.new(
|
412
|
+
credentials: 'path/to/keyfile.json',
|
413
|
+
project_id: 'google_account_project_id',
|
414
|
+
bucket: 'name_of_bucket'
|
415
|
+
)
|
416
|
+
```
|
417
|
+
In line with Google's authentication options, the adapter can also pick up credentials from environment variables. All [supported environment variables][google_cloud_storage_authentication] can be used, for example: `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_CREDENTIALS`. An example of using this adapter with the environment variables is:
|
418
|
+
|
419
|
+
```ruby
|
420
|
+
SitemapGenerator::Sitemap.adapter = SitemapGenerator::GoogleStorageAdapter.new(
|
421
|
+
bucket: 'name_of_bucket'
|
422
|
+
)
|
423
|
+
```
|
424
|
+
|
425
|
+
All options other than the `:bucket` option are passed to the `Google::Cloud::Storage.new` initializer giving you maximum configurability. See the [Google Cloud Storage initializer][google_cloud_storage_initializer] for supported options.
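The option handling described above amounts to taking `:bucket` out of the hash and forwarding everything else. A minimal sketch of that split, assuming nothing about the adapter's real source; `split_adapter_options` is a hypothetical helper and no Google library is loaded here:

```ruby
# Hypothetical helper: separates the bucket name from the options that
# would be forwarded to the storage client's initializer.
def split_adapter_options(options = {})
  forwarded = options.dup          # leave the caller's hash untouched
  bucket = forwarded.delete(:bucket)
  [bucket, forwarded]
end

bucket, storage_options = split_adapter_options(
  credentials: 'path/to/keyfile.json',
  project_id: 'google_account_project_id',
  bucket: 'name_of_bucket'
)

puts bucket
puts storage_options.inspect
```

When only `bucket:` is given, the forwarded hash is empty, which matches the environment-variable example above where the client is left to find its own credentials.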
|
426
|
+
|
427
|
+
#### An Example of Using an Adapter
|
428
|
+
|
429
|
+
1. Please see [this wiki page][remote_hosts] for more information about setting up SitemapGenerator to upload to a
|
430
|
+
remote host.
|
431
|
+
|
432
|
+
2. This example uses the CarrierWave adapter. It shows some common settings that are used when the hostname hosting
|
433
|
+
the sitemaps differs from the hostname of the sitemap links.
|
357
434
|
|
358
435
|
```ruby
|
359
436
|
# Your website's host name
|
@@ -368,14 +445,14 @@ Sitemap Generator uses CarrierWave to support uploading to Amazon S3 store, Rack
|
|
368
445
|
# Set this to a directory/path if you don't want to upload to the root of your `sitemaps_host`
|
369
446
|
SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
|
370
447
|
|
371
|
-
#
|
448
|
+
# The adapter to perform the upload of sitemap files.
|
372
449
|
SitemapGenerator::Sitemap.adapter = SitemapGenerator::WaveAdapter.new
|
373
450
|
```
|
374
451
|
|
375
452
|
3. Update your `robots.txt` file to point robots to the remote sitemap index file, e.g:
|
376
453
|
|
377
454
|
```
|
378
|
-
Sitemap: http://s3.amazonaws.com/sitemap-generator/sitemaps/
|
455
|
+
Sitemap: http://s3.amazonaws.com/sitemap-generator/sitemaps/sitemap.xml.gz
|
379
456
|
```
|
380
457
|
|
381
458
|
You generate your sitemaps as usual using `rake sitemap:refresh`.
|
@@ -386,9 +463,10 @@ Sitemap Generator uses CarrierWave to support uploading to Amazon S3 store, Rack
|
|
386
463
|
in the sitemap, something that the sitemap rules forbid. (Since version 3.2 this is no
|
387
464
|
longer an issue because [`include_index` is off by default][include_index_change].)
|
388
465
|
|
389
|
-
4. Verify to
|
466
|
+
4. Verify to Google that you own the S3 url
|
390
467
|
|
391
|
-
In order for Google to use your sitemap, you need to prove you own the
|
468
|
+
In order for Google to use your sitemap, you need to prove you own the S3 bucket through [google webmaster tools](https://www.google.com/webmasters/tools/home?hl=en). In the example above, you would add the site `http://s3.amazonaws.com/sitemap-generator/sitemaps`. Once you have verified you own the directory, then add your
|
469
|
+
sitemap index to the list of sitemaps for the site.
|
392
470
|
|
393
471
|
### Generating Multiple Sitemaps
|
394
472
|
|
@@ -467,7 +545,6 @@ If you want to use a non-standard configuration file, or have multiple configura
|
|
467
545
|
rake sitemap:refresh CONFIG_FILE="config/geo_sitemap.rb"
|
468
546
|
```
|
469
547
|
|
470
|
-
|
471
548
|
### A Simple Example
|
472
549
|
|
473
550
|
So what does a sitemap configuration look like? Let's take a look at a simple example:
|
@@ -1058,117 +1135,18 @@ end
|
|
1058
1135
|
|
1059
1136
|
* `:mobile` - Presence of this option will turn on the mobile flag regardless of value.
|
1060
1137
|
|
1061
|
-
## Raison d'être
|
1062
|
-
|
1063
|
-
Most of the Sitemap plugins out there seem to try to recreate the Sitemap links by iterating the Rails routes. In some cases this is possible, but for a great deal of cases it isn't.
|
1064
|
-
|
1065
|
-
a) There are probably quite a few routes in your routes file that don't need inclusion in the Sitemap. (AJAX routes I'm looking at you.)
|
1066
|
-
|
1067
|
-
and
|
1068
|
-
|
1069
|
-
b) How would you infer the correct series of links for the following route?
|
1070
|
-
|
1071
|
-
```ruby
|
1072
|
-
map.zipcode 'location/:state/:city/:zipcode', :controller => 'zipcode', :action => 'index'
|
1073
|
-
```
|
1074
|
-
|
1075
|
-
Don't tell me it's trivial, because it isn't. It just looks trivial.
|
1076
|
-
|
1077
|
-
So my idea is to have another file similar to 'routes.rb' called 'sitemap.rb', where you can define what goes into the Sitemap.
|
1078
|
-
|
1079
|
-
Here's my solution:
|
1080
|
-
|
1081
|
-
```ruby
|
1082
|
-
Zipcode.find(:all, :include => :city).each do |z|
|
1083
|
-
add zipcode_path(:state => z.city.state, :city => z.city, :zipcode => z)
|
1084
|
-
end
|
1085
|
-
```
|
1086
|
-
|
1087
|
-
Easy hey?
|
1088
|
-
|
1089
1138
|
## Compatibility
|
1090
1139
|
|
1091
|
-
|
1092
|
-
|
1093
|
-
* **Rails** 3.0.0, 3.0.7, 4.2.3
|
1094
|
-
* **Rails** 1.x - 2.3.8
|
1095
|
-
* **Ruby** 1.8.6, 1.8.7, 1.8.7 Enterprise Edition, 1.9.1, 1.9.2, 2.1.3
|
1096
|
-
|
1097
|
-
|
1098
|
-
## Known Bugs
|
1099
|
-
|
1100
|
-
* There's no check on the size of a URL which [isn't supposed to exceed 2,048 bytes][sitemaps_xml].
|
1101
|
-
* Currently only supports one Sitemap Index file, which can contain 50,000 Sitemap files which can each contain 50,000 urls, so it _only_ supports up to 2,500,000,000 (2.5 billion) urls.
|
1102
|
-
|
1103
|
-
|
1104
|
-
## Deprecation Notices and Non-Backwards Compatible Changes
|
1105
|
-
|
1106
|
-
### Version 5.0.0
|
1107
|
-
|
1108
|
-
In version 5.0.0 I've removed a few deprecated methods that have been deprecated for a long time. The reason being that they would have made some new features more difficult and complex to implement. I never actually ouput deprecation notices from these methods, so I understand it you're a little annoyed that your config has suddenly broken. Apologies.
|
1109
|
-
|
1110
|
-
Here's a list of the methods that have been removed:
|
1111
|
-
* Removed options to `LinkSet::add()`: `:sitemaps_namer` and `:sitemap_index_namer` (use `:namer` option)
|
1112
|
-
* Removed `LinkSet::sitemaps_namer=`, `LinkSet::sitemaps_namer` (use `LinkSet::namer=` and `LinkSet::namer`)
|
1113
|
-
* Removed `LinkSet::sitemaps_index_namer=`, `LinkSet::sitemaps_index_namer` (use `LinkSet::namer=` and `LinkSet::namer`)
|
1114
|
-
* Removed the `SitemapGenerator::SitemapNamer` class (use `SitemapGenerator::SimpleNamer`)
|
1115
|
-
* Removed `LinkSet::add_links()` (use `LinkSet::create()`)
|
1116
|
-
|
1117
|
-
### Version 4.0.0
|
1118
|
-
|
1119
|
-
Version 4.0 introduces a new **non-backwards compatible** naming scheme. **If you are running version 3 or earlier and you upgrade to version 4, you need to make a couple small changes to ensure that search engines can still find your sitemaps!** Your sitemaps will still work fine, but the name of the index file has changed.
|
1120
|
-
|
1121
|
-
#### So what has changed?
|
1122
|
-
|
1123
|
-
* **The index is generated intelligently**. SitemapGenerator now detects whether you need an index or not, and only generates one if you need it or have requested it. So small sites (less than 50,000 links) won't have one, large sites will. You don't have to worry about anything. And with the `create_index` option, it's easier than ever to control index creation to suit your needs.
|
1124
|
-
|
1125
|
-
* **The default index file name has changed** from `sitemap_index.xml.gz` to just `sitemap.xml.gz`. So the `_index` part has been removed. This is a more standard naming scheme for the sitemaps. Any further sitemaps are named `sitemap1.xml.gz`, `sitemap2.xml.gz`, `sitemap3.xml.gz` etc, just as before.
|
1126
|
-
|
1127
|
-
* **Everyone now points search engines to the `sitemap.xml.gz` file**. It doesn't matter whether your site has 10 links or a million links, just point to `sitemap.xml.gz`. If your site needs an index, that is the index. If it doesn't, then that's your sitemap. Simple.
|
1128
|
-
|
1129
|
-
* **It's easier to write custom namers** because the index and the sitemaps share the same namer instance (which is now a `SitemapGenerator::SimpleNamer` instance).
|
1130
|
-
|
1131
|
-
* **Groups share the new naming convention**. So the files in your `geo` group will be named `geo.xml.gz`, `geo1.xml.gz`, `geo2.xml.gz` etc. Pre-version 4 these files would have been named `geo1.xml.gz`, `geo2.xml.gz`, `geo3.xml.gz` etc.
|
1132
|
-
|
1133
|
-
#### I don't want it! How can I keep everything as it was?
|
1134
|
-
|
1135
|
-
You don't care, you just want to get on with your day. To resort to pre-version 4 behaviour add the following to your sitemap config:
|
1136
|
-
|
1137
|
-
```ruby
|
1138
|
-
SitemapGenerator::Sitemap.create_index = true
|
1139
|
-
SitemapGenerator::Sitemap.namer = SitemapGenerator::SimpleNamer.new(:sitemap, :zero => '_index')
|
1140
|
-
```
|
1141
|
-
|
1142
|
-
This tells SitemapGenerator to always create an index file and to name it `sitemap_index.xml.gz`. If you are already using custom namers, you don't need to set `namer`; your old namers should still work as before. If you are using named groups, setting the sitemap namer in this way won't affect your groups, which will still be using the new naming scheme. If this is an issue for you, you may have to create namers for your groups.
|
1143
|
-
|
1144
|
-
#### I want it! What do I need to do?
|
1145
|
-
|
1146
|
-
1. Update your `robots.txt` file and make sure it points to `sitemap.xml.gz`.
|
1147
|
-
2. Generate your sitemaps to create the new `sitemap.xml.gz` file.
|
1148
|
-
3. Optionally remove the old `sitemap_index.xml.gz` file (or link it to the new file if you want to make sure that search engines can find it while you update them.)
|
1149
|
-
4. Go to your Google Webmaster tools and other places where you've pointed search engines to your sitemaps and point them to your new `sitemap.xml.gz` file.
|
1150
|
-
|
1151
|
-
That's it! Welcome to the future!
|
1152
|
-
|
1153
|
-
## Thanks (in no particular order)
|
1140
|
+
Compatible with all versions of Rails and Ruby.
|
1141
|
+
Ruby 1.9.3 support was dropped in Version 6.0.0 of this gem.
|
1154
1142
|
|
1155
|
-
|
1143
|
+
## Licence
|
1156
1144
|
|
1157
|
-
|
1145
|
+
Released under the MIT License. See the [MIT-LICENSE](MIT-LICENSE) file.
|
1158
1146
|
|
1159
|
-
|
1160
|
-
* [Rodrigo Flores](https://github.com/rodrigoflores) for News sitemaps
|
1161
|
-
* [Alex Soto](http://github.com/apsoto) for Video sitemaps
|
1162
|
-
* [Alexadre Bini](http://github.com/alexandrebini) for Image sitemaps
|
1163
|
-
* [Dan Pickett](http://github.com/dpickett)
|
1164
|
-
* [Rob Biedenharn](http://github.com/rab)
|
1165
|
-
* [Richie Vos](http://github.com/jerryvos)
|
1166
|
-
* [Adrian Mugnolo](http://github.com/xymbol)
|
1167
|
-
* [Jason Weathered](http://github.com/jasoncodes)
|
1168
|
-
* [Andy Stewart](http://github.com/airblade)
|
1169
|
-
* [Brian Armstrong](https://github.com/barmstrong) for Geo sitemaps
|
1147
|
+
MIT. See the LICENSE.md file.
|
1170
1148
|
|
1171
|
-
Copyright (c)
|
1149
|
+
Copyright (c) Karl Varga released under the MIT license
|
1172
1150
|
|
1173
1151
|
[canonical_repo]:http://github.com/kjvarga/sitemap_generator
|
1174
1152
|
[enterprise_class]:https://twitter.com/dhh/status/1631034662 "I use enterprise in the same sense the Phusion guys do - i.e. Enterprise Ruby. Please don't look down on my use of the word 'enterprise' to represent being a cut above. It doesn't mean you ever have to work for a company the size of IBM. Or constantly fight inertia, writing crappy software, adhering to change management practices and spending hours in meetings... Not that there's anything wrong with that - Wait, what?"
|
@@ -1192,3 +1170,6 @@ Copyright (c) 2009 Karl Varga released under the MIT license
|
|
1192
1170
|
[iso_4217]:http://en.wikipedia.org/wiki/ISO_4217
|
1193
1171
|
[media]:https://developers.google.com/webmasters/smartphone-sites/details
|
1194
1172
|
[expires]:https://support.google.com/customsearch/answer/2631051?hl=en
|
1173
|
+
[google_cloud_storage_gem]:https://rubygems.org/gems/google-cloud-storage
|
1174
|
+
[google_cloud_storage_authentication]:https://googleapis.dev/ruby/google-cloud-storage/latest/file.AUTHENTICATION.html
|
1175
|
+
[google_cloud_storage_initializer]:https://github.com/googleapis/google-cloud-ruby/blob/master/google-cloud-storage/lib/google/cloud/storage.rb
|