sitemap_generator 4.3.1 → 5.0.0.beta
- data/Gemfile.lock +1 -1
- data/README.md +37 -21
- data/VERSION +1 -1
- data/lib/sitemap_generator/adapters/file_adapter.rb +23 -1
- data/lib/sitemap_generator/builder/sitemap_file.rb +1 -10
- data/lib/sitemap_generator/builder/sitemap_index_file.rb +0 -8
- data/lib/sitemap_generator/link_set.rb +50 -66
- data/lib/sitemap_generator/sitemap_location.rb +62 -13
- data/lib/sitemap_generator/sitemap_namer.rb +10 -66
- data/spec/files/sitemap.groups.rb +3 -3
- data/spec/sitemap_generator/builder/sitemap_file_spec.rb +1 -1
- data/spec/sitemap_generator/builder/sitemap_url_spec.rb +2 -2
- data/spec/sitemap_generator/file_adaptor_spec.rb +20 -0
- data/spec/sitemap_generator/link_set_spec.rb +69 -59
- data/spec/sitemap_generator/sitemap_generator_spec.rb +93 -37
- data/spec/sitemap_generator/sitemap_location_spec.rb +59 -17
- data/spec/sitemap_generator/sitemap_namer_spec.rb +0 -70
- metadata +15 -13
- data/spec/files/sitemap.deprecated.rb +0 -15
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -12,7 +12,7 @@ Sitemaps adhere to the [Sitemap 0.9 protocol][sitemap_protocol] specification.
|
|
12
12
|
* Compatible with Rails 2, 3 & 4 and tested with Ruby REE, 1.9.2 & 1.9.3
|
13
13
|
* Adheres to the [Sitemap 0.9 protocol][sitemap_protocol]
|
14
14
|
* Handles millions of links
|
15
|
-
*
|
15
|
+
* Customizable sitemap compression
|
16
16
|
* Notifies search engines (Google, Bing) of new sitemaps
|
17
17
|
* Ensures your old sitemaps stay in place if the new sitemap fails to generate
|
18
18
|
* Gives you complete control over your sitemap contents and naming scheme
|
@@ -66,11 +66,24 @@ Does your website use SitemapGenerator to generate Sitemaps? Where would you be
|
|
66
66
|
|
67
67
|
<a href='http://www.pledgie.com/campaigns/15267'><img alt='Click here to lend your support to: SitemapGenerator and make a donation at www.pledgie.com !' src='http://pledgie.com/campaigns/15267.png?skin_name=chrome' border='0' /></a>
|
68
68
|
|
69
|
-
##
|
69
|
+
## Deprecation Notices and Non-Backwards Compatible Changes
|
70
|
+
|
71
|
+
### Version 5.0.0
|
72
|
+
|
73
|
+
In version 5.0.0 I've removed a few methods that have been deprecated for a long time, because they would have made some new features more difficult and complex to implement. I never actually output deprecation notices from these methods, so I understand if you're a little annoyed that your config has suddenly broken. Apologies.
|
74
|
+
|
75
|
+
Here's a list of the methods that have been removed:
|
76
|
+
* Removed options to `LinkSet::add()`: `:sitemaps_namer` and `:sitemap_index_namer` (use `:namer` option)
|
77
|
+
* Removed `LinkSet::sitemaps_namer=`, `LinkSet::sitemaps_namer` (use `LinkSet::namer=` and `LinkSet::namer`)
|
78
|
+
* Removed `LinkSet::sitemap_index_namer=`, `LinkSet::sitemap_index_namer` (use `LinkSet::namer=` and `LinkSet::namer`)
|
79
|
+
* Removed the `SitemapGenerator::SitemapNamer` class (use `SitemapGenerator::SimpleNamer`)
|
80
|
+
* Removed `LinkSet::add_links()` (use `LinkSet::create()`)
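
As a rough illustration of the migration, a 4.x config that relied on the removed methods could be updated along these lines. This is only a sketch: the host, the `/home` link and the `:sitemap` base name are placeholders, and the commented-out lines show the removed calls as they might have appeared in an older config.

```ruby
require 'sitemap_generator'

SitemapGenerator::Sitemap.default_host = 'http://example.com'

# Removed in 5.0.0:
#   SitemapGenerator::Sitemap.sitemaps_namer = SitemapGenerator::SitemapNamer.new(:sitemap)
# Replacement: a single SimpleNamer used for both the index and the sitemap files.
SitemapGenerator::Sitemap.namer = SitemapGenerator::SimpleNamer.new(:sitemap)

# Removed in 5.0.0:
#   SitemapGenerator::Sitemap.add_links { |sitemap| sitemap.add '/home' }
# Replacement: `create`, with the links added inside the block.
SitemapGenerator::Sitemap.create do
  add '/home'
end
```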
|
81
|
+
|
82
|
+
### Version 4.0.0
|
70
83
|
|
71
84
|
Version 4.0 introduces a new **non-backwards compatible** naming scheme. **If you are running version 3 or earlier and you upgrade to version 4, you need to make a couple small changes to ensure that search engines can still find your sitemaps!** Your sitemaps will still work fine, but the name of the index file has changed.
|
72
85
|
|
73
|
-
|
86
|
+
#### So what has changed?
|
74
87
|
|
75
88
|
* **The index is generated intelligently**. SitemapGenerator now detects whether you need an index or not, and only generates one if you need it or have requested it. So small sites (less than 50,000 links) won't have one, large sites will. You don't have to worry about anything. And with the `create_index` option, it's easier than ever to control index creation to suit your needs.
|
76
89
|
|
@@ -82,7 +95,7 @@ Version 4.0 introduces a new **non-backwards compatible** naming scheme. **If y
|
|
82
95
|
|
83
96
|
* **Groups share the new naming convention**. So the files in your `geo` group will be named `geo.xml.gz`, `geo1.xml.gz`, `geo2.xml.gz` etc. Pre-version 4 these files would have been named `geo1.xml.gz`, `geo2.xml.gz`, `geo3.xml.gz` etc.
|
84
97
|
|
85
|
-
|
98
|
+
#### I don't want it! How can I keep everything as it was?
|
86
99
|
|
87
100
|
You don't care, you just want to get on with your day. To revert to pre-version 4 behaviour add the following to your sitemap config:
|
88
101
|
|
@@ -93,7 +106,7 @@ SitemapGenerator::Sitemap.namer = SitemapGenerator::SimpleNamer.new(:sitemap, :z
|
|
93
106
|
|
94
107
|
This tells SitemapGenerator to always create an index file and to name it `sitemap_index.xml.gz`. If you are already using custom namers, you don't need to set `namer`; your old namers should still work as before. If you are using named groups, setting the sitemap namer in this way won't affect your groups, which will still be using the new naming scheme. If this is an issue for you, you may have to create namers for your groups.
|
95
108
|
|
96
|
-
|
109
|
+
#### I want it! What do I need to do?
|
97
110
|
|
98
111
|
1. Update your `robots.txt` file and make sure it points to `sitemap.xml.gz`.
|
99
112
|
2. Generate your sitemaps to create the new `sitemap.xml.gz` file.
|
@@ -104,6 +117,7 @@ That's it! Welcome to the future!
|
|
104
117
|
|
105
118
|
## Changelog
|
106
119
|
|
120
|
+
* v5.0.0: Support new `:compress` option for customizing which files get compressed. Remove old deprecated methods (see deprecation notices above).
|
107
121
|
* v4.3.1: Support integer timestamps. Update README for new features added in last release.
|
108
122
|
* v4.3.0: Support `media` attribute on alternate links ([#125](https://github.com/kjvarga/sitemap_generator/issues/125)). Changed `SitemapGenerator::S3Adapter` to write files in a single operation, avoiding potential permissions errors when listing a directory prior to writing ([#130](https://github.com/kjvarga/sitemap_generator/issues/130)). Remove Sitemap Writer from ping task ([#129](https://github.com/kjvarga/sitemap_generator/issues/129)). Support `url:expires` element ([#126](https://github.com/kjvarga/sitemap_generator/issues/126)).
|
109
123
|
* v4.2.0: Update Google ping URL. Quote the ping URL in the output. Support Video `video:price` element ([#117](https://github.com/kjvarga/sitemap_generator/issues/117)). Support symbols as well as strings for most arguments to `add()` ([#113](https://github.com/kjvarga/sitemap_generator/issues/113)). Ensure that `public_path` and `sitemaps_path` end with a slash (`/`) ([#118](https://github.com/kjvarga/sitemap_generator/issues/118)).
|
@@ -739,36 +753,38 @@ The options passed to `group` only apply to the links and sitemaps generated in
|
|
739
753
|
|
740
754
|
### Sitemap Options
|
741
755
|
|
742
|
-
The following options are supported
|
756
|
+
The following options are supported.
|
743
757
|
|
744
|
-
*
|
758
|
+
* `:create_index` - Supported values: `true`, `false`, `:auto`. Default: `:auto`. Whether to create a sitemap index file. If `true` an index file is always created regardless of how many sitemap files are generated. If `false` an index file is never created. If `:auto` an index file is created only when you have more than one sitemap file (i.e. you have added more than 50,000 - `SitemapGenerator::MAX_SITEMAP_LINKS` - links).
|
745
759
|
|
746
|
-
*
|
760
|
+
* `:default_host` - String. Required. **Host including protocol** to use when building a link to add to your sitemap. For example `http://example.com`. Calling `add '/home'` would then generate the URL `http://example.com/home` and add that to the sitemap. You can pass a `:host` option in your call to `add` to override this value on a per-link basis. For example calling `add '/home', :host => 'https://example.com'` would generate the URL `https://example.com/home`, for that link only.
|
747
761
|
|
748
|
-
*
|
762
|
+
* `:filename` - Symbol. The **base name for the files** that will be generated. The default value is `:sitemap`. This yields files with names like `sitemap.xml.gz`, `sitemap1.xml.gz`, `sitemap2.xml.gz`, `sitemap3.xml.gz` etc. If we now set the value to `:geo` the files would be named `geo.xml.gz`, `geo1.xml.gz`, `geo2.xml.gz`, `geo3.xml.gz` etc.
|
749
763
|
|
750
|
-
*
|
764
|
+
* `:include_index` - Boolean. Whether to **add a link pointing to the sitemap index** to the current sitemap. This points search engines to your Sitemap Index to include it in the indexing of your site. 2012-07: This is now turned off by default because Google may complain about there being 'Nested Sitemap indexes'. Default is `false`. Turned off when `sitemaps_host` is set or within a `group()` block.
|
751
765
|
|
752
|
-
*
|
766
|
+
* `:include_root` - Boolean. Whether to **add the root** url i.e. '/' to the current sitemap. Default is `true`. Turned off within a `group()` block.
|
753
767
|
|
754
|
-
*
|
768
|
+
* `:public_path` - String. A **full or relative path** to the `public` directory or the directory you want to write sitemaps into. Defaults to `public/` under your application root or relative to the current working directory.
|
755
769
|
|
756
|
-
*
|
770
|
+
* `:sitemaps_host` - String. **Host including protocol** to use when generating a link to a sitemap file i.e. the hostname of the server where the sitemaps are hosted. The value will differ from the hostname in your sitemap links. For example: `'http://amazon.aws.com/'`. Note that `include_index` is
|
757
771
|
automatically turned off when the `sitemaps_host` does not match `default_host`.
|
758
772
|
This is because the link to the sitemap index file that would otherwise be added would point to a different host than the rest of the links in the sitemap, which the sitemap rules forbid.
|
759
773
|
|
760
|
-
*
|
774
|
+
* `:namer` - A `SitemapGenerator::SimpleNamer` instance **for generating sitemap names**. You can read about Sitemap Namers by reading the API docs. Allows you to set the name, extension and number sequence for sitemap files, as well as modify the name of the first file in the sequence, which is often the index file. A simple example if we want to generate files like 'newname.xml.gz', 'newname1.xml.gz', etc is `SitemapGenerator::SimpleNamer.new(:newname)`.
|
775
|
+
|
776
|
+
* `:sitemaps_path` - String. A **relative path** giving a directory under your `public_path` at which to write sitemaps. The difference between the two options is that the `sitemaps_path` is used when generating a link to a sitemap file. For example, if we set `SitemapGenerator::Sitemap.sitemaps_path = 'en/'` and use the default `public_path` sitemaps will be written to `public/en/`. The URL to the sitemap index would then be `http://example.com/en/sitemap.xml.gz`.
|
761
777
|
|
762
|
-
* `
|
778
|
+
* `:verbose` - Boolean. Whether to **output a sitemap summary** describing the sitemap files and giving statistics about your sitemap. Default is `false`. When using the Rake tasks `verbose` will be `true` unless you pass the `-s` option.
|
763
779
|
|
764
|
-
* `
|
780
|
+
* `:adapter` - Instance. The default adapter is a `SitemapGenerator::FileAdapter` which simply writes files to the filesystem. You can use a `SitemapGenerator::WaveAdapter` for uploading sitemaps to remote servers - useful for read-only hosts such as Heroku. Or you can provide an instance of your own class to provide custom behavior. Your class must define a write method which takes a `SitemapGenerator::Location` and raw XML data.
|
765
781
|
|
766
|
-
* `
|
767
|
-
|
768
|
-
|
769
|
-
|
770
|
-
define a write method which takes a `SitemapGenerator::Location` and raw XML data.
|
782
|
+
* `:compress` - Specifies which files to compress with gzip. Default is `true`. Accepted values:
|
783
|
+
* `true` - Boolean; compress all files.
|
784
|
+
* `false` - Boolean; Do not compress any files.
|
785
|
+
* `:all_but_first` - Symbol; leave the first file uncompressed but compress all remaining files.
|
771
786
|
|
787
|
+
The compression setting applies to groups too. So `:all_but_first` will have the same effect (the first file in the group will not be compressed, the rest will). So if you require different behaviour for your groups, pass in a `:compress` option e.g. `group(:compress => false) { add('/link') }`
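
To make the three `:compress` values concrete, here is a small standalone sketch of their effect on file names. It does not use the gem: `compressed_name` and its arguments are hypothetical names chosen for illustration, and it assumes the default `sitemap.xml.gz`, `sitemap1.xml.gz`, ... naming sequence.

```ruby
# Hypothetical helper (not part of sitemap_generator): returns the file name
# a sitemap would get under each :compress setting.
#   base     - base name, e.g. :sitemap
#   index    - 0 for the first file in the sequence, 1 for the next, ...
#   compress - true, false or :all_but_first
def compressed_name(base, index, compress)
  first_file = index.zero?
  suffix = first_file ? '' : index.to_s
  name = "#{base}#{suffix}.xml.gz"
  if compress == false || (compress == :all_but_first && first_file)
    name.sub(/\.gz\z/, '') # no .gz extension, so the file stays uncompressed
  else
    name
  end
end

[true, false, :all_but_first].each do |setting|
  names = (0..2).map { |i| compressed_name(:sitemap, i, setting) }
  puts "#{setting.inspect}: #{names.join(', ')}"
end
```

A group that passes its own `:compress` option would simply call the helper with that value instead of the inherited one.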
|
772
788
|
|
773
789
|
## Sitemap Groups
|
774
790
|
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
|
1
|
+
5.0.0.beta
|
data/lib/sitemap_generator/adapters/file_adapter.rb
CHANGED
@@ -1,5 +1,14 @@
|
|
1
1
|
module SitemapGenerator
|
2
|
+
# Class for writing out data to a file.
|
2
3
|
class FileAdapter
|
4
|
+
|
5
|
+
# Write data to a file.
|
6
|
+
# @param location - File object giving the full path and file name of the file.
|
7
|
+
# If the location specifies a directory(ies) which does not exist, the directory(ies)
|
8
|
+
# will be created for you. If the location path ends with `.gz` the data will be
|
9
|
+
# compressed prior to being written out. Otherwise the data will be written out
|
10
|
+
# unchanged.
|
11
|
+
# @param raw_data - data to write to the file.
|
3
12
|
def write(location, raw_data)
|
4
13
|
# Ensure that the directory exists
|
5
14
|
dir = location.directory
|
@@ -9,13 +18,26 @@ module SitemapGenerator
|
|
9
18
|
raise SitemapError.new("#{dir} should be a directory!")
|
10
19
|
end
|
11
20
|
|
12
|
-
|
21
|
+
stream = open(location.path, 'wb')
|
22
|
+
if location.path.to_s =~ /\.gz$/
|
23
|
+
gzip(stream, raw_data)
|
24
|
+
else
|
25
|
+
plain(stream, raw_data)
|
26
|
+
end
|
13
27
|
end
|
14
28
|
|
29
|
+
# Write `data` to a stream, passing the data through a GzipWriter
|
30
|
+
# to compress it.
|
15
31
|
def gzip(stream, data)
|
16
32
|
gz = Zlib::GzipWriter.new(stream)
|
17
33
|
gz.write data
|
18
34
|
gz.close
|
19
35
|
end
|
36
|
+
|
37
|
+
# Write `data` to a stream as is.
|
38
|
+
def plain(stream, data)
|
39
|
+
stream.write data
|
40
|
+
stream.close
|
41
|
+
end
|
20
42
|
end
|
21
43
|
end
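
The branch on the `.gz` extension can be tried in isolation. The following is a minimal standalone sketch, not the gem's class: `write_sitemap` is a hypothetical function, it takes a plain path string rather than a location object, and it escapes the dot in the pattern so that only a literal `.gz` suffix triggers compression.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical stand-in for the adapter's write step: gzip the data when the
# path ends in ".gz", otherwise write it unchanged.
def write_sitemap(path, raw_data)
  File.open(path, 'wb') do |stream|
    if path.to_s =~ /\.gz\z/
      gz = Zlib::GzipWriter.new(stream)
      gz.write(raw_data)
      gz.finish # flush the gzip trailer; File.open closes the stream itself
    else
      stream.write(raw_data)
    end
  end
end

Dir.mktmpdir do |dir|
  xml = '<urlset></urlset>'
  write_sitemap(File.join(dir, 'sitemap.xml'), xml)
  write_sitemap(File.join(dir, 'sitemap1.xml.gz'), xml)
  puts File.read(File.join(dir, 'sitemap.xml'))
  puts Zlib::GzipReader.open(File.join(dir, 'sitemap1.xml.gz')) { |gz| gz.read }
end
```

Using the block form of `File.open` means the stream is closed even if writing raises, which the version in the diff does not guarantee.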
|
data/lib/sitemap_generator/builder/sitemap_file.rb
CHANGED
@@ -31,7 +31,7 @@ module SitemapGenerator
|
|
31
31
|
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
|
32
32
|
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
|
33
33
|
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
|
34
|
-
xmlns:image="#{SitemapGenerator::SCHEMAS['image']}"
|
34
|
+
xmlns:image="#{SitemapGenerator::SCHEMAS['image']}"
|
35
35
|
xmlns:video="#{SitemapGenerator::SCHEMAS['video']}"
|
36
36
|
xmlns:geo="#{SitemapGenerator::SCHEMAS['geo']}"
|
37
37
|
xmlns:news="#{SitemapGenerator::SCHEMAS['news']}"
|
@@ -138,7 +138,6 @@ module SitemapGenerator
|
|
138
138
|
reserve_name
|
139
139
|
@location.write(@xml_wrapper_start + @xml_content + @xml_wrapper_end)
|
140
140
|
@xml_content = @xml_wrapper_start = @xml_wrapper_end = ''
|
141
|
-
puts summary if @location.verbose?
|
142
141
|
@written = true
|
143
142
|
end
|
144
143
|
|
@@ -166,14 +165,6 @@ module SitemapGenerator
|
|
166
165
|
self.class.new(location)
|
167
166
|
end
|
168
167
|
|
169
|
-
# Return a summary string
|
170
|
-
def summary(opts={})
|
171
|
-
uncompressed_size = number_to_human_size(@filesize)
|
172
|
-
compressed_size = number_to_human_size(@location.filesize)
|
173
|
-
path = ellipsis(@location.path_in_public, 47)
|
174
|
-
"+ #{'%-47s' % path} #{'%10s' % @link_count} links / #{'%10s' % compressed_size}"
|
175
|
-
end
|
176
|
-
|
177
168
|
protected
|
178
169
|
|
179
170
|
# Replace the last 3 characters of string with ... if the string is as big
|
data/lib/sitemap_generator/builder/sitemap_index_file.rb
CHANGED
@@ -93,14 +93,6 @@ module SitemapGenerator
|
|
93
93
|
@sitemaps_link_count
|
94
94
|
end
|
95
95
|
|
96
|
-
# Return a summary string
|
97
|
-
def summary(opts={})
|
98
|
-
uncompressed_size = number_to_human_size(@filesize)
|
99
|
-
compressed_size = number_to_human_size(@location.filesize)
|
100
|
-
path = ellipsis(@location.path_in_public, 44) # 47 - 3
|
101
|
-
"+ #{'%-44s' % path} #{'%10s' % @link_count} sitemaps / #{'%10s' % compressed_size}"
|
102
|
-
end
|
103
|
-
|
104
96
|
def stats_summary(opts={})
|
105
97
|
str = "Sitemap stats: #{number_with_delimiter(@sitemaps_link_count)} links / #{@link_count} sitemaps"
|
106
98
|
str += " / %dm%02ds" % opts[:time_taken].divmod(60) if opts[:time_taken]
|
data/lib/sitemap_generator/link_set.rb
CHANGED
@@ -4,8 +4,8 @@ require 'builder'
|
|
4
4
|
# which lists all the sitemap files written.
|
5
5
|
module SitemapGenerator
|
6
6
|
class LinkSet
|
7
|
-
@@requires_finalization_opts = [:filename, :sitemaps_path, :
|
8
|
-
@@new_location_opts = [:filename, :sitemaps_path, :
|
7
|
+
@@requires_finalization_opts = [:filename, :sitemaps_path, :sitemaps_host, :namer]
|
8
|
+
@@new_location_opts = [:filename, :sitemaps_path, :namer]
|
9
9
|
|
10
10
|
attr_reader :default_host, :sitemaps_path, :filename, :create_index
|
11
11
|
attr_accessor :verbose, :yahoo_app_id, :include_root, :include_index, :sitemaps_host, :adapter, :yield_sitemap
|
@@ -43,14 +43,6 @@ module SitemapGenerator
|
|
43
43
|
self
|
44
44
|
end
|
45
45
|
|
46
|
-
# Dreprecated. Use create.
|
47
|
-
def add_links(&block)
|
48
|
-
original_value = @yield_sitemap
|
49
|
-
@yield_sitemap = true
|
50
|
-
create(&block)
|
51
|
-
@yield_sitemap = original_value
|
52
|
-
end
|
53
|
-
|
54
46
|
# Constructor
|
55
47
|
#
|
56
48
|
# == Options:
|
@@ -110,10 +102,14 @@ module SitemapGenerator
|
|
110
102
|
# and index file names. See <tt>:filename</tt> if you don't need to do anything fancy, and can
|
111
103
|
# accept the default naming conventions.
|
112
104
|
#
|
113
|
-
#
|
105
|
+
# * <tt>:compress</tt> - Specifies which files to compress with gzip. Default is `true`. Accepted values:
|
106
|
+
# * `true` - Boolean; compress all files.
|
107
|
+
# * `false` - Boolean; write out only uncompressed files.
|
108
|
+
# * `:all_but_first` - Symbol; leave the first file uncompressed but compress any remaining files.
|
114
109
|
#
|
115
|
-
#
|
116
|
-
#
|
110
|
+
# The compression setting applies to groups too. So :all_but_first will have the same effect (the first
|
111
|
+
# file in the group will not be compressed, the rest will). So if you require different behaviour for your
|
112
|
+
# groups, pass in a `:compress` option e.g. <tt>group(:compress => false) { add('/link') }</tt>
|
117
113
|
#
|
118
114
|
# KJV: When adding a new option be sure to include it in `options_for_group()` if
|
119
115
|
# the option should be inherited by groups.
|
@@ -126,7 +122,8 @@ module SitemapGenerator
|
|
126
122
|
:google => "http://www.google.com/webmasters/tools/ping?sitemap=%s",
|
127
123
|
:bing => "http://www.bing.com/webmaster/ping.aspx?siteMap=%s"
|
128
124
|
},
|
129
|
-
:create_index => :auto
|
125
|
+
:create_index => :auto,
|
126
|
+
:compress => true
|
130
127
|
)
|
131
128
|
options.each_pair { |k, v| instance_variable_set("@#{k}".to_sym, v) }
|
132
129
|
|
@@ -378,7 +375,7 @@ module SitemapGenerator
|
|
378
375
|
# doesn't override the latter.
|
379
376
|
def set_options(opts={})
|
380
377
|
opts = opts.dup
|
381
|
-
%w(filename namer
|
378
|
+
%w(filename namer).each do |key|
|
382
379
|
if value = opts.delete(key.to_sym)
|
383
380
|
send("#{key}=", value)
|
384
381
|
end
|
@@ -412,9 +409,10 @@ module SitemapGenerator
|
|
412
409
|
:verbose,
|
413
410
|
:default_host,
|
414
411
|
:adapter,
|
415
|
-
:create_index
|
412
|
+
:create_index,
|
413
|
+
:compress
|
416
414
|
].inject({}) do |hash, key|
|
417
|
-
if value = instance_variable_get(:"@#{key}")
|
415
|
+
if !(value = instance_variable_get(:"@#{key}")).nil?
|
418
416
|
hash[key] = value
|
419
417
|
end
|
420
418
|
hash
|
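
The switch from a truthiness test to an explicit `nil?` check matters now that `false` is a meaningful value for `:compress`: with the old test, a group would never inherit `:compress => false`. A standalone sketch of the difference, using a hypothetical `Settings` class rather than `LinkSet`:

```ruby
# Hypothetical class for illustration only; it mirrors the inject pattern
# used when collecting options to pass on to a group.
class Settings
  def initialize(opts)
    opts.each_pair { |k, v| instance_variable_set("@#{k}", v) }
  end

  # Old behaviour: falsy values (false as well as nil) are skipped.
  def inherited_truthy(keys)
    keys.inject({}) do |hash, key|
      if (value = instance_variable_get(:"@#{key}"))
        hash[key] = value
      end
      hash
    end
  end

  # New behaviour: only unset (nil) values are skipped.
  def inherited_non_nil(keys)
    keys.inject({}) do |hash, key|
      unless (value = instance_variable_get(:"@#{key}")).nil?
        hash[key] = value
      end
      hash
    end
  end
end

settings = Settings.new(:compress => false, :verbose => true)
p settings.inherited_truthy([:compress, :verbose, :adapter])
p settings.inherited_non_nil([:compress, :verbose, :adapter])
```

Only the second method keeps the `:compress` key when its value is `false`.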
@@ -475,7 +473,6 @@ module SitemapGenerator
|
|
475
473
|
@sitemap_index = nil if @sitemap_index && @sitemap_index.finalized? && !@protect_index
|
476
474
|
@sitemap = nil if @sitemap && @sitemap.finalized?
|
477
475
|
self.namer.reset
|
478
|
-
self.sitemaps_namer.reset if self.sitemaps_namer
|
479
476
|
@added_default_links = false
|
480
477
|
end
|
481
478
|
|
@@ -507,7 +504,7 @@ module SitemapGenerator
|
|
507
504
|
def public_path=(value)
|
508
505
|
@public_path = Pathname.new(SitemapGenerator::Utilities.append_slash(value))
|
509
506
|
if @public_path.relative?
|
510
|
-
@public_path = SitemapGenerator.app.root + @public_path
|
507
|
+
@public_path = SitemapGenerator.app.root + @public_path
|
511
508
|
end
|
512
509
|
update_location_info(:public_path, @public_path)
|
513
510
|
@public_path
|
@@ -567,11 +564,12 @@ module SitemapGenerator
|
|
567
564
|
def sitemap_location
|
568
565
|
SitemapGenerator::SitemapLocation.new(
|
569
566
|
:host => sitemaps_host,
|
570
|
-
:namer =>
|
567
|
+
:namer => namer,
|
571
568
|
:public_path => public_path,
|
572
569
|
:sitemaps_path => @sitemaps_path,
|
573
570
|
:adapter => @adapter,
|
574
|
-
:verbose => verbose
|
571
|
+
:verbose => verbose,
|
572
|
+
:compress => @compress
|
575
573
|
)
|
576
574
|
end
|
577
575
|
|
@@ -579,17 +577,24 @@ module SitemapGenerator
|
|
579
577
|
def sitemap_index_location
|
580
578
|
SitemapGenerator::SitemapLocation.new(
|
581
579
|
:host => sitemaps_host,
|
582
|
-
:namer =>
|
580
|
+
:namer => namer,
|
583
581
|
:public_path => public_path,
|
584
582
|
:sitemaps_path => @sitemaps_path,
|
585
583
|
:adapter => @adapter,
|
586
584
|
:verbose => verbose,
|
587
|
-
:create_index => @create_index
|
585
|
+
:create_index => @create_index,
|
586
|
+
:compress => @compress
|
588
587
|
)
|
589
588
|
end
|
590
589
|
|
591
590
|
# Set the value of +create_index+ on the SitemapIndexLocation object of the
|
592
591
|
# SitemapIndexFile.
|
592
|
+
#
|
593
|
+
# Whether to create a sitemap index file. Supported values: `true`, `false`, `:auto`.
|
594
|
+
# If `true` an index file is always created, regardless of how many links
|
595
|
+
# are in your sitemap. If `false` an index file is never created.
|
596
|
+
# If `:auto` an index file is created only if your sitemap has more than
|
597
|
+
# one sitemap file.
|
593
598
|
def create_index=(value, force=false)
|
594
599
|
@create_index = value
|
595
600
|
# Allow overriding the protected status of the index when we are creating a group.
|
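
The three `create_index` values documented in this hunk reduce to a small decision. A standalone sketch, where `write_index?` is a hypothetical helper and not the gem's own method:

```ruby
# Hypothetical helper: decide whether an index file should be written.
#   setting       - true, false or :auto
#   sitemap_count - number of sitemap files generated
def write_index?(setting, sitemap_count)
  case setting
  when true  then true
  when false then false
  when :auto then sitemap_count > 1
  else raise ArgumentError, "unsupported create_index value: #{setting.inspect}"
  end
end

[[true, 1], [false, 3], [:auto, 1], [:auto, 2]].each do |setting, count|
  puts "create_index=#{setting.inspect}, #{count} file(s): #{write_index?(setting, count)}"
end
```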
@@ -613,6 +618,28 @@ module SitemapGenerator
|
|
613
618
|
@namer ||= @sitemap && @sitemap.location.namer || SitemapGenerator::SimpleNamer.new(@filename)
|
614
619
|
end
|
615
620
|
|
621
|
+
# Set the value of the compress setting.
|
622
|
+
#
|
623
|
+
# Values:
|
624
|
+
# * `true` - Boolean; compress all files
|
625
|
+
# * `false` - Boolean; write out only uncompressed files
|
626
|
+
# * `:all_but_first` - Symbol; leave the first file uncompressed but compress any remaining files.
|
627
|
+
#
|
628
|
+
# The compression setting applies to groups too. So :all_but_first will have the same effect (the first
|
629
|
+
# file in the group will not be compressed, the rest will). So if you require different behaviour for your
|
630
|
+
# groups, pass in a `:compress` option e.g. <tt>group(:compress => false) { add('/link') }</tt>
|
631
|
+
def compress=(value)
|
632
|
+
@compress = value
|
633
|
+
@sitemap_index.location[:compress] = @compress if @sitemap_index
|
634
|
+
@sitemap.location[:compress] = @compress if @sitemap
|
635
|
+
end
|
636
|
+
|
637
|
+
# Return the current compression setting. Its value determines which files will be gzip'ed.
|
638
|
+
# See the setter for documentation of its values.
|
639
|
+
def compress
|
640
|
+
@compress
|
641
|
+
end
|
642
|
+
|
616
643
|
protected
|
617
644
|
|
618
645
|
# Update the given attribute on the current sitemap index and sitemap file location objects.
|
@@ -624,48 +651,5 @@ module SitemapGenerator
|
|
624
651
|
end
|
625
652
|
end
|
626
653
|
include LocationHelpers
|
627
|
-
|
628
|
-
module Deprecated
|
629
|
-
# *Deprecated*
|
630
|
-
#
|
631
|
-
# Set the namer to use when generating SitemapFiles (does not apply to the
|
632
|
-
# SitemapIndexFile)
|
633
|
-
#
|
634
|
-
# As of version 4, use the <tt>namer<tt> option.
|
635
|
-
def sitemaps_namer=(value)
|
636
|
-
@sitemaps_namer = value
|
637
|
-
@sitemap.location[:namer] = value if @sitemap && !@sitemap.finalized?
|
638
|
-
end
|
639
|
-
|
640
|
-
# *Deprecated*
|
641
|
-
#
|
642
|
-
# Return the current sitemaps namer object. If it not set, looks for it on
|
643
|
-
# the current sitemap and if there is no sitemap, creates a new one using
|
644
|
-
# the current filename.
|
645
|
-
#
|
646
|
-
# As of version 4, use the <tt>namer<tt> option.
|
647
|
-
def sitemaps_namer
|
648
|
-
@sitemaps_namer ||= @sitemap && @sitemap.location.namer
|
649
|
-
end
|
650
|
-
|
651
|
-
# *Deprecated*
|
652
|
-
#
|
653
|
-
# Set the namer to use when generating the index file.
|
654
|
-
# The namer should be a <tt>SitemapGenerator::SitemapIndexNamer</tt> instance.
|
655
|
-
#
|
656
|
-
# As of version 4, use the <tt>namer<tt> option.
|
657
|
-
def sitemap_index_namer=(value)
|
658
|
-
@sitemap_index_namer = value
|
659
|
-
@sitemap_index.location[:namer] = value if @sitemap_index && !@sitemap_index.finalized? && !@protect_index
|
660
|
-
end
|
661
|
-
|
662
|
-
# *Deprecated*
|
663
|
-
#
|
664
|
-
# As of version 4, use the <tt>namer<tt> option.
|
665
|
-
def sitemap_index_namer
|
666
|
-
@sitemap_index_namer ||= @sitemap_index && @sitemap_index.location.namer
|
667
|
-
end
|
668
|
-
end
|
669
|
-
include Deprecated
|
670
654
|
end
|
671
655
|
end
|
data/lib/sitemap_generator/sitemap_location.rb
CHANGED
@@ -1,5 +1,13 @@
|
|
1
|
+
require 'sitemap_generator/helpers/number_helper'
|
2
|
+
|
1
3
|
module SitemapGenerator
|
4
|
+
# A class for determining the exact location at which to write sitemap data.
|
5
|
+
# Handles reserving filenames from namers, constructing paths and sending
|
6
|
+
# data to the adapter to be written out.
|
2
7
|
class SitemapLocation < Hash
|
8
|
+
include SitemapGenerator::Helpers::NumberHelper
|
9
|
+
|
10
|
+
PATH_OUTPUT_WIDTH = 47 # Character width of the path in the summary lines
|
3
11
|
|
4
12
|
[:host, :adapter].each do |method|
|
5
13
|
define_method(method) do
|
@@ -18,25 +26,36 @@ module SitemapGenerator
|
|
18
26
|
# generates names like <tt>sitemap.xml.gz</tt>, <tt>sitemap1.xml.gz</tt>, <tt>sitemap2.xml.gz</tt> and so on.
|
19
27
|
#
|
20
28
|
# === Options
|
21
|
-
# * <tt
|
22
|
-
# * <tt
|
23
|
-
# * <tt
|
29
|
+
# * <tt>:adapter</tt> - SitemapGenerator::Adapter subclass
|
30
|
+
# * <tt>:filename</tt> - full name of the file e.g. <tt>'sitemap1.xml.gz'</tt>
|
31
|
+
# * <tt>:host</tt> - host name for URLs. The full URL to the file is then constructed from
|
24
32
|
# the <tt>host</tt>, <tt>sitemaps_path</tt> and <tt>filename</tt>
|
25
|
-
# * <tt
|
26
|
-
#
|
33
|
+
# * <tt>:namer</tt> - a SitemapGenerator::SimpleNamer instance for generating file names.
|
34
|
+
# Should be passed if no +filename+ is provided.
|
35
|
+
# * <tt>:public_path</tt> - path to the "public" directory, or the directory you want to
|
27
36
|
# write sitemaps in. Default is a directory <tt>public/</tt>
|
28
37
|
# in the current working directory, or relative to the Rails root
|
29
38
|
# directory if running under Rails.
|
30
|
-
# * <tt
|
39
|
+
# * <tt>:sitemaps_path</tt> - gives the path relative to the <tt>public_path</tt> in which to
|
31
40
|
# write sitemaps e.g. <tt>sitemaps/</tt>.
|
32
|
-
# * <tt
|
33
|
-
# * <tt
|
34
|
-
#
|
41
|
+
# * <tt>:verbose</tt> - whether to output summary info to STDOUT. Default +false+.
|
42
|
+
# * <tt>:create_index</tt> - whether to create a sitemap index. Default `:auto`. See <tt>LinkSet::create_index=</tt>
|
43
|
+
# for possible values. Only applies to the SitemapIndexLocation object.
|
44
|
+
# * <tt>:compress</tt> - The LinkSet compress setting. Default: true. If `false` any `.gz` extension is
|
45
|
+
# stripped from the filename. If `:all_but_first`, only the `.gz` extension of the first
|
46
|
+
# filename is stripped off. If `true` the extensions are left unchanged.
|
35
47
|
def initialize(opts={})
|
36
|
-
SitemapGenerator::Utilities.assert_valid_keys(opts, [:adapter, :public_path, :sitemaps_path, :host, :filename, :namer, :verbose, :create_index])
|
48
|
+
SitemapGenerator::Utilities.assert_valid_keys(opts, [:adapter, :public_path, :sitemaps_path, :host, :filename, :namer, :verbose, :create_index, :compress])
|
37
49
|
opts[:adapter] ||= SitemapGenerator::FileAdapter.new
|
38
50
|
opts[:public_path] ||= SitemapGenerator.app.root + 'public/'
|
39
|
-
|
51
|
+
# This is a bit of a hack to make the SimpleNamer act like the old SitemapNamer.
|
52
|
+
# It doesn't really make sense to create a default namer like this because the
|
53
|
+
# namer instance should be shared by the location objects of the sitemaps and
|
54
|
+
# sitemap index files. However, this greatly eases testing, so I'm leaving it in
|
55
|
+
# for now.
|
56
|
+
if !opts[:filename] && !opts[:namer]
|
57
|
+
opts[:namer] = SitemapGenerator::SimpleNamer.new(:sitemap, :start => 2, :zero => 1)
|
58
|
+
end
|
40
59
|
opts[:verbose] = !!opts[:verbose]
|
41
60
|
self.merge!(opts)
|
42
61
|
end
|
@@ -78,6 +97,16 @@ module SitemapGenerator
|
|
78
97
|
raise SitemapGenerator::SitemapError, "No filename or namer set" unless self[:filename] || self[:namer]
|
79
98
|
unless self[:filename]
|
80
99
|
self.send(:[]=, :filename, self[:namer].to_s, :super => true)
|
100
|
+
|
101
|
+
# Post-process the filename for our compression settings.
|
102
|
+
# Strip the `.gz` from the extension if we aren't compressing this file.
|
103
|
+
# If you're setting the filename manually, :all_but_first won't work as
|
104
|
+
# expected. Ultimately I should force using a namer in all circumstances.
|
105
|
+
# Changing the filename here will affect how the FileAdapter writes out the file.
|
106
|
+
if self[:compress] == false ||
|
107
|
+
(self[:namer] && self[:namer].start? && self[:compress] == :all_but_first)
|
108
|
+
self[:filename].gsub!(/\.gz$/, '')
|
109
|
+
end
|
81
110
|
end
|
82
111
|
self[:filename]
|
83
112
|
end
|
@@ -119,23 +148,43 @@ module SitemapGenerator
|
|
119
148
|
super(key, value)
|
120
149
|
end
|
121
150
|
|
151
|
+
# Write `data` out to a file.
|
152
|
+
# Output a summary line if verbose is true.
|
122
153
|
def write(data)
|
123
154
|
adapter.write(self, data)
|
155
|
+
puts summary if verbose?
|
156
|
+
end
|
157
|
+
|
158
|
+
# Return a summary string
|
159
|
+
def summary
|
160
|
+
filesize = number_to_human_size(self.filesize)
|
161
|
+
path = ellipsis(self.path_in_public, self.class::PATH_OUTPUT_WIDTH)
|
162
|
+
"+ #{('%-' + self.class::PATH_OUTPUT_WIDTH.to_s + 's') % path} #{'%10s' % @link_count} links / #{'%10s' % filesize}"
|
124
163
|
end
|
125
164
|
end
|
126
165
|
|
127
166
|
class SitemapIndexLocation < SitemapLocation
|
128
167
|
def initialize(opts={})
|
129
168
|
if !opts[:filename] && !opts[:namer]
|
130
|
-
opts[:namer] = SitemapGenerator::
|
169
|
+
opts[:namer] = SitemapGenerator::SimpleNamer.new(:sitemap)
|
131
170
|
end
|
132
171
|
super(opts)
|
133
172
|
end
|
134
173
|
|
135
|
-
#
|
174
|
+
# Whether to create a sitemap index. Default `:auto`. See <tt>LinkSet::create_index=</tt>
|
175
|
+
# for possible values.
|
176
|
+
#
|
177
|
+
# A placeholder for an option which should really go into some
|
136
178
|
# kind of options class.
|
137
179
|
def create_index
|
138
180
|
self[:create_index]
|
139
181
|
end
|
182
|
+
|
183
|
+
# Return a summary string
|
184
|
+
def summary
|
185
|
+
filesize = number_to_human_size(self.filesize)
|
186
|
+
path = ellipsis(self.path_in_public, self.class::PATH_OUTPUT_WIDTH - 3)
|
187
|
+
"+ #{('%-' + (self.class::PATH_OUTPUT_WIDTH - 3).to_s + 's') % path} #{'%10s' % @link_count} sitemaps / #{'%10s' % filesize}"
|
188
|
+
end
|
140
189
|
end
|
141
190
|
end
|
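
Building a `%`-style format string whose width comes from a constant needs the integer converted to a string (or interpolated), and on an instance the constant has to be looked up on the class rather than on `self`. A standalone sketch of a summary line in this style: `summary_line`, `truncate_path` and the sample values are hypothetical, and the truncation helper is a simplified stand-in for the gem's `ellipsis`.

```ruby
PATH_OUTPUT_WIDTH = 47 # same width as the constant in the diff above

# Simplified stand-in for `ellipsis`: truncate to `max` characters,
# ending with "..." when the string is too long.
def truncate_path(path, max)
  path.length > max ? path[0, max - 3] + '...' : path
end

# Left-justify the path in a fixed-width column, then right-justify the
# link count and the file size in 10-character columns.
def summary_line(path, link_count, filesize)
  shown = truncate_path(path, PATH_OUTPUT_WIDTH)
  format("+ %-#{PATH_OUTPUT_WIDTH}s %10s links / %10s", shown, link_count, filesize)
end

puts summary_line('sitemap.xml.gz', 12, '251 Bytes')
puts summary_line('a/very/long/path/' * 5 + 'sitemap1.xml.gz', 50_000, '1.2 MB')
```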