dynamic_sitemaps 2.0.0.beta → 2.0.0.beta2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 5df17dde12a20c3a3cca9273c46aa26834d93c47
-   data.tar.gz: e026ef0473a20a0e0bafe8551d846983a23c2f49
+   metadata.gz: f029dbe8d23b37ba620af6793be4d12f10fd009c
+   data.tar.gz: 98818a397372a0e01313d75d9b1985d8865deca5
  SHA512:
-   metadata.gz: 12da553942f24fdcbefe729b747e32389d2eeed1f14b9bb3363fbbf0ac58c0a502e9630d949bbd85e8db74bac2e837cd156ce9b2aa9cfb3df0135fe656d41be0
-   data.tar.gz: b52815f7ae238b587867cf3a288f2ea44df35bfe98d38c2f4759fc4f61fb0305a85b6c2cd4c387d7d752ed4a16920b9c3d98e42b29b45c8eaac8a354f11a5af0
+   metadata.gz: 509e7d231f8465dfbf7a13917535c0acd36d32436c81db94988feae183abf5eb3c5c650fa3cef837909745ec3014e51d76a747df5152e83b30e1c71f66e0002c
+   data.tar.gz: cc1836f900c00716056143b9b93a115eec34f876f10ecf5167c9f5239639fdc1f7c697eb30cac940c54f515eb777849f47dfeb9e1d448383c18039789961b7d3
data/.gitignore CHANGED
@@ -1,5 +1,7 @@
  *.gem
  *.rbc
+ *.log
+ *.sqlite3
  .bundle
  .config
  .yardoc
@@ -14,5 +16,7 @@ rdoc
  spec/reports
  test/tmp
  test/version_tmp
+ test/sitemaps
+ test/dummy/public/sitemaps
  tmp
  .DS_Store
@@ -0,0 +1,8 @@
+ language: ruby
+ rvm:
+   - 2.0.0
+   - 1.9.3
+ before_script:
+   - "cd test/dummy; rake db:migrate; rake db:test:prepare; cd ../.."
+ notifications:
+   email: false
data/README.md CHANGED
@@ -1,106 +1,281 @@
+ [![Build Status](https://secure.travis-ci.org/lassebunk/dynamic_sitemaps.png)](http://travis-ci.org/lassebunk/dynamic_sitemaps)
+
  # DynamicSitemaps
  
- Dynamic Sitemaps is a plugin for Ruby on Rails that enables you to easily create flexible, dynamic sitemaps. It creates sitemaps in the [sitemaps.org](http://sitemaps.org) standard which is supported by several crawlers including Google, Bing, and Yahoo.
+ Dynamic Sitemaps is a plugin for [Ruby on Rails](http://rubyonrails.org) that enables you to easily create flexible, dynamic sitemaps. It creates sitemaps in the [sitemaps.org](http://sitemaps.org) standard which is supported by several crawlers including Google, Bing, and Yahoo.
  
  Dynamic Sitemaps is designed to be very (very) simple so there's a lot you cannot do, but possibly don't need (I didn't). If you need an advanced sitemap generator, please see Karl Varga's [SitemapGenerator](https://github.com/kjvarga/sitemap_generator).
  
- ## Planned for version 2.0 (this branch)
+ ## Version 2.0
+
+ Version 2.0 makes it possible to make very large sitemaps (up to 2.5 billion URLs) in a fast and memory efficient way; it is built for large amounts of data, i.e. millions of URLs without pushing your server to the limit, memory and CPU wise.
+
+ Version 2.0 is not compatible with version 1.0 (although the configuration DSL looks somewhat the same) as version 2.0 generates static sitemap XML files whereas 1.0 generated them dynamically on each request (slow for large sitemaps).
+
+ ## Requirements
+
+ DynamicSitemaps is tested in Rails 3.2.13 using Ruby 1.9.3 and 2.0.0, but should work in other versions of Rails 3 and above and Ruby 1.9 and above. Please create an [issue](https://github.com/lassebunk/dynamic_sitemaps/issues) if you encounter any problems.
+
+ ## Installation
+
+ Add this line to your application's Gemfile:
+
+     gem "dynamic_sitemaps", "2.0.0.beta2"
+
+ And then execute:
+
+     $ bundle
+
+ Or install it yourself as:
+
+     $ gem install dynamic_sitemaps
+
+ To generate a simple example config file in `config/sitemap.rb`:
+
+     $ rails generate dynamic_sitemaps:install
  
- Version 2.0 will make it possible to make very large sitemaps (up to 2.5 billion URLs) in a fast and memory efficient way; it will be built for large amounts of data, i.e. millions of URLs without pushing your server to the limit, memory and CPU wise.
+ If you want to use version 1.0 (v1.0.8) of DynamicSitemaps, please see [v1.0.8](https://github.com/lassebunk/dynamic_sitemaps/tree/da0f78ddb1e6a471d6d5715d492295da99f5e682) of the project. Please note that this version isn't good for large sitemaps as it generates them dynamically on each request.
  
- Version 2.0 will not be compatible with version 1.0 as version 2.0 will generate static sitemap XML files whereas 1.0 generates them dynamically on each request.
+ ## Basic usage
  
- Idea for the version 2.0 DSL, in ```config/sitemap.rb```:
+ The configuration file in `config/sitemap.rb` goes like this (also see the production example below for more advanced usage like multiple sites / hosts, etc.):
  
  ```ruby
- host "www.mysite.com"
+ host "www.example.com"
  
+ # Basic sitemap – you can change the name :site as you wish
  sitemap :site do
    url root_url, last_mod: Time.now, change_freq: "daily", priority: 1.0
-   url contact_url
-   Page.all.each do |page|
-     url page, last_mod: page.updated_at
-   end
  end
  
- # All products and editions URLs
- sitemap :products, Product do |product|
-   url product, last_mod: product.updated_at
-   url product_editions_url(product)
- end
+ # Pings search engines after generation has finished
+ ping_with "http://#{host}/sitemap.xml"
+ ```
+
+ The host is needed to generate the URLs because the rake task doesn't know anything about the host being used.
+
+ Then, to generate the sitemap:
+
+     $ rake sitemap:generate
  
- # Autogenerate a tags sitemap with URLs for all tags containing products.
- # Automatically sets last_mod to tag.updated_at.
- sitemap Tag.where("products_count > 0")
+ This will, by default, generate a `sitemap.xml` file in `<project root>/public/sitemaps` that will look like this:
+
+ ```xml
+ <?xml version="1.0" encoding="UTF-8"?>
+ <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
+   <url>
+     <loc>http://www.example.com/</loc>
+     <lastmod>2013-07-08T17:02:45+02:00</lastmod>
+     <changefreq>daily</changefreq>
+     <priority>1.0</priority>
+   </url>
+ </urlset>
  ```
  
- You then run:
+ You then need to symlink from `public/sitemap.xml` (or whatever you choose) to `public/sitemaps/sitemap.xml`:
+
+     $ ln -s /path/to/project/public/sitemaps/sitemap.xml /path/to/project/public/sitemap.xml
+
+ See the below production example for inspiration on how to do this with [Capistrano](https://github.com/capistrano/capistrano), and other things like multiple sites / hosts, etc.
+
+ If a sitemap contains over 50,000 URLs, then by default, as specified by the [sitemaps.org](http://sitemaps.org) standard, DynamicSitemaps will split it into multiple sitemaps and generate an index file that will also be named `public/sitemaps/sitemap.xml` by default.
+ The sitemap files will then be named `site.xml`, `site2.xml`, `site3.xml`, and so on, and the index file will link to these files using the host set with `host`.
+
+ ## Automatic sitemaps for resourceful routes
+
+ DynamicSitemaps can automatically generate sitemaps for ActiveRecord models with the built-in Rails [resourceful routes](http://guides.rubyonrails.org/routing.html#resource-routing-the-rails-default) (the ones you create using `resources :model_name`).
+
+ Example:
+
+ ```ruby
+ host "www.example.com"
+
+ # Basic sitemap
+ sitemap :site do
+   url root_url, last_mod: Time.now, change_freq: "daily", priority: 1.0
+ end
  
- ```bash
- $ rake sitemap:generate
+ # Automatically link to all pages using the routes specified
+ # using "resources :pages" in config/routes.rb. This will also
+ # automatically set <lastmod> to the date and time in page.updated_at.
+ sitemap_for Page.scoped
+
+ # For products with special sitemap name and priority, and link to comments
+ sitemap_for Product.published, name: :published_products do |product|
+   url product, last_mod: product.updated_at, priority: (product.featured? ? 1.0 : 0.7)
+   url product_comments_url(product)
+ end
  ```
  
- This will generate the sitemaps, remove all files from ```public/sitemaps``` replacing them with:
+ This generates the sitemap files `site.xml`, `pages.xml`, and `published_products.xml` and links them together in the `sitemap.xml` index file, splitting them into multiple sitemap files if the number of URLs exceeds 50,000.
+
+ The argument passed to `sitemap_for` needs to respond to [`#find_each`](http://api.rubyonrails.org/classes/ActiveRecord/Batches.html#method-i-find_each), like an ActiveRecord [Relation](http://api.rubyonrails.org/classes/ActiveRecord/Relation.html).
+ This is to ensure that the records from the database are lazy loaded 1,000 at a time, so that it doesn't accidentally load millions of records in one call when the configuration file is read.
+ Therefore we use `Page.scoped` instead of the normal `Page.all`.
+
+ ## Custom configuration
+
+ You can configure different options of how DynamicSitemaps behaves, including the sitemap path and index file name.
+
+ In an initializer, e.g. `config/initializers/dynamic_sitemaps.rb`:
  
- ```bash
- - public/sitemaps/index.xml
- - public/sitemaps/site.xml
- - public/sitemaps/products.xml
- - public/sitemaps/products2.xml
- - public/sitemaps/products3.xml # etc.
- - public/sitemaps/tags.xml
- - public/sitemaps/tags2.xml # etc.
+ ```ruby
+ # These are the built-in defaults, so you don't need to specify them.
+ DynamicSitemaps.configure do |config|
+   config.path = Rails.root.join("public")
+   config.folder = "sitemaps" # This folder is emptied on each sitemap generation
+   config.index_file_name = "sitemap.xml"
+   config.always_generate_index = false # Makes sitemap.xml contain the sitemap
+                                        # (e.g. site.xml) when only one sitemap
+                                        # file has been generated
+   config.config_path = Rails.root.join("config", "sitemap.rb")
+   config.per_page = 50000
+ end
  ```
  
- And symlink ```/sitemap.xml``` to ```/sitemaps/sitemap.xml``` because the sitemaps.org spec only allows URLs in a sitemap to be *below* the ```sitemap.xml``` index file:
+ ## Pinging search engines
+
+ DynamicSitemaps can automatically ping Google and Bing (and other search engines you specify) with the sitemap when the generation finishes.
+
+ In `config/sitemap.rb`:
+
+ ```ruby
+ host "www.example.com"
  
- ```bash
- $ ln -s /var/www/mysite/public/sitemaps/index.xml /var/www/mysite/public/sitemap.xml
+ sitemap :site do
+   url root_url
+ end
+
+ ping_with "http://#{host}/sitemap.xml"
  ```
  
- If you use Capistrano and have a ```shared``` folder:
+ To customize it, in e.g. `config/initializers/dynamic_sitemaps.rb`:
  
- ```bash
- $ mkdir -p /var/www/mysite/shared/sitemaps
- $ rm -r /var/www/mysite/current/public/sitemaps # if uploaded accidentally
- $ ln -s /var/www/mysite/shared/sitemaps /var/www/mysite/current/public/sitemaps
+ ```ruby
+ DynamicSitemaps.configure do |config|
+   # Default is Google and Bing
+   config.search_engine_ping_urls << "http://customsearchengine.com/ping?url=%s"
+
+   # Default is pinging only in production
+   config.ping_environments << "staging"
+ end
  ```
  
- If for example you have multiple subdomains, then in ```config/sitemap.rb```:
+ ## In case of failure
+
+ DynamicSitemaps generates to a temporary directory (`<rails root>/tmp/dynamic_sitemaps`) first and, when finished, moves the files into the destination (by default `public/sitemaps`).
+ So in case you have generated a sitemap successfully and the next sitemap generation fails, your sitemap files will remain untouched and available.
+
+ ## Production example with multiple domains, Capistrano, and Whenever
+
+ This is an example of a real production app that uses DynamicSitemaps with multiple sites and domains in one app, [Capistrano](https://github.com/capistrano/capistrano) for deployment, and [Whenever](https://github.com/javan/whenever) for crontab scheduling.
+
+ ### Sitemap setup
+
+ In `config/sitemap.rb`:
  
  ```ruby
  Site.all.each do |site|
-   host "#{site.subdomain}.mysite.com"
-   path Rails.root.join("public", "sitemaps", site.subdomain)
+   folder "sitemaps/#{site.key}"
+   host site.domain
  
-   url root_url
-   sitemap site.products
-   # etc.
+   sitemap :site do
+     url root_url, priority: 1.0, change_freq: "daily"
+     url blog_posts_url
+     url tags_url
+   end
+
+   sitemap_for site.pages.where("slug != 'home'")
+   sitemap_for site.blog_posts.published
+   sitemap_for site.tags.scoped
+
+   sitemap_for site.products.where("type_id != ?", ProductType.find_by_key("unknown").id) do |product|
+     url product, last_mod: product.updated_at, priority: (product.featured? ? 1.0 : 0.7)
+   end
+
+   ping_with "http://#{host}/sitemap.xml"
  end
  ```
  
- ## Installation
+ ### Routing the default sitemap
  
- Add this line to your application's Gemfile:
+ #### Route for sitemap.xml and robots.txt
  
-     gem "dynamic_sitemaps", git: "git://github.com/lassebunk/dynamic_sitemaps.git", branch: "2.0"
+ In `config/routes.rb`:
  
- And then execute:
+ ```ruby
+ get "sitemap.xml" => "home#sitemap", format: :xml, as: :sitemap
+ get "robots.txt" => "home#robots", format: :text, as: :robots
+ ```
  
-     $ bundle
+ #### Controller
  
- Or install it yourself as:
+ In `app/controllers/home_controller.rb`:
  
-     $ gem install dynamic_sitemaps
+ ```ruby
+ class HomeController < ApplicationController
+   # ...
+
+   def sitemap
+     path = Rails.root.join("public", "sitemaps", current_site.key, "sitemap.xml")
+     if File.exist?(path)
+       render xml: File.read(path)
+     else
+       render text: "Sitemap not found.", status: :not_found
+     end
+   end
+
+   def robots
+   end
+ end
+ ```
  
- ## Usage
+ #### View for robots.txt
  
- TODO: Write usage instructions here
+ In `app/views/home/robots.text.erb`:
+
+ ```html
+ Sitemap: <%= sitemap_url %>
+ ```
+
+ ### Deployment with Capistrano
+
+ [Capistrano](https://github.com/capistrano/capistrano) deployment configuration in `config/deploy.rb`:
+
+ ```ruby
+ after "deploy:update_code", "sitemaps:create_symlink"
+
+ namespace :sitemaps do
+   task :create_symlink, roles: :app do
+     run "mkdir -p #{shared_path}/sitemaps"
+     run "rm -rf #{release_path}/public/sitemaps"
+     run "ln -s #{shared_path}/sitemaps #{release_path}/public/sitemaps"
+   end
+ end
+ ```
+
+ For automatic crontab scheduling with [Whenever](https://github.com/javan/whenever), in `config/schedule.rb`:
+
+ ```ruby
+ every 1.day, at: "6am" do
+   rake "sitemap:generate"
+ end
+ ```
+
+ This will automatically generate the sitemaps and ping Google and Bing every day at 6am using the sitemap URLs configured above.
+
+ ## Problems?
+
+ If you encounter any problems with DynamicSitemaps, please create an [issue](https://github.com/lassebunk/dynamic_sitemaps/issues).
+ If you want to fix the problem (please do :smile:), please see below.
  
  ## Contributing
  
- 1. Fork it
+ Help is always appreciated whether it be improvement of the code, testing, or adding new relevant features.
+ Please create an [issue](https://github.com/lassebunk/dynamic_sitemaps/issues) before implementing a new feature, so we can discuss it in advance. Thanks.
+
+ 1. Fork the repo
  2. Create your feature branch (`git checkout -b my-new-feature`)
- 3. Commit your changes (`git commit -am 'Add some feature'`)
+ 3. Commit your changes (`git commit -am 'Add feature'`)
  4. Push to the branch (`git push origin my-new-feature`)
  5. Create new Pull Request
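
Note on the README's splitting rule: the naming scheme it describes (`site.xml`, `site2.xml`, `site3.xml`, ...) can be sketched in plain Ruby as below. This is only an illustration of the documented behaviour, not the gem's code: `sitemap_file_names` and `PER_PAGE` are hypothetical names, and giving an empty sitemap a single file is an assumption.

```ruby
# Hypothetical helper for illustration only; not part of DynamicSitemaps.
PER_PAGE = 50_000 # same value as the documented config.per_page default

# Returns the file names a sitemap called `name` would be split into
# when it holds `url_count` URLs.
def sitemap_file_names(name, url_count, per_page = PER_PAGE)
  # Integer ceiling division; an empty sitemap still gets one file (assumption).
  pages = [(url_count + per_page - 1) / per_page, 1].max
  # The first file carries no number, later files are numbered from 2.
  (1..pages).map { |page| page == 1 ? "#{name}.xml" : "#{name}#{page}.xml" }
end

p sitemap_file_names(:site, 120_000)
# prints ["site.xml", "site2.xml", "site3.xml"]
```

A sitemap with exactly 50,000 URLs stays in one file under this sketch, which matches the README's wording that splitting happens when a sitemap contains over 50,000 URLs.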
data/TODO.md ADDED
@@ -0,0 +1,7 @@
+ # Todo
+
+ **No unfinished todos**.
+
+ Good job!
+
+ ![Octocat](http://octodex.github.com/images/poptocat.png)
@@ -19,4 +19,7 @@ Gem::Specification.new do |gem|
  
    gem.add_development_dependency "rails", "~> 3.2.13"
    gem.add_development_dependency "sqlite3"
+   gem.add_development_dependency "nokogiri", "~> 1.6.0"
+   gem.add_development_dependency "timecop", "~> 0.6.1"
+   gem.add_development_dependency "webmock", "~> 1.13.0"
  end
@@ -5,6 +5,7 @@ require "dynamic_sitemaps/sitemap_generator"
  require "dynamic_sitemaps/index_generator"
  require "dynamic_sitemaps/sitemap_result"
  require "dynamic_sitemaps/pinger"
+ require "dynamic_sitemaps/logger"
  
  module DynamicSitemaps
    DEFAULT_PER_PAGE = 50000
@@ -15,12 +16,15 @@ module DynamicSitemaps
      "http://www.google.com/webmasters/sitemaps/ping?sitemap=%s",
      "http://www.bing.com/webmaster/ping.aspx?siteMap=%s"
    ]
+   DEFAULT_PING_ENVIRONMENTS = ["production"]
  
    class << self
-     attr_writer :path, :folder, :index_file_name, :always_generate_index, :config_path, :search_engine_ping_urls
+     attr_writer :index_file_name, :always_generate_index, :per_page, :search_engine_ping_urls, :ping_environments
  
-     def generate_sitemap
-       DynamicSitemaps::Generator.generate
+     # Generates the sitemap(s) and index based on the configuration file specified in DynamicSitemaps.config_path.
+     # If you supply a block, that block is evaluated instead of the configuration file.
+     def generate_sitemap(&block)
+       DynamicSitemaps::Generator.new.generate(&block)
      end
  
      # Configure DynamicSitemaps.
@@ -32,15 +36,14 @@ module DynamicSitemaps
      #   config.index_file_name = "sitemap.xml"
      #   config.always_generate_index = false
      #   config.config_path = Rails.root.join("config", "sitemap.rb")
+     #   config.per_page = 50_000
      # end
      #
      # To ping search engines after generating the sitemap:
      #
      # DynamicSitemaps.configure do |config|
      #   config.search_engine_ping_urls << "http://customsearchengine.com/ping?url=%s" # Default is Google and Bing
-     #   config.sitemap_ping_urls = ["http://www.domain.com/sitemap.xml"]
-     #   # or dynamically:
-     #   config.sitemap_ping_urls = -> { Site.all.map { |site| "http://#{site.domain}/sitemap.xml" } }
+     #   config.ping_environments << "staging" # Default is production
      # end
      def configure
        yield self
@@ -50,8 +53,18 @@ module DynamicSitemaps
        @folder ||= DEFAULT_FOLDER
      end
  
+     def folder=(new_folder)
+       raise ArgumentError, "DynamicSitemaps.folder can't be blank." if new_folder.blank?
+       @folder = new_folder
+     end
+
      def path
-       @path ||= Rails.root.join("public")
+       @path ||= Rails.root.join("public").to_s
+     end
+
+     def path=(new_path)
+       raise ArgumentError, "DynamicSitemaps.path can't be blank." if new_path.blank?
+       @path = new_path.to_s
      end
  
      def index_file_name
@@ -64,26 +77,38 @@ module DynamicSitemaps
      end
  
      def config_path
-       @config_path ||= Rails.root.join("config", "sitemap.rb")
+       @config_path ||= Rails.root.join("config", "sitemap.rb").to_s
+     end
+
+     def config_path=(new_path)
+       raise ArgumentError, "DynamicSitemaps.config_path can't be blank." if new_path.blank?
+       @config_path = new_path.to_s
+     end
+
+     def per_page
+       @per_page ||= DEFAULT_PER_PAGE
      end
  
      def search_engine_ping_urls
-       @search_engine_ping_urls ||= SEARCH_ENGINE_PING_URLS
+       @search_engine_ping_urls ||= SEARCH_ENGINE_PING_URLS.dup
      end
  
-     def sitemap_ping_urls
-       case @sitemap_ping_urls
-       when Array then @sitemap_ping_urls
-       when Proc then @sitemap_ping_urls.call
-       else []
-       end
+     def ping_environments
+       @ping_environments ||= DEFAULT_PING_ENVIRONMENTS.dup
      end
  
+     # Removed in version 2.0.0.beta2
      def sitemap_ping_urls=(array_or_proc)
-       unless array_or_proc.is_a?(Array) || array_or_proc.is_a?(Proc)
-         raise "Unknown type #{array_or_proc.class.name} for sitemap_ping_urls."
-       end
-       @sitemap_ping_urls = array_or_proc
+       raise "sitemap_ping_urls has been removed. Please use `ping_with \"http://example.com/sitemap.xml\"` in config/sitemap.rb instead."
+     end
+
+     def temp_path
+       @temp_path ||= Rails.root.join("tmp", "dynamic_sitemaps").to_s
+     end
+
+     # Resets all instance variables. Used for testing.
+     def reset!
+       instance_variables.each { |var| remove_instance_variable var }
      end
    end
  end
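
Note on the `.dup` calls added to `search_engine_ping_urls` and `ping_environments`: a plausible reason for them is that the documented `config.search_engine_ping_urls << "..."` appends in place, so memoizing the constant itself would let configuration mutate the default list. The sketch below reproduces that effect in isolation. `PingConfig`, `shared_urls`, `copied_urls`, and the example URLs are hypothetical and are not part of the gem.

```ruby
# Stand-alone illustration; PingConfig is a made-up module, not DynamicSitemaps.
module PingConfig
  DEFAULT_URLS = ["http://search.example/ping?sitemap=%s"]

  class << self
    # Memoizes the constant's own array, so appends reach the constant.
    def shared_urls
      @shared_urls ||= DEFAULT_URLS
    end

    # Memoizes a shallow copy, so appends stay local to the configuration.
    def copied_urls
      @copied_urls ||= DEFAULT_URLS.dup
    end
  end
end

PingConfig.copied_urls << "http://custom.example/ping?url=%s"
puts PingConfig::DEFAULT_URLS.size # the default list is unchanged here

PingConfig.shared_urls << "http://custom.example/ping?url=%s"
puts PingConfig::DEFAULT_URLS.size # the default list has grown by one
```

The copy is shallow, which is enough in this sketch because only the array is appended to and the strings themselves are never modified.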