adsense_crawler_for_private 1.0.0

Sign up to get free protection for your applications and to get access to all the features.
Files changed (45) hide show
  1. data/MIT-LICENSE +20 -0
  2. data/README.md +112 -0
  3. data/Rakefile +39 -0
  4. data/app/controllers/adsense_crawler_for_private/adsense_crawler_login_controller.rb +39 -0
  5. data/app/controllers/adsense_crawler_for_private/application_controller.rb +6 -0
  6. data/app/helpers/adsense_crawler_for_private/application_helper.rb +4 -0
  7. data/config/routes.rb +3 -0
  8. data/lib/adsense_crawler_for_private/engine.rb +6 -0
  9. data/lib/adsense_crawler_for_private/version.rb +3 -0
  10. data/lib/adsense_crawler_for_private.rb +84 -0
  11. data/lib/tasks/adsense_crawler_for_private_tasks.rake +4 -0
  12. data/test/adsense_crawler_for_private_test.rb +7 -0
  13. data/test/dummy/Rakefile +7 -0
  14. data/test/dummy/app/controllers/application_controller.rb +3 -0
  15. data/test/dummy/app/controllers/main_controller.rb +29 -0
  16. data/test/dummy/app/helpers/application_helper.rb +2 -0
  17. data/test/dummy/config/application.rb +50 -0
  18. data/test/dummy/config/boot.rb +10 -0
  19. data/test/dummy/config/database.yml +25 -0
  20. data/test/dummy/config/environment.rb +5 -0
  21. data/test/dummy/config/environments/development.rb +30 -0
  22. data/test/dummy/config/environments/production.rb +60 -0
  23. data/test/dummy/config/environments/test.rb +39 -0
  24. data/test/dummy/config/initializers/backtrace_silencers.rb +7 -0
  25. data/test/dummy/config/initializers/inflections.rb +10 -0
  26. data/test/dummy/config/initializers/mime_types.rb +5 -0
  27. data/test/dummy/config/initializers/secret_token.rb +7 -0
  28. data/test/dummy/config/initializers/session_store.rb +8 -0
  29. data/test/dummy/config/initializers/wrap_parameters.rb +14 -0
  30. data/test/dummy/config/locales/en.yml +5 -0
  31. data/test/dummy/config/routes.rb +6 -0
  32. data/test/dummy/config.ru +4 -0
  33. data/test/dummy/db/development.sqlite3 +0 -0
  34. data/test/dummy/db/test.sqlite3 +0 -0
  35. data/test/dummy/log/development.log +436 -0
  36. data/test/dummy/log/test.log +6831 -0
  37. data/test/dummy/public/404.html +26 -0
  38. data/test/dummy/public/422.html +26 -0
  39. data/test/dummy/public/500.html +26 -0
  40. data/test/dummy/public/favicon.ico +0 -0
  41. data/test/dummy/script/rails +6 -0
  42. data/test/integration/adsense_crawler_for_private/filtering_helper_test.rb +98 -0
  43. data/test/integration/adsense_crawler_for_private/login_test.rb +77 -0
  44. data/test/test_helper.rb +10 -0
  45. metadata +187 -0
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright 2011 YOURNAME
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,112 @@
1
+ # AdsenseCrawlerForPrivate
2
+
3
+ Easy way to enable adsense crawler to login and see private or custom pages. Basically one custom login filter.
4
+
5
+ [<img src="https://secure.travis-ci.org/holli/adsense_crawler_for_private.png" />](http://travis-ci.org/holli/adsense_crawler_for_private)
6
+
7
+ ## Usage
8
+
9
+
10
+ **Initialize:** In RAILS_ROOT/config/initializers/adsense_crawler_for_private.rb
11
+
12
+ ```
13
+ # Configure adsense_crawler_for_private
14
+
15
+ AdsenseCrawlerForPrivate.cookie_name = "adsense_crawler"
16
+ AdsenseCrawlerForPrivate.cookie_domain = :all
17
+ AdsenseCrawlerForPrivate.crawler_name = "test_name"
18
+ AdsenseCrawlerForPrivate.crawler_password = "test_password"
19
+
20
+ # If you are paranoid you can specify ip addresses that are ok to the crawlers to access
21
+ # AdsenseCrawlerForPrivate.ip_ranges = [IPAddr.new("127.0.0.1"), IPAddr.new("192.168.0.1/20")]
22
+
23
+ ```
24
+
25
+ **Routes:**
26
+
27
+ ```
28
+ # Give url "http://domain.dom/adsense_crawler_for_private/login" for crawlers to log in
29
+ # or directly "http://domain.dom/adsense_crawler_for_private/login?name=test_name&password=test_password"
30
+ mount AdsenseCrawlerForPrivate::Engine => "/adsense_crawler_for_private"
31
+ ```
32
+
33
+ **Rendering etc usage:** in controller define what to render for crawlers
34
+
35
+ ```
36
+
37
+ def SomeController << ApplicationController
38
+
39
+ before_filter :adsense_crawler_private_specific_page # normal authentication filters after this one
40
+
41
+ def adsense_crawler_private_specific_page
42
+ if AdsenseCrawlerForPrivate.login_check(cookies, request)
43
+ # here info how to render page for crawler
44
+ # e.g render 'crawler_ad_page'
45
+ # or creating a dummy login info
46
+ return false # so that normal authentication filters etc are not effective
47
+ end
48
+ end
49
+
50
+ # Here would be rest of the controller
51
+
52
+ end
53
+
54
+ ```
55
+
56
+ Or if you want only to check alongside other authentication you can call in your
57
+ own authentication filters AdsenseCrawlerForPrivate.login_check(request)-method
58
+ directly.
59
+
60
+
61
+ ### Testing your own stuff
62
+
63
+ You can test your filters by setting cookie in the same way as in AdsenseCrawlerLoginController#login .
64
+
65
+ If you have enabled ip_ranges option you have to make sure that the ip that tests use is enabled for crawlers.
66
+
67
+ ```
68
+ test "here would be a test for logged crawler in functional tests"
69
+ # Dummy login for crawler, These should be configured in initializers
70
+ crawler_name="adsense_crawler"; crawler_password = "adsense_pass";
71
+ @request.remote_addr = ip = "127.0.0.1"
72
+
73
+ # In some frameworks cookies.signed would be enough. Some will need you to sign the cookie by yourself.
74
+ # If you have a better way, let me know
75
+ # same as cookies.signed[AdsenseCrawlerForPrivate.cookie_name] = AdsenseCrawlerForPrivate.cookie_hash(crawler_name, ip)
76
+ @request.cookies[AdsenseCrawlerForPrivate.cookie_name] =
77
+ ActiveSupport::MessageVerifier.new(Dummy::Application.config.secret_token).generate(
78
+ AdsenseCrawlerForPrivate.cookie_str(crawler_name, crawler_password, 2.days.from_now, ip))
79
+
80
+
81
+ #Normal test in here
82
+ get :index
83
+ assert_response :success
84
+ assert response.body.include?("Hi normal crawler")
85
+ end
86
+
87
+ test "here would be a test for non-logged crawler"
88
+ get :index
89
+ assert_response :success
90
+ assert response.body.include?("Hi normal user")
91
+ end
92
+ ```
93
+
94
+ ### robots.txt
95
+
96
+ Remember to update robots.txt if you have previously forbidden adsense to crawl certain pages.
97
+
98
+ ## Requirements
99
+
100
+ Gem has been tested with ruby 1.8.7, 1.9.2 and Rails 3.1.
101
+
102
+ [<img src="https://secure.travis-ci.org/holli/adsense_crawler_for_private.png" />](http://travis-ci.org/holli/adsense_crawler_for_private)
103
+
104
+ http://travis-ci.org/#!/holli/adsense_crawler_for_private
105
+
106
+ ## Support
107
+
108
+ Submit suggestions or feature requests as a GitHub Issue or Pull Request. Remember to update tests. Tests are quite extensive.
109
+
110
+ ## License
111
+
112
+ Released under the MIT license (http://www.opensource.org/licenses/mit-license.php)
data/Rakefile ADDED
@@ -0,0 +1,39 @@
1
+ #!/usr/bin/env rake
2
+ begin
3
+ require 'bundler/setup'
4
+ rescue LoadError
5
+ puts 'You must `gem install bundler` and `bundle install` to run rake tasks'
6
+ end
7
+ begin
8
+ require 'rdoc/task'
9
+ rescue LoadError
10
+ require 'rdoc/rdoc'
11
+ require 'rake/rdoctask'
12
+ RDoc::Task = Rake::RDocTask
13
+ end
14
+
15
+ RDoc::Task.new(:rdoc) do |rdoc|
16
+ rdoc.rdoc_dir = 'rdoc'
17
+ rdoc.title = 'AdsenseCrawlerForPrivate'
18
+ rdoc.options << '--line-numbers'
19
+ rdoc.rdoc_files.include('README.rdoc')
20
+ rdoc.rdoc_files.include('lib/**/*.rb')
21
+ end
22
+
23
+ APP_RAKEFILE = File.expand_path("../test/dummy/Rakefile", __FILE__)
24
+ load 'rails/tasks/engine.rake'
25
+
26
+
27
+ Bundler::GemHelper.install_tasks
28
+
29
+ require 'rake/testtask'
30
+
31
+ Rake::TestTask.new(:test) do |t|
32
+ t.libs << 'lib'
33
+ t.libs << 'test'
34
+ t.pattern = 'test/**/*_test.rb'
35
+ t.verbose = false
36
+ end
37
+
38
+
39
+ task :default => :test
@@ -0,0 +1,39 @@
1
+ module AdsenseCrawlerForPrivate
2
+ class AdsenseCrawlerLoginController < ApplicationController
3
+
4
+ # Making sure that verify_authenticity_token is not on, adsense does not have it
5
+ skip_before_filter :verify_authenticity_token, :only => :login
6
+
7
+ def login
8
+ unless AdsenseCrawlerForPrivate.crawler_password.blank?
9
+
10
+ crawler_name = params[:name]
11
+ crawler_password = params[:password]
12
+
13
+ if (AdsenseCrawlerForPrivate.ip_check(request) and
14
+ crawler_name == AdsenseCrawlerForPrivate.crawler_name and
15
+ crawler_password == AdsenseCrawlerForPrivate.crawler_password)
16
+
17
+ cookies.signed[AdsenseCrawlerForPrivate.cookie_name] = AdsenseCrawlerForPrivate.cookie_hash(crawler_name, crawler_password, request.remote_addr)
18
+
19
+ AdsenseCrawlerForPrivate.logger.warn "login successfully. Crawler_name: #{crawler_name}"
20
+
21
+ render :text => 'crawler login ok', :status => 200
22
+ else
23
+ cookies.delete(AdsenseCrawlerForPrivate.cookie_name, :domain => AdsenseCrawlerForPrivate.cookie_domain)
24
+
25
+ AdsenseCrawlerForPrivate.logger.warn "login unsuccessful. Crawler_name: #{crawler_name}, crawler_password: #{crawler_password}, crawler_ip: #{request.remote_addr}"
26
+
27
+ render :text => 'crawler login unsuccessful', :status => 401 # 401 unauthorized
28
+ end
29
+
30
+ else
31
+ str = "AdsenseCrawlerForPrivate not configured, no password given."
32
+ AdsenseCrawlerForPrivate.logger.warn(str)
33
+ render :text => str, :status => 401
34
+ end
35
+
36
+ end
37
+
38
+ end
39
+ end
@@ -0,0 +1,6 @@
1
+ module AdsenseCrawlerForPrivate
2
+ class ApplicationController < ActionController::Base
3
+
4
+
5
+ end
6
+ end
@@ -0,0 +1,4 @@
1
+ module AdsenseCrawlerForPrivate
2
+ module ApplicationHelper
3
+ end
4
+ end
data/config/routes.rb ADDED
@@ -0,0 +1,3 @@
1
+ AdsenseCrawlerForPrivate::Engine.routes.draw do
2
+ get "login", :controller => "adsense_crawler_login"
3
+ end
@@ -0,0 +1,6 @@
1
+ module AdsenseCrawlerForPrivate
2
+ class Engine < Rails::Engine
3
+ isolate_namespace AdsenseCrawlerForPrivate
4
+
5
+ end
6
+ end
@@ -0,0 +1,3 @@
1
+ module AdsenseCrawlerForPrivate
2
+ VERSION = "1.0.0"
3
+ end
@@ -0,0 +1,84 @@
1
+ require 'ipaddr'
2
+ require "adsense_crawler_for_private/engine"
3
+
4
+ module AdsenseCrawlerForPrivate
5
+
6
+ mattr_accessor :cookie_name, :cookie_domain, :crawler_name, :crawler_password, :ip_ranges
7
+
8
+ protected :crawler_password
9
+
10
+ # Checks crawler cookie, returns true if logged in
11
+ def self.login_check(cookies, request)
12
+ cookie = cookies.signed[AdsenseCrawlerForPrivate.cookie_name]
13
+
14
+ login_ok = false
15
+
16
+ if !cookie.blank?
17
+ self.logger.info "login_check: cookie found #{cookie}"
18
+ begin
19
+ name, password, expiry_time, remote_addr = JSON.parse(cookie)
20
+ expiry_time = Time.parse(expiry_time)
21
+
22
+ if (name == AdsenseCrawlerForPrivate.crawler_name and
23
+ password == Digest::SHA1.hexdigest(AdsenseCrawlerForPrivate.crawler_password) and
24
+ expiry_time > Time.now and
25
+ request.remote_addr == remote_addr and
26
+ self.ip_check(request))
27
+ login_ok = true
28
+ self.logger.warn "login_check was ok for #{name}"
29
+ end
30
+ rescue JSON::ParserError => e
31
+ self.logger.warn "login_check problem parsing cookie json: #{e.inspect}"
32
+ ensure
33
+ unless login_ok
34
+ self.logger.warn "login_check wasn't ok, even though cookie was found."
35
+ cookies.delete(AdsenseCrawlerForPrivate.cookie_name)
36
+ end
37
+ end
38
+ end
39
+
40
+ return login_ok
41
+ end
42
+
43
+ def self.cookie_hash(crawler_name, crawler_password, request_or_ip)
44
+ {:value => AdsenseCrawlerForPrivate.cookie_str(crawler_name, crawler_password, 2.days.from_now, request_or_ip),
45
+ :expires => 2.days.from_now,
46
+ :domain => AdsenseCrawlerForPrivate.cookie_domain}
47
+ end
48
+
49
+ def self.cookie_str(crawler_name, crawler_password, expire_time, request_or_ip)
50
+ ip_str = request_or_ip.respond_to?(:remote_addr) ? request_or_ip.remote_addr : request_or_ip.to_s
51
+
52
+ return [crawler_name, Digest::SHA1.hexdigest(crawler_password),
53
+ expire_time.httpdate, ip_str].to_json
54
+ end
55
+
56
+ def self.ip_check(request)
57
+ unless AdsenseCrawlerForPrivate.ip_ranges.nil?
58
+ ip_check = ::IPAddr.new(request.remote_addr)
59
+ AdsenseCrawlerForPrivate.ip_ranges.each do |ip_accepted|
60
+ return true if ip_accepted.include?(ip_check)
61
+ end
62
+ return false
63
+ end
64
+ return true
65
+ end
66
+
67
+ def self.logger
68
+ Logger
69
+ end
70
+
71
+ class Logger
72
+ def self.info(str)
73
+ Rails.logger.info("AdsenseCrawlerForPrivate: #{str}")
74
+ end
75
+ def self.warn(str)
76
+ Rails.logger.warn("AdsenseCrawlerForPrivate: #{str}")
77
+ end
78
+ end
79
+ end
80
+
81
+ #class ApplicationController < ActionController::Base
82
+ # helper MyEngine::SharedEngineHelper
83
+ # end
84
+ #end
@@ -0,0 +1,4 @@
1
+ # desc "Explaining what the task does"
2
+ # task :adsense_crawler_for_private do
3
+ # # Task goes here
4
+ # end
@@ -0,0 +1,7 @@
1
+ require 'test_helper'
2
+
3
+ class AdsenseCrawlerForPrivateTest < ActiveSupport::TestCase
4
+ test "truth" do
5
+ assert_kind_of Module, AdsenseCrawlerForPrivate
6
+ end
7
+ end
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env rake
2
+ # Add your own tasks in files placed in lib/tasks ending in .rake,
3
+ # for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.
4
+
5
+ require File.expand_path('../config/application', __FILE__)
6
+
7
+ Dummy::Application.load_tasks
@@ -0,0 +1,3 @@
1
+ class ApplicationController < ActionController::Base
2
+ protect_from_forgery
3
+ end
@@ -0,0 +1,29 @@
1
+ class MainController < ApplicationController
2
+
3
+ before_filter :adsense_crawler_private_specific_page, :only => :forbidden_render
4
+ before_filter :redirect_if_not_logged, :only => :forbidden_render
5
+
6
+
7
+ def normal_render
8
+ #debugger
9
+ render :text => 'this is rendered normally'
10
+ end
11
+
12
+ def forbidden_render
13
+ render :text => 'never used'
14
+ end
15
+
16
+ protected
17
+
18
+ def redirect_if_not_logged
19
+ redirect_to "http://www.you-were-not-logged-in.inv"
20
+ end
21
+
22
+ def adsense_crawler_private_specific_page
23
+ if AdsenseCrawlerForPrivate.login_check(cookies, request)
24
+ render :text => "private render for crawler"
25
+ return false
26
+ end
27
+ end
28
+
29
+ end
@@ -0,0 +1,2 @@
1
+ module ApplicationHelper
2
+ end
@@ -0,0 +1,50 @@
1
+ require File.expand_path('../boot', __FILE__)
2
+
3
+ # This is a custom way of disable activerecord database, used instead of default "require 'rails/all'"
4
+ # see http://stackoverflow.com/questions/2212709/remove-activerecord-in-rails-3-beta for more info
5
+ require "action_controller/railtie"
6
+ require "action_mailer/railtie"
7
+ require "active_resource/railtie"
8
+ require "sprockets/railtie"
9
+
10
+ Bundler.require
11
+ require "adsense_crawler_for_private"
12
+
13
+ module Dummy
14
+ class Application < Rails::Application
15
+ # Settings in config/environments/* take precedence over those specified here.
16
+ # Application configuration should go into files in config/initializers
17
+ # -- all .rb files in that directory are automatically loaded.
18
+
19
+ # Custom directories with classes and modules you want to be autoloadable.
20
+ # config.autoload_paths += %W(#{config.root}/extras)
21
+
22
+ # Only load the plugins named here, in the order given (default is alphabetical).
23
+ # :all can be used as a placeholder for all plugins not explicitly named.
24
+ # config.plugins = [ :exception_notification, :ssl_requirement, :all ]
25
+
26
+ # Activate observers that should always be running.
27
+ # config.active_record.observers = :cacher, :garbage_collector, :forum_observer
28
+
29
+ # Set Time.zone default to the specified zone and make Active Record auto-convert to this zone.
30
+ # Run "rake -D time" for a list of tasks for finding time zone names. Default is UTC.
31
+ # config.time_zone = 'Central Time (US & Canada)'
32
+
33
+ # The default locale is :en and all translations from config/locales/*.rb,yml are auto loaded.
34
+ # config.i18n.load_path += Dir[Rails.root.join('my', 'locales', '*.{rb,yml}').to_s]
35
+ # config.i18n.default_locale = :de
36
+
37
+ # Configure the default encoding used in templates for Ruby 1.9.
38
+ config.encoding = "utf-8"
39
+
40
+ # Configure sensitive parameters which will be filtered from the log file.
41
+ config.filter_parameters += [:password]
42
+
43
+ # Enable the asset pipeline
44
+ config.assets.enabled = true
45
+
46
+ # Version of your assets, change this if you want to expire all your assets
47
+ config.assets.version = '1.0'
48
+ end
49
+ end
50
+