irus_analytics 0.0.1

checksums.yaml ADDED
@@ -0,0 +1,15 @@
1
+ ---
2
+ !binary "U0hBMQ==":
3
+ metadata.gz: !binary |-
4
+ M2VkNzliNDBkYWRlZmVmNmMzZDc3ZTUzZDc0YTk1ZWE2YmUyZTUyZQ==
5
+ data.tar.gz: !binary |-
6
+ OTgwYjQyOWVjN2Y5NWMzZjQ0MGViMjA2NGU0M2FlNTE4YWVkMTk0OQ==
7
+ SHA512:
8
+ metadata.gz: !binary |-
9
+ OTVkZTBjMWY3ZWUxMTFmMDhhMzk5MDQ3NjliODhkMTljZTRkNTFkMjg3OTZm
10
+ YWUzMTY3NDE0YWE2ZjVjMWEyNzVjMzkyMGNhYzQ2NzE1NmYyOTExYTM2Y2I0
11
+ ODUxMGQ2NGQ4OTdjODNhMTM3NDc4OWVjZGNkNWZhZjZmYmU4NGQ=
12
+ data.tar.gz: !binary |-
13
+ NDVjMGM0YTY5NGI4MmUxZjI0NzUxOWU0NDZkN2QwNzE4NDVlYjY0NjkwZjQ1
14
+ ZWQ2YmZhZmNlMjczM2I4ZTcyZjU1ZmZmYjIwNzYzMWQ3ZjQ1OTU5ZmNkMzA3
15
+ NjJmNzc1N2I0Mjc2NTEwOTdjYWM2YzEwYTU4ZGIwMzlhMzUyYmU=
data/.gitignore ADDED
@@ -0,0 +1,22 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
18
+ *.bundle
19
+ *.so
20
+ *.o
21
+ *.a
22
+ mkmf.log
data/.ruby-gemset ADDED
@@ -0,0 +1 @@
1
+ irus_analytics
data/.ruby-version ADDED
@@ -0,0 +1 @@
1
+ ruby-1.9.3-p545
data/.travis.yml ADDED
@@ -0,0 +1,4 @@
1
+ language: ruby
2
+ rvm:
3
+ - 1.9.3
4
+ - 2.1.0
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in irus_analytics.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,15 @@
1
+ Copyright 2014 The University of Hull
2
+
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
6
+
7
+ http://www.apache.org/licenses/LICENSE-2.0
8
+
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
14
+
15
+ Additional copyright may be held by others, as reflected in the commit history.
data/README.md ADDED
@@ -0,0 +1,92 @@
1
+ # IrusAnalytics
2
+
3
+ IrusAnalytics is a gem that provides a simple way to send analytics to the IRUS-UK repository aggregation service.
4
+
5
+ More information about IRUS-UK can be found at [http://www.irus.mimas.ac.uk/](http://www.irus.mimas.ac.uk/). In summary, the IRUS-UK service is designed to provide article-level usage statistics for institutional repositories. To sign up and use IRUS-UK, please see the above link.
6
+
7
+ This gem was developed for use with a Hydra repository [http://projecthydra.org/](http://projecthydra.org/), but it can be used with any other Rails-based web application.
8
+
9
+ # Build Status
10
+ ![Build Status](https://api.travis-ci.org/uohull/irus_analytics.png?branch=master)
11
+
12
+ ## Installation
13
+
14
+ Add this line to your application's Gemfile:
15
+
16
+ gem 'irus_analytics'
17
+
18
+ And then execute:
19
+
20
+ $ bundle
21
+
22
+ Or install it yourself as:
23
+
24
+ $ gem install irus_analytics
25
+
26
+ ## Usage
27
+
28
+ Once you have the gem, run the following generator:
29
+
30
+ $ rails g irus_analytics
31
+
32
+ This generates a configuration file at config/initializers/irus_analytics.rb.
33
+
34
+ **IrusAnalytics.configuration.source\_repository** sets the URL of the source repository (i.e. the URL of your repository).
35
+ **IrusAnalytics.configuration.irus\_server\_address** defines the IRUS-UK server endpoint; it can be pointed at the test version of the service.
36
+
37
+ The IrusAnalytics code is designed to be called after a download event has happened in your Rails application. The following code needs adding to the Rails controller that handles the content download.
38
+
39
+ A simple example...
40
+
41
+ class YourDownloadController < ApplicationController
42
+ # You need to include the IrusAnalytics behaviour module
43
+ include IrusAnalytics::Controller::AnalyticsBehaviour
44
+
45
+ after_filter :send_analytics, only: [:show]
46
+
47
+ def show
48
+ @id = params[:id]
49
+ # Your code
50
+ end
51
+
52
+ protected
53
+
54
+ # You need to define this method for IrusAnalytics to use as the identifier (typically a valid OAI identifier)
55
+ def item_identifier
56
+ @id
57
+ end
58
+ end
59
+
60
+ Therefore in summary...
61
+
62
+ include IrusAnalytics::Controller::AnalyticsBehaviour
63
+
64
+ after_filter :send_analytics, only: [:show]
65
+
66
+ def item_identifier
67
+ end
68
+
69
+ ... needs adding to the relevant controller.
70
+
71
+ To be compliant with the IRUS-UK client requirements/recommendations, this gem uses Resque [https://github.com/resque/resque](https://github.com/resque/resque). Resque provides a simple way to create background jobs within your Ruby application, and is used within this gem to push the analytics posts onto a queue. This means the download functionality within your application is unaffected by the send analytics call, and it provides a means of queuing analytics if the IRUS-UK server is down.
72
+
73
+ Note: Resque requires Redis to be installed
74
+
75
+ By installing this gem, your application should have access to the Resque rake tasks. These can be seen by running "rake -T"; the list should include:
76
+
77
+ rake resque:failures:sort
78
+ rake resque:work
79
+ rake resque:workers
80
+
81
+ To start the Resque worker for this gem's queue, use:
82
+
83
+ QUEUE=irus_analytics rake environment resque:work
84
+
85
+
86
+ ## Contributing
87
+
88
+ 1. Fork it ( https://github.com/uohull/irus_analytics/fork )
89
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
90
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
91
+ 4. Push to the branch (`git push origin my-new-feature`)
92
+ 5. Create a new Pull Request
data/Rakefile ADDED
@@ -0,0 +1,12 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+ require "resque/tasks"
4
+
5
+ # Default directory to look in is `spec/`
6
+ # Run with `rake spec`
7
+ RSpec::Core::RakeTask.new(:spec) do |task|
8
+ task.rspec_opts = ['--color']
9
+ end
10
+
11
+ task :default => :spec
12
+
@@ -0,0 +1,241 @@
1
+ [^a]fish
2
+ [+:,\.\;\/\\-]bot
3
+ ^$
4
+ ^IDA$
5
+ ^ruby$
6
+ ^voyager\/
7
+ acme\.spider
8
+ alexa
9
+ Alexandria(\s|\+)prototype(\s|\+)project
10
+ AllenTrack
11
+ almaden
12
+ appie
13
+ Arachmo
14
+ architext
15
+ archive\.org_bot
16
+ arks
17
+ asterias
18
+ atomz
19
+ autoemailspider
20
+ awbot
21
+ baiduspider
22
+ bbot
23
+ BDFetch
24
+ biadu
25
+ biglotron
26
+ bjaaland
27
+ blaiz\-bee
28
+ bloglines
29
+ blogpulse
30
+ boitho\.com\-dc
31
+ bookmark\-manager
32
+ bot
33
+ Brutus\/AET
34
+ bspider
35
+ bwh3_user_agent
36
+ celestial
37
+ cfnetwork|checkbot
38
+ checkprivacy
39
+ China\sLocal\sBrowse\s2\.6
40
+ cloakDetect
41
+ Code\sSample\sWeb\sClient
42
+ combine
43
+ commons\-httpclient
44
+ contentmatch
45
+ ContentSmartz
46
+ core
47
+ CoverScout
48
+ crawl
49
+ crawler
50
+ cursor
51
+ custo
52
+ DataCha0s\/2\.0
53
+ daumoa
54
+ Demo\sBot
55
+ docomo
56
+ Download\+Master
57
+ DSurf
58
+ dtSearchSpider
59
+ dumbot
60
+ easydl
61
+ EmailSiphon
62
+ EmailWolf
63
+ exabot
64
+ fast-webcrawler
65
+ favorg
66
+ FDM(\s|\+)1
67
+ feedburner
68
+ FeedFetcher
69
+ feedfetcher\-google
70
+ ferret
71
+ Fetch(\s|\+)API(\s|\+)Request
72
+ findlinks
73
+ Fulltext
74
+ Funnelback
75
+ gaisbot
76
+ GetRight
77
+ geturl
78
+ gigabot
79
+ girafabot
80
+ gnodspider
81
+ Goldfire(\s|\+)Server
82
+ google
83
+ grub
84
+ gulliver
85
+ harvest
86
+ heritrix
87
+ hl_ftien_spider
88
+ holmes
89
+ htdig
90
+ htmlparser
91
+ HttpComponents\/1.1
92
+ HTTPFetcher
93
+ httpget\?5\.2\.2
94
+ httpget\-5\.2\.2
95
+ httrack
96
+ ia_archiver
97
+ ichiro
98
+ iktomi
99
+ ilse
100
+ internetseer
101
+ intute
102
+ iSiloX
103
+ Jakarta\+Commons\-HttpClient
104
+ java
105
+ jeeves
106
+ jobo
107
+ kyluka
108
+ larbin
109
+ libcurl
110
+ libwww
111
+ libwww\-perl
112
+ lilina
113
+ linkbot
114
+ linkcheck
115
+ linkchecker
116
+ LinkLint-checkonly
117
+ linkscan
118
+ linkwalker
119
+ livejournal\.com
120
+ lmspider
121
+ LOCKSS
122
+ lwp
123
+ LWP\:\:Simple
124
+ lwp\-request
125
+ lwp\-tivial
126
+ lwp\-trivial
127
+ lwp-request
128
+ lycos[_+]
129
+ mail.ru
130
+ MarcEdit.5.2.Web.Client
131
+ mediapartners\-google
132
+ Mediapartners-Google
133
+ megite
134
+ Microsoft(\s|\+)URL(\s|\+)Control
135
+ milbot
136
+ mimas
137
+ mj12bot
138
+ mnogosearch
139
+ moget
140
+ mojeekbot
141
+ momspider
142
+ motor
143
+ msiecrawler
144
+ msnbot
145
+ MuscatFerre
146
+ myweb
147
+ NABOT
148
+ nagios
149
+ NaverBot
150
+ netcraft
151
+ netluchs
152
+ ng\/2\.
153
+ Ning
154
+ no_user_agent
155
+ nomad
156
+ nutch
157
+ ocelli
158
+ Offline(\s|\+)Navigator
159
+ onetszukaj
160
+ OurBrowser
161
+ parsijoo
162
+ pear.php.net
163
+ perman
164
+ PHP\/
165
+ pioneer
166
+ playmusic\.com
167
+ playstarmusic\.com
168
+ powermarks
169
+ psbot
170
+ PycURL
171
+ python
172
+ qihoobot
173
+ rambler
174
+ Readpaper
175
+ redalert|robozilla
176
+ RePEc.link.checker
177
+ robot
178
+ robots
179
+ RPT\-HTTPClient\/0.3-3E
180
+ rss
181
+ scan4mail
182
+ scientificcommons
183
+ scirus
184
+ scooter
185
+ seekbot
186
+ seznambot
187
+ shoutcast
188
+ slurp
189
+ sogou
190
+ speedy
191
+ spider
192
+ spiderman
193
+ spiderview
194
+ Strider
195
+ sunrise
196
+ superbot
197
+ surveybot
198
+ T\-H\-U\-N\-D\-E\-R\-S\-T\-O\-N\-E
199
+ tailrank
200
+ technoratibot
201
+ Teleport(\s|\+)Pro
202
+ Teoma
203
+ titan
204
+ turnitinbot
205
+ twiceler
206
+ ucsd
207
+ ultraseek
208
+ URL2File
209
+ urlaliasbuilder
210
+ urllib
211
+ validator
212
+ virus[_+]detector
213
+ voila
214
+ w3c\-checklink
215
+ Wanadoo
216
+ Web(\s|\+)Downloader
217
+ WebCloner
218
+ webcollage
219
+ WebCopier
220
+ Webinator
221
+ weblayers
222
+ Webmetrics
223
+ webmirror
224
+ webreaper
225
+ WebStripper
226
+ WebZIP
227
+ Wget
228
+ wordpress
229
+ worm
230
+ www.gnip.com
231
+ WWW\-Mechanize
232
+ xenu
233
+ Xenu(\s|\+)Link(\s|\+)Sleuth
234
+ y!j
235
+ yacy
236
+ yahoo
237
+ yandex
238
+ yodaobot
239
+ zealbot
240
+ zeus
241
+ zyborg
@@ -0,0 +1,27 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'irus_analytics/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "irus_analytics"
8
+ spec.version = IrusAnalytics::VERSION
9
+ spec.authors = ["Simon Lamb"]
10
+ spec.email = ["s.lamb@hull.ac.uk"]
11
+ spec.summary = %q{IrusAnalytics is a gem that provides a simple way to send analytics to the IRUS-UK repository aggregation service. }
12
+ spec.description = %q{More information about IRUS-UK can be found at [http://www.irus.mimas.ac.uk/](http://www.irus.mimas.ac.uk/). In summary the IRUS-UK service is designed to provide article-level usage statistics for Institutional Repositories. This gem was developed for use with a Hydra repository [http://projecthydra.org/](http://projecthydra.org/), but it can be used with any other Rails based web application. }
13
+ spec.homepage = "https://github.com/uohull/irus_analytics"
14
+ spec.license = "APACHE2"
15
+
16
+ spec.files = `git ls-files -z`.split("\x0")
17
+ spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
18
+ spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
19
+ spec.require_paths = ["lib"]
20
+
21
+ spec.add_dependency "openurl", "~> 0.5"
22
+ spec.add_dependency "resque", "~> 1.25"
23
+
24
+ spec.add_development_dependency "bundler", "~> 1.6"
25
+ spec.add_development_dependency "rake"
26
+ spec.add_development_dependency "rspec"
27
+ end
@@ -0,0 +1,31 @@
1
+ class IrusAnalyticsGenerator < Rails::Generators::Base
2
+
3
+ desc "Irus Analytics Generator"
4
+ def create_irus_analytics_initializer
5
+ initializer "irus_analytics.rb" do
6
+ <<EOF
7
+ # Configuration of IrusAnalytics
8
+ env = Rails.env.to_s
9
+
10
+ IrusAnalytics.configuration.source_repository = case env
11
+ when "development"
12
+ "your-repository.org"
13
+ when "test"
14
+ "your-repository.org"
15
+ else
16
+ "your-repository.org"
17
+ end
18
+
19
+ IrusAnalytics.configuration.irus_server_address = case env
20
+ when "development"
21
+ nil
22
+ when "test"
23
+ nil
24
+ else
25
+ "irus_server_address"
26
+ end
27
+ EOF
28
+
29
+ end
30
+ end
31
+ end
@@ -0,0 +1,89 @@
1
+ require 'logger'
2
+ require 'resque'
3
+
4
+ module IrusAnalytics
5
+ module Controller
6
+ module AnalyticsBehaviour
7
+ def send_analytics
8
+ logger = Logger.new(STDOUT) if logger.nil?
9
+ # Retrieve required params from the request
10
+ if request.nil?
11
+ logger.warn("IrusAnalytics::Controller::AnalyticsBehaviour.send_analytics exited: Request object is nil.")
12
+ else
13
+ # Should we filter this request...
14
+ unless filter_request?(request)
15
+ # Get Request data
16
+ client_ip = request.remote_ip if request.respond_to?(:remote_ip)
17
+ user_agent = request.user_agent if request.respond_to?(:user_agent)
18
+ file_url = request.url if request.respond_to?(:url)
19
+ referer = request.referer if request.respond_to?(:referer)
20
+
21
+ # Defined locally
22
+ datetime = datetime_stamp
23
+ source_repository = source_repository_name
24
+
25
+ # The following should be defined in the controller class that includes this module
26
+ identifier = self.item_identifier if self.respond_to?(:item_identifier)
27
+
28
+ analytics_params = { date_stamp: datetime, client_ip_address: client_ip, user_agent: user_agent, item_oai_identifier: identifier, file_url: file_url,
29
+ http_referer: referer, source_repository: source_repository }
30
+
31
+ if irus_server_address.nil?
32
+ # In development and test Rails environment without irus_server_address we log in debug
33
+ if rails_environment == "development" || rails_environment == "test"
34
+ logger.debug("IrusAnalytics::Controller::AnalyticsBehaviour.send_analytics params extracted #{analytics_params}")
35
+ else
36
+ logger.error("IrusAnalytics::Controller::AnalyticsBehaviour.send_analytics exited: Irus Server address is not set.")
37
+ end
38
+
39
+ else
40
+ begin
41
+ Resque.enqueue(IrusClient, irus_server_address, analytics_params)
42
+ rescue StandardError => e
43
+ logger.error("IrusAnalytics::Controller::AnalyticsBehaviour.send_analytics error: Problem enqueuing the analytics with Resque. Error: #{e}")
44
+ end
45
+ end
46
+ end
47
+ end
48
+ end
49
+
50
+ private
51
+
52
+
53
+ # Returns UTC iso8601 version of Datetime
54
+ def datetime_stamp
55
+ Time.now.utc.iso8601
56
+ end
57
+
58
+ def source_repository_name
59
+ IrusAnalytics.configuration.source_repository
60
+ end
61
+
62
+ def irus_server_address
63
+ IrusAnalytics.configuration.irus_server_address
64
+ end
65
+
66
+ def filter_request?(request)
67
+ filter_request = false
68
+ # If we can't determine the request.user_agent we should filter it...
69
+ if request.respond_to?(:user_agent)
70
+ filter_request = !request.headers['HTTP_RANGE'].nil? || robot_user_agent?(request.user_agent)
71
+ else
72
+ filter_request = true
73
+ end
74
+ filter_request
75
+ end
76
+
77
+ def robot_user_agent?(user_agent)
78
+ IrusAnalytics::UserAgentFilter.instance.filter_user_agent?(user_agent)
79
+ end
80
+
81
+ def rails_environment
82
+ unless Rails.nil?
83
+ return Rails.env.to_s
84
+ end
85
+ end
86
+
87
+ end
88
+ end
89
+ end
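One detail of `send_analytics` is worth illustrating: the line `logger = Logger.new(STDOUT) if logger.nil?` assigns to a local variable named `logger`, so Ruby resolves `logger` in the condition to that (still nil) local rather than to any `logger` method the controller provides. The sketch below uses a hypothetical `Demo` class that is not part of the gem to show the effect and one possible alternative.

```ruby
require 'logger'

# Hypothetical class for illustration only; not part of irus_analytics.
class Demo
  # Stands in for the logger a Rails controller would normally expose.
  def logger
    :controller_logger
  end

  # Mirrors the pattern used in send_analytics: the assignment creates a
  # local variable that hides the method, so the condition is always true.
  def shadowed
    logger = Logger.new(STDOUT) if logger.nil?
    logger
  end

  # One alternative: use a differently named local and call the method explicitly.
  def preserved
    log = self.logger || Logger.new(STDOUT)
    log
  end
end

demo = Demo.new
puts demo.shadowed.class # a new Logger is always created
puts demo.preserved      # the existing logger is reused
```

If reusing the application's logger is the intent, something along the lines of `preserved` may be closer to it; as written, the gem always logs to STDOUT.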
@@ -0,0 +1,54 @@
1
+ require 'openurl'
2
+
3
+ module IrusAnalytics
4
+ class IrusAnalyticsService
5
+ attr_accessor :irus_server_address
6
+
7
+ def initialize(irus_server_address)
8
+ @irus_server_address = irus_server_address
9
+ @missing_params = []
10
+ end
11
+
12
+ def send_analytics(params = {})
13
+ if @irus_server_address.to_s.empty?
14
+ raise ArgumentError, "Cannot send analytics: Missing Irus server address"
15
+ end
16
+
17
+ default_params = {date_stamp: "", client_ip_address: "", user_agent: "", item_oai_identifier: "", file_url: "", http_referer: "", source_repository: ""}
18
+ params = default_params.merge(params)
19
+
20
+ if missing_mandatory_params?(params)
21
+ raise ArgumentError, "Missing the following required params: #{@missing_params}"
22
+ end
23
+
24
+ tracker_context_object_builder = IrusAnalytics::TrackerContextObjectBuilder.new
25
+
26
+ tracker_context_object_builder.set_event_datestamp(params[:date_stamp])
27
+ tracker_context_object_builder.set_client_ip_address(params[:client_ip_address])
28
+ tracker_context_object_builder.set_user_agent(params[:user_agent])
29
+ tracker_context_object_builder.set_oai_identifier(params[:item_oai_identifier])
30
+ tracker_context_object_builder.set_file_url(params[:file_url])
31
+ tracker_context_object_builder.set_http_referer(params[:http_referer])
32
+ tracker_context_object_builder.set_source_repository(params[:source_repository])
33
+
34
+ transport = openurl_link_resolver(tracker_context_object_builder.context_object)
35
+ transport.get
36
+
37
+ if transport.code != "200"
38
+ raise "Unexpected response from IRUS server"
39
+ end
40
+
41
+ end
42
+
43
+ # At present, all the params are mandatory...
44
+ def missing_mandatory_params?(params)
45
+ @missing_params = params.select { |_key, value| value.to_s.empty? }.keys
46
+ return !@missing_params.empty?
47
+ end
48
+
49
+ def openurl_link_resolver(context_object)
50
+ OpenURL::Transport.new(@irus_server_address, context_object)
51
+ end
52
+
53
+ end
54
+ end
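The mandatory-parameter check in the service can be sketched in isolation as follows. This is a minimal standalone version, not the gem's code: the constant `MANDATORY_KEYS` and the method name `missing_mandatory_params` are hypothetical, while the key names are taken from `default_params` above.

```ruby
# Keys taken from default_params in IrusAnalyticsService#send_analytics.
MANDATORY_KEYS = [:date_stamp, :client_ip_address, :user_agent,
                  :item_oai_identifier, :file_url, :http_referer,
                  :source_repository].freeze

# Returns the mandatory keys whose values are nil or empty.
# Missing keys are filled with "" first, so they are reported too.
def missing_mandatory_params(params)
  defaults = Hash[MANDATORY_KEYS.map { |key| [key, ""] }]
  defaults.merge(params).select { |_key, value| value.to_s.empty? }.keys
end

p missing_mandatory_params(date_stamp: "2010-10-17T03:04:42Z", user_agent: nil)
```

Returning the list of missing keys, rather than mutating an instance variable, keeps repeated calls from accumulating stale entries.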
@@ -0,0 +1,22 @@
1
+ module IrusAnalytics
2
+ class IrusClient
3
+ @queue = :irus_analytics
4
+
5
+ def self.perform(irus_server_address, analytics_params)
6
+ service = IrusAnalytics::IrusAnalyticsService.new(irus_server_address)
7
+ service.send_analytics(symbolize_keys(analytics_params))
8
+ end
9
+
10
+ def self.symbolize_keys(hash)
11
+ new={}
12
+ hash.map do |key,value|
13
+ if value.is_a?(Hash)
14
+ value = symbolize_keys(value)
15
+ end
16
+ new[key.to_sym]=value
17
+ end
18
+ return new
19
+ end
20
+
21
+ end
22
+ end
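The reason `symbolize_keys` is needed can be shown without Resque or Redis. Resque serialises job arguments as JSON, so a hash enqueued with symbol keys arrives in `perform` with string keys. The sketch below simulates that round trip with the standard `json` library; the `symbolize_keys` here is a standalone rewrite for illustration, not the gem's method.

```ruby
require 'json'

# Recursively converts string keys to symbols (standalone illustration).
def symbolize_keys(hash)
  hash.each_with_object({}) do |(key, value), result|
    result[key.to_sym] = value.is_a?(Hash) ? symbolize_keys(value) : value
  end
end

original = { date_stamp: "2014-06-09T16:56:48Z", client_ip_address: "127.0.0.1" }

# Simulate what happens to the arguments between enqueue and perform.
queued = JSON.parse(JSON.generate(original))

puts queued.keys.inspect                 # keys are now strings
puts symbolize_keys(queued) == original  # symbol keys restored
```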
@@ -0,0 +1,13 @@
1
+ require "rails"
2
+
3
+ # Used to push rake tasks up to app using gem
4
+ module IrusAnalytics
5
+ class Railtie < Rails::Railtie
6
+ railtie_name :irus_analytics
7
+
8
+ rake_tasks do
9
+ load "tasks/resque.rake"
10
+ end
11
+
12
+ end
13
+ end
@@ -0,0 +1,39 @@
1
+ require "openurl"
2
+
3
+ module IrusAnalytics
4
+ class TrackerContextObjectBuilder
5
+ attr_accessor :context_object
6
+ def initialize
7
+ @context_object = OpenURL::ContextObject.new
8
+ end
9
+
10
+ def set_event_datestamp(datetime)
11
+ @context_object.admin.merge!("url_tim"=>{"label"=>"Usage event datestamp", "value"=>datetime})
12
+ end
13
+
14
+ def set_client_ip_address(ip_address)
15
+ @context_object.admin.merge!("req_id"=>{"label"=>"Client IP address", "value"=>"urn:ip:#{ip_address}"})
16
+ end
17
+
18
+ def set_user_agent(user_agent)
19
+ @context_object.admin.merge!("req_dat"=>{"label"=>"UserAgent", "value"=>user_agent})
20
+ end
21
+
22
+ def set_oai_identifier(identifier)
23
+ @context_object.referent.set_metadata("artnum", identifier)
24
+ end
25
+
26
+ def set_file_url(url)
27
+ @context_object.admin.merge!("svc_dat"=>{"label"=>"FileURL", "value"=>url})
28
+ end
29
+
30
+ def set_http_referer(referer)
31
+ @context_object.admin.merge!("rfr_dat"=>{"label"=>"HTTP referer", "value"=>referer})
32
+ end
33
+
34
+ def set_source_repository(source_repository)
35
+ @context_object.admin.merge!("rfr_id"=>{"label"=>"Source repository", "value"=>source_repository})
36
+ end
37
+
38
+ end
39
+ end
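To make the resulting query string concrete, the sketch below builds the same key/value pairs with Ruby's standard `URI.encode_www_form` instead of the `openurl` gem. It is an approximation for illustration only: the method name `tracker_query` is hypothetical, and the real `OpenURL::ContextObject#kev` output may include additional fields or a different ordering.

```ruby
require 'uri'

# Hypothetical helper: maps the analytics params onto the OpenURL keys
# set by TrackerContextObjectBuilder and URL-encodes them.
def tracker_query(params)
  URI.encode_www_form(
    "url_ver"    => "Z39.88-2004",
    "url_tim"    => params[:date_stamp],
    "req_id"     => "urn:ip:#{params[:client_ip_address]}",
    "req_dat"    => params[:user_agent],
    "rft.artnum" => params[:item_oai_identifier],
    "svc_dat"    => params[:file_url],
    "rfr_dat"    => params[:http_referer],
    "rfr_id"     => params[:source_repository]
  )
end

puts tracker_query(
  date_stamp: "2010-10-17T03:04:42Z",
  client_ip_address: "127.0.0.1",
  user_agent: "Test user agent",
  item_oai_identifier: "oai:hull.ac.uk:hull:123",
  file_url: "https://hydra.hull.ac.uk/assets/hull:123/content",
  http_referer: "http://localhost:3000",
  source_repository: "hydra.hull.ac.uk"
)
```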
@@ -0,0 +1,36 @@
1
+ require 'irus_analytics'
2
+ require 'singleton'
3
+
4
+ module IrusAnalytics
5
+ class UserAgentFilter
6
+ include Singleton
7
+
8
+ # The Singleton module defines the `instance` class method and makes `new` private...
9
+ def initialize
10
+ set_robot_agents
11
+ end
12
+
13
+ def filter_user_agent?(user_agent)
14
+ @robot_agents.each do |robot_regexp|
15
+ return true unless user_agent.match(robot_regexp).nil?
16
+ end
17
+ return false
18
+ end
19
+
20
+ def set_robot_agents
21
+ @robot_agents = get_robots_from_config
22
+ end
23
+
24
+ private
25
+
26
+ def get_robots_from_config
27
+ begin
28
+ agent_list = File.open("#{IrusAnalytics.config}/counter_robot_list.txt", "r") { |config| config.readlines.collect{|line| line.chomp }}
29
+ rescue Exception => ex
30
+ # Deal with configuration read error
31
+ agent_list = []
32
+ end
33
+ end
34
+
35
+ end
36
+ end
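The matching performed by `filter_user_agent?` can be sketched in isolation. The example below uses a small, hypothetical subset of the patterns from `counter_robot_list.txt` and compiles them once; the names `SAMPLE_PATTERNS`, `ROBOT_REGEXPS` and `robot_user_agent?` are illustrative and not part of the gem. As in the gem, patterns are treated as case-sensitive regular expressions.

```ruby
# Hypothetical subset of the robot list shipped with the gem.
SAMPLE_PATTERNS = ['^$', 'bot', 'Microsoft(\s|\+)URL(\s|\+)Control', 'Wget'].freeze

# Compile each pattern once rather than on every request.
ROBOT_REGEXPS = SAMPLE_PATTERNS.map { |pattern| Regexp.new(pattern) }.freeze

# Returns true when the user agent matches any robot pattern.
# A nil user agent is treated as an empty string, which '^$' matches.
def robot_user_agent?(user_agent)
  ua = user_agent.to_s
  ROBOT_REGEXPS.any? { |regexp| !(regexp =~ ua).nil? }
end

puts robot_user_agent?("Microsoft URL Control - 6.00.8862")
puts robot_user_agent?("Mozilla/5.0 (X11; Linux x86_64) Firefox/30.0")
```

Converting the input with `to_s` is a design choice of this sketch: it means a missing user agent is filtered rather than raising an error.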
@@ -0,0 +1,3 @@
1
+ module IrusAnalytics
2
+ VERSION = "0.0.1"
3
+ end
@@ -0,0 +1,46 @@
1
+ require "irus_analytics/version"
2
+ require "irus_analytics/controller/analytics_behaviour"
3
+ require "irus_analytics/irus_analytics_service"
4
+ require "irus_analytics/tracker_context_object_builder"
5
+ require "irus_analytics/user_agent_filter"
6
+ require "irus_analytics/irus_client"
7
+ require "irus_analytics/rail_tie" if defined?(Rails)
8
+ require "resque/server"
9
+
10
+
11
+ module IrusAnalytics
12
+ class << self
13
+ attr_writer :configuration
14
+ end
15
+
16
+ def self.configuration
17
+ @configuration ||= Configuration.new
18
+ end
19
+
20
+ def self.reset
21
+ @configuration = Configuration.new
22
+ end
23
+
24
+ def self.configure
25
+ yield(configuration)
26
+ end
27
+
28
+ def self.root
29
+ @root ||= File.expand_path(File.dirname(File.dirname(__FILE__)))
30
+ end
31
+
32
+ def self.config
33
+ File.join root, "config"
34
+ end
35
+
36
+ class Configuration
37
+ attr_accessor :source_repository, :irus_server_address
38
+
39
+ def initialize
40
+ @source_repository = "localhost:3000"
41
+ @irus_server_address = "localhost:3000/irus"
42
+ end
43
+ end
44
+
45
+ end
46
+
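The configuration pattern in this file (a lazily created `Configuration` object, a `configure` method that yields it, and a `reset`) can be tried out in isolation. The sketch below copies the shape of the module under a hypothetical name, `AnalyticsConfigDemo`, so it runs without the gem or its dependencies; the values set in the block are examples only.

```ruby
# Hypothetical stand-in for the IrusAnalytics module, for illustration only.
module AnalyticsConfigDemo
  class Configuration
    attr_accessor :source_repository, :irus_server_address

    def initialize
      @source_repository = "localhost:3000"
      @irus_server_address = "localhost:3000/irus"
    end
  end

  class << self
    attr_writer :configuration

    # Lazily builds the configuration on first access.
    def configuration
      @configuration ||= Configuration.new
    end

    # Discards any customisation and restores the defaults.
    def reset
      @configuration = Configuration.new
    end

    # Yields the configuration so several settings can be changed in one block.
    def configure
      yield(configuration)
    end
  end
end

AnalyticsConfigDemo.configure do |config|
  config.source_repository = "hydra.hull.ac.uk"
  config.irus_server_address = nil # e.g. no server in development
end

puts AnalyticsConfigDemo.configuration.source_repository
```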
@@ -0,0 +1,3 @@
1
+ require 'resque/tasks'
2
+
3
+ task 'resque:setup' => :environment
@@ -0,0 +1,74 @@
1
+ require 'spec_helper'
2
+
3
+ class TestClass
4
+ include IrusAnalytics::Controller::AnalyticsBehaviour
5
+ attr_accessor :request, :item_identifier
6
+ end
7
+
8
+ describe IrusAnalytics::Controller::AnalyticsBehaviour do
9
+
10
+ describe ".send_analytics" do
11
+ before(:each) do
12
+ @test_class = TestClass.new
13
+ @test_class.request = double("request", :remote_ip => "127.0.0.1", :user_agent => "Test user agent", url: "http://localhost:3000/test", referer: "http://localhost:3000", headers: { "HTTP_RANGE" => nil })
14
+ @test_class.item_identifier = "test:123"
15
+ end
16
+
17
+ it "will call the send_irus_analytics method with the correct params..." do
18
+ # We set the datetime stamp to ensure sync
19
+ date_time = "2014-06-09T16:56:48Z"
20
+ allow(@test_class).to receive(:datetime_stamp) .and_return(date_time)
21
+ allow(@test_class).to receive(:source_repository_name) .and_return("hydra.hull.ac.uk")
22
+ allow(@test_class).to receive(:irus_server_address) .and_return("irus-server-address.org")
23
+ allow(@test_class).to receive(:rails_environment) .and_return("production")
24
+ params = { date_stamp: date_time, client_ip_address: "127.0.0.1", user_agent: "Test user agent",item_oai_identifier: "test:123",
25
+ file_url: "http://localhost:3000/test", http_referer: "http://localhost:3000", source_repository: "hydra.hull.ac.uk" }
26
+
27
+ allow(Resque).to receive(:enqueue) .and_return(nil)
28
+ # Should NOT filter this request
29
+ expect(@test_class).to receive(:filter_request?).and_return(false)
30
+ expect(Resque).to receive(:enqueue).with(IrusAnalytics::IrusClient, "irus-server-address.org", params )
31
+
32
+ @test_class.send_analytics
33
+ end
34
+
35
+ it "will not call the send_irus_analytics method when the user agent is filtered" do
36
+ # Add a well known robot...
37
+ @test_class.request = double("request", :remote_ip => "127.0.0.1", :user_agent => "Microsoft URL Control - 6.00.8862", url: "http://localhost:3000/test", referer: "http://localhost:3000", headers: { "HTTP_RANGE" => nil })
38
+ # We set the datetime stamp to ensure sync
39
+ date_time = "2014-06-09T16:56:48Z"
40
+ allow(@test_class).to receive(:datetime_stamp) .and_return(date_time)
41
+ allow(@test_class).to receive(:source_repository_name) .and_return("hydra.hull.ac.uk")
42
+ allow(@test_class).to receive(:irus_server_address) .and_return("irus-server-address.org")
43
+ allow(@test_class).to receive(:rails_environment) .and_return("production")
44
+ params = { date_stamp: date_time, client_ip_address: "127.0.0.1", user_agent: "Microsoft URL Control - 6.00.8862" ,item_oai_identifier: "test:123",
45
+ file_url: "http://localhost:3000/test", http_referer: "http://localhost:3000", source_repository: "hydra.hull.ac.uk" }
46
+
47
+ allow(Resque).to receive(:enqueue) .and_return(nil)
48
+ # Should filter this request
49
+ expect(Resque).to_not receive(:enqueue).with(IrusAnalytics::IrusClient, "irus-server-address.org", params )
50
+ @test_class.send_analytics
51
+ end
52
+
53
+
54
+ it "will not call the send_irus_analytics method when the request is expecting a chunk of data (HTTP_RANGE downloading data)." do
55
+ # Request a byte range (partial download) with a normal user agent...
56
+ @test_class.request = double("request", :remote_ip => "127.0.0.1", :user_agent =>"Test user agent", url: "http://localhost:3000/test", referer: "http://localhost:3000", headers: { "HTTP_RANGE" => "bytes=0-65535"})
57
+ # We set the datetime stamp to ensure sync
58
+ date_time = "2014-06-09T16:56:48Z"
59
+ allow(@test_class).to receive(:datetime_stamp) .and_return(date_time)
60
+ allow(@test_class).to receive(:source_repository_name) .and_return("hydra.hull.ac.uk")
61
+ allow(@test_class).to receive(:irus_server_address) .and_return("irus-server-address.org")
62
+ allow(@test_class).to receive(:rails_environment) .and_return("production")
63
+ params = { date_stamp: date_time, client_ip_address: "127.0.0.1", user_agent: "Test user agent", item_oai_identifier: "test:123",
64
+ file_url: "http://localhost:3000/test", http_referer: "http://localhost:3000", source_repository: "hydra.hull.ac.uk" }
65
+
66
+ allow(Resque).to receive(:enqueue) .and_return(nil)
67
+ # Should filter this request
68
+ expect(Resque).to_not receive(:enqueue).with(IrusAnalytics::IrusClient, "irus-server-address.org", params )
69
+ @test_class.send_analytics
70
+ end
71
+
72
+
73
+ end
74
+ end
@@ -0,0 +1,31 @@
1
+ require 'spec_helper'
2
+
3
+ describe IrusAnalytics::IrusAnalyticsService do
4
+ let (:irus_analytics_service) { IrusAnalytics::IrusAnalyticsService.new("") }
5
+ let(:test_params) { { date_stamp: "2010-10-17T03:04:42Z", client_ip_address: "127.0.0.1", user_agent: "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405",
6
+ item_oai_identifier: "hull:123", file_url: "https://hydra.hull.ac.uk/assets/hull:123/content", http_referer: "https://www.google.co.uk/search?q=hydra+hull%3A123&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&channel=sb&gfe_rd=cr",
7
+ source_repository: "hydra.hull.ac.uk" } }
8
+
9
+ describe ".send_analytics" do
10
+
11
+ before (:each) do
12
+ # Create a double for the transport object that will return 200 OK status
13
+ transport = double("transport", :get => "", :code => "200")
14
+ allow(irus_analytics_service).to receive(:openurl_link_resolver) .and_return(transport)
15
+ end
16
+
17
+ it "will throw an exception if the irus_server_address object variable is not set" do
18
+ expect { irus_analytics_service.send_analytics(test_params) }.to raise_error
19
+ end
20
+
21
+ it "enables the required parameters to be set within a hash" do
22
+ irus_analytics_service.irus_server_address = "irus_address"
23
+ irus_analytics_service.send_analytics(test_params)
24
+ end
25
+
26
+ it "will throw an exception if any of the mandatory IRUS data is missing" do
27
+ expect { irus_analytics_service.send_analytics({}) }.to raise_error
28
+ end
29
+
30
+ end
31
+ end
@@ -0,0 +1,22 @@
1
+ require "spec_helper"
2
+
3
+ describe IrusAnalytics::IrusClient do
4
+ let(:test_params) { { date_stamp: "2010-10-17T03:04:42Z", client_ip_address: "127.0.0.1", user_agent: "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405",
5
+ item_oai_identifier: "hull:123", file_url: "https://hydra.hull.ac.uk/assets/hull:123/content", http_referer: "https://www.google.co.uk/search?q=hydra+hull%3A123&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&channel=sb&gfe_rd=cr",
6
+ source_repository: "hydra.hull.ac.uk" } }
7
+ describe ".perform" do
8
+ it "takes the irus_server_address and analytics_params and calls IrusAnalyticsService.send_irus_analytics method" do
9
+ # subject.class.perform("irus-server", test_params)
10
+ end
11
+ end
12
+
13
+ # Required due to Resque returning stringified hash keys
14
+ describe ".symbolize_keys" do
15
+ it "takes a hash that uses string keys, and returns the hash with symbol keys" do
16
+ test_hash = { "key_1" => "Value 1", "key_2" => "Value 2", "key_3" => "Value 3" }
17
+ new_hash = IrusAnalytics::IrusClient.symbolize_keys(test_hash)
18
+ expect(new_hash).to include(key_1: "Value 1", key_2: "Value 2", key_3: "Value 3")
19
+ end
20
+ end
21
+
22
+ end
@@ -0,0 +1,78 @@
1
+ require 'spec_helper'
2
+
3
+ describe IrusAnalytics::TrackerContextObjectBuilder do
4
+ describe ".initialize" do
5
+ it "will initialize an empty OpenURL::ContextObject instance" do
6
+ expect(IrusAnalytics::TrackerContextObjectBuilder.new.context_object).to be_an_instance_of OpenURL::ContextObject
7
+ end
8
+ end
9
+
10
+ context "set methods" do
11
+ let(:builder) { IrusAnalytics::TrackerContextObjectBuilder.new }
12
+
13
+ describe "OpenURL version" do
14
+ it "will be defaulted to required version for IRUS" do
15
+ expect(builder.context_object.kev).to include("url_ver=Z39.88-2004")
16
+ end
17
+ end
18
+
19
+ describe "set_event_datestamp" do
20
+ it "will set event datestamp as per IRUS specification" do
21
+ date_time = "2010-10-17T03:04:42Z"
22
+ builder.set_event_datestamp(date_time)
23
+ expect(builder.context_object.kev).to include("url_tim=2010-10-17T03%3A04%3A42Z") # URL-encoded version of the string
24
+ end
25
+ end
26
+
27
+ describe "set_client_ip_address" do
28
+ it "will set client ip address as per IRUS specification" do
29
+ ip_address = "127.0.0.1"
30
+ builder.set_client_ip_address(ip_address)
31
+         expect(builder.context_object.kev).to include("req_id=urn%3Aip%3A127.0.0.1")
+       end
+     end
+
+     describe "set_user_agent" do
+       it "will set the request UserAgent as per IRUS specification" do
+         agent = "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405"
+         builder.set_user_agent(agent)
+         expect(builder.context_object.kev).to include("req_dat=Mozilla%2F5.0+%28iPad%3B+U%3B+CPU+OS+3_2_1+like+Mac+OS+X%3B+en-us%29+AppleWebKit%2F531.21.10+%28KHTML%2C+like+Gecko%29+Mobile%2F7B405")
+       end
+     end
+
+     describe "set_oai_identifier" do
+       it "will set the item oai identifier as per IRUS specification" do
+         identifier = "oai:hull.ac.uk:hull:123"
+         builder.set_oai_identifier(identifier)
+         expect(builder.context_object.kev).to include("rft.artnum=oai%3Ahull.ac.uk%3Ahull%3A123")
+       end
+     end
+
+     describe "set_file_url" do
+       it "will set FileURL as per IRUS specification" do
+         url = "https://hydra.hull.ac.uk/assets/hull:123/content"
+         builder.set_file_url(url)
+         expect(builder.context_object.kev).to include("svc_dat=https%3A%2F%2Fhydra.hull.ac.uk%2Fassets%2Fhull%3A123%2Fcontent")
+       end
+     end
+
+     describe "set_http_referer" do
+       it "will set the HTTP referer as per IRUS specification" do
+         referer = "http://www.google.co.uk/url?sa=t&rct=j&q=http%20referer&source=web&cd=4&sqi=2&ved=0CEoQFjAD&url=http%3A%2F%2Fwww.whatismyreferer.com%2F&ei=zIBCU6fbEoOqhQf67YCwBg&usg=AFQjCNFt-KMqneTZfEb6OxjPZlD4ogiJcQ&sig2=wZJYkoWgNScNjgxRbRs29w&bvm=bv.64125504,d.ZWU"
+         builder.set_http_referer(referer)
+         expect(builder.context_object.kev).to include("rfr_dat=http%3A%2F%2Fwww.google.co.uk%2Furl%3Fsa%3Dt%26rct%3Dj%26q%3Dhttp%2520referer%26source%3Dweb%26cd%3D4%26sqi%3D2%26ved%3D0CEoQFjAD%26url%3Dhttp%253A%252F%252Fwww.whatismyreferer.com%252F%26ei%3DzIBCU6fbEoOqhQf67YCwBg%26usg%3DAFQjCNFt-KMqneTZfEb6OxjPZlD4ogiJcQ%26sig2%3DwZJYkoWgNScNjgxRbRs29w%26bvm%3Dbv.64125504%2Cd.ZWU")
+       end
+     end
+
+     describe "set_source_repository" do
+       it "will set the Source repository as per IRUS specification" do
+         src_repo = "hydra.hull.ac.uk"
+         builder.set_source_repository(src_repo)
+         expect(builder.context_object.kev).to include("rfr_id=hydra.hull.ac.uk")
+       end
+     end
+
+
+   end
+
+ end
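The expected strings in the spec above are form-encoded `key=value` pairs. As a rough illustration only, the sketch below reproduces a few of them with Ruby's standard `CGI.escape`. The `kev_pair` helper is hypothetical and is not part of irus_analytics or the openurl gem; how the builder actually encodes its values is not shown in this diff.

```ruby
require 'cgi'

# Hypothetical helper (an assumption for illustration, not the gem's API):
# builds one "key=value" pair of a key-encoded-value (KEV) string by
# form-encoding the value with CGI.escape.
def kev_pair(key, value)
  "#{key}=#{CGI.escape(value)}"
end

puts kev_pair("req_id", "urn:ip:127.0.0.1")
puts kev_pair("rft.artnum", "oai:hull.ac.uk:hull:123")
puts kev_pair("svc_dat", "https://hydra.hull.ac.uk/assets/hull:123/content")

# A value that is already percent-encoded gets encoded again ("%20" becomes
# "%2520"), which matches the rfr_dat expectation in the referer example.
puts kev_pair("rfr_dat", "q=http%20referer")
```

Under this assumption, spaces become `+` and reserved characters such as `:` and `/` become `%3A` and `%2F`, which is the shape of the `req_dat` and `svc_dat` expectations in the spec.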
@@ -0,0 +1,23 @@
+ require 'spec_helper'
+
+ describe IrusAnalytics::UserAgentFilter do
+
+   context "singleton" do
+     describe ".instance" do
+       it "should return the singleton instance of the UserAgentFilter" do
+         expect(IrusAnalytics::UserAgentFilter.instance).to be_instance_of IrusAnalytics::UserAgentFilter
+       end
+     end
+
+     describe "#filter_user_agent?" do
+       it "will return true when a user agent should be filtered" do
+         expect(IrusAnalytics::UserAgentFilter.instance.filter_user_agent?("appie")).to be true
+       end
+
+       it "will return false when a user agent is valid and should not be filtered" do
+         expect(IrusAnalytics::UserAgentFilter.instance.filter_user_agent?("Firefox 3.0")).to be false
+       end
+     end
+
+   end
+ end
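For readers without the library source at hand, the behaviour this spec exercises can be sketched as a singleton that matches a user agent against a list of robot patterns. This is a minimal sketch under stated assumptions, not the gem's implementation: `SketchUserAgentFilter` and its hard-coded `ROBOT_PATTERNS` are hypothetical stand-ins, and the real filter presumably reads its patterns from `config/counter_robot_list.txt`.

```ruby
require 'singleton'

# Hypothetical sketch, not IrusAnalytics::UserAgentFilter itself.
class SketchUserAgentFilter
  include Singleton

  # Assumed sample patterns; "appie" comes from the spec above, "googlebot"
  # is made up for illustration.
  ROBOT_PATTERNS = %w[appie googlebot].map do |pattern|
    Regexp.new(pattern, Regexp::IGNORECASE)
  end.freeze

  # Returns true when the user agent matches any robot pattern,
  # false otherwise (including for nil input).
  def filter_user_agent?(user_agent)
    ROBOT_PATTERNS.any? { |pattern| pattern =~ user_agent.to_s }
  end
end

puts SketchUserAgentFilter.instance.filter_user_agent?("appie")
puts SketchUserAgentFilter.instance.filter_user_agent?("Firefox 3.0")
```

Including `Singleton` makes `.instance` always return the same object, so a pattern list loaded from a file would only need to be read once per process.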
@@ -0,0 +1,6 @@
+ require 'rspec'
+ require 'irus_analytics'
+
+ # RSpec.configure do |c|
+ #   c.mock_with :rspec
+ # end
metadata ADDED
@@ -0,0 +1,151 @@
+ --- !ruby/object:Gem::Specification
+ name: irus_analytics
+ version: !ruby/object:Gem::Version
+   version: 0.0.1
+ platform: ruby
+ authors:
+ - Simon Lamb
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2014-08-22 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: openurl
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: '0.5'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: '0.5'
+ - !ruby/object:Gem::Dependency
+   name: resque
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: '1.25'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: '1.25'
+ - !ruby/object:Gem::Dependency
+   name: bundler
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: '1.6'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: '1.6'
+ - !ruby/object:Gem::Dependency
+   name: rake
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ! '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ! '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rspec
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ! '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ! '>='
+       - !ruby/object:Gem::Version
+         version: '0'
+ description: ! 'More information about IRUS-UK can be found at [http://www.irus.mimas.ac.uk/](http://www.irus.mimas.ac.uk/). In
+   summary, the IRUS-UK service is designed to provide article-level usage statistics
+   for Institutional Repositories. This gem was developed for use with a Hydra repository
+   [http://projecthydra.org/](http://projecthydra.org/), but it can be used with any
+   other Rails-based web application. '
+ email:
+ - s.lamb@hull.ac.uk
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - .gitignore
+ - .ruby-gemset
+ - .ruby-version
+ - .travis.yml
+ - Gemfile
+ - LICENSE.txt
+ - README.md
+ - Rakefile
+ - config/counter_robot_list.txt
+ - irus_analytics.gemspec
+ - lib/generators/irus_analytics_generator.rb
+ - lib/irus_analytics.rb
+ - lib/irus_analytics/controller/analytics_behaviour.rb
+ - lib/irus_analytics/irus_analytics_service.rb
+ - lib/irus_analytics/irus_client.rb
+ - lib/irus_analytics/rail_tie.rb
+ - lib/irus_analytics/tracker_context_object_builder.rb
+ - lib/irus_analytics/user_agent_filter.rb
+ - lib/irus_analytics/version.rb
+ - lib/tasks/resque.rake
+ - spec/lib/irus_analytics/controller/analytics_behaviour_spec.rb
+ - spec/lib/irus_analytics/irus_analytics_service_spec.rb
+ - spec/lib/irus_analytics/irus_client_spec.rb
+ - spec/lib/irus_analytics/tracker_context_object_builder_spec.rb
+ - spec/lib/irus_analytics/user_agent_filter_spec.rb
+ - spec/spec_helper.rb
+ homepage: https://github.com/uohull/irus_analytics
+ licenses:
+ - APACHE2
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ! '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ! '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.2.2
+ signing_key:
+ specification_version: 4
+ summary: IrusAnalytics is a gem that provides a simple way to send analytics to the
+   IRUS-UK repository aggregation service.
+ test_files:
+ - spec/lib/irus_analytics/controller/analytics_behaviour_spec.rb
+ - spec/lib/irus_analytics/irus_analytics_service_spec.rb
+ - spec/lib/irus_analytics/irus_client_spec.rb
+ - spec/lib/irus_analytics/tracker_context_object_builder_spec.rb
+ - spec/lib/irus_analytics/user_agent_filter_spec.rb
+ - spec/spec_helper.rb
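The dependency entries in the metadata above use two kinds of version constraint: the pessimistic operator `~>` (openurl, resque, bundler) and the open-ended `>= 0` (rake, rspec). The following sketch uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes to show which versions each constraint accepts. The candidate version numbers are arbitrary examples, not releases known to exist.

```ruby
require 'rubygems'

# "~> 0.5" accepts 0.5 and later, but stops before the next major version.
openurl = Gem::Requirement.new("~> 0.5")
puts openurl.satisfied_by?(Gem::Version.new("0.5.3"))
puts openurl.satisfied_by?(Gem::Version.new("0.9"))
puts openurl.satisfied_by?(Gem::Version.new("1.0"))

# "~> 1.25" accepts 1.25 up to, but not including, 2.0.
resque = Gem::Requirement.new("~> 1.25")
puts resque.satisfied_by?(Gem::Version.new("1.26.0"))
puts resque.satisfied_by?(Gem::Version.new("2.0"))

# ">= 0" places no upper bound on the version at all.
any_version = Gem::Requirement.new(">= 0")
puts any_version.satisfied_by?(Gem::Version.new("10.3.2"))
```

The unbounded `>= 0` constraints on rake and rspec mean that a future major release of either would still satisfy the gemspec, which is worth keeping in mind when reproducing the development environment.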