irus_analytics 0.0.1

checksums.yaml ADDED
@@ -0,0 +1,15 @@
1
+ ---
2
+ !binary "U0hBMQ==":
3
+ metadata.gz: !binary |-
4
+ M2VkNzliNDBkYWRlZmVmNmMzZDc3ZTUzZDc0YTk1ZWE2YmUyZTUyZQ==
5
+ data.tar.gz: !binary |-
6
+ OTgwYjQyOWVjN2Y5NWMzZjQ0MGViMjA2NGU0M2FlNTE4YWVkMTk0OQ==
7
+ SHA512:
8
+ metadata.gz: !binary |-
9
+ OTVkZTBjMWY3ZWUxMTFmMDhhMzk5MDQ3NjliODhkMTljZTRkNTFkMjg3OTZm
10
+ YWUzMTY3NDE0YWE2ZjVjMWEyNzVjMzkyMGNhYzQ2NzE1NmYyOTExYTM2Y2I0
11
+ ODUxMGQ2NGQ4OTdjODNhMTM3NDc4OWVjZGNkNWZhZjZmYmU4NGQ=
12
+ data.tar.gz: !binary |-
13
+ NDVjMGM0YTY5NGI4MmUxZjI0NzUxOWU0NDZkN2QwNzE4NDVlYjY0NjkwZjQ1
14
+ ZWQ2YmZhZmNlMjczM2I4ZTcyZjU1ZmZmYjIwNzYzMWQ3ZjQ1OTU5ZmNkMzA3
15
+ NjJmNzc1N2I0Mjc2NTEwOTdjYWM2YzEwYTU4ZGIwMzlhMzUyYmU=
data/.gitignore ADDED
@@ -0,0 +1,22 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
18
+ *.bundle
19
+ *.so
20
+ *.o
21
+ *.a
22
+ mkmf.log
data/.ruby-gemset ADDED
@@ -0,0 +1 @@
1
+ irus_analytics
data/.ruby-version ADDED
@@ -0,0 +1 @@
1
+ ruby-1.9.3-p545
data/.travis.yml ADDED
@@ -0,0 +1,4 @@
1
+ language: ruby
2
+ rvm:
3
+ - 1.9.3
4
+ - 2.1.0
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in irus_analytics.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,15 @@
1
+ Copyright 2014 The University of Hull
2
+
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
6
+
7
+ http://www.apache.org/licenses/LICENSE-2.0
8
+
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
14
+
15
+ Additional copyright may be held by others, as reflected in the commit history.
data/README.md ADDED
@@ -0,0 +1,92 @@
1
+ # IrusAnalytics
2
+
3
+ IrusAnalytics is a gem that provides a simple way to send analytics to the IRUS-UK repository aggregation service.
4
+
5
+ More information about IRUS-UK can be found at [http://www.irus.mimas.ac.uk/](http://www.irus.mimas.ac.uk/). In summary, the IRUS-UK service is designed to provide article-level usage statistics for Institutional Repositories. To sign up for and use IRUS-UK, please see the above link.
6
+
7
+ This gem was developed for use with a Hydra repository [http://projecthydra.org/](http://projecthydra.org/), but it can be used with any other Rails-based web application.
8
+
9
+ # Build Status
10
+ ![Build Status](https://api.travis-ci.org/uohull/irus_analytics.png?branch=master)
11
+
12
+ ## Installation
13
+
14
+ Add this line to your application's Gemfile:
15
+
16
+ gem 'irus_analytics'
17
+
18
+ And then execute:
19
+
20
+ $ bundle
21
+
22
+ Or install it yourself as:
23
+
24
+ $ gem install irus_analytics
25
+
26
+ ## Usage
27
+
28
+ Once you have the gem, run the following generator:
29
+
30
+ $ rails g irus_analytics
31
+
32
+ This will generate a configuration file at config/initializers/irus_analytics.rb
33
+
34
+ **IrusAnalytics.configuration.source\_repository** configures the name of the source repository (i.e. the URL of your repository).
35
+ **IrusAnalytics.configuration.irus\_server\_address** defines the IRUS-UK server endpoint; it can be pointed at the test version of the service.
36
+
37
+ The IrusAnalytics code is designed to be called after a download event has happened in your Rails application. The following code needs adding to the Rails controller that handles the content download.
38
+
39
+ A simple example...
40
+
41
+ class YourDownloadController < ApplicationController
42
+ # You need to include the IrusAnalytics behaviour module
43
+ include IrusAnalytics::Controller::AnalyticsBehaviour
44
+
45
+ after_filter :send_analytics, only: [:show]
46
+
47
+ def show
48
+ @id = params[:id]
49
+ # Your code
50
+ end
51
+
52
+ protected
53
+
54
+ # You need to define this method for IrusAnalytics to use as the identifier (typically a valid OAI identifier)
55
+ def item_identifier
56
+ @id
57
+ end
58
+ end
59
+
60
+ Therefore in summary...
61
+
62
+ include IrusAnalytics::Controller::AnalyticsBehaviour
63
+
64
+ after_filter :send_analytics, only: [:show]
65
+
66
+ def item_identifier
67
+ end
68
+
69
+ ... needs adding to the relevant controller.
70
+
71
+ To comply with the IRUS-UK client requirements and recommendations, this gem uses Resque [https://github.com/resque/resque](https://github.com/resque/resque). Resque provides a simple way to create background jobs within your Ruby application, and this gem uses it to push the analytics posts onto a queue. This means the download functionality within your application is unaffected by the send analytics call, and analytics can be queued if the IRUS-UK server is down.
72
+
73
+ Note: Resque requires Redis to be installed
74
+
75
+ By installing this gem, your application should have access to the Resque rake tasks. These can be seen by running `rake -T`; the list should include:
76
+
77
+ rake resque:failures:sort
78
+ rake resque:work
79
+ rake resque:workers
80
+
81
+ To start the Resque worker for this gem's queue, use:
82
+
83
+ QUEUE=irus_analytics rake environment resque:work
84
+
85
+
86
+ ## Contributing
87
+
88
+ 1. Fork it ( https://github.com/uohull/irus_analytics/fork )
89
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
90
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
91
+ 4. Push to the branch (`git push origin my-new-feature`)
92
+ 5. Create a new Pull Request
data/Rakefile ADDED
@@ -0,0 +1,12 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+ require "resque/tasks"
4
+
5
+ # Default directory to look in is `spec/`
6
+ # Run with `rake spec`
7
+ RSpec::Core::RakeTask.new(:spec) do |task|
8
+ task.rspec_opts = ['--color']
9
+ end
10
+
11
+ task :default => :spec
12
+
@@ -0,0 +1,241 @@
1
+ [^a]fish
2
+ [+:,\.\;\/\\-]bot
3
+ ^$
4
+ ^IDA$
5
+ ^ruby$
6
+ ^voyager\/
7
+ acme\.spider
8
+ alexa
9
+ Alexandria(\s|\+)prototype(\s|\+)project
10
+ AllenTrack
11
+ almaden
12
+ appie
13
+ Arachmo
14
+ architext
15
+ archive\.org_bot
16
+ arks
17
+ asterias
18
+ atomz
19
+ autoemailspider
20
+ awbot
21
+ baiduspider
22
+ bbot
23
+ BDFetch
24
+ biadu
25
+ biglotron
26
+ bjaaland
27
+ blaiz\-bee
28
+ bloglines
29
+ blogpulse
30
+ boitho\.com\-dc
31
+ bookmark\-manager
32
+ bot
33
+ Brutus\/AET
34
+ bspider
35
+ bwh3_user_agent
36
+ celestial
37
+ cfnetwork|checkbot
38
+ checkprivacy
39
+ China\sLocal\sBrowse\s2\.6
40
+ cloakDetect
41
+ Code\sSample\sWeb\sClient
42
+ combine
43
+ commons\-httpclient
44
+ contentmatch
45
+ ContentSmartz
46
+ core
47
+ CoverScout
48
+ crawl
49
+ crawler
50
+ cursor
51
+ custo
52
+ DataCha0s\/2\.0
53
+ daumoa
54
+ Demo\sBot
55
+ docomo
56
+ Download\+Master
57
+ DSurf
58
+ dtSearchSpider
59
+ dumbot
60
+ easydl
61
+ EmailSiphon
62
+ EmailWolf
63
+ exabot
64
+ fast-webcrawler
65
+ favorg
66
+ FDM(\s|\+)1
67
+ feedburner
68
+ FeedFetcher
69
+ feedfetcher\-google
70
+ ferret
71
+ Fetch(\s|\+)API(\s|\+)Request
72
+ findlinks
73
+ Fulltext
74
+ Funnelback
75
+ gaisbot
76
+ GetRight
77
+ geturl
78
+ gigabot
79
+ girafabot
80
+ gnodspider
81
+ Goldfire(\s|\+)Server
82
+ google
83
+ grub
84
+ gulliver
85
+ harvest
86
+ heritrix
87
+ hl_ftien_spider
88
+ holmes
89
+ htdig
90
+ htmlparser
91
+ HttpComponents\/1.1
92
+ HTTPFetcher
93
+ httpget\?5\.2\.2
94
+ httpget\-5\.2\.2
95
+ httrack
96
+ ia_archiver
97
+ ichiro
98
+ iktomi
99
+ ilse
100
+ internetseer
101
+ intute
102
+ iSiloX
103
+ Jakarta\+Commons\-HttpClient
104
+ java
105
+ jeeves
106
+ jobo
107
+ kyluka
108
+ larbin
109
+ libcurl
110
+ libwww
111
+ libwww\-perl
112
+ lilina
113
+ linkbot
114
+ linkcheck
115
+ linkchecker
116
+ LinkLint-checkonly
117
+ linkscan
118
+ linkwalker
119
+ livejournal\.com
120
+ lmspider
121
+ LOCKSS
122
+ lwp
123
+ LWP\:\:Simple
124
+ lwp\-request
125
+ lwp\-tivial
126
+ lwp\-trivial
127
+ lwp-request
128
+ lycos[_+]
129
+ mail.ru
130
+ MarcEdit.5.2.Web.Client
131
+ mediapartners\-google
132
+ Mediapartners-Google
133
+ megite
134
+ Microsoft(\s|\+)URL(\s|\+)Control
135
+ milbot
136
+ mimas
137
+ mj12bot
138
+ mnogosearch
139
+ moget
140
+ mojeekbot
141
+ momspider
142
+ motor
143
+ msiecrawler
144
+ msnbot
145
+ MuscatFerre
146
+ myweb
147
+ NABOT
148
+ nagios
149
+ NaverBot
150
+ netcraft
151
+ netluchs
152
+ ng\/2\.
153
+ Ning
154
+ no_user_agent
155
+ nomad
156
+ nutch
157
+ ocelli
158
+ Offline(\s|\+)Navigator
159
+ onetszukaj
160
+ OurBrowser
161
+ parsijoo
162
+ pear.php.net
163
+ perman
164
+ PHP\/
165
+ pioneer
166
+ playmusic\.com
167
+ playstarmusic\.com
168
+ powermarks
169
+ psbot
170
+ PycURL
171
+ python
172
+ qihoobot
173
+ rambler
174
+ Readpaper
175
+ redalert|robozilla
176
+ RePEc.link.checker
177
+ robot
178
+ robots
179
+ RPT\-HTTPClient\/0.3-3E
180
+ rss
181
+ scan4mail
182
+ scientificcommons
183
+ scirus
184
+ scooter
185
+ seekbot
186
+ seznambot
187
+ shoutcast
188
+ slurp
189
+ sogou
190
+ speedy
191
+ spider
192
+ spiderman
193
+ spiderview
194
+ Strider
195
+ sunrise
196
+ superbot
197
+ surveybot
198
+ T\-H\-U\-N\-D\-E\-R\-S\-T\-O\-N\-E
199
+ tailrank
200
+ technoratibot
201
+ Teleport(\s|\+)Pro
202
+ Teoma
203
+ titan
204
+ turnitinbot
205
+ twiceler
206
+ ucsd
207
+ ultraseek
208
+ URL2File
209
+ urlaliasbuilder
210
+ urllib
211
+ validator
212
+ virus[_+]detector
213
+ voila
214
+ w3c\-checklink
215
+ Wanadoo
216
+ Web(\s|\+)Downloader
217
+ WebCloner
218
+ webcollage
219
+ WebCopier
220
+ Webinator
221
+ weblayers
222
+ Webmetrics
223
+ webmirror
224
+ webreaper
225
+ WebStripper
226
+ WebZIP
227
+ Wget
228
+ wordpress
229
+ worm
230
+ www.gnip.com
231
+ WWW\-Mechanize
232
+ xenu
233
+ Xenu(\s|\+)Link(\s|\+)Sleuth
234
+ y!j
235
+ yacy
236
+ yahoo
237
+ yandex
238
+ yodaobot
239
+ zealbot
240
+ zeus
241
+ zyborg
@@ -0,0 +1,27 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'irus_analytics/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "irus_analytics"
8
+ spec.version = IrusAnalytics::VERSION
9
+ spec.authors = ["Simon Lamb"]
10
+ spec.email = ["s.lamb@hull.ac.uk"]
11
+ spec.summary = %q{IrusAnalytics is a gem that provides a simple way to send analytics to the IRUS-UK repository aggregation service.}
12
+ spec.description = %q{More information about IRUS-UK can be found at [http://www.irus.mimas.ac.uk/](http://www.irus.mimas.ac.uk/). In summary, the IRUS-UK service is designed to provide article-level usage statistics for Institutional Repositories. This gem was developed for use with a Hydra repository [http://projecthydra.org/](http://projecthydra.org/), but it can be used with any other Rails-based web application.}
13
+ spec.homepage = "https://github.com/uohull/irus_analytics"
14
+ spec.license = "Apache-2.0"
15
+
16
+ spec.files = `git ls-files -z`.split("\x0")
17
+ spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
18
+ spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
19
+ spec.require_paths = ["lib"]
20
+
21
+ spec.add_dependency "openurl", "~> 0.5"
22
+ spec.add_dependency "resque", "~> 1.25"
23
+
24
+ spec.add_development_dependency "bundler", "~> 1.6"
25
+ spec.add_development_dependency "rake"
26
+ spec.add_development_dependency "rspec"
27
+ end
@@ -0,0 +1,31 @@
1
+ class IrusAnalyticsGenerator < Rails::Generators::Base
2
+
3
+ desc "Irus Analytics Generator"
4
+ def create_irus_analytics_initializer
5
+ initializer "irus_analytics.rb" do
6
+ <<EOF
7
+ # Configuration of IrusAnalytics
8
+ env = Rails.env.to_s
9
+
10
+ IrusAnalytics.configuration.source_repository = case env
11
+ when "development"
12
+ "your-repository.org"
13
+ when "test"
14
+ "your-repository.org"
15
+ else
16
+ "your-repository.org"
17
+ end
18
+
19
+ IrusAnalytics.configuration.irus_server_address = case env
20
+ when "development"
21
+ nil
22
+ when "test"
23
+ nil
24
+ else
25
+ "irus_server_address"
26
+ end
27
+ EOF
28
+
29
+ end
30
+ end
31
+ end
@@ -0,0 +1,89 @@
1
+ require 'logger'
2
+ require 'resque'
+ require 'time' # Time#iso8601, used by datetime_stamp
3
+
4
+ module IrusAnalytics
5
+ module Controller
6
+ module AnalyticsBehaviour
7
+ def send_analytics
8
+ logger = (respond_to?(:logger) && self.logger) || Logger.new(STDOUT)
9
+ # Retrieve required params from the request
10
+ if request.nil?
11
+ logger.warn("IrusAnalytics::Controller::AnalyticsBehaviour.send_analytics exited: Request object is nil.")
12
+ else
13
+ # Should we filter this request...
14
+ unless filter_request?(request)
15
+ # Get Request data
16
+ client_ip = request.remote_ip if request.respond_to?(:remote_ip)
17
+ user_agent = request.user_agent if request.respond_to?(:user_agent)
18
+ file_url = request.url if request.respond_to?(:url)
19
+ referer = request.referer if request.respond_to?(:referer)
20
+
21
+ # Defined locally
22
+ datetime = datetime_stamp
23
+ source_repository = source_repository_name
24
+
25
+ # The following should be defined in the controller class that includes this module
26
+ identifier = self.item_identifier if self.respond_to?(:item_identifier)
27
+
28
+ analytics_params = { date_stamp: datetime, client_ip_address: client_ip, user_agent: user_agent, item_oai_identifier: identifier, file_url: file_url,
29
+ http_referer: referer, source_repository: source_repository }
30
+
31
+ if irus_server_address.nil?
32
+ # In development and test Rails environment without irus_server_address we log in debug
33
+ if rails_environment == "development" || rails_environment == "test"
34
+ logger.debug("IrusAnalytics::Controller::AnalyticsBehaviour.send_analytics params extracted #{analytics_params}")
35
+ else
36
+ logger.error("IrusAnalytics::Controller::AnalyticsBehaviour.send_analytics exited: Irus Server address is not set.")
37
+ end
38
+
39
+ else
40
+ begin
41
+ Resque.enqueue(IrusClient, irus_server_address, analytics_params)
42
+ rescue StandardError => e
43
+ logger.error("IrusAnalytics::Controller::AnalyticsBehaviour.send_analytics error: Problem enqueuing the analytics with Resque. Error: #{e}")
44
+ end
45
+ end
46
+ end
47
+ end
48
+ end
49
+
50
+ private
51
+
52
+
53
+ # Returns UTC iso8601 version of Datetime
54
+ def datetime_stamp
55
+ Time.now.utc.iso8601
56
+ end
57
+
58
+ def source_repository_name
59
+ IrusAnalytics.configuration.source_repository
60
+ end
61
+
62
+ def irus_server_address
63
+ IrusAnalytics.configuration.irus_server_address
64
+ end
65
+
66
+ def filter_request?(request)
67
+ filter_request = false
68
+ # If we can't determine the request.user_agent we should filter it...
69
+ if request.respond_to?(:user_agent)
70
+ filter_request = !request.headers['HTTP_RANGE'].nil? || robot_user_agent?(request.user_agent)
71
+ else
72
+ filter_request = true
73
+ end
74
+ filter_request
75
+ end
76
+
77
+ def robot_user_agent?(user_agent)
78
+ IrusAnalytics::UserAgentFilter.instance.filter_user_agent?(user_agent)
79
+ end
80
+
81
+ def rails_environment
82
+ if defined?(Rails)
83
+ return Rails.env.to_s
84
+ end
85
+ end
86
+
87
+ end
88
+ end
89
+ end
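The request filter in the module above skips reporting in three cases: the request exposes no user agent, it carries an `HTTP_RANGE` header (a partial fetch), or the agent looks like a robot. A minimal sketch of that decision outside Rails follows. `FakeRequest`, `LOOKS_LIKE_ROBOT`, and `skip_analytics?` are hypothetical names introduced for illustration, and the robot check is a deliberately crude stand-in for the gem's pattern list.

```ruby
# Hypothetical stand-in for a Rails request: only the fields the check reads.
FakeRequest = Struct.new(:user_agent, :headers)

# Assumed, simplified robot test; the gem matches against a list of patterns.
LOOKS_LIKE_ROBOT = ->(agent) { agent.empty? || agent.include?('bot') }

# Returns true when a download should NOT be reported.
def skip_analytics?(request, robot_check = LOOKS_LIKE_ROBOT)
  # No way to inspect the user agent: err on the side of not reporting.
  return true unless request.respond_to?(:user_agent)
  # A Range header signals a partial fetch, such as a chunked or resumed download.
  return true unless request.headers['HTTP_RANGE'].nil?

  robot_check.call(request.user_agent.to_s) ? true : false
end

puts skip_analytics?(FakeRequest.new('Mozilla/5.0', {}))
puts skip_analytics?(FakeRequest.new('Mozilla/5.0', { 'HTTP_RANGE' => 'bytes=0-65535' }))
```

Checking the Range header before the agent keeps the cheap test first; the order does not change the result.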
@@ -0,0 +1,54 @@
1
+ require 'openurl'
2
+
3
+ module IrusAnalytics
4
+ class IrusAnalyticsService
5
+ attr_accessor :irus_server_address
6
+
7
+ def initialize(irus_server_address)
8
+ @irus_server_address = irus_server_address
9
+ @missing_params = []
10
+ end
11
+
12
+ def send_analytics(params = {})
13
+ if @irus_server_address.to_s.empty?
14
+ raise ArgumentError, "Cannot send analytics: Missing Irus server address"
15
+ end
16
+
17
+ default_params = {date_stamp: "", client_ip_address: "", user_agent: "", item_oai_identifier: "", file_url: "", http_referer: "", source_repository: ""}
18
+ params = default_params.merge(params)
19
+
20
+ if missing_mandatory_params?(params)
21
+ raise ArgumentError, "Missing the following required params: #{@missing_params}"
22
+ end
23
+
24
+ tracker_context_object_builder = IrusAnalytics::TrackerContextObjectBuilder.new
25
+
26
+ tracker_context_object_builder.set_event_datestamp(params[:date_stamp])
27
+ tracker_context_object_builder.set_client_ip_address(params[:client_ip_address])
28
+ tracker_context_object_builder.set_user_agent(params[:user_agent])
29
+ tracker_context_object_builder.set_oai_identifier(params[:item_oai_identifier])
30
+ tracker_context_object_builder.set_file_url(params[:file_url])
31
+ tracker_context_object_builder.set_http_referer(params[:http_referer])
32
+ tracker_context_object_builder.set_source_repository(params[:source_repository])
33
+
34
+ transport = openurl_link_resolver(tracker_context_object_builder.context_object)
35
+ transport.get
36
+
37
+ if transport.code != "200"
38
+ raise "Unexpected response from IRUS server"
39
+ end
40
+
41
+ end
42
+
43
+ # At present, all the params are mandatory...
44
+ def missing_mandatory_params?(params)
45
+ params.each_pair { |key,value| @missing_params << key if value.to_s.empty? }
46
+ return !@missing_params.empty?
47
+ end
48
+
49
+ def openurl_link_resolver(context_object)
50
+ OpenURL::Transport.new(@irus_server_address, context_object)
51
+ end
52
+
53
+ end
54
+ end
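The mandatory-parameter check in the service above can also be written without instance state, which avoids the list of missing keys growing across repeated calls on the same object. The following is a sketch under that assumption, not the gem's implementation; `REQUIRED_KEYS`, `missing_params`, and `validate!` are hypothetical names.

```ruby
# The seven keys the service treats as mandatory (taken from its default params).
REQUIRED_KEYS = [:date_stamp, :client_ip_address, :user_agent, :item_oai_identifier,
                 :file_url, :http_referer, :source_repository].freeze

# Returns the required keys whose values are nil or empty.
def missing_params(params)
  REQUIRED_KEYS.select { |key| params[key].to_s.empty? }
end

# Raises ArgumentError naming every missing key; returns true otherwise.
def validate!(params)
  missing = missing_params(params)
  raise ArgumentError, "Missing the following required params: #{missing}" unless missing.empty?

  true
end

begin
  validate!(date_stamp: '2010-10-17T03:04:42Z')
rescue ArgumentError => e
  puts e.message
end
```

Because the list is rebuilt on every call, a second call with complete params passes even if an earlier call failed.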
@@ -0,0 +1,22 @@
1
+ module IrusAnalytics
2
+ class IrusClient
3
+ @queue = :irus_analytics
4
+
5
+ def self.perform(irus_server_address, analytics_params)
6
+ service = IrusAnalytics::IrusAnalyticsService.new(irus_server_address)
7
+ service.send_analytics(symbolize_keys(analytics_params))
8
+ end
9
+
10
+ def self.symbolize_keys(hash)
11
+ new={}
12
+ hash.each do |key, value|
13
+ if value.is_a?(Hash)
14
+ value = symbolize_keys(value)
15
+ end
16
+ new[key.to_sym]=value
17
+ end
18
+ return new
19
+ end
20
+
21
+ end
22
+ end
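The `symbolize_keys` helper above exists because job arguments come back from the queue with string keys. Resque stores job payloads as JSON, so a plain JSON round trip is a reasonable way to reproduce the effect without a running Redis. The sketch below assumes exactly that; `deep_symbolize` is a hypothetical name for an equivalent of the helper.

```ruby
require 'json'

# Recursively converts string keys to symbols, descending into nested hashes.
def deep_symbolize(hash)
  hash.each_with_object({}) do |(key, value), result|
    result[key.to_sym] = value.is_a?(Hash) ? deep_symbolize(value) : value
  end
end

# Simulate what a worker receives: symbol keys do not survive serialisation.
original = { user_agent: 'Test user agent', nested: { file_url: 'http://localhost:3000/test' } }
received = JSON.parse(JSON.generate(original))

puts received.keys.inspect
puts deep_symbolize(received) == original
```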
@@ -0,0 +1,13 @@
1
+ require "rails"
2
+
3
+ # Exposes the gem's rake tasks to the application that uses the gem
4
+ module IrusAnalytics
5
+ class Railtie < Rails::Railtie
6
+ railtie_name :irus_analytics
7
+
8
+ rake_tasks do
9
+ load "tasks/resque.rake"
10
+ end
11
+
12
+ end
13
+ end
@@ -0,0 +1,39 @@
1
+ require "openurl"
2
+
3
+ module IrusAnalytics
4
+ class TrackerContextObjectBuilder
5
+ attr_accessor :context_object
6
+ def initialize
7
+ @context_object = OpenURL::ContextObject.new
8
+ end
9
+
10
+ def set_event_datestamp(datetime)
11
+ @context_object.admin.merge!("url_tim"=>{"label"=>"Usage event datestamp", "value"=>datetime})
12
+ end
13
+
14
+ def set_client_ip_address(ip_address)
15
+ @context_object.admin.merge!("req_id"=>{"label"=>"Client IP address", "value"=>"urn:ip:#{ip_address}"})
16
+ end
17
+
18
+ def set_user_agent(user_agent)
19
+ @context_object.admin.merge!("req_dat"=>{"label"=>"UserAgent", "value"=>user_agent})
20
+ end
21
+
22
+ def set_oai_identifier(identifier)
23
+ @context_object.referent.set_metadata("artnum", identifier)
24
+ end
25
+
26
+ def set_file_url(url)
27
+ @context_object.admin.merge!("svc_dat"=>{"label"=>"FileURL", "value"=>url})
28
+ end
29
+
30
+ def set_http_referer(referer)
31
+ @context_object.admin.merge!("rfr_dat"=>{"label"=>"HTTP referer", "value"=>referer})
32
+ end
33
+
34
+ def set_source_repository(source_repository)
35
+ @context_object.admin.merge!("rfr_id"=>{"label"=>"Source repository", "value"=>source_repository})
36
+ end
37
+
38
+ end
39
+ end
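The builder above collects key/value pairs that the `openurl` gem later serialises into a key-encoded-value (KEV) query string. The sketch below does not use that gem; it only illustrates, with Ruby's standard `URI.encode_www_form`, the percent-encoded `key=value` shape that the package's specs assert on. `to_kev` is a hypothetical helper name.

```ruby
require 'uri'

# Serialises a flat hash into an application/x-www-form-urlencoded string.
def to_kev(fields)
  URI.encode_www_form(fields)
end

fields = {
  'url_tim' => '2010-10-17T03:04:42Z', # usage event datestamp
  'req_id'  => 'urn:ip:127.0.0.1',     # client IP address
  'rfr_id'  => 'hydra.hull.ac.uk'      # source repository
}

puts to_kev(fields)
```

Colons become `%3A` and spaces become `+`, which is why the expected strings in the specs look different from the values passed to the setters.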
@@ -0,0 +1,36 @@
1
+ require 'irus_analytics'
2
+ require 'singleton'
3
+
4
+ module IrusAnalytics
5
+ class UserAgentFilter
6
+ include Singleton
7
+
8
+ # The Singleton module defines an `instance` class method and makes `new` private...
9
+ def initialize
10
+ set_robot_agents
11
+ end
12
+
13
+ def filter_user_agent?(user_agent)
14
+ @robot_agents.each do |robot_regexp|
15
+ return true unless user_agent.match(robot_regexp).nil?
16
+ end
17
+ return false
18
+ end
19
+
20
+ def set_robot_agents
21
+ @robot_agents = get_robots_from_config
22
+ end
23
+
24
+ private
25
+
26
+ def get_robots_from_config
27
+ begin
28
+ agent_list = File.open("#{IrusAnalytics.config}/counter_robot_list.txt", "r") { |config| config.readlines.collect{|line| line.chomp }}
29
+ rescue StandardError => ex
30
+ # Deal with configuration read error
31
+ agent_list = []
32
+ end
33
+ end
34
+
35
+ end
36
+ end
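The filter above reads one pattern per line and matches each against the user agent. A self-contained sketch of the same idea follows, compiling the patterns once and treating a missing agent as an empty string. `SAMPLE_LIST`, `compile_patterns`, and `robot?` are hypothetical names, and the four patterns are a small sample copied from the robot list in this package. Matching is case-sensitive here, as in the source.

```ruby
# Raw heredoc so backslashes reach Regexp.new untouched.
SAMPLE_LIST = <<~'LIST'
  [^a]fish
  ^$
  appie
  Microsoft(\s|\+)URL(\s|\+)Control
LIST

# Turns one-pattern-per-line text into compiled regular expressions.
def compile_patterns(text)
  text.each_line.map(&:chomp).reject(&:empty?).map { |line| Regexp.new(line) }
end

# True when any pattern matches; nil agents are treated as empty strings.
def robot?(user_agent, patterns)
  agent = user_agent.to_s
  patterns.any? { |pattern| pattern.match?(agent) }
end

patterns = compile_patterns(SAMPLE_LIST)
puts robot?('Microsoft URL Control - 6.00.8862', patterns)
puts robot?('Firefox 3.0', patterns)
```

The `^$` entry is what makes an empty or missing user agent count as a robot. Blank lines are dropped before compiling because an empty pattern would match every agent.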
@@ -0,0 +1,3 @@
1
+ module IrusAnalytics
2
+ VERSION = "0.0.1"
3
+ end
@@ -0,0 +1,46 @@
1
+ require "irus_analytics/version"
2
+ require "irus_analytics/controller/analytics_behaviour"
3
+ require "irus_analytics/irus_analytics_service"
4
+ require "irus_analytics/tracker_context_object_builder"
5
+ require "irus_analytics/user_agent_filter"
6
+ require "irus_analytics/irus_client"
7
+ require "irus_analytics/rail_tie" if defined?(Rails)
8
+ require "resque/server"
9
+
10
+
11
+ module IrusAnalytics
12
+ class << self
13
+ attr_writer :configuration
14
+ end
15
+
16
+ def self.configuration
17
+ @configuration ||= Configuration.new
18
+ end
19
+
20
+ def self.reset
21
+ @configuration = Configuration.new
22
+ end
23
+
24
+ def self.configure
25
+ yield(configuration)
26
+ end
27
+
28
+ def self.root
29
+ @root ||= File.expand_path(File.dirname(File.dirname(__FILE__)))
30
+ end
31
+
32
+ def self.config
33
+ File.join root, "config"
34
+ end
35
+
36
+ class Configuration
37
+ attr_accessor :source_repository, :irus_server_address
38
+
39
+ def initialize
40
+ @source_repository = "localhost:3000"
41
+ @irus_server_address = "localhost:3000/irus"
42
+ end
43
+ end
44
+
45
+ end
46
+
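The module above exposes a memoised configuration object plus a block-style `configure` entry point. The sketch below reproduces that pattern in isolation so it can be run without the gem. `DemoAnalytics` is a hypothetical module name, and the values are placeholders in the style of the generated initializer.

```ruby
module DemoAnalytics
  # Plain settings holder; both attributes start as nil in this sketch.
  class Configuration
    attr_accessor :source_repository, :irus_server_address
  end

  # Lazily creates and memoises the configuration.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yields the configuration so callers can set several values in one block.
  def self.configure
    yield(configuration)
  end

  # Discards all settings, which is useful between tests.
  def self.reset
    @configuration = Configuration.new
  end
end

DemoAnalytics.configure do |config|
  config.source_repository = 'your-repository.org'
  config.irus_server_address = nil
end

puts DemoAnalytics.configuration.source_repository
```

Direct assignment, as in `DemoAnalytics.configuration.source_repository = '...'`, works equally well; the block form mainly groups related settings.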
@@ -0,0 +1,3 @@
1
+ require 'resque/tasks'
2
+
3
+ task 'resque:setup' => :environment
@@ -0,0 +1,74 @@
1
+ require 'spec_helper'
2
+
3
+ class TestClass
4
+ include IrusAnalytics::Controller::AnalyticsBehaviour
5
+ attr_accessor :request, :item_identifier
6
+ end
7
+
8
+ describe IrusAnalytics::Controller::AnalyticsBehaviour do
9
+
10
+ describe ".send_analytics" do
11
+ before(:each) do
12
+ @test_class = TestClass.new
13
+ @test_class.request = double("request", :remote_ip => "127.0.0.1", :user_agent => "Test user agent", url: "http://localhost:3000/test", referer: "http://localhost:3000", headers: { "HTTP_RANGE" => nil })
14
+ @test_class.item_identifier = "test:123"
15
+ end
16
+
17
+ it "will call the send_irus_analytics method with the correct params..." do
18
+ # We set the datetime stamp to ensure sync
19
+ date_time = "2014-06-09T16:56:48Z"
20
+ allow(@test_class).to receive(:datetime_stamp) .and_return(date_time)
21
+ allow(@test_class).to receive(:source_repository_name) .and_return("hydra.hull.ac.uk")
22
+ allow(@test_class).to receive(:irus_server_address) .and_return("irus-server-address.org")
23
+ allow(@test_class).to receive(:rails_environment) .and_return("production")
24
+ params = { date_stamp: date_time, client_ip_address: "127.0.0.1", user_agent: "Test user agent",item_oai_identifier: "test:123",
25
+ file_url: "http://localhost:3000/test", http_referer: "http://localhost:3000", source_repository: "hydra.hull.ac.uk" }
26
+
27
+ allow(Resque).to receive(:enqueue) .and_return(nil)
28
+ # Should NOT filter this request
29
+ expect(@test_class).to receive(:filter_request?).and_return(false)
30
+ expect(Resque).to receive(:enqueue).with(IrusAnalytics::IrusClient, "irus-server-address.org", params )
31
+
32
+ @test_class.send_analytics
33
+ end
34
+
35
+ it "will not call the send_irus_analytics method when there is a filter user-agent.." do
36
+ # Add a well known robot...
37
+ @test_class.request = double("request", :remote_ip => "127.0.0.1", :user_agent => "Microsoft URL Control - 6.00.8862", url: "http://localhost:3000/test", referer: "http://localhost:3000", headers: { "HTTP_RANGE" => nil })
38
+ # We set the datetime stamp to ensure sync
39
+ date_time = "2014-06-09T16:56:48Z"
40
+ allow(@test_class).to receive(:datetime_stamp) .and_return(date_time)
41
+ allow(@test_class).to receive(:source_repository_name) .and_return("hydra.hull.ac.uk")
42
+ allow(@test_class).to receive(:irus_server_address) .and_return("irus-server-address.org")
43
+ allow(@test_class).to receive(:rails_environment) .and_return("production")
44
+ params = { date_stamp: date_time, client_ip_address: "127.0.0.1", user_agent: "Microsoft URL Control - 6.00.8862" ,item_oai_identifier: "test:123",
45
+ file_url: "http://localhost:3000/test", http_referer: "http://localhost:3000", source_repository: "hydra.hull.ac.uk" }
46
+
47
+ allow(Resque).to receive(:enqueue) .and_return(nil)
48
+ # Should filter this request
49
+ expect(Resque).to_not receive(:enqueue).with(IrusAnalytics::IrusClient, "irus-server-address.org", params )
50
+ @test_class.send_analytics
51
+ end
52
+
53
+
54
+ it "will not call the send_irus_analytics method when the request is expecting a chunk of data (HTTP_RANGE downloading data)." do
55
+ # Simulate a partial-content request by setting the HTTP_RANGE header...
56
+ @test_class.request = double("request", :remote_ip => "127.0.0.1", :user_agent =>"Test user agent", url: "http://localhost:3000/test", referer: "http://localhost:3000", headers: { "HTTP_RANGE" => "bytes=0-65535"})
57
+ # We set the datetime stamp to ensure sync
58
+ date_time = "2014-06-09T16:56:48Z"
59
+ allow(@test_class).to receive(:datetime_stamp) .and_return(date_time)
60
+ allow(@test_class).to receive(:source_repository_name) .and_return("hydra.hull.ac.uk")
61
+ allow(@test_class).to receive(:irus_server_address) .and_return("irus-server-address.org")
62
+ allow(@test_class).to receive(:rails_environment) .and_return("production")
63
+ params = { date_stamp: date_time, client_ip_address: "127.0.0.1", user_agent: "Test user agent", item_oai_identifier: "test:123",
64
+ file_url: "http://localhost:3000/test", http_referer: "http://localhost:3000", source_repository: "hydra.hull.ac.uk" }
65
+
66
+ allow(Resque).to receive(:enqueue) .and_return(nil)
67
+ # Should filter this request
68
+ expect(Resque).to_not receive(:enqueue).with(IrusAnalytics::IrusClient, "irus-server-address.org", params )
69
+ @test_class.send_analytics
70
+ end
71
+
72
+
73
+ end
74
+ end
@@ -0,0 +1,31 @@
1
+ require 'spec_helper'
2
+
3
+ describe IrusAnalytics::IrusAnalyticsService do
4
+ let (:irus_analytics_service) { IrusAnalytics::IrusAnalyticsService.new("") }
5
+ let(:test_params) { { date_stamp: "2010-10-17T03:04:42Z", client_ip_address: "127.0.0.1", user_agent: "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405",
6
+ item_oai_identifier: "hull:123", file_url: "https://hydra.hull.ac.uk/assets/hull:123/content", http_referer: "https://www.google.co.uk/search?q=hydra+hull%3A123&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&channel=sb&gfe_rd=cr",
7
+ source_repository: "hydra.hull.ac.uk" } }
8
+
9
+ describe ".send_analytics" do
10
+
11
+ before (:each) do
12
+ # Create a double for the transport object that will return 200 OK status
13
+ transport = double("transport", :get => "", :code => "200")
14
+ allow(irus_analytics_service).to receive(:openurl_link_resolver) .and_return(transport)
15
+ end
16
+
17
+ it "will throw an exception if the irus_server_address object variable is not set" do
18
+ expect { irus_analytics_service.send_analytics(test_params) }.to raise_error
19
+ end
20
+
21
+ it "enables the required parameters to be set within a hash" do
22
+ irus_analytics_service.irus_server_address = "irus_address"
23
+ irus_analytics_service.send_analytics(test_params)
24
+ end
25
+
26
+ it "will throw an exception if any of the mandatory IRUS data is missing" do
27
+ expect { irus_analytics_service.send_analytics({}) }.to raise_error
28
+ end
29
+
30
+ end
31
+ end
@@ -0,0 +1,22 @@
1
+ require "spec_helper"
2
+
3
+ describe IrusAnalytics::IrusClient do
4
+ let(:test_params) { { date_stamp: "2010-10-17T03:04:42Z", client_ip_address: "127.0.0.1", user_agent: "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405",
5
+ item_oai_identifier: "hull:123", file_url: "https://hydra.hull.ac.uk/assets/hull:123/content", http_referer: "https://www.google.co.uk/search?q=hydra+hull%3A123&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&channel=sb&gfe_rd=cr",
6
+ source_repository: "hydra.hull.ac.uk" } }
7
+ describe ".perform" do
8
+ it "takes the irus_server_address and analytics_params and calls IrusAnalyticsService.send_irus_analytics method" do
9
+ # subject.class.perform("irus-server", test_params)
10
+ end
11
+ end
12
+
13
+ # Required because Resque returns stringified hash keys
14
+ describe ".symbolize_keys" do
15
+ it "takes a hash that uses string keys, and returns the hash with symbol keys" do
16
+ test_hash = { "key_1" => "Value 1", "key_2" => "Value 2", "key_3" => "Value 3" }
17
+ new_hash = IrusAnalytics::IrusClient.symbolize_keys(test_hash)
18
+ expect(new_hash).to include(key_1: "Value 1", key_2: "Value 2", key_3: "Value 3")
19
+ end
20
+ end
21
+
22
+ end
@@ -0,0 +1,78 @@
1
+ require 'spec_helper'
2
+
3
+ describe IrusAnalytics::TrackerContextObjectBuilder do
4
+ describe ".initialize" do
5
+ it "will initialize an empty OpenURL::ContextObject instance" do
6
+ expect(IrusAnalytics::TrackerContextObjectBuilder.new.context_object).to be_an_instance_of OpenURL::ContextObject
7
+ end
8
+ end
9
+
10
+ context "set methods" do
11
+ let(:builder) { IrusAnalytics::TrackerContextObjectBuilder.new }
12
+
13
+ describe "OpenURL version" do
14
+ it "will be defaulted to required version for IRUS" do
15
+ expect(builder.context_object.kev).to include("url_ver=Z39.88-2004")
16
+ end
17
+ end
18
+
19
+ describe "set_event_datestamp" do
20
+ it "will set event datestamp as per IRUS specification" do
21
+ date_time = "2010-10-17T03:04:42Z"
22
+ builder.set_event_datestamp(date_time)
23
+ expect(builder.context_object.kev).to include("url_tim=2010-10-17T03%3A04%3A42Z") # URL-encoded version of the string
24
+ end
25
+ end
26
+
27
+ describe "set_client_ip_address" do
28
+ it "will set client ip address as per IRUS specification" do
29
+ ip_address = "127.0.0.1"
30
+ builder.set_client_ip_address(ip_address)
31
+ expect(builder.context_object.kev).to include("req_id=urn%3Aip%3A127.0.0.1")
32
+ end
33
+ end
34
+
35
+ describe "set_user_agent" do
36
+ it "will set the request UserAgent as per IRUS specification" do
37
+ agent = "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405"
38
+ builder.set_user_agent(agent)
39
+ expect(builder.context_object.kev).to include("req_dat=Mozilla%2F5.0+%28iPad%3B+U%3B+CPU+OS+3_2_1+like+Mac+OS+X%3B+en-us%29+AppleWebKit%2F531.21.10+%28KHTML%2C+like+Gecko%29+Mobile%2F7B405")
40
+ end
41
+ end
42
+
43
+ describe "set_oai_identifier" do
44
+ it "will set the item oai identifier as per IRUS specification" do
45
+ identifier = "oai:hull.ac.uk:hull:123"
46
+ builder.set_oai_identifier(identifier)
47
+ expect(builder.context_object.kev).to include("rft.artnum=oai%3Ahull.ac.uk%3Ahull%3A123")
48
+ end
49
+ end
50
+
51
+ describe "set_file_url" do
52
+ it "will set FileURL as per IRUS specification" do
53
+ url = "https://hydra.hull.ac.uk/assets/hull:123/content"
54
+ builder.set_file_url(url)
55
+ expect(builder.context_object.kev).to include("svc_dat=https%3A%2F%2Fhydra.hull.ac.uk%2Fassets%2Fhull%3A123%2Fcontent")
56
+ end
57
+ end
58
+
59
+ describe "set_http_referer" do
60
+ it "will set the HTTP referer as per IRUS specification" do
61
+ referer = "http://www.google.co.uk/url?sa=t&rct=j&q=http%20referer&source=web&cd=4&sqi=2&ved=0CEoQFjAD&url=http%3A%2F%2Fwww.whatismyreferer.com%2F&ei=zIBCU6fbEoOqhQf67YCwBg&usg=AFQjCNFt-KMqneTZfEb6OxjPZlD4ogiJcQ&sig2=wZJYkoWgNScNjgxRbRs29w&bvm=bv.64125504,d.ZWU"
62
+ builder.set_http_referer(referer)
63
+ expect(builder.context_object.kev).to include("rfr_dat=http%3A%2F%2Fwww.google.co.uk%2Furl%3Fsa%3Dt%26rct%3Dj%26q%3Dhttp%2520referer%26source%3Dweb%26cd%3D4%26sqi%3D2%26ved%3D0CEoQFjAD%26url%3Dhttp%253A%252F%252Fwww.whatismyreferer.com%252F%26ei%3DzIBCU6fbEoOqhQf67YCwBg%26usg%3DAFQjCNFt-KMqneTZfEb6OxjPZlD4ogiJcQ%26sig2%3DwZJYkoWgNScNjgxRbRs29w%26bvm%3Dbv.64125504%2Cd.ZWU")
64
+ end
65
+ end
66
+
67
+ describe "set_source_repository" do
68
+ it "will set the Source repository as per IRUS specification" do
69
+ src_repo = "hydra.hull.ac.uk"
70
+ builder.set_source_repository(src_repo)
71
+ expect(builder.context_object.kev).to include("rfr_id=hydra.hull.ac.uk")
72
+ end
73
+ end
74
+
75
+
76
+ end
77
+
78
+ end
@@ -0,0 +1,23 @@
1
+ require 'spec_helper'
2
+
3
+ describe IrusAnalytics::UserAgentFilter do
4
+
5
+ context "singleton" do
6
+ describe ".instance" do
7
+ it "should return the singleton instance of the UserAgentFilter" do
8
+ expect(IrusAnalytics::UserAgentFilter.instance).to be_instance_of IrusAnalytics::UserAgentFilter
9
+ end
10
+ end
11
+
12
+ describe "#filter_user_agent?" do
13
+ it "will return true when a user agent should be filtered" do
14
+ expect(IrusAnalytics::UserAgentFilter.instance.filter_user_agent?("appie")).to be true
15
+ end
16
+
17
+ it "will return false when a user agent is valid and should not be filtered" do
18
+ expect(IrusAnalytics::UserAgentFilter.instance.filter_user_agent?("Firefox 3.0")).to be false
19
+ end
20
+ end
21
+
22
+ end
23
+ end
@@ -0,0 +1,6 @@
1
+ require 'rspec'
2
+ require 'irus_analytics'
3
+
4
+ # RSpec.configure do |c|
5
+ # c.mock_with :rspec
6
+ # end
metadata ADDED
@@ -0,0 +1,151 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: irus_analytics
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.1
5
+ platform: ruby
6
+ authors:
7
+ - Simon Lamb
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2014-08-22 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: openurl
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ~>
18
+ - !ruby/object:Gem::Version
19
+ version: '0.5'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ~>
25
+ - !ruby/object:Gem::Version
26
+ version: '0.5'
27
+ - !ruby/object:Gem::Dependency
28
+ name: resque
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ~>
32
+ - !ruby/object:Gem::Version
33
+ version: '1.25'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ~>
39
+ - !ruby/object:Gem::Version
40
+ version: '1.25'
41
+ - !ruby/object:Gem::Dependency
42
+ name: bundler
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ~>
46
+ - !ruby/object:Gem::Version
47
+ version: '1.6'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ~>
53
+ - !ruby/object:Gem::Version
54
+ version: '1.6'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rake
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ! '>='
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ! '>='
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: rspec
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - ! '>='
74
+ - !ruby/object:Gem::Version
75
+ version: '0'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - ! '>='
81
+ - !ruby/object:Gem::Version
82
+ version: '0'
83
+ description: ! 'More information about IRUS-UK can be found at [http://www.irus.mimas.ac.uk/](http://www.irus.mimas.ac.uk/). In
84
+ summary the IRUS-UK service is designed to provide article-level usage statistics
85
+ for Institutional Repositories. This gem was developed for use with a Hydra repository
86
+ [http://projecthydra.org/](http://projecthydra.org/), but it can be used with any
87
+ other Rails-based web application. '
88
+ email:
89
+ - s.lamb@hull.ac.uk
90
+ executables: []
91
+ extensions: []
92
+ extra_rdoc_files: []
93
+ files:
94
+ - .gitignore
95
+ - .ruby-gemset
96
+ - .ruby-version
97
+ - .travis.yml
98
+ - Gemfile
99
+ - LICENSE.txt
100
+ - README.md
101
+ - Rakefile
102
+ - config/counter_robot_list.txt
103
+ - irus_analytics.gemspec
104
+ - lib/generators/irus_analytics_generator.rb
105
+ - lib/irus_analytics.rb
106
+ - lib/irus_analytics/controller/analytics_behaviour.rb
107
+ - lib/irus_analytics/irus_analytics_service.rb
108
+ - lib/irus_analytics/irus_client.rb
109
+ - lib/irus_analytics/rail_tie.rb
110
+ - lib/irus_analytics/tracker_context_object_builder.rb
111
+ - lib/irus_analytics/user_agent_filter.rb
112
+ - lib/irus_analytics/version.rb
113
+ - lib/tasks/resque.rake
114
+ - spec/lib/irus_analytics/controller/analytics_behaviour_spec.rb
115
+ - spec/lib/irus_analytics/irus_analytics_service_spec.rb
116
+ - spec/lib/irus_analytics/irus_client_spec.rb
117
+ - spec/lib/irus_analytics/tracker_context_object_builder_spec.rb
118
+ - spec/lib/irus_analytics/user_agent_filter_spec.rb
119
+ - spec/spec_helper.rb
120
+ homepage: https://github.com/uohull/irus_analytics
121
+ licenses:
122
+ - APACHE2
123
+ metadata: {}
124
+ post_install_message:
125
+ rdoc_options: []
126
+ require_paths:
127
+ - lib
128
+ required_ruby_version: !ruby/object:Gem::Requirement
129
+ requirements:
130
+ - - ! '>='
131
+ - !ruby/object:Gem::Version
132
+ version: '0'
133
+ required_rubygems_version: !ruby/object:Gem::Requirement
134
+ requirements:
135
+ - - ! '>='
136
+ - !ruby/object:Gem::Version
137
+ version: '0'
138
+ requirements: []
139
+ rubyforge_project:
140
+ rubygems_version: 2.2.2
141
+ signing_key:
142
+ specification_version: 4
143
+ summary: IrusAnalytics is a gem that provides a simple way to send analytics to the
144
+ IRUS-UK repository aggregation service.
145
+ test_files:
146
+ - spec/lib/irus_analytics/controller/analytics_behaviour_spec.rb
147
+ - spec/lib/irus_analytics/irus_analytics_service_spec.rb
148
+ - spec/lib/irus_analytics/irus_client_spec.rb
149
+ - spec/lib/irus_analytics/tracker_context_object_builder_spec.rb
150
+ - spec/lib/irus_analytics/user_agent_filter_spec.rb
151
+ - spec/spec_helper.rb