oai_repository 0.1.2 → 0.9.0

data/README.rdoc CHANGED
@@ -3,3 +3,126 @@
  A Rails (3.1+) engine that allows you to expose your models through an OAI-PHM Data Provider interface.
 
  See http://www.oaforum.org/tutorial/ and http://www.openarchives.org/OAI/openarchivesprotocol.html#Repository
+
+ == Installation
+
+ If you are using Bundler with your Rails application, then simply add
+
+   gem "oai_repository"
+
+ and then run bundle install as usual.
+
+ Then run the generator
+
+   $ rails g oai_repository:install
+
+ == Configuration
+
+ The generator installs a configuration file at <tt>config/initializers/oai_repository.rb</tt>
+
+ The following settings should be edited appropriately:
+
+   config.repository_name = 'Test repository'
+   config.repository_url = 'http://localhost:3000/oai_repository'
+   config.record_prefix = 'http://localhost:3000/'
+   config.admin_email = 'root@localhost'
+   config.limit = 100
+   config.models = [ Person, Instrument ]
+
+ The values for <tt>config.models</tt> should be the class name of the ActiveRecord model class that is being identified with the given set. It doesn't actually _have_ to be an ActiveRecord model class, but it should act like one. You *must* supply at least one model.
+
+ The following settings are optional:
+
+   config.sets = []
+   config.additional_formats = []
+
+ The items of the sets list should be hash with value for spec, name, model, and optionally description. E.g.
+
+   config.sets = [
+     {
+       spec: 'class:party',
+       name: 'Parties',
+       model: Person
+     },
+     {
+       spec: 'class:service',
+       name: 'Services',
+       model: Instrument,
+       description: 'Things that are services'
+     }
+   ]
+
+ By default, an OAI repository must be able to emit its records in OAI_DC (Dublin Core) format. If you want to provide other output formats for your repository
+ (and those formats are subclasses of OAI::Provider::Metadata.Format) then
+ you can specify them here. E.g.
+
+   require 'rifcs_format'
+
+   config.additional_formats = [
+     OAI::Provider::Metadata::RIFCS
+   ]
+
+ == Instrumenting your Models
+
+ === OAI DC Format
+
+ As a *bare* *minimum*, your model classes must implement the following method (or readable attribute)
+
+   oai_dc_identifier
+
+ This must return a *unique* value for the *whole* *repository*.
+ The format of the unique identifier *must* correspond to that of
+ the URI (Uniform Resource Identifier) syntax. See http://www.openarchives.org/OAI/openarchivesprotocol.html#UniqueIdentifier for more details.
+
+ You may also supply oai_dc_<value> where <value> is any of
+
+   title
+   creator
+   subject
+   description
+   publisher
+   contributor
+   date
+   type
+   format
+   source
+   language
+   relation
+   coverage
+   rights
+
+ See http://www.openarchives.org/OAI/openarchivesprotocol.html#dublincore for a bit more information on the Dublin Core metadata format.
+
+ === OAI Sets
+
+ A set is an optional construct for grouping items for the purpose of selective harvesting.
+
+ You must fill the configuration item <tt>config.sets</tt> to list the sets your
+ repository uses. This list will be shown in the output of a <tt>ListSets</tt> request.
+
+ If you are grouping your records by set you have two implementation options in your model(s).
+
+ If all records from a model will belong to a given set, then simply
+
+   include OaiRepository::Set
+
+ in your model and all records will belong to the sets from your <tt>config.sets</tt> mapping.
+
+ If you want to be selective about set membership, implement a <tt>sets</tt> method in your model that responds with the set that a record belongs to. E.g.
+
+   def sets
+     oai_sets = [ OAI::Set.new({name: 'Tools', spec: 'tools'}) ]
+     if name.match('multimeter')
+       oai_sets << OAI::Set.new({name: 'Meters', spec: 'meters'})
+     end
+     oai_sets
+   end
+
+
+ == Mounting the Engine
+
+ In your <tt>config/routes.rb</tt> add
+
+   mount OaiRepository::Engine => "/oai_repository"
+
+ changing the path as desired.
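
Note on the README hunk above: the "bare minimum" it asks of a model is an oai_dc_identifier method that returns a repository-wide unique value in URI syntax, with optional oai_dc_<value> methods alongside it. A minimal sketch of that contract follows. It uses a plain Struct instead of ActiveRecord so it runs on its own. The names InstrumentSketch and RECORD_PREFIX, and the "instruments/<id>" path layout, are assumptions made for illustration and are not part of the gem.

```ruby
require 'uri'

# Assumed to mirror config.record_prefix from the initializer (hypothetical constant).
RECORD_PREFIX = 'http://localhost:3000/'

# Hypothetical stand-in for a model class; only id and name are assumed.
InstrumentSketch = Struct.new(:id, :name) do
  # Required: unique across the whole repository, in URI syntax.
  def oai_dc_identifier
    "#{RECORD_PREFIX}instruments/#{id}"
  end

  # Optional Dublin Core field, following the oai_dc_<value> naming.
  def oai_dc_title
    name
  end
end

meter = InstrumentSketch.new(42, 'Bench multimeter')
puts meter.oai_dc_identifier
puts URI.parse(meter.oai_dc_identifier).scheme
```

Including a per-model path segment in the identifier is one way to keep ids unique when several models are listed in config.models; other schemes would work as long as the result stays a valid URI.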
@@ -4,9 +4,14 @@ class ARWrapperModel < OAI::Provider::Model
    def initialize(options={})
      @timestamp_field = options.delete(:timestamp_field) || 'updated_at'
      @limit = options.delete(:limit)
-     @sets_map = options.delete(:sets)
+     @models = options.delete(:models)
+     @sets_map = options.delete(:sets) || []
      @oai_dc_mapping = {}
 
+     if @models.nil? or @models.empty?
+       raise "models configuration value is required and must be non-empty"
+     end
+
      unless options.empty?
        raise ArgumentError.new(
          "Unsupported options [#{options.keys.join(', ')}]"
@@ -38,11 +43,11 @@ class ARWrapperModel < OAI::Provider::Model
    end
 
    def earliest
-     record_end(:asc)
+     first_record_date(:asc)
    end
 
    def latest
-     record_end(:desc)
+     first_record_date(:desc)
    end
 
    # selector can be id or :all
@@ -53,14 +58,13 @@ class ARWrapperModel < OAI::Provider::Model
        token = ResumptionToken.parse(options[:resumption_token])
      end
 
-     from = token ? token.from : options[:from]
-     to = token ? token.until : options[:until]
+     from = token ? token.from.utc : options[:from]
+     to = token ? token.until.utc : options[:until]
      last = token ? token.last : 0
-     prefix = token ? token.prefix : options[:metadata_prefix]
      set = token ? token.set : options[:set]
+     prefix = token ? token.prefix : options[:metadata_prefix]
 
-     conditions = sql_conditions(:from => from, :until => to)
-     record_rows = get_record_rows(get_models(set), conditions)
+     record_rows = get_record_rows(set, :from => from, :until => to)
 
      return get_specific_record(record_rows, selector) if selector != :all
 
@@ -75,7 +79,7 @@ class ARWrapperModel < OAI::Provider::Model
 
    def get_specific_record(records, id)
      # TODO: optimise somehow
-     # This is terribly intensive :-(
+     # This scans all records :-(
      records.each do |record|
        obj = Object.const_get(record["type"]).find(record["id"])
        return obj if obj.oai_dc_identifier.eql?(id)
@@ -85,47 +89,51 @@ class ARWrapperModel < OAI::Provider::Model
 
    def get_paged_records(record_rows, options)
 
-     if record_rows.size < options[:last]
+     last = options[:last]
+     if record_rows.size < last
        raise OAI::ResumptionTokenException.new
      end
 
-     list = get_record_objects(record_rows[options[:last], @limit])
+     list = get_record_objects(record_rows[last, @limit])
 
-     if list.size < @limit
+     last += @limit
+     if last >= record_rows.size
        list
      else
-       options[:last] += @limit
+       options[:last] = last
        PartialResult.new(list, ResumptionToken.new(options))
      end
    end
 
-   def sql_conditions(opts)
-     from = Time.parse(opts[:from].to_s).localtime
-     to = Time.parse(opts[:until].to_s).localtime.to_s
-     return "#{timestamp_field} >= #{ActiveRecord::Base.sanitize(from)} AND #{timestamp_field} <= #{ActiveRecord::Base.sanitize(to)}"
-   end
-
-   def get_models(set)
-     models =
-       if set.nil?
-         @sets_map.map{|s| s[:model]}
-       else
-         @sets_map.select{|s| s[:spec] == set}.map{|s| s[:model]}
+   def get_record_rows(set, options={})
+     union = []
+
+     from = options[:from]
+     # DateTime has microsecond precision, but we're parsing in dates with only
+     # second precision. In this case the microsecond value defaults to zero.
+     # Since some (most) of the records at the boundaries of a range will have
+     # non-zero microseconds (where the timestamp database field type has this
+     # level of resolution), the range needs to be adjusted to cover this.
+     #
+     # E.g. if the range is 2012-01-01 12:00:00 to 2012-01-01 12:30:00 inclusive
+     # a record with timestamp 2012-01-01 12:30:00.000001 is probably expected
+     # to fall in the range.
+     #
+     # To do this we extend the upper bound by one second and then make this
+     # upper bound exclusive.
+     to = options[:until] + 1.second
+
+     record_sql = @models.map do |m|
+       res = m.select("id, '#{m.name}' as type, #{timestamp_field}").where("#{timestamp_field} >= ? and #{timestamp_field} < ?", from.to_s(:db), to.to_s(:db))
+       if !(res.empty? or set.nil?)
+         res.select!{|record| record.sets.map(&:spec).include?(set.spec)}
        end
-
-     if models.empty?
-       raise OAI::NoMatchException.new
+       union += res unless res.empty?
      end
-     models
-   end
 
-   def get_record_rows(models, conditions)
-     record_sql = models.map do |m|
-       "select id, '#{m.name}' as type, #{timestamp_field} from #{m.table_name} where #{conditions}"
-     end.join(" UNION ")
+     raise OAI::NoMatchException.new if union.empty?
 
-     sorted_list_sql = "select t.id as id, t.type as type, t.updated_at as updated_at from (#{record_sql}) t order by t.updated_at desc"
-     records = ActiveRecord::Base.connection.execute(sorted_list_sql).to_a
+     union.sort! {|a,b| b.updated_at <=> a.updated_at}
    end
 
    def get_record_objects(records)
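
Note on the date-range comment in get_record_rows above: the new code widens the upper bound by one second and compares with a strict less-than, so records stamped with sub-second precision at the boundary are still matched. The sketch below illustrates only that comparison. It uses plain Ruby Time arithmetic (Time + 1 adds one second) rather than ActiveSupport's 1.second, and the helper name in_range? is hypothetical.

```ruby
# Hypothetical helper: true when timestamp falls in [from, to_inclusive],
# treating to_inclusive as having second precision only.
def in_range?(timestamp, from, to_inclusive)
  upper = to_inclusive + 1                 # extend the upper bound by one second
  timestamp >= from && timestamp < upper   # and make that bound exclusive
end

from  = Time.utc(2012, 1, 1, 12, 0, 0)
to    = Time.utc(2012, 1, 1, 12, 30, 0)
edge  = Time.utc(2012, 1, 1, 12, 30, 0, 1)  # 12:30:00.000001
after = Time.utc(2012, 1, 1, 12, 30, 1)     # 12:30:01

puts in_range?(edge, from, to)    # boundary record with microseconds
puts in_range?(after, from, to)   # one full second past the range
puts edge <= to                   # what a naive inclusive comparison would say
```

A naive `timestamp <= to` check would drop the 12:30:00.000001 record, which is the case the comment in the hunk describes.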
@@ -134,10 +142,10 @@ class ARWrapperModel < OAI::Provider::Model
      end
    end
 
-   def record_end(order)
+   def first_record_date(order)
      record = nil
-     @sets_map.each do |s|
-       r = s[:model].first(:order => "#{@timestamp_field} #{order.to_s}")
+     @models.each do |model|
+       r = model.first(:order => "#{@timestamp_field} #{order.to_s}")
        next if r.nil?
 
        if record.nil?
@@ -148,7 +156,9 @@ class ARWrapperModel < OAI::Provider::Model
          record = r
        end
      end
+
      raise OAI::NoMatchException if record.nil?
+
      record.send(@timestamp_field)
    end
 
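
Note on the get_paged_records change in the ARWrapperModel hunks above: the end-of-list test moved from "the page came back short" to "the next offset reaches the end of the row list". Reading the hunk, the difference seems to matter when the total row count is an exact multiple of the limit, where the old test would still hand out a resumption token after the final page. The sketch below reproduces only the offset arithmetic on a plain array. The method name page and the [list, next_offset] return shape are assumptions for illustration, and ArgumentError stands in for OAI::ResumptionTokenException.

```ruby
# Hypothetical paging helper. Returns [page, next_offset];
# next_offset is nil when the page is the last one.
def page(rows, last, limit)
  raise ArgumentError, 'offset past end of list' if rows.size < last

  list = rows[last, limit]   # slice of at most `limit` rows starting at `last`
  last += limit
  last >= rows.size ? [list, nil] : [list, last]
end

rows = (1..200).to_a
first_page, next_offset = page(rows, 0, 100)
puts first_page.size
puts next_offset.inspect

second_page, next_offset = page(rows, 100, 100)
puts second_page.last
puts next_offset.inspect   # nil: no token is issued after the final full page
```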
@@ -25,9 +25,15 @@ OaiRepository.setup do |config|
    # The number of records shown at a time (when doing a ListRecords)
    config.limit = 100
 
-   # Map the name of the set to the ActiveRecord (or other) class name that
-   # will provide (at a minimum) the required oai_dc attributes/methods.
-   # E.g.
+   # The values for "models" should be the class name of the ActiveRecord model
+   # class that is being identified with the given set. It doesn't actually have
+   # to be a ActiveRecord model class, but it should act like one.
+   #
+   # You must supply at least one model.
+   config.models = [ SupplyMe ]
+
+   # List the sets (and the ActiveRecord model they belong to). E.g.
+   #
    # config.sets = [
    #   {
    #     spec: 'class:party',
@@ -37,14 +43,11 @@ OaiRepository.setup do |config|
    #   {
    #     spec: 'class:service',
    #     name: 'Services',
-   #     description: 'Things that are services',
-   #     model: Instrument
+   #     model: Instrument,
+   #     description: 'Things that are services'
    #   }
    # ]
    #
-   # The "model" value should be the class name of the ActiveRecord model class
-   # that is being identified with the given set. It doesn't actually *have*
-   # to be a ActiveRecord model class, but it should act like one.
    config.sets = []
 
    # By default, an OAI repository must emit its records in OAI_DC (Dublin Core)
data/lib/oai_provider.rb CHANGED
@@ -20,6 +20,7 @@ module OAIProvider
      provider_class.record_prefix OaiRepository.record_prefix
      provider_class.admin_email OaiRepository.admin_email
      provider_class.source_model ARWrapperModel.new(
+       models: OaiRepository.models,
        sets: OaiRepository.sets,
        limit: OaiRepository.limit,
        timestamp_field: OaiRepository.timestamp_field
@@ -18,6 +18,9 @@ module OaiRepository
    mattr_accessor :admin_email
    @@admin_email = 'root@localhost'
 
+   mattr_accessor :models
+   @@models = {}
+
    mattr_accessor :sets
    @@sets = {}
 
@@ -30,4 +33,20 @@ module OaiRepository
 
    mattr_accessor :timestamp_field
    @@timestamp_field = 'updated_at'
+
+   module Set
+
+     def sets
+       OaiRepository.sets.select {|s| s[:model] == self.class}.map{|set_obj|
+         OAI::Set.new(
+           {
+             name: set_obj[:name],
+             spec: set_obj[:spec],
+             description: set_obj[:description]
+           }
+         )
+       }
+     end
+
+   end
  end
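
Note on the OaiRepository::Set mixin added above: it derives a record's sets by picking the config.sets entries whose :model equals the record's class. The sketch below shows that lookup in isolation so it can be run without Rails or the oai gem. SetStub, SETS, and SetMembership are hypothetical stand-ins for OAI::Set, OaiRepository.sets, and OaiRepository::Set respectively.

```ruby
# Stand-in for OAI::Set so the sketch runs without the oai gem (assumption).
SetStub = Struct.new(:name, :spec, :description)

class Person; end
class Instrument; end

# Same shape as the config.sets example in the README hunk.
SETS = [
  { spec: 'class:party',   name: 'Parties',  model: Person },
  { spec: 'class:service', name: 'Services', model: Instrument,
    description: 'Things that are services' }
]

# Mirrors the mixin's logic: keep entries for this class, wrap each as a set.
module SetMembership
  def sets
    SETS.select { |s| s[:model] == self.class }
        .map { |s| SetStub.new(s[:name], s[:spec], s[:description]) }
  end
end

class Person; include SetMembership; end
class Instrument; include SetMembership; end

puts Instrument.new.sets.map(&:spec).inspect
puts Person.new.sets.map(&:name).inspect
```

Because the match is on self.class, every record of a model gets the same sets with this mixin; per-record membership needs a hand-written sets method, as the README hunk describes.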
@@ -1,3 +1,3 @@
  module OaiRepository
-   VERSION = "0.1.2"
+   VERSION = "0.9.0"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: oai_repository
  version: !ruby/object:Gem::Version
-   version: 0.1.2
+   version: 0.9.0
  prerelease:
  platform: ruby
  authors:
@@ -10,11 +10,11 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2012-07-16 00:00:00.000000000 Z
+ date: 2012-07-24 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rails
-   requirement: &74598920 !ruby/object:Gem::Requirement
+   requirement: &78869700 !ruby/object:Gem::Requirement
      none: false
      requirements:
      - - ~>
@@ -22,10 +22,10 @@ dependencies:
          version: '3.1'
    type: :runtime
    prerelease: false
-   version_requirements: *74598920
+   version_requirements: *78869700
  - !ruby/object:Gem::Dependency
    name: oai
-   requirement: &74598390 !ruby/object:Gem::Requirement
+   requirement: &78868860 !ruby/object:Gem::Requirement
      none: false
      requirements:
      - - ! '>='
@@ -33,10 +33,10 @@ dependencies:
          version: '0'
    type: :runtime
    prerelease: false
-   version_requirements: *74598390
+   version_requirements: *78868860
  - !ruby/object:Gem::Dependency
    name: sqlite3
-   requirement: &74597970 !ruby/object:Gem::Requirement
+   requirement: &78919450 !ruby/object:Gem::Requirement
      none: false
      requirements:
      - - ! '>='
@@ -44,10 +44,10 @@ dependencies:
          version: '0'
    type: :development
    prerelease: false
-   version_requirements: *74597970
+   version_requirements: *78919450
  - !ruby/object:Gem::Dependency
    name: rspec-rails
-   requirement: &74651040 !ruby/object:Gem::Requirement
+   requirement: &78918000 !ruby/object:Gem::Requirement
      none: false
      requirements:
      - - ! '>='
@@ -55,10 +55,10 @@ dependencies:
          version: '0'
    type: :development
    prerelease: false
-   version_requirements: *74651040
+   version_requirements: *78918000
  - !ruby/object:Gem::Dependency
    name: capybara
-   requirement: &74645980 !ruby/object:Gem::Requirement
+   requirement: &78916060 !ruby/object:Gem::Requirement
      none: false
      requirements:
      - - ! '>='
@@ -66,7 +66,7 @@ dependencies:
          version: '0'
    type: :development
    prerelease: false
-   version_requirements: *74645980
+   version_requirements: *78916060
  description: An Engine for Rails (3.1+) that allows you to make your application an
    OAI-PMH Data Provider. See http://www.openarchives.org/pmh/ and http://www.openarchives.org/OAI/openarchivesprotocol.html#Repository
  email:
@@ -114,7 +114,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
        version: '0'
        segments:
        - 0
-       hash: 387383629
+       hash: 493502575
  required_rubygems_version: !ruby/object:Gem::Requirement
    none: false
    requirements:
@@ -123,7 +123,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
        version: '0'
        segments:
        - 0
-       hash: 387383629
+       hash: 493502575
  requirements: []
  rubyforge_project:
  rubygems_version: 1.8.15