oai_repository 0.1.2 → 0.9.0

data/README.rdoc CHANGED
@@ -3,3 +3,126 @@
3
3
  A Rails (3.1+) engine that allows you to expose your models through an OAI-PMH Data Provider interface.
4
4
 
5
5
  See http://www.oaforum.org/tutorial/ and http://www.openarchives.org/OAI/openarchivesprotocol.html#Repository
6
+
7
+ == Installation
8
+
9
+ If you are using Bundler with your Rails application, then simply add
10
+
11
+ gem "oai_repository"
12
+
13
+ and then run bundle install as usual.
14
+
15
+ Then run the generator
16
+
17
+ $ rails g oai_repository:install
18
+
19
+ == Configuration
20
+
21
+ The generator installs a configuration file at <tt>config/initializers/oai_repository.rb</tt>
22
+
23
+ The following settings should be edited appropriately:
24
+
25
+ config.repository_name = 'Test repository'
26
+ config.repository_url = 'http://localhost:3000/oai_repository'
27
+ config.record_prefix = 'http://localhost:3000/'
28
+ config.admin_email = 'root@localhost'
29
+ config.limit = 100
30
+ config.models = [ Person, Instrument ]
31
+
32
+ The values for <tt>config.models</tt> should be the ActiveRecord model classes whose records the repository exposes. They don't actually _have_ to be ActiveRecord model classes, but they should act like them. You *must* supply at least one model.
33
+
34
+ The following settings are optional:
35
+
36
+ config.sets = []
37
+ config.additional_formats = []
38
+
39
+ Each item in the sets list should be a hash with values for spec, name, model and, optionally, description. E.g.
40
+
41
+ config.sets = [
42
+ {
43
+ spec: 'class:party',
44
+ name: 'Parties',
45
+ model: Person
46
+ },
47
+ {
48
+ spec: 'class:service',
49
+ name: 'Services',
50
+ model: Instrument,
51
+ description: 'Things that are services'
52
+ }
53
+ ]
54
+
55
+ By default, an OAI repository must be able to emit its records in OAI_DC (Dublin Core) format. If you want to provide other output formats for your repository
56
+ (and those formats are subclasses of OAI::Provider::Metadata::Format) then
57
+ you can specify them here. E.g.
58
+
59
+ require 'rifcs_format'
60
+
61
+ config.additional_formats = [
62
+ OAI::Provider::Metadata::RIFCS
63
+ ]
64
+
65
+ == Instrumenting your Models
66
+
67
+ === OAI DC Format
68
+
69
+ As a *bare* *minimum*, your model classes must implement the following method (or readable attribute)
70
+
71
+ oai_dc_identifier
72
+
73
+ This must return a *unique* value for the *whole* *repository*.
74
+ The format of the unique identifier *must* correspond to that of
75
+ the URI (Uniform Resource Identifier) syntax. See http://www.openarchives.org/OAI/openarchivesprotocol.html#UniqueIdentifier for more details.
76
+
77
+ You may also supply oai_dc_<value> where <value> is any of
78
+
79
+ title
80
+ creator
81
+ subject
82
+ description
83
+ publisher
84
+ contributor
85
+ date
86
+ type
87
+ format
88
+ source
89
+ language
90
+ relation
91
+ coverage
92
+ rights
93
+
94
+ See http://www.openarchives.org/OAI/openarchivesprotocol.html#dublincore for a bit more information on the Dublin Core metadata format.
95
+
96
+ === OAI Sets
97
+
98
+ A set is an optional construct for grouping items for the purpose of selective harvesting.
99
+
100
+ You must fill the configuration item <tt>config.sets</tt> to list the sets your
101
+ repository uses. This list will be shown in the output of a <tt>ListSets</tt> request.
102
+
103
+ If you are grouping your records by set you have two implementation options in your model(s).
104
+
105
+ If all records from a model will belong to a given set, then simply
106
+
107
+ include OaiRepository::Set
108
+
109
+ in your model and all records will belong to the sets from your <tt>config.sets</tt> mapping.
110
+
111
+ If you want to be selective about set membership, implement a <tt>sets</tt> method in your model that returns the list of sets a record belongs to. E.g.
112
+
113
+ def sets
114
+ oai_sets = [ OAI::Set.new({name: 'Tools', spec: 'tools'}) ]
115
+ if name.match('multimeter')
116
+ oai_sets << OAI::Set.new({name: 'Meters', spec: 'meters'})
117
+ end
118
+ oai_sets
119
+ end
120
+
121
+
122
+ == Mounting the Engine
123
+
124
+ In your <tt>config/routes.rb</tt> add
125
+
126
+ mount OaiRepository::Engine => "/oai_repository"
127
+
128
+ changing the path as desired.
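As a rough illustration of the model interface the README describes, the sketch below shows an object that exposes the required <tt>oai_dc_identifier</tt> plus two of the optional <tt>oai_dc_<value></tt> accessors. The <tt>Instrument</tt> struct, its attributes (<tt>id</tt>, <tt>name</tt>, <tt>maker</tt>) and the URL pattern are hypothetical stand-ins, not part of the gem; a real application would presumably define these methods on its ActiveRecord class instead.

```ruby
# Hypothetical stand-in for an ActiveRecord model. The attribute names
# (id, name, maker) are assumptions made only for this illustration.
Instrument = Struct.new(:id, :name, :maker) do
  # Required: a value that is unique across the whole repository and
  # follows URI syntax. The URL pattern here is an assumed example.
  def oai_dc_identifier
    "http://localhost:3000/instruments/#{id}"
  end

  # Optional Dublin Core fields, one method per element.
  def oai_dc_title
    name
  end

  def oai_dc_creator
    maker
  end
end

meter = Instrument.new(42, 'Bench multimeter', 'Acme')
puts meter.oai_dc_identifier
puts meter.oai_dc_title
```

Any of the other optional elements (subject, description, publisher and so on) would follow the same naming pattern.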
@@ -4,9 +4,14 @@ class ARWrapperModel < OAI::Provider::Model
4
4
  def initialize(options={})
5
5
  @timestamp_field = options.delete(:timestamp_field) || 'updated_at'
6
6
  @limit = options.delete(:limit)
7
- @sets_map = options.delete(:sets)
7
+ @models = options.delete(:models)
8
+ @sets_map = options.delete(:sets) || []
8
9
  @oai_dc_mapping = {}
9
10
 
11
+ if @models.nil? or @models.empty?
12
+ raise "models configuration value is required and must be non-empty"
13
+ end
14
+
10
15
  unless options.empty?
11
16
  raise ArgumentError.new(
12
17
  "Unsupported options [#{options.keys.join(', ')}]"
@@ -38,11 +43,11 @@ class ARWrapperModel < OAI::Provider::Model
38
43
  end
39
44
 
40
45
  def earliest
41
- record_end(:asc)
46
+ first_record_date(:asc)
42
47
  end
43
48
 
44
49
  def latest
45
- record_end(:desc)
50
+ first_record_date(:desc)
46
51
  end
47
52
 
48
53
  # selector can be id or :all
@@ -53,14 +58,13 @@ class ARWrapperModel < OAI::Provider::Model
53
58
  token = ResumptionToken.parse(options[:resumption_token])
54
59
  end
55
60
 
56
- from = token ? token.from : options[:from]
57
- to = token ? token.until : options[:until]
61
+ from = token ? token.from.utc : options[:from]
62
+ to = token ? token.until.utc : options[:until]
58
63
  last = token ? token.last : 0
59
- prefix = token ? token.prefix : options[:metadata_prefix]
60
64
  set = token ? token.set : options[:set]
65
+ prefix = token ? token.prefix : options[:metadata_prefix]
61
66
 
62
- conditions = sql_conditions(:from => from, :until => to)
63
- record_rows = get_record_rows(get_models(set), conditions)
67
+ record_rows = get_record_rows(set, :from => from, :until => to)
64
68
 
65
69
  return get_specific_record(record_rows, selector) if selector != :all
66
70
 
@@ -75,7 +79,7 @@ class ARWrapperModel < OAI::Provider::Model
75
79
 
76
80
  def get_specific_record(records, id)
77
81
  # TODO: optimise somehow
78
- # This is terribly intensive :-(
82
+ # This scans all records :-(
79
83
  records.each do |record|
80
84
  obj = Object.const_get(record["type"]).find(record["id"])
81
85
  return obj if obj.oai_dc_identifier.eql?(id)
@@ -85,47 +89,51 @@ class ARWrapperModel < OAI::Provider::Model
85
89
 
86
90
  def get_paged_records(record_rows, options)
87
91
 
88
- if record_rows.size < options[:last]
92
+ last = options[:last]
93
+ if record_rows.size < last
89
94
  raise OAI::ResumptionTokenException.new
90
95
  end
91
96
 
92
- list = get_record_objects(record_rows[options[:last], @limit])
97
+ list = get_record_objects(record_rows[last, @limit])
93
98
 
94
- if list.size < @limit
99
+ last += @limit
100
+ if last >= record_rows.size
95
101
  list
96
102
  else
97
- options[:last] += @limit
103
+ options[:last] = last
98
104
  PartialResult.new(list, ResumptionToken.new(options))
99
105
  end
100
106
  end
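The paging change above can be read as follows: the old check (<tt>list.size < @limit</tt>) would still issue a resumption token when the final page happened to be exactly full, whereas the new check compares the next offset against the total row count. A minimal plain-Ruby sketch of that offset logic, assuming a hypothetical <tt>page</tt> helper that returns the next offset (or nil) in place of the gem's <tt>PartialResult</tt> and <tt>ResumptionToken</tt> objects:

```ruby
# Hypothetical helper mirroring the offset arithmetic in get_paged_records.
# Returns [records_for_this_page, next_offset_or_nil].
def page(rows, last, limit)
  # An offset beyond the result set means the token is stale or invalid.
  raise ArgumentError, 'offset past end of result set' if rows.size < last

  list = rows[last, limit]
  next_offset = last + limit

  if next_offset >= rows.size
    [list, nil]          # final page: nothing left to resume
  else
    [list, next_offset]  # more rows remain: caller should issue a token
  end
end

rows = (1..6).to_a
first_page, token = page(rows, 0, 3)
second_page, done = page(rows, token, 3)
puts first_page.inspect
puts second_page.inspect
puts done.inspect
```

With six rows and a limit of three, the second page is exactly full, yet no further offset is returned.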
101
107
 
102
- def sql_conditions(opts)
103
- from = Time.parse(opts[:from].to_s).localtime
104
- to = Time.parse(opts[:until].to_s).localtime.to_s
105
- return "#{timestamp_field} >= #{ActiveRecord::Base.sanitize(from)} AND #{timestamp_field} <= #{ActiveRecord::Base.sanitize(to)}"
106
- end
107
-
108
- def get_models(set)
109
- models =
110
- if set.nil?
111
- @sets_map.map{|s| s[:model]}
112
- else
113
- @sets_map.select{|s| s[:spec] == set}.map{|s| s[:model]}
108
+ def get_record_rows(set, options={})
109
+ union = []
110
+
111
+ from = options[:from]
112
+ # DateTime has microsecond precision, but we're parsing in dates with only
113
+ # second precision. In this case the microsecond value defaults to zero.
114
+ # Since some (most) of the records at the boundaries of a range will have
115
+ # non-zero microseconds (where the timestamp database field type has this
116
+ # level of resolution), the range needs to be adjusted to cover this.
117
+ #
118
+ # E.g. if the range is 2012-01-01 12:00:00 to 2012-01-01 12:30:00 inclusive
119
+ # a record with timestamp 2012-01-01 12:30:00.000001 is probably expected
120
+ # to fall in the range.
121
+ #
122
+ # To do this we extend the upper bound by one second and then make this
123
+ # upper bound exclusive.
124
+ to = options[:until] + 1.second
125
+
126
+ record_sql = @models.map do |m|
127
+ res = m.select("id, '#{m.name}' as type, #{timestamp_field}").where("#{timestamp_field} >= ? and #{timestamp_field} < ?", from.to_s(:db), to.to_s(:db))
128
+ if !(res.empty? or set.nil?)
129
+ res.select!{|record| record.sets.map(&:spec).include?(set.spec)}
114
130
  end
115
-
116
- if models.empty?
117
- raise OAI::NoMatchException.new
131
+ union += res unless res.empty?
118
132
  end
119
- models
120
- end
121
133
 
122
- def get_record_rows(models, conditions)
123
- record_sql = models.map do |m|
124
- "select id, '#{m.name}' as type, #{timestamp_field} from #{m.table_name} where #{conditions}"
125
- end.join(" UNION ")
134
+ raise OAI::NoMatchException.new if union.empty?
126
135
 
127
- sorted_list_sql = "select t.id as id, t.type as type, t.updated_at as updated_at from (#{record_sql}) t order by t.updated_at desc"
128
- records = ActiveRecord::Base.connection.execute(sorted_list_sql).to_a
136
+ union.sort! {|a,b| b.updated_at <=> a.updated_at}
129
137
  end
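The comment in <tt>get_record_rows</tt> describes widening the upper bound by one second and making it exclusive. A small standalone sketch of that comparison, using plain <tt>Time</tt> values in place of database timestamps (the <tt>in_range?</tt> helper is hypothetical and exists only for this illustration):

```ruby
# Hypothetical helper: true when timestamp falls in [from, upto + 1s).
# Mirrors the ">= from and < until + 1 second" condition in the query.
def in_range?(timestamp, from, upto)
  timestamp >= from && timestamp < upto + 1 # Time + 1 adds one second
end

from = Time.utc(2012, 1, 1, 12, 0, 0)
upto = Time.utc(2012, 1, 1, 12, 30, 0)

# A record stamped one microsecond after the nominal upper bound.
edge = Time.utc(2012, 1, 1, 12, 30, 0, 1)
# A record stamped a full second after the nominal upper bound.
late = Time.utc(2012, 1, 1, 12, 30, 1)

puts edge <= upto                 # naive inclusive check on the raw bound
puts in_range?(edge, from, upto)  # widened, exclusive upper bound
puts in_range?(late, from, upto)
```

The naive inclusive comparison rejects the microsecond-offset record, while the widened exclusive bound accepts it and still excludes anything a full second later.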
130
138
 
131
139
  def get_record_objects(records)
@@ -134,10 +142,10 @@ class ARWrapperModel < OAI::Provider::Model
134
142
  end
135
143
  end
136
144
 
137
- def record_end(order)
145
+ def first_record_date(order)
138
146
  record = nil
139
- @sets_map.each do |s|
140
- r = s[:model].first(:order => "#{@timestamp_field} #{order.to_s}")
147
+ @models.each do |model|
148
+ r = model.first(:order => "#{@timestamp_field} #{order.to_s}")
141
149
  next if r.nil?
142
150
 
143
151
  if record.nil?
@@ -148,7 +156,9 @@ class ARWrapperModel < OAI::Provider::Model
148
156
  record = r
149
157
  end
150
158
  end
159
+
151
160
  raise OAI::NoMatchException if record.nil?
161
+
152
162
  record.send(@timestamp_field)
153
163
  end
154
164
 
@@ -25,9 +25,15 @@ OaiRepository.setup do |config|
25
25
  # The number of records shown at a time (when doing a ListRecords)
26
26
  config.limit = 100
27
27
 
28
- # Map the name of the set to the ActiveRecord (or other) class name that
29
- # will provide (at a minimum) the required oai_dc attributes/methods.
30
- # E.g.
28
+ # The values for "models" should be the class name of the ActiveRecord model
29
+ # class that is being identified with the given set. It doesn't actually have
30
+ # to be an ActiveRecord model class, but it should act like one.
31
+ #
32
+ # You must supply at least one model.
33
+ config.models = [ SupplyMe ]
34
+
35
+ # List the sets (and the ActiveRecord model they belong to). E.g.
36
+ #
31
37
  # config.sets = [
32
38
  # {
33
39
  # spec: 'class:party',
@@ -37,14 +43,11 @@ OaiRepository.setup do |config|
37
43
  # {
38
44
  # spec: 'class:service',
39
45
  # name: 'Services',
40
- # description: 'Things that are services',
41
- # model: Instrument
46
+ # model: Instrument,
47
+ # description: 'Things that are services'
42
48
  # }
43
49
  # ]
44
50
  #
45
- # The "model" value should be the class name of the ActiveRecord model class
46
- # that is being identified with the given set. It doesn't actually *have*
47
- # to be a ActiveRecord model class, but it should act like one.
48
51
  config.sets = []
49
52
 
50
53
  # By default, an OAI repository must emit its records in OAI_DC (Dublin Core)
data/lib/oai_provider.rb CHANGED
@@ -20,6 +20,7 @@ module OAIProvider
20
20
  provider_class.record_prefix OaiRepository.record_prefix
21
21
  provider_class.admin_email OaiRepository.admin_email
22
22
  provider_class.source_model ARWrapperModel.new(
23
+ models: OaiRepository.models,
23
24
  sets: OaiRepository.sets,
24
25
  limit: OaiRepository.limit,
25
26
  timestamp_field: OaiRepository.timestamp_field
@@ -18,6 +18,9 @@ module OaiRepository
18
18
  mattr_accessor :admin_email
19
19
  @@admin_email = 'root@localhost'
20
20
 
21
+ mattr_accessor :models
22
+ @@models = {}
23
+
21
24
  mattr_accessor :sets
22
25
  @@sets = {}
23
26
 
@@ -30,4 +33,20 @@ module OaiRepository
30
33
 
31
34
  mattr_accessor :timestamp_field
32
35
  @@timestamp_field = 'updated_at'
36
+
37
+ module Set
38
+
39
+ def sets
40
+ OaiRepository.sets.select {|s| s[:model] == self.class}.map{|set_obj|
41
+ OAI::Set.new(
42
+ {
43
+ name: set_obj[:name],
44
+ spec: set_obj[:spec],
45
+ description: set_obj[:description]
46
+ }
47
+ )
48
+ }
49
+ end
50
+
51
+ end
33
52
  end
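To make the behaviour of the new <tt>OaiRepository::Set</tt> mixin easier to follow, here is a standalone sketch of the same lookup: each instance selects the configured set entries whose <tt>:model</tt> matches its own class. <tt>SetInfo</tt>, <tt>ConfiguredSets</tt> and the <tt>SETS</tt> constant are hypothetical stand-ins for <tt>OAI::Set</tt>, the mixin and <tt>OaiRepository.sets</tt>, so the example runs without Rails or the oai gem.

```ruby
# Stand-in for OAI::Set (assumption: only these three fields matter here).
SetInfo = Struct.new(:name, :spec, :description)

# Stand-in for the mixin: map configured entries for this class to sets.
module ConfiguredSets
  def sets
    SETS.select { |s| s[:model] == self.class }
        .map { |s| SetInfo.new(s[:name], s[:spec], s[:description]) }
  end
end

class Person
  include ConfiguredSets
end

class Instrument
  include ConfiguredSets
end

# Stand-in for OaiRepository.sets, using the README's example values.
SETS = [
  { spec: 'class:party', name: 'Parties', model: Person },
  { spec: 'class:service', name: 'Services', model: Instrument,
    description: 'Things that are services' }
]

puts Instrument.new.sets.map(&:spec).inspect
puts Person.new.sets.map(&:name).inspect
```

An entry without a <tt>:description</tt> key simply yields a set whose description is nil.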
@@ -1,3 +1,3 @@
1
1
  module OaiRepository
2
- VERSION = "0.1.2"
2
+ VERSION = "0.9.0"
3
3
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: oai_repository
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.2
4
+ version: 0.9.0
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -10,11 +10,11 @@ authors:
10
10
  autorequire:
11
11
  bindir: bin
12
12
  cert_chain: []
13
- date: 2012-07-16 00:00:00.000000000 Z
13
+ date: 2012-07-24 00:00:00.000000000 Z
14
14
  dependencies:
15
15
  - !ruby/object:Gem::Dependency
16
16
  name: rails
17
- requirement: &74598920 !ruby/object:Gem::Requirement
17
+ requirement: &78869700 !ruby/object:Gem::Requirement
18
18
  none: false
19
19
  requirements:
20
20
  - - ~>
@@ -22,10 +22,10 @@ dependencies:
22
22
  version: '3.1'
23
23
  type: :runtime
24
24
  prerelease: false
25
- version_requirements: *74598920
25
+ version_requirements: *78869700
26
26
  - !ruby/object:Gem::Dependency
27
27
  name: oai
28
- requirement: &74598390 !ruby/object:Gem::Requirement
28
+ requirement: &78868860 !ruby/object:Gem::Requirement
29
29
  none: false
30
30
  requirements:
31
31
  - - ! '>='
@@ -33,10 +33,10 @@ dependencies:
33
33
  version: '0'
34
34
  type: :runtime
35
35
  prerelease: false
36
- version_requirements: *74598390
36
+ version_requirements: *78868860
37
37
  - !ruby/object:Gem::Dependency
38
38
  name: sqlite3
39
- requirement: &74597970 !ruby/object:Gem::Requirement
39
+ requirement: &78919450 !ruby/object:Gem::Requirement
40
40
  none: false
41
41
  requirements:
42
42
  - - ! '>='
@@ -44,10 +44,10 @@ dependencies:
44
44
  version: '0'
45
45
  type: :development
46
46
  prerelease: false
47
- version_requirements: *74597970
47
+ version_requirements: *78919450
48
48
  - !ruby/object:Gem::Dependency
49
49
  name: rspec-rails
50
- requirement: &74651040 !ruby/object:Gem::Requirement
50
+ requirement: &78918000 !ruby/object:Gem::Requirement
51
51
  none: false
52
52
  requirements:
53
53
  - - ! '>='
@@ -55,10 +55,10 @@ dependencies:
55
55
  version: '0'
56
56
  type: :development
57
57
  prerelease: false
58
- version_requirements: *74651040
58
+ version_requirements: *78918000
59
59
  - !ruby/object:Gem::Dependency
60
60
  name: capybara
61
- requirement: &74645980 !ruby/object:Gem::Requirement
61
+ requirement: &78916060 !ruby/object:Gem::Requirement
62
62
  none: false
63
63
  requirements:
64
64
  - - ! '>='
@@ -66,7 +66,7 @@ dependencies:
66
66
  version: '0'
67
67
  type: :development
68
68
  prerelease: false
69
- version_requirements: *74645980
69
+ version_requirements: *78916060
70
70
  description: An Engine for Rails (3.1+) that allows you to make your application an
71
71
  OAI-PMH Data Provider. See http://www.openarchives.org/pmh/ and http://www.openarchives.org/OAI/openarchivesprotocol.html#Repository
72
72
  email:
@@ -114,7 +114,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
114
114
  version: '0'
115
115
  segments:
116
116
  - 0
117
- hash: 387383629
117
+ hash: 493502575
118
118
  required_rubygems_version: !ruby/object:Gem::Requirement
119
119
  none: false
120
120
  requirements:
@@ -123,7 +123,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
123
123
  version: '0'
124
124
  segments:
125
125
  - 0
126
- hash: 387383629
126
+ hash: 493502575
127
127
  requirements: []
128
128
  rubyforge_project:
129
129
  rubygems_version: 1.8.15