ro-crate 0.4.9 → 0.4.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. checksums.yaml +4 -4
  2. data/Gemfile.lock +13 -13
  3. data/README.md +39 -0
  4. data/lib/ro_crate/model/crate.rb +48 -6
  5. data/lib/ro_crate/model/data_entity.rb +5 -4
  6. data/lib/ro_crate/model/directory.rb +2 -4
  7. data/lib/ro_crate/model/entity.rb +41 -4
  8. data/lib/ro_crate/model/entry.rb +2 -2
  9. data/lib/ro_crate/model/file.rb +2 -4
  10. data/lib/ro_crate/model/metadata.rb +9 -1
  11. data/lib/ro_crate/model/organization.rb +1 -1
  12. data/lib/ro_crate/model/preview.rb +3 -15
  13. data/lib/ro_crate/model/preview_generator.rb +40 -0
  14. data/lib/ro_crate/model/remote_entry.rb +1 -12
  15. data/lib/ro_crate/reader.rb +76 -19
  16. data/lib/ro_crate/writer.rb +4 -4
  17. data/lib/ro_crate.rb +2 -1
  18. data/ro_crate.gemspec +3 -3
  19. data/test/crate_test.rb +58 -3
  20. data/test/directory_test.rb +21 -21
  21. data/test/entity_test.rb +114 -0
  22. data/test/fixtures/biobb_hpc_workflows-condapack.zip +0 -0
  23. data/test/fixtures/conflicting_data_directory/info.txt +1 -0
  24. data/test/fixtures/conflicting_data_directory/nested.txt +1 -0
  25. data/test/fixtures/nested_directory.zip +0 -0
  26. data/test/fixtures/ro-crate-galaxy-sortchangecase/LICENSE +176 -0
  27. data/test/fixtures/ro-crate-galaxy-sortchangecase/README.md +6 -0
  28. data/test/fixtures/ro-crate-galaxy-sortchangecase/ro-crate-metadata.json +133 -0
  29. data/test/fixtures/ro-crate-galaxy-sortchangecase/sort-and-change-case.ga +118 -0
  30. data/test/fixtures/ro-crate-galaxy-sortchangecase/test/test1/input.bed +3 -0
  31. data/test/fixtures/ro-crate-galaxy-sortchangecase/test/test1/output_exp.bed +3 -0
  32. data/test/fixtures/ro-crate-galaxy-sortchangecase/test/test1/sort-and-change-case-test.yml +8 -0
  33. data/test/fixtures/sparse_directory_crate/ro-crate-preview.html +60 -59
  34. data/test/reader_test.rb +83 -40
  35. data/test/test_helper.rb +5 -1
  36. data/test/writer_test.rb +59 -2
  37. metadata +26 -8
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 88f4e08570b547ac985a759af19d619747dd78a9001cd7994f314a42c014e001
4
- data.tar.gz: e3711f54d12c1b4d6b1d68d16923dd051ae2b3a01a28a1d1e155a471bb84bfab
3
+ metadata.gz: 364757350da7d275b02def348bfd97782ad0ae31728c44f0691bc084e4d94ac9
4
+ data.tar.gz: 6b56ae58b682a4618dcb92ca78cbae720426316bc8ec50b9377fb9fda8ec9137
5
5
  SHA512:
6
- metadata.gz: 2a468ad85761945fc782b5ccb6943e4d1b53c7bb528dba59ed19f93ab2af08017f212d93bf47f9af983b6f00fd304dd1b2edfaef5e096f04ffdb181fd75352e5
7
- data.tar.gz: 89edb2f44a6842a7409c2e3107e76b3c401533738c32e6c18faecc6c9c29e6fd71a0d06fbe20f67fdec62564d00b3c168da36b29662ab8badfe652d923b7ab58
6
+ metadata.gz: a2b14f9c64528476a18f84d0292311484e975a3dc6085c00c94f5dac8d84fa144e91bb6138c876db4a4079aeec4222eb6a1f91458e2ef0260defa9d3a0a42d57
7
+ data.tar.gz: b56d257868b7cb94f341e844223d14988c280fff34bb54dd1a676d47681d96afc8193ebb60a22eecfef7e96dc0d2ce03ed1571190a1717fbe56d1b80ed579053
data/Gemfile.lock CHANGED
@@ -1,31 +1,31 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- ro-crate (0.4.9)
5
- addressable (~> 2.7.0)
4
+ ro-crate (0.4.13)
5
+ addressable (>= 2.7, < 2.9)
6
6
  rubyzip (~> 2.0.0)
7
7
 
8
8
  GEM
9
9
  remote: https://rubygems.org/
10
10
  specs:
11
- addressable (2.7.0)
11
+ addressable (2.8.0)
12
12
  public_suffix (>= 2.0.2, < 5.0)
13
13
  crack (0.4.3)
14
14
  safe_yaml (~> 1.0.0)
15
- docile (1.3.1)
15
+ docile (1.3.5)
16
16
  hashdiff (1.0.1)
17
- json (2.3.1)
18
- power_assert (0.4.1)
19
- public_suffix (4.0.3)
17
+ power_assert (1.1.3)
18
+ public_suffix (4.0.6)
20
19
  rake (13.0.0)
21
20
  rubyzip (2.0.0)
22
21
  safe_yaml (1.0.5)
23
- simplecov (0.16.1)
22
+ simplecov (0.21.2)
24
23
  docile (~> 1.1)
25
- json (>= 1.8, < 3)
26
- simplecov-html (~> 0.10.0)
27
- simplecov-html (0.10.2)
28
- test-unit (3.2.3)
24
+ simplecov-html (~> 0.11)
25
+ simplecov_json_formatter (~> 0.1)
26
+ simplecov-html (0.12.3)
27
+ simplecov_json_formatter (0.1.2)
28
+ test-unit (3.2.9)
29
29
  power_assert
30
30
  webmock (3.8.3)
31
31
  addressable (>= 2.3.6)
@@ -39,7 +39,7 @@ PLATFORMS
39
39
  DEPENDENCIES
40
40
  rake (~> 13.0.0)
41
41
  ro-crate!
42
- simplecov (~> 0.16.1)
42
+ simplecov (~> 0.21.2)
43
43
  test-unit (~> 3.2.3)
44
44
  webmock (~> 3.8.3)
45
45
  yard (~> 0.9.25)
data/README.md CHANGED
@@ -1,5 +1,7 @@
1
1
  # ro-crate-ruby
2
2
 
3
+ ![Tests](https://github.com/ResearchObject/ro-crate-ruby/actions/workflows/tests.yml/badge.svg)
4
+
3
5
  This is a WIP gem for creating, manipulating and reading RO-Crates (conforming to version 1.1 of the specification).
4
6
 
5
7
  * RO-Crate - https://researchobject.github.io/ro-crate/
@@ -17,6 +19,43 @@ and run `bundle install`.
17
19
 
18
20
  ## Usage
19
21
 
22
+ This gem consists of a hierarchy of classes to model RO-Crate "entities": the crate itself, data entities
23
+ (files and directories) and contextual entities (with a limited set of specializations, such as `ROCrate::Person`).
24
+ They are all descendants of the `ROCrate::Entity` class, with the `ROCrate::Crate` class representing the crate itself.
25
+
26
+ The `ROCrate::Reader` class handles reading of RO-Crates into the above model, from a Zip file or directory.
27
+
28
+ The `ROCrate::Writer` class can write out an `ROCrate::Crate` instance into a Zip file or directory.
29
+
30
+ **Note:** for performance reasons, the gem is currently not linked-data aware and will allow you to set properties that
31
+ are not semantically valid.
32
+
33
+ ### Entities
34
+ Entities correspond to entries in the `@graph` of the RO-Crate's metadata JSON-LD file. Each entity class is
35
+ basically a wrapper around a set of JSON properties, with some convenience methods for getting/setting some
36
+ commonly used properties (`crate.name = "My first crate"`).
37
+
38
+ These convenience getter/setter methods will automatically handle turning objects into references and adding them to the
39
+ `@graph` if necessary.
40
+
41
+ ##### Getting/Setting Arbitrary Properties of Entities
42
+ As well as using the pre-defined getter/setter methods, you can get/set arbitrary properties like so.
43
+
44
+ To set the "creativeWorkStatus" property of the RO-Crate itself to a string literal:
45
+ ```ruby
46
+ crate['creativeWorkStatus'] = 'work-in-progress'
47
+ ```
48
+
49
+ If you want to reference other entities in the crate, you can get a JSON-LD reference from an entity object by using the `reference` method:
50
+ ```ruby
51
+ joe = crate.add_person('joe', { name: 'Joe Bloggs' }) # Add the entity to the @graph
52
+ crate['copyrightHolder'] = joe.reference # Reference the entity from the "copyrightHolder" property
53
+ ```
54
+ and to resolve those references back to the object, use the `dereference` method:
55
+ ```ruby
56
+ joe = crate['copyrightHolder'].dereference
57
+ ```
58
+
20
59
  ### Documentation
21
60
 
22
61
  [Click here for API documentation](https://www.researchobject.org/ro-crate-ruby/).
data/lib/ro_crate/model/crate.rb CHANGED
@@ -25,6 +25,15 @@ module ROCrate
25
25
  super(self, nil, id, properties)
26
26
  end
27
27
 
28
+ ##
29
+ # Lookup an Entity using the given ID (in this Entity's crate).
30
+ #
31
+ # @param id [String] The ID to query.
32
+ # @return [Entity, nil]
33
+ def dereference(id)
34
+ entities.detect { |e| e.canonical_id == crate.resolve_id(id) } if id
35
+ end
36
+
28
37
  ##
29
38
  # Create a new file and add it to the crate.
30
39
  #
@@ -168,6 +177,15 @@ module ROCrate
168
177
  @preview ||= ROCrate::Preview.new(self)
169
178
  end
170
179
 
180
+ ##
181
+ # Set the RO-Crate preview file
182
+ # @param preview [Preview] the preview to set.
183
+ #
184
+ # @return [Preview]
185
+ def preview=(preview)
186
+ @preview = claim(preview)
187
+ end
188
+
171
189
  ##
172
190
  # All the entities within the crate. Includes contextual entities, data entities, the crate itself and its metadata file.
173
191
  #
@@ -220,32 +238,56 @@ module ROCrate
220
238
  entity.class.new(crate, entity.id, entity.raw_properties)
221
239
  end
222
240
 
223
- alias_method :own_entries, :entries
241
+ alias_method :own_payload, :payload
224
242
  ##
225
- # # The RO-Crate's "payload" of the crate - a map of all the files/directories contained in the RO-Crate, where the
226
- # key is the destination path within the crate and the value is an Entry where the source data can be read.
243
+ # The file payload of the RO-Crate - a map of all the files/directories contained in the RO-Crate, where the
244
+ # key is the path relative to the crate's root, and the value is an Entry where the source data can be read.
227
245
  #
228
246
  # @return [Hash{String => Entry}>]
229
- def entries
247
+ def payload
230
248
  # Gather a map of entries, starting from the crate itself, then any directory data entities, then finally any
231
249
  # file data entities. This ensures in the case of a conflict, the more "specific" data entities take priority.
232
- entries = own_entries
250
+ entries = own_payload
233
251
  non_self_entities = default_entities.reject { |e| e == self }
234
252
  sorted_entities = (non_self_entities | data_entities).sort_by { |e| e.is_a?(ROCrate::Directory) ? 0 : 1 }
235
253
 
236
254
  sorted_entities.each do |entity|
237
- entity.entries.each do |path, entry|
255
+ entity.payload.each do |path, entry|
238
256
  entries[path] = entry
239
257
  end
240
258
  end
241
259
 
242
260
  entries
243
261
  end
262
+ alias_method :entries, :payload
244
263
 
245
264
  def get_binding
246
265
  binding
247
266
  end
248
267
 
268
+ ##
269
+ # Remove the entity from the RO-Crate.
270
+ #
271
+ # @param entity [Entity, String] The entity or ID of an entity to remove from the crate.
272
+ # @param remove_orphaned [Boolean] Should linked contextual entities also be removed from the crate if they are left
273
+ # dangling (nothing else is linked to them)?
274
+ #
275
+ # @return [Entity, nil] The entity that was deleted, or nil if nothing was deleted.
276
+ def delete(entity, remove_orphaned: true)
277
+ entity = dereference(entity) if entity.is_a?(String)
278
+ return unless entity
279
+
280
+ deleted = data_entities.delete(entity) || contextual_entities.delete(entity)
281
+
282
+ if deleted && remove_orphaned
283
+ crate_entities = crate.linked_entities(deep: true)
284
+ to_remove = (entity.linked_entities(deep: true) - crate_entities)
285
+ to_remove.each(&:delete)
286
+ end
287
+
288
+ deleted
289
+ end
290
+
249
291
  private
250
292
 
251
293
  def full_entry_path(relative_path)
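The orphan clean-up performed by the new `Crate#delete` can be easier to follow on a toy graph. The following is a minimal sketch and not the gem's implementation: `Node` and `ToyCrate` are hypothetical stand-ins, and only the idea (after deleting an entity, also drop whatever it linked to that the root can no longer reach) is taken from the hunk above.

```ruby
require 'set'

# Hypothetical stand-in for an entity: an ID plus the nodes it links to.
class Node
  attr_reader :id, :links

  def initialize(id, links = [])
    @id = id
    @links = links
  end
end

# Hypothetical stand-in for a crate; not part of the ro-crate gem.
class ToyCrate
  attr_reader :root, :nodes

  def initialize(root)
    @root = root
    @nodes = []
  end

  def add(node)
    @nodes << node
    node
  end

  # Collect every node reachable from `start` by following links transitively.
  # The `seen` set stops the recursion from looping on cycles.
  def reachable_from(start, seen = Set.new)
    start.links.each do |n|
      next if seen.include?(n)

      seen << n
      reachable_from(n, seen)
    end
    seen
  end

  # Remove `node`; optionally also remove anything it linked to that the
  # root can no longer reach. Returns the deleted node, or nil.
  def delete(node, remove_orphaned: true)
    deleted = @nodes.delete(node)
    return nil unless deleted

    @root.links.delete(node)
    if remove_orphaned
      still_linked = reachable_from(@root)
      orphans = reachable_from(node).reject { |n| still_linked.include?(n) }
      orphans.each { |n| @nodes.delete(n) }
    end
    deleted
  end
end

shared = Node.new('#org')
alice  = Node.new('#alice', [shared])
bob    = Node.new('#bob')
file_a = Node.new('a.txt', [alice, bob])
file_b = Node.new('b.txt', [alice])

crate = ToyCrate.new(Node.new('./', [file_a, file_b]))
[shared, alice, bob, file_a, file_b].each { |n| crate.add(n) }

crate.delete(file_a)
remaining = crate.nodes.map(&:id)
```

In this sketch `#bob` is removed along with `a.txt` because nothing else links to it, while `#alice` and `#org` survive because `b.txt` still references them.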
data/lib/ro_crate/model/data_entity.rb CHANGED
@@ -3,6 +3,8 @@ module ROCrate
3
3
  # A class to represent a "Data Entity" within an RO-Crate.
4
4
  # Data Entities are the actual physical files and directories within the Crate.
5
5
  class DataEntity < Entity
6
+ properties(%w[name contentSize dateModified encodingFormat identifier sameAs author])
7
+
6
8
  def self.format_local_id(id)
7
9
  super.chomp('/')
8
10
  end
@@ -13,8 +15,6 @@ module ROCrate
13
15
  # @return [Class]
14
16
  def self.specialize(props)
15
17
  type = props['@type']
16
- id = props['@id']
17
- abs = URI(id)&.absolute? rescue false
18
18
  type = [type] unless type.is_a?(Array)
19
19
  if type.include?('Dataset')
20
20
  ROCrate::Directory
@@ -24,12 +24,13 @@ module ROCrate
24
24
  end
25
25
 
26
26
  ##
27
- # A map of all the files/directories associated with this DataEntity.
27
+ # The payload of all the files/directories associated with this DataEntity, mapped by their relative file path.
28
28
  #
29
29
  # @return [Hash{String => Entry}>] The key is the location within the crate, and the value is an Entry.
30
- def entries
30
+ def payload
31
31
  {}
32
32
  end
33
+ alias_method :entries, :payload
33
34
 
34
35
  ##
35
36
  # A disk-safe filepath based on the ID of this DataEntity.
data/lib/ro_crate/model/directory.rb CHANGED
@@ -2,8 +2,6 @@ module ROCrate
2
2
  ##
3
3
  # A data entity that represents a directory of potentially many files and subdirectories (or none).
4
4
  class Directory < DataEntity
5
- properties(%w[name contentSize dateModified encodingFormat identifier sameAs])
6
-
7
5
  def self.format_local_id(id)
8
6
  super + '/'
9
7
  end
@@ -30,11 +28,11 @@ module ROCrate
30
28
  end
31
29
 
32
30
  ##
33
- # The "payload" of this directory - a map of all the files/directories, where the key is the destination path
31
+ # The payload of this directory - a map of all the files/directories, where the key is the destination path
34
32
  # within the crate and the value is an Entry where the source data can be read.
35
33
  #
36
34
  # @return [Hash{String => Entry}>]
37
- def entries
35
+ def payload
38
36
  entries = {}
39
37
  entries[filepath.chomp('/')] = @entry if @entry
40
38
 
data/lib/ro_crate/model/entity.rb CHANGED
@@ -129,16 +129,26 @@ module ROCrate
129
129
  # @param id [String] The ID to query.
130
130
  # @return [Entity, nil]
131
131
  def dereference(id)
132
- crate.entities.detect { |e| e.canonical_id == crate.resolve_id(id) } if id
132
+ crate.dereference(id)
133
133
  end
134
-
135
134
  alias_method :get, :dereference
136
135
 
136
+ ##
137
+ # Remove this entity from the RO-Crate.
138
+ #
139
+ # @param remove_orphaned [Boolean] Should linked contextual entities also be removed from the crate (if nothing else is linked to them)?
140
+ #
141
+ # @return [Entity, nil] This entity, or nil if nothing was deleted.
142
+ def delete(remove_orphaned: true)
143
+ crate.delete(self, remove_orphaned: remove_orphaned)
144
+ end
145
+
137
146
  def id
138
147
  @properties['@id']
139
148
  end
140
149
 
141
150
  def id=(id)
151
+ @canonical_id = nil
142
152
  @properties['@id'] = self.class.format_id(id)
143
153
  end
144
154
 
@@ -190,13 +200,13 @@ module ROCrate
190
200
  #
191
201
  # @return [Addressable::URI]
192
202
  def canonical_id
193
- crate.resolve_id(id)
203
+ @canonical_id ||= crate.resolve_id(id)
194
204
  end
195
205
 
196
206
  ##
197
207
  # Is this entity local to the crate or an external reference?
198
208
  #
199
- # @return [boolean]
209
+ # @return [Boolean]
200
210
  def external?
201
211
  crate.canonical_id.host != canonical_id.host
202
212
  end
@@ -226,6 +236,33 @@ module ROCrate
226
236
  @properties.has_type?(type)
227
237
  end
228
238
 
239
+ ##
240
+ # Gather a list of entities linked to this one through its properties.
241
+ # @param deep [Boolean] If false, only consider direct links, otherwise consider transitive links.
242
+ # @param linked [Hash{String => Entity}] Discovered entities, mapped by their ID, to avoid loops when recursing.
243
+ # @return [Array<Entity>]
244
+ def linked_entities(deep: false, linked: {})
245
+ properties.each do |key, value|
246
+ value = [value] if value.is_a?(JSONLDHash)
247
+
248
+ if value.is_a?(Array)
249
+ value.each do |v|
250
+ if v.is_a?(JSONLDHash) && !linked.key?(v['@id'])
251
+ entity = v.dereference
252
+ linked[entity.id] = entity if entity
253
+ if deep
254
+ entity.linked_entities(deep: true, linked: linked).each do |e|
255
+ linked[e.id] = e
256
+ end
257
+ end
258
+ end
259
+ end
260
+ end
261
+ end
262
+
263
+ linked.values.compact
264
+ end
265
+
229
266
  private
230
267
 
231
268
  def default_properties
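The loop guard used by `linked_entities` (a hash of already discovered entities that is threaded through the recursion) can be shown on a small ID graph. This is a sketch under assumptions rather than the gem's code: `linked_ids` and the sample graph are hypothetical, and plain ID strings replace the `JSONLDHash` references.

```ruby
# Collect IDs linked from `id` in `graph` (a Hash of ID => referenced IDs).
# With deep: true, links are followed transitively; the `linked` hash doubles
# as a "seen" set so that cycles terminate.
def linked_ids(graph, id, deep: false, linked: {})
  graph.fetch(id, []).each do |target|
    next if linked.key?(target)

    linked[target] = true
    linked_ids(graph, target, deep: true, linked: linked) if deep
  end
  linked.keys
end

# Hypothetical data: '#org' links back to '#alice', forming a cycle.
graph = {
  './'     => ['a.txt', '#alice'],
  'a.txt'  => ['#alice'],
  '#alice' => ['#org'],
  '#org'   => ['#alice']
}

direct = linked_ids(graph, './')
deep   = linked_ids(graph, './', deep: true)
```

The shallow call only reports the two direct links, while the deep call also reaches `#org` and still terminates despite the cycle.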
data/lib/ro_crate/model/entry.rb CHANGED
@@ -14,10 +14,10 @@ module ROCrate
14
14
  end
15
15
 
16
16
  ##
17
- # Write the source to the destination via a buffer.
17
+ # Write the entry's source to the destination via a buffer.
18
18
  #
19
19
  # @param dest [#write] An IO-like destination to write to.
20
- def write(dest)
20
+ def write_to(dest)
21
21
  input = source
22
22
  input = input.open('rb') if input.is_a?(Pathname)
23
23
  while (buff = input.read(4096))
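The hunk above cuts off at the start of the copy loop. A buffered copy of this kind typically looks like the sketch below. The loop body is an assumption (it is not shown in the diff), and `copy_stream_buffered` is a hypothetical name.

```ruby
require 'stringio'

# Copy everything from an IO-like source to an IO-like destination in
# fixed-size chunks, so large files are never held in memory whole.
# Assumption: the destination only needs to respond to #write.
def copy_stream_buffered(input, dest, chunk_size = 4096)
  # IO#read(n) returns nil at end of input, which ends the loop.
  while (buff = input.read(chunk_size))
    dest.write(buff)
  end
  dest
end

source = StringIO.new('x' * 10_000)
target = StringIO.new
copy_stream_buffered(source, target)
```

Ruby's standard `IO.copy_stream` covers similar ground for many IO-like objects; the explicit loop is shown only because it mirrors the structure visible in the diff.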
data/lib/ro_crate/model/file.rb CHANGED
@@ -2,14 +2,12 @@ module ROCrate
2
2
  ##
3
3
  # A data entity that represents a single file.
4
4
  class File < DataEntity
5
- properties(%w[name contentSize dateModified encodingFormat identifier sameAs])
6
-
7
5
  ##
8
6
  # Create a new ROCrate::File. PLEASE NOTE, the new file will not be added to the crate. To do this, call
9
7
  # Crate#add_data_entity, or just use Crate#add_file.
10
8
  #
11
9
  # @param crate [Crate] The RO-Crate that owns this file.
12
- # @param source [String, Pathname, ::File, #read, URI, nil] The source on the disk (or on the internet if a URI) where this file will be read.
10
+ # @param source [String, Pathname, ::File, URI, nil, #read] The source on the disk (or on the internet if a URI) where this file will be read.
13
11
  # @param crate_path [String] The relative path within the RO-Crate where this file will be written.
14
12
  # @param properties [Hash{String => Object}] A hash of JSON-LD properties to associate with this file.
15
13
  def initialize(crate, source, crate_path = nil, properties = {})
@@ -58,7 +56,7 @@ module ROCrate
58
56
  # (for compatibility with Directory#entries)
59
57
  #
60
58
  # @return [Hash{String => Entry}>] The key is the location within the crate, and the value is an Entry.
61
- def entries
59
+ def payload
62
60
  remote? ? {} : { filepath => source }
63
61
  end
64
62
 
data/lib/ro_crate/model/metadata.rb CHANGED
@@ -17,7 +17,15 @@ module ROCrate
17
17
  # @return [String] The rendered JSON-LD as a "prettified" string.
18
18
  def generate
19
19
  graph = crate.entities.map(&:properties).reject(&:empty?)
20
- JSON.pretty_generate('@context' => CONTEXT, '@graph' => graph)
20
+ JSON.pretty_generate('@context' => context, '@graph' => graph)
21
+ end
22
+
23
+ def context
24
+ @context || CONTEXT
25
+ end
26
+
27
+ def context= c
28
+ @context = c
21
29
  end
22
30
 
23
31
  private
data/lib/ro_crate/model/organization.rb CHANGED
@@ -2,7 +2,7 @@ module ROCrate
2
2
  ##
3
3
  # A contextual entity that represents an organization.
4
4
  class Organization < ContextualEntity
5
- properties(['name'])
5
+ properties(%w[name])
6
6
 
7
7
  private
8
8
 
data/lib/ro_crate/model/preview.rb CHANGED
@@ -12,26 +12,14 @@ module ROCrate
12
12
  # @return [String]
13
13
  attr_accessor :template
14
14
 
15
- def initialize(crate, properties = {})
15
+ def initialize(crate, source = nil, properties = {})
16
+ source ||= PreviewGenerator.new(self)
16
17
  @template = nil
17
- super(crate, nil, IDENTIFIER, properties)
18
- end
19
-
20
- ##
21
- # Generate the crate's `ro-crate-preview.html`.
22
- # @return [String] The rendered HTML as a string.
23
- def generate
24
- b = crate.get_binding
25
- renderer = ERB.new(template || ::File.read(DEFAULT_TEMPLATE))
26
- renderer.result(b)
18
+ super(crate, source, IDENTIFIER, properties)
27
19
  end
28
20
 
29
21
  private
30
22
 
31
- def source
32
- Entry.new(StringIO.new(generate))
33
- end
34
-
35
23
  def default_properties
36
24
  {
37
25
  '@id' => IDENTIFIER,
data/lib/ro_crate/model/preview_generator.rb ADDED
@@ -0,0 +1,40 @@
1
+ require 'erb'
2
+
3
+ module ROCrate
4
+ ##
5
+ # A class to handle generation of an RO-Crate's preview HTML in an IO-like way (to fit into an Entry).
6
+ class PreviewGenerator
7
+ ##
8
+ # @param preview [Preview] The RO-Crate preview object.
9
+ def initialize(preview)
10
+ @preview = preview
11
+ end
12
+
13
+ def read(*args)
14
+ io.read(*args)
15
+ end
16
+
17
+ ##
18
+ # Generate the crate's `ro-crate-preview.html`.
19
+ # @return [String] The rendered HTML as a string.
20
+ def generate
21
+ b = crate.get_binding
22
+ renderer = ERB.new(template)
23
+ renderer.result(b)
24
+ end
25
+
26
+ def template
27
+ @preview.template || ::File.read(Preview::DEFAULT_TEMPLATE)
28
+ end
29
+
30
+ def crate
31
+ @preview.crate
32
+ end
33
+
34
+ private
35
+
36
+ def io
37
+ @io ||= StringIO.new(generate)
38
+ end
39
+ end
40
+ end
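The pattern `PreviewGenerator` follows (expose `read`, but only render the content when something first reads from it) can be sketched independently of the gem. `LazyRenderedIO` is a hypothetical class, and it renders from a values hash rather than from a crate binding as the gem does.

```ruby
require 'erb'
require 'stringio'

# Hypothetical class for illustration: renders an ERB template on first read
# and then behaves like a read-only IO.
class LazyRenderedIO
  def initialize(template, values)
    @template = template
    @values = values
  end

  def read(*args)
    io.read(*args)
  end

  private

  # Rendering is deferred until something actually reads from the object,
  # and the resulting StringIO is memoized so reads continue where they left off.
  def io
    @io ||= StringIO.new(ERB.new(@template).result_with_hash(@values))
  end
end

page = LazyRenderedIO.new('<h1><%= name %></h1>', { name: 'My first crate' })
first_chunk = page.read(4)
rest = page.read
```

Because the underlying `StringIO` is memoized in this sketch, the rendered content is consumed once; a second full read returns an empty string unless the object is recreated.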
data/lib/ro_crate/model/remote_entry.rb CHANGED
@@ -2,7 +2,7 @@ module ROCrate
2
2
  ##
3
3
  # A class to represent a reference within an RO-Crate, to a remote file held on the internet somewhere.
4
4
  # It handles the actual reading/writing of bytes.
5
- class RemoteEntry
5
+ class RemoteEntry < Entry
6
6
  attr_reader :uri
7
7
 
8
8
  ##
@@ -13,17 +13,6 @@ module ROCrate
13
13
  @uri = uri
14
14
  end
15
15
 
16
- def write(dest)
17
- raise 'Cannot write to a remote entry!'
18
- end
19
-
20
- ##
21
- # Read from the source.
22
- #
23
- def read
24
- source.read
25
- end
26
-
27
16
  ##
28
17
  # @return [IO] An IO object for the remote resource.
29
18
  #
data/lib/ro_crate/reader.rb CHANGED
@@ -84,7 +84,10 @@ module ROCrate
84
84
  def self.read_zip(source, target_dir: Dir.mktmpdir)
85
85
  unzip_to(source, target_dir)
86
86
 
87
- read_directory(target_dir)
87
+ # Traverse the unzipped directory to try and find the crate's root
88
+ root_dir = detect_root_directory(target_dir)
89
+
90
+ read_directory(root_dir)
88
91
  end
89
92
 
90
93
  ##
@@ -100,8 +103,12 @@ module ROCrate
100
103
  entry == ROCrate::Metadata::IDENTIFIER_1_0 }
101
104
 
102
105
  if metadata_file
103
- entities = entities_from_metadata(::File.read(::File.join(source, metadata_file)))
104
- build_crate(entities, source)
106
+ metadata_json = ::File.read(::File.join(source, metadata_file))
107
+ metadata = JSON.parse(metadata_json)
108
+ entities = entities_from_metadata(metadata)
109
+ context = metadata['@context']
110
+
111
+ build_crate(entities, source, context: context)
105
112
  else
106
113
  raise 'No metadata found!'
107
114
  end
@@ -110,10 +117,9 @@ module ROCrate
110
117
  ##
111
118
  # Extracts all the entities from the @graph of the RO-Crate Metadata.
112
119
  #
113
- # @param metadata_json [String] A string containing the metadata JSON.
120
+ # @param metadata [Hash] A Hash containing the parsed metadata JSON.
114
121
  # @return [Hash{String => Hash}] A Hash of all the entities, mapped by their @id.
115
- def self.entities_from_metadata(metadata_json)
116
- metadata = JSON.parse(metadata_json)
122
+ def self.entities_from_metadata(metadata)
117
123
  graph = metadata['@graph']
118
124
 
119
125
  if graph
@@ -126,6 +132,7 @@ module ROCrate
126
132
  # Do some normalization...
127
133
  entities[ROCrate::Metadata::IDENTIFIER] = extract_metadata_entity(entities)
128
134
  raise "No metadata entity found in @graph!" unless entities[ROCrate::Metadata::IDENTIFIER]
135
+ entities[ROCrate::Preview::IDENTIFIER] = extract_preview_entity(entities)
129
136
  entities[ROCrate::Crate::IDENTIFIER] = extract_root_entity(entities)
130
137
  raise "No root entity (with @id: #{entities[ROCrate::Metadata::IDENTIFIER].dig('about', '@id')}) found in @graph!" unless entities[ROCrate::Crate::IDENTIFIER]
131
138
 
@@ -136,25 +143,50 @@ module ROCrate
136
143
  end
137
144
 
138
145
  ##
139
- # Create a crate from the given set of entities.
146
+ # Create and populate a crate from the given set of entities.
140
147
  #
141
148
  # @param entity_hash [Hash{String => Hash}] A Hash containing all the entities in the @graph, mapped by their @id.
142
149
  # @param source [String, ::File, Pathname] The location of the RO-Crate being read.
150
+ # @param crate_class [Class] The class to use to instantiate the crate,
151
+ # useful if you have created a subclass of ROCrate::Crate that you want to use. (defaults to ROCrate::Crate).
152
+ # @param context [nil, String, Array, Hash] A custom JSON-LD @context (parsed), or nil to use default.
143
153
  # @return [Crate] The RO-Crate.
144
- def self.build_crate(entity_hash, source)
145
- ROCrate::Crate.new.tap do |crate|
154
+ def self.build_crate(entity_hash, source, crate_class: ROCrate::Crate, context:)
155
+ crate = initialize_crate(entity_hash, source, crate_class: crate_class, context: context)
156
+
157
+ extract_data_entities(crate, source, entity_hash).each do |entity|
158
+ crate.add_data_entity(entity)
159
+ end
160
+
161
+ # The remaining entities in the hash must be contextual.
162
+ extract_contextual_entities(crate, entity_hash).each do |entity|
163
+ crate.add_contextual_entity(entity)
164
+ end
165
+
166
+ crate
167
+ end
168
+
169
+ ##
170
+ # Initialize a crate from the given set of entities.
171
+ #
172
+ # @param entity_hash [Hash{String => Hash}] A Hash containing all the entities in the @graph, mapped by their @id.
173
+ # @param source [String, ::File, Pathname] The location of the RO-Crate being read.
174
+ # @param crate_class [Class] The class to use to instantiate the crate,
175
+ # useful if you have created a subclass of ROCrate::Crate that you want to use. (defaults to ROCrate::Crate).
176
+ # @param context [nil, String, Array, Hash] A custom JSON-LD @context (parsed), or nil to use default.
177
+ # @return [Crate] The RO-Crate.
178
+ def self.initialize_crate(entity_hash, source, crate_class: ROCrate::Crate, context:)
179
+ crate_class.new.tap do |crate|
146
180
  crate.properties = entity_hash.delete(ROCrate::Crate::IDENTIFIER)
147
181
  crate.metadata.properties = entity_hash.delete(ROCrate::Metadata::IDENTIFIER)
182
+ crate.metadata.context = context
148
183
  preview_properties = entity_hash.delete(ROCrate::Preview::IDENTIFIER)
149
- crate.preview.properties = preview_properties if preview_properties
150
- crate.add_all(source, false)
151
- extract_data_entities(crate, source, entity_hash).each do |entity|
152
- crate.add_data_entity(entity)
153
- end
154
- # The remaining entities in the hash must be contextual.
155
- extract_contextual_entities(crate, entity_hash).each do |entity|
156
- crate.add_contextual_entity(entity)
184
+ preview_path = ::File.join(source, ROCrate::Preview::IDENTIFIER)
185
+ preview_path = ::File.exists?(preview_path) ? Pathname.new(preview_path) : nil
186
+ if preview_properties || preview_path
187
+ crate.preview = ROCrate::Preview.new(crate, preview_path, preview_properties || {})
157
188
  end
189
+ crate.add_all(source, false)
158
190
  end
159
191
  end
160
192
 
@@ -226,8 +258,8 @@ module ROCrate
226
258
  ##
227
259
  # Extract the metadata entity from the entity hash, according to the rules defined here:
228
260
  # https://www.researchobject.org/ro-crate/1.1/root-data-entity.html#finding-the-root-data-entity
229
- # @return [Hash{String => Hash}] A Hash containing (hopefully) one value, the metadata entity's properties,
230
- # mapped by its @id.
261
+ # @return [nil, Hash{String => Hash}] A Hash containing (hopefully) one value, the metadata entity's properties
262
+ # mapped by its @id, or nil if nothing is found.
231
263
  def self.extract_metadata_entity(entities)
232
264
  key = entities.detect do |_, props|
233
265
  props.dig('conformsTo', '@id')&.start_with?(ROCrate::Metadata::RO_CRATE_BASE)
@@ -242,6 +274,13 @@ module ROCrate
242
274
  entities.delete(ROCrate::Metadata::IDENTIFIER_1_0))
243
275
  end
244
276
 
277
+ ##
278
+ # Extract the ro-crate-preview entity from the entity hash.
279
+ # @return [Hash{String => Hash}] A Hash containing the preview entity's properties mapped by its @id, or nil if nothing is found.
280
+ def self.extract_preview_entity(entities)
281
+ entities.delete("./#{ROCrate::Preview::IDENTIFIER}") || entities.delete(ROCrate::Preview::IDENTIFIER)
282
+ end
283
+
245
284
  ##
246
285
  # Extract the root entity from the entity hash, according to the rules defined here:
247
286
  # https://www.researchobject.org/ro-crate/1.1/root-data-entity.html#finding-the-root-data-entity
@@ -252,5 +291,23 @@ module ROCrate
252
291
  raise "Metadata entity does not reference any root entity" unless root_id
253
292
  entities.delete(root_id)
254
293
  end
294
+
295
+ ##
296
+ # Finds an RO-Crate's root directory (where `ro-crate-metadata.json` is located) within a given directory.
297
+ #
298
+ # @param source [String, ::File, Pathname] The location of the directory.
299
+ # @return [Pathname, nil] The path to the root, or nil if not found.
300
+ def self.detect_root_directory(source)
301
+ Pathname(source).find do |entry|
302
+ if entry.file?
303
+ name = entry.basename.to_s
304
+ if name == ROCrate::Metadata::IDENTIFIER || name == ROCrate::Metadata::IDENTIFIER_1_0
305
+ return entry.parent
306
+ end
307
+ end
308
+ end
309
+
310
+ nil
311
+ end
255
312
  end
256
313
  end
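The root detection added to the reader (walk the unzipped tree and return the parent of the first metadata file found) can be tried in isolation with only the standard library. This is a sketch under assumptions: `find_crate_root` is a hypothetical name, and the two default file names are assumed to match the values of `ROCrate::Metadata::IDENTIFIER` and `IDENTIFIER_1_0`, which this diff does not show.

```ruby
require 'pathname'
require 'tmpdir'
require 'fileutils'

# Walk `dir` and return the directory containing the first metadata file
# found, or nil if there is none. The default names are an assumption based
# on the RO-Crate 1.1 and 1.0 metadata file names.
def find_crate_root(dir, names = ['ro-crate-metadata.json', 'ro-crate-metadata.jsonld'])
  Pathname(dir).find do |entry|
    return entry.parent if entry.file? && names.include?(entry.basename.to_s)
  end
  nil
end

found = nil
missing = :unset
Dir.mktmpdir do |tmp|
  # Simulate a zip whose crate sits two levels below the archive root.
  nested = File.join(tmp, 'wrapper', 'my-crate')
  FileUtils.mkdir_p(nested)
  File.write(File.join(nested, 'ro-crate-metadata.json'), '{}')
  found = find_crate_root(tmp)&.relative_path_from(Pathname(tmp))&.to_s

  # A directory without any metadata file yields nil.
  empty = File.join(tmp, 'empty')
  FileUtils.mkdir_p(empty)
  missing = find_crate_root(empty)
end
```

Note that callers need to handle the `nil` case themselves; in the sketch a directory without a metadata file simply yields `nil` rather than raising.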