ro-crate 0.4.9 → 0.4.13

Files changed (37)
  1. checksums.yaml +4 -4
  2. data/Gemfile.lock +13 -13
  3. data/README.md +39 -0
  4. data/lib/ro_crate/model/crate.rb +48 -6
  5. data/lib/ro_crate/model/data_entity.rb +5 -4
  6. data/lib/ro_crate/model/directory.rb +2 -4
  7. data/lib/ro_crate/model/entity.rb +41 -4
  8. data/lib/ro_crate/model/entry.rb +2 -2
  9. data/lib/ro_crate/model/file.rb +2 -4
  10. data/lib/ro_crate/model/metadata.rb +9 -1
  11. data/lib/ro_crate/model/organization.rb +1 -1
  12. data/lib/ro_crate/model/preview.rb +3 -15
  13. data/lib/ro_crate/model/preview_generator.rb +40 -0
  14. data/lib/ro_crate/model/remote_entry.rb +1 -12
  15. data/lib/ro_crate/reader.rb +76 -19
  16. data/lib/ro_crate/writer.rb +4 -4
  17. data/lib/ro_crate.rb +2 -1
  18. data/ro_crate.gemspec +3 -3
  19. data/test/crate_test.rb +58 -3
  20. data/test/directory_test.rb +21 -21
  21. data/test/entity_test.rb +114 -0
  22. data/test/fixtures/biobb_hpc_workflows-condapack.zip +0 -0
  23. data/test/fixtures/conflicting_data_directory/info.txt +1 -0
  24. data/test/fixtures/conflicting_data_directory/nested.txt +1 -0
  25. data/test/fixtures/nested_directory.zip +0 -0
  26. data/test/fixtures/ro-crate-galaxy-sortchangecase/LICENSE +176 -0
  27. data/test/fixtures/ro-crate-galaxy-sortchangecase/README.md +6 -0
  28. data/test/fixtures/ro-crate-galaxy-sortchangecase/ro-crate-metadata.json +133 -0
  29. data/test/fixtures/ro-crate-galaxy-sortchangecase/sort-and-change-case.ga +118 -0
  30. data/test/fixtures/ro-crate-galaxy-sortchangecase/test/test1/input.bed +3 -0
  31. data/test/fixtures/ro-crate-galaxy-sortchangecase/test/test1/output_exp.bed +3 -0
  32. data/test/fixtures/ro-crate-galaxy-sortchangecase/test/test1/sort-and-change-case-test.yml +8 -0
  33. data/test/fixtures/sparse_directory_crate/ro-crate-preview.html +60 -59
  34. data/test/reader_test.rb +83 -40
  35. data/test/test_helper.rb +5 -1
  36. data/test/writer_test.rb +59 -2
  37. metadata +26 -8
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 88f4e08570b547ac985a759af19d619747dd78a9001cd7994f314a42c014e001
4
- data.tar.gz: e3711f54d12c1b4d6b1d68d16923dd051ae2b3a01a28a1d1e155a471bb84bfab
3
+ metadata.gz: 364757350da7d275b02def348bfd97782ad0ae31728c44f0691bc084e4d94ac9
4
+ data.tar.gz: 6b56ae58b682a4618dcb92ca78cbae720426316bc8ec50b9377fb9fda8ec9137
5
5
  SHA512:
6
- metadata.gz: 2a468ad85761945fc782b5ccb6943e4d1b53c7bb528dba59ed19f93ab2af08017f212d93bf47f9af983b6f00fd304dd1b2edfaef5e096f04ffdb181fd75352e5
7
- data.tar.gz: 89edb2f44a6842a7409c2e3107e76b3c401533738c32e6c18faecc6c9c29e6fd71a0d06fbe20f67fdec62564d00b3c168da36b29662ab8badfe652d923b7ab58
6
+ metadata.gz: a2b14f9c64528476a18f84d0292311484e975a3dc6085c00c94f5dac8d84fa144e91bb6138c876db4a4079aeec4222eb6a1f91458e2ef0260defa9d3a0a42d57
7
+ data.tar.gz: b56d257868b7cb94f341e844223d14988c280fff34bb54dd1a676d47681d96afc8193ebb60a22eecfef7e96dc0d2ce03ed1571190a1717fbe56d1b80ed579053
data/Gemfile.lock CHANGED
@@ -1,31 +1,31 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- ro-crate (0.4.9)
5
- addressable (~> 2.7.0)
4
+ ro-crate (0.4.13)
5
+ addressable (>= 2.7, < 2.9)
6
6
  rubyzip (~> 2.0.0)
7
7
 
8
8
  GEM
9
9
  remote: https://rubygems.org/
10
10
  specs:
11
- addressable (2.7.0)
11
+ addressable (2.8.0)
12
12
  public_suffix (>= 2.0.2, < 5.0)
13
13
  crack (0.4.3)
14
14
  safe_yaml (~> 1.0.0)
15
- docile (1.3.1)
15
+ docile (1.3.5)
16
16
  hashdiff (1.0.1)
17
- json (2.3.1)
18
- power_assert (0.4.1)
19
- public_suffix (4.0.3)
17
+ power_assert (1.1.3)
18
+ public_suffix (4.0.6)
20
19
  rake (13.0.0)
21
20
  rubyzip (2.0.0)
22
21
  safe_yaml (1.0.5)
23
- simplecov (0.16.1)
22
+ simplecov (0.21.2)
24
23
  docile (~> 1.1)
25
- json (>= 1.8, < 3)
26
- simplecov-html (~> 0.10.0)
27
- simplecov-html (0.10.2)
28
- test-unit (3.2.3)
24
+ simplecov-html (~> 0.11)
25
+ simplecov_json_formatter (~> 0.1)
26
+ simplecov-html (0.12.3)
27
+ simplecov_json_formatter (0.1.2)
28
+ test-unit (3.2.9)
29
29
  power_assert
30
30
  webmock (3.8.3)
31
31
  addressable (>= 2.3.6)
@@ -39,7 +39,7 @@ PLATFORMS
39
39
  DEPENDENCIES
40
40
  rake (~> 13.0.0)
41
41
  ro-crate!
42
- simplecov (~> 0.16.1)
42
+ simplecov (~> 0.21.2)
43
43
  test-unit (~> 3.2.3)
44
44
  webmock (~> 3.8.3)
45
45
  yard (~> 0.9.25)
data/README.md CHANGED
@@ -1,5 +1,7 @@
1
1
  # ro-crate-ruby
2
2
 
3
+ ![Tests](https://github.com/ResearchObject/ro-crate-ruby/actions/workflows/tests.yml/badge.svg)
4
+
3
5
  This is a WIP gem for creating, manipulating and reading RO-Crates (conforming to version 1.1 of the specification).
4
6
 
5
7
  * RO-Crate - https://researchobject.github.io/ro-crate/
@@ -17,6 +19,43 @@ and run `bundle install`.
17
19
 
18
20
  ## Usage
19
21
 
22
+ This gem consists of a hierarchy of classes to model RO-Crate "entities": the crate itself, data entities
23
+ (files and directories) and contextual entities (with a limited set of specializations, such as `ROCrate::Person`).
24
+ They are all descendants of the `ROCrate::Entity` class, with the `ROCrate::Crate` class representing the crate itself.
25
+
26
+ The `ROCrate::Reader` class handles reading of RO-Crates into the above model, from a Zip file or directory.
27
+
28
+ The `ROCrate::Writer` class can write out an `ROCrate::Crate` instance into a Zip file or directory.
29
+
30
+ **Note:** for performance reasons, the gem is currently not linked-data aware and will allow you to set properties that
31
+ are not semantically valid.
32
+
33
+ ### Entities
34
+ Entities correspond to entries in the `@graph` of the RO-Crate's metadata JSON-LD file. Each entity class is
35
+ basically a wrapper around a set of JSON properties, with some convenience methods for getting/setting some
36
+ commonly used properties (`crate.name = "My first crate"`).
37
+
38
+ These convenience getter/setter methods will automatically handle turning objects into references and adding them to the
39
+ `@graph` if necessary.
40
+
41
+ ##### Getting/Setting Arbitrary Properties of Entities
42
+ As well as using the pre-defined getter/setter methods, you can get/set arbitrary properties like so.
43
+
44
+ To set the "creativeWorkStatus" property of the RO-Crate itself to a string literal:
45
+ ```ruby
46
+ crate['creativeWorkStatus'] = 'work-in-progress'
47
+ ```
48
+
49
+ If you want to reference other entities in the crate, you can get a JSON-LD reference from an entity object by using the `reference` method:
50
+ ```ruby
51
+ joe = crate.add_person('joe', { name: 'Joe Bloggs' }) # Add the entity to the @graph
52
+ crate['copyrightHolder'] = joe.reference # Reference the entity from the "copyrightHolder" property
53
+ ```
54
+ and to resolve those references back to the object, use the `dereference` method:
55
+ ```ruby
56
+ joe = crate['copyrightHolder'].dereference
57
+ ```
58
+
20
59
  ### Documentation
21
60
 
22
61
  [Click here for API documentation](https://www.researchobject.org/ro-crate-ruby/).
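The `reference`/`dereference` round trip described in the README hunk above can be pictured with a small standalone sketch. It does not use the gem: `SketchEntity` and `SketchGraph` are hypothetical stand-ins, and the only thing taken from the README is the idea that a reference is a JSON-LD hash holding the target's `@id`.

```ruby
# Hypothetical stand-ins for illustration only; these are not the gem's classes.
SketchEntity = Struct.new(:id, :properties) do
  # A JSON-LD reference is a hash holding only the target's @id.
  def reference
    { '@id' => id }
  end
end

class SketchGraph
  def initialize
    @entities = {}
  end

  # Register an entity under its ID and return it.
  def add(entity)
    @entities[entity.id] = entity
  end

  # Resolve a JSON-LD reference (or a bare ID string) back to an entity.
  # Returns nil when nothing in the graph has that ID.
  def dereference(ref)
    id = ref.is_a?(Hash) ? ref['@id'] : ref
    @entities[id]
  end
end
```

In the gem itself the reference object carries its own `dereference` method, as the README shows; the sketch keeps the lookup on the graph only to stay self-contained.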
data/lib/ro_crate/model/crate.rb CHANGED
@@ -25,6 +25,15 @@ module ROCrate
25
25
  super(self, nil, id, properties)
26
26
  end
27
27
 
28
+ ##
29
+ # Lookup an Entity using the given ID (in this Entity's crate).
30
+ #
31
+ # @param id [String] The ID to query.
32
+ # @return [Entity, nil]
33
+ def dereference(id)
34
+ entities.detect { |e| e.canonical_id == crate.resolve_id(id) } if id
35
+ end
36
+
28
37
  ##
29
38
  # Create a new file and add it to the crate.
30
39
  #
@@ -168,6 +177,15 @@ module ROCrate
168
177
  @preview ||= ROCrate::Preview.new(self)
169
178
  end
170
179
 
180
+ ##
181
+ # Set the RO-Crate preview file
182
+ # @param preview [Preview] the preview to set.
183
+ #
184
+ # @return [Preview]
185
+ def preview=(preview)
186
+ @preview = claim(preview)
187
+ end
188
+
171
189
  ##
172
190
  # All the entities within the crate. Includes contextual entities, data entities, the crate itself and its metadata file.
173
191
  #
@@ -220,32 +238,56 @@ module ROCrate
220
238
  entity.class.new(crate, entity.id, entity.raw_properties)
221
239
  end
222
240
 
223
- alias_method :own_entries, :entries
241
+ alias_method :own_payload, :payload
224
242
  ##
225
- # # The RO-Crate's "payload" of the crate - a map of all the files/directories contained in the RO-Crate, where the
226
- # key is the destination path within the crate and the value is an Entry where the source data can be read.
243
+ # The file payload of the RO-Crate - a map of all the files/directories contained in the RO-Crate, where the
244
+ # key is the path relative to the crate's root, and the value is an Entry where the source data can be read.
227
245
  #
228
246
  # @return [Hash{String => Entry}>]
229
- def entries
247
+ def payload
230
248
  # Gather a map of entries, starting from the crate itself, then any directory data entities, then finally any
231
249
  # file data entities. This ensures in the case of a conflict, the more "specific" data entities take priority.
232
- entries = own_entries
250
+ entries = own_payload
233
251
  non_self_entities = default_entities.reject { |e| e == self }
234
252
  sorted_entities = (non_self_entities | data_entities).sort_by { |e| e.is_a?(ROCrate::Directory) ? 0 : 1 }
235
253
 
236
254
  sorted_entities.each do |entity|
237
- entity.entries.each do |path, entry|
255
+ entity.payload.each do |path, entry|
238
256
  entries[path] = entry
239
257
  end
240
258
  end
241
259
 
242
260
  entries
243
261
  end
262
+ alias_method :entries, :payload
244
263
 
245
264
  def get_binding
246
265
  binding
247
266
  end
248
267
 
268
+ ##
269
+ # Remove the entity from the RO-Crate.
270
+ #
271
+ # @param entity [Entity, String] The entity or ID of an entity to remove from the crate.
272
+ # @param remove_orphaned [Boolean] Should linked contextual entities also be removed from the crate if they are left
273
+ # dangling (nothing else is linked to them)?
274
+ #
275
+ # @return [Entity, nil] The entity that was deleted, or nil if nothing was deleted.
276
+ def delete(entity, remove_orphaned: true)
277
+ entity = dereference(entity) if entity.is_a?(String)
278
+ return unless entity
279
+
280
+ deleted = data_entities.delete(entity) || contextual_entities.delete(entity)
281
+
282
+ if deleted && remove_orphaned
283
+ crate_entities = crate.linked_entities(deep: true)
284
+ to_remove = (entity.linked_entities(deep: true) - crate_entities)
285
+ to_remove.each(&:delete)
286
+ end
287
+
288
+ deleted
289
+ end
290
+
249
291
  private
250
292
 
251
293
  def full_entry_path(relative_path)
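The ordering in `Crate#payload` above (crate first, then directories, then files) means that later, more specific entries overwrite earlier ones at the same path. A minimal sketch of that merge, assuming a hypothetical `SketchDataEntity` struct in place of the gem's entity classes and plain strings in place of `Entry` objects:

```ruby
# Hypothetical stand-in for a data entity: a kind plus its own path => source map.
SketchDataEntity = Struct.new(:kind, :payload)

# Merge payloads so that directories are applied first and files last,
# letting file entries overwrite directory entries at the same path.
def merged_payload(own_payload, entities)
  merged = own_payload.dup
  entities.sort_by { |e| e.kind == :directory ? 0 : 1 }.each do |entity|
    entity.payload.each { |path, entry| merged[path] = entry }
  end
  merged
end
```

Because the merge is a plain hash assignment, the last writer wins, so the sort order alone decides which entity owns a conflicting path.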
data/lib/ro_crate/model/data_entity.rb CHANGED
@@ -3,6 +3,8 @@ module ROCrate
3
3
  # A class to represent a "Data Entity" within an RO-Crate.
4
4
  # Data Entities are the actual physical files and directories within the Crate.
5
5
  class DataEntity < Entity
6
+ properties(%w[name contentSize dateModified encodingFormat identifier sameAs author])
7
+
6
8
  def self.format_local_id(id)
7
9
  super.chomp('/')
8
10
  end
@@ -13,8 +15,6 @@ module ROCrate
13
15
  # @return [Class]
14
16
  def self.specialize(props)
15
17
  type = props['@type']
16
- id = props['@id']
17
- abs = URI(id)&.absolute? rescue false
18
18
  type = [type] unless type.is_a?(Array)
19
19
  if type.include?('Dataset')
20
20
  ROCrate::Directory
@@ -24,12 +24,13 @@ module ROCrate
24
24
  end
25
25
 
26
26
  ##
27
- # A map of all the files/directories associated with this DataEntity.
27
+ # The payload of all the files/directories associated with this DataEntity, mapped by their relative file path.
28
28
  #
29
29
  # @return [Hash{String => Entry}>] The key is the location within the crate, and the value is an Entry.
30
- def entries
30
+ def payload
31
31
  {}
32
32
  end
33
+ alias_method :entries, :payload
33
34
 
34
35
  ##
35
36
  # A disk-safe filepath based on the ID of this DataEntity.
data/lib/ro_crate/model/directory.rb CHANGED
@@ -2,8 +2,6 @@ module ROCrate
2
2
  ##
3
3
  # A data entity that represents a directory of potentially many files and subdirectories (or none).
4
4
  class Directory < DataEntity
5
- properties(%w[name contentSize dateModified encodingFormat identifier sameAs])
6
-
7
5
  def self.format_local_id(id)
8
6
  super + '/'
9
7
  end
@@ -30,11 +28,11 @@ module ROCrate
30
28
  end
31
29
 
32
30
  ##
33
- # The "payload" of this directory - a map of all the files/directories, where the key is the destination path
31
+ # The payload of this directory - a map of all the files/directories, where the key is the destination path
34
32
  # within the crate and the value is an Entry where the source data can be read.
35
33
  #
36
34
  # @return [Hash{String => Entry}>]
37
- def entries
35
+ def payload
38
36
  entries = {}
39
37
  entries[filepath.chomp('/')] = @entry if @entry
40
38
 
data/lib/ro_crate/model/entity.rb CHANGED
@@ -129,16 +129,26 @@ module ROCrate
129
129
  # @param id [String] The ID to query.
130
130
  # @return [Entity, nil]
131
131
  def dereference(id)
132
- crate.entities.detect { |e| e.canonical_id == crate.resolve_id(id) } if id
132
+ crate.dereference(id)
133
133
  end
134
-
135
134
  alias_method :get, :dereference
136
135
 
136
+ ##
137
+ # Remove this entity from the RO-Crate.
138
+ #
139
+ # @param remove_orphaned [Boolean] Should linked contextual entities also be removed from the crate (if nothing else is linked to them)?
140
+ #
141
+ # @return [Entity, nil] This entity, or nil if nothing was deleted.
142
+ def delete(remove_orphaned: true)
143
+ crate.delete(self, remove_orphaned: remove_orphaned)
144
+ end
145
+
137
146
  def id
138
147
  @properties['@id']
139
148
  end
140
149
 
141
150
  def id=(id)
151
+ @canonical_id = nil
142
152
  @properties['@id'] = self.class.format_id(id)
143
153
  end
144
154
 
@@ -190,13 +200,13 @@ module ROCrate
190
200
  #
191
201
  # @return [Addressable::URI]
192
202
  def canonical_id
193
- crate.resolve_id(id)
203
+ @canonical_id ||= crate.resolve_id(id)
194
204
  end
195
205
 
196
206
  ##
197
207
  # Is this entity local to the crate or an external reference?
198
208
  #
199
- # @return [boolean]
209
+ # @return [Boolean]
200
210
  def external?
201
211
  crate.canonical_id.host != canonical_id.host
202
212
  end
@@ -226,6 +236,33 @@ module ROCrate
226
236
  @properties.has_type?(type)
227
237
  end
228
238
 
239
+ ##
240
+ # Gather a list of entities linked to this one through its properties.
241
+ # @param deep [Boolean] If false, only consider direct links, otherwise consider transitive links.
242
+ # @param linked [Hash{String => Entity}] Discovered entities, mapped by their ID, to avoid loops when recursing.
243
+ # @return [Array<Entity>]
244
+ def linked_entities(deep: false, linked: {})
245
+ properties.each do |key, value|
246
+ value = [value] if value.is_a?(JSONLDHash)
247
+
248
+ if value.is_a?(Array)
249
+ value.each do |v|
250
+ if v.is_a?(JSONLDHash) && !linked.key?(v['@id'])
251
+ entity = v.dereference
252
+ linked[entity.id] = entity if entity
253
+ if deep
254
+ entity.linked_entities(deep: true, linked: linked).each do |e|
255
+ linked[e.id] = e
256
+ end
257
+ end
258
+ end
259
+ end
260
+ end
261
+ end
262
+
263
+ linked.values.compact
264
+ end
265
+
229
266
  private
230
267
 
231
268
  def default_properties
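`linked_entities` above walks references transitively and passes a `linked` hash down the recursion so that cycles terminate; `Crate#delete` then uses the result to find entities that nothing else links to. The sketch below reproduces both ideas on a plain hash of IDs. The graph shape and both method names are assumptions made for illustration, not the gem's API.

```ruby
# Hypothetical graph: each ID maps to the list of IDs it links to.
# Collects linked IDs, sharing `linked` across recursive calls so that
# an already visited ID is never expanded twice (cycles terminate).
def linked_ids(graph, id, deep: false, linked: {})
  graph.fetch(id, []).each do |target|
    next if linked.key?(target)

    linked[target] = true
    linked_ids(graph, target, deep: true, linked: linked) if deep
  end
  linked.keys
end

# IDs reachable from `removed` that the root can no longer reach
# once `removed` has been taken out of the graph.
def orphans_after_removal(graph, root, removed)
  remaining = graph.reject { |id, _| id == removed }
  still_linked = linked_ids(remaining, root, deep: true)
  linked_ids(graph, removed, deep: true) - still_linked - [root, removed]
end
```

The set difference mirrors the `entity.linked_entities(deep: true) - crate_entities` step in the diff: anything the crate can still reach is kept.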
data/lib/ro_crate/model/entry.rb CHANGED
@@ -14,10 +14,10 @@ module ROCrate
14
14
  end
15
15
 
16
16
  ##
17
- # Write the source to the destination via a buffer.
17
+ # Write the entry's source to the destination via a buffer.
18
18
  #
19
19
  # @param dest [#write] An IO-like destination to write to.
20
- def write(dest)
20
+ def write_to(dest)
21
21
  input = source
22
22
  input = input.open('rb') if input.is_a?(Pathname)
23
23
  while (buff = input.read(4096))
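The hunk above cuts off inside `write_to`, which copies the source to the destination in 4096-byte reads. A self-contained sketch of that buffered-copy loop, using `StringIO` objects as stand-ins for real files; `copy_in_chunks` and its `chunk_size` keyword are hypothetical names:

```ruby
require 'stringio'

# Copy from an IO-like source to an IO-like destination in fixed-size chunks,
# so the whole source never has to be held in memory at once.
# Returns the number of bytes copied.
def copy_in_chunks(input, dest, chunk_size: 4096)
  total = 0
  # IO#read(length) returns nil at end of input, which ends the loop.
  while (buff = input.read(chunk_size))
    dest.write(buff)
    total += buff.bytesize
  end
  total
end
```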
data/lib/ro_crate/model/file.rb CHANGED
@@ -2,14 +2,12 @@ module ROCrate
2
2
  ##
3
3
  # A data entity that represents a single file.
4
4
  class File < DataEntity
5
- properties(%w[name contentSize dateModified encodingFormat identifier sameAs])
6
-
7
5
  ##
8
6
  # Create a new ROCrate::File. PLEASE NOTE, the new file will not be added to the crate. To do this, call
9
7
  # Crate#add_data_entity, or just use Crate#add_file.
10
8
  #
11
9
  # @param crate [Crate] The RO-Crate that owns this file.
12
- # @param source [String, Pathname, ::File, #read, URI, nil] The source on the disk (or on the internet if a URI) where this file will be read.
10
+ # @param source [String, Pathname, ::File, URI, nil, #read] The source on the disk (or on the internet if a URI) where this file will be read.
13
11
  # @param crate_path [String] The relative path within the RO-Crate where this file will be written.
14
12
  # @param properties [Hash{String => Object}] A hash of JSON-LD properties to associate with this file.
15
13
  def initialize(crate, source, crate_path = nil, properties = {})
@@ -58,7 +56,7 @@ module ROCrate
58
56
  # (for compatibility with Directory#entries)
59
57
  #
60
58
  # @return [Hash{String => Entry}>] The key is the location within the crate, and the value is an Entry.
61
- def entries
59
+ def payload
62
60
  remote? ? {} : { filepath => source }
63
61
  end
64
62
 
data/lib/ro_crate/model/metadata.rb CHANGED
@@ -17,7 +17,15 @@ module ROCrate
17
17
  # @return [String] The rendered JSON-LD as a "prettified" string.
18
18
  def generate
19
19
  graph = crate.entities.map(&:properties).reject(&:empty?)
20
- JSON.pretty_generate('@context' => CONTEXT, '@graph' => graph)
20
+ JSON.pretty_generate('@context' => context, '@graph' => graph)
21
+ end
22
+
23
+ def context
24
+ @context || CONTEXT
25
+ end
26
+
27
+ def context= c
28
+ @context = c
21
29
  end
22
30
 
23
31
  private
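The new `context` accessor above falls back to the built-in `CONTEXT` unless a custom one has been set, which lets a crate read with a custom `@context` be written back unchanged. A rough standalone sketch of that fallback; `SketchMetadata` is hypothetical, and the default URL is an assumption about the constant's value rather than something shown in this diff:

```ruby
require 'json'

# Assumed default for this sketch; the gem's actual CONTEXT constant is not shown in the diff.
DEFAULT_CONTEXT = 'https://w3id.org/ro/crate/1.1/context'

# Hypothetical stand-in for the metadata entity.
class SketchMetadata
  attr_writer :context

  # Use the custom context when one was set, otherwise the default.
  def context
    @context || DEFAULT_CONTEXT
  end

  def generate(graph)
    JSON.pretty_generate('@context' => context, '@graph' => graph)
  end
end

# Parse existing metadata JSON, carry its @context over, and regenerate it.
def round_trip(metadata_json)
  parsed = JSON.parse(metadata_json)
  metadata = SketchMetadata.new
  metadata.context = parsed['@context']
  metadata.generate(parsed['@graph'] || [])
end
```

Passing `nil` as the context, as happens when the source JSON has no `@context` key, leaves the default in place.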
data/lib/ro_crate/model/organization.rb CHANGED
@@ -2,7 +2,7 @@ module ROCrate
2
2
  ##
3
3
  # A contextual entity that represents an organization.
4
4
  class Organization < ContextualEntity
5
- properties(['name'])
5
+ properties(%w[name])
6
6
 
7
7
  private
8
8
 
data/lib/ro_crate/model/preview.rb CHANGED
@@ -12,26 +12,14 @@ module ROCrate
12
12
  # @return [String]
13
13
  attr_accessor :template
14
14
 
15
- def initialize(crate, properties = {})
15
+ def initialize(crate, source = nil, properties = {})
16
+ source ||= PreviewGenerator.new(self)
16
17
  @template = nil
17
- super(crate, nil, IDENTIFIER, properties)
18
- end
19
-
20
- ##
21
- # Generate the crate's `ro-crate-preview.html`.
22
- # @return [String] The rendered HTML as a string.
23
- def generate
24
- b = crate.get_binding
25
- renderer = ERB.new(template || ::File.read(DEFAULT_TEMPLATE))
26
- renderer.result(b)
18
+ super(crate, source, IDENTIFIER, properties)
27
19
  end
28
20
 
29
21
  private
30
22
 
31
- def source
32
- Entry.new(StringIO.new(generate))
33
- end
34
-
35
23
  def default_properties
36
24
  {
37
25
  '@id' => IDENTIFIER,
data/lib/ro_crate/model/preview_generator.rb ADDED
@@ -0,0 +1,40 @@
1
+ require 'erb'
2
+
3
+ module ROCrate
4
+ ##
5
+ # A class to handle generation of an RO-Crate's preview HTML in an IO-like way (to fit into an Entry).
6
+ class PreviewGenerator
7
+ ##
8
+ # @param preview [Preview] The RO-Crate preview object.
9
+ def initialize(preview)
10
+ @preview = preview
11
+ end
12
+
13
+ def read(*args)
14
+ io.read(*args)
15
+ end
16
+
17
+ ##
18
+ # Generate the crate's `ro-crate-preview.html`.
19
+ # @return [String] The rendered HTML as a string.
20
+ def generate
21
+ b = crate.get_binding
22
+ renderer = ERB.new(template)
23
+ renderer.result(b)
24
+ end
25
+
26
+ def template
27
+ @preview.template || ::File.read(Preview::DEFAULT_TEMPLATE)
28
+ end
29
+
30
+ def crate
31
+ @preview.crate
32
+ end
33
+
34
+ private
35
+
36
+ def io
37
+ @io ||= StringIO.new(generate)
38
+ end
39
+ end
40
+ end
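`PreviewGenerator` above exposes a `read` method so that it can be used wherever an IO-like source is expected, and it only renders the ERB template when first read. The same pattern in isolation might look like the sketch below. `LazyRenderedIO` and its `render_count` counter are hypothetical, and it uses `ERB#result_with_hash` (Ruby 2.5+) instead of the crate binding used in the gem.

```ruby
require 'erb'
require 'stringio'

# Hypothetical IO-like wrapper: renders an ERB template on first read only.
class LazyRenderedIO
  attr_reader :render_count

  def initialize(template, vars)
    @template = template
    @vars = vars
    @render_count = 0
  end

  # Delegate to a StringIO built lazily from the rendered template,
  # so partial reads (read(4096)) work just like on a file.
  def read(*args)
    io.read(*args)
  end

  private

  def io
    @io ||= StringIO.new(render)
  end

  def render
    @render_count += 1
    ERB.new(@template).result_with_hash(@vars)
  end
end
```

Deferring the render until the first `read` means the output reflects the state at write time rather than at construction time.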
data/lib/ro_crate/model/remote_entry.rb CHANGED
@@ -2,7 +2,7 @@ module ROCrate
2
2
  ##
3
3
  # A class to represent a reference within an RO-Crate, to a remote file held on the internet somewhere.
4
4
  # It handles the actual reading/writing of bytes.
5
- class RemoteEntry
5
+ class RemoteEntry < Entry
6
6
  attr_reader :uri
7
7
 
8
8
  ##
@@ -13,17 +13,6 @@ module ROCrate
13
13
  @uri = uri
14
14
  end
15
15
 
16
- def write(dest)
17
- raise 'Cannot write to a remote entry!'
18
- end
19
-
20
- ##
21
- # Read from the source.
22
- #
23
- def read
24
- source.read
25
- end
26
-
27
16
  ##
28
17
  # @return [IO] An IO object for the remote resource.
29
18
  #
data/lib/ro_crate/reader.rb CHANGED
@@ -84,7 +84,10 @@ module ROCrate
84
84
  def self.read_zip(source, target_dir: Dir.mktmpdir)
85
85
  unzip_to(source, target_dir)
86
86
 
87
- read_directory(target_dir)
87
+ # Traverse the unzipped directory to try and find the crate's root
88
+ root_dir = detect_root_directory(target_dir)
89
+
90
+ read_directory(root_dir)
88
91
  end
89
92
 
90
93
  ##
@@ -100,8 +103,12 @@ module ROCrate
100
103
  entry == ROCrate::Metadata::IDENTIFIER_1_0 }
101
104
 
102
105
  if metadata_file
103
- entities = entities_from_metadata(::File.read(::File.join(source, metadata_file)))
104
- build_crate(entities, source)
106
+ metadata_json = ::File.read(::File.join(source, metadata_file))
107
+ metadata = JSON.parse(metadata_json)
108
+ entities = entities_from_metadata(metadata)
109
+ context = metadata['@context']
110
+
111
+ build_crate(entities, source, context: context)
105
112
  else
106
113
  raise 'No metadata found!'
107
114
  end
@@ -110,10 +117,9 @@ module ROCrate
110
117
  ##
111
118
  # Extracts all the entities from the @graph of the RO-Crate Metadata.
112
119
  #
113
- # @param metadata_json [String] A string containing the metadata JSON.
120
+ # @param metadata [Hash] A Hash containing the parsed metadata JSON.
114
121
  # @return [Hash{String => Hash}] A Hash of all the entities, mapped by their @id.
115
- def self.entities_from_metadata(metadata_json)
116
- metadata = JSON.parse(metadata_json)
122
+ def self.entities_from_metadata(metadata)
117
123
  graph = metadata['@graph']
118
124
 
119
125
  if graph
@@ -126,6 +132,7 @@ module ROCrate
126
132
  # Do some normalization...
127
133
  entities[ROCrate::Metadata::IDENTIFIER] = extract_metadata_entity(entities)
128
134
  raise "No metadata entity found in @graph!" unless entities[ROCrate::Metadata::IDENTIFIER]
135
+ entities[ROCrate::Preview::IDENTIFIER] = extract_preview_entity(entities)
129
136
  entities[ROCrate::Crate::IDENTIFIER] = extract_root_entity(entities)
130
137
  raise "No root entity (with @id: #{entities[ROCrate::Metadata::IDENTIFIER].dig('about', '@id')}) found in @graph!" unless entities[ROCrate::Crate::IDENTIFIER]
131
138
 
@@ -136,25 +143,50 @@ module ROCrate
136
143
  end
137
144
 
138
145
  ##
139
- # Create a crate from the given set of entities.
146
+ # Create and populate a crate from the given set of entities.
140
147
  #
141
148
  # @param entity_hash [Hash{String => Hash}] A Hash containing all the entities in the @graph, mapped by their @id.
142
149
  # @param source [String, ::File, Pathname] The location of the RO-Crate being read.
150
+ # @param crate_class [Class] The class to use to instantiate the crate,
151
+ # useful if you have created a subclass of ROCrate::Crate that you want to use. (defaults to ROCrate::Crate).
152
+ # @param context [nil, String, Array, Hash] A custom JSON-LD @context (parsed), or nil to use default.
143
153
  # @return [Crate] The RO-Crate.
144
- def self.build_crate(entity_hash, source)
145
- ROCrate::Crate.new.tap do |crate|
154
+ def self.build_crate(entity_hash, source, crate_class: ROCrate::Crate, context:)
155
+ crate = initialize_crate(entity_hash, source, crate_class: crate_class, context: context)
156
+
157
+ extract_data_entities(crate, source, entity_hash).each do |entity|
158
+ crate.add_data_entity(entity)
159
+ end
160
+
161
+ # The remaining entities in the hash must be contextual.
162
+ extract_contextual_entities(crate, entity_hash).each do |entity|
163
+ crate.add_contextual_entity(entity)
164
+ end
165
+
166
+ crate
167
+ end
168
+
169
+ ##
170
+ # Initialize a crate from the given set of entities.
171
+ #
172
+ # @param entity_hash [Hash{String => Hash}] A Hash containing all the entities in the @graph, mapped by their @id.
173
+ # @param source [String, ::File, Pathname] The location of the RO-Crate being read.
174
+ # @param crate_class [Class] The class to use to instantiate the crate,
175
+ # useful if you have created a subclass of ROCrate::Crate that you want to use. (defaults to ROCrate::Crate).
176
+ # @param context [nil, String, Array, Hash] A custom JSON-LD @context (parsed), or nil to use default.
177
+ # @return [Crate] The RO-Crate.
178
+ def self.initialize_crate(entity_hash, source, crate_class: ROCrate::Crate, context:)
179
+ crate_class.new.tap do |crate|
146
180
  crate.properties = entity_hash.delete(ROCrate::Crate::IDENTIFIER)
147
181
  crate.metadata.properties = entity_hash.delete(ROCrate::Metadata::IDENTIFIER)
182
+ crate.metadata.context = context
148
183
  preview_properties = entity_hash.delete(ROCrate::Preview::IDENTIFIER)
149
- crate.preview.properties = preview_properties if preview_properties
150
- crate.add_all(source, false)
151
- extract_data_entities(crate, source, entity_hash).each do |entity|
152
- crate.add_data_entity(entity)
153
- end
154
- # The remaining entities in the hash must be contextual.
155
- extract_contextual_entities(crate, entity_hash).each do |entity|
156
- crate.add_contextual_entity(entity)
184
+ preview_path = ::File.join(source, ROCrate::Preview::IDENTIFIER)
185
+ preview_path = ::File.exists?(preview_path) ? Pathname.new(preview_path) : nil
186
+ if preview_properties || preview_path
187
+ crate.preview = ROCrate::Preview.new(crate, preview_path, preview_properties || {})
157
188
  end
189
+ crate.add_all(source, false)
158
190
  end
159
191
  end
160
192
 
@@ -226,8 +258,8 @@ module ROCrate
226
258
  ##
227
259
  # Extract the metadata entity from the entity hash, according to the rules defined here:
228
260
  # https://www.researchobject.org/ro-crate/1.1/root-data-entity.html#finding-the-root-data-entity
229
- # @return [Hash{String => Hash}] A Hash containing (hopefully) one value, the metadata entity's properties,
230
- # mapped by its @id.
261
+ # @return [nil, Hash{String => Hash}] A Hash containing (hopefully) one value, the metadata entity's properties
262
+ # mapped by its @id, or nil if nothing is found.
231
263
  def self.extract_metadata_entity(entities)
232
264
  key = entities.detect do |_, props|
233
265
  props.dig('conformsTo', '@id')&.start_with?(ROCrate::Metadata::RO_CRATE_BASE)
@@ -242,6 +274,13 @@ module ROCrate
242
274
  entities.delete(ROCrate::Metadata::IDENTIFIER_1_0))
243
275
  end
244
276
 
277
+ ##
278
+ # Extract the ro-crate-preview entity from the entity hash.
279
+ # @return [Hash{String => Hash}] A Hash containing the preview entity's properties mapped by its @id, or nil if nothing is found.
280
+ def self.extract_preview_entity(entities)
281
+ entities.delete("./#{ROCrate::Preview::IDENTIFIER}") || entities.delete(ROCrate::Preview::IDENTIFIER)
282
+ end
283
+
245
284
  ##
246
285
  # Extract the root entity from the entity hash, according to the rules defined here:
247
286
  # https://www.researchobject.org/ro-crate/1.1/root-data-entity.html#finding-the-root-data-entity
@@ -252,5 +291,23 @@ module ROCrate
252
291
  raise "Metadata entity does not reference any root entity" unless root_id
253
292
  entities.delete(root_id)
254
293
  end
294
+
295
+ ##
296
+ # Finds an RO-Crate's root directory (where `ro-crate-metadata.json` is located) within a given directory.
297
+ #
298
+ # @param source [String, ::File, Pathname] The location of the directory.
299
+ # @return [Pathname, nil] The path to the root, or nil if not found.
300
+ def self.detect_root_directory(source)
301
+ Pathname(source).find do |entry|
302
+ if entry.file?
303
+ name = entry.basename.to_s
304
+ if name == ROCrate::Metadata::IDENTIFIER || name == ROCrate::Metadata::IDENTIFIER_1_0
305
+ return entry.parent
306
+ end
307
+ end
308
+ end
309
+
310
+ nil
311
+ end
255
312
  end
256
313
  end
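`detect_root_directory` above lets `read_zip` cope with archives whose crate sits inside a wrapper folder: it walks the unzipped tree with `Pathname#find` and returns the parent of the first metadata file it meets. A standalone sketch of that search follows. `find_crate_root` is a hypothetical name, and the two filenames are assumptions about the values of `ROCrate::Metadata::IDENTIFIER` and `IDENTIFIER_1_0`, which this diff does not show.

```ruby
require 'pathname'
require 'tmpdir'
require 'fileutils'

# Assumed metadata filenames for this sketch (RO-Crate 1.1 and 1.0 naming).
METADATA_NAMES = %w[ro-crate-metadata.json ro-crate-metadata.jsonld].freeze

# Walk `source` recursively and return the directory holding the first
# metadata file found, or nil when the tree contains none.
def find_crate_root(source)
  Pathname(source).find do |entry|
    return entry.parent if entry.file? && METADATA_NAMES.include?(entry.basename.to_s)
  end
  nil
end
```

A quick way to try it is to build a nested layout in a temporary directory:

```ruby
Dir.mktmpdir do |dir|
  nested = File.join(dir, 'wrapper', 'my-crate')
  FileUtils.mkdir_p(nested)
  File.write(File.join(nested, 'ro-crate-metadata.json'), '{}')
  puts find_crate_root(dir)
end
```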