RubyGems - rof - Versions diffs - 0.0.1.pre → 1.0.4 - Mend

rof 0.0.1.pre → 1.0.4

Files changed (56)

checksums.yaml +4 -4
data/.ruby-version +1 -1
data/.travis.yml +12 -2
data/Gemfile +1 -0
data/README.md +87 -0
data/bin/.ruby-version +1 -0
data/bin/csv_to_rof +26 -0
data/bin/fedora_to_rof +57 -0
data/bin/osf_to_rof +40 -0
data/bin/rof +78 -0
data/bulk-ingest.md +242 -0
data/labels.md +111 -0
data/lib/rof.rb +20 -1
data/lib/rof/access.rb +57 -0
data/lib/rof/cli.rb +122 -0
data/lib/rof/collection.rb +109 -0
data/lib/rof/compare_rof.rb +92 -0
data/lib/rof/filters/bendo.rb +33 -0
data/lib/rof/filters/date_stamp.rb +36 -0
data/lib/rof/filters/file_to_url.rb +27 -0
data/lib/rof/filters/label.rb +153 -0
data/lib/rof/filters/work.rb +111 -0
data/lib/rof/get_from_fedora.rb +196 -0
data/lib/rof/ingest.rb +204 -0
data/lib/rof/ingesters/rels_ext_ingester.rb +78 -0
data/lib/rof/ingesters/rights_metadata_ingester.rb +68 -0
data/lib/rof/osf_context.rb +19 -0
data/lib/rof/osf_to_rof.rb +122 -0
data/lib/rof/rdf_context.rb +36 -0
data/lib/rof/translate_csv.rb +112 -0
data/lib/rof/utility.rb +84 -0
data/lib/rof/version.rb +2 -2
data/rof.gemspec +17 -0
data/spec/fixtures/a.json +4 -0
data/spec/fixtures/label.json +20 -0
data/spec/fixtures/osf/b6psa.tar.gz +0 -0
data/spec/fixtures/rof/dev0012829m.rof +45 -0
data/spec/fixtures/vcr_tests/fedora_to_rof1.yml +5274 -0
data/spec/fixtures/vecnet-citation.json +73 -0
data/spec/lib/rof/access_spec.rb +36 -0
data/spec/lib/rof/cli_spec.rb +66 -0
data/spec/lib/rof/collection_spec.rb +90 -0
data/spec/lib/rof/compare_rof_spec.rb +263 -0
data/spec/lib/rof/filters/date_stamp_spec.rb +90 -0
data/spec/lib/rof/filters/file_to_url_spec.rb +70 -0
data/spec/lib/rof/filters/label_spec.rb +94 -0
data/spec/lib/rof/filters/work_spec.rb +87 -0
data/spec/lib/rof/ingest_spec.rb +117 -0
data/spec/lib/rof/ingesters/rels_ext_ingester_spec.rb +62 -0
data/spec/lib/rof/ingesters/rights_metadata_ingester_spec.rb +114 -0
data/spec/lib/rof/osf_to_rof_spec.rb +76 -0
data/spec/lib/rof/translate_csv_spec.rb +109 -0
data/spec/lib/rof/utility_spec.rb +64 -0
data/spec/lib/rof_spec.rb +14 -0
data/spec/spec_helper.rb +11 -11
metadata +283 -18

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 6145cfdc51338425ae4adccfd527266debd1c608
-  data.tar.gz: 04df2645bb27e387a870448c1e9e4adcea60cc63
+  metadata.gz: 33761b71b6e080540bbab28fa0516ae9f327b496
+  data.tar.gz: d8dc2c60c38574a5a872622f5a0e7bdfdf7ededd
 SHA512:
-  metadata.gz: 4ec14945f0f0a5ecd2e7026b680a7fb0cc991f6489bb9e681aefd55b28425f750c79ca0b2c602e6f67334ee840f97993b71665887ad3b1ac103487abdceaa4d2
-  data.tar.gz: 8206903744fd358fdd419c9313fcb45b00001e464ea364a36e48d2da6da571b82d9550907b9e5f5bfeab24f6290522e3ef001d9df7f660b5e47e13030ec37a3f
+  metadata.gz: baa4c3529a273e6de117e1070fb7fde0eb70a50c398238b95243eae326ea34a9ba4fd2200c59b11b332b93e061cddb1f2947d9d56de0c5d3794141f875665f51
+  data.tar.gz: 9e78181e99a5516a1389b8895a74c51b20a790f1e3fa7fa38563d1e5ac2cf9e98370457d8d2255be78cb792cbcc2db60e4f502a47cf70efee9bfbc8e8e6ad43b

data/.ruby-version CHANGED

@@ -1 +1 @@
-2.0.0
+2.2.2

data/.travis.yml CHANGED

@@ -1,12 +1,22 @@
 language: ruby
 rvm:
-  - "1.9.3"
   - "2.0.0"
+  - "2.1.1"
+  - "2.2.4"
+  - "2.3.0"
-script: "rspec"
+matrix:
+  allow_failures:
+    - rvm: "2.3.0"
+    - rvm: "2.2.4"
+script: "bundle exec rspec"
 notifications:
   irc: "irc.freenode.org#ndlib"
 before_install:
   - gem install bundler
+sudo: false
+cache: bundler

data/Gemfile CHANGED

@@ -4,6 +4,7 @@ source 'https://rubygems.org'
 gemspec
 group :test do
+  gem 'byebug'
 end
 group :doc do

data/README.md CHANGED

@@ -1,3 +1,90 @@
 # Raw Object Format
 [![Gem Version](https://badge.fury.io/rb/rof.png)](http://badge.fury.io/rb/rof)
+This is a pilot project to produce an intermediate data format that makes the
+bulk ingest of data into the Fedora Commons repository software simple. While the goal
+is to provide as simple a format as possible, some affordances are made for
+defining standard datastreams used by Hydra project front-ends, such as the
+`rightsMetadata` datastream.
+See `spec/fixtures/vecnet-citation.json` for a sample two-object model.
+An overview of the format is in [bulk-ingest.md](bulk-ingest.md).
+Sample command line usage:
+```
+$ bin/rof ingest --fedora 'http://localhost:8983/fedora' --user fedoraAdmin:fedoraAdmin spec/fixtures/vecnet-citation.json
+1. Ingesting vecnet:d217qs82g ...ok. 0.882s
+2. Ingesting vecnet:h415pf50x ...ok. 0.283s
+Total time 1.165s
+0 errors
+```
+ROF does more than just ingesting.
+Should an object already exist in Fedora, it will be updated to match what is provided in the source file.
+(However, this only applies to datastreams which are mentioned in the source file. Unmentioned datastreams
+are untouched).
+If the fedora path and user are omitted then rof lints the json file.
+```
+$ bin/rof ingest spec/fixtures/vecnet-citation.json
+1. Verifying vecnet:d217qs82g ...ok. 0.108s
+2. Verifying vecnet:h415pf50x ...ok. 0.002s
+Total time 0.111s
+0 errors
+```
+There is a filter which will assign objects identifiers. This requires an external [noids](https://github.com/ndlib/noids) service to provide the identifiers.
+See [labels.md](labels.md).
+```
+$ bin/rof filter label spec/fixtures/label.json --noids localhost:13001:test-pool --prefix temp
+[
+  {
+    "type": "fobject",
+    "pid": "temp:0k225999n60"
+  },
+  {
+    "type": "fobject",
+    "rels-ext": {
+      "partOf": [
+        "temp:0k225999n60"
+      ],
+      "refines": [
+        "temp:0r96736668t"
+      ]
+    },
+    "pid": "temp:0p096682x75"
+  },
+  {
+    "type": "fobject",
+    "pid": "temp:0r96736668t",
+    "rels-ext": {
+      "partOf": [
+        "temp:0r96736668t",
+        "temp:0k225999n60",
+        "another"
+      ]
+    }
+  }
+]
+```
+It is envisioned that there could be higher level objects, and that the ingesting into fedora done by this utility
+will be simply the final step of many.
+Other ideas for transformations:
+* A service to expand higher-level objects, say an `image-collection`, into a sequence of `fobjects`.
+* The ability to run file characterizations and create derivatives before ingest.
+# Other
+Since the files are JSON, any tool for working with JSON files will work with these.
+For example, the [jq](http://stedolan.github.io/jq/) tool makes it easy to extract
+the `pid` field from every object in a file and return the values as a JSON array:
+```
+jq '[.[]|.pid]' < spec/fixtures/vecnet-citation.json
+```
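
Note on the README hunk above: the `jq` one-liner has a straightforward Ruby equivalent. The following is a minimal sketch, assuming only that a ROF file is a top-level JSON array of objects, as the README describes. `rof_pids` is a hypothetical helper name and is not part of the `rof` gem; the sample data reuses the two pids shown in the README's ingest output.

```ruby
require 'json'

# Hypothetical helper (not part of the rof gem): collect the "pid" field
# from every object in a ROF document given as JSON text.
# Objects without a "pid" contribute nil, like jq's `.pid` on a missing key.
def rof_pids(json_text)
  JSON.parse(json_text).map { |object| object['pid'] }
end

sample = '[{"type":"fobject","pid":"vecnet:d217qs82g"},' \
         '{"type":"fobject","pid":"vecnet:h415pf50x"}]'

# Print the pids as a JSON array.
puts JSON.generate(rof_pids(sample))
```

To run it against a file instead of the inline sample, `File.read('spec/fixtures/vecnet-citation.json')` could be passed in place of `sample`.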

data/bin/.ruby-version ADDED

@@ -0,0 +1 @@
+2.2.2

data/bin/csv_to_rof ADDED

@@ -0,0 +1,26 @@
+#!/usr/bin/env ruby -Ilib
+require 'rof'
+require 'optparse'
+require 'json'
+opt = OptionParser.new do |opts|
+  opts.banner = %q{Usage: csv_to_rof
+  Reads a CSV file from stdin.
+  Writes a ROF file to stdout.
+  In case of an error, a message is written to stderr and the program
+  exits with a non-zero status.
+}
+end
+opt.parse!
+if ARGV.length != 0
+  abort opt.help
+end
+STDIN.set_encoding("UTF-8")
+csv_contents = STDIN.read
+rof = ROF::TranslateCSV.run(csv_contents)
+puts JSON.pretty_generate(rof)

data/bin/fedora_to_rof ADDED

@@ -0,0 +1,57 @@
+#!/usr/bin/env ruby -Ilib
+require 'rof'
+require 'optparse'
+# assign default parameter values
+fedora_info = {}
+config = {}
+file_path = STDOUT
+config['download'] = false
+config['inline'] = false
+config['download_path'] = '.'
+# parse the command line
+#
+opt = OptionParser.new do |opts|
+  opts.banner = %q{Usage: fedora_to_rof --fedora URL --user STRING [--outfile FILENAME] [--download DIRECTORY | --inline] PID [PID2 ...]
Read the given PIDs from the given Fedora 3 instance, and then output them as
ROF objects. By default output goes to STDOUT; pass a filename in `--outfile`
to save it to a file. Datastreams smaller than 1024 bytes are added to the ROF
file. Larger ones may either be included inline or saved as auxiliary files.
Use `--inline` to include them inline and use `--download DIRECTORY` to save
them as files. The files will have a name in the form `<pid>-<dsname>`.
+}
+  opts.on("", "--fedora URL", "Base Fedora URL (including port number)") do |url|
+    fedora_info[:url] = url
+  end
+  opts.on("", "--user STRING", "Username and password (colon separated) for fedora") do |u|
+    fedora_info[:user], fedora_info[:password] = u.split(':')
+  end
+  opts.on("", "--download DIRECTORY", "Save datastreams >1K in size to files (defaults to false)") do |directory|
+    config['download'] = true
+    config['download_path'] = directory
+  end
+  opts.on("", "--inline", "Include datastreams >1K in size in ROF output (defaults to false)") do
+    config['inline'] = true
+  end
+  opts.on("", "--outfile FILENAME", "File to save ROF to") do |output|
+    file_path = output
+  end
+end
+opt.parse!
+pids = ARGV
+fedora_info = nil if fedora_info.empty?
+# without a fedora and a pid, there is no reason to proceed
+if fedora_info == nil || pids.empty? then
+  STDERR.puts opt.help
+  exit 1
+end
+# perform conversion
+ROF::CLI.convert_to_rof(pids, fedora_info, file_path, config)
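
Note on the `--user` handler in the hunk above: `String#split(':')` with no limit splits on every colon, so the two-variable assignment keeps only the text between the first and second colon as the password. A minimal sketch of one alternative follows, assuming passwords may contain colons; `parse_user` is a hypothetical helper name, not something the gem defines.

```ruby
# Hypothetical helper (not part of the rof gem): split "user:password" at the
# first colon only, so any later colons remain part of the password.
def parse_user(text)
  user, password = text.split(':', 2)
  { user: user, password: password }
end

# With a limit of 2, the password keeps its embedded colon.
credentials = parse_user('fedoraAdmin:se:cret')
puts credentials[:user]
puts credentials[:password]

# For comparison, the unlimited split drops everything after the second colon.
_user, truncated = 'fedoraAdmin:se:cret'.split(':')
puts truncated
```

Whether this matters depends on the passwords in use; for credentials without colons the two forms behave identically.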

data/bin/osf_to_rof ADDED

@@ -0,0 +1,40 @@
+#!/usr/bin/env ruby -Ilib
+# Command line tool to convert an Open Science Framework archive package to a ROF file
+require 'rof'
+require 'optparse'
+# assign default parameter values
+config = {}
+file_path = STDOUT
+config['project_file'] = './osf_projects'
+config['package_dir'] = './FROM_OSF'
+config['output_dir'] = '.'
+# parse the command line
+#
+opt = OptionParser.new do |opts|
+  opts.banner = %q{Usage: osf_to_rof --project_file FILE --package_dir DIR --output_dir DIR
+}
+  opts.on("", "--project_file project_file", "osf_projects file provided by requestor (required)") do |project_file|
+    config['project_file'] = project_file
+  end
+  opts.on("", "--package_dir package_dir", "Directory the OSF packages were downloaded to (defaults to ./FROM_OSF)") do |package_dir|
+    config['package_dir'] = package_dir
+  end
+  opts.on("", "--output_dir output_dir", "Directory to save ROF to (defaults to .)") do |output_dir|
+    config['output_dir'] = output_dir
+  end
+end
+opt.parse!
+# without a project file there is no reason to proceed
+if  !FileTest.exists?(config['project_file']) then
+  STDERR.puts opt.help
+  exit 1
+end
+# perform conversion
+ROF::CLI.osf_to_rof(config)

data/bin/rof ADDED

@@ -0,0 +1,78 @@
+#!/usr/bin/env ruby -Ilib
+require 'rof'
+require 'optparse'
+fedora_info = {}
+noids = {}
+prefix = nil
+bendo_info = nil
+search_path = ["."]
+opt = OptionParser.new do |opts|
+  opts.banner = %q{Usage: rof [options] <command> <input files>
+  command is one of:
+    compare
+    ingest
+    validate
+    filter <filter name>
+  Filtering sends transformed objects to stdout.
+  Possible filters are:
+    bendo, collections, datestamp, file-to-url, label, work}
+  opts.on("", "--fedora URL", "Base Fedora URL") do |url|
+    fedora_info[:url] = url
+  end
+  opts.on("", "--bendo URL", "Base Bendo URL") do |url|
+    bendo_info = url
+  end
+  opts.on("", "--user STRING", "Username and password (colon separated) for fedora") do |u|
+    fedora_info[:user], fedora_info[:password] = u.split(':')
+  end
+  opts.on("", "--noids STRING", "Noids server path and pool name (colon separated)") do |u|
+    noids[:noid_server], _, noids[:pool_name] = u.rpartition(':')
+  end
+  opts.on("", "--prefix STRING", "Prefix for label identifiers") do |s|
+    prefix = s
+  end
+  opts.on("", "--path PATH", "Colon separated search path for files for ingest or validation. Defaults to the current directory") do |s|
+    search_path = s.split(":")
+  end
+end
+opt.parse!
+fedora_info = nil if fedora_info.empty?
+case ARGV[0]
+when "compare"
+  error_count = ROF::CLI.compare_files(ARGV[1], ARGV[2], STDOUT, fedora_info, bendo_info)
+  exit 1 if error_count > 0
+when "ingest", "validate"
+  error_count = ROF::CLI.ingest_file(ARGV[1], search_path, STDOUT, fedora_info, bendo_info)
+  exit 1 if error_count > 0
+when "filter"
+  filter = case ARGV[1]
+           when "bendo"
+             ROF::Filters::Bendo.new(bendo_info)
+           when "collections"
+             ROF::Filters::Collections.new
+           when "datestamp"
+             ROF::Filters::DateStamp.new
+           when "file-to-url"
+             ROF::Filters::FileToUrl.new
+           when "label"
+             ROF::Filters::Label.new(prefix, noids)
+           when "work"
+             ROF::Filters::Work.new
+           else
+             STDERR.puts "Unknown filter #{ARGV[1]}"
+             exit 3
+           end
+  ROF::CLI.filter_file(filter, ARGV[2], STDOUT)
+else
+  STDERR.puts "Unknown command #{ARGV[0]}"
+  STDERR.puts opt.help
+  exit 3
+end
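
Note on the `--noids` handler in the hunk above: it uses `String#rpartition`, which splits at the last colon, so the server part may itself contain a colon (as in `host:port`). The sketch below isolates that behavior; `parse_noids` is a hypothetical helper name, and the sample argument is the one used in the README.

```ruby
# Hypothetical helper mirroring the --noids handler: split at the LAST colon,
# so "host:port:pool" yields the server "host:port" and the pool "pool".
def parse_noids(text)
  server, _separator, pool = text.rpartition(':')
  { noid_server: server, pool_name: pool }
end

noids = parse_noids('localhost:13001:test-pool')
puts noids[:noid_server]
puts noids[:pool_name]

# When the argument has no colon at all, rpartition returns empty strings for
# the first two parts, so the server is "" and the whole text is the pool.
no_colon = parse_noids('test-pool')
puts no_colon[:noid_server].empty?
```

The `--path` option, by contrast, uses a plain `split(":")`, which is appropriate there because every colon separates one search directory from the next.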

data/bulk-ingest.md ADDED

@@ -0,0 +1,242 @@
+# Bulk Ingest
+Q. What does this hope to be?
+A. An intermediate representation we can use to process ingests to fedora. The
+idea is that we can accept any crazy format for bulk data and translate it into
+this format. Then another piece can load this format into Fedora. Perhaps it
+can also be used as an export format.
+Q. What is its name?
+A. I don't know. I was calling it ROF for Raw Object Format. I'm open to
+suggestions. Jeremy?
+Q. What is it?
+A. ROF hopes to support many levels of abstraction. To begin with, it only
+specifies a low level format based around the Fedora object model, whereby one
+lists all the data streams which constitute each object. It handles some mild
+translating for hydra rights metadata, but otherwise it is as dumb as a box of
+rocks. It is designed to be easy for machines to process (NOT humans) and
+uses JSON as its base. I see it supporting more abstract data elements,
+say "article + dataset" in time.
+Q. At what level does it hope to model objects? That is, why not just use FOXML?
+A. Ideally, this will model our content using our data models. Unfortunately,
+our data models are not well defined, and are still changing. Because of this,
+if we did model the content at the data model level we would need to change the
+loader whenever we add a new model. To begin with, ROF will describe what items
+should look like in Fedora, in terms of Fedora objects. FOXML is too detailed
+for our purposes since it includes previous versions of content and an audit
+log. Additionally, Hydra only uses a subset of Fedora. ROF will focus on the
+parts which are important to us. And then we can extend it over time to handle
+more abstract objects.
+Q. So what are the problems with this format?
+A. Ideally the interchange format should match our semantic object model, not
+the way things are laid out in fedora. I see this being addressed in time.
+Also, this is another format for which we will need to develop tools to
+support.
+Q. Ok, enough already, what is the actual format?
+A. It uses JSON, so a valid ROF file will also be a valid JSON document. The
+reverse is not true, though. These are the restrictions.
+  1. A ROF file consists of a top-level JSON array. Each element in the array
+is a JSON object.
+  2. The only essential property of the object is "type", which indicates the
+type of the record. The other fields depend on the type field. Right now there
+is only the type "fobject". As the content models are developed, we can add
+more types to represent the models.
+The "fobject" type represents a basic fedora object. Each "fobject" record
+represents a single fedora object. It recognizes the following additional
+fields. A star is a wildcard and represents any sequence of characters.
+Field     |  Description
+----------|--------------
pid       |  The pid to use for the object. If it includes a prefix, e.g. "vecnet:12bc34g", then that is the object's fedora id. If it doesn't have a prefix, then the prefix "und:" is added.
+rights    |  The hydra rights of this object. Takes an object. See §Rights below.
+rels-ext  |  The rels-ext data stream of this object. Takes an object. See §Rels-ext below.
+metadata  |  Contents for the 'descMetadata' data stream.  Takes an object. It is given in JSON-LD, and translated to N3 format to be saved into fedora.
+*-file    |  Gives a filename to save as the contents of a data stream given. Takes a string. This overwrites the previous content. For example, the field 'hello-file' will save the file's contents into the data stream 'hello'.
+*-meta    |  Gives the fedora metadata for a given data stream. Takes an object. See §Meta below.
+*         |  Assigns the content directly to the named data stream. Takes a string.
+# Rights
+Rights are given as an object with the keys "discover", "discover-groups",
+"read", "read-groups", "edit", and "edit-groups". Each key takes an array of strings,
+which are taken to be a list of group or user names.
+In Hydra rightsMetadata, the groups "public" and "registered" have special meaning.
+The key "embargo-date", if provided, will be saved. It must be in the form _year-month-day_, e.g. "2015-01-24" for January 24, 2015.
+The semantics are that the item is viewable only by its editors until the embargo date, after which the rest of the rights take effect.
+ROF does *not* validate that the date provided is in the correct format.
+Example: This object is viewable by anyone and editable only by the user `dbrower`.
+````json
+{
+ "read-groups" : ["public"],
+ "edit" : ["dbrower"]
+}
+````
+Example: This object is embargoed until Jan 24, 2015. Before that date it is viewable and editable only by user `dbrower`, afterward it is viewable by any logged in user and editable by user `dbrower`.
+````json
+{
+ "read-groups" : ["registered"],
+ "edit" : ["dbrower"],
+ "embargo-date" : "2015-01-24"
+}
+````
+# Rels-Ext
+Rels-ext are given as an object where each key is a relation and takes
+an array of strings. The strings are fedora object pids, which indicate which
+objects this one is connected to.
+Example:
+````json
+{
+ "isMemberOf" : ["xv57n93k"],
+ "relatedTo": ["user:12345"]
+}
+````
+More complicated relationships can be defined by using a JSON-LD `@context` element:
+````json
+{
+ "@context" : {
+    "hydra" : "http://project-hydra.org"
+ },
+ "hydra:hasEditor" : ["user:12345"]
+}
+````
+# Meta
+The `*-meta` field is not for an object's descriptive metadata; rather, it is
+for the metadata associated with a specific fedora data stream. The data is given
+as key-value pairs. The possible keys and their defaults are listed below. (TODO:
+this list is incomplete.)
Field Name |  Default       |  Description
-----------|----------------|--------------
mime-type  |  "text/plain"  |  The mime-type of this data stream. The default is adjusted for the special data streams of "descMetadata", "rightsMetadata" and "RELS-EXT".
label      |  ""            |  The label for this datastream.
versioned  |  true          |  Whether this data stream's content is versioned.
storage    |  "M"           |  The Fedora storage class of this data stream.
checksum   |  "SHA-1"       |  What checksum to use, or empty string to turn off.
+# Example ROF File
+This is not normative. There are probably errors. The JSON-LD sections are
+likely wrong.
+````json
+[{
+     "type" : "fobject",
+     "pid" : "vecnet:d217qs82g",
+     "af-model" : "Citation",
+     "rights" : {
+          "read-groups" : ["public"],
+          "edit" : ["vecnet_batchuser"]
+     },
+     "metadata" : {
+          "@context" : {
+               "dc" : "http://purl.org/dc/terms/",
+               "rdfs" : "http://www.w3.org/2000/01/rdf-schema#"
+          },
+          "dc:title" : "Molecular systematics and insecticide resistance in the major African malaria vector Anopheles funestus",
+          "dc:creator" : ["Coetzee, M.", "Koekemoer, L. L."],
+          "dc:identifier" : ["doi:10.1146/annurev-ento-120811-153628", "issn:1545-4487 (Electronic)", "issn:0066-4170 (Linking)", "23317045"],
+          "dc:description" : "Anopheles funestus is one of three major African vectors of malaria. Its distribution extends over much of the tropics and subtropics wherever suitable swampy breeding habitats are present. As with members of the Anopheles gambiae complex, An. funestus shows marked genetic heterogeneity across its range. Currently, two unnamed species are recognized in the group, with molecular and cytogenetic data indicating that more may be present. The control of malaria vectors in Africa has received increased attention in the past decade with the scaling up of insecticide-treated bed nets and indoor residual house spraying. Also in the past decade, the frequency of insecticide-resistant mosquitoes has increased exponentially. Whether this increase is in response to vector control initiatives or because of insecticide use in agriculture is debatable. In this article we examine the progress made on the systematics of the An. funestus group and review research on insecticide resistance and its mechanisms.",
+          "dc:language" : "eng",
+          "dc:type" : "Article",
+          "dc:source" : "Annual review of entomology",
+          "dc:references" : "Molecu2013",
+          "dc:bibliographicCitation" : "Annu Rev Entomol 58, 393-412. (2013)",
+          "rdf:seeAlso" : "http://www.ncbi.nlm.nih.gov/pubmed/23317045",
+          "dc:created" : "2013",
+          "dc:modified" : {
+              "@value":"2014-03-17Z",
+              "@type":"http://www.w3.org/2001/XMLSchema#date"
+          },
+          "rdf:domain" : "Citation"
+     },
+     "properties-meta" : {
+          "mime-type" : "text/xml"
+     },
+     "properties" : "<fields><depositor>vecnet_batchuser</depositor></fields>"
+},
+{
+     "type" : "fobject",
+     "pid" : "vecnet:h415pf50x",
+     "af-model" : "CitationFile",
+     "rights" : {
+          "read-groups" : ["registered"],
+          "edit" : ["vecnet_batchuser"]
+     },
+     "metadata" : {
+          "@context" : {
+               "dc" : "http://purl.org/dc/terms/",
+               "rdfs" : "http://www.w3.org/2000/01/rdf-schema#"
+          },
+          "dc:type" : "CitationFile",
+          "dc:dateSubmitted" : {
+               "@value" : "2014-03-17Z",
+               "@type"  : "http://www.w3.org/2001/XMLSchema#date"
+          },
+          "dc:modified" : {
+               "@value" : "2014-03-17Z",
+               "@type"  : "http://www.w3.org/2001/XMLSchema#date"
+          },
+          "dc:creator" : [ "Vecnet Batchuser", "Maureen Coetzee and Lizette L. Koekemoer" ],
+          "dc:title" : "Molecular Systematics and Insecticide Resistance in the Major African Malaria Vector Anopheles funestus"
+     },
+     "rels-ext" : {
+          "isPartOf" : ["vecnet:d217qs82g"]
+     },
+     "properties" : "<fields><depositor>vecnet_batchuser</depositor></fields>",
+     "properties-meta" : {
+          "mime-type": "text/xml",
+          "checksum" : ""
+     },
+     "content-meta" : {
+          "mime-type": "application/pdf",
+          "label" : "5772.pdf",
+          "checksum": ""
+     },
+     "content-file" : "/opt/citations/pdf/5772.pdf",
+     "full_text-file" : "/opt/citations/text/5772.txt",
+     "full_text-meta" : {
+          "label" : "File Datastream",
+          "checksum" : ""
+     },
+     "characterization-meta" : {
+          "mime-type" : "text/xml"
+     },
+     "thumbnail-meta" : {
+          "mime-type" : "image/png",
+          "label" : "File Datastream",
+          "checksum" : ""
+     },
+     "thumbnail-file" : "/opt/citations/thumb/5772.png"
+}]
+````
+# Extensions
+I would call a ROF file containing only "fobjects" and not using labels a
+level-0 ROF file. Higher level ROF files would introduce more complicated and
+abstract structures, which can ultimately be reduced to a level-0 file by
+processing. Then the level-0 file can be directly ingested into Fedora with no
+thought whatsoever.
+We can even export fedora objects as ROF files. These files would not retain
+any previous versions of data streams or the audit history, since ROF does not
+capture the entirety of FOXML.
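
Note on the bulk-ingest.md hunk above: the format rules it states (top-level array, a required "type" field, the default "und:" pid prefix, the `*-file` and `*-meta` wildcards, and the unvalidated `embargo-date`) can be illustrated with a short Ruby sketch. This is not the gem's implementation. All helper names below are hypothetical, and the field classifier follows only the field table, so it treats any field not listed there (such as "af-model" in the example file) as direct content.

```ruby
require 'json'
require 'date'

# All names here are hypothetical illustrations, not part of the rof gem.

# Apply the documented default: a pid without a prefix gets "und:".
def qualified_pid(pid)
  pid.include?(':') ? pid : "und:#{pid}"
end

# Fields the table lists by exact name.
NAMED_FIELDS = %w[type pid rights rels-ext metadata].freeze

# Classify a field name according to the field table.
def field_kind(name)
  return :named if NAMED_FIELDS.include?(name)
  return :file if name.end_with?('-file')
  return :meta if name.end_with?('-meta')
  :content
end

# Check that an embargo date is a real date in year-month-day form.
# The document says ROF itself does not perform this check.
def valid_embargo_date?(text)
  return false unless text.is_a?(String) && text.match?(/\A\d{4}-\d{2}-\d{2}\z/)
  Date.strptime(text, '%Y-%m-%d')
  true
rescue ArgumentError
  false
end

# Return a list of problems found in a ROF document given as JSON text.
def level0_problems(json_text)
  document = JSON.parse(json_text)
  return ['top level is not an array'] unless document.is_a?(Array)

  problems = []
  document.each_with_index do |item, index|
    unless item.is_a?(Hash)
      problems << "element #{index} is not an object"
      next
    end
    problems << "element #{index} has no type" unless item.key?('type')
    rights = item['rights']
    next unless rights.is_a?(Hash) && rights.key?('embargo-date')
    unless valid_embargo_date?(rights['embargo-date'])
      problems << "element #{index} has a malformed embargo-date"
    end
  end
  problems
end

puts qualified_pid('12bc34g')
puts field_kind('hello-file')
puts level0_problems('[{"type":"fobject","rights":{"embargo-date":"Jan 24"}}]')
```

A check like this would run before ingest; the gem's own `validate` command may apply different or additional rules.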