om 3.1.0 → 3.1.1

Files changed (51)
  1. checksums.yaml +4 -4
  2. data/{COMMON_OM_PATTERNS.textile → COMMON_OM_PATTERNS.md} +136 -126
  3. data/CONTRIBUTING.md +2 -2
  4. data/GETTING_FANCY.md +153 -0
  5. data/GETTING_STARTED.md +329 -0
  6. data/Gemfile +1 -1
  7. data/History.md +164 -0
  8. data/LICENSE +15 -20
  9. data/QUERYING_DOCUMENTS.md +162 -0
  10. data/README.md +2 -2
  11. data/UPDATING_DOCUMENTS.md +6 -0
  12. data/gemfiles/gemfile.rails3 +1 -1
  13. data/gemfiles/gemfile.rails4 +1 -1
  14. data/lib/om/version.rb +1 -1
  15. data/lib/om/xml/dynamic_node.rb +42 -51
  16. data/lib/tasks/om.rake +1 -1
  17. data/om.gemspec +1 -2
  18. data/spec/integration/differentiated_elements_spec.rb +2 -2
  19. data/spec/integration/element_value_spec.rb +13 -13
  20. data/spec/integration/proxies_and_ref_spec.rb +10 -10
  21. data/spec/integration/querying_documents_spec.rb +20 -27
  22. data/spec/integration/rights_metadata_integration_example_spec.rb +4 -4
  23. data/spec/integration/selective_querying_spec.rb +1 -1
  24. data/spec/integration/serialization_spec.rb +15 -15
  25. data/spec/integration/set_reentrant_terminology_spec.rb +6 -6
  26. data/spec/integration/subclass_terminology_spec.rb +8 -8
  27. data/spec/integration/xpathy_stuff_spec.rb +10 -10
  28. data/spec/unit/container_spec.rb +27 -27
  29. data/spec/unit/document_spec.rb +24 -24
  30. data/spec/unit/dynamic_node_spec.rb +60 -49
  31. data/spec/unit/named_term_proxy_spec.rb +12 -7
  32. data/spec/unit/node_generator_spec.rb +4 -4
  33. data/spec/unit/nokogiri_sanity_spec.rb +17 -18
  34. data/spec/unit/om_spec.rb +2 -2
  35. data/spec/unit/template_registry_spec.rb +51 -51
  36. data/spec/unit/term_builder_spec.rb +45 -44
  37. data/spec/unit/term_spec.rb +55 -55
  38. data/spec/unit/term_value_operators_spec.rb +205 -205
  39. data/spec/unit/term_xpath_generator_spec.rb +33 -36
  40. data/spec/unit/terminology_builder_spec.rb +50 -47
  41. data/spec/unit/terminology_spec.rb +92 -92
  42. data/spec/unit/validation_spec.rb +12 -12
  43. data/spec/unit/xml_serialization_spec.rb +20 -20
  44. data/spec/unit/xml_spec.rb +3 -3
  45. data/spec/unit/xml_terminology_based_solrizer_spec.rb +18 -18
  46. metadata +11 -38
  47. data/GETTING_FANCY.textile +0 -145
  48. data/GETTING_STARTED.textile +0 -254
  49. data/History.textile +0 -186
  50. data/QUERYING_DOCUMENTS.textile +0 -139
  51. data/UPDATING_DOCUMENTS.textile +0 -3
@@ -1,254 +0,0 @@
1
- h2. OM (Opinionated Metadata) - Getting Started
2
-
3
- OM allows you to define a "terminology" to ease translation between XML and ruby objects - you can query the xml for Nodes _or_ node values without ever writing a line of XPath.
4
-
5
- OM "terms" are ruby symbols you define (in the terminology) that map specific XML content into ruby object attributes.
6
-
7
- The API documentation at "http://rdoc.info/github/projecthydra/om":http://rdoc.info/github/projecthydra/om provides additional, more targeted information. We will provide links to the API as appropriate.
8
-
9
- h4. What you will learn from this document
10
-
11
- # Install OM and run it in IRB
12
- # Build an OM Terminology
13
- # Use OM XML Document class
14
- # Create XML from the OM XML Document
15
- # Load existing XML into an OM XML Document
16
- # Query OM XML Document to get term values
17
- # Access the Terminology of an OM XML Document
18
- # Retrieve XPath from the terminology
19
-
20
-
21
- h2. Install OM
22
-
23
- To get started, you will create a new folder, set up a Gemfile to install OM, and then run bundler.
24
-
25
- <pre>
26
- mkdir omtest
27
- cd omtest
28
- </pre>
29
-
30
- Using whichever editor you prefer, create a file (in omtest directory) called Gemfile with the following contents:
31
-
32
- <pre>
33
- source 'http://rubygems.org'
34
- gem 'om'
35
- </pre>
36
-
37
- Now run bundler to install the gem (you will need the bundler gem):
38
-
39
- <pre>
40
- bundle install
41
- </pre>
42
-
43
- You should now be set to use irb to run the following example.
44
-
45
- h2. Build a simple OM terminology (in irb)
46
-
47
- To experiment with abbreviated terminology examples, irb is your friend. If you are working on a persistent terminology and have to experiment to make sure you declare your terminology correctly, we recommend writing test code (e.g. with rspec). You can see examples of this "here":https://github.com/projecthydra/hydra-tutorial-application/blob/master/spec/models/journal_article_mods_datastream_spec.rb
48
-
49
- <pre>
50
- irb
51
- require "rubygems"
52
- => true
53
- require "om"
54
- => true
55
- </pre>
56
-
57
- Create a simple (simplish?) Terminology Builder ("OM::XML::Terminology::Builder":OM/XML/Terminology/Builder.html) based on a couple of elements from the MODS schema.
58
-
59
- <pre>
60
- terminology_builder = OM::XML::Terminology::Builder.new do |t|
61
- t.root(:path=>"mods", :xmlns=>"http://www.loc.gov/mods/v3", :schema=>"http://www.loc.gov/standards/mods/v3/mods-3-2.xsd")
62
- # This is a mods:name. The underscore is purely to avoid namespace conflicts.
63
- t.name_ {
64
- t.namePart
65
- t.role(:ref=>[:role])
66
- t.family_name(:path=>"namePart", :attributes=>{:type=>"family"})
67
- t.given_name(:path=>"namePart", :attributes=>{:type=>"given"}, :label=>"first name")
68
- t.terms_of_address(:path=>"namePart", :attributes=>{:type=>"termsOfAddress"})
69
- }
70
-
71
- # Re-use the structure of a :name Term with a different @type attribute
72
- t.person(:ref=>:name, :attributes=>{:type=>"personal"})
73
- t.organization(:ref=>:name, :attributes=>{:type=>"corporate"})
74
-
75
- # This is a mods:role, which is used within mods:name elements
76
- t.role {
77
- t.text(:path=>"roleTerm",:attributes=>{:type=>"text"})
78
- t.code(:path=>"roleTerm",:attributes=>{:type=>"code"})
79
- }
80
- end
81
- </pre>
82
-
83
- Now tell the Terminology Builder to build your Terminology ("OM::XML::Terminology":OM/XML/Terminology.html):
84
-
85
- <pre>terminology = terminology_builder.build</pre>
86
-
87
- h2. OM Documents
88
-
89
- Generally you will use an "OM::XML::Document":OM/XML/Document.html to work with your xml. Here's how to define a Document class that uses the same Terminology as above.
90
-
91
- In a separate window (so you can keep irb running), create the file my_mods_document.rb in the omtest directory, with this content:
92
-
93
- <pre>
94
- class MyModsDocument &lt; ActiveFedora::NokogiriDatastream
95
- include OM::XML::Document
96
-
97
- set_terminology do |t|
98
- t.root(:path=>"mods", :xmlns=>"http://www.loc.gov/mods/v3", :schema=>"http://www.loc.gov/standards/mods/v3/mods-3-2.xsd")
99
- # This is a mods:name. The underscore is purely to avoid namespace conflicts.
100
- t.name_ {
101
- t.namePart
102
- t.role(:ref=>[:role])
103
- t.family_name(:path=>"namePart", :attributes=>{:type=>"family"})
104
- t.given_name(:path=>"namePart", :attributes=>{:type=>"given"}, :label=>"first name")
105
- t.terms_of_address(:path=>"namePart", :attributes=>{:type=>"termsOfAddress"})
106
- }
107
- t.person(:ref=>:name, :attributes=>{:type=>"personal"})
108
- t.organization(:ref=>:name, :attributes=>{:type=>"corporate"})
109
-
110
- # This is a mods:role, which is used within mods:name elements
111
- t.role {
112
- t.text(:path=>"roleTerm",:attributes=>{:type=>"text"})
113
- t.code(:path=>"roleTerm",:attributes=>{:type=>"code"})
114
- }
115
- end
116
-
117
- # Generates an empty MODS document (used when you call MyModsDocument.new without passing in existing xml)
118
- # (overrides default behavior of creating a plain xml document)
119
- def self.xml_template
120
- # use Nokogiri to build the XML
121
- builder = Nokogiri::XML::Builder.new do |xml|
122
- xml.mods(:version=>"3.3", "xmlns:xlink"=>"http://www.w3.org/1999/xlink",
123
- "xmlns:xsi"=>"http://www.w3.org/2001/XMLSchema-instance",
124
- "xmlns"=>"http://www.loc.gov/mods/v3",
125
- "xsi:schemaLocation"=>"http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-3.xsd") {
126
- xml.titleInfo(:lang=>"") {
127
- xml.title
128
- }
129
- xml.name(:type=>"personal") {
130
- xml.namePart(:type=>"given")
131
- xml.namePart(:type=>"family")
132
- xml.affiliation
133
- xml.computing_id
134
- xml.description
135
- xml.role {
136
- xml.roleTerm("Author", :authority=>"marcrelator", :type=>"text")
137
- }
138
- }
139
- }
140
- end
141
- # return a Nokogiri::XML::Document, not an OM::XML::Document
142
- return builder.doc
143
- end
144
-
145
- end
146
- </pre>
147
-
148
- (Note that we are now also using the ActiveFedora gem.)
149
-
150
- "OM::XML::Document":OM/XML/Document.html provides the set_terminology method to handle the details of creating a TerminologyBuilder and building the terminology for you. This allows you to focus on defining the structures of the Terminology itself.
151
-
152
- h3. Creating XML Documents from Scratch using OM
153
-
154
- By default, new OM Document instances will create an empty xml document, but if you override self.xml_template to return a different object (e.g. "Nokogiri::XML::Document":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Document.html), that will be created instead.
155
-
156
- In the example above, we have overridden xml_template to build an empty, relatively simple MODS document as a "Nokogiri::XML::Document":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Document.html. We use "Nokogiri::XML::Builder":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Builder.html and call its .doc method at the end of xml_template in order to return the "Nokogiri::XML::Document":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Document.html object. Instead of using "Nokogiri::XML::Builder":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Builder.html, you could put your template into an actual xml file and have xml_template use "Nokogiri::XML::Document.parse":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Document.html#M000225 to load it. That's up to you. Create the documents however you want, just return a "Nokogiri::XML::Document":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Document.html.
157
-
158
- to use "Nokogiri::XML::Builder":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Builder.html
159
-
160
- <pre>
161
- require "my_mods_document"
162
- newdoc = MyModsDocument.new
163
- newdoc.to_xml
164
- => NoMethodError: undefined method `to_xml' for nil:NilClass
165
- </pre>
166
-
167
-
168
- h3. Loading an existing XML document with OM
169
-
170
- To load existing XML into your OM Document, use "#from_xml":OM/XML/Container/ClassMethods.html#from_xml-instance_method
171
-
172
- For an example, download "hydrangea_article1.xml":https://github.com/mediashelf/om/blob/master/spec/fixtures/mods_articles/hydrangea_article1.xml into your working directory (omtest), then run this in irb:
173
-
174
- <pre>
175
- sample_xml = File.new("hydrangea_article1.xml")
176
- doc = MyModsDocument.from_xml(sample_xml)
177
- </pre>
178
-
179
- Take a look at the document object's xml that you've just populated. We will use this document for the next few examples.
180
-
181
- <pre>doc.to_xml</pre>
182
-
183
- h3. Querying OM Documents
184
-
185
- Using the Terminology associated with your Document, you can query the xml for nodes _or_ node values without ever writing a line of XPath.
186
-
187
- You can use OM::XML::Document.find_by_terms to retrieve xml _nodes_ from the datastream. It returns Nokogiri::XML::Node objects:
188
-
189
- <pre>
190
- doc.find_by_terms(:person)
191
- doc.find_by_terms(:person).length
192
- doc.find_by_terms(:person).each {|n| puts n.to_xml}
193
- </pre>
194
-
195
- You might prefer to use nodes as a way of getting multiple values pertaining to a node, rather than doing more expensive lookups for each desired value.
196
-
197
- If you want to get directly to the _values_ within those nodes, use OM::XML::Document.term_values:
198
-
199
- <pre>
200
- doc.term_values(:person, :given_name)
201
- doc.term_values(:person, :family_name)
202
- </pre>
203
-
204
- If the xpath points to XML nodes that contain other nodes, the response to term_values will contain Nokogiri::XML::Node objects instead of text values:
205
-
206
- <pre>
207
- doc.term_values(:name)
208
- </pre>
209
-
210
- For more examples of Querying OM Documents, see "Querying Documents":https://github.com/mediashelf/om/blob/master/QUERYING_DOCUMENTS.textile
211
-
212
- h3. Updating, Inserting & Deleting Elements (TermValueOperators)
213
-
214
- For more examples of Updating OM Documents, see "Updating Documents":https://github.com/mediashelf/om/blob/master/UPDATING_DOCUMENTS.textile
215
-
216
- h3. Validating Documents
217
-
218
- If you have an XML schema defined in your Terminology's root Term, you can validate any xml document by calling ".validate" on any instance of your Document classes.
219
-
220
- <pre>doc.validate</pre>
221
-
222
- __*Note:* this method requires an internet connection, as it will download the XML schema from the URL you have specified in the Terminology's root term.__
223
-
224
- h3. Directly accessing the "Nokogiri::XML::Document":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Document.html and the "OM::XML::Terminology":https://github.com/mediashelf/om/blob/master/lib/om/xml/terminology.rb
225
-
226
- "OM::XML::Document":https://github.com/mediashelf/om/blob/master/lib/om/xml/document.rb is implemented as a container for a "Nokogiri::XML::Document":http://nokogiri.rubyforge.org/nokogiri/Nokogiri/XML/Document.html. It uses the associated OM Terminology to provide a bunch of convenience methods that wrap calls to Nokogiri. If you ever need to operate directly on the Nokogiri Document, simply call ng_xml and do what you need to do. OM will not get in your way.
227
-
228
- <pre>ng_document = doc.ng_xml</pre>
229
-
230
- If you need to look at the Terminology associated with your Document, call "#terminology":OM/XML/Document/ClassMethods.html#terminology-instance_method on the Document's _class_.
231
-
232
- <pre>
233
- MyModsDocument.terminology
234
- doc.class.terminology
235
- </pre>
236
-
237
- h2. Using a Terminology to generate XPath Queries based on Term Pointers
238
-
239
- Because the Terminology is essentially a mapping from XPath queries to ruby object attributes, in most cases you won't need to know the actual XPath queries. Nevertheless, when you <i>do</i> want to know the XPath for a term (e.g. to ensure your terminology is correct), the Terminology can generate XPath queries based on the structures you've defined ("OM::XML::TermXPathGenerator":OM/XML/TermXpathGenerator.html).
240
-
241
- Here are the xpaths for :name and two variants of :name that were created using the :ref argument in the Terminology Builder:
242
-
243
- <pre>
244
- terminology.xpath_for(:name)
245
- => "//oxns:name"
246
- terminology.xpath_for(:person)
247
- => "//oxns:name[@type=\"personal\"]"
248
- terminology.xpath_for(:organization)
249
- => "//oxns:name[@type=\"corporate\"]"
250
- </pre>
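The mapping from a term's path and attribute constraints to the XPath strings above can be sketched in plain Ruby. This is a minimal illustration, not OM's actual implementation: `sample_terms` and `sketch_xpath_for` are hypothetical names, and only the `oxns` prefix and the three term definitions are taken from the examples above.

```ruby
# Hypothetical, simplified stand-in for a terminology:
# term name => element path plus attribute constraints.
def sample_terms
  {
    name:         { path: "name", attributes: {} },
    person:       { path: "name", attributes: { type: "personal" } },
    organization: { path: "name", attributes: { type: "corporate" } }
  }
end

# Builds an XPath string for a single top-level term (illustrative only).
def sketch_xpath_for(term_name, terms = sample_terms)
  term = terms.fetch(term_name) { raise ArgumentError, "unknown term: #{term_name.inspect}" }
  # Each attribute constraint becomes one [@attr="value"] predicate.
  predicates = term[:attributes].map { |attr, value| %([@#{attr}="#{value}"]) }.join
  "//oxns:#{term[:path]}#{predicates}"
end

puts sketch_xpath_for(:name)
puts sketch_xpath_for(:person)
puts sketch_xpath_for(:organization)
```

Terms created with `:ref` reuse the same path and differ only in their predicates, which is why `:person` and `:organization` both resolve to `oxns:name`.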
251
-
252
- h2. Solrizing Documents
253
-
254
- The solrizer gem provides support for indexing XML documents into Solr based on OM Terminologies. That process is documented in the "solrizer documentation":http://rdoc.info/github/projecthydra/solrizer
@@ -1,186 +0,0 @@
1
- h3. 3.1.0 (17 Jul 2014)
2
- 2014-07-17: Bump solrizer version to ~> 3.3 [Justin Coyne]
3
-
4
- 2014-07-17: Use the system libxml2 on travis [Justin Coyne]
5
-
6
- 2014-07-17: Remove dependency on mediashelf-loggable [Justin Coyne]
7
-
8
- 2014-06-13: Setting values on a proxy term should build the parent terms if they
9
- don't exist [Justin Coyne]
10
-
11
- 2014-06-05: Handle invalid time for Rails 3 [Justin Coyne]
12
-
13
- 2014-06-02: Updating solrizer, correcting rspec deprecations [Adam Wead]
14
-
15
- h3. 3.0.1 (25 Jun 2013)
16
- Fix bug where values that were the same as the existing values were
17
- removed from the update list
18
-
19
- h3. 3.0.0 (20 Jun 2013)
20
- Return an array instead of a hash from Term#update_values
21
- When passing an array to Term#update_values, it will overwrite all of
22
- the existing values of that term.
23
- OM::XML::Document#find_by_terms_and_value should match strings with
24
- text() = xpath query, rather than contains().
25
-
26
- h3. 2.2.1 (20 Jun 2013)
27
- Revert deprecation of passing Hash values
28
-
29
- h3. 2.2.0 (20 June 2013)
30
- Deprecate passing Hash values into DynamicNode#val= or
31
- Document#update_attributes. This behavior will be removed in 3.0.0
32
- Rails 4 support
33
- Pass nil in order to remove a node (instead of blank string)
34
-
35
- h3. 2.1.2 (3 May 2013)
36
- Fix missing comma after exception
37
-
38
- h3. 2.1.1 (2 May 2013)
39
- bump solrizer to 3.1.0
40
-
41
- h3. 2.1.0 (29 April 2013)
42
- support for element names with periods in them
43
- support for 'type: :time'
44
-
45
-
46
- h3. 2.0.0
47
- Support new solr schema
48
-
49
- h3. 1.8.0
50
- Removed unused mods_article_terminology.xml
51
- Replacing :data_type with :type; deprecating :data_type
52
- Making test related to HYDRA-647 pending
53
- Adding .type method for ruby 1.8.7 compatibility
54
- XML serialization should use the data_type node name and not type
55
- Update homepage in gemspec
56
- Remove .rvmrc
57
- Remove debugger from gemfile
58
-
59
- h3. 1.7.0
60
- Add casting to dates and integers when you specify the :type attribute on a terminology node
61
-
62
- h3. 1.6.1
63
- Integration spec to illustrate selective querying.
64
- Add #use_terminology and #extend_terminology methods to extend existing OM terminologies
65
-
66
- h3. 1.6.0
67
- Delegate all methods on the dynamic node to the found values
68
- Allow arrays to be set on dynamic nodes
69
-
70
- h3. 1.5.3
71
-
72
- HYDRA-657 OM Terms that share a name with methods on Nokogiri Builders have incorrect builder templates
73
- HYDRA-674 XML Builder templates incorrect for :none attributes
74
-
75
- h3. 1.5.2
76
-
77
- HYDRA-742 Can't modify frozen string (parameters in rails 3.2, when using ruby 1.9.3)
78
-
79
- h3. 1.5.1
80
-
81
- HYDRA-737 OM tests fail under ree 1.8.7-2011.12
82
- (Fix also applies to ruby 1.8.7-p357)
83
-
84
- h3. 1.5.0
85
-
86
- HYDRA-358 Added support for namespaceless terminologies
87
-
88
- h3. 1.4.4
89
-
90
- HYDRA-415 https://jira.duraspace.org/browse/HYDRA-415 Fixed insert of attribute nodes
91
- update to rspec2
92
- compatibility fixes for ruby 1.9
93
- RedCloth updated to 4.2.9
94
- Replace local 'delimited_list' logic with Array#join
95
-
96
- h3. 1.4.3
97
-
98
- HYDRA-681 https://jira.duraspace.org/browse/HYDRA-681 Om was calling .dirty when updating nodes, it should only do that when it's operating on a Nokogiri:Datastream
99
- HYDRA-682 https://jira.duraspace.org/browse/HYDRA-682 Om first level terms support update
100
-
101
- h3. 1.4.2
102
-
103
- "HYDRA-667":https://jira.duraspace.org/browse/HYDRA-667 Fixed bug where updating nodes wasn't marking the document as dirty
104
-
105
- h3. 1.4.0
106
-
107
- Added dynamic node access DSL. Added a warning when calling an index on a proxy term.
108
-
109
- h3. 1.3.0
110
-
111
- Document automatically includes Validation module, meaning that you can now call .validate on any document
112
-
113
- h3. 1.2.4
114
-
115
- TerminologyBuilder.root now passes on its options to the root term builder that it creates.
116
-
117
- h3. 1.2.3
118
-
119
- NamedTermProxies can now point to terms at the root of a Terminology
120
-
121
- h3. 1.2.0
122
-
123
- added OM::XML::TemplateRegistry for built-in templating and creation of new XML nodes
124
-
125
- h3. 1.1.1
126
-
127
- "HYDRA-395":https://jira.duraspace.org/browse/HYDRA-395: Fixed bug that prevented you from appending term values with apostrophes in them
128
-
129
- h2. 1.1.0
130
-
131
- HYDRA-371: Provide a way to specify a term that points to nodes where an attribute is not set
132
-
133
- Add support for this syntax in Terminologies, where an attribute value can be :none. When an attribute's value is set to :none, a not() predicate is used in the resulting xpath
134
-
135
- t.computing_id(:path=>"namePart", :attributes=>{:type=>:none})
136
-
137
- will result in an xpath that looks like:
138
-
139
- //namePart[not(@type)]
140
-
141
- namePart[not(@type)]
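As a rough illustration of the `:none` behavior described in this entry, the sketch below turns an attributes hash into XPath predicates. The helper name `sketch_attribute_predicates` is hypothetical and this is not the code OM uses; it only mirrors the rule stated above.

```ruby
# Illustrative helper: turns an attributes hash into XPath predicates.
# A value of :none is rendered as not(@attr); anything else as @attr="value".
def sketch_attribute_predicates(attributes)
  attributes.map do |attr, value|
    value == :none ? "[not(@#{attr})]" : %([@#{attr}="#{value}"])
  end.join
end

puts "//namePart" + sketch_attribute_predicates({ type: :none })
puts "//namePart" + sketch_attribute_predicates({ type: "family" })
```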
142
-
143
- h3. 1.0.1
144
-
145
- HYDRA-329: Allow for NamedTermProxies at root of Terminology
146
-
147
- h2. 1.0.0
148
-
149
- Stable release
150
-
151
- h3. 0.1.10
152
-
153
- Improving generation of constrained xpath queries
154
-
155
- h3. 0.1.9
156
-
157
- Improving support for deeply nested nodes (still needs work though)
158
-
159
- h3. 0.1.5
160
-
161
- * root_property now inserts an entry into the properties hash
162
- * added .generate method for building new instances of declared properties
163
- * refinements to accessor_xpath
164
-
165
- h3. 0.1.4
166
-
167
- * made attribute_xpath idempotent
168
-
169
- h3. 0.1.3
170
-
171
- * added accessor_generic_name and accessor_hierarchical_name methods
172
-
173
- h3. 0.1.2
174
-
175
- * changed syntax for looking up accessors with (optional) index values -- now using [{:person=>1}, :first_name] instead of [:person, 1, :first_name]
176
-
177
- h3. 0.1.1
178
-
179
- RENAMED to om (formerly opinionated-xml)
180
-
181
- * broke up functionality into Modules
182
- * added OM::XML::Accessor functionality
183
-
184
- h3. 0.1
185
-
186
- Note: OX v.1 Does not handle treating attribute values as the changing "value" of a node
@@ -1,139 +0,0 @@
1
- h2. Querying OM Documents
2
-
3
- This document will help you understand how to access the information associated with an "OM::XML::Document":OM/XML/Document.html object. We will explain some of the methods provided by the "OM::XML::Document":OM/XML/Document.html module and its related modules "OM::XML::TermXPathGenerator":OM/XML/TermXPathGenerator.html & "OM::XML::TermValueOperators":OM/XML/TermValueOperators.html
4
-
5
- _Note: In your code, don't worry about including OM::XML::TermXPathGenerator and OM::XML::TermValueOperators into your classes. OM::XML::Document handles that for you._
6
-
7
- h3. Load the Sample XML and Sample Terminology
8
-
9
- These examples use the Document class defined in "OM::Samples::ModsArticle":https://github.com/mediashelf/om/blob/master/lib/om/samples/mods_article.rb
10
-
11
- Download "hydrangea_article1.xml":https://github.com/mediashelf/om/blob/master/spec/fixtures/mods_articles/hydrangea_article1.xml (sample xml) into your working directory, then run this in irb:
12
-
13
- <pre>
14
- require "om/samples"
15
- sample_xml = File.new("hydrangea_article1.xml")
16
- doc = OM::Samples::ModsArticle.from_xml(sample_xml)
17
- </pre>
18
-
19
- h2. Querying the "OM::XML::Document":OM/XML/Document.html
20
-
21
- The "OM::XML::Terminology":OM/XML/Terminology.html declared by "OM::Samples::ModsArticle":https://github.com/mediashelf/om/blob/master/lib/om/samples/mods_article.rb maps the defined Terminology structure to xpath queries. It will also run the queries for you in most cases.
22
-
23
- h4. xpath_for method of "OM::XML::Terminology":OM/XML/Terminology.html retrieves xpath expressions for OM terms
24
-
25
- The xpath_for method retrieves the xpath used by the "OM::XML::Terminology":OM/XML/Terminology.html
26
-
27
- Examples of xpaths for :name and two variants of :name that were created using the :ref argument in the Terminology builder:
28
-
29
- <pre>
30
- OM::Samples::ModsArticle.terminology.xpath_for(:name)
31
- => "//oxns:name"
32
- OM::Samples::ModsArticle.terminology.xpath_for(:person)
33
- => "//oxns:name[@type=\"personal\"]"
34
- OM::Samples::ModsArticle.terminology.xpath_for(:organization)
35
- => "//oxns:name[@type=\"corporate\"]"
36
- </pre>
37
-
38
- h4. Working with Terms
39
-
40
- To retrieve the values of xml nodes, use the term_values method:
41
-
42
- <pre>
43
- doc.term_values(:person, :first_name)
44
- doc.term_values(:person, :last_name)
45
- </pre>
46
-
47
- The term_values method is defined in the "OM::XML::TermValueOperators":OM/XML/TermValueOperators.html module, which is included in "OM::XML::Document":OM/XML/Document.html
48
-
49
- Note that if a term's xpath mapping points to XML nodes that contain other nodes, the response to term_values will be Nokogiri::XML::Node objects instead of text values:
50
-
51
- <pre>
52
- doc.term_values(:name)
53
- </pre>
54
-
55
- More examples of using term_values and find_by_terms (defined in "OM::XML::Document":OM/XML/Document.html):
56
-
57
- <pre>
58
- doc.find_by_terms(:organization).to_xml
59
- doc.term_values(:organization, :role)
60
- => ["\n Funder\n "]
61
- doc.term_values(:organization, :namePart)
62
- => ["NSF"]
63
- </pre>
64
-
65
- To retrieve the values of nested terms, create a sequence of terms, from outermost to innermost:
66
-
67
- <pre>
68
- OM::Samples::ModsArticle.terminology.xpath_for(:journal, :issue, :pages, :start)
69
- => "//oxns:relatedItem[@type=\"host\"]/oxns:part/oxns:extent[@unit=\"pages\"]/oxns:start"
70
- doc.term_values(:journal, :issue, :pages, :start)
71
- => ["195"]
72
- </pre>
73
-
74
- If you get one of the term names wrong in the sequence, OM will tell you which one is causing problems. See what happens when you put :page instead of :pages in your argument to term_values.
75
-
76
- <pre>
77
- doc.term_values(:journal, :issue, :page, :start)
78
- OM::XML::Terminology::BadPointerError: You attempted to retrieve a Term using this pointer: [:journal, :issue, :page] but no Term exists at that location. Everything is fine until ":page", which doesn't exist.
79
- </pre>
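The way a pointer is walked from outermost to innermost term, and how a bad term name can be reported, might look roughly like the sketch below. It is a simplified stand-in under stated assumptions: `sample_terminology` and `sketch_nested_xpath` are hypothetical names, the nested hash is not OM's internal data structure, and a plain `KeyError` stands in for `OM::XML::Terminology::BadPointerError`.

```ruby
# Hypothetical nested terminology: each term has one XPath step and optional children.
def sample_terminology
  {
    journal: {
      step: 'oxns:relatedItem[@type="host"]',
      children: {
        issue: {
          step: "oxns:part",
          children: {
            pages: {
              step: 'oxns:extent[@unit="pages"]',
              children: { start: { step: "oxns:start", children: {} } }
            }
          }
        }
      }
    }
  }
end

# Walks the pointer one term at a time, collecting XPath steps.
# Raises KeyError naming the first term that cannot be found.
def sketch_nested_xpath(pointer, terminology = sample_terminology)
  level = terminology
  steps = []
  pointer.each_with_index do |name, i|
    term = level[name]
    unless term
      raise KeyError, "no term at #{pointer[0..i].inspect}; everything is fine until #{name.inspect}"
    end
    steps << term[:step]
    level = term[:children]
  end
  "//" + steps.join("/")
end

puts sketch_nested_xpath([:journal, :issue, :pages, :start])

begin
  sketch_nested_xpath([:journal, :issue, :page, :start])
rescue KeyError => e
  puts e.message
end
```

Because lookup stops at the first missing term, the error can report both the partial pointer that was resolved and the name that failed, which is the behavior the error message above describes.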
80
-
81
-
82
- h2. When XML Elements are Reused in a Document
83
-
84
- (Another way to put this: the xpath statement for a term can be ambiguous.)
85
-
86
- In our MODS document, we have two distinct uses of the title XML element:
87
- # the title of the published article
88
- # the title of the journal it was published in.
89
-
90
- How can we distinguish between these two uses?
91
-
92
- <pre>
93
- doc.term_values(:title_info, :main_title)
94
- => ["ARTICLE TITLE", "VARYING FORM OF TITLE", "TITLE OF HOST JOURNAL"]
95
- doc.term_values(:mods, :title_info, :main_title)
96
- => ["ARTICLE TITLE", "VARYING FORM OF TITLE"]
97
- OM::Samples::ModsArticle.terminology.xpath_for(:title_info, :main_title)
98
- => "//oxns:titleInfo/oxns:title"
99
- </pre>
100
-
101
- The solution: include the root node in your term pointer.
102
-
103
- <pre>
104
- OM::Samples::ModsArticle.terminology.xpath_for(:mods, :title_info, :main_title)
105
- => "//oxns:mods/oxns:titleInfo/oxns:title"
106
- doc.term_values(:mods, :title_info, :main_title)
107
- => ["ARTICLE TITLE", "VARYING FORM OF TITLE"]
108
- </pre>
109
-
110
- We can still access the Journal title by its own pointers:
111
-
112
- <pre>
113
- doc.term_values(:journal, :title_info, :main_title)
114
- => ["TITLE OF HOST JOURNAL"]
115
- </pre>
116
-
117
- h2. Making life easier with Proxy Terms
118
-
119
- If you use a nested term often, you may want to avoid typing the whole sequence of term names by defining a _proxy_ term.
120
-
121
- As you can see in "OM::Samples::ModsArticle":https://github.com/mediashelf/om/blob/master/lib/om/samples/mods_article.rb, we have defined a few proxy terms for convenience.
122
-
123
- <pre>
124
- t.publication_url(:proxy=>[:location,:url])
125
- t.peer_reviewed(:proxy=>[:journal,:origin_info,:issuance], :index_as=>[:facetable])
126
- t.title(:proxy=>[:mods,:title_info, :main_title])
127
- t.journal_title(:proxy=>[:journal, :title_info, :main_title])
128
- </pre>
129
-
130
- You can use proxy terms just like any other term when querying the document.
131
-
132
- <pre>
133
- OM::Samples::ModsArticle.terminology.xpath_for(:peer_reviewed)
134
- => "//oxns:relatedItem[@type=\"host\"]/oxns:originInfo/oxns:issuance"
135
- OM::Samples::ModsArticle.terminology.xpath_for(:title)
136
- => "//oxns:mods/oxns:titleInfo/oxns:title"
137
- OM::Samples::ModsArticle.terminology.xpath_for(:journal_title)
138
- => "//oxns:relatedItem[@type=\"host\"]/oxns:titleInfo/oxns:title"
139
- </pre>
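Conceptually, a proxy term is shorthand for a longer pointer. The sketch below shows that expansion in plain Ruby; `sample_proxies` and `expand_proxy` are hypothetical names used for illustration, and the two proxy definitions are copied from the examples above. How OM resolves proxies internally may differ.

```ruby
# Hypothetical proxy table: proxy term name => the full pointer it stands for.
def sample_proxies
  {
    title:         [:mods, :title_info, :main_title],
    journal_title: [:journal, :title_info, :main_title]
  }
end

# Expands a leading proxy term into its full pointer;
# pointers that do not start with a proxy pass through unchanged.
def expand_proxy(pointer, proxies = sample_proxies)
  head, *rest = pointer
  proxies.key?(head) ? proxies[head] + rest : pointer
end

p expand_proxy([:journal_title])
p expand_proxy([:person, :given_name])
```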