xml-mapping 0.8.1 → 0.9.1

Files changed (71)
  1. data/ChangeLog +64 -3
  2. data/README +871 -173
  3. data/README_XPATH +40 -13
  4. data/Rakefile +37 -26
  5. data/TODO.txt +39 -8
  6. data/examples/README +5 -0
  7. data/examples/company_usage.intout +34 -22
  8. data/examples/documents_folders.rb +31 -0
  9. data/examples/documents_folders.xml +16 -0
  10. data/examples/documents_folders_usage.intin.rb +18 -0
  11. data/examples/documents_folders_usage.intout +46 -0
  12. data/examples/order_signature_enhanced_usage.intout +21 -11
  13. data/examples/order_usage.intin.rb +52 -5
  14. data/examples/order_usage.intout +154 -80
  15. data/examples/person.intin.rb +44 -0
  16. data/examples/person.intout +27 -0
  17. data/examples/person_mm.intin.rb +119 -0
  18. data/examples/person_mm.intout +114 -0
  19. data/examples/publication.intin.rb +44 -0
  20. data/examples/publication.intout +20 -0
  21. data/examples/reader.intin.rb +33 -0
  22. data/examples/reader.intout +19 -0
  23. data/examples/stringarray.rb +5 -0
  24. data/examples/stringarray.xml +10 -0
  25. data/examples/stringarray_usage.intin.rb +11 -0
  26. data/examples/stringarray_usage.intout +31 -0
  27. data/examples/time_augm.intout +19 -7
  28. data/examples/time_augm_loading.intin.rb +44 -0
  29. data/examples/time_augm_loading.intout +12 -0
  30. data/examples/time_node.intin.rb +79 -0
  31. data/examples/time_node.rb +3 -2
  32. data/examples/time_node_w_marshallers.intin.rb +48 -0
  33. data/examples/time_node_w_marshallers.intout +25 -0
  34. data/examples/time_node_w_marshallers.xml +9 -0
  35. data/examples/xpath_create_new.intout +132 -114
  36. data/examples/xpath_ensure_created.intout +86 -65
  37. data/examples/xpath_pathological.intout +16 -16
  38. data/examples/xpath_usage.intout +1 -1
  39. data/install.rb +1 -0
  40. data/lib/xml/mapping.rb +3 -1
  41. data/lib/xml/mapping/base.rb +442 -272
  42. data/lib/xml/mapping/core_classes_mapping.rb +32 -0
  43. data/lib/xml/mapping/standard_nodes.rb +176 -86
  44. data/lib/xml/mapping/version.rb +2 -2
  45. data/lib/xml/rexml_ext.rb +186 -0
  46. data/lib/xml/xxpath.rb +28 -265
  47. data/lib/xml/xxpath/steps.rb +345 -0
  48. data/lib/xml/xxpath_methods.rb +96 -0
  49. data/test/all_tests.rb +4 -1
  50. data/test/benchmark_fixtures.rb +14 -0
  51. data/test/{multiple_mappings.rb → bookmarks.rb} +0 -0
  52. data/test/company.rb +47 -0
  53. data/test/documents_folders.rb +11 -1
  54. data/test/examples_test.rb +29 -0
  55. data/test/fixtures/benchmark.xml +77 -0
  56. data/test/fixtures/company1.xml +9 -0
  57. data/test/fixtures/documents_folders.xml +0 -8
  58. data/test/fixtures/documents_folders2.xml +13 -19
  59. data/test/fixtures/triangle_m1.xml +17 -0
  60. data/test/fixtures/triangle_m2.xml +19 -0
  61. data/test/inheritance_test.rb +50 -0
  62. data/test/multiple_mappings_test.rb +155 -0
  63. data/test/rexml_xpath_benchmark.rb +29 -0
  64. data/test/triangle_mm.rb +57 -0
  65. data/test/xml_mapping_adv_test.rb +36 -1
  66. data/test/xml_mapping_test.rb +136 -7
  67. data/test/xpath_test.rb +154 -0
  68. data/test/xxpath_benchmark.rb +36 -0
  69. data/test/xxpath_benchmark.result1.txt +17 -0
  70. data/test/xxpath_methods_test.rb +61 -0
  71. metadata +139 -90
data/ChangeLog CHANGED
@@ -1,8 +1,40 @@
1
- 2005-12-07 Olaf Klischat
1
+ 2006/12/26 Olaf Klischat
2
2
 
3
- * release 0.8.1
3
+ * when creating a new instance of a mapping class from an XML
4
+ input, call new if possible rather than allocate (patch by Fred
5
+ Loney)
4
6
 
5
- 2005-12-07 Olaf Klischat
7
+ 2006/04/30 Olaf Klischat
8
+
9
+ * xml/xxpath: text() steps
10
+
11
+ 2006/03/31 Olaf Klischat
12
+
13
+ * SubObjectBaseNode: store marshaller/unmarshaller in
14
+ @marshaller/@unmarshaller (general policy for node
15
+ implementations is to set @options to
16
+ originally supplied option arguments and never change it; then
17
+ store "extracted" information in additional @attributes)
18
+
19
+ 2006/02/19 Olaf Klischat
20
+
21
+ * xml/xxpath: child::*[@attrname='attrvalue'] steps
22
+
23
+ 2006/02/19 Olaf Klischat
24
+
25
+ * xml/xxpath: .[@attrname='attrvalue'] steps
26
+
27
+ 2005/12/30 Olaf Klischat
28
+
29
+ * node initializers in node's initialize() method; initialize_impl
30
+ deprecated (but retained for backward compatibility)
31
+
32
+ 2005/12/28 Olaf Klischat
33
+
34
+ * :reader/:writer options to node factory functions (for partially
35
+ or completely overriding the node's functionality)
36
+
37
+ 2005/12/07 Olaf Klischat
6
38
 
7
39
  * ChangeLog file
8
40
 
@@ -10,6 +42,35 @@
10
42
 
11
43
  * bugfix: clone default values to avoid external modifications
12
44
 
45
+ 2005/11/27 Olaf Klischat
46
+
47
+ * xml/xxpath: name1|name2|... steps
48
+
49
+ 2005/11/19 Olaf Klischat
50
+
51
+ * support for String and numeric types in :class attributes
52
+
53
+ 2005/11/16 Olaf Klischat
54
+
55
+ * choice_node
56
+
57
+ 2005/11/05 Olaf Klischat
58
+
59
+ * xml/xxpath: descendants ("//") axis
60
+
61
+ 2005/10/11 Olaf Klischat
62
+
63
+ * support for "." paths/path elements (map sub-objects to XML data
64
+ from the parent object's XML element)
65
+
66
+ 2005/10/05 Olaf Klischat
67
+
68
+ * multiple distinct mappings per mapping class
69
+
70
+ 2005/09/30 Olaf Klischat
71
+
72
+ * @options moved from SingleAttributeNode to Node
73
+
13
74
  2005/07/07 Olaf Klischat
14
75
 
15
76
  * release 0.8
data/README CHANGED
@@ -1,9 +1,7 @@
1
- = XML-MAPPING: XML-to-object (and back) mapper for Ruby, including XPath interpreter
1
+ = XML-MAPPING: XML-to-object (and back) Mapper for Ruby, including XPath Interpreter
2
2
 
3
3
  Xml-mapping is an easy to use, extensible library that allows you to
4
- semi-automatically map Ruby objects to XML trees and vice versa. It is
5
- easy to use and has a modular design that allows for easy extension of
6
- its functionality.
4
+ semi-automatically map Ruby objects to XML trees and vice versa.
7
5
 
8
6
  == Download
9
7
 
@@ -11,16 +9,30 @@ For downloading the latest version, CVS repository access etc. go to:
11
9
 
12
10
  http://rubyforge.org/projects/xml-mapping/
13
11
 
14
- == Example
12
+ == Contents of this Document
13
+
14
+ - {Example}[aref:example]
15
+ - {Single-attribute Nodes}[aref:sanodes]
16
+ - {Default Values}[aref:defaultvalues]
17
+ - {Single-attribute Nodes with Sub-objects}[aref:subobjnodes]
18
+ - {Attribute Handling Details, Augmenting Existing Classes}[aref:attrdefns]
19
+ - {Other Nodes}[aref:onodes]
20
+ - {choice_node}[aref:choice_node]
21
+ - {Readers/Writers}[aref:readerswriters]
22
+ - {Multiple Mappings per Class}[aref:mappings]
23
+ - {Defining your own Node Types}[aref:definingnodes]
24
+ - {XPath Interpreter}[aref:xpath]
25
+
26
+ == {Example}[a:example]
15
27
 
16
28
  (example document stolen + extended from
17
29
  http://www.castor.org/xml-mapping.html)
18
30
 
19
- === Input document:
31
+ === Input Document:
20
32
 
21
33
  :include: order.xml
22
34
 
23
- === Mapping class declaration:
35
+ === Mapping Class Declaration:
24
36
 
25
37
  :include: order.rb
26
38
 
@@ -28,9 +40,6 @@ http://www.castor.org/xml-mapping.html)
28
40
 
29
41
  :include: order_usage.intout
30
42
 
31
-
32
- == Description
33
-
34
43
  As shown in the example, you have to include XML::Mapping into a class
35
44
  to turn it into a "mapping class". There are no other restrictions
36
45
  imposed on mapping classes; you can add attributes and methods to
@@ -38,14 +47,17 @@ them, include additional modules in them, derive them from other
38
47
  classes, derive other classes from them etc.pp.
39
48
 
40
49
  An instance of a mapping class can be created from/converted into an
41
- XML node by means of instance methods like XML::Mapping.load_from_xml,
42
- XML::Mapping#save_to_xml, XML::Mapping.load_from_file,
50
+ XML node with methods like XML::Mapping::ClassMethods.load_from_xml,
51
+ XML::Mapping#save_to_xml, XML::Mapping::ClassMethods.load_from_file,
43
52
  XML::Mapping#save_to_file. Special class methods like "text_node",
44
- "array_node" etc., called "node factory methods", may be called from
45
- the body of the class definition to define instance attributes that
46
- are automatically and bidirectionally mapped to subtrees of the XML
47
- element an instance of the class is mapped to. For example, in the
48
- definition
53
+ "array_node" etc., called *node* *factory* *methods*, may be called
54
+ from the body of the class definition to define instance attributes
55
+ that are automatically and bidirectionally mapped to subtrees of the
56
+ XML element an instance of the class is mapped to.
57
+
58
+ == {Single-attribute Nodes}[a:sanodes]
59
+
60
+ For example, in the definition
49
61
 
50
62
  class Address
51
63
  include XML::Mapping
@@ -59,45 +71,30 @@ definition
59
71
  the first call to #text_node creates an attribute named "city" which
60
72
  is mapped to the text of the XML child element defined by the XPath
61
73
  expression "City" (xml-mapping includes an XPath interpreter that can
62
- also be used seperately; see below). When you create an instance of
63
- +Address+ from an XML element (using Address.load_from_file(file_name)
64
- or Address.load_from_xml(rexml_element)), that instance's "city"
74
+ also be used separately; see below[aref:xpath]). When you create an
75
+ instance of +Address+ from an XML element (using
76
+ Address.load_from_file(file_name) or
77
+ Address.load_from_xml(rexml_element)), that instance's "city"
65
78
  attribute will be set to the text of the XML element's "City" child
66
- element. When you convert an instance of Address into an XML element,
67
- a sub-element "City" is added and it text is set to the current value
68
- of the +city+ attribute. The other node types (numeric_node,
69
- array_node etc.) work analogously. The node types +object_node+,
70
- +array_node+, and +hash_node+ recursively map sub-trees to instances
71
- of mapping classes (as opposed to simple types like String
72
- etc.). For example, with the line
73
-
74
- array_node :signatures, "Signed-By", "Signature", :class=>Signature, :default_value=>[]
75
-
76
- , an attribute named "signatures" is added to the surrounding class
77
- (here: Order); the attribute will be an array whose elements
78
- correspond to the XML elements yielded by the XPath
79
- "Signed-By/Signature". Each element will be of class +Signature+ (each
80
- array element is created from the corresponding XML element by just
81
- calling <tt>Signature.load_from_xml(the_xml_element)</tt>). The reason
82
- why the path "Signed-By/Signature" is provieded in two arguments
83
- instead of just one combined one becomes apparent when marshalling the
84
- array (along with the surrounding object) back into a sequence of XML
85
- elements. When that happens, "Signed-By" names the common base element
86
- for all those elements, and "Signature" is the path that will be
87
- duplicated for each element. The input document in the example above
88
- shows how this ends up looking.
89
-
90
- Hash nodes work similarly, but they define hash-valued attributes
91
- instead of array-valued ones.
92
-
93
- Refer to the reference documentation for details about the node types
94
- that are included in the xml-mapping library.
95
-
96
-
97
- === Default values
98
-
99
- For each node you may define a _default value_ which will be set if
100
- there was no value defined for the attribute in the XML source.
79
+ element. When you convert an instance of +Address+ into an XML
80
+ element, a sub-element "City" is added and its text is set to the
81
+ current value of the +city+ attribute. The other node types
82
+ (numeric_node, array_node etc.) work analogously. Generally said, when
83
+ an instance of the above +Address+ class is created from or converted
84
+ to an XML tree, each of the four nodes in the class maps some parts of
85
+ that XML tree to a single, specific attribute of the +Address+
86
+ instance. The name of that attribute is given in the first argument to
87
+ the node factory method. Such a node is called a "single-attribute
88
+ node". All node types that come with xml-mapping except one
89
+ (+choice_node+, which I'll talk about below) are single-attribute
90
+ nodes.
91
+
92
+
93
+ === {Default Values}[a:defaultvalues]
94
+
95
+ For each single-attribute node you may define a <i>default value</i>
96
+ which will be set if there was no value defined for the attribute in
97
+ the XML source.
101
98
 
102
99
  From the example:
103
100
 
@@ -118,7 +115,7 @@ The semantics of default values are as follows:
118
115
  (when defining your own initializer, you'll have to call the
119
116
  inherited _initialize_ method in order to get this behaviour)
120
117
 
121
- - when loading:
118
+ - when loading an instance from an XML document:
122
119
 
123
120
  - attributes without default values that are not represented in the
124
121
  XML raise an error
@@ -130,7 +127,7 @@ The semantics of default values are as follows:
130
127
  in the XML
131
128
 
132
129
 
133
- - when saving:
130
+ - when saving an instance to an XML document:
134
131
 
135
132
  - unset attributes without default values raise an error
136
133
 
@@ -150,10 +147,246 @@ This implies that:
150
147
 
151
148
 
152
149
 
153
- === Attribute handling details, augmenting existing classes
150
+ === {Single-attribute Nodes with Sub-objects}[a:subobjnodes]
151
+
152
+ Single-attribute nodes of type +array_node+, +hash_node+, and
153
+ +object_node+ recursively map one or more subtrees of their XML to
154
+ sub-objects (e.g. array elements or hash values) of their
155
+ attribute. For example, with the line
156
+
157
+ array_node :signatures, "Signed-By", "Signature", :class=>Signature, :default_value=>[]
154
158
 
155
- I'll shed some more light on how xml-mapping adds mapped attributes to
156
- Ruby classes. An attribute declaration like
159
+ , an attribute named "signatures" is added to the surrounding class
160
+ (here: +Order+); the attribute will be an array whose elements
161
+ correspond to the XML sub-trees yielded by the XPath expression
162
+ "Signed-By/Signature" (relative to the tree corresponding to the
163
+ +Order+ instance). Each element will be of class +Signature+
164
+ (internally, each element is created from its corresponding XML
165
+ subtree by just calling
166
+ <tt>Signature.load_from_xml(the_subtree)</tt>). The reason why the
167
+ path "Signed-By/Signature" is provided in two arguments instead of
168
+ just one combined one becomes apparent when marshalling the array
169
+ (along with the surrounding +Order+ object) back into a sequence of
170
+ XML elements. When that happens, "Signed-By" names the common base
171
+ element for all those elements, and "Signature" is the path that will
172
+ be duplicated for each element. For example, when the +signatures+
173
+ attribute contains an array with 3 +Signature+ instances (let's call
174
+ them <tt>sig1</tt>, <tt>sig2</tt>, and <tt>sig3</tt>) in it, it will
175
+ be marshalled to an XML tree that looks like this:
176
+
177
+ <Signed-By>
178
+ <Signature>
179
+ [marshalled object sig1]
180
+ </Signature>
181
+ <Signature>
182
+ [marshalled object sig2]
183
+ </Signature>
184
+ <Signature>
185
+ [marshalled object sig3]
186
+ </Signature>
187
+ </Signed-By>
188
+
189
+ Internally, each +Signature+ instance is stored into its
190
+ <tt><Signature></tt> sub-element by calling
191
+ <tt>the_signature_instance.fill_into_xml(the_sub_element)</tt>. The
192
+ input document in the example above shows how this ends up looking.
193
+
194
+ <tt>hash_node</tt>s work similarly, but they define hash-valued attributes
195
+ instead of array-valued ones.
196
+
197
+ <tt>object_node</tt>s are the simplest of the three types of
198
+ single-attribute nodes with sub-objects. They just map a single given
199
+ subtree directly to their attribute value. See the example for
200
+ examples :)
201
+
202
+ The mentioned methods +load_from_xml+ and +fill_into_xml+ are the only
203
+ methods classes must implement in order to be usable in the
204
+ <tt>:class=></tt> keyword arguments to node factory methods. Mapping
205
+ classes (i.e. classes that <tt>include XML::Mapping</tt>)
206
+ automatically inherit those functions and can thus be readily used in
207
+ <tt>:class=></tt> arguments, as shown for the +Signature+ class in the
208
+ +array_node+ call above. In addition to that, xml-mapping adds those
209
+ methods to some of Ruby's core classes, namely +String+ and +Numeric+
210
+ (and thus +Float+, +Integer+, and +Bignum+). So you can also use
211
+ strings or numbers as sub-objects of attributes of +array_node+,
212
+ +hash_node+, or +object_node+ nodes. For example, say you have an XML
213
+ document like this one:
214
+
215
+ :include: stringarray.xml
216
+
217
+ , and you want to map all the names to a string array attribute
218
+ +names+, you could do it like this:
219
+
220
+ :include: stringarray.rb
221
+
222
+ usage:
223
+
224
+ :include: stringarray_usage.intout
225
+
226
+ As a side note, this feature actually makes +text_node+ and
227
+ +numeric_node+ special cases of +object_node+. For example,
228
+ <tt>text_node :attr, "path"</tt> is the same as <tt>object_node :attr,
229
+ "path", :class=>String</tt>.
230
+
231
+
232
+ ==== Polymorphic Sub-objects, Marshallers/Unmarshallers
233
+
234
+ Besides the <tt>:class</tt> keyword argument, there are alternative
235
+ ways for a single-attribute node with sub-objects to specify the way
236
+ the sub-objects are created from/marshalled into their subtrees.
237
+
238
+ First, it's possible not to specify anything at all -- in that case,
239
+ the class of a sub-object will be automatically deduced from the root
240
+ element name of its subtree. This allows you to achieve a kind of
241
+ "polymorphic", late-bound way to decide about the sub-object's
242
+ class. The following example document contains a hierarchical,
243
+ recursive set of named "documents" and "folders", where folders hold a
244
+ set of entries, each of which may again be either a document or a
245
+ folder:
246
+
247
+ :include: documents_folders.xml
248
+
249
+ This can be mapped to Ruby like this:
250
+
251
+ :include: documents_folders.rb
252
+
253
+ Usage:
254
+
255
+ :include: documents_folders_usage.intout
256
+
257
+ As you see, the <tt>Folder#entries</tt> attribute is mapped via an
258
+ array_node that does not specify a <tt>:class</tt> or anything else to
259
+ govern the instantiation of the array's elements. This causes
260
+ xml-mapping to deduce the class of each array element from the root
261
+ element name of the corresponding XML tree. In this example, the root
262
+ element name is either "document" or "folder". The mapping between
263
+ root element names and class names is the one briefly described in
264
+ example[aref:example] at the beginning of this document -- the
265
+ unqualified class name is just converted to lower case and "dashed",
266
+ e.g. Foo::Bar::MyClass becomes "my-class"; and you may override this
267
+ on a per-class basis by calling <tt>root_element_name
268
+ "the-new-name"</tt> in the class body. In our example, the root
269
+ element name "document" leads to an instantiation of class +Document+,
270
+ and the root element name "folder" leads to an instantiation of class
271
+ +Folder+.
272
+
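The default naming rule described in the added README text (unqualified class name, lower-cased and "dashed") can be sketched in plain Ruby. This is an illustration only, not code from the gem: the helper name `default_root_element_name` is hypothetical, and the regular expression is an assumption about how "dashed" treats word boundaries.

```ruby
# Hypothetical helper, not part of xml-mapping: approximates the documented
# default rule "unqualified class name, converted to lower case and dashed".
def default_root_element_name(class_name)
  # Drop any enclosing module names, e.g. "Foo::Bar::MyClass" -> "MyClass".
  unqualified = class_name.split("::").last
  # Insert a dash before each upper-case letter that follows a lower-case
  # letter or digit, then lower-case the whole name.
  unqualified.gsub(/([a-z0-9])([A-Z])/, '\1-\2').downcase
end

default_root_element_name("Foo::Bar::MyClass")  # => "my-class"
default_root_element_name("Document")           # => "document"
default_root_element_name("Folder")             # => "folder"
```

Read in the other direction, this is the lookup that lets an untyped +array_node+ pick +Document+ for a "document" element and +Folder+ for a "folder" element.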
273
+ Incidentally, the last example shows that you can readily derive
274
+ mapping classes from one another (as said before, you can also derive
275
+ mapping classes from other classes, include other modules into them
276
+ etc. at will). This works just like intuition thinks it should -- when
277
+ deriving one mapping class from another one, the list of nodes in
278
+ effect when loading/saving instances of the derived class will consist
279
+ of all nodes of that class and all superclasses, starting with the
280
+ topmost superclass that has nodes defined. There is one thing to take
281
+ care of though: When deriving mapping classes from one another, you
282
+ have to make sure to <tt>include XML::Mapping</tt> in each class. This
283
+ requirement exists purely due to ease-of-implementation
284
+ considerations; there are probably ways to do away with it, but the
285
+ inconvenience seemed not severe enough for me to bother (as
286
+ yet). Still, you might get "strange" errors if you forget to do it for
287
+ a class.
288
+
289
+ Besides the <tt>:class</tt> keyword argument and no argument, there is
290
+ a third way to specify the way the sub-objects are created
291
+ from/marshalled into their subtrees: <tt>:marshaller</tt> and/or
292
+ <tt>:unmarshaller</tt> keyword arguments. Here you pass procs in which
293
+ you just do all the work manually. So this is basically a "catch-all"
294
+ for cases where the other two alternatives are not appropriate for the
295
+ problem at hand. (*TODO*: Use other example?) Let's say we want to
296
+ extend the +Signature+ class from the initial example to include the
297
+ date on which the signature was created. We want the new XML
298
+ representation of such a signature to look like this:
299
+
300
+ :include: time_node_w_marshallers.xml
301
+
302
+ So, a new "signed-on" element was added that holds the day, month, and
303
+ year. In the +Signature+ instance in Ruby, we want the date to be
304
+ stored in an attribute named +signed_on+ of type +Time+ (that's Ruby's
305
+ built-in +Time+ class).
306
+
307
+ One could think of using +object_node+, but something like
308
+ <tt>object_node :signed_on, "signed-on", :class=>Time</tt> won't work
309
+ because +Time+ isn't a mapping class and doesn't define methods
310
+ +load_from_xml+ and +fill_into_xml+ (we could easily define those
311
+ though; we'll talk about that possibility here[aref:attrdefns] and
312
+ here[aref:definingnodes]). The fastest, most ad-hoc way to achieve
313
+ what we want are :marshaller and :unmarshaller keyword arguments, like
314
+ this:
315
+
316
+ :include: time_node_w_marshallers.intout
317
+
318
+ The <tt>:unmarshaller</tt> proc will be called whenever a +Signature+
319
+ instance is being read in from an XML source. The +xml+ argument
320
+ passed to the proc contains (as a REXML::Element instance) the XML
321
+ subtree corresponding to the node's attribute's sub-object currently
322
+ being read. In the case of our +object_node+, the sub-object is just
323
+ the node's attribute (+signed_on+) itself, and the subtree is the one
324
+ rooted at the <signed-on> element (if this were e.g. an +array_node+,
325
+ the <tt>:unmarshaller</tt> proc would be called once for each array
326
+ element, and +xml+ would hold the subtree corresponding to the
327
+ "current" array element). The proc is expected to extract the
328
+ sub-object's data from +xml+ and return the sub-object. So we have to
329
+ read the "year", "month", and "day" elements, construct a +Time+
330
+ instance from them and return that. One could just use the REXML API
331
+ to do that, but I've decided here to use the XPath interpreter that
332
+ comes with xml-mapping (xml/xxpath), and specifically the
333
+ 'xml/xxpath_methods' utility library that adds methods like +first+ to
334
+ REXML::Element. We call +first+ on +xml+ three times, passing XPath
335
+ expressions to extract the "year"/"month"/"day" sub-elements,
336
+ construct the +Time+ instance from that and return it. The XPath
337
+ library is explained in more detail below[aref:xpath].
338
+
339
+ The <tt>:marshaller</tt> proc will be called whenever a +Signature+
340
+ instance is being written into an XML tree. +xml+ is again the XML
341
+ subtree rooted at the <signed-on> element (it will still be empty when
342
+ this proc is called), and +value+ is the current value of the
343
+ sub-object (again, since this is an +object_node+, +value+ is the
344
+ node's attribute, i.e. the +Time+ instance). We have to fill +xml+
345
+ with the data from +value+ here. So we add three elements "year",
346
+ "month" and "day" and set their texts to the corresponding values from
347
+ +value+. The commented-out code shows an alternative implementation of
348
+ the same thing using the XPath interpreter.
349
+
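The included file time_node_w_marshallers.intout is not reproduced at this point of the diff. As a rough sketch of what the two procs described above might look like, the following uses only the REXML API rather than the xml/xxpath helpers. The proc argument order (+xml+, then +value+) and the variable names are assumptions for illustration.

```ruby
require 'rexml/document'

# Unmarshaller sketch: receives the <signed-on> element and returns a Time
# built from its <year>, <month> and <day> children.
unmarshaller = proc do |xml|
  year, month, day = %w[year month day].map { |name| xml.elements[name].text.to_i }
  Time.local(year, month, day)
end

# Marshaller sketch: receives the still-empty <signed-on> element and the
# Time value, and adds one child element per date field.
marshaller = proc do |xml, value|
  xml.add_element("year").text  = value.year.to_s
  xml.add_element("month").text = value.month.to_s
  xml.add_element("day").text   = value.day.to_s
end

# Round trip outside of any mapping class, just to exercise the procs.
source = REXML::Document.new(
  "<signed-on><day>13</day><month>2</month><year>2005</year></signed-on>"
).root
signed_on = unmarshaller.call(source)

target = REXML::Element.new("signed-on")
marshaller.call(target, signed_on)
```

In a mapping class, procs like these would be passed to +object_node+ through the <tt>:unmarshaller</tt> and <tt>:marshaller</tt> keyword arguments named in the text.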
350
+ It should be mentioned again that :marshaller/:unmarshaller procs are
351
+ possible with all single-attribute nodes with sub-objects, i.e. with
352
+ +object_node+, +array_node+, and +hash_node+. So, if you wanted to map
353
+ a whole array of date values, you could use +array_node+ with the same
354
+ :marshaller/:unmarshaller procs as above, for example:
355
+
356
+ array_node :birthdays, "birthdays", "birthday",
357
+ :unmarshaller=> <as above>,
358
+ :marshaller=> <as above>
359
+
360
+ You can see that :marshaller/:unmarshaller procs give you more
361
+ flexibility, but they also impose more work because you essentially
362
+ have to do all the work of marshalling/unmarshalling the sub-objects
363
+ yourself. If you find yourself copying and pasting
364
+ marshaller/unmarshaller procs all over the place, you should instead
365
+ define your own node type or mix the marshalling/unmarshalling
366
+ capabilities into the +Time+ class itself. This is explained
367
+ here[aref:attrdefns] and here[aref:definingnodes], and you'll see that
368
+ it's not really much more work than writing :marshaller and
369
+ :unmarshaller procs (you essentially just move the code from those
370
+ procs into your own node type resp. into the +Time+ class), so you
371
+ should not hesitate to do this.
372
+
373
+ Another thing worth mentioning is that you don't have to specify
374
+ *both* a :marshaller and an :unmarshaller simultaneously. You can as
375
+ well give only one of them, and in addition to that pass a
376
+ <tt>:class</tt> argument or no argument. When you do that, the
377
+ specified marshaller (or unmarshaller) will be used when marshalling
378
+ (resp. unmarshalling) the sub-objects, and the other passed argument
379
+ (<tt>:class</tt> or none) will be employed when unmarshalling
380
+ (resp. marshalling) the sub-objects. So, in effect, you can deactivate
381
+ or "short-cut" some part of the marshalling/unmarshalling
382
+ functionality of a node type while retaining another part.
383
+
384
+
385
+
386
+ === {Attribute Handling Details, Augmenting Existing Classes}[a:attrdefns]
387
+
388
+ I'll shed some more light on how single-attribute nodes add mapped
389
+ attributes to Ruby classes. An attribute declaration like
157
390
 
158
391
  text_node :city, "City"
159
392
 
@@ -188,32 +421,255 @@ declarations that declare XML mappings for the day, month etc. fields:
188
421
  :include: time_augm.intout
189
422
 
190
423
  Here XML mappings are defined for the existing fields +year+, +month+
191
- etc. Xml-apping noticed that the getter methods for those attributes
424
+ etc. Xml-mapping noticed that the getter methods for those attributes
192
425
  existed, so it didn't overwrite them. When calling +save_to_xml+ on a
193
426
  +Time+ object, these methods are called and return the object's values
194
- for those fields, which then get written to the output XML. Of course
195
- you could also derive a new class from a pre-existing one and
196
- implement the XML::Mapping stuff there, or even derive several such
197
- classes in order to define more than one XML mapping for one existing
198
- class.
427
+ for those fields, which then get written to the output XML.
428
+
429
+ So you can convert +Time+ objects into XML trees. What about reading
430
+ them back in from XML? All XML reading operations go through
431
+ <tt><Class>.load_from_xml</tt>. The +load_from_xml+ class method
432
+ inherited from XML::Mapping (see
433
+ XML::Mapping::ClassMethods#load_from_xml) allocates a new instance of
434
+ the class (+Time+), then calls +fill_from_xml+
435
+ (i.e. XML::Mapping#fill_from_xml) on it. +fill_from_xml+ iterates over
436
+ all our nodes in the order of their definition. For each node, its
437
+ data (the <year>, or <month>, or <day> etc. element) is read from the
438
+ XML source and then written to the +Time+ instance via the respective
439
+ setter method (<tt>year=</tt>, <tt>month=</tt>, <tt>day=</tt>
440
+ etc.). These methods didn't exist in +Time+ before (+Time+ objects are
441
+ immutable), so xml-mapping defined its own, default setter methods
442
+ that just set <tt>@year</tt>, <tt>@month</tt> etc. This is of course
443
+ pretty useless because +Time+ objects don't hold their time in these
444
+ variables, so the setter methods don't really change the time of the
445
+ +Time+ object. So we have to redefine +load_from_xml+ for the +Time+
446
+ class:
447
+
448
+ :include: time_augm_loading.intout
449
+
450
+
451
+ == {Other Nodes}[a:onodes]
452
+
453
+ All nodes I've shown so far (node types text_node, numeric_node,
454
+ boolean_node, object_node, array_node, and hash_node) were
455
+ single-attribute nodes: The first parameter to the node factory method
456
+ of such a node is an attribute name, and the attribute of that name is
457
+ the only piece of the state of instances of the node's mapping class
458
+ that gets read/written by the node.
459
+
460
+ === {choice_node}[a:choice_node]
461
+
462
+ There is one node type distributed with xml-mapping that is not a
463
+ single-attribute node: +choice_node+. A +choice_node+ allows you to
464
+ specify a sequence of pairs, each consisting of an XPath expression
465
+ and another node (any node is supported here, including other
466
+ choice_nodes). When reading in an XML source, the choice_node will
467
+ delegate the work to the first node in the sequence whose
468
+ corresponding XPath expression was matched in the XML. When writing an
469
+ object back to XML, the choice_node will delegate the work to the
470
+ first node whose data was "present" in the object (for
471
+ single-attribute nodes, the data is considered "present" if the node's
472
+ attribute is non-nil; for choice_nodes, the data is considered
473
+ "present" if at least one of the node's sub-nodes is "present").
474
+
475
+ As a (somewhat contrived) example, here's a mapping for +Publication+
476
+ objects that have either a single author (contained in an "author" XML
477
+ attribute) or several "contributors" (contained in a sequence of
478
+ "contr" XML elements):
479
+
480
+ :include: publication.intout
481
+
482
+ The symbols :if, :then, and :elsif (but not :else -- see below) in the
483
+ +choice_node+'s node factory method call are ignored; they may be
484
+ sprinkled across the argument list at will (preferably the way shown
485
+ above of course) to increase readability.
486
+
487
+ The rest of the arguments specify the mentioned sequence of XPath
488
+ expressions and corresponding nodes.
489
+
490
+ When reading a +Publication+ object from XML, the XPath expressions
491
+ from the +choice_node+ (<tt>@author</tt> and +contr+) will be matched
492
+ in sequence against the source XML tree until a match is found or the
493
+ end of the argument list is reached. If the end is reached, an
494
+ exception is raised. Otherwise, for the first XPath expression that
495
+ matched, the corresponding node will be invoked (i.e. used to read
496
+ actual data from the XML source into the +Publication+ object). If you
497
+ specify :else, :default, or :otherwise in place of an XPath
498
+ expression, this is treated as an XPath expression that always
499
+ matches. So you can use :else (or :default or :otherwise) for a
500
+ "fallback" node that will be used if none of the other XPath
501
+ expressions matched (an example for this follows).
502
+
503
+ When writing a +Publication+ object back to XML, the first node in the
504
+ sequence whose data is "present" in the source object will be invoked
505
+ to write data from the object into the target XML tree (and the
506
+ corresponding XPath expression will be created in the XML tree if it
507
+ doesn't exist already). If there is no such node in the sequence, an
508
+ exception is raised. As said above, for single-attribute nodes, the
509
+ node's data is considered "present" if the node's attribute is
510
+ non-nil. So, if you write a +Publication+ object to XML, and either
511
+ the +author+ or the +contributors+ attribute of the object is set, it
512
+ will be written; if both attributes are nil, an exception will be
513
+ raised.
514
+
515
+ A frequent use case for choice_nodes will probably be object
516
+ attributes that may be represented in multiple alternative ways in
517
+ XML. As an example, consider "Person" objects where the name of the
518
+ person should be stored alternatively in a sub-element named +name+,
519
+ or an attribute named +name+, or in the text of the +person+ element
520
+ itself. You can achieve this with +choice_node+ like this:
521
+
522
+ :include: person.intout
523
+
524
+ Here all sub-nodes of the choice_nodes are single-attribute nodes
525
+ (text_nodes) with the same attribute (+name+). As you see, when
526
+ writing persons to XML, the name is always stored in a <name>
527
+ sub-element. Of course, this is because that alternative appears first
528
+ in the choice_node.
529
+
530
+
531
+ === {Readers/Writers}[a:readerswriters]
532
+
533
+ Finally, _all_ nodes support keyword arguments :reader and :writer
534
+ which allow you to extend or completely override the reading and/or
535
+ writing functionality of the node with your own code. The :reader as
536
+ well as the :writer argument must be a proc that takes as its
537
+ arguments the Ruby object to be read/written (instance of the mapping
538
+ class the node belongs to) and the XML tree to be written to/read
539
+ from. An optional third argument may be specified -- it will receive a
540
+ proc that wraps the default reader/writer functionality of the
541
+ node.
542
+
543
+ The :reader proc is for reading (from the XML into the object), the
544
+ :writer proc is for writing (from the object into the XML).
545
+
546
+ Here's a (really contrived) example:
547
+
548
+ :include: reader.intout
549
+
550
+ So there's a "Foo" class with a text_node that would by default
551
+ (without the :reader and :writer proc) map the Ruby attribute "name"
552
+ to the XML attribute "name". The :reader proc is invoked when reading
553
+ from XML into a +Foo+ object. The +xml+ argument is the XML tree,
554
+ +obj+ is the object. +default_reader+ is the proc that wraps the
555
+ default reading functionality of the node. We invoke it at the
556
+ beginning. For this text_node, the default reading functionality is to
557
+ take the text of the "name" attribute of +xml+ and put it into the
558
+ +name+ attribute of +obj+. After that, we take the text of the "more"
559
+ attribute of +xml+ and append it to the +name+ attribute of +obj+. So
560
+ the XML tree <tt><foo name="Jim" more="XYZ"/></tt> is converted to a
561
+ +Foo+ object with +name+="JimXYZ".
562
+
563
+ In our :writer proc, we only take +obj+ (the +Foo+ object to be
564
+ written to XML) and +xml+ (the XML tree the stuff is to be written
565
+ to). Analogously to the :reader, we could take a proc that wraps the
566
+ default writing functionality of the node, but we don't do that
567
+ here--we completely override the writing functionality with our own
568
+ code, which just takes the +name+ attribute of the object and writes
569
+ "hi <the name> ho" to a +bar+ XML attribute in the XML tree (stupid
570
+ example, I know).
571
+
572
+ As a special convention, if you specify both a :reader and a :writer
573
+ for a node, and in both cases you do _not_ call the default behaviour,
574
+ then you should use the generic node type +node+, e.g.:
575
+
576
+ class SomeClass
577
+ include XML::Mapping
199
578
 
200
- It should be mentioned that in the +Time+ example above, the setter
201
- methods (<tt>year=</tt>, <tt>month=</tt> etc.) didn't exist in +Time+
202
- (+Time+ objects are immutable), so xml-mapping defined its own setter
203
- methods that just set <tt>@year</tt>, <tt>@month</tt> etc., which is
204
- pretty useless for this case. So you can't really read +Time+ values
205
- back from an XML representation in this example. For that to work,
206
- you'd need functioning <tt>blah=(x)</tt> methods for each +blah+
207
- attribute that you want to define an XML mapping for.
579
+ ....
208
580
 
581
+ node :reader=>proc{|obj,xml| ...},
582
+ :writer=>proc{|obj,xml| ...}
583
+ end
209
584
 
210
- === Defining your own node types
585
+ (since you're completely replacing both the reading and the writing
586
+ functionality, you're effectively replacing all the functionality of
587
+ the node, so it would be pointless and confusing to use one of the
588
+ more "specific" node types)
589
+
590
+ As you see, the purpose of readers and writers is to make it possible
591
+ to augment or override a node's functionality arbitrarily, so there
592
+ shouldn't be anything that's absolutely impossible to achieve with
593
+ xml-mapping. However, if you use readers and writers without invoking
594
+ the default behaviour, you really do everything manually, so you're
595
+ not doing any less work than you would do if you weren't using
596
+ xml-mapping at all. So you'll probably use readers and/or writers for
597
+ those bits of your mapping semantics that can't be achieved with
598
+ xml-mapping's predefined node types (an alternative approach might be
599
+ to override the +post_load+ and/or +post_save+ instance methods on the
600
+ mapping class -- see the reference documentation).
601
+
602
+ Advice similar to that given above for marshallers/unmarshallers
603
+ applies here as well: If you find yourself writing lots of readers and
604
+ writers that only differ in some easily parameterizable aspects, you
605
+ should think about defining your own node types. We talk about that
606
+ below[aref:definingnodes], and it generally just means that you move
607
+ the (sensibly parameterized) code from your readers/writers to your
608
+ node types.
609
+
610
+
611
+ == {Multiple Mappings per Class}[a:mappings]
612
+
613
+ Sometimes you might want to represent the same Ruby object in multiple
614
+ alternative ways in XML. For example, the name of a "Person" object
615
+ could be represented either in a "name" element or a "name" attribute.
616
+
617
+ xml-mapping supports this by allowing you to define multiple disjoint
618
+ "mappings" for a mapping class. A mapping is by convention identified
619
+ with a symbol, e.g. <tt>:my_mapping</tt>, <tt>:other_mapping</tt>
620
+ etc., and each mapping comprises a root element name and a set of node
621
+ definitions. In the body of a mapping class definition, you switch to
622
+ another mapping with <tt>use_mapping :the_mapping</tt>. All following
623
+ node declarations will be added to that mapping *unless* you specify
624
+ the option :mapping=>:another_mapping for a node declaration (all node
625
+ types support that option). The default mapping (the mapping used if
626
+ there was no previous +use_mapping+ in the class body) is named
627
+ <tt>:_default</tt>.
628
+
629
+ All the worker methods like <tt>load_from_xml/file</tt>,
630
+ <tt>save_to_xml/file</tt>, <tt>load_object_from_xml/file</tt> support
631
+ a <tt>:mapping</tt> keyword argument to specify the mapping, which
632
+ again defaults to <tt>:_default</tt>.
633
+
634
+ In the following example, we define two mappings (the default one and
635
+ a mapping named <tt>:other</tt>) for +Person+ objects with a name, an
636
+ age and an address:
637
+
638
+ :include: examples/person_mm.intout
639
+
640
+ In this example, each of the two mappings contains nodes that map the
641
+ same set of Ruby attributes (name, age and address). This is probably
642
+ what you want most of the time (since you're normally defining
643
+ multiple XML mappings for the same Ruby data), but it's not a
644
+ necessity at all. When a mapping class is defined, xml-mapping will
645
+ add all Ruby attributes from all mappings to it.
646
+
647
+ You may have noticed that the <tt>object_node</tt>s in the +Person+
648
+ class apply the mapping they were themselves defined in to their
649
+ sub-ordinated class (+Address+). This is the case for all
650
+ {Single-attribute Nodes with Sub-objects}[aref:subobjnodes]
651
+ (+object_node+, +array_node+ and +hash_node+) unless you explicitly
652
+ specify a different mapping for the sub-object(s) using the option
653
+ :sub_mapping, e.g.
654
+
655
+ object_node :address, "address", :class=>Address, :sub_mapping=>:other
656
+
657
+
658
+
659
+ == {Defining your own Node Types}[a:definingnodes]
211
660
 
212
661
  It's easy to write additional node types and register them with the
213
- xml-mapping library. Let's say we want to extend the +Signature+ class
214
- from the example to include the time at which the signature was
215
- created. We want the new XML representation of such a signature to
216
- look like this:
662
+ xml-mapping library (the following node types come with xml-mapping:
663
+ +node+, +text_node+, +numeric_node+, +boolean_node+, +object_node+,
664
+ +array_node+, +hash_node+, +choice_node+).
665
+
666
+ I'll first show an example, then some more theoretical insight.
667
+
668
+ === Example
669
+
670
+ Let's say we want to extend the +Signature+ class from the example to
671
+ include the time at which the signature was created. We want the new
672
+ XML representation of such a signature to look like this:
217
673
 
218
674
  :include: order_signature_enhanced.xml
219
675
 
@@ -224,9 +680,9 @@ the mapping class declaration to look like this:
224
680
 
225
681
  (i.e. a new "time_node" declaration was added).
226
682
 
227
- We want this +signed_on+ call to define an attribute named +signed_on+
228
- which holds the date value from the XML in an instance of class
229
- +Time+.
683
+ We want this +time_node+ call to define an attribute named +signed_on+
684
+ which holds the date value from the XML document in an instance of
685
+ class +Time+.
230
686
 
231
687
  This node type can be defined with this piece of code:
232
688
 
@@ -237,126 +693,367 @@ library. The name of the node factory method ("time_node") is
237
693
  automatically derived from the class name of the node type
238
694
  ("TimeNode").
239
695
 
240
- There will be one instance of the node type per mapping class (not per
241
- mapping class instance). That instance will be created by the node
242
- factory method (+time_node+); there's no need to instantiate the node
243
- type directly. Whenever an instance of the mapping class needs to be
244
- marshalled/unmarshalled to/from XML, +set_attr_value+
245
- resp. +extract_attr_value+ will be called on the node type instance
246
- ("node" for short). The node factory method places the node into the
247
- mapping class; the @owner attribute of the node is set to reference
248
- the mapping class. The node factory method passes its arguments (in
249
- the example, that would be <tt>:signed_on, "signed-on",
250
- :default_value=>Time.now</tt>) to the node's initializer. TimeNode's
251
- parent class XML::Mapping::SingleAttributeNode already handles the
252
- <tt>:signed_on</tt> and <tt>:default_value=>Time.now</tt> arguments --
253
- <tt>:signed_on</tt> is stored into <tt>@attrname</tt>, and the default
254
- value declarations will be described in a moment. The remaining
255
- argument <tt>"signed-on"</tt> gets passed to our +initialize_impl+
256
- method as parameter _path_. We'll interpret it as an XPath expression
696
+ There will be one instance of the node type +TimeNode+ per +time_node+
697
+ declaration per mapping class (not per mapping class instance). That
698
+ instance (the "node" for short) will be created by the node factory
699
+ method (+time_node+); there's no need to instantiate the node type
700
+ directly. The +time_node+ method places the node into the mapping
701
+ class; the @owner attribute of the node is set to reference the
702
+ mapping class. The node factory method passes the mapping class the
703
+ node appears in (+Signature+), followed by its own arguments, to the
704
+ node's constructor. In the example, the +time_node+ method calls
705
+ <tt>TimeNode.new(Signature, :signed_on, "signed-on",
706
+ :default_value=>Time.now)</tt>. +new+ of course creates the node and
707
+ then delegates the arguments to our initializer +initialize+. We first
708
+ call the superclass's initializer, which strips off from the argument
709
+ list those arguments it handles itself, and returns the remaining
710
+ ones. In this case, the superclass XML::Mapping::SingleAttributeNode
711
+ handles the +Signature+, <tt>:signed_on</tt> and
712
+ <tt>:default_value=>Time.now</tt> arguments -- +Signature+ is stored
713
+ into <tt>@owner</tt>, <tt>:signed_on</tt> is stored into
714
+ <tt>@attrname</tt>, and <tt>{:default_value=>Time.now}</tt> is stored
715
+ into <tt>@options</tt>. The remaining argument list
716
+ <tt>["signed-on"]</tt> is returned; we capture the
717
+ <tt>"signed-on"</tt> string in _path_ (the rest of the argument list
718
+ (an empty array) we capture in _args_ for returning it at the end of
719
+ the initializer. This isn't strictly necessary, it's just a convention
720
+ that a node class initializer should always return those arguments it
721
+ didn't handle itself). We'll interpret _path_ as an XPath expression
257
722
  that locates the time value relative to the parent mapping object's
258
723
  XML tree (in this case, this would be the XML tree rooted at the
259
- +<Signature>+ element, i.e. the tree the +Signature+ instance was read
260
- from). We'll later have to read/store the year, month, and day values
261
- from <tt>path+"/year"</tt>, <tt>path+"/month"</tt>, and
724
+ <tt><Signature></tt> element, i.e. the tree the +Signature+ instance
725
+ was read from). We'll later have to read/store the year, month, and
726
+ day values from <tt>path+"/year"</tt>, <tt>path+"/month"</tt>, and
262
727
  <tt>path+"/day"</tt>, respectively, so we create (and precompile)
263
728
  three corresponding XPath expressions using XML::XXPath.new and store
264
729
  them into member variables of the node. XML::XXPath is an XPath
265
730
  implementation that is bundled with xml-mapping. It is very
266
731
  incomplete, but it supports writing (not just reading) of XML nodes,
267
732
  which is needed to support writing data back to XML. The XML::XXPath
268
- library is explained in more detail below.
733
+ library is explained in more detail below[aref:xpath].
269
734
 
270
735
  The +extract_attr_value+ method is called whenever an instance of the
271
- class the node belongs to (+Signature+ in the example) is being
272
- created from an XML tree. The parameter _xml_ is that tree (again,
273
- this is the tree rooted at the +<Signature>+ element in this
274
- example). The method implementation is expected to extract the
275
- attribute's value from _xml_ and return it, or raise
736
+ mapping class the node belongs to (+Signature+ in the example) is
737
+ being created from an XML tree. The parameter _xml_ is that tree
738
+ (again, this is the tree rooted at the <tt><Signature></tt> element in
739
+ this example). The method implementation is expected to extract the
740
+ single attribute's value from _xml_ and return it, or raise
276
741
  XML::Mapping::SingleAttributeNode::NoAttrValueSet if the attribute was
277
- "unset" in the XML (so the default value should be put in place if it
278
- was defined), or raise any other exception to signal an error and
279
- abort the whole process. In our implementation, we apply the xpath
280
- expressions created at initialization to _xml_
742
+ "unset" in the XML (this exception tells the framework that the
743
+ default value should be put in place if it was defined), or raise any
744
+ other exception to signal an error and abort the whole process. Our
745
+ superclass XML::Mapping::SingleAttributeNode will store the returned
746
+ single attribute's value into the <tt>signed_on</tt> attribute of the
747
+ +Signature+ instance being read in. In our implementation, we apply
748
+ the xpath expressions created during initialization to _xml_
281
749
  (e.g. <tt>@y_path.first(xml)</tt>). An expression
282
750
  _xpath_expr_.first(_xml_) returns (as a REXML element) the first
283
751
  sub-element of _xml_ that matches _xpath_expr_, or raises
284
752
  XML::XXPathError if there was no such element. We apply REXML's _text_
285
753
  method to the returned element to get out the element's text, convert
286
754
  it to integer, and supply it to the constructor of the +Time+ object
287
- to be returned. (as a side note, if an XPath expression matches XML
288
- attributes, XML::XXPath methods like _first_ will return "Attribute"
289
- nodes that behave similarly to REXML::Element nodes, including
290
- messages like _name_ and _text_ (XML::XXPath extends REXML to support
291
- this because REXML's Attribute class is too incompatible), so this
292
- would've worked also if our XPath expressions named XML attributes,
293
- not elements). The +default_when_xpath_err+ thing calls the supplied
294
- block and returns its value, but maps the exception XML::XXPathError to
295
- the mentioned XML::Mapping::SingleAttributeNode::NoAttrValueSet (any
296
- other exceptions fall through unchanged). As said above,
297
- XML::Mapping::NoAttrValueSet is then caught by our superclass
298
- (XML::Mapping::SingleAttributeNode), and the default value is set if
299
- it was provided. So you should just wrap +default_when_xpath_err+
300
- around any applications of XPath expressions whose non-presence in the
301
- XML you want to be considered a non-presence of the attribute you're
755
+ to be returned. As a side note, if an XPath expression matches XML
756
+ attributes, XML::XXPath methods like _first_ will return
757
+ XML::XXPath::Accessors::Attribute nodes that behave similarly to
758
+ REXML::Element nodes, including support for messages like _name_ and
759
+ _text_, so this would've worked also if our XPath expressions had
760
+ referred to XML attributes, not elements. The +default_when_xpath_err+
761
+ thing calls the supplied block and returns its value, but maps the
762
+ exception XML::XXPathError to the mentioned
763
+ XML::Mapping::SingleAttributeNode::NoAttrValueSet (any other
764
+ exceptions fall through unchanged). As said above,
765
+ XML::Mapping::SingleAttributeNode::NoAttrValueSet is caught by the
766
+ framework (more precisely, by our superclass
767
+ XML::Mapping::SingleAttributeNode), and the default value is set if it
768
+ was provided. So you should just wrap +default_when_xpath_err+ around
769
+ any applications of XPath expressions whose non-presence in the XML
770
+ you want to be considered a non-presence of the attribute you're
302
771
  trying to extract. (XML::XXPath is designed to know nothing about
303
772
  XML::Mapping, so it doesn't raise
304
773
  XML::Mapping::SingleAttributeNode::NoAttrValueSet directly)
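
The exception mapping described above can be sketched in plain Ruby. This is only an illustration of the idea, not xml-mapping's actual code: the names +PathLookupError+, +ValueUnset+ and +default_on_lookup_error+ are hypothetical stand-ins for XML::XXPathError, NoAttrValueSet and +default_when_xpath_err+.

```ruby
# Hypothetical stand-ins for the library's exception classes.
class PathLookupError < RuntimeError; end  # plays the role of XML::XXPathError
class ValueUnset < StandardError; end      # plays the role of NoAttrValueSet

# Runs the block and returns its value. A failed path lookup is
# re-raised as "value unset"; every other exception passes through.
def default_on_lookup_error
  yield
rescue PathLookupError => e
  raise ValueUnset, "attribute not present in XML (#{e.message})"
end

# Usage: a caller that knows about defaults only has to rescue ValueUnset.
def read_with_default(default)
  default_on_lookup_error { yield }
rescue ValueUnset
  default
end
```

The point of the wrapper is that the lookup code stays ignorant of the mapping layer, while the mapping layer only has to handle one "unset" signal.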
305
774
 
306
775
  The +set_attr_value+ method is called whenever an instance of the
307
- class the node belongs to (+Signature+ in the example) is being stored
308
- into an XML tree. The _xml_ parameter is the XML tree (a REXML element
309
- node; here this is again the tree rooted at the +<Signature>+
310
- element); _value_ is the current value of the attribute. _xml_ will
311
- most probably be "half-populated" by the time this method is called --
312
- the framework calls the +set_attr_value+ methods of all nodes of a
313
- mapping class in the order of their definition, letting each node fill
314
- its "bit" into _xml_. The method implementation is expected to write
315
- _value_ into (the correct sub-elements of) _xml_, or raise an
316
- exception to signal an error and abort the whole process. No default
317
- value handling is done here; +set_attr_value+ won't be called at all
318
- if the attribute had been set to its default value. In our
319
- implementation we grab the year, month and day values from _value_
320
- (which must be a +Time+), and store it into the sub-elements of _xml_
321
- identified by XPath expressions <tt>@y_path</tt>, <tt>@m_path</tt> and
322
- <tt>@d_path</tt>, respectively. We do this by calling XML::XXPath#first
323
- with an additional parameter <tt>:ensure_created=>true</tt>. An
324
- expression _xpath_expr_.first(_xml_,:ensure_created=>true) works just
325
- like _xpath_expr_.first(_xml_) if _xpath_expr_ was already present in
326
- _xml_. If it was not, it is created (preferable at the end of _xml_'s
327
- list of sub-nodes), and returned. See below for a more detailed
328
- documentation of the XPath interpreter.
776
+ mapping class the node belongs to (+Signature+ in the example) is
777
+ being stored into an XML tree. The _xml_ parameter is the XML tree (a
778
+ REXML element node; here this is again the tree rooted at the
779
+ <tt><Signature></tt> element); _value_ is the current value of the
780
+ single attribute (in this example, the <tt>signed_on</tt> attribute of
781
+ the +Signature+ instance being stored). _xml_ will most probably be
782
+ "half-populated" by the time this method is called -- the framework
783
+ calls the +set_attr_value+ methods of all nodes of a mapping class in
784
+ the order of their definition, letting each node fill its "bit" into
785
+ _xml_. The method implementation is expected to write _value_ into
786
+ (the correct sub-elements of) _xml_, or raise an exception to signal
787
+ an error and abort the whole process. No default value handling is
788
+ done here; +set_attr_value+ won't be called at all if the attribute
789
+ had been set to its default value. In our implementation we grab the
790
+ year, month and day values from _value_ (which must be a +Time+), and
791
+ store them into the sub-elements of _xml_ identified by XPath
792
+ expressions <tt>@y_path</tt>, <tt>@m_path</tt> and <tt>@d_path</tt>,
793
+ respectively. We do this by calling XML::XXPath#first with an
794
+ additional parameter <tt>:ensure_created=>true</tt>. An expression
795
+ _xpath_expr_.first(_xml_,:ensure_created=>true) works just like
796
+ _xpath_expr_.first(_xml_) if _xpath_expr_ was already present in
797
+ _xml_. If it was not, it is created (preferably at the end of _xml_'s
798
+ list of sub-nodes), and returned. See below[aref:xpath] for more
799
+ detailed documentation of the XPath interpreter.
329
800
 
330
801
  === Element order in created XML documents
331
802
 
332
- As just said, XML::XXPath, when used to create new XML nodes, generally
333
- appends those nodes to the end of the list of subnodes of the node the
334
- xpath expression was applied to. All xml-mapping nodes that come with
335
- xml-mapping use XML::XXPath when writing data to XML, and therefore
336
- also append their data to the XML data written by preceding nodes (the
337
- nodes are invoked in the order of their definition). This means that,
338
- generally, your output data will appear in the XML document in the
339
- same order in which the corresponding xml-mapping node definitions
340
- appeared in the mapping class (unless you used XPath expressions like
341
- foo[number] which explicitly dictate a fixed position in the sequence
342
- of XML nodes). For instance, in the example from the beginning of this
343
- document, if we put the <tt>:signatures</tt> node _before_ the
344
- <tt>:items</tt> node, the <tt><Signed-By></tt> element will appear
345
- _before_ the sequence of <tt><Item></tt> elements in the output XML.
803
+ As just said, XML::XXPath, when used to create new XML nodes,
804
+ generally appends those nodes to the end of the list of subnodes of
805
+ the node the xpath expression was applied to. All xml-mapping nodes
806
+ that come with xml-mapping use XML::XXPath when writing data to XML,
807
+ and therefore also append their data to the XML data written by
808
+ preceding nodes (the nodes are invoked in the order of their
809
+ definition). This means that, generally, your output data will appear
810
+ in the XML document in the same order in which the corresponding
811
+ xml-mapping node definitions appeared in the mapping class (unless you
812
+ used XPath expressions like foo[number] which explicitly dictate a
813
+ fixed position in the sequence of XML nodes). For instance, in the
814
+ +Order+ class from the example at the beginning of this document, if
815
+ we put the <tt>:signatures</tt> node _before_ the <tt>:items</tt>
816
+ node, the <tt><Signed-By></tt> element will appear _before_ the
817
+ sequence of <tt><Item></tt> elements in the output XML.
818
+
819
+
820
+ The following is a more systematic overview of the basic node
821
+ types. The description is self-contained, so some information from the
822
+ previous section will be repeated.
823
+
824
+ === Node Types Are Ruby Classes
825
+
826
+ A node type is implemented as a Ruby class derived from
827
+ XML::Mapping::Node or one of its subclasses.
828
+
829
+ The following node types (node classes) come with xml-mapping (they
830
+ all live in the XML::Mapping namespace, which I've left out here for
831
+ brevity):
832
+
833
+ Node
834
+ +-SingleAttributeNode
835
+ | +-SubObjectBaseNode
836
+ | | +-ObjectNode
837
+ | | +-ArrayNode
838
+ | | +-HashNode
839
+ | +-TextNode
840
+ | +-NumericNode
841
+ | +-BooleanNode
842
+ +-ChoiceNode
843
+
844
+ XML::Mapping::Node is the base class for all nodes,
845
+ XML::Mapping::SingleAttributeNode is the base class for
846
+ {single-attribute nodes}[aref:sanodes], and
847
+ XML::Mapping::SubObjectBaseNode is the base class for
848
+ {single-attribute nodes with
849
+ sub-objects}[aref:subobjnodes]. XML::Mapping::TextNode,
850
+ XML::Mapping::ArrayNode etc. are of course the +text_node+,
851
+ +array_node+ etc. we've talked about in this document. When you've
852
+ written a new node class, you register it with xml-mapping by calling
853
+ <tt>XML::Mapping.add_node_class MyNode</tt>. When you do that,
854
+ xml-mapping automatically defines the node factory method for your
855
+ class -- the method's name (e.g. +my_node+) is derived from the node's
856
+ class name (e.g. Foo::Bar::MyNode) by stripping all parent module
857
+ names, and then converting capital letters to lowercase and preceding
858
+ them with an underscore. In fact, this is just how all the predefined
859
+ node types are defined -- those node types are not "special"; they're
860
+ defined in the source file +xml/mapping/standard_nodes.rb+ and then
861
+ registered normally in +xml/mapping.rb+. The source code of the
862
+ built-in nodes is not very long or complicated; you may consider
863
+ reading it in addition to this text to gain a better understanding.
864
+
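
The name derivation described above can be sketched as follows. This is a minimal illustration, not the library's implementation: +factory_method_name+ is a hypothetical helper, and dropping the leading underscore is an assumption inferred from the +my_node+ example.

```ruby
# Hypothetical helper; not part of xml-mapping.
def factory_method_name(class_name)
  # Strip all parent module names: "Foo::Bar::MyNode" -> "MyNode"
  base = class_name.split("::").last
  # Lowercase each capital letter and precede it with an underscore.
  snake = base.gsub(/[A-Z]/) { |c| "_" + c.downcase }
  # Assumption: the underscore before the first letter is dropped.
  snake.sub(/\A_/, "")
end
```

Under these assumptions, <tt>factory_method_name("Foo::Bar::MyNode")</tt> yields <tt>"my_node"</tt>, and a class named +TimeNode+ yields <tt>"time_node"</tt>.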
865
+
866
+ === How Node Types Work
867
+
868
+ The xml-mapping core "operates" node types as follows:
869
+
870
+
871
+ ==== Node Initialization
872
+
873
+ As said above, when a node class is registered with xml-mapping by
874
+ calling <tt>XML::Mapping.add_node_class TheNodeClass</tt>, xml-mapping
875
+ automatically generates the node factory method for that type. The
876
+ node factory method will effectively be defined as a class method of
877
+ the XML::Mapping module, which is why one can call it from the body of
878
+ a mapping class definition. The generated method will create a new
879
+ instance of the node class (a *node* for short) by calling _new_ on
880
+ the node class. The list of parameters to _new_ will consist of <i>the
881
+ mapping class, followed by all arguments that were passed to the node
882
+ factory method</i>. For example, when you have this node declaration:
883
+
884
+ class MyMappingClass
885
+ include XML::Mapping
886
+
887
+ my_node :foo, "bar", 42, :hi=>"ho"
888
+ end
889
+
890
+ , then the node factory method (+my_node+) calls
891
+ <tt>MyNode.new(MyMappingClass, :foo, "bar", 42, :hi=>"ho")</tt>.
346
892
 
893
+ _new_ of course creates the instance and calls _initialize_ on it. The
894
+ _initialize_ implementation will generally store the parameters into
895
+ some instance variables for later usage. As a convention, _initialize_
896
+ should always extract from the parameter list those parameters it
897
+ processes itself, process them, and return an array containing the
898
+ remaining (still unprocessed) parameters. Thus, an implementation of
899
+ _initialize_ follows this pattern:
900
+
901
+ def initialize(*args)
902
+ myparam1,myparam2,...,myparamx,*args = super(*args)
903
+
904
+ .... process the myparam1,myparam2,...,myparamx ....
905
+
906
+ # return still unprocessed args
907
+ args
908
+ end
909
+
910
+ (since the called superclass initializer is written the same way, the
911
+ parameter array returned by it will already be stripped of all
912
+ parameters that the superclass initializer (or any of its
913
+ superclasses' initializers) processed)
914
+
915
+ This technique is a simple way to "chain" the initializers of all
916
+ superclasses of a node class, starting with the topmost one (Node), so
917
+ that each initializer can easily find out and process the parameters
918
+ it is responsible for.
919
+
920
+ The base node class XML::Mapping::Node provides an _initialize_
921
+ implementation that, among other things (described below), adds _self_
922
+ (i.e. the created node) to the internal list of nodes held by the
923
+ mapping class, and sets the @owner attribute of _self_ to reference
924
+ the mapping class.
925
+
926
+ So, effectively there will be one instance of a node class (a node)
927
+ per node definition, and that instance lives in the mapping class the
928
+ node was defined in.
929
+
930
+
931
+ ==== Node Operation during Marshalling and Unmarshalling
932
+
933
+ When an instance of a mapping class is created or filled from an XML
934
+ tree, xml-mapping will call +xml_to_obj+ on all nodes defined in that
935
+ mapping class in the {mapping}[aref:mappings] the node is defined in,
936
+ in the order of their definition. Two parameters will be passed: the
937
+ mapping class instance being created/filled, and the XML tree the
938
+ instance is being created/filled from. The implementation of
939
+ +xml_to_obj+ is expected to read whatever pieces of data it is
940
+ responsible for from the XML tree and put it into the appropriate
941
+ variables/attributes etc. of the instance.
942
+
943
+ When an instance of a mapping class is stored or filled into an XML
944
+ tree, xml-mapping will call +obj_to_xml+ on all nodes defined in that
945
+ mapping class in the {mapping}[aref:mappings] the node is defined in,
946
+ in the order of their definition, again passing as parameters the
947
+ mapping class instance being stored, and the XML tree the instance is
948
+ being stored/filled into. The implementation of +obj_to_xml+ is
949
+ expected to read whatever pieces of data it is responsible for from
950
+ the instance and put it into the appropriate XML elements/attributes
951
+ etc. of the XML tree.
952
+
953
+
954
+ === Basic Node Types Overview
955
+
956
+ The following is an overview of how initialization and
957
+ marshalling/unmarshalling is implemented in the node base classes
958
+ (Node, SingleAttributeNode, and SubObjectBaseNode).
959
+
960
+ TODO: summary table: member var name; introduced in class; meaning
961
+
962
+ ==== Node
963
+
964
+ In _initialize_, the mapping class and the option arguments are
+ stripped from the argument list. The mapping class is stored in
+ @owner, the option arguments are stored (as a hash) in @options (the
+ hash will be empty if no options were given). The
+ {mapping}[aref:mappings] the node is defined in is determined
+ (:mapping option, last <tt>use_mapping</tt> or <tt>:_default</tt>) and
+ stored in @mapping. The node then stores itself in the list of nodes
+ of the mapping class belonging to the mapping
+ (<tt>@owner.xml_mapping_nodes(:mapping=>@mapping)</tt>; see
+ XML::Mapping::ClassMethods#xml_mapping_nodes). This list is the list
+ of nodes later used when marshalling/unmarshalling an instance of the
+ mapping class with respect to a given mapping. This means that node
+ implementors will not normally "see" anything of the mapping (they
+ don't need to access the @mapping variable) because the
+ marshalling/unmarshalling methods
+ (<tt>obj_to_xml</tt>/<tt>xml_to_obj</tt>) simply won't be called if
+ the node's mapping is not the same as the mapping the
+ marshalling/unmarshalling is happening with.
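
The bookkeeping done in the initializer might look roughly like the toy model below. It is a simplified sketch, not the library's implementation: MappingClassSketch and NodeSketch are hypothetical classes, and the keyword-argument form of xml_mapping_nodes is an assumption made to keep the example short.

```ruby
# Toy model only: names mirror the README, behaviour is simplified.
class MappingClassSketch
  @default_mapping = :_default

  class << self
    attr_reader :default_mapping

    # Hypothetical analogue of use_mapping: later nodes default to +name+.
    def use_mapping(name)
      @default_mapping = name
    end

    # One node list per mapping.
    def xml_mapping_nodes(mapping: :_default)
      (@nodes ||= Hash.new { |hash, key| hash[key] = [] })[mapping]
    end
  end
end

class NodeSketch
  attr_reader :owner, :options, :mapping, :rest

  def initialize(owner, *args)
    @owner   = owner                                   # the mapping class
    @options = args.last.is_a?(Hash) ? args.pop : {}   # empty if none given
    @mapping = @options[:mapping] || owner.default_mapping
    @rest    = args                                    # left for subclasses
    owner.xml_mapping_nodes(mapping: @mapping) << self # register the node
  end
end

a = NodeSketch.new(MappingClassSketch, 'name')
b = NodeSketch.new(MappingClassSketch, 'name', { mapping: :other })
MappingClassSketch.use_mapping(:third)
c = NodeSketch.new(MappingClassSketch, 'id')
```

Marshalling with respect to one mapping then only has to iterate over that mapping's list, which is why a node never needs to check @mapping itself.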
+
+ Furthermore, if :reader and/or :writer options were given,
+ <tt>xml_to_obj</tt> and <tt>obj_to_xml</tt>, respectively, are transparently
+ overwritten on the node to delegate to the supplied :reader/:writer
+ procs.
+
+ The marshalling/unmarshalling methods
+ (<tt>obj_to_xml</tt>/<tt>xml_to_obj</tt>) are not implemented in
+ +Node+ (they just raise an exception).
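
One way to picture the :reader/:writer delegation is to define singleton methods on the node that forward to the procs. This is a hedged sketch only: DelegatingNodeSketch is a hypothetical class, plain hashes stand in for both the object and the XML tree, and the two-argument proc signature is an assumption for this example.

```ruby
class DelegatingNodeSketch
  def initialize(options = {})
    if (reader = options[:reader])
      # Overwrite xml_to_obj on this one node so it delegates to the proc.
      define_singleton_method(:xml_to_obj) { |obj, xml| reader.call(obj, xml) }
    end
    if (writer = options[:writer])
      define_singleton_method(:obj_to_xml) { |obj, xml| writer.call(obj, xml) }
    end
  end

  # The base class implements neither direction; it just raises.
  def xml_to_obj(_obj, _xml)
    raise NotImplementedError, 'xml_to_obj not implemented'
  end

  def obj_to_xml(_obj, _xml)
    raise NotImplementedError, 'obj_to_xml not implemented'
  end
end

upcasing_reader = ->(obj, xml) { obj[:name] = xml[:name].upcase }
node = DelegatingNodeSketch.new({ reader: upcasing_reader })

obj = {}
node.xml_to_obj(obj, { name: 'ada' }) # handled by the :reader proc
```

Because only a :reader was supplied here, calling obj_to_xml on this node would still raise.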
+
+
+ ==== SingleAttributeNode
+
+ In _initialize_, the attribute name is stripped from the argument list
+ and stored in @attrname, and an attribute of that name is added to the
+ mapping class the node belongs to.
+
+ During marshalling/unmarshalling of an object to/from XML,
+ single-attribute nodes only read/write a single piece of the object's
+ state: the single attribute (@attrname) the node handles. Because of
+ this, the <tt>obj_to_xml</tt>/<tt>xml_to_obj</tt> implementations in
+ SingleAttributeNode call two new methods introduced by
+ SingleAttributeNode, which must be overridden by subclasses:
+
+   extract_attr_value(xml)
+
+   set_attr_value(xml, value)
+
+ <tt>extract_attr_value(xml)</tt> is called by <tt>xml_to_obj</tt>
+ during unmarshalling. _xml_ is the XML tree being read. The method
+ must read the attribute's value from _xml_ and return
+ it. <tt>xml_to_obj</tt> will set the attribute to that value.
+
+ <tt>set_attr_value(xml, value)</tt> is called by <tt>obj_to_xml</tt>
+ during marshalling. _xml_ is the XML tree being written, _value_ is
+ the current value of the attribute. The method must write _value_ into
+ (the correct sub-elements/attributes of) _xml_.
+
+ SingleAttributeNode also handles the default value, if it was
+ specified (via the :default_value option): When writing data to XML,
+ <tt>set_attr_value(xml, value)</tt> won't be called if the attribute
+ was set to the default value. When reading data from XML, the
+ <tt>extract_attr_value(xml)</tt> implementation must raise a special
+ exception, XML::Mapping::SingleAttributeNode::NoAttrValueSet, if it
+ wants to indicate that the data was not present in the
+ XML. SingleAttributeNode will catch this exception and put the default
+ value, if it was defined, into the attribute.
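
The division of labour between the base class and its subclasses, including the default-value handling, could be sketched like this. The classes below are hypothetical re-creations on top of REXML, not the real XML::Mapping classes; in particular, re-raising when no default was defined is an assumption of this sketch.

```ruby
require 'rexml/document'

class SingleAttributeNodeSketch
  class NoAttrValueSet < StandardError; end

  def initialize(attrname, options = {})
    @attrname = attrname
    @options  = options
  end

  # Unmarshalling: ask the subclass for the value, fall back to the default.
  def xml_to_obj(obj, xml)
    value = begin
      extract_attr_value(xml)
    rescue NoAttrValueSet
      raise unless @options.key?(:default_value) # assumption: no default => error
      @options[:default_value]
    end
    obj.public_send("#{@attrname}=", value)
  end

  # Marshalling: skip the write if the attribute holds the default value.
  def obj_to_xml(obj, xml)
    value = obj.public_send(@attrname)
    return if @options.key?(:default_value) && value == @options[:default_value]
    set_attr_value(xml, value)
  end
end

# A subclass only supplies the two hook methods.
class TextNodeSketch < SingleAttributeNodeSketch
  def initialize(attrname, path, options = {})
    super(attrname, options)
    @path = path
  end

  def extract_attr_value(xml)
    elt = xml.elements[@path] or raise NoAttrValueSet
    elt.text
  end

  def set_attr_value(xml, value)
    xml.add_element(@path).text = value
  end
end

Item = Struct.new(:color)
node = TextNodeSketch.new(:color, 'color', { default_value: 'black' })

item = Item.new
node.xml_to_obj(item, REXML::Document.new('<item/>').root)
loaded_default = item.color              # element missing, default used

out = REXML::Element.new('item')
node.obj_to_xml(item, out)
skipped = out.elements['color'].nil?     # default value, nothing written

item.color = 'red'
node.obj_to_xml(item, out)               # non-default value is written
```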
+
+
+ ==== SubObjectBaseNode
+
+ The initializer will set up additional member variables @sub_mapping,
+ @marshaller, and @unmarshaller.
+
+ @sub_mapping contains the mapping to be used when reading/writing the
+ sub-objects (either specified with :sub_mapping, or, by default, the
+ mapping the node itself was defined in).
+
+ @marshaller and @unmarshaller contain procs that encapsulate
+ writing/reading of sub-objects to/from XML, as specified by the user
+ with :class/:marshaller/:unmarshaller etc. options (the meaning of
+ those different options was described {above}[aref:subobjnodes]). The
+ procs are there to be called from <tt>extract_attr_value</tt> or
+ <tt>set_attr_value</tt> whenever the need arises.
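
As a rough illustration of how such procs can be used from the two hook methods, consider the sketch below. SubObjectNodeSketch and Point are hypothetical, the proc signatures (xml, value) and (xml) are assumptions of this example, and the :class option is left out entirely.

```ruby
require 'rexml/document'

class SubObjectNodeSketch
  def initialize(path, options = {})
    @path = path
    # Default to the mapping the node itself was defined in.
    @sub_mapping  = options[:sub_mapping] || options[:mapping] || :_default
    @marshaller   = options.fetch(:marshaller)
    @unmarshaller = options.fetch(:unmarshaller)
  end

  # Reading: hand the sub-element to the unmarshaller proc.
  def extract_attr_value(xml)
    @unmarshaller.call(xml.elements[@path])
  end

  # Writing: create the sub-element and let the marshaller proc fill it.
  def set_attr_value(xml, value)
    @marshaller.call(xml.add_element(@path), value)
  end
end

Point = Struct.new(:x, :y)

node = SubObjectNodeSketch.new('pos', {
  marshaller: lambda { |xml, value|
    xml.add_attribute('x', value.x.to_s)
    xml.add_attribute('y', value.y.to_s)
  },
  unmarshaller: lambda { |xml|
    Point.new(xml.attributes['x'].to_i, xml.attributes['y'].to_i)
  }
})

root = REXML::Element.new('shape')
node.set_attr_value(root, Point.new(3, 4))
read_back = node.extract_attr_value(root)
```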
 
 
- == XPath interpreter
+ == {XPath Interpreter}[a:xpath]
 
  XML::XXPath is an XPath parser. It is used in xml-mapping node type
- definitions, but can just as well be utilized stand-alone (it does
- not depend on xml-mapping). XML::XXPath is very incomplete and probably
- will always be (it only supports path elements of types _elt_name_,
- @_attr_name_, _elt_name_[@_attr_name_=_attr_value_],
- _elt_name_[_index_], and *), but it should be reasonably efficient
- (XPath expressions are precompiled), and, most importantly, it
- supports write access. For example, if you create the path
- "/foo/bar[3]/baz[@key='hiho']" in the XML document
+ definitions, but can just as well be utilized stand-alone (it does not
+ depend on xml-mapping). XML::XXPath is very incomplete and probably
+ will always be, but it should be reasonably efficient (XPath
+ expressions are precompiled), and, most importantly, it supports write
+ access, which is needed for writing objects to XML. For example, if
+ you create the path "/foo/bar[3]/baz[@key='hiho']" in the XML document
 
  <foo>
  <bar>
@@ -378,7 +1075,8 @@ supports write access. For example, if you create the path
  </bar>
  </foo>
 
- XML::XXPath is explained in more detail in the reference documentation.
+ XML::XXPath is explained in more detail in the reference documentation
+ and the README_XPATH file.
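
To give a feel for what "write access" means, here is a deliberately tiny sketch that uses plain REXML instead of XML::XXPath. The helper ensure_path is hypothetical and only understands plain element names (no indices or attribute predicates): it walks a path and creates whatever elements are missing along the way.

```ruby
require 'rexml/document'

# Hypothetical helper, not part of XML::XXPath: follow a list of plain
# element names below +root+, creating each element that does not exist.
def ensure_path(root, names)
  names.reduce(root) do |elt, name|
    elt.elements[name] || elt.add_element(name)
  end
end

doc = REXML::Document.new('<foo><bar/></foo>')
baz = ensure_path(doc.root, %w[bar baz]) # reuses <bar>, creates <baz>
baz.text = 'hiho'
```

XML::XXPath applies the same idea to the richer path expressions it supports, as documented in the reference documentation.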
 
 
  == License