xml-mapping 0.8
- data/LICENSE +56 -0
- data/README +386 -0
- data/README_XPATH +175 -0
- data/Rakefile +214 -0
- data/TODO.txt +32 -0
- data/doc/xpath_impl_notes.txt +119 -0
- data/examples/company.rb +34 -0
- data/examples/company.xml +26 -0
- data/examples/company_usage.intin.rb +19 -0
- data/examples/company_usage.intout +39 -0
- data/examples/order.rb +61 -0
- data/examples/order.xml +54 -0
- data/examples/order_signature_enhanced.rb +7 -0
- data/examples/order_signature_enhanced.xml +9 -0
- data/examples/order_signature_enhanced_usage.intin.rb +12 -0
- data/examples/order_signature_enhanced_usage.intout +16 -0
- data/examples/order_usage.intin.rb +73 -0
- data/examples/order_usage.intout +147 -0
- data/examples/time_augm.intin.rb +19 -0
- data/examples/time_augm.intout +23 -0
- data/examples/time_node.rb +27 -0
- data/examples/xpath_create_new.intin.rb +85 -0
- data/examples/xpath_create_new.intout +181 -0
- data/examples/xpath_docvsroot.intin.rb +30 -0
- data/examples/xpath_docvsroot.intout +34 -0
- data/examples/xpath_ensure_created.intin.rb +62 -0
- data/examples/xpath_ensure_created.intout +114 -0
- data/examples/xpath_pathological.intin.rb +42 -0
- data/examples/xpath_pathological.intout +56 -0
- data/examples/xpath_usage.intin.rb +51 -0
- data/examples/xpath_usage.intout +57 -0
- data/install.rb +40 -0
- data/lib/xml/mapping.rb +14 -0
- data/lib/xml/mapping/base.rb +563 -0
- data/lib/xml/mapping/standard_nodes.rb +343 -0
- data/lib/xml/mapping/version.rb +8 -0
- data/lib/xml/xxpath.rb +354 -0
- data/test/all_tests.rb +6 -0
- data/test/company.rb +54 -0
- data/test/documents_folders.rb +33 -0
- data/test/fixtures/bookmarks1.xml +24 -0
- data/test/fixtures/company1.xml +85 -0
- data/test/fixtures/documents_folders.xml +71 -0
- data/test/fixtures/documents_folders2.xml +30 -0
- data/test/multiple_mappings.rb +80 -0
- data/test/tests_init.rb +2 -0
- data/test/xml_mapping_adv_test.rb +84 -0
- data/test/xml_mapping_test.rb +182 -0
- data/test/xpath_test.rb +273 -0
- metadata +96 -0
data/LICENSE
ADDED
@@ -0,0 +1,56 @@
|
|
1
|
+
Xml-mapping is copyrighted free software by Olaf Klischat
|
2
|
+
<klischat@cs.tu-berlin.de>. You can redistribute it and/or modify it
|
3
|
+
under either the terms of the GPL, or the conditions below:
|
4
|
+
|
5
|
+
1. You may make and give away verbatim copies of the source form of the
|
6
|
+
software without restriction, provided that you duplicate all of the
|
7
|
+
original copyright notices and associated disclaimers.
|
8
|
+
|
9
|
+
2. You may modify your copy of the software in any way, provided that
|
10
|
+
you do at least ONE of the following:
|
11
|
+
|
12
|
+
a) place your modifications in the Public Domain or otherwise
|
13
|
+
make them Freely Available, such as by posting said
|
14
|
+
modifications to Usenet or an equivalent medium, or by allowing
|
15
|
+
the author to include your modifications in the software.
|
16
|
+
|
17
|
+
b) use the modified software only within your corporation or
|
18
|
+
organization.
|
19
|
+
|
20
|
+
c) give non-standard binaries non-standard names, with
|
21
|
+
instructions on where to get the original software distribution.
|
22
|
+
|
23
|
+
d) make other distribution arrangements with the author.
|
24
|
+
|
25
|
+
3. You may distribute the software in object code or binary form,
|
26
|
+
provided that you do at least ONE of the following:
|
27
|
+
|
28
|
+
a) distribute the binaries and library files of the software,
|
29
|
+
together with instructions (in the manual page or equivalent)
|
30
|
+
on where to get the original distribution.
|
31
|
+
|
32
|
+
b) accompany the distribution with the machine-readable source of
|
33
|
+
the software.
|
34
|
+
|
35
|
+
c) give non-standard binaries non-standard names, with
|
36
|
+
instructions on where to get the original software distribution.
|
37
|
+
|
38
|
+
d) make other distribution arrangements with the author.
|
39
|
+
|
40
|
+
4. You may modify and include the part of the software into any other
|
41
|
+
software (possibly commercial). But some files in the distribution
|
42
|
+
are not written by the author, so that they are not under these terms.
|
43
|
+
|
44
|
+
For the list of those files and their copying conditions, see the
|
45
|
+
file LEGAL.
|
46
|
+
|
47
|
+
5. The scripts and library files supplied as input to or produced as
|
48
|
+
output from the software do not automatically fall under the
|
49
|
+
copyright of the software, but belong to whomever generated them,
|
50
|
+
and may be sold commercially, and may be aggregated with this
|
51
|
+
software.
|
52
|
+
|
53
|
+
6. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
|
54
|
+
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
|
55
|
+
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
56
|
+
PURPOSE.
|
data/README
ADDED
@@ -0,0 +1,386 @@
|
|
1
|
+
= XML-MAPPING: XML-to-object (and back) mapper for Ruby, including XPath interpreter
|
2
|
+
|
3
|
+
Xml-mapping is an easy to use, extensible library that allows you to
|
4
|
+
semi-automatically map Ruby objects to XML trees and vice versa. It is
|
5
|
+
easy to use and has a modular design that allows for easy extension of
|
6
|
+
its functionality.
|
7
|
+
|
8
|
+
== Download
|
9
|
+
|
10
|
+
For downloading the latest version, CVS repository access etc. go to:
|
11
|
+
|
12
|
+
http://rubyforge.org/projects/xml-mapping/
|
13
|
+
|
14
|
+
== Example
|
15
|
+
|
16
|
+
(example document stolen + extended from
|
17
|
+
http://www.castor.org/xml-mapping.html)
|
18
|
+
|
19
|
+
=== Input document:
|
20
|
+
|
21
|
+
:include: order.xml
|
22
|
+
|
23
|
+
=== Mapping class declaration:
|
24
|
+
|
25
|
+
:include: order.rb
|
26
|
+
|
27
|
+
=== Usage:
|
28
|
+
|
29
|
+
:include: order_usage.intout
|
30
|
+
|
31
|
+
|
32
|
+
== Description
|
33
|
+
|
34
|
+
As shown in the example, you have to include XML::Mapping into a class
|
35
|
+
to turn it into a "mapping class". There are no other restrictions
|
36
|
+
imposed on mapping classes; you can add attributes and methods to
|
37
|
+
them, include additional modules in them, derive them from other
|
38
|
+
classes, derive other classes from them, etc.
|
39
|
+
|
40
|
+
An instance of a mapping class can be created from/converted into an
|
41
|
+
XML node by means of methods like XML::Mapping.load_from_xml,
|
42
|
+
XML::Mapping#save_to_xml, XML::Mapping.load_from_file,
|
43
|
+
XML::Mapping#save_to_file. Special class methods like "text_node",
|
44
|
+
"array_node" etc., called "node factory methods", may be called from
|
45
|
+
the body of the class definition to define instance attributes that
|
46
|
+
are automatically and bidirectionally mapped to subtrees of the XML
|
47
|
+
element an instance of the class is mapped to. For example, in the
|
48
|
+
definition
|
49
|
+
|
50
|
+
class Address
|
51
|
+
include XML::Mapping
|
52
|
+
|
53
|
+
text_node :city, "City"
|
54
|
+
text_node :state, "State"
|
55
|
+
numeric_node :zip, "ZIP"
|
56
|
+
text_node :street, "Street"
|
57
|
+
end
|
58
|
+
|
59
|
+
the first call to #text_node creates an attribute named "city" which
|
60
|
+
is mapped to the text of the XML child element defined by the XPath
|
61
|
+
expression "City" (xml-mapping includes an XPath interpreter that can
|
62
|
+
also be used separately; see below). When you create an instance of
|
63
|
+
+Address+ from an XML element (using Address.load_from_file(file_name)
|
64
|
+
or Address.load_from_xml(rexml_element)), that instance's "city"
|
65
|
+
attribute will be set to the text of the XML element's "City" child
|
66
|
+
element. When you convert an instance of Address into an XML element,
|
67
|
+
a sub-element "City" is added and its text is set to the current value
|
68
|
+
of the +city+ attribute. The other node types (numeric_node,
|
69
|
+
array_node etc.) work analogously. The node types +object_node+,
|
70
|
+
+array_node+, and +hash_node+ recursively map sub-trees to instances
|
71
|
+
of mapping classes (as opposed to simple types like String
|
72
|
+
etc.). For example, with the line
|
73
|
+
|
74
|
+
array_node :signatures, "Signed-By", "Signature", :class=>Signature, :default_value=>[]
|
75
|
+
|
76
|
+
, an attribute named "signatures" is added to the surrounding class
|
77
|
+
(here: Order); the attribute will be an array whose elements
|
78
|
+
correspond to the XML elements yielded by the XPath
|
79
|
+
"Signed-By/Signature". Each element will be of class +Signature+ (each
|
80
|
+
array element is created from the corresponding XML element by just
|
81
|
+
calling <tt>Signature.load_from_xml(the_xml_element)</tt>). The reason
|
82
|
+
why the path "Signed-By/Signature" is provided in two arguments
|
83
|
+
instead of just one combined one becomes apparent when marshalling the
|
84
|
+
array (along with the surrounding object) back into a sequence of XML
|
85
|
+
elements. When that happens, "Signed-By" names the common base element
|
86
|
+
for all those elements, and "Signature" is the path that will be
|
87
|
+
duplicated for each element. The input document in the example above
|
88
|
+
shows how this ends up looking.
|
89
|
+
|
90
|
+
Hash nodes work similarly, but they define hash-valued attributes
|
91
|
+
instead of array-valued ones.
|
92
|
+
|
93
|
+
Refer to the reference documentation for details about the node types
|
94
|
+
that are included in the xml-mapping library.
|
95
|
+
|
96
|
+
|
97
|
+
=== Default values
|
98
|
+
|
99
|
+
For each node you may define a _default value_ which will be set if
|
100
|
+
there was no value defined for the attribute in the XML source.
|
101
|
+
|
102
|
+
From the example:
|
103
|
+
|
104
|
+
class Signature
|
105
|
+
include XML::Mapping
|
106
|
+
|
107
|
+
text_node :position, "Position", :default_value=>"Some Employee"
|
108
|
+
end
|
109
|
+
|
110
|
+
The semantics of default values are as follows:
|
111
|
+
|
112
|
+
- when creating a new instance from scratch:
|
113
|
+
|
114
|
+
- attributes with default values are set to their default values
|
115
|
+
|
116
|
+
- attributes without default values are left unset
|
117
|
+
|
118
|
+
(when defining your own initializer, you'll have to call the
|
119
|
+
inherited _initialize_ method in order to get this behaviour)
|
120
|
+
|
121
|
+
- when loading:
|
122
|
+
|
123
|
+
- attributes without default values that are not represented in the
|
124
|
+
XML raise an error
|
125
|
+
|
126
|
+
- attributes with default values that are not represented in the XML
|
127
|
+
are set to their default values
|
128
|
+
|
129
|
+
- all other attributes are set to their respective values as present
|
130
|
+
in the XML
|
131
|
+
|
132
|
+
|
133
|
+
- when saving:
|
134
|
+
|
135
|
+
- unset attributes without default values raise an error
|
136
|
+
|
137
|
+
- attributes with default values that are set to their default
|
138
|
+
values are not saved
|
139
|
+
|
140
|
+
- all other attributes are saved
|
141
|
+
|
142
|
+
|
143
|
+
This implies that:
|
144
|
+
|
145
|
+
- attributes that are set to their respective default values are not
|
146
|
+
represented in the XML
|
147
|
+
|
148
|
+
- attributes without default values must be set explicitly before
|
149
|
+
saving
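
The rules above can be modelled in a few lines of plain Ruby. This is only a sketch of the semantics as described here, not xml-mapping's implementation: a Hash stands in for the XML element, and the names NODES, load_attrs and save_attrs are made up for this illustration.

```ruby
# Sketch of the default-value rules; a Hash stands in for the XML element.
# NODES, load_attrs and save_attrs are hypothetical names, not xml-mapping API.
NODES = {
  name:     {},                                  # no default value
  position: { default_value: "Some Employee" }   # has a default value
}

# Loading: a missing value falls back to the default, or raises if there is none.
def load_attrs(xml)
  NODES.each_with_object({}) do |(attr, opts), obj|
    if xml.key?(attr)
      obj[attr] = xml[attr]
    elsif opts.key?(:default_value)
      obj[attr] = opts[:default_value]
    else
      raise ArgumentError, "no value for #{attr} in XML"
    end
  end
end

# Saving: unset attributes without a default raise an error;
# values equal to their default are left out of the output.
def save_attrs(obj)
  NODES.each_with_object({}) do |(attr, opts), xml|
    value = obj[attr]
    if value.nil?
      raise ArgumentError, "#{attr} is unset" unless opts.key?(:default_value)
    elsif !(opts.key?(:default_value) && value == opts[:default_value])
      xml[attr] = value
    end
  end
end

loaded = load_attrs({ name: "John Doe" })  # position falls back to its default
saved  = save_attrs(loaded)                # position is omitted again
```

Loading and then saving round-trips to the same compact form, which is why attributes that still hold their default value never show up in the written XML.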
|
150
|
+
|
151
|
+
|
152
|
+
|
153
|
+
=== Attribute handling details, augmenting existing classes
|
154
|
+
|
155
|
+
I'll shed some more light on how xml-mapping adds mapped attributes to
|
156
|
+
Ruby classes. An attribute declaration like
|
157
|
+
|
158
|
+
text_node :city, "City"
|
159
|
+
|
160
|
+
maps some portion of the XML tree (here: the "City" sub-element) to an
|
161
|
+
attribute (here: "city") of the class whose body the declaration
|
162
|
+
appears in. When writing (marshalling) instances of the surrounding
|
163
|
+
class into an XML document, xml-mapping will read the attribute value
|
164
|
+
from the instance using the function named +city+; when reading
|
165
|
+
(unmarshalling) an instance from an XML document, xml-mapping will use
|
166
|
+
the one-parameter function <tt>city=</tt> to set the attribute in the
|
167
|
+
instance to the value read from the XML document.
|
168
|
+
|
169
|
+
If these functions don't exist at the time the node declaration is
|
170
|
+
executed, xml-mapping adds default implementations that simply
|
171
|
+
read/write the attribute value to instance variables that have the
|
172
|
+
same name as the attribute. For example, the +city+ attribute
|
173
|
+
declaration in the +Address+ class in the example added functions
|
174
|
+
+city+ and <tt>city=</tt> that read/write from/to the instance
|
175
|
+
variable <tt>@city</tt>.
|
176
|
+
|
177
|
+
If, however, these functions already exist prior to defining the
|
178
|
+
attributes, xml-mapping will leave them untouched, so your precious
|
179
|
+
self-written accessor methods that do whatever complicated internal
|
180
|
+
processing of the data won't be overwritten.
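
The "only add accessors that are missing" behaviour can be sketched in plain Ruby as follows. The helper define_mapped_attr and the module MappedAttrs are hypothetical names used only for this illustration; the library's own code may well be organized differently.

```ruby
# Sketch: add default accessors only where the class doesn't define them yet.
# MappedAttrs and define_mapped_attr are hypothetical, not part of xml-mapping.
module MappedAttrs
  def define_mapped_attr(name)
    attr_reader name unless method_defined?(name)
    attr_writer name unless method_defined?("#{name}=")
  end
end

class Address
  extend MappedAttrs

  # Hand-written getter that must survive the declaration below.
  def city
    (@city || "").upcase
  end

  define_mapped_attr :city    # keeps #city, adds #city=
  define_mapped_attr :street  # adds both #street and #street=
end

addr = Address.new
addr.city = "Berlin"
addr.street = "Main St"
```

Here the generated writer stores into @city, while reads still go through the hand-written getter.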
|
181
|
+
|
182
|
+
This means that you can not only create new mapping classes from
|
183
|
+
scratch, you can also take existing classes that contain some
|
184
|
+
"business logic" and "augment" them with xml-mapping capabilities. As
|
185
|
+
a simple example, let's augment Ruby's "Time" class with node
|
186
|
+
declarations that declare XML mappings for the day, month etc. fields:
|
187
|
+
|
188
|
+
:include: time_augm.intout
|
189
|
+
|
190
|
+
Here XML mappings are defined for the existing fields +year+, +month+
|
191
|
+
etc. Xml-mapping noticed that the getter methods for those attributes
|
192
|
+
existed, so it didn't overwrite them. When calling +save_to_xml+ on a
|
193
|
+
+Time+ object, these methods are called and return the object's values
|
194
|
+
for those fields, which then get written to the output XML. Of course
|
195
|
+
you could also derive a new class from a pre-existing one and
|
196
|
+
implement the XML::Mapping stuff there, or even derive several such
|
197
|
+
classes in order to define more than one XML mapping for one existing
|
198
|
+
class.
|
199
|
+
|
200
|
+
It should be mentioned that in the +Time+ example above, the setter
|
201
|
+
methods (<tt>year=</tt>, <tt>month=</tt> etc.) didn't exist in +Time+
|
202
|
+
(+Time+ objects are immutable), so xml-mapping defined its own setter
|
203
|
+
methods that just set <tt>@year</tt>, <tt>@month</tt> etc., which is
|
204
|
+
pretty useless for this case. So you can't really read +Time+ values
|
205
|
+
back from an XML representation in this example. For that to work,
|
206
|
+
you'd need functioning <tt>blah=(x)</tt> methods for each +blah+
|
207
|
+
attribute that you want to define an XML mapping for.
|
208
|
+
|
209
|
+
|
210
|
+
=== Defining your own node types
|
211
|
+
|
212
|
+
It's easy to write additional node types and register them with the
|
213
|
+
xml-mapping library. Let's say we want to extend the +Signature+ class
|
214
|
+
from the example to include the time at which the signature was
|
215
|
+
created. We want the new XML representation of such a signature to
|
216
|
+
look like this:
|
217
|
+
|
218
|
+
:include: order_signature_enhanced.xml
|
219
|
+
|
220
|
+
(we only save year, month and day to make this example shorter), and
|
221
|
+
the mapping class declaration to look like this:
|
222
|
+
|
223
|
+
:include: order_signature_enhanced.rb
|
224
|
+
|
225
|
+
(i.e. a new "time_node" declaration was added).
|
226
|
+
|
227
|
+
We want this +time_node+ call to define an attribute named +signed_on+
|
228
|
+
which holds the date value from the XML in an instance of class
|
229
|
+
+Time+.
|
230
|
+
|
231
|
+
This node type can be defined with this piece of code:
|
232
|
+
|
233
|
+
:include: time_node.rb
|
234
|
+
|
235
|
+
The last line registers the new node type with the xml-mapping
|
236
|
+
library. The name of the node factory method ("time_node") is
|
237
|
+
automatically derived from the class name of the node type
|
238
|
+
("TimeNode").
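
One way such a name derivation could work is shown below. This is a sketch under the assumption that the rule is "strip the enclosing modules, then convert CamelCase to snake_case"; factory_method_name is a hypothetical helper, not the library's registration code.

```ruby
# Sketch: derive a factory method name such as "time_node" from a class name
# such as "TimeNode". factory_method_name is a hypothetical helper.
def factory_method_name(class_name)
  base = class_name.split("::").last                 # drop enclosing module names
  base.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase   # "_" at lower/upper boundaries
end

time_name = factory_method_name("TimeNode")
text_name = factory_method_name("XML::Mapping::TextNode")
```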
|
239
|
+
|
240
|
+
There will be one instance of the node type per mapping class (not per
|
241
|
+
mapping class instance). That instance will be created by the node
|
242
|
+
factory method (+time_node+); there's no need to instantiate the node
|
243
|
+
type directly. Whenever an instance of the mapping class needs to be
|
244
|
+
marshalled/unmarshalled to/from XML, +set_attr_value+
|
245
|
+
or +extract_attr_value+, respectively, will be called on the node type instance
|
246
|
+
("node" for short). The node factory method places the node into the
|
247
|
+
mapping class; the @owner attribute of the node is set to reference
|
248
|
+
the mapping class. The node factory method passes its arguments (in
|
249
|
+
the example, that would be <tt>:signed_on, "signed-on",
|
250
|
+
:default_value=>Time.now</tt>) to the node's initializer. TimeNode's
|
251
|
+
parent class XML::Mapping::SingleAttributeNode already handles the
|
252
|
+
<tt>:signed_on</tt> and <tt>:default_value=>Time.now</tt> arguments --
|
253
|
+
<tt>:signed_on</tt> is stored into <tt>@attrname</tt>, and the default
|
254
|
+
value declarations will be described in a moment. The remaining
|
255
|
+
argument <tt>"signed-on"</tt> gets passed to our +initialize_impl+
|
256
|
+
method as parameter _path_. We'll interpret it as an XPath expression
|
257
|
+
that locates the time value relative to the parent mapping object's
|
258
|
+
XML tree (in this case, this would be the XML tree rooted at the
|
259
|
+
+<Signature>+ element, i.e. the tree the +Signature+ instance was read
|
260
|
+
from). We'll later have to read/store the year, month, and day values
|
261
|
+
from <tt>path+"/year"</tt>, <tt>path+"/month"</tt>, and
|
262
|
+
<tt>path+"/day"</tt>, respectively, so we create (and precompile)
|
263
|
+
three corresponding XPath expressions using XML::XXPath.new and store
|
264
|
+
them into member variables of the node. XML::XXPath is an XPath
|
265
|
+
implementation that is bundled with xml-mapping. It is very
|
266
|
+
incomplete, but it supports writing (not just reading) of XML nodes,
|
267
|
+
which is needed to support writing data back to XML. The XML::XXPath
|
268
|
+
library is explained in more detail below.
|
269
|
+
|
270
|
+
The +extract_attr_value+ method is called whenever an instance of the
|
271
|
+
class the node belongs to (+Signature+ in the example) is being
|
272
|
+
created from an XML tree. The parameter _xml_ is that tree (again,
|
273
|
+
this is the tree rooted at the +<Signature>+ element in this
|
274
|
+
example). The method implementation is expected to extract the
|
275
|
+
attribute's value from _xml_ and return it, or raise
|
276
|
+
XML::Mapping::SingleAttributeNode::NoAttrValueSet if the attribute was
|
277
|
+
"unset" in the XML (so the default value should be put in place if it
|
278
|
+
was defined), or raise any other exception to signal an error and
|
279
|
+
abort the whole process. In our implementation, we apply the xpath
|
280
|
+
expressions created at initialization to _xml_
|
281
|
+
(e.g. <tt>@y_path.first(xml)</tt>). An expression
|
282
|
+
_xpath_expr_.first(_xml_) returns (as a REXML element) the first
|
283
|
+
sub-element of _xml_ that matches _xpath_expr_, or raises
|
284
|
+
XML::XXPathError if there was no such element. We apply REXML's _text_
|
285
|
+
method to the returned element to get out the element's text, convert
|
286
|
+
it to integer, and supply it to the constructor of the +Time+ object
|
287
|
+
to be returned. (as a side note, if an XPath expression matches XML
|
288
|
+
attributes, XML::XXPath methods like _first_ will return "Attribute"
|
289
|
+
nodes that behave similarly to REXML::Element nodes, including
|
290
|
+
messages like _name_ and _text_ (XML::XXPath extends REXML to support
|
291
|
+
this because REXML's Attribute class is too incompatible), so this
|
292
|
+
would've worked also if our XPath expressions named XML attributes,
|
293
|
+
not elements). The +default_when_xpath_err+ method calls the supplied
|
294
|
+
block and returns its value, but maps the exception XML::XXPathError to
|
295
|
+
the mentioned XML::Mapping::SingleAttributeNode::NoAttrValueSet (any
|
296
|
+
other exceptions fall through unchanged). As said above,
|
297
|
+
XML::Mapping::SingleAttributeNode::NoAttrValueSet is then caught by our superclass
|
298
|
+
(XML::Mapping::SingleAttributeNode), and the default value is set if
|
299
|
+
it was provided. So you should just wrap +default_when_xpath_err+
|
300
|
+
around any applications of XPath expressions whose non-presence in the
|
301
|
+
XML you want to be considered a non-presence of the attribute you're
|
302
|
+
trying to extract. (XML::XXPath is designed to know nothing about
|
303
|
+
XML::Mapping, so it doesn't raise
|
304
|
+
XML::Mapping::SingleAttributeNode::NoAttrValueSet directly)
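
The exception translation described above follows a common pattern, sketched here in plain Ruby. The two error classes and the helper default_when_path_missing are stand-ins defined only for this example; they merely play the roles of XML::XXPathError, NoAttrValueSet and +default_when_xpath_err+.

```ruby
# Sketch of the exception translation described above. Both error classes
# and the helper are stand-ins defined here, not the library's own.
class PathNotFound < StandardError; end   # plays the role of XML::XXPathError
class NoValueSet   < StandardError; end   # plays the role of NoAttrValueSet

# Runs the block; a PathNotFound becomes NoValueSet, anything else passes through.
def default_when_path_missing
  yield
rescue PathNotFound => e
  raise NoValueSet, "attribute not set in XML: #{e.message}"
end

# A caller (the superclass, in the text above) can then substitute the default.
result = begin
  default_when_path_missing { raise PathNotFound, "signed-on/year" }
rescue NoValueSet
  :use_default
end
```

Keeping the translation in a small wrapper is what lets the XPath code stay unaware of the mapping layer.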
|
305
|
+
|
306
|
+
The +set_attr_value+ method is called whenever an instance of the
|
307
|
+
class the node belongs to (+Signature+ in the example) is being stored
|
308
|
+
into an XML tree. The _xml_ parameter is the XML tree (a REXML element
|
309
|
+
node; here this is again the tree rooted at the +<Signature>+
|
310
|
+
element); _value_ is the current value of the attribute. _xml_ will
|
311
|
+
most probably be "half-populated" by the time this method is called --
|
312
|
+
the framework calls the +set_attr_value+ methods of all nodes of a
|
313
|
+
mapping class in the order of their definition, letting each node fill
|
314
|
+
its "bit" into _xml_. The method implementation is expected to write
|
315
|
+
_value_ into (the correct sub-elements of) _xml_, or raise an
|
316
|
+
exception to signal an error and abort the whole process. No default
|
317
|
+
value handling is done here; +set_attr_value+ won't be called at all
|
318
|
+
if the attribute had been set to its default value. In our
|
319
|
+
implementation we grab the year, month and day values from _value_
|
320
|
+
(which must be a +Time+), and store it into the sub-elements of _xml_
|
321
|
+
identified by XPath expressions <tt>@y_path</tt>, <tt>@m_path</tt> and
|
322
|
+
<tt>@d_path</tt>, respectively. We do this by calling XML::XXPath#first
|
323
|
+
with an additional parameter <tt>:ensure_created=>true</tt>. An
|
324
|
+
expression _xpath_expr_.first(_xml_,:ensure_created=>true) works just
|
325
|
+
like _xpath_expr_.first(_xml_) if _xpath_expr_ was already present in
|
326
|
+
_xml_. If it was not, it is created (preferably at the end of _xml_'s
|
327
|
+
list of sub-nodes), and returned. See below for a more detailed
|
328
|
+
documentation of the XPath interpreter.
|
329
|
+
|
330
|
+
=== Element order in created XML documents
|
331
|
+
|
332
|
+
As just said, XML::XXPath, when used to create new XML nodes, generally
|
333
|
+
appends those nodes to the end of the list of subnodes of the node the
|
334
|
+
xpath expression was applied to. All xml-mapping nodes that come with
|
335
|
+
xml-mapping use XML::XXPath when writing data to XML, and therefore
|
336
|
+
also append their data to the XML data written by preceding nodes (the
|
337
|
+
nodes are invoked in the order of their definition). This means that,
|
338
|
+
generally, your output data will appear in the XML document in the
|
339
|
+
same order in which the corresponding xml-mapping node definitions
|
340
|
+
appeared in the mapping class (unless you used XPath expressions like
|
341
|
+
foo[number] which explicitly dictate a fixed position in the sequence
|
342
|
+
of XML nodes). For instance, in the example from the beginning of this
|
343
|
+
document, if we put the <tt>:signatures</tt> node _before_ the
|
344
|
+
<tt>:items</tt> node, the <tt><Signed-By></tt> element will appear
|
345
|
+
_before_ the sequence of <tt><Item></tt> elements in the output XML.
|
346
|
+
|
347
|
+
|
348
|
+
|
349
|
+
== XPath interpreter
|
350
|
+
|
351
|
+
XML::XXPath is an XPath parser. It is used in xml-mapping node type
|
352
|
+
definitions, but can just as well be utilized stand-alone (it does
|
353
|
+
not depend on xml-mapping). XML::XXPath is very incomplete and probably
|
354
|
+
will always be (it only supports path elements of types _elt_name_,
|
355
|
+
@_attr_name_, _elt_name_[@_attr_name_='_attr_value_'],
|
356
|
+
_elt_name_[_index_], and *), but it should be reasonably efficient
|
357
|
+
(XPath expressions are precompiled), and, most importantly, it
|
358
|
+
supports write access. For example, if you create the path
|
359
|
+
"/foo/bar[3]/baz[@key='hiho']" in the XML document
|
360
|
+
|
361
|
+
<foo>
|
362
|
+
<bar>
|
363
|
+
<baz key="ab">hello</baz>
|
364
|
+
<baz key="xy">goodbye</baz>
|
365
|
+
</bar>
|
366
|
+
</foo>
|
367
|
+
|
368
|
+
, you'll get:
|
369
|
+
|
370
|
+
<foo>
|
371
|
+
<bar>
|
372
|
+
<baz key='ab'>hello</baz>
|
373
|
+
<baz key='xy'>goodbye</baz>
|
374
|
+
</bar>
|
375
|
+
<bar/>
|
376
|
+
<bar>
|
377
|
+
<baz key='hiho'/>
|
378
|
+
</bar>
|
379
|
+
</foo>
|
380
|
+
|
381
|
+
XML::XXPath is explained in more detail in the reference documentation.
|
382
|
+
|
383
|
+
|
384
|
+
== License
|
385
|
+
|
386
|
+
Ruby's.
|
data/README_XPATH
ADDED
@@ -0,0 +1,175 @@
|
|
1
|
+
= XML-XXPATH
|
2
|
+
|
3
|
+
== Overview, Motivation
|
4
|
+
|
5
|
+
Xml-xxpath is an (incomplete) XPath interpreter that is at the moment
|
6
|
+
bundled with xml-mapping. It is built on top of REXML. xml-mapping
|
7
|
+
uses xml-xxpath extensively for implementing its node types -- see the
|
8
|
+
README file and the reference documentation (and the source code) for
|
9
|
+
details. xml-xxpath, however, does not depend on xml-mapping at all,
|
10
|
+
and is useful in its own right -- maybe I'll later distribute it as a
|
11
|
+
separate library instead of bundling it. xml-xxpath's XPath support is
|
12
|
+
vastly incomplete (see below), but, in addition to the normal
|
13
|
+
reading/matching functionality found in other XPath implementations
|
14
|
+
(i.e. "find all elements in a given XML document matching a given
|
15
|
+
XPath expression"), xml-xxpath supports <i>write access</i>. For
|
16
|
+
example, when writing the XPath expression
|
17
|
+
"/foo/bar[3]/baz[@key='hiho']" to the XML document
|
18
|
+
|
19
|
+
<foo>
|
20
|
+
<bar>
|
21
|
+
<baz key='ab'>hello</baz>
|
22
|
+
<baz key='xy'>goodbye</baz>
|
23
|
+
</bar>
|
24
|
+
</foo>
|
25
|
+
|
26
|
+
, you'll get:
|
27
|
+
|
28
|
+
<foo>
|
29
|
+
<bar>
|
30
|
+
<baz key='ab'>hello</baz>
|
31
|
+
<baz key='xy'>goodbye</baz>
|
32
|
+
</bar>
|
33
|
+
<bar/>
|
34
|
+
<bar><baz key='hiho'/></bar>
|
35
|
+
</foo>
|
36
|
+
|
37
|
+
This feature is used by xml-mapping when writing (marshalling) Ruby
|
38
|
+
objects to XML, and is actually the reason why I couldn't just use any
|
39
|
+
of the existing XPath implementations, e.g. the one that comes with
|
40
|
+
REXML. Also, the whole xml-xxpath implementation is just 300 lines of
|
41
|
+
Ruby code, it is quite fast (paths are precompiled), and xml-xxpath
|
42
|
+
returns matched elements in the order they appeared in the source
|
43
|
+
document -- I've heard REXML::XPath doesn't do that :)
|
44
|
+
|
45
|
+
Some basic knowledge of XPath is helpful for reading this document (I
|
46
|
+
don't know very much either).
|
47
|
+
|
48
|
+
At the moment, xml-xxpath understands XPath expressions of the form
|
49
|
+
[<tt>/</tt>]_pathelement_<tt>/</tt>_pathelement_<tt>/</tt>..., where
|
50
|
+
each _pathelement_ must be one of these:
|
51
|
+
|
52
|
+
- a simple element name _name_, e.g. +signature+
|
53
|
+
|
54
|
+
- an attribute name, @_attr_name_, e.g. <tt>@key</tt>
|
55
|
+
|
56
|
+
- a combination of an element name and an attribute name and
|
57
|
+
-value, in the form _elt_name_[@_attr_name_='_attr_value_']
|
58
|
+
|
59
|
+
- an element name and an index, _elt_name_[_index_]
|
60
|
+
|
61
|
+
- the "match-all" path element, <tt>*</tt>
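
To make the five kinds of path element concrete, here is a small classifier in plain Ruby. It is a sketch only: classify_step and the tags it returns are hypothetical, the regular expressions are an assumption about what counts as a name, and the naive split on "/" would mishandle attribute values that contain a slash.

```ruby
# Sketch: classify one path element into the five supported kinds.
# classify_step and the returned tags are hypothetical, for illustration only.
def classify_step(step)
  case step
  when "*"                                 then [:all]
  when /\A@([\w-]+)\z/                     then [:attr, $1]
  when /\A([\w-]+)\[@([\w-]+)='(.*)'\]\z/  then [:name_and_attr, $1, $2, $3]
  when /\A([\w-]+)\[(\d+)\]\z/             then [:name_and_index, $1, $2.to_i]
  when /\A([\w-]+)\z/                      then [:name, $1]
  else raise ArgumentError, "unsupported path element: #{step}"
  end
end

# Split a path into its elements, ignoring the optional leading "/".
steps = "/foo/bar[3]/baz[@key='hiho']".split("/").reject(&:empty?)
kinds = steps.map { |s| classify_step(s) }
```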
|
62
|
+
|
63
|
+
|
64
|
+
== Usage
|
65
|
+
|
66
|
+
Xml-xxpath defines the class XML::XXPath. An instance of that class
|
67
|
+
wraps an XPath expression, the string representation of which must be
|
68
|
+
supplied when constructing the instance. You then call instance
|
69
|
+
methods like _first_, _all_ or <i>create_new</i> on the instance,
|
70
|
+
supplying the REXML Element the XPath expression should be applied to,
|
71
|
+
and get the results, or, in the case of write access, the element is
|
72
|
+
updated in-place.
|
73
|
+
|
74
|
+
|
75
|
+
=== Read Access
|
76
|
+
|
77
|
+
:include: xpath_usage.intout
|
78
|
+
|
79
|
+
The objects supplied to the <tt>all()</tt>, <tt>first()</tt>, and
|
80
|
+
<tt>each()</tt> calls must be REXML element nodes, i.e. they must
|
81
|
+
support messages like <tt>elements</tt>, <tt>attributes</tt> etc
|
82
|
+
(instances of REXML::Element and its subclasses do this). The calls
|
83
|
+
return the found elements as instances of REXML::Element or
|
84
|
+
XML::XXPath::Accessors::Attribute. The latter is a wrapper around
|
85
|
+
attribute nodes that is largely call-compatible to
|
86
|
+
REXML::Element. This is so you can write things like
|
87
|
+
<tt>path.each{|node|puts node.text}</tt> without having to
|
88
|
+
special-case anything even if the path matches attributes, not just
|
89
|
+
elements.
|
90
|
+
|
91
|
+
As you can see, you can re-use path objects, applying them to
|
92
|
+
different XML elements at will. You should do this because the XPath
|
93
|
+
pattern is stored inside the XPath object in a pre-compiled form,
|
94
|
+
which makes it more efficient.
|
95
|
+
|
96
|
+
The path elements of the XPath pattern are applied to the
|
97
|
+
<tt>.elements</tt> collection of the passed XML element and its
|
98
|
+
sub-elements, starting with the first one. This is shown by the
|
99
|
+
following code:
|
100
|
+
|
101
|
+
:include: xpath_docvsroot.intout
|
102
|
+
|
103
|
+
A REXML +Document+ object is a REXML +Element+ object whose +elements+
|
104
|
+
collection consists only of a single member -- the document's root
|
105
|
+
node. The first path element of the XPath -- "foo" in the example --
|
106
|
+
is matched against that. That is why the path "/bar" in the example
|
107
|
+
doesn't match anything when matched against the document +d+ itself.
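
The difference between the two +elements+ collections can be checked with REXML alone. The snippet below is a minimal sketch using a made-up document; it does not use XML::XXPath.

```ruby
require "rexml/document"

doc = REXML::Document.new("<foo><bar><first/></bar></foo>")

# The document's elements collection holds only the root element ...
doc_children  = doc.elements.to_a.map(&:name)        # ["foo"]
# ... while the root element's collection holds its direct sub-elements.
root_children = doc.root.elements.to_a.map(&:name)   # ["bar"]
```

A path that starts with "foo" therefore has to be applied to the document, and a path that starts with "bar" to the root element.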
|
108
|
+
|
109
|
+
An ordinary REXML +Element+ object that represents a node somewhere
|
110
|
+
inside an XML tree has an +elements+ collection that consists of all
|
111
|
+
the element's direct sub-elements. That is why XPath patterns matched
|
112
|
+
against the +firstelt+ element in the example *must not* start with
|
113
|
+
"/first" (unless there is a child node that is also named "first").
|
114
|
+
|
115
|
+
|
116
|
+
=== Write Access
|
117
|
+
|
118
|
+
You may pass a <tt>:ensure_created=>true</tt> option argument to
|
119
|
+
_path_.first(_elt_)/_path_.all(_elt_) calls to make sure that _path_
|
120
|
+
exists inside the passed XML element _elt_. If it existed before,
|
121
|
+
nothing changes, and the call behaves just as it would without the
|
122
|
+
option argument. If the path didn't exist before, the XML element is
|
123
|
+
modified such that
|
124
|
+
|
125
|
+
- the path exists afterwards
|
126
|
+
|
127
|
+
- all paths that existed before still exist afterwards
|
128
|
+
|
129
|
+
- the modification is as small as possible (i.e. as few elements as
|
130
|
+
possible are added, additional attributes are added to existing
|
131
|
+
elements if possible etc.)
|
132
|
+
|
133
|
+
The created or previously existing matching elements are returned.
|
134
|
+
|
135
|
+
|
136
|
+
Examples:
|
137
|
+
|
138
|
+
:include: xpath_ensure_created.intout
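
For intuition, the "reuse what exists, create what is missing" idea can be sketched directly on top of REXML for the simplest case, a path made of plain element names. ensure_path is a hypothetical helper written for this illustration; it does not use XML::XXPath and handles none of the attribute or index forms.

```ruby
require "rexml/document"

# Sketch of "ensure created" for paths made of plain element names only.
# ensure_path is a hypothetical helper; it does not use XML::XXPath.
def ensure_path(elt, path)
  path.split("/").reject(&:empty?).inject(elt) do |current, name|
    # Reuse the first matching child, otherwise append a new one at the end.
    current.elements[name] || current.add_element(name)
  end
end

root = REXML::Document.new("<foo><bar/></foo>").root
baz  = ensure_path(root, "bar/baz")   # reuses <bar>, creates <baz>
same = ensure_path(root, "bar/baz")   # second call changes nothing
```

Because existing children are reused, calling the helper twice returns the same element and adds nothing the second time.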
|
139
|
+
|
140
|
+
|
141
|
+
Alternatively, you may pass a <tt>:create_new=>true</tt> option
|
142
|
+
argument or call <tt>create_new</tt> (_path_.create_new(_elt_) is
|
143
|
+
equivalent to _path_.first(_elt_,:create_new=>true)). In that case, a
|
144
|
+
new node is created in _elt_ for each path element of _path_ (or an
|
145
|
+
exception raised if that wasn't possible for any path element).
|
146
|
+
|
147
|
+
Examples:
|
148
|
+
|
149
|
+
:include: xpath_create_new.intout
|
150
|
+
|
151
|
+
|
152
|
+
=== Pathological Cases
|
153
|
+
|
154
|
+
What is created when the path "*" is to be created inside an empty XML
|
155
|
+
element? The name of the element to be created isn't known, but still
|
156
|
+
some element must be created. The answer is that xml-xxpath creates a
|
157
|
+
special "unspecified" element whose name must be set by the caller
|
158
|
+
afterwards:
|
159
|
+
|
160
|
+
:include: xpath_pathological.intout
|
161
|
+
|
162
|
+
The "newelt" object in the last example is an ordinary
|
163
|
+
REXML::Element. xml-xxpath mixes the "unspecified" attribute into that
|
164
|
+
class, as well as into the XML::XXPath::Accessors::Attribute class
|
165
|
+
mentioned above.
|
166
|
+
|
167
|
+
|
168
|
+
== Implementation notes
|
169
|
+
|
170
|
+
<tt>doc/xpath_impl_notes.txt</tt> contains some documentation on the
|
171
|
+
implementation of xml-xxpath.
|
172
|
+
|
173
|
+
== License
|
174
|
+
|
175
|
+
Ruby's.
|