rexml 3.2.5 → 3.3.6
- checksums.yaml +4 -4
- data/NEWS.md +406 -2
- data/README.md +10 -1
- data/doc/rexml/tasks/rdoc/element.rdoc +2 -2
- data/doc/rexml/tutorial.rdoc +1358 -0
- data/lib/rexml/attribute.rb +14 -9
- data/lib/rexml/document.rb +1 -1
- data/lib/rexml/element.rb +19 -34
- data/lib/rexml/entity.rb +5 -37
- data/lib/rexml/formatters/pretty.rb +3 -3
- data/lib/rexml/functions.rb +1 -2
- data/lib/rexml/namespace.rb +8 -4
- data/lib/rexml/node.rb +8 -4
- data/lib/rexml/parseexception.rb +1 -0
- data/lib/rexml/parsers/baseparser.rb +421 -263
- data/lib/rexml/parsers/pullparser.rb +4 -0
- data/lib/rexml/parsers/sax2parser.rb +6 -19
- data/lib/rexml/parsers/streamparser.rb +8 -10
- data/lib/rexml/parsers/treeparser.rb +9 -21
- data/lib/rexml/parsers/xpathparser.rb +136 -86
- data/lib/rexml/rexml.rb +3 -1
- data/lib/rexml/source.rb +128 -98
- data/lib/rexml/text.rb +40 -18
- data/lib/rexml/xpath_parser.rb +7 -3
- metadata +11 -39
@@ -0,0 +1,1358 @@
|
|
1
|
+
= \REXML Tutorial
|
2
|
+
|
3
|
+
== Why \REXML?
|
4
|
+
|
5
|
+
- Ruby's \REXML library is part of the Ruby distribution,
|
6
|
+
so using it requires no gem installations.
|
7
|
+
- \REXML is fully maintained.
|
8
|
+
- \REXML is mature, having been in use for many years.
|
9
|
+
|
10
|
+
== To Include, or Not to Include?
|
11
|
+
|
12
|
+
REXML is a module.
|
13
|
+
To use it, you must require it:
|
14
|
+
|
15
|
+
require 'rexml' # => true
|
16
|
+
|
17
|
+
If you do not also include it, you must fully qualify references to REXML:
|
18
|
+
|
19
|
+
REXML::Document # => REXML::Document
|
20
|
+
|
21
|
+
If you also include the module, you may optionally omit <tt>REXML::</tt>:
|
22
|
+
|
23
|
+
include REXML
|
24
|
+
Document # => REXML::Document
|
25
|
+
REXML::Document # => REXML::Document
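A minimal sketch of a middle path, assuming you want the short names without including \REXML at the top level: include the module inside a module of your own. The name +BookTools+ and its +parse+ method are hypothetical, introduced only for this illustration.

```ruby
require 'rexml/document'

# BookTools is a hypothetical module, used only for illustration.
module BookTools
  include REXML

  # Inside this module, REXML constants resolve without the REXML:: prefix.
  def self.parse(xml)
    Document.new(xml)
  end
end

doc = BookTools.parse('<bookstore><book/></bookstore>')
puts doc.class      # the class is still the fully qualified REXML::Document
puts doc.root.name  # name of the root element
```

Code outside +BookTools+ still has to write <tt>REXML::Document</tt>, which keeps the top-level namespace unchanged.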
|
26
|
+
|
27
|
+
== Preliminaries
|
28
|
+
|
29
|
+
All examples here assume that the following code has been executed:
|
30
|
+
|
31
|
+
require 'rexml'
|
32
|
+
include REXML
|
33
|
+
|
34
|
+
The source XML for many examples here is from file
|
35
|
+
{books.xml}[https://www.w3schools.com/xml/books.xml] at w3schools.com.
|
36
|
+
You may find it convenient to open that page in a new tab
|
37
|
+
(Ctrl-click in some browsers).
|
38
|
+
|
39
|
+
Note that your browser may display the XML with modified whitespace
|
40
|
+
and without the XML declaration, which in this case is:
|
41
|
+
|
42
|
+
<?xml version="1.0" encoding="UTF-8"?>
|
43
|
+
|
44
|
+
For convenience, we capture the XML into a string variable:
|
45
|
+
|
46
|
+
require 'open-uri'
|
47
|
+
source_string = URI.open('https://www.w3schools.com/xml/books.xml').read
|
48
|
+
|
49
|
+
And into a file:
|
50
|
+
|
51
|
+
File.write('source_file.xml', source_string)
|
52
|
+
|
53
|
+
Throughout these examples, variable +doc+ will hold only the document
|
54
|
+
derived from these sources:
|
55
|
+
|
56
|
+
doc = Document.new(source_string)
|
57
|
+
|
58
|
+
== Parsing \XML \Source
|
59
|
+
|
60
|
+
=== Parsing a Document
|
61
|
+
|
62
|
+
Use method REXML::Document::new to parse XML source.
|
63
|
+
|
64
|
+
The source may be a string:
|
65
|
+
|
66
|
+
doc = Document.new(source_string)
|
67
|
+
|
68
|
+
Or an \IO stream:
|
69
|
+
|
70
|
+
doc = File.open('source_file.xml', 'r') do |io|
|
71
|
+
Document.new(io)
|
72
|
+
end
|
73
|
+
|
74
|
+
Method <tt>URI.open</tt> returns an IO-like object (a StringIO for a small response such as this one),
|
75
|
+
so the source can be from a web page:
|
76
|
+
|
77
|
+
require 'open-uri'
|
78
|
+
io = URI.open("https://www.w3schools.com/xml/books.xml")
|
79
|
+
io.class # => StringIO
|
80
|
+
doc = Document.new(io)
|
81
|
+
|
82
|
+
For any of these sources, the returned object is an REXML::Document:
|
83
|
+
|
84
|
+
doc # => <UNDEFINED> ... </>
|
85
|
+
doc.class # => REXML::Document
|
86
|
+
|
87
|
+
Note: <tt>'UNDEFINED'</tt> is the "name" displayed for a document,
|
88
|
+
even though <tt>doc.name</tt> returns an empty string <tt>""</tt>.
|
89
|
+
|
90
|
+
A parsed document may produce \REXML objects of many classes,
|
91
|
+
but the two that are likely to be of greatest interest are
|
92
|
+
REXML::Document and REXML::Element.
|
93
|
+
These two classes are covered in great detail in this tutorial.
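The parsing forms above can be tried without network access. This is a sketch only: the inline XML is a hypothetical, shortened stand-in for books.xml, and a StringIO plays the role of the \IO stream.

```ruby
require 'rexml/document'
require 'stringio'

# Hypothetical inline sample, standing in for books.xml.
xml = <<~XML
  <bookstore>
    <book category="cooking"><title lang="en">Everyday Italian</title></book>
    <book category="web"><title lang="en">Learning XML</title></book>
  </bookstore>
XML

# Parse the same source once as a string and once as an IO-like object.
doc_from_string = REXML::Document.new(xml)
doc_from_io = REXML::Document.new(StringIO.new(xml))

puts doc_from_string.root.name          # name of the root element
puts doc_from_io.root.elements.size     # number of element children of the root
```

Both calls build equivalent trees, so the choice of source is a matter of where the XML already lives.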
|
94
|
+
|
95
|
+
=== Context (Parsing Options)
|
96
|
+
|
97
|
+
The context for parsing a document is a hash that influences
|
98
|
+
the way the XML is read and stored.
|
99
|
+
|
100
|
+
The context entries are:
|
101
|
+
|
102
|
+
- +:respect_whitespace+: controls treatment of whitespace.
|
103
|
+
- +:compress_whitespace+: determines whether whitespace is compressed.
|
104
|
+
- +:ignore_whitespace_nodes+: determines whether whitespace-only nodes are to be ignored.
|
105
|
+
- +:raw+: controls treatment of special characters and entities.
|
106
|
+
|
107
|
+
See {Element Context}[../context_rdoc.html].
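As a rough illustration of how a context entry changes what is stored, the sketch below parses the same hypothetical XML with and without <tt>:ignore_whitespace_nodes</tt> and compares the child counts.

```ruby
require 'rexml/document'

# Hypothetical sample with whitespace-only text between the elements.
xml = "<root>\n  <a/>\n  <b/>\n</root>"

# No context: whitespace-only text nodes are kept as children.
default_doc = REXML::Document.new(xml)

# Context hash passed as the second argument: whitespace-only nodes are dropped.
compact_doc = REXML::Document.new(xml, { ignore_whitespace_nodes: :all })

puts default_doc.root.size  # counts text nodes and elements
puts compact_doc.root.size  # counts only the two elements
```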
|
108
|
+
|
109
|
+
== Exploring the Document
|
110
|
+
|
111
|
+
An REXML::Document object represents an XML document.
|
112
|
+
|
113
|
+
The object inherits from its ancestor classes:
|
114
|
+
|
115
|
+
- REXML::Child (includes module REXML::Node)
|
116
|
+
- REXML::Parent (includes module {Enumerable}[rdoc-ref:Enumerable]).
|
117
|
+
- REXML::Element (includes module REXML::Namespace).
|
118
|
+
- REXML::Document
|
119
|
+
|
120
|
+
This section covers only those properties and methods that are unique to a document
|
121
|
+
(that is, not inherited or included).
|
122
|
+
|
123
|
+
=== Document Properties
|
124
|
+
|
125
|
+
A document has several properties (other than its children):
|
126
|
+
|
127
|
+
- Document type.
|
128
|
+
- Node type.
|
129
|
+
- Name.
|
130
|
+
- Document.
|
131
|
+
- XPath.
|
132
|
+
|
133
|
+
[Document Type]
|
134
|
+
|
135
|
+
A document may have a document type:
|
136
|
+
|
137
|
+
my_xml = '<!DOCTYPE foo>'
|
138
|
+
my_doc = Document.new(my_xml)
|
139
|
+
doc_type = my_doc.doctype
|
140
|
+
doc_type.class # => REXML::DocType
|
141
|
+
doc_type.to_s # => "<!DOCTYPE foo>"
|
142
|
+
|
143
|
+
[Node Type]
|
144
|
+
|
145
|
+
A document also has a node type (always +:document+):
|
146
|
+
|
147
|
+
doc.node_type # => :document
|
148
|
+
|
149
|
+
[Name]
|
150
|
+
|
151
|
+
A document has a name (always an empty string):
|
152
|
+
|
153
|
+
doc.name # => ""
|
154
|
+
|
155
|
+
[Document]
|
156
|
+
|
157
|
+
\Method REXML::Document#document returns +self+:
|
158
|
+
|
159
|
+
doc.document == doc # => true
|
160
|
+
|
161
|
+
An object of a different class (\REXML::Element or \REXML::Child)
|
162
|
+
may have a document, which is the document to which the object belongs;
|
163
|
+
if so, that document will be an \REXML::Document object.
|
164
|
+
|
165
|
+
doc.root.document.class # => REXML::Document
|
166
|
+
|
167
|
+
[XPath]
|
168
|
+
|
169
|
+
\Method REXML::Element#xpath returns the string XPath to the element,
|
170
|
+
relative to its most distant ancestor:
|
171
|
+
|
172
|
+
doc.root.class # => REXML::Element
|
173
|
+
doc.root.xpath # => "/bookstore"
|
174
|
+
doc.root.texts.first # => "\n\n"
|
175
|
+
doc.root.texts.first.xpath # => "/bookstore/text()"
|
176
|
+
|
177
|
+
If there is no ancestor, returns the expanded name of the element:
|
178
|
+
|
179
|
+
Element.new('foo').xpath # => "foo"
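The properties in this section can be checked together on a small document. This is a sketch with hypothetical inline XML, not the books.xml source used elsewhere.

```ruby
require 'rexml/document'

# Hypothetical sample document.
doc = REXML::Document.new('<bookstore><book><title>Learning XML</title></book></bookstore>')
title = doc.root.elements['book/title']

p doc.node_type               # node type of the document
p doc.name                    # a document's name is an empty string
p doc.document.equal?(doc)    # a document is its own document
p title.document.equal?(doc)  # a nested element reports the same document
p title.xpath                 # path from the most distant ancestor
```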
|
180
|
+
|
181
|
+
=== Document Children
|
182
|
+
|
183
|
+
A document may have children of these types:
|
184
|
+
|
185
|
+
- XML declaration.
|
186
|
+
- Root element.
|
187
|
+
- Text.
|
188
|
+
- Processing instructions.
|
189
|
+
- Comments.
|
190
|
+
- CDATA.
|
191
|
+
|
192
|
+
[XML Declaration]
|
193
|
+
|
194
|
+
A document may have an XML declaration, which is stored as an REXML::XMLDecl object:
|
195
|
+
|
196
|
+
doc.xml_decl # => <?xml ... ?>
|
197
|
+
doc.xml_decl.class # => REXML::XMLDecl
|
198
|
+
|
199
|
+
Document.new('').xml_decl # => <?xml ... ?>
|
200
|
+
|
201
|
+
my_xml = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
|
202
|
+
my_doc = Document.new(my_xml)
|
203
|
+
xml_decl = my_doc.xml_decl
|
204
|
+
xml_decl.to_s # => "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>"
|
205
|
+
|
206
|
+
The version, encoding, and stand-alone values may be retrieved separately:
|
207
|
+
|
208
|
+
my_doc.version # => "1.0"
|
209
|
+
my_doc.encoding # => "UTF-8"
|
210
|
+
my_doc.stand_alone? # => "yes"
|
211
|
+
|
212
|
+
[Root Element]
|
213
|
+
|
214
|
+
A document may have a single element child, called the _root_ _element_,
|
215
|
+
which is stored as an REXML::Element object;
|
216
|
+
it may be retrieved with method +root+:
|
217
|
+
|
218
|
+
doc.root # => <bookstore> ... </>
|
219
|
+
doc.root.class # => REXML::Element
|
220
|
+
|
221
|
+
Document.new('').root # => nil
|
222
|
+
|
223
|
+
[Text]
|
224
|
+
|
225
|
+
A document may have text passages, each of which is stored
|
226
|
+
as an REXML::Text object:
|
227
|
+
|
228
|
+
doc.texts.each {|t| p [t.class, t] }
|
229
|
+
|
230
|
+
Output:
|
231
|
+
|
232
|
+
[REXML::Text, "\n"]
|
233
|
+
|
234
|
+
[Processing Instructions]
|
235
|
+
|
236
|
+
A document may have processing instructions, which are stored
|
237
|
+
as REXML::Instruction objects:
|
238
|
+
|
239
|
+
|
240
|
+
|
241
|
+
Output:
|
242
|
+
|
243
|
+
[REXML::Instruction, <?p-i my-application ...?>]
|
244
|
+
[REXML::Instruction, <?p-i my-application ...?>]
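Output of this shape can be produced with code along the following lines. This is a sketch under stated assumptions: the input XML and its target names are hypothetical, and a root element is included so that the source is a well-formed document.

```ruby
require 'rexml/document'

# Hypothetical input: two processing instructions ahead of a root element.
my_xml = '<?first-target foo?><?second-target bar?><root/>'
my_doc = REXML::Document.new(my_xml)

# Each processing instruction child is an REXML::Instruction.
my_doc.instructions.each { |i| p [i.class, i] }

# The target of each instruction is available separately.
targets = my_doc.instructions.map(&:target)
p targets
```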
|
245
|
+
|
246
|
+
[Comments]
|
247
|
+
|
248
|
+
A document may have comments, which are stored
|
249
|
+
as REXML::Comment objects:
|
250
|
+
|
251
|
+
my_xml = <<-EOT
|
252
|
+
<!--foo-->
|
253
|
+
<!--bar-->
|
254
|
+
EOT
|
255
|
+
my_doc = Document.new(my_xml)
|
256
|
+
my_doc.comments.each {|c| p [c.class, c] }
|
257
|
+
|
258
|
+
Output:
|
259
|
+
|
260
|
+
[REXML::Comment, #<REXML::Comment: @parent=<UNDEFINED> ... </>, @string="foo">]
|
261
|
+
[REXML::Comment, #<REXML::Comment: @parent=<UNDEFINED> ... </>, @string="bar">]
|
262
|
+
|
263
|
+
[CDATA]
|
264
|
+
|
265
|
+
A document may have CDATA entries, which are stored
|
266
|
+
as REXML::CData objects:
|
267
|
+
|
268
|
+
my_xml = <<-EOT
|
269
|
+
<![CDATA[foo]]>
|
270
|
+
<![CDATA[bar]]>
|
271
|
+
EOT
|
272
|
+
my_doc = Document.new(my_xml)
|
273
|
+
my_doc.cdatas.each {|cd| p [cd.class, cd] }
|
274
|
+
|
275
|
+
Output:
|
276
|
+
|
277
|
+
[REXML::CData, "foo"]
|
278
|
+
[REXML::CData, "bar"]
|
279
|
+
|
280
|
+
The payload of a document is a tree of nodes, descending from the root element:
|
281
|
+
|
282
|
+
doc.root.children.each do |child|
|
283
|
+
p [child.class, child]
|
284
|
+
end
|
285
|
+
|
286
|
+
Output:
|
287
|
+
|
288
|
+
[REXML::Text, "\n\n"]
|
289
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
290
|
+
[REXML::Text, "\n\n"]
|
291
|
+
[REXML::Element, <book category='children'> ... </>]
|
292
|
+
[REXML::Text, "\n\n"]
|
293
|
+
[REXML::Element, <book category='web'> ... </>]
|
294
|
+
[REXML::Text, "\n\n"]
|
295
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
296
|
+
[REXML::Text, "\n\n"]
|
297
|
+
|
298
|
+
== Exploring an Element
|
299
|
+
|
300
|
+
An REXML::Element object represents an XML element.
|
301
|
+
|
302
|
+
The object inherits from its ancestor classes:
|
303
|
+
|
304
|
+
- REXML::Child (includes module REXML::Node)
|
305
|
+
- REXML::Parent (includes module {Enumerable}[rdoc-ref:Enumerable]).
|
306
|
+
- REXML::Element (includes module REXML::Namespace).
|
307
|
+
|
308
|
+
This section covers methods:
|
309
|
+
|
310
|
+
- Defined in REXML::Element itself.
|
311
|
+
- Inherited from REXML::Parent and REXML::Child.
|
312
|
+
- Included from REXML::Node.
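The inheritance chain and included modules listed above can be confirmed from Ruby itself; the sketch below only inspects the classes and parses nothing.

```ruby
require 'rexml/document'

# Superclass chain: Document < Element < Parent < Child.
p REXML::Document.superclass
p REXML::Element.superclass
p REXML::Parent.superclass

# Modules mixed in along the way.
p REXML::Child.include?(REXML::Node)
p REXML::Parent.include?(Enumerable)
p REXML::Element.include?(REXML::Namespace)
```

Knowing where a method is defined helps when looking it up in the API documentation.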
|
313
|
+
|
314
|
+
=== Inside the Element
|
315
|
+
|
316
|
+
[Brief String Representation]
|
317
|
+
|
318
|
+
Use method REXML::Element#inspect to retrieve a brief string representation.
|
319
|
+
|
320
|
+
doc.root.inspect # => "<bookstore> ... </>"
|
321
|
+
|
322
|
+
The ellipsis (<tt>...</tt>) indicates that the element has children.
|
323
|
+
When there are no children, the ellipsis is omitted:
|
324
|
+
|
325
|
+
Element.new('foo').inspect # => "<foo/>"
|
326
|
+
|
327
|
+
If the element has attributes, those are also included:
|
328
|
+
|
329
|
+
doc.root.elements.first.inspect # => "<book category='cooking'> ... </>"
|
330
|
+
|
331
|
+
[Extended String Representation]
|
332
|
+
|
333
|
+
Use inherited method REXML::Child#bytes to retrieve an extended
|
334
|
+
string representation.
|
335
|
+
|
336
|
+
doc.root.bytes # => "<bookstore>\n\n<book category='cooking'>\n <title lang='en'>Everyday Italian</title>\n <author>Giada De Laurentiis</author>\n <year>2005</year>\n <price>30.00</price>\n</book>\n\n<book category='children'>\n <title lang='en'>Harry Potter</title>\n <author>J K. Rowling</author>\n <year>2005</year>\n <price>29.99</price>\n</book>\n\n<book category='web'>\n <title lang='en'>XQuery Kick Start</title>\n <author>James McGovern</author>\n <author>Per Bothner</author>\n <author>Kurt Cagle</author>\n <author>James Linn</author>\n <author>Vaidyanathan Nagarajan</author>\n <year>2003</year>\n <price>49.99</price>\n</book>\n\n<book category='web' cover='paperback'>\n <title lang='en'>Learning XML</title>\n <author>Erik T. Ray</author>\n <year>2003</year>\n <price>39.95</price>\n</book>\n\n</bookstore>"
|
337
|
+
|
338
|
+
[Node Type]
|
339
|
+
|
340
|
+
Use method REXML::Element#node_type to retrieve the node type (always +:element+):
|
341
|
+
|
342
|
+
doc.root.node_type # => :element
|
343
|
+
|
344
|
+
[Raw Mode]
|
345
|
+
|
346
|
+
Use method REXML::Element#raw to determine whether (+true+ or +nil+)
|
347
|
+
raw mode is set.
|
348
|
+
|
349
|
+
doc.root.raw # => nil
|
350
|
+
|
351
|
+
[Context]
|
352
|
+
|
353
|
+
Use method REXML::Element#context to retrieve the context hash
|
354
|
+
(see {Element Context}[../context_rdoc.html]):
|
355
|
+
|
356
|
+
doc.root.context # => {}
|
357
|
+
|
358
|
+
=== Relationships
|
359
|
+
|
360
|
+
An element may have:
|
361
|
+
|
362
|
+
- Ancestors.
|
363
|
+
- Siblings.
|
364
|
+
- Children.
|
365
|
+
|
366
|
+
==== Ancestors
|
367
|
+
|
368
|
+
[Containing Document]
|
369
|
+
|
370
|
+
Use method REXML::Element#document to retrieve the containing document, if any:
|
371
|
+
|
372
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
373
|
+
ele.document # => <UNDEFINED> ... </>
|
374
|
+
ele = Element.new('foo') # => <foo/>
|
375
|
+
ele.document # => nil
|
376
|
+
|
377
|
+
[Root Element]
|
378
|
+
|
379
|
+
Use method REXML::Element#root to retrieve the root element:
|
380
|
+
|
381
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
382
|
+
ele.root # => <bookstore> ... </>
|
383
|
+
ele = Element.new('foo') # => <foo/>
|
384
|
+
ele.root # => <foo/>
|
385
|
+
|
386
|
+
[Root Node]
|
387
|
+
|
388
|
+
Use method REXML::Element#root_node to retrieve the most distant ancestor,
|
389
|
+
which is the containing document, if any, otherwise the root element:
|
390
|
+
|
391
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
392
|
+
ele.root_node # => <UNDEFINED> ... </>
|
393
|
+
ele = Element.new('foo') # => <foo/>
|
394
|
+
ele.root_node # => <foo/>
|
395
|
+
|
396
|
+
[Parent]
|
397
|
+
|
398
|
+
Use inherited method REXML::Child#parent to retrieve the parent:
|
399
|
+
|
400
|
+
ele = doc.root # => <bookstore> ... </>
|
401
|
+
ele.parent # => <UNDEFINED> ... </>
|
402
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
403
|
+
ele.parent # => <bookstore> ... </>
|
404
|
+
|
405
|
+
Use included method REXML::Node#index_in_parent to retrieve the index
|
406
|
+
of the element among all of its parent's children (not just the element children).
|
407
|
+
Note that, like the index for <tt>doc.root.elements[n]</tt>,
|
408
|
+
the returned index is 1-based, although it counts children of all types.
|
409
|
+
|
410
|
+
doc.root.children # =>
|
411
|
+
# ["\n\n",
|
412
|
+
# <book category='cooking'> ... </>,
|
413
|
+
# "\n\n",
|
414
|
+
# <book category='children'> ... </>,
|
415
|
+
# "\n\n",
|
416
|
+
# <book category='web'> ... </>,
|
417
|
+
# "\n\n",
|
418
|
+
# <book category='web' cover='paperback'> ... </>,
|
419
|
+
# "\n\n"]
|
420
|
+
ele = doc.root.elements[1] # => <book category='cooking'> ... </>
|
421
|
+
ele.index_in_parent # => 2
|
422
|
+
ele = doc.root.elements[2] # => <book category='children'> ... </>
|
423
|
+
ele.index_in_parent # => 4
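To see how the two ways of indexing relate, the sketch below compares REXML::Parent#index with REXML::Node#index_in_parent on a small hypothetical document whose first child is a text node.

```ruby
require 'rexml/document'

# Hypothetical sample: the children of root are a text node, <a/>, and <b/>.
doc = REXML::Document.new('<root>text<a/><b/></root>')
root = doc.root
a = root.elements[1]            # first element child

zero_based = root.index(a)      # position among all children, counted from 0
one_based = a.index_in_parent   # position among all children, counted from 1

p [zero_based, one_based]
p root[zero_based].equal?(a)    # the 0-based value works with Element#[]
```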
|
424
|
+
|
425
|
+
==== Siblings
|
426
|
+
|
427
|
+
[Next Element]
|
428
|
+
|
429
|
+
Use method REXML::Element#next_element to retrieve the first following
|
430
|
+
sibling that is itself an element (+nil+ if there is none):
|
431
|
+
|
432
|
+
ele = doc.root.elements[1]
|
433
|
+
while ele do
|
434
|
+
p [ele.class, ele]
|
435
|
+
ele = ele.next_element
|
436
|
+
end
|
437
|
+
p ele
|
438
|
+
|
439
|
+
Output:
|
440
|
+
|
441
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
442
|
+
[REXML::Element, <book category='children'> ... </>]
|
443
|
+
[REXML::Element, <book category='web'> ... </>]
|
444
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
445
|
+
|
446
|
+
[Previous Element]
|
447
|
+
|
448
|
+
Use method REXML::Element#previous_element to retrieve the first preceding
|
449
|
+
sibling that is itself an element (+nil+ if there is none):
|
450
|
+
|
451
|
+
ele = doc.root.elements[4]
|
452
|
+
while ele do
|
453
|
+
p [ele.class, ele]
|
454
|
+
ele = ele.previous_element
|
455
|
+
end
|
456
|
+
p ele
|
457
|
+
|
458
|
+
Output:
|
459
|
+
|
460
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
461
|
+
[REXML::Element, <book category='web'> ... </>]
|
462
|
+
[REXML::Element, <book category='children'> ... </>]
|
463
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
464
|
+
|
465
|
+
[Next Node]
|
466
|
+
|
467
|
+
Use included method REXML::Node#next_sibling_node
|
468
|
+
(or its alias <tt>next_sibling</tt>) to retrieve the first following node
|
469
|
+
regardless of its class:
|
470
|
+
|
471
|
+
node = doc.root.children[0]
|
472
|
+
while node do
|
473
|
+
p [node.class, node]
|
474
|
+
node = node.next_sibling
|
475
|
+
end
|
476
|
+
p node
|
477
|
+
|
478
|
+
Output:
|
479
|
+
|
480
|
+
[REXML::Text, "\n\n"]
|
481
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
482
|
+
[REXML::Text, "\n\n"]
|
483
|
+
[REXML::Element, <book category='children'> ... </>]
|
484
|
+
[REXML::Text, "\n\n"]
|
485
|
+
[REXML::Element, <book category='web'> ... </>]
|
486
|
+
[REXML::Text, "\n\n"]
|
487
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
488
|
+
[REXML::Text, "\n\n"]
|
489
|
+
|
490
|
+
[Previous Node]
|
491
|
+
|
492
|
+
Use included method REXML::Node#previous_sibling_node
|
493
|
+
(or its alias <tt>previous_sibling</tt>) to retrieve the first preceding node
|
494
|
+
regardless of its class:
|
495
|
+
|
496
|
+
node = doc.root.children[-1]
|
497
|
+
while node do
|
498
|
+
p [node.class, node]
|
499
|
+
node = node.previous_sibling
|
500
|
+
end
|
501
|
+
p node
|
502
|
+
|
503
|
+
Output:
|
504
|
+
|
505
|
+
[REXML::Text, "\n\n"]
|
506
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
507
|
+
[REXML::Text, "\n\n"]
|
508
|
+
[REXML::Element, <book category='web'> ... </>]
|
509
|
+
[REXML::Text, "\n\n"]
|
510
|
+
[REXML::Element, <book category='children'> ... </>]
|
511
|
+
[REXML::Text, "\n\n"]
|
512
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
513
|
+
[REXML::Text, "\n\n"]
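The difference between the element-only and the all-node sibling methods shows up as soon as text sits between two elements. A small sketch, using hypothetical inline XML:

```ruby
require 'rexml/document'

# Hypothetical sample: a text node sits between <a/> and <b/>.
doc = REXML::Document.new('<root><a/>text<b/></root>')
a = doc.root.elements['a']
b = doc.root.elements['b']

p a.next_sibling_node.class  # the very next node, here a text node
p a.next_element             # skips the text node and returns <b/>
p b.previous_element         # skips the text node and returns <a/>
p b.next_element             # nothing follows <b/>
```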
|
514
|
+
|
515
|
+
==== Children
|
516
|
+
|
517
|
+
[Child Count]
|
518
|
+
|
519
|
+
Use inherited method REXML::Parent#size to retrieve the count
|
520
|
+
of nodes (of all types) in the element:
|
521
|
+
|
522
|
+
doc.root.size # => 9
|
523
|
+
|
524
|
+
[Child Nodes]
|
525
|
+
|
526
|
+
Use inherited method REXML::Parent#children to retrieve an array
|
527
|
+
of the child nodes (of all types):
|
528
|
+
|
529
|
+
doc.root.children # =>
|
530
|
+
# ["\n\n",
|
531
|
+
# <book category='cooking'> ... </>,
|
532
|
+
# "\n\n",
|
533
|
+
# <book category='children'> ... </>,
|
534
|
+
# "\n\n",
|
535
|
+
# <book category='web'> ... </>,
|
536
|
+
# "\n\n",
|
537
|
+
# <book category='web' cover='paperback'> ... </>,
|
538
|
+
# "\n\n"]
|
539
|
+
|
540
|
+
[Child at Index]
|
541
|
+
|
542
|
+
Use method REXML::Element#[] to retrieve the child at a given numerical index,
|
543
|
+
or +nil+ if there is no such child:
|
544
|
+
|
545
|
+
doc.root[0] # => "\n\n"
|
546
|
+
doc.root[1] # => <book category='cooking'> ... </>
|
547
|
+
doc.root[7] # => <book category='web' cover='paperback'> ... </>
|
548
|
+
doc.root[8] # => "\n\n"
|
549
|
+
|
550
|
+
doc.root[-1] # => "\n\n"
|
551
|
+
doc.root[-2] # => <book category='web' cover='paperback'> ... </>
|
552
|
+
|
553
|
+
doc.root[50] # => nil
|
554
|
+
|
555
|
+
[Index of Child]
|
556
|
+
|
557
|
+
Use method REXML::Parent#index to retrieve the zero-based child index
|
558
|
+
of the given object, or <tt>#size - 1</tt> if there is no such child:
|
559
|
+
|
560
|
+
ele = doc.root # => <bookstore> ... </>
|
561
|
+
ele.index(ele[0]) # => 0
|
562
|
+
ele.index(ele[1]) # => 1
|
563
|
+
ele.index(ele[7]) # => 7
|
564
|
+
ele.index(ele[8]) # => 8
|
565
|
+
|
566
|
+
ele.index(ele[-1]) # => 8
|
567
|
+
ele.index(ele[-2]) # => 7
|
568
|
+
|
569
|
+
ele.index(ele[50]) # => 8
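Because a missing child yields the same value as the last child, the result of REXML::Parent#index is ambiguous on its own. One possible workaround is sketched below; +child_index_or_nil+ is a hypothetical helper, not part of \REXML.

```ruby
require 'rexml/document'

# Hypothetical helper: returns the 0-based index of node among parent's
# children, or nil when node is not a child of parent.
def child_index_or_nil(parent, node)
  parent.children.index { |child| child.equal?(node) }
end

doc = REXML::Document.new('<root><a/><b/></root>')
root = doc.root
stray = REXML::Element.new('stray')   # never added to the document

p root.index(stray)                    # same value as for the last child
p child_index_or_nil(root, stray)      # nil: not a child
p child_index_or_nil(root, root[1])    # index of an actual child
```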
|
570
|
+
|
571
|
+
[Element Children]
|
572
|
+
|
573
|
+
Use method REXML::Element#has_elements? to determine whether the element
|
574
|
+
has element children:
|
575
|
+
|
576
|
+
doc.root.has_elements? # => true
|
577
|
+
REXML::Element.new('foo').has_elements? # => false
|
578
|
+
|
579
|
+
Use method REXML::Element#elements to retrieve the REXML::Elements object
|
580
|
+
containing the element children:
|
581
|
+
|
582
|
+
eles = doc.root.elements
|
583
|
+
eles # => #<REXML::Elements:0x000001ee2848e960 @element=<bookstore> ... </>>
|
584
|
+
eles.size # => 4
|
585
|
+
eles.each {|e| p [e.class, e] }
|
586
|
+
|
587
|
+
Output:
|
588
|
+
|
589
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
590
|
+
[REXML::Element, <book category='children'> ... </>]
|
591
|
+
[REXML::Element, <book category='web'> ... </>]
|
592
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
594
|
+
|
595
|
+
Note that while in this example, all the element children of the root element are
|
596
|
+
elements of the same name, <tt>'book'</tt>, that is not true of all documents;
|
597
|
+
a root element (or any other element) may have any mixture of child elements.
|
598
|
+
|
599
|
+
[CDATA Children]
|
600
|
+
|
601
|
+
Use method REXML::Element#cdatas to retrieve a frozen array of CDATA children:
|
602
|
+
|
603
|
+
my_xml = <<-EOT
|
604
|
+
<root>
|
605
|
+
<![CDATA[foo]]>
|
606
|
+
<![CDATA[bar]]>
|
607
|
+
</root>
|
608
|
+
EOT
|
609
|
+
my_doc = REXML::Document.new(my_xml)
|
610
|
+
cdatas = my_doc.root.cdatas
|
611
|
+
cdatas.frozen? # => true
|
612
|
+
cdatas.map {|cd| cd.class } # => [REXML::CData, REXML::CData]
|
613
|
+
|
614
|
+
[Comment Children]
|
615
|
+
|
616
|
+
Use method REXML::Element#comments to retrieve a frozen array of comment children:
|
617
|
+
|
618
|
+
my_xml = <<-EOT
|
619
|
+
<root>
|
620
|
+
<!--foo-->
|
621
|
+
<!--bar-->
|
622
|
+
</root>
|
623
|
+
EOT
|
624
|
+
my_doc = REXML::Document.new(my_xml)
|
625
|
+
comments = my_doc.root.comments
|
626
|
+
comments.frozen? # => true
|
627
|
+
comments.map {|c| c.class } # => [REXML::Comment, REXML::Comment]
|
628
|
+
comments.map {|c| c.to_s } # => ["foo", "bar"]
|
629
|
+
|
630
|
+
[Processing Instruction Children]
|
631
|
+
|
632
|
+
Use method REXML::Element#instructions to retrieve a frozen array
|
633
|
+
of processing instruction children:
|
634
|
+
|
635
|
+
my_xml = <<-EOT
|
636
|
+
<root>
|
637
|
+
<?target0 foo?>
|
638
|
+
<?target1 bar?>
|
639
|
+
</root>
|
640
|
+
EOT
|
641
|
+
my_doc = REXML::Document.new(my_xml)
|
642
|
+
instrs = my_doc.root.instructions
|
643
|
+
instrs.frozen? # => true
|
644
|
+
instrs.map {|i| i.class } # => [REXML::Instruction, REXML::Instruction]
|
645
|
+
instrs.map {|i| i.to_s } # => ["<?target0 foo?>", "<?target1 bar?>"]
|
646
|
+
|
647
|
+
[Text Children]
|
648
|
+
|
649
|
+
Use method REXML::Element#has_text? to determine whether the element
|
650
|
+
has text children:
|
651
|
+
|
652
|
+
doc.root.has_text? # => true
|
653
|
+
REXML::Element.new('foo').has_text? # => false
|
654
|
+
|
655
|
+
Use method REXML::Element#texts to retrieve a frozen array of text children:
|
656
|
+
|
657
|
+
my_xml = '<root><a/>text<b/>more<c/></root>'
|
658
|
+
my_doc = REXML::Document.new(my_xml)
|
659
|
+
texts = my_doc.root.texts
|
660
|
+
texts.frozen? # => true
|
661
|
+
texts.map {|t| t.class } # => [REXML::Text, REXML::Text]
|
662
|
+
texts.map {|t| t.to_s } # => ["text", "more"]
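Since the arrays returned by these methods are frozen, they cannot be modified in place. The sketch below shows the error and one way around it, working on a copy; the copy is independent of the element, so changing it does not change the document.

```ruby
require 'rexml/document'

doc = REXML::Document.new('<root><a/>text<b/>more<c/></root>')
texts = doc.root.texts

# Appending to the frozen array raises FrozenError.
frozen_error_raised = false
begin
  texts << REXML::Text.new('extra')
rescue FrozenError
  frozen_error_raised = true
end

# A duplicate is not frozen and may be changed freely.
editable = texts.dup
editable << REXML::Text.new('extra')

p frozen_error_raised
p editable.size
p doc.root.texts.size   # the element itself is unchanged
```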
|
663
|
+
|
664
|
+
[Parenthood]
|
665
|
+
|
666
|
+
Use inherited method REXML::Parent#parent? to determine whether the element is a parent;
|
667
|
+
always returns +true+; only REXML::Child#parent? returns +false+.
|
668
|
+
|
669
|
+
doc.root.parent? # => true
|
670
|
+
|
671
|
+
=== Element Attributes
|
672
|
+
|
673
|
+
Use method REXML::Element#has_attributes? to determine whether the element
|
674
|
+
has attributes:
|
675
|
+
|
676
|
+
ele = doc.root # => <bookstore> ... </>
|
677
|
+
ele.has_attributes? # => false
|
678
|
+
ele = ele.elements.first # => <book category='cooking'> ... </>
|
679
|
+
ele.has_attributes? # => true
|
680
|
+
|
681
|
+
Use method REXML::Element#attributes to return the hash
|
682
|
+
containing the attributes for the element.
|
683
|
+
Each hash key is a string attribute name;
|
684
|
+
each hash value is an REXML::Attribute object.
|
685
|
+
|
686
|
+
ele = doc.root # => <bookstore> ... </>
|
687
|
+
attrs = ele.attributes # => {}
|
688
|
+
|
689
|
+
ele = ele.elements.first # => <book category='cooking'> ... </>
|
690
|
+
attrs = ele.attributes # => {"category"=>category='cooking'}
|
691
|
+
attrs.size # => 1
|
692
|
+
attr_name = attrs.keys.first # => "category"
|
693
|
+
attr_name.class # => String
|
694
|
+
attr_value = attrs.values.first # => category='cooking'
|
695
|
+
attr_value.class # => REXML::Attribute
|
696
|
+
|
697
|
+
Use method REXML::Element#[] to retrieve the string value for a given attribute,
|
698
|
+
which may be given as either a string or a symbol:
|
699
|
+
|
700
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
701
|
+
attr_value = ele['category'] # => "cooking"
|
702
|
+
attr_value.class # => String
|
703
|
+
ele['nosuch'] # => nil
|
704
|
+
|
705
|
+
Use method REXML::Element#attribute to retrieve the value of a named attribute:
|
706
|
+
|
707
|
+
my_xml = "<root xmlns:a='a' a:x='a:x' x='x'/>"
|
708
|
+
my_doc = REXML::Document.new(my_xml)
|
709
|
+
my_doc.root.attribute("x") # => x='x'
|
710
|
+
my_doc.root.attribute("x", "a") # => a:x='a:x'
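The access paths above return different kinds of objects, which is easy to overlook. A brief sketch, using a hypothetical element with one plain and one namespaced attribute (the namespace URI is made up for the example):

```ruby
require 'rexml/document'

# Hypothetical sample: 'urn:example:a' is an invented namespace URI.
doc = REXML::Document.new("<book xmlns:a='urn:example:a' category='cooking' a:id='7'/>")
book = doc.root

by_bracket = book['category']              # String value
by_hash = book.attributes['category']      # String value as well
attr = book.attribute('category')          # REXML::Attribute object
namespaced = book.attribute('id', 'urn:example:a')  # looked up by namespace URI

p [by_bracket, by_hash]
p [attr.class, attr.value]
p namespaced.value
```

Use the string forms when only the value matters, and the REXML::Attribute object when the name, prefix, or namespace is also needed.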
|
711
|
+
|
712
|
+
== Whitespace
|
713
|
+
|
714
|
+
Use method REXML::Element#ignore_whitespace_nodes to determine whether
|
715
|
+
whitespace nodes were ignored when the XML was parsed;
|
716
|
+
returns +true+ if so, +nil+ otherwise.
|
717
|
+
|
718
|
+
Use method REXML::Element#whitespace to determine whether whitespace
|
719
|
+
is respected for the element; returns +true+ if so, +false+ otherwise.
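The two methods can be exercised as follows. This is a sketch: the XML is hypothetical, and the contexts are passed as the second argument to <tt>Document.new</tt>.

```ruby
require 'rexml/document'

# Hypothetical sample with whitespace-only text around <a/>.
xml = '<root> <a/> </root>'

plain = REXML::Document.new(xml)
ignoring = REXML::Document.new(xml, { ignore_whitespace_nodes: :all })
compressing = REXML::Document.new(xml, { compress_whitespace: :all })

p plain.root.ignore_whitespace_nodes     # not set by default
p ignoring.root.ignore_whitespace_nodes  # set through the context
p ignoring.root.size                     # only <a/> remains as a child

p plain.root.whitespace                  # whitespace is respected by default
p compressing.root.whitespace            # compression turns that off
```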
|
720
|
+
|
721
|
+
== Namespaces
|
722
|
+
|
723
|
+
Use method REXML::Element#namespace to retrieve the string namespace URI
|
724
|
+
for the element, which may derive from one of its ancestors:
|
725
|
+
|
726
|
+
xml_string = <<-EOT
|
727
|
+
<root>
|
728
|
+
<a xmlns='1' xmlns:y='2'>
|
729
|
+
<b/>
|
730
|
+
<c xmlns:z='3'/>
|
731
|
+
</a>
|
732
|
+
</root>
|
733
|
+
EOT
|
734
|
+
d = Document.new(xml_string)
|
735
|
+
b = d.elements['//b']
|
736
|
+
b.namespace # => "1"
|
737
|
+
b.namespace('y') # => "2"
|
738
|
+
b.namespace('nosuch') # => nil
|
739
|
+
|
740
|
+
Use method REXML::Element#namespaces to retrieve a hash of all defined namespaces
|
741
|
+
in the element and its ancestors:
|
742
|
+
|
743
|
+
xml_string = <<-EOT
|
744
|
+
<root>
|
745
|
+
<a xmlns:x='1' xmlns:y='2'>
|
746
|
+
<b/>
|
747
|
+
<c xmlns:z='3'/>
|
748
|
+
</a>
|
749
|
+
</root>
|
750
|
+
EOT
|
751
|
+
d = Document.new(xml_string)
|
752
|
+
d.elements['//a'].namespaces # => {"x"=>"1", "y"=>"2"}
|
753
|
+
d.elements['//b'].namespaces # => {"x"=>"1", "y"=>"2"}
|
754
|
+
d.elements['//c'].namespaces # => {"x"=>"1", "y"=>"2", "z"=>"3"}
|
755
|
+
|
756
|
+
Use method REXML::Element#prefixes to retrieve an array of the string prefixes (names)
|
757
|
+
of all defined namespaces in the element and its ancestors:
|
758
|
+
|
759
|
+
xml_string = <<-EOT
|
760
|
+
<root>
|
761
|
+
<a xmlns:x='1' xmlns:y='2'>
|
762
|
+
<b/>
|
763
|
+
<c xmlns:z='3'/>
|
764
|
+
</a>
|
765
|
+
</root>
|
766
|
+
EOT
|
767
|
+
d = Document.new(xml_string, {compress_whitespace: :all})
|
768
|
+
d.elements['//a'].prefixes # => ["x", "y"]
|
769
|
+
d.elements['//b'].prefixes # => ["x", "y"]
|
770
|
+
d.elements['//c'].prefixes # => ["x", "y", "z"]
|
771
|
+
|
772
|
+
== Traversing
|
773
|
+
|
774
|
+
You can use certain methods to traverse children of the element.
|
775
|
+
Each child that meets given criteria is yielded to the given block.
|
776
|
+
|
777
|
+
[Traverse All Children]
|
778
|
+
|
779
|
+
Use inherited method REXML::Parent#each (or its alias #each_child) to traverse
|
780
|
+
all children of the element:
|
781
|
+
|
782
|
+
doc.root.each {|child| p [child.class, child] }
|
783
|
+
|
784
|
+
Output:
|
785
|
+
|
786
|
+
[REXML::Text, "\n\n"]
|
787
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
788
|
+
[REXML::Text, "\n\n"]
|
789
|
+
[REXML::Element, <book category='children'> ... </>]
|
790
|
+
[REXML::Text, "\n\n"]
|
791
|
+
[REXML::Element, <book category='web'> ... </>]
|
792
|
+
[REXML::Text, "\n\n"]
|
793
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
794
|
+
[REXML::Text, "\n\n"]
|
795
|
+
|
796
|
+
[Traverse Element Children]
|
797
|
+
|
798
|
+
Use method REXML::Element#each_element to traverse only the element children
|
799
|
+
of the element:
|
800
|
+
|
801
|
+
doc.root.each_element {|e| p [e.class, e] }
|
802
|
+
|
803
|
+
Output:
|
804
|
+
|
805
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
806
|
+
[REXML::Element, <book category='children'> ... </>]
|
807
|
+
[REXML::Element, <book category='web'> ... </>]
|
808
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
809
|
+
|
810
|
+
[Traverse Element Children with Attribute]
|
811
|
+
|
812
|
+
Use method REXML::Element#each_element_with_attribute with the single argument
|
813
|
+
+attr_name+ to traverse each element child that has the given attribute:
|
814
|
+
|
815
|
+
my_doc = Document.new '<a><b id="1"/><c id="2"/><d id="1"/><e/></a>'
|
816
|
+
my_doc.root.each_element_with_attribute('id') {|e| p [e.class, e] }
|
817
|
+
|
818
|
+
Output:
|
819
|
+
|
820
|
+
[REXML::Element, <b id='1'/>]
|
821
|
+
[REXML::Element, <c id='2'/>]
|
822
|
+
[REXML::Element, <d id='1'/>]
|
823
|
+
|
824
|
+
Use the same method with a second argument +value+ to traverse
|
825
|
+
each element child that has the given attribute and value:
|
826
|
+
|
827
|
+
my_doc.root.each_element_with_attribute('id', '1') {|e| p [e.class, e] }
|
828
|
+
|
829
|
+
Output:
|
830
|
+
|
831
|
+
[REXML::Element, <b id='1'/>]
|
832
|
+
[REXML::Element, <d id='1'/>]
|
833
|
+
|
834
|
+
Use the same method with a third argument +max+ to traverse
|
835
|
+
no more than the given number of element children:
|
836
|
+
|
837
|
+
my_doc.root.each_element_with_attribute('id', '1', 1) {|e| p [e.class, e] }
|
838
|
+
|
839
|
+
Output:
|
840
|
+
|
841
|
+
[REXML::Element, <b id='1'/>]
|
842
|
+
|
843
|
+
Use the same method with a fourth argument +xpath+ to traverse
|
844
|
+
only those element children that match the given xpath:
|
845
|
+
|
846
|
+
my_doc.root.each_element_with_attribute('id', '1', 2, '//d') {|e| p [e.class, e] }
|
847
|
+
|
848
|
+
Output:
|
849
|
+
|
850
|
+
[REXML::Element, <d id='1'/>]
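Because the matching elements are yielded to a block, collecting them takes only an accumulator. A sketch reusing the XML from the examples above; the variable names are arbitrary:

```ruby
require 'rexml/document'

my_doc = REXML::Document.new('<a><b id="1"/><c id="2"/><d id="1"/><e/></a>')

# Collect the names of element children whose id attribute is '1'.
names = []
my_doc.root.each_element_with_attribute('id', '1') { |e| names << e.name }

# Same criteria, but stop after the first match.
first_only = []
my_doc.root.each_element_with_attribute('id', '1', 1) { |e| first_only << e.name }

p names
p first_only
```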
|
851
|
+
|
852
|
+
[Traverse Element Children with Text]
|
853
|
+
|
854
|
+
Use method REXML::Element#each_element_with_text with no arguments
|
855
|
+
to traverse those element children that have text:
|
856
|
+
|
857
|
+
my_doc = Document.new '<a><b>b</b><c>b</c><d>d</d><e/></a>'
|
858
|
+
my_doc.root.each_element_with_text {|e| p [e.class, e] }
|
859
|
+
|
860
|
+
Output:
|
861
|
+
|
862
|
+
[REXML::Element, <b> ... </>]
|
863
|
+
[REXML::Element, <c> ... </>]
|
864
|
+
[REXML::Element, <d> ... </>]
|
865
|
+
|
866
|
+
Use the same method with the single argument +text+ to traverse
|
867
|
+
those element children that have exactly that text:
|
868
|
+
|
869
|
+
my_doc.root.each_element_with_text('b') {|e| p [e.class, e] }
|
870
|
+
|
871
|
+
Output:
|
872
|
+
|
873
|
+
[REXML::Element, <b> ... </>]
|
874
|
+
[REXML::Element, <c> ... </>]
|
875
|
+
|
876
|
+
Use the same method with additional second argument +max+ to traverse
|
877
|
+
no more than the given number of element children:
|
878
|
+
|
879
|
+
my_doc.root.each_element_with_text('b', 1) {|e| p [e.class, e] }
|
880
|
+
|
881
|
+
Output:
|
882
|
+
|
883
|
+
[REXML::Element, <b> ... </>]
|
884
|
+
|
885
|
+
Use the same method with additional third argument +xpath+ to traverse
|
886
|
+
only those element children that also match the given xpath:
|
887
|
+
|
888
|
+
my_doc.root.each_element_with_text('b', 2, '//c') {|e| p [e.class, e] }
|
889
|
+
|
890
|
+
Output:
|
891
|
+
|
892
|
+
[REXML::Element, <c> ... </>]
|
893
|
+
|
894
|
+
[Traverse Element Children's Indexes]
|
895
|
+
|
896
|
+
Use inherited method REXML::Parent#each_index to traverse all children's indexes
|
897
|
+
(not just those of element children):
|
898
|
+
|
899
|
+
doc.root.each_index {|i| print i }
|
900
|
+
|
901
|
+
Output:
|
902
|
+
|
903
|
+
012345678
|
904
|
+
|
905
|
+
[Traverse Children Recursively]
|
906
|
+
|
907
|
+
Use included method REXML::Node#each_recursive to traverse all children recursively:
|
908
|
+
|
909
|
+
doc.root.each_recursive {|child| p [child.class, child] }
|
910
|
+
|
911
|
+
Output:
|
912
|
+
|
913
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
914
|
+
[REXML::Element, <title lang='en'> ... </>]
|
915
|
+
[REXML::Element, <author> ... </>]
|
916
|
+
[REXML::Element, <year> ... </>]
|
917
|
+
[REXML::Element, <price> ... </>]
|
918
|
+
[REXML::Element, <book category='children'> ... </>]
|
919
|
+
[REXML::Element, <title lang='en'> ... </>]
|
920
|
+
[REXML::Element, <author> ... </>]
|
921
|
+
[REXML::Element, <year> ... </>]
|
922
|
+
[REXML::Element, <price> ... </>]
|
923
|
+
[REXML::Element, <book category='web'> ... </>]
|
924
|
+
[REXML::Element, <title lang='en'> ... </>]
|
925
|
+
[REXML::Element, <author> ... </>]
|
926
|
+
[REXML::Element, <author> ... </>]
|
927
|
+
[REXML::Element, <author> ... </>]
|
928
|
+
[REXML::Element, <author> ... </>]
|
929
|
+
[REXML::Element, <author> ... </>]
|
930
|
+
[REXML::Element, <year> ... </>]
|
931
|
+
[REXML::Element, <price> ... </>]
|
932
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
933
|
+
[REXML::Element, <title lang='en'> ... </>]
|
934
|
+
[REXML::Element, <author> ... </>]
|
935
|
+
[REXML::Element, <year> ... </>]
|
936
|
+
[REXML::Element, <price> ... </>]
|
937
|
+
|
938
|
+
== Searching
|
939
|
+
|
940
|
+
You can use certain methods to search among the descendants of an element.
|
941
|
+
|
942
|
+
Use method REXML::Element#get_elements to retrieve all elements (children or deeper descendants)
|
943
|
+
that match the given +xpath+:
|
944
|
+
|
945
|
+
xml_string = <<-EOT
|
946
|
+
<root>
|
947
|
+
<a level='1'>
|
948
|
+
<a level='2'/>
|
949
|
+
</a>
|
950
|
+
</root>
|
951
|
+
EOT
|
952
|
+
d = Document.new(xml_string)
|
953
|
+
d.root.get_elements('//a') # => [<a level='1'> ... </>, <a level='2'/>]
|
954
|
+
|
955
|
+
Use method REXML::Element#get_text with no argument to retrieve the first text node
|
956
|
+
among the element's children:
|
957
|
+
|
958
|
+
my_doc = Document.new "<p>some text <b>this is bold!</b> more text</p>"
|
959
|
+
text_node = my_doc.root.get_text
|
960
|
+
text_node.class # => REXML::Text
|
961
|
+
text_node.to_s # => "some text "
|
962
|
+
|
963
|
+
Use the same method with argument +xpath+ to retrieve the first text node
|
964
|
+
in the first element child that matches the xpath (an integer selects an element child by its 1-based index):
|
965
|
+
|
966
|
+
my_doc.root.get_text(1) # => "this is bold!"
|
967
|
+
|
968
|
+
Use method REXML::Element#text with no argument to retrieve the text
|
969
|
+
from the first text node among the element's children:
|
970
|
+
|
971
|
+
my_doc = Document.new "<p>some text <b>this is bold!</b> more text</p>"
|
972
|
+
text_node = my_doc.root.text
|
973
|
+
text_node.class # => String
|
974
|
+
text_node # => "some text "
|
975
|
+
|
976
|
+
Use the same method with argument +xpath+ to retrieve the text from the first text node
|
977
|
+
in the first element child that matches the xpath (an integer selects an element child by its 1-based index):
|
978
|
+
|
979
|
+
my_doc.root.text(1) # => "this is bold!"
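The difference between the two methods is the return type. As a small sketch (the variable names +para+, +node+, and +string+ are illustrative, and the string xpath <tt>'b'</tt> is used here in place of the index):

```ruby
require 'rexml/document'

para = REXML::Document.new('<p>some text <b>this is bold!</b> more text</p>')

node   = para.root.get_text('b')  # a REXML::Text node
string = para.root.text('b')      # a plain String

node.class    # => REXML::Text
node.to_s     # => "this is bold!"
string.class  # => String
string        # => "this is bold!"

# When no element child matches, the result is nil.
missing = para.root.text('nosuch')  # => nil
```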
|
980
|
+
|
981
|
+
Use included method REXML::Node#find_first_recursive
|
982
|
+
to retrieve the first descendant element
|
983
|
+
for which the given block returns a truthy value, or +nil+ if none:
|
984
|
+
|
985
|
+
doc.root.find_first_recursive do |ele|
|
986
|
+
ele.name == 'price'
|
987
|
+
end # => <price> ... </>
|
988
|
+
doc.root.find_first_recursive do |ele|
|
989
|
+
ele.name == 'nosuch'
|
990
|
+
end # => nil
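The search visits descendants depth-first, in document order, so a nested match is found before a later sibling. A minimal sketch on a hypothetical two-level document (the names +tree+ and +found+ are illustrative):

```ruby
require 'rexml/document'

tree = REXML::Document.new('<root><a><b id="x"/></a><b id="y"/></root>')

# Both <b> elements match; the one nested under <a> comes first
# in document order, so it is the one returned.
found = tree.root.find_first_recursive { |node| node.name == 'b' }
found.attributes['id']  # => "x"
```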
|
991
|
+
|
992
|
+
== Editing
|
993
|
+
|
994
|
+
=== Editing a Document
|
995
|
+
|
996
|
+
[Creating a Document]
|
997
|
+
|
998
|
+
Create a new document with method REXML::Document::new:
|
999
|
+
|
1000
|
+
doc = Document.new(source_string)
|
1001
|
+
empty_doc = REXML::Document.new
|
1002
|
+
|
1003
|
+
[Adding to the Document]
|
1004
|
+
|
1005
|
+
Add an XML declaration with method REXML::Document#add
|
1006
|
+
and an argument of type REXML::XMLDecl:
|
1007
|
+
|
1008
|
+
my_doc = Document.new
|
1009
|
+
my_doc.xml_decl.to_s # => ""
|
1010
|
+
my_doc.add(XMLDecl.new('2.0'))
|
1011
|
+
my_doc.xml_decl.to_s # => "<?xml version='2.0'?>"
|
1012
|
+
|
1013
|
+
Add a document type with method REXML::Document#add
|
1014
|
+
and an argument of type REXML::DocType:
|
1015
|
+
|
1016
|
+
my_doc = Document.new
|
1017
|
+
my_doc.doctype.to_s # => ""
|
1018
|
+
my_doc.add(DocType.new('foo'))
|
1019
|
+
my_doc.doctype.to_s # => "<!DOCTYPE foo>"
|
1020
|
+
|
1021
|
+
Add a node of any other REXML type with method REXML::Document#add and an argument
|
1022
|
+
that is not of type REXML::XMLDecl or REXML::DocType:
|
1023
|
+
|
1024
|
+
my_doc = Document.new
|
1025
|
+
my_doc.add(Element.new('foo'))
|
1026
|
+
my_doc.to_s # => "<foo/>"
|
1027
|
+
|
1028
|
+
Add an existing element as the root element with method REXML::Document#add_element:
|
1029
|
+
|
1030
|
+
ele = Element.new('foo')
|
1031
|
+
my_doc = Document.new
|
1032
|
+
my_doc.add_element(ele)
|
1033
|
+
my_doc.root # => <foo/>
|
1034
|
+
|
1035
|
+
Create and add an element as the root element with method REXML::Document#add_element:
|
1036
|
+
|
1037
|
+
my_doc = Document.new
|
1038
|
+
my_doc.add_element('foo')
|
1039
|
+
my_doc.root # => <foo/>
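The steps above can be combined to build a small document from scratch. This is only a sketch under assumed names: the element names +catalog+ and +item+ and the variables are hypothetical, not from the tutorial's sample data:

```ruby
require 'rexml/document'

catalog = REXML::Document.new
catalog.add(REXML::XMLDecl.new('1.0', 'UTF-8'))  # XML declaration
root = catalog.add_element('catalog')            # root element
item = root.add_element('item', {'id' => '1'})   # child element with an attribute
item.add_text('first')                           # text inside the child

catalog.root.name                   # => "catalog"
catalog.root.elements['item'].text  # => "first"
```

Note that a document may have only one root element; a second call to +add_element+ on the document itself raises an exception.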
|
1040
|
+
|
1041
|
+
=== Editing an Element
|
1042
|
+
|
1043
|
+
==== Creating an Element
|
1044
|
+
|
1045
|
+
Create a new element with method REXML::Element::new:
|
1046
|
+
|
1047
|
+
ele = Element.new('foo') # => <foo/>
|
1048
|
+
|
1049
|
+
==== Setting Element Properties
|
1050
|
+
|
1051
|
+
Set the context for an element with method REXML::Element#context=
|
1052
|
+
(see {Element Context}[../context_rdoc.html]):
|
1053
|
+
|
1054
|
+
ele.context # => nil
|
1055
|
+
ele.context = {ignore_whitespace_nodes: :all}
|
1056
|
+
ele.context # => {:ignore_whitespace_nodes=>:all}
|
1057
|
+
|
1058
|
+
Set the parent for an element with inherited method REXML::Child#parent=:
|
1059
|
+
|
1060
|
+
ele.parent # => nil
|
1061
|
+
ele.parent = Element.new('bar')
|
1062
|
+
ele.parent # => <bar/>
|
1063
|
+
|
1064
|
+
Set the text for an element with method REXML::Element#text=:
|
1065
|
+
|
1066
|
+
ele.text # => nil
|
1067
|
+
ele.text = 'bar'
|
1068
|
+
ele.text # => "bar"
|
1069
|
+
|
1070
|
+
==== Adding to an Element
|
1071
|
+
|
1072
|
+
Add a node as the last child with inherited method REXML::Parent#add (or its alias #push):
|
1073
|
+
|
1074
|
+
ele = Element.new('foo') # => <foo/>
|
1075
|
+
ele.push(Text.new('bar'))
|
1076
|
+
ele.push(Element.new('baz'))
|
1077
|
+
ele.children # => ["bar", <baz/>]
|
1078
|
+
|
1079
|
+
Add a node as the first child with inherited method REXML::Parent#unshift:
|
1080
|
+
|
1081
|
+
ele = Element.new('foo') # => <foo/>
|
1082
|
+
ele.unshift(Element.new('bar'))
|
1083
|
+
ele.unshift(Text.new('baz'))
|
1084
|
+
ele.children # => ["baz", <bar/>]
|
1085
|
+
|
1086
|
+
Add an element as the last child with method REXML::Element#add_element:
|
1087
|
+
|
1088
|
+
ele = Element.new('foo') # => <foo/>
|
1089
|
+
ele.add_element('bar')
|
1090
|
+
ele.add_element(Element.new('baz'))
|
1091
|
+
ele.children # => [<bar/>, <baz/>]
|
1092
|
+
|
1093
|
+
Add a text node as the last child with method REXML::Element#add_text:
|
1094
|
+
|
1095
|
+
ele = Element.new('foo') # => <foo/>
|
1096
|
+
ele.add_text('bar')
|
1097
|
+
ele.add_text(Text.new('baz'))
|
1098
|
+
ele.children # => ["bar", "baz"]
|
1099
|
+
|
1100
|
+
Insert a node before a given node with method REXML::Parent#insert_before:
|
1101
|
+
|
1102
|
+
ele = Element.new('foo') # => <foo/>
|
1103
|
+
ele.add_text('bar')
|
1104
|
+
ele.add_text(Text.new('baz'))
|
1105
|
+
ele.children # => ["bar", "baz"]
|
1106
|
+
target = ele[1] # => "baz"
|
1107
|
+
ele.insert_before(target, Text.new('bat'))
|
1108
|
+
ele.children # => ["bar", "bat", "baz"]
|
1109
|
+
|
1110
|
+
Insert a node after a given node with method REXML::Parent#insert_after:
|
1111
|
+
|
1112
|
+
ele = Element.new('foo') # => <foo/>
|
1113
|
+
ele.add_text('bar')
|
1114
|
+
ele.add_text(Text.new('baz'))
|
1115
|
+
ele.children # => ["bar", "baz"]
|
1116
|
+
target = ele[0] # => "bar"
|
1117
|
+
ele.insert_after(target, Text.new('bat'))
|
1118
|
+
ele.children # => ["bar", "bat", "baz"]
|
1119
|
+
|
1120
|
+
Add an attribute with method REXML::Element#add_attribute:
|
1121
|
+
|
1122
|
+
ele = Element.new('foo') # => <foo/>
|
1123
|
+
ele.add_attribute('bar', 'baz')
|
1124
|
+
ele.add_attribute(Attribute.new('bat', 'bam'))
|
1125
|
+
ele.attributes # => {"bar"=>bar='baz', "bat"=>bat='bam'}
|
1126
|
+
|
1127
|
+
Add multiple attributes with method REXML::Element#add_attributes:
|
1128
|
+
|
1129
|
+
ele = Element.new('foo') # => <foo/>
|
1130
|
+
ele.add_attributes({'bar' => 'baz', 'bat' => 'bam'})
|
1131
|
+
ele.add_attributes([['ban', 'bap'], ['bah', 'bad']])
|
1132
|
+
ele.attributes # => {"bar"=>bar='baz', "bat"=>bat='bam', "ban"=>ban='bap', "bah"=>bah='bad'}
|
1133
|
+
|
1134
|
+
Add a namespace with method REXML::Element#add_namespace:
|
1135
|
+
|
1136
|
+
ele = Element.new('foo') # => <foo/>
|
1137
|
+
ele.add_namespace('bar')
|
1138
|
+
ele.add_namespace('baz', 'bat')
|
1139
|
+
ele.namespaces # => {"xmlns"=>"bar", "baz"=>"bat"}
|
1140
|
+
|
1141
|
+
==== Deleting from an Element
|
1142
|
+
|
1143
|
+
Delete a specific child object with inherited method REXML::Parent#delete:
|
1144
|
+
|
1145
|
+
ele = Element.new('foo') # => <foo/>
|
1146
|
+
ele.add_element('bar')
|
1147
|
+
ele.add_text('baz')
|
1148
|
+
ele.children # => [<bar/>, "baz"]
|
1149
|
+
target = ele[1] # => "baz"
|
1150
|
+
ele.delete(target) # => "baz"
|
1151
|
+
ele.children # => [<bar/>]
|
1152
|
+
target = ele[0] # => <bar/>
|
1153
|
+
ele.delete(target) # => <bar/>
|
1154
|
+
ele.children # => []
|
1155
|
+
|
1156
|
+
Delete a child at a specific index with inherited method REXML::Parent#delete_at:
|
1157
|
+
|
1158
|
+
ele = Element.new('foo') # => <foo/>
|
1159
|
+
ele.add_element('bar')
|
1160
|
+
ele.add_text('baz')
|
1161
|
+
ele.children # => [<bar/>, "baz"]
|
1162
|
+
ele.delete_at(1)
|
1163
|
+
ele.children # => [<bar/>]
|
1164
|
+
ele.delete_at(0)
|
1165
|
+
ele.children # => []
|
1166
|
+
|
1167
|
+
Delete all children meeting a specified criterion with inherited method
|
1168
|
+
REXML::Parent#delete_if:
|
1169
|
+
|
1170
|
+
ele = Element.new('foo') # => <foo/>
|
1171
|
+
ele.add_element('bar')
|
1172
|
+
ele.add_text('baz')
|
1173
|
+
ele.add_element('bat')
|
1174
|
+
ele.add_text('bam')
|
1175
|
+
ele.children # => [<bar/>, "baz", <bat/>, "bam"]
|
1176
|
+
ele.delete_if {|child| child.instance_of?(Text) }
|
1177
|
+
ele.children # => [<bar/>, <bat/>]
|
1178
|
+
|
1179
|
+
Delete an element at a specific 1-based index with method REXML::Element#delete_element:
|
1180
|
+
|
1181
|
+
ele = Element.new('foo') # => <foo/>
|
1182
|
+
ele.add_element('bar')
|
1183
|
+
ele.add_text('baz')
|
1184
|
+
ele.add_element('bat')
|
1185
|
+
ele.add_text('bam')
|
1186
|
+
ele.children # => [<bar/>, "baz", <bat/>, "bam"]
|
1187
|
+
ele.delete_element(2) # => <bat/>
|
1188
|
+
ele.children # => [<bar/>, "baz", "bam"]
|
1189
|
+
ele.delete_element(1) # => <bar/>
|
1190
|
+
ele.children # => ["baz", "bam"]
|
1191
|
+
|
1192
|
+
Delete a specific element with the same method:
|
1193
|
+
|
1194
|
+
ele = Element.new('foo') # => <foo/>
|
1195
|
+
ele.add_element('bar')
|
1196
|
+
ele.add_text('baz')
|
1197
|
+
ele.add_element('bat')
|
1198
|
+
ele.add_text('bam')
|
1199
|
+
ele.children # => [<bar/>, "baz", <bat/>, "bam"]
|
1200
|
+
target = ele.elements[2] # => <bat/>
|
1201
|
+
ele.delete_element(target) # => <bat/>
|
1202
|
+
ele.children # => [<bar/>, "baz", "bam"]
|
1203
|
+
|
1204
|
+
Delete an element matching an xpath using the same method:
|
1205
|
+
|
1206
|
+
ele = Element.new('foo') # => <foo/>
|
1207
|
+
ele.add_element('bar')
|
1208
|
+
ele.add_text('baz')
|
1209
|
+
ele.add_element('bat')
|
1210
|
+
ele.add_text('bam')
|
1211
|
+
ele.children # => [<bar/>, "baz", <bat/>, "bam"]
|
1212
|
+
ele.delete_element('./bat') # => <bat/>
|
1213
|
+
ele.children # => [<bar/>, "baz", "bam"]
|
1214
|
+
ele.delete_element('./bar') # => <bar/>
|
1215
|
+
ele.children # => ["baz", "bam"]
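The two index-based deletions count differently: +delete_at+ uses a 0-based index over all children, while +delete_element+ uses a 1-based index over element children only. A short sketch contrasting them (the variable +mixed+ and the node names are illustrative):

```ruby
require 'rexml/document'

mixed = REXML::Element.new('foo')
mixed.add_text('t0')
mixed.add_element('a')
mixed.add_text('t1')
mixed.add_element('b')
# Children are now: "t0", <a/>, "t1", <b/>

mixed.delete_at(1)       # 0-based over all children: removes <a/>
mixed.delete_element(1)  # 1-based over element children: removes <b/>

remaining = mixed.children.map(&:to_s)  # => ["t0", "t1"]
```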
|
1216
|
+
|
1217
|
+
Delete an attribute by name with method REXML::Element#delete_attribute:
|
1218
|
+
|
1219
|
+
ele = Element.new('foo') # => <foo/>
|
1220
|
+
ele.add_attributes({'bar' => 'baz', 'bam' => 'bat'})
|
1221
|
+
ele.attributes # => {"bar"=>bar='baz', "bam"=>bam='bat'}
|
1222
|
+
ele.delete_attribute('bam')
|
1223
|
+
ele.attributes # => {"bar"=>bar='baz'}
|
1224
|
+
|
1225
|
+
Delete a namespace with method REXML::Element#delete_namespace:
|
1226
|
+
|
1227
|
+
ele = Element.new('foo') # => <foo/>
|
1228
|
+
ele.add_namespace('bar')
|
1229
|
+
ele.add_namespace('baz', 'bat')
|
1230
|
+
ele.namespaces # => {"xmlns"=>"bar", "baz"=>"bat"}
|
1231
|
+
ele.delete_namespace('xmlns')
|
1232
|
+
ele.namespaces # => {"baz"=>"bat"}
|
1233
|
+
ele.delete_namespace('baz')
|
1234
|
+
ele.namespaces # => {}
|
1235
|
+
|
1236
|
+
Remove an element from its parent with inherited method REXML::Child#remove:
|
1237
|
+
|
1238
|
+
ele = Element.new('foo') # => <foo/>
|
1239
|
+
parent = Element.new('bar') # => <bar/>
|
1240
|
+
parent.add_element(ele) # => <foo/>
|
1241
|
+
parent.children.size # => 1
|
1242
|
+
ele.remove # => <foo/>
|
1243
|
+
parent.children.size # => 0
|
1244
|
+
|
1245
|
+
==== Replacing Nodes
|
1246
|
+
|
1247
|
+
Replace the node at a given 0-based index with inherited method REXML::Parent#[]=:
|
1248
|
+
|
1249
|
+
ele = Element.new('foo') # => <foo/>
|
1250
|
+
ele.add_element('bar')
|
1251
|
+
ele.add_text('baz')
|
1252
|
+
ele.add_element('bat')
|
1253
|
+
ele.add_text('bam')
|
1254
|
+
ele.children # => [<bar/>, "baz", <bat/>, "bam"]
|
1255
|
+
ele[2] = Text.new('bad') # => "bad"
|
1256
|
+
ele.children # => [<bar/>, "baz", "bad", "bam"]
|
1257
|
+
|
1258
|
+
Replace a given node with another node with inherited method REXML::Parent#replace_child:
|
1259
|
+
|
1260
|
+
ele = Element.new('foo') # => <foo/>
|
1261
|
+
ele.add_element('bar')
|
1262
|
+
ele.add_text('baz')
|
1263
|
+
ele.add_element('bat')
|
1264
|
+
ele.add_text('bam')
|
1265
|
+
ele.children # => [<bar/>, "baz", <bat/>, "bam"]
|
1266
|
+
target = ele[2] # => <bat/>
|
1267
|
+
ele.replace_child(target, Text.new('bah'))
|
1268
|
+
ele.children # => [<bar/>, "baz", "bah", "bam"]
|
1269
|
+
|
1270
|
+
Replace +self+ with a given node with inherited method REXML::Child#replace_with:
|
1271
|
+
|
1272
|
+
ele = Element.new('foo') # => <foo/>
|
1273
|
+
ele.add_element('bar')
|
1274
|
+
ele.add_text('baz')
|
1275
|
+
ele.add_element('bat')
|
1276
|
+
ele.add_text('bam')
|
1277
|
+
ele.children # => [<bar/>, "baz", <bat/>, "bam"]
|
1278
|
+
target = ele[2] # => <bat/>
|
1279
|
+
target.replace_with(Text.new('bah'))
|
1280
|
+
ele.children # => [<bar/>, "baz", "bah", "bam"]
|
1281
|
+
|
1282
|
+
=== Cloning
|
1283
|
+
|
1284
|
+
Create a shallow clone of an element with method REXML::Element#clone.
|
1285
|
+
The clone contains the name and attributes, but not the parent or children:
|
1286
|
+
|
1287
|
+
ele = Element.new('foo')
|
1288
|
+
ele.add_attributes({'bar' => 0, 'baz' => 1})
|
1289
|
+
ele.clone # => <foo bar='0' baz='1'/>
|
1290
|
+
|
1291
|
+
Create a shallow clone of a document with method REXML::Document#clone.
|
1292
|
+
The XML declaration is copied; the document type and root element are not cloned:
|
1293
|
+
|
1294
|
+
my_xml = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE foo><root/>'
|
1295
|
+
my_doc = Document.new(my_xml)
|
1296
|
+
clone_doc = my_doc.clone
|
1297
|
+
|
1298
|
+
my_doc.xml_decl # => <?xml ... ?>
|
1299
|
+
clone_doc.xml_decl # => <?xml ... ?>
|
1300
|
+
|
1301
|
+
my_doc.doctype.to_s # => "<!DOCTYPE foo>"
|
1302
|
+
clone_doc.doctype.to_s # => ""
|
1303
|
+
|
1304
|
+
my_doc.root # => <root/>
|
1305
|
+
clone_doc.root # => nil
|
1306
|
+
|
1307
|
+
Create a deep clone of an element with inherited method REXML::Parent#deep_clone.
|
1308
|
+
All nodes and attributes are copied:
|
1309
|
+
|
1310
|
+
doc.to_s.size # => 825
|
1311
|
+
clone = doc.deep_clone
|
1312
|
+
clone.to_s.size # => 825
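The difference between the two kinds of clone is easiest to see on a small element. A minimal sketch (the names +original+, +shallow+, and +deep+ are illustrative):

```ruby
require 'rexml/document'

original = REXML::Element.new('foo')
original.add_attribute('bar', 'baz')
original.add_element('child')

shallow = original.clone       # name and attributes only
deep    = original.deep_clone  # name, attributes, and copies of all children

shallow.to_s  # => "<foo bar='baz'/>"
deep.to_s     # => "<foo bar='baz'><child/></foo>"

# The deep clone holds its own copy of the child, not the same object.
deep.elements['child'].equal?(original.elements['child'])  # => false
```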
|
1313
|
+
|
1314
|
+
== Writing the Document
|
1315
|
+
|
1316
|
+
Write a document to an \IO stream (defaults to <tt>$stdout</tt>)
|
1317
|
+
with method REXML::Document#write:
|
1318
|
+
|
1319
|
+
doc.write
|
1320
|
+
|
1321
|
+
Output:
|
1322
|
+
|
1323
|
+
<?xml version='1.0' encoding='UTF-8'?>
|
1324
|
+
<bookstore>
|
1325
|
+
|
1326
|
+
<book category='cooking'>
|
1327
|
+
<title lang='en'>Everyday Italian</title>
|
1328
|
+
<author>Giada De Laurentiis</author>
|
1329
|
+
<year>2005</year>
|
1330
|
+
<price>30.00</price>
|
1331
|
+
</book>
|
1332
|
+
|
1333
|
+
<book category='children'>
|
1334
|
+
<title lang='en'>Harry Potter</title>
|
1335
|
+
<author>J K. Rowling</author>
|
1336
|
+
<year>2005</year>
|
1337
|
+
<price>29.99</price>
|
1338
|
+
</book>
|
1339
|
+
|
1340
|
+
<book category='web'>
|
1341
|
+
<title lang='en'>XQuery Kick Start</title>
|
1342
|
+
<author>James McGovern</author>
|
1343
|
+
<author>Per Bothner</author>
|
1344
|
+
<author>Kurt Cagle</author>
|
1345
|
+
<author>James Linn</author>
|
1346
|
+
<author>Vaidyanathan Nagarajan</author>
|
1347
|
+
<year>2003</year>
|
1348
|
+
<price>49.99</price>
|
1349
|
+
</book>
|
1350
|
+
|
1351
|
+
<book category='web' cover='paperback'>
|
1352
|
+
<title lang='en'>Learning XML</title>
|
1353
|
+
<author>Erik T. Ray</author>
|
1354
|
+
<year>2003</year>
|
1355
|
+
<price>39.95</price>
|
1356
|
+
</book>
|
1357
|
+
|
1358
|
+
</bookstore>
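To capture the output rather than print it, any object that responds to <tt><<</tt> can be passed as the stream. The following sketch assumes a StringIO from Ruby's standard library as the target; the document content and variable names are illustrative:

```ruby
require 'rexml/document'
require 'stringio'

small_doc = REXML::Document.new('<root><a>1</a></root>')

# Write into an in-memory stream instead of $stdout.
io = StringIO.new
small_doc.write(io)

written = io.string  # => "<root><a>1</a></root>"
```

The same call works with an open File, which is the usual way to save a document to disk.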
|