rexml 3.2.5 → 3.3.6
- checksums.yaml +4 -4
- data/NEWS.md +406 -2
- data/README.md +10 -1
- data/doc/rexml/tasks/rdoc/element.rdoc +2 -2
- data/doc/rexml/tutorial.rdoc +1358 -0
- data/lib/rexml/attribute.rb +14 -9
- data/lib/rexml/document.rb +1 -1
- data/lib/rexml/element.rb +19 -34
- data/lib/rexml/entity.rb +5 -37
- data/lib/rexml/formatters/pretty.rb +3 -3
- data/lib/rexml/functions.rb +1 -2
- data/lib/rexml/namespace.rb +8 -4
- data/lib/rexml/node.rb +8 -4
- data/lib/rexml/parseexception.rb +1 -0
- data/lib/rexml/parsers/baseparser.rb +421 -263
- data/lib/rexml/parsers/pullparser.rb +4 -0
- data/lib/rexml/parsers/sax2parser.rb +6 -19
- data/lib/rexml/parsers/streamparser.rb +8 -10
- data/lib/rexml/parsers/treeparser.rb +9 -21
- data/lib/rexml/parsers/xpathparser.rb +136 -86
- data/lib/rexml/rexml.rb +3 -1
- data/lib/rexml/source.rb +128 -98
- data/lib/rexml/text.rb +40 -18
- data/lib/rexml/xpath_parser.rb +7 -3
- metadata +11 -39
@@ -0,0 +1,1358 @@
|
|
1
|
+
= \REXML Tutorial
|
2
|
+
|
3
|
+
== Why \REXML?
|
4
|
+
|
5
|
+
- Ruby's \REXML library is part of the Ruby distribution,
|
6
|
+
so using it requires no gem installations.
|
7
|
+
- \REXML is fully maintained.
|
8
|
+
- \REXML is mature, having been in use for many years.
|
9
|
+
|
10
|
+
== To Include, or Not to Include?
|
11
|
+
|
12
|
+
REXML is a module.
|
13
|
+
To use it, you must require it:
|
14
|
+
|
15
|
+
require 'rexml' # => true
|
16
|
+
|
17
|
+
If you do not also include it, you must fully qualify references to REXML:
|
18
|
+
|
19
|
+
REXML::Document # => REXML::Document
|
20
|
+
|
21
|
+
If you also include the module, you may optionally omit <tt>REXML::</tt>:
|
22
|
+
|
23
|
+
include REXML
|
24
|
+
Document # => REXML::Document
|
25
|
+
REXML::Document # => REXML::Document
|
26
|
+
|
27
|
+
== Preliminaries
|
28
|
+
|
29
|
+
All examples here assume that the following code has been executed:
|
30
|
+
|
31
|
+
require 'rexml'
|
32
|
+
include REXML
|
33
|
+
|
34
|
+
The source XML for many examples here is from file
|
35
|
+
{books.xml}[https://www.w3schools.com/xml/books.xml] at w3schools.com.
|
36
|
+
You may find it convenient to open that page in a new tab
|
37
|
+
(Ctrl-click in some browsers).
|
38
|
+
|
39
|
+
Note that your browser may display the XML with modified whitespace
|
40
|
+
and without the XML declaration, which in this case is:
|
41
|
+
|
42
|
+
<?xml version="1.0" encoding="UTF-8"?>
|
43
|
+
|
44
|
+
For convenience, we capture the XML into a string variable:
|
45
|
+
|
46
|
+
require 'open-uri'
|
47
|
+
source_string = URI.open('https://www.w3schools.com/xml/books.xml').read
|
48
|
+
|
49
|
+
And into a file:
|
50
|
+
|
51
|
+
File.write('source_file.xml', source_string)
|
52
|
+
|
53
|
+
Throughout these examples, variable +doc+ will hold only the document
|
54
|
+
derived from these sources:
|
55
|
+
|
56
|
+
doc = Document.new(source_string)
|
57
|
+
|
58
|
+
== Parsing \XML \Source
|
59
|
+
|
60
|
+
=== Parsing a Document
|
61
|
+
|
62
|
+
Use method REXML::Document::new to parse XML source.
|
63
|
+
|
64
|
+
The source may be a string:
|
65
|
+
|
66
|
+
doc = Document.new(source_string)
|
67
|
+
|
68
|
+
Or an \IO stream:
|
69
|
+
|
70
|
+
doc = File.open('source_file.xml', 'r') do |io|
|
71
|
+
Document.new(io)
|
72
|
+
end
|
73
|
+
|
74
|
+
Method <tt>URI.open</tt> returns an IO-like object
(a StringIO for a small response such as this one),
so the source can be from a web page:
|
76
|
+
|
77
|
+
require 'open-uri'
|
78
|
+
io = URI.open("https://www.w3schools.com/xml/books.xml")
|
79
|
+
io.class # => StringIO
|
80
|
+
doc = Document.new(io)
|
81
|
+
|
82
|
+
For any of these sources, the returned object is an REXML::Document:
|
83
|
+
|
84
|
+
doc # => <UNDEFINED> ... </>
|
85
|
+
doc.class # => REXML::Document
|
86
|
+
|
87
|
+
Note: <tt>'UNDEFINED'</tt> is the "name" displayed for a document,
|
88
|
+
even though <tt>doc.name</tt> returns an empty string <tt>""</tt>.
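
As a network-free illustration of the sources described above, the following sketch parses the same XML from a string and from an IO-like object. The inline XML and the variable names are assumptions made for this example; they are not taken from books.xml.

```ruby
require 'rexml/document'
require 'stringio'

# Hypothetical inline XML standing in for a downloaded source.
xml = '<bookstore><book category="cooking"/></bookstore>'

from_string = REXML::Document.new(xml)
from_io = REXML::Document.new(StringIO.new(xml)) # an IO-like source

p from_string.class     # => REXML::Document
p from_string.root.name # => "bookstore"
p from_io.root.name     # => "bookstore"
p from_string.name      # => ""
```

Either source yields an equivalent document, so the choice depends only on where the XML happens to live.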
|
89
|
+
|
90
|
+
A parsed document may produce \REXML objects of many classes,
|
91
|
+
but the two that are likely to be of greatest interest are
|
92
|
+
REXML::Document and REXML::Element.
|
93
|
+
These two classes are covered in great detail in this tutorial.
|
94
|
+
|
95
|
+
=== Context (Parsing Options)
|
96
|
+
|
97
|
+
The context for parsing a document is a hash that influences
|
98
|
+
the way the XML is read and stored.
|
99
|
+
|
100
|
+
The context entries are:
|
101
|
+
|
102
|
+
- +:respect_whitespace+: controls treatment of whitespace.
|
103
|
+
- +:compress_whitespace+: determines whether whitespace is compressed.
|
104
|
+
- +:ignore_whitespace_nodes+: determines whether whitespace-only nodes are to be ignored.
|
105
|
+
- +:raw+: controls treatment of special characters and entities.
|
106
|
+
|
107
|
+
See {Element Context}[../context_rdoc.html].
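
A minimal sketch of how a context entry changes the parsed tree, assuming a small hypothetical document; only +:ignore_whitespace_nodes+ is shown, with the value +:all+:

```ruby
require 'rexml/document'

# Hypothetical XML with whitespace-only text between the elements.
xml = "<root>\n  <a/>\n  <b/>\n</root>"

default_doc = REXML::Document.new(xml)
compact_doc = REXML::Document.new(xml, {ignore_whitespace_nodes: :all})

p default_doc.root.size # => 5 (three whitespace text nodes plus <a/> and <b/>)
p compact_doc.root.size # => 2 (only <a/> and <b/>)
```

The context hash is passed as the second argument to <tt>Document.new</tt> and is shared with the elements created during parsing.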
|
108
|
+
|
109
|
+
== Exploring the Document
|
110
|
+
|
111
|
+
An REXML::Document object represents an XML document.
|
112
|
+
|
113
|
+
The object inherits from its ancestor classes:
|
114
|
+
|
115
|
+
- REXML::Child (includes module REXML::Node)
|
116
|
+
- REXML::Parent (includes module {Enumerable}[rdoc-ref:Enumerable]).
|
117
|
+
- REXML::Element (includes module REXML::Namespace).
|
118
|
+
- REXML::Document
|
119
|
+
|
120
|
+
This section covers only those properties and methods that are unique to a document
|
121
|
+
(that is, not inherited or included).
|
122
|
+
|
123
|
+
=== Document Properties
|
124
|
+
|
125
|
+
A document has several properties (other than its children):
|
126
|
+
|
127
|
+
- Document type.
|
128
|
+
- Node type.
|
129
|
+
- Name.
|
130
|
+
- Document.
|
131
|
+
- XPath.
|
132
|
+
|
133
|
+
[Document Type]
|
134
|
+
|
135
|
+
A document may have a document type:
|
136
|
+
|
137
|
+
my_xml = '<!DOCTYPE foo>'
|
138
|
+
my_doc = Document.new(my_xml)
|
139
|
+
doc_type = my_doc.doctype
|
140
|
+
doc_type.class # => REXML::DocType
|
141
|
+
doc_type.to_s # => "<!DOCTYPE foo>"
|
142
|
+
|
143
|
+
[Node Type]
|
144
|
+
|
145
|
+
A document also has a node type (always +:document+):
|
146
|
+
|
147
|
+
doc.node_type # => :document
|
148
|
+
|
149
|
+
[Name]
|
150
|
+
|
151
|
+
A document has a name (always an empty string):
|
152
|
+
|
153
|
+
doc.name # => ""
|
154
|
+
|
155
|
+
[Document]
|
156
|
+
|
157
|
+
\Method REXML::Document#document returns +self+:
|
158
|
+
|
159
|
+
doc.document == doc # => true
|
160
|
+
|
161
|
+
An object of a different class (\REXML::Element or \REXML::Child)
|
162
|
+
may have a document, which is the document to which the object belongs;
|
163
|
+
if so, that document will be an \REXML::Document object.
|
164
|
+
|
165
|
+
doc.root.document.class # => REXML::Document
|
166
|
+
|
167
|
+
[XPath]
|
168
|
+
|
169
|
+
\Method REXML::Element#xpath returns the string xpath to the element,
|
170
|
+
relative to its most distant ancestor:
|
171
|
+
|
172
|
+
doc.root.class # => REXML::Element
|
173
|
+
doc.root.xpath # => "/bookstore"
|
174
|
+
doc.root.texts.first # => "\n\n"
|
175
|
+
doc.root.texts.first.xpath # => "/bookstore/text()"
|
176
|
+
|
177
|
+
If there is no ancestor, returns the expanded name of the element:
|
178
|
+
|
179
|
+
Element.new('foo').xpath # => "foo"
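
The same behavior can be seen on a nested element. This is a sketch over a hypothetical inline document (not books.xml); it also feeds the returned string back to <tt>REXML::XPath.first</tt> to locate the element again:

```ruby
require 'rexml/document'

doc = REXML::Document.new('<bookstore><book><title>Learning XML</title></book></bookstore>')
title = doc.root.elements['book/title']

p title.xpath                                      # => "/bookstore/book/title"
p REXML::XPath.first(doc, title.xpath).equal?(title) # => true
p REXML::Element.new('title').xpath                # => "title"
```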
|
180
|
+
|
181
|
+
=== Document Children
|
182
|
+
|
183
|
+
A document may have children of these types:
|
184
|
+
|
185
|
+
- XML declaration.
|
186
|
+
- Root element.
|
187
|
+
- Text.
|
188
|
+
- Processing instructions.
|
189
|
+
- Comments.
|
190
|
+
- CDATA.
|
191
|
+
|
192
|
+
[XML Declaration]
|
193
|
+
|
194
|
+
A document may have an XML declaration, which is stored as an REXML::XMLDecl object:
|
195
|
+
|
196
|
+
doc.xml_decl # => <?xml ... ?>
|
197
|
+
doc.xml_decl.class # => REXML::XMLDecl
|
198
|
+
|
199
|
+
Document.new('').xml_decl # => <?xml ... ?>
|
200
|
+
|
201
|
+
my_xml = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
my_doc = Document.new(my_xml)
xml_decl = my_doc.xml_decl
xml_decl.to_s # => "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>"
|
205
|
+
|
206
|
+
The version, encoding, and stand-alone values may be retrieved separately:
|
207
|
+
|
208
|
+
my_doc.version # => "1.0"
|
209
|
+
my_doc.encoding # => "UTF-8"
|
210
|
+
my_doc.stand_alone? # => "yes"
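
The same values are also available from the REXML::XMLDecl object itself. A short sketch, assuming a hypothetical one-line document with an explicit declaration:

```ruby
require 'rexml/document'

my_doc = REXML::Document.new('<?xml version="1.0" encoding="UTF-8" standalone="yes"?><root/>')
decl = my_doc.xml_decl

p decl.class      # => REXML::XMLDecl
p decl.version    # => "1.0"
p decl.encoding   # => "UTF-8"
p decl.standalone # => "yes"
```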
|
211
|
+
|
212
|
+
[Root Element]
|
213
|
+
|
214
|
+
A document may have a single element child, called the _root_ _element_,
|
215
|
+
which is stored as an REXML::Element object;
|
216
|
+
it may be retrieved with method +root+:
|
217
|
+
|
218
|
+
doc.root # => <bookstore> ... </>
|
219
|
+
doc.root.class # => REXML::Element
|
220
|
+
|
221
|
+
Document.new('').root # => nil
|
222
|
+
|
223
|
+
[Text]
|
224
|
+
|
225
|
+
A document may have text passages, each of which is stored
|
226
|
+
as an REXML::Text object:
|
227
|
+
|
228
|
+
doc.texts.each {|t| p [t.class, t] }
|
229
|
+
|
230
|
+
Output:
|
231
|
+
|
232
|
+
[REXML::Text, "\n"]
|
233
|
+
|
234
|
+
[Processing Instructions]
|
235
|
+
|
236
|
+
A document may have processing instructions, which are stored
|
237
|
+
as REXML::Instruction objects:
|
238
|
+
|
239
|
+
|
240
|
+
|
241
|
+
Output:
|
242
|
+
|
243
|
+
[REXML::Instruction, <?p-i my-application ...?>]
|
244
|
+
[REXML::Instruction, <?p-i my-application ...?>]
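
Output of this shape comes from iterating over +instructions+. A minimal sketch that yields two such entries, assuming a hypothetical document with two processing instructions whose target is <tt>my-application</tt> (the instruction contents and the root element are invented for illustration):

```ruby
require 'rexml/document'

# Hypothetical source: two processing instructions followed by an empty root.
my_xml = '<?my-application foo?><?my-application bar?><root/>'
my_doc = REXML::Document.new(my_xml)
my_doc.instructions.each {|i| p [i.class, i] }

targets = my_doc.instructions.map {|i| i.target }
p targets # => ["my-application", "my-application"]
```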
|
245
|
+
|
246
|
+
[Comments]
|
247
|
+
|
248
|
+
A document may have comments, which are stored
|
249
|
+
as REXML::Comment objects:
|
250
|
+
|
251
|
+
my_xml = <<-EOT
|
252
|
+
<!--foo-->
|
253
|
+
<!--bar-->
|
254
|
+
EOT
|
255
|
+
my_doc = Document.new(my_xml)
|
256
|
+
my_doc.comments.each {|c| p [c.class, c] }
|
257
|
+
|
258
|
+
Output:
|
259
|
+
|
260
|
+
[REXML::Comment, #<REXML::Comment: @parent=<UNDEFINED> ... </>, @string="foo">]
|
261
|
+
[REXML::Comment, #<REXML::Comment: @parent=<UNDEFINED> ... </>, @string="bar">]
|
262
|
+
|
263
|
+
[CDATA]
|
264
|
+
|
265
|
+
A document may have CDATA entries, which are stored
|
266
|
+
as REXML::CData objects:
|
267
|
+
|
268
|
+
my_xml = <<-EOT
|
269
|
+
<![CDATA[foo]]>
|
270
|
+
<![CDATA[bar]]>
|
271
|
+
EOT
|
272
|
+
my_doc = Document.new(my_xml)
|
273
|
+
my_doc.cdatas.each {|cd| p [cd.class, cd] }
|
274
|
+
|
275
|
+
Output:
|
276
|
+
|
277
|
+
[REXML::CData, "foo"]
|
278
|
+
[REXML::CData, "bar"]
|
279
|
+
|
280
|
+
The payload of a document is a tree of nodes, descending from the root element:
|
281
|
+
|
282
|
+
doc.root.children.each do |child|
|
283
|
+
p [child, child.class]
|
284
|
+
end
|
285
|
+
|
286
|
+
Output:
|
287
|
+
|
288
|
+
[REXML::Text, "\n\n"]
|
289
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
290
|
+
[REXML::Text, "\n\n"]
|
291
|
+
[REXML::Element, <book category='children'> ... </>]
|
292
|
+
[REXML::Text, "\n\n"]
|
293
|
+
[REXML::Element, <book category='web'> ... </>]
|
294
|
+
[REXML::Text, "\n\n"]
|
295
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
296
|
+
[REXML::Text, "\n\n"]
|
297
|
+
|
298
|
+
== Exploring an Element
|
299
|
+
|
300
|
+
An REXML::Element object represents an XML element.
|
301
|
+
|
302
|
+
The object inherits from its ancestor classes:
|
303
|
+
|
304
|
+
- REXML::Child (includes module REXML::Node)
|
305
|
+
- REXML::Parent (includes module {Enumerable}[rdoc-ref:Enumerable]).
|
306
|
+
- REXML::Element (includes module REXML::Namespace).
|
307
|
+
|
308
|
+
This section covers methods:
|
309
|
+
|
310
|
+
- Defined in REXML::Element itself.
|
311
|
+
- Inherited from REXML::Parent and REXML::Child.
|
312
|
+
- Included from REXML::Node.
|
313
|
+
|
314
|
+
=== Inside the Element
|
315
|
+
|
316
|
+
[Brief String Representation]
|
317
|
+
|
318
|
+
Use method REXML::Element#inspect to retrieve a brief string representation.
|
319
|
+
|
320
|
+
doc.root.inspect # => "<bookstore> ... </>"
|
321
|
+
|
322
|
+
The ellipsis (<tt>...</tt>) indicates that the element has children.
|
323
|
+
When there are no children, the ellipsis is omitted:
|
324
|
+
|
325
|
+
Element.new('foo').inspect # => "<foo/>"
|
326
|
+
|
327
|
+
If the element has attributes, those are also included:
|
328
|
+
|
329
|
+
doc.root.elements.first.inspect # => "<book category='cooking'> ... </>"
|
330
|
+
|
331
|
+
[Extended String Representation]
|
332
|
+
|
333
|
+
Use inherited method REXML::Child#bytes to retrieve an extended
string representation:
|
335
|
+
|
336
|
+
doc.root.bytes # => "<bookstore>\n\n<book category='cooking'>\n <title lang='en'>Everyday Italian</title>\n <author>Giada De Laurentiis</author>\n <year>2005</year>\n <price>30.00</price>\n</book>\n\n<book category='children'>\n <title lang='en'>Harry Potter</title>\n <author>J K. Rowling</author>\n <year>2005</year>\n <price>29.99</price>\n</book>\n\n<book category='web'>\n <title lang='en'>XQuery Kick Start</title>\n <author>James McGovern</author>\n <author>Per Bothner</author>\n <author>Kurt Cagle</author>\n <author>James Linn</author>\n <author>Vaidyanathan Nagarajan</author>\n <year>2003</year>\n <price>49.99</price>\n</book>\n\n<book category='web' cover='paperback'>\n <title lang='en'>Learning XML</title>\n <author>Erik T. Ray</author>\n <year>2003</year>\n <price>39.95</price>\n</book>\n\n</bookstore>"
|
337
|
+
|
338
|
+
[Node Type]
|
339
|
+
|
340
|
+
Use method REXML::Element#node_type to retrieve the node type (always +:element+):
|
341
|
+
|
342
|
+
doc.root.node_type # => :element
|
343
|
+
|
344
|
+
[Raw Mode]
|
345
|
+
|
346
|
+
Use method REXML::Element#raw to retrieve whether (+true+ or +nil+)
|
347
|
+
raw mode is set.
|
348
|
+
|
349
|
+
doc.root.raw # => nil
|
350
|
+
|
351
|
+
[Context]
|
352
|
+
|
353
|
+
Use method REXML::Element#context to retrieve the context hash
|
354
|
+
(see {Element Context}[../context_rdoc.html]):
|
355
|
+
|
356
|
+
doc.root.context # => {}
|
357
|
+
|
358
|
+
=== Relationships
|
359
|
+
|
360
|
+
An element may have:
|
361
|
+
|
362
|
+
- Ancestors.
|
363
|
+
- Siblings.
|
364
|
+
- Children.
|
365
|
+
|
366
|
+
==== Ancestors
|
367
|
+
|
368
|
+
[Containing Document]
|
369
|
+
|
370
|
+
Use method REXML::Element#document to retrieve the containing document, if any:
|
371
|
+
|
372
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
373
|
+
ele.document # => <UNDEFINED> ... </>
|
374
|
+
ele = Element.new('foo') # => <foo/>
|
375
|
+
ele.document # => nil
|
376
|
+
|
377
|
+
[Root Element]
|
378
|
+
|
379
|
+
Use method REXML::Element#root to retrieve the root element:
|
380
|
+
|
381
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
382
|
+
ele.root # => <bookstore> ... </>
|
383
|
+
ele = Element.new('foo') # => <foo/>
|
384
|
+
ele.root # => <foo/>
|
385
|
+
|
386
|
+
[Root Node]
|
387
|
+
|
388
|
+
Use method REXML::Element#root_node to retrieve the most distant ancestor,
|
389
|
+
which is the containing document, if any, otherwise the root element:
|
390
|
+
|
391
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
392
|
+
ele.root_node # => <UNDEFINED> ... </>
|
393
|
+
ele = Element.new('foo') # => <foo/>
|
394
|
+
ele.root_node # => <foo/>
|
395
|
+
|
396
|
+
[Parent]
|
397
|
+
|
398
|
+
Use inherited method REXML::Child#parent to retrieve the parent:
|
399
|
+
|
400
|
+
ele = doc.root # => <bookstore> ... </>
|
401
|
+
ele.parent # => <UNDEFINED> ... </>
|
402
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
403
|
+
ele.parent # => <bookstore> ... </>
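
The ancestor methods can be compared side by side. The following sketch uses a hypothetical inline document and a detached element (+orphan+, a name chosen for this example) to show how each method behaves with and without a containing document:

```ruby
require 'rexml/document'

doc = REXML::Document.new('<bookstore><book><title>Learning XML</title></book></bookstore>')
title = doc.root.elements['book/title']

p title.parent.name            # => "book"
p title.root.name              # => "bookstore"
p title.document.equal?(doc)   # => true
p title.root_node.equal?(doc)  # => true

# An element that has not been added to any document.
orphan = REXML::Element.new('foo')
p orphan.parent                    # => nil
p orphan.document                  # => nil
p orphan.root.equal?(orphan)       # => true
p orphan.root_node.equal?(orphan)  # => true
```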
|
404
|
+
|
405
|
+
Use included method REXML::Node#index_in_parent to retrieve the index
of the element among all of its parent's children (not just the element children).
Note that, like the index for <tt>doc.root.elements[n]</tt>,
the returned index is 1-based:
|
409
|
+
|
410
|
+
doc.root.children # =>
|
411
|
+
# ["\n\n",
|
412
|
+
# <book category='cooking'> ... </>,
|
413
|
+
# "\n\n",
|
414
|
+
# <book category='children'> ... </>,
|
415
|
+
# "\n\n",
|
416
|
+
# <book category='web'> ... </>,
|
417
|
+
# "\n\n",
|
418
|
+
# <book category='web' cover='paperback'> ... </>,
|
419
|
+
# "\n\n"]
|
420
|
+
ele = doc.root.elements[1] # => <book category='cooking'> ... </>
|
421
|
+
ele.index_in_parent # => 2
|
422
|
+
ele = doc.root.elements[2] # => <book category='children'> ... </>
|
423
|
+
ele.index_in_parent # => 4
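
A small self-contained sketch (hypothetical inline XML) relating the value returned by +index_in_parent+ to positions in the children array; subtracting 1 gives the zero-based position accepted by REXML::Element#[]:

```ruby
require 'rexml/document'

# Children of <r>, in order: "text", <a/>, "more", <b/>
doc = REXML::Document.new('<r>text<a/>more<b/></r>')
a = doc.root.elements['a']
b = doc.root.elements['b']

p a.index_in_parent # => 2
p b.index_in_parent # => 4
p doc.root[a.index_in_parent - 1].equal?(a) # => true
```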
|
424
|
+
|
425
|
+
==== Siblings
|
426
|
+
|
427
|
+
[Next Element]
|
428
|
+
|
429
|
+
Use method REXML::Element#next_element to retrieve the first following
|
430
|
+
sibling that is itself an element (+nil+ if there is none):
|
431
|
+
|
432
|
+
ele = doc.root.elements[1]
|
433
|
+
while ele do
|
434
|
+
p [ele.class, ele]
|
435
|
+
ele = ele.next_element
|
436
|
+
end
|
437
|
+
p ele
|
438
|
+
|
439
|
+
Output:
|
440
|
+
|
441
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
442
|
+
[REXML::Element, <book category='children'> ... </>]
|
443
|
+
[REXML::Element, <book category='web'> ... </>]
|
444
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
445
|
+
|
446
|
+
[Previous Element]
|
447
|
+
|
448
|
+
Use method REXML::Element#previous_element to retrieve the first preceding
|
449
|
+
sibling that is itself an element (+nil+ if there is none):
|
450
|
+
|
451
|
+
ele = doc.root.elements[4]
|
452
|
+
while ele do
|
453
|
+
p [ele.class, ele]
|
454
|
+
ele = ele.previous_element
|
455
|
+
end
|
456
|
+
p ele
|
457
|
+
|
458
|
+
Output:
|
459
|
+
|
460
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
461
|
+
[REXML::Element, <book category='web'> ... </>]
|
462
|
+
[REXML::Element, <book category='children'> ... </>]
|
463
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
464
|
+
|
465
|
+
[Next Node]
|
466
|
+
|
467
|
+
Use included method REXML::Node#next_sibling_node
|
468
|
+
(or its alias <tt>next_sibling</tt>) to retrieve the first following node
|
469
|
+
regardless of its class:
|
470
|
+
|
471
|
+
node = doc.root.children[0]
|
472
|
+
while node do
|
473
|
+
p [node.class, node]
|
474
|
+
node = node.next_sibling
|
475
|
+
end
|
476
|
+
p node
|
477
|
+
|
478
|
+
Output:
|
479
|
+
|
480
|
+
[REXML::Text, "\n\n"]
|
481
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
482
|
+
[REXML::Text, "\n\n"]
|
483
|
+
[REXML::Element, <book category='children'> ... </>]
|
484
|
+
[REXML::Text, "\n\n"]
|
485
|
+
[REXML::Element, <book category='web'> ... </>]
|
486
|
+
[REXML::Text, "\n\n"]
|
487
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
488
|
+
[REXML::Text, "\n\n"]
|
489
|
+
|
490
|
+
[Previous Node]
|
491
|
+
|
492
|
+
Use included method REXML::Node#previous_sibling_node
|
493
|
+
(or its alias <tt>previous_sibling</tt>) to retrieve the first preceding node
|
494
|
+
regardless of its class:
|
495
|
+
|
496
|
+
node = doc.root.children[-1]
|
497
|
+
while node do
|
498
|
+
p [node.class, node]
|
499
|
+
node = node.previous_sibling
|
500
|
+
end
|
501
|
+
p node
|
502
|
+
|
503
|
+
Output:
|
504
|
+
|
505
|
+
[REXML::Text, "\n\n"]
|
506
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
507
|
+
[REXML::Text, "\n\n"]
|
508
|
+
[REXML::Element, <book category='web'> ... </>]
|
509
|
+
[REXML::Text, "\n\n"]
|
510
|
+
[REXML::Element, <book category='children'> ... </>]
|
511
|
+
[REXML::Text, "\n\n"]
|
512
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
513
|
+
[REXML::Text, "\n\n"]
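
The difference between the element-only and the any-node sibling methods is easiest to see on a small tree. A sketch, assuming a hypothetical document in which a text node sits between two elements:

```ruby
require 'rexml/document'

# Children of <r>, in order: <a/>, "text", <b/>, <c/>
doc = REXML::Document.new('<r><a/>text<b/><c/></r>')
a = doc.root.elements['a']

p a.next_sibling.class # => REXML::Text
p a.next_element.name  # => "b"
p a.previous_sibling   # => nil
p a.previous_element   # => nil

# Collect the names of <a/> and all following sibling elements.
names = []
ele = a
while ele
  names << ele.name
  ele = ele.next_element
end
p names # => ["a", "b", "c"]
```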
|
514
|
+
|
515
|
+
==== Children
|
516
|
+
|
517
|
+
[Child Count]
|
518
|
+
|
519
|
+
Use inherited method REXML::Parent#size to retrieve the count
|
520
|
+
of nodes (of all types) in the element:
|
521
|
+
|
522
|
+
doc.root.size # => 9
|
523
|
+
|
524
|
+
[Child Nodes]
|
525
|
+
|
526
|
+
Use inherited method REXML::Parent#children to retrieve an array
|
527
|
+
of the child nodes (of all types):
|
528
|
+
|
529
|
+
doc.root.children # =>
|
530
|
+
# ["\n\n",
|
531
|
+
# <book category='cooking'> ... </>,
|
532
|
+
# "\n\n",
|
533
|
+
# <book category='children'> ... </>,
|
534
|
+
# "\n\n",
|
535
|
+
# <book category='web'> ... </>,
|
536
|
+
# "\n\n",
|
537
|
+
# <book category='web' cover='paperback'> ... </>,
|
538
|
+
# "\n\n"]
|
539
|
+
|
540
|
+
[Child at Index]
|
541
|
+
|
542
|
+
Use method REXML::Element#[] to retrieve the child at a given numerical index,
|
543
|
+
or +nil+ if there is no such child:
|
544
|
+
|
545
|
+
doc.root[0] # => "\n\n"
|
546
|
+
doc.root[1] # => <book category='cooking'> ... </>
|
547
|
+
doc.root[7] # => <book category='web' cover='paperback'> ... </>
|
548
|
+
doc.root[8] # => "\n\n"
|
549
|
+
|
550
|
+
doc.root[-1] # => "\n\n"
|
551
|
+
doc.root[-2] # => <book category='web' cover='paperback'> ... </>
|
552
|
+
|
553
|
+
doc.root[50] # => nil
|
554
|
+
|
555
|
+
[Index of Child]
|
556
|
+
|
557
|
+
Use method REXML::Parent#index to retrieve the zero-based child index
|
558
|
+
of the given object, or <tt>#size - 1</tt> if there is no such child:
|
559
|
+
|
560
|
+
ele = doc.root # => <bookstore> ... </>
|
561
|
+
ele.index(ele[0]) # => 0
|
562
|
+
ele.index(ele[1]) # => 1
|
563
|
+
ele.index(ele[7]) # => 7
|
564
|
+
ele.index(ele[8]) # => 8
|
565
|
+
|
566
|
+
ele.index(ele[-1]) # => 8
|
567
|
+
ele.index(ele[-2]) # => 7
|
568
|
+
|
569
|
+
ele.index(ele[50]) # => 8
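
Because +index+ returns <tt>size - 1</tt> for an object that is not a child, its result alone cannot tell "last child" from "not found". The sketch below shows the ambiguity and one possible workaround; +child_index+ is a hypothetical helper written for this example, not part of \REXML:

```ruby
require 'rexml/document'

root = REXML::Document.new('<r>text<a/></r>').root
stray = REXML::Element.new('x') # not a child of root

p root.index(root[1]) # => 1
p root.index(stray)   # => 1 (size - 1, although stray is not a child)

# Hypothetical helper: returns the zero-based index, or nil when absent.
def child_index(parent, node)
  parent.children.index {|child| child.equal?(node) }
end

p child_index(root, root[1]) # => 1
p child_index(root, stray)   # => nil
```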
|
570
|
+
|
571
|
+
[Element Children]
|
572
|
+
|
573
|
+
Use method REXML::Element#has_elements? to retrieve whether the element
|
574
|
+
has element children:
|
575
|
+
|
576
|
+
doc.root.has_elements? # => true
|
577
|
+
REXML::Element.new('foo').has_elements? # => false
|
578
|
+
|
579
|
+
Use method REXML::Element#elements to retrieve the REXML::Elements object
|
580
|
+
containing the element children:
|
581
|
+
|
582
|
+
eles = doc.root.elements
|
583
|
+
eles # => #<REXML::Elements:0x000001ee2848e960 @element=<bookstore> ... </>>
|
584
|
+
eles.size # => 4
|
585
|
+
eles.each {|e| p [e.class, e] }

Output:

[REXML::Element, <book category='cooking'> ... </>]
[REXML::Element, <book category='children'> ... </>]
[REXML::Element, <book category='web'> ... </>]
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
594
|
+
|
595
|
+
Note that while in this example, all the element children of the root element are
|
596
|
+
elements of the same name, <tt>'book'</tt>, that is not true of all documents;
|
597
|
+
a root element (or any other element) may have any mixture of child elements.
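
A sketch of a root element with mixed children, assuming a hypothetical inline document; it contrasts the count of all child nodes with the count of element children and shows lookup by 1-based index and by name:

```ruby
require 'rexml/document'

# Children of <r>, in order: <a/>, "text", <b/>, <a/>
root = REXML::Document.new('<r><a/>text<b/><a/></r>').root

p root.size                        # => 4 (all child nodes)
p root.elements.size               # => 3 (element children only)
p root.elements[1].name            # => "a" (1-based index)
p root.elements['b'].name          # => "b"
p root.elements.to_a('a').size     # => 2
p root.elements.map {|e| e.name }  # => ["a", "b", "a"]
```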
|
598
|
+
|
599
|
+
[CDATA Children]
|
600
|
+
|
601
|
+
Use method REXML::Element#cdatas to retrieve a frozen array of CDATA children:
|
602
|
+
|
603
|
+
my_xml = <<-EOT
|
604
|
+
<root>
|
605
|
+
<![CDATA[foo]]>
|
606
|
+
<![CDATA[bar]]>
|
607
|
+
</root>
|
608
|
+
EOT
|
609
|
+
my_doc = REXML::Document.new(my_xml)
|
610
|
+
cdatas = my_doc.root.cdatas
|
611
|
+
cdatas.frozen? # => true
|
612
|
+
cdatas.map {|cd| cd.class } # => [REXML::CData, REXML::CData]
|
613
|
+
|
614
|
+
[Comment Children]
|
615
|
+
|
616
|
+
Use method REXML::Element#comments to retrieve a frozen array of comment children:
|
617
|
+
|
618
|
+
my_xml = <<-EOT
|
619
|
+
<root>
|
620
|
+
<!--foo-->
|
621
|
+
<!--bar-->
|
622
|
+
</root>
|
623
|
+
EOT
|
624
|
+
my_doc = REXML::Document.new(my_xml)
|
625
|
+
comments = my_doc.root.comments
|
626
|
+
comments.frozen? # => true
|
627
|
+
comments.map {|c| c.class } # => [REXML::Comment, REXML::Comment]
|
628
|
+
comments.map {|c| c.to_s } # => ["foo", "bar"]
|
629
|
+
|
630
|
+
[Processing Instruction Children]
|
631
|
+
|
632
|
+
Use method REXML::Element#instructions to retrieve a frozen array
|
633
|
+
of processing instruction children:
|
634
|
+
|
635
|
+
my_xml = <<-EOT
|
636
|
+
<root>
|
637
|
+
<?target0 foo?>
|
638
|
+
<?target1 bar?>
|
639
|
+
</root>
|
640
|
+
EOT
|
641
|
+
my_doc = REXML::Document.new(my_xml)
|
642
|
+
instrs = my_doc.root.instructions
|
643
|
+
instrs.frozen? # => true
|
644
|
+
instrs.map {|i| i.class } # => [REXML::Instruction, REXML::Instruction]
|
645
|
+
instrs.map {|i| i.to_s } # => ["<?target0 foo?>", "<?target1 bar?>"]
|
646
|
+
|
647
|
+
[Text Children]
|
648
|
+
|
649
|
+
Use method REXML::Element#has_text? to retrieve whether the element
|
650
|
+
has text children:
|
651
|
+
|
652
|
+
doc.root.has_text? # => true
|
653
|
+
REXML::Element.new('foo').has_text? # => false
|
654
|
+
|
655
|
+
Use method REXML::Element#texts to retrieve a frozen array of text children:
|
656
|
+
|
657
|
+
my_xml = '<root><a/>text<b/>more<c/></root>'
|
658
|
+
my_doc = REXML::Document.new(my_xml)
|
659
|
+
texts = my_doc.root.texts
|
660
|
+
texts.frozen? # => true
|
661
|
+
texts.map {|t| t.class } # => [REXML::Text, REXML::Text]
|
662
|
+
texts.map {|t| t.to_s } # => ["text", "more"]
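
The typed accessors can be used together on one element. A sketch, assuming a hypothetical element that mixes a comment, a CDATA section, a processing instruction, and a child element; it also shows that a returned array rejects modification:

```ruby
require 'rexml/document'

xml = '<root><!--note--><![CDATA[a < b]]><?target data?><a/></root>'
root = REXML::Document.new(xml).root

p root.comments.map {|c| c.to_s }       # => ["note"]
p root.cdatas.map {|cd| cd.to_s }       # => ["a < b"]
p root.instructions.map {|i| i.target } # => ["target"]
p root.comments.frozen?                 # => true

# The arrays are frozen snapshots, so appending to one raises an error.
begin
  root.comments << REXML::Comment.new('other')
rescue => e
  p e.class # FrozenError on current Rubies
end
```

To change the element itself, add or remove nodes on the element rather than on these arrays.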
|
663
|
+
|
664
|
+
[Parenthood]
|
665
|
+
|
666
|
+
Use inherited method REXML::Parent#parent? to retrieve whether the element is a parent;
always returns +true+; only REXML::Node#parent? returns +false+.
|
668
|
+
|
669
|
+
doc.root.parent? # => true
|
670
|
+
|
671
|
+
=== Element Attributes
|
672
|
+
|
673
|
+
Use method REXML::Element#has_attributes? to return whether the element
|
674
|
+
has attributes:
|
675
|
+
|
676
|
+
ele = doc.root # => <bookstore> ... </>
|
677
|
+
ele.has_attributes? # => false
|
678
|
+
ele = ele.elements.first # => <book category='cooking'> ... </>
|
679
|
+
ele.has_attributes? # => true
|
680
|
+
|
681
|
+
Use method REXML::Element#attributes to return the hash
|
682
|
+
containing the attributes for the element.
|
683
|
+
Each hash key is a string attribute name;
|
684
|
+
each hash value is an REXML::Attribute object.
|
685
|
+
|
686
|
+
ele = doc.root # => <bookstore> ... </>
|
687
|
+
attrs = ele.attributes # => {}
|
688
|
+
|
689
|
+
ele = ele.elements.first # => <book category='cooking'> ... </>
|
690
|
+
attrs = ele.attributes # => {"category"=>category='cooking'}
|
691
|
+
attrs.size # => 1
|
692
|
+
attr_name = attrs.keys.first # => "category"
|
693
|
+
attr_name.class # => String
|
694
|
+
attr_value = attrs.values.first # => category='cooking'
|
695
|
+
attr_value.class # => REXML::Attribute
|
696
|
+
|
697
|
+
Use method REXML::Element#[] to retrieve the string value for a given attribute,
|
698
|
+
which may be given as either a string or a symbol:
|
699
|
+
|
700
|
+
ele = doc.root.elements.first # => <book category='cooking'> ... </>
|
701
|
+
attr_value = ele['category'] # => "cooking"
|
702
|
+
attr_value.class # => String
|
703
|
+
ele['nosuch'] # => nil
|
704
|
+
|
705
|
+
Use method REXML::Element#attribute to retrieve the REXML::Attribute object for a named attribute, optionally qualified by a namespace:
|
706
|
+
|
707
|
+
my_xml = "<root xmlns:a='a' a:x='a:x' x='x'/>"
|
708
|
+
my_doc = REXML::Document.new(my_xml)
|
709
|
+
my_doc.root.attribute("x") # => x='x'
|
710
|
+
my_doc.root.attribute("x", "a") # => a:x='a:x'
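
The attribute accessors differ mainly in what they return: a string value or an REXML::Attribute object. A sketch over a hypothetical single-element document that also copies the attributes into a plain Hash (the variable +plain+ is an assumption of this example):

```ruby
require 'rexml/document'

ele = REXML::Document.new("<book category='cooking' cover='paperback'/>").root

p ele.has_attributes?             # => true
p ele['category']                 # => "cooking"
p ele.attributes['cover']         # => "paperback"
p ele.attribute('category').class # => REXML::Attribute
p ele.attribute('category').value # => "cooking"
p ele['nosuch']                   # => nil

# Copy attribute names and string values into an ordinary Hash.
plain = {}
ele.attributes.each {|name, value| plain[name] = value }
p plain == {'category' => 'cooking', 'cover' => 'paperback'} # => true
```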
|
711
|
+
|
712
|
+
== Whitespace
|
713
|
+
|
714
|
+
Use method REXML::Element#ignore_whitespace_nodes to determine whether
|
715
|
+
whitespace nodes were ignored when the XML was parsed;
|
716
|
+
returns +true+ if so, +nil+ otherwise.
|
717
|
+
|
718
|
+
Use method REXML::Element#whitespace to determine whether whitespace
|
719
|
+
is respected for the element; returns +true+ if so, +false+ otherwise.
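
A sketch of both methods under different parsing contexts, assuming a small hypothetical document; the context values shown (<tt>:all</tt>) are only one of the forms a context entry may take:

```ruby
require 'rexml/document'

xml = "<root>\n  <a/>\n</root>"

plain = REXML::Document.new(xml)
p plain.root.whitespace              # => true
p plain.root.ignore_whitespace_nodes # => nil

compressed = REXML::Document.new(xml, {compress_whitespace: :all})
p compressed.root.whitespace         # => false

ignoring = REXML::Document.new(xml, {ignore_whitespace_nodes: :all})
p ignoring.root.ignore_whitespace_nodes # => true
p ignoring.root.size                    # => 1 (only <a/> remains)
```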
|
720
|
+
|
721
|
+
== Namespaces
|
722
|
+
|
723
|
+
Use method REXML::Element#namespace to retrieve the string namespace URI
|
724
|
+
for the element, which may derive from one of its ancestors:
|
725
|
+
|
726
|
+
xml_string = <<-EOT
|
727
|
+
<root>
|
728
|
+
<a xmlns='1' xmlns:y='2'>
|
729
|
+
<b/>
|
730
|
+
<c xmlns:z='3'/>
|
731
|
+
</a>
|
732
|
+
</root>
|
733
|
+
EOT
|
734
|
+
d = Document.new(xml_string)
|
735
|
+
b = d.elements['//b']
|
736
|
+
b.namespace # => "1"
|
737
|
+
b.namespace('y') # => "2"
|
738
|
+
b.namespace('nosuch') # => nil
|
739
|
+
|
740
|
+
Use method REXML::Element#namespaces to retrieve a hash of all defined namespaces
|
741
|
+
in the element and its ancestors:
|
742
|
+
|
743
|
+
xml_string = <<-EOT
|
744
|
+
<root>
|
745
|
+
<a xmlns:x='1' xmlns:y='2'>
|
746
|
+
<b/>
|
747
|
+
<c xmlns:z='3'/>
|
748
|
+
</a>
|
749
|
+
</root>
|
750
|
+
EOT
|
751
|
+
d = Document.new(xml_string)
|
752
|
+
d.elements['//a'].namespaces # => {"x"=>"1", "y"=>"2"}
|
753
|
+
d.elements['//b'].namespaces # => {"x"=>"1", "y"=>"2"}
|
754
|
+
d.elements['//c'].namespaces # => {"x"=>"1", "y"=>"2", "z"=>"3"}
|
755
|
+
|
756
|
+
Use method REXML::Element#prefixes to retrieve an array of the string prefixes (names)
|
757
|
+
of all defined namespaces in the element and its ancestors:
|
758
|
+
|
759
|
+
xml_string = <<-EOT
|
760
|
+
<root>
|
761
|
+
<a xmlns:x='1' xmlns:y='2'>
|
762
|
+
<b/>
|
763
|
+
<c xmlns:z='3'/>
|
764
|
+
</a>
|
765
|
+
</root>
|
766
|
+
EOT
|
767
|
+
d = Document.new(xml_string, {compress_whitespace: :all})
|
768
|
+
d.elements['//a'].prefixes # => ["x", "y"]
|
769
|
+
d.elements['//b'].prefixes # => ["x", "y"]
|
770
|
+
d.elements['//c'].prefixes # => ["x", "y", "z"]
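
The examples above use only short placeholder URIs. The following sketch, with hypothetical <tt>urn:example:</tt> URIs, combines a default namespace with a prefixed one and shows how a child element resolves each:

```ruby
require 'rexml/document'

xml = "<root xmlns='urn:example:default' xmlns:x='urn:example:x'><x:item/><plain/></root>"
root = REXML::Document.new(xml).root
item, plain = root[0], root[1]

p item.prefix          # => "x"
p item.namespace       # => "urn:example:x"
p plain.namespace      # => "urn:example:default"
p plain.namespace('x') # => "urn:example:x"
p plain.prefixes       # => ["x"]
```

Note that +prefixes+ lists only prefixed declarations; the default namespace has no prefix and so does not appear.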
|
771
|
+
|
772
|
+
== Traversing
|
773
|
+
|
774
|
+
You can use certain methods to traverse children of the element.
|
775
|
+
Each child that meets given criteria is yielded to the given block.
|
776
|
+
|
777
|
+
[Traverse All Children]
|
778
|
+
|
779
|
+
Use inherited method REXML::Parent#each (or its alias #each_child) to traverse
|
780
|
+
all children of the element:
|
781
|
+
|
782
|
+
doc.root.each {|child| p [child.class, child] }
|
783
|
+
|
784
|
+
Output:
|
785
|
+
|
786
|
+
[REXML::Text, "\n\n"]
|
787
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
788
|
+
[REXML::Text, "\n\n"]
|
789
|
+
[REXML::Element, <book category='children'> ... </>]
|
790
|
+
[REXML::Text, "\n\n"]
|
791
|
+
[REXML::Element, <book category='web'> ... </>]
|
792
|
+
[REXML::Text, "\n\n"]
|
793
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
794
|
+
[REXML::Text, "\n\n"]
|
795
|
+
|
796
|
+
[Traverse Element Children]
|
797
|
+
|
798
|
+
Use method REXML::Element#each_element to traverse only the element children
|
799
|
+
of the element:
|
800
|
+
|
801
|
+
doc.root.each_element {|e| p [e.class, e] }
|
802
|
+
|
803
|
+
Output:
|
804
|
+
|
805
|
+
[REXML::Element, <book category='cooking'> ... </>]
|
806
|
+
[REXML::Element, <book category='children'> ... </>]
|
807
|
+
[REXML::Element, <book category='web'> ... </>]
|
808
|
+
[REXML::Element, <book category='web' cover='paperback'> ... </>]
|
809
|
+
|
810
|
+
[Traverse Element Children with Attribute]
|
811
|
+
|
812
|
+
Use method REXML::Element#each_element_with_attribute with the single argument
|
813
|
+
+attr_name+ to traverse each element child that has the given attribute:
|
814
|
+
|
815
|
+
my_doc = Document.new '<a><b id="1"/><c id="2"/><d id="1"/><e/></a>'
|
816
|
+
my_doc.root.each_element_with_attribute('id') {|e| p [e.class, e] }
|
817
|
+
|
818
|
+
Output:
|
819
|
+
|
820
|
+
[REXML::Element, <b id='1'/>]
|
821
|
+
[REXML::Element, <c id='2'/>]
|
822
|
+
[REXML::Element, <d id='1'/>]
|
823
|
+
|
824
|
+
Use the same method with a second argument +value+ to traverse
|
825
|
+
each element child that has the given attribute and value:
|
826
|
+
|
827
|
+
my_doc.root.each_element_with_attribute('id', '1') {|e| p [e.class, e] }
|
828
|
+
|
829
|
+
Output:
|
830
|
+
|
831
|
+
[REXML::Element, <b id='1'/>]
|
832
|
+
[REXML::Element, <d id='1'/>]
|
833
|
+
|
834
|
+
Use the same method with a third argument +max+ to traverse
|
835
|
+
no more than the given number of element children:
|
836
|
+
|
837
|
+
my_doc.root.each_element_with_attribute('id', '1', 1) {|e| p [e.class, e] }
|
838
|
+
|
839
|
+
Output:
|
840
|
+
|
841
|
+
[REXML::Element, <b id='1'/>]
|
842
|
+
|
843
|
+
Use the same method with a fourth argument +xpath+ to traverse
|
844
|
+
only those element children that match the given xpath:
|
845
|
+
|
846
|
+
my_doc.root.each_element_with_attribute('id', '1', 2, '//d') {|e| p [e.class, e] }
|
847
|
+
|
848
|
+
Output:
|
849
|
+
|
850
|
+
[REXML::Element, <d id='1'/>]
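
Instead of printing, the block can collect results. This sketch reuses the XML from the examples above, gathers element names into arrays (the array names are assumptions of this example), and shows an equivalent selection written as an XPath predicate:

```ruby
require 'rexml/document'

my_doc = REXML::Document.new('<a><b id="1"/><c id="2"/><d id="1"/><e/></a>')
root = my_doc.root

with_id = []
root.each_element_with_attribute('id') {|e| with_id << e.name }
p with_id # => ["b", "c", "d"]

id_one = []
root.each_element_with_attribute('id', '1') {|e| id_one << e.name }
p id_one # => ["b", "d"]

first_only = []
root.each_element_with_attribute('id', '1', 1) {|e| first_only << e.name }
p first_only # => ["b"]

# The same selection as id_one, expressed as an XPath predicate.
p root.elements.to_a("*[@id='1']").map {|e| e.name } # => ["b", "d"]
```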

[Traverse Element Children with Text]

Use method REXML::Element#each_element_with_text with no arguments
to traverse those element children that have text:

  my_doc = Document.new '<a><b>b</b><c>b</c><d>d</d><e/></a>'
  my_doc.root.each_element_with_text {|e| p [e.class, e] }

Output:

  [REXML::Element, <b> ... </>]
  [REXML::Element, <c> ... </>]
  [REXML::Element, <d> ... </>]

Use the same method with the single argument +text+ to traverse
those element children that have exactly that text:

  my_doc.root.each_element_with_text('b') {|e| p [e.class, e] }

Output:

  [REXML::Element, <b> ... </>]
  [REXML::Element, <c> ... </>]

Use the same method with an additional second argument +max+ to traverse
no more than the given number of element children:

  my_doc.root.each_element_with_text('b', 1) {|e| p [e.class, e] }

Output:

  [REXML::Element, <b> ... </>]

Use the same method with an additional third argument +xpath+ to traverse
only those element children that also match the given xpath:

  my_doc.root.each_element_with_text('b', 2, '//c') {|e| p [e.class, e] }

Output:

  [REXML::Element, <c> ... </>]
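
As a small sketch of the same method used to gather names rather than print elements (the arrays +with_text+ and +first_b+ are hypothetical names chosen for this illustration):

```ruby
require 'rexml/document'

my_doc = REXML::Document.new('<a><b>b</b><c>b</c><d>d</d><e/></a>')

# Names of all element children that have any text.
with_text = []
my_doc.root.each_element_with_text { |e| with_text << e.name }

# Names of element children whose text is exactly 'b', stopping after one match.
first_b = []
my_doc.root.each_element_with_text('b', 1) { |e| first_b << e.name }

p with_text # => ["b", "c", "d"]
p first_b   # => ["b"]
```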

[Traverse Element Children's Indexes]

Use inherited method REXML::Parent#each_index to traverse all children's indexes
(not just those of element children):

  doc.root.each_index {|i| print i }

Output:

  012345678
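
To see why text nodes count toward the indexes, here is a minimal sketch on a tiny document that is assumed for illustration only (it is not the bookstore sample used elsewhere):

```ruby
require 'rexml/document'

root = REXML::Document.new('<a>x<b/>y</a>').root

# Pair each child index with the class of the node at that index.
kinds = []
root.each_index { |i| kinds << [i, root[i].class] }

p kinds # => [[0, REXML::Text], [1, REXML::Element], [2, REXML::Text]]
```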

[Traverse Children Recursively]

Use included method REXML::Node#each_recursive to traverse all descendant elements recursively:

  doc.root.each_recursive {|child| p [child.class, child] }

Output:

  [REXML::Element, <book category='cooking'> ... </>]
  [REXML::Element, <title lang='en'> ... </>]
  [REXML::Element, <author> ... </>]
  [REXML::Element, <year> ... </>]
  [REXML::Element, <price> ... </>]
  [REXML::Element, <book category='children'> ... </>]
  [REXML::Element, <title lang='en'> ... </>]
  [REXML::Element, <author> ... </>]
  [REXML::Element, <year> ... </>]
  [REXML::Element, <price> ... </>]
  [REXML::Element, <book category='web'> ... </>]
  [REXML::Element, <title lang='en'> ... </>]
  [REXML::Element, <author> ... </>]
  [REXML::Element, <author> ... </>]
  [REXML::Element, <author> ... </>]
  [REXML::Element, <author> ... </>]
  [REXML::Element, <author> ... </>]
  [REXML::Element, <year> ... </>]
  [REXML::Element, <price> ... </>]
  [REXML::Element, <book category='web' cover='paperback'> ... </>]
  [REXML::Element, <title lang='en'> ... </>]
  [REXML::Element, <author> ... </>]
  [REXML::Element, <year> ... </>]
  [REXML::Element, <price> ... </>]
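
The traversal order is depth-first, in document order. A minimal sketch on a small document assumed for illustration:

```ruby
require 'rexml/document'

root = REXML::Document.new('<r><a><b/></a>text<c/></r>').root

# Record element names in the order each_recursive yields them.
# The text node "text" is skipped because only elements are yielded.
names = []
root.each_recursive { |e| names << e.name }

p names # => ["a", "b", "c"]
```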

== Searching

You can use certain methods to search among the descendants of an element.

Use method REXML::Element#get_elements to retrieve all element children of the element
that match the given +xpath+:

  xml_string = <<-EOT
  <root>
    <a level='1'>
      <a level='2'/>
    </a>
  </root>
  EOT
  d = Document.new(xml_string)
  d.root.get_elements('//a') # => [<a level='1'> ... </>, <a level='2'/>]
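
The result depends on whether the xpath is relative to the receiver. The sketch below contrasts the two cases; the relative path <tt>'a'</tt> and the variable names are illustrative additions, not taken from the tutorial:

```ruby
require 'rexml/document'

d = REXML::Document.new("<root><a level='1'><a level='2'/></a></root>")

# '//a' matches <a> elements anywhere in the document.
all_levels = d.root.get_elements('//a').map { |e| e.attributes['level'] }

# 'a' is relative to the receiver, so it matches only direct children named <a>.
child_levels = d.root.get_elements('a').map { |e| e.attributes['level'] }

p all_levels   # => ["1", "2"]
p child_levels # => ["1"]
```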

Use method REXML::Element#get_text with no argument to retrieve the first text node
that is a child of the element:

  my_doc = Document.new "<p>some text <b>this is bold!</b> more text</p>"
  text_node = my_doc.root.get_text
  text_node.class # => REXML::Text
  text_node.to_s # => "some text "

Use the same method with argument +xpath+ to retrieve the first text node
in the first child element that matches the xpath
(an integer argument selects the element child at that 1-based index):

  my_doc.root.get_text(1) # => "this is bold!"

Use method REXML::Element#text with no argument to retrieve the text
from the first text node that is a child of the element:

  my_doc = Document.new "<p>some text <b>this is bold!</b> more text</p>"
  text = my_doc.root.text
  text.class # => String
  text # => "some text "

Use the same method with argument +xpath+ to retrieve the text from the first text node
in the first child element that matches the xpath:

  my_doc.root.text(1) # => "this is bold!"
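
The difference between the two methods is the return type: a node versus a plain string. A minimal sketch, using the element name <tt>'b'</tt> as the xpath (an assumption made here for readability; the tutorial uses an index):

```ruby
require 'rexml/document'

my_doc = REXML::Document.new('<p>some text <b>this is bold!</b> more text</p>')
root = my_doc.root

node = root.get_text # a REXML::Text node
string = root.text   # a plain String

p node.class              # => REXML::Text
p string.class            # => String
p root.get_text('b').to_s # => "this is bold!"
p root.text('b')          # => "this is bold!"
p root.text('nosuch')     # => nil
```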

Use included method REXML::Node#find_first_recursive
to retrieve the first descendant element
for which the given block returns a truthy value, or +nil+ if none:

  doc.root.find_first_recursive do |ele|
    ele.name == 'price'
  end # => <price> ... </>
  doc.root.find_first_recursive do |ele|
    ele.name == 'nosuch'
  end # => nil
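
The block may test anything about the element, not only its name. A short sketch on a document assumed for illustration:

```ruby
require 'rexml/document'

root = REXML::Document.new("<r><a><b id='x'/></a><c id='y'/></r>").root

# First descendant element, in document order, that carries an 'id' attribute.
found = root.find_first_recursive { |e| e.attributes['id'] }

# No element is named 'nosuch', so the result is nil.
missing = root.find_first_recursive { |e| e.name == 'nosuch' }

p found.name # => "b"
p missing    # => nil
```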

== Editing

=== Editing a Document

[Creating a Document]

Create a new document with method REXML::Document::new:

  doc = Document.new(source_string)
  empty_doc = REXML::Document.new

[Adding to the Document]

Add an XML declaration with method REXML::Document#add
and an argument of type REXML::XMLDecl:

  my_doc = Document.new
  my_doc.xml_decl.to_s # => ""
  my_doc.add(XMLDecl.new('2.0'))
  my_doc.xml_decl.to_s # => "<?xml version='2.0'?>"

Add a document type with method REXML::Document#add
and an argument of type REXML::DocType:

  my_doc = Document.new
  my_doc.doctype.to_s # => ""
  my_doc.add(DocType.new('foo'))
  my_doc.doctype.to_s # => "<!DOCTYPE foo>"

Add a node of any other REXML type with method REXML::Document#add and an argument
that is not of type REXML::XMLDecl or REXML::DocType:

  my_doc = Document.new
  my_doc.add(Element.new('foo'))
  my_doc.to_s # => "<foo/>"

Add an existing element as the root element with method REXML::Document#add_element:

  ele = Element.new('foo')
  my_doc = Document.new
  my_doc.add_element(ele)
  my_doc.root # => <foo/>

Create and add an element as the root element with method REXML::Document#add_element:

  my_doc = Document.new
  my_doc.add_element('foo')
  my_doc.root # => <foo/>
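
These steps can be combined to build a document from scratch. The following is a minimal sketch; the root element name +catalog+ and the explicit <tt>'UTF-8'</tt> encoding argument are assumptions chosen for the example:

```ruby
require 'rexml/document'

doc = REXML::Document.new
doc.add(REXML::XMLDecl.new('1.0', 'UTF-8')) # XML declaration
doc.add_element('catalog')                  # becomes the root element

p doc.xml_decl.version # => "1.0"
p doc.root.name        # => "catalog"
puts doc.to_s
```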

=== Editing an Element

==== Creating an Element

Create a new element with method REXML::Element::new:

  ele = Element.new('foo') # => <foo/>

==== Setting Element Properties

Set the context for an element with method REXML::Element#context=
(see {Element Context}[../context_rdoc.html]):

  ele.context # => nil
  ele.context = {ignore_whitespace_nodes: :all}
  ele.context # => {:ignore_whitespace_nodes=>:all}

Set the parent for an element with inherited method REXML::Child#parent=:

  ele.parent # => nil
  ele.parent = Element.new('bar')
  ele.parent # => <bar/>

Set the text for an element with method REXML::Element#text=:

  ele.text # => nil
  ele.text = 'bar'
  ele.text # => "bar"

==== Adding to an Element

Add a node as the last child with inherited method REXML::Parent#add (or its alias #push):

  ele = Element.new('foo') # => <foo/>
  ele.push(Text.new('bar'))
  ele.push(Element.new('baz'))
  ele.children # => ["bar", <baz/>]

Add a node as the first child with inherited method REXML::Parent#unshift:

  ele = Element.new('foo') # => <foo/>
  ele.unshift(Element.new('bar'))
  ele.unshift(Text.new('baz'))
  ele.children # => ["baz", <bar/>]

Add an element as the last child with method REXML::Element#add_element:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_element(Element.new('baz'))
  ele.children # => [<bar/>, <baz/>]

Add a text node as the last child with method REXML::Element#add_text:

  ele = Element.new('foo') # => <foo/>
  ele.add_text('bar')
  ele.add_text(Text.new('baz'))
  ele.children # => ["bar", "baz"]

Insert a node before a given node with method REXML::Parent#insert_before:

  ele = Element.new('foo') # => <foo/>
  ele.add_text('bar')
  ele.add_text(Text.new('baz'))
  ele.children # => ["bar", "baz"]
  target = ele[1] # => "baz"
  ele.insert_before(target, Text.new('bat'))
  ele.children # => ["bar", "bat", "baz"]

Insert a node after a given node with method REXML::Parent#insert_after:

  ele = Element.new('foo') # => <foo/>
  ele.add_text('bar')
  ele.add_text(Text.new('baz'))
  ele.children # => ["bar", "baz"]
  target = ele[0] # => "bar"
  ele.insert_after(target, Text.new('bat'))
  ele.children # => ["bar", "bat", "baz"]

Add an attribute with method REXML::Element#add_attribute:

  ele = Element.new('foo') # => <foo/>
  ele.add_attribute('bar', 'baz')
  ele.add_attribute(Attribute.new('bat', 'bam'))
  ele.attributes # => {"bar"=>bar='baz', "bat"=>bat='bam'}

Add multiple attributes with method REXML::Element#add_attributes:

  ele = Element.new('foo') # => <foo/>
  ele.add_attributes({'bar' => 'baz', 'bat' => 'bam'})
  ele.add_attributes([['ban', 'bap'], ['bah', 'bad']])
  ele.attributes # => {"bar"=>bar='baz', "bat"=>bat='bam', "ban"=>ban='bap', "bah"=>bah='bad'}

Add a namespace with method REXML::Element#add_namespace:

  ele = Element.new('foo') # => <foo/>
  ele.add_namespace('bar')
  ele.add_namespace('baz', 'bat')
  ele.namespaces # => {"xmlns"=>"bar", "baz"=>"bat"}
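
Several of these methods are typically chained to build a small subtree. The sketch below is one way to do it; the element and attribute names are illustrative, and passing an attributes hash as the second argument to +add_element+ is used here as a convenience:

```ruby
require 'rexml/document'

book = REXML::Element.new('book')
book.add_attribute('category', 'web')

# add_element returns the new child, so it can be populated directly.
title = book.add_element('title', {'lang' => 'en'})
title.add_text('Learning XML')

price = book.add_element('price')
price.text = '39.95'

puts book.to_s
# <book category='web'><title lang='en'>Learning XML</title><price>39.95</price></book>
```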

==== Deleting from an Element

Delete a specific child object with inherited method REXML::Parent#delete:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.children # => [<bar/>, "baz"]
  target = ele[1] # => "baz"
  ele.delete(target) # => "baz"
  ele.children # => [<bar/>]
  target = ele[0] # => <bar/>
  ele.delete(target) # => <bar/>
  ele.children # => []

Delete a child at a specific index with inherited method REXML::Parent#delete_at:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.children # => [<bar/>, "baz"]
  ele.delete_at(1)
  ele.children # => [<bar/>]
  ele.delete_at(0)
  ele.children # => []

Delete all children meeting a specified criterion with inherited method
REXML::Parent#delete_if:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children # => [<bar/>, "baz", <bat/>, "bam"]
  ele.delete_if {|child| child.instance_of?(Text) }
  ele.children # => [<bar/>, <bat/>]

Delete an element at a specific 1-based index with method REXML::Element#delete_element:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children # => [<bar/>, "baz", <bat/>, "bam"]
  ele.delete_element(2) # => <bat/>
  ele.children # => [<bar/>, "baz", "bam"]
  ele.delete_element(1) # => <bar/>
  ele.children # => ["baz", "bam"]

Delete a specific element with the same method:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children # => [<bar/>, "baz", <bat/>, "bam"]
  target = ele.elements[2] # => <bat/>
  ele.delete_element(target) # => <bat/>
  ele.children # => [<bar/>, "baz", "bam"]

Delete an element matching an xpath using the same method:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children # => [<bar/>, "baz", <bat/>, "bam"]
  ele.delete_element('./bat') # => <bat/>
  ele.children # => [<bar/>, "baz", "bam"]
  ele.delete_element('./bar') # => <bar/>
  ele.children # => ["baz", "bam"]

Delete an attribute by name with method REXML::Element#delete_attribute:

  ele = Element.new('foo') # => <foo/>
  ele.add_attributes({'bar' => 'baz', 'bam' => 'bat'})
  ele.attributes # => {"bar"=>bar='baz', "bam"=>bam='bat'}
  ele.delete_attribute('bam')
  ele.attributes # => {"bar"=>bar='baz'}

Delete a namespace with method REXML::Element#delete_namespace:

  ele = Element.new('foo') # => <foo/>
  ele.add_namespace('bar')
  ele.add_namespace('baz', 'bat')
  ele.namespaces # => {"xmlns"=>"bar", "baz"=>"bat"}
  ele.delete_namespace('xmlns')
  ele.namespaces # => {"baz"=>"bat"}
  ele.delete_namespace('baz')
  ele.namespaces # => {}

Remove an element from its parent with inherited method REXML::Child#remove:

  ele = Element.new('foo') # => <foo/>
  parent = Element.new('bar') # => <bar/>
  parent.add_element(ele) # => <foo/>
  parent.children.size # => 1
  ele.remove # => <foo/>
  ele.parent # => nil
  parent.children.size # => 0
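
One common use of +delete_if+ is to strip whitespace-only text nodes from a parsed element. This is a sketch under the assumption that the document was parsed with the default context, which keeps such nodes; the sample markup is illustrative:

```ruby
require 'rexml/document'

root = REXML::Document.new('<a> <b/> <c/> </a>').root
before = root.children.size

# Drop text nodes that contain only whitespace.
root.delete_if do |child|
  child.is_a?(REXML::Text) && child.to_s.strip.empty?
end

p before             # => 5
p root.children.size # => 2
p root.to_s          # => "<a><b/><c/></a>"
```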

==== Replacing Nodes

Replace the node at a given 0-based index with inherited method REXML::Parent#[]=:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children # => [<bar/>, "baz", <bat/>, "bam"]
  ele[2] = Text.new('bad') # => "bad"
  ele.children # => [<bar/>, "baz", "bad", "bam"]

Replace a given node with another node with inherited method REXML::Parent#replace_child:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children # => [<bar/>, "baz", <bat/>, "bam"]
  target = ele[2] # => <bat/>
  ele.replace_child(target, Text.new('bah'))
  ele.children # => [<bar/>, "baz", "bah", "bam"]

Replace +self+ with a given node with inherited method REXML::Child#replace_with:

  ele = Element.new('foo') # => <foo/>
  ele.add_element('bar')
  ele.add_text('baz')
  ele.add_element('bat')
  ele.add_text('bam')
  ele.children # => [<bar/>, "baz", <bat/>, "bam"]
  target = ele[2] # => <bat/>
  target.replace_with(Text.new('bah'))
  ele.children # => [<bar/>, "baz", "bah", "bam"]
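
After a replacement, the old node is detached from the tree. A minimal sketch on a parsed document (the element names +old+, +new+, and +keep+ are made up for the example):

```ruby
require 'rexml/document'

root = REXML::Document.new('<list><old/><keep/></list>').root
old = root.elements['old']

root.replace_child(old, REXML::Element.new('new'))

p root.to_s  # => "<list><new/><keep/></list>"
p old.parent # => nil
```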

=== Cloning

Create a shallow clone of an element with method REXML::Element#clone.
The clone contains the name and attributes, but not the parent or children:

  ele = Element.new('foo')
  ele.add_attributes({'bar' => 0, 'baz' => 1})
  ele.clone # => <foo bar='0' baz='1'/>

Create a shallow clone of a document with method REXML::Document#clone.
The XML declaration is copied; the document type and root element are not cloned:

  my_xml = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE foo><root/>'
  my_doc = Document.new(my_xml)
  clone_doc = my_doc.clone

  my_doc.xml_decl # => <?xml ... ?>
  clone_doc.xml_decl # => <?xml ... ?>

  my_doc.doctype.to_s # => "<!DOCTYPE foo>"
  clone_doc.doctype.to_s # => ""

  my_doc.root # => <root/>
  clone_doc.root # => nil

Create a deep clone of an element with inherited method REXML::Parent#deep_clone.
All nodes and attributes are copied:

  doc.to_s.size # => 825
  clone = doc.deep_clone
  clone.to_s.size # => 825
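
The practical difference between the two kinds of clone shows up when the copy is edited. The following sketch assumes a small element parsed from a string; the names are illustrative:

```ruby
require 'rexml/document'

original = REXML::Document.new("<foo id='1'><bar>text</bar></foo>").root

shallow = original.clone   # name and attributes only
deep = original.deep_clone # children are copied as well

# Editing the deep clone leaves the original untouched.
deep.elements['bar'].text = 'changed'

p shallow.to_s  # => "<foo id='1'/>"
p deep.to_s     # => "<foo id='1'><bar>changed</bar></foo>"
p original.to_s # => "<foo id='1'><bar>text</bar></foo>"
```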

== Writing the Document

Write a document to an \IO stream (defaults to <tt>$stdout</tt>)
with method REXML::Document#write:

  doc.write

Output:

  <?xml version='1.0' encoding='UTF-8'?>
  <bookstore>

  <book category='cooking'>
    <title lang='en'>Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>

  <book category='children'>
    <title lang='en'>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>

  <book category='web'>
    <title lang='en'>XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>

  <book category='web' cover='paperback'>
    <title lang='en'>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>

  </bookstore>
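
To capture the output instead of printing it, pass any object that accepts <tt><<</tt>, such as a StringIO. This is a minimal sketch; the sample XML is shortened for illustration, and the second positional argument (an indent width) is assumed to request pretty-printed output:

```ruby
require 'rexml/document'
require 'stringio'

xml = "<?xml version='1.0' encoding='UTF-8'?>" \
      "<bookstore><book><title>Learning XML</title></book></bookstore>"
doc = REXML::Document.new(xml)

# Write to an in-memory stream instead of $stdout.
io = StringIO.new
doc.write(io)
compact = io.string

# A non-negative second argument requests indented output.
pretty_io = StringIO.new
doc.write(pretty_io, 2)
pretty = pretty_io.string

puts compact
puts pretty
```

The exact whitespace of the indented form depends on the formatter, so it is printed here rather than asserted.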