XMLROCS 0.0.1

data/LICENSE ADDED
@@ -0,0 +1,28 @@
1
+
2
+ Copyright (c) 2008, Andreas Meingast, <ameingast@gmail.com>, http://yomi.at
3
+
4
+ All rights reserved.
5
+
6
+ Redistribution and use in source and binary forms, with or without
7
+ modification, are permitted provided that the following conditions are met:
8
+
9
+ * Redistributions of source code must retain the above copyright notice,
10
+ this list of conditions and the following disclaimer.
11
+ * Redistributions in binary form must reproduce the above copyright
12
+ notice, this list of conditions and the following disclaimer in the
13
+ documentation and/or other materials provided with the distribution.
14
+ * Neither the name of the person nor the names of its
15
+ contributors may be used to endorse or promote products derived
16
+ from this software without specific prior written permission.
17
+
18
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
22
+ CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
23
+ EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
24
+ PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
25
+ PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
26
+ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
27
+ NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
28
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README ADDED
@@ -0,0 +1,207 @@
1
+ = About
2
+
3
+ XMLROCS is short for XML Ruby ObjeCtS. It is a library that lets you
4
+ map XML data to Ruby objects.
5
+
6
+ XMLROCS is kind of a poor man's DOM. It provides basic access to attributes
7
+ and child elements so you can comfortably work with XML data.
8
+ Generally speaking, you can manipulate your XML data in true Ruby OO style.
9
+
10
+ == Creating an XMLROC
11
+
12
+ The XML data below is used in the following examples:
13
+
14
+ xml = <<-EOS
15
+ <products name="Computers">
16
+ This is a mixed content for the following products:
17
+ <product id="2">Dell</product>
18
+ <product id="1">Acer</product>
19
+ <product id="3">Apple</product>
20
+ Another text.
21
+ <product id="4">HP</product>
22
+ </products>
23
+ EOS
24
+
25
+ o = XMLROCS::XMLNode.new :text => xml # => produces an XMLROC
26
+
27
+ If you already have an REXML::Document flying around, you can do the following:
28
+
29
+ rexml_document = REXML::Document.new(xml)
30
+ o = XMLROCS::XMLNode.new :root => rexml_document.root
31
+
32
+
33
+ == Accessing and Modifying Attributes
34
+
35
+ All attributes are available via the []-operator. Attribute names are stored
36
+ as symbols, so the []-operator behaves pretty much like a Hash with symbol-keys.
37
+ Let's have a look at it:
38
+
39
+ o[:name] # => Computers
40
+ o[:i_m_not_there] # => nil
41
+
42
+ You can also modify attributes:
43
+
44
+ o[:name] = o[:name].reverse # => sretupmoC
45
+
46
+ which is pretty much equivalent to:
47
+
48
+ o[:name].reverse! # => sretupmoC
49
+
50
+ The other way to handle attributes is to access the @attributes accessor, which
51
+ acts as a hash with the following format:
52
+
53
+ { :attribute_name => "attribute_value" }
54
+
55
+ Just like with the []-operator you can also modify the accessor-variable.
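The difference between assigning through []= and mutating with reverse! can be sketched without the library. The following is a minimal illustration, assuming a hypothetical AttributeBag class that only mimics the symbol-keyed attribute storage described above; it is not the XMLROCS implementation.

```ruby
# Hypothetical stand-in for a node's attribute storage; not the XMLROCS class.
class AttributeBag
  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes
  end

  # Symbol-keyed read access; unknown names yield nil.
  def [](name)
    @attributes[name]
  end

  # Symbol-keyed write access.
  def []=(name, value)
    @attributes[name] = value
  end
end

bag = AttributeBag.new({ :name => "Computers".dup })
original = bag[:name]

bag[:name] = bag[:name].reverse      # stores a new String under :name
puts bag[:name].equal?(original)     # false: the stored object was swapped

current = bag[:name]
bag[:name].reverse!                  # mutates the stored String in place
puts bag[:name].equal?(current)      # true: still the same object
puts bag[:name]                      # reversed twice, so "Computers" again
```

Both forms leave the same text behind; they differ only in whether other references to the old String see the change.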
56
+
57
+
58
+ == Accessing and Modifying Children
59
+
60
+ Children can be accessed via instance methods or the @children accessor.
61
+ Just like the @attributes accessor, the @children accessor has the following
62
+ format:
63
+
64
+ { :child_name => [ XMLNode, ... ] }
65
+
66
+ :child_name is the tag name of the child, while [ XMLNode, ... ] is an array
67
+ containing all children with the name.
68
+ Still, this solution is mainly used for internal representation and it is
69
+ pretty tedious to work with, because you _always_ have to handle the array,
70
+ even when you know that there is only one child (probably guaranteed by some
71
+ DTD or XSD).
72
+ Instance methods solve this problem by ALWAYS returning a direct XMLNode.
73
+ If there are n > 1 children with the same name, the instance method will
74
+ return the last (with respect to order of appearance in the XML data) element.
75
+ You can access all children with the same name by appending a '!'
76
+ to the instance_method just like this:
77
+
78
+ o.tag_name! # => [ XMLNode, ... ]
79
+
80
+ Some other examples:
81
+
82
+ # Determine the id of the Apple-product
83
+ o.product!.select { |x| x == "Apple" }.first[:id] # => "3"
84
+
85
+ # Select all products whose name is not empty
86
+ o.product!.select { |x| not x.empty? } # => ["Dell", "Acer", "Apple", "HP"]
87
+
88
+ # Determine the name of the product with id == "3"
89
+ o.product!.select { |x| x[:id] == "3" }.first # => "Apple"
90
+
91
+ # produce a hashmap { id => productname }
92
+ o.product!.inject({}) { |h,x| h.merge({ x[:id] => x }) } # => {"1"=>"Acer", "2"=>"Dell", "3"=>"Apple", "4"=>"HP"}
93
+
94
+ # sort by product id
95
+ o.product!.sort { |a,b| a[:id] <=> b[:id] } # => ["Acer", "Dell", "Apple", "HP"]
96
+
97
+ # get the last product (wrt order of appearance in the xml text)
98
+ o.product # => HP
99
+
100
+ # apply some changes and write back the xml
101
+ o.product!.each { |x| x[:id] = "#{x[:id].to_i * 10}" }
102
+ o.to_xml # =>
103
+ '<products name="Computers">
104
+ <product id="20">Dell</product>
105
+ <product id="10">Acer</product>
106
+ <product id="30">Apple</product>
107
+ <product id="40">HP</product>
108
+ </products>'
109
+
110
+ o.product!.each { |x| x.set_text(x.reverse) }
111
+ o.to_xml # =>
112
+ '<products name="Computers">
113
+ <product id="20">lleD</product>
114
+ <product id="10">recA</product>
115
+ <product id="30">elppA</product>
116
+ <product id="40">PH</product>
117
+ </products>'
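The name / name! lookup described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions: TinyNode, its text reader and its chaining << are hypothetical and exist only for this example (the library's own << returns the appended child instead of the node).

```ruby
# Hypothetical miniature of the lookup scheme; names are illustrative only.
class TinyNode
  attr_reader :name, :text, :children

  def initialize(name, text = "")
    @name = name
    @text = text
    @children = {}            # { :child_name => [TinyNode, ...] }
  end

  # Append a child under its tag name and return self for chaining.
  def <<(child)
    (@children[child.name] ||= []) << child
    self
  end

  # node.product  -> last child named :product
  # node.product! -> all children named :product
  def method_missing(method, *args)
    method_name = method.to_s
    if method_name.end_with?("!")
      key = method_name.chomp("!").to_sym
      return @children[key] if @children.key?(key)
    elsif @children.key?(method)
      return @children[method].last
    end
    super
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(method, include_private = false)
    @children.key?(method.to_s.chomp("!").to_sym) || super
  end
end

root = TinyNode.new(:products)
root << TinyNode.new(:product, "Dell") << TinyNode.new(:product, "HP")

puts root.product.text                         # HP
puts root.product!.map { |n| n.text }.inspect  # ["Dell", "HP"]
```

Unknown names still fall through to super, so a typo raises NoMethodError instead of silently returning nil.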
118
+
119
+ == Appending and Removing Children and Attributes
120
+
121
+ To add attributes, just add a new entry to the attributes accessor:
122
+
123
+ o[:short_name] = "comp"
124
+ puts o.to_xml # =>
125
+ <products name="Computers" short_name="comp">
126
+ ...
127
+ </products>
128
+
129
+ To remove an attribute, call the delete_attribute! method:
130
+
131
+ o.delete_attribute!(:short_name)
132
+ puts o.to_xml # =>
133
+ <products name="Computers">
134
+ ...
135
+ </products>
136
+
137
+
138
+ To add children, call the << operator with an XMLNode as argument.
139
+
140
+ # create the child node
141
+ child = XMLROCS::XMLNode.new(:text => '<product id="5">Fujitsu</product>')
142
+ # add it to the o-node
143
+ o << child
144
+ puts o.product!.select { |x| x[:id] == "5" }.first # => Fujitsu
145
+
146
+ To remove a child, call the >> operator with the childname as argument.
147
+ You can also provide a block to provide a more fine-grained filter.
148
+ The block will then be called with a child XMLNode as an argument.
149
+
150
+ # remove all children with id == "5"
151
+ o.>> { |child| child[:id] == "5" }
152
+
153
+ Whenever you provide a block, you have to bind the >> operator to
154
+ the object, because of the lower precedence of the infix call.
155
+
156
+ # remove all product children with id == "5"
157
+ o.>>(:product) { |child| child[:id] == "5" }
158
+
159
+ # remove all children called :product
160
+ o >> :product
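Why the block form needs the explicit method-call syntax can be shown with a small stand-in. This is a sketch only: ChildList and its plain-hash children are hypothetical and merely imitate the behaviour described above, with one deliberate difference (>> returns self here).

```ruby
# Hypothetical container that imitates the >> behaviour described above.
class ChildList
  attr_reader :children

  def initialize(children)
    @children = children       # { :tag => [ { :id => "..." }, ... ] }
  end

  # Without a block: drop every child stored under tag.
  # With a block: drop only children for which the block returns true,
  # optionally restricted to tag.
  def >>(tag = nil)
    if block_given?
      @children.each do |key, list|
        next if tag && key != tag
        list.reject! { |child| yield(child) }
      end
    else
      @children.delete(tag)
    end
    self
  end
end

list = ChildList.new({ :product => [{ :id => "4" }, { :id => "5" }],
                       :note    => [{ :id => "5" }] })

# Explicit method-call syntax lets the block bind to >>.
list.>>(:product) { |child| child[:id] == "5" }
puts list.children[:product].length   # 1

# The infix form is fine when no block is involved.
list >> :note
puts list.children.keys.inspect       # [:product]
```

An infix operator takes exactly one right-hand operand, so there is no place in the expression for a block to attach; calling >> like a regular method restores that slot.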
161
+
162
+ == Walking over the XML-Tree Structure
163
+
164
+ All iterating is done in Preorder, but you can override the default behaviour
165
+ by setting the traversal accessor.
166
+
167
+ Currently the library accepts the following traversals (in this version, :inorder and :postorder still fall back to preorder):
168
+
169
+ o.traversal = :preorder # default
170
+ o.traversal = :inorder
171
+ o.traversal = :postorder
172
+
173
+ You can inject the XML Tree with a left associative function, and map
174
+ over the tree by calling map or map! which will not produce an Array but
175
+ another XML Tree.
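For readers unfamiliar with the traversal names, here is a minimal sketch of preorder and postorder on a generic tree. TreeNode is a hypothetical Struct used only for this illustration; inorder is left out because it has no single agreed definition for nodes with an arbitrary number of children.

```ruby
# Hypothetical n-ary tree node used only for this illustration.
TreeNode = Struct.new(:label, :kids) do
  # Visit the node first, then each subtree from left to right.
  def preorder
    kids.inject([label]) { |acc, kid| acc + kid.preorder }
  end

  # Visit each subtree from left to right, then the node itself.
  def postorder
    kids.inject([]) { |acc, kid| acc + kid.postorder } + [label]
  end
end

tree = TreeNode.new(:products, [
  TreeNode.new(:product, [TreeNode.new(:name, [])]),
  TreeNode.new(:note, [])
])

puts tree.preorder.inspect    # [:products, :product, :name, :note]
puts tree.postorder.inspect   # [:name, :product, :note, :products]
```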
176
+
177
+ == Aggregating
178
+
179
+ This library does not provide direct support for XPath, but you can emulate
180
+ its behaviour with iterators and closures.
181
+
182
+ Suppose you want to select all products whose id is smaller than 5. In
183
+ XPath you probably would come up with something like this:
184
+
185
+ //*[@id < 5]
186
+
187
+ With XMLROCS there is no need for XPaths, because you can do the very same
188
+ thing easily in Ruby:
189
+
190
+ o.select { |node| node[:id] && node[:id].to_i < 5 }
191
+
192
+ Here's another one that builds an Array of all leaves:
193
+
194
+ o.select { |node| node.leaf? }
195
+
196
+ which is equivalent to:
197
+
198
+ o.select { |node| node.children.empty? }
199
+
200
+ Here's another one. Select all nodes that have no attributes:
201
+
202
+ o.select { |node| node.attributes.empty? }
203
+
204
+ It's actually quite handy, since your closure gets called on all children
205
+ and the object itself and then selects which elements you want to keep.
206
+ The big advantage here is that you can actually do anything in the closure
207
+ that Ruby can do.
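The queries above rely on nothing more than an each method plus Enumerable. A minimal, self-contained sketch of that pattern follows; QueryNode is a hypothetical class written for this example and only assumes the preorder iteration described earlier.

```ruby
# Hypothetical node type showing how each + Enumerable yields select, count, ...
class QueryNode
  include Enumerable

  attr_reader :attributes, :children

  def initialize(attributes = {}, children = [])
    @attributes = attributes
    @children = children
  end

  def leaf?
    @children.empty?
  end

  # Yield this node, then every descendant in preorder.
  def each(&block)
    return enum_for(:each) unless block
    block.call(self)
    @children.each { |child| child.each(&block) }
    self
  end
end

root = QueryNode.new({ :name => "Computers" }, [
  QueryNode.new({ :id => "3" }),
  QueryNode.new({ :id => "7" })
])

small = root.select { |node| node.attributes[:id] && node.attributes[:id].to_i < 5 }
puts small.map { |node| node.attributes[:id] }.inspect     # ["3"]
puts root.select { |node| node.leaf? }.length              # 2
puts root.select { |node| node.attributes.empty? }.length  # 0
```

Because the filter is an ordinary block, conditions that are awkward in a path expression (string manipulation, lookups in other data structures) need no extra syntax.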
data/TODO ADDED
@@ -0,0 +1,15 @@
1
+ = TODO
2
+
3
+ == Typing
4
+
5
+ Introduce some kind of typing for text values, so you can say:
6
+
7
+ o = XMLROCS::XMLNode.new :text => some_text
8
+ o.type :child => Integer
9
+ o.type :child => { :attribute => Integer }
10
+ o.type :child => { :attribute => Array }
11
+
12
+
13
+ == Performance
14
+
15
+ The initialization of a whole XMLNode tree is slow, because every element is built by a recursive constructor call. Improve it.
data/lib/xmlrocs.rb ADDED
@@ -0,0 +1,327 @@
1
+ #
2
+ # Copyright (c) 2008, Andreas Meingast, <ameingast@gmail.com>, http://yomi.at
3
+ #
4
+ # All rights reserved.
5
+ #
6
+ # Redistribution and use in source and binary forms, with or without
7
+ # modification, are permitted provided that the following conditions are met:
8
+ #
9
+ # * Redistributions of source code must retain the above copyright notice,
10
+ # this list of conditions and the following disclaimer.
11
+ # * Redistributions in binary form must reproduce the above copyright
12
+ # notice, this list of conditions and the following disclaimer in the
13
+ # documentation and/or other materials provided with the distribution.
14
+ # * Neither the name of the person nor the names of its
15
+ # contributors may be used to endorse or promote products derived
16
+ # from this software without specific prior written permission.
17
+ #
18
+ # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19
+ # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20
+ # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21
+ # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
22
+ # CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
23
+ # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
24
+ # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
25
+ # PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
26
+ # LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
27
+ # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
28
+ # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29
+ #
30
+
31
+ require 'rexml/document'
32
+
33
+ #
34
+ # For more information, have a look at the README or the XMLNode class
35
+ # documentation.
36
+ #
37
+ module XMLROCS
38
+
39
+ #
40
+ # Represents an XML Element. You can access and modify its children
41
+ # and attributes.
42
+ #
43
+ # Each XMLNode has exactly one parent (that happens to be nil if the node
44
+ # is the root of the tree) and a (possibly empty) list of children.
45
+ #
46
+ # It also has a name (the name of the XML tag) that can be modified.
47
+ #
48
+ # You can compare two XMLNodes or an XMLNode and a String. When you provide
49
+ # a String, only the text value of the XMLNode is compared.
50
+ #
51
+ # You can enumerate the XMLNode Tree in the traversal-order defined
52
+ # in the @traversal accessor.
53
+ #
54
+ # For more information have a look at the README or instance method
55
+ # documentation.
56
+ #
57
+ class XMLNode < String
58
+ include Comparable
59
+ include Enumerable
60
+
61
+ #
62
+ # Hash containing the children of the current Node. The Hash has the
63
+ # following
64
+ # structure:
65
+ #
66
+ # { :childname => [ XMLNode, ... ] }
67
+ #
68
+ # You can either use the >> or << operators to modify children or use
69
+ # this accessor directly.
70
+ attr_reader :children
71
+
72
+ #
73
+ # Hash containing the attributes of the current Node:
74
+ #
75
+ # { :attribute_name => "Attribute" }
76
+ #
77
+ # Attributes can also be modified using this accessor.
78
+ attr_reader :attributes
79
+
80
+ #
81
+ # The parent of the current Node. The parent of the Root-Node is nil.
82
+ #
83
+ attr_reader :parent
84
+
85
+ #
86
+ # The name of the current Node.
87
+ #
88
+ # Example:
89
+ # x = XMLNode.new :text => '<a></a>'
90
+ # x.xmlname # => :a
91
+ #
92
+ attr_reader :xmlname
93
+
94
+ #
95
+ # Defines the order in which the tree is traversed.
96
+ #
97
+ # The following traversals are supported:
98
+ # :preorder
99
+ # :postorder
100
+ # :inorder
101
+ #
102
+ attr_accessor :traversal
103
+
104
+ #
105
+ # Create a new XMLNode
106
+ # You have to either provide an REXML::Element as options[:root] or
107
+ # plaintext xml data as options[:text]. Otherwise an ArgumentError will
108
+ # be thrown.
109
+ #
110
+ # Supported options:
111
+ # :root # => REXML::Element that will be traversed.
112
+ # :text # => XML plaintext that will be parsed by REXML and then
113
+ # traversed.
114
+ # :nil # => Will create a nil node. Useful if you need a global
115
+ # parent.
116
+ # :traversal # => The traversal order. See traversal.
117
+ #
118
+ def initialize(options = {})
119
+ xmlroot = if options[:root]
120
+ options[:root]
121
+ elsif options[:text]
122
+ REXML::Document.new(options[:text]).root
123
+ elsif options[:nil]
124
+ nil
125
+ end
126
+ raise(ArgumentError, "Undefined") if !xmlroot and !options[:nil]
127
+ @children, @attributes = {}, {}
128
+ @parent, @traversal = options[:parent], options[:traversal] || :preorder
129
+ @xmlname = xmlroot ? xmlroot.name.to_sym : :NIL
130
+ set_text(xmlroot ? xmlroot.get_text.to_s : '')
131
+ if xmlroot
132
+ xmlroot.attributes.each { |k,v| self[k.to_sym] = v }
133
+
134
+ # this recursion makes the whole library slow. basically we have
135
+ # all information in xmlroot, so no recursive calls would be necessary
136
+ xmlroot.select { |e| e.class == REXML::Element }.each do |e|
137
+ self << XMLNode.new({ :root => e, :parent => self })
138
+ end
139
+ end
140
+ end
141
+
142
+ #
143
+ # Access attributes by name. Attributenames are stored as symbols.
144
+ #
145
+ def [](attribute)
146
+ @attributes[attribute]
147
+ end
148
+
149
+ #
150
+ # Modify attributes. Behaves like a Hash. Keys are symbols by convention,
151
+ # values are Strings.
152
+ #
153
+ def []=(attribute, value)
154
+ @attributes[attribute] = value
155
+ end
156
+
157
+ #
158
+ # Delete attributes. attribute has to be a symbol with the name of the
159
+ # attribute you want to delete.
160
+ #
161
+ def delete_attribute!(attribute)
162
+ @attributes.delete(attribute)
163
+ end
164
+
165
+ #
166
+ # Append a child. child has to be an XMLNode or a String that contains
167
+ # the XML data in plaintext.
168
+ #
169
+ def <<(child)
170
+ return self << XMLROCS::XMLNode.new(:text => child) if is_real_string(child)
171
+ (@children[child.xmlname] = (@children[child.xmlname] || []) << child).last
172
+ end
173
+
174
+ #
175
+ # Remove a child from the current XMLNode.
176
+ # Providing only a childname, it will delete all children with the given
177
+ # name.
178
+ # If you also provide a block, the block will be evaluated with each child
179
+ # as an argument and according to the return value of the call the child
180
+ # will be deleted or not (when the block returns true, the child will be
181
+ # deleted).
182
+ # If you provide a block you can optionally filter the children by
183
+ # providing a childname so only children with the given name will be
184
+ # evaluated.
185
+ #
186
+ def >>(childname = nil)
187
+ if block_given?
188
+ @children.select { |k,v| childname ? k == childname : true }.each do |k,v|
189
+ v.reject! { |child| yield(child) }
190
+ end
191
+ else
192
+ @children.delete(childname)
193
+ end
194
+ end
195
+
196
+ #
197
+ # Returns an array of all XMLNodes in the order that is specified in the
198
+ # traversal accessor.
199
+ #
200
+ def all(name = nil)
201
+ name ? select { |x| x.xmlname == name } : traverse
202
+ end
203
+
204
+ #
205
+ # Maps a function over the XMLNode-tree and returns a new XMLNode-tree
206
+ # with the mapped values.
207
+ #
208
+ def map(&block)
209
+ dup.map!(&block)
210
+ end
211
+
212
+ #
213
+ # Maps a function over the current XMLNode-tree.
214
+ #
215
+ def map!(&block)
216
+ traverse.map!(&block)
217
+ self
218
+ end
219
+
220
+ #
221
+ # Iterates over the XMLNode-tree in the order that is specified in the
222
+ # traversal accessor.
223
+ #
224
+ def each(&block)
225
+ traverse.each(&block)
226
+ end
227
+
228
+ #
229
+ # Deep-copies the Tree.
230
+ #
231
+ def dup
232
+ XMLROCS::XMLNode.new(:text => to_xml)
233
+ end
234
+
235
+ #
236
+ # If a tag is provided it checks if the child with the given name has
237
+ # siblings.
238
+ # Otherwise it does the same for the current node.
239
+ #
240
+ def single?(tag = nil)
241
+ tag ? @children[tag].length == 1 : @parent.children[@xmlname].length == 1
242
+ end
243
+
244
+ #
245
+ # If a tag is provided it checks if the child with the given name is a leaf.
246
+ # Otherwise it does the same for the current node.
247
+ #
248
+ def leaf?(tag = nil)
249
+ tag ? children[tag].all? { |x| x.leaf? } : children.empty?
250
+ end
251
+
252
+ #
253
+ # Generates an array of all leafs in the order specified in the traversal
254
+ # accessor.
255
+ #
256
+ def leafs
257
+ traverse.select { |x| x.leaf? }
258
+ end
259
+
260
+ #
261
+ # Sets the text of the current node to text.
262
+ #
263
+ def set_text(text)
264
+ # special match for whitespace-only
265
+ return replace("") if text =~ /\A\s+\z/
266
+ replace(text)
267
+ end
268
+
269
+ #
270
+ # Deep-compares two XMLNodes.
271
+ #
272
+ def ==(other)
273
+ # pure string comparison
274
+ return super(other) if is_real_string(other)
275
+ return false unless other == self.to_s
276
+ [ [ self, other ], [ other, self ] ].each do |a,b|
277
+ a.children.each do |k,v|
278
+ return false if !b.children.has_key?(k) or v != b.children[k]
279
+ end
280
+ end
281
+ true
282
+ end
283
+
284
+ #
285
+ # Deep-transforms the current node into plaintext XML. If flat is true,
286
+ # all children will be omitted.
287
+ #
288
+ def to_xml(flat = false)
289
+ "<#{@xmlname}" + @attributes.map { |k,v| " #{k}=\"#{v}\"" }.join + ">" +
290
+ (flat ? "" : (leaf? ? self.to_s : @children.values.flatten.map { |e| e.to_xml }.join)) +
291
+ "</#{@xmlname}>"
292
+ end
293
+
294
+ def method_missing(method, *args)
295
+ if method.to_s[-1, 1] == "!" and @children.has_key?(real_method = method.to_s.chomp("!").to_sym)
296
+ return @children[real_method]
297
+ end
298
+ return @children[method].last if @children.has_key?(method)
299
+ super(method, *args)
300
+ end
301
+
302
+ private
303
+
304
+ def preorder
305
+ children.values.flatten.inject([self]) { |cur,child| cur + child.send(:preorder) }
306
+ end
307
+
308
+ def inorder
309
+ preorder # FIXME
310
+ end
311
+
312
+ def postorder
313
+ preorder # FIXME
314
+ end
315
+
316
+ def traverse
317
+ self.send(@traversal)
318
+ end
319
+
320
+ #
321
+ # helper method
322
+ #
323
+ def is_real_string(what)
324
+ what.is_a?(String) and !what.is_a?(XMLNode)
325
+ end
326
+ end
327
+ end
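A note on detecting the trailing "!" in method_missing: indexing a String with a single integer returned a character code on Ruby 1.8 and returns a one-character String on Ruby 1.9 and later, so a version-independent check is worth spelling out. The helper below is a hypothetical sketch (split_bang is not part of the library) that uses the two-argument form of String#[].

```ruby
# Returns [plural?, child_key] for a method name such as :product or :product!.
def split_bang(method)
  name = method.to_s
  # name[-1, 1] is a one-character String on every Ruby version, whereas
  # name[-1] was an Integer character code on 1.8 and is a String on 1.9+.
  [name[-1, 1] == "!", name.chomp("!").to_sym]
end

puts split_bang(:product!).inspect   # [true, :product]
puts split_bang(:product).inspect    # [false, :product]
```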