XMLROCS 0.0.1

data/LICENSE ADDED
@@ -0,0 +1,28 @@
1
+
2
+ Copyright (c) 2008, Andreas Meingast, <ameingast@gmail.com>, http://yomi.at
3
+
4
+ All rights reserved.
5
+
6
+ Redistribution and use in source and binary forms, with or without
7
+ modification, are permitted provided that the following conditions are met:
8
+
9
+ * Redistributions of source code must retain the above copyright notice,
10
+ this list of conditions and the following disclaimer.
11
+ * Redistributions in binary form must reproduce the above copyright
12
+ notice, this list of conditions and the following disclaimer in the
13
+ documentation and/or other materials provided with the distribution.
14
+ * Neither the name of the person nor the names of its
15
+ contributors may be used to endorse or promote products derived
16
+ from this software without specific prior written permission.
17
+
18
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
22
+ CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
23
+ EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
24
+ PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
25
+ PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
26
+ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
27
+ NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
28
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README ADDED
@@ -0,0 +1,207 @@
1
+ = About
2
+
3
+ XMLROCS is short for XML Ruby ObjeCtS. It is a library that allows you to
4
+ map XML data to Ruby objects.
5
+
6
+ XMLROCS is kind of a poor man's DOM. It provides basic access to attributes
7
+ and child elements so you can comfortably work with XML data.
8
+ Generally speaking, you can manipulate your XML data in true Ruby OO style.
9
+
10
+ == Creating an XMLROC
11
+
12
+ The XML data below is used in the examples that follow:
13
+
14
+ xml = <<-EOS
15
+ <products name="Computers">
16
+ This is a mixed content for the following products:
17
+ <product id="2">Dell</product>
18
+ <product id="1">Acer</product>
19
+ <product id="3">Apple</product>
20
+ Another text.
21
+ <product id="4">HP</product>
22
+ </products>
23
+ EOS
24
+
25
+ o = XMLROCS::XMLNode.new :text => xml # => produces an XMLROC
26
+
27
+ If you already have an REXML::Document at hand, you can do the following:
28
+
29
+ rexml_document = REXML::Document.new(xml)
30
+ o = XMLROCS::XMLNode.new :root => rexml_document.root
31
+
32
+
33
+ == Accessing and Modifying Attributes
34
+
35
+ All attributes are available via the []-operator. Attribute names are stored
36
+ as symbols, so the []-operator behaves pretty much like a Hash with symbol-keys.
37
+ Let's have a look at it:
38
+
39
+ o[:name] # => Computers
40
+ o[:i_m_not_there] # => nil
41
+
42
+ You can also modify attributes:
43
+
44
+ o[:name] = o[:name].reverse # => sretupmoC
45
+
46
+ which is pretty much equivalent to:
47
+
48
+ o[:name].reverse! # => sretupmoC
49
+
50
+ The other way to handle attributes is to access the @attributes accessor, which
51
+ acts as a hash with the following format:
52
+
53
+ { :attribute_name => "attribute_value" }
54
+
55
+ Just like with the []-operator you can also modify the accessor-variable.
56
+
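As a plain-Ruby illustration of the two update styles above, the sketch below uses an ordinary Hash as a stand-in for the attributes accessor (the Hash and all variable names are assumptions for illustration, not library code). It suggests one difference worth keeping in mind: reassigning stores a new String, while reverse! changes the stored String in place, so every other reference to it sees the change.

```ruby
# Stand-in for the attributes accessor: a Hash with symbol keys.
attributes = { :name => String.new("Computers") }
first_ref = attributes[:name]          # second reference to the same String

# Style 1: reassign. The Hash now holds a new String; first_ref is untouched.
attributes[:name] = attributes[:name].reverse
reassigned = attributes[:name].dup     # snapshot: "sretupmoC"
old_value  = first_ref.dup             # snapshot: "Computers"

# Style 2: mutate in place. Every reference to that String sees the change.
second_ref = attributes[:name]
attributes[:name].reverse!
mutated     = attributes[:name]        # "Computers"
same_object = second_ref.equal?(attributes[:name])

puts reassigned, old_value, mutated, same_object
```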
57
+
58
+ == Accessing and Modifying Children
59
+
60
+ Children can be accessed via instance methods or the @children accessor.
61
+ Just like the @attributes accessor, the @children accessor has the following
62
+ format:
63
+
64
+ { :child_name => [ XMLNode, ... ] }
65
+
66
+ :child_name is the tag name of the child, while [ XMLNode, ... ] is an array
67
+ containing all children with that name.
68
+ Still, this solution is mainly used for internal representation and it is
69
+ pretty tedious to work with, because you _always_ have to handle the array,
70
+ even when you know that there is only one child (probably guaranteed by some
71
+ DTD or XSD).
72
+ Instance methods solve this problem by ALWAYS returning a direct XMLNode.
73
+ If there are n > 1 children with the same name, the instance method will
74
+ return the last element (with respect to order of appearance in the XML data).
75
+ You can access all children with the same name by appending a '!'
76
+ to the instance_method just like this:
77
+
78
+ o.tag_name! # => [ XMLNode, ... ]
79
+
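The naming convention described above (plain name returns the last child, name with a trailing '!' returns all of them) can be sketched with method_missing. TinyNode, add and the stored values are hypothetical stand-ins chosen for this illustration; this is not the library's XMLNode, and it assumes Ruby 1.9 or later.

```ruby
# Minimal stand-in for illustration only; not the library's XMLNode.
class TinyNode
  def initialize
    @children = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a child value under a tag name and return self for chaining.
  def add(name, value)
    @children[name] << value
    self
  end

  # "name!" returns every child with that name, "name" returns the last one.
  def method_missing(method, *args)
    text = method.to_s
    plain = text.chomp("!").to_sym
    return @children[plain] if text.end_with?("!") && @children.key?(plain)
    return @children[method].last if @children.key?(method)
    super
  end

  # Keep respond_to? consistent with the dynamic accessors.
  def respond_to_missing?(method, include_private = false)
    @children.key?(method.to_s.chomp("!").to_sym) || super
  end
end

node = TinyNode.new.add(:product, "Dell").add(:product, "HP")
last_product = node.product    # "HP"
all_products = node.product!   # ["Dell", "HP"]
puts last_product, all_products.inspect
```

Defining respond_to_missing? alongside method_missing is a general Ruby convention, not something this release does.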
80
+ Some other examples:
81
+
82
+ # Determine the id of the Apple-product
83
+ o.product!.select { |x| x == "Apple" }.first[:id] # => "3"
84
+
85
+ # Select all products whose name is not empty
86
+ o.product!.select { |x| not x.empty? } # => ["Dell", "Acer", "Apple", "HP"]
87
+
88
+ # Determine the name of the product with id == "3"
89
+ o.product!.select { |x| x[:id] == "3" }.first # => "Apple"
90
+
91
+ # produce a hashmap { id => productname }
92
+ o.product!.inject({}) { |h,x| h.merge({ x[:id] => x }) } # => {"1"=>"Acer", "2"=>"Dell", "3"=>"Apple", "4"=>"HP"}
93
+
94
+ # sort by product id
95
+ o.product!.sort { |a,b| a[:id] <=> b[:id] } # => ["Acer", "Dell", "Apple", "HP"]
96
+
97
+ # get the last product (wrt order of appearance in the xml text)
98
+ o.product # => HP
99
+
100
+ # apply some changes and write back the xml
101
+ o.product!.each { |x| x[:id] = "#{x[:id].to_i * 10}" }
102
+ o.to_xml # =>
103
+ '<products name="Computers">
104
+ <product id="20">Dell</product>
105
+ <product id="10">Acer</product>
106
+ <product id="30">Apple</product>
107
+ <product id="40">HP</product>
108
+ </products>'
109
+
110
+ o.product!.each { |x| x.set_text(x.reverse) }
111
+ o.to_xml # =>
112
+ '<products name="Computers">
113
+ <product id="20">lleD</product>
114
+ <product id="10">recA</product>
115
+ <product id="30">elppA</product>
116
+ <product id="40">PH</product>
117
+ </products>'
118
+
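The to_xml method in this release interpolates attribute values as they are. As a hedged sketch of how an opening tag could be assembled with escaped values and without stray spaces, consider the helper below. The names open_tag and XML_ATTR_ESCAPES are hypothetical and not part of XMLROCS; the sketch assumes Ruby 1.9 or later.

```ruby
# Characters that must not appear raw inside a double-quoted attribute value.
XML_ATTR_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }

# Hypothetical helper: builds '<name key="value" ...>' from a Hash.
def open_tag(name, attributes)
  rendered = attributes.map do |key, value|
    escaped = value.to_s.gsub(/[&<>"]/, XML_ATTR_ESCAPES)
    " #{key}=\"#{escaped}\""
  end
  "<#{name}#{rendered.join}>"
end

plain_tag   = open_tag(:products, { :name => "Computers", :short_name => "comp" })
escaped_tag = open_tag(:product, { :label => 'Tom & "Jerry"' })
puts plain_tag    # prints <products name="Computers" short_name="comp">
puts escaped_tag  # prints <product label="Tom &amp; &quot;Jerry&quot;">
```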
119
+ == Appending and Removing Children and Attributes
120
+
121
+ To add an attribute, assign it with the []-operator (or add a new entry to the attributes accessor):
122
+
123
+ o[:short_name] = "comp"
124
+ puts o.to_xml # =>
125
+ <products name="Computers" short_name="comp">
126
+ ...
127
+ </products>
128
+
129
+ To remove an attribute, call the delete_attribute! method:
130
+
131
+ o.delete_attribute!(:short_name)
132
+ puts o.to_xml # =>
133
+ <products name="Computers">
134
+ ...
135
+ </products>
136
+
137
+
138
+ To add children, call the << operator with an XMLNode as argument.
139
+
140
+ # create the child node
141
+ child = XMLROCS::XMLNode.new(:text => '<product id="5">Fujitsu</product>')
142
+ # add it to the o-node
143
+ o << child
144
+ puts o.product!.select { |x| x[:id] == "5" }.first # => Fujitsu
145
+
146
+ To remove a child, call the >> operator with the childname as argument.
147
+ You can also provide a block to provide a more fine-grained filter.
148
+ The block will then be called with a child XMLNode as an argument.
149
+
150
+ # remove all children with id == "5"
151
+ o.>> { |child| child[:id] == "5" }
152
+
153
+ Whenever you provide a block, you have to call >> with explicit method-call
154
+ syntax (o.>>), because the infix form cannot be followed by a block.
155
+
156
+ # remove all product children with id == "5"
157
+ o.>>(:product) { |child| child[:id] == "5" }
158
+
159
+ # remove all children called :product
160
+ o >> :product
161
+
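The calling styles for >> shown above can be tried on a small stand-in class. Bag and its items are hypothetical names for this sketch, and returning self from >> is a choice made here, not the library's behaviour. The point it illustrates is general Ruby syntax: an operator method that takes a block has to be invoked as receiver.>>(args) { ... }.

```ruby
# Stand-in container; items are Hashes with a :name and an :id.
class Bag
  attr_reader :items

  def initialize(items)
    @items = items
  end

  # Without a block: drop every item with the given name.
  # With a block: drop items (optionally limited to name) the block accepts.
  def >>(name = nil)
    if block_given?
      @items.reject! { |item| (name.nil? || item[:name] == name) && yield(item) }
    else
      @items.reject! { |item| item[:name] == name }
    end
    self
  end
end

bag = Bag.new([
  { :name => :product, :id => "5" },
  { :name => :product, :id => "1" },
  { :name => :note,    :id => "5" }
])

bag.>>(:product) { |item| item[:id] == "5" }   # explicit call form takes a block
after_block = bag.items.map { |item| [item[:name], item[:id]] }

bag >> :note                                   # infix form, no block
after_infix = bag.items.map { |item| [item[:name], item[:id]] }

puts after_block.inspect, after_infix.inspect
```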
162
+ == Walking over the XML-Tree Structure
163
+
164
+ All iterating is done in Preorder, but you can override the default behaviour
165
+ by setting the traversal accessor.
166
+
167
+ Currently the library accepts the following traversal values (in this release, :inorder and :postorder are placeholders that fall back to preorder):
168
+
169
+ o.traversal = :preorder # default
170
+ o.traversal = :inorder
171
+ o.traversal = :postorder
172
+
173
+ You can fold the XML tree with inject (a left-associative fold) and map
174
+ over the tree by calling map or map!, which return an XML tree rather than
175
+ an Array.
176
+
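To make the difference between the traversal orders concrete, here is a minimal sketch on a generic n-ary tree. TreeNode and kids are hypothetical names used only for this illustration. Inorder is left out because it has no single agreed definition for nodes with an arbitrary number of children.

```ruby
# Hypothetical minimal tree node; not the library's XMLNode.
TreeNode = Struct.new(:name, :kids) do
  # Visit the node first, then each subtree from left to right.
  def preorder
    kids.inject([self]) { |visited, kid| visited + kid.preorder }
  end

  # Visit each subtree from left to right, then the node itself.
  def postorder
    kids.inject([]) { |visited, kid| visited + kid.postorder } + [self]
  end
end

# root
# |- a
# |  `- c
# `- b
c    = TreeNode.new(:c, [])
a    = TreeNode.new(:a, [c])
b    = TreeNode.new(:b, [])
root = TreeNode.new(:root, [a, b])

pre  = root.preorder.map(&:name)    # [:root, :a, :c, :b]
post = root.postorder.map(&:name)   # [:c, :a, :b, :root]
puts pre.inspect, post.inspect
```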
177
+ == Aggregating
178
+
179
+ This library does not provide direct support for XPath, but you can emulate
180
+ its behaviour with iterators and closures.
181
+
182
+ Suppose you want to select all products whose id is smaller than 5. In
183
+ XPath you probably would come up with something like this:
184
+
185
+ //product[@id < 5]
186
+
187
+ With XMLROCS there is no need for XPaths, because you can do the very same
188
+ thing easily in Ruby:
189
+
190
+ o.select { |node| node[:id] && node[:id].to_i < 5 }
191
+
192
+ Here's another one that builds an Array of all leaves:
193
+
194
+ o.select { |node| node.leaf? }
195
+
196
+ which is equivalent to:
197
+
198
+ o.select { |node| node.children.empty? }
199
+
200
+ Here's another one. Select all nodes that have no attributes:
201
+
202
+ o.select { |node| node.attributes.empty? }
203
+
204
+ It's actually quite handy, since your closure gets called on all children
205
+ and the object itself and then selects which elements you want to keep.
206
+ The big advantage here is that you can do anything in the closure
207
+ that Ruby can do.
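The selections above rely on a standard Ruby mechanism: a class that defines each and includes Enumerable gets select, inject, count and similar methods for free. The sketch below shows this on a stand-in class. Item, kids and the sample data are assumptions for illustration, not the library's XMLNode.

```ruby
# Stand-in tree node with attributes; each walks the tree in preorder.
class Item
  include Enumerable
  attr_reader :attributes, :kids

  def initialize(attributes = {}, kids = [])
    @attributes = attributes
    @kids = kids
  end

  # Yield this node, then every descendant.
  def each(&block)
    return enum_for(:each) unless block
    block.call(self)
    kids.each { |kid| kid.each(&block) }
    self
  end

  def leaf?
    kids.empty?
  end
end

tree = Item.new({ :name => "Computers" }, [
  Item.new({ :id => "2" }),
  Item.new({ :id => "7" })
])

# Nodes with an id below 5, as in the select example above.
small_ids  = tree.select { |node| node.attributes[:id] && node.attributes[:id].to_i < 5 }
                 .map { |node| node.attributes[:id] }
leaf_count = tree.count { |node| node.leaf? }
id_sum     = tree.inject(0) { |sum, node| sum + node.attributes[:id].to_i }

puts small_ids.inspect, leaf_count, id_sum
```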
data/TODO ADDED
@@ -0,0 +1,15 @@
1
+ = TODO
2
+
3
+ == Typing
4
+
5
+ Introduce some kind of typing for text values, so you can say:
6
+
7
+ o = XMLROCS::XMLNode.new :text => some_text
8
+ o.type :child => Integer
9
+ o.type :child => { :attribute => Integer }
10
+ o.type :child => { :attribute => Array }
11
+
12
+
13
+ == Performance
14
+
15
+ Initializing a whole XMLNode tree is slow (one recursive XMLNode.new call per element). Improve it.
data/lib/xmlrocs.rb ADDED
@@ -0,0 +1,327 @@
1
+ #
2
+ # Copyright (c) 2008, Andreas Meingast, <ameingast@gmail.com>, http://yomi.at
3
+ #
4
+ # All rights reserved.
5
+ #
6
+ # Redistribution and use in source and binary forms, with or without
7
+ # modification, are permitted provided that the following conditions are met:
8
+ #
9
+ # * Redistributions of source code must retain the above copyright notice,
10
+ # this list of conditions and the following disclaimer.
11
+ # * Redistributions in binary form must reproduce the above copyright
12
+ # notice, this list of conditions and the following disclaimer in the
13
+ # documentation and/or other materials provided with the distribution.
14
+ # * Neither the name of the person nor the names of its
15
+ # contributors may be used to endorse or promote products derived
16
+ # from this software without specific prior written permission.
17
+ #
18
+ # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
19
+ # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
20
+ # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
21
+ # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
22
+ # CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
23
+ # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
24
+ # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
25
+ # PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
26
+ # LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
27
+ # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
28
+ # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
29
+ #
30
+
31
+ require 'rexml/document'
32
+
33
+ #
34
+ # For more information, have a look at the README or the XMLNode class
35
+ # documentation.
36
+ #
37
+ module XMLROCS
38
+
39
+ #
40
+ # Represents an XML Element. You can access and modify its children
41
+ # and attributes.
42
+ #
43
+ # Each XMLNode has exactly one parent (that happens to be nil if the node
44
+ # is the root of the tree) and a (possibly empty) list of children.
45
+ #
46
+ # It also has a name (the name of the XML tag) that can be modified.
47
+ #
48
+ # You can compare two XMLNodes or an XMLNode and a String. When you provide
49
+ # a String, only the text value of the XMLNode is compared.
50
+ #
51
+ # You can enumerate the XMLNode Tree in the traversal-order defined
52
+ # in the @traversal accessor.
53
+ #
54
+ # For more information have a look at the README or instance method
55
+ # documentation.
56
+ #
57
+ class XMLNode < String
58
+ include Comparable
59
+ include Enumerable
60
+
61
+ #
62
+ # Hash containing the children of the current Node. The Hash has the
63
+ # following
64
+ # structure:
65
+ #
66
+ # { :childname => [ XMLNode, ... ] }
67
+ #
68
+ # You can either use the >> or << operators to modify children or use
69
+ # this accessor directly.
70
+ attr_reader :children
71
+
72
+ #
73
+ # Hash containing the attributes of the current Node:
74
+ #
75
+ # { :attribute_name => "Attribute" }
76
+ #
77
+ # Attributes can also be modified using this accessor.
78
+ attr_reader :attributes
79
+
80
+ #
81
+ # The parent of the current Node. The parent of the Root-Node is nil.
82
+ #
83
+ attr_reader :parent
84
+
85
+ #
86
+ # The name of the current Node.
87
+ #
88
+ # Example:
89
+ # x = XMLNode.new :text => '<a></a>'
90
+ # x.xmlname # => :a
91
+ #
92
+ attr_reader :xmlname
93
+
94
+ #
95
+ # Defines the order in which the tree is traversed.
96
+ #
97
+ # The following traversals are supported:
98
+ # :preorder
99
+ # :postorder
100
+ # :inorder
101
+ #
102
+ attr_accessor :traversal
103
+
104
+ #
105
+ # Create a new XMLNode
106
+ # You have to either provide an REXML::Element as options[:root] or
107
+ # plaintext xml data as options[:text]. Otherwise an ArgumentError will
108
+ # be raised.
109
+ #
110
+ # Supported options:
111
+ # :root # => REXML::Element that will be traversed.
112
+ # :text # => XML plaintext that will be parsed by REXML and then
113
+ # traversed.
114
+ # :nil # => Will create a nil node. Useful if you need a global
115
+ # parent.
116
+ # :traversal # => The traversal order. See traversal.
117
+ #
118
+ def initialize(options = {})
119
+ xmlroot = if options[:root]
120
+ options[:root]
121
+ elsif options[:text]
122
+ REXML::Document.new(options[:text]).root
123
+ elsif options[:nil]
124
+ nil
125
+ end
126
+ raise(ArgumentError, "Undefined") if !xmlroot and !options[:nil]
127
+ @children, @attributes = {}, {}
128
+ @parent, @traversal = options[:parent], options[:traversal] || :preorder
129
+ @xmlname = xmlroot ? xmlroot.name.to_sym : :NIL
130
+ set_text(xmlroot ? xmlroot.get_text.to_s : '')
131
+ if xmlroot
132
+ xmlroot.attributes.each { |k,v| self[k.to_sym] = v }
133
+
134
+ # this recursion makes the whole library slow. basically we have
135
+ # all information in xmlroot, so no recursive calls would be necessary
136
+ xmlroot.select { |e| e.class == REXML::Element }.each do |e|
137
+ self << XMLNode.new({ :root => e, :parent => self })
138
+ end
139
+ end
140
+ end
141
+
142
+ #
143
+ # Access attributes by name. Attributenames are stored as symbols.
144
+ #
145
+ def [](attribute)
146
+ @attributes[attribute]
147
+ end
148
+
149
+ #
150
+ # Modify attributes. Behaves like a Hash. Keys are symbols by convention,
151
+ # values are Strings.
152
+ #
153
+ def []=(attribute, value)
154
+ @attributes[attribute] = value
155
+ end
156
+
157
+ #
158
+ # Delete attributes. attribute has to be a symbol with the name of the
159
+ # attribute you want to delete.
160
+ #
161
+ def delete_attribute!(attribute)
162
+ @attributes.delete(attribute)
163
+ end
164
+
165
+ #
166
+ # Append a child. child has to be an XMLNode or a String that contains
167
+ # the XML data in plaintext.
168
+ #
169
+ def <<(child)
170
+ return self << XMLROCS::XMLNode.new(:text => child) if is_real_string(child)
171
+ (@children[child.xmlname] = (@children[child.xmlname] || []) << child).last
172
+ end
173
+
174
+ #
175
+ # Remove a child from the current XMLNode.
176
+ # Providing only a childname, it will delete all children with the given
177
+ # name.
178
+ # If you also provide a block, the block will be evaluated with each child
179
+ # as an argument and according to the return value of the call the child
180
+ # will be deleted or not (when the block returns true, the child will be
181
+ # deleted).
182
+ # If you provide a block you can optionally filter the children by
183
+ # providing a childname so only children with the given name will be
184
+ # evaluated.
185
+ #
186
+ def >>(childname = nil)
187
+ if block_given?
188
+ @children.select { |k,v| childname ? k == childname : true }.each do |k,v|
189
+ v.reject! { |child| yield(child) }
190
+ end
191
+ else
192
+ @children.delete(childname)
193
+ end
194
+ end
195
+
196
+ #
197
+ # Returns an array of all XMLNodes in the order that is specified in the
198
+ # traversal accessor.
199
+ #
200
+ def all(name = nil)
201
+ name ? select { |x| x.xmlname == name } : traverse
202
+ end
203
+
204
+ #
205
+ # Maps a function over the XMLNode-tree and returns a new XMLNode-tree
206
+ # with the mapped values.
207
+ #
208
+ def map(&block)
209
+ dup.map!(&block)
210
+ end
211
+
212
+ #
213
+ # Maps a function over the current XMLNode-tree.
214
+ #
215
+ def map!(&block)
216
+ traverse.map!(&block)
217
+ self
218
+ end
219
+
220
+ #
221
+ # Iterates over the XMLNode-tree in the order that is specified in the
222
+ # traversal accessor.
223
+ #
224
+ def each(&block)
225
+ traverse.each(&block)
226
+ end
227
+
228
+ #
229
+ # Deep-copies the Tree.
230
+ #
231
+ def dup
232
+ XMLROCS::XMLNode.new(:text => to_xml)
233
+ end
234
+
235
+ #
236
+ # If a tag is provided it checks if the child with the given name has
237
+ # siblings.
238
+ # Otherwise it does the same for the current node.
239
+ #
240
+ def single?(tag = nil)
241
+ tag ? @children[tag].length == 1 : @parent.children[@xmlname].length == 1
242
+ end
243
+
244
+ #
245
+ # If a tag is provided it checks if the child with the given name is a leaf.
246
+ # Otherwise it does the same for the current node.
247
+ #
248
+ def leaf?(tag = nil)
249
+ tag ? children[tag].all? { |x| x.leaf? } : children.empty?
250
+ end
251
+
252
+ #
253
+ # Generates an array of all leafs in the order specified in the traversal
254
+ # accessor.
255
+ #
256
+ def leafs
257
+ traverse.select { |x| x.leaf? }
258
+ end
259
+
260
+ #
261
+ # Sets the text of the current node to text.
262
+ #
263
+ def set_text(text)
264
+ # special match for whitespace-only
265
+ return replace("") if text =~ /\A\s*\z/
266
+ replace(text)
267
+ end
268
+
269
+ #
270
+ # Deep-compares two XMLNodes.
271
+ #
272
+ def ==(other)
273
+ # pure string comparison
274
+ return super(other) if is_real_string(other)
275
+ return false unless other == self.to_s
276
+ [ [ self, other ], [ other, self ] ].each do |a,b|
277
+ a.children.each do |k,v|
278
+ return false if !b.children.has_key?(k) or v != b.children[k]
279
+ end
280
+ end
281
+ true
282
+ end
283
+
284
+ #
285
+ # Deep-transforms the current node into plaintext XML. If flat is true,
286
+ # all children will be omitted.
287
+ #
288
+ def to_xml(flat = false)
289
+ "<#{@xmlname}" + @attributes.map { |k,v| " #{k}=\"#{v}\"" }.join + ">" +
290
+ (flat ? "" : (leaf? ? self.to_s : @children.values.flatten.map { |e| e.to_xml }.join)) +
291
+ "</#{@xmlname}>"
292
+ end
293
+
294
+ def method_missing(method, *args)
295
+ if method.to_s[-1, 1] == "!" and @children.has_key?(real_method = method.to_s.chomp("!").to_sym)
296
+ return @children[real_method]
297
+ end
298
+ return @children[method].last if @children.has_key?(method)
299
+ super(method, *args)
300
+ end
301
+
302
+ private
303
+
304
+ def preorder
305
+ children.values.flatten.inject([self]) { |cur,child| cur + child.send(:preorder) }
306
+ end
307
+
308
+ def inorder
309
+ preorder # FIXME
310
+ end
311
+
312
+ def postorder
313
+ preorder # FIXME
314
+ end
315
+
316
+ def traverse
317
+ self.send(@traversal)
318
+ end
319
+
320
+ #
321
+ # helper method
322
+ #
323
+ def is_real_string(what)
324
+ what.is_a?(String) and !what.is_a?(XMLNode)
325
+ end
326
+ end
327
+ end