uva-happymapper 0.4.1

Files changed (40)
  1. data/README.md +528 -0
  2. data/TODO +0 -0
  3. data/lib/happymapper.rb +617 -0
  4. data/lib/happymapper/attribute.rb +3 -0
  5. data/lib/happymapper/element.rb +3 -0
  6. data/lib/happymapper/item.rb +250 -0
  7. data/lib/happymapper/text_node.rb +3 -0
  8. data/spec/fixtures/address.xml +8 -0
  9. data/spec/fixtures/ambigous_items.xml +22 -0
  10. data/spec/fixtures/analytics.xml +61 -0
  11. data/spec/fixtures/analytics_profile.xml +127 -0
  12. data/spec/fixtures/atom.xml +19 -0
  13. data/spec/fixtures/commit.xml +52 -0
  14. data/spec/fixtures/current_weather.xml +89 -0
  15. data/spec/fixtures/dictionary.xml +20 -0
  16. data/spec/fixtures/family_tree.xml +21 -0
  17. data/spec/fixtures/inagy.xml +86 -0
  18. data/spec/fixtures/lastfm.xml +355 -0
  19. data/spec/fixtures/multiple_namespaces.xml +170 -0
  20. data/spec/fixtures/multiple_primitives.xml +5 -0
  21. data/spec/fixtures/pita.xml +133 -0
  22. data/spec/fixtures/posts.xml +23 -0
  23. data/spec/fixtures/product_default_namespace.xml +17 -0
  24. data/spec/fixtures/product_no_namespace.xml +10 -0
  25. data/spec/fixtures/product_single_namespace.xml +10 -0
  26. data/spec/fixtures/quarters.xml +19 -0
  27. data/spec/fixtures/radar.xml +21 -0
  28. data/spec/fixtures/statuses.xml +422 -0
  29. data/spec/fixtures/subclass_namespace.xml +50 -0
  30. data/spec/happymapper_attribute_spec.rb +21 -0
  31. data/spec/happymapper_element_spec.rb +21 -0
  32. data/spec/happymapper_item_spec.rb +115 -0
  33. data/spec/happymapper_spec.rb +968 -0
  34. data/spec/happymapper_text_node_spec.rb +21 -0
  35. data/spec/happymapper_to_xml_namespaces_spec.rb +196 -0
  36. data/spec/happymapper_to_xml_spec.rb +196 -0
  37. data/spec/ignay_spec.rb +95 -0
  38. data/spec/spec_helper.rb +7 -0
  39. data/spec/xpath_spec.rb +88 -0
  40. metadata +118 -0
@@ -0,0 +1,528 @@
1
+ UnhappyMapper
2
+ =============
3
+
4
+ UnhappyMapper allows you to parse XML data and convert it quickly and easily into Ruby data structures.
5
+
6
+ This project is a grandchild (a fork of a fork) of the great work done first by [jnunemaker](https://github.com/jnunemaker/happymapper) and then by [dam5s](http://github.com/dam5s/happymapper/). I found both of these projects when I started to work on a project that had a serious case of XML and required a number of bug fixes and also new features. Both of the previous maintainers are too busy or not interested in the new functionality so I have released a new gem.
7
+
8
+ ### Major Differences
9
+
10
+ * [dam5s](http://github.com/dam5s/happymapper/)'s fork added [Nokogiri](http://nokogiri.org/) support
11
+ * `#to_xml` support utilizing the same HappyMapper tags
12
+ * Fixes for [namespaces when using composition of classes](https://github.com/burtlo/happymapper/commit/fd1e898c70f7289d2d2618d629b56f2f6623785c)
13
+ * Fixes for instances of XML where a [namespace is defined but no elements with that namespace are found](https://github.com/burtlo/happymapper/commit/9614221a80ff3bda18ff859aa751dff29cf52fd3).
14
+
15
+
16
+ ## Installation
17
+
18
+ ### [Rubygems](https://rubygems.org/gems/unhappymapper)
19
+
20
+ $ gem install unhappymapper
21
+
22
+ ### [Source](https://github.com/burtlo/happymapper)
23
+
24
+ $ git clone https://github.com/burtlo/happymapper
25
+ $ cd happymapper
26
+ $ git checkout master
27
+ $ gem build unhappymapper.gemspec
28
+ $ gem install --local unhappymapper-X.X.X.gem
29
+
30
+ ### [Bundler](http://gembundler.com/)
31
+
32
+ Add the unhappymapper gem to your project's `Gemfile`.
33
+
34
+ gem 'unhappymapper'
35
+
36
+ Run the bundler command to install the gem:
37
+
38
+ $ bundle install
39
+
40
+ # Examples
41
+
42
+ Let's start with a simple example to get our feet wet. Here we have a simple example of XML that defines some address information:
43
+
44
+ <address>
45
+ <street>Milchstrasse</street>
46
+ <housenumber>23</housenumber>
47
+ <postcode>26131</postcode>
48
+ <city>Oldenburg</city>
49
+ <country code="de">Germany</country>
50
+ </address>
51
+
52
+ Happymapper will let you easily model this information as a class:
53
+
54
+ require 'happymapper'
55
+
56
+ class Address
57
+ include HappyMapper
58
+
59
+ tag 'address'
60
+ element :street, String, :tag => 'street'
61
+ element :postcode, String, :tag => 'postcode'
62
+ element :housenumber, Integer, :tag => 'housenumber'
63
+ element :city, String, :tag => 'city'
64
+ element :country, String, :tag => 'country'
65
+ end
66
+
67
+ To make a class HappyMapper compatible you simply `include HappyMapper` within the class definition. This takes care of all the work of defining all the speciality methods and magic you need to get running. As you can see we immediately start using these methods.
68
+
69
+ * `tag` sets the name of the XML tag that the class maps to, here 'address'.
70
+
71
+ * `element` defines accessor methods for the specified symbol (e.g. `:street`,`:housenumber`) that will return the class type (e.g. `String`,`Integer`) of the XML tag specified (e.g. `:tag => 'street'`, `:tag => 'housenumber'`).
72
+
73
+ When you define an element whose accessor has the same name as the tag, as is the case for all the examples above, you can omit the `:tag`. These two element declarations are equivalent to each other:
74
+
75
+ element :street, String, :tag => 'street'
76
+ element :street, String
77
+
78
+ Including the additional tag element is not going to hurt anything and in some cases will make it absolutely clear how these elements map to the XML. However, once you know this rule, it is hard not to want to save yourself the keystrokes.
79
+
80
+ Instead of `element` you may also use `has_one`:
81
+
82
+ element :street, String, :tag => 'street'
83
+ element :street, String
84
+ has_one :street, String
85
+
86
+ These three statements are equivalent to each other.
87
+
88
+ ## Parsing
89
+
90
+ With the mapping of the address XML articulated in our Address class it is time to parse the data:
91
+
92
+ address = Address.parse(ADDRESS_XML_DATA, :single => true)
93
+ puts address.street
94
+
95
+ Assuming that the constant `ADDRESS_XML_DATA` contains a string representation of the address XML data, this is fairly straightforward save for the `parse` method.
96
+
97
+ The `parse` method, like `tag` and `element`, was added when you included HappyMapper in the class. Parse is a wonderful, magical place that converts all the declarations that you have made into the data structure you are about to know and love.
98
+
99
+ But what about the `:single => true`? Right, that is because by default `parse` returns an array. In this case it is an array with one element, but an array nonetheless. So the following are equivalent to each other:
100
+
101
+ address = Address.parse(ADDRESS_XML_DATA).first
102
+ address = Address.parse(ADDRESS_XML_DATA, :single => true)
103
+
104
+ The first one returns an array from which we take the first instance; the second does that work for us inside of `parse`.
105
+
106
+ ## Multiple Elements Mapping
107
+
108
+ What if our address XML was a little different, perhaps we allowed multiple streets:
109
+
110
+ <address>
111
+ <street>Milchstrasse</street>
112
+ <street>Another Street</street>
113
+ <housenumber>23</housenumber>
114
+ <postcode>26131</postcode>
115
+ <city>Oldenburg</city>
116
+ <country code="de">Germany</country>
117
+ </address>
118
+
119
+ Similar to `element` or `has_one`, when you have multiple elements you simply declare them with `has_many`:
120
+
121
+ has_many :streets, String, :tag => 'street'
122
+
123
+ Your resulting `streets` method will now return an array.
124
+
125
+ address = Address.parse(ADDRESS_XML_DATA, :single => true)
126
+ puts address.streets.join("\n")
127
+
128
+ Imagine that you have to write `streets.join("\n")` for the rest of eternity throughout your code. It would be a nightmare and one that you could avoid by creating your own convenience method.
129
+
130
+ require 'happymapper'
131
+
132
+ class Address
133
+ include HappyMapper
134
+
135
+ tag 'address'
136
+
137
+ has_many :streets, String, :tag => 'street'
138
+
139
+ def streets
140
+ @streets.join("\n")
141
+ end
142
+
143
+ element :postcode, String, :tag => 'postcode'
144
+ element :housenumber, String, :tag => 'housenumber'
145
+ element :city, String, :tag => 'city'
146
+ element :country, String, :tag => 'country'
147
+ end
148
+
149
+ Now when we call the method `streets` we get a single value, but we still have the instance variable `@streets` if we ever need the values as an array.
150
+
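A small Ruby detail matters when joining with a newline: escape sequences such as `\n` are only interpreted inside double-quoted strings. The sketch below uses a plain array of street names (illustrative data, no HappyMapper involved) to show the difference.

```ruby
# Illustrative data only; in the README these values would come from the parsed XML.
streets = ["Milchstrasse", "Another Street"]

# Single quotes keep the backslash and the letter n as two literal characters.
single_quoted = streets.join('\n')

# Double quotes turn \n into a real newline character.
double_quoted = streets.join("\n")

puts single_quoted
puts double_quoted
```

If the joined value is meant to be printed one street per line, the double-quoted form is the one to use.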
151
+
152
+ ## Attribute Mapping
153
+
154
+ <address location='home'>
155
+ <street>Milchstrasse</street>
156
+ <street>Another Street</street>
157
+ <housenumber>23</housenumber>
158
+ <postcode>26131</postcode>
159
+ <city>Oldenburg</city>
160
+ <country code="de">Germany</country>
161
+ </address>
162
+
163
+ Attributes are declared in the same way as `element` or `has_many`:
164
+
165
+ attribute :location, String, :tag => 'location'
166
+
167
+ Again, you can omit the tag if the attribute accessor symbol matches the name of the attribute.
168
+
169
+
170
+ ### Attributes On Empty Child Elements
171
+
172
+ <feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
173
+ <id>tag:all-the-episodes.heroku.com,2005:/tv_shows</id>
174
+ <link rel="alternate" type="text/html" href="http://all-the-episodes.heroku.com"/>
175
+ <link rel="self" type="application/atom+xml" href="http://all-the-episodes.heroku.com/tv_shows.atom"/>
176
+ <title>TV Shows</title>
177
+ <updated>2011-07-10T06:52:27Z</updated>
178
+ </feed>
179
+
180
+ In this case you would normally need to map an element to a new `Link` class just to access the attributes of each `<link>`, but there is an alternate syntax. Instead of
181
+
182
+ class Feed
183
+ # ....
184
+ has_many :links, Link, :tag => 'link', :xpath => '.'
185
+ end
186
+
187
+ class Link
188
+ include HappyMapper
189
+
190
+ attribute :rel, String
191
+ attribute :type, String
192
+ attribute :href, String
193
+ end
194
+
195
+ You can drop the `Link` class and simply replace the `has_many` on `Feed` with
196
+
197
+ element :link, String, :single => false, :attributes => { :rel => String, :type => String, :href => String }
198
+
199
+ As there is no content, the type given for `:link` (`String` above) is irrelevant, but `nil` won't work and other types may try to perform typecasting and fail. You can omit the `:single => false` for elements that only occur once within their parent.
200
+
201
+ This syntax is most appropriate for elements that (a) have attributes but no content and (b) occur at only one level of the hierarchy. If `<feed>` contained another element that also contained a `<link>` (as atom feeds generally do) it would be DRY-er to use the first syntax, i.e. with a separate `Link` class.
202
+
203
+
204
+ ## Class composition (and Text Node)
205
+
206
+ Our address has a country and that country element has a code. Up until this point we neglected it as we declared a `country` as being a `String`.
207
+
208
+ <address location='home'>
209
+ <street>Milchstrasse</street>
210
+ <street>Another Street</street>
211
+ <housenumber>23</housenumber>
212
+ <postcode>26131</postcode>
213
+ <city>Oldenburg</city>
214
+ <country code="de">Germany</country>
215
+ </address>
216
+
217
+ Well, if we were only going to parse country on its own, we would likely create a class mapping for it.
218
+
219
+ class Country
220
+ include HappyMapper
221
+
222
+ tag 'country'
223
+
224
+ attribute :code, String
225
+ text_node :name, String
226
+ end
227
+
228
+ We are utilizing an `attribute` declaration and a new declaration called `text_node`.
229
+
230
+ * `text_node` is used when you want the text contained within the element
231
+
232
+ Awesome, now if we were to redeclare our `Address` class we would use our new `Country` class.
233
+
234
+ class Address
235
+ include HappyMapper
236
+
237
+ tag 'address'
238
+
239
+ has_many :streets, String, :tag => 'street'
240
+
241
+ def streets
242
+ @streets.join("\n")
243
+ end
244
+
245
+ element :postcode, String, :tag => 'postcode'
246
+ element :housenumber, String, :tag => 'housenumber'
247
+ element :city, String, :tag => 'city'
248
+ element :country, Country, :tag => 'country'
249
+ end
250
+
251
+ Instead of `String`, `Boolean`, or `Integer` we say that it is a `Country` and HappyMapper takes care of the details of continuing the XML mapping through the country element.
252
+
253
+ address = Address.parse(ADDRESS_XML_DATA, :single => true)
254
+ puts address.country.code
255
+
256
+ A quick note, in the above example we used the constant `Country`. We could have used `'Country'`. The nice part of using the latter declaration, enclosed in quotes, is that you do not have to define your class before this class. So Country and Address can live in separate files and as long as both constants are available when it comes time to parse you are golden.
257
+
258
+ ## Custom XPATH
259
+
260
+ ### Has One, Has Many
261
+
262
+ Getting to elements deep down within your XML would be a little more work if you did not have xpath support. Consider the following example:
263
+
264
+ <media>
265
+ <gallery>
266
+ <title href="htttp://fishlovers.org/friends">Friends Who Like Fish</title>
267
+ <picture>
268
+ <name>Burtie Sanchez</name>
269
+ <img>burtie01.png</img>
270
+ </picture>
271
+ </gallery>
272
+ <picture>
273
+ <name>Unsorted Photo</name>
274
+ <img>bestfriends.png</img>
275
+ </picture>
276
+ </media>
277
+
278
+ You may want to map the sub-elements buried in the 'gallery' as top level items in the media. Traditionally you could use class composition to accomplish this task; however, using the xpath attribute you have the ability to shortcut some of that work.
279
+
280
+ class Media
281
+ include HappyMapper
282
+
283
+ has_one :title, String, :xpath => 'gallery/title'
284
+ has_one :link, String, :xpath => 'gallery/title/@href'
285
+ end
286
+
287
+
288
+ ## Subclasses
289
+
290
+ ### Inheritance (it doesn't work!)
291
+
292
+ While mapping XML to objects you may arrive at a point where you have two or more very similar structures.
293
+
294
+ class Article
295
+ include HappyMapper
296
+
297
+ has_one :title, String
298
+ has_one :author, String
299
+ has_one :published, Time
300
+
301
+ has_one :entry, String
302
+
303
+ end
304
+
305
+ class Gallery
306
+ include HappyMapper
307
+
308
+ has_one :title, String
309
+ has_one :author, String
310
+ has_one :published, Time
311
+
312
+ has_many :photos, String
313
+
314
+ end
315
+
316
+ In this example there are definitely similarities between our two pieces of content: both declare `title`, `author` and `published`. So much so that you might be inclined to create an inheritance structure to save yourself some keystrokes.
317
+
318
+ class Content
319
+ include HappyMapper
320
+
321
+ has_one :title, String
322
+ has_one :author, String
323
+ has_one :published, Time
324
+
325
+ end
326
+
327
+ class Article < Content
328
+ include HappyMapper
329
+
330
+ has_one :entry, String
331
+ end
332
+
333
+ class Gallery < Content
334
+ include HappyMapper
335
+
336
+ has_many :photos, String
337
+ end
338
+
339
+ However, *this does not work*. The reason is that each one of these element declarations is a method call that defines the element on the class itself, so the declarations are not passed down through inheritance.
340
+
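The general Ruby behaviour behind this can be sketched without HappyMapper. The `TinyMapper` module below is a hypothetical stand-in, not HappyMapper's actual implementation; it assumes that declarations are recorded in a class-level instance variable, which belongs to one class object and is not visible to subclasses.

```ruby
# Hypothetical stand-in for a mapper; NOT HappyMapper's real code.
module TinyMapper
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Record each declaration on the class that made the call.
    def has_one(name, type)
      declared_elements << [name, type]
    end

    # A class-level instance variable: every class object has its own copy.
    def declared_elements
      @declared_elements ||= []
    end
  end
end

class Content
  include TinyMapper
  has_one :title, String
end

class Article < Content
  has_one :entry, String
end

content_names = Content.declared_elements.map(&:first)
article_names = Article.declared_elements.map(&:first)

puts content_names.inspect
puts article_names.inspect
```

In this sketch `Article` inherits the `has_one` class method but starts with its own empty list, so the `:title` declared on `Content` never reaches it.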
341
+ You can, however, use some module mixin power to save you those keystrokes and impress your friends.
342
+
343
+
344
+ module Content
345
+ def self.included(content)
346
+ content.has_one :title, String
347
+ content.has_one :author, String
348
+ content.has_one :published, Time
349
+ end
350
+
351
+ def published_time
352
+ @published.strftime("%H:%M:%S")
353
+ end
354
+
355
+ end
356
+
357
+ class Article
358
+ include HappyMapper
359
+
360
+ include Content
361
+ has_one :entry, String
362
+ end
363
+
364
+ class Gallery
365
+ include HappyMapper
366
+
367
+ include Content
368
+ has_many :photos, String
369
+ end
370
+
371
+
372
+ Here, when we include `Content` in both of these classes, the module hook `included` is called and our class is given as a parameter. So we take that opportunity to do some surgery and define our happymapper elements as well as any other methods that may rely on those instance variables that come along in the package.
373
+
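The mechanics of the `included` hook can be tried in plain Ruby. In the sketch below `TinyMapper` and its `has_one` are hypothetical stand-ins that only record a name and define an accessor; they are assumptions for illustration and do not reproduce HappyMapper's behaviour.

```ruby
# Hypothetical stand-in for a mapper; NOT HappyMapper's real code.
module TinyMapper
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def has_one(name, _type)
      declared << name
      attr_accessor name
    end

    def declared
      @declared ||= []
    end
  end
end

module Content
  # Ruby calls this hook with the class that is including the module.
  def self.included(base)
    base.has_one :title, String
    base.has_one :published, Time
  end

  # Instance methods of the module are mixed in as usual.
  def published_time
    @published.strftime("%H:%M:%S")
  end
end

class Article
  include TinyMapper
  include Content
  has_one :entry, String
end

article = Article.new
article.published = Time.utc(2011, 7, 10, 6, 52, 27)
formatted = article.published_time
declared_names = Article.declared

puts formatted
puts declared_names.inspect
```

The order of the two `include` lines matters in this sketch: `has_one` has to exist on the class before `Content.included` tries to call it.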
374
+
375
+ ## Filtering with XPATH
376
+
377
+ I ran into a case where I wanted to capture all the pictures that were directly under media, but not the ones contained within a gallery.
378
+
379
+ <media>
380
+ <gallery>
381
+ <picture>
382
+ <name>Burtie Sanchez</name>
383
+ <img>burtie01.png</img>
384
+ </picture>
385
+ </gallery>
386
+ <picture>
387
+ <name>Unsorted Photo</name>
388
+ <img>bestfriends.png</img>
389
+ </picture>
390
+ </media>
391
+
392
+ The following `Media` class is where I started:
393
+
394
+ require 'happymapper'
395
+
396
+ class Media
397
+ include HappyMapper
398
+
399
+ has_many :galleries, Gallery, :tag => 'gallery'
400
+ has_many :pictures, Picture, :tag => 'picture'
401
+ end
402
+
403
+ However when I parsed the media xml the number of pictures returned to me was 2, not 1.
404
+
405
+ pictures = Media.parse(MEDIA_XML,:single => true).pictures
406
+ pictures.length.should == 1 # => Failed Expectation
407
+
408
+ I was mistaken, and that is because by default the mappings are assigned the XPATH './/', which matches all the elements with that tag no matter where they are found in the document. To override this you can specify an XPATH value for your defined elements.
409
+
410
+ has_many :pictures, Picture, :tag => 'picture', :xpath => '/media'
411
+
412
+ `/media` states that we are only interested in pictures that can be found directly under the media element. So when we parse again we will have only our one element.
413
+
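The difference between a descendant search and a direct-child search can be seen outside of HappyMapper. The sketch below uses Ruby's bundled REXML library purely as a stand-in XPath engine; HappyMapper itself relies on Nokogiri, and the constant and variable names here are illustrative assumptions.

```ruby
require 'rexml/document'

# Same shape as the media XML above, trimmed to the picture names.
MEDIA_XML_SAMPLE = <<~XML
  <media>
    <gallery>
      <picture><name>Burtie Sanchez</name></picture>
    </gallery>
    <picture><name>Unsorted Photo</name></picture>
  </media>
XML

media = REXML::Document.new(MEDIA_XML_SAMPLE).root

# './/picture' looks at every descendant of <media>, at any depth.
all_pictures = REXML::XPath.match(media, './/picture')

# 'picture' only looks at the direct children of <media>.
direct_pictures = REXML::XPath.match(media, 'picture')

puts all_pictures.length
puts direct_pictures.length
```

The descendant query finds the picture inside the gallery as well, which mirrors the unexpected count described above.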
414
+
415
+ ## Namespaces
416
+
417
+ Obviously your XML and these trivial examples are easy to map and parse because they lack the treacherous namespaces that befall most XML files.
418
+
419
+ Perhaps our `address` XML is really swarming with namespaces:
420
+
421
+ <prefix:address location='home' xmlns:prefix="http://www.unicornland.com/prefix">
422
+ <prefix:street>Milchstrasse</prefix:street>
423
+ <prefix:street>Another Street</prefix:street>
424
+ <prefix:housenumber>23</prefix:housenumber>
425
+ <prefix:postcode>26131</prefix:postcode>
426
+ <prefix:city>Oldenburg</prefix:city>
427
+ <prefix:country code="de">Germany</prefix:country>
428
+ </prefix:address>
429
+
430
+ Here again is our address example with a made up namespace called `prefix` that comes direct to us from unicornland, a very magical place indeed. Well, we are going to have to do some work on our class definition, and that work is simply adding this one-liner to the `Address` class:
431
+
432
+ class Address
433
+ include HappyMapper
434
+
435
+ tag 'address'
436
+ namespace 'prefix'
437
+ # ... rest of the code ...
438
+ end
439
+
440
+ Of course, if that is too easy for you, you can append a `:namespace => 'prefix'` to every one of the elements that you defined.
441
+
442
+ has_many :street, String, :tag => 'street', :namespace => 'prefix'
443
+ element :postcode, String, :tag => 'postcode', :namespace => 'prefix'
444
+ element :housenumber, String, :tag => 'housenumber', :namespace => 'prefix'
445
+ element :city, String, :tag => 'city', :namespace => 'prefix'
446
+ element :country, Country, :tag => 'country', :namespace => 'prefix'
447
+
448
+ I definitely recommend the former, as it saves you a whole hell of a lot of typing. However, there are times when appending a namespace to an element declaration is important, and that is when the element has a different namespace than the one declared with `namespace 'prefix'`.
449
+
450
+ Imagine that our `country` actually belonged to a completely different namespace.
451
+
452
+ <prefix:address location='home' xmlns:prefix="http://www.unicornland.com/prefix"
453
+ xmlns:different="http://www.trollcountry.com/different">
454
+ <prefix:street>Milchstrasse</prefix:street>
455
+ <prefix:street>Another Street</prefix:street>
456
+ <prefix:housenumber>23</prefix:housenumber>
457
+ <prefix:postcode>26131</prefix:postcode>
458
+ <prefix:city>Oldenburg</prefix:city>
459
+ <different:country code="de">Germany</different:country>
460
+ </prefix:address>
461
+
462
+ Well we would need to specify that namespace:
463
+
464
+ element :country, Country, :tag => 'country', :namespace => 'different'
465
+
466
+ With that we should be able to parse as we once did.
467
+
468
+ ## Large Datasets (in_groups_of)
469
+
470
+ When dealing with large sets of XML that simply cannot or should not be placed into memory the objects can be handled in groups through the `:in_groups_of` parameter.
471
+
472
+ Address.parse(LARGE_ADDRESS_XML_DATA,:in_groups_of => 5) do |group|
473
+ group.each { |address| puts address.streets }
474
+ end
475
+
476
+ This trivial block will parse the large set of XML data and in groups of 5 addresses at a time display the streets.
477
+
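The batching idea resembles Ruby's own `Enumerable#each_slice`. The sketch below is plain Ruby over made-up street names and does not call HappyMapper; it only shows what a block receives when items are handed over five at a time.

```ruby
# Illustrative data standing in for parsed address objects.
streets = (1..12).map { |n| "Street #{n}" }

batch_sizes = []
streets.each_slice(5) do |group|
  # Each group is an array of up to five items; the last one may be shorter.
  batch_sizes << group.length
  group.each { |street| puts street }
end

puts batch_sizes.inspect
```

Note that the block argument is the whole group, so the individual items still need to be iterated inside the block.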
478
+ ## Saving to XML
479
+
480
+ Saving a class to XML is as easy as calling `#to_xml`. The end result will be the current state of your object represented as xml. Let's cover some details that are sometimes necessary and features present to make your life easier.
481
+
482
+
483
+ ### :on_save
484
+
485
+ When you are saving data to xml it is often important to change or manipulate data to a particular format. For example, a time object:
486
+
487
+ has_one :published_time, Time, :on_save => lambda {|time| time.strftime("%H:%M:%S") if time }
488
+
489
+ Here we add the option `:on_save` and specify a lambda, which is called with the value of `published_time` when the object is saved.
490
+
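The lambda from the example can be exercised on its own to see what it returns for a real time and for `nil`. This is plain Ruby with an illustrative timestamp; no HappyMapper call is involved.

```ruby
# The same lambda as in the declaration above, assigned to a local variable.
format_time = lambda { |time| time.strftime("%H:%M:%S") if time }

# A Time value is formatted as hours, minutes and seconds.
formatted = format_time.call(Time.utc(2011, 7, 10, 6, 52, 27))

# The trailing `if time` guard makes the lambda return nil instead of raising.
skipped = format_time.call(nil)

puts formatted
puts skipped.inspect
```

The guard is what keeps the lambda safe when the element has no value.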
491
+ ### :state_when_nil
492
+
493
+ When an element contains a nil value, or perhaps the result of the `:on_save` lambda correctly results in a nil value, you will be happy that the element will not appear in the resulting XML. However, there are times when you will want to see that element, and that's when `:state_when_nil` is there for you.
494
+
495
+ has_one :favorite_color, String, :state_when_nil => true
496
+
497
+ The resulting XML will include the 'favorite_color' element even if the favorite color has not been specified.
498
+
499
+ ### :read_only
500
+
501
+ When an element, attribute, or text node is a value that you have no interest in
502
+ saving to XML, you can keep it out of the output by marking it with `:read_only => true`.
503
+
504
+ has_one :modified, Boolean, :read_only => true
505
+ attribute :temporary, Boolean, :read_only => true
506
+
507
+ This is useful if perhaps the incoming XML is different than the out-going XML.
508
+
509
+ ### namespaces
510
+
511
+ While parsing the XML only required you to simply specify the prefix of the namespace you wanted to parse, when you persist to xml you will need to define your namespaces so that they are correctly captured.
512
+
513
+ class Address
514
+ include HappyMapper
515
+
516
+ register_namespace 'prefix', 'http://www.unicornland.com/prefix'
517
+ register_namespace 'different', 'http://www.trollcountry.com/different'
518
+
519
+ tag 'address'
520
+ namespace 'prefix'
521
+
522
+ has_many :street, String
523
+ element :postcode, String
524
+ element :housenumber, String
525
+ element :city, String
526
+ element :country, Country, :tag => 'country', :namespace => 'different'
527
+
528
+ end