feedzirra 0.4.0 → 0.5.0

Files changed (43)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +59 -18
  3. data/README.md +153 -128
  4. data/benchmarks/README.md +90 -0
  5. data/benchmarks/basic.rb +31 -0
  6. data/benchmarks/feed_list.txt +10 -0
  7. data/benchmarks/feed_xml/apple.xml +149 -0
  8. data/benchmarks/feed_xml/cnn.xml +278 -0
  9. data/benchmarks/feed_xml/daring_fireball.xml +1697 -0
  10. data/benchmarks/feed_xml/engadget.xml +604 -0
  11. data/benchmarks/feed_xml/feedzirra_commits.xml +370 -0
  12. data/benchmarks/feed_xml/gizmodo.xml +2 -0
  13. data/benchmarks/feed_xml/loop.xml +441 -0
  14. data/benchmarks/feed_xml/rails.xml +1938 -0
  15. data/benchmarks/feed_xml/white_house.xml +951 -0
  16. data/benchmarks/feed_xml/xkcd.xml +2 -0
  17. data/benchmarks/fetching_systems.rb +23 -0
  18. data/benchmarks/other_libraries.rb +73 -0
  19. data/feedzirra.gemspec +5 -6
  20. data/lib/feedzirra.rb +0 -1
  21. data/lib/feedzirra/feed_utilities.rb +1 -1
  22. data/lib/feedzirra/parser/atom.rb +1 -1
  23. data/lib/feedzirra/parser/itunes_rss.rb +1 -1
  24. data/lib/feedzirra/parser/itunes_rss_item.rb +2 -0
  25. data/lib/feedzirra/version.rb +1 -1
  26. data/spec/feedzirra/feed_spec.rb +13 -14
  27. data/spec/feedzirra/feed_utilities_spec.rb +44 -0
  28. data/spec/feedzirra/parser/atom_spec.rb +4 -0
  29. data/spec/feedzirra/parser/itunes_rss_item_spec.rb +8 -0
  30. data/spec/feedzirra/parser/itunes_rss_spec.rb +4 -0
  31. data/spec/sample_feeds/AtomFeedWithSpacesAroundEquals.xml +60 -0
  32. data/spec/sample_feeds/ITunesWithSpacesInAttributes.xml +62 -0
  33. data/spec/sample_feeds/itunes.xml +2 -0
  34. data/spec/spec_helper.rb +8 -0
  35. metadata +24 -34
  36. data/spec/benchmarks/feed_benchmarks.rb +0 -98
  37. data/spec/benchmarks/feedzirra_benchmarks.rb +0 -40
  38. data/spec/benchmarks/fetching_benchmarks.rb +0 -28
  39. data/spec/benchmarks/parsing_benchmark.rb +0 -30
  40. data/spec/benchmarks/updating_benchmarks.rb +0 -33
  41. data/spec/sample_feeds/run_against_sample.rb +0 -20
  42. data/spec/sample_feeds/top5kfeeds.dat +0 -2170
  43. data/spec/sample_feeds/trouble_feeds.txt +0 -16
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 95a08649fedde92fbdc49bf3df509bb0c2e95344
4
- data.tar.gz: bed6f361c5d49a5ab122ecceabd8730b08111fe8
3
+ metadata.gz: 048f791232329d98fa0df072ce3e5c7847c41749
4
+ data.tar.gz: 72e0a96c185791334ecdbfd7afab28221d49613d
5
5
  SHA512:
6
- metadata.gz: a84e2ffebe7036193224bc88b12a577c978f7fe04490f8547fdd32659ce7685a008f44e39d40eeb5dc4615a1fd77f59ab92d2f98e68fe3efc26e475c0ef6ea13
7
- data.tar.gz: 385e01cbe1825b3a9421a06c15d75095be655602d201a9b9d5787b3a042e5c6dcac33d243786d7ab090181e0fd06a2b81f26167cbfaaf9d8ff9d3b85331856e0
6
+ metadata.gz: 7c641a152886c14656b4e6152d2c8c295a60fe492d45960a3177cfdd71241a7a6af7771a100d632c65d649743f2fbc909dd34e6f8f9c68b5eecfe3b528dddce4
7
+ data.tar.gz: 846fe77d347dcfc4f2ae943f76472e6fab1f3116a1872149e2c78f0ac37082d156f066759d9f68b222950f380ef8867577629b9ae43f0c2e72fa6881bf332fb7
data/CHANGELOG.md CHANGED
@@ -1,26 +1,53 @@
1
1
  # Feedzirra Changelog
2
2
 
3
+ ## 0.5.0
4
+
5
+ * General
6
+ * Lots of README cleanup
7
+ * Remove pending specs
8
+ * Rewrite benchmarks and move them out of the spec folder
9
+ * Upgrade to latest Rspec
10
+
11
+ * Enhancements
12
+ * Allow spaces in rss tag when checking parse-ability [[#127][]]
13
+ * Compare `entry_id` and `url` for finding new entries [[#195][]]
14
+ * Add closed captioned and order tags for iTunesRSSItem [[#160][]]
15
+
16
+ [#127]: https://github.com/pauldix/feedzirra/pull/127
17
+ [#160]: https://github.com/pauldix/feedzirra/pull/160
18
+ [#195]: https://github.com/pauldix/feedzirra/pull/195
19
+
3
20
  ## 0.4.0
4
21
 
5
22
  * Enhancements
6
- * Raise when parser invokes its failure callback [[#159](https://github.com/pauldix/feedzirra/issues/159)]
7
- * Add PubSubHubbub hub urls as feed element [[#138](https://github.com/pauldix/feedzirra/pull/138)]
8
- * Add support for iTunes image in iTunes RSS item [[#164](https://github.com/pauldix/feedzirra/pull/164)]
23
+ * Raise when parser invokes its failure callback [[#159][]]
24
+ * Add PubSubHubbub hub urls as feed element [[#138][]]
25
+ * Add support for iTunes image in iTunes RSS item [[#164][]]
9
26
 
10
27
  * Bug fixes
11
- * Use curb callbacks rather than response codes [[#161](https://github.com/pauldix/feedzirra/pull/161)]
28
+ * Use curb callbacks rather than response codes [[#161][]]
29
+
30
+ [#138]: https://github.com/pauldix/feedzirra/pull/138
31
+ [#159]: https://github.com/pauldix/feedzirra/issues/159
32
+ [#161]: https://github.com/pauldix/feedzirra/pull/161
33
+ [#164]: https://github.com/pauldix/feedzirra/pull/164
12
34
 
13
35
  ## 0.3.0
14
36
 
15
37
  * General
16
- * Add CodeClimate badge [[#192](https://github.com/pauldix/feedzirra/pull/192)]
38
+ * Add CodeClimate badge [[#192][]]
17
39
 
18
40
  * Enhancements
19
- * CURL SSL Version option [[#156](https://github.com/pauldix/feedzirra/pull/156)]
20
- * Cookie support for Curb [[#98](https://github.com/pauldix/feedzirra/pull/98)]
41
+ * CURL SSL Version option [[#156][]]
42
+ * Cookie support for Curb [[#98][]]
21
43
 
22
44
  * Deprecations
23
- * For `ITunesRSSItem`, use `id` instead of `guid` [[#169](https://github.com/pauldix/feedzirra/pull/169)]
45
+ * For `ITunesRSSItem`, use `id` instead of `guid` [[#169][]]
46
+
47
+ [#98]: https://github.com/pauldix/feedzirra/pull/98
48
+ [#156]: https://github.com/pauldix/feedzirra/pull/156
49
+ [#169]: https://github.com/pauldix/feedzirra/pull/169
50
+ [#192]: https://github.com/pauldix/feedzirra/pull/192
24
51
 
25
52
  ## 0.2.2
26
53
 
@@ -31,33 +58,47 @@
31
58
  * README updates
32
59
 
33
60
  * Enhancements
34
- * Also use dc:identifier for `entry_id` [[#182](https://github.com/pauldix/feedzirra/pull/182)]
61
+ * Also use dc:identifier for `entry_id` [[#182][]]
35
62
 
36
63
  * Bug fixes
37
- * Don't try to sanitize non-existent elements [[#174](https://github.com/pauldix/feedzirra/pull/174)]
38
- * Fix Rspec deprecations [[#188](https://github.com/pauldix/feedzirra/pull/188)]
39
- * Fix Travis [[#191](https://github.com/pauldix/feedzirra/pull/191)]
64
+ * Don't try to sanitize non-existent elements [[#174][]]
65
+ * Fix Rspec deprecations [[#188][]]
66
+ * Fix Travis [[#191][]]
67
+
68
+ [#174]: https://github.com/pauldix/feedzirra/pull/174
69
+ [#182]: https://github.com/pauldix/feedzirra/pull/182
70
+ [#188]: https://github.com/pauldix/feedzirra/pull/188
71
+ [#191]: https://github.com/pauldix/feedzirra/pull/191
40
72
 
41
73
  ## 0.2.1
42
74
 
43
- * Use `Time.parse_safely` in `Feed.last_modified_from_header` [[#129](https://github.com/pauldix/feedzirra/pull/129)].
44
- * Added image to the RSS Entry Parser [[#103](https://github.com/pauldix/feedzirra/pull/103)].
45
- * Compatibility fixes for Ruby 2.0 [[#136](https://github.com/pauldix/feedzirra/pull/136)].
46
- * Remove gorillib dependency [[#113](https://github.com/pauldix/feedzirra/pull/113)].
75
+ * Use `Time.parse_safely` in `Feed.last_modified_from_header` [[#129][]].
76
+ * Added image to the RSS Entry Parser [[#103][]].
77
+ * Compatibility fixes for Ruby 2.0 [[#136][]].
78
+ * Remove gorillib dependency [[#113][]].
79
+
80
+ [#103]: https://github.com/pauldix/feedzirra/pull/103
81
+ [#113]: https://github.com/pauldix/feedzirra/pull/113
82
+ [#129]: https://github.com/pauldix/feedzirra/pull/129
83
+ [#136]: https://github.com/pauldix/feedzirra/pull/136
47
84
 
48
85
  ## 0.2.0.rc2
49
86
 
50
- * Bump sax-machine to `v0.2.0.rc1`, fixes encoding issues [[#76](https://github.com/pauldix/feedzirra/issues/76)].
87
+ * Bump sax-machine to `v0.2.0.rc1`, fixes encoding issues [[#76][]].
88
+
89
+ [#76]: https://github.com/pauldix/feedzirra/issues/76
51
90
 
52
91
  ## 0.2.0.rc1
53
92
 
54
93
  * Remove ActiveSupport dependency
55
94
  * No longer tethered to any version of Rails!
56
95
  * Update curb (v0.8.0) and rspec (v2.10.0)
57
- * Revert [3008ceb](https://github.com/pauldix/feedzirra/commit/3008ceb338df1f4c37a211d0aab8a6ad4f584dbc)
96
+ * Revert [3008ceb][]
58
97
  * Add Travis-CI integration
59
98
  * General repository and gem maintenance
60
99
 
100
+ [3008ceb]: https://github.com/pauldix/feedzirra/commit/3008ceb338df1f4c37a211d0aab8a6ad4f584dbc
101
+
61
102
  ## 0.1.3
62
103
 
63
104
  * ?
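Note on the 0.5.0 entry "Compare `entry_id` and `url` for finding new entries" (#195): the changelog does not show how the comparison works, so the following is only a minimal sketch of one plausible reading. It treats a fetched entry as already known when it shares a non-nil `entry_id` with a stored entry or has the same `url`. The `Entry` struct and the `existing_entry?` and `new_entries` helpers are hypothetical names for illustration, not Feedzirra's own code.

```ruby
# Hypothetical stand-in for a parsed feed entry; not Feedzirra's own class.
Entry = Struct.new(:entry_id, :url)

# Assumed rule: an entry is already known when it shares a non-nil entry_id
# with a stored entry, or when its url is identical to a stored entry's url.
def existing_entry?(known_entries, candidate)
  known_entries.any? do |known|
    same_id = !known.entry_id.nil? && known.entry_id == candidate.entry_id
    same_id || known.url == candidate.url
  end
end

# Keeps only the fetched entries that match no known entry.
def new_entries(known_entries, fetched_entries)
  fetched_entries.reject { |entry| existing_entry?(known_entries, entry) }
end

known = [Entry.new("tag:1", "http://example.com/a")]
fetched = [
  Entry.new("tag:1", "http://example.com/a-moved"), # same id, url changed
  Entry.new(nil, "http://example.com/a"),           # no id, same url
  Entry.new("tag:2", "http://example.com/b")        # genuinely new
]

new_entries(known, fetched).map(&:url) # => ["http://example.com/b"]
```

Checking both fields guards against feeds that change an entry's link while keeping its id, and against feeds that omit ids entirely. Whether the gem combines the two checks with "or" or "and" should be confirmed against `lib/feedzirra/feed_utilities.rb` in this release.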
data/README.md CHANGED
@@ -1,161 +1,187 @@
1
- # Feedzirra [![Build Status](https://secure.travis-ci.org/pauldix/feedzirra.png)](http://travis-ci.org/pauldix/feedzirra) [![Code Climate](https://codeclimate.com/github/pauldix/feedzirra.png)](https://codeclimate.com/github/pauldix/feedzirra)
1
+ # Feedzirra [![Build Status][travis-badge]][travis] [![Code Climate][code-climate-badge]][code-climate]
2
2
 
3
- I'd like feedback on the api and any bugs encountered on feeds in the wild. I've set up a [google group here](http://groups.google.com/group/feedzirra).
3
+ [travis-badge]: https://secure.travis-ci.org/pauldix/feedzirra.png
4
+ [travis]: http://travis-ci.org/pauldix/feedzirra
5
+ [code-climate-badge]: https://codeclimate.com/github/pauldix/feedzirra.png
6
+ [code-climate]: https://codeclimate.com/github/pauldix/feedzirra
7
+
8
+ I'd like feedback on the api and any bugs encountered on feeds in the wild. I've
9
+ set up a [google group][].
10
+
11
+ [google group]: http://groups.google.com/group/feedzirra
4
12
 
5
13
  ## Description
6
14
 
7
- Feedzirra is a feed library that is designed to get and update many feeds as quickly as possible. This includes using libcurl-multi through the [curb](https://github.com/taf2/curb) gem for faster http gets, and libxml through [nokogiri](https://github.com/tenderlove/nokogiri) and [sax-machine](https://github.com/pauldix/sax-machine) for faster parsing. Feedzirra requires at least Ruby 1.9.2.
15
+ Feedzirra is a feed library that is designed to get and update many feeds as
16
+ quickly as possible. This includes using libcurl-multi through the [curb][] gem
17
+ for faster http gets, and libxml through [nokogiri][] and [sax-machine][] for
18
+ faster parsing. Feedzirra requires at least Ruby 1.9.2.
8
19
 
9
- Once you have fetched feeds using Feedzirra, they can be updated using the feed objects. Feedzirra automatically inserts etag and last-modified information from the http response headers to lower bandwidth usage, eliminate unnecessary parsing, and make things speedier in general.
20
+ [curb]: https://github.com/taf2/curb
21
+ [nokogiri]: https://github.com/sparklemotion/nokogiri
22
+ [sax-machine]: https://github.com/pauldix/sax-machine
10
23
 
11
- Another feature present in Feedzirra is the ability to create callback functions that get called "on success" and "on failure" when getting a feed. This makes it easy to do things like log errors or update data stores.
24
+ Once you have fetched feeds using Feedzirra, they can be updated using the feed
25
+ objects. Feedzirra automatically inserts etag and last-modified information from
26
+ the http response headers to lower bandwidth usage, eliminate unnecessary
27
+ parsing, and make things speedier in general.
12
28
 
13
- The fetching and parsing logic have been decoupled so that either of them can be used in isolation if you'd prefer not to use everything that Feedzirra offers. However, the code examples below use helper methods in the Feed class that put everything together to make things as simple as possible.
29
+ Another feature present in Feedzirra is the ability to create callback functions
30
+ that get called "on success" and "on failure" when getting a feed. This makes it
31
+ easy to do things like log errors or update data stores.
14
32
 
15
- The final feature of Feedzirra is the ability to define custom parsing classes. In truth, Feedzirra could be used to parse much more than feeds. Microformats, page scraping, and almost anything else are fair game.
33
+ The fetching and parsing logic have been decoupled so that either of them can be
34
+ used in isolation if you'd prefer not to use everything that Feedzirra offers.
35
+ However, the code examples below use helper methods in the Feed class that put
36
+ everything together to make things as simple as possible.
37
+
38
+ The final feature of Feedzirra is the ability to define custom parsing classes.
39
+ In truth, Feedzirra could be used to parse much more than feeds. Microformats,
40
+ page scraping, and almost anything else are fair game.
16
41
 
17
42
  ## Speedup date parsing
18
43
 
19
- In MRI before 1.9.3 the date parsing code was written in Ruby and was optimized for readability over speed, to speed up this part you can install the [home_run](https://github.com/jeremyevans/home_run) gem to replace it with an optimized C version. In most cases, if you are using Ruby 1.9.3+, you will not need to use [home_run](https://github.com/jeremyevans/home_run).
44
+ In MRI before 1.9.3 the date parsing code was written in Ruby and was optimized
45
+ for readability over speed, to speed up this part you can install the
46
+ [home_run][] gem to replace it with an optimized C version. In most cases, if
47
+ you are using Ruby 1.9.3+, you will not need to use home\_run.
48
+
49
+ [home_run]: https://github.com/jeremyevans/home_run
20
50
 
21
51
  ## Usage
22
52
 
23
53
  [A gist of the following code](http://gist.github.com/57285)
24
54
 
25
55
  ```ruby
26
- require 'feedzirra'
27
-
28
- # fetching a single feed
29
- feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/PaulDixExplainsNothing")
30
-
31
- # feed and entries accessors
32
- feed.title # => "Paul Dix Explains Nothing"
33
- feed.url # => "http://www.pauldix.net"
34
- feed.feed_url # => "http://feeds.feedburner.com/PaulDixExplainsNothing"
35
- feed.etag # => "GunxqnEP4NeYhrqq9TyVKTuDnh0"
36
- feed.last_modified # => Sat Jan 31 17:58:16 -0500 2009 # it's a Time object
37
-
38
- entry = feed.entries.first
39
- entry.title # => "Ruby Http Client Library Performance"
40
- entry.url # => "http://www.pauldix.net/2009/01/ruby-http-client-library-performance.html"
41
- entry.author # => "Paul Dix"
42
- entry.summary # => "..."
43
- entry.content # => "..."
44
- entry.published # => Thu Jan 29 17:00:19 UTC 2009 # it's a Time object
45
- entry.categories # => ["...", "..."]
46
-
47
- # sanitizing an entry's content
48
- entry.title.sanitize # => returns the title with harmful stuff escaped
49
- entry.author.sanitize # => returns the author with harmful stuff escaped
50
- entry.content.sanitize # => returns the content with harmful stuff escaped
51
- entry.content.sanitize! # => returns content with harmful stuff escaped and replaces original (also exists for author and title)
52
- entry.sanitize! # => sanitizes the entry's title, author, and content in place (as in, it changes the value to clean versions)
53
- feed.sanitize_entries! # => sanitizes all entries in place
54
-
55
- # updating a single feed
56
- updated_feed = Feedzirra::Feed.update(feed)
57
-
58
- # an updated feed has the following extra accessors
59
- updated_feed.updated? # returns true if any of the feed attributes have been modified. will return false if no new entries
60
- updated_feed.new_entries # a collection of the entry objects that are newer than the latest in the feed before update
61
-
62
- # fetching multiple feeds
63
- feed_urls = ["http://feeds.feedburner.com/PaulDixExplainsNothing", "http://feeds.feedburner.com/trottercashion"]
64
- feeds = Feedzirra::Feed.fetch_and_parse(feed_urls)
65
-
66
- # feeds is now a hash with the feed_urls as keys and the parsed feed objects as values. If an error was thrown
67
- # there will be a Fixnum of the http response code instead of a feed object
68
-
69
- # updating multiple feeds. it expects a collection of feed objects
70
- updated_feeds = Feedzirra::Feed.update(feeds.values)
71
-
72
- # defining custom behavior on failure or success. note that a return status of 304 (not updated) will call the on_success handler
73
- feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/PaulDixExplainsNothing",
74
- :on_success => lambda {|url, feed| puts feed.title },
75
- :on_failure => lambda {|url, response_code, response_header, response_body| puts response_body })
76
- # if a collection was passed into fetch_and_parse, the handlers will be called for each one
77
-
78
- # the behavior for the handlers when using Feedzirra::Feed.update is slightly different. The feed passed into on_success will be
79
- # the updated feed with the standard updated accessors. on failure it will be the original feed object passed into update
80
-
81
- # fetching a feed via a proxy (optional)
82
- feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/PaulDixExplainsNothing", {:proxy_url => '10.0.0.1', :proxy_port => 3084})
83
-
56
+ require 'feedzirra'
57
+
58
+ # fetching a single feed
59
+ feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/PaulDixExplainsNothing")
60
+
61
+ # feed and entries accessors
62
+ feed.title # => "Paul Dix Explains Nothing"
63
+ feed.url # => "http://www.pauldix.net"
64
+ feed.feed_url # => "http://feeds.feedburner.com/PaulDixExplainsNothing"
65
+ feed.etag # => "GunxqnEP4NeYhrqq9TyVKTuDnh0"
66
+ feed.last_modified # => Sat Jan 31 17:58:16 -0500 2009 # it's a Time object
67
+
68
+ entry = feed.entries.first
69
+ entry.title # => "Ruby Http Client Library Performance"
70
+ entry.url # => "http://www.pauldix.net/2009/01/ruby-http-client-library-performance.html"
71
+ entry.author # => "Paul Dix"
72
+ entry.summary # => "..."
73
+ entry.content # => "..."
74
+ entry.published # => Thu Jan 29 17:00:19 UTC 2009 # it's a Time object
75
+ entry.categories # => ["...", "..."]
76
+
77
+ # sanitizing an entry's content
78
+ entry.title.sanitize # => returns the title with harmful stuff escaped
79
+ entry.author.sanitize # => returns the author with harmful stuff escaped
80
+ entry.content.sanitize # => returns the content with harmful stuff escaped
81
+ entry.content.sanitize! # => returns content with harmful stuff escaped and replaces original (also exists for author and title)
82
+ entry.sanitize! # => sanitizes the entry's title, author, and content in place (as in, it changes the value to clean versions)
83
+ feed.sanitize_entries! # => sanitizes all entries in place
84
+
85
+ # updating a single feed
86
+ updated_feed = Feedzirra::Feed.update(feed)
87
+
88
+ # an updated feed has the following extra accessors
89
+ updated_feed.updated? # returns true if any of the feed attributes have been modified. will return false if no new entries
90
+ updated_feed.new_entries # a collection of the entry objects that are newer than the latest in the feed before update
91
+
92
+ # fetching multiple feeds
93
+ feed_urls = ["http://feeds.feedburner.com/PaulDixExplainsNothing", "http://feeds.feedburner.com/trottercashion"]
94
+ feeds = Feedzirra::Feed.fetch_and_parse(feed_urls)
95
+
96
+ # feeds is now a hash with the feed_urls as keys and the parsed feed objects as values. If an error was thrown
97
+ # there will be a Fixnum of the http response code instead of a feed object
98
+
99
+ # updating multiple feeds. it expects a collection of feed objects
100
+ updated_feeds = Feedzirra::Feed.update(feeds.values)
101
+
102
+ # defining custom behavior on failure or success. note that a return status of 304 (not updated) will call the on_success handler
103
+ feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/PaulDixExplainsNothing",
104
+ :on_success => lambda {|url, feed| puts feed.title },
105
+ :on_failure => lambda {|url, response_code, response_header, response_body| puts response_body })
106
+ # if a collection was passed into fetch_and_parse, the handlers will be called for each one
107
+
108
+ # the behavior for the handlers when using Feedzirra::Feed.update is slightly different. The feed passed into on_success will be
109
+ # the updated feed with the standard updated accessors. on failure it will be the original feed object passed into update
110
+
111
+ # fetching a feed via a proxy (optional)
112
+ feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/PaulDixExplainsNothing", {:proxy_url => '10.0.0.1', :proxy_port => 3084})
84
113
  ```
114
+
85
115
  ## Extending
86
116
 
87
117
  ### Adding a feed parsing class
88
118
 
89
119
  ```ruby
90
- # Adds a new feed parsing class, this class will be used first
91
- Feedzirra::Feed.add_feed_class MyFeedClass
120
+ # Adds a new feed parsing class, this class will be used first
121
+ Feedzirra::Feed.add_feed_class MyFeedClass
92
122
  ```
93
123
 
94
124
  ### Adding attributes to all feeds types / all entries types
95
125
 
96
126
  ```ruby
97
- # Add the generator attribute to all feed types
98
- Feedzirra::Feed.add_common_feed_element('generator')
99
- Feedzirra::Feed.fetch_and_parse("href="http://www.pauldix.net/atom.xml").generator # => 'TypePad'
100
-
101
- # Add some GeoRss information
102
- Feedzirra::Feed.add_common_feed_entry_element('geo:lat', :as => :lat)
103
- Feedzirra::Feed.fetch_and_parse("http://www.earthpublisher.com/georss.php").entries.each do |e|
104
- p "lat: #{e.lat}, long: #{e.long}"
105
- end
127
+ # Add the generator attribute to all feed types
128
+ Feedzirra::Feed.add_common_feed_element('generator')
129
+ Feedzirra::Feed.fetch_and_parse("href="http://www.pauldix.net/atom.xml").generator # => 'TypePad'
130
+
131
+ # Add some GeoRss information
132
+ Feedzirra::Feed.add_common_feed_entry_element('geo:lat', :as => :lat)
133
+ Feedzirra::Feed.fetch_and_parse("http://www.earthpublisher.com/georss.php").entries.each do |e|
134
+ p "lat: #{e.lat}, long: #{e.long}"
135
+ end
106
136
  ```
107
137
 
108
138
  ### Adding attributes to only one class
109
139
 
110
- If you want to add attributes for only one class you simply have to declare them in the class
140
+ If you want to add attributes for only one class you simply have to declare them
141
+ in the class
111
142
 
112
143
  ```ruby
113
- # Add some GeoRss information
114
- require 'lib/feedzirra/parser/rss_entry'
115
-
116
- class Feedzirra::Parser::RSSEntry
117
- element 'geo:lat', :as => :lat
118
- element 'geo:long', :as => :long
119
- end
120
-
121
- # Fetch a feed containing GeoRss info and print them
122
- Feedzirra::Feed.fetch_and_parse("http://www.earthpublisher.com/georss.php").entries.each do |e|
123
- p "lat: #{e.lat}, long: #{e.long}"
124
- end
144
+ # Add some GeoRss information
145
+ require 'lib/feedzirra/parser/rss_entry'
146
+
147
+ class Feedzirra::Parser::RSSEntry
148
+ element 'geo:lat', :as => :lat
149
+ element 'geo:long', :as => :long
150
+ end
151
+
152
+ # Fetch a feed containing GeoRss info and print them
153
+ Feedzirra::Feed.fetch_and_parse("http://www.earthpublisher.com/georss.php").entries.each do |e|
154
+ p "lat: #{e.lat}, long: #{e.long}"
155
+ end
125
156
  ```
126
157
 
127
158
  ## Testing
128
159
 
129
- Feedzirra uses [curb](https://github.com/taf2/curb) to perform requests. `curb` provides bindings for [libcurl](http://curl.haxx.se/libcurl/) and supports numerous protocols, including FILE. To test Feedzirra with local file use `file://` protocol:
160
+ Feedzirra uses [curb][] to perform requests. `curb` provides bindings for
161
+ [libcurl][] and supports numerous protocols, including FILE. To test Feedzirra
162
+ with local file use `file://` protocol:
163
+
164
+ [libcurl]: http://curl.haxx.se/libcurl/
165
+
130
166
  ```ruby
131
167
  feed = Feedzirra::Feed.fetch_and_parse('file:///home/feedzirra/examples/feed.rss')
132
168
  ```
133
169
 
134
-
135
170
  ## Benchmarks
136
171
 
137
- One of the goals of Feedzirra is speed. This includes not only parsing, but fetching multiple feeds as quickly as possible. I ran a benchmark getting 20 feeds 10 times using Feedzirra, rFeedParser, and FeedNormalizer. For more details the [benchmark code can be found in the project in spec/benchmarks/feedzirra_benchmarks.rb](https://github.com/pauldix/feedzirra/blob/7fb5634c5c16e9c6ec971767b462c6518cd55f5d/spec/benchmarks/feedzirra_benchmarks.rb)
138
-
139
- feedzirra 5.170000 1.290000 6.460000 ( 18.917796)
140
- rfeedparser 104.260000 12.220000 116.480000 (244.799063)
141
- feed-normalizer 66.250000 4.010000 70.260000 (191.589862)
142
-
143
- The result of that benchmark is a bit sketchy because of the network variability. Running 10 times against the same 20 feeds was meant to smooth some of that out. However, there is also a [benchmark comparing parsing speed in spec/benchmarks/parsing_benchmark.rb](https://github.com/pauldix/feedzirra/blob/7fb5634c5c16e9c6ec971767b462c6518cd55f5d/spec/benchmarks/parsing_benchmark.rb) on an atom feed.
144
-
145
- feedzirra 0.500000 0.030000 0.530000 ( 0.658744)
146
- rfeedparser 8.400000 1.110000 9.510000 ( 11.839827)
147
- feed-normalizer 5.980000 0.160000 6.140000 ( 7.576140)
172
+ Since a major goal of Feedzirra is speed, benchmarks are provided--see the
173
+ [Benchmark README][benchmark_readme] for more details.
148
174
 
149
- There's also a [benchmark that shows the results of using Feedzirra to perform updates on feeds](https://github.com/pauldix/feedzirra/blob/45d64319544c61a4c9eb9f7f825c73b9f9030cb3/spec/benchmarks/updating_benchmarks.rb) you've already pulled in. I tested against 179 feeds. The first is the initial pull and the second is an update 65 seconds later. I'm not sure how many of them support etag and last-modified, so performance may be better or worse depending on what feeds you're requesting.
150
-
151
- feedzirra fetch and parse 4.010000 0.710000 4.720000 ( 15.110101)
152
- feedzirra update 0.660000 0.280000 0.940000 ( 5.152709)
175
+ [benchmark_readme]: https://github.com/pauldix/feedzirra/blob/master/benchmarks/README.md
153
176
 
154
177
  ## TODO
155
178
 
156
- This thing needs to hammer on many different feeds in the wild. I'm sure there will be bugs. I want to find them and crush them. I didn't bother using the test suite for feedparser. i wanted to start fresh.
179
+ This thing needs to hammer on many different feeds in the wild. I'm sure there
180
+ will be bugs. I want to find them and crush them. I didn't bother using the test
181
+ suite for feedparser. i wanted to start fresh.
157
182
 
158
183
  Here are some more specific TODOs.
184
+
159
185
  * Make a feedzirra-rails gem to integrate feedzirra seamlessly with Rails and ActiveRecord.
160
186
  * Add support for authenticated feeds.
161
187
  * Create a super sweet DSL for defining new parsers.
@@ -168,27 +194,26 @@ Here are some more specific TODOs.
168
194
 
169
195
  (The MIT License)
170
196
 
171
- Copyright (c) 2009-2012:
197
+ Copyright (c) 2009-2013:
172
198
 
173
199
  - [Paul Dix](http://pauldix.net)
174
200
  - [Julien Kirch](http://archiloque.net/)
175
201
  - [Ezekiel Templin](http://zeke.templ.in/)
176
-
177
- Permission is hereby granted, free of charge, to any person obtaining
178
- a copy of this software and associated documentation files (the
179
- 'Software'), to deal in the Software without restriction, including
180
- without limitation the rights to use, copy, modify, merge, publish,
181
- distribute, sublicense, and/or sell copies of the Software, and to
182
- permit persons to whom the Software is furnished to do so, subject to
183
- the following conditions:
184
-
185
- The above copyright notice and this permission notice shall be
186
- included in all copies or substantial portions of the Software.
187
-
188
- THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
189
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
190
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
191
- IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
192
- CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
193
- TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
194
- SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
202
+ - [Jon Allured](http://jonallured.com/)
203
+
204
+ Permission is hereby granted, free of charge, to any person obtaining a copy of
205
+ this software and associated documentation files (the 'Software'), to deal in
206
+ the Software without restriction, including without limitation the rights to
207
+ use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
208
+ the Software, and to permit persons to whom the Software is furnished to do so,
209
+ subject to the following conditions:
210
+
211
+ The above copyright notice and this permission notice shall be included in all
212
+ copies or substantial portions of the Software.
213
+
214
+ THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
215
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
216
+ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
217
+ COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
218
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
219
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.