rubylibre-feedzirra 0.0.14 → 0.0.23

Files changed (43)
  1. data/README.rdoc +169 -0
  2. data/README.textile +9 -0
  3. data/lib/feedzirra/feed.rb +32 -37
  4. data/lib/feedzirra/parser/atom.rb +9 -0
  5. data/lib/feedzirra/parser/atom_entry.rb +6 -0
  6. data/lib/feedzirra/parser/atom_feed_burner_entry.rb +1 -1
  7. data/lib/feedzirra/parser/itunes_category.rb +12 -0
  8. data/lib/feedzirra/parser/mrss_category.rb +11 -0
  9. data/lib/feedzirra/parser/mrss_content.rb +48 -0
  10. data/lib/feedzirra/parser/mrss_copyright.rb +10 -0
  11. data/lib/feedzirra/parser/mrss_credit.rb +11 -0
  12. data/lib/feedzirra/parser/mrss_group.rb +37 -0
  13. data/lib/feedzirra/parser/mrss_hash.rb +10 -0
  14. data/lib/feedzirra/parser/mrss_player.rb +11 -0
  15. data/lib/feedzirra/parser/mrss_rating.rb +10 -0
  16. data/lib/feedzirra/parser/mrss_restriction.rb +11 -0
  17. data/lib/feedzirra/parser/mrss_text.rb +13 -0
  18. data/lib/feedzirra/parser/mrss_thumbnail.rb +11 -0
  19. data/lib/feedzirra/parser/rss.rb +64 -9
  20. data/lib/feedzirra/parser/rss_entry.rb +54 -14
  21. data/lib/feedzirra/parser/rss_image.rb +15 -0
  22. data/lib/feedzirra.rb +17 -5
  23. data/spec/benchmarks/feed_benchmarks.rb +98 -0
  24. data/spec/benchmarks/feedzirra_benchmarks.rb +40 -0
  25. data/spec/benchmarks/fetching_benchmarks.rb +28 -0
  26. data/spec/benchmarks/parsing_benchmark.rb +30 -0
  27. data/spec/benchmarks/updating_benchmarks.rb +33 -0
  28. data/spec/feedzirra/feed_spec.rb +35 -53
  29. data/spec/feedzirra/parser/atom_entry_spec.rb +4 -0
  30. data/spec/feedzirra/parser/atom_spec.rb +8 -0
  31. data/spec/feedzirra/parser/mrss_content_spec.rb +32 -0
  32. data/spec/feedzirra/parser/rss_entry_spec.rb +121 -8
  33. data/spec/feedzirra/parser/rss_spec.rb +66 -14
  34. data/spec/sample_feeds/run_against_sample.rb +20 -0
  35. data/spec/spec_helper.rb +3 -3
  36. metadata +37 -22
  37. data/lib/feedzirra/parser/itunes_rss.rb +0 -50
  38. data/lib/feedzirra/parser/itunes_rss_item.rb +0 -31
  39. data/lib/feedzirra/parser/itunes_rss_owner.rb +0 -12
  40. data/spec/feedzirra/parser/itunes_rss_item_spec.rb +0 -48
  41. data/spec/feedzirra/parser/itunes_rss_owner_spec.rb +0 -18
  42. data/spec/feedzirra/parser/itunes_rss_spec.rb +0 -50
  43. data/spec/spec.opts +0 -2
data/README.rdoc ADDED
@@ -0,0 +1,169 @@
+ == Feedzirra
+
+ I'd like feedback on the api and any bugs encountered on feeds in the wild. I've set up a {google group here}[http://groups.google.com/group/feedzirra].
+
+ === Description
+
+ Feedzirra is a feed library that is designed to get and update many feeds as quickly as possible. This includes using libcurl-multi through the taf2-curb[link:http://github.com/taf2/curb/tree/master] gem for faster http gets, and libxml through nokogiri[link:http://github.com/tenderlove/nokogiri/tree/master] and sax-machine[link:http://github.com/pauldix/sax-machine/tree/master] for faster parsing.
+
+ Once you have fetched feeds using Feedzirra, they can be updated using the feed objects. Feedzirra automatically inserts etag and last-modified information from the http response headers to lower bandwidth usage, eliminate unnecessary parsing, and make things speedier in general.
+
+ Another feature present in Feedzirra is the ability to create callback functions that get called "on success" and "on failure" when getting a feed. This makes it easy to do things like log errors or update data stores.
+
+ The fetching and parsing logic have been decoupled so that either of them can be used in isolation if you'd prefer not to use everything that Feedzirra offers. However, the code examples below use helper methods in the Feed class that put everything together to make things as simple as possible.
+
+ The final feature of Feedzirra is the ability to define custom parsing classes. In truth, Feedzirra could be used to parse much more than feeds. Microformats, page scraping, and almost anything else are fair game.
+
+ === Installation
+
+ For now Feedzirra exists only on github. It also has a few gem requirements that are only on github. Before you start you need to have libcurl[link:http://curl.haxx.se/] and libxml[link:http://xmlsoft.org/] installed. If you're on Leopard you have both. Otherwise, you'll need to grab them. Once you've got those libraries, these are the gems that get used: nokogiri, pauldix-sax-machine, taf2-curb (note that this is a fork that lives on github and not the Ruby Forge version of curb), and pauldix-feedzirra. The feedzirra gemspec has all the dependencies so you should be able to get up and running with the standard github gem install routine:
+
+   gem sources -a http://gems.github.com # if you haven't already
+   gem install pauldix-feedzirra
+
+ *NOTE:* Some people have been reporting a few issues related to installation. First, the Ruby Forge version of curb is not what you want. It will not work. Nor will the curl-multi gem that lives on Ruby Forge. You have to get the taf2-curb[link:http://github.com/taf2/curb/tree/master] fork installed.
+
+ If you see this error when doing a require:
+
+   /Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- curb_core (LoadError)
+
+ It means that the taf2-curb gem didn't build correctly. To resolve this you can do a git clone git://github.com/taf2/curb.git then run rake gem in the curb directory, then sudo gem install pkg/curb-0.2.4.0.gem. After that you should be good.
+
+ If you see something like this when trying to run it:
+
+   NoMethodError: undefined method `on_success' for #<Curl::Easy:0x1182724>
+   from ./lib/feedzirra/feed.rb:88:in `add_url_to_multi'
+
+ This means that you are requiring curl-multi or the Ruby Forge version of Curb somewhere. You can't use those and need to get the taf2 version up and running.
+
+ If you're on Debian or Ubuntu and getting errors while trying to install the taf2-curb gem, it could be because you don't have the latest version of libcurl installed. Do this to fix:
+
+   sudo apt-get install libcurl4-gnutls-dev
+
+ Another problem could be if you are running Mac Ports and you have libcurl installed through there. You need to uninstall it for curb to work! The version in Mac Ports is old and doesn't play nice with curb. If you're running Leopard, you can just uninstall and you should be golden. If you're on an older version of OS X, you'll then need to {download curl}[http://curl.haxx.se/download.html] and build from source. Then you'll have to install the taf2-curb gem again. You might have to perform the step above.
+
+ If you're still having issues, please let me know on the mailing list. Also, {Todd Fisher (taf2)}[link:http://github.com/taf2] is working on fixing the gem install. Please send him a full error report.
+
+ === Usage
+
+ {A gist of the following code}[link:http://gist.github.com/57285]
+
+   require 'feedzirra'
+
+   # fetching a single feed
+   feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/PaulDixExplainsNothing")
+
+   # feed and entries accessors
+   feed.title # => "Paul Dix Explains Nothing"
+   feed.url # => "http://www.pauldix.net"
+   feed.feed_url # => "http://feeds.feedburner.com/PaulDixExplainsNothing"
+   feed.etag # => "GunxqnEP4NeYhrqq9TyVKTuDnh0"
+   feed.last_modified # => Sat Jan 31 17:58:16 -0500 2009 # it's a Time object
+
+   entry = feed.entries.first
+   entry.title # => "Ruby Http Client Library Performance"
+   entry.url # => "http://www.pauldix.net/2009/01/ruby-http-client-library-performance.html"
+   entry.author # => "Paul Dix"
+   entry.summary # => "..."
+   entry.content # => "..."
+   entry.published # => Thu Jan 29 17:00:19 UTC 2009 # it's a Time object
+   entry.categories # => ["...", "..."]
+
+   # sanitizing an entry's content
+   entry.title.sanitize # => returns the title with harmful stuff escaped
+   entry.author.sanitize # => returns the author with harmful stuff escaped
+   entry.content.sanitize # => returns the content with harmful stuff escaped
+   entry.content.sanitize! # => returns content with harmful stuff escaped and replaces original (also exists for author and title)
+   entry.sanitize! # => sanitizes the entry's title, author, and content in place (as in, it changes the value to clean versions)
+   feed.sanitize_entries! # => sanitizes all entries in place
+
+   # updating a single feed
+   updated_feed = Feedzirra::Feed.update(feed)
+
+   # an updated feed has the following extra accessors
+   updated_feed.updated? # returns true if any of the feed attributes have been modified. will return false if only new entries
+   updated_feed.new_entries # a collection of the entry objects that are newer than the latest in the feed before update
+
+   # fetching multiple feeds
+   feed_urls = ["http://feeds.feedburner.com/PaulDixExplainsNothing", "http://feeds.feedburner.com/trottercashion"]
+   feeds = Feedzirra::Feed.fetch_and_parse(feed_urls)
+
+   # feeds is now a hash with the feed_urls as keys and the parsed feed objects as values. If an error was thrown
+   # there will be a Fixnum of the http response code instead of a feed object
+
+   # updating multiple feeds. it expects a collection of feed objects
+   updated_feeds = Feedzirra::Feed.update(feeds.values)
+
+   # defining custom behavior on failure or success. note that a return status of 304 (not updated) will call the on_success handler
+   feed = Feedzirra::Feed.fetch_and_parse("http://feeds.feedburner.com/PaulDixExplainsNothing",
+     :on_success => lambda {|feed| puts feed.title },
+     :on_failure => lambda {|url, response_code, response_header, response_body| puts response_body })
+   # if a collection was passed into fetch_and_parse, the handlers will be called for each one
+
+   # the behavior for the handlers when using Feedzirra::Feed.update is slightly different. The feed passed into on_success will be
+   # the updated feed with the standard updated accessors. on failure it will be the original feed object passed into update
+
+   # Defining custom parsers
+   # TODO: the functionality is here, just write some good examples that show how to do this
+
+ === Extending
+
+ === Benchmarks
+
+ One of the goals of Feedzirra is speed. This includes not only parsing, but fetching multiple feeds as quickly as possible. I ran a benchmark getting 20 feeds 10 times using Feedzirra, rFeedParser, and FeedNormalizer. For more details the {benchmark code can be found in the project in spec/benchmarks/feedzirra_benchmarks.rb}[http://github.com/pauldix/feedzirra/blob/7fb5634c5c16e9c6ec971767b462c6518cd55f5d/spec/benchmarks/feedzirra_benchmarks.rb]
+
+   feedzirra 5.170000 1.290000 6.460000 ( 18.917796)
+   rfeedparser 104.260000 12.220000 116.480000 (244.799063)
+   feed-normalizer 66.250000 4.010000 70.260000 (191.589862)
+
+ The result of that benchmark is a bit sketchy because of the network variability. Running 10 times against the same 20 feeds was meant to smooth some of that out. However, there is also a {benchmark comparing parsing speed in spec/benchmarks/parsing_benchmark.rb}[http://github.com/pauldix/feedzirra/blob/7fb5634c5c16e9c6ec971767b462c6518cd55f5d/spec/benchmarks/parsing_benchmark.rb] on an atom feed.
+
+   feedzirra 0.500000 0.030000 0.530000 ( 0.658744)
+   rfeedparser 8.400000 1.110000 9.510000 ( 11.839827)
+   feed-normalizer 5.980000 0.160000 6.140000 ( 7.576140)
+
+ There's also a {benchmark that shows the results of using Feedzirra to perform updates on feeds}[http://github.com/pauldix/feedzirra/blob/45d64319544c61a4c9eb9f7f825c73b9f9030cb3/spec/benchmarks/updating_benchmarks.rb] you've already pulled in. I tested against 179 feeds. The first is the initial pull and the second is an update 65 seconds later. I'm not sure how many of them support etag and last-modified, so performance may be better or worse depending on what feeds you're requesting.
+
+   feedzirra fetch and parse 4.010000 0.710000 4.720000 ( 15.110101)
+   feedzirra update 0.660000 0.280000 0.940000 ( 5.152709)
+
+ === TODO
+
+ This thing needs to hammer on many different feeds in the wild. I'm sure there will be bugs. I want to find them and crush them. I didn't bother using the test suite for feedparser. I wanted to start fresh.
+
+ Here are some more specific TODOs.
+ * Make a feedzirra-rails gem to integrate feedzirra seamlessly with Rails and ActiveRecord.
+ * Add support for authenticated feeds.
+ * Create a super sweet DSL for defining new parsers.
+ * Test against Ruby 1.9.1 and fix any bugs.
+ * I'm not keeping track of modified on entries. Should I add this?
+ * Clean up the fetching code inside feed.rb so it doesn't suck so hard.
+ * Make the feed_spec actually mock stuff out so it doesn't hit the net.
+ * Readdress how feeds determine if they can parse a document. Maybe I should use namespaces instead?
+
+ === LICENSE
+
+ (The MIT License)
+
+ Copyright (c) 2009:
+
+ {Paul Dix}[http://pauldix.net]
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ 'Software'), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
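Review note: the README above says that fetching several feeds returns a hash keyed by URL, where a failed fetch leaves an integer HTTP status in place of a feed object. The following is a minimal sketch of how a caller might separate the two cases. It does not call Feedzirra; `FakeFeed`, `split_results`, and the example.com URLs are hypothetical names used only so the snippet runs on its own. `Integer` is used instead of `Fixnum` so the check also works on current Ruby versions.

```ruby
# Stand-in for a parsed feed object (hypothetical, not part of Feedzirra).
FakeFeed = Struct.new(:title)

# Splits a { url => feed_or_status_code } hash into [successes, failures].
def split_results(results)
  failures, successes = results.partition { |_url, value| value.is_a?(Integer) }
  [successes.to_h, failures.to_h]
end

# Sample data shaped like the hash the README describes.
results = {
  "http://example.com/a.xml" => FakeFeed.new("Feed A"),
  "http://example.com/b.xml" => 404
}

feeds, errors = split_results(results)
feeds.each  { |url, feed| puts "#{url}: #{feed.title}" }
errors.each { |url, code| puts "#{url}: failed with HTTP #{code}" }
```

Only the successful values should then be handed to an update call, since the README states that update expects a collection of feed objects.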
data/README.textile CHANGED
@@ -1,3 +1,5 @@
+ "Changes made from Paul's tree":#changes
+
  h1. Feedzirra
 
  "http://github.com/pauldix/feedzirra/tree/master":http://github.com/pauldix/feedzirra/tree/master
@@ -168,6 +170,13 @@ Here are some more specific TODOs.
  * Clean up the fetching code inside feed.rb so it doesn't suck so hard.
  * Readdress how feeds determine if they can parse a document. Maybe I should use namespaces instead?
 
+ h2(#changes). Updates from Paul's tree
+
+ * Add in Media RSS support that should handle the spec with multiple content entries.
+ * Add in the missing support for nested iTunes Categories
+ * Combined iTunes into RSS, removed extra files
+ * Made element names more consistent across RSS and iTunes
+
  h2. LICENSE
 
  (The MIT License)
data/lib/feedzirra/feed.rb CHANGED
@@ -1,9 +1,9 @@
  module Feedzirra
  class NoParserAvailable < StandardError; end
-
+
  class Feed
  USER_AGENT = "feedzirra http://github.com/pauldix/feedzirra/tree/master"
-
+
  # Takes a raw XML feed and attempts to parse it. If no parser is available a Feedzirra::NoParserAvailable exception is raised.
  #
  # === Parameters
@@ -21,13 +21,13 @@ module Feedzirra
  end
 
  # Determines the correct parser class to use for parsing the feed.
- #
+ #
  # === Parameters
  # [xml<String>] The XML that you would like to determine the parser for.
  # === Returns
  # The class name of the parser that can handle the XML.
  def self.determine_feed_parser_for_xml(xml)
- start_of_doc = xml.slice(0, 1000)
+ start_of_doc = xml.slice(0, 2000)
  feed_classes.detect {|klass| klass.able_to_parse?(start_of_doc)}
  end
 
@@ -37,7 +37,7 @@ module Feedzirra
  # [klass<Constant>] The class/constant that you want to register.
  # === Returns
  # An updated array of feed parser class names.
- def self.add_feed_class(klass)
+ def self.add_feed_class(klass)
  feed_classes.unshift klass
  end
 
@@ -46,24 +46,14 @@ module Feedzirra
  # === Returns
  # An array of class names.
  def self.feed_classes
- @feed_classes ||= [Feedzirra::Parser::RSS, Feedzirra::Parser::AtomFeedBurner, Feedzirra::Parser::Atom]
- end
-
- # Makes all feed types look for the passed in element to parse. This is actually just a call to
- # element (a SAXMachine call) in the class
- #
- # === Parameters
- # [element_tag<String>]
- # [options<Hash>] Valid keys are same as with SAXMachine
- def self.add_common_feed_element(element_tag, options = {})
- # need to think of a better way to do this. will break for people who want this behavior
- # across their added classes
- feed_classes.map{|k| eval("#{k}") }.each do |klass|
- klass.send(:element, element_tag, options)
- end
+ @feed_classes ||= [
+ Feedzirra::Parser::RSS,
+ Feedzirra::Parser::AtomFeedBurner,
+ Feedzirra::Parser::Atom
+ ]
  end
-
- # Makes all entry types look for the passed in element to parse. This is actually just a call to
+
+ # Makes all entry types look for the passed in element to parse. This is actually just a call to
  # element (a SAXMachine call) in the class
  #
  # === Parameters
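Review note: the hunks above touch the parser lookup. The registry is an ordered list, `add_feed_class` puts new classes at the front, and detection now inspects the first 2000 characters instead of 1000. The sketch below reproduces that flow in plain Ruby under stated assumptions: `FakeRSSParser`, `FakeAtomParser`, and `ParserRegistry` are hypothetical stand-ins and their regexes are illustrative, not the library's own checks.

```ruby
# Hypothetical stand-ins for parser classes; the real ones live under Feedzirra::Parser.
class FakeRSSParser
  def self.able_to_parse?(xml)
    !!(xml =~ /<rss/)
  end
end

class FakeAtomParser
  def self.able_to_parse?(xml)
    !!(xml =~ /Atom/)
  end
end

class ParserRegistry
  def initialize(*classes)
    @classes = classes
  end

  # Like add_feed_class: new parsers go to the front so they are tried first.
  def add(klass)
    @classes.unshift(klass)
  end

  # Like determine_feed_parser_for_xml: only the start of the document is inspected.
  def parser_for(xml)
    start_of_doc = xml.slice(0, 2000)
    @classes.detect { |klass| klass.able_to_parse?(start_of_doc) }
  end
end

registry = ParserRegistry.new(FakeRSSParser, FakeAtomParser)
p registry.parser_for('<rss version="2.0"></rss>')
p registry.parser_for('<feed xmlns="http://www.w3.org/2005/Atom"/>')
p registry.parser_for("<html/>")
```

Because `detect` returns the first match, order matters: a class registered later through `add` takes precedence over the defaults, and a document matching no class yields `nil`.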
@@ -76,7 +66,7 @@ module Feedzirra
  klass.send(:element, element_tag, options)
  end
  end
-
+
  # Fetches and returns the raw XML for each URL provided.
  #
  # === Parameters
@@ -89,7 +79,7 @@ module Feedzirra
  # :on_failure - Block that gets executed after a failed request.
  # === Returns
  # A String of XML if a single URL is passed.
- #
+ #
  # A Hash if multiple URLs are passed. The key will be the URL, and the value the XML.
  def self.fetch_raw(urls, options = {})
  url_queue = [*urls]
@@ -108,9 +98,11 @@ module Feedzirra
  curl.timeout = options[:timeout] if options[:timeout]
 
  curl.on_success do |c|
+ c = c.select { |e| e.kind_of? Curl::Easy }.first if(c.kind_of? Array)
  responses[url] = decode_content(c)
  end
  curl.on_failure do |c|
+ c = c.select { |e| e.kind_of? Curl::Easy }.first if(c.kind_of? Array)
  responses[url] = c.response_code
  end
  end
@@ -139,13 +131,13 @@ module Feedzirra
  url_queue = [*urls]
  multi = Curl::Multi.new
  responses = {}
-
+
  # I broke these down so I would only try to do 30 simultaneously because
  # I was getting weird errors when doing a lot. As one finishes it pops another off the queue.
  url_queue.slice!(0, 30).each do |url|
  add_url_to_multi(multi, url, url_queue, responses, options)
  end
-
+
  multi.perform
  return urls.is_a?(String) ? responses.values.first : responses
  end
@@ -162,7 +154,7 @@ module Feedzirra
  gz = Zlib::GzipReader.new(StringIO.new(c.body_str))
  xml = gz.read
  gz.close
- rescue Zlib::GzipFile::Error
+ rescue Zlib::GzipFile::Error
  # Maybe this is not gzipped?
  xml = c.body_str
  end
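Review note: the last hunk above shows the gzip fallback in the content decoding path. The body is read through `Zlib::GzipReader`, and if that raises `Zlib::GzipFile::Error` the raw body is used instead. A minimal standalone sketch of that pattern follows. `decode_body` is a hypothetical helper name, and unlike the library code it does not look at response headers first; it only shows the rescue-and-fall-back step.

```ruby
require "zlib"
require "stringio"

# Returns the decompressed text when +body+ is gzip data, otherwise +body+ unchanged.
def decode_body(body)
  gz = Zlib::GzipReader.new(StringIO.new(body))
  begin
    gz.read
  ensure
    gz.close
  end
rescue Zlib::GzipFile::Error
  # Not gzip after all, so hand back the original bytes.
  body
end

compressed = Zlib.gzip("<rss></rss>")
puts decode_body(compressed)
puts decode_body("<rss></rss>")
```

The `ensure` block closes the reader even when reading fails partway through, which the straight-line version in the diff would skip.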
@@ -191,15 +183,15 @@ module Feedzirra
  feed_queue = [*feeds]
  multi = Curl::Multi.new
  responses = {}
-
+
  feed_queue.slice!(0, 30).each do |feed|
  add_feed_to_multi(multi, feed, feed_queue, responses, options)
  end
-
+
  multi.perform
  responses.size == 1 ? responses.values.first : responses.values
  end
-
+
  # An abstraction for adding a feed by URL to the passed Curb::multi stack.
  #
  # === Parameters
@@ -227,11 +219,11 @@ module Feedzirra
  curl.timeout = options[:timeout] if options[:timeout]
 
  curl.on_success do |c|
- c = c.select{ |n| n.kind_of?(Curl::Easy) }.first if c.kind_of?(Array)
+ c = c.select { |e| e.kind_of? Curl::Easy }.first if(c.kind_of? Array)
  add_url_to_multi(multi, url_queue.shift, url_queue, responses, options) unless url_queue.empty?
  xml = decode_content(c)
  klass = determine_feed_parser_for_xml(xml)
-
+
  if klass
  begin
  feed = klass.parse(xml)
@@ -249,8 +241,9 @@ module Feedzirra
  options[:on_failure].call(url, c.response_code, c.header_str, c.body_str) if options.has_key?(:on_failure)
  end
  end
-
+
  curl.on_failure do |c|
+ c = c.select { |e| e.kind_of? Curl::Easy }.first if(c.kind_of? Array)
  add_url_to_multi(multi, url_queue.shift, url_queue, responses, options) unless url_queue.empty?
  responses[url] = c.response_code
  options[:on_failure].call(url, c.response_code, c.header_str, c.body_str) if options.has_key?(:on_failure)
@@ -258,7 +251,7 @@ module Feedzirra
  end
  multi.add(easy)
  end
-
+
  # An abstraction for adding a feed by a Feed object to the passed Curb::multi stack.
  #
  # === Parameters
@@ -273,7 +266,7 @@ module Feedzirra
  # * :on_failure - Block that gets executed after a failed request.
  # === Returns
  # The updated Curl::Multi object with the request details added to its stack.
- def self.add_feed_to_multi(multi, feed, feed_queue, responses, options)
+ def self.add_feed_to_multi(multi, feed, feed_queue, responses, options)
  easy = Curl::Easy.new(feed.feed_url) do |curl|
  curl.headers["User-Agent"] = (options[:user_agent] || USER_AGENT)
  curl.headers["If-Modified-Since"] = feed.last_modified.httpdate if feed.last_modified
@@ -285,6 +278,7 @@ module Feedzirra
  curl.timeout = options[:timeout] if options[:timeout]
 
  curl.on_success do |c|
+ c = c.select { |e| e.kind_of? Curl::Easy }.first if(c.kind_of? Array)
  begin
  add_feed_to_multi(multi, feed_queue.shift, feed_queue, responses, options) unless feed_queue.empty?
  updated_feed = Feed.parse(c.body_str)
@@ -300,6 +294,7 @@ module Feedzirra
  end
 
  curl.on_failure do |c|
+ c = c.select { |e| e.kind_of? Curl::Easy }.first if(c.kind_of? Array)
  add_feed_to_multi(multi, feed_queue.shift, feed_queue, responses, options) unless feed_queue.empty?
  response_code = c.response_code
  if response_code == 304 # it's not modified. this isn't an error condition
@@ -315,7 +310,7 @@ module Feedzirra
  end
 
  # Determines the etag from the response headers.
- #
+ #
  # === Parameters
  # [header<String>] Raw response header returned from the request
  # === Returns
@@ -336,4 +331,4 @@ module Feedzirra
  Time.parse($1) if $1
  end
  end
- end
+ end
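Review note: the line added to every `on_success` and `on_failure` callback in feed.rb guards against the callback receiving an Array instead of a single handle, picking out the element that is a `Curl::Easy`. The sketch below isolates that normalisation so it can be run without curb. `FakeEasy` and `extract_handle` are hypothetical names, and the Array shape used in the example is an assumption made for illustration, not documented curb behaviour.

```ruby
# Hypothetical stand-in for Curl::Easy so the sketch runs without curb installed.
FakeEasy = Struct.new(:response_code)

# Returns the handle whether the callback argument is the handle itself
# or an Array that contains it somewhere among other values.
def extract_handle(arg, handle_class)
  return arg unless arg.is_a?(Array)
  arg.find { |item| item.is_a?(handle_class) }
end

easy = FakeEasy.new(200)
puts extract_handle(easy, FakeEasy).response_code
puts extract_handle([easy, :extra], FakeEasy).response_code
```

`find` returns the same element as `select { ... }.first` but stops at the first match. Pulling the guard into one helper would also avoid repeating the same line in six callbacks.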
data/lib/feedzirra/parser/atom.rb CHANGED
@@ -15,11 +15,20 @@ module Feedzirra
  element :title
  element :link, :as => :url, :value => :href, :with => {:type => "text/html"}
  element :link, :as => :feed_url, :value => :href, :with => {:type => "application/atom+xml"}
+ elements :link, :as => :links, :value => :href
  elements :entry, :as => :entries, :class => AtomEntry
 
  def self.able_to_parse?(xml) #:nodoc:
  xml =~ /(Atom)|(#{Regexp.escape("http://purl.org/atom")})/
  end
+
+ def url
+ @url || links.last
+ end
+
+ def feed_url
+ @feed_url || links.first
+ end
  end
  end
 
data/lib/feedzirra/parser/atom_entry.rb CHANGED
@@ -27,6 +27,12 @@ module Feedzirra
  element :updated
  element :modified, :as => :updated
  elements :category, :as => :categories, :value => :term
+ elements :link, :as => :links, :value => :href
+ elements :link, :as => :enclosure_links, :value => :href, :with => {:rel => "enclosure"}
+
+ def url
+ @url || links.first
+ end
  end
 
  end
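Review note: both Atom hunks add the same fallback idea. Every `link` href is collected into `links`, and the `url` / `feed_url` readers return the specifically matched value when one was found and otherwise fall back to a position in that list. A plain-Ruby sketch of the reader logic follows. `LinkFallback` is a hypothetical class, and setting the instance variables through the constructor stands in for what the SAX parser would assign.

```ruby
# Hypothetical class that mimics the feed-level readers added in atom.rb.
class LinkFallback
  attr_reader :links

  def initialize(links: [], url: nil, feed_url: nil)
    @links = links
    @url = url
    @feed_url = feed_url
  end

  # Prefer the explicitly matched value; otherwise use the last collected link.
  def url
    @url || links.last
  end

  # Prefer the explicitly matched value; otherwise use the first collected link.
  def feed_url
    @feed_url || links.first
  end
end

feed = LinkFallback.new(links: ["http://example.com/feed.atom", "http://example.com/"])
puts feed.feed_url
puts feed.url
```

The positional fallback is a heuristic: it assumes the feed lists its self link first and its site link last, which the diff does not verify. The entry-level reader in atom_entry.rb falls back to `links.first` instead of `links.last`.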
data/lib/feedzirra/parser/atom_feed_burner_entry.rb CHANGED
@@ -18,7 +18,7 @@ module Feedzirra
  element :title
  element :name, :as => :author
  element :link, :as => :url, :value => :href, :with => {:type => "text/html", :rel => "alternate"}
- element :"feedburner:origLink", :as => :url
+ #element :"feedburner:origLink", :as => :url
  element :summary
  element :content
  element :published
data/lib/feedzirra/parser/itunes_category.rb ADDED
@@ -0,0 +1,12 @@
+ module Feedzirra
+   module Parser
+     class RSS
+       class ITunesCategory
+         include SAXMachine
+
+         element :'itunes:category', :as => :name, :value => :text
+         elements :'itunes:category', :as => :sub_categories, :value => :text
+       end
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_category.rb ADDED
@@ -0,0 +1,11 @@
+ module Feedzirra
+   module Parser
+     class MRSSCategory
+       include SAXMachine
+
+       element :'media:category', :as => :category
+       element :'media:category', :value => :scheme, :as => :scheme
+       element :'media:category', :value => :label, :as => :label
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_content.rb ADDED
@@ -0,0 +1,48 @@
+ require File.dirname(__FILE__) + '/mrss_credit'
+ require File.dirname(__FILE__) + '/mrss_restriction'
+ require File.dirname(__FILE__) + '/mrss_category'
+ require File.dirname(__FILE__) + '/mrss_copyright'
+ require File.dirname(__FILE__) + '/mrss_hash'
+ require File.dirname(__FILE__) + '/mrss_player'
+ require File.dirname(__FILE__) + '/mrss_rating'
+ require File.dirname(__FILE__) + '/mrss_restriction'
+ require File.dirname(__FILE__) + '/mrss_text'
+ require File.dirname(__FILE__) + '/mrss_thumbnail'
+
+ module Feedzirra
+   module Parser
+     class MRSSContent
+       include SAXMachine
+
+       element :'media:content', :as => :url, :value => :url
+       element :'media:content', :as => :content_type, :value => :type
+       element :'media:content', :as => :medium, :value => :medium
+       element :'media:content', :as => :duration, :value => :duration
+       element :'media:content', :as => :isDefault, :value => :isDefault
+       element :'media:content', :as => :expression, :value => :expression
+       element :'media:content', :as => :bitrate, :value => :bitrate
+       element :'media:content', :as => :framerate, :value => :framerate
+       element :'media:content', :as => :samplingrate, :value => :samplingrate
+       element :'media:content', :as => :channels, :value => :channels
+       element :'media:content', :as => :height, :value => :height
+       element :'media:content', :as => :width, :value => :width
+       element :'media:content', :as => :lang, :value => :lang
+       element :'media:content', :as => :fileSize, :value => :fileSize
+
+       # optional elements
+       element :'media:title', :as => :media_title
+       element :'media:keywords', :as => :media_keywords
+       element :'media:description', :as => :media_description
+
+       element :'media:thumbnail', :as => :media_thumbnail, :class => MRSSThumbnail
+       element :'media:rating', :as => :rating, :class => MRSSRating
+       element :'media:category', :as => :media_category, :class => MRSSCategory
+       element :'media:hash', :as => :media_hash, :class => MRSSHash
+       element :'media:player', :as => :media_player, :class => MRSSPlayer
+       elements :'media:credit', :as => :credits, :class => MRSSCredit
+       element :'media:copyright', :as => :copyright, :class => MRSSCopyright
+       element :'media:restriction', :as => :media_restriction, :class => MRSSRestriction
+       element :'media:text', :as => :text, :class => MRSSText
+     end
+   end
+ end
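Review note: `MRSSContent` maps each XML attribute of `media:content` to a reader with one hand-written `element` line per attribute. As a rough illustration of an alternative, the sketch below keeps the attribute-to-reader pairs in a single table and derives the readers from it, which makes a mismatch between a reader name and its attribute name easier to spot. This is plain Ruby, not SAXMachine; `ContentSketch`, `CONTENT_ATTRIBUTES`, and the sample values are hypothetical, and only a subset of the attributes from the file above is listed.

```ruby
# XML attribute name => reader name. Most are identical; "type" is renamed
# to content_type, as in the file above.
CONTENT_ATTRIBUTES = {
  "url"          => :url,
  "type"         => :content_type,
  "medium"       => :medium,
  "duration"     => :duration,
  "samplingrate" => :samplingrate,
  "channels"     => :channels
}.freeze

class ContentSketch
  # Define one reader per table entry.
  CONTENT_ATTRIBUTES.each_value { |reader| attr_reader reader }

  # +xml_attributes+ is a Hash of attribute name => string value,
  # standing in for what a SAX parser would hand over.
  def initialize(xml_attributes)
    CONTENT_ATTRIBUTES.each do |xml_name, reader|
      instance_variable_set("@#{reader}", xml_attributes[xml_name])
    end
  end
end

content = ContentSketch.new(
  "url" => "http://example.com/clip.mp4",
  "type" => "video/mp4",
  "duration" => "120",
  "channels" => "2"
)
puts content.content_type
puts content.channels
```

With SAXMachine the same table could in principle drive a loop of `element` calls, but that is untested here and left as an assumption.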
data/lib/feedzirra/parser/mrss_copyright.rb ADDED
@@ -0,0 +1,10 @@
+ module Feedzirra
+   module Parser
+     class MRSSCopyright
+       include SAXMachine
+
+       element :'media:copyright', :as => :copyright
+       element :'media:copyright', :as => :url, :value => :url
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_credit.rb ADDED
@@ -0,0 +1,11 @@
+ module Feedzirra
+   module Parser
+     class MRSSCredit
+       include SAXMachine
+
+       element :'media:credit', :as => :role, :value => :role
+       element :'media:credit', :as => :scheme, :value => :scheme
+       element :'media:credit', :as => :name
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_group.rb ADDED
@@ -0,0 +1,37 @@
+ require File.dirname(__FILE__) + '/mrss_content'
+ require File.dirname(__FILE__) + '/mrss_credit'
+ require File.dirname(__FILE__) + '/mrss_restriction'
+ require File.dirname(__FILE__) + '/mrss_group'
+ require File.dirname(__FILE__) + '/mrss_category'
+ require File.dirname(__FILE__) + '/mrss_copyright'
+ require File.dirname(__FILE__) + '/mrss_hash'
+ require File.dirname(__FILE__) + '/mrss_player'
+ require File.dirname(__FILE__) + '/mrss_rating'
+ require File.dirname(__FILE__) + '/mrss_restriction'
+ require File.dirname(__FILE__) + '/mrss_text'
+ require File.dirname(__FILE__) + '/mrss_thumbnail'
+
+ module Feedzirra
+   module Parser
+     class MRSSGroup
+       include SAXMachine
+
+       elements :'media:content', :as => :media_content, :class => MRSSContent
+
+       # optional elements
+       element :'media:title', :as => :media_title
+       element :'media:keywords', :as => :media_keywords
+       element :'media:description', :as => :media_description
+
+       element :'media:thumbnail', :as => :media_thumbnail, :class => MRSSThumbnail
+       element :'media:rating', :as => :rating, :class => MRSSRating
+       element :'media:category', :as => :media_category, :class => MRSSCategory
+       element :'media:hash', :as => :media_hash, :class => MRSSHash
+       element :'media:player', :as => :media_player, :class => MRSSPlayer
+       elements :'media:credit', :as => :credits, :class => MRSSCredit
+       element :'media:copyright', :as => :copyright, :class => MRSSCopyright
+       element :'media:restriction', :as => :media_restriction, :class => MRSSRestriction
+       element :'media:text', :as => :text, :class => MRSSText
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_hash.rb ADDED
@@ -0,0 +1,10 @@
+ module Feedzirra
+   module Parser
+     class MRSSHash
+       include SAXMachine
+
+       element :'media:hash', :as => :hash
+       element :'media:hash', :value => :algo, :as => :algo
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_player.rb ADDED
@@ -0,0 +1,11 @@
+ module Feedzirra
+   module Parser
+     class MRSSPlayer
+       include SAXMachine
+
+       element :'media:player', :value => :url, :as => :url
+       element :'media:player', :value => :width, :as => :width
+       element :'media:player', :value => :height, :as => :height
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_rating.rb ADDED
@@ -0,0 +1,10 @@
+ module Feedzirra
+   module Parser
+     class MRSSRating
+       include SAXMachine
+
+       element :'media:rating', :as => :rating
+       element :'media:rating', :value => :scheme, :as => :scheme
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_restriction.rb ADDED
@@ -0,0 +1,11 @@
+ module Feedzirra
+   module Parser
+     class MRSSRestriction
+       include SAXMachine
+
+       element :'media:restriction', :as => :value
+       element :'media:restriction', :as => :scope, :value => :type
+       element :'media:restriction', :as => :relationship, :value => :relationship
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_text.rb ADDED
@@ -0,0 +1,13 @@
+ module Feedzirra
+   module Parser
+     class MRSSText
+       include SAXMachine
+
+       element :'media:text', :as => :type, :value => :type
+       element :'media:text', :as => :lang, :value => :lang
+       element :'media:text', :as => :start, :value => :start
+       element :'media:text', :as => :end, :value => :end
+       element :'media:text', :as => :text
+     end
+   end
+ end
data/lib/feedzirra/parser/mrss_thumbnail.rb ADDED
@@ -0,0 +1,11 @@
+ module Feedzirra
+   module Parser
+     class MRSSThumbnail
+       include SAXMachine
+
+       element :'media:thumbnail', :as => :url, :value => :url
+       element :'media:thumbnail', :as => :width, :value => :width
+       element :'media:thumbnail', :as => :height, :value => :height
+     end
+   end
+ end