kete-feedzirra 0.0.8.1 → 0.0.16.1
- data/README.textile +11 -2
- data/Rakefile +3 -0
- data/lib/feedzirra.rb +10 -10
- data/lib/feedzirra/feed.rb +45 -42
- data/lib/feedzirra/parser/atom.rb +47 -0
- data/lib/feedzirra/parser/atom_entry.rb +51 -0
- data/lib/feedzirra/parser/atom_feed_burner.rb +27 -0
- data/lib/feedzirra/parser/atom_feed_burner_entry.rb +35 -0
- data/lib/feedzirra/parser/itunes_rss.rb +50 -0
- data/lib/feedzirra/parser/itunes_rss_item.rb +31 -0
- data/lib/feedzirra/parser/itunes_rss_owner.rb +12 -0
- data/lib/feedzirra/parser/rss.rb +40 -0
- data/lib/feedzirra/parser/rss_entry.rb +55 -0
- data/spec/feedzirra/feed_spec.rb +30 -27
- data/spec/feedzirra/feed_utilities_spec.rb +9 -9
- data/spec/feedzirra/{atom_entry_spec.rb → parser/atom_entry_spec.rb} +7 -3
- data/spec/feedzirra/{atom_feed_burner_entry_spec.rb → parser/atom_feed_burner_entry_spec.rb} +4 -4
- data/spec/feedzirra/{atom_feed_burner_spec.rb → parser/atom_feed_burner_spec.rb} +6 -6
- data/spec/feedzirra/{atom_spec.rb → parser/atom_spec.rb} +16 -8
- data/spec/feedzirra/{itunes_rss_item_spec.rb → parser/itunes_rss_item_spec.rb} +3 -3
- data/spec/feedzirra/{itunes_rss_owner_spec.rb → parser/itunes_rss_owner_spec.rb} +3 -3
- data/spec/feedzirra/{itunes_rss_spec.rb → parser/itunes_rss_spec.rb} +5 -5
- data/spec/feedzirra/{rss_entry_spec.rb → parser/rss_entry_spec.rb} +14 -14
- data/spec/feedzirra/{rss_spec.rb → parser/rss_spec.rb} +17 -18
- metadata +22 -21
- data/lib/feedzirra/atom.rb +0 -35
- data/lib/feedzirra/atom_entry.rb +0 -41
- data/lib/feedzirra/atom_feed_burner.rb +0 -22
- data/lib/feedzirra/atom_feed_burner_entry.rb +0 -30
- data/lib/feedzirra/itunes_rss.rb +0 -46
- data/lib/feedzirra/itunes_rss_item.rb +0 -28
- data/lib/feedzirra/itunes_rss_owner.rb +0 -8
- data/lib/feedzirra/rss.rb +0 -36
- data/lib/feedzirra/rss_entry.rb +0 -48
data/README.textile
CHANGED
@@ -23,10 +23,12 @@ The final feature of Feedzirra is the ability to define custom parsing classes.
 h2. Installation
 
 For now Feedzirra exists only on github. It also has a few gem requirements that are only on github. Before you start you need to have "libcurl":http://curl.haxx.se/ and "libxml":http://xmlsoft.org/ installed. If you're on Leopard you have both. Otherwise, you'll need to grab them. Once you've got those libraries, these are the gems that get used: nokogiri, pauldix-sax-machine, taf2-curb (note that this is a fork that lives on github and not the Ruby Forge version of curb), and pauldix-feedzirra. The feedzirra gemspec has all the dependencies so you should be able to get up and running with the standard github gem install routine:
+
 <pre>
 gem sources -a http://gems.github.com # if you haven't already
 gem install pauldix-feedzirra
 </pre>
+
 <b>NOTE:</b>Some people have been reporting a few issues related to installation. First, the Ruby Forge version of curb is not what you want. It will not work. Nor will the curl-multi gem that lives on Ruby Forge. You have to get the "taf2-curb":http://github.com/taf2/curb/tree/master fork installed.
 
 If you see this error when doing a require:
@@ -93,7 +95,7 @@ updated_feed.new_entries # a collection of the entry objects that are newer tha
 
 # fetching multiple feeds
 feed_urls = ["http://feeds.feedburner.com/PaulDixExplainsNothing", "http://feeds.feedburner.com/trottercashion"]
-feeds = Feedzirra::Feed.fetch_and_parse(
+feeds = Feedzirra::Feed.fetch_and_parse(feed_urls)
 
 # feeds is now a hash with the feed_urls as keys and the parsed feed objects as values. If an error was thrown
 # there will be a Fixnum of the http response code instead of a feed object
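The README comment in this hunk says a bulk fetch returns a hash keyed by URL, where each value is either a parsed feed or an HTTP status code. A minimal sketch of splitting such a result, with no network access: `SketchFeed` and the example URLs are hypothetical stand-ins, and the check uses `Integer` on the assumption of a current Ruby, where `Fixnum` no longer exists.

```ruby
# Hypothetical stand-in for a parsed feed object; not part of Feedzirra.
SketchFeed = Struct.new(:title)

# Shape described by the README: URL => feed object, or URL => HTTP status code.
feeds = {
  "http://example.com/a.xml" => SketchFeed.new("Feed A"),
  "http://example.com/b.xml" => 404
}

# Integer covers what older Rubies called Fixnum.
failed, parsed = feeds.partition { |_url, result| result.is_a?(Integer) }.map(&:to_h)

parsed.each { |url, feed| puts "#{url}: #{feed.title}" }
failed.each { |url, code| puts "#{url}: HTTP #{code}" }
```

Checking the value type per URL keeps one failed feed from hiding the feeds that did parse.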
@@ -116,6 +118,12 @@ Feedzirra::Feed.add_common_feed_entry_element("wfw:commentRss", :as => :comment_
 # AtomEntry classes. Now you can access those in an atom feed:
 Feedzirra::Feed.parse(some_atom_xml).entries.first.comment_rss_ # => wfw:commentRss is now parsed!
 
+
+# You can also define your own parsers and add them to the ones Feedzirra knows about. Here's an example that adds
+# ITunesRSS parsing. It's included in the library, but not part of Feedzirra by default because some of the field names
+# differ from other classes, thus breaking normalization.
+Feedzirra::Feed.add_feed_class(ITunesRSS) # now all feeds will be checked to see if they match ITunesRSS before others
+
 # You can also access http basic auth feeds. Unfortunately, you can't get to these inside of a bulk get of a bunch of feeds.
 # You'll have to do it on its own like so:
 Feedzirra::Feed.fetch_and_parse(some_url, :http_authentication => ["myusername", "mypassword"])
@@ -151,7 +159,8 @@ This thing needs to hammer on many different feeds in the wild. I'm sure there w
 Here are some more specific TODOs.
 * Fix the iTunes parser so things are normalized again
 * Fix the Zlib deflate error
-*
+* Fix this error: http://github.com/inbox/70508
+* Convert to use Typhoeus instead of taf2-curb
 * Make the entries parse all link fields
 * Make a feedzirra-rails gem to integrate feedzirra seamlessly with Rails and ActiveRecord.
 * Create a super sweet DSL for defining new parsers.
data/Rakefile
CHANGED
data/lib/feedzirra.rb
CHANGED
@@ -18,17 +18,17 @@ require 'feedzirra/feed_utilities'
 require 'feedzirra/feed_entry_utilities'
 require 'feedzirra/feed'
 
-require 'feedzirra/rss_entry'
-require 'feedzirra/itunes_rss_owner'
-require 'feedzirra/itunes_rss_item'
-require 'feedzirra/atom_entry'
-require 'feedzirra/atom_feed_burner_entry'
+require 'feedzirra/parser/rss_entry'
+require 'feedzirra/parser/itunes_rss_owner'
+require 'feedzirra/parser/itunes_rss_item'
+require 'feedzirra/parser/atom_entry'
+require 'feedzirra/parser/atom_feed_burner_entry'
 
-require 'feedzirra/rss'
-require 'feedzirra/itunes_rss'
-require 'feedzirra/atom'
-require 'feedzirra/atom_feed_burner'
+require 'feedzirra/parser/rss'
+require 'feedzirra/parser/itunes_rss'
+require 'feedzirra/parser/atom'
+require 'feedzirra/parser/atom_feed_burner'
 
 module Feedzirra
-  VERSION = "0.0.
+  VERSION = "0.0.16"
 end
data/lib/feedzirra/feed.rb
CHANGED
@@ -27,7 +27,7 @@ module Feedzirra
     # === Returns
     # The class name of the parser that can handle the XML.
     def self.determine_feed_parser_for_xml(xml)
-      start_of_doc = xml.slice(0,
+      start_of_doc = xml.slice(0, 2000)
       feed_classes.detect {|klass| klass.able_to_parse?(start_of_doc)}
     end
 
@@ -46,7 +46,7 @@ module Feedzirra
     # === Returns
     # A array of class names.
     def self.feed_classes
-      @feed_classes ||= [
+      @feed_classes ||= [Feedzirra::Parser::RSS, Feedzirra::Parser::AtomFeedBurner, Feedzirra::Parser::Atom]
     end
 
     # Makes all entry types look for the passed in element to parse. This is actually just a call to
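The two hunks above show how a parser is chosen: the first 2000 characters of the document are handed to each class in `feed_classes`, and the first one whose `able_to_parse?` returns a truthy value wins. The sketch below reproduces that lookup with stand-in classes. The `Sketch*` names and the sample XML strings are hypothetical; only the two `able_to_parse?` bodies are copied from the Atom and AtomFeedBurner parsers added in this diff.

```ruby
# Stand-in parser classes; only the able_to_parse? bodies come from the diff.
class SketchAtomFeedBurner
  def self.able_to_parse?(xml)
    (xml =~ /Atom/ && xml =~ /feedburner/) || false
  end
end

class SketchAtom
  def self.able_to_parse?(xml)
    xml =~ /(Atom)|(#{Regexp.escape("http://purl.org/atom")})/
  end
end

# Order matters: the more specific FeedBurner check comes before plain Atom.
SKETCH_FEED_CLASSES = [SketchAtomFeedBurner, SketchAtom]

def sketch_parser_for(xml, classes = SKETCH_FEED_CLASSES)
  start_of_doc = xml.slice(0, 2000) # only the head of the document is inspected
  classes.detect { |klass| klass.able_to_parse?(start_of_doc) }
end

plain  = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title></feed>'
burner = '<feed xmlns="http://www.w3.org/2005/Atom" ' \
         'xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0"></feed>'

puts sketch_parser_for(plain)
puts sketch_parser_for(burner)
puts sketch_parser_for(burner, SKETCH_FEED_CLASSES.reverse)
```

Because the plain Atom check also matches FeedBurner documents, reversing the list makes every FeedBurner feed resolve to the plain Atom stand-in, which is presumably why the diff lists `AtomFeedBurner` ahead of `Atom`.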
@@ -58,25 +58,11 @@ module Feedzirra
     def self.add_common_feed_entry_element(element_tag, options = {})
       # need to think of a better way to do this. will break for people who want this behavior
       # across their added classes
-
+      feed_classes.map{|k| eval("#{k}Entry") }.each do |klass|
         klass.send(:element, element_tag, options)
       end
     end
 
-    # Makes all entry types look for the passed in elements to parse. This is actually just a call to
-    # elements (a SAXMachine call) in the class
-    #
-    # === Parameters
-    # [element_tag<String>]
-    # [options<Hash>] Valid keys are same as with SAXMachine
-    def self.add_common_feed_entry_elements(element_tag, options = {})
-      # need to think of a better way to do this. will break for people who want this behavior
-      # across their added classes
-      [RSSEntry, AtomFeedBurnerEntry, AtomEntry].each do |klass|
-        klass.send(:elements, element_tag, options)
-      end
-    end
-
     # Fetches and returns the raw XML for each URL provided.
     #
     # === Parameters
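The new `add_common_feed_entry_element` derives each entry class from its feed class by appending `Entry` to the class name and evaluating the result. The sketch below follows the same naming convention but resolves the name with `Object.const_get`, which is an alternative and not what the diff does. `SketchParser` and its classes are hypothetical, and the `element` method here only records the call in place of SAXMachine's real one.

```ruby
module SketchParser
  class Atom; end

  class AtomEntry
    # Minimal stand-in for SAXMachine's `element` class method: it only records the call.
    def self.element(tag, options = {})
      declared << [tag, options]
    end

    def self.declared
      @declared ||= []
    end
  end
end

# Same "#{k}Entry" convention as the diff, resolved with const_get instead of eval.
def sketch_entry_class_for(feed_class)
  Object.const_get("#{feed_class}Entry")
end

[SketchParser::Atom].map { |k| sketch_entry_class_for(k) }.each do |klass|
  klass.send(:element, "wfw:commentRss", :as => :comment_rss)
end

p SketchParser::AtomEntry.declared
```

Either way, the convention only works when every registered feed class has a sibling class named `<FeedClass>Entry`. `ITunesRSS` pairs with `ITunesRSSItem` in this diff, so registering it would likely break the lookup.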
@@ -100,9 +86,12 @@ module Feedzirra
           curl.headers["User-Agent"] = (options[:user_agent] || USER_AGENT)
           curl.headers["If-Modified-Since"] = options[:if_modified_since].httpdate if options.has_key?(:if_modified_since)
           curl.headers["If-None-Match"] = options[:if_none_match] if options.has_key?(:if_none_match)
-          curl.headers["Accept-encoding"] = 'gzip, deflate'
+          curl.headers["Accept-encoding"] = 'gzip, deflate' if options.has_key?(:compress)
           curl.follow_location = true
           curl.userpwd = options[:http_authentication].join(':') if options.has_key?(:http_authentication)
+
+          curl.max_redirects = options[:max_redirects] if options[:max_redirects]
+          curl.timeout = options[:timeout] if options[:timeout]
 
           curl.on_success do |c|
             responses[url] = decode_content(c)
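This hunk makes the `Accept-encoding` header opt-in through a `:compress` option and adds `:max_redirects` and `:timeout`. The sketch below rebuilds only the header logic as a plain function so the guards can be tried without curb. `sketch_request_headers` and `SKETCH_USER_AGENT` are hypothetical names, and the conditions are copied from the diff.

```ruby
require 'time' # provides Time#httpdate

SKETCH_USER_AGENT = "sketch-agent/0.0" # hypothetical default, not Feedzirra's USER_AGENT

# Builds the request headers with the same guards the diff uses.
def sketch_request_headers(options = {})
  headers = {}
  headers["User-Agent"] = options[:user_agent] || SKETCH_USER_AGENT
  headers["If-Modified-Since"] = options[:if_modified_since].httpdate if options.has_key?(:if_modified_since)
  headers["If-None-Match"] = options[:if_none_match] if options.has_key?(:if_none_match)
  headers["Accept-encoding"] = 'gzip, deflate' if options.has_key?(:compress)
  headers
end

p sketch_request_headers
p sketch_request_headers(:compress => true, :if_none_match => "abc123")
p sketch_request_headers(:compress => false)
```

Since the guard is `has_key?(:compress)` and not a truthiness test, passing `:compress => false` still sends the header. The new `:max_redirects` and `:timeout` lines test the value itself, so a `nil` or `false` value there is ignored.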
@@ -115,7 +104,7 @@ module Feedzirra
       end
 
       multi.perform
-
+      urls.is_a?(String) ? responses.values.first : responses
     end
 
     # Fetches and returns the parsed XML for each URL provided.
@@ -123,11 +112,11 @@ module Feedzirra
     # === Parameters
     # [urls<String> or <Array>] A single feed URL, or an array of feed URLs.
     # [options<Hash>] Valid keys for this argument as as followed:
-    #
-    #
-    #
-    #
-    #
+    # * :user_agent - String that overrides the default user agent.
+    # * :if_modified_since - Time object representing when the feed was last updated.
+    # * :if_none_match - String, an etag for the request that was stored previously.
+    # * :on_success - Block that gets executed after a successful request.
+    # * :on_failure - Block that gets executed after a failed request.
     # === Returns
     # A Feed object if a single URL is passed.
     #
@@ -137,12 +126,12 @@ module Feedzirra
       multi = Curl::Multi.new
       responses = {}
 
-      # I broke these down so I would only try to do 30 simultaneously because
+      # I broke these down so I would only try to do 30 simultaneously because
       # I was getting weird errors when doing a lot. As one finishes it pops another off the queue.
       url_queue.slice!(0, 30).each do |url|
         add_url_to_multi(multi, url, url_queue, responses, options)
       end
-
+
       multi.perform
       return urls.is_a?(String) ? responses.values.first : responses
     end
@@ -194,7 +183,7 @@ module Feedzirra
       end
 
       multi.perform
-
+      responses.size == 1 ? responses.values.first : responses.values
     end
 
     # An abstraction for adding a feed by URL to the passed Curb::multi stack.
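The return lines in these hunks follow two different conventions. The raw fetch and `fetch_and_parse` return a single value only when the caller passed a String. The method patched at new line 186 returns a single value whenever exactly one response came back, and otherwise an Array of values, not a Hash. A small sketch of both rules, using hypothetical helper names and plain hashes in place of real responses:

```ruby
# Rule used where the diff tests `urls.is_a?(String)`.
def sketch_fetch_result(urls, responses)
  urls.is_a?(String) ? responses.values.first : responses
end

# Rule used where the diff tests `responses.size == 1`.
def sketch_update_result(responses)
  responses.size == 1 ? responses.values.first : responses.values
end

one = { "http://example.com/a.xml" => :feed_a }
two = one.merge("http://example.com/b.xml" => :feed_b)

p sketch_fetch_result("http://example.com/a.xml", one)   # single String in, single value out
p sketch_fetch_result(["http://example.com/a.xml"], one) # one-element Array in, Hash out
p sketch_update_result(one)                              # one response, single value out
p sketch_update_result(two)                              # several responses, Array of values out
```

Callers that pass a one-element Array therefore get a Hash from the first rule but a bare object from the second, which is worth keeping in mind when switching between the two methods.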
@@ -216,9 +205,12 @@ module Feedzirra
         curl.headers["User-Agent"] = (options[:user_agent] || USER_AGENT)
         curl.headers["If-Modified-Since"] = options[:if_modified_since].httpdate if options.has_key?(:if_modified_since)
         curl.headers["If-None-Match"] = options[:if_none_match] if options.has_key?(:if_none_match)
-        curl.headers["Accept-encoding"] = 'gzip, deflate'
+        curl.headers["Accept-encoding"] = 'gzip, deflate' if options.has_key?(:compress)
         curl.follow_location = true
         curl.userpwd = options[:http_authentication].join(':') if options.has_key?(:http_authentication)
+
+        curl.max_redirects = options[:max_redirects] if options[:max_redirects]
+        curl.timeout = options[:timeout] if options[:timeout]
 
         curl.on_success do |c|
           add_url_to_multi(multi, url_queue.shift, url_queue, responses, options) unless url_queue.empty?
@@ -226,12 +218,16 @@ module Feedzirra
           klass = determine_feed_parser_for_xml(xml)
 
           if klass
-
-
-
-
-
-
+            begin
+              feed = klass.parse(xml)
+              feed.feed_url = c.last_effective_url
+              feed.etag = etag_from_header(c.header_str)
+              feed.last_modified = last_modified_from_header(c.header_str)
+              responses[url] = feed
+              options[:on_success].call(url, feed) if options.has_key?(:on_success)
+            rescue Exception => e
+              options[:on_failure].call(url, c.response_code, c.header_str, c.body_str) if options.has_key?(:on_failure)
+            end
           else
             # puts "Error determining parser for #{url} - #{c.last_effective_url}"
             # raise NoParserAvailable.new("no valid parser for content.") (this would unfirtunately fail the whole 'multi', so it's not really useable)
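The added `begin`/`rescue` means a parse error inside the success handler no longer escapes from the multi request: it is routed to the `:on_failure` callback with the URL, response code, headers and body. The sketch below imitates that flow without curb. `SketchResponse`, `sketch_handle_success` and the two lambda "parsers" are hypothetical, and it rescues `StandardError` as a narrower choice where the diff rescues `Exception`.

```ruby
# Hypothetical stand-in for the handful of Curl::Easy readers the diff touches.
SketchResponse = Struct.new(:response_code, :header_str, :body_str)

def sketch_handle_success(url, response, parser, responses, options = {})
  begin
    feed = parser.call(response.body_str)
    responses[url] = feed
    options[:on_success].call(url, feed) if options.has_key?(:on_success)
  rescue StandardError # the diff rescues Exception; StandardError is this sketch's choice
    if options.has_key?(:on_failure)
      options[:on_failure].call(url, response.response_code, response.header_str, response.body_str)
    end
  end
end

events = []
options = {
  :on_success => lambda { |url, feed| events << [:ok, url, feed] },
  :on_failure => lambda { |url, code, _headers, _body| events << [:failed, url, code] }
}
responses   = {}
good_parser = lambda { |body| body.upcase }
bad_parser  = lambda { |_body| raise ArgumentError, "unparseable" }

sketch_handle_success("http://example.com/good", SketchResponse.new(200, "", "xml"), good_parser, responses, options)
sketch_handle_success("http://example.com/bad", SketchResponse.new(200, "", "junk"), bad_parser, responses, options)

p events
p responses
```

A side effect of this structure, in the diff as in the sketch, is that an error raised inside the caller's own `:on_success` callback also ends up in `:on_failure`, with a response code that may still be 200.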
@@ -270,15 +266,22 @@ module Feedzirra
         curl.userpwd = options[:http_authentication].join(':') if options.has_key?(:http_authentication)
         curl.follow_location = true
 
+        curl.max_redirects = options[:max_redirects] if options[:max_redirects]
+        curl.timeout = options[:timeout] if options[:timeout]
+
         curl.on_success do |c|
-
-
-
-
-
-
-
-
+          begin
+            add_feed_to_multi(multi, feed_queue.shift, feed_queue, responses, options) unless feed_queue.empty?
+            updated_feed = Feed.parse(c.body_str)
+            updated_feed.feed_url = c.last_effective_url
+            updated_feed.etag = etag_from_header(c.header_str)
+            updated_feed.last_modified = last_modified_from_header(c.header_str)
+            feed.update_from_feed(updated_feed)
+            responses[feed.feed_url] = feed
+            options[:on_success].call(feed) if options.has_key?(:on_success)
+          rescue Exception => e
+            options[:on_failure].call(feed, c.response_code, c.header_str, c.body_str) if options.has_key?(:on_failure)
+          end
         end
 
         curl.on_failure do |c|
data/lib/feedzirra/parser/atom.rb
ADDED
@@ -0,0 +1,47 @@
+module Feedzirra
+
+  module Parser
+    # == Summary
+    # Parser for dealing with Atom feeds.
+    #
+    # == Attributes
+    # * prev_page
+    # * next_page
+    # * lat_page
+    # * title
+    # * subtitle
+    # * updated
+    # * feed_url
+    # * url
+    # * related
+    # * entries
+    class Atom
+      include SAXMachine
+      include FeedUtilities
+      element :"atom:link", :as => :prev_page, :value => :href, :with => {:rel => 'prev'}
+      element :"atom:link", :as => :next_page, :value => :href, :with => {:rel => 'next'}
+      element :"atom:link", :as => :last_page, :value => :href, :with => {:rel => 'last'}
+      element :title
+      element :subtitle
+      element :updated
+      element :link, :as => :url, :value => :href, :with => {:type => "text/html"}
+      element :link, :as => :feed_url, :value => :href, :with => {:type => "application/atom+xml"}
+      elements :link, :as => :related, :value => :href, :with => {:rel => "related"}
+      elements :link, :as => :links, :value => :href
+      elements :entry, :as => :entries, :class => AtomEntry
+
+      def self.able_to_parse?(xml) #:nodoc:
+        xml =~ /(Atom)|(#{Regexp.escape("http://purl.org/atom")})/
+      end
+
+      def url
+        @url || links.last
+      end
+
+      def feed_url
+        @feed_url || links.first
+      end
+    end
+  end
+
+end
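The `url` and `feed_url` readers in the Atom parser fall back to the collected `links` when no link with a matching `type` attribute was found: the last link stands in for the site URL and the first for the feed URL. The sketch below shows only that fallback, using a plain class in place of SAXMachine. `SketchAtomFeed` and the example links are hypothetical.

```ruby
# Plain-Ruby stand-in for the fallback readers; SAXMachine is not involved here.
class SketchAtomFeed
  attr_writer :url, :feed_url
  attr_reader :links

  def initialize(links = [])
    @links = links
  end

  # Prefer an explicitly parsed value, otherwise fall back to the last link.
  def url
    @url || links.last
  end

  # Prefer an explicitly parsed value, otherwise fall back to the first link.
  def feed_url
    @feed_url || links.first
  end
end

feed = SketchAtomFeed.new(["http://example.com/feed.atom", "http://example.com/"])
puts feed.feed_url
puts feed.url

feed.url = "http://example.com/blog"
puts feed.url
```

The fallback is positional, so it returns a sensible answer only when the feed lists its self link first and its HTML link last. With no links at all, both readers return `nil`.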
data/lib/feedzirra/parser/atom_entry.rb
ADDED
@@ -0,0 +1,51 @@
+module Feedzirra
+
+  module Parser
+    # == Summary
+    # Parser for dealing with Atom feed entries.
+    #
+    # == Attributes
+    # * title
+    # * url
+    # * related
+    # * author
+    # * content
+    # * summary
+    # * published
+    # * categories
+    # * media_content
+    # * media_description
+    # * media_thumbnail
+    # * enclosure
+    class AtomEntry
+      include SAXMachine
+      include FeedEntryUtilities
+      element :title
+      element :link, :as => :url, :value => :href, :with => {:rel => "alternate"}
+      elements :link, :as => :related, :value => :href, :with => {:rel => "related"}
+      element :name, :as => :author
+      element :content
+      element :summary
+      element :published
+      element :id
+      element :created, :as => :published
+      element :issued, :as => :published
+      element :updated
+      element :modified, :as => :updated
+      elements :category, :as => :categories, :value => :term
+
+      element :"media:content", :as => :media_content, :value => :url
+      element :"media:description", :as => :media_description
+      element :"media:thumbnail", :as => :media_thumbnail, :value => :url
+      element :enclosure, :value => :url
+
+      elements :link, :as => :links, :value => :href
+
+      def url
+        @url || links.first
+      end
+    end
+
+  end
+
+end
data/lib/feedzirra/parser/atom_feed_burner.rb
ADDED
@@ -0,0 +1,27 @@
+module Feedzirra
+
+  module Parser
+    # == Summary
+    # Parser for dealing with Feedburner Atom feeds.
+    #
+    # == Attributes
+    # * title
+    # * feed_url
+    # * url
+    # * entries
+    class AtomFeedBurner
+      include SAXMachine
+      include FeedUtilities
+      element :title
+      element :link, :as => :url, :value => :href, :with => {:type => "text/html"}
+      element :link, :as => :feed_url, :value => :href, :with => {:type => "application/atom+xml"}
+      elements :entry, :as => :entries, :class => AtomFeedBurnerEntry
+
+      def self.able_to_parse?(xml) #:nodoc:
+        (xml =~ /Atom/ && xml =~ /feedburner/) || false
+      end
+    end
+
+  end
+
+end
data/lib/feedzirra/parser/atom_feed_burner_entry.rb
ADDED
@@ -0,0 +1,35 @@
+module Feedzirra
+
+  module Parser
+    # == Summary
+    # Parser for dealing with Feedburner Atom feed entries.
+    #
+    # == Attributes
+    # * title
+    # * url
+    # * author
+    # * content
+    # * summary
+    # * published
+    # * categories
+    class AtomFeedBurnerEntry
+      include SAXMachine
+      include FeedEntryUtilities
+      element :title
+      element :name, :as => :author
+      element :link, :as => :url, :value => :href, :with => {:rel => "alternate"}
+      element :"feedburner:origLink", :as => :url
+      element :summary
+      element :content
+      element :published
+      element :id
+      element :issued, :as => :published
+      element :created, :as => :published
+      element :updated
+      element :modified, :as => :updated
+      elements :category, :as => :categories, :value => :term
+    end
+
+  end
+
+end
data/lib/feedzirra/parser/itunes_rss.rb
ADDED
@@ -0,0 +1,50 @@
+module Feedzirra
+
+  module Parser
+    # iTunes is RSS 2.0 + some apple extensions
+    # Source: http://www.apple.com/itunes/whatson/podcasts/specs.html
+    class ITunesRSS
+      include SAXMachine
+      include FeedUtilities
+
+      attr_accessor :feed_url
+
+      # RSS 2.0 elements that need including
+      element :copyright
+      element :description
+      element :language
+      element :managingEditor
+      element :title
+      element :link, :as => :url
+
+      # If author is not present use managingEditor on the channel
+      element :"itunes:author", :as => :itunes_author
+      element :"itunes:block", :as => :itunes_block
+      element :"itunes:image", :value => :href, :as => :itunes_image
+      element :"itunes:explicit", :as => :itunes_explicit
+      element :"itunes:keywords", :as => :itunes_keywords
+      # New URL for the podcast feed
+      element :"itunes:new-feed-url", :as => :itunes_new_feed_url
+      element :"itunes:subtitle", :as => :itunes_subtitle
+      # If summary is not present, use the description tag
+      element :"itunes:summary", :as => :itunes_summary
+
+      # iTunes RSS feeds can have multiple main categories...
+      # ...and multiple sub-categories per category
+      # TODO subcategories not supported correctly - they are at the same level
+      # as the main categories
+      elements :"itunes:category", :as => :itunes_categories, :value => :text
+
+      elements :"itunes:owner", :as => :itunes_owners, :class => ITunesRSSOwner
+
+      elements :item, :as => :entries, :class => ITunesRSSItem
+
+      def self.able_to_parse?(xml)
+        xml =~ /xmlns:itunes=\"http:\/\/www.itunes.com\/dtds\/podcast-1.0.dtd\"/
+      end
+
+    end
+
+  end
+
+end