feedme 0.1

data/History.txt ADDED
@@ -0,0 +1,3 @@
1
+ === 0.1 / 2009-09-03
2
+
3
+ * Everything is new. First release.
data/Manifest.txt ADDED
@@ -0,0 +1,7 @@
1
+ History.txt
2
+ Manifest.txt
3
+ README.txt
4
+ Rakefile
5
+ lib/feedme.rb
6
+ examples/rocketboom.rb
7
+ examples/rocketboom.rss
data/README.txt ADDED
@@ -0,0 +1,115 @@
1
+ = feedme
2
+
3
+ * http://feedme.rubyforge.org
4
+
5
+ == DESCRIPTION:
6
+
7
+ A simple, flexible, and extensible RSS and Atom parser for Ruby. It is based on the popular SimpleRSS library and adds features such as attribute access, multi-valued tags, and configurable parsers.
8
+
9
+ == FEATURES/PROBLEMS:
10
+
11
+ * Parse RSS 0.91, 0.92, 1.0, and 2.0
12
+ * Parse Atom
13
+ * Parse all tags by default, or choose the tags you want to parse
14
+ * Access all attributes and content as if they were methods
15
+ * Access all values of tags that can appear multiple times
16
+ * Delicious syntactic sugar that makes it simple to get the data you want
17
+
18
+ == SYNOPSIS:
19
+
20
+ The API is similar to SimpleRSS:
21
+
22
+ require 'rubygems'
23
+ require 'feedme'
24
+ require 'open-uri'
25
+
26
+ rss = FeedMe.parse open('http://slashdot.org/index.rdf')
27
+
28
+ rss.version # => 1.0
29
+ rss.channel.title # => "Slashdot"
30
+ rss.channel.link # => "http://slashdot.org/"
31
+ rss.items.first.link # => "http://books.slashdot.org/article.pl?sid=05/08/29/1319236&from=rss"
32
+
33
+ Because the parser reads Atom feeds as easily as RSS feeds, there are optional aliases that allow more Atom-like access:
34
+
35
+ rss.feed.title # => "Slashdot"
36
+ rss.feed.link # => "http://slashdot.org/"
37
+ rss.entries.first.link # => "http://books.slashdot.org/article.pl?sid=05/08/29/1319236&from=rss"
38
+
39
+ Under the covers, all content is stored in arrays. This means that you can access every value of a tag that appears multiple times (e.g. category):
40
+
41
+ rss.items.first.category_array # => ["News for Nerds", "Technology"]
42
+ rss.items.first.category # => "News for Nerds"
43
+
44
+ You also have access to all the attributes as well as tag values:
45
+
46
+ rss.items.first.guid.isPermaLink # => "true"
47
+ rss.items.first.guid.content # => "http://books.slashdot.org/article.pl?sid=05/08/29/1319236"
48
+
49
+ FeedMe also adds some syntactic sugar that makes it easy to get the information you want:
50
+
51
+ rss.items.first.category? # => true
52
+ rss.items.first.category_count # => 2
53
+ rss.items.first.guid_content # => "http://books.slashdot.org/article.pl?sid=05/08/29/1319236"
54
+
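The accessors above can be pictured with a small stand-alone sketch. TagBag and values_for are hypothetical names, not part of FeedMe, and the real library may implement this differently; the sketch only assumes that every tag maps to an array of values and that the suffixes are resolved dynamically.

```ruby
# Hypothetical sketch: TagBag is NOT a FeedMe class. It only illustrates
# array-backed storage with method-style accessors.
class TagBag
  def initialize(tags)
    # every tag name maps to an array of values, even when there is only one
    @tags = {}
    tags.each { |name, values| @tags[name.to_sym] = Array(values) }
  end

  def method_missing(name, *args)
    key = name.to_s
    case key
    when /\A(.+)_array\z/ then values_for($1)           # all values
    when /\A(.+)_count\z/ then values_for($1).size      # number of values
    when /\A(.+)\?\z/     then !values_for($1).empty?   # presence check
    else
      values = values_for(key)
      values.empty? ? super : values.first              # first value only
    end
  end

  def respond_to_missing?(name, include_private = false)
    base = name.to_s.sub(/(_array|_count|\?)\z/, '')
    @tags.key?(base.to_sym) || super
  end

  private

  def values_for(name)
    @tags.fetch(name.to_sym, [])
  end
end
```

With this sketch, TagBag.new(:category => ["News for Nerds", "Technology"]) answers category, category_array, category_count and category? in the same way as the examples above.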
55
+ There are two different parsers that you can use, depending on your needs. The default parser is "promiscuous," meaning that it parses all tags. There is also a strict parser that only parses tags specified in a list. Here is how you create the different types of parsers:
56
+
57
+ FeedMe.parse(source) # parse using the default (promiscuous) parser
58
+ FeedMe::ParserBuilder.new.parse(source) # equivalent to the previous line
59
+ FeedMe::StrictParserBuilder.new.parse(source) # only parse certain tags
60
+
61
+ The strict parser can be extended by adding new tags to parse:
62
+
63
+ builder = FeedMe::StrictParserBuilder.new
64
+ builder.rss_tags << :some_new_tag
65
+ builder.rss_item_tags << :'item+myrel' # parse an item that has a custom rel type
66
+ builder.item_ext_tags << :'feedburner:origLink' # parse an extension tag - one that has a specific namespace
67
+
68
+ Either parser can be extended by adding aliases to existing tags:
69
+
70
+ builder.aliases[:updated] = :pubDate # now you can always access the updated date using :updated, regardless of whether it's an RSS or Atom feed
71
+
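One way to picture alias resolution is a lookup that falls back to the aliased tag when the requested one is absent. This is only a sketch under that assumption; ALIASES and resolve_tag are hypothetical names, not FeedMe's API.

```ruby
# Hypothetical sketch of alias lookup; not taken from FeedMe's source.
ALIASES = { :updated => :pubDate }

# Returns the tag to read: the requested tag when the feed has it,
# otherwise its alias target, otherwise nil.
def resolve_tag(name, available_tags)
  return name if available_tags.include?(name)
  target = ALIASES[name]
  available_tags.include?(target) ? target : nil
end
```

Under this sketch an Atom entry that carries :updated is read directly, while an RSS item that only carries :pubDate is reached through the alias.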
72
+ Another bit of syntactic sugar is the "bang mod." These are modifications that can be applied to feed content by adding '!' to the tag name. The default bang mod is to strip HTML tags from the content.
73
+
74
+ rss.entry.content # => <div>Some great stuff</div>
75
+ rss.entry.content! # => Some great stuff
76
+
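The effect of the default bang mod can be approximated with a single regular expression. The helper below is a hypothetical stand-in, not FeedMe's implementation, and it is deliberately naive: it does not handle comments, CDATA, or a '>' inside an attribute value.

```ruby
# Hypothetical helper: removes anything that looks like an HTML tag.
def strip_html(text)
  text.gsub(/<\/?[^>]+>/, '')
end

strip_html("<div>Some great stuff</div>") # => "Some great stuff"
```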
77
+ You can create your own bang mods. The following example defines a bang mod that takes an argument. The bang_mod_fns line registers the function, and the bang_mods line tells the builder to apply it when the '!' suffix is used. Bang mod names may contain only alphanumeric characters; argument values are appended to the name, separated by underscores (here, :wrap_80 wraps at 80 columns).
78
+
79
+ # wrap content at a specified number of columns
80
+ builder.bang_mod_fns[:wrap] = proc {|str, col| str.gsub(/(.{1,#{col}})( +|$\n?)|(.{1,#{col}})/, "\\1\\3\n").strip }
81
+ builder.bang_mods << :wrap_80
82
+
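The wrap function can be tried on its own, without a feed. In the sketch below WRAP is the proc from the example above, while apply_bang_mod is a hypothetical helper showing one way a name such as :wrap_80 could be split into a function name and its arguments; converting the arguments to integers is an assumption made for this example.

```ruby
# The wrap proc from the example above, unchanged.
WRAP = proc { |str, col| str.gsub(/(.{1,#{col}})( +|$\n?)|(.{1,#{col}})/, "\\1\\3\n").strip }

# Hypothetical helper: splits :wrap_80 into the name "wrap" and the
# argument "80", then calls the registered function with the text.
def apply_bang_mod(spec, fns, text)
  name, *args = spec.to_s.split('_')
  fns.fetch(name.to_sym).call(text, *args.map { |arg| Integer(arg) })
end

apply_bang_mod(:wrap_7, { :wrap => WRAP }, "aaa bbb ccc") # => "aaa bbb\nccc"
```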
83
+ In order to prevent clashes between tag/attribute names and the parser class' instance variables, all instance variables are prefixed with 'fm_'. They are:
84
+
85
+ fm_source # the original, unparsed source
86
+ fm_options # the options passed to the parser constructor
87
+ fm_type # the feed type
88
+ fm_tags # the tags the parser looks for in the source
89
+ fm_parsed # the list of tags the parser actually found
90
+ fm_unparsed # the list of tags that appeared in the feed but were not parsed (useful for debugging)
91
+
92
+ Additionally, there are several variables that are available at every level of the parse tree:
93
+
94
+ fm_builder # the ParserBuilder that created the parser
95
+ fm_parent # the container of the current level of the parse tree
96
+ fm_tag_name # the name of the rss/atom tag whose content is contained in this level of the tree
97
+
98
+ === A word on RSS/Atom Versions
99
+
100
+ RSS has undergone much revision since Netscape released version 0.90 in 1999. The currently accepted specification is maintained by the RSS Advisory Board (http://www.rssboard.org/rss-specification). The current version (as of this writing) is 2.0.11, although the actual schema has not changed since 2.0.1.
101
+
102
+ Atom is an IETF standard (RFC 4287; XML namespace http://www.w3.org/2005/Atom), and so far there is a single version of the specification (commonly referred to as 1.0).
103
+
104
+ FeedMe does its best to support RSS and Atom versions currently in use. It specifically does *NOT* support any Netscape version of RSS.
105
+
106
+ Due to various incompatibilities between RSS versions, it is strongly recommended that you examine the version attribute of the feed (rss.version, as shown in the Synopsis section). Mark Pilgrim has an excellent article on RSS version incompatibility: http://diveintomark.org/archives/2004/02/04/incompatible-rss.
107
+
108
+ == INSTALL:
109
+
110
+ * gem install feedme
111
+ * http://rubyforge.org/projects/feedme
112
+
113
+ == LICENSE:
114
+
115
+ This work is licensed under the Creative Commons Attribution 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
data/Rakefile ADDED
@@ -0,0 +1,7 @@
1
+ require 'rubygems'
2
+ require 'hoe'
3
+
4
+ Hoe.spec 'feedme' do |hoe|
5
+ hoe.developer('John Didion', 'jdidion@rubyforge.org')
6
+ hoe.rubyforge_name = 'feedme'
7
+ end
data/examples/rocketboom.rb ADDED
@@ -0,0 +1,50 @@
1
+ #require 'feedme'
2
+ require '../lib/feedme'
3
+ require 'net/http'
4
+
5
+ def fetch(url)
6
+ response = Net::HTTP.get_response(URI.parse(url))
7
+ case response
8
+ when Net::HTTPSuccess
9
+ response.body
10
+ else
11
+ response.error!
12
+ end
13
+ end
14
+
15
+ # read from a file
16
+ content = ""
17
+ File.open('rocketboom.rss', "r") do |file|
18
+ content = file.read
19
+ end
20
+
21
+ # read from a url
22
+ #content = fetch('http://www.rocketboom.com/rss/hd.xml')
23
+
24
+ # create a new ParserBuilder
25
+ builder = FeedMe::ParserBuilder.new
26
+ # add a bang mod to wrap content to 80 columns
27
+ builder.bang_mods << :wrap_80
28
+
29
+ # parse the rss feed
30
+ rss = builder.parse(content)
31
+
32
+ # equivalent to rss.channel.title
33
+ puts "#{rss.type} Feed: #{rss.title}"
34
+
35
+ # use a virtual method; this one is a shortcut to rss.items.size
36
+ puts "#{rss.item_count} items"
37
+ rss.items.each do |item|
38
+ puts
39
+ # we can easily access the content of a mixed element
40
+ puts "ID: #{item.guid_value} (#{item.guid.isPermaLink})"
41
+ puts "Date: #{item.pubDate}"
42
+ puts "Title: #{item.title}"
43
+ # we can access all categories
44
+ puts "Categories: #{item.category_array.join(', ')}" if item.category_array?
45
+ # ! causes value to be modified according to prior specifications
46
+ # ? checks for the presence of a tag/attribute
47
+ puts "Description:\n#{item.description!}" if item.description?
48
+ # we can access attribute values just as easily as tag content
49
+ puts "Enclosure: #{item.enclosure.url}" if item.enclosure?
50
+ end
data/examples/rocketboom.rss ADDED
@@ -0,0 +1,348 @@
1
+ <?xml version="1.0" encoding="UTF-8"?>
2
+ <rss version="2.0"
3
+ xmlns:content="http://purl.org/rss/1.0/modules/content/"
4
+ xmlns:wfw="http://wellformedweb.org/CommentAPI/"
5
+ xmlns:dc="http://purl.org/dc/elements/1.1/"
6
+ xmlns:atom="http://www.w3.org/2005/Atom"
7
+ xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
8
+
9
+ <channel>
10
+ <title>Rocketboom (HD)</title>
11
+ <atom:link href="http://www.rocketboom.com/index.php?feed=podcast&#038;format=hd" rel="self" type="application/rss+xml" />
12
+ <link>http://www.rocketboom.com</link>
13
+ <description>Daily Internet culture with Joanne Colan</description>
14
+ <pubDate>Wed, 26 Aug 2009 20:19:56 +0000</pubDate>
15
+ <generator>http://wordpress.org/?v=2.6.1-alpha</generator>
16
+ <language>en</language>
17
+ <itunes:summary>Rocketboom is a three minute daily videoblog hosted by Joanne Colan and based in New York City. We cover and create a wide range of information and commentary from top news stories to quirky internet culture.</itunes:summary>
18
+ <itunes:subtitle>Daily Internet culture with Joanne Colan</itunes:subtitle>
19
+ <itunes:author>Rocketboom</itunes:author>
20
+ <itunes:image href="http://www.rocketboom.com/images/rocketboom_itunes_20070801.jpg" />
21
+ <itunes:category text="News &amp; Politics" />
22
+ <itunes:category text="Technology" />
23
+ <itunes:category text="Comedy" />
24
+ <itunes:keywords>rocketboom, daily, news, internet, culture, art, science, technology, tech, space, comedy, video, podcast</itunes:keywords>
25
+ <itunes:explicit>no</itunes:explicit>
26
+ <itunes:owner>
27
+ <itunes:name>Rocketboom</itunes:name>
28
+ <itunes:email>hi@rocketboom.com</itunes:email>
29
+ </itunes:owner>
30
+ <item>
31
+ <title>Molly Status</title>
32
+ <link>http://www.rocketboom.com/molly-status/</link>
33
+ <comments>http://www.rocketboom.com/molly-status/#comments</comments>
34
+ <pubDate>Tue, 25 Aug 2009 02:42:04 +0000</pubDate>
35
+ <dc:creator>leah</dc:creator>
36
+
37
+ <category><![CDATA[Daily News]]></category>
38
+
39
+ <category><![CDATA[daily news]]></category>
40
+
41
+ <category><![CDATA[mememolly]]></category>
42
+
43
+ <category><![CDATA[molly]]></category>
44
+
45
+ <category><![CDATA[new york city]]></category>
46
+
47
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3635</guid>
48
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/V9JZ56JiCRk&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/V9JZ56JiCRk&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Molly has just been officially approved to work in the U.S. and will be back in New York City soon to begin anchoring Rocketboom daily! Welcome aboard Molly!
49
+ ]]></description>
50
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/V9JZ56JiCRk&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/V9JZ56JiCRk&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p><span>Molly has just been officially approved to work in the U.S. and will be back in New York City soon to begin anchoring Rocketboom daily! Welcome aboard Molly! </span></p>
51
+ ]]></content:encoded>
52
+ <wfw:commentRss>http://www.rocketboom.com/molly-status/feed/</wfw:commentRss>
53
+ <enclosure url="http://www.rocketboom.com/video/molly-status_hd.m4v" length="8274420" type="video/mp4" />
54
+ <itunes:summary>Molly has just been officially approved to work in the U.S. and will be back in New York City soon to begin anchoring Rocketboom daily! Welcome aboard Molly!
55
+
56
+ http://www.rocketboom.com/molly-status/</itunes:summary>
57
+ <itunes:subtitle>Molly has just been officially approved to work in the U.S. and will be back in New York City soon to begin anchoring Rocketboom daily! Welcome aboard Molly!
58
+ </itunes:subtitle>
59
+ </item>
60
+ <item>
61
+ <title>I Can Has Cheezburger the MusicLOL</title>
62
+ <link>http://www.rocketboom.com/i-can-has-cheezburger-the-musiclol/</link>
63
+ <comments>http://www.rocketboom.com/i-can-has-cheezburger-the-musiclol/#comments</comments>
64
+ <pubDate>Sat, 22 Aug 2009 06:20:13 +0000</pubDate>
65
+ <dc:creator>leah</dc:creator>
66
+
67
+ <category><![CDATA[Humanwire]]></category>
68
+
69
+ <category><![CDATA[ella morton]]></category>
70
+
71
+ <category><![CDATA[i can has cheezburger]]></category>
72
+
73
+ <category><![CDATA[lolcats]]></category>
74
+
75
+ <category><![CDATA[Rocketboom NYC Correspondent]]></category>
76
+
77
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3625</guid>
78
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/oQAnPFwOF2g&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/oQAnPFwOF2g&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Rocketboom NYC Correspondent Ella Morton interviews the creators of I Can Has Cheezburger The MusicLOL.
79
+ ]]></description>
80
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/oQAnPFwOF2g&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/oQAnPFwOF2g&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p><strong>Rocketboom NYC</strong> Correspondent <a href="http://sprinkleofginger.com/">Ella Morton</a> interviews the creators of <a href="http://icanhascheezburgerthemusiclol.wordpress.com/">I Can Has Cheezburger The MusicLOL</a>.</p>
81
+ ]]></content:encoded>
82
+ <wfw:commentRss>http://www.rocketboom.com/i-can-has-cheezburger-the-musiclol/feed/</wfw:commentRss>
83
+ <enclosure url="http://www.rocketboom.com/video/i-can-has-cheezburger-the-musiclol_hd.m4v" length="101725820" type="video/mp4" />
84
+ <itunes:summary>Rocketboom NYC Correspondent Ella Morton interviews the creators of I Can Has Cheezburger The MusicLOL.
85
+
86
+ http://www.rocketboom.com/i-can-has-cheezburger-the-musiclol/</itunes:summary>
87
+ <itunes:subtitle>Rocketboom NYC Correspondent Ella Morton interviews the creators of I Can Has Cheezburger The MusicLOL.
88
+ </itunes:subtitle>
89
+ </item>
90
+ <item>
91
+ <title>Know Your Meme: Bubb Rubb</title>
92
+ <link>http://www.rocketboom.com/know-your-meme-bubb-rubb/</link>
93
+ <comments>http://www.rocketboom.com/know-your-meme-bubb-rubb/#comments</comments>
94
+ <pubDate>Tue, 18 Aug 2009 21:07:07 +0000</pubDate>
95
+ <dc:creator>leah</dc:creator>
96
+
97
+ <category><![CDATA[Know Your Meme]]></category>
98
+
99
+ <category><![CDATA[Bubb Rubb]]></category>
100
+
101
+ <category><![CDATA[Rocketboom Institute for Internet Studies]]></category>
102
+
103
+ <category><![CDATA[whistle tips]]></category>
104
+
105
+ <category><![CDATA[woo woo]]></category>
106
+
107
+ <category><![CDATA[yatta]]></category>
108
+
109
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3593</guid>
110
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/FO0QVnUPEew&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/FO0QVnUPEew&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>The Rocketboom Institute for Internet Studies investigates why the whistles go whoo whoo with Bubb Rubb. Links to all your Bubb Rubb resources matt-d, Ghost Ride It, Gas Break Dip, Kron.com, archive.org, socalevo.net, whistle tips, Bubb Rubb everywhere, Bubb Rubb sound board, whistle tips banned, Blur feat. Bubb Rubb, woo WOO, the whistles go whoo, [...]]]></description>
111
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/FO0QVnUPEew&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/FO0QVnUPEew&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p>The Rocketboom Institute for Internet Studies investigates why the whistles go whoo whoo with Bubb Rubb. Links to all your Bubb Rubb resources <a href="http://www.matt-d.com/ghetto/carwhistle.wmv">matt-d</a>, <a href="http://www.youtube.com/watch?v=YPNJjL9iznY">Ghost Ride It</a>, <a href="http://www.youtube.com/watch?v=z-N4De4C8Kg">Gas Break Dip</a>, <a href="http://web.archive.org/web/20050310043139/http://www.kron.com/Global/story.asp?S=1142838">Kron.com</a>, <a href="http://web.archive.org/web/20030210184423/http://homepage.mac.com/howheels/rubb.html">archive.org</a>, <a href="http://www.socalevo.net/forum/index.php?PHPSESSID=&lt;br &gt;&lt;/a&gt; 600ea755919d1c199ab08ee2f2639ae6&amp;topic=4406.0,">socalevo.net</a>, <a href="http://www.dtmpower.net/forum/archived-threads/83279-whistle-tips-lol.html http://acurazine.com/forums/showthread.php?t=92609">whistle tips</a>, <a href="http://allfordmustangs.com/forums/mustang-lounge/10438-bubb-rubb-everywhere-warning-lots-pics.html">Bubb Rubb everywhere</a>, <a href="http://www.civicforums.com/forums/20-off-topic/119272-official-bubb-rubb-sound-board-woo-woooo-lol-4.html">Bubb Rubb sound board</a>, <a href="http://www.yotatech.com/f5/news-flash-whistle-tips-banned-6800/">whistle tips banned</a>, <a href="http://www.youtube.com/watch?v=ld1iSkbOCx4">Blur feat. 
Bubb Rubb</a>, <a href="http://www.metafilter.com/23229/woo-WOO">woo WOO</a>, <a href="http://uncledirtae.com/blog/archives/2003/02/02/the_whistles_go_whoo.php">the whistles go whoo</a>, <a href="http://www.millerarts.com/interactive/BubbRubb/">Bubb Rubb art</a>, <a href="http://gothamist.com/2003/10/02/gimme_a_woowooo.php">gimme a woowoo</a>, <a href="http://www.buzolich.com/indecorum/2003/bubb-rubb/">Bubb Rubbon buzolich</a>, <a href="http://groups.yahoo.com/group/BubbRubb/">Bubb Rubb group</a>, <a href="http://knowyourmeme.com/memes/bubb-rubb">Bubb Rubb on Know Your Meme</a>, <a href="http://www.millerarts.com/interactive/BubbRubb/BubbRubbSoundBoard.html">Bubb Rubb Sound Board</a>, <a href="http://wiki.ytmnd.com/Bubb_Rubb">Bubb Rubb YTMND</a>, <a href="http://web.archive.org/web/20030601183448/bubbrubb.com">BubbRubb.com</a>, <a href="http://web.archive.org/web/20030601183448/http://bubbrubb.com/trailer1.wmv">Bubb Rubb trailer</a>, <a href="http://avanttrash.com/images/purple-drank.jpg">Purple Drank</a>, <a href="http://www.jamglue.com/tracks/2332723-Three-Six-Mafia-feat-UGK-Sippin-On-Some-Syrup-Acapella-Accapella">Sippin on some syrup</a>, <a href="http://www.collegehumor.com/video:1820574">Shatner vs. Bubb Rubb</a>, <a href="http://www.youtube.com/watch?v=utHHHu1L_Mg">Bubb Rubb vs. the leprechaun</a>, <a href="http://www.youtube.com/watch?v=J6lo9Km5HaU">Saturday Remix</a>, <a href="http://www.youtube.com/watch?v=uRAkPPzikO0">300 Bubb Rubb Style</a>, <a href="http://www.youtube.com/watch?v=Nnzw_i4YmKk">the whistles go woo woo</a></p>
112
+ ]]></content:encoded>
113
+ <wfw:commentRss>http://www.rocketboom.com/know-your-meme-bubb-rubb/feed/</wfw:commentRss>
114
+ <enclosure url="http://www.rocketboom.com/video/know-your-meme-bubb-rubb_hd.m4v" length="125899499" type="video/mp4" />
115
+ <itunes:summary>The Rocketboom Institute for Internet Studies investigates why the whistles go whoo whoo with Bubb Rubb. Links to all your Bubb Rubb resources matt-d, Ghost Ride It, Gas Break Dip, Kron.com, archive.org, socalevo.net, whistle tips, Bubb Rubb everywhere, Bubb Rubb sound board, whistle tips banned, Blur feat. Bubb Rubb, woo WOO, the whistles go whoo, Bubb Rubb art, gimme a woowoo, Bubb Rubbon buzolich, Bubb Rubb group, Bubb Rubb on Know Your Meme, Bubb Rubb Sound Board, Bubb Rubb YTMND, BubbRubb.com, Bubb Rubb trailer, Purple Drank, Sippin on some syrup, Shatner vs. Bubb Rubb, Bubb Rubb vs. the leprechaun, Saturday Remix, 300 Bubb Rubb Style, the whistles go woo woo
116
+
117
+ http://www.rocketboom.com/know-your-meme-bubb-rubb/</itunes:summary>
118
+ <itunes:subtitle>The Rocketboom Institute for Internet Studies investigates why the whistles go whoo whoo with Bubb Rubb. Links to all your Bubb Rubb resources matt-d, Ghost Ride It, Gas Break Dip, Kron.com, archive.org, socalevo.net, whistle tips, Bubb Rubb [...]</itunes:subtitle>
119
+ </item>
120
+ <item>
121
+ <title>Car Free Summer Streets</title>
122
+ <link>http://www.rocketboom.com/summer-streets/</link>
123
+ <comments>http://www.rocketboom.com/summer-streets/#comments</comments>
124
+ <pubDate>Mon, 17 Aug 2009 06:24:16 +0000</pubDate>
125
+ <dc:creator>leah</dc:creator>
126
+
127
+ <category><![CDATA[Humanwire]]></category>
128
+
129
+ <category><![CDATA[department of transportation]]></category>
130
+
131
+ <category><![CDATA[ella morton]]></category>
132
+
133
+ <category><![CDATA[new york city]]></category>
134
+
135
+ <category><![CDATA[summer streets]]></category>
136
+
137
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3584</guid>
138
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/YLF429LRLFM&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/YLF429LRLFM&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Rocketboom NYC correspondent Ella Morton explores New York City&#8217;s Summer Streets program.
139
+ ]]></description>
140
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/YLF429LRLFM&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/YLF429LRLFM&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p><span class="description"><strong>Rocketboom NYC</strong> correspondent <a href="http://sprinkleofginger.com/" target="_blank">Ella Morton</a> explores New York City&#8217;s <a href="http://www.nyc.gov/html/dot/summerstreets/html/home/home.shtml" target="_blank">Summer Streets</a> program.</span></p>
141
+ ]]></content:encoded>
142
+ <wfw:commentRss>http://www.rocketboom.com/summer-streets/feed/</wfw:commentRss>
143
+ <enclosure url="http://www.rocketboom.com/video/summer-streets_hd.m4v" length="71443397" type="video/mp4" />
144
+ <itunes:summary>Rocketboom NYC correspondent Ella Morton explores New York City&#8217;s Summer Streets program.
145
+
146
+ http://www.rocketboom.com/summer-streets/</itunes:summary>
147
+ <itunes:subtitle>Rocketboom NYC correspondent Ella Morton explores New York City&#8217;s Summer Streets program.
148
+ </itunes:subtitle>
149
+ </item>
150
+ <item>
151
+ <title>Augmented Reality Subway App</title>
152
+ <link>http://www.rocketboom.com/augmented-reality-subway-app/</link>
153
+ <comments>http://www.rocketboom.com/augmented-reality-subway-app/#comments</comments>
154
+ <pubDate>Fri, 14 Aug 2009 01:33:17 +0000</pubDate>
155
+ <dc:creator>leah</dc:creator>
156
+
157
+ <category><![CDATA[Technology]]></category>
158
+
159
+ <category><![CDATA[Across Air]]></category>
160
+
161
+ <category><![CDATA[application]]></category>
162
+
163
+ <category><![CDATA[Augmented Reality]]></category>
164
+
165
+ <category><![CDATA[ellie rountree]]></category>
166
+
167
+ <category><![CDATA[Nearest Subway]]></category>
168
+
169
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3579</guid>
170
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/j67xUCULGdE&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/j67xUCULGdE&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Rocketboom Tech correspondent Ellie Rountree tests out a new application from the UK-based company, Across Air, called Nearest Subway and interviews Managing Director Chetan Damani.
171
+
172
+ This episode was created in collaboration with Intel!
173
+ ]]></description>
174
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/j67xUCULGdE&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/j67xUCULGdE&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p><strong>Rocketboom Tech</strong> correspondent <a href="http://twitter.com/elspethjane" target="_blank">Ellie Rountree</a> tests out a new application from the UK-based company, <a href="http://www.acrossair.com/" target="_blank">Across Air</a>, called Nearest Subway and interviews Managing Director Chetan Damani.<br />
175
+ <a href="http://www.intel.com"><img src="http://www.rocketboom.net/images/logo_intel.gif" border="0" alt="" width="45" /></a><br />
176
+ This episode was created in collaboration with <a href="http://www.intel.com">Intel</a>!</p>
177
+ ]]></content:encoded>
178
+ <wfw:commentRss>http://www.rocketboom.com/augmented-reality-subway-app/feed/</wfw:commentRss>
179
+ <enclosure url="http://www.rocketboom.com/video/augmented-reality-subway-app_hd.m4v" length="81701769" type="video/mp4" />
180
+ <itunes:summary>Rocketboom Tech correspondent Ellie Rountree tests out a new application from the UK-based company, Across Air, called Nearest Subway and interviews Managing Director Chetan Damani.
181
+
182
+ This episode was created in collaboration with Intel!
183
+
184
+ http://www.rocketboom.com/augmented-reality-subway-app/</itunes:summary>
185
+ <itunes:subtitle>Rocketboom Tech correspondent Ellie Rountree tests out a new application from the UK-based company, Across Air, called Nearest Subway and interviews Managing Director Chetan Damani.
186
+
187
+ This episode was created in collaboration with Intel!
188
+ </itunes:subtitle>
189
+ </item>
190
+ <item>
191
+ <title>Senegal: Recycled Art</title>
192
+ <link>http://www.rocketboom.com/senegal-recycled-art/</link>
193
+ <comments>http://www.rocketboom.com/senegal-recycled-art/#comments</comments>
194
+ <pubDate>Wed, 12 Aug 2009 20:55:24 +0000</pubDate>
195
+ <dc:creator>leah</dc:creator>
196
+
197
+ <category><![CDATA[Humanwire]]></category>
198
+
199
+ <category><![CDATA[artist]]></category>
200
+
201
+ <category><![CDATA[Cheikh Seck]]></category>
202
+
203
+ <category><![CDATA[Mamadou Tall Diedhiou]]></category>
204
+
205
+ <category><![CDATA[Senegal]]></category>
206
+
207
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3527</guid>
208
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/737EfIydJ5Y&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/737EfIydJ5Y&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Humanwire field correspondent Cheikh Seck in Senegal speaks with world renowned artist, Mamadou Tall Diedhiou about his recycled bird statues.
209
+ ]]></description>
210
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/737EfIydJ5Y&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/737EfIydJ5Y&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p><strong>Humanwire</strong> field correspondent <a href="http://www.myhero.com/myhero/hero.asp?hero=Cheikh_Seck_mlkdakar_SN_07" target="_blank">Cheikh Seck</a> in <a href="http://en.wikipedia.org/wiki/Senegal" target="_blank">Senegal</a> speaks with world renowned artist, Mamadou Tall Diedhiou about his recycled bird statues.</p>
211
+ ]]></content:encoded>
212
+ <wfw:commentRss>http://www.rocketboom.com/senegal-recycled-art/feed/</wfw:commentRss>
213
+ <enclosure url="http://www.rocketboom.com/video/senegal-recycled-art_hd.m4v" length="104079693" type="video/mp4" />
214
+ <itunes:summary>Humanwire field correspondent Cheikh Seck in Senegal speaks with world renowned artist, Mamadou Tall Diedhiou about his recycled bird statues.
215
+
216
+ http://www.rocketboom.com/senegal-recycled-art/</itunes:summary>
217
+ <itunes:subtitle>Humanwire field correspondent Cheikh Seck in Senegal speaks with world renowned artist, Mamadou Tall Diedhiou about his recycled bird statues.
218
+ </itunes:subtitle>
219
+ </item>
220
+ <item>
221
+ <title>Old New York, New New York</title>
222
+ <link>http://www.rocketboom.com/old-new-york-new-new-york/</link>
223
+ <comments>http://www.rocketboom.com/old-new-york-new-new-york/#comments</comments>
224
+ <pubDate>Wed, 12 Aug 2009 03:00:30 +0000</pubDate>
225
+ <dc:creator>leah</dc:creator>
226
+
227
+ <category><![CDATA[Humanwire]]></category>
228
+
229
+ <category><![CDATA[battery park city]]></category>
230
+
231
+ <category><![CDATA[collect pond]]></category>
232
+
233
+ <category><![CDATA[ella morton]]></category>
234
+
235
+ <category><![CDATA[foley square]]></category>
236
+
237
+ <category><![CDATA[geography]]></category>
238
+
239
+ <category><![CDATA[history]]></category>
240
+
241
+ <category><![CDATA[Michael Miscione]]></category>
242
+
243
+ <category><![CDATA[new york]]></category>
244
+
245
+ <category><![CDATA[new york city]]></category>
246
+
247
+ <category><![CDATA[no. 7 line]]></category>
248
+
249
+ <category><![CDATA[queens]]></category>
250
+
251
+ <category><![CDATA[UN]]></category>
252
+
253
+ <category><![CDATA[united nations]]></category>
254
+
255
+ <category><![CDATA[united states]]></category>
256
+
257
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3523</guid>
258
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/6OVgXNm0Y2s&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/6OVgXNm0Y2s&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Rocketboom NYC Correspondent Ella Morton meets up with Manhattan Borough Historian, Michael Miscione, to talk about the changing landscapes of New York City.
259
+ ]]></description>
260
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/6OVgXNm0Y2s&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/6OVgXNm0Y2s&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p><strong>Rocketboom NYC</strong> Correspondent <a href="http://www.sprinkleofginger.com/">Ella Morton</a><span> meets up with Manhattan Borough Historian, Michael Miscione, to talk about the <a href="http://www.nypl.org/research/chss/lhg/nyc2.cfm" target="_blank">changing landscapes of New York City</a>.</span></p>
261
+ ]]></content:encoded>
262
+ <wfw:commentRss>http://www.rocketboom.com/old-new-york-new-new-york/feed/</wfw:commentRss>
263
+ <enclosure url="http://www.rocketboom.com/video/old-new-york-new-new-york_hd.m4v" length="119051114" type="video/mp4" />
264
+ <itunes:summary>Rocketboom NYC Correspondent Ella Morton meets up with Manhattan Borough Historian, Michael Miscione, to talk about the changing landscapes of New York City.
265
+
266
+ http://www.rocketboom.com/old-new-york-new-new-york/</itunes:summary>
267
+ <itunes:subtitle>Rocketboom NYC Correspondent Ella Morton meets up with Manhattan Borough Historian, Michael Miscione, to talk about the changing landscapes of New York City.
268
+ </itunes:subtitle>
269
+ </item>
270
+ <item>
271
+ <title>A Renegade Cabaret on the Fire Escape</title>
272
+ <link>http://www.rocketboom.com/the-renegade-cabaret/</link>
273
+ <comments>http://www.rocketboom.com/the-renegade-cabaret/#comments</comments>
274
+ <pubDate>Mon, 10 Aug 2009 20:45:06 +0000</pubDate>
275
+ <dc:creator>yatta</dc:creator>
276
+
277
+ <category><![CDATA[Humanwire]]></category>
278
+
279
+ <category><![CDATA[Elizabeth Soychak]]></category>
280
+
281
+ <category><![CDATA[High Line Park]]></category>
282
+
283
+ <category><![CDATA[new york city]]></category>
284
+
285
+ <category><![CDATA[Patty Heffley]]></category>
286
+
287
+ <category><![CDATA[Renegade Cabaret]]></category>
288
+
289
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3517</guid>
290
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/iLo3P_zMVXs&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/iLo3P_zMVXs&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Rocketboom NYC Correspondent Ella Morton speaks with founder Patty Heffley and singer Elizabeth Soychak of the Renegade Cabaret on the High Line, a non-sanctioned musical performance that takes place nightly next to the High Line Park in the Chelsea neighborhood of New York City.
291
+ ]]></description>
292
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/iLo3P_zMVXs&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/iLo3P_zMVXs&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p><strong>Rocketboom NYC</strong> Correspondent <a href="http://www.sprinkleofginger.com/">Ella Morton</a> speaks with founder <a href="http://www.pattyheffley.com/">Patty Heffley</a> and singer <a href="http://www.elizabethsoychak.com/">Elizabeth Soychak</a> of the <a href="http://www.renegadecabaret.com">Renegade Cabaret</a> on the <a href="http://www.thehighline.org/">High Line</a>, a <a href="http://www.nytimes.com/2009/06/25/garden/25seen.html?_r=1">non-sanctioned musical performance</a> that takes place nightly next to the High Line Park in the <a href="http://en.wikipedia.org/wiki/Chelsea,_Manhattan">Chelsea</a> neighborhood of New York City.</p>
293
+ ]]></content:encoded>
294
+ <wfw:commentRss>http://www.rocketboom.com/the-renegade-cabaret/feed/</wfw:commentRss>
295
+ <enclosure url="http://www.rocketboom.com/video/the-renegade-cabaret_hd.m4v" length="105490865" type="video/mp4" />
296
+ <itunes:summary>Rocketboom NYC Correspondent Ella Morton speaks with founder Patty Heffley and singer Elizabeth Soychak of the Renegade Cabaret on the High Line, a non-sanctioned musical performance that takes place nightly next to the High Line Park in the Chelsea neighborhood of New York City.
297
+
298
+ http://www.rocketboom.com/the-renegade-cabaret/</itunes:summary>
299
+ <itunes:subtitle>Rocketboom NYC Correspondent Ella Morton speaks with founder Patty Heffley and singer Elizabeth Soychak of the Renegade Cabaret on the High Line, a non-sanctioned musical performance that takes place nightly next to the High Line Park in the [...]</itunes:subtitle>
300
+ </item>
301
+ <item>
302
+ <title>Building Custom Bicycles In New York City</title>
303
+ <link>http://www.rocketboom.com/custom-bicycles-in-new-york-city/</link>
304
+ <comments>http://www.rocketboom.com/custom-bicycles-in-new-york-city/#comments</comments>
305
+ <pubDate>Thu, 06 Aug 2009 18:33:52 +0000</pubDate>
306
+ <dc:creator>yatta</dc:creator>
307
+
308
+ <category><![CDATA[Humanwire]]></category>
309
+
310
+ <category><![CDATA[Josh Hadar]]></category>
311
+
312
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3501</guid>
313
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/lbxMezAi9hU&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/lbxMezAi9hU&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Rocketboom NYC Correspondent Ella Morton interviews artist Josh Hadar about his hand sculpted custom bicycles.
314
+ ]]></description>
315
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/lbxMezAi9hU&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/lbxMezAi9hU&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p><strong>Rocketboom NYC</strong> Correspondent <a href="http://www.sprinkleofginger.com/">Ella Morton</a> interviews artist <span class="description"><a href="http://hadarmetaldesign.com/">Josh Hadar</a> about his hand sculpted custom bicycles. </span></p>
316
+ ]]></content:encoded>
317
+ <wfw:commentRss>http://www.rocketboom.com/custom-bicycles-in-new-york-city/feed/</wfw:commentRss>
318
+ <enclosure url="http://www.rocketboom.com/video/custom-bicycles-in-new-york-city_hd.m4v" length="96669982" type="video/mp4" />
319
+ <itunes:summary>Rocketboom NYC Correspondent Ella Morton interviews artist Josh Hadar about his hand sculpted custom bicycles.
320
+
321
+ http://www.rocketboom.com/custom-bicycles-in-new-york-city/</itunes:summary>
322
+ <itunes:subtitle>Rocketboom NYC Correspondent Ella Morton interviews artist Josh Hadar about his hand sculpted custom bicycles.
323
+ </itunes:subtitle>
324
+ </item>
325
+ <item>
326
+ <title>Kenyan Mine Closes After Credit Crunch</title>
327
+ <link>http://www.rocketboom.com/kenyan-mine-closes-after-credit-crunch/</link>
328
+ <comments>http://www.rocketboom.com/kenyan-mine-closes-after-credit-crunch/#comments</comments>
329
+ <pubDate>Wed, 05 Aug 2009 16:00:06 +0000</pubDate>
330
+ <dc:creator>yatta</dc:creator>
331
+
332
+ <category><![CDATA[Humanwire]]></category>
333
+
334
+ <guid isPermaLink="false">http://www.rocketboom.com/?p=3493</guid>
335
+ <description><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/3QTmG5CfhCU&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/3QTmG5CfhCU&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object>Rocketboom Field Correspondent Ruud Elmendorp reports on the closing of the Fluorspar Mine in Kenya as a result of the global credit crisis.
336
+ ]]></description>
337
+ <content:encoded><![CDATA[<object width="600" height="363"><param name="movie" value="http://www.youtube.com/v/3QTmG5CfhCU&hl=en&fs=1"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/3QTmG5CfhCU&hl=en&fs=1" type="application/x-shockwave-flash" wmode="transparent" width="600" height="363" flashvars="hl=en&fs=1"></embed></object><p>Rocketboom <a href="http://www.rocketboom.com/category/field-reports/">Field</a> Correspondent <a href="http://videoreporter.nl/">Ruud Elmendorp</a> reports on the closing of the <a href="http://www.kenyafluorspar.com/">Fluorspar Mine</a> in Kenya as a result of the global credit crisis.</p>
338
+ ]]></content:encoded>
339
+ <wfw:commentRss>http://www.rocketboom.com/kenyan-mine-closes-after-credit-crunch/feed/</wfw:commentRss>
340
+ <enclosure url="http://www.rocketboom.com/video/kenyan-mine-closes-after-credit-crunch_hd.m4v" length="54337266" type="video/mp4" />
341
+ <itunes:summary>Rocketboom Field Correspondent Ruud Elmendorp reports on the closing of the Fluorspar Mine in Kenya as a result of the global credit crisis.
342
+
343
+ http://www.rocketboom.com/kenyan-mine-closes-after-credit-crunch/</itunes:summary>
344
+ <itunes:subtitle>Rocketboom Field Correspondent Ruud Elmendorp reports on the closing of the Fluorspar Mine in Kenya as a result of the global credit crisis.
345
+ </itunes:subtitle>
346
+ </item>
347
+ </channel>
348
+ </rss>
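The items in this sample feed repeat the `<category>` tag and wrap each value in CDATA, which is the case FeedMe's `category_array` accessor is meant to cover. As a rough, standalone sketch of the idea (not FeedMe's actual parsing code), the snippet below collects every category from one item fragment and unwraps the CDATA. `ITEM_XML` and `category_values` are hypothetical names introduced only for this illustration.

```ruby
# Hypothetical helper, not part of FeedMe: collect every <category> value
# from a single <item> fragment, unwrapping CDATA sections along the way.
ITEM_XML = <<~XML
  <item>
    <title>Building Custom Bicycles In New York City</title>
    <category><![CDATA[Humanwire]]></category>
    <category><![CDATA[Josh Hadar]]></category>
  </item>
XML

def category_values(item_xml)
  # scan returns one single-element array per match; |(raw)| unpacks it
  item_xml.scan(%r{<category[^>]*>(.*?)</category>}m).map do |(raw)|
    cdata = raw.match(/\A\s*<!\[CDATA\[(.*)\]\]>\s*\z/m)
    cdata ? cdata[1] : raw.strip
  end
end

categories = category_values(ITEM_XML)
puts categories.inspect # prints ["Humanwire", "Josh Hadar"]
```

A regex scan like this is only adequate for flat, well-formed fragments; nested or malformed markup would need a real XML parser.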
data/lib/feedme.rb ADDED
@@ -0,0 +1,527 @@
1
+ ####################################################################################
2
+ # FeedMe v0.1
3
+ #
4
+ # FeedMe is an easy-to-use parser for RSS and Atom files. It is based on SimpleRSS,
5
+ # but has some improvements that make it worth considering:
6
+ # 1. Support for attributes
7
+ # 2. Support for nested elements
8
+ # 3. Support for elements that appear multiple times
9
+ # 4. Syntactic sugar that makes it easier to get at the information you want
10
+ #
11
+ # One word of caution: FeedMe will be maintained only so long as SimpleRSS does not
12
+ # provide the above features. I will try to keep FeedMe's API compatible with
13
+ # SimpleRSS so that it will be easy for users to switch if/when necessary.
14
+ ####################################################################################
15
+
16
+ require 'cgi'
17
+ require 'time'
18
+
19
+ module FeedMe
20
+ VERSION = "0.1"
21
+
22
+ # constants for the feed type
23
+ RSS = :RSS
24
+ ATOM = :ATOM
25
+
26
+ # the key used to access the content element of a mixed tag
27
+ CONTENT_KEY = :content
28
+
29
+ def FeedMe.parse(source, options={})
30
+ ParserBuilder.new.parse(source, options)
31
+ end
32
+
33
+ def FeedMe.parse_strict(source, options={})
34
+ StrictParserBuilder.new.parse(source, options)
35
+ end
36
+
37
+ class ParserBuilder
38
+ attr_accessor :rss_tags, :rss_item_tags, :atom_tags, :atom_entry_tags,
39
+ :date_tags, :value_tags, :ghost_tags, :aliases,
40
+ :bang_mods, :bang_mod_fns
41
+
42
+ # the promiscuous parser only has to know about tags that have nested subtags
43
+ def initialize
44
+ # rss tags
45
+ @rss_tags = [
46
+ {
47
+ :image => nil,
48
+ :textInput => nil,
49
+ :skipHours => nil,
50
+ :skipDays => nil,
51
+ :items => [{ :'rdf:Seq' => nil }],
52
+ #:item => @rss_item_tags
53
+ }
54
+ ]
55
+ @rss_item_tags = [ {} ]
56
+
57
+ #atom tags
58
+ @atom_tags = [
59
+ {
60
+ :author => nil,
61
+ :contributor => nil,
62
+ #:entry => @atom_entry_tags
63
+ }
64
+ ]
65
+ @atom_entry_tags = [
66
+ {
67
+ :author => nil,
68
+ :contributor => nil
69
+ }
70
+ ]
71
+
72
+ # tags whose value is a date
73
+ @date_tags = [ :pubDate, :lastBuildDate, :published, :updated, :'dc:date', :expirationDate ]
74
+
75
+ # tags that can be used as the default value for a tag with attributes
76
+ @value_tags = [ CONTENT_KEY, :href ]
77
+
78
+ # tags that don't become part of the parsed object tree
79
+ @ghost_tags = [ :'rdf:Seq' ]
80
+
81
+ # tag/attribute aliases
82
+ @aliases = {
83
+ :items => :item_array,
84
+ :item_array => :entry_array,
85
+ :entries => :entry_array,
86
+ :entry_array => :item_array,
87
+ :link => :'link+self'
88
+ }
89
+
90
+ # bang mods
91
+ @bang_mods = [ :stripHtml ]
92
+ @bang_mod_fns = {
93
+ :stripHtml => proc {|str| str.gsub(/<\/?[^>]*>/, "").strip },
94
+ :wrap => proc {|str, col| str.gsub(/(.{1,#{col}})( +|$\n?)|(.{1,#{col}})/, "\\1\\3\n").strip }
95
+ }
96
+ end
97
+
98
+ def all_rss_tags
99
+ all_tags = rss_tags.dup
100
+ all_tags[0][:item] = rss_item_tags.dup
101
+ return all_tags
102
+ end
103
+
104
+ def all_atom_tags
105
+ all_tags = atom_tags.dup
106
+ all_tags[0][:entry] = atom_entry_tags.dup
107
+ return all_tags
108
+ end
109
+
110
+ def parse(source, options={})
111
+ Parser.new(self, source, options)
112
+ end
113
+ end
114
+
115
+ class StrictParserBuilder < ParserBuilder
116
+ attr_accessor :feed_ext_tags, :item_ext_tags
117
+
118
+ def initialize
119
+ super()
120
+
121
+ # rss tags
122
+ @rss_tags = [
123
+ {
124
+ :image => [ :url, :title, :link, :width, :height, :description ],
125
+ :textInput => [ :title, :description, :name, :link ],
126
+ :skipHours => [ :hour ],
127
+ :skipDays => [ :day ],
128
+ :items => [
129
+ {
130
+ :'rdf:Seq' => [ :'rdf:li' ]
131
+ },
132
+ :'rdf:Seq'
133
+ ],
134
+ #:item => @item_tags
135
+ },
136
+ :title, :link, :description, # required
137
+ :language, :copyright, :managingEditor, :webMaster, # optional
138
+ :pubDate, :lastBuildDate, :category, :generator,
139
+ :docs, :cloud, :ttl, :rating,
140
+ :image, :textInput, :skipHours, :skipDays, :item, # have subtags
141
+ :items
142
+ ]
143
+ @rss_item_tags = [
144
+ {},
145
+ :title, :description, # required
146
+ :link, :author, :category, :comments, :enclosure, # optional
147
+ :guid, :pubDate, :source, :expirationDate
148
+ ]
149
+
150
+ #atom tags
151
+ person_tags = [ :name, :uri, :email ]
152
+ @atom_tags = [
153
+ {
154
+ :author => person_tags,
155
+ :contributor => person_tags,
156
+ #:entry => @entry_tags
157
+ },
158
+ :id, :author, :title, :updated, # required
159
+ :category, :contributor, :generator, :icon, :logo, # optional
160
+ :'link+self', :'link+alternate', :'link+edit',
161
+ :'link+replies', :'link+related', :'link+enclosure',
162
+ :'link+via', :rights, :subtitle
163
+ ]
164
+ @atom_entry_tags = [
165
+ {
166
+ :author => person_tags,
167
+ :contributor => person_tags
168
+ },
169
+ :id, :author, :title, :updated, :summary, # required
170
+ :category, :content, :contributor, :'link+self',
171
+ :'link+alternate', :'link+edit', :'link+replies',
172
+ :'link+related', :'link+enclosure', :published,
173
+ :rights, :source
174
+ ]
175
+
176
+ # extensions
177
+ @feed_ext_tags = [
178
+ :'dc:date', :'feedburner:browserFriendly',
179
+ :'itunes:author', :'itunes:category'
180
+ ]
181
+ @item_ext_tags = [
182
+ :'dc:date', :'dc:subject', :'dc:creator',
183
+ :'dc:title', :'dc:rights', :'dc:publisher',
184
+ :'trackback:ping', :'trackback:about',
185
+ :'feedburner:origLink'
186
+ ]
187
+ end
188
+
189
+ def all_rss_tags
190
+ all_tags = rss_tags + (feed_ext_tags or [])
191
+ all_tags[0][:item] = rss_item_tags + (item_ext_tags or [])
192
+ return all_tags
193
+ end
194
+
195
+ def all_atom_tags
196
+ all_tags = atom_tags + (feed_ext_tags or [])
197
+ all_tags[0][:entry] = atom_entry_tags + (item_ext_tags or [])
198
+ return all_tags
199
+ end
200
+ end
201
+
202
+ class FeedData
203
+ attr_reader :fm_tag_name, :fm_parent, :fm_builder
204
+
205
+ def initialize(tag_name, parent, builder, attrs = {})
206
+ @fm_tag_name = tag_name
207
+ @fm_parent = parent
208
+ @fm_builder = builder
209
+ @data = attrs.dup
210
+ end
211
+
212
+ def key?(key)
213
+ @data.key?(key)
214
+ end
215
+
216
+ def keys
217
+ @data.keys
218
+ end
219
+
220
+ def [](key)
221
+ @data[key]
222
+ end
223
+
224
+ def []=(key, value)
225
+ @data[key] = value
226
+ end
227
+
228
+ def to_s
229
+ @data.to_s
230
+ end
231
+
232
+ def method_missing(name, *args)
233
+ call_virtual_method(name, args)
234
+ end
235
+
236
+ protected
237
+
238
+ def clean_tag(tag)
239
+ tag.to_s.gsub(':','_').intern
240
+ end
241
+
242
+ # generate a name for the array variable corresponding to a single-value variable
243
+ def arrayize(key)
244
+ return key.to_s + '_array'
245
+ end
246
+
247
+ # There are several virtual methods for each attribute/tag.
248
+ # 1. Tag/attribute name: since tags/attributes are stored as arrays,
249
+ # the instance variable name is the tag/attribute name followed by
250
+ # '_array'. The tag/attribute name is actually a virtual method that
251
+ # returns the first element in the array.
252
+ # 2. Aliases: for tags/attributes with aliases, the alias is a virtual
253
+ # method that simply forwards to the aliased method.
254
+ # 3. Any name that ends with a '?' returns true if the name without
255
+ # the '?' is a valid method and has a non-nil value.
256
+ # 4. Any name that ends with a '!' returns the value of the name
257
+ # without the '!', modified by the currently active set of bang mods
258
+ # 5. Tag/attribute name + '_value': returns the content portion of
259
+ # an element if it has both attributes and content, or the
260
+ # default attribute (defined by the value_tags property). Otherwise
261
+ # equivalent to just the tag/attribute name.
262
+ # 6. Tag/attribute name + '_count': shortcut for tag/attribute
263
+ # array.size.
264
+ # 7. If the tag name is of the form "tag+rel", the tag having the
265
+ # specified rel value is returned
266
+ def call_virtual_method(name, args, history=[])
267
+ # make sure we don't get stuck in an infinite loop
268
+ history.each do |call|
269
+ if call[0] == fm_tag_name and call[1] == name
270
+ puts name
271
+ puts self.inspect
272
+ raise FeedMe::InfiniteCallLoopError.new(name, history)
273
+ end
274
+ end
275
+ history << [ fm_tag_name, name ]
276
+
277
+ raw_name = name
278
+ name = clean_tag(name)
279
+ name_str = name.to_s
280
+ array_key = clean_tag(arrayize(name.to_s))
281
+
282
+ if name_str[-1,1] == '?'
283
+ !call_virtual_method(name_str[0..-2], args, history).nil? rescue false
284
+ elsif name_str[-1,1] == '!'
285
+ value = call_virtual_method(name_str[0..-2], args, history)
286
+ fm_builder.bang_mods.each do |bm|
287
+ parts = bm.to_s.split('_')
288
+ bm_key = parts[0].to_sym
289
+ next unless fm_builder.bang_mod_fns.key?(bm_key)
290
+ value = fm_builder.bang_mod_fns[bm_key].call(value, *parts[1..-1])
291
+ end
292
+ return value
293
+ elsif key? name
294
+ self[name]
295
+ elsif key? array_key
296
+ self[array_key].first
297
+ elsif name_str =~ /\A(.+)_value\z/
298
+ value = call_virtual_method($1, args, history)
299
+ if value.is_a?(FeedData)
300
+ fm_builder.value_tags.each do |tag|
301
+ return value.call_virtual_method(tag, args, history) rescue nil
302
+ end
303
+ else
304
+ value
305
+ end
306
+ elsif name_str =~ /\A(.+)_count\z/
307
+ call_virtual_method(clean_tag(arrayize($1)), args, history).size
308
+ elsif name_str.include?("+")
309
+ tag_data = name_str.split("+")
310
+ rel = tag_data[1]
311
+ call_virtual_method(clean_tag(arrayize(tag_data[0])), args, history).each do |elt|
312
+ next unless elt.is_a?(FeedData) and elt.rel?
313
+ return elt if elt.rel.casecmp(rel) == 0
314
+ end
315
+ elsif fm_builder.aliases.key? name
316
+ name = fm_builder.aliases[name]
317
+ method(name).call(*args) rescue call_virtual_method(name, args, history)
318
+ elsif fm_tag_name == :items # special handling for RDF items tag
319
+ self[:'rdf:li_array'].method(raw_name).call(*args)
320
+ elsif fm_tag_name == :'rdf:li' # special handling for RDF li tag
321
+ uri = self[:'rdf:resource']
322
+ fm_parent.fm_parent.item_array.each do |item|
323
+ if item[:'rdf:about'] == uri
324
+ return item.call_virtual_method(name, args, history)
325
+ end
326
+ end
327
+ else
328
+ raise NameError.new("No such method #{name}", name)
329
+ end
330
+ end
331
+ end
332
+
333
+ class Parser < FeedData
334
+ attr_reader :fm_source, :fm_options, :fm_type, :fm_tags, :fm_unparsed
335
+
336
+ def initialize(builder, source, options={})
337
+ super(nil, nil, builder)
338
+ @fm_source = source.respond_to?(:read) ? source.read : source.to_s
339
+ @fm_options = Hash.new.update(options)
340
+ @fm_parsed = []
341
+ @fm_unparsed = []
342
+ parse
343
+ end
344
+
345
+ def channel() self end
346
+ alias :feed :channel
347
+
348
+ def fm_tag_name
349
+ @fm_type == FeedMe::RSS ? 'channel' : 'feed'
350
+ end
351
+
352
+ private
353
+
354
+ def parse
355
+ # RSS = everything between channel tags + everything between </channel> and </rdf> if this is an RDF document
356
+ if @fm_source =~ %r{<(?:.*?:)?(?:rss|rdf)(.*?)>.*?<(?:.*?:)?channel(.*?)>(.+)</(?:.*?:)?channel>(.*)</(?:.*?:)?(?:rss|rdf)>}mi
357
+ @fm_type = FeedMe::RSS
358
+ @fm_tags = fm_builder.all_rss_tags
359
+ attrs = parse_attributes($1, $2)
360
+ attrs[:version] ||= '1.0'
361
+ parse_content(self, attrs, $3 + nil_safe_to_s($4), @fm_tags)
362
+ # Atom = everything between feed tags
363
+ elsif @fm_source =~ %r{<(?:.*?:)?feed(.*?)>(.+)</(?:.*?:)?feed>}mi
364
+ @fm_type = FeedMe::ATOM
365
+ @fm_tags = fm_builder.all_atom_tags
366
+ parse_content(self, parse_attributes($1), $2, @fm_tags)
367
+ else
368
+ raise FeedMeError, "Poorly formatted feed"
369
+ end
370
+ end
371
+
372
+ def parse_content(parent, attrs, content, tags)
373
+ # add attributes to parent
374
+ attrs.each_pair {|key, value| add_tag(parent, key, unescape(value)) }
375
+
376
+ # the first item in a tag array may be a hash that defines tags that have subtags
377
+ first_tag = 0
378
+ if !tags.nil? && tags[0].is_a?(Hash)
379
+ sub_tags = tags[0]
380
+ first_tag = 1
381
+ end
382
+
383
+ # split the content into elements
384
+ elements = {}
385
+ # TODO: this will break if a namespace is used that is not rss: or atom:
386
+ content.scan( %r{(<(?:rss:|atom:)?([^ >]+)([^>]*)(?:/>|>(.*?)</(?:rss:|atom:)?\2>))}mi ) do |match|
387
+ # \1 = full content (from start to end tag), \2 = tag name
388
+ # \3 = attributes, and \4 = content between tags
389
+ key = clean_tag(match[1])
390
+ value = [parse_attributes(match[2]), match[3]]
391
+ if elements.key? key
392
+ elements[key] << value
393
+ else
394
+ elements[key] = [value]
395
+ end
396
+ end
397
+
398
+ # check if this is a promiscuous parser
399
+ if tags.nil? || tags.empty? || (tags.size == 1 && first_tag == 1)
400
+ tags = elements.keys
401
+ first_tag = 0
402
+ end
403
+
404
+ # iterate over all tags (some or all of which may not be present)
405
+ tags[first_tag..-1].each do |tag|
406
+ key = clean_tag(tag)
407
+ element_array = elements.delete(key) or next
408
+ @fm_parsed << key
409
+
410
+ element_array.each do |elt|
411
+ if !sub_tags.nil? && sub_tags.key?(key)
412
+ if fm_builder.ghost_tags.include? key
413
+ new_parent = parent
414
+ else
415
+ new_parent = FeedData.new(key, parent, fm_builder)
416
+ add_tag(parent, key, new_parent)
417
+ end
418
+ parse_content(new_parent, elt[0], elt[1], sub_tags[key])
419
+ else
420
+ add_tag(parent, key, clean_content(key, elt[0], elt[1], parent))
421
+ end
422
+ end
423
+ end
424
+
425
+ @fm_unparsed += elements.keys
426
+
427
+ @fm_parsed.uniq!
428
+ @fm_unparsed.uniq!
429
+ end
430
+
431
+ def add_tag(hash, key, value)
432
+ array_var = clean_tag(arrayize(key.to_s))
433
+ if hash.key? array_var
434
+ hash[array_var] << value
435
+ else
436
+ hash[array_var] = [value]
437
+ end
438
+ end
439
+
440
+ # used to normalize attribute names
441
+ def format_tag(tag)
442
+ camelize(underscore(tag).downcase, false)
443
+ end
444
+
445
+ def clean_content(tag, attrs, content, parent)
446
+ content = content.to_s
447
+ if fm_builder.date_tags.include? tag
448
+ content = Time.parse(content) rescue unescape(content)
449
+ else
450
+ content = unescape(content)
451
+ end
452
+
453
+ unless attrs.empty?
454
+ hash = FeedData.new(tag, parent, fm_builder, attrs)
455
+ if !content.empty?
456
+ hash[FeedMe::CONTENT_KEY] = content
457
+ end
458
+ return hash
459
+ end
460
+
461
+ return content
462
+ end
463
+
464
+ def parse_attributes(*attrs)
465
+ hash = {}
466
+ attrs.each do |a|
467
+ next if a.nil?
468
+ # pull key/value pairs out of attr string
469
+ array = a.scan(/(\w+)=['"]?([^'"]+)/)
470
+ # unescape values
471
+ array = array.collect {|key, value| [clean_tag(format_tag(key)), unescape(value)]}
472
+ hash.merge! Hash[*array.flatten]
473
+ end
474
+ return hash
475
+ end
476
+
477
+ def unescape(content)
478
+ content = CGI.unescapeHTML(content)
479
+
480
+ query = content.match(/^(https?:.*\?)(.*)$/)
481
+ content = query[1] + CGI.unescape(query[2]) if query
482
+
483
+ cdata = content.match(%r{<!\[CDATA\[(.*)\]\]>}mi)
484
+ content = cdata[1] if cdata
485
+
486
+ return content
487
+
488
+ #if content =~ /([^-_.!~*'()a-zA-Z\d;\/?:@&=+$,\[\]]%)/n then
489
+ # CGI.unescapeHTML(content).gsub(/(<!\[CDATA\[|\]\]>)/,'').strip
490
+ #else
491
+ # content.gsub(/(<!\[CDATA\[|\]\]>)/,'').strip
492
+ #end
493
+ end
494
+
495
+ def underscore(camel_cased_word)
496
+ camel_cased_word.to_s.gsub(/::/, '/').
497
+ gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
498
+ gsub(/([a-z\d])([A-Z])/,'\1_\2').
499
+ tr("-", "_").
500
+ downcase
501
+ end
502
+
503
+ def camelize(lower_case_and_underscored_word, first_letter_in_uppercase = true)
504
+ if first_letter_in_uppercase
505
+ lower_case_and_underscored_word.to_s.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
506
+ else
507
+ lower_case_and_underscored_word[0,1].downcase + camelize(lower_case_and_underscored_word)[1..-1]
508
+ end
509
+ end
510
+
511
+ def nil_safe_to_s(obj)
512
+ obj.nil? ? '' : obj.to_s
513
+ end
514
+ end
515
+
516
+ class FeedMeError < StandardError
517
+ end
518
+
519
+ class InfiniteCallLoopError < StandardError
520
+ attr_reader :name, :history
521
+
522
+ def initialize(name, history)
523
+ @name = name
524
+ @history = history
525
+ end
526
+ end
527
+ end
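The comment block above `call_virtual_method` lists several virtual methods derived from each stored tag. The sketch below is a reduced, standalone illustration of three of those rules (plain name, `?` suffix, `_count` suffix) on top of `_array` storage. `TinyFeedData` and its `add` method are hypothetical names introduced here; FeedMe's own `FeedData` additionally handles aliases, bang mods, `_value`, and `tag+rel` lookups, none of which are reproduced.

```ruby
# Simplified stand-in for the lookup rules described in FeedData's comments.
# TinyFeedData is a hypothetical class for illustration, not FeedMe's API.
class TinyFeedData
  def initialize
    @data = {}
  end

  # Every value is appended to a "<tag>_array" slot, similar to add_tag.
  def add(tag, value)
    (@data[:"#{tag}_array"] ||= []) << value
    self
  end

  def method_missing(name, *args)
    name_str = name.to_s
    if name_str.end_with?('?')
      # "title?" is true when "title" resolves to a non-nil value
      begin
        !public_send(name_str.chomp('?')).nil?
      rescue NoMethodError
        false
      end
    elsif @data.key?(name)
      @data[name]                        # e.g. category_array
    elsif @data.key?(:"#{name_str}_array")
      @data[:"#{name_str}_array"].first  # e.g. category
    elsif name_str =~ /\A(.+)_count\z/ && @data.key?(:"#{$1}_array")
      @data[:"#{$1}_array"].size         # e.g. category_count
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    base = name.to_s.sub(/(\?|_count)\z/, '')
    @data.key?(name) || @data.key?(:"#{base}_array") || super
  end
end

item = TinyFeedData.new
item.add(:category, 'Humanwire').add(:category, 'Josh Hadar')

puts item.category               # first stored value
puts item.category_array.inspect # all stored values
puts item.category_count         # number of stored values
puts item.category?              # true: the tag is present
puts item.title?                 # false: the tag was never added
```

Defining `respond_to_missing?` alongside `method_missing` is a general Ruby convention rather than something the original code does; it keeps `respond_to?` consistent with the dynamic accessors.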
@@ -0,0 +1,4 @@
1
+ $:.unshift(File.dirname(__FILE__) + '/../lib')
2
+
3
+ require 'test/unit'
4
+ require 'feedme'
metadata ADDED
@@ -0,0 +1,74 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: feedme
3
+ version: !ruby/object:Gem::Version
4
+ version: "0.1"
5
+ platform: ruby
6
+ authors:
7
+ - John Didion
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+
12
+ date: 2009-09-03 00:00:00 -04:00
13
+ default_executable:
14
+ dependencies:
15
+ - !ruby/object:Gem::Dependency
16
+ name: hoe
17
+ type: :development
18
+ version_requirement:
19
+ version_requirements: !ruby/object:Gem::Requirement
20
+ requirements:
21
+ - - ">="
22
+ - !ruby/object:Gem::Version
23
+ version: 2.3.3
24
+ version:
25
+ description: A simple, flexible, and extensible RSS and Atom parser for Ruby. Based on the popular SimpleRSS library, but with many nice extra features.
26
+ email:
27
+ - jdidion@rubyforge.org
28
+ executables: []
29
+
30
+ extensions: []
31
+
32
+ extra_rdoc_files:
33
+ - History.txt
34
+ - Manifest.txt
35
+ - README.txt
36
+ files:
37
+ - History.txt
38
+ - Manifest.txt
39
+ - README.txt
40
+ - Rakefile
41
+ - lib/feedme.rb
42
+ - examples/rocketboom.rb
43
+ - examples/rocketboom.rss
44
+ has_rdoc: true
45
+ homepage: http://feedme.rubyforge.org
46
+ licenses: []
47
+
48
+ post_install_message:
49
+ rdoc_options:
50
+ - --main
51
+ - README.txt
52
+ require_paths:
53
+ - lib
54
+ required_ruby_version: !ruby/object:Gem::Requirement
55
+ requirements:
56
+ - - ">="
57
+ - !ruby/object:Gem::Version
58
+ version: "0"
59
+ version:
60
+ required_rubygems_version: !ruby/object:Gem::Requirement
61
+ requirements:
62
+ - - ">="
63
+ - !ruby/object:Gem::Version
64
+ version: "0"
65
+ version:
66
+ requirements: []
67
+
68
+ rubyforge_project: feedme
69
+ rubygems_version: 1.3.5
70
+ signing_key:
71
+ specification_version: 3
72
+ summary: A simple, flexible, and extensible RSS and Atom parser for Ruby
73
+ test_files:
74
+ - test/test_helper.rb