pauldix-feedzirra 0.0.1 → 0.0.2

data/README.textile CHANGED
@@ -1,7 +1,8 @@
  h1. Feedzirra
 
  "http://github.com/pauldix/feedzirra/tree/master":http://github.com/pauldix/feedzirra/tree/master
- "group discussion":http://groups.google.com/group/feedzirra
+
+ I'd like feedback on the api and any bugs encountered on feeds in the wild. I've set up a "google group here":http://groups.google.com/group/feedzirra.
 
  h2. Summary
 
@@ -9,24 +10,46 @@ A feed fetching and parsing library that treats the internet like Godzilla treat
 
  h2. Description
 
- Feedzirra is a feed library that is designed to get and update many feeds as quickly as possible. This includes using libcurl-multi through the "taf2-curb"http://github.com/taf2/curb/tree/master gem for faster http gets, and libxml through "nokogiri":http://github.com/tenderlove/nokogiri/tree/master and "sax-machine":http://github.com/pauldix/sax-machine/tree/master for faster parsing.
+ Feedzirra is a feed library that is designed to get and update many feeds as quickly as possible. This includes using libcurl-multi through the "taf2-curb":http://github.com/taf2/curb/tree/master gem for faster http gets, and libxml through "nokogiri":http://github.com/tenderlove/nokogiri/tree/master and "sax-machine":http://github.com/pauldix/sax-machine/tree/master for faster parsing.
 
  Once you have fetched feeds using Feedzirra, they can be updated using the feed objects. Feedzirra automatically inserts etag and last-modified information from the http response headers to lower bandwidth usage, eliminate unnecessary parsing, and make things speedier in general.
 
+ Another feature present in Feedzirra is the ability to create callback functions that get called "on success" and "on failure" when getting a feed. This makes it easy to do things like log errors or update data stores.
+
  The fetching and parsing logic have been decoupled so that either of them can be used in isolation if you'd prefer not to use everything that Feedzirra offers. However, the code examples below use helper methods in the Feed class that put everything together to make things as simple as possible.
 
  The final feature of Feedzirra is the ability to define custom parsing classes. In truth, Feedzirra could be used to parse much more than feeds. Microformats, page scraping, and almost anything else are fair game.
 
  h2. Installation
 
- For now Feedzirra exists only on github. It also has a few gem requirements that are only on github. Before you start you need to have "libcurl":http://curl.haxx.se/ and "libxml":http://xmlsoft.org/ installed. If you're on Leopard you have both. Otherwise, you'll need to grab them. Once you've got those libraries, these are the gems you'll need.
+ For now Feedzirra exists only on github. It also has a few gem requirements that are only on github. Before you start you need to have "libcurl":http://curl.haxx.se/ and "libxml":http://xmlsoft.org/ installed. If you're on Leopard you have both. Otherwise, you'll need to grab them. Once you've got those libraries, these are the gems that get used: nokogiri, pauldix-sax-machine, taf2-curb (note that this is a fork that lives on github and not the Ruby Forge version of curb), and pauldix-feedzirra. The feedzirra gemspec has all the dependencies so you should be able to get up and running with the standard github gem install routine:
  <pre>
- gem install nokogiri
  gem sources -a http://gems.github.com # if you haven't already
- gem install pauldix-sax-machine
- gem install taf2-curb
  gem install pauldix-feedzirra
  </pre>
+ <b>NOTE:</b> Some people have been reporting a few issues related to installation. First, the Ruby Forge version of curb is not what you want. It will not work. Nor will the curl-multi gem that lives on Ruby Forge. You have to get the "taf2-curb":http://github.com/taf2/curb/tree/master fork installed.
+
+ If you see this error when doing a require:
+ <pre>
+ /Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- curb_core (LoadError)
+ </pre>
+ It means that the taf2-curb gem didn't build correctly. To resolve this you can do a git clone git://github.com/taf2/curb.git then run rake gem in the curb directory, then sudo gem install pkg/curb-0.2.4.0.gem. After that you should be good.
+
+ If you see something like this when trying to run it:
+ <pre>
+ NoMethodError: undefined method `on_success' for #<Curl::Easy:0x1182724>
+ from ./lib/feedzirra/feed.rb:88:in `add_url_to_multi'
+ </pre>
+ This means that you are requiring curl-multi or the Ruby Forge version of Curb somewhere. You can't use those and need to get the taf2 version up and running.
+
+ If you're on Debian or Ubuntu and getting errors while trying to install the taf2-curb gem, it could be because you don't have the latest version of libcurl installed. Do this to fix:
+ <pre>
+ sudo apt-get install libcurl4-gnutls-dev
+ </pre>
+
+ Another problem could be if you are running Mac Ports and you have libcurl installed through there. You need to uninstall it for curb to work! The version in Mac Ports is old and doesn't play nice with curb. If you're running Leopard, you can just uninstall and you should be golden. If you're on an older version of OS X, you'll then need to "download curl":http://curl.haxx.se/download.html and build from source. Then you'll have to install the taf2-curb gem again. You might have to perform the step above.
+
+ If you're still having issues, please let me know on the mailing list. Also, "Todd Fisher (taf2)":http://github.com/taf2 is working on fixing the gem install. Please send him a full error report.
 
  h2. Usage
 
@@ -51,6 +74,14 @@ entry.author # => "Paul Dix"
  entry.summary # => "..."
  entry.content # => "..."
  entry.published # => Thu Jan 29 17:00:19 UTC 2009 # it's a Time object
+ entry.categories # => ["...", "..."]
+
+ # sanitizing an entry's content
+ entry.sanitized.title # => returns the title with harmful stuff escaped
+ entry.sanitized.author # => returns the author with harmful stuff escaped
+ entry.sanitized.content # => returns the content with harmful stuff escaped
+ entry.sanitize! # => sanitizes the entry's title, author, and content in place (as in, it changes the value to clean versions)
+ feed.sanitize_entries! # => sanitizes all entries in place
 
  # updating a single feed
  updated_feed = Feedzirra::Feed.update(feed)
@@ -61,7 +92,7 @@ updated_feed.new_entries # a collection of the entry objects that are newer tha
 
  # fetching multiple feeds
  feed_urls = ["http://feeds.feedburner.com/PaulDixExplainsNothing", "http://feeds.feedburner.com/trottercashion"]
- feeds = Feedzirra::Feed.fetch_and_parse(feeds)
+ feeds = Feedzirra::Feed.fetch_and_parse(feed_urls)
 
  # feeds is now a hash with the feed_urls as keys and the parsed feed objects as values. If an error was thrown
  # there will be a Fixnum of the http response code instead of a feed object
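Because the returned hash can mix parsed feeds with integer HTTP status codes, callers usually need to separate the two. A minimal sketch, assuming a plain hash shaped like the one the comment describes; `FakeFeed`, `split_results`, and the example URLs are hypothetical stand-ins, and nothing here calls Feedzirra itself. The check uses `Integer`, which covers what older Rubies called `Fixnum`.

```ruby
# Hypothetical stand-in for a parsed feed object.
FakeFeed = Struct.new(:title)

# Splits a url => result hash into [parsed feeds, failed status codes].
def split_results(results)
  failed, parsed = results.partition { |_url, value| value.is_a?(Integer) }
  [parsed.to_h, failed.to_h]
end

results = {
  "http://example.com/a.xml" => FakeFeed.new("Feed A"),
  "http://example.com/b.xml" => 404
}

parsed, failed = split_results(results)
parsed.each { |url, feed| puts "#{url}: #{feed.title}" }
failed.each { |url, code| puts "#{url}: HTTP #{code}" }
```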
@@ -106,11 +137,17 @@ h2. Next Steps
 
  This thing needs to hammer on many different feeds in the wild. I'm sure there will be bugs. I want to find them and crush them. I didn't bother using the test suite for feedparser. I wanted to start fresh.
 
- Here are some more specific todos.
- * Clean up the fetching code inside feed.rb so it doesn't suck so hard.
- * Make the feed_spec actually mock stuff out so it doesn't hit the net.
+ Here are some more specific TODOs.
+ * Make a feedzirra-rails gem to integrate feedzirra seamlessly with Rails and ActiveRecord.
+ * Add function to sanitize content.
+ * Add support to automatically handle gzip and deflate encoding.
+ * Add support for authenticated feeds.
  * Create a super sweet DSL for defining new parsers.
+ * Test against Ruby 1.9.1 and fix any bugs.
  * I'm not keeping track of modified on entries. Should I add this?
+ * Clean up the fetching code inside feed.rb so it doesn't suck so hard.
+ * Make the feed_spec actually mock stuff out so it doesn't hit the net.
+ * Readdress how feeds determine if they can parse a document. Maybe I should use namespaces instead?
 
  h2. LICENSE
 
@@ -3,11 +3,12 @@ module Feedzirra
  include SAXMachine
  include FeedEntryUtilities
  element :title
- element :link, :as => :url, :value => :href, :with => {:type => "text/html"}
+ element :link, :as => :url, :value => :href, :with => {:type => "text/html", :rel => "alternate"}
  element :name, :as => :author
  element :content
  element :summary
  element :published
  element :created, :as => :published
+ elements :category, :as => :categories, :value => :term
  end
  end
@@ -4,9 +4,11 @@ module Feedzirra
  include FeedEntryUtilities
  element :title
  element :name, :as => :author
+ element :link, :as => :url, :value => :href, :with => {:type => "text/html", :rel => "alternate"}
  element :"feedburner:origLink", :as => :url
  element :summary
  element :content
  element :published
+ elements :category, :as => :categories, :value => :term
  end
  end
@@ -13,7 +13,7 @@ module Feedzirra
  end
 
  def self.determine_feed_parser_for_xml(xml)
- start_of_doc = xml.slice(0, 500)
+ start_of_doc = xml.slice(0, 1000)
  feed_classes.detect {|klass| klass.able_to_parse?(start_of_doc)}
  end
 
@@ -22,24 +22,25 @@ module Feedzirra
  end
 
  def self.feed_classes
- @feed_classes ||= [RSS, RDF, AtomFeedBurner, Atom]
+ @feed_classes ||= [RSS, AtomFeedBurner, Atom]
  end
 
  # can take a single url or an array of urls
  # when passed a single url it returns the body of the response
  # when passed an array of urls it returns a hash with the urls as keys and body of responses as values
  def self.fetch_raw(urls, options = {})
- urls = [*urls]
+ url_queue = [*urls]
  multi = Curl::Multi.new
  responses = {}
- urls.each do |url|
+ url_queue.each do |url|
  easy = Curl::Easy.new(url) do |curl|
  curl.headers["User-Agent"] = (options[:user_agent] || USER_AGENT)
  curl.headers["If-Modified-Since"] = options[:if_modified_since].httpdate if options.has_key?(:if_modified_since)
  curl.headers["If-None-Match"] = options[:if_none_match] if options.has_key?(:if_none_match)
+ curl.headers["Accept-encoding"] = 'gzip, deflate'
  curl.follow_location = true
  curl.on_success do |c|
- responses[url] = c.body_str
+ responses[url] = decode_content(c)
  end
  curl.on_failure do |c|
  responses[url] = c.response_code
@@ -49,7 +50,7 @@ module Feedzirra
  end
 
  multi.perform
- return responses.size == 1 ? responses.values.first : responses
+ return urls.is_a?(String) ? responses.values.first : responses
  end
 
  def self.fetch_and_parse(urls, options = {})
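The switch from `responses.size == 1` to `urls.is_a?(String)` ties the return shape to the type of the argument rather than to how many responses came back. A small sketch of just that decision, using a hypothetical `shape_result` helper and a canned response hash in place of real HTTP calls:

```ruby
# Hypothetical helper mirroring only the return-shape decision from the hunk.
def shape_result(urls, responses)
  urls.is_a?(String) ? responses.values.first : responses
end

responses = { "http://example.com/feed.xml" => "<rss/>" }

single = shape_result("http://example.com/feed.xml", responses)
listed = shape_result(["http://example.com/feed.xml"], responses)

puts single.class
puts listed.class
```

With a one-element array, the old size check would have handed back the bare body; the type check returns a hash, which is what the new "should always return a hash when passed an array" specs assert.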
@@ -64,7 +65,21 @@ module Feedzirra
  end
 
  multi.perform
- return responses.size == 1 ? responses.values.first : responses
+ return urls.is_a?(String) ? responses.values.first : responses
+ end
+
+ def self.decode_content(c)
+ if c.header_str.match(/Content-Encoding: gzip/)
+ gz = Zlib::GzipReader.new(StringIO.new(c.body_str))
+ xml = gz.read
+ gz.close
+ elsif c.header_str.match(/Content-Encoding: deflate/)
+ xml = Zlib::Inflate.inflate(c.body_str)
+ else
+ xml = c.body_str
+ end
+
+ xml
  end
 
  def self.update(feeds, options = {})
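The decoding step can be tried without curb. The sketch below is an approximation under stated assumptions: `decode_body` is a hypothetical helper that takes the raw header text and body as plain strings instead of a `Curl::Easy` handle, and it decompresses deflate bodies with `Zlib::Inflate.inflate`, which is the stdlib class method for inflating zlib data.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: picks a decoder from the Content-Encoding header.
def decode_body(header_str, body)
  case header_str
  when /^Content-Encoding:\s*gzip/i
    gz = Zlib::GzipReader.new(StringIO.new(body))
    begin
      gz.read
    ensure
      gz.close
    end
  when /^Content-Encoding:\s*deflate/i
    Zlib::Inflate.inflate(body)
  else
    body # no recognised encoding, pass the body through untouched
  end
end

xml      = '<rss version="2.0"></rss>'
gzipped  = Zlib.gzip(xml)
deflated = Zlib::Deflate.deflate(xml)

puts decode_body("HTTP/1.1 200 OK\r\nContent-Encoding: gzip\r\n\r\n", gzipped) == xml
puts decode_body("HTTP/1.1 200 OK\r\nContent-Encoding: deflate\r\n\r\n", deflated) == xml
puts decode_body("HTTP/1.1 200 OK\r\n\r\n", xml) == xml
```

The case-insensitive match is a choice made for this sketch, since header names are not case-sensitive; the hunk above matches the exact capitalisation.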
@@ -84,10 +99,11 @@ module Feedzirra
  curl.headers["User-Agent"] = (options[:user_agent] || USER_AGENT)
  curl.headers["If-Modified-Since"] = options[:if_modified_since].httpdate if options.has_key?(:if_modified_since)
  curl.headers["If-None-Match"] = options[:if_none_match] if options.has_key?(:if_none_match)
+ curl.headers["Accept-encoding"] = 'gzip, deflate'
  curl.follow_location = true
  curl.on_success do |c|
  add_url_to_multi(multi, url_queue.shift, url_queue, responses, options) unless url_queue.empty?
- xml = c.body_str
+ xml = decode_content(c)
  klass = determine_feed_parser_for_xml(xml)
  if klass
  feed = klass.parse(xml)
@@ -10,6 +10,25 @@ module Feedzirra
  @published = parse_datetime(val)
  end
 
+ def sanitized
+ dispatcher = Class.new do
+ def initialize(entry)
+ @entry = entry
+ end
+
+ def method_missing(method, *args)
+ Dryopteris.sanitize(@entry.send(method))
+ end
+ end
+ dispatcher.new(self)
+ end
+
+ def sanitize!
+ self.title = sanitized.title
+ self.author = sanitized.author
+ self.content = sanitized.content
+ end
+
  alias_method :last_modified, :published
  end
  end
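The `sanitized` method above returns a small proxy whose `method_missing` forwards each call to the entry and cleans the result. A self-contained sketch of that pattern follows. It makes several assumptions: `escape_html` is a simple stand-in for `Dryopteris.sanitize` so the example runs without the gem, and `SanitizingProxy` and `PlainEntry` are hypothetical names. It also adds a `respond_to_missing?` guard that the hunk does not have.

```ruby
# Stand-in escaper so the sketch runs without the dryopteris gem (assumption).
ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

def escape_html(text)
  text.to_s.gsub(/[&<>"]/, ESCAPES)
end

# Proxy that forwards any reader to the wrapped entry and escapes the result.
class SanitizingProxy
  def initialize(entry)
    @entry = entry
  end

  def method_missing(name, *args)
    return super unless @entry.respond_to?(name)
    escape_html(@entry.public_send(name, *args))
  end

  def respond_to_missing?(name, include_private = false)
    @entry.respond_to?(name, include_private) || super
  end
end

# Hypothetical entry type; the real entries are SAXMachine classes.
PlainEntry = Struct.new(:title, :author, :content) do
  def sanitized
    SanitizingProxy.new(self)
  end

  # Overwrites the three text fields with their cleaned versions.
  def sanitize!
    self.title   = sanitized.title
    self.author  = sanitized.author
    self.content = sanitized.content
    self
  end
end

entry = PlainEntry.new('<script>Hi', 'Paul & co', 'plain text')
puts entry.sanitized.title
entry.sanitize!
puts entry.author
```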
@@ -26,7 +26,7 @@ module Feedzirra
 
  def update_from_feed(feed)
  self.new_entries += find_new_entries_for(feed)
- self.entries += self.new_entries
+ self.entries.unshift(*self.new_entries)
 
  updated! if UPDATABLE_ATTRIBUTES.any? { |name| update_attribute(feed, name) }
  end
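Replacing `+=` with `unshift(*...)` matters when entries are kept newest-first: appending puts fresh entries at the tail, while `unshift` puts them at the head. A tiny illustration with plain strings standing in for entry objects (the names are made up for the sketch):

```ruby
existing    = ["entry-2", "entry-1"]  # newest first
new_entries = ["entry-4", "entry-3"]  # newest first

appended  = existing + new_entries              # old behaviour
prepended = existing.dup.unshift(*new_entries)  # new behaviour

puts appended.inspect
puts prepended.inspect
```

Only the prepended list keeps the most recent entry at index 0.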
@@ -39,6 +39,10 @@ module Feedzirra
  end
  end
 
+ def sanitize_entries!
+ entries.each {|entry| entry.sanitize!}
+ end
+
  private
 
  def updated!
@@ -46,7 +50,18 @@ module Feedzirra
  end
 
  def find_new_entries_for(feed)
- feed.entries.inject([]) { |result, entry| result << entry unless existing_entry?(entry); result }
+ # this implementation is a hack, which is why it's so ugly.
+ # it's to get around the fact that not all feeds have a published date.
+ # however, they're always ordered with the newest one first.
+ # So we go through the entries just parsed and insert each one as a new entry
+ # until we get to one that has the same url as the newest for the feed
+ latest_entry = self.entries.first
+ found_new_entries = []
+ feed.entries.each do |entry|
+ break if entry.url == latest_entry.url
+ found_new_entries << entry
+ end
+ found_new_entries
  end
 
  def existing_entry?(test_entry)
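The new `find_new_entries_for` walks the freshly parsed entries, newest first, and stops at the first URL it already knows. The same idea can be sketched with `take_while`. `SketchEntry` and `new_entries_since` are hypothetical names, and the `nil` guard for a feed with no stored entries is an addition that the hunk does not include.

```ruby
# Hypothetical minimal entry; only the url matters for this comparison.
SketchEntry = Struct.new(:url)

# Returns the fetched entries that appear before the newest known entry.
def new_entries_since(known_entries, fetched_entries)
  latest = known_entries.first
  return fetched_entries.dup if latest.nil? # nothing stored yet: all are new
  fetched_entries.take_while { |entry| entry.url != latest.url }
end

known   = [SketchEntry.new("/c"), SketchEntry.new("/b")]
fetched = [SketchEntry.new("/e"), SketchEntry.new("/d"),
           SketchEntry.new("/c"), SketchEntry.new("/b")]

puts new_entries_since(known, fetched).map(&:url).inspect
```

Like the hunk, this relies on feeds listing the newest entry first; a feed that reorders entries would defeat it.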
data/lib/feedzirra/rss.rb CHANGED
@@ -9,7 +9,7 @@ module Feedzirra
  attr_accessor :feed_url
 
  def self.able_to_parse?(xml)
- xml =~ /rss.*version\=\"2\.0\"/
+ xml =~ /\<rss|rdf/
  end
  end
  end
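In the new pattern the alternation splits the whole expression, so it reads as "`<rss` or `rdf` anywhere in the string". The sketch below compares that behaviour with a tighter, purely hypothetical pattern. `LOOSE` is the hunk's alternation written without the backslash before `<`, and `STRICT` and the sample strings are made up for illustration.

```ruby
LOOSE  = /<rss|rdf/            # same alternation as the hunk above
STRICT = /<(?:rss|rdf:RDF)\b/  # hypothetical tighter alternative

samples = {
  rss:  '<rss version="2.0">',
  rdf:  '<rdf:RDF xmlns="http://purl.org/rss/1.0/">',
  atom: '<feed xmlns="http://www.w3.org/2005/Atom"><title>rdf notes</title>'
}

samples.each do |name, xml|
  puts "#{name}: loose=#{!!(xml =~ LOOSE)} strict=#{!!(xml =~ STRICT)}"
end
```

Since `RSS` comes first in `feed_classes`, an Atom document that happens to contain "rdf" within the inspected prefix would be claimed by the RSS parser under the loose pattern. That may be acceptable for the feeds tested so far, and it lines up with the TODO about readdressing how feeds decide whether they can parse a document.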
@@ -4,9 +4,13 @@ module Feedzirra
  include FeedEntryUtilities
  element :title
  element :link, :as => :url
+
  element :"dc:creator", :as => :author
  element :"content:encoded", :as => :content
  element :description, :as => :summary
+
  element :pubDate, :as => :published
+ element :"dc:date", :as => :published
+ elements :category, :as => :categories
  end
  end
data/lib/feedzirra.rb CHANGED
@@ -2,8 +2,10 @@ $LOAD_PATH.unshift(File.dirname(__FILE__)) unless $LOAD_PATH.include?(File.dirna
 
  gem 'activesupport'
 
+ require 'zlib'
  require 'curb'
  require 'sax-machine'
+ require 'dryopteris'
  require 'active_support/basic_object'
  require 'active_support/core_ext/object'
  require 'active_support/core_ext/time'
@@ -25,5 +27,5 @@ require 'feedzirra/atom'
  require 'feedzirra/atom_feed_burner'
 
  module Feedzirra
- VERSION = "0.0.1"
+ VERSION = "0.0.2"
  end
@@ -30,4 +30,8 @@ describe Feedzirra::AtomEntry do
  it "should parse the published date" do
  @entry.published.to_s.should == "Fri Jan 16 18:21:00 UTC 2009"
  end
+
+ it "should parse the categories" do
+ @entry.categories.should == ['Turkey', 'Seattle']
+ end
  end
@@ -11,6 +11,11 @@ describe Feedzirra::AtomFeedBurnerEntry do
  @entry.title.should == "Making a Ruby C library even faster"
  end
 
+ it "should be able to fetch a url via the 'alternate' rel if no origLink exists" do
+ entry = Feedzirra::AtomFeedBurner.parse(File.read("#{File.dirname(__FILE__)}/../sample_feeds/PaulDixExplainsNothingAlternate.xml")).entries.first
+ entry.url.should == 'http://feeds.feedburner.com/~r/PaulDixExplainsNothing/~3/519925023/making-a-ruby-c-library-even-faster.html'
+ end
+
  it "should parse the url" do
  @entry.url.should == "http://www.pauldix.net/2009/01/making-a-ruby-c-library-even-faster.html"
  end
@@ -30,4 +35,8 @@ describe Feedzirra::AtomFeedBurnerEntry do
  it "should parse the published date" do
  @entry.published.to_s.should == "Thu Jan 22 15:50:22 UTC 2009"
  end
+
+ it "should parse the categories" do
+ @entry.categories.should == ['Ruby', 'Another Category']
+ end
  end
@@ -14,4 +14,32 @@ describe Feedzirra::FeedUtilities do
  time.to_s.should == "Wed Feb 20 18:05:00 UTC 2008"
  end
  end
+
+ describe "sanitizing" do
+ before(:each) do
+ @feed = Feedzirra::Feed.parse(sample_atom_feed)
+ @entry = @feed.entries.first
+ end
+
+ it "should provide a sanitized title" do
+ new_title = "<script>" + @entry.title
+ @entry.title = new_title
+ @entry.sanitized.title.should == Dryopteris.sanitize(new_title)
+ end
+
+ it "should sanitize things in place" do
+ @entry.title += "<script>"
+ @entry.author += "<script>"
+ @entry.content += "<script>"
+
+ cleaned_title = Dryopteris.sanitize(@entry.title)
+ cleaned_author = Dryopteris.sanitize(@entry.author)
+ cleaned_content = Dryopteris.sanitize(@entry.content)
+
+ @entry.sanitize!
+ @entry.title.should == cleaned_title
+ @entry.author.should == cleaned_author
+ @entry.content.should == cleaned_content
+ end
+ end
  end
@@ -5,29 +5,29 @@ describe Feedzirra::Feed do
  context "when there's an available parser" do
  it "should parse an rdf feed" do
  feed = Feedzirra::Feed.parse(sample_rdf_feed)
- feed.class.should == Feedzirra::RDF
  feed.title.should == "HREF Considered Harmful"
+ feed.entries.first.published.to_s.should == "Tue Sep 02 19:50:07 UTC 2008"
  feed.entries.size.should == 10
  end
 
  it "should parse an rss feed" do
  feed = Feedzirra::Feed.parse(sample_rss_feed)
- feed.class.should == Feedzirra::RSS
  feed.title.should == "Tender Lovemaking"
+ feed.entries.first.published.to_s.should == "Thu Dec 04 17:17:49 UTC 2008"
  feed.entries.size.should == 10
  end
 
  it "should parse an atom feed" do
  feed = Feedzirra::Feed.parse(sample_atom_feed)
- feed.class.should == Feedzirra::Atom
  feed.title.should == "Amazon Web Services Blog"
+ feed.entries.first.published.to_s.should == "Fri Jan 16 18:21:00 UTC 2009"
  feed.entries.size.should == 10
  end
 
  it "should parse an feedburner atom feed" do
  feed = Feedzirra::Feed.parse(sample_feedburner_atom_feed)
- feed.class.should == Feedzirra::AtomFeedBurner
  feed.title.should == "Paul Dix Explains Nothing"
+ feed.entries.first.published.to_s.should == "Thu Jan 22 15:50:22 UTC 2009"
  feed.entries.size.should == 5
  end
  end
@@ -42,8 +42,8 @@ describe Feedzirra::Feed do
 
  it "should parse an feedburner rss feed" do
  feed = Feedzirra::Feed.parse(sample_rss_feed_burner_feed)
- feed.class.should == Feedzirra::RDF
  feed.title.should == "Sam Harris: Author, Philosopher, Essayist, Atheist"
+ feed.entries.first.published.to_s.should == "Tue Jan 13 17:20:28 UTC 2009"
  feed.entries.size.should == 10
  end
  end
@@ -57,12 +57,12 @@ describe Feedzirra::Feed do
  Feedzirra::Feed.determine_feed_parser_for_xml(sample_feedburner_atom_feed).should == Feedzirra::AtomFeedBurner
  end
 
- it "should return the Feedzirra::RDF class for an rdf/rss 1.0 feed" do
- Feedzirra::Feed.determine_feed_parser_for_xml(sample_rdf_feed).should == Feedzirra::RDF
+ it "should return the Feedzirra::RSS class for an rdf/rss 1.0 feed" do
+ Feedzirra::Feed.determine_feed_parser_for_xml(sample_rdf_feed).should == Feedzirra::RSS
  end
 
- it "should return the Feedzirra::RDF class for an rss feedburner feed" do
- Feedzirra::Feed.determine_feed_parser_for_xml(sample_rss_feed_burner_feed).should == Feedzirra::RDF
+ it "should return the Feedzirra::RSS class for an rss feedburner feed" do
+ Feedzirra::Feed.determine_feed_parser_for_xml(sample_rss_feed_burner_feed).should == Feedzirra::RSS
  end
 
  it "should return the Feedzirra::RSS object for an rss 2.0 feed" do
@@ -113,7 +113,7 @@ describe Feedzirra::Feed do
  describe "fetching feeds" do
  before(:each) do
  @paul_feed_url = "http://feeds.feedburner.com/PaulDixExplainsNothing"
- @trotter_feed_url = "http://feeds.feedburner.com/trottercashion"
+ @trotter_feed_url = "http://feeds2.feedburner.com/trottercashion"
  end
 
  describe "handling many feeds" do
@@ -139,6 +139,11 @@ describe Feedzirra::Feed do
  results[@paul_feed_url].should =~ /Paul Dix/
  results[@trotter_feed_url].should =~ /Trotter Cashion/
  end
+
+ it "should always return a hash when passed an array" do
+ results = Feedzirra::Feed.fetch_raw([@paul_feed_url])
+ results.class.should == Hash
+ end
  end
 
  describe "#fetch_and_parse" do
@@ -169,6 +174,11 @@ describe Feedzirra::Feed do
  feeds[@trotter_feed_url].feed_url.should == @trotter_feed_url
  end
 
+ it "should always return a hash when passed an array" do
+ feeds = Feedzirra::Feed.fetch_and_parse([@paul_feed_url])
+ feeds.class.should == Hash
+ end
+
  it "should yield the url and feed object to a :on_success lambda" do
  successful_call_mock = mock("successful_call_mock")
  successful_call_mock.should_receive(:call)
@@ -125,8 +125,8 @@ describe Feedzirra::FeedUtilities do
  @new_entry.url = "http://pauldix.net/new.html"
  @new_entry.published = (Time.now + 10).to_s
  @feed.entries << @old_entry
- @updated_feed.entries << @old_entry
  @updated_feed.entries << @new_entry
+ @updated_feed.entries << @old_entry
  end
 
  it "should update last-modified from the latest entry date" do
@@ -30,4 +30,8 @@ describe Feedzirra::RSSEntry do
  it "should parse the published date" do
  @entry.published.to_s.should == "Thu Dec 04 17:17:49 UTC 2008"
  end
+
+ it "should parse the categories" do
+ @entry.categories.should == ['computadora', 'nokogiri', 'rails']
+ end
  end
@@ -5,10 +5,11 @@ describe Feedzirra::RSS do
  it "should return true for an RSS feed" do
  Feedzirra::RSS.should be_able_to_parse(sample_rss_feed)
  end
-
- it "should return false for an rdf feed" do
- Feedzirra::RSS.should_not be_able_to_parse(sample_rdf_feed)
- end
+
+ # this is no longer true. combined rdf and rss into one
+ # it "should return false for an rdf feed" do
+ # Feedzirra::RSS.should_not be_able_to_parse(sample_rdf_feed)
+ # end
 
  it "should return false for an atom feed" do
  Feedzirra::RSS.should_not be_able_to_parse(sample_atom_feed)
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: pauldix-feedzirra
  version: !ruby/object:Gem::Version
- version: 0.0.1
+ version: 0.0.2
  platform: ruby
  authors:
  - Paul Dix
@@ -9,11 +9,12 @@ autorequire:
  bindir: bin
  cert_chain: []
 
- date: 2009-01-22 00:00:00 -08:00
+ date: 2009-02-19 00:00:00 -08:00
  default_executable:
  dependencies:
  - !ruby/object:Gem::Dependency
  name: nokogiri
+ type: :runtime
  version_requirement:
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
@@ -23,6 +24,7 @@ dependencies:
  version:
  - !ruby/object:Gem::Dependency
  name: pauldix-sax-machine
+ type: :runtime
  version_requirement:
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
@@ -32,6 +34,7 @@ dependencies:
  version:
  - !ruby/object:Gem::Dependency
  name: taf2-curb
+ type: :runtime
  version_requirement:
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
@@ -39,8 +42,19 @@ dependencies:
  - !ruby/object:Gem::Version
  version: 0.2.3
  version:
+ - !ruby/object:Gem::Dependency
+ name: builder
+ type: :runtime
+ version_requirement:
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 2.1.2
+ version:
  - !ruby/object:Gem::Dependency
  name: activesupport
+ type: :runtime
  version_requirement:
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
@@ -48,6 +62,16 @@ dependencies:
  - !ruby/object:Gem::Version
  version: 2.0.0
  version:
+ - !ruby/object:Gem::Dependency
+ name: mdalessio-dryopteris
+ type: :runtime
+ version_requirement:
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 0.0.0
+ version:
  description:
  email: paul@pauldix.net
  executables: []
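For readers less used to the serialized YAML form, the two runtime dependencies added in this release (`builder >= 2.1.2` and `mdalessio-dryopteris >= 0.0.0`) correspond roughly to the declarations below. This is a sketch and not the project's actual gemspec: the spec name and summary are placeholders, and the other dependencies are left out because their full requirements are not visible in the hunks.

```ruby
require 'rubygems'

# Hypothetical spec that declares only the two newly added dependencies.
spec = Gem::Specification.new do |s|
  s.name    = 'feedzirra-sketch' # placeholder name, not the real gem
  s.version = '0.0.2'
  s.summary = 'dependency sketch'
  s.add_runtime_dependency 'builder', '>= 2.1.2'
  s.add_runtime_dependency 'mdalessio-dryopteris', '>= 0.0.0'
end

spec.runtime_dependencies.each do |dep|
  puts "#{dep.name} (#{dep.requirement})"
end
```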