jsl-feedzirra 0.0.12.7 → 0.0.12.8

data/README.rdoc CHANGED
@@ -3,50 +3,25 @@
 === Description
 
 Feedzirra is a feed library that is designed to get and update many feeds as quickly as possible. This includes using libcurl-multi through the
-taf2-curb[link:http://github.com/taf2/curb/tree/master] gem for faster http gets, and libxml through nokogiri[link:http://github.com/tenderlove/nokogiri/tree/master]
-and sax-machine[link:http://github.com/pauldix/sax-machine/tree/master] for faster parsing.
+taf2-curb[link:http://github.com/taf2/curb/tree/master] gem for faster http gets, and libxml through
+nokogiri[link:http://github.com/tenderlove/nokogiri/tree/master] and sax-machine[link:http://github.com/pauldix/sax-machine/tree/master] for
+faster parsing.
 
-It allows for easy customization of feed parsing options through the definition of custom parsing classes, and allows you to take as little or as much control as you want in updating feeds. Feedzirra
-makes it easy to figure out which content in feeds is new by providing simple 'backends' so that Feedzirra can track the last contents fetched from a particular feed. Out of the box, Feedzirra can
-store this information in the filesystem, Memcached or Tokyo Cabinet. If you want to keep track of new or updated feeds on your own, just use the default backend which will will let you set options
-for conditional fetching of feeds without the help of Feedzirra.
+It allows for easy customization of feed parsing options through the definition of custom parsing classes, and allows you to take as little or as
+much control as you want in updating feeds. Feedzirra makes it easy to figure out which content in feeds is new by storing the previous retrieval
+of a feed in a key-value store. Feedzirra uses the the "moneta" gem, which is a unified interface to key-value storage systems, in order to provide
+access to many different types of stores depending on your requirements.
 
 === Installation
 
-For now Feedzirra exists only on github. It also has a few gem requirements that are only on github. Before you start you need to have libcurl[link:http://curl.haxx.se/] and
-libxml[link:http://xmlsoft.org/] installed. If you're on Leopard you have both. Otherwise, you'll need to grab them. Once you've got those libraries, these are the gems that get
-used: nokogiri, pauldix-sax-machine, taf2-curb (note that this is a fork that lives on github and not the Ruby Forge version of curb), and pauldix-feedzirra. The feedzirra gemspec has all
-the dependencies so you should be able to get up and running with the standard github gem install routine:
+For now Feedzirra exists only on github. It also has a few gem requirements that are only on github. Before you start you need to have
+libcurl[link:http://curl.haxx.se/] and libxml[link:http://xmlsoft.org/] installed. If you're on Leopard you have both. Otherwise, you'll need to
+grab them. Once you've got those libraries, these are the gems that get used: nokogiri, pauldix-sax-machine, taf2-curb (note that this is a fork
+that lives on github and not the Ruby Forge version of curb), and pauldix-feedzirra. The feedzirra gemspec has all the dependencies so you should
+be able to get up and running with the standard github gem install routine:
 
-gem sources -a http://gems.github.com # if you haven't already
-gem install pauldix-feedzirra
-
-==== Troubleshooting Installation
-
-*NOTE:*Some people have been reporting a few issues related to installation. First, the Ruby Forge version of curb is not what you want. It will not work. Nor will the curl-multi gem that lives on
-Ruby Forge. You have to get the taf2-curb[link:http://github.com/taf2/curb/tree/master] fork installed.
-
-If you see this error when doing a require:
-
-/Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- curb_core (LoadError)
-
-It means that the taf2-curb gem didn't build correctly. To resolve this you can do a git clone git://github.com/taf2/curb.git then run rake gem in the curb directory, then sudo gem
-install pkg/curb-0.2.4.0.gem. After that you should be good.
-
-If you see something like this when trying to run it:
-
-NoMethodError: undefined method `on_success' for #<Curl::Easy:0x1182724>
-from ./lib/feedzirra/feed.rb:88:in `add_url_to_multi'
-
-This means that you are requiring curl-multi or the Ruby Forge version of Curb somewhere. You can't use those and need to get the taf2 version up and running.
-
-If you're on Debian or Ubuntu and getting errors while trying to install the taf2-curb gem, it could be because you don't have the latest version of libcurl installed. Do this to fix:
-
-sudo apt-get install libcurl4-gnutls-dev
-
-Another problem could be if you are running Mac Ports and you have libcurl installed through there. You need to uninstall it for curb to work! The version in Mac Ports is old and doesn't play nice with curb. If you're running Leopard, you can just uninstall and you should be golden. If you're on an older version of OS X, you'll then need to {download curl}[http://curl.haxx.se/download.html] and build from source. Then you'll have to install the taf2-curb gem again. You might have to perform the step above.
-
-If you're still having issues, please let me know on the mailing list. Also, {Todd Fisher (taf2)}[link:http://github.com/taf2] is working on fixing the gem install. Please send him a full error report.
+gem sources -a http://gems.github.com # if you haven't already
+gem install pauldix-feedzirra
 
 === Usage
 
@@ -64,7 +39,7 @@ allows configuration of the backend store, as well as fetching options for the l
 an example of configuration with the Memcache store connected to Tokyo Tyrant (the front-end for Tokyo Cabinet):
 
   reader = Feedzirra::Reader.new('http://www.pauldix.net/atom.xml', :backend =>
-    { :class => Feedzirra::Backend::Memcache, :port => 1978, :server => 'localhost' })
+    { :moneta_klass => Moneta::Memcache, :port => 1978, :server => 'localhost' })
 
 Other options that may be put in the options hash follow the original API described below.
 
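One consequence of the Reader constructor in this release is worth spelling out: DEFAULTS.merge is a shallow merge, so a caller-supplied :backend hash replaces the default one entirely rather than being combined with it. The sketch below uses plain Ruby hashes only. split_backend_options is a hypothetical helper written for illustration, not part of Feedzirra; reject stands in for ActiveSupport's except; and the store class is given as a string so the sketch runs without the moneta gem.

```ruby
# Mirrors the shape of the DEFAULTS constant shown in this diff.
DEFAULTS = {
  :backend => {
    :moneta_klass => 'Moneta::Memory'
  }
}

# Hypothetical helper: separate the store class from the options meant for it.
def split_backend_options(options)
  backend = DEFAULTS.merge(options)[:backend]   # shallow merge, top-level keys only
  klass   = backend[:moneta_klass]
  rest    = backend.reject { |key, _| key == :moneta_klass }
  [klass, rest]
end

# No :backend given: the default store class is used, with no extra options.
p split_backend_options({})

# A :backend hash without :moneta_klass drops the default class entirely.
p split_backend_options(:backend => { :port => 1978 })

# Naming the class explicitly, as the README example does, avoids that.
p split_backend_options(:backend => { :moneta_klass => 'Moneta::Memcache', :port => 1978 })
```

Under these assumptions, any :backend hash passed to the Reader should name its store class explicitly.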
@@ -150,13 +125,43 @@ There's also a {benchmark that shows the results of using Feedzirra to perform u
 
 === Discussion
 
-I'd like feedback on the api and any bugs encountered on feeds in the wild. I've set up a {google group here}[http://groups.google.com/group/feedzirra].
+I'd like feedback on the api and any bugs encountered on feeds in the wild. I've set up a
+{google group here}[http://groups.google.com/group/feedzirra].
+
+==== Troubleshooting Installation
+
+*NOTE:*Some people have been reporting a few issues related to installation. First, the Ruby Forge version of curb is not what you want. It will not work. Nor will the curl-multi gem that lives on
+Ruby Forge. You have to get the taf2-curb[link:http://github.com/taf2/curb/tree/master] fork installed.
+
+If you see this error when doing a require:
+
+/Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- curb_core (LoadError)
+
+It means that the taf2-curb gem didn't build correctly. To resolve this you can do a git clone git://github.com/taf2/curb.git then run rake gem in the curb directory, then sudo gem
+install pkg/curb-0.2.4.0.gem. After that you should be good.
+
+If you see something like this when trying to run it:
+
+NoMethodError: undefined method `on_success' for #<Curl::Easy:0x1182724>
+from ./lib/feedzirra/feed.rb:88:in `add_url_to_multi'
+
+This means that you are requiring curl-multi or the Ruby Forge version of Curb somewhere. You can't use those and need to get the taf2 version up and running.
+
+If you're on Debian or Ubuntu and getting errors while trying to install the taf2-curb gem, it could be because you don't have the latest version of libcurl installed. Do this to fix:
+
+sudo apt-get install libcurl4-gnutls-dev
+
+Another problem could be if you are running Mac Ports and you have libcurl installed through there. You need to uninstall it for curb to work! The version in Mac Ports is old and doesn't play nice with curb. If you're running Leopard, you can just uninstall and you should be golden. If you're on an older version of OS X, you'll then need to {download curl}[http://curl.haxx.se/download.html] and build from source. Then you'll have to install the taf2-curb gem again. You might have to perform the step above.
+
+If you're still having issues, please let me know on the mailing list. Also, {Todd Fisher (taf2)}[link:http://github.com/taf2] is working on fixing the gem install. Please send him a full error report.
 
 === TODO
 
-This thing needs to hammer on many different feeds in the wild. I'm sure there will be bugs. I want to find them and crush them. I didn't bother using the test suite for feedparser. i wanted to start fresh.
+This thing needs to hammer on many different feeds in the wild. I'm sure there will be bugs. I want to find them and crush them. I didn't bother
+using the test suite for feedparser. i wanted to start fresh.
 
 Here are some more specific TODOs.
+
 * Make a feedzirra-rails gem to integrate feedzirra seamlessly with Rails and ActiveRecord.
 * Add support for authenticated feeds.
 * Create a super sweet DSL for defining new parsers.
data/lib/feedzirra.rb CHANGED
@@ -7,20 +7,18 @@ require 'curb'
 require 'sax-machine'
 require 'dryopteris'
 require 'uri'
+
 require 'active_support/basic_object'
 require 'active_support/core_ext/object'
 require 'active_support/core_ext/time'
 
+require 'hashback'
+
 require 'core_ext/date'
 require 'core_ext/string'
 require 'core_ext/array'
 require 'core_ext/hash'
 
-require 'feedzirra/backend/dev_null'
-require 'feedzirra/backend/filesystem'
-require 'feedzirra/backend/memcache'
-require 'feedzirra/backend/memory'
-
 require 'feedzirra/http_multi'
 
 require 'feedzirra/parser/feed_utilities'
@@ -8,16 +8,19 @@ module Feedzirra
 
     DEFAULTS = {
       :backend => {
-        :class => Feedzirra::Backend::Memory
+        :moneta_klass => 'Moneta::Memory'
       }
     }
 
     def initialize(*args)
-      @options = DEFAULTS.merge(args.extract_options!)
-      @retrievables = args.flatten
-      @multi = Curl::Multi.new
-      @responses = { }
-      @backend = @options[:backend][:class].new(@options[:backend].except(:class))
+      @options = DEFAULTS.merge(args.extract_options!)
+      @retrievables = args.flatten
+      @multi = Curl::Multi.new
+      @responses = { }
+
+      @backend = HashBack::Backend.new( 'Feedzirra',
+        @options[:backend][:moneta_klass],
+        @options[:backend].except(:moneta_klass) )
     end
 
     # Prepares the curl object and calls #perform
@@ -43,7 +46,7 @@ module Feedzirra
         url = retrievable.feed_url
       else
         url = retrievable
-        retrievable = @backend.get(url) # Try to fetch the last retrieval from backend
+        retrievable = @backend[url] # Try to fetch the last retrieval from backend
       end
 
       easy = build_curl_easy(url, retrievable, retrievable_queue)
@@ -97,7 +100,7 @@ module Feedzirra
         set_updated_feed_entries!(retrievable, updated_feed)
       end
 
-      @backend.set(url, updated_feed)
+      @backend[url] = updated_feed
 
       responses[url] = updated_feed
       @options[:on_success].call(retrievable) if @options.has_key?(:on_success)
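The two hunks above swap @backend.get(url) and @backend.set(url, feed) for hash-style access. A minimal sketch of an object that would satisfy that calling convention follows. SketchBackend is a hypothetical stand-in written for illustration and is not HashBack's or Moneta's implementation; it assumes values are serialized with Marshal, as the removed Feedzirra backends did, and that keys are prefixed with a namespace.

```ruby
# Hypothetical stand-in for a namespaced key-value backend (illustration only).
class SketchBackend
  def initialize(namespace, store = {})
    @namespace = namespace
    @store     = store   # anything that responds to [] and []=
  end

  # Returns the last value stored for the URL, or nil when there is none.
  def [](url)
    raw = @store[key_for(url)]
    Marshal.load(raw) unless raw.nil?
  end

  # Serializes the value so the underlying store only has to hold strings.
  def []=(url, value)
    @store[key_for(url)] = Marshal.dump(value)
  end

  private

  # Prefix keys so several applications could share one store.
  def key_for(url)
    "#{@namespace}:#{url}"
  end
end

backend = SketchBackend.new('Feedzirra')
url     = 'http://www.pauldix.net/atom.xml'

previous = backend[url]            # nil on the first fetch
backend[url] = { :etag => 'abc' }  # remember the latest retrieval
previous = backend[url]            # the stored value on later fetches
```

Because the reader only calls [] and []=, any store exposing those two methods could in principle be substituted without touching the fetch logic.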
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: jsl-feedzirra
 version: !ruby/object:Gem::Version
-  version: 0.0.12.7
+  version: 0.0.12.8
 platform: ruby
 authors:
 - Paul Dix
@@ -12,6 +12,16 @@ cert_chain: []
 date: 2009-04-29 00:00:00 -07:00
 default_executable:
 dependencies:
+- !ruby/object:Gem::Dependency
+  name: hashback
+  type: :runtime
+  version_requirement:
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">"
+      - !ruby/object:Gem::Version
+        version: 0.0.0
+    version:
 - !ruby/object:Gem::Dependency
   name: nokogiri
   type: :runtime
@@ -111,10 +121,6 @@ files:
 - lib/feedzirra/parser/rss_entry.rb
 - lib/feedzirra/parser/feed_utilities.rb
 - lib/feedzirra/parser/feed_entry_utilities.rb
-- lib/feedzirra/backend/filesystem.rb
-- lib/feedzirra/backend/dev_null.rb
-- lib/feedzirra/backend/memcache.rb
-- lib/feedzirra/backend/memory.rb
 - README.rdoc
 - Rakefile
 - spec/spec.opts
@@ -131,10 +137,6 @@ files:
 - spec/feedzirra/parser/rss_entry_spec.rb
 - spec/feedzirra/feed_utilities_spec.rb
 - spec/feedzirra/feed_entry_utilities_spec.rb
-- spec/feedzirra/backend/dev_null_spec.rb
-- spec/feedzirra/backend/filesystem_spec.rb
-- spec/feedzirra/backend/memcache_spec.rb
-- spec/feedzirra/backend/memory_spec.rb
 has_rdoc: true
 homepage: http://github.com/pauldix/feedzirra
 post_install_message:
@@ -1,21 +0,0 @@
-module Feedzirra
-  module Backend
-
-    # Class exists for cases where you don't want Feedzirra to remember what
-    # it has fetched. If this backend is selected, user needs to pass arguments
-    # in the form of a Feed object to the Reader class to help Feedzirra know when
-    # a page hasn't changed, and which feed entries have been updated.
-    class DevNull
-
-      def initialize(options = { })
-      end
-
-      def get(url)
-      end
-
-      def set(url, result)
-      end
-
-    end
-  end
-end
@@ -1,37 +0,0 @@
-require 'md5'
-require 'uri'
-
-module Feedzirra
-  module Backend
-    class Filesystem
-
-      DEFAULTS = {
-        :path => File.expand_path(File.join(%w[ ~ / .feedzirra ]))
-      }
-
-      def initialize(options = { })
-        @options = DEFAULTS.merge(options)
-        initialize_store
-      end
-
-      def get(url)
-        f = filename_for(url)
-        Marshal.load(File.read(f)) if File.exist?(f)
-      end
-
-      def set(url, result)
-        File.open(filename_for(url), 'w') {|f| f.write(Marshal.dump(result)) }
-      end
-
-      private
-
-      def initialize_store
-        FileUtils.mkdir(@options[:path]) unless File.exist?(@options[:path])
-      end
-
-      def filename_for(url)
-        File.join(@options[:path], MD5.hexdigest(URI.parse(url).normalize.to_s))
-      end
-    end
-  end
-end
@@ -1,37 +0,0 @@
-require 'memcache'
-
-module Feedzirra
-  module Backend
-
-    # Can be used to set up Memcache, or clients able to speak the Memcache protocol such as
-    # Tokyo Tyrant, as a Feedzirra::Backend.
-    class Memcache
-      DEFAULTS = {
-        :server => 'localhost',
-        :port => '11211'
-      }
-
-      def initialize(options = { })
-        @options = DEFAULTS.merge(options)
-        @cache = MemCache.new([ @options[:server], @options[:port] ].join(':'), :namespace => 'Feedzirra')
-      end
-
-      def get(url)
-        res = @cache.get(key_for(url))
-        Marshal.load(res) unless res.nil?
-      end
-
-      def set(url, result)
-        @cache.set(key_for(url), Marshal.dump(result))
-      end
-
-      private
-
-      def key_for(url)
-        MD5.hexdigest(URI.parse(url).normalize.to_s)
-      end
-
-    end
-
-  end
-end
@@ -1,23 +0,0 @@
-module Feedzirra
-  module Backend
-
-    # Memory store uses a ruby Hash to store the results of feed fetches.
-    # It won't persist after the application exits.
-    class Memory
-
-      def initialize(options = { })
-        @store = { }
-        @options = options
-      end
-
-      def get(url)
-        @store[url]
-      end
-
-      def set(url, result)
-        @store[url] = result
-      end
-
-    end
-  end
-end
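Two of the removed backends derive their storage key from the MD5 digest of the normalized URL, using the old md5 library, which is no longer available on Ruby 1.9 and later. As a sketch of the same key derivation on a current Ruby, the block below uses Digest::MD5 from the standard library. The name key_for is borrowed from the removed Memcache backend; this standalone version is an illustration, not code from the gem.

```ruby
require 'digest/md5'
require 'uri'

# Same idea as the removed key_for / filename_for helpers, using Digest::MD5.
def key_for(url)
  Digest::MD5.hexdigest(URI.parse(url).normalize.to_s)
end

# URI#normalize lowercases the host, so URLs that differ only in host
# capitalization map to the same key.
a = key_for('http://www.PaulDix.net/atom.xml')
b = key_for('http://www.pauldix.net/atom.xml')
puts a == b
```

Hashing the normalized form keeps keys a fixed length and free of characters that a filesystem or cache might reject.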
@@ -1,35 +0,0 @@
-require File.join(File.dirname(__FILE__), %w[.. .. spec_helper])
-
-describe Feedzirra::Backend::DevNull do
-  before do
-    @backend = Feedzirra::Backend::DevNull.new
-  end
-
-  it_should_behave_like "all backends"
-
-  it "should initialize properly" do
-    Feedzirra::Backend::DevNull.new
-  end
-
-  describe "#set" do
-    it "should accept two arguments" do
-      lambda {
-        @backend.set('foo', 'nothing')
-      }.should_not raise_error
-    end
-  end
-
-  describe "#get" do
-    it "should accept one argument" do
-      lambda {
-        @backend.get('foo')
-      }.should_not raise_error
-    end
-
-    it "should return nil even after something is set" do
-      @backend.set('junk', 'something')
-      @backend.get('junk').should be_nil
-    end
-  end
-
-end
@@ -1,31 +0,0 @@
-require File.join(File.dirname(__FILE__), %w[.. .. spec_helper])
-
-describe Feedzirra::Backend::Filesystem do
-  before do
-    @backend = Feedzirra::Backend::Filesystem.new({:path => '/tmp'})
-  end
-
-  it_should_behave_like "all backends"
-
-  it "should accept a Hash of options for inialization" do
-    Feedzirra::Backend::Filesystem.new({:path => '/tmp'})
-  end
-
-  describe "#filename_for" do
-    it "should default to the users' home path plus the MD5 of the normalized URL" do
-      MD5.expects(:hexdigest).returns('bah')
-      uri = mock('uri')
-      uri.expects(:normalize).returns('foo')
-      URI.expects(:parse).returns(uri)
-      Feedzirra::Backend::Filesystem.new({:path => '/tmp'}).__send__(:filename_for, 'hey').should == '/tmp/bah'
-    end
-  end
-
-  describe "#initialize_store" do
-    it "should call File.mkdir if the directory does not exist" do
-      File.expects(:exist?).with('/tmp').returns(false)
-      FileUtils.expects(:mkdir).with('/tmp')
-      @backend.__send__(:initialize_store)
-    end
-  end
-end
@@ -1,25 +0,0 @@
-require File.join(File.dirname(__FILE__), %w[.. .. spec_helper])
-
-describe Feedzirra::Backend::Memcache do
-  before do
-    @backend = Feedzirra::Backend::Memcache.new({:server => 'localhost', :port => 11211})
-    @cache = mock('cache')
-    @backend.instance_variable_set(:@cache, @cache)
-  end
-
-  it_should_behave_like "all backends"
-
-  describe "#get" do
-    it "should call #get on the cache" do
-      @cache.expects(:get)
-      @backend.get('foo')
-    end
-  end
-
-  describe "#set" do
-    it "should call #set on the cache" do
-      @cache.expects(:set)
-      @backend.set('foo', 'hey')
-    end
-  end
-end
@@ -1,14 +0,0 @@
-require File.join(File.dirname(__FILE__), %w[.. .. spec_helper])
-
-describe Feedzirra::Backend::Memory do
-  before do
-    @backend = Feedzirra::Backend::Memory.new
-  end
-
-  it_should_behave_like "all backends"
-
-  it "should initialize properly" do
-    Feedzirra::Backend::Memory.new
-  end
-
-end