feedbagtoo 0.7.2

data/COPYING ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (C) 2012 David Moreno <david@axiombox.com>
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/ChangeLog ADDED
@@ -0,0 +1,29 @@
1
+ * 0.9 - Fri Mar 16 10:59:00 EDT 2012
2
+ - Changed license to MIT.
3
+
4
+ * 0.6 - Fri Mar 5 20:10:33 EST 2010
5
+ - Added bin/feedbag.
6
+ - Removed the args[:narrow] option, not really needed.
7
+ - Handle case where feed URLs contain GET parameters; add tests
8
+ by Patrick Reagan <patrick.reagan@viget.com>.
9
+
10
+ * 0.5.99 - Tue May 12 12:52:22 EDT 2009
11
+ - Added rails/init.rb to load easily on a Rails app.
12
+
13
+ * 0.5.13.1 - Wed Apr 22 11:16:19 EDT 2009
14
+ - Changed args on find() from nil to {}
15
+
16
+ * 0.5.13 - Wed Apr 22 11:12:40 EDT 2009
17
+ - Added :narrow option so find() skips feed_validate and A links.
18
+
19
+ * 0.5.12 - Fri Mar 20 12:34:48 EDT 2009
20
+ - Added support for "feed://" URLs
21
+
22
+ * 0.5.11 - Sat Mar 7 17:22:30 EST 2009
23
+ - Benchmark against Rfeedfinder added.
24
+
25
+ * 0.5.10 - Wed Mar 4 13:32:33 EST 2009
26
+ - Feeds whose URLs contained query string arguments were not being
27
+ auto-discovered -- fixed
28
+
29
+ ** For previous changes, see the git log
data/Gemfile ADDED
@@ -0,0 +1,22 @@
1
+ # Clean up if needed
2
+ # rm -rf ~/.bundle/ ~/.gem/; rm -rf $GEM_HOME/bundler/ $GEM_HOME/cache/bundler/; rm -rf .bundle/; rm -rf vendor/cache/; rm -rf Gemfile.lock
3
+
4
+ source "http://rubygems.org"
5
+
6
+
7
+ # Add dependencies to develop your gem here.
8
+ # Include everything needed to run rake, tests, features, etc.
9
+ group :development, :test do
10
+ gem 'growl'
11
+ gem "rdoc", "~> 3.12"
12
+ gem "bundler", ">=1.0.0"
13
+ gem "jeweler", "~> 1.8.3"
14
+ gem "active_support"
15
+ gem "mocha"
16
+ gem "hpricot"
17
+ if RUBY_VERSION < '1.9'
18
+ gem "ruby-debug"
19
+ else
20
+ gem 'debugger', '~> 1.1.4'
21
+ end
22
+ end
data/Gemfile.lock ADDED
@@ -0,0 +1,43 @@
1
+ GEM
2
+ remote: http://rubygems.org/
3
+ specs:
4
+ active_support (3.0.0)
5
+ activesupport (= 3.0.0)
6
+ activesupport (3.0.0)
7
+ columnize (0.3.6)
8
+ git (1.2.5)
9
+ growl (1.0.3)
10
+ hpricot (0.8.6)
11
+ jeweler (1.8.4)
12
+ bundler (~> 1.0)
13
+ git (>= 1.2.5)
14
+ rake
15
+ rdoc
16
+ json (1.7.3)
17
+ linecache (0.46)
18
+ rbx-require-relative (> 0.0.4)
19
+ metaclass (0.0.1)
20
+ mocha (0.12.0)
21
+ metaclass (~> 0.0.1)
22
+ rake (0.9.2.2)
23
+ rbx-require-relative (0.0.9)
24
+ rdoc (3.12)
25
+ json (~> 1.4)
26
+ ruby-debug (0.10.4)
27
+ columnize (>= 0.1)
28
+ ruby-debug-base (~> 0.10.4.0)
29
+ ruby-debug-base (0.10.4)
30
+ linecache (>= 0.3)
31
+
32
+ PLATFORMS
33
+ ruby
34
+
35
+ DEPENDENCIES
36
+ active_support
37
+ bundler (>= 1.0.0)
38
+ growl
39
+ hpricot
40
+ jeweler (~> 1.8.3)
41
+ mocha
42
+ rdoc (~> 3.12)
43
+ ruby-debug
data/README.markdown ADDED
@@ -0,0 +1,101 @@
1
+ Feedbag
2
+ =======
3
+ Forked version of feedbag that returns title, description and url.
4
+
5
+ Feedbag is a feed auto-discovery Ruby library. You don't need to know more about it. It is said to be:
6
+
7
+ > Ruby's favorite auto-discovery tool/library!
8
+
9
+ ### Quick synopsis
10
+
11
+ >> require "rubygems"
12
+ => true
13
+ >> require "feedbag"
14
+ => true
15
+ >> Feedbag.find "log.damog.net"
16
+ => ["http://feeds.feedburner.com/TeoremaDelCerdoInfinito", "http://log.damog.net/comments/feed/"]
17
+ >> Feedbag.feed?("google.com")
18
+ => false
19
+ >> Feedbag.feed?("http://planet.debian.org/rss20.xml")
20
+ => true
21
+
22
+ ### Installation
23
+
24
+ $ sudo gem install damog-feedbag -s http://gems.github.com/
25
+
26
+ Or just grab feedbag.rb and use it on your own project:
27
+
28
+ $ wget http://github.com/damog/feedbag/raw/master/lib/feedbag.rb
29
+
30
+ ## Tutorial
31
+
32
+ So you want to know more about it.
33
+
34
+ If the URL passed to the find method is itself a feed, only that feed URL is returned.
35
+
36
+ >> Feedbag.find "github.com/damog.atom"
37
+ => ["http://github.com/damog.atom"]
38
+ >>
39
+
40
+ Otherwise, it always returns feeds found in LINK tags first and feeds found in A (anchor) tags later. Among the A feeds, those hosted on the same host as the given URL take priority:
41
+
42
+ >> Feedbag.find "http://ve.planetalinux.org"
43
+ => ["http://feedproxy.google.com/PlanetaLinuxVenezuela", "http://rendergraf.wordpress.com/feed/", "http://rootweiller.wordpress.com/feed/", "http://skatox.com/blog/feed/", "http://kodegeek.com/atom.xml", "http://blog.0x29.com.ve/?feed=rss2&cat=8"]
44
+ >>
45
+
46
+ In your application you will usually want only the first element, or the first few, of the array:
47
+
48
+ >> Feedbag.find("planet.debian.org").first(3)
49
+ => ["http://planet.debian.org/rss10.xml", "http://planet.debian.org/rss20.xml", "http://planet.debian.org/atom.xml"]
50
+ >>
51
+
52
+ (Try running that same example without the "first" method. That example's host is a blog aggregator, so it has over a hundred feed URLs:)
53
+
54
+ >> Feedbag.find("planet.debian.org").size
55
+ => 104
56
+ >>
57
+
58
+ Feedbag will find them all, but it returns the most important ones as the first elements of the returned array.
59
+
60
+ >> Feedbag.find("cnn.com")
61
+ => ["http://rss.cnn.com/rss/cnn_topstories.rss", "http://rss.cnn.com/rss/cnn_latest.rss", "http://rss.cnn.com/services/podcasting/robinmeade/rss.xml"]
62
+ >>
63
+
64
+ ### Why should you use it?
65
+
66
+ - Because it's cool.
67
+ - Because its only dependency is [Hpricot](https://code.whytheluckystiff.net/hpricot/).
68
+ - Because it follows modern feed filename conventions (like those used by WordPress blogs, Blogger, etc.).
69
+ - Because it's a single file you can embed easily in your application.
70
+ - Because it passes most of Mark Pilgrim's [Atom auto-discovery test suite](http://diveintomark.org/tests/client/autodiscovery/). It doesn't pass them all because some of those tests are broken (citation needed).
71
+
72
+ ### Why did I build it?
73
+
74
+ - Because I liked Benjamin Trott's [Feed::Find](http://search.cpan.org/~btrott/Feed-Find-0.06/lib/Feed/Find.pm).
75
+ - Because I thought it would be good to have Feed::Find's functionality in Ruby.
76
+ - Because I thought it was going to be easy to maintain.
77
+ - Because I was going to use it on [rFeed](http://github.com/damog/rfeed).
78
+ - And finally, because I didn't know [rfeedfinder](http://rfeedfinder.rubyforge.org/) existed :-)
79
+
80
+ ### Bugs
81
+
82
+ Please report bugs to [rt@support.axiombox.com](mailto:rt@support.axiombox.com) or directly to the author.
83
+
84
+ ### Contribute
85
+
86
+ > git clone git://github.com/damog/feedbag.git
87
+
88
+ ...patch, build, hack and make pull requests. I'll be glad.
89
+
90
+ ### Author
91
+
92
+ [David Moreno](http://damog.net/) <[david@axiombox.com](mailto:david@axiombox.com)>.
93
+
94
+ ### Copyright
95
+
96
+ This is free software. See [COPYING](http://github.com/damog/feedbag/master/COPYING) for more information.
97
+
98
+ ### Thanks
99
+
100
+ [Raquel](http://maggit.net), for making [Axiombox](http://axiombox.com) and most of my dreams possible. Also, [GitHub](http://github.com) for making a nice code sharing service that doesn't suck.
101
+
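The synopsis in the README prints plain URL strings, but `lib/feedbag.rb` in this fork builds each result as a `Feed` struct with `url`, `title`, `human_url` and `description` members. The following is a minimal sketch of consuming such results, assuming `find` returns an array of these structs. The sample entries are made-up placeholders, not real output.

```ruby
# Mirrors the Struct defined in lib/feedbag.rb of this fork.
Feed = Struct.new(:url, :title, :human_url, :description)

# Hypothetical stand-in for the array Feedbag.find would return;
# the URLs and titles here are illustrative only.
feeds = [
  Feed.new("http://example.com/feed/", "Example Blog", "http://example.com", "Example Blog"),
  Feed.new("http://example.com/comments/feed/", "Comments", "http://example.com", "Example Blog"),
]

# Most callers only need the highest-priority feed.
primary = feeds.first
puts "#{primary.title}: #{primary.url}"

# Recover a plain list of URL strings, as in the README examples.
urls = feeds.map(&:url)
puts urls.inspect
```

Mapping over `url` keeps code written against the original string-returning feedbag working with only a small change.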
data/Rakefile ADDED
@@ -0,0 +1,33 @@
1
+ # encoding: utf-8
2
+
3
+ require 'rubygems'
4
+ require 'bundler'
5
+ require 'rake/testtask'
6
+ begin
7
+ Bundler.setup(:default, :development)
8
+ rescue Bundler::BundlerError => e
9
+ $stderr.puts e.message
10
+ $stderr.puts "Run `bundle install` to install missing gems"
11
+ exit e.status_code
12
+ end
13
+ require 'rake'
14
+
15
+ require 'jeweler'
16
+ Jeweler::Tasks.new do |gem|
17
+ gem.name = "feedbagtoo"
18
+ gem.summary = "Fork of the feedbag gem that returns title along with url."
19
+ gem.description = "This gem will return title and url for each feed discovered at a given url"
20
+ gem.email = "justin@tatemae.com"
21
+ gem.homepage = "http://github.com/tatemae/feedbagtoo"
22
+ gem.authors = ["Axiombox", "David Moreno", "Joel Duffin", "Justin Ball", "Fabien Penso"]
23
+ end
24
+ Jeweler::RubygemsDotOrgTasks.new
25
+
26
+
27
+ task :default => :test
28
+
29
+ Rake::TestTask.new do |t|
30
+ t.libs << 'test'
31
+ t.test_files = FileList["test/feedbag_test.rb"]
32
+ t.verbose = true
33
+ end
data/TODO ADDED
@@ -0,0 +1 @@
1
+ - Document Feedbag.feed?
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.7.2
data/benchmark/rfeedfinder_benchmark.rb ADDED
@@ -0,0 +1,30 @@
1
+ require "benchmark"
2
+ require "rubygems"
3
+
4
+ sites = [
5
+ "log.damog.net",
6
+ "http://cnn.com",
7
+ "scripting.com",
8
+ "mx.planetalinux.org",
9
+ "http://feedproxy.google.com/UniversoPlanetaLinux",
10
+ ]
11
+
12
+ Benchmark.bm do |x|
13
+ sites.each do |site|
14
+ puts "#{site}:"
15
+
16
+ puts " feedbag"
17
+ x.report {
18
+ require 'feedbag'
19
+ Feedbag.find(site)
20
+ }
21
+
22
+ puts " rfeedfinder"
23
+ x.report {
24
+ require 'rfeedfinder'
25
+ Rfeedfinder.feed(site)
26
+ }
27
+
28
+ end
29
+ end
30
+
data/bin/feedbag ADDED
@@ -0,0 +1,28 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "rubygems"
4
+ require "feedbag"
5
+
6
+ def usage
7
+ %Q{
8
+ #{$0} <url 1> [<url 2> <url 3> ... <url n>]
9
+ }
10
+ end
11
+
12
+ if ARGV.empty?
13
+ puts usage
14
+ exit 1
15
+ end
16
+
17
+ ARGV.each do |url|
18
+ puts "== #{url}:"
19
+ feeds = Feedbag.find url
20
+ if feeds.empty?
21
+ puts " no feeds found!"
22
+ else
23
+ feeds.each do |f|
24
+ puts " - #{f}"
25
+ end
26
+ end
27
+ end
28
+
data/feedbagtoo.gemspec ADDED
@@ -0,0 +1,80 @@
1
+ # Generated by jeweler
2
+ # DO NOT EDIT THIS FILE DIRECTLY
3
+ # Instead, edit Jeweler::Tasks in Rakefile, and run 'rake gemspec'
4
+ # -*- encoding: utf-8 -*-
5
+
6
+ Gem::Specification.new do |s|
7
+ s.name = %q{feedbagtoo}
8
+ s.version = "0.7.2"
9
+
10
+ s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
11
+ s.authors = ["Axiombox", "David Moreno", "Joel Duffin", "Justin Ball", "Fabien Penso"]
12
+ s.date = %q{2012-07-18}
13
+ s.default_executable = %q{feedbag}
14
+ s.description = %q{This gem will return title and url for each feed discovered at a given url}
15
+ s.email = %q{justin@tatemae.com}
16
+ s.executables = ["feedbag"]
17
+ s.extra_rdoc_files = [
18
+ "ChangeLog",
19
+ "README.markdown",
20
+ "TODO"
21
+ ]
22
+ s.files = [
23
+ "COPYING",
24
+ "ChangeLog",
25
+ "Gemfile",
26
+ "Gemfile.lock",
27
+ "README.markdown",
28
+ "Rakefile",
29
+ "TODO",
30
+ "VERSION",
31
+ "benchmark/rfeedfinder_benchmark.rb",
32
+ "bin/feedbag",
33
+ "feedbagtoo.gemspec",
34
+ "index.html",
35
+ "lib/feedbag.rb",
36
+ "lib/feedbagtoo.rb",
37
+ "rails/init.rb",
38
+ "test/atom_autodiscovery_test.rb",
39
+ "test/feedbag_test.rb",
40
+ "test/test_helper.rb"
41
+ ]
42
+ s.homepage = %q{http://github.com/tatemae/feedbagtoo}
43
+ s.require_paths = ["lib"]
44
+ s.rubygems_version = %q{1.6.2}
45
+ s.summary = %q{Fork of the feedbag gem that returns title along with url.}
46
+
47
+ if s.respond_to? :specification_version then
48
+ s.specification_version = 3
49
+
50
+ if Gem::Version.new(Gem::VERSION) >= Gem::Version.new('1.2.0') then
51
+ s.add_development_dependency(%q<growl>, [">= 0"])
52
+ s.add_development_dependency(%q<rdoc>, ["~> 3.12"])
53
+ s.add_development_dependency(%q<bundler>, [">= 1.0.0"])
54
+ s.add_development_dependency(%q<jeweler>, ["~> 1.8.3"])
55
+ s.add_development_dependency(%q<active_support>, [">= 0"])
56
+ s.add_development_dependency(%q<mocha>, [">= 0"])
57
+ s.add_development_dependency(%q<hpricot>, [">= 0"])
58
+ s.add_development_dependency(%q<ruby-debug>, [">= 0"])
59
+ else
60
+ s.add_dependency(%q<growl>, [">= 0"])
61
+ s.add_dependency(%q<rdoc>, ["~> 3.12"])
62
+ s.add_dependency(%q<bundler>, [">= 1.0.0"])
63
+ s.add_dependency(%q<jeweler>, ["~> 1.8.3"])
64
+ s.add_dependency(%q<active_support>, [">= 0"])
65
+ s.add_dependency(%q<mocha>, [">= 0"])
66
+ s.add_dependency(%q<hpricot>, [">= 0"])
67
+ s.add_dependency(%q<ruby-debug>, [">= 0"])
68
+ end
69
+ else
70
+ s.add_dependency(%q<growl>, [">= 0"])
71
+ s.add_dependency(%q<rdoc>, ["~> 3.12"])
72
+ s.add_dependency(%q<bundler>, [">= 1.0.0"])
73
+ s.add_dependency(%q<jeweler>, ["~> 1.8.3"])
74
+ s.add_dependency(%q<active_support>, [">= 0"])
75
+ s.add_dependency(%q<mocha>, [">= 0"])
76
+ s.add_dependency(%q<hpricot>, [">= 0"])
77
+ s.add_dependency(%q<ruby-debug>, [">= 0"])
78
+ end
79
+ end
80
+
data/index.html ADDED
@@ -0,0 +1,115 @@
1
+ <h1>Feedbag</h1>
2
+
3
+ <blockquote>
4
+ <p>Do you want me to drag my sack across your face?
5
+ - Glenn Quagmire</p>
6
+ </blockquote>
7
+
8
+ <p>Feedbag is a feed auto-discovery Ruby library. You don't need to know more about it. It is said to be:</p>
9
+
10
+ <blockquote>
11
+ <p>Ruby's favorite auto-discovery tool/library!</p>
12
+ </blockquote>
13
+
14
+ <h3>Quick synopsis</h3>
15
+
16
+ <pre><code>&gt;&gt; require "rubygems"
17
+ =&gt; true
18
+ &gt;&gt; require "feedbag"
19
+ =&gt; true
20
+ &gt;&gt; Feedbag.find "log.damog.net"
21
+ =&gt; ["http://feeds.feedburner.com/TeoremaDelCerdoInfinito", "http://log.damog.net/comments/feed/"]
22
+ </code></pre>
23
+
24
+ <h3>Installation</h3>
25
+
26
+ <pre><code>$ sudo gem install damog-feedbag -s http://gems.github.com/
27
+ </code></pre>
28
+
29
+ <p>Or just grab feedbag.rb and use it on your own project:</p>
30
+
31
+ <pre><code>$ wget http://github.com/damog/feedbag/raw/master/lib/feedbag.rb
32
+ </code></pre>
33
+
34
+ <h2>Tutorial</h2>
35
+
36
+ <p>So you want to know more about it.</p>
37
+
38
+ <p>If the URL passed to the find method is itself a feed, only that feed URL is returned.</p>
39
+
40
+ <pre><code>&gt;&gt; Feedbag.find "github.com/damog.atom"
41
+ =&gt; ["http://github.com/damog.atom"]
42
+ &gt;&gt;
43
+ </code></pre>
44
+
45
+ <p>Otherwise, it always returns feeds found in LINK tags first and feeds found in A (anchor) tags later. Among the A feeds, those hosted on the same host as the given URL take priority:</p>
46
+
47
+ <pre><code>&gt;&gt; Feedbag.find "http://ve.planetalinux.org"
48
+ =&gt; ["http://feedproxy.google.com/PlanetaLinuxVenezuela", "http://rendergraf.wordpress.com/feed/", "http://rootweiller.wordpress.com/feed/", "http://skatox.com/blog/feed/", "http://kodegeek.com/atom.xml", "http://blog.0x29.com.ve/?feed=rss2&amp;cat=8"]
49
+ &gt;&gt;
50
+ </code></pre>
51
+
52
+ <p>In your application you will usually want only the first element, or the first few, of the array:</p>
53
+
54
+ <pre><code>&gt;&gt; Feedbag.find("planet.debian.org").first(3)
55
+ =&gt; ["http://planet.debian.org/rss10.xml", "http://planet.debian.org/rss20.xml", "http://planet.debian.org/atom.xml"]
56
+ &gt;&gt;
57
+ </code></pre>
58
+
59
+ <p>(Try running that same example without the "first" method. That example's host is a blog aggregator, so it has over a hundred feed URLs:)</p>
60
+
61
+ <pre><code>&gt;&gt; Feedbag.find("planet.debian.org").size
62
+ =&gt; 104
63
+ &gt;&gt;
64
+ </code></pre>
65
+
66
+ <p>Feedbag will find them all, but it returns the most important ones as the first elements of the returned array.</p>
67
+
68
+ <pre><code>&gt;&gt; Feedbag.find("cnn.com")
69
+ =&gt; ["http://rss.cnn.com/rss/cnn_topstories.rss", "http://rss.cnn.com/rss/cnn_latest.rss", "http://rss.cnn.com/services/podcasting/robinmeade/rss.xml"]
70
+ &gt;&gt;
71
+ </code></pre>
72
+
73
+ <h3>Why should you use it?</h3>
74
+
75
+ <ul>
76
+ <li>Because it's cool.</li>
77
+ <li>Because its only dependency is <a href="https://code.whytheluckystiff.net/hpricot/">Hpricot</a>.</li>
78
+ <li>Because it follows modern feed filename conventions (like those used by WordPress blogs, Blogger, etc.).</li>
79
+ <li>Because it's a single file you can embed easily in your application.</li>
80
+ <li>Because it passes most of Mark Pilgrim's <a href="http://diveintomark.org/tests/client/autodiscovery/">Atom auto-discovery test suite</a>. It doesn't pass them all because some of those tests are broken (citation needed).</li>
81
+ </ul>
82
+
83
+ <h3>Why did I build it?</h3>
84
+
85
+ <ul>
86
+ <li>Because I liked Benjamin Trott's <a href="http://search.cpan.org/~btrott/Feed-Find-0.06/lib/Feed/Find.pm">Feed::Find</a>.</li>
87
+ <li>Because I thought it would be good to have Feed::Find's functionality in Ruby.</li>
88
+ <li>Because I thought it was going to be easy to maintain.</li>
89
+ <li>Because I was going to use it on <a href="http://github.com/damog/rfeed">rFeed</a>.</li>
90
+ <li>And finally, because I didn't know <a href="http://rfeedfinder.rubyforge.org/">rfeedfinder</a> existed :-)</li>
91
+ </ul>
92
+
93
+ <h3>Bugs</h3>
94
+
95
+ <p>Please report bugs to <a href="mailto:rt@support.axiombox.com">rt@support.axiombox.com</a> or directly to the author.</p>
96
+
97
+ <h3>Contribute</h3>
98
+
99
+ <blockquote>
100
+ <p>git clone git://github.com/damog/feedbag.git</p>
101
+ </blockquote>
102
+
103
+ <p>...patch, build, hack and make pull requests. I'll be glad.</p>
104
+
105
+ <h3>Author</h3>
106
+
107
+ <p><a href="http://damog.net/">David Moreno</a> &lt;<a href="mailto:david@axiombox.com">david@axiombox.com</a>&gt;.</p>
108
+
109
+ <h3>Copyright</h3>
110
+
111
+ <p>This is free software. See <a href="http://github.com/damog/feedbag/master/COPYING">COPYING</a> for more information.</p>
112
+
113
+ <h3>Thanks</h3>
114
+
115
+ <p><a href="http://maggit.net">Raquel</a>, for making <a href="http://axiombox.com">Axiombox</a> and most of my dreams possible. Also, <a href="http://github.com">GitHub</a> for making a nice code sharing service that doesn't suck.</p>
data/lib/feedbag.rb ADDED
@@ -0,0 +1,226 @@
1
+ #!/usr/bin/ruby
2
+
3
+ # Copyright (c) 2012 David Moreno <david@axiombox.com>
4
+ #
5
+ # Permission is hereby granted, free of charge, to any person obtaining
6
+ # a copy of this software and associated documentation files (the
7
+ # "Software"), to deal in the Software without restriction, including
8
+ # without limitation the rights to use, copy, modify, merge, publish,
9
+ # distribute, sublicense, and/or sell copies of the Software, and to
10
+ # permit persons to whom the Software is furnished to do so, subject to
11
+ # the following conditions:
12
+ #
13
+ # The above copyright notice and this permission notice shall be
14
+ # included in all copies or substantial portions of the Software.
15
+ #
16
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ # LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ # WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
23
+
24
+ require "rubygems"
25
+ require "hpricot"
26
+ require "open-uri"
27
+ require "net/http"
28
+ require 'timeout'
29
+ require 'iconv' if RUBY_VERSION < '1.9'
30
+
31
+ module Feedbag
32
+ Feed = Struct.new(:url, :title, :human_url, :description)
33
+
34
+ @content_types = [
35
+ 'application/x.atom+xml',
36
+ 'application/atom+xml',
37
+ 'application/xml',
38
+ 'text/xml',
39
+ 'application/rss+xml',
40
+ 'application/rdf+xml',
41
+ ]
42
+
43
+ $feeds = []
44
+ $base_uri = nil
45
+
46
+ def self.feed?(url)
47
+ # use LWR::Simple.normalize some time
48
+ url_uri = URI.parse(url)
49
+ url = "#{url_uri.scheme or 'http'}://#{url_uri.host}#{url_uri.path}"
50
+ url << "?#{url_uri.query}" if url_uri.query
51
+
52
+ # hack:
53
+ url.sub!(/^feed:\/\//, 'http://')
54
+
55
+ res = self.find(url)
56
+ if res.size == 1 and (res.first.respond_to?(:url) ? res.first.url : res.first) == url
57
+ return true
58
+ else
59
+ return false
60
+ end
61
+ end
62
+
63
+ def self.find(url, args = {})
64
+ $feeds = []
65
+
66
+ url_uri = URI.parse(url)
67
+ url = nil
68
+ if url_uri.scheme.nil?
69
+ url = "http://#{url_uri.to_s}"
70
+ elsif url_uri.scheme == "feed"
71
+ return self.add_feed(url_uri.to_s.sub(/^feed:\/\//, 'http://'), nil)
72
+ else
73
+ url = url_uri.to_s
74
+ end
75
+ #url = "#{url_uri.scheme or 'http'}://#{url_uri.host}#{url_uri.path}"
76
+
77
+ #return self.add_feed(url, nil) if looks_like_feed? url
78
+
79
+ # check if feed_validator is available
80
+ begin
81
+ require "feed_validator"
82
+ v = W3C::FeedValidator.new
83
+ v.validate_url(url)
84
+ return self.add_feed(url, nil) if v.valid?
85
+ rescue LoadError
86
+ # feed_validator is not installed; skip validation
87
+ rescue REXML::ParseException
88
+ # usually indicates timeout
89
+ # TODO: actually find out timeout. use Terminator?
90
+ # $stderr.puts "Feed looked like feed but might not have passed validation or timed out"
91
+ rescue => ex
92
+ $stderr.puts "#{ex.class} error occurred with: `#{url}': #{ex.message}"
93
+ end
94
+
95
+ begin
96
+ Timeout::timeout(15) do
97
+ html = open(url) do |f|
98
+ content_type = f.content_type.downcase
99
+ if content_type == "application/octet-stream" # open failed
100
+ content_type = f.meta["content-type"].gsub(/;.*$/, '')
101
+ end
102
+ if @content_types.include?(content_type)
103
+ return self.add_feed(url, nil)
104
+ end
105
+
106
+ if RUBY_VERSION < '1.9'
107
+ ic = Iconv.new('UTF-8//IGNORE', f.charset)
108
+ doc = Hpricot(ic.iconv(f.read))
109
+ else
110
+ doc = Hpricot(f.read)
111
+ end
112
+
113
+ if doc.at("base") and doc.at("base")["href"]
114
+ $base_uri = doc.at("base")["href"]
115
+ else
116
+ $base_uri = nil
117
+ end
118
+
119
+ title = (doc/:title).first
120
+ title = title.innerHTML if title
121
+
122
+ description = (doc/:description).first
123
+ description = description.innerHTML if description
124
+
125
+ # first with links
126
+ (doc/"atom:link").each do |l|
127
+ next unless l["rel"]
128
+ if l["type"] and @content_types.include?(l["type"].downcase.strip) and l["rel"].downcase == "self"
129
+ self.add_feed(l["href"], url, $base_uri, title, description || title)
130
+ end
131
+ end
132
+
133
+ (doc/"link").each do |l|
134
+ next unless l["rel"]
135
+ if l["type"] and @content_types.include?(l["type"].downcase.strip) and (l["rel"].downcase =~ /alternate/i or l["rel"] == "service.feed")
136
+ self.add_feed(l["href"], url, $base_uri, title, description || title)
137
+ end
138
+ end
139
+
140
+ (doc/"a").each do |a|
141
+ next unless a["href"]
142
+ if self.looks_like_feed?(a["href"]) and (a["href"] =~ /\// or a["href"] =~ /#{url_uri.host}/)
143
+ title = a["title"] || a.inner_html || a['alt'] || title
144
+ self.add_feed(a["href"], url, $base_uri, title, description || title)
145
+ end
146
+ end
147
+
148
+ (doc/"a").each do |a|
149
+ next unless a["href"]
150
+ if self.looks_like_feed?(a["href"])
151
+ title = a["title"] || a.inner_html || a['alt'] || title
152
+ self.add_feed(a["href"], url, $base_uri, title, description || title)
153
+ end
154
+ end
155
+
156
+ # Added support for feeds like http://tabtimes.com/tbfeed/mashable/full.xml
157
+ if url.match(/\.xml$/) and doc.root and doc.root["xml:base"] and doc.root["xml:base"].strip == url.strip
158
+ self.add_feed(url, url, $base_uri, title, description)
159
+ end
160
+ end
161
+ end
162
+ rescue Timeout::Error => err
163
+ $stderr.puts "Timeout error occurred with `#{url}': #{err}"
164
+ rescue OpenURI::HTTPError => the_error
165
+ $stderr.puts "Error occurred with `#{url}': #{the_error}"
166
+ rescue SocketError => err
167
+ $stderr.puts "Socket error occurred with: `#{url}': #{err}"
168
+ rescue => ex
169
+ $stderr.puts "#{ex.class} error occurred with: `#{url}': #{ex.message}"
170
+ ensure
171
+ return $feeds
172
+ end
173
+ end
174
+
175
+ def self.looks_like_feed?(url)
176
+ if url =~ /((\.|\/)(rdf|xml|rss)$|feed=(rss|atom)|(atom|feed)\/?$)/i
177
+ true
178
+ else
179
+ false
180
+ end
181
+ end
182
+
183
+ def self.add_feed(feed_url, orig_url, base_uri = nil, title = "", description = "")
184
+ # puts "#{feed_url} - #{orig_url}"
185
+ url = feed_url.sub(/^feed:/, '').strip
186
+
187
+ if base_uri
188
+ # url = base_uri + feed_url
189
+ url = URI.parse(base_uri).merge(feed_url).to_s
190
+ end
191
+
192
+ begin
193
+ uri = URI.parse(url)
194
+ rescue
195
+ puts "Error with `#{url}'"
196
+ exit 1
197
+ end
198
+ unless uri.absolute?
199
+ orig = URI.parse(orig_url)
200
+ url = orig.merge(url).to_s
201
+ end
202
+
203
+ # verify url is really valid
204
+ $feeds.push(Feed.new(url, title, orig_url, description)) unless $feeds.any? { |f| f.url == url }# if self._is_http_valid(URI.parse(url), orig_url)
205
+ end
206
+
207
+ # not used. yet.
208
+ def self._is_http_valid(uri, orig_url)
209
+ req = Net::HTTP.get_response(uri)
210
+ orig_uri = URI.parse(orig_url)
211
+ case req
212
+ when Net::HTTPSuccess then
213
+ return true
214
+ else
215
+ return false
216
+ end
217
+ end
218
+ end
219
+
220
+ if __FILE__ == $0
221
+ if ARGV.size == 0
222
+ puts 'usage: feedbag url'
223
+ else
224
+ puts Feedbag.find ARGV.first
225
+ end
226
+ end
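The anchor-tag passes in `find` rely on `looks_like_feed?` to guess from the URL alone whether a link points at a feed, and `add_feed` resolves relative links with `URI#merge`. Below is a standalone sketch of both steps. It assumes the same regular expression as in `lib/feedbag.rb`, written without the repeated `rdf` alternative. The helper names `feed_like?` and `resolve` and the `example.com` URLs are illustrative and not part of the gem.

```ruby
require "uri"

# Same heuristic as Feedbag.looks_like_feed?: match common feed
# extensions, "feed=rss"/"feed=atom" query strings, or a path that
# ends in "atom" or "feed" (optionally followed by a slash).
FEED_PATTERN = /((\.|\/)(rdf|xml|rss)$|feed=(rss|atom)|(atom|feed)\/?$)/i

# Hypothetical helper: true when the URL looks like a feed by name.
def feed_like?(url)
  !!(url =~ FEED_PATTERN)
end

# Hypothetical helper: resolve a possibly relative href against the
# page it was found on, as add_feed does with URI#merge.
def resolve(href, page_url)
  URI.parse(page_url).merge(href).to_s
end

puts feed_like?("http://example.com/feed/")       # path ends in "feed/"
puts feed_like?("http://example.com/?feed=rss2")  # WordPress-style query string
puts feed_like?("http://example.com/about.html")  # ordinary page
puts resolve("/feed/", "http://example.com/blog/post.html")
puts resolve("atom.xml", "http://example.com/blog/")
```

The pattern only inspects the URL text, so it can flag pages that are not feeds and miss feeds with unusual names. This is presumably why `find` consults LINK tags before falling back to anchors.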
data/lib/feedbagtoo.rb ADDED
@@ -0,0 +1 @@
1
+ require 'feedbag'
data/rails/init.rb ADDED
@@ -0,0 +1 @@
1
+ require File.join File.dirname(__FILE__), "..", "lib", "feedbag"
data/test/atom_autodiscovery_test.rb ADDED
@@ -0,0 +1,40 @@
1
+ require File.dirname(__FILE__) + '/test_helper'
2
+
3
+ class AtomAutoDiscoveryTest < Test::Unit::TestCase
4
+ def test_autodisc
5
+ base_url = "http://diveintomark.org/tests/client/autodiscovery/"
6
+ url = base_url + "html4-001.html"
7
+
8
+ i = 1
9
+ puts "trying now with #{url}"
10
+ while(i)
11
+ puts
12
+ i = 0 # unless otherwise found
13
+
14
+ f = Feedbag.find url
15
+
16
+ assert_instance_of Array, f
17
+ assert f.size == 1, "Feedbag didn't find a feed on #{url} or found more than one"
18
+
19
+ puts " found #{f[0]}"
20
+ feed = Hpricot(open(f[0]))
21
+
22
+ (feed/"link").each do |l|
23
+ next unless l["rel"] == "alternate"
24
+ assert_equal l["href"], url
25
+ end
26
+
27
+ # now move on to the next one
28
+ html = Hpricot(open(url))
29
+ (html/"link").each do |l|
30
+ next unless l["rel"] == "next"
31
+ url = URI.parse(base_url).merge(l["href"]).to_s
32
+ puts "trying now with #{url}"
33
+ i = 1
34
+ end
35
+
36
+ end
37
+ end
38
+
39
+
40
+ end
data/test/feedbag_test.rb ADDED
@@ -0,0 +1,47 @@
1
+ require File.dirname(__FILE__) + '/test_helper'
2
+ class FeedbagTest < ActiveSupport::TestCase
3
+
4
+ test "Feedbag.feed? should know that an RSS url is a feed" do
5
+ rss_url = 'http://example.com/rss/'
6
+ Feedbag.stubs(:find).with(rss_url).returns([rss_url])
7
+
8
+ assert Feedbag.feed?(rss_url)
9
+ end
10
+
11
+ test "Feedbag.feed? should know that an RSS url with parameters is a feed" do
12
+ rss_url = "http://example.com/data?format=rss"
13
+ Feedbag.stubs(:find).with(rss_url).returns([rss_url])
14
+
15
+ assert Feedbag.feed?(rss_url)
16
+ end
17
+
18
+ test "Feedbag find should discover feeds containing atom:link" do
19
+ feeds = []
20
+ feeds << 'http://www.psfk.com/feeds/mashable'
21
+ feeds << 'http://jenniferlynch.wordpress.com/feed'
22
+ feeds << 'http://lurenbijdeburen.wordpress.com/feed'
23
+
24
+ feeds.each do |url|
25
+ assert_equal url, Feedbag.find(url).first.url
26
+ end
27
+ end
28
+
29
+ test "Feedbag find should discover feeds from site" do
30
+ feeds = []
31
+ feeds << 'http://www.justinball.com/'
32
+
33
+ feeds.each do |url|
34
+ assert_equal 'http://www.justinball.com/feed/', Feedbag.find(url).first.url
35
+ end
36
+ end
37
+
38
+ test "Feedbag find should discover feeds from xml" do
39
+ feeds = []
40
+ feeds << 'http://tabtimes.com/tbfeed/mashable/full.xml'
41
+
42
+ feeds.each do |url|
43
+ assert_equal url, Feedbag.find(url).first.url
44
+ end
45
+ end
46
+
47
+ end
data/test/test_helper.rb ADDED
@@ -0,0 +1,17 @@
1
+ require 'rubygems'
2
+
3
+ require 'test/unit'
4
+
5
+
6
+ require 'active_support'
7
+ require 'active_support/test_case'
8
+
9
+ require 'mocha'
10
+
11
+ require File.dirname(__FILE__) + '/../lib/feedbag'
12
+
13
+ if RUBY_VERSION < '1.9'
14
+ require 'ruby-debug'
15
+ else
16
+ require 'debugger'
17
+ end
metadata ADDED
@@ -0,0 +1,206 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: feedbagtoo
3
+ version: !ruby/object:Gem::Version
4
+ hash: 7
5
+ prerelease:
6
+ segments:
7
+ - 0
8
+ - 7
9
+ - 2
10
+ version: 0.7.2
11
+ platform: ruby
12
+ authors:
13
+ - Axiombox
14
+ - David Moreno
15
+ - Joel Duffin
16
+ - Justin Ball
17
+ - Fabien Penso
18
+ autorequire:
19
+ bindir: bin
20
+ cert_chain: []
21
+
22
+ date: 2012-07-18 00:00:00 -06:00
23
+ default_executable: feedbag
24
+ dependencies:
25
+ - !ruby/object:Gem::Dependency
26
+ prerelease: false
27
+ type: :development
28
+ requirement: &id001 !ruby/object:Gem::Requirement
29
+ none: false
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ hash: 3
34
+ segments:
35
+ - 0
36
+ version: "0"
37
+ name: growl
38
+ version_requirements: *id001
39
+ - !ruby/object:Gem::Dependency
40
+ prerelease: false
41
+ type: :development
42
+ requirement: &id002 !ruby/object:Gem::Requirement
43
+ none: false
44
+ requirements:
45
+ - - ~>
46
+ - !ruby/object:Gem::Version
47
+ hash: 31
48
+ segments:
49
+ - 3
50
+ - 12
51
+ version: "3.12"
52
+ name: rdoc
53
+ version_requirements: *id002
54
+ - !ruby/object:Gem::Dependency
55
+ prerelease: false
56
+ type: :development
57
+ requirement: &id003 !ruby/object:Gem::Requirement
58
+ none: false
59
+ requirements:
60
+ - - ">="
61
+ - !ruby/object:Gem::Version
62
+ hash: 23
63
+ segments:
64
+ - 1
65
+ - 0
66
+ - 0
67
+ version: 1.0.0
68
+ name: bundler
69
+ version_requirements: *id003
70
+ - !ruby/object:Gem::Dependency
71
+ prerelease: false
72
+ type: :development
73
+ requirement: &id004 !ruby/object:Gem::Requirement
74
+ none: false
75
+ requirements:
76
+ - - ~>
77
+ - !ruby/object:Gem::Version
78
+ hash: 49
79
+ segments:
80
+ - 1
81
+ - 8
82
+ - 3
83
+ version: 1.8.3
84
+ name: jeweler
85
+ version_requirements: *id004
86
+ - !ruby/object:Gem::Dependency
87
+ prerelease: false
88
+ type: :development
89
+ requirement: &id005 !ruby/object:Gem::Requirement
90
+ none: false
91
+ requirements:
92
+ - - ">="
93
+ - !ruby/object:Gem::Version
94
+ hash: 3
95
+ segments:
96
+ - 0
97
+ version: "0"
98
+ name: active_support
99
+ version_requirements: *id005
100
+ - !ruby/object:Gem::Dependency
101
+ prerelease: false
102
+ type: :development
103
+ requirement: &id006 !ruby/object:Gem::Requirement
104
+ none: false
105
+ requirements:
106
+ - - ">="
107
+ - !ruby/object:Gem::Version
108
+ hash: 3
109
+ segments:
110
+ - 0
111
+ version: "0"
112
+ name: mocha
113
+ version_requirements: *id006
114
+ - !ruby/object:Gem::Dependency
115
+ prerelease: false
116
+ type: :development
117
+ requirement: &id007 !ruby/object:Gem::Requirement
118
+ none: false
119
+ requirements:
120
+ - - ">="
121
+ - !ruby/object:Gem::Version
122
+ hash: 3
123
+ segments:
124
+ - 0
125
+ version: "0"
126
+ name: hpricot
127
+ version_requirements: *id007
128
+ - !ruby/object:Gem::Dependency
129
+ prerelease: false
130
+ type: :development
131
+ requirement: &id008 !ruby/object:Gem::Requirement
132
+ none: false
133
+ requirements:
134
+ - - ">="
135
+ - !ruby/object:Gem::Version
136
+ hash: 3
137
+ segments:
138
+ - 0
139
+ version: "0"
140
+ name: ruby-debug
141
+ version_requirements: *id008
142
+ description: This gem will return title and url for each feed discovered at a given url
143
+ email: justin@tatemae.com
144
+ executables:
145
+ - feedbag
146
+ extensions: []
147
+
148
+ extra_rdoc_files:
149
+ - ChangeLog
150
+ - README.markdown
151
+ - TODO
152
+ files:
153
+ - COPYING
154
+ - ChangeLog
155
+ - Gemfile
156
+ - Gemfile.lock
157
+ - README.markdown
158
+ - Rakefile
159
+ - TODO
160
+ - VERSION
161
+ - benchmark/rfeedfinder_benchmark.rb
162
+ - bin/feedbag
163
+ - feedbagtoo.gemspec
164
+ - index.html
165
+ - lib/feedbag.rb
166
+ - lib/feedbagtoo.rb
167
+ - rails/init.rb
168
+ - test/atom_autodiscovery_test.rb
169
+ - test/feedbag_test.rb
170
+ - test/test_helper.rb
171
+ has_rdoc: true
172
+ homepage: http://github.com/tatemae/feedbagtoo
173
+ licenses: []
174
+
175
+ post_install_message:
176
+ rdoc_options: []
177
+
178
+ require_paths:
179
+ - lib
180
+ required_ruby_version: !ruby/object:Gem::Requirement
181
+ none: false
182
+ requirements:
183
+ - - ">="
184
+ - !ruby/object:Gem::Version
185
+ hash: 3
186
+ segments:
187
+ - 0
188
+ version: "0"
189
+ required_rubygems_version: !ruby/object:Gem::Requirement
190
+ none: false
191
+ requirements:
192
+ - - ">="
193
+ - !ruby/object:Gem::Version
194
+ hash: 3
195
+ segments:
196
+ - 0
197
+ version: "0"
198
+ requirements: []
199
+
200
+ rubyforge_project:
201
+ rubygems_version: 1.6.2
202
+ signing_key:
203
+ specification_version: 3
204
+ summary: Fork of the feedbag gem that returns title along with url.
205
+ test_files: []
206
+