feedbagtoo 0.7.2

data/COPYING ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (C) 2012 David Moreno <david@axiombox.com>
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/ChangeLog ADDED
@@ -0,0 +1,29 @@
1
+ * 0.9 - Fri Mar 16 10:59:00 EDT 2012
2
+ - Changed license to MIT.
3
+
4
+ * 0.6 - Fri Mar 5 20:10:33 EST 2010
5
+ - Added bin/feedbag.
6
+ - Removed the args[:narrow] option, not really needed.
7
+ - Handle case where feed URLs contain GET parameters; add tests
8
+ by Patrick Reagan <patrick.reagan@viget.com>.
9
+
10
+ * 0.5.99 - Tue May 12 12:52:22 EDT 2009
11
+ - Added rails/init.rb to load easily on a Rails app.
12
+
13
+ * 0.5.13.1 - Wed Apr 22 11:16:19 EDT 2009
14
+ - Changed args on find() from nil to {}
15
+
16
+ * 0.5.13 - Wed Apr 22 11:12:40 EDT 2009
17
+ - Added :narrow option so find() skips feed_validate and A links.
18
+
19
+ * 0.5.12 - Fri Mar 20 12:34:48 EDT 2009
20
+ - Added support for "feed://" URLs
21
+
22
+ * 0.5.11 - Sat Mar 7 17:22:30 EST 2009
23
+ - Benchmark against Rfeedfinder added.
24
+
25
+ * 0.5.10 - Wed Mar 4 13:32:33 EST 2009
26
+ - Feeds whose URLs contained query string arguments were not being
27
+ auto-discovered -- fixed
28
+
29
+ ** For previous changes, see the git log
data/Gemfile ADDED
@@ -0,0 +1,22 @@
1
+ # Clean up if needed
2
+ # rm -rf ~/.bundle/ ~/.gem/; rm -rf $GEM_HOME/bundler/ $GEM_HOME/cache/bundler/; rm -rf .bundle/; rm -rf vendor/cache/; rm -rf Gemfile.lock
3
+
4
+ source "http://rubygems.org"
5
+
6
+
7
+ # Add dependencies to develop your gem here.
8
+ # Include everything needed to run rake, tests, features, etc.
9
+ group :development, :test do
10
+ gem 'growl'
11
+ gem "rdoc", "~> 3.12"
12
+ gem "bundler", ">=1.0.0"
13
+ gem "jeweler", "~> 1.8.3"
14
+ gem "active_support"
15
+ gem "mocha"
16
+ gem "hpricot"
17
+ if RUBY_VERSION < '1.9'
18
+ gem "ruby-debug"
19
+ else
20
+ gem 'debugger', '~> 1.1.4'
21
+ end
22
+ end
data/Gemfile.lock ADDED
@@ -0,0 +1,43 @@
1
+ GEM
2
+ remote: http://rubygems.org/
3
+ specs:
4
+ active_support (3.0.0)
5
+ activesupport (= 3.0.0)
6
+ activesupport (3.0.0)
7
+ columnize (0.3.6)
8
+ git (1.2.5)
9
+ growl (1.0.3)
10
+ hpricot (0.8.6)
11
+ jeweler (1.8.4)
12
+ bundler (~> 1.0)
13
+ git (>= 1.2.5)
14
+ rake
15
+ rdoc
16
+ json (1.7.3)
17
+ linecache (0.46)
18
+ rbx-require-relative (> 0.0.4)
19
+ metaclass (0.0.1)
20
+ mocha (0.12.0)
21
+ metaclass (~> 0.0.1)
22
+ rake (0.9.2.2)
23
+ rbx-require-relative (0.0.9)
24
+ rdoc (3.12)
25
+ json (~> 1.4)
26
+ ruby-debug (0.10.4)
27
+ columnize (>= 0.1)
28
+ ruby-debug-base (~> 0.10.4.0)
29
+ ruby-debug-base (0.10.4)
30
+ linecache (>= 0.3)
31
+
32
+ PLATFORMS
33
+ ruby
34
+
35
+ DEPENDENCIES
36
+ active_support
37
+ bundler (>= 1.0.0)
38
+ growl
39
+ hpricot
40
+ jeweler (~> 1.8.3)
41
+ mocha
42
+ rdoc (~> 3.12)
43
+ ruby-debug
data/README.markdown ADDED
@@ -0,0 +1,101 @@
1
+ Feedbag
2
+ =======
3
+ Forked version of feedbag that returns title, description and url.
4
+
5
+ Feedbag is a feed auto-discovery Ruby library. You don't need to know more about it. It is said to be:
6
+
7
+ > Ruby's favorite auto-discovery tool/library!
8
+
9
+ ### Quick synopsis
10
+
11
+ >> require "rubygems"
12
+ => true
13
+ >> require "feedbag"
14
+ => true
15
+ >> Feedbag.find "log.damog.net"
16
+ => ["http://feeds.feedburner.com/TeoremaDelCerdoInfinito", "http://log.damog.net/comments/feed/"]
17
+ >> Feedbag.feed?("google.com")
18
+ => false
19
+ >> Feedbag.feed?("http://planet.debian.org/rss20.xml")
20
+ => true
21
+
22
+ ### Installation
23
+
24
+ $ sudo gem install damog-feedbag -s http://gems.github.com/
25
+
26
+ Or just grab feedbag.rb and use it on your own project:
27
+
28
+ $ wget http://github.com/damog/feedbag/raw/master/lib/feedbag.rb
29
+
30
+ ## Tutorial
31
+
32
+ So you want to know more about it.
33
+
34
+ OK, if the URL passed to the find method is itself a feed, only that feed URL will be returned.
35
+
36
+ >> Feedbag.find "github.com/damog.atom"
37
+ => ["http://github.com/damog.atom"]
38
+ >>
39
+
40
+ Otherwise, it always returns LINK feeds first and A (anchor tag) feeds later. Among A feeds, the ones hosted on the same host as the URL have higher priority:
41
+
42
+ >> Feedbag.find "http://ve.planetalinux.org"
43
+ => ["http://feedproxy.google.com/PlanetaLinuxVenezuela", "http://rendergraf.wordpress.com/feed/", "http://rootweiller.wordpress.com/feed/", "http://skatox.com/blog/feed/", "http://kodegeek.com/atom.xml", "http://blog.0x29.com.ve/?feed=rss2&cat=8"]
44
+ >>
45
+
46
+ In your application, you should usually take only the first element (or first few elements) of the array:
47
+
48
+ >> Feedbag.find("planet.debian.org").first(3)
49
+ => ["http://planet.debian.org/rss10.xml", "http://planet.debian.org/rss20.xml", "http://planet.debian.org/atom.xml"]
50
+ >>
51
+
52
+ (Try running that same example without the "first" method. That example's host is a blog aggregator, so it has hundreds of feed URLs:)
53
+
54
+ >> Feedbag.find("planet.debian.org").size
55
+ => 104
56
+ >>
57
+
58
+ Feedbag will find them all, but it returns the most important ones as the first elements of the returned array.
59
+
60
+ >> Feedbag.find("cnn.com")
61
+ => ["http://rss.cnn.com/rss/cnn_topstories.rss", "http://rss.cnn.com/rss/cnn_latest.rss", "http://rss.cnn.com/services/podcasting/robinmeade/rss.xml"]
62
+ >>
63
+
64
+ ### Why should you use it?
65
+
66
+ - Because it's cool.
67
+ - Because it only uses [Hpricot](https://code.whytheluckystiff.net/hpricot/) as a dependency.
68
+ - Because it follows modern feed filename conventions (like those used by WordPress blogs, Blogger, etc.).
69
+ - Because it's a single file you can embed easily in your application.
70
+ - Because it passes most of Mark Pilgrim's [Atom auto-discovery test suite](http://diveintomark.org/tests/client/autodiscovery/). It doesn't pass them all because some of those tests are broken (citation needed).
71
+
72
+ ### Why did I build it?
73
+
74
+ - Because I liked Benjamin Trott's [Feed::Find](http://search.cpan.org/~btrott/Feed-Find-0.06/lib/Feed/Find.pm).
75
+ - Because I thought it would be good to have Feed::Find's functionality in Ruby.
76
+ - Because I thought it was going to be easy to maintain.
77
+ - Because I was going to use it on [rFeed](http://github.com/damog/rfeed).
78
+ - And finally, because I didn't know [rfeedfinder](http://rfeedfinder.rubyforge.org/) existed :-)
79
+
80
+ ### Bugs
81
+
82
+ Please report bugs to [rt@support.axiombox.com](mailto:rt@support.axiombox.com) or directly to the author.
83
+
84
+ ### Contribute
85
+
86
+ > git clone git://github.com/damog/feedbag.git
87
+
88
+ ...patch, build, hack and make pull requests. I'll be glad.
89
+
90
+ ### Author
91
+
92
+ [David Moreno](http://damog.net/) <[david@axiombox.com](mailto:david@axiombox.com)>.
93
+
94
+ ### Copyright
95
+
96
+ This is free software. See [COPYING](http://github.com/damog/feedbag/master/COPYING) for more information.
97
+
98
+ ### Thanks
99
+
100
+ [Raquel](http://maggit.net), for making [Axiombox](http://axiombox.com) and most of my dreams possible. Also, [GitHub](http://github.com) for making a nice code sharing service that doesn't suck.
101
+
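The README examples show arrays of URL strings, while this fork's `lib/feedbag.rb` pushes `Feed` structs (`url`, `title`, `human_url`, `description`) onto the result array. A minimal sketch of consuming such a result, assuming `Feedbag.find` returns an array of these structs; the sample values below are hypothetical stand-ins, not real output, and the struct is rebuilt locally so the snippet runs without the gem or network access:

```ruby
# Same fields as the Feed struct defined in lib/feedbag.rb (rebuilt here for illustration).
feed_struct = Struct.new(:url, :title, :human_url, :description)

# Hypothetical stand-in for what Feedbag.find might return for a page.
feeds = [
  feed_struct.new("http://example.com/feed/", "Example blog", "http://example.com/", "Example blog"),
  feed_struct.new("http://example.com/comments/feed/", "Comments", "http://example.com/", "Example blog")
]

# The README suggests the first element is usually the one to use.
primary = feeds.first
puts "#{primary.title}: #{primary.url}" if primary

# Recover a plain list of URL strings, as in the README examples.
urls = feeds.map(&:url)
puts urls.inspect
```

Mapping over `url` is one way to keep code written against the string-returning examples working with struct results.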
data/Rakefile ADDED
@@ -0,0 +1,33 @@
1
+ # encoding: utf-8
2
+
3
+ require 'rubygems'
4
+ require 'bundler'
5
+ require 'rake/testtask'
6
+ begin
7
+ Bundler.setup(:default, :development)
8
+ rescue Bundler::BundlerError => e
9
+ $stderr.puts e.message
10
+ $stderr.puts "Run `bundle install` to install missing gems"
11
+ exit e.status_code
12
+ end
13
+ require 'rake'
14
+
15
+ require 'jeweler'
16
+ Jeweler::Tasks.new do |gem|
17
+ gem.name = "feedbagtoo"
18
+ gem.summary = "Fork of the feedbag gem that returns title along with url."
19
+ gem.description = "This gem will return title and url for each feed discovered at a given url"
20
+ gem.email = "justin@tatemae.com"
21
+ gem.homepage = "http://github.com/tatemae/feedbagtoo"
22
+ gem.authors = ["Axiombox", "David Moreno", "Joel Duffin", "Justin Ball", "Fabien Penso"]
23
+ end
24
+ Jeweler::RubygemsDotOrgTasks.new
25
+
26
+
27
+ task :default => :test
28
+
29
+ Rake::TestTask.new do |t|
30
+ t.libs << 'test'
31
+ t.test_files = FileList["test/feedbag_test.rb"]
32
+ t.verbose = true
33
+ end
data/TODO ADDED
@@ -0,0 +1 @@
1
+ - Document Feedbag.feed?
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.7.2
data/benchmark/rfeedfinder_benchmark.rb ADDED
@@ -0,0 +1,30 @@
1
+ require "benchmark"
2
+ require "rubygems"
3
+
4
+ sites = [
5
+ "log.damog.net",
6
+ "http://cnn.com",
7
+ "scripting.com",
8
+ "mx.planetalinux.org",
9
+ "http://feedproxy.google.com/UniversoPlanetaLinux",
10
+ ]
11
+
12
+ Benchmark.bm do |x|
13
+ sites.each do |site|
14
+ puts "#{site}:"
15
+
16
+ puts " feedbag"
17
+ x.report {
18
+ require 'feedbag'
19
+ Feedbag.find(site)
20
+ }
21
+
22
+ puts " rfeedfinder"
23
+ x.report {
24
+ require 'rfeedfinder'
25
+ Rfeedfinder.feed(site)
26
+ }
27
+
28
+ end
29
+ end
30
+
data/bin/feedbag ADDED
@@ -0,0 +1,28 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "rubygems"
4
+ require "feedbag"
5
+
6
+ def usage
7
+ %Q{
8
+ #{$0} <url 1> [<url 2> <url 3> ... <url n>]
9
+ }
10
+ end
11
+
12
+ if ARGV.empty?
13
+ puts usage
14
+ exit 1
15
+ end
16
+
17
+ ARGV.each do |url|
18
+ puts "== #{url}:"
19
+ feeds = Feedbag.find url
20
+ if feeds.empty?
21
+ puts " no feeds found!"
22
+ else
23
+ feeds.each do |f|
24
+ puts " - #{f}"
25
+ end
26
+ end
27
+ end
28
+
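The script hands each argument to `Feedbag.find` unchanged, so bare hostnames and `feed://` URLs are both accepted on the command line. A rough sketch of the scheme handling that `find` performs before fetching, pulled out into a hypothetical helper named `normalize_feed_url` (that name is not part of the gem):

```ruby
require "uri"

# Hypothetical helper mirroring the scheme handling at the top of Feedbag.find.
def normalize_feed_url(url)
  uri = URI.parse(url)
  if uri.scheme.nil?
    # A bare host such as "log.damog.net" parses as a path with no scheme.
    "http://#{uri}"
  elsif uri.scheme == "feed"
    # feed:// URLs are rewritten to plain HTTP.
    uri.to_s.sub(%r{^feed://}, "http://")
  else
    uri.to_s
  end
end

puts normalize_feed_url("log.damog.net")
puts normalize_feed_url("feed://planet.debian.org/rss20.xml")
puts normalize_feed_url("https://example.com/blog")
```

In the gem itself, a `feed://` URL is added to the results directly without being fetched; the helper above only illustrates the string rewriting.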
data/feedbagtoo.gemspec ADDED
@@ -0,0 +1,80 @@
1
+ # Generated by jeweler
2
+ # DO NOT EDIT THIS FILE DIRECTLY
3
+ # Instead, edit Jeweler::Tasks in Rakefile, and run 'rake gemspec'
4
+ # -*- encoding: utf-8 -*-
5
+
6
+ Gem::Specification.new do |s|
7
+ s.name = %q{feedbagtoo}
8
+ s.version = "0.7.2"
9
+
10
+ s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
11
+ s.authors = ["Axiombox", "David Moreno", "Joel Duffin", "Justin Ball", "Fabien Penso"]
12
+ s.date = %q{2012-07-18}
13
+ s.default_executable = %q{feedbag}
14
+ s.description = %q{This gem will return title and url for each feed discovered at a given url}
15
+ s.email = %q{justin@tatemae.com}
16
+ s.executables = ["feedbag"]
17
+ s.extra_rdoc_files = [
18
+ "ChangeLog",
19
+ "README.markdown",
20
+ "TODO"
21
+ ]
22
+ s.files = [
23
+ "COPYING",
24
+ "ChangeLog",
25
+ "Gemfile",
26
+ "Gemfile.lock",
27
+ "README.markdown",
28
+ "Rakefile",
29
+ "TODO",
30
+ "VERSION",
31
+ "benchmark/rfeedfinder_benchmark.rb",
32
+ "bin/feedbag",
33
+ "feedbagtoo.gemspec",
34
+ "index.html",
35
+ "lib/feedbag.rb",
36
+ "lib/feedbagtoo.rb",
37
+ "rails/init.rb",
38
+ "test/atom_autodiscovery_test.rb",
39
+ "test/feedbag_test.rb",
40
+ "test/test_helper.rb"
41
+ ]
42
+ s.homepage = %q{http://github.com/tatemae/feedbagtoo}
43
+ s.require_paths = ["lib"]
44
+ s.rubygems_version = %q{1.6.2}
45
+ s.summary = %q{Fork of the feedbag gem that returns title along with url.}
46
+
47
+ if s.respond_to? :specification_version then
48
+ s.specification_version = 3
49
+
50
+ if Gem::Version.new(Gem::VERSION) >= Gem::Version.new('1.2.0') then
51
+ s.add_development_dependency(%q<growl>, [">= 0"])
52
+ s.add_development_dependency(%q<rdoc>, ["~> 3.12"])
53
+ s.add_development_dependency(%q<bundler>, [">= 1.0.0"])
54
+ s.add_development_dependency(%q<jeweler>, ["~> 1.8.3"])
55
+ s.add_development_dependency(%q<active_support>, [">= 0"])
56
+ s.add_development_dependency(%q<mocha>, [">= 0"])
57
+ s.add_development_dependency(%q<hpricot>, [">= 0"])
58
+ s.add_development_dependency(%q<ruby-debug>, [">= 0"])
59
+ else
60
+ s.add_dependency(%q<growl>, [">= 0"])
61
+ s.add_dependency(%q<rdoc>, ["~> 3.12"])
62
+ s.add_dependency(%q<bundler>, [">= 1.0.0"])
63
+ s.add_dependency(%q<jeweler>, ["~> 1.8.3"])
64
+ s.add_dependency(%q<active_support>, [">= 0"])
65
+ s.add_dependency(%q<mocha>, [">= 0"])
66
+ s.add_dependency(%q<hpricot>, [">= 0"])
67
+ s.add_dependency(%q<ruby-debug>, [">= 0"])
68
+ end
69
+ else
70
+ s.add_dependency(%q<growl>, [">= 0"])
71
+ s.add_dependency(%q<rdoc>, ["~> 3.12"])
72
+ s.add_dependency(%q<bundler>, [">= 1.0.0"])
73
+ s.add_dependency(%q<jeweler>, ["~> 1.8.3"])
74
+ s.add_dependency(%q<active_support>, [">= 0"])
75
+ s.add_dependency(%q<mocha>, [">= 0"])
76
+ s.add_dependency(%q<hpricot>, [">= 0"])
77
+ s.add_dependency(%q<ruby-debug>, [">= 0"])
78
+ end
79
+ end
80
+
data/index.html ADDED
@@ -0,0 +1,115 @@
1
+ <h1>Feedbag</h1>
2
+
3
+ <blockquote>
4
+ <p>Do you want me to drag my sack across your face?
5
+ - Glenn Quagmire</p>
6
+ </blockquote>
7
+
8
+ <p>Feedbag is a feed auto-discovery Ruby library. You don't need to know more about it. It is said to be:</p>
9
+
10
+ <blockquote>
11
+ <p>Ruby's favorite auto-discovery tool/library!</p>
12
+ </blockquote>
13
+
14
+ <h3>Quick synopsis</h3>
15
+
16
+ <pre><code>&gt;&gt; require "rubygems"
17
+ =&gt; true
18
+ &gt;&gt; require "feedbag"
19
+ =&gt; true
20
+ &gt;&gt; Feedbag.find "log.damog.net"
21
+ =&gt; ["http://feeds.feedburner.com/TeoremaDelCerdoInfinito", "http://log.damog.net/comments/feed/"]
22
+ </code></pre>
23
+
24
+ <h3>Installation</h3>
25
+
26
+ <pre><code>$ sudo gem install damog-feedbag -s http://gems.github.com/
27
+ </code></pre>
28
+
29
+ <p>Or just grab feedbag.rb and use it on your own project:</p>
30
+
31
+ <pre><code>$ wget http://github.com/damog/feedbag/raw/master/lib/feedbag.rb
32
+ </code></pre>
33
+
34
+ <h2>Tutorial</h2>
35
+
36
+ <p>So you want to know more about it.</p>
37
+
38
+ <p>OK, if the URL passed to the find method is itself a feed, only that feed URL will be returned.</p>
39
+
40
+ <pre><code>&gt;&gt; Feedbag.find "github.com/damog.atom"
41
+ =&gt; ["http://github.com/damog.atom"]
42
+ &gt;&gt;
43
+ </code></pre>
44
+
45
+ <p>Otherwise, it always returns LINK feeds first and A (anchor tag) feeds later. Among A feeds, the ones hosted on the same host as the URL have higher priority:</p>
46
+
47
+ <pre><code>&gt;&gt; Feedbag.find "http://ve.planetalinux.org"
48
+ =&gt; ["http://feedproxy.google.com/PlanetaLinuxVenezuela", "http://rendergraf.wordpress.com/feed/", "http://rootweiller.wordpress.com/feed/", "http://skatox.com/blog/feed/", "http://kodegeek.com/atom.xml", "http://blog.0x29.com.ve/?feed=rss2&amp;cat=8"]
49
+ &gt;&gt;
50
+ </code></pre>
51
+
52
+ <p>In your application, you should usually take only the first element (or first few elements) of the array:</p>
53
+
54
+ <pre><code>&gt;&gt; Feedbag.find("planet.debian.org").first(3)
55
+ =&gt; ["http://planet.debian.org/rss10.xml", "http://planet.debian.org/rss20.xml", "http://planet.debian.org/atom.xml"]
56
+ &gt;&gt;
57
+ </code></pre>
58
+
59
+ <p>(Try running that same example without the "first" method. That example's host is a blog aggregator, so it has hundreds of feed URLs:)</p>
60
+
61
+ <pre><code>&gt;&gt; Feedbag.find("planet.debian.org").size
62
+ =&gt; 104
63
+ &gt;&gt;
64
+ </code></pre>
65
+
66
+ <p>Feedbag will find them all, but it returns the most important ones as the first elements of the returned array.</p>
67
+
68
+ <pre><code>&gt;&gt; Feedbag.find("cnn.com")
69
+ =&gt; ["http://rss.cnn.com/rss/cnn_topstories.rss", "http://rss.cnn.com/rss/cnn_latest.rss", "http://rss.cnn.com/services/podcasting/robinmeade/rss.xml"]
70
+ &gt;&gt;
71
+ </code></pre>
72
+
73
+ <h3>Why should you use it?</h3>
74
+
75
+ <ul>
76
+ <li>Because it's cool.</li>
77
+ <li>Because it only uses <a href="https://code.whytheluckystiff.net/hpricot/">Hpricot</a> as a dependency.</li>
78
+ <li>Because it follows modern feed filename conventions (like those used by WordPress blogs, Blogger, etc.).</li>
79
+ <li>Because it's a single file you can embed easily in your application.</li>
80
+ <li>Because it passes most of Mark Pilgrim's <a href="http://diveintomark.org/tests/client/autodiscovery/">Atom auto-discovery test suite</a>. It doesn't pass them all because some of those tests are broken (citation needed).</li>
81
+ </ul>
82
+
83
+ <h3>Why did I build it?</h3>
84
+
85
+ <ul>
86
+ <li>Because I liked Benjamin Trott's <a href="http://search.cpan.org/~btrott/Feed-Find-0.06/lib/Feed/Find.pm">Feed::Find</a>.</li>
87
+ <li>Because I thought it would be good to have Feed::Find's functionality in Ruby.</li>
88
+ <li>Because I thought it was going to be easy to maintain.</li>
89
+ <li>Because I was going to use it on <a href="http://github.com/damog/rfeed">rFeed</a>.</li>
90
+ <li>And finally, because I didn't know <a href="http://rfeedfinder.rubyforge.org/">rfeedfinder</a> existed :-)</li>
91
+ </ul>
92
+
93
+ <h3>Bugs</h3>
94
+
95
+ <p>Please report bugs to <a href="mailto:rt@support.axiombox.com">rt@support.axiombox.com</a> or directly to the author.</p>
96
+
97
+ <h3>Contribute</h3>
98
+
99
+ <blockquote>
100
+ <p>git clone git://github.com/damog/feedbag.git</p>
101
+ </blockquote>
102
+
103
+ <p>...patch, build, hack and make pull requests. I'll be glad.</p>
104
+
105
+ <h3>Author</h3>
106
+
107
+ <p><a href="http://damog.net/">David Moreno</a> &lt;<a href="mailto:david@axiombox.com">david@axiombox.com</a>&gt;.</p>
108
+
109
+ <h3>Copyright</h3>
110
+
111
+ <p>This is free software. See <a href="http://github.com/damog/feedbag/master/COPYING">COPYING</a> for more information.</p>
112
+
113
+ <h3>Thanks</h3>
114
+
115
+ <p><a href="http://maggit.net">Raquel</a>, for making <a href="http://axiombox.com">Axiombox</a> and most of my dreams possible. Also, <a href="http://github.com">GitHub</a> for making a nice code sharing service that doesn't suck.</p>
data/lib/feedbag.rb ADDED
@@ -0,0 +1,226 @@
1
+ #!/usr/bin/ruby
2
+
3
+ # Copyright (c) 2012 David Moreno <david@axiombox.com>
4
+ #
5
+ # Permission is hereby granted, free of charge, to any person obtaining
6
+ # a copy of this software and associated documentation files (the
7
+ # "Software"), to deal in the Software without restriction, including
8
+ # without limitation the rights to use, copy, modify, merge, publish,
9
+ # distribute, sublicense, and/or sell copies of the Software, and to
10
+ # permit persons to whom the Software is furnished to do so, subject to
11
+ # the following conditions:
12
+ #
13
+ # The above copyright notice and this permission notice shall be
14
+ # included in all copies or substantial portions of the Software.
15
+ #
16
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ # LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ # WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
23
+
24
+ require "rubygems"
25
+ require "hpricot"
26
+ require "open-uri"
27
+ require "net/http"
28
+ require 'timeout'
29
+ require 'iconv' if RUBY_VERSION < '1.9'
30
+
31
+ module Feedbag
32
+ Feed = Struct.new(:url, :title, :human_url, :description)
33
+
34
+ @content_types = [
35
+ 'application/x.atom+xml',
36
+ 'application/atom+xml',
37
+ 'application/xml',
38
+ 'text/xml',
39
+ 'application/rss+xml',
40
+ 'application/rdf+xml',
41
+ ]
42
+
43
+ $feeds = []
44
+ $base_uri = nil
45
+
46
+ def self.feed?(url)
47
+ # use LWR::Simple.normalize some time
48
+ url_uri = URI.parse(url)
49
+ url = "#{url_uri.scheme or 'http'}://#{url_uri.host}#{url_uri.path}"
50
+ url << "?#{url_uri.query}" if url_uri.query
51
+
52
+ # hack:
53
+ url.sub!(/^feed:\/\//, 'http://')
54
+
55
+ res = self.find(url)
56
+ if res.size == 1 and res.first == url
57
+ return true
58
+ else
59
+ return false
60
+ end
61
+ end
62
+
63
+ def self.find(url, args = {})
64
+ $feeds = []
65
+
66
+ url_uri = URI.parse(url)
67
+ url = nil
68
+ if url_uri.scheme.nil?
69
+ url = "http://#{url_uri.to_s}"
70
+ elsif url_uri.scheme == "feed"
71
+ return self.add_feed(url_uri.to_s.sub(/^feed:\/\//, 'http://'), nil)
72
+ else
73
+ url = url_uri.to_s
74
+ end
75
+ #url = "#{url_uri.scheme or 'http'}://#{url_uri.host}#{url_uri.path}"
76
+
77
+ #return self.add_feed(url, nil) if looks_like_feed? url
78
+
79
+ # check whether the feed_validator gem is available
80
+ begin
81
+ require "feed_validator"
82
+ v = W3C::FeedValidator.new
83
+ v.validate_url(url)
84
+ return self.add_feed(url, nil) if v.valid?
85
+ rescue LoadError
86
+ # feed_validator is not installed; skip validation
87
+ rescue REXML::ParseException
88
+ # usually indicates timeout
89
+ # TODO: actually find out timeout. use Terminator?
90
+ # $stderr.puts "Feed looked like feed but might not have passed validation or timed out"
91
+ rescue => ex
92
+ $stderr.puts "#{ex.class} error occurred with: `#{url}': #{ex.message}"
93
+ end
94
+
95
+ begin
96
+ Timeout::timeout(15) do
97
+ html = open(url) do |f|
98
+ content_type = f.content_type.downcase
99
+ if content_type == "application/octet-stream" # open failed
100
+ content_type = f.meta["content-type"].gsub(/;.*$/, '')
101
+ end
102
+ if @content_types.include?(content_type)
103
+ return self.add_feed(url, nil)
104
+ end
105
+
106
+ if RUBY_VERSION < '1.9'
107
+ ic = Iconv.new('UTF-8//IGNORE', f.charset)
108
+ doc = Hpricot(ic.iconv(f.read))
109
+ else
110
+ doc = Hpricot(f.read)
111
+ end
112
+
113
+ if doc.at("base") and doc.at("base")["href"]
114
+ $base_uri = doc.at("base")["href"]
115
+ else
116
+ $base_uri = nil
117
+ end
118
+
119
+ title = (doc/:title).first
120
+ title = title.innerHTML if title
121
+
122
+ description = (doc/:description).first
123
+ description = description.innerHTML if description
124
+
125
+ # first with links
126
+ (doc/"atom:link").each do |l|
127
+ next unless l["rel"]
128
+ if l["type"] and @content_types.include?(l["type"].downcase.strip) and l["rel"].downcase == "self"
129
+ self.add_feed(l["href"], url, $base_uri, title, description || title)
130
+ end
131
+ end
132
+
133
+ (doc/"link").each do |l|
134
+ next unless l["rel"]
135
+ if l["type"] and @content_types.include?(l["type"].downcase.strip) and (l["rel"].downcase =~ /alternate/i or l["rel"] == "service.feed")
136
+ self.add_feed(l["href"], url, $base_uri, title, description || title)
137
+ end
138
+ end
139
+
140
+ (doc/"a").each do |a|
141
+ next unless a["href"]
142
+ if self.looks_like_feed?(a["href"]) and (a["href"] =~ /\// or a["href"] =~ /#{url_uri.host}/)
143
+ title = a["title"] || a.inner_html || a['alt'] || title
144
+ self.add_feed(a["href"], url, $base_uri, title, description || title)
145
+ end
146
+ end
147
+
148
+ (doc/"a").each do |a|
149
+ next unless a["href"]
150
+ if self.looks_like_feed?(a["href"])
151
+ title = a["title"] || a.inner_html || a['alt'] || title
152
+ self.add_feed(a["href"], url, $base_uri, title, description || title)
153
+ end
154
+ end
155
+
156
+ # Added support for feeds like http://tabtimes.com/tbfeed/mashable/full.xml
157
+ if url.match(/\.xml$/) and doc.root and doc.root["xml:base"] and doc.root["xml:base"].strip == url.strip
158
+ self.add_feed(url, url, $base_uri, title, description)
159
+ end
160
+ end
161
+ end
162
+ rescue Timeout::Error => err
163
+ $stderr.puts "Timeout error occurred with `#{url}': #{err}"
164
+ rescue OpenURI::HTTPError => the_error
165
+ $stderr.puts "Error occurred with `#{url}': #{the_error}"
166
+ rescue SocketError => err
167
+ $stderr.puts "Socket error occurred with: `#{url}': #{err}"
168
+ rescue => ex
169
+ $stderr.puts "#{ex.class} error occurred with: `#{url}': #{ex.message}"
170
+ ensure
171
+ return $feeds
172
+ end
173
+ end
174
+
175
+ def self.looks_like_feed?(url)
176
+ if url =~ /((\.|\/)(rdf|xml|rss)$|feed=(rss|atom)|(atom|feed)\/?$)/i
177
+ true
178
+ else
179
+ false
180
+ end
181
+ end
182
+
183
+ def self.add_feed(feed_url, orig_url, base_uri = nil, title = "", description = "")
184
+ # puts "#{feed_url} - #{orig_url}"
185
+ url = feed_url.sub(/^feed:/, '').strip
186
+
187
+ if base_uri
188
+ # url = base_uri + feed_url
189
+ url = URI.parse(base_uri).merge(feed_url).to_s
190
+ end
191
+
192
+ begin
193
+ uri = URI.parse(url)
194
+ rescue
195
+ puts "Error with `#{url}'"
196
+ exit 1
197
+ end
198
+ unless uri.absolute?
199
+ orig = URI.parse(orig_url)
200
+ url = orig.merge(url).to_s
201
+ end
202
+
203
+ # verify url is really valid
204
+ $feeds.push(Feed.new(url, title, orig_url, description)) unless $feeds.any? { |f| f.url == url }# if self._is_http_valid(URI.parse(url), orig_url)
205
+ end
206
+
207
+ # not used. yet.
208
+ def self._is_http_valid(uri, orig_url)
209
+ req = Net::HTTP.get_response(uri)
210
+ orig_uri = URI.parse(orig_url)
211
+ case req
212
+ when Net::HTTPSuccess then
213
+ return true
214
+ else
215
+ return false
216
+ end
217
+ end
218
+ end
219
+
220
+ if __FILE__ == $0
221
+ if ARGV.size == 0
222
+ puts 'usage: feedbag url'
223
+ else
224
+ puts Feedbag.find ARGV.first
225
+ end
226
+ end
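The anchor-tag pass in `find` depends on `looks_like_feed?`, a purely textual check on the link's `href`. A standalone sketch of that heuristic, using a pattern equivalent to the one in `lib/feedbag.rb`; the sample URLs are made up for illustration:

```ruby
# Standalone copy of the URL heuristic: a feed-like file extension or path
# segment, a "feed=rss"/"feed=atom" query parameter, or a trailing "atom"/"feed".
def looks_like_feed?(url)
  pattern = %r{((\.|/)(rdf|xml|rss)$|feed=(rss|atom)|(atom|feed)/?$)}i
  !!(url =~ pattern)
end

samples = [
  "http://example.com/index.rss",             # extension match
  "http://example.com/feed/",                 # trailing "feed/"
  "http://blog.example.com/?feed=rss2&cat=8", # WordPress-style query
  "http://example.com/about.html",            # ordinary page
  "http://example.com/data?format=rss",       # "rss" not preceded by "." or "/"
  "http://example.com/newsfeed"               # false positive: ends in "feed"
]

samples.each { |u| puts "#{looks_like_feed?(u)}\t#{u}" }
```

Because the check never fetches the link, it can both miss feeds (the `format=rss` case) and accept non-feeds (the `newsfeed` case).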
data/lib/feedbagtoo.rb ADDED
@@ -0,0 +1 @@
1
+ require 'feedbag'
data/rails/init.rb ADDED
@@ -0,0 +1 @@
1
+ require File.join File.dirname(__FILE__), "..", "lib", "feedbag"
data/test/atom_autodiscovery_test.rb ADDED
@@ -0,0 +1,40 @@
1
+ require File.dirname(__FILE__) + '/test_helper'
2
+
3
+ class AtomAutoDiscoveryTest < Test::Unit::TestCase
4
+ def test_autodisc
5
+ base_url = "http://diveintomark.org/tests/client/autodiscovery/"
6
+ url = base_url + "html4-001.html"
7
+
8
+ i = 1
9
+ puts "trying now with #{url}"
10
+ while(i)
11
+ puts
12
+ i = 0 # unless otherwise found
13
+
14
+ f = Feedbag.find url
15
+
16
+ assert_instance_of Array, f
17
+ assert f.size == 1, "Feedbag didn't find a feed on #{url} or found more than one"
18
+
19
+ puts " found #{f[0]}"
20
+ feed = Hpricot(open(f[0]))
21
+
22
+ (feed/"link").each do |l|
23
+ next unless l["rel"] == "alternate"
24
+ assert_equal l["href"], url
25
+ end
26
+
27
+ # now move on to the next one
28
+ html = Hpricot(open(url))
29
+ (html/"link").each do |l|
30
+ next unless l["rel"] == "next"
31
+ url = URI.parse(base_url).merge(l["href"]).to_s
32
+ puts "trying now with #{url}"
33
+ i = 1
34
+ end
35
+
36
+ end
37
+ end
38
+
39
+
40
+ end
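Both this test and `Feedbag.add_feed` resolve relative links with `URI#merge` from the standard library. A short sketch of how that resolution behaves for the three common `href` shapes; the file names passed to `merge` are hypothetical:

```ruby
require "uri"

base = URI.parse("http://diveintomark.org/tests/client/autodiscovery/")

# A bare file name resolves against the base directory.
relative = base.merge("html4-002.html").to_s

# A leading slash resolves against the host root.
root_path = base.merge("/feed.xml").to_s

# An absolute URL replaces the base entirely.
absolute = base.merge("http://example.com/atom.xml").to_s

puts relative, root_path, absolute
```

This is why a `<base href>` or the original page URL is enough to turn any discovered `href` into an absolute feed URL.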
data/test/feedbag_test.rb ADDED
@@ -0,0 +1,47 @@
1
+ require File.dirname(__FILE__) + '/test_helper'
2
+ class FeedbagTest < ActiveSupport::TestCase
3
+
4
+ test "Feedbag.feed? should know that an RSS url is a feed" do
5
+ rss_url = 'http://example.com/rss/'
6
+ Feedbag.stubs(:find).with(rss_url).returns([rss_url])
7
+
8
+ assert Feedbag.feed?(rss_url)
9
+ end
10
+
11
+ test "Feedbag.feed? should know that an RSS url with parameters is a feed" do
12
+ rss_url = "http://example.com/data?format=rss"
13
+ Feedbag.stubs(:find).with(rss_url).returns([rss_url])
14
+
15
+ assert Feedbag.feed?(rss_url)
16
+ end
17
+
18
+ test "Feedbag find should discover feeds containing atom:link" do
19
+ feeds = []
20
+ feeds << 'http://www.psfk.com/feeds/mashable'
21
+ feeds << 'http://jenniferlynch.wordpress.com/feed'
22
+ feeds << 'http://lurenbijdeburen.wordpress.com/feed'
23
+
24
+ feeds.each do |url|
25
+ assert_equal url, Feedbag.find(url).first.url
26
+ end
27
+ end
28
+
29
+ test "Feedbag find should discover feeds from site" do
30
+ feeds = []
31
+ feeds << 'http://www.justinball.com/'
32
+
33
+ feeds.each do |url|
34
+ assert_equal 'http://www.justinball.com/feed/', Feedbag.find(url).first.url
35
+ end
36
+ end
37
+
38
+ test "Feedbag find should discover feeds from xml" do
39
+ feeds = []
40
+ feeds << 'http://tabtimes.com/tbfeed/mashable/full.xml'
41
+
42
+ feeds.each do |url|
43
+ assert_equal url, Feedbag.find(url).first.url
44
+ end
45
+ end
46
+
47
+ end
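The first two tests stub `find` with the exact URL string, so they pass only if `feed?` rebuilds the URL without changing it. A sketch of that rebuilding step, extracted into a hypothetical helper named `rebuild_url` (not a method of the gem), shows which parts of a URL survive:

```ruby
require "uri"

# Hypothetical helper mirroring how Feedbag.feed? reassembles the URL:
# scheme (defaulting to http), host, path, and the query string if present.
def rebuild_url(url)
  uri = URI.parse(url)
  rebuilt = "#{uri.scheme || 'http'}://#{uri.host}#{uri.path}"
  rebuilt += "?#{uri.query}" if uri.query
  rebuilt
end

puts rebuild_url("http://example.com/rss/")
puts rebuild_url("http://example.com/data?format=rss")
# Port and fragment are not part of the rebuilt string.
puts rebuild_url("http://example.com:8080/feed?x=1#top")
```

Under this reassembly, a URL with an explicit port or fragment no longer equals its rebuilt form, which may matter for the equality check in `feed?`.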
data/test/test_helper.rb ADDED
@@ -0,0 +1,17 @@
1
+ require 'rubygems'
2
+
3
+ require 'test/unit'
4
+
5
+
6
+ require 'active_support'
7
+ require 'active_support/test_case'
8
+
9
+ require 'mocha'
10
+
11
+ require File.dirname(__FILE__) + '/../lib/feedbag'
12
+
13
+ if RUBY_VERSION < '1.9'
14
+ require 'ruby-debug'
15
+ else
16
+ require 'debugger'
17
+ end
metadata ADDED
@@ -0,0 +1,206 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: feedbagtoo
3
+ version: !ruby/object:Gem::Version
4
+ hash: 7
5
+ prerelease:
6
+ segments:
7
+ - 0
8
+ - 7
9
+ - 2
10
+ version: 0.7.2
11
+ platform: ruby
12
+ authors:
13
+ - Axiombox
14
+ - David Moreno
15
+ - Joel Duffin
16
+ - Justin Ball
17
+ - Fabien Penso
18
+ autorequire:
19
+ bindir: bin
20
+ cert_chain: []
21
+
22
+ date: 2012-07-18 00:00:00 -06:00
23
+ default_executable: feedbag
24
+ dependencies:
25
+ - !ruby/object:Gem::Dependency
26
+ prerelease: false
27
+ type: :development
28
+ requirement: &id001 !ruby/object:Gem::Requirement
29
+ none: false
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ hash: 3
34
+ segments:
35
+ - 0
36
+ version: "0"
37
+ name: growl
38
+ version_requirements: *id001
39
+ - !ruby/object:Gem::Dependency
40
+ prerelease: false
41
+ type: :development
42
+ requirement: &id002 !ruby/object:Gem::Requirement
43
+ none: false
44
+ requirements:
45
+ - - ~>
46
+ - !ruby/object:Gem::Version
47
+ hash: 31
48
+ segments:
49
+ - 3
50
+ - 12
51
+ version: "3.12"
52
+ name: rdoc
53
+ version_requirements: *id002
54
+ - !ruby/object:Gem::Dependency
55
+ prerelease: false
56
+ type: :development
57
+ requirement: &id003 !ruby/object:Gem::Requirement
58
+ none: false
59
+ requirements:
60
+ - - ">="
61
+ - !ruby/object:Gem::Version
62
+ hash: 23
63
+ segments:
64
+ - 1
65
+ - 0
66
+ - 0
67
+ version: 1.0.0
68
+ name: bundler
69
+ version_requirements: *id003
70
+ - !ruby/object:Gem::Dependency
71
+ prerelease: false
72
+ type: :development
73
+ requirement: &id004 !ruby/object:Gem::Requirement
74
+ none: false
75
+ requirements:
76
+ - - ~>
77
+ - !ruby/object:Gem::Version
78
+ hash: 49
79
+ segments:
80
+ - 1
81
+ - 8
82
+ - 3
83
+ version: 1.8.3
84
+ name: jeweler
85
+ version_requirements: *id004
86
+ - !ruby/object:Gem::Dependency
87
+ prerelease: false
88
+ type: :development
89
+ requirement: &id005 !ruby/object:Gem::Requirement
90
+ none: false
91
+ requirements:
92
+ - - ">="
93
+ - !ruby/object:Gem::Version
94
+ hash: 3
95
+ segments:
96
+ - 0
97
+ version: "0"
98
+ name: active_support
99
+ version_requirements: *id005
100
+ - !ruby/object:Gem::Dependency
101
+ prerelease: false
102
+ type: :development
103
+ requirement: &id006 !ruby/object:Gem::Requirement
104
+ none: false
105
+ requirements:
106
+ - - ">="
107
+ - !ruby/object:Gem::Version
108
+ hash: 3
109
+ segments:
110
+ - 0
111
+ version: "0"
112
+ name: mocha
113
+ version_requirements: *id006
114
+ - !ruby/object:Gem::Dependency
115
+ prerelease: false
116
+ type: :development
117
+ requirement: &id007 !ruby/object:Gem::Requirement
118
+ none: false
119
+ requirements:
120
+ - - ">="
121
+ - !ruby/object:Gem::Version
122
+ hash: 3
123
+ segments:
124
+ - 0
125
+ version: "0"
126
+ name: hpricot
127
+ version_requirements: *id007
128
+ - !ruby/object:Gem::Dependency
129
+ prerelease: false
130
+ type: :development
131
+ requirement: &id008 !ruby/object:Gem::Requirement
132
+ none: false
133
+ requirements:
134
+ - - ">="
135
+ - !ruby/object:Gem::Version
136
+ hash: 3
137
+ segments:
138
+ - 0
139
+ version: "0"
140
+ name: ruby-debug
141
+ version_requirements: *id008
142
+ description: This gem will return title and url for each feed discovered at a given url
143
+ email: justin@tatemae.com
144
+ executables:
145
+ - feedbag
146
+ extensions: []
147
+
148
+ extra_rdoc_files:
149
+ - ChangeLog
150
+ - README.markdown
151
+ - TODO
152
+ files:
153
+ - COPYING
154
+ - ChangeLog
155
+ - Gemfile
156
+ - Gemfile.lock
157
+ - README.markdown
158
+ - Rakefile
159
+ - TODO
160
+ - VERSION
161
+ - benchmark/rfeedfinder_benchmark.rb
162
+ - bin/feedbag
163
+ - feedbagtoo.gemspec
164
+ - index.html
165
+ - lib/feedbag.rb
166
+ - lib/feedbagtoo.rb
167
+ - rails/init.rb
168
+ - test/atom_autodiscovery_test.rb
169
+ - test/feedbag_test.rb
170
+ - test/test_helper.rb
171
+ has_rdoc: true
172
+ homepage: http://github.com/tatemae/feedbagtoo
173
+ licenses: []
174
+
175
+ post_install_message:
176
+ rdoc_options: []
177
+
178
+ require_paths:
179
+ - lib
180
+ required_ruby_version: !ruby/object:Gem::Requirement
181
+ none: false
182
+ requirements:
183
+ - - ">="
184
+ - !ruby/object:Gem::Version
185
+ hash: 3
186
+ segments:
187
+ - 0
188
+ version: "0"
189
+ required_rubygems_version: !ruby/object:Gem::Requirement
190
+ none: false
191
+ requirements:
192
+ - - ">="
193
+ - !ruby/object:Gem::Version
194
+ hash: 3
195
+ segments:
196
+ - 0
197
+ version: "0"
198
+ requirements: []
199
+
200
+ rubyforge_project:
201
+ rubygems_version: 1.6.2
202
+ signing_key:
203
+ specification_version: 3
204
+ summary: Fork of the feedbag gem that returns title along with url.
205
+ test_files: []
206
+