feedtosis 0.0.3.6
- data/LICENSE +20 -0
- data/README.rdoc +135 -0
- data/Rakefile +28 -0
- data/feedtosis.gemspec +48 -0
- data/lib/extensions/core/hash.rb +7 -0
- data/lib/extensions/feed_normalizer/feed_instance_methods.rb +17 -0
- data/lib/feedtosis.rb +17 -0
- data/lib/feedtosis/client.rb +173 -0
- data/lib/feedtosis/result.rb +34 -0
- data/spec/extensions/feed_normalizer/feed_instance_methods_spec.rb +12 -0
- data/spec/feedtosis/client_spec.rb +162 -0
- data/spec/feedtosis/result_spec.rb +34 -0
- data/spec/fixtures/http_headers/wooster.txt +19 -0
- data/spec/fixtures/xml/older_wooster.xml +203 -0
- data/spec/fixtures/xml/wooster.xml +215 -0
- data/spec/spec_helper.rb +31 -0
- metadata +117 -0
data/LICENSE
ADDED
@@ -0,0 +1,20 @@
Copyright (c) 2009 Justin S. Leitgeb

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.rdoc
ADDED
@@ -0,0 +1,135 @@
= Description

Feedtosis fetches RSS and Atom feeds with an easy-to-use interface. It uses
FeedNormalizer for parsing and Curb for fetching. It helps by automatically
using conditional HTTP GET requests, and by reliably pointing out which
entries are new in any given feed.

Feedtosis takes care of the book-keeping around feed fetching so that
things like HTTP conditional GET are trivial to use. It has a simple
interface, and remains a lightweight component that delegates to
FeedNormalizer for parsing feeds and the fantastic taf2-curb library for
fetching them.

== Installation

Assuming that you've followed the directions on gems.github.com to allow your
computer to install gems from GitHub, the following command will install the
Feedtosis library:

  sudo gem install jsl-feedtosis

== Usage

Feedtosis is easy to use. Just create a client object and invoke the
+fetch+ method:

  require 'feedtosis'
  client = Feedtosis::Client.new('http://feeds.feedburner.com/wooster')
  result = client.fetch

+result+ will be a Feedtosis::Result object which delegates methods to both
the FeedNormalizer::Feed object and the Curl::Easy object used to fetch the
feed. Useful methods on this object include +entries+, +new_entries+ and
+response_code+, among many others (essentially all of the methods that
FeedNormalizer::Feed and Curl::Easy objects respond to can be called
directly, minus the setter methods for these objects).
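The delegation pattern described above can be sketched in plain Ruby. This is an illustration of the idea, not the actual Feedtosis::Result source; +DelegatingResult+ and the OpenStruct stand-ins are hypothetical:

```ruby
require 'ostruct'

# Illustrative sketch: a result object that forwards reader methods to the
# parsed feed first, then to the HTTP handle, as the README describes.
class DelegatingResult
  def initialize(http, feed)
    @http = http
    @feed = feed
  end

  def method_missing(name, *args, &block)
    # Pick whichever delegate responds to the message.
    receiver = [@feed, @http].detect { |o| o.respond_to?(name) }
    receiver ? receiver.__send__(name, *args, &block) : super
  end

  def respond_to_missing?(name, include_private = false)
    @feed.respond_to?(name) || @http.respond_to?(name) || super
  end
end

http = OpenStruct.new(:response_code => 200)
feed = OpenStruct.new(:entries => ['entry'])
res  = DelegatingResult.new(http, feed)
res.entries        # => ["entry"]
res.response_code  # => 200
```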

Note that since Feedtosis uses HTTP conditional GET, it may not actually
have received a full XML response from the server suitable for being parsed
into entries. In this case, methods such as +entries+ on the Feedtosis::Result
will return +nil+. Depending on your application logic, you may want to
inspect the methods delegated to the Curl::Easy object, such as
+response_code+, for more information on what happened in these cases.

Remember that a response code of 304 means "Not Modified". In this case, you
should expect +entries+ and +new_entries+ to be +nil+, since the resource
wasn't downloaded, in keeping with the logic of HTTP conditional GET.
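A typical way to branch on the outcome is sketched below. The +handle_result+ helper and the stubbed result object are hypothetical illustrations, not part of the Feedtosis API:

```ruby
# Stand-in for a Feedtosis::Result: only the two readers used below.
Result = Struct.new(:response_code, :new_entries)

# Hypothetical helper showing the 200-vs-304 branching described above.
def handle_result(result)
  case result.response_code
  when 200
    # Full body fetched and parsed; new_entries holds the unseen items.
    result.new_entries || []
  when 304
    # Not Modified: no body was downloaded, so there is nothing to parse.
    []
  else
    # Other codes (404, 500, ...) can be inspected via the delegated
    # Curl::Easy methods; here we simply treat them as "nothing new".
    []
  end
end

handle_result(Result.new(304, nil))  # => []
```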

On subsequent requests of a particular resource, Feedtosis will update
+new_entries+ to contain the feed entries that we haven't seen yet. In most
applications, your program will call the same batch of URLs multiple times,
and process the elements in +new_entries+.

You will most likely want to allow Feedtosis to remember details about the
last retrieval of a feed after the client is removed from memory. Feedtosis
uses Moneta, a unified interface to key-value storage systems, to remember
"summaries" of feeds that it has seen in the past. See the section on
Customization for more details on how to configure this system.

== Customization

Feedtosis stores summaries of feeds in a key-value storage system. If no
options are included when creating a new Feedtosis::Client object, the
default is to use a "memory" storage system. The memory system is just a
basic Ruby Hash, so it won't keep track of feeds after a particular Client is
removed from memory. To configure a different backend, pass an options Hash
to the Feedtosis client initialization:

  url = "http://newsrss.bbc.co.uk/rss/newsonline_world_edition/south_asia/rss.xml"
  f = Feedtosis::Client.new(url, :backend => Moneta::Memcache.new(:server => 'localhost:1978'))
  res = f.fetch

This example sets up a Memcache-protocol backend, which in this case points
to Tokyo Tyrant on port 1978.

Generally, Feedtosis supports all storage systems supported by Moneta, and
any one of them can be given as the +backend+ option. Options following
+backend+ are passed directly to Moneta for configuration.

== Implementation

Feedtosis helps to identify new feed entries and to figure out when
conditional GET can be used in retrieving resources. In order to accomplish
this without requiring the user to store information such as ETags and the
date of the last retrieved entry, Feedtosis stores a summary structure in
the configured key-value store (backed by Moneta). To do conditional GET
requests, Feedtosis stores the Last-Modified date as well as the ETag of the
last request in the summary structure, which is put in a namespaced element
consisting of the term 'feedtosis' (bet you won't have to worry about name
collisions on that one!) and the MD5 of the URL retrieved.
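The key construction mirrors +key_for_cached+ in client.rb. A minimal sketch (client.rb uses the legacy MD5 class; Digest::MD5 is the modern stdlib equivalent):

```ruby
require 'digest/md5'

# Sketch of the namespaced storage key: the 'feedtosis' prefix joined with
# the MD5 hex digest of the feed URL.
def key_for_cached(url, namespace = 'feedtosis')
  [namespace, Digest::MD5.hexdigest(url)].join('_')
end

key = key_for_cached('http://feeds.feedburner.com/wooster')
# key is "feedtosis_" followed by 32 hex characters
```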

It can also be a bit tricky to decipher which feed entries are new, since
many feed sources don't include unique ids with their feeds. Feedtosis
reliably keeps track of which entries in a feed are new by storing (in the
summary hash mentioned above) an MD5 signature of each entry in a feed. It
takes elements such as the published-at date, title and content and generates
the MD5 of these elements. This allows Feedtosis to cheaply compute (both in
terms of computation and storage) which feed entries should be presented to
the user as "new". Below is an example of a summary structure:

  {
    :etag => "4c8f-46ac09fbbe940",
    :last_modified => "Mon, 25 May 2009 18:17:33 GMT",
    :digests => [["f2993783ded928637ce5f2dc2d837f10", "da64efa6dd9ce34e5699b9efe73a37a7"]]
  }
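The per-entry signature described above can be sketched as follows, mirroring +digest_for+ in client.rb. The +Entry+ Struct is a stand-in for FeedNormalizer::Entry, and Digest::MD5 stands in for the legacy MD5 class used in the source:

```ruby
require 'digest/md5'

# Minimal stand-in for FeedNormalizer::Entry with the fields that matter.
Entry = Struct.new(:title, :content, :date_published)

# MD5 over a few identifying fields; enough to detect recurring entries
# cheaply without storing the entries themselves.
def digest_for(entry)
  Digest::MD5.hexdigest([entry.title, entry.content, entry.date_published].join)
end

e = Entry.new('Post title', '<p>Body</p>', 'Mon, 25 May 2009 18:17:33 GMT')
digest_for(e)  # 32-character hex string
```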

The data stored by Feedtosis in the summary structure allows it to be
helpful to the user without storing lots of data that are unnecessary for
efficient functioning.

The summary structure keeps an Array of Arrays containing digests of feeds.
The reason is that some feeds, such as the Google blog search feeds, contain
slightly different but often-recurring results in the result set. Feedtosis
therefore keeps complete sets of entry digests for a number of previous feed
retrievals. The number of digest sets retained is configurable by setting the
:retained_digest_size option on Feedtosis client initialization.
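The retention and union behavior can be sketched in a few lines. These helper names are illustrative, mirroring what +store_summary_to_backend+ and +summary_digests+ do in client.rb:

```ruby
# Each fetch unshifts its digest set onto the front of the list, and the
# list is trimmed to the retained size (mirroring store_summary_to_backend).
def push_digest_set(digest_sets, new_set, retained_digest_size = 10)
  digest_sets.unshift(new_set)[0..retained_digest_size]
end

# "Seen" digests are the union across all retained sets (mirroring
# summary_digests), so an entry that drops out of one fetch and reappears
# in a later one is still recognized as previously seen.
def known_digests(digest_sets)
  digest_sets.inject([]) { |r, set| r | set }
end

sets = []
sets = push_digest_set(sets, %w[d1 d2])
sets = push_digest_set(sets, %w[d2 d3])
known_digests(sets)  # => ["d2", "d3", "d1"]
```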

== HTML cleaning/sanitizing

Feedtosis doesn't do anything about feed sanitizing, as other libraries have
been built for this purpose. FeedNormalizer has methods for escaping entries,
but to strip HTML I suggest that you look at the Ruby gem "sanitize".

== Credits

Thanks to Sander Hartlage (GitHub: Sander6) for useful feedback early in the
development of Feedtosis.

== Feedback

Please let me know if you have any problems with or questions about
Feedtosis.

= Author

Justin S. Leitgeb, mailto:justin@phq.org
data/Rakefile
ADDED
@@ -0,0 +1,28 @@
require 'rubygems'
require 'spec'

require 'rake'
require 'spec/rake/spectask'
require 'rake/rdoctask'

require 'lib/feedtosis'

desc 'Test the plugin.'
Spec::Rake::SpecTask.new(:spec) do |t|
  t.spec_opts = ["--format", "progress", "--colour"]
  t.libs << 'lib'
  t.verbose = true
end

desc "Run all the tests"
task :default => :spec

desc 'Generate documentation'
Rake::RDocTask.new(:rdoc) do |rdoc|
  rdoc.rdoc_dir = 'rdoc'
  rdoc.title = 'Feedtosis'
  rdoc.options << '--line-numbers' << '--inline-source'
  rdoc.rdoc_files.include('README.rdoc')
  rdoc.rdoc_files.include('lib/feedtosis/**/*.rb')
end
data/feedtosis.gemspec
ADDED
@@ -0,0 +1,48 @@
Gem::Specification.new do |s|
  s.name = %q{feedtosis}
  s.version = "0.0.3.6"

  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
  s.authors = ["Justin Leitgeb"]
  s.date = %q{2009-07-15}
  s.description = %q{Feedtosis finds new information in feeds quickly using smart fetching and matching of previously read entries}
  s.email = %q{justin@phq.org}

  s.files = ["lib/extensions/core/hash.rb",
    "lib/extensions/feed_normalizer/feed_instance_methods.rb",
    "lib/feedtosis/result.rb",
    "lib/feedtosis/client.rb", "lib/feedtosis.rb", "LICENSE",
    "feedtosis.gemspec", "Rakefile", "README.rdoc",
    "spec/extensions/feed_normalizer/feed_instance_methods_spec.rb",
    "spec/fixtures/http_headers/wooster.txt",
    "spec/fixtures/xml/older_wooster.xml", "spec/fixtures/xml/wooster.xml",
    "spec/feedtosis/client_spec.rb",
    "spec/feedtosis/result_spec.rb",
    "spec/spec_helper.rb"]

  s.has_rdoc = true
  s.homepage = %q{http://github.com/jsl/feedtosis}
  s.rdoc_options = ["--charset=UTF-8"]
  s.require_paths = ["lib"]
  s.rubygems_version = %q{1.3.1}
  s.summary = %q{Retrieves feeds using conditional GET and marks entries that you haven't seen before}
  s.test_files = ["spec/spec_helper.rb", "spec/feedtosis/client_spec.rb", "spec/feedtosis/result_spec.rb" ]

  s.extra_rdoc_files = [ "README.rdoc" ]

  s.rdoc_options += [
    '--title', 'Feedtosis',
    '--main', 'README.rdoc',
    '--line-numbers',
    '--inline-source'
  ]

  %w[ taf2-curb jsl-moneta jsl-http_headers feed-normalizer ].each do |dep|
    s.add_dependency(dep)
  end

  if s.respond_to? :specification_version then
    current_version = Gem::Specification::CURRENT_SPECIFICATION_VERSION
    s.specification_version = 2
  end
end
data/lib/extensions/feed_normalizer/feed_instance_methods.rb
ADDED
@@ -0,0 +1,17 @@
# Extends FeedNormalizer::Feed with a method for detecting new_items (aliased
# as new_entries for convenience).
module Feedtosis
  module FeedInstanceMethods

    # Returns only the entries that are new.
    def new_items
      self.entries.select do |e|
        e.instance_variable_get(:@_seen) == false
      end
    end

    alias :new_entries :new_items
  end
end

FeedNormalizer::Feed.__send__(:include, Feedtosis::FeedInstanceMethods)
data/lib/feedtosis.rb
ADDED
@@ -0,0 +1,17 @@
require 'rubygems'

require 'curb'
require 'http_headers'
require 'feed-normalizer'
require 'md5'
require 'uri'

lib_dirs = [ 'extensions', 'feedtosis' ].map do |d|
  File.join(File.dirname(__FILE__), d)
end

lib_dirs.each do |d|
  Dir[File.join(d, "**", "*.rb")].each do |file|
    require file
  end
end
data/lib/feedtosis/client.rb
ADDED
@@ -0,0 +1,173 @@
module Feedtosis

  # Feedtosis::Client is the primary interface to the feed reader. Call it
  # with a url that was previously fetched while connected to the configured
  # backend, and it will 1) only do a retrieval if deemed necessary based on
  # the ETag and modified-at time of the last retrieval and 2) mark all
  # entries retrieved as either new or not new. Entries retrieved are
  # normalized using the feed-normalizer gem.
  class Client
    attr_reader :url, :options, :backend

    DEFAULTS = {
      :backend => Hash.new,

      # The namespace will be prefixed to the key used for storage of the summary value. Based on your
      # application needs, it may be useful to provide a custom prefix with initialization options.
      :namespace => 'feedtosis',

      # Some feed aggregators that we may be pulling from have entries that are present in one fetch and
      # then disappear (Google blog search does this). For these cases, we can't rely on only the digests of
      # the last fetch to guarantee "newness" of a feed that we may have previously consumed. We keep a
      # number of previous sets of digests in order to make sure that we mark correct feeds as "new".
      :retained_digest_size => 10
    } unless defined?(DEFAULTS)

    # Initializes a new feedtosis library. It must be initialized with a valid URL as the first argument.
    # A following optional +options+ Hash may take the arguments:
    # * backend: a key-value store to be used for summary structures of feeds fetched. Moneta backends work well, but any object acting like a Hash is valid.
    # * retained_digest_size: an Integer specifying the number of previous MD5 sets of entries to keep, used for new feed detection
    def initialize(url, options = { })
      @url = url

      raise ArgumentError, "Feedtosis::Client options must be in Hash form if provided" unless options.is_a?(Hash)
      @options = options.reverse_merge(DEFAULTS)

      @backend = @options[:backend]

      unless @url.match(URI.regexp('http'))
        raise ArgumentError, "Url #{@url} is not valid!"
      end

      unless @backend.respond_to?(:[]) && @backend.respond_to?(:[]=)
        raise ArgumentError, "Backend needs to be a key-value store"
      end
    end

    # Retrieves the latest entries from this feed. Returns a Feedtosis::Result
    # object which delegates methods to the Curl::Easy object making the request
    # and the FeedNormalizer::Feed object that may have been created from the
    # HTTP response body.
    def fetch
      curl = build_curl_easy
      curl.perform
      feed = process_curl_response(curl)
      Feedtosis::Result.new(curl, feed)
    end

    private

    # Marks entries as either seen or not seen based on the unique signature of
    # the entry, which is calculated by taking the MD5 of common attributes.
    def mark_new_entries(response)
      digests = summary_digests

      # For each entry in the response object, mark @_seen as false if the
      # digest of this entry doesn't exist in the cached object.
      response.entries.each do |e|
        seen = digests.include?(digest_for(e))
        e.instance_variable_set(:@_seen, seen)
      end

      response
    end

    # Returns an Array of summary digests for this feed. Since we keep a number of sets
    # of digests, inject across these sets to accumulate unique identifiers.
    def summary_digests
      summary_for_feed[:digests].inject([]) do |r, e|
        r |= e
      end.uniq
    end

    # Processes the results by identifying which entries are new if the response
    # is a 200. Otherwise, returns the Curl::Easy object for the user to inspect.
    def process_curl_response(curl)
      if curl.response_code == 200
        response = parser_for_xml(curl.body_str)
        response = mark_new_entries(response)
        store_summary_to_backend(response, curl)
        response
      end
    end

    # Sets options for the Curl::Easy object, including parameters for HTTP
    # conditional GET.
    def build_curl_easy
      curl = new_curl_easy(@url)

      # Many feeds have a 302 redirect to another URL. For more recent versions
      # of Curl, we need to specify this.
      curl.follow_location = true

      set_header_options(curl)
    end

    def new_curl_easy(url)
      Curl::Easy.new(url)
    end

    # Returns the summary hash for this feed from the backend store.
    def summary_for_feed
      @backend[key_for_cached] || { :digests => [ ] }
    end

    # Sets the headers from the backend, if available
    def set_header_options(curl)
      summary = summary_for_feed

      unless summary.nil?
        curl.headers['If-None-Match'] = summary[:etag] unless summary[:etag].nil?
        curl.headers['If-Modified-Since'] = summary[:last_modified] unless summary[:last_modified].nil?
      end

      curl
    end

    # Returns the key for the storage of the summary structure in the key-value system.
    def key_for_cached
      [ @options[:namespace], MD5.hexdigest(@url) ].join('_')
    end

    # Stores information about the retrieval, including ETag, Last-Modified,
    # and MD5 digests of all entries to the backend store. This enables
    # conditional GET usage on subsequent requests and marking of entries as
    # either new or seen.
    def store_summary_to_backend(feed, curl)
      headers = HttpHeaders.new(curl.header_str)

      # Store info about HTTP retrieval
      summary = { }

      summary.merge!(:etag => headers.etag) unless headers.etag.nil?
      summary.merge!(:last_modified => headers.last_modified) unless headers.last_modified.nil?

      # Store digest for each feed entry so we can detect new feeds on the next
      # retrieval
      new_digest_set = feed.entries.map do |e|
        digest_for(e)
      end

      new_digest_set = summary_for_feed[:digests].unshift(new_digest_set)
      new_digest_set = new_digest_set[0..@options[:retained_digest_size]]

      summary.merge!( :digests => new_digest_set )
      set_summary(summary)
    end

    def set_summary(summary)
      @backend[key_for_cached] = summary
    end

    # Computes a unique signature for the FeedNormalizer::Entry object given.
    # This signature will be the MD5 of enough fields to have a reasonable
    # probability of determining if the entry is unique or not.
    def digest_for(entry)
      MD5.hexdigest( [ entry.title, entry.content, entry.date_published ].join )
    end

    def parser_for_xml(xml)
      FeedNormalizer::FeedNormalizer.parse(xml)
    end
  end
end