feedtosis 0.0.3.6
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- data/LICENSE +20 -0
- data/README.rdoc +135 -0
- data/Rakefile +28 -0
- data/feedtosis.gemspec +48 -0
- data/lib/extensions/core/hash.rb +7 -0
- data/lib/extensions/feed_normalizer/feed_instance_methods.rb +17 -0
- data/lib/feedtosis.rb +17 -0
- data/lib/feedtosis/client.rb +173 -0
- data/lib/feedtosis/result.rb +34 -0
- data/spec/extensions/feed_normalizer/feed_instance_methods_spec.rb +12 -0
- data/spec/feedtosis/client_spec.rb +162 -0
- data/spec/feedtosis/result_spec.rb +34 -0
- data/spec/fixtures/http_headers/wooster.txt +19 -0
- data/spec/fixtures/xml/older_wooster.xml +203 -0
- data/spec/fixtures/xml/wooster.xml +215 -0
- data/spec/spec_helper.rb +31 -0
- metadata +117 -0
data/LICENSE
ADDED
@@ -0,0 +1,20 @@
Copyright (c) 2009 Justin S. Leitgeb

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.rdoc
ADDED
@@ -0,0 +1,135 @@
= Description

Feedtosis fetches RSS and Atom feeds with an easy-to-use interface. It uses
FeedNormalizer for parsing, and Curb for fetching. It helps by automatically
using conditional HTTP GET requests as well as by reliably pointing out which
entries are new in any given feed.

Feedtosis is designed to handle the book-keeping details of feed fetching for
you, so that things like HTTP conditional GET are trivial to use. It has a
simple interface, and remains a lightweight component that delegates to
FeedNormalizer for parsing feeds and the fantastic taf2-curb library for
fetching feeds.

== Installation

Assuming that you've followed the directions on gems.github.com to allow your
computer to install gems from GitHub, the following command will install the
Feedtosis library:

  sudo gem install jsl-feedtosis

== Usage

Feedtosis is easy to use. Just create a client object and invoke the
"fetch" method:

  require 'feedtosis'
  client = Feedtosis::Client.new('http://feeds.feedburner.com/wooster')
  result = client.fetch

+result+ will be a Feedtosis::Result object which delegates methods to
the FeedNormalizer::Feed object as well as the Curl::Easy object used to fetch
the feed. Useful methods on this object include +entries+, +new_entries+ and
+response_code+, among many others (basically all of the methods that
FeedNormalizer::Feed and Curl::Easy objects respond to are implemented and can
be called directly, minus the setter methods for these objects).

Note that since Feedtosis uses HTTP conditional GET, it may not actually
have received a full XML response from the server suitable for being parsed
into entries. In this case, methods such as +entries+ on the Feedtosis::Result
will return +nil+. Depending on your application logic, you may want to inspect
the methods that are delegated to the Curl::Easy object, such as +response_code+,
for more information on what happened in these cases.

Remember that a response code of 304 means "Not Modified". In this case, you
should expect +entries+ and +new_entries+ to be +nil+, since the resource wasn't
downloaded according to the logic of HTTP conditional GET.

On subsequent requests for a particular resource, Feedtosis will update
+new_entries+ to contain the feed entries that we haven't seen yet. In most
applications, your program will probably call the same batch of URLs multiple
times, and process the elements in +new_entries+.

You will most likely want to allow Feedtosis to remember details about the
last retrieval of a feed after the client is removed from memory. Feedtosis
uses Moneta, a unified interface to key-value storage systems, to remember
"summaries" of feeds that it has seen in the past. See the Customization
section for more details on how to configure this system.

== Customization

Feedtosis stores summaries of feeds in a key-value storage system. If no
options are included when creating a new Feedtosis::Client object, the
default is to use a "memory" storage system. The memory system is just a basic
Ruby Hash, so it won't keep track of feeds after a particular Client is removed
from memory. To configure a different backend, pass an options hash to the
Feedtosis client initialization:

  url = "http://newsrss.bbc.co.uk/rss/newsonline_world_edition/south_asia/rss.xml"
  f = Feedtosis::Client.new(url, :backend => Moneta::Memcache.new(:server => 'localhost:1978'))
  res = f.fetch

This example sets up a Memcache backend, which in this case points to Tokyo
Tyrant on port 1978.

Generally, Feedtosis supports all systems supported by Moneta, and any one
of the supported systems can be given to the +backend+ option. Other
options following +backend+ are passed directly to Moneta for configuration.

== Implementation

Feedtosis helps to identify new feed entries and to figure out when
conditional GET can be used in retrieving resources. In order to accomplish this
without requiring the user to store information such as ETags and the date of
the last retrieved entry, Feedtosis stores a summary structure in the
configured key-value store (backed by Moneta). In order to do conditional
GET requests, Feedtosis stores the Last-Modified date, as well as the ETag
of the last request, in the summary structure, which is put in a namespaced
element consisting of the term 'Feedtosis' (bet you won't have to worry
about name collisions on that one!) and the MD5 of the URL retrieved.

It can also be a bit tricky to decipher which feed entries are new, since many
feed sources don't include unique ids with their feeds. Feedtosis reliably
keeps track of which entries in a feed are new by storing (in the summary hash
mentioned above) an MD5 signature of each entry in a feed. It takes elements
such as the published-at date, title and content and generates the MD5 of these
elements. This allows Feedtosis to cheaply compute (both in terms of
computation and storage) which feed entries should be presented to the user as
"new". Below is an example of a summary structure:

  {
    :etag => "4c8f-46ac09fbbe940",
    :last_modified => "Mon, 25 May 2009 18:17:33 GMT",
    :digests => [["f2993783ded928637ce5f2dc2d837f10", "da64efa6dd9ce34e5699b9efe73a37a7"]]
  }

The data stored by Feedtosis in the summary structure allows it to be
helpful to the user without storing lots of data that are unnecessary for
efficient functioning.

The summary structure keeps an Array of Arrays containing digests of feeds.
The reason for this is that some feeds, such as the Google blog search feeds,
contain slightly different but often-recurring results in the result set.
Feedtosis keeps complete sets of entry digests for previous feed retrievals.
The number of digest sets that will be kept is configurable by setting the
option :retained_digest_size on Feedtosis client initialization.

== HTML cleaning/sanitizing

Feedtosis doesn't do anything about feed sanitizing, as other libraries have
been built for this purpose. FeedNormalizer has methods for escaping entries,
but to strip HTML I suggest that you look at the Ruby gem "sanitize".

== Credits

Thanks to Sander Hartlage (GitHub: Sander6) for useful feedback early in the
development of Feedtosis.

== Feedback

Please let me know if you have any problems with or questions about
Feedtosis.

= Author

Justin S. Leitgeb, mailto:justin@phq.org
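The new-entry detection described under Implementation can be sketched standalone with Ruby's stdlib Digest::MD5 (the gem itself uses the older MD5 class; the entry hashes and field names here are hypothetical stand-ins, not the gem's own objects):

```ruby
require 'digest/md5'

# An entry's signature is the MD5 of its published-at date, title, and
# content; an entry is "new" when its signature is absent from the union
# of the retained digest sets.
def digest_for(entry)
  Digest::MD5.hexdigest([entry[:title], entry[:content], entry[:published]].join)
end

# One previously retained digest set, containing the signature of entry "A".
retained_digest_sets = [
  [digest_for(:title => 'A', :content => 'first post', :published => '2009-05-25')]
]

# Union across all retained sets, as the client does across its stored sets.
seen = retained_digest_sets.inject([]) { |r, set| r | set }

entries = [
  { :title => 'A', :content => 'first post',  :published => '2009-05-25' },
  { :title => 'B', :content => 'second post', :published => '2009-05-26' }
]

new_entries = entries.reject { |e| seen.include?(digest_for(e)) }
puts new_entries.map { |e| e[:title] }.inspect  # => ["B"]
```

Because the signature depends only on stable fields, an entry that reappears unchanged in a later fetch hashes to the same digest and is never re-reported as new.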
data/Rakefile
ADDED
@@ -0,0 +1,28 @@
require 'rubygems'
require 'spec'

require 'rake'
require 'spec/rake/spectask'
require 'rake/rdoctask'

require 'lib/feedtosis'

desc 'Test the plugin.'
Spec::Rake::SpecTask.new(:spec) do |t|
  t.spec_opts = ["--format", "progress", "--colour"]
  t.libs << 'lib'
  t.verbose = true
end

desc "Run all the tests"
task :default => :spec

desc 'Generate documentation'
Rake::RDocTask.new(:rdoc) do |rdoc|
  rdoc.rdoc_dir = 'rdoc'
  rdoc.title = 'Feedtosis'
  rdoc.options << '--line-numbers' << '--inline-source'
  rdoc.rdoc_files.include('README.rdoc')
  rdoc.rdoc_files.include('lib/feedtosis/**/*.rb')
end
data/feedtosis.gemspec
ADDED
@@ -0,0 +1,48 @@
Gem::Specification.new do |s|
  s.name = %q{feedtosis}
  s.version = "0.0.3.6"

  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
  s.authors = ["Justin Leitgeb"]
  s.date = %q{2009-07-15}
  s.description = %q{Feedtosis finds new information in feeds quickly using smart fetching and matching of previously read entries}
  s.email = %q{justin@phq.org}

  s.files = ["lib/extensions/core/hash.rb",
    "lib/extensions/feed_normalizer/feed_instance_methods.rb",
    "lib/feedtosis/result.rb",
    "lib/feedtosis/client.rb", "lib/feedtosis.rb", "LICENSE",
    "feedtosis.gemspec", "Rakefile", "README.rdoc",
    "spec/extensions/feed_normalizer/feed_instance_methods_spec.rb",
    "spec/fixtures/http_headers/wooster.txt",
    "spec/fixtures/xml/older_wooster.xml", "spec/fixtures/xml/wooster.xml",
    "spec/feedtosis/client_spec.rb",
    "spec/feedtosis/result_spec.rb",
    "spec/spec_helper.rb"]

  s.has_rdoc = true
  s.homepage = %q{http://github.com/jsl/feedtosis}
  s.rdoc_options = ["--charset=UTF-8"]
  s.require_paths = ["lib"]
  s.rubygems_version = %q{1.3.1}
  s.summary = %q{Retrieves feeds using conditional GET and marks entries that you haven't seen before}
  s.test_files = ["spec/spec_helper.rb", "spec/feedtosis/client_spec.rb", "spec/feedtosis/result_spec.rb"]

  s.extra_rdoc_files = [ "README.rdoc" ]

  s.rdoc_options += [
    '--title', 'Feedtosis',
    '--main', 'README.rdoc',
    '--line-numbers',
    '--inline-source'
  ]

  %w[ taf2-curb jsl-moneta jsl-http_headers feed-normalizer ].each do |dep|
    s.add_dependency(dep)
  end

  if s.respond_to? :specification_version then
    current_version = Gem::Specification::CURRENT_SPECIFICATION_VERSION
    s.specification_version = 2
  end
end
data/lib/extensions/feed_normalizer/feed_instance_methods.rb
ADDED
@@ -0,0 +1,17 @@
# Extends FeedNormalizer::Feed with a method for detecting new_items (aliased
# as new_entries for convenience).
module Feedtosis
  module FeedInstanceMethods

    # Returns only the entries that are new.
    def new_items
      self.entries.select do |e|
        e.instance_variable_get(:@_seen) == false
      end
    end

    alias :new_entries :new_items
  end
end

FeedNormalizer::Feed.__send__(:include, Feedtosis::FeedInstanceMethods)
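The @_seen tagging that new_items relies on can be sketched without the gem (Entry and Feed below are hypothetical stand-ins, not classes from feedtosis or feed-normalizer):

```ruby
# A feed entry stand-in; the client later tags each instance with @_seen.
class Entry
  attr_reader :title
  def initialize(title)
    @title = title
  end
end

# The same pattern as Feedtosis::FeedInstanceMethods: select entries whose
# @_seen instance variable was explicitly set to false by the fetcher.
module NewItemDetection
  def new_items
    entries.select { |e| e.instance_variable_get(:@_seen) == false }
  end
  alias :new_entries :new_items
end

class Feed
  include NewItemDetection
  attr_reader :entries
  def initialize(entries)
    @entries = entries
  end
end

feed = Feed.new([Entry.new('old'), Entry.new('fresh')])
feed.entries[0].instance_variable_set(:@_seen, true)
feed.entries[1].instance_variable_set(:@_seen, false)

puts feed.new_entries.map(&:title).inspect  # => ["fresh"]
```

Note that an entry never tagged at all has @_seen equal to nil, which fails the `== false` test; only entries the fetcher explicitly marked unseen count as new.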
data/lib/feedtosis.rb
ADDED
@@ -0,0 +1,17 @@
require 'rubygems'

require 'curb'
require 'http_headers'
require 'feed-normalizer'
require 'md5'
require 'uri'

lib_dirs = [ 'extensions', 'feedtosis' ].map do |d|
  File.join(File.dirname(__FILE__), d)
end

lib_dirs.each do |d|
  Dir[File.join(d, "**", "*.rb")].each do |file|
    require file
  end
end
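The glob-and-require loader above can be demonstrated in isolation against a throwaway directory tree (a sketch; the tmpdir layout and the $loaded marker are invented for the demonstration):

```ruby
require 'tmpdir'
require 'fileutils'

# Build a temporary tree mirroring lib/: two subdirectories, each holding
# one .rb file that records its own loading in a global.
Dir.mktmpdir do |root|
  %w[ extensions feedtosis ].each do |d|
    FileUtils.mkdir_p(File.join(root, d))
  end
  File.write(File.join(root, 'extensions', 'a.rb'), '$loaded << :a')
  File.write(File.join(root, 'feedtosis', 'b.rb'), '$loaded << :b')

  $loaded = []

  # Same pattern as lib/feedtosis.rb: expand each directory, then require
  # every .rb file found recursively beneath it.
  lib_dirs = %w[ extensions feedtosis ].map { |d| File.join(root, d) }
  lib_dirs.each do |d|
    Dir[File.join(d, '**', '*.rb')].sort.each { |file| require file }
  end

  puts $loaded.inspect  # => [:a, :b]
end
```

Requiring absolute paths this way skips $LOAD_PATH lookup entirely, which is why the loader works before the gem's directories are on the load path.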
data/lib/feedtosis/client.rb
ADDED
@@ -0,0 +1,173 @@
module Feedtosis

  # Feedtosis::Client is the primary interface to the feed reader. Call it
  # with a url that was previously fetched while connected to the configured
  # backend, and it will 1) only do a retrieval if deemed necessary based on
  # the ETag and Last-Modified date of the last fetch and 2) mark all entries
  # retrieved as either new or not new. Entries retrieved are normalized using
  # the feed-normalizer gem.
  class Client
    attr_reader :url, :options, :backend

    DEFAULTS = {
      :backend => Hash.new,

      # The namespace will be prefixed to the key used for storage of the summary value. Based on your
      # application needs, it may be useful to provide a custom prefix with initialization options.
      :namespace => 'feedtosis',

      # Some feed aggregators that we may be pulling from have entries that are present in one fetch and
      # then disappear (Google blog search does this). For these cases, we can't rely on only the digests of
      # the last fetch to guarantee "newness" of an entry that we may have previously consumed. We keep a
      # number of previous sets of digests in order to make sure that we mark the correct entries as "new".
      :retained_digest_size => 10
    } unless defined?(DEFAULTS)

    # Initializes a new Feedtosis client. It must be initialized with a valid URL as the first argument.
    # A following optional +options+ Hash may take the arguments:
    # * backend: a key-value store to be used for summary structures of feeds fetched. Moneta backends work well, but any object acting like a Hash is valid.
    # * retained_digest_size: an Integer specifying the number of previous MD5 sets of entries to keep, used for new feed detection
    def initialize(url, options = { })
      @url = url

      raise ArgumentError, "Feedtosis::Client options must be in Hash form if provided" unless options.is_a?(Hash)
      @options = options.reverse_merge(DEFAULTS)

      @backend = @options[:backend]

      unless @url.match(URI.regexp('http'))
        raise ArgumentError, "Url #{@url} is not valid!"
      end

      unless @backend.respond_to?(:[]) && @backend.respond_to?(:[]=)
        raise ArgumentError, "Backend needs to be a key-value store"
      end
    end

    # Retrieves the latest entries from this feed. Returns a Feedtosis::Result
    # object which delegates methods to the Curl::Easy object making the request
    # and the FeedNormalizer::Feed object that may have been created from the
    # HTTP response body.
    def fetch
      curl = build_curl_easy
      curl.perform
      feed = process_curl_response(curl)
      Feedtosis::Result.new(curl, feed)
    end

    private

    # Marks entries as either seen or not seen based on the unique signature of
    # the entry, which is calculated by taking the MD5 of common attributes.
    def mark_new_entries(response)
      digests = summary_digests

      # For each entry in the response object, mark @_seen as false if the
      # digest of this entry doesn't exist in the cached object.
      response.entries.each do |e|
        seen = digests.include?(digest_for(e))
        e.instance_variable_set(:@_seen, seen)
      end

      response
    end

    # Returns an Array of summary digests for this feed. Since we keep a number of sets
    # of digests, inject across these sets to accumulate unique identifiers.
    def summary_digests
      summary_for_feed[:digests].inject([]) do |r, e|
        r |= e
      end.uniq
    end

    # Processes the results by identifying which entries are new if the response
    # is a 200. Returns nil otherwise, leaving the Curl::Easy object for the
    # user to inspect through the Result.
    def process_curl_response(curl)
      if curl.response_code == 200
        response = parser_for_xml(curl.body_str)
        response = mark_new_entries(response)
        store_summary_to_backend(response, curl)
        response
      end
    end

    # Sets options for the Curl::Easy object, including parameters for HTTP
    # conditional GET.
    def build_curl_easy
      curl = new_curl_easy(@url)

      # Many feeds have a 302 redirect to another URL. For more recent versions
      # of Curl, we need to specify this.
      curl.follow_location = true

      set_header_options(curl)
    end

    def new_curl_easy(url)
      Curl::Easy.new(url)
    end

    # Returns the summary hash for this feed from the backend store.
    def summary_for_feed
      @backend[key_for_cached] || { :digests => [ ] }
    end

    # Sets the conditional GET headers from the backend summary, if available.
    def set_header_options(curl)
      summary = summary_for_feed

      unless summary.nil?
        curl.headers['If-None-Match'] = summary[:etag] unless summary[:etag].nil?
        curl.headers['If-Modified-Since'] = summary[:last_modified] unless summary[:last_modified].nil?
      end

      curl
    end

    # Returns the key for the storage of the summary structure in the key-value system.
    def key_for_cached
      [ @options[:namespace], MD5.hexdigest(@url) ].join('_')
    end

    # Stores information about the retrieval, including ETag, Last-Modified,
    # and MD5 digests of all entries, to the backend store. This enables
    # conditional GET usage on subsequent requests and marking of entries as
    # either new or seen.
    def store_summary_to_backend(feed, curl)
      headers = HttpHeaders.new(curl.header_str)

      # Store info about HTTP retrieval
      summary = { }

      summary.merge!(:etag => headers.etag) unless headers.etag.nil?
      summary.merge!(:last_modified => headers.last_modified) unless headers.last_modified.nil?

      # Store a digest for each feed entry so we can detect new entries on the
      # next retrieval
      new_digest_set = feed.entries.map do |e|
        digest_for(e)
      end

      new_digest_set = summary_for_feed[:digests].unshift(new_digest_set)
      new_digest_set = new_digest_set[0..@options[:retained_digest_size]]

      summary.merge!( :digests => new_digest_set )
      set_summary(summary)
    end

    def set_summary(summary)
      @backend[key_for_cached] = summary
    end

    # Computes a unique signature for the FeedNormalizer::Entry object given.
    # This signature will be the MD5 of enough fields to have a reasonable
    # probability of determining whether the entry is unique or not.
    def digest_for(entry)
      MD5.hexdigest( [ entry.title, entry.content, entry.date_published ].join )
    end

    def parser_for_xml(xml)
      FeedNormalizer::FeedNormalizer.parse(xml)
    end
  end
end
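The conditional GET flow implemented by summary_for_feed, set_header_options, and store_summary_to_backend can be sketched with plain Hashes standing in for the backend and the Curl::Easy headers (the URL and stored validator values here are hypothetical):

```ruby
require 'digest/md5'

# Same key scheme as Client#key_for_cached: namespace + MD5 of the URL.
def key_for(url, namespace = 'feedtosis')
  [namespace, Digest::MD5.hexdigest(url)].join('_')
end

# Mirrors Client#set_header_options: copy stored validators, when present,
# into the request headers.
def conditional_headers(backend, url)
  summary = backend[key_for(url)] || { :digests => [] }
  headers = {}
  headers['If-None-Match']     = summary[:etag]          unless summary[:etag].nil?
  headers['If-Modified-Since'] = summary[:last_modified] unless summary[:last_modified].nil?
  headers
end

backend = {}
url = 'http://example.com/feed.xml'

# First fetch: no summary stored yet, so the request is unconditional.
puts conditional_headers(backend, url).inspect  # => {}

# After a 200 response, the validators from the response headers are stored,
# as Client#store_summary_to_backend does.
backend[key_for(url)] = {
  :etag          => '"4c8f-46ac09fbbe940"',
  :last_modified => 'Mon, 25 May 2009 18:17:33 GMT',
  :digests       => [[]]
}

# Second fetch: both validators are sent, so an unchanged feed can be
# answered with 304 Not Modified and an empty body.
headers = conditional_headers(backend, url)
puts headers.keys.sort.inspect  # => ["If-Modified-Since", "If-None-Match"]
```

This is why a 304 yields nil +entries+ on the Result: process_curl_response only parses and digests the body when the response code is 200.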