consummo 0.4.4

data/.travis.yml ADDED
@@ -0,0 +1,7 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.3.0
4
+ before_install: gem install bundler -v 1.11.2
5
+ addons:
6
+ code_climate:
7
+ repo_token: b88e21fff712be545a536cef38d27e541c56f41542da3524bd3e11d19d156d84
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in consummo.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2016 Clayton
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,99 @@
1
+ # Consummo
2
+
3
+ Consummo is an engine for consuming, enriching and producing pieces of content from RSS feeds.
4
+
5
+ ## Quick Start
6
+
7
+ ```ruby
8
+
9
+ # Create a Feed
10
+ feed = Feed.new(uri: "http://feedjira.com/blog/feed.xml")
11
+
12
+ # Produce Items from the feed
13
+ items = ItemProducer.new(feeds: [feed]).produce
14
+ # items => [SimpleItem, SimpleItem, SimpleItem]
15
+
16
+ # Define our content enrichers
17
+ enrichers = [FacebookLikeEnricher.new]
18
+
19
+ # Consume items
20
+ enriched_items = ItemConsumer.new(items: items, enrichers: enrichers).consume
21
+ # enriched_items => [SimpleItem, SimpleItem, SimpleItem]
22
+
23
+ ```
24
+
25
+ ## Using with Rails
26
+
27
+ If you're using Consummo with Rails, you'll probably want to create `ActiveRecord`-backed objects for:
28
+
29
+ - `FeedItem`
30
+ - `Feed`
31
+
32
+ You'll probably want to persist `FeedItems` to a datastore when producing them, when consuming them, or both.
33
+
34
+ ## The Consummo Domain
35
+
36
+ ### Feeds
37
+ A `Feed` is a simple data structure that represents an RSS feed identified by its URI. It has a `uri` attribute.
38
+
39
+ ### Feed Items
40
+ A `Feed Item` represents a single piece of content produced from a `Feed`. It has attributes like `title` and `url`.
41
+
42
+ ### Producers
43
+ An `Item Producer` takes a list of `Feeds` along with a `Fetcher` and fetches items from each feed.
44
+
45
+ ### Consumers
46
+ An `Item Consumer` takes unenriched `Feed Items` and passes them through `Item Enrichers`.
47
+
48
+ ### Item Enricher
49
+ An `Item Enricher` enriches the details and data from a `Feed Item`. For example, the `FacebookLikeEnricher` will determine the number of Facebook Likes for a particular `Feed Item` (using the item's url) and add that attribute to the `Feed Item`.
50
+
51
+ The intent of `Item Enrichers` is that they are extensible and easy to implement such that multiple custom enrichments are possible.
52
+
53
+ ## Custom Enrichment
54
+
55
+ Enrichers follow a very simple interface:
56
+
57
+ ```ruby
58
+ class SimpleEnricher
59
+ def enrich(item)
60
+ { "simple" => "enrichment" }
61
+ end
62
+ end
63
+ ```
64
+
65
+ An Enricher should be able to `enrich` something that looks like an `item` (`SimpleItem`) and return a hash of key/value pairs.
66
+
67
+ ## Installation
68
+ Add this line to your application's Gemfile:
69
+
70
+ ```ruby
71
+ gem 'consummo'
72
+ ```
73
+
74
+ And then execute:
75
+
76
+ $ bundle
77
+
78
+ Or install it yourself as:
79
+
80
+ $ gem install consummo
81
+
82
+ ## Usage
83
+
84
+ TODO: Write usage instructions here
85
+
86
+ ## Development
87
+
88
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
89
+
90
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
91
+
92
+ ## Contributing
93
+
94
+ Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/consummo.
95
+
96
+
97
+ ## License
98
+
99
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
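The enricher interface described in the README's "Custom Enrichment" section can be sketched in isolation as follows. This is a minimal illustration, not code from the gem: `SketchItem` is a hypothetical stand-in that mimics how `SimpleItem` merges assigned attributes, and `TitleLengthEnricher` is an invented example enricher.

```ruby
# Hypothetical stand-in for the gem's SimpleItem: assigning attributes
# merges the new pairs into the existing hash instead of replacing it.
class SketchItem
  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes
  end

  def attributes=(attrs)
    @attributes.merge!(attrs)
  end

  def title
    @attributes[:title]
  end
end

# Hypothetical custom enricher: it only needs to respond to enrich(item)
# and return a hash of key/value pairs.
class TitleLengthEnricher
  def enrich(item)
    { title_length: item.title.to_s.length }
  end
end

item = SketchItem.new(title: "Hello RSS")
item.attributes = TitleLengthEnricher.new.enrich(item)
puts item.attributes.inspect
```

Because the returned hash is merged into the item, an enricher only needs to return the keys it contributes.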
data/Rakefile ADDED
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "consummo"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start
data/bin/setup ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
data/consummo.gemspec ADDED
@@ -0,0 +1,33 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'consummo/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "consummo"
8
+ spec.version = Consummo::VERSION
9
+ spec.authors = ["Clayton Lengel-Zigich"]
10
+ spec.email = ["clayton@claytonlz.com"]
11
+
12
+ spec.summary = %q{An engine for producing, consuming and enriching items from RSS Feeds.}
13
+ spec.description = %q{Consummo is a small ruby library that provides item producers, consumers and enrichers to take items from RSS feeds and turn them into persistable feed items. Consummo was extracted from https://consummo.io where it was used to find and process items in a new curation workflow.}
14
+ spec.homepage = "https://github.com/clayton/consummo"
15
+ spec.license = "MIT"
16
+
17
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
18
+ spec.bindir = "exe"
19
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
20
+ spec.require_paths = ["lib"]
21
+
22
+ spec.add_runtime_dependency "feedjira", "~> 2.0"
23
+ spec.add_runtime_dependency "activesupport", "~> 4.2"
24
+ spec.add_runtime_dependency "odyssey", "~> 0.2"
25
+ spec.add_runtime_dependency "httparty", "~> 0.13"
26
+
27
+ spec.add_development_dependency "bundler", "~> 1.11"
28
+ spec.add_development_dependency "rake", "~> 10.0"
29
+ spec.add_development_dependency "rspec", "~> 3.0"
30
+ spec.add_development_dependency "vcr", "~> 3.0"
31
+ spec.add_development_dependency "simplecov", "~> 0.9"
32
+ spec.add_development_dependency "codeclimate-test-reporter", "~> 0.5"
33
+ end
File without changes
@@ -0,0 +1,6 @@
1
+ class ContentScorer
2
+ def score(text)
3
+ return 0 if text.blank?
4
+ Odyssey.flesch_kincaid_re(text, false)
5
+ end
6
+ end
@@ -0,0 +1,13 @@
1
+ class FacebookLikeEnricher < SocialMediaMetricEnricher
2
+ def metric_endpoint
3
+ "http://graph.facebook.com/?id="
4
+ end
5
+
6
+ def metric_name
7
+ "facebook_likes"
8
+ end
9
+
10
+ def metric_key
11
+ "shares"
12
+ end
13
+ end
@@ -0,0 +1,15 @@
1
+ class Feed
2
+ attr_accessor :uri
3
+
4
+ def self.create(args={})
5
+ Feed.new(args)
6
+ end
7
+
8
+ def initialize(attrs={})
9
+ self.uri = attrs[:uri]
10
+ end
11
+
12
+ def id
13
+ return self.to_s
14
+ end
15
+ end
@@ -0,0 +1,9 @@
1
+ require 'feedjira'
2
+
3
+ class FeedClient
4
+ def fetch_and_parse(uri)
5
+ feed = Feedjira::Feed.fetch_and_parse(uri)
6
+ return [] if feed.is_a?(Integer)
7
+ feed.entries
8
+ end
9
+ end
@@ -0,0 +1,22 @@
1
+ class FeedFetcher
2
+ def initialize(client: FeedClient.new, factory: FeedItemFactory.new)
3
+ @client = client
4
+ @factory = factory
5
+ end
6
+
7
+ def fetch(feed, last_modified = nil)
8
+ @client.fetch_and_parse(feed.uri).map do |entry|
9
+ next if old_entry?(entry, last_modified)
10
+ item = @factory.build(entry)
11
+ item.attributes = {:feed_id => feed.id}
12
+
13
+ item
14
+ end.compact
15
+ end
16
+
17
+ private
18
+ def old_entry?(entry, last_modified)
19
+ return false if last_modified.nil?
20
+ last_modified > entry.last_modified
21
+ end
22
+ end
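The `map` / `next` / `compact` filtering used by `FeedFetcher#fetch` can be illustrated on its own. The sketch below assumes a hypothetical `SketchEntry` struct in place of a Feedjira entry and a hypothetical `fresh_titles` helper; it shows only how entries older than `last_modified` are skipped.

```ruby
# Hypothetical stand-in for a parsed feed entry.
SketchEntry = Struct.new(:title, :last_modified)

# Returns the titles of entries that are not older than last_modified.
# `next` inside map yields nil for skipped entries; compact removes them.
def fresh_titles(entries, last_modified = nil)
  entries.map do |entry|
    next if last_modified && last_modified > entry.last_modified
    entry.title
  end.compact
end

entries = [
  SketchEntry.new("old", Time.utc(2016, 1, 1)),
  SketchEntry.new("new", Time.utc(2016, 3, 1))
]

puts fresh_titles(entries, Time.utc(2016, 2, 1)).inspect # => ["new"]
puts fresh_titles(entries).inspect                       # => ["old", "new"]
```

Passing no cutoff keeps every entry, which matches the `nil` default in the fetcher's signature.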
@@ -0,0 +1,11 @@
1
+ class FeedItem
2
+ attr_accessor :title
3
+
4
+ def initialize(attrs={})
5
+ self.title = attrs[:title]
6
+ end
7
+
8
+ def self.create(attributes={})
9
+ #
10
+ end
11
+ end
@@ -0,0 +1,24 @@
1
+ class FeedItemFactory
2
+ def build(entry)
3
+ item = SimpleItem.new({
4
+ :title => entry.title,
5
+ :hinted_title => entry.title,
6
+ :url => entry.url,
7
+ :author => entry.author,
8
+ :summary => entry.summary,
9
+ :published_at => sanitize_published_at(entry.published),
10
+ :guid => entry.id
11
+ })
12
+
13
+ item.attributes = {:categories => entry.categories.join(";")} unless entry.is_a?(Feedjira::Parser::ITunesRSSItem)
14
+
15
+ item
16
+ end
17
+
18
+ private
19
+ def sanitize_published_at(published)
20
+ return Time.now if published.blank?
21
+ return Time.now if Time.parse(published.to_s) < Time.parse('1979-01-01 00:00:00')
22
+ published
23
+ end
24
+ end
@@ -0,0 +1,19 @@
1
+ class FeedProcessor
2
+ def initialize(feeds = [], enrichers: [])
3
+ @feeds = feeds
4
+ @enrichers = enrichers
5
+ end
6
+
7
+ def process
8
+ producer = ItemProducer.new(feeds: @feeds)
9
+ consumer = ItemConsumer.new(items: producer.produce, enrichers: @enrichers)
10
+ persist(consumer.consume)
11
+ end
12
+
13
+ private
14
+ def persist(items)
15
+ items.each do |item|
16
+ FeedItem.create(item.attributes)
17
+ end
18
+ end
19
+ end
@@ -0,0 +1,6 @@
1
+ class HttpClient
2
+ def get(uri)
3
+ response = HTTParty.get(uri)
4
+ response.to_h
5
+ end
6
+ end
@@ -0,0 +1,15 @@
1
+ class ItemConsumer
2
+ def initialize(items: [], enrichers: [])
3
+ @items = items
4
+ @enrichers = enrichers
5
+ end
6
+
7
+ def consume
8
+ @items.flat_map do |item|
9
+ @enrichers.each do |enricher|
10
+ item.attributes = enricher.enrich(item)
11
+ end
12
+ item
13
+ end
14
+ end
15
+ end
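Since `ItemConsumer#consume` applies enrichers in order and merges each result into the item, a later enricher can read what an earlier one added. The following is a rough sketch of that ordering effect; `ChainItem`, `WordCounter`, `LengthLabeler`, and the free-standing `consume` method are all hypothetical names introduced for illustration.

```ruby
# Hypothetical stand-in for SimpleItem: assigning attributes merges them in.
class ChainItem
  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes
  end

  def attributes=(attrs)
    @attributes.merge!(attrs)
  end
end

# Hypothetical enricher that counts alphabetic words in the content.
class WordCounter
  def enrich(item)
    { word_count: item.attributes[:content].to_s.scan(/[[:alpha:]]+/).count }
  end
end

# Hypothetical enricher that reads the value merged in by WordCounter,
# so it only gives a meaningful label when it runs after it.
class LengthLabeler
  def enrich(item)
    { length_label: item.attributes.fetch(:word_count, 0) > 3 ? "long" : "short" }
  end
end

# Applies every enricher to every item, in the order given.
def consume(items, enrichers)
  items.map do |item|
    enrichers.each { |enricher| item.attributes = enricher.enrich(item) }
    item
  end
end

items = [ChainItem.new(content: "one two three four five")]
enriched = consume(items, [WordCounter.new, LengthLabeler.new])
puts enriched.first.attributes.inspect
```

Reversing the enricher list would label the same item "short", because the word count would not yet be present when the labeler runs.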
@@ -0,0 +1,12 @@
1
+ class ItemProducer
2
+ def initialize(feeds: [], fetcher: FeedFetcher.new)
3
+ @feeds = feeds
4
+ @fetcher = fetcher
5
+ end
6
+
7
+ def produce
8
+ @feeds.flat_map do |feed|
9
+ @fetcher.fetch(feed)
10
+ end.shuffle
11
+ end
12
+ end
@@ -0,0 +1,22 @@
1
+ class KeywordHintEnricher
2
+ def initialize(keywords:[])
3
+ @keywords = keywords
4
+ end
5
+
6
+ def enrich(item, keywords=[])
7
+ modified = item.title
8
+ list = keywords.empty? ? @keywords : keywords
9
+
10
+ list.each do |keyword|
11
+ modified = make_replacements(modified, keyword)
12
+ end
13
+ {hinted_title: modified}
14
+ end
15
+
16
+ private
17
+ def make_replacements(text, keyword)
18
+ return nil if text.blank?
19
+ return text unless text.match(/#{Regexp.escape(keyword)}/i)
20
+ text.gsub(/(#{Regexp.escape(keyword)})/i, '<strong>\1</strong>')
21
+ end
22
+ end
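Keywords interpolated into a regular expression are treated as patterns, so a keyword containing metacharacters such as `.` needs escaping to be matched literally. A minimal sketch of case-insensitive highlighting with `Regexp.escape` follows; the `highlight` helper is a hypothetical name, not part of the gem.

```ruby
# Wraps every case-insensitive occurrence of keyword in <strong> tags.
# Regexp.escape makes characters like "." match literally.
def highlight(text, keyword)
  pattern = /(#{Regexp.escape(keyword)})/i
  text.gsub(pattern, '<strong>\1</strong>')
end

puts highlight("Ruby 2.3.0 released", "2.3.0") # => Ruby <strong>2.3.0</strong> released
puts highlight("Learning RUBY today", "ruby")  # => Learning <strong>RUBY</strong> today
puts highlight("Ruby 2x3y0 released", "2.3.0") # => Ruby 2x3y0 released
```

Without the escape, the third call would also match, because each `.` in the keyword would match any character.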
@@ -0,0 +1,12 @@
1
+ require "odyssey"
2
+
3
+ class ReadabilityEnricher
4
+ def initialize(scorer: ContentScorer.new)
5
+ @scorer = scorer
6
+ end
7
+
8
+ def enrich(item)
9
+ score = @scorer.score(item.content)
10
+ {:readability_score => score}
11
+ end
12
+ end
@@ -0,0 +1,5 @@
1
+ class SimpleEnricher
2
+ def enrich(item)
3
+ { "simple" => "enrichment" }
4
+ end
5
+ end
@@ -0,0 +1,29 @@
1
+ class SimpleItem
2
+ def initialize(attributes={})
3
+ @attributes = attributes
4
+ end
5
+
6
+ def attributes
7
+ @attributes
8
+ end
9
+
10
+ def attributes=(attrs)
11
+ @attributes.merge!(attrs)
12
+ end
13
+
14
+ def title
15
+ @attributes[:title]
16
+ end
17
+
18
+ def url
19
+ @attributes[:url]
20
+ end
21
+
22
+ def guid
23
+ @attributes[:guid] || @attributes[:url] || nil
24
+ end
25
+
26
+ def content
27
+ @attributes[:content] || ""
28
+ end
29
+ end
@@ -0,0 +1,26 @@
1
+ class SocialMediaMetricEnricher
2
+ def initialize(client: HttpClient.new)
3
+ @client = client
4
+ end
5
+
6
+ def enrich(item)
7
+ begin
8
+ result = @client.get("#{metric_endpoint}#{URI.encode_www_form_component(item.url)}")
9
+ {metric_name.to_sym => result[metric_key]}
10
+ rescue StandardError
11
+ {}
12
+ end
13
+ end
14
+
15
+ def metric_endpoint
16
+ "http://example.com?id="
17
+ end
18
+
19
+ def metric_name
20
+ "shares"
21
+ end
22
+
23
+ def metric_key
24
+ metric_name
25
+ end
26
+ end
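Because the metric enricher takes its HTTP client as a constructor argument, a subclass can be exercised without network access by injecting a fake client. The sketch below is an assumption-laden illustration: `MetricEnricherSketch` is a condensed, hypothetical re-creation of the base class, and `PinCountEnricher`, `FakeClient`, `SketchUrlItem`, and the `example.com` endpoint are invented for the example.

```ruby
require "uri"

# Condensed, hypothetical re-creation of the base class for illustration.
class MetricEnricherSketch
  def initialize(client:)
    @client = client
  end

  # Builds the request URL, asks the client, and maps the chosen key
  # of the response to the metric name. Any failure yields an empty hash.
  def enrich(item)
    result = @client.get("#{metric_endpoint}#{URI.encode_www_form_component(item.url)}")
    { metric_name.to_sym => result[metric_key] }
  rescue StandardError
    {}
  end
end

# Hypothetical subclass: only the three template methods are overridden.
class PinCountEnricher < MetricEnricherSketch
  def metric_endpoint
    "http://example.com/pins?url="
  end

  def metric_name
    "pin_count"
  end

  def metric_key
    "count"
  end
end

# Fake client that records the requested URI and returns a canned response.
class FakeClient
  attr_reader :requested

  def get(uri)
    @requested = uri
    { "count" => 12 }
  end
end

SketchUrlItem = Struct.new(:url)

client = FakeClient.new
result = PinCountEnricher.new(client: client).enrich(SketchUrlItem.new("http://a.test/x?y=1"))
puts result.inspect
puts client.requested
```

The same injection point could be used in specs to simulate error responses, since any `StandardError` raised by the client results in an empty hash.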
@@ -0,0 +1,13 @@
1
+ class TwitterShareEnricher < SocialMediaMetricEnricher
2
+ def metric_endpoint
3
+ "http://urls.api.twitter.com/1/urls/count.json?url="
4
+ end
5
+
6
+ def metric_name
7
+ "twitter_shares"
8
+ end
9
+
10
+ def metric_key
11
+ "count"
12
+ end
13
+ end
@@ -0,0 +1,10 @@
1
+ class UrlEnricher
2
+ def initialize(resolver: UrlResolver.new)
3
+ @resolver = resolver
4
+ end
5
+
6
+ def enrich(item)
7
+ resolved = @resolver.resolve(item.url)
8
+ {url: resolved.to_s}
9
+ end
10
+ end
@@ -0,0 +1,13 @@
1
+ require 'httparty'
2
+
3
+ class UrlResolver
4
+ def resolve(url)
5
+ return url if url.blank?
6
+ begin
7
+ response = HTTParty.head(url, follow_redirects: true)
8
+ response.request.last_uri
9
+ rescue Errno::ECONNREFUSED, URI::InvalidURIError, HTTParty::RedirectionTooDeep
10
+ url
11
+ end
12
+ end
13
+ end
@@ -0,0 +1,3 @@
1
+ module Consummo
2
+ VERSION = "0.4.4"
3
+ end
@@ -0,0 +1,6 @@
1
+ class WordCountEnricher
2
+ def enrich(item)
3
+ wc = item.content.scan(/[[:alpha:]]+/).count
4
+ {:word_count => wc}
5
+ end
6
+ end
data/lib/consummo.rb ADDED
@@ -0,0 +1,31 @@
1
+ require 'active_support'
2
+ require 'active_support/core_ext/object/blank'
3
+
4
+ require "consummo/version"
5
+
6
+ require "consummo/social_media_metric_enricher"
7
+ require "consummo/content_scorer"
8
+ require "consummo/facebook_like_enricher"
9
+ require "consummo/feed"
10
+ require "consummo/feed_client"
11
+ require "consummo/feed_fetcher"
12
+ require "consummo/feed_item"
13
+ require "consummo/feed_item_factory"
14
+ require "consummo/feed_processor"
15
+ require "consummo/http_client"
16
+ require "consummo/item_consumer"
17
+ require "consummo/item_producer"
18
+ require "consummo/keyword_hint_enricher"
19
+ require "consummo/readability_enricher"
20
+ require "consummo/simple_item"
21
+ require "consummo/social_media_metric_enricher"
22
+ require "consummo/twitter_share_enricher"
23
+ require "consummo/url_enricher"
24
+ require "consummo/url_resolver"
25
+ require "consummo/version"
26
+ require "consummo/word_count_enricher"
27
+
28
+
29
+ module Consummo
30
+ # Your code goes here...
31
+ end