rcarvalho-link_thumbnailer 1.0.9.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (43)
  1. data/.gitignore +19 -0
  2. data/.rspec +2 -0
  3. data/.travis.yml +6 -0
  4. data/CHANGELOG.md +91 -0
  5. data/Gemfile +12 -0
  6. data/LICENSE +22 -0
  7. data/README.md +184 -0
  8. data/Rakefile +7 -0
  9. data/app/controllers/link_thumbnailer/application_controller.rb +4 -0
  10. data/app/controllers/link_thumbnailer/previews_controller.rb +11 -0
  11. data/lib/generators/link_thumbnailer/install_generator.rb +19 -0
  12. data/lib/generators/templates/initializer.rb +41 -0
  13. data/lib/link_thumbnailer.rb +96 -0
  14. data/lib/link_thumbnailer/configuration.rb +6 -0
  15. data/lib/link_thumbnailer/doc.rb +65 -0
  16. data/lib/link_thumbnailer/doc_parser.rb +15 -0
  17. data/lib/link_thumbnailer/engine.rb +9 -0
  18. data/lib/link_thumbnailer/fetcher.rb +34 -0
  19. data/lib/link_thumbnailer/img_comparator.rb +18 -0
  20. data/lib/link_thumbnailer/img_parser.rb +46 -0
  21. data/lib/link_thumbnailer/img_url_filter.rb +13 -0
  22. data/lib/link_thumbnailer/object.rb +41 -0
  23. data/lib/link_thumbnailer/opengraph.rb +20 -0
  24. data/lib/link_thumbnailer/rails/routes.rb +47 -0
  25. data/lib/link_thumbnailer/rails/routes/mapper.rb +30 -0
  26. data/lib/link_thumbnailer/rails/routes/mapping.rb +33 -0
  27. data/lib/link_thumbnailer/version.rb +3 -0
  28. data/lib/link_thumbnailer/web_image.rb +18 -0
  29. data/link_thumbnailer.gemspec +28 -0
  30. data/spec/doc_parser_spec.rb +25 -0
  31. data/spec/doc_spec.rb +23 -0
  32. data/spec/examples/empty_example.html +11 -0
  33. data/spec/examples/example.html +363 -0
  34. data/spec/examples/og_example.html +12 -0
  35. data/spec/fetcher_spec.rb +97 -0
  36. data/spec/img_comparator_spec.rb +16 -0
  37. data/spec/img_url_filter_spec.rb +31 -0
  38. data/spec/link_thumbnailer_spec.rb +205 -0
  39. data/spec/object_spec.rb +130 -0
  40. data/spec/opengraph_spec.rb +7 -0
  41. data/spec/spec_helper.rb +13 -0
  42. data/spec/web_image_spec.rb +57 -0
  43. metadata +245 -0
data/.gitignore ADDED
@@ -0,0 +1,19 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
18
+ .project
19
+ .coveralls.yml
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --colour
2
+ --format=nested
data/.travis.yml ADDED
@@ -0,0 +1,6 @@
1
+ bundler_args: --without development
2
+ language: ruby
3
+ rvm:
4
+ - 1.9.2
5
+ - 1.9.3
6
+ - 2.0.0
data/CHANGELOG.md ADDED
@@ -0,0 +1,91 @@
1
+ # 1.0.9
2
+
3
+ - Fix issue when Location header used a relative path instead of an absolute path
4
+ - Update gemfile to be more flexible when using Hashie gem
5
+
6
+ # 1.0.8
7
+
8
+ - Thanks to [juriglx](https://github.com/juriglx), support for canonical urls
9
+ - Bug fixes
10
+
11
+ # 1.0.7
12
+
13
+ - Fix: Issue with preview controller
14
+
15
+ # 1.0.6
16
+
17
+ - Fix: Issue when setting `strict` option. Always returning OG representation.
18
+
19
+ # 1.0.5
20
+
21
+ - Thanks to [phlegx](https://github.com/phlegx), support for configuring the HTTP connection timeout.
22
+
23
+ # 1.0.4
24
+
25
+ - Fix issue #7: a nil image was returned when an exception was raised. Nil images are now skipped in results.
26
+ - Thanks to [phlegx](https://github.com/phlegx), support for SSL and User Agent customization through configurations.
27
+
28
+ # 1.0.3
29
+
30
+ - Fix issue #5: URL was incorrect in case of HTTP redirections.
31
+
32
+ # 1.0.2
33
+
34
+ - Feature: User can now set options at runtime by passing valid options to ```generate``` method
35
+ - Bug fix when doing ```rails g link_thumbnailer:install``` by explicitly specifying the scope of Rails
36
+
37
+ # 1.0.1
38
+
39
+ - Refactor LinkThumbnailer#generate method for cleaner code
40
+
41
+ # 1.0.0
42
+
43
+ - Update readme
44
+ - Add PreviewController for easy integration with user's app
45
+ - Add link_thumbnailer routes for easy integration with user's app
46
+ - Refactor some code
47
+ - Change 'to_a' method to 'to_hash' in object model
48
+
49
+ # 0.0.6
50
+
51
+ - Update readme
52
+ - Add `to_a` to WebImage class
53
+ - Refactor `to_json` for WebImage class
54
+ - Add corresponding specs
55
+
56
+ # 0.0.5
57
+
58
+ - Bug fix
59
+ - Remove `require 'rails'` from spec_helper.rb
60
+ - Remove Rails dependencies (`blank?` method) from code
61
+ - Spec fix
62
+
63
+ # 0.0.4
64
+
65
+ - Add specs for almost all classes
66
+ - Add a method `to_json` for WebImage class to be able to get a usable array of images' attributes
67
+
68
+ # 0.0.3
69
+
70
+ - Add specs for LinkThumbnailer class
71
+ - Refactor config system, now using dedicated configuration class
72
+
73
+ # 0.0.2
74
+
75
+ - Added Rspec
76
+ - Bug fixes:
77
+ - Now checking if attribute is blank for LinkThumbnailer::Object.valid? method
78
+
79
+ # 0.0.1
80
+
81
+ - LinkThumbnailer::Object
82
+ - LinkThumbnailer::Doc
83
+ - LinkThumbnailer::DocParser
84
+ - LinkThumbnailer::Fetcher
85
+ - LinkThumbnailer::ImgComparator
86
+ - LinkThumbnailer::ImgParser
87
+ - LinkThumbnailer::ImgUrlFilter
88
+ - LinkThumbnailer::Opengraph
89
+ - LinkThumbnailer::WebImage
90
+ - LinkThumbnailer.configure
91
+ - LinkThumbnailer.generate
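The 1.0.9 entry above concerns redirects whose `Location` header holds a relative path. As a minimal sketch of the general technique (not the gem's fetcher code, which is not part of this hunk), Ruby's standard `URI.join` can resolve such a value against the URL that was requested. The helper name `resolve_location` and the sample URLs are hypothetical.

```ruby
require 'uri'

# Hypothetical helper: resolve a redirect's Location header against the
# URL that produced the redirect. Absolute locations pass through unchanged.
def resolve_location(request_url, location)
  URI.join(request_url, location).to_s
end

puts resolve_location('http://foo.com/articles/1', '/login')
puts resolve_location('http://foo.com/articles/1', 'http://bar.com/x')
```

Resolving against the request URL keeps the scheme and host of the original request when the server only sends a path.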
data/Gemfile ADDED
@@ -0,0 +1,12 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in link_thumbnailer.gemspec
4
+ gemspec
5
+
6
+ group :test do
7
+ gem 'coveralls', require: false
8
+ gem 'simplecov', require: false
9
+ gem 'rspec', '~> 2.14'
10
+ gem 'webmock', '~> 1.14'
11
+ gem 'pry', '~> 0.9'
12
+ end
data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2012 Pierre-Louis Gottfrois
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,184 @@
1
+ # LinkThumbnailer
2
+
3
+ [![Code Climate](https://codeclimate.com/github/gottfrois/link_thumbnailer.png)](https://codeclimate.com/github/gottfrois/link_thumbnailer)
4
+ [![Coverage Status](https://coveralls.io/repos/gottfrois/link_thumbnailer/badge.png?branch=master)](https://coveralls.io/r/gottfrois/link_thumbnailer?branch=master)
5
+ [![Build Status](https://travis-ci.org/gottfrois/link_thumbnailer.png?branch=master)](https://travis-ci.org/gottfrois/link_thumbnailer)
6
+
7
+ Ruby gem that generates image thumbnails from a given URL. It ranks the images it finds and returns an object containing the images and information about the website, much like Facebook's link previewer.
8
+
9
+ A demo application is available [here](http://link-thumbnailer-demo.herokuapp.com/).
10
+
11
+ ## Installation
12
+
13
+ Add this line to your application's Gemfile:
14
+
15
+ gem 'link_thumbnailer'
16
+
17
+ And then execute:
18
+
19
+ $ bundle
20
+
21
+ Or install it yourself as:
22
+
23
+ $ gem install link_thumbnailer
24
+
25
+ Run:
26
+
27
+ $ rails g link_thumbnailer:install
28
+
29
+ This will add `link_thumbnailer.rb` to `config/initializers/`. See [#Configuration](https://github.com/gottfrois/link_thumbnailer#configuration) for more details.
30
+
31
+ ## Usage
32
+
33
+ Run `irb` and require the gem:
34
+
35
+ require 'rails'
36
+ => true
37
+
38
+ require 'link_thumbnailer'
39
+ => true
40
+
41
+ This gem can handle [Opengraph](http://ogp.me/) protocol. Here is an example with such a website:
42
+
43
+ object = LinkThumbnailer.generate('http://zerply.com')
44
+ => #<LinkThumbnailer::Object description="Go beyond the résumé - showcase your work and your talent" image="http://zerply.com/img/front/facebook_icon_green.png" images=["http://zerply.com/img/front/facebook_icon_green.png"] site_name="zerply.com" title="Join Me on Zerply" url="http://zerply.com">
45
+
46
+ object.title?
47
+ => true
48
+ object.title
49
+ => "Join Me on Zerply"
50
+
51
+ object.url?
52
+ => true
53
+ object.url
54
+ => "http://zerply.com"
55
+
56
+ object.foo?
57
+ => false
58
+ object.foo
59
+ => nil
60
+
61
+ Now with a regular website with no particular protocol:
62
+
63
+ object = LinkThumbnailer.generate('http://foo.com')
64
+ => #<LinkThumbnailer::Object description=nil images=[[ JPEG 750x200 750x200+0+0 DirectClass 8-bit 45kb] scene=0] title="Foo.com" url="http://foo.com">
65
+
66
+ object.title
67
+ => "Foo.com"
68
+
69
+ object.images
70
+ => [[ JPEG 750x200 750x200+0+0 DirectClass 8-bit 45kb]
71
+ scene=0]
72
+
73
+ object.images.first.source_url
74
+ => #<URI::HTTP:0x007ff7a923ef58 URL:http://foo.com/media/BAhbB1sHOgZmSSItMjAxMi8wNC8yNi8yMC8xMS80OS80MjYvY29yZG92YWJlYWNoLmpwZwY6BkVUWwg6BnA6CnRodW1iSSINNzUweDIwMCMGOwZU/cordovabeach.jpg>
75
+
76
+ object.to_hash
77
+ => {"url"=>"http://foo.com", "images"=>[{:source_url=>"http://foo.com/media/BAhbB1sHOgZmSSItMjAxMi8wNC8yNi8yMC8xMS80OS80MjYvY29yZG92YWJlYWNoLmpwZwY6BkVUWwg6BnA6CnRodW1iSSINNzUweDIwMCMGOwZU/cordovabeach.jpg", :mime_type=>"image/jpeg", :rows=>200, :filesize=>46501, :number_colors=>9490}], "title"=>"Foo.com", "description"=>nil}
78
+
79
+ object.to_json
80
+ => "{\"url\":\"http://foo.com\",\"images\":[{\"source_url\":\"http://foo.com/media/BAhbB1sHOgZmSSItMjAxMi8wNC8yNi8yMC8xMS80OS80MjYvY29yZG92YWJlYWNoLmpwZwY6BkVUWwg6BnA6CnRodW1iSSINNzUweDIwMCMGOwZU/cordovabeach.jpg\",\"mime_type\":\"image/jpeg\",\"rows\":200,\"filesize\":46501,\"number_colors\":9490}],\"title\":\"Foo.com\",\"description\":null}"
81
+
82
+ You can check whether the object is valid. Mandatory attributes are set in the initializer and default to `[url, title, images]`:
83
+
84
+ object.valid?
85
+ => true
86
+
87
+ You can also set options at runtime:
88
+
89
+ object = LinkThumbnailer.generate('http://foo.com', top: 10, limit: 20, redirect_limit: 5)
90
+
91
+ ## Preview Controller
92
+
93
+ For easy integration into your application, use the built-in `PreviewsController`.
94
+
95
+ Take a look at the demo application [here](https://github.com/gottfrois/link_thumbnailer_demo).
96
+
97
+ Basically, all you have to do in your view is something like this:
98
+
99
+ <%= form_tag '/link/preview', method: :post, remote: true do %>
100
+ <%= text_field_tag :url %>
101
+ <%= submit_tag 'Preview' %>
102
+ <% end %>
103
+
104
+ Don't forget to add this anywhere in your `routes.rb` file:
105
+
106
+ use_link_thumbnailer
107
+
108
+ Note: You can skip this step if you ran the installer using:
109
+
110
+ $ rails g link_thumbnailer:install
111
+
112
+ The `PreviewsController` automatically responds to JSON requests, returning a JSON version of the preview object, just like in the IRB console above.
113
+
114
+ ## Configuration
115
+
116
+ In `config/initializers/link_thumbnailer.rb`
117
+
118
+ LinkThumbnailer.configure do |config|
119
+ # Set the mandatory attributes required for the website to be valid.
120
+ # You can set `strict` to false if you want to skip this validation.
121
+ # config.mandatory_attributes = %w(url title images)
122
+
123
+ # Whether you want to validate given website against mandatory attributes or not.
124
+ # config.strict = true
125
+
126
+ # Number of redirects allowed before raising an exception when trying to parse the given URL.
127
+ # config.redirect_limit = 3
128
+
129
+ # List of blacklisted urls you want to skip when searching for images.
130
+ # config.blacklist_urls = [
131
+ # %r{^http://ad\.doubleclick\.net/},
132
+ # %r{^http://b\.scorecardresearch\.com/},
133
+ # %r{^http://pixel\.quantserve\.com/},
134
+ # %r{^http://s7\.addthis\.com/}
135
+ # ]
136
+
137
+ # Fetch 10 images maximum.
138
+ # config.limit = 10
139
+
140
+ # Return top 5 images only.
141
+ # config.top = 5
142
+
143
+ # Set user agent
144
+ # config.user_agent = 'linkthumbnailer'
145
+
146
+ # Enable or disable SSL verification
147
+ # config.verify_ssl = true
148
+
149
+ # HTTP open_timeout: The amount of time in seconds to wait for a connection to be opened.
150
+ # config.http_timeout = 5
151
+ end
152
+
153
+ ## Features
154
+
155
+ Implemented:
156
+
157
+ - Implements [OpenGraph](http://ogp.me/) protocol.
158
+ - Find images and sort them according to how well they represent what the page is about (includes absolute images).
159
+ - Sort images based on their size and color.
160
+ - Blacklist some well-known advertising image URLs.
161
+ - Routes and Controllers to handle preview generation
162
+
163
+ Coming soon:
164
+
165
+ - Use the gem [ruby-readability](https://github.com/iterationlabs/ruby-readability) to parse images and website information
166
+ - Cache results on filesystem
167
+
168
+ ## Contributing
169
+
170
+ 1. Fork it
171
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
172
+ 3. Run the specs (`bundle exec rspec spec`)
173
+ 4. Commit your changes (`git commit -am 'Added some feature'`)
174
+ 5. Push to the branch (`git push origin my-new-feature`)
175
+ 6. Create new Pull Request
176
+
177
+ ## Contributors
178
+
179
+ - [phlegx](https://github.com/phlegx)
180
+ - [juriglx](https://github.com/juriglx)
181
+
182
+
183
+ [![Bitdeli Badge](https://d2weczhvl823v0.cloudfront.net/gottfrois/link_thumbnailer/trend.png)](https://bitdeli.com/free "Bitdeli Badge")
184
+
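The README above shows the shape of `object.to_json`. As a hedged illustration of consuming that output on the receiving side, the sketch below parses a payload of the same shape with Ruby's standard `json` library. The image URL is shortened for readability, and the assumption that the first image is the best-ranked one is inferred from the README's description, not from code shown here.

```ruby
require 'json'

# Sample payload shaped like the README's `object.to_json` output.
# The image URL is shortened here for readability (an assumption of this sketch).
payload = <<~JSON
  {"url":"http://foo.com",
   "images":[{"source_url":"http://foo.com/media/cordovabeach.jpg",
              "mime_type":"image/jpeg","rows":200,
              "filesize":46501,"number_colors":9490}],
   "title":"Foo.com","description":null}
JSON

preview = JSON.parse(payload)

# Images are assumed to arrive best-ranked first, so take the first one if any.
best_image = preview.fetch('images', []).first
thumbnail_url = best_image && best_image['source_url']

puts preview['title']
puts thumbnail_url
```

Using `fetch` with a default and a guarded lookup keeps the consumer from failing when a page yields no images.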
data/Rakefile ADDED
@@ -0,0 +1,7 @@
1
+ require 'bundler/gem_tasks'
2
+ require 'rspec/core/rake_task'
3
+
4
+ RSpec::Core::RakeTask.new('spec')
5
+
6
+ task default: :spec
7
+ task test: :spec
data/app/controllers/link_thumbnailer/application_controller.rb ADDED
@@ -0,0 +1,4 @@
1
+ module LinkThumbnailer
2
+ class ApplicationController < ActionController::Base
3
+ end
4
+ end
data/app/controllers/link_thumbnailer/previews_controller.rb ADDED
@@ -0,0 +1,11 @@
1
+ module LinkThumbnailer
2
+ class PreviewsController < ApplicationController
3
+
4
+ respond_to :json
5
+
6
+ def create
7
+ @preview = LinkThumbnailer.generate(params[:url])
8
+ render json: @preview.to_json
9
+ end
10
+ end
11
+ end
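The `create` action above reads `params[:url]` and renders JSON. A client request for it might be built as in the sketch below, which uses only Ruby's standard `net/http` and does not send anything. The `/link/preview` path is taken from the README's form example; the sample URL is a placeholder.

```ruby
require 'net/http'

# Build (but do not send) a form POST for the preview endpoint.
# '/link/preview' comes from the README's form example.
request = Net::HTTP::Post.new('/link/preview')
request.set_form_data('url' => 'http://foo.com')
request['Accept'] = 'application/json'

puts request.method
puts request.path
puts request.body
```

To send it, the request could be passed to `Net::HTTP.start(host, port) { |http| http.request(request) }`, where `host` and `port` depend on where the application is running.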
data/lib/generators/link_thumbnailer/install_generator.rb ADDED
@@ -0,0 +1,19 @@
1
+ module LinkThumbnailer
2
+ module Generators
3
+ class InstallGenerator < ::Rails::Generators::Base
4
+
5
+ source_root File.expand_path('../../templates', __FILE__)
6
+
7
+ desc 'Creates a LinkThumbnailer initializer for your application.'
8
+
9
+ def install
10
+ route "use_link_thumbnailer"
11
+ end
12
+
13
+ def copy_initializer
14
+ template 'initializer.rb', 'config/initializers/link_thumbnailer.rb'
15
+ end
16
+
17
+ end
18
+ end
19
+ end
data/lib/generators/templates/initializer.rb ADDED
@@ -0,0 +1,41 @@
1
+ # Use this hook to configure LinkThumbnailer behaviors.
2
+ LinkThumbnailer.configure do |config|
3
+ # Set the mandatory attributes required for the website to be valid.
4
+ # You can set `strict` to false if you want to skip this validation.
5
+ # config.mandatory_attributes = %w(url title images)
6
+
7
+ # Whether you want to validate given website against mandatory attributes or not.
8
+ # config.strict = true
9
+
10
+ # Number of redirects allowed before raising an exception when trying to parse the given URL.
11
+ # config.redirect_limit = 3
12
+
13
+ # List of blacklisted urls you want to skip when searching for images.
14
+ # config.blacklist_urls = [
15
+ # %r{^http://ad\.doubleclick\.net/},
16
+ # %r{^http://b\.scorecardresearch\.com/},
17
+ # %r{^http://pixel\.quantserve\.com/},
18
+ # %r{^http://s7\.addthis\.com/}
19
+ # ]
20
+
21
+ # RMagick attributes to include for images. See http://www.imagemagick.org/RMagick/doc/
22
+ # for more details.
23
+ # 'source_url' is a custom attribute and should always be included since this
24
+ # is where you'll find the image url.
25
+ # config.rmagick_attributes = %w(source_url mime_type colums rows filesize number_colors)
26
+
27
+ # Fetch 10 images maximum.
28
+ # config.limit = 10
29
+
30
+ # Return top 5 images only.
31
+ # config.top = 5
32
+
33
+ # Set user agent
34
+ # config.user_agent = 'linkthumbnailer'
35
+
36
+ # Enable or disable SSL verification
37
+ # config.verify_ssl = true
38
+
39
+ # HTTP open_timeout: The amount of time in seconds to wait for a connection to be opened.
40
+ # config.http_timeout = 5
41
+ end
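The `blacklist_urls` option in the template above is a list of regular expressions. The filter class that applies them is not part of this hunk, so the following is only a sketch of how such a list could be applied to candidate image URLs; the helper `reject_blacklisted` and the sample URLs are hypothetical.

```ruby
# Blacklist patterns copied from the initializer template above.
blacklist = [
  %r{^http://ad\.doubleclick\.net/},
  %r{^http://b\.scorecardresearch\.com/},
  %r{^http://pixel\.quantserve\.com/},
  %r{^http://s7\.addthis\.com/}
]

# Hypothetical helper: keep only URLs that match none of the patterns.
def reject_blacklisted(urls, blacklist)
  urls.reject { |url| blacklist.any? { |pattern| url.to_s =~ pattern } }
end

candidates = [
  'http://foo.com/logo.png',
  'http://ad.doubleclick.net/banner.gif',
  'https://ad.doubleclick.net/banner.gif'
]

kept = reject_blacklisted(candidates, blacklist)
puts kept
```

Because the default patterns are anchored to `http://`, the `https://` variant of a blacklisted host is not matched in this sketch; a pattern such as `%r{^https?://ad\.doubleclick\.net/}` would cover both schemes.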
data/lib/link_thumbnailer.rb ADDED
@@ -0,0 +1,96 @@
1
+ require 'link_thumbnailer/engine' if defined? Rails
2
+ require 'link_thumbnailer/configuration'
3
+ require 'link_thumbnailer/object'
4
+ require 'link_thumbnailer/fetcher'
5
+ require 'link_thumbnailer/doc_parser'
6
+ require 'link_thumbnailer/doc'
7
+ require 'link_thumbnailer/img_url_filter'
8
+ require 'link_thumbnailer/img_parser'
9
+ require 'link_thumbnailer/img_comparator'
10
+ require 'link_thumbnailer/web_image'
11
+ require 'link_thumbnailer/opengraph'
12
+ require 'link_thumbnailer/version'
13
+
14
+ module LinkThumbnailer
15
+
16
+ module Rails
17
+ autoload :Routes, 'link_thumbnailer/rails/routes'
18
+ end
19
+
20
+ class << self
21
+
22
+ attr_accessor :configuration, :object, :fetcher, :doc_parser,
23
+ :img_url_filters, :img_parser
24
+
25
+ def config
26
+ self.configuration ||= Configuration.new(
27
+ mandatory_attributes: %w(url title images),
28
+ strict: true,
29
+ redirect_limit: 3,
30
+ blacklist_urls: [
31
+ %r{^http://ad\.doubleclick\.net/},
32
+ %r{^http://b\.scorecardresearch\.com/},
33
+ %r{^http://pixel\.quantserve\.com/},
34
+ %r{^http://s7\.addthis\.com/}
35
+ ],
36
+ rmagick_attributes: %w(source_url mime_type colums rows filesize number_colors),
37
+ limit: 10,
38
+ top: 5,
39
+ user_agent: 'linkthumbnailer',
40
+ verify_ssl: true,
41
+ http_timeout: 5
42
+ )
43
+ end
44
+
45
+ def configure
46
+ yield config
47
+ end
48
+
49
+ def generate(url, options = {})
50
+ set_options(options)
51
+ instantiate_classes
52
+
53
+ doc = self.doc_parser.parse(self.fetcher.fetch(url), url)
54
+
55
+ self.object[:url] = self.fetcher.url.to_s
56
+ opengraph(doc) || custom(doc)
57
+ end
58
+
59
+ private
60
+
61
+ def set_options(options)
62
+ config
63
+ options.each {|k, v| config[k] = v }
64
+ end
65
+
66
+ def instantiate_classes
67
+ self.object = LinkThumbnailer::Object.new
68
+ self.fetcher = LinkThumbnailer::Fetcher.new
69
+ self.doc_parser = LinkThumbnailer::DocParser.new
70
+ self.img_url_filters = [LinkThumbnailer::ImgUrlFilter.new]
71
+ self.img_parser = LinkThumbnailer::ImgParser.new(self.fetcher, self.img_url_filters)
72
+ end
73
+
74
+ def opengraph(doc)
75
+ return nil unless opengraph?(doc)
76
+ self.object = LinkThumbnailer::Opengraph.parse(self.object, doc)
77
+ return self.object if self.object.valid?
78
+ nil
79
+ end
80
+
81
+ def custom(doc)
82
+ self.object[:title] = doc.title
83
+ self.object[:description] = doc.description
84
+ self.object[:images] = self.img_parser.parse(doc.img_abs_urls.dup)
85
+ self.object[:url] = doc.canonical_url || self.object[:url]
86
+ return self.object if self.object.valid?
87
+ nil
88
+ end
89
+
90
+ def opengraph?(doc)
91
+ !doc.xpath('//meta[starts-with(@property, "og:") and @content]').empty?
92
+ end
93
+
94
+ end
95
+
96
+ end
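In the file above, `set_options` writes each runtime option into the memoized configuration, so options passed to one `generate` call appear to remain in effect for later calls. The sketch below illustrates that behavior with a plain Hash standing in for the gem's `Configuration` class (which is defined in another file), alongside a hypothetical non-mutating alternative named `effective_options`.

```ruby
# Stand-in for the memoized configuration object.
config = { top: 5, limit: 10, redirect_limit: 3 }

# Mirrors the shape of `set_options`: each option is written into the
# shared configuration, so it is still there on the next call.
set_options = ->(options) { options.each { |k, v| config[k] = v } }

# Hypothetical alternative: build a per-call view and leave `config` alone.
effective_options = ->(options) { config.merge(options) }

per_call = effective_options.call(top: 10)
puts per_call[:top]   # value seen by this call only
top_before = config[:top]
puts top_before       # shared value, not yet modified

set_options.call(top: 10)
puts config[:top]     # shared value after the in-place write
```

Whether persistent runtime options are intended is not stated in the source; the merge-based variant is one way to keep per-call overrides from leaking into later calls.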