link_thumbnailer 3.3.2 → 3.4.0
- checksums.yaml +5 -5
- data/.ruby-version +1 -1
- data/.travis.yml +2 -3
- data/CHANGELOG.md +231 -79
- data/lib/generators/templates/initializer.rb +9 -0
- data/lib/link_thumbnailer/configuration.rb +3 -1
- data/lib/link_thumbnailer/exceptions.rb +1 -0
- data/lib/link_thumbnailer/model.rb +3 -3
- data/lib/link_thumbnailer/page.rb +1 -1
- data/lib/link_thumbnailer/processor.rb +33 -2
- data/lib/link_thumbnailer/response.rb +2 -2
- data/lib/link_thumbnailer/scrapers/default/favicon.rb +14 -2
- data/lib/link_thumbnailer/version.rb +1 -1
- data/link_thumbnailer.gemspec +1 -1
- data/spec/fixture_spec.rb +19 -0
- data/spec/fixtures/default_with_few_favicons.html +15 -0
- data/spec/fixtures/google_utf8_no_meta_charset.html +6 -0
- data/spec/fixtures/with_related_path_in_href.html +13 -0
- data/spec/fixtures/with_root_path_in_href.html +13 -0
- data/spec/processor_spec.rb +40 -20
- data/spec/response_spec.rb +24 -2
- metadata +22 -9
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
|
-
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
2
|
+
SHA256:
|
3
|
+
metadata.gz: 4a61e97176d3dfeee5edd1b272ad4c9bb9e4d967655edd529d128f5ec0bf2a9f
|
4
|
+
data.tar.gz: 48430fe14a47d7a76b26df13421c01474b21c4b76df823723f6e13ba70d2a46f
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 206962b937ceb0a38910c71bce981379972c0291e7a71c21b1b2d6ccbd22f82e991be36ec4c2b084a681b6c8e2c7f7713ad420d26a8e617fc5d4733bb9785f7c
|
7
|
+
data.tar.gz: 38ee02ac70e404432ce97d276f09e4f2dff02c2b323753098fe7607cb93cbcf549e85b8ec9c0ed11cead8f8c81ba0282bb1c0c22a173208773b10a080e34b9c6
|
data/.ruby-version
CHANGED
@@ -1 +1 @@
|
|
1
|
-
2.
|
1
|
+
2.6.6
|
data/.travis.yml
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,12 +1,43 @@
|
|
1
|
-
#
|
1
|
+
# Changelog
|
2
|
+
All notable changes to this project will be documented in this file.
|
3
|
+
|
4
|
+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
5
|
+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
6
|
+
|
7
|
+
## [Unreleased]
|
8
|
+
|
9
|
+
## [3.4.0]
|
10
|
+
### Adds
|
11
|
+
|
12
|
+
- Adds `download_size_limit` configuration to raise `LinkThumbnailer::DownloadSizeLimit` when the body of the request is too big. Defaults to `10 * 1024 * 1024` bytes.
|
13
|
+
- Adds `favicon_size` configuration to allow to choose which favicon
|
14
|
+
size the gem should prefer. Defaults to the first favicon found otherwise.
|
15
|
+
|
16
|
+
### Fixes
|
17
|
+
|
18
|
+
- Fixes string encoding in previous versions of Ruby
|
19
|
+
- Fixes favicon by providing the full path.
|
20
|
+
- When HTML charset cannot be found in the HTML header, we now try
|
21
|
+
to find it in the body.
|
22
|
+
- Closes the HTTP connection upon completion
|
23
|
+
|
24
|
+
### Changes
|
25
|
+
|
26
|
+
- 401 HTTP errors now raise `LinkThumbnailer::HTTPError`
|
27
|
+
- Upgrades [ImageInfo](https://github.com/gottfrois/image_info/blob/master/CHANGELOG.md) gem
|
28
|
+
|
29
|
+
## [3.3.2]
|
30
|
+
### Fixes
|
2
31
|
|
3
32
|
- Frozen strings https://github.com/gottfrois/link_thumbnailer/pull/125
|
4
33
|
|
5
|
-
|
34
|
+
## [3.3.1]
|
35
|
+
### Changes
|
6
36
|
|
7
37
|
- Gem upgrade (json)
|
8
38
|
|
9
|
-
|
39
|
+
## [3.3.0]
|
40
|
+
### Adds
|
10
41
|
|
11
42
|
- Allows to configure overrided http headers
|
12
43
|
|
@@ -16,14 +47,16 @@ LinkThumbnailer.configure do |config|
|
|
16
47
|
end
|
17
48
|
```
|
18
49
|
|
19
|
-
|
50
|
+
## [3.2.1]
|
51
|
+
### Fixes
|
20
52
|
|
21
53
|
- Fixes #88
|
22
54
|
- Override User-Agent header properly
|
23
55
|
- Match xpath nodes if attribute content is present
|
24
56
|
- Avoid nil urls in image parser
|
25
57
|
|
26
|
-
|
58
|
+
## [3.2.0]
|
59
|
+
### Adds
|
27
60
|
|
28
61
|
Makes scrapers configurable by allowing to set the scraping strategy:
|
29
62
|
|
@@ -36,7 +69,8 @@ end
|
|
36
69
|
`opengraph` use the [Open Graph Protocol](http://ogp.me/).
|
37
70
|
`default` use a homemade algorithm
|
38
71
|
|
39
|
-
|
72
|
+
## [3.1.2]
|
73
|
+
### Adds
|
40
74
|
|
41
75
|
Allows to customize ideal description length
|
42
76
|
|
@@ -58,14 +92,22 @@ end
|
|
58
92
|
Will default to `120` characters. More information about how the gem manage to find the best description can be found at
|
59
93
|
http://www.codeids.com/2015/06/27/how-to-find-best-description-of-a-website-using-linkthumbnailer/
|
60
94
|
|
61
|
-
|
95
|
+
## [3.1.1]
|
96
|
+
### Fixes
|
62
97
|
|
63
|
-
- Upgrade `video_info` gem
|
64
98
|
- Fixes https://github.com/gottfrois/link_thumbnailer/issues/69
|
65
99
|
|
66
|
-
|
100
|
+
### Changes
|
101
|
+
|
102
|
+
- Upgrade `video_info` gem
|
103
|
+
|
104
|
+
## [3.1.0]
|
105
|
+
### Fixes
|
67
106
|
|
68
107
|
- Fix an issue when image sizes could not be retrieved.
|
108
|
+
|
109
|
+
### Adds
|
110
|
+
|
69
111
|
- Graders now accept an optional parameter to customize the weight of the grader in the probability computation.
|
70
112
|
|
71
113
|
```ruby
|
@@ -76,64 +118,83 @@ Will give a 3 times more weigth to the `Position` grader compare to other grader
|
|
76
118
|
By default all graders have a weight of `1` except the above position grader since position should play a bigger role in
|
77
119
|
order to find good description candidates.
|
78
120
|
|
79
|
-
|
121
|
+
## [3.0.3]
|
122
|
+
### Fixes
|
80
123
|
|
81
124
|
- Fix an issue when dealing with absolute urls. https://github.com/gottfrois/link_thumbnailer/issues/68
|
82
125
|
- Fix an issue with http redirection and location header not being present. https://github.com/gottfrois/link_thumbnailer/issues/70
|
83
126
|
- Rescue and raise custom LinkThumbnailer exceptions. https://github.com/gottfrois/link_thumbnailer/issues/71
|
84
127
|
|
85
|
-
|
128
|
+
## [3.0.2]
|
129
|
+
### Fixes
|
86
130
|
|
87
131
|
- Replace FastImage gem dependency by [ImageInfo](https://github.com/gottfrois/image_info) to improve performances when
|
88
132
|
fetching multiple images size information. Benchmark shows an order of magnitude improvement response time.
|
89
133
|
- Fixes [#57](https://github.com/gottfrois/link_thumbnailer/issues/57)
|
90
134
|
|
91
|
-
|
135
|
+
## [3.0.1]
|
136
|
+
### Fixes
|
92
137
|
|
93
138
|
- Remove useless dependencies
|
94
139
|
|
95
|
-
|
140
|
+
## [3.0.0]
|
141
|
+
### Changes
|
96
142
|
|
97
143
|
- Improved description sorting.
|
98
144
|
- Refactored how graders work. More information [here](https://github.com/gottfrois/link_thumbnailer/wiki/How-to-build-your-own-Grader%3F)
|
99
145
|
|
100
|
-
|
146
|
+
## [2.6.1]
|
147
|
+
### Fixes
|
101
148
|
|
102
149
|
- Fix remove useless dependency
|
103
150
|
|
104
|
-
|
151
|
+
## [2.6.0]
|
152
|
+
### Adds
|
105
153
|
|
106
154
|
- Introduce new `raise_on_invalid_format` option (false by default) to raise `LinkThumbnailer::FormatNotSupported` if http `Content-Type` is invalid. Fixes #61 and #64.
|
107
155
|
|
108
|
-
|
156
|
+
## [2.5.2]
|
157
|
+
### Fixes
|
109
158
|
|
110
159
|
- Fix OpenURI::HTTPError exception raised when video_info gem is not able to parse video metadata. Fixes #60.
|
111
160
|
|
112
|
-
|
161
|
+
## [2.5.1]
|
162
|
+
### Adds
|
113
163
|
|
114
164
|
- Implement `Set-Cookie` header between http redirections to set cookies when site requires it. Fixes #55.
|
115
165
|
|
116
|
-
|
166
|
+
## [2.5.0]
|
167
|
+
### Adds
|
117
168
|
|
118
169
|
- Handles seamlessly `og:image` and `og:image:url`
|
119
170
|
- Handles seamlessly `og:video` and `og:video:url`
|
120
171
|
- Handles `og:video:width` and `og:video:height` for one video only (please create a ticket if you want support for multiple videos/images width & height)
|
172
|
+
|
173
|
+
### Fixes
|
174
|
+
|
121
175
|
- Fix calling `as_json` on `website` to return `as_json` representation of videos and images, not just their urls
|
122
|
-
- Gem updates and fix rspec deprecation warnings
|
123
176
|
|
124
|
-
|
177
|
+
### Changes
|
178
|
+
|
179
|
+
- Gem updates
|
180
|
+
|
181
|
+
## [2.4.0]
|
182
|
+
### Adds
|
125
183
|
|
126
184
|
- Handle connection through proxy automatically using the `ENV['HTTP_PROXY']` variable thanks to [taganaka](https://github.com/taganaka).
|
127
185
|
|
128
|
-
|
186
|
+
## [2.3.2]
|
187
|
+
### Fixes
|
129
188
|
|
130
189
|
- Fix an issue with vimeo opengraph urls. Fixes [#46](https://github.com/gottfrois/link_thumbnailer/pull/46)
|
131
190
|
|
132
|
-
|
191
|
+
## [2.3.1]
|
192
|
+
### Fixes
|
133
193
|
|
134
194
|
- Fix an issue with the link density grader caused by links with image instead of text. Fixes [#45](https://github.com/gottfrois/link_thumbnailer/issues/45)
|
135
195
|
|
136
|
-
|
196
|
+
## [2.3.0]
|
197
|
+
### Adds
|
137
198
|
|
138
199
|
- Add requested favicon scraper [#40](https://github.com/gottfrois/link_thumbnailer/issues/40)
|
139
200
|
|
@@ -151,19 +212,23 @@ o.favicon
|
|
151
212
|
=> "https://github.com/fluidicon.png"
|
152
213
|
```
|
153
214
|
|
154
|
-
|
215
|
+
## [2.2.3]
|
216
|
+
### Fixes
|
155
217
|
|
156
218
|
- Fixes [#41](https://github.com/gottfrois/link_thumbnailer/issues/41)
|
157
219
|
|
158
|
-
|
220
|
+
## [2.2.2]
|
221
|
+
### Fixes
|
159
222
|
|
160
223
|
- Fixes [#41](https://github.com/gottfrois/link_thumbnailer/issues/41)
|
161
224
|
|
162
|
-
|
225
|
+
## [2.2.1]
|
226
|
+
### Fixes
|
163
227
|
|
164
|
-
-
|
228
|
+
- Fixes issue when computing link density ratio
|
165
229
|
|
166
|
-
|
230
|
+
## [2.2.0]
|
231
|
+
### Adds
|
167
232
|
|
168
233
|
- Add support for `og:video`
|
169
234
|
- Add support for multiple `og:video` as well
|
@@ -189,30 +254,35 @@ Ex:
|
|
189
254
|
config.attributes = [:title, :images, :description, :videos]
|
190
255
|
```
|
191
256
|
|
192
|
-
|
257
|
+
## [2.1.0]
|
258
|
+
### Adds
|
193
259
|
|
194
260
|
- Increased `og:image` scraping performance by parsing `og:image:width` and `og:image:height` attribute if specified
|
195
261
|
- Introduced `image_stats` option to allow disabling image size and type parsing causing performance issues.
|
196
262
|
|
197
263
|
When disabled, size will be `[0, 0]` and type will be `nil`
|
198
264
|
|
199
|
-
|
265
|
+
## [2.0.4]
|
266
|
+
### Fixes
|
200
267
|
|
201
268
|
- Fixes [#39](https://github.com/gottfrois/link_thumbnailer/issues/39)
|
202
269
|
|
203
|
-
|
270
|
+
## [2.0.3]
|
271
|
+
### Fixes
|
204
272
|
|
205
273
|
- Fixes [#37](https://github.com/gottfrois/link_thumbnailer/issues/37)
|
206
274
|
|
207
|
-
|
275
|
+
## [2.0.2]
|
276
|
+
### Fixes
|
208
277
|
|
209
|
-
-
|
278
|
+
- Fixes couple of issues with `URI` class namespace
|
210
279
|
|
211
|
-
|
280
|
+
## [2.0.1]
|
212
281
|
|
213
|
-
-
|
282
|
+
- Fixes issue with image parser (fastimage) when given an URI instance instead of a string
|
214
283
|
|
215
|
-
|
284
|
+
## [2.0.0]
|
285
|
+
### Changes
|
216
286
|
|
217
287
|
- Fully refactored LinkThumbnailer
|
218
288
|
- Introduced [Graders](https://github.com/gottfrois/link_thumbnailer/wiki/How-to-build-your-own-Grader%3F)
|
@@ -228,60 +298,85 @@ When disabled, size will be `[0, 0]` and type will be `nil`
|
|
228
298
|
To update from `1.x.x` to `2.x.x` you need to run `rails g link_thumbnailer:install` to get the new configuration file.
|
229
299
|
If you used the `PreviewsController` feature, you need to build it yourself since it is not supported anymore.
|
230
300
|
|
231
|
-
|
301
|
+
## [1.1.2]
|
302
|
+
### Fixes
|
232
303
|
|
233
|
-
-
|
304
|
+
- Fixes issue with FastImage URLs [https://github.com/gottfrois/link_thumbnailer/pull/31](https://github.com/gottfrois/link_thumbnailer/pull/31)
|
234
305
|
|
235
|
-
|
306
|
+
## [1.1.1]
|
307
|
+
### Fixes
|
236
308
|
|
237
|
-
-
|
309
|
+
- Fixes route helper not working under rails 4.
|
238
310
|
|
239
|
-
|
311
|
+
## [1.1.0]
|
312
|
+
### Changes
|
240
313
|
|
241
314
|
- Replace RMagick by [FastImage](https://github.com/sdsykes/fastimage)
|
242
315
|
- Rename `rmagick_attributes` config into `image_attributes`
|
243
316
|
|
244
|
-
|
317
|
+
## [1.0.9]
|
318
|
+
### Fixes
|
319
|
+
|
320
|
+
- Fixes issue when Location header used a relative path instead of an absolute path
|
321
|
+
|
322
|
+
### Changes
|
245
323
|
|
246
|
-
- Fix issue when Location header used a relative path instead of an absolute path
|
247
324
|
- Update gemfile to be more flexible when using Hashie gem
|
248
325
|
|
249
|
-
|
326
|
+
## [1.0.8]
|
327
|
+
# Adds
|
250
328
|
|
251
329
|
- Thanks to [juriglx](https://github.com/juriglx), support for canonical urls
|
330
|
+
|
331
|
+
### Fixes
|
332
|
+
|
252
333
|
- Bug fixes
|
253
334
|
|
254
|
-
|
335
|
+
## [1.0.7]
|
336
|
+
### Fixes
|
255
337
|
|
256
|
-
-
|
338
|
+
- Fixes an issue with the preview controller
|
257
339
|
|
258
|
-
|
340
|
+
## [1.0.6]
|
341
|
+
### Fixes
|
259
342
|
|
260
|
-
-
|
343
|
+
- Fixes an issue when setting `strict` option. Always returning OG representation.
|
261
344
|
|
262
|
-
|
345
|
+
## [1.0.5]
|
346
|
+
### Adds
|
263
347
|
|
264
348
|
- Thanks to [phlegx](https://github.com/phlegx), support for timeout http connection through configurations.
|
265
349
|
|
266
|
-
|
350
|
+
## [1.0.4]
|
351
|
+
### Fixes
|
352
|
+
|
353
|
+
- Fixes issue #7: nil img was returned when exception is raised. Now skiping nil images in results.
|
354
|
+
|
355
|
+
### Adds
|
267
356
|
|
268
|
-
- Fix issue #7: nil img was returned when exception is raised. Now skiping nil images in results.
|
269
357
|
- Thanks to [phlegx](https://github.com/phlegx), support for SSL and User Agent customization through configurations.
|
270
358
|
|
271
|
-
|
359
|
+
## [1.0.3]
|
360
|
+
### Fixes
|
272
361
|
|
273
|
-
-
|
362
|
+
- Fixes issue #5: URL was incorrect in case of HTTP redirections.
|
274
363
|
|
275
|
-
|
364
|
+
## [1.0.2]
|
365
|
+
### Fixes
|
276
366
|
|
277
|
-
- Feature: User can now set options at runtime by passing valid options to ```generate``` method
|
278
367
|
- Bug fix when doing ```rails g link_thumbnailer:install``` by explicitly specifying the scope of Rails
|
279
368
|
|
280
|
-
|
369
|
+
### Adds
|
370
|
+
|
371
|
+
- User can now set options at runtime by passing valid options to ```generate``` method
|
372
|
+
|
373
|
+
## [1.0.1]
|
374
|
+
### Fixes
|
281
375
|
|
282
376
|
- Refactor LinkThumbnailer#generate method to have a cleaner code
|
283
377
|
|
284
|
-
|
378
|
+
## [1.0.0]
|
379
|
+
### Changes
|
285
380
|
|
286
381
|
- Update readme
|
287
382
|
- Add PreviewController for easy integration with user's app
|
@@ -289,46 +384,103 @@ If you used the `PreviewsController` feature, you need to build it yourself sinc
|
|
289
384
|
- Refactor some code
|
290
385
|
- Change 'to_a' method to 'to_hash' in object model
|
291
386
|
|
292
|
-
|
387
|
+
## [0.0.6]
|
388
|
+
### Adds
|
293
389
|
|
294
390
|
- Update readme
|
295
391
|
- Add `to_a` to WebImage class
|
296
|
-
- Refactor `to_json` for WebImage class
|
297
392
|
- Add specs corresponding
|
298
393
|
|
299
|
-
|
394
|
+
### Fixes
|
395
|
+
|
396
|
+
- Refactor `to_json` for WebImage class
|
397
|
+
|
398
|
+
## [0.0.5]
|
399
|
+
### Fixes
|
300
400
|
|
301
401
|
- Bug fix
|
302
402
|
- Remove `require 'rails'` from spec_helper.rb
|
303
403
|
- Remove rails dependences (blank? method) in code
|
304
404
|
- Spec fix
|
305
405
|
|
306
|
-
|
406
|
+
## [0.0.4]
|
407
|
+
### Adds
|
307
408
|
|
308
409
|
- Add specs for almost all classes
|
309
410
|
- Add a method `to_json` for WebImage class to be able to get a usable array of images' attributes
|
310
411
|
|
311
|
-
|
412
|
+
## [0.0.3]
|
413
|
+
### Adds
|
312
414
|
|
313
415
|
- Add specs for LinkThumbnailer class
|
416
|
+
|
417
|
+
### Fixes
|
418
|
+
|
314
419
|
- Refactor config system, now using dedicated configuration class
|
315
420
|
|
316
|
-
|
421
|
+
## [0.0.2]
|
422
|
+
### Adds
|
317
423
|
|
318
424
|
- Added Rspec
|
319
|
-
|
320
|
-
|
321
|
-
|
322
|
-
|
323
|
-
|
324
|
-
|
325
|
-
|
326
|
-
|
327
|
-
-
|
328
|
-
|
329
|
-
|
330
|
-
|
331
|
-
|
332
|
-
|
333
|
-
|
334
|
-
|
425
|
+
|
426
|
+
### Fixes
|
427
|
+
|
428
|
+
- Now checking if attribute is blank for LinkThumbnailer::Object.valid? method
|
429
|
+
|
430
|
+
## [0.0.1]
|
431
|
+
### Adds
|
432
|
+
|
433
|
+
- First release 🎆
|
434
|
+
|
435
|
+
[Unreleased]: https://github.com/gottfrois/link_thumbnailer/compare/v3.4.0...HEAD
|
436
|
+
[3.4.0]: https://github.com/gottfrois/link_thumbnailer/compare/v3.3.2...v3.4.0
|
437
|
+
[3.3.2]: https://github.com/gottfrois/link_thumbnailer/compare/v3.3.1...v3.3.2
|
438
|
+
[3.3.1]: https://github.com/gottfrois/link_thumbnailer/compare/v3.3.0...v3.3.1
|
439
|
+
[3.3.0]: https://github.com/gottfrois/link_thumbnailer/compare/v3.2.1...v3.3.0
|
440
|
+
[3.2.1]: https://github.com/gottfrois/link_thumbnailer/compare/v3.2.0...v3.2.1
|
441
|
+
[3.2.0]: https://github.com/gottfrois/link_thumbnailer/compare/v3.1.2...v3.2.0
|
442
|
+
[3.1.2]: https://github.com/gottfrois/link_thumbnailer/compare/v3.1.1...v3.1.2
|
443
|
+
[3.1.1]: https://github.com/gottfrois/link_thumbnailer/compare/v3.1.0...v3.1.1
|
444
|
+
[3.1.0]: https://github.com/gottfrois/link_thumbnailer/compare/v3.0.3...v3.1.0
|
445
|
+
[3.0.3]: https://github.com/gottfrois/link_thumbnailer/compare/v3.0.2...v3.0.3
|
446
|
+
[3.0.2]: https://github.com/gottfrois/link_thumbnailer/compare/v3.0.1...v3.0.2
|
447
|
+
[3.0.1]: https://github.com/gottfrois/link_thumbnailer/compare/v3.0.0...v3.0.1
|
448
|
+
[3.0.0]: https://github.com/gottfrois/link_thumbnailer/compare/v2.6.1...v3.0.0
|
449
|
+
[2.6.1]: https://github.com/gottfrois/link_thumbnailer/compare/v2.6.0...v2.6.1
|
450
|
+
[2.6.0]: https://github.com/gottfrois/link_thumbnailer/compare/v2.5.2...v2.6.0
|
451
|
+
[2.5.2]: https://github.com/gottfrois/link_thumbnailer/compare/v2.5.1...v2.5.2
|
452
|
+
[2.5.1]: https://github.com/gottfrois/link_thumbnailer/compare/v2.5.0...v2.5.1
|
453
|
+
[2.5.0]: https://github.com/gottfrois/link_thumbnailer/compare/v2.4.0...v2.5.0
|
454
|
+
[2.4.0]: https://github.com/gottfrois/link_thumbnailer/compare/v2.3.2...v2.4.0
|
455
|
+
[2.3.2]: https://github.com/gottfrois/link_thumbnailer/compare/v2.3.1...v2.3.2
|
456
|
+
[2.3.1]: https://github.com/gottfrois/link_thumbnailer/compare/v2.3.0...v2.3.1
|
457
|
+
[2.3.0]: https://github.com/gottfrois/link_thumbnailer/compare/v2.2.3...v2.3.0
|
458
|
+
[2.2.3]: https://github.com/gottfrois/link_thumbnailer/compare/v2.2.2...v2.2.3
|
459
|
+
[2.2.2]: https://github.com/gottfrois/link_thumbnailer/compare/v2.2.1...v2.2.2
|
460
|
+
[2.2.1]: https://github.com/gottfrois/link_thumbnailer/compare/v2.2.0...v2.2.1
|
461
|
+
[2.2.0]: https://github.com/gottfrois/link_thumbnailer/compare/v2.1.0...v2.2.0
|
462
|
+
[2.1.0]: https://github.com/gottfrois/link_thumbnailer/compare/v2.0.4...v2.1.0
|
463
|
+
[2.0.4]: https://github.com/gottfrois/link_thumbnailer/compare/v2.0.3...v2.0.4
|
464
|
+
[2.0.3]: https://github.com/gottfrois/link_thumbnailer/compare/v2.0.2...v2.0.3
|
465
|
+
[2.0.2]: https://github.com/gottfrois/link_thumbnailer/compare/v2.0.1...v2.0.2
|
466
|
+
[2.0.1]: https://github.com/gottfrois/link_thumbnailer/compare/v2.0.0...v2.0.1
|
467
|
+
[2.0.0]: https://github.com/gottfrois/link_thumbnailer/compare/v1.1.2...v2.0.0
|
468
|
+
[1.1.2]: https://github.com/gottfrois/link_thumbnailer/compare/v1.1.1...v1.1.2
|
469
|
+
[1.1.1]: https://github.com/gottfrois/link_thumbnailer/compare/v1.1.0...v1.1.1
|
470
|
+
[1.1.0]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.9...v1.1.0
|
471
|
+
[1.0.9]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.8...v1.0.9
|
472
|
+
[1.0.8]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.7...v1.0.8
|
473
|
+
[1.0.7]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.6...v1.0.7
|
474
|
+
[1.0.6]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.5...v1.0.6
|
475
|
+
[1.0.5]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.4...v1.0.5
|
476
|
+
[1.0.4]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.3...v1.0.4
|
477
|
+
[1.0.3]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.2...v1.0.3
|
478
|
+
[1.0.2]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.1...v1.0.2
|
479
|
+
[1.0.1]: https://github.com/gottfrois/link_thumbnailer/compare/v1.0.0...v1.0.1
|
480
|
+
[1.0.0]: https://github.com/gottfrois/link_thumbnailer/compare/v0.0.6...v1.0.0
|
481
|
+
[0.0.6]: https://github.com/gottfrois/link_thumbnailer/compare/v0.0.5...v0.0.6
|
482
|
+
[0.0.5]: https://github.com/gottfrois/link_thumbnailer/compare/v0.0.4...v0.0.5
|
483
|
+
[0.0.4]: https://github.com/gottfrois/link_thumbnailer/compare/v0.0.3...v0.0.4
|
484
|
+
[0.0.3]: https://github.com/gottfrois/link_thumbnailer/compare/v0.0.2...v0.0.3
|
485
|
+
[0.0.2]: https://github.com/gottfrois/link_thumbnailer/compare/v0.0.1...v0.0.2
|
486
|
+
[0.0.1]: https://github.com/gottfrois/link_thumbnailer/releases/tag/v0.0.1
|
data/lib/generators/templates/initializer.rb
CHANGED
@@ -35,6 +35,11 @@ LinkThumbnailer.configure do |config|
|
|
35
35
|
#
|
36
36
|
# config.attributes = [:title, :images, :description, :videos, :favicon]
|
37
37
|
|
38
|
+
# Prior favicon size. If the website doesn't have such size - returns the first favicon.
|
39
|
+
# Value should be like '32x32' or '16x16'. Default value is nil.
|
40
|
+
#
|
41
|
+
# config.favicon_size = nil
|
42
|
+
|
38
43
|
# List of procedures used to rate the website description. Add you custom class
|
39
44
|
# here. See wiki for more details on how to build your own graders.
|
40
45
|
#
|
@@ -83,6 +88,10 @@ LinkThumbnailer.configure do |config|
|
|
83
88
|
#
|
84
89
|
# config.scrapers = [:opengraph, :default]
|
85
90
|
|
91
|
+
# Limit for download size in bytes. When using ActiveSupport, you can also use values like 10.megabytes
|
92
|
+
#
|
93
|
+
# config.download_size_limit = 10 * 1024 * 1024
|
94
|
+
|
86
95
|
# Sets the default encoding.
|
87
96
|
#
|
88
97
|
# config.encoding = 'utf-8'
|
data/lib/link_thumbnailer/configuration.rb
CHANGED
@@ -28,7 +28,8 @@ module LinkThumbnailer
|
|
28
28
|
:verify_ssl, :http_open_timeout, :http_read_timeout, :attributes,
|
29
29
|
:graders, :description_min_length, :positive_regex, :negative_regex,
|
30
30
|
:image_limit, :image_stats, :raise_on_invalid_format, :max_concurrency,
|
31
|
-
:scrapers, :http_override_headers, :encoding
|
31
|
+
:scrapers, :http_override_headers, :download_size_limit, :encoding,
|
32
|
+
:favicon_size
|
32
33
|
|
33
34
|
alias_method :http_timeout, :http_open_timeout
|
34
35
|
alias_method :http_timeout=, :http_open_timeout=
|
@@ -65,6 +66,7 @@ module LinkThumbnailer
|
|
65
66
|
@max_concurrency = 20
|
66
67
|
@scrapers = [:opengraph, :default]
|
67
68
|
@http_override_headers = { 'Accept-Encoding' => 'none' }
|
69
|
+
@download_size_limit = 10 * 1024 * 1024
|
68
70
|
@encoding = 'utf-8'
|
69
71
|
end
|
70
72
|
|
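The two options introduced in the hunks above could be set from an initializer roughly as follows. This is a minimal sketch that assumes the gem is installed and loaded; the option names come from the diff, while the 5 MiB limit and the `'32x32'` size are arbitrary illustrative values.

```ruby
require 'link_thumbnailer'

LinkThumbnailer.configure do |config|
  # Abort downloads whose body grows past 5 MiB
  # (the default added in this release is 10 * 1024 * 1024 bytes).
  config.download_size_limit = 5 * 1024 * 1024

  # Prefer a 32x32 favicon; per the initializer comment, the first
  # favicon found is returned when the site has no icon of that size.
  config.favicon_size = '32x32'
end
```

The spec added in this release passes the same option per call, as in `LinkThumbnailer.generate(url, favicon_size: '32x32')`.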
data/lib/link_thumbnailer/model.rb
CHANGED
@@ -12,9 +12,9 @@ module LinkThumbnailer
|
|
12
12
|
def sanitize(str)
|
13
13
|
return unless str
|
14
14
|
|
15
|
-
str = str.
|
16
|
-
str.encode
|
17
|
-
str
|
15
|
+
str = str.encode("UTF-16", "UTF-8", invalid: :replace, undef: :replace, replace: "")
|
16
|
+
str = str.encode("UTF-8", "UTF-16").strip.gsub(/[\r\n\f]+/, "\n")
|
17
|
+
str
|
18
18
|
end
|
19
19
|
end
|
20
20
|
end
|
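The new `sanitize` body in the hunk above can be tried in isolation. The sketch below copies the two `encode` calls into a standalone method (outside the gem's `Model` class, which is an assumption made only to keep it self-contained) and feeds it a string with an invalid byte and mixed line endings.

```ruby
# Standalone copy of the sanitize step, for experimentation only.
def sanitize(str)
  return unless str

  # Round-trip through UTF-16 so invalid or undefined byte sequences are dropped.
  str = str.encode("UTF-16", "UTF-8", invalid: :replace, undef: :replace, replace: "")
  # Convert back to UTF-8, trim the ends, and collapse CR/LF/FF runs into one newline.
  str.encode("UTF-8", "UTF-16").strip.gsub(/[\r\n\f]+/, "\n")
end

# "\xFF" is not valid UTF-8 and is removed by the first encode call.
puts sanitize("  Hello\r\n\r\nworld \xFF ").inspect
```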
data/lib/link_thumbnailer/processor.rb
CHANGED
@@ -17,6 +17,12 @@ module LinkThumbnailer
|
|
17
17
|
super(config)
|
18
18
|
end
|
19
19
|
|
20
|
+
def start(url)
|
21
|
+
result = call(url)
|
22
|
+
shutdown
|
23
|
+
result
|
24
|
+
end
|
25
|
+
|
20
26
|
def call(url = '', redirect_count = 0, headers = {})
|
21
27
|
self.url = url
|
22
28
|
@redirect_count = redirect_count
|
@@ -28,12 +34,16 @@ module LinkThumbnailer
|
|
28
34
|
set_http_options
|
29
35
|
perform_request
|
30
36
|
end
|
31
|
-
rescue ::Net::HTTPExceptions, ::SocketError, ::Timeout::Error => e
|
37
|
+
rescue ::Net::HTTPExceptions, ::SocketError, ::Timeout::Error, ::Net::HTTP::Persistent::Error => e
|
32
38
|
raise ::LinkThumbnailer::HTTPError.new(e.message)
|
33
39
|
end
|
34
40
|
|
35
41
|
private
|
36
42
|
|
43
|
+
def shutdown
|
44
|
+
http.shutdown
|
45
|
+
end
|
46
|
+
|
37
47
|
def with_valid_url
|
38
48
|
raise ::LinkThumbnailer::BadUriFormat unless valid_url_format?
|
39
49
|
yield if block_given?
|
@@ -53,7 +63,7 @@ module LinkThumbnailer
|
|
53
63
|
end
|
54
64
|
|
55
65
|
def perform_request
|
56
|
-
response =
|
66
|
+
response = request_in_chunks
|
57
67
|
headers = {}
|
58
68
|
headers['Cookie'] = response['Set-Cookie'] if response['Set-Cookie'].present?
|
59
69
|
|
@@ -73,6 +83,19 @@ module LinkThumbnailer
|
|
73
83
|
end
|
74
84
|
end
|
75
85
|
|
86
|
+
def request_in_chunks
|
87
|
+
body = String.new
|
88
|
+
response = http.request(url) do |resp|
|
89
|
+
raise ::LinkThumbnailer::DownloadSizeLimit if too_big_download_size?(resp.content_length)
|
90
|
+
resp.read_body do |chunk|
|
91
|
+
body.concat(chunk)
|
92
|
+
raise ::LinkThumbnailer::DownloadSizeLimit if too_big_download_size?(body.length)
|
93
|
+
end
|
94
|
+
end
|
95
|
+
response.body = body
|
96
|
+
response
|
97
|
+
end
|
98
|
+
|
76
99
|
def resolve_relative_url(location)
|
77
100
|
location.start_with?('http') ? location : build_absolute_url_for(location)
|
78
101
|
end
|
@@ -101,6 +124,10 @@ module LinkThumbnailer
|
|
101
124
|
config.verify_ssl
|
102
125
|
end
|
103
126
|
|
127
|
+
def download_size_limit
|
128
|
+
config.download_size_limit
|
129
|
+
end
|
130
|
+
|
104
131
|
def too_many_redirections?
|
105
132
|
redirect_count > redirect_limit
|
106
133
|
end
|
@@ -120,6 +147,10 @@ module LinkThumbnailer
|
|
120
147
|
false
|
121
148
|
end
|
122
149
|
|
150
|
+
def too_big_download_size?(size)
|
151
|
+
size.to_i > download_size_limit.to_i
|
152
|
+
end
|
153
|
+
|
123
154
|
def url=(url)
|
124
155
|
@url = ::URI.parse(url.to_s)
|
125
156
|
end
|
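The size check added in `request_in_chunks` follows a simple pattern: reject early on the declared content length, then re-check after every chunk. The sketch below reproduces that pattern without any network access. The chunk array stands in for `resp.read_body`, and the locally defined `DownloadSizeLimit` class and the `read_with_limit` helper are hypothetical names used only for this illustration.

```ruby
# Local stand-in for LinkThumbnailer::DownloadSizeLimit.
class DownloadSizeLimit < StandardError; end

# Accumulates chunks and aborts as soon as the running total exceeds the limit.
# `declared_length` plays the role of the Content-Length header and may be nil.
def read_with_limit(chunks, limit, declared_length = nil)
  # nil.to_i is 0, so a missing Content-Length never trips this early check.
  raise DownloadSizeLimit if declared_length.to_i > limit.to_i

  body = String.new
  chunks.each do |chunk|
    body.concat(chunk)
    raise DownloadSizeLimit if body.length > limit.to_i
  end
  body
end

puts read_with_limit(%w[abc def], 10)

begin
  read_with_limit(%w[abcdef ghijkl], 10)
rescue DownloadSizeLimit
  puts "aborted: body grew past the limit"
end
```

Checking after each chunk matters because a server may omit or understate `Content-Length`; the early check alone would not stop such a download.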
data/lib/link_thumbnailer/response.rb
CHANGED
@@ -18,8 +18,8 @@ module LinkThumbnailer
|
|
18
18
|
|
19
19
|
def extract_charset
|
20
20
|
content_type = @response['Content-Type'] || ''
|
21
|
-
m = content_type.match(/charset=(\w+)/)
|
22
|
-
(m && m[1]) || ''
|
21
|
+
m = content_type.match(/charset=([\w-]+)/)
|
22
|
+
(m && m[1]) || @response.body.scrub =~ /<meta[^>]*charset\s*=\s*["']?(.+?)["' >]/i && $1 || ''
|
23
23
|
end
|
24
24
|
|
25
25
|
def extract_body
|
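The `extract_charset` change above does two things: the header regex now accepts hyphens (so `utf-8` is no longer cut to `utf`), and a `<meta ... charset=...>` tag in the body is used as a fallback. The sketch below restates that logic as a free function taking the header value and body as arguments, which is an assumption made to keep it runnable outside the gem.

```ruby
# Returns the charset from a Content-Type header value, falling back to a
# <meta ... charset=...> declaration in the body, or '' when neither is found.
def extract_charset(content_type, body)
  m = content_type.to_s.match(/charset=([\w-]+)/)
  return m[1] if m

  # scrub replaces invalid byte sequences so the regex match cannot raise.
  m = body.to_s.scrub.match(/<meta[^>]*charset\s*=\s*["']?(.+?)["' >]/i)
  m ? m[1] : ''
end

puts extract_charset('text/html; charset=utf-8', '')
puts extract_charset('text/html', '<meta charset="Shift_JIS">')
puts extract_charset('text/html', '<p>no charset here</p>').inspect
```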
data/lib/link_thumbnailer/scrapers/default/favicon.rb
CHANGED
@@ -15,7 +15,11 @@ module LinkThumbnailer
|
|
15
15
|
private
|
16
16
|
|
17
17
|
def to_uri(href)
|
18
|
-
::URI.parse(href)
|
18
|
+
uri = ::URI.parse(href)
|
19
|
+
uri.scheme ||= website.url.scheme
|
20
|
+
uri.host ||= website.url.host
|
21
|
+
uri.path = uri.path&.sub(%r{^(?=[^\/])}, '/')
|
22
|
+
uri
|
19
23
|
rescue ::URI::InvalidURIError
|
20
24
|
nil
|
21
25
|
end
|
@@ -25,13 +29,21 @@ module LinkThumbnailer
|
|
25
29
|
end
|
26
30
|
|
27
31
|
def node
|
28
|
-
document.xpath("//link[contains(@rel, 'icon')]")
|
32
|
+
icons = document.xpath("//link[contains(@rel, 'icon')]")
|
33
|
+
retrieve_by_size(icons) || icons.first
|
29
34
|
end
|
30
35
|
|
31
36
|
def modelize(uri)
|
32
37
|
model_class.new(uri)
|
33
38
|
end
|
34
39
|
|
40
|
+
def retrieve_by_size(icons)
|
41
|
+
return if config.favicon_size.nil?
|
42
|
+
|
43
|
+
icons.find do |icon|
|
44
|
+
icon.attributes['sizes']&.value == config.favicon_size
|
45
|
+
end
|
46
|
+
end
|
35
47
|
end
|
36
48
|
end
|
37
49
|
end
|
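The favicon scraper changes above combine two ideas: completing a relative `href` from the page URL, and preferring an icon whose `sizes` attribute matches `favicon_size`. The sketch below illustrates both with the standard library only. Icons are modelled as plain hashes rather than Nokogiri nodes, and the helper names `absolute_favicon_uri` and `pick_icon` are hypothetical, not part of the gem.

```ruby
require 'uri'

# Fills in a missing scheme and host from the page URL and makes the path absolute.
def absolute_favicon_uri(href, page_url)
  page = URI.parse(page_url)
  uri = URI.parse(href)
  uri.scheme ||= page.scheme
  uri.host ||= page.host
  # Prepend "/" only when the path is non-empty and does not already start with one.
  uri.path = uri.path&.sub(%r{^(?=[^/])}, '/')
  uri
rescue URI::InvalidURIError
  nil
end

# Picks the icon whose :sizes equals the preferred size, else the first icon.
def pick_icon(icons, preferred_size)
  by_size = preferred_size && icons.find { |icon| icon[:sizes] == preferred_size }
  by_size || icons.first
end

# Mirrors the three <link rel="shortcut icon"> tags of the new fixture file.
icons = [
  { href: 'http://foo.com/foo.ico' },
  { sizes: '32x32', href: 'http://foo.com/foo32x32.ico' },
  { sizes: '16x16', href: 'http://foo.com/foo16x16.ico' }
]

puts pick_icon(icons, '32x32')[:href]
puts pick_icon(icons, nil)[:href]
puts absolute_favicon_uri('favicon.ico', 'http://foo.com/page')
puts absolute_favicon_uri('/favicon.ico', 'http://foo.com/page')
```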
data/link_thumbnailer.gemspec
CHANGED
@@ -25,5 +25,5 @@ Gem::Specification.new do |spec|
|
|
25
25
|
spec.add_dependency 'nokogiri', '>= 1.6'
|
26
26
|
spec.add_dependency 'net-http-persistent', '>= 2.9'
|
27
27
|
spec.add_dependency 'video_info', '>= 2.6'
|
28
|
-
spec.add_dependency 'image_info', '>= 1.0'
|
28
|
+
spec.add_dependency 'image_info', ['~> 1.0', '>= 1.2.0']
|
29
29
|
end
|
data/spec/fixture_spec.rb
CHANGED
@@ -109,6 +109,25 @@ describe 'Fixture' do
|
|
109
109
|
|
110
110
|
end
|
111
111
|
|
112
|
+
context 'with 32 favicon size' do
|
113
|
+
let(:action) { LinkThumbnailer.generate(url, favicon_size: '32x32') }
|
114
|
+
let(:favicon) { 'http://foo.com/foo32x32.ico' }
|
115
|
+
let(:html) { File.open(File.dirname(__FILE__) + '/fixtures/default_with_few_favicons.html').read }
|
116
|
+
|
117
|
+
it { expect(action.favicon).to eq(favicon) }
|
118
|
+
end
|
119
|
+
|
120
|
+
context 'when favicon with root path in the href' do
|
121
|
+
let(:html) { File.open(File.dirname(__FILE__) + '/fixtures/with_root_path_in_href.html').read }
|
122
|
+
|
123
|
+
it { expect(action.favicon).to eq(favicon) }
|
124
|
+
end
|
125
|
+
|
126
|
+
context 'when favicon with related path in the href' do
|
127
|
+
let(:html) { File.open(File.dirname(__FILE__) + '/fixtures/with_related_path_in_href.html').read }
|
128
|
+
|
129
|
+
it { expect(action.favicon).to eq(favicon) }
|
130
|
+
end
|
112
131
|
end
|
113
132
|
|
114
133
|
end
|
data/spec/fixtures/default_with_few_favicons.html
ADDED
@@ -0,0 +1,15 @@
|
|
1
|
+
<html>
|
2
|
+
<head>
|
3
|
+
<title>Title from meta</title>
|
4
|
+
<link rel="shortcut icon" href="http://foo.com/foo.ico">
|
5
|
+
<link rel="shortcut icon" sizes='32x32' href="http://foo.com/foo32x32.ico">
|
6
|
+
<link rel="shortcut icon" sizes='16x16' href="http://foo.com/foo16x16.ico">
|
7
|
+
</head>
|
8
|
+
<body>
|
9
|
+
|
10
|
+
<p>Description from body</p>
|
11
|
+
|
12
|
+
<img src="http://foo.com/foo.png">
|
13
|
+
|
14
|
+
</body>
|
15
|
+
</html>
|
data/spec/fixtures/google_utf8_no_meta_charset.html
ADDED
@@ -0,0 +1,6 @@
|
|
1
|
+
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="ja"><head><meta content="世界中のあらゆる情報を検索するためのツールを提供しています。さまざまな検索機能を活用して、お探しの情報を見つけてください。" name="description"><meta content="noodp" name="robots"><meta content="text/html" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script>(function(){window.google={kEI:'716VWJL3JYq18QWo_qzoAg',kEXPI:'750722,1351903,1352241,1352381,3700253,3700347,4028875,4029815,4032677,4038012,4043492,4045839,4048347,4062666,4065787,4068816,4069773,4069838,4069840,4070138,4072773,4073405,4073726,4073959,4076096,4076931,4076999,4077777,4078438,4079081,4079105,4079894,4081038,4081485,4082194,4082201,4082298,4082619,4083044,4083476,4084343,4084673,4085336,4085412,4085683,4086011,4089003,4089144,4089183,4089427,4089538,4089913,4090414,4090547,4090549,4090598,4090657,4090806,4090893,4091966,4092028,4092182,4092218,4092474,4092478,4092598,4092864,4092867,4092875,4092897,4092934,4093073,4093948,4094169,4094250,4094769,4094987,4094997,4095554,4095771,4095907,4095998,8300096,8300272,8507380,8507419,8507861,8507940,8508624,8510023,10200083,13500022,13500024',authuser:0,kscs:'c9c918f0_24'};google.kHL='ja';})();(function(){google.lc=[];google.li=0;google.getEI=function(a){for(var b;a&&(!a.getAttribute||!(b=a.getAttribute("eid")));)a=a.parentNode;return b||google.kEI};google.getLEI=function(a){for(var b=null;a&&(!a.getAttribute||!(b=a.getAttribute("leid")));)a=a.parentNode;return b};google.https=function(){return"https:"==window.location.protocol};google.ml=function(){return null};google.wl=function(a,b){try{google.ml(Error(a),!1,b)}catch(c){}};google.time=function(){return(new Date).getTime()};google.log=function(a,b,c,d,g){a=google.logUrl(a,b,c,d,g);if(""!=a){b=new Image;var e=google.lc,f=google.li;e[f]=b;b.onerror=b.onload=b.onabort=function(){delete 
e[f]};window.google&&window.google.vel&&window.google.vel.lu&&window.google.vel.lu(a);b.src=a;google.li=f+1}};google.logUrl=function(a,b,c,d,g){var e="",f=google.ls||"";c||-1!=b.search("&ei=")||(e="&ei="+google.getEI(d),-1==b.search("&lei=")&&(d=google.getLEI(d))&&(e+="&lei="+d));a=c||"/"+(g||"gen_204")+"?atyp=i&ct="+a+"&cad="+b+e+f+"&zx="+google.time();/^http:/i.test(a)&&google.https()&&(google.ml(Error("a"),!1,{src:a,glmm:1}),a="");return a};google.y={};google.x=function(a,b){google.y[a.id]=[a,b];return!1};google.lq=[];google.load=function(a,b,c){google.lq.push([[a],b,c])};google.loadAll=function(a,b){google.lq.push([a,b])};}).call(this);var a=window.location,b=a.href.indexOf("#");if(0<=b){var c=a.href.substring(b+1);/(^|&)q=/.test(c)&&-1==c.indexOf("#")&&a.replace("/search?"+c.replace(/(^|&)fp=[^&]*/g,"")+"&cad=h")};</script><style>#gbar,#guser{font-size:13px;padding-top:1px !important;}#gbar{height:22px}#guser{padding-bottom:7px !important;text-align:right}.gbh,.gbd{border-top:1px solid #c9d7f1;font-size:1px}.gbh{height:0;position:absolute;top:24px;width:100%}@media all{.gb1{height:22px;margin-right:.5em;vertical-align:top}#gbar{float:left}}a.gb1,a.gb4{text-decoration:underline !important}a.gb1,a.gb4{color:#00c !important}.gbi .gb4{color:#dd8e27 !important}.gbf .gb4{color:#900 !important}
|
2
|
+
</style><style>body,td,a,p,.h{font-family:arial,sans-serif}body{margin:0;overflow-y:scroll}#gog{padding:3px 8px 0}td{line-height:.8em}.gac_m td{line-height:17px}form{margin-bottom:20px}.h{color:#36c}.q{color:#00c}.ts td{padding:0}.ts{border-collapse:collapse}em{font-weight:bold;font-style:normal}.lst{height:25px;width:496px}.gsfi,.lst{font:18px arial,sans-serif}.gsfs{font:17px arial,sans-serif}.ds{display:inline-box;display:inline-block;margin:3px 0 4px;margin-left:4px}input{font-family:inherit}a.gb1,a.gb2,a.gb3,a.gb4{color:#11c !important}body{background:#fff;color:black}a{color:#11c;text-decoration:none}a:hover,a:active{text-decoration:underline}.fl a{color:#36c}a:visited{color:#551a8b}a.gb1,a.gb4{text-decoration:underline}a.gb3:hover{text-decoration:none}#ghead a.gb2:hover{color:#fff !important}.sblc{padding-top:5px}.sblc a{display:block;margin:2px 0;margin-left:13px;font-size:11px}.lsbb{background:#eee;border:solid 1px;border-color:#ccc #999 #999 #ccc;height:30px}.lsbb{display:block}.ftl,#fll a{display:inline-block;margin:0 12px}.lsb{background:url(/images/nav_logo229.png) 0 -261px repeat-x;border:none;color:#000;cursor:pointer;height:30px;margin:0;outline:0;font:15px arial,sans-serif;vertical-align:top}.lsb:active{background:#ccc}.lst:focus{outline:none}</style><script></script><link href="/images/branding/product/ico/googleg_lodp.ico" rel="shortcut icon"></head><body bgcolor="#fff"><script>(function(){var src='/images/nav_logo229.png';var iesg=false;document.body.onload = function(){window.n && window.n();if (document.images){new Image().src=src;}
|
3
|
+
if (!iesg){document.f&&document.f.q.focus();document.gbqf&&document.gbqf.q.focus();}
|
4
|
+
}
|
5
|
+
})();</script><div id="mngb"> <div id=gbar><nobr><b class=gb1>検索</b> <a class=gb1 href="https://www.google.co.jp/imghp?hl=ja&tab=wi">画像</a> <a class=gb1 href="https://maps.google.co.jp/maps?hl=ja&tab=wl">マップ</a> <a class=gb1 href="https://play.google.com/?hl=ja&tab=w8">Play</a> <a class=gb1 href="https://www.youtube.com/?gl=JP&tab=w1">YouTube</a> <a class=gb1 href="https://news.google.co.jp/nwshp?hl=ja&tab=wn">ニュース</a> <a class=gb1 href="https://mail.google.com/mail/?tab=wm">Gmail</a> <a class=gb1 href="https://drive.google.com/?tab=wo">ドライブ</a> <a class=gb1 style="text-decoration:none" href="https://www.google.co.jp/intl/ja/options/"><u>もっと見る</u> »</a></nobr></div><div id=guser width=100%><nobr><span id=gbn class=gbi></span><span id=gbf class=gbf></span><span id=gbe></span><a href="http://www.google.co.jp/history/optout?hl=ja" class=gb4>ウェブ履歴</a> | <a href="/preferences?hl=ja" class=gb4>設定</a> | <a target=_top id=gb_70 href="https://accounts.google.com/ServiceLogin?hl=ja&passive=true&continue=https://www.google.co.jp/" class=gb4>ログイン</a></nobr></div><div class=gbh style=left:0></div><div class=gbh style=right:0></div> </div><center><br clear="all" id="lgpd"><div id="lga"><div style="padding:28px 0 3px"><div style="height:110px;width:276px;background:url(/images/branding/googlelogo/1x/googlelogo_white_background_color_272x92dp.png) no-repeat" title="Google" align="left" id="hplogo" onload="window.lol&&lol()"><div style="color:#777;font-size:16px;font-weight:bold;position:relative;top:70px;left:218px" nowrap="">日本</div></div></div><br></div><form action="/search" name="f"><table cellpadding="0" cellspacing="0"><tr valign="top"><td width="25%"> </td><td align="center" nowrap=""><input name="ie" value="Shift_JIS" type="hidden"><input value="ja" name="hl" type="hidden"><input name="source" type="hidden" value="hp"><input name="biw" type="hidden"><input name="bih" type="hidden"><div class="ds" style="height:32px;margin:4px 0"><input 
style="color:#000;margin:0;padding:5px 8px 0 6px;vertical-align:top" autocomplete="off" class="lst" value="" title="Google 検索" maxlength="2048" name="q" size="57"></div><br style="line-height:0"><span class="ds"><span class="lsbb"><input class="lsb" value="Google 検索" name="btnG" type="submit"></span></span><span class="ds"><span class="lsbb"><input class="lsb" value="I'm Feeling Lucky" name="btnI" onclick="if(this.form.q.value)this.checked=1; else top.location='/doodles/'" type="submit"></span></span></td><td class="fl sblc" align="left" nowrap="" width="25%"><a href="/advanced_search?hl=ja&authuser=0">検索オプション</a><a href="/language_tools?hl=ja&authuser=0">言語ツール</a></td></tr></table><input id="gbv" name="gbv" type="hidden" value="1"></form><div id="gac_scont"></div><div style="font-size:83%;min-height:3.5em"><br></div><span id="footer"><div style="font-size:10pt"><div style="margin:19px auto;text-align:center" id="fll"><a href="/intl/ja/ads/">広告掲載</a><a href="http://www.google.co.jp/intl/ja/services/">ビジネス ソリューション</a><a href="https://plus.google.com/115899767381375908215" rel="publisher">+Google</a><a href="/intl/ja/about.html">Google について</a><a href="https://www.google.co.jp/setprefdomain?prefdom=US&sig=__NSwOcr0nRFmauaxJWrHJmztdkTc%3D" id="fehl">Google.com</a></div></div><p style="color:#767676;font-size:8pt">© 2017 - <a href="/intl/ja/policies/privacy/">プライバシー</a> - <a href="/intl/ja/policies/terms/">規約</a></p></span></center><script>(function(){window.google.cdo={height:0,width:0};(function(){var a=window.innerWidth,b=window.innerHeight;if(!a||!b)var c=window.document,d="CSS1Compat"==c.compatMode?c.documentElement:c.body,a=d.clientWidth,b=d.clientHeight;a&&b&&(a!=google.cdo.width||b!=google.cdo.height)&&google.log("","","/client_204?&atyp=i&biw="+a+"&bih="+b+"&ei="+google.kEI);}).call(this);})();</script><div id="xjsd"></div><div id="xjsi"><script>(function(){function c(b){window.setTimeout(function(){var 
a=document.createElement("script");a.src=b;document.getElementById("xjsd").appendChild(a)},0)}google.dljp=function(b,a){google.xjsu=b;c(a)};google.dlj=c;}).call(this);(function(){window.google.xjsrm=[];})();if(google.y)google.y.first=[];if(!google.xjs){window._=window._||{};window._._DumpException=function(e){throw e};if(google.timers&&google.timers.load.t){google.timers.load.t.xjsls=new Date().getTime();}google.dljp('/xjs/_/js/k\x3dxjs.hp.en_US.usfcb5N0rIw.O/m\x3dsb_he,d/rt\x3dj/d\x3d1/t\x3dzcms/rs\x3dACT90oH_6o1ZdJ1OdBerX9LppN-xEd11Eg','/xjs/_/js/k\x3dxjs.hp.en_US.usfcb5N0rIw.O/m\x3dsb_he,d/rt\x3dj/d\x3d1/t\x3dzcms/rs\x3dACT90oH_6o1ZdJ1OdBerX9LppN-xEd11Eg');google.xjs=1;}google.pmc={"sb_he":{"agen":true,"cgen":true,"client":"heirloom-hp","dh":true,"dhqt":true,"ds":"","fl":true,"host":"google.co.jp","isbh":28,"jam":0,"jsonp":true,"msgs":{"cibl":"検索をクリア","dym":"もしかして:","lcky":"I\u0026#39;m Feeling Lucky","lml":"詳細","oskt":"入力ツール","psrc":"この検索キーワードは\u003Ca href=\"/history\"\u003Eウェブ履歴\u003C/a\u003Eから削除されました","psrl":"削除","sbit":"画像で検索","srch":"Google 検索"},"nds":true,"ovr":{},"pq":"","refpd":true,"refspre":true,"rfs":[],"scd":10,"sce":5,"stok":"WXDdTX_0YGr0J9JRNcHRvk23BlI"},"d":{}};google.y.first.push(function(){if(google.med){google.med('init');google.initHistory();google.med('history');}});if(google.j&&google.j.en&&google.j.xi){window.setTimeout(google.j.xi,0);}
|
6
|
+
</script></div></body></html>
|
data/spec/processor_spec.rb
CHANGED
@@ -7,7 +7,6 @@ describe LinkThumbnailer::Processor do
|
|
7
7
|
let(:page) { ::LinkThumbnailer::Page.new(url, {}) }
|
8
8
|
let(:instance) { described_class.new }
|
9
9
|
let(:url) { 'http://foo.com' }
|
10
|
-
|
11
10
|
before do
|
12
11
|
allow(LinkThumbnailer).to receive(:page).and_return(page)
|
13
12
|
end
|
@@ -15,6 +14,7 @@ describe LinkThumbnailer::Processor do
|
|
15
14
|
describe '#call' do
|
16
15
|
|
17
16
|
let(:action) { instance.call(url) }
|
17
|
+
let(:https_action) { instance.call(https_url) }
|
18
18
|
|
19
19
|
context 'when redirect_count is greater than config' do
|
20
20
|
|
@@ -44,6 +44,16 @@ describe LinkThumbnailer::Processor do
|
|
44
44
|
|
45
45
|
end
|
46
46
|
|
47
|
+
context 'on http unauthorized error' do
|
48
|
+
|
49
|
+
before do
|
50
|
+
stub_request(:get, url).to_return(status: 401, body: '', headers: {})
|
51
|
+
end
|
52
|
+
|
53
|
+
it { expect { action }.to raise_error(LinkThumbnailer::HTTPError) }
|
54
|
+
|
55
|
+
end
|
56
|
+
|
47
57
|
context 'on http redirection' do
|
48
58
|
|
49
59
|
let(:body) { 'foo' }
|
@@ -202,31 +212,41 @@ describe LinkThumbnailer::Processor do
|
|
202
212
|
|
203
213
|
end
|
204
214
|
|
205
|
-
context 'when access non-utf8
|
215
|
+
context 'when access non-utf8 encoded website' do
|
216
|
+
|
206
217
|
let(:code) { 200 }
|
207
|
-
let(:
|
208
|
-
|
209
|
-
|
210
|
-
|
211
|
-
|
212
|
-
|
213
|
-
let(:response) do
|
214
|
-
r = ::Net::HTTPSuccess.new('', code, body_shift_jis)
|
215
|
-
r['Content-Type'] = 'text/html'
|
216
|
-
r.body = body_shift_jis
|
217
|
-
r.instance_variable_set(:@read, true)
|
218
|
-
r
|
218
|
+
let(:utf8_encoded_body) { File.read(File.expand_path('fixtures/google_utf8.html', File.dirname(__FILE__))) }
|
219
|
+
let(:shift_jis_encoded_body) { File.read(File.expand_path('fixtures/google_shift_jis.html', File.dirname(__FILE__))) }
|
220
|
+
let(:response) { ::Net::HTTPSuccess.new('', code, body) }
|
221
|
+
|
222
|
+
before do
|
223
|
+
allow(response).to receive(:body).and_return(body)
|
219
224
|
end
|
220
225
|
|
221
|
-
context 'when http success with valid charset provided in Content-Type' do
|
222
|
-
|
223
|
-
|
226
|
+
context 'when http success with valid charset provided in Content-Type header' do
|
227
|
+
|
228
|
+
let(:body) { shift_jis_encoded_body }
|
229
|
+
|
230
|
+
before do
|
231
|
+
response['Content-Type'] = 'text/html; charset=Shift-JIS'
|
232
|
+
end
|
233
|
+
|
234
|
+
it { expect(action).to eq(shift_jis_encoded_body) }
|
235
|
+
|
224
236
|
end
|
225
237
|
|
226
|
-
context 'when http success with
|
227
|
-
|
228
|
-
|
238
|
+
context 'when http success with valid charset provided in Content-Type header' do
|
239
|
+
|
240
|
+
let(:body) { shift_jis_encoded_body }
|
241
|
+
|
242
|
+
before do
|
243
|
+
response['Content-Type'] = 'text/html; charset=Shift_JIS'
|
244
|
+
end
|
245
|
+
|
246
|
+
it { expect(action).to eq(utf8_encoded_body) }
|
247
|
+
|
229
248
|
end
|
249
|
+
|
230
250
|
end
|
231
251
|
|
232
252
|
context 'when http redirection' do
|
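Note on the charset contexts in processor_spec.rb above: with `charset=Shift_JIS` in the Content-Type header the spec expects the UTF-8 fixture back, while with `charset=Shift-JIS` it expects the body unchanged. A minimal sketch of a conversion that behaves this way for recognised and unrecognised charset names follows. The helper name `to_utf8` is hypothetical and this is not the gem's actual implementation; it only illustrates the technique with Ruby's standard `Encoding` API.

```ruby
# Hypothetical helper, not part of link_thumbnailer: re-encode an HTTP body
# to UTF-8 using the charset announced by the server.
def to_utf8(body, charset)
  # Encoding.find raises ArgumentError when the name is not known to Ruby.
  source = Encoding.find(charset)
  # Work on a copy so the caller's string keeps its original encoding tag.
  body.dup.force_encoding(source)
      .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
rescue ArgumentError
  # Unknown charset name: hand the body back untouched.
  body
end

# "\u65E5\u672C" is the two-character string for "Japan" used in the fixture.
sjis_bytes = "\u65E5\u672C".encode('Shift_JIS').b
puts to_utf8(sjis_bytes, 'Shift_JIS')
```

Replacing invalid or undefined bytes instead of raising is a design choice made for this sketch; a stricter caller could drop the `invalid:`/`undef:` options and handle `EncodingError` itself.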
data/spec/response_spec.rb
CHANGED
@@ -24,12 +24,16 @@ describe LinkThumbnailer::Response do
|
|
24
24
|
File.read(File.expand_path('fixtures/google_utf8.html', File.dirname(__FILE__)))
|
25
25
|
end
|
26
26
|
|
27
|
+
let(:body_utf8_no_meta_charset) do
|
28
|
+
File.read(File.expand_path('fixtures/google_utf8_no_meta_charset.html', File.dirname(__FILE__)))
|
29
|
+
end
|
30
|
+
|
27
31
|
before do
|
28
32
|
allow(LinkThumbnailer).to receive(:page).and_return(page)
|
29
33
|
end
|
30
34
|
|
31
35
|
describe '#charset' do
|
32
|
-
context 'when charset provided in content-type' do
|
36
|
+
context 'when charset provided in content-type "Shift_JIS"' do
|
33
37
|
before do
|
34
38
|
response['Content-Type'] = 'text/html; charset=Shift_JIS'
|
35
39
|
end
|
@@ -37,7 +41,25 @@ describe LinkThumbnailer::Response do
|
|
37
41
|
it { expect(instance.charset).to eq 'Shift_JIS' }
|
38
42
|
end
|
39
43
|
|
40
|
-
context 'when
|
44
|
+
context 'when charset provided in content-type "utf-8"' do
|
45
|
+
before do
|
46
|
+
response['Content-Type'] = 'text/html; charset=utf-8'
|
47
|
+
end
|
48
|
+
|
49
|
+
it { expect(instance.charset).to eq 'utf-8' }
|
50
|
+
end
|
51
|
+
|
52
|
+
context 'when no charset available in content-type and charset provided in meta tag "UTF-8"' do
|
53
|
+
before do
|
54
|
+
response.body = body_utf8
|
55
|
+
end
|
56
|
+
it { expect(instance.charset).to eq 'UTF-8' }
|
57
|
+
end
|
58
|
+
|
59
|
+
context 'when no charset available in content-type and body' do
|
60
|
+
before do
|
61
|
+
response.body = body_utf8_no_meta_charset
|
62
|
+
end
|
41
63
|
it { expect(instance.charset).to eq '' }
|
42
64
|
end
|
43
65
|
end
|
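Note on the `#charset` contexts in response_spec.rb above: they describe a lookup order (Content-Type header first, then a meta tag in the body, then an empty string). A minimal sketch of that order, assuming a hypothetical standalone function `detect_charset` rather than the gem's `Response` class, could look like this:

```ruby
# Hypothetical sketch, not the gem's Response#charset implementation.
# Returns the charset from the Content-Type header if present, otherwise
# the one declared in a <meta> tag, otherwise an empty string.
def detect_charset(content_type, body)
  pattern = /charset=["']?([\w-]+)/i

  from_header = content_type.to_s[pattern, 1]
  return from_header if from_header

  # Scan the body as raw bytes so bodies that are not valid UTF-8
  # cannot make the regexp match raise.
  body.to_s.b[/<meta[^>]+charset=["']?([\w-]+)/i, 1] || ''
end

puts detect_charset('text/html; charset=Shift_JIS', '')
puts detect_charset('text/html', '<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">')
```

The value is returned exactly as written by the server or page (`utf-8` and `UTF-8` are not normalised), which matches what the spec contexts assert.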
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: link_thumbnailer
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 3.
|
4
|
+
version: 3.4.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Pierre-Louis Gottfrois
|
8
|
-
autorequire:
|
8
|
+
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2020-07-24 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: activesupport
|
@@ -98,16 +98,22 @@ dependencies:
|
|
98
98
|
name: image_info
|
99
99
|
requirement: !ruby/object:Gem::Requirement
|
100
100
|
requirements:
|
101
|
-
- - "
|
101
|
+
- - "~>"
|
102
102
|
- !ruby/object:Gem::Version
|
103
103
|
version: '1.0'
|
104
|
+
- - ">="
|
105
|
+
- !ruby/object:Gem::Version
|
106
|
+
version: 1.2.0
|
104
107
|
type: :runtime
|
105
108
|
prerelease: false
|
106
109
|
version_requirements: !ruby/object:Gem::Requirement
|
107
110
|
requirements:
|
108
|
-
- - "
|
111
|
+
- - "~>"
|
109
112
|
- !ruby/object:Gem::Version
|
110
113
|
version: '1.0'
|
114
|
+
- - ">="
|
115
|
+
- !ruby/object:Gem::Version
|
116
|
+
version: 1.2.0
|
111
117
|
description: Ruby gem generating thumbnail images from a given URL.
|
112
118
|
email:
|
113
119
|
- pierrelouis.gottfrois@gmail.com
|
@@ -177,13 +183,17 @@ files:
|
|
177
183
|
- spec/fixtures/bar.png
|
178
184
|
- spec/fixtures/default_from_body.html
|
179
185
|
- spec/fixtures/default_from_meta.html
|
186
|
+
- spec/fixtures/default_with_few_favicons.html
|
180
187
|
- spec/fixtures/foo.png
|
181
188
|
- spec/fixtures/google_shift_jis.html
|
182
189
|
- spec/fixtures/google_utf8.html
|
190
|
+
- spec/fixtures/google_utf8_no_meta_charset.html
|
183
191
|
- spec/fixtures/og_not_valid_example.html
|
184
192
|
- spec/fixtures/og_valid_example.html
|
185
193
|
- spec/fixtures/og_valid_multi_image_example.html
|
186
194
|
- spec/fixtures/og_valid_multi_video_example.html
|
195
|
+
- spec/fixtures/with_related_path_in_href.html
|
196
|
+
- spec/fixtures/with_root_path_in_href.html
|
187
197
|
- spec/grader_spec.rb
|
188
198
|
- spec/graders/base_spec.rb
|
189
199
|
- spec/graders/html_attribute_spec.rb
|
@@ -211,7 +221,7 @@ files:
|
|
211
221
|
homepage: https://github.com/gottfrois/link_thumbnailer
|
212
222
|
licenses: []
|
213
223
|
metadata: {}
|
214
|
-
post_install_message:
|
224
|
+
post_install_message:
|
215
225
|
rdoc_options: []
|
216
226
|
require_paths:
|
217
227
|
- lib
|
@@ -226,9 +236,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
226
236
|
- !ruby/object:Gem::Version
|
227
237
|
version: '0'
|
228
238
|
requirements: []
|
229
|
-
|
230
|
-
|
231
|
-
signing_key:
|
239
|
+
rubygems_version: 3.0.8
|
240
|
+
signing_key:
|
232
241
|
specification_version: 4
|
233
242
|
summary: Ruby gem ranking images from a given URL returning an object containing images
|
234
243
|
and website informations.
|
@@ -238,13 +247,17 @@ test_files:
|
|
238
247
|
- spec/fixtures/bar.png
|
239
248
|
- spec/fixtures/default_from_body.html
|
240
249
|
- spec/fixtures/default_from_meta.html
|
250
|
+
- spec/fixtures/default_with_few_favicons.html
|
241
251
|
- spec/fixtures/foo.png
|
242
252
|
- spec/fixtures/google_shift_jis.html
|
243
253
|
- spec/fixtures/google_utf8.html
|
254
|
+
- spec/fixtures/google_utf8_no_meta_charset.html
|
244
255
|
- spec/fixtures/og_not_valid_example.html
|
245
256
|
- spec/fixtures/og_valid_example.html
|
246
257
|
- spec/fixtures/og_valid_multi_image_example.html
|
247
258
|
- spec/fixtures/og_valid_multi_video_example.html
|
259
|
+
- spec/fixtures/with_related_path_in_href.html
|
260
|
+
- spec/fixtures/with_root_path_in_href.html
|
248
261
|
- spec/grader_spec.rb
|
249
262
|
- spec/graders/base_spec.rb
|
250
263
|
- spec/graders/html_attribute_spec.rb
|
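Note on the `image_info` dependency in the metadata above: 3.4.0 lists two constraints, `~> 1.0` and `>= 1.2.0`. The sketch below uses RubyGems' own `Gem::Requirement` to show which versions such a pair accepts; the version numbers tried are illustrative and are not taken from the diff.

```ruby
require 'rubygems'

# Both constraints must hold: "~> 1.0" means >= 1.0 and < 2.0,
# and ">= 1.2.0" raises the lower bound.
requirement = Gem::Requirement.new('~> 1.0', '>= 1.2.0')

%w[1.1.0 1.2.0 1.9.3 2.0.0].each do |candidate|
  accepted = requirement.satisfied_by?(Gem::Version.new(candidate))
  puts "#{candidate}: #{accepted}"
end
```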