directlink 0.0.4.3 → 0.0.4.4

Files changed (7)
  1. checksums.yaml +4 -4
  2. data/.bashrc +2 -1
  3. data/README.md +17 -2
  4. data/directlink.gemspec +2 -2
  5. data/lib/directlink.rb +61 -48
  6. data/test.rb +118 -16
  7. metadata +4 -4
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 57cb3f9cecd52d2a4ac93edb8dd372398b0a54fa
- data.tar.gz: 4dc3b43d8f661432207c6481eac1d3246377a003
+ metadata.gz: 7e252357f3a65696511a206fa8fdf8bebd65cc97
+ data.tar.gz: f4fddedf24809bed96c56a3e063b92d8feeb2cb9
  SHA512:
- metadata.gz: f1199f9b98636f09e58676764e1c943ef792fe24271f1d127bbbb8d7b8355544924d13cf19be91ff06ed468e885dbd98def0c5189ce351d28daacda35a9c12e1
- data.tar.gz: 0c5e5720f11a60aa8548f8598b36d2395e1fd08a2bea654c2fe288916e85399d3897e8843aa2a90fb91e02e813b38859754362cb97373307e09b6bb92d139086
+ metadata.gz: c955776c59833e37194229370d7d26b17d1e6d48674b1cde952cc262eb9705e8681f8eb68d700b7de41abe31bb976a1fb297a18712a157dc0aec364b9d668377
+ data.tar.gz: cf7513d360fdfb6bad2108906398526ed18ee588dbd0a99e1961f5c8d593bd6061049376fad8936e0c835df0f4f6c1decc92952aa4040c4851a78dbb6bdc3d20
data/.bashrc CHANGED
@@ -1,2 +1,3 @@
  source api_tokens_for_travis.sh
- echo ruby test.rb
+ echo 'to test: ruby test.rb'
+ echo 'to test: bundle exec ./bin/directlink --debug ...'
data/README.md CHANGED
@@ -23,6 +23,13 @@ $ directlink //4.bp.blogspot.com/-5kP8ndL0kuM/Wpt82UCqvmI/AAAAAAAAEjI/ZbbZWs0-kg
  => https://4.bp.blogspot.com/-5kP8ndL0kuM/Wpt82UCqvmI/AAAAAAAAEjI/ZbbZWs0-kgwRXEJ9JEGioR0bm6U8MOkvQCKgBGAs/s0/IMG_20171223_093922.jpg
  jpeg 4160x3120
  ```
+ Given the link to a page it tries to find the main image on it.
+ ```
+ $ directlink https://plus.google.com/107956229381790410785/posts/Gu9apRHri41
+ <= https://plus.google.com/107956229381790410785/posts/Gu9apRHri41
+ => https://lh3.googleusercontent.com/-mRDjiHoDA30/W0mndQaRXeI/AAAAAAAAfyA/NhZGMAoQsbAb8cUFDzNWh-NXQ9O-YQhuQCJoC/s0/001
+ jpeg 2000x1328
+ ```
  Retrieves all images from Imgur album or gallery, orders them by resolution from high to low:
  ```
  $ directlink https://imgur.com/a/oacI3gl
@@ -91,7 +98,7 @@ $ export REDDIT_SECRETS=secrets.yaml
 
  #### the "don't give up mode"
 
- If the passed link is not the image link or a photo page of a known image hosting, the tool is still able to find the main images that the linked webpage contains (here it found three images in the markdown file):
+ If the passed link is not the image link or a photo page of a known image hosting, the tool is still able to find the main images that the linked webpage contains. Like in the second example of this README or here -- it found three images in the markdown file:
  ```
  $ directlink https://github.com/Nakilon/dhash-vips
  <= https://github.com/Nakilon/dhash-vips
@@ -155,10 +162,17 @@ DirectLink "http://minus.com/", 0
  SocketError: Failed to open TCP connection to minus.com:80 (getaddrinfo: nodename nor servname provided, or not known) to http://minus.com/
  ```
 
+ #### Ruby 2.0
+
+ The `addressable` dependency (for proper URI parsing) has a dependency that by default wants Ruby 2.1 or higher. You may fix it safely by adding this line to your `Gemfile`:
+ ```
+ gem "jwt", "<2"
+ ```
+
  ## Notes:
 
  * `module DirectLink` public methods return different sets of properties -- `DirectLink()` unites them
- * the `ErrorAssert` and `ErrorMissingEnvVar` should never be raised and you might report it if it does
+ * the `ErrorAssert`, `ErrorMissingEnvVar` and `URI::InvalidURIError` should never be raised and you might report it
  * style: `@@` and lambdas are used to keep things private
  * this gem is a historically 2 or 3 libraries merged -- this is why tests may look awkward
  * 500px.com has discontinued API in June 2018 -- the tool now uses undocumented methods
@@ -166,3 +180,4 @@ SocketError: Failed to open TCP connection to minus.com:80 (getaddrinfo: nodenam
 
  TODO: maybe make all these web service specific methods private and discourage to use them since they all return very different things and sometimes don't raise exceptions while the `DirectLink()` does
  TODO: what should `--json` print if exception was thrown?
+ TODO: looped prompt mode
data/directlink.gemspec CHANGED
@@ -1,6 +1,6 @@
  Gem::Specification.new do |spec|
  spec.name = "directlink"
- spec.version = "0.0.4.3"
+ spec.version = "0.0.4.4"
  spec.summary = "converts any kind of image hyperlink to direct link, type of image and its resolution"
 
  spec.author = "Victor Maslov aka Nakilon"
@@ -11,7 +11,7 @@ Gem::Specification.new do |spec|
 
  spec.add_dependency "fastimage", "~>2.1.3"
  spec.add_dependency "nokogiri"
- spec.add_dependency "reddit_bot", "~>1.6.7"
+ spec.add_dependency "reddit_bot", "~>1.6.8"
  spec.add_dependency "kramdown"
  spec.add_dependency "addressable"
  spec.add_development_dependency "minitest"
data/lib/directlink.rb CHANGED
@@ -46,25 +46,29 @@ module DirectLink
  /\A(https:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9_-]{11}\/W[a-zA-Z0-9_-]{9}I\/AAAAAAAA[a-zA-Z0-9_]{3}\/[a-zA-Z0-9_-]{32}[gwAQ]CJoC\/)w4\d\d-h318-n(\/[^\/]+)\z/,
  /\A(https:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9_]{11}\/W[a-zA-Z0-9]{9}I\/AAAAAAAA[a-zA-Z0-9]{3}\/[a-zA-Z0-9_-]{32}[gw]CJoC\/)w48\d-h8\d\d-n(\/[^\/]+)\z/,
  /\A(https:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9_-]{11}\/W[a-zA-Z0-9_-]{9}I\/AAAAAAA[a-zA-Z0-9_-]{4}\/[a-zA-Z0-9_-]{33}(?:CJoC|CL0B(?:GAs)?)\/)w530(?:-d)?-h[1-9]\d\d-n(\/[^\/]+)\z/,
- /\A(https:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9]{11}\/W[a-zA-Z0-9]{9}I\/AAAAAAAA[a-zA-Z]{3}\/[a-zA-Z0-9-]{32}QCJoC\/)w530-h175-n(\/[^\/]+)\z/,
- /\A(https:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z-]{11}\/W[a-zA-Z-]{9}I\/AAAAAAAA[a-zA-Z]{3}\/[a-zA-Z0-9_-]{32}ACJoC\/)w179-h318-n(\/[^\/]+)\z/,
- /\A(https:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9-]{11}\/W[a-zA-Z0-9]{9}I\/AAAAAAAAA[A-Z]{2}\/[a-zA-Z0-9]{32}QCJoC\/)w208-h318-n(\/[^\/]+)\z/
+ /\A(https:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z-]{11}\/W[a-zA-Z-]{9}I\/AAAAAAAA[a-zA-Z]{3}\/[a-zA-Z0-9_-]{32}ACJoC\/)w179-h318-n(\/[^\/]+)\z/
  "#{$1}s#{width}#{$2}"
- when /\A(\/\/lh3\.googleusercontent\.com\/proxy\/[a-zA-Z0-9_-]{66,523}=)(?:w(?:464|504|530)-h\d\d\d-[np]|s530-p|s110-p-k)\z/
+ when /\A(\/\/lh3\.googleusercontent\.com\/proxy\/[a-zA-Z0-9_-]{66,523}=)(?:w(?:[45]\d\d)-h\d\d\d-[np]|s530-p|s110-p-k)\z/
  "https:#{$1}s#{width}"
  when /\A(\/\/lh3\.googleusercontent\.com\/cOh2Nsv7EGo0QbuoKxoKZVZO_NcBzufuvPtzirMJfPmAzCzMtnEncfA7zGIDTJfkc1YZFX2MhgKnjA=)w530-h398-p\z/
  "https:#{$1}s#{width}"
- when /\A(\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9-]{11}\/W[a-zA-Z0-9_-]{9}I\/AAAAAAA[AC][a-zA-Z0-9]{3}\/[a-zA-Z0-9_-]{32}[gwAQ]CJoC\/)w530-h3\d\d-p(\/[^\/]+)\z/,
+ when /\A(\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9-]{11}\/[VW][a-zA-Z0-9_-]{9}I\/AAAAAAA[AC][a-zA-Z0-9]{3}\/[a-zA-Z0-9_-]{32}[gwAQ]CJoC\/)w530-h3\d\d-p(\/[^\/]+)\z/,
+ /\A(\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9]{11}\/W[a-zA-Z0-9]{9}I\/AAAAAAAA[a-zA-Z0-9]{3}\/[a-zA-Z0-9_]{32}ACJoC\/)w530-h298-p(\/[^\/]+)\z/,
  /\A(\/\/[124]\.bp\.blogspot\.com\/-[a-zA-Z0-9_-]{11}\/W[npw][a-zA-Z0-9_-]{8}I\/AAAAAAAA[KDE][a-zA-Z0-9_-]{2}\/[a-zA-Z0-9_-]{33}C(?:Lc|Kg)BGAs\/)w530-h[23]\d\d-p(\/[^\/]+)\z/,
- /\A(\/\/[2]\.bp\.blogspot\.com\/-[a-zA-Z-]{11}\/W[a-zA-Z0-9]{8}_I\/AAAAAAAAHDs\/[a-zA-Z0-9-]{33}CEwYBhgL\/)w530-h353-p(\/[^\/]+)\z/
+ /\A(\/\/[2]\.bp\.blogspot\.com\/-[a-zA-Z-]{11}\/W[a-zA-Z0-9]{8}_I\/AAAAAAAAHDs\/[a-zA-Z0-9-]{33}CEwYBhgL\/)w530-h353-p(\/[^\/]+)\z/,
+ /\A(\/\/4\.bp\.blogspot\.com\/-[a-zA-Z0-9-]{11}\/W[a-zA-Z0-9]{9}I\/AAAAAAAAHHg\/[a-zA-Z0-9-]{33}CLcBGAs\/)w530-h353-p(\/[^\/]+)\z/
  "https:#{$1}s#{width}#{$2}"
  when /\A(https:\/\/lh3\.googleusercontent\.com\/-dUQsDY2vWuE\/AAAAAAAAAAI\/AAAAAAAAAAQ\/wVFZagieszU\/)w530-h176-n(\/photo\.jpg)\z/,
- /\A(https:\/\/lh3\.googleusercontent\.com\/-t_ab__91ChA\/VeLaObkUlgI\/AAAAAAAAL4s\/VjO6KK_lkRw\/)w530-h351-n(\/[^\/]+)\z/
+ /\A(https:\/\/lh3\.googleusercontent\.com\/-t_ab__91ChA\/VeLaObkUlgI\/AAAAAAAAL4s\/VjO6KK_lkRw\/)w530-h351-n(\/[^\/]+)\z/,
+ /\A(https:\/\/lh3\.googleusercontent\.com\/-s655sojwyvw\/VcNB4YMCz-I\/AAAAAAAALqo\/kW98MOcJJ0g\/)w530-h398-n\/06\.08\.15%2B-%2B1\z/,
+ /\A(https:\/\/lh3\.googleusercontent\.com\/-u3FhiUTmLCY\/Vk7dMQnxR2I\/AAAAAAAAMc0\/I76_52swA4s\/)w530-h322-n\/Harekosh_A%252520Concert_YkRqQg\.jpg\z/,
+ /\A(https:\/\/lh3\.googleusercontent\.com\/-t_ab__91ChA\/VeLaObkUlgI\/AAAAAAAAL4s\/VjO6KK_lkRw\/)w530-d-h351-n\/30\.08\.15%2B-%2B1\z/
  "#{$1}s#{width}#{$2}"
  # high res (s0) Google Plus post image
  when /\Ahttps:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9_-]{11}\/W[a-zA-Z0-9_-]{9}I\/AAAAAAA[ABC][a-zA-Z0-9]{3}\/[a-zA-Z0-9_-]{33}CJoC\/s0\/[^\/]+\z/,
  /\Ahttps:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9]{11}\/W[a-zA-Z0-9]{9}I\/AAAAAAAA[a-zA-Z_]{3}\/[a-zA-Z0-9]{32}gCJoC\/s0\/[^\/]+\z/,
- /\Ahttps:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9]{11}\/[a-zA-Z0-9]{10}I\/AAAAAAA[a-zA-Z]{4}\/[a-zA-Z0-9]{32}wCJoC\/s0\/[^\/]+\z/
+ /\Ahttps:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9]{11}\/[a-zA-Z0-9]{10}I\/AAAAAAA[a-zA-Z0-9]{4}\/[a-zA-Z0-9_-]{32}wCJoC\/s0\/[^\/]+\z/,
+ /\Ahttps:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9]{11}\/[a-zA-Z0-9]{10}I\/AAAAAAA[A-Z]{4}\/[a-zA-Z0-9-]{32}gCJoC\/s0\/[^\/]+\z/
  src
  # Google Plus userpic
  when /\A(https:\/\/lh3\.googleusercontent\.com\/-[a-zA-Z0-9-]{11}\/AAAAAAAAAAI\/AAAAAAAA[a-zA-Z0-9]{3}\/[a-zA-Z0-9_-]{11}\/)s\d\d-p(?:-k)?-rw-no(\/photo\.jpg)\z/
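Reviewer note: the patterns in this hunk all follow one idea. They capture everything before the size segment (`w530-h398-n` and the like) and the file name after it, then splice in `s#{width}`. Below is a minimal sketch of that rewrite. The `SIZED` pattern and the `full_size` helper are hypothetical and far looser than the gem's real patterns, and reading a width of `0` as "original size" is an assumption based on the `/s0/` URLs in the README.

```ruby
# Hypothetical, deliberately loose pattern: four path segments, a size segment, a file name.
SIZED = %r{\A(https://lh3\.googleusercontent\.com/(?:[^/]+/){4})w\d+-h\d+-[np](/[^/]+)\z}

# Returns the URL with the size segment replaced by s<width>, or nil if it does not match.
def full_size(src, width = 0)
  m = SIZED.match(src)
  # m[1] is the prefix up to the size segment, m[2] is the trailing "/file name"
  m && "#{m[1]}s#{width}#{m[2]}"
end

puts full_size("https://lh3.googleusercontent.com/-s655sojwyvw/VcNB4YMCz-I/AAAAAAAALqo/kW98MOcJJ0g/w530-h398-n/06.08.15%2B-%2B1")
```

The gem uses `$1` and `$2` after a `case`/`when` match; the sketch uses `MatchData` only to avoid relying on global match variables.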
@@ -90,24 +94,25 @@ module DirectLink
  def self.imgur link, timeout = 1000
  raise ErrorMissingEnvVar.new "define IMGUR_CLIENT_ID env var" unless ENV["IMGUR_CLIENT_ID"]
 
- case link
- when /\Ahttps?:\/\/(?:(?:i|m|www)\.)?imgur\.com\/(a|gallery)\/([a-zA-Z0-9]{5}(?:[a-zA-Z0-9]{2})?)\z/,
- /\Ahttps?:\/\/imgur\.com\/(gallery)\/([a-zA-Z0-9]{5}(?:[a-zA-Z0-9]{2})?)\/new\z/
+ request_data = lambda do |url|
  t = 1
- json = begin
- NetHTTPUtils.request_data "https://api.imgur.com/3/#{
- $1 == "gallery" ? "gallery" : "album"
- }/#{$2}/0.json", header: { Authorization: "Client-ID #{ENV["IMGUR_CLIENT_ID"]}" }
+ begin
+ NetHTTPUtils.request_data url, header: { Authorization: "Client-ID #{ENV["IMGUR_CLIENT_ID"]}" }
  rescue NetHTTPUtils::Error => e
- if 500 == e.code && t < timeout
- logger.error "retrying in #{t} seconds because of Imgur HTTP ERROR 500"
+ raise ErrorNotFound.new url.inspect if 404 == e.code
+ if t < timeout && [400, 500, 503].include?(e.code)
+ logger.error "retrying in #{t} seconds because of Imgur HTTP ERROR #{e.code}"
  sleep t
  t *= 2
  retry
  end
- raise ErrorNotFound.new link.inspect if 404 == e.code
- raise ErrorAssert.new "unexpected http error for #{link}"
+ raise ErrorAssert.new "unexpected http error for #{url}"
  end
+ end
+ case link
+ when /\Ahttps?:\/\/(?:(?:i|m|www)\.)?imgur\.com\/(a|gallery)\/([a-zA-Z0-9]{5}(?:[a-zA-Z0-9]{2})?)\z/,
+ /\Ahttps?:\/\/imgur\.com\/(gallery)\/([a-zA-Z0-9]{5}(?:[a-zA-Z0-9]{2})?)\/new\z/
+ json = request_data["https://api.imgur.com/3/#{$1 == "gallery" ? "gallery" : "album"}/#{$2}/0.json"]
  data = JSON.load(json)["data"]
  if data["error"]
  raise ErrorAssert.new "unexpected error #{data.inspect} for #{link}"
@@ -128,19 +133,7 @@ module DirectLink
  /\Ahttps?:\/\/imgur\.com\/([a-zA-Z0-9]{5}(?:[a-zA-Z0-9]{2})?)\z/,
  /\Ahttps?:\/\/imgur\.com\/([a-zA-Z0-9]{7})(?:\?\S+)?\z/,
  /\Ahttps?:\/\/imgur\.com\/r\/[0-9_a-z]+\/([a-zA-Z0-9]{7})\z/
- t = 1
- json = begin
- NetHTTPUtils.request_data "https://api.imgur.com/3/image/#{$1}/0.json", header: { Authorization: "Client-ID #{ENV["IMGUR_CLIENT_ID"]}" }
- rescue NetHTTPUtils::Error => e
- raise ErrorNotFound.new link.inspect if e.code == 404
- if t < timeout && [400, 500].include?(e.code)
- logger.error "retrying in #{t} seconds because of Imgur HTTP ERROR #{e.code}"
- sleep t
- t *= 2
- retry
- end
- raise ErrorAssert.new "unexpected http error for #{link}"
- end
+ json = request_data["https://api.imgur.com/3/image/#{$1}/0.json"]
  [ JSON.load(json)["data"] ]
  else
  raise ErrorBadLink.new link
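Reviewer note: the two Imgur hunks above fold a duplicated retry loop into one `request_data` lambda that doubles its delay after each retryable HTTP error (400, 500, 503). A minimal, self-contained sketch of that backoff shape follows. `TransientError`, `with_backoff`, and the injectable `sleeper` are hypothetical names for illustration only; the gem itself rescues `NetHTTPUtils::Error` and calls `sleep` directly.

```ruby
# Hypothetical stand-in for an HTTP error that carries a status code.
class TransientError < StandardError
  attr_reader :code

  def initialize(code)
    super("HTTP #{code}")
    @code = code
  end
end

RETRYABLE = [400, 500, 503].freeze

# Runs the block, retrying with a doubling delay while the delay stays below `timeout`.
# `sleeper` is injectable so callers (and tests) can avoid really sleeping.
def with_backoff(timeout, sleeper: ->(seconds) { sleep seconds })
  delay = 1
  begin
    yield
  rescue TransientError => e
    # Give up on non-retryable codes or once the next delay would reach the timeout.
    raise unless delay < timeout && RETRYABLE.include?(e.code)
    sleeper.call(delay)
    delay *= 2
    retry
  end
end
```

With `timeout = 4` the block runs three times (delays of 1 and 2 seconds) before the error propagates, which lines up with the `assert_equal 3, tries` expectation in the updated tests.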
@@ -202,29 +195,48 @@ module DirectLink
  class << self
  attr_accessor :reddit_bot
  end
- def self.reddit link
+ def self.reddit link, timeout = 1000
  unless id = URI(link).path[/\A(?:\/r\/[0-9a-zA-Z_]+)?(?:\/comments)?\/([0-9a-z]{5,6})(?:\/|\z)/, 1]
  raise DirectLink::ErrorBadLink.new link unless URI(link).host &&
  URI(link).host.split(?.) == %w{ i redd it } &&
  URI(link).path[/\A\/[a-z0-9]{12,13}\.(gif|jpg)\z/]
  return [true, link]
  end
- if ENV["REDDIT_SECRETS"]
+ retry_on_json_parseerror = lambda do |&b|
+ t = 1
+ begin
+ b.call
+ rescue JSON::ParserError => e
+ raise ErrorBadLink.new link if t > timeout
+ logger.error "#{e.message[0, 500].gsub(/\s+/, " ")}, retrying in #{t} seconds"
+ sleep t
+ t *= 2
+ retry
+ end
+ end
+ json = if ENV["REDDIT_SECRETS"]
  require "reddit_bot"
  RedditBot.logger.level = Logger::FATAL
  require "yaml"
- reddit_bot ||= RedditBot::Bot.new YAML.load_file ENV["REDDIT_SECRETS"]
- json = reddit_bot.json(:get, "/by_id/t3_#{id}")
+ self.reddit_bot ||= RedditBot::Bot.new YAML.load_file ENV["REDDIT_SECRETS"]
+ retry_on_json_parseerror.call{ self.reddit_bot.json :get, "/by_id/t3_#{id}" }
  else
  raise ErrorMissingEnvVar.new "defining REDDIT_SECRETS env var is highly recommended" rescue nil
- json = JSON.load NetHTTPUtils.request_data "#{link}.json", header: {"User-Agent" => "Mozilla"}
+ json = retry_on_json_parseerror.call{ JSON.load NetHTTPUtils.request_data "https://www.reddit.com/#{id}.json", header: {"User-Agent" => "Mozilla"} }
  raise ErrorAssert.new "our knowledge about Reddit API seems to be outdated" unless json.size == 2
- json = json.find{ |_| _["data"]["children"].first["kind"] == "t3" }
+ json.find{ |_| _["data"]["children"].first["kind"] == "t3" }
  end
  data = json["data"]["children"].first["data"]
- url = data["url"]
- return [true, url] unless data["is_self"]
- raise ErrorAssert.new "our knowledge about Reddit API seems to be outdated" if url != "https://www.reddit.com" + data["permalink"]
+ if data["media"]["reddit_video"]
+ t = data["preview"]["images"]
+ raise ErrorAssert.new "our knowledge about Reddit API seems to be outdated" unless t.size == 1
+ return [true, t.first["source"]["url"]]
+ else
+ raise ErrorAssert.new "our knowledge about Reddit API seems to be outdated" unless data["media"].keys.sort == %w{ oembed type } && data["media"]["type"] == "youtube.com"
+ return [true, data["media"]["oembed"]["thumbnail_url"]]
+ end if data["media"]
+ return [true, data["url"]] unless data["is_self"]
+ raise ErrorAssert.new "our knowledge about Reddit API seems to be outdated" if data["url"] != "https://www.reddit.com" + data["permalink"]
  return [false, data["selftext"]]
  end
 
@@ -312,16 +324,17 @@ def DirectLink link, max_redirect_resolving_retry_delay = nil, giveup = false
 
  begin
  s, u = DirectLink.reddit(link)
- if s
- return DirectLink u, max_redirect_resolving_retry_delay, giveup
+ unless s
+ raise DirectLink::ErrorBadLink.new link if giveup # TODO: print original url in such cases if there was a recursion
+ f = ->_{ _.type == :a ? _.attr["href"] : _.children.flat_map(&f) }
+ require "kramdown"
+ return f[Kramdown::Document.new(u).root].map{ |_| DirectLink _, max_redirect_resolving_retry_delay, giveup }
  end
- raise DirectLink::ErrorBadLink.new link if giveup # TODO: print original url in such cases if there was a recursion
- f = ->_{ _.type == :a ? _.attr["href"] : _.children.flat_map(&f) }
- require "kramdown"
- return f[Kramdown::Document.new(u).root].map{ |_| DirectLink _, max_redirect_resolving_retry_delay, giveup }
+ return struct.new *u.values_at(*%w{ fallback_url width height }), "video" if u.is_a? Hash
+ link = u
  rescue DirectLink::ErrorMissingEnvVar
  end if %w{ reddit com } == URI(link).host.split(?.).last(2) ||
- %w{ redd it } == URI(link).host.split(?.).last(2)
+ %w{ redd it } == URI(link).host.split(?.)
 
 
  begin
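Reviewer note: the last change in this hunk drops `.last(2)` from the `redd.it` host check, so the Reddit branch now applies to the bare `redd.it` host but no longer to subdomains such as `i.redd.it`. The sketch below isolates just that condition; `reddit_host?` is a hypothetical helper name and not part of the gem.

```ruby
require "uri"

# Mirrors the guard after the change: any *.reddit.com host, or exactly redd.it.
def reddit_host?(link)
  parts = URI(link).host.split(".")
  parts.last(2) == %w[reddit com] || parts == %w[redd it]
end

puts reddit_host?("https://www.reddit.com/r/travel/988889")
puts reddit_host?("http://redd.it/988889")
puts reddit_host?("https://i.redd.it/c8rk0kjywhy01.jpg")
```

Under the old condition the third URL would also have entered the Reddit branch, because `%w[i redd it].last(2)` equals `%w[redd it]`.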
data/test.rb CHANGED
@@ -113,6 +113,18 @@ describe DirectLink do
  assert DirectLink.google link
  end
  end
+ %w{
+ https://lh3.googleusercontent.com/-s655sojwyvw/VcNB4YMCz-I/AAAAAAAALqo/kW98MOcJJ0g/w530-h398-n/06.08.15%2B-%2B1
+ //4.bp.blogspot.com/-TuMlpg-Q1YY/W3PXkW1lkaI/AAAAAAAAHHg/Bh9IsuLV01kbctIu6lcRJHKkY-ej8oD5gCLcBGAs/w530-h353-p/_MG_2688-Edit.jpg
+ //lh3.googleusercontent.com/proxy/SEfB6tFuim6X0HdZfEBSxrXtumUdf4Q4y05rUW4wc_clWWVrowuWAGZghx71xwPUmf_8si2VQwnRivsM7PfD2gp3kA=w480-h360-n
+ https://lh3.googleusercontent.com/-u3FhiUTmLCY/Vk7dMQnxR2I/AAAAAAAAMc0/I76_52swA4s/w530-h322-n/Harekosh_A%252520Concert_YkRqQg.jpg
+ https://lh3.googleusercontent.com/-t_ab__91ChA/VeLaObkUlgI/AAAAAAAAL4s/VjO6KK_lkRw/w530-d-h351-n/30.08.15%2B-%2B1
+ //lh3.googleusercontent.com/-u2NzdIQfVyQ/Wy83AzoFT8I/AAAAAAAAh6M/fdpxOUkj5mUIfpvYol_R5dyupnF2nDIEACJoC/w530-h298-p/_DSC9134.jpg
+ }.each_with_index do |link, i| # July contenstants
+ it "another (July) Google Plus community post image ##{i + 1}" do
+ assert DirectLink.google link
+ end
+ end
  %w{
  https://lh3.googleusercontent.com/-f37xWyiyP8U/WvmxOxCd-0I/AAAAAAAACpw/3A2tRj02oY40MzJqZBJyWGImoSer0lwMgCJoC/s0/140809%2B029.jpg
  https://lh3.googleusercontent.com/-1s_eiQB4x2k/WvXQEx59z2I/AAAAAAAAcI0/DvKYzWw3g6UNelqAQdOwrdtYdSEqKgkxwCJoC/s0/001
@@ -131,6 +143,8 @@ describe DirectLink do
  https://lh3.googleusercontent.com/-aUVoiLNsmAg/WzcsUU2xfNI/AAAAAAAAODw/DOBual6E1rkVLHh3SKZSzbpNQzdEoZPOQCJoC/w530-h883-n-k-no/gplus-1797734754.mp4
  //lh3.googleusercontent.com/proxy/hOIoIpMEmoVDSP40VRzM92Zw2AeLvEEhxfyKHCOxiNVPyiGvZik5rMvl3jYISLgDJla6mhZuk8pFEYJhX5BU2wy_dw=w530-h822-p
  https://lh3.googleusercontent.com/-GP3BA3zGR5A/W0IwuVXlfmI/AAAAAAADROs/SH8rRlBDYTsHZiHpM45S3zpEipu5hJ2PwCJoC/s0/%25D1%2582%25D0%25B0%25D0%25B4%25D0%25B6%25D0%25B8%25D0%25BA%25D1%2581%25D0%25BA%25D0%25BE%25D0%25B5%2B%25D1%2580%25D0%25B0%25D0%25B7%25D0%25BD%25D0%25BE%25D1%2582%25D1%2580%25D0%25B0%25D0%25B2%25D1%258C%25D0%25B5.png
+ https://lh3.googleusercontent.com/-DLODAbD9W7E/W27ob5XGCOI/AAAAAAADV8g/J_6RYR6UkKsc2RJOWRx6Q-NBVx5RbMoxwCJoC/s0/1236080.jpg
+ https://lh3.googleusercontent.com/-cJajRreI87w/W4gW5uF4Q7I/AAAAAAADZKI/mw1YayYE-MY2-1OCCmjvgM3kbCK0lmIggCJoC/s0/2504855.jpg
  }.each_with_index do |link, i|
  it "gpluscomm_105636351696833883213_86400 ##{i + 1}" do
  assert DirectLink.google link
@@ -187,21 +201,27 @@ describe DirectLink do
  assert_nil e.cause if Exception.instance_methods.include? :cause # Ruby 2.1
  end
 
- valid_imgur_image_url = "https://i.imgur.com/BLCesav.jpg"
+ valid_imgur_image_url_direct = "https://i.imgur.com/BLCesav.jpg"
  it 200 do
  assert_equal [["https://i.imgur.com/BLCesav.jpg", 1000, 1500, "image/jpeg"]],
- DirectLink.imgur(valid_imgur_image_url)
- end
- [400, 500].each do |error_code|
- it "retries two times on error #{error_code}" do
- tries = 0
- e = assert_raises DirectLink::ErrorAssert do
- NetHTTPUtils.stub :request_data, ->*{ tries += 1; raise NetHTTPUtils::Error.new "", error_code } do
- DirectLink.imgur valid_imgur_image_url, 4 # do not remove `4` or test may hang
+ DirectLink.imgur(valid_imgur_image_url_direct)
+ end
+ valid_imgur_image_url_album = "https://imgur.com/a/wPi63mj"
+ [400, 500, 503].each do |error_code|
+ [
+ [valid_imgur_image_url_direct, :direct],
+ [valid_imgur_image_url_album, :album],
+ ].each do |url, kind|
+ it "retries limited amout of times on error #{error_code} (#{kind})" do
+ tries = 0
+ e = assert_raises DirectLink::ErrorAssert do
+ NetHTTPUtils.stub :request_data, ->*{ tries += 1; raise NetHTTPUtils::Error.new "", error_code } do
+ DirectLink.imgur url, 4 # do not remove `4` or test will hang
+ end
  end
+ assert_equal error_code, e.cause.code if Exception.instance_methods.include? :cause # Ruby 2.1
+ assert_equal 3, tries
  end
- assert_equal error_code, e.cause.code if Exception.instance_methods.include? :cause # Ruby 2.1
- assert_equal 3, tries
  end
  end
  it "does not throw 400 after a successfull retry" do
@@ -213,13 +233,13 @@ describe DirectLink do
  m.call *args
  } do
  assert_equal [["https://i.imgur.com/BLCesav.jpg", 1000, 1500, "image/jpeg"]],
- DirectLink.imgur(valid_imgur_image_url, 4) # do not remove `4` or test may hang
+ DirectLink.imgur(valid_imgur_image_url_direct, 4) # do not remove `4` or test may hang
  end
  end
  it 404 do
  e = assert_raises DirectLink::ErrorNotFound do
  NetHTTPUtils.stub :request_data, ->*{ raise NetHTTPUtils::Error.new "", 404 } do
- DirectLink.imgur valid_imgur_image_url
+ DirectLink.imgur valid_imgur_image_url_direct
  end
  end
  assert_equal 404, e.cause.code if Exception.instance_methods.include? :cause # Ruby 2.1
@@ -314,7 +334,7 @@ describe DirectLink do
  ["http://redd.it/32tq0i", [true, "http://i.imgur.com/vy6Ms4Z.jpg"]], # TODO maybe check that it calls #imgur recursively
  ["https://i.redd.it/c8rk0kjywhy01.jpg", [true, "https://i.redd.it/c8rk0kjywhy01.jpg"]],
  ["https://i.redd.it/si758zk7r5xz.jpg", [true, "https://i.redd.it/si758zk7r5xz.jpg"]], # it is 404 but `.reddit` does not care -- it just returns the url
- ["https://reddit.com/123456", [true, "http://www.youtube.com/watch?v=b9upM4RbIeU&amp;feature=g-vrec"]],
+ ["https://reddit.com/123456", [true, "https://i.ytimg.com/vi/b9upM4RbIeU/hqdefault.jpg"]],
  ["https://www.reddit.com/r/travel/988889", [true, "https://i.redd.it/3h5xls6ehrg11.jpg"]],
  ["http://redd.it/988889", [true, "https://i.redd.it/3h5xls6ehrg11.jpg"]],
  ] ],
@@ -381,6 +401,46 @@ describe DirectLink do
  end
  end
 
+ # TODO: make a Reddit describe
+ it "retries limited amout of times on error JSON::ParserError" do
+ link = "https://www.reddit.com/r/gifs/comments/9ftc8f/low_pass_wake_vortices/?st=JM2JIKII&amp;sh=c00fea4f"
+ tries = 0
+ m = NetHTTPUtils.method :request_data
+ e = assert_raises DirectLink::ErrorBadLink do
+ NetHTTPUtils.stub :request_data, lambda{ |*args|
+ if args.first == "https://www.reddit.com/9ftc8f.json"
+ tries += 1
+ raise JSON::ParserError
+ end
+ m.call *args
+ } do
+ t = ENV.delete "REDDIT_SECRETS"
+ begin
+ DirectLink.reddit link, 3 # do not remove `4` or test will hang
+ ensure
+ ENV["REDDIT_SECRETS"] = t
+ end
+ end
+ end
+ assert_instance_of JSON::ParserError, e.cause if Exception.instance_methods.include? :cause # Ruby 2.1
+ assert_equal 3, tries
+ end
+ it "Reddit correctly parses out id when no token provided" do
+ t = ENV.delete "REDDIT_SECRETS"
+ begin
+ assert_equal "https://i.redditmedia.com/-WnE-3o4RhKx6ImGD69vJYAo7UjMn5b4ClHHISJ0_Kk.png?s=fc3e2f2f9973c45daa759a45a75557bf",
+ DirectLink("https://www.reddit.com/r/gifs/comments/9ftc8f/low_pass_wake_vortices/?st=JM2JIKII&amp;sh=c00fea4f").url
+ ensure
+ ENV["REDDIT_SECRETS"] = t
+ end
+ end
+ it "it is really impossible to get dimensions from the shitty Reddit media hosting" do
+ # TODO: why does it call the same Net::HTTP::Get twice?
+ assert_raises FastImage::UnknownImageType do
+ DirectLink "https://v.redd.it/2tyovczka8m11/DASH_4_8_M"
+ end
+ end
+
  describe "throws ErrorBadLink if method does not match the link" do
  %i{ google imgur flickr _500px wiki reddit }.each do |method|
  ["", "test", "http://example.com/"].each_with_index do |url, i|
@@ -393,10 +453,54 @@ describe DirectLink do
  end
  end
 
+ # TODO: test each webservice-specific method
+ # TODO: check not in OpenSSL but higher -- in Net::HTTP
+ it "does not cause the SystemStackError" do
+ OpenSSL::Buffering.module_eval do
+ old = instance_method :write
+ depths = []
+ define_method :write do |arg|
+ depths.push caller.size
+ raise "probable infinite recursion" if [1]*10 == depths.each_cons(2).map{ |i,j| j-i } if depths.size > 10
+ old.bind(self).(arg)
+ end
+ end
+ DirectLink "https://i.redd.it/gdo0cnmeagx01.jpg"
+ end
+
  end
 
  describe "DirectLink()" do
 
+ it "does not raise JSON::ParserError -- Reddit sucks and may respond with wrong content type" do
+ DirectLink.reddit "https://www.reddit.com/123456" # just to initialize DirectLink.reddit_bot
+ # I don't remember for what though
+ # oh, maybe to avoid rasing in initializer
+ # but why not?
+ limit = 0
+ loop do
+ limit += 1
+ tries = 0
+ m = JSON.method :load
+ JSON.stub :load, ->*args{
+ if limit == tries += 1
+ raise JSON::ParserError
+ else
+ m.call *args
+ end
+ } do
+ t = ENV.delete "REDDIT_SECRETS"
+ begin
+ p DirectLink.reddit "https://www.reddit.com/123456"
+ ensure
+ ENV["REDDIT_SECRETS"] = t
+ end
+ end
+ break if 1 == tries
+ end
+ assert_equal 2, limit, "`JSON.load` was called only once?!"
+ end
+
  # thanks to gem addressable
  it "does not throw URI::InvalidURIError if there are brackets" do
  assert_equal 404, (
@@ -480,8 +584,6 @@ describe DirectLink do
  ["https://github.com/Nakilon/dhash-vips", 3],
  ["http://imgur.com/HQHBBBD", FastImage::UnknownImageType, true],
  ["http://imgur.com/HQHBBBD", "https://i.imgur.com/HQHBBBD.jpg?fb"], # .at_css("meta[@property='og:image']")
- ["http://redd.it/123456", FastImage::UnknownImageType, true],
- ["http://redd.it/123456", 1],
  ["http://redd.it/997he7", DirectLink::ErrorBadLink, true],
  ["http://redd.it/997he7", 1], # currently only links are parsed
  ].each_with_index do |(input, expectation, giveup), i|
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: directlink
  version: !ruby/object:Gem::Version
- version: 0.0.4.3
+ version: 0.0.4.4
  platform: ruby
  authors:
  - Victor Maslov aka Nakilon
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2018-08-23 00:00:00.000000000 Z
+ date: 2018-09-15 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: fastimage
@@ -44,14 +44,14 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 1.6.7
+ version: 1.6.8
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 1.6.7
+ version: 1.6.8
  - !ruby/object:Gem::Dependency
  name: kramdown
  requirement: !ruby/object:Gem::Requirement