crawler-movie-core 0.1.0

Sign up to get free protection for your applications and to get access to all the features.
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 63d00b95a3570dd862d808a816614f4cb671c662158e6009c28bfad6b5cea324
4
+ data.tar.gz: f38ca8bc83f1350d9396a947e2ee775a407d8ca1ea7662dbfe4d880ea126d51c
5
+ SHA512:
6
+ metadata.gz: 071fa837d98aa1d529fd1e803b27c544fbdc5ed6159ca60b426ca9bf77396177f32f2fe0fd4a6b9fd209ae3f7bc31c11bd38853668ab15b665553d645fb76ec9
7
+ data.tar.gz: 490467a51ce41e0b284fecb7fe067b41e569c00c75d09c998fa653edc2129945987d7b86fd0634af97df9fe9f021c3d001a845d73b46e8c2a0639a2902835b37
@@ -0,0 +1,8 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /_yardoc/
4
+ /coverage/
5
+ /doc/
6
+ /pkg/
7
+ /spec/reports/
8
+ /tmp/
File without changes
data/Gemfile ADDED
@@ -0,0 +1,5 @@
1
+ source 'https://rubygems.org'
2
+
3
+ git_source(:github) { |repo_name| "https://github.com/#{repo_name}" }
4
+
5
+ gemspec
@@ -0,0 +1,427 @@
1
+ Attribution-ShareAlike 4.0 International
2
+
3
+ =======================================================================
4
+
5
+ Creative Commons Corporation ("Creative Commons") is not a law firm and
6
+ does not provide legal services or legal advice. Distribution of
7
+ Creative Commons public licenses does not create a lawyer-client or
8
+ other relationship. Creative Commons makes its licenses and related
9
+ information available on an "as-is" basis. Creative Commons gives no
10
+ warranties regarding its licenses, any material licensed under their
11
+ terms and conditions, or any related information. Creative Commons
12
+ disclaims all liability for damages resulting from their use to the
13
+ fullest extent possible.
14
+
15
+ Using Creative Commons Public Licenses
16
+
17
+ Creative Commons public licenses provide a standard set of terms and
18
+ conditions that creators and other rights holders may use to share
19
+ original works of authorship and other material subject to copyright
20
+ and certain other rights specified in the public license below. The
21
+ following considerations are for informational purposes only, are not
22
+ exhaustive, and do not form part of our licenses.
23
+
24
+ Considerations for licensors: Our public licenses are
25
+ intended for use by those authorized to give the public
26
+ permission to use material in ways otherwise restricted by
27
+ copyright and certain other rights. Our licenses are
28
+ irrevocable. Licensors should read and understand the terms
29
+ and conditions of the license they choose before applying it.
30
+ Licensors should also secure all rights necessary before
31
+ applying our licenses so that the public can reuse the
32
+ material as expected. Licensors should clearly mark any
33
+ material not subject to the license. This includes other CC-
34
+ licensed material, or material used under an exception or
35
+ limitation to copyright. More considerations for licensors:
36
+ wiki.creativecommons.org/Considerations_for_licensors
37
+
38
+ Considerations for the public: By using one of our public
39
+ licenses, a licensor grants the public permission to use the
40
+ licensed material under specified terms and conditions. If
41
+ the licensor's permission is not necessary for any reason--for
42
+ example, because of any applicable exception or limitation to
43
+ copyright--then that use is not regulated by the license. Our
44
+ licenses grant only permissions under copyright and certain
45
+ other rights that a licensor has authority to grant. Use of
46
+ the licensed material may still be restricted for other
47
+ reasons, including because others have copyright or other
48
+ rights in the material. A licensor may make special requests,
49
+ such as asking that all changes be marked or described.
50
+ Although not required by our licenses, you are encouraged to
51
+ respect those requests where reasonable. More_considerations
52
+ for the public:
53
+ wiki.creativecommons.org/Considerations_for_licensees
54
+
55
+ =======================================================================
56
+
57
+ Creative Commons Attribution-ShareAlike 4.0 International Public
58
+ License
59
+
60
+ By exercising the Licensed Rights (defined below), You accept and agree
61
+ to be bound by the terms and conditions of this Creative Commons
62
+ Attribution-ShareAlike 4.0 International Public License ("Public
63
+ License"). To the extent this Public License may be interpreted as a
64
+ contract, You are granted the Licensed Rights in consideration of Your
65
+ acceptance of these terms and conditions, and the Licensor grants You
66
+ such rights in consideration of benefits the Licensor receives from
67
+ making the Licensed Material available under these terms and
68
+ conditions.
69
+
70
+
71
+ Section 1 -- Definitions.
72
+
73
+ a. Adapted Material means material subject to Copyright and Similar
74
+ Rights that is derived from or based upon the Licensed Material
75
+ and in which the Licensed Material is translated, altered,
76
+ arranged, transformed, or otherwise modified in a manner requiring
77
+ permission under the Copyright and Similar Rights held by the
78
+ Licensor. For purposes of this Public License, where the Licensed
79
+ Material is a musical work, performance, or sound recording,
80
+ Adapted Material is always produced where the Licensed Material is
81
+ synched in timed relation with a moving image.
82
+
83
+ b. Adapter's License means the license You apply to Your Copyright
84
+ and Similar Rights in Your contributions to Adapted Material in
85
+ accordance with the terms and conditions of this Public License.
86
+
87
+ c. BY-SA Compatible License means a license listed at
88
+ creativecommons.org/compatiblelicenses, approved by Creative
89
+ Commons as essentially the equivalent of this Public License.
90
+
91
+ d. Copyright and Similar Rights means copyright and/or similar rights
92
+ closely related to copyright including, without limitation,
93
+ performance, broadcast, sound recording, and Sui Generis Database
94
+ Rights, without regard to how the rights are labeled or
95
+ categorized. For purposes of this Public License, the rights
96
+ specified in Section 2(b)(1)-(2) are not Copyright and Similar
97
+ Rights.
98
+
99
+ e. Effective Technological Measures means those measures that, in the
100
+ absence of proper authority, may not be circumvented under laws
101
+ fulfilling obligations under Article 11 of the WIPO Copyright
102
+ Treaty adopted on December 20, 1996, and/or similar international
103
+ agreements.
104
+
105
+ f. Exceptions and Limitations means fair use, fair dealing, and/or
106
+ any other exception or limitation to Copyright and Similar Rights
107
+ that applies to Your use of the Licensed Material.
108
+
109
+ g. License Elements means the license attributes listed in the name
110
+ of a Creative Commons Public License. The License Elements of this
111
+ Public License are Attribution and ShareAlike.
112
+
113
+ h. Licensed Material means the artistic or literary work, database,
114
+ or other material to which the Licensor applied this Public
115
+ License.
116
+
117
+ i. Licensed Rights means the rights granted to You subject to the
118
+ terms and conditions of this Public License, which are limited to
119
+ all Copyright and Similar Rights that apply to Your use of the
120
+ Licensed Material and that the Licensor has authority to license.
121
+
122
+ j. Licensor means the individual(s) or entity(ies) granting rights
123
+ under this Public License.
124
+
125
+ k. Share means to provide material to the public by any means or
126
+ process that requires permission under the Licensed Rights, such
127
+ as reproduction, public display, public performance, distribution,
128
+ dissemination, communication, or importation, and to make material
129
+ available to the public including in ways that members of the
130
+ public may access the material from a place and at a time
131
+ individually chosen by them.
132
+
133
+ l. Sui Generis Database Rights means rights other than copyright
134
+ resulting from Directive 96/9/EC of the European Parliament and of
135
+ the Council of 11 March 1996 on the legal protection of databases,
136
+ as amended and/or succeeded, as well as other essentially
137
+ equivalent rights anywhere in the world.
138
+
139
+ m. You means the individual or entity exercising the Licensed Rights
140
+ under this Public License. Your has a corresponding meaning.
141
+
142
+
143
+ Section 2 -- Scope.
144
+
145
+ a. License grant.
146
+
147
+ 1. Subject to the terms and conditions of this Public License,
148
+ the Licensor hereby grants You a worldwide, royalty-free,
149
+ non-sublicensable, non-exclusive, irrevocable license to
150
+ exercise the Licensed Rights in the Licensed Material to:
151
+
152
+ a. reproduce and Share the Licensed Material, in whole or
153
+ in part; and
154
+
155
+ b. produce, reproduce, and Share Adapted Material.
156
+
157
+ 2. Exceptions and Limitations. For the avoidance of doubt, where
158
+ Exceptions and Limitations apply to Your use, this Public
159
+ License does not apply, and You do not need to comply with
160
+ its terms and conditions.
161
+
162
+ 3. Term. The term of this Public License is specified in Section
163
+ 6(a).
164
+
165
+ 4. Media and formats; technical modifications allowed. The
166
+ Licensor authorizes You to exercise the Licensed Rights in
167
+ all media and formats whether now known or hereafter created,
168
+ and to make technical modifications necessary to do so. The
169
+ Licensor waives and/or agrees not to assert any right or
170
+ authority to forbid You from making technical modifications
171
+ necessary to exercise the Licensed Rights, including
172
+ technical modifications necessary to circumvent Effective
173
+ Technological Measures. For purposes of this Public License,
174
+ simply making modifications authorized by this Section 2(a)
175
+ (4) never produces Adapted Material.
176
+
177
+ 5. Downstream recipients.
178
+
179
+ a. Offer from the Licensor -- Licensed Material. Every
180
+ recipient of the Licensed Material automatically
181
+ receives an offer from the Licensor to exercise the
182
+ Licensed Rights under the terms and conditions of this
183
+ Public License.
184
+
185
+ b. Additional offer from the Licensor -- Adapted Material.
186
+ Every recipient of Adapted Material from You
187
+ automatically receives an offer from the Licensor to
188
+ exercise the Licensed Rights in the Adapted Material
189
+ under the conditions of the Adapter's License You apply.
190
+
191
+ c. No downstream restrictions. You may not offer or impose
192
+ any additional or different terms or conditions on, or
193
+ apply any Effective Technological Measures to, the
194
+ Licensed Material if doing so restricts exercise of the
195
+ Licensed Rights by any recipient of the Licensed
196
+ Material.
197
+
198
+ 6. No endorsement. Nothing in this Public License constitutes or
199
+ may be construed as permission to assert or imply that You
200
+ are, or that Your use of the Licensed Material is, connected
201
+ with, or sponsored, endorsed, or granted official status by,
202
+ the Licensor or others designated to receive attribution as
203
+ provided in Section 3(a)(1)(A)(i).
204
+
205
+ b. Other rights.
206
+
207
+ 1. Moral rights, such as the right of integrity, are not
208
+ licensed under this Public License, nor are publicity,
209
+ privacy, and/or other similar personality rights; however, to
210
+ the extent possible, the Licensor waives and/or agrees not to
211
+ assert any such rights held by the Licensor to the limited
212
+ extent necessary to allow You to exercise the Licensed
213
+ Rights, but not otherwise.
214
+
215
+ 2. Patent and trademark rights are not licensed under this
216
+ Public License.
217
+
218
+ 3. To the extent possible, the Licensor waives any right to
219
+ collect royalties from You for the exercise of the Licensed
220
+ Rights, whether directly or through a collecting society
221
+ under any voluntary or waivable statutory or compulsory
222
+ licensing scheme. In all other cases the Licensor expressly
223
+ reserves any right to collect such royalties.
224
+
225
+
226
+ Section 3 -- License Conditions.
227
+
228
+ Your exercise of the Licensed Rights is expressly made subject to the
229
+ following conditions.
230
+
231
+ a. Attribution.
232
+
233
+ 1. If You Share the Licensed Material (including in modified
234
+ form), You must:
235
+
236
+ a. retain the following if it is supplied by the Licensor
237
+ with the Licensed Material:
238
+
239
+ i. identification of the creator(s) of the Licensed
240
+ Material and any others designated to receive
241
+ attribution, in any reasonable manner requested by
242
+ the Licensor (including by pseudonym if
243
+ designated);
244
+
245
+ ii. a copyright notice;
246
+
247
+ iii. a notice that refers to this Public License;
248
+
249
+ iv. a notice that refers to the disclaimer of
250
+ warranties;
251
+
252
+ v. a URI or hyperlink to the Licensed Material to the
253
+ extent reasonably practicable;
254
+
255
+ b. indicate if You modified the Licensed Material and
256
+ retain an indication of any previous modifications; and
257
+
258
+ c. indicate the Licensed Material is licensed under this
259
+ Public License, and include the text of, or the URI or
260
+ hyperlink to, this Public License.
261
+
262
+ 2. You may satisfy the conditions in Section 3(a)(1) in any
263
+ reasonable manner based on the medium, means, and context in
264
+ which You Share the Licensed Material. For example, it may be
265
+ reasonable to satisfy the conditions by providing a URI or
266
+ hyperlink to a resource that includes the required
267
+ information.
268
+
269
+ 3. If requested by the Licensor, You must remove any of the
270
+ information required by Section 3(a)(1)(A) to the extent
271
+ reasonably practicable.
272
+
273
+ b. ShareAlike.
274
+
275
+ In addition to the conditions in Section 3(a), if You Share
276
+ Adapted Material You produce, the following conditions also apply.
277
+
278
+ 1. The Adapter's License You apply must be a Creative Commons
279
+ license with the same License Elements, this version or
280
+ later, or a BY-SA Compatible License.
281
+
282
+ 2. You must include the text of, or the URI or hyperlink to, the
283
+ Adapter's License You apply. You may satisfy this condition
284
+ in any reasonable manner based on the medium, means, and
285
+ context in which You Share Adapted Material.
286
+
287
+ 3. You may not offer or impose any additional or different terms
288
+ or conditions on, or apply any Effective Technological
289
+ Measures to, Adapted Material that restrict exercise of the
290
+ rights granted under the Adapter's License You apply.
291
+
292
+
293
+ Section 4 -- Sui Generis Database Rights.
294
+
295
+ Where the Licensed Rights include Sui Generis Database Rights that
296
+ apply to Your use of the Licensed Material:
297
+
298
+ a. for the avoidance of doubt, Section 2(a)(1) grants You the right
299
+ to extract, reuse, reproduce, and Share all or a substantial
300
+ portion of the contents of the database;
301
+
302
+ b. if You include all or a substantial portion of the database
303
+ contents in a database in which You have Sui Generis Database
304
+ Rights, then the database in which You have Sui Generis Database
305
+ Rights (but not its individual contents) is Adapted Material,
306
+
307
+ including for purposes of Section 3(b); and
308
+ c. You must comply with the conditions in Section 3(a) if You Share
309
+ all or a substantial portion of the contents of the database.
310
+
311
+ For the avoidance of doubt, this Section 4 supplements and does not
312
+ replace Your obligations under this Public License where the Licensed
313
+ Rights include other Copyright and Similar Rights.
314
+
315
+
316
+ Section 5 -- Disclaimer of Warranties and Limitation of Liability.
317
+
318
+ a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE
319
+ EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS
320
+ AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF
321
+ ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS,
322
+ IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION,
323
+ WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR
324
+ PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS,
325
+ ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT
326
+ KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT
327
+ ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU.
328
+
329
+ b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE
330
+ TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION,
331
+ NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT,
332
+ INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES,
333
+ COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR
334
+ USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN
335
+ ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR
336
+ DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR
337
+ IN PART, THIS LIMITATION MAY NOT APPLY TO YOU.
338
+
339
+ c. The disclaimer of warranties and limitation of liability provided
340
+ above shall be interpreted in a manner that, to the extent
341
+ possible, most closely approximates an absolute disclaimer and
342
+ waiver of all liability.
343
+
344
+
345
+ Section 6 -- Term and Termination.
346
+
347
+ a. This Public License applies for the term of the Copyright and
348
+ Similar Rights licensed here. However, if You fail to comply with
349
+ this Public License, then Your rights under this Public License
350
+ terminate automatically.
351
+
352
+ b. Where Your right to use the Licensed Material has terminated under
353
+ Section 6(a), it reinstates:
354
+
355
+ 1. automatically as of the date the violation is cured, provided
356
+ it is cured within 30 days of Your discovery of the
357
+ violation; or
358
+
359
+ 2. upon express reinstatement by the Licensor.
360
+
361
+ For the avoidance of doubt, this Section 6(b) does not affect any
362
+ right the Licensor may have to seek remedies for Your violations
363
+ of this Public License.
364
+
365
+ c. For the avoidance of doubt, the Licensor may also offer the
366
+ Licensed Material under separate terms or conditions or stop
367
+ distributing the Licensed Material at any time; however, doing so
368
+ will not terminate this Public License.
369
+
370
+ d. Sections 1, 5, 6, 7, and 8 survive termination of this Public
371
+ License.
372
+
373
+
374
+ Section 7 -- Other Terms and Conditions.
375
+
376
+ a. The Licensor shall not be bound by any additional or different
377
+ terms or conditions communicated by You unless expressly agreed.
378
+
379
+ b. Any arrangements, understandings, or agreements regarding the
380
+ Licensed Material not stated herein are separate from and
381
+ independent of the terms and conditions of this Public License.
382
+
383
+
384
+ Section 8 -- Interpretation.
385
+
386
+ a. For the avoidance of doubt, this Public License does not, and
387
+ shall not be interpreted to, reduce, limit, restrict, or impose
388
+ conditions on any use of the Licensed Material that could lawfully
389
+ be made without permission under this Public License.
390
+
391
+ b. To the extent possible, if any provision of this Public License is
392
+ deemed unenforceable, it shall be automatically reformed to the
393
+ minimum extent necessary to make it enforceable. If the provision
394
+ cannot be reformed, it shall be severed from this Public License
395
+ without affecting the enforceability of the remaining terms and
396
+ conditions.
397
+
398
+ c. No term or condition of this Public License will be waived and no
399
+ failure to comply consented to unless expressly agreed to by the
400
+ Licensor.
401
+
402
+ d. Nothing in this Public License constitutes or may be interpreted
403
+ as a limitation upon, or waiver of, any privileges and immunities
404
+ that apply to the Licensor or You, including from the legal
405
+ processes of any jurisdiction or authority.
406
+
407
+
408
+ =======================================================================
409
+
410
+ Creative Commons is not a party to its public
411
+ licenses. Notwithstanding, Creative Commons may elect to apply one of
412
+ its public licenses to material it publishes and in those instances
413
+ will be considered the “Licensor.” The text of the Creative Commons
414
+ public licenses is dedicated to the public domain under the CC0 Public
415
+ Domain Dedication. Except for the limited purpose of indicating that
416
+ material is shared under a Creative Commons public license or as
417
+ otherwise permitted by the Creative Commons policies published at
418
+ creativecommons.org/policies, Creative Commons does not authorize the
419
+ use of the trademark "Creative Commons" or any other trademark or logo
420
+ of Creative Commons without its prior written consent including,
421
+ without limitation, in connection with any unauthorized modifications
422
+ to any of its public licenses or any other arrangements,
423
+ understandings, or agreements concerning use of licensed material. For
424
+ the avoidance of doubt, this paragraph does not form part of the
425
+ public licenses.
426
+
427
+ Creative Commons may be contacted at creativecommons.org.
File without changes
@@ -0,0 +1,2 @@
1
+ require 'bundler/gem_tasks'
2
+ task default: :spec
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'bundler/setup'
4
+ require 'crawler/movie'
5
+
6
+ require 'irb'
7
+ IRB.start(__FILE__)
@@ -0,0 +1,6 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
@@ -0,0 +1,33 @@
1
+ lib = File.expand_path('../lib', __FILE__)
2
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
3
+ require 'crawler/movie/core/version'
4
+
5
+ Gem::Specification.new do |spec|
6
+ spec.name = 'crawler-movie-core'
7
+ spec.version = Crawler::Movie::Core::VERSION
8
+ spec.authors = ['Jonathan PHILIPPE']
9
+ spec.email = ['jonathan@cinema.paris']
10
+
11
+ spec.summary = %q{}
12
+ spec.description = %q{}
13
+ spec.homepage = 'https://crawler.cinema.paris'
14
+ spec.license = 'CC-BY-SA-4.0'
15
+
16
+ if spec.respond_to?(:metadata)
17
+ spec.metadata['homepage_uri'] = spec.homepage
18
+ spec.metadata['source_code_uri'] = 'https://github.com/cinema-paris/crawler-movie-core'
19
+ spec.metadata['changelog_uri'] = 'https://github.com/cinema-paris/crawler-movie-core/CHANGELOG.md'
20
+ end
21
+
22
+ spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
23
+ `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
24
+ end
25
+ spec.bindir = 'exe'
26
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
27
+ spec.require_paths = ['lib']
28
+
29
+ spec.add_development_dependency 'bundler', '~> 1.17'
30
+ spec.add_development_dependency 'rake', '~> 10.0'
31
+ spec.add_runtime_dependency 'activesupport', '>= 3.0'
32
+ spec.add_runtime_dependency 'levenshtein-ffi', '>= 1.0'
33
+ end
@@ -0,0 +1,54 @@
1
+ require 'active_support/core_ext/hash/keys'
2
+ require 'active_support/inflector'
3
+ require 'levenshtein-ffi'
4
+
5
+ module Crawler
6
+ module Movie
7
+ PROVIDERS = []
8
+ SCORES = {}
9
+
10
+ def self.add_provider(provider_name, options = {})
11
+ options.assert_valid_keys :score, :insert_at
12
+
13
+ PROVIDERS.insert(options[:insert_at] || -1, provider_name)
14
+
15
+ if (score = options[:score])
16
+ SCORES[provider_name] = score
17
+ end
18
+ end
19
+
20
+ def self.configure
21
+ yield self
22
+ end
23
+
24
+ def self.transliterate(string)
25
+ ActiveSupport::Inflector.transliterate(string.gsub(/[:\-.,!?]/, ' ').strip.gsub(/\s+/, ' ')).downcase
26
+ end
27
+
28
+ def self.search(query, year: nil)
29
+ movies = PROVIDERS.flat_map do |provider_name|
30
+ camelized = ActiveSupport::Inflector.camelize("crawler/movie/providers/#{provider_name.to_s}")
31
+ klass = ActiveSupport::Inflector.constantize(camelized)
32
+ movies = klass.search(transliterate(query))
33
+
34
+ movies.map do |movie|
35
+ provider_score = SCORES[provider_name] || 0.5
36
+ query_transliterated = transliterate(query)
37
+ title_transliterated = transliterate(movie[:title])
38
+ levenshtein_distance = Levenshtein.distance(query_transliterated, title_transliterated)
39
+ max_size = [query_transliterated.size, title_transliterated.size].max.to_f
40
+ title_score = (max_size - levenshtein_distance) / max_size
41
+ year_score = 1.0 unless year
42
+ year_score ||= movie[:release_date] && year.to_s == movie[:release_date].year.to_s ? 1.0 : 0.9
43
+
44
+ {
45
+ data: movie,
46
+ score: provider_score * title_score * year_score
47
+ }
48
+ end
49
+ end
50
+
51
+ movies.max_by { |movie| movie[:score] }
52
+ end
53
+ end
54
+ end
@@ -0,0 +1,7 @@
1
+ module Crawler
2
+ module Movie
3
+ module Core
4
+ VERSION = '0.1.0'
5
+ end
6
+ end
7
+ end
metadata ADDED
@@ -0,0 +1,113 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: crawler-movie-core
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Jonathan PHILIPPE
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2019-10-27 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.17'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.17'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: activesupport
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '3.0'
48
+ type: :runtime
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '3.0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: levenshtein-ffi
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '1.0'
62
+ type: :runtime
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '1.0'
69
+ description: ''
70
+ email:
71
+ - jonathan@cinema.paris
72
+ executables: []
73
+ extensions: []
74
+ extra_rdoc_files: []
75
+ files:
76
+ - ".gitignore"
77
+ - CHANGELOG.md
78
+ - Gemfile
79
+ - LICENSE.txt
80
+ - README.md
81
+ - Rakefile
82
+ - bin/console
83
+ - bin/setup
84
+ - crawler-movie-core.gemspec
85
+ - lib/crawler/movie.rb
86
+ - lib/crawler/movie/core/version.rb
87
+ homepage: https://crawler.cinema.paris
88
+ licenses:
89
+ - CC-BY-SA-4.0
90
+ metadata:
91
+ homepage_uri: https://crawler.cinema.paris
92
+ source_code_uri: https://github.com/cinema-paris/crawler-movie-core
93
+ changelog_uri: https://github.com/cinema-paris/crawler-movie-core/CHANGELOG.md
94
+ post_install_message:
95
+ rdoc_options: []
96
+ require_paths:
97
+ - lib
98
+ required_ruby_version: !ruby/object:Gem::Requirement
99
+ requirements:
100
+ - - ">="
101
+ - !ruby/object:Gem::Version
102
+ version: '0'
103
+ required_rubygems_version: !ruby/object:Gem::Requirement
104
+ requirements:
105
+ - - ">="
106
+ - !ruby/object:Gem::Version
107
+ version: '0'
108
+ requirements: []
109
+ rubygems_version: 3.0.1
110
+ signing_key:
111
+ specification_version: 4
112
+ summary: ''
113
+ test_files: []