movieDB 1.0.0 → 1.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 18eb3efc3ad17c7cb5b6134a951402964af0b5da
4
- data.tar.gz: 7eeb628cab2588293f92aa458eeabb903cbfb6ec
3
+ metadata.gz: 15142fe8cad1d00faba9deef3d9b9968a0abca47
4
+ data.tar.gz: 5f0faa23b644c40f564eb658902ba22d1b4cffdc
5
5
  SHA512:
6
- metadata.gz: 09c390931150038ca9b18d52ffdd084a805b344c4272876c9ff6880ee2c4ae3da5623be445cc9b3ad6e7f192e60594fb62a682ebad4168f68d08f8e07b3629b5
7
- data.tar.gz: 4b559dcc3e841b450187b87f1060039033109bdce2cd4791ae5c9628bea929db875acff15748e807f9e6444b2eeddcd03ff37b0b2b62d4b324bffdf30468d068
6
+ metadata.gz: b754c6573573f84c74dbe11ff5190f5c6cf218f432370e61c2698d2d32ebd0155cd0f6a58893c241fb033793ac61822ec0fd732c8dfb223daf1e58b1118f3104
7
+ data.tar.gz: 5cc88caeaef20a8a0357a817319354df47ab802f3a805e3479608c9e74e341e5a8e35bc43956bf97aee2ceda7eb0ec03ba98907452562800b970e9690eb3fa78
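The SHA1 and SHA512 entries above are hex digests of the two archives packed inside the `.gem` file (`metadata.gz` and `data.tar.gz`). A minimal sketch of how such entries could be checked with Ruby's standard `Digest` library follows. The helpers `digests_for` and `checksums_match?` and the sample bytes are hypothetical stand-ins, not part of movieDB or RubyGems.

```ruby
require 'digest'

# Hypothetical helper: build the same two-level structure that
# checksums.yaml uses (algorithm => { file name => hex digest }).
def digests_for(name, bytes)
  {
    'SHA1'   => { name => Digest::SHA1.hexdigest(bytes) },
    'SHA512' => { name => Digest::SHA512.hexdigest(bytes) }
  }
end

# Hypothetical helper: true only if every recorded digest matches the bytes.
def checksums_match?(recorded, name, bytes)
  digests_for(name, bytes).all? do |algorithm, entries|
    recorded.fetch(algorithm, {})[name] == entries[name]
  end
end

# Stand-in bytes; a real check would read data.tar.gz from the unpacked gem.
archive  = 'example archive bytes'
recorded = digests_for('data.tar.gz', archive)

puts checksums_match?(recorded, 'data.tar.gz', archive)       # unchanged bytes
puts checksums_match?(recorded, 'data.tar.gz', archive + '!') # altered bytes
```

Because both archives changed between 1.0.0 and 1.0.1, all four digests differ, which is what the hunk above shows.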
data/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # MovieDB
2
2
 
3
- MovieDB is a multi-threaded ruby wrapper for performing advance statistical computation and high-level data analysis on Movie or TV Data from IMDb.
3
+ MovieDB is a multi-threaded ruby wrapper for performing advance statistical computation and high-level data analysis on Movie Data from IMDb.
4
4
  The objective and usage of this tool is to allow producers, directors, writers to make logical business decisions that will generate profitable ROI.
5
5
 
6
6
  ## Badges
@@ -15,8 +15,8 @@
15
15
  ## Technology
16
16
  * SciRuby is used for all statistical and scientific computations.
17
17
  * Redis is used to store all data.
18
- * IMDb and TMDb is the source for all film / TV data.
19
- * BoxOfficeMojo is where we will be scraping future film / TV data.
18
+ * IMDb and TMDb is the source for all film.
19
+ * BoxOfficeMojo is where we will be scraping future film.
20
20
  * Celluloid is used to build the fault-tolerant concurrent programs. Note, if you are using MRI or YARV,
21
21
  multi-threading won't work since these types of interpreters have Global Interpreter Lock (GIL).
22
22
  Fortunately, you can use JRuby or Rubinius, since they don’t have a GIL and support real parallel threading.
@@ -26,20 +26,6 @@ ruby-2.2.2 or higher
26
26
 
27
27
  jruby-9.0.0.0
28
28
 
29
- ## Category
30
- movieDB is broken down into 3 components namely:
31
-
32
- * Statistics
33
- * Visualizations (Work in progress)
34
- * DataMining (Work in progress)
35
-
36
- # Statistics
37
-
38
- Simple statistical analysis on numeric data.
39
- The corresponding computation is performed on
40
- both numeric and string vectors within the
41
- collected data.
42
-
43
29
  ## Installation
44
30
 
45
31
  Redis Installation
@@ -84,7 +70,7 @@ m = MovieDB::Movie.pool(size: 2)
84
70
  ```
85
71
  ## Step Process
86
72
 
87
- Fetching and analysing movie / TV data using movieDB is a simple 2 step process.
73
+ Fetching and analysing movie data using movieDB is a simple 2 step process.
88
74
 
89
75
  First, fetch the data from IMDb.
90
76
 
@@ -94,13 +80,33 @@ That's it! It is that simple.
94
80
 
95
81
  ## Part 1 - Fetch Data from IMDb
96
82
 
97
- There are 2 ways to find IMDb ids.
83
+ There are 3 ways to find IMDb ids.
84
+
85
+ * Search IMDb id via API
86
+
87
+ * Search IMDb id via Website
98
88
 
99
- * Finding specific IMDb ids
89
+ * Generate random IMDb ids.
100
90
 
101
- * Finding random IMDb ids.
91
+ ### Search IMDb id via API
102
92
 
103
- ### Fetching specific IMDb ids
93
+ You can read the [documentation](http://rubydoc.info/github/ariejan/imdb/master/frames) for IMDb API to see all that you can do with this gem.
94
+
95
+ ``` ruby
96
+ i = Imdb::Search.new("Star Trek")
97
+
98
+ i.movies.size #=> 97
99
+ ```
100
+ This will return 97 objects related to 'Star Trek'
101
+
102
+ To collect all the IMDb ids
103
+
104
+ ``` ruby
105
+ ids = i.movies.collect(&:id).uniq
106
+
107
+ #=> ["0796366", "0060028", "0079945" ...]
108
+ ```
109
+ ### Search IMDb id via Website
104
110
 
105
111
  To find IMDb id for specific movies, you must go to:
106
112
 
@@ -116,8 +122,9 @@ http://www.imdb.com/title/tt0369610/
116
122
  ```
117
123
  0369610 is the IMDb id.
118
124
 
119
- ### Fetching random IMDb ids (multi-thread setup)
120
- You can fetch IMDb ids random.
125
+ ### Generate random IMDb ids (multi-thread setup)
126
+
127
+ You can fetch IMDb ids random. This approach will probably run you into some problems, see Disclaimer.
121
128
 
122
129
  ``` ruby
123
130
  r = Random.new
@@ -133,7 +140,7 @@ Note: IMDB has a rate limit of 40 requests every 10 seconds and are limited by I
133
140
  If you exceed the limit, you will receive a 429 HTTP status with a 'Retry-After' header.
134
141
  As soon your cool down period expires, you are free to continue making requests.
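One way to respect the limit described in this note is to wait for the period named in the 'Retry-After' header before repeating a request. The sketch below is only an illustration: `Response`, `with_rate_limit_retry` and the `sleeper` parameter are hypothetical names, the server is simulated, and the header is assumed to carry a number of seconds.

```ruby
# Hypothetical response object; a real HTTP client would supply its own.
Response = Struct.new(:status, :headers, :body)

# Hypothetical helper: run the block and, when it answers 429, wait for the
# number of seconds given in 'Retry-After' before trying again.
def with_rate_limit_retry(max_attempts: 3, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  loop do
    attempts += 1
    response = yield
    return response unless response.status == 429
    raise "still rate limited after #{attempts} attempts" if attempts >= max_attempts

    # Assumes the header holds seconds; falls back to 10, the window length
    # quoted in the note above.
    sleeper.call(Integer(response.headers.fetch('Retry-After', 10)))
  end
end

# Simulated server: one 429, then success. The sleeper is stubbed so the
# example finishes immediately instead of really sleeping.
replies = [Response.new(429, { 'Retry-After' => '2' }, ''), Response.new(200, {}, 'ok')]
waits   = []
result  = with_rate_limit_retry(sleeper: ->(seconds) { waits << seconds }) { replies.shift }
puts result.body
```

Passing the sleeper in as a parameter keeps the waiting behaviour testable without real delays.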
135
142
 
136
- Also, movieDB will throw a NameError if the randomly generated IMDb id in invalid.
143
+ Also, movieDB will throw a NameError if the randomly generated IMDb id is invalid.
137
144
 
138
145
  ### Get Movie Data
139
146
 
@@ -227,7 +234,7 @@ plot_summa 373 298 311
227
234
 
228
235
  When performing statistics on an object, movieDB by default processes all fields.
229
236
 
230
- Contrary, to this default approach, you now have the option of filtering what fields you want processed with the following 2 filters.
237
+ However, you now have the option of filtering what fields you want processed using the following filters:
231
238
 
232
239
  * only
233
240
  * except
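The effect of the two filters can be pictured as selecting or rejecting keys of a movie hash before any statistics are computed. The following is a minimal sketch under that assumption; `filter_fields` is a hypothetical helper, not movieDB's own implementation, and the field values are made up for illustration.

```ruby
# Hypothetical helper illustrating the two filters; not movieDB's own code.
# Field names may be given as symbols or strings, singly or in an array.
def filter_fields(movie, only: nil, except: nil)
  only_keys   = Array(only).map(&:to_s)
  except_keys = Array(except).map(&:to_s)

  # With no :only filter every field is kept; otherwise keep the named ones.
  fields = only_keys.empty? ? movie : movie.select { |key, _| only_keys.include?(key) }
  # Then drop anything listed under :except.
  fields.reject { |key, _| except_keys.include?(key) }
end

# Made-up sample values, for illustration only.
movie = { 'title' => 'Jurassic World', 'year' => '2015', 'votes' => '1000' }

p filter_fields(movie, only: [:year, :votes])
p filter_fields(movie, except: :title)
```

Both calls leave the same two fields here, since keeping `year` and `votes` is equivalent to dropping `title`.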
@@ -287,10 +294,18 @@ m.all_ids
287
294
  Gets the remaining time to live of a movie.
288
295
 
289
296
  ``` ruby
290
- m.ttl("0369610)
297
+ m.ttl("0369610")
291
298
  # => 120
292
299
  ```
293
300
 
301
+ * DELETE key
302
+ deletes a single movie object stored in redis.
303
+
304
+ ``` ruby
305
+ m.del("0369610")
306
+ # => # => ["3079380"...]
307
+ ```
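The Redis DEL command replies with the number of keys it removed, so if `m.del` passes that reply through unchanged, the result would be an integer rather than a list of ids. A minimal sketch of that behaviour, using an in-memory stand-in so it runs without a Redis server; `FakeRedis` is hypothetical and only mimics the reply of DEL.

```ruby
# In-memory stand-in for a Redis connection (hypothetical, for illustration).
class FakeRedis
  def initialize
    @store = {}
  end

  # Store one field of a hash under the given key.
  def hset(key, field, value)
    (@store[key] ||= {})[field] = value
  end

  # Mirrors the reply of Redis DEL: the number of keys that were removed.
  def del(*keys)
    keys.count { |key| !@store.delete(key).nil? }
  end
end

redis = FakeRedis.new
redis.hset('0369610', 'title', 'Jurassic World')

first  = redis.del('0369610') # key exists
second = redis.del('0369610') # key is already gone
puts first, second
```

A reply of 0 is therefore a simple way to detect that the id was not stored in the first place.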
308
+
294
309
  * DELETE_ALL key
295
310
  deletes all movie objects stored in redis.
296
311
 
@@ -298,11 +313,6 @@ deletes all movie objects stored in redis.
298
313
  m.delete_all
299
314
  # => []
300
315
  ```
301
- # Visualizations
302
- (Work in progress)
303
-
304
- # Data mining
305
- (Work in progress)
306
316
 
307
317
  ## Contact me
308
318
 
@@ -312,5 +322,10 @@ You can also contact me at albertmck@gmail.com
312
322
 
313
323
  ## Disclaimer
314
324
  This software is provided “as is” and without any express or implied warranties, including, without limitation, the implied warranties of merchantibility and fitness for a particular purpose.
325
+ Neither I, nor any developer who contributed to this project, accept any kind of liability for your use of this library.
326
+
327
+ IMDB does not permit use of its data by third parties without their consent.
328
+
329
+ Using this library for anything other than limited personal use may result in an IP ban to the IMDB website.
315
330
 
316
331
  ###### Copyright (c) 2013 - 2015 Albert McKeever, released under MIT license
Binary file
@@ -42,5 +42,4 @@ module MovieDB
42
42
  return arr.flatten
43
43
  end
44
44
  end
45
- end
46
-
45
+ end
@@ -1,4 +1,5 @@
1
1
  require 'daru'
2
+ require 'json'
2
3
 
3
4
  module MovieDB
4
5
  module DataAnalysis
@@ -11,9 +12,12 @@ module MovieDB
11
12
 
12
13
  stats = [:mean, :std, :sum, :count, :max, :min, :min, :product, :standardize, :describe, :covariance, :correlation, :worksheet]
13
14
 
15
+ $collect_vals = {}
16
+
14
17
  stats.each do |method_name|
15
18
  define_method method_name do |**args|
16
- dataframes_stats(method_name, args)
19
+ $collect_vals[:method] = method_name.to_s
20
+ $collect_vals[:vals] = dataframes_stats(method_name, args)
17
21
  end
18
22
  end
19
23
 
@@ -22,16 +26,16 @@ module MovieDB
22
26
  def dataframes_stats(method, filters = {})
23
27
  raise ArgumentError, 'Please provide 2 or more IMDd ids.' if $movie_data.length <= 1
24
28
 
25
- @data_key = {}
26
- @index = []
29
+ $data_key = {}
30
+ $index = []
27
31
 
28
32
  if filters.empty?
29
33
  $movie_data.each_with_index do |movie, _|
30
34
  value_count = []
31
35
 
32
36
  movie.each_pair do |k, v|
33
- @data_key[(movie['title'].sub(" ", "_").downcase)] = value_count << (MovieDB::DataAnalysis::Statistics.numeric_vals.any? { |word| word == k } ? v.to_i : v.split(' ').count)
34
- @index << k.to_sym
37
+ $data_key[(movie['title'].sub(" ", "_").downcase)] = value_count << (MovieDB::DataAnalysis::Statistics.numeric_vals.any? { |word| word == k } ? v.to_i : v.split(' ').count)
38
+ $index << k.to_sym
35
39
  end
36
40
  end
37
41
  else
@@ -45,8 +49,8 @@ module MovieDB
45
49
  mr = movie.reject { |k, _| k != filter.to_s }
46
50
 
47
51
  mr.each_pair do |k, v|
48
- @data_key[(movie['title'].sub(" ", "_").downcase)] = value_count << (MovieDB::DataAnalysis::Statistics.numeric_vals.any? { |word| word == k } ? v.to_i : v.join(' ').split(' ').count)
49
- @index << k.to_sym
52
+ $data_key[(movie['title'].sub(" ", "_").downcase)] = value_count << (MovieDB::DataAnalysis::Statistics.numeric_vals.any? { |word| word == k } ? v.to_i : v.join(' ').split(' ').count)
53
+ $index << k.to_sym
50
54
  end
51
55
  end
52
56
  end
@@ -58,8 +62,8 @@ module MovieDB
58
62
  value_count = []
59
63
 
60
64
  mr.each_pair do |k, v|
61
- @data_key[(movie['title'].sub(" ", "_").downcase)] = value_count << (MovieDB::DataAnalysis::Statistics.numeric_vals.any? { |word| word == k } ? v.to_i : v.join(' ').split(' ').count)
62
- @index << k.to_sym
65
+ $data_key[(movie['title'].sub(" ", "_").downcase)] = value_count << (MovieDB::DataAnalysis::Statistics.numeric_vals.any? { |word| word == k } ? v.to_i : v.join(' ').split(' ').count)
66
+ $index << k.to_sym
63
67
  end
64
68
  end
65
69
  end
@@ -68,9 +72,9 @@ module MovieDB
68
72
  end
69
73
  end
70
74
 
71
- index = @index.uniq
75
+ index = $index.uniq
72
76
 
73
- movie_numeric_vector = Hash[@data_key.map { |k, v| [k.to_s.gsub('-', '_').to_sym, v] }]
77
+ movie_numeric_vector = Hash[$data_key.map { |k, v| [k.to_s.gsub('-', '_').to_sym, v] }]
74
78
  compute_stats(method, movie_numeric_vector, index )
75
79
  end
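In the method above, each movie becomes a vector keyed by a normalised title, where numeric fields are converted with `to_i` and every other field is replaced by its word count. The sketch below restates that mapping in isolation. `NUMERIC_FIELDS`, `vector_key` and `field_value` are hypothetical names, and the field list is made up. The sketch uses `gsub` for spaces, whereas the diffed code uses `sub`, which replaces only the first space; whether all spaces should be replaced is an assumption about the intended behaviour.

```ruby
# Hypothetical list of fields treated as numbers; the real list lives in
# MovieDB::DataAnalysis::Statistics.numeric_vals.
NUMERIC_FIELDS = %w[year votes].freeze

# Turn a title into a symbol key: spaces and hyphens become underscores.
def vector_key(title)
  title.gsub(' ', '_').gsub('-', '_').downcase.to_sym
end

# Numeric fields are converted to integers; anything else (a string or an
# array of strings) is reduced to its number of whitespace-separated words.
def field_value(name, value)
  return value.to_i if NUMERIC_FIELDS.include?(name)

  Array(value).join(' ').split(' ').count
end

p vector_key('Ant-Man')
p vector_key('A Three Word Title')
p 'A Three Word Title'.sub(' ', '_') # sub touches only the first space
p field_value('year', '2015')
p field_value('genres', ['Action', 'Sci-Fi'])
```

Wrapping the value in `Array(...)` lets one code path handle both strings and arrays, which the diffed code handles with separate `split` and `join` calls.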
76
80
 
@@ -74,6 +74,8 @@ module MovieDB
74
74
  return @redis_db.hgetall("#{id}")
75
75
  when :ttl
76
76
  return @redis_db.ttl("#{id}")
77
+ when :del
78
+ return @redis_db.del("#{id}")
77
79
  else
78
80
  raise ArgumentError, "The method #{method} is invalid."
79
81
  end
@@ -48,7 +48,7 @@ module MovieDB
48
48
  # m.fetch("0369610", "3079380", "0478970")
49
49
  #
50
50
  # m.hgetall("0369610")
51
- [:all, :hkeys, :hvals, :flushall, :ttl].each do |method_name|
51
+ [:all, :hkeys, :hvals, :flushall, :ttl, :del].each do |method_name|
52
52
  define_method method_name do |arg|
53
53
  MovieDB::DataStore.get_data(method_name, arg)
54
54
  end
@@ -56,6 +56,7 @@ module MovieDB
56
56
 
57
57
  alias hgetall all
58
58
 
59
+ # No argument is required.
59
60
  [:scan, :flushall].each do |method_name|
60
61
  define_method method_name do
61
62
  mn = MovieDB::DataStore.get_data(method_name)
@@ -1,3 +1,3 @@
1
1
  module MovieDB
2
- VERSION = "1.0.0"
2
+ VERSION = "1.0.1"
3
3
  end
@@ -26,4 +26,5 @@ Gem::Specification.new do |spec|
26
26
  spec.add_dependency 'imdb'
27
27
  spec.add_dependency 'json'
28
28
  spec.add_dependency 'celluloid'
29
+ spec.add_dependency 'nyaplot'
29
30
  end
@@ -9,8 +9,8 @@ describe MovieDB do
9
9
 
10
10
  context '#mean' do
11
11
  it 'should calculate mean of all values' do
12
- # expect(m.mean.round(2)).to eq(Daru::Vector.new([ 2891127.4, 36972648.98, 1445963.96],
13
- # index: ['ant-man', :jurassic_world, :spy]))
12
+ expect(m.mean.round(2)).to eq(Daru::Vector.new([ 2891127.4, 36972648.98, 1445963.96],
13
+ index: ['ant-man', :jurassic_world, :spy]))
14
14
  end
15
15
  end
16
16
 
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: movieDB
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.0.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Albert McKeever
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2015-08-18 00:00:00.000000000 Z
11
+ date: 2015-08-20 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -122,6 +122,20 @@ dependencies:
122
122
  - - ">="
123
123
  - !ruby/object:Gem::Version
124
124
  version: '0'
125
+ - !ruby/object:Gem::Dependency
126
+ name: nyaplot
127
+ requirement: !ruby/object:Gem::Requirement
128
+ requirements:
129
+ - - ">="
130
+ - !ruby/object:Gem::Version
131
+ version: '0'
132
+ type: :runtime
133
+ prerelease: false
134
+ version_requirements: !ruby/object:Gem::Requirement
135
+ requirements:
136
+ - - ">="
137
+ - !ruby/object:Gem::Version
138
+ version: '0'
125
139
  description: Perform Data Analysis on IMDB Movies
126
140
  email:
127
141
  - kotn_ep1@hotmail.com
@@ -139,6 +153,7 @@ files:
139
153
  - LICENSE.txt
140
154
  - README.md
141
155
  - Rakefile
156
+ - images/sampbar.png
142
157
  - lib/movieDB.rb
143
158
  - lib/movieDB/.DS_Store
144
159
  - lib/movieDB/base.rb