suggestor 0.0.3 → 0.0.6

data/README.md CHANGED
@@ -10,16 +10,15 @@ tastes) and alike.
10
10
 
11
11
  The gem needs a data structure like this:
12
12
 
13
- data = {"1": {"10": 10, "12": 1}, "2": {"11":5, "12": 4}}
13
+ data = '{"Alvaro Pereyra Rabanal": {"Primer": 10, "Memento": 9}, "Gustavo Leon": {"The Matrix":8, "Harry Potter": 8}}'
14
14
 
15
- Each element will ("1" or "2") correspond to, following the example, to user ids. They will gave access to related items (movies).
15
+ Following the example, each element corresponds to a user and gives access to that user's related items (movie reviews).
16
16
 
17
- In the example, the user "1" has seen movies identified with ids "10" and "12", given them a rating of 10 and 1, respectively. Similar with user with id "2".
17
+ In the example, the user "Alvaro Pereyra Rabanal" has seen the movies "Primer" and "Memento" and rated them 10 and 9, respectively. The same applies to the user "Gustavo Leon".
18
18
 
19
19
  After loading the gem with the data:
20
20
 
21
- engine = Suggestor::Engine.new
22
- engine.load_data(data)
21
+ engine = Suggestor::Engine.new(data)
23
22
 
24
23
  We can start to get some results.
25
24
 
@@ -28,18 +27,29 @@ We can start to get some results.
28
27
 
29
28
  For example, we can get similar users:
30
29
 
31
- engine.similar_items_to("1")
30
+ engine.similar_to("Alvaro Pereyra Rabanal")
32
31
 
33
32
  Which will return a structure like
34
33
 
35
- {id: similarity_score, id2: similarity_score }
34
+ [["label", similarity_score], ["label", similarity_score]]
35
+
36
+ Like:
37
+
38
+ [["Eogen Clase", 0.0001649620587264929], ["Daniel Subauste", 0.00011641443538998836], ["4D2Studio Diseno y Animacion", 8.548469823901521e-05], ["Rafael Lanfranco", 6.177033788374823e-05], ["Veronica Zapata Gotelli", 6.074965068950854e-05]]
36
39
 
37
40
  Thus, you can load the data and save their similarity scores for later use.
38
41
 
42
+ You can limit the number of results by passing a "size" argument:
43
+
44
+ engine.similar_to("Alvaro Pereyra Rabanal", :size => 5)
45
+
39
46
  Now, that's fine and all, but what about Mr. Bob, who always ranks everything
40
47
  higher? ID4 may not be that good after all. If that happens, Suggestor allows you to change the algorithm used:
41
48
 
42
- engine.similar_items_to("1", :algorithm => :pearson_correlation)
49
+ algorithm = Suggestor::Algorithms::PearsonCorrelation
50
+ engine = Suggestor::Engine.new(data, algorithm)
51
+
52
+ engine.recommended_to("Alvaro Pereyra Rabanal")
43
53
 
44
54
  There are two implemented methods, Euclidean Distance and Pearson Correlation.
45
55
 
@@ -54,11 +64,21 @@ take in mind if some user grades higher or lower and return more exact suggestio
54
64
  Most interestingly, the gem allows you to get suggestions based on the data.
55
65
  For example, which movies should user "2" watch, based on his reviews and on the tastes of similar users?
56
66
 
57
- engine.recommented_related_items_for("2",:pearson_correlation)
67
+ engine.recommended_to("Alvaro Pereyra Rabanal")
58
68
 
59
69
  As before, the structure returned will be
60
70
 
61
- {id: similarity_score, id2: similarity_score }
71
+ [["label", similarity_score], ["label", similarity_score]]
72
+
73
+ But in this case, it will represent movie labels, and how similar they are. You
74
+ can easily save this data to a DB, since movie ratings tend to stabilize over time and won't change that often.
75
+
76
+ ### Similar related items
77
+
78
+ We can also invert the data that the user has added, enabling us to get
79
+ similar related items. For example, let's say I'm on a Movie profile and
80
+ want to check which other movies are similar to it:
81
+
82
+ engine.similar_related_to("Batman Begins ", :size => 5)
62
83
 
63
- But in this case, it will represent movie id's, and how similar are. You
64
- can easily use this data to save it to a BD, since Movie ratings tend to estabilize on time and won't change that often.
84
+ Now you can go and build your awesome recommendations web site :)
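
The input format described in the README hunk above can be explored with Ruby's standard JSON library alone. The following is a minimal sketch, not part of the gem: it only parses the README's sample string and walks the resulting nested hash (the variable name `collection` is chosen for illustration).

```ruby
require 'json'

# Sample input copied from the README hunk: users map to their rated movies.
data = '{"Alvaro Pereyra Rabanal": {"Primer": 10, "Memento": 9}, ' \
       '"Gustavo Leon": {"The Matrix": 8, "Harry Potter": 8}}'

# JSON.parse returns a Hash of Hashes with String keys.
collection = JSON.parse(data)

collection.each do |user, reviews|
  puts "#{user} rated: #{reviews.keys.join(', ')}"
end
```

Because the keys come back as strings, lookups such as `collection["Gustavo Leon"]["The Matrix"]` use string labels rather than symbols.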
@@ -0,0 +1,42 @@
1
+ require_relative '../lib/suggestor'
2
+
3
+ # I'm using test data of Users and their movie recommendations
4
+ # Each user has a hash of their reviews, mapping each movie to
5
+ # the rating they gave it
6
+ json = File.read("test/movies.json")
7
+ engine = Suggestor::Engine.new(json, Suggestor::Algorithms::EuclideanDistance)
8
+
9
+ # Let's get some similar users
10
+ name = "Alvaro Pereyra Rabanal"
11
+ puts "Who is similar to #{name}"
12
+ puts engine.similar_to(name, size: 5).inspect
13
+
14
+ puts
15
+ puts
16
+
17
+ # So, now that we know them, why not get some recommendations?
18
+ puts "Interesting! But I want to see some stuff at the movies, what to watch?"
19
+ opts = {size: 5}
20
+ results = engine.recommended_to("Alvaro Pereyra Rabanal", opts)
21
+
22
+ puts results.inspect
23
+
24
+ puts
25
+ puts
26
+
27
+ # That's good, but let's account for rating bias by using Pearson Correlation:
28
+ puts "Adjust this results please"
29
+ engine = Suggestor::Engine.new(json,Suggestor::Algorithms::PearsonCorrelation)
30
+
31
+ opts = {size: 5}
32
+ results = engine.recommended_to("Alvaro Pereyra Rabanal", opts)
33
+ puts results.inspect
34
+
35
+ puts
36
+ puts
37
+
38
+ name = "Batman Begins "
39
+ puts "Now that was nice. But which others are similar to '#{name}'"
40
+ opts = {size: 10}
41
+ results = engine.similar_related_to(name, opts)
42
+ puts results.inspect
data/lib/suggestor.rb CHANGED
@@ -1,3 +1,5 @@
1
1
  require_relative 'suggestor/engine'
2
-
2
+ require_relative 'suggestor/algorithms/recommendation_algorithm'
3
+ require_relative 'suggestor/algorithms/euclidean_distance'
4
+ require_relative 'suggestor/algorithms/pearson_correlation'
3
5
 
@@ -1,5 +1,3 @@
1
- require_relative 'recommendation_algorithm'
2
-
3
1
  module Suggestor
4
2
  module Algorithms
5
3
 
@@ -24,21 +22,22 @@ module Suggestor
24
22
 
25
23
  include RecommendationAlgorithm
26
24
 
27
- def similarity_score_between(first, second)
28
- return 0.0 if no_shared_items_between?(first, second)
29
- inverse_of_sum_of_squares_between(first, second)
25
+ def similarity_score(first, second)
26
+ return 0.0 if nothing_shared?(first, second)
27
+ inverse_of_squares(first, second)
30
28
  end
31
29
 
32
- def inverse_of_sum_of_squares_between(first, second)
33
- 1/(1+sum_squares_of_shared_items_between(first, second))
30
+ def inverse_of_squares(first, second)
31
+ 1/(1+Math.sqrt(sum_squares(first, second)))
34
32
  end
35
33
 
36
- def sum_squares_of_shared_items_between(first, second)
37
- shared_items_between(first, second).inject(0.0) do |sum, item|
38
- sum + (values_for(first)[item] - values_for(second)[item])**2
34
+ def sum_squares(first, second)
35
+ shared_items(first, second).inject(0.0) do |sum, item|
36
+ sum + ( values_for(first)[item] - values_for(second)[item] ) ** 2
39
37
  end
40
38
  end
41
39
 
42
40
  end
41
+
43
42
  end
44
43
  end
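
The Euclidean hunk above changes the score to `1/(1+Math.sqrt(sum_squares))`. As a rough standalone illustration (a sketch assuming plain hashes of item => rating; `euclidean_similarity` is a hypothetical helper and not the gem's API), the same formula can be written as:

```ruby
# Hypothetical standalone helper mirroring the formula in the hunk above.
# Both arguments are hashes of item => numeric rating.
def euclidean_similarity(first, second)
  shared = first.keys & second.keys
  return 0.0 if shared.empty? # nothing in common

  # Sum of squared rating differences over the shared items.
  sum_squares = shared.inject(0.0) do |sum, item|
    sum + (first[item] - second[item])**2
  end

  # Identical ratings give distance 0 and therefore a score of 1.0.
  1.0 / (1.0 + Math.sqrt(sum_squares))
end

alvaro  = { "Primer" => 10, "Memento" => 9 }
gustavo = { "Primer" => 7,  "Memento" => 5 }
puts euclidean_similarity(alvaro, gustavo)
```

For the two sample users the squared differences are 9 and 16, so the distance is 5 and the score is 1/6. Taking the square root, as the new version does, keeps scores from collapsing toward zero as quickly when rating differences grow.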
@@ -1,5 +1,3 @@
1
- require_relative 'recommendation_algorithm'
2
-
3
1
  module Suggestor
4
2
  module Algorithms
5
3
 
@@ -18,86 +16,94 @@ module Suggestor
18
16
  # the closest distance to all of them. If the two users have the same
19
17
  # ratings, it would show as a perfect diagonal (score of 1)
20
18
 
21
- # The closest the movies to the line are, the more similar their tastes are.
19
+ # The closer the movies are to the line, the more similar their tastes
20
+ # are.
22
21
 
23
22
  # The great thing about using Pearson Correlation is that it accounts for
24
23
  # bias when evaluating the results. Thus, a user who always rates movies
25
24
  # with great scores won't skew the results.
26
25
 
27
- # It's probably a best fit for subjetive reviews (movies reviews, profile points, etc).
26
+ # It's probably the best fit for subjective reviews (movie reviews, profile
27
+ # points, etc).
28
28
 
29
- # More info at: http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
29
+ # More info at:
30
+ # http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
30
31
 
31
32
  class PearsonCorrelation
32
33
 
33
34
  include RecommendationAlgorithm
34
35
 
35
- def similarity_score_between(first, second)
36
- return 0.0 if no_shared_items_between?(first, second)
36
+ def similarity_score(first, second)
37
+ return -1.0 if nothing_shared?(first, second)
37
38
 
38
- calculate_all_sums_for(first, second)
39
- numerator = difference_from_total_and_normalize_values
40
- # 10.5 / 0.0 /
41
- denominator = square_root_from_differences_of_sums
39
+ process_values(first, second)
42
40
 
43
- return 0.0 if denominator == 0
41
+ numerator = difference_from_values
42
+ denominator = square_root_from_differences
44
43
 
44
+ return 0.0 if denominator == 0
45
45
  numerator / denominator
46
-
47
46
  end
48
47
 
49
48
  private
50
49
 
51
- def calculate_all_sums_for(first,second)
52
-
53
- shared_items = shared_items_between(first, second)
54
- @total_related_items = shared_items.size
50
+ def process_values(first, second)
51
+ items = shared_items(first, second)
52
+ @total_related_items = items.size.to_f
55
53
 
56
- #simplify access
57
- first_values = values_for(first)
54
+ first_values = values_for(first)
58
55
  second_values = values_for(second)
59
56
 
60
- @first_values_sum = @second_values_sum = @first_square_values_sum = \
61
- @second_square_values_sum = @products_sum = 0.0
57
+ create_helper_variables
62
58
 
63
- shared_items.each do |item|
59
+ items.each do |item|
64
60
 
65
- # Gets the corresponding value for each item on both elements
66
- # For ex., the rating of the same movie by different users
67
- first_value = first_values[item]
61
+ first_value = first_values[item]
68
62
  second_value = second_values[item]
69
63
 
70
- # Will add all the related items values for the first
71
- # and second item
72
- # For ex., all movie recommendations ratings
73
- @first_values_sum += first_value
74
- @second_values_sum += second_value
75
-
76
- # Adds the squares of both elements
77
- @first_square_values_sum += first_value ** 2
78
- @second_square_values_sum += second_value ** 2
64
+ append_values(first_value, second_value)
65
+ append_squares(first_value, second_value)
66
+ append_product(first_value, second_value)
79
67
 
80
- # Adds the product of both values
81
- @products_sum += first_value*second_value
82
68
  end
69
+ end
83
70
 
71
+ def append_values(first_value, second_value)
72
+ @first_values_sum += first_value
73
+ @second_values_sum += second_value
84
74
  end
85
75
 
86
- def difference_from_total_and_normalize_values
76
+ def append_squares(first_value, second_value)
77
+ @first_square_values_sum += ( first_value ** 2 )
78
+ @second_square_values_sum += ( second_value ** 2 )
79
+ end
80
+
81
+ def append_product(first_value, second_value)
82
+ @products_sum += first_value * second_value
83
+ end
84
+
85
+ def difference_from_values
87
86
  product = @first_values_sum * @second_values_sum
88
87
  normalized = product / @total_related_items
89
88
  @products_sum - normalized
90
89
  end
91
90
 
92
- def square_root_from_differences_of_sums
93
-
94
- power_left_result = @first_values_sum **2 /@total_related_items
95
- equation_left = @first_square_values_sum - power_left_result
91
+ def square_root_from_differences
92
+ power_left_result = ( @first_values_sum ** 2 ) / @total_related_items
93
+ equation_left = @first_square_values_sum - power_left_result
94
+
95
+ power_right_result = ( @second_values_sum ** 2 )/ @total_related_items
96
+ equation_right = @second_square_values_sum - power_right_result
96
97
 
97
- power_right_result = ( @second_values_sum **2 )/@total_related_items
98
- equation_right = @second_square_values_sum - power_right_result
99
- Math.sqrt(equation_left * equation_right)
98
+ Math.sqrt( equation_left * equation_right )
99
+ end
100
100
 
101
+ def create_helper_variables
102
+ @first_values_sum = 0.0
103
+ @second_values_sum = 0.0
104
+ @first_square_values_sum = 0.0
105
+ @second_square_values_sum = 0.0
106
+ @products_sum = 0.0
101
107
  end
102
108
 
103
109
  end
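
The Pearson hunk above accumulates sums, sums of squares, and a sum of products over the shared items, then divides a numerator by a square-root denominator. A compact standalone sketch of that computation follows; it assumes plain hashes of item => rating, and `pearson_similarity` is a hypothetical name, not the gem's API.

```ruby
# Hypothetical standalone helper following the same steps as the hunk above.
def pearson_similarity(first, second)
  shared = first.keys & second.keys
  return -1.0 if shared.empty? # mirrors the new "nothing shared" result

  n = shared.size.to_f
  sum1 = sum2 = sq1 = sq2 = products = 0.0

  shared.each do |item|
    a = first[item]
    b = second[item]
    sum1     += a
    sum2     += b
    sq1      += a**2
    sq2      += b**2
    products += a * b
  end

  numerator   = products - (sum1 * sum2 / n)
  denominator = Math.sqrt((sq1 - sum1**2 / n) * (sq2 - sum2**2 / n))

  # A zero denominator means at least one side has no variance.
  return 0.0 if denominator == 0

  numerator / denominator
end

harsh    = { "a" => 1, "b" => 2, "c" => 3 }
generous = { "a" => 2, "b" => 4, "c" => 6 }
puts pearson_similarity(harsh, generous)
```

The two sample raters disagree on absolute scores but rank the items identically, which is the bias case the comments in the hunk describe: the correlation comes out as a perfect match even though every rating differs.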
@@ -8,50 +8,86 @@ module Suggestor
8
8
  @collection = collection
9
9
  end
10
10
 
11
- # returns similar items based on their similary score
12
- # for example, similar users based on their movies reviews
13
- def similar_items_to(main)
14
-
15
- #just compare those whore aren't the main item
16
- compare_to = collection.dup
17
- compare_to.delete(main)
18
-
19
- # return results based on their score
20
- compare_to.keys.inject({}) do |result, other|
21
- result.merge!({other => similarity_score_between(main,other)})
22
- end
11
+ # Ex. Similar users based on their movies reviews
12
+ def similar_to(main, opts={})
13
+ opts = default_options.merge(opts)
14
+
15
+ collection = remove_self(main)
16
+ results = order_by_similarity_score(main,collection)
23
17
 
18
+ sort_results(results,opts[:size])
24
19
  end
25
20
 
26
- # returns recommended related items for the main user
27
- # The most important feature. For example, a user will get
28
- # movie recommendations based on his past movie reviews
29
- # and how it compares with others
30
- def recommented_related_items_for(main)
21
+ # Ex. a user will get movie recommendations
22
+ def recommended_to(main, opts={})
23
+ opts = default_options.merge(opts)
31
24
 
32
25
  @similarities = @totals = Hash.new(0)
33
- @main = main
34
26
 
35
- create_similarities_totals
36
- generate_rankings
27
+ create_similarities_totals(main)
28
+ results = generate_rankings
37
29
 
30
+ sort_results(results,opts[:size])
38
31
  end
39
32
 
40
- def no_shared_items_between?(first,second)
41
- shared_items_between(first,second).empty?
33
+ # Ex. what other movies are related to a given one
34
+ def similar_related_to(main, opts={})
35
+ opts = default_options.merge(opts)
36
+
37
+ collection = invert_collection
38
+ engine = self.class.new(collection)
39
+
40
+ engine.similar_to(main,opts)
42
41
  end
43
42
 
44
- def shared_items_between(first,second)
45
- return [] unless values_for(first) && values_for(second)
43
+ def shared_items(first, second)
44
+ return [] unless values_for(first) && values_for(second)
45
+
46
46
  related_keys_for(first).select do |item|
47
47
  related_keys_for(second).include? item
48
48
  end
49
- end
49
+ end
50
50
 
51
51
  private
52
52
 
53
- def main_already_has?(related)
54
- collection[@main].has_key?(related)
53
+ def default_options
54
+ {size: 5}
55
+ end
56
+
57
+ def nothing_shared?(first, second)
58
+ shared_items(first, second).empty?
59
+ end
60
+
61
+ def remove_self(main)
62
+ cleaned = collection.dup
63
+ cleaned.delete(main)
64
+ cleaned
65
+ end
66
+
67
+
68
+ # changes { "Cat": {"1": 10, "2":20}, "Dog": {"1":5, "2": 15} }
69
+ # to {"1": {"Cat": 10, "Dog": 5}, "2": {"Cat": 20, "Dog": 15}
70
+ def invert_collection
71
+ results = {}
72
+
73
+ collection.keys.each do |main|
74
+ collection[main].keys.each do |item|
75
+ results[item] ||= {}
76
+ results[item][main] = collection[main][item]
77
+ end
78
+ end
79
+
80
+ results
81
+ end
82
+
83
+ def order_by_similarity_score(main,collection)
84
+ result = collection.keys.inject({}) do |res, other|
85
+ res.merge!({other => similarity_score(main, other)})
86
+ end
87
+ end
88
+
89
+ def already_has?(main, related)
90
+ collection[main].has_key?(related)
55
91
  end
56
92
 
57
93
  def values_for(id)
@@ -62,47 +98,58 @@ module Suggestor
62
98
  values_for(id).keys
63
99
  end
64
100
 
65
- def add_to_totals(other,item,score)
66
- @totals[item] += collection[other][item]*score
101
+ def add_to_totals(other, item, score)
102
+ @totals[item] += collection[other][item]*score
67
103
  @similarities[item] += score
68
104
  end
69
105
 
70
- def generate_rankings
71
- @rankings = {}
106
+ def sort_results(results,size=-1)
107
+ sorted = results.sort{|a,b| a[1] <=> b[1]}.reverse
108
+ sorted[0, size]
109
+ end
72
110
 
111
+ def generate_rankings
112
+ rankings = {}
113
+
73
114
  @totals.each_pair do |item, total|
74
- normalized_value = (total / @similarities[item])
75
- @rankings.merge!( { item => normalized_value} )
115
+ normalized_value = (total / Math.sqrt(@similarities[item]))
116
+ rankings.merge!( { item => normalized_value} )
76
117
  end
77
118
 
78
- @rankings
119
+ rankings
120
+ end
121
+
122
+ def something_in_common?(score)
123
+ score > 0
124
+ end
79
125
 
126
+ def same_item?(main, other)
127
+ other == main
80
128
  end
81
129
 
82
- def create_similarities_totals
130
+ def create_similarities_totals(main)
83
131
 
84
132
  collection.keys.each do |other|
85
133
 
86
- # won't bother comparing it if the compared item is the same
87
- # as the main, or if they scores are below 0 (nothing in common)
88
- next if other == @main
89
- score = similarity_score_between(@main,other)
90
- next if score <= 0
91
-
92
- # will compare each the results but only for related items
93
- # that the main item doesn't already have
94
- # For ex., if they have already saw a movie they won't
95
- # get it suggested
134
+ next if same_item?(main,other)
135
+
136
+ score = similarity_score(main, other)
137
+
138
+ next unless something_in_common?(score)
139
+
96
140
  collection[other].keys.each do |item|
97
141
 
98
- unless main_already_has?(item)
99
- add_to_totals(other,item,score)
142
+ unless already_has?(main, item)
143
+ add_to_totals(other, item, score)
100
144
  end
101
145
 
102
146
  end
147
+
103
148
  end
149
+
104
150
  end
105
151
 
152
+
106
153
  end
107
154
  end
108
155
  end
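
The `invert_collection` method added in the hunk above turns a user => items mapping into an item => users mapping, which is what lets `similar_related_to` reuse the user-similarity code for movies. A minimal standalone sketch of the same transformation, assuming a plain nested hash (`invert` is a hypothetical name, not the gem's API):

```ruby
# Hypothetical standalone version of the inversion described in the hunk above.
# Turns { main => { item => value } } into { item => { main => value } }.
def invert(collection)
  collection.each_with_object({}) do |(main, items), inverted|
    items.each do |item, value|
      (inverted[item] ||= {})[main] = value
    end
  end
end

pets = { "Cat" => { "1" => 10, "2" => 20 }, "Dog" => { "1" => 5, "2" => 15 } }
puts invert(pets).inspect
```

Inverting twice returns the original structure, so the same similarity routine can serve both directions without a second implementation.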
@@ -1,6 +1,4 @@
1
1
  require 'json'
2
- require_relative 'algorithms/euclidean_distance'
3
- require_relative 'algorithms/pearson_correlation'
4
2
 
5
3
  module Suggestor
6
4
 
@@ -8,50 +6,25 @@ module Suggestor
8
6
 
9
7
  class Engine
10
8
 
11
- attr_accessor :collection
12
-
13
- def initialize
14
- @collection = {}
15
- end
16
-
17
- def load_data(input)
18
- add_to_collection(input)
9
+ def initialize(input, algorithm = Algorithms::EuclideanDistance)
10
+ @collection = parse_from_json(input)
11
+ @algorithm = algorithm.new(@collection)
19
12
  end
20
-
21
- def similarity_score_for(first, second, opts={})
22
- opts[:algorithm] ||= :euclidean_distance
23
- strategy_for(opts[:algorithm]).similarity_score_between(first, second)
13
+
14
+ def similar_to(item, opts={})
15
+ @algorithm.similar_to(item, opts)
24
16
  end
25
17
 
26
- def similar_items_to(item, opts={})
27
- opts[:algorithm] ||= :euclidean_distance
28
- strategy_for(opts[:algorithm]).similar_items_to(item)
18
+ def recommended_to(item, opts={})
19
+ @algorithm.recommended_to(item, opts)
29
20
  end
30
21
 
31
- def recommented_related_items_for(item, opts={})
32
- opts[:algorithm] ||= :euclidean_distance
33
- strategy_for(opts[:algorithm]).recommented_related_items_for(item)
22
+ def similar_related_to(item, opts={})
23
+ @algorithm.similar_related_to(item, opts)
34
24
  end
35
25
 
36
26
  private
37
-
38
- def strategy_for(algorithm)
39
- constantize(classify(algorithm)).new(collection)
40
- end
41
-
42
- # based on Rail's code
43
- def classify(name)
44
- name.to_s.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
45
- end
46
-
47
- def constantize(name)
48
- Suggestor::Algorithms.const_get(name)
49
- end
50
27
 
51
- def add_to_collection(input)
52
- @collection.merge! parse_from_json(input)
53
- end
54
-
55
28
  def parse_from_json(json)
56
29
  JSON.parse(json)
57
30
  rescue Exception => ex
@@ -1,3 +1,3 @@
1
1
  module Suggestor
2
- VERSION = "0.0.3"
2
+ VERSION = "0.0.6"
3
3
  end
data/suggestor.gemspec CHANGED
@@ -17,5 +17,6 @@ Gem::Specification.new do |s|
17
17
  s.files = `git ls-files`.split("\n")
18
18
  s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
19
19
  s.executables = []
20
+ s.add_dependency("rake")
20
21
  s.require_paths = ["lib"]
21
22
  end
@@ -1,29 +1,27 @@
1
1
  require 'minitest/autorun'
2
- require_relative '../lib/suggestor/algorithms/euclidean_distance'
3
- require_relative '../lib/suggestor/engine'
2
+ require 'json'
3
+ require_relative '../lib/suggestor'
4
4
 
5
5
  describe Suggestor::Algorithms::EuclideanDistance do
6
6
 
7
7
  before do
8
- @data_string = File.read("test/test.json")
9
- @suggestor = Suggestor::Engine.new
10
- @suggestor.load_data(@data_string)
11
- @algorithm = Suggestor::Algorithms::EuclideanDistance.new(@suggestor.collection)
8
+ data_string = File.read("test/numbers.json")
9
+ data = JSON.parse(data_string)
10
+ @algorithm = Suggestor::Algorithms::EuclideanDistance.new(data)
12
11
  end
13
12
 
14
13
  describe "when building up recommendations" do
15
14
 
16
15
  it "must return a list of shared items between two people" do
17
- @algorithm.shared_items_between(1,2).must_be :==, ["1","2"]
16
+ @algorithm.shared_items(1,2).must_be :==, ["1","2"]
18
17
  end
19
18
 
20
19
  it "must return 0 as similarity record if two elements have no shared items" do
21
- @algorithm.similarity_score_between(1,99).must_be :==, 0
20
+ @algorithm.similarity_score(1,99).must_be :==, 0
22
21
  end
23
22
 
24
23
  it "must return 1 as similarity record if two elements have equal related values" do
25
- puts @algorithm.shared_items_between(1,1).inspect
26
- @algorithm.similarity_score_between(1,1).must_be :==, 1
24
+ @algorithm.similarity_score(1,1).must_be :==, 1
27
25
  end
28
26
 
29
27
  end
data/test/movies.json ADDED
@@ -0,0 +1 @@
1
+ {"Alvaro Pereyra Rabanal":{"Enterrado":90,"La reunion del diablo":20,"Scott Pilgrim vs The World":80,"El avispon verde":20,"Se dice de mi":89,"Un tonto en el amor":75,"El secreto de sus ojos":99,"Wall Street: El dinero nunca duerme":90,"Super 8 ":90,"Kung fu panda 2":92,"La revelacion":70,"Rio":90,"El cisne negro":70,"Tron: El Legado":90,"Invasion del Mundo: Batalla Los Angeles":50,"Peluda venganza":20,"Megamente":90,"Dias de ira":76,"El especialista":20,"u00bfQue paso ayer? 2":20,"El escritor oculto":80,"Red ":33,"Amor a distancia":90,"Resident Evil 4: La Resurreccion":50,"Octubre":55,"Sin limite":65,"Love Actually ":90,"El gran concierto":90,"Perdidos en Tokio":90,"Pulp Fiction ":87,"Cazador de demonios":20,"Loco y estu00fapido amor":65,"Amor por contrato":66,"Contracorriente":90,"Actividad paranormal 2":20,"La Vigilia":15,"El rey leon":90,"Transformers: El lado oscuro de la luna":20,"Cuando Harry conocio a Sally":95,"Machete":90,"Avatar":80,"Soy el nu00famero cuatro":90,"Toy Story 3 ":99,"El discurso del rey ":90,"Noches de encanto":90,"Enredados ":55,"Red Social":80,"La otra familia ":70,"Tesis ":90,"Harry Potter and the Deathly Hallows":76,"El planeta de los simios: Revolucion":80,"Biutiful ":20,"Harry Potter y las Reliquias de la muerte: Parte II":60,"Las Cronicas de Narnia: La travesia del viajero del alba":63,"El regreso de la nana magica":20},"Angel Velasquez":{"Los indestructibles":95},"Rafael Lanfranco":{"El rey leon":20,"Enter the Dragon ":78,"Un hombre solitario":20,"Desconocido ":20,"La historia sin fin":90,"Pi: Fe en el caos":66,"Harry Potter and the Deathly Hallows":86,"Dark City ":89,"Juego de Traiciones":78,"X-Men: Primera generacion":89,"La Aldea":20,"Agora ":90,"Comer, rezar, amar":20,"La duda":89,"Invasion del Mundo: Batalla Los Angeles":86,"El nuevo entrenador":90,"Marea Roja":86,"Sniper":77,"Conoceras al hombre de tus suenos":90,"El peleador":54,"Enterrado":50,"Mary and Max ":90,"Gran Torino ":86,"Senales":20,"Real Steel ":10,"Breakin' 
":94,"Tesis ":66,"Red ":20,"El secreto de sus ojos":75,"El Senor De Los Anillos: Las Dos Torres":50,"Cowboys y Aliens":82,"Mision Imposible":91,"Match Point":64,"Loco por ella":77,"La revelacion":90,"El origen":86,"El Pianista":90,"Lazos de sangre":94,"El Informante":100,"Mas alla de la vida":90,"Carancho ":20,"Harry Potter y las Reliquias de la muerte: Parte II":30,"Jumper ":20,"Apocalypse Now ":92,"Source Code ":90,"Rango ":20,"Hot Fuzz ":43,"La fuente de la vida":91,"The Doubt ":89,"Medianoche en Paris":89,"Sin limite":77,"Rapidos y Furiosos 5":90,"El discurso del rey ":81,"127 horas":85,"Following":90,"Serenity ":60,"Una propuesta atrevida":20,"Temple de Acero":81,"Piratas del Caribe: Navegando aguas misteriosas":84,"Ex, todos tenemos uno":90,"Los ilusionautas":20,"Agente Salt":90,"Star Wars: Episodio IV - Una nueva esperanza":89,"El planeta de los simios: Revolucion":89,"Kick - Ass":90,"Cyrus":90,"El Exterminador 2: El Dia del Juicio Final":87,"Invasion del mundo":82,"Mundo Surreal":92,"Kung Fu Hustle":90,"El mensajero ":90,"Red Social":94,"I saw the devil":73,"El Club de la Pelea":90,"El Senor de los Anillos: El retorno del Rey":66,"Scott Pilgrim vs The World":90,"Terminator II ":87,"Preciosa":90,"Kung fu panda 2":92,"En un rincon del corazon":20,"12 hombres molestos":55,"Thirteen Days ":11,"E.T : El Extraterrestre":90,"Harry Potter y el Prisionero de Azkaban":11,"Way of the Dragon ":90,"Los agentes del destino":75,"True Romance ":75,"Pulp Fiction ":100,"Thor":60,"8 Mile":70,"Super 8 ":90,"The Wolfman ":20,"Los imperdonables":92,"El Protegido":90,"Mi nombre es John Lennon":90,"Hable con ella ":90,"Bastardos sin gloria":80,"The Big Lebowski ":90,"social network":90,"Camino al Oscar":99,"Winnie Pooh ":50,"Beginners":90,"El Fin de Los Tiempos":20,"Celda 211 ":90,"Siempre a tu lado":70,"Tron: El Legado":20,"Que pena tu vida ":20,"Capitan America: El primer vengador":78,"El especialista":20,"Fuego Contra Fuego":78},"4D2Studio Diseno y Animacion":{"Piratas del 
Caribe 3: En El Fin del Mundo":90,"The Adventures of Tintin: The Secret of the Unicorn ":99,"Mundo Surreal":60,"Temple de Acero":99,"Laeon :El Profesional":90,"El secreto de sus ojos":92,"Fallen Art":96,"Kung fu panda 2":95,"Capitan America: El primer vengador":99,"Tron: El Legado":100,"Los indestructibles":9,"Rango ":20,"Megamente":20,"Thor":100,"Dias de ira":90,"Dorothy of Oz ":100,"Pandorum":65,"Los ilusionautas":20,"Siempre a tu lado":90,"El amante":90,"Cazador de demonios":75,"El u00faltimo maestro del aire":9,"Transformers: El lado oscuro de la luna":77,"X-Men: Primera generacion":90,"Los cazafantasmas":89,"Harry Potter and the Deathly Hallows":89,"Comer, rezar, amar":20,"Agora ":90,"Piratas del Caribe: Navegando aguas misteriosas":85},"Daniel Subauste":{"El rey leon":90,"La Masacre de Texas: El Origen":20,"Una loca pelicula de vampiros":20,"Calabozos y Dragones":205,"Piratas del Caribe 3: En El Fin del Mundo":20,"El Vengador":40,"Linterna Verde":60,"Luna Nueva":20,"Juan de los Muertos":100,"Harry Potter and the Deathly Hallows":70,"Avatar":60,"Dragones, destino de fuego":2,"Space Cowboys ":20,"Agora ":95,"Comer, rezar, amar":20,"X-Men: Primera generacion":90,"Piratas en el Callao":20,"Invasion del Mundo: Batalla Los Angeles":38,"El u00faltimo exorcismo":20,"Tesis ":90,"Daejame entrar":90,"Senales":20,"Red ":90,"Cowboys y Aliens":45,"Mision Imposible":55,"Megamind ":100,"Harry Potter y el Caliz de Fuego":75,"Los indestructibles":30,"La invasion":40,"La Sonrisa de Mona Lisa":90,"Pandorum":68,"Zodiaco":91,"Calabozos y Dragones 2 El Poder Mayor":80,"Transformers: El lado oscuro de la luna":60,"Corazon Valiente":90,"El cisne negro":90,"Mas alla de la vida":20,"Me enamorae en Nueva York":20,"Harry Potter y las Reliquias de la muerte: Parte II":65,"Millennium I: Los hombres que no amaban a las mujeres":60,"Un tonto en el amor":90,"Como Agua para Chocolate ":20,"Laeon :El Profesional":90,"La Naranja Mecanica":90,"Horton Hears a Who! 
":90,"Sin limite":85,"Enredados ":90,"El u00faltimo maestro del aire":20,"La chica de mis suenos":90,"La Pasion de Cristo":76,"Temple de Acero":60,"Crepu00fasculo":20,"Perdidos en Tokio":99,"Piratas del Caribe: Navegando aguas misteriosas":80,"Ga'Hoole :La Leyenda De Los Guardianes":90,"Los ilusionautas":20,"Wall Street: El dinero nunca duerme":90,"El planeta de los simios: Revolucion":85,"Kick - Ass":90,"Hannibal Rising ":20,"Planet Terror ":90,"El Codigo Da Vinci":20,"Mundo Surreal":90,"Red Social":60,"Machete":99,"Kung Fu Hustle":90,"Dragones: destino de fuego ":2,"Sanctum ":80,"Scott Pilgrim vs The World":75,"Seven":90,"Triste San Valentin":20,"Megamente":100,"u00bfComo saber si es amor?":20,"Kung fu panda 2":90,"El u00faltimo guerrero Chanka":100,"Thor":60,"Apocalypto ":20,"The Kids Are All Right ":90,"Super 8 ":78,"El Protegido":20,"TRON ":60,"El avispon verde":20,"Los pitufos":75,"El juego del miedo VII 3D":20,"Mongol, el emperador":90,"Tron: El Legado":85,"El Fin de Los Tiempos":20,"The Runaways ":40,"Capitan America: El primer vengador":90,"Millennium I - Los hombres que no amaban a las mujeres":60},"Laura Vanessa M":{"El secreto de sus ojos":99,"Wall Street: El dinero nunca duerme":70,"Atraccion peligrosa":44,"El escritor oculto":86,"La Vigilia":90,"Noches de encanto":90},"Veronica Zapata Gotelli":{"Sin lugar Para los Daebiles":90,"Mundo Surreal":20,"Temple de Acero":70,"Cartas a Julieta":55,"Carancho ":90,"Mary and Max ":90,"The Kids Are All Right ":70,"Wall Street: El dinero nunca duerme":66,"La Naranja Mecanica":90,"Rio":90,"Perros de Reserva":90,"Sin City ":90,"El Truco Final":90,"Lazos de sangre":60,"El cisne negro":90,"La cinta blanca":40,"Los indestructibles":75,"Fargo ":50,"Traffic ":90,"LadyKillers":60,"El juego ":90,"Seven":90,"Psicosis":90,"Crueldad Intolerable":79,"300 ":92,"El escritor oculto":90,"El peleador":80,"Octubre":90,"Al otro lado del corazon":92,"El Resplandor":99,"Source Code ":70,"Love and Other Impossible Pursuits ":90,"Gran 
Torino ":96,"The King's Speech":90,"La vida de los peces ":77,"Incendies ":91,"X-Men: Primera generacion":90,"El Hombre que Nunca Estuvo Alli":90,"La chica de la capa roja":20,"Batman Begins ":90,"Terciopelo Azul ":55,"Cuando Harry conocio a Sally":90,"Triste San Valentin":90,"Buenas Noches y Buena Suerte":82,"Un Hombre Serio":90,"Soy el nu00famero cuatro":20,"The Big Lebowski ":78,"Noches de encanto":20,"Red Social":90,"El discurso del rey ":93,"Ciudadano Kane ":90,"Quaemese despuaes de Leer":90,"Pase libre":90,"Agua para elefantes":20,"Rain Man ":90,"Conoceras al hombre de tus suenos":90,"En un rincon del corazon":20,"Dinner for Schmucks ":20,"Vaertigo":90,"Un cuento chino ":20,"Batman :El Caballero Oscuro":90,"Bastardos sin gloria":90,"Belleza Americana":90,"Una esposa de mentira":20,"Biutiful ":85},"Guillermo Pereyra":{"Paris en la mira":70,"Una loca pelicula de vampiros":81}}
File without changes
@@ -0,0 +1,27 @@
1
+ require 'minitest/autorun'
2
+ require_relative '../lib/suggestor'
3
+
4
+ describe Suggestor::Algorithms::PearsonCorrelation do
5
+
6
+ before do
7
+ data_string = File.read("test/numbers.json")
8
+ data = JSON.parse(data_string)
9
+ @algorithm = Suggestor::Algorithms::PearsonCorrelation.new(data)
10
+ end
11
+
12
+ describe "when building up recommendations" do
13
+
14
+ it "must return a list of shared items between two people" do
15
+ @algorithm.shared_items(1,2).must_be :==, ["1","2"]
16
+ end
17
+
18
+ it "must return 1 as similarity record if two elements have equal related values" do
19
+ @algorithm.similarity_score(1,1).must_be :==, 1
20
+ end
21
+
22
+ it "must return -1 as similarity record if two elements are totally distant" do
23
+ @algorithm.similarity_score(1,99).must_be :==, -1
24
+ end
25
+
26
+ end
27
+ end
@@ -3,46 +3,41 @@ require_relative '../lib/suggestor'
3
3
 
4
4
  describe Suggestor::Engine do
5
5
  before do
6
- @suggestor = Suggestor::Engine.new
7
- @data_string = File.read("test/test.json")
6
+ @data_string = File.read("test/numbers.json")
8
7
  end
9
8
 
10
9
  describe "when loading up the data structure" do
11
10
  it "must raise an exception with invalid data" do
12
- lambda{ @suggestor.load_data("GIBBERISH}") }.must_raise Suggestor::WrongInputFormat
11
+ lambda{ Suggestor::Engine.new("GIBBERISH") }.must_raise Suggestor::WrongInputFormat
13
12
  end
14
-
15
- it "must return an array structure if data is ok" do
16
- @suggestor.load_data(@data_string).must_be_instance_of Hash
17
- end
18
-
19
13
  end
20
14
 
21
15
  describe "when accessing the data after loading it" do
22
16
 
23
17
  before do
24
- @suggestor.load_data(@data_string)
25
- end
26
-
27
- it "must return a similarty score between to elements" do
28
- @suggestor.similarity_score_for("1","1").must_be :==, 1
18
+ @suggestor = Suggestor::Engine.new(@data_string)
29
19
  end
30
20
 
31
21
  it "must return similar items from the base one with euclidean distance" do
32
- expected = {"2"=>0.02702702702702703, "3"=>0.02702702702702703}
33
- @suggestor.similar_items_to("1").must_be :==, expected
22
+ expected = [["3", 0.14285714285714285], ["2", 0.14285714285714285]]
23
+ @suggestor.similar_to("1").must_be :==, expected
34
24
  end
35
25
 
36
26
  it "must return similar items from the base one with pearson correlation" do
37
- expected = {"1"=>1.0, "3"=>0.0}
38
- @suggestor.similar_items_to("2",:algorithm => :pearson_correlation).must_be :==, expected
27
+ @suggestor = Suggestor::Engine.new(@data_string,Suggestor::Algorithms::PearsonCorrelation)
28
+ expected = [["2", 0.0], ["1", 0.0]]
29
+ @suggestor.similar_to("3").must_be :==, expected
39
30
  end
40
31
 
41
32
  it "must return similar items from the base one with euclidean distance" do
42
- expected = {"4"=>1.0}
43
- @suggestor.recommented_related_items_for("2").must_be :==, expected
33
+ expected = [["4", 2.6457513110645903]]
34
+ @suggestor.recommended_to("2").must_be :==, expected
44
35
  end
45
36
 
46
- end
37
+ it "must return similar related items from one of them" do
38
+ expected = [["5", 0.3333333333333333], ["3", 0.25], ["1", 0.12389934309929541], ["4", 0.0]]
39
+ @suggestor.similar_related_to("2").must_be :==, expected
40
+ end
47
41
 
42
+ end
48
43
  end
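
The result shape changes in this hunk from a hash to an array of `[label, score]` pairs. The expected value moves from 0.027... (which equals 1/37) to 0.1428... (which equals 1/7), and that is consistent with a score of `1 / (1 + distance)` replacing `1 / (1 + squared distance)`. This is an inference from the numbers alone, because the diff does not show the algorithm. The sketch below illustrates that reading with hypothetical data and hypothetical helper names; it is not the gem's code.

```ruby
# Hypothetical data layout: { rater_label => { item_label => score } }.

# Similarity derived from Euclidean distance over shared items.
# Assumed formula: 1 / (1 + sqrt(sum of squared differences)).
def euclidean_similarity(ratings, a, b)
  shared = ratings.fetch(a, {}).keys & ratings.fetch(b, {}).keys
  return 0.0 if shared.empty?

  sum_of_squares = shared.sum { |item| (ratings[a][item] - ratings[b][item])**2 }
  1.0 / (1.0 + Math.sqrt(sum_of_squares))
end

# Every other rater paired with its score, best match first,
# in the [[label, score], ...] shape the updated spec expects.
def similar_to(ratings, label)
  ratings.keys
         .reject { |other| other == label }
         .map { |other| [other, euclidean_similarity(ratings, label, other)] }
         .sort_by { |_, score| -score }
end

ratings = {
  "1" => { "a" => 1, "b" => 2 },
  "2" => { "a" => 1, "b" => 8 }, # differs from "1" by 6 on item "b"
  "3" => { "a" => 1, "b" => 2 }  # identical to "1"
}

result = similar_to(ratings, "1")
```

Here rater "2" is at distance 6 from rater "1", so its score is 1/7, while the identical rater "3" scores 1.0 and sorts first.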
metadata CHANGED
@@ -2,7 +2,7 @@
  name: suggestor
  version: !ruby/object:Gem::Version
    prerelease:
-   version: 0.0.3
+   version: 0.0.6
  platform: ruby
  authors:
  - Alvaro Pereyra
@@ -10,10 +10,19 @@ autorequire:
  bindir: bin
  cert_chain: []

- date: 2011-09-19 00:00:00 -05:00
- default_executable:
- dependencies: []
-
+ date: 2011-09-24 00:00:00 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: rake
+   prerelease: false
+   requirement: &id001 !ruby/object:Gem::Requirement
+     none: false
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: "0"
+   type: :runtime
+   version_requirements: *id001
  description: Suggestor allows you to get suggestions of related items in your data
  email:
  - alvaro@xendacentral.com
@@ -28,20 +37,19 @@ files:
  - Gemfile
  - README.md
  - Rakefile
- - demos/playing_around.rb
+ - examples/playing_around.rb
  - lib/suggestor.rb
  - lib/suggestor/algorithms/euclidean_distance.rb
  - lib/suggestor/algorithms/pearson_correlation.rb
  - lib/suggestor/algorithms/recommendation_algorithm.rb
- - lib/suggestor/datum.rb
  - lib/suggestor/engine.rb
  - lib/suggestor/version.rb
  - suggestor.gemspec
  - test/euclidean_test.rb
- - test/pearon_correlation.rb
+ - test/movies.json
+ - test/numbers.json
+ - test/pearson_correlation.rb
  - test/suggestor_test.rb
- - test/test.json
- has_rdoc: true
  homepage: ""
  licenses: []

@@ -65,12 +73,13 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  requirements: []

  rubyforge_project: suggestor
- rubygems_version: 1.5.0
+ rubygems_version: 1.8.10
  signing_key:
  specification_version: 3
  summary: Suggestor allows you to get suggestions of related items in your data
  test_files:
  - test/euclidean_test.rb
- - test/pearon_correlation.rb
+ - test/movies.json
+ - test/numbers.json
+ - test/pearson_correlation.rb
  - test/suggestor_test.rb
- - test/test.json
@@ -1,16 +0,0 @@
- require_relative '../lib/suggestor'
-
- engine = Suggestor::Engine.new
-
- # I'm using test data of Users and their movie recommendations
- # Each user (identified by their ids) have a hash of their movies ids and
- # what they've rate them with
- json = File.read("test/test.json")
-
- engine.load_data(json)
-
- # Let's get some similar users
- puts engine.similar_items_to("2").inspect
-
- # So, after knowing them, why not having some recommendations?
- puts engine.recommented_related_items_for("2", algorithm: :euclidean_distance)
@@ -1,13 +0,0 @@
- require 'delegate'
-
- module Suggestor
-
-   class Datum < DelegateClass(Hash)
-
-     def initialize(hash)
-       super(hash)
-     end
-
-   end
-
- end
@@ -1,34 +0,0 @@
- require 'minitest/autorun'
- require_relative '../lib/suggestor/algorithms/pearson_correlation'
- require_relative '../lib/suggestor/engine'
-
- describe Suggestor::Algorithms::PearsonCorrelation do
-
-   before do
-     @data_string = File.read("test/test.json")
-     @suggestor = Suggestor::Engine.new
-     @suggestor.load_data(@data_string)
-     @algorithm = Suggestor::Algorithms::PearsonCorrelation.new(@suggestor.collection)
-   end
-
-   describe "when building up recommendations" do
-
-     it "must return a list of shared items between two people" do
-       @algorithm.shared_items_between(1,2).must_be :==, ["1","2"]
-     end
-
-     it "must return 0 as similarity record if two elements hace no shared items" do
-       @algorithm.similarity_score_between(1,4).must_be :==, 0
-     end
-
-     it "must return 1 as similarity record if two elements have equal related values" do
-       @algorithm.similarity_score_between(1,1).must_be :==, 1
-     end
-
-     it "must return -1 as similarity record if two elements are totally distant" do
-       @algorithm.similarity_score_between(1,99).must_be :==, 0
-     end
-
-
-   end
- end