RubyGems - suggestor - Versions diffs - 0.0.3 → 0.0.6 - Mend

suggestor 0.0.3 → 0.0.6


Files changed (18)

data/README.md +32 -12
data/examples/playing_around.rb +42 -0
data/lib/suggestor.rb +3 -1
data/lib/suggestor/algorithms/euclidean_distance.rb +9 -10
data/lib/suggestor/algorithms/pearson_correlation.rb +50 -44
data/lib/suggestor/algorithms/recommendation_algorithm.rb +94 -47
data/lib/suggestor/engine.rb +10 -37
data/lib/suggestor/version.rb +1 -1
data/suggestor.gemspec +1 -0
data/test/euclidean_test.rb +8 -10
data/test/movies.json +1 -0
data/test/{test.json → numbers.json} +0 -0
data/test/pearson_correlation.rb +27 -0
data/test/suggestor_test.rb +15 -20
metadata +22 -13
data/demos/playing_around.rb +0 -16
data/lib/suggestor/datum.rb +0 -13
data/test/pearon_correlation.rb +0 -34

data/README.md CHANGED

@@ -10,16 +10,15 @@ tastes) and alike.
 The gem needs a data structure like this:
-    data = {"1": {"10": 10, "12": 1}, "2": {"11":5, "12": 4}}
+    data = '{"Alvaro Pereyra Rabanal": {"Primer": 10, "Memento": 9}, "Gustavo Leon": {"The Matrix":8, "Harry Potter": 8}}'
-Each element will ("1" or "2") correspond to, following the example, to user ids. They will gave access to related items (movies).
+Following the example, each element corresponds to a user and gives access to that user's related items (movie reviews).
-In the example, the user "1" has seen movies identified with ids "10" and "12", given them a rating of 10 and 1, respectively. Similar with user with id "2".
+In the example, the user "Alvaro Pereyra Rabanal" has seen the movies "Primer" and "Memento" and has given them ratings of 10 and 9, respectively. The same applies to the user "Gustavo Leon".
 After loading the gem with the data:
-    engine = Suggestor::Engine.new
-    engine.load_data(data)
+    engine = Suggestor::Engine.new(data)
 We can start to get some results.
@@ -28,18 +27,29 @@ We can start to get some results.
 For example, we can get similar users:
-    engine.similar_items_to("1")
+    engine.similar_to("Alvaro Pereyra Rabanal")
 Which will return a structure like
-    {id: similarity_score, id2: similarity_score }
+    [["label", similarity_score], ["label", similarity_score]]
+Like:
+    [["Eogen Clase", 0.0001649620587264929], ["Daniel Subauste", 0.00011641443538998836], ["4D2Studio Diseno y Animacion", 8.548469823901521e-05], ["Rafael  Lanfranco", 6.177033788374823e-05], ["Veronica Zapata Gotelli", 6.074965068950854e-05]]
 Thus, you can load the data and save their similarity scores for later use.
+You can limit the number of results by passing a "size" argument:
+  engine.similar_to("Alvaro Pereyra Rabanal", :size => 5)
 Now, that's fine and all, but what about Mr. Bob, who always ranks everything
 higher? ID4 may not be that good after all. If that happens, Suggestor allows you to change the algorithm used:
-    engine.similar_items_to("1", :algorithm => :pearson_correlation)
+    algorithm = Suggestor::Algorithms::PearsonCorrelation
+    engine = Suggestor::Engine.new(data, algorithm)
+    engine.recommended_to("Alvaro Pereyra Rabanal")
 There are two implemented methods, Euclidean Distance and Pearson Correlation.
@@ -54,11 +64,21 @@ take in mind if some user grades higher or lower and return more exact suggestio
 Most interestingly, the gem allows you to get suggestions based on the data.
 For example, which movies should user "2" watch, based on his reviews and the tastes of similar users?
-    engine.recommented_related_items_for("2",:pearson_correlation)
+    engine.recommended_to("Alvaro Pereyra Rabanal")
 As before, the structure returned will be
-    {id: similarity_score, id2: similarity_score }
+    [["label", similarity_score], ["label", similarity_score]]
+But in this case, it will represent movie labels and how similar they are. You
+can easily save this data to a DB, since movie ratings tend to stabilize over time and won't change that often.
+### Similar related items
+We can also invert the data that the user has added, enabling us to get
+similar related items. For example, let's say I'm on a Movie profile and
+want to check which other movies are similar to it:
+    engine.similar_related_to("Batman Begins ", :size => 5)
-But in this case, it will represent movie id's, and how similar are. You
-can easily use this data to save it to a BD, since Movie ratings tend to estabilize on time and won't change that often.
+Now you can go and build your awesome recommendations web site :)

data/examples/playing_around.rb ADDED

@@ -0,0 +1,42 @@
+require_relative '../lib/suggestor'
+# I'm using test data of users and their movie recommendations.
+# Each user has a hash of their reviews, mapping each movie to
+# the rating they gave it
+json = File.read("test/movies.json")
+engine = Suggestor::Engine.new(json, Suggestor::Algorithms::EuclideanDistance)
+# Let's get some similar users
+name = "Alvaro Pereyra Rabanal"
+puts "Who is similar to #{name}"
+puts engine.similar_to(name, size: 5).inspect
+puts
+puts
+# So, after knowing them, why not having some recommendations?
+puts "Interesting! But I want to see some stuff at the movies, what to watch?"
+opts = {size: 5}
+results = engine.recommended_to("Alvaro Pereyra Rabanal", opts)
+puts results.inspect
+puts
+puts
+# That's good, but let's account for bias by using Pearson Correlation:
+puts "Adjust these results please"
+engine = Suggestor::Engine.new(json,Suggestor::Algorithms::PearsonCorrelation)
+opts = {size: 5}
+results = engine.recommended_to("Alvaro Pereyra Rabanal", opts)
+puts results.inspect
+puts
+puts
+name = "Batman Begins "
+puts "Now that was nice. But which others are similar to '#{name}'"
+opts = {size: 10}
+results = engine.similar_related_to(name, opts)
+puts results.inspect
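
The example script passes an options hash such as `{size: 5}`, and further down the diff `recommendation_algorithm.rb` combines it with defaults via `opts.merge!(default_options)`. The sketch below is plain Ruby, independent of the gem, and only illustrates how `Hash#merge!` resolves duplicate keys. `DEFAULTS`, `defaults_win`, and `caller_wins` are hypothetical names introduced for illustration. If the gem behaves as the diff reads, a caller-supplied `:size` would be replaced by the default.

```ruby
# Hypothetical defaults, mirroring the {size: 5} shown in the diff.
DEFAULTS = { size: 5 }

# merge! keeps the argument's value for duplicate keys and mutates the
# receiver, so the default overwrites whatever the caller passed.
def defaults_win(opts = {})
  opts.merge!(DEFAULTS)
end

# Merging the other way round lets the caller's value take precedence
# and leaves both hashes untouched.
def caller_wins(opts = {})
  DEFAULTS.merge(opts)
end

puts defaults_win({ size: 10 }).inspect
puts caller_wins({ size: 10 }).inspect
```

Merging the caller's hash into the defaults is the usual way to let an explicit `:size` survive.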

data/lib/suggestor.rb CHANGED

@@ -1,3 +1,5 @@
 require_relative 'suggestor/engine'
+require_relative 'suggestor/algorithms/recommendation_algorithm'
+require_relative 'suggestor/algorithms/euclidean_distance'
+require_relative 'suggestor/algorithms/pearson_correlation'

data/lib/suggestor/algorithms/euclidean_distance.rb CHANGED

@@ -1,5 +1,3 @@
-require_relative 'recommendation_algorithm'
 module Suggestor
   module Algorithms
@@ -24,21 +22,22 @@ module Suggestor
       include RecommendationAlgorithm
-      def similarity_score_between(first, second)
-        return 0.0 if no_shared_items_between?(first, second)
-        inverse_of_sum_of_squares_between(first, second)
+      def similarity_score(first, second)
+        return 0.0 if nothing_shared?(first, second)
+        inverse_of_squares(first, second)
       end
-      def inverse_of_sum_of_squares_between(first, second)
-        1/(1+sum_squares_of_shared_items_between(first, second))
+      def inverse_of_squares(first, second)
+        1/(1+Math.sqrt(sum_squares(first, second)))
       end
-      def sum_squares_of_shared_items_between(first, second)
-        shared_items_between(first, second).inject(0.0) do |sum, item|
-          sum + (values_for(first)[item] - values_for(second)[item])**2
+      def sum_squares(first, second)
+        shared_items(first, second).inject(0.0) do |sum, item|
+          sum + ( values_for(first)[item] - values_for(second)[item] ) ** 2
         end
       end
     end
   end
 end
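
The functional change in this file is the added `Math.sqrt`: the score goes from `1/(1+sum)` to `1/(1+sqrt(sum))`. The sketch below is self-contained and does not use the gem. The `RATINGS` data and the `old_score`/`new_score` names are hypothetical, chosen so that the squared differences sum to 36. With that sum, the two formulas give 1/37 and 1/7, which are the same values as the expected scores that change in `suggestor_test.rb`.

```ruby
# Hypothetical ratings, not taken from the gem's fixtures.
RATINGS = {
  "ana"  => { "m1" => 10, "m2" => 5, "m3" => 7 },
  "beto" => { "m1" => 6,  "m2" => 1, "m3" => 9, "m4" => 3 }
}

# Items rated by both users.
def shared_items(first, second)
  RATINGS[first].keys & RATINGS[second].keys
end

# Sum of squared rating differences over the shared items.
def sum_squares(first, second)
  shared_items(first, second).inject(0.0) do |sum, item|
    sum + (RATINGS[first][item] - RATINGS[second][item])**2
  end
end

# 0.0.3 formula: inverse of the sum of squares.
def old_score(first, second)
  1 / (1 + sum_squares(first, second))
end

# 0.0.6 formula: inverse of the Euclidean distance itself.
def new_score(first, second)
  1 / (1 + Math.sqrt(sum_squares(first, second)))
end

puts old_score("ana", "beto")
puts new_score("ana", "beto")
```

Taking the square root compresses large distances less aggressively, so scores for moderately different users no longer collapse toward zero.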

data/lib/suggestor/algorithms/pearson_correlation.rb CHANGED

@@ -1,5 +1,3 @@
-require_relative 'recommendation_algorithm'
 module Suggestor
   module Algorithms
@@ -18,86 +16,94 @@ module Suggestor
     # the closest distance to all of them. If the two users have the same
     # ratings, it would show as a perfect diagonal (score of 1)
-    # The closest the movies to the line are, the more similar their tastes are.
+    # The closer the movies are to the line, the more similar the users'
+    # tastes are.
     # The great thing about using Pearson Correlation is that it accounts
     # for bias when evaluating the results. Thus, a user who always rates
     # movies with high scores won't skew the results.
-    # It's probably a best fit for subjetive reviews (movies reviews, profile points, etc).
+    # It's probably a better fit for subjective reviews (movie reviews,
+    # profile points, etc.).
-    # More info at: http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
+    # More info at:
+    # http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient
     class PearsonCorrelation
       include RecommendationAlgorithm
-      def similarity_score_between(first, second)
-        return 0.0 if no_shared_items_between?(first, second)
+      def similarity_score(first, second)
+        return -1.0 if nothing_shared?(first, second)
-        calculate_all_sums_for(first, second)
-        numerator = difference_from_total_and_normalize_values
-        # 10.5 /    0.0 /
-        denominator = square_root_from_differences_of_sums
+        process_values(first, second)
-        return 0.0 if denominator == 0
+        numerator   = difference_from_values
+        denominator = square_root_from_differences
+        return 0.0 if denominator == 0
         numerator / denominator
       end
       private
-      def calculate_all_sums_for(first,second)
-        shared_items = shared_items_between(first, second)
-        @total_related_items = shared_items.size
+      def process_values(first, second)
+        items                = shared_items(first, second)
+        @total_related_items = items.size.to_f
-        #simplify access
-        first_values = values_for(first)
+        first_values  = values_for(first)
         second_values = values_for(second)
-        @first_values_sum = @second_values_sum = @first_square_values_sum = \
-        @second_square_values_sum = @products_sum = 0.0
+        create_helper_variables
-        shared_items.each do |item|
+        items.each do |item|
-          # Gets the corresponding value for each item on both elements
-          # For ex., the rating of the same movie by different users
-          first_value = first_values[item]
+          first_value  = first_values[item]
           second_value = second_values[item]
-          # Will add all the related items values  for the first
-          # and second item
-          # For ex., all movie recommendations ratings
-          @first_values_sum += first_value
-          @second_values_sum += second_value
-          # Adds the squares of both elements
-          @first_square_values_sum += first_value ** 2
-          @second_square_values_sum += second_value ** 2
+          append_values(first_value, second_value)
+          append_squares(first_value, second_value)
+          append_product(first_value, second_value)
-          # Adds the product of both values
-          @products_sum += first_value*second_value
         end
+      end
+      def append_values(first_value, second_value)
+        @first_values_sum  += first_value
+        @second_values_sum += second_value
       end
-      def difference_from_total_and_normalize_values
+      def append_squares(first_value, second_value)
+        @first_square_values_sum  += ( first_value ** 2 )
+        @second_square_values_sum += ( second_value ** 2 )
+      end
+      def append_product(first_value, second_value)
+        @products_sum += first_value * second_value
+      end
+      def difference_from_values
         product = @first_values_sum * @second_values_sum
         normalized = product / @total_related_items
         @products_sum - normalized
       end
-      def square_root_from_differences_of_sums
-        power_left_result = @first_values_sum **2 /@total_related_items
-        equation_left = @first_square_values_sum - power_left_result
+      def square_root_from_differences
+        power_left_result = ( @first_values_sum ** 2 ) / @total_related_items
+        equation_left     = @first_square_values_sum - power_left_result
+        power_right_result = ( @second_values_sum ** 2 )/ @total_related_items
+        equation_right     = @second_square_values_sum - power_right_result
-        power_right_result = ( @second_values_sum **2 )/@total_related_items
-        equation_right = @second_square_values_sum - power_right_result
-        Math.sqrt(equation_left * equation_right)
+        Math.sqrt( equation_left * equation_right )
+      end
+      def create_helper_variables
+        @first_values_sum         = 0.0
+        @second_values_sum        = 0.0
+        @first_square_values_sum  = 0.0
+        @second_square_values_sum = 0.0
+        @products_sum             = 0.0
       end
     end
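
The refactored class still computes the coefficient from five running sums (values, squares, and products). A minimal standalone sketch of that computation follows. It takes two plain arrays of paired ratings instead of the gem's collection, and the `pearson` function name is an assumption made for illustration.

```ruby
# Pearson correlation from running sums, for two equally sized arrays
# of paired ratings. Returns 0.0 when the denominator is zero, as the
# class in the diff does.
def pearson(xs, ys)
  n = xs.size.to_f
  return 0.0 if n.zero?

  sum_x  = xs.inject(0.0) { |sum, x| sum + x }
  sum_y  = ys.inject(0.0) { |sum, y| sum + y }
  sum_x2 = xs.inject(0.0) { |sum, x| sum + x**2 }
  sum_y2 = ys.inject(0.0) { |sum, y| sum + y**2 }
  sum_xy = xs.zip(ys).inject(0.0) { |sum, (x, y)| sum + x * y }

  numerator   = sum_xy - (sum_x * sum_y) / n
  denominator = Math.sqrt((sum_x2 - sum_x**2 / n) * (sum_y2 - sum_y**2 / n))
  return 0.0 if denominator == 0

  numerator / denominator
end

puts pearson([1, 2, 3], [2, 4, 6])
puts pearson([1, 2, 3], [3, 2, 1])
```

A user who rates everything proportionally higher than another still correlates perfectly, which is the bias-handling property described in the class comment.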

data/lib/suggestor/algorithms/recommendation_algorithm.rb CHANGED

@@ -8,50 +8,86 @@ module Suggestor
         @collection = collection
       end
-      # returns similar items based on their similary score
-      # for example, similar users based on their movies reviews
-      def similar_items_to(main)
-        #just compare those whore aren't the main item
-        compare_to = collection.dup
-        compare_to.delete(main)
-        # return results based on their score
-        compare_to.keys.inject({}) do |result, other|
-          result.merge!({other => similarity_score_between(main,other)})
-        end
+      # Ex. Similar users based on their movie reviews
+      def similar_to(main, opts={})
+        opts.merge!(default_options)
+        collection = remove_self(main)
+        results    = order_by_similarity_score(main,collection)
+        sort_results(results,opts[:size])
       end
-      # returns recommended related items for the main user
-      # The most important feature. For example, a user will get
-      # movie recommendations based on his past movie reviews
-      # and how it compares with others
-      def recommented_related_items_for(main)
+      # Ex. a user will get movie recommendations
+      def recommended_to(main, opts={})
+        opts.merge!(default_options)
         @similarities = @totals = Hash.new(0)
-        @main = main
-        create_similarities_totals
-        generate_rankings
+        create_similarities_totals(main)
+        results = generate_rankings
+        sort_results(results,opts[:size])
       end
-      def no_shared_items_between?(first,second)
-        shared_items_between(first,second).empty?
+      # Ex. what other movies are related to a given one
+      def similar_related_to(main, opts={})
+        opts.merge!(default_options)
+        collection = invert_collection
+        engine     = self.class.new(collection)
+        engine.similar_to(main,opts)
       end
-      def shared_items_between(first,second)
-        return [] unless values_for(first) && values_for(second)
+      def shared_items(first, second)
+        return [] unless values_for(first) && values_for(second)
         related_keys_for(first).select do |item|
           related_keys_for(second).include? item
         end
-      end
+      end
      private
-      def main_already_has?(related)
-        collection[@main].has_key?(related)
+      def default_options
+        {size: 5}
+      end
+      def nothing_shared?(first, second)
+        shared_items(first, second).empty?
+      end
+      def remove_self(main)
+        cleaned = collection.dup
+        cleaned.delete(main)
+        cleaned
+      end
+      # changes { "Cat": {"1": 10, "2":20}, "Dog": {"1":5, "2": 15} }
+      # to { "1": {"Cat": 10, "Dog": 5}, "2": {"Cat": 20, "Dog": 15} }
+      def invert_collection
+        results = {}
+        collection.keys.each do |main|
+          collection[main].keys.each do |item|
+            results[item] ||= {}
+            results[item][main] = collection[main][item]
+          end
+        end
+        results
+      end
+      def order_by_similarity_score(main,collection)
+        result = collection.keys.inject({}) do |res, other|
+          res.merge!({other => similarity_score(main, other)})
+        end
+      end
+      def already_has?(main, related)
+        collection[main].has_key?(related)
       end
       def values_for(id)
@@ -62,47 +98,58 @@ module Suggestor
         values_for(id).keys
       end
-      def add_to_totals(other,item,score)
-        @totals[item] += collection[other][item]*score
+      def add_to_totals(other, item, score)
+        @totals[item]       += collection[other][item]*score
         @similarities[item] += score
       end
-      def generate_rankings
-        @rankings = {}
+      def sort_results(results,size=-1)
+        sorted = results.sort{|a,b| a[1] <=> b[1]}.reverse
+        sorted[0, size]
+      end
+      def generate_rankings
+        rankings = {}
         @totals.each_pair do |item, total|
-          normalized_value = (total / @similarities[item])
-          @rankings.merge!( { item => normalized_value} )
+          normalized_value = (total / Math.sqrt(@similarities[item]))
+          rankings.merge!( { item => normalized_value} )
         end
-        @rankings
+        rankings
+      end
+      def something_in_common?(score)
+        score > 0
+      end
+      def same_item?(main, other)
+        other == main
       end
-      def create_similarities_totals
+      def create_similarities_totals(main)
         collection.keys.each do |other|
-          # won't bother comparing it if the compared item is the same
-          # as the main, or if they scores are below 0 (nothing in common)
-          next if other == @main
-          score = similarity_score_between(@main,other)
-          next if score <= 0
-          # will compare each the results but only for related items
-          # that the main item doesn't already have
-          # For ex., if they have already saw a movie they won't
-          # get it suggested
+          next if same_item?(main,other)
+          score = similarity_score(main, other)
+          next unless something_in_common?(score)
           collection[other].keys.each do |item|
-            unless main_already_has?(item)
-              add_to_totals(other,item,score)
+            unless already_has?(main, item)
+              add_to_totals(other, item, score)
             end
           end
         end
       end
     end
   end
 end
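
Two of the new helpers are easy to try in isolation: inverting the collection so that items become the top-level keys, and sorting scores in descending order before truncating to `size`. The sketch below is standalone Ruby that follows the same idea. It takes the collection as an argument rather than reading an instance variable, and `top_results` is a hypothetical name, not the gem's `sort_results`.

```ruby
# Turns { user => { item => rating } } into { item => { user => rating } }.
def invert_collection(collection)
  inverted = {}
  collection.each do |main, items|
    items.each do |item, value|
      inverted[item] ||= {}
      inverted[item][main] = value
    end
  end
  inverted
end

# Sorts [label, score] pairs by descending score and keeps the first `size`.
def top_results(scores, size)
  scores.sort_by { |_label, score| -score }.first(size)
end

collection = { "Cat" => { "1" => 10, "2" => 20 }, "Dog" => { "1" => 5, "2" => 15 } }
puts invert_collection(collection).inspect
puts top_results({ "a" => 0.2, "b" => 0.9, "c" => 0.5 }, 2).inspect
```

Inverting once and reusing the same similarity code is what lets `similar_related_to` compare movies instead of users.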

data/lib/suggestor/engine.rb CHANGED

@@ -1,6 +1,4 @@
 require 'json'
-require_relative 'algorithms/euclidean_distance'
-require_relative 'algorithms/pearson_correlation'
 module Suggestor
@@ -8,50 +6,25 @@ module Suggestor
   class Engine
-    attr_accessor :collection
-    def initialize
-      @collection = {}
-    end
-    def load_data(input)
-      add_to_collection(input)
+    def initialize(input, algorithm = Algorithms::EuclideanDistance)
+      @collection = parse_from_json(input)
+      @algorithm  = algorithm.new(@collection)
     end
-    def similarity_score_for(first, second, opts={})
-      opts[:algorithm] ||= :euclidean_distance
-      strategy_for(opts[:algorithm]).similarity_score_between(first, second)
+    def similar_to(item, opts={})
+      @algorithm.similar_to(item, opts)
     end
-    def similar_items_to(item, opts={})
-      opts[:algorithm] ||= :euclidean_distance
-      strategy_for(opts[:algorithm]).similar_items_to(item)
+    def recommended_to(item, opts={})
+      @algorithm.recommended_to(item, opts)
     end
-    def recommented_related_items_for(item, opts={})
-      opts[:algorithm] ||= :euclidean_distance
-      strategy_for(opts[:algorithm]).recommented_related_items_for(item)
+    def similar_related_to(item, opts={})
+      @algorithm.similar_related_to(item, opts)
     end
     private
-    def strategy_for(algorithm)
-      constantize(classify(algorithm)).new(collection)
-    end
-    # based on Rail's code
-    def classify(name)
-      name.to_s.gsub(/\/(.?)/) { "::#{$1.upcase}" }.gsub(/(?:^|_)(.)/) { $1.upcase }
-    end
-    def constantize(name)
-      Suggestor::Algorithms.const_get(name)
-    end
-    def add_to_collection(input)
-      @collection.merge! parse_from_json(input)
-    end
     def parse_from_json(json)
       JSON.parse(json)
     rescue Exception => ex

data/lib/suggestor/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Suggestor
-  VERSION = "0.0.3"
+  VERSION = "0.0.6"
 end

data/suggestor.gemspec CHANGED

@@ -17,5 +17,6 @@ Gem::Specification.new do |s|
   s.files         = `git ls-files`.split("\n")
   s.test_files    = `git ls-files -- {test,spec,features}/*`.split("\n")
   s.executables   = []
+  s.add_dependency("rake")
   s.require_paths = ["lib"]
 end

data/test/euclidean_test.rb CHANGED

@@ -1,29 +1,27 @@
 require 'minitest/autorun'
-require_relative '../lib/suggestor/algorithms/euclidean_distance'
-require_relative '../lib/suggestor/engine'
+require 'json'
+require_relative '../lib/suggestor'
   describe Suggestor::Algorithms::EuclideanDistance do
     before do
-      @data_string = File.read("test/test.json")
-      @suggestor = Suggestor::Engine.new
-      @suggestor.load_data(@data_string)
-      @algorithm = Suggestor::Algorithms::EuclideanDistance.new(@suggestor.collection)
+      data_string = File.read("test/numbers.json")
+      data        = JSON.parse(data_string)
+      @algorithm  = Suggestor::Algorithms::EuclideanDistance.new(data)
     end
     describe "when building up recommendations" do
       it "must return a list of shared items between two people" do
-        @algorithm.shared_items_between(1,2).must_be :==, ["1","2"]
+        @algorithm.shared_items(1,2).must_be :==, ["1","2"]
       end
       it "must return 0 as similarity record if two elements have no shared items" do
-        @algorithm.similarity_score_between(1,99).must_be :==, 0
+        @algorithm.similarity_score(1,99).must_be :==, 0
       end
       it "must return 1 as similarity record if two elements have equal related values" do
-        puts @algorithm.shared_items_between(1,1).inspect
-        @algorithm.similarity_score_between(1,1).must_be :==, 1
+        @algorithm.similarity_score(1,1).must_be :==, 1
       end
     end

data/test/movies.json ADDED

@@ -0,0 +1 @@

+ {"Alvaro Pereyra Rabanal":{"Enterrado":90,"La reunion del diablo":20,"Scott Pilgrim vs The World":80,"El avispon verde":20,"Se dice de mi":89,"Un tonto en el amor":75,"El secreto de sus ojos":99,"Wall Street: El dinero nunca duerme":90,"Super 8 ":90,"Kung fu panda 2":92,"La revelacion":70,"Rio":90,"El cisne negro":70,"Tron: El Legado":90,"Invasion del Mundo: Batalla Los Angeles":50,"Peluda venganza":20,"Megamente":90,"Dias de ira":76,"El especialista":20,"u00bfQue paso ayer? 2":20,"El escritor oculto":80,"Red ":33,"Amor a distancia":90,"Resident Evil 4: La Resurreccion":50,"Octubre":55,"Sin limite":65,"Love Actually ":90,"El gran concierto":90,"Perdidos en Tokio":90,"Pulp Fiction ":87,"Cazador de demonios":20,"Loco y estu00fapido amor":65,"Amor por contrato":66,"Contracorriente":90,"Actividad paranormal 2":20,"La Vigilia":15,"El rey leon":90,"Transformers: El lado oscuro de la luna":20,"Cuando Harry conocio a Sally":95,"Machete":90,"Avatar":80,"Soy el nu00famero cuatro":90,"Toy Story 3 ":99,"El discurso del rey ":90,"Noches de encanto":90,"Enredados ":55,"Red Social":80,"La otra familia ":70,"Tesis ":90,"Harry Potter and the Deathly Hallows":76,"El planeta de los simios: Revolucion":80,"Biutiful ":20,"Harry Potter y las Reliquias de la muerte: Parte II":60,"Las Cronicas de Narnia: La travesia del viajero del alba":63,"El regreso de la nana magica":20},"Angel Velasquez":{"Los indestructibles":95},"Rafael Lanfranco":{"El rey leon":20,"Enter the Dragon ":78,"Un hombre solitario":20,"Desconocido ":20,"La historia sin fin":90,"Pi: Fe en el caos":66,"Harry Potter and the Deathly Hallows":86,"Dark City ":89,"Juego de Traiciones":78,"X-Men: Primera generacion":89,"La Aldea":20,"Agora ":90,"Comer, rezar, amar":20,"La duda":89,"Invasion del Mundo: Batalla Los Angeles":86,"El nuevo entrenador":90,"Marea Roja":86,"Sniper":77,"Conoceras al hombre de tus suenos":90,"El peleador":54,"Enterrado":50,"Mary and Max ":90,"Gran Torino ":86,"Senales":20,"Real Steel ":10,"Breakin' 
":94,"Tesis ":66,"Red ":20,"El secreto de sus ojos":75,"El Senor De Los Anillos: Las Dos Torres":50,"Cowboys y Aliens":82,"Mision Imposible":91,"Match Point":64,"Loco por ella":77,"La revelacion":90,"El origen":86,"El Pianista":90,"Lazos de sangre":94,"El Informante":100,"Mas alla de la vida":90,"Carancho ":20,"Harry Potter y las Reliquias de la muerte: Parte II":30,"Jumper ":20,"Apocalypse Now ":92,"Source Code ":90,"Rango ":20,"Hot Fuzz ":43,"La fuente de la vida":91,"The Doubt ":89,"Medianoche en Paris":89,"Sin limite":77,"Rapidos y Furiosos 5":90,"El discurso del rey ":81,"127 horas":85,"Following":90,"Serenity ":60,"Una propuesta atrevida":20,"Temple de Acero":81,"Piratas del Caribe: Navegando aguas misteriosas":84,"Ex, todos tenemos uno":90,"Los ilusionautas":20,"Agente Salt":90,"Star Wars: Episodio IV - Una nueva esperanza":89,"El planeta de los simios: Revolucion":89,"Kick - Ass":90,"Cyrus":90,"El Exterminador 2: El Dia del Juicio Final":87,"Invasion del mundo":82,"Mundo Surreal":92,"Kung Fu Hustle":90,"El mensajero ":90,"Red Social":94,"I saw the devil":73,"El Club de la Pelea":90,"El Senor de los Anillos: El retorno del Rey":66,"Scott Pilgrim vs The World":90,"Terminator II ":87,"Preciosa":90,"Kung fu panda 2":92,"En un rincon del corazon":20,"12 hombres molestos":55,"Thirteen Days ":11,"E.T : El Extraterrestre":90,"Harry Potter y el Prisionero de Azkaban":11,"Way of the Dragon ":90,"Los agentes del destino":75,"True Romance ":75,"Pulp Fiction ":100,"Thor":60,"8 Mile":70,"Super 8 ":90,"The Wolfman ":20,"Los imperdonables":92,"El Protegido":90,"Mi nombre es John Lennon":90,"Hable con ella ":90,"Bastardos sin gloria":80,"The Big Lebowski ":90,"social network":90,"Camino al Oscar":99,"Winnie Pooh ":50,"Beginners":90,"El Fin de Los Tiempos":20,"Celda 211 ":90,"Siempre a tu lado":70,"Tron: El Legado":20,"Que pena tu vida ":20,"Capitan America: El primer vengador":78,"El especialista":20,"Fuego Contra Fuego":78},"4D2Studio Diseno y Animacion":{"Piratas del 
Caribe 3: En El Fin del Mundo":90,"The Adventures of Tintin: The Secret of the Unicorn ":99,"Mundo Surreal":60,"Temple de Acero":99,"Laeon :El Profesional":90,"El secreto de sus ojos":92,"Fallen Art":96,"Kung fu panda 2":95,"Capitan America: El primer vengador":99,"Tron: El Legado":100,"Los indestructibles":9,"Rango ":20,"Megamente":20,"Thor":100,"Dias de ira":90,"Dorothy of Oz ":100,"Pandorum":65,"Los ilusionautas":20,"Siempre a tu lado":90,"El amante":90,"Cazador de demonios":75,"El u00faltimo maestro del aire":9,"Transformers: El lado oscuro de la luna":77,"X-Men: Primera generacion":90,"Los cazafantasmas":89,"Harry Potter and the Deathly Hallows":89,"Comer, rezar, amar":20,"Agora ":90,"Piratas del Caribe: Navegando aguas misteriosas":85},"Daniel Subauste":{"El rey leon":90,"La Masacre de Texas: El Origen":20,"Una loca pelicula de vampiros":20,"Calabozos y Dragones":205,"Piratas del Caribe 3: En El Fin del Mundo":20,"El Vengador":40,"Linterna Verde":60,"Luna Nueva":20,"Juan de los Muertos":100,"Harry Potter and the Deathly Hallows":70,"Avatar":60,"Dragones, destino de fuego":2,"Space Cowboys ":20,"Agora ":95,"Comer, rezar, amar":20,"X-Men: Primera generacion":90,"Piratas en el Callao":20,"Invasion del Mundo: Batalla Los Angeles":38,"El u00faltimo exorcismo":20,"Tesis ":90,"Daejame entrar":90,"Senales":20,"Red ":90,"Cowboys y Aliens":45,"Mision Imposible":55,"Megamind ":100,"Harry Potter y el Caliz de Fuego":75,"Los indestructibles":30,"La invasion":40,"La Sonrisa de Mona Lisa":90,"Pandorum":68,"Zodiaco":91,"Calabozos y Dragones 2 El Poder Mayor":80,"Transformers: El lado oscuro de la luna":60,"Corazon Valiente":90,"El cisne negro":90,"Mas alla de la vida":20,"Me enamorae en Nueva York":20,"Harry Potter y las Reliquias de la muerte: Parte II":65,"Millennium I: Los hombres que no amaban a las mujeres":60,"Un tonto en el amor":90,"Como Agua para Chocolate ":20,"Laeon :El Profesional":90,"La Naranja Mecanica":90,"Horton Hears a Who! 
":90,"Sin limite":85,"Enredados ":90,"El u00faltimo maestro del aire":20,"La chica de mis suenos":90,"La Pasion de Cristo":76,"Temple de Acero":60,"Crepu00fasculo":20,"Perdidos en Tokio":99,"Piratas del Caribe: Navegando aguas misteriosas":80,"Ga'Hoole :La Leyenda De Los Guardianes":90,"Los ilusionautas":20,"Wall Street: El dinero nunca duerme":90,"El planeta de los simios: Revolucion":85,"Kick - Ass":90,"Hannibal Rising ":20,"Planet Terror ":90,"El Codigo Da Vinci":20,"Mundo Surreal":90,"Red Social":60,"Machete":99,"Kung Fu Hustle":90,"Dragones: destino de fuego ":2,"Sanctum ":80,"Scott Pilgrim vs The World":75,"Seven":90,"Triste San Valentin":20,"Megamente":100,"u00bfComo saber si es amor?":20,"Kung fu panda 2":90,"El u00faltimo guerrero Chanka":100,"Thor":60,"Apocalypto ":20,"The Kids Are All Right ":90,"Super 8 ":78,"El Protegido":20,"TRON ":60,"El avispon verde":20,"Los pitufos":75,"El juego del miedo VII 3D":20,"Mongol, el emperador":90,"Tron: El Legado":85,"El Fin de Los Tiempos":20,"The Runaways ":40,"Capitan America: El primer vengador":90,"Millennium I - Los hombres que no amaban a las mujeres":60},"Laura Vanessa M":{"El secreto de sus ojos":99,"Wall Street: El dinero nunca duerme":70,"Atraccion peligrosa":44,"El escritor oculto":86,"La Vigilia":90,"Noches de encanto":90},"Veronica Zapata Gotelli":{"Sin lugar Para los Daebiles":90,"Mundo Surreal":20,"Temple de Acero":70,"Cartas a Julieta":55,"Carancho ":90,"Mary and Max ":90,"The Kids Are All Right ":70,"Wall Street: El dinero nunca duerme":66,"La Naranja Mecanica":90,"Rio":90,"Perros de Reserva":90,"Sin City ":90,"El Truco Final":90,"Lazos de sangre":60,"El cisne negro":90,"La cinta blanca":40,"Los indestructibles":75,"Fargo ":50,"Traffic ":90,"LadyKillers":60,"El juego ":90,"Seven":90,"Psicosis":90,"Crueldad Intolerable":79,"300 ":92,"El escritor oculto":90,"El peleador":80,"Octubre":90,"Al otro lado del corazon":92,"El Resplandor":99,"Source Code ":70,"Love and Other Impossible Pursuits ":90,"Gran 
Torino ":96,"The King's Speech":90,"La vida de los peces ":77,"Incendies ":91,"X-Men: Primera generacion":90,"El Hombre que Nunca Estuvo Alli":90,"La chica de la capa roja":20,"Batman Begins ":90,"Terciopelo Azul ":55,"Cuando Harry conocio a Sally":90,"Triste San Valentin":90,"Buenas Noches y Buena Suerte":82,"Un Hombre Serio":90,"Soy el nu00famero cuatro":20,"The Big Lebowski ":78,"Noches de encanto":20,"Red Social":90,"El discurso del rey ":93,"Ciudadano Kane ":90,"Quaemese despuaes de Leer":90,"Pase libre":90,"Agua para elefantes":20,"Rain Man ":90,"Conoceras al hombre de tus suenos":90,"En un rincon del corazon":20,"Dinner for Schmucks ":20,"Vaertigo":90,"Un cuento chino ":20,"Batman :El Caballero Oscuro":90,"Bastardos sin gloria":90,"Belleza Americana":90,"Una esposa de mentira":20,"Biutiful ":85},"Guillermo Pereyra":{"Paris en la mira":70,"Una loca pelicula de vampiros":81}}

data/test/{test.json → numbers.json} RENAMED

File without changes

data/test/pearson_correlation.rb ADDED

@@ -0,0 +1,27 @@
+require 'minitest/autorun'
+require_relative '../lib/suggestor'
+  describe Suggestor::Algorithms::PearsonCorrelation do
+    before do
+      data_string = File.read("test/numbers.json")
+      data        = JSON.parse(data_string)
+      @algorithm  = Suggestor::Algorithms::PearsonCorrelation.new(data)
+    end
+    describe "when building up recommendations" do
+      it "must return a list of shared items between two people" do
+        @algorithm.shared_items(1,2).must_be :==, ["1","2"]
+      end
+      it "must return 1 as similarity record if two elements have equal related values" do
+        @algorithm.similarity_score(1,1).must_be :==, 1
+      end
+      it "must return -1 as similarity record if two elements are totally distant" do
+        @algorithm.similarity_score(1,99).must_be :==, -1
+      end
+    end
+  end
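
For readers unfamiliar with the scores these assertions check, here is a minimal, self-contained sketch of a Pearson correlation over the items two entries share. It is not the gem's implementation: `shared_items` and `pearson_score` are hypothetical top-level helpers, the ratings are invented for illustration rather than taken from `test/numbers.json`, and returning 0.0 when nothing is shared or when one side has no variance is an assumption of this sketch.

```ruby
# Illustrative only: not the code shipped in lib/suggestor.

# Keys rated by both entries.
def shared_items(ratings, first, second)
  ratings.fetch(first, {}).keys & ratings.fetch(second, {}).keys
end

# Pearson correlation over the shared items, in mean-centered form.
# Returns 0.0 when there is nothing to compare (an assumption of this sketch).
def pearson_score(ratings, first, second)
  items = shared_items(ratings, first, second)
  return 0.0 if items.empty?

  xs = items.map { |item| ratings[first][item].to_f }
  ys = items.map { |item| ratings[second][item].to_f }
  mean_x = xs.sum / xs.size
  mean_y = ys.sum / ys.size

  dx = xs.map { |x| x - mean_x }
  dy = ys.map { |y| y - mean_y }

  numerator   = dx.zip(dy).sum { |a, b| a * b }
  denominator = Math.sqrt(dx.sum { |a| a * a } * dy.sum { |b| b * b })
  return 0.0 if denominator.zero?

  numerator / denominator
end

# Hypothetical ratings: "2" rises with "1", "3" falls as "1" rises.
ratings = {
  "1" => { "a" => 1, "b" => 2, "c" => 3 },
  "2" => { "a" => 2, "b" => 4, "c" => 6 },
  "3" => { "a" => 3, "b" => 2, "c" => 1 }
}

puts pearson_score(ratings, "1", "1") # identical ratings
puts pearson_score(ratings, "1", "2") # perfectly correlated
puts pearson_score(ratings, "1", "3") # perfectly anti-correlated
```

With this toy data, an entry compared with itself or with a linearly scaled copy scores 1, and a fully reversed set of ratings scores -1, which are the two extremes the added tests name.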

data/test/suggestor_test.rb CHANGED

@@ -3,46 +3,41 @@ require_relative '../lib/suggestor'
   describe Suggestor::Engine do
     before do
-      @suggestor = Suggestor::Engine.new
-      @data_string = File.read("test/test.json")
+      @data_string = File.read("test/numbers.json")
     end
     describe "when loading up the data structure" do
       it "must raise an exception with invalid data" do
-        lambda{ @suggestor.load_data("GIBBERISH}") }.must_raise Suggestor::WrongInputFormat
+        lambda{ Suggestor::Engine.new("GIBBERISH") }.must_raise Suggestor::WrongInputFormat
       end
-      it "must return an array structure if data is ok" do
-        @suggestor.load_data(@data_string).must_be_instance_of Hash
-      end
     end
     describe "when accesing the data after load_dataing it" do
       before do
-        @suggestor.load_data(@data_string)
-      end
-      it "must return a similarty score between to elements" do
-        @suggestor.similarity_score_for("1","1").must_be :==, 1
+        @suggestor = Suggestor::Engine.new(@data_string)
       end
       it "must return similar items from the base one with euclidean distance" do
-        expected = {"2"=>0.02702702702702703, "3"=>0.02702702702702703}
-        @suggestor.similar_items_to("1").must_be :==, expected
+        expected = [["3", 0.14285714285714285], ["2", 0.14285714285714285]]
+        @suggestor.similar_to("1").must_be :==, expected
       end
       it "must return similar items from the base one with pearson correlation" do
-        expected = {"1"=>1.0, "3"=>0.0}
-        @suggestor.similar_items_to("2",:algorithm => :pearson_correlation).must_be :==, expected
+        @suggestor = Suggestor::Engine.new(@data_string,Suggestor::Algorithms::PearsonCorrelation)
+        expected = [["2", 0.0], ["1", 0.0]]
+        @suggestor.similar_to("3").must_be :==, expected
       end
       it "must return similar items from the base one with euclidean distance" do
-        expected = {"4"=>1.0}
-        @suggestor.recommented_related_items_for("2").must_be :==, expected
+        expected = [["4", 2.6457513110645903]]
+        @suggestor.recommended_to("2").must_be :==, expected
       end
-    end
+      it "must return similar related items from one of them" do
+        expected = [["5", 0.3333333333333333], ["3", 0.25], ["1", 0.12389934309929541], ["4", 0.0]]
+        @suggestor.similar_related_to("2").must_be :==, expected
+      end
+    end
   end
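
A note on the changed Euclidean expectation above: the old value 0.02702702702702703 equals 1/37 and the new value 0.14285714285714285 equals 1/7. That would be consistent with a score of the form 1 / (1 + d), where d went from a sum of squared differences (36) to its square root (6). The test diff alone does not confirm this, so treat the sketch below as an inference. The helper names and the ratings are hypothetical and are not taken from the gem or from `test/numbers.json`.

```ruby
# Illustrative only: an assumed reading of the two expected values.

# Sum of squared rating differences over the items both hashes share.
def sum_of_squares(first, second)
  shared = first.keys & second.keys
  shared.sum { |item| (first[item] - second[item])**2 }
end

# Assumed 0.0.3-style score: inverse of (1 + squared distance).
def squared_score(first, second)
  1.0 / (1 + sum_of_squares(first, second))
end

# Assumed 0.0.6-style score: inverse of (1 + Euclidean distance).
def rooted_score(first, second)
  1.0 / (1 + Math.sqrt(sum_of_squares(first, second)))
end

# Hypothetical ratings chosen so the squared differences sum to 36.
first  = { "a" => 10, "b" => 1 }
second = { "a" => 4,  "b" => 1 }

puts squared_score(first, second) # 1/37
puts rooted_score(first, second)  # 1/7
```

Either form maps identical ratings to 1 and pushes the score toward 0 as the ratings diverge. The rooted form falls off more slowly.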

metadata CHANGED

@@ -2,7 +2,7 @@
 name: suggestor
 version: !ruby/object:Gem::Version
   prerelease:
-  version: 0.0.3
+  version: 0.0.6
 platform: ruby
 authors:
 - Alvaro Pereyra
@@ -10,10 +10,19 @@ autorequire:
 bindir: bin
 cert_chain: []
-date: 2011-09-19 00:00:00 -05:00
-default_executable:
-dependencies: []
+date: 2011-09-24 00:00:00 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: rake
+  prerelease: false
+  requirement: &id001 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: "0"
+  type: :runtime
+  version_requirements: *id001
 description: Suggestor allows you to get suggestions of related items in your data
 email:
 - alvaro@xendacentral.com
@@ -28,20 +37,19 @@ files:
 - Gemfile
 - README.md
 - Rakefile
-- demos/playing_around.rb
+- examples/playing_around.rb
 - lib/suggestor.rb
 - lib/suggestor/algorithms/euclidean_distance.rb
 - lib/suggestor/algorithms/pearson_correlation.rb
 - lib/suggestor/algorithms/recommendation_algorithm.rb
-- lib/suggestor/datum.rb
 - lib/suggestor/engine.rb
 - lib/suggestor/version.rb
 - suggestor.gemspec
 - test/euclidean_test.rb
-- test/pearon_correlation.rb
+- test/movies.json
+- test/numbers.json
+- test/pearson_correlation.rb
 - test/suggestor_test.rb
-- test/test.json
-has_rdoc: true
 homepage: ""
 licenses: []
@@ -65,12 +73,13 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 rubyforge_project: suggestor
-rubygems_version: 1.5.0
+rubygems_version: 1.8.10
 signing_key:
 specification_version: 3
 summary: Suggestor allows you to get suggestions of related items in your data
 test_files:
 - test/euclidean_test.rb
-- test/pearon_correlation.rb
+- test/movies.json
+- test/numbers.json
+- test/pearson_correlation.rb
 - test/suggestor_test.rb
-- test/test.json

data/demos/playing_around.rb DELETED

@@ -1,16 +0,0 @@
-require_relative '../lib/suggestor'
-engine = Suggestor::Engine.new
-# I'm using test data of Users and their movie recommendations
-# Each user (identified by their ids) have a hash of their movies ids and
-# what they've rate them with
-json = File.read("test/test.json")
-engine.load_data(json)
-# Let's get some similar users
-puts engine.similar_items_to("2").inspect
-# So, after knowing them, why not having some recommendations?
-puts engine.recommented_related_items_for("2", algorithm: :euclidean_distance)

data/lib/suggestor/datum.rb DELETED

@@ -1,13 +0,0 @@
-require 'delegate'
-module Suggestor
-  class Datum < DelegateClass(Hash)
-    def initialize(hash)
-      super(hash)
-    end
-  end
-end

data/test/pearon_correlation.rb DELETED

@@ -1,34 +0,0 @@
-require 'minitest/autorun'
-require_relative '../lib/suggestor/algorithms/pearson_correlation'
-require_relative '../lib/suggestor/engine'
-  describe Suggestor::Algorithms::PearsonCorrelation do
-    before do
-      @data_string = File.read("test/test.json")
-      @suggestor = Suggestor::Engine.new
-      @suggestor.load_data(@data_string)
-      @algorithm = Suggestor::Algorithms::PearsonCorrelation.new(@suggestor.collection)
-    end
-    describe "when building up recommendations" do
-      it "must return a list of shared items between two people" do
-        @algorithm.shared_items_between(1,2).must_be :==, ["1","2"]
-      end
-      it "must return 0 as similarity record if two elements hace no shared items" do
-        @algorithm.similarity_score_between(1,4).must_be :==, 0
-      end
-      it "must return 1 as similarity record if two elements have equal related values" do
-        @algorithm.similarity_score_between(1,1).must_be :==, 1
-      end
-      it "must return -1 as similarity record if two elements are totally distant" do
-        @algorithm.similarity_score_between(1,99).must_be :==, 0
-      end
-    end
-  end