recommengine 0.1.3 → 0.1.4
- checksums.yaml +4 -4
- data/README.md +28 -17
- data/lib/recommengine.rb +11 -8
- data/lib/recommengine/euclidean_calculator.rb +14 -8
- data/lib/recommengine/flipper.rb +6 -4
- data/lib/recommengine/matcher.rb +9 -6
- data/lib/recommengine/pearson_calculator.rb +19 -19
- data/lib/recommengine/recommender.rb +49 -33
- data/lib/recommengine/test_data.rb +23 -0
- data/recommengine.gemspec +1 -1
- data/spec/lib/recommender_spec.rb +1 -1
- metadata +3 -2
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA1:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 00c3b60d76e9a77e8ffe57a61f4a6d378638ac06
|
|
4
|
+
data.tar.gz: 0a8ca1c880611d2a5a3646b98c6e9e8d5bec3579
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 97cd7325701aa1aa34687797e518798131bd4de7773aed4f64abd59bc91ffe7b097c3862567f6a4eb7ecc620174ba70e7513bbb26fc84820a9f5b95ab617077a
|
|
7
|
+
data.tar.gz: f749cbd245f20a45acf828a8aa15c0ca413d119109cf3ca3807650f6bf366eef25f4c062d2a6b9af5e70adcc490cdaff06973b5f34dfa1af93cdbcd681d12b42
|
data/README.md
CHANGED
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
# RecommEngine
|
|
2
|
-
A plug-and-play recommendation engine gem supporting multiple similarity algorithms. Use to recommend
|
|
2
|
+
A plug-and-play recommendation engine gem supporting multiple similarity algorithms. Use to recommend items for people, or people for items.
|
|
3
3
|
|
|
4
4
|
## Installation
|
|
5
5
|
Simply `gem install recommengine` or if using bundler, add recommengine to your Gemfile thusly
|
|
@@ -8,29 +8,30 @@ require 'recommengine'
|
|
|
8
8
|
```
|
|
9
9
|
|
|
10
10
|
## Methodology
|
|
11
|
-
RecommEngine uses a weighted scoring system in conjunction with a similarity algorithm (of either the Pearson or Euclidean variety) to suggest
|
|
12
|
-
Products need not be physical items, as this gem has applications outside the ecommerce realm. Products can be things like movies or web links as well. Similarly, scores aren't limited to only to ratings -- they only need to be a numerical representation that describes a user's behavior. For example:
|
|
11
|
+
RecommEngine uses a weighted scoring system in conjunction with a similarity algorithm (of either the Pearson or Euclidean variety) to suggest items to users based on their prior behavior in accordance with the principle of collaborative filtering. In order to utilize this gem, you must have users, items, and some kind of numerical scoring system that describes a user's interaction with a given item.
|
|
13
12
|
|
|
13
|
+
Items need not be physical products, as recommendation engines have applications outside the ecommerce/marketplace realm. Items can be things like movies or web links as well. Similarly, scores aren't limited only to ratings of physical items -- they only need to be a numerical representation that describes a user's behavior. For example:
|
|
14
14
|
|
|
15
|
-
|
|
15
|
+
|
|
16
|
+
| Score | Physical Goods | Movies | Web Habits |
|
|
16
17
|
--------|:--------------:|:------:|-------------:|
|
|
17
|
-
| 0 | No Interaction |
|
|
18
|
-
| 1 | Browsed |
|
|
19
|
-
| 2 | Searched For |
|
|
20
|
-
| 3 | Bought |
|
|
21
|
-
| 4 | Bought > 1x |
|
|
18
|
+
| 0 | No Interaction | * | Didn't Click |
|
|
19
|
+
| 1 | Browsed | ** | Clicked |
|
|
20
|
+
| 2 | Searched For | *** | Searched for |
|
|
21
|
+
| 3 | Bought | **** | N/A |
|
|
22
|
+
| 4 | Bought > 1x | *****| N/A |
|
|
22
23
|
|
|
23
24
|
### Euclidean Algorithm
|
|
24
25
|
|
|
25
|
-
The Euclidean distance algorithm is a simplistic similarity measure that determines how close two points are when plotted in [Cartesian coordinates](https://en.wikipedia.org/wiki/Cartesian_coordinate_system). Read more about Euclidean distance on [wikipedia](https://en.wikipedia.org/wiki/Euclidean_distance).
|
|
26
|
+
The Euclidean distance algorithm is a simplistic similarity measure that determines how close two points are when plotted in [Cartesian coordinates](https://en.wikipedia.org/wiki/Cartesian_coordinate_system). This algorithm essentially calculates average distance by means of the [Pythagorean Theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem) and inverts the result so that a score of '1.0' represents a perfect correlation (both users have identical preferences) and a score of '0.0' represents no correlation. Read more about Euclidean distance on [wikipedia](https://en.wikipedia.org/wiki/Euclidean_distance).
|
|
26
27
|
|
|
27
28
|
### Pearson Algorithm
|
|
28
29
|
|
|
29
|
-
The Pearson similarity score adds a bit more complexity. Feel free to read up on it on [wikipedia](https://en.wikipedia.org/wiki/
|
|
30
|
+
The Pearson similarity score adds a bit more complexity. While not as intuitive as calculating Euclidean distance, the Pearson correlation determines the ratio of how much all scored items between two users vary altogether to the product of how much they vary individually, such that a score of '1.0' represents a perfect correlation, a score of '0.0' represents no correlation, and a score of '-1.0' represents an inverse (negative) correlation. The general idea from a Cartesian standpoint (assuming we have a `subject` user on the X axis, a `comparate` user on the Y axis, scores correlating to axial position, and a scatter plot of items) is to determine the 'best-fit' line that comes as close as possible to touching all items (a score of 1.0 could be represented by a perfectly diagonal line that touches all items). Feel free to read up on it on [wikipedia](https://en.wikipedia.org/wiki/Pearson_item-moment_correlation_coefficient). The Pearson method is the default similarity algorithm in RecommEngine, primarily because it compensates for grade inflation and can determine negative correlations between users' preferences.
|
|
30
31
|
|
|
31
32
|
### How Recommendations Work
|
|
32
33
|
|
|
33
|
-
When passing
|
|
34
|
+
When passing `data:` and `subject:` arguments to the recommendations method, RecommEngine will compare the similarity between all members of the `data` hash and the `subject` using the specified `similarity` algorithm, and determine a weighted predicted score for each item based upon each user's similarity to the subject and the scores that each user gave each item. Only items that the subject has not yet scored will be returned (no need to recommend an item to a user that he/she has already interacted with). The items are sorted by score in descending order.
|
|
34
35
|
|
|
35
36
|
## Usage
|
|
36
37
|
|
|
@@ -62,12 +63,12 @@ RecommEngine.recs(data: books, subject: :alice)
|
|
|
62
63
|
["War of the Worlds", 3.5]]
|
|
63
64
|
```
|
|
64
65
|
|
|
65
|
-
Only
|
|
66
|
+
Only items the subject has not rated will be returned (we won't recommend items with which the user is already familiar). The second element in each array is the predicted score the subject will give to the item, based upon the weighted average that similar users to alice have rated the item. Since no similarity algorithm was specified, the Pearson score is used by default.
|
|
66
67
|
|
|
67
68
|
Alternatively, we could specify the use of the Euclidean algorithm as follows.
|
|
68
69
|
|
|
69
70
|
```ruby
|
|
70
|
-
RecommEngine(data: books, subject: :alice, similarity: 'Euclidean')
|
|
71
|
+
RecommEngine.recs(data: books, subject: :alice, similarity: 'Euclidean')
|
|
71
72
|
```
|
|
72
73
|
|
|
73
74
|
Which returns a similar, though subtly different set of results:
|
|
@@ -77,12 +78,18 @@ Which returns a similar, though subtly different set of results:
|
|
|
77
78
|
["War of the Worlds", 3.8959601003790714],
|
|
78
79
|
["The Great Gatsby", 3.7736808311188366]]
|
|
79
80
|
```
|
|
80
|
-
|
|
81
|
+
|
|
82
|
+
If you only want the top recommendation, simply:
|
|
83
|
+
|
|
84
|
+
```Ruby
|
|
85
|
+
RecommEngine.top_rec(data: books, subject: :alice, similarity: 'Euclidean')
|
|
86
|
+
```
|
|
87
|
+
### Similar Users
|
|
81
88
|
|
|
82
89
|
RecommEngine includes a utility to find similar users to a subject. This can be done by calling:
|
|
83
90
|
|
|
84
91
|
```ruby
|
|
85
|
-
RecommEngine.
|
|
92
|
+
RecommEngine.similar_users(data: books, subject: :bob)
|
|
86
93
|
```
|
|
87
94
|
|
|
88
95
|
which returns:
|
|
@@ -95,9 +102,13 @@ You'll notice a reasonably strong positive correlation with Don (meaning Don and
|
|
|
95
102
|
|
|
96
103
|
By default, the Pearson algorithm is used, and only 3 results are returned. These parameters can be defined explicitly when calling the method, by passing for example `num: 5` and/or `similarity: 'Euclidean'`.
|
|
97
104
|
|
|
105
|
+
Practically speaking, this can be used as a means to recommend users to one another in a social network.
|
|
106
|
+
|
|
107
|
+
To get users with the lowest similarity score: `RecommEngine.dissimilar_users(data: ...`
|
|
108
|
+
|
|
98
109
|
### Flipper
|
|
99
110
|
|
|
100
|
-
RecommEngine also provides a handy means to 'flip' your data -- transposing
|
|
111
|
+
RecommEngine also provides a handy means to 'flip' your data -- transposing items and users. This can be used for making generic item recommendations (for instance when a user is not logged in, or you do not have very much data for the logged-in user). This can also be handy when pushing direct marketing campaigns for specific items. Items are compared by similarity, and users are recommended for each item. It can lead to some interesting results. For example
|
|
101
112
|
|
|
102
113
|
```ruby
|
|
103
114
|
RecommEngine.recs(data: RecommEngine.flip(books), subject: 'Crime and Punishment')
|
data/lib/recommengine.rb
CHANGED
|
@@ -1,27 +1,30 @@
|
|
|
1
|
-
files = %w[calculator euclidean_calculator flipper matcher pearson_calculator recommender]
|
|
2
|
-
files.each { |f| require "recommengine/#{f}" }
|
|
1
|
+
files = %w[calculator euclidean_calculator flipper matcher pearson_calculator recommender test_data]
|
|
2
|
+
files.each { |f| require "./lib/recommengine/#{f}" }
|
|
3
3
|
|
|
4
4
|
module RecommEngine
|
|
5
|
+
|
|
5
6
|
DEFAULT_ALGORITHM = 'Pearson'
|
|
6
7
|
DEFAULT_MATCHES_NUMBER = 3
|
|
7
8
|
|
|
8
|
-
|
|
9
|
+
module_function
|
|
10
|
+
|
|
11
|
+
def recs(data:, subject:, similarity: RecommEngine::DEFAULT_ALGORITHM)
|
|
9
12
|
RecommEngine::Recommender.new(data: data, subject: subject, similarity: similarity).recs
|
|
10
13
|
end
|
|
11
14
|
|
|
12
|
-
|
|
15
|
+
def top_rec(data:, subject:, similarity: RecommEngine::DEFAULT_ALGORITHM)
|
|
16
|
+
RecommEngine::Recommender.new(data: data, subject: subject, similarity: similarity).top_rec
|
|
17
|
+
end
|
|
13
18
|
|
|
14
|
-
def
|
|
19
|
+
def similar_users(data:, subject:, similarity: RecommEngine::DEFAULT_ALGORITHM, num: RecommEngine::DEFAULT_MATCHES_NUMBER)
|
|
15
20
|
RecommEngine::Matcher.new(data: data, subject: subject, similarity: similarity, num: num).top_matches
|
|
16
21
|
end
|
|
17
22
|
|
|
18
|
-
def
|
|
23
|
+
def dissimilar_users(data:, subject:, similarity: RecommEngine::DEFAULT_ALGORITHM, num: RecommEngine::DEFAULT_MATCHES_NUMBER)
|
|
19
24
|
RecommEngine::Matcher.new(data: data, subject: subject, similarity: similarity, num: num).bottom_matches
|
|
20
25
|
end
|
|
21
26
|
|
|
22
27
|
def flip(data)
|
|
23
28
|
RecommEngine::Flipper.new(data).flip
|
|
24
29
|
end
|
|
25
|
-
|
|
26
|
-
module_function :recommendations, :recs, :top_matches, :flip
|
|
27
30
|
end
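Note on the `module_function` change above: 0.1.4 moves from an explicit trailing list (`module_function :recommendations, :recs, :top_matches, :flip`) to a bare `module_function` placed before the method definitions. A minimal sketch of how the bare form behaves, using a hypothetical `Greeter` module that is not part of the gem:

```ruby
# Hypothetical module, used only to illustrate bare `module_function`.
module Greeter
  DEFAULT_GREETING = 'Hello'

  # With no arguments, module_function applies to every method defined after it,
  # so there is no separate list of names to keep in sync.
  module_function

  def greet(name:, greeting: Greeter::DEFAULT_GREETING)
    "#{greeting}, #{name}!"
  end

  # Module functions can call each other directly.
  def shout(name:)
    greet(name: name).upcase
  end
end

puts Greeter.greet(name: 'Alice')
puts Greeter.shout(name: 'Bob')
```

With the bare form, newly added methods such as `top_rec` and `dissimilar_users` become callable on the module without any further bookkeeping, which may be why the explicit list was dropped.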
|
data/lib/recommengine/euclidean_calculator.rb
CHANGED
|
@@ -1,19 +1,25 @@
|
|
|
1
1
|
module RecommEngine
|
|
2
2
|
class EuclideanCalculator < Calculator
|
|
3
|
-
|
|
3
|
+
def calc
|
|
4
|
+
1.0 / (Math.sqrt(1.0 + sum_of_squared_distances))
|
|
5
|
+
end
|
|
6
|
+
|
|
7
|
+
private
|
|
4
8
|
|
|
5
|
-
def
|
|
6
|
-
|
|
9
|
+
def squared_distances
|
|
10
|
+
data[subject].map{ |item, subject_score| square_of_distance(item, subject_score) if comparate_score(item) }.compact
|
|
7
11
|
end
|
|
8
12
|
|
|
9
|
-
def
|
|
10
|
-
|
|
13
|
+
def sum_of_squared_distances
|
|
14
|
+
squared_distances.inject(:+)
|
|
11
15
|
end
|
|
12
16
|
|
|
13
|
-
|
|
17
|
+
def comparate_score(item)
|
|
18
|
+
data[comparate][item]
|
|
19
|
+
end
|
|
14
20
|
|
|
15
|
-
def
|
|
16
|
-
|
|
21
|
+
def square_of_distance(item, subject_score)
|
|
22
|
+
(subject_score - comparate_score(item))**2
|
|
17
23
|
end
|
|
18
24
|
end
|
|
19
25
|
end
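Note on the Euclidean calculation above: the new `calc` returns `1.0 / Math.sqrt(1.0 + sum_of_squared_distances)` over the items both users have scored. The following is a standalone sketch of that formula, not the gem's code; `euclidean_similarity` and the `ratings` hash are hypothetical. One difference is deliberate: in the diffed version `inject(:+)` has no initial value, so it would appear to return `nil` when the two users share no items, whereas this sketch returns `0.0` in that case (an assumption about the desired behavior).

```ruby
# Hypothetical helper, not part of the gem's API.
def euclidean_similarity(data, subject, comparate)
  shared_items = data[subject].keys & data[comparate].keys
  # Assumption: no overlap means no evidence of similarity.
  return 0.0 if shared_items.empty?

  # Sum the squared score differences over the shared items.
  sum_of_squares = shared_items.inject(0.0) do |sum, item|
    sum + (data[subject][item] - data[comparate][item])**2
  end

  # Invert so that identical scores give 1.0 and large distances approach 0.0.
  1.0 / Math.sqrt(1.0 + sum_of_squares)
end

ratings = {
  alice: { 'Dune' => 4.0, 'Emma' => 2.0 },
  bob:   { 'Dune' => 4.0, 'Emma' => 2.0 },
  cindy: { 'Dune' => 1.0, 'Emma' => 6.0 },
  don:   { 'Ulysses' => 3.0 }
}

puts euclidean_similarity(ratings, :alice, :bob)   # identical scores
puts euclidean_similarity(ratings, :alice, :cindy) # differences of 3 and 4
puts euclidean_similarity(ratings, :alice, :don)   # no shared items
```

Because the result can never be negative, the recommender's `non_positive_similarity?` filter only excludes users under this measure when the score is exactly zero.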
|
data/lib/recommengine/flipper.rb
CHANGED
|
@@ -1,5 +1,7 @@
|
|
|
1
1
|
module RecommEngine
|
|
2
2
|
class Flipper
|
|
3
|
+
attr_reader :data
|
|
4
|
+
|
|
3
5
|
def initialize(data)
|
|
4
6
|
@data = data
|
|
5
7
|
end
|
|
@@ -7,10 +9,10 @@ module RecommEngine
|
|
|
7
9
|
def flip
|
|
8
10
|
result = {}
|
|
9
11
|
result.default = {}
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
result[
|
|
13
|
-
result[
|
|
12
|
+
data.each_key do |user|
|
|
13
|
+
data[user].each_key do |item|
|
|
14
|
+
result[item] = {} if result[item].empty?
|
|
15
|
+
result[item][user] = data[user][item]
|
|
14
16
|
end
|
|
15
17
|
end
|
|
16
18
|
result
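Note on `flip` above: the method transposes a `user => { item => score }` hash into `item => { user => score }`. A minimal standalone sketch of the same transposition follows; `flip_scores` and the sample hash are hypothetical, and the block-based `Hash.new` is a substitute for the `result.default = {}` plus `empty?` check used in the diff, not the gem's approach.

```ruby
# Hypothetical helper, not part of the gem's API.
def flip_scores(data)
  # The default block creates a fresh inner hash per item, so no two items
  # ever share the same hash object.
  flipped = Hash.new { |hash, item| hash[item] = {} }

  data.each do |user, scores|
    scores.each { |item, score| flipped[item][user] = score }
  end

  flipped
end

by_user = {
  alice: { 'Dune' => 4.0, 'Emma' => 2.0 },
  bob:   { 'Dune' => 3.0 }
}

p flip_scores(by_user)
```

The flipped hash can then be passed as `data:` with an item as the `subject:`, which is the pattern the README hunk uses for `'Crime and Punishment'`.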
|
data/lib/recommengine/matcher.rb
CHANGED
|
@@ -20,24 +20,27 @@ module RecommEngine
|
|
|
20
20
|
private
|
|
21
21
|
|
|
22
22
|
def all_matches
|
|
23
|
-
|
|
23
|
+
similarity_scores.sort_by{|k, v| v}
|
|
24
24
|
end
|
|
25
25
|
|
|
26
|
-
def
|
|
26
|
+
def similarity_scores
|
|
27
27
|
scores = {}
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
scores[k] = similarity_calculator.new(data: @data, subject: @subject, comparate: k).calc
|
|
28
|
+
comparates.each do |comparate|
|
|
29
|
+
scores[comparate] = similarity_calculator.new(data: data, subject: subject, comparate: comparate).calc
|
|
31
30
|
end
|
|
32
31
|
scores
|
|
33
32
|
end
|
|
34
33
|
|
|
35
34
|
def similarity_calculator
|
|
36
|
-
Module.const_get("RecommEngine::#{
|
|
35
|
+
Module.const_get("RecommEngine::#{similarity}Calculator")
|
|
37
36
|
end
|
|
38
37
|
|
|
39
38
|
def upper_limit
|
|
40
39
|
num - 1
|
|
41
40
|
end
|
|
41
|
+
|
|
42
|
+
def comparates
|
|
43
|
+
data.dup.delete_if{ |k,v| k == subject }.keys
|
|
44
|
+
end
|
|
42
45
|
end
|
|
43
46
|
end
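Note on the matcher above: `all_matches` sorts the similarity scores in ascending order, and `upper_limit` is `num - 1`. The `top_matches` and `bottom_matches` methods fall outside this hunk, so the sketch below is only an assumption about how the two ends of that sorted list could be taken; `rank_by_similarity` and the sample scores are hypothetical.

```ruby
# Hypothetical helper, not part of the gem's API.
# scores: { user => similarity }, as built by a method like similarity_scores.
def rank_by_similarity(scores, num: 3, order: :top)
  ascending = scores.sort_by { |_user, score| score }

  # :top takes the most similar users first; :bottom the least similar.
  order == :top ? ascending.reverse.first(num) : ascending.first(num)
end

scores = { bob: 0.2, cindy: -0.5, don: 0.9, erica: 0.4 }

p rank_by_similarity(scores, num: 2)
p rank_by_similarity(scores, num: 2, order: :bottom)
```

Under the Pearson measure the bottom of the list can contain negative scores, which is what makes a `dissimilar_users` call meaningful.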
|
data/lib/recommengine/pearson_calculator.rb
CHANGED
|
@@ -1,40 +1,40 @@
|
|
|
1
1
|
module RecommEngine
|
|
2
2
|
class PearsonCalculator < Calculator
|
|
3
|
-
|
|
3
|
+
attr_reader :sum_of_subject_scores, :sum_of_comparate_scores, :sum_of_sq_subject_scores, :sum_of_sq_comparate_scores, :sum_of_scores_product, :_number_of_hits, :_similar_items
|
|
4
4
|
|
|
5
|
-
def initialize(
|
|
5
|
+
def initialize(*args)
|
|
6
6
|
super
|
|
7
|
-
@
|
|
7
|
+
@sum_of_subject_scores = @sum_of_comparate_scores = @sum_of_sq_subject_scores = @sum_of_sq_comparate_scores = @sum_of_scores_product = 0
|
|
8
8
|
end
|
|
9
9
|
|
|
10
10
|
def calc
|
|
11
|
-
return 0 if
|
|
12
|
-
|
|
11
|
+
return 0 if number_of_hits < 2
|
|
12
|
+
sum_all_scores
|
|
13
13
|
perform_equation
|
|
14
14
|
end
|
|
15
15
|
|
|
16
16
|
private
|
|
17
17
|
|
|
18
18
|
def similar_items
|
|
19
|
-
@
|
|
19
|
+
@_similar_items ||= data[subject].map{ |k, v| k if data[comparate].keys.include?(k) }.compact
|
|
20
20
|
end
|
|
21
21
|
|
|
22
22
|
def number_of_hits
|
|
23
|
-
@
|
|
23
|
+
@_number_of_hits ||= similar_items.length
|
|
24
24
|
end
|
|
25
25
|
|
|
26
|
-
def
|
|
27
|
-
similar_items.each{ |item|
|
|
26
|
+
def sum_all_scores
|
|
27
|
+
similar_items.each{ |item| sum_scores_for(item) }
|
|
28
28
|
end
|
|
29
29
|
|
|
30
|
-
def
|
|
31
|
-
subject_val =
|
|
32
|
-
comparate_val =
|
|
33
|
-
@
|
|
34
|
-
@
|
|
35
|
-
@
|
|
36
|
-
@
|
|
37
|
-
@
|
|
30
|
+
def sum_scores_for(item)
|
|
31
|
+
subject_val = data[subject][item]
|
|
32
|
+
comparate_val = data[comparate][item]
|
|
33
|
+
@sum_of_subject_scores += subject_val
|
|
34
|
+
@sum_of_comparate_scores += comparate_val
|
|
35
|
+
@sum_of_sq_subject_scores += subject_val**2.0
|
|
36
|
+
@sum_of_sq_comparate_scores += comparate_val**2.0
|
|
37
|
+
@sum_of_scores_product += subject_val * comparate_val
|
|
38
38
|
end
|
|
39
39
|
|
|
40
40
|
def perform_equation
|
|
@@ -42,11 +42,11 @@ module RecommEngine
|
|
|
42
42
|
end
|
|
43
43
|
|
|
44
44
|
def numerator
|
|
45
|
-
|
|
45
|
+
sum_of_scores_product - (sum_of_subject_scores * sum_of_comparate_scores/number_of_hits)
|
|
46
46
|
end
|
|
47
47
|
|
|
48
48
|
def denominator
|
|
49
|
-
@
|
|
49
|
+
@_denominator ||= Math.sqrt((sum_of_sq_subject_scores - (sum_of_subject_scores**2.0)/number_of_hits)*(sum_of_sq_comparate_scores - (sum_of_comparate_scores**2.0)/number_of_hits))
|
|
50
50
|
end
|
|
51
51
|
end
|
|
52
52
|
end
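Note on the Pearson calculation above: the numerator and denominator in the diff correspond to the running-sums form of the correlation coefficient, with `x` the subject's scores, `y` the comparate's scores, and `n` the number of shared items: numerator `Σxy - (Σx·Σy)/n`, denominator `sqrt((Σx² - (Σx)²/n) · (Σy² - (Σy)²/n))`. The sketch below is a standalone version under stated assumptions; `pearson_similarity` and the sample data are hypothetical. It converts scores with `to_f` because the diffed numerator divides with `/`, which would truncate if every score were an Integer, and it returns `0.0` when either user's scores do not vary, since the body of `perform_equation` is not visible in the hunk.

```ruby
# Hypothetical helper, not part of the gem's API.
def pearson_similarity(data, subject, comparate)
  shared_items = data[subject].keys & data[comparate].keys
  n = shared_items.length
  return 0.0 if n < 2 # mirrors `return 0 if number_of_hits < 2`

  xs = shared_items.map { |item| data[subject][item].to_f }
  ys = shared_items.map { |item| data[comparate][item].to_f }

  sum_x  = xs.sum
  sum_y  = ys.sum
  sum_xx = xs.sum { |x| x**2 }
  sum_yy = ys.sum { |y| y**2 }
  sum_xy = xs.zip(ys).sum { |x, y| x * y }

  numerator = sum_xy - (sum_x * sum_y / n)
  spread = (sum_xx - sum_x**2 / n) * (sum_yy - sum_y**2 / n)

  # Assumption: treat "no variation" (or rounding below zero) as no correlation.
  return 0.0 unless spread > 0

  numerator / Math.sqrt(spread)
end

ratings = {
  alice: { 'A' => 1, 'B' => 2, 'C' => 3 },
  bob:   { 'A' => 2, 'B' => 4, 'C' => 6 }, # same shape as alice, scaled up
  cindy: { 'A' => 3, 'B' => 2, 'C' => 1 }, # opposite taste to alice
  don:   { 'A' => 5 }
}

puts pearson_similarity(ratings, :alice, :bob)
puts pearson_similarity(ratings, :alice, :cindy)
puts pearson_similarity(ratings, :alice, :don)
```

The alice/bob pair illustrates the "grade inflation" point from the README: bob scores everything twice as high, yet the correlation is still perfect.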
|
data/lib/recommengine/recommender.rb
CHANGED
|
@@ -1,62 +1,78 @@
|
|
|
1
1
|
module RecommEngine
|
|
2
2
|
class Recommender
|
|
3
|
-
attr_reader :data, :subject, :
|
|
3
|
+
attr_reader :data, :subject, :similarity_algorithm, :user_similarity_scores, :sum_of_weighted_scores_by_item, :sum_of_user_similarity_scores_by_item, :predicted_scores
|
|
4
4
|
|
|
5
5
|
def initialize(data:, subject:, similarity: RecommEngine::DEFAULT_ALGORITHM)
|
|
6
6
|
@data = data
|
|
7
7
|
@subject = subject
|
|
8
|
-
@
|
|
9
|
-
@
|
|
10
|
-
@
|
|
11
|
-
@
|
|
12
|
-
@
|
|
13
|
-
@
|
|
8
|
+
@similarity_algorithm = similarity
|
|
9
|
+
@predicted_scores = {}
|
|
10
|
+
@user_similarity_scores = {}
|
|
11
|
+
@sum_of_weighted_scores_by_item = {}
|
|
12
|
+
@sum_of_user_similarity_scores_by_item = {}
|
|
13
|
+
@sum_of_weighted_scores_by_item.default = 0
|
|
14
|
+
@sum_of_user_similarity_scores_by_item.default = 0
|
|
14
15
|
end
|
|
15
16
|
|
|
16
17
|
def recs
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
18
|
+
calculate_weighted_totals
|
|
19
|
+
calculate_predicted_scores
|
|
20
|
+
predicted_scores.sort_by{ |item, score| score }.reverse
|
|
21
|
+
end
|
|
22
|
+
|
|
23
|
+
def top_rec
|
|
24
|
+
calculate_weighted_totals
|
|
25
|
+
calculate_predicted_scores
|
|
26
|
+
predicted_scores.max_by{ |item, score| score }
|
|
24
27
|
end
|
|
25
28
|
|
|
26
29
|
private
|
|
27
30
|
|
|
28
|
-
def
|
|
29
|
-
|
|
31
|
+
def calculate_weighted_totals
|
|
32
|
+
comparates.each do |comparate|
|
|
33
|
+
data[comparate].each_key{ |item| update_cumulative_totals(comparate, item) unless scored_by_subject?(item) }
|
|
34
|
+
end
|
|
30
35
|
end
|
|
31
36
|
|
|
32
|
-
def
|
|
33
|
-
|
|
34
|
-
|
|
37
|
+
def comparates
|
|
38
|
+
data.dup.delete_if{ |user, item| user == subject || non_positive_similarity?(user) }.keys
|
|
39
|
+
end
|
|
40
|
+
|
|
41
|
+
def non_positive_similarity?(comparate)
|
|
42
|
+
similarity_score(comparate) <= 0
|
|
43
|
+
end
|
|
44
|
+
|
|
45
|
+
def similarity_score(comparate)
|
|
46
|
+
user_similarity_scores[comparate] ||= similarity_calculator.new(data: calculator_data(comparate), subject: subject, comparate: comparate).calc
|
|
35
47
|
end
|
|
36
48
|
|
|
37
49
|
def similarity_calculator
|
|
38
|
-
Module.const_get("RecommEngine::#{
|
|
50
|
+
Module.const_get("RecommEngine::#{similarity_algorithm}Calculator")
|
|
39
51
|
end
|
|
40
52
|
|
|
41
|
-
def
|
|
42
|
-
|
|
53
|
+
def calculator_data(comparate)
|
|
54
|
+
data.select{ |user, item| user == subject || user == comparate }
|
|
43
55
|
end
|
|
44
56
|
|
|
45
|
-
def
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
57
|
+
def calculate_predicted_scores
|
|
58
|
+
sum_of_weighted_scores_by_item.each { |item, sum_of_scores| predicted_scores[item] = average_weighted_similarity_score(item, sum_of_scores) }
|
|
59
|
+
end
|
|
60
|
+
|
|
61
|
+
def scored_by_subject?(item)
|
|
62
|
+
data[subject][item] && !data[subject][item].zero?
|
|
63
|
+
end
|
|
64
|
+
|
|
65
|
+
def update_cumulative_totals(comparate, item)
|
|
66
|
+
sum_of_weighted_scores_by_item[item] += weighted_similarity_score(comparate, item)
|
|
67
|
+
sum_of_user_similarity_scores_by_item[item] += similarity_score(comparate)
|
|
49
68
|
end
|
|
50
69
|
|
|
51
|
-
def
|
|
52
|
-
|
|
53
|
-
@similarity_sums[product] += score(comperate)
|
|
70
|
+
def weighted_similarity_score(comparate, item)
|
|
71
|
+
data[comparate][item] * similarity_score(comparate)
|
|
54
72
|
end
|
|
55
73
|
|
|
56
|
-
def
|
|
57
|
-
|
|
58
|
-
@totals.each { |subject, total| rankings[subject] = total / @similarity_sums[subject] }
|
|
59
|
-
rankings.sort_by{|k, v| v}.reverse
|
|
74
|
+
def average_weighted_similarity_score(item, sum_of_scores)
|
|
75
|
+
sum_of_scores / sum_of_user_similarity_scores_by_item[item]
|
|
60
76
|
end
|
|
61
77
|
end
|
|
62
78
|
end
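Note on the recommender above: for each item the subject has not scored, the predicted score is the sum of `score * similarity` across comparates, divided by the sum of those comparates' similarities, and comparates with a similarity of zero or less are skipped. The sketch below restates that flow in one function under stated assumptions; `predict_scores`, the `similarity` lambda, and the sample data are hypothetical, and the similarity values are supplied directly rather than computed by a calculator class.

```ruby
# Hypothetical helper, not part of the gem's API.
# similarity: a callable that returns the subject's similarity to a given user.
def predict_scores(data, subject, similarity)
  weighted_totals   = Hash.new(0.0)
  similarity_totals = Hash.new(0.0)

  data.each do |user, scores|
    next if user == subject

    sim = similarity.call(user)
    next unless sim > 0 # skip non-positive similarity, as in the diff

    scores.each do |item, score|
      subject_score = data[subject][item]
      # An item counts as already scored only if the subject has a non-zero score.
      next if subject_score && !subject_score.zero?

      weighted_totals[item]   += score * sim
      similarity_totals[item] += sim
    end
  end

  weighted_totals
    .map { |item, total| [item, total / similarity_totals[item]] }
    .sort_by { |_item, predicted| -predicted }
end

data = {
  alice: { 'A' => 4.0 },
  bob:   { 'A' => 5.0, 'B' => 4.0 },
  cindy: { 'A' => 1.0, 'B' => 2.0, 'C' => 3.0 },
  don:   { 'B' => 1.0 }
}
similarities = { bob: 0.75, cindy: 0.25, don: -1.0 }

p predict_scores(data, :alice, ->(user) { similarities.fetch(user) })
```

Dividing by the per-item similarity total keeps an item scored by many users from outranking one scored by a few highly similar users simply because more terms were added.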
|
data/lib/recommengine/test_data.rb
ADDED
|
@@ -0,0 +1,23 @@
|
|
|
1
|
+
module RecommEngine
|
|
2
|
+
WU_TANG = {'Ghostface Killah'=> {'The Sixth Sense'=> 2.5, 'Snakes on a Plane'=> 3.5, '27 Dresses'=> 3.0, 'The Avengers'=> 3.5, 'Pootie Tang'=> 2.5, 'Titanic'=> 3.0},
|
|
3
|
+
'RZA'=> {'The Sixth Sense'=> 3.0, 'Snakes on a Plane'=> 3.5, '27 Dresses'=> 1.5, 'The Avengers'=> 5.0, 'Titanic'=> 3.0, 'Pootie Tang'=> 3.5},
|
|
4
|
+
'Raekwon'=> {'The Sixth Sense'=> 2.5, 'Snakes on a Plane'=> 3.0, 'The Avengers'=> 3.5, 'Titanic'=> 4.0},
|
|
5
|
+
'GZA'=> {'Snakes on a Plane'=> 3.5, '27 Dresses'=> 3.0, 'Titanic'=> 4.5, 'The Avengers'=> 4.0, 'Pootie Tang'=> 2.5},
|
|
6
|
+
'Method Man'=> {'The Sixth Sense'=> 3.0, 'Snakes on a Plane'=> 4.0, '27 Dresses'=> 2.0, 'The Avengers'=> 3.0, 'Titanic'=> 3.0,'Pootie Tang'=> 2.0},
|
|
7
|
+
'ODB' => {'The Sixth Sense'=> 3.0, 'Snakes on a Plane'=> 4.0, 'Titanic'=> 3.0, 'The Avengers'=> 5.0, 'Pootie Tang'=> 3.5},
|
|
8
|
+
'Inspectah Deck'=> {'Snakes on a Plane'=>4.5, 'Pootie Tang'=>1.0, 'The Avengers'=>4.0}}
|
|
9
|
+
|
|
10
|
+
CRITICS = {"Lisa Rose"=>{"Lady in the Water"=>2.5, "Snakes on a Plane"=>3.5, "Just My Luck"=>3.0, "Superman Returns"=>3.5, "You, Me and Dupree"=>2.5, "The Night Listener"=>3.0},
|
|
11
|
+
"Gene Seymour"=>{"Lady in the Water"=>3.0, "Snakes on a Plane"=>3.5, "Just My Luck"=>1.5, "Superman Returns"=>5.0, "The Night Listener"=>3.0, "You, Me and Dupree"=>3.5},
|
|
12
|
+
"Michael Phillips"=>{"Lady in the Water"=>2.5, "Snakes on a Plane"=>3.0, "Superman Returns"=>3.5, "The Night Listener"=>4.0},
|
|
13
|
+
"Claudia Puig"=>{"Snakes on a Plane"=>3.5, "Just My Luck"=>3.0, "The Night Listener"=>4.5, "Superman Returns"=>4.0, "You, Me and Dupree"=>2.5},
|
|
14
|
+
"Mick LaSalle"=>{"Lady in the Water"=>3.0, "Snakes on a Plane"=>4.0, "Just My Luck"=>2.0, "Superman Returns"=>3.0, "The Night Listener"=>3.0, "You, Me and Dupree"=>2.0},
|
|
15
|
+
"Jack Matthews"=>{"Lady in the Water"=>3.0, "Snakes on a Plane"=>4.0, "The Night Listener"=>3.0, "Superman Returns"=>5.0, "You, Me and Dupree"=>3.5},
|
|
16
|
+
"Toby"=>{"Snakes on a Plane"=>4.5, "You, Me and Dupree"=>1.0,"Superman Returns"=>4.0}}
|
|
17
|
+
|
|
18
|
+
BOOKS = {alice: {"War and Peace" => 2.5, "Crime and Punishment" => 3.5},
|
|
19
|
+
bob: {"War of the Worlds" => 5.0, "War and Peace" => 1.5, "The Great Gatsby" => 4.0},
|
|
20
|
+
cindy: {"War and Peace" => 5.0, "The Great Gatsby" => 4.5, "War of the Worlds" => 3.0, "Twenty Thousand Leagues Under the Sea" => 3.0},
|
|
21
|
+
don: {"War of the Worlds" => 4.0, "The Great Gatsby" => 2.5, "Twenty Thousand Leagues Under the Sea" => 5.0, "Crime and Punishment" => 4.5, "War and Peace" => 3.0},
|
|
22
|
+
erica: {"War of the Worlds" => 3.0, "The Great Gatsby" => 4.5, "Twenty Thousand Leagues Under the Sea" => 4.0, "Crime and Punishment" => 4.5, "War and Peace" => 3.5}}
|
|
23
|
+
end
|
data/recommengine.gemspec
CHANGED
|
@@ -2,7 +2,7 @@ $:.push File.expand_path("../lib", __FILE__)
|
|
|
2
2
|
|
|
3
3
|
Gem::Specification.new do |s|
|
|
4
4
|
s.name = 'recommengine'
|
|
5
|
-
s.version = '0.1.
|
|
5
|
+
s.version = '0.1.4'
|
|
6
6
|
s.date = '2015-09-12'
|
|
7
7
|
s.summary = "A flexible recommendation engine."
|
|
8
8
|
s.description = "A flexible recommendation engine supporting multiple similarity algorithms for use in ecommerce sites, marketplaces, social sharing apps, and more."
|
data/spec/lib/recommender_spec.rb
CHANGED
|
@@ -5,7 +5,7 @@ describe '.recommender' do
|
|
|
5
5
|
|
|
6
6
|
describe '#initialize' do
|
|
7
7
|
it 'defaults to Pearson similarity algorithm if none is specified' do
|
|
8
|
-
expect(default_recommender.
|
|
8
|
+
expect(default_recommender.similarity_algorithm).to eq('Pearson')
|
|
9
9
|
end
|
|
10
10
|
end
|
|
11
11
|
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: recommengine
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.1.
|
|
4
|
+
version: 0.1.4
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Cody Knauer
|
|
@@ -42,6 +42,7 @@ files:
|
|
|
42
42
|
- lib/recommengine/matcher.rb
|
|
43
43
|
- lib/recommengine/pearson_calculator.rb
|
|
44
44
|
- lib/recommengine/recommender.rb
|
|
45
|
+
- lib/recommengine/test_data.rb
|
|
45
46
|
- recommengine.gemspec
|
|
46
47
|
- spec/lib/euclidean_calculator_spec.rb
|
|
47
48
|
- spec/lib/flipper_spec.rb
|
|
@@ -70,7 +71,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
|
70
71
|
version: '0'
|
|
71
72
|
requirements: []
|
|
72
73
|
rubyforge_project:
|
|
73
|
-
rubygems_version: 2.4.5
|
|
74
|
+
rubygems_version: 2.4.5.1
|
|
74
75
|
signing_key:
|
|
75
76
|
specification_version: 4
|
|
76
77
|
summary: A flexible recommendation engine.
|