co2_filter 0.0.1 → 0.1.0
- checksums.yaml +4 -4
- data/README.rdoc +25 -2
- data/lib/co2_filter/collaborative.rb +23 -4
- data/lib/co2_filter/content_based.rb +12 -3
- data/lib/co2_filter/version.rb +1 -1
- data/lib/co2_filter.rb +9 -2
- metadata +17 -17
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz:
-   data.tar.gz:
+   metadata.gz: 200b98128f375b95bc95eeef7eee80e24f970f17
+   data.tar.gz: e1c8b8c12ef51beb83281a4af88045df6b27a7c2
  SHA512:
-   metadata.gz:
-   data.tar.gz:
+   metadata.gz: 79e65ec47635ae4177f0f30aa420573d15c601fb5b4d521d3235046dd947f9c508d38fd1ef7121721e3071b2fc4a8c5eae3c36920870871368bfe21eb46d5260
+   data.tar.gz: c598b3125d2c286c60a8c47f63a64b70228ac81620b2b0f39d3ea51ef61873dd3e6baa9842e2bf5285f28aee0a76cd6bf02614e83c95f14fc4e0e9ff81bb0caf
data/README.rdoc
CHANGED
@@ -15,7 +15,7 @@ Rating ranges are irrelevant to the algorithm, as rating averages are the key po

  Add this line to your application's Gemfile:

- gem 'co2_filter'
+ gem 'co2_filter'

  And then execute:

@@ -70,9 +70,28 @@ Attribute strength refers to a situation where attributes are applied in varying

  === Using Individual Filters

+ ==== Collaborative Filter
+
  To implement only the collaborative filter, just use:
  Co2Filter::Collaborative.filter(current_user: current_user, other_users: other_users)

+ The collaborative filter accepts an argument that determines by what process users are determined to be similar, called +measure+:
+ Co2Filter::Collaborative.filter(current_user: current_user, other_users: other_users, measure: :euclidean)
+ If is assigned to +:euclidean+, users will be considered similar based on the Euclidean distance of their rating sets. This is a straight-forward comparison that makes sense to most people, but may overlook some subtleties.
+ If assigned to +:cosine+, users' similarity is determined by a mean-based cosine coefficient. This essentially means that the curve formed by users' rating sets is compared by shape to others'. This technique is more reliable in some cases, but suffers considerably more in sparse data sets.
+ +:hybrid+ is the default measure, which represents an average of the above two measures. Inherently, this makes it slower, but averaging may prove more reliable overall. It unfortunately may also smooth over the uniquely accurate aspects of each technique, lowering opportunities for a surprisingly good recommendation (or surprisingly bad).
+
+ You may also feed precalculated similarity coefficients into the filter using the +similarity_coefficients+ argument:
+ similarity_coefficients = {
+ 'user1' => 0.56,
+ 'user2' => 0.8,
+ 'user3' => -0.4
+ # ...
+ }
+ Co2Filter::Collaborative.filter(current_user: current_user, other_users: other_users, similarity_coefficients: similarity_coefficients)
+
+ ==== Content-Based Filter
+
  To implement only the content-based filter, use:
  Co2Filter::ContentBased.filter(user: current_user, items: items)

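The three +measure+ options added to the README above can be illustrated with a small, self-contained sketch. This is not the gem's implementation: the helper names, the use of plain hashes of item id to rating, and the mapping of Euclidean distance into a similarity score are all assumptions made for illustration.

```ruby
# Illustrative helpers only; none of these methods belong to co2_filter.
# Each user is a plain Hash of item_id => numeric rating.

# Similarity derived from the Euclidean distance over co-rated items,
# mapped into (0, 1] so that identical ratings score 1.0.
def euclidean_similarity(a, b)
  shared = a.keys & b.keys
  return 0.0 if shared.empty?
  distance = Math.sqrt(shared.sum { |item| (a[item] - b[item])**2 })
  1.0 / (1.0 + distance)
end

# Cosine of the angle between the two users' mean-centered ratings,
# restricted to co-rated items. Ranges from -1.0 to 1.0.
def mean_centered_cosine_similarity(a, b)
  shared = a.keys & b.keys
  return 0.0 if shared.empty?
  mean_a = a.values.sum.to_f / a.size
  mean_b = b.values.sum.to_f / b.size
  dot = shared.sum { |item| (a[item] - mean_a) * (b[item] - mean_b) }
  norm_a = Math.sqrt(shared.sum { |item| (a[item] - mean_a)**2 })
  norm_b = Math.sqrt(shared.sum { |item| (b[item] - mean_b)**2 })
  return 0.0 if norm_a.zero? || norm_b.zero?
  dot / (norm_a * norm_b)
end

# A simple average of the two measures, in the spirit of :hybrid.
def hybrid_similarity(a, b)
  (euclidean_similarity(a, b) + mean_centered_cosine_similarity(a, b)) / 2.0
end

alice    = { 'a' => 5, 'b' => 3, 'c' => 1 }
opposite = { 'a' => 1, 'b' => 3, 'c' => 5 }
puts euclidean_similarity(alice, opposite)
puts mean_centered_cosine_similarity(alice, opposite)
puts hybrid_similarity(alice, opposite)
```

The cosine helper compares the shape of two rating sets after centering each on its own mean, which is why two users with mirrored tastes score close to -1 even when their average ratings match.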
@@ -83,6 +102,10 @@ The content-based filtering process consists of two steps:
  If you are interested in doing this process piecemeal (for instance, to save the user profile to the database for later use), you can do so:
  user_profile = Co2Filter::ContentBased.ratings_to_profile(user_ratings: current_user, items: items)
  Co2Filter::ContentBased.filter(user: user_profile, items: items)
+ Note that a <tt>Co2Filter::ContentBased::UserProfile</tt> object like the one returned must be submitted as the user to trigger this shortcut.
+
+ Separately run content-based filtering can be combined into the base (hybrid) filter by submitting the results (as a <tt>Co2Filter::Results</tt> object) to the filter as follows:
+ recommended = Co2Filter.filter(current_user: current_user, other_users: other_users, content_based_results: content_based_results)

  === Content-Boosted Collaborative Filtering

@@ -92,7 +115,7 @@ Content-boosted collaborative filtering can be used as follows:
  This is the most processor-intensive algorithm, but it too can be split up into multiple pieces if you wish:
  boosted_users = Co2Filter::ContentBased.boost_ratings(users: other_users, items: items)
  Co2Filter::Collaborative.filter(current_user: current_user, other_users: boosted_users)
- Note that the second step is simply the basic collaborative filter. If you wish to break up the +boost_ratings+ method even further, then you are actually talking about using the
+ Note that the second step is simply the basic collaborative filter. If you wish to break up the +boost_ratings+ method even further, then you are actually talking about using the <tt>Co2Filter::ContentBased.filter</tt> on each of the users. (See the definition for +boost_ratings+.)

  == Contributing

data/lib/co2_filter/collaborative.rb
CHANGED
@@ -1,9 +1,22 @@
  module Co2Filter::Collaborative
    autoload :Results, 'co2_filter/collaborative/results'

-   def self.filter(current_user
+   def self.filter(current_user: nil, other_users: nil, measure: :hybrid, similarity_coefficients: nil)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
+
      current_user = Co2Filter::RatingSet.new(current_user) unless current_user.is_a? Co2Filter::RatingSet
-     if
+     if similarity_coefficients.is_a? Hash
+       processed_users = other_users.inject({}) do |processed, (user_id, ratings)|
+         ratings = Co2Filter::RatingSet.new(ratings) unless ratings.is_a? Co2Filter::RatingSet
+         processed[user_id] = {
+           ratings: ratings,
+           mean: ratings.mean,
+           coefficient: similarity_coefficients[user_id]
+         } if similarity_coefficients[user_id]
+         processed
+       end
+     elsif measure == :euclidean
        processed_users = euclidean(current_user: current_user, other_users: other_users, num_nearest: 30)
      elsif measure == :cosine
        processed_users = mean_centered_cosine(current_user: current_user, other_users: other_users, num_nearest: 30)
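The new +similarity_coefficients+ branch in the hunk above keeps only the users that have an entry in the supplied hash. The sketch below mirrors that branch using plain hashes so it runs without the gem; the sample data is invented, and the arithmetic mean stands in for <tt>RatingSet#mean</tt>, whose exact behaviour is assumed rather than shown in this diff.

```ruby
# Stand-in data: the gem wraps ratings in Co2Filter::RatingSet,
# plain hashes are used here so the sketch is self-contained.
other_users = {
  'user1' => { 'item1' => 4, 'item2' => 2 },
  'user2' => { 'item1' => 5, 'item3' => 1 },
  'user3' => { 'item2' => 3 }
}

# Precalculated coefficients; 'user2' is deliberately missing.
similarity_coefficients = { 'user1' => 0.56, 'user3' => -0.4 }

processed_users = other_users.each_with_object({}) do |(user_id, ratings), processed|
  coefficient = similarity_coefficients[user_id]
  next unless coefficient # users without a coefficient are left out

  processed[user_id] = {
    ratings: ratings,
    mean: ratings.values.sum.to_f / ratings.size, # assumed equivalent of RatingSet#mean
    coefficient: coefficient
  }
end

puts processed_users.keys.inspect
```

A practical consequence is that the coefficient hash doubles as a neighbour selection: any user omitted from it contributes nothing to the result.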
@@ -39,7 +52,10 @@ module Co2Filter::Collaborative
      Results.new(item_ratings)
    end

-   def self.mean_centered_cosine(current_user
+   def self.mean_centered_cosine(current_user: nil, other_users: nil, num_nearest: 30)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
+
      processed = other_users.map do |key, user2|
        user2 = Co2Filter::RatingSet.new(user2) unless user2.is_a? Co2Filter::RatingSet
        [key, single_cosine(current_user, user2)]
@@ -85,7 +101,10 @@ module Co2Filter::Collaborative
      }
    end

-   def self.euclidean(current_user
+   def self.euclidean(current_user: nil, other_users: nil, num_nearest: 30, range:0)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
+
      if range == 0
        lowest = nil
        highest = nil
data/lib/co2_filter/content_based.rb
CHANGED
@@ -2,7 +2,10 @@ module Co2Filter::ContentBased
    autoload :Results, 'co2_filter/content_based/results'
    autoload :UserProfile, 'co2_filter/content_based/user_profile'

-   def self.filter(user
+   def self.filter(user: nil, items: nil)
+     raise ArgumentError.new("A 'user' argument must be provided.") unless user
+     raise ArgumentError.new("An 'items' argument must be provided.") unless items
+
      if(user.is_a?(UserProfile))
        user_profile = user
        new_items = items
@@ -25,7 +28,10 @@ module Co2Filter::ContentBased
      Results.new(results)
    end

-   def self.ratings_to_profile(user_ratings
+   def self.ratings_to_profile(user_ratings: nil, items: nil)
+     raise ArgumentError.new("A 'user_ratings' argument must be provided.") unless user_ratings
+     raise ArgumentError.new("An 'items' argument must be provided.") unless items
+
      user_ratings = Co2Filter::RatingSet.new(user_ratings) unless user_ratings.is_a? Co2Filter::RatingSet
      user_prefs = {}
      strength_normalizers = {}
@@ -47,7 +53,10 @@ module Co2Filter::ContentBased
      UserProfile.new(user_prefs, user_ratings.mean)
    end

-   def self.boost_ratings(users
+   def self.boost_ratings(users: nil, items: nil)
+     raise ArgumentError.new("A 'users' argument must be provided.") unless users
+     raise ArgumentError.new("An 'items' argument must be provided.") unless items
+
      users.inject({}) do |content_boosted_users, (user_id, ratings)|
        content_boosted_users[user_id] = ratings.merge(filter(user: ratings, items: items))
        content_boosted_users
data/lib/co2_filter/version.rb
CHANGED
data/lib/co2_filter.rb
CHANGED
@@ -1,6 +1,9 @@
  module Co2Filter
-   def self.filter(current_user: , other_users: , items: nil, user_profile: nil, content_based_results: nil)
+   def self.filter(current_user: nil, other_users: nil, items: nil, user_profile: nil, content_based_results: nil)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
      raise ArgumentError.new("An 'items' or 'content_based_results' argument must be provided.") unless items || content_based_results
+
      collab = Collaborative.filter(current_user: current_user, other_users: other_users)

      if content_based_results && content_based_results.is_a?(Results)
@@ -17,7 +20,11 @@ module Co2Filter
      Results.new(hybrid)
    end

-   def self.content_boosted_collaborative_filter(current_user
+   def self.content_boosted_collaborative_filter(current_user: nil, other_users: nil, items: nil)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
+     raise ArgumentError.new("An 'items' argument must be provided.") unless items
+
      content_boosted_users = ContentBased.boost_ratings(users: other_users, items: items)
      results = Collaborative.filter(current_user: current_user, other_users: content_boosted_users)
      Results.new(results)
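Across the Ruby files in this release, method signatures give their keyword arguments a +nil+ default and then raise +ArgumentError+ by hand; in co2_filter.rb this replaces the required-keyword form <tt>current_user: </tt>. The sketch below shows the pattern on a hypothetical method (+recommend+ is not part of the gem). One plausible motivation, not stated in the source, is that required keyword arguments only exist from Ruby 2.1 onward, whereas this form also parses on Ruby 2.0.

```ruby
# Hypothetical method used only to demonstrate the validation pattern.
def recommend(current_user: nil, other_users: nil)
  raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
  raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users

  [current_user, other_users]
end

begin
  recommend(other_users: {})
rescue ArgumentError => e
  puts e.message
end

# An empty hash is truthy in Ruby, so it passes the check.
recommend(current_user: {}, other_users: {})
```

Because the check is a truthiness test, an explicit +nil+ or +false+ is rejected in the same way as an omitted argument.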
metadata
CHANGED
@@ -1,55 +1,55 @@
  --- !ruby/object:Gem::Specification
  name: co2_filter
  version: !ruby/object:Gem::Version
-   version: 0.0
+   version: 0.1.0
  platform: ruby
  authors:
  - Tommy Orr
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-
+ date: 2016-03-09 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - -
+     - - ~>
        - !ruby/object:Gem::Version
          version: '1.11'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - -
+     - - ~>
        - !ruby/object:Gem::Version
          version: '1.11'
  - !ruby/object:Gem::Dependency
    name: rake
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - -
+     - - ~>
        - !ruby/object:Gem::Version
          version: '10.0'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - -
+     - - ~>
        - !ruby/object:Gem::Version
          version: '10.0'
  - !ruby/object:Gem::Dependency
    name: rspec
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - -
+     - - ~>
        - !ruby/object:Gem::Version
          version: '3.0'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - -
+     - - ~>
        - !ruby/object:Gem::Version
          version: '3.0'
  description:
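The <tt>~></tt> operator in the development dependencies above is the RubyGems pessimistic version constraint. A short sketch with the standard <tt>Gem::Requirement</tt> and <tt>Gem::Version</tt> classes shows which versions the three constraints listed in this gemspec accept; the candidate version numbers are arbitrary examples.

```ruby
require 'rubygems'

# Constraints copied from the gemspec metadata above.
constraints = {
  'bundler' => Gem::Requirement.new('~> 1.11'),
  'rake'    => Gem::Requirement.new('~> 10.0'),
  'rspec'   => Gem::Requirement.new('~> 3.0')
}

# Arbitrary candidate versions to test against each constraint.
candidates = {
  'bundler' => %w[1.10 1.11.2 1.12 2.0],
  'rake'    => %w[10.5 11.0],
  'rspec'   => %w[3.4 4.0]
}

candidates.each do |gem_name, versions|
  versions.each do |version|
    accepted = constraints[gem_name].satisfied_by?(Gem::Version.new(version))
    puts "#{gem_name} #{version}: #{accepted ? 'accepted' : 'rejected'}"
  end
end
```

With two version segments, <tt>~> 1.11</tt> behaves like <tt>>= 1.11</tt> combined with <tt>< 2.0</tt>, so minor and patch updates are allowed while the next major version is not.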
@@ -59,20 +59,20 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - MIT-LICENSE
- - README.rdoc
- - Rakefile
- - lib/co2_filter.rb
- - lib/co2_filter/collaborative.rb
  - lib/co2_filter/collaborative/results.rb
- - lib/co2_filter/
+ - lib/co2_filter/collaborative.rb
  - lib/co2_filter/content_based/results.rb
  - lib/co2_filter/content_based/user_profile.rb
+ - lib/co2_filter/content_based.rb
  - lib/co2_filter/hash_wrapper.rb
  - lib/co2_filter/rating_set.rb
  - lib/co2_filter/results.rb
  - lib/co2_filter/version.rb
+ - lib/co2_filter.rb
  - lib/tasks/co2_filter_tasks.rake
+ - MIT-LICENSE
+ - Rakefile
+ - README.rdoc
  homepage: https://github.com/comatose-turtle/co2_filter
  licenses:
  - MIT
@@ -83,17 +83,17 @@ require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
    requirements:
-   - -
+   - - '>='
      - !ruby/object:Gem::Version
        version: '0'
  required_rubygems_version: !ruby/object:Gem::Requirement
    requirements:
-   - -
+   - - '>='
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.
+ rubygems_version: 2.0.14
  signing_key:
  specification_version: 4
  summary: Uses both collaborative and content-based filtering methods to enable a complex,