co2_filter 0.0.1 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: aa78d78af6f6ed5e9dc74d4104f85759dc5299d6
-   data.tar.gz: d2b184ffa6deac97b8d0afc8e9020b92fc2263cd
+   metadata.gz: 200b98128f375b95bc95eeef7eee80e24f970f17
+   data.tar.gz: e1c8b8c12ef51beb83281a4af88045df6b27a7c2
  SHA512:
-   metadata.gz: 231065d35af899464b9aad9c690e29591eba7ae846f58b3886b3b5a0ce963fe69b370d2a6df52e37745b8ad6debadb09d10f8162a481a182e138afb596fc7b99
-   data.tar.gz: 46ffdb54af8bec7e72366bed2772d214344c65ee8c1a7d7cee23c33953f6aaab0075c86250a28add3c4901bd809ca519fba47d359a0a62ab7d06e51c099fe902
+   metadata.gz: 79e65ec47635ae4177f0f30aa420573d15c601fb5b4d521d3235046dd947f9c508d38fd1ef7121721e3071b2fc4a8c5eae3c36920870871368bfe21eb46d5260
+   data.tar.gz: c598b3125d2c286c60a8c47f63a64b70228ac81620b2b0f39d3ea51ef61873dd3e6baa9842e2bf5285f28aee0a76cd6bf02614e83c95f14fc4e0e9ff81bb0caf
data/README.rdoc CHANGED
@@ -15,7 +15,7 @@ Rating ranges are irrelevant to the algorithm, as rating averages are the key po
  
  Add this line to your application's Gemfile:
  
-   gem 'co2_filter', git: 'https://github.com/comatose-turtle/co2_filter.git'
+   gem 'co2_filter'
  
  And then execute:
  
@@ -70,9 +70,28 @@ Attribute strength refers to a situation where attributes are applied in varying
  
  === Using Individual Filters
  
+ ==== Collaborative Filter
+
  To implement only the collaborative filter, just use:
    Co2Filter::Collaborative.filter(current_user: current_user, other_users: other_users)
  
+ The collaborative filter accepts an argument that determines by what process users are determined to be similar, called +measure+:
+   Co2Filter::Collaborative.filter(current_user: current_user, other_users: other_users, measure: :euclidean)
+ If assigned to +:euclidean+, users will be considered similar based on the Euclidean distance of their rating sets. This is a straightforward comparison that makes sense to most people, but may overlook some subtleties.
+ If assigned to +:cosine+, users' similarity is determined by a mean-based cosine coefficient. This essentially means that the curve formed by users' rating sets is compared by shape to others'. This technique is more reliable in some cases, but suffers considerably more in sparse data sets.
+ +:hybrid+ is the default measure, which represents an average of the above two measures. Inherently, this makes it slower, but averaging may prove more reliable overall. It unfortunately may also smooth over the uniquely accurate aspects of each technique, lowering opportunities for a surprisingly good recommendation (or surprisingly bad).
+
+ You may also feed precalculated similarity coefficients into the filter using the +similarity_coefficients+ argument:
+   similarity_coefficients = {
+     'user1' => 0.56,
+     'user2' => 0.8,
+     'user3' => -0.4
+     # ...
+   }
+   Co2Filter::Collaborative.filter(current_user: current_user, other_users: other_users, similarity_coefficients: similarity_coefficients)
+
+ ==== Content-Based Filter
+
  To implement only the content-based filter, use:
    Co2Filter::ContentBased.filter(user: current_user, items: items)
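The three +measure+ options described in the README hunk above can be illustrated with a small standalone sketch. It does not call the gem: the helper names (+euclidean_similarity+, +mean_centered_cosine+, +hybrid_similarity+) and the 1 / (1 + d) mapping from distance to similarity are assumptions made for illustration, and the gem's own normalization may differ.

```ruby
# Illustrative only: rating sets are plain Hashes of { item_id => rating }.

# Items rated by both users.
def shared_items(a, b)
  a.keys & b.keys
end

# Euclidean distance over shared items, mapped into (0, 1] so that
# identical ratings give 1.0. The 1 / (1 + d) mapping is an assumption.
def euclidean_similarity(a, b)
  items = shared_items(a, b)
  return 0.0 if items.empty?
  distance = Math.sqrt(items.sum { |i| (a[i] - b[i])**2 })
  1.0 / (1.0 + distance)
end

# Cosine of the angle between the users' mean-centered ratings, so the
# "shape" of each rating curve matters rather than its overall level.
def mean_centered_cosine(a, b)
  items = shared_items(a, b)
  return 0.0 if items.empty?
  mean_a = a.values.sum.to_f / a.size
  mean_b = b.values.sum.to_f / b.size
  dot    = items.sum { |i| (a[i] - mean_a) * (b[i] - mean_b) }
  norm_a = Math.sqrt(items.sum { |i| (a[i] - mean_a)**2 })
  norm_b = Math.sqrt(items.sum { |i| (b[i] - mean_b)**2 })
  return 0.0 if norm_a.zero? || norm_b.zero?
  dot / (norm_a * norm_b)
end

# Plain average of the two measures, mirroring the README's description.
def hybrid_similarity(a, b)
  (euclidean_similarity(a, b) + mean_centered_cosine(a, b)) / 2.0
end

alice = { 'item1' => 5, 'item2' => 3, 'item3' => 1 }
bob   = { 'item1' => 4, 'item2' => 2, 'item3' => 0 } # same shape, one point lower
carol = { 'item1' => 1, 'item2' => 3, 'item3' => 5 } # opposite shape
```

In this sketch, +alice+ and +bob+ have a cosine similarity of about 1 despite a nonzero Euclidean distance, while +alice+ and +carol+ come out at about -1, which is the "compared by shape" behavior the README describes.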
 
@@ -83,6 +102,10 @@ The content-based filtering process consists of two steps:
  If you are interested in doing this process piecemeal (for instance, to save the user profile to the database for later use), you can do so:
    user_profile = Co2Filter::ContentBased.ratings_to_profile(user_ratings: current_user, items: items)
    Co2Filter::ContentBased.filter(user: user_profile, items: items)
+ Note that a <tt>Co2Filter::ContentBased::UserProfile</tt> object like the one returned must be submitted as the user to trigger this shortcut.
+
+ Separately run content-based filtering can be combined into the base (hybrid) filter by submitting the results (as a <tt>Co2Filter::Results</tt> object) to the filter as follows:
+   recommended = Co2Filter.filter(current_user: current_user, other_users: other_users, content_based_results: content_based_results)
  
  === Content-Boosted Collaborative Filtering
 
@@ -92,7 +115,7 @@ Content-boosted collaborative filtering can be used as follows:
  This is the most processor-intensive algorithm, but it too can be split up into multiple pieces if you wish:
    boosted_users = Co2Filter::ContentBased.boost_ratings(users: other_users, items: items)
    Co2Filter::Collaborative.filter(current_user: current_user, other_users: boosted_users)
- Note that the second step is simply the basic collaborative filter. If you wish to break up the +boost_ratings+ method even further, then you are actually talking about using the +Co2Filter::ContentBased.filter+ on each of the users. (See the definition for +boost_ratings+.)
+ Note that the second step is simply the basic collaborative filter. If you wish to break up the +boost_ratings+ method even further, then you are actually talking about using the <tt>Co2Filter::ContentBased.filter</tt> on each of the users. (See the definition for +boost_ratings+.)
  
  == Contributing
 
data/lib/co2_filter/collaborative.rb CHANGED
@@ -1,9 +1,22 @@
  module Co2Filter::Collaborative
    autoload :Results, 'co2_filter/collaborative/results'
  
-   def self.filter(current_user:, other_users:, measure: :hybrid)
+   def self.filter(current_user: nil, other_users: nil, measure: :hybrid, similarity_coefficients: nil)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
+
      current_user = Co2Filter::RatingSet.new(current_user) unless current_user.is_a? Co2Filter::RatingSet
-     if measure == :euclidean
+     if similarity_coefficients.is_a? Hash
+       processed_users = other_users.inject({}) do |processed, (user_id, ratings)|
+         ratings = Co2Filter::RatingSet.new(ratings) unless ratings.is_a? Co2Filter::RatingSet
+         processed[user_id] = {
+           ratings: ratings,
+           mean: ratings.mean,
+           coefficient: similarity_coefficients[user_id]
+         } if similarity_coefficients[user_id]
+         processed
+       end
+     elsif measure == :euclidean
        processed_users = euclidean(current_user: current_user, other_users: other_users, num_nearest: 30)
      elsif measure == :cosine
        processed_users = mean_centered_cosine(current_user: current_user, other_users: other_users, num_nearest: 30)
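The new +similarity_coefficients+ branch in the hunk above keeps only those users for whom a coefficient was supplied. A minimal standalone sketch of that selection follows; plain Hashes stand in for <tt>Co2Filter::RatingSet</tt>, and +mean_of+ and +select_with_coefficients+ are hypothetical helper names, not part of the gem.

```ruby
# Hypothetical helper: arithmetic mean of a { item_id => rating } Hash.
def mean_of(ratings)
  ratings.values.sum.to_f / ratings.size
end

# Mirrors the shape of the new branch: a user is kept only when a truthy
# coefficient exists for its id. In Ruby 0.0 is truthy, so a coefficient
# of zero still keeps the user; only nil (or false) drops it.
def select_with_coefficients(other_users, similarity_coefficients)
  other_users.each_with_object({}) do |(user_id, ratings), processed|
    coefficient = similarity_coefficients[user_id]
    next unless coefficient
    processed[user_id] = {
      ratings: ratings,
      mean: mean_of(ratings),
      coefficient: coefficient
    }
  end
end

other_users = {
  'user1' => { 'item1' => 4, 'item2' => 2 },
  'user2' => { 'item1' => 5 },
  'user3' => { 'item2' => 1 }
}
coefficients = { 'user1' => 0.56, 'user3' => 0.0 }

processed = select_with_coefficients(other_users, coefficients)
```

Here +user2+ is dropped because no coefficient was given for it. One consequence worth noting is that the +measure+ argument is bypassed whenever a Hash of coefficients is passed, since that branch is checked first.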
@@ -39,7 +52,10 @@ module Co2Filter::Collaborative
      Results.new(item_ratings)
    end
  
-   def self.mean_centered_cosine(current_user:, other_users:, num_nearest:)
+   def self.mean_centered_cosine(current_user: nil, other_users: nil, num_nearest: 30)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
+
      processed = other_users.map do |key, user2|
        user2 = Co2Filter::RatingSet.new(user2) unless user2.is_a? Co2Filter::RatingSet
        [key, single_cosine(current_user, user2)]
@@ -85,7 +101,10 @@ module Co2Filter::Collaborative
      }
    end
  
-   def self.euclidean(current_user:, other_users:, num_nearest:, range:0)
+   def self.euclidean(current_user: nil, other_users: nil, num_nearest: 30, range:0)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
+
      if range == 0
        lowest = nil
        highest = nil
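Throughout this release, required keyword arguments (<tt>current_user:</tt>) are replaced with +nil+ defaults plus an explicit +ArgumentError+. The sketch below reproduces that guard pattern in isolation; +GuardSketch+ is a hypothetical module used only for illustration. The diff does not state the motivation. One possibility is compatibility with Ruby 2.0, which lacks required keyword arguments, but that is an assumption.

```ruby
# Hypothetical stand-in for the guarded methods shown in the diff.
module GuardSketch
  def self.filter(current_user: nil, other_users: nil)
    raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
    raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users

    [current_user, other_users]
  end
end

# Omitting an argument yields the custom message rather than Ruby's
# built-in "missing keyword" error.
begin
  GuardSketch.filter(other_users: {})
rescue ArgumentError => e
  message = e.message
end

# An empty Hash is truthy in Ruby, so it passes the guard.
accepted = GuardSketch.filter(current_user: {}, other_users: {})
```

A side effect of this pattern is that passing +nil+ or +false+ explicitly is rejected in the same way as omitting the argument.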
data/lib/co2_filter/content_based.rb CHANGED
@@ -2,7 +2,10 @@ module Co2Filter::ContentBased
    autoload :Results, 'co2_filter/content_based/results'
    autoload :UserProfile, 'co2_filter/content_based/user_profile'
  
-   def self.filter(user:, items:)
+   def self.filter(user: nil, items: nil)
+     raise ArgumentError.new("A 'user' argument must be provided.") unless user
+     raise ArgumentError.new("An 'items' argument must be provided.") unless items
+
      if(user.is_a?(UserProfile))
        user_profile = user
        new_items = items
@@ -25,7 +28,10 @@ module Co2Filter::ContentBased
      Results.new(results)
    end
  
-   def self.ratings_to_profile(user_ratings:, items:)
+   def self.ratings_to_profile(user_ratings: nil, items: nil)
+     raise ArgumentError.new("A 'user_ratings' argument must be provided.") unless user_ratings
+     raise ArgumentError.new("An 'items' argument must be provided.") unless items
+
      user_ratings = Co2Filter::RatingSet.new(user_ratings) unless user_ratings.is_a? Co2Filter::RatingSet
      user_prefs = {}
      strength_normalizers = {}
@@ -47,7 +53,10 @@ module Co2Filter::ContentBased
      UserProfile.new(user_prefs, user_ratings.mean)
    end
  
-   def self.boost_ratings(users:, items:)
+   def self.boost_ratings(users: nil, items: nil)
+     raise ArgumentError.new("A 'users' argument must be provided.") unless users
+     raise ArgumentError.new("An 'items' argument must be provided.") unless items
+
      users.inject({}) do |content_boosted_users, (user_id, ratings)|
        content_boosted_users[user_id] = ratings.merge(filter(user: ratings, items: items))
        content_boosted_users
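The +boost_ratings+ body above merges each user's own ratings with content-based predictions. The following standalone sketch shows that idea with plain Hashes. The +predict+ callable is a hypothetical stand-in for the content-based filter, and restricting predictions to unrated items is an assumption about the intended behavior, not something this hunk confirms.

```ruby
# `predict` is a hypothetical content-based predictor: it receives a user's
# ratings and an item id and returns a predicted rating for that item.
def boost_ratings_sketch(users, item_ids, predict)
  users.each_with_object({}) do |(user_id, ratings), boosted|
    # Assumption: only items the user has not rated receive a prediction.
    unrated = item_ids - ratings.keys
    predictions = unrated.each_with_object({}) do |item_id, acc|
      acc[item_id] = predict.call(ratings, item_id)
    end
    # Hash#merge returns a new Hash; the original ratings are not mutated.
    boosted[user_id] = ratings.merge(predictions)
  end
end

# Toy predictor: guess the user's own mean rating for every unrated item.
mean_predictor = ->(ratings, _item_id) { ratings.values.sum.to_f / ratings.size }

users = {
  'user1' => { 'item1' => 4, 'item2' => 2 },
  'user2' => { 'item3' => 5 }
}
boosted = boost_ratings_sketch(users, %w[item1 item2 item3], mean_predictor)
```

The densified rating sets in +boosted+ are what would then be handed to a collaborative filter as +other_users+, which is the two-step split the README describes.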
data/lib/co2_filter/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Co2Filter
-   VERSION = "0.0.1"
+   VERSION = "0.1.0"
  end
data/lib/co2_filter.rb CHANGED
@@ -1,6 +1,9 @@
  module Co2Filter
-   def self.filter(current_user: , other_users: , items: nil, user_profile: nil, content_based_results: nil)
+   def self.filter(current_user: nil, other_users: nil, items: nil, user_profile: nil, content_based_results: nil)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
      raise ArgumentError.new("An 'items' or 'content_based_results' argument must be provided.") unless items || content_based_results
+
      collab = Collaborative.filter(current_user: current_user, other_users: other_users)
  
      if content_based_results && content_based_results.is_a?(Results)
@@ -17,7 +20,11 @@ module Co2Filter
      Results.new(hybrid)
    end
  
-   def self.content_boosted_collaborative_filter(current_user:, other_users:, items:)
+   def self.content_boosted_collaborative_filter(current_user: nil, other_users: nil, items: nil)
+     raise ArgumentError.new("A 'current_user' argument must be provided.") unless current_user
+     raise ArgumentError.new("An 'other_users' argument must be provided.") unless other_users
+     raise ArgumentError.new("An 'items' argument must be provided.") unless items
+
      content_boosted_users = ContentBased.boost_ratings(users: other_users, items: items)
      results = Collaborative.filter(current_user: current_user, other_users: content_boosted_users)
      Results.new(results)
metadata CHANGED
@@ -1,55 +1,55 @@
  --- !ruby/object:Gem::Specification
  name: co2_filter
  version: !ruby/object:Gem::Version
-   version: 0.0.1
+   version: 0.1.0
  platform: ruby
  authors:
  - Tommy Orr
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-02-19 00:00:00.000000000 Z
+ date: 2016-03-09 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
+     - - ~>
        - !ruby/object:Gem::Version
          version: '1.11'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
+     - - ~>
        - !ruby/object:Gem::Version
          version: '1.11'
  - !ruby/object:Gem::Dependency
    name: rake
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
+     - - ~>
        - !ruby/object:Gem::Version
          version: '10.0'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
+     - - ~>
        - !ruby/object:Gem::Version
          version: '10.0'
  - !ruby/object:Gem::Dependency
    name: rspec
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
+     - - ~>
        - !ruby/object:Gem::Version
          version: '3.0'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
+     - - ~>
        - !ruby/object:Gem::Version
          version: '3.0'
  description:
@@ -59,20 +59,20 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - MIT-LICENSE
- - README.rdoc
- - Rakefile
- - lib/co2_filter.rb
- - lib/co2_filter/collaborative.rb
  - lib/co2_filter/collaborative/results.rb
- - lib/co2_filter/content_based.rb
+ - lib/co2_filter/collaborative.rb
  - lib/co2_filter/content_based/results.rb
  - lib/co2_filter/content_based/user_profile.rb
+ - lib/co2_filter/content_based.rb
  - lib/co2_filter/hash_wrapper.rb
  - lib/co2_filter/rating_set.rb
  - lib/co2_filter/results.rb
  - lib/co2_filter/version.rb
+ - lib/co2_filter.rb
  - lib/tasks/co2_filter_tasks.rake
+ - MIT-LICENSE
+ - Rakefile
+ - README.rdoc
  homepage: https://github.com/comatose-turtle/co2_filter
  licenses:
  - MIT
@@ -83,17 +83,17 @@ require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
    requirements:
-   - - ">="
+   - - '>='
      - !ruby/object:Gem::Version
        version: '0'
  required_rubygems_version: !ruby/object:Gem::Requirement
    requirements:
-   - - ">="
+   - - '>='
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.4.5.1
+ rubygems_version: 2.0.14
  signing_key:
  specification_version: 4
  summary: Uses both collaborative and content-based filtering methods to enable a complex,