predictor 2.0.0.rc1 → 2.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 29f61606a156b3a6132dc9212f3d027492285d2e
-  data.tar.gz: acf97ff88f34ca518536e18b7753fe27e69956ec
+  metadata.gz: 22ffddcaceb5b1a189aa7f0e237fad7469cb537e
+  data.tar.gz: 21c1b3aa6902f605c01e1de0cd91fa37c4e046e1
 SHA512:
-  metadata.gz: cd25db9f133ee47f44703e8e6cab7d0daed59b9a035abfac75d733d6891d0c312981f76875b0278a9b66285c3f374650b01fe4466f45b51c0d42685dafae4783
-  data.tar.gz: 4b01b5f70cdb3d4de6267e72502b50e8c8c7210f3d13855477b9d1dd9a4556b57787bca2827ddc7ac3060c915f032a2b32a50b5feb13709b9ecd5a86167c092c
+  metadata.gz: 1e6231526449192f10c34b73bee3500e52b13c7f4e9a3f6699b40c13e361dd7c098c28e199dd2eba2c8490532bf5c07f79af39027b865e31f7100fd5219ba855
+  data.tar.gz: d07e1994e3221003cc17fda719f2964ab35478a0f396735018fab0a12e278831a710eca7512ea538328a6473c0296a3a79295a48322f5b215a8b535be7e962b4
@@ -1,7 +1,7 @@
 =======
 Predictor Changelog
 =========
-2.0.0 (2014-03-07)
+2.0.0 (2014-04-17)
 ---------------------
 **Rewrite of 1.0.0 and contains several breaking changes!**
 
data/Gemfile CHANGED
@@ -1,10 +1,3 @@
 source 'https://rubygems.org'
 
-gem "redis"
-
-group :development do
-  gem "rake"
-  gem "rspec"
-  gem "yard"
-  gem "pry"
-end
+gemspec
data/README.md CHANGED
@@ -11,7 +11,7 @@ Originally forked and based on [Recommendify](https://github.com/paulasmuth/reco
 * Provide item similarities such as "Users that read this book also read ..."
 * Provide personalized predictions based on a user's past history, such as "You read these 10 books, so you might also like to read ..."
 
-At the moment, Predictor uses the [Jaccard index](http://en.wikipedia.org/wiki/Jaccard_index) to determine similarities between items. There are other ways to do this, which we intend to implement eventually, but if you want to beat us to the punch, pull requests are quite welcome :)
+At the moment, Predictor uses the [Jaccard index](http://en.wikipedia.org/wiki/Jaccard_index) or the [Sørensen–Dice coefficient](http://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient) (default is Jaccard) to determine similarities between items. There are other ways to do this, which we intend to implement eventually, but if you want to beat us to the punch, pull requests are quite welcome :)
 
 Notice
 ---------------------
@@ -19,11 +19,8 @@ This is the readme for Predictor 2.0, which contains a few breaking changes from
 
 Installation
 ---------------------
+In your Gemfile:
 ```ruby
-gem install predictor
-````
-or in your Gemfile:
-````
 gem 'predictor'
 ```
 Getting Started
@@ -59,7 +56,7 @@ class CourseRecommender
   limit_similarities_to 500 # Optional, but if specified, Predictor only caches the top x similarities for an item at any given time. Can greatly help with efficient use of Redis memory
   input_matrix :users, weight: 3.0
   input_matrix :tags, weight: 2.0
-  input_matrix :topics, weight: 1.0
+  input_matrix :topics, weight: 1.0, measure: :sorensen_coefficient # Use Sorenson over Jaccard
 end
 ```
 
@@ -191,6 +188,7 @@ predictor.topics.delete_item!("course-1")
 # to delete_from_matrix! if you want to update similarities to account for the deleted item (in v1, this was a bug and didn't occur)
 predictor.delete_from_matrix!(:topics, "course-1")
 ```
+* Regenerate your recommendations, as Redis keys have changed for Predictor 2. You can use recommender.clean! to clear out old similarities, then run your rake task (or whatever you've set up) to create new similarities.
 
 Problems? Issues? Want to help out?
 ---------------------
@@ -215,5 +213,4 @@ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
 FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
 COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
 IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
-CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
+CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Rakefile CHANGED
@@ -1,12 +1 @@
-require "rubygems"
-require "rspec"
-require 'rspec/core/rake_task'
-require "yard"
-
-desc "Run all examples"
-task RSpec::Core::RakeTask.new('spec')
-
-task :default => "spec"
-
-desc "Generate documentation"
-task YARD::Rake::YardocTask.new
+require "bundler/gem_tasks"
@@ -1,4 +1,5 @@
-require "predictor/version"
+require "redis"
 require "predictor/predictor"
+require "predictor/distance"
 require "predictor/input_matrix"
 require "predictor/base"
@@ -79,7 +79,7 @@ module Predictor::Base
       keys.concat(sets.map { |set| matrix.redis_key(:items, set) })
     end
 
-    keys.empty? ? [] : (Predictor.redis.sunion(keys) - [item])
+    keys.empty? ? [] : (Predictor.redis.sunion(keys) - [item.to_s])
   end
 
   def predictions_for(set=nil, item_set: nil, matrix_label: nil, with_scores: false, offset: 0, limit: -1, exclusion_set: [])
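The `item.to_s` change in the hunk above matters because Redis returns set members as strings, so subtracting a non-string item from the `SUNION` result removes nothing. A minimal sketch in plain Ruby, with no Redis involved; the `without_item` helper and the sample array are hypothetical and only stand in for the library code:

```ruby
# Hypothetical helper mirroring the subtraction done in the changed line.
# `members` stands in for what Predictor.redis.sunion(keys) would return.
def without_item(members, item)
  members - [item.to_s]
end

members = ["1", "2", "3"]

# Array#- matches elements by hash/eql?, so the Integer 1 never equals the String "1".
p members - [1]             # nothing is removed
p without_item(members, 1)  # the "1" entry is removed
```

The same reasoning applies to symbols: an item passed as `:course_1` would only be filtered out once it is converted to `"course_1"`.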
@@ -185,7 +185,7 @@ module Predictor::Base
   def cache_similarity(item1, item2)
     score = 0
     input_matrices.each do |key, matrix|
-      score += (matrix.calculate_jaccard(item1, item2) * matrix.weight)
+      score += (matrix.score(item1, item2) * matrix.weight)
     end
     if score > 0
       add_similarity_if_necessary(item1, item2, score)
@@ -213,4 +213,4 @@ module Predictor::Base
     end
     Predictor.redis.zadd(key, score, similarity) if store
   end
-end
+end
@@ -0,0 +1,31 @@
+module Predictor
+  module Distance
+    extend self
+
+    def jaccard_index(key_1, key_2, redis = Predictor.redis)
+      x, y = nil
+
+      redis.multi do |multi|
+        x = multi.sinterstore 'temp', [key_1, key_2]
+        y = multi.sunionstore 'temp', [key_1, key_2]
+        multi.del 'temp'
+      end
+
+      y.value > 0 ? (x.value.to_f/y.value.to_f) : 0.0
+    end
+
+    def sorensen_coefficient(key_1, key_2, redis = Predictor.redis)
+      x, y, z = nil
+
+      redis.multi do |multi|
+        x = multi.sinterstore 'temp', [key_1, key_2]
+        y = multi.scard key_1
+        z = multi.scard key_2
+        multi.del 'temp'
+      end
+
+      denom = (y.value + z.value)
+      denom > 0 ? (2 * (x.value) / denom.to_f) : 0.0
+    end
+  end
+end
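For readers who want to check the two measures by hand, here is a minimal in-memory sketch that uses Ruby's `Set` instead of Redis. The method names `jaccard` and `sorensen_dice` and the sample sets are illustrative assumptions, not part of the gem; they only mirror the arithmetic of `jaccard_index` (intersection size over union size) and `sorensen_coefficient` (twice the intersection size over the sum of both set sizes).

```ruby
require "set"

# Hypothetical in-memory counterpart of Distance.jaccard_index:
# |A ∩ B| / |A ∪ B|, or 0.0 when both sets are empty.
def jaccard(a, b)
  union = (a | b).size
  union > 0 ? (a & b).size.to_f / union : 0.0
end

# Hypothetical in-memory counterpart of Distance.sorensen_coefficient:
# 2 * |A ∩ B| / (|A| + |B|), or 0.0 when both sets are empty.
def sorensen_dice(a, b)
  denom = a.size + b.size
  denom > 0 ? 2.0 * (a & b).size / denom : 0.0
end

# Sample data: the sets that two items belong to.
bar   = Set["item1", "item2", "item3"]
snafu = Set["item2", "item3"]

puts jaccard(bar, snafu)        # 2 shared sets out of 3 distinct sets
puts sorensen_dice(bar, snafu)  # 2 * 2 shared sets over 3 + 2 memberships
```

For any pair of non-empty sets the Sørensen–Dice value is at least as large as the Jaccard value, so the two measures produce the same ranking within one matrix but different absolute scores once weights across matrices are summed.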
@@ -1,85 +1,82 @@
-class Predictor::InputMatrix
-  def initialize(opts)
-    @opts = opts
-  end
-
-  def parent_redis_key(*append)
-    ([@opts.fetch(:redis_prefix)] + append).flatten.compact.join(":")
-  end
+module Predictor
+  class InputMatrix
+    def initialize(opts)
+      @opts = opts
+    end
 
-  def redis_key(*append)
-    ([@opts.fetch(:redis_prefix), @opts.fetch(:key)] + append).flatten.compact.join(":")
-  end
+    def parent_redis_key(*append)
+      ([@opts.fetch(:redis_prefix)] + append).flatten.compact.join(":")
+    end
 
-  def weight
-    (@opts[:weight] || 1).to_f
-  end
+    def redis_key(*append)
+      ([@opts.fetch(:redis_prefix), @opts.fetch(:key)] + append).flatten.compact.join(":")
+    end
 
-  def add_to_set(set, *items)
-    items = items.flatten if items.count == 1 && items[0].is_a?(Array)
-    Predictor.redis.multi do
-      items.each { |item| add_single_nomulti(set, item) }
+    def weight
+      (@opts[:weight] || 1).to_f
     end
-  end
 
-  def add_set(set, items)
-    add_to_set(set, *items)
-  end
+    def add_to_set(set, *items)
+      items = items.flatten if items.count == 1 && items[0].is_a?(Array)
+      Predictor.redis.multi do
+        items.each { |item| add_single_nomulti(set, item) }
+      end
+    end
 
-  def add_single(set, item)
-    add_to_set(set, item)
-  end
+    def add_set(set, items)
+      add_to_set(set, *items)
+    end
 
-  def items_for(set)
-    Predictor.redis.smembers redis_key(:items, set)
-  end
+    def add_single(set, item)
+      add_to_set(set, item)
+    end
 
-  def sets_for(item)
-    Predictor.redis.sunion redis_key(:sets, item)
-  end
+    def items_for(set)
+      Predictor.redis.smembers redis_key(:items, set)
+    end
 
-  def related_items(item)
-    sets = Predictor.redis.smembers(redis_key(:sets, item))
-    keys = sets.map { |set| redis_key(:items, set) }
-    keys.length > 0 ? Predictor.redis.sunion(keys) - [item] : []
-  end
+    def sets_for(item)
+      Predictor.redis.sunion redis_key(:sets, item)
+    end
 
-  # delete item from the matrix
-  def delete_item(item)
-    Predictor.redis.watch(redis_key(:sets, item)) do
+    def related_items(item)
       sets = Predictor.redis.smembers(redis_key(:sets, item))
-      Predictor.redis.multi do |multi|
-        sets.each do |set|
-          multi.srem(redis_key(:items, set), item)
-        end
+      keys = sets.map { |set| redis_key(:items, set) }
+      keys.length > 0 ? Predictor.redis.sunion(keys) - [item.to_s] : []
+    end
+
+    # delete item from the matrix
+    def delete_item(item)
+      Predictor.redis.watch(redis_key(:sets, item)) do
+        sets = Predictor.redis.smembers(redis_key(:sets, item))
+        Predictor.redis.multi do |multi|
+          sets.each do |set|
+            multi.srem(redis_key(:items, set), item)
+          end
 
-        multi.del redis_key(:sets, item)
+          multi.del redis_key(:sets, item)
+        end
       end
     end
-  end
 
-  def calculate_jaccard(item1, item2)
-    x = nil
-    y = nil
-    Predictor.redis.multi do |multi|
-      x = multi.sinterstore 'temp', [redis_key(:sets, item1), redis_key(:sets, item2)]
-      y = multi.sunionstore 'temp', [redis_key(:sets, item1), redis_key(:sets, item2)]
-      multi.del 'temp'
+    def score(item1, item2)
+      measure_name = @opts.fetch(:measure, :jaccard_index)
+      Distance.send(measure_name, redis_key(:sets, item1), redis_key(:sets, item2), Predictor.redis)
     end
 
-    if y.value > 0
-      return (x.value.to_f/y.value.to_f)
-    else
-      return 0.0
+    def calculate_jaccard(item1, item2)
+      warn 'InputMatrix#calculate_jaccard is now deprecated. Use InputMatrix#score instead'
+      Distance.jaccard_index(redis_key(:sets, item1), redis_key(:sets, item2), Predictor.redis)
     end
-  end
 
-  private
+    private
+
+    def add_single_nomulti(set, item)
+      Predictor.redis.sadd(parent_redis_key(:all_items), item)
+      Predictor.redis.sadd(redis_key(:items, set), item)
+      # add the set to the item's set--inverting the sets
+      Predictor.redis.sadd(redis_key(:sets, item), set)
+    end
 
-  def add_single_nomulti(set, item)
-    Predictor.redis.sadd(parent_redis_key(:all_items), item)
-    Predictor.redis.sadd(redis_key(:items, set), item)
-    # add the set to the item's set--inverting the sets
-    Predictor.redis.sadd(redis_key(:sets, item), set)
   end
-end
+end
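`add_single_nomulti` records each membership twice: the item goes into the set's item list, and the set goes into the item's set list. `score` and `related_items` then work from the per-item set lists. A rough in-memory sketch of that inverted index, assuming plain Ruby hashes in place of Redis keys; the `TinyMatrix` class is hypothetical and the sample data is borrowed from the gem's spec:

```ruby
require "set"

# Hypothetical stand-in for InputMatrix that keeps both directions in memory.
class TinyMatrix
  def initialize
    @items_by_set = Hash.new { |hash, key| hash[key] = Set.new } # like redis_key(:items, set)
    @sets_by_item = Hash.new { |hash, key| hash[key] = Set.new } # like redis_key(:sets, item)
  end

  # Adds items to a set and records the inverse mapping for each item.
  def add_to_set(set, *items)
    items.flatten.each do |item|
      @items_by_set[set] << item
      @sets_by_item[item] << set
    end
  end

  # Sets that contain the item; an unknown item yields an empty set.
  def sets_for(item)
    @sets_by_item.fetch(item, Set.new)
  end

  # Every item that shares at least one set with the given item, excluding itself.
  def related_items(item)
    sets_for(item).flat_map { |set| @items_by_set[set].to_a }.uniq - [item.to_s]
  end
end

matrix = TinyMatrix.new
matrix.add_to_set "item1", "foo", "bar", "fnord", "blubb"
matrix.add_to_set "item2", "bar", "fnord", "shmoo", "snafu"
matrix.add_to_set "item3", "bar", "nada", "snafu"

p matrix.sets_for("bar").to_a
p matrix.sets_for("snafu").to_a
p matrix.related_items("snafu").sort
```

With this data, "bar" appears in three sets and "snafu" in two of them, which is where the `2.0/3.0` Jaccard expectation in the spec comes from.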
@@ -1,3 +1,3 @@
 module Predictor
-  VERSION = "2.0.0.rc1"
+  VERSION = "2.0.0"
 end
@@ -13,7 +13,10 @@ Gem::Specification.new do |s|
 
   s.add_dependency "redis", ">= 3.0.0"
 
-  s.add_development_dependency "rspec", "~> 2.8.0"
+  s.add_development_dependency "rspec", "~> 2.14.0"
+  s.add_development_dependency "rake"
+  s.add_development_dependency "pry"
+  s.add_development_dependency "yard"
 
   s.files = `git ls-files`.split("\n") - [".gitignore", ".rspec", ".travis.yml"]
   s.test_files = `git ls-files -- spec/*`.split("\n")
@@ -1,4 +1,4 @@
-require ::File.expand_path('../spec_helper', __FILE__)
+require 'spec_helper'
 
 describe Predictor::Base do
   class BaseRecommender
@@ -1,9 +1,13 @@
-require ::File.expand_path('../spec_helper', __FILE__)
+require 'spec_helper'
 
 describe Predictor::InputMatrix do
+  let(:options) { @default_options.merge(@options) }
+
+  before(:each) { @options = {} }
 
   before(:all) do
-    @matrix = Predictor::InputMatrix.new(:redis_prefix => "predictor-test", :key => "mymatrix")
+    @default_options = { redis_prefix: "predictor-test", key: "mymatrix" }
+    @matrix = Predictor::InputMatrix.new(@default_options)
   end
 
   before(:each) do
@@ -92,15 +96,54 @@ describe Predictor::InputMatrix do
     end
   end
 
-  it "should calculate the correct jaccard index" do
-    @matrix.add_to_set "item1", "foo", "bar", "fnord", "blubb"
-    @matrix.add_to_set "item2", "bar", "fnord", "shmoo", "snafu"
-    @matrix.add_to_set "item3", "bar", "nada", "snafu"
+  describe "#score" do
+    let(:matrix) { Predictor::InputMatrix.new(options) }
+
+    context "default" do
+      it "scores as jaccard index by default" do
+        matrix.add_to_set "item1", "foo", "bar", "fnord", "blubb"
+        matrix.add_to_set "item2", "bar", "fnord", "shmoo", "snafu"
+        matrix.add_to_set "item3", "bar", "nada", "snafu"
+
+        matrix.score("bar", "snafu").should == 2.0/3.0
+      end
+
+      it "scores as jaccard index when given option" do
+        matrix = Predictor::InputMatrix.new(options.merge(measure: :jaccard_index))
+        matrix.add_to_set "item1", "foo", "bar", "fnord", "blubb"
+        matrix.add_to_set "item2", "bar", "fnord", "shmoo", "snafu"
+        matrix.add_to_set "item3", "bar", "nada", "snafu"
 
-    @matrix.calculate_jaccard("bar", "snafu").should == 2.0/3.0
+        matrix.score("bar", "snafu").should == 2.0/3.0
+      end
+
+      it "should handle missing sets" do
+        matrix.add_to_set "item1", "foo", "bar", "fnord", "blubb"
+
+        matrix.score("is", "missing").should == 0.0
+      end
+    end
+
+    context "sorensen_coefficient" do
+      before { @options[:measure] = :sorensen_coefficient }
+
+      it "should calculate the correct sorensen index" do
+        matrix.add_to_set "item1", "foo", "bar", "fnord", "blubb"
+        matrix.add_to_set "item2", "fnord", "shmoo", "snafu"
+        matrix.add_to_set "item3", "bar", "nada", "snafu"
+
+        matrix.score("bar", "snafu").should == 2.0/4.0
+      end
+
+      it "should handle missing sets" do
+        matrix.add_to_set "item1", "foo", "bar", "fnord", "blubb"
+
+        matrix.score("is", "missing").should == 0.0
+      end
+    end
   end
 
-private
+  private
 
   def add_two_item_test_data!(matrix)
     matrix.add_to_set("user42", "fnord", "blubb")
@@ -118,4 +161,4 @@ private
     matrix.add_to_set("user50", "fnord", "shmoo")
   end
 
-end
+end
@@ -1,4 +1,4 @@
-require ::File.expand_path('../spec_helper', __FILE__)
+require 'spec_helper'
 
 describe Predictor do
 
@@ -1,9 +1,6 @@
-require "rspec"
-require "redis"
+require "predictor"
 require "pry"
 
-require ::File.expand_path('../../lib/predictor', __FILE__)
-
 def flush_redis!
   Predictor.redis = Redis.new
   Predictor.redis.keys("predictor-test*").each do |k|
@@ -19,22 +16,20 @@ module Predictor::Base
 
 end
 
-
 class TestRecommender
   include Predictor::Base
 
   input_matrix :jaccard_one
-
 end
 
 class Predictor::TestInputMatrix
 
   def initialize(opts)
-    @opts = opts
+    @opts = opts
   end
 
   def method_missing(method, *args)
-    @opts[method]
+    @opts[method]
   end
 
-end
+end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: predictor
 version: !ruby/object:Gem::Version
-  version: 2.0.0.rc1
+  version: 2.0.0
 platform: ruby
 authors:
 - Pathgather
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-03-08 00:00:00.000000000 Z
+date: 2014-04-17 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: redis
@@ -30,14 +30,56 @@ dependencies:
     requirements:
     - - ~>
       - !ruby/object:Gem::Version
-        version: 2.8.0
+        version: 2.14.0
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ~>
       - !ruby/object:Gem::Version
-        version: 2.8.0
+        version: 2.14.0
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: pry
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: yard
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - '>='
+      - !ruby/object:Gem::Version
+        version: '0'
 description: Fast and efficient recommendations and predictions using Redis
 email:
 - tech@pathgather.com
@@ -53,6 +95,7 @@ files:
 - docs/READMEv1.md
 - lib/predictor.rb
 - lib/predictor/base.rb
+- lib/predictor/distance.rb
 - lib/predictor/input_matrix.rb
 - lib/predictor/predictor.rb
 - lib/predictor/version.rb
@@ -76,9 +119,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
-  - - '>'
+  - - '>='
     - !ruby/object:Gem::Version
-      version: 1.3.1
+      version: '0'
 requirements: []
 rubyforge_project:
 rubygems_version: 2.1.11