jruby_mahout 0.2.1 → 0.2.2

data/README.md CHANGED
@@ -1,30 +1,76 @@
1
- # Jruby Mahout
2
- Jruby Mahout is a gem that unleashes the power of Apache Mahout in the world of Jruby. Mahout is a superior machine learning library written in Java. It deals with recommendations, clustering and classification machine learning problems at scale. Until now it was difficult to use it in Ruby projects. You'd have to implement Java interfaces in Jruby yourself, which is not quick especially if you just started exploring the world of machine learning.
1
+ # JRuby Mahout
2
+ Jruby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby. Mahout is a superior machine learning library written in Java. It deals with recommendations, clustering and classification machine learning problems at scale. Until now it was difficult to use it in Ruby projects. You'd have to implement Java interfaces in JRuby yourself, which is not quick especially if you just started exploring the world of machine learning.
3
3
 
4
- The goal of this library is to make machine learning at scale in Jruby projects simple.
4
+ The goal of this library is to make machine learning at scale in JRuby projects simple.
5
5
 
6
6
  ## Quick Overview
7
- This is an early version of a Jruby gem that only supports Mahout recommendations. It also includes a simple Postgres manager that can be used to manage appropriate recommendations tables. Unfortunately it's impossible to use ActiveRecord (AR) with Mahout, because AR at a mach higher level and creates a lot of overhead that is critical when dealing with millions of records in real time.
7
+ This is an early version of a JRuby gem that only supports Mahout recommendations. It also includes a simple Postgres manager that can be used to manage appropriate recommendations tables. Unfortunately it's impossible to use ActiveRecord (AR) with Mahout, because AR operates at a much higher level and creates a lot of overhead that is critical when dealing with millions of records in real time.
8
8
 
9
9
  ## Get Mahout
10
- First of all you need to download Mahout library from one of the [mirrors](http://www.apache.org/dyn/closer.cgi/mahout/). Jruby Mahout only supports Mahout 0.7 at this point.
10
+ First of all you need to download the Mahout library from one of the [mirrors](http://www.apache.org/dyn/closer.cgi/mahout/). Jruby Mahout only supports Mahout 0.7 at this point.
11
11
 
12
12
  ## Get Postgres JDBC Adapter
13
- If you wish to work with a database for recommendations, you'll have to install [JDBC driver for Postgres](http://jdbc.postgresql.org/download.html). Another option is to use file-based recommendation.
13
+ If you wish to work with a database for recommendations, you'll have to install the [JDBC driver for Postgres](http://jdbc.postgresql.org/download.html). Another option is to use file-based recommendations.
14
14
 
15
15
  ## Installation
16
- ### 1. Set environment variable MAHOUT_DIR to point at your Mahout installation.
16
+ ### 1. Set the environment variable MAHOUT_DIR to point at your Mahout installation.
17
17
  ### 2. Add the gem to your `Gemfile`
18
18
  ```ruby
19
19
  platform :jruby do
20
20
  gem "jruby_mahout"
21
21
  end
22
22
  ```
23
- And run `bundle install`.
23
+ ### 3. Run `bundle install`.
24
+
25
+ ## How to Use?
26
+ I am planning to add more examples covering Jruby Mahout use cases to [this repo](https://github.com/vasinov/jruby_mahout-examples) soon.
27
+
28
+ First, define the `MAHOUT_DIR` environmental variable for your Mahout installation. For example:
29
+
30
+ ```
31
+ export MAHOUT_DIR=/bin/mahout
32
+ ```
33
+
34
+ The easiest way to start working with Jruby Mahout recommendations is to initialize a recommender:
35
+ ```ruby
36
+ require 'jruby_mahout'
37
+ recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
38
+ ```
39
+
40
+ Set up a data model:
41
+ ```ruby
42
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "recommender_data.csv" }).data_model
43
+ ```
44
+
45
+ and get recommendations:
46
+ ```ruby
47
+ puts recommender.recommend(2, 10, nil) # 10 recommendations for user with id = 2
48
+ ```
49
+
50
+ You can evaluate your recommender to see how efficient it is:
51
+ ```ruby
52
+ puts recommender.evaluate(0.7, 0.3)
53
+ ```
54
+
55
+ The closer the score is to zero—the better.
56
+
57
+ I realize that it's a very sparse introduction to Jruby Mahout. I am working on the tutorial and better documentation that should cover this gem more in depth. Stay tuned at [my blog](http://www.vasinov.com/blog).
58
+
59
+ ## Development Plans
60
+ There are several things that should be supported by this gem, before it can be used in production. Some of them are:
61
+ - Hadoop integration
62
+ - Clustering support
63
+ - Classification support
64
+ - Better docs
65
+
66
+ If you feel like you can help—please do.
67
+
68
+ ## Testing
69
+ Jruby Mahout is thoroughly tested with Rspec.
24
70
 
25
71
  ## Contribute
26
72
  - Fork the project.
27
73
  - Write code for a feature or bug fix.
28
74
  - Add Rspec tests for it.
29
75
  - Commit, do not make changes to rakefile or version.
30
- - Submit a pull request.
76
+ - Submit a pull request.
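The README added above says that an evaluation score closer to zero is better. The diff does not show which metric `recommender.evaluate` computes, so the following is only a plain-Ruby sketch of one common error measure, the average absolute difference between actual and estimated ratings. The method name and the sample ratings are hypothetical.

```ruby
# Hypothetical illustration; the diff does not say which metric
# recommender.evaluate actually uses.
def average_absolute_difference(actual, estimated)
  raise ArgumentError, "length mismatch" unless actual.length == estimated.length
  raise ArgumentError, "no ratings given" if actual.empty?

  # Sum the absolute errors pairwise, then divide by the number of ratings.
  total = actual.zip(estimated).inject(0.0) { |sum, (a, e)| sum + (a - e).abs }
  total / actual.length
end

perfect = average_absolute_difference([4, 3, 5], [4, 3, 5])       # 0.0
rough   = average_absolute_difference([4, 3, 5], [3.5, 3.0, 4.0]) # 0.5
```

Under a measure like this, a score of zero means every held-out rating was estimated exactly, which matches the README's "closer to zero" reading.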
@@ -44,19 +44,27 @@ module JrubyMahout
44
44
  end
45
45
  end
46
46
 
47
- def upsert_record(record, name)
47
+ def upsert_record(table_name, record)
48
48
  begin
49
- @statement.execute("UPDATE #{name} SET user_id=#{record[:user_id]}, item_id=#{record[:item_id]}, rating=#{record[:rating]} WHERE user_id=#{record[:user_id]} AND item_id=#{record[:item_id]};")
50
- @statement.execute("INSERT INTO #{name} (user_id, item_id, rating) SELECT #{record[:user_id]}, #{record[:item_id]}, #{record[:rating]} WHERE NOT EXISTS (SELECT 1 FROM #{name} WHERE user_id=#{record[:user_id]} AND item_id=#{record[:item_id]});")
49
+ @statement.execute("UPDATE #{table_name} SET user_id=#{record[:user_id]}, item_id=#{record[:item_id]}, rating=#{record[:rating]} WHERE user_id=#{record[:user_id]} AND item_id=#{record[:item_id]};")
50
+ @statement.execute("INSERT INTO #{table_name} (user_id, item_id, rating) SELECT #{record[:user_id]}, #{record[:item_id]}, #{record[:rating]} WHERE NOT EXISTS (SELECT 1 FROM #{table_name} WHERE user_id=#{record[:user_id]} AND item_id=#{record[:item_id]});")
51
51
  rescue java.sql.SQLException => e
52
52
  puts e
53
53
  end
54
54
  end
55
55
 
56
- def create_table(name)
56
+ def delete_record(table_name, record)
57
+ begin
58
+ @statement.execute("DELETE FROM #{table_name} WHERE user_id=#{record[:user_id]} AND item_id=#{record[:item_id]};")
59
+ rescue java.sql.SQLException => e
60
+ puts e
61
+ end
62
+ end
63
+
64
+ def create_table(table_name)
57
65
  begin
58
66
  @statement.executeUpdate("
59
- CREATE TABLE #{name} (
67
+ CREATE TABLE #{table_name} (
60
68
  user_id BIGINT NOT NULL,
61
69
  item_id BIGINT NOT NULL,
62
70
  rating int NOT NULL,
@@ -64,18 +72,18 @@ module JrubyMahout
64
72
  PRIMARY KEY (user_id, item_id)
65
73
  );
66
74
  ")
67
- @statement.executeUpdate("CREATE INDEX #{name}_user_id_index ON #{name} (user_id);")
68
- @statement.executeUpdate("CREATE INDEX #{name}_item_id_index ON #{name} (item_id);")
75
+ @statement.executeUpdate("CREATE INDEX #{table_name}_user_id_index ON #{table_name} (user_id);")
76
+ @statement.executeUpdate("CREATE INDEX #{table_name}_item_id_index ON #{table_name} (item_id);")
69
77
  rescue java.sql.SQLException => e
70
78
  puts e
71
79
  end
72
80
  end
73
81
 
74
- def delete_table(name)
82
+ def delete_table(table_name)
75
83
  begin
76
- @statement.executeUpdate("DROP INDEX IF EXISTS #{name}_user_id_index;")
77
- @statement.executeUpdate("DROP INDEX IF EXISTS #{name}_item_id_index;")
78
- @statement.executeUpdate("DROP TABLE IF EXISTS #{name};")
84
+ @statement.executeUpdate("DROP INDEX IF EXISTS #{table_name}_user_id_index;")
85
+ @statement.executeUpdate("DROP INDEX IF EXISTS #{table_name}_item_id_index;")
86
+ @statement.executeUpdate("DROP TABLE IF EXISTS #{table_name};")
79
87
  rescue java.sql.SQLException => e
80
88
  puts e
81
89
  end
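The hunk above interpolates record values directly into SQL strings. As a minimal sketch of the same UPDATE-then-INSERT upsert pattern with basic input checking, the helper below builds both statements in plain Ruby. `upsert_statements` and the table-name pattern are assumptions made for illustration, not part of the gem, and the UPDATE here sets only `rating`, since the key columns already match the WHERE clause.

```ruby
# Hypothetical helper, not part of jruby_mahout: builds the two statements
# of the UPDATE-then-INSERT upsert pattern, coercing values with Integer()
# so that non-numeric input raises instead of being interpolated into SQL.
def upsert_statements(table_name, record)
  # Allow only plain identifiers as table names (an assumption of this sketch).
  unless table_name =~ /\A[A-Za-z_][A-Za-z0-9_]*\z/
    raise ArgumentError, "invalid table name: #{table_name.inspect}"
  end

  user_id = Integer(record[:user_id])
  item_id = Integer(record[:item_id])
  rating  = Integer(record[:rating])

  update = "UPDATE #{table_name} SET rating=#{rating} " \
           "WHERE user_id=#{user_id} AND item_id=#{item_id};"
  insert = "INSERT INTO #{table_name} (user_id, item_id, rating) " \
           "SELECT #{user_id}, #{item_id}, #{rating} " \
           "WHERE NOT EXISTS (SELECT 1 FROM #{table_name} " \
           "WHERE user_id=#{user_id} AND item_id=#{item_id});"

  [update, insert]
end

update_sql, insert_sql = upsert_statements("ratings", { :user_id => 1, :item_id => "2", :rating => 5 })
```

With a JDBC connection available, a prepared statement with bound parameters would likely be the more robust choice. The string-building form is used here only so the sketch runs without a database.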
@@ -34,18 +34,18 @@ module JrubyMahout
34
34
  end
35
35
 
36
36
  def similar_items(item_id, number_of_items, rescorer)
37
- if @recommender.nil? or @recommender_name == "GenericItemBasedRecommender"
37
+ if @recommender.nil? or @recommender_name == "GenericUserBasedRecommender"
38
38
  nil
39
39
  else
40
- @recommender.mostSimilarItems(item_id, number_of_items, rescorer)
40
+ to_array(@recommender.mostSimilarItems(item_id, number_of_items, rescorer))
41
41
  end
42
42
  end
43
43
 
44
- def similar_users(user_id, number_of_items, rescorer)
45
- if @recommender.nil? or @recommender_name == "GenericUserBasedRecommender"
44
+ def similar_users(user_id, number_of_users, rescorer)
45
+ if @recommender.nil? or @recommender_name == "GenericItemBasedRecommender"
46
46
  nil
47
47
  else
48
- @recommender.mostSimilarUserIDs(user_id, amount, rescorer)
48
+ to_array(@recommender.mostSimilarUserIDs(user_id, number_of_users, rescorer))
49
49
  end
50
50
  end
51
51
 
@@ -58,10 +58,10 @@ module JrubyMahout
58
58
  end
59
59
 
60
60
  def recommended_because(user_id, item_id, number_of_items)
61
- if @recommender.nil? or @recommender_name == "GenericItemBasedRecommender"
61
+ if @recommender.nil? or @recommender_name == "GenericUserBasedRecommender"
62
62
  nil
63
63
  else
64
- @recommender.recommendedBecause(user_id, item_id, number_of_items)
64
+ to_array(@recommender.recommendedBecause(user_id, item_id, number_of_items))
65
65
  end
66
66
  end
67
67
 
@@ -74,5 +74,15 @@ module JrubyMahout
74
74
 
75
75
  recommendations_array
76
76
  end
77
+
78
+ private
79
+ def to_array(things)
80
+ things_array = []
81
+ things.each do |thing_id|
82
+ things_array << thing_id
83
+ end
84
+
85
+ things_array
86
+ end
77
87
  end
78
88
  end
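The `to_array` helper added above copies whatever Mahout returns into a Ruby `Array`, which is what the new specs assert. The sketch below runs the same loop against a hypothetical stand-in class, `IdList`, that responds only to `each`, so it can be tried without JRuby or Mahout.

```ruby
# Hypothetical stand-in for a Java collection: it only responds to each.
class IdList
  def initialize(*ids)
    @ids = ids
  end

  def each
    @ids.each { |id| yield id }
  end
end

# Same logic as the private helper in the hunk above.
def to_array(things)
  things_array = []
  things.each do |thing_id|
    things_array << thing_id
  end

  things_array
end

similar = to_array(IdList.new(4, 8, 15))
```

Because the helper relies only on `each`, it should behave the same for any enumerable return value.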
@@ -9,6 +9,7 @@ module JrubyMahout
9
9
  java_import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity
10
10
 
11
11
  java_import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood
12
+ java_import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood
12
13
 
13
14
  java_import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender
14
15
  java_import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender
@@ -49,7 +50,11 @@ module JrubyMahout
49
50
  end
50
51
 
51
52
  unless @neighborhood_size.nil?
52
- neighborhood = NearestNUserNeighborhood.new(Integer(@neighborhood_size), similarity, data_model)
53
+ if @neighborhood_size > 1
54
+ neighborhood = NearestNUserNeighborhood.new(Integer(@neighborhood_size), similarity, data_model)
55
+ elsif @neighborhood_size >= -1 and @neighborhood_size <= 1
56
+ neighborhood = ThresholdUserNeighborhood.new(Float(@neighborhood_size), similarity, data_model)
57
+ end
53
58
  end
54
59
 
55
60
  case @recommender_name
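The branch added above selects a neighborhood type from the numeric value of `@neighborhood_size`. The sketch below mirrors that decision in plain Ruby and returns a symbol instead of constructing Mahout objects. `neighborhood_kind` is a hypothetical name used only for illustration.

```ruby
# Hypothetical helper mirroring the branch in the hunk above; it returns a
# symbol instead of building Mahout objects so it runs on any Ruby.
def neighborhood_kind(size)
  return nil if size.nil?

  if size > 1
    :nearest_n   # size is treated as a neighbor count
  elsif size >= -1 && size <= 1
    :threshold   # size is treated as a similarity threshold
  end            # values below -1 match neither branch and yield nil
end

count_based     = neighborhood_kind(5)    # :nearest_n
threshold_based = neighborhood_kind(0.7)  # :threshold
boundary        = neighborhood_kind(1)    # :threshold, since 1 is not greater than 1
out_of_range    = neighborhood_kind(-2)   # nil
```

One consequence of this condition is that a size of exactly 1 is read as a threshold of 1.0 rather than as a single nearest neighbor, and values below -1 leave the neighborhood unset.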
@@ -1,3 +1,3 @@
1
1
  module JrubyMahout
2
- VERSION = '0.2.1'
2
+ VERSION = '0.2.2'
3
3
  end
@@ -95,88 +95,176 @@ describe JrubyMahout::Recommender do
95
95
 
96
96
  describe ".recommend" do
97
97
  context "with valid arguments" do
98
- it "should return an array for PearsonCorrelationSimilarity and GenericUserBasedRecommender" do
99
- recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
100
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
101
-
102
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
103
- end
104
-
105
- it "should return an array for EuclideanDistanceSimilarity and GenericUserBasedRecommender" do
106
- recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", 5, "GenericUserBasedRecommender", false)
107
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
108
-
109
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
110
- end
111
-
112
- it "should return an array for SpearmanCorrelationSimilarity and GenericUserBasedRecommender" do
113
- recommender = JrubyMahout::Recommender.new("SpearmanCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
114
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
115
-
116
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
117
- end
118
-
119
- it "should return an array for LogLikelihoodSimilarity and GenericUserBasedRecommender" do
120
- recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", 5, "GenericUserBasedRecommender", false)
121
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
122
-
123
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
124
- end
125
-
126
- it "should return an array for TanimotoCoefficientSimilarity and GenericUserBasedRecommender" do
127
- recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", 5, "GenericUserBasedRecommender", false)
128
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
129
-
130
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
131
- end
132
-
133
- it "should return an array for GenericItemSimilarity and GenericUserBasedRecommender" do
134
- recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", 5, "GenericUserBasedRecommender", false)
135
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
136
-
137
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
138
- end
139
-
140
- it "should return an array for PearsonCorrelationSimilarity and GenericItemBasedRecommender" do
141
- recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
142
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
143
-
144
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
145
- end
146
-
147
- it "should return an array for EuclideanDistanceSimilarity and GenericItemBasedRecommender" do
148
- recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", nil, "GenericItemBasedRecommender", false)
149
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
150
-
151
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
152
- end
153
-
154
- it "should return an array for LogLikelihoodSimilarity and GenericItemBasedRecommender" do
155
- recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", nil, "GenericItemBasedRecommender", false)
156
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
157
-
158
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
159
- end
98
+ context "with NearestNUserNeighborhood" do
99
+ it "should return an array for PearsonCorrelationSimilarity and GenericUserBasedRecommender" do
100
+ recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
101
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
102
+
103
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
104
+ end
105
+
106
+ it "should return an array for EuclideanDistanceSimilarity and GenericUserBasedRecommender" do
107
+ recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", 5, "GenericUserBasedRecommender", false)
108
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
109
+
110
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
111
+ end
112
+
113
+ it "should return an array for SpearmanCorrelationSimilarity and GenericUserBasedRecommender" do
114
+ recommender = JrubyMahout::Recommender.new("SpearmanCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
115
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
116
+
117
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
118
+ end
119
+
120
+ it "should return an array for LogLikelihoodSimilarity and GenericUserBasedRecommender" do
121
+ recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", 5, "GenericUserBasedRecommender", false)
122
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
123
+
124
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
125
+ end
126
+
127
+ it "should return an array for TanimotoCoefficientSimilarity and GenericUserBasedRecommender" do
128
+ recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", 5, "GenericUserBasedRecommender", false)
129
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
130
+
131
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
132
+ end
133
+
134
+ it "should return an array for GenericItemSimilarity and GenericUserBasedRecommender" do
135
+ recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", 5, "GenericUserBasedRecommender", false)
136
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
137
+
138
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
139
+ end
140
+
141
+ it "should return an array for PearsonCorrelationSimilarity and GenericItemBasedRecommender" do
142
+ recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
143
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
144
+
145
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
146
+ end
147
+
148
+ it "should return an array for EuclideanDistanceSimilarity and GenericItemBasedRecommender" do
149
+ recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", nil, "GenericItemBasedRecommender", false)
150
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
151
+
152
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
153
+ end
154
+
155
+ it "should return an array for LogLikelihoodSimilarity and GenericItemBasedRecommender" do
156
+ recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", nil, "GenericItemBasedRecommender", false)
157
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
158
+
159
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
160
+ end
161
+
162
+ it "should return an array for TanimotoCoefficientSimilarity and GenericItemBasedRecommender" do
163
+ recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", nil, "GenericItemBasedRecommender", false)
164
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
165
+
166
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
167
+ end
168
+
169
+ it "should return an array for GenericItemSimilarity and GenericItemBasedRecommender" do
170
+ recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", nil, "GenericItemBasedRecommender", false)
171
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
172
+
173
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
174
+ end
175
+
176
+ it "should return an array for SlopeOneRecommender" do
177
+ recommender = JrubyMahout::Recommender.new("nil", nil, "SlopeOneRecommender", false)
178
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
179
+
180
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
181
+ end
182
+ end
183
+
184
+ context "with ThresholdUserNeighborhood" do
185
+ it "should return an array for PearsonCorrelationSimilarity and GenericUserBasedRecommender" do
186
+ recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", 0.7, "GenericUserBasedRecommender", false)
187
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
188
+
189
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
190
+ end
191
+
192
+ it "should return an array for EuclideanDistanceSimilarity and GenericUserBasedRecommender" do
193
+ recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", 0.7, "GenericUserBasedRecommender", false)
194
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
195
+
196
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
197
+ end
198
+
199
+ it "should return an array for SpearmanCorrelationSimilarity and GenericUserBasedRecommender" do
200
+ recommender = JrubyMahout::Recommender.new("SpearmanCorrelationSimilarity", 0.7, "GenericUserBasedRecommender", false)
201
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
202
+
203
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
204
+ end
205
+
206
+ it "should return an array for LogLikelihoodSimilarity and GenericUserBasedRecommender" do
207
+ recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", 0.7, "GenericUserBasedRecommender", false)
208
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
209
+
210
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
211
+ end
212
+
213
+ it "should return an array for TanimotoCoefficientSimilarity and GenericUserBasedRecommender" do
214
+ recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", 0.7, "GenericUserBasedRecommender", false)
215
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
216
+
217
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
218
+ end
219
+
220
+ it "should return an array for GenericItemSimilarity and GenericUserBasedRecommender" do
221
+ recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", 0.7, "GenericUserBasedRecommender", false)
222
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
223
+
224
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
225
+ end
226
+
227
+ it "should return an array for PearsonCorrelationSimilarity and GenericItemBasedRecommender" do
228
+ recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
229
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
230
+
231
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
232
+ end
233
+
234
+ it "should return an array for EuclideanDistanceSimilarity and GenericItemBasedRecommender" do
235
+ recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", nil, "GenericItemBasedRecommender", false)
236
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
237
+
238
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
239
+ end
240
+
241
+ it "should return an array for LogLikelihoodSimilarity and GenericItemBasedRecommender" do
242
+ recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", nil, "GenericItemBasedRecommender", false)
243
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
244
+
245
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
246
+ end
247
+
248
+ it "should return an array for TanimotoCoefficientSimilarity and GenericItemBasedRecommender" do
249
+ recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", nil, "GenericItemBasedRecommender", false)
250
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
251
+
252
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
253
+ end
254
+
255
+ it "should return an array for GenericItemSimilarity and GenericItemBasedRecommender" do
256
+ recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", nil, "GenericItemBasedRecommender", false)
257
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
258
+
259
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
260
+ end
261
+
262
+ it "should return an array for SlopeOneRecommender" do
263
+ recommender = JrubyMahout::Recommender.new("nil", nil, "SlopeOneRecommender", false)
264
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
160
265
 
161
- it "should return an array for TanimotoCoefficientSimilarity and GenericItemBasedRecommender" do
162
- recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", nil, "GenericItemBasedRecommender", false)
163
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
164
-
165
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
166
- end
167
-
168
- it "should return an array for GenericItemSimilarity and GenericItemBasedRecommender" do
169
- recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", nil, "GenericItemBasedRecommender", false)
170
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
171
-
172
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
173
- end
174
-
175
- it "should return an array for SlopeOneRecommender" do
176
- recommender = JrubyMahout::Recommender.new("nil", nil, "SlopeOneRecommender", false)
177
- recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
178
-
179
- recommender.recommend(1, 10, nil).should be_an_instance_of Array
266
+ recommender.recommend(1, 10, nil).should be_an_instance_of Array
267
+ end
180
268
  end
181
269
  end
182
270
 
@@ -293,4 +381,52 @@ describe JrubyMahout::Recommender do
293
381
  end
294
382
  end
295
383
  end
384
+
385
+ # TODO: cover all cases
386
+ describe ".similar_users" do
387
+ context "with valid arguments" do
388
+ it "should return an array of users" do
389
+ recommender = JrubyMahout::Recommender.new("SpearmanCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
390
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
391
+
392
+ recommender.similar_users(1, 10, nil).should be_an_instance_of Array
393
+ end
394
+ end
395
+ end
396
+
397
+ # TODO: cover all cases
398
+ describe ".similar_items" do
399
+ context "with valid arguments" do
400
+ it "should return an array of items" do
401
+ recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", nil, "GenericItemBasedRecommender", false)
402
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
403
+
404
+ recommender.similar_items(4, 10, nil).should be_an_instance_of Array
405
+ end
406
+ end
407
+ end
408
+
409
+ # TODO: cover all cases
410
+ describe ".recommended_because" do
411
+ context "with valid arguments" do
412
+ it "should return an array of items" do
413
+ recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
414
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
415
+
416
+ recommender.recommended_because(1, 138, 5).should be_an_instance_of Array
417
+ end
418
+ end
419
+ end
420
+
421
+ # TODO: cover all cases
422
+ describe ".estimate_preference" do
423
+ context "with valid arguments" do
424
+ it "should return afloat with an estimate" do
425
+ recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
426
+ recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
427
+
428
+ recommender.estimate_preference(1, 138).should be_an_instance_of Float
429
+ end
430
+ end
431
+ end
296
432
  end
metadata CHANGED
@@ -2,14 +2,14 @@
2
2
  name: jruby_mahout
3
3
  version: !ruby/object:Gem::Version
4
4
  prerelease:
5
- version: 0.2.1
5
+ version: 0.2.2
6
6
  platform: ruby
7
7
  authors:
8
8
  - Vasily Vasinov
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2012-12-10 00:00:00.000000000 Z
12
+ date: 2012-12-20 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: rake
@@ -47,7 +47,7 @@ dependencies:
47
47
  none: false
48
48
  prerelease: false
49
49
  type: :development
50
- description: Jruby Mahout is a gem that unleashes the power of Apache Mahout in the world of Jruby. Mahout is a superior machine learning library written in Java. It deals with recommendations, clustering and classification machine learning problems at scale. Until now it was difficult to use it in Ruby projects. You'd have to implement Java interfaces in Jruby yourself, which is not quick especially if you just started exploring the world of machine learning.
50
+ description: JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby. Mahout is a superior machine learning library written in Java. It deals with recommendations, clustering and classification machine learning problems at scale. Until now it was difficult to use it in Ruby projects. You'd have to implement Java interfaces in Jruby yourself, which is not quick especially if you just started exploring the world of machine learning.
51
51
  email:
52
52
  - vasinov@me.com
53
53
  executables: []
@@ -60,8 +60,6 @@ files:
60
60
  bGliL2pydWJ5X21haG91dC9kYXRhX21vZGVsLnJi
61
61
  - !binary |-
62
62
  bGliL2pydWJ5X21haG91dC9ldmFsdWF0b3IucmI=
63
- - !binary |-
64
- bGliL2pydWJ5X21haG91dC9tYWhvdXRfaW1wb3J0cy5yYg==
65
63
  - !binary |-
66
64
  bGliL2pydWJ5X21haG91dC9teXNxbF9tYW5hZ2VyLnJi
67
65
  - !binary |-
@@ -107,7 +105,7 @@ rubyforge_project:
107
105
  rubygems_version: 1.8.24
108
106
  signing_key:
109
107
  specification_version: 3
110
- summary: Jruby Mahout is a gem that unleashes the power of Apache Mahout in the world of Jruby.
108
+ summary: JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby.
111
109
  test_files:
112
110
  - !binary |-
113
111
  c3BlYy9yZWNvbW1lbmRlcl9kYXRhLmNzdg==
@@ -1,21 +0,0 @@
1
- # Recommenders
2
-
3
-
4
- # Neighborhoods
5
-
6
-
7
- # Recommenders
8
-
9
-
10
- # Weighting
11
-
12
-
13
- # Evaluators
14
-
15
-
16
- # Data Models
17
-
18
-
19
-
20
- # Postgres
21
-