jruby_mahout 0.2.1 → 0.2.2
- data/README.md +55 -9
- data/lib/jruby_mahout/postgres_manager.rb +19 -11
- data/lib/jruby_mahout/recommender.rb +17 -7
- data/lib/jruby_mahout/recommender_builder.rb +6 -1
- data/lib/jruby_mahout/version.rb +1 -1
- data/spec/recommender_spec.rb +217 -81
- metadata +4 -6
- data/lib/jruby_mahout/mahout_imports.rb +0 -21
data/README.md
CHANGED
@@ -1,30 +1,76 @@
|
|
1
|
-
#
|
2
|
-
Jruby Mahout is a gem that unleashes the power of Apache Mahout in the world of
|
1
|
+
# JRuby Mahout
|
2
|
+
Jruby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby. Mahout is a superior machine learning library written in Java. It deals with recommendations, clustering and classification machine learning problems at scale. Until now it was difficult to use it in Ruby projects. You'd have to implement Java interfaces in JRuby yourself, which is not quick especially if you just started exploring the world of machine learning.
|
3
3
|
|
4
|
-
The goal of this library is to make machine learning at scale in
|
4
|
+
The goal of this library is to make machine learning at scale in JRuby projects simple.
|
5
5
|
|
6
6
|
## Quick Overview
|
7
|
-
This is an early version of a
|
7
|
+
This is an early version of a JRuby gem that only supports Mahout recommendations. It also includes a simple Postgres manager that can be used to manage appropriate recommendations tables. Unfortunately it's impossible to use ActiveRecord (AR) with Mahout, because AR operates at a much higher level and creates a lot of overhead that is critical when dealing with millions of records in real time.
|
8
8
|
|
9
9
|
## Get Mahout
|
10
|
-
First of all you need to download Mahout library from one of the [mirrors](http://www.apache.org/dyn/closer.cgi/mahout/). Jruby Mahout only supports Mahout 0.7 at this point.
|
10
|
+
First of all you need to download the Mahout library from one of the [mirrors](http://www.apache.org/dyn/closer.cgi/mahout/). Jruby Mahout only supports Mahout 0.7 at this point.
|
11
11
|
|
12
12
|
## Get Postgres JDBC Adapter
|
13
|
-
If you wish to work with a database for recommendations, you'll have to install [JDBC driver for Postgres](http://jdbc.postgresql.org/download.html). Another option is to use file-based
|
13
|
+
If you wish to work with a database for recommendations, you'll have to install the [JDBC driver for Postgres](http://jdbc.postgresql.org/download.html). Another option is to use file-based recommendations.
|
14
14
|
|
15
15
|
## Installation
|
16
|
-
### 1. Set environment variable MAHOUT_DIR to point at your Mahout installation.
|
16
|
+
### 1. Set the environment variable MAHOUT_DIR to point at your Mahout installation.
|
17
17
|
### 2. Add the gem to your `Gemfile`
|
18
18
|
```ruby
|
19
19
|
platform :jruby do
|
20
20
|
gem "jruby_mahout"
|
21
21
|
end
|
22
22
|
```
|
23
|
-
|
23
|
+
### 3. Run `bundle install`.
|
24
|
+
|
25
|
+
## How to Use?
|
26
|
+
I am planning to add more examples covering Jruby Mahout use cases to [this repo](https://github.com/vasinov/jruby_mahout-examples) soon.
|
27
|
+
|
28
|
+
First, define the `MAHOUT_DIR` environment variable for your Mahout installation. For example:
|
29
|
+
|
30
|
+
```
|
31
|
+
export MAHOUT_DIR=/bin/mahout
|
32
|
+
```
|
33
|
+
|
34
|
+
The easiest way to start working with Jruby Mahout recommendations is to initialize a recommender:
|
35
|
+
```ruby
|
36
|
+
require 'jruby_mahout'
|
37
|
+
recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
|
38
|
+
```
|
39
|
+
|
40
|
+
Set up a data model:
|
41
|
+
```ruby
|
42
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "recommender_data.csv" }).data_model
|
43
|
+
```
|
44
|
+
|
45
|
+
and get recommendations:
|
46
|
+
```ruby
|
47
|
+
puts recommender.recommend(2, 10, nil) # 10 recommendations for user with id = 2
|
48
|
+
```
|
49
|
+
|
50
|
+
You can evaluate your recommender to see how efficient it is:
|
51
|
+
```ruby
|
52
|
+
puts recommender.evaluate(0.7, 0.3)
|
53
|
+
```
|
54
|
+
|
55
|
+
The closer the score is to zero—the better.
|
56
|
+
|
57
|
+
I realize that this is a very sparse introduction to Jruby Mahout. I am working on a tutorial and better documentation that should cover this gem in more depth. Stay tuned at [my blog](http://www.vasinov.com/blog).
|
58
|
+
|
59
|
+
## Development Plans
|
60
|
+
There are several things that this gem should support before it can be used in production. Some of them are:
|
61
|
+
- Hadoop integration
|
62
|
+
- Clustering support
|
63
|
+
- Classification support
|
64
|
+
- Better docs
|
65
|
+
|
66
|
+
If you feel like you can help—please do.
|
67
|
+
|
68
|
+
## Testing
|
69
|
+
Jruby Mahout is thoroughly tested with Rspec.
|
24
70
|
|
25
71
|
## Contribute
|
26
72
|
- Fork the project.
|
27
73
|
- Write code for a feature or bug fix.
|
28
74
|
- Add Rspec tests for it.
|
29
75
|
- Commit, do not make changes to rakefile or version.
|
30
|
-
- Submit a pull request.
|
76
|
+
- Submit a pull request.
|
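The README says that a lower `evaluate` score is better. As a rough illustration only, assuming the evaluator reports an average absolute difference between predicted and held-out ratings (an assumption; the evaluator's implementation is not shown in this diff), the metric could be sketched in plain Ruby. `average_absolute_difference` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper (not part of jruby_mahout): mean of |predicted - actual|
# over paired ratings. A score of 0.0 means every prediction was exact.
def average_absolute_difference(predicted, actual)
  if predicted.empty? || predicted.size != actual.size
    raise ArgumentError, "rating lists must have the same non-zero length"
  end

  total = predicted.zip(actual).reduce(0.0) { |sum, (p, a)| sum + (p - a).abs }
  total / predicted.size
end

# Example: one prediction is off by one star, the other is exact.
score = average_absolute_difference([4.0, 3.0], [5.0, 3.0])
```

Under this assumption, a score near zero means the predicted ratings were close to the ratings that were held out for evaluation.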
data/lib/jruby_mahout/postgres_manager.rb
CHANGED
@@ -44,19 +44,27 @@ module JrubyMahout
|
|
44
44
|
end
|
45
45
|
end
|
46
46
|
|
47
|
-
def upsert_record(
|
47
|
+
def upsert_record(table_name, record)
|
48
48
|
begin
|
49
|
-
@statement.execute("UPDATE #{
|
50
|
-
@statement.execute("INSERT INTO #{
|
49
|
+
@statement.execute("UPDATE #{table_name} SET user_id=#{record[:user_id]}, item_id=#{record[:item_id]}, rating=#{record[:rating]} WHERE user_id=#{record[:user_id]} AND item_id=#{record[:item_id]};")
|
50
|
+
@statement.execute("INSERT INTO #{table_name} (user_id, item_id, rating) SELECT #{record[:user_id]}, #{record[:item_id]}, #{record[:rating]} WHERE NOT EXISTS (SELECT 1 FROM #{table_name} WHERE user_id=#{record[:user_id]} AND item_id=#{record[:item_id]});")
|
51
51
|
rescue java.sql.SQLException => e
|
52
52
|
puts e
|
53
53
|
end
|
54
54
|
end
|
55
55
|
|
56
|
-
def
|
56
|
+
def delete_record(table_name, record)
|
57
|
+
begin
|
58
|
+
@statement.execute("DELETE FROM #{table_name} WHERE user_id=#{record[:user_id]} AND item_id=#{record[:item_id]};")
|
59
|
+
rescue java.sql.SQLException => e
|
60
|
+
puts e
|
61
|
+
end
|
62
|
+
end
|
63
|
+
|
64
|
+
def create_table(table_name)
|
57
65
|
begin
|
58
66
|
@statement.executeUpdate("
|
59
|
-
CREATE TABLE #{
|
67
|
+
CREATE TABLE #{table_name} (
|
60
68
|
user_id BIGINT NOT NULL,
|
61
69
|
item_id BIGINT NOT NULL,
|
62
70
|
rating int NOT NULL,
|
@@ -64,18 +72,18 @@ module JrubyMahout
|
|
64
72
|
PRIMARY KEY (user_id, item_id)
|
65
73
|
);
|
66
74
|
")
|
67
|
-
@statement.executeUpdate("CREATE INDEX #{
|
68
|
-
@statement.executeUpdate("CREATE INDEX #{
|
75
|
+
@statement.executeUpdate("CREATE INDEX #{table_name}_user_id_index ON #{table_name} (user_id);")
|
76
|
+
@statement.executeUpdate("CREATE INDEX #{table_name}_item_id_index ON #{table_name} (item_id);")
|
69
77
|
rescue java.sql.SQLException => e
|
70
78
|
puts e
|
71
79
|
end
|
72
80
|
end
|
73
81
|
|
74
|
-
def delete_table(
|
82
|
+
def delete_table(table_name)
|
75
83
|
begin
|
76
|
-
@statement.executeUpdate("DROP INDEX IF EXISTS #{
|
77
|
-
@statement.executeUpdate("DROP INDEX IF EXISTS #{
|
78
|
-
@statement.executeUpdate("DROP TABLE IF EXISTS #{
|
84
|
+
@statement.executeUpdate("DROP INDEX IF EXISTS #{table_name}_user_id_index;")
|
85
|
+
@statement.executeUpdate("DROP INDEX IF EXISTS #{table_name}_item_id_index;")
|
86
|
+
@statement.executeUpdate("DROP TABLE IF EXISTS #{table_name};")
|
79
87
|
rescue java.sql.SQLException => e
|
80
88
|
puts e
|
81
89
|
end
|
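The `upsert_record` method above runs an `UPDATE` followed by an `INSERT ... WHERE NOT EXISTS`, interpolating record values straight into the SQL text. A minimal sketch of the same two-statement pattern with `?` placeholders follows, assuming the values would then be bound through something like a JDBC prepared statement. `upsert_statements` is a hypothetical helper, not part of the gem, and the binding step is deliberately left out.

```ruby
# Hypothetical helper (not part of jruby_mahout): builds the same two
# statements as upsert_record, but keeps values out of the SQL text.
def upsert_statements(table_name, record)
  # Table names cannot be bound as parameters, so only accept plain identifiers.
  unless table_name =~ /\A[a-z_][a-z0-9_]*\z/i
    raise ArgumentError, "invalid table name"
  end

  user_id, item_id, rating = record.values_at(:user_id, :item_id, :rating)

  # Only the rating changes; the original also reassigns the key columns to
  # their own values, which has no effect on the row.
  update = {
    :sql    => "UPDATE #{table_name} SET rating=? WHERE user_id=? AND item_id=?;",
    :params => [rating, user_id, item_id]
  }
  insert = {
    :sql    => "INSERT INTO #{table_name} (user_id, item_id, rating) " \
               "SELECT ?, ?, ? WHERE NOT EXISTS " \
               "(SELECT 1 FROM #{table_name} WHERE user_id=? AND item_id=?);",
    :params => [user_id, item_id, rating, user_id, item_id]
  }
  [update, insert]
end
```

Depending on the driver, the placeholders in the `SELECT ?, ?, ?` list may need explicit casts so that the database can infer their types.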
data/lib/jruby_mahout/recommender.rb
CHANGED
@@ -34,18 +34,18 @@ module JrubyMahout
|
|
34
34
|
end
|
35
35
|
|
36
36
|
def similar_items(item_id, number_of_items, rescorer)
|
37
|
-
if @recommender.nil? or @recommender_name == "
|
37
|
+
if @recommender.nil? or @recommender_name == "GenericUserBasedRecommender"
|
38
38
|
nil
|
39
39
|
else
|
40
|
-
@recommender.mostSimilarItems(item_id, number_of_items, rescorer)
|
40
|
+
to_array(@recommender.mostSimilarItems(item_id, number_of_items, rescorer))
|
41
41
|
end
|
42
42
|
end
|
43
43
|
|
44
|
-
def similar_users(user_id,
|
45
|
-
if @recommender.nil? or @recommender_name == "
|
44
|
+
def similar_users(user_id, number_of_users, rescorer)
|
45
|
+
if @recommender.nil? or @recommender_name == "GenericItemBasedRecommender"
|
46
46
|
nil
|
47
47
|
else
|
48
|
-
@recommender.mostSimilarUserIDs(user_id,
|
48
|
+
to_array(@recommender.mostSimilarUserIDs(user_id, number_of_users, rescorer))
|
49
49
|
end
|
50
50
|
end
|
51
51
|
|
@@ -58,10 +58,10 @@ module JrubyMahout
|
|
58
58
|
end
|
59
59
|
|
60
60
|
def recommended_because(user_id, item_id, number_of_items)
|
61
|
-
if @recommender.nil? or @recommender_name == "
|
61
|
+
if @recommender.nil? or @recommender_name == "GenericUserBasedRecommender"
|
62
62
|
nil
|
63
63
|
else
|
64
|
-
@recommender.recommendedBecause(user_id, item_id, number_of_items)
|
64
|
+
to_array(@recommender.recommendedBecause(user_id, item_id, number_of_items))
|
65
65
|
end
|
66
66
|
end
|
67
67
|
|
@@ -74,5 +74,15 @@ module JrubyMahout
|
|
74
74
|
|
75
75
|
recommendations_array
|
76
76
|
end
|
77
|
+
|
78
|
+
private
|
79
|
+
def to_array(things)
|
80
|
+
things_array = []
|
81
|
+
things.each do |thing_id|
|
82
|
+
things_array << thing_id
|
83
|
+
end
|
84
|
+
|
85
|
+
things_array
|
86
|
+
end
|
77
87
|
end
|
78
88
|
end
|
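The new private `to_array` helper copies each element of whatever Mahout returns into a plain Ruby `Array`, which is what the new specs assert on. The sketch below reproduces that helper against a stand-in collection. `FakeIdList` is hypothetical and only mimics an object that responds to `each`; whether the real Java return values also respond to `to_a` under JRuby is an assumption not confirmed by this diff.

```ruby
# Hypothetical stand-in for a Java collection: anything that responds to `each`.
class FakeIdList
  include Enumerable

  def initialize(*ids)
    @ids = ids
  end

  def each(&block)
    @ids.each(&block)
  end
end

# Same logic as the helper added in this release.
def to_array(things)
  things_array = []
  things.each do |thing_id|
    things_array << thing_id
  end

  things_array
end

ids = FakeIdList.new(3, 1, 2)
copied = to_array(ids) # a plain Array with the same elements in the same order
```

For any object that includes `Enumerable`, `ids.to_a` produces the same result as `to_array(ids)`.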
data/lib/jruby_mahout/recommender_builder.rb
CHANGED
@@ -9,6 +9,7 @@ module JrubyMahout
|
|
9
9
|
java_import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity
|
10
10
|
|
11
11
|
java_import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood
|
12
|
+
java_import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood
|
12
13
|
|
13
14
|
java_import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender
|
14
15
|
java_import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender
|
@@ -49,7 +50,11 @@ module JrubyMahout
|
|
49
50
|
end
|
50
51
|
|
51
52
|
unless @neighborhood_size.nil?
|
52
|
-
|
53
|
+
if @neighborhood_size > 1
|
54
|
+
neighborhood = NearestNUserNeighborhood.new(Integer(@neighborhood_size), similarity, data_model)
|
55
|
+
elsif @neighborhood_size >= -1 and @neighborhood_size <= 1
|
56
|
+
neighborhood = ThresholdUserNeighborhood.new(Float(@neighborhood_size), similarity, data_model)
|
57
|
+
end
|
53
58
|
end
|
54
59
|
|
55
60
|
case @recommender_name
|
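The hunk above overloads `@neighborhood_size`: a value greater than 1 is treated as a neighbor count for `NearestNUserNeighborhood`, while a value between -1 and 1 inclusive is treated as a similarity threshold for `ThresholdUserNeighborhood`. The branching can be sketched in plain Ruby without Mahout. `neighborhood_kind` is a hypothetical helper that returns a symbol instead of constructing a Java object.

```ruby
# Hypothetical helper (not part of jruby_mahout): mirrors the branching in the
# builder and reports which neighborhood class would be chosen.
def neighborhood_kind(neighborhood_size)
  return nil if neighborhood_size.nil?

  if neighborhood_size > 1
    :nearest_n   # NearestNUserNeighborhood with Integer(neighborhood_size)
  elsif neighborhood_size >= -1 && neighborhood_size <= 1
    :threshold   # ThresholdUserNeighborhood with Float(neighborhood_size)
  end
  # Values below -1 match neither branch, so no neighborhood is built.
end

neighborhood_kind(5)   # neighbor count
neighborhood_kind(0.7) # similarity threshold
```

Note that a size of exactly 1 falls into the threshold branch, so it is read as a similarity threshold of 1.0 rather than as a single nearest neighbor.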
data/lib/jruby_mahout/version.rb
CHANGED
data/spec/recommender_spec.rb
CHANGED
@@ -95,88 +95,176 @@ describe JrubyMahout::Recommender do
|
|
95
95
|
|
96
96
|
describe ".recommend" do
|
97
97
|
context "with valid arguments" do
|
98-159
|
-
[62 removed lines; content not captured]
|
98
|
+
context "with NearestNUserNeighborhood" do
|
99
|
+
it "should return an array for PearsonCorrelationSimilarity and GenericUserBasedRecommender" do
|
100
|
+
recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
|
101
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
102
|
+
|
103
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
104
|
+
end
|
105
|
+
|
106
|
+
it "should return an array for EuclideanDistanceSimilarity and GenericUserBasedRecommender" do
|
107
|
+
recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", 5, "GenericUserBasedRecommender", false)
|
108
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
109
|
+
|
110
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
111
|
+
end
|
112
|
+
|
113
|
+
it "should return an array for SpearmanCorrelationSimilarity and GenericUserBasedRecommender" do
|
114
|
+
recommender = JrubyMahout::Recommender.new("SpearmanCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
|
115
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
116
|
+
|
117
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
118
|
+
end
|
119
|
+
|
120
|
+
it "should return an array for LogLikelihoodSimilarity and GenericUserBasedRecommender" do
|
121
|
+
recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", 5, "GenericUserBasedRecommender", false)
|
122
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
123
|
+
|
124
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
125
|
+
end
|
126
|
+
|
127
|
+
it "should return an array for TanimotoCoefficientSimilarity and GenericUserBasedRecommender" do
|
128
|
+
recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", 5, "GenericUserBasedRecommender", false)
|
129
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
130
|
+
|
131
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
132
|
+
end
|
133
|
+
|
134
|
+
it "should return an array for GenericItemSimilarity and GenericUserBasedRecommender" do
|
135
|
+
recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", 5, "GenericUserBasedRecommender", false)
|
136
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
137
|
+
|
138
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
139
|
+
end
|
140
|
+
|
141
|
+
it "should return an array for PearsonCorrelationSimilarity and GenericItemBasedRecommender" do
|
142
|
+
recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
|
143
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
144
|
+
|
145
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
146
|
+
end
|
147
|
+
|
148
|
+
it "should return an array for EuclideanDistanceSimilarity and GenericItemBasedRecommender" do
|
149
|
+
recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", nil, "GenericItemBasedRecommender", false)
|
150
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
151
|
+
|
152
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
153
|
+
end
|
154
|
+
|
155
|
+
it "should return an array for LogLikelihoodSimilarity and GenericItemBasedRecommender" do
|
156
|
+
recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", nil, "GenericItemBasedRecommender", false)
|
157
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
158
|
+
|
159
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
160
|
+
end
|
161
|
+
|
162
|
+
it "should return an array for TanimotoCoefficientSimilarity and GenericItemBasedRecommender" do
|
163
|
+
recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", nil, "GenericItemBasedRecommender", false)
|
164
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
165
|
+
|
166
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
167
|
+
end
|
168
|
+
|
169
|
+
it "should return an array for GenericItemSimilarity and GenericItemBasedRecommender" do
|
170
|
+
recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", nil, "GenericItemBasedRecommender", false)
|
171
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
172
|
+
|
173
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
174
|
+
end
|
175
|
+
|
176
|
+
it "should return an array for SlopeOneRecommender" do
|
177
|
+
recommender = JrubyMahout::Recommender.new("nil", nil, "SlopeOneRecommender", false)
|
178
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
179
|
+
|
180
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
181
|
+
end
|
182
|
+
end
|
183
|
+
|
184
|
+
context "with ThresholdUserNeighborhood" do
|
185
|
+
it "should return an array for PearsonCorrelationSimilarity and GenericUserBasedRecommender" do
|
186
|
+
recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", 0.7, "GenericUserBasedRecommender", false)
|
187
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
188
|
+
|
189
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
190
|
+
end
|
191
|
+
|
192
|
+
it "should return an array for EuclideanDistanceSimilarity and GenericUserBasedRecommender" do
|
193
|
+
recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", 0.7, "GenericUserBasedRecommender", false)
|
194
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
195
|
+
|
196
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
197
|
+
end
|
198
|
+
|
199
|
+
it "should return an array for SpearmanCorrelationSimilarity and GenericUserBasedRecommender" do
|
200
|
+
recommender = JrubyMahout::Recommender.new("SpearmanCorrelationSimilarity", 0.7, "GenericUserBasedRecommender", false)
|
201
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
202
|
+
|
203
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
204
|
+
end
|
205
|
+
|
206
|
+
it "should return an array for LogLikelihoodSimilarity and GenericUserBasedRecommender" do
|
207
|
+
recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", 0.7, "GenericUserBasedRecommender", false)
|
208
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
209
|
+
|
210
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
211
|
+
end
|
212
|
+
|
213
|
+
it "should return an array for TanimotoCoefficientSimilarity and GenericUserBasedRecommender" do
|
214
|
+
recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", 0.7, "GenericUserBasedRecommender", false)
|
215
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
216
|
+
|
217
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
218
|
+
end
|
219
|
+
|
220
|
+
it "should return an array for GenericItemSimilarity and GenericUserBasedRecommender" do
|
221
|
+
recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", 0.7, "GenericUserBasedRecommender", false)
|
222
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
223
|
+
|
224
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
225
|
+
end
|
226
|
+
|
227
|
+
it "should return an array for PearsonCorrelationSimilarity and GenericItemBasedRecommender" do
|
228
|
+
recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
|
229
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
230
|
+
|
231
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
232
|
+
end
|
233
|
+
|
234
|
+
it "should return an array for EuclideanDistanceSimilarity and GenericItemBasedRecommender" do
|
235
|
+
recommender = JrubyMahout::Recommender.new("EuclideanDistanceSimilarity", nil, "GenericItemBasedRecommender", false)
|
236
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
237
|
+
|
238
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
239
|
+
end
|
240
|
+
|
241
|
+
it "should return an array for LogLikelihoodSimilarity and GenericItemBasedRecommender" do
|
242
|
+
recommender = JrubyMahout::Recommender.new("LogLikelihoodSimilarity", nil, "GenericItemBasedRecommender", false)
|
243
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
244
|
+
|
245
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
246
|
+
end
|
247
|
+
|
248
|
+
it "should return an array for TanimotoCoefficientSimilarity and GenericItemBasedRecommender" do
|
249
|
+
recommender = JrubyMahout::Recommender.new("TanimotoCoefficientSimilarity", nil, "GenericItemBasedRecommender", false)
|
250
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
251
|
+
|
252
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
253
|
+
end
|
254
|
+
|
255
|
+
it "should return an array for GenericItemSimilarity and GenericItemBasedRecommender" do
|
256
|
+
recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", nil, "GenericItemBasedRecommender", false)
|
257
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
258
|
+
|
259
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
260
|
+
end
|
261
|
+
|
262
|
+
it "should return an array for SlopeOneRecommender" do
|
263
|
+
recommender = JrubyMahout::Recommender.new("nil", nil, "SlopeOneRecommender", false)
|
264
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
160
265
|
|
161
|
-
|
162
|
-
|
163
|
-
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
164
|
-
|
165
|
-
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
166
|
-
end
|
167
|
-
|
168
|
-
it "should return an array for GenericItemSimilarity and GenericItemBasedRecommender" do
|
169
|
-
recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", nil, "GenericItemBasedRecommender", false)
|
170
|
-
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
171
|
-
|
172
|
-
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
173
|
-
end
|
174
|
-
|
175
|
-
it "should return an array for SlopeOneRecommender" do
|
176
|
-
recommender = JrubyMahout::Recommender.new("nil", nil, "SlopeOneRecommender", false)
|
177
|
-
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
178
|
-
|
179
|
-
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
266
|
+
recommender.recommend(1, 10, nil).should be_an_instance_of Array
|
267
|
+
end
|
180
268
|
end
|
181
269
|
end
|
182
270
|
|
@@ -293,4 +381,52 @@ describe JrubyMahout::Recommender do
|
|
293
381
|
end
|
294
382
|
end
|
295
383
|
end
|
384
|
+
|
385
|
+
# TODO: cover all cases
|
386
|
+
describe ".similar_users" do
|
387
|
+
context "with valid arguments" do
|
388
|
+
it "should return an array of users" do
|
389
|
+
recommender = JrubyMahout::Recommender.new("SpearmanCorrelationSimilarity", 5, "GenericUserBasedRecommender", false)
|
390
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
391
|
+
|
392
|
+
recommender.similar_users(1, 10, nil).should be_an_instance_of Array
|
393
|
+
end
|
394
|
+
end
|
395
|
+
end
|
396
|
+
|
397
|
+
# TODO: cover all cases
|
398
|
+
describe ".similar_items" do
|
399
|
+
context "with valid arguments" do
|
400
|
+
it "should return an array of items" do
|
401
|
+
recommender = JrubyMahout::Recommender.new("GenericItemSimilarity", nil, "GenericItemBasedRecommender", false)
|
402
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
403
|
+
|
404
|
+
recommender.similar_items(4, 10, nil).should be_an_instance_of Array
|
405
|
+
end
|
406
|
+
end
|
407
|
+
end
|
408
|
+
|
409
|
+
# TODO: cover all cases
|
410
|
+
describe ".recommended_because" do
|
411
|
+
context "with valid arguments" do
|
412
|
+
it "should return an array of items" do
|
413
|
+
recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
|
414
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
415
|
+
|
416
|
+
recommender.recommended_because(1, 138, 5).should be_an_instance_of Array
|
417
|
+
end
|
418
|
+
end
|
419
|
+
end
|
420
|
+
|
421
|
+
# TODO: cover all cases
|
422
|
+
describe ".estimate_preference" do
|
423
|
+
context "with valid arguments" do
|
424
|
+
it "should return a float with an estimate" do
|
425
|
+
recommender = JrubyMahout::Recommender.new("PearsonCorrelationSimilarity", nil, "GenericItemBasedRecommender", false)
|
426
|
+
recommender.data_model = JrubyMahout::DataModel.new("file", { :file_path => "spec/recommender_data.csv" }).data_model
|
427
|
+
|
428
|
+
recommender.estimate_preference(1, 138).should be_an_instance_of Float
|
429
|
+
end
|
430
|
+
end
|
431
|
+
end
|
296
432
|
end
|
metadata
CHANGED
@@ -2,14 +2,14 @@
|
|
2
2
|
name: jruby_mahout
|
3
3
|
version: !ruby/object:Gem::Version
|
4
4
|
prerelease:
|
5
|
-
version: 0.2.
|
5
|
+
version: 0.2.2
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
8
8
|
- Vasily Vasinov
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2012-12-
|
12
|
+
date: 2012-12-20 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: rake
|
@@ -47,7 +47,7 @@ dependencies:
|
|
47
47
|
none: false
|
48
48
|
prerelease: false
|
49
49
|
type: :development
|
50
|
-
description:
|
50
|
+
description: JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby. Mahout is a superior machine learning library written in Java. It deals with recommendations, clustering and classification machine learning problems at scale. Until now it was difficult to use it in Ruby projects. You'd have to implement Java interfaces in Jruby yourself, which is not quick especially if you just started exploring the world of machine learning.
|
51
51
|
email:
|
52
52
|
- vasinov@me.com
|
53
53
|
executables: []
|
@@ -60,8 +60,6 @@ files:
|
|
60
60
|
bGliL2pydWJ5X21haG91dC9kYXRhX21vZGVsLnJi
|
61
61
|
- !binary |-
|
62
62
|
bGliL2pydWJ5X21haG91dC9ldmFsdWF0b3IucmI=
|
63
|
-
- !binary |-
|
64
|
-
bGliL2pydWJ5X21haG91dC9tYWhvdXRfaW1wb3J0cy5yYg==
|
65
63
|
- !binary |-
|
66
64
|
bGliL2pydWJ5X21haG91dC9teXNxbF9tYW5hZ2VyLnJi
|
67
65
|
- !binary |-
|
@@ -107,7 +105,7 @@ rubyforge_project:
|
|
107
105
|
rubygems_version: 1.8.24
|
108
106
|
signing_key:
|
109
107
|
specification_version: 3
|
110
|
-
summary:
|
108
|
+
summary: JRuby Mahout is a gem that unleashes the power of Apache Mahout in the world of JRuby.
|
111
109
|
test_files:
|
112
110
|
- !binary |-
|
113
111
|
c3BlYy9yZWNvbW1lbmRlcl9kYXRhLmNzdg==
|