svmkit 0.4.0 → 0.4.1
- checksums.yaml +4 -4
- data/HISTORY.md +7 -0
- data/README.md +61 -25
- data/lib/svmkit.rb +4 -0
- data/lib/svmkit/linear_model/lasso.rb +5 -8
- data/lib/svmkit/linear_model/linear_regression.rb +159 -0
- data/lib/svmkit/linear_model/logistic_regression.rb +3 -2
- data/lib/svmkit/linear_model/ridge.rb +5 -6
- data/lib/svmkit/linear_model/svc.rb +3 -2
- data/lib/svmkit/linear_model/svr.rb +4 -7
- data/lib/svmkit/optimizer/nadam.rb +28 -2
- data/lib/svmkit/optimizer/rmsprop.rb +69 -0
- data/lib/svmkit/optimizer/sgd.rb +65 -0
- data/lib/svmkit/optimizer/yellow_fin.rb +144 -0
- data/lib/svmkit/polynomial_model/factorization_machine_classifier.rb +7 -9
- data/lib/svmkit/polynomial_model/factorization_machine_regressor.rb +7 -11
- data/lib/svmkit/version.rb +1 -1
- data/svmkit.gemspec +2 -2
- metadata +8 -4
checksums.yaml CHANGED
```diff
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: af30c20b06fec51d531364ad9ca1414ce2fe36cdbe61fd8a1a7128c793d67304
+  data.tar.gz: ba87c535aa723ec17334fd6819577dcb51d2d11ccef6adb967f73de1702522f5
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b32efe1dcd924c3e31ad0dc26dfbdcc86b0154b8b8591e58db5364103526b7dc828c46462b5f2dfe81c7c8ee23836ae8d4b81061cdf1ceb4f023c48cc78dd110
+  data.tar.gz: 6f38f301d23b3abc1037e1b0fe620e687da1fe44216a49707b2192d30fd8f2a7cb7690d6365580dda470e6852200db20b540c35947e3b1c54d8f8b5b599b2dc0
```
data/HISTORY.md CHANGED
```diff
@@ -1,3 +1,10 @@
+# 0.4.1
+- Add class for linear regressor.
+- Add class for SGD optimizer.
+- Add class for RMSProp optimizer.
+- Add class for YellowFin optimizer.
+- Fix to be able to select optimizer on estimators of LinearModel and PolynomialModel.
+
 # 0.4.0
 ## Breaking changes
 
```
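Taken together, these entries mean that every LinearModel and PolynomialModel estimator now accepts an `optimizer:` keyword. The sketch below shows the intended wiring; the toy data and hyperparameter values are invented for illustration, while the class and parameter names come from the diffs that follow.

```ruby
require 'svmkit'

# Invented toy data: y = x1 + 2 * x2.
x = Numo::DFloat.cast([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = Numo::DFloat.cast([1.0, 2.0, 3.0, 4.0])

# Any optimizer object responding to #call can be handed over; when the
# keyword is left nil, the estimator now falls back to Optimizer::Nadam.
opt = SVMKit::Optimizer::SGD.new(learning_rate: 0.05, momentum: 0.9, decay: 0.001)
reg = SVMKit::LinearModel::LinearRegression.new(
  optimizer: opt, fit_bias: true, max_iter: 1000, batch_size: 4, random_seed: 1
)
reg.fit(x, y)
p reg.predict(x).to_a
```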
data/README.md CHANGED
````diff
@@ -8,8 +8,8 @@
 SVMKit is a machine learninig library in Ruby.
 SVMKit provides machine learning algorithms with interfaces similar to Scikit-Learn in Python.
 SVMKit currently supports Linear / Kernel Support Vector Machine,
-Logistic Regression, Ridge, Lasso, Factorization Machine,
-K-nearest neighbor classifier, and cross-validation.
+Logistic Regression, Linear Regression, Ridge, Lasso, Factorization Machine,
+Naive Bayes, Decision Tree, Random Forest, K-nearest neighbor classifier, and cross-validation.
 
 ## Installation
 
@@ -29,61 +29,97 @@ Or install it yourself as:
 
 ## Usage
 
-
+### Example 1. Pendigits dataset classification
+
+SVMKit provides function loading libsvm format dataset file.
+We start by downloading the pendigits dataset from LIBSVM Data web site.
+
+```bash
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/pendigits
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/pendigits.t
+```
+
+Training of the classifier with Linear SVM and RBF kernel feature map is the following code.
 
 ```ruby
 require 'svmkit'
 
+# Load the training dataset.
 samples, labels = SVMKit::Dataset.load_libsvm_file('pendigits')
 
-
-
+# If the features consists only of integers, load_libsvm_file method reads in Numo::Int32 format.
+# As necessary, you should convert sample array to Numo::DFloat format.
+samples = Numo::DFloat.cast(samples)
 
-
-
+# Map training data to RBF kernel feature space.
+transformer = SVMKit::KernelApproximation::RBF.new(gamma: 0.0001, n_components: 1024, random_seed: 1)
+transformed = transformer.fit_transform(samples)
 
-
+# Train linear SVM classifier.
+classifier = SVMKit::LinearModel::SVC.new(reg_param: 0.0001, max_iter: 1000, batch_size: 50, random_seed: 1)
 classifier.fit(transformed, labels)
 
-
-File.open('
-File.open('
+# Save the model.
+File.open('transformer.dat', 'wb') { |f| f.write(Marshal.dump(transformer)) }
+File.open('classifier.dat', 'wb') { |f| f.write(Marshal.dump(classifier)) }
 ```
 
-
+Classifying testing data with the trained classifier is the following code.
 
 ```ruby
 require 'svmkit'
 
+# Load the testing dataset.
 samples, labels = SVMKit::Dataset.load_libsvm_file('pendigits.t')
+samples = Numo::DFloat.cast(samples)
 
-
-transformer = Marshal.load(File.binread('
-classifier = Marshal.load(File.binread('
+# Load the model.
+transformer = Marshal.load(File.binread('transformer.dat'))
+classifier = Marshal.load(File.binread('classifier.dat'))
+
+# Map testing data to RBF kernel feature space.
+transformed = transformer.transform(samples)
+
+# Classify the testing data and evaluate prediction results.
+puts("Accuracy: %.1f%%" % (100.0 * classifier.score(transformed, labels)))
+
+# Other evaluating approach
+# results = classifier.predict(transformed)
+# evaluator = SVMKit::EvaluationMeasure::Accuracy.new
+# puts("Accuracy: %.1f%%" % (100.0 * evaluator.score(results, labels)))
+```
 
-
-transformed = transformer.transform(normalized)
+Execution of the above scripts result in the following.
 
-
+```bash
+$ ruby train.rb
+$ ruby test.rb
+Accuracy: 98.4%
 ```
 
-
+### Example 2. Cross-validation
 
 ```ruby
 require 'svmkit'
 
+# Load dataset.
 samples, labels = SVMKit::Dataset.load_libsvm_file('pendigits')
+samples = Numo::DFloat.cast(samples)
 
-
+# Define the estimator to be evaluated.
+lr = SVMKit::LinearModel::LogisticRegression.new(reg_param: 0.0001, random_seed: 1)
 
+# Define the evaluation measure, splitting strategy, and cross validation.
+ev = SVMKit::EvaluationMeasure::LogLoss.new
 kf = SVMKit::ModelSelection::StratifiedKFold.new(n_splits: 5, shuffle: true, random_seed: 1)
-cv = SVMKit::ModelSelection::CrossValidation.new(estimator:
+cv = SVMKit::ModelSelection::CrossValidation.new(estimator: lr, splitter: kf, evaluator: ev)
 
-
-report = cv.perform(
+# Perform 5-cross validation.
+report = cv.perform(samples, labels)
 
-
-
+# Output result.
+mean_logloss = report[:test_score].inject(:+) / kf.n_splits
+puts("5-CV mean log-loss: %.3f" % mean_logloss)
 ```
 
 ## Development
````
data/lib/svmkit.rb CHANGED
```diff
@@ -13,11 +13,15 @@ require 'svmkit/base/regressor'
 require 'svmkit/base/transformer'
 require 'svmkit/base/splitter'
 require 'svmkit/base/evaluator'
+require 'svmkit/optimizer/sgd'
+require 'svmkit/optimizer/rmsprop'
 require 'svmkit/optimizer/nadam'
+require 'svmkit/optimizer/yellow_fin'
 require 'svmkit/kernel_approximation/rbf'
 require 'svmkit/linear_model/svc'
 require 'svmkit/linear_model/svr'
 require 'svmkit/linear_model/logistic_regression'
+require 'svmkit/linear_model/linear_regression'
 require 'svmkit/linear_model/ridge'
 require 'svmkit/linear_model/lasso'
 require 'svmkit/kernel_machine/kernel_svc'
```
data/lib/svmkit/linear_model/lasso.rb CHANGED
```diff
@@ -43,7 +43,7 @@ module SVMKit
       # @param max_iter [Integer] The maximum number of iterations.
       # @param batch_size [Integer] The size of the mini batches.
       # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-      #
+      #   If nil is given, Nadam is used.
       # @param random_seed [Integer] The seed value using to initialize the random generator.
       def initialize(reg_param: 1.0, fit_bias: false, max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
         check_params_float(reg_param: reg_param)
@@ -57,6 +57,7 @@ module SVMKit
         @params[:max_iter] = max_iter
         @params[:batch_size] = batch_size
         @params[:optimizer] = optimizer
+        @params[:optimizer] ||= Optimizer::Nadam.new
         @params[:random_seed] = random_seed
         @params[:random_seed] ||= srand
         @weight_vec = nil
@@ -80,11 +81,7 @@ module SVMKit
         if n_outputs > 1
           @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
           @bias_term = Numo::DFloat.zeros(n_outputs)
-          n_outputs.times do |n|
-            weight, bias = single_fit(x, y[true, n])
-            @weight_vec[n, true] = weight
-            @bias_term[n] = bias
-          end
+          n_outputs.times { |n| @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
         else
           @weight_vec, @bias_term = single_fit(x, y)
         end
@@ -131,8 +128,8 @@ module SVMKit
         weight_vec = Numo::DFloat.zeros(n_features)
         left_weight_vec = Numo::DFloat.zeros(n_features)
         right_weight_vec = Numo::DFloat.zeros(n_features)
-        left_optimizer =
-        right_optimizer =
+        left_optimizer = @params[:optimizer].dup
+        right_optimizer = @params[:optimizer].dup
         # Start optimization.
         @params[:max_iter].times do |_t|
           # Random sampling.
```
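Note the pattern introduced in the last hunk: instead of hard-coding Nadam inside `single_fit`, the estimator takes `dup` copies of the user-supplied optimizer, one per weight group (here Lasso's left and right vectors), so each group accumulates its own moments. A small sketch of why the shallow copy suffices; SGD is chosen arbitrarily, and the guarantee rests on `call` reassigning its state variables rather than mutating them in place.

```ruby
require 'svmkit'

template = SVMKit::Optimizer::SGD.new(learning_rate: 0.1, momentum: 0.9, decay: 0.01)
left  = template.dup # independent from here on: #call rebinds @iter and
right = template.dup # @update instead of mutating shared objects

left.call(Numo::DFloat.cast([0.0]), Numo::DFloat.cast([1.0]))
# right's momentum buffer and iteration counter remain untouched.
```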
data/lib/svmkit/linear_model/linear_regression.rb ADDED
```ruby
# frozen_string_literal: true

require 'svmkit/validation'
require 'svmkit/base/base_estimator'
require 'svmkit/base/regressor'
require 'svmkit/optimizer/nadam'

module SVMKit
  module LinearModel
    # LinearRegression is a class that implements ordinary least square linear regression
    # with mini-batch stochastic gradient descent optimization.
    #
    # @example
    #   estimator =
    #     SVMKit::LinearModel::LinearRegression.new(max_iter: 1000, batch_size: 20, random_seed: 1)
    #   estimator.fit(training_samples, traininig_values)
    #   results = estimator.predict(testing_samples)
    #
    class LinearRegression
      include Base::BaseEstimator
      include Base::Regressor
      include Validation

      # Return the weight vector.
      # @return [Numo::DFloat] (shape: [n_outputs, n_features])
      attr_reader :weight_vec

      # Return the bias term (a.k.a. intercept).
      # @return [Numo::DFloat] (shape: [n_outputs])
      attr_reader :bias_term

      # Return the random generator for random sampling.
      # @return [Random]
      attr_reader :rng

      # Create a new ordinary least square linear regressor.
      #
      # @param fit_bias [Boolean] The flag indicating whether to fit the bias term.
      # @param max_iter [Integer] The maximum number of iterations.
      # @param batch_size [Integer] The size of the mini batches.
      # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
      #   If nil is given, Nadam is used.
      # @param random_seed [Integer] The seed value using to initialize the random generator.
      def initialize(fit_bias: false, max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
        check_params_integer(max_iter: max_iter, batch_size: batch_size)
        check_params_boolean(fit_bias: fit_bias)
        check_params_type_or_nil(Integer, random_seed: random_seed)
        check_params_positive(max_iter: max_iter, batch_size: batch_size)
        @params = {}
        @params[:fit_bias] = fit_bias
        @params[:max_iter] = max_iter
        @params[:batch_size] = batch_size
        @params[:optimizer] = optimizer
        @params[:optimizer] ||= Optimizer::Nadam.new
        @params[:random_seed] = random_seed
        @params[:random_seed] ||= srand
        @weight_vec = nil
        @bias_term = nil
        @rng = Random.new(@params[:random_seed])
      end

      # Fit the model with given training data.
      #
      # @param x [Numo::DFloat] (shape: [n_samples, n_features]) The training data to be used for fitting the model.
      # @param y [Numo::Int32] (shape: [n_samples, n_outputs]) The target values to be used for fitting the model.
      # @return [LinearRegression] The learned regressor itself.
      def fit(x, y)
        check_sample_array(x)
        check_tvalue_array(y)
        check_sample_tvalue_size(x, y)

        n_outputs = y.shape[1].nil? ? 1 : y.shape[1]
        n_features = x.shape[1]

        if n_outputs > 1
          @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
          @bias_term = Numo::DFloat.zeros(n_outputs)
          n_outputs.times { |n| @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
        else
          @weight_vec, @bias_term = single_fit(x, y)
        end

        self
      end

      # Predict values for samples.
      #
      # @param x [Numo::DFloat] (shape: [n_samples, n_features]) The samples to predict the values.
      # @return [Numo::DFloat] (shape: [n_samples, n_outputs]) Predicted values per sample.
      def predict(x)
        check_sample_array(x)
        x.dot(@weight_vec.transpose) + @bias_term
      end

      # Dump marshal data.
      # @return [Hash] The marshal data about LinearRegression.
      def marshal_dump
        { params: @params,
          weight_vec: @weight_vec,
          bias_term: @bias_term,
          rng: @rng }
      end

      # Load marshal data.
      # @return [nil]
      def marshal_load(obj)
        @params = obj[:params]
        @weight_vec = obj[:weight_vec]
        @bias_term = obj[:bias_term]
        @rng = obj[:rng]
        nil
      end

      private

      def single_fit(x, y)
        # Expand feature vectors for bias term.
        samples = @params[:fit_bias] ? expand_feature(x) : x
        # Initialize some variables.
        n_samples, n_features = samples.shape
        rand_ids = [*0...n_samples].shuffle(random: @rng)
        weight_vec = Numo::DFloat.zeros(n_features)
        optimizer = @params[:optimizer].dup
        # Start optimization.
        @params[:max_iter].times do |_t|
          # Random sampling.
          subset_ids = rand_ids.shift(@params[:batch_size])
          rand_ids.concat(subset_ids)
          data = samples[subset_ids, true]
          values = y[subset_ids]
          # Calculate gradients for loss function.
          loss_grad = loss_gradient(data, values, weight_vec)
          next if loss_grad.ne(0.0).count.zero?
          # Update weight.
          weight_vec = optimizer.call(weight_vec, weight_gradient(loss_grad, data, weight_vec))
        end
        split_weight_vec_bias(weight_vec)
      end

      def loss_gradient(x, y, weight)
        2.0 * (x.dot(weight) - y)
      end

      def weight_gradient(loss_grad, data, _weight)
        (loss_grad.expand_dims(1) * data).mean(0)
      end

      def expand_feature(x)
        Numo::NArray.hstack([x, Numo::DFloat.ones([x.shape[0], 1])])
      end

      def split_weight_vec_bias(weight_vec)
        weights = @params[:fit_bias] ? weight_vec[0...-1] : weight_vec
        bias = @params[:fit_bias] ? weight_vec[-1] : 0.0
        [weights, bias]
      end
    end
  end
end
```
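A short usage sketch for the new regressor in the spirit of its `@example` block; the noise-free data below is invented, so the learned weights should only approximately recover [2, 3].

```ruby
require 'svmkit'

# y = 2*x1 + 3*x2, no noise.
x = Numo::DFloat.cast([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]])
y = 2.0 * x[true, 0] + 3.0 * x[true, 1]

reg = SVMKit::LinearModel::LinearRegression.new(max_iter: 1000, batch_size: 5, random_seed: 1)
reg.fit(x, y)

p reg.weight_vec.to_a                               # roughly [2.0, 3.0]
p reg.predict(Numo::DFloat.cast([[3.0, 3.0]])).to_a # roughly [15.0]
```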
data/lib/svmkit/linear_model/logistic_regression.rb CHANGED
```diff
@@ -49,7 +49,7 @@ module SVMKit
       # @param max_iter [Integer] The maximum number of iterations.
       # @param batch_size [Integer] The size of the mini batches.
       # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-      #
+      #   If nil is given, Nadam is used.
       # @param random_seed [Integer] The seed value using to initialize the random generator.
       def initialize(reg_param: 1.0, fit_bias: false, bias_scale: 1.0,
                      max_iter: 1000, batch_size: 20, optimizer: nil, random_seed: nil)
@@ -65,6 +65,7 @@ module SVMKit
         @params[:max_iter] = max_iter
         @params[:batch_size] = batch_size
         @params[:optimizer] = optimizer
+        @params[:optimizer] ||= Optimizer::Nadam.new
         @params[:random_seed] = random_seed
         @params[:random_seed] ||= srand
         @weight_vec = nil
@@ -175,7 +176,7 @@ module SVMKit
         n_samples, n_features = samples.shape
         rand_ids = [*0...n_samples].shuffle(random: @rng)
         weight_vec = Numo::DFloat.zeros(n_features)
-        optimizer =
+        optimizer = @params[:optimizer].dup
         # Start optimization.
         @params[:max_iter].times do |_t|
           # random sampling
```
data/lib/svmkit/linear_model/ridge.rb CHANGED
```diff
@@ -39,6 +39,8 @@ module SVMKit
       # @param fit_bias [Boolean] The flag indicating whether to fit the bias term.
       # @param max_iter [Integer] The maximum number of iterations.
       # @param batch_size [Integer] The size of the mini batches.
+      # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
+      #   If nil is given, Nadam is used.
       # @param random_seed [Integer] The seed value using to initialize the random generator.
       def initialize(reg_param: 1.0, fit_bias: false, max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
         check_params_float(reg_param: reg_param)
@@ -52,6 +54,7 @@ module SVMKit
         @params[:max_iter] = max_iter
         @params[:batch_size] = batch_size
         @params[:optimizer] = optimizer
+        @params[:optimizer] ||= Optimizer::Nadam.new
         @params[:random_seed] = random_seed
         @params[:random_seed] ||= srand
         @weight_vec = nil
@@ -75,11 +78,7 @@ module SVMKit
         if n_outputs > 1
           @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
           @bias_term = Numo::DFloat.zeros(n_outputs)
-          n_outputs.times do |n|
-            weight, bias = single_fit(x, y[true, n])
-            @weight_vec[n, true] = weight
-            @bias_term[n] = bias
-          end
+          n_outputs.times { |n| @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
         else
           @weight_vec, @bias_term = single_fit(x, y)
         end
@@ -124,7 +123,7 @@ module SVMKit
         n_samples, n_features = samples.shape
         rand_ids = [*0...n_samples].shuffle(random: @rng)
         weight_vec = Numo::DFloat.zeros(n_features)
-        optimizer =
+        optimizer = @params[:optimizer].dup
         # Start optimization.
         @params[:max_iter].times do |_t|
           # Random sampling.
```
data/lib/svmkit/linear_model/svc.rb CHANGED
```diff
@@ -51,7 +51,7 @@ module SVMKit
       # @param batch_size [Integer] The size of the mini batches.
       # @param probability [Boolean] The flag indicating whether to perform probability estimation.
       # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-      #
+      #   If nil is given, Nadam is used.
       # @param random_seed [Integer] The seed value using to initialize the random generator.
       def initialize(reg_param: 1.0, fit_bias: false, bias_scale: 1.0,
                      max_iter: 1000, batch_size: 20, probability: false, optimizer: nil, random_seed: nil)
@@ -68,6 +68,7 @@ module SVMKit
         @params[:batch_size] = batch_size
         @params[:probability] = probability
         @params[:optimizer] = optimizer
+        @params[:optimizer] ||= Optimizer::Nadam.new
         @params[:random_seed] = random_seed
         @params[:random_seed] ||= srand
         @weight_vec = nil
@@ -194,7 +195,7 @@ module SVMKit
         n_samples, n_features = samples.shape
         rand_ids = [*0...n_samples].shuffle(random: @rng)
         weight_vec = Numo::DFloat.zeros(n_features)
-        optimizer =
+        optimizer = @params[:optimizer].dup
         # Start optimization.
         @params[:max_iter].times do |_t|
           # random sampling.
```
data/lib/svmkit/linear_model/svr.rb CHANGED
```diff
@@ -44,7 +44,7 @@ module SVMKit
       # @param max_iter [Integer] The maximum number of iterations.
       # @param batch_size [Integer] The size of the mini batches.
       # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-      #
+      #   If nil is given, Nadam is used.
       # @param random_seed [Integer] The seed value using to initialize the random generator.
       def initialize(reg_param: 1.0, fit_bias: false, bias_scale: 1.0, epsilon: 0.1,
                      max_iter: 1000, batch_size: 20, optimizer: nil, random_seed: nil)
@@ -62,6 +62,7 @@ module SVMKit
         @params[:max_iter] = max_iter
         @params[:batch_size] = batch_size
         @params[:optimizer] = optimizer
+        @params[:optimizer] ||= Optimizer::Nadam.new
         @params[:random_seed] = random_seed
         @params[:random_seed] ||= srand
         @weight_vec = nil
@@ -85,11 +86,7 @@ module SVMKit
         if n_outputs > 1
           @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
           @bias_term = Numo::DFloat.zeros(n_outputs)
-          n_outputs.times do |n|
-            weight, bias = single_fit(x, y[true, n])
-            @weight_vec[n, true] = weight
-            @bias_term[n] = bias
-          end
+          n_outputs.times { |n| @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
         else
           @weight_vec, @bias_term = single_fit(x, y)
         end
@@ -134,7 +131,7 @@ module SVMKit
         n_samples, n_features = samples.shape
         rand_ids = [*0...n_samples].shuffle(random: @rng)
         weight_vec = Numo::DFloat.zeros(n_features)
-        optimizer =
+        optimizer = @params[:optimizer].dup
         # Start optimization.
         @params[:max_iter].times do |_t|
           # random sampling
```
data/lib/svmkit/optimizer/nadam.rb CHANGED
```diff
@@ -1,16 +1,22 @@
 # frozen_string_literal: true
 
 require 'svmkit/validation'
+require 'svmkit/base/base_estimator'
 
 module SVMKit
   # This module consists of the classes that implement optimizers adaptively tuning hyperparameters.
   module Optimizer
     # Nadam is a class that implements Nadam optimizer.
-    #
+    #
+    # @example
+    #   optimizer = SVMKit::Optimizer::Nadam.new(learning_rate: 0.01, momentum: 0.9, decay1: 0.9, decay2: 0.999)
+    #   estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, random_seed: 1)
+    #   estimator.fit(samples, values)
     #
     # *Reference*
     # - T. Dozat, "Incorporating Nesterov Momentum into Adam," Tech. Repo. Stanford University, 2015.
     class Nadam
+      include Base::BaseEstimator
       include Validation
 
       # Create a new optimizer with Nadam
@@ -19,7 +25,6 @@ module SVMKit
       # @param momentum [Float] The initial value of momentum.
       # @param decay1 [Float] The smoothing parameter for the first moment.
       # @param decay2 [Float] The smoothing parameter for the second moment.
-      # @param schedule_decay [Float] The smooting parameter.
       def initialize(learning_rate: 0.01, momentum: 0.9, decay1: 0.9, decay2: 0.999)
         check_params_float(learning_rate: learning_rate, momentum: momentum, decay1: decay1, decay2: decay2)
         check_params_positive(learning_rate: learning_rate, momentum: momentum, decay1: decay1, decay2: decay2)
@@ -59,6 +64,27 @@ module SVMKit
 
         weight - (@params[:learning_rate] / (nm_sec_moment**0.5 + 1e-8)) * ((1 - decay1_curr) * nm_gradient + decay1_next * nm_fst_moment)
       end
+
+      # Dump marshal data.
+      # @return [Hash] The marshal data.
+      def marshal_dump
+        { params: @params,
+          fst_moment: @fst_moment,
+          sec_moment: @sec_moment,
+          decay1_prod: @decay1_prod,
+          iter: @iter }
+      end
+
+      # Load marshal data.
+      # @return [nil]
+      def marshal_load(obj)
+        @params = obj[:params]
+        @fst_moment = obj[:fst_moment]
+        @sec_moment = obj[:sec_moment]
+        @decay1_prod = obj[:decay1_prod]
+        @iter = obj[:iter]
+        nil
+      end
     end
   end
 end
```
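The added `marshal_dump`/`marshal_load` pair means an optimizer, and therefore any estimator holding one in `@params`, survives the `Marshal` save/load workflow shown in the README. A quick round-trip sketch:

```ruby
require 'svmkit'

optimizer = SVMKit::Optimizer::Nadam.new(learning_rate: 0.01, momentum: 0.9, decay1: 0.9, decay2: 0.999)
restored = Marshal.load(Marshal.dump(optimizer))
p restored.class # => SVMKit::Optimizer::Nadam, moment state included
```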
data/lib/svmkit/optimizer/rmsprop.rb ADDED
```ruby
# frozen_string_literal: true

require 'svmkit/validation'
require 'svmkit/base/base_estimator'

module SVMKit
  module Optimizer
    # RMSProp is a class that implements RMSProp optimizer.
    #
    # @example
    #   optimizer = SVMKit::Optimizer::RMSProp.new(learning_rate: 0.01, momentum: 0.9, decay: 0.9)
    #   estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, random_seed: 1)
    #   estimator.fit(samples, values)
    #
    # *Reference*
    # - I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the importance of initialization and momentum in deep learning," Proc. ICML' 13, pp. 1139--1147, 2013.
    # - G. Hinton, N. Srivastava, and K. Swersky, "Lecture 6e rmsprop," Neural Networks for Machine Learning, 2012.
    class RMSProp
      include Base::BaseEstimator
      include Validation

      # Create a new optimizer with RMSProp.
      #
      # @param learning_rate [Float] The initial value of learning rate.
      # @param momentum [Float] The initial value of momentum.
      # @param decay [Float] The smooting parameter.
      def initialize(learning_rate: 0.01, momentum: 0.9, decay: 0.9)
        check_params_float(learning_rate: learning_rate, momentum: momentum, decay: decay)
        check_params_positive(learning_rate: learning_rate, momentum: momentum, decay: decay)
        @params = {}
        @params[:learning_rate] = learning_rate
        @params[:momentum] = momentum
        @params[:decay] = decay
        @moment = nil
        @update = nil
      end

      # Calculate the updated weight with RMSProp adaptive learning rate.
      #
      # @param weight [Numo::DFloat] (shape: [n_features]) The weight to be updated.
      # @param gradient [Numo::DFloat] (shape: [n_features]) The gradient for updating the weight.
      # @return [Numo::DFloat] (shape: [n_feautres]) The updated weight.
      def call(weight, gradient)
        @moment ||= Numo::DFloat.zeros(weight.shape[0])
        @update ||= Numo::DFloat.zeros(weight.shape[0])
        @moment = @params[:decay] * @moment + (1.0 - @params[:decay]) * gradient**2
        @update = @params[:momentum] * @update - (@params[:learning_rate] / (@moment**0.5 + 1.0e-8)) * gradient
        weight + @update
      end

      # Dump marshal data.
      # @return [Hash] The marshal data.
      def marshal_dump
        { params: @params,
          moment: @moment,
          update: @update }
      end

      # Load marshal data.
      # @return [nil]
      def marshal_load(obj)
        @params = obj[:params]
        @moment = obj[:moment]
        @update = obj[:update]
        nil
      end
    end
  end
end
```
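The `call` method above keeps two pieces of state: a decayed mean of squared gradients in `@moment` and a momentum buffer in `@update`. Driving it directly with a constant gradient (numbers invented) makes the mechanics visible:

```ruby
require 'svmkit'

opt = SVMKit::Optimizer::RMSProp.new(learning_rate: 0.1, momentum: 0.9, decay: 0.9)
weight = Numo::DFloat.cast([1.0, -1.0])
gradient = Numo::DFloat.cast([0.5, -0.5])

# Each call refreshes @moment and @update before returning the stepped
# weight; the momentum term makes the early steps grow in magnitude.
3.times { weight = opt.call(weight, gradient) }
p weight.to_a
```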
data/lib/svmkit/optimizer/sgd.rb ADDED
```ruby
# frozen_string_literal: true

require 'svmkit/validation'
require 'svmkit/base/base_estimator'

module SVMKit
  module Optimizer
    # SGD is a class that implements SGD optimizer.
    #
    # @example
    #   optimizer = SVMKit::Optimizer::SGD.new(learning_rate: 0.01, momentum: 0.9, decay: 0.9)
    #   estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, random_seed: 1)
    #   estimator.fit(samples, values)
    class SGD
      include Base::BaseEstimator
      include Validation

      # Create a new optimizer with SGD.
      #
      # @param learning_rate [Float] The initial value of learning rate.
      # @param momentum [Float] The initial value of momentum.
      # @param decay [Float] The smooting parameter.
      def initialize(learning_rate: 0.01, momentum: 0.0, decay: 0.0)
        check_params_float(learning_rate: learning_rate, momentum: momentum, decay: decay)
        check_params_positive(learning_rate: learning_rate, momentum: momentum, decay: decay)
        @params = {}
        @params[:learning_rate] = learning_rate
        @params[:momentum] = momentum
        @params[:decay] = decay
        @iter = 0
        @update = nil
      end

      # Calculate the updated weight with SGD.
      #
      # @param weight [Numo::DFloat] (shape: [n_features]) The weight to be updated.
      # @param gradient [Numo::DFloat] (shape: [n_features]) The gradient for updating the weight.
      # @return [Numo::DFloat] (shape: [n_feautres]) The updated weight.
      def call(weight, gradient)
        @update ||= Numo::DFloat.zeros(weight.shape[0])
        current_learning_rate = @params[:learning_rate] / (1.0 + @params[:decay] * @iter)
        @iter += 1
        @update = @params[:momentum] * @update - current_learning_rate * gradient
        weight + @update
      end

      # Dump marshal data.
      # @return [Hash] The marshal data.
      def marshal_dump
        { params: @params,
          iter: @iter,
          update: @update }
      end

      # Load marshal data.
      # @return [nil]
      def marshal_load(obj)
        @params = obj[:params]
        @iter = obj[:iter]
        @update = obj[:update]
        nil
      end
    end
  end
end
```
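Here the effective step size at iteration t is learning_rate / (1 + decay * t), with an optional momentum buffer on top. A sketch with invented numbers, momentum left at its 0.0 default so the shrinking schedule is easy to see:

```ruby
require 'svmkit'

opt = SVMKit::Optimizer::SGD.new(learning_rate: 0.1, decay: 1.0)
weight = Numo::DFloat.cast([0.0])
gradient = Numo::DFloat.cast([1.0])

# Steps are -0.1, -0.05, -0.0333..., -0.025: the raw gradient scaled by
# the decayed learning rate at iterations 0, 1, 2, 3.
4.times { weight = opt.call(weight, gradient) }
p weight.to_a # roughly [-0.2083]
```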
data/lib/svmkit/optimizer/yellow_fin.rb ADDED
```ruby
# frozen_string_literal: true

require 'svmkit/validation'
require 'svmkit/base/base_estimator'

module SVMKit
  module Optimizer
    # YellowFin is a class that implements YellowFin optimizer.
    #
    # @example
    #   optimizer = SVMKit::Optimizer::YellowFin.new(learning_rate: 0.01, momentum: 0.9, decay: 0.999, window_width: 20)
    #   estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, random_seed: 1)
    #   estimator.fit(samples, values)
    #
    # *Reference*
    # - J. Zhang and I. Mitliagkas, "YellowFin and the Art of Momentum Tuning," CoRR abs/1706.03471, 2017.
    class YellowFin
      include Base::BaseEstimator
      include Validation

      # Create a new optimizer with YellowFin.
      #
      # @param learning_rate [Float] The initial value of learning rate.
      # @param momentum [Float] The initial value of momentum.
      # @param decay [Float] The smooting parameter.
      # @param window_width [Integer] The sliding window width for searching curvature range.
      def initialize(learning_rate: 0.01, momentum: 0.9, decay: 0.999, window_width: 20)
        check_params_float(learning_rate: learning_rate, momentum: momentum, decay: decay)
        check_params_integer(window_width: window_width)
        check_params_positive(learning_rate: learning_rate, momentum: momentum, decay: decay, window_width: window_width)
        @params = {}
        @params[:learning_rate] = learning_rate
        @params[:momentum] = momentum
        @params[:decay] = decay
        @params[:window_width] = window_width
        @smth_learning_rate = learning_rate
        @smth_momentum = momentum
        @grad_norms = nil
        @grad_norm_min = 0.0
        @grad_norm_max = 0.0
        @grad_mean_sqr = 0.0
        @grad_mean = 0.0
        @grad_var = 0.0
        @grad_norm_mean = 0.0
        @curve_mean = 0.0
        @distance_mean = 0.0
        @update = nil
      end

      # Calculate the updated weight with adaptive momentum coefficient and learning rate.
      #
      # @param weight [Numo::DFloat] (shape: [n_features]) The weight to be updated.
      # @param gradient [Numo::DFloat] (shape: [n_features]) The gradient for updating the weight.
      # @return [Numo::DFloat] (shape: [n_feautres]) The updated weight.
      def call(weight, gradient)
        @update ||= Numo::DFloat.zeros(weight.shape[0])
        curvature_range(gradient)
        gradient_variance(gradient)
        distance_to_optimum(gradient)
        @smth_momentum = @params[:decay] * @smth_momentum + (1 - @params[:decay]) * current_momentum
        @smth_learning_rate = @params[:decay] * @smth_learning_rate + (1 - @params[:decay]) * current_learning_rate
        @update = @smth_momentum * @update - @smth_learning_rate * gradient
        weight + @update
      end

      private

      def current_momentum
        dr = Math.sqrt(@grad_norm_max / @grad_norm_min + 1.0e-8)
        [cubic_root**2, ((dr - 1) / (dr + 1))**2].max
      end

      def current_learning_rate
        (1.0 - Math.sqrt(@params[:momentum]))**2 / (@grad_norm_min + 1.0e-8)
      end

      def cubic_root
        p = (@distance_mean**2 * @grad_norm_min**2) / (2 * @grad_var + 1.0e-8)
        w3 = (-Math.sqrt(p**2 + 4.fdiv(27) * p**3) - p).fdiv(2)
        w = (w3 >= 0.0 ? 1 : -1) * w3.abs**1.fdiv(3)
        y = w - p / (3 * w + 1.0e-8)
        y + 1
      end

      def curvature_range(gradient)
        @grad_norms ||= []
        @grad_norms.push((gradient**2).sum)
        @grad_norms.shift(@grad_norms.size - @params[:window_width]) if @grad_norms.size > @params[:window_width]
        @grad_norm_min = @params[:decay] * @grad_norm_min + (1 - @params[:decay]) * @grad_norms.min
        @grad_norm_max = @params[:decay] * @grad_norm_max + (1 - @params[:decay]) * @grad_norms.max
      end

      def gradient_variance(gradient)
        @grad_mean_sqr = @params[:decay] * @grad_mean_sqr + (1 - @params[:decay]) * gradient**2
        @grad_mean = @params[:decay] * @grad_mean + (1 - @params[:decay]) * gradient
        @grad_var = (@grad_mean_sqr - @grad_mean**2).sum
      end

      def distance_to_optimum(gradient)
        grad_sqr = (gradient**2).sum
        @grad_norm_mean = @params[:decay] * @grad_norm_mean + (1 - @params[:decay]) * Math.sqrt(grad_sqr + 1.0e-8)
        @curve_mean = @params[:decay] * @curve_mean + (1 - @params[:decay]) * grad_sqr
        @distance_mean = @params[:decay] * @distance_mean + (1 - @params[:decay]) * (@grad_norm_mean / @curve_mean)
      end

      # Dump marshal data.
      # @return [Hash] The marshal data.
      def marshal_dump
        { params: @params,
          smth_learning_rate: @smth_learning_rate,
          smth_momentum: @smth_momentum,
          grad_norms: @grad_norms,
          grad_norm_min: @grad_norm_min,
          grad_norm_max: @grad_norm_max,
          grad_mean_sqr: @grad_mean_sqr,
          grad_mean: @grad_mean,
          grad_var: @grad_var,
          grad_norm_mean: @grad_norm_mean,
          curve_mean: @curve_mean,
          distance_mean: @distance_mean,
          update: @update }
      end

      # Load marshal data.
      # @return [nil]
      def marshal_load(obj)
        @params = obj[:params]
        @smth_learning_rate = obj[:smth_learning_rate]
        @smth_momentum = obj[:smth_momentum]
        @grad_norms = obj[:grad_norms]
        @grad_norm_min = obj[:grad_norm_min]
        @grad_norm_max = obj[:grad_norm_max]
        @grad_mean_sqr = obj[:grad_mean_sqr]
        @grad_mean = obj[:grad_mean]
        @grad_var = obj[:grad_var]
        @grad_norm_mean = obj[:grad_norm_mean]
        @curve_mean = obj[:curve_mean]
        @distance_mean = obj[:distance_mean]
        @update = obj[:update]
        nil
      end
    end
  end
end
```
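Unlike SGD and RMSProp, YellowFin retunes momentum and learning rate itself from the gradient statistics tracked above, so the constructor arguments are only starting points. A sketch mirroring the file's `@example`; the toy data is invented and convergence on such a tiny problem is not guaranteed:

```ruby
require 'svmkit'

x = Numo::DFloat.cast([[0.0], [1.0], [2.0], [3.0]])
y = Numo::DFloat.cast([1.0, 3.0, 5.0, 7.0]) # y = 2x + 1

optimizer = SVMKit::Optimizer::YellowFin.new(learning_rate: 0.01, momentum: 0.9,
                                             decay: 0.999, window_width: 20)
estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, fit_bias: true,
                                                      batch_size: 4, random_seed: 1)
estimator.fit(x, y)
p estimator.predict(Numo::DFloat.cast([[4.0]])).to_a # ideally near 9.0
```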
data/lib/svmkit/polynomial_model/factorization_machine_classifier.rb CHANGED
```diff
@@ -21,8 +21,8 @@ module SVMKit
    #   results = estimator.predict(testing_samples)
    #
    # *Reference*
-   # - S. Rendle, "Factorization Machines with libFM," ACM
-   # - S. Rendle, "Factorization Machines,"
+   # - S. Rendle, "Factorization Machines with libFM," ACM TIST, vol. 3 (3), pp. 57:1--57:22, 2012.
+   # - S. Rendle, "Factorization Machines," Proc. ICDM'10, pp. 995--1000, 2010.
    class FactorizationMachineClassifier
      include Base::BaseEstimator
      include Base::Classifier
@@ -57,7 +57,7 @@ module SVMKit
      # @param max_iter [Integer] The maximum number of iterations.
      # @param batch_size [Integer] The size of the mini batches.
      # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-     #
+     #   If nil is given, Nadam is used.
      # @param random_seed [Integer] The seed value using to initialize the random generator.
      def initialize(n_factors: 2, loss: 'hinge', reg_param_linear: 1.0, reg_param_factor: 1.0,
                     max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
@@ -76,6 +76,7 @@ module SVMKit
        @params[:max_iter] = max_iter
        @params[:batch_size] = batch_size
        @params[:optimizer] = optimizer
+       @params[:optimizer] ||= Optimizer::Nadam.new
        @params[:random_seed] = random_seed
        @params[:random_seed] ||= srand
        @factor_mat = nil
@@ -105,10 +106,7 @@ module SVMKit
        @bias_term = Numo::DFloat.zeros(n_classes)
        n_classes.times do |n|
          bin_y = Numo::Int32.cast(y.eq(@classes[n])) * 2 - 1
-         factor, weight, bias = binary_fit(x, bin_y)
-         @factor_mat[n, true, true] = factor
-         @weight_vec[n, true] = weight
-         @bias_term[n] = bias
+         @factor_mat[n, true, true], @weight_vec[n, true], @bias_term[n] = binary_fit(x, bin_y)
        end
      else
        negative_label = y.to_a.uniq.min
@@ -194,8 +192,8 @@ module SVMKit
        rand_ids = [*0...n_samples].shuffle(random: @rng)
        weight_vec = Numo::DFloat.zeros(n_features + 1)
        factor_mat = Numo::DFloat.zeros(@params[:n_factors], n_features)
-       weight_optimizer =
-       factor_optimizers = Array.new(@params[:n_factors]) {
+       weight_optimizer = @params[:optimizer].dup
+       factor_optimizers = Array.new(@params[:n_factors]) { @params[:optimizer].dup }
        # Start optimization.
        @params[:max_iter].times do |_t|
          # Random sampling.
```
data/lib/svmkit/polynomial_model/factorization_machine_regressor.rb CHANGED
```diff
@@ -19,8 +19,8 @@ module SVMKit
    #   results = estimator.predict(testing_samples)
    #
    # *Reference*
-   # - S. Rendle, "Factorization Machines with libFM," ACM
-   # - S. Rendle, "Factorization Machines," Proc.
+   # - S. Rendle, "Factorization Machines with libFM," ACM TIST, vol. 3 (3), pp. 57:1--57:22, 2012.
+   # - S. Rendle, "Factorization Machines," Proc. ICDM'10, pp. 995--1000, 2010.
    class FactorizationMachineRegressor
      include Base::BaseEstimator
      include Base::Regressor
@@ -50,7 +50,7 @@ module SVMKit
      # @param max_iter [Integer] The maximum number of iterations.
      # @param batch_size [Integer] The size of the mini batches.
      # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-     #
+     #   If nil is given, Nadam is used.
      # @param random_seed [Integer] The seed value using to initialize the random generator.
      def initialize(n_factors: 2, reg_param_linear: 1.0, reg_param_factor: 1.0,
                     max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
@@ -66,6 +66,7 @@ module SVMKit
        @params[:max_iter] = max_iter
        @params[:batch_size] = batch_size
        @params[:optimizer] = optimizer
+       @params[:optimizer] ||= Optimizer::Nadam.new
        @params[:random_seed] = random_seed
        @params[:random_seed] ||= srand
        @factor_mat = nil
@@ -91,12 +92,7 @@ module SVMKit
        @factor_mat = Numo::DFloat.zeros(n_outputs, @params[:n_factors], n_features)
        @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
        @bias_term = Numo::DFloat.zeros(n_outputs)
-       n_outputs.times do |n|
-         factor, weight, bias = single_fit(x, y[true, n])
-         @factor_mat[n, true, true] = factor
-         @weight_vec[n, true] = weight
-         @bias_term[n] = bias
-       end
+       n_outputs.times { |n| @factor_mat[n, true, true], @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
      else
        @factor_mat, @weight_vec, @bias_term = single_fit(x, y)
      end
@@ -148,8 +144,8 @@ module SVMKit
        rand_ids = [*0...n_samples].shuffle(random: @rng)
        weight_vec = Numo::DFloat.zeros(n_features + 1)
        factor_mat = Numo::DFloat.zeros(@params[:n_factors], n_features)
-       weight_optimizer =
-       factor_optimizers = Array.new(@params[:n_factors]) {
+       weight_optimizer = @params[:optimizer].dup
+       factor_optimizers = Array.new(@params[:n_factors]) { @params[:optimizer].dup }
        # Start optimization.
        @params[:max_iter].times do |_t|
          # Random sampling.
```
data/lib/svmkit/version.rb
CHANGED
data/svmkit.gemspec CHANGED
```diff
@@ -17,8 +17,8 @@ MSG
   SVMKit is a machine learninig library in Ruby.
   SVMKit provides machine learning algorithms with interfaces similar to Scikit-Learn in Python.
   SVMKit currently supports Linear / Kernel Support Vector Machine,
-  Logistic Regression, Ridge, Lasso, Factorization Machine,
-  K-nearest neighbor algorithm, and cross-validation.
+  Logistic Regression, Linear Regression, Ridge, Lasso, Factorization Machine,
+  Naive Bayes, Decision Tree, Random Forest, K-nearest neighbor algorithm, and cross-validation.
   MSG
   spec.homepage = 'https://github.com/yoshoku/svmkit'
   spec.license = 'BSD-2-Clause'
```
metadata CHANGED
```diff
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: svmkit
 version: !ruby/object:Gem::Version
-  version: 0.4.0
+  version: 0.4.1
 platform: ruby
 authors:
 - yoshoku
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2018-06-
+date: 2018-06-08 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: numo-narray
@@ -84,8 +84,8 @@ description: |
   SVMKit is a machine learninig library in Ruby.
   SVMKit provides machine learning algorithms with interfaces similar to Scikit-Learn in Python.
   SVMKit currently supports Linear / Kernel Support Vector Machine,
-  Logistic Regression, Ridge, Lasso, Factorization Machine,
-  K-nearest neighbor algorithm, and cross-validation.
+  Logistic Regression, Linear Regression, Ridge, Lasso, Factorization Machine,
+  Naive Bayes, Decision Tree, Random Forest, K-nearest neighbor algorithm, and cross-validation.
 email:
 - yoshoku@outlook.com
 executables: []
@@ -128,6 +128,7 @@ files:
 - lib/svmkit/kernel_approximation/rbf.rb
 - lib/svmkit/kernel_machine/kernel_svc.rb
 - lib/svmkit/linear_model/lasso.rb
+- lib/svmkit/linear_model/linear_regression.rb
 - lib/svmkit/linear_model/logistic_regression.rb
 - lib/svmkit/linear_model/ridge.rb
 - lib/svmkit/linear_model/svc.rb
@@ -140,6 +141,9 @@ files:
 - lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb
 - lib/svmkit/nearest_neighbors/k_neighbors_regressor.rb
 - lib/svmkit/optimizer/nadam.rb
+- lib/svmkit/optimizer/rmsprop.rb
+- lib/svmkit/optimizer/sgd.rb
+- lib/svmkit/optimizer/yellow_fin.rb
 - lib/svmkit/pairwise_metric.rb
 - lib/svmkit/polynomial_model/factorization_machine_classifier.rb
 - lib/svmkit/polynomial_model/factorization_machine_regressor.rb
```