svmkit 0.4.0 → 0.4.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/HISTORY.md +7 -0
- data/README.md +61 -25
- data/lib/svmkit.rb +4 -0
- data/lib/svmkit/linear_model/lasso.rb +5 -8
- data/lib/svmkit/linear_model/linear_regression.rb +159 -0
- data/lib/svmkit/linear_model/logistic_regression.rb +3 -2
- data/lib/svmkit/linear_model/ridge.rb +5 -6
- data/lib/svmkit/linear_model/svc.rb +3 -2
- data/lib/svmkit/linear_model/svr.rb +4 -7
- data/lib/svmkit/optimizer/nadam.rb +28 -2
- data/lib/svmkit/optimizer/rmsprop.rb +69 -0
- data/lib/svmkit/optimizer/sgd.rb +65 -0
- data/lib/svmkit/optimizer/yellow_fin.rb +144 -0
- data/lib/svmkit/polynomial_model/factorization_machine_classifier.rb +7 -9
- data/lib/svmkit/polynomial_model/factorization_machine_regressor.rb +7 -11
- data/lib/svmkit/version.rb +1 -1
- data/svmkit.gemspec +2 -2
- metadata +8 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: af30c20b06fec51d531364ad9ca1414ce2fe36cdbe61fd8a1a7128c793d67304
+  data.tar.gz: ba87c535aa723ec17334fd6819577dcb51d2d11ccef6adb967f73de1702522f5
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b32efe1dcd924c3e31ad0dc26dfbdcc86b0154b8b8591e58db5364103526b7dc828c46462b5f2dfe81c7c8ee23836ae8d4b81061cdf1ceb4f023c48cc78dd110
+  data.tar.gz: 6f38f301d23b3abc1037e1b0fe620e687da1fe44216a49707b2192d30fd8f2a7cb7690d6365580dda470e6852200db20b540c35947e3b1c54d8f8b5b599b2dc0
data/HISTORY.md
CHANGED
@@ -1,3 +1,10 @@
+# 0.4.1
+- Add class for linear regressor.
+- Add class for SGD optimizer.
+- Add class for RMSProp optimizer.
+- Add class for YellowFin optimizer.
+- Fix to be able to select the optimizer on estimators of LinearModel and PolynomialModel.
+
 # 0.4.0
 ## Breaking changes
 
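The last changelog entry is the thread that ties this release together: every estimator in LinearModel and PolynomialModel now accepts an `optimizer:` keyword argument. A minimal sketch of that selection, using only the constructor signatures visible in the diffs below (`samples` and `values` are placeholder training data):

```ruby
require 'svmkit'

# Any of the new optimizers can be passed in; Nadam remains the default when none is given.
optimizer = SVMKit::Optimizer::RMSProp.new(learning_rate: 0.01, momentum: 0.9, decay: 0.9)

# Estimators in LinearModel and PolynomialModel take the optimizer via a keyword argument.
estimator = SVMKit::LinearModel::Ridge.new(reg_param: 0.1, optimizer: optimizer, random_seed: 1)
estimator.fit(samples, values) # samples/values: placeholder Numo::DFloat arrays
```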
data/README.md
CHANGED
@@ -8,8 +8,8 @@
 SVMKit is a machine learning library in Ruby.
 SVMKit provides machine learning algorithms with interfaces similar to Scikit-Learn in Python.
 SVMKit currently supports Linear / Kernel Support Vector Machine,
-Logistic Regression, Ridge, Lasso, Factorization Machine,
-K-nearest neighbor classifier, and cross-validation.
+Logistic Regression, Linear Regression, Ridge, Lasso, Factorization Machine,
+Naive Bayes, Decision Tree, Random Forest, K-nearest neighbor classifier, and cross-validation.
 
 ## Installation
 
@@ -29,61 +29,97 @@ Or install it yourself as:
 
 ## Usage
 
-
+### Example 1. Pendigits dataset classification
+
+SVMKit provides a function for loading a LIBSVM format dataset file.
+We start by downloading the pendigits dataset from the LIBSVM Data web site.
+
+```bash
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/pendigits
+$ wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/pendigits.t
+```
+
+The following code trains a classifier with Linear SVM and an RBF kernel feature map.
 
 ```ruby
 require 'svmkit'
 
+# Load the training dataset.
 samples, labels = SVMKit::Dataset.load_libsvm_file('pendigits')
 
-
-
+# If the features consist only of integers, the load_libsvm_file method reads them in Numo::Int32 format.
+# As necessary, you should convert the sample array to Numo::DFloat format.
+samples = Numo::DFloat.cast(samples)
 
-
-
+# Map training data to RBF kernel feature space.
+transformer = SVMKit::KernelApproximation::RBF.new(gamma: 0.0001, n_components: 1024, random_seed: 1)
+transformed = transformer.fit_transform(samples)
 
-
+# Train linear SVM classifier.
+classifier = SVMKit::LinearModel::SVC.new(reg_param: 0.0001, max_iter: 1000, batch_size: 50, random_seed: 1)
 classifier.fit(transformed, labels)
 
-
-File.open('
-File.open('
+# Save the model.
+File.open('transformer.dat', 'wb') { |f| f.write(Marshal.dump(transformer)) }
+File.open('classifier.dat', 'wb') { |f| f.write(Marshal.dump(classifier)) }
 ```
 
-
+The following code classifies the testing data with the trained classifier.
 
 ```ruby
 require 'svmkit'
 
+# Load the testing dataset.
 samples, labels = SVMKit::Dataset.load_libsvm_file('pendigits.t')
+samples = Numo::DFloat.cast(samples)
 
-
-transformer = Marshal.load(File.binread('
-classifier = Marshal.load(File.binread('
+# Load the model.
+transformer = Marshal.load(File.binread('transformer.dat'))
+classifier = Marshal.load(File.binread('classifier.dat'))
+
+# Map testing data to RBF kernel feature space.
+transformed = transformer.transform(samples)
+
+# Classify the testing data and evaluate prediction results.
+puts("Accuracy: %.1f%%" % (100.0 * classifier.score(transformed, labels)))
+
+# Another evaluation approach:
+# results = classifier.predict(transformed)
+# evaluator = SVMKit::EvaluationMeasure::Accuracy.new
+# puts("Accuracy: %.1f%%" % (100.0 * evaluator.score(results, labels)))
+```
 
-
-transformed = transformer.transform(normalized)
+Running the above scripts produces the following output.
 
-
+```bash
+$ ruby train.rb
+$ ruby test.rb
+Accuracy: 98.4%
 ```
 
-
+### Example 2. Cross-validation
 
 ```ruby
 require 'svmkit'
 
+# Load the dataset.
 samples, labels = SVMKit::Dataset.load_libsvm_file('pendigits')
+samples = Numo::DFloat.cast(samples)
 
-
+# Define the estimator to be evaluated.
+lr = SVMKit::LinearModel::LogisticRegression.new(reg_param: 0.0001, random_seed: 1)
 
+# Define the evaluation measure, splitting strategy, and cross validation.
+ev = SVMKit::EvaluationMeasure::LogLoss.new
 kf = SVMKit::ModelSelection::StratifiedKFold.new(n_splits: 5, shuffle: true, random_seed: 1)
-cv = SVMKit::ModelSelection::CrossValidation.new(estimator:
+cv = SVMKit::ModelSelection::CrossValidation.new(estimator: lr, splitter: kf, evaluator: ev)
 
-
-report = cv.perform(
+# Perform 5-fold cross-validation.
+report = cv.perform(samples, labels)
 
-
-
+# Output the result.
+mean_logloss = report[:test_score].inject(:+) / kf.n_splits
+puts("5-CV mean log-loss: %.3f" % mean_logloss)
 ```
 
 ## Development
data/lib/svmkit.rb
CHANGED
@@ -13,11 +13,15 @@ require 'svmkit/base/regressor'
 require 'svmkit/base/transformer'
 require 'svmkit/base/splitter'
 require 'svmkit/base/evaluator'
+require 'svmkit/optimizer/sgd'
+require 'svmkit/optimizer/rmsprop'
 require 'svmkit/optimizer/nadam'
+require 'svmkit/optimizer/yellow_fin'
 require 'svmkit/kernel_approximation/rbf'
 require 'svmkit/linear_model/svc'
 require 'svmkit/linear_model/svr'
 require 'svmkit/linear_model/logistic_regression'
+require 'svmkit/linear_model/linear_regression'
 require 'svmkit/linear_model/ridge'
 require 'svmkit/linear_model/lasso'
 require 'svmkit/kernel_machine/kernel_svc'
data/lib/svmkit/linear_model/lasso.rb
CHANGED
@@ -43,7 +43,7 @@ module SVMKit
     # @param max_iter [Integer] The maximum number of iterations.
     # @param batch_size [Integer] The size of the mini batches.
     # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-    #
+    #   If nil is given, Nadam is used.
     # @param random_seed [Integer] The seed value used to initialize the random generator.
     def initialize(reg_param: 1.0, fit_bias: false, max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
       check_params_float(reg_param: reg_param)
@@ -57,6 +57,7 @@ module SVMKit
       @params[:max_iter] = max_iter
       @params[:batch_size] = batch_size
       @params[:optimizer] = optimizer
+      @params[:optimizer] ||= Optimizer::Nadam.new
       @params[:random_seed] = random_seed
       @params[:random_seed] ||= srand
       @weight_vec = nil
@@ -80,11 +81,7 @@ module SVMKit
       if n_outputs > 1
         @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
         @bias_term = Numo::DFloat.zeros(n_outputs)
-        n_outputs.times do |n|
-          weight, bias = single_fit(x, y[true, n])
-          @weight_vec[n, true] = weight
-          @bias_term[n] = bias
-        end
+        n_outputs.times { |n| @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
       else
         @weight_vec, @bias_term = single_fit(x, y)
       end
@@ -131,8 +128,8 @@ module SVMKit
       weight_vec = Numo::DFloat.zeros(n_features)
       left_weight_vec = Numo::DFloat.zeros(n_features)
       right_weight_vec = Numo::DFloat.zeros(n_features)
-      left_optimizer =
-      right_optimizer =
+      left_optimizer = @params[:optimizer].dup
+      right_optimizer = @params[:optimizer].dup
       # Start optimization.
       @params[:max_iter].times do |_t|
         # Random sampling.
data/lib/svmkit/linear_model/linear_regression.rb
ADDED
@@ -0,0 +1,159 @@
+# frozen_string_literal: true
+
+require 'svmkit/validation'
+require 'svmkit/base/base_estimator'
+require 'svmkit/base/regressor'
+require 'svmkit/optimizer/nadam'
+
+module SVMKit
+  module LinearModel
+    # LinearRegression is a class that implements ordinary least square linear regression
+    # with mini-batch stochastic gradient descent optimization.
+    #
+    # @example
+    #   estimator =
+    #     SVMKit::LinearModel::LinearRegression.new(max_iter: 1000, batch_size: 20, random_seed: 1)
+    #   estimator.fit(training_samples, training_values)
+    #   results = estimator.predict(testing_samples)
+    #
+    class LinearRegression
+      include Base::BaseEstimator
+      include Base::Regressor
+      include Validation
+
+      # Return the weight vector.
+      # @return [Numo::DFloat] (shape: [n_outputs, n_features])
+      attr_reader :weight_vec
+
+      # Return the bias term (a.k.a. intercept).
+      # @return [Numo::DFloat] (shape: [n_outputs])
+      attr_reader :bias_term
+
+      # Return the random generator for random sampling.
+      # @return [Random]
+      attr_reader :rng
+
+      # Create a new ordinary least square linear regressor.
+      #
+      # @param fit_bias [Boolean] The flag indicating whether to fit the bias term.
+      # @param max_iter [Integer] The maximum number of iterations.
+      # @param batch_size [Integer] The size of the mini batches.
+      # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
+      #   If nil is given, Nadam is used.
+      # @param random_seed [Integer] The seed value used to initialize the random generator.
+      def initialize(fit_bias: false, max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
+        check_params_integer(max_iter: max_iter, batch_size: batch_size)
+        check_params_boolean(fit_bias: fit_bias)
+        check_params_type_or_nil(Integer, random_seed: random_seed)
+        check_params_positive(max_iter: max_iter, batch_size: batch_size)
+        @params = {}
+        @params[:fit_bias] = fit_bias
+        @params[:max_iter] = max_iter
+        @params[:batch_size] = batch_size
+        @params[:optimizer] = optimizer
+        @params[:optimizer] ||= Optimizer::Nadam.new
+        @params[:random_seed] = random_seed
+        @params[:random_seed] ||= srand
+        @weight_vec = nil
+        @bias_term = nil
+        @rng = Random.new(@params[:random_seed])
+      end
+
+      # Fit the model with given training data.
+      #
+      # @param x [Numo::DFloat] (shape: [n_samples, n_features]) The training data to be used for fitting the model.
+      # @param y [Numo::Int32] (shape: [n_samples, n_outputs]) The target values to be used for fitting the model.
+      # @return [LinearRegression] The learned regressor itself.
+      def fit(x, y)
+        check_sample_array(x)
+        check_tvalue_array(y)
+        check_sample_tvalue_size(x, y)
+
+        n_outputs = y.shape[1].nil? ? 1 : y.shape[1]
+        n_features = x.shape[1]
+
+        if n_outputs > 1
+          @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
+          @bias_term = Numo::DFloat.zeros(n_outputs)
+          n_outputs.times { |n| @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
+        else
+          @weight_vec, @bias_term = single_fit(x, y)
+        end
+
+        self
+      end
+
+      # Predict values for samples.
+      #
+      # @param x [Numo::DFloat] (shape: [n_samples, n_features]) The samples to predict the values.
+      # @return [Numo::DFloat] (shape: [n_samples, n_outputs]) Predicted values per sample.
+      def predict(x)
+        check_sample_array(x)
+        x.dot(@weight_vec.transpose) + @bias_term
+      end
+
+      # Dump marshal data.
+      # @return [Hash] The marshal data about LinearRegression.
+      def marshal_dump
+        { params: @params,
+          weight_vec: @weight_vec,
+          bias_term: @bias_term,
+          rng: @rng }
+      end
+
+      # Load marshal data.
+      # @return [nil]
+      def marshal_load(obj)
+        @params = obj[:params]
+        @weight_vec = obj[:weight_vec]
+        @bias_term = obj[:bias_term]
+        @rng = obj[:rng]
+        nil
+      end
+
+      private
+
+      def single_fit(x, y)
+        # Expand feature vectors for bias term.
+        samples = @params[:fit_bias] ? expand_feature(x) : x
+        # Initialize some variables.
+        n_samples, n_features = samples.shape
+        rand_ids = [*0...n_samples].shuffle(random: @rng)
+        weight_vec = Numo::DFloat.zeros(n_features)
+        optimizer = @params[:optimizer].dup
+        # Start optimization.
+        @params[:max_iter].times do |_t|
+          # Random sampling.
+          subset_ids = rand_ids.shift(@params[:batch_size])
+          rand_ids.concat(subset_ids)
+          data = samples[subset_ids, true]
+          values = y[subset_ids]
+          # Calculate gradients for loss function.
+          loss_grad = loss_gradient(data, values, weight_vec)
+          next if loss_grad.ne(0.0).count.zero?
+          # Update weight.
+          weight_vec = optimizer.call(weight_vec, weight_gradient(loss_grad, data, weight_vec))
+        end
+        split_weight_vec_bias(weight_vec)
+      end
+
+      def loss_gradient(x, y, weight)
+        2.0 * (x.dot(weight) - y)
+      end
+
+      def weight_gradient(loss_grad, data, _weight)
+        (loss_grad.expand_dims(1) * data).mean(0)
+      end
+
+      def expand_feature(x)
+        Numo::NArray.hstack([x, Numo::DFloat.ones([x.shape[0], 1])])
+      end
+
+      def split_weight_vec_bias(weight_vec)
+        weights = @params[:fit_bias] ? weight_vec[0...-1] : weight_vec
+        bias = @params[:fit_bias] ? weight_vec[-1] : 0.0
+        [weights, bias]
+      end
+    end
+  end
+end
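Because `fit` branches on `y.shape[1]`, the new regressor handles single- and multi-output targets alike. A self-contained usage sketch with synthetic data (not taken from the diff; how closely the recovered weights match depends on the optimizer settings):

```ruby
require 'svmkit'

# Synthetic single-output regression: y = 2*x0 - 3*x1 + 1.
x = Numo::DFloat.new(500, 2).rand
y = 2.0 * x[true, 0] - 3.0 * x[true, 1] + 1.0

estimator = SVMKit::LinearModel::LinearRegression.new(
  fit_bias: true, max_iter: 1000, batch_size: 20, random_seed: 1
)
estimator.fit(x, y)

p estimator.weight_vec # should approach [2.0, -3.0]
p estimator.bias_term  # should approach 1.0
```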
data/lib/svmkit/linear_model/logistic_regression.rb
CHANGED
@@ -49,7 +49,7 @@ module SVMKit
     # @param max_iter [Integer] The maximum number of iterations.
     # @param batch_size [Integer] The size of the mini batches.
     # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-    #
+    #   If nil is given, Nadam is used.
     # @param random_seed [Integer] The seed value used to initialize the random generator.
     def initialize(reg_param: 1.0, fit_bias: false, bias_scale: 1.0,
                    max_iter: 1000, batch_size: 20, optimizer: nil, random_seed: nil)
@@ -65,6 +65,7 @@ module SVMKit
       @params[:max_iter] = max_iter
       @params[:batch_size] = batch_size
       @params[:optimizer] = optimizer
+      @params[:optimizer] ||= Optimizer::Nadam.new
       @params[:random_seed] = random_seed
       @params[:random_seed] ||= srand
       @weight_vec = nil
@@ -175,7 +176,7 @@ module SVMKit
       n_samples, n_features = samples.shape
       rand_ids = [*0...n_samples].shuffle(random: @rng)
       weight_vec = Numo::DFloat.zeros(n_features)
-      optimizer =
+      optimizer = @params[:optimizer].dup
       # Start optimization.
       @params[:max_iter].times do |_t|
         # random sampling
data/lib/svmkit/linear_model/ridge.rb
CHANGED
@@ -39,6 +39,8 @@ module SVMKit
     # @param fit_bias [Boolean] The flag indicating whether to fit the bias term.
     # @param max_iter [Integer] The maximum number of iterations.
     # @param batch_size [Integer] The size of the mini batches.
+    # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
+    #   If nil is given, Nadam is used.
     # @param random_seed [Integer] The seed value used to initialize the random generator.
     def initialize(reg_param: 1.0, fit_bias: false, max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
       check_params_float(reg_param: reg_param)
@@ -52,6 +54,7 @@ module SVMKit
       @params[:max_iter] = max_iter
       @params[:batch_size] = batch_size
       @params[:optimizer] = optimizer
+      @params[:optimizer] ||= Optimizer::Nadam.new
       @params[:random_seed] = random_seed
       @params[:random_seed] ||= srand
       @weight_vec = nil
@@ -75,11 +78,7 @@ module SVMKit
       if n_outputs > 1
         @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
         @bias_term = Numo::DFloat.zeros(n_outputs)
-        n_outputs.times do |n|
-          weight, bias = single_fit(x, y[true, n])
-          @weight_vec[n, true] = weight
-          @bias_term[n] = bias
-        end
+        n_outputs.times { |n| @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
       else
         @weight_vec, @bias_term = single_fit(x, y)
       end
@@ -124,7 +123,7 @@ module SVMKit
       n_samples, n_features = samples.shape
       rand_ids = [*0...n_samples].shuffle(random: @rng)
       weight_vec = Numo::DFloat.zeros(n_features)
-      optimizer =
+      optimizer = @params[:optimizer].dup
       # Start optimization.
       @params[:max_iter].times do |_t|
         # Random sampling.
data/lib/svmkit/linear_model/svc.rb
CHANGED
@@ -51,7 +51,7 @@ module SVMKit
     # @param batch_size [Integer] The size of the mini batches.
     # @param probability [Boolean] The flag indicating whether to perform probability estimation.
     # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-    #
+    #   If nil is given, Nadam is used.
     # @param random_seed [Integer] The seed value used to initialize the random generator.
     def initialize(reg_param: 1.0, fit_bias: false, bias_scale: 1.0,
                    max_iter: 1000, batch_size: 20, probability: false, optimizer: nil, random_seed: nil)
@@ -68,6 +68,7 @@ module SVMKit
       @params[:batch_size] = batch_size
       @params[:probability] = probability
       @params[:optimizer] = optimizer
+      @params[:optimizer] ||= Optimizer::Nadam.new
       @params[:random_seed] = random_seed
       @params[:random_seed] ||= srand
       @weight_vec = nil
@@ -194,7 +195,7 @@ module SVMKit
       n_samples, n_features = samples.shape
       rand_ids = [*0...n_samples].shuffle(random: @rng)
       weight_vec = Numo::DFloat.zeros(n_features)
-      optimizer =
+      optimizer = @params[:optimizer].dup
       # Start optimization.
       @params[:max_iter].times do |_t|
         # random sampling.
data/lib/svmkit/linear_model/svr.rb
CHANGED
@@ -44,7 +44,7 @@ module SVMKit
     # @param max_iter [Integer] The maximum number of iterations.
     # @param batch_size [Integer] The size of the mini batches.
     # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-    #
+    #   If nil is given, Nadam is used.
     # @param random_seed [Integer] The seed value used to initialize the random generator.
     def initialize(reg_param: 1.0, fit_bias: false, bias_scale: 1.0, epsilon: 0.1,
                    max_iter: 1000, batch_size: 20, optimizer: nil, random_seed: nil)
@@ -62,6 +62,7 @@ module SVMKit
       @params[:max_iter] = max_iter
       @params[:batch_size] = batch_size
       @params[:optimizer] = optimizer
+      @params[:optimizer] ||= Optimizer::Nadam.new
       @params[:random_seed] = random_seed
       @params[:random_seed] ||= srand
       @weight_vec = nil
@@ -85,11 +86,7 @@ module SVMKit
       if n_outputs > 1
         @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
         @bias_term = Numo::DFloat.zeros(n_outputs)
-        n_outputs.times do |n|
-          weight, bias = single_fit(x, y[true, n])
-          @weight_vec[n, true] = weight
-          @bias_term[n] = bias
-        end
+        n_outputs.times { |n| @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
       else
         @weight_vec, @bias_term = single_fit(x, y)
       end
@@ -134,7 +131,7 @@ module SVMKit
       n_samples, n_features = samples.shape
       rand_ids = [*0...n_samples].shuffle(random: @rng)
       weight_vec = Numo::DFloat.zeros(n_features)
-      optimizer =
+      optimizer = @params[:optimizer].dup
       # Start optimization.
       @params[:max_iter].times do |_t|
         # random sampling
data/lib/svmkit/optimizer/nadam.rb
CHANGED
@@ -1,16 +1,22 @@
 # frozen_string_literal: true
 
 require 'svmkit/validation'
+require 'svmkit/base/base_estimator'
 
 module SVMKit
   # This module consists of the classes that implement optimizers adaptively tuning hyperparameters.
   module Optimizer
     # Nadam is a class that implements Nadam optimizer.
-    #
+    #
+    # @example
+    #   optimizer = SVMKit::Optimizer::Nadam.new(learning_rate: 0.01, momentum: 0.9, decay1: 0.9, decay2: 0.999)
+    #   estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, random_seed: 1)
+    #   estimator.fit(samples, values)
     #
     # *Reference*
     # - T. Dozat, "Incorporating Nesterov Momentum into Adam," Tech. Repo. Stanford University, 2015.
     class Nadam
+      include Base::BaseEstimator
       include Validation
 
       # Create a new optimizer with Nadam.
@@ -19,7 +25,6 @@ module SVMKit
       # @param momentum [Float] The initial value of momentum.
       # @param decay1 [Float] The smoothing parameter for the first moment.
       # @param decay2 [Float] The smoothing parameter for the second moment.
-      # @param schedule_decay [Float] The smoothing parameter.
       def initialize(learning_rate: 0.01, momentum: 0.9, decay1: 0.9, decay2: 0.999)
         check_params_float(learning_rate: learning_rate, momentum: momentum, decay1: decay1, decay2: decay2)
         check_params_positive(learning_rate: learning_rate, momentum: momentum, decay1: decay1, decay2: decay2)
@@ -59,6 +64,27 @@ module SVMKit
 
         weight - (@params[:learning_rate] / (nm_sec_moment**0.5 + 1e-8)) * ((1 - decay1_curr) * nm_gradient + decay1_next * nm_fst_moment)
       end
+
+      # Dump marshal data.
+      # @return [Hash] The marshal data.
+      def marshal_dump
+        { params: @params,
+          fst_moment: @fst_moment,
+          sec_moment: @sec_moment,
+          decay1_prod: @decay1_prod,
+          iter: @iter }
+      end
+
+      # Load marshal data.
+      # @return [nil]
+      def marshal_load(obj)
+        @params = obj[:params]
+        @fst_moment = obj[:fst_moment]
+        @sec_moment = obj[:sec_moment]
+        @decay1_prod = obj[:decay1_prod]
+        @iter = obj[:iter]
+        nil
+      end
     end
   end
 end
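The new `marshal_dump`/`marshal_load` pair means an optimizer's accumulated state (moments, iteration counter) now survives `Marshal` round-trips, which is what lets the README examples dump a trained model to disk together with its optimizer. A minimal round-trip sketch (the gradient values are arbitrary):

```ruby
require 'svmkit'

optimizer = SVMKit::Optimizer::Nadam.new(learning_rate: 0.01)
weight = Numo::DFloat.zeros(3)
weight = optimizer.call(weight, Numo::DFloat[0.5, -0.2, 0.1]) # one update step

# The restored optimizer carries the same params and accumulated moments.
restored = Marshal.load(Marshal.dump(optimizer))
weight = restored.call(weight, Numo::DFloat[0.5, -0.2, 0.1]) # continues where it left off
```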
data/lib/svmkit/optimizer/rmsprop.rb
ADDED
@@ -0,0 +1,69 @@
+# frozen_string_literal: true
+
+require 'svmkit/validation'
+require 'svmkit/base/base_estimator'
+
+module SVMKit
+  module Optimizer
+    # RMSProp is a class that implements RMSProp optimizer.
+    #
+    # @example
+    #   optimizer = SVMKit::Optimizer::RMSProp.new(learning_rate: 0.01, momentum: 0.9, decay: 0.9)
+    #   estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, random_seed: 1)
+    #   estimator.fit(samples, values)
+    #
+    # *Reference*
+    # - I. Sutskever, J. Martens, G. Dahl, and G. Hinton, "On the importance of initialization and momentum in deep learning," Proc. ICML'13, pp. 1139--1147, 2013.
+    # - G. Hinton, N. Srivastava, and K. Swersky, "Lecture 6e rmsprop," Neural Networks for Machine Learning, 2012.
+    class RMSProp
+      include Base::BaseEstimator
+      include Validation
+
+      # Create a new optimizer with RMSProp.
+      #
+      # @param learning_rate [Float] The initial value of learning rate.
+      # @param momentum [Float] The initial value of momentum.
+      # @param decay [Float] The smoothing parameter.
+      def initialize(learning_rate: 0.01, momentum: 0.9, decay: 0.9)
+        check_params_float(learning_rate: learning_rate, momentum: momentum, decay: decay)
+        check_params_positive(learning_rate: learning_rate, momentum: momentum, decay: decay)
+        @params = {}
+        @params[:learning_rate] = learning_rate
+        @params[:momentum] = momentum
+        @params[:decay] = decay
+        @moment = nil
+        @update = nil
+      end
+
+      # Calculate the updated weight with RMSProp adaptive learning rate.
+      #
+      # @param weight [Numo::DFloat] (shape: [n_features]) The weight to be updated.
+      # @param gradient [Numo::DFloat] (shape: [n_features]) The gradient for updating the weight.
+      # @return [Numo::DFloat] (shape: [n_features]) The updated weight.
+      def call(weight, gradient)
+        @moment ||= Numo::DFloat.zeros(weight.shape[0])
+        @update ||= Numo::DFloat.zeros(weight.shape[0])
+        @moment = @params[:decay] * @moment + (1.0 - @params[:decay]) * gradient**2
+        @update = @params[:momentum] * @update - (@params[:learning_rate] / (@moment**0.5 + 1.0e-8)) * gradient
+        weight + @update
+      end
+
+      # Dump marshal data.
+      # @return [Hash] The marshal data.
+      def marshal_dump
+        { params: @params,
+          moment: @moment,
+          update: @update }
+      end
+
+      # Load marshal data.
+      # @return [nil]
+      def marshal_load(obj)
+        @params = obj[:params]
+        @moment = obj[:moment]
+        @update = obj[:update]
+        nil
+      end
+    end
+  end
+end
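Every optimizer in this release shares the same `call` contract: pass the current weight vector and a gradient, get the updated weight back. A toy illustration minimizing f(w) = (w - 3)^2 with the new RMSProp (illustration only, not from the diff):

```ruby
require 'svmkit'

optimizer = SVMKit::Optimizer::RMSProp.new(learning_rate: 0.1, momentum: 0.9, decay: 0.9)
weight = Numo::DFloat[0.0]

200.times do
  gradient = 2.0 * (weight - 3.0) # gradient of (w - 3)**2
  weight = optimizer.call(weight, gradient)
end

puts weight.to_a.first # should settle near 3.0
```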
data/lib/svmkit/optimizer/sgd.rb
ADDED
@@ -0,0 +1,65 @@
+# frozen_string_literal: true
+
+require 'svmkit/validation'
+require 'svmkit/base/base_estimator'
+
+module SVMKit
+  module Optimizer
+    # SGD is a class that implements SGD optimizer.
+    #
+    # @example
+    #   optimizer = SVMKit::Optimizer::SGD.new(learning_rate: 0.01, momentum: 0.9, decay: 0.9)
+    #   estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, random_seed: 1)
+    #   estimator.fit(samples, values)
+    class SGD
+      include Base::BaseEstimator
+      include Validation
+
+      # Create a new optimizer with SGD.
+      #
+      # @param learning_rate [Float] The initial value of learning rate.
+      # @param momentum [Float] The initial value of momentum.
+      # @param decay [Float] The smoothing parameter.
+      def initialize(learning_rate: 0.01, momentum: 0.0, decay: 0.0)
+        check_params_float(learning_rate: learning_rate, momentum: momentum, decay: decay)
+        check_params_positive(learning_rate: learning_rate, momentum: momentum, decay: decay)
+        @params = {}
+        @params[:learning_rate] = learning_rate
+        @params[:momentum] = momentum
+        @params[:decay] = decay
+        @iter = 0
+        @update = nil
+      end
+
+      # Calculate the updated weight with SGD.
+      #
+      # @param weight [Numo::DFloat] (shape: [n_features]) The weight to be updated.
+      # @param gradient [Numo::DFloat] (shape: [n_features]) The gradient for updating the weight.
+      # @return [Numo::DFloat] (shape: [n_features]) The updated weight.
+      def call(weight, gradient)
+        @update ||= Numo::DFloat.zeros(weight.shape[0])
+        current_learning_rate = @params[:learning_rate] / (1.0 + @params[:decay] * @iter)
+        @iter += 1
+        @update = @params[:momentum] * @update - current_learning_rate * gradient
+        weight + @update
+      end
+
+      # Dump marshal data.
+      # @return [Hash] The marshal data.
+      def marshal_dump
+        { params: @params,
+          iter: @iter,
+          update: @update }
+      end
+
+      # Load marshal data.
+      # @return [nil]
+      def marshal_load(obj)
+        @params = obj[:params]
+        @iter = obj[:iter]
+        @update = obj[:update]
+        nil
+      end
+    end
+  end
+end
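This SGD is plain momentum SGD with a 1/(1 + decay * t) learning-rate schedule: at iteration t the effective rate is `learning_rate / (1.0 + decay * iter)`, exactly as computed in `call` above. A quick illustration of the schedule with assumed values:

```ruby
learning_rate = 0.01
decay = 0.1

5.times do |iter|
  current = learning_rate / (1.0 + decay * iter)
  puts format('iter %d: learning rate %.5f', iter, current)
end
# iter 0: learning rate 0.01000
# iter 1: learning rate 0.00909
# iter 2: learning rate 0.00833 ...
```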
data/lib/svmkit/optimizer/yellow_fin.rb
ADDED
@@ -0,0 +1,144 @@
+# frozen_string_literal: true
+
+require 'svmkit/validation'
+require 'svmkit/base/base_estimator'
+
+module SVMKit
+  module Optimizer
+    # YellowFin is a class that implements YellowFin optimizer.
+    #
+    # @example
+    #   optimizer = SVMKit::Optimizer::YellowFin.new(learning_rate: 0.01, momentum: 0.9, decay: 0.999, window_width: 20)
+    #   estimator = SVMKit::LinearModel::LinearRegression.new(optimizer: optimizer, random_seed: 1)
+    #   estimator.fit(samples, values)
+    #
+    # *Reference*
+    # - J. Zhang and I. Mitliagkas, "YellowFin and the Art of Momentum Tuning," CoRR abs/1706.03471, 2017.
+    class YellowFin
+      include Base::BaseEstimator
+      include Validation
+
+      # Create a new optimizer with YellowFin.
+      #
+      # @param learning_rate [Float] The initial value of learning rate.
+      # @param momentum [Float] The initial value of momentum.
+      # @param decay [Float] The smoothing parameter.
+      # @param window_width [Integer] The sliding window width for searching curvature range.
+      def initialize(learning_rate: 0.01, momentum: 0.9, decay: 0.999, window_width: 20)
+        check_params_float(learning_rate: learning_rate, momentum: momentum, decay: decay)
+        check_params_integer(window_width: window_width)
+        check_params_positive(learning_rate: learning_rate, momentum: momentum, decay: decay, window_width: window_width)
+        @params = {}
+        @params[:learning_rate] = learning_rate
+        @params[:momentum] = momentum
+        @params[:decay] = decay
+        @params[:window_width] = window_width
+        @smth_learning_rate = learning_rate
+        @smth_momentum = momentum
+        @grad_norms = nil
+        @grad_norm_min = 0.0
+        @grad_norm_max = 0.0
+        @grad_mean_sqr = 0.0
+        @grad_mean = 0.0
+        @grad_var = 0.0
+        @grad_norm_mean = 0.0
+        @curve_mean = 0.0
+        @distance_mean = 0.0
+        @update = nil
+      end
+
+      # Calculate the updated weight with adaptive momentum coefficient and learning rate.
+      #
+      # @param weight [Numo::DFloat] (shape: [n_features]) The weight to be updated.
+      # @param gradient [Numo::DFloat] (shape: [n_features]) The gradient for updating the weight.
+      # @return [Numo::DFloat] (shape: [n_features]) The updated weight.
+      def call(weight, gradient)
+        @update ||= Numo::DFloat.zeros(weight.shape[0])
+        curvature_range(gradient)
+        gradient_variance(gradient)
+        distance_to_optimum(gradient)
+        @smth_momentum = @params[:decay] * @smth_momentum + (1 - @params[:decay]) * current_momentum
+        @smth_learning_rate = @params[:decay] * @smth_learning_rate + (1 - @params[:decay]) * current_learning_rate
+        @update = @smth_momentum * @update - @smth_learning_rate * gradient
+        weight + @update
+      end
+
+      private
+
+      def current_momentum
+        dr = Math.sqrt(@grad_norm_max / @grad_norm_min + 1.0e-8)
+        [cubic_root**2, ((dr - 1) / (dr + 1))**2].max
+      end
+
+      def current_learning_rate
+        (1.0 - Math.sqrt(@params[:momentum]))**2 / (@grad_norm_min + 1.0e-8)
+      end
+
+      def cubic_root
+        p = (@distance_mean**2 * @grad_norm_min**2) / (2 * @grad_var + 1.0e-8)
+        w3 = (-Math.sqrt(p**2 + 4.fdiv(27) * p**3) - p).fdiv(2)
+        w = (w3 >= 0.0 ? 1 : -1) * w3.abs**1.fdiv(3)
+        y = w - p / (3 * w + 1.0e-8)
+        y + 1
+      end
+
+      def curvature_range(gradient)
+        @grad_norms ||= []
+        @grad_norms.push((gradient**2).sum)
+        @grad_norms.shift(@grad_norms.size - @params[:window_width]) if @grad_norms.size > @params[:window_width]
+        @grad_norm_min = @params[:decay] * @grad_norm_min + (1 - @params[:decay]) * @grad_norms.min
+        @grad_norm_max = @params[:decay] * @grad_norm_max + (1 - @params[:decay]) * @grad_norms.max
+      end
+
+      def gradient_variance(gradient)
+        @grad_mean_sqr = @params[:decay] * @grad_mean_sqr + (1 - @params[:decay]) * gradient**2
+        @grad_mean = @params[:decay] * @grad_mean + (1 - @params[:decay]) * gradient
+        @grad_var = (@grad_mean_sqr - @grad_mean**2).sum
+      end
+
+      def distance_to_optimum(gradient)
+        grad_sqr = (gradient**2).sum
+        @grad_norm_mean = @params[:decay] * @grad_norm_mean + (1 - @params[:decay]) * Math.sqrt(grad_sqr + 1.0e-8)
+        @curve_mean = @params[:decay] * @curve_mean + (1 - @params[:decay]) * grad_sqr
+        @distance_mean = @params[:decay] * @distance_mean + (1 - @params[:decay]) * (@grad_norm_mean / @curve_mean)
+      end
+
+      # Dump marshal data.
+      # @return [Hash] The marshal data.
+      def marshal_dump
+        { params: @params,
+          smth_learning_rate: @smth_learning_rate,
+          smth_momentum: @smth_momentum,
+          grad_norms: @grad_norms,
+          grad_norm_min: @grad_norm_min,
+          grad_norm_max: @grad_norm_max,
+          grad_mean_sqr: @grad_mean_sqr,
+          grad_mean: @grad_mean,
+          grad_var: @grad_var,
+          grad_norm_mean: @grad_norm_mean,
+          curve_mean: @curve_mean,
+          distance_mean: @distance_mean,
+          update: @update }
+      end
+
+      # Load marshal data.
+      # @return [nil]
+      def marshal_load(obj)
+        @params = obj[:params]
+        @smth_learning_rate = obj[:smth_learning_rate]
+        @smth_momentum = obj[:smth_momentum]
+        @grad_norms = obj[:grad_norms]
+        @grad_norm_min = obj[:grad_norm_min]
+        @grad_norm_max = obj[:grad_norm_max]
+        @grad_mean_sqr = obj[:grad_mean_sqr]
+        @grad_mean = obj[:grad_mean]
+        @grad_var = obj[:grad_var]
+        @grad_norm_mean = obj[:grad_norm_mean]
+        @curve_mean = obj[:curve_mean]
+        @distance_mean = obj[:distance_mean]
+        @update = obj[:update]
+        nil
+      end
+    end
+  end
+end
data/lib/svmkit/polynomial_model/factorization_machine_classifier.rb
CHANGED
@@ -21,8 +21,8 @@ module SVMKit
     #   results = estimator.predict(testing_samples)
     #
     # *Reference*
-    # - S. Rendle, "Factorization Machines with libFM," ACM
-    # - S. Rendle, "Factorization Machines,"
+    # - S. Rendle, "Factorization Machines with libFM," ACM TIST, vol. 3 (3), pp. 57:1--57:22, 2012.
+    # - S. Rendle, "Factorization Machines," Proc. ICDM'10, pp. 995--1000, 2010.
     class FactorizationMachineClassifier
       include Base::BaseEstimator
       include Base::Classifier
@@ -57,7 +57,7 @@ module SVMKit
       # @param max_iter [Integer] The maximum number of iterations.
       # @param batch_size [Integer] The size of the mini batches.
       # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-      #
+      #   If nil is given, Nadam is used.
       # @param random_seed [Integer] The seed value used to initialize the random generator.
       def initialize(n_factors: 2, loss: 'hinge', reg_param_linear: 1.0, reg_param_factor: 1.0,
                      max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
@@ -76,6 +76,7 @@ module SVMKit
       @params[:max_iter] = max_iter
       @params[:batch_size] = batch_size
       @params[:optimizer] = optimizer
+      @params[:optimizer] ||= Optimizer::Nadam.new
       @params[:random_seed] = random_seed
       @params[:random_seed] ||= srand
       @factor_mat = nil
@@ -105,10 +106,7 @@ module SVMKit
       @bias_term = Numo::DFloat.zeros(n_classes)
       n_classes.times do |n|
         bin_y = Numo::Int32.cast(y.eq(@classes[n])) * 2 - 1
-        factor, weight, bias = binary_fit(x, bin_y)
-        @factor_mat[n, true, true] = factor
-        @weight_vec[n, true] = weight
-        @bias_term[n] = bias
+        @factor_mat[n, true, true], @weight_vec[n, true], @bias_term[n] = binary_fit(x, bin_y)
       end
     else
       negative_label = y.to_a.uniq.min
@@ -194,8 +192,8 @@ module SVMKit
       rand_ids = [*0...n_samples].shuffle(random: @rng)
       weight_vec = Numo::DFloat.zeros(n_features + 1)
       factor_mat = Numo::DFloat.zeros(@params[:n_factors], n_features)
-      weight_optimizer =
-      factor_optimizers = Array.new(@params[:n_factors]) {
+      weight_optimizer = @params[:optimizer].dup
+      factor_optimizers = Array.new(@params[:n_factors]) { @params[:optimizer].dup }
       # Start optimization.
       @params[:max_iter].times do |_t|
         # Random sampling.
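The factorization machines get the same optimizer plumbing, with one detail worth noting: `@params[:optimizer].dup` is called once for the linear weights and once per factor, so each parameter group keeps independent optimizer state. A usage sketch combining the classifier with the new YellowFin optimizer (dataset variables are placeholders; the hyperparameter values are illustrative):

```ruby
require 'svmkit'

optimizer = SVMKit::Optimizer::YellowFin.new(learning_rate: 0.01, momentum: 0.9, decay: 0.999, window_width: 20)
estimator = SVMKit::PolynomialModel::FactorizationMachineClassifier.new(
  n_factors: 4, loss: 'hinge', reg_param_linear: 0.001, reg_param_factor: 0.001,
  optimizer: optimizer, random_seed: 1
)
estimator.fit(training_samples, training_labels) # placeholder training data
```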
data/lib/svmkit/polynomial_model/factorization_machine_regressor.rb
CHANGED
@@ -19,8 +19,8 @@ module SVMKit
     #   results = estimator.predict(testing_samples)
     #
     # *Reference*
-    # - S. Rendle, "Factorization Machines with libFM," ACM
-    # - S. Rendle, "Factorization Machines," Proc.
+    # - S. Rendle, "Factorization Machines with libFM," ACM TIST, vol. 3 (3), pp. 57:1--57:22, 2012.
+    # - S. Rendle, "Factorization Machines," Proc. ICDM'10, pp. 995--1000, 2010.
     class FactorizationMachineRegressor
       include Base::BaseEstimator
       include Base::Regressor
@@ -50,7 +50,7 @@ module SVMKit
       # @param max_iter [Integer] The maximum number of iterations.
       # @param batch_size [Integer] The size of the mini batches.
       # @param optimizer [Optimizer] The optimizer to calculate adaptive learning rate.
-      #
+      #   If nil is given, Nadam is used.
       # @param random_seed [Integer] The seed value used to initialize the random generator.
       def initialize(n_factors: 2, reg_param_linear: 1.0, reg_param_factor: 1.0,
                      max_iter: 1000, batch_size: 10, optimizer: nil, random_seed: nil)
@@ -66,6 +66,7 @@ module SVMKit
       @params[:max_iter] = max_iter
       @params[:batch_size] = batch_size
       @params[:optimizer] = optimizer
+      @params[:optimizer] ||= Optimizer::Nadam.new
       @params[:random_seed] = random_seed
       @params[:random_seed] ||= srand
       @factor_mat = nil
@@ -91,12 +92,7 @@ module SVMKit
       @factor_mat = Numo::DFloat.zeros(n_outputs, @params[:n_factors], n_features)
       @weight_vec = Numo::DFloat.zeros(n_outputs, n_features)
       @bias_term = Numo::DFloat.zeros(n_outputs)
-      n_outputs.times do |n|
-        factor, weight, bias = single_fit(x, y[true, n])
-        @factor_mat[n, true, true] = factor
-        @weight_vec[n, true] = weight
-        @bias_term[n] = bias
-      end
+      n_outputs.times { |n| @factor_mat[n, true, true], @weight_vec[n, true], @bias_term[n] = single_fit(x, y[true, n]) }
     else
       @factor_mat, @weight_vec, @bias_term = single_fit(x, y)
     end
@@ -148,8 +144,8 @@ module SVMKit
       rand_ids = [*0...n_samples].shuffle(random: @rng)
       weight_vec = Numo::DFloat.zeros(n_features + 1)
       factor_mat = Numo::DFloat.zeros(@params[:n_factors], n_features)
-      weight_optimizer =
-      factor_optimizers = Array.new(@params[:n_factors]) {
+      weight_optimizer = @params[:optimizer].dup
+      factor_optimizers = Array.new(@params[:n_factors]) { @params[:optimizer].dup }
      # Start optimization.
      @params[:max_iter].times do |_t|
        # Random sampling.
data/lib/svmkit/version.rb
CHANGED
-  VERSION = '0.4.0'
+  VERSION = '0.4.1'
data/svmkit.gemspec
CHANGED
@@ -17,8 +17,8 @@ MSG
   SVMKit is a machine learning library in Ruby.
   SVMKit provides machine learning algorithms with interfaces similar to Scikit-Learn in Python.
   SVMKit currently supports Linear / Kernel Support Vector Machine,
-  Logistic Regression, Ridge, Lasso, Factorization Machine,
-  K-nearest neighbor algorithm, and cross-validation.
+  Logistic Regression, Linear Regression, Ridge, Lasso, Factorization Machine,
+  Naive Bayes, Decision Tree, Random Forest, K-nearest neighbor algorithm, and cross-validation.
 MSG
   spec.homepage = 'https://github.com/yoshoku/svmkit'
   spec.license = 'BSD-2-Clause'
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: svmkit
 version: !ruby/object:Gem::Version
-  version: 0.4.0
+  version: 0.4.1
 platform: ruby
 authors:
 - yoshoku
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2018-06-
+date: 2018-06-08 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: numo-narray
@@ -84,8 +84,8 @@ description: |
   SVMKit is a machine learning library in Ruby.
   SVMKit provides machine learning algorithms with interfaces similar to Scikit-Learn in Python.
   SVMKit currently supports Linear / Kernel Support Vector Machine,
-  Logistic Regression, Ridge, Lasso, Factorization Machine,
-  K-nearest neighbor algorithm, and cross-validation.
+  Logistic Regression, Linear Regression, Ridge, Lasso, Factorization Machine,
+  Naive Bayes, Decision Tree, Random Forest, K-nearest neighbor algorithm, and cross-validation.
 email:
 - yoshoku@outlook.com
 executables: []
@@ -128,6 +128,7 @@ files:
 - lib/svmkit/kernel_approximation/rbf.rb
 - lib/svmkit/kernel_machine/kernel_svc.rb
 - lib/svmkit/linear_model/lasso.rb
+- lib/svmkit/linear_model/linear_regression.rb
 - lib/svmkit/linear_model/logistic_regression.rb
 - lib/svmkit/linear_model/ridge.rb
 - lib/svmkit/linear_model/svc.rb
@@ -140,6 +141,9 @@ files:
 - lib/svmkit/nearest_neighbors/k_neighbors_classifier.rb
 - lib/svmkit/nearest_neighbors/k_neighbors_regressor.rb
 - lib/svmkit/optimizer/nadam.rb
+- lib/svmkit/optimizer/rmsprop.rb
+- lib/svmkit/optimizer/sgd.rb
+- lib/svmkit/optimizer/yellow_fin.rb
 - lib/svmkit/pairwise_metric.rb
 - lib/svmkit/polynomial_model/factorization_machine_classifier.rb
 - lib/svmkit/polynomial_model/factorization_machine_regressor.rb