liblinear-ruby 0.0.7 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: a7d1e2c1eeff706b5cd0494d4fc720dec937f710
- data.tar.gz: 3d436efa057c9fa1a68e0e4e884322b4c07dfb10
+ metadata.gz: fe141358d47228659a23f83ea68a49dc09fc1401
+ data.tar.gz: 2165c62af5e42c40836eb17131f668e44a0c3311
  SHA512:
- metadata.gz: 59d78c950c0d15db0213b6925a56a7aa56b567126fa184ceea744e13c418bedd4cd00638182ac0635f40075041cd96cde6bb1ba1c788440f0fbe70a9e46b128f
- data.tar.gz: fbeaad15badd01d2fea3ebf2ab2a4cf594e7f4b90975e5c88c73f13433f9c3c3282c5d7939dd347355543e2a94216e5e9274d3089a823650cb616507c089bd8b
+ metadata.gz: 723448dcbff38bee0b668ee62d53c07ebbcd0d86feff6022480ab4a5904b39f8ef79bc5440baef3c38f55e120d58ce22327dd69d070cc36454d6dca6afe9c1d6
+ data.tar.gz: cd6f6aaaf33b0cf5d5400328b59341f80561eab64aece3e24c7dde2466800b5b4176e8dd81fcf26fba66eb8d3d444c607288dc2be842e122b1af47f3eb7db8ec
data/README.md CHANGED
@@ -1,8 +1,8 @@
  # Liblinear-Ruby
  [![Gem Version](https://badge.fury.io/rb/liblinear-ruby.png)](http://badge.fury.io/rb/liblinear-ruby)

- Liblinear-Ruby is Ruby interface to LIBLINEAR using SWIG.
- Now, this interface is supporting LIBLINEAR 1.95.
+ Liblinear-Ruby is a Ruby interface to LIBLINEAR using SWIG.
+ Currently, this interface supports LIBLINEAR 2.1.

  ## Installation

@@ -23,63 +23,29 @@ This sample code execute classification with L2-regularized logistic regression.
  ```ruby
  require 'liblinear'

- # Setting parameters
- param = Liblinear::Parameter.new
- param.solver_type = Liblinear::L2R_LR
-
- # Training phase
- labels = [1, -1]
- examples = [
-   {1=>0, 2=>0, 3=>0, 4=>0, 5=>0},
-   {1=>1, 2=>1, 3=>1, 4=>1, 5=>1}
- ]
- bias = 0.5
- prob = Liblinear::Problem.new(labels, examples, bias)
- model = Liblinear::Model.new(prob, param)
-
- # Predicting phase
- puts model.predict({1=>1, 2=>1, 3=>1, 4=>1, 5=>1}) # => -1.0
-
- # Analyzing phase
- puts model.coefficient
- puts model.bias
-
- # Cross Validation
- fold = 2
- cv = Liblinear::CrossValidator.new(prob, param, fold)
- cv.execute
-
- puts cv.accuracy # for classification
- puts cv.mean_squared_error # for regression
- puts cv.squared_correlation_coefficient # for regression
+ # train
+ model = Liblinear.train(
+   { solver_type: Liblinear::L2R_LR },   # parameter
+   [-1, -1, 1, 1],                       # labels (classes) of training data
+   [[-2, -2], [-1, -1], [1, 1], [2, 2]], # training data
+ )
+ # predict
+ puts Liblinear.predict(model, [0.5, 0.5]) # predicted class will be 1
  ```
- ## Usage

- ### Setting parameters
- First, you have to make an instance of Liblinear::Parameter:
- ```ruby
- param = Liblinear::Parameter.new
- ```
- And then set the parameters as:
- ```ruby
- param.[parameter_you_set] = value
- ```
- Or you can set by Hash as:
- ```ruby
- parameter = {
-   parameter_you_set: value,
-   ...
- }
- param = Liblinear::Parameter.new(parameter)
- ```
+ ## Parameter
+ There are some parameters you can specify:

- #### Type of solver
- This parameter is comparable to -s option on command line.
- You can set as:
- ```ruby
- param.solver_type = solver_type # default 1 (Liblinear::L2R_L2LOSS_SVC_DUAL)
- ```
- Solver types you can set are shown below.
+ - `solver_type`
+ - `cost`
+ - `sensitive_loss`
+ - `epsilon`
+ - `weight_labels` and `weights`
+
+ ### solver_type
+ This parameter specifies the type of solver (default: `Liblinear::L2R_L2LOSS_SVC_DUAL`).
+ It corresponds to the `-s` option on the command line.
+ The solver types you can set are shown below:
  ```ruby
  # for multi-class classification
  Liblinear::L2R_LR # L2-regularized logistic regression (primal)
@@ -97,92 +63,80 @@ Liblinear::L2R_L2LOSS_SVR_DUAL # L2-regularized L2-loss support vector regressio
  Liblinear::L2R_L1LOSS_SVR_DUAL # L2-regularized L1-loss support vector regression (dual)
  ```

- #### C parameter
- This parameter is comparable to -c option on command line.
- You can set as:
- ```ruby
- param.C = value # default 1
- ```
+ ### cost
+ This parameter specifies the cost of constraint violation (default `1.0`).
+ It corresponds to the `-c` option on the command line.

- #### Epsilon in loss function of epsilon-SVR
- This parameter is comparable to -p option on command line.
- You can set as:
- ```ruby
- param.p = value # default 0.1
- ```
+ ### sensitive_loss
+ This parameter specifies the epsilon in the loss function of epsilon-SVR (default `0.1`).
+ It corresponds to the `-p` option on the command line.

- #### Tolerance of termination criterion
- This parameter is comparable to -e option on command line.
- You can set as:
- ```ruby
- param.eps = value # default 0.1
- ```
+ ### epsilon
+ This parameter specifies the tolerance of the termination criterion.
+ It corresponds to the `-e` option on the command line.
+ The default value depends on the type of solver. See LIBLINEAR's README or `Liblinear::Parameter.default_epsilon` for more details.

- #### Weight
- This parameter adjust the parameter C of different classes(see LIBLINEAR's README for details).
- nr_weight is the number of elements in the array weight_label and weight.
- You can set as:
- ```ruby
- param.nr_weight = value # default 0
- param.weight_label = [Array <Integer>] # default []
- param.weight = [Array <Double>] # default []
- ```
+ ### weight_labels and weights
+ These parameters are used to change the penalty for some classes (default `[]`).
+ Each `weights[i]` corresponds to `weight_labels[i]`, meaning that the penalty of class `weight_labels[i]` is scaled by a factor of `weights[i]`.
+
+
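Taken together, a parameter hash combining the options above might look like the following sketch (the values are illustrative, not defaults; `labels` and `examples` are prepared as in the Train section below):

```ruby
# Illustrative parameter hash; the keys are the options documented above.
parameter = {
  solver_type: Liblinear::L2R_LR, # -s
  cost: 2.0,                      # -c
  sensitive_loss: 0.1,            # -p (only meaningful for epsilon-SVR solvers)
  epsilon: 0.001,                 # -e
  weight_labels: [1, -1],         # scale the penalty of class 1 by 10.0
  weights: [10.0, 1.0],           # and of class -1 by 1.0
}
model = Liblinear.train(parameter, labels, examples)
```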
+ ## Train
+ First, prepare training data.

- ### Training phase
- You have to prepare training data.
- The format of training data is shown below:
  ```ruby
- # Labels mean class
- label = [1, -1, ...]
+ # Define the class of each training example:
+ labels = [1, -1, ...]

- # Training data have to be array of hash or array of array
- # If you chose array of hash
+ # Training data is an Array of Arrays:
  examples = [
-   {1=>0, 2=>0, 3=>0, 4=>0, 5=>0},
-   {1=>1, 2=>1, 3=>1, 4=>1, 5=>1},
+   [1, 0, 0, 1, 0],
+   [0, 0, 0, 1, 1],
    ...
  ]

- # If you chose array of array
+ # You can also use an Array of Hashes instead:
  examples = [
-   [0, 0, 0, 0, 0],
-   [1, 1, 1, 1, 1],
+   { 1 => 1, 4 => 1 },
+   { 4 => 1, 5 => 1 },
+   ...
  ]
  ```
- Next, set the bias (this is comparable to -B option on command line):
+
+ Next, set the bias (this corresponds to the `-B` option on the command line):
  ```ruby
  bias = 0.5 # default -1
  ```
- And then make an instance of Liblinear::Problem and Liblinear::Model:
- ```ruby
- prob = Liblinear::Problem.new(labels, examples, bias)
- model = Liblinear::Model.new(prob, param)
- ```
- If you have already had a model file, you can load it as:
+
+ Then, specify parameters and execute `Liblinear.train` to get an instance of `Liblinear::Model`.
  ```ruby
- model = Liblinear::Model.new(model_file)
+ model = Liblinear.train(parameter, labels, examples, bias)
  ```
+
  In this phase, you can save the model as:
  ```ruby
  model.save(file_name)
  ```

- ### Predicting phase
- Input a data whose format is same as training data:
+ If you already have a model file, you can load it as:
  ```ruby
- # Hash
- model.predict({1=>1, 2=>1, 3=>1, 4=>1, 5=>1})
- # Array
- model.predict([1, 1, 1, 1, 1])
+ model = Liblinear::Model.load(file_name)
  ```

- ## Contributing
+ ## Predict
+ Prepare the example whose class you want to predict and call `Liblinear.predict`.

- 1. Fork it
- 2. Create your feature branch (`git checkout -b my-new-feature`)
- 3. Commit your changes (`git commit -am 'Add some feature'`)
- 4. Push to the branch (`git push origin my-new-feature`)
- 5. Create new Pull Request
+ ```ruby
+ example = [0, 0, 0, 1, 1]
+ Liblinear.predict(model, example)
+ ```
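`Liblinear.predict` takes one example at a time, so classifying a batch is just a `map` in plain Ruby (a sketch reusing `model` from the Train section):

```ruby
examples = [[0, 0, 0, 1, 1], [1, 0, 0, 1, 0]]
predictions = examples.map { |example| Liblinear.predict(model, example) }
```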
+
+ ## Cross Validation
+ To get the classes predicted by k-fold cross validation, use `Liblinear.cross_validation`.
+ For example, `results[0]` is the class predicted for `examples[0]` by a model trained on the folds that do not contain `examples[0]`.
+ ```ruby
+ results = Liblinear.cross_validation(fold, parameter, labels, examples)
+ ```
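`Liblinear.cross_validation` returns the predicted classes rather than an aggregate score, so computing accuracy is left to the caller. A minimal sketch, assuming `labels` and `examples` as in the Train section:

```ruby
fold = 5
results = Liblinear.cross_validation(fold, { solver_type: Liblinear::L2R_LR }, labels, examples)
correct = results.zip(labels).count { |predicted, actual| predicted == actual }
puts "accuracy: #{correct.to_f / labels.size}"
```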

  ## Thanks
  - http://www.csie.ntu.edu.tw/~cjlin/liblinear/
@@ -0,0 +1,26 @@
+ class Liblinear
+   class Array::Double < Array
+     class << self
+       # @param array [SWIG::TYPE_p_double]
+       # @param size [Integer]
+       # @return [Array <Float>]
+       def decode(array, size)
+         size.times.map { |index| Liblinearswig.double_getitem(array, index) }
+       end
+
+       # @param array [SWIG::TYPE_p_double]
+       def delete(array)
+         Liblinearswig.delete_double(array)
+       end
+     end
+
+     # @param array [Array <Float>]
+     def initialize(array)
+       @array = Liblinearswig.new_double(array.size)
+       array.size.times do |index|
+         Liblinearswig.double_setitem(@array, index, array[index])
+       end
+       @size = array.size
+     end
+   end
+ end
@@ -0,0 +1,26 @@
+ class Liblinear
+   class Array::Integer < Array
+     class << self
+       # @param array [SWIG::TYPE_p_int]
+       # @param size [Integer]
+       # @return [Array <Integer>]
+       def decode(array, size)
+         size.times.map { |index| Liblinearswig.int_getitem(array, index) }
+       end
+
+       # @param array [SWIG::TYPE_p_int]
+       def delete(array)
+         Liblinearswig.delete_int(array)
+       end
+     end
+
+     # @param array [Array <Integer>]
+     def initialize(array)
+       @array = Liblinearswig.new_int(array.size)
+       array.size.times do |index|
+         Liblinearswig.int_setitem(@array, index, array[index])
+       end
+       @size = array.size
+     end
+   end
+ end
@@ -0,0 +1,15 @@
+ class Liblinear
+   class Array
+     def swig
+       @array
+     end
+
+     def decode
+       self.class.decode(@array, @size)
+     end
+
+     def delete
+       self.class.delete(@array)
+     end
+   end
+ end
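These three classes wrap SWIG-allocated C arrays behind a small Ruby facade: the typed subclasses handle allocation and element access, while the base class above supplies the shared `swig`, `decode`, and `delete` plumbing. A minimal usage sketch (values made up):

```ruby
c_array = Liblinear::Array::Double.new([0.5, 1.5, -2.0]) # copies into C memory
puts c_array.decode.inspect                              # => [0.5, 1.5, -2.0]
c_array.delete                                           # frees the C memory
```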
@@ -1,4 +1,4 @@
- module Liblinear
+ class Liblinear
    class InvalidParameter < StandardError
    end
  end
@@ -0,0 +1,29 @@
+ class Liblinear
+   class Example
+     class << self
+       # @param examples [Array <Hash, Array>]
+       # @return [Integer]
+       def max_feature_id(examples)
+         max_feature_id = 0
+         examples.each do |example|
+           if example.is_a?(::Hash)
+             max_feature_id = [max_feature_id, example.keys.max].max if example.size > 0
+           else
+             max_feature_id = [max_feature_id, example.size].max
+           end
+         end
+         max_feature_id
+       end
+
+       # @param example_array [Array]
+       # @return [Hash]
+       def array_to_hash(example_array)
+         example_hash = {}
+         example_array.size.times do |index|
+           example_hash[index + 1] = example_array[index]
+         end
+         example_hash
+       end
+     end
+   end
+ end
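For a concrete feel of these helpers, here is what they return for made-up inputs (a Hash's keys are feature ids, while an Array's length doubles as its largest id):

```ruby
Liblinear::Example.max_feature_id([{ 2 => 1.0 }, [0.5, 0.5, 0.5]]) # => 3

# Plain arrays become 1-indexed feature hashes:
Liblinear::Example.array_to_hash([0.5, 0.0, 1.0]) # => {1=>0.5, 2=>0.0, 3=>1.0}
```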
@@ -0,0 +1,40 @@
+ class Liblinear
+   class FeatureNode
+     # @param example [Array <Float> or Hash]
+     # @param max_feature_id [Integer]
+     # @param bias [Float]
+     def initialize(example, max_feature_id, bias = -1)
+       example = Liblinear::Example.array_to_hash(example) if example.is_a?(::Array)
+
+       example_indexes = []
+       example.each_key do |key|
+         example_indexes << key
+       end
+       example_indexes.sort!
+
+       if bias >= 0
+         @feature_node = Liblinearswig.feature_node_array(example_indexes.size + 2)
+         Liblinearswig.feature_node_array_set(@feature_node, example_indexes.size, max_feature_id + 1, bias)
+         Liblinearswig.feature_node_array_set(@feature_node, example_indexes.size + 1, -1, 0)
+       else
+         @feature_node = Liblinearswig.feature_node_array(example_indexes.size + 1)
+         Liblinearswig.feature_node_array_set(@feature_node, example_indexes.size, -1, 0)
+       end
+
+       f_index = 0
+       example_indexes.each do |e_index|
+         Liblinearswig.feature_node_array_set(@feature_node, f_index, e_index, example[e_index])
+         f_index += 1
+       end
+     end
+
+     # @return [Liblinearswig::Feature_node]
+     def swig
+       @feature_node
+     end
+
+     def delete
+       Liblinearswig.feature_node_array_destroy(@feature_node)
+     end
+   end
+ end
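To picture the layout the constructor builds: with made-up inputs `{ 1 => 0.5, 3 => 1.0 }`, `max_feature_id = 3`, and `bias = 1`, the underlying C array of `feature_node` structs ends up holding:

```ruby
node = Liblinear::FeatureNode.new({ 1 => 0.5, 3 => 1.0 }, 3, 1)
# (index, value) pairs now stored, in order:
#   (1, 0.5), (3, 1.0)  # the example's features, sorted by index
#   (4, 1)              # bias term written at max_feature_id + 1
#   (-1, 0)             # sentinel terminator that LIBLINEAR scans for
node.delete # free the C memory when done
```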
@@ -0,0 +1,23 @@
+ class Liblinear
+   class FeatureNodeMatrix
+     # @param examples [Array <Array <Float> or Hash>]
+     # @param bias [Float]
+     def initialize(examples, bias)
+       @feature_node_matrix = Liblinearswig.feature_node_matrix(examples.size)
+       max_feature_id = Liblinear::Example.max_feature_id(examples)
+       examples.size.times do |index|
+         feature_node = Liblinear::FeatureNode.new(examples[index], max_feature_id, bias)
+         Liblinearswig.feature_node_matrix_set(@feature_node_matrix, index, feature_node.swig)
+       end
+     end
+
+     # @return [SWIG::TYPE_p_p_feature_node]
+     def swig
+       @feature_node_matrix
+     end
+
+     def delete
+       Liblinearswig.feature_node_matrix_destroy(@feature_node_matrix)
+     end
+   end
+ end
@@ -1,113 +1,78 @@
- module Liblinear
+ class Liblinear
    class Model
-     include Liblinear
-     include Liblinearswig
-     attr_accessor :model
+     class << self
+       # @param problem [Liblinear::Problem]
+       # @param parameter [Liblinear::Parameter]
+       # @return [Liblinear::Model]
+       def train(problem, parameter)
+         model = self.new
+         model.train(problem, parameter)
+         model
+       end

-     # @param arg_1 [LibLinear::Problem, String]
-     # @param arg_2 [Liblinear::Parameter]
-     # @raise [ArgumentError]
-     # @raise [Liblinear::InvalidParameter]
-     def initialize(arg_1, arg_2 = nil)
-       if arg_2
-         unless arg_1.is_a?(Liblinear::Problem) && arg_2.is_a?(Liblinear::Parameter)
-           raise ArgumentError, 'arguments must be [Liblinear::Problem] and [Liblinear::Parameter]'
-         end
-         error_msg = check_parameter(arg_1.prob, arg_2.param)
-         raise InvalidParameter, error_msg if error_msg
-         @model = train(arg_1.prob, arg_2.param)
-       else
-         raise ArgumentError, 'argument must be [String]' unless arg_1.is_a?(String)
-         @model = load_model(arg_1)
+       # @param file_name [String]
+       # @return [Liblinear::Model]
+       def load(file_name)
+         model = self.new
+         model.load(file_name)
+         model
        end
      end

-     # @return [Integer]
-     def class_size
-       get_nr_class(@model)
+     # @param problem [Liblinear::Problem]
+     # @param parameter [Liblinear::Parameter]
+     def train(problem, parameter)
+       @model = Liblinearswig.train(problem.swig, parameter.swig)
      end

-     # @return [Integer]
-     def nr_class
-       warn "'nr_class' is deprecated. Please use 'class_size' instead."
-       class_size
+     # @param file_name [String]
+     def load(file_name)
+       @model = Liblinearswig.load_model(file_name)
      end

-     # @return [Integer]
-     def feature_size
-       get_nr_feature(@model)
+     # @return [Liblinearswig::Model]
+     def swig
+       @model
      end

-     # @return [Array <Integer>]
-     def labels
-       c_int_array = new_int(class_size)
-       get_labels(@model, c_int_array)
-       labels = int_array_c_to_ruby(c_int_array, class_size)
-       delete_int(c_int_array)
-       labels
+     # @param filename [String]
+     def save(filename)
+       Liblinearswig.save_model(filename, @model)
      end

-     # @param example [Array, Hash]
-     # @return [Double]
-     def predict(example)
-       feature_nodes = convert_to_feature_node_array(example, @model.nr_feature, @model.bias)
-       prediction = Liblinearswig.predict(@model, feature_nodes)
-       feature_node_array_destroy(feature_nodes)
-       prediction
+     # @return [Integer]
+     def class_size
+       @model.nr_class
      end

-     # @param example [Array, Hash]
-     # @return [Hash]
-     def predict_probability(example)
-       predict_prob_val(example, :predict_probability)
+     # @return [Integer]
+     def feature_size
+       @model.nr_feature
      end

-     # @param example [Array, Hash]
-     # @return [Hash]
-     def predict_values(example)
-       predict_prob_val(example, :predict_values)
+     # @return [Array <Float>]
+     def feature_weights
+       Liblinear::Array::Double.decode(@model.w, feature_size)
      end

-     # @param filename [String]
-     def save(filename)
-       save_model(filename, @model)
+     # @return [Float]
+     def bias
+       @model.bias
      end

-     # @param feature_index [Integer]
-     # @param label_index [Integer]
-     # @return [Double, Array <Double>]
-     def coefficient(feature_index = nil, label_index = 0)
-       return get_decfun_coef(@model, feature_index, label_index) if feature_index
-       coefficients = []
-       feature_size.times.map {|feature_index| get_decfun_coef(@model, feature_index + 1, label_index)}
+     # @return [Array <Integer>]
+     def labels
+       Liblinear::Array::Integer.decode(@model.label, class_size)
      end

-     # @param label_index [Integer]
-     # @return [Double]
-     def bias(label_index = 0)
-       get_decfun_bias(@model, label_index)
+     # @return [Boolean]
+     def probability_model?
+       Liblinearswig.check_probability_model(@model) == 1 ? true : false
      end

      # @return [Boolean]
      def regression_model?
-       check_regression_model(@model) == 1 ? true : false
-     end
-
-     private
-     # @param example [Array, Hash]
-     # @return [Hash]
-     def predict_prob_val(example, liblinear_func)
-       feature_nodes = convert_to_feature_node_array(example, @model.nr_feature, @model.bias)
-       c_double_array = new_double(class_size)
-       Liblinearswig.send(liblinear_func, @model, feature_nodes, c_double_array)
-       values = double_array_c_to_ruby(c_double_array, class_size)
-       delete_double(c_double_array)
-       feature_node_array_destroy(feature_nodes)
-       value_list = {}
-       labels.size.times do |i|
-         value_list[labels[i]] = values[i]
-       end
-       value_list
+       Liblinearswig.check_regression_model(@model) == 1 ? true : false
      end
    end
  end
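Reading the reworked Model API as a whole, an end-to-end sketch might look like this (file name made up; training data borrowed from the README's quick start):

```ruby
require 'liblinear'

model = Liblinear.train(
  { solver_type: Liblinear::L2R_LR },
  [-1, -1, 1, 1],
  [[-2, -2], [-1, -1], [1, 1], [2, 2]]
)
model.save('sample.model')                    # Liblinearswig.save_model under the hood
model = Liblinear::Model.load('sample.model') # round-trip via the new class-level loader
puts model.class_size                         # => 2
puts model.labels.inspect                     # e.g. [-1, 1]
puts model.feature_weights.inspect            # decoded weight vector
puts model.regression_model?                  # => false for a classifier
```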