liblinear-ruby 0.0.7 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: a7d1e2c1eeff706b5cd0494d4fc720dec937f710
-  data.tar.gz: 3d436efa057c9fa1a68e0e4e884322b4c07dfb10
+  metadata.gz: fe141358d47228659a23f83ea68a49dc09fc1401
+  data.tar.gz: 2165c62af5e42c40836eb17131f668e44a0c3311
 SHA512:
-  metadata.gz: 59d78c950c0d15db0213b6925a56a7aa56b567126fa184ceea744e13c418bedd4cd00638182ac0635f40075041cd96cde6bb1ba1c788440f0fbe70a9e46b128f
-  data.tar.gz: fbeaad15badd01d2fea3ebf2ab2a4cf594e7f4b90975e5c88c73f13433f9c3c3282c5d7939dd347355543e2a94216e5e9274d3089a823650cb616507c089bd8b
+  metadata.gz: 723448dcbff38bee0b668ee62d53c07ebbcd0d86feff6022480ab4a5904b39f8ef79bc5440baef3c38f55e120d58ce22327dd69d070cc36454d6dca6afe9c1d6
+  data.tar.gz: cd6f6aaaf33b0cf5d5400328b59341f80561eab64aece3e24c7dde2466800b5b4176e8dd81fcf26fba66eb8d3d444c607288dc2be842e122b1af47f3eb7db8ec
data/README.md CHANGED
@@ -1,8 +1,8 @@
 # Liblinear-Ruby
 [![Gem Version](https://badge.fury.io/rb/liblinear-ruby.png)](http://badge.fury.io/rb/liblinear-ruby)
 
-Liblinear-Ruby is Ruby interface to LIBLINEAR using SWIG.
-Now, this interface is supporting LIBLINEAR 1.95.
+Liblinear-Ruby is a Ruby interface to LIBLINEAR using SWIG.
+This interface currently supports LIBLINEAR 2.1.
 
 ## Installation
 
@@ -23,63 +23,29 @@ This sample code execute classification with L2-regularized logistic regression.
 ```ruby
 require 'liblinear'
 
-# Setting parameters
-param = Liblinear::Parameter.new
-param.solver_type = Liblinear::L2R_LR
-
-# Training phase
-labels = [1, -1]
-examples = [
-  {1=>0, 2=>0, 3=>0, 4=>0, 5=>0},
-  {1=>1, 2=>1, 3=>1, 4=>1, 5=>1}
-]
-bias = 0.5
-prob = Liblinear::Problem.new(labels, examples, bias)
-model = Liblinear::Model.new(prob, param)
-
-# Predicting phase
-puts model.predict({1=>1, 2=>1, 3=>1, 4=>1, 5=>1}) # => -1.0
-
-# Analyzing phase
-puts model.coefficient
-puts model.bias
-
-# Cross Validation
-fold = 2
-cv = Liblinear::CrossValidator.new(prob, param, fold)
-cv.execute
-
-puts cv.accuracy # for classification
-puts cv.mean_squared_error # for regression
-puts cv.squared_correlation_coefficient # for regression
+# train
+model = Liblinear.train(
+  { solver_type: Liblinear::L2R_LR },   # parameter
+  [-1, -1, 1, 1],                       # labels (classes) of training data
+  [[-2, -2], [-1, -1], [1, 1], [2, 2]], # training data
+)
+# predict
+puts Liblinear.predict(model, [0.5, 0.5]) # predicted class will be 1
 ```
-## Usage
 
-### Setting parameters
-First, you have to make an instance of Liblinear::Parameter:
-```ruby
-param = Liblinear::Parameter.new
-```
-And then set the parameters as:
-```ruby
-param.[parameter_you_set] = value
-```
-Or you can set by Hash as:
-```ruby
-parameter = {
-  parameter_you_set: value,
-  ...
-}
-param = Liblinear::Parameter.new(parameter)
-```
+## Parameter
+There are some parameters you can specify:
 
-#### Type of solver
-This parameter is comparable to -s option on command line.
-You can set as:
-```ruby
-param.solver_type = solver_type # default 1 (Liblinear::L2R_L2LOSS_SVC_DUAL)
-```
-Solver types you can set are shown below.
+- `solver_type`
+- `cost`
+- `sensitive_loss`
+- `epsilon`
+- `weight_labels` and `weights`
+
+### solver_type
+This parameter specifies the type of solver (default: `Liblinear::L2R_L2LOSS_SVC_DUAL`).
+It corresponds to the `-s` option on the command line.
+The solver types you can set are shown below:
 ```ruby
 # for multi-class classification
 Liblinear::L2R_LR # L2-regularized logistic regression (primal)
@@ -97,92 +63,80 @@ Liblinear::L2R_L2LOSS_SVR_DUAL # L2-regularized L2-loss support vector regression (dual)
 Liblinear::L2R_L1LOSS_SVR_DUAL # L2-regularized L1-loss support vector regression (dual)
 ```
 
-#### C parameter
-This parameter is comparable to -c option on command line.
-You can set as:
-```ruby
-param.C = value # default 1
-```
+### cost
+This parameter specifies the cost of constraint violation (default: `1.0`).
+It corresponds to the `-c` option on the command line.
 
-#### Epsilon in loss function of epsilon-SVR
-This parameter is comparable to -p option on command line.
-You can set as:
-```ruby
-param.p = value # default 0.1
-```
+### sensitive_loss
+This parameter specifies the epsilon in the loss function of epsilon-SVR (default: `0.1`).
+It corresponds to the `-p` option on the command line.
 
-#### Tolerance of termination criterion
-This parameter is comparable to -e option on command line.
-You can set as:
-```ruby
-param.eps = value # default 0.1
-```
+### epsilon
+This parameter specifies the tolerance of the termination criterion.
+It corresponds to the `-e` option on the command line.
+The default value depends on the type of solver. See LIBLINEAR's README or `Liblinear::Parameter.default_epsilon` for more details.
 
-#### Weight
-This parameter adjust the parameter C of different classes(see LIBLINEAR's README for details).
-nr_weight is the number of elements in the array weight_label and weight.
-You can set as:
-```ruby
-param.nr_weight = value # default 0
-param.weight_label = [Array <Integer>] # default []
-param.weight = [Array <Double>] # default []
-```
+### weight_labels and weights
+These parameters are used to change the penalty for some classes (default: `[]`).
+Each `weights[i]` corresponds to `weight_labels[i]`, meaning that the penalty of class `weight_labels[i]` is scaled by a factor of `weights[i]`.
+
+
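Putting the options above together: the parameter Hash is the first argument to `Liblinear.train`. A minimal sketch using the key names from the list above (the values are illustrative, not defaults):

```ruby
require 'liblinear'

parameter = {
  solver_type:   Liblinear::L2R_LR, # -s: L2-regularized logistic regression (primal)
  cost:          10.0,              # -c: cost of constraint violation
  epsilon:       0.01,              # -e: termination tolerance
  weight_labels: [1, -1],           # penalty of class 1 is scaled by 3.0,
  weights:       [3.0, 1.0],        #   penalty of class -1 stays at 1.0
}

model = Liblinear.train(
  parameter,
  [-1, -1, 1, 1],                       # labels
  [[-2, -2], [-1, -1], [1, 1], [2, 2]], # training data
)
```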
+## Train
+First, prepare training data.
 
-### Training phase
-You have to prepare training data.
-The format of training data is shown below:
 ```ruby
-# Labels mean class
-label = [1, -1, ...]
+# Define the class of each training example:
+labels = [1, -1, ...]
 
-# Training data have to be array of hash or array of array
-# If you chose array of hash
+# Training data is an Array of Arrays:
 examples = [
-  {1=>0, 2=>0, 3=>0, 4=>0, 5=>0},
-  {1=>1, 2=>1, 3=>1, 4=>1, 5=>1},
+  [1, 0, 0, 1, 0],
+  [0, 0, 0, 1, 1],
   ...
 ]
 
-# If you chose array of array
+# You can also use an Array of Hashes instead:
 examples = [
-  [0, 0, 0, 0, 0],
-  [1, 1, 1, 1, 1],
+  { 1 => 1, 4 => 1 },
+  { 4 => 1, 5 => 1 },
+  ...
 ]
 ```
-Next, set the bias (this is comparable to -B option on command line):
+
+Next, set the bias (this corresponds to the `-B` option on the command line):
 ```ruby
 bias = 0.5 # default -1
 ```
-And then make an instance of Liblinear::Problem and Liblinear::Model:
-```ruby
-prob = Liblinear::Problem.new(labels, examples, bias)
-model = Liblinear::Model.new(prob, param)
-```
-If you have already had a model file, you can load it as:
+
+Then specify the parameters and call `Liblinear.train` to get an instance of `Liblinear::Model`:
 ```ruby
-model = Liblinear::Model.new(model_file)
+model = Liblinear.train(parameter, labels, examples, bias)
 ```
+
 In this phase, you can save model as:
 ```ruby
 model.save(file_name)
 ```
 
-### Predicting phase
-Input a data whose format is same as training data:
+If you already have a model file, you can load it as:
 ```ruby
-# Hash
-model.predict({1=>1, 2=>1, 3=>1, 4=>1, 5=>1})
-# Array
-model.predict([1, 1, 1, 1, 1])
+model = Liblinear::Model.load(file_name)
 ```
 
-## Contributing
+## Predict
+Prepare the example whose class you want to predict and call `Liblinear.predict`:
 
-1. Fork it
-2. Create your feature branch (`git checkout -b my-new-feature`)
-3. Commit your changes (`git commit -am 'Add some feature'`)
-4. Push to the branch (`git push origin my-new-feature`)
-5. Create new Pull Request
+```ruby
+example = [0, 0, 0, 1, 1]
+Liblinear.predict(model, example)
+```
+
+## Cross Validation
+To get the classes predicted by k-fold cross validation, use `Liblinear.cross_validation`.
+For example, `results[0]` is the class predicted for `examples[0]` by a model trained on the folds that do not contain it.
+```ruby
+results = Liblinear.cross_validation(fold, parameter, labels, examples)
+```
 
 ## Thanks
 - http://www.csie.ntu.edu.tw/~cjlin/liblinear/
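Note that `Liblinear.cross_validation` returns one predicted class per training example rather than an aggregate score (the old `cv.accuracy` helper is gone), so an accuracy figure has to be computed from the result. A sketch under that reading of the README above:

```ruby
require 'liblinear'

parameter = { solver_type: Liblinear::L2R_L2LOSS_SVC_DUAL }
labels    = [-1, -1, -1, 1, 1, 1]
examples  = [[-3, -3], [-2, -2], [-1, -1], [1, 1], [2, 2], [3, 3]]

# 3-fold cross validation: results[i] is the class predicted for examples[i].
results = Liblinear.cross_validation(3, parameter, labels, examples)

# Compare predictions against the true labels to recover an accuracy figure.
correct = results.zip(labels).count { |predicted, actual| predicted == actual }
puts "accuracy: #{correct.to_f / labels.size}"
```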
@@ -0,0 +1,26 @@
+class Liblinear
+  class Array::Double < Array
+    class << self
+      # @param array [SWIG::TYPE_p_double]
+      # @param size [Integer]
+      # @return [Array <Float>]
+      def decode(array, size)
+        size.times.map {|index| Liblinearswig.double_getitem(array, index)}
+      end
+
+      # @param array [SWIG::TYPE_p_double]
+      def delete(array)
+        Liblinearswig.delete_double(array)
+      end
+    end
+
+    # @param array [Array <Float>]
+    def initialize(array)
+      @array = Liblinearswig.new_double(array.size)
+      array.size.times do |index|
+        Liblinearswig.double_setitem(@array, index, array[index])
+      end
+      @size = array.size
+    end
+  end
+end
@@ -0,0 +1,26 @@
+class Liblinear
+  class Array::Integer < Array
+    class << self
+      # @param array [SWIG::TYPE_p_int]
+      # @param size [Integer]
+      # @return [Array <Integer>]
+      def decode(array, size)
+        size.times.map {|index| Liblinearswig.int_getitem(array, index)}
+      end
+
+      # @param array [SWIG::TYPE_p_int]
+      def delete(array)
+        Liblinearswig.delete_int(array)
+      end
+    end
+
+    # @param array [Array <Integer>]
+    def initialize(array)
+      @array = Liblinearswig.new_int(array.size)
+      array.size.times do |index|
+        Liblinearswig.int_setitem(@array, index, array[index])
+      end
+      @size = array.size
+    end
+  end
+end
@@ -0,0 +1,15 @@
+class Liblinear
+  class Array
+    def swig
+      @array
+    end
+
+    def decode
+      self.class.decode(@array, @size)
+    end
+
+    def delete
+      self.class.delete(@array)
+    end
+  end
+end
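The three files above wrap SWIG-allocated C arrays so the rest of the gem can move data across the Ruby/C boundary. A round-trip sketch of how the wrappers fit together (assuming the compiled `Liblinearswig` extension is loaded via `require 'liblinear'`):

```ruby
require 'liblinear'

# Copy a Ruby Array into a newly allocated C double array.
weights = Liblinear::Array::Double.new([0.5, 1.0, 1.5])

weights.swig   # SWIG pointer (SWIG::TYPE_p_double), ready for Liblinearswig calls
weights.decode # => [0.5, 1.0, 1.5] -- copied back into a Ruby Array

# The C buffer is not garbage-collected; release it explicitly.
weights.delete
```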
@@ -1,4 +1,4 @@
-module Liblinear
+class Liblinear
   class InvalidParameter < StandardError
   end
 end
@@ -0,0 +1,29 @@
+class Liblinear
+  class Example
+    class << self
+      # @param examples [Array <Hash, Array>]
+      # @return [Integer]
+      def max_feature_id(examples)
+        max_feature_id = 0
+        examples.each do |example|
+          if example.is_a?(::Hash)
+            max_feature_id = [max_feature_id, example.keys.max].max if example.size > 0
+          else
+            max_feature_id = [max_feature_id, example.size].max
+          end
+        end
+        max_feature_id
+      end
+
+      # @param example_array [Array]
+      # @return [Hash]
+      def array_to_hash(example_array)
+        example_hash = {}
+        example_array.size.times do |index|
+          example_hash[index + 1] = example_array[index]
+        end
+        example_hash
+      end
+    end
+  end
+end
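`Liblinear::Example` normalizes the two example formats the README allows: a Hash already maps feature ids to values, while an Array uses 1-based positions as implicit feature ids. Working through the methods above by hand:

```ruby
require 'liblinear'

# Array positions become 1-based feature ids:
Liblinear::Example.array_to_hash([0.5, 0.0, 1.0])
# => { 1 => 0.5, 2 => 0.0, 3 => 1.0 }

# A Hash contributes its largest key, an Array its size:
Liblinear::Example.max_feature_id([{ 2 => 1.0, 7 => 0.5 }, [1, 0, 1]])
# => 7
```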
@@ -0,0 +1,40 @@
+class Liblinear
+  class FeatureNode
+    # @param example [Array <Float> or Hash]
+    # @param max_feature_id [Integer]
+    # @param bias [Float]
+    def initialize(example, max_feature_id, bias = -1)
+      example = Liblinear::Example.array_to_hash(example) if example.is_a?(::Array)
+
+      example_indexes = []
+      example.each_key do |key|
+        example_indexes << key
+      end
+      example_indexes.sort!
+
+      if bias >= 0
+        @feature_node = Liblinearswig.feature_node_array(example_indexes.size + 2)
+        Liblinearswig.feature_node_array_set(@feature_node, example_indexes.size, max_feature_id + 1, bias)
+        Liblinearswig.feature_node_array_set(@feature_node, example_indexes.size + 1, -1, 0)
+      else
+        @feature_node = Liblinearswig.feature_node_array(example_indexes.size + 1)
+        Liblinearswig.feature_node_array_set(@feature_node, example_indexes.size, -1, 0)
+      end
+
+      f_index = 0
+      example_indexes.each do |e_index|
+        Liblinearswig.feature_node_array_set(@feature_node, f_index, e_index, example[e_index])
+        f_index += 1
+      end
+    end
+
+    # @return [Liblinearswig::Feature_node]
+    def swig
+      @feature_node
+    end
+
+    def delete
+      Liblinearswig.feature_node_array_destroy(@feature_node)
+    end
+  end
+end
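To make the construction above concrete: feature ids are written in ascending order, a bias node is appended at id `max_feature_id + 1` when `bias >= 0`, and a node with index `-1` terminates the array, as LIBLINEAR expects. A sketch with the resulting layout traced in comments:

```ruby
require 'liblinear'

# bias = 0.5, max_feature_id = 3:
node = Liblinear::FeatureNode.new({ 3 => 0.5, 1 => 1.0 }, 3, 0.5)

# The underlying C array now holds:
#   (index 1,  value 1.0)  # features, sorted by id
#   (index 3,  value 0.5)
#   (index 4,  value 0.5)  # bias node at max_feature_id + 1
#   (index -1, value 0)    # terminator
node.swig   # pointer handed to Liblinearswig
node.delete # free the C memory
```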
@@ -0,0 +1,23 @@
+class Liblinear
+  class FeatureNodeMatrix
+    # @param examples [Array <Array <Float> or Hash>]
+    # @param bias [Float]
+    def initialize(examples, bias)
+      @feature_node_matrix = Liblinearswig.feature_node_matrix(examples.size)
+      max_feature_id = Liblinear::Example.max_feature_id(examples)
+      examples.size.times do |index|
+        feature_node = Liblinear::FeatureNode.new(examples[index], max_feature_id, bias)
+        Liblinearswig.feature_node_matrix_set(@feature_node_matrix, index, feature_node.swig)
+      end
+    end
+
+    # @return [SWIG::TYPE_p_p_feature_node]
+    def swig
+      @feature_node_matrix
+    end
+
+    def delete
+      Liblinearswig.feature_node_matrix_destroy(@feature_node_matrix)
+    end
+  end
+end
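`FeatureNodeMatrix` repeats that conversion over a whole training set, computing one shared `max_feature_id` so every row places its bias node at the same index. A brief sketch:

```ruby
require 'liblinear'

examples = [[1, 0, 0], { 2 => 1.0, 3 => 0.5 }] # Array and Hash rows can be mixed
matrix   = Liblinear::FeatureNodeMatrix.new(examples, 0.5)

matrix.swig   # SWIG::TYPE_p_p_feature_node, one row per example
matrix.delete # free the underlying C memory
```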
@@ -1,113 +1,78 @@
-module Liblinear
+class Liblinear
   class Model
-    include Liblinear
-    include Liblinearswig
-    attr_accessor :model
+    class << self
+      # @param problem [Liblinear::Problem]
+      # @param parameter [Liblinear::Parameter]
+      # @return [Liblinear::Model]
+      def train(problem, parameter)
+        model = self.new
+        model.train(problem, parameter)
+        model
+      end
 
-    # @param arg_1 [LibLinear::Problem, String]
-    # @param arg_2 [Liblinear::Parameter]
-    # @raise [ArgumentError]
-    # @raise [Liblinear::InvalidParameter]
-    def initialize(arg_1, arg_2 = nil)
-      if arg_2
-        unless arg_1.is_a?(Liblinear::Problem) && arg_2.is_a?(Liblinear::Parameter)
-          raise ArgumentError, 'arguments must be [Liblinear::Problem] and [Liblinear::Parameter]'
-        end
-        error_msg = check_parameter(arg_1.prob, arg_2.param)
-        raise InvalidParameter, error_msg if error_msg
-        @model = train(arg_1.prob, arg_2.param)
-      else
-        raise ArgumentError, 'argument must be [String]' unless arg_1.is_a?(String)
-        @model = load_model(arg_1)
+      # @param file_name [String]
+      # @return [Liblinear::Model]
+      def load(file_name)
+        model = self.new
+        model.load(file_name)
+        model
       end
     end
 
-    # @return [Integer]
-    def class_size
-      get_nr_class(@model)
+    # @param problem [Liblinear::Problem]
+    # @param parameter [Liblinear::Parameter]
+    def train(problem, parameter)
+      @model = Liblinearswig.train(problem.swig, parameter.swig)
     end
 
-    # @return [Integer]
-    def nr_class
-      warn "'nr_class' is deprecated. Please use 'class_size' instead."
-      class_size
+    # @param file_name [String]
+    def load(file_name)
+      @model = Liblinearswig.load_model(file_name)
     end
 
-    # @return [Integer]
-    def feature_size
-      get_nr_feature(@model)
+    # @return [Liblinearswig::Model]
+    def swig
+      @model
    end
 
-    # @return [Array <Integer>]
-    def labels
-      c_int_array = new_int(class_size)
-      get_labels(@model, c_int_array)
-      labels = int_array_c_to_ruby(c_int_array, class_size)
-      delete_int(c_int_array)
-      labels
+    # @param filename [String]
+    def save(filename)
+      Liblinearswig.save_model(filename, @model)
     end
 
-    # @param example [Array, Hash]
-    # @return [Double]
-    def predict(example)
-      feature_nodes = convert_to_feature_node_array(example, @model.nr_feature, @model.bias)
-      prediction = Liblinearswig.predict(@model, feature_nodes)
-      feature_node_array_destroy(feature_nodes)
-      prediction
+    # @return [Integer]
+    def class_size
+      @model.nr_class
     end
 
-    # @param example [Array, Hash]
-    # @return [Hash]
-    def predict_probability(example)
-      predict_prob_val(example, :predict_probability)
+    # @return [Integer]
+    def feature_size
+      @model.nr_feature
    end
 
-    # @param example [Array, Hash]
-    # @return [Hash]
-    def predict_values(example)
-      predict_prob_val(example, :predict_values)
+    # @return [Array <Float>]
+    def feature_weights
+      Liblinear::Array::Double.decode(@model.w, feature_size)
    end
 
-    # @param filename [String]
-    def save(filename)
-      save_model(filename, @model)
+    # @return [Float]
+    def bias
+      @model.bias
    end
 
-    # @param feature_index [Integer]
-    # @param label_index [Integer]
-    # @return [Double, Array <Double>]
-    def coefficient(feature_index = nil, label_index = 0)
-      return get_decfun_coef(@model, feature_index, label_index) if feature_index
-      coefficients = []
-      feature_size.times.map {|feature_index| get_decfun_coef(@model, feature_index + 1, label_index)}
+    # @return [Array <Integer>]
+    def labels
+      Liblinear::Array::Integer.decode(@model.label, class_size)
    end
 
-    # @param label_index [Integer]
-    # @return [Double]
-    def bias(label_index = 0)
-      get_decfun_bias(@model, label_index)
+    # @return [Boolean]
+    def probability_model?
+      Liblinearswig.check_probability_model(@model) == 1 ? true : false
    end
 
     # @return [Boolean]
     def regression_model?
-      check_regression_model(@model) == 1 ? true : false
-    end
-
-    private
-    # @param example [Array, Hash]
-    # @return [Hash]
-    def predict_prob_val(example, liblinear_func)
-      feature_nodes = convert_to_feature_node_array(example, @model.nr_feature, @model.bias)
-      c_double_array = new_double(class_size)
-      Liblinearswig.send(liblinear_func, @model, feature_nodes, c_double_array)
-      values = double_array_c_to_ruby(c_double_array, class_size)
-      delete_double(c_double_array)
-      feature_node_array_destroy(feature_nodes)
-      value_list = {}
-      labels.size.times do |i|
-        value_list[labels[i]] = values[i]
-      end
-      value_list
+      Liblinearswig.check_regression_model(@model) == 1 ? true : false
    end
  end
end
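Putting the reworked `Liblinear::Model` surface together: a model trained through the top-level helper documented in the README can be inspected and persisted with the methods above. A sketch (return values are indicative; the file name is illustrative):

```ruby
require 'liblinear'

model = Liblinear.train(
  { solver_type: Liblinear::L2R_LR },
  [-1, -1, 1, 1],
  [[-2, -2], [-1, -1], [1, 1], [2, 2]],
)

model.class_size        # => 2
model.labels            # => e.g. [-1, 1]
model.feature_size      # => 2
model.feature_weights   # => Array <Float>, one weight per feature
model.bias              # => -1.0 (no bias term was supplied)
model.regression_model? # => false

model.save('example.model')
reloaded = Liblinear::Model.load('example.model')
```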