thundersvm 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 9f3ccc0ace653a21c742dd36bdf2cf33d941fd7c157b98ea5e453524677dc3da
4
+ data.tar.gz: 6a35aad608e187dac40f4cb2cc798d8fe6b17bb7617e570861f5a044bfd30f5b
5
+ SHA512:
6
+ metadata.gz: ca1f18fce237bb032f64e33c36636ff379fb6a267dabcb83f30700a9bfc60c3fd8a1511afbf17cfe5c009bb1c584d2fd463dbba6a35bad7f9a4d5bc9f8b6cbeb
7
+ data.tar.gz: 775d599e2e0334fe4ae24aa6ae7789087f0c16cc8e9a786fb5b344b784a62b6f509db882ce70dfbecdf0d76e4ed562a8b140972f5c536349c899660cebd78a1a
data/CHANGELOG.md ADDED
@@ -0,0 +1,3 @@
1
+ ## 0.1.0 (2019-11-24)
2
+
3
+ - First release
data/LICENSE.txt ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2019 Andrew Kane
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,141 @@
1
+ # ThunderSVM
2
+
3
+ [ThunderSVM](https://github.com/Xtra-Computing/thundersvm) - high-performance parallel SVMs - for Ruby
4
+
5
+ :fire: Uses GPUs and multi-core CPUs for blazing performance
6
+
7
+ For a great intro on support vector machines, check out [this video](https://www.youtube.com/watch?v=efR1C6CvhmE).
8
+
9
+ ## Installation
10
+
11
+ First, [install ThunderSVM](https://github.com/Xtra-Computing/thundersvm/blob/master/docs/how-to.md#install-thundersvm). Add this line to your application’s Gemfile:
12
+
13
+ ```ruby
14
+ gem 'thundersvm'
15
+ ```
16
+
17
+ ## Getting Started
18
+
19
+ Prep your data
20
+
21
+ ```ruby
22
+ x = [[1, 2], [3, 4], [5, 6], [7, 8]]
23
+ y = [1, 2, 3, 4]
24
+ ```
25
+
26
+ Train a model
27
+
28
+ ```ruby
29
+ model = ThunderSVM::Regressor.new
30
+ model.fit(x, y)
31
+ ```
32
+
33
+ Use `ThunderSVM::Classifier` for classification and `ThunderSVM::Model` for other models
34
+
35
+ Make predictions
36
+
37
+ ```ruby
38
+ model.predict(x)
39
+ ```
40
+
41
+ Save the model to a file
42
+
43
+ ```ruby
44
+ model.save_model("model.txt")
45
+ ```
46
+
47
+ Load the model from a file
48
+
49
+ ```ruby
50
+ model = ThunderSVM.load_model("model.txt")
51
+ ```
52
+
53
+ Get support vectors
54
+
55
+ ```ruby
56
+ model.support_vectors
57
+ ```
58
+
59
+ ## Cross-Validation
60
+
61
+ Perform cross-validation
62
+
63
+ ```ruby
64
+ model.cv(x, y)
65
+ ```
66
+
67
+ Specify the number of folds
68
+
69
+ ```ruby
70
+ model.cv(x, y, folds: 5)
71
+ ```
72
+
73
+ ## Parameters
74
+
75
+ Defaults shown below
76
+
77
+ ```ruby
78
+ ThunderSVM::Model.new(
79
+ svm_type: :c_svc, # set type of SVM (c_svc, nu_svc, one_class, epsilon_svr, nu_svr)
80
+ kernel: :rbf, # set type of kernel function (linear, polynomial, rbf, sigmoid)
81
+ degree: 3, # set degree in kernel function
82
+ gamma: nil, # set gamma in kernel function
83
+ coef0: 0, # set coef0 in kernel function
84
+ c: 1, # set the parameter C of C-SVC, epsilon-SVR, and nu-SVR
85
+ nu: 0.5, # set the parameter nu of nu-SVC, one-class SVM, and nu-SVR
86
+ epsilon: 0.1, # set the epsilon in loss function of epsilon-SVR
87
+ max_memory: 8192, # constrain the maximum memory size (MB) that thundersvm uses
88
+ tolerance: 0.001, # set tolerance of termination criterion
89
+ probability: false, # whether to train an SVC or SVR model for probability estimates
90
+ gpu: 0, # specify which gpu to use
91
+ cores: nil, # set the number of cpu cores to use (defaults to all)
92
+ verbose: nil # verbose mode (quiet by default; turned on automatically for cross-validation)
93
+ )
94
+ ```
95
+
96
+ ## Data
97
+
98
+ Data can be a Ruby array
99
+
100
+ ```ruby
101
+ [[1, 2], [3, 4], [5, 6], [7, 8]]
102
+ ```
103
+
104
+ Or a Numo array
105
+
106
+ ```ruby
107
+ Numo::DFloat.cast([[1, 2], [3, 4], [5, 6], [7, 8]])
108
+ ```
109
+
110
+ Or the path to a file in `libsvm` format (better for sparse data)
111
+
112
+ ```ruby
113
+ model.fit("train.txt")
114
+ model.predict("test.txt")
115
+ ```
116
+
117
+ ## Resources
118
+
119
+ - [ThunderSVM: A Fast SVM Library on GPUs and CPUs](https://github.com/Xtra-Computing/thundersvm/blob/master/thundersvm-full.pdf)
120
+
121
+ ## History
122
+
123
+ View the [changelog](https://github.com/ankane/thundersvm/blob/master/CHANGELOG.md)
124
+
125
+ ## Contributing
126
+
127
+ Everyone is encouraged to help improve this project. Here are a few ways you can help:
128
+
129
+ - [Report bugs](https://github.com/ankane/thundersvm/issues)
130
+ - Fix bugs and [submit pull requests](https://github.com/ankane/thundersvm/pulls)
131
+ - Write, clarify, or fix documentation
132
+ - Suggest or add new features
133
+
134
+ To get started with development:
135
+
136
+ ```sh
137
+ git clone https://github.com/ankane/thundersvm.git
138
+ cd thundersvm
139
+ bundle install
140
+ bundle exec rake test
141
+ ```
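The Data section of the README mentions the `libsvm` format without showing it. Each line holds a label followed by 1-based `index:value` pairs. The sketch below is plain Ruby that mirrors what `create_dataset` in `lib/thundersvm/model.rb` writes for dense input; `to_libsvm` is a hypothetical helper for illustration and is not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): format dense rows as libsvm lines.
# Labels default to 0 when y is omitted, as create_dataset does.
def to_libsvm(x, y = nil)
  y ||= [0] * x.size
  x.zip(y).map do |xi, yi|
    # Feature indices start at 1; values are written as floats.
    features = xi.each_with_index.map { |v, i| "#{i + 1}:#{v.to_f}" }.join(" ")
    "#{yi.to_i} #{features}"
  end
end

lines = to_libsvm([[1, 2], [3, 4]], [1, 2])
puts lines
# 1 1:1.0 2:2.0
# 2 1:3.0 2:4.0
```

Note that labels are converted with `to_i`, so fractional targets would be truncated in this sketch.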
data/lib/thundersvm.rb ADDED
@@ -0,0 +1,28 @@
1
+ # stdlib
2
+ require "fiddle/import"
3
+ require "fileutils"
4
+ require "tempfile"
5
+
6
+ # modules
7
+ require "thundersvm/model"
8
+ require "thundersvm/classifier"
9
+ require "thundersvm/regressor"
10
+ require "thundersvm/version"
11
+
12
+ module ThunderSVM
13
+ class Error < StandardError; end
14
+
15
+ class << self
16
+ attr_accessor :ffi_lib
17
+ end
18
+ self.ffi_lib = ["libthundersvm.so", "libthundersvm.dylib", "thundersvm.dll"]
19
+
20
+ # friendlier error message
21
+ autoload :FFI, "thundersvm/ffi"
22
+
23
+ def self.load_model(path)
24
+ model = Model.new
25
+ model.load_model(path)
26
+ model
27
+ end
28
+ end
data/lib/thundersvm/classifier.rb ADDED
@@ -0,0 +1,7 @@
1
+ module ThunderSVM
2
+ class Classifier < Model
3
+ def initialize(svm_type: :c_svc, **options)
4
+ super(svm_type: svm_type, **options)
5
+ end
6
+ end
7
+ end
data/lib/thundersvm/ffi.rb ADDED
@@ -0,0 +1,19 @@
1
+ module ThunderSVM
2
+ module FFI
3
+ extend Fiddle::Importer
4
+
5
+ libs = ThunderSVM.ffi_lib.dup
6
+ begin
7
+ dlload libs.shift
8
+ rescue Fiddle::DLError => e
9
+ retry if libs.any?
10
+ raise e if ENV["THUNDERSVM_DEBUG"]
11
+ raise LoadError, "Could not find ThunderSVM"
12
+ end
13
+
14
+ extern "void thundersvm_train(int argc, char **argv)"
15
+ extern "void thundersvm_train_after_parse(char **option, int len, char *file_name)"
16
+ extern "void thundersvm_predict(int argc, char **argv)"
17
+ extern "void thundersvm_predict_after_parse(char *model_file_name, char *output_file_name, char **option, int len)"
18
+ end
19
+ end
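The `begin`/`rescue`/`retry` loop in the FFI module above tries each candidate library name in turn and only gives up once the list is exhausted. A minimal sketch of that control flow follows, with a hypothetical `try_load` standing in for Fiddle's `dlload` so it runs without ThunderSVM installed; the choice of which name "exists" is an assumption for the demo.

```ruby
# Hypothetical stand-in for dlload: pretend only the .dylib is present.
def try_load(name)
  raise IOError, "cannot load #{name}" unless name == "libthundersvm.dylib"
  name
end

libs = ["libthundersvm.so", "libthundersvm.dylib", "thundersvm.dll"]
attempted = []

loaded =
  begin
    candidate = libs.shift      # take the next candidate off the list
    attempted << candidate
    try_load(candidate)
  rescue IOError
    retry if libs.any?          # re-run the begin block with the next name
    raise LoadError, "Could not find ThunderSVM"
  end

puts "loaded #{loaded} after trying #{attempted.size} name(s)"
```

Because `ThunderSVM.ffi_lib` is a plain accessor and the FFI module is autoloaded, the candidate list can presumably be changed before the first call that touches the library.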
data/lib/thundersvm/model.rb ADDED
@@ -0,0 +1,191 @@
1
+ module ThunderSVM
2
+ class Model
3
+ def initialize(svm_type: :c_svc, kernel: :rbf, degree: 3, gamma: nil, coef0: 0,
4
+ c: 1, nu: 0.5, epsilon: 0.1, max_memory: 8192, tolerance: 0.001,
5
+ probability: false, gpu: 0, cores: nil, verbose: nil)
6
+
7
+ @svm_type = svm_type.to_sym
8
+ @kernel = kernel.to_sym
9
+ @degree = degree
10
+ @gamma = gamma
11
+ @coef0 = coef0
12
+ @c = c
13
+ @nu = nu
14
+ @epsilon = epsilon
15
+ @max_memory = max_memory
16
+ @tolerance = tolerance
17
+ @probability = probability
18
+ @gpu = gpu
19
+ @cores = cores
20
+ @verbose = verbose
21
+ end
22
+
23
+ def fit(x, y = nil)
24
+ train(x, y)
25
+ end
26
+
27
+ def cv(x, y = nil, folds: 5)
28
+ train(x, y, folds: folds)
29
+ end
30
+
31
+ def predict(x)
32
+ dataset_file = create_dataset(x)
33
+ out_file = create_tempfile
34
+ argv = ["thundersvm-predict", dataset_file.path, @model_file.path, out_file.path]
35
+ FFI.thundersvm_predict(argv.size, str_ptr(argv))
36
+ func = [:c_svc, :nu_svc].include?(@svm_type) ? :to_i : :to_f
37
+ out_file.each_line.map(&func)
38
+ end
39
+
40
+ def save_model(path)
41
+ raise Error, "Not trained" unless @model_file
42
+ FileUtils.cp(@model_file.path, path)
43
+ nil
44
+ end
45
+
46
+ def load_model(path)
47
+ @model_file ||= create_tempfile
48
+ # TODO ensure tempfile is still cleaned up
49
+ FileUtils.cp(path, @model_file.path)
50
+ @svm_type = read_header["svm_type"].to_sym
51
+ @kernel = read_header["kernel_type"].to_sym
52
+ nil
53
+ end
54
+
55
+ def support_vectors
56
+ vectors = []
57
+ sv = false
58
+ read_txt do |line|
59
+ if sv
60
+ index = line.index("1:")
61
+ vectors << line[index..-1].split(" ").map { |v| v.split(":").last.to_f }
62
+ elsif line.start_with?("SV")
63
+ sv = true
64
+ end
65
+ end
66
+ vectors
67
+ end
68
+
69
+ def dual_coef
70
+ vectors = []
71
+ sv = false
72
+ read_txt do |line|
73
+ if sv
74
+ index = line.index("1:")
75
+ line[0...index].split(" ").map(&:to_f).each_with_index do |v, i|
76
+ (vectors[i] ||= []) << v
77
+ end
78
+ elsif line.start_with?("SV")
79
+ sv = true
80
+ end
81
+ end
82
+ vectors
83
+ end
84
+
85
+ def self.finalize_file(file)
86
+ # must use proc instead of stabby lambda
87
+ proc do
88
+ file.close
89
+ file.unlink
90
+ end
91
+ end
92
+
93
+ private
94
+
95
+ def train(x, y = nil, folds: nil)
96
+ dataset_file = create_dataset(x, y)
97
+ @model_file ||= create_tempfile
98
+
99
+ svm_types = {
100
+ c_svc: 0,
101
+ nu_svc: 1,
102
+ one_class: 2,
103
+ epsilon_svr: 3,
104
+ nu_svr: 4
105
+ }
106
+ s = svm_types[@svm_type]
107
+ raise Error, "Unknown SVM type: #{@svm_type}" unless s
108
+
109
+ kernels = {
110
+ linear: 0,
111
+ polynomial: 1,
112
+ rbf: 2,
113
+ sigmoid: 3
114
+ }
115
+ t = kernels[@kernel]
116
+ raise Error, "Unknown kernel: #{@kernel}" unless t
117
+
118
+ verbose = @verbose
119
+ verbose = true if folds && verbose.nil?
120
+
121
+ argv = ["thundersvm-train"]
122
+ argv += ["-s", s]
123
+ argv += ["-t", t]
124
+ argv += ["-d", @degree.to_i] if @degree
125
+ argv += ["-g", @gamma.to_f] if @gamma
126
+ argv += ["-r", @coef0.to_f] if @coef0
127
+ argv += ["-c", @c.to_f] if @c
128
+ argv += ["-n", @nu.to_f] if @nu
129
+ argv += ["-p", @epsilon.to_f] if @epsilon
130
+ argv += ["-m", @max_memory.to_i] if @max_memory
131
+ argv += ["-e", @tolerance.to_f] if @tolerance
132
+ argv += ["-b", @probability ? 1 : 0] if @probability
133
+ argv += ["-v", folds.to_i] if folds
134
+ argv += ["-u", @gpu.to_i] if @gpu
135
+ argv += ["-o", @cores.to_i] if @cores
136
+ argv << "-q" unless verbose
137
+ argv += [dataset_file.path, @model_file.path]
138
+
139
+ FFI.thundersvm_train(argv.size, str_ptr(argv))
140
+ nil
141
+ end
142
+
143
+ def create_dataset(x, y = nil)
144
+ if x.is_a?(String)
145
+ raise ArgumentError, "Cannot pass y with file" if y
146
+ File.open(x)
147
+ else
148
+ contents = String.new("")
149
+ y ||= [0] * x.size
150
+ x.to_a.zip(y.to_a).each do |xi, yi|
151
+ contents << "#{yi.to_i} #{xi.map.with_index { |v, i| "#{i + 1}:#{v.to_f}" }.join(" ")}\n"
152
+ end
153
+ dataset = create_tempfile
154
+ dataset.write(contents)
155
+ dataset.close
156
+ dataset
157
+ end
158
+ end
159
+
160
+ def str_ptr(arr)
161
+ ptr = Fiddle::Pointer.malloc(Fiddle::SIZEOF_VOIDP * arr.size)
162
+ arr.each_with_index do |v, i|
163
+ ptr[i * Fiddle::SIZEOF_VOIDP, Fiddle::SIZEOF_VOIDP] = Fiddle::Pointer["#{v}\x00"].ref
164
+ end
165
+ ptr
166
+ end
167
+
168
+ def create_tempfile
169
+ file = Tempfile.new("thundersvm")
170
+ ObjectSpace.define_finalizer(self, self.class.finalize_file(file))
171
+ file
172
+ end
173
+
174
+ def read_header
175
+ model = {}
176
+ read_txt do |line|
177
+ break if line.start_with?("SV")
178
+ k, v = line.split(" ", 2)
179
+ model[k] = v.strip
180
+ end
181
+ model
182
+ end
183
+
184
+ def read_txt
185
+ @model_file.rewind
186
+ @model_file.each_line do |line|
187
+ yield line
188
+ end
189
+ end
190
+ end
191
+ end
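`support_vectors` and `dual_coef` above both split each line after the `SV` marker at the first occurrence of `"1:"`: what comes before is read as coefficients and what follows as `index:value` features. A small sketch of that split on a made-up line; the sample values are an assumption, not taken from a real model file.

```ruby
# Hypothetical model-file line: coefficient(s) first, then index:value pairs.
line = "0.5 1:1.0 2:2.0\n"

index = line.index("1:")                          # start of the feature section
coefs = line[0...index].split(" ").map(&:to_f)    # text before it: coefficients
vector = line[index..-1].split(" ").map { |v| v.split(":").last.to_f }

p coefs   # [0.5]
p vector  # [1.0, 2.0]
```

This approach relies on feature 1 being present on every line. For a sparse line without it, `"1:"` could match inside a later index such as `11:`, so the sketch should be read as covering dense vectors only.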
data/lib/thundersvm/regressor.rb ADDED
@@ -0,0 +1,7 @@
1
+ module ThunderSVM
2
+ class Regressor < Model
3
+ def initialize(svm_type: :epsilon_svr, **options)
4
+ super(svm_type: svm_type, **options)
5
+ end
6
+ end
7
+ end
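`Classifier` and `Regressor` differ from `Model` only in the default `svm_type`; every other keyword is forwarded unchanged through `**options`. The pattern can be sketched with stand-in classes (`BaseModel` and `SketchRegressor` are hypothetical names, used so the example runs without the gem):

```ruby
# Stand-in for ThunderSVM::Model with just two of its keywords.
class BaseModel
  attr_reader :svm_type, :kernel

  def initialize(svm_type: :c_svc, kernel: :rbf)
    @svm_type = svm_type.to_sym
    @kernel = kernel.to_sym
  end
end

# Overrides only the default svm_type; other keywords pass straight through.
class SketchRegressor < BaseModel
  def initialize(svm_type: :epsilon_svr, **options)
    super(svm_type: svm_type, **options)
  end
end

default_model = SketchRegressor.new
custom_model = SketchRegressor.new(svm_type: "nu_svr", kernel: :linear)

p default_model.svm_type  # :epsilon_svr
p custom_model.svm_type   # :nu_svr
p custom_model.kernel     # :linear
```

An unknown keyword raises `ArgumentError` from the base initializer, which keeps the subclasses from silently accepting misspelled options.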
data/lib/thundersvm/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module ThunderSVM
2
+ VERSION = "0.1.0"
3
+ end
metadata ADDED
@@ -0,0 +1,107 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: thundersvm
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Andrew Kane
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2019-11-25 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ">="
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: minitest
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '5'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '5'
55
+ - !ruby/object:Gem::Dependency
56
+ name: numo-narray
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ description:
70
+ email: andrew@chartkick.com
71
+ executables: []
72
+ extensions: []
73
+ extra_rdoc_files: []
74
+ files:
75
+ - CHANGELOG.md
76
+ - LICENSE.txt
77
+ - README.md
78
+ - lib/thundersvm.rb
79
+ - lib/thundersvm/classifier.rb
80
+ - lib/thundersvm/ffi.rb
81
+ - lib/thundersvm/model.rb
82
+ - lib/thundersvm/regressor.rb
83
+ - lib/thundersvm/version.rb
84
+ homepage: https://github.com/ankane/thundersvm
85
+ licenses:
86
+ - MIT
87
+ metadata: {}
88
+ post_install_message:
89
+ rdoc_options: []
90
+ require_paths:
91
+ - lib
92
+ required_ruby_version: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - ">="
95
+ - !ruby/object:Gem::Version
96
+ version: '2.4'
97
+ required_rubygems_version: !ruby/object:Gem::Requirement
98
+ requirements:
99
+ - - ">="
100
+ - !ruby/object:Gem::Version
101
+ version: '0'
102
+ requirements: []
103
+ rubygems_version: 3.0.6
104
+ signing_key:
105
+ specification_version: 4
106
+ summary: ThunderSVM - high-performance parallel SVMs - for Ruby
107
+ test_files: []