ffi-fasttext 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: c217429c82622b92680b63e21f0c86f42cb920a5
4
+ data.tar.gz: 6918bc5355f181f3755b0932696b4fd2e73949e4
5
+ SHA512:
6
+ metadata.gz: 110c9c5767272ffaf845c8798e093f3d9e7693373aca44dfae50e9f8c4cca7a1537f50274ca4b869034f6b549c1d5abf12354e261141ade94b9a1d67b88c70b7
7
+ data.tar.gz: 4e06ffaa51d37cab3f6b99381c5413d4f4876c04b45130bdada8bb08bd665064a28dac0b1ccdd6e48ffa953a203585822ecdc723f925b5c9b247a5a9fc9347f7
@@ -0,0 +1,44 @@
1
+ .ruby-*
2
+ *.lock
3
+ /.bundle/
4
+ /.yardoc
5
+ /_yardoc/
6
+ /coverage/
7
+ /doc/
8
+ /pkg/
9
+ /spec/reports/
10
+ /tmp/
11
+ /vendor/fasttext/fasttext
12
+
13
+ # Prerequisites
14
+ *.d
15
+
16
+ # Compiled Object files
17
+ *.slo
18
+ *.lo
19
+ *.o
20
+ *.obj
21
+
22
+ # Precompiled Headers
23
+ *.gch
24
+ *.pch
25
+
26
+ # Compiled Dynamic libraries
27
+ *.so
28
+ *.dylib
29
+ *.dll
30
+
31
+ # Fortran module files
32
+ *.mod
33
+ *.smod
34
+
35
+ # Compiled Static libraries
36
+ *.lai
37
+ *.la
38
+ *.a
39
+ *.lib
40
+
41
+ # Executables
42
+ *.exe
43
+ *.out
44
+ *.app
@@ -0,0 +1,5 @@
1
+ sudo: false
2
+ language: ruby
3
+ rvm:
4
+ - 2.3.3
5
+ before_install: gem install bundler -v 1.16.0
data/Gemfile ADDED
@@ -0,0 +1,6 @@
1
+ source "https://rubygems.org"
2
+
3
+ git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }
4
+
5
+ # Specify your gem's dependencies in ffi-fasttext.gemspec
6
+ gemspec
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2018 Brandon Dewitt
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,59 @@
1
+ # FFI::Fasttext
2
+
3
+ FFI bindings for [Fasttext](https://fasttext.cc/)
4
+
5
+ If you aren't sure what Fasttext is, read the information at the site above or see the fastText C++ source published by Facebook on GitHub: [Github:Fasttext](https://github.com/facebookresearch/fastText/)
6
+
7
+ The FFI bindings only cover using Fasttext predictions in a Ruby process after a model has been trained in the console. The model.bin/model.vec files are required to get prediction probabilities through the `#predict` method of a `::FFI::Fasttext::Predictor` object.
8
+
9
+ ## Installation
10
+
11
+ Add this line to your application's Gemfile:
12
+
13
+ ```ruby
14
+ gem 'ffi-fasttext'
15
+ ```
16
+
17
+ And then execute:
18
+
19
+ $ bundle
20
+
21
+ Or install it yourself as:
22
+
23
+ $ gem install ffi-fasttext
24
+
25
+ ## Usage
26
+
27
+ Requires `g++` to be installed: the gem uses the C++ compiler to build the `fasttext` executable and the shared object that FFI loads, unless compilation is skipped with the `FASTTEXT_USE_SYSTEM_LIB` or `SKIP_FASTTEXT_COMPILE` environment variables and the shared object is provided separately.
28
+
29
+ The Fasttext C++ model loading code will error out (and cause a hard `exit`) if the wrong model file is used or the model file does not exist at the given path, so make sure you pass the \*.bin file when initializing the predictor.
30
+
31
+ There is a test training set and model in the spec directory. Do not rely on it for meaningful predictions: it is abbreviated and trained entirely on "derp" derivations.
32
+
33
+ ```ruby
34
+ require "ffi/fasttext"
35
+
36
+ ft = ::FFI::Fasttext::Predictor.new("spec/model.bin")
37
+
38
+ ft.predict("derp") # => [["__label__3", 0.4375]] (the highest-probability label and its probability, in an array)
39
+ ft.predict("derp", 3) # => [["__label__3", 0.4375], ["__label__1", 0.396484], ["__label__2", 0.164063]] (the top 3 labels and probabilities, as an array of arrays)
40
+ ft.predict("derp", 10) # => [["__label__3", 0.4375], ["__label__1", 0.396484], ["__label__2", 0.164063]] (same output as above: fewer than 10 results come back when there are only 3 categories or when a probability is < 0)
41
+
42
+ ft.destroy! # The prediction model is dynamically allocated in C code and must be released
43
+ ```
44
+
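Because the model is allocated in C and must be released with `destroy!`, it can help to wrap usage so the release happens even when a prediction raises. The following is a minimal sketch and not part of the gem: `with_predictor` is a hypothetical helper, and `FakePredictor` is a stand-in so the example runs without a compiled library. In real use you would pass `::FFI::Fasttext::Predictor.new("spec/model.bin")` instead.

```ruby
# Hypothetical helper (not provided by ffi-fasttext): yields the predictor
# and always calls destroy! afterwards, even if the block raises.
def with_predictor(predictor)
  yield predictor
ensure
  predictor.destroy!
end

# Stand-in for ::FFI::Fasttext::Predictor so this sketch runs on its own.
class FakePredictor
  attr_reader :destroyed

  def initialize
    @destroyed = false
  end

  # Mirrors the shape of Predictor#predict: an array of [label, probability].
  def predict(_string, _number_of_predictions = 1)
    [["__label__3", 0.4375]]
  end

  def destroy!
    @destroyed = true
  end
end

fake = FakePredictor.new
result = with_predictor(fake) { |ft| ft.predict("derp") }
```

The helper returns whatever the block returns, so `result` holds the prediction array while the predictor has already been released.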
45
+ ## Development
46
+
47
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
48
+
49
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
50
+
51
+ ## Contributing
52
+
53
+ Bug reports and pull requests are welcome on GitHub at https://github.com/abrandoned/ffi-fasttext
54
+
55
+ ## License
56
+
57
+ The gem is available as open source under the terms of the [MIT License](https://github.com/abrandoned/ffi-fasttext/blob/master/LICENSE.txt).
58
+ The original fastText source is licensed by Facebook under the terms of the [BSD License](https://github.com/abrandoned/ffi-fasttext/blob/master/vendor/fasttext/LICENSE),
59
+ along with a patent license from Facebook: [Patents](https://github.com/abrandoned/ffi-fasttext/blob/master/vendor/fasttext/PATENTS).
@@ -0,0 +1,19 @@
1
+ require "bundler/gem_tasks"
2
+ require "rake/testtask"
3
+ import "ext/ffi/fasttext/Rakefile"
4
+
5
+ namespace :fasttext do
6
+ desc "build fasttext"
7
+ task :compile do
8
+ ::Rake::Task[:compile_fasttext].invoke
9
+ end
10
+ end
11
+
12
+ Rake::TestTask.new(:spec) do |t|
13
+ t.libs << "spec"
14
+ t.libs << "lib"
15
+ t.test_files = FileList["spec/**/*_spec.rb"]
16
+ end
17
+ Rake::Task[:spec].prerequisites << "fasttext:compile"
18
+
19
+ task :default => :spec
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "ffi/fasttext"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start(__FILE__)
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,71 @@
1
+ require "rubygems"
2
+ require "fileutils"
3
+ require "ffi"
4
+
5
+ # Copied from mkmf
6
+ def find_executable(bin, path = nil)
7
+ executable_file = proc do |name|
8
+ begin
9
+ stat = File.stat(name)
10
+ rescue SystemCallError
11
+ else
12
+ next name if stat.file? and stat.executable?
13
+ end
14
+ end
15
+
16
+ if File.expand_path(bin) == bin
17
+ return bin if executable_file.call(bin)
18
+ return nil
19
+ end
20
+ path = %w[/usr/local/bin /usr/ucb /usr/bin /bin]
21
+ if additional_path ||= ENV['PATH']
22
+ path += additional_path.split(File::PATH_SEPARATOR)
23
+ end
24
+ file = nil
25
+ path.each do |dir|
26
+ return file if executable_file.call(file = File.join(dir, bin))
27
+ end
28
+ nil
29
+ end
30
+
31
+ def sys(cmd)
32
+ puts " -- #{cmd}"
33
+ unless ret = system(cmd)
34
+ raise "ERROR: '#{cmd}' failed"
35
+ end
36
+ ret
37
+ end
38
+
39
+ desc "Build the fasttext shared lib"
40
+ task :compile_fasttext do
41
+ # Do not attempt to install if we want to use the system fasttext lib
42
+ next if ENV.key?("FASTTEXT_USE_SYSTEM_LIB") || ENV.key?("SKIP_FASTTEXT_COMPILE")
43
+
44
+ if !find_executable("g++")
45
+ abort "ERROR: g++ is required to build ffi-fasttext"
46
+ end
47
+
48
+ CWD = ::File.expand_path(::File.dirname(__FILE__))
49
+ FASTTEXT_DIR = ::File.join(CWD, "..", "..", "..", "vendor", "fasttext")
50
+
51
+ ::Dir.chdir(FASTTEXT_DIR) do
52
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c args.cc") unless ::File.exist?("args.o")
53
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c dictionary.cc") unless ::File.exist?("dictionary.o")
54
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c productquantizer.cc") unless ::File.exist?("productquantizer.o")
55
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c matrix.cc") unless ::File.exist?("matrix.o")
56
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c qmatrix.cc") unless ::File.exist?("qmatrix.o")
57
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c vector.cc") unless ::File.exist?("vector.o")
58
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c model.cc") unless ::File.exist?("model.o")
59
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c utils.cc") unless ::File.exist?("utils.o")
60
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c fasttext.cc") unless ::File.exist?("fasttext.o")
61
+ sys("g++ -c -O3 -fPIC -pthread -std=c++0x -c ffi_fasttext.cc")
62
+ sys("g++ -fPIC -pthread -std=c++0x args.o dictionary.o productquantizer.o matrix.o qmatrix.o vector.o model.o utils.o fasttext.o main.cc -o fasttext") unless ::File.exist?("fasttext")
63
+ sys("g++ -fPIC -pthread -std=c++0x args.o dictionary.o productquantizer.o matrix.o qmatrix.o vector.o model.o utils.o fasttext.o ffi_fasttext.o -shared -o libfasttext.#{::FFI::Platform::LIBSUFFIX}")
64
+ end
65
+
66
+ unless ::File.exist?(::File.join(FASTTEXT_DIR, "libfasttext.#{::FFI::Platform::LIBSUFFIX}"))
67
+ abort "ERROR: Failed to build fasttext"
68
+ end
69
+ end
70
+
71
+ task :default => :compile_fasttext
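The `compile_fasttext` task above issues the same `g++` invocation once per source file. As an illustration only, the command strings could be generated from a list. This sketch is not the gem's build code: `compile_command` and `link_command` are hypothetical helpers, `lib_suffix` stands in for `::FFI::Platform::LIBSUFFIX`, the duplicated `-c` flag is written once, and the "skip if the object file already exists" checks and the `fasttext` executable build are left out.

```ruby
# Sources compiled into libfasttext, in the order used by the task above.
SOURCES = %w[
  args dictionary productquantizer matrix qmatrix
  vector model utils fasttext ffi_fasttext
].freeze

COMPILE_FLAGS = "-O3 -fPIC -pthread -std=c++0x".freeze

# Hypothetical helper: the g++ command that compiles one source file.
def compile_command(name)
  "g++ -c #{COMPILE_FLAGS} #{name}.cc"
end

# Hypothetical helper: the g++ command that links the shared library.
# lib_suffix stands in for ::FFI::Platform::LIBSUFFIX (e.g. "so", "dylib").
def link_command(lib_suffix)
  objects = SOURCES.map { |name| "#{name}.o" }.join(" ")
  "g++ -fPIC -pthread -std=c++0x #{objects} -shared -o libfasttext.#{lib_suffix}"
end

# One compile command per source, followed by the link step.
commands = SOURCES.map { |name| compile_command(name) } + [link_command("so")]
```

Each string in `commands` could then be passed to the `sys` helper defined earlier in the Rakefile.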
@@ -0,0 +1,40 @@
1
+
2
+ lib = File.expand_path("../lib", __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require "ffi/fasttext/version"
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "ffi-fasttext"
8
+ spec.version = ::FFI::Fasttext::VERSION
9
+ spec.authors = ["Brandon Dewitt"]
10
+ spec.email = ["brandonsdewitt+rubygems@gmail.com"]
11
+
12
+ spec.summary = %q{ FFI bindings for Facebook's FastText text classification library }
13
+ spec.description = %q{ FFI bindings for Facebook's FastText text classification library }
14
+ spec.homepage = "https://www.github.com/abrandoned/ffi-fasttext"
15
+ spec.license = "MIT"
16
+
17
+ # Restrict pushes of this gem to RubyGems.org. Change 'allowed_push_host' to allow pushing
18
+ # to a different single host, or delete this section to allow pushing to any host.
19
+ if spec.respond_to?(:metadata)
20
+ spec.metadata["allowed_push_host"] = "https://rubygems.org"
21
+ else
22
+ raise "RubyGems 2.0 or newer is required to protect against " \
23
+ "public gem pushes."
24
+ end
25
+
26
+ spec.files = `git ls-files -z`.split("\x0").reject do |f|
27
+ f.match(%r{^(test|spec|features)/})
28
+ end
29
+ spec.bindir = "exe"
30
+ spec.extensions = "ext/ffi/fasttext/Rakefile"
31
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
32
+ spec.require_paths = ["lib"]
33
+
34
+ spec.add_dependency "ffi"
35
+
36
+ spec.add_development_dependency "pry"
37
+ spec.add_development_dependency "bundler", "~> 1.16"
38
+ spec.add_development_dependency "rake", "~> 10.0"
39
+ spec.add_development_dependency "minitest", "~> 5.0"
40
+ end
@@ -0,0 +1,108 @@
1
+ require "ffi"
+ require "open3" # needed for ::Open3.capture3 in the Homebrew lookup below
2
+ require "ffi/fasttext/version"
3
+
4
+ module FFI
5
+ module Fasttext
6
+ extend FFI::Library
7
+ ffi_lib_flags :now, :global
8
+
9
+ ##
10
+ # ffi-rzmq-core for reference
11
+ #
12
+ # https://github.com/chuckremes/ffi-rzmq-core/blob/master/lib/ffi-rzmq-core/libzmq.rb
13
+ #
14
+ begin
15
+ # bias the library discovery to a path inside the gem first, then
16
+ # to the usual system paths
17
+ gem_base = ::File.join(::File.dirname(__FILE__), '..', '..')
18
+ inside_gem = ::File.join(gem_base, 'ext')
19
+ local_path = ::FFI::Platform::IS_WINDOWS ? ENV['PATH'].split(';') : ENV['PATH'].split(':')
20
+ env_path = [ ENV['FASTTEXT_LIB_PATH'] ].compact
21
+ rbconfig_path = ::RbConfig::CONFIG["libdir"]
22
+ homebrew_path = nil
23
+
24
+ # RUBYOPT set by RVM breaks 'brew' so we need to unset it.
25
+ rubyopt = ENV.delete('RUBYOPT')
26
+
27
+ begin
28
+ stdout, stderr, status = ::Open3.capture3("brew", "--prefix")
29
+ homebrew_path = if status.success?
30
+ "#{stdout.chomp}/lib"
31
+ else
32
+ '/usr/local/homebrew/lib'
33
+ end
34
+ rescue
35
+ # Homebrew doesn't exist
36
+ end
37
+
38
+ # Restore RUBYOPT after executing 'brew' above.
39
+ ENV['RUBYOPT'] = rubyopt
40
+
41
+ # Search for libfasttext in the following order...
42
+ fasttext_lib_paths =
43
+ if ENV.key?("FASTTEXT_USE_SYSTEM_LIB")
44
+ [inside_gem] + env_path + local_path + [rbconfig_path] + [
45
+ '/usr/local/lib', '/opt/local/lib', homebrew_path, '/usr/lib64'
46
+ ]
47
+ else
48
+ [::File.join(gem_base, "vendor/fasttext")]
49
+ end
50
+
51
+ FASTTEXT_LIB_PATHS = fasttext_lib_paths.
52
+ compact.map{|path| "#{path}/libfasttext.#{::FFI::Platform::LIBSUFFIX}"}
53
+
54
+ ffi_lib(FASTTEXT_LIB_PATHS + %w{libfasttext})
55
+ rescue LoadError => error
56
+ if FASTTEXT_LIB_PATHS.any? {|path| ::File.file?(::File.join(path)) }
57
+ warn "Unable to load this gem. The libfasttext library exists, but cannot be loaded."
58
+ warn "Set FASTTEXT_LIB_PATH if custom load path is desired"
59
+ warn "If this is Windows:"
60
+ warn "- Check that you have MSVC runtime installed or statically linked"
61
+ warn "- Check that your DLL is compiled for #{FFI::Platform::ADDRESS_SIZE} bit"
62
+ else
63
+ warn "Unable to load this gem. The libfasttext library (or DLL) could not be found."
64
+ warn "Set FASTTEXT_LIB_PATH if custom load path is desired"
65
+ warn "If this is a Windows platform, make sure libfasttext.dll is on the PATH."
66
+ warn "If the DLL was built with mingw, make sure the other two dependent DLLs,"
67
+ warn "libgcc_s_sjlj-1.dll and libstdc++6.dll, are also on the PATH."
68
+ warn "For non-Windows platforms, make sure libfasttext is located in this search path:"
69
+ warn FASTTEXT_LIB_PATHS.inspect
70
+ end
71
+ raise error
72
+ end
73
+
74
+ attach_function :create, [:string], :pointer
75
+ attach_function :destroy, [:pointer], :void
76
+ attach_function :predict_string_free, [:pointer], :void
77
+ attach_function :predict, [:pointer, :string, :int32_t], :strptr
78
+
79
+ class Predictor
80
+ def initialize(model_filename)
81
+ raise "File does not exist" unless ::File.exist?(model_filename)
82
+ @ptr = ::FFI::Fasttext.create(model_filename)
83
+ end
84
+
85
+ def destroy!
86
+ ::FFI::Fasttext.destroy(@ptr) unless @ptr.nil?
87
+ @ptr = nil
88
+ end
89
+
90
+ def predict(string, number_of_predictions = 1)
91
+ response_string, pointer = ::FFI::Fasttext.predict(@ptr, string, number_of_predictions)
92
+ return [] if response_string.nil? || response_string.empty?
93
+
94
+ response_array = []
95
+ split_responses = response_string.split(" ")
96
+ split_responses.each_slice(2) do |pair|
97
+ next unless pair.first && pair.last
98
+
99
+ response_array << [pair.first, pair.last.to_f]
100
+ end
101
+
102
+ response_array
103
+ ensure
104
+ ::FFI::Fasttext.predict_string_free(pointer) unless pointer.nil?
105
+ end
106
+ end
107
+ end
108
+ end
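`Predictor#predict` above turns the string returned by the C function into label/probability pairs. Judging only from that code, the string is a space-separated sequence of `label probability label probability ...`. The sketch below reproduces that parsing as a standalone function so it can be tried without the native library. `parse_predictions` is a hypothetical name, and unlike the original loop it drops a trailing token that has no partner.

```ruby
# Hypothetical standalone version of the parsing done in Predictor#predict.
# Assumes input of the form "label prob label prob ..." separated by spaces.
def parse_predictions(response_string)
  return [] if response_string.nil? || response_string.empty?

  # Group tokens two at a time and keep only complete label/probability pairs.
  pairs = response_string.split(" ").each_slice(2).select { |pair| pair.size == 2 }

  # Convert the probability token to a Float, leaving the label as a String.
  pairs.map { |label, probability| [label, probability.to_f] }
end

parsed = parse_predictions("__label__3 0.4375 __label__1 0.396484")
```

An empty or `nil` response yields an empty array, which matches the early return in `#predict`.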
@@ -0,0 +1,5 @@
1
+ module FFI
2
+ module Fasttext
3
+ VERSION = "0.1.0"
4
+ end
5
+ end
@@ -0,0 +1,30 @@
1
+ BSD License
2
+
3
+ For fastText software
4
+
5
+ Copyright (c) 2016-present, Facebook, Inc. All rights reserved.
6
+
7
+ Redistribution and use in source and binary forms, with or without modification,
8
+ are permitted provided that the following conditions are met:
9
+
10
+ * Redistributions of source code must retain the above copyright notice, this
11
+ list of conditions and the following disclaimer.
12
+
13
+ * Redistributions in binary form must reproduce the above copyright notice,
14
+ this list of conditions and the following disclaimer in the documentation
15
+ and/or other materials provided with the distribution.
16
+
17
+ * Neither the name Facebook nor the names of its contributors may be used to
18
+ endorse or promote products derived from this software without specific
19
+ prior written permission.
20
+
21
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
22
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
23
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
24
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
25
+ ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
26
+ (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
27
+ LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
28
+ ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
29
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
30
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.