natto 0.9.6 → 0.9.7

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 14ae50169a93b3810e5ae2258187d71f80d8be1a
- data.tar.gz: 8b89a9be35a76123c1955d85913c7665636235b1
+ metadata.gz: fad99a300fd0a04d95e5ffacb7352b3855506e85
+ data.tar.gz: 1e9ba71a7690d14099f45d0350fba7d388b7e4e9
  SHA512:
- metadata.gz: 996501067f551d7e7497155f5a128b256adc858c4ebcb127218eca393398bd70cd26911de4516e18fcf7450c48c00da598d5a54e6fe5e91025907121c2f6fc8c
- data.tar.gz: 308595234aa422803e3a0ac09573601a3e8360837918964e46c1151cb65f03121acc85c789d0c18be86428d7947c4025055925f6dc03b709b6fdc734b55e5c74
+ metadata.gz: 185db00a5a3fba01b27ad27ea0e89e03e698b8d5ccfbef400539c0e48648ab77abe5b90e5cad9c7777dc5d6a79297b4dea99e3ecdae4111bb25f8d78614b164c
+ data.tar.gz: fec5fd24301277deff762c68762b89fec9f33736a6cf918b7d4ec61a9019ff03433d6f96528d8220ca7f35a352950afd93f4b5500cf5fe9baf5b0ccbedeb5efe
data/CHANGELOG CHANGED
@@ -1,5 +1,23 @@
  ## CHANGELOG

+ - __2014/12/20__: 0.9.7 release.
+ - Issue 14: [adding automatic discovery for mecab library; no need to
+ explicitly set
+ MECAB_PATH!](https://bitbucket.org/buruzaemon/natto/issue/14/automatic-discovery-of-libmecab-path-and)
+ - Issue 15: [refactored node-parsing to use Enumerator instead of
+ materializing every node and stuffing into
+ array](https://bitbucket.org/buruzaemon/natto/issue/15/use-enumerator-when-parsing-mecab-nodes)
+ - Issue 17: [adding filepath to MeCab and
+ DictionaryInfo](https://bitbucket.org/buruzaemon/natto/issue/17/use-filerealpath-value-for-all-file-paths)
+ - Issue 18: [bug-fix for node-formatting during default node
+ parse](https://bitbucket.org/buruzaemon/natto/issue/18/no-node-formatting-when-using-default-node)
+ - Deprecating parse_as_nodes and parse_as_strings; please use parse instead!
+ - CAUTION: parse_as_nodes, parse_as_strings, readnodes and readlines will be removed in the following release!
+ - Enhancements to to_s methods for both MeCab and DictionaryInfo
+ - Enhancements to TestDictionaryInfo to allow for building user dic during setup on Windows as well
+ - Slight enhancement to benchmark task.
+ - Updating LICENSE (adding copyright year 2015), adding to all files
+
  - __2013/07/07__: 0.9.6 release.
  - Upgrade to mecab 0.996
  - Adding support for partial parsing mode (-p / --partial)
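
The Issue 15 entry above describes moving from building an array of every node to handing back an Enumerator. The sketch below illustrates that general idea in plain Ruby; it is not natto's implementation. `FakeNode`, `build_chain`, `nodes_eager` and `nodes_lazy` are hypothetical names, and the linked list only stands in for a chain of MeCab nodes.

```ruby
# Hypothetical stand-in for a MeCab node: a surface string plus a
# link to the next node in the chain.
FakeNode = Struct.new(:surface, :next_node)

# Build a small linked list of nodes from an array of surfaces.
def build_chain(surfaces)
  surfaces.reverse.inject(nil) { |tail, s| FakeNode.new(s, tail) }
end

# Eager approach: walk the whole chain and collect every node into an Array.
def nodes_eager(head)
  nodes = []
  node = head
  while node
    nodes << node
    node = node.next_node
  end
  nodes
end

# Lazy approach: return an Enumerator that yields nodes one at a time,
# so a caller that stops early never touches the rest of the chain.
def nodes_lazy(head)
  Enumerator.new do |yielder|
    node = head
    while node
      yielder << node
      node = node.next_node
    end
  end
end

head = build_chain(%w[この 星 の 一等 賞])
enum = nodes_lazy(head)
first  = enum.next.surface   # consumes the first node
second = enum.peek.surface   # looks at the next node without consuming it
```

With the Enumerator, `next`, `peek` and `rewind` work as they do in the README's `enum_parse` example, and `each`/`map` are still available when the full sequence is wanted.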
data/LICENSE CHANGED
@@ -1,8 +1,8 @@
- Copyright © 2011, Brooke M. Fujita.
+ Copyright (c) 2014-2015, Brooke M. Fujita.
  All rights reserved.

- Redistribution and use in source and binary forms, with or without modification, are
- permitted provided that the following conditions are met:
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions are met:

  * Redistributions of source code must retain the above
  copyright notice, this list of conditions and the
@@ -13,11 +13,13 @@ permitted provided that the following conditions are met:
  following disclaimer in the documentation and/or other
  materials provided with the distribution.

- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
- WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
- PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
- ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
- INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
- TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
- ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.md CHANGED
@@ -1,108 +1,233 @@
- # natto
- A Tasty Ruby Binding with MeCab
-
- ## What is natto?
- natto combines the [Ruby programming language](http://www.ruby-lang.org/) with [MeCab](http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html), the part-of-speech and morphological analyzer for the Japanese language.
-
- natto is a gem bridging Ruby and MeCab using FFI (foreign function interface). No compilation is necessary, as natto is _not_ a C extension. natto will run on CRuby (mri/yarv) and JRuby (jvm) equally well. natto will also run on Windows, Unix/Linux, and Mac.
-
- You can learn more about [natto at bitbucket](https://bitbucket.org/buruzaemon/natto/).
-
- ## Requirements
- natto requires the following:
-
- - [MeCab _0.996_](http://code.google.com/p/mecab/downloads/list)
- - [ffi _1.9.0 or greater_](http://rubygems.org/gems/ffi)
- - Ruby _1.9 or greater_
-
- ## Installation on *NIX/Mac
- Install natto with the following gem command:
-
- gem install natto
-
- This will automatically install the [ffi](http://rubygems.org/gems/ffi) rubygem, which natto uses to bind to the `mecab` library.
-
- ## Installation on Windows
- However, if you are using a CRuby on Windows, then you will first need to install the [RubyInstaller Development Kit (DevKit)](https://github.com/oneclick/rubyinstaller/wiki/Development-Kit), a MSYS/MinGW based toolkit than enables your Windows Ruby installation to build many of the native C/C++ extensions available, including `ffi`.
-
- 1. Download the latest release for RubyInstaller for Windows platforms and the corresponding DevKit from the [RubyInstaller for Windows downloads page](http://rubyinstaller.org/downloads/).
- 2. After installing RubyInstaller for Windows, double-click on the DevKit-tdm installer `.exe`, and expand the contents to an appropriate location, for example `C:\devkit`.
- 3. Open a command window under `C:\devkit`, and execute: `ruby dk.rb init`. This will locate all known ruby installations, and add them to `C:\devkit\config.yml`.
- 4. Next, execute: `ruby dk.rb install`, which will add the DevKit to all of the installed rubies listed in your `C:\devkit\config.yml`. Now you should be able to install and build the `ffi` rubygem correctly on your Windows-installed ruby.
- 5. Install `natto` with:
-
- gem install natto
-
- ## Configuration
- - natto will try to locate the `mecab` library based upon its runtime environment.
- - In case of `LoadError`, please set the `MECAB_PATH` environment variable to the exact name/path to your `mecab` library.
-
- e.g., for bash on UNIX/Linux
-
- export MECAB_PATH=/usr/local/lib/libmecab.so
-
- e.g., on Windows
-
- set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
-
- e.g., from within a Ruby program
-
- ENV['MECAB_PATH']='/usr/local/lib/libmecab.so'
-
- ## Usage
- require 'natto'
-
- nm = Natto::MeCab.new
- => #<Natto::MeCab:0x28d30748
- @tagger=#<FFI::Pointer address=0x28a97d50>, \
- @options={}, \
- @dicts=[#<Natto::DictionaryInfo:0x28d3061c \
- type="0", \
- filename="/usr/local/lib/mecab/dic/ipadic/sys.dic", \
- charset="utf8">], \
- @version="0.996">
-
- puts nm.version
- => "0.996"
-
- sysdic = nm.dicts.first
-
- puts sysdic.filename
- => "/usr/local/lib/mecab/dic/ipadic/sys.dic"
-
- puts sysdic.charset
- => "utf8"
-
- nm.parse('ピンチの時には必ずヒーローが現れる。') do |n|
- puts "#{n.surface}\t#{n.feature}"
- end
- ピンチ 名詞,一般,*,*,*,*,ピンチ,ピンチ,ピンチ
- の 助詞,連体化,*,*,*,*,の,ノ,ノ
- 時 名詞,非自立,副詞可能,*,*,*,時,トキ,トキ
- に 助詞,格助詞,一般,*,*,*,に,一般ニ,ニ
- は 助詞,係助詞,*,*,*,*,は,ハ,ワ
- 必ず 副詞,助詞類接続,*,*,*,*,必ず,カナラズ,カナラズ
- ヒーロー 名詞,一般,*,*,*,*,ヒーロー,ヒーローー,ヒーロー
- が 助詞,格助詞,一般,*,*,*,が,ガ,ガ
- 現れる 動詞,自立,*,*,一段,基本形,現れる,アラワレル,アラワレル
- 。 記号,句点,*,*,*,*,。,。,。句点
- BOS/EOS,*,*,*,*,*,*,*,*
-
-
- ## Learn more
- - You can read more about natto on the [project Wiki](https://bitbucket.org/buruzaemon/natto/wiki/Home).
-
- ## Contributing to natto
- - Use [mercurial](http://mercurial.selenic.com/) and [check out the latest code at bitbucket](https://bitbucket.org/buruzaemon/natto/src/) to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
- - [Browse the issue tracker](https://bitbucket.org/buruzaemon/natto/issues/) to make sure someone already hasn't requested it and/or contributed it.
- - Fork the project.
- - Start a feature/bugfix branch.
- - Commit and push until you are happy with your contribution.
- - Make sure to add tests for it. This is important so I don't break it in a future version unintentionally. I use [MiniTest::Unit](http://rubydoc.info/gems/minitest/MiniTest/Unit) as it is very natural and easy-to-use.
- - Please try not to mess with the Rakefile, CHANGELOG, or version. If you must have your own version, that is fine, but please isolate to its own commit so I can cherry-pick around it.
-
- ## Changelog
- Please see the {file:CHANGELOG} for this gem's release history.
-
- ## Copyright
- Copyright &copy; 2011, Brooke M. Fujita. All rights reserved. Please see the {file:LICENSE} file for further details.
+ # natto
+ A Tasty Ruby Binding with MeCab
+
+ ## What is natto?
+ A gem leveraging FFI (foreign function interface), natto combines the
+ [Ruby programming language](http://www.ruby-lang.org/) with
+ [MeCab](http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html), the part-of-speech
+ and morphological analyzer for the Japanese language.
+
+ - No compiler is necessary, as natto is _not_ a C extension.
+ - It will run on CRuby (mri/yarv) and JRuby (jvm) equally well.
+ - It will work with MeCab installations on Windows, Unix/Linux or Mac OS.
+ - natto provides a naturally Ruby-esque interface to MeCab.
+
+ You can learn more about [natto at bitbucket](https://bitbucket.org/buruzaemon/natto/).
+
+
+ ## Requirements
+ natto requires the following:
+
+ - [MeCab _0.996_](http://code.google.com/p/mecab/downloads/list)
+ - A system dictionary, like [mecab-ipadic](https://mecab.googlecode.com/files/mecab-ipadic-2.7.0-20070801.tar.gz) or [mecab-jumandic](https://mecab.googlecode.com/files/mecab-jumandic-5.1-20070304.tar.gz)
+ - `libmecab-devel` if you are on Linux, since natto uses `mecab-config`
+ - Ruby _1.9 or greater_
+ - [ffi _1.9.0 or greater_](http://rubygems.org/gems/ffi)
+
+ ## Installation on *nix and Mac OS
+ Install natto with the following gem command:
+
+ gem install natto
+
+ This will automatically install the [ffi](http://rubygems.org/gems/ffi) rubygem, which natto uses to bind to the `mecab` library.
+
+ ## Installation on Windows
+ However, if you are using a CRuby on Windows, then you will first need to install the [RubyInstaller Development Kit (DevKit)](https://github.com/oneclick/rubyinstaller/wiki/Development-Kit), a MSYS/MinGW based toolkit that enables your Windows Ruby installation to build many of the native C/C++ extensions available, including ffi.
+
+ 1. Download the latest release for RubyInstaller for Windows platforms and the corresponding DevKit from the [RubyInstaller for Windows downloads page](http://rubyinstaller.org/downloads/).
+ 2. After installing RubyInstaller for Windows, double-click on the DevKit-tdm installer `.exe`, and expand the contents to an appropriate location, for example `C:\devkit`.
+ 3. Open a command window under `C:\devkit`, and execute: `ruby dk.rb init`. This will locate all known ruby installations, and add them to `C:\devkit\config.yml`.
+ 4. Next, execute: `ruby dk.rb install`, which will add the DevKit to all of the installed rubies listed in your `C:\devkit\config.yml`. Now you should be able to install and build the ffi rubygem correctly on your Windows-installed ruby.
+ 5. Install natto with:
+
+ gem install natto
+
+ 6. If you are on a 64-bit Windows and you use a 64-bit Ruby or JRuby, then you might want to [build a 64-bit version of libmecab.dll](https://bitbucket.org/buruzaemon/natto/wiki/64-Bit-Windows).
+
+
+ ## Configuration
+ - ***No explicit configuration should be necessary, as natto will try to locate the `mecab` library based upon its runtime environment.***
+ - On Windows, it will query the Windows Registry to determine where `libmecab.dll` is installed
+ - On Mac OS and \*nix, it will query `mecab-config --libs`
+ - ***But if natto cannot find the `mecab` library, `LoadError` will be raised.***
+ - Please set the `MECAB_PATH` environment variable to the exact name/path to your `mecab` library.
+ - e.g., for Mac OS
+
+ export MECAB_PATH=/usr/local/Cellar/mecab/0.996/lib/libmecab.dylib
+
+ - e.g., for bash on UNIX/Linux
+
+ export MECAB_PATH=/usr/local/lib/libmecab.so
+
+ - e.g., on Windows
+
+ set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
+
+ - e.g., from within a Ruby program
+
+ ENV['MECAB_PATH']='/usr/local/lib/libmecab.so'
+
+ ## Usage
+
+
+ # Quick Start
+ # -----------
+ #
+ # No explicit configuration should be necessary!
+ #
+ require 'natto'
+
+ # first, create an instance of Natto::MeCab
+ #
+ nm = Natto::MeCab.new
+ => #<Natto::MeCab:0x28d30748
+ @tagger=#<FFI::Pointer address=0x28a97d50>, \
+ @libpath="/usr/local/lib/libmecab.so", \
+ @options={}, \
+ @dicts=[#<Natto::DictionaryInfo:0x28d3061c \
+ @filepath="/usr/local/lib/mecab/dic/ipadic/sys.dic", \
+ charset=utf8, \
+ type=0>] \
+ @version=0.996>
+
+ # display MeCab version
+ #
+ puts nm.version
+ => 0.996
+
+ # display full pathname to MeCab library
+ #
+ puts nm.libpath
+ => /usr/local/lib/libmecab.so
+
+ # reference to MeCab system dictionary
+ #
+ sysdic = nm.dicts.first
+
+ # display full pathname to system dictionary file
+ #
+ puts sysdic.filepath
+ => /usr/local/lib/mecab/dic/ipadic/sys.dic
+
+ # what charset (encoding) is the system dictionary?
+ #
+ puts sysdic.charset
+ => utf8
+
+ # parse text and send output to stdout
+ #
+ puts nm.parse('俺の名前は星野豊だ!!そこんとこヨロシク!')
+ 俺 名詞,代名詞,一般,*,*,*,俺,オレ,オレ
+ の 助詞,連体化,*,*,*,*,の,ノ,ノ
+ 名前 名詞,一般,*,*,*,*,名前,ナマエ,ナマエ
+ は 助詞,係助詞,*,*,*,*,は,ハ,ワ
+ 星野 名詞,固有名詞,人名,姓,*,*,星野,ホシノ,ホシノ
+ 豊 名詞,固有名詞,人名,名,*,*,豊,ユタカ,ユタカ
+ だ 助動詞,*,*,*,特殊・ダ,基本形,だ,ダ,ダ
+ ! 記号,一般,*,*,*,*,!,!,!
+ ! 記号,一般,*,*,*,*,!,!,!
+ そこ 名詞,代名詞,一般,*,*,*,そこ,ソコ,ソコ
+ ん 助詞,特殊,*,*,*,*,ん,ン,ン
+ とこ 名詞,一般,*,*,*,*,とこ,トコ,トコ
+ ヨロシク 感動詞,*,*,*,*,*,ヨロシク,ヨロシク,ヨロシク
+ ! 記号,一般,*,*,*,*,!,!,!
+ EOS
+
+ # parse more text and use a block to:
+ # - iterate the resulting MeCab nodes
+ # - output morpheme surface and part-of-speech ID
+ #
+ # * ignore any end-of-sentence nodes
+ #
+ nm.parse('世界チャンプ目指してんだなこれがっ!!夢なの、俺のっ!!') do |n|
+ puts "#{n.surface}\tpart-of-speech id: #{n.posid}" if !n.is_eos?
+ end
+ 世界 part-of-speech id: 38
+ チャンプ part-of-speech id: 38
+ 目指し part-of-speech id: 31
+ て part-of-speech id: 18
+ ん part-of-speech id: 63
+ だ part-of-speech id: 25
+ な part-of-speech id: 17
+ これ part-of-speech id: 59
+ がっ part-of-speech id: 32
+ !! part-of-speech id: 36
+ 夢 part-of-speech id: 38
+ な part-of-speech id: 25
+ の part-of-speech id: 17
+ 、 part-of-speech id: 9
+ 俺 part-of-speech id: 59
+ のっ part-of-speech id: 31
+ !! part-of-speech id: 36
+
+ # for more complex parsing, such as that for natural
+ # language processing tasks, it is far more efficient
+ # to iterate over MeCab nodes using an Enumerator
+ #
+ # this example uses the node-format option to customize
+ # the resulting morpheme feature to extract:
+ # - surface
+ # - part-of-speech
+ # - reading
+ #
+ # * again, ignore any end-of-sentence nodes
+ #
+ nm = Natto::MeCab.new('-F%m\t%f[0]\t%f[7]')
+
+ enum = nm.enum_parse('この星の一等賞になりたいの卓球で俺は、そんだけ!')
+ => #<Enumerator: #<Enumerator::Generator:0x00000002ff3898>:each>
+
+ enum.next
+ => #<Natto::MeCabNode:0x000000032eed68 \
+ @pointer=#<FFI::Pointer address=0x000000005ffb48>, \
+ stat=0, \
+ @surface="この", \
+ @feature="この 連体詞 コノ">
+
+ enum.peek
+ => #<Natto::MeCabNode:0x00000002fe2110a \
+ @pointer=#<FFI::Pointer address=0x000000005ffdb8>, \
+ stat=0, \
+ @surface="星", \
+ @feature="星 名詞 ホシ">
+
+ enum.rewind
+
+ enum.each { |n| puts n.feature }
+ この 連体詞 コノ
+ 星 名詞 ホシ
+ の 助詞 ノ
+ 一等 名詞 イットウ
+ 賞 名詞 ショウ
+ に 助詞 ニ
+ なり 動詞 ナリ
+ たい 助動詞 タイ
+ の 助詞 ノ
+ 卓球 名詞 タッキュウ
+ で 助詞 デ
+ 俺 名詞 オレ
+ は 助詞 ハ
+ 、 記号 、
+ そん 名詞 ソン
+ だけ 助詞 ダケ
+ ! 記号 !
+
+
+
+ ## Learn more
+ - You can read more about natto on the [project Wiki](https://bitbucket.org/buruzaemon/natto/wiki/Home).
+
+ ## Contributing to natto
+ - Use [mercurial](http://mercurial.selenic.com/) and [check out the latest code at bitbucket](https://bitbucket.org/buruzaemon/natto/src/) to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
+ - [Browse the issue tracker](https://bitbucket.org/buruzaemon/natto/issues/) to make sure someone already hasn't requested it and/or contributed it.
+ - Fork the project.
+ - Start a feature/bugfix branch.
+ - Commit and push until you are happy with your contribution.
+ - Make sure to add tests for it. This is important so I don't break it in a future version unintentionally. I use [MiniTest::Unit](http://rubydoc.info/gems/minitest/MiniTest/Unit) as it is very natural and easy-to-use.
+ - Please try not to mess with the Rakefile, CHANGELOG, or version. If you must have your own version, that is fine, but please isolate to its own commit so I can cherry-pick around it.
+
+ ## Changelog
+ Please see the {file:CHANGELOG} for this gem's release history.
+
+ ## Copyright
+ Copyright &copy; 2014-2015, Brooke M. Fujita. All rights reserved. Please see the {file:LICENSE} file for further details.
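
The new README's node-format example uses `%f[0]` for the part-of-speech and `%f[7]` for the reading. As a rough illustration of what those indices refer to, the sketch below splits a default comma-separated feature string in plain Ruby, without calling natto. `pos_and_reading` is a hypothetical helper, and the field positions assume an IPADIC-style feature layout like the one in the README output.

```ruby
# Split a comma-separated MeCab feature string and pick out two fields.
# The indices mirror the README's node-format example: field 0 is the
# part-of-speech and field 7 is the reading. This assumes an
# IPADIC-style layout; other dictionaries may order fields differently.
def pos_and_reading(feature)
  fields = feature.split(',')
  { pos: fields[0], reading: fields[7] }
end

# Feature string taken from the README's sample output for 俺.
result = pos_and_reading('名詞,代名詞,一般,*,*,*,俺,オレ,オレ')
puts "#{result[:pos]}\t#{result[:reading]}"
```

Passing `-F%m\t%f[0]\t%f[7]` to `Natto::MeCab.new`, as the README does, has MeCab do this selection itself, so the split above is only needed when working with the default feature string.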
@@ -1 +1,27 @@
  require 'natto/natto'
+
+ # Copyright (c) 2014-2015, Brooke M. Fujita.
+ # All rights reserved.
+ #
+ # Redistribution and use in source and binary forms, with or without
+ # modification, are permitted provided that the following conditions are met:
+ #
+ # * Redistributions of source code must retain the above
+ # copyright notice, this list of conditions and the
+ # following disclaimer.
+ #
+ # * Redistributions in binary form must reproduce the above
+ # copyright notice, this list of conditions and the
+ # following disclaimer in the documentation and/or other
+ # materials provided with the distribution.
+ #
+ # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ # ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ # (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ # LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ # ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
@@ -10,7 +10,7 @@ module Natto
  extend FFI::Library

  # String name for the environment variable used by
- # `Natto` to indicate the exact name / full path
+ # `Natto` to indicate the absolute pathname
  # to the `mecab` library.
  MECAB_PATH = 'MECAB_PATH'.freeze

@@ -19,38 +19,52 @@ module Natto
    base.extend(ClassMethods)
  end

- # Returns the name of the `mecab` library based on
- # the runtime environment. The value of the environment
- # parameter `MECAB_PATH` is checked before this
- # function is invoked, and in the case of Windows, a
- # `LoadError` will be raised if `MECAB_PATH`
- # is _not_ set to the full path of the `mecab`
- # library.
- # @return name of the `mecab` library
- # @raise [LoadError] if MECAB_PATH environment variable is not set in Windows
- # <br/>
- # e.g., for bash on UNIX/Linux
- #
- # export MECAB_PATH=/usr/local/lib/libmecab.so
- #
- # e.g., on Windows
- #
- # set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
- #
- # e.g., from within a Ruby program
- #
- # ENV['MECAB_PATH']='usr/local/lib/libmecab.so'
+ # Returns the absolute pathname to the `mecab` library based on
+ # the runtime environment.
+ #
+ # @return [String] absolute pathname to the `mecab` library
+ # @raise [LoadError] if the library cannot be located
  def self.find_library
+   return File.absolute_path(ENV[MECAB_PATH]) if ENV[MECAB_PATH]
+
    host_os = RbConfig::CONFIG['host_os']

    if host_os =~ /mswin|mingw/i
-     raise LoadError, "Please set #{MECAB_PATH} to the full path to libmecab.dll"
+     require 'win32/registry'
+     begin
+       base = nil
+       Win32::Registry::HKEY_CURRENT_USER.open('Software\MeCab') do |r|
+         base = r['mecabrc'].split('etc').first
+       end
+       lib = File.join(base, 'bin/libmecab.dll')
+       File.absolute_path(lib)
+     rescue
+       raise LoadError, "Please set #{MECAB_PATH} to the full path to libmecab.dll"
+     end
    else
-     'mecab'
+     require 'open3'
+     if host_os =~ /darwin/i
+       ext = 'dylib'
+     else
+       ext = 'so'
+     end
+
+     begin
+       base, lib = nil, nil
+       cmd = 'mecab-config --libs'
+       Open3.popen3(cmd) do |stdin,stdout,stderr|
+         toks = stdout.read.split
+         base = toks[0][2..-1]
+         lib = toks[1][2..-1]
+       end
+       File.absolute_path(File.join(base, "lib#{lib}.#{ext}"))
+     rescue
+       raise LoadError, "Please set #{MECAB_PATH} to the full path to libmecab.#{ext}"
+     end
    end
  end

- ffi_lib(ENV[MECAB_PATH] || find_library)
+ ffi_lib find_library

  # new interface
  attach_function :mecab_model_new2, [:string], :pointer
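
The non-Windows branch in the hunk above builds the library path from the first two tokens printed by `mecab-config --libs`. The sketch below isolates that string handling so it can be tried without MeCab installed. `libpath_from_config` and `lib_ext` are hypothetical helper names, and the sample output string is an assumption about what `mecab-config --libs` prints on a typical install, not captured output.

```ruby
# Derive the library path from the text printed by `mecab-config --libs`.
# Like the code in the diff, this assumes the first token is -L<dir> and
# the second token is -l<name>.
def libpath_from_config(output, ext)
  toks = output.split
  base = toks[0][2..-1]   # strip the leading "-L"
  lib  = toks[1][2..-1]   # strip the leading "-l"
  File.join(base, "lib#{lib}.#{ext}")
end

# Pick the shared-library extension from a host_os string,
# mirroring the darwin check in the diff.
def lib_ext(host_os)
  host_os =~ /darwin/i ? 'dylib' : 'so'
end

# Hypothetical sample of `mecab-config --libs` output.
sample = "-L/usr/local/lib -lmecab -lstdc++\n"

linux_path = libpath_from_config(sample, lib_ext('linux-gnu'))
mac_path   = libpath_from_config(sample, lib_ext('darwin13.0'))
```

The approach depends on token order, so output that begins with something other than `-L<dir>` would yield a wrong path; in the diff any exception in that branch is rescued and re-raised as `LoadError`, and setting `MECAB_PATH` bypasses discovery entirely.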
@@ -77,6 +91,10 @@ module Natto
  # @private
  module ClassMethods

+   def find_library
+     Natto::Binding.find_library
+   end
+
    def mecab_model_new2(options_str)
      Natto::Binding.mecab_model_new2(options_str)
    end
@@ -156,3 +174,29 @@ module Natto
  end
  end
  end
+
+ # Copyright (c) 2014-2015, Brooke M. Fujita.
+ # All rights reserved.
+ #
+ # Redistribution and use in source and binary forms, with or without
+ # modification, are permitted provided that the following conditions are met:
+ #
+ # * Redistributions of source code must retain the above
+ # copyright notice, this list of conditions and the
+ # following disclaimer.
+ #
+ # * Redistributions in binary form must reproduce the above
+ # copyright notice, this list of conditions and the
+ # following disclaimer in the documentation and/or other
+ # materials provided with the distribution.
+ #
+ # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ # ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ # (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ # LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ # ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.