natto 0.9.6 → 0.9.7

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 14ae50169a93b3810e5ae2258187d71f80d8be1a
-   data.tar.gz: 8b89a9be35a76123c1955d85913c7665636235b1
+   metadata.gz: fad99a300fd0a04d95e5ffacb7352b3855506e85
+   data.tar.gz: 1e9ba71a7690d14099f45d0350fba7d388b7e4e9
  SHA512:
-   metadata.gz: 996501067f551d7e7497155f5a128b256adc858c4ebcb127218eca393398bd70cd26911de4516e18fcf7450c48c00da598d5a54e6fe5e91025907121c2f6fc8c
-   data.tar.gz: 308595234aa422803e3a0ac09573601a3e8360837918964e46c1151cb65f03121acc85c789d0c18be86428d7947c4025055925f6dc03b709b6fdc734b55e5c74
+   metadata.gz: 185db00a5a3fba01b27ad27ea0e89e03e698b8d5ccfbef400539c0e48648ab77abe5b90e5cad9c7777dc5d6a79297b4dea99e3ecdae4111bb25f8d78614b164c
+   data.tar.gz: fec5fd24301277deff762c68762b89fec9f33736a6cf918b7d4ec61a9019ff03433d6f96528d8220ca7f35a352950afd93f4b5500cf5fe9baf5b0ccbedeb5efe
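
The digests recorded in checksums.yaml can be compared against a locally computed digest of the packaged archive. The following is only a minimal sketch of that comparison: `sha512_matches?` is a hypothetical helper (it is not part of natto or RubyGems), and the payload is a stand-in string rather than the real `data.tar.gz` bytes.

```ruby
require 'digest/sha2'

# Hypothetical helper: true when the SHA-512 hex digest of `bytes`
# equals the recorded value (surrounding whitespace and case are ignored).
def sha512_matches?(bytes, expected_hex)
  Digest::SHA512.hexdigest(bytes) == expected_hex.strip.downcase
end

# Stand-in payload; for a real check this would be something like
# File.binread('data.tar.gz'), compared with the SHA512 entry above.
payload  = 'example archive bytes'
recorded = Digest::SHA512.hexdigest(payload)

puts sha512_matches?(payload, recorded)        # digest agrees
puts sha512_matches?(payload + 'x', recorded)  # any change breaks the match
```

A mismatch only says that the bytes differ from what was recorded; it does not say which side is wrong.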
data/CHANGELOG CHANGED
@@ -1,5 +1,23 @@
  ## CHANGELOG
 
+ - __2014/12/20__: 0.9.7 release.
+     - Issue 14: [adding automatic discovery for mecab library; no need to
+       explicitly set
+       MECAB_PATH!](https://bitbucket.org/buruzaemon/natto/issue/14/automatic-discovery-of-libmecab-path-and)
+     - Issue 15: [refactored node-parsing to use Enumerator instead of
+       materializing every node and stuffing into
+       array](https://bitbucket.org/buruzaemon/natto/issue/15/use-enumerator-when-parsing-mecab-nodes)
+     - Issue 17: [adding filepath to MeCab and
+       DictionaryInfo](https://bitbucket.org/buruzaemon/natto/issue/17/use-filerealpath-value-for-all-file-paths)
+     - Issue 18: [bug-fix for node-formatting during default node
+       parse](https://bitbucket.org/buruzaemon/natto/issue/18/no-node-formatting-when-using-default-node)
+     - Deprecating parse_as_nodes and parse_as_strings; please use parse instead!
+     - CAUTION: parse_as_nodes, parse_as_strings, readnodes and readlines will be removed in the following release!
+     - Enhancements to to_s methods for both MeCab and DictionaryInfo
+     - Enhancements to TestDictionaryInfo to allow for building user dic during setup on Windows as well
+     - Slight enhancement to benchmark task.
+     - Updating LICENSE (adding copyright year 2015), adding to all files
+
  - __2013/07/07__: 0.9.6 release.
      - Upgrade to mecab 0.996
      - Adding support for partial parsing mode (-p / --partial)
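
Issue 15 in the changelog above describes moving from an array of materialized nodes to an Enumerator. The general idea can be sketched in plain Ruby as follows. This is an illustration only, not natto's implementation: `Node`, `build_nodes` and `node_enumerator` are hypothetical names, and the linked list stands in for the node chain that MeCab returns.

```ruby
# Hypothetical stand-in for a MeCab node: a surface string plus a
# reference to the following node (nil at the end of the chain).
Node = Struct.new(:surface, :next_node)

# Build a small linked list from an array of surfaces.
def build_nodes(surfaces)
  surfaces.reverse.inject(nil) { |nxt, s| Node.new(s, nxt) }
end

# Walk the chain lazily: each node is handed to the consumer as it is
# reached, so no intermediate array of all nodes is built.
def node_enumerator(head)
  Enumerator.new do |yielder|
    node = head
    while node
      yielder << node
      node = node.next_node
    end
  end
end

enum = node_enumerator(build_nodes(%w[a b c]))
puts enum.next.surface               # a
puts enum.map(&:surface).join(',')   # a,b,c
```

Because the Enumerator pulls nodes on demand, a caller that stops early never touches the rest of the chain.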
data/LICENSE CHANGED
@@ -1,8 +1,8 @@
- Copyright © 2011, Brooke M. Fujita.
+ Copyright (c) 2014-2015, Brooke M. Fujita.
  All rights reserved.
 
- Redistribution and use in source and binary forms, with or without modification, are
- permitted provided that the following conditions are met:
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions are met:
 
  * Redistributions of source code must retain the above
  copyright notice, this list of conditions and the
@@ -13,11 +13,13 @@ permitted provided that the following conditions are met:
  following disclaimer in the documentation and/or other
  materials provided with the distribution.
 
- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
- WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
- PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
- ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
- INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
- TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
- ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.md CHANGED
@@ -1,108 +1,233 @@
- # natto
- A Tasty Ruby Binding with MeCab
-
- ## What is natto?
- natto combines the [Ruby programming language](http://www.ruby-lang.org/) with [MeCab](http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html), the part-of-speech and morphological analyzer for the Japanese language.
-
- natto is a gem bridging Ruby and MeCab using FFI (foreign function interface). No compilation is necessary, as natto is _not_ a C extension. natto will run on CRuby (mri/yarv) and JRuby (jvm) equally well. natto will also run on Windows, Unix/Linux, and Mac.
-
- You can learn more about [natto at bitbucket](https://bitbucket.org/buruzaemon/natto/).
-
- ## Requirements
- natto requires the following:
-
- - [MeCab _0.996_](http://code.google.com/p/mecab/downloads/list)
- - [ffi _1.9.0 or greater_](http://rubygems.org/gems/ffi)
- - Ruby _1.9 or greater_
-
- ## Installation on *NIX/Mac
- Install natto with the following gem command:
-
-     gem install natto
-
- This will automatically install the [ffi](http://rubygems.org/gems/ffi) rubygem, which natto uses to bind to the `mecab` library.
-
- ## Installation on Windows
- However, if you are using a CRuby on Windows, then you will first need to install the [RubyInstaller Development Kit (DevKit)](https://github.com/oneclick/rubyinstaller/wiki/Development-Kit), a MSYS/MinGW based toolkit than enables your Windows Ruby installation to build many of the native C/C++ extensions available, including `ffi`.
-
- 1. Download the latest release for RubyInstaller for Windows platforms and the corresponding DevKit from the [RubyInstaller for Windows downloads page](http://rubyinstaller.org/downloads/).
- 2. After installing RubyInstaller for Windows, double-click on the DevKit-tdm installer `.exe`, and expand the contents to an appropriate location, for example `C:\devkit`.
- 3. Open a command window under `C:\devkit`, and execute: `ruby dk.rb init`. This will locate all known ruby installations, and add them to `C:\devkit\config.yml`.
- 4. Next, execute: `ruby dk.rb install`, which will add the DevKit to all of the installed rubies listed in your `C:\devkit\config.yml`. Now you should be able to install and build the `ffi` rubygem correctly on your Windows-installed ruby.
- 5. Install `natto` with:
-
-         gem install natto
-
- ## Configuration
- - natto will try to locate the `mecab` library based upon its runtime environment.
- - In case of `LoadError`, please set the `MECAB_PATH` environment variable to the exact name/path to your `mecab` library.
-
-     e.g., for bash on UNIX/Linux
-
-         export MECAB_PATH=/usr/local/lib/libmecab.so
-
-     e.g., on Windows
-
-         set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
-
-     e.g., from within a Ruby program
-
-         ENV['MECAB_PATH']='/usr/local/lib/libmecab.so'
-
- ## Usage
-     require 'natto'
-
-     nm = Natto::MeCab.new
-     => #<Natto::MeCab:0x28d30748
-     @tagger=#<FFI::Pointer address=0x28a97d50>, \
-     @options={}, \
-     @dicts=[#<Natto::DictionaryInfo:0x28d3061c \
-     type="0", \
-     filename="/usr/local/lib/mecab/dic/ipadic/sys.dic", \
-     charset="utf8">], \
-     @version="0.996">
-
-     puts nm.version
-     => "0.996"
-
-     sysdic = nm.dicts.first
-
-     puts sysdic.filename
-     => "/usr/local/lib/mecab/dic/ipadic/sys.dic"
-
-     puts sysdic.charset
-     => "utf8"
-
-     nm.parse('ピンチの時には必ずヒーローが現れる。') do |n|
-       puts "#{n.surface}\t#{n.feature}"
-     end
-     ピンチ 名詞,一般,*,*,*,*,ピンチ,ピンチ,ピンチ
-     の 助詞,連体化,*,*,*,*,の,ノ,ノ
-     時 名詞,非自立,副詞可能,*,*,*,時,トキ,トキ
-     に 助詞,格助詞,一般,*,*,*,に,一般ニ,ニ
-     は 助詞,係助詞,*,*,*,*,は,ハ,ワ
-     必ず 副詞,助詞類接続,*,*,*,*,必ず,カナラズ,カナラズ
-     ヒーロー 名詞,一般,*,*,*,*,ヒーロー,ヒーローー,ヒーロー
-     が 助詞,格助詞,一般,*,*,*,が,ガ,ガ
-     現れる 動詞,自立,*,*,一段,基本形,現れる,アラワレル,アラワレル
-     。 記号,句点,*,*,*,*,。,。,。句点
-     BOS/EOS,*,*,*,*,*,*,*,*
-
-
- ## Learn more
- - You can read more about natto on the [project Wiki](https://bitbucket.org/buruzaemon/natto/wiki/Home).
-
- ## Contributing to natto
- - Use [mercurial](http://mercurial.selenic.com/) and [check out the latest code at bitbucket](https://bitbucket.org/buruzaemon/natto/src/) to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
- - [Browse the issue tracker](https://bitbucket.org/buruzaemon/natto/issues/) to make sure someone already hasn't requested it and/or contributed it.
- - Fork the project.
- - Start a feature/bugfix branch.
- - Commit and push until you are happy with your contribution.
- - Make sure to add tests for it. This is important so I don't break it in a future version unintentionally. I use [MiniTest::Unit](http://rubydoc.info/gems/minitest/MiniTest/Unit) as it is very natural and easy-to-use.
- - Please try not to mess with the Rakefile, CHANGELOG, or version. If you must have your own version, that is fine, but please isolate to its own commit so I can cherry-pick around it.
-
- ## Changelog
- Please see the {file:CHANGELOG} for this gem's release history.
-
- ## Copyright
- Copyright &copy; 2011, Brooke M. Fujita. All rights reserved. Please see the {file:LICENSE} file for further details.
+ # natto
+ A Tasty Ruby Binding with MeCab
+
+ ## What is natto?
+ A gem leveraging FFI (foreign function interface), natto combines the
+ [Ruby programming language](http://www.ruby-lang.org/) with
+ [MeCab](http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html), the part-of-speech
+ and morphological analyzer for the Japanese language.
+
+ - No compiler is necessary, as natto is _not_ a C extension.
+ - It will run on CRuby (mri/yarv) and JRuby (jvm) equally well.
+ - It will work with MeCab installations on Windows, Unix/Linux or Mac OS.
+ - natto provides a naturally Ruby-esque interface to MeCab.
+
+ You can learn more about [natto at bitbucket](https://bitbucket.org/buruzaemon/natto/).
+
+
+ ## Requirements
+ natto requires the following:
+
+ - [MeCab _0.996_](http://code.google.com/p/mecab/downloads/list)
+ - A system dictionary, like [mecab-ipadic](https://mecab.googlecode.com/files/mecab-ipadic-2.7.0-20070801.tar.gz) or [mecab-jumandic](https://mecab.googlecode.com/files/mecab-jumandic-5.1-20070304.tar.gz)
+ - `libmecab-devel` if you are on Linux, since natto uses `mecab-config`
+ - Ruby _1.9 or greater_
+ - [ffi _1.9.0 or greater_](http://rubygems.org/gems/ffi)
+
+ ## Installation on *nix and Mac OS
+ Install natto with the following gem command:
+
+     gem install natto
+
+ This will automatically install the [ffi](http://rubygems.org/gems/ffi) rubygem, which natto uses to bind to the `mecab` library.
+
+ ## Installation on Windows
+ However, if you are using a CRuby on Windows, then you will first need to install the [RubyInstaller Development Kit (DevKit)](https://github.com/oneclick/rubyinstaller/wiki/Development-Kit), a MSYS/MinGW based toolkit that enables your Windows Ruby installation to build many of the native C/C++ extensions available, including ffi.
+
+ 1. Download the latest release for RubyInstaller for Windows platforms and the corresponding DevKit from the [RubyInstaller for Windows downloads page](http://rubyinstaller.org/downloads/).
+ 2. After installing RubyInstaller for Windows, double-click on the DevKit-tdm installer `.exe`, and expand the contents to an appropriate location, for example `C:\devkit`.
+ 3. Open a command window under `C:\devkit`, and execute: `ruby dk.rb init`. This will locate all known ruby installations, and add them to `C:\devkit\config.yml`.
+ 4. Next, execute: `ruby dk.rb install`, which will add the DevKit to all of the installed rubies listed in your `C:\devkit\config.yml`. Now you should be able to install and build the ffi rubygem correctly on your Windows-installed ruby.
+ 5. Install natto with:
+
+         gem install natto
+
+ 6. If you are on a 64-bit Windows and you use a 64-bit Ruby or JRuby, then you might want to [build a 64-bit version of libmecab.dll](https://bitbucket.org/buruzaemon/natto/wiki/64-Bit-Windows).
+
+
+ ## Configuration
+ - ***No explicit configuration should be necessary, as natto will try to locate the `mecab` library based upon its runtime environment.***
+     - On Windows, it will query the Windows Registry to determine where `libmecab.dll` is installed
+     - On Mac OS and \*nix, it will query `mecab-config --libs`
+ - ***But if natto cannot find the `mecab` library, `LoadError` will be raised.***
+     - Please set the `MECAB_PATH` environment variable to the exact name/path to your `mecab` library.
+     - e.g., for Mac OS
+
+             export MECAB_PATH=/usr/local/Cellar/mecab/0.996/lib/libmecab.dylib
+
+     - e.g., for bash on UNIX/Linux
+
+             export MECAB_PATH=/usr/local/lib/libmecab.so
+
+     - e.g., on Windows
+
+             set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
+
+     - e.g., from within a Ruby program
+
+             ENV['MECAB_PATH']='/usr/local/lib/libmecab.so'
+
+ ## Usage
+
+
+     # Quick Start
+     # -----------
+     #
+     # No explicit configuration should be necessary!
+     #
+     require 'natto'
+
+     # first, create an instance of Natto::MeCab
+     #
+     nm = Natto::MeCab.new
+     => #<Natto::MeCab:0x28d30748
+     @tagger=#<FFI::Pointer address=0x28a97d50>, \
+     @libpath="/usr/local/lib/libmecab.so", \
+     @options={}, \
+     @dicts=[#<Natto::DictionaryInfo:0x28d3061c \
+     @filepath="/usr/local/lib/mecab/dic/ipadic/sys.dic", \
+     charset=utf8, \
+     type=0>] \
+     @version=0.996>
+
+     # display MeCab version
+     #
+     puts nm.version
+     => 0.996
+
+     # display full pathname to MeCab library
+     #
+     puts nm.libpath
+     => /usr/local/lib/libmecab.so
+
+     # reference to MeCab system dictionary
+     #
+     sysdic = nm.dicts.first
+
+     # display full pathname to system dictionary file
+     #
+     puts sysdic.filepath
+     => /usr/local/lib/mecab/dic/ipadic/sys.dic
+
+     # what charset (encoding) is the system dictionary?
+     #
+     puts sysdic.charset
+     => utf8
+
+     # parse text and send output to stdout
+     #
+     puts nm.parse('俺の名前は星野豊だ!!そこんとこヨロシク!')
+     俺 名詞,代名詞,一般,*,*,*,俺,オレ,オレ
+     の 助詞,連体化,*,*,*,*,の,ノ,ノ
+     名前 名詞,一般,*,*,*,*,名前,ナマエ,ナマエ
+     は 助詞,係助詞,*,*,*,*,は,ハ,ワ
+     星野 名詞,固有名詞,人名,姓,*,*,星野,ホシノ,ホシノ
+     豊 名詞,固有名詞,人名,名,*,*,豊,ユタカ,ユタカ
+     だ 助動詞,*,*,*,特殊・ダ,基本形,だ,ダ,ダ
+     ! 記号,一般,*,*,*,*,!,!,!
+     ! 記号,一般,*,*,*,*,!,!,!
+     そこ 名詞,代名詞,一般,*,*,*,そこ,ソコ,ソコ
+     ん 助詞,特殊,*,*,*,*,ん,ン,ン
+     とこ 名詞,一般,*,*,*,*,とこ,トコ,トコ
+     ヨロシク 感動詞,*,*,*,*,*,ヨロシク,ヨロシク,ヨロシク
+     ! 記号,一般,*,*,*,*,!,!,!
+     EOS
+
+     # parse more text and use a block to:
+     # - iterate the resulting MeCab nodes
+     # - output morpheme surface and part-of-speech ID
+     #
+     # * ignore any end-of-sentence nodes
+     #
+     nm.parse('世界チャンプ目指してんだなこれがっ!!夢なの、俺のっ!!') do |n|
+       puts "#{n.surface}\tpart-of-speech id: #{n.posid}" if !n.is_eos?
+     end
+     世界 part-of-speech id: 38
+     チャンプ part-of-speech id: 38
+     目指し part-of-speech id: 31
+     て part-of-speech id: 18
+     ん part-of-speech id: 63
+     だ part-of-speech id: 25
+     な part-of-speech id: 17
+     これ part-of-speech id: 59
+     がっ part-of-speech id: 32
+     !! part-of-speech id: 36
+     夢 part-of-speech id: 38
+     な part-of-speech id: 25
+     の part-of-speech id: 17
+     、 part-of-speech id: 9
+     俺 part-of-speech id: 59
+     のっ part-of-speech id: 31
+     !! part-of-speech id: 36
+
+     # for more complex parsing, such as that for natural
+     # language processing tasks, it is far more efficient
+     # to iterate over MeCab nodes using an Enumerator
+     #
+     # this example uses the node-format option to customize
+     # the resulting morpheme feature to extract:
+     # - surface
+     # - part-of-speech
+     # - reading
+     #
+     # * again, ignore any end-of-sentence nodes
+     #
+     nm = Natto::MeCab.new('-F%m\t%f[0]\t%f[7]')
+
+     enum = nm.enum_parse('この星の一等賞になりたいの卓球で俺は、そんだけ!')
+     => #<Enumerator: #<Enumerator::Generator:0x00000002ff3898>:each>
+
+     enum.next
+     => #<Natto::MeCabNode:0x000000032eed68 \
+     @pointer=#<FFI::Pointer address=0x000000005ffb48>, \
+     stat=0, \
+     @surface="この", \
+     @feature="この 連体詞 コノ">
+
+     enum.peek
+     => #<Natto::MeCabNode:0x00000002fe2110a \
+     @pointer=#<FFI::Pointer address=0x000000005ffdb8>, \
+     stat=0, \
+     @surface="星", \
+     @feature="星 名詞 ホシ">
+
+     enum.rewind
+
+     enum.each { |n| puts n.feature }
+     この 連体詞 コノ
+     星 名詞 ホシ
+     の 助詞 ノ
+     一等 名詞 イットウ
+     賞 名詞 ショウ
+     に 助詞 ニ
+     なり 動詞 ナリ
+     たい 助動詞 タイ
+     の 助詞 ノ
+     卓球 名詞 タッキュウ
+     で 助詞 デ
+     俺 名詞 オレ
+     は 助詞 ハ
+     、 記号 、
+     そん 名詞 ソン
+     だけ 助詞 ダケ
+     ! 記号 !
+
+
+
+ ## Learn more
+ - You can read more about natto on the [project Wiki](https://bitbucket.org/buruzaemon/natto/wiki/Home).
+
+ ## Contributing to natto
+ - Use [mercurial](http://mercurial.selenic.com/) and [check out the latest code at bitbucket](https://bitbucket.org/buruzaemon/natto/src/) to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
+ - [Browse the issue tracker](https://bitbucket.org/buruzaemon/natto/issues/) to make sure someone already hasn't requested it and/or contributed it.
+ - Fork the project.
+ - Start a feature/bugfix branch.
+ - Commit and push until you are happy with your contribution.
+ - Make sure to add tests for it. This is important so I don't break it in a future version unintentionally. I use [MiniTest::Unit](http://rubydoc.info/gems/minitest/MiniTest/Unit) as it is very natural and easy-to-use.
+ - Please try not to mess with the Rakefile, CHANGELOG, or version. If you must have your own version, that is fine, but please isolate to its own commit so I can cherry-pick around it.
+
+ ## Changelog
+ Please see the {file:CHANGELOG} for this gem's release history.
+
+ ## Copyright
+ Copyright &copy; 2014-2015, Brooke M. Fujita. All rights reserved. Please see the {file:LICENSE} file for further details.
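
The new README's last example passes the node-format option `-F%m\t%f[0]\t%f[7]`, where `%m` is the morpheme surface and `%f[N]` selects the N-th comma-separated feature field. The snippet below is a rough emulation of that small subset in plain Ruby, intended only to show how the three columns in the README output arise. `format_node` is a hypothetical helper; the real formatting happens inside libmecab, and the sample feature string is an assumed ipadic-style value.

```ruby
# Hypothetical emulation of a tiny subset of MeCab's node-format syntax:
# %m (surface), %f[N] (N-th feature field) and the literal two-character
# sequence \t (rendered as a tab).
def format_node(fmt, surface, feature_csv)
  fields = feature_csv.split(',')
  fmt.gsub(/%m|%f\[(\d+)\]|\\t/) do |tok|
    case tok
    when '%m' then surface
    when '\t' then "\t"
    else fields[$1.to_i].to_s   # a missing field becomes an empty string
    end
  end
end

# Assumed sample feature string for the morpheme この
feature = '連体詞,*,*,*,*,*,この,コノ,コノ'
puts format_node('%m\t%f[0]\t%f[7]', 'この', feature)
```

Field 0 is the part-of-speech and field 7 is the reading, which is why each output line in the README shows surface, part-of-speech and reading.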
@@ -1 +1,27 @@
  require 'natto/natto'
+
+ # Copyright (c) 2014-2015, Brooke M. Fujita.
+ # All rights reserved.
+ #
+ # Redistribution and use in source and binary forms, with or without
+ # modification, are permitted provided that the following conditions are met:
+ #
+ # * Redistributions of source code must retain the above
+ # copyright notice, this list of conditions and the
+ # following disclaimer.
+ #
+ # * Redistributions in binary form must reproduce the above
+ # copyright notice, this list of conditions and the
+ # following disclaimer in the documentation and/or other
+ # materials provided with the distribution.
+ #
+ # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ # ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ # (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ # LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ # ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
@@ -10,7 +10,7 @@ module Natto
      extend FFI::Library
 
      # String name for the environment variable used by
-     # `Natto` to indicate the exact name / full path
+     # `Natto` to indicate the absolute pathname
      # to the `mecab` library.
      MECAB_PATH = 'MECAB_PATH'.freeze
 
@@ -19,38 +19,52 @@ module Natto
        base.extend(ClassMethods)
      end
 
-     # Returns the name of the `mecab` library based on
-     # the runtime environment. The value of the environment
-     # parameter `MECAB_PATH` is checked before this
-     # function is invoked, and in the case of Windows, a
-     # `LoadError` will be raised if `MECAB_PATH`
-     # is _not_ set to the full path of the `mecab`
-     # library.
-     # @return name of the `mecab` library
-     # @raise [LoadError] if MECAB_PATH environment variable is not set in Windows
-     # <br/>
-     # e.g., for bash on UNIX/Linux
-     #
-     # export MECAB_PATH=/usr/local/lib/libmecab.so
-     #
-     # e.g., on Windows
-     #
-     # set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
-     #
-     # e.g., from within a Ruby program
-     #
-     # ENV['MECAB_PATH']='usr/local/lib/libmecab.so'
+     # Returns the absolute pathname to the `mecab` library based on
+     # the runtime environment.
+     #
+     # @return [String] absolute pathname to the `mecab` library
+     # @raise [LoadError] if the library cannot be located
      def self.find_library
+       return File.absolute_path(ENV[MECAB_PATH]) if ENV[MECAB_PATH]
+
        host_os = RbConfig::CONFIG['host_os']
 
        if host_os =~ /mswin|mingw/i
-         raise LoadError, "Please set #{MECAB_PATH} to the full path to libmecab.dll"
+         require 'win32/registry'
+         begin
+           base = nil
+           Win32::Registry::HKEY_CURRENT_USER.open('Software\MeCab') do |r|
+             base = r['mecabrc'].split('etc').first
+           end
+           lib = File.join(base, 'bin/libmecab.dll')
+           File.absolute_path(lib)
+         rescue
+           raise LoadError, "Please set #{MECAB_PATH} to the full path to libmecab.dll"
+         end
        else
-         'mecab'
+         require 'open3'
+         if host_os =~ /darwin/i
+           ext = 'dylib'
+         else
+           ext = 'so'
+         end
+
+         begin
+           base, lib = nil, nil
+           cmd = 'mecab-config --libs'
+           Open3.popen3(cmd) do |stdin,stdout,stderr|
+             toks = stdout.read.split
+             base = toks[0][2..-1]
+             lib = toks[1][2..-1]
+           end
+           File.absolute_path(File.join(base, "lib#{lib}.#{ext}"))
+         rescue
+           raise LoadError, "Please set #{MECAB_PATH} to the full path to libmecab.#{ext}"
+         end
        end
      end
 
-     ffi_lib(ENV[MECAB_PATH] || find_library)
+     ffi_lib find_library
 
      # new interface
      attach_function :mecab_model_new2, [:string], :pointer
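
The non-Windows branch of `find_library` in the hunk above takes the first two tokens printed by `mecab-config --libs`, strips their two-character flag prefixes, and joins them into a library path. That string handling can be tried without MeCab installed. The sketch below is not natto's code: `libpath_from_flags` is a hypothetical helper, the sample flags line is an assumed typical output, and unlike the hunk it looks tokens up by their `-L` / `-l` prefix instead of by position.

```ruby
# Hypothetical helper: derive the shared-library path from a line of
# linker flags such as the one printed by `mecab-config --libs`.
def libpath_from_flags(flags, ext)
  toks = flags.split
  dir  = toks.find { |t| t.start_with?('-L') }   # library directory flag
  lib  = toks.find { |t| t.start_with?('-l') }   # first library name flag
  raise LoadError, 'could not parse linker flags' unless dir && lib
  File.join(dir[2..-1], "lib#{lib[2..-1]}.#{ext}")
end

# Assumed sample output; the real value depends on the installation.
sample = '-L/usr/local/lib -lmecab -lstdc++'
puts libpath_from_flags(sample, 'so')      # /usr/local/lib/libmecab.so
puts libpath_from_flags(sample, 'dylib')   # /usr/local/lib/libmecab.dylib
```

Looking tokens up by prefix tolerates a different flag order, at the cost of assuming the first `-l` flag names the mecab library.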
@@ -77,6 +91,10 @@ module Natto
      # @private
      module ClassMethods
 
+       def find_library
+         Natto::Binding.find_library
+       end
+
        def mecab_model_new2(options_str)
          Natto::Binding.mecab_model_new2(options_str)
        end
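
The hunk above adds `find_library` to `ClassMethods`, which the earlier `base.extend(ClassMethods)` line attaches to any class that includes the binding module. The same include-then-extend delegation pattern can be shown in isolation. All names in this sketch (`Greeter`, `Client`, `find_greeting`) are hypothetical and stand in for `Natto::Binding` and its callers.

```ruby
module Greeter
  # Runs when a class includes Greeter; adds the delegating
  # class-level methods to that class.
  def self.included(base)
    base.extend(ClassMethods)
  end

  # The module-level function that does the actual work.
  def self.find_greeting
    'hello'
  end

  module ClassMethods
    # Class-level wrapper that delegates to the module function.
    def find_greeting
      Greeter.find_greeting
    end
  end
end

class Client
  include Greeter
end

puts Client.find_greeting   # hello
```

With this arrangement the including class can call the function on itself without referring to the module by name.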
@@ -156,3 +174,29 @@ module Natto
      end
    end
  end
+
+ # Copyright (c) 2014-2015, Brooke M. Fujita.
+ # All rights reserved.
+ #
+ # Redistribution and use in source and binary forms, with or without
+ # modification, are permitted provided that the following conditions are met:
+ #
+ # * Redistributions of source code must retain the above
+ # copyright notice, this list of conditions and the
+ # following disclaimer.
+ #
+ # * Redistributions in binary form must reproduce the above
+ # copyright notice, this list of conditions and the
+ # following disclaimer in the documentation and/or other
+ # materials provided with the distribution.
+ #
+ # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+ # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ # ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ # (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ # LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ # ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.