natto 0.0.6 → 0.0.7
- data/LICENSE +1 -1
- data/README.md +84 -0
- data/lib/natto.rb +93 -83
- data/lib/natto/binding.rb +53 -2
- data/lib/natto/version.rb +1 -1
- data/test/test_natto.rb +52 -12
- metadata +6 -6
- data/README +0 -77
data/LICENSE
CHANGED
data/README.md
ADDED
@@ -0,0 +1,84 @@
|
|
1
|
+
# natto
|
2
|
+
A Tasty Ruby Binding with MeCab
|
3
|
+
|
4
|
+
## What is natto?
|
5
|
+
|
6
|
+
natto combines the [Ruby programming language](http://www.ruby-lang.org/) with [MeCab](http://mecab.sourceforge.net/), the part-of-speech and morphological analyzer for the Japanese language.
|
7
|
+
|
8
|
+
## Requirements
|
9
|
+
natto requires the following:
|
10
|
+
|
11
|
+
- [MeCab _0.98_](http://sourceforge.net/projects/mecab/files/mecab/0.98/)
|
12
|
+
- [ffi _0.6.3 or greater_](http://rubygems.org/gems/ffi)
|
13
|
+
- Ruby _1.8.7 or greater_
|
14
|
+
|
15
|
+
## Installation
|
16
|
+
Install natto with the following gem command:
|
17
|
+
gem install natto
|
18
|
+
|
19
|
+
## Configuration
|
20
|
+
- natto will try to locate the <tt>mecab</tt> library based upon its runtime environment.
|
21
|
+
- In case of <tt>LoadError</tt>, please set the <tt>MECAB_PATH</tt> environment variable to the exact name/path to your <tt>mecab</tt> library.
|
22
|
+
|
23
|
+
e.g., for bash on UNIX/Linux
|
24
|
+
export MECAB_PATH=mecab.so
|
25
|
+
e.g., on Windows
|
26
|
+
set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
|
27
|
+
e.g., for Cygwin
|
28
|
+
export MECAB_PATH=cygmecab-1
|
29
|
+
|
30
|
+
## Usage
|
31
|
+
require 'natto'
|
32
|
+
|
33
|
+
m = Natto::MeCab.new
|
34
|
+
puts m.parse("すもももももももものうち")
|
35
|
+
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
|
36
|
+
も 助詞,係助詞,*,*,*,*,も,モ,モ
|
37
|
+
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
38
|
+
も 助詞,係助詞,*,*,*,*,も,モ,モ
|
39
|
+
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
40
|
+
の 助詞,連体化,*,*,*,*,の,ノ,ノ
|
41
|
+
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
|
42
|
+
EOS
|
43
|
+
=> nil
|
44
|
+
|
45
|
+
## Contributing to natto
|
46
|
+
- Use [Mercurial](http://mercurial.selenic.com/) and [check out the latest master](http://code.google.com/p/natto/source/checkout) to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
|
47
|
+
- [Check out the issue tracker](http://code.google.com/p/natto/issues/list) to make sure someone hasn't already requested it and/or contributed it.
|
48
|
+
- Fork the project.
|
49
|
+
- Start a feature/bugfix branch.
|
50
|
+
- Commit and push until you are happy with your contribution.
|
51
|
+
- Make sure to add tests for it. This is important so I don't break it in a future version unintentionally. I use [Test::Unit](http://ruby-doc.org/stdlib/libdoc/test/unit/rdoc/classes/Test/Unit.html) since it is simple and it works.
|
52
|
+
- Please try not to mess with the Rakefile, version, or history. If you must have your own version, that is fine, but please isolate to its own commit so I can cherry-pick around it.
|
53
|
+
|
54
|
+
## Changelog
|
55
|
+
|
56
|
+
- __2010/12/30__: 0.0.7 release.
|
57
|
+
- Adding support for all-morphs and partial options
|
58
|
+
- Further updating of documentation with markdown
|
59
|
+
|
60
|
+
- __2010/12/28__: 0.0.6 release.
|
61
|
+
- Correction to natto.gemspec to include lib/natto/binding.rb
|
62
|
+
|
63
|
+
- __2010/12/28__: 0.0.5 release. (yanked)
|
64
|
+
- On-going refactoring
|
65
|
+
- Project structure refactored for greater maintainability
|
66
|
+
|
67
|
+
- __2010/12/26__: 0.0.4 release.
|
68
|
+
- On-going refactoring
|
69
|
+
|
70
|
+
- __2010/12/23__: 0.0.3 release.
|
71
|
+
- On-going refactoring
|
72
|
+
- Adding documentation via yard
|
73
|
+
|
74
|
+
- __2010/12/20__: 0.0.2 release.
|
75
|
+
- Continuing development on proper resource deallocation
|
76
|
+
- Adding options hash in object initializer
|
77
|
+
|
78
|
+
- __2010/12/13__: Released version 0.0.1. The objective is to provide
|
79
|
+
an easy-to-use, production-level Ruby binding to MeCab.
|
80
|
+
- Initial release
|
81
|
+
|
82
|
+
## Copyright
|
83
|
+
|
84
|
+
natto © 2010-2013 by Brooke M. Fujita, licensed under the new BSD license. Please see the [LICENSE](file.LICENSE.html) document for further details.
|
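The Usage section of the new README shows `parse` returning MeCab's default text output. For readers who want to work with that string programmatically, here is a minimal sketch of splitting it into records. It assumes each line holds a surface form, a tab, and comma-separated features, and that a line containing only `EOS` ends the sentence. The helper name `split_mecab_output` is hypothetical and not part of natto.

```ruby
# Hypothetical helper (not part of natto): turns MeCab-style text output
# into an array of [surface, features] pairs.
# Assumptions: surface and features are separated by a tab, features by
# commas, and the sentence ends with a line containing only "EOS".
def split_mecab_output(text)
  text.each_line
      .map(&:chomp)
      .take_while { |line| line != 'EOS' }   # stop at the end-of-sentence marker
      .reject(&:empty?)                      # skip blank lines
      .map do |line|
        surface, feature_str = line.split("\t", 2)
        [surface, feature_str.to_s.split(',')]
      end
end

# Sample text modelled on the README output (tab separator assumed).
sample = "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\n" \
         "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n" \
         "EOS\n"

split_mecab_output(sample).each do |surface, features|
  puts "#{surface}: #{features.first}"
end
```

With a real installation, the string returned by `Natto::MeCab#parse` could be passed in place of `sample`, provided the default output format is in use.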
data/lib/natto.rb
CHANGED
@@ -1,85 +1,85 @@
|
|
1
|
-
# -*- encoding: utf-8 -*-
|
2
1
|
require 'rubygems' if RUBY_VERSION.to_f < 1.9
|
3
2
|
require 'natto/binding'
|
4
3
|
|
5
|
-
# natto combines the Ruby programming language with MeCab,
|
6
|
-
# the part-of-speech and morphological analyzer for the
|
7
|
-
# Japanese language.
|
8
|
-
#
|
9
|
-
# === Requirements
|
10
|
-
# natto requires the following:
|
11
|
-
# * {http://sourceforge.net/projects/mecab/files/mecab/ MeCab 0.98}
|
12
|
-
# * {http://rubygems.org/gems/ffi ffi 0.63 or greater}
|
13
|
-
# * Ruby 1.8.7 or greater
|
14
|
-
#
|
15
|
-
# === Installation
|
16
|
-
# Install natto with the following gem command:
|
17
|
-
# * <code>gem install natto</code>
|
18
|
-
#
|
19
|
-
# === Configuration
|
20
|
-
# * natto will try to locate the <tt>mecab</tt> library based upon its runtime environment.
|
21
|
-
# * In case of <tt>LoadError</tt>, please set the <tt>MECAB_PATH</tt> environment variable to the exact name/path to your <tt>mecab</tt> library.
|
22
|
-
#
|
23
|
-
#== Usage
|
24
|
-
# require 'natto'
|
25
|
-
#
|
26
|
-
# m = Natto::MeCab.new
|
27
|
-
# puts m.parse("すもももももももものうち")
|
28
|
-
# すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
|
29
|
-
# も 助詞,係助詞,*,*,*,*,も,モ,モ
|
30
|
-
# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
31
|
-
# も 助詞,係助詞,*,*,*,*,も,モ,モ
|
32
|
-
# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
33
|
-
# の 助詞,連体化,*,*,*,*,の,ノ,ノ
|
34
|
-
# うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
|
35
|
-
# EOS
|
36
|
-
# => nil
|
37
|
-
#
|
38
|
-
# @author Brooke M. Fujita (buruzaemon)
|
39
4
|
module Natto
|
40
5
|
require 'ffi'
|
41
6
|
|
42
|
-
# <tt>MeCab</tt> is a wrapper class
|
43
|
-
# Options to the <tt>mecab</tt> parser are passed in as a hash
|
44
|
-
#
|
7
|
+
# <tt>MeCab</tt> is a wrapper class for the <tt>mecab</tt> parser.
|
8
|
+
# Options to the <tt>mecab</tt> parser are passed in as a hash at
|
9
|
+
# initialization.
|
10
|
+
#
|
11
|
+
# ## Usage
|
12
|
+
# require 'natto'
|
13
|
+
#
|
14
|
+
# m = Natto::MeCab.new
|
15
|
+
# puts m.parse("すもももももももものうち")
|
16
|
+
# すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
|
17
|
+
# も 助詞,係助詞,*,*,*,*,も,モ,モ
|
18
|
+
# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
19
|
+
# も 助詞,係助詞,*,*,*,*,も,モ,モ
|
20
|
+
# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
21
|
+
# の 助詞,連体化,*,*,*,*,の,ノ,ノ
|
22
|
+
# うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
|
23
|
+
# EOS
|
24
|
+
# => nil
|
45
25
|
#
|
46
|
-
# @see
|
26
|
+
# @see SUPPORTED_OPTS
|
47
27
|
class MeCab
|
28
|
+
|
29
|
+
attr_reader :options
|
30
|
+
|
48
31
|
# Supported options to the <tt>mecab</tt> parser.
|
49
32
|
# See the <tt>mecab</tt> help for more details.
|
50
|
-
SUPPORTED_OPTS = [
|
51
|
-
|
52
|
-
|
53
|
-
|
33
|
+
SUPPORTED_OPTS = [ :rcfile, :dicdir, :userdic, :lattice_level, :all_morphs,
|
34
|
+
:output_format_type, :partial, :node_format, :unk_format,
|
35
|
+
:bos_format, :eos_format, :eon_format, :unk_feature,
|
36
|
+
:nbest, :theta, :cost_factor ].freeze
|
37
|
+
# :allocate_sentence ]
|
38
|
+
|
39
|
+
#OPTION_DEFAULTS = { :lattice_level=>0, :all_morphs=>false, :nbest=>1,
|
40
|
+
# :theta=>0.75, :cost_factor=>700 }.freeze
|
54
41
|
|
55
|
-
#
|
42
|
+
# Initializes the wrapped <tt>mecab</tt> instance with the
|
56
43
|
# given <tt>options</tt> hash.
|
57
|
-
#
|
44
|
+
#
|
58
45
|
# Options supported are:
|
59
|
-
#
|
60
|
-
#
|
61
|
-
#
|
62
|
-
#
|
63
|
-
#
|
64
|
-
#
|
65
|
-
#
|
66
|
-
#
|
67
|
-
#
|
68
|
-
#
|
69
|
-
#
|
70
|
-
#
|
71
|
-
#
|
72
|
-
#
|
73
|
-
#
|
74
|
-
#
|
75
|
-
#
|
46
|
+
#
|
47
|
+
# - :rcfile -- resource file
|
48
|
+
# - :dicdir -- system dicdir
|
49
|
+
# - :userdic -- user dictionary
|
50
|
+
# - :lattice_level -- lattice information level (integer, default 0)
|
51
|
+
# - :all_morphs -- output all morphs (default false)
|
52
|
+
# - :output_format_type -- output format type (wakati, chasen, yomi, etc.)
|
53
|
+
# - :partial -- partial parsing mode
|
54
|
+
# - :node_format -- user-defined node format
|
55
|
+
# - :unk_format -- user-defined unknown node format
|
56
|
+
# - :bos_format -- user-defined beginning-of-sentence format
|
57
|
+
# - :eos_format -- user-defined end-of-sentence format
|
58
|
+
# - :eon_format -- user-defined end-of-NBest format
|
59
|
+
# - :unk_feature -- feature for unknown word
|
60
|
+
# - :nbest -- output N best results (integer, default 1)
|
61
|
+
# - :theta -- temperature parameter theta (float, default 0.75)
|
62
|
+
# - :cost_factor -- cost factor (integer, default 700)
|
63
|
+
#
|
64
|
+
# _Use single-quotes to preserve format options that contain escape chars._
|
65
|
+
#
|
76
66
|
# e.g.
|
77
|
-
#
|
67
|
+
# m = Natto::MeCab.new(:node_format=>'%m\t%f[7]\n')
|
68
|
+
# puts m.parse("日本語は難しいです。")
|
69
|
+
# 日本語 ニホンゴ
|
70
|
+
# は ハ
|
71
|
+
# 難しい ムズカシイ
|
72
|
+
# です デス
|
73
|
+
# 。 。
|
74
|
+
# EOS
|
75
|
+
# => nil
|
78
76
|
#
|
79
77
|
# @param [Hash]
|
80
|
-
# @
|
78
|
+
# @raise [MeCabError] if <tt>mecab</tt> cannot be initialized with the given <tt>options</tt>
|
79
|
+
# @see MeCab::SUPPORTED_OPTS
|
81
80
|
def initialize(options={})
|
82
|
-
|
81
|
+
@options = options
|
82
|
+
opt_str = self.class.build_options_str(@options)
|
83
83
|
@ptr = Natto::Binding.mecab_new2(opt_str)
|
84
84
|
raise MeCabError.new("Could not initialize MeCab with options: '#{opt_str}'") if @ptr.address == 0
|
85
85
|
#@dict = Natto::DictionaryInfo.new(Natto::Binding.mecab_dictionary_info(@ptr))
|
@@ -88,7 +88,9 @@ module Natto
|
|
88
88
|
|
89
89
|
# Parses the given string <tt>s</tt>.
|
90
90
|
#
|
91
|
-
# @param [String]
|
91
|
+
# @param [String] s
|
92
|
+
# @return parsing result from <tt>mecab</tt>
|
93
|
+
# @raise [MeCabError] if the <tt>mecab</tt> parser cannot parse the given string <tt>s</tt>
|
92
94
|
def parse(s)
|
93
95
|
Natto::Binding.mecab_sparse_tostr(@ptr, s) ||
|
94
96
|
raise(MeCabError.new(Natto::Binding.mecab_strerror(@ptr)))
|
@@ -99,6 +101,7 @@ module Natto
|
|
99
101
|
# after the object owning <tt>ptr</tt> has been destroyed.
|
100
102
|
#
|
101
103
|
# @param [FFI::MemoryPointer] ptr
|
104
|
+
# @return [Proc] to release <tt>mecab</tt> resources properly
|
102
105
|
def self.create_free_proc(ptr)
|
103
106
|
Proc.new do
|
104
107
|
Natto::Binding.mecab_destroy(ptr)
|
@@ -109,12 +112,17 @@ module Natto
|
|
109
112
|
# be passed in the construction of <tt>mecab</tt>.
|
110
113
|
#
|
111
114
|
# @param [Hash] options
|
115
|
+
# @return string-representation of the options to the <tt>mecab</tt> parser
|
112
116
|
def self.build_options_str(options={})
|
113
117
|
opt = []
|
114
118
|
SUPPORTED_OPTS.each do |k|
|
115
119
|
if options.has_key? k
|
116
120
|
key = k.to_s.gsub('_', '-')
|
117
|
-
|
121
|
+
if %w( all-morphs partial ).include? key
|
122
|
+
opt << "--#{key}" if options[k]==true
|
123
|
+
else
|
124
|
+
opt << "--#{key}=#{options[k]}"
|
125
|
+
end
|
118
126
|
|
119
127
|
#if key.end_with? '_format_' or key.end_with? '_feature'
|
120
128
|
# opt << "--#{key}="+options[k]
|
@@ -133,24 +141,26 @@ module Natto
|
|
133
141
|
|
134
142
|
# <tt>DictionaryInfo</tt> is a wrapper for a <tt>MeCab</tt>
|
135
143
|
# instance's related dictionary information.
|
136
|
-
#
|
144
|
+
#
|
137
145
|
# Values may be obtained by using the following symbols
|
138
146
|
# as keys to the hash of <tt>mecab</tt> dictionary information.
|
139
|
-
#
|
140
|
-
#
|
141
|
-
#
|
142
|
-
#
|
143
|
-
#
|
144
|
-
#
|
145
|
-
#
|
146
|
-
#
|
147
|
-
#
|
148
|
-
#
|
149
|
-
#
|
150
|
-
#
|
151
|
-
#
|
152
|
-
#
|
153
|
-
#
|
147
|
+
#
|
148
|
+
# - :filename
|
149
|
+
# - :charset
|
150
|
+
# - :size
|
151
|
+
# - :type
|
152
|
+
# - :lsize
|
153
|
+
# - :rsize
|
154
|
+
# - :version
|
155
|
+
# - :next
|
156
|
+
#
|
157
|
+
# # Usage:
|
158
|
+
#
|
159
|
+
# dict = Natto::DictionaryInfo.new(mecab_ptr)
|
160
|
+
# puts dict[:filename]
|
161
|
+
# => /usr/local/lib/mecab/dic/ipadic/sys.dic
|
162
|
+
# puts dict[:charset]
|
163
|
+
# => utf8
|
154
164
|
class DictionaryInfo < FFI::Struct
|
155
165
|
layout :filename, :string,
|
156
166
|
:charset, :string,
|
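The `build_options_str` change in this file can be sketched as a standalone function. The list of supported options and the special handling of `all-morphs` and `partial` are taken from the diff. The final `join(' ')` is an assumption inferred from the expected strings in `test/test_natto.rb`, since the end of the method is not shown here. `FLAG_OPTS` is a hypothetical name introduced for illustration.

```ruby
# Standalone sketch of the option-string logic visible in the diff.
# It does not load natto or mecab.
SUPPORTED_OPTS = [ :rcfile, :dicdir, :userdic, :lattice_level, :all_morphs,
                   :output_format_type, :partial, :node_format, :unk_format,
                   :bos_format, :eos_format, :eon_format, :unk_feature,
                   :nbest, :theta, :cost_factor ].freeze

# Hypothetical constant: options emitted as bare flags with no "=value" part.
FLAG_OPTS = %w( all-morphs partial ).freeze

def build_options_str(options = {})
  opt = []
  SUPPORTED_OPTS.each do |k|
    next unless options.key?(k)
    key = k.to_s.gsub('_', '-')            # :all_morphs becomes "all-morphs"
    if FLAG_OPTS.include?(key)
      opt << "--#{key}" if options[k] == true
    else
      opt << "--#{key}=#{options[k]}"
    end
  end
  opt.join(' ')                            # assumed separator, see lead-in
end

puts build_options_str(:partial => true, :dicdir => '/some/other/file', :nbest => 2)
# prints: --dicdir=/some/other/file --partial --nbest=2
```

Because the loop walks `SUPPORTED_OPTS` rather than the hash, the output order is fixed by that array regardless of the order in which the caller lists the keys, which is what the combined-options test in this release relies on.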
data/lib/natto/binding.rb
CHANGED
@@ -1,7 +1,49 @@
|
|
1
|
+
# natto combines the Ruby programming language with MeCab,
|
2
|
+
# the part-of-speech and morphological analyzer for the
|
3
|
+
# Japanese language.
|
4
|
+
#
|
5
|
+
# ## Requirements
|
6
|
+
# natto requires the following:
|
7
|
+
#
|
8
|
+
# - [MeCab _0.98_](http://sourceforge.net/projects/mecab/files/mecab/0.98/)
|
9
|
+
# - [ffi _0.6.3 or greater_](http://rubygems.org/gems/ffi)
|
10
|
+
# - Ruby _1.8.7 or greater_
|
11
|
+
#
|
12
|
+
# ## Installation
|
13
|
+
# Install natto with the following gem command:
|
14
|
+
# gem install natto
|
15
|
+
#
|
16
|
+
# ## Configuration
|
17
|
+
# - natto will try to locate the <tt>mecab</tt> library based upon its runtime environment.
|
18
|
+
# - In case of <tt>LoadError</tt>, please set the <tt>MECAB_PATH</tt> environment variable to the exact name/path to your <tt>mecab</tt> library.
|
19
|
+
#
|
20
|
+
# e.g., for bash on UNIX/Linux
|
21
|
+
# export MECAB_PATH=mecab.so
|
22
|
+
# e.g., on Windows
|
23
|
+
# set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
|
24
|
+
# e.g., for Cygwin
|
25
|
+
# export MECAB_PATH=cygmecab-1
|
26
|
+
#
|
27
|
+
# ## Usage
|
28
|
+
# require 'natto'
|
29
|
+
#
|
30
|
+
# m = Natto::MeCab.new
|
31
|
+
# puts m.parse("すもももももももものうち")
|
32
|
+
# すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
|
33
|
+
# も 助詞,係助詞,*,*,*,*,も,モ,モ
|
34
|
+
# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
35
|
+
# も 助詞,係助詞,*,*,*,*,も,モ,モ
|
36
|
+
# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
37
|
+
# の 助詞,連体化,*,*,*,*,の,ノ,ノ
|
38
|
+
# うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
|
39
|
+
# EOS
|
40
|
+
# => nil
|
41
|
+
#
|
1
42
|
module Natto
|
2
43
|
|
3
|
-
# Module <tt>Binding</tt> encapsulates
|
4
|
-
# made available via <tt>FFI</tt> bindings to
|
44
|
+
# Module <tt>Binding</tt> encapsulates methods and behavior
|
45
|
+
# which are made available via <tt>FFI</tt> bindings to
|
46
|
+
# <tt>mecab</tt>.
|
5
47
|
module Binding
|
6
48
|
require 'ffi'
|
7
49
|
require 'rbconfig'
|
@@ -24,6 +66,15 @@ module Natto
|
|
24
66
|
# <tt>LoadError</tt> will be raised if <tt>MECAB_PATH</tt>
|
25
67
|
# is <b>not</b> set to the full path of the <tt>mecab</tt>
|
26
68
|
# library.
|
69
|
+
# @return name of the <tt>mecab</tt> library
|
70
|
+
# @raise [LoadError] if MECAB_PATH environment variable is not set in Windows
|
71
|
+
# <br/>
|
72
|
+
# e.g., for bash on UNIX/Linux
|
73
|
+
# export MECAB_PATH=mecab.so
|
74
|
+
# e.g., on Windows
|
75
|
+
# set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
|
76
|
+
# e.g., for Cygwin
|
77
|
+
# export MECAB_PATH=cygmecab-1
|
27
78
|
def self.find_library
|
28
79
|
host_os = RbConfig::CONFIG['host_os']
|
29
80
|
|
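The documentation added to `find_library` describes the lookup only in prose, and the method body is not part of this diff. The following is a rough sketch of how such a lookup could work, not natto's actual implementation. The function name `resolve_mecab_library`, the default name `mecab` for UNIX-like systems, and the host-OS patterns are assumptions; `cygmecab-1` is taken from the Cygwin example above.

```ruby
require 'rbconfig'

# Hypothetical sketch: decide which library name or path to hand to FFI.
# env and host_os are parameters so the logic can be exercised without
# touching the real environment.
def resolve_mecab_library(env = ENV, host_os = RbConfig::CONFIG['host_os'])
  explicit = env['MECAB_PATH']
  return explicit unless explicit.nil? || explicit.empty?

  case host_os
  when /mswin|mingw/i
    # Per the docs above, Windows requires MECAB_PATH to be set.
    raise LoadError, 'Please set MECAB_PATH to the full path of libmecab.dll'
  when /cygwin/i
    'cygmecab-1'   # name used in the Cygwin example
  else
    'mecab'        # assumed default for UNIX/Linux
  end
end

puts resolve_mecab_library({ 'MECAB_PATH' => '/opt/lib/libmecab.so' }, 'linux-gnu')
puts resolve_mecab_library({}, 'cygwin')
```

Passing the environment and host OS as arguments is a choice made for this sketch only; it keeps the example runnable on any platform.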
data/lib/natto/version.rb
CHANGED
data/test/test_natto.rb
CHANGED
@@ -34,13 +34,17 @@ class TestNatto < Test::Unit::TestCase
|
|
34
34
|
res = Natto::MeCab.build_options_str(:userdic=>"/yet/another/file")
|
35
35
|
assert_equal('--userdic=/yet/another/file', res)
|
36
36
|
|
37
|
-
res = Natto::MeCab.build_options_str(:
|
38
|
-
assert_equal('--
|
37
|
+
res = Natto::MeCab.build_options_str(:lattice_level=>42)
|
38
|
+
assert_equal('--lattice-level=42', res)
|
39
39
|
|
40
|
-
res = Natto::MeCab.build_options_str(:
|
41
|
-
|
42
|
-
|
43
|
-
|
40
|
+
res = Natto::MeCab.build_options_str(:all_morphs=>true)
|
41
|
+
assert_equal('--all-morphs', res)
|
42
|
+
|
43
|
+
res = Natto::MeCab.build_options_str(:output_format_type=>"natto")
|
44
|
+
assert_equal('--output-format-type=natto', res)
|
45
|
+
|
46
|
+
res = Natto::MeCab.build_options_str(:partial=>true)
|
47
|
+
assert_equal('--partial', res)
|
44
48
|
|
45
49
|
res = Natto::MeCab.build_options_str(:node_format=>'%m\t%f[7]\n')
|
46
50
|
assert_equal('--node-format=%m\t%f[7]\n', res)
|
@@ -60,9 +64,6 @@ class TestNatto < Test::Unit::TestCase
|
|
60
64
|
res = Natto::MeCab.build_options_str(:unk_feature=>'%m\t%f[7]\n')
|
61
65
|
assert_equal('--unk-feature=%m\t%f[7]\n', res)
|
62
66
|
|
63
|
-
res = Natto::MeCab.build_options_str(:lattice_level=>42)
|
64
|
-
assert_equal('--lattice-level=42', res)
|
65
|
-
|
66
67
|
res = Natto::MeCab.build_options_str(:nbest=>42)
|
67
68
|
assert_equal('--nbest=42', res)
|
68
69
|
|
@@ -71,12 +72,51 @@ class TestNatto < Test::Unit::TestCase
|
|
71
72
|
|
72
73
|
res = Natto::MeCab.build_options_str(:cost_factor=>42)
|
73
74
|
assert_equal('--cost-factor=42', res)
|
75
|
+
|
76
|
+
res = Natto::MeCab.build_options_str(:output_format_type=>"natto",
|
77
|
+
:userdic=>"/some/file",
|
78
|
+
:dicdir=>"/some/other/file",
|
79
|
+
:partial=>true,
|
80
|
+
:all_morphs=>true)
|
81
|
+
assert_equal('--dicdir=/some/other/file --userdic=/some/file --all-morphs --output-format-type=natto --partial', res)
|
82
|
+
|
74
83
|
end
|
75
84
|
|
76
|
-
def
|
77
|
-
|
78
|
-
|
85
|
+
def test_construction
|
86
|
+
m = nil
|
87
|
+
assert_nothing_raised do
|
88
|
+
m = Natto::MeCab.new
|
89
|
+
end
|
90
|
+
assert_equal({}, m.options)
|
91
|
+
|
92
|
+
opts = {:output_format_type=>'chasen'}
|
93
|
+
assert_nothing_raised do
|
94
|
+
m = Natto::MeCab.new(opts)
|
79
95
|
end
|
96
|
+
assert_equal(opts, m.options)
|
97
|
+
|
98
|
+
opts = {:all_morphs=>true, :partial=>true}
|
99
|
+
assert_nothing_raised do
|
100
|
+
m = Natto::MeCab.new(opts)
|
101
|
+
end
|
102
|
+
assert_equal(opts, m.options)
|
80
103
|
end
|
81
104
|
|
105
|
+
def test_initialize_with_errors
|
106
|
+
assert_raise Natto::MeCabError do
|
107
|
+
Natto::MeCab.new(:output_format_type=>'not_defined_anywhere')
|
108
|
+
end
|
109
|
+
|
110
|
+
assert_raise Natto::MeCabError do
|
111
|
+
Natto::MeCab.new(:rcfile=>'/rcfile/does/not/exist')
|
112
|
+
end
|
113
|
+
|
114
|
+
assert_raise Natto::MeCabError do
|
115
|
+
Natto::MeCab.new(:dicdir=>'/dicdir/does/not/exist')
|
116
|
+
end
|
117
|
+
|
118
|
+
assert_raise Natto::MeCabError do
|
119
|
+
Natto::MeCab.new(:userdic=>'/userdic/does/not/exist')
|
120
|
+
end
|
121
|
+
end
|
82
122
|
end
|
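The format-option tests above pass strings such as `'%m\t%f[7]\n'` in single quotes, matching the note in `lib/natto.rb` about preserving escape characters. The short sketch below shows the difference in plain Ruby. The helper `escapes_preserved?` is hypothetical and exists only for this illustration. The reading that mecab needs to receive the literal backslash sequences is an assumption based on that note.

```ruby
# Hypothetical helper: true when the string still contains the literal
# two-character sequences \t and \n (backslash followed by a letter).
def escapes_preserved?(format)
  format.include?('\t') && format.include?('\n')
end

# Single quotes: Ruby leaves \t and \n untouched.
puts escapes_preserved?('%m\t%f[7]\n')   # prints: true

# Double quotes: Ruby replaces them with a real tab and newline
# before the string is used.
puts escapes_preserved?("%m\t%f[7]\n")   # prints: false
```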
metadata
CHANGED
@@ -1,13 +1,13 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: natto
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
hash:
|
4
|
+
hash: 17
|
5
5
|
prerelease: false
|
6
6
|
segments:
|
7
7
|
- 0
|
8
8
|
- 0
|
9
|
-
-
|
10
|
-
version: 0.0.
|
9
|
+
- 7
|
10
|
+
version: 0.0.7
|
11
11
|
platform: ruby
|
12
12
|
authors:
|
13
13
|
- Brooke M. Fujita
|
@@ -15,7 +15,7 @@ autorequire:
|
|
15
15
|
bindir: bin
|
16
16
|
cert_chain: []
|
17
17
|
|
18
|
-
date: 2010-12-
|
18
|
+
date: 2010-12-30 00:00:00 +09:00
|
19
19
|
default_executable:
|
20
20
|
dependencies:
|
21
21
|
- !ruby/object:Gem::Dependency
|
@@ -42,14 +42,14 @@ extensions: []
|
|
42
42
|
|
43
43
|
extra_rdoc_files:
|
44
44
|
- LICENSE
|
45
|
-
- README
|
45
|
+
- README.md
|
46
46
|
files:
|
47
47
|
- lib/natto.rb
|
48
48
|
- lib/natto/binding.rb
|
49
49
|
- lib/natto/version.rb
|
50
50
|
- test/test_natto.rb
|
51
51
|
- LICENSE
|
52
|
-
- README
|
52
|
+
- README.md
|
53
53
|
has_rdoc: yard
|
54
54
|
homepage: http://code.google.com/p/natto/
|
55
55
|
licenses:
|
data/README
DELETED
@@ -1,77 +0,0 @@
|
|
1
|
-
= natto: A Tasty Ruby Binding with MeCab
|
2
|
-
|
3
|
-
== What is natto?
|
4
|
-
|
5
|
-
natto provides a Ruby binding with MeCab,
|
6
|
-
the part-of-speech and morphological analyzer
|
7
|
-
for the Japanese language.
|
8
|
-
|
9
|
-
== Try It! Try It!
|
10
|
-
|
11
|
-
=== Requirements
|
12
|
-
natto requires the following:
|
13
|
-
* {http://sourceforge.net/projects/mecab/files/mecab/ MeCab 0.98}
|
14
|
-
* {http://rubygems.org/gems/ffi ffi 0.63 or greater}
|
15
|
-
* Ruby 1.8.7 or greater
|
16
|
-
|
17
|
-
=== Installation
|
18
|
-
Install natto with the following gem command:
|
19
|
-
* <code>gem install natto</code>
|
20
|
-
|
21
|
-
=== Configuration
|
22
|
-
* natto will try to locate the <tt>mecab</tt> library based upon its runtime environment.
|
23
|
-
* In case of <tt>LoadError</tt>, please set the <tt>MECAB_PATH</tt> environment variable to the exact name/path to your <tt>mecab</tt> library.
|
24
|
-
|
25
|
-
== Usage
|
26
|
-
require 'natto'
|
27
|
-
|
28
|
-
m = Natto::MeCab.new
|
29
|
-
puts m.parse("すもももももももものうち")
|
30
|
-
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
|
31
|
-
も 助詞,係助詞,*,*,*,*,も,モ,モ
|
32
|
-
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
33
|
-
も 助詞,係助詞,*,*,*,*,も,モ,モ
|
34
|
-
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
|
35
|
-
の 助詞,連体化,*,*,*,*,の,ノ,ノ
|
36
|
-
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
|
37
|
-
EOS
|
38
|
-
=> nil
|
39
|
-
|
40
|
-
== Contributing to natto
|
41
|
-
|
42
|
-
* Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet
|
43
|
-
* Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it
|
44
|
-
* Fork the project
|
45
|
-
* Start a feature/bugfix branch
|
46
|
-
* Commit and push until you are happy with your contribution
|
47
|
-
* Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
|
48
|
-
* Please try not to mess with the Rakefile, version, or history. If you must have your own version, that is fine, but please isolate to its own commit so I can cherry-pick around it.
|
49
|
-
|
50
|
-
== Changelog
|
51
|
-
|
52
|
-
- **2010/12/28**: 0.0.6 release.
|
53
|
-
- Correction to natto.gemspec to include lib/natto/binding.rb.
|
54
|
-
|
55
|
-
- **2010/12/28**: 0.0.5 release. (yanked)
|
56
|
-
- On-going refactoring
|
57
|
-
- Project structure refactored for greater maintainability
|
58
|
-
|
59
|
-
- **2010/12/26**: 0.0.4 release.
|
60
|
-
- On-going refactoring
|
61
|
-
|
62
|
-
- **2010/12/23**: 0.0.3 release.
|
63
|
-
- On-going refactoring
|
64
|
-
- Adding documentation via yard
|
65
|
-
|
66
|
-
- **2010/12/20**: 0.0.2 release.
|
67
|
-
- Continuing development on proper resource deallocation
|
68
|
-
- Adding options hash in object initializer
|
69
|
-
|
70
|
-
- **2010/12/13**: Released version 0.0.1. The objective is to provide
|
71
|
-
an easy-to-use, production-level Ruby binding to MeCab.
|
72
|
-
- Initial release
|
73
|
-
|
74
|
-
|
75
|
-
== Copyright
|
76
|
-
|
77
|
-
natto (c) 2010-2013 by Brooke M. Fujita, licensed under the new BSD license. Please see the {file:LICENSE} document for further details.
|