natto 0.0.6 → 0.0.7
- data/LICENSE +1 -1
- data/README.md +84 -0
- data/lib/natto.rb +93 -83
- data/lib/natto/binding.rb +53 -2
- data/lib/natto/version.rb +1 -1
- data/test/test_natto.rb +52 -12
- metadata +6 -6
- data/README +0 -77
data/LICENSE
CHANGED
data/README.md
ADDED
@@ -0,0 +1,84 @@
+# natto
+A Tasty Ruby Binding with MeCab
+
+## What is natto?
+
+natto combines the [Ruby programming language](http://www.ruby-lang.org/) with [MeCab](http://mecab.sourceforge.net/), the part-of-speech and morphological analyzer for the Japanese language.
+
+## Requirements
+natto requires the following:
+
+- [MeCab _0.98_](http://sourceforge.net/projects/mecab/files/mecab/0.98/)
+- [ffi _0.6.3 or greater_](http://rubygems.org/gems/ffi)
+- Ruby _1.8.7 or greater_
+
+## Installation
+Install natto with the following gem command:
+gem install natto
+
+## Configuration
+- natto will try to locate the <tt>mecab</tt> library based upon its runtime environment.
+- In case of <tt>LoadError</tt>, please set the <tt>MECAB_PATH</tt> environment variable to the exact name/path to your <tt>mecab</tt> library.
+
+e.g., for bash on UNIX/Linux
+export MECAB_PATH=mecab.so
+e.g., on Windows
+set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
+e.g., for Cygwin
+export MECAB_PATH=cygmecab-1
+
+## Usage
+require 'natto'
+
+m = Natto::MeCab.new
+puts m.parse("すもももももももものうち")
+すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
+も 助詞,係助詞,*,*,*,*,も,モ,モ
+もも 名詞,一般,*,*,*,*,もも,モモ,モモ
+も 助詞,係助詞,*,*,*,*,も,モ,モ
+もも 名詞,一般,*,*,*,*,もも,モモ,モモ
+の 助詞,連体化,*,*,*,*,の,ノ,ノ
+うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
+EOS
+=> nil
+
+## Contributing to natto
+- Use [Mercurial](http://mercurial.selenic.com/) and [check out the latest master](http://code.google.com/p/natto/source/checkout) to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
+- [Check out the issue tracker](http://code.google.com/p/natto/issues/list) to make sure someone already hasn't requested it and/or contributed it.
+- Fork the project.
+- Start a feature/bugfix branch.
+- Commit and push until you are happy with your contribution.
+- Make sure to add tests for it. This is important so I don't break it in a future version unintentionally. I use [Test::Unit](http://ruby-doc.org/stdlib/libdoc/test/unit/rdoc/classes/Test/Unit.html) since it is simple and it works.
+- Please try not to mess with the Rakefile, version, or history. If you must have your own version, that is fine, but please isolate to its own commit so I can cherry-pick around it.
+
+## Changelog
+
+- __2010/12/30: 0.0.7 release.
+- Adding support for all-morphs and partial options
+- Further updating of documentation with markdown
+
+- __2010/12/28__: 0.0.6 release.
+- Correction to natto.gemspec to include lib/natto/binding.rb
+
+- __2010/12/28__: 0.0.5 release. (yanked)
+- On-going refactoring
+- Project structure refactored for greater maintainability
+
+- __2010/12/26__: 0.0.4 release.
+- On-going refactoring
+
+- __2010/12/23__: 0.0.3 release.
+- On-going refactoring
+- Adding documentation via yard
+
+- __2010/12/20__: 0.0.2 release.
+- Continuing development on proper resource deallocation
+- Adding options hash in object initializer
+
+- __2010/12/13__: Released version 0.0.1. The objective is to provide
+an easy-to-use, production-level Ruby binding to MeCab.
+- Initial release
+
+## Copyright
+
+natto © 2010-2013 by Brooke M. Fujita, licensed under the new BSD license. Please see the [LICENSE](file.LICENSE.html) document for further details.
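The Usage output in the README.md hunk lists one morpheme per line: a surface form, then comma-separated features, terminated by an `EOS` line. A minimal sketch of turning such output into records might look like the following. It assumes the surface form and the feature list are separated by whitespace (a tab in MeCab's default output), and `parse_mecab_lines` is a hypothetical helper that is not part of natto.

```ruby
# Hypothetical helper, not part of natto: split MeCab-style text output
# into [surface, features] pairs, stopping at the EOS marker.
def parse_mecab_lines(output)
  records = []
  output.each_line do |line|
    line = line.chomp
    break if line == 'EOS'
    next if line.empty?
    # Assumption: surface and features are separated by whitespace (tab).
    surface, feature_str = line.split(/\s+/, 2)
    records << [surface, feature_str.to_s.split(',')]
  end
  records
end

# Sample text copied from the Usage output, with an explicit tab separator.
sample = "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\n" \
         "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n" \
         "EOS\n"

parse_mecab_lines(sample).each do |surface, features|
  puts "#{surface}: #{features[0]} (reading #{features[7]})"
end
```

The feature positions used here (part of speech first, reading at index 7) are read off the sample output only; other dictionaries may order features differently.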
data/lib/natto.rb
CHANGED
@@ -1,85 +1,85 @@
-# -*- encoding: utf-8 -*-
 require 'rubygems' if RUBY_VERSION.to_f < 1.9
 require 'natto/binding'
 
-# natto combines the Ruby programming language with MeCab,
-# the part-of-speech and morphological analyzer for the
-# Japanese language.
-#
-# === Requirements
-# natto requires the following:
-# * {http://sourceforge.net/projects/mecab/files/mecab/ MeCab 0.98}
-# * {http://rubygems.org/gems/ffi ffi 0.63 or greater}
-# * Ruby 1.8.7 or greater
-#
-# === Installation
-# Install natto with the following gem command:
-# * <code>gem install natto</code>
-#
-# === Configuration
-# * natto will try to locate the <tt>mecab</tt> library based upon its runtime environment.
-# * In case of <tt>LoadError</tt>, please set the <tt>MECAB_PATH</tt> environment variable to the exact name/path to your <tt>mecab</tt> library.
-#
-#== Usage
-# require 'natto'
-#
-# m = Natto::MeCab.new
-# puts m.parse("すもももももももものうち")
-# すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
-# も 助詞,係助詞,*,*,*,*,も,モ,モ
-# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
-# も 助詞,係助詞,*,*,*,*,も,モ,モ
-# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
-# の 助詞,連体化,*,*,*,*,の,ノ,ノ
-# うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
-# EOS
-# => nil
-#
-# @author Brooke M. Fujita (buruzaemon)
 module Natto
   require 'ffi'
 
-  # <tt>MeCab</tt> is a wrapper class
-  # Options to the <tt>mecab</tt> parser are passed in as a hash
-  #
+  # <tt>MeCab</tt> is a wrapper class for the <tt>mecab</tt> parser.
+  # Options to the <tt>mecab</tt> parser are passed in as a hash at
+  # initialization.
+  #
+  # ## Usage
+  # require 'natto'
+  #
+  # m = Natto::MeCab.new
+  # puts m.parse("すもももももももものうち")
+  # すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
+  # も 助詞,係助詞,*,*,*,*,も,モ,モ
+  # もも 名詞,一般,*,*,*,*,もも,モモ,モモ
+  # も 助詞,係助詞,*,*,*,*,も,モ,モ
+  # もも 名詞,一般,*,*,*,*,もも,モモ,モモ
+  # の 助詞,連体化,*,*,*,*,の,ノ,ノ
+  # うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
+  # EOS
+  # => nil
   #
-  # @see
+  # @see SUPPORTED_OPTS
   class MeCab
+
+    attr_reader :options
+
     # Supported options to the <tt>mecab</tt> parser.
     # See the <tt>mecab</tt> help for more details.
-    SUPPORTED_OPTS = [
-
-
-
+    SUPPORTED_OPTS = [ :rcfile, :dicdir, :userdic, :lattice_level, :all_morphs,
+                       :output_format_type, :partial, :node_format, :unk_format,
+                       :bos_format, :eos_format, :eon_format, :unk_feature,
+                       :nbest, :theta, :cost_factor ].freeze
+                       # :allocate_sentence ]
+
+    #OPTION_DEFAULTS = { :lattice_level=>0, :all_morphs=>false, :nbest=>1,
+    # :theta=>0.75, :cost_factor=>700 }.freeze
 
-    #
+    # Initializes the wrapped <tt>mecab</tt> instance with the
     # given <tt>options</tt> hash.
-    #
+    #
     # Options supported are:
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
+    #
+    # - :rcfile -- resource file
+    # - :dicdir -- system dicdir
+    # - :userdic -- user dictionary
+    # - :lattice_level -- lattice information level (integer, default 0)
+    # - :all_morphs -- output all morphs (default false)
+    # - :output_format_type -- output format type (wakati, chasen, yomi, etc.)
+    # - :partial -- partial parsing mode
+    # - :node_format -- user-defined node format
+    # - :unk_format -- user-defined unknown node format
+    # - :bos_format -- user-defined beginning-of-sentence format
+    # - :eos_format -- user-defined end-of-sentence format
+    # - :eon_format -- user-defined end-of-NBest format
+    # - :unk_feature -- feature for unknown word
+    # - :nbest -- output N best results (integer, default 1)
+    # - :theta -- temperature parameter theta (float, default 0.75)
+    # - :cost_factor -- cost factor (integer, default 700)
+    #
+    # _Use single-quotes to preserve format options that contain escape chars._
+    #
     # e.g.
-    #
+    # m = Natto::MeCab.new(:node_format=>'%m\t%f[7]\n')
+    # puts m.parse("日本語は難しいです。")
+    # 日本語 ニホンゴ
+    # は ハ
+    # 難しい ムズカシイ
+    # です デス
+    # 。 。
+    # EOS
+    # => nil
     #
     # @param [Hash]
-    # @
+    # @raise [MeCabError] if <tt>mecab</tt> cannot be initialized with the given <tt>options</tt>
+    # @see MeCab::SUPPORTED_OPTS
     def initialize(options={})
-
+      @options = options
+      opt_str = self.class.build_options_str(@options)
       @ptr = Natto::Binding.mecab_new2(opt_str)
       raise MeCabError.new("Could not initialize MeCab with options: '#{opt_str}'") if @ptr.address == 0
       #@dict = Natto::DictionaryInfo.new(Natto::Binding.mecab_dictionary_info(@ptr))
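The new doc comment in this hunk advises single quotes for format options that contain escape characters. A short sketch of why, presumably so that the backslash sequences reach mecab untouched and are interpreted there rather than by Ruby; the variable names are illustrative only:

```ruby
# Single quotes keep the backslash sequences literal, so the option string
# still contains the two-character sequences \t and \n.
single = '%m\t%f[7]\n'

# Double quotes let Ruby convert \t and \n into a real tab and newline
# before the string is ever handed to the option builder.
double = "%m\t%f[7]\n"

puts single.length            # 11
puts double.length            # 9
puts single.include?('\t')    # true  (literal backslash + t is present)
puts double.include?('\t')    # false (only a real tab character remains)
```

This matches the test expectations elsewhere in the diff, where `:node_format=>'%m\t%f[7]\n'` is expected to produce `--node-format=%m\t%f[7]\n` verbatim.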
@@ -88,7 +88,9 @@ module Natto
 
     # Parses the given string <tt>s</tt>.
     #
-    # @param [String]
+    # @param [String] s
+    # @return parsing result from <tt>mecab</tt>
+    # @raise [MeCabError] if the <tt>mecab</tt> parser cannot parse the given string <tt>s</tt>
     def parse(s)
       Natto::Binding.mecab_sparse_tostr(@ptr, s) ||
         raise(MeCabError.new(Natto::Binding.mecab_strerror(@ptr)))
@@ -99,6 +101,7 @@ module Natto
     # after the object owning <tt>ptr</tt> has been destroyed.
     #
     # @param [FFI::MemoryPointer] ptr
+    # @return [Proc] to release <tt>mecab</tt> resources properly
     def self.create_free_proc(ptr)
       Proc.new do
         Natto::Binding.mecab_destroy(ptr)
@@ -109,12 +112,17 @@ module Natto
     # be passed in the construction of <tt>mecab</tt>.
     #
     # @param [Hash] options
+    # @return string-representation of the options to the <tt>mecab</tt> parser
     def self.build_options_str(options={})
       opt = []
       SUPPORTED_OPTS.each do |k|
         if options.has_key? k
           key = k.to_s.gsub('_', '-')
-
+          if %w( all-morphs partial ).include? key
+            opt << "--#{key}" if options[k]==true
+          else
+            opt << "--#{key}=#{options[k]}"
+          end
 
           #if key.end_with? '_format_' or key.end_with? '_feature'
           # opt << "--#{key}="+options[k]
@@ -133,24 +141,26 @@ module Natto
 
   # <tt>DictionaryInfo</tt> is a wrapper for a <tt>MeCab</tt>
   # instance's related dictionary information.
-  #
+  #
   # Values may be obtained by using the following symbols
   # as keys to the hash of <tt>mecab</tt> dictionary information.
-  #
-  #
-  #
-  #
-  #
-  #
-  #
-  #
-  #
-  #
-  #
-  #
-  #
-  #
-  #
+  #
+  # - :filename
+  # - :charset
+  # - :size
+  # - :type
+  # - :lsize
+  # - :rsize
+  # - :version
+  # - :next
+  #
+  # # Usage:
+  #
+  # dict = Natto::DictionaryInfo.new(mecab_ptr)
+  # puts dict[:filename]
+  # => /usr/local/lib/mecab/dic/ipadic/sys.dic
+  # puts dict[:charset]
+  # => utf8
   class DictionaryInfo < FFI::Struct
     layout :filename, :string,
            :charset, :string,
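The `build_options_str` hunk above adds special handling for `all-morphs` and `partial`, which are emitted as bare flags only when set to `true`, while every other option becomes `--key=value`. The following standalone sketch mirrors that logic outside of natto so it can be run without mecab installed. The `BOOLEAN_FLAGS` constant name is hypothetical, and joining with a single space is an assumption inferred from the expected string in the updated test, since the end of the method is not shown in this diff.

```ruby
# Option names copied from the SUPPORTED_OPTS constant in the diff.
SUPPORTED_OPTS = [ :rcfile, :dicdir, :userdic, :lattice_level, :all_morphs,
                   :output_format_type, :partial, :node_format, :unk_format,
                   :bos_format, :eos_format, :eon_format, :unk_feature,
                   :nbest, :theta, :cost_factor ].freeze

# Hypothetical constant name; the diff inlines this list in the method.
BOOLEAN_FLAGS = %w( all-morphs partial ).freeze

def build_options_str(options = {})
  opt = []
  # Iterating SUPPORTED_OPTS (not the hash) fixes the output order and
  # silently drops unsupported keys.
  SUPPORTED_OPTS.each do |k|
    next unless options.has_key?(k)
    key = k.to_s.gsub('_', '-')
    if BOOLEAN_FLAGS.include?(key)
      opt << "--#{key}" if options[k] == true
    else
      opt << "--#{key}=#{options[k]}"
    end
  end
  opt.join(' ') # assumption: space-separated, per the test expectation
end

puts build_options_str(:partial => true, :dicdir => '/some/other/file', :all_morphs => true)
# prints: --dicdir=/some/other/file --all-morphs --partial
```

Because the order comes from `SUPPORTED_OPTS`, the order of keys in the caller's hash does not affect the resulting string, which is what the combined-options test in this release relies on.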
data/lib/natto/binding.rb
CHANGED
@@ -1,7 +1,49 @@
+# natto combines the Ruby programming language with MeCab,
+# the part-of-speech and morphological analyzer for the
+# Japanese language.
+#
+# ## Requirements
+# natto requires the following:
+#
+# - [MeCab _0.98_](http://sourceforge.net/projects/mecab/files/mecab/0.98/)
+# - [ffi _0.6.3 or greater_](http://rubygems.org/gems/ffi)
+# - Ruby _1.8.7 or greater_
+#
+# ## Installation
+# Install natto with the following gem command:
+# gem install natto
+#
+# ## Configuration
+# - natto will try to locate the <tt>mecab</tt> library based upon its runtime environment.
+# - In case of <tt>LoadError</tt>, please set the <tt>MECAB_PATH</tt> environment variable to the exact name/path to your <tt>mecab</tt> library.
+#
+# e.g., for bash on UNIX/Linux
+# export MECAB_PATH=mecab.so
+# e.g., on Windows
+# set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
+# e.g., for Cygwin
+# export MECAB_PATH=cygmecab-1
+#
+# ## Usage
+# require 'natto'
+#
+# m = Natto::MeCab.new
+# puts m.parse("すもももももももものうち")
+# すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
+# も 助詞,係助詞,*,*,*,*,も,モ,モ
+# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
+# も 助詞,係助詞,*,*,*,*,も,モ,モ
+# もも 名詞,一般,*,*,*,*,もも,モモ,モモ
+# の 助詞,連体化,*,*,*,*,の,ノ,ノ
+# うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
+# EOS
+# => nil
+#
 module Natto
 
-  # Module <tt>Binding</tt> encapsulates
-  # made available via <tt>FFI</tt> bindings to
+  # Module <tt>Binding</tt> encapsulates methods and behavior
+  # which are made available via <tt>FFI</tt> bindings to
+  # <tt>mecab</tt>.
   module Binding
     require 'ffi'
     require 'rbconfig'
@@ -24,6 +66,15 @@ module Natto
     # <tt>LoadError</tt> will be raised if <tt>MECAB_PATH</tt>
     # is <b>not</b> set to the full path of the <tt>mecab</tt>
     # library.
+    # @return name of the <tt>mecab</tt> library
+    # @raise [LoadError] if MECAB_PATH environment variable is not set in Windows
+    # <br/>
+    # e.g., for bash on UNIX/Linux
+    # export MECAB_PATH=mecab.so
+    # e.g., on Windows
+    # set MECAB_PATH=C:\Program Files\MeCab\bin\libmecab.dll
+    # e.g., for Cygwin
+    # export MECAB_PATH=cygmecab-1
     def self.find_library
       host_os = RbConfig::CONFIG['host_os']
 
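The body of `find_library` is cut off in this diff, so the following is only a sketch of the lookup the doc comments describe: prefer `MECAB_PATH` when it is set, raise `LoadError` on Windows when it is not, and otherwise fall back to a platform default. The method name `resolve_mecab_library`, the `host_os` patterns, and the non-Windows default names are assumptions for illustration, not natto's actual implementation.

```ruby
require 'rbconfig'

# Hypothetical stand-in for the lookup described in the doc comments above.
# env and host_os are parameters so the logic can be exercised without
# touching the real environment.
def resolve_mecab_library(env = ENV, host_os = RbConfig::CONFIG['host_os'])
  explicit = env['MECAB_PATH']
  return explicit if explicit && !explicit.empty?

  case host_os
  when /mswin|mingw/i
    # Per the @raise tag: Windows requires MECAB_PATH to be set.
    raise LoadError, 'Please set MECAB_PATH to the full path of the mecab library'
  when /cygwin/i
    'cygmecab-1' # default name borrowed from the Cygwin example
  else
    'mecab'      # assumption: a bare name left for the loader to resolve
  end
end

puts resolve_mecab_library({ 'MECAB_PATH' => 'mecab.so' }, 'linux-gnu')
# prints: mecab.so
```

Passing the environment and host OS as arguments is a design choice of the sketch: it keeps the lookup a pure function, which makes each branch easy to test on any platform.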
data/lib/natto/version.rb
CHANGED
data/test/test_natto.rb
CHANGED
@@ -34,13 +34,17 @@ class TestNatto < Test::Unit::TestCase
     res = Natto::MeCab.build_options_str(:userdic=>"/yet/another/file")
     assert_equal('--userdic=/yet/another/file', res)
 
-    res = Natto::MeCab.build_options_str(:
-    assert_equal('--
+    res = Natto::MeCab.build_options_str(:lattice_level=>42)
+    assert_equal('--lattice-level=42', res)
 
-    res = Natto::MeCab.build_options_str(:
-
-
-
+    res = Natto::MeCab.build_options_str(:all_morphs=>true)
+    assert_equal('--all-morphs', res)
+
+    res = Natto::MeCab.build_options_str(:output_format_type=>"natto")
+    assert_equal('--output-format-type=natto', res)
+
+    res = Natto::MeCab.build_options_str(:partial=>true)
+    assert_equal('--partial', res)
 
     res = Natto::MeCab.build_options_str(:node_format=>'%m\t%f[7]\n')
     assert_equal('--node-format=%m\t%f[7]\n', res)
@@ -60,9 +64,6 @@ class TestNatto < Test::Unit::TestCase
     res = Natto::MeCab.build_options_str(:unk_feature=>'%m\t%f[7]\n')
     assert_equal('--unk-feature=%m\t%f[7]\n', res)
 
-    res = Natto::MeCab.build_options_str(:lattice_level=>42)
-    assert_equal('--lattice-level=42', res)
-
     res = Natto::MeCab.build_options_str(:nbest=>42)
     assert_equal('--nbest=42', res)
 
@@ -71,12 +72,51 @@ class TestNatto < Test::Unit::TestCase
 
     res = Natto::MeCab.build_options_str(:cost_factor=>42)
     assert_equal('--cost-factor=42', res)
+
+    res = Natto::MeCab.build_options_str(:output_format_type=>"natto",
+                                         :userdic=>"/some/file",
+                                         :dicdir=>"/some/other/file",
+                                         :partial=>true,
+                                         :all_morphs=>true)
+    assert_equal('--dicdir=/some/other/file --userdic=/some/file --all-morphs --output-format-type=natto --partial', res)
+
   end
 
-  def
-
-
+  def test_construction
+    m = nil
+    assert_nothing_raised do
+      m = Natto::MeCab.new
+    end
+    assert_equal({}, m.options)
+
+    opts = {:output_format_type=>'chasen'}
+    assert_nothing_raised do
+      m = Natto::MeCab.new(opts)
     end
+    assert_equal(opts, m.options)
+
+    opts = {:all_morphs=>true, :partial=>true}
+    assert_nothing_raised do
+      m = Natto::MeCab.new(opts)
+    end
+    assert_equal(opts, m.options)
   end
 
+  def test_initialize_with_errors
+    assert_raise Natto::MeCabError do
+      Natto::MeCab.new(:output_format_type=>'not_defined_anywhere')
+    end
+
+    assert_raise Natto::MeCabError do
+      Natto::MeCab.new(:rcfile=>'/rcfile/does/not/exist')
+    end
+
+    assert_raise Natto::MeCabError do
+      Natto::MeCab.new(:dicdir=>'/dicdir/does/not/exist')
+    end
+
+    assert_raise Natto::MeCabError do
+      Natto::MeCab.new(:userdic=>'/userdic/does/not/exist')
+    end
+  end
 end
metadata
CHANGED
@@ -1,13 +1,13 @@
 --- !ruby/object:Gem::Specification
 name: natto
 version: !ruby/object:Gem::Version
-  hash:
+  hash: 17
   prerelease: false
   segments:
   - 0
   - 0
-  -
-  version: 0.0.
+  - 7
+  version: 0.0.7
 platform: ruby
 authors:
 - Brooke M. Fujita
@@ -15,7 +15,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2010-12-
+date: 2010-12-30 00:00:00 +09:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
@@ -42,14 +42,14 @@ extensions: []
 
 extra_rdoc_files:
 - LICENSE
-- README
+- README.md
 files:
 - lib/natto.rb
 - lib/natto/binding.rb
 - lib/natto/version.rb
 - test/test_natto.rb
 - LICENSE
-- README
+- README.md
 has_rdoc: yard
 homepage: http://code.google.com/p/natto/
 licenses:
data/README
DELETED
@@ -1,77 +0,0 @@
-= natto: A Tasty Ruby Binding with MeCab
-
-== What is natto?
-
-natto provides a Ruby binding with MeCab,
-the part-of-speech and morphological analyzer
-for the Japanese language.
-
-== Try It! Try It!
-
-=== Requirements
-natto requires the following:
-* {http://sourceforge.net/projects/mecab/files/mecab/ MeCab 0.98}
-* {http://rubygems.org/gems/ffi ffi 0.63 or greater}
-* Ruby 1.8.7 or greater
-
-=== Installation
-Install natto with the following gem command:
-* <code>gem install natto</code>
-
-=== Configuration
-* natto will try to locate the <tt>mecab</tt> library based upon its runtime environment.
-* In case of <tt>LoadError</tt>, please set the <tt>MECAB_PATH</tt> environment variable to the exact name/path to your <tt>mecab</tt> library.
-
-== Usage
-require 'natto'
-
-m = Natto::MeCab.new
-puts m.parse("すもももももももものうち")
-すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
-も 助詞,係助詞,*,*,*,*,も,モ,モ
-もも 名詞,一般,*,*,*,*,もも,モモ,モモ
-も 助詞,係助詞,*,*,*,*,も,モ,モ
-もも 名詞,一般,*,*,*,*,もも,モモ,モモ
-の 助詞,連体化,*,*,*,*,の,ノ,ノ
-うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
-EOS
-=> nil
-
-== Contributing to natto
-
-* Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet
-* Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it
-* Fork the project
-* Start a feature/bugfix branch
-* Commit and push until you are happy with your contribution
-* Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
-* Please try not to mess with the Rakefile, version, or history. If you must have your own version, that is fine, but please isolate to its own commit so I can cherry-pick around it.
-
-== Changelog
-
-- **2010/12/28**: 0.0.6 release.
-- Correction to natto.gemspec to include lib/natto/binding.rb.
-
-- **2010/12/28**: 0.0.5 release. (yanked)
-- On-going refactoring
-- Project structure refactored for greater maintainability
-
-- **2010/12/26**: 0.0.4 release.
-- On-going refactoring
-
-- **2010/12/23**: 0.0.3 release.
-- On-going refactoring
-- Adding documentation via yard
-
-- **2010/12/20**: 0.0.2 release.
-- Continuing development on proper resource deallocation
-- Adding options hash in object initializer
-
-- **2010/12/13**: Released version 0.0.1. The objective is to provide
-an easy-to-use, production-level Ruby binding to MeCab.
-- Initial release
-
-
-== Copyright
-
-natto (c) 2010-2013 by Brooke M. Fujita, licensed under the new BSD license. Please see the {file:LICENSE} document for further details.