natto2classifier 0.1.3 → 0.3.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
-   metadata.gz: f933e506e593fedf2eeb3c43e73c3de5fe39d84b
-   data.tar.gz: 6406772efd4fdba35779207d791d0a9c6e257092
+ SHA256:
+   metadata.gz: 3ae8ae7a0796de33c72feb4ce5b5926ba945c0779ea3fae42ca17b6a04ed3e32
+   data.tar.gz: d7083a5371d528fed4ced8bcaf9a522d0db4c6abc62e9df15c63e2af62c8b77f
  SHA512:
-   metadata.gz: cecd63071da50fbc05b970aa730fcdeb1cb65fbd7a34ddd1e0834f1d845cd7c7fe0216626c504406878c6b9391d44f2be748c5864482babdc000a05d8f9f74a6
-   data.tar.gz: 9b12fd6dde86b47fb731f41112b7990cd74dc641ecffaf096c976b4b6dd967f60e433d1a0c53c70e68c1b114b991ffdceee0ccf32b1b325d5fdbeeb1b3e77eec
+   metadata.gz: 15cead65621e2f1d63e243605a3148377248a0940977efb545ad050d53914836ebf70d7b8792e5f7cf550835a77f7c296666d51e1b926b8dfea1e83864ced27a
+   data.tar.gz: 25977c17d175688768efd47a76d30136aae83fe48bc33138b7d80fc8e6b0415128ac765cf7ec9578c4a355138e431e86e1fda5ce5e0c1cfbdd73aafd6ee58691
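
The checksums hunk above replaces the SHA1 entries with SHA256 ones and keeps SHA512. As a rough illustration of where such values come from, here is a minimal sketch using Ruby's standard `digest` library. The helper name `checksums_for` and the sample bytes are hypothetical, and this is not RubyGems' own packaging code.

```ruby
require 'digest'

# Hypothetical helper: hex digests for a blob of bytes, keyed by algorithm
# name in the same way checksums.yaml groups its entries.
def checksums_for(bytes)
  {
    'SHA256' => Digest::SHA256.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Placeholder input; a real check would read metadata.gz or data.tar.gz.
sums = checksums_for('example archive bytes')
sums.each { |algo, hex| puts "#{algo}: #{hex.length} hex chars" }
```

A SHA256 hex digest is 64 characters long and a SHA512 one is 128, which matches the lengths of the new entries in the hunk.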
@@ -0,0 +1,14 @@
+ version: 2
+ jobs:
+   test:
+     docker:
+       - image: kanayannet/natto2classifier:latest
+     steps:
+       - checkout
+       - run: bundle install
+       - run: bundle exec ruby test/natto2classifier_test.rb
+ workflows:
+   version: 2
+   test:
+     jobs:
+       - test
@@ -1,5 +1,21 @@
- sudo: false
  language: ruby
  rvm:
    - 2.4.3
- before_install: gem install bundler -v 1.16.1
+ before_install:
+   - gem install bundler -v 1.16.1
+   - sudo apt-get update -qq
+ install:
+   # mecab
+   - wget --no-check-certificate https://github.com/buruzaemon/natto/raw/master/etc/mecab-0.996.tar.gz && tar zxf mecab-0.996.tar.gz
+   - pushd mecab-0.996 && ./configure --enable-utf8-only && make && sudo make install && popd
+   - sudo ldconfig
+   # mecab-ipadic
+   - wget --no-check-certificate https://github.com/buruzaemon/natto/raw/master/etc/mecab-ipadic-2.7.0-20070801.tar.gz && tar zxf mecab-ipadic-2.7.0-20070801.tar.gz
+   - pushd mecab-ipadic-2.7.0-20070801 && ./configure --with-charset=utf8 && make && sudo make install && popd
+   - sudo ldconfig
+   # gsl
+   - sudo apt-get install libgsl2 libgsl-dev
+   # explicitly install
+   - bundle install --path .bundle
+ script:
+   - bundle exec ruby test/natto2classifier_test.rb
@@ -1,9 +1,10 @@
  PATH
    remote: .
    specs:
-     natto2classifier (0.1.3)
+     natto2classifier (0.3.4)
        classifier-reborn
        natto
+       rb-gsl

  GEM
    remote: https://rubygems.org/
@@ -13,6 +14,7 @@ GEM
      coderay (1.1.2)
      fast-stemmer (1.0.2)
      ffi (1.9.23)
+     gsl (2.1.0.3)
      method_source (0.9.0)
      minitest (5.11.3)
      natto (1.1.1)
@@ -21,6 +23,8 @@ GEM
        coderay (~> 1.1.0)
        method_source (~> 0.9.0)
      rake (10.5.0)
+     rb-gsl (1.16.0.6)
+       gsl

  PLATFORMS
    ruby
@@ -33,4 +37,4 @@ DEPENDENCIES
    rake (~> 10.0)

  BUNDLED WITH
-    1.16.1
+    1.17.2
data/README.md CHANGED
@@ -1,5 +1,7 @@
  # Natto2classifier

+ [![Build Status](https://travis-ci.org/kanayannet/natto2classifier.svg?branch=master)](https://travis-ci.org/kanayannet/natto2classifier)
+
  Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/natto2classifier`. To experiment with that code, run `bin/console` for an interactive prompt.

  ## Installation
@@ -12,7 +14,7 @@ gem 'natto2classifier'

  And then execute:

-     $ bundle
+     $ bundle install

  Or install it yourself as:

@@ -20,11 +22,34 @@ Or install it yourself as:

  ## Usage

+ ### Bayesian methods
+
+ ```
+ bayes = Natto2classifier::Bayes.new '朝食', '夕食'
+ bayes.train '朝食', '今日の朝食は納豆だ'
+ bayes.train '夕食', '今日の夕食は湯豆腐だ'
+ bayes.classify '納豆はいつも朝食べている' #=> '朝食'
+ ```
+
+ ### LSI methods
+
  ```
- classifier = Natto2classifier::Bayes.new '朝食', '夕食'
- classifier.train '朝食', '今日の朝食は納豆だ'
- classifier.train '夕食', '今日の夕食は湯豆腐だ'
- classifier.classify '納豆はいつも朝食べている' #=> '朝食'
+ lsi = Natto2classifier::LSI.new
+ lsi.add_item '今日の朝食は納豆だ', '朝食'
+ lsi.add_item '今日の夕食は湯豆腐だ', '夕食'
+ lsi.classify '納豆はいつも朝食べている' #=> '朝食'
+ lsi.find_related '納豆はいつも朝食べている' #=> ['今日 キョウ の ノ 朝食 チョウショク は ハ 納豆 ナットウ だ ダ', '今日 キョウ の ノ 夕食 ユウショク は ハ 湯豆腐 ユドウフ だ ダ']
+ ```
+
+ ### validate methods
+
+ ```
+ sample_data = CSV.read('./data/train.csv')
+ bayes = Natto2classifier::Bayes.new '朝食', '夕食'
+ cross_validate(bayes, sample_data) #=> report...
+
+ test_data, training_data = sample_data.partition.with_index { |_, i| (i % 2).zero? }
+ validate(bayes, training_data, test_data) #=> {"夕食"=>{"夕食"=>3, "朝食"=>0}, "朝食"=>{"夕食"=>...}}
  ```

  ## Development
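
The validate example in the README hunk above splits the CSV rows with `partition.with_index`. The sketch below shows only that split, on hypothetical rows shaped like `[category, text]` pairs; the row contents are placeholders, not the gem's actual training data.

```ruby
# Hypothetical rows in the same [category, text] shape as data/train.csv.
sample_data = [
  ['朝食', 'text 0'],
  ['朝食', 'text 1'],
  ['夕食', 'text 2'],
  ['夕食', 'text 3'],
  ['朝食', 'text 4']
]

# partition without a block returns an Enumerator; with_index then passes the
# block each row together with its position. Rows for which the block is true
# (even index) go to the first array, the rest to the second.
test_data, training_data = sample_data.partition.with_index { |_, i| (i % 2).zero? }

puts "test rows:     #{test_data.map(&:last).inspect}"
puts "training rows: #{training_data.map(&:last).inspect}"
```

With an odd number of rows the even-index side, which the README assigns to `test_data`, ends up one row larger than the training side.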
@@ -35,7 +60,7 @@ To install this gem onto your local machine, run `bundle exec rake install`. To

  ## Contributing

- Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/natto2classifier. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
+ Bug reports and pull requests are welcome on GitHub at https://github.com/kanayannet/natto2classifier. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.

  ## License

@@ -43,4 +68,4 @@ The gem is available as open source under the terms of the [MIT License](https:/

  ## Code of Conduct

- Everyone interacting in the Natto2classifier project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/natto2classifier/blob/master/CODE_OF_CONDUCT.md).
+ Everyone interacting in the Natto2classifier project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/kanayannet/natto2classifier/blob/master/CODE_OF_CONDUCT.md).
@@ -1,5 +1,5 @@
  朝食,卵かけご飯は醤油とあう
- 朝食,目玉焼きは塩故障派だ。
+ 朝食,目玉焼きは塩胡椒派だ。
  朝食,納豆かけご飯はとても健康によい
  朝食,食パンはオーソドックスな朝食だ
  朝食,スクランブルエッグはホテルの朝食でよく出る
@@ -1,20 +1,6 @@
  require 'natto2classifier/version'
  require 'natto2classifier/natto'
  require 'classifier-reborn'
-
- module Natto2classifier
-   # It is a library that classifies Japanese language.
-   class Bayes < ClassifierReborn::Bayes
-     alias_method :__train__, :train
-     alias_method :__classify__, :classify
-     private :__train__, :__classify__
-
-     def train(category, word)
-       __train__ category, Natto2classifier::Natto.parse(word).join(' ')
-     end
-
-     def classify(word)
-       __classify__ Natto2classifier::Natto.parse(word).join(' ')
-     end
-   end
- end
+ require 'natto2classifier/bayes'
+ require 'natto2classifier/lsi'
+ require 'natto2classifier/validator'
@@ -0,0 +1,17 @@
+
+ module Natto2classifier
+   # It is a library that classifies Japanese language.
+   class Bayes < ClassifierReborn::Bayes
+     alias_method :__train__, :train
+     alias_method :__classify__, :classify
+     private :__train__, :__classify__
+
+     def train(category, word)
+       __train__ category, Natto2classifier::Natto.parse(word).join(' ')
+     end
+
+     def classify(word)
+       __classify__ Natto2classifier::Natto.parse(word).join(' ')
+     end
+   end
+ end
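
The `Bayes` class in the hunk above keeps the inherited `train` and `classify` reachable under private aliases, then redefines the public methods to tokenize the input before delegating. A minimal, self-contained sketch of that wrapping pattern follows. `BaseClassifier`, `TokenizingClassifier`, and `tokenize` are hypothetical stand-ins, and the tokenizer simply splits into characters rather than calling MeCab.

```ruby
# Hypothetical stand-in for the parent classifier; it only records its input.
class BaseClassifier
  attr_reader :seen

  def initialize
    @seen = []
  end

  def train(category, text)
    @seen << [category, text]
  end
end

# Hypothetical stand-in for Natto2classifier::Natto.parse: one token per character.
def tokenize(text)
  text.chars
end

class TokenizingClassifier < BaseClassifier
  # Keep the inherited method reachable under a private alias...
  alias_method :__train__, :train
  private :__train__

  # ...then redefine the public method to preprocess before delegating.
  def train(category, text)
    __train__ category, tokenize(text).join(' ')
  end
end

c = TokenizingClassifier.new
c.train('greeting', 'hey')
p c.seen
```

Calling `super` from the overriding method would reach the parent implementation as well. The alias form shown in the diff instead gives the original method an explicit private name.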
@@ -0,0 +1,22 @@
+
+ module Natto2classifier
+   # It is a library that classifies Japanese language.
+   class LSI < ClassifierReborn::LSI
+     alias_method :__add_item__, :add_item
+     alias_method :__classify__, :classify
+     alias_method :__find_related__, :find_related
+     private :__add_item__, :__classify__, :__find_related__
+
+     def add_item(word, category)
+       __add_item__ Natto2classifier::Natto.parse(word).join(' '), category
+     end
+
+     def classify(word)
+       __classify__ Natto2classifier::Natto.parse(word).join(' ')
+     end
+
+     def find_related(word)
+       __find_related__ Natto2classifier::Natto.parse(word).join(' ')
+     end
+   end
+ end
@@ -4,10 +4,13 @@ require 'natto'
  module Natto2classifier
    class Natto
      def self.parse(word)
-       nm = ::Natto::MeCab.new('-F%m\s%f[7]')
+       nm = ::Natto::MeCab.new
        results = []
-       nm.enum_parse(word.to_s).each do |n|
-         results << n.feature if !n.is_eos?
+       nm.parse(word.to_s) do |n|
+         break if n.is_eos?
+         kana = n.feature.split(',')[7]
+         results << n.surface
+         results << kana if !kana.nil? && kana != '*'
        end
        results
      end
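
The rewritten `parse` in the hunk above no longer relies on a MeCab output-format string. It reads each node's surface form and takes the reading from index 7 of the comma-separated feature string, skipping it when it is missing or `*`. The sketch below reproduces just that extraction step without MeCab. The `Node` struct, the method name `surfaces_with_readings`, and the hand-written feature strings are hypothetical and assume an IPADIC-style field layout.

```ruby
# Hypothetical stand-in for a MeCab node: only the two fields the loop reads.
Node = Struct.new(:surface, :feature)

# Hand-written sample nodes, assuming IPADIC-style features where index 7
# holds the reading. The third mimics a word whose feature string has no
# reading field; the fourth carries '*' in the reading position.
nodes = [
  Node.new('納豆', '名詞,一般,*,*,*,*,納豆,ナットウ,ナットー'),
  Node.new('だ', '助動詞,*,*,*,特殊・ダ,基本形,だ,ダ,ダ'),
  Node.new('xyz', '名詞,一般,*,*,*,*,*'),
  Node.new('!!', '記号,一般,*,*,*,*,*,*,*')
]

def surfaces_with_readings(nodes)
  nodes.each_with_object([]) do |n, results|
    kana = n.feature.split(',')[7]
    results << n.surface
    # Same guard as in the diff: skip a missing or placeholder reading.
    results << kana if !kana.nil? && kana != '*'
  end
end

p surfaces_with_readings(nodes)
```

Each surface is followed by its reading whenever one is available, which is the same interleaving visible in the `find_related` output quoted in the README.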
@@ -0,0 +1,6 @@
+
+ module Natto2classifier
+   module Validator
+     include ClassifierReborn::ClassifierValidator
+   end
+ end
@@ -1,3 +1,3 @@
  module Natto2classifier
-   VERSION = "0.1.3"
+   VERSION = "0.3.4"
  end
@@ -27,4 +27,5 @@ Gem::Specification.new do |spec|
    spec.add_development_dependency "pry"
    spec.add_runtime_dependency "natto"
    spec.add_runtime_dependency "classifier-reborn"
+   spec.add_runtime_dependency "rb-gsl"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: natto2classifier
  version: !ruby/object:Gem::Version
-   version: 0.1.3
+   version: 0.3.4
  platform: ruby
  authors:
  - kanayannet
- autorequire:
+ autorequire:
  bindir: exe
  cert_chain: []
- date: 2018-04-26 00:00:00.000000000 Z
+ date: 2020-09-15 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -94,6 +94,20 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rb-gsl
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  description: It is a library that classifies Japanese language. It depends on classifier-reborn
    and natto.
  email:
@@ -102,6 +116,7 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
+ - ".circleci/config.yml"
  - ".gitignore"
  - ".travis.yml"
  - CODE_OF_CONDUCT.md
@@ -114,14 +129,17 @@ files:
  - bin/setup
  - data/train.csv
  - lib/natto2classifier.rb
+ - lib/natto2classifier/bayes.rb
+ - lib/natto2classifier/lsi.rb
  - lib/natto2classifier/natto.rb
+ - lib/natto2classifier/validator.rb
  - lib/natto2classifier/version.rb
  - natto2classifier.gemspec
  homepage: https://github.com/kanayannet/natto2classifier
  licenses:
  - MIT
  metadata: {}
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -136,9 +154,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubyforge_project:
- rubygems_version: 2.6.14
- signing_key:
+ rubygems_version: 3.0.3
+ signing_key:
  specification_version: 4
  summary: It is a library that classifies Japanese language.
  test_files: []