natto2classifier 0.1.3 → 0.3.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
-   metadata.gz: f933e506e593fedf2eeb3c43e73c3de5fe39d84b
-   data.tar.gz: 6406772efd4fdba35779207d791d0a9c6e257092
+ SHA256:
+   metadata.gz: 3ae8ae7a0796de33c72feb4ce5b5926ba945c0779ea3fae42ca17b6a04ed3e32
+   data.tar.gz: d7083a5371d528fed4ced8bcaf9a522d0db4c6abc62e9df15c63e2af62c8b77f
  SHA512:
-   metadata.gz: cecd63071da50fbc05b970aa730fcdeb1cb65fbd7a34ddd1e0834f1d845cd7c7fe0216626c504406878c6b9391d44f2be748c5864482babdc000a05d8f9f74a6
-   data.tar.gz: 9b12fd6dde86b47fb731f41112b7990cd74dc641ecffaf096c976b4b6dd967f60e433d1a0c53c70e68c1b114b991ffdceee0ccf32b1b325d5fdbeeb1b3e77eec
+   metadata.gz: 15cead65621e2f1d63e243605a3148377248a0940977efb545ad050d53914836ebf70d7b8792e5f7cf550835a77f7c296666d51e1b926b8dfea1e83864ced27a
+   data.tar.gz: 25977c17d175688768efd47a76d30136aae83fe48bc33138b7d80fc8e6b0415128ac765cf7ec9578c4a355138e431e86e1fda5ce5e0c1cfbdd73aafd6ee58691
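The checksum entries above (SHA1 dropped, SHA256 added, SHA512 kept) can be checked locally once the two archives are extracted from the `.gem` file. A minimal sketch, assuming only Ruby's standard `digest` library; the helper name `verify_checksum` and the file path in the usage comment are hypothetical and not part of the gem:

```ruby
require 'digest'

# Digest classes for the algorithms that appear in checksums.yaml.
ALGORITHMS = {
  'SHA256' => Digest::SHA256,
  'SHA512' => Digest::SHA512
}.freeze

# Hypothetical helper: returns true when the hex digest of `bytes`
# matches the value recorded in checksums.yaml.
def verify_checksum(bytes, expected_hex, algorithm: 'SHA256')
  digest_class = ALGORITHMS.fetch(algorithm)
  digest_class.hexdigest(bytes) == expected_hex.downcase
end

# Usage sketch (the path is an assumption):
#   bytes = File.binread('metadata.gz')
#   verify_checksum(bytes, '3ae8ae7a0796de33...', algorithm: 'SHA256')
```

Reading the file with `File.binread` matters here, because a text-mode read could alter the bytes on some platforms and change the digest.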
data/.circleci/config.yml ADDED
@@ -0,0 +1,14 @@
+ version: 2
+ jobs:
+   test:
+     docker:
+       - image: kanayannet/natto2classifier:latest
+     steps:
+       - checkout
+       - run: bundle install
+       - run: bundle exec ruby test/natto2classifier_test.rb
+ workflows:
+   version: 2
+   test:
+     jobs:
+       - test
data/.travis.yml CHANGED
@@ -1,5 +1,21 @@
- sudo: false
  language: ruby
  rvm:
  - 2.4.3
- before_install: gem install bundler -v 1.16.1
+ before_install:
+ - gem install bundler -v 1.16.1
+ - sudo apt-get update -qq
+ install:
+ # mecab
+ - wget --no-check-certificate https://github.com/buruzaemon/natto/raw/master/etc/mecab-0.996.tar.gz && tar zxf mecab-0.996.tar.gz
+ - pushd mecab-0.996 && ./configure --enable-utf8-only && make && sudo make install && popd
+ - sudo ldconfig
+ # mecab-ipadic
+ - wget --no-check-certificate https://github.com/buruzaemon/natto/raw/master/etc/mecab-ipadic-2.7.0-20070801.tar.gz && tar zxf mecab-ipadic-2.7.0-20070801.tar.gz
+ - pushd mecab-ipadic-2.7.0-20070801 && ./configure --with-charset=utf8 && make && sudo make install && popd
+ - sudo ldconfig
+ # gsl
+ - sudo apt-get install libgsl2 libgsl-dev
+ # explicitly install
+ - bundle install --path .bundle
+ script:
+ - bundle exec ruby test/natto2classifier_test.rb
data/Gemfile.lock CHANGED
@@ -1,9 +1,10 @@
  PATH
    remote: .
    specs:
-     natto2classifier (0.1.3)
+     natto2classifier (0.3.4)
        classifier-reborn
        natto
+       rb-gsl

  GEM
    remote: https://rubygems.org/
@@ -13,6 +14,7 @@ GEM
      coderay (1.1.2)
      fast-stemmer (1.0.2)
      ffi (1.9.23)
+     gsl (2.1.0.3)
      method_source (0.9.0)
      minitest (5.11.3)
      natto (1.1.1)
@@ -21,6 +23,8 @@ GEM
        coderay (~> 1.1.0)
        method_source (~> 0.9.0)
      rake (10.5.0)
+     rb-gsl (1.16.0.6)
+       gsl

  PLATFORMS
    ruby
@@ -33,4 +37,4 @@ DEPENDENCIES
    rake (~> 10.0)

  BUNDLED WITH
-    1.16.1
+    1.17.2
data/README.md CHANGED
@@ -1,5 +1,7 @@
  # Natto2classifier

+ [![Build Status](https://travis-ci.org/kanayannet/natto2classifier.svg?branch=master)](https://travis-ci.org/kanayannet/natto2classifier)
+
  Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/natto2classifier`. To experiment with that code, run `bin/console` for an interactive prompt.

  ## Installation
@@ -12,7 +14,7 @@ gem 'natto2classifier'

  And then execute:

-     $ bundle
+     $ bundle install

  Or install it yourself as:

@@ -20,11 +22,34 @@ Or install it yourself as:

  ## Usage

+ ### Bayesian methods
+
+ ```
+ bayes = Natto2classifier::Bayes.new '朝食', '夕食'
+ bayes.train '朝食', '今日の朝食は納豆だ'
+ bayes.train '夕食', '今日の夕食は湯豆腐だ'
+ bayes.classify '納豆はいつも朝食べている' #=> '朝食'
+ ```
+
+ ### LSI methods
+
  ```
- classifier = Natto2classifier::Bayes.new '朝食', '夕食'
- classifier.train '朝食', '今日の朝食は納豆だ'
- classifier.train '夕食', '今日の夕食は湯豆腐だ'
- classifier.classify '納豆はいつも朝食べている' #=> '朝食'
+ lsi = Natto2classifier::LSI.new
+ lsi.add_item '今日の朝食は納豆だ', '朝食'
+ lsi.add_item '今日の夕食は湯豆腐だ', '夕食'
+ lsi.classify '納豆はいつも朝食べている' #=> '朝食'
+ lsi.find_related '納豆はいつも朝食べている' #=> ['今日 キョウ の ノ 朝食 チョウショク は ハ 納豆 ナットウ だ ダ', '今日 キョウ の ノ 夕食 ユウショク は ハ 湯豆腐 ユドウフ だ ダ']
+ ```
+
+ ### validate methods
+
+ ```
+ sample_data = CSV.read('./data/train.csv')
+ bayes = Natto2classifier::Bayes.new '朝食', '夕食'
+ cross_validate(bayes, sample_data) #=> report...
+
+ test_data, training_data = sample_data.partition.with_index { |_, i| (i % 2).zero? }
+ validate(bayes, training_data, test_data) #=> {"夕食"=>{"夕食"=>3, "朝食"=>0}, "朝食"=>{"夕食"=>...}}
  ```

  ## Development
@@ -35,7 +60,7 @@ To install this gem onto your local machine, run `bundle exec rake install`. To

  ## Contributing

- Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/natto2classifier. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
+ Bug reports and pull requests are welcome on GitHub at https://github.com/kanayannet/natto2classifier. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.

  ## License

@@ -43,4 +68,4 @@ The gem is available as open source under the terms of the [MIT License](https:/

  ## Code of Conduct

- Everyone interacting in the Natto2classifier project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/natto2classifier/blob/master/CODE_OF_CONDUCT.md).
+ Everyone interacting in the Natto2classifier project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/kanayannet/natto2classifier/blob/master/CODE_OF_CONDUCT.md).
data/data/train.csv CHANGED
@@ -1,5 +1,5 @@
  朝食,卵かけご飯は醤油とあう
- 朝食,目玉焼きは塩故障派だ。
+ 朝食,目玉焼きは塩胡椒派だ。
  朝食,納豆かけご飯はとても健康によい
  朝食,食パンはオーソドックスな朝食だ
  朝食,スクランブルエッグはホテルの朝食でよく出る
data/lib/natto2classifier.rb CHANGED
@@ -1,20 +1,6 @@
  require 'natto2classifier/version'
  require 'natto2classifier/natto'
  require 'classifier-reborn'
-
- module Natto2classifier
-   # It is a library that classifies Japanese language.
-   class Bayes < ClassifierReborn::Bayes
-     alias_method :__train__, :train
-     alias_method :__classify__, :classify
-     private :__train__, :__classify__
-
-     def train(category, word)
-       __train__ category, Natto2classifier::Natto.parse(word).join(' ')
-     end
-
-     def classify(word)
-       __classify__ Natto2classifier::Natto.parse(word).join(' ')
-     end
-   end
- end
+ require 'natto2classifier/bayes'
+ require 'natto2classifier/lsi'
+ require 'natto2classifier/validator'
data/lib/natto2classifier/bayes.rb ADDED
@@ -0,0 +1,17 @@
+
+ module Natto2classifier
+   # It is a library that classifies Japanese language.
+   class Bayes < ClassifierReborn::Bayes
+     alias_method :__train__, :train
+     alias_method :__classify__, :classify
+     private :__train__, :__classify__
+
+     def train(category, word)
+       __train__ category, Natto2classifier::Natto.parse(word).join(' ')
+     end
+
+     def classify(word)
+       __classify__ Natto2classifier::Natto.parse(word).join(' ')
+     end
+   end
+ end
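The `Bayes` class above keeps the inherited methods reachable under private aliases and then redefines the public names so that input is tokenized first. The pattern can be sketched without MeCab or classifier-reborn installed. Everything named here (`BaseClassifier`, `WrappedClassifier`, `fake_parse`) is a hypothetical stand-in, not part of the gem:

```ruby
# Stand-in parent class (hypothetical; not ClassifierReborn::Bayes).
# It only records what it was asked to train on.
class BaseClassifier
  attr_reader :seen

  def initialize
    @seen = []
  end

  def train(category, text)
    @seen << [category, text]
  end
end

# Hypothetical tokenizer standing in for Natto2classifier::Natto.parse.
def fake_parse(text)
  text.split('|')
end

class WrappedClassifier < BaseClassifier
  # Keep the inherited method reachable under a private name...
  alias_method :__train__, :train
  private :__train__

  # ...then redefine the public method to preprocess before delegating.
  def train(category, text)
    __train__ category, fake_parse(text).join(' ')
  end
end

classifier = WrappedClassifier.new
classifier.train('breakfast', 'natto|rice')
classifier.seen # => [["breakfast", "natto rice"]]
```

Calling `super` from the overriding method would delegate to the parent in the same way without the aliases; the diff does not say why the alias form was chosen.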
data/lib/natto2classifier/lsi.rb ADDED
@@ -0,0 +1,22 @@
+
+ module Natto2classifier
+   # It is a library that classifies Japanese language.
+   class LSI < ClassifierReborn::LSI
+     alias_method :__add_item__, :add_item
+     alias_method :__classify__, :classify
+     alias_method :__find_related__, :find_related
+     private :__add_item__, :__classify__, :__find_related__
+
+     def add_item(word, category)
+       __add_item__ Natto2classifier::Natto.parse(word).join(' '), category
+     end
+
+     def classify(word)
+       __classify__ Natto2classifier::Natto.parse(word).join(' ')
+     end
+
+     def find_related(word)
+       __find_related__ Natto2classifier::Natto.parse(word).join(' ')
+     end
+   end
+ end
data/lib/natto2classifier/natto.rb CHANGED
@@ -4,10 +4,13 @@ require 'natto'
  module Natto2classifier
    class Natto
      def self.parse(word)
-       nm = ::Natto::MeCab.new('-F%m\s%f[7]')
+       nm = ::Natto::MeCab.new
        results = []
-       nm.enum_parse(word.to_s).each do |n|
-         results << n.feature if !n.is_eos?
+       nm.parse(word.to_s) do |n|
+         break if n.is_eos?
+         kana = n.feature.split(',')[7]
+         results << n.surface
+         results << kana if !kana.nil? && kana != '*'
        end
        results
      end
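The new loop body emits each token's surface form followed by its reading, and skips the reading when it is missing or `*`. A minimal sketch of that step on plain data, so it runs without MeCab: the helper name `surfaces_with_readings` is hypothetical, the feature strings are hand-written in an IPADIC-like layout, and the position of the reading (index 7) is taken from the `split(',')[7]` call in the diff:

```ruby
# Position of the reading in a comma-separated feature string.
# Assumption: IPADIC-style features, as implied by `split(',')[7]`.
READING_INDEX = 7

# Hypothetical helper mirroring the loop body of the new parse method.
# `nodes` is an array of [surface, feature] pairs rather than MeCab nodes.
def surfaces_with_readings(nodes)
  nodes.each_with_object([]) do |(surface, feature), results|
    kana = feature.split(',')[READING_INDEX]
    results << surface
    results << kana unless kana.nil? || kana == '*'
  end
end

nodes = [
  ['納豆', '名詞,一般,*,*,*,*,納豆,ナットウ,ナットー'],
  ['ABC', '名詞,固有名詞,組織,*,*,*,*'] # short feature list: no reading field
]
surfaces_with_readings(nodes) # => ["納豆", "ナットウ", "ABC"]
```

This matches the shape of the `find_related` output in the README hunk, where each surface form is followed by its katakana reading.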
data/lib/natto2classifier/validator.rb ADDED
@@ -0,0 +1,6 @@
+
+ module Natto2classifier
+   module Validator
+     include ClassifierReborn::ClassifierValidator
+   end
+ end
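The README hunk pairs this validator with an even/odd split of the sample rows and shows `validate` returning a nested hash of counts. The sketch below illustrates both ideas on toy data only. It does not call or reproduce `ClassifierReborn::ClassifierValidator`; the helper `confusion_matrix`, the stub classifier block, and the choice of outer key as actual category and inner key as predicted category are all assumptions made for illustration:

```ruby
# Toy rows in the same [category, text] shape as data/train.csv.
rows = [
  %w[朝食 a], %w[朝食 b], %w[夕食 c], %w[夕食 d]
]

# Same split as the README: even-indexed rows become the test set,
# odd-indexed rows the training set.
test_data, training_data = rows.partition.with_index { |_, i| (i % 2).zero? }

# Hypothetical helper: tally predictions into a nested hash of counts.
# Assumption: outer key = actual category, inner key = predicted category.
def confusion_matrix(rows, categories)
  matrix = categories.map { |c| [c, categories.map { |p| [p, 0] }.to_h] }.to_h
  rows.each do |actual, text|
    predicted = yield(text)
    matrix[actual][predicted] += 1
  end
  matrix
end

# Stub "classifier" that always answers '朝食', for illustration only.
matrix = confusion_matrix(test_data, %w[朝食 夕食]) { |_text| '朝食' }
# => {"朝食"=>{"朝食"=>1, "夕食"=>0}, "夕食"=>{"朝食"=>1, "夕食"=>0}}
```

With the stub, every test row is counted under the predicted key `朝食`, so the second row of the matrix shows one misclassified `夕食` sample.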
data/lib/natto2classifier/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Natto2classifier
-   VERSION = "0.1.3"
+   VERSION = "0.3.4"
  end
data/natto2classifier.gemspec CHANGED
@@ -27,4 +27,5 @@ Gem::Specification.new do |spec|
    spec.add_development_dependency "pry"
    spec.add_runtime_dependency "natto"
    spec.add_runtime_dependency "classifier-reborn"
+   spec.add_runtime_dependency "rb-gsl"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: natto2classifier
  version: !ruby/object:Gem::Version
-   version: 0.1.3
+   version: 0.3.4
  platform: ruby
  authors:
  - kanayannet
- autorequire:
+ autorequire:
  bindir: exe
  cert_chain: []
- date: 2018-04-26 00:00:00.000000000 Z
+ date: 2020-09-15 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -94,6 +94,20 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: '0'
+ - !ruby/object:Gem::Dependency
+   name: rb-gsl
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  description: It is a library that classifies Japanese language. It depends on classifier-reborn
    and natto.
  email:
@@ -102,6 +116,7 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
+ - ".circleci/config.yml"
  - ".gitignore"
  - ".travis.yml"
  - CODE_OF_CONDUCT.md
@@ -114,14 +129,17 @@ files:
  - bin/setup
  - data/train.csv
  - lib/natto2classifier.rb
+ - lib/natto2classifier/bayes.rb
+ - lib/natto2classifier/lsi.rb
  - lib/natto2classifier/natto.rb
+ - lib/natto2classifier/validator.rb
  - lib/natto2classifier/version.rb
  - natto2classifier.gemspec
  homepage: https://github.com/kanayannet/natto2classifier
  licenses:
  - MIT
  metadata: {}
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -136,9 +154,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubyforge_project:
- rubygems_version: 2.6.14
- signing_key:
+ rubygems_version: 3.0.3
+ signing_key:
  specification_version: 4
  summary: It is a library that classifies Japanese language.
  test_files: []