naivebayes 0.0.1

@@ -0,0 +1,165 @@
1
+ GNU LESSER GENERAL PUBLIC LICENSE
2
+ Version 3, 29 June 2007
3
+
4
+ Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
5
+ Everyone is permitted to copy and distribute verbatim copies
6
+ of this license document, but changing it is not allowed.
7
+
8
+
9
+ This version of the GNU Lesser General Public License incorporates
10
+ the terms and conditions of version 3 of the GNU General Public
11
+ License, supplemented by the additional permissions listed below.
12
+
13
+ 0. Additional Definitions.
14
+
15
+ As used herein, "this License" refers to version 3 of the GNU Lesser
16
+ General Public License, and the "GNU GPL" refers to version 3 of the GNU
17
+ General Public License.
18
+
19
+ "The Library" refers to a covered work governed by this License,
20
+ other than an Application or a Combined Work as defined below.
21
+
22
+ An "Application" is any work that makes use of an interface provided
23
+ by the Library, but which is not otherwise based on the Library.
24
+ Defining a subclass of a class defined by the Library is deemed a mode
25
+ of using an interface provided by the Library.
26
+
27
+ A "Combined Work" is a work produced by combining or linking an
28
+ Application with the Library. The particular version of the Library
29
+ with which the Combined Work was made is also called the "Linked
30
+ Version".
31
+
32
+ The "Minimal Corresponding Source" for a Combined Work means the
33
+ Corresponding Source for the Combined Work, excluding any source code
34
+ for portions of the Combined Work that, considered in isolation, are
35
+ based on the Application, and not on the Linked Version.
36
+
37
+ The "Corresponding Application Code" for a Combined Work means the
38
+ object code and/or source code for the Application, including any data
39
+ and utility programs needed for reproducing the Combined Work from the
40
+ Application, but excluding the System Libraries of the Combined Work.
41
+
42
+ 1. Exception to Section 3 of the GNU GPL.
43
+
44
+ You may convey a covered work under sections 3 and 4 of this License
45
+ without being bound by section 3 of the GNU GPL.
46
+
47
+ 2. Conveying Modified Versions.
48
+
49
+ If you modify a copy of the Library, and, in your modifications, a
50
+ facility refers to a function or data to be supplied by an Application
51
+ that uses the facility (other than as an argument passed when the
52
+ facility is invoked), then you may convey a copy of the modified
53
+ version:
54
+
55
+ a) under this License, provided that you make a good faith effort to
56
+ ensure that, in the event an Application does not supply the
57
+ function or data, the facility still operates, and performs
58
+ whatever part of its purpose remains meaningful, or
59
+
60
+ b) under the GNU GPL, with none of the additional permissions of
61
+ this License applicable to that copy.
62
+
63
+ 3. Object Code Incorporating Material from Library Header Files.
64
+
65
+ The object code form of an Application may incorporate material from
66
+ a header file that is part of the Library. You may convey such object
67
+ code under terms of your choice, provided that, if the incorporated
68
+ material is not limited to numerical parameters, data structure
69
+ layouts and accessors, or small macros, inline functions and templates
70
+ (ten or fewer lines in length), you do both of the following:
71
+
72
+ a) Give prominent notice with each copy of the object code that the
73
+ Library is used in it and that the Library and its use are
74
+ covered by this License.
75
+
76
+ b) Accompany the object code with a copy of the GNU GPL and this license
77
+ document.
78
+
79
+ 4. Combined Works.
80
+
81
+ You may convey a Combined Work under terms of your choice that,
82
+ taken together, effectively do not restrict modification of the
83
+ portions of the Library contained in the Combined Work and reverse
84
+ engineering for debugging such modifications, if you also do each of
85
+ the following:
86
+
87
+ a) Give prominent notice with each copy of the Combined Work that
88
+ the Library is used in it and that the Library and its use are
89
+ covered by this License.
90
+
91
+ b) Accompany the Combined Work with a copy of the GNU GPL and this license
92
+ document.
93
+
94
+ c) For a Combined Work that displays copyright notices during
95
+ execution, include the copyright notice for the Library among
96
+ these notices, as well as a reference directing the user to the
97
+ copies of the GNU GPL and this license document.
98
+
99
+ d) Do one of the following:
100
+
101
+ 0) Convey the Minimal Corresponding Source under the terms of this
102
+ License, and the Corresponding Application Code in a form
103
+ suitable for, and under terms that permit, the user to
104
+ recombine or relink the Application with a modified version of
105
+ the Linked Version to produce a modified Combined Work, in the
106
+ manner specified by section 6 of the GNU GPL for conveying
107
+ Corresponding Source.
108
+
109
+ 1) Use a suitable shared library mechanism for linking with the
110
+ Library. A suitable mechanism is one that (a) uses at run time
111
+ a copy of the Library already present on the user's computer
112
+ system, and (b) will operate properly with a modified version
113
+ of the Library that is interface-compatible with the Linked
114
+ Version.
115
+
116
+ e) Provide Installation Information, but only if you would otherwise
117
+ be required to provide such information under section 6 of the
118
+ GNU GPL, and only to the extent that such information is
119
+ necessary to install and execute a modified version of the
120
+ Combined Work produced by recombining or relinking the
121
+ Application with a modified version of the Linked Version. (If
122
+ you use option 4d0, the Installation Information must accompany
123
+ the Minimal Corresponding Source and Corresponding Application
124
+ Code. If you use option 4d1, you must provide the Installation
125
+ Information in the manner specified by section 6 of the GNU GPL
126
+ for conveying Corresponding Source.)
127
+
128
+ 5. Combined Libraries.
129
+
130
+ You may place library facilities that are a work based on the
131
+ Library side by side in a single library together with other library
132
+ facilities that are not Applications and are not covered by this
133
+ License, and convey such a combined library under terms of your
134
+ choice, if you do both of the following:
135
+
136
+ a) Accompany the combined library with a copy of the same work based
137
+ on the Library, uncombined with any other library facilities,
138
+ conveyed under the terms of this License.
139
+
140
+ b) Give prominent notice with the combined library that part of it
141
+ is a work based on the Library, and explaining where to find the
142
+ accompanying uncombined form of the same work.
143
+
144
+ 6. Revised Versions of the GNU Lesser General Public License.
145
+
146
+ The Free Software Foundation may publish revised and/or new versions
147
+ of the GNU Lesser General Public License from time to time. Such new
148
+ versions will be similar in spirit to the present version, but may
149
+ differ in detail to address new problems or concerns.
150
+
151
+ Each version is given a distinguishing version number. If the
152
+ Library as you received it specifies that a certain numbered version
153
+ of the GNU Lesser General Public License "or any later version"
154
+ applies to it, you have the option of following the terms and
155
+ conditions either of that published version or of any later version
156
+ published by the Free Software Foundation. If the Library as you
157
+ received it does not specify a version number of the GNU Lesser
158
+ General Public License, you may choose any version of the GNU Lesser
159
+ General Public License ever published by the Free Software Foundation.
160
+
161
+ If the Library as you received it specifies that a proxy can decide
162
+ whether future versions of the GNU Lesser General Public License shall
163
+ apply, that proxy's public statement of acceptance of any version is
164
+ permanent authorization for you to choose that version for the
165
+ Library.
@@ -0,0 +1,5 @@
1
+ === 0.0.1 / 2013-07-07
2
+
3
+ * First release.
4
+
5
+
@@ -0,0 +1,8 @@
1
+ You can redistribute it and/or modify it under either the terms of the GPL
2
+ version 3, or LGPL version 3 (Dual License).
3
+
4
+ See the file doc/COPYING or doc/COPYING.LESSER.
5
+
6
+ Copyright (c) 774 All Rights Reserved.
7
+ Web: http://id774.net
8
+ E-Mail: 774@id774.net
@@ -0,0 +1,16 @@
1
+ naivebayes
2
+
3
+ Name
4
+ naive bayes classifier
5
+
6
+ Syntax
7
+ require 'naivebayes'
8
+
9
+ Description
10
+ http://en.wikipedia.org/wiki/Naive_bayes
11
+
12
+ Installation
13
+ $ gem install naivebayes
14
+
15
+ Tutorial
16
+ See spec files.
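
Since the README only points to the spec files, here is a minimal usage sketch of the `train` / `classify` API as it appears in this diff. The `NaiveBayes::Classifier` defined inside the block is a compact stand-in that mirrors `lib/naivebayes/classifier.rb`, included only so the snippet runs without the gem installed; with the gem available, `require 'naivebayes'` should take its place. The training data is borrowed from the spec's `train_by_2` helper.

```ruby
# Stand-in mirroring lib/naivebayes/classifier.rb from this diff (assumption:
# used here only so the example is self-contained; the gem's own class is the
# reference).
module NaiveBayes
  class Classifier
    def initialize(params)
      @frequency_table = {}
      @word_table = {}
      @instance_count_of = Hash.new(0)
      @total_count = 0
      @model = params[:model]
    end

    # label: class name; attributes: hash of word => frequency
    def train(label, attributes)
      @frequency_table[label] ||= Hash.new(0)
      attributes.each do |word, frequency|
        # multinomial adds the given frequency; any other model adds 1
        @frequency_table[label][word] += (@model == "multinomial" ? frequency : 1)
        @word_table[word] = 1
      end
      @instance_count_of[label] += 1
      @total_count += 1
    end

    # Returns a hash of label => normalized posterior probability.
    def classify(attributes)
      joint_of = {}
      @frequency_table.each_key do |label|
        prior = @instance_count_of[label].to_f / @total_count
        likelihood = 1.0
        @word_table.each_key do |word|
          word_prob = (@frequency_table[label][word] + 1).to_f /
                      (@instance_count_of[label] + @word_table.size)
          likelihood *= attributes.key?(word) ? word_prob : (1 - word_prob)
        end
        joint_of[label] = prior * likelihood
      end
      evidence = joint_of.values.sum
      joint_of.transform_values { |value| value / evidence }
    end
  end
end

classifier = NaiveBayes::Classifier.new(:model => "multinomial")
classifier.train("positive", {"aaa" => 0, "bbb" => 1})
classifier.train("negative", {"ccc" => 2, "ddd" => 3})

result = classifier.classify({"aaa" => 1, "bbb" => 1})
best_label, best_score = result.max_by { |_label, score| score }
puts "#{best_label}: #{best_score.round(4)}"
```

Note that `classify` returns the full posterior hash rather than a single label, so picking the winner (here with `max_by`) is left to the caller.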
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+ # -*- coding: utf-8 -*-
3
+
4
+ module NaiveBayes
5
+ VERSION = "0.0.1"
6
+ require File.dirname(__FILE__) + "/naivebayes/classifier"
7
+ end
@@ -0,0 +1,58 @@
1
+ #!/usr/bin/env ruby
2
+ # -*- coding: utf-8 -*-
3
+
4
+ module NaiveBayes
5
+ class Classifier
6
+ def initialize(params)
7
+ @frequency_table = Hash.new
8
+ @word_table = Hash.new
9
+ @instance_count_of = Hash.new(0)
10
+ @total_count = 0
11
+ @model = params[:model]
12
+ end
13
+
14
+ def train(label, attributes)
15
+ unless @frequency_table.has_key?(label)
16
+ @frequency_table[label] = Hash.new(0)
17
+ end
18
+ attributes.each {|word, frequency|
19
+ if @model == "multinomial"
20
+ @frequency_table[label][word] += frequency
21
+ else
22
+ @frequency_table[label][word] += 1
23
+ end
24
+ @word_table[word] = 1
25
+ }
26
+ @instance_count_of[label] += 1
27
+ @total_count += 1
28
+ end
29
+
30
+ def classify(attributes)
31
+ class_prior_of = Hash.new(1)
32
+ likelihood_of = Hash.new(1)
33
+ class_posterior_of = Hash.new(1)
34
+ evidence = 0
35
+ @instance_count_of.each {|label, freq|
36
+ class_prior_of[label] = freq.to_f / @total_count.to_f
37
+ }
38
+ @frequency_table.each_key {|label|
39
+ likelihood_of[label] = 1
40
+ @word_table.each_key {|word|
41
+ laplace_word_likelihood = (@frequency_table[label][word] + 1).to_f /
42
+ (@instance_count_of[label] + @word_table.size()).to_f
43
+ if attributes.has_key?(word)
44
+ likelihood_of[label] *= laplace_word_likelihood
45
+ else
46
+ likelihood_of[label] *= (1 - laplace_word_likelihood)
47
+ end
48
+ }
49
+ class_posterior_of[label] = class_prior_of[label] * likelihood_of[label]
50
+ evidence += class_posterior_of[label]
51
+ }
52
+ class_posterior_of.each {|label, posterior|
53
+ class_posterior_of[label] = posterior / evidence
54
+ }
55
+ return class_posterior_of
56
+ end
57
+ end
58
+ end
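
To make the arithmetic in `classify` easier to follow, the sketch below recomputes one case by hand: the Bernoulli model trained with the spec's `train_by_2` data and queried with `aaa` and `bbb`. Each word likelihood is `(count + 1) / (documents_for_label + vocabulary_size)`, multiplied in directly when the word is present in the query and as its complement when absent. The variable names (`vocabulary`, `counts`, `docs_per_label`, `query`) are illustrative and do not exist in the gem.

```ruby
# Hand computation of the Bernoulli posterior for the train_by_2 spec case.
# All names here are hypothetical; only the formula follows Classifier#classify.
vocabulary = %w[aaa bbb ccc ddd]

# In the Bernoulli branch every trained word adds 1, regardless of its frequency.
counts = {
  "positive" => {"aaa" => 1, "bbb" => 1},
  "negative" => {"ccc" => 1, "ddd" => 1}
}
docs_per_label = {"positive" => 1, "negative" => 1}
total_docs = 2
query = %w[aaa bbb]

joint = counts.each_with_object({}) do |(label, word_counts), acc|
  prior = docs_per_label[label].to_f / total_docs
  likelihood = vocabulary.reduce(1.0) do |product, word|
    # Laplace-smoothed estimate, as in the classifier above
    word_prob = (word_counts.fetch(word, 0) + 1).to_f /
                (docs_per_label[label] + vocabulary.size)
    product * (query.include?(word) ? word_prob : 1 - word_prob)
  end
  acc[label] = prior * likelihood
end

# Normalize by the evidence so the posteriors sum to 1.
evidence = joint.values.sum
posterior = joint.transform_values { |value| value / evidence }
puts posterior
```

For the positive label this works out to 0.5 × 0.4 × 0.4 × 0.8 × 0.8 = 0.0512, against 0.5 × 0.2 × 0.2 × 0.6 × 0.6 = 0.0072 for negative, which gives the roughly 0.877 / 0.123 split expected by the spec.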
@@ -0,0 +1,65 @@
1
+ # Generated by jeweler
2
+ # DO NOT EDIT THIS FILE DIRECTLY
3
+ # Instead, edit Jeweler::Tasks in Rakefile, and run 'rake gemspec'
4
+ # -*- encoding: utf-8 -*-
5
+
6
+ Gem::Specification.new do |s|
7
+ s.name = "naivebayes"
8
+ s.version = "0.0.1"
9
+
10
+ s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
11
+ s.authors = ["id774"]
12
+ s.date = "2013-07-07"
13
+ s.description = "Naive Bayes classifier"
14
+ s.email = "idnanashi@gmail.com"
15
+ s.extra_rdoc_files = [
16
+ "README.md"
17
+ ]
18
+ s.files = [
19
+ "Gemfile",
20
+ "README.md",
21
+ "Rakefile",
22
+ "VERSION",
23
+ "doc/AUTHORS",
24
+ "doc/COPYING",
25
+ "doc/COPYING.LESSER",
26
+ "doc/ChangeLog",
27
+ "doc/LICENSE",
28
+ "doc/README",
29
+ "lib/naivebayes.rb",
30
+ "lib/naivebayes/classifier.rb",
31
+ "naivebayes.gemspec",
32
+ "script/build",
33
+ "spec/lib/naivebayes/classifier_spec.rb",
34
+ "spec/lib/naivebayes_spec.rb",
35
+ "spec/spec_helper.rb",
36
+ "vendor/.gitkeep"
37
+ ]
38
+ s.homepage = "http://github.com/id774/naivebayes"
39
+ s.licenses = ["GPL"]
40
+ s.require_paths = ["lib"]
41
+ s.rubygems_version = "2.0.3"
42
+ s.summary = "naivebayes"
43
+
44
+ if s.respond_to? :specification_version then
45
+ s.specification_version = 4
46
+
47
+ if Gem::Version.new(Gem::VERSION) >= Gem::Version.new('1.2.0') then
48
+ s.add_development_dependency(%q<cucumber>, [">= 0"])
49
+ s.add_development_dependency(%q<bundler>, ["~> 1.3.5"])
50
+ s.add_development_dependency(%q<builder>, ["~> 3.1.0"])
51
+ s.add_development_dependency(%q<jeweler>, [">= 0"])
52
+ else
53
+ s.add_dependency(%q<cucumber>, [">= 0"])
54
+ s.add_dependency(%q<bundler>, ["~> 1.3.5"])
55
+ s.add_dependency(%q<builder>, ["~> 3.1.0"])
56
+ s.add_dependency(%q<jeweler>, [">= 0"])
57
+ end
58
+ else
59
+ s.add_dependency(%q<cucumber>, [">= 0"])
60
+ s.add_dependency(%q<bundler>, ["~> 1.3.5"])
61
+ s.add_dependency(%q<builder>, ["~> 3.1.0"])
62
+ s.add_dependency(%q<jeweler>, [">= 0"])
63
+ end
64
+ end
65
+
@@ -0,0 +1,32 @@
1
+ #!/bin/sh
2
+ #
3
+ ########################################################################
4
+ # Integration Build Script
5
+ #
6
+ # Maintainer: id774 <idnanashi@gmail.com>
7
+ #
8
+ # v1.2 4/17,2013
9
+ # Using simplecov for coverage.
10
+ # v1.1 3/14,2013
11
+ # Show ruby version.
12
+ # v1.0 3/16,2012
13
+ # First.
14
+ ########################################################################
15
+
16
+ kickstart() {
17
+ export RACK_ROOT="."
18
+ export RACK_ENV="test"
19
+ ruby -v
20
+ }
21
+
22
+ run_tests() {
23
+ rake simplecov
24
+ }
25
+
26
+ main() {
27
+ kickstart
28
+ run_tests
29
+ }
30
+
31
+ set -ex
32
+ main
@@ -0,0 +1,161 @@
1
+ #!/usr/bin/env ruby
2
+ # -*- coding: utf-8 -*-
3
+
4
+ require File.dirname(__FILE__) + '/../../spec_helper'
5
+
6
+ def train_by_2
7
+ @classifier.train("positive", {"aaa" => 0, "bbb" => 1})
8
+ @classifier.train("negative", {"ccc" => 2, "ddd" => 3})
9
+ end
10
+
11
+ def train_by_3
12
+ @classifier.train("positive", {"aaa" => 2, "bbb" => 1})
13
+ @classifier.train("negative", {"ccc" => 2, "ddd" => 2})
14
+ @classifier.train("neutral", {"eee" => 3, "fff" => 3})
15
+ end
16
+
17
+ describe NaiveBayes::Classifier, 'Naive Bayes' do
18
+ context 'with the multivariate Bernoulli model' do
19
+ describe 'given input expected to be positive, with 2 training examples' do
20
+ it 'returns positive' do
21
+ @classifier = NaiveBayes::Classifier.new(:model => "bernoulli")
22
+ train_by_2
23
+ expect = {
24
+ "positive" => 0.8767123287671234,
25
+ "negative" => 0.12328767123287669
26
+ }
27
+ result = @classifier.classify({"aaa" => 1, "bbb" => 1})
28
+ result.should == expect
29
+ end
30
+ end
31
+ describe 'given input expected to be negative, with 2 training examples' do
32
+ it 'returns negative' do
33
+ @classifier = NaiveBayes::Classifier.new(:model => "bernoulli")
34
+ train_by_2
35
+ expect = {
36
+ "positive" => 0.12328767123287668,
37
+ "negative" => 0.8767123287671234
38
+ }
39
+ result = @classifier.classify({"ccc" => 3, "ddd" => 3})
40
+ result.should == expect
41
+ end
42
+ end
43
+ end
44
+ end
45
+
46
+ describe NaiveBayes::Classifier, 'Naive Bayes' do
47
+ context 'with the multinomial model' do
48
+ describe 'given input expected to be positive, with 2 training examples' do
49
+ it 'returns positive' do
50
+ @classifier = NaiveBayes::Classifier.new(:model => "multinomial")
51
+ train_by_2
52
+ expect = {
53
+ "positive" => 0.9411764705882353,
54
+ "negative" => 0.05882352941176469
55
+ }
56
+ result = @classifier.classify({"aaa" => 1, "bbb" => 1})
57
+ result.should == expect
58
+ end
59
+ end
60
+ describe 'given input expected to be negative, with 2 training examples' do
61
+ it 'returns negative' do
62
+ @classifier = NaiveBayes::Classifier.new(:model => "multinomial")
63
+ train_by_2
64
+ expect = {
65
+ "positive" => 0.0588235294117647,
66
+ "negative" => 0.9411764705882353
67
+ }
68
+ result = @classifier.classify({"ccc" => 3, "ddd" => 3})
69
+ result.should == expect
70
+ end
71
+ end
72
+ end
73
+ end
74
+
75
+ describe NaiveBayes::Classifier, 'Naive Bayes' do
76
+ context 'with the multivariate Bernoulli model' do
77
+ describe 'given input expected to be positive, with 3 training examples' do
78
+ it 'returns positive' do
79
+ @classifier = NaiveBayes::Classifier.new(:model => "bernoulli")
80
+ train_by_3
81
+ expect = {
82
+ "positive" => 0.7422680412371133,
83
+ "negative" => 0.12886597938144329,
84
+ "neutral" => 0.12886597938144329
85
+ }
86
+ result = @classifier.classify({"aaa" => 1, "bbb" => 1})
87
+ result.should == expect
88
+ end
89
+ end
90
+ describe 'given input expected to be negative, with 3 training examples' do
91
+ it 'returns negative' do
92
+ @classifier = NaiveBayes::Classifier.new(:model => "bernoulli")
93
+ train_by_3
94
+ expect = {
95
+ "positive" => 0.12886597938144329,
96
+ "negative" => 0.7422680412371133,
97
+ "neutral" => 0.12886597938144329
98
+ }
99
+ result = @classifier.classify({"ccc" => 3, "ddd" => 2})
100
+ result.should == expect
101
+ end
102
+ end
103
+ describe 'given input expected to be neutral, with 3 training examples' do
104
+ it 'returns neutral' do
105
+ @classifier = NaiveBayes::Classifier.new(:model => "bernoulli")
106
+ train_by_3
107
+ expect = {
108
+ "positive" => 0.2272727272727273,
109
+ "negative" => 0.22727272727272724,
110
+ "neutral" => 0.5454545454545455
111
+ }
112
+ result = @classifier.classify({"aaa" => 1, "ddd" => 2, "eee" => 3, "fff" => 1})
113
+ result.should == expect
114
+ end
115
+ end
116
+ end
117
+ end
118
+
119
+ describe NaiveBayes::Classifier, 'Naive Bayes' do
120
+ context 'with the multinomial model' do
121
+ describe 'given input expected to be positive, with 3 training examples' do
122
+ it 'returns positive' do
123
+ @classifier = NaiveBayes::Classifier.new(:model => "multinomial")
124
+ train_by_3
125
+ expect = {
126
+ "positive" => 0.896265560165975,
127
+ "negative" => 0.06639004149377592,
128
+ "neutral" => 0.03734439834024896
129
+ }
130
+ result = @classifier.classify({"aaa" => 1, "bbb" => 1})
131
+ result.should == expect
132
+ end
133
+ end
134
+ describe 'given input expected to be negative, with 3 training examples' do
135
+ it 'returns negative' do
136
+ @classifier = NaiveBayes::Classifier.new(:model => "multinomial")
137
+ train_by_3
138
+ expect = {
139
+ "positive" => 0.05665722379603399,
140
+ "negative" => 0.9178470254957508,
141
+ "neutral" => 0.0254957507082153
142
+ }
143
+ result = @classifier.classify({"ccc" => 3, "ddd" => 2})
144
+ result.should == expect
145
+ end
146
+ end
147
+ describe 'given input expected to be neutral, with 3 training examples' do
148
+ it 'returns neutral' do
149
+ @classifier = NaiveBayes::Classifier.new(:model => "multinomial")
150
+ train_by_3
151
+ expect = {
152
+ "positive" => 0.12195121951219513,
153
+ "negative" => 0.09756097560975606,
154
+ "neutral" => 0.7804878048780488
155
+ }
156
+ result = @classifier.classify({"aaa" => 1, "ddd" => 2, "eee" => 3, "fff" => 1})
157
+ result.should == expect
158
+ end
159
+ end
160
+ end
161
+ end
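
The specs above compare posterior hashes with exact float equality (`result.should == expect`), which can be sensitive to evaluation order and platform. One possible alternative, sketched here in plain Ruby, is to compare within a small tolerance. The helper `posteriors_close?` is hypothetical and not part of the gem; inside RSpec itself, the `be_within(delta).of(value)` matcher serves a similar purpose for individual values.

```ruby
# Hypothetical helper (not part of the gem): checks that two
# label => probability hashes have the same labels and that every
# probability agrees within an absolute tolerance.
def posteriors_close?(actual, expected, tolerance = 1e-9)
  return false unless actual.keys.sort == expected.keys.sort

  expected.all? { |label, value| (actual[label] - value).abs <= tolerance }
end

# Expected values taken from the first Bernoulli spec above.
expected = {
  "positive" => 0.8767123287671234,
  "negative" => 0.12328767123287669
}

# The same posteriors recomputed from the unnormalized products
# 0.0512 (positive) and 0.0072 (negative), whose sum is 0.0584.
recomputed = {
  "positive" => 0.0512 / 0.0584,
  "negative" => 0.0072 / 0.0584
}

puts posteriors_close?(recomputed, expected)
```

The default tolerance of `1e-9` is an arbitrary choice for this sketch; a looser or tighter bound may suit other data.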