NaiveText 0.4.1 → 0.4.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: c77da3eeff67ccae563debe023d1b7381b270098
4
- data.tar.gz: 818c857fbb2366d14bc599e2a074eead39d840b7
3
+ metadata.gz: 99c7ed0d2ea0ab00ce13e284e537474e1bd48b5a
4
+ data.tar.gz: 1feb4b6118ccfde5c54daca91c1386fbdcfe8e1b
5
5
  SHA512:
6
- metadata.gz: ebf2a4a647463099cc87d1ce3266765a4fcb2f6a770a604ca5293755851ac43dc938d13bc000f7604ee7892ca9974a36659985f433b5e1c29adf5494d36b1c70
7
- data.tar.gz: 48f1e633ff228d192dfad74bc8f0c1cb4ada96fdac94a1979ef491d63613ab6b484f30079b2256d5c178b398b1d16b9779231b6ccfa33343b6404b20f7d2ea7f
6
+ metadata.gz: 810c0f40cdd3852010a8bdbc831c6b8591ea47ddb0ea7154e64899c682cbbf875a7b17b4d0676a809a080f14f9d69810cbe4798c2540b57c2ae0851224458365
7
+ data.tar.gz: a28bd4c31239537888d85cca8020ceca2506886a0e4d6bf6f57e78ed383e89bf5adebfebf56bdeda39649c4b5dfd01e8f43d6833101659c6cbc06921a0cd6480
data/CHANGELOG.md CHANGED
@@ -3,6 +3,7 @@ All notable changes to this project will be documented in this file.
3
3
  This project adheres to [Semantic Versioning](http://semver.org/).
4
4
 
5
5
  ## [Unreleased]
6
+ -Fixed a typo in the interface of TextClassifier propabilities --> probabilities. Deprecated the old version.
6
7
 
7
8
  ## [0.4.1] - 2015-10-29
8
9
  ### Added
data/README.md CHANGED
@@ -2,8 +2,8 @@
2
2
 
3
3
  NaiveText is a text classifier gem written in ruby and made to be easily integratable in your Rails app.
4
4
 
5
- 1. What does it do?
6
- ----
5
+ ## What is it good for?
6
+
7
7
  Text classifier are used in many areas of IT. The filter spam, predict what a user wants to buy, detect which language a text is written in, ...
8
8
 
9
9
  The kind of classifier included in NaiveText, uses existing text examples (junk-makrde e-mails, allready bought products, texts in different languages, ...) to calculate in which category (spam/e-mail, interesting_product/not_interesting_product, ...) a unknown text belongs.
@@ -35,13 +35,10 @@ You can also use local files as examples (via ExamplesFactory.from_files('path/t
35
35
 
36
36
  Lets pretend you write some kind of forum. A user can write posts and can vote them up or down.
37
37
 
38
-
39
38
  We will build a system which predicts if a new post is interesting to the user or if this post will bore him a sleep.
40
39
 
41
40
  In your system (an rails app of course) you haven a *Post* model with a text attribute containing the posts content. There are also two scopes on Post: *up_voted* and *down_voted*, which return all up/down voted posts.
42
41
 
43
-
44
-
45
42
  ```ruby
46
43
  require 'NaiveText'
47
44
 
data/lib/NaiveText.rb CHANGED
@@ -2,8 +2,8 @@ require "NaiveText/version"
2
2
  require "NaiveText/Example"
3
3
  require "NaiveText/ExamplesFactory"
4
4
  require "NaiveText/ExamplesGroup"
5
- require "NaiveText/PropabilityCollection"
6
- require "NaiveText/PropabilityCalculator"
5
+ require "NaiveText/ProbabilityCollection"
6
+ require "NaiveText/ProbabilityCalculator"
7
7
  require "NaiveText/TextClassifier"
8
8
  require "NaiveText/Category"
9
9
  require "NaiveText/Categories"
@@ -1,13 +1,13 @@
1
- class PropabilityCalculator
1
+ class ProbabilityCalculator
2
2
  def initialize(args)
3
3
  @categories = args[:categories] || []
4
- @propabilities = PropabilityCollection.new(categories: @categories)
4
+ @probabilities = ProbabilityCollection.new(categories: @categories)
5
5
  end
6
6
 
7
- def get_propabilities_for(text)
7
+ def get_probabilities_for(text)
8
8
  calculateProbabilities(text)
9
- normalize unless @propabilities.sum <= 0
10
- @propabilities
9
+ normalize unless @probabilities.sum <= 0
10
+ @probabilities
11
11
  end
12
12
 
13
13
 
@@ -21,30 +21,30 @@ class PropabilityCalculator
21
21
  end
22
22
 
23
23
  def calculateProbabilities(text)
24
- set_apriori_propabilities
24
+ set_apriori_probabilities
25
25
  list_of_words = text.split(/\W+/)
26
26
  list_of_words.each do |word|
27
27
  @categories.each do |category|
28
- @propabilities.multiply(category: category, factor: protect_factor(category.p(word)) )
28
+ @probabilities.multiply(category: category, factor: protect_factor(category.p(word)) )
29
29
  end
30
30
  end
31
31
  remove_minimum(text)
32
32
  end
33
33
 
34
- def set_apriori_propabilities
34
+ def set_apriori_probabilities
35
35
  @categories.each do |category|
36
- @propabilities.set(category: category, value: p_apriori(category))
36
+ @probabilities.set(category: category, value: p_apriori(category))
37
37
  end
38
38
  end
39
39
 
40
40
  def remove_minimum(text)
41
41
  times = text.split(/\W+/).length
42
- @propabilities.greater_then(minimum**times)
42
+ @probabilities.greater_then(minimum**times)
43
43
  end
44
44
 
45
45
  def normalize
46
- normalization_factor = 1.to_f / @propabilities.sum
47
- @propabilities.multiply(factor: normalization_factor)
46
+ normalization_factor = 1.to_f / @probabilities.sum
47
+ @probabilities.multiply(factor: normalization_factor)
48
48
  end
49
49
 
50
50
  def p_apriori(category)
@@ -1,35 +1,35 @@
1
- class PropabilityCollection
1
+ class ProbabilityCollection
2
2
  def initialize(args)
3
3
  @categories = args[:categories] || []
4
4
  initialize_ids
5
- @propabilities = []
6
- initalize_propabilities(@ids)
5
+ @probabilities = []
6
+ initalize_probabilities(@ids)
7
7
  end
8
8
 
9
9
  def find(category)
10
- return @propabilities[category.id]
10
+ return @probabilities[category.id]
11
11
  end
12
12
 
13
13
 
14
14
  def set(args)
15
15
  category = args[:category]
16
16
  value = args[:value]
17
- @propabilities[category.id] = value
17
+ @probabilities[category.id] = value
18
18
  end
19
19
 
20
20
  def multiply(args)
21
21
  category = args[:category]
22
22
  factor = args[:factor]
23
23
  if category
24
- @propabilities[category.id] *= factor
24
+ @probabilities[category.id] *= factor
25
25
  else
26
- @propabilities.map! {|el| el*factor}
26
+ @probabilities.map! {|el| el*factor}
27
27
  end
28
28
  end
29
29
 
30
30
  def category_with_max
31
- if @propabilities.max > 0
32
- id = @propabilities.find_index(@propabilities.max)
31
+ if @probabilities.max > 0
32
+ id = @probabilities.find_index(@probabilities.max)
33
33
  @categories.find {|category| category.id == id}
34
34
  else
35
35
  NullCategory.new
@@ -37,11 +37,11 @@ class PropabilityCollection
37
37
  end
38
38
 
39
39
  def max
40
- @propabilities.max
40
+ @probabilities.max
41
41
  end
42
42
 
43
43
  def greater_then(value)
44
- @propabilities.map! do |p|
44
+ @probabilities.map! do |p|
45
45
  if p > value
46
46
  p
47
47
  else
@@ -51,7 +51,7 @@ class PropabilityCollection
51
51
  end
52
52
 
53
53
  def sum
54
- @propabilities.reduce(:+)
54
+ @probabilities.reduce(:+)
55
55
  end
56
56
 
57
57
  def to_s
@@ -70,9 +70,9 @@ class PropabilityCollection
70
70
  @ids = @categories.map { |category| category.id }
71
71
  end
72
72
 
73
- def initalize_propabilities(ids)
73
+ def initalize_probabilities(ids)
74
74
  ids.max.times do
75
- @propabilities << 0
75
+ @probabilities << 0
76
76
  end
77
77
  end
78
78
  end
@@ -2,20 +2,25 @@ class TextClassifier
2
2
  attr_reader :categories
3
3
  def initialize( args )
4
4
  @categories = args[:categories]
5
- @calculator = args[:calculator] || PropabilityCalculator.new(categories: @categories)
5
+ @calculator = args[:calculator] || ProbabilityCalculator.new(categories: @categories)
6
6
  end
7
7
 
8
8
  def classify(text)
9
9
  get_category_for(text)
10
10
  end
11
11
 
12
+ def probabilities(text)
13
+ @calculator.get_probabilities_for(text)
14
+ end
15
+
12
16
  def propabilities(text)
13
- @calculator.get_propabilities_for(text)
17
+ puts "This notation is deprecated in will be removed in later versions. Please use probabilities (4th character b instead of p)"
18
+ probabilities(text)
14
19
  end
15
20
 
16
21
  private
17
22
  def get_category_for(text)
18
- propabilities = @calculator.get_propabilities_for(text)
19
- propabilities.category_with_max
23
+ probabilities = @calculator.get_probabilities_for(text)
24
+ probabilities.category_with_max
20
25
  end
21
26
  end
@@ -1,3 +1,3 @@
1
1
  module NaiveText
2
- VERSION = "0.4.1"
2
+ VERSION = "0.4.2"
3
3
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: NaiveText
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.1
4
+ version: 0.4.2
5
5
  platform: ruby
6
6
  authors:
7
7
  - RicciFlowing
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2015-10-29 00:00:00.000000000 Z
11
+ date: 2015-11-02 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -107,8 +107,8 @@ files:
107
107
  - lib/NaiveText/Example.rb
108
108
  - lib/NaiveText/ExamplesFactory.rb
109
109
  - lib/NaiveText/ExamplesGroup.rb
110
- - lib/NaiveText/PropabilityCalculator.rb
111
- - lib/NaiveText/PropabilityCollection.rb
110
+ - lib/NaiveText/ProbabilityCalculator.rb
111
+ - lib/NaiveText/ProbabilityCollection.rb
112
112
  - lib/NaiveText/TextClassifier.rb
113
113
  - lib/NaiveText/version.rb
114
114
  homepage: https://github.com/RicciFlowing/NaiveText