NaiveText 0.5.1 → 0.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: ec519a11d1de21e62b71bb470ef0097ed62910f8
- data.tar.gz: bc96077d8ed6fa7890692466a2954425a8719540
+ metadata.gz: 3b04e3a990ab60596a6e4067f3e6e6b7b762e9e7
+ data.tar.gz: 95cefeef5c2030e33c7290eecb848ec85e3a4d86
  SHA512:
- metadata.gz: 15446a47f72ed08af4c32924f5b7b246878bdc184f21a9616259a346c010baba9f25ca426a22f60bb29ff5d9e94b9cd311c41221fdb728542d794fa7777f5c01
- data.tar.gz: bfb9e832ddd6a8f3f7aebc8cc47c949f7aac93d1a718d6db461c827e247575c937ddf4ae23be3b89d6d2c0617b93d8a5f5ed1a830fc59b1f440c0bbd770e1e3b
+ metadata.gz: d4b7734d40ca51cb0af57485ca7312007ba2ef0982f471cd3d95c000e488ea1d526bc6a03a84d52cfc1eeb41a3dc0793c986e7d9be49424ead2811042f0b8ce5
+ data.tar.gz: aed39b603081561255c043fbd61d9de06e0e91a14a628e1b324589e8eb0f6d4d3428248b9e18c6f35bf79a21852f7a121256a8fa16530f0774960526eeab3deb
data/CHANGELOG.md CHANGED
@@ -2,6 +2,10 @@
  All notable changes to this project will be documented in this file.
  This project adheres to [Semantic Versioning](http://semver.org/).
 
+ ## [0.6.0]- 2015-11-30
+ ### Added
+ - Added optional language_model, that make it possible to compare words based on the word stem. (Like 'testing', 'tests', 'tested' all matched with the stem 'test')
+
  ## [0.5.1] - 2015-11-21
  ### Added
  - Added optional default category. This category will be returned from NaiveText.build if the algorithm can't find a match with the existing text examples. Default value is NullCategory.
data/Gemfile CHANGED
@@ -2,3 +2,8 @@ source 'https://rubygems.org'
 
  # Specify your gem's dependencies in NaiveText.gemspec
  gemspec
+
+
+ spec.add_development_dependency "guard"
+ spec.add_development_dependency "guard-rspec"
+ spec.add_development_dependency "guard-rubocop"
data/NaiveText.gemspec CHANGED
@@ -19,13 +19,12 @@ Gem::Specification.new do |spec|
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]
 
+ spec.required_ruby_version = '>= 2.0.0'
+
  if spec.respond_to?(:metadata)
  spec.metadata['allowed_push_host'] = "https://rubygems.org"
  end
 
  spec.add_development_dependency "bundler", "~> 1.8"
  spec.add_development_dependency "rake", "~> 10.0"
- spec.add_development_dependency "guard"
- spec.add_development_dependency "guard-rspec"
- spec.add_development_dependency "guard-rubocop"
  end
data/README.md CHANGED
@@ -6,7 +6,7 @@ NaiveText is a text classifier gem written in ruby and made to be easily integra
 
  Text classifier are used in many areas of IT. The filter spam, predict what a user wants to buy, detect which language a text is written in, ...
 
- The kind of classifier included in NaiveText, uses existing text examples (junk-makrde e-mails, allready bought products, texts in different languages, ...) to calculate in which category (spam/e-mail, interesting_product/not_interesting_product, ...) a unknown text belongs.
+ The kind of classifier included in NaiveText, uses existing text examples (junk-makrde e-mails, already bought products, texts in different languages, ...) to calculate in which category (spam/e-mail, interesting_product/not_interesting_product, ...) a unknown text belongs.
 
  ## Installation
 
@@ -31,32 +31,9 @@ You can also use local files as examples (via ExamplesFactory.from_files('path/t
 
 
 
- ### Example
+ ## Example
 
- Lets pretend you write some kind of forum. A user can write posts and can vote them up or down.
-
- We will build a system which predicts if a new post is interesting to the user or if this post will bore him a sleep.
-
- In your system (an rails app of course) you haven a *Post* model with a text attribute containing the posts content. There are also two scopes on Post: *up_voted* and *down_voted*, which return all up/down voted posts.
-
- ```ruby
- require 'NaiveText'
-
- interesting_examples = Post.up_voted
- boring_examples = Post.down_voted
-
- categories = [{name: 'interesting', examples: interesting_examples},
- {name: 'boring', examples: boring_examples}];
-
- classifier = NaiveText.build(categories: categories)
-
- category = classifier.classify(new_interesting_post.text)
- category.name
- => 'interesting'
- ```
- Checkout the full example and some more in the
- [NaiveText-example repo](https://github.com/RicciFlowing/NaiveText-examples).
- Have fun using it!
+ Can be found on the projects [homepage](https://ricciflowing.github.io/NaiveText/).
 
  ## Contributing
 
@@ -3,7 +3,7 @@ class CategoriesFactory
  categories = []
  default = nil
  if config.is_a?(Array)
- puts "The format [{name: name_of_category, path: path_to_trainings_data}] is deprecated and will be removed in future versions. Use the following arguments instead: categories: [name: 'the name', examples:'An example']"
+ puts "The format [{name: name_of_category, path: path_to_trainings_data}] is deprecated and will be removed in version 1.0.0 (due in Jan. 2016). Use the following arguments instead: categories: [name: 'the name', examples:'An example']"
  config.each do |category_config|
  begin
  examples = ExamplesFactory.from_files(category_config[:path])
@@ -20,7 +20,7 @@ class CategoriesFactory
  else
  config[:categories].each do |category_config|
  begin
- group = ExamplesGroup.new(examples: category_config[:examples])
+ group = ExamplesGroup.new(examples: category_config[:examples], language_model: config[:language_model] )
  category = Category.new(name: category_config[:name], examples: group, weight: category_config[:weight])
  categories << category
  if category_config[:name] == config[:default]
@@ -7,7 +7,7 @@ class ExamplesFactory
  examples.push FileExample.new(path: dir_path+'/'+file_path)
  end
  rescue
- puts "Failed laoding" + dir_path
+ puts "Failed loading" + dir_path
  end
  examples
  end
@@ -1,6 +1,7 @@
  class ExamplesGroup
  def initialize(args)
- @examples = args[:examples].to_a || []
+ @examples = args[:examples].to_a || []
+ @language_model = args[:language_model] || lambda {|str| str}
  load_text
  split_text_into_words
  format_words
@@ -10,7 +11,7 @@ class ExamplesGroup
  end
 
  def count(word)
- @words.count(word.downcase)
+ @words.count(@language_model.call(word.downcase))
  end
 
  def word_count
@@ -32,5 +33,7 @@ class ExamplesGroup
 
  def format_words
  @words.map! {|word| word.downcase}
+ @words.map! {|word| @language_model.call(word)}
+ @words
  end
  end
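
The language_model hook in the ExamplesGroup changes above is any object that responds to call and maps one word to its normalised form. Both the stored words and the word passed to count go through it, so inflected forms collapse onto the same key. The following is a minimal, self-contained sketch of that behaviour, not the gem's code: the class WordBag and the lambda NAIVE_STEMMER are hypothetical names used for illustration, and the suffix stripping is a deliberately crude stand-in for a real stemmer.

```ruby
# Hypothetical stand-in for the gem's ExamplesGroup, reduced to the parts
# touched by this diff. It is NOT the gem's class.
class WordBag
  def initialize(texts:, language_model: nil)
    # Same fallback as in the diff: an identity lambda when no model is given.
    @language_model = language_model || ->(str) { str }
    # Split into words, downcase, then normalise each word through the model.
    @words = texts.flat_map { |text| text.split(/\W+/) }
                  .reject(&:empty?)
                  .map { |word| @language_model.call(word.downcase) }
  end

  # The query word goes through the same normalisation as the stored words.
  def count(word)
    @words.count(@language_model.call(word.downcase))
  end
end

# Crude suffix stripper, for illustration only (an assumption, not a real
# stemming algorithm): removes a trailing "ing", "ed" or "s".
NAIVE_STEMMER = ->(word) { word.sub(/(ing|ed|s)\z/, '') }

texts   = ['Testing tests', 'tested code']
plain   = WordBag.new(texts: texts)
stemmed = WordBag.new(texts: texts, language_model: NAIVE_STEMMER)

puts plain.count('test')   # prints 0: no stored word is exactly "test"
puts stemmed.count('test') # prints 3: "testing", "tests", "tested" all map to "test"
```

Judging by the CategoriesFactory change, the gem reads the model from the top-level config (config[:language_model]), so it would presumably be passed next to categories: when building a classifier. That call is not shown in this diff.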
@@ -1,3 +1,3 @@
  module NaiveText
- VERSION = "0.5.1"
+ VERSION = "0.6.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: NaiveText
  version: !ruby/object:Gem::Version
- version: 0.5.1
+ version: 0.6.0
  platform: ruby
  authors:
  - RicciFlowing
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2015-11-21 00:00:00.000000000 Z
+ date: 2015-12-01 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler
@@ -38,48 +38,6 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: '10.0'
- - !ruby/object:Gem::Dependency
- name: guard
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- - !ruby/object:Gem::Dependency
- name: guard-rspec
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- - !ruby/object:Gem::Dependency
- name: guard-rubocop
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
  description: NaiveText is a text classifier gem written in ruby and made to be easily
  integratable in your Rails app.
  email:
@@ -124,7 +82,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '0'
+ version: 2.0.0
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="