ankusa 0.1.0 → 0.1.1
- checksums.yaml +7 -0
- data/.gitignore +3 -1
- data/.travis.yml +11 -0
- data/README.rdoc +18 -13
- data/Rakefile +10 -0
- data/ankusa.gemspec +1 -0
- data/lib/ankusa/extensions.rb +0 -4
- data/lib/ankusa/hasher.rb +12 -3
- data/lib/ankusa/naive_bayes.rb +2 -2
- data/lib/ankusa/stopwords.rb +1 -1
- data/lib/ankusa/version.rb +1 -1
- data/test/classifier_base.rb +3 -2
- data/test/file_system_classifier_test.rb +2 -2
- data/test/hasher_test.rb +15 -8
- data/test/mongo_db_classifier_test.rb +3 -3
- metadata +28 -27
- data/.ruby-gemset +0 -1
- data/.ruby-version +0 -1
- data/Gemfile.lock +0 -18
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 30cc949c422944d307a7e7044a61848becabebba
+  data.tar.gz: 0339b15870714fa66bf8632c0c9c66fe0ad7ba62
+SHA512:
+  metadata.gz: a4fd17d83e0c28652b2d94e3e06259abe356f6e276bc09a3317a9d5887a6868be7d986db91fb78eb2135bc8293bb563a01657b81fee8fa60bf08722338c47b71
+  data.tar.gz: 0e34f6961f49e78afe8badcd529a1e86a15d3b041ba156c7d7e6233326dfb54529f88c97229f409da96379587cf4169661737977caf89ccf285e8e6ebec4184f
data/.gitignore
CHANGED
data/.travis.yml
ADDED
data/README.rdoc
CHANGED
@@ -4,18 +4,23 @@
 
 Ankusa is a text classifier in Ruby that can use either Hadoop's HBase, Mongo, or Cassandra for storage. Because it uses HBase/Mongo/Cassandra as a backend, the training corpus can be many terabytes in size (though additional memory and single file storage abilities also exist for smaller corpora).
 
-Ankusa currently provides both a Naive Bayes and Kullback-Leibler divergence classifier. It ignores common words (a.k.a, stop words) and stems all others. Additionally, it uses
+Ankusa currently provides both a Naive Bayes and Kullback-Leibler divergence classifier. It ignores common words (a.k.a, stop words) and stems all others. Additionally, it uses additive smoothing in both classification methods.
 
 == Installation
-
-
+Add this line to your application's Gemfile:
+
+  gem 'ankusa'
+
+Ensure that if you're using the HBase, Cassandra, or MongoDB backends that you also add the correct dependency gem to your Gemfile:
+
+  gem 'hbaserb'
   # or
-  gem
+  gem 'cassandra'
   # or
-  gem
+  gem 'mongo'
+
+If you're using HBase, make sure the HBase Thrift interface has been started as well.
 
-If you're using HBase, make sure the HBase Thrift interface has been started as well. Then:
-  gem install ankusa
 
 == Basic Usage
 Using the naive Bayes classifier:
@@ -57,7 +62,6 @@ There is a Kullback–Leibler divergence classifier as well. KL divergence is a
 
 The API is the same as the NaiveBayesClassifier, except rather than calling "classifications" if you want actual numbers you call "distances".
 
-  require 'rubygems'
   require 'ankusa'
   require 'ankusa/hbase_storage'
 
@@ -87,10 +91,12 @@ The API is the same as the NaiveBayesClassifier, except rather than calling "cla
 Ankusa has a generalized storage interface that has been implemented for HBase, Cassandra, Mongo, single file, and in-memory storage.
 
 Memory storage can be used when you have a very small corpora
+
   require 'ankusa/memory_storage'
   storage = Ankusa::MemoryStorage.new
 
 FileSystem storage can be used when you have a very small corpora and want to persist the classification results.
+
   require 'ankusa/file_system_storage'
   storage = Ankusa::FileSystemStorage.new '/path/to/file'
   # Do classification ...
@@ -99,6 +105,7 @@ FileSystem storage can be used when you have a very small corpora and want to pe
 The FileSystem storage does NOT save to the filesystem automatically, the #save method must be invoked to save and persist the results
 
 HBase storage:
+
   require 'ankusa/hbase_storage'
   # defaults: host='localhost', port=9090, frequency_tablename="ankusa_word_frequencies", summary_tablename="ankusa_summary"
   storage = Ankusa::HBaseStorage.new host, port, frequency_tablename, summary_tablename
@@ -109,17 +116,18 @@ For Cassandra storage:
 * Prior to using the Cassandra storage you will need to run the following command from the cassandra-cli: "create keyspace ankusa with replication_factor = 1". This should be fixed with a new release candidate for Cassandra.
 
 To use the Cassandra storage class:
+
   require 'ankusa/cassandra_storage'
   # defaults: host='127.0.0.1', port=9160, keyspace = 'ankusa', max_classes = 100
   storage = Ankusa::CassandraStorage.new host, port, keyspace, max_classes
 
 For MongoDB storage:
+
   require 'ankusa/mongo_db_storage'
   storage = Ankusa::MongoDbStorage.new :host => "localhost", :port => 27017, :db => "ankusa"
   # defaults: :host => "localhost", :port => 27017, :db => "ankusa"
   # no default username or password
-  #
-
+  # you can also use frequency_tablename and summary_tablename options
 
 == Running Tests
 You can run the tests for any of the four storage methods. For instance, for memory storage:
@@ -133,6 +141,3 @@ For the other methods you will need to edit the file test/config.yml and set the
   rake test_filesystem
   #or
   rake test_mongo_db
-
-
-
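The README hunk above says both classifiers use additive smoothing but does not give a formula. As a rough illustration only, the textbook form of additive (Laplace) smoothing can be sketched as below. The helper name `smoothed_probability` and its parameters are hypothetical and are not part of Ankusa's API; the gem's actual computation may differ.

```ruby
# Hypothetical helper, not Ankusa's implementation: textbook additive
# smoothing of a word's probability within one class.
#   word_count      - times the word was seen in the class
#   total_words     - total word occurrences seen in the class
#   vocabulary_size - number of distinct words known to the classifier
#   alpha           - smoothing constant (1.0 gives Laplace smoothing)
def smoothed_probability(word_count, total_words, vocabulary_size, alpha = 1.0)
  (word_count + alpha) / (total_words + alpha * vocabulary_size)
end

unseen = smoothed_probability(0, 100, 50)   # word never seen in this class
seen   = smoothed_probability(10, 100, 50)  # word seen 10 times

puts unseen > 0       # an unseen word still gets a non-zero probability
puts seen > unseen    # a seen word stays more probable than an unseen one
```

The point of the technique is that a word missing from one class no longer forces that class's probability to zero.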
data/Rakefile
CHANGED
@@ -51,3 +51,13 @@ Rake::TestTask.new("test_mongo_db") { |t|
   t.test_files = FileList['test/hasher_test.rb', 'test/mongo_db_classifier_test.rb']
   t.verbose = true
 }
+
+desc "Run all unit tests in Travis-CI environment"
+Rake::TestTask.new("test_travis_ci") { |t|
+  t.libs += ["lib", "."]
+  t.test_files = FileList['test/hasher_test.rb',
+                          'test/memory_classifier_test.rb',
+                          'test/file_system_classifier_test.rb',
+                          'test/mongo_db_classifier_test.rb']
+  t.verbose = true
+}
data/ankusa.gemspec
CHANGED
@@ -14,6 +14,7 @@ Gem::Specification.new do |s|
   s.require_paths = ["lib"]
   s.add_dependency('fast-stemmer', '>= 1.0.0')
   s.add_development_dependency("rake")
+  s.add_development_dependency("mongo", "= 1.6.0")
   s.requirements << "Either hbaserb >= 0.0.3 or cassandra >= 0.7"
   s.rubyforge_project = "ankusa"
 end
data/lib/ankusa/extensions.rb
CHANGED
data/lib/ankusa/hasher.rb
CHANGED
@@ -3,7 +3,7 @@ require 'ankusa/stopwords'
 
 module Ankusa
 
-  class TextHash < Hash
+  class TextHash < Hash
     attr_reader :word_count
 
     def initialize(text=nil, stem=true)
@@ -19,14 +19,14 @@ module Ankusa
 
     # word should be only alphanum chars at this point
     def self.valid_word?(word)
-      not (Ankusa::STOPWORDS.include?(word) || word.length < 3 ||
+      not (Ankusa::STOPWORDS.include?(word) || word.length < 3 || self.numeric_word?(word))
     end
 
     def add_text(text)
       if text.instance_of? Array
         text.each { |t| add_text t }
       else
-        # replace dashes with spaces, then get rid of non-word/non-space characters,
+        # replace dashes with spaces, then get rid of non-word/non-space characters,
         # then split by space to get words
         words = TextHash.atomize text
         words.each { |word| add_word(word) if TextHash.valid_word?(word) }
@@ -42,6 +42,15 @@ module Ankusa
       key = word.intern
       store key, fetch(key, 0)+1
     end
+
+    # Due to the character filtering that takes place in atomisation
+    # this method should never received something that could be a
+    # negative number, float etc.
+    # Therefore we can dispense with the SLOW Float(word) method and
+    # just do a simple regex.
+    def self.numeric_word?(word)
+      word.match(/[\d]+/)
+    end
   end
 
 end
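One detail of the new `numeric_word?` check is worth spelling out: the pattern `/[\d]+/` is not anchored, so it matches any word that contains a digit, not only words made up entirely of digits. The sketch below contrasts the two behaviours. `contains_digit?` mirrors the regex shown in the diff, while `all_digits?` is a hypothetical stricter variant that does not appear in the gem.

```ruby
# Mirrors the regex from the diff: true for any word containing a digit.
def contains_digit?(word)
  !!word.match(/[\d]+/)
end

# Hypothetical stricter variant (an assumption, not in the gem):
# true only when the whole word consists of digits.
def all_digits?(word)
  !!word.match(/\A\d+\z/)
end

%w[21675 robot14 mother].each do |w|
  puts "#{w}: contains_digit=#{contains_digit?(w)} all_digits=#{all_digits?(w)}"
end
```

With the unanchored form, a token such as `robot14` would be treated as numeric and rejected by `valid_word?`. Whether that is intended is not stated in the diff.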
data/lib/ankusa/naive_bayes.rb
CHANGED
@@ -39,8 +39,8 @@ module Ankusa
       TextHash.new(text).each { |word, count|
         probs = get_word_probs(word, classnames)
         classnames.each { |k|
-          #
-          result[k] += probs[k] > 0 ? (
+          # Choose a really small probability if the word has never been seen before in class k
+          result[k] += Math.log(probs[k] > 0 ? (probs[k] * count) : Float::EPSILON)
         }
       }
 
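The added line accumulates log values and substitutes `Float::EPSILON` when a word has zero probability in a class, which avoids taking `Math.log(0)`. A standalone sketch of that accumulation follows. The method name `log_score` and the sample inputs are made up for illustration; only the expression inside `Math.log` is taken from the diff.

```ruby
# Illustrative only: word_probs and counts are invented inputs, and
# log_score is a hypothetical name. The Math.log expression mirrors the diff.
def log_score(word_probs, counts)
  counts.inject(0.0) do |sum, (word, count)|
    p = word_probs.fetch(word, 0.0)
    # Fall back to a tiny positive value for words never seen in this class,
    # since Math.log(0) would be -Infinity.
    sum + Math.log(p > 0 ? p * count : Float::EPSILON)
  end
end

spam_probs = { :spam => 0.5, :tasty => 0.25 }

puts log_score(spam_probs, { :spam => 2 })               # only seen words
puts log_score(spam_probs, { :spam => 2, :zombie => 1 }) # one unseen word
```

An unseen word contributes a large negative term instead of making the whole score undefined, so classes can still be compared.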
data/lib/ankusa/stopwords.rb
CHANGED
@@ -1,4 +1,4 @@
|
|
1
1
|
module Ankusa
|
2
2
|
# These are taken from MySQL - http://dev.mysql.com/tech-resources/articles/full-text-revealed.html
|
3
|
-
STOPWORDS = %W(a able about above according accordingly across actually after afterwards again against ain't all allow allows almost alone along already also although always am among amongst an and another any anybody anyhow anyone anything anyway anyways anywhere apart appear appreciate appropriate are aren't around as aside ask asking associated at available away awfully be became because become becomes becoming been before beforehand behind being believe below beside besides best better between beyond both brief but by c'mon c's came can can't cannot cant cause causes certain certainly changes clearly co com come comes concerning consequently consider considering contain containing contains corresponding could couldn't course currently definitely described despite did didn't different do does doesn't doing don't done down downwards during each edu eg eight either else elsewhere enough entirely especially et etc even ever every everybody everyone everything everywhere ex exactly example except far few fifth first five followed following follows for former formerly forth four from further furthermore get gets getting given gives go goes going gone got gotten greetings had hadn't happens hardly has hasn't have haven't having he he's hello help hence her here here's hereafter hereby herein hereupon hers herself hi him himself his hither hopefully how howbeit however i'd i'll i'm i've ie if ignored immediate in inasmuch inc indeed indicate indicated indicates inner insofar instead into inward is isn't it it'd it'll it's its itself just keep keeps kept know knows known last lately later latter latterly least less lest let let's like liked likely little look looking looks ltd mainly many may maybe me mean meanwhile merely might more moreover most mostly much must my myself name namely nd near nearly necessary need needs neither never nevertheless new next nine no nobody non none noone nor normally not nothing novel now nowhere obviously of off often oh ok okay old on 
once one ones only onto or other others otherwise ought our ours ourselves out outside over overall own particular particularly per perhaps placed please plus possible presumably probably provides que quite qv rather rd re really reasonably regarding regardless regards relatively respectively right said same saw say saying says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven several shall she should shouldn't since six so some somebody somehow someone something sometime sometimes somewhat somewhere soon sorry specified specify specifying still sub such sup sure t's take taken tell tends th than thank thanks thanx that that's thats the their theirs them themselves then thence there there's thereafter thereby therefore therein theres thereupon these they they'd they'll they're they've think third this thorough thoroughly those though three through throughout thru thus to together too took toward towards tried tries truly try trying twice two un under unfortunately unless unlikely until unto up upon us use used useful uses using usually value various very via viz vs want wants was wasn't way we we'd we'll we're we've welcome well went were weren't what what's whatever when whence whenever where where's whereafter whereas whereby wherein whereupon wherever whether which while whither who who's whoever whole whom whose why will willing wish with within without won't wonder would would wouldn't yes yet you you'd you'll you're you've your yours yourself yourselves zero)
|
3
|
+
STOPWORDS = Set.new(%W(a able about above according accordingly across actually after afterwards again against ain't all allow allows almost alone along already also although always am among amongst an and another any anybody anyhow anyone anything anyway anyways anywhere apart appear appreciate appropriate are aren't around as aside ask asking associated at available away awfully be became because become becomes becoming been before beforehand behind being believe below beside besides best better between beyond both brief but by c'mon c's came can can't cannot cant cause causes certain certainly changes clearly co com come comes concerning consequently consider considering contain containing contains corresponding could couldn't course currently definitely described despite did didn't different do does doesn't doing don't done down downwards during each edu eg eight either else elsewhere enough entirely especially et etc even ever every everybody everyone everything everywhere ex exactly example except far few fifth first five followed following follows for former formerly forth four from further furthermore get gets getting given gives go goes going gone got gotten greetings had hadn't happens hardly has hasn't have haven't having he he's hello help hence her here here's hereafter hereby herein hereupon hers herself hi him himself his hither hopefully how howbeit however i'd i'll i'm i've ie if ignored immediate in inasmuch inc indeed indicate indicated indicates inner insofar instead into inward is isn't it it'd it'll it's its itself just keep keeps kept know knows known last lately later latter latterly least less lest let let's like liked likely little look looking looks ltd mainly many may maybe me mean meanwhile merely might more moreover most mostly much must my myself name namely nd near nearly necessary need needs neither never nevertheless new next nine no nobody non none noone nor normally not nothing novel now nowhere obviously of off often oh ok okay 
old on once one ones only onto or other others otherwise ought our ours ourselves out outside over overall own particular particularly per perhaps placed please plus possible presumably probably provides que quite qv rather rd re really reasonably regarding regardless regards relatively respectively right said same saw say saying says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious seriously seven several shall she should shouldn't since six so some somebody somehow someone something sometime sometimes somewhat somewhere soon sorry specified specify specifying still sub such sup sure t's take taken tell tends th than thank thanks thanx that that's thats the their theirs them themselves then thence there there's thereafter thereby therefore therein theres thereupon these they they'd they'll they're they've think third this thorough thoroughly those though three through throughout thru thus to together too took toward towards tried tries truly try trying twice two un under unfortunately unless unlikely until unto up upon us use used useful uses using usually value various very via viz vs want wants was wasn't way we we'd we'll we're we've welcome well went were weren't what what's whatever when whence whenever where where's whereafter whereas whereby wherein whereupon wherever whether which while whither who who's whoever whole whom whose why will willing wish with within without won't wonder would would wouldn't yes yet you you'd you'll you're you've your yours yourself yourselves zero))
|
4
4
|
end
|
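The only change in this file is wrapping the stop-word list in `Set.new`. `Array#include?` scans the list element by element, while `Set#include?` is a hash lookup, so membership checks in `valid_word?` no longer grow with the length of the list. A minimal sketch with a shortened, made-up word list:

```ruby
require 'set' # included so the snippet runs standalone on older Rubies

# Shortened stand-in for the real stop-word list.
words_array = %w[a able about above according accordingly]
words_set   = Set.new(words_array)

# Both answer the same question; only the lookup cost differs.
puts words_array.include?('about')  # linear scan over the array
puts words_set.include?('about')    # hash-based lookup
puts words_set.include?('mother')   # not a stop word
```

The diff does not show where `set` is required in the gem, so the `require` line here is an assumption made for the example.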
data/lib/ankusa/version.rb
CHANGED
data/test/classifier_base.rb
CHANGED
@@ -46,7 +46,7 @@ module NBClassifierBase
 
     string = "spam is tastey"
 
-    hash = {:spam => 0, :good => 0}
+    hash = {:spam => 0.5, :good => 0.5}
     assert_equal hash, @classifier.classifications(string)
     assert_equal nil, @classifier.classify(string)
   end
@@ -79,7 +79,8 @@ module NBClassifierBase
 
     # test for class we didn't train on
     cs = @classifier.classifications("spam is super tastey if you are a zombie", [:spam, :nothing])
-
+    assert cs[:nothing] < Float::EPSILON
+    assert cs[:nothing] < cs[:spam]
   end
 
   def test_prob_result
data/test/file_system_classifier_test.rb
CHANGED
@@ -15,13 +15,13 @@ module FileSystemClassifierBase
   end
 end
 
-class
+class NBFileSystemClassifierTest < Test::Unit::TestCase
   include FileSystemClassifierBase
   include NBClassifierBase
 end
 
 
-class
+class KLFileSystemClassifierTest < Test::Unit::TestCase
   include FileSystemClassifierBase
   include KLClassifierBase
 end
data/test/hasher_test.rb
CHANGED
@@ -1,13 +1,12 @@
 require File.join File.dirname(__FILE__), 'helper'
 
 class HasherTest < Test::Unit::TestCase
-
+
+  def test_stemming
     string = "Words word a the at fish fishing fishes? /^/ The at a of! @#$!"
     @text_hash = Ankusa::TextHash.new string
     @array = Ankusa::TextHash.new [string]
-  end
 
-  def test_stemming
     assert_equal @text_hash.length, 2
     assert_equal @text_hash.word_count, 5
 
@@ -15,11 +14,19 @@ class HasherTest < Test::Unit::TestCase
     assert_equal @array.word_count, 5
   end
 
+  def test_atomization
+    string = "Hello 123,45 My-name! is Robot14 123.45 @#$!"
+    @array = Ankusa::TextHash.atomize string
+
+    assert_equal %w{hello 123 45 my name is robot14 123 45}, @array
+  end
+
   def test_valid_word
-    assert
-    assert
-    assert Ankusa::TextHash.valid_word?
-    assert Ankusa::TextHash.valid_word?
-    assert
+    assert !Ankusa::TextHash.valid_word?("accordingly")
+    assert !Ankusa::TextHash.valid_word?("appropriate")
+    assert Ankusa::TextHash.valid_word?("^*&@")
+    assert Ankusa::TextHash.valid_word?("mother")
+    assert !Ankusa::TextHash.valid_word?("21675")
+    assert !Ankusa::TextHash.valid_word?("00000")
   end
 end
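The new `test_atomization` case pins down what `TextHash.atomize` is expected to return, but the method body is not part of this diff. The function below is a hypothetical stand-in, written only to reproduce the expected array from the test. It is an assumption about the behaviour, not the gem's implementation.

```ruby
# Hypothetical stand-in for TextHash.atomize (the real implementation is not
# shown in this diff): lowercase the text, turn every run of characters that
# are not letters, digits or whitespace into a space, then split on whitespace.
def atomize_sketch(text)
  text.downcase.gsub(/[^a-z0-9\s]+/, ' ').split
end

# Single quotes keep the trailing '#$!' literal; inside double quotes Ruby
# would treat it as interpolation of the global variable $!.
p atomize_sketch('Hello 123,45 My-name! is Robot14 123.45 @#$!')
```

Under these assumptions, punctuation inside a token (`123,45`, `My-name!`) splits it into separate words, which matches the array asserted in the test.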
data/test/mongo_db_classifier_test.rb
CHANGED
@@ -3,19 +3,19 @@ require 'ankusa/mongo_db_storage'
 
 module MongoDbClassifierBase
   def initialize(name)
-    @storage = Ankusa::MongoDbStorage.new :host => CONFIG['mongo_db_host'], :port => CONFIG['mongo_db_port'],
+    @storage = Ankusa::MongoDbStorage.new :host => CONFIG['mongo_db_host'], :port => CONFIG['mongo_db_port'],
       :username => CONFIG['mongo_db_username'], :password => CONFIG['mongo_db_password'],
       :db => 'ankusa-test'
     super(name)
   end
 end
 
-class
+class NBMongoDBClassifierTest < Test::Unit::TestCase
   include MongoDbClassifierBase
   include NBClassifierBase
 end
 
-class
+class KLMongoDBClassifierTest < Test::Unit::TestCase
   include MongoDbClassifierBase
   include KLClassifierBase
 end
metadata
CHANGED
@@ -1,59 +1,66 @@
 --- !ruby/object:Gem::Specification
 name: ankusa
 version: !ruby/object:Gem::Version
-  version: 0.1.
-  prerelease:
+  version: 0.1.1
 platform: ruby
 authors:
 - Brian Muller
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2015-11-22 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: fast-stemmer
   requirement: !ruby/object:Gem::Requirement
-    none: false
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: 1.0.0
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
-    none: false
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: 1.0.0
 - !ruby/object:Gem::Dependency
   name: rake
   requirement: !ruby/object:Gem::Requirement
-    none: false
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
-    none: false
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
+- !ruby/object:Gem::Dependency
+  name: mongo
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - '='
+      - !ruby/object:Gem::Version
+        version: 1.6.0
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - '='
+      - !ruby/object:Gem::Version
+        version: 1.6.0
 description: Text classifier with HBase, Cassandra, or Mongo storage
 email: bamuller@gmail.com
 executables: []
 extensions: []
 extra_rdoc_files: []
 files:
-- .gitignore
-- .
-- .ruby-version
+- ".gitignore"
+- ".travis.yml"
 - Gemfile
-- Gemfile.lock
 - LICENSE
 - README.rdoc
 - Rakefile
@@ -82,34 +89,27 @@ files:
 - test/mongo_db_classifier_test.rb
 homepage: https://github.com/bmuller/ankusa
 licenses: []
+metadata: {}
 post_install_message:
 rdoc_options: []
 require_paths:
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
-  none: false
   requirements:
-  - -
+  - - ">="
     - !ruby/object:Gem::Version
       version: '0'
-      segments:
-      - 0
-      hash: 3381126087859790337
 required_rubygems_version: !ruby/object:Gem::Requirement
-  none: false
   requirements:
-  - -
+  - - ">="
     - !ruby/object:Gem::Version
       version: '0'
-      segments:
-      - 0
-      hash: 3381126087859790337
 requirements:
 - Either hbaserb >= 0.0.3 or cassandra >= 0.7
 rubyforge_project: ankusa
-rubygems_version:
+rubygems_version: 2.2.2
 signing_key:
-specification_version:
+specification_version: 4
 summary: Text classifier in Ruby that uses Hadoop's HBase, Cassandra, or Mongo for
   storage
 test_files:
@@ -122,3 +122,4 @@ test_files:
 - test/helper.rb
 - test/memory_classifier_test.rb
 - test/mongo_db_classifier_test.rb
+has_rdoc:
data/.ruby-gemset
DELETED
@@ -1 +0,0 @@
-ankusa
data/.ruby-version
DELETED
@@ -1 +0,0 @@
-ruby-1.9.3
data/Gemfile.lock
DELETED