see5 0.2.0 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: f2f7b552c03f5df7d375e4fdf2af7a30bbd25cd4c55634a5be14b5abd0f511f2
-   data.tar.gz: 03056357c74c6cd58145c5a60f9a759b2f8c04fb7809357163900415da1a1791
+   metadata.gz: a953b6bf5aafae7ae110f261e525e7b2421c39f517c56c885af0560ea0d7a3ec
+   data.tar.gz: 64e7f538405524d8fb5f900cd62a2acea92d42e38053f1ef527cd0108e448dc8
  SHA512:
-   metadata.gz: dd9b563564b17b5b9f2f770448b59ed75947b0ad61fc26377529fed84cca56f5ac31f2b8950635a1a53bd8667d289d19a2cd5cedc2a2783c04b1f485d27cfcfc
-   data.tar.gz: f5b0be6ae50abef89eab752f4008b454a49b00f68bc59bff54a99adcc6968950fc898f43785eca0ee3c4915f75616edbae5c77fd953953a27156976cb51f7206
+   metadata.gz: 5b810bee1a216f3dd935d04e51287463b4a1e6b0b5186daf93c54140b9b02253dd14416617521800cca559925b4b09fd3bfa62b6ee955a76776b3e138596348f
+   data.tar.gz: f99d05053a2cc441a0efc1c49010c3880d577ded61dfadaee2f89caddc356082d35227eabf77c20af419b487c08a6c5c13536bca9036f2c172f51c9512797b34
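The SHA256 and SHA512 entries above are hex digests of the two archives packed inside the `.gem` file. As a minimal sketch, a digest can be checked locally with Ruby's standard `digest` library. The helper name `digest_matches?` and the file path in the usage comment are hypothetical and assume the archives have already been unpacked.

```ruby
require "digest"

# Hypothetical helper (not part of the gem): true when the file at `path`
# hashes to `expected_hex` under the given Digest class.
def digest_matches?(path, expected_hex, algorithm: Digest::SHA256)
  algorithm.file(path).hexdigest == expected_hex.downcase
end

# Usage, assuming metadata.gz from the 0.3.0 gem sits in the current directory:
# digest_matches?(
#   "metadata.gz",
#   "a953b6bf5aafae7ae110f261e525e7b2421c39f517c56c885af0560ea0d7a3ec"
# )
```

Passing `algorithm: Digest::SHA512` checks the longer digests in the same way.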
data/Gemfile.lock CHANGED
@@ -1,22 +1,22 @@
  PATH
    remote: .
    specs:
-     see5 (0.1.0)
+     see5 (0.2.0)

  GEM
    remote: https://rubygems.org/
    specs:
-     minitest (5.14.1)
+     minitest (5.14.4)
      rake (13.0.1)

  PLATFORMS
    ruby

  DEPENDENCIES
-   bundler (~> 1.17)
+   bundler (~> 2.0)
    minitest (~> 5.0)
    rake (~> 13.0)
    see5!

  BUNDLED WITH
-    1.17.2
+    2.2.15
data/LICENSE ADDED
@@ -0,0 +1,21 @@
+ The MIT License (MIT)
+
+ Copyright (c) 2020-2021 Eddie Lebow <elebow@users.noreply.github.com>
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ THE SOFTWARE.
data/README.md CHANGED
@@ -1,4 +1,4 @@
- # Ruby::See5
+ # See5

  A Ruby frontend for the See5/C5.0 family of classifiers and modellers. Builds models from Ruby objects.

@@ -7,27 +7,91 @@ A Ruby frontend for the See5/C5.0 family of classifiers and modellers. Builds mo
  Add this line to your application's Gemfile:

  ```ruby
- gem 'ruby-see5'
+ gem 'see5'
  ```

  And then execute:

- $ bundle
+ ```
+ bundle
+ ```

  Or install it yourself as:

- $ gem install ruby-see5
+ ```
+ gem install see5
+ ```
+
+ You may also wish to install the gem `see5-installer` to provide the required executables: <https://github.com/elebow/ruby-see5-installer>.

  ## Usage

- TODO: Write usage instructions here
+ ### Training a classifier
+
+ The gem `see5-installer` must be installed, or `c5.0` must be available in the system path.
+
+ Input data must be an Enumerable of Hashes, OpenStructs, or ActiveModel-like objects—anything that responds to `#[]` or `#send(:attr_name)` for each attribute name.
+
+ The objects must respond to `#each` for automatic schema construction; otherwise, you will have to specify the schema.
+
+ ```ruby
+ data = [
+   { a: true, b: 5, c: "yellow" },
+   { a: false, b: 6, c: "green" },
+   # ...
+ ]
+ ```
+
+ Train a classifier by calling `See5.train`. Pass in the data and the name of the class attribute.
+
+ ```ruby
+ classifier = See5.train(data, class_attribute: :a)
+ ```
+
+ Use the model to classify new records by calling `#classify`.
+
+ ```ruby
+ classifier.classify(b: 5)
+ # => true
+
+ classifier.classify(b: 8, c: "green")
+ # => false
+ ```
+
+ Inspect the classifier's rules with `#rules`.
+
+ ```ruby
+ classifier.rules
+ ```

- ## Development
+ ### Dumping and loading a classifier
+
+ A model can be dumped to JSON, to be loaded and used later. Perhaps you want to run the classifier in a production system.
+
+ Loading and using a classifier does not have any dependencies outside the gem.
+
+ ```ruby
+ File.write("classifier.json", classifier.to_json)
+ ```

- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+ Load a classifier from a JSON string.
+
+ ```ruby
+ classifier = See5::Model.from_json(json_string)
+ ```
+
+ ### Anomaly detection with GritBot
+
+ Anomaly detection uses the same input format as training a model. Call `See5.audit` to detect anomalies in the set.
+
+ The gem `see5-installer` must be installed, or `gritbot` must be available in the system path.
+
+ ```ruby
+ anomalies = See5.audit(data, class_attribute: :a)
+ ```

- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+ Anomalies are hashes.

  ## Contributing

- Bug reports and pull requests are welcome on GitHub at https://github.com/elebow/ruby-see5.
+ Bug reports and pull requests are welcome on GitHub at <https://github.com/elebow/ruby-see5>.
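The new README says a record may be anything that responds to `#[]` or to a reader method for each attribute name. The sketch below illustrates that idea only and is not the gem's implementation. The helper `read_attribute` and the `Point` and `Reading` types are hypothetical, and it uses `public_send` where the README mentions `#send`.

```ruby
# Hypothetical helper: read one attribute from a heterogeneous record.
def read_attribute(record, name)
  if record.respond_to?(:[])
    record[name]              # Hash, Struct, and similar containers
  else
    record.public_send(name)  # plain objects exposing reader methods
  end
end

# Hypothetical record types used only for this illustration.
Point = Struct.new(:a, :b)

class Reading
  attr_reader :a, :b

  def initialize(a:, b:)
    @a = a
    @b = b
  end
end

records = [{ a: true, b: 5 }, Point.new(false, 6), Reading.new(a: true, b: 7)]
values = records.map { |record| read_attribute(record, :b) }
# values == [5, 6, 7]
```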
data/lib/see5.rb CHANGED
@@ -1,5 +1,7 @@
  # frozen_string_literal: true

+ require "bundler"
+
  require "see5/input_file_writer"
  require "see5/model"
  require "see5/rules_output_parser"
@@ -8,8 +10,8 @@ require "see5/schema"
  require "see5/version"

  module See5
-   def self.train(data, class_attribute)
-     prepare_tmp_files(data, class_attribute)
+   def self.train(data, class_attribute:)
+     prepare_tmp_files(data, class_attribute: class_attribute)
      run_see5

      output = See5::RulesOutputParser.parse_file("/tmp/ruby-see5.rules_output")
@@ -17,23 +19,37 @@ module See5
      See5::Model.new(**output)
    end

-   def self.audit(data, class_attribute)
-     prepare_tmp_files(data, class_attribute)
+   def self.audit(data, class_attribute:)
+     prepare_tmp_files(data, class_attribute: class_attribute)
      run_gritbot

      See5::GritbotOutputParser.parse_file("/tmp/ruby-see5.gritbot_output")
    end

-   def self.prepare_tmp_files(data, class_attribute)
-     schema = See5::Schema.from_dataset(data, class_attribute)
+   def self.prepare_tmp_files(data, class_attribute:)
+     schema = See5::Schema.from_dataset(data, class_attribute: class_attribute)
      See5::InputFileWriter.write_files(data: data, schema: schema)
    end

    def self.run_see5
-     system("c5.0 -f /tmp/ruby-see5 -r > /tmp/ruby-see5.rules_output")
+     system("#{see5_executable} -f /tmp/ruby-see5 -r > /tmp/ruby-see5.rules_output")
    end

    def self.run_gritbot
-     system("gritbot -s -f /tmp/ruby-see5 -r > /tmp/ruby-see5.gritbot_output")
+     system("#{gritbot_executable} -s -f /tmp/ruby-see5 -r > /tmp/ruby-see5.gritbot_output")
+   end
+
+   private
+
+   private_class_method def self.see5_executable
+     @see5_executable ||= see5_installer_path ? "#{see5_installer_path}/ext/c5.0/c5.0" : "c5.0"
+   end
+
+   private_class_method def self.gritbot_executable
+     @gritbot_executable ||= see5_installer_path ? "#{see5_installer_path}/ext/gritbot/gritbot" : "gritbot"
+   end
+
+   private_class_method def self.see5_installer_path
+     @see5_installer_path ||= Bundler.rubygems.find_name("see5-installer").first&.full_gem_path
    end
  end
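The executable lookup added above prefers a binary shipped inside the `see5-installer` gem and otherwise falls back to a bare command name resolved through the system path. A minimal sketch of the same fallback pattern follows, assuming plain RubyGems (`Gem::Specification.find_all_by_name`) in place of the Bundler call used in the diff. The helper `resolve_executable` and its parameters are hypothetical.

```ruby
require "rubygems"

# Hypothetical helper (not the gem's code): return the path of a binary inside
# an installed gem, or `fallback` when that gem is not installed.
def resolve_executable(gem_name:, relative_path:, fallback:)
  spec = Gem::Specification.find_all_by_name(gem_name).first
  spec ? File.join(spec.full_gem_path, relative_path) : fallback
end

c50 = resolve_executable(gem_name: "see5-installer",
                         relative_path: "ext/c5.0/c5.0",
                         fallback: "c5.0")
# `c50` is either a path under the installed see5-installer gem or "c5.0".
```

Returning the bare command name when the gem is missing leaves the final lookup to the shell, which matches the README's requirement that `c5.0` be available in the system path.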
data/lib/see5/model.rb CHANGED
@@ -1,5 +1,7 @@
  # frozen_string_literal: true

+ require "json"
+
  module See5
    class Model
      attr_reader :rules
@@ -18,5 +20,20 @@ module See5

        @default_classification
      end
+
+     def to_h
+       { default_classification: @default_classification,
+         rules: rules.map(&:to_h) }
+     end
+
+     def to_json
+       to_h.to_json
+     end
+
+     def self.from_json(json)
+       json_hash = JSON.parse(json, symbolize_names: true)
+       new(default_classification: json_hash[:default_classification],
+           rules: json_hash[:rules]&.map { |rule_hash| Rule.from_h(rule_hash) })
+     end
    end
  end
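The `to_h` / `to_json` / `from_json` trio added above forms a dump-and-load round trip. The self-contained sketch below shows that pattern using hypothetical stand-in classes, `SketchRule` and `SketchModel`, which are not the gem's `See5::Rule` and `See5::Model`. One deliberate difference is that `to_json` here accepts `*args`, so that `JSON.generate` can pass its generator state through.

```ruby
require "json"

# Hypothetical stand-in for a rule: plain data plus hash conversion.
class SketchRule
  attr_reader :conditions, :classification

  def initialize(conditions:, classification:)
    @conditions = conditions
    @classification = classification
  end

  def to_h
    { conditions: conditions, classification: classification }
  end

  def self.from_h(hash)
    new(conditions: hash[:conditions], classification: hash[:classification])
  end
end

# Hypothetical stand-in for a model that owns a list of rules.
class SketchModel
  attr_reader :default_classification, :rules

  def initialize(default_classification:, rules:)
    @default_classification = default_classification
    @rules = rules
  end

  def to_h
    { default_classification: default_classification, rules: rules.map(&:to_h) }
  end

  # Accepting *args keeps this compatible with JSON.generate.
  def to_json(*args)
    to_h.to_json(*args)
  end

  def self.from_json(json)
    hash = JSON.parse(json, symbolize_names: true)
    new(default_classification: hash[:default_classification],
        rules: hash.fetch(:rules, []).map { |rule_hash| SketchRule.from_h(rule_hash) })
  end
end

model = SketchModel.new(
  default_classification: "false",
  rules: [SketchRule.new(conditions: [["b", ">", 5]], classification: "true")]
)
restored = SketchModel.from_json(model.to_json)
# restored.to_h == model.to_h
```

JSON has no symbol type, so any symbol stored as a value (rather than as a hash key) comes back as a string after the round trip. The sketch sidesteps this by using string values throughout.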
data/lib/see5/rule.rb CHANGED
@@ -17,11 +17,25 @@ module See5
          .all? { |matched| matched == true }
      end

+     def to_h
+       { rule_info: rule_info,
+         conditions: conditions,
+         classification: classification,
+         confidence: confidence }
+     end
+
+     def self.from_h(h)
+       new(h[:rule_info],
+           h[:conditions],
+           { classification: h[:classification],
+             confidence: h[:confidence] })
+     end
+
      def to_s
        [
          "See5::Rule",
-         "@classification=#{@classification}",
-         "@conditions=#{@conditions}"
+         "@classification=#{classification}",
+         "@conditions=#{conditions}"
        ]
          .join(", ")
          .yield_self { |s| "#<#{s}>" }
data/lib/see5/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module See5
-   VERSION = "0.2.0"
+   VERSION = "0.3.0"
  end
data/see5.gemspec CHANGED
@@ -13,6 +13,7 @@ Gem::Specification.new do |spec|
    spec.summary = "A Ruby frontend for the See5/C5.0 family of classifiers and modellers."
    #spec.description = "TODO: Write a longer description or delete this line."
    spec.homepage = "https://github.com/elebow/ruby-see5"
+   spec.license = "MIT"

    if spec.respond_to?(:metadata)
      spec.metadata["homepage_uri"] = spec.homepage
@@ -32,7 +33,7 @@ Gem::Specification.new do |spec|
    spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
    spec.require_paths = ["lib"]

-   spec.add_development_dependency "bundler", "~> 1.17"
+   spec.add_development_dependency "bundler", "~> 2.0"
    spec.add_development_dependency "minitest", "~> 5.0"
    spec.add_development_dependency "rake", "~> 13.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: see5
  version: !ruby/object:Gem::Version
-   version: 0.2.0
+   version: 0.3.0
  platform: ruby
  authors:
  - Eddie Lebow
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2021-01-27 00:00:00.000000000 Z
+ date: 2021-03-26 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -16,14 +16,14 @@ dependencies:
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '1.17'
+         version: '2.0'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '1.17'
+         version: '2.0'
  - !ruby/object:Gem::Dependency
    name: minitest
    requirement: !ruby/object:Gem::Requirement
@@ -63,6 +63,7 @@ files:
  - ".rubocop.yml"
  - Gemfile
  - Gemfile.lock
+ - LICENSE
  - README.md
  - Rakefile
  - lib/see5.rb
@@ -75,7 +76,8 @@ files:
  - lib/see5/version.rb
  - see5.gemspec
  homepage: https://github.com/elebow/ruby-see5
- licenses: []
+ licenses:
+ - MIT
  metadata:
    homepage_uri: https://github.com/elebow/ruby-see5
    source_code_uri: https://github.com/elebow/ruby-see5
@@ -95,7 +97,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
    - !ruby/object:Gem::Version
      version: '0'
  requirements: []
- rubygems_version: 3.0.3
+ rubyforge_project:
+ rubygems_version: 2.7.6.2
  signing_key:
  specification_version: 4
  summary: A Ruby frontend for the See5/C5.0 family of classifiers and modellers.