confusion_matrix 1.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: bc3f9ed40d45bb892541c4a3dc97c04c5893e335ec79b2526d072d291d08a7b7
  data.tar.gz: 602f9857a4e45283357117974943d46f4802fbd9c5ce7b1be0ec704472d29ba4
SHA512:
  metadata.gz: 20cc86e92c2ad0867206ee2c2a46fe31de7c08c15e90491d481a8c40fd94541b4dea87b575792652cedbb49fea284e70e4f72250ab6f2e83de2e9b57ea7136c6
  data.tar.gz: 972386c261254d2fc44b09f320d48b41639c09682f16c0ddeb88cd2697b683f7b178d257e7edcb3bcba51d0dfc4c625bf478fcadd1a6e2e81ed8354a588266ce

data/LICENSE.rdoc ADDED
@@ -0,0 +1,22 @@
= MIT License

Copyright (c) 2020-23, Peter Lane

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

data/README.rdoc ADDED
@@ -0,0 +1,88 @@
= Confusion Matrix

Install from {RubyGems}[https://rubygems.org/gems/confusion_matrix/]:

  > gem install confusion_matrix

source:: https://notabug.org/peterlane/confusion-matrix-ruby/

== Description

A confusion matrix is used in data-mining as a summary of the performance of a
classification algorithm. Each row represents the _actual_ class of an
instance, and each column represents the _predicted_ class, i.e. the class
the instance was classified as. The number in each (row, column) cell is the
total number of instances of actual class "row" which were predicted to fall
in class "column".

A two-class example is:

  Classified   Classified  |
   Positive     Negative   | Actual
  -------------------------+------------
      a            b       | Positive
      c            d       | Negative

Here the value:

a:: is the number of true positives (those labelled positive and classified positive)
b:: is the number of false negatives (those labelled positive but classified negative)
c:: is the number of false positives (those labelled negative but classified positive)
d:: is the number of true negatives (those labelled negative and classified negative)

From this table we can calculate statistics like:

true_positive_rate:: a/(a+b)
positive precision:: a/(a+c)
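
For instance, with the illustrative counts a = 10, b = 3, c = 5 and d = 20
(hypothetical numbers, matching the example further below):

  a, b, c, d = 10, 3, 5, 20

  a.to_f / (a + b)   # true_positive_rate => 0.7692...
  a.to_f / (a + c)   # positive precision => 0.6666...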

The implementation supports confusion matrices with more than two
classes, and hence most statistics are calculated with reference to a
named class. When more than two classes are in use, the statistics
are calculated as if the named class were positive and all the other
classes were grouped together as negative.

For example, in a three-class case:

  Classified  Classified  Classified |
     Red         Blue       Green    | Actual
  -----------------------------------+------------
      a           b           c      | Red
      d           e           f      | Blue
      g           h           i      | Green

We can calculate:

true_red_rate:: a/(a+b+c)
red precision:: a/(a+d+g)
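
A minimal sketch of this one-vs-rest grouping, using hypothetical counts
for the table above:

  require 'confusion_matrix'

  cm = ConfusionMatrix.new :red, :blue, :green
  cm.add_for(:red, :red, 10)     # a
  cm.add_for(:red, :blue, 7)     # b
  cm.add_for(:red, :green, 5)    # c
  cm.add_for(:blue, :red, 20)    # d
  cm.add_for(:green, :red, 30)   # g

  # :red counts as positive; :blue and :green are grouped as negative
  cm.recall(:red)      # => a/(a+b+c) = 10.0/22
  cm.precision(:red)   # => a/(a+d+g) = 10.0/60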

== Example

The following example creates a simple two-class confusion matrix,
prints a few statistics and displays the table.

  require 'confusion_matrix'

  cm = ConfusionMatrix.new :pos, :neg
  cm.add_for(:pos, :pos, 10)
  3.times { cm.add_for(:pos, :neg) }
  20.times { cm.add_for(:neg, :neg) }
  5.times { cm.add_for(:neg, :pos) }

  puts "Precision: #{cm.precision}"
  puts "Recall: #{cm.recall}"
  puts "MCC: #{cm.matthews_correlation}"
  puts
  puts(cm.to_s)

Output:

  Precision: 0.6666666666666666
  Recall: 0.7692307692307693
  MCC: 0.5524850114241865

  Predicted |
  pos neg   | Actual
  ----------+-------
   10   3   | pos
    5  20   | neg

data/lib/confusion_matrix.rb ADDED
@@ -0,0 +1,451 @@

# This class holds the confusion matrix information.
# It is designed to be called incrementally, as results are obtained
# from the classifier model.
#
# At any point, statistics may be obtained by calling the relevant methods.
#
# A two-class example is:
#
#    Classified   Classified  |
#     Positive     Negative   | Actual
#    -------------------------+------------
#        a            b       | Positive
#        c            d       | Negative
#
# Statistical methods will be described with reference to this example.
#
class ConfusionMatrix
  # Creates a new, empty instance of a confusion matrix.
  #
  # @param labels [Array<String, Symbol>] if provided, makes the matrix
  #   use the first label as a default label, and also checks that
  #   all operations use one of the pre-defined labels.
  # @raise [ArgumentError] if labels are provided but fewer than two are unique.
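  #
  # A minimal illustration of the constructor's contract (hypothetical labels):
  #
  #   ConfusionMatrix.new               # open matrix; labels grow as results arrive
  #   ConfusionMatrix.new(:pos, :neg)   # fixed labels, with :pos the default
  #   ConfusionMatrix.new(:pos)         # raises ArgumentError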
  def initialize(*labels)
    @matrix = {}
    @labels = labels.uniq
    if @labels.size == 1
      raise ArgumentError.new("If labels are provided, there must be at least two.")
    else # preset the matrix Hash
      @labels.each do |actual|
        @matrix[actual] = {}
        @labels.each do |predicted|
          @matrix[actual][predicted] = 0
        end
      end
    end
  end

  # Returns a list of labels used in the matrix.
  #
  #   cm = ConfusionMatrix.new
  #   cm.add_for(:pos, :neg)
  #   cm.labels # => [:neg, :pos]
  #
  # @return [Array<String, Symbol>] labels used in the matrix.
  def labels
    if @labels.size >= 2 # if we defined some labels, return them
      @labels
    else
      result = []

      @matrix.each_pair do |key, predictions|
        result << key
        predictions.each_key do |prediction|
          result << prediction
        end
      end

      result.uniq.sort
    end
  end

  # Returns the count for the (actual, prediction) pair.
  #
  #   cm = ConfusionMatrix.new
  #   cm.add_for(:pos, :neg)
  #   cm.count_for(:pos, :neg) # => 1
  #
  # @param actual [String, Symbol] the actual class of the instance,
  #   which we expect the classifier to predict
  # @param prediction [String, Symbol] the predicted class of the instance,
  #   as output from the classifier
  # @return [Integer] number of observations of the (actual, prediction) pair
  # @raise [ArgumentError] if +actual+ or +prediction+ is not one of the
  #   pre-defined labels in the matrix
  def count_for(actual, prediction)
    validate_label actual, prediction
    predictions = @matrix.fetch(actual, {})
    predictions.fetch(prediction, 0)
  end

  # Adds one result to the matrix for a given (actual, prediction) pair of labels.
  # If the matrix was given a pre-defined list of labels on construction, then
  # the given labels must be from the pre-defined list.
  # If no pre-defined list of labels was used in constructing the matrix, then
  # the labels are added to the matrix.
  #
  # Class labels may be any hashable value, though ideally they are strings or symbols.
  #
  # @param actual [String, Symbol] the actual class of the instance,
  #   which we expect the classifier to predict
  # @param prediction [String, Symbol] the predicted class of the instance,
  #   as output from the classifier
  # @param n [Integer] number of observations to add
  # @raise [ArgumentError] if +n+ is not a positive Integer
  # @raise [ArgumentError] if +actual+ or +prediction+ is not one of the
  #   pre-defined labels in the matrix
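  #
  # For example:
  #
  #   cm = ConfusionMatrix.new
  #   cm.add_for(:pos, :neg)      # record a single result
  #   cm.add_for(:pos, :neg, 4)   # record four results at once
  #   cm.count_for(:pos, :neg)    # => 5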
  def add_for(actual, prediction, n = 1)
    validate_label actual, prediction
    # validate n before touching the matrix, so an invalid call leaves no trace
    unless n.is_a?(Integer) and n.positive?
      raise ArgumentError.new("add_for requires n to be a positive Integer, but got #{n}")
    end

    @matrix[actual] = {} unless @matrix.has_key?(actual)
    predictions = @matrix[actual]
    predictions[prediction] = 0 unless predictions.has_key?(prediction)

    @matrix[actual][prediction] += n
  end

  # Returns the number of instances of the given class label which
  # are incorrectly classified.
  #
  #   false_negative(:positive) = b
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Integer] number of false negatives
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def false_negative(label = @labels.first)
    validate_label label
    predictions = @matrix.fetch(label, {})
    total = 0

    predictions.each_pair do |key, count|
      if key != label
        total += count
      end
    end

    total
  end

  # Returns the number of instances incorrectly classified with the given
  # class label.
  #
  #   false_positive(:positive) = c
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Integer] number of false positives
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def false_positive(label = @labels.first)
    validate_label label
    total = 0

    @matrix.each_pair do |key, predictions|
      if key != label
        total += predictions.fetch(label, 0)
      end
    end

    total
  end

  # The false rate for a given class label is the proportion of instances
  # incorrectly classified as that label, out of all those instances
  # not originally of that label.
  #
  #   false_rate(:positive) = c/(c+d)
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Float] value of the false rate
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def false_rate(label = @labels.first)
    validate_label label
    fp = false_positive(label)
    tn = true_negative(label)

    divide(fp, fp+tn)
  end

  # The F-measure for a given label is the harmonic mean of the precision
  # and recall for that label.
  #
  #   F = 2*(precision*recall)/(precision+recall)
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Float] value of the F-measure
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def f_measure(label = @labels.first)
    validate_label label
    # use the safe divide so precision = recall = 0 yields 0.0 rather than NaN
    divide(2*precision(label)*recall(label), precision(label) + recall(label))
  end

  # The geometric mean is the nth root of the product of the true_rate for
  # each of the n labels. For the two-class example:
  #
  #   a1 = a/(a+b)
  #   a2 = d/(c+d)
  #   geometric_mean = Math.sqrt(a1*a2)
  #
  # @return [Float] value of the geometric mean
  def geometric_mean
    product = 1

    @matrix.each_key do |key|
      product *= true_rate(key)
    end

    product**(1.0/@matrix.size)
  end

  # The Kappa statistic compares the observed accuracy with an expected
  # (chance) accuracy.
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Float] value of Cohen's Kappa statistic
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
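  #
  # In terms of the two-class table (restating the calculation below):
  #
  #   observed = (a+d)/(a+b+c+d)
  #   expected = ((a+b)*(a+c) + (c+d)*(b+d))/(a+b+c+d)**2
  #   kappa    = (observed - expected)/(1 - expected)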
  def kappa(label = @labels.first)
    validate_label label
    tp = true_positive(label)
    fn = false_negative(label)
    fp = false_positive(label)
    tn = true_negative(label)
    total = tp+fn+fp+tn

    total_accuracy = divide(tp+tn, tp+tn+fp+fn)
    random_accuracy = divide((tn+fp)*(tn+fn) + (fn+tp)*(fp+tp), total*total)

    divide(total_accuracy - random_accuracy, 1 - random_accuracy)
  end

  # Matthews Correlation Coefficient is a measure of the quality of binary
  # classifications.
  #
  #   matthews_correlation(:positive) = (a*d - c*b) / sqrt((a+c)(a+b)(d+c)(d+b))
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Float] value of the Matthews Correlation Coefficient
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
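  #
  # For example, with a=10, b=3, c=5, d=20 this gives
  # (10*20 - 5*3)/sqrt(15*13*25*23), which is roughly 0.5525.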
  def matthews_correlation(label = @labels.first)
    validate_label label
    tp = true_positive(label)
    fn = false_negative(label)
    fp = false_positive(label)
    tn = true_negative(label)

    divide(tp*tn - fp*fn, Math.sqrt((tp+fp)*(tp+fn)*(tn+fp)*(tn+fn)))
  end

  # The overall accuracy is the proportion of instances which are
  # correctly labelled.
  #
  #   overall_accuracy = (a+d)/(a+b+c+d)
  #
  # @return [Float] value of the overall accuracy
  def overall_accuracy
    total_correct = 0

    @matrix.each_key do |key|
      total_correct += true_positive(key)
    end

    divide(total_correct, total)
  end

  # The precision for a given class label is the proportion of instances
  # classified as that class which are correct.
  #
  #   precision(:positive) = a/(a+c)
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Float] value of the precision
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def precision(label = @labels.first)
    validate_label label
    tp = true_positive(label)
    fp = false_positive(label)

    divide(tp, tp+fp)
  end

  # The prevalence for a given class label is the proportion of instances
  # actually of that label, out of the total.
  #
  #   prevalence(:positive) = (a+b)/(a+b+c+d)
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Float] value of the prevalence
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def prevalence(label = @labels.first)
    validate_label label
    tp = true_positive(label)
    fn = false_negative(label)
    fp = false_positive(label)
    tn = true_negative(label)
    total = tp+fn+fp+tn

    divide(tp+fn, total)
  end

  # Recall is another name for the true rate.
  #
  # @see true_rate
  # @param (see #true_rate)
  # @return (see #true_rate)
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def recall(label = @labels.first)
    validate_label label
    true_rate(label)
  end

  # Sensitivity is another name for the true rate.
  #
  # @see true_rate
  # @param (see #true_rate)
  # @return (see #true_rate)
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def sensitivity(label = @labels.first)
    validate_label label
    true_rate(label)
  end

  # The specificity for a given class label is 1 - false_rate(label).
  #
  # In the two-class case, specificity = 1 - false_positive_rate.
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Float] value of the specificity
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def specificity(label = @labels.first)
    validate_label label
    1 - false_rate(label)
  end

  # Returns the matrix in a string format, representing the entries as a
  # printable table.
  #
  # @return [String] representation as a printable table.
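  #
  # For example (output for the README's two-class data):
  #
  #   puts cm.to_s
  #   # Predicted |
  #   # pos neg   | Actual
  #   # ----------+-------
  #   #  10   3   | pos
  #   #   5  20   | neg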
  def to_s
    ls = labels
    result = ""

    title_line = "Predicted "
    label_line = ""
    ls.each { |l| label_line << "#{l} " }
    label_line << " " while label_line.size < title_line.size
    title_line << " " while title_line.size < label_line.size
    result << title_line << "|\n" << label_line << "| Actual\n"
    result << "-"*title_line.size << "+-------\n"

    ls.each do |l|
      count_line = ""
      ls.each_with_index do |m, i|
        count_line << count_for(l, m).to_s.rjust(m.to_s.size) << " "
      end
      result << count_line.ljust(title_line.size) << "| #{l}\n"
    end

    result
  end

  # Returns the total number of instances referenced in the matrix.
  #
  #   total = a+b+c+d
  #
  # @return [Integer] total number of instances referenced in the matrix.
  def total
    total = 0

    @matrix.each_value do |predictions|
      predictions.each_value do |count|
        total += count
      end
    end

    total
  end

  # Returns the number of instances NOT of the given class label which
  # are correctly classified.
  #
  #   true_negative(:positive) = d
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Integer] number of instances not of the given label which are correctly classified
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
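  #
  # In the multi-class case this sums the diagonal entries for every other
  # label: with the README's three-class table, true_negative(:red) = e + i.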
  def true_negative(label = @labels.first)
    validate_label label
    total = 0

    @matrix.each_pair do |key, predictions|
      if key != label
        total += predictions.fetch(key, 0)
      end
    end

    total
  end

  # Returns the number of instances of the given class label which are
  # correctly classified.
  #
  #   true_positive(:positive) = a
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Integer] number of instances of the given label which are correctly classified
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def true_positive(label = @labels.first)
    validate_label label
    predictions = @matrix.fetch(label, {})
    predictions.fetch(label, 0)
  end

  # The true rate for a given class label is the proportion of instances of
  # that class which are correctly classified.
  #
  #   true_rate(:positive) = a/(a+b)
  #
  # @param label [String, Symbol] class to use, defaults to the first of any pre-defined labels in the matrix
  # @return [Float] proportion of instances of the given label which are correctly classified
  # @raise [ArgumentError] if +label+ is not one of the pre-defined labels in the matrix
  def true_rate(label = @labels.first)
    validate_label label
    tp = true_positive(label)
    fn = false_negative(label)

    divide(tp, tp+fn)
  end

  private

  # A form of "safe divide".
  # Checks if the divisor is zero, and returns 0.0 if so.
  # This avoids returning Infinity or NaN.
  # Also ensures floating-point division is done.
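  #
  #   divide(3, 4)  # => 0.75
  #   divide(1, 0)  # => 0.0, rather than Infinity or an error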
  def divide(x, y)
    if y.zero?
      0.0
    else
      x.to_f/y
    end
  end

  # Checks that each given label is non-nil and one of @labels,
  # or that @labels is empty (i.e. the matrix is open).
  # Raises ArgumentError if not.
  def validate_label(*labels)
    return true if @labels.empty?
    labels.each do |label|
      unless label and @labels.include?(label)
        raise ArgumentError.new("Given label (#{label}) is not in predefined list (#{@labels.join(',')})")
      end
    end
  end
end

data/test/matrix_test.rb ADDED
@@ -0,0 +1,160 @@
require 'confusion_matrix'
require 'minitest/autorun'

class TestConfusionMatrix < Minitest::Test
  def test_empty_case
    cm = ConfusionMatrix.new
    assert_equal(0, cm.total)
    assert_equal(0, cm.true_positive(:none))
    assert_equal(0, cm.false_negative(:none))
    assert_equal(0, cm.false_positive(:none))
    assert_equal(0, cm.true_negative(:none))
    assert_in_delta(0, cm.true_rate(:none))
  end

  def test_two_classes
    cm = ConfusionMatrix.new
    10.times { cm.add_for(:pos, :pos) }
    5.times { cm.add_for(:pos, :neg) }
    20.times { cm.add_for(:neg, :neg) }
    5.times { cm.add_for(:neg, :pos) }

    assert_equal([:neg, :pos], cm.labels)
    assert_equal(10, cm.count_for(:pos, :pos))
    assert_equal(5, cm.count_for(:pos, :neg))
    assert_equal(20, cm.count_for(:neg, :neg))
    assert_equal(5, cm.count_for(:neg, :pos))

    assert_equal(40, cm.total)
    assert_equal(10, cm.true_positive(:pos))
    assert_equal(5, cm.false_negative(:pos))
    assert_equal(5, cm.false_positive(:pos))
    assert_equal(20, cm.true_negative(:pos))
    assert_equal(20, cm.true_positive(:neg))
    assert_equal(5, cm.false_negative(:neg))
    assert_equal(5, cm.false_positive(:neg))
    assert_equal(10, cm.true_negative(:neg))

    assert_in_delta(0.6667, cm.true_rate(:pos))
    assert_in_delta(0.8, cm.true_rate(:neg))
    assert_in_delta(0.2, cm.false_rate(:pos))
    assert_in_delta(0.3333, cm.false_rate(:neg))
    assert_in_delta(0.6667, cm.precision(:pos))
    assert_in_delta(0.8, cm.precision(:neg))
    assert_in_delta(0.6667, cm.recall(:pos))
    assert_in_delta(0.8, cm.recall(:neg))
    assert_in_delta(0.6667, cm.sensitivity(:pos))
    assert_in_delta(0.8, cm.sensitivity(:neg))
    assert_in_delta(0.75, cm.overall_accuracy)
    assert_in_delta(0.6667, cm.f_measure(:pos))
    assert_in_delta(0.8, cm.f_measure(:neg))
    assert_in_delta(0.7303, cm.geometric_mean)
  end

  # Example from:
  # https://www.datatechnotes.com/2019/02/accuracy-metrics-in-classification.html
  def test_two_classes_2
    cm = ConfusionMatrix.new
    5.times { cm.add_for(:pos, :pos) }
    1.times { cm.add_for(:pos, :neg) }
    3.times { cm.add_for(:neg, :neg) }
    2.times { cm.add_for(:neg, :pos) }

    assert_equal(11, cm.total)
    assert_equal(5, cm.true_positive(:pos))
    assert_equal(1, cm.false_negative(:pos))
    assert_equal(2, cm.false_positive(:pos))
    assert_equal(3, cm.true_negative(:pos))

    assert_in_delta(0.7142, cm.precision(:pos))
    assert_in_delta(0.8333, cm.recall(:pos))
    assert_in_delta(0.7272, cm.overall_accuracy)
    assert_in_delta(0.7692, cm.f_measure(:pos))
    assert_in_delta(0.8333, cm.sensitivity(:pos))
    assert_in_delta(0.6, cm.specificity(:pos))
    assert_in_delta(0.4407, cm.kappa(:pos))
    assert_in_delta(0.5454, cm.prevalence(:pos))
  end

  # Examples from:
  # https://standardwisdom.com/softwarejournal/2011/12/matthews-correlation-coefficient-how-well-does-it-do/
  # Builds a two-class matrix from the four counts, then checks five statistics.
  def two_class_case(tp, fn, tn, fp, mcc, prec, rec, f, kappa)
    cm = ConfusionMatrix.new
    tp.times { cm.add_for(:pos, :pos) }
    fn.times { cm.add_for(:pos, :neg) }
    tn.times { cm.add_for(:neg, :neg) }
    fp.times { cm.add_for(:neg, :pos) }

    assert_in_delta(mcc, cm.matthews_correlation(:pos))
    assert_in_delta(prec, cm.precision(:pos))
    assert_in_delta(rec, cm.recall(:pos))
    assert_in_delta(f, cm.f_measure(:pos))
    assert_in_delta(kappa, cm.kappa(:pos))
  end

  def test_two_classes_3
    two_class_case(100, 0, 900, 0, 1.0, 1.0, 1.0, 1.0, 1.0)
    two_class_case(65, 35, 825, 75, 0.490, 0.4643, 0.65, 0.542, 0.4811)
    two_class_case(50, 50, 700, 200, 0.192, 0.2, 0.5, 0.286, 0.1666)
  end

  def test_three_classes
    cm = ConfusionMatrix.new
    10.times { cm.add_for(:red, :red) }
    7.times { cm.add_for(:red, :blue) }
    5.times { cm.add_for(:red, :green) }
    20.times { cm.add_for(:blue, :red) }
    5.times { cm.add_for(:blue, :blue) }
    15.times { cm.add_for(:blue, :green) }
    30.times { cm.add_for(:green, :red) }
    12.times { cm.add_for(:green, :blue) }
    8.times { cm.add_for(:green, :green) }

    assert_equal([:blue, :green, :red], cm.labels)
    assert_equal(112, cm.total)
    assert_equal(10, cm.true_positive(:red))
    assert_equal(12, cm.false_negative(:red))
    assert_equal(50, cm.false_positive(:red))
    assert_equal(13, cm.true_negative(:red))
    assert_equal(5, cm.true_positive(:blue))
    assert_equal(35, cm.false_negative(:blue))
    assert_equal(19, cm.false_positive(:blue))
    assert_equal(18, cm.true_negative(:blue))
    assert_equal(8, cm.true_positive(:green))
    assert_equal(42, cm.false_negative(:green))
    assert_equal(20, cm.false_positive(:green))
    assert_equal(15, cm.true_negative(:green))
  end

  def test_add_for_n
    cm = ConfusionMatrix.new
    cm.add_for(:pos, :pos, 3)
    cm.add_for(:pos, :neg)
    cm.add_for(:neg, :pos, 2)
    cm.add_for(:neg, :neg, 1)
    assert_equal(7, cm.total)
    assert_equal(3, cm.count_for(:pos, :pos))
    # - check errors
    assert_raises(ArgumentError) { cm.add_for(:pos, :pos, 0) }
    assert_raises(ArgumentError) { cm.add_for(:pos, :pos, -3) }
    assert_raises(ArgumentError) { cm.add_for(:pos, :pos, nil) }
  end

  def test_use_labels
    # - check errors
    assert_raises(ArgumentError) { ConfusionMatrix.new(:pos) }
    assert_raises(ArgumentError) { ConfusionMatrix.new(:pos, :pos) }
    # - check created matrix
    cm = ConfusionMatrix.new(:pos, :neg)
    assert_equal([:pos, :neg], cm.labels)
    assert_raises(ArgumentError) { cm.add_for(:pos, :nothing) }
    cm.add_for(:pos, :neg, 3)
    cm.add_for(:neg, :pos, 2)
    assert_equal(2, cm.false_negative(:neg))
    assert_equal(3, cm.false_negative(:pos))
    assert_equal(3, cm.false_negative())
    assert_raises(ArgumentError) { cm.false_negative(:nothing) }
    assert_raises(ArgumentError) { cm.false_negative(nil) }
  end
end
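
The test suite uses Minitest; assuming the gem's standard layout, it can be
run from the project root with:

  ruby -Ilib test/matrix_test.rb
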
metadata ADDED
@@ -0,0 +1,55 @@
--- !ruby/object:Gem::Specification
name: confusion_matrix
version: !ruby/object:Gem::Version
  version: 1.1.0
platform: ruby
authors:
- Peter Lane
autorequire:
bindir: bin
cert_chain: []
date: 2023-02-04 00:00:00.000000000 Z
dependencies: []
description: "A confusion matrix is used in data-mining as a summary of the performance
  of a\nclassification algorithm. This library allows the user to incrementally add
  \nresults to a confusion matrix, and then retrieve statistical information.\n"
email: peterlane@gmx.com
executables: []
extensions: []
extra_rdoc_files:
- README.rdoc
- LICENSE.rdoc
files:
- LICENSE.rdoc
- README.rdoc
- lib/confusion_matrix.rb
- test/matrix_test.rb
homepage:
licenses:
- MIT
metadata: {}
post_install_message:
rdoc_options:
- "-m"
- README.rdoc
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '2.5'
  - - "<"
    - !ruby/object:Gem::Version
      version: '4.0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubygems_version: 3.4.5
signing_key:
specification_version: 4
summary: Construct a confusion matrix and retrieve statistical information from it.
test_files: []