RubyGems - fluent-plugin-histogram - Versions diffs - 0.1.2 - Mend

fluent-plugin-histogram 0.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

checksums.yaml +15 -0
data/.gitignore +22 -0
data/Gemfile +4 -0
data/LICENSE.txt +13 -0
data/README.md +98 -0
data/Rakefile +10 -0
data/bench/README.md +30 -0
data/bench/genload.rb +152 -0
data/fluent-plugin-histogram.gemspec +25 -0
data/lib/fluent/plugin/out_histogram.rb +179 -0
data/test/helper.rb +30 -0
data/test/plugin/test_out_histogram.rb +202 -0
metadata +112 -0

checksums.yaml ADDED

@@ -0,0 +1,15 @@
+---
+!binary "U0hBMQ==":
+  metadata.gz: !binary |-
+    NmJkY2ZiNTU3NjcyMzRhN2EzNzA1MWFiMzJlMjgxNGY4ODI2Mjc4Zg==
+  data.tar.gz: !binary |-
+    YjVhOGEzNWQzY2YyZTBhYjYyMjVlYjkxMDdhZDczMjg1YzA5MzQ1Nw==
+SHA512:
+  metadata.gz: !binary |-
+    ZDNlZGU3OWYwYzFlYjhiOTNjMzU4MjU3MTU0MzZjZmUzMzY4OGQ4NDg2ODky
+    NjBjNDJlNmM1MTE4MmE1YjcwNWNlMThlN2Y0OTQ3ZGMyN2VkNTQxMThlNmIx
+    YmUzMjQ1MzU2MDk0NzEwNjNlYmRjOGFlZjNlZTZiOTQ0YjkwMTY=
+  data.tar.gz: !binary |-
+    YWE0MmNjOWQ2MjI5N2EzZTVmNTNiZWY1NzNkNzgzOTNkOGM2NmJmMzU0YWNh
+    OGZmNjQxNzc0OTFmZGZjZmNjYTUxMzdiNTYyOTYzNjI4NGFmYWE4MjdhMTRj
+    ZDM5NmY0MzM2ZDhmOTBlY2UzZDk3MjA2MmJiYzhkODE3ZDJiOTE=
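
The `!binary` entries above are Base64-wrapped strings, not raw digests. A minimal sketch of reading them back, using Ruby's core `String#unpack1` (the helper name `decode_checksum` is made up for illustration and is not part of the gem):

```ruby
# Hypothetical helper: decode one Base64-wrapped value from checksums.yaml.
def decode_checksum(encoded)
  encoded.unpack1("m") # "m" is Ruby's Base64 pack/unpack directive
end

# The top-level key and the first metadata.gz value, copied from the hunk above.
algorithm        = decode_checksum("U0hBMQ==")
sha1_of_metadata = decode_checksum("NmJkY2ZiNTU3NjcyMzRhN2EzNzA1MWFiMzJlMjgxNGY4ODI2Mjc4Zg==")

puts algorithm               # name of the hash algorithm
puts sha1_of_metadata        # hex digest of metadata.gz
puts sha1_of_metadata.length # a SHA-1 hex digest is 40 characters long
```

The multi-line SHA512 values would need their lines joined before decoding.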

data/.gitignore ADDED

@@ -0,0 +1,22 @@
+*.gem
+*.rbc
+.bundle
+.config
+.yardoc
+Gemfile.lock
+InstalledFiles
+_yardoc
+coverage
+doc/
+lib/bundler/man
+pkg
+rdoc
+spec/reports
+test/tmp
+test/version_tmp
+tmp
+*.swp
+.conf
+.idea
+vendor
+.ruby-version

data/Gemfile ADDED

@@ -0,0 +1,4 @@
+source 'https://rubygems.org'
+# Specify your gem's dependencies in fluent-plugin-histogram.gemspec
+gemspec

data/LICENSE.txt ADDED

@@ -0,0 +1,13 @@
+Copyright (c) 2013 SHIMIZU Yusuke
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+       http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.

data/README.md ADDED

@@ -0,0 +1,98 @@
+# fluent-plugin-histogram
+Fluentd output plugin.
+Counts input keys and builds a **scalable, rough histogram** to help detect hotspot problems.
+A "scalable rough histogram" fits cases where there is an enormous variety of keys.
+We referred to ["Strauss, O.: Rough histograms for robust statistics, Pattern Recognition, 2000. Proceedings. 15th International Conference on (Volume:2)"](http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=7237) for the "rough histogram".
+In this approach, a key does not increment a single bin (`.`); it increments several neighbouring bins in a shape like `△`.
+To use this, set the `alpha >= 1` (default 1) option in fluent.conf.
+Moreover, we optimized the histogram for an enormous variety of keys by fixing the histogram width.
+To set the width, use `bin_num` (default 100) in fluent.conf.
+Be careful: the output histogram is not an exact count of the provided data. However, this plugin can handle 25,000 records/sec of input, and the output histogram is good enough for detecting hotspot problems.
+## Examples
+##### Example 1
+If you run the commands below,
+```
+$ echo '{"keys":["A",  "B",  "C",  "A"]}' | fluent-cat input.sample
+$ echo '{"keys":["A",  "B",  "D"]}' | fluent-cat input.sample
+```
+output is
+```
+2013-12-21T11:08:25+09:00       histo.sample.localhost   {"hist":[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 6, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 4, 2, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 0], "sum":28, "avg":0, "sd":0}
+```
+The plugin counts the values of the key you specified and builds a **histogram-like structure**.
+It also calculates:
+* Sum(**sum**)
+* Average(**avg**)
+* Standard Deviation(**sd**)
+##### Example 2
+Run the benchmark:
+```
+$ ruby bench/genload.rb input.sample 5000
+```
+output is,
+```
+2013-12-21T11:09:52+09:00       histo.sample.localhost
+{"hist":
+[859, 963, 1224, 1252, 957, 764, 746, 929, 1406, 1519, 1072, 955, 1069, 916, 797, 948, 1090, 915, 727, 730, 898, 1051, 918, 780, 751, 890, 1104, 976, 949, 1138, 996, 959, 1100, 964, 840, 832, 1020, 1196, 969, 756, 750, 939, 1108, 928, 883, 1154, 1173, 951, 871, 837, 776, 896, 1048, 961, 825, 780, 959, 1113, 1034, 1019, 1090, 1274, 1370, 1207, 930, 898, 1029, 907, 951, 1113, 921, 992, 1422, 1509, 1253, 924, 941, 1099, 898, 775, 994, 1182, 1170, 1515, 1788, 1216, 870, 1038, 938, 744, 826, 969, 892, 843, 883, 840, 800, 966, 1115, 978],
+"sum":100000,
+"avg":1000,
+"sd":193}
+```
+## Configuration
+```
+<match input.**>
+    type            histogram
+    count_key       keys            # record key whose values are counted
+    flush_interval  10s             # flush interval in seconds (default: 60s)
+    tag_prefix      histo
+    tag_suffix      __HOSTNAME__    # this plugin mixes in fluent-mixin-config-placeholders
+    input_tag_remove_prefix input
+    alpha           1               # count up like this,  (■ = +1)
+                                    #                             ■
+                                    #                  ■        ■ ■ ■
+                                    #          ■     ■ ■ ■    ■ ■ ■ ■ ■
+                                    # alpha:   0,      1,         2
+    sampling_rate   10              # input data is thinned out to 1/10.
+</match>
+```
+## Installation
+Add this line to your application's Gemfile:
+    gem 'fluent-plugin-histogram'
+And then execute:
+    $ bundle
+Or install it yourself as:
+    $ gem install fluent-plugin-histogram
+## Contributing
+1. Fork it
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Add some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create new Pull Request
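
The README's `alpha` picture can be sketched in a few lines of Ruby. This is an illustrative stand-alone version, not the plugin's code path: `rough_increment` is a name chosen here, and the bin index comes from `String#hash`, which Ruby seeds per process, so the bin positions (but not the totals) differ between runs.

```ruby
# Sketch of the "rough" count-up: each key adds a small triangle of counts
# around its bin instead of a single +1.
def rough_increment(hist, bin_id, alpha)
  (0..alpha).each do |level|
    (-level..level).each do |offset|
      # wrap around at both ends of the fixed-width histogram
      hist[(bin_id + offset) % hist.size] += 1
    end
  end
  hist
end

bin_num = 100
alpha   = 1
hist    = Array.new(bin_num, 0)

# The seven keys sent by the two fluent-cat commands in Example 1.
%w[A B C A A B D].each do |key|
  rough_increment(hist, key.hash % bin_num, alpha)
end

# Each key contributes (alpha + 1)**2 counts, so 7 keys with alpha = 1 give 28,
# which matches the "sum":28 shown in Example 1.
puts hist.sum
```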

data/Rakefile ADDED

@@ -0,0 +1,10 @@
+require "bundler/gem_tasks"
+require 'rake/testtask'
+Rake::TestTask.new(:test) do |test|
+    test.libs << 'lib' << 'test'
+    test.pattern = 'test/**/test_*.rb'
+    test.verbose = true
+end
+task :default => :test

data/bench/README.md ADDED

@@ -0,0 +1,30 @@
+Benchmark tool for Fluent event collector
+=========================================
+## Install
+    # genload.rb depends on fluent gem
+    $ gem install fluent
+## Usage
+    Usage: genload [options] <tag> <num>
+        -p, --port PORT                  fluent tcp port (default: 24224)
+        -h, --host HOST                  fluent host (default: 127.0.0.1)
+        -u, --unix                       use unix socket instead of tcp
+        -P, --path PATH                  unix socket path (default: /var/run/fluent/fluent.sock)
+        -r, --repeat NUM                 repeat number (default: 1)
+        -m, --multi NUM                  send multiple records at once (default: 1)
+        -c, --concurrent NUM             number of threads (default: 1)
+        -l, --record_len NUM             number of keys in a record (default: 5)
+        -G, --no-packed                  don't use lazy deserialization optimize
+## Examples
+    # uses "benchmark.buffered" tag and sends 50,000 records
+    # -c: uses 10 threads/connections;
+    # -m: one message includes 20 record
+    # -r: repeats 100 times
+    ruby genload.rb benchamrk.buffered 50000 -c 10 -m 20 -r 100

data/bench/genload.rb ADDED

@@ -0,0 +1,152 @@
+require 'optparse'
+require 'fluent/env'
+op = OptionParser.new
+op.banner += " <tag> <num>"
+port = Fluent::DEFAULT_LISTEN_PORT
+host = '127.0.0.1'
+unix = false
+socket_path = Fluent::DEFAULT_SOCKET_PATH
+send_timeout = 20.0
+repeat = 1
+para = 1
+multi = 1
+record_len = 5
+packed = true
+config_path = Fluent::DEFAULT_CONFIG_PATH
+op.on('-p', '--port PORT', "fluent tcp port (default: #{port})", Integer) {|i|
+  port = i
+}
+op.on('-h', '--host HOST', "fluent host (default: #{host})") {|s|
+  host = s
+}
+op.on('-u', '--unix', "use unix socket instead of tcp", TrueClass) {|b|
+  unix = b
+}
+op.on('-P', '--path PATH', "unix socket path (default: #{socket_path})") {|s|
+  socket_path = s
+}
+op.on('-r', '--repeat NUM', "repeat number (default: 1)", Integer) {|i|
+  repeat = i
+}
+op.on('-m', '--multi NUM', "send multiple records at once (default: 1)", Integer) {|i|
+  multi = i
+}
+op.on('-l', '--record_len NUM', "number of keys in each record (default: 5)", Integer) {|i|
+  record_len = i
+}
+op.on('-c', '--concurrent NUM', "number of threads (default: 1)", Integer) {|i|
+  para = i
+}
+op.on('-G', '--no-packed', "don't use lazy deserialization optimize") {|i|
+  packed = false
+}
+(class<<self;self;end).module_eval do
+  define_method(:usage) do |msg|
+    puts op.to_s
+    puts "error: #{msg}" if msg
+    exit 1
+  end
+end
+begin
+  op.parse!(ARGV)
+  if ARGV.length != 2
+    usage nil
+  end
+  tag = ARGV.shift
+  num = ARGV.shift.to_i
+rescue
+  usage $!.to_s
+end
+require 'socket'
+require 'msgpack'
+require 'benchmark'
+def gen_word(len=nil)
+  len = rand(5) + 1 unless len
+  rand(36**len).to_s(36)
+end
+def gen_record(num=5, w_len=nil)
+  (1..num).reduce([]) {|ret| ret << gen_word(w_len)}
+end
+connector = Proc.new {
+  if unix
+    sock = UNIXSocket.open(socket_path)
+  else
+    sock = TCPSocket.new(host, port)
+  end
+  opt = [1, send_timeout.to_i].pack('I!I!')  # { int l_onoff; int l_linger; }
+  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_LINGER, opt)
+  opt = [send_timeout.to_i, 0].pack('L!L!')  # struct timeval
+  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_SNDTIMEO, opt)
+  sock
+}
+def gen_data(tag, multi=1, r_len=5)
+  time = Time.now.to_i
+  data = ''
+  multi.times do
+    record = {"keys"=>gen_record(r_len)}
+    [time, record].to_msgpack(data)
+  end
+  data = [tag, data].to_msgpack
+end
+size = 0 # sum of data.bytesize
+repeat.times do
+  puts "--- #{Time.now}"
+  Benchmark.bm do |x|
+    start = Time.now
+    lo = num / para / multi
+    lo = 1 if lo == 0
+    x.report do
+      (1..para).map {
+        Thread.new do
+          sock = connector.call
+          lo.times do
+            data = gen_data(tag, multi, record_len)
+            size += data.bytesize
+            sock.write data
+          end
+          sock.close
+        end
+      }.each {|t|
+        t.join
+      }
+    end
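+    finish = Time.now
+    elapsed = finish - start
+    # size already holds the total bytes written by all threads in this run
+    puts "% 10.3f Mbps" % [size*8/elapsed/1000/1000]
+    puts "% 10.3f records/sec" % [lo*para*multi/elapsed]
+    size = 0 # reset the byte counter for the next repeat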
+  end
+end
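
The loop arithmetic in `genload.rb` decides how many records are really sent, and from that the histogram total can be predicted. The sketch below uses hypothetical helper names (`records_sent`, `expected_hist_sum`) and assumes `sampling_rate 1` and that no records are lost in transit.

```ruby
# Hypothetical helper mirroring genload.rb's loop count:
# each of `threads` threads runs `loops` iterations of `multi` records.
def records_sent(num, threads: 1, multi: 1)
  loops = num / threads / multi # integer division, as in genload.rb
  loops = 1 if loops == 0
  loops * threads * multi
end

# Expected "sum" of the emitted histogram: every key adds (alpha + 1)**2 counts.
def expected_hist_sum(records, keys_per_record: 5, alpha: 1)
  records * keys_per_record * (alpha + 1)**2
end

# README Example 2: `ruby bench/genload.rb input.sample 5000` with defaults.
puts records_sent(5000)
puts expected_hist_sum(records_sent(5000)) # agrees with "sum":100000 in Example 2

# bench/README.md example: 50000 records, 10 threads, 20 records per message.
puts records_sent(50_000, threads: 10, multi: 20)

# When the divisions truncate, fewer records than requested are sent.
puts records_sent(1000, threads: 3, multi: 7)
```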

data/fluent-plugin-histogram.gemspec ADDED

@@ -0,0 +1,25 @@
+# coding: utf-8
+Gem::Specification.new do |gem|
+  gem.name          = "fluent-plugin-histogram"
+  gem.version       = "0.1.2"
+  gem.authors       = ["Yusuke SHIMIZU"]
+  gem.email         = "a.ryuklnm@gmail.com"
+  gem.description   = "Combine inputs data and make histogram which helps to detect a hotspot."
+  gem.summary       = "Combine inputs data and make histogram which helps to detect a hotspot."
+  gem.homepage      = "https://github.com/karahiyo/fluent-plugin-histogram"
+  gem.license       = "APLv2"
+  gem.rubyforge_project = "fluent-plugin-histogram"
+  gem.files         = `git ls-files`.split($/)
+  gem.executables   = gem.files.grep(%r{^bin/}) { |f| File.basename(f) }
+  gem.test_files    = gem.files.grep(%r{^(test|spec|features)/})
+  gem.require_paths = ["lib"]
+  gem.add_development_dependency "bundler", "~> 1.3"
+  gem.add_development_dependency "rake", ">= 0.9.2"
+  gem.add_development_dependency "fluentd", "~> 0.10.9"
+  gem.add_runtime_dependency "fluent-mixin-config-placeholders", "~> 0.2.3"
+end
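
The `~>` constraints in the gemspec are pessimistic version constraints. As a quick check of which versions they admit, one can ask RubyGems directly; the version numbers tested below are arbitrary examples, not releases known to exist.

```ruby
require 'rubygems'

# "~> 0.2.3" allows 0.2.3 and later patch releases, but not 0.3.0.
runtime = Gem::Requirement.new("~> 0.2.3")
puts runtime.satisfied_by?(Gem::Version.new("0.2.9")) # inside the 0.2.x range
puts runtime.satisfied_by?(Gem::Version.new("0.3.0")) # next minor is excluded

# "~> 1.3" allows any 1.x from 1.3 upward, but not 2.0.
bundler = Gem::Requirement.new("~> 1.3")
puts bundler.satisfied_by?(Gem::Version.new("1.9"))
puts bundler.satisfied_by?(Gem::Version.new("2.0"))
```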

data/lib/fluent/plugin/out_histogram.rb ADDED

@@ -0,0 +1,179 @@
+# -*- coding: utf-8 -*-
+require 'fluent/mixin/config_placeholders'
+module Fluent
+  class HistogramOutput < Fluent::Output
+    Fluent::Plugin.register_output('histogram', self)
+    config_param :tag, :string, :default => nil
+    config_param :tag_prefix, :string, :default => nil
+    config_param :tag_suffix, :string, :default => nil
+    config_param :input_tag_remove_prefix, :string, :default => nil
+    config_param :flush_interval, :time, :default => 60
+    config_param :count_key, :string, :default => 'keys'
+    config_param :bin_num, :integer, :default => 100
+    config_param :alpha, :integer, :default => 1
+    config_param :sampling_rate, :integer, :default => 1
+    include Fluent::Mixin::ConfigPlaceholders
+    attr_accessor :flush_interval
+    attr_accessor :hists
+    attr_accessor :zero_hist
+    attr_accessor :remove_prefix_string
+    ## fluentd output plugin's methods
+    def initialize
+      super
+    end
+    def configure(conf)
+      super
+      raise Fluent::ConfigError, "bin_num must be > 0" if @bin_num <= 0
+      $log.warn %Q[too small "bin_num(=#{@bin_num})" may raise unexpected outcome] if @bin_num < 100
+      @tag_prefix_string = @tag_prefix + '.' if @tag_prefix
+      @tag_suffix_string = '.' + @tag_suffix if @tag_suffix
+      if @input_tag_remove_prefix
+        @remove_prefix_string = @input_tag_remove_prefix + '.'
+        @remove_prefix_length = @remove_prefix_string.length
+      end
+      @zero_hist = [0] * @bin_num
+      @hists = initialize_hists
+      @sampling_counter = 0
+      @mutex = Mutex.new
+    end
+    def start
+      super
+      @watcher = Thread.new(&method(:watch))
+    end
+    def watch
+      @last_checked = Fluent::Engine.now
+      while true
+        sleep 0.5
+        if Fluent::Engine.now - @last_checked >= @flush_interval
+          now = Fluent::Engine.now
+          flush_emit(now)
+          @last_checked = now
+        end
+      end
+    end
+    def shutdown
+      super
+      @watcher.terminate
+      @watcher.join
+    end
+    ## Histogram plugin's method
+    def initialize_hists(tags=nil)
+      hists = {}
+      if tags
+        tags.each do |tag|
+          hists[tag] = @zero_hist.dup
+        end
+      end
+      hists
+    end
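+    def increment(tag, key)
+      @hists[tag] ||= @zero_hist.dup
+      # map the key to a bin; Ruby's modulo keeps the index non-negative
+      id = key.hash % @bin_num
+      @mutex.synchronize {
+        # add one row of the triangle per level: 1 bin, then 3, ... up to 2*alpha+1
+        (0..@alpha).each do |alpha|
+          (-alpha..alpha).each do |a|
+            # weight by sampling_rate to make up for the keys skipped in emit
+            @hists[tag][(id + a) % @bin_num] += 1 * @sampling_rate
+          end
+        end
+      }
+    end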
+    def emit(tag, es, chain)
+      chain.next
+      es.each do |time, record|
+        keys = record[@count_key]
+        [keys].flatten.each do |k|
+          if @sampling_rate == 1
+            increment(tag, k)
+          else
+            @sampling_counter += 1
+            if @sampling_counter >= @sampling_rate
+              increment(tag, k)
+              @sampling_counter = 0
+            end
+          end
+        end
+      end
+    end
+    def tagging(flushed)
+      tagged = {}
+      tagged = Hash[ flushed.map do |tag, hist|
+        if @tag
+          tag = @tag
+        else
+          if @input_tag_remove_prefix &&
+            ( ( tag.start_with?(@remove_prefix_string) &&
+               tag.length > @remove_prefix_length ) ||
+               tag == @input_tag_remove_prefix)
+            tag = tag[@input_tag_remove_prefix.length..-1]
+            tag.gsub!(/^\.|\.$/, "")
+          end
+          if @tag_prefix
+            tag = @tag_prefix_string + tag
+            tag.gsub!(/^\.|\.$/, "")
+          end
+          if @tag_suffix
+            tag += @tag_suffix_string
+            tag.gsub!(/^\.|\.$/, "")
+          end
+        end
+        [tag, hist]
+      end ]
+      tagged
+    end
+    def generate_output(flushed)
+      output = {}
+      flushed.each do |tag, hist|
+        output[tag] = {}
+        sum = hist.inject(:+)
+        avg = sum / hist.size
+        sd = hist.instance_eval do
+          sigmas = map { |n| (avg - n)**2 }
+          Math.sqrt(sigmas.inject(:+) / size)
+        end
+        output[tag][:hist] = hist
+        output[tag][:sum] = sum
+        output[tag][:avg] = avg
+        output[tag][:sd] = sd.to_i
+      end
+      output
+    end
+    def flush
+      flushed, @hists = generate_output(@hists), initialize_hists(@hists.keys.dup)
+      tagging(flushed)
+    end
+    def flush_emit(now)
+      flushed = flush
+      flushed.each do |tag, data|
+        Fluent::Engine.emit(tag, now, data)
+      end
+    end
+  end
+end
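
The statistics in `generate_output` use Integer arithmetic throughout, which is why README Example 1 reports `"avg":0` and `"sd":0`. The sketch below repeats that calculation outside Fluentd; `summarize` and `summarize_float` are names chosen here, and the Float variant is only a comparison, not something the plugin does.

```ruby
# Same arithmetic as generate_output: Integer division truncates the average
# and the variance, and to_i truncates the square root.
def summarize(hist)
  sum = hist.inject(:+)
  avg = sum / hist.size
  squared = hist.map { |n| (avg - n)**2 }
  sd = Math.sqrt(squared.inject(:+) / hist.size).to_i
  { :sum => sum, :avg => avg, :sd => sd }
end

# Comparison only: the same statistics without truncation.
def summarize_float(hist)
  sum = hist.inject(:+)
  avg = sum.to_f / hist.size
  variance = hist.map { |n| (avg - n)**2 }.inject(:+) / hist.size
  { :sum => sum, :avg => avg, :sd => Math.sqrt(variance) }
end

# The non-zero bins of README Example 1, padded with zeros to 100 bins
# (bin order does not affect these statistics).
hist = [3, 6, 3, 2, 4, 2, 1, 2, 1, 1, 2, 1] + [0] * 88

p summarize(hist)       # sum 28, avg 0, sd 0, as in the README output
p summarize_float(hist) # avg 0.28 and a non-zero sd
```

With few counts per bin the truncated `sd` stays at 0, so comparisons such as the one in `test_can_detect_hotspot` only become meaningful at higher volumes.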

data/test/helper.rb ADDED

@@ -0,0 +1,30 @@
+require 'rubygems'
+require 'bundler'
+begin
+  Bundler.setup(:default, :development)
+rescue Bundler::BundlerError => e
+  $stderr.puts e.message
+  $stderr.puts "Run `bundle install` to install missing gems"
+  exit e.status_code
+end
+require 'test/unit'
+$LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '/../lib', ))
+$LOAD_PATH.unshift(File.dirname(__FILE__))
+require 'fluent/test'
+unless ENV.has_key? 'VERBOSE'
+  nulllogger = Object.new
+  nulllogger.instance_eval {
+    def method_missing(method, *args)
+      # pass
+    end
+  }
+  $log = nulllogger
+end
+require 'fluent/plugin/out_histogram'
+class Test::Unit::TestCase
+end

data/test/plugin/test_out_histogram.rb ADDED

@@ -0,0 +1,202 @@
+# -*- coding: utf-8 -*-
+require 'helper'
+class HistogramOutputTest < Test::Unit::TestCase
+  def setup
+    Fluent::Test.setup
+  end
+  CONFIG = %[
+  count_key      keys
+  flush_interval 60s
+  bin_num        100
+  tag_prefix     histo
+  input_tag_remove_prefix test.input
+  ]
+  def create_driver(conf = CONFIG, tag='test')
+    Fluent::Test::OutputTestDriver.new(Fluent::HistogramOutput, tag).configure(conf)
+  end
+  def test_configure
+    assert_raise(Fluent::ConfigError) {
+      create_driver %[ bin_num 0]
+    }
+  end
+  def test_small_increment_no_alpha
+    bin_num = 100
+    alpha = 0
+    f = create_driver(%[
+                        bin_num #{bin_num}
+                        alpha #{alpha}])
+    f.instance.increment("test.input", "A")
+    f.instance.increment("test.input", "B")
+    zero = f.instance.zero_hist.dup
+    id = "A".hash % bin_num
+    zero[id] += 1
+    id = "B".hash % bin_num
+    zero[id] += 1
+    assert_equal({"test.input" => {:hist => zero, :sum => 2, :avg => 2/bin_num, :sd=>0}},
+                 f.instance.flush)
+  end
+  def test_small_increment_with_alpha
+    bin_num = 100
+    alpha = 1
+    f = create_driver(%[
+                        bin_num #{bin_num}
+                        alpha #{alpha}])
+    f.instance.increment("test.input", "A")
+    f.instance.increment("test.input", "B")
+    zero = f.instance.zero_hist.dup
+    id = "A".hash % bin_num
+    zero[id] += 2
+    zero[(id + alpha) % bin_num] += 1
+    zero[id - alpha] += 1
+    id = "B".hash % bin_num
+    zero[id] += 2
+    zero[(id + alpha) % bin_num] += 1
+    zero[id - alpha] += 1
+    assert_equal({"test.input" => {:hist => zero, :sum => 2*3+2, :avg => (2*3+2)/bin_num, :sd=>0}},
+                 f.instance.flush)
+  end
+  def test_tagging_with_flush
+    f = create_driver(%[tag_prefix histo])
+    f.instance.increment("test",  "A")
+    flushed = f.instance.flush
+    assert_equal("histo.test", flushed.keys.join(''))
+    f = create_driver(%[
+                      tag_prefix histo
+                      input_tag_remove_prefix test])
+    f.instance.increment("test", "A")
+    flushed = f.instance.flush
+    assert_equal("histo", flushed.keys.join(''))
+  end
+  def test_tagging
+    f = create_driver(%[
+                      hostname localhost
+                      tag_prefix histo
+                      input_tag_remove_prefix test
+                      tag_suffix __HOSTNAME__ ])
+    # input tag is one
+    data = {"test.input" => [1, 2, 3, 4, 5]}
+    tagged = f.instance.tagging(data)
+    assert_equal("histo.input.localhost", tagged.keys.join(''))
+    # input tag is more than one
+    data = {"test.a" => [1, 2, 3], "test.b" => [1, 2]}
+    tagged = f.instance.tagging(data)
+    assert_equal(true, tagged.key?("histo.a.localhost"))
+    assert_equal(true, tagged.key?("histo.b.localhost"))
+  end
+  def test_tagging_use_tag
+    f = create_driver(%[ tag histo ])
+    data = {"test.input" => [1, 2, 3, 4, 5]}
+    tagged = f.instance.tagging(data)
+    assert_equal("histo", tagged.keys.join(''))
+  end
+  def test_increment_sum
+    bin_num = 100
+    f = create_driver(%[
+                        bin_num #{bin_num}
+                        alpha   1 ])
+    1000.times do |i|
+      f.instance.increment("test.input", i.to_s)
+    end
+    flushed = f.instance.flush
+    assert_equal(1000*4, flushed["test.input"][:sum])
+    assert_equal(1000*4/bin_num, flushed["test.input"][:avg])
+  end
+  def test_emit
+    bin_num = 100
+    f = create_driver(%[
+                      bin_num #{bin_num}
+                      alpha 1 ])
+    f.run do
+      100.times do
+        f.emit({"keys" => ["A", "B", "C"]})
+      end
+    end
+    flushed = f.instance.flush
+    assert_equal(300*4, flushed["test"][:sum])
+    assert_equal(300*4/bin_num, flushed["test"][:avg])
+  end
+  def test_some_hist_exist_case_tagging_with_emit
+    f = create_driver
+    data = {"keys" => ["A", "B", "C"]}
+    f.run do
+      ["test.a", "test.b", "test.c"].each do |tag|
+        f.instance.increment(tag, data)
+      end
+    end
+    f.instance.flush # clear hist
+    flushed = f.instance.flush
+    assert_equal(true, flushed.key?("histo.test.a"))
+    assert_equal(true, flushed.key?("histo.test.b"))
+    assert_equal(true, flushed.key?("histo.test.c"))
+  end
+  def test_can_detect_hotspot
+    f = create_driver(%[
+                        count_key      keys
+                        flush_interval 10s
+                        bin_num        100
+                        tag_prefix     histo
+                        tag_suffix     __HOSTNAME__
+                        hostname       localhost
+                        input_tag_remove_prefix test])
+    # ("A".."ZZ").to_a.size == 702
+    data = ("A".."ZZ").to_a.shuffle
+    f.run do
+      100.times do
+        data.each_slice(10) do |d|
+          f.emit({"keys" => d})
+        end
+      end
+    end
+    flushed_even = f.instance.flush["histo.localhost"]
+    #('A'..'ZZ').to_a.shuffle.size == 702
+    # Here, replace 8 values of ('A'..'ZZ') (indices 0, 100, ..., 700) with 'D' as an example hotspot.
+    data.size.times {|i| data[i] = 'D' if i%100 == 0 }
+    f.run do
+      100.times do
+        data.each_slice(10) do |d|
+          f.emit({"keys" =>  d})
+        end
+      end
+    end
+    flushed_uneven = f.instance.flush["histo.localhost"]
+    assert_equal(true, flushed_even[:sd] < flushed_uneven[:sd])
+  end
+  def test_sampling
+    bin_num = 100
+    sampling_rate = 10
+    f = create_driver(%[
+                      bin_num #{bin_num}
+                      sampling_rate #{sampling_rate}
+                      alpha 0 ])
+    f.run do
+      100.times do
+        f.emit({"keys" => ["A", "B", "C"]})
+      end
+    end
+    flushed = f.instance.flush
+    assert_equal(300, flushed["test"][:sum])
+    assert_equal(300/bin_num, flushed["test"][:avg])
+  end
+end
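
The tag expectations in `test_tagging` and `test_tagging_with_flush` follow from a three-step rewrite: drop the input prefix, prepend `tag_prefix`, append `tag_suffix`, trimming a stray leading or trailing dot after each step. A stand-alone sketch of that rewrite follows; `rewrite_tag` and its keyword arguments are hypothetical, and the fixed `tag` option, which overrides everything, is left out.

```ruby
# Hypothetical stand-alone version of the tag rewriting checked by the tests.
def rewrite_tag(tag, remove_prefix: nil, prefix: nil, suffix: nil)
  strip = ->(t) { t.gsub(/\A\.|\.\z/, '') } # trim one leading/trailing dot

  if remove_prefix &&
     ((tag.start_with?(remove_prefix + '.') && tag.length > remove_prefix.length + 1) ||
      tag == remove_prefix)
    tag = strip.call(tag[remove_prefix.length..-1])
  end
  tag = strip.call("#{prefix}.#{tag}") if prefix
  tag = strip.call("#{tag}.#{suffix}") if suffix
  tag
end

# Cases taken from the tests above (with __HOSTNAME__ already expanded to "localhost").
puts rewrite_tag("test.input", remove_prefix: "test", prefix: "histo", suffix: "localhost")
puts rewrite_tag("test", prefix: "histo")
puts rewrite_tag("test", remove_prefix: "test", prefix: "histo")
puts rewrite_tag("test", remove_prefix: "test", prefix: "histo", suffix: "localhost")
```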

metadata ADDED

@@ -0,0 +1,112 @@
+--- !ruby/object:Gem::Specification
+name: fluent-plugin-histogram
+version: !ruby/object:Gem::Version
+  version: 0.1.2
+platform: ruby
+authors:
+- Yusuke SHIMIZU
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2014-01-28 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '1.3'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: '1.3'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: 0.9.2
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: 0.9.2
+- !ruby/object:Gem::Dependency
+  name: fluentd
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 0.10.9
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 0.10.9
+- !ruby/object:Gem::Dependency
+  name: fluent-mixin-config-placeholders
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 0.2.3
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 0.2.3
+description: Combine inputs data and make histogram which helps to detect a hotspot.
+email: a.ryuklnm@gmail.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- .gitignore
+- Gemfile
+- LICENSE.txt
+- README.md
+- Rakefile
+- bench/README.md
+- bench/genload.rb
+- fluent-plugin-histogram.gemspec
+- lib/fluent/plugin/out_histogram.rb
+- test/helper.rb
+- test/plugin/test_out_histogram.rb
+homepage: https://github.com/karahiyo/fluent-plugin-histogram
+licenses:
+- APLv2
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project: fluent-plugin-histogram
+rubygems_version: 2.2.1
+signing_key:
+specification_version: 4
+summary: Combine inputs data and make histogram which helps to detect a hotspot.
+test_files:
+- test/helper.rb
+- test/plugin/test_out_histogram.rb