RubyGems - kbaum-munger - Versions diffs - 0.1.4 - Mend

kbaum-munger 0.1.4

Files changed (28)

data/.gitignore +5 -0
data/README +90 -0
data/Rakefile +21 -0
data/VERSION +1 -0
data/examples/column_add.rb +30 -0
data/examples/development.log +2 -0
data/examples/example_helper.rb +23 -0
data/examples/sinatra_app.rb +100 -0
data/examples/test.html +0 -0
data/lib/munger.rb +16 -0
data/lib/munger/data.rb +232 -0
data/lib/munger/item.rb +50 -0
data/lib/munger/render.rb +22 -0
data/lib/munger/render/csv.rb +31 -0
data/lib/munger/render/html.rb +89 -0
data/lib/munger/render/sortable_html.rb +133 -0
data/lib/munger/render/text.rb +54 -0
data/lib/munger/report.rb +349 -0
data/spec/munger/data/new_spec.rb +18 -0
data/spec/munger/data_spec.rb +140 -0
data/spec/munger/item_spec.rb +37 -0
data/spec/munger/render/csv_spec.rb +21 -0
data/spec/munger/render/html_spec.rb +75 -0
data/spec/munger/render/text_spec.rb +22 -0
data/spec/munger/render_spec.rb +28 -0
data/spec/munger/report_spec.rb +148 -0
data/spec/spec_helper.rb +76 -0
metadata +92 -0

data/.gitignore ADDED

@@ -0,0 +1,5 @@
+pkg
+*~
+report.html
+coverage
+doc

data/README ADDED

@@ -0,0 +1,90 @@
+Munger Ruby Reporting Library
+=============================
+Munger is basically a simple data munging and reporting library
+for Ruby as an alternative to Ruport, which did not fill my needs
+in ways that convinced me to start over rather than try to fork or
+patch it.  Apologies to the Ruport chaps, who I am sure are
+smashing blokes - it just didn't wiggle my worm.
+See the Wiki for details : http://github.com/schacon/munger/wikis
+3-Part Reporting
+=============================
+Munger creates reports in three stages, much like an Apollo rocket. My
+main problem with Ruport was the coupling of different parts of these
+stages in ways that didn't make the data easily re-usable, cacheable or
+didn't give me enough control.  I like to have my data separate from my
+report, which should be renderable however I want.
+* Stage 1 - Data Munging *
+The first stage is getting a dataset that has all the information you need.
+I like to call this stage 'munging' (pronounced: 'MON'-day + chan-'GING'),
+which is taking a simple set of data (from a SQL query, perhaps) and
+transforming fields, adding derived data, pivoting, etc - and making it into
+a table of all the actual data-points you need.
+* Stage 2 - Report Formatting *
+Then there is the Reporting.  To me, this means taking your massaged dataset
+and doing all the fun reporting to it.  This includes grouping, subgrouping,
+sorting, column ordering, multi-level aggregation (sums, avg, etc) and
+highlighting important information (values that are too small, too high, etc).
+It can be argued that pivoting should be at this level, rather than the first,
+but I decided to put it in the first stage instead, mostly because I really think of the
+pivoted data as a different data set and also for performance reasons - the
+pivot data can be a bear to produce, and I plan on caching the first stage and
+then running different reporting options on it.
+* Stage 3 - Output Rendering *
+Now that I have my super spiffy report, I want to be able to render it however
+I want, possibly in multiple formats - HTML and XLS are the most important to
+me, but PDF, text, csv, etc will also likely be produced eventually.
+Examples
+=============================
+The starting data can be ActiveRecord collections or an array of Hashes.
+# webpage_hit table has ip_address, hit_date, action, referrer #
+* Simple Example *
+hits = WebpageHits.find(:all, :conditions => ['hit_date > ?', 1.days.ago])
+@table_data = Munger::Report.new(:data => hits)
+@table_data.sort('hit_date').aggregate(:count => :action)
+html_table = Munger::Render::Html.new(@table_data).render
+* More Complex Example *
+hits = WebpageHits.find(:all, :conditions => ['hit_date > ?', 7.days.ago])
+data = Munger::Data.new(:data => hits)
+data.transform_column('hit_date') { |row| row.hit_date.day }
+data.add_column('controller') { |row| row.action.split('/').first }
+day_columns = data.pivot('hit_date', 'action', 'ip_address', :count)
+@table_data = Munger::Report.new(:data => data,
+                                :columns => [:action] + day_columns,
+                                :aggregate => {:sum => day_columns})
+@table_data.sort('action').subgroup('controller')
+@table_data.process.style_cells('low_traffic', :only => day_columns) do |cell, row|
+  # highlight any index pages that have < 500 hits
+  cell.to_i < 500 if row.action =~ /index/
+end
+html_table = Munger::Render::Html.new(@table_data).render
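
The three stages the README describes can be sketched in plain Ruby. This is only an illustration of the munge, report, render split and does not call Munger; the sample rows and the `render_text` helper are hypothetical.

```ruby
# Plain-Ruby sketch of the three stages; it does not use Munger itself.
# The sample rows and helper names below are assumptions for illustration.
hits = [
  { ip_address: "10.0.0.1", hit_date: 3, action: "home/index" },
  { ip_address: "10.0.0.2", hit_date: 3, action: "home/index" },
  { ip_address: "10.0.0.1", hit_date: 4, action: "cart/show" }
]

# Stage 1 - munging: add a derived column to every row.
munged = hits.map { |row| row.merge(controller: row[:action].split("/").first) }

# Stage 2 - reporting: sort, then count hits per action.
sorted = munged.sort_by { |row| [row[:action], row[:hit_date]] }
counts = sorted.group_by { |row| row[:action] }
               .map { |action, rows| { action: action, hits: rows.size } }

# Stage 3 - rendering: emit the same report as comma-separated text.
def render_text(rows)
  header = rows.first.keys
  ([header] + rows.map { |row| header.map { |key| row[key] } })
    .map { |cells| cells.join(",") }
    .join("\n")
end

puts render_text(counts)
```

Because stage 2 only reads the munged rows, the same `munged` array could be cached and fed to several different reports or renderers, which is the separation the README argues for.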

data/Rakefile ADDED

@@ -0,0 +1,21 @@
+require 'rubygems'
+#Gem::manage_gems
+require 'rake/gempackagetask'
+require 'rake/rdoctask'
+require 'spec/rake/spectask'
+begin
+  require 'jeweler'
+  Jeweler::Tasks.new do |gemspec|
+    gemspec.name = "kbaum-munger"
+    gemspec.version="0.1.4"
+    gemspec.summary = "fork of munger reporting to create a gem"
+    gemspec.description = "A different and possibly longer explanation of"
+    gemspec.email = "karl@weshopnetwork.com"
+    gemspec.homepage = "http://github.com/kbaum/munger"
+    gemspec.authors = ["Karl Baum"]
+  end
+rescue LoadError
+  puts "Jeweler not available. Install it with: sudo gem install jeweler"
+end

data/VERSION ADDED

@@ -0,0 +1 @@
+0.1.4

data/examples/column_add.rb ADDED

@@ -0,0 +1,30 @@
+require File.dirname(__FILE__) + "/example_helper"
+include ExampleHelper
+data = Munger::Data.load_data(test_data)
+data.add_column([:advert, :rate]) do |row|
+  rate = (row.clicks / row.airtime)
+  [row.advert.capitalize, rate]
+end
+#data.filter_rows { |row| row.rate > 10 }
+#new_columns = data.pivot('airtime', 'advert', 'rate', :average)
+report = Munger::Report.from_data(data)
+report.columns(:advert => 'Spot', :airdate => 'Air Date', :airtime => 'Airtime', :rate => 'Rate')
+report.sort = [['airtime', :asc], ['rate', :asc]]
+#report.subgroup('airtime')
+#report.aggregate(Proc.new {|arr| arr.inject(0) {|total, i| i * i + (total - 30) }} => :airtime, :avg => :rate)
+report.process
+report.style_cells('myRed', :only => :rate) { |cell, row| (cell.to_i < 10) }
+html = Munger::Render.to_html(report, :classes => {:table => 'other-class'} )
+puts text = Munger::Render.to_text(report)
+f = File.open('test.html', 'w')
+f.write(html)
+f.close
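
One detail worth noting in the example above: `row.clicks / row.airtime` divides two integers from the sample data, so Ruby discards the fractional part. The sketch below shows the difference, assuming the row values reach the block as plain Integers; the variable names are illustrative only.

```ruby
# Values taken from the second sample row (clicks 199, airtime 30).
clicks = 199
airtime = 30

# Integer#/ rounds toward negative infinity, so the fraction is lost.
integer_rate = clicks / airtime

# Converting one operand to Float keeps the fractional part.
float_rate = clicks.to_f / airtime

puts integer_rate
puts float_rate.round(2)
```

Whether the truncated rate is acceptable depends on the report; the `cell.to_i < 10` highlight rule in the example gives the same answer either way.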

data/examples/development.log ADDED

@@ -0,0 +1,2 @@
+GET /example | Status: 200 | Params: {:format=>"html"}
+GET /favicon.ico | Status: 404 | Params: {}

data/examples/example_helper.rb ADDED

@@ -0,0 +1,23 @@
+require File.expand_path(File.dirname(__FILE__) + "/../lib/munger")
+require 'fileutils'
+require 'logger'
+require 'pp'
+module ExampleHelper
+  def test_data
+    [
+      {:advert => "spot 1", :airtime => 15, :airdate => "2008-01-01", :clicks => 301},
+      {:advert => "spot 1", :airtime => 30, :airdate => "2008-01-02", :clicks => 199},
+      {:advert => "spot 1", :airtime => 30, :airdate => "2008-01-03", :clicks => 234},
+      {:advert => "spot 1", :airtime => 15, :airdate => "2008-01-04", :clicks => 342},
+      {:advert => "spot 2", :airtime => 30, :airdate => "2008-01-01", :clicks => 172},
+      {:advert => "spot 2", :airtime => 15, :airdate => "2008-01-02", :clicks => 217},
+      {:advert => "spot 2", :airtime => 90, :airdate => "2008-01-03", :clicks => 1023},
+      {:advert => "spot 2", :airtime => 30, :airdate => "2008-01-04", :clicks => 321},
+      {:advert => "spot 3", :airtime => 60, :airdate => "2008-01-01", :clicks => 512},
+      {:advert => "spot 3", :airtime => 30, :airdate => "2008-01-02", :clicks => 813},
+      {:advert => "spot 3", :airtime => 15, :airdate => "2008-01-03", :clicks => 333},
+    ]
+  end
+end

data/examples/sinatra_app.rb ADDED

@@ -0,0 +1,100 @@
+require 'rubygems'
+require 'sinatra'
+require File.expand_path(File.dirname(__FILE__) + "/../lib/munger")
+get '/' do
+  data = Munger::Data.load_data(test_data)
+  report = Munger::Report.from_data(data)
+  report.process
+  out = Munger::Render.to_html(report, :classes => {:table => 'other-class'} )
+  show(out)
+end
+get '/pivot' do
+  data = Munger::Data.load_data(test_data)
+  data.add_column([:advert, :rate]) do |row|
+    rate = (row.clicks / row.airtime)
+    [row.advert.capitalize, rate]
+  end
+  new_columns = data.pivot('airtime', 'advert', 'rate', :average)
+  report = Munger::Report.from_data(data)
+  report.columns([:advert] + new_columns.sort)
+  report.process
+  report.style_cells('myRed', :only => new_columns) { |cell, row| (cell.to_i < 10 && cell.to_i > 0) }
+  out = Munger::Render.to_html(report, :classes => {:table => 'other-class'} )
+  show(out)
+end
+get '/example' do
+  data = Munger::Data.load_data(test_data)
+  data.add_column([:advert, :rate]) do |row|
+    rate = (row.clicks / row.airtime)
+    [row.advert.capitalize, rate]
+  end
+  #data.filter_rows { |row| row.rate > 10 }
+  #new_columns = data.pivot('airtime', 'advert', 'rate', :average)
+  report = Munger::Report.from_data(data)
+  report.columns(:advert => 'Spot', :airdate => 'Air Date', :airtime => 'Airtime', :rate => 'Rate')
+  report.sort = [['airtime', :asc], ['rate', :asc]]
+  report.subgroup('airtime', :with_titles => true)
+  report.aggregate(Proc.new {|arr| arr.inject(0) {|total, i| i * i + (total - 30) }} => :airtime, :average => :rate)
+  report.process
+  report.style_cells('myRed', :only => :rate) { |cell, row| (cell.to_i < 10) }
+  out = Munger::Render.to_html(report, :classes => {:table => 'other-class'} )
+  show(out)
+end
+def test_data
+  [
+    {:advert => "spot 1", :airtime => 15, :airdate => "2008-01-01", :clicks => 301},
+    {:advert => "spot 1", :airtime => 30, :airdate => "2008-01-02", :clicks => 199},
+    {:advert => "spot 1", :airtime => 30, :airdate => "2008-01-03", :clicks => 234},
+    {:advert => "spot 1", :airtime => 15, :airdate => "2008-01-04", :clicks => 342},
+    {:advert => "spot 2", :airtime => 30, :airdate => "2008-01-01", :clicks => 172},
+    {:advert => "spot 2", :airtime => 15, :airdate => "2008-01-02", :clicks => 217},
+    {:advert => "spot 2", :airtime => 90, :airdate => "2008-01-03", :clicks => 1023},
+    {:advert => "spot 2", :airtime => 30, :airdate => "2008-01-04", :clicks => 321},
+    {:advert => "spot 3", :airtime => 60, :airdate => "2008-01-01", :clicks => 512},
+    {:advert => "spot 3", :airtime => 30, :airdate => "2008-01-02", :clicks => 813},
+    {:advert => "spot 3", :airtime => 15, :airdate => "2008-01-03", :clicks => 333},
+  ]
+end
+def show(data)
+%Q(
+<html>
+  <head>
+    <style>
+      .myRed { background: #e44; }
+      tr.group0 { background: #bbb;}
+      tr.group1 { background: #ddd;}
+      tr.groupHeader1 { background: #ccc;}
+      table tr td {padding: 0 15px;}
+      table tr th { background: #aaa; padding: 5px; }
+      body { font-family: verdana, "Lucida Grande", arial, helvetica, sans-serif;
+        color: #333; }
+    </style>
+  </head>
+  <body>
+    #{data}
+  </body>
+</html>
+)
+end
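
The `/pivot` route above turns each distinct `airtime` value into its own column, one row per `advert`. A minimal plain-Ruby sketch of that idea follows. It is not Munger's implementation: `pivot_average` and its keyword names are hypothetical, it averages `clicks` rather than the derived `rate`, and it uses Float division.

```ruby
# A subset of the sample rows used by the example app.
rows = [
  { advert: "spot 1", airtime: 15, clicks: 301 },
  { advert: "spot 1", airtime: 30, clicks: 199 },
  { advert: "spot 1", airtime: 30, clicks: 234 },
  { advert: "spot 2", airtime: 30, clicks: 172 }
]

# Hypothetical helper: one output row per rows_from value, one output
# column per distinct columns_from value, each cell holding the average.
def pivot_average(rows, columns_from:, rows_from:, value:)
  cells = Hash.new { |h, k| h[k] = Hash.new { |hh, kk| hh[kk] = [] } }
  rows.each { |r| cells[r[rows_from]][r[columns_from]] << r[value] }

  cells.map do |row_key, by_column|
    averages = by_column.transform_values { |vals| vals.sum.to_f / vals.size }
    { rows_from => row_key }.merge(averages)
  end
end

table = pivot_average(rows, columns_from: :airtime, rows_from: :advert, value: :clicks)
table.each { |row| p row }
```

Rows that have no data for a given airtime simply lack that key, so a renderer has to decide how to display the missing cells.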

data/examples/test.html ADDED

File without changes

data/lib/munger.rb ADDED

@@ -0,0 +1,16 @@
+$:.unshift(File.dirname(__FILE__)) unless
+  $:.include?(File.dirname(__FILE__)) || $:.include?(File.expand_path(File.dirname(__FILE__)))
+require 'munger/data'
+require 'munger/report'
+require 'munger/item'
+require 'munger/render'
+require 'munger/render/csv'
+require 'munger/render/html'
+require 'munger/render/sortable_html'
+require 'munger/render/text'
+module Munger
+  VERSION = '0.1.3'
+end

data/lib/munger/data.rb ADDED

@@ -0,0 +1,232 @@
+module Munger #:nodoc:
+  # this class is a data munger
+  #  it takes raw data (arrays of hashes, basically)
+  #  and can manipulate it in various interesting ways
+  class Data
+    attr_accessor :data
+    # will accept active record collection or array of hashes
+    def initialize(options = {})
+      @data = options[:data] if options[:data]
+      yield self if block_given?
+    end
+    def <<(data)
+      add_data(data)
+    end
+    def add_data(data)
+      if @data
+        @data = @data + data
+      else
+        @data = data
+      end
+      @data
+    end
+    #--
+    # NOTE:
+    # The name seems redundant; why:
+    #   Munger::Data.load_data(data)
+    # and not:
+    #   Munger::Data.load(data)
+    #++
+    def self.load_data(data, options = {})
+      Data.new(:data => data)
+    end
+    def columns
+      @columns ||= clean_data(@data.first).to_hash.keys
+    rescue
+      puts clean_data(@data.first).to_hash.inspect
+    end
+    # :default:	The default value to use for the column in existing rows.
+    #           Set to nil if not specified.
+    # if a block is passed, you can set the values manually
+    def add_column(names, options = {})
+      default = options[:default] || nil
+      @data.each_with_index do |row, index|
+        if block_given?
+          col_data = yield Item.ensure(row)
+        else
+          col_data = default
+        end
+        if names.is_a? Array
+          names.each_with_index do |col, i|
+            row[col] = col_data[i]
+          end
+        else
+          row[names] = col_data
+        end
+        @data[index] = Item.ensure(row)
+      end
+    end
+    alias :add_columns :add_column
+    alias :transform_column :add_column
+    alias :transform_columns :add_column
+    def clean_data(hash_or_ar)
+      if hash_or_ar.is_a? Hash
+        return Item.ensure(hash_or_ar)
+      elsif hash_or_ar.respond_to? :attributes
+        return Item.ensure(hash_or_ar.attributes)
+      end
+      hash_or_ar
+    end
+    def filter_rows
+      new_data = []
+      @data.each do |row|
+        row = Item.ensure(row)
+        if (yield row)
+          new_data << row
+        end
+      end
+      @data = new_data
+    end
+    # group the data like sql
+    def group(groups, agg_hash = {})
+      data_hash = {}
+      agg_columns = []
+      agg_hash.each do |key, columns|
+        Data.array(columns).each do |col|  # column name
+          agg_columns << col
+        end
+      end
+      agg_columns = agg_columns.uniq.compact
+      @data.each do |row|
+        row_key = Data.array(groups).map { |rk| row[rk] }
+        data_hash[row_key] ||= {:cells => {}, :data => {}, :count => 0}
+        focus = data_hash[row_key]
+        focus[:data] = clean_data(row)
+        agg_columns.each do |col|
+          focus[:cells][col] ||= []
+          focus[:cells][col] << row[col]
+        end
+        focus[:count] += 1
+      end
+      new_data = []
+      new_keys = []
+      data_hash.each do |row_key, data|
+        new_row = data[:data]
+        agg_hash.each do |key, columns|
+          Data.array(columns).each do |col|  # column name
+            newcol = ''
+            if key.is_a?(Array) && key[1].is_a?(Proc)
+              newcol = key[0].to_s + '_' + col.to_s
+              new_row[newcol] = key[1].call(data[:cells][col])
+            else
+              newcol = key.to_s + '_' + col.to_s
+              case key
+              when :average
+                sum = data[:cells][col].inject { |sum, a| sum + a }
+                new_row[newcol] = (sum / data[:count])
+              when :count
+                new_row[newcol] = data[:count]
+              else
+                new_row[newcol] = data[:cells][col].inject { |sum, a| sum + a }
+              end
+            end
+            new_keys << newcol
+          end
+        end
+        new_data << Item.ensure(new_row)
+      end
+      @data = new_data
+      new_keys.compact
+    end
+    def pivot(columns, rows, value, aggregation = :sum)
+      data_hash = {}
+      @data.each do |row|
+        column_key = Data.array(columns).map { |rk| row[rk] }
+        row_key = Data.array(rows).map { |rk| row[rk] }
+        data_hash[row_key] ||= {}
+        data_hash[row_key][column_key] ||= {:sum => 0, :data => {}, :count => 0}
+        focus = data_hash[row_key][column_key]
+        focus[:data] = clean_data(row)
+        focus[:count] += 1
+        focus[:sum] += row[value]
+      end
+      new_data = []
+      new_keys = {}
+      data_hash.each do |row_key, row_hash|
+        new_row = {}
+        row_hash.each do |column_key, data|
+          column_key.each do |ckey|
+            new_row.merge!(data[:data])
+            case aggregation
+            when :average
+              new_row[ckey] = (data[:sum] / data[:count])
+            when :count
+              new_row[ckey] = data[:count]
+            else
+              new_row[ckey] = data[:sum]
+            end
+            new_keys[ckey] = true
+          end
+        end
+        new_data << Item.ensure(new_row)
+      end
+      @data = new_data
+      new_keys.keys
+    end
+    def self.array(string_or_array)
+      if string_or_array.is_a? Array
+        return string_or_array
+      else
+        return [string_or_array]
+      end
+    end
+    def size
+      @data.size
+    end
+    alias :length :size
+    def valid?
+      if ((@data.size > 0) &&
+        (@data.respond_to? :each_with_index) &&
+        (@data.first.respond_to? :keys)) &&
+        (!@data.first.is_a? String)
+        return true
+      else
+        return false
+      end
+    rescue
+      false
+    end
+    # cols is an array of column names, if given the nested arrays are built in this order
+    def to_a(cols=nil)
+      array = []
+      cols ||= self.columns
+      @data.each do |row|
+        array << cols.inject([]){ |a,col| a << row[col] }
+      end
+      array
+    end
+  end
+end
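
The `group` method above collapses rows that share a key and names each aggregate column by joining the aggregation and the column name (for example `sum_clicks`). A reduced plain-Ruby sketch of that behaviour, covering only sums, is shown below; `group_rows` and its keywords are hypothetical and this is not the library's code.

```ruby
rows = [
  { advert: "spot 1", airtime: 15, clicks: 301 },
  { advert: "spot 1", airtime: 30, clicks: 199 },
  { advert: "spot 1", airtime: 30, clicks: 234 },
  { advert: "spot 2", airtime: 30, clicks: 172 }
]

# Hypothetical helper: group rows by one column and add a "sum_<column>"
# entry for every column listed in sum:.
def group_rows(rows, by:, sum: [])
  rows.group_by { |r| r[by] }.map do |key, members|
    out = { by => key }
    sum.each { |col| out["sum_#{col}"] = members.sum { |m| m[col] } }
    out
  end
end

grouped = group_rows(rows, by: :advert, sum: [:clicks])
grouped.each { |row| p row }
```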