histograffle 0.1.0

data/.document ADDED
@@ -0,0 +1,5 @@
1
+ README.rdoc
2
+ lib/**/*.rb
3
+ bin/*
4
+ features/**/*.feature
5
+ LICENSE
data/.gitignore ADDED
@@ -0,0 +1,21 @@
1
+ ## MAC OS
2
+ .DS_Store
3
+
4
+ ## TEXTMATE
5
+ *.tmproj
6
+ tmtags
7
+
8
+ ## EMACS
9
+ *~
10
+ \#*
11
+ .\#*
12
+
13
+ ## VIM
14
+ *.swp
15
+
16
+ ## PROJECT::GENERAL
17
+ coverage
18
+ rdoc
19
+ pkg
20
+
21
+ ## PROJECT::SPECIFIC
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009 Jostein B.E.
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.textile ADDED
@@ -0,0 +1,86 @@
1
+ h1. Histograffle
2
+
3
+ h2. A sort of histogram object implemented in pure Ruby. Use it to count the occurrence of each of several things.
4
+
5
+ h3. Simple example of its use and abuse
6
+
7
+ I am not a mathematician, so this is neither a mathematical library nor a mathematical explanation.
8
+
9
+ Sample use case: we have two lists of words and want to count how often each word occurs, first in each list separately and then in both together.
10
+
11
+ <pre>
12
+ <code>
13
+ require 'rubygems'
14
+ require 'histograffle'
15
+
16
+ include Histograffle
17
+
18
+ reported_incidents = %w[rain thunder confusion fun rain rain theft]
19
+
20
+ hist_one = Histogram.new
21
+
22
+ hist_one.eat(reported_incidents)
23
+
24
+ hist_one.ladder.each do |count|
25
+ puts "#{count}: #{hist_one[count].join(', ')}"
26
+ end
27
+
28
+ hist_two = Histogram.new
29
+
30
+ unreported_incidents = %w[monkey pig concert ufo airplane pig pig concert]
31
+
32
+ hist_two.eat(unreported_incidents)
33
+
34
+ hist_one << hist_two
35
+
36
+ </code>
37
+ </pre>
38
+
39
+ h3. API quickref
40
+
41
+ |_@Histograffle::Histogram.new([serialized_data])@_|Create new instance.|
42
+ |_@hist.eat(item_or_array_or_histogram)@_|Add the input. You can add histograms together this way.|
43
+ |_@hist[n]@_|Return the list of items that occur exactly _n_ times (@nil@ if there are none).|
44
+ |_@hist.distribution[n]@_|Return the entry at index _n_ (zero-based) of the descending list of occurrence counts.|
45
+ |_@hist.distribution@_|Descending list of occurrence counts.|
46
+ |_@hist.ladder@_|Ascending list of occurrence counts.|
47
+ |_@hist.top(n)@_|Return the top _n_ entries as a hash keyed by count, like @{12 => ['a_word'], 2 => ['some other word']}@.|
48
+ |_@hist.flat_top(n)@_|Same as above, but return only a flat list of items, without their counts.|
49
+ |_@hist << other_hist@_|Add a histogram to another.|
50
+ |_@hist == other_hist@_|True when both histograms hold the same entries with the same counts.|
51
+
52
+ The class also provides @.to_mongo@ and @.from_mongo@ for serialization.
53
+
54
+ To get copies of the raw data tables (see below), use @hist.data@ and @hist.lookup@.
55
+
56
+ h3. Implementation
57
+
58
+ Each instance simply stores its data in two hashes. The _data_ hash maps each occurrence to its count. The _lookup_ hash maps each count to the list of occurrences seen that many times. Occurrences can be any object.
59
+
60
+ !http://davidpsydoc.com/brian/pictures/extremely%20random,%20hamburger%20gets%20arrested.jpg!
61
+ "Credit":http://davidpsydoc.com/brian/pictures/extremely%20random
62
+
63
+ h3. LICENSE
64
+
65
+ (MIT License)
66
+
67
+ Copyright (c) 2009 Jostein Berre Eliassen
68
+
69
+ Permission is hereby granted, free of charge, to any person obtaining
70
+ a copy of this software and associated documentation files (the
71
+ 'Software'), to deal in the Software without restriction, including
72
+ without limitation the rights to use, copy, modify, merge, publish,
73
+ distribute, sublicense, and/or sell copies of the Software, and to
74
+ permit persons to whom the Software is furnished to do so, subject to
75
+ the following conditions:
76
+
77
+ The above copyright notice and this permission notice shall be
78
+ included in all copies or substantial portions of the Software.
79
+
80
+ THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
81
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
82
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
83
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
84
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
85
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
86
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
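The Implementation section of the README above describes two hashes, one keyed by occurrence and one keyed by count. The following is a minimal standalone sketch of that idea, not the gem's actual code. The class name TinyHistogram is hypothetical, and the sketch only handles increments of one.

```ruby
# Standalone sketch of the two-hash layout described in the README.
# TinyHistogram is a hypothetical name, not part of the histograffle gem.
class TinyHistogram
  attr_reader :data, :lookup

  def initialize
    @data = Hash.new(0) # item  => count
    @lookup = {}        # count => [items seen exactly that many times]
  end

  # Count a single item or every element of an array.
  def eat(items)
    Array(items).each do |item|
      old = @data[item]
      @data[item] = old + 1
      # Move the item out of its old bucket, dropping the bucket if it empties.
      if @lookup[old]
        @lookup[old].delete(item)
        @lookup.delete(old) if @lookup[old].empty?
      end
      (@lookup[old + 1] ||= []) << item
    end
    self
  end

  # Ascending list of occurrence counts.
  def ladder
    @lookup.keys.sort
  end
end

hist = TinyHistogram.new
hist.eat(%w[rain thunder confusion fun rain rain theft])
hist.ladder.each { |count| puts "#{count}: #{hist.lookup[count].join(', ')}" }
# prints:
# 1: thunder, confusion, fun, theft
# 3: rain
```

Keeping the reverse index up to date on every insert is what makes looking up "everything seen n times" a single hash access, at the cost of a small amount of bookkeeping per item.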
data/Rakefile ADDED
@@ -0,0 +1,53 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+
4
+ begin
5
+ require 'jeweler'
6
+ Jeweler::Tasks.new do |gem|
7
+ gem.name = "histograffle"
8
+ gem.summary = %Q{Count the occurrence of each of several things.}
9
+ gem.description = %Q{Count the occurrence of each of several things. Pure and banal Ruby.}
10
+ gem.email = "find.me@on.github"
11
+ gem.homepage = "http://github.com/jbe/histograffle"
12
+ gem.authors = ["Jostein B.E."]
13
+ gem.add_development_dependency "thoughtbot-shoulda", ">= 0"
14
+ # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
15
+ end
16
+ Jeweler::GemcutterTasks.new
17
+ rescue LoadError
18
+ puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
19
+ end
20
+
21
+ require 'rake/testtask'
22
+ Rake::TestTask.new(:test) do |test|
23
+ test.libs << 'lib' << 'test'
24
+ test.pattern = 'test/**/test_*.rb'
25
+ test.verbose = true
26
+ end
27
+
28
+ begin
29
+ require 'rcov/rcovtask'
30
+ Rcov::RcovTask.new do |test|
31
+ test.libs << 'test'
32
+ test.pattern = 'test/**/test_*.rb'
33
+ test.verbose = true
34
+ end
35
+ rescue LoadError
36
+ task :rcov do
37
+ abort "RCov is not available. In order to run rcov, you must: sudo gem install spicycode-rcov"
38
+ end
39
+ end
40
+
41
+ task :test => :check_dependencies
42
+
43
+ task :default => :test
44
+
45
+ require 'rake/rdoctask'
46
+ Rake::RDocTask.new do |rdoc|
47
+ version = File.exist?('VERSION') ? File.read('VERSION') : ""
48
+
49
+ rdoc.rdoc_dir = 'rdoc'
50
+ rdoc.title = "histograffle #{version}"
51
+ rdoc.rdoc_files.include('README*')
52
+ rdoc.rdoc_files.include('lib/**/*.rb')
53
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.1.0
data/lib/histograffle.rb ADDED
@@ -0,0 +1,114 @@
1
+
2
+ module Histograffle; end
3
+
4
+ class Histograffle::Histogram
5
+
6
+ def data
7
+ @data.dup
8
+ end
9
+
10
+ def lookup
11
+ lu = @lookup.dup
12
+ lu.each_key do |k|
13
+ lu[k] = lu[k].dup
14
+ end
15
+ lu
16
+ end
17
+
18
+ def initialize(d=[{}, {}])
19
+ @data = d[0].dup
20
+ @lookup = d[1].dup
21
+ @data.default = 0
22
+ end
23
+
24
+ def eat( o, n=1 )
25
+ if o.is_a? self.class
26
+ self << o
27
+ else
28
+ o = [o] unless o.respond_to? :each
29
+ o.each do |t|
30
+ @data[t] += n
31
+ @lookup[@data[t]] ||= []
32
+ @lookup[@data[t]] << t
33
+ @lookup[@data[t]-n].delete(t) if @lookup[@data[t]-n]
34
+ @lookup.delete(@data[t]-n) if (@lookup[@data[t]-n] || []).empty?
35
+ end
36
+ end
37
+ self
38
+ end
39
+
40
+ def [](n)
41
+ @lookup[n]
42
+ end
43
+
44
+ def distribution
45
+ ladder.reverse
46
+ end
47
+
48
+ def ladder
49
+ @lookup.keys.sort
50
+ end
51
+
52
+ def top( n )
53
+ result = {}
54
+ distribution.each do |count|
55
+ @lookup[count].each do |word|
56
+ result[count] ||= []
57
+ result[count] << word
58
+ return result if (n -= 1) < 1
59
+ end
60
+ end
61
+ result
62
+ end
63
+
64
+ def flat_top( n )
65
+ result = []
66
+ ladder.reverse.each do |count|
67
+ @lookup[count].each do |word|
68
+ result << word
69
+ return result if (n -= 1) < 1
70
+ end
71
+ end
72
+ result
73
+
74
+ end
75
+
76
+ def <<(other)
77
+ other.data.each{|k,v| eat(k,v) }
78
+ self
79
+ end
80
+
81
+ def ==(other)
82
+ return false unless other.is_a?(self.class)
83
+ [@data, @lookup] == [other.data, other.lookup]
84
+ end
85
+
86
+ def self.to_mongo(v)
87
+ # puts "to_mongo: " + v.inspect
88
+ return v if v.nil?
89
+ return Array.to_mongo(v) if v.is_a?(Array)
90
+ [v.data, stringify_keys(v.lookup)]
91
+ end
92
+
93
+ def self.from_mongo(v)
94
+ # puts "from_mongo: " + v.inspect
95
+ return v if v.nil? || v.is_a?(self)
96
+ v ? self.new([v[0], intify_keys(v[1])]) : nil
97
+ end
98
+
99
+ # Internal helpers. A bare `private` has no effect on `def self.` methods, so these remain public.
100
+ def self.stringify_keys(hsh)
101
+ r = Hash.new
102
+ hsh.each do |k,v|
103
+ r[k.to_s] = v
104
+ end
105
+ r
106
+ end
107
+ def self.intify_keys(hsh)
108
+ r = Hash.new
109
+ hsh.each do |k,v|
110
+ r[k.to_i] = v
111
+ end
112
+ r
113
+ end
114
+ end
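The to_mongo and from_mongo class methods above convert the lookup hash's integer keys to strings and back, presumably because the document store expects string keys. Below is a minimal sketch of that round trip in plain Ruby, assuming nothing about MongoMapper. The free-standing helper methods are illustrative stand-ins for the class-level ones in the file.

```ruby
# Illustrative stand-ins for the class-level helpers in lib/histograffle.rb.
def stringify_keys(hash)
  hash.each_with_object({}) { |(key, value), out| out[key.to_s] = value }
end

def intify_keys(hash)
  hash.each_with_object({}) { |(key, value), out| out[key.to_i] = value }
end

lookup = { 1 => %w[thunder fun], 3 => %w[rain] }

stored = stringify_keys(lookup) # keys become "1" and "3"
restored = intify_keys(stored)  # keys are integers again

puts restored == lookup # prints true
```

The round trip is only lossless because the lookup keys are integers to begin with; a key that is not the string form of an integer would become 0 under `to_i`.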
data/test/helper.rb ADDED
@@ -0,0 +1,10 @@
1
+ require 'rubygems'
2
+ require 'test/unit'
3
+ require 'shoulda'
4
+
5
+ $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
6
+ $LOAD_PATH.unshift(File.dirname(__FILE__))
7
+ require 'histograffle'
8
+
9
+ class Test::Unit::TestCase
10
+ end
data/test/test_histograffle.rb ADDED
@@ -0,0 +1,7 @@
1
+ require 'helper'
2
+
3
+ class TestHistograffle < Test::Unit::TestCase
4
+ should "probably rename this file and start testing for real" do
5
+ flunk "hey buddy, you should probably rename this file and start testing for real"
6
+ end
7
+ end
metadata ADDED
@@ -0,0 +1,90 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: histograffle
3
+ version: !ruby/object:Gem::Version
4
+ hash: 27
5
+ prerelease: false
6
+ segments:
7
+ - 0
8
+ - 1
9
+ - 0
10
+ version: 0.1.0
11
+ platform: ruby
12
+ authors:
13
+ - Jostein B.E.
14
+ autorequire:
15
+ bindir: bin
16
+ cert_chain: []
17
+
18
+ date: 2010-06-20 00:00:00 +02:00
19
+ default_executable:
20
+ dependencies:
21
+ - !ruby/object:Gem::Dependency
22
+ name: thoughtbot-shoulda
23
+ prerelease: false
24
+ requirement: &id001 !ruby/object:Gem::Requirement
25
+ none: false
26
+ requirements:
27
+ - - ">="
28
+ - !ruby/object:Gem::Version
29
+ hash: 3
30
+ segments:
31
+ - 0
32
+ version: "0"
33
+ type: :development
34
+ version_requirements: *id001
35
+ description: Count the occurrence of each of several things. Pure and banal Ruby.
36
+ email: find.me@on.github
37
+ executables: []
38
+
39
+ extensions: []
40
+
41
+ extra_rdoc_files:
42
+ - LICENSE
43
+ - README.textile
44
+ files:
45
+ - .document
46
+ - .gitignore
47
+ - LICENSE
48
+ - README.textile
49
+ - Rakefile
50
+ - VERSION
51
+ - lib/histograffle.rb
52
+ - test/helper.rb
53
+ - test/test_histograffle.rb
54
+ has_rdoc: true
55
+ homepage: http://github.com/jbe/histograffle
56
+ licenses: []
57
+
58
+ post_install_message:
59
+ rdoc_options:
60
+ - --charset=UTF-8
61
+ require_paths:
62
+ - lib
63
+ required_ruby_version: !ruby/object:Gem::Requirement
64
+ none: false
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ hash: 3
69
+ segments:
70
+ - 0
71
+ version: "0"
72
+ required_rubygems_version: !ruby/object:Gem::Requirement
73
+ none: false
74
+ requirements:
75
+ - - ">="
76
+ - !ruby/object:Gem::Version
77
+ hash: 3
78
+ segments:
79
+ - 0
80
+ version: "0"
81
+ requirements: []
82
+
83
+ rubyforge_project:
84
+ rubygems_version: 1.3.7
85
+ signing_key:
86
+ specification_version: 3
87
+ summary: Count the occurrence of each of several things.
88
+ test_files:
89
+ - test/helper.rb
90
+ - test/test_histograffle.rb