active_olap 0.0.2

data/.gitignore ADDED
@@ -0,0 +1,2 @@
1
+ .DS_Store
2
+ active_olap-*.gem
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2008 Willem van Bergen
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.textile ADDED
@@ -0,0 +1,59 @@
1
+ h1. Active OLAP
2
+
3
+ This Rails plugin makes it easy to add an OLAP interface to your application, which is
4
+ great for administration interfaces. Its main uses are collecting information about the
5
+ usage of your application and detecting inconsistencies and problems in your data.
6
+
7
+ This plugin provides:
8
+ * The main functions for OLAP querying: olap_query and olap_drilldown. These functions
9
+ must be enabled for your model by calling enable_active_olap on your model class.
10
+ * Functions to easily define dimensions, categories and aggregates to use in your
11
+ OLAP queries.
12
+
13
+ In the future, the following functionality is planned to be included:
14
+ * A helper module to generate tables and charts for the query results. The gchartrb gem
15
+ is needed for charts, as they are generated using the Google charts API.
16
+ * A controller that can be included in your Rails projects to get started quickly.
17
+
18
+ For more information about the concepts and usage of this plugin, see the Active OLAP Wiki on
19
+ GitHub: http://github.com/wvanbergen/active_olap/wikis. I have blogged about this plugin
20
+ on the Floorplanner tech blog: http://techblog.floorplanner.com/tag/active_olap/. Finally,
21
+ if you want to get involved or tinker with the code, you can access the repository at
22
+ http://github.com/wvanbergen/active_olap/tree.
23
+
24
+
25
+ h2. Why use this plugin?
26
+
27
+ This plugin simply runs SQL queries using the find-method of ActiveRecord. You might be
28
+ wondering why you would need a plugin for that.
29
+
30
+ First of all, it makes your life as a developer easier:
31
+ * This plugin generates the nasty SQL expressions for you using standards-compliant SQL,
32
+ handles issues with SQL NULL values and makes sure the results have a consistent format.
33
+ * You can define dimensions and aggregates that are "safe to use" or known to yield useful
34
+ results. Once dimensions and aggregates are defined, they can be combined at will safely
35
+ and without any coding effort, so it is suitable for management. :-)
36
+
37
+
38
+ h2. Requirements
39
+
40
+ This plugin is usable for any ActiveRecord-based model. Because named_scope is used for the
41
+ implementation, Rails 2.1 is required for it to work. It is tested to work with MySQL 5 and
42
+ SQLite 3 but should work with other databases as well, as it only generates
42
+ standards-compliant SQL queries.
44
+
45
+ Warning: OLAP queries can be heavy on the database. They can impact the performance of your
46
+ application if you perform them on the same server or database. Setting good indices is
47
+ helpful, but it may be a good idea to use a copy of the production database on another
48
+ server for these heavy queries.
49
+
50
+ Another warning: while this plugin makes it easy to perform OLAP queries and play around
51
+ with them, interpreting the results is hard and mistakes are easily made. At the very least, make sure
52
+ to validate the results before they are used for decision making.
53
+
54
+
55
+ h2. About this plugin
56
+
57
+ The plugin is written by Willem van Bergen for Floorplanner.com. It is MIT-licensed (see
58
+ MIT-LICENSE). If you have any questions or want to help out with the development of this plugin,
59
+ please contact me on willem AT vanbergen DOT org.
data/Rakefile ADDED
@@ -0,0 +1,5 @@
1
+ Dir[File.dirname(__FILE__) + "/tasks/*.rake"].each { |file| load(file) }
2
+
3
+ GithubGem::RakeTasks.new(:gem)
4
+
5
+ task :default => :test
data/active_olap.gemspec ADDED
@@ -0,0 +1,15 @@
1
+ Gem::Specification.new do |s|
2
+ s.name = 'active_olap'
3
+ s.version = '0.0.2'
4
+ s.date = '2008-12-23'
5
+
6
+ s.summary = "Extend ActiveRecord with OLAP query functionality"
7
+ s.description = "Extends ActiveRecord with functionality to perform OLAP queries on your data. Includes helper method to ease displaying the results."
8
+
9
+ s.authors = ['Willem van Bergen']
10
+ s.email = ['willem@vanbergen.org']
11
+ s.homepage = 'http://github.com/wvanbergen/active_olap/wikis'
12
+
13
+ s.files = %w(test/helper_modules_test.rb spec/spec_helper.rb .gitignore lib/active_olap/helpers/table_helper.rb lib/active_olap/dimension.rb test/active_olap_test.rb lib/active_olap/helpers/display_helper.rb init.rb README.textile spec/integration/active_olap_spec.rb lib/active_olap/test/assertions.rb lib/active_olap/category.rb active_olap.gemspec Rakefile MIT-LICENSE tasks/github-gem.rake lib/active_olap.rb test/helper.rb lib/active_olap/helpers/form_helper.rb lib/active_olap/aggregate.rb spec/unit/cube_spec.rb lib/active_olap/helpers/chart_helper.rb lib/active_olap/cube.rb lib/active_olap/configurator.rb)
14
+ s.test_files = %w(test/helper_modules_test.rb test/active_olap_test.rb spec/integration/active_olap_spec.rb spec/unit/cube_spec.rb)
15
+ end
data/init.rb ADDED
@@ -0,0 +1,2 @@
1
+ $:.unshift(File.dirname(__FILE__) + '/lib')
2
+ require 'active_olap'
data/lib/active_olap/aggregate.rb ADDED
@@ -0,0 +1,148 @@
1
+ module ActiveOLAP
2
+
3
+ class Aggregate
4
+
5
+ attr_reader :klass
6
+ attr_reader :label
7
+
8
+ attr_reader :function
9
+ attr_reader :distinct
10
+ attr_reader :expression
11
+
12
+ attr_reader :joins
13
+ attr_reader :info
14
+
15
+ def self.all_from_olap_query_call(klass, aggregates_given)
16
+ aggregates_given = [aggregates_given] unless aggregates_given.kind_of?(Array)
17
+
18
+ return aggregates_given.map do |aggregate_definition|
19
+ if aggregate_definition.kind_of?(Symbol) && klass.active_olap_aggregates.has_key?(aggregate_definition)
20
+ Aggregate.from_configuration(klass, aggregate_definition)
21
+ else
22
+ Aggregate.create(klass, aggregate_definition.to_sym, aggregate_definition)
23
+ end
24
+ end
25
+ end
26
+
27
+ def initialize(klass, label, function, expression = nil, distinct = false)
28
+ @klass = klass
29
+ @label = label
30
+ @function = function
31
+ @expression = expression
32
+ @distinct = distinct
33
+ @joins = []
34
+ @info = {}
35
+ end
36
+
37
+
38
+ def self.create(klass, label, definition)
39
+ case definition
40
+ when Symbol
41
+ return from_symbol(klass, label, definition)
42
+ when String
43
+ return from_string(klass, label, definition)
44
+ when Hash
45
+ return from_hash(klass, label, definition)
46
+ else
47
+ raise "Invalid aggregate definition: #{definition.inspect}"
48
+ end
49
+ end
50
+
51
+ def self.from_configuration(klass, aggregate_name, label = nil)
52
+ label = aggregate_name.to_sym if label.nil?
53
+ if klass.active_olap_aggregates[aggregate_name].respond_to?(:call)
54
+ return Aggregate.create(klass, label, klass.active_olap_aggregates[aggregate_name].call)
55
+ else
56
+ return Aggregate.create(klass, label, klass.active_olap_aggregates[aggregate_name])
57
+ end
58
+ end
59
+
60
+ def self.from_hash(klass, label, hash)
61
+ hash = hash.clone
62
+ agg = Aggregate.create(klass, label, hash.delete(:expression))
63
+ agg.joins.concat(hash[:joins].kind_of?(Array) ? hash.delete(:joins) : [hash.delete(:joins)]) if hash.has_key?(:joins)
64
+ hash.each { |key, val| agg.info[key] = val }
65
+ return agg
66
+ end
67
+
68
+ def self.from_string(klass, label, sql_expression)
69
+ if sql_expression =~ /^(\w+)\((.+)\)$/
70
+ return Aggregate.new(klass, label, $1.downcase.to_sym, $2, false)
71
+ else
72
+ raise "Invalid aggregate SQL expression: " + sql_expression
73
+ end
74
+ end
75
+
76
+ def self.from_symbol(klass, label, aggregate_name)
77
+
78
+ case aggregate_name
79
+ when :count_all
80
+ return Aggregate.new(klass, label, :count, '*', false) # with table name?
81
+ when :count_distinct_all
82
+ return Aggregate.new(klass, label, :count, '*', true) # with table name?
83
+ when :count
84
+ return Aggregate.new(klass, label, :count, :id, false)
85
+ when :count_distinct
86
+ return Aggregate.new(klass, label, :count, :id, true)
87
+
88
+ else
89
+ parts = aggregate_name.to_s.split('_')
90
+ raise "Invalid aggregate name: #{aggregate_name.inspect}" unless parts.length > 1
91
+
92
+ distinct = false
93
+ if parts[1] == 'distinct'
94
+ parts.delete_at(1)
95
+ distinct = true
96
+ end
97
+
98
+ raise "Invalid aggregate name: #{aggregate_name.inspect}" unless parts.length >= 2
99
+ #TODO: check field name and function name?
100
+ return Aggregate.new(klass, label, parts[0].to_sym, parts[1..-1].join('_').to_sym, distinct)
101
+ end
102
+ end
103
+
104
+ def to_sanitized_sql
105
+ sql = @function.to_s.upcase + '(' # upcase! returns nil when nothing changes
106
+ sql << 'DISTINCT ' if @distinct
107
+ sql << (@expression.kind_of?(Symbol) ? "#{quote_table}.#{quote_column(@expression)}" : @expression.to_s)
108
+ sql << ") AS #{quote_column(@label)}"
109
+ end
110
+
111
+ def is_count_with_overlap?
112
+ @function == :count_with_overlap
113
+ end
114
+
115
+ def cast_value(source)
116
+ return nil if source.nil?
117
+ (@function == :count) ? source.to_i : source.to_f # TODO: better?
118
+ end
119
+
120
+ def default_value
121
+ (@function == :count) ? 0 : nil # TODO: better?
122
+ end
123
+
124
+ def self.values(aggregates, source)
125
+ result = HashWithIndifferentAccess.new
126
+ aggregates.each { |agg| result[agg.label] = agg.cast_value(source[agg.label.to_s]) }
127
+ return (aggregates.length == 1) ? result[aggregates.first.label] : result
128
+ end
129
+
130
+ def self.default_values(aggregates)
131
+ return 0 if aggregates.empty? # count with overlap
132
+ result = HashWithIndifferentAccess.new
133
+ aggregates.each { |agg| result[agg.label] = agg.default_value }
134
+ return (aggregates.length == 1) ? result[aggregates.first.label] : result
135
+ end
136
+
137
+ protected
138
+
139
+ def quote_column(column)
140
+ @klass.connection.send(:quote_column_name, column.to_s)
141
+ end
142
+
143
+ def quote_table
144
+ @klass.connection.send(:quote_table_name, @klass.table_name)
145
+ end
146
+ end
147
+
148
+ end
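The symbol naming convention handled by Aggregate.from_symbol can be tried out without a database. The following is a minimal stand-alone sketch of those parsing rules, assuming only what the method above shows; parse_aggregate_name is a hypothetical helper and not part of active_olap. It returns the function, the expression and the distinct flag instead of building an Aggregate object.

```ruby
# Hypothetical helper (not part of active_olap): mirrors the parsing rules of
# Aggregate.from_symbol and returns [function, expression, distinct].
def parse_aggregate_name(name)
  # The four special-cased shorthand names.
  case name
  when :count_all          then return [:count, '*', false]
  when :count_distinct_all then return [:count, '*', true]
  when :count              then return [:count, :id, false]
  when :count_distinct     then return [:count, :id, true]
  end

  # General form: <function>[_distinct]_<field name, may contain underscores>
  parts = name.to_s.split('_')
  raise ArgumentError, "Invalid aggregate name: #{name.inspect}" unless parts.length > 1

  distinct = false
  if parts[1] == 'distinct'
    parts.delete_at(1)
    distinct = true
  end
  raise ArgumentError, "Invalid aggregate name: #{name.inspect}" unless parts.length >= 2

  [parts[0].to_sym, parts[1..-1].join('_').to_sym, distinct]
end

sum_price      = parse_aggregate_name(:sum_total_price)
distinct_price = parse_aggregate_name(:avg_distinct_price)
```

Under these rules a name such as :sum_distinct is rejected, because no field name remains once the distinct marker is removed.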
data/lib/active_olap/category.rb ADDED
@@ -0,0 +1,46 @@
1
+ module ActiveOLAP
2
+
3
+ class Category
4
+
5
+ attr_reader :dimension, :label, :conditions, :info
6
+
7
+ # initializes a category, given the dimension it belongs to, a label,
8
+ # and a definition. The definition should be a hash with at least the
9
+ # key expression set to a usable ActiveRecord#find conditions
10
+ def initialize(dimension, label, definition)
11
+ @dimension = dimension
12
+ @label = label
13
+ @info = {}
14
+
15
+ if definition.kind_of?(Hash) && definition.has_key?(:expression)
16
+ @conditions = definition[:expression]
17
+ @info = definition.reject { |k,v| k == :expression }
18
+ else
19
+ @conditions = definition
20
+ end
21
+ end
22
+
23
+ # Returns the index of this category in the corresponding dimension
24
+ def index
25
+ @dimension.category_index(@label)
26
+ end
27
+
28
+ # Returns a sanitized SQL expression for this category
29
+ def to_sanitized_sql
30
+ @dimension.klass.send(:sanitize_sql, @conditions)
31
+ end
32
+
33
+ def to_count_sql(count_what)
34
+ "COUNT(DISTINCT CASE WHEN (#{to_sanitized_sql}) THEN #{count_what} ELSE NULL END)
35
+ AS #{@dimension.klass.connection.send(:quote_column_name, label.to_s)}"
36
+ end
37
+
38
+ # Returns the label of this category as a string
39
+ def to_s
40
+ return "nil" if label.nil?
41
+ label.to_s
42
+ end
43
+
44
+ end
45
+
46
+ end
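The expression built by Category#to_count_sql counts the distinct values for which the category condition holds, so one row can contribute to several overlapping categories. The sketch below reproduces that string on a single line without a database connection. count_case_sql is a hypothetical helper, and the ANSI double-quote quoting is an assumption standing in for the connection's quote_column_name, whose output depends on the adapter.

```ruby
# Hypothetical helper (not part of active_olap): builds the same kind of
# COUNT(DISTINCT CASE ...) expression as Category#to_count_sql.
# Assumption: ANSI double quotes stand in for the adapter's quote_column_name.
def count_case_sql(condition_sql, count_what, label)
  quoted_label = '"' + label.to_s.gsub('"', '""') + '"'
  "COUNT(DISTINCT CASE WHEN (#{condition_sql}) THEN #{count_what} ELSE NULL END) AS #{quoted_label}"
end

minors_sql = count_case_sql("age < 18", "users.id", :minors)
```

Rows that do not match the condition yield NULL inside the CASE, and COUNT ignores NULL values, which is why non-matching rows do not affect the count.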
data/lib/active_olap/configurator.rb ADDED
@@ -0,0 +1,32 @@
1
+ module ActiveOLAP
2
+
3
+ class Configurator
4
+
5
+ # Initializes an ActiveOLAP::Configurator object, which is used in the block
6
+ # passed to the call enable_active_olap. It can be used to register
7
+ # dimensions and aggregates
8
+ def initialize(klass)
9
+ @klass = klass
10
+ end
11
+
12
+ # registers a dimension for the class it belongs to
13
+ def dimension(name, definition = nil)
14
+ definition = name.to_sym if definition.nil?
15
+ @klass.active_olap_dimensions[name] = definition
16
+ end
17
+
18
+ def time_dimension(name, field, defaults = {})
19
+ @klass.active_olap_dimensions[name] = Proc.new do |*options|
20
+ options = options.empty? ? {} : options.first
21
+ { :trend => defaults.merge(options).merge(:timestamp_field => field) }
22
+ end
23
+ end
24
+
25
+ # registers an aggregate for the class it belongs to
26
+ def aggregate(name, definition = nil, options = {})
27
+ definition = name if definition.nil?
28
+ agg_definition = options.merge(:expression => definition)
29
+ @klass.active_olap_aggregates[name] = agg_definition
30
+ end
31
+ end
32
+ end
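The Proc registered by time_dimension merges options in a fixed order: the defaults given at registration, then the options passed per call, then the timestamp field, which always wins. The following stand-alone sketch shows that order. FakeModel and register_time_dimension are hypothetical stand-ins for the model class and the Configurator, and the option names :period_length and :period_count are illustrative assumptions, not names confirmed by the source.

```ruby
# Hypothetical stand-in for a model class that exposes the dimension registry.
FakeModel = Struct.new(:active_olap_dimensions)

# Same merging logic as Configurator#time_dimension, written as a plain method.
def register_time_dimension(model, name, field, defaults = {})
  model.active_olap_dimensions[name] = Proc.new do |*options|
    options = options.empty? ? {} : options.first
    { :trend => defaults.merge(options).merge(:timestamp_field => field) }
  end
end

model = FakeModel.new({})
# :period_length and :period_count are assumed option names for illustration.
register_time_dimension(model, :daily, :created_at, :period_length => 1, :period_count => 30)

with_defaults  = model.active_olap_dimensions[:daily].call
with_overrides = model.active_olap_dimensions[:daily].call(:period_count => 7, :timestamp_field => :ignored)
```

Because the timestamp field is merged last, a caller cannot redirect a registered time dimension to another column by passing :timestamp_field at query time.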
data/lib/active_olap/cube.rb ADDED
@@ -0,0 +1,215 @@
1
+ module ActiveOLAP
2
+
3
+ class Cube
4
+
5
+ attr_accessor :info
6
+ attr_accessor :klass
7
+ attr_accessor :dimensions
8
+ attr_accessor :aggregates
9
+
10
+ # Initializes a new OLAP cube.
11
+ def initialize(klass, dimensions, aggregates, query_result = nil)
12
+ @klass = klass
13
+ @dimensions = dimensions
14
+ @aggregates = aggregates
15
+ @info = {}
16
+
17
+ # populates the cube with the query result if it is provided.
18
+ unless query_result.nil?
19
+ @result = []
20
+ populate_result_with(query_result)
21
+ traverse_result_for_nils(@result)
22
+ end
23
+ end
24
+
25
+ # Sums up all the values in this cube
26
+ def sum(agg = nil)
27
+ raise "Please provide the aggregate you want to sum." if self.aggregates.length > 1 && agg.nil?
28
+ total_sum = 0
29
+ self.each do |cat, value|
30
+ total_sum += (value.kind_of?(Cube) ? value.sum(agg) : (agg.nil? ? value : value[agg]))
31
+ end
32
+ return total_sum
33
+ end
34
+
35
+ # Returns the total number of cells in this cube. Note that this does not take aggregates into
36
+ # account, so the result should be multiplied by the number of aggregates if you want to know
37
+ # the total number of (numeric) values.
38
+ def cell_count
39
+ dimensions.inject(1) { |intermediate, dimension| intermediate * dimension.categories.length }
40
+ end
41
+
42
+ # Returns a reference to the internal array that holds the raw results of this cube.
43
+ # Altering this array will alter the internals of the cube-object, so make sure you know what
44
+ # you are doing. Use to_a if you want to obtain a copy of the internal array
45
+ def raw_results
46
+ @result
47
+ end
48
+
49
+ # Returns a clone of the internal array that holds the raw results of this cube.
50
+ def to_a
51
+ @result.clone
52
+ end
53
+
54
+ # Returns the array of categories of the current (= first) dimension
55
+ def categories
56
+ @dimensions.first.categories
57
+ end
58
+
59
+ # Returns the current (first) dimension
60
+ def dimension
61
+ @dimensions.first
62
+ end
63
+
64
+ # Returns the number of dimensions in this cube
65
+ def depth
66
+ @dimensions.length
67
+ end
68
+
69
+ # Returns the number of categories in the current (= first) dimension
70
+ def breadth
71
+ @result.length
72
+ end
73
+
74
+ # Switches the dimensions of a two-dimensional cube
75
+ def transpose
76
+ raise "Can only transpose 2-dimensional results" unless depth == 2
77
+ result_object = Cube.new(@klass, [@dimensions.last, @dimensions.first], @aggregates)
78
+ result_object.result = @result.transpose
79
+ return result_object
80
+ end
81
+
82
+ def reorder_dimensions(*order)
83
+ # IMPLEMENT ME
84
+ end
85
+
86
+ def only_aggregate(aggregate_label)
87
+ # IMPLEMENT ME
88
+ end
89
+
90
+ def only_dimension(dimension_index)
91
+ # IMPLEMENT ME
92
+ end
93
+
94
+ def except_dimension(dimension_index)
95
+ # IMPLEMENT ME
96
+ end
97
+
98
+ # Returns a part of the cube or a single cell
99
+ # If the number of arguments matches the number of dimensions, a single cell is returned;
100
+ # If the number of arguments is less than the number of dimensions, this function will return
101
+ # a cube with (dimensions.length - args.length) dimensions.
102
+ def [](*args)
103
+ result = @result.clone
104
+ args.each_with_index do |cat_label, index|
105
+ cat_index = @dimensions[index].category_index(cat_label)
106
+ return nil if cat_index.nil?
107
+ result = result[cat_index]
108
+ end
109
+
110
+ if result.kind_of?(Array)
111
+ # build a new query_result object if not enough dimensions were provided
112
+ result_object = Cube.new(@klass, @dimensions[args.length...@dimensions.length], @aggregates)
113
+ result_object.result = result
114
+ return result_object
115
+ else
116
+ return result
117
+ end
118
+ end
119
+
120
+ # Iterates over all the categories of the current dimension and their corresponding values.
121
+ # The provided block should take two arguments: the first one will be the Category and
122
+ # the second one will be the sub-cube or cell belonging to that category
123
+ def each(&block)
124
+ categories.each { |cat| yield(cat, self[cat.label]) }
125
+ end
126
+
127
+ # Maps all the categories of the current dimension and their corresponding values. See each.
128
+ def map(&block)
129
+ result = []
130
+ categories.each { |cat| result << yield(cat, self[cat.label]) }
131
+ return result
132
+ end
133
+
134
+ protected
135
+
136
+ # Set the result by hand. For internal use
137
+ def result=(array)
138
+ @result = array
139
+ end
140
+
141
+ # Walks over all the rows of the resultset to build the cube.
142
+ def populate_result_with(query_result)
143
+
144
+ query_result.each do |row|
145
+
146
+ result = @result
147
+ values = row.attributes_before_type_cast
148
+ discard_data = false
149
+
150
+ (@dimensions.length - 1).times do |dim_index|
151
+
152
+ category_name = values.delete("dimension_#{dim_index}")
153
+ if @dimensions[dim_index].is_field_dimension?
154
+ # this field contains the value of the category_field, which should be used as category
155
+ # this might be the first time this category is seen, so register it in the dimension
156
+ category_index = @dimensions[dim_index].register_category(category_name)
157
+
158
+ elsif category_name.nil?
159
+ # this is a record for rows that did not fall in any of the categories of a dimension
160
+ # therefore, this data can be discarded. This should not happen if an "other"-field is present!
161
+ discard_data = true
162
+ break
163
+
164
+ else
165
+ # get the index of the category, which should exist
166
+ category_index = @dimensions[dim_index].category_index(category_name.to_sym)
167
+ end
168
+
169
+ # switch the result to the next dimension
170
+ result[category_index] = [] if result[category_index].nil? # add a new dimension if needed
171
+ result = result[category_index] # set the result to the next dimension for the next iteration
172
+ end
173
+
174
+ unless discard_data
175
+ dim = @dimensions.last # only the last dimension is remaining
176
+ if dim.is_field_dimension?
177
+ # the last dimension is a field category.
178
+ # every category is represented as a single row, with only one count per row
179
+ dimension_field_value = values["dimension_#{@dimensions.length - 1}"]
180
+ result[dim.register_category(dimension_field_value)] = Aggregate.values(@aggregates, values)
181
+
182
+ elsif aggregates.length == 0
183
+ # the last dimension is a category with possible overlap, using SUMs.
184
+ # every category will have its number on this row
185
+ result = [] if result.nil?
186
+ values.each { |key, value| result[dim.category_index(key.to_sym)] = value.to_i }
187
+
188
+ else
189
+ # the last category is a normal category
190
+ dimension_field_value = values["dimension_#{@dimensions.length - 1}"]
191
+ result[dim.category_index(dimension_field_value.to_sym)] = Aggregate.values(@aggregates, values) unless dimension_field_value.nil?
192
+ end
193
+ end
194
+ end
195
+ end
196
+
197
+ # Makes sure all the values are set in the resulting array
198
+ def traverse_result_for_nils(result, depth = 0)
199
+ dim = @dimensions[depth]
200
+ if dim == @dimensions.last
201
+ # set all categories to 0 if no value is set
202
+ dim.categories.length.times do |i|
203
+ result[i] = Aggregate.default_values(@aggregates) if result[i].nil?
204
+ end
205
+ else
206
+ # if no value set, create an empty array and iterate to the next dimension
207
+ # so all values will be set to 0
208
+ dim.categories.length.times do |i|
209
+ result[i] = [] if result[i].nil?
210
+ traverse_result_for_nils(result[i], depth + 1)
211
+ end
212
+ end
213
+ end
214
+ end
215
+ end
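The cube stores its cells as nested arrays, one nesting level per dimension, and traverse_result_for_nils fills every cell that the query did not return. The sketch below illustrates that idea on plain arrays. fill_missing_cells is a hypothetical helper, and it simplifies the original by taking the number of categories per dimension as a list of sizes and a single default value instead of Dimension and Aggregate objects.

```ruby
# Hypothetical helper (not part of active_olap): fills the holes in a nested
# result array, in the spirit of Cube#traverse_result_for_nils.
# sizes lists the number of categories of each dimension, outermost first.
def fill_missing_cells(result, sizes, default = 0, depth = 0)
  sizes[depth].times do |i|
    if depth == sizes.length - 1
      # Last dimension: put the default value in every empty cell.
      result[i] = default if result[i].nil?
    else
      # Inner dimension: create the missing row, then descend into it.
      result[i] = [] if result[i].nil?
      fill_missing_cells(result[i], sizes, default, depth + 1)
    end
  end
  result
end

sizes  = [2, 2]                 # a 2 x 2 cube
sparse = [[nil, 5], nil]        # only one cell was returned by the query
dense  = fill_missing_cells(sparse, sizes)

# Same computation as Cube#cell_count: the product of the category counts.
cell_count = sizes.inject(1) { |n, size| n * size }

# Once every row is complete, Array#transpose can swap the two dimensions,
# which is what Cube#transpose relies on.
swapped = dense.transpose
```

Filling the holes first matters for transposition: Array#transpose raises an IndexError when the rows have different lengths.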