active_olap 0.0.2

data/.gitignore ADDED
@@ -0,0 +1,2 @@
1
+ .DS_Store
2
+ active_olap-*.gem
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2008 Willem van Bergen
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.textile ADDED
@@ -0,0 +1,59 @@
1
+ h1. Active OLAP
2
+
3
+ This Rails plugin makes it easy to add an OLAP interface to your application, which is
4
+ great for administration interfaces. Its main uses are collecting information about the
5
+ usage of your application and detecting inconsistencies and problems in your data.
6
+
7
+ This plugin provides:
8
+ * The main functions for OLAP querying: olap_query and olap_drilldown. These functions
9
+ must be enabled for your model by calling enable_active_olap on your model class.
10
+ * Functions to easily define dimensions, categories and aggregates to use in your
11
+ OLAP queries.
12
+
13
+ In the future, the following functionality is planned to be included:
14
+ * A helper module to generate tables and charts for the query results. The gchartrb gem
15
+ is needed for charts, as they are generated using the Google charts API.
16
+ * A controller that can be included in your Rails projects to get started quickly.
17
+
18
+ For more information about the concepts and usage of this plugin, see the Active OLAP Wiki on
19
+ GitHub: http://github.com/wvanbergen/active_olap/wikis. I have blogged about this plugin
20
+ on the Floorplanner tech blog: http://techblog.floorplanner.com/tag/active_olap/. Finally,
21
+ if you want to get involved or tinker with the code, you can access the repository at
22
+ http://github.com/wvanbergen/active_olap/tree.
23
+
24
+
25
+ h2. Why use this plugin?
26
+
27
+ This plugin simply runs SQL queries using the find-method of ActiveRecord. You might be
28
+ wondering why you would need a plugin for that.
29
+
30
+ First of all, it makes your life as a developer easier:
31
+ * This plugin generates the nasty SQL expressions for you using standards-compliant SQL,
32
+ handles issues with SQL NULL values and makes sure the results have a consistent format.
33
+ * You can define dimensions and aggregates that are "safe to use" or known to yield useful
34
+ results. Once dimensions and aggregates are defined, they can be combined at will safely
35
+ and without any coding effort, so it is suitable for management. :-)
36
+
37
+
38
+ h2. Requirements
39
+
40
+ This plugin is usable for any ActiveRecord-based model. Because named_scope is used for the
41
+ implementation, Rails 2.1 is required for it to work. It is tested to work with MySQL 5 and
42
+ SQLite 3 but should work with other databases as well, as it only generates
42
+ standards-compliant SQL queries.
44
+
45
+ Warning: OLAP queries can be heavy on the database. They can impact the performance of your
46
+ application if you perform them on the same server or database. Setting good indices is
47
+ helpful, but it may be a good idea to use a copy of the production database on another
48
+ server for these heavy queries.
49
+
50
+ Another warning: while this plugin makes it easy to perform OLAP queries and play around
51
+ with it, interpreting the results is hard and mistakes are easily made. At least, make sure
52
+ to validate the results before they are used for decision making.
53
+
54
+
55
+ h2. About this plugin
56
+
57
+ The plugin is written by Willem van Bergen for Floorplanner.com. It is MIT-licensed (see
58
+ MIT-LICENSE). If you have any questions or want to help out with the development of this plugin,
59
+ please contact me on willem AT vanbergen DOT org.
data/Rakefile ADDED
@@ -0,0 +1,5 @@
1
+ Dir[File.dirname(__FILE__) + "/tasks/*.rake"].each { |file| load(file) }
2
+
3
+ GithubGem::RakeTasks.new(:gem)
4
+
5
+ task :default => :test
@@ -0,0 +1,15 @@
1
+ Gem::Specification.new do |s|
2
+ s.name = 'active_olap'
3
+ s.version = '0.0.2'
4
+ s.date = '2008-12-23'
5
+
6
+ s.summary = "Extend ActiveRecord with OLAP query functionality"
7
+ s.description = "Extends ActiveRecord with functionality to perform OLAP queries on your data. Includes helper method to ease displaying the results."
8
+
9
+ s.authors = ['Willem van Bergen']
10
+ s.email = ['willem@vanbergen.org']
11
+ s.homepage = 'http://github.com/wvanbergen/active_olap/wikis'
12
+
13
+ s.files = %w(test/helper_modules_test.rb spec/spec_helper.rb .gitignore lib/active_olap/helpers/table_helper.rb lib/active_olap/dimension.rb test/active_olap_test.rb lib/active_olap/helpers/display_helper.rb init.rb README.textile spec/integration/active_olap_spec.rb lib/active_olap/test/assertions.rb lib/active_olap/category.rb active_olap.gemspec Rakefile MIT-LICENSE tasks/github-gem.rake lib/active_olap.rb test/helper.rb lib/active_olap/helpers/form_helper.rb lib/active_olap/aggregate.rb spec/unit/cube_spec.rb lib/active_olap/helpers/chart_helper.rb lib/active_olap/cube.rb lib/active_olap/configurator.rb)
14
+ s.test_files = %w(test/helper_modules_test.rb test/active_olap_test.rb spec/integration/active_olap_spec.rb spec/unit/cube_spec.rb)
15
+ end
data/init.rb ADDED
@@ -0,0 +1,2 @@
1
+ $:.unshift(File.dirname(__FILE__) + '/lib')
2
+ require 'active_olap'
@@ -0,0 +1,148 @@
1
+ module ActiveOLAP
2
+
3
+ class Aggregate
4
+
5
+ attr_reader :klass
6
+ attr_reader :label
7
+
8
+ attr_reader :function
9
+ attr_reader :distinct
10
+ attr_reader :expression
11
+
12
+ attr_reader :joins
13
+ attr_reader :info
14
+
15
+ def self.all_from_olap_query_call(klass, aggregates_given)
16
+ aggregates_given = [aggregates_given] unless aggregates_given.kind_of?(Array)
17
+
18
+ return aggregates_given.map do |aggregate_definition|
19
+ if aggregate_definition.kind_of?(Symbol) && klass.active_olap_aggregates.has_key?(aggregate_definition)
20
+ Aggregate.from_configuration(klass, aggregate_definition)
21
+ else
22
+ Aggregate.create(klass, aggregate_definition.to_sym, aggregate_definition)
23
+ end
24
+ end
25
+ end
26
+
27
+ def initialize(klass, label, function, expression = nil, distinct = false)
28
+ @klass = klass
29
+ @label = label
30
+ @function = function
31
+ @expression = expression
32
+ @distinct = distinct
33
+ @joins = []
34
+ @info = {}
35
+ end
36
+
37
+
38
+ def self.create(klass, label, definition)
39
+ case definition
40
+ when Symbol
41
+ return from_symbol(klass, label, definition)
42
+ when String
43
+ return from_string(klass, label, definition)
44
+ when Hash
45
+ return from_hash(klass, label, definition)
46
+ else
47
+ raise "Invalid aggregate definition: #{definition.inspect}"
48
+ end
49
+ end
50
+
51
+ def self.from_configuration(klass, aggregate_name, label = nil)
52
+ label = aggregate_name.to_sym if label.nil?
53
+ if klass.active_olap_aggregates[aggregate_name].respond_to?(:call)
54
+ return Aggregate.create(klass, label, klass.active_olap_aggregates[aggregate_name].call)
55
+ else
56
+ return Aggregate.create(klass, label, klass.active_olap_aggregates[aggregate_name])
57
+ end
58
+ end
59
+
60
+ def self.from_hash(klass, label, hash)
61
+ hash = hash.clone
62
+ agg = Aggregate.create(klass, label, hash.delete(:expression))
63
+ agg.joins.concat(hash[:joins].kind_of?(Array) ? hash.delete(:joins) : [hash.delete(:joins)]) if hash.has_key?(:joins)
64
+ hash.each { |key, val| agg.info[key] = val }
65
+ return agg
66
+ end
67
+
68
+ def self.from_string(klass, label, sql_expression)
69
+ if sql_expression =~ /^(\w+)\((.+)\)$/
70
+ return Aggregate.new(klass, label, $1.downcase.to_sym, $2, false)
71
+ else
72
+ raise "Invalid aggregate SQL expression: " + sql_expression
73
+ end
74
+ end
75
+
76
+ def self.from_symbol(klass, label, aggregate_name)
77
+
78
+ case aggregate_name
79
+ when :count_all
80
+ return Aggregate.new(klass, label, :count, '*', false) # with table name?
81
+ when :count_distinct_all
82
+ return Aggregate.new(klass, label, :count, '*', true) # with table name?
83
+ when :count
84
+ return Aggregate.new(klass, label, :count, :id, false)
85
+ when :count_distinct
86
+ return Aggregate.new(klass, label, :count, :id, true)
87
+
88
+ else
89
+ parts = aggregate_name.to_s.split('_')
90
+ raise "Invalid aggregate name: #{aggregate_name.inspect}" unless parts.length > 1
91
+
92
+ distinct = false
93
+ if parts[1] == 'distinct'
94
+ parts.delete_at(1)
95
+ distinct = true
96
+ end
97
+
98
+ raise "Invalid aggregate name: #{aggregate_name.inspect}" unless parts.length >= 2
99
+ #TODO: check field name and function name?
100
+ return Aggregate.new(klass, label, parts[0].to_sym, parts[1..-1].join('_').to_sym, distinct)
101
+ end
102
+ end
103
+
104
+ def to_sanitized_sql
105
+ sql = @function.to_s.upcase + '(' # upcase, not upcase!: the bang version returns nil if nothing changed
106
+ sql << 'DISTINCT ' if @distinct
107
+ sql << (@expression.kind_of?(Symbol) ? "#{quote_table}.#{quote_column(@expression)}" : @expression.to_s)
108
+ sql << ") AS #{quote_column(@label)}"
109
+ end
110
+
111
+ def is_count_with_overlap?
112
+ @function == :count_with_overlap
113
+ end
114
+
115
+ def cast_value(source)
116
+ return nil if source.nil?
117
+ (@function == :count) ? source.to_i : source.to_f # TODO: better?
118
+ end
119
+
120
+ def default_value
121
+ (@function == :count) ? 0 : nil # TODO: better?
122
+ end
123
+
124
+ def self.values(aggregates, source)
125
+ result = HashWithIndifferentAccess.new
126
+ aggregates.each { |agg| result[agg.label] = agg.cast_value(source[agg.label.to_s]) }
127
+ return (aggregates.length == 1) ? result[aggregates.first.label] : result
128
+ end
129
+
130
+ def self.default_values(aggregates)
131
+ return 0 if aggregates.empty? # count with overlap
132
+ result = HashWithIndifferentAccess.new
133
+ aggregates.each { |agg| result[agg.label] = agg.default_value }
134
+ return (aggregates.length == 1) ? result[aggregates.first.label] : result
135
+ end
136
+
137
+ protected
138
+
139
+ def quote_column(column)
140
+ @klass.connection.send(:quote_column_name, column.to_s)
141
+ end
142
+
143
+ def quote_table
144
+ @klass.connection.send(:quote_table_name, @klass.table_name)
145
+ end
146
+ end
147
+
148
+ end
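
The naming convention that Aggregate.from_symbol handles above can be tried out in isolation. What follows is a minimal standalone sketch, not the plugin's code: `parse_aggregate_name` is a hypothetical helper introduced only for this illustration. It returns a plain `[function, expression, distinct]` array instead of an Aggregate object, and it raises ArgumentError where the plugin raises a RuntimeError.

```ruby
# Standalone sketch of the symbol naming convention (hypothetical helper,
# not part of active_olap). Returns [function, expression, distinct].
def parse_aggregate_name(name)
  # The four special-cased names are handled first.
  case name
  when :count_all          then return [:count, '*', false]
  when :count_distinct_all then return [:count, '*', true]
  when :count              then return [:count, :id, false]
  when :count_distinct     then return [:count, :id, true]
  end

  # Everything else is read as <function>[_distinct]_<field name>.
  parts = name.to_s.split('_')
  distinct = false
  if parts[1] == 'distinct'
    parts.delete_at(1)
    distinct = true
  end
  raise ArgumentError, "Invalid aggregate name: #{name.inspect}" unless parts.length >= 2

  # The field name may itself contain underscores, so re-join the rest.
  [parts[0].to_sym, parts[1..-1].join('_').to_sym, distinct]
end

p parse_aggregate_name(:sum_total_price)     # => [:sum, :total_price, false]
p parse_aggregate_name(:avg_distinct_rating) # => [:avg, :rating, true]
p parse_aggregate_name(:count_distinct_all)  # => [:count, "*", true]
```

One consequence of this convention is that a field whose name starts with `distinct_` cannot be addressed by symbol alone, because the second part is always read as the DISTINCT flag.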
@@ -0,0 +1,46 @@
1
+ module ActiveOLAP
2
+
3
+ class Category
4
+
5
+ attr_reader :dimension, :label, :conditions, :info
6
+
7
+ # initializes a category, given the dimension it belongs to, a label,
8
+ # and a definition. The definition should be a hash with at least the
9
+ # key expression set to a usable ActiveRecord#find conditions
10
+ def initialize(dimension, label, definition)
11
+ @dimension = dimension
12
+ @label = label
13
+ @info = {}
14
+
15
+ if definition.kind_of?(Hash) && definition.has_key?(:expression)
16
+ @conditions = definition[:expression]
17
+ @info = definition.reject { |k,v| k == :expression }
18
+ else
19
+ @conditions = definition
20
+ end
21
+ end
22
+
23
+ # Returns the index of this category in the corresponding dimension
24
+ def index
25
+ @dimension.category_index(@label)
26
+ end
27
+
28
+ # Returns a sanitized SQL expression for this category
29
+ def to_sanitized_sql
30
+ @dimension.klass.send(:sanitize_sql, @conditions)
31
+ end
32
+
33
+ def to_count_sql(count_what)
34
+ "COUNT(DISTINCT CASE WHEN (#{to_sanitized_sql}) THEN #{count_what} ELSE NULL END)
35
+ AS #{@dimension.klass.connection.send(:quote_column_name, label.to_s)}"
36
+ end
37
+
38
+ # Returns the label of this category as a string
39
+ def to_s
40
+ return "nil" if label.nil?
41
+ label.to_s
42
+ end
43
+
44
+ end
45
+
46
+ end
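
The SQL fragment built by Category#to_count_sql above is easier to read when generated outside of ActiveRecord. The sketch below is an illustration under stated assumptions: `count_sql` is a hypothetical helper, the condition is passed in as an already sanitized SQL string, and ANSI double-quote identifier quoting stands in for the connection's `quote_column_name`. It also emits the fragment on a single line, whereas the original string literal contains a line break.

```ruby
# Standalone sketch (hypothetical helper, not part of active_olap).
# condition_sql is assumed to be sanitized already; the label is quoted with
# ANSI double quotes as a stand-in for the adapter-specific quoting.
def count_sql(condition_sql, count_what, label)
  quoted_label = %("#{label.to_s.gsub('"', '""')}")
  "COUNT(DISTINCT CASE WHEN (#{condition_sql}) THEN #{count_what} ELSE NULL END) AS #{quoted_label}"
end

puts count_sql("age < 18", "users.id", :minor)
# prints: COUNT(DISTINCT CASE WHEN (age < 18) THEN users.id ELSE NULL END) AS "minor"
```

Because each category becomes its own column and rows that do not match contribute NULL (which COUNT ignores), a single row can be counted in several categories at once.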
@@ -0,0 +1,32 @@
1
+ module ActiveOLAP
2
+
3
+ class Configurator
4
+
5
+ # Initializes an ActiveOLAP::Configurator object, which is used in the block
6
+ # passed to the call enable_active_olap. It can be used to register
7
+ # dimensions and aggregates
8
+ def initialize(klass)
9
+ @klass = klass
10
+ end
11
+
12
+ # registers a dimension for the class it belongs to
13
+ def dimension(name, definition = nil)
14
+ definition = name.to_sym if definition.nil?
15
+ @klass.active_olap_dimensions[name] = definition
16
+ end
17
+
18
+ def time_dimension(name, field, defaults = {})
19
+ @klass.active_olap_dimensions[name] = Proc.new do |*options|
20
+ options = options.empty? ? {} : options.first
21
+ { :trend => defaults.merge(options).merge(:timestamp_field => field) }
22
+ end
23
+ end
24
+
25
+ # registers an aggregate for the class it belongs to
26
+ def aggregate(name, definition = nil, options = {})
27
+ definition = name if definition.nil?
28
+ agg_definition = options.merge(:expression => definition)
29
+ @klass.active_olap_aggregates[name] = agg_definition
30
+ end
31
+ end
32
+ end
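
The Proc registered by time_dimension above merges three sources of options in a fixed order. The following is a minimal sketch of that merge outside the plugin: `build_time_dimension` is a hypothetical stand-in for the method body, and `:period_count` is an invented option key used only to make the precedence visible.

```ruby
# Standalone sketch (hypothetical helper, not part of active_olap).
# Mirrors the Proc body of Configurator#time_dimension.
def build_time_dimension(field, defaults = {})
  Proc.new do |*options|
    # Calling the Proc without arguments falls back to an empty options hash.
    options = options.empty? ? {} : options.first
    # Precedence: defaults < per-call options < the configured timestamp field.
    { :trend => defaults.merge(options).merge(:timestamp_field => field) }
  end
end

# :period_count is an assumed key, chosen only for illustration.
dimension = build_time_dimension(:created_at, { :period_count => 14 })

p dimension.call[:trend][:period_count]                       # => 14
p dimension.call({ :period_count => 7 })[:trend][:period_count] # => 7
p dimension.call({ :timestamp_field => :other })[:trend][:timestamp_field] # => :created_at
```

Per-call options override the registered defaults, but the timestamp field given at registration time always wins because it is merged last.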
@@ -0,0 +1,215 @@
1
+ module ActiveOLAP
2
+
3
+ class Cube
4
+
5
+ attr_accessor :info
6
+ attr_accessor :klass
7
+ attr_accessor :dimensions
8
+ attr_accessor :aggregates
9
+
10
+ # Initializes a new OLAP cube.
11
+ def initialize(klass, dimensions, aggregates, query_result = nil)
12
+ @klass = klass
13
+ @dimensions = dimensions
14
+ @aggregates = aggregates
15
+ @info = {}
16
+
17
+ # populates the cube with the query result if it is provided.
18
+ unless query_result.nil?
19
+ @result = []
20
+ populate_result_with(query_result)
21
+ traverse_result_for_nils(@result)
22
+ end
23
+ end
24
+
25
+ # Sums up all the values in this cube
26
+ def sum(agg = nil)
27
+ raise "Please provide the aggregate you want to sum." if self.aggregates.length > 1 && agg.nil?
28
+ total_sum = 0
29
+ self.each do |cat, value|
30
+ total_sum += (value.kind_of?(Cube) ? value.sum(agg) : (agg.nil? ? value : value[agg]))
31
+ end
32
+ return total_sum
33
+ end
34
+
35
+ # Returns the total number of cells in this cube. Note that this does not take aggregates into
36
+ # account, so the result should be multiplied by the number of aggregates if you want to know
37
+ # the total number of (numeric) values.
38
+ def cell_count
39
+ dimensions.inject(1) { |intermediate, dimension| intermediate * dimension.categories.length }
40
+ end
41
+
42
+ # Returns a reference to the internal array that holds the raw results of this cube.
43
+ # Altering this array will alter the internals of the cube-object, so make sure you know what
44
+ # you are doing. Use to_a if you want to obtain a copy of the internal array
45
+ def raw_results
46
+ @result
47
+ end
48
+
49
+ # Returns a clone of the internal array that holds the raw results of this cube.
50
+ def to_a
51
+ @result.clone
52
+ end
53
+
54
+ # Returns the array of categories of the current (= first) dimension
55
+ def categories
56
+ @dimensions.first.categories
57
+ end
58
+
59
+ # Returns the current (first) dimension
60
+ def dimension
61
+ @dimensions.first
62
+ end
63
+
64
+ # Returns the number of dimensions in this cube
65
+ def depth
66
+ @dimensions.length
67
+ end
68
+
69
+ # Returns the number of categories in the current (= first) dimension
70
+ def breadth
71
+ @result.length
72
+ end
73
+
74
+ # Switches the dimensions of a two-dimensional cube
75
+ def transpose
76
+ raise "Can only transpose 2-dimensional results" unless depth == 2
77
+ result_object = Cube.new(@klass, [@dimensions.last, @dimensions.first], @aggregates)
78
+ result_object.result = @result.transpose
79
+ return result_object
80
+ end
81
+
82
+ def reorder_dimensions(*order)
83
+ # IMPLEMENT ME
84
+ end
85
+
86
+ def only_aggregate(aggregate_label)
87
+ # IMPLEMENT ME
88
+ end
89
+
90
+ def only_dimension(dimension_index)
91
+ # IMPLEMENT ME
92
+ end
93
+
94
+ def except_dimension(dimension_index)
95
+ # IMPLEMENT ME
96
+ end
97
+
98
+ # Returns a part of the cube or a single cell
99
+ # If the number of arguments matches the number of dimensions, a single cell is returned;
100
+ # If the number of arguments is less than the number of dimensions, this function will return
101
+ # a cube with (dimensions.length - args.length) dimensions.
102
+ def [](*args)
103
+ result = @result.clone
104
+ args.each_with_index do |cat_label, index|
105
+ cat_index = @dimensions[index].category_index(cat_label)
106
+ return nil if cat_index.nil?
107
+ result = result[cat_index]
108
+ end
109
+
110
+ if result.kind_of?(Array)
111
+ # build a new query_result object if not enough dimensions were provided
112
+ result_object = Cube.new(@klass, @dimensions[args.length...@dimensions.length], @aggregates)
113
+ result_object.result = result
114
+ return result_object
115
+ else
116
+ return result
117
+ end
118
+ end
119
+
120
+ # Iterates over all the categories of the current dimension and their corresponding values.
121
+ # The provided block should take two arguments: the first one will be the Category and
122
+ # the second one will be the sub-cube or cell belonging to that category
123
+ def each(&block)
124
+ categories.each { |cat| yield(cat, self[cat.label]) }
125
+ end
126
+
127
+ # Maps all the categories of the current dimension and their corresponding values. See each.
128
+ def map(&block)
129
+ result = []
130
+ categories.each { |cat| result << yield(cat, self[cat.label]) }
131
+ return result
132
+ end
133
+
134
+ protected
135
+
136
+ # Set the result by hand. For internal use
137
+ def result=(array)
138
+ @result = array
139
+ end
140
+
141
+ # Walks over all the rows of the resultset to build the cube.
142
+ def populate_result_with(query_result)
143
+
144
+ query_result.each do |row|
145
+
146
+ result = @result
147
+ values = row.attributes_before_type_cast
148
+ discard_data = false
149
+
150
+ (@dimensions.length - 1).times do |dim_index|
151
+
152
+ category_name = values.delete("dimension_#{dim_index}")
153
+ if @dimensions[dim_index].is_field_dimension?
154
+ # this field contains the value of the category_field, which should be used as category
155
+ # this might be the first time this category is seen, so register it in the dimension
156
+ category_index = @dimensions[dim_index].register_category(category_name)
157
+
158
+ elsif category_name.nil?
159
+ # this is a record for rows that did not fall in any of the categories of a dimension
160
+ # therefore, this data can be discarded. This should not happen if an "other"-field is present!
161
+ discard_data = true
162
+ break
163
+
164
+ else
165
+ # get the index of the category, which should exist
166
+ category_index = @dimensions[dim_index].category_index(category_name.to_sym)
167
+ end
168
+
169
+ # switch the result to the next dimension
170
+ result[category_index] = [] if result[category_index].nil? # add a new dimension if needed
171
+ result = result[category_index] # set the result to the next dimension for the next iteration
172
+ end
173
+
174
+ unless discard_data
175
+ dim = @dimensions.last # only the last dimension is remaining
176
+ if dim.is_field_dimension?
177
+ # the last dimension is a field category.
178
+ # every category is represented as a single row, with only one count per row
179
+ dimension_field_value = values["dimension_#{@dimensions.length - 1}"]
180
+ result[dim.register_category(dimension_field_value)] = Aggregate.values(@aggregates, values)
181
+
182
+ elsif aggregates.length == 0
183
+ # the last dimension is a category with possible overlap, using SUMs.
184
+ # every category will have its number on this row
185
+ result = [] if result.nil?
186
+ values.each { |key, value| result[dim.category_index(key.to_sym)] = value.to_i }
187
+
188
+ else
189
+ # the last category is a normal category
190
+ dimension_field_value = values["dimension_#{@dimensions.length - 1}"]
191
+ result[dim.category_index(dimension_field_value.to_sym)] = Aggregate.values(@aggregates, values) unless dimension_field_value.nil?
192
+ end
193
+ end
194
+ end
195
+ end
196
+
197
+ # Makes sure all the values are set in the resulting array
198
+ def traverse_result_for_nils(result, depth = 0)
199
+ dim = @dimensions[depth]
200
+ if dim == @dimensions.last
201
+ # set all categories to 0 if no value is set
202
+ dim.categories.length.times do |i|
203
+ result[i] = Aggregate.default_values(@aggregates) if result[i].nil?
204
+ end
205
+ else
206
+ # if no value set, create an empty array and iterate to the next dimension
207
+ # so all values will be set to 0
208
+ dim.categories.length.times do |i|
209
+ result[i] = [] if result[i].nil?
210
+ traverse_result_for_nils(result[i], depth + 1)
211
+ end
212
+ end
213
+ end
214
+ end
215
+ end
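
The cube stores its values as nested arrays, one nesting level per dimension, and traverse_result_for_nils above pads that structure so every cell holds a value. The sketch below reproduces the idea on plain arrays, as an illustration only: `fill_missing` is a hypothetical helper, the `sizes` array stands in for the number of categories per dimension, and a fixed default replaces the call to Aggregate.default_values.

```ruby
# Standalone sketch (hypothetical helper, not part of active_olap).
# sizes[i] plays the role of dimensions[i].categories.length.
def fill_missing(result, sizes, default, depth = 0)
  if depth == sizes.length - 1
    # Last dimension: every missing cell gets the default value.
    sizes[depth].times { |i| result[i] = default if result[i].nil? }
  else
    # Inner dimension: create missing sub-arrays, then recurse into each one.
    sizes[depth].times do |i|
      result[i] = [] if result[i].nil?
      fill_missing(result[i], sizes, default, depth + 1)
    end
  end
  result
end

sizes  = [2, 3]            # two categories in the first dimension, three in the second
sparse = [[nil, 5], nil]   # only one cell was returned by the query
grid   = fill_missing(sparse, sizes, 0)

p grid                                     # => [[0, 5, 0], [0, 0, 0]]
p sizes.inject(1) { |acc, n| acc * n }     # => 6, computed the same way as cell_count
p grid[0][1]                               # => 5, one index per dimension
```

Padding up front is what lets Cube#[] and Cube#each walk the arrays by category index without checking for missing rows.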