cubicle 0.1.2 → 0.1.3

Files changed (37)
  1. data/CHANGELOG.rdoc +14 -0
  2. data/README.rdoc +188 -174
  3. data/cubicle.gemspec +26 -10
  4. data/lib/cubicle.rb +47 -422
  5. data/lib/cubicle/aggregation.rb +58 -7
  6. data/lib/cubicle/aggregation/ad_hoc.rb +12 -0
  7. data/lib/cubicle/aggregation/aggregation_manager.rb +212 -0
  8. data/lib/cubicle/aggregation/dsl.rb +108 -0
  9. data/lib/cubicle/aggregation/map_reduce_helper.rb +55 -0
  10. data/lib/cubicle/data.rb +29 -84
  11. data/lib/cubicle/data/hierarchy.rb +55 -0
  12. data/lib/cubicle/data/level.rb +62 -0
  13. data/lib/cubicle/data/member.rb +28 -0
  14. data/lib/cubicle/data/table.rb +56 -0
  15. data/lib/cubicle/measure.rb +30 -20
  16. data/lib/cubicle/mongo_mapper/aggregate_plugin.rb +1 -1
  17. data/lib/cubicle/ordered_hash_with_indifferent_access.rb +27 -0
  18. data/lib/cubicle/query.rb +21 -194
  19. data/lib/cubicle/query/dsl.rb +118 -0
  20. data/lib/cubicle/query/dsl/time_intelligence.rb +89 -0
  21. data/lib/cubicle/ratio.rb +28 -12
  22. data/lib/cubicle/version.rb +2 -2
  23. data/test/cubicle/aggregation/ad_hoc_test.rb +21 -0
  24. data/test/cubicle/cubicle_aggregation_test.rb +84 -20
  25. data/test/cubicle/cubicle_query_test.rb +36 -0
  26. data/test/cubicle/data/data_test.rb +30 -0
  27. data/test/cubicle/data/level_test.rb +42 -0
  28. data/test/cubicle/data/member_test.rb +40 -0
  29. data/test/cubicle/{cubicle_data_test.rb → data/table_test.rb} +50 -50
  30. data/test/cubicle/duration_test.rb +46 -48
  31. data/test/cubicle/ordered_hash_with_indifferent_access_test.rb +19 -0
  32. data/test/cubicles/defect_cubicle.rb +31 -31
  33. data/test/log/test.log +102066 -0
  34. metadata +26 -10
  35. data/lib/cubicle/data_level.rb +0 -60
  36. data/test/cubicle/cubicle_data_level_test.rb +0 -58
  37. data/test/cubicle/cubicle_test.rb +0 -85
data/CHANGELOG.rdoc CHANGED
@@ -1,2 +1,16 @@
1
+ == 0.1.3
2
+ * Formalized flat (Cubicle::Data::Table) and hierarchical (Cubicle::Data::Hierarchy) data results from cubicle queries,
3
+ and added client side support for rolling up measure values in hierarchical data so that no matter how a given
4
+ query is organized hierarchically, a summary of aggregated measure values for each level of data will be available.
5
+ * Re-organized codebase
6
+ * Bug fixes
7
+
8
+ == 0.1.2
9
+ * Added ability to calculate durations between two timestamps
10
+ * Bug fixes
11
+
12
+ == 0.1.1
13
+ * Fixed a bug that required a logger to be initialized for the thing to work
14
+
1
15
  == 0.1.0
2
16
  * Initial release
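The 0.1.3 changelog entry above describes client-side roll-up of measure values at every level of a hierarchical result. The following is only a rough sketch of that idea in plain Ruby, not Cubicle's implementation: the `rollup` helper, the `:summary`/`:children`/`:rows` keys and the sample rows are all hypothetical, and it only handles measures that can be summed (averages and ratios would have to be recomputed from their components).

```ruby
# Hypothetical helper, not Cubicle's implementation: groups flat result rows
# by the given dimensions and attaches summed measure totals to every level.
def rollup(rows, dimensions, measures)
  # Sum each measure over all rows that belong to this level.
  summary = measures.each_with_object({}) do |measure, totals|
    totals[measure] = rows.sum { |row| row[measure] }
  end
  # Leaf level: no dimensions left to group by, keep the raw rows.
  return { summary: summary, rows: rows } if dimensions.empty?

  first, *rest = dimensions
  children = rows.group_by { |row| row[first] }.transform_values do |group|
    rollup(group, rest, measures)
  end
  { summary: summary, children: children }
end

# Made-up flat rows, shaped like a two dimensional query result.
rows = [
  { table: 'A', winning_hand: 'flush', total_hands: 2, total_winnings: 30.0 },
  { table: 'A', winning_hand: 'draw',  total_hands: 1, total_winnings: 0.0 },
  { table: 'B', winning_hand: 'flush', total_hands: 3, total_winnings: 45.0 }
]

tree = rollup(rows, [:table, :winning_hand], [:total_hands, :total_winnings])
puts tree[:summary][:total_hands]                 # grand total across all tables
puts tree[:children]['A'][:summary][:total_hands] # subtotal for table 'A'
```

Whatever order the dimensions are given in, each node carries its own summary, which is the property the changelog entry describes.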
data/README.rdoc CHANGED
@@ -1,175 +1,189 @@
1
- == Overview
2
- Cubicle is a Ruby library and DSL for automating the generation, execution and caching of common aggregations of MongoDB documents. Cubicle was born from the need to easily extract simple, processed statistical views of raw, real time business data being collected from a variety of systems.
3
-
4
- == Motivation
5
- Aggregating data in MongoDB, unlike relational or multidimensional (OLAP) database, requires writing custom reduce functions in JavaScript for the simplest cases and full map reduce functions in the more complex cases, even for common aggregations like sums or averages.
6
-
7
- While writing such map reduce functions isn't particularly difficult it can be tedious and error prone and requires switching from Ruby to JavaScript. Cubicle presents a simplified Ruby DSL for generating the JavaScript required for most common aggregation tasks and also handles processing, caching and presenting the results. JavaScript is still required in some cases, but is limited to constructing simple data transformation expressions.
8
-
9
-
10
- == Approach
11
- Cubicle breaks the task of defining and executing aggregation queries into two pieces. The first is the Cubicle, an analysis friendly 'view' of the underlying collection which defines both the attributes that will be used for grouping (dimensions) , the numerical fields that will be aggregated (measures), and kind of aggregation will be applied to each measure. The second piece of the Cubicle puzzle is a Query which specifies which particular dimensions or measures will be selected from the Cubicle for a given data request, along with how the resulting data will be filtered, ordered, paginated and organized.
12
-
13
- == Install
14
-
15
- Install the gem with:
16
-
17
- gem install cubicle
18
- or
19
- sudo gem install cubicle
20
-
21
-
22
- == An Example
23
- Given a document with the following structure (I'm using MongoMapper here as the ORM, but MongoMapper, or any other ORM, is not required by Cubicle, it works directly with the Mongo-Ruby Driver)
24
-
25
- class PokerHand
26
- include MongoMapper::Document
27
-
28
- key :match_date, String #we use iso8601 strings for dates, but Time works too
29
- key :table, String
30
- key :winner, Person # {:person=>{:name=>'Jim', :address=>{...}...}}
31
- key :winning_hand, Symbol #:two_of_a_kind, :full_house, etc...
32
- key :amount_won, Float
33
- end
34
-
35
- == The Cubicle
36
- here's how a Cubicle designed to analyze these poker hands might look:
37
-
38
- class PokerHandCubicle
39
- extend Cubicle
40
-
41
- date :date, :field_name=>'match_date'
42
- dimension :month, :expression=>'this.match_date.substring(0,7)'
43
- dimension :year, :expression=>'this.match_date.substring(0,4)'
44
-
45
- dimensions :table,
46
- :winning_hand
47
- dimension :winner, :field_name=>'winner.name'
48
-
49
- count :total_hands, :expression=>'true'
50
- count :total_draws, :expression=>'this.winning_hand=="draw"'
51
- sum :total_winnings, :field_name=>'amount_won'
52
- avg :avg_winnings, :field_name=>'amount_won'
53
-
54
- ratio :draw_pct, :total_draws, :total_hands
55
- end
56
-
57
- == The Queries
58
- The Queries
59
- And here's how you would use this cubicle to query the underlying data:
60
-
61
- aggregated_data = PokerHandCubicle.query
62
-
63
-
64
- Issuing an empty query to the cubicle like the one above will return a list of measures aggregated according to type for each combination of dimensions. However, once a Cubicle has been defined, you can query it in many different ways. For instance if you wanted to see the total number of hands by type, you could do this:
65
-
66
- hands_by_type = PokerHandCubicle.query { select :winning_hand, :total_hands }
67
-
68
- Or, if you wanted to see the total amount won with a full house, by player, sorted by amount won, you could do this:
69
-
70
- full_houses_by_player = PokerHandCubicle.query do
71
- select :winner
72
- where :winning_hand=>'full_house'
73
- order_by :total_winnings
74
- end
75
-
76
- Cubicle can return your data in a hierarchy (tree) too, if you want. If you wanted to see the percent of hands resulting in a draw by table by day, you could do this:
77
-
78
- draw_pct_by_player_by_day = PokerHandCubicle.query do
79
- select :draw_pct
80
- by :date, :table
81
- end
82
-
83
- In addition to the basic query primitives such as select, where, by and order_by, Cubicle has a basic understanding of time, so as long as you have a dimension in your cubicle defined using 'date', and that dimension is either an iso8601 string or an instance of Time, then you can easily perform some handy date filtering in the DSL:
84
-
85
- winnings_last_30_days_by_player = PokerHandCubicle.query do
86
- select :winner, :total_winnings
87
- for_the_last 30.days
88
- end
89
-
90
- or
91
-
92
- winnings_ytd_by_player = PokerHandCubicle.query do
93
- select :winner, :all_measures
94
- year_to_date
95
- order_by [:total_winnings, :desc]
96
- end
97
-
98
- == The Results
99
- Cubicle data is returned as either an array of hashes, for a two dimensional query, or a hash-based tree the leaves of which are arrays of hashes for hierarchical data (via queries using the 'by' keyword)
100
-
101
- Flat data:
102
- [{:dimension1=>'d1', :dimension2=>'d1', :measure1=>'1.0'},{:dimension1=>'d2'...
103
-
104
- Hierarchical data 2 levels deep:
105
- {'dimension 1'=>{'dimension2'=>[{:measures1=>'1.0'}],'dimension2b'=>[{measure1=>'2.0'}],...
106
-
107
- When you request two dimensional data (i.e. you do not use the 'by' keyword) you can transform your two dimensional data set into hierarchical data at any time using the 'hierarchize' method, specifying the dimensions you want to use in your hierarchy:
108
-
109
- data = MyCubicle.query {select :date, :name, :all_measures}
110
- hierarchized_data = data.hierarchize :name, :date
111
-
112
- This will result in a hash containing each unique value for :name in your source collection, and for each unique :name, a hash containing each unique :date with that :name, and for each :date, an array of hashes keyed by the measures in your Cubicle.
113
-
114
- == Caching & Processing
115
- Map reduce operations, especially over large or very large data sets, can take time to complete. Sometimes a long time. However, very often what you want to do is present a graph or a table of numbers to an interactive user on your website, and you probably don't want to make them wait for all your bazillion rows of raw data to be reduced down to the handful of numbers they are actually interested in seeing. For this reason, Cubicle has two modes of operation, the normal default mode in which aggregations are automatically cached until YourCubicle.expire! Or YourCubicle.process is called, and transient mode, which bypasses the caching mechanisms and executes real time queries against the raw source data.
116
-
117
- == Preprocessed Aggregations
118
- The expected normal mode of operation, however, is cached mode. While far from anything actually resembling an OLAP cube, Cubicle was designed to to process data on some periodic schedule and provide quick access to stored, aggregated data in between each processing, much like a real OLAP cube. Also reminiscent of an OLAP cube, Cubicle will cache aggregations at various levels of resolution, depending on the aggregations that were set up when defining a cubicle and depending on what queries are executed. For example, if a given Cubicle has three dimensions, Name, City and Date, when the Cubicle is processed, it will calculated aggregated measures for each combination of values on those three fields. If a query is executed that only requires Name and Date, then Cubicle will aggregate and cache measures by just Name and Date. If a third query asks for just Name, then Cubicle will create an aggregation based just on Name, but rather than using the original data source with its many rows, it will execute its map reduce against the previously cached Name-Date aggregation, which by definition will have fewer rows and should therefore perform faster. If you are aware ahead of time the aggregations your queries will need, you can specify them in the Cubicle definition, like this
119
- class MyCubicle
120
- extend Cubicle
121
- dimension :name
122
- dimension :date
123
- ...
124
- avg :my_measure
125
- ...
126
- aggregate :name, :date
127
- aggregate :name
128
- aggregate :date
129
- end
130
-
131
- When aggregations are specified in this way, then Cubicle will pre-aggregate your data for each of the specified combinations of dimensions whenever MyCubicle.process is called, eliminating the first-hit penalty that would otherwise be incurred when Cubicle encountered a given aggregation for the first time.
132
-
133
- == Transient (Real Time) Queries
134
- Sometimes you may not want to query cached data. In our application, we are using Cubicle to provide data for our performance management Key Performance Indicators (KPI's) which consist of both a historical trend of a particular metric as well as the current, real time value of the same metric for, say, the current month or a rolling 30 day period. For performance reasons, we fetch our trend, which is usually 12 months, from cached data but want up to the minute freshness for our real time KPI values, so we need to query the living source data. To accomplish this using Cubicle, you simply insert 'transient!' into your query definition, like so
135
-
136
- MyCubicle.query do
137
- transient!
138
- select :this, :that, :the_other
139
- end
140
-
141
- This will bypass cached aggregations and execute a map reduce query directly against the cubicle source collection.
142
-
143
- == Mongo Mapper plugin
144
- If MongoMapper is detected, Cubicle will use its connection to MongoDB. Additionally, Cubicle will install a simple MongoMapper plugin for doing ad-hoc, non-cached aggregations on the fly from a MongoMapper document, like this:
145
- MyMongoMapperModel.aggregate do
146
- dimension :my_dimension
147
- count :measure1
148
- avg :measure2
149
- end.query {order_by [:measure2, :desc]; limit 10;}
150
-
151
- == Limitations
152
- * Cubicle cannot currently cause child documents to be emitted in the map reduce. This is a pretty big limitation, and will be resolved shortly.
153
- * Documentation is non-existent. This is being worked on (head that one before?)
154
- * Test coverage is OK, but the tests could be better organized
155
- * Code needs to be modularized a bit, main classes are a bit hairy at the moment
156
-
157
-
158
- == Credits
159
-
160
- * Alex Wang, Patrick Gannon for features, fixes & testing
161
-
162
- == Bugs/Issues
163
- Please report them {on github}[http://github.com/plasticlizard/cubicle/issues].
164
-
165
- == Links
166
-
167
- == Todo
168
- * Support for emitting child / descendant documents
169
- * Work with native Date type, instead of just iso strings
170
- * Hirb support
171
- * Member format strings
172
- * Auto gen of a cubicle definition based on existing keys/key types in the MongoMapper plugin
173
- * DSL support for topcount and bottomcount queries
174
- * Support for 'duration' aggregation that will calculated durations between timestamps
1
+ == Overview
2
+ Cubicle is a Ruby library and DSL for automating the generation, execution and caching of common aggregations of MongoDB documents. Cubicle was born from the need to easily extract simple, processed statistical views of raw, real time business data being collected from a variety of systems.
3
+
4
+ == Motivation
5
+ Aggregating data in MongoDB, unlike relational or multidimensional (OLAP) database, requires writing custom reduce functions in JavaScript for the simplest cases and full map reduce functions in the more complex cases, even for common aggregations like sums or averages.
6
+
7
+ While writing such map reduce functions isn't particularly difficult it can be tedious and error prone and requires switching from Ruby to JavaScript. Cubicle presents a simplified Ruby DSL for generating the JavaScript required for most common aggregation tasks and also handles processing, caching and presenting the results. JavaScript is still required in some cases, but is limited to constructing simple data transformation expressions.
8
+
9
+
10
+ == Approach
11
+ Cubicle breaks the task of defining and executing aggregation queries into two pieces. The first is the Cubicle, an analysis friendly 'view' of the underlying collection which defines both the attributes that will be used for grouping (dimensions) , the numerical fields that will be aggregated (measures), and kind of aggregation will be applied to each measure. The second piece of the Cubicle puzzle is a Query which specifies which particular dimensions or measures will be selected from the Cubicle for a given data request, along with how the resulting data will be filtered, ordered, paginated and organized.
12
+
13
+ == Install
14
+
15
+ Install the gem with:
16
+
17
+ gem install cubicle
18
+ or
19
+ sudo gem install cubicle
20
+
21
+
22
+ == An Example
23
+ Given a document with the following structure (I'm using MongoMapper here as the ORM, but MongoMapper, or any other ORM, is not required by Cubicle, it works directly with the Mongo-Ruby Driver)
24
+
25
+ class PokerHand
26
+ include MongoMapper::Document
27
+
28
+ key :match_date, String #we use iso8601 strings for dates, but Time works too
29
+ key :table, String
30
+ key :winner, Person # {:person=>{:name=>'Jim', :address=>{...}...}}
31
+ key :winning_hand, Symbol #:two_of_a_kind, :full_house, etc...
32
+ key :amount_won, Float
33
+ end
34
+
35
+ == The Aggregation
36
+ here's how a Cubicle designed to analyze these poker hands might look:
37
+
38
+ class PokerHandCubicle
39
+ extend Cubicle::Aggregation
40
+
41
+ date :date, :field_name=>'match_date'
42
+ dimension :month, :expression=>'this.match_date.substring(0,7)'
43
+ dimension :year, :expression=>'this.match_date.substring(0,4)'
44
+
45
+ dimensions :table,
46
+ :winning_hand
47
+ dimension :winner, :field_name=>'winner.name'
48
+
49
+ count :total_hands, :expression=>'true'
50
+ count :total_draws, :expression=>'this.winning_hand=="draw"'
51
+ sum :total_winnings, :field_name=>'amount_won'
52
+ avg :avg_winnings, :field_name=>'amount_won'
53
+
54
+ ratio :draw_pct, :total_draws, :total_hands
55
+ end
56
+
57
+ == The Queries
58
+ The Queries
59
+ And here's how you would use this cubicle to query the underlying data:
60
+
61
+ aggregated_data = PokerHandCubicle.query
62
+
63
+
64
+ Issuing an empty query to the cubicle like the one above will return a list of measures aggregated according to type for each combination of dimensions. However, once a Cubicle has been defined, you can query it in many different ways. For instance if you wanted to see the total number of hands by type, you could do this:
65
+
66
+ hands_by_type = PokerHandCubicle.query { select :winning_hand, :total_hands }
67
+
68
+ Or, if you wanted to see the total amount won with a full house, by player, sorted by amount won, you could do this:
69
+
70
+ full_houses_by_player = PokerHandCubicle.query do
71
+ select :winner
72
+ where :winning_hand=>'full_house'
73
+ order_by :total_winnings
74
+ end
75
+
76
+ Cubicle can return your data in a hierarchy (tree) too, if you want. If you wanted to see the percent of hands resulting in a draw by table by day, you could do this:
77
+
78
+ draw_pct_by_player_by_day = PokerHandCubicle.query do
79
+ select :draw_pct
80
+ by :date, :table
81
+ end
82
+
83
+ In addition to the basic query primitives such as select, where, by and order_by, Cubicle has a basic understanding of time, so as long as you have a dimension in your cubicle defined using 'date', and that dimension is either an iso8601 string or an instance of Time, then you can easily perform some handy date filtering in the DSL:
84
+
85
+ winnings_last_30_days_by_player = PokerHandCubicle.query do
86
+ select :winner, :total_winnings
87
+ for_the_last 30.days
88
+ end
89
+
90
+ or
91
+
92
+ winnings_ytd_by_player = PokerHandCubicle.query do
93
+ select :winner, :all_measures
94
+ year_to_date
95
+ order_by [:total_winnings, :desc]
96
+ end
97
+
98
+ == Durations
99
+ In addition to the basic aggregations, Cubicle can also calculate durations based on timestamps. Currently, Cubicle is limited to calculating durations for data types actually stored as times (i.e. it won't automatically emit JavaScript to parse string or iso8601 representations of time), but this will change in the future. Cubicle can calculate average or total durations between timestamps, in seconds, minutes, hours or days. Durations can also be given conditions (JavaScript expressions) that filter which documents are included in the duration calculation. By default, Cubicle will calculate an average of the duration between timestamps. To have Cubicle calculate a sum instead, use 'total_duration'. If you are the type who likes everything as explicit as possible, 'duration' is aliased as 'average_duration':
100
+ class SomeBusinessProcess
101
+ extend Cubicle::Aggregation
102
+
103
+ dimension :some_dimension
104
+
105
+ average :some_measure
106
+ duration :timestamp1 => :timestamp2
107
+ duration :timestamp2 => :timestamp3
108
+ average_duration :timestamp1 => :timestamp3, :in=>:days
109
+ total_duration :happy_times, :timestamp1 => :timestamp3, :condition=>"this.mood == 'happy'"
110
+ end
111
+
112
+ == The Results
113
+ Cubicle data is returned as either an array of hashes, for a two dimensional query, or a hash-based tree the leaves of which are arrays of hashes for hierarchical data (via queries using the 'by' keyword)
114
+
115
+ Flat data:
116
+ [{:dimension1=>'d1', :dimension2=>'d1', :measure1=>'1.0'},{:dimension1=>'d2'...
117
+
118
+ Hierarchical data 2 levels deep:
119
+ {'dimension 1'=>{'dimension2'=>[{:measures1=>'1.0'}],'dimension2b'=>[{measure1=>'2.0'}],...
120
+
121
+ When you request two dimensional data (i.e. you do not use the 'by' keyword) you can transform your two dimensional data set into hierarchical data at any time using the 'hierarchize' method, specifying the dimensions you want to use in your hierarchy:
122
+
123
+ data = MyCubicle.query {select :date, :name, :all_measures}
124
+ hierarchized_data = data.hierarchize :name, :date
125
+
126
+ This will result in a hash containing each unique value for :name in your source collection, and for each unique :name, a hash containing each unique :date with that :name, and for each :date, an array of hashes keyed by the measures in your Cubicle.
127
+
128
+ == Caching & Processing
129
+ Map reduce operations, especially over large or very large data sets, can take time to complete. Sometimes a long time. However, very often what you want to do is present a graph or a table of numbers to an interactive user on your website, and you probably don't want to make them wait for all your bazillion rows of raw data to be reduced down to the handful of numbers they are actually interested in seeing. For this reason, Cubicle has two modes of operation, the normal default mode in which aggregations are automatically cached until YourCubicle.expire! Or YourCubicle.process is called, and transient mode, which bypasses the caching mechanisms and executes real time queries against the raw source data.
130
+
131
+ == Preprocessed Aggregations
132
+ The expected normal mode of operation, however, is cached mode. While far from anything actually resembling an OLAP cube, Cubicle was designed to process data on some periodic schedule and provide quick access to stored, aggregated data in between each processing, much like a real OLAP cube. Also reminiscent of an OLAP cube, Cubicle will cache aggregations at various levels of resolution, depending on the aggregations that were set up when defining a cubicle and depending on what queries are executed. For example, if a given Cubicle has three dimensions, Name, City and Date, when the Cubicle is processed, it will calculate aggregated measures for each combination of values on those three fields. If a query is executed that only requires Name and Date, then Cubicle will aggregate and cache measures by just Name and Date. If a third query asks for just Name, then Cubicle will create an aggregation based just on Name, but rather than using the original data source with its many rows, it will execute its map reduce against the previously cached Name-Date aggregation, which by definition will have fewer rows and should therefore perform faster. If you are aware ahead of time of the aggregations your queries will need, you can specify them in the Cubicle definition, like this:
133
+ class MyCubicle
134
+ extend Cubicle::Aggregation
135
+ dimension :name
136
+ dimension :date
137
+ ...
138
+ avg :my_measure
139
+ ...
140
+ aggregate :name, :date
141
+ aggregate :name
142
+ aggregate :date
143
+ end
144
+
145
+ When aggregations are specified in this way, then Cubicle will pre-aggregate your data for each of the specified combinations of dimensions whenever MyCubicle.process is called, eliminating the first-hit penalty that would otherwise be incurred when Cubicle encountered a given aggregation for the first time.
146
+
147
+ == Transient (Real Time) Queries
148
+ Sometimes you may not want to query cached data. In our application, we are using Cubicle to provide data for our performance management Key Performance Indicators (KPI's) which consist of both a historical trend of a particular metric as well as the current, real time value of the same metric for, say, the current month or a rolling 30 day period. For performance reasons, we fetch our trend, which is usually 12 months, from cached data but want up to the minute freshness for our real time KPI values, so we need to query the living source data. To accomplish this using Cubicle, you simply insert 'transient!' into your query definition, like so
149
+
150
+ MyCubicle.query do
151
+ transient!
152
+ select :this, :that, :the_other
153
+ end
154
+
155
+ This will bypass cached aggregations and execute a map reduce query directly against the cubicle source collection.
156
+
157
+ == Mongo Mapper plugin
158
+ If MongoMapper is detected, Cubicle will use its connection to MongoDB. Additionally, Cubicle will install a simple MongoMapper plugin for doing ad-hoc, non-cached aggregations on the fly from a MongoMapper document, like this:
159
+ MyMongoMapperModel.aggregate do
160
+ dimension :my_dimension
161
+ count :measure1
162
+ avg :measure2
163
+ end.query {order_by [:measure2, :desc]; limit 10;}
164
+
165
+ == Limitations
166
+ * Cubicle cannot currently cause child documents to be emitted in the map reduce. This is a pretty big limitation, and will be resolved shortly.
167
+ * Documentation is non-existent. This is being worked on (heard that one before?)
168
+ * Test coverage is OK, but the tests could be better organized
169
+ * Code needs to be modularized a bit, main classes are pretty hairy at the moment
170
+
171
+
172
+ == Credits
173
+
174
+ * Alex Wang, Patrick Gannon for features, fixes & testing
175
+
176
+ == Bugs/Issues
177
+ Please report them {on github}[http://github.com/plasticlizard/cubicle/issues].
178
+
179
+ == Links
180
+
181
+ == Todo
182
+ * Support for emitting child / descendant documents
183
+ * Work with native Date type, instead of just iso strings
184
+ * Hirb support
185
+ * Member format strings
186
+ * Auto gen of a cubicle definition based on existing keys/key types in the MongoMapper plugin
187
+ * DSL support for topcount and bottomcount queries
188
+ * Support for parsing string based times for duration calculations, particularly iso8601 strings
175
189
  * Metadata collection to track when cubicles have been processed, perhaps how big they are, how many aggregations, etc.
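The Durations section added to the README in this version describes average and total durations between two timestamps, in a chosen unit and with an optional filtering condition. As a loose illustration of that arithmetic only, here is a plain Ruby sketch; Cubicle itself generates JavaScript for map reduce, and the `durations`, `average_duration` and `total_duration` helpers below, their keyword arguments and the sample documents are all hypothetical stand-ins rather than the gem's API.

```ruby
# Seconds per unit; the unit names mirror the ones the README lists.
UNIT_SECONDS = { seconds: 1, minutes: 60, hours: 3_600, days: 86_400 }.freeze

# Hypothetical helper, not part of Cubicle: durations between two Time fields,
# in the requested unit, for the documents that pass the optional condition.
def durations(docs, from, to, unit: :seconds, condition: nil)
  selected = condition ? docs.select(&condition) : docs
  # Time - Time yields a Float number of seconds.
  selected.map { |doc| (doc[to] - doc[from]) / UNIT_SECONDS.fetch(unit) }
end

def total_duration(docs, from, to, **opts)
  durations(docs, from, to, **opts).sum
end

def average_duration(docs, from, to, **opts)
  values = durations(docs, from, to, **opts)
  values.empty? ? nil : values.sum / values.size
end

# Made-up documents with real Time values, as the README requires.
start = Time.utc(2010, 3, 1)
docs = [
  { timestamp1: start, timestamp3: start + 2 * 86_400, mood: 'happy' },
  { timestamp1: start, timestamp3: start + 4 * 86_400, mood: 'grumpy' }
]

# Average over all documents, then a total restricted by a condition.
puts average_duration(docs, :timestamp1, :timestamp3, unit: :days)
puts total_duration(docs, :timestamp1, :timestamp3, unit: :days,
                    condition: ->(doc) { doc[:mood] == 'happy' })
```

The condition here is a Ruby lambda purely for the sake of a runnable sketch; in the README's DSL the condition is a JavaScript expression string evaluated against each document.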
data/cubicle.gemspec CHANGED
@@ -5,11 +5,11 @@
5
5
 
6
6
  Gem::Specification.new do |s|
7
7
  s.name = %q{cubicle}
8
- s.version = "0.1.2"
8
+ s.version = "0.1.3"
9
9
 
10
10
  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
11
11
  s.authors = ["Nathan Stults"]
12
- s.date = %q{2010-03-20}
12
+ s.date = %q{2010-03-25}
13
13
  s.description = %q{Cubicle provides a dsl and aggregation caching framework for automating the generation, execution and caching of map reduce queries when using MongoDB in Ruby. Cubicle also includes a MongoMapper plugin for quickly performing ad-hoc, multi-level group-by queries against a MongoMapper model.}
14
14
  s.email = %q{hereiam@sonic.net}
15
15
  s.extra_rdoc_files = [
@@ -25,9 +25,16 @@ Gem::Specification.new do |s|
25
25
  "cubicle.log",
26
26
  "lib/cubicle.rb",
27
27
  "lib/cubicle/aggregation.rb",
28
+ "lib/cubicle/aggregation/ad_hoc.rb",
29
+ "lib/cubicle/aggregation/aggregation_manager.rb",
30
+ "lib/cubicle/aggregation/dsl.rb",
31
+ "lib/cubicle/aggregation/map_reduce_helper.rb",
28
32
  "lib/cubicle/calculated_measure.rb",
29
33
  "lib/cubicle/data.rb",
30
- "lib/cubicle/data_level.rb",
34
+ "lib/cubicle/data/hierarchy.rb",
35
+ "lib/cubicle/data/level.rb",
36
+ "lib/cubicle/data/member.rb",
37
+ "lib/cubicle/data/table.rb",
31
38
  "lib/cubicle/date_time.rb",
32
39
  "lib/cubicle/dimension.rb",
33
40
  "lib/cubicle/duration.rb",
@@ -36,18 +43,24 @@ Gem::Specification.new do |s|
36
43
  "lib/cubicle/member_list.rb",
37
44
  "lib/cubicle/mongo_environment.rb",
38
45
  "lib/cubicle/mongo_mapper/aggregate_plugin.rb",
46
+ "lib/cubicle/ordered_hash_with_indifferent_access.rb",
39
47
  "lib/cubicle/query.rb",
48
+ "lib/cubicle/query/dsl.rb",
49
+ "lib/cubicle/query/dsl/time_intelligence.rb",
40
50
  "lib/cubicle/ratio.rb",
41
51
  "lib/cubicle/support.rb",
42
52
  "lib/cubicle/version.rb",
43
53
  "test/config/database.yml",
54
+ "test/cubicle/aggregation/ad_hoc_test.rb",
44
55
  "test/cubicle/cubicle_aggregation_test.rb",
45
- "test/cubicle/cubicle_data_level_test.rb",
46
- "test/cubicle/cubicle_data_test.rb",
47
56
  "test/cubicle/cubicle_query_test.rb",
48
- "test/cubicle/cubicle_test.rb",
57
+ "test/cubicle/data/data_test.rb",
58
+ "test/cubicle/data/level_test.rb",
59
+ "test/cubicle/data/member_test.rb",
60
+ "test/cubicle/data/table_test.rb",
49
61
  "test/cubicle/duration_test.rb",
50
62
  "test/cubicle/mongo_mapper/aggregate_plugin_test.rb",
63
+ "test/cubicle/ordered_hash_with_indifferent_access_test.rb",
51
64
  "test/cubicles/defect_cubicle.rb",
52
65
  "test/log/test.log",
53
66
  "test/models/defect.rb",
@@ -59,13 +72,16 @@ Gem::Specification.new do |s|
59
72
  s.rubygems_version = %q{1.3.6}
60
73
  s.summary = %q{Pseudo-Multi Dimensional analysis / simplified aggregation for MongoDB in Ruby (NOLAP ;))}
61
74
  s.test_files = [
62
- "test/cubicle/cubicle_aggregation_test.rb",
63
- "test/cubicle/cubicle_data_level_test.rb",
64
- "test/cubicle/cubicle_data_test.rb",
75
+ "test/cubicle/aggregation/ad_hoc_test.rb",
76
+ "test/cubicle/cubicle_aggregation_test.rb",
65
77
  "test/cubicle/cubicle_query_test.rb",
66
- "test/cubicle/cubicle_test.rb",
78
+ "test/cubicle/data/data_test.rb",
79
+ "test/cubicle/data/level_test.rb",
80
+ "test/cubicle/data/member_test.rb",
81
+ "test/cubicle/data/table_test.rb",
67
82
  "test/cubicle/duration_test.rb",
68
83
  "test/cubicle/mongo_mapper/aggregate_plugin_test.rb",
84
+ "test/cubicle/ordered_hash_with_indifferent_access_test.rb",
69
85
  "test/cubicles/defect_cubicle.rb",
70
86
  "test/models/defect.rb",
71
87
  "test/test_helper.rb",