sleek 0.0.1 → 0.0.2

Files changed (38)
  1. data/README.md +95 -20
  2. data/lib/sleek.rb +4 -2
  3. data/lib/sleek/core_ext/range.rb +6 -7
  4. data/lib/sleek/event.rb +4 -1
  5. data/lib/sleek/filter.rb +11 -0
  6. data/lib/sleek/group_by_criteria.rb +144 -0
  7. data/lib/sleek/interval.rb +21 -20
  8. data/lib/sleek/{base.rb → namespace.rb} +13 -8
  9. data/lib/sleek/queries.rb +0 -1
  10. data/lib/sleek/queries/average.rb +1 -1
  11. data/lib/sleek/queries/count_unique.rb +1 -1
  12. data/lib/sleek/queries/maximum.rb +1 -1
  13. data/lib/sleek/queries/minimum.rb +1 -1
  14. data/lib/sleek/queries/query.rb +70 -37
  15. data/lib/sleek/queries/sum.rb +1 -1
  16. data/lib/sleek/query_collection.rb +3 -3
  17. data/lib/sleek/query_command.rb +71 -0
  18. data/lib/sleek/timeframe.rb +45 -52
  19. data/lib/sleek/version.rb +1 -1
  20. data/spec/lib/sleek/event_spec.rb +3 -3
  21. data/spec/lib/sleek/filter_spec.rb +3 -3
  22. data/spec/lib/sleek/group_by_criteria_spec.rb +139 -0
  23. data/spec/lib/sleek/interval_spec.rb +6 -5
  24. data/spec/lib/sleek/{base_spec.rb → namespace_spec.rb} +15 -8
  25. data/spec/lib/sleek/queries/average_spec.rb +1 -1
  26. data/spec/lib/sleek/queries/count_spec.rb +1 -1
  27. data/spec/lib/sleek/queries/count_unique_spec.rb +1 -1
  28. data/spec/lib/sleek/queries/maximum_spec.rb +1 -1
  29. data/spec/lib/sleek/queries/minimum_spec.rb +1 -1
  30. data/spec/lib/sleek/queries/query_spec.rb +58 -84
  31. data/spec/lib/sleek/queries/sum_spec.rb +1 -1
  32. data/spec/lib/sleek/query_collection_spec.rb +11 -9
  33. data/spec/lib/sleek/query_command_spec.rb +171 -0
  34. data/spec/lib/sleek/timeframe_spec.rb +70 -36
  35. data/spec/lib/sleek_spec.rb +2 -2
  36. metadata +11 -8
  37. data/lib/sleek/queries/targetable.rb +0 -13
  38. data/spec/lib/sleek/queries/targetable_spec.rb +0 -29
data/README.md CHANGED
@@ -1,10 +1,12 @@
1
1
  ![Sleek](sleek.png)
2
2
 
3
- [![Build Status](https://travis-ci.org/goshakkk/sleek.png)](https://travis-ci.org/goshakkk/sleek)
3
+ [![Build Status](https://travis-ci.org/goshakkk/sleek.png)](https://travis-ci.org/goshakkk/sleek) [![Code Climate](https://codeclimate.com/github/goshakkk/sleek.png)](https://codeclimate.com/github/goshakkk/sleek)
4
4
 
5
5
  Sleek is a gem for doing analytics. It allows you to easily collect and
6
6
  analyze events that happen in your app.
7
7
 
8
+ **Sleek is a work-in-progress development. Use with caution.**
9
+
8
10
  ## Installation
9
11
 
10
12
  The easiest way to install Sleek is to add it to your Gemfile:
@@ -13,15 +15,27 @@ The easiest way to install Sleek is to add it to your Gemfile:
13
15
  gem "sleek"
14
16
  ```
15
17
 
16
- Then, install it:
18
+ Or, if you want the latest hotness:
17
19
 
20
+ ```ruby
21
+ gem "sleek", github: "goshakkk/sleek"
18
22
  ```
23
+
24
+ Then, install it:
25
+
26
+ ```bash
19
27
  $ bundle install
20
28
  ```
21
29
 
22
30
  Sleek requires MongoDB to work and assumes that you have Mongoid
23
31
  configured already.
24
32
 
33
+ Finally, create needed indexes:
34
+
35
+ ```bash
36
+ $ rake db:mongoid:create_indexes
37
+ ```
38
+
25
39
  ## Getting started
26
40
 
27
41
  ### Namespacing
@@ -98,8 +112,8 @@ call. Using `:interval` also requires that you specify `:timeframe`.
98
112
  ```ruby
99
113
  sleek.queries.count(:purchases, timeframe: :this_2_days, interval: :daily)
100
114
  # => [
101
- # {:timeframe=>#<Sleek::Timeframe 2013-01-01 00:00:00 UTC..2013-01-02 00:00:00 UTC>, :value=>10},
102
- # {:timeframe=>#<Sleek::Timeframe 2013-01-02 00:00:00 UTC..2013-01-03 00:00:00 UTC>, :value=>24}
115
+ # {:timeframe=>2013-01-01 00:00:00 UTC..2013-01-02 00:00:00 UTC, :value=>10},
116
+ # {:timeframe=>2013-01-02 00:00:00 UTC..2013-01-03 00:00:00 UTC, :value=>24}
103
117
  # ]
104
118
  ```
105
119
 
@@ -137,8 +151,8 @@ sleek.queries.count_unique(:purchases, target_property: "customer.id")
137
151
  ### Minimum
138
152
 
139
153
  It finds the minimum numeric value for a given property. All non-numeric
140
- values are ignored. If none of property values are numeric, the
141
- exception will be raised.
154
+ values are ignored. If none of property values are numeric, nil will
155
+ be returned.
142
156
 
143
157
  ```ruby
144
158
  sleek.queries.minimum(:bucket, params)
@@ -154,8 +168,8 @@ sleek.queries.minimum(:purchases, target_property: "total")
154
168
  ### Maximum
155
169
 
156
170
  It finds the maximum numeric value for a given property. All non-numeric
157
- values are ignored. If none of property values are numeric, the
158
- exception will be raised.
171
+ values are ignored. If none of property values are numeric, nil will
172
+ be returned.
159
173
 
160
174
  ```ruby
161
175
  sleek.queries.maximum(:bucket, params)
@@ -172,7 +186,7 @@ sleek.queries.maximum(:purchases, target_property: "total")
172
186
 
173
187
  The average query finds the average value for a given property. All
174
188
  non-numeric values are ignored. If none of property values are numeric,
175
- the exception will be raised.
189
+ nil will be returned.
176
190
 
177
191
  ```ruby
178
192
  sleek.queries.average(:bucket, params)
@@ -189,7 +203,7 @@ sleek.queries.average(:purchases, target_property: "total")
189
203
 
190
204
  The sum query sums all the numeric values for a given property. All
191
205
  non-numeric values are ignored. If none of property values are numeric,
192
- the exception will be raised.
206
+ nil will be returned.
193
207
 
194
208
  ```ruby
195
209
  sleek.queries.sum(:bucket, params)
@@ -202,6 +216,54 @@ sleek.queries.sum(:purchases, target_property: "total")
202
216
  # => 2_072_70
203
217
  ```
204
218
 
219
+ ## Series
220
+
221
+ Series allow you to analyze trends in metrics over time. They break a
222
+ timeframe into intervals and compute the metric for those intervals.
223
+
224
+ Calculating series is simply done by adding the `:timeframe` and
225
+ `:interval` options to the metric query.
226
+
227
+ Valid intervals are:
228
+
229
+ * `:hourly`
230
+ * `:daily`
231
+ * `:weekly`
232
+ * `:monthly`
233
+
234
+ ## Group by
235
+
236
+ In addition to using metrics and series, it is sometimes desired to
237
+ group their outputs by a specific property value.
238
+
239
+ For example, you might be wondering, "How much have we made from each of
240
+ our customers?" Group by will help you answer questions like this.
241
+
242
+ To group metrics or series result by value of some property, all you
243
+ need to do is to pass the `:group_by` option to the query.
244
+
245
+ ```ruby
246
+ sleek.queries.sum(:purchases, target_property: "total", group_by: "customer.email")
247
+ # => {"first@another.com"=>214998, "first@last.com"=>64999}
248
+ ```
249
+
250
+ Or, you may wonder how much did you make from each of your customers for
251
+ every day of this week.
252
+
253
+ ```ruby
254
+ sleek.queries.sum(:purchases, target_property: "total", timeframe: :this_week,
255
+ interval: :daily, group_by: "customer.email")
256
+ ```
257
+
258
+ You can even combine it with filters. For example, how much did you make
259
+ from each of your customers for every day of this week on orders greater
260
+ than $1000?
261
+
262
+ ```ruby
263
+ sleek.queries.sum(:purchases, target_property: "total", filter: ["total", :gte, 1000_00],
264
+ timeframe: :this_week, interval: :daily, group_by: "customer.email")
265
+ ```
266
+
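To make the `:group_by` semantics concrete, here is a minimal plain-Ruby sketch of what a grouped sum computes. It does not use Sleek or MongoDB: the `events` array, `dig_property`, and `sum_grouped` are hypothetical names introduced only for illustration, and the sample totals are chosen to reproduce the README output above. Unlike the README's description of `nil` results, this sketch yields `0` for a group without numeric values.

```ruby
# Hypothetical in-memory events; Sleek would read these from MongoDB.
events = [
  { "customer" => { "email" => "first@last.com" },    "total" => 64_999 },
  { "customer" => { "email" => "first@another.com" }, "total" => 200_000 },
  { "customer" => { "email" => "first@another.com" }, "total" => 14_998 },
  { "customer" => { "email" => "first@another.com" }, "total" => "n/a" }
]

# Resolve a dotted property name such as "customer.email".
def dig_property(event, name)
  event.dig(*name.split("."))
end

# Group events by one property and sum another, ignoring non-numeric values.
def sum_grouped(events, target_property, group_by)
  events
    .group_by { |event| dig_property(event, group_by) }
    .transform_values do |group|
      group.map { |event| dig_property(event, target_property) }
           .select { |value| value.is_a?(Numeric) }
           .sum
    end
end

sum_grouped(events, "total", "customer.email")
# => { "first@last.com" => 64999, "first@another.com" => 214998 }
```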
205
267
  ## Filters
206
268
 
207
269
  To limit the scope of events used in analysis you can use a filter. To
@@ -222,23 +284,36 @@ sleek.queries.count(:purchases, filters: [:total, :gt, 1599])
222
284
  # => 20
223
285
  ```
224
286
 
225
- ## Series
287
+ ### Timeframe & timezone
226
288
 
227
- Series allow you to analyze trends in metrics over time. They break a
228
- timeframe into intervals and compute the metric for those intervals.
289
+ You can pass the `:timeframe` with or without `:timezone` to any query.
229
290
 
230
- Calculating series is simply done by adding the `:timeframe` and
231
- `:interval` options to the metric query.
291
+ Timeframe is used to limit your query by some window of time. You can
292
+ use a range of `TimeWithZone` objects to specify absolute timeframe, or
293
+ you can use a string that describes relative timeframe.
232
294
 
233
- Valid intervals are:
295
+ Relative timeframe string (or a symbol) consists of these parts: category,
296
+ optional number, and interval specification. Possible categories are `this`
297
+ and `previous`, possible intervals are `minute`, `hour`, `day`, `week`,
298
+ `month`.
234
299
 
235
- * `:hourly`
236
- * `:daily`
237
- * `:weekly`
238
- * `:monthly`
300
+ Examples: `this_day`, `previous_3_weeks`.
301
+
302
+ By default, relative times are transformed into ranges of time objects
303
+ in UTC timezone. You can, however, pass the `:timezone` option to tell
304
+ Sleek to construct the window of time in the given timezone.
305
+
306
+ Refer to [`ActiveSupport::TimeZone` docs](http://api.rubyonrails.org/classes/ActiveSupport/TimeZone.html)
307
+ for more details on possible timezone identifiers.
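The relative timeframe format described in this README section (category, optional number, interval) can be illustrated with a small parser. This is only a sketch of the naming scheme, not the code in `timeframe.rb`; the `RELATIVE_TIMEFRAME` pattern and `parse_relative_timeframe` are hypothetical names, and defaulting the number to 1 is an assumption.

```ruby
# Hypothetical pattern: category, optional count, interval (optionally plural).
RELATIVE_TIMEFRAME = /\A(this|previous)_(?:(\d+)_)?(minute|hour|day|week|month)s?\z/

# Split a description such as :this_day or "previous_3_weeks" into its parts.
def parse_relative_timeframe(desc)
  match = RELATIVE_TIMEFRAME.match(desc.to_s)
  raise ArgumentError, "invalid timeframe description: #{desc.inspect}" unless match

  {
    category: match[1].to_sym,
    number:   (match[2] || 1).to_i, # assumed default when no number is given
    interval: match[3].to_sym
  }
end

parse_relative_timeframe(:this_day)
# => { category: :this, number: 1, interval: :day }
parse_relative_timeframe("previous_3_weeks")
# => { category: :previous, number: 3, interval: :week }
```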
239
308
 
240
309
  ## Other
241
310
 
311
+ ### Deleting namespace
312
+
313
+ ```ruby
314
+ sleek.delete!
315
+ ```
316
+
242
317
  ### Deleting buckets
243
318
 
244
319
  ```ruby
data/lib/sleek.rb CHANGED
@@ -7,14 +7,16 @@ require 'sleek/version'
7
7
  require 'sleek/timeframe'
8
8
  require 'sleek/interval'
9
9
  require 'sleek/filter'
10
+ require 'sleek/group_by_criteria'
10
11
  require 'sleek/event'
11
12
  require 'sleek/queries'
13
+ require 'sleek/query_command'
12
14
  require 'sleek/query_collection'
13
- require 'sleek/base'
15
+ require 'sleek/namespace'
14
16
 
15
17
  module Sleek
16
18
  def self.for_namespace(namespace)
17
- Base.new namespace
19
+ Namespace.new namespace
18
20
  end
19
21
 
20
22
  def self.[](namespace)
data/lib/sleek/core_ext/range.rb CHANGED
@@ -5,8 +5,9 @@ class Range
5
5
  end
6
6
 
7
7
  # Public: Convert both ends of range to times.
8
- def to_time_range
9
- Time.at(self.begin)..Time.at(self.end)
8
+ def to_time_range(zone = nil)
9
+ time = zone ? zone : Time
10
+ time.at(self.begin)..time.at(self.end)
10
11
  end
11
12
 
12
13
  def int_range?
@@ -33,12 +34,10 @@ class Range
33
34
  # (1200..1300).previous
34
35
  # # => 1100..1200
35
36
  def previous(n = 1)
36
- new_begin = self.begin - difference * n
37
- new_end = self.end - difference * n
38
- new_begin..new_end
37
+ self - difference * n
39
38
  end
40
39
 
41
- def -(what)
42
- (self.begin - what)..(self.end - what)
40
+ def -(other)
41
+ (self.begin - other)..(self.end - other)
43
42
  end
44
43
  end
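The range changes above are easier to follow outside of a monkey patch. The sketch below restates them as a stand-alone module; `RangeShift` and its method names are hypothetical, and `difference` is assumed to mean the width of the range, since its definition is not part of this hunk.

```ruby
# Stand-alone restatement of the Range helpers; Sleek reopens Range instead.
module RangeShift
  module_function

  # Width of the range (assumed meaning of Sleek's `difference`).
  def difference(range)
    range.end - range.begin
  end

  # Shift both ends back by `amount`, like the `-` operator in the diff.
  def shift_back(range, amount)
    (range.begin - amount)..(range.end - amount)
  end

  # The n-th previous range of the same width.
  def previous(range, n = 1)
    shift_back(range, difference(range) * n)
  end

  # Convert integer epoch seconds to times. `zone` is any object that
  # responds to `at`; plain Time is used when none is given.
  def to_time_range(range, zone = nil)
    time = zone || Time
    time.at(range.begin)..time.at(range.end)
  end
end

RangeShift.previous(1200..1300)    # => 1100..1200
RangeShift.previous(1200..1300, 2) # => 1000..1100
```

Delegating `previous` to the shift operation is the same simplification the diff makes: one place computes the shifted bounds.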
data/lib/sleek/event.rb CHANGED
@@ -19,7 +19,8 @@ module Sleek
19
19
  field :ns, type: Symbol, as: :namespace
20
20
  field :b, type: String, as: :bucket
21
21
  field :d, type: Hash, as: :data
22
- embeds_one :sleek, store_as: "s", class_name: 'Sleek::EventMetadata', cascade_callbacks: true
22
+ embeds_one :sleek, store_as: 's', class_name: 'Sleek::EventMetadata',
23
+ cascade_callbacks: true
23
24
  accepts_nested_attributes_for :sleek
24
25
 
25
26
  validates :namespace, presence: true
@@ -27,6 +28,8 @@ module Sleek
27
28
 
28
29
  after_initialize { build_sleek }
29
30
 
31
+ index ns: 1, b: 1, 's.t' => 1
32
+
30
33
  def self.create_with_namespace(namespace, bucket, payload)
31
34
  sleek = payload.delete(:sleek)
32
35
  event = create(namespace: namespace, bucket: bucket, data: payload)
data/lib/sleek/filter.rb CHANGED
@@ -2,6 +2,12 @@ module Sleek
2
2
  class Filter
3
3
  attr_reader :property_name, :operator, :value
4
4
 
5
+ # Internal: Initialize a filter.
6
+ #
7
+ # property_name - the String name of target property.
8
+ # operator - the Symbol operator name.
9
+ # value - the value used by operator to compare with the
10
+ # value of target property.
5
11
  def initialize(property_name, operator, value)
6
12
  @property_name = "d.#{property_name}"
7
13
  @operator = operator.to_sym
@@ -12,10 +18,15 @@ module Sleek
12
18
  end
13
19
  end
14
20
 
21
+ # Internal: Apply the filter to a criteria.
22
+ #
23
+ # criteria - the Mongoid::Criteria object.
15
24
  def apply(criteria)
16
25
  criteria.send(operator, property_name => value)
17
26
  end
18
27
 
28
+ # Internal: Compare the filter with another. Filters are equal when
29
+ # property name, operator name, and value are equal.
19
30
  def ==(other)
20
31
  other.is_a?(Filter) && property_name == other.property_name &&
21
32
  operator == other.operator && value == other.value
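How `apply` dispatches to the criteria can be shown without Mongoid. In this sketch, `RecordingCriteria` is a hypothetical stand-in for `Mongoid::Criteria` that only records calls, and `SketchFilter` mirrors the constructor and `apply` from the diff while omitting the operator validation that sits in the elided lines.

```ruby
# Stand-in for Mongoid::Criteria: records each call instead of querying.
class RecordingCriteria
  attr_reader :calls

  def initialize
    @calls = []
  end

  def gt(conditions)
    @calls << [:gt, conditions]
    self
  end
end

# Reduced copy of the filter shown in the diff (no operator validation).
class SketchFilter
  attr_reader :property_name, :operator, :value

  def initialize(property_name, operator, value)
    @property_name = "d.#{property_name}" # event payload is stored under "d"
    @operator = operator.to_sym
    @value = value
  end

  # Call the operator method on the criteria with { property => value }.
  def apply(criteria)
    criteria.public_send(operator, property_name => value)
  end
end

criteria = SketchFilter.new(:total, :gt, 1599).apply(RecordingCriteria.new)
criteria.calls
# => [[:gt, { "d.total" => 1599 }]]
```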
data/lib/sleek/group_by_criteria.rb ADDED
@@ -0,0 +1,144 @@
1
+ module Sleek
2
+ # Internal: Criteria object for group_by queries.
3
+ # The reason it exists is that it's not possible to group_by result of
4
+ # normal MongoDB queries, so MongoDB's Aggregation Framework has to be
5
+ # used.
6
+ #
7
+ # It provides common aggregates methods that normal criteria objects
8
+ # have: `count`, `distinct`, `sum`, `avg`, `min`, and `max`, but
9
+ # instead of just numbers, they return a hash of group value => number.
10
+ class GroupByCriteria
11
+ attr_reader :criteria, :group_by
12
+
13
+ # Internal: Initialize a group_by criteria.
14
+ #
15
+ # criteria - the Mongoid::Criteria instance, used to match events.
16
+ # group_by - the name of the property to group by. Should be
17
+ # fully-qualified property name (not name of property
18
+ # inside "d".)
19
+ def initialize(criteria, group_by)
20
+ @criteria = criteria
21
+ @group_by = group_by
22
+ end
23
+
24
+ # Internal: Compute all possible aggregates.
25
+ #
26
+ # field - the optional name of the field being aggregated. If
27
+ # none is passed, aggregates will only count events
28
+ # inside each group. If it is passed, min, max, sum,
29
+ # and avg will be also included.
30
+ # count_unique - the boolean flag indicating whether or not
31
+ # counting distinct field values is needed. Off by
32
+ # default, because calculation of distinct values
33
+ # adds two additional pipeline operators and pushes
34
+ # every value to the set, which might make
35
+ # computation slower on large datasets when you do
36
+ # NOT need to count unique values.
37
+ #
38
+ # Examples:
39
+ #
40
+ # gc.aggregates
41
+ # # => [
42
+ # {"_id"=>"customer1", "count"=>2},
43
+ # {"_id"=>"customer2", "count" => 1}
44
+ # ]
45
+ #
46
+ # Returns an array of groups. Each group is a hash with key "_id"
47
+ # being the value of group_by property.
48
+ def aggregates(field = nil, count_unique = false)
49
+ pipeline = aggregates_pipeline(field, count_unique)
50
+ criteria.collection.aggregate(pipeline).to_a
51
+ end
52
+
53
+ # Internal: Run the aggregation on field and only select group value
54
+ # and some property.
55
+ #
56
+ # Examples:
57
+ #
58
+ # gc.aggregates_prop(nil, "count")
59
+ # # => { unique_value_1: 42, unique_value_2: 12 }
60
+ def aggregates_prop(field, prop, count_unique = false)
61
+ aggregates = aggregates(field, count_unique)
62
+ Hash[aggregates.map { |doc| [doc['_id'], doc[prop]] }]
63
+ end
64
+
65
+ def count
66
+ aggregates_prop(nil, 'count')
67
+ end
68
+
69
+ def count_unique(field)
70
+ aggregates_prop(field, 'count_unique', true)
71
+ end
72
+
73
+ def distinct(field)
74
+ OpenStruct.new(count: count_unique(field))
75
+ end
76
+
77
+ def avg(field)
78
+ aggregates_prop(field, 'avg')
79
+ end
80
+
81
+ def max(field)
82
+ aggregates_prop(field, 'max')
83
+ end
84
+
85
+ def min(field)
86
+ aggregates_prop(field, 'min')
87
+ end
88
+
89
+ def sum(field)
90
+ aggregates_prop(field, 'sum')
91
+ end
92
+
93
+ # Internal: Create aggregation pipeline.
94
+ #
95
+ # field - the optional name of the field to aggregate.
96
+ # count_unique - the optional flag indicating whether to
97
+ # count unique values of the field. Off by
98
+ # default. See `aggregates` doc for the rationale.
99
+ def aggregates_pipeline(field = nil, count_unique = false)
100
+ db_group = "$#{group_by}"
101
+ db_field = "$#{field}" if field
102
+
103
+ pipeline = []
104
+
105
+ crit = criteria
106
+
107
+ crit = crit.ne(field => nil) if field
108
+ pipeline << { "$match" => crit.ne(group_by => nil).selector }
109
+
110
+ group_args = { "_id" => db_group, "count" => { "$sum" => 1 } }
111
+
112
+ if field
113
+ group_args.merge!({
114
+ "max" => { "$max" => db_field },
115
+ "min" => { "$min" => db_field },
116
+ "sum" => { "$sum" => db_field },
117
+ "avg" => { "$avg" => db_field }
118
+ })
119
+
120
+ if count_unique
121
+ group_args.merge!({ "unique_set" => { "$addToSet" => db_field } })
122
+ end
123
+ end
124
+
125
+ pipeline << { "$group" => group_args }
126
+
127
+ if count_unique
128
+ pipeline << { "$unwind" => "$unique_set" }
129
+ pipeline << {
130
+ "$group" => {
131
+ "_id" => "$_id",
132
+ "count_unique" => { "$sum" => 1 },
133
+ "count" => { "$first" => "$count" },
134
+ "max" => { "$first" => "$max" },
135
+ "min" => { "$first" => "$min" },
136
+ "avg" => { "$first" => "$avg" }
137
+ }
138
+ }
139
+ end
140
+
141
+ pipeline
142
+ end
143
+ end
144
+ end
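The pipeline that `aggregates_pipeline` assembles can be sketched as a pure function that returns plain hashes, which makes its shape easy to inspect without a database. `group_by_pipeline` is a hypothetical name, the `match` argument stands in for the Mongoid selector, and the `nil` exclusions from the original are left to the caller. In this sketch the second `$group` stage carries values forward with `$`-prefixed field paths, because `$first` applied to a bare string returns that literal string rather than the field.

```ruby
# Build a MongoDB aggregation pipeline for grouped aggregates.
#
# match        - Hash selector for the $match stage (assumed precomputed).
# group_by     - fully-qualified property name to group by, e.g. "d.customer.email".
# field        - optional fully-qualified property to aggregate.
# count_unique - whether to also count distinct values of `field`.
def group_by_pipeline(match, group_by, field = nil, count_unique: false)
  group = { "_id" => "$#{group_by}", "count" => { "$sum" => 1 } }

  if field
    ref = "$#{field}"
    group.merge!(
      "max" => { "$max" => ref },
      "min" => { "$min" => ref },
      "sum" => { "$sum" => ref },
      "avg" => { "$avg" => ref }
    )
    group["unique_set"] = { "$addToSet" => ref } if count_unique
  end

  pipeline = [{ "$match" => match }, { "$group" => group }]

  if field && count_unique
    # Unwind the set of distinct values and regroup to count them,
    # carrying the already computed aggregates along via field paths.
    carried = %w[count max min sum avg].to_h { |k| [k, { "$first" => "$#{k}" }] }
    pipeline << { "$unwind" => "$unique_set" }
    pipeline << {
      "$group" => { "_id" => "$_id", "count_unique" => { "$sum" => 1 } }.merge(carried)
    }
  end

  pipeline
end

pipeline = group_by_pipeline({ "ns" => "default" }, "d.customer.email", "d.total",
                             count_unique: true)
pipeline.map { |stage| stage.keys.first }
# => ["$match", "$group", "$unwind", "$group"]
```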
data/lib/sleek/interval.rb CHANGED
@@ -1,20 +1,5 @@
1
1
  module Sleek
2
2
  class Interval
3
- def self.interval_value(desc)
4
- case desc
5
- when :hourly
6
- 1.hour
7
- when :daily
8
- 1.day
9
- when :weekly
10
- 1.week
11
- when :monthly
12
- 1.month
13
- else
14
- raise ArgumentError, "invalid interval description"
15
- end
16
- end
17
-
18
3
  attr_reader :interval, :timeframe
19
4
 
20
5
  # Internal: Initialize an interval.
@@ -22,7 +7,7 @@ module Sleek
22
7
  # interval_desc - the Symbol description of the interval.
23
8
  # Possible values: :hourly, :daily, :weekly,
24
9
  # :monthly.
25
- # timeframe - the Timeframe object.
10
+ # timeframe - the range of TimeWithZone objects.
26
11
  def initialize(interval_desc, timeframe)
27
12
  @interval = self.class.interval_value(interval_desc)
28
13
  @timeframe = timeframe
@@ -30,12 +15,28 @@ module Sleek
30
15
 
31
16
  # Internal: Split the timeframe into intervals.
32
17
  #
33
- # Returns an Array of Timeframe objects.
18
+ # Returns an Array of time range objects.
34
19
  def timeframes
35
- @timeframes ||= timeframe.to_time_range.to_i_range.each_slice(interval)
20
+ tz = timeframe.first.time_zone
21
+ timeframe.to_i_range.each_slice(interval)
36
22
  .to_a[0..-2]
37
- .map { |tf| (tf.first..(tf.first + interval)).to_time_range }
38
- .map { |tf| Timeframe.new(tf) }
23
+ .map { |tf, _| (tf..(tf + interval)).to_time_range(tz) }
24
+ end
25
+
26
+ # Internal: Convert interval description to numeric value.
27
+ def self.interval_value(desc)
28
+ case desc
29
+ when :hourly
30
+ 1.hour
31
+ when :daily
32
+ 1.day
33
+ when :weekly
34
+ 1.week
35
+ when :monthly
36
+ 1.month
37
+ else
38
+ raise ArgumentError, 'invalid interval description'
39
+ end
39
40
  end
40
41
  end
41
42
  end
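The splitting done by `timeframes` can be sketched with plain `Time` objects. This is not the gem's implementation: `split_timeframe` is a hypothetical helper, the interval is assumed to be a fixed number of seconds, and it steps through start offsets rather than slicing a range of every second. As in the diff, a trailing partial interval is dropped.

```ruby
# Split a range of times into consecutive sub-ranges of `interval` seconds.
# Only whole intervals that fit inside the timeframe are returned.
def split_timeframe(timeframe, interval)
  first = timeframe.begin.to_i
  last  = timeframe.end.to_i

  first.step(last - interval, interval).map do |start|
    Time.at(start).utc..Time.at(start + interval).utc
  end
end

day = 24 * 60 * 60
two_days = Time.utc(2013, 1, 1)..Time.utc(2013, 1, 3)

split_timeframe(two_days, day).size
# => 2
```

With a two-day timeframe and a daily interval this yields two ranges, matching the shape of the series output in the README.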