sleek 0.0.1 → 0.0.2

Files changed (38)
  1. data/README.md +95 -20
  2. data/lib/sleek.rb +4 -2
  3. data/lib/sleek/core_ext/range.rb +6 -7
  4. data/lib/sleek/event.rb +4 -1
  5. data/lib/sleek/filter.rb +11 -0
  6. data/lib/sleek/group_by_criteria.rb +144 -0
  7. data/lib/sleek/interval.rb +21 -20
  8. data/lib/sleek/{base.rb → namespace.rb} +13 -8
  9. data/lib/sleek/queries.rb +0 -1
  10. data/lib/sleek/queries/average.rb +1 -1
  11. data/lib/sleek/queries/count_unique.rb +1 -1
  12. data/lib/sleek/queries/maximum.rb +1 -1
  13. data/lib/sleek/queries/minimum.rb +1 -1
  14. data/lib/sleek/queries/query.rb +70 -37
  15. data/lib/sleek/queries/sum.rb +1 -1
  16. data/lib/sleek/query_collection.rb +3 -3
  17. data/lib/sleek/query_command.rb +71 -0
  18. data/lib/sleek/timeframe.rb +45 -52
  19. data/lib/sleek/version.rb +1 -1
  20. data/spec/lib/sleek/event_spec.rb +3 -3
  21. data/spec/lib/sleek/filter_spec.rb +3 -3
  22. data/spec/lib/sleek/group_by_criteria_spec.rb +139 -0
  23. data/spec/lib/sleek/interval_spec.rb +6 -5
  24. data/spec/lib/sleek/{base_spec.rb → namespace_spec.rb} +15 -8
  25. data/spec/lib/sleek/queries/average_spec.rb +1 -1
  26. data/spec/lib/sleek/queries/count_spec.rb +1 -1
  27. data/spec/lib/sleek/queries/count_unique_spec.rb +1 -1
  28. data/spec/lib/sleek/queries/maximum_spec.rb +1 -1
  29. data/spec/lib/sleek/queries/minimum_spec.rb +1 -1
  30. data/spec/lib/sleek/queries/query_spec.rb +58 -84
  31. data/spec/lib/sleek/queries/sum_spec.rb +1 -1
  32. data/spec/lib/sleek/query_collection_spec.rb +11 -9
  33. data/spec/lib/sleek/query_command_spec.rb +171 -0
  34. data/spec/lib/sleek/timeframe_spec.rb +70 -36
  35. data/spec/lib/sleek_spec.rb +2 -2
  36. metadata +11 -8
  37. data/lib/sleek/queries/targetable.rb +0 -13
  38. data/spec/lib/sleek/queries/targetable_spec.rb +0 -29
data/README.md CHANGED
@@ -1,10 +1,12 @@
1
1
  ![Sleek](sleek.png)
2
2
 
3
- [![Build Status](https://travis-ci.org/goshakkk/sleek.png)](https://travis-ci.org/goshakkk/sleek)
3
+ [![Build Status](https://travis-ci.org/goshakkk/sleek.png)](https://travis-ci.org/goshakkk/sleek) [![Code Climate](https://codeclimate.com/github/goshakkk/sleek.png)](https://codeclimate.com/github/goshakkk/sleek)
4
4
 
5
5
  Sleek is a gem for doing analytics. It allows you to easily collect and
6
6
  analyze events that happen in your app.
7
7
 
8
+ **Sleek is a work-in-progress development. Use with caution.**
9
+
8
10
  ## Installation
9
11
 
10
12
  The easiest way to install Sleek is to add it to your Gemfile:
@@ -13,15 +15,27 @@ The easiest way to install Sleek is to add it to your Gemfile:
13
15
  gem "sleek"
14
16
  ```
15
17
 
16
- Then, install it:
18
+ Or, if you want the latest hotness:
17
19
 
20
+ ```ruby
21
+ gem "sleek", github: "goshakkk/sleek"
18
22
  ```
23
+
24
+ Then, install it:
25
+
26
+ ```bash
19
27
  $ bundle install
20
28
  ```
21
29
 
22
30
  Sleek requires MongoDB to work and assumes that you have Mongoid
23
31
  configured already.
24
32
 
33
+ Finally, create needed indexes:
34
+
35
+ ```bash
36
+ $ rake db:mongoid:create_indexes
37
+ ```
38
+
25
39
  ## Getting started
26
40
 
27
41
  ### Namespacing
@@ -98,8 +112,8 @@ call. Using `:interval` also requires that you specify `:timeframe`.
98
112
  ```ruby
99
113
  sleek.queries.count(:purchases, timeframe: :this_2_days, interval: :daily)
100
114
  # => [
101
- # {:timeframe=>#<Sleek::Timeframe 2013-01-01 00:00:00 UTC..2013-01-02 00:00:00 UTC>, :value=>10},
102
- # {:timeframe=>#<Sleek::Timeframe 2013-01-02 00:00:00 UTC..2013-01-03 00:00:00 UTC>, :value=>24}
115
+ # {:timeframe=>2013-01-01 00:00:00 UTC..2013-01-02 00:00:00 UTC, :value=>10},
116
+ # {:timeframe=>2013-01-02 00:00:00 UTC..2013-01-03 00:00:00 UTC, :value=>24}
103
117
  # ]
104
118
  ```
105
119
 
@@ -137,8 +151,8 @@ sleek.queries.count_unique(:purchases, target_property: "customer.id")
137
151
  ### Minimum
138
152
 
139
153
  It finds the minimum numeric value for a given property. All non-numeric
140
- values are ignored. If none of property values are numeric, the
141
- exception will be raised.
154
+ values are ignored. If none of property values are numeric, nil will
155
+ be returned.
142
156
 
143
157
  ```ruby
144
158
  sleek.queries.minimum(:bucket, params)
@@ -154,8 +168,8 @@ sleek.queries.minimum(:purchases, target_property: "total")
154
168
  ### Maximum
155
169
 
156
170
  It finds the maximum numeric value for a given property. All non-numeric
157
- values are ignored. If none of property values are numeric, the
158
- exception will be raised.
171
+ values are ignored. If none of property values are numeric, nil will
172
+ be returned.
159
173
 
160
174
  ```ruby
161
175
  sleek.queries.maximum(:bucket, params)
@@ -172,7 +186,7 @@ sleek.queries.maximum(:purchases, target_property: "total")
172
186
 
173
187
  The average query finds the average value for a given property. All
174
188
  non-numeric values are ignored. If none of property values are numeric,
175
- the exception will be raised.
189
+ nil will be returned.
176
190
 
177
191
  ```ruby
178
192
  sleek.queries.average(:bucket, params)
@@ -189,7 +203,7 @@ sleek.queries.average(:purchases, target_property: "total")
189
203
 
190
204
  The sum query sums all the numeric values for a given property. All
191
205
  non-numeric values are ignored. If none of property values are numeric,
192
- the exception will be raised.
206
+ nil will be returned.
193
207
 
194
208
  ```ruby
195
209
  sleek.queries.sum(:bucket, params)
@@ -202,6 +216,54 @@ sleek.queries.sum(:purchases, target_property: "total")
202
216
  # => 2_072_70
203
217
  ```
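
The nil-on-no-numeric-values behaviour described for minimum, maximum, average, and sum can be illustrated in plain Ruby. This is only a sketch of the documented semantics, not Sleek's implementation (the real queries run against MongoDB); the `sketch_*` helper names are hypothetical.

```ruby
# Keep only numeric values; everything else is ignored, as the README states.
def numeric_values(values)
  values.grep(Numeric)
end

# Each helper returns nil when no numeric value is present.
def sketch_minimum(values)
  numeric_values(values).min
end

def sketch_maximum(values)
  numeric_values(values).max
end

def sketch_sum(values)
  nums = numeric_values(values)
  nums.empty? ? nil : nums.sum
end

def sketch_average(values)
  nums = numeric_values(values)
  nums.empty? ? nil : nums.sum.to_f / nums.size
end

sketch_sum([10, "n/a", 5, nil]) # => 15
sketch_sum(["n/a", nil])        # => nil
```

Callers written against 0.0.1 that rescued an exception here would need a nil check instead.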
204
218
 
219
+ ## Series
220
+
221
+ Series allow you to analyze trends in metrics over time. They break a
222
+ timeframe into intervals and compute the metric for those intervals.
223
+
224
+ Calculating series is simply done by adding the `:timeframe` and
225
+ `:interval` options to the metric query.
226
+
227
+ Valid intervals are:
228
+
229
+ * `:hourly`
230
+ * `:daily`
231
+ * `:weekly`
232
+ * `:monthly`
233
+
234
+ ## Group by
235
+
236
+ In addition to using metrics and series, it is sometimes desired to
237
+ group their outputs by a specific property value.
238
+
239
+ For example, you might be wondering, "How much have we made from each of
240
+ our customers?" Group by will help you answer questions like this.
241
+
242
+ To group metrics or series result by value of some property, all you
243
+ need to do is to pass the `:group_by` option to the query.
244
+
245
+ ```ruby
246
+ sleek.queries.sum(:purchases, target_property: "total", group_by: "customer.email")
247
+ # => {"first@another.com"=>214998, "first@last.com"=>64999}
248
+ ```
249
+
250
+ Or, you may wonder how much did you make from each of your customers for
251
+ every day of this week.
252
+
253
+ ```ruby
254
+ sleek.queries.sum(:purchases, target_property: "total", timeframe: :this_week,
255
+ interval: :daily, group_by: "customer.email")
256
+ ```
257
+
258
+ You can even combine it with filters. For example, how much did you make
259
+ from each of your customers for every day of this week on orders greater
260
+ than $1000?
261
+
262
+ ```ruby
263
+ sleek.queries.sum(:purchases, target_property: "total", filters: ["total", :gte, 1000_00],
264
+ timeframe: :this_week, interval: :daily, group_by: "customer.email")
265
+ ```
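
To make the shape of a grouped result concrete, here is a minimal in-memory sketch of what a grouped sum computes. It assumes events are plain hashes and that dotted property names such as `customer.email` address nested keys; the helpers `dig_property` and `grouped_sum` and the sample data are hypothetical and are not part of Sleek, which performs grouping inside MongoDB.

```ruby
# Resolve a dotted property path such as "customer.email" in a nested hash.
def dig_property(event, path)
  path.split(".").reduce(event) do |node, key|
    node.is_a?(Hash) ? node[key] : nil
  end
end

# Sum the target property per distinct value of the group_by property.
# Events missing the group value or holding a non-numeric target are skipped.
def grouped_sum(events, target_property, group_by)
  events.each_with_object(Hash.new(0)) do |event, sums|
    group = dig_property(event, group_by)
    value = dig_property(event, target_property)
    next if group.nil? || !value.is_a?(Numeric)
    sums[group] += value
  end
end

events = [
  { "customer" => { "email" => "first@another.com" }, "total" => 200_000 },
  { "customer" => { "email" => "first@last.com" },    "total" => 64_999 },
  { "customer" => { "email" => "first@another.com" }, "total" => 14_998 }
]

grouped_sum(events, "total", "customer.email")
# => {"first@another.com"=>214998, "first@last.com"=>64999}
```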
266
+
205
267
  ## Filters
206
268
 
207
269
  To limit the scope of events used in analysis you can use a filter. To
@@ -222,23 +284,36 @@ sleek.queries.count(:purchases, filters: [:total, :gt, 1599])
222
284
  # => 20
223
285
  ```
224
286
 
225
- ## Series
287
+ ### Timeframe & timezone
226
288
 
227
- Series allow you to analyze trends in metrics over time. They break a
228
- timeframe into intervals and compute the metric for those intervals.
289
+ You can pass the `:timeframe` with or without `:timezone` to any query.
229
290
 
230
- Calculating series is simply done by adding the `:timeframe` and
231
- `:interval` options to the metric query.
291
+ Timeframe is used to limit your query by some window of time. You can
292
+ use a range of `TimeWithZone` objects to specify absolute timeframe, or
293
+ you can use a string that describes relative timeframe.
232
294
 
233
- Valid intervals are:
295
+ Relative timeframe string (or a symbol) consists of these parts: category,
296
+ optional number, and interval specification. Possible categories are `this`
297
+ and `previous`, possible intervals are `minute`, `hour`, `day`, `week`,
298
+ `month`.
234
299
 
235
- * `:hourly`
236
- * `:daily`
237
- * `:weekly`
238
- * `:monthly`
300
+ Examples: `this_day`, `previous_3_weeks`.
301
+
302
+ By default, relative times are transformed into ranges of time objects
303
+ in UTC timezone. You can, however, pass the `:timezone` option to tell
304
+ Sleek to construct the window of time in the given timezone.
305
+
306
+ Refer to [`ActiveSupport::TimeZone` docs](http://api.rubyonrails.org/classes/ActiveSupport/TimeZone.html)
307
+ for more details on possible timezone identifiers.
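
The structure of a relative timeframe description (category, optional number, interval) can be sketched with a small parser. This is an illustration only: the regular expression, the `parse_relative_timeframe` name, and the default of 1 when the number is omitted are assumptions for this example, and the parsing in `Sleek::Timeframe` may differ.

```ruby
# category: this | previous, optional number, interval with optional plural "s".
RELATIVE_TIMEFRAME = /\A(this|previous)_(?:(\d+)_)?(minute|hour|day|week|month)s?\z/

# Break a description such as :previous_3_weeks into its parts.
# Raises ArgumentError when the description does not match the pattern.
def parse_relative_timeframe(description)
  match = RELATIVE_TIMEFRAME.match(description.to_s)
  raise ArgumentError, "invalid timeframe description" unless match

  {
    category: match[1].to_sym,
    number:   (match[2] || 1).to_i, # assumed default when no number is given
    interval: match[3].to_sym
  }
end

parse_relative_timeframe(:this_day)
# => {:category=>:this, :number=>1, :interval=>:day}
parse_relative_timeframe("previous_3_weeks")
# => {:category=>:previous, :number=>3, :interval=>:week}
```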
239
308
 
240
309
  ## Other
241
310
 
311
+ ### Deleting namespace
312
+
313
+ ```ruby
314
+ sleek.delete!
315
+ ```
316
+
242
317
  ### Deleting buckets
243
318
 
244
319
  ```ruby
data/lib/sleek.rb CHANGED
@@ -7,14 +7,16 @@ require 'sleek/version'
7
7
  require 'sleek/timeframe'
8
8
  require 'sleek/interval'
9
9
  require 'sleek/filter'
10
+ require 'sleek/group_by_criteria'
10
11
  require 'sleek/event'
11
12
  require 'sleek/queries'
13
+ require 'sleek/query_command'
12
14
  require 'sleek/query_collection'
13
- require 'sleek/base'
15
+ require 'sleek/namespace'
14
16
 
15
17
  module Sleek
16
18
  def self.for_namespace(namespace)
17
- Base.new namespace
19
+ Namespace.new namespace
18
20
  end
19
21
 
20
22
  def self.[](namespace)
data/lib/sleek/core_ext/range.rb CHANGED
@@ -5,8 +5,9 @@ class Range
5
5
  end
6
6
 
7
7
  # Public: Convert both ends of range to times.
8
- def to_time_range
9
- Time.at(self.begin)..Time.at(self.end)
8
+ def to_time_range(zone = nil)
9
+ time = zone ? zone : Time
10
+ time.at(self.begin)..time.at(self.end)
10
11
  end
11
12
 
12
13
  def int_range?
@@ -33,12 +34,10 @@ class Range
33
34
  # (1200..1300).previous
34
35
  # # => 1100..1200
35
36
  def previous(n = 1)
36
- new_begin = self.begin - difference * n
37
- new_end = self.end - difference * n
38
- new_begin..new_end
37
+ self - difference * n
39
38
  end
40
39
 
41
- def -(what)
42
- (self.begin - what)..(self.end - what)
40
+ def -(other)
41
+ (self.begin - other)..(self.end - other)
43
42
  end
44
43
  end
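
For readers who want to try the two `Range` changes without patching core classes, the following standalone sketch mirrors them as ordinary methods. It assumes that `difference` is the range's end minus its begin (that method is not shown in this hunk) and that `zone` is any object responding to `at(seconds)`, as `ActiveSupport::TimeZone` does; the method names here are hypothetical.

```ruby
# Convert an integer range of epoch seconds into a range of time objects.
# Falls back to Time when no zone-like object is given.
def int_range_to_time_range(int_range, zone = nil)
  clock = zone || Time
  clock.at(int_range.begin)..clock.at(int_range.end)
end

# Shift a range back by n times its own width.
def previous_range(range, n = 1)
  width = range.end - range.begin # assumed meaning of `difference`
  (range.begin - width * n)..(range.end - width * n)
end

previous_range(1200..1300)    # => 1100..1200
previous_range(1200..1300, 2) # => 1000..1100
```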
data/lib/sleek/event.rb CHANGED
@@ -19,7 +19,8 @@ module Sleek
19
19
  field :ns, type: Symbol, as: :namespace
20
20
  field :b, type: String, as: :bucket
21
21
  field :d, type: Hash, as: :data
22
- embeds_one :sleek, store_as: "s", class_name: 'Sleek::EventMetadata', cascade_callbacks: true
22
+ embeds_one :sleek, store_as: 's', class_name: 'Sleek::EventMetadata',
23
+ cascade_callbacks: true
23
24
  accepts_nested_attributes_for :sleek
24
25
 
25
26
  validates :namespace, presence: true
@@ -27,6 +28,8 @@ module Sleek
27
28
 
28
29
  after_initialize { build_sleek }
29
30
 
31
+ index ns: 1, b: 1, 's.t' => 1
32
+
30
33
  def self.create_with_namespace(namespace, bucket, payload)
31
34
  sleek = payload.delete(:sleek)
32
35
  event = create(namespace: namespace, bucket: bucket, data: payload)
data/lib/sleek/filter.rb CHANGED
@@ -2,6 +2,12 @@ module Sleek
2
2
  class Filter
3
3
  attr_reader :property_name, :operator, :value
4
4
 
5
+ # Internal: Initialize a filter.
6
+ #
7
+ # property_name - the String name of target property.
8
+ # operator - the Symbol operator name.
9
+ # value - the value used by operator to compare with the
10
+ # value of target property.
5
11
  def initialize(property_name, operator, value)
6
12
  @property_name = "d.#{property_name}"
7
13
  @operator = operator.to_sym
@@ -12,10 +18,15 @@ module Sleek
12
18
  end
13
19
  end
14
20
 
21
+ # Internal: Apply the filter to a criteria.
22
+ #
23
+ # criteria - the Mongoid::Criteria object.
15
24
  def apply(criteria)
16
25
  criteria.send(operator, property_name => value)
17
26
  end
18
27
 
28
+ # Internal: Compare the filter with another. Filters are equal when
29
+ # property name, operator name, and value are equal.
19
30
  def ==(other)
20
31
  other.is_a?(Filter) && property_name == other.property_name &&
21
32
  operator == other.operator && value == other.value
data/lib/sleek/group_by_criteria.rb ADDED
@@ -0,0 +1,144 @@
1
+ module Sleek
2
+ # Internal: Criteria object for group_by queries.
3
+ # The reason it exists is that it's not possible to group_by result of
4
+ # normal MongoDB queries, so MongoDB's Aggregation Framework has to be
5
+ # used.
6
+ #
7
+ # It provides common aggregates methods that normal criteria objects
8
+ # have: `count`, `distinct`, `sum`, `avg`, `min`, and `max`, but
9
+ # instead of just numbers, they return a hash of group value => number.
10
+ class GroupByCriteria
11
+ attr_reader :criteria, :group_by
12
+
13
+ # Internal: Initialize a group_by criteria.
14
+ #
15
+ # criteria - the Mongoid::Criteria instance, used to match events.
16
+ # group_by - the name of the property to group by. Should be
17
+ # fully-qualified property name (not name of property
18
+ # inside "d".)
19
+ def initialize(criteria, group_by)
20
+ @criteria = criteria
21
+ @group_by = group_by
22
+ end
23
+
24
+ # Internal: Compute all possible aggregates.
25
+ #
26
+ # field - the optional name of the field being aggregated. If
27
+ # none is passed, aggregates will only count events
28
+ # inside each group. If it is passed, min, max, sum,
29
+ # and avg will be also included.
30
+ # count_unique - the boolean flag indicating whether or not
31
+ # counting distinct field values is needed. Off by
32
+ # default, because calculation of distinct values
33
+ # adds two additional pipeline operators and pushes
34
+ # every value to the set, which might make
35
+ # computation slower on large datasets when you do
36
+ # NOT need to count unique values.
37
+ #
38
+ # Examples:
39
+ #
40
+ # gc.aggregates
41
+ # # => [
42
+ # {"_id"=>"customer1", "count"=>2},
43
+ # {"_id"=>"customer2", "count"=>1}
44
+ # ]
45
+ #
46
+ # Returns an array of groups. Each group is a hash with key "_id"
47
+ # being the value of group_by property.
48
+ def aggregates(field = nil, count_unique = false)
49
+ pipeline = aggregates_pipeline(field, count_unique)
50
+ criteria.collection.aggregate(pipeline).to_a
51
+ end
52
+
53
+ # Internal: Run the aggregation on field and only select group value
54
+ # and some property.
55
+ #
56
+ # Examples:
57
+ #
58
+ # gc.aggregates_prop(nil, "count")
59
+ # # => { unique_value_1: 42, unique_value_2: 12 }
60
+ def aggregates_prop(field, prop, count_unique = false)
61
+ aggregates = aggregates(field, count_unique)
62
+ Hash[aggregates.map { |doc| [doc['_id'], doc[prop]] }]
63
+ end
64
+
65
+ def count
66
+ aggregates_prop(nil, 'count')
67
+ end
68
+
69
+ def count_unique(field)
70
+ aggregates_prop(field, 'count_unique', true)
71
+ end
72
+
73
+ def distinct(field)
74
+ OpenStruct.new(count: count_unique(field))
75
+ end
76
+
77
+ def avg(field)
78
+ aggregates_prop(field, 'avg')
79
+ end
80
+
81
+ def max(field)
82
+ aggregates_prop(field, 'max')
83
+ end
84
+
85
+ def min(field)
86
+ aggregates_prop(field, 'min')
87
+ end
88
+
89
+ def sum(field)
90
+ aggregates_prop(field, 'sum')
91
+ end
92
+
93
+ # Internal: Create aggregation pipeline.
94
+ #
95
+ # field - the optional name of the field to aggregate.
96
+ # count_unique - the optional flag indicating whether to
97
+ # count unique values of the field. Off by
98
+ # default. See `aggregates` doc for the rationale.
99
+ def aggregates_pipeline(field = nil, count_unique = false)
100
+ db_group = "$#{group_by}"
101
+ db_field = "$#{field}" if field
102
+
103
+ pipeline = []
104
+
105
+ crit = criteria
106
+
107
+ crit = crit.ne(field => nil) if field
108
+ pipeline << { "$match" => crit.ne(group_by => nil).selector }
109
+
110
+ group_args = { "_id" => db_group, "count" => { "$sum" => 1 } }
111
+
112
+ if field
113
+ group_args.merge!({
114
+ "max" => { "$max" => db_field },
115
+ "min" => { "$min" => db_field },
116
+ "sum" => { "$sum" => db_field },
117
+ "avg" => { "$avg" => db_field }
118
+ })
119
+
120
+ if count_unique
121
+ group_args.merge!({ "unique_set" => { "$addToSet" => db_field } })
122
+ end
123
+ end
124
+
125
+ pipeline << { "$group" => group_args }
126
+
127
+ if count_unique
128
+ pipeline << { "$unwind" => "$unique_set" }
129
+ pipeline << {
130
+ "$group" => {
131
+ "_id" => "$_id",
132
+ "count_unique" => { "$sum" => 1 },
133
+ "count" => { "$first" => "$count" },
134
+ "max" => { "$first" => "$max" },
135
+ "min" => { "$first" => "$min" },
136
+ "avg" => { "$first" => "$avg" }
137
+ }
138
+ }
139
+ end
140
+
141
+ pipeline
142
+ end
143
+ end
144
+ end
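
The pipeline construction can be tried without Mongoid by building the same stages from plain hashes. The sketch below is an assumption-laden variant rather than the gem's code: `build_group_pipeline` is a hypothetical name, the `$match` selector is passed in as a ready-made hash, and the second `$group` stage carries each earlier aggregate forward with a `$`-prefixed field path (including `sum`), since MongoDB treats an unprefixed string in an expression as a literal.

```ruby
# Build an aggregation pipeline as plain hashes.
# match        - selector hash for the $match stage (assumed precomputed).
# group_by     - fully-qualified property name to group on.
# field        - optional property to aggregate.
# count_unique - when true (and field is given), also count distinct values.
def build_group_pipeline(match, group_by, field = nil, count_unique = false)
  group = { "_id" => "$#{group_by}", "count" => { "$sum" => 1 } }

  if field
    %w[max min sum avg].each do |op|
      group[op] = { "$#{op}" => "$#{field}" }
    end
    group["unique_set"] = { "$addToSet" => "$#{field}" } if count_unique
  end

  pipeline = [{ "$match" => match }, { "$group" => group }]

  if field && count_unique
    # Carry earlier aggregates through the regrouping via field paths.
    carried = %w[count max min sum avg].map do |name|
      [name, { "$first" => "$#{name}" }]
    end.to_h

    pipeline << { "$unwind" => "$unique_set" }
    pipeline << {
      "$group" => { "_id" => "$_id", "count_unique" => { "$sum" => 1 } }.merge(carried)
    }
  end

  pipeline
end

build_group_pipeline({ "ns" => "default" }, "d.customer.email", "d.total", true)
```

Unwinding the set and regrouping counts one document per distinct value, which is why `count_unique` is the `$sum` of 1 in the final stage.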
data/lib/sleek/interval.rb CHANGED
@@ -1,20 +1,5 @@
1
1
  module Sleek
2
2
  class Interval
3
- def self.interval_value(desc)
4
- case desc
5
- when :hourly
6
- 1.hour
7
- when :daily
8
- 1.day
9
- when :weekly
10
- 1.week
11
- when :monthly
12
- 1.month
13
- else
14
- raise ArgumentError, "invalid interval description"
15
- end
16
- end
17
-
18
3
  attr_reader :interval, :timeframe
19
4
 
20
5
  # Internal: Initialize an interval.
@@ -22,7 +7,7 @@ module Sleek
22
7
  # interval_desc - the Symbol description of the interval.
23
8
  # Possible values: :hourly, :daily, :weekly,
24
9
  # :monthly.
25
- # timeframe - the Timeframe object.
10
+ # timeframe - the range of TimeWithZone objects.
26
11
  def initialize(interval_desc, timeframe)
27
12
  @interval = self.class.interval_value(interval_desc)
28
13
  @timeframe = timeframe
@@ -30,12 +15,28 @@ module Sleek
30
15
 
31
16
  # Internal: Split the timeframe into intervals.
32
17
  #
33
- # Returns an Array of Timeframe objects.
18
+ # Returns an Array of time range objects.
34
19
  def timeframes
35
- @timeframes ||= timeframe.to_time_range.to_i_range.each_slice(interval)
20
+ tz = timeframe.first.time_zone
21
+ timeframe.to_i_range.each_slice(interval)
36
22
  .to_a[0..-2]
37
- .map { |tf| (tf.first..(tf.first + interval)).to_time_range }
38
- .map { |tf| Timeframe.new(tf) }
23
+ .map { |tf, _| (tf..(tf + interval)).to_time_range(tz) }
24
+ end
25
+
26
+ # Internal: Convert interval description to numeric value.
27
+ def self.interval_value(desc)
28
+ case desc
29
+ when :hourly
30
+ 1.hour
31
+ when :daily
32
+ 1.day
33
+ when :weekly
34
+ 1.week
35
+ when :monthly
36
+ 1.month
37
+ else
38
+ raise ArgumentError, 'invalid interval description'
39
+ end
39
40
  end
40
41
  end
41
42
  end