ruby-druid 0.1.9 → 0.9.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: d0e676ac15efc4d9777fa182d8c8855bc1ebf386
4
- data.tar.gz: e54d2ec80e3bb5efdf02d3b5b127c565c5cfce86
3
+ metadata.gz: 804e71f527f98f4bc082a2967da5ce7065fa2c01
4
+ data.tar.gz: ada587e143c21e913df7386ebb3020155d82ff5d
5
5
  SHA512:
6
- metadata.gz: 884a621b0fb481490d114248bfff0207bc2fd54e889832cf86318c6d26fb53c73c535d97d4d832fc9872a4f6b76b0d426a7050190e41639abfad2d728a36dd7a
7
- data.tar.gz: b46976c63af7169c6581d092522f5ac97098320efe903bd6d5a6e58c9c2792aab34d0e17c474de174fc44e74d91a0fc6594a89d39beaf3c8f7fabe3f432af21e
6
+ metadata.gz: be6eb7e1e566340d22c70b7bc693e3849260b9994886249b501d90db10fcbfac1b0074928e7e5118cc704b5b8eaeda5c02e8e24b3a6b05b0a03e31137090929e
7
+ data.tar.gz: 08d4a836a5ebc98cbd8c923a1ab718b9c67cceaa66be5e2ab218e85c2928564f47ed05ad2648a977b079f1691c96c35e273bf70494a5c50d9944636218e668fd
data/LICENSE CHANGED
@@ -1,20 +1,22 @@
1
- Copyright (c) 2013 madvertise Mobile Advertising GmbH
1
+ Copyright (c) 2016 Ruby Druid Community
2
2
 
3
- Permission is hereby granted, free of charge, to any person obtaining
4
- a copy of this software and associated documentation files (the
5
- "Software"), to deal in the Software without restriction, including
6
- without limitation the rights to use, copy, modify, merge, publish,
7
- distribute, sublicense, and/or sell copies of the Software, and to
8
- permit persons to whom the Software is furnished to do so, subject to
9
- the following conditions:
3
+ Permission is hereby granted, free of charge, to any person
4
+ obtaining a copy of this software and associated documentation
5
+ files (the "Software"), to deal in the Software without
6
+ restriction, including without limitation the rights to use,
7
+ copy, modify, merge, publish, distribute, sublicense, and/or sell
8
+ copies of the Software, and to permit persons to whom the
9
+ Software is furnished to do so, subject to the following
10
+ conditions:
10
11
 
11
- The above copyright notice and this permission notice shall be included
12
- in all copies or substantial portions of the Software.
12
+ The above copyright notice and this permission notice shall be
13
+ included in all copies or substantial portions of the Software.
13
14
 
14
15
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
17
- IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
18
- CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
19
- TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
20
- SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
16
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
17
+ OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
18
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
19
+ HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
20
+ WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
21
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
22
+ OTHER DEALINGS IN THE SOFTWARE.
data/README.md CHANGED
@@ -1,29 +1,31 @@
1
1
  # ruby-druid
2
2
 
3
- A ruby client for [druid](http://druid.io).
4
-
5
- ruby-druid features a [Squeel](https://github.com/ernie/squeel)-like query DSL
6
- and generates a JSON query that can be sent to druid directly. A console for
7
- testing is also provided.
3
+ A Ruby client for [Druid](http://druid.io). Includes a [Squeel](https://github.com/ernie/squeel)-like query DSL and generates a JSON query that can be sent to Druid directly.
8
4
 
9
5
  [![Gem Version](https://badge.fury.io/rb/ruby-druid.png)](http://badge.fury.io/rb/ruby-druid)
10
- [![Build Status](https://travis-ci.org/liquidm/ruby-druid.png)](https://travis-ci.org/liquidm/ruby-druid)
11
- [![Code Climate](https://codeclimate.com/github/liquidm/ruby-druid.png)](https://codeclimate.com/github/liquidm/ruby-druid)
12
- [![Dependency Status](https://gemnasium.com/liquidm/ruby-druid.png)](https://gemnasium.com/liquidm/ruby-druid)
6
+ [![Build Status](https://travis-ci.org/ruby-druid/ruby-druid.png)](https://travis-ci.org/ruby-druid/ruby-druid)
7
+ [![Code Climate](https://codeclimate.com/github/ruby-druid/ruby-druid.png)](https://codeclimate.com/github/ruby-druid/ruby-druid)
8
+ [![Dependency Status](https://gemnasium.com/ruby-druid/ruby-druid.png)](https://gemnasium.com/ruby-druid/ruby-druid)
13
9
 
14
10
  ## Installation
15
11
 
16
12
  Add this line to your application's Gemfile:
17
13
 
18
- gem 'ruby-druid'
14
+ ```
15
+ gem 'ruby-druid'
16
+ ```
19
17
 
20
18
  And then execute:
21
19
 
22
- $ bundle
20
+ ```
21
+ bundle
22
+ ```
23
23
 
24
24
  Or install it yourself as:
25
25
 
26
- $ gem install ruby-druid
26
+ ```
27
+ gem install ruby-druid
28
+ ```
27
29
 
28
30
  ## Usage
29
31
 
@@ -31,8 +33,7 @@ Or install it yourself as:
31
33
  Druid::Client.new('zk1:2181,zk2:2181/druid').query('service/source')
32
34
  ```
33
35
 
34
- returns a query object on which all other methods can be called to create a
35
- full and valid druid query.
36
+ returns a query object on which all other methods can be called to create a full and valid Druid query.
36
37
 
37
38
  A query object can be sent like this:
38
39
 
@@ -42,11 +43,7 @@ query = Druid::Query.new('service/source')
42
43
  client.send(query)
43
44
  ```
44
45
 
45
- The `send` method returns the parsed response from the druid server as an
46
- array. If the response is not empty it contains one `ResponseRow` object for
47
- each row. The timestamp by can be received by a method with the same name
48
- (i.e. `row.timestamp`), all row values by hashlike syntax (i.e.
49
- `row['dimension'])
46
+ The `send` method returns the parsed response from the Druid server as an array. If the response is not empty, it contains one `ResponseRow` object for each row. The timestamp can be retrieved via a method of the same name (e.g. `row.timestamp`), and all row values via hash-like syntax (e.g. `row['dimension']`).
50
47
 
51
48
  An options hash can be passed when creating `Druid::Client` instance:
52
49
 
@@ -60,7 +57,7 @@ Supported options are:
60
57
 
61
58
  ### GroupBy
62
59
 
63
- A [GroupByQuery](https://github.com/metamx/druid/wiki/GroupByQuery) sets the
60
+ A [GroupByQuery](http://druid.io/docs/latest/querying/groupbyquery.html) sets the
64
61
  dimensions to group the data.
65
62
 
66
63
  `queryType` is set automatically to `groupBy`.
@@ -71,9 +68,7 @@ Druid::Query.new('service/source').group_by([:dimension1, :dimension2])
71
68
 
72
69
  ### TimeSeries
73
70
 
74
- A [TimeSeriesQuery](https://github.com/metamx/druid/wiki/TimeseriesQuery)
75
- returns an array of JSON objects where each object represents a value asked for
76
- by the timeseries query.
71
+ A [TimeSeriesQuery](http://druid.io/docs/latest/querying/timeseriesquery.html) returns an array of JSON objects where each object represents a value asked for by the timeseries query.
77
72
 
78
73
  ```ruby
79
74
  Druid::Query.new('service/source').time_series([:aggregate1, :aggregate2])
@@ -81,10 +76,32 @@ Druid::Query.new('service/source').time_series([:aggregate1, :aggregate2])
81
76
 
82
77
  ### Aggregations
83
78
 
79
+ #### longSum, doubleSum, count, min, max, hyperUnique
80
+
84
81
  ```ruby
85
82
  Druid::Query.new('service/source').long_sum([:aggregate1, :aggregate2])
86
83
  ```
87
84
 
85
+ The following methods add other [aggregations](http://druid.io/docs/latest/querying/aggregations.html) in the same way: `double_sum`, `count`, `min`, `max`, `hyper_unique`.
86
+
87
+ #### cardinality
88
+
89
+ ```ruby
90
+ Druid::Query.new('service/source').cardinality(:aggregate, [:dimension1, :dimension2], <by_row: true | false>)
91
+ ```
92
+
93
+ #### javascript
94
+
95
+ For example, to calculate `sum(log(x)*y) + 10`:
96
+
97
+ ```ruby
98
+ Druid::Query.new('service/source').js_aggregation(:aggregate, [:x, :y],
99
+ aggregate: "function(current, a, b) { return current + (Math.log(a) * b); }",
100
+ combine: "function(partialA, partialB) { return partialA + partialB; }",
101
+ reset: "function() { return 10; }"
102
+ )
103
+ ```
104
+
88
105
  ### Post Aggregations
89
106
 
90
107
  A simple syntax for post aggregations with +,-,/,* can be used like:
@@ -94,8 +111,7 @@ query = Druid::Query.new('service/source').long_sum([:aggregate1, :aggregate2])
94
111
  query.postagg { (aggregate2 + aggregate2).as output_field_name }
95
112
  ```
96
113
 
97
- Required fields for the postaggregation are fetched automatically by the
98
- library.
114
+ Required fields for the postaggregation are fetched automatically by the library.
99
115
 
100
116
  Javascript post aggregations are also supported:
101
117
 
@@ -105,8 +121,7 @@ query.postagg { js('function(aggregate1, aggregate2) { return aggregate1 + aggre
105
121
 
106
122
  ### Query Interval
107
123
 
108
- The interval for the query takes a string with date and time or objects that
109
- provide an `iso8601` method.
124
+ The interval for the query takes a string with date and time or objects that provide an `iso8601` method.
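A minimal sketch of how such arguments could be normalized into a Druid interval string of the form `start/end`, assuming strings are passed through unchanged and anything that responds to `iso8601` is converted. `interval_string` is a hypothetical helper for illustration, not part of ruby-druid:

```ruby
require 'time' # provides Time#iso8601 on older Rubies

# Hypothetical helper (not ruby-druid API): join two bounds into "start/end".
# Objects that respond to #iso8601 are converted; everything else is
# stringified as-is.
def interval_string(from, to)
  [from, to].map { |bound|
    bound.respond_to?(:iso8601) ? bound.iso8601 : bound.to_s
  }.join('/')
end

puts interval_string('2013-01-01T00', Time.utc(2013, 1, 2))
# prints 2013-01-01T00/2013-01-02T00:00:00Z
```

Duck-typing on `iso8601` keeps the helper open to `Time`, `DateTime`, and similar objects without naming their classes.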
110
125
 
111
126
  ```ruby
112
127
  query = Druid::Query.new('service/source').long_sum(:aggregate1)
@@ -115,16 +130,13 @@ query.interval("2013-01-01T00", Time.now)
115
130
 
116
131
  ### Result Granularity
117
132
 
118
- The granularity can be `:all`, `:none`, `:minute`, `:fifteen_minute`,
119
- `:thirthy_minute`, `:hour` or `:day`.
133
+ The granularity can be `:all`, `:none`, `:minute`, `:fifteen_minute`, `:thirthy_minute`, `:hour` or `:day`.
120
134
 
121
- It can also be a period granularity as described in the [druid
122
- wiki](https://github.com/metamx/druid/wiki/Granularities).
135
+ It can also be a period granularity as described in the [Druid documentation](http://druid.io/docs/latest/querying/granularities.html).
123
136
 
124
137
  The period `'day'` or `:day` will be interpreted as `'P1D'`.
125
138
 
126
- If a period granularity is specifed, the (optional) second parameter is a time
127
- zone. It defaults to the machines local time zone. i.e.
139
+ If a period granularity is specified, the (optional) second parameter is a time zone, which defaults to the machine's local time zone. For example:
128
140
 
129
141
  ```ruby
130
142
  query = Druid::Query.new('service/source').long_sum(:aggregate1)
@@ -138,20 +150,46 @@ query = Druid::Query.new('service/source').long_sum(:aggregate1)
138
150
  query.granularity('P1D', 'Europe/Berlin')
139
151
  ```
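For orientation, the DRIPL output that appears among the removed README lines of this diff shows a period granularity serialized as a hash with `type`, `period`, and `timeZone` keys. A hedged sketch of that mapping follows; `period_granularity` and `PERIOD_SHORTHANDS` are hypothetical names, and only the `day` to `P1D` shorthand is taken from the README text:

```ruby
# Hypothetical helper, not ruby-druid's implementation.
# Only the 'day' => 'P1D' shorthand is documented in the README; any other
# period string is passed through unchanged.
PERIOD_SHORTHANDS = { 'day' => 'P1D' }.freeze

def period_granularity(period, time_zone)
  period = period.to_s
  {
    type: 'period',
    period: PERIOD_SHORTHANDS.fetch(period, period),
    timeZone: time_zone
  }
end

p period_granularity(:day, 'Europe/Berlin')
```

The sketch takes the time zone explicitly and does not attempt to reproduce the default to the machine's local time zone.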
140
152
 
141
- ### Having
153
+ ### Having filters
142
154
 
143
155
  ```ruby
144
- Druid::Query.new('service/source').having{metric > 10}
156
+ # equality
157
+ Druid::Query.new('service/source').having { metric == 10 }
145
158
  ```
146
159
 
147
160
  ```ruby
148
- Druid::Query.new('service/source').having{metric < 10}
161
+ # inequality
162
+ Druid::Query.new('service/source').having { metric != 10 }
163
+ ```
164
+
165
+ ```ruby
166
+ # greater, less
167
+ Druid::Query.new('service/source').having { metric > 10 }
168
+ Druid::Query.new('service/source').having { metric < 10 }
169
+ ```
170
+
171
+ #### Compound having filters
172
+
173
+ Having filters can be combined with boolean logic.
174
+
175
+ ```ruby
176
+ # and
177
+ Druid::Query.new('service/source').having { (metric != 1) & (metric2 != 2) }
178
+ ```
179
+
180
+ ```ruby
181
+ # or
182
+ Druid::Query.new('service/source').having { (metric == 1) | (metric2 == 2) }
183
+ ```
184
+
185
+ ```ruby
186
+ # not
187
+ Druid::Query.new('service/source').having { !metric.eq(1) }
149
188
  ```
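To illustrate how a block-based DSL like this can turn operators into Druid having specs, here is a small self-contained sketch. The classes `HavingClause`, `HavingMetric`, `HavingBuilder` and the method `having_spec` are hypothetical stand-ins, not ruby-druid's own code. The emitted shapes (`equalTo`, `greaterThan`, `lessThan`, `and`, `or`, `not`) follow Druid's documented having specs; rendering `!=` as `not(equalTo)` is an assumption.

```ruby
# Hypothetical stand-ins for illustration; not ruby-druid's classes.
class HavingClause
  attr_reader :spec

  def initialize(spec)
    @spec = spec
  end

  def &(other)
    HavingClause.new(type: 'and', havingSpecs: [spec, other.spec])
  end

  def |(other)
    HavingClause.new(type: 'or', havingSpecs: [spec, other.spec])
  end

  # Unary negation, so `!clause` wraps the clause in a "not" spec.
  def !
    HavingClause.new(type: 'not', havingSpec: spec)
  end
end

class HavingMetric
  def initialize(name)
    @name = name.to_s
  end

  def ==(value)
    clause('equalTo', value)
  end
  alias_method :eq, :==

  # Assumption: inequality is expressed as not(equalTo).
  def !=(value)
    !(self == value)
  end

  def >(value)
    clause('greaterThan', value)
  end

  def <(value)
    clause('lessThan', value)
  end

  private

  def clause(type, value)
    HavingClause.new(type: type, aggregation: @name, value: value)
  end
end

# Any bare identifier inside the block becomes a metric reference.
class HavingBuilder
  def method_missing(name, *args)
    args.empty? ? HavingMetric.new(name) : super
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

def having_spec(&block)
  HavingBuilder.new.instance_eval(&block).spec
end

p having_spec { (metric > 10) & !metric2.eq(2) }
```

A builder like this cannot see metric names that collide with methods every object already has (for example `hash` or `method`), which is one reason hash-style input can be a useful fallback.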
150
189
 
151
190
  ### Filters
152
191
 
153
- Filters are set by the `filter` method. It takes a block or a hash as
154
- parameter.
192
+ Filters are set by the `filter` method. It takes a block or a hash as parameter.
155
193
 
156
194
  Filters can be chained `filter{...}.filter{...}`
157
195
 
@@ -210,7 +248,7 @@ Druid::Query.new('service/source').filter{dimension.in(1,2,3)}
210
248
  ```
211
249
  #### Geographic filter
212
250
 
213
- These filters have to be combined with time_series and do only work when coordinates is a spatial dimension [GeographicQueries](http://druid.io/docs/0.6.73/GeographicQueries.html)
251
+ These filters have to be combined with `time_series` and only work when `coordinates` is a spatial dimension (see [GeographicQueries](http://druid.io/docs/latest/development/geo.html)).
214
252
 
215
253
  ```ruby
216
254
  Druid::Query.new('service/source').time_series().long_sum([:aggregate1]).filter{coordinates.in_rec [[50.0,13.0],[54.0,15.0]]}
@@ -230,85 +268,14 @@ Druid::Query.new('service/source').filter{dimension.nin(1,2,3)}
230
268
 
231
269
  #### Hash syntax
232
270
 
233
- Sometimes it can be useful to use a hash syntax for filtering
234
- for example if you already get them from a list or parameter hash.
271
+ Sometimes it is useful to use hash syntax for filtering, for example if the filter values already come from a list or a parameter hash.
235
272
 
236
273
  ```ruby
237
274
  Druid::Query.new('service/source').filter{dimension => 1, dimension1 =>2, dimension2 => 3}
238
-
239
- #this is the same as
240
-
275
+ # which is equivalent to
241
276
  Druid::Query.new('service/source').filter{dimension.eq(1) & dimension1.eq(2) & dimension2.eq(3)}
242
277
  ```
243
278
 
244
- ### DRIPL
245
-
246
- ruby-druid now includes a [REPL](https://github.com/cldwalker/ripl):
247
-
248
- ```ruby
249
- $ bin/dripl
250
- >> metrics
251
- [
252
- [0] "actions"
253
- [1] "words"
254
- ]
255
-
256
- >> dimensions
257
- [
258
- [0] "type"
259
- ]
260
-
261
- >> long_sum(:actions)
262
- +---------+
263
- | actions |
264
- +---------+
265
- | 98575 |
266
- +---------+
267
-
268
- >> long_sum(:actions, :words)[-3.days].granularity(:day)
269
- +---------------+---------------+
270
- | actions | words |
271
- +---------------+---------------+
272
- | 2013-12-11T00:00:00.000+01:00 |
273
- +---------------+---------------+
274
- | 537345 | 68974 |
275
- +---------------+---------------+
276
- | 2013-12-12T00:00:00.000+01:00 |
277
- +---------------+---------------+
278
- | 675431 | 49253 |
279
- +---------------+---------------+
280
- | 2013-12-13T00:00:00.000+01:00 |
281
- +---------------+---------------+
282
- | 749034 | 87542 |
283
- +---------------+---------------+
284
-
285
- >> long_sum(:actions, :words)[-3.days].granularity(:day).properties
286
- {
287
- :dataSource => "events",
288
- :granularity => {
289
- :type => "period",
290
- :period => "P1D",
291
- :timeZone => "Europe/Berlin"
292
- },
293
- :intervals => [
294
- [0] "2013-12-11T00:00:00+01:00/2013-12-13T09:41:10+01:00"
295
- ],
296
- :queryType => :groupBy,
297
- :aggregations => [
298
- [0] {
299
- :type => "longSum",
300
- :name => :actions,
301
- :fieldName => :actions
302
- },
303
- [1] {
304
- :type => "longSum",
305
- :name => :words,
306
- :fieldName => :words
307
- }
308
- ]
309
- }
310
- ```
311
-
312
279
  ## Contributing
313
280
 
314
281
  1. Fork it
@@ -1,8 +1,2 @@
1
1
  require 'druid/client'
2
2
  require 'druid/query'
3
- require 'druid/response_row'
4
- require 'druid/zoo_handler'
5
-
6
- module Druid
7
-
8
- end
@@ -0,0 +1,66 @@
1
+ module Druid
2
+ class Aggregation
3
+ include ActiveModel::Model
4
+
5
+ attr_accessor :type
6
+ validates :type, inclusion: { in: %w(count longSum doubleSum min max javascript cardinality hyperUnique) }
7
+
8
+ attr_accessor :name
9
+ validates :name, presence: true
10
+
11
+ class FieldnameValidator < ActiveModel::EachValidator
12
+ TYPES = %w(count longSum doubleSum min max hyperUnique)
13
+ def validate_each(record, attribute, value)
14
+ if TYPES.include?(record.type)
15
+ record.errors.add(attribute, 'may not be blank') if value.blank?
16
+ else
17
+ record.errors.add(attribute, "is not supported by type=#{record.type}") if value
18
+ end
19
+ end
20
+ end
21
+
22
+ attr_accessor :fieldName
23
+ validates :fieldName, fieldname: true
24
+
25
+ class FieldnamesValidator < ActiveModel::EachValidator
26
+ TYPES = %w(javascript cardinality)
27
+ def validate_each(record, attribute, value)
28
+ if TYPES.include?(record.type)
29
+ record.errors.add(attribute, 'must be a list of field names') if !value.is_a?(Array) || value.blank?
30
+ else
31
+ record.errors.add(attribute, "is not supported by type=#{record.type}") if value
32
+ end
33
+ end
34
+ end
35
+
36
+ attr_accessor :fieldNames
37
+ validates :fieldNames, fieldnames: true
38
+
39
+ class FnValidator < ActiveModel::EachValidator
40
+ TYPES = %w(javascript)
41
+ def validate_each(record, attribute, value)
42
+ if TYPES.include?(record.type)
43
+ record.errors.add(attribute, 'may not be blank') if value.blank?
44
+ else
45
+ record.errors.add(attribute, "is not supported by type=#{record.type}") if value
46
+ end
47
+ end
48
+ end
49
+
50
+ attr_accessor :fnAggregate
51
+ validates :fnAggregate, fn: true
52
+
53
+ attr_accessor :fnCombine
54
+ validates :fnCombine, fn: true
55
+
56
+ attr_accessor :fnReset
57
+ validates :fnReset, fn: true
58
+
59
+ attr_accessor :byRow
60
+ validates :byRow, allow_nil: true, inclusion: { in: [true, false] }
61
+
62
+ def as_json(options = {})
63
+ super(options.merge(except: %w(errors validation_context)))
64
+ end
65
+ end
66
+ end
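The validators in the hunk above encode a per-type matrix of which attributes are required or rejected. The following plain-Ruby sketch restates those rules without ActiveModel so they can be read (and run) in isolation. `aggregation_errors` and `blank?` are hypothetical helpers; `blank?` only approximates ActiveSupport's version for `nil` and empty values, and the wording of the `type`, `name`, and `byRow` messages is assumed.

```ruby
# Plain-Ruby restatement of the validation rules; not the gem's code.
SINGLE_FIELD_TYPES = %w(count longSum doubleSum min max hyperUnique).freeze
MULTI_FIELD_TYPES  = %w(javascript cardinality).freeze
ALL_TYPES          = (SINGLE_FIELD_TYPES + MULTI_FIELD_TYPES).freeze

# Approximation of ActiveSupport's blank? (nil or empty only).
def blank?(value)
  value.nil? || (value.respond_to?(:empty?) && value.empty?)
end

# Takes a hash of aggregation attributes, returns a list of error strings.
def aggregation_errors(agg)
  type = agg[:type]
  errors = []
  errors << 'type is not included in the list' unless ALL_TYPES.include?(type)
  errors << 'name must be present' if blank?(agg[:name])

  # fieldName: required for single-field types, rejected otherwise.
  if SINGLE_FIELD_TYPES.include?(type)
    errors << 'fieldName may not be blank' if blank?(agg[:fieldName])
  elsif agg[:fieldName]
    errors << "fieldName is not supported by type=#{type}"
  end

  # fieldNames: a non-empty array for multi-field types, rejected otherwise.
  if MULTI_FIELD_TYPES.include?(type)
    names = agg[:fieldNames]
    errors << 'fieldNames must be a list of field names' if !names.is_a?(Array) || names.empty?
  elsif agg[:fieldNames]
    errors << "fieldNames is not supported by type=#{type}"
  end

  # JavaScript functions: required for javascript, rejected otherwise.
  [:fnAggregate, :fnCombine, :fnReset].each do |fn|
    if type == 'javascript'
      errors << "#{fn} may not be blank" if blank?(agg[fn])
    elsif agg[fn]
      errors << "#{fn} is not supported by type=#{type}"
    end
  end

  errors << 'byRow is not included in the list' unless [nil, true, false].include?(agg[:byRow])
  errors
end

p aggregation_errors(type: 'longSum', name: 'a', fieldName: 'a')
p aggregation_errors(type: 'cardinality', name: 'c', fieldName: 'x')
```

Note that, as written in the hunk, `count` is in the single-field list, so a `count` aggregation without a `fieldName` is reported as invalid.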
@@ -1,95 +1,23 @@
1
+ require 'druid/zk'
2
+ require 'druid/data_source'
3
+
1
4
  module Druid
2
5
  class Client
3
6
 
4
- def initialize(zookeeper_uri, opts = nil)
5
- opts ||= {}
6
-
7
- if opts[:static_setup] && !opts[:fallback]
8
- @static = opts[:static_setup]
9
- else
10
- @backup = opts[:static_setup] if opts[:fallback]
11
- zookeeper_caching_management!(zookeeper_uri, opts)
12
- end
13
-
14
- @http_timeout = opts[:http_timeout] || 2 * 60
15
- end
16
-
17
- def send(query)
18
- uri = data_source_uri(query.source)
19
- raise "data source #{query.source} (currently) not available" unless uri
20
-
21
- req = Net::HTTP::Post.new(uri.path, {'Content-Type' =>'application/json'})
22
- req.body = query.to_json
7
+ attr_reader :zk
23
8
 
24
- response = Net::HTTP.new(uri.host, uri.port).start do |http|
25
- http.read_timeout = @http_timeout
26
- http.request(req)
27
- end
28
-
29
- if response.code == "200"
30
- JSON.parse(response.body).map{ |row| ResponseRow.new(row) }
31
- else
32
- raise "Request failed: #{response.code}: #{response.body}"
33
- end
34
- end
35
-
36
- def query(id, &block)
37
- uri = data_source_uri(id)
38
- raise "data source #{id} (currently) not available" unless uri
39
- query = Query.new(id, self)
40
- return query unless block
41
-
42
- send query
9
+ def initialize(zookeeper, opts = {})
10
+ @zk = ZK.new(zookeeper, opts)
43
11
  end
44
12
 
45
- def zookeeper_caching_management!(zookeeper_uri, opts)
46
- @zk = ZooHandler.new(zookeeper_uri, opts)
47
-
48
- unless opts[:zk_keepalive]
49
- @cached_data_sources = @zk.data_sources unless @zk.nil?
50
-
51
- @zk.close!
52
- end
53
- end
54
-
55
- def ds
56
- @cached_data_sources || (@zk.data_sources unless @zk.nil?)
13
+ def data_source(source)
14
+ uri = @zk.data_sources[source]
15
+ Druid::DataSource.new(source, uri)
57
16
  end
58
17
 
59
18
  def data_sources
60
- (ds.nil? ? @static : ds).keys
19
+ @zk.data_sources
61
20
  end
62
21
 
63
- def data_source_uri(source)
64
- uri = (ds.nil? ? @static : ds)[source]
65
- begin
66
- return URI(uri) if uri
67
- rescue
68
- return URI(@backup) if @backup
69
- end
70
- end
71
-
72
- def data_source(source)
73
- uri = data_source_uri(source)
74
- raise "data source #{source} (currently) not available" unless uri
75
-
76
- meta_path = "#{uri.path}datasources/#{source.split('/').last}"
77
-
78
- req = Net::HTTP::Get.new(meta_path)
79
-
80
- response = Net::HTTP.new(uri.host, uri.port).start do |http|
81
- http.read_timeout = @http_timeout
82
- http.request(req)
83
- end
84
-
85
- if response.code == "200"
86
- meta = JSON.parse(response.body)
87
- meta.define_singleton_method(:dimensions) { self['dimensions'] }
88
- meta.define_singleton_method(:metrics) { self['metrics'] }
89
- meta
90
- else
91
- raise "Request failed: #{response.code}: #{response.body}"
92
- end
93
- end
94
22
  end
95
23
  end
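The rewritten client is now a thin layer over the ZooKeeper registry: lookups delegate to `@zk.data_sources`, and `data_source` wraps the result. A hedged sketch of that delegation with stand-in objects follows. `FakeRegistry`, `FakeDataSource`, and `SketchClient` are hypothetical, the URL is a placeholder, and the real `Druid::ZK` and `Druid::DataSource` are not shown in this diff.

```ruby
# Stand-ins for illustration only; not the gem's classes.
FakeRegistry   = Struct.new(:data_sources)
FakeDataSource = Struct.new(:name, :uri)

class SketchClient
  attr_reader :zk

  def initialize(registry)
    @zk = registry
  end

  # Mirrors the lookup in the hunk above: the URI is fetched from the
  # registry and passed on without an availability check at this level.
  def data_source(source)
    FakeDataSource.new(source, @zk.data_sources[source])
  end

  def data_sources
    @zk.data_sources
  end
end

registry = FakeRegistry.new({ 'service/source' => 'http://example.invalid/druid/v2/' })
client = SketchClient.new(registry)

p client.data_source('service/source').uri
p client.data_source('missing').uri
```

In this hunk, an unknown source yields a data source object with a `nil` URI rather than raising as the 0.1.9 client did; whether `Druid::DataSource` handles that case itself cannot be determined from the lines shown.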