ruby-druid 0.1.9 → 0.9.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: d0e676ac15efc4d9777fa182d8c8855bc1ebf386
4
- data.tar.gz: e54d2ec80e3bb5efdf02d3b5b127c565c5cfce86
3
+ metadata.gz: 804e71f527f98f4bc082a2967da5ce7065fa2c01
4
+ data.tar.gz: ada587e143c21e913df7386ebb3020155d82ff5d
5
5
  SHA512:
6
- metadata.gz: 884a621b0fb481490d114248bfff0207bc2fd54e889832cf86318c6d26fb53c73c535d97d4d832fc9872a4f6b76b0d426a7050190e41639abfad2d728a36dd7a
7
- data.tar.gz: b46976c63af7169c6581d092522f5ac97098320efe903bd6d5a6e58c9c2792aab34d0e17c474de174fc44e74d91a0fc6594a89d39beaf3c8f7fabe3f432af21e
6
+ metadata.gz: be6eb7e1e566340d22c70b7bc693e3849260b9994886249b501d90db10fcbfac1b0074928e7e5118cc704b5b8eaeda5c02e8e24b3a6b05b0a03e31137090929e
7
+ data.tar.gz: 08d4a836a5ebc98cbd8c923a1ab718b9c67cceaa66be5e2ab218e85c2928564f47ed05ad2648a977b079f1691c96c35e273bf70494a5c50d9944636218e668fd
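The digests recorded in `checksums.yaml` can be recomputed locally with Ruby's standard `digest` library. The following is a minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted from the `.gem` archive; the helper names `digests_for` and `checksums_match?` are illustrative and not part of RubyGems.

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Compute the two digests that checksums.yaml records for one archive.
# `bytes` is the raw content of metadata.gz or data.tar.gz.
def digests_for(bytes)
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Returns true only if both recomputed digests equal the expected entry.
def checksums_match?(bytes, expected)
  digests_for(bytes) == expected
end
```

For example, `checksums_match?(File.binread('data.tar.gz'), 'SHA1' => '...', 'SHA512' => '...')` with the 0.9.0 values listed above should return `true` for an untampered archive.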
data/LICENSE CHANGED
@@ -1,20 +1,22 @@
1
- Copyright (c) 2013 madvertise Mobile Advertising GmbH
1
+ Copyright (c) 2016 Ruby Druid Community
2
2
 
3
- Permission is hereby granted, free of charge, to any person obtaining
4
- a copy of this software and associated documentation files (the
5
- "Software"), to deal in the Software without restriction, including
6
- without limitation the rights to use, copy, modify, merge, publish,
7
- distribute, sublicense, and/or sell copies of the Software, and to
8
- permit persons to whom the Software is furnished to do so, subject to
9
- the following conditions:
3
+ Permission is hereby granted, free of charge, to any person
4
+ obtaining a copy of this software and associated documentation
5
+ files (the "Software"), to deal in the Software without
6
+ restriction, including without limitation the rights to use,
7
+ copy, modify, merge, publish, distribute, sublicense, and/or sell
8
+ copies of the Software, and to permit persons to whom the
9
+ Software is furnished to do so, subject to the following
10
+ conditions:
10
11
 
11
- The above copyright notice and this permission notice shall be included
12
- in all copies or substantial portions of the Software.
12
+ The above copyright notice and this permission notice shall be
13
+ included in all copies or substantial portions of the Software.
13
14
 
14
15
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
17
- IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
18
- CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
19
- TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
20
- SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
16
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
17
+ OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
18
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
19
+ HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
20
+ WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
21
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
22
+ OTHER DEALINGS IN THE SOFTWARE.
data/README.md CHANGED
@@ -1,29 +1,31 @@
1
1
  # ruby-druid
2
2
 
3
- A ruby client for [druid](http://druid.io).
4
-
5
- ruby-druid features a [Squeel](https://github.com/ernie/squeel)-like query DSL
6
- and generates a JSON query that can be sent to druid directly. A console for
7
- testing is also provided.
3
+ A Ruby client for [Druid](http://druid.io). Includes a [Squeel](https://github.com/ernie/squeel)-like query DSL and generates a JSON query that can be sent to Druid directly.
8
4
 
9
5
  [![Gem Version](https://badge.fury.io/rb/ruby-druid.png)](http://badge.fury.io/rb/ruby-druid)
10
- [![Build Status](https://travis-ci.org/liquidm/ruby-druid.png)](https://travis-ci.org/liquidm/ruby-druid)
11
- [![Code Climate](https://codeclimate.com/github/liquidm/ruby-druid.png)](https://codeclimate.com/github/liquidm/ruby-druid)
12
- [![Dependency Status](https://gemnasium.com/liquidm/ruby-druid.png)](https://gemnasium.com/liquidm/ruby-druid)
6
+ [![Build Status](https://travis-ci.org/ruby-druid/ruby-druid.png)](https://travis-ci.org/ruby-druid/ruby-druid)
7
+ [![Code Climate](https://codeclimate.com/github/ruby-druid/ruby-druid.png)](https://codeclimate.com/github/ruby-druid/ruby-druid)
8
+ [![Dependency Status](https://gemnasium.com/ruby-druid/ruby-druid.png)](https://gemnasium.com/ruby-druid/ruby-druid)
13
9
 
14
10
  ## Installation
15
11
 
16
12
  Add this line to your application's Gemfile:
17
13
 
18
- gem 'ruby-druid'
14
+ ```
15
+ gem 'ruby-druid'
16
+ ```
19
17
 
20
18
  And then execute:
21
19
 
22
- $ bundle
20
+ ```
21
+ bundle
22
+ ```
23
23
 
24
24
  Or install it yourself as:
25
25
 
26
- $ gem install ruby-druid
26
+ ```
27
+ gem install ruby-druid
28
+ ```
27
29
 
28
30
  ## Usage
29
31
 
@@ -31,8 +33,7 @@ Or install it yourself as:
31
33
  Druid::Client.new('zk1:2181,zk2:2181/druid').query('service/source')
32
34
  ```
33
35
 
34
- returns a query object on which all other methods can be called to create a
35
- full and valid druid query.
36
+ returns a query object on which all other methods can be called to create a full and valid Druid query.
36
37
 
37
38
  A query object can be sent like this:
38
39
 
@@ -42,11 +43,7 @@ query = Druid::Query.new('service/source')
42
43
  client.send(query)
43
44
  ```
44
45
 
45
- The `send` method returns the parsed response from the druid server as an
46
- array. If the response is not empty it contains one `ResponseRow` object for
47
- each row. The timestamp by can be received by a method with the same name
48
- (i.e. `row.timestamp`), all row values by hashlike syntax (i.e.
49
- `row['dimension'])
46
+ The `send` method returns the parsed response from the Druid server as an array. If the response is not empty, it contains one `ResponseRow` object for each row. The timestamp can be read with a method of the same name (e.g. `row.timestamp`), and all row values with hash-like syntax (e.g. `row['dimension']`).
50
47
 
51
48
  An options hash can be passed when creating a `Druid::Client` instance:
52
49
 
@@ -60,7 +57,7 @@ Supported options are:
60
57
 
61
58
  ### GroupBy
62
59
 
63
- A [GroupByQuery](https://github.com/metamx/druid/wiki/GroupByQuery) sets the
60
+ A [GroupByQuery](http://druid.io/docs/latest/querying/groupbyquery.html) sets the
64
61
  dimensions to group the data.
65
62
 
66
63
  `queryType` is set automatically to `groupBy`.
@@ -71,9 +68,7 @@ Druid::Query.new('service/source').group_by([:dimension1, :dimension2])
71
68
 
72
69
  ### TimeSeries
73
70
 
74
- A [TimeSeriesQuery](https://github.com/metamx/druid/wiki/TimeseriesQuery)
75
- returns an array of JSON objects where each object represents a value asked for
76
- by the timeseries query.
71
+ A [TimeSeriesQuery](http://druid.io/docs/latest/querying/timeseriesquery.html) returns an array of JSON objects where each object represents a value asked for by the timeseries query.
77
72
 
78
73
  ```ruby
79
74
  Druid::Query.new('service/source').time_series([:aggregate1, :aggregate2])
@@ -81,10 +76,32 @@ Druid::Query.new('service/source').time_series([:aggregate1, :aggregate2])
81
76
 
82
77
  ### Aggregations
83
78
 
79
+ #### longSum, doubleSum, count, min, max, hyperUnique
80
+
84
81
  ```ruby
85
82
  Druid::Query.new('service/source').long_sum([:aggregate1, :aggregate2])
86
83
  ```
87
84
 
85
+ The following methods add other [aggregations](http://druid.io/docs/latest/querying/aggregations.html) in the same way: `double_sum`, `count`, `min`, `max`, `hyper_unique`.
86
+
87
+ #### cardinality
88
+
89
+ ```ruby
90
+ Druid::Query.new('service/source').cardinality(:aggregate, [:dimension1, :dimension2], <by_row: true | false>)
91
+ ```
92
+
93
+ #### javascript
94
+
95
+ For example, to calculate `sum(log(x)*y) + 10`:
96
+
97
+ ```ruby
98
+ Druid::Query.new('service/source').js_aggregation(:aggregate, [:x, :y],
99
+ aggregate: "function(current, a, b) { return current + (Math.log(a) * b); }",
100
+ combine: "function(partialA, partialB) { return partialA + partialB; }",
101
+ reset: "function() { return 10; }"
102
+ )
103
+ ```
104
+
88
105
  ### Post Aggregations
89
106
 
90
107
  A simple syntax for post aggregations with +,-,/,* can be used like:
@@ -94,8 +111,7 @@ query = Druid::Query.new('service/source').long_sum([:aggregate1, :aggregate2])
94
111
  query.postagg { (aggregate2 + aggregate2).as output_field_name }
95
112
  ```
96
113
 
97
- Required fields for the postaggregation are fetched automatically by the
98
- library.
114
+ Required fields for the postaggregation are fetched automatically by the library.
99
115
 
100
116
  Javascript post aggregations are also supported:
101
117
 
@@ -105,8 +121,7 @@ query.postagg { js('function(aggregate1, aggregate2) { return aggregate1 + aggre
105
121
 
106
122
  ### Query Interval
107
123
 
108
- The interval for the query takes a string with date and time or objects that
109
- provide an `iso8601` method.
124
+ The interval for the query takes a string with date and time or objects that provide an `iso8601` method.
110
125
 
111
126
  ```ruby
112
127
  query = Druid::Query.new('service/source').long_sum(:aggregate1)
@@ -115,16 +130,13 @@ query.interval("2013-01-01T00", Time.now)
115
130
 
116
131
  ### Result Granularity
117
132
 
118
- The granularity can be `:all`, `:none`, `:minute`, `:fifteen_minute`,
119
- `:thirthy_minute`, `:hour` or `:day`.
133
+ The granularity can be `:all`, `:none`, `:minute`, `:fifteen_minute`, `:thirthy_minute`, `:hour` or `:day`.
120
134
 
121
- It can also be a period granularity as described in the [druid
122
- wiki](https://github.com/metamx/druid/wiki/Granularities).
135
+ It can also be a period granularity as described in the [Druid documentation](http://druid.io/docs/latest/querying/granularities.html).
123
136
 
124
137
  The period `'day'` or `:day` will be interpreted as `'P1D'`.
125
138
 
126
- If a period granularity is specifed, the (optional) second parameter is a time
127
- zone. It defaults to the machines local time zone. i.e.
139
+ If a period granularity is specified, the (optional) second parameter is a time zone. It defaults to the machine's local time zone. For example:
128
140
 
129
141
  ```ruby
130
142
  query = Druid::Query.new('service/source').long_sum(:aggregate1)
@@ -138,20 +150,46 @@ query = Druid::Query.new('service/source').long_sum(:aggregate1)
138
150
  query.granularity('P1D', 'Europe/Berlin')
139
151
  ```
140
152
 
141
- ### Having
153
+ ### Having filters
142
154
 
143
155
  ```ruby
144
- Druid::Query.new('service/source').having{metric > 10}
156
+ # equality
157
+ Druid::Query.new('service/source').having { metric == 10 }
145
158
  ```
146
159
 
147
160
  ```ruby
148
- Druid::Query.new('service/source').having{metric < 10}
161
+ # inequality
162
+ Druid::Query.new('service/source').having { metric != 10 }
163
+ ```
164
+
165
+ ```ruby
166
+ # greater, less
167
+ Druid::Query.new('service/source').having { metric > 10 }
168
+ Druid::Query.new('service/source').having { metric < 10 }
169
+ ```
170
+
171
+ #### Compound having filters
172
+
173
+ Having filters can be combined with boolean logic.
174
+
175
+ ```ruby
176
+ # and
177
+ Druid::Query.new('service/source').having { (metric != 1) & (metric2 != 2) }
178
+ ```
179
+
180
+ ```ruby
181
+ # or
182
+ Druid::Query.new('service/source').having { (metric == 1) | (metric2 == 2) }
183
+ ```
184
+
185
+ ```ruby
186
+ # not
187
+ Druid::Query.new('service/source').having { !metric.eq(1) }
149
188
  ```
150
189
 
151
190
  ### Filters
152
191
 
153
- Filters are set by the `filter` method. It takes a block or a hash as
154
- parameter.
192
+ Filters are set with the `filter` method, which takes either a block or a hash as its parameter.
155
193
 
156
194
  Filters can be chained `filter{...}.filter{...}`
157
195
 
@@ -210,7 +248,7 @@ Druid::Query.new('service/source').filter{dimension.in(1,2,3)}
210
248
  ```
211
249
  #### Geographic filter
212
250
 
213
- These filters have to be combined with time_series and do only work when coordinates is a spatial dimension [GeographicQueries](http://druid.io/docs/0.6.73/GeographicQueries.html)
251
+ These filters must be combined with `time_series` and only work when `coordinates` is a spatial dimension (see [GeographicQueries](http://druid.io/docs/latest/development/geo.html)).
214
252
 
215
253
  ```ruby
216
254
  Druid::Query.new('service/source').time_series().long_sum([:aggregate1]).filter{coordinates.in_rec [[50.0,13.0],[54.0,15.0]]}
@@ -230,85 +268,14 @@ Druid::Query.new('service/source').filter{dimension.nin(1,2,3)}
230
268
 
231
269
  #### Hash syntax
232
270
 
233
- Sometimes it can be useful to use a hash syntax for filtering
234
- for example if you already get them from a list or parameter hash.
271
+ Sometimes it is useful to pass filters as a hash, for example when they already come from a list or a parameter hash.
235
272
 
236
273
  ```ruby
237
274
  Druid::Query.new('service/source').filter(dimension: 1, dimension1: 2, dimension2: 3)
238
-
239
- #this is the same as
240
-
275
+ # which is equivalent to
241
276
  Druid::Query.new('service/source').filter{dimension.eq(1) & dimension1.eq(2) & dimension2.eq(3)}
242
277
  ```
243
278
 
244
- ### DRIPL
245
-
246
- ruby-druid now includes a [REPL](https://github.com/cldwalker/ripl):
247
-
248
- ```ruby
249
- $ bin/dripl
250
- >> metrics
251
- [
252
- [0] "actions"
253
- [1] "words"
254
- ]
255
-
256
- >> dimensions
257
- [
258
- [0] "type"
259
- ]
260
-
261
- >> long_sum(:actions)
262
- +---------+
263
- | actions |
264
- +---------+
265
- | 98575 |
266
- +---------+
267
-
268
- >> long_sum(:actions, :words)[-3.days].granularity(:day)
269
- +---------------+---------------+
270
- | actions | words |
271
- +---------------+---------------+
272
- | 2013-12-11T00:00:00.000+01:00 |
273
- +---------------+---------------+
274
- | 537345 | 68974 |
275
- +---------------+---------------+
276
- | 2013-12-12T00:00:00.000+01:00 |
277
- +---------------+---------------+
278
- | 675431 | 49253 |
279
- +---------------+---------------+
280
- | 2013-12-13T00:00:00.000+01:00 |
281
- +---------------+---------------+
282
- | 749034 | 87542 |
283
- +---------------+---------------+
284
-
285
- >> long_sum(:actions, :words)[-3.days].granularity(:day).properties
286
- {
287
- :dataSource => "events",
288
- :granularity => {
289
- :type => "period",
290
- :period => "P1D",
291
- :timeZone => "Europe/Berlin"
292
- },
293
- :intervals => [
294
- [0] "2013-12-11T00:00:00+01:00/2013-12-13T09:41:10+01:00"
295
- ],
296
- :queryType => :groupBy,
297
- :aggregations => [
298
- [0] {
299
- :type => "longSum",
300
- :name => :actions,
301
- :fieldName => :actions
302
- },
303
- [1] {
304
- :type => "longSum",
305
- :name => :words,
306
- :fieldName => :words
307
- }
308
- ]
309
- }
310
- ```
311
-
312
279
  ## Contributing
313
280
 
314
281
  1. Fork it
@@ -1,8 +1,2 @@
1
1
  require 'druid/client'
2
2
  require 'druid/query'
3
- require 'druid/response_row'
4
- require 'druid/zoo_handler'
5
-
6
- module Druid
7
-
8
- end
@@ -0,0 +1,66 @@
1
+ module Druid
2
+ class Aggregation
3
+ include ActiveModel::Model
4
+
5
+ attr_accessor :type
6
+ validates :type, inclusion: { in: %w(count longSum doubleSum min max javascript cardinality hyperUnique) }
7
+
8
+ attr_accessor :name
9
+ validates :name, presence: true
10
+
11
+ class FieldnameValidator < ActiveModel::EachValidator
12
+ TYPES = %w(count longSum doubleSum min max hyperUnique)
13
+ def validate_each(record, attribute, value)
14
+ if TYPES.include?(record.type)
15
+ record.errors.add(attribute, 'may not be blank') if value.blank?
16
+ else
17
+ record.errors.add(attribute, "is not supported by type=#{record.type}") if value
18
+ end
19
+ end
20
+ end
21
+
22
+ attr_accessor :fieldName
23
+ validates :fieldName, fieldname: true
24
+
25
+ class FieldnamesValidator < ActiveModel::EachValidator
26
+ TYPES = %w(javascript cardinality)
27
+ def validate_each(record, attribute, value)
28
+ if TYPES.include?(record.type)
29
+ record.errors.add(attribute, 'must be a list of field names') if !value.is_a?(Array) || value.blank?
30
+ else
31
+ record.errors.add(attribute, "is not supported by type=#{record.type}") if value
32
+ end
33
+ end
34
+ end
35
+
36
+ attr_accessor :fieldNames
37
+ validates :fieldNames, fieldnames: true
38
+
39
+ class FnValidator < ActiveModel::EachValidator
40
+ TYPES = %w(javascript)
41
+ def validate_each(record, attribute, value)
42
+ if TYPES.include?(record.type)
43
+ record.errors.add(attribute, 'may not be blank') if value.blank?
44
+ else
45
+ record.errors.add(attribute, "is not supported by type=#{record.type}") if value
46
+ end
47
+ end
48
+ end
49
+
50
+ attr_accessor :fnAggregate
51
+ validates :fnAggregate, fn: true
52
+
53
+ attr_accessor :fnCombine
54
+ validates :fnCombine, fn: true
55
+
56
+ attr_accessor :fnReset
57
+ validates :fnReset, fn: true
58
+
59
+ attr_accessor :byRow
60
+ validates :byRow, allow_nil: true, inclusion: { in: [true, false] }
61
+
62
+ def as_json(options = {})
63
+ super(options.merge(except: %w(errors validation_context)))
64
+ end
65
+ end
66
+ end
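The three custom validators above make each attribute either required or forbidden depending on `type`. The sketch below restates those rules in plain Ruby without ActiveModel, so the combinations can be checked in isolation. It is an illustration under assumptions, not the gem's implementation: the module name `AggregationRules` and the method `invalid_attributes` are hypothetical, and the `blank?` helper is simplified (ActiveSupport's `blank?` also treats whitespace-only strings as blank).

```ruby
# Plain-Ruby restatement of the validation rules in Druid::Aggregation.
module AggregationRules
  ALL_TYPES        = %w(count longSum doubleSum min max javascript cardinality hyperUnique)
  FIELD_NAME_TYPES = %w(count longSum doubleSum min max hyperUnique)
  FIELD_LIST_TYPES = %w(javascript cardinality)
  FN_TYPES         = %w(javascript)

  module_function

  # Simplified stand-in for ActiveSupport's Object#blank?.
  def blank?(value)
    value.nil? || (value.respond_to?(:empty?) && value.empty?)
  end

  # Returns the attribute names that would fail validation for `attrs`.
  def invalid_attributes(attrs)
    type = attrs[:type]
    bad = []
    bad << :type unless ALL_TYPES.include?(type)
    bad << :name if blank?(attrs[:name])

    # fieldName: required for single-field types, forbidden otherwise.
    if FIELD_NAME_TYPES.include?(type)
      bad << :fieldName if blank?(attrs[:fieldName])
    elsif attrs[:fieldName]
      bad << :fieldName
    end

    # fieldNames: must be a non-empty Array for javascript/cardinality.
    if FIELD_LIST_TYPES.include?(type)
      names = attrs[:fieldNames]
      bad << :fieldNames if !names.is_a?(Array) || names.empty?
    elsif attrs[:fieldNames]
      bad << :fieldNames
    end

    # fnAggregate/fnCombine/fnReset: required only for javascript.
    [:fnAggregate, :fnCombine, :fnReset].each do |fn|
      if FN_TYPES.include?(type)
        bad << fn if blank?(attrs[fn])
      elsif attrs[fn]
        bad << fn
      end
    end

    bad << :byRow unless [nil, true, false].include?(attrs[:byRow])
    bad
  end
end
```

Under these rules a `cardinality` aggregation that is given a `fieldName` instead of `fieldNames` fails on both attributes, while a `longSum` with `name` and `fieldName` passes. Note that `count` is listed among the types that require `fieldName`.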
@@ -1,95 +1,23 @@
1
+ require 'druid/zk'
2
+ require 'druid/data_source'
3
+
1
4
  module Druid
2
5
  class Client
3
6
 
4
- def initialize(zookeeper_uri, opts = nil)
5
- opts ||= {}
6
-
7
- if opts[:static_setup] && !opts[:fallback]
8
- @static = opts[:static_setup]
9
- else
10
- @backup = opts[:static_setup] if opts[:fallback]
11
- zookeeper_caching_management!(zookeeper_uri, opts)
12
- end
13
-
14
- @http_timeout = opts[:http_timeout] || 2 * 60
15
- end
16
-
17
- def send(query)
18
- uri = data_source_uri(query.source)
19
- raise "data source #{query.source} (currently) not available" unless uri
20
-
21
- req = Net::HTTP::Post.new(uri.path, {'Content-Type' =>'application/json'})
22
- req.body = query.to_json
7
+ attr_reader :zk
23
8
 
24
- response = Net::HTTP.new(uri.host, uri.port).start do |http|
25
- http.read_timeout = @http_timeout
26
- http.request(req)
27
- end
28
-
29
- if response.code == "200"
30
- JSON.parse(response.body).map{ |row| ResponseRow.new(row) }
31
- else
32
- raise "Request failed: #{response.code}: #{response.body}"
33
- end
34
- end
35
-
36
- def query(id, &block)
37
- uri = data_source_uri(id)
38
- raise "data source #{id} (currently) not available" unless uri
39
- query = Query.new(id, self)
40
- return query unless block
41
-
42
- send query
9
+ def initialize(zookeeper, opts = {})
10
+ @zk = ZK.new(zookeeper, opts)
43
11
  end
44
12
 
45
- def zookeeper_caching_management!(zookeeper_uri, opts)
46
- @zk = ZooHandler.new(zookeeper_uri, opts)
47
-
48
- unless opts[:zk_keepalive]
49
- @cached_data_sources = @zk.data_sources unless @zk.nil?
50
-
51
- @zk.close!
52
- end
53
- end
54
-
55
- def ds
56
- @cached_data_sources || (@zk.data_sources unless @zk.nil?)
13
+ def data_source(source)
14
+ uri = @zk.data_sources[source]
15
+ Druid::DataSource.new(source, uri)
57
16
  end
58
17
 
59
18
  def data_sources
60
- (ds.nil? ? @static : ds).keys
19
+ @zk.data_sources
61
20
  end
62
21
 
63
- def data_source_uri(source)
64
- uri = (ds.nil? ? @static : ds)[source]
65
- begin
66
- return URI(uri) if uri
67
- rescue
68
- return URI(@backup) if @backup
69
- end
70
- end
71
-
72
- def data_source(source)
73
- uri = data_source_uri(source)
74
- raise "data source #{source} (currently) not available" unless uri
75
-
76
- meta_path = "#{uri.path}datasources/#{source.split('/').last}"
77
-
78
- req = Net::HTTP::Get.new(meta_path)
79
-
80
- response = Net::HTTP.new(uri.host, uri.port).start do |http|
81
- http.read_timeout = @http_timeout
82
- http.request(req)
83
- end
84
-
85
- if response.code == "200"
86
- meta = JSON.parse(response.body)
87
- meta.define_singleton_method(:dimensions) { self['dimensions'] }
88
- meta.define_singleton_method(:metrics) { self['metrics'] }
89
- meta
90
- else
91
- raise "Request failed: #{response.code}: #{response.body}"
92
- end
93
- end
94
22
  end
95
23
  end
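After this change the client no longer sends queries itself: it looks up a URI in the ZooKeeper-backed registry and hands it to a data source object. The sketch below mirrors that shape with stand-in classes so the flow can be run without ZooKeeper. `FakeZK`, `FakeDataSource`, `SketchClient` and the broker URL are all hypothetical; the real `Druid::ZK` and `Druid::DataSource` are not shown in this diff, and the registry is assumed to behave like a Hash.

```ruby
# Stand-ins for Druid::ZK and Druid::DataSource (assumptions, see above).
FakeZK = Struct.new(:data_sources)
FakeDataSource = Struct.new(:name, :uri)

# Same structure as the new Druid::Client, with the registry injected.
class SketchClient
  attr_reader :zk

  def initialize(zk)
    @zk = zk
  end

  # Look up the URI for `source` and wrap it in a data source object.
  def data_source(source)
    FakeDataSource.new(source, @zk.data_sources[source])
  end

  # Expose the whole registry instead of only its keys.
  def data_sources
    @zk.data_sources
  end
end

registry = { 'service/source' => 'http://broker.example:8080/druid/v2/' }
client = SketchClient.new(FakeZK.new(registry))
ds = client.data_source('service/source')
```

One consequence visible in this sketch: looking up a source that is missing from the registry no longer raises in the client; it produces a data source object whose URI is `nil`.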