druid_config 0.1.0

data/README.md ADDED

# DruidConfig

DruidConfig is a gem to access information about the status of a Druid cluster. You can check node capacity, the number of segments, tiers... It uses [Zookeeper](https://zookeeper.apache.org/) to get the coordinator and overlord URIs.

To use DruidConfig in your application, add the gem to your Gemfile and require it:

```ruby
require 'druid_config'
```

# Initialization

`Cluster` is the base class to perform queries. To initialize it, pass the Zookeeper URI and an options hash as arguments:

```ruby
cluster = DruidConfig::Cluster.new(zookeeper_uri, options)
```

Available options:

* `discovery_path`: string with the discovery path of Druid inside the Zookeeper directory structure.
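
For example, a connection that sets `discovery_path` might look like the following sketch (the Zookeeper addresses and path are placeholders; use your own cluster's values):

```ruby
require 'druid_config'

# Hypothetical Zookeeper ensemble and discovery path; adjust to your deployment.
cluster = DruidConfig::Cluster.new(
  'zk1.example.com:2181,zk2.example.com:2181',
  discovery_path: '/druid/discovery'
)
```
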
# Usage

Call the methods defined in `DruidConfig::Cluster` to access the data (a short example follows the list below). For more information about the data returned by these methods, check the [Druid documentation](http://druid.io/docs/0.8.1/design/coordinator.html).

* `leader`: leader of the cluster
* `load_status`: load status
* `load_queue`: load queue
* `metadata_datasources`: Hash with metadata of datasources
* `metadata_datasources_segments`: Hash with metadata of segments
* `datasources`: all data sources
* `datasource`: a specific data source
* `rules`: all rules defined in the cluster
* `tiers`: tiers defined in the cluster
* `servers` or `nodes`: all nodes of the cluster
* `physical_servers` or `physical_nodes`: array of URIs of nodes
* `historicals`: historical nodes
* `realtimes`: realtime nodes
* `workers`: worker nodes
* `physical_workers`: array of URIs of worker nodes
* `services`: Hash with physical nodes and the services they are running
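
Building on the `cluster` object initialized above, a minimal sketch of these methods in action (the returned values shown in comments are illustrative):

```ruby
cluster.leader                    # => "coordinator.example.com:8081"
cluster.datasources.map(&:name)   # names of all data sources
cluster.historicals.size          # number of historical nodes
cluster.tiers.map(&:name)         # tiers defined in the cluster
cluster.services                  # => { "druid-host-01" => [:historical], ... }

cluster.close!                    # close the Zookeeper connection when finished
```
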
## Entities

Some methods return instances of an `Entity` class (see the example after this list). These entities provide multiple methods to access their data. Defined entities live inside the `druid_config/entities` folder.

* [DataSource]()
* [Node]()
* [Segment]()
* [Tier]()
* [Worker]()
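
For instance, continuing with the `cluster` object above, these entity classes appear as the return values of the cluster methods:

```ruby
cluster.datasources.first.class   # => DruidConfig::Entities::DataSource
cluster.historicals.first.class   # => DruidConfig::Entities::Node
cluster.tiers.first.class         # => DruidConfig::Entities::Tier
cluster.workers.first.class       # => DruidConfig::Entities::Worker
```
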
## Exceptions

Sometimes the gem cannot reach the Druid API. In that case, it automatically resets the Zookeeper connection and retries the query. If the second attempt also fails, a `DruidApiError` exception is raised, as in the sketch below.
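
A minimal way to handle this error (the exact namespace of `DruidApiError` is assumed here; adjust the constant to wherever the gem defines it):

```ruby
begin
  cluster.datasources
rescue DruidConfig::DruidApiError => e  # namespace assumed; adjust to the gem's definition
  warn "Druid API unreachable: #{e.message}"
  []
end
```
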
# Collaborate

To contribute to DruidConfig:

* Create an issue describing the contribution: bug, enhancement or feature
* Fork the repository and make all the changes you need
* Write tests for your changes
* Create a pull request when you finish

# License

The DruidConfig gem is released under the Affero GPL license. Copyright [redBorder](http://redborder.net)

```ruby
module DruidConfig
  #
  # Class to initialize the connection to Zookeeper
  #
  class Client
    attr_reader :zk, :zookeeper, :opts

    #
    # Initialize the Zookeeper connection
    #
    def initialize(zookeeper, opts = {})
      # Keep the connection arguments so the client can be reset later
      @zookeeper = zookeeper
      @opts = opts
      @zk = ZK.new(zookeeper, opts)
    end

    #
    # Get the URI of a coordinator
    #
    def coordinator
      zk.coordinator
    end

    #
    # Get the URI of an overlord
    #
    def overlord
      zk.overlord
    end

    #
    # Close the client
    #
    def close!
      zk.close!
    end

    #
    # Reset the client
    #
    def reset!
      close!
      @zk = ZK.new(@zookeeper, @opts)
    end
  end
end
```
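
Although `Cluster` creates the client for you, it can also be used on its own. A brief sketch (the Zookeeper address and discovery path are placeholders):

```ruby
client = DruidConfig::Client.new('zk.example.com:2181', discovery_path: '/druid/discovery')
client.coordinator   # URI of the current coordinator
client.overlord      # URI of the current overlord
client.close!
```
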

```ruby
module DruidConfig
  #
  # Class to query the status of a Druid cluster
  #
  class Cluster
    # HTTParty Rocks!
    include HTTParty
    include DruidConfig::Util

    #
    # Initialize the client to perform the queries
    #
    # == Parameters:
    # zk_uri::
    #   String with the URI or URIs (separated by commas) of Zookeeper
    # options::
    #   Hash with options:
    #   - discovery_path: String with the discovery path of Druid
    #
    def initialize(zk_uri, options)
      # Initialize the Client
      DruidConfig.client = DruidConfig::Client.new(zk_uri, options)

      # Used to check the number of retries on error
      @retries = 0

      # Update the base uri to perform queries
      self.class.base_uri(
        "#{DruidConfig.client.coordinator}"\
        "druid/coordinator/#{DruidConfig::Version::API_VERSION}")
    end

    #
    # Close the connection with Zookeeper
    #
    def close!
      DruidConfig.client.close!
    end

    #
    # Reset the client
    #
    def reset!
      DruidConfig.client.reset!
      self.class.base_uri(
        "#{DruidConfig.client.coordinator}"\
        "druid/coordinator/#{DruidConfig::Version::API_VERSION}")
    end

    # ------------------------------------------------------------
    # Queries!
    # ------------------------------------------------------------

    #
    # The following methods map directly to the Druid API. To check
    # their functionality, please go to the Druid documentation:
    #
    # http://druid.io/docs/0.8.1/design/coordinator.html
    #

    # Coordinator
    # -----------------

    #
    # Return the leader of the Druid cluster
    #
    def leader
      secure_query do
        self.class.get('/leader').body
      end
    end

    #
    # Load status of the cluster
    #
    def load_status(params = '')
      secure_query do
        self.class.get("/loadstatus?#{params}")
      end
    end

    #
    # Load queue of the cluster
    #
    def load_queue(params = '')
      secure_query do
        self.class.get("/loadqueue?#{params}")
      end
    end

    # Metadata
    # -----------------

    #
    # Return a Hash with metadata of datasources
    #
    def metadata_datasources(params = '')
      secure_query do
        self.class.get("/metadata/datasources?#{params}")
      end
    end

    alias_method :mt_datasources, :metadata_datasources

    #
    # Return a Hash with metadata of segments
    #
    # == Parameters:
    # data_source::
    #   String with the name of the data source
    # segment::
    #   (Optional) Segment to search
    #
    def metadata_datasources_segments(data_source, segment = '')
      end_point = "/metadata/datasources/#{data_source}/segments"
      secure_query do
        if segment.empty? || segment == 'full'
          self.class.get("#{end_point}?#{segment}")
        else
          self.class.get("#{end_point}/#{segment}")
        end
      end
    end

    alias_method :mt_datasources_segments, :metadata_datasources_segments

    # Data sources
    # -----------------

    #
    # Return all datasources
    #
    # == Returns:
    # Array of initialized DataSource instances.
    #
    def datasources
      datasource_status = load_status
      secure_query do
        self.class.get('/datasources?full').map do |data|
          DruidConfig::Entities::DataSource.new(
            data,
            datasource_status.select { |k, _| k == data['name'] }.values.first)
        end
      end
    end

    #
    # Return a single datasource
    #
    # == Parameters:
    # datasource::
    #   String with the data source name
    #
    # == Returns:
    # DataSource instance
    #
    def datasource(datasource)
      datasources.select { |el| el.name == datasource }.first
    end

    # Rules
    # -----------------

    #
    # Return the rules applied to the cluster
    #
    def rules
      secure_query do
        self.class.get('/rules')
      end
    end

    # Tiers
    # -----------------

    #
    # Return all tiers defined in the cluster
    #
    # == Returns:
    # Array of Tier instances
    #
    def tiers
      current_nodes = servers
      # Initialize tiers
      secure_query do
        current_nodes.map(&:tier).uniq.map do |tier|
          DruidConfig::Entities::Tier.new(
            tier,
            current_nodes.select { |node| node.tier == tier })
        end
      end
    end

    # Servers
    # -----------------

    #
    # Return all nodes of the cluster
    #
    # == Returns:
    # Array of Node objects
    #
    def servers
      secure_query do
        queue = load_queue('full')
        self.class.get('/servers?full').map do |data|
          DruidConfig::Entities::Node.new(
            data,
            queue.select { |k, _| k == data['host'] }.values.first)
        end
      end
    end

    #
    # URIs of the physical servers in the cluster
    #
    # == Returns:
    # Array of strings
    #
    def physical_servers
      secure_query do
        @physical_servers ||= servers.map(&:host).uniq
      end
    end

    alias_method :nodes, :servers
    alias_method :physical_nodes, :physical_servers

    #
    # Return only historical nodes
    #
    # == Returns:
    # Array of Nodes
    #
    def historicals
      servers.select { |node| node.type == :historical }
    end

    #
    # Return only realtime nodes
    #
    # == Returns:
    # Array of Nodes
    #
    def realtimes
      servers.select { |node| node.type == :realtime }
    end

    #
    # Return all workers (MiddleManagers) of the cluster
    #
    # == Returns:
    # Array of Workers
    #
    def workers
      # Stash the base_uri and point HTTParty at the overlord
      stash_uri
      self.class.base_uri(
        "#{DruidConfig.client.overlord}"\
        "druid/indexer/#{DruidConfig::Version::API_VERSION}")
      workers = []
      # Perform the query
      begin
        secure_query do
          workers = self.class.get('/workers').map do |worker|
            DruidConfig::Entities::Worker.new(worker)
          end
        end
      ensure
        # Recover the coordinator base_uri
        pop_uri
      end
      # Return
      workers
    end

    #
    # URIs of the physical workers in the cluster
    #
    def physical_workers
      @physical_workers ||= workers.map(&:host).uniq
    end

    # Services
    # -----------------

    #
    # Available services in the cluster
    #
    # == Returns:
    # Hash with the format:
    #   { server: [ services ], server2: [ services ], ... }
    #
    def services
      return @services if @services
      services = {}
      physical_nodes.each { |node| services[node] = [] }
      # Load services
      realtimes.map(&:host).uniq.each { |r| services[r] << :realtime }
      historicals.map(&:host).uniq.each { |r| services[r] << :historical }
      physical_workers.each { |w| services[w] << :middleManager }
      # Return nodes
      @services = services
    end

    private

    #
    # Stash the current base_uri
    #
    def stash_uri
      @uri_stack ||= []
      @uri_stack.push self.class.base_uri
    end

    #
    # Restore the previous base_uri
    #
    def pop_uri
      return if @uri_stack.nil? || @uri_stack.empty?
      self.class.base_uri(@uri_stack.pop)
    end
  end
end
```
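
The retry behaviour described in the README is implemented by `secure_query`, which comes from `DruidConfig::Util` and is not shown here. A rough sketch of what such a wrapper could look like, purely as an illustration of the flow and not the gem's actual code:

```ruby
# Illustrative sketch only; the gem's real DruidConfig::Util helper may differ.
# It relies on the @retries counter initialized in Cluster#initialize, and
# DruidApiError is the exception named in the README (namespace assumed).
def secure_query
  result = yield
  @retries = 0
  result
rescue StandardError
  raise DruidApiError if @retries >= 1  # second failure in a row: give up
  @retries += 1
  reset!  # reopen the Zookeeper connection and refresh the coordinator URI
  retry
end
```
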

```ruby
module DruidConfig
  #
  # Module that groups the entities of the cluster
  #
  module Entities
    #
    # DataSource entity
    #
    class DataSource
      # HTTParty Rocks!
      include HTTParty

      attr_reader :name, :properties, :load_status

      #
      # Initialize a DataSource
      #
      def initialize(metadata, load_status)
        @name = metadata['name']
        @properties = metadata['properties']
        @load_status = load_status
        # Set the end point for HTTParty
        self.class.base_uri(
          "#{DruidConfig.client.coordinator}"\
          "druid/coordinator/#{DruidConfig::Version::API_VERSION}")
      end

      #
      # The following methods map directly to the Druid API. To check
      # their functionality, please go to the Druid documentation:
      #
      # http://druid.io/docs/0.8.1/design/coordinator.html
      #

      def info(params = '')
        @info ||= self.class.get("/datasources/#{@name}?#{params}")
      end

      # Intervals
      # -----------------
      def intervals(params = '')
        self.class.get("/datasources/#{@name}/intervals?#{params}")
      end

      def interval(interval, params = '')
        self.class.get("/datasources/#{@name}/intervals/#{interval}"\
                       "?#{params}")
      end

      # Segments and Tiers
      # -----------------
      def segments
        @segments ||=
          self.class.get("/datasources/#{@name}/segments?full").map do |s|
            DruidConfig::Entities::Segment.new(s)
          end
      end

      def segment(segment)
        segments.select { |s| s.id == segment }
      end

      def tiers
        info['tiers']
      end

      # Rules
      # -----------------
      def rules(params = '')
        self.class.get("/rules/#{@name}?#{params}")
      end

      def history_rules(interval)
        self.class.get("/rules/#{@name}/history"\
                       "?interval=#{interval}")
      end
    end
  end
end
```
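
A brief sketch of how a `DataSource` entity might be used once obtained from the cluster (names and values are illustrative):

```ruby
datasource = cluster.datasources.first

datasource.name                   # e.g. "events"
datasource.intervals              # intervals covered by the data source
datasource.tiers                  # tiers where its segments are loaded
datasource.segments.map(&:size)   # size in bytes of each segment
datasource.rules                  # rules applied to this data source
```
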

```ruby
module DruidConfig
  module Entities
    #
    # Node class
    #
    class Node
      # HTTParty Rocks!
      include HTTParty

      # Readers
      attr_reader :host, :port, :max_size, :type, :tier, :priority, :size,
                  :segments, :segments_to_load, :segments_to_drop,
                  :segments_to_load_size, :segments_to_drop_size

      #
      # Initialize it with the received info
      #
      # == Parameters:
      # metadata::
      #   Hash with the data of the node given by a Druid API query
      # queue::
      #   Hash with the segments to load
      #
      def initialize(metadata, queue)
        @host, @port = metadata['host'].split(':')
        @max_size = metadata['maxSize']
        @type = metadata['type'].to_sym
        @tier = metadata['tier']
        @priority = metadata['priority']
        @size = metadata['currSize']
        @segments = metadata['segments'].map do |_, sdata|
          DruidConfig::Entities::Segment.new(sdata)
        end
        if queue.nil?
          @segments_to_load, @segments_to_drop = [], []
          @segments_to_load_size, @segments_to_drop_size = 0, 0
        else
          @segments_to_load = queue['segmentsToLoad'].map do |segment|
            DruidConfig::Entities::Segment.new(segment)
          end
          @segments_to_drop = queue['segmentsToDrop'].map do |segment|
            DruidConfig::Entities::Segment.new(segment)
          end
          # Sum the segment sizes, defaulting to 0 when the queues are empty
          @segments_to_load_size = @segments_to_load.map(&:size).reduce(0, :+)
          @segments_to_drop_size = @segments_to_drop.map(&:size).reduce(0, :+)
        end
      end

      alias_method :used, :size

      #
      # Calculate the percentage of used space
      #
      def used_percent
        return 0 unless max_size && max_size != 0
        ((size.to_f / max_size) * 100).round(2)
      end

      #
      # Calculate the free space
      #
      def free
        max_size - size
      end

      #
      # Return the URI of this node
      #
      def uri
        "#{@host}:#{@port}"
      end
    end
  end
end
```
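
For example, a quick capacity check over the historical nodes (the output values are illustrative):

```ruby
cluster.historicals.each do |node|
  puts "#{node.uri} (#{node.tier}): #{node.used_percent}% used, #{node.free} bytes free"
end
```
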

```ruby
module DruidConfig
  module Entities
    #
    # Segment class
    #
    class Segment
      # Readers
      attr_reader :id, :interval, :version, :load_spec, :dimensions, :metrics,
                  :shard_spec, :binary_version, :size

      #
      # Initialize it with the received info
      #
      # == Parameters:
      # metadata::
      #   Hash with the metadata returned by Druid
      #
      def initialize(metadata)
        @id = metadata['identifier']
        @interval = metadata['interval'].split('/').map { |t| Time.parse t }
        @version = Time.parse metadata['version']
        @load_spec = metadata['loadSpec']
        @dimensions = metadata['dimensions'].split(',').map(&:to_sym)
        @metrics = metadata['metrics'].split(',').map(&:to_sym)
        @shard_spec = metadata['shardSpec']
        @binary_version = metadata['binaryVersion']
        @size = metadata['size']
      end

      #
      # Return a direct link to the store (builds an S3 URI from the loadSpec)
      #
      # == Returns:
      # String with the URI
      #
      def store_uri
        return '' if load_spec.empty?
        "s3://#{load_spec['bucket']}/#{load_spec['key']}"
      end

      #
      # Return the store type
      #
      # == Returns:
      # Store type as a symbol
      #
      def store_type
        return nil if load_spec.empty?
        load_spec['type'].to_sym
      end

      #
      # By default, show the identifier in #to_s
      #
      def to_s
        @id
      end
    end
  end
end
```
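
A sketch of constructing a `Segment` from the kind of metadata hash Druid returns; the field values below are made up purely to illustrate the expected shape:

```ruby
require 'druid_config'

segment = DruidConfig::Entities::Segment.new(
  'identifier'    => 'events_2015-10-01T00:00:00.000Z_2015-10-02T00:00:00.000Z_v1',
  'interval'      => '2015-10-01T00:00:00.000Z/2015-10-02T00:00:00.000Z',
  'version'       => '2015-10-02T01:00:00.000Z',
  'loadSpec'      => { 'type' => 's3_zip', 'bucket' => 'my-bucket', 'key' => 'events/index.zip' },
  'dimensions'    => 'client,host',
  'metrics'       => 'events,bytes',
  'shardSpec'     => { 'type' => 'none' },
  'binaryVersion' => 9,
  'size'          => 12_345_678
)

segment.interval    # => [2015-10-01 00:00:00 UTC, 2015-10-02 00:00:00 UTC]
segment.store_type  # => :s3_zip
segment.store_uri   # => "s3://my-bucket/events/index.zip"
```
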

```ruby
module DruidConfig
  module Entities
    #
    # Tier class
    #
    class Tier
      # Readers
      attr_reader :name, :nodes

      def initialize(name, nodes)
        @name = name
        @nodes = nodes
      end

      alias_method :servers, :nodes

      # Total used space across the nodes of the tier
      def size
        @size ||= nodes.map(&:size).inject(:+)
      end

      alias_method :used, :size

      # Total capacity of the tier
      def max_size
        @max_size ||= nodes.map(&:max_size).inject(:+)
      end

      # Free space of the tier
      def free
        @free ||= (max_size - size)
      end

      # Percentage of used space
      def used_percent
        return 0 unless max_size && max_size != 0
        ((size.to_f / max_size) * 100).round(2)
      end

      # Historical nodes assigned to the tier
      def historicals
        nodes.select { |node| node.type == :historical }
      end

      # Segments stored in the tier, sorted by the start of their interval
      def segments
        @segments ||= nodes.map(&:segments)
                           .flatten.sort_by { |seg| seg.interval.first }
      end

      # Number of segments queued to load
      def segments_to_load
        @segments_to_load ||=
          nodes.map { |node| node.segments_to_load.count }.inject(:+)
      end

      # Number of segments queued to drop
      def segments_to_drop
        @segments_to_drop ||=
          nodes.map { |node| node.segments_to_drop.count }.inject(:+)
      end

      # Total size of the segments queued to load
      def segments_to_load_size
        @segments_to_load_size ||=
          nodes.map(&:segments_to_load_size).reduce(:+)
      end

      # Total size of the segments queued to drop
      def segments_to_drop_size
        @segments_to_drop_size ||=
          nodes.map(&:segments_to_drop_size).reduce(:+)
      end
    end
  end
end
```
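
For example, a per-tier capacity summary built on the `cluster` object from the README (output is illustrative):

```ruby
cluster.tiers.each do |tier|
  puts "#{tier.name}: #{tier.nodes.size} nodes, " \
       "#{tier.used_percent}% used, #{tier.segments_to_load} segments queued to load"
end
```
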