druid_config 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/README.md ADDED
@@ -0,0 +1,68 @@
+ # DruidConfig
+
+ DruidConfig is a gem to access information about the status of a Druid cluster. You can check node capacity, number of segments, tiers and more. It uses [Zookeeper](https://zookeeper.apache.org/) to get the coordinator and overlord URIs.
+
+ To use it in your application, add the gem to your Gemfile and require it:
+
+ ```ruby
+ require 'druid_config'
+ ```
+
+ # Initialization
+
+ `Cluster` is the base class to perform queries. To initialize it, pass the Zookeeper URI and an options hash as arguments:
+
+ ```ruby
+ cluster = DruidConfig::Cluster.new(zookeeper_uri, options)
+ ```
+
+ Available options:
+ * discovery_path: string with the discovery path of Druid inside the Zookeeper directory structure.
+
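+ For example, a minimal sketch of connecting with the `discovery_path` option (the Zookeeper address and path below are placeholders, not values shipped with the gem):
+
+ ```ruby
+ require 'druid_config'
+
+ # Hypothetical Zookeeper address and Druid discovery path
+ cluster = DruidConfig::Cluster.new(
+   'localhost:2181',
+   discovery_path: '/druid/discovery'
+ )
+ ```
+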
+ # Usage
+
+ Call the methods defined in `DruidConfig::Cluster` to access the data; a combined example follows this list. For more information about the data returned by each method, check the [Druid documentation](http://druid.io/docs/0.8.1/design/coordinator.html).
+
+ * `leader`: leader of the cluster
+ * `load_status`: load status
+ * `load_queue`: load queue
+ * `metadata_datasources`: Hash with metadata of datasources
+ * `metadata_datasources_segments`: Hash with metadata of segments
+ * `datasources`: all data sources
+ * `datasource`: a concrete data source
+ * `rules`: all rules defined in the cluster
+ * `tiers`: tiers
+ * `servers` or `nodes`: all nodes of the cluster
+ * `physical_servers` or `physical_nodes`: array of URIs of nodes
+ * `historicals`: historical nodes
+ * `realtimes`: realtime nodes
+ * `workers`: worker nodes
+ * `physical_workers`: array of URIs of worker nodes
+ * `services`: Hash with physical nodes and the services they are running
+
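+ For instance, a short sketch combining some of these calls (returned values are illustrative):
+
+ ```ruby
+ # Current coordinator leader
+ cluster.leader                 # => "coordinator01:8081"
+
+ # Print how full each historical node is
+ cluster.historicals.each do |node|
+   puts "#{node.uri}: #{node.used_percent}% used"
+ end
+
+ # Names of the data sources defined in the cluster
+ cluster.datasources.map(&:name)
+ ```
+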
+ ## Entities
+
+ Some methods return instances of an `Entity` class. These entities provide their own methods to access data (see the sketch after this list). The defined entities live inside the `druid_config/entities` folder.
+
+ * [DataSource]()
+ * [Node]()
+ * [Segment]()
+ * [Tier]()
+ * [Worker]()
+
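+ For example, a sketch of navigating from the cluster to its entities (the data source and tier picked are arbitrary):
+
+ ```ruby
+ datasource = cluster.datasources.first
+ datasource.segments.size   # number of segments in the data source
+ datasource.rules           # rules applied to the data source
+
+ tier = cluster.tiers.first
+ tier.used_percent          # percentage of used space in the tier
+ ```
+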
+ ## Exceptions
+
+ Sometimes the gem cannot reach the Druid API. In that case, it automatically resets the Zookeeper connection and retries the query. If the second attempt fails too, a `DruidApiError` exception is raised, which you can rescue as shown below.
+
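+ A minimal sketch of handling this error; the `DruidConfig::Exceptions::DruidApiError` constant is an assumption about where the gem defines the exception, so adjust it to the actual namespace:
+
+ ```ruby
+ begin
+   cluster.datasources
+ rescue DruidConfig::Exceptions::DruidApiError => e
+   # Namespace assumed for illustration; check the gem's exceptions file
+   # for the exact constant.
+   puts "Druid API unreachable after retry: #{e.message}"
+ end
+ ```
+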
+ # Collaborate
+
+ To contribute to DruidConfig:
+
+ * Create an issue with the contribution: bug, enhancement or feature
+ * Fork the repository and make all the changes you need
+ * Write tests for your changes
+ * Create a pull request when you finish
+
+ # License
+
+ The DruidConfig gem is released under the Affero GPL license. Copyright [redBorder](http://redborder.net)
@@ -0,0 +1,46 @@
+ module DruidConfig
+   #
+   # Class to initialize the connection to Zookeeper
+   #
+   class Client
+     attr_reader :zk, :zookeeper, :opts
+
+     #
+     # Initialize the Zookeeper connection
+     #
+     def initialize(zookeeper, opts = {})
+       # Keep the connection data so the client can be reset later
+       @zookeeper = zookeeper
+       @opts = opts
+       @zk = ZK.new(zookeeper, opts)
+     end
+
+     #
+     # Get the URI of a coordinator
+     #
+     def coordinator
+       zk.coordinator
+     end
+
+     #
+     # Get the URI of an overlord
+     #
+     def overlord
+       zk.overlord
+     end
+
+     #
+     # Close the client
+     #
+     def close!
+       zk.close!
+     end
+
+     #
+     # Reset the client
+     #
+     def reset!
+       close!
+       @zk = ZK.new(@zookeeper, @opts)
+     end
+   end
+ end
@@ -0,0 +1,324 @@
+ module DruidConfig
+   #
+   # Class to access the Druid cluster status
+   #
+   class Cluster
+     # HTTParty Rocks!
+     include HTTParty
+     include DruidConfig::Util
+
+     #
+     # Initialize the client to perform the queries
+     #
+     # == Parameters:
+     # zk_uri::
+     #   String with the URI or URIs (separated by commas) of Zookeeper
+     # options::
+     #   Hash with options:
+     #   - discovery_path: String with the discovery path of Druid
+     #
+     def initialize(zk_uri, options)
+       # Initialize the Client
+       DruidConfig.client = DruidConfig::Client.new(zk_uri, options)
+
+       # Used to check the number of retries on error
+       @retries = 0
+
+       # Update the base uri to perform queries
+       self.class.base_uri(
+         "#{DruidConfig.client.coordinator}"\
+         "druid/coordinator/#{DruidConfig::Version::API_VERSION}")
+     end
+
+     #
+     # Close the connection with Zookeeper
+     #
+     def close!
+       DruidConfig.client.close!
+     end
+
+     #
+     # Reset the client
+     #
+     def reset!
+       DruidConfig.client.reset!
+       self.class.base_uri(
+         "#{DruidConfig.client.coordinator}"\
+         "druid/coordinator/#{DruidConfig::Version::API_VERSION}")
+     end
+
+     # ------------------------------------------------------------
+     # Queries!
+     # ------------------------------------------------------------
+
+     #
+     # The following methods map directly to the Druid API. To check their
+     # functionality, please go to the Druid documentation:
+     #
+     # http://druid.io/docs/0.8.1/design/coordinator.html
+     #
+
+     # Coordinator
+     # -----------------
+
+     #
+     # Return the leader of the Druid cluster
+     #
+     def leader
+       secure_query do
+         self.class.get('/leader').body
+       end
+     end
+
+     #
+     # Load status of the cluster
+     #
+     def load_status(params = '')
+       secure_query do
+         self.class.get("/loadstatus?#{params}")
+       end
+     end
+
+     #
+     # Load queue of the cluster
+     #
+     def load_queue(params = '')
+       secure_query do
+         self.class.get("/loadqueue?#{params}")
+       end
+     end
+
+     # Metadata
+     # -----------------
+
+     #
+     # Return a Hash with metadata of datasources
+     #
+     def metadata_datasources(params = '')
+       secure_query do
+         self.class.get("/metadata/datasources?#{params}")
+       end
+     end
+
+     alias_method :mt_datasources, :metadata_datasources
+
+     #
+     # Return a Hash with metadata of segments
+     #
+     # == Parameters:
+     # data_source::
+     #   String with the name of the data source
+     # segment::
+     #   (Optional) Segment to search
+     #
+     def metadata_datasources_segments(data_source, segment = '')
+       end_point = "/metadata/datasources/#{data_source}/segments"
+       secure_query do
+         if segment.empty? || segment == 'full'
+           self.class.get("#{end_point}?#{segment}")
+         else
+           self.class.get("#{end_point}/#{segment}")
+         end
+       end
+     end
+
+     alias_method :mt_datasources_segments, :metadata_datasources_segments
+
+     # Data sources
+     # -----------------
+
+     #
+     # Return all datasources
+     #
+     # == Returns:
+     # Array of initialized DataSource instances.
+     #
+     def datasources
+       datasource_status = load_status
+       secure_query do
+         self.class.get('/datasources?full').map do |data|
+           DruidConfig::Entities::DataSource.new(
+             data,
+             datasource_status.select { |k, _| k == data['name'] }.values.first)
+         end
+       end
+     end
+
+     #
+     # Return a single datasource
+     #
+     # == Parameters:
+     # datasource::
+     #   String with the data source name
+     #
+     # == Returns:
+     # DataSource instance, or nil if it does not exist
+     #
+     def datasource(datasource)
+       datasources.find { |el| el.name == datasource }
+     end
+
+     # Rules
+     # -----------------
+
+     #
+     # Return the rules applied to the cluster
+     #
+     def rules
+       secure_query do
+         self.class.get('/rules')
+       end
+     end
+
+     # Tiers
+     # -----------------
+
+     #
+     # Return all tiers defined in the cluster
+     #
+     # == Returns:
+     # Array of Tier instances
+     #
+     def tiers
+       current_nodes = servers
+       # Initialize tiers
+       secure_query do
+         current_nodes.map(&:tier).uniq.map do |tier|
+           DruidConfig::Entities::Tier.new(
+             tier,
+             current_nodes.select { |node| node.tier == tier })
+         end
+       end
+     end
+
+     # Servers
+     # -----------------
+
+     #
+     # Return all nodes of the cluster
+     #
+     # == Returns:
+     # Array of Node objects
+     #
+     def servers
+       secure_query do
+         queue = load_queue('full')
+         self.class.get('/servers?full').map do |data|
+           DruidConfig::Entities::Node.new(
+             data,
+             queue.select { |k, _| k == data['host'] }.values.first)
+         end
+       end
+     end
+
+     #
+     # URIs of the physical servers in the cluster
+     #
+     # == Returns:
+     # Array of strings
+     #
+     def physical_servers
+       secure_query do
+         @physical_servers ||= servers.map(&:host).uniq
+       end
+     end
+
+     alias_method :nodes, :servers
+     alias_method :physical_nodes, :physical_servers
+
+     #
+     # Return only historical nodes
+     #
+     # == Returns:
+     # Array of Nodes
+     #
+     def historicals
+       servers.select { |node| node.type == :historical }
+     end
+
+     #
+     # Return only realtime nodes
+     #
+     # == Returns:
+     # Array of Nodes
+     #
+     def realtimes
+       servers.select { |node| node.type == :realtime }
+     end
+
+     #
+     # Return all Workers (MiddleManagers) of the cluster
+     #
+     # == Returns:
+     # Array of Workers
+     #
+     def workers
+       # Stash the base_uri
+       stash_uri
+       self.class.base_uri(
+         "#{DruidConfig.client.overlord}"\
+         "druid/indexer/#{DruidConfig::Version::API_VERSION}")
+       workers = []
+       # Perform a query
+       begin
+         secure_query do
+           workers = self.class.get('/workers').map do |worker|
+             DruidConfig::Entities::Worker.new(worker)
+           end
+         end
+       ensure
+         # Recover the previous base_uri
+         pop_uri
+       end
+       # Return
+       workers
+     end
+
+     #
+     # URIs of the physical workers in the cluster
+     #
+     def physical_workers
+       @physical_workers ||= workers.map(&:host).uniq
+     end
+
+     # Services
+     # -----------------
+
+     #
+     # Available services in the cluster
+     #
+     # == Returns:
+     # Hash with the format:
+     #   { server: [ services ], server2: [ services ], ... }
+     #
+     def services
+       return @services if @services
+       services = {}
+       physical_nodes.each { |node| services[node] = [] }
+       # Load services
+       realtimes.map(&:host).uniq.each { |r| services[r] << :realtime }
+       historicals.map(&:host).uniq.each { |r| services[r] << :historical }
+       physical_workers.each { |w| services[w] << :middleManager }
+       # Return nodes
+       @services = services
+     end
+
+     private
+
+     #
+     # Stash the current base_uri
+     #
+     def stash_uri
+       @uri_stack ||= []
+       @uri_stack.push self.class.base_uri
+     end
+
+     #
+     # Pop the last stashed base_uri
+     #
+     def pop_uri
+       return if @uri_stack.nil? || @uri_stack.empty?
+       self.class.base_uri(@uri_stack.pop)
+     end
+   end
+ end
@@ -0,0 +1,78 @@
+ module DruidConfig
+   #
+   # Module that contains the cluster entities
+   #
+   module Entities
+     #
+     # DataSource entity
+     #
+     class DataSource
+       # HTTParty Rocks!
+       include HTTParty
+
+       attr_reader :name, :properties, :load_status
+
+       #
+       # Initialize a DataSource
+       #
+       def initialize(metadata, load_status)
+         @name = metadata['name']
+         @properties = metadata['properties']
+         @load_status = load_status
+         # Set the end point for HTTParty
+         self.class.base_uri(
+           "#{DruidConfig.client.coordinator}"\
+           "druid/coordinator/#{DruidConfig::Version::API_VERSION}")
+       end
+
+       #
+       # The following methods map directly to the Druid API. To check their
+       # functionality, please go to the Druid documentation:
+       #
+       # http://druid.io/docs/0.8.1/design/coordinator.html
+       #
+
+       def info(params = '')
+         @info ||= self.class.get("/datasources/#{@name}?#{params}")
+       end
+
+       # Intervals
+       # -----------------
+       def intervals(params = '')
+         self.class.get("/datasources/#{@name}/intervals?#{params}")
+       end
+
+       def interval(interval, params = '')
+         self.class.get("/datasources/#{@name}/intervals/#{interval}"\
+                        "?#{params}")
+       end
+
+       # Segments and Tiers
+       # -----------------
+       def segments
+         @segments ||=
+           self.class.get("/datasources/#{@name}/segments?full").map do |s|
+             DruidConfig::Entities::Segment.new(s)
+           end
+       end
+
+       def segment(segment)
+         segments.find { |s| s.id == segment }
+       end
+
+       def tiers
+         info['tiers']
+       end
+
+       # Rules
+       # -----------------
+       def rules(params = '')
+         self.class.get("/rules/#{@name}?#{params}")
+       end
+
+       def history_rules(interval)
+         self.class.get("/rules/#{@name}/history"\
+                        "?interval=#{interval}")
+       end
+     end
+   end
+ end
@@ -0,0 +1,74 @@
+ module DruidConfig
+   module Entities
+     #
+     # Node class
+     #
+     class Node
+       # HTTParty Rocks!
+       include HTTParty
+
+       # Readers
+       attr_reader :host, :port, :max_size, :type, :tier, :priority, :size,
+                   :segments, :segments_to_load, :segments_to_drop,
+                   :segments_to_load_size, :segments_to_drop_size
+
+       #
+       # Initialize it with the received info
+       #
+       # == Parameters:
+       # metadata::
+       #   Hash with the data of the node given by a Druid API query
+       # queue::
+       #   Hash with the segments to load
+       #
+       def initialize(metadata, queue)
+         @host, @port = metadata['host'].split(':')
+         @max_size = metadata['maxSize']
+         @type = metadata['type'].to_sym
+         @tier = metadata['tier']
+         @priority = metadata['priority']
+         @size = metadata['currSize']
+         @segments = metadata['segments'].map do |_, sdata|
+           DruidConfig::Entities::Segment.new(sdata)
+         end
+         if queue.nil?
+           @segments_to_load, @segments_to_drop = [], []
+           @segments_to_load_size, @segments_to_drop_size = 0, 0
+         else
+           @segments_to_load = queue['segmentsToLoad'].map do |segment|
+             DruidConfig::Entities::Segment.new(segment)
+           end
+           @segments_to_drop = queue['segmentsToDrop'].map do |segment|
+             DruidConfig::Entities::Segment.new(segment)
+           end
+           # Start the reduce at 0 so an empty queue yields 0 instead of nil
+           @segments_to_load_size = @segments_to_load.map(&:size).reduce(0, :+)
+           @segments_to_drop_size = @segments_to_drop.map(&:size).reduce(0, :+)
+         end
+       end
+
+       alias_method :used, :size
+
+       #
+       # Calculate the percentage of used space
+       #
+       def used_percent
+         return 0 unless max_size && max_size != 0
+         ((size.to_f / max_size) * 100).round(2)
+       end
+
+       #
+       # Calculate the free space
+       #
+       def free
+         max_size - size
+       end
+
+       #
+       # Return the URI of this node
+       #
+       def uri
+         "#{@host}:#{@port}"
+       end
+     end
+   end
+ end
File without changes
@@ -0,0 +1,60 @@
+ module DruidConfig
+   module Entities
+     #
+     # Segment class
+     #
+     class Segment
+       # Readers
+       attr_reader :id, :interval, :version, :load_spec, :dimensions, :metrics,
+                   :shard_spec, :binary_version, :size
+
+       #
+       # Initialize it with the received info
+       #
+       # == Parameters:
+       # metadata::
+       #   Hash with the metadata returned by Druid
+       #
+       def initialize(metadata)
+         @id = metadata['identifier']
+         @interval = metadata['interval'].split('/').map { |t| Time.parse t }
+         @version = Time.parse metadata['version']
+         @load_spec = metadata['loadSpec']
+         @dimensions = metadata['dimensions'].split(',').map(&:to_sym)
+         @metrics = metadata['metrics'].split(',').map(&:to_sym)
+         @shard_spec = metadata['shardSpec']
+         @binary_version = metadata['binaryVersion']
+         @size = metadata['size']
+       end
+
+       #
+       # Return a direct link to the segment in deep storage (built from the
+       # S3 load spec)
+       #
+       # == Returns:
+       # String with the URI
+       #
+       def store_uri
+         return '' if load_spec.empty?
+         "s3://#{load_spec['bucket']}/#{load_spec['key']}"
+       end
+
+       #
+       # Return the store type
+       #
+       # == Returns:
+       # Store type as a symbol
+       #
+       def store_type
+         return nil if load_spec.empty?
+         load_spec['type'].to_sym
+       end
+
+       #
+       # Show the identifier in to_s
+       #
+       def to_s
+         @id
+       end
+     end
+   end
+ end
@@ -0,0 +1,66 @@
+ module DruidConfig
+   module Entities
+     #
+     # Tier class
+     #
+     class Tier
+       # Readers
+       attr_reader :name, :nodes
+
+       def initialize(name, nodes)
+         @name = name
+         @nodes = nodes
+       end
+
+       alias_method :servers, :nodes
+
+       def size
+         @size ||= nodes.map(&:size).inject(:+)
+       end
+
+       alias_method :used, :size
+
+       def max_size
+         @max_size ||= nodes.map(&:max_size).inject(:+)
+       end
+
+       def free
+         @free ||= (max_size - size)
+       end
+
+       def used_percent
+         return 0 unless max_size && max_size != 0
+         ((size.to_f / max_size) * 100).round(2)
+       end
+
+       def historicals
+         nodes.select { |node| node.type == :historical }
+       end
+
+       def segments
+         @segments ||= nodes.map(&:segments)
+                            .flatten.sort_by { |seg| seg.interval.first }
+       end
+
+       def segments_to_load
+         @segments_to_load ||=
+           nodes.map { |node| node.segments_to_load.count }.inject(:+)
+       end
+
+       def segments_to_drop
+         @segments_to_drop ||=
+           nodes.map { |node| node.segments_to_drop.count }.inject(:+)
+       end
+
+       def segments_to_load_size
+         @segments_to_load_size ||=
+           nodes.map(&:segments_to_load_size).reduce(:+)
+       end
+
+       def segments_to_drop_size
+         @segments_to_drop_size ||=
+           nodes.map(&:segments_to_drop_size).reduce(:+)
+       end
+     end
+   end
+ end