druiddb 1.0.1 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 77f96072ce4ca16fd1d5f9d4ad147f646da2e6a8
4
- data.tar.gz: 1c17a5ea1d268cc3ba0a72aa1acdffd21054f26e
3
+ metadata.gz: 6aff717c3c1644264319311423dd20b93c6b1c1b
4
+ data.tar.gz: 13ee2b193c9fd2eb5ce6c875176b950f4be736aa
5
5
  SHA512:
6
- metadata.gz: 43d399682a94461de6ab511252a3807ec7610f0393c36be70dfe86ecf42e5d45f172175c3b4551cb3c954ac2bbdfddc3d2d0c25105e00f0b4b426f939707be1a
7
- data.tar.gz: 8a2c36065fe408b8c092946ad69b003cde7f8ace3562ba12d8efc61378b247bd16a65e1fcb520db09e9e265accb7c1d171fd069a36b85358c2a504f1d4d4895f
6
+ metadata.gz: c20d56f9c873ded1c52c99ee7fe3e6e6517c652335d45c8cf8e3696063b25e1291eff03a210036c0b3128d76ae6e86303f90e0d62ad5ede3ded5fa7a72db4ee2
7
+ data.tar.gz: 1c8b7c392900c2b99c517186a498ed84d9843ccb40ba95ff24e0b929a9448e58b387fc8d20257debbbb69bdead3d5b0dbc7c1ab059feef65a63c6b0199047269
data/.gitignore CHANGED
@@ -9,6 +9,5 @@
9
9
  /tmp/
10
10
  /example
11
11
  zookeeper.out
12
- jruby-druid.log
13
12
  .ruby-version
14
13
  *.gem
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --require spec_helper
data/.rubocop.yml ADDED
@@ -0,0 +1,6 @@
1
+ inherit_from: .rubocop_todo.yml
2
+
3
+ Documentation:
4
+ Enabled: false
5
+ Metrics/LineLength:
6
+ Max: 100
data/.rubocop_todo.yml ADDED
@@ -0,0 +1,35 @@
1
+ # This configuration was generated by
2
+ # `rubocop --auto-gen-config`
3
+ # on 2017-08-20 22:13:21 -0400 using RuboCop version 0.49.1.
4
+ # The point is for the user to remove these configuration records
5
+ # one by one as the offenses are removed from the code base.
6
+ # Note that changes in the inspected code, or installation of new
7
+ # versions of RuboCop, may require this file to be generated again.
8
+
9
+ # Offense count: 11
10
+ Metrics/AbcSize:
11
+ Max: 22
12
+
13
+ # Offense count: 1
14
+ # Configuration parameters: CountComments.
15
+ Metrics/ClassLength:
16
+ Max: 145
17
+
18
+ # Offense count: 3
19
+ Metrics/CyclomaticComplexity:
20
+ Max: 11
21
+
22
+ # Offense count: 2
23
+ # Configuration parameters: AllowHeredoc, AllowURI, URISchemes, IgnoreCopDirectives, IgnoredPatterns.
24
+ # URISchemes: http, https
25
+ Metrics/LineLength:
26
+ Max: 108
27
+
28
+ # Offense count: 9
29
+ # Configuration parameters: CountComments.
30
+ Metrics/MethodLength:
31
+ Max: 26
32
+
33
+ # Offense count: 1
34
+ Metrics/PerceivedComplexity:
35
+ Max: 10
data/.travis.yml ADDED
@@ -0,0 +1,15 @@
1
+ language: ruby
2
+ sudo: required
3
+
4
+ services:
5
+ - docker
6
+
7
+ before_script:
8
+ - docker-compose up -d
9
+ - docker build -t druiddb-ruby .
10
+
11
+ script:
12
+ - docker run -it --network=druiddbruby_druiddb druiddb-ruby bin/run_tests.sh
13
+
14
+ after_script:
15
+ - docker-compose down
data/Dockerfile ADDED
@@ -0,0 +1,20 @@
1
+ FROM ruby:2.2.6
2
+ MAINTAINER Andre LeBlanc <andre.leblanc88@gmail.com>
3
+
4
+ RUN apt-get update
5
+
6
+ WORKDIR /druiddb-ruby
7
+
8
+ COPY lib/druiddb/version.rb lib/druiddb/version.rb
9
+ COPY druiddb.gemspec druiddb.gemspec
10
+ COPY Gemfile Gemfile
11
+
12
+ RUN git init
13
+ RUN bundle install
14
+
15
+ COPY bin bin
16
+ COPY lib lib
17
+ COPY spec spec
18
+ COPY Rakefile Rakefile
19
+
20
+ CMD bin/console
data/README.md CHANGED
@@ -1 +1,130 @@
1
- # ruby-druid
1
+ # druiddb-ruby
2
+
3
+ [![Build Status](https://travis-ci.org/andremleblanc/druiddb-ruby.svg?branch=master)](https://travis-ci.org/andremleblanc/druiddb-ruby)
4
+ [![Gem Version](https://badge.fury.io/rb/druiddb.svg)](https://badge.fury.io/rb/druiddb)
5
+ [![Code Climate](https://codeclimate.com/github/andremleblanc/druiddb-ruby/badges/gpa.svg)](https://codeclimate.com/github/andremleblanc/druiddb-ruby)
6
+ [![Test Coverage](https://codeclimate.com/github/andremleblanc/druiddb-ruby/badges/coverage.svg)](https://codeclimate.com/github/andremleblanc/druiddb-ruby/coverage)
7
+ [![Dependency Status](https://gemnasium.com/badges/github.com/andremleblanc/druiddb-ruby.svg)](https://gemnasium.com/github.com/andremleblanc/druiddb-ruby)
8
+
9
+ This documentation is intended to be a quick-start guide, not a comprehensive
10
+ list of all available methods and configuration options. Please look through
11
+ the source for more information; great places to get started are `DruidDB::Client`
12
+ and the `DruidDB::Query` modules, as they expose most of the methods on the client.
13
+
14
+ This guide assumes significant knowledge of Druid; for more info:
15
+ http://druid.io/docs/latest/design/index.html
16
+
17
+ ## Install
18
+
19
+ ```bash
20
+ $ gem install druiddb
21
+ ```
22
+
23
+ ## Usage
24
+
25
+ ### Creating a Client
26
+ ```ruby
27
+ client = DruidDB::Client.new()
28
+ ```
29
+ *Note:* There are many configuration options, please take a look at
30
+ `DruidDB::Configuration` for more details.
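For illustration, here is a minimal sketch of the option-resolution pattern `DruidDB::Configuration` uses: each attribute takes the passed value or falls back to a default constant. The helper and constant below are illustrative; the real class covers many more options, such as `:discovery_path` and `:client_id`.

```ruby
# Sketch of DruidDB::Configuration's fallback pattern: an entry in the
# options hash wins; otherwise the built-in default constant is used.
ZOOKEEPER_DEFAULT = 'localhost:2181'.freeze

def resolve_zookeeper(opts = {})
  opts[:zookeeper] || ZOOKEEPER_DEFAULT
end

puts resolve_zookeeper                              # default
puts resolve_zookeeper(zookeeper: 'zookeeper:2181') # override
```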
31
+
32
+ ### Writing Data
33
+
34
+ #### Kafka Indexing Service
35
+ This gem leverages the [Kafka Indexing Service](http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html) for ingesting data. The gem pushes datapoints onto Kafka topics (typically named after the datasource). You can also use the gem to upload an ingestion spec, which is needed for Druid to consume the Kafka topic.
36
+
37
+ This repo contains a `docker-compose.yml` that may help bootstrap development with Druid and the Kafka Indexing Service; it's what we use for integration testing.
38
+
39
+ #### Submitting an Ingestion Spec
40
+
41
+ ```ruby
42
+ path = 'path/to/spec.json'
43
+ client.submit_supervisor_spec(path)
44
+ ```
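The spec file is a standard Druid Kafka supervisor spec. A minimal sketch for the `foo` datasource used below (the field values are placeholders; consult the Kafka Indexing Service docs for the full schema):

```json
{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "foo",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": { "column": "timestamp", "format": "iso" },
        "dimensionsSpec": { "dimensions": ["foo"] }
      }
    },
    "metricsSpec": [{ "type": "longSum", "name": "units", "fieldName": "units" }],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "DAY",
      "queryGranularity": "MINUTE"
    }
  },
  "ioConfig": {
    "topic": "foo",
    "consumerProperties": { "bootstrap.servers": "kafka:9092" }
  }
}
```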
45
+
46
+ #### Writing Datapoints
47
+ ```ruby
48
+ topic_name = 'foo'
49
+ datapoint = {
50
+ timestamp: Time.now.utc.iso8601,
51
+ foo: 'bar',
52
+ units: 1
53
+ }
54
+ client.write_point(topic_name, datapoint)
55
+ ```
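Under the hood, `write_point` serializes the datapoint hash to JSON before producing it to the Kafka topic named after the datasource (see `DruidDB::Writer#write_point`). A small standalone sketch of that serialization, with a fixed timestamp so the output is deterministic:

```ruby
require 'json'
require 'time'

# The datapoint hash is serialized with Hash#to_json before being
# produced to the Kafka topic named after the datasource.
datapoint = {
  timestamp: Time.at(0).utc.iso8601, # fixed time for a deterministic example
  foo: 'bar',
  units: 1
}
message = datapoint.to_json
puts message # => {"timestamp":"1970-01-01T00:00:00Z","foo":"bar","units":1}
```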
56
+
57
+ ### Reading Data
58
+
59
+ #### Querying
60
+ ```ruby
61
+ client.query(
62
+ queryType: 'timeseries',
63
+ dataSource: 'foo',
64
+ granularity: 'day',
65
+ intervals: Time.now.utc.advance(days: -30).iso8601 + '/' + Time.now.utc.iso8601,
66
+ aggregations: [{ type: 'longSum', name: 'baz', fieldName: 'baz' }]
67
+ )
68
+ ```
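`intervals` is a single ISO 8601 interval string of the form `<start>/<end>`. The example above uses ActiveSupport's `Time#advance`; plain Ruby arithmetic works just as well. A standalone sketch with a fixed clock so the output is deterministic:

```ruby
require 'time'

# Build a "<start>/<end>" interval covering the last 30 days.
now = Time.at(1_500_000_000).utc    # fixed "now" (2017-07-14T02:40:00Z)
start = (now - 30 * 86_400).iso8601 # a day is 86_400 seconds
interval = "#{start}/#{now.iso8601}"
puts interval # => 2017-06-14T02:40:00Z/2017-07-14T02:40:00Z
```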
69
+ The `query` method POSTs the query to Druid; for information on
70
+ querying Druid, see http://druid.io/docs/latest/querying/querying.html. This is
71
+ intentionally simple so that all current, and hopefully all future, features of
72
+ the Druid query language work without updating the gem.
73
+
74
+ ##### Fill Empty Intervals
75
+
76
+ Currently, Druid will not fill empty intervals for which there are no points. To
77
+ accommodate this need until it is handled more efficiently in Druid, use the
78
+ experimental `fill_value` feature in your query. This ensures you get a result
79
+ for every interval in `intervals`.
80
+
81
+ This has only been tested with 'timeseries' and single-dimension 'groupBy'
82
+ queries with simple granularities.
83
+
84
+ ```ruby
85
+ client.query(
86
+ queryType: 'timeseries',
87
+ dataSource: 'foo',
88
+ granularity: 'day',
89
+ intervals: Time.now.utc.advance(days: -30).iso8601 + '/' + Time.now.utc.iso8601,
90
+ aggregations: [{ type: 'longSum', name: 'baz', fieldName: 'baz' }],
91
+ fill_value: 0
92
+ )
93
+ ```
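Conceptually, `fill_value` walks every expected timestamp in the interval and substitutes a synthetic point wherever Druid returned none. A simplified standalone sketch of that logic (the real implementation lives in `DruidDB::Query`; the helper name and data shapes here are illustrative):

```ruby
# Simplified model of fill_value: for each expected timestamp, keep the
# real point if Druid returned one, otherwise create a filled-in point.
def fill_empty_intervals(points, timestamps, metric, fill_value)
  timestamps.map do |ts|
    points.find { |p| p['timestamp'] == ts } ||
      { 'timestamp' => ts, 'result' => { metric => fill_value } }
  end
end

points = [{ 'timestamp' => '2017-01-02T00:00:00Z', 'result' => { 'baz' => 5 } }]
days = %w[2017-01-01T00:00:00Z 2017-01-02T00:00:00Z 2017-01-03T00:00:00Z]
filled = fill_empty_intervals(points, days, 'baz', 0)
filled.each { |p| puts "#{p['timestamp']} baz=#{p['result']['baz']}" }
# => 2017-01-01T00:00:00Z baz=0
# => 2017-01-02T00:00:00Z baz=5
# => 2017-01-03T00:00:00Z baz=0
```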
94
+
95
+ ### Management
96
+ List datasources.
97
+ ```ruby
98
+ client.list_datasources
99
+ ```
100
+
101
+ List supervisor tasks.
102
+ ```ruby
103
+ client.supervisor_tasks
104
+ ```
105
+
106
+ ## Development
107
+
108
+ ### Docker Compose
109
+ This project uses docker-compose to provide a development environment.
110
+
111
+ 1. git clone the project
112
+ 2. cd into project
113
+ 3. `docker-compose up` - this will download necessary images and run all dependencies in the foreground.
114
+
115
+ Then you can use `docker build -t some_tag .` to build the Docker image for this project after making changes and `docker run -it --network=druiddbruby_druiddb some_tag some_command` to interact with it.
116
+
117
+ ### Metabase
118
+
119
+ Viewing data in the database can be a bit annoying; using a tool like [Metabase](https://github.com/metabase/metabase) makes this much easier and is what I personally do when developing.
120
+
121
+ ## Testing
122
+
123
+ Tests are run using the docker-compose environment.
124
+
125
+ 1. `docker-compose up`
126
+ 2. `docker run -it --network=druiddbruby_druiddb druiddb-ruby bin/run_tests.sh`
127
+
128
+ ## License
129
+
130
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
data/Rakefile CHANGED
@@ -1,6 +1,13 @@
1
- require "bundler/gem_tasks"
2
- require "rspec/core/rake_task"
1
+ require 'bundler/gem_tasks'
2
+ require 'rspec/core/rake_task'
3
+ require 'druiddb'
3
4
 
4
- RSpec::Core::RakeTask.new(:spec)
5
-
6
- task :default => :spec
5
+ namespace :db do
6
+ namespace :test do
7
+ task :prepare do
8
+ client = DruidDB::Client.new(zookeeper: 'zookeeper:2181')
9
+ client.submit_supervisor_spec("#{Dir.pwd}/spec/ingestion_specs/xwings_spec.json")
10
+ puts client.supervisor_tasks
11
+ end
12
+ end
13
+ end
data/bin/console CHANGED
@@ -1,7 +1,7 @@
1
1
  #!/usr/bin/env ruby
2
2
 
3
- require "bundler/setup"
4
- require "irb"
5
- require "druiddb"
3
+ require 'bundler/setup'
4
+ require 'irb'
5
+ require 'druiddb'
6
6
 
7
7
  IRB.start
data/bin/run_tests.sh ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env bash
2
+ spec/wait-for-it.sh overlord:8090 --timeout=30 --strict -- rspec
data/docker-compose.yml ADDED
@@ -0,0 +1,100 @@
1
+ version: "3"
2
+
3
+ networks:
4
+ druiddb:
5
+
6
+ volumes:
7
+ druid_fs:
8
+
9
+ services:
10
+ zookeeper:
11
+ image: zookeeper:3.4
12
+ networks:
13
+ - druiddb
14
+ ports:
15
+ - '2181:2181'
16
+
17
+ derby:
18
+ image: adito/apache-derby
19
+ networks:
20
+ - druiddb
21
+
22
+ kafka:
23
+ image: wurstmeister/kafka:0.10.2.1
24
+ networks:
25
+ - druiddb
26
+ ports:
27
+ - '7203:7203'
28
+ - '9092:9092'
29
+ depends_on:
30
+ - zookeeper
31
+ environment:
32
+ KAFKA_ADVERTISED_HOST_NAME: kafka
33
+ KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
34
+ volumes:
35
+ - /var/run/docker.sock:/var/run/docker.sock
36
+
37
+ broker:
38
+ image: andremleblanc/druid-broker:0.9.2
39
+ networks:
40
+ - druiddb
41
+ ports:
42
+ - '8082:8082'
43
+ depends_on:
44
+ - zookeeper
45
+ - derby
46
+ - kafka
47
+ volumes:
48
+ - druid_fs:/druid-0.9.2/var/druid/
49
+
50
+ coordinator:
51
+ image: andremleblanc/druid-coordinator:0.9.2
52
+ networks:
53
+ - druiddb
54
+ ports:
55
+ - '8081:8081'
56
+ depends_on:
57
+ - zookeeper
58
+ - derby
59
+ - kafka
60
+ volumes:
61
+ - druid_fs:/druid-0.9.2/var/druid/
62
+
63
+ historical:
64
+ image: andremleblanc/druid-historical:0.9.2
65
+ networks:
66
+ - druiddb
67
+ ports:
68
+ - '8083:8083'
69
+ depends_on:
70
+ - zookeeper
71
+ - derby
72
+ - kafka
73
+ volumes:
74
+ - druid_fs:/druid-0.9.2/var/druid/
75
+
76
+ middlemanager:
77
+ image: andremleblanc/druid-middlemanager:0.9.2
78
+ networks:
79
+ - druiddb
80
+ ports:
81
+ - '8091:8091'
82
+ depends_on:
83
+ - zookeeper
84
+ - derby
85
+ - kafka
86
+ volumes:
87
+ - druid_fs:/druid-0.9.2/var/druid/
88
+
89
+ overlord:
90
+ image: andremleblanc/druid-overlord:0.9.2
91
+ networks:
92
+ - druiddb
93
+ ports:
94
+ - '8090:8090'
95
+ depends_on:
96
+ - zookeeper
97
+ - kafka
98
+ - derby
99
+ volumes:
100
+ - druid_fs:/druid-0.9.2/var/druid/
data/druiddb.gemspec CHANGED
@@ -1,28 +1,32 @@
1
1
  # coding: utf-8
2
+
2
3
  lib = File.expand_path('../lib', __FILE__)
3
4
  $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
- require 'druid/version'
5
+ require 'druiddb/version'
5
6
 
6
7
  Gem::Specification.new do |spec|
7
- spec.name = "druiddb"
8
- spec.version = Druiddb::VERSION
9
- spec.authors = ["Andre LeBlanc"]
10
- spec.email = ["andre.leblanc88@gmail.com"]
8
+ spec.name = 'druiddb'
9
+ spec.version = DruidDB::VERSION
10
+ spec.authors = ['Andre LeBlanc']
11
+ spec.email = ['andre.leblanc88@gmail.com']
11
12
 
12
- spec.summary = 'Ruby adapter for Druid.'
13
- spec.description = 'Ruby adapter for Druid that allows reads and writes using the Tranquility Kafka API.'
14
- spec.homepage = "https://github.com/andremleblanc/druiddb"
15
- spec.license = "MIT"
13
+ spec.summary = 'Ruby client for Druid.'
14
+ spec.description = 'Ruby client for reading from and writing to Druid.'
15
+ spec.homepage = 'https://github.com/andremleblanc/druiddb-ruby'
16
+ spec.license = 'MIT'
16
17
 
17
- spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
18
- spec.bindir = "exe"
18
+ spec.files = `git ls-files -z`.split("\x0").reject do |f|
19
+ f.match(%r{^(test|spec|features)/})
20
+ end
21
+ spec.bindir = 'exe'
19
22
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
20
- spec.require_paths = ["lib"]
23
+ spec.require_paths = ['lib']
21
24
 
22
- spec.add_dependency "activesupport", '>= 4.0'
23
- spec.add_dependency "ruby-kafka", '~> 0.3'
24
- spec.add_dependency "zk", '~> 1.9'
25
+ spec.add_dependency 'activesupport', '> 4.0'
26
+ spec.add_dependency 'ruby-kafka', '~> 0.3'
27
+ spec.add_dependency 'zk', '~> 1.9'
25
28
 
26
- spec.add_development_dependency "bundler", '~> 1.7'
27
- spec.add_development_dependency "rake", '~> 10.0'
29
+ spec.add_development_dependency 'bundler', '~> 1.7'
30
+ spec.add_development_dependency 'rake', '~> 10.0'
31
+ spec.add_development_dependency 'rspec', '~> 3.6'
28
32
  end
data/lib/druiddb.rb CHANGED
@@ -1,21 +1,22 @@
1
- require "active_support/all"
2
- require "ruby-kafka"
3
- require "json"
4
- require "zk"
1
+ require 'active_support/all'
2
+ require 'ruby-kafka'
3
+ require 'json'
4
+ require 'zk'
5
5
 
6
- require "druid/configuration"
7
- require "druid/connection"
8
- require "druid/errors"
9
- require "druid/query"
10
- require "druid/version"
11
- require "druid/zk"
6
+ require 'druiddb/configuration'
7
+ require 'druiddb/connection'
8
+ require 'druiddb/errors'
9
+ require 'druiddb/query'
10
+ require 'druiddb/version'
11
+ require 'druiddb/zk'
12
12
 
13
- require "druid/node/broker"
14
- require "druid/node/coordinator"
15
- require "druid/node/overlord"
13
+ require 'druiddb/node/broker'
14
+ require 'druiddb/node/coordinator'
15
+ require 'druiddb/node/overlord'
16
16
 
17
- require "druid/queries/core"
18
- require "druid/queries/task"
17
+ require 'druiddb/queries/core'
18
+ require 'druiddb/queries/datasources'
19
+ require 'druiddb/queries/task'
19
20
 
20
- require "druid/writer"
21
- require "druid/client"
21
+ require 'druiddb/writer'
22
+ require 'druiddb/client'
data/lib/druiddb/client.rb ADDED
@@ -0,0 +1,23 @@
1
+ module DruidDB
2
+ class Client
3
+ include DruidDB::Queries::Core
4
+ include DruidDB::Queries::Datasources
5
+ include DruidDB::Queries::Task
6
+
7
+ attr_reader :broker,
8
+ :config,
9
+ :coordinator,
10
+ :overlord,
11
+ :writer,
12
+ :zk
13
+
14
+ def initialize(options = {})
15
+ @config = DruidDB::Configuration.new(options)
16
+ @zk = DruidDB::ZK.new(config)
17
+ @broker = DruidDB::Node::Broker.new(config, zk)
18
+ @coordinator = DruidDB::Node::Coordinator.new(config, zk)
19
+ @overlord = DruidDB::Node::Overlord.new(config, zk)
20
+ @writer = DruidDB::Writer.new(config, zk)
21
+ end
22
+ end
23
+ end
@@ -1,17 +1,19 @@
1
- module Druid
1
+ module DruidDB
2
2
  class Configuration
3
+ CLIENT_ID = 'druiddb-ruby'.freeze
3
4
  DISCOVERY_PATH = '/druid/discovery'.freeze
4
5
  INDEX_SERVICE = 'druid/overlord'.freeze
5
6
  KAFKA_BROKER_PATH = '/brokers/ids'.freeze
6
7
  LOG_LEVEL = :error
7
8
  ROLLUP_GRANULARITY = :minute
8
- STRONG_DELETE = false # Not recommend to be true for production.
9
+ STRONG_DELETE = false
9
10
  TUNING_GRANULARITY = :day
10
11
  TUNING_WINDOW = 'PT1H'.freeze
11
- WAIT_TIME = 20 # Seconds
12
+ WAIT_TIME = 20
12
13
  ZOOKEEPER = 'localhost:2181'.freeze
13
14
 
14
- attr_reader :discovery_path,
15
+ attr_reader :client_id,
16
+ :discovery_path,
15
17
  :index_service,
16
18
  :kafka_broker_path,
17
19
  :log_level,
@@ -22,8 +24,8 @@ module Druid
22
24
  :wait_time,
23
25
  :zookeeper
24
26
 
25
-
26
27
  def initialize(opts = {})
28
+ @client_id = opts[:client_id] || CLIENT_ID
27
29
  @discovery_path = opts[:discovery_path] || DISCOVERY_PATH
28
30
  @index_service = opts[:index_service] || INDEX_SERVICE
29
31
  @kafka_broker_path = opts[:kafka_broker_path] || KAFKA_BROKER_PATH
@@ -1,22 +1,23 @@
1
1
  # Based on: http://danknox.github.io/2013/01/27/using-rubys-native-nethttp-library/
2
2
  require 'net/http'
3
3
 
4
- module Druid
4
+ module DruidDB
5
5
  class Connection
6
6
  CONTENT_TYPE = 'application/json'.freeze
7
7
  VERB_MAP = {
8
- :get => ::Net::HTTP::Get,
9
- :post => ::Net::HTTP::Post,
10
- :put => ::Net::HTTP::Put,
11
- :delete => ::Net::HTTP::Delete
12
- }
8
+ get: ::Net::HTTP::Get,
9
+ post: ::Net::HTTP::Post,
10
+ put: ::Net::HTTP::Put,
11
+ delete: ::Net::HTTP::Delete
12
+ }.freeze
13
13
 
14
14
  attr_reader :http
15
15
 
16
16
  def initialize(endpoint)
17
17
  if endpoint.is_a? String
18
18
  uri = URI.parse(endpoint)
19
- host, port = uri.host, uri.port
19
+ host = uri.host
20
+ port = uri.port
20
21
  else
21
22
  host, port = endpoint.values_at(:host, :port)
22
23
  end
@@ -44,7 +45,7 @@ module Druid
44
45
 
45
46
  def encode_path_params(path, params)
46
47
  encoded = URI.encode_www_form(params)
47
- [path, encoded].join("?")
48
+ [path, encoded].join('?')
48
49
  end
49
50
 
50
51
  def request(method, path, params)
@@ -60,7 +61,7 @@ module Druid
60
61
  request.content_type = CONTENT_TYPE
61
62
  begin
62
63
  response = http.request(request)
63
- rescue Timeout::Error, *Druid::NET_HTTP_EXCEPTIONS => e
64
+ rescue Timeout::Error, *DruidDB::NET_HTTP_EXCEPTIONS => e
64
65
  raise ConnectionError, e.message
65
66
  end
66
67
 
@@ -1,4 +1,4 @@
1
- module Druid
1
+ module DruidDB
2
2
  class Error < StandardError; end
3
3
  class ClientError < Error; end
4
4
  class ConnectionError < Error; end
@@ -18,5 +18,5 @@ module Druid
18
18
  Net::HTTPHeaderSyntaxError,
19
19
  Net::ProtocolError,
20
20
  SocketError
21
- ]
21
+ ].freeze
22
22
  end
@@ -1,4 +1,4 @@
1
- module Druid
1
+ module DruidDB
2
2
  module Node
3
3
  class Broker
4
4
  QUERY_PATH = '/druid/v2'.freeze
@@ -9,18 +9,18 @@ module Druid
9
9
  @zk = zk
10
10
  end
11
11
 
12
- #TODO: Would caching connections be beneficial?
13
12
  def connection
14
13
  broker = zk.registry["#{config.discovery_path}/druid:broker"].first
15
- raise Druid::ConnectionError, 'no druid brokers available' if broker.nil?
14
+ raise DruidDB::ConnectionError, 'no druid brokers available' if broker.nil?
16
15
  zk.registry["#{config.discovery_path}/druid:broker"].rotate! # round-robin load balancing
17
- Druid::Connection.new(host: broker[:host], port: broker[:port])
16
+ DruidDB::Connection.new(host: broker[:host], port: broker[:port])
18
17
  end
19
18
 
20
19
  def query(query_object)
21
20
  begin
22
21
  response = connection.post(QUERY_PATH, query_object)
23
- rescue Druid::ConnectionError => e
22
+ rescue DruidDB::ConnectionError
23
+ # TODO: Log
24
24
  # TODO: This sucks, make it better
25
25
  (zk.registry["#{config.discovery_path}/druid:broker"].size - 1).times do
26
26
  response = connection.post(QUERY_PATH, query_object)
@@ -1,4 +1,4 @@
1
- module Druid
1
+ module DruidDB
2
2
  module Node
3
3
  class Coordinator
4
4
  DATASOURCES_PATH = '/druid/coordinator/v1/datasources/'.freeze
@@ -12,14 +12,17 @@ module Druid
12
12
  # TODO: DRY; copy/paste from broker
13
13
  def connection
14
14
  coordinator = zk.registry["#{config.discovery_path}/druid:coordinator"].first
15
- raise Druid::ConnectionError, 'no druid coordinators available' if coordinator.nil?
16
- zk.registry["#{config.discovery_path}/druid:coordinator"].rotate! # round-robin load balancing
17
- Druid::Connection.new(host: coordinator[:host], port: coordinator[:port])
15
+ raise DruidDB::ConnectionError, 'no druid coordinators available' if coordinator.nil?
16
+ # round-robin load balancing
17
+ zk.registry["#{config.discovery_path}/druid:coordinator"].rotate!
18
+ DruidDB::Connection.new(host: coordinator[:host], port: coordinator[:port])
18
19
  end
19
20
 
20
21
  def datasource_info(datasource_name)
21
22
  response = connection.get(DATASOURCES_PATH + datasource_name.to_s, full: true)
22
- raise ConnectionError, 'Unable to retrieve datasource information.' unless response.code.to_i == 200
23
+ unless response.code.to_i == 200
24
+ raise ConnectionError, 'Unable to retrieve datasource information.'
25
+ end
23
26
  JSON.parse(response.body)
24
27
  end
25
28
 
@@ -53,7 +56,7 @@ module Druid
53
56
  # TODO: This should either be private or moved to datasource
54
57
  def disable_segments(datasource_name)
55
58
  segments = list_segments(datasource_name)
56
- segments.each{ |segment| disable_segment(datasource_name, segment) }
59
+ segments.each { |segment| disable_segment(datasource_name, segment) }
57
60
  end
58
61
 
59
62
  def issue_kill_task(datasource_name, interval)
@@ -71,7 +74,7 @@ module Druid
71
74
  response = connection.get(DATASOURCES_PATH + datasource_name + '/segments', full: true)
72
75
  case response.code.to_i
73
76
  when 200
74
- JSON.parse(response.body).map{ |segment| segment['identifier'] }
77
+ JSON.parse(response.body).map { |segment| segment['identifier'] }
75
78
  when 204
76
79
  []
77
80
  else
@@ -86,7 +89,7 @@ module Druid
86
89
  attempts = 0
87
90
  max = 10
88
91
 
89
- while(condition) do
92
+ while condition
90
93
  attempts += 1
91
94
  sleep 1
92
95
  condition = datasource_enabled?(datasource_name)
@@ -102,7 +105,7 @@ module Druid
102
105
  attempts = 0
103
106
  max = 60
104
107
 
105
- while(condition) do
108
+ while condition
106
109
  attempts += 1
107
110
  sleep 1
108
111
  condition = datasource_has_segments?(datasource_name)
@@ -1,9 +1,10 @@
1
- module Druid
1
+ module DruidDB
2
2
  module Node
3
3
  class Overlord
4
4
  INDEXER_PATH = '/druid/indexer/v1/'.freeze
5
5
  RUNNING_TASKS_PATH = (INDEXER_PATH + 'runningTasks').freeze
6
- TASK_PATH = INDEXER_PATH + 'task/'
6
+ TASK_PATH = (INDEXER_PATH + 'task/').freeze
7
+ SUPERVISOR_PATH = (INDEXER_PATH + 'supervisor/').freeze
7
8
 
8
9
  attr_reader :config, :zk
9
10
  def initialize(config, zk)
@@ -11,19 +12,19 @@ module Druid
11
12
  @zk = zk
12
13
  end
13
14
 
14
- #TODO: DRY: copy/paste
15
+ # TODO: DRY: copy/paste
15
16
  def connection
16
17
  overlord = zk.registry["#{config.discovery_path}/druid:overlord"].first
17
- raise Druid::ConnectionError, 'no druid overlords available' if overlord.nil?
18
+ raise DruidDB::ConnectionError, 'no druid overlords available' if overlord.nil?
18
19
  zk.registry["#{config.discovery_path}/druid:overlord"].rotate! # round-robin load balancing
19
- Druid::Connection.new(host: overlord[:host], port: overlord[:port])
20
+ DruidDB::Connection.new(host: overlord[:host], port: overlord[:port])
20
21
  end
21
22
 
22
23
  def running_tasks(datasource_name = nil)
23
24
  response = connection.get(RUNNING_TASKS_PATH)
24
25
  raise ConnectionError, 'Could not retrieve running tasks' unless response.code.to_i == 200
25
- tasks = JSON.parse(response.body).map{|task| task['id']}
26
- tasks.select!{ |task| task.include? datasource_name } if datasource_name
26
+ tasks = JSON.parse(response.body).map { |task| task['id'] }
27
+ tasks.select! { |task| task.include? datasource_name } if datasource_name
27
28
  tasks ? tasks : []
28
29
  end
29
30
 
@@ -35,7 +36,20 @@ module Druid
35
36
 
36
37
  def shutdown_tasks(datasource_name = nil)
37
38
  tasks = running_tasks(datasource_name)
38
- tasks.each{|task| shutdown_task(task)}
39
+ tasks.each { |task| shutdown_task(task) }
40
+ end
41
+
42
+ def supervisor_tasks
43
+ response = connection.get(SUPERVISOR_PATH)
44
+ raise ConnectionError, 'Could not retrieve supervisors' unless response.code.to_i == 200
45
+ JSON.parse(response.body)
46
+ end
47
+
48
+ def submit_supervisor_spec(filepath)
49
+ spec = JSON.parse(File.read(filepath))
50
+ response = connection.post(SUPERVISOR_PATH, spec)
51
+ raise ConnectionError, 'Unable to submit spec' unless response.code.to_i == 200
52
+ JSON.parse(response.body)
39
53
  end
40
54
 
41
55
  private
@@ -45,7 +59,7 @@ module Druid
45
59
  attempts = 0
46
60
  max = 10
47
61
 
48
- until(condition) do
62
+ until condition
49
63
  attempts += 1
50
64
  sleep 1
51
65
  condition = !(running_tasks.include? task)
@@ -1,10 +1,10 @@
1
- module Druid
1
+ module DruidDB
2
2
  module Queries
3
3
  module Core
4
4
  delegate :write_point, to: :writer
5
5
 
6
6
  def query(opts)
7
- Druid::Query.create(opts.merge(broker: broker))
7
+ DruidDB::Query.create(opts.merge(broker: broker))
8
8
  end
9
9
  end
10
10
  end
data/lib/druiddb/queries/datasources.rb ADDED
@@ -0,0 +1,7 @@
1
+ module DruidDB
2
+ module Queries
3
+ module Datasources
4
+ delegate :list_datasources, to: :coordinator
5
+ end
6
+ end
7
+ end
data/lib/druiddb/queries/task.rb ADDED
@@ -0,0 +1,10 @@
1
+ module DruidDB
2
+ module Queries
3
+ module Task
4
+ delegate :shutdown_tasks,
5
+ :supervisor_tasks,
6
+ :submit_supervisor_spec,
7
+ to: :overlord
8
+ end
9
+ end
10
+ end
@@ -1,4 +1,4 @@
1
- module Druid
1
+ module DruidDB
2
2
  class Query
3
3
  attr_reader :aggregations,
4
4
  :broker,
@@ -13,7 +13,7 @@ module Druid
13
13
  :start_interval
14
14
 
15
15
  def initialize(opts)
16
- @aggregations = opts[:aggregations].map{|agg| agg[:name]}
16
+ @aggregations = opts[:aggregations].map { |agg| agg[:name] }
17
17
  @broker = opts[:broker]
18
18
  @dimensions = opts[:dimensions]
19
19
  @fill_value = opts[:fill_value]
@@ -57,7 +57,7 @@ module Druid
57
57
  when 'year'
58
58
  time.advance(years: 1)
59
59
  else
60
- raise Druid::QueryError, 'Unsupported granularity'
60
+ raise DruidDB::QueryError, 'Unsupported granularity'
61
61
  end
62
62
  end
63
63
 
@@ -74,9 +74,8 @@ module Druid
74
74
  interval = start_interval
75
75
  result = []
76
76
 
77
- while interval <= end_interval do
78
- # TODO:
79
- # This will search the points every time, could be more performant if
77
+ while interval <= end_interval
78
+ # TODO: This will search the points every time, could be more performant if
80
79
  # we track the 'current point' in the points and only compare the
81
80
  # current point's timestamp
82
81
  point = find_or_create_point(interval, points)
@@ -99,13 +98,13 @@ module Druid
99
98
  return query_result unless query_result.present? && fill_value.present?
100
99
  parse_result_key(query_result.first)
101
100
 
102
- #TODO: handle multi-dimensional group by
101
+ # TODO: handle multi-dimensional group by
103
102
  if group_by?
104
103
  result = []
105
104
  dimension_key = dimensions.first
106
- groups = query_result.group_by{ |point| point[result_key][dimension_key] }
105
+ groups = query_result.group_by { |point| point[result_key][dimension_key] }
107
106
  groups.each do |dimension_value, dimension_points|
108
- result += fill_empty_intervals(dimension_points, { dimension_key => dimension_value })
107
+ result += fill_empty_intervals(dimension_points, dimension_key => dimension_value)
109
108
  end
110
109
  result
111
110
  else
@@ -114,7 +113,7 @@ module Druid
114
113
  end
115
114
 
116
115
  def find_or_create_point(interval, points)
117
- point = points.find{ |point| point['timestamp'].to_s.to_time == interval.to_time }
116
+ point = points.find { |p| p['timestamp'].to_s.to_time == interval.to_time }
118
117
  point.present? ? point : { 'timestamp' => interval.iso8601(3), result_key => {} }
119
118
  end
120
119
 
@@ -151,10 +150,10 @@ module Druid
151
150
  when 'minute'
152
151
  time.beginning_of_minute
153
152
  when 'fifteen_minute'
154
- first_fifteen = [45, 30, 15, 0].detect{ |m| m <= time.min }
153
+ first_fifteen = [45, 30, 15, 0].detect { |m| m <= time.min }
155
154
  time.change(min: first_fifteen)
156
155
  when 'thirty_minute'
157
- first_thirty = [30, 0].detect{ |m| m <= time.min }
156
+ first_thirty = [30, 0].detect { |m| m <= time.min }
158
157
  time.change(min: first_thirty)
159
158
  when 'hour'
160
159
  time.beginning_of_hour
data/lib/druiddb/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module DruidDB
2
+ VERSION = '1.2.0'.freeze
3
+ end
@@ -1,5 +1,4 @@
1
- #TODO: Seems to be a delay after shutting down Kafka and ZK updating
2
- module Druid
1
+ module DruidDB
3
2
  class Writer
4
3
  attr_reader :config, :producer, :zk
5
4
  def initialize(config, zk)
@@ -10,28 +9,28 @@ module Druid
10
9
  end
11
10
 
12
11
  def write_point(datasource, datapoint)
13
- raise Druid::ConnectionError, 'no kafka brokers available' if producer.nil?
14
- producer.produce(datapoint, topic: datasource)
12
+ raise DruidDB::ConnectionError, 'no kafka brokers available' if producer.nil?
13
+ producer.produce(datapoint.to_json, topic: datasource)
15
14
  end
16
15
 
17
16
  private
18
17
 
19
18
  def broker_list
20
- zk.registry["/brokers/ids"].map{|instance| "#{instance[:host]}:#{instance[:port]}" }.join(',')
19
+ zk.registry['/brokers/ids'].map { |instance| broker_name(instance) }.join(',')
20
+ end
21
+
22
+ def broker_name(instance)
23
+ "#{instance[:host]}:#{instance[:port]}"
21
24
  end
22
25
 
23
26
  def handle_kafka_state_change(service)
24
- if service == config.kafka_broker_path
25
- producer.shutdown
26
- init_producer
27
- end
27
+ return unless service == config.kafka_broker_path
28
+ producer.shutdown
29
+ init_producer
28
30
  end
29
31
 
30
32
  def init_producer
31
- producer_options = {
32
- seed_brokers: broker_list,
33
- client_id: "ruby-druid"
34
- }
33
+ producer_options = { seed_brokers: broker_list, client_id: config.client_id }
35
34
 
36
35
  if broker_list.present?
37
36
  kafka = Kafka.new(producer_options)
@@ -1,9 +1,8 @@
1
- module Druid
1
+ module DruidDB
2
2
  class ZK
3
3
  attr_accessor :registry
4
4
  attr_reader :client, :config, :listeners
5
5
 
6
- #TODO: Test and handle ZK partitions
7
6
  def initialize(config)
8
7
  @client = ::ZK.new(config.zookeeper)
9
8
  @config = config
@@ -19,7 +18,6 @@ module Druid
19
18
  private
20
19
 
21
20
  def announce(service)
22
- # puts "announcing #{service}"
23
21
  listeners.each { |listener| listener.call(service) }
24
22
  end
25
23
 
@@ -27,34 +25,28 @@ module Druid
27
25
  register_service("#{config.discovery_path}/druid:broker")
28
26
  register_service("#{config.discovery_path}/druid:coordinator")
29
27
  register_service("#{config.discovery_path}/druid:overlord")
30
- register_service("#{config.kafka_broker_path}")
28
+ register_service(config.kafka_broker_path.to_s)
31
29
  end
32
30
 
33
31
  def register_service(service)
34
- # puts "registering #{service}"
35
- #TODO: Thead safety, lock this registry key
36
32
  subscribe_to_service(service)
37
33
  renew_service_instances(service)
38
34
  end
39
35
 
40
36
  def renew_service_instances(service)
41
- # puts "activating registered subscriptions on #{service}"
42
37
  instances = client.children(service, watch: true)
43
38
 
44
- # puts "emptying #{service} from registry"
45
39
  registry[service] = []
46
40
  instances.each do |instance|
47
41
  data = JSON.parse(client.get("#{service}/#{instance}").first)
48
42
  host = data['address'] || data['host']
49
43
  port = data['port']
50
- # puts "adding #{host}:#{port} to registry for #{service}"
51
44
  registry[service] << { host: host, port: port }
52
45
  end
53
46
  end
54
47
 
55
48
  def subscribe_to_service(service)
56
- subscription = client.register(service) do |event|
57
- # puts "watched event for #{service} detected"
49
+ client.register(service) do |event|
58
50
  renew_service_instances(event.path)
59
51
  announce(event.path)
60
52
  end
metadata CHANGED
@@ -1,27 +1,27 @@
  --- !ruby/object:Gem::Specification
  name: druiddb
  version: !ruby/object:Gem::Version
- version: 1.0.1
+ version: 1.2.0
  platform: ruby
  authors:
  - Andre LeBlanc
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2017-07-07 00:00:00.000000000 Z
+ date: 2017-08-23 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activesupport
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - ">"
  - !ruby/object:Gem::Version
  version: '4.0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - ">"
  - !ruby/object:Gem::Version
  version: '4.0'
  - !ruby/object:Gem::Dependency
@@ -80,8 +80,21 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: '10.0'
- description: Ruby adapter for Druid that allows reads and writes using the Tranquility
- Kafka API.
+ - !ruby/object:Gem::Dependency
+ name: rspec
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '3.6'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '3.6'
+ description: Ruby client for reading from and writing to Druid.
  email:
  - andre.leblanc88@gmail.com
  executables: []
@@ -89,29 +102,36 @@ extensions: []
  extra_rdoc_files: []
  files:
  - ".gitignore"
+ - ".rspec"
+ - ".rubocop.yml"
+ - ".rubocop_todo.yml"
+ - ".travis.yml"
+ - Dockerfile
  - Gemfile
  - LICENSE.txt
  - README.md
  - Rakefile
  - bin/console
+ - bin/run_tests.sh
  - bin/setup
+ - docker-compose.yml
  - druiddb.gemspec
- - lib/druid/README.md
- - lib/druid/client.rb
- - lib/druid/configuration.rb
- - lib/druid/connection.rb
- - lib/druid/errors.rb
- - lib/druid/node/broker.rb
- - lib/druid/node/coordinator.rb
- - lib/druid/node/overlord.rb
- - lib/druid/queries/core.rb
- - lib/druid/queries/task.rb
- - lib/druid/query.rb
- - lib/druid/version.rb
- - lib/druid/writer.rb
- - lib/druid/zk.rb
  - lib/druiddb.rb
- homepage: https://github.com/andremleblanc/druiddb
+ - lib/druiddb/client.rb
+ - lib/druiddb/configuration.rb
+ - lib/druiddb/connection.rb
+ - lib/druiddb/errors.rb
+ - lib/druiddb/node/broker.rb
+ - lib/druiddb/node/coordinator.rb
+ - lib/druiddb/node/overlord.rb
+ - lib/druiddb/queries/core.rb
+ - lib/druiddb/queries/datasources.rb
+ - lib/druiddb/queries/task.rb
+ - lib/druiddb/query.rb
+ - lib/druiddb/version.rb
+ - lib/druiddb/writer.rb
+ - lib/druiddb/zk.rb
+ homepage: https://github.com/andremleblanc/druiddb-ruby
  licenses:
  - MIT
  metadata: {}
@@ -134,5 +154,5 @@ rubyforge_project:
  rubygems_version: 2.6.12
  signing_key:
  specification_version: 4
- summary: Ruby adapter for Druid.
+ summary: Ruby client for Druid.
  test_files: []
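One substantive change in the metadata diff above is the activesupport requirement tightening from `>= 4.0` to `> 4.0`, which excludes activesupport 4.0 itself while continuing to accept every later release. The stock `Gem::Requirement` class that RubyGems uses to resolve these constraints can demonstrate the difference; nothing below is part of druiddb, it only illustrates the semantics of the changed operator:

```ruby
require 'rubygems' # provides Gem::Requirement and Gem::Version

old_req = Gem::Requirement.new('>= 4.0') # constraint in druiddb 1.0.1
new_req = Gem::Requirement.new('> 4.0')  # constraint in druiddb 1.2.0

v4_0   = Gem::Version.new('4.0')
v4_0_1 = Gem::Version.new('4.0.1')

puts old_req.satisfied_by?(v4_0)   # => true  (4.0 itself was allowed)
puts new_req.satisfied_by?(v4_0)   # => false (4.0 itself is now excluded)
puts new_req.satisfied_by?(v4_0_1) # => true  (anything above 4.0 still satisfies)
```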
data/lib/druid/README.md DELETED
@@ -1,20 +0,0 @@
- # Druid
- This module contains all logic associated with Druid.
-
- ## Node
- The `Node` classes represent Druid nodes and manage connection with them. They
- also provide the methods that are exposed natively by the Druid REST API.
-
- ## Query
- The query module provides a way for the `Druid::Client` to inherit the methods
- from the `Node` classes. Additionally, the `Query` module classes provide some
- additional methods not found natively in the Druid REST API.
-
- ## Writer
- The `Writer` classes utilize the Tranquility Kafka API to communicate with Druid
- nodes and allows writing.
-
- ## Errors
- **Client Error:** Indicates a failure within the Ruby-Druid adapter.
- **Connection Error:** Indicates a failed request to Druid.
- **QueryError:** Indicates a malformed query.
data/lib/druid/client.rb DELETED
@@ -1,22 +0,0 @@
- module Druid
- class Client
- include Druid::Queries::Core
- include Druid::Queries::Task
-
- attr_reader :broker,
- :config,
- :coordinator,
- :overlord,
- :writer,
- :zk
-
- def initialize(options = {})
- @config = Druid::Configuration.new(options)
- @zk = Druid::ZK.new(config)
- @broker = Druid::Node::Broker.new(config, zk)
- @coordinator = Druid::Node::Coordinator.new(config, zk)
- @overlord = Druid::Node::Overlord.new(config, zk)
- @writer = Druid::Writer.new(config, zk)
- end
- end
- end
data/lib/druid/queries/task.rb DELETED
@@ -1,7 +0,0 @@
- module Druid
- module Queries
- module Task
- delegate :shutdown_tasks, to: :overlord
- end
- end
- end
data/lib/druid/version.rb DELETED
@@ -1,3 +0,0 @@
- module Druiddb
- VERSION = '1.0.1'
- end