clickhouse 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: c0be7eb44c6eed3df458e031647a9cad1bc40181
4
+ data.tar.gz: 724c2dbce96c466eaad983d720f7da01a899571d
5
+ SHA512:
6
+ metadata.gz: 2c2e727bbcffefa01ae315b19cddbddf6050afefaf82f8b5379496d79ba7add71e5aa0fed8b977c2d8fd012a12affbf7179654cd3fe61fac99d49222fb9820b7
7
+ data.tar.gz: 5893a3e5919b5d19779f4ffd35f3858a10b33e6e2b7eb9510501d9c7e528be01c256890c53d92d329856e0b85fcb36d9a7398a21840c2472df1939a1fb9345d2
data/.gitignore ADDED
@@ -0,0 +1,9 @@
1
+ .DS_Store
2
+ .bundle
3
+ .env
4
+ .rvmrc
5
+ Gemfile.lock
6
+ coverage
7
+ doc
8
+ pkg
9
+ test/coverage
data/.travis.yml ADDED
@@ -0,0 +1,5 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.3.1
4
+ - 2.1.2
5
+ - 2.0.0
data/CHANGELOG.md ADDED
@@ -0,0 +1,5 @@
1
+ ## Clickhouse CHANGELOG
2
+
3
+ ### Version 0.1.0 (October 18, 2016)
4
+
5
+ * Initial release
data/Gemfile ADDED
@@ -0,0 +1,3 @@
1
+ source "https://rubygems.org"
2
+
3
+ gemspec
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2016 Paul Engel
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,172 @@
1
+ # Clickhouse [![Build Status](https://travis-ci.org/archan937/clickhouse.svg?branch=master)](https://travis-ci.org/archan937/clickhouse) [![Code Climate](https://codeclimate.com/github/archan937/clickhouse/badges/gpa.svg)](https://codeclimate.com/github/archan937/clickhouse) [![Test Coverage](https://codeclimate.com/github/archan937/clickhouse/badges/coverage.svg)](https://codeclimate.com/github/archan937/clickhouse/coverage) [![Gem](https://img.shields.io/gem/v/clickhouse.svg)](https://rubygems.org/gems/clickhouse) [![Gem](https://img.shields.io/gem/dt/clickhouse.svg)](https://rubygems.org/gems/clickhouse)
2
+
3
+ A Ruby database driver for Clickhouse.
4
+
5
+ ## Introduction
6
+
7
+ [Clickhouse](https://clickhouse.yandex) is a high-performance column-oriented database management system developed by [Yandex](https://yandex.com/company), which operates Russia's most popular search engine.
8
+
9
+ > ClickHouse manages extremely large volumes of data in a stable and sustainable manner. It currently powers Yandex.Metrica, world’s second largest web analytics platform, with over 13 trillion database records and over 20 billion events a day, generating customized reports on-the-fly, directly from non-aggregated data. This system was successfully implemented at CERN’s LHCb experiment to store and process metadata on 10bn events with over 1000 attributes per event registered in 2011.
10
+
11
+ On June 15th 2016, [Yandex open-sourced their awesome project](https://news.ycombinator.com/item?id=11908254) giving the community a [powerful asset](https://clickhouse.yandex/benchmark.html) which can compete with the big players like [Google BigQuery](https://cloud.google.com/bigquery/) and [Amazon Redshift](http://docs.aws.amazon.com/redshift/latest/mgmt/welcome.html) with an important advantage: the client can use ClickHouse in its infrastructure and does not have to pay for the cloud ([read more](https://translate.google.com/translate?sl=ru&tl=en&js=y&prev=_t&hl=en&ie=UTF-8&u=https://habrahabr.ru/company/yandex/blog/303282/)).
12
+
13
+ ### Why use the HTTP interface and not the TCP interface?
14
+
15
+ Well, the developers of Clickhouse themselves [discourage](https://github.com/yandex/ClickHouse/issues/45#issuecomment-231194134) using the TCP interface.
16
+
17
+ > TCP transport is more specific, we don't want to expose details.
18
+ Despite we have full compatibility of protocol of different versions of client and server, we want to keep the ability to "break" it for very old clients. And that protocol is not too clean to make a specification.
19
+
20
+ ## Installation
21
+
22
+ Run the following command to install `Clickhouse`:
23
+
24
+ $ gem install "clickhouse"
25
+
26
+ ## Usage
27
+
28
+ ### Quick start
29
+
30
+ Require the Clickhouse gem.
31
+
32
+ ```ruby
33
+ require "clickhouse"
34
+ ```
35
+
36
+ Set up the logging output.
37
+
38
+ ```ruby
39
+ require "logger"
40
+ Clickhouse.logger = Logger.new(STDOUT)
41
+ ```
42
+
43
+ Establish the connection with the Clickhouse server (using the default config).
44
+
45
+ ```ruby
46
+ Clickhouse.establish_connection
47
+ => true
48
+ ```
49
+
50
+ List databases and tables.
51
+
52
+ ```ruby
53
+ Clickhouse.connection.databases
54
+ I, [2016-10-17T22:54:26.587401 #81829] INFO -- :
55
+ SQL (64.0ms) SHOW DATABASES;
56
+ => ["default", "system"]
57
+
58
+ Clickhouse.connection.tables
59
+ I, [2016-10-17T22:54:51.454012 #81829] INFO -- :
60
+ SQL (61.7ms) SHOW TABLES;
61
+ => []
62
+ ```
63
+
64
+ Create tables.
65
+
66
+ ```ruby
67
+ Clickhouse.connection.create_table("events") do |t|
68
+ t.fixed_string :id, 16
69
+ t.uint16 :year
70
+ t.date :date
71
+ t.date_time :time
72
+ t.string :event
73
+ t.uint32 :user_id
74
+ t.float32 :revenue
75
+ t.engine "MergeTree(date, (year, date), 8192)"
76
+ end
77
+ => true
78
+
79
+ Clickhouse.connection.query "DESCRIBE TABLE events" # or Clickhouse.connection.describe_table "events"
80
+ => #<Clickhouse::Connection::Query::ResultSet:0x007fa9ac137010
81
+ @names=["name", "type", "default_type", "default_expression"],
82
+ @rows=
83
+ [["id", "FixedString(16)", nil, nil],
84
+ ["year", "UInt16", nil, nil],
85
+ ["date", "Date", nil, nil],
86
+ ["time", "DateTime", nil, nil],
87
+ ["event", "String", nil, nil],
88
+ ["user_id", "UInt32", nil, nil],
89
+ ["revenue", "Float32", nil, nil]],
90
+ @types=["String", "String", "String", "String"]>
91
+ ```
92
+
93
+ Insert data.
94
+
95
+ ```ruby
96
+ Clickhouse.connection.insert_rows("events", :names => %w(id year date time event user_id revenue)) do |rows|
97
+ rows << [
98
+ "d91d1c90",
99
+ 2016,
100
+ "2016-10-17",
101
+ "2016-10-17 23:14:28",
102
+ "click",
103
+ 1982,
104
+ 0.18
105
+ ]
106
+ rows << [
107
+ "d91d2294",
108
+ 2016,
109
+ "2016-10-17",
110
+ "2016-10-17 23:14:41",
111
+ "click",
112
+ 1947,
113
+ 0.203
114
+ ]
115
+ end
116
+ => true
117
+ ```
118
+
119
+ Query data.
120
+
121
+ ```ruby
122
+ Clickhouse.connection.count :from => "events"
123
+ I, [2016-10-17T23:19:45.592602 #82196] INFO -- :
124
+ SQL (65.4ms) SELECT COUNT(*)
125
+ FROM events;
126
+ => 2
127
+
128
+ Clickhouse.connection.select_row :select => "COUNT(*), year, date, avg(revenue)", :from => "events", :group => "year, date"
129
+ I, [2016-10-17T23:22:47.340232 #82196] INFO -- :
130
+ SQL (67.7ms) SELECT COUNT(*), year, date, avg(revenue)
131
+ FROM events
132
+ GROUP BY year, date;
133
+ => [2, 2016, #<Date: 2016-10-17 ((2457679j,0s,0n),+0s,2299161j)>, 0.1915000081062317]
134
+ ```
135
+
136
+ ### Check out the tests
137
+
138
+ To see what else the `Clickhouse` gem has to offer, take a look at the unit tests ([test/unit/connection/test_query.rb](https://github.com/archan937/clickhouse/blob/master/test/unit/connection/test_query.rb), for instance).
139
+
140
+ ## Using the console
141
+
142
+ The `Clickhouse` repo ships with a `script/console` file that you can use for development and testing. Note that it requires a running Clickhouse server.
143
+
144
+ ## Testing
145
+
146
+ Run the following command for testing:
147
+
148
+ $ rake
149
+
150
+ You can also run a single test file:
151
+
152
+ $ ruby test/unit/connection/test_query.rb
153
+
154
+ ## Contact me
155
+
156
+ For support, remarks and requests, please mail me at [pm_engel@icloud.com](mailto:pm_engel@icloud.com).
157
+
158
+ ## TODO
159
+
160
+ * Support cluster connections
161
+
162
+ ## License
163
+
164
+ Copyright (c) 2016 Paul Engel, released under the MIT license
165
+
166
+ http://github.com/archan937 – http://twitter.com/archan937 – pm_engel@icloud.com
167
+
168
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
169
+
170
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
171
+
172
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Rakefile ADDED
@@ -0,0 +1,15 @@
1
+ #!/usr/bin/env rake
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rake/testtask"
5
+
6
+ task :default => :test
7
+
8
+ desc "Run tests and report test coverage to Code Climate"
9
+ task :report do
10
+ exec "REPORT=1 rake"
11
+ end
12
+
13
+ Rake::TestTask.new do |test|
14
+ test.pattern = "test/**/test_*.rb"
15
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.1.0
@@ -0,0 +1,27 @@
1
+ # -*- encoding: utf-8 -*-
2
+ require File.expand_path("../lib/clickhouse/version", __FILE__)
3
+
4
+ Gem::Specification.new do |gem|
5
+ gem.authors = ["Paul Engel"]
6
+ gem.email = ["pm_engel@icloud.com"]
7
+ gem.summary = %q{A Ruby database driver for Clickhouse}
8
+ gem.description = %q{A Ruby database driver for Clickhouse}
9
+ gem.homepage = "https://github.com/archan937/clickhouse"
10
+
11
+ gem.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
12
+ gem.files = `git ls-files`.split("\n")
13
+ gem.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
14
+ gem.name = "clickhouse"
15
+ gem.require_paths = ["lib"]
16
+ gem.version = Clickhouse::VERSION
17
+
18
+ gem.add_dependency "faraday"
19
+
20
+ gem.add_development_dependency "rake"
21
+ gem.add_development_dependency "pry"
22
+ gem.add_development_dependency "dotenv"
23
+ gem.add_development_dependency "codeclimate-test-reporter"
24
+ gem.add_development_dependency "simplecov"
25
+ gem.add_development_dependency "minitest"
26
+ gem.add_development_dependency "mocha"
27
+ end
data/lib/clickhouse.rb ADDED
@@ -0,0 +1,42 @@
1
+ require "forwardable"
2
+ require "csv"
3
+
4
+ require "faraday"
5
+
6
+ require "clickhouse/connection"
7
+ require "clickhouse/error"
8
+ require "clickhouse/version"
9
+
10
+ module Clickhouse
11
+
12
+ def self.logger=(logger)
13
+ @logger = logger
14
+ end
15
+
16
+ def self.logger
17
+ @logger if instance_variables.include?(:@logger)
18
+ end
19
+
20
+ def self.configurations=(configurations)
21
+ @configurations = configurations.inject({}){|h, (k, v)| h[k.to_s] = v; h}
22
+ end
23
+
24
+ def self.configurations
25
+ @configurations if instance_variables.include?(:@configurations)
26
+ end
27
+
28
+ def self.establish_connection(arg = {})
29
+ config = arg.is_a?(Hash) ? arg : (configurations || {})[arg.to_s]
30
+ if config
31
+ @connection = Connection.new(config)
32
+ @connection.connect!
33
+ else
34
+ raise InvalidConnectionError, "Invalid connection specified: #{arg.inspect}"
35
+ end
36
+ end
37
+
38
+ def self.connection
39
+ @connection if instance_variables.include?(:@connection)
40
+ end
41
+
42
+ end
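The lookup in `Clickhouse.establish_connection` can be tried without a server. The sketch below restates only the key normalization and the name-or-Hash resolution from the hunk above; `stringify_keys` and `resolve_config` are hypothetical helper names that do not exist in the gem, and the host name is made up.

```ruby
# Hypothetical helpers mirroring Clickhouse.configurations= and the first
# line of Clickhouse.establish_connection. No connection is opened here.
def stringify_keys(configurations)
  configurations.each_with_object({}) { |(key, value), hash| hash[key.to_s] = value }
end

def resolve_config(arg, configurations)
  # A Hash is used as the config itself; anything else is treated as a name.
  arg.is_a?(Hash) ? arg : (configurations || {})[arg.to_s]
end

configurations = stringify_keys(:production => { :host => "ch.example.com", :port => 8123 })

by_symbol = resolve_config(:production, configurations)   # found via "production"
by_string = resolve_config("production", configurations)  # same entry
inline    = resolve_config({ :host => "localhost" }, configurations)
missing   = resolve_config(:staging, configurations)      # nil

puts by_symbol.inspect
puts missing.inspect
```

In the gem itself, the `nil` case is what raises `InvalidConnectionError`.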
@@ -0,0 +1,23 @@
1
+ require "clickhouse/connection/client"
2
+ require "clickhouse/connection/logger"
3
+ require "clickhouse/connection/query"
4
+
5
+ module Clickhouse
6
+ class Connection
7
+
8
+ include Client
9
+ include Logger
10
+ include Query
11
+
12
+ def initialize(config = {})
13
+ @config = {
14
+ :scheme => "http",
15
+ :host => "localhost",
16
+ :port => 8123
17
+ }.merge(
18
+ config.inject({}){|h, (k, v)| h[k.to_sym] = v; h}
19
+ )
20
+ end
21
+
22
+ end
23
+ end
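As a rough illustration of the defaults handling in `Connection#initialize`, the following sketch merges caller options over the same three defaults after converting keys to symbols. `DEFAULTS` and `build_config` are names assumed for this example only, and the host and database values are invented.

```ruby
# Assumed names for illustration; the gem keeps this logic inline in
# Connection#initialize and stores the result in @config.
DEFAULTS = { :scheme => "http", :host => "localhost", :port => 8123 }

def build_config(config = {})
  symbolized = config.each_with_object({}) { |(key, value), hash| hash[key.to_sym] = value }
  DEFAULTS.merge(symbolized)
end

# String and symbol keys can be mixed; both end up as symbols.
config = build_config("host" => "ch.internal", :database => "analytics")
url = "#{config[:scheme]}://#{config[:host]}:#{config[:port]}"

puts config.inspect
puts url
```

Because keys are symbolized before the merge, a config loaded from YAML with string keys overrides the defaults just like a literal Ruby hash does.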
@@ -0,0 +1,65 @@
1
+ module Clickhouse
2
+ class Connection
3
+ module Client
4
+
5
+ def connect!
6
+ return if connected?
7
+ ensure_authentication
8
+ ping!
9
+ end
10
+
11
+ def connected?
12
+ instance_variables.include?(:@client) && !!@client
13
+ end
14
+
15
+ def get(query)
16
+ request(:get, query)
17
+ end
18
+
19
+ def post(query, body = nil)
20
+ request(:post, query, body)
21
+ end
22
+
23
+ private
24
+
25
+ def url
26
+ "#{@config[:scheme]}://#{@config[:host]}:#{@config[:port]}"
27
+ end
28
+
29
+ def path(query)
30
+ database = "database=#{@config[:database]}&" if @config[:database]
31
+ "/?#{database}query=#{CGI.escape(query)}"
32
+ end
33
+
34
+ def client
35
+ @client ||= Faraday.new(:url => url)
36
+ end
37
+
38
+ def ensure_authentication
39
+ username, password = @config.values_at(:username, :password)
40
+ client.basic_auth(username || "default", password) if username || password
41
+ end
42
+
43
+ def ping!
44
+ status = client.get("/").status
45
+ if status != 200
46
+ raise ConnectionError, "Unexpected response status: #{status}"
47
+ end
48
+ true
49
+ rescue Faraday::ConnectionFailed => e
50
+ raise ConnectionError, e.message
51
+ end
52
+
53
+ def request(method, query, body = nil)
54
+ connect!
55
+ query = query.to_s.strip
56
+ start = Time.now
57
+ client.send(method, path(query), body).tap do |response|
58
+ log :info, "\n SQL (#{((Time.now - start) * 1000).round(1)}ms) #{query.gsub(/( FORMAT \w+|;$)/, "")};"
59
+ raise QueryError, response.body unless response.status == 200
60
+ end
61
+ end
62
+
63
+ end
64
+ end
65
+ end
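The private `path` helper above puts the whole SQL statement into the URL's query string. A standalone approximation follows, with `build_path` as a hypothetical name; it takes the database as an argument instead of reading `@config`. Like the helper in the diff, it escapes the query but not the database name.

```ruby
require "cgi"

# Hypothetical standalone version of Client#path.
def build_path(query, database = nil)
  prefix = database ? "database=#{database}&" : ""
  "/?#{prefix}query=#{CGI.escape(query)}"
end

puts build_path("SELECT 1")
# => /?query=SELECT+1
puts build_path("SHOW TABLES", "analytics")
# => /?database=analytics&query=SHOW+TABLES
```

Because the statement travels in the URL even for `POST` requests, very long statements may run into URL length limits on proxies or servers; only the body passed to `post` (the CSV data for inserts) is sent as the request body.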
@@ -0,0 +1,12 @@
1
+ module Clickhouse
2
+ class Connection
3
+ module Logger
4
+ private
5
+
6
+ def log(type, msg)
7
+ Clickhouse.logger.send(type, msg) if Clickhouse.logger
8
+ end
9
+
10
+ end
11
+ end
12
+ end
@@ -0,0 +1,160 @@
1
+ require "clickhouse/connection/query/table"
2
+ require "clickhouse/connection/query/result_set"
3
+ require "clickhouse/connection/query/result_row"
4
+
5
+ module Clickhouse
6
+ class Connection
7
+ module Query
8
+
9
+ def execute(query, body = nil)
10
+ body = post(query, body).body.to_s
11
+ body.empty? ? true : body
12
+ end
13
+
14
+ def query(query)
15
+ query = query.to_s.gsub(/(;|\bFORMAT \w+)/i, "").strip
16
+ query += " FORMAT TabSeparatedWithNamesAndTypes"
17
+ parse_response get(query).body.to_s
18
+ end
19
+
20
+ def databases
21
+ query("SHOW DATABASES").flatten
22
+ end
23
+
24
+ def tables
25
+ query("SHOW TABLES").flatten
26
+ end
27
+
28
+ def create_table(name, &block)
29
+ execute(Clickhouse::Connection::Query::Table.new(name, &block).to_sql)
30
+ end
31
+
32
+ def describe_table(name)
33
+ query("DESCRIBE TABLE #{name}").to_a
34
+ end
35
+
36
+ def rename_table(*args)
37
+ names = (args[0].is_a?(Hash) ? args[0].to_a : [args]).flatten
38
+ raise Clickhouse::InvalidQueryError, "Odd number of table names" unless (names.size % 2) == 0
39
+ names = Hash[*names].collect{|(from, to)| "#{from} TO #{to}"}
40
+ execute "RENAME TABLE #{names.join(", ")}"
41
+ end
42
+
43
+ def drop_table(name)
44
+ execute "DROP TABLE #{name}"
45
+ end
46
+
47
+ def insert_rows(table, options = {})
48
+ options[:csv] ||= begin
49
+ options[:rows] ||= yield([])
50
+ generate_csv options[:rows], options[:names]
51
+ end
52
+ execute "INSERT INTO #{table} FORMAT CSVWithNames", options[:csv]
53
+ end
54
+
55
+ def select_rows(options)
56
+ query to_select_query(options)
57
+ end
58
+
59
+ def select_row(options)
60
+ select_rows(options)[0]
61
+ end
62
+
63
+ def select_values(options)
64
+ select_rows(options).collect{|row| row[0]}
65
+ end
66
+
67
+ def select_value(options)
68
+ values = select_values(options)
69
+ values[0] if values
70
+ end
71
+
72
+ def count(options)
73
+ select_value options.merge(:select => "COUNT(*)")
74
+ end
75
+
76
+ private
77
+
78
+ def generate_csv(rows, names = nil)
79
+ hashes = rows[0].is_a?(Hash)
80
+
81
+ if hashes
82
+ names ||= rows[0].keys
83
+ end
84
+
85
+ CSV.generate do |csv|
86
+ csv << names if names
87
+ rows.each do |row|
88
+ csv << (hashes ? row.values_at(*names) : row)
89
+ end
90
+ end
91
+ end
92
+
93
+ def inspect_value(value)
94
+ value.nil? ? "NULL" : value.inspect.gsub(/(^"|"$)/, "'").gsub("\\\"", "\"")
95
+ end
96
+
97
+ def to_select_options(options)
98
+ keys = [:select, :from, :where, :group, :having, :order, :limit, :offset]
99
+
100
+ options = Hash[keys.zip(options.values_at(*keys))]
101
+ options[:select] ||= "*"
102
+ options[:limit] ||= 0 if options[:offset]
103
+ options[:limit] = options.values_at(:offset, :limit).compact.join(", ") if options[:limit]
104
+ options.delete(:offset)
105
+
106
+ options
107
+ end
108
+
109
+ def to_segment(type, value)
110
+ case type
111
+ when :select
112
+ [value].flatten.join(", ")
113
+ when :where, :having
114
+ to_condition_statements(value)
115
+ else
116
+ value
117
+ end
118
+ end
119
+
120
+ def to_condition_statements(value)
121
+ value.collect do |attr, val|
122
+ if val == :empty
123
+ "empty(#{attr})"
124
+ elsif val.is_a?(Range)
125
+ [
126
+ "#{attr} >= #{inspect_value(val.first)}",
127
+ "#{attr} <= #{inspect_value(val.last)}"
128
+ ]
129
+ elsif val.is_a?(Array)
130
+ "#{attr} IN (#{val.collect{|x| inspect_value(x)}.join(", ")})"
131
+ elsif val.to_s.match(/^`.*`$/)
132
+ "#{attr} #{val.gsub(/(^`|`$)/, "")}"
133
+ else
134
+ "#{attr} = #{inspect_value(val)}"
135
+ end
136
+ end.flatten.join(" AND ")
137
+ end
138
+
139
+ def to_select_query(options)
140
+ to_select_options(options).collect do |(key, value)|
141
+ next if value.nil? || (value.respond_to?(:empty?) && value.empty?)
142
+
143
+ statement = [key.to_s.upcase]
144
+ statement << "BY" if %W(GROUP ORDER).include?(statement[0])
145
+ statement << to_segment(key, value)
146
+ statement.join(" ")
147
+
148
+ end.compact.join("\n").force_encoding("UTF-8")
149
+ end
150
+
151
+ def parse_response(response)
152
+ rows = CSV.parse response, :col_sep => "\t"
153
+ names = rows.shift
154
+ types = rows.shift
155
+ ResultSet.new rows, names, types
156
+ end
157
+
158
+ end
159
+ end
160
+ end
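To make the `:where` / `:having` handling easier to follow, here is a standalone restatement of `inspect_value` and `to_condition_statements` from the hunk above, lifted out of the module so it can be run without the gem. The condition values are invented for the example.

```ruby
# Copied in spirit from the private helpers above; defined at top level
# here only so the sketch is runnable on its own.
def inspect_value(value)
  value.nil? ? "NULL" : value.inspect.gsub(/(^"|"$)/, "'").gsub("\\\"", "\"")
end

def to_condition_statements(conditions)
  conditions.collect do |attr, val|
    if val == :empty
      "empty(#{attr})"                       # :empty maps to the empty() function
    elsif val.is_a?(Range)
      ["#{attr} >= #{inspect_value(val.first)}",
       "#{attr} <= #{inspect_value(val.last)}"]
    elsif val.is_a?(Array)
      "#{attr} IN (#{val.collect { |x| inspect_value(x) }.join(", ")})"
    elsif val.to_s.match(/^`.*`$/)
      "#{attr} #{val.gsub(/(^`|`$)/, "")}"   # backticks mark a raw SQL fragment
    else
      "#{attr} = #{inspect_value(val)}"
    end
  end.flatten.join(" AND ")
end

where = to_condition_statements(
  :year    => 2014..2016,
  :event   => %w(click view),
  :user_id => "`> 1000`",
  :id      => :empty
)
puts where
# => year >= 2014 AND year <= 2016 AND event IN ('click', 'view') AND user_id > 1000 AND empty(id)
```

Note that `inspect_value` only swaps the surrounding double quotes for single quotes; a single quote inside a string value is passed through unescaped, so values from untrusted input would need escaping before they reach these helpers.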