blazer 2.4.0 → 2.4.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: c9f173935d9c63e3537014420daf0af3017f6f9c51615d102fb6f0b4b3b45ba8
-   data.tar.gz: 8743bc1783f5181bd946d6263a70ac7ae02cbb26d3663b59c8915cdc9495e80f
+   metadata.gz: b2a6413e7d7d49e280cbb8d6a880ec32c6c49452e69f980183b592bd927fa90a
+   data.tar.gz: 2e988e2010ae919dbfbb7c4c922753425b16eb0ee1e6fe309789cb44a450d8f7
  SHA512:
-   metadata.gz: bfe97805e455f1ec96cc825563460a303006c6c379f9adfb3b24ed2f9f3b86f57f9600648a8fbc01a564ba62f78346582659dab9c121d239f91383211a3b8ad3
-   data.tar.gz: 52a1ce2a7a489105ba1c1ece23e412117de868bf6c50a07b968056b6ccbbe40d90031087c836f7831d04f12bd7d88593260a0a98bfdcb15659e1158c3fc070d6
+   metadata.gz: 50fc2e2d4c430935e05d14df65aea648df86c4792e58007ff0f6015a5c58a5842db4f9d22d430ac7f84b094325d45c41406f67676d6ec3d3535266e47ebd53a0
+   data.tar.gz: 3118830b6ab4198fe383615c09319d98619480e2f88dca117b92e99c228da82a5556f7a9948ad57dd45e788cacc20efec370e0eb68770919e5dab9909bd4f1ef
data/CHANGELOG.md CHANGED
@@ -1,3 +1,9 @@
+ ## 2.4.1 (2021-01-25)
+
+ - Added cohorts for MySQL
+ - Added support for Apache Hive and Apache Spark
+ - Fixed deprecation warning with Active Record 6.1
+
  ## 2.4.0 (2020-12-15)

  - Added cohorts
data/README.md CHANGED
@@ -4,7 +4,7 @@ Explore your data with SQL. Easily create charts and dashboards, and share them

  [Try it out](https://blazer.dokkuapp.com)

- [![Screenshot](https://blazer.dokkuapp.com/assets/blazer-90bd7acc9fdf1f5fc2bb25bfe5506f746ec8c9d2e0730388debfd697e32f75b8.png)](https://blazer.dokkuapp.com)
+ [![Screenshot](https://blazer.dokkuapp.com/assets/blazer-a10baa40fef1ca2f5bb25fc97bcf261a6a54192fb1ad0f893c0f562b8c7c4697.png)](https://blazer.dokkuapp.com)

  Blazer is also available as a [Docker image](https://github.com/ankane/blazer-docker).

@@ -412,7 +412,7 @@ SELECT users.id AS user_id, orders.created_at AS conversion_time, users.created_
  FROM users LEFT JOIN orders ON orders.user_id = users.id
  ```

- This feature requires PostgreSQL.
+ This feature requires PostgreSQL or MySQL.

  ## Anomaly Detection

@@ -566,6 +566,8 @@ data_sources:
  - [Amazon Athena](#amazon-athena)
  - [Amazon Redshift](#amazon-redshift)
  - [Apache Drill](#apache-drill)
+ - [Apache Hive](#apache-hive)
+ - [Apache Spark](#apache-spark)
  - [Cassandra](#cassandra)
  - [Druid](#druid)
  - [Elasticsearch](#elasticsearch)
@@ -627,6 +629,32 @@ data_sources:
      url: http://hostname:8047
  ```

+ ### Apache Hive
+
+ Add [hexspace](https://github.com/ankane/hexspace) to your Gemfile and set:
+
+ ```yml
+ data_sources:
+   my_source:
+     adapter: hive
+     url: sasl://user:password@hostname:10000/database
+ ```
+
+ Use a read-only user. Requires [HiveServer2](https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2).
+
+ ### Apache Spark
+
+ Add [hexspace](https://github.com/ankane/hexspace) to your Gemfile and set:
+
+ ```yml
+ data_sources:
+   my_source:
+     adapter: spark
+     url: sasl://user:password@hostname:10000/database
+ ```
+
+ Use a read-only user. Requires the [Thrift server](https://spark.apache.org/docs/latest/sql-distributed-sql-engine.html).
+
  ### Cassandra

  Add [cassandra-driver](https://github.com/datastax/ruby-driver) to your Gemfile and set:
@@ -949,152 +977,6 @@ with:
  add_column :blazer_checks, :slack_channels, :text
  ```

- ### 1.5
-
- To take advantage of the anomaly detection, create a migration
-
- ```sh
- rails g migration upgrade_blazer_to_1_5
- ```
-
- with:
-
- ```ruby
- add_column :blazer_checks, :check_type, :string
- add_column :blazer_checks, :message, :text
- commit_db_transaction
-
- Blazer::Check.reset_column_information
-
- Blazer::Check.where(invert: true).update_all(check_type: "missing_data")
- Blazer::Check.where(check_type: nil).update_all(check_type: "bad_data")
- ```
-
- ### 1.3
-
- To take advantage of the latest features, create a migration
-
- ```sh
- rails g migration upgrade_blazer_to_1_3
- ```
-
- with:
-
- ```ruby
- add_column :blazer_dashboards, :creator_id, :integer
- add_column :blazer_checks, :creator_id, :integer
- add_column :blazer_checks, :invert, :boolean
- add_column :blazer_checks, :schedule, :string
- add_column :blazer_checks, :last_run_at, :timestamp
- commit_db_transaction
-
- Blazer::Check.update_all schedule: "1 hour"
- ```
-
- ### 1.0
-
- Blazer 1.0 brings a number of new features:
-
- - multiple data sources, including Redshift
- - dashboards
- - checks
-
- To upgrade, run:
-
- ```sh
- bundle update blazer
- ```
-
- Create a migration
-
- ```sh
- rails g migration upgrade_blazer_to_1_0
- ```
-
- with:
-
- ```ruby
- add_column :blazer_queries, :data_source, :string
- add_column :blazer_audits, :data_source, :string
-
- create_table :blazer_dashboards do |t|
-   t.text :name
-   t.timestamps
- end
-
- create_table :blazer_dashboard_queries do |t|
-   t.references :dashboard
-   t.references :query
-   t.integer :position
-   t.timestamps
- end
-
- create_table :blazer_checks do |t|
-   t.references :query
-   t.string :state
-   t.text :emails
-   t.timestamps
- end
- ```
-
- And run:
-
- ```sh
- rails db:migrate
- ```
-
- Update `config/blazer.yml` with:
-
- ```yml
- # see https://github.com/ankane/blazer for more info
-
- data_sources:
-   main:
-     url: <%= ENV["BLAZER_DATABASE_URL"] %>
-
-     # statement timeout, in seconds
-     # applies to PostgreSQL only
-     # none by default
-     # timeout: 15
-
-     # time to cache results, in minutes
-     # can greatly improve speed
-     # none by default
-     # cache: 60
-
-     # wrap queries in a transaction for safety
-     # not necessary if you use a read-only user
-     # true by default
-     # use_transaction: false
-
-     smart_variables:
-       # zone_id: "SELECT id, name FROM zones ORDER BY name ASC"
-
-     linked_columns:
-       # user_id: "/admin/users/{value}"
-
-     smart_columns:
-       # user_id: "SELECT id, name FROM users WHERE id IN {value}"
-
- # create audits
- audit: true
-
- # change the time zone
- # time_zone: "Pacific Time (US & Canada)"
-
- # class name of the user model
- # user_class: User
-
- # method name for the current user
- # user_method: current_user
-
- # method name for the display name
- # user_name: name
-
- # email to send checks from
- # from_email: blazer@example.org
- ```
-
  ## History

  View the [changelog](https://github.com/ankane/blazer/blob/master/CHANGELOG.md)
data/app/models/blazer/query.rb CHANGED
@@ -9,7 +9,7 @@ module Blazer
  validates :statement, presence: true

  scope :active, -> { column_names.include?("status") ? where(status: "active") : all }
- scope :named, -> { where("blazer_queries.name <> ''") }
+ scope :named, -> { where.not(name: "") }

  def to_param
    [id, name].compact.join("-").gsub("'", "").parameterize
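
For context, the rewritten scope expresses the same filter through the query builder instead of a raw SQL fragment. A quick console sketch (exact quoting depends on the database adapter):

```ruby
# Both forms exclude queries with an empty name; where.not avoids
# hard-coding the table name in a SQL string.
Blazer::Query.named.to_sql
# => SELECT "blazer_queries".* FROM "blazer_queries" WHERE "blazer_queries"."name" != ''
```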
data/lib/blazer.rb CHANGED
@@ -18,12 +18,14 @@ require "blazer/adapters/cassandra_adapter"
  require "blazer/adapters/drill_adapter"
  require "blazer/adapters/druid_adapter"
  require "blazer/adapters/elasticsearch_adapter"
+ require "blazer/adapters/hive_adapter"
  require "blazer/adapters/influxdb_adapter"
  require "blazer/adapters/mongodb_adapter"
  require "blazer/adapters/neo4j_adapter"
  require "blazer/adapters/presto_adapter"
  require "blazer/adapters/salesforce_adapter"
  require "blazer/adapters/soda_adapter"
+ require "blazer/adapters/spark_adapter"
  require "blazer/adapters/sql_adapter"
  require "blazer/adapters/snowflake_adapter"

@@ -239,11 +241,13 @@ Blazer.register_adapter "cassandra", Blazer::Adapters::CassandraAdapter
  Blazer.register_adapter "drill", Blazer::Adapters::DrillAdapter
  Blazer.register_adapter "druid", Blazer::Adapters::DruidAdapter
  Blazer.register_adapter "elasticsearch", Blazer::Adapters::ElasticsearchAdapter
+ Blazer.register_adapter "hive", Blazer::Adapters::HiveAdapter
  Blazer.register_adapter "influxdb", Blazer::Adapters::InfluxdbAdapter
  Blazer.register_adapter "neo4j", Blazer::Adapters::Neo4jAdapter
  Blazer.register_adapter "presto", Blazer::Adapters::PrestoAdapter
  Blazer.register_adapter "mongodb", Blazer::Adapters::MongodbAdapter
  Blazer.register_adapter "salesforce", Blazer::Adapters::SalesforceAdapter
  Blazer.register_adapter "soda", Blazer::Adapters::SodaAdapter
+ Blazer.register_adapter "spark", Blazer::Adapters::SparkAdapter
  Blazer.register_adapter "sql", Blazer::Adapters::SqlAdapter
  Blazer.register_adapter "snowflake", Blazer::Adapters::SnowflakeAdapter
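
The two new registrations follow the same pattern the built-in adapters use: subclass the base adapter, implement the query interface, and register the class under the name used in `config/blazer.yml`. A minimal sketch with a hypothetical adapter class (only the registration hook and method signatures come from the code above; the rest is illustrative):

```ruby
module Blazer
  module Adapters
    # Hypothetical adapter, modeled on the built-ins registered above
    class MyWarehouseAdapter < BaseAdapter
      def run_statement(statement, comment)
        columns, rows, error = [], [], nil
        # run "#{statement} /*#{comment}*/" against the data store here,
        # filling columns (array of names) and rows (array of value arrays)
        [columns, rows, error]
      end

      def tables
        [] # table names shown in the schema picker
      end

      def preview_statement
        "SELECT * FROM {table} LIMIT 10"
      end
    end
  end
end

Blazer.register_adapter "my_warehouse", Blazer::Adapters::MyWarehouseAdapter
```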
data/lib/blazer/adapters/hive_adapter.rb ADDED
@@ -0,0 +1,45 @@
+ module Blazer
+   module Adapters
+     class HiveAdapter < BaseAdapter
+       def run_statement(statement, comment)
+         columns = []
+         rows = []
+         error = nil
+
+         begin
+           result = client.execute("#{statement} /*#{comment}*/")
+           columns = result.any? ? result.first.keys : []
+           rows = result.map(&:values)
+         rescue => e
+           error = e.message
+         end
+
+         [columns, rows, error]
+       end
+
+       def tables
+         client.execute("SHOW TABLES").map { |r| r["tab_name"] }
+       end
+
+       def preview_statement
+         "SELECT * FROM {table} LIMIT 10"
+       end
+
+       protected
+
+       def client
+         @client ||= begin
+           uri = URI.parse(settings["url"])
+           Hexspace::Client.new(
+             host: uri.host,
+             port: uri.port,
+             username: uri.user,
+             password: uri.password,
+             database: uri.path.sub(/\A\//, ""),
+             mode: uri.scheme.to_sym
+           )
+         end
+       end
+     end
+   end
+ end
data/lib/blazer/adapters/spark_adapter.rb ADDED
@@ -0,0 +1,9 @@
+ module Blazer
+   module Adapters
+     class SparkAdapter < HiveAdapter
+       def tables
+         client.execute("SHOW TABLES").map { |r| r["tableName"] }
+       end
+     end
+   end
+ end
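
Both adapters build the `Hexspace::Client` options from the `sasl://` URL in the data source config. A quick look at how that URL decomposes with Ruby's standard URI parser (placeholder values, matching the README's example):

```ruby
require "uri"

uri = URI.parse("sasl://user:password@hostname:10000/database")
uri.scheme               # => "sasl"      (passed as the client mode)
uri.user                 # => "user"
uri.password             # => "password"
uri.host                 # => "hostname"
uri.port                 # => 10000
uri.path.sub(/\A\//, "") # => "database"
```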
data/lib/blazer/data_source.rb CHANGED
@@ -123,7 +123,7 @@ module Blazer
  end

  def supports_cohort_analysis?
-   postgresql?
+   postgresql? || mysql?
  end

  # TODO treat date columns as already in time zone
@@ -131,6 +131,27 @@ module Blazer
  raise "Cohort analysis not supported" unless supports_cohort_analysis?

  cohort_column = statement =~ /\bcohort_time\b/ ? "cohort_time" : "conversion_time"
+ tzname = Blazer.time_zone.tzinfo.name
+
+ if mysql?
+   time_sql = "CONVERT_TZ(cohorts.cohort_time, '+00:00', ?)"
+   case period
+   when "day"
+     date_sql = "CAST(DATE_FORMAT(#{time_sql}, '%Y-%m-%d') AS DATE)"
+     date_params = [tzname]
+   when "week"
+     date_sql = "CAST(DATE_FORMAT(#{time_sql} - INTERVAL ((5 + DAYOFWEEK(#{time_sql})) % 7) DAY, '%Y-%m-%d') AS DATE)"
+     date_params = [tzname, tzname]
+   else
+     date_sql = "CAST(DATE_FORMAT(#{time_sql}, '%Y-%m-01') AS DATE)"
+     date_params = [tzname]
+   end
+   bucket_sql = "CAST(CEIL(TIMESTAMPDIFF(SECOND, cohorts.cohort_time, query.conversion_time) / ?) AS INTEGER)"
+ else
+   date_sql = "date_trunc(?, cohorts.cohort_time::timestamptz AT TIME ZONE ?)::date"
+   date_params = [period, tzname]
+   bucket_sql = "CEIL(EXTRACT(EPOCH FROM query.conversion_time - cohorts.cohort_time) / ?)::int"
+ end

  # WITH not an optimization fence in Postgres 12+
  statement = <<~SQL
@@ -143,14 +164,14 @@ module Blazer
  GROUP BY 1
  )
  SELECT
-   date_trunc(?, cohorts.cohort_time::timestamptz AT TIME ZONE ?)::date AS period,
+   #{date_sql} AS period,
    0 AS bucket,
    COUNT(DISTINCT cohorts.user_id)
  FROM cohorts GROUP BY 1
  UNION ALL
  SELECT
-   date_trunc(?, cohorts.cohort_time::timestamptz AT TIME ZONE ?)::date AS period,
-   CEIL(EXTRACT(EPOCH FROM query.conversion_time - cohorts.cohort_time) / ?)::int AS bucket,
+   #{date_sql} AS period,
+   #{bucket_sql} AS bucket,
    COUNT(DISTINCT query.user_id)
  FROM cohorts INNER JOIN query ON query.user_id = cohorts.user_id
  WHERE query.conversion_time IS NOT NULL
@@ -158,8 +179,7 @@ module Blazer
    #{cohort_column == "conversion_time" ? "AND query.conversion_time != cohorts.cohort_time" : ""}
    GROUP BY 1, 2
  SQL
- tzname = Blazer.time_zone.tzinfo.name
- params = [statement, period, tzname, period, tzname, days.to_i * 86400]
+ params = [statement] + date_params + date_params + [days.to_i * 86400]
  connection_model.send(:sanitize_sql_array, params)
  end

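
To make the placeholder bookkeeping concrete, a small worked example for the Postgres branch (assuming `period = "week"`, a UTC Blazer time zone, and `days = 30`):

```ruby
date_params = ["week", "Etc/UTC"]   # binds date_sql's two ? marks
seconds     = 30 * 86400            # bucket size passed to bucket_sql's single ?

# date_sql is interpolated into the statement twice (once per SELECT),
# followed by bucket_sql, so the binds line up as:
params = [statement] + date_params + date_params + [seconds]
# => [statement, "week", "Etc/UTC", "week", "Etc/UTC", 2592000]
```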
@@ -206,6 +226,8 @@ module Blazer
    "public"
  elsif sqlserver?
    "dbo"
+ elsif connection_model.respond_to?(:connection_db_config)
+   connection_model.connection_db_config.database
  else
    connection_model.connection_config[:database]
  end
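
The new `elsif` resolves the Active Record 6.1 deprecation noted in the changelog: `connection_config` is deprecated in 6.1 in favor of `connection_db_config`. Roughly (database name is a placeholder):

```ruby
# Active Record 6.1+: a database configuration object
ActiveRecord::Base.connection_db_config.database  # => "blazer_production"

# Active Record <= 6.0: a plain hash (deprecated in 6.1)
ActiveRecord::Base.connection_config[:database]   # => "blazer_production"
```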
data/lib/blazer/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Blazer
-   VERSION = "2.4.0"
+   VERSION = "2.4.1"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: blazer
  version: !ruby/object:Gem::Version
-   version: 2.4.0
+   version: 2.4.1
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-12-15 00:00:00.000000000 Z
+ date: 2021-01-25 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: railties
@@ -194,6 +194,7 @@ files:
  - lib/blazer/adapters/drill_adapter.rb
  - lib/blazer/adapters/druid_adapter.rb
  - lib/blazer/adapters/elasticsearch_adapter.rb
+ - lib/blazer/adapters/hive_adapter.rb
  - lib/blazer/adapters/influxdb_adapter.rb
  - lib/blazer/adapters/mongodb_adapter.rb
  - lib/blazer/adapters/neo4j_adapter.rb
@@ -201,6 +202,7 @@ files:
  - lib/blazer/adapters/salesforce_adapter.rb
  - lib/blazer/adapters/snowflake_adapter.rb
  - lib/blazer/adapters/soda_adapter.rb
+ - lib/blazer/adapters/spark_adapter.rb
  - lib/blazer/adapters/sql_adapter.rb
  - lib/blazer/data_source.rb
  - lib/blazer/detect_anomalies.R
@@ -250,7 +252,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
    version: '0'
  requirements: []
- rubygems_version: 3.1.4
+ rubygems_version: 3.2.3
  signing_key:
  specification_version: 4
  summary: Explore your data with SQL. Easily create charts and dashboards, and share