beetle_etl 2.0.0 → 2.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 6a726a9734d6866687319a6742cc8db41ef68b64
4
- data.tar.gz: 82be15d660033bd3d957879ec351c0479b26623f
3
+ metadata.gz: 2cd21d7530444cef54966138ce38a5ec6ca0f08b
4
+ data.tar.gz: 277b54c1b957edeeb78fab7434852215f571d486
5
5
  SHA512:
6
- metadata.gz: 01c1408c035afb9d0dcadb9382a3db442318f48c74517f1376c3a6ef615675ef891df8559c172be6e6b1f7f47d2f29c4a6965493a613d93a24d439c40f224246
7
- data.tar.gz: 111f73fd2692ed88f4ce91c6f4da838efe13d1d12fc00bfb4f63cb00383509ffa1d171bf265362ded8f01ddde7f0a0e0659d552f7be8b31188c94bb4bb984e59
6
+ metadata.gz: 659b6e6cde8208578833b7fd4a870f15fd706300389aef4c2c8c93786a17954bddca324398f1ca2e32bba12a96ed89f2e26b237df78efe99f33e1cbb0d9cd476
7
+ data.tar.gz: cb8d7842912f24a4e82d0198f4508757f32c5051183face61e5359bfbe36a5ce353127e085992a33069d82e6afd6bc1d44a68c79d832d38fc079eca819d8726a
data/.gitignore CHANGED
@@ -1,5 +1,6 @@
1
1
  *.gem
2
2
  *.rbc
3
+ .byebug_history
3
4
  .bundle
4
5
  .config
5
6
  .yardoc
data/.travis.yml CHANGED
@@ -6,8 +6,6 @@ rvm:
6
6
  - 2.3.0
7
7
  addons:
8
8
  postgresql: "9.3"
9
- code_climate:
10
- repo_token: fcd6d8c28da900609a2cf903716d858621b8ce68152edbcebe6908a9a3f5d3d5
11
9
  before_install:
12
10
  - gem update --system
13
11
  - gem update bundler
data/Gemfile CHANGED
@@ -5,6 +5,5 @@ gemspec
5
5
 
6
6
  group :test do
7
7
  gem 'rake'
8
- gem 'codeclimate-test-reporter'
9
8
  gem 'byebug'
10
9
  end
data/README.md CHANGED
@@ -1,12 +1,11 @@
1
1
  # BeetleETL
2
2
  [![Build Status](https://travis-ci.org/maiwald/beetle_etl.svg?branch=master)](https://travis-ci.org/maiwald/beetle_etl)
3
- [![Code Climate](https://codeclimate.com/github/maiwald/beetle_etl.png)](https://codeclimate.com/github/maiwald/beetle_etl)
4
3
 
5
4
  BeetleETL helps you with synchronising relational databases and recurring imports of reference data. It is actually quite nice.
6
5
 
7
6
  Consider you have a set of database tables representing third party data (i.e. the ```source```) and you want to synchronize a set of tables in your application (i.e. the ```target```) with that third party data. Further consider that you want to apply transformations to that ```source``` data before you import it.
8
7
 
9
- You define your transformations and BeetleETL will to the rest. Even when your ```source``` data changes, when you run BeetleETL again, it can keep track of what changes need to be applied to what records in your application’s tables.
8
+ You define your transformations and BeetleETL will do the rest. Even when your ```source``` data changes, when you run BeetleETL again, it can keep track of what changes need to be applied to what records in your application’s tables.
10
9
 
11
10
  It currently only works with PostgreSQL databases.
12
11
 
@@ -34,61 +33,64 @@ Make sure the tables you want to import contain columns named ```external_id```
34
33
 
35
34
  Create a configuration object
36
35
 
37
- configuration = BeetleETL::Configuration.new do |config|
38
- # path to your transformation file
39
- config.transformation_file = "../my_fancy_transformations"
40
-
41
- # sequel database config
42
- config.database_config = {
43
- adapter: 'postgres'
44
- encoding: utf8
45
- host: my_host
46
- database: my_database
47
- username: 'foo'
48
- password: 'bar'
49
- pool: 5
50
- pool_timeout: 360
51
- connect_timeout: 360
52
- }
53
- # or config.database = # sequel database instance
54
-
55
- # name of your soruce
56
- config.external_source = "important_data"
57
-
58
- # target schema in case you use postgres schemas
59
- config.target_schema = "public" # default
60
-
61
- # logger
62
- config.logger = Logger.new(STDOUT) # default
63
- end
36
+ ```ruby
37
+ configuration = BeetleETL::Configuration.new do |config|
38
+ # path to your transformation file
39
+ config.transformation_file = "../my_fancy_transformations"
40
+
41
+ # sequel database config
42
+ config.database_config = {
43
+ adapter: 'postgres'
44
+ encoding: utf8
45
+ host: my_host
46
+ database: my_database
47
+ username: 'foo'
48
+ password: 'bar'
49
+ pool: 5
50
+ pool_timeout: 360
51
+ connect_timeout: 360
52
+ }
53
+ # or config.database = # sequel database instance
54
+
55
+ # name of your soruce
56
+ config.external_source = "important_data"
57
+
58
+ # target schema in case you use postgres schemas
59
+ config.target_schema = "public" # default
60
+
61
+ # logger
62
+ config.logger = Logger.new(STDOUT) # default
63
+ end
64
+ ```
64
65
 
65
66
  ### Defining Imports
66
67
 
67
68
  Fill a ```transformation``` file with import directives like this:
68
69
 
69
- import :departments do
70
- columns :name
71
-
72
- references :organisations, on: :organisation_id
73
-
74
- query <<-SQL
75
- INSERT INTO #{stage_table} (
76
- external_id,
77
- name,
78
- external_organisation_id
79
- )
80
-
81
- SELECT
82
- o.id,
83
- o.”dep_name”,
84
- data.”address”
85
-
86
- FROM ”Organisation” o
87
- JOIN additional_data data
88
- ON data.org_id = o.id
89
- SQL
90
- end
91
-
70
+ ```ruby
71
+ import :departments do
72
+ columns :name
73
+
74
+ references :organisations, on: :organisation_id
75
+
76
+ query <<-SQL
77
+ INSERT INTO #{stage_table} (
78
+ external_id,
79
+ name,
80
+ external_organisation_id
81
+ )
82
+
83
+ SELECT
84
+ o.id,
85
+ o.”dep_name”,
86
+ data.”address”
87
+
88
+ FROM ”Organisation” o
89
+ JOIN additional_data data
90
+ ON data.org_id = o.id
91
+ SQL
92
+ end
93
+ ```
92
94
 
93
95
  ```import``` takes the name of the table you want to fill and the configuration as arguments.
94
96
  With ```columns``` you define what columns BeetleETL is supposed to fill in your application’s table.
data/beetle_etl.gemspec CHANGED
@@ -18,9 +18,9 @@ Gem::Specification.new do |spec|
18
18
  spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
19
19
  spec.require_paths = ['lib']
20
20
 
21
- spec.add_runtime_dependency 'sequel', '>= 4.0.0'
22
21
  spec.add_runtime_dependency 'activesupport', '>= 4.0.0'
23
22
 
23
+ spec.add_development_dependency 'sequel', '>= 4.0.0'
24
24
  spec.add_development_dependency 'bundler', '~> 1.11'
25
25
  spec.add_development_dependency 'rspec', '>= 3.0.0'
26
26
  spec.add_development_dependency 'timecop', '>= 0.7.0'
data/lib/beetle_etl.rb CHANGED
@@ -5,6 +5,7 @@ require 'logger'
5
5
 
6
6
  module BeetleETL
7
7
 
8
+ require 'beetle_etl/adapters/sequel_adapter'
8
9
  require 'beetle_etl/configuration'
9
10
 
10
11
  require 'beetle_etl/dsl/dsl'
@@ -0,0 +1,35 @@
1
+ module BeetleETL
2
+ class SequelAdapter
3
+ attr_reader :database
4
+ def initialize(database)
5
+ @database = database
6
+ end
7
+
8
+ def execute(query)
9
+ @database.run(query)
10
+ end
11
+
12
+ def column_names(schema_name, table_name)
13
+ @database[Sequel.qualify(schema_name, table_name)].columns
14
+ end
15
+
16
+ def column_types(schema_name, table_name)
17
+ Hash[@database.schema(Sequel.qualify(schema_name, table_name))].reduce({}) do |acc, (name, column_config)|
18
+ acc[name.to_sym] = column_config.fetch(:db_type)
19
+ acc
20
+ end
21
+ end
22
+
23
+ def table_exists?(schema_name, table_name)
24
+ @database.table_exists?(Sequel.qualify(schema_name, table_name))
25
+ end
26
+
27
+ def transaction(&block)
28
+ @database.transaction(&block)
29
+ end
30
+
31
+ def disconnect
32
+ @database.disconnect
33
+ end
34
+ end
35
+ end
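The `column_types` method in the new adapter folds the pairs returned by `Sequel::Database#schema` into a hash of column name to database type. A minimal sketch of that fold in isolation follows. It assumes the schema is an array of `[name, info]` pairs whose info hash carries a `:db_type` key; `SAMPLE_SCHEMA` and `column_types_from` are hypothetical names used only for illustration and are not part of the gem.

```ruby
# Hypothetical stand-in for the value returned by Sequel::Database#schema:
# an array of [column_name, column_info] pairs.
SAMPLE_SCHEMA = [
  [:id,         { db_type: 'integer', primary_key: true }],
  [:first_name, { db_type: 'character varying(255)' }],
  [:last_name,  { db_type: 'character varying(255)' }]
].freeze

# Illustrative helper (not part of the gem): reduce the pairs to
# { column_name => db_type }. Hash#fetch raises KeyError when an
# entry has no :db_type, instead of silently storing nil.
def column_types_from(schema)
  schema.each_with_object({}) do |(name, column_config), acc|
    acc[name.to_sym] = column_config.fetch(:db_type)
  end
end

column_types_from(SAMPLE_SCHEMA)
```

Using `each_with_object` avoids the explicit `acc` return that the `reduce` form in the diff needs; the resulting hash should be the same.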
@@ -1,38 +1,54 @@
1
1
  module BeetleETL
2
+
2
3
  InvalidConfigurationError = Class.new(StandardError)
3
4
 
4
5
  class Configuration
5
6
  attr_accessor \
6
7
  :transformation_file,
8
+ :target_schema,
7
9
  :stage_schema,
8
10
  :external_source,
9
11
  :logger
10
12
 
11
13
  attr_writer \
12
14
  :database,
13
- :database_config,
14
- :target_schema
15
+ :database_config
15
16
 
16
17
  def initialize
17
18
  @target_schema = 'public'
18
19
  @logger = ::Logger.new(STDOUT)
19
20
  end
20
21
 
22
+ def database=(database)
23
+ @database_config = nil
24
+ @adapter ||= case
25
+ when sequel?(database) then SequelAdapter.new(database)
26
+ else nil
27
+ end
28
+ end
29
+
30
+ def database_config=(database_config)
31
+ @database_config = database_config
32
+ @adapter = SequelAdapter.new(Sequel.connect(@database_config))
33
+ end
34
+
21
35
  def database
22
- if [@database, @database_config].none?
23
- msg = "Either Sequel connection database_config or a Sequel Database object required"
36
+ if @adapter.nil?
37
+ msg = "Either Sequel connection database_config, Sequel::Database object or ActiveRecord::Base.connection required!"
24
38
  raise InvalidConfigurationError.new(msg)
25
39
  end
26
40
 
27
- @database ||= Sequel.connect(@database_config)
41
+ @adapter
28
42
  end
29
43
 
30
44
  def disconnect_database
31
- database.disconnect if @database_config
45
+ @adapter.disconnect if @database_config
32
46
  end
33
47
 
34
- def target_schema
35
- @target_schema != 'public' ? @target_schema : nil
48
+ private
49
+
50
+ def sequel?(database)
51
+ defined?(::Sequel::Database) && database.is_a?(::Sequel::Database)
36
52
  end
37
53
 
38
54
  end
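The `sequel?` check in this hunk guards an optional dependency: `defined?` is evaluated first, so the `is_a?` test never touches the `Sequel` constant unless it has been loaded. A minimal sketch of that pattern outside the class, where `pick_adapter` and the `:sequel_adapter` symbol are hypothetical placeholders for the real adapter construction:

```ruby
# True only when Sequel is loaded AND the object is a Sequel::Database.
# defined? returns nil for an unknown constant, so no NameError is raised
# when the sequel gem is absent.
def sequel?(database)
  !!(defined?(::Sequel::Database) && database.is_a?(::Sequel::Database))
end

# Hypothetical dispatcher: returns a marker for the adapter to build,
# or nil when the connection object is not recognised.
def pick_adapter(database)
  sequel?(database) ? :sequel_adapter : nil
end

pick_adapter(Object.new) # nil, since a plain Object is not a Sequel::Database
```

Note that the setter in the diff assigns with `||=`, so an adapter that is already set is kept when `database=` is called again.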
@@ -26,10 +26,12 @@ module BeetleETL
26
26
  # query helper methods
27
27
 
28
28
  def stage_table(table_name = nil)
29
- BeetleETL::Naming.stage_table_name_sql(
29
+ stage_table_name = BeetleETL::Naming.stage_table_name(
30
30
  @config.external_source,
31
31
  table_name || @table_name
32
32
  )
33
+
34
+ %Q("#{@config.target_schema}"."#{stage_table_name}")
33
35
  end
34
36
 
35
37
  def combined_key(*args)
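The `stage_table` helper now quotes the schema and the table as two separate identifiers. In PostgreSQL, `"public.departments"` inside a single pair of quotes names a table literally called `public.departments`, whereas `"public"."departments"` names the table `departments` in schema `public`. A minimal sketch of the quoting, where `qualified_identifier` is a hypothetical helper name and the inputs are assumed to contain no double quotes (no escaping is attempted):

```ruby
# Illustrative helper (not part of the gem): build a schema-qualified,
# double-quoted PostgreSQL identifier from its two parts.
# Assumes neither part contains a double quote.
def qualified_identifier(schema, table)
  %Q("#{schema}"."#{table}")
end

puts qualified_identifier('public', 'departments')
# prints "public"."departments"
```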
@@ -10,18 +10,5 @@ module BeetleETL
10
10
  "#{external_source.to_s}-#{table_name.to_s}-#{digest}"[0, 63]
11
11
  end
12
12
 
13
- def stage_table_name_sql(external_source, table_name)
14
- %Q("#{stage_table_name(external_source, table_name)}")
15
- end
16
-
17
- def target_table_name(target_schema, table_name)
18
- schema = target_schema ? target_schema.to_s : nil
19
- [schema, table_name.to_s].compact.join('.')
20
- end
21
-
22
- def target_table_name_sql(target_schema, table_name)
23
- %Q("#{target_table_name(target_schema, table_name)}")
24
- end
25
-
26
13
  end
27
14
  end
@@ -7,10 +7,10 @@ module BeetleETL
7
7
 
8
8
  def run
9
9
  database.execute <<-SQL
10
- UPDATE #{stage_table_name_sql} stage_update
11
- SET id = COALESCE(target.id, nextval('#{table_name}_id_seq'))
12
- FROM #{stage_table_name_sql} stage
13
- LEFT OUTER JOIN #{target_table_name_sql} target
10
+ UPDATE "#{target_schema}"."#{stage_table_name}" stage_update
11
+ SET id = COALESCE(target.id, NEXTVAL('#{target_schema}.#{table_name}_id_seq'))
12
+ FROM "#{target_schema}"."#{stage_table_name}" stage
13
+ LEFT OUTER JOIN "#{target_schema}"."#{table_name}" target
14
14
  on (
15
15
  stage.external_id = target.external_id
16
16
  AND target.external_source = '#{external_source}'
@@ -13,7 +13,7 @@ module BeetleETL
13
13
 
14
14
  def run
15
15
  database.execute <<-SQL
16
- CREATE UNLOGGED TABLE IF NOT EXISTS #{stage_table_name_sql} (
16
+ CREATE UNLOGGED TABLE IF NOT EXISTS "#{target_schema}"."#{stage_table_name}" (
17
17
  id integer,
18
18
  external_id character varying(255),
19
19
  transition character varying(255),
@@ -23,13 +23,13 @@ module BeetleETL
23
23
 
24
24
  #{index_definitions};
25
25
 
26
- ALTER TABLE #{stage_table_name_sql}
26
+ ALTER TABLE "#{target_schema}"."#{stage_table_name}"
27
27
  SET (
28
28
  autovacuum_enabled = false,
29
29
  toast.autovacuum_enabled = false
30
30
  );
31
31
 
32
- TRUNCATE TABLE #{stage_table_name_sql} RESTART IDENTITY CASCADE;
32
+ TRUNCATE TABLE "#{target_schema}"."#{stage_table_name}" RESTART IDENTITY CASCADE;
33
33
  SQL
34
34
  end
35
35
 
@@ -39,7 +39,7 @@ module BeetleETL
39
39
  definitions = [
40
40
  payload_column_definitions,
41
41
  relation_column_definitions
42
- ].compact
42
+ ].flatten
43
43
 
44
44
  if definitions.empty?
45
45
  raise NoColumnsDefinedError.new <<-MSG
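The switch from `.compact` to `.flatten` goes together with the helper changes in this file: `payload_column_definitions` and `relation_column_definitions` now return arrays (possibly empty) rather than a joined string or `nil`. A small sketch of why the emptiness check needs `flatten` in that case; `combined_definitions` is a hypothetical name and the column strings are made-up sample data:

```ruby
# Illustrative helper (not part of the gem): merge two lists of column
# definitions into one flat list.
def combined_definitions(payload, relations)
  [payload, relations].flatten
end

# With two empty lists, compact leaves both (non-nil) arrays in place,
# so the outer array is not empty; flatten yields a truly empty list.
[[], []].compact.empty?               # false
combined_definitions([], []).empty?   # true

combined_definitions(['name character varying(255)'], ['organisation_id integer'])
```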
@@ -52,35 +52,29 @@ module BeetleETL
52
52
  end
53
53
 
54
54
  def payload_column_definitions
55
- definitions = (@column_names - @relations.keys).map do |column_name|
55
+ (@column_names - @relations.keys).map do |column_name|
56
56
  "#{column_name} #{column_type(column_name)}"
57
57
  end
58
- definitions.join(',') if definitions.any?
59
58
  end
60
59
 
61
60
  def relation_column_definitions
62
- definitions = @relations.map do |foreign_key_name, table|
61
+ @relations.map do |foreign_key_name, table|
63
62
  <<-SQL
64
63
  #{foreign_key_name} integer,
65
64
  external_#{foreign_key_name} character varying(255)
66
65
  SQL
67
66
  end
68
- definitions.join(',') if definitions.any?
69
67
  end
70
68
 
71
69
  def index_definitions
72
70
  index_columns = [:external_id] + @relations.keys.map { |c| "external_#{c}" }
73
71
  index_columns.map do |column_name|
74
- "CREATE INDEX ON #{stage_table_name_sql} (#{column_name})"
72
+ %Q[CREATE INDEX ON "#{target_schema}"."#{stage_table_name}" (#{column_name});]
75
73
  end.join(";")
76
74
  end
77
75
 
78
76
  def column_type(column_name)
79
- @column_types ||= Hash[database.schema(target_table_name.to_sym)]
80
- .reduce({}) do |acc, (name, schema)|
81
- acc[name.to_sym] = schema.fetch(:db_type)
82
- acc
83
- end
77
+ @column_types ||= database.column_types(target_schema, table_name)
84
78
 
85
79
  unless @column_types.has_key?(column_name)
86
80
  raise ColumnDefinitionNotFoundError.new <<-MSG
@@ -3,7 +3,7 @@ module BeetleETL
3
3
 
4
4
  def run
5
5
  database.execute <<-SQL
6
- DROP TABLE IF EXISTS #{stage_table_name_sql}
6
+ DROP TABLE IF EXISTS "#{target_schema}"."#{stage_table_name}";
7
7
  SQL
8
8
  end
9
9
 
@@ -25,26 +25,26 @@ module BeetleETL
25
25
  just_now = now
26
26
 
27
27
  database.execute <<-SQL
28
- INSERT INTO #{target_table_name_sql}
28
+ INSERT INTO "#{target_schema}"."#{table_name}"
29
29
  (#{data_columns.join(', ')}, external_source, created_at, updated_at)
30
30
  SELECT
31
31
  #{data_columns.join(', ')},
32
32
  '#{external_source}',
33
33
  '#{just_now}',
34
34
  '#{just_now}'
35
- FROM #{stage_table_name_sql}
35
+ FROM "#{target_schema}"."#{stage_table_name}"
36
36
  WHERE transition = 'CREATE'
37
37
  SQL
38
38
  end
39
39
 
40
40
  def load_update
41
41
  database.execute <<-SQL
42
- UPDATE #{target_table_name_sql} target
42
+ UPDATE "#{target_schema}"."#{table_name}" target
43
43
  SET
44
44
  #{updatable_columns.map { |c| %Q("#{c}" = stage."#{c}") }.join(',')},
45
45
  "updated_at" = '#{now}',
46
46
  deleted_at = NULL
47
- FROM #{stage_table_name_sql} stage
47
+ FROM "#{target_schema}"."#{stage_table_name}" stage
48
48
  WHERE stage.id = target.id
49
49
  AND stage.transition IN ('UPDATE', 'REINSTATE')
50
50
  SQL
@@ -54,11 +54,11 @@ module BeetleETL
54
54
  just_now = now
55
55
 
56
56
  database.execute <<-SQL
57
- UPDATE #{target_table_name_sql} target
57
+ UPDATE "#{target_schema}"."#{table_name}" target
58
58
  SET
59
59
  updated_at = '#{just_now}',
60
60
  deleted_at = '#{just_now}'
61
- FROM #{stage_table_name_sql} stage
61
+ FROM "#{target_schema}"."#{stage_table_name}" stage
62
62
  WHERE stage.id = target.id
63
63
  AND stage.transition = 'DELETE'
64
64
  SQL
@@ -71,7 +71,7 @@ module BeetleETL
71
71
  end
72
72
 
73
73
  def table_columns
74
- @table_columns ||= database[stage_table_name.to_sym].columns
74
+ @table_columns ||= database.column_names(target_schema, stage_table_name)
75
75
  end
76
76
 
77
77
  def ignored_columns
@@ -14,9 +14,9 @@ module BeetleETL
14
14
  def run
15
15
  @relations.map do |foreign_key_column, foreign_table_name|
16
16
  database.execute <<-SQL
17
- UPDATE #{stage_table_name_sql} current_table
17
+ UPDATE "#{target_schema}"."#{stage_table_name}" current_table
18
18
  SET #{foreign_key_column} = foreign_table.id
19
- FROM #{stage_table_name_sql(foreign_table_name)} foreign_table
19
+ FROM "#{target_schema}"."#{stage_table_name(foreign_table_name)}" foreign_table
20
20
  WHERE current_table.external_#{foreign_key_column} = foreign_table.external_id
21
21
  SQL
22
22
  end
@@ -29,23 +29,12 @@ module BeetleETL
29
29
  @config.database
30
30
  end
31
31
 
32
- # naming
33
-
34
- def stage_table_name
35
- BeetleETL::Naming.stage_table_name(@config.external_source, @table_name)
36
- end
37
-
38
- def stage_table_name_sql(table_name = nil)
39
- table_name ||= @table_name
40
- BeetleETL::Naming.stage_table_name_sql(@config.external_source, table_name)
41
- end
42
-
43
- def target_table_name
44
- BeetleETL::Naming.target_table_name(@config.target_schema, @table_name)
32
+ def target_schema
33
+ @config.target_schema
45
34
  end
46
35
 
47
- def target_table_name_sql
48
- BeetleETL::Naming.target_table_name_sql(@config.target_schema, @table_name)
36
+ def stage_table_name(table_name = nil)
37
+ BeetleETL::Naming.stage_table_name(external_source, table_name || @table_name)
49
38
  end
50
39
 
51
40
  end
@@ -18,11 +18,11 @@ module BeetleETL
18
18
 
19
19
  def transition_create
20
20
  database.execute <<-SQL
21
- UPDATE #{stage_table_name_sql} stage
21
+ UPDATE "#{target_schema}"."#{stage_table_name}" stage
22
22
  SET transition = 'CREATE'
23
23
  WHERE NOT EXISTS (
24
24
  SELECT 1
25
- FROM #{target_table_name} target
25
+ FROM "#{target_schema}"."#{table_name}" target
26
26
  WHERE target.external_id = stage.external_id
27
27
  AND target.external_source = '#{external_source}'
28
28
  )
@@ -31,11 +31,11 @@ module BeetleETL
31
31
 
32
32
  def transition_update
33
33
  database.execute <<-SQL
34
- UPDATE #{stage_table_name_sql} stage
34
+ UPDATE "#{target_schema}"."#{stage_table_name}" stage
35
35
  SET transition = 'UPDATE'
36
36
  WHERE EXISTS (
37
37
  SELECT 1
38
- FROM #{target_table_name} target
38
+ FROM "#{target_schema}"."#{table_name}" target
39
39
  WHERE target.external_id = stage.external_id
40
40
  AND target.external_source = '#{external_source}'
41
41
  AND target.deleted_at IS NULL
@@ -49,13 +49,13 @@ module BeetleETL
49
49
 
50
50
  def transition_delete
51
51
  database.execute <<-SQL
52
- INSERT INTO #{stage_table_name_sql}
52
+ INSERT INTO "#{target_schema}"."#{stage_table_name}"
53
53
  (external_id, transition)
54
54
  SELECT
55
55
  target.external_id,
56
56
  'DELETE'
57
- FROM #{target_table_name_sql} target
58
- LEFT OUTER JOIN #{stage_table_name_sql} stage
57
+ FROM "#{target_schema}"."#{table_name}" target
58
+ LEFT OUTER JOIN "#{target_schema}"."#{stage_table_name}" stage
59
59
  ON (stage.external_id = target.external_id)
60
60
  WHERE stage.external_id IS NULL
61
61
  AND target.external_source = '#{external_source}'
@@ -65,11 +65,11 @@ module BeetleETL
65
65
 
66
66
  def transition_reinstate
67
67
  database.execute <<-SQL
68
- UPDATE #{stage_table_name_sql} stage
68
+ UPDATE "#{target_schema}"."#{stage_table_name}" stage
69
69
  SET transition = 'REINSTATE'
70
70
  WHERE EXISTS (
71
71
  SELECT 1
72
- FROM #{target_table_name_sql} target
72
+ FROM "#{target_schema}"."#{table_name}" target
73
73
  WHERE target.external_id = stage.external_id
74
74
  AND target.external_source = '#{external_source}'
75
75
  AND target.deleted_at IS NOT NULL
@@ -92,7 +92,7 @@ module BeetleETL
92
92
  end
93
93
 
94
94
  def table_columns
95
- @table_columns ||= database[stage_table_name.to_sym].columns
95
+ @table_columns ||= database.column_names(target_schema, stage_table_name)
96
96
  end
97
97
 
98
98
  def ignored_columns
@@ -12,7 +12,7 @@ module BeetleETL
12
12
  end
13
13
 
14
14
  def run
15
- database.run(@query)
15
+ database.execute(@query)
16
16
  end
17
17
 
18
18
  end
@@ -13,7 +13,7 @@ module BeetleETL
13
13
 
14
14
  def with_stage_tables_for(*table_names, &block)
15
15
  table_names.each do |table_name|
16
- unless @@config.database.table_exists?(table_name)
16
+ unless @@config.database.table_exists?(@@config.target_schema, table_name)
17
17
  raise TargetTableNotFoundError.new <<-MSG
18
18
  Missing target table "#{table_name}".
19
19
  In order to create stage tables, BeetleETL requires the target tables to exist because they provide the column definitions.
@@ -1,3 +1,3 @@
1
1
  module BeetleETL
2
- VERSION = "2.0.0"
2
+ VERSION = "2.0.1"
3
3
  end
@@ -0,0 +1,10 @@
1
+ require "spec_helper"
2
+ require_relative "shared_examples"
3
+
4
+ module BeetleETL
5
+ describe SequelAdapter do
6
+ it_behaves_like "database adapter" do
7
+ let(:adapter) { SequelAdapter.new(test_database) }
8
+ end
9
+ end
10
+ end
@@ -0,0 +1,58 @@
1
+ require "spec_helper"
2
+
3
+ shared_examples "database adapter" do
4
+ before do
5
+ test_database.run <<-SQL
6
+ CREATE SCHEMA foo;
7
+ CREATE TABLE foo.persons (
8
+ id int,
9
+ first_name varchar(255),
10
+ last_name varchar(255)
11
+ );
12
+ SQL
13
+ end
14
+
15
+ after do
16
+ test_database.run <<-SQL
17
+ DROP SCHEMA foo CASCADE;
18
+ SQL
19
+ end
20
+
21
+ describe "#execute" do
22
+ it "executes SQL" do
23
+ adapter.execute <<-SQL
24
+ INSERT INTO foo.persons VALUES (1, 'hugo', 'warzenkopp');
25
+ SQL
26
+
27
+ expect(Sequel.qualify("foo", "persons")).to have_values(
28
+ [ :id , :first_name , :last_name ],
29
+ [ 1 , "hugo" , "warzenkopp" ]
30
+ )
31
+ end
32
+ end
33
+
34
+ describe "#column_names" do
35
+ it "returns a tables column names" do
36
+ expect(adapter.column_names("foo", "persons")).to match_array([
37
+ :id, :first_name, :last_name
38
+ ])
39
+ end
40
+ end
41
+
42
+ describe "#column_types" do
43
+ it "returns a tables column names" do
44
+ expect(adapter.column_types("foo", "persons")).to match({
45
+ id: 'integer',
46
+ first_name: 'character varying(255)',
47
+ last_name: 'character varying(255)'
48
+ })
49
+ end
50
+ end
51
+
52
+ describe "#table_exists?" do
53
+ it "returns whether a table exists" do
54
+ expect(adapter.table_exists?("foo", "persons")).to eql(true)
55
+ expect(adapter.table_exists?("foo", "persons200")).to eql(false)
56
+ end
57
+ end
58
+ end
@@ -1,27 +1,27 @@
1
1
  require 'spec_helper'
2
+ require 'yaml'
2
3
 
3
4
  module BeetleETL
4
5
  describe Configuration do
5
6
 
6
7
  subject { Configuration.new }
7
8
 
8
- describe "#database" do
9
- let(:database) { double(:database) }
9
+ let(:database_config) do
10
+ config_path = File.expand_path('../support/database.yml', __FILE__)
11
+ YAML.load(File.read(config_path))
12
+ end
10
13
 
11
- it "returns the object if present" do
12
- subject.database = database
14
+ describe "#database" do
15
+ it "builds a SequelAdapter when passed a Sequel::Database" do
16
+ subject.database = test_database
13
17
 
14
- expect(subject.database).to eql(database)
18
+ expect { subject.database.execute("SELECT 1") }.not_to raise_error
15
19
  end
16
20
 
17
- it "builds a Sequel Database from config when no database is passed" do
18
- database_config = double(:database_config)
21
+ it "builds a SequelAdapter from config when no database is passed" do
19
22
  subject.database_config = database_config
20
23
 
21
- expect(Sequel).to receive(:connect).with(database_config).once { database }
22
-
23
- expect(subject.database).to eql(database)
24
- expect(subject.database).to eql(database)
24
+ expect { subject.database.execute("SELECT 1") }.not_to raise_error
25
25
  end
26
26
 
27
27
  it "raises an error if no database or database_config is passed" do
@@ -31,30 +31,26 @@ module BeetleETL
31
31
  end
32
32
 
33
33
  describe "#disconnect_database" do
34
- let(:database) { double(:database) }
35
-
36
34
  it "disconnects from database if database_config was passed" do
37
- database_config = double(:database_config)
35
+ subject.database_config = database_config
38
36
 
39
- expect(Sequel).to receive(:connect).with(database_config) { database }
40
- expect(database).to receive(:disconnect)
37
+ expect(subject.database).to receive(:disconnect)
41
38
 
42
- subject.database_config = database_config
43
39
  subject.disconnect_database
44
40
  end
45
41
 
46
42
  it "does not disconnect from database if database object was passed" do
47
- expect(database).not_to receive(:disconnect)
43
+ subject.database = test_database
44
+
45
+ expect(subject.database).not_to receive(:disconnect)
48
46
 
49
- subject.database = database
50
47
  subject.disconnect_database
51
48
  end
52
49
  end
53
50
 
54
51
  describe "#target_schema" do
55
- it "returns nil if target_schema is 'public'" do
56
- subject.target_schema = "public"
57
- expect(subject.target_schema).to be_nil
52
+ it "returns 'public' by default" do
53
+ expect(subject.target_schema).to eql("public")
58
54
  end
59
55
 
60
56
  it "returns target_schema if target_schema is not 'public'" do
data/spec/dsl/dsl_spec.rb CHANGED
@@ -6,6 +6,7 @@ module BeetleETL
6
6
  let(:config) do
7
7
  Configuration.new.tap do |c|
8
8
  c.external_source = "bar"
9
+ c.target_schema = "baz_schema"
9
10
  end
10
11
  end
11
12
 
@@ -14,13 +15,13 @@ module BeetleETL
14
15
  describe '#stage_table' do
15
16
  it 'returns the current stage table name' do
16
17
  expect(subject.stage_table).to eql(
17
- BeetleETL::Naming.stage_table_name_sql("bar", :foo_table)
18
+ %Q["baz_schema"."#{BeetleETL::Naming.stage_table_name("bar", :foo_table)}"]
18
19
  )
19
20
  end
20
21
 
21
22
  it 'returns the stage table name for the given table' do
22
23
  expect(subject.stage_table(:bar_table)).to eql(
23
- BeetleETL::Naming.stage_table_name_sql("bar", :bar_table)
24
+ %Q["baz_schema"."#{BeetleETL::Naming.stage_table_name("bar", :bar_table)}"]
24
25
  )
25
26
  end
26
27
  end
@@ -13,7 +13,7 @@ module ExampleSchema
13
13
  def create_source_tables
14
14
  test_database.create_schema :source
15
15
 
16
- test_database.create_table :source__Organisation do
16
+ test_database.create_table Sequel.qualify("source", "Organisation") do
17
17
  Integer :pkOrgId
18
18
  String :Name, size: 255
19
19
  String :Adresse, size: 255
@@ -26,7 +26,9 @@ module ExampleSchema
26
26
  end
27
27
 
28
28
  def create_target_tables
29
- test_database.create_table :organisations do
29
+ test_database.create_schema :my_target
30
+
31
+ test_database.create_table Sequel.qualify("my_target", "organisations") do
30
32
  primary_key :id
31
33
  String :external_id, size: 255
32
34
  String :external_source, size: 255
@@ -37,12 +39,12 @@ module ExampleSchema
37
39
  DateTime :deleted_at
38
40
  end
39
41
 
40
- test_database.create_table :departments do
42
+ test_database.create_table Sequel.qualify("my_target", "departments") do
41
43
  primary_key :id
42
44
  String :external_id, size: 255
43
45
  String :external_source, size: 255
44
46
  String :name, size: 255
45
- foreign_key :organisation_id, :organisations
47
+ foreign_key :organisation_id, Sequel.qualify("my_target", "organisations")
46
48
  DateTime :created_at
47
49
  DateTime :updated_at
48
50
  DateTime :deleted_at
@@ -50,8 +52,7 @@ module ExampleSchema
50
52
  end
51
53
 
52
54
  def drop_target_tables
53
- test_database.drop_table :departments
54
- test_database.drop_table :organisations
55
+ test_database.drop_schema :my_target, cascade: true
55
56
  end
56
57
 
57
58
  end
@@ -27,6 +27,7 @@ describe BeetleETL do
27
27
  c.transformation_file = File.expand_path('../example_transform.rb', __FILE__)
28
28
  c.database_config = database_config
29
29
  c.external_source = 'source_name'
30
+ c.target_schema = 'my_target'
30
31
  c.logger = Logger.new(Tempfile.new("log"))
31
32
  end
32
33
  end
@@ -45,7 +46,7 @@ describe BeetleETL do
45
46
 
46
47
  def import1
47
48
  # create
48
- insert_into(:source__Organisation).values(
49
+ insert_into(Sequel.qualify("source", "Organisation")).values(
49
50
  [ :pkOrgId , :Name , :Adresse , :Abteilung ] ,
50
51
  [ 1 , 'Apple' , 'Apple Street' , 'iPhone' ] ,
51
52
  [ 2 , 'Apple' , 'Apple Street' , 'MacBook' ] ,
@@ -57,14 +58,14 @@ describe BeetleETL do
57
58
  BeetleETL.import(@config)
58
59
  end
59
60
 
60
- expect(:organisations).to have_values(
61
+ expect(Sequel.qualify("my_target", "organisations")).to have_values(
61
62
  [ :id , :external_id , :external_source , :name , :address , :created_at , :updated_at , :deleted_at ] ,
62
63
  [ organisation_id('Apple') , 'Apple' , 'source_name' , 'Apple' , 'Apple Street' , time1 , time1 , nil ] ,
63
64
  [ organisation_id('Google') , 'Google' , 'source_name' , 'Google' , 'Google Street' , time1 , time1 , nil ] ,
64
65
  [ organisation_id('Audi') , 'Audi' , 'source_name' , 'Audi' , 'Audi Street' , time1 , time1 , nil ]
65
66
  )
66
67
 
67
- expect(:departments).to have_values(
68
+ expect(Sequel.qualify("my_target", "departments")).to have_values(
68
69
  [ :id , :external_id , :organisation_id , :external_source , :name , :created_at , :updated_at , :deleted_at ] ,
69
70
  [ department_id('[Apple,1]') , '[Apple,1]' , organisation_id('Apple') , 'source_name' , 'iPhone' , time1 , time1 , nil ] ,
70
71
  [ department_id('[Apple,2]') , '[Apple,2]' , organisation_id('Apple') , 'source_name' , 'MacBook' , time1 , time1 , nil ] ,
@@ -72,12 +73,12 @@ describe BeetleETL do
72
73
  [ department_id('[Audi,4]') , '[Audi,4]' , organisation_id('Audi') , 'source_name' , 'A4' , time1 , time1 , nil ] ,
73
74
  )
74
75
 
75
- test_database[:source__Organisation].truncate
76
+ test_database[Sequel.qualify("source", "Organisation")].truncate
76
77
  end
77
78
 
78
79
  def import2
79
80
  # keep, update, delete
80
- insert_into(:source__Organisation).values(
81
+ insert_into(Sequel.qualify("source", "Organisation")).values(
81
82
  [ :pkOrgId , :Name , :Adresse , :Abteilung ] ,
82
83
  [ 1 , 'Apple' , 'Apple Street' , 'iPhone' ] ,
83
84
  [ 2 , 'Apple' , 'Apple Street' , 'MacBook' ] ,
@@ -89,14 +90,14 @@ describe BeetleETL do
89
90
  BeetleETL.import(@config)
90
91
  end
91
92
 
92
- expect(:organisations).to have_values(
93
+ expect(Sequel.qualify("my_target", "organisations")).to have_values(
93
94
  [ :id , :external_id , :external_source , :name , :address , :created_at , :updated_at , :deleted_at ] ,
94
95
  [ organisation_id('Apple') , 'Apple' , 'source_name' , 'Apple' , 'Apple Street' , time1 , time1 , nil ] ,
95
96
  [ organisation_id('Google') , 'Google' , 'source_name' , 'Google' , 'NEW Google Street' , time1 , time2 , nil ] ,
96
97
  [ organisation_id('Audi') , 'Audi' , 'source_name' , 'Audi' , 'Audi Street' , time1 , time2 , time2 ]
97
98
  )
98
99
 
99
- expect(:departments).to have_values(
100
+ expect(Sequel.qualify("my_target", "departments")).to have_values(
100
101
  [ :id , :external_id , :organisation_id , :external_source , :name , :created_at , :updated_at , :deleted_at ] ,
101
102
  [ department_id('[Apple,1]') , '[Apple,1]' , organisation_id('Apple') , 'source_name' , 'iPhone' , time1 , time1 , nil ] ,
102
103
  [ department_id('[Apple,2]') , '[Apple,2]' , organisation_id('Apple') , 'source_name' , 'MacBook' , time1 , time1 , nil ] ,
@@ -104,12 +105,12 @@ describe BeetleETL do
104
105
  [ department_id('[Audi,4]') , '[Audi,4]' , organisation_id('Audi') , 'source_name' , 'A4' , time1 , time2 , time2 ] ,
105
106
  )
106
107
 
107
- test_database[:source__Organisation].truncate
108
+ test_database[Sequel.qualify("source", "Organisation")].truncate
108
109
  end
109
110
 
110
111
  def import3
111
112
  # reinstate with update
112
- insert_into(:source__Organisation).values(
113
+ insert_into(Sequel.qualify("source", "Organisation")).values(
113
114
  [ :pkOrgId , :Name , :Adresse , :Abteilung ] ,
114
115
  [ 1 , 'Apple' , 'Apple Street' , 'iPhone' ] ,
115
116
  [ 2 , 'Apple' , 'Apple Street' , 'MacBook' ] ,
@@ -121,14 +122,14 @@ describe BeetleETL do
  BeetleETL.import(@config)
  end

- expect(:organisations).to have_values(
+ expect(Sequel.qualify("my_target", "organisations")).to have_values(
  [ :id , :external_id , :external_source , :name , :address , :created_at , :updated_at , :deleted_at ] ,
  [ organisation_id('Apple') , 'Apple' , 'source_name' , 'Apple' , 'Apple Street' , time1 , time1 , nil ] ,
  [ organisation_id('Google') , 'Google' , 'source_name' , 'Google' , 'NEW Google Street' , time1 , time2 , nil ] ,
  [ organisation_id('Audi') , 'Audi' , 'source_name' , 'Audi' , 'NEW Audi Street' , time1 , time3 , nil ]
  )

- expect(:departments).to have_values(
+ expect(Sequel.qualify("my_target", "departments")).to have_values(
  [ :id , :external_id , :organisation_id , :external_source , :name , :created_at , :updated_at , :deleted_at ] ,
  [ department_id('[Apple,1]') , '[Apple,1]' , organisation_id('Apple') , 'source_name' , 'iPhone' , time1 , time1 , nil ] ,
  [ department_id('[Apple,2]') , '[Apple,2]' , organisation_id('Apple') , 'source_name' , 'MacBook' , time1 , time1 , nil ] ,
@@ -136,15 +137,15 @@ describe BeetleETL do
  [ department_id('[Audi,4]') , '[Audi,4]' , organisation_id('Audi') , 'source_name' , 'A4' , time1 , time3 , nil ] ,
  )

- test_database[:source__Organisation].truncate
+ test_database[Sequel.qualify("source", "Organisation")].truncate
  end

  def organisation_id(external_id)
- test_database[:organisations].first(external_id: external_id)[:id]
+ test_database[Sequel.qualify("my_target", "organisations")].first(external_id: external_id)[:id]
  end

  def department_id(external_id)
- test_database[:departments].first(external_id: external_id)[:id]
+ test_database[Sequel.qualify("my_target", "departments")].first(external_id: external_id)[:id]
  end

  end
data/spec/spec_helper.rb CHANGED
@@ -1,6 +1,5 @@
  require "byebug"
- require "codeclimate-test-reporter"
- CodeClimate::TestReporter.start
+ require "ostruct"

  require_relative "../lib/beetle_etl.rb"
  require_relative "support/database_helpers.rb"
@@ -19,10 +18,9 @@ RSpec.configure do |config|
  else
  test_database.transaction do
  example.run
- raise Sequel::Error::Rollback
+ raise Sequel::Rollback
  end
  end
  end

  end
-
@@ -7,11 +7,12 @@ module BeetleETL
  let(:another_source) { 'another_source' }

  let(:config) do
- Configuration.new.tap do |c|
- c.stage_schema = 'stage'
- c.external_source = external_source
- c.database = test_database
- end
+ OpenStruct.new({
+ stage_schema: 'stage',
+ target_schema: 'public',
+ external_source: external_source,
+ database: test_database,
+ })
  end

  subject { AssignIds.new(config, :example_table) }
@@ -102,14 +102,14 @@ module BeetleETL
  it "truncates the stage table if it already exists" do
  CreateStage.new(config, :example_table, {}, @columns).run

- insert_into(subject.stage_table_name.to_sym).values(
+ insert_into(Sequel.qualify("public", subject.stage_table_name)).values(
  [ :some_string , :some_integer , :some_float ] ,
  [ "hello" , 123 , 123.456 ]
  )

  CreateStage.new(config, :example_table, {}, @columns).run

- expect(subject.stage_table_name).to have_values(
+ expect(Sequel.qualify("public", subject.stage_table_name)).to have_values(
  [:some_string, :some_integer, :some_float]
  )
  end
@@ -4,10 +4,11 @@ module BeetleETL
  describe MapRelations do

  let(:config) do
- Configuration.new.tap do |c|
- c.external_source = 'my_source'
- c.database = test_database
- end
+ OpenStruct.new({
+ external_source: 'my_source',
+ target_schema: 'public',
+ database: test_database
+ })
  end

  let(:dependee_a) do
@@ -75,15 +76,14 @@ module BeetleETL
  [ 26 , 'b_id' ] ,
  )

- insert_into(subject.stage_table_name.to_sym).values(
+ insert_into(Sequel.qualify("public", subject.stage_table_name)).values(
  [ :external_dependee_a_id , :external_dependee_b_id ] ,
  [ 'a_id' , 'b_id' ] ,
  )

-
  subject.run

- expect(subject.stage_table_name.to_sym).to have_values(
+ expect(Sequel.qualify("public", subject.stage_table_name)).to have_values(
  [ :dependee_a_id , :dependee_b_id ] ,
  [ 1 , 26 ] ,
  )
@@ -3,7 +3,7 @@ require 'spec_helper'
  module BeetleETL
  describe Step do

- let(:config) { Configuration.new }
+ let(:config) { OpenStruct.new }

  subject { Step.new(config, :example_table) }
  FooStep = Class.new(Step)
@@ -5,9 +5,9 @@ module BeetleETL

  let(:database) { double(:database) }
  let(:config) do
- Configuration.new.tap do |c|
- c.database = database
- end
+ OpenStruct.new({
+ database: database
+ })
  end
  let(:query) { double(:query) }

@@ -29,7 +29,7 @@ module BeetleETL

  describe '#run' do
  it 'runs a query in the database' do
- expect(database).to receive(:run).with(query)
+ expect(database).to receive(:execute).with(query)

  subject.run
  end
@@ -37,7 +37,7 @@ end

  RSpec::Matchers.define :have_values do |*rows|
  match do |table_description|
- dataset = test_database[table_description.to_sym]
+ dataset = test_database[table_description]

  columns = rows[0].map(&:to_sym)
  values = rows[1..-1]
data/spec/testing_spec.rb CHANGED
@@ -75,7 +75,7 @@ describe "BeetleETL:Testing" do
  with_stage_tables_for(:organisations, :some_table) do
  run_transformation(:organisations)

- expect(stage_table_name(:organisations)).to have_values(
+ expect(Sequel.qualify("public", stage_table_name(:organisations))).to have_values(
  [ :external_id , :address , :name ] ,
  [ "external_id" , "address" , "name" ]
  )
metadata CHANGED
@@ -1,17 +1,17 @@
  --- !ruby/object:Gem::Specification
  name: beetle_etl
  version: !ruby/object:Gem::Version
- version: 2.0.0
+ version: 2.0.1
  platform: ruby
  authors:
  - Luciano Maiwald
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-02-19 00:00:00.000000000 Z
+ date: 2017-08-24 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
- name: sequel
+ name: activesupport
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
@@ -25,13 +25,13 @@ dependencies:
  - !ruby/object:Gem::Version
  version: 4.0.0
  - !ruby/object:Gem::Dependency
- name: activesupport
+ name: sequel
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
  version: 4.0.0
- type: :runtime
+ type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
@@ -116,7 +116,6 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - ".byebug_history"
  - ".gitignore"
  - ".travis.yml"
  - Gemfile
@@ -125,6 +124,7 @@ files:
  - Rakefile
  - beetle_etl.gemspec
  - lib/beetle_etl.rb
+ - lib/beetle_etl/adapters/sequel_adapter.rb
  - lib/beetle_etl/configuration.rb
  - lib/beetle_etl/dsl/dsl.rb
  - lib/beetle_etl/dsl/transformation.rb
@@ -146,6 +146,8 @@ files:
  - lib/beetle_etl/testing/test_wrapper.rb
  - lib/beetle_etl/version.rb
  - script/postgres
+ - spec/adapters/sequel_adapter_spec.rb
+ - spec/adapters/shared_examples.rb
  - spec/beetle_etl_spec.rb
  - spec/configuration_spec.rb
  - spec/dsl/dsl_spec.rb
@@ -189,11 +191,13 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.2.5
+ rubygems_version: 2.5.2
  signing_key:
  specification_version: 4
  summary: BeetleETL helps you with your recurring ETL imports.
  test_files:
+ - spec/adapters/sequel_adapter_spec.rb
+ - spec/adapters/shared_examples.rb
  - spec/beetle_etl_spec.rb
  - spec/configuration_spec.rb
  - spec/dsl/dsl_spec.rb
data/.byebug_history DELETED
@@ -1,8 +0,0 @@
- continue
- backtrace
- stack
- trace
- c
- continue
- c
- target_table_name