beetle_etl 2.0.0 → 2.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 6a726a9734d6866687319a6742cc8db41ef68b64
4
- data.tar.gz: 82be15d660033bd3d957879ec351c0479b26623f
3
+ metadata.gz: 2cd21d7530444cef54966138ce38a5ec6ca0f08b
4
+ data.tar.gz: 277b54c1b957edeeb78fab7434852215f571d486
5
5
  SHA512:
6
- metadata.gz: 01c1408c035afb9d0dcadb9382a3db442318f48c74517f1376c3a6ef615675ef891df8559c172be6e6b1f7f47d2f29c4a6965493a613d93a24d439c40f224246
7
- data.tar.gz: 111f73fd2692ed88f4ce91c6f4da838efe13d1d12fc00bfb4f63cb00383509ffa1d171bf265362ded8f01ddde7f0a0e0659d552f7be8b31188c94bb4bb984e59
6
+ metadata.gz: 659b6e6cde8208578833b7fd4a870f15fd706300389aef4c2c8c93786a17954bddca324398f1ca2e32bba12a96ed89f2e26b237df78efe99f33e1cbb0d9cd476
7
+ data.tar.gz: cb8d7842912f24a4e82d0198f4508757f32c5051183face61e5359bfbe36a5ce353127e085992a33069d82e6afd6bc1d44a68c79d832d38fc079eca819d8726a
data/.gitignore CHANGED
@@ -1,5 +1,6 @@
1
1
  *.gem
2
2
  *.rbc
3
+ .byebug_history
3
4
  .bundle
4
5
  .config
5
6
  .yardoc
data/.travis.yml CHANGED
@@ -6,8 +6,6 @@ rvm:
6
6
  - 2.3.0
7
7
  addons:
8
8
  postgresql: "9.3"
9
- code_climate:
10
- repo_token: fcd6d8c28da900609a2cf903716d858621b8ce68152edbcebe6908a9a3f5d3d5
11
9
  before_install:
12
10
  - gem update --system
13
11
  - gem update bundler
data/Gemfile CHANGED
@@ -5,6 +5,5 @@ gemspec
5
5
 
6
6
  group :test do
7
7
  gem 'rake'
8
- gem 'codeclimate-test-reporter'
9
8
  gem 'byebug'
10
9
  end
data/README.md CHANGED
@@ -1,12 +1,11 @@
1
1
  # BeetleETL
2
2
  [![Build Status](https://travis-ci.org/maiwald/beetle_etl.svg?branch=master)](https://travis-ci.org/maiwald/beetle_etl)
3
- [![Code Climate](https://codeclimate.com/github/maiwald/beetle_etl.png)](https://codeclimate.com/github/maiwald/beetle_etl)
4
3
 
5
4
  BeetleETL helps you with synchronising relational databases and recurring imports of reference data. It is actually quite nice.
6
5
 
7
6
  Consider you have a set of database tables representing third party data (i.e. the ```source```) and you want to synchronize a set of tables in your application (i.e. the ```target```) with that third party data. Further consider that you want to apply transformations to that ```source``` data before you import it.
8
7
 
9
- You define your transformations and BeetleETL will to the rest. Even when your ```source``` data changes, when you run BeetleETL again, it can keep track of what changes need to be applied to what records in your application’s tables.
8
+ You define your transformations and BeetleETL will do the rest. Even when your ```source``` data changes, when you run BeetleETL again, it can keep track of what changes need to be applied to what records in your application’s tables.
10
9
 
11
10
  It currently only works with PostgreSQL databases.
12
11
 
@@ -34,61 +33,64 @@ Make sure the tables you want to import contain columns named ```external_id```
34
33
 
35
34
  Create a configuration object
36
35
 
37
- configuration = BeetleETL::Configuration.new do |config|
38
- # path to your transformation file
39
- config.transformation_file = "../my_fancy_transformations"
40
-
41
- # sequel database config
42
- config.database_config = {
43
- adapter: 'postgres'
44
- encoding: utf8
45
- host: my_host
46
- database: my_database
47
- username: 'foo'
48
- password: 'bar'
49
- pool: 5
50
- pool_timeout: 360
51
- connect_timeout: 360
52
- }
53
- # or config.database = # sequel database instance
54
-
55
- # name of your soruce
56
- config.external_source = "important_data"
57
-
58
- # target schema in case you use postgres schemas
59
- config.target_schema = "public" # default
60
-
61
- # logger
62
- config.logger = Logger.new(STDOUT) # default
63
- end
36
+ ```ruby
37
+ configuration = BeetleETL::Configuration.new do |config|
38
+ # path to your transformation file
39
+ config.transformation_file = "../my_fancy_transformations"
40
+
41
+ # sequel database config
42
+ config.database_config = {
43
+ adapter: 'postgres'
44
+ encoding: utf8
45
+ host: my_host
46
+ database: my_database
47
+ username: 'foo'
48
+ password: 'bar'
49
+ pool: 5
50
+ pool_timeout: 360
51
+ connect_timeout: 360
52
+ }
53
+ # or config.database = # sequel database instance
54
+
55
+ # name of your soruce
56
+ config.external_source = "important_data"
57
+
58
+ # target schema in case you use postgres schemas
59
+ config.target_schema = "public" # default
60
+
61
+ # logger
62
+ config.logger = Logger.new(STDOUT) # default
63
+ end
64
+ ```
64
65
 
65
66
  ### Defining Imports
66
67
 
67
68
  Fill a ```transformation``` file with import directives like this:
68
69
 
69
- import :departments do
70
- columns :name
71
-
72
- references :organisations, on: :organisation_id
73
-
74
- query <<-SQL
75
- INSERT INTO #{stage_table} (
76
- external_id,
77
- name,
78
- external_organisation_id
79
- )
80
-
81
- SELECT
82
- o.id,
83
- o.”dep_name”,
84
- data.”address”
85
-
86
- FROM ”Organisation” o
87
- JOIN additional_data data
88
- ON data.org_id = o.id
89
- SQL
90
- end
91
-
70
+ ```ruby
71
+ import :departments do
72
+ columns :name
73
+
74
+ references :organisations, on: :organisation_id
75
+
76
+ query <<-SQL
77
+ INSERT INTO #{stage_table} (
78
+ external_id,
79
+ name,
80
+ external_organisation_id
81
+ )
82
+
83
+ SELECT
84
+ o.id,
85
+ o.”dep_name”,
86
+ data.”address”
87
+
88
+ FROM ”Organisation” o
89
+ JOIN additional_data data
90
+ ON data.org_id = o.id
91
+ SQL
92
+ end
93
+ ```
92
94
 
93
95
  ```import``` takes the name of the table you want to fill and the configuration as arguments.
94
96
  With ```columns``` you define what columns BeetleETL is supposed to fill in your application’s table.
data/beetle_etl.gemspec CHANGED
@@ -18,9 +18,9 @@ Gem::Specification.new do |spec|
18
18
  spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
19
19
  spec.require_paths = ['lib']
20
20
 
21
- spec.add_runtime_dependency 'sequel', '>= 4.0.0'
22
21
  spec.add_runtime_dependency 'activesupport', '>= 4.0.0'
23
22
 
23
+ spec.add_development_dependency 'sequel', '>= 4.0.0'
24
24
  spec.add_development_dependency 'bundler', '~> 1.11'
25
25
  spec.add_development_dependency 'rspec', '>= 3.0.0'
26
26
  spec.add_development_dependency 'timecop', '>= 0.7.0'
data/lib/beetle_etl.rb CHANGED
@@ -5,6 +5,7 @@ require 'logger'
5
5
 
6
6
  module BeetleETL
7
7
 
8
+ require 'beetle_etl/adapters/sequel_adapter'
8
9
  require 'beetle_etl/configuration'
9
10
 
10
11
  require 'beetle_etl/dsl/dsl'
@@ -0,0 +1,35 @@
+ module BeetleETL
+   class SequelAdapter
+     attr_reader :database
+     def initialize(database)
+       @database = database
+     end
+
+     def execute(query)
+       @database.run(query)
+     end
+
+     def column_names(schema_name, table_name)
+       @database[Sequel.qualify(schema_name, table_name)].columns
+     end
+
+     def column_types(schema_name, table_name)
+       Hash[@database.schema(Sequel.qualify(schema_name, table_name))].reduce({}) do |acc, (name, column_config)|
+         acc[name.to_sym] = column_config.fetch(:db_type)
+         acc
+       end
+     end
+
+     def table_exists?(schema_name, table_name)
+       @database.table_exists?(Sequel.qualify(schema_name, table_name))
+     end
+
+     def transaction(&block)
+       @database.transaction(&block)
+     end
+
+     def disconnect
+       @database.disconnect
+     end
+   end
+ end
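Note on the hunk above: the new adapter puts six methods (`execute`, `column_names`, `column_types`, `table_exists?`, `transaction`, `disconnect`) between the import steps and the Sequel connection. As a rough illustration of what that seam allows, the sketch below defines a hypothetical in-memory stand-in that answers the same messages. `FakeAdapter`, its constructor argument, and the `executed` reader are assumptions made for this example and are not part of the gem.

```ruby
# Hypothetical stand-in mirroring the SequelAdapter method names.
# Not part of beetle_etl: it records queries and serves canned schema data.
class FakeAdapter
  attr_reader :executed

  # tables: { "schema.table" => { column_name: "db type" } }
  def initialize(tables)
    @tables = tables
    @executed = []
  end

  # Records the query instead of sending it to a database.
  def execute(query)
    @executed << query.strip
    nil
  end

  def column_names(schema_name, table_name)
    column_types(schema_name, table_name).keys
  end

  def column_types(schema_name, table_name)
    @tables.fetch("#{schema_name}.#{table_name}")
  end

  def table_exists?(schema_name, table_name)
    @tables.key?("#{schema_name}.#{table_name}")
  end

  # No real transaction here; the block is simply run.
  def transaction
    yield
  end

  def disconnect
    nil
  end
end

adapter = FakeAdapter.new(
  { "foo.persons" => { id: "integer", first_name: "character varying(255)" } }
)
adapter.execute("SELECT 1")
```

Code that only talks to the adapter interface could be exercised against such a stand-in without a PostgreSQL instance, although the gem's own specs in this release run against a real test database.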
@@ -1,38 +1,54 @@
1
1
  module BeetleETL
2
+
2
3
  InvalidConfigurationError = Class.new(StandardError)
3
4
 
4
5
  class Configuration
5
6
  attr_accessor \
6
7
  :transformation_file,
8
+ :target_schema,
7
9
  :stage_schema,
8
10
  :external_source,
9
11
  :logger
10
12
 
11
13
  attr_writer \
12
14
  :database,
13
- :database_config,
14
- :target_schema
15
+ :database_config
15
16
 
16
17
  def initialize
17
18
  @target_schema = 'public'
18
19
  @logger = ::Logger.new(STDOUT)
19
20
  end
20
21
 
22
+ def database=(database)
23
+ @database_config = nil
24
+ @adapter ||= case
25
+ when sequel?(database) then SequelAdapter.new(database)
26
+ else nil
27
+ end
28
+ end
29
+
30
+ def database_config=(database_config)
31
+ @database_config = database_config
32
+ @adapter = SequelAdapter.new(Sequel.connect(@database_config))
33
+ end
34
+
21
35
  def database
22
- if [@database, @database_config].none?
23
- msg = "Either Sequel connection database_config or a Sequel Database object required"
36
+ if @adapter.nil?
37
+ msg = "Either Sequel connection database_config, Sequel::Database object or ActiveRecord::Base.connection required!"
24
38
  raise InvalidConfigurationError.new(msg)
25
39
  end
26
40
 
27
- @database ||= Sequel.connect(@database_config)
41
+ @adapter
28
42
  end
29
43
 
30
44
  def disconnect_database
31
- database.disconnect if @database_config
45
+ @adapter.disconnect if @database_config
32
46
  end
33
47
 
34
- def target_schema
35
- @target_schema != 'public' ? @target_schema : nil
48
+ private
49
+
50
+ def sequel?(database)
51
+ defined?(::Sequel::Database) && database.is_a?(::Sequel::Database)
36
52
  end
37
53
 
38
54
  end
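Note on the hunk above: `sequel?` combines `defined?(::Sequel::Database)` with `is_a?`, so the type check can only run when the library is loaded. The sketch below reproduces that guard pattern in isolation. `FakeSequel`, `WrappingAdapter`, `adapter_for` and `NotLoadedLibrary` are hypothetical names used so the example runs without the sequel gem; they are not part of beetle_etl.

```ruby
# Stand-in constants so the sketch runs without the sequel gem (assumption).
module FakeSequel
  class Database; end
end

# Hypothetical wrapper playing the role of an adapter.
class WrappingAdapter
  attr_reader :database

  def initialize(database)
    @database = database
  end
end

# Returns an adapter for a supported connection object, or nil otherwise.
# defined? yields nil for a missing constant instead of raising NameError,
# so the is_a? check only runs when the library is actually loaded.
def adapter_for(database)
  if defined?(::FakeSequel::Database) && database.is_a?(::FakeSequel::Database)
    WrappingAdapter.new(database)
  end
end

# The same guard against a library that is not loaded stays falsy and does not raise.
def from_missing_library?(database)
  defined?(::NotLoadedLibrary::Database) && database.is_a?(::NotLoadedLibrary::Database)
end

supported   = adapter_for(FakeSequel::Database.new)
unsupported = adapter_for(Object.new)
```

One design consequence of this guard is that an unsupported object produces no adapter at all, so in a configuration shaped like the one in the hunk the failure would surface later, when the adapter is first requested, rather than at assignment time.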
@@ -26,10 +26,12 @@ module BeetleETL
26
26
  # query helper methods
27
27
 
28
28
  def stage_table(table_name = nil)
29
- BeetleETL::Naming.stage_table_name_sql(
29
+ stage_table_name = BeetleETL::Naming.stage_table_name(
30
30
  @config.external_source,
31
31
  table_name || @table_name
32
32
  )
33
+
34
+ %Q("#{@config.target_schema}"."#{stage_table_name}")
33
35
  end
34
36
 
35
37
  def combined_key(*args)
@@ -10,18 +10,5 @@ module BeetleETL
10
10
  "#{external_source.to_s}-#{table_name.to_s}-#{digest}"[0, 63]
11
11
  end
12
12
 
13
- def stage_table_name_sql(external_source, table_name)
14
- %Q("#{stage_table_name(external_source, table_name)}")
15
- end
16
-
17
- def target_table_name(target_schema, table_name)
18
- schema = target_schema ? target_schema.to_s : nil
19
- [schema, table_name.to_s].compact.join('.')
20
- end
21
-
22
- def target_table_name_sql(target_schema, table_name)
23
- %Q("#{target_table_name(target_schema, table_name)}")
24
- end
25
-
26
13
  end
27
14
  end
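Note on the two hunks above: the `*_sql` naming helpers are removed, and callers now build the schema-qualified, double-quoted identifier inline from `target_schema` and the stage table name. A minimal sketch of that composition follows. The 63-character truncation and the `source-table-digest` layout come from the diff; how the digest is computed is not visible here, so the MD5 call is an assumption, as is the helper name `qualified_stage_table`.

```ruby
require "digest"

# Mirrors the layout shown in the diff: "<source>-<table>-<digest>", cut to 63 characters.
# Assumption: the digest algorithm and its input are illustrative only.
def stage_table_name(external_source, table_name)
  digest = Digest::MD5.hexdigest("#{external_source}-#{table_name}")
  "#{external_source}-#{table_name}-#{digest}"[0, 63]
end

# Hypothetical helper: quotes schema and table separately, as the new SQL strings do.
def qualified_stage_table(target_schema, external_source, table_name)
  %Q("#{target_schema}"."#{stage_table_name(external_source, table_name)}")
end

name      = stage_table_name("source_name", :departments)
qualified = qualified_stage_table("my_target", "source_name", :departments)
puts qualified
```

Quoting the schema and the table as two separate identifiers matters because a single quoted string such as `"my_target.departments"` would be read by PostgreSQL as one table name containing a dot.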
@@ -7,10 +7,10 @@ module BeetleETL
7
7
 
8
8
  def run
9
9
  database.execute <<-SQL
10
- UPDATE #{stage_table_name_sql} stage_update
11
- SET id = COALESCE(target.id, nextval('#{table_name}_id_seq'))
12
- FROM #{stage_table_name_sql} stage
13
- LEFT OUTER JOIN #{target_table_name_sql} target
10
+ UPDATE "#{target_schema}"."#{stage_table_name}" stage_update
11
+ SET id = COALESCE(target.id, NEXTVAL('#{target_schema}.#{table_name}_id_seq'))
12
+ FROM "#{target_schema}"."#{stage_table_name}" stage
13
+ LEFT OUTER JOIN "#{target_schema}"."#{table_name}" target
14
14
  on (
15
15
  stage.external_id = target.external_id
16
16
  AND target.external_source = '#{external_source}'
@@ -13,7 +13,7 @@ module BeetleETL
13
13
 
14
14
  def run
15
15
  database.execute <<-SQL
16
- CREATE UNLOGGED TABLE IF NOT EXISTS #{stage_table_name_sql} (
16
+ CREATE UNLOGGED TABLE IF NOT EXISTS "#{target_schema}"."#{stage_table_name}" (
17
17
  id integer,
18
18
  external_id character varying(255),
19
19
  transition character varying(255),
@@ -23,13 +23,13 @@ module BeetleETL
23
23
 
24
24
  #{index_definitions};
25
25
 
26
- ALTER TABLE #{stage_table_name_sql}
26
+ ALTER TABLE "#{target_schema}"."#{stage_table_name}"
27
27
  SET (
28
28
  autovacuum_enabled = false,
29
29
  toast.autovacuum_enabled = false
30
30
  );
31
31
 
32
- TRUNCATE TABLE #{stage_table_name_sql} RESTART IDENTITY CASCADE;
32
+ TRUNCATE TABLE "#{target_schema}"."#{stage_table_name}" RESTART IDENTITY CASCADE;
33
33
  SQL
34
34
  end
35
35
 
@@ -39,7 +39,7 @@ module BeetleETL
39
39
  definitions = [
40
40
  payload_column_definitions,
41
41
  relation_column_definitions
42
- ].compact
42
+ ].flatten
43
43
 
44
44
  if definitions.empty?
45
45
  raise NoColumnsDefinedError.new <<-MSG
@@ -52,35 +52,29 @@ module BeetleETL
52
52
  end
53
53
 
54
54
  def payload_column_definitions
55
- definitions = (@column_names - @relations.keys).map do |column_name|
55
+ (@column_names - @relations.keys).map do |column_name|
56
56
  "#{column_name} #{column_type(column_name)}"
57
57
  end
58
- definitions.join(',') if definitions.any?
59
58
  end
60
59
 
61
60
  def relation_column_definitions
62
- definitions = @relations.map do |foreign_key_name, table|
61
+ @relations.map do |foreign_key_name, table|
63
62
  <<-SQL
64
63
  #{foreign_key_name} integer,
65
64
  external_#{foreign_key_name} character varying(255)
66
65
  SQL
67
66
  end
68
- definitions.join(',') if definitions.any?
69
67
  end
70
68
 
71
69
  def index_definitions
72
70
  index_columns = [:external_id] + @relations.keys.map { |c| "external_#{c}" }
73
71
  index_columns.map do |column_name|
74
- "CREATE INDEX ON #{stage_table_name_sql} (#{column_name})"
72
+ %Q[CREATE INDEX ON "#{target_schema}"."#{stage_table_name}" (#{column_name});]
75
73
  end.join(";")
76
74
  end
77
75
 
78
76
  def column_type(column_name)
79
- @column_types ||= Hash[database.schema(target_table_name.to_sym)]
80
- .reduce({}) do |acc, (name, schema)|
81
- acc[name.to_sym] = schema.fetch(:db_type)
82
- acc
83
- end
77
+ @column_types ||= database.column_types(target_schema, table_name)
84
78
 
85
79
  unless @column_types.has_key?(column_name)
86
80
  raise ColumnDefinitionNotFoundError.new <<-MSG
@@ -3,7 +3,7 @@ module BeetleETL
3
3
 
4
4
  def run
5
5
  database.execute <<-SQL
6
- DROP TABLE IF EXISTS #{stage_table_name_sql}
6
+ DROP TABLE IF EXISTS "#{target_schema}"."#{stage_table_name}";
7
7
  SQL
8
8
  end
9
9
 
@@ -25,26 +25,26 @@ module BeetleETL
25
25
  just_now = now
26
26
 
27
27
  database.execute <<-SQL
28
- INSERT INTO #{target_table_name_sql}
28
+ INSERT INTO "#{target_schema}"."#{table_name}"
29
29
  (#{data_columns.join(', ')}, external_source, created_at, updated_at)
30
30
  SELECT
31
31
  #{data_columns.join(', ')},
32
32
  '#{external_source}',
33
33
  '#{just_now}',
34
34
  '#{just_now}'
35
- FROM #{stage_table_name_sql}
35
+ FROM "#{target_schema}"."#{stage_table_name}"
36
36
  WHERE transition = 'CREATE'
37
37
  SQL
38
38
  end
39
39
 
40
40
  def load_update
41
41
  database.execute <<-SQL
42
- UPDATE #{target_table_name_sql} target
42
+ UPDATE "#{target_schema}"."#{table_name}" target
43
43
  SET
44
44
  #{updatable_columns.map { |c| %Q("#{c}" = stage."#{c}") }.join(',')},
45
45
  "updated_at" = '#{now}',
46
46
  deleted_at = NULL
47
- FROM #{stage_table_name_sql} stage
47
+ FROM "#{target_schema}"."#{stage_table_name}" stage
48
48
  WHERE stage.id = target.id
49
49
  AND stage.transition IN ('UPDATE', 'REINSTATE')
50
50
  SQL
@@ -54,11 +54,11 @@ module BeetleETL
54
54
  just_now = now
55
55
 
56
56
  database.execute <<-SQL
57
- UPDATE #{target_table_name_sql} target
57
+ UPDATE "#{target_schema}"."#{table_name}" target
58
58
  SET
59
59
  updated_at = '#{just_now}',
60
60
  deleted_at = '#{just_now}'
61
- FROM #{stage_table_name_sql} stage
61
+ FROM "#{target_schema}"."#{stage_table_name}" stage
62
62
  WHERE stage.id = target.id
63
63
  AND stage.transition = 'DELETE'
64
64
  SQL
@@ -71,7 +71,7 @@ module BeetleETL
71
71
  end
72
72
 
73
73
  def table_columns
74
- @table_columns ||= database[stage_table_name.to_sym].columns
74
+ @table_columns ||= database.column_names(target_schema, stage_table_name)
75
75
  end
76
76
 
77
77
  def ignored_columns
@@ -14,9 +14,9 @@ module BeetleETL
14
14
  def run
15
15
  @relations.map do |foreign_key_column, foreign_table_name|
16
16
  database.execute <<-SQL
17
- UPDATE #{stage_table_name_sql} current_table
17
+ UPDATE "#{target_schema}"."#{stage_table_name}" current_table
18
18
  SET #{foreign_key_column} = foreign_table.id
19
- FROM #{stage_table_name_sql(foreign_table_name)} foreign_table
19
+ FROM "#{target_schema}"."#{stage_table_name(foreign_table_name)}" foreign_table
20
20
  WHERE current_table.external_#{foreign_key_column} = foreign_table.external_id
21
21
  SQL
22
22
  end
@@ -29,23 +29,12 @@ module BeetleETL
29
29
  @config.database
30
30
  end
31
31
 
32
- # naming
33
-
34
- def stage_table_name
35
- BeetleETL::Naming.stage_table_name(@config.external_source, @table_name)
36
- end
37
-
38
- def stage_table_name_sql(table_name = nil)
39
- table_name ||= @table_name
40
- BeetleETL::Naming.stage_table_name_sql(@config.external_source, table_name)
41
- end
42
-
43
- def target_table_name
44
- BeetleETL::Naming.target_table_name(@config.target_schema, @table_name)
32
+ def target_schema
33
+ @config.target_schema
45
34
  end
46
35
 
47
- def target_table_name_sql
48
- BeetleETL::Naming.target_table_name_sql(@config.target_schema, @table_name)
36
+ def stage_table_name(table_name = nil)
37
+ BeetleETL::Naming.stage_table_name(external_source, table_name || @table_name)
49
38
  end
50
39
 
51
40
  end
@@ -18,11 +18,11 @@ module BeetleETL
18
18
 
19
19
  def transition_create
20
20
  database.execute <<-SQL
21
- UPDATE #{stage_table_name_sql} stage
21
+ UPDATE "#{target_schema}"."#{stage_table_name}" stage
22
22
  SET transition = 'CREATE'
23
23
  WHERE NOT EXISTS (
24
24
  SELECT 1
25
- FROM #{target_table_name} target
25
+ FROM "#{target_schema}"."#{table_name}" target
26
26
  WHERE target.external_id = stage.external_id
27
27
  AND target.external_source = '#{external_source}'
28
28
  )
@@ -31,11 +31,11 @@ module BeetleETL
31
31
 
32
32
  def transition_update
33
33
  database.execute <<-SQL
34
- UPDATE #{stage_table_name_sql} stage
34
+ UPDATE "#{target_schema}"."#{stage_table_name}" stage
35
35
  SET transition = 'UPDATE'
36
36
  WHERE EXISTS (
37
37
  SELECT 1
38
- FROM #{target_table_name} target
38
+ FROM "#{target_schema}"."#{table_name}" target
39
39
  WHERE target.external_id = stage.external_id
40
40
  AND target.external_source = '#{external_source}'
41
41
  AND target.deleted_at IS NULL
@@ -49,13 +49,13 @@ module BeetleETL
49
49
 
50
50
  def transition_delete
51
51
  database.execute <<-SQL
52
- INSERT INTO #{stage_table_name_sql}
52
+ INSERT INTO "#{target_schema}"."#{stage_table_name}"
53
53
  (external_id, transition)
54
54
  SELECT
55
55
  target.external_id,
56
56
  'DELETE'
57
- FROM #{target_table_name_sql} target
58
- LEFT OUTER JOIN #{stage_table_name_sql} stage
57
+ FROM "#{target_schema}"."#{table_name}" target
58
+ LEFT OUTER JOIN "#{target_schema}"."#{stage_table_name}" stage
59
59
  ON (stage.external_id = target.external_id)
60
60
  WHERE stage.external_id IS NULL
61
61
  AND target.external_source = '#{external_source}'
@@ -65,11 +65,11 @@ module BeetleETL
65
65
 
66
66
  def transition_reinstate
67
67
  database.execute <<-SQL
68
- UPDATE #{stage_table_name_sql} stage
68
+ UPDATE "#{target_schema}"."#{stage_table_name}" stage
69
69
  SET transition = 'REINSTATE'
70
70
  WHERE EXISTS (
71
71
  SELECT 1
72
- FROM #{target_table_name_sql} target
72
+ FROM "#{target_schema}"."#{table_name}" target
73
73
  WHERE target.external_id = stage.external_id
74
74
  AND target.external_source = '#{external_source}'
75
75
  AND target.deleted_at IS NOT NULL
@@ -92,7 +92,7 @@ module BeetleETL
92
92
  end
93
93
 
94
94
  def table_columns
95
- @table_columns ||= database[stage_table_name.to_sym].columns
95
+ @table_columns ||= database.column_names(target_schema, stage_table_name)
96
96
  end
97
97
 
98
98
  def ignored_columns
@@ -12,7 +12,7 @@ module BeetleETL
12
12
  end
13
13
 
14
14
  def run
15
- database.run(@query)
15
+ database.execute(@query)
16
16
  end
17
17
 
18
18
  end
@@ -13,7 +13,7 @@ module BeetleETL
13
13
 
14
14
  def with_stage_tables_for(*table_names, &block)
15
15
  table_names.each do |table_name|
16
- unless @@config.database.table_exists?(table_name)
16
+ unless @@config.database.table_exists?(@@config.target_schema, table_name)
17
17
  raise TargetTableNotFoundError.new <<-MSG
18
18
  Missing target table "#{table_name}".
19
19
  In order to create stage tables, BeetleETL requires the target tables to exist because they provide the column definitions.
@@ -1,3 +1,3 @@
1
1
  module BeetleETL
2
- VERSION = "2.0.0"
2
+ VERSION = "2.0.1"
3
3
  end
@@ -0,0 +1,10 @@
+ require "spec_helper"
+ require_relative "shared_examples"
+
+ module BeetleETL
+   describe SequelAdapter do
+     it_behaves_like "database adapter" do
+       let(:adapter) { SequelAdapter.new(test_database) }
+     end
+   end
+ end
@@ -0,0 +1,58 @@
+ require "spec_helper"
+
+ shared_examples "database adapter" do
+   before do
+     test_database.run <<-SQL
+       CREATE SCHEMA foo;
+       CREATE TABLE foo.persons (
+         id int,
+         first_name varchar(255),
+         last_name varchar(255)
+       );
+     SQL
+   end
+
+   after do
+     test_database.run <<-SQL
+       DROP SCHEMA foo CASCADE;
+     SQL
+   end
+
+   describe "#execute" do
+     it "executes SQL" do
+       adapter.execute <<-SQL
+         INSERT INTO foo.persons VALUES (1, 'hugo', 'warzenkopp');
+       SQL
+
+       expect(Sequel.qualify("foo", "persons")).to have_values(
+         [ :id , :first_name , :last_name ],
+         [ 1 , "hugo" , "warzenkopp" ]
+       )
+     end
+   end
+
+   describe "#column_names" do
+     it "returns a tables column names" do
+       expect(adapter.column_names("foo", "persons")).to match_array([
+         :id, :first_name, :last_name
+       ])
+     end
+   end
+
+   describe "#column_types" do
+     it "returns a tables column names" do
+       expect(adapter.column_types("foo", "persons")).to match({
+         id: 'integer',
+         first_name: 'character varying(255)',
+         last_name: 'character varying(255)'
+       })
+     end
+   end
+
+   describe "#table_exists?" do
+     it "returns whether a table exists" do
+       expect(adapter.table_exists?("foo", "persons")).to eql(true)
+       expect(adapter.table_exists?("foo", "persons200")).to eql(false)
+     end
+   end
+ end
@@ -1,27 +1,27 @@
1
1
  require 'spec_helper'
2
+ require 'yaml'
2
3
 
3
4
  module BeetleETL
4
5
  describe Configuration do
5
6
 
6
7
  subject { Configuration.new }
7
8
 
8
- describe "#database" do
9
- let(:database) { double(:database) }
9
+ let(:database_config) do
10
+ config_path = File.expand_path('../support/database.yml', __FILE__)
11
+ YAML.load(File.read(config_path))
12
+ end
10
13
 
11
- it "returns the object if present" do
12
- subject.database = database
14
+ describe "#database" do
15
+ it "builds a SequelAdapter when passed a Sequel::Database" do
16
+ subject.database = test_database
13
17
 
14
- expect(subject.database).to eql(database)
18
+ expect { subject.database.execute("SELECT 1") }.not_to raise_error
15
19
  end
16
20
 
17
- it "builds a Sequel Database from config when no database is passed" do
18
- database_config = double(:database_config)
21
+ it "builds a SequelAdapter from config when no database is passed" do
19
22
  subject.database_config = database_config
20
23
 
21
- expect(Sequel).to receive(:connect).with(database_config).once { database }
22
-
23
- expect(subject.database).to eql(database)
24
- expect(subject.database).to eql(database)
24
+ expect { subject.database.execute("SELECT 1") }.not_to raise_error
25
25
  end
26
26
 
27
27
  it "raises an error if no database or database_config is passed" do
@@ -31,30 +31,26 @@ module BeetleETL
31
31
  end
32
32
 
33
33
  describe "#disconnect_database" do
34
- let(:database) { double(:database) }
35
-
36
34
  it "disconnects from database if database_config was passed" do
37
- database_config = double(:database_config)
35
+ subject.database_config = database_config
38
36
 
39
- expect(Sequel).to receive(:connect).with(database_config) { database }
40
- expect(database).to receive(:disconnect)
37
+ expect(subject.database).to receive(:disconnect)
41
38
 
42
- subject.database_config = database_config
43
39
  subject.disconnect_database
44
40
  end
45
41
 
46
42
  it "does not disconnect from database if database object was passed" do
47
- expect(database).not_to receive(:disconnect)
43
+ subject.database = test_database
44
+
45
+ expect(subject.database).not_to receive(:disconnect)
48
46
 
49
- subject.database = database
50
47
  subject.disconnect_database
51
48
  end
52
49
  end
53
50
 
54
51
  describe "#target_schema" do
55
- it "returns nil if target_schema is 'public'" do
56
- subject.target_schema = "public"
57
- expect(subject.target_schema).to be_nil
52
+ it "returns 'public' by default" do
53
+ expect(subject.target_schema).to eql("public")
58
54
  end
59
55
 
60
56
  it "returns target_schema if target_schema is not 'public'" do
data/spec/dsl/dsl_spec.rb CHANGED
@@ -6,6 +6,7 @@ module BeetleETL
6
6
  let(:config) do
7
7
  Configuration.new.tap do |c|
8
8
  c.external_source = "bar"
9
+ c.target_schema = "baz_schema"
9
10
  end
10
11
  end
11
12
 
@@ -14,13 +15,13 @@ module BeetleETL
14
15
  describe '#stage_table' do
15
16
  it 'returns the current stage table name' do
16
17
  expect(subject.stage_table).to eql(
17
- BeetleETL::Naming.stage_table_name_sql("bar", :foo_table)
18
+ %Q["baz_schema"."#{BeetleETL::Naming.stage_table_name("bar", :foo_table)}"]
18
19
  )
19
20
  end
20
21
 
21
22
  it 'returns the stage table name for the given table' do
22
23
  expect(subject.stage_table(:bar_table)).to eql(
23
- BeetleETL::Naming.stage_table_name_sql("bar", :bar_table)
24
+ %Q["baz_schema"."#{BeetleETL::Naming.stage_table_name("bar", :bar_table)}"]
24
25
  )
25
26
  end
26
27
  end
@@ -13,7 +13,7 @@ module ExampleSchema
13
13
  def create_source_tables
14
14
  test_database.create_schema :source
15
15
 
16
- test_database.create_table :source__Organisation do
16
+ test_database.create_table Sequel.qualify("source", "Organisation") do
17
17
  Integer :pkOrgId
18
18
  String :Name, size: 255
19
19
  String :Adresse, size: 255
@@ -26,7 +26,9 @@ module ExampleSchema
26
26
  end
27
27
 
28
28
  def create_target_tables
29
- test_database.create_table :organisations do
29
+ test_database.create_schema :my_target
30
+
31
+ test_database.create_table Sequel.qualify("my_target", "organisations") do
30
32
  primary_key :id
31
33
  String :external_id, size: 255
32
34
  String :external_source, size: 255
@@ -37,12 +39,12 @@ module ExampleSchema
37
39
  DateTime :deleted_at
38
40
  end
39
41
 
40
- test_database.create_table :departments do
42
+ test_database.create_table Sequel.qualify("my_target", "departments") do
41
43
  primary_key :id
42
44
  String :external_id, size: 255
43
45
  String :external_source, size: 255
44
46
  String :name, size: 255
45
- foreign_key :organisation_id, :organisations
47
+ foreign_key :organisation_id, Sequel.qualify("my_target", "organisations")
46
48
  DateTime :created_at
47
49
  DateTime :updated_at
48
50
  DateTime :deleted_at
@@ -50,8 +52,7 @@ module ExampleSchema
50
52
  end
51
53
 
52
54
  def drop_target_tables
53
- test_database.drop_table :departments
54
- test_database.drop_table :organisations
55
+ test_database.drop_schema :my_target, cascade: true
55
56
  end
56
57
 
57
58
  end
@@ -27,6 +27,7 @@ describe BeetleETL do
27
27
  c.transformation_file = File.expand_path('../example_transform.rb', __FILE__)
28
28
  c.database_config = database_config
29
29
  c.external_source = 'source_name'
30
+ c.target_schema = 'my_target'
30
31
  c.logger = Logger.new(Tempfile.new("log"))
31
32
  end
32
33
  end
@@ -45,7 +46,7 @@ describe BeetleETL do
45
46
 
46
47
  def import1
47
48
  # create
48
- insert_into(:source__Organisation).values(
49
+ insert_into(Sequel.qualify("source", "Organisation")).values(
49
50
  [ :pkOrgId , :Name , :Adresse , :Abteilung ] ,
50
51
  [ 1 , 'Apple' , 'Apple Street' , 'iPhone' ] ,
51
52
  [ 2 , 'Apple' , 'Apple Street' , 'MacBook' ] ,
@@ -57,14 +58,14 @@ describe BeetleETL do
57
58
  BeetleETL.import(@config)
58
59
  end
59
60
 
60
- expect(:organisations).to have_values(
61
+ expect(Sequel.qualify("my_target", "organisations")).to have_values(
61
62
  [ :id , :external_id , :external_source , :name , :address , :created_at , :updated_at , :deleted_at ] ,
62
63
  [ organisation_id('Apple') , 'Apple' , 'source_name' , 'Apple' , 'Apple Street' , time1 , time1 , nil ] ,
63
64
  [ organisation_id('Google') , 'Google' , 'source_name' , 'Google' , 'Google Street' , time1 , time1 , nil ] ,
64
65
  [ organisation_id('Audi') , 'Audi' , 'source_name' , 'Audi' , 'Audi Street' , time1 , time1 , nil ]
65
66
  )
66
67
 
67
- expect(:departments).to have_values(
68
+ expect(Sequel.qualify("my_target", "departments")).to have_values(
68
69
  [ :id , :external_id , :organisation_id , :external_source , :name , :created_at , :updated_at , :deleted_at ] ,
69
70
  [ department_id('[Apple,1]') , '[Apple,1]' , organisation_id('Apple') , 'source_name' , 'iPhone' , time1 , time1 , nil ] ,
70
71
  [ department_id('[Apple,2]') , '[Apple,2]' , organisation_id('Apple') , 'source_name' , 'MacBook' , time1 , time1 , nil ] ,
@@ -72,12 +73,12 @@ describe BeetleETL do
72
73
  [ department_id('[Audi,4]') , '[Audi,4]' , organisation_id('Audi') , 'source_name' , 'A4' , time1 , time1 , nil ] ,
73
74
  )
74
75
 
75
- test_database[:source__Organisation].truncate
76
+ test_database[Sequel.qualify("source", "Organisation")].truncate
76
77
  end
77
78
 
78
79
  def import2
79
80
  # keep, update, delete
80
- insert_into(:source__Organisation).values(
81
+ insert_into(Sequel.qualify("source", "Organisation")).values(
81
82
  [ :pkOrgId , :Name , :Adresse , :Abteilung ] ,
82
83
  [ 1 , 'Apple' , 'Apple Street' , 'iPhone' ] ,
83
84
  [ 2 , 'Apple' , 'Apple Street' , 'MacBook' ] ,
@@ -89,14 +90,14 @@ describe BeetleETL do
89
90
  BeetleETL.import(@config)
90
91
  end
91
92
 
92
- expect(:organisations).to have_values(
93
+ expect(Sequel.qualify("my_target", "organisations")).to have_values(
93
94
  [ :id , :external_id , :external_source , :name , :address , :created_at , :updated_at , :deleted_at ] ,
94
95
  [ organisation_id('Apple') , 'Apple' , 'source_name' , 'Apple' , 'Apple Street' , time1 , time1 , nil ] ,
95
96
  [ organisation_id('Google') , 'Google' , 'source_name' , 'Google' , 'NEW Google Street' , time1 , time2 , nil ] ,
96
97
  [ organisation_id('Audi') , 'Audi' , 'source_name' , 'Audi' , 'Audi Street' , time1 , time2 , time2 ]
97
98
  )
98
99
 
99
- expect(:departments).to have_values(
100
+ expect(Sequel.qualify("my_target", "departments")).to have_values(
100
101
  [ :id , :external_id , :organisation_id , :external_source , :name , :created_at , :updated_at , :deleted_at ] ,
101
102
  [ department_id('[Apple,1]') , '[Apple,1]' , organisation_id('Apple') , 'source_name' , 'iPhone' , time1 , time1 , nil ] ,
102
103
  [ department_id('[Apple,2]') , '[Apple,2]' , organisation_id('Apple') , 'source_name' , 'MacBook' , time1 , time1 , nil ] ,
@@ -104,12 +105,12 @@ describe BeetleETL do
104
105
  [ department_id('[Audi,4]') , '[Audi,4]' , organisation_id('Audi') , 'source_name' , 'A4' , time1 , time2 , time2 ] ,
105
106
  )
106
107
 
107
- test_database[:source__Organisation].truncate
108
+ test_database[Sequel.qualify("source", "Organisation")].truncate
108
109
  end
109
110
 
110
111
  def import3
111
112
  # reinstate with update
112
- insert_into(:source__Organisation).values(
113
+ insert_into(Sequel.qualify("source", "Organisation")).values(
113
114
  [ :pkOrgId , :Name , :Adresse , :Abteilung ] ,
114
115
  [ 1 , 'Apple' , 'Apple Street' , 'iPhone' ] ,
115
116
  [ 2 , 'Apple' , 'Apple Street' , 'MacBook' ] ,
@@ -121,14 +122,14 @@ describe BeetleETL do
  BeetleETL.import(@config)
  end
 
- expect(:organisations).to have_values(
+ expect(Sequel.qualify("my_target", "organisations")).to have_values(
  [ :id , :external_id , :external_source , :name , :address , :created_at , :updated_at , :deleted_at ] ,
  [ organisation_id('Apple') , 'Apple' , 'source_name' , 'Apple' , 'Apple Street' , time1 , time1 , nil ] ,
  [ organisation_id('Google') , 'Google' , 'source_name' , 'Google' , 'NEW Google Street' , time1 , time2 , nil ] ,
  [ organisation_id('Audi') , 'Audi' , 'source_name' , 'Audi' , 'NEW Audi Street' , time1 , time3 , nil ]
  )
 
- expect(:departments).to have_values(
+ expect(Sequel.qualify("my_target", "departments")).to have_values(
  [ :id , :external_id , :organisation_id , :external_source , :name , :created_at , :updated_at , :deleted_at ] ,
  [ department_id('[Apple,1]') , '[Apple,1]' , organisation_id('Apple') , 'source_name' , 'iPhone' , time1 , time1 , nil ] ,
  [ department_id('[Apple,2]') , '[Apple,2]' , organisation_id('Apple') , 'source_name' , 'MacBook' , time1 , time1 , nil ] ,
@@ -136,15 +137,15 @@ describe BeetleETL do
  [ department_id('[Audi,4]') , '[Audi,4]' , organisation_id('Audi') , 'source_name' , 'A4' , time1 , time3 , nil ] ,
  )
 
- test_database[:source__Organisation].truncate
+ test_database[Sequel.qualify("source", "Organisation")].truncate
  end
 
  def organisation_id(external_id)
- test_database[:organisations].first(external_id: external_id)[:id]
+ test_database[Sequel.qualify("my_target", "organisations")].first(external_id: external_id)[:id]
  end
 
  def department_id(external_id)
- test_database[:departments].first(external_id: external_id)[:id]
+ test_database[Sequel.qualify("my_target", "departments")].first(external_id: external_id)[:id]
  end
 
  end
data/spec/spec_helper.rb CHANGED
@@ -1,6 +1,5 @@
  require "byebug"
- require "codeclimate-test-reporter"
- CodeClimate::TestReporter.start
+ require "ostruct"
 
  require_relative "../lib/beetle_etl.rb"
  require_relative "support/database_helpers.rb"
@@ -19,10 +18,9 @@ RSpec.configure do |config|
  else
  test_database.transaction do
  example.run
- raise Sequel::Error::Rollback
+ raise Sequel::Rollback
  end
  end
  end
 
  end
-
@@ -7,11 +7,12 @@ module BeetleETL
  let(:another_source) { 'another_source' }
 
  let(:config) do
- Configuration.new.tap do |c|
- c.stage_schema = 'stage'
- c.external_source = external_source
- c.database = test_database
- end
+ OpenStruct.new({
+ stage_schema: 'stage',
+ target_schema: 'public',
+ external_source: external_source,
+ database: test_database,
+ })
  end
 
  subject { AssignIds.new(config, :example_table) }
@@ -102,14 +102,14 @@ module BeetleETL
  it "truncates the stage table if it already exists" do
  CreateStage.new(config, :example_table, {}, @columns).run
 
- insert_into(subject.stage_table_name.to_sym).values(
+ insert_into(Sequel.qualify("public", subject.stage_table_name)).values(
  [ :some_string , :some_integer , :some_float ] ,
  [ "hello" , 123 , 123.456 ]
  )
 
  CreateStage.new(config, :example_table, {}, @columns).run
 
- expect(subject.stage_table_name).to have_values(
+ expect(Sequel.qualify("public", subject.stage_table_name)).to have_values(
  [:some_string, :some_integer, :some_float]
  )
  end
@@ -4,10 +4,11 @@ module BeetleETL
  describe MapRelations do
 
  let(:config) do
- Configuration.new.tap do |c|
- c.external_source = 'my_source'
- c.database = test_database
- end
+ OpenStruct.new({
+ external_source: 'my_source',
+ target_schema: 'public',
+ database: test_database
+ })
  end
 
  let(:dependee_a) do
@@ -75,15 +76,14 @@ module BeetleETL
  [ 26 , 'b_id' ] ,
  )
 
- insert_into(subject.stage_table_name.to_sym).values(
+ insert_into(Sequel.qualify("public", subject.stage_table_name)).values(
  [ :external_dependee_a_id , :external_dependee_b_id ] ,
  [ 'a_id' , 'b_id' ] ,
  )
 
-
  subject.run
 
- expect(subject.stage_table_name.to_sym).to have_values(
+ expect(Sequel.qualify("public", subject.stage_table_name)).to have_values(
  [ :dependee_a_id , :dependee_b_id ] ,
  [ 1 , 26 ] ,
  )
@@ -3,7 +3,7 @@ require 'spec_helper'
  module BeetleETL
  describe Step do
 
- let(:config) { Configuration.new }
+ let(:config) { OpenStruct.new }
 
  subject { Step.new(config, :example_table) }
  FooStep = Class.new(Step)
@@ -5,9 +5,9 @@ module BeetleETL
 
  let(:database) { double(:database) }
  let(:config) do
- Configuration.new.tap do |c|
- c.database = database
- end
+ OpenStruct.new({
+ database: database
+ })
  end
  let(:query) { double(:query) }
 
@@ -29,7 +29,7 @@ module BeetleETL
 
  describe '#run' do
  it 'runs a query in the database' do
- expect(database).to receive(:run).with(query)
+ expect(database).to receive(:execute).with(query)
 
  subject.run
  end
@@ -37,7 +37,7 @@ end
 
  RSpec::Matchers.define :have_values do |*rows|
  match do |table_description|
- dataset = test_database[table_description.to_sym]
+ dataset = test_database[table_description]
 
  columns = rows[0].map(&:to_sym)
  values = rows[1..-1]
data/spec/testing_spec.rb CHANGED
@@ -75,7 +75,7 @@ describe "BeetleETL:Testing" do
  with_stage_tables_for(:organisations, :some_table) do
  run_transformation(:organisations)
 
- expect(stage_table_name(:organisations)).to have_values(
+ expect(Sequel.qualify("public", stage_table_name(:organisations))).to have_values(
  [ :external_id , :address , :name ] ,
  [ "external_id" , "address" , "name" ]
  )
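
Note on the spec changes above: several hunks swap `Configuration.new.tap` for an `OpenStruct` built from a hash. The following is a minimal sketch of why that can serve as a stand-in, using only Ruby's standard library. The values are illustrative and the `"test_database"` string is a placeholder for the real connection object; none of this is taken from the gem's own code.

```ruby
require "ostruct"

# Hypothetical stand-in for the config object passed to the steps.
# The keys mirror those in the spec hunks; "test_database" is a
# placeholder string, not a real Sequel connection.
config = OpenStruct.new(
  stage_schema: "stage",
  target_schema: "public",
  external_source: "my_source",
  database: "test_database"
)

# Each key becomes a reader method, so code that calls
# config.target_schema works without a Configuration class.
puts config.target_schema                  # prints "public"

# Keys that were never set return nil instead of raising, which is
# presumably why a bare OpenStruct.new is enough where no setting is read.
puts OpenStruct.new.stage_schema.inspect   # prints "nil"
```

The trade-off of this approach is that a misspelled key also returns `nil` silently, so it suits specs better than production configuration.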
metadata CHANGED
@@ -1,17 +1,17 @@
  --- !ruby/object:Gem::Specification
  name: beetle_etl
  version: !ruby/object:Gem::Version
- version: 2.0.0
+ version: 2.0.1
  platform: ruby
  authors:
  - Luciano Maiwald
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-02-19 00:00:00.000000000 Z
+ date: 2017-08-24 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
- name: sequel
+ name: activesupport
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
@@ -25,13 +25,13 @@ dependencies:
  - !ruby/object:Gem::Version
  version: 4.0.0
  - !ruby/object:Gem::Dependency
- name: activesupport
+ name: sequel
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
  version: 4.0.0
- type: :runtime
+ type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
@@ -116,7 +116,6 @@ executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - ".byebug_history"
  - ".gitignore"
  - ".travis.yml"
  - Gemfile
@@ -125,6 +124,7 @@ files:
  - Rakefile
  - beetle_etl.gemspec
  - lib/beetle_etl.rb
+ - lib/beetle_etl/adapters/sequel_adapter.rb
  - lib/beetle_etl/configuration.rb
  - lib/beetle_etl/dsl/dsl.rb
  - lib/beetle_etl/dsl/transformation.rb
@@ -146,6 +146,8 @@ files:
  - lib/beetle_etl/testing/test_wrapper.rb
  - lib/beetle_etl/version.rb
  - script/postgres
+ - spec/adapters/sequel_adapter_spec.rb
+ - spec/adapters/shared_examples.rb
  - spec/beetle_etl_spec.rb
  - spec/configuration_spec.rb
  - spec/dsl/dsl_spec.rb
@@ -189,11 +191,13 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.2.5
+ rubygems_version: 2.5.2
  signing_key:
  specification_version: 4
  summary: BeetleETL helps you with your recurring ETL imports.
  test_files:
+ - spec/adapters/sequel_adapter_spec.rb
+ - spec/adapters/shared_examples.rb
  - spec/beetle_etl_spec.rb
  - spec/configuration_spec.rb
  - spec/dsl/dsl_spec.rb
data/.byebug_history DELETED
@@ -1,8 +0,0 @@
- continue
- backtrace
- stack
- trace
- c
- continue
- c
- target_table_name