redshift_extractor 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: aa8e751f08bdd4b304276a2c3689340736df59cc
- data.tar.gz: 1e81e2a83c9a9f753e5a72a46656fe7b51ccc8fc
+ metadata.gz: d6b1a776fcff84a096fd9e510d1774fcd006191d
+ data.tar.gz: 0a69e8eb4ad78653b47a0151c9e6bd1d3389dbeb
  SHA512:
- metadata.gz: eb63d429a7eb62e67ce033b6b33462978801cb98d9c6cab2f2fbf17c783313ca68928b9517adfe47cd6279882bb627ef859e16c810d47cee0a8b116f2507d79c
- data.tar.gz: 4dfc53bdfc08aa9fb03d6736f80b9e100369a25a8a0273d95d92b6c8f0848c251d8a53a2f248c17f85327f345c4902ecc33ae93e40daf7ae0fee7844f3699ecd
+ metadata.gz: 32c29672a4b5ed08a44d56ccdd6e4447b0cddf48f1a5b534b0102ac6235cd4c3e927ae8bd9acfaf262f856bbef1a59131902122350d0c77329d65d1a92685bb1
+ data.tar.gz: d2ef768e4d19c90b0cb71617e019ba5365e05a65b2ad5f7f8e8d1acf49d4315cbe27d46518b1d7d25ed13c605d4f2b6028ca5df7f341a3079d87138ac3c392a4
data/README.md CHANGED
@@ -1,40 +1,85 @@
  # RedshiftExtractor

- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/redshift_extractor`. To experiment with that code, run `bin/console` for an interactive prompt.
+ redshift_extractor moves data from one Amazon Redshift cluster to another. Here is how it works:

- TODO: Delete this and the text above, and describe your gem
+ - Source database

- ## Installation
+ 1. [UNLOAD](http://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html) - runs a SELECT query and exports the results to CSV files in S3.

- Add this line to your application's Gemfile:
+ - Destination database
+
+ 2. Drop - Drops a database table (the table in the destination database where the data will be stored).
+
+ 3. Create - Creates a database table.
+
+ 4. [COPY](http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html) - Loads data from S3 into a Redshift database.
+
+ One database connection is established with the source database to UNLOAD the data to S3. After the data is UNLOADed, a second database connection is established with the destination database to drop/create the database table that will store the data. The final step is to COPY the data from the S3 files to the destination table.
+
+ ## Running the Code
+
+ The RedshiftExtractor::Extractor class is instantiated with a long hash of arguments.

  ```ruby
- gem 'redshift_extractor'
+ args = {
+ database_config_source: "database_config_source",
+ database_config_destination: "database_config_destination",
+ unload_s3_destination: "unload_s3_destination",
+ unload_select_sql: "unload_select_sql",
+ table_name: "table_name",
+ create_sql: "create_sql",
+ copy_data_source: "copy_data_source",
+ aws_access_key_id: "aws_access_key_id",
+ aws_secret_access_key: "aws_secret_access_key"
+ }
+
+ extractor = RedshiftExtractor::Extractor.new(args)
+ extractor.run
  ```

- And then execute:
+ Here is a description of the parameters:

- $ bundle
+ - database_config_source: A hash that's acceptable for the [Ruby Postgres gem](https://bitbucket.org/ged/ruby-pg/wiki/Home). Here's an example:
+
+ ```ruby
+ {
+ dbname: "db_name",
+ user: "username",
+ password: "password",
+ host: "host",
+ sslmode: 'require',
+ port: 5439
+ }
+ ```
+
+ - unload_s3_destination: An S3 path, something like `"s3://bucket_name/something_else/"`

- Or install it yourself as:
+ - unload_select_sql: A SQL SELECT query that will be run on the source table

- $ gem install redshift_extractor
+ - table_name: The table that will be dropped, recreated, and populated with data from the COPY command

- ## Usage
+ - create_sql: The SQL that creates the table_name table (this SQL is run to recreate the table in the step above)

- TODO: Write usage instructions here
+ - copy_data_source: This is typically `"#{unload_s3_destination}manifest"`. The UNLOAD command automatically creates a manifest file that can be used by the COPY command to load the data.

- ## Development
+ - aws_access_key_id, aws_secret_access_key: The access key ID and secret access key you get from AWS.

- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+ ## Installation

- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+ Add this line to your application's Gemfile:
+
+ ```ruby
+ gem 'redshift_extractor'
+ ```
+
+ And then execute:
+
+ $ bundle

  ## Contributing

  Bug reports and pull requests are welcome on GitHub at https://github.com/MrPowers/redshift_extractor.

-
  ## License

  The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
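
As a rough illustration of the parameters the new README describes, the sketch below assembles the argument hash and derives `copy_data_source` from `unload_s3_destination`. Every value here (hosts, bucket path, table name, SQL, keys) is a hypothetical placeholder, and the final call is left commented out because it needs the gem installed and two reachable Redshift clusters.

```ruby
# All values below are hypothetical placeholders, not taken from the gem.
source_config = {
  dbname: "source_db", user: "username", password: "password",
  host: "source-host", sslmode: "require", port: 5439
}
# Assumption: the destination takes the same kind of connection hash.
destination_config = source_config.merge(dbname: "destination_db", host: "destination-host")

unload_s3_destination = "s3://bucket_name/something_else/"

args = {
  database_config_source: source_config,
  database_config_destination: destination_config,
  unload_s3_destination: unload_s3_destination,
  unload_select_sql: "select * from events",
  table_name: "events",
  create_sql: "create table events (id integer, name varchar(255));",
  # Per the README, UNLOAD writes a manifest file under the S3 prefix.
  copy_data_source: "#{unload_s3_destination}manifest",
  aws_access_key_id: "aws_access_key_id",
  aws_secret_access_key: "aws_secret_access_key"
}

# With the gem installed and both clusters reachable, the README's usage is:
# RedshiftExtractor::Extractor.new(args).run
```

Deriving the manifest path from the unload prefix keeps the two settings from drifting apart when the S3 location changes.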
@@ -10,7 +10,9 @@ module RedshiftExtractor; class Copy
  end

  def copy_sql
- "copy #{table_name} from '#{data_source}' credentials '#{credentials}' manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"
+ "copy #{table_name} from '#{data_source}'"\
+ " credentials '#{credentials}'"\
+ " manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"
  end

  def credentials
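
The `copy_sql` change above splits one long string literal across lines using a trailing backslash. The sketch below suggests why this should not change the generated SQL: Ruby joins adjacent string literals into a single string. The local variables are hypothetical stand-ins for the values the `Copy` class interpolates.

```ruby
# Hypothetical stand-ins for the values interpolated inside Copy#copy_sql.
table_name  = "events"
data_source = "s3://bucket_name/something_else/manifest"
credentials = "aws_access_key_id=KEY;aws_secret_access_key=SECRET"

# The 0.1.0 form: one long literal.
one_line = "copy #{table_name} from '#{data_source}' credentials '#{credentials}' manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"

# The 0.2.0 form: adjacent literals joined by line continuations.
multi_line = "copy #{table_name} from '#{data_source}'"\
  " credentials '#{credentials}'"\
  " manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"

puts one_line == multi_line # prints true
```

Note that each continued fragment starts with a space; without it the clauses would run together.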
@@ -30,8 +30,11 @@ module RedshiftExtractor; class Extractor
  source_connection.exec(unloader.unload_sql)
  end

+ def dropper
+ Drop.new(table_name: config.table_name)
+ end
+
  def drop
- dropper = Drop.new(table_name: config.table_name)
  destination_connection.exec(dropper.drop_sql)
  end

@@ -39,13 +42,16 @@ module RedshiftExtractor; class Extractor
  destination_connection.exec(config.create_sql)
  end

- def copy
- copier = Copy.new(
+ def copier
+ Copy.new(
  aws_access_key_id: config.aws_access_key_id,
  aws_secret_access_key: config.aws_secret_access_key,
  data_source: config.copy_data_source,
  table_name: config.table_name
  )
+ end
+
+ def copy
  destination_connection.exec(copier.copy_sql)
  end

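
The Extractor hunks above move the construction of the `Drop` and `Copy` objects out of `drop` and `copy` into their own `dropper` and `copier` methods. One possible benefit, sketched below, is that the generated SQL can be inspected without opening a database connection. `FakeDrop`, `RecordingConnection`, and `SketchExtractor` are hypothetical stand-ins: the real `Drop#drop_sql` is not shown in this diff, so the SQL text here is an assumption.

```ruby
# Hypothetical stand-in for RedshiftExtractor::Drop; the SQL text is assumed.
class FakeDrop
  def initialize(table_name:)
    @table_name = table_name
  end

  def drop_sql
    "drop table if exists #{@table_name};"
  end
end

# Records SQL statements instead of sending them to a database.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def exec(sql)
    @statements << sql
  end
end

# Mirrors the shape of the refactored Extractor: a builder method plus an action.
class SketchExtractor
  def initialize(table_name:, connection:)
    @table_name = table_name
    @connection = connection
  end

  # Builder: returns the object that knows the SQL, with no side effects.
  def dropper
    FakeDrop.new(table_name: @table_name)
  end

  # Action: executes the builder's SQL on the connection.
  def drop
    @connection.exec(dropper.drop_sql)
  end
end

connection = RecordingConnection.new
extractor = SketchExtractor.new(table_name: "events", connection: connection)

preview = extractor.dropper.drop_sql # inspect the SQL without executing it
extractor.drop                       # execute it on the (fake) connection
```

Because `dropper` builds a fresh object on each call, the preview and the executed statement come from separate instances that produce the same SQL.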
@@ -10,7 +10,8 @@ module RedshiftExtractor; class Unload
  end

  def unload_sql
- "UNLOAD('#{escaped_extract_sql}') to '#{s3_destination}' CREDENTIALS '#{credentials}' MANIFEST GZIP ADDQUOTES ESCAPE;"
+ "UNLOAD('#{escaped_extract_sql}') to '#{s3_destination}'"\
+ " CREDENTIALS '#{credentials}' MANIFEST GZIP ADDQUOTES ESCAPE;"
  end

  def escaped_extract_sql
@@ -1,3 +1,3 @@
  module RedshiftExtractor
- VERSION = "0.1.0"
+ VERSION = "0.2.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: redshift_extractor
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.2.0
  platform: ruby
  authors:
  - MrPowers
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2015-10-27 00:00:00.000000000 Z
+ date: 2015-10-31 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler