redshift_extractor 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: aa8e751f08bdd4b304276a2c3689340736df59cc
-   data.tar.gz: 1e81e2a83c9a9f753e5a72a46656fe7b51ccc8fc
+   metadata.gz: d6b1a776fcff84a096fd9e510d1774fcd006191d
+   data.tar.gz: 0a69e8eb4ad78653b47a0151c9e6bd1d3389dbeb
  SHA512:
-   metadata.gz: eb63d429a7eb62e67ce033b6b33462978801cb98d9c6cab2f2fbf17c783313ca68928b9517adfe47cd6279882bb627ef859e16c810d47cee0a8b116f2507d79c
-   data.tar.gz: 4dfc53bdfc08aa9fb03d6736f80b9e100369a25a8a0273d95d92b6c8f0848c251d8a53a2f248c17f85327f345c4902ecc33ae93e40daf7ae0fee7844f3699ecd
+   metadata.gz: 32c29672a4b5ed08a44d56ccdd6e4447b0cddf48f1a5b534b0102ac6235cd4c3e927ae8bd9acfaf262f856bbef1a59131902122350d0c77329d65d1a92685bb1
+   data.tar.gz: d2ef768e4d19c90b0cb71617e019ba5365e05a65b2ad5f7f8e8d1acf49d4315cbe27d46518b1d7d25ed13c605d4f2b6028ca5df7f341a3079d87138ac3c392a4
data/README.md CHANGED
@@ -1,40 +1,85 @@
  # RedshiftExtractor
 
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/redshift_extractor`. To experiment with that code, run `bin/console` for an interactive prompt.
+ redshift_extractor moves data from one Amazon Redshift cluster to another. Here is how it works:
 
- TODO: Delete this and the text above, and describe your gem
+ - Source database
 
- ## Installation
+ 1. [UNLOAD](http://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html) - Runs a SELECT query and exports the results to gzip-compressed, pipe-delimited files in S3.
 
- Add this line to your application's Gemfile:
+ - Destination database
+
+ 2. Drop - Drops a database table (the table in the destination database where the data will be stored).
+
+ 3. Create - Creates a database table.
+
+ 4. [COPY](http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html) - Loads data from S3 into a Redshift database.
+
+ One database connection is established with the source database to UNLOAD the data to S3. After the data is UNLOADed, a second database connection is established with the destination database to drop and recreate the table that will store the data. The final step is to COPY the data from the S3 files to the destination table.
+
+ ## Running the Code
+
+ The RedshiftExtractor::Extractor class is instantiated with a hash of arguments.
 
  ```ruby
- gem 'redshift_extractor'
+ args = {
+   database_config_source: "database_config_source",
+   database_config_destination: "database_config_destination",
+   unload_s3_destination: "unload_s3_destination",
+   unload_select_sql: "unload_select_sql",
+   table_name: "table_name",
+   create_sql: "create_sql",
+   copy_data_source: "copy_data_source",
+   aws_access_key_id: "aws_access_key_id",
+   aws_secret_access_key: "aws_secret_access_key"
+ }
+
+ extractor = RedshiftExtractor::Extractor.new(args)
+ extractor.run
  ```
 
- And then execute:
+ Here is a description of the parameters:
 
-     $ bundle
+ - database_config_source: A hash that's acceptable for the [Ruby Postgres gem](https://bitbucket.org/ged/ruby-pg/wiki/Home). Here's an example:
+
+ ```ruby
+ {
+   dbname: "db_name",
+   user: "username",
+   password: "password",
+   host: "host",
+   sslmode: 'require',
+   port: 5439
+ }
+ ```
+
+ - unload_s3_destination: An S3 path, something like `"s3://bucket_name/something_else/"`
 
- Or install it yourself as:
+ - unload_select_sql: A SQL SELECT query that will be run on the source database
 
-     $ gem install redshift_extractor
+ - table_name: The table that will be dropped, recreated, and populated with data from the COPY command
 
- ## Usage
+ - create_sql: The SQL that creates the table_name table (this SQL is run to recreate the table in the step above)
 
- TODO: Write usage instructions here
+ - copy_data_source: This is typically `"#{unload_s3_destination}manifest"`. The UNLOAD command automatically creates a manifest file that can be used by the COPY command to load the data.
 
- ## Development
+ - aws_access_key_id, aws_secret_access_key: The access keys you get from AWS.
 
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+ ## Installation
 
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+ Add this line to your application's Gemfile:
+
+ ```ruby
+ gem 'redshift_extractor'
+ ```
+
+ And then execute:
+
+     $ bundle
 
  ## Contributing
 
  Bug reports and pull requests are welcome on GitHub at https://github.com/MrPowers/redshift_extractor.
 
-
  ## License
 
  The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
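
The README's remark that `copy_data_source` is typically `"#{unload_s3_destination}manifest"` can be sketched as follows. This is a minimal illustration, not code from the gem: `manifest_path` is a hypothetical helper and the bucket path is the placeholder from the README. It assumes the S3 prefix ends with a trailing slash, as in the README's example.

```ruby
# Hypothetical helper (not part of the gem): derives copy_data_source
# from the S3 prefix that UNLOAD writes to.
def manifest_path(unload_s3_destination)
  "#{unload_s3_destination}manifest"
end

unload_s3_destination = "s3://bucket_name/something_else/" # placeholder prefix

args = {
  unload_s3_destination: unload_s3_destination,
  copy_data_source: manifest_path(unload_s3_destination)
  # ...the remaining keys from the README example would go here
}

puts args[:copy_data_source]
```

Deriving one value from the other keeps the two arguments from drifting apart when the S3 prefix changes.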
@@ -10,7 +10,9 @@ module RedshiftExtractor; class Copy
    end
 
    def copy_sql
-     "copy #{table_name} from '#{data_source}' credentials '#{credentials}' manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"
+     "copy #{table_name} from '#{data_source}'"\
+     " credentials '#{credentials}'"\
+     " manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"
    end
 
    def credentials
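
The `copy_sql` change above only reflows the string: Ruby joins adjacent string literals, and a trailing backslash continues the expression onto the next line. A small sketch that checks the two forms produce the same SQL, using made-up values for `table_name`, `data_source`, and `credentials` (the body of the gem's `credentials` method is not shown in this diff, so the value here is a placeholder):

```ruby
# Placeholder values, for illustration only.
table_name = "events"
data_source = "s3://bucket_name/something_else/manifest"
credentials = "aws_access_key_id=EXAMPLE_ID;aws_secret_access_key=EXAMPLE_SECRET"

# 0.1.0 form: one long literal.
single_line = "copy #{table_name} from '#{data_source}' credentials '#{credentials}' manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"

# 0.2.0 form: adjacent literals joined across lines with a trailing backslash.
multi_line = "copy #{table_name} from '#{data_source}'"\
  " credentials '#{credentials}'"\
  " manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"

puts single_line == multi_line
```

Each continued literal starts with a space, so the words do not run together where the lines are joined.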
@@ -30,8 +30,11 @@ module RedshiftExtractor; class Extractor
      source_connection.exec(unloader.unload_sql)
    end
 
+   def dropper
+     Drop.new(table_name: config.table_name)
+   end
+
    def drop
-     dropper = Drop.new(table_name: config.table_name)
      destination_connection.exec(dropper.drop_sql)
    end
 
@@ -39,13 +42,16 @@ module RedshiftExtractor; class Extractor
      destination_connection.exec(config.create_sql)
    end
 
-   def copy
-     copier = Copy.new(
+   def copier
+     Copy.new(
        aws_access_key_id: config.aws_access_key_id,
        aws_secret_access_key: config.aws_secret_access_key,
        data_source: config.copy_data_source,
        table_name: config.table_name
      )
+   end
+
+   def copy
      destination_connection.exec(copier.copy_sql)
    end
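
The Extractor change above moves object construction out of `drop` and `copy` into separate `dropper` and `copier` methods. The pattern can be sketched in isolation as below. Every name here (`DropSketch`, `FakeConnection`, `ExtractorSketch`) is a hypothetical stand-in, and the `drop table if exists` statement is an assumption, since the gem's real `Drop#drop_sql` is not shown in this diff.

```ruby
# Stand-in for the gem's Drop class; the SQL text is assumed, not taken from the gem.
class DropSketch
  def initialize(table_name:)
    @table_name = table_name
  end

  def drop_sql
    "drop table if exists #{@table_name};"
  end
end

# Records SQL statements instead of sending them to a database.
class FakeConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def exec(sql)
    @statements << sql
  end
end

class ExtractorSketch
  def initialize(table_name:, destination_connection:)
    @table_name = table_name
    @destination_connection = destination_connection
  end

  # Builder method: only constructs the helper object.
  def dropper
    DropSketch.new(table_name: @table_name)
  end

  # Action method: executes the SQL produced by the builder.
  def drop
    @destination_connection.exec(dropper.drop_sql)
  end
end

connection = FakeConnection.new
extractor = ExtractorSketch.new(table_name: "events", destination_connection: connection)
extractor.drop
puts connection.statements.inspect
```

With the builder separated, the generated SQL can be inspected through `dropper` without executing anything against a connection.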
 
@@ -10,7 +10,8 @@ module RedshiftExtractor; class Unload
    end
 
    def unload_sql
-     "UNLOAD('#{escaped_extract_sql}') to '#{s3_destination}' CREDENTIALS '#{credentials}' MANIFEST GZIP ADDQUOTES ESCAPE;"
+     "UNLOAD('#{escaped_extract_sql}') to '#{s3_destination}'"\
+     " CREDENTIALS '#{credentials}' MANIFEST GZIP ADDQUOTES ESCAPE;"
    end
 
    def escaped_extract_sql
@@ -1,3 +1,3 @@
  module RedshiftExtractor
-   VERSION = "0.1.0"
+   VERSION = "0.2.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: redshift_extractor
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.2.0
  platform: ruby
  authors:
  - MrPowers
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2015-10-27 00:00:00.000000000 Z
+ date: 2015-10-31 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler