flare-up 0.1 → 0.2
- checksums.yaml +8 -8
- data/.gitignore +3 -1
- data/.rspec +3 -0
- data/.ruby-version +1 -1
- data/.travis.yml +6 -0
- data/Gemfile.lock +25 -1
- data/README.md +29 -0
- data/Rakefile +10 -0
- data/bin/flare-up +5 -0
- data/flare-up.gemspec +24 -0
- data/lib/flare_up/boot.rb +51 -0
- data/lib/flare_up/cli.rb +60 -0
- data/lib/flare_up/connection.rb +74 -0
- data/lib/flare_up/copy_command.rb +73 -0
- data/lib/flare_up/env_wrap.rb +11 -0
- data/lib/flare_up/stl_load_error.rb +67 -0
- data/lib/flare_up/stl_load_error_fetcher.rb +16 -0
- data/lib/flare_up/version.rb +1 -1
- data/lib/flare_up.rb +13 -0
- data/resources/README.md +7 -0
- data/resources/hearthstone_cards.csv +3 -0
- data/resources/load_hearthstone_cards.rb +64 -0
- data/resources/postgresql-8.4-703.jdbc4.jar +0 -0
- data/resources/test_schema.sql +10 -0
- data/spec/lib/flare_up/boot_spec.rb +100 -0
- data/spec/lib/flare_up/cli_spec.rb +162 -0
- data/spec/lib/flare_up/connection_spec.rb +114 -0
- data/spec/lib/flare_up/copy_command_spec.rb +134 -0
- data/spec/lib/flare_up/stl_load_error_fetcher_spec.rb +45 -0
- data/spec/lib/flare_up/stl_load_error_spec.rb +61 -0
- data/spec/spec_helper.rb +5 -0
- metadata +110 -9
- data/flareup.gemspec +0 -17
checksums.yaml
CHANGED
@@ -1,15 +1,15 @@
 ---
 !binary "U0hBMQ==":
   metadata.gz: !binary |-
-
+    Y2M1MDAzODUxOGM5NjQwYzQ0ZGM4ZjI2Y2Q0Mzk1NjQ3NGNmZDBhMg==
   data.tar.gz: !binary |-
-
+    ZDhmZjQwZTA4ODJjYzZkZWM5N2FiZDFiYzY3ZTU1ZTBhYjBlN2VmNw==
 SHA512:
   metadata.gz: !binary |-
-
-
-
+    MzM2YTRkMjhkYWRkZGUzZmU2Mzg3NjgzZDk5ZmVmOWVhZTBkMDA1OWQyMjEz
+    NmVmOTRkNTExYzk5Y2FlYWUzMTU2ZGRlMzA4ODhmMWZjZWZhNmE1Mzg4YmJh
+    OTAzYjBjNGVhYzE2NjM5NjJkOTFhZWM3ZjMyMTdkOWEzZjk3MWI=
   data.tar.gz: !binary |-
-
-
-
+    MDgwNjcyMDAwN2VkYTIzOGVmNWQ4YzhiZWZmNWIyMTBhMDYwZDE1NTYyOTRl
+    YTNiYmJiZTA4ODU5MzhkMmRiNmU2ZmQ2ZjJmZmZjZTQ2ODAxNmFhYjdkMTk0
+    NWExZGQ5ZDU2Y2Q4YzZmMjJkOGNiMjg4MDBhYTk4MDVjNjNmMzQ=
data/.gitignore
CHANGED
data/.rspec
ADDED
data/.ruby-version
CHANGED
@@ -1 +1 @@
-1.9.3-
+1.9.3-p547
data/.travis.yml
ADDED
data/Gemfile.lock
CHANGED
@@ -1,14 +1,38 @@
 PATH
   remote: .
   specs:
-    flare-up (0.1)
+    flare-up (0.2)
+      pg (~> 0.17)
+      thor (~> 0.19)
 
 GEM
   remote: http://rubygems.org/
   specs:
+    diff-lcs (1.2.5)
+    pg (0.17.1)
+    rake (10.3.2)
+    rspec (3.0.0)
+      rspec-core (~> 3.0.0)
+      rspec-expectations (~> 3.0.0)
+      rspec-mocks (~> 3.0.0)
+    rspec-core (3.0.3)
+      rspec-support (~> 3.0.0)
+    rspec-expectations (3.0.3)
+      diff-lcs (>= 1.2.0, < 2.0)
+      rspec-support (~> 3.0.0)
+    rspec-its (1.0.1)
+      rspec-core (>= 2.99.0.beta1)
+      rspec-expectations (>= 2.99.0.beta1)
+    rspec-mocks (3.0.3)
+      rspec-support (~> 3.0.0)
+    rspec-support (3.0.3)
+    thor (0.19.1)
 
 PLATFORMS
   ruby
 
 DEPENDENCIES
   flare-up!
+  rake (~> 10.0)
+  rspec (~> 3.0)
+  rspec-its (~> 1.0)
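The DEPENDENCIES block above (the gem itself plus the development dependencies declared in the gemspec) is what Bundler produces when the project's Gemfile simply defers to the gemspec. The Gemfile itself is not expanded in this diff; a minimal sketch consistent with the lockfile would be:

```ruby
# Hypothetical Gemfile consistent with the Gemfile.lock above (not part of this diff).
source 'http://rubygems.org'

# Pulls runtime and development dependencies in from flare-up.gemspec.
gemspec
```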
data/README.md
CHANGED
@@ -0,0 +1,29 @@
+## Overview
+```Flare-up``` provides a wrapper around the Redshift [```COPY```](http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html) command for scriptability, allowing you to issue the command directly from the CLI. Much of the code is concerned with simplifying construction of the COPY command and providing easy access to the errors that may result from an import.
+
+## Why?
+
+Redshift prefers a bulk COPY operation over individual INSERTs, which Redshift is not optimized for and which Amazon does not recommend as a bulk-loading strategy. This rules out a tool like Sqoop, as the number of INSERTs would be prohibitively large given the data sets many folks import into Redshift. Additionally, COPY is a SQL command, not something issued via the AWS Redshift REST API, meaning you need a SQL connection to your Redshift instance to bulk load data.
+
+The astute consumer of the AWS toolchain will note that [Data Pipeline](http://aws.amazon.com/datapipeline/) is one way this import may be completed; however, we use Azkaban, and the only thing worse than one job flow control tool is two job flow control tools :)
+
+Additionally, access to COPY errors is a bit cumbersome. On failure, Redshift populates the ```stl_load_errors``` table, which inherently must be accessed via SQL. Flare-up will pretty-print any errors that occur during import so that you can examine your logs instead of having to establish a connection to Redshift to understand what went wrong.
+
+```
+TODO PASTE EXAMPLE OF PRETTY PRINT ERROR HERE
+```
+
+## Sample Usage - Overview
+
+At Sharethrough we're somewhat opinionated about having credentials present as environment variables for security purposes, and we recommend it as a production-ready approach. That said, it can be a pain to export variables when you're testing a tool, so we also support specifying all of these on the command line.
+
+### Environment Variables
+
+These will be queried if no command-line options are specified.
+
+```
+export AWS_ACCESS_KEY_ID=
+export AWS_SECRET_ACCESS_KEY=
+export REDSHIFT_USERNAME=
+export REDSHIFT_PASSWORD=
+```
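The README stops at the environment variables; a complete invocation is easier to see alongside them. The command below is lifted from the sample script later in this diff (data/resources/load_hearthstone_cards.rb), so the endpoint, bucket, and table are that script's test fixtures rather than anything you would run verbatim:

```
flare-up copy \
  s3://slif-redshift/hearthstone_cards_short_list.csv \
  flare-up-test.cskjnp4xvaje.us-west-2.redshift.amazonaws.com \
  dev \
  hearthstone_cards \
  --column-list name cost attack health description \
  --copy_options "REGION 'us-east-1' CSV"
```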
data/Rakefile
CHANGED
data/bin/flare-up
ADDED
data/flare-up.gemspec
ADDED
@@ -0,0 +1,24 @@
+$:.push File.expand_path('../lib', __FILE__)
+require 'flare_up/version'
+
+Gem::Specification.new do |s|
+  s.name = 'flare-up'
+  s.version = FlareUp::VERSION
+  s.platform = Gem::Platform::RUBY
+  s.authors = ['Robert Slifka']
+  s.homepage = 'http://www.github.com/sharethrough/flare-up'
+  s.summary = %q{Command-line access to bulk data loading via Redshift's COPY command.}
+  s.description = %q{Flare-up makes Redshift COPY scriptable by providing CLI access to the Redshift COPY command, with handy access to pretty printed errors as well.}
+
+  s.add_dependency('pg', '~> 0.17')
+  s.add_dependency('thor', '~> 0.19')
+
+  s.add_development_dependency('rake', '~> 10.0')
+  s.add_development_dependency('rspec', '~> 3.0')
+  s.add_development_dependency('rspec-its', '~> 1.0')
+
+  s.files = `git ls-files`.split("\n")
+  s.test_files = `git ls-files -- {spec,features}/*`.split("\n")
+  s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
+  s.require_paths = %w(lib)
+end
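Not part of the diff, but for orientation: with the gemspec above, the standard RubyGems build/install cycle applies (a sketch; the 0.2 version number comes from lib/flare_up/version.rb):

```
gem build flare-up.gemspec
gem install ./flare-up-0.2.gem
```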
data/lib/flare_up/boot.rb
ADDED
@@ -0,0 +1,51 @@
+module FlareUp
+
+  class Boot
+
+    # TODO: This control flow is untested
+    def self.boot(options)
+      conn = create_connection(options)
+      copy = create_copy_command(options)
+
+      begin
+        handle_load_errors(copy.execute(conn))
+      rescue CopyCommandError => e
+        CLI.output_error("\x1b[31m#{e.message}")
+        CLI.bailout(1)
+      end
+
+    end
+
+    def self.create_connection(options)
+      FlareUp::Connection.new(
+        options[:redshift_endpoint],
+        options[:database],
+        options[:redshift_username],
+        options[:redshift_password]
+      )
+    end
+
+    def self.create_copy_command(options)
+      copy = FlareUp::CopyCommand.new(
+        options[:table],
+        options[:data_source],
+        options[:aws_access_key],
+        options[:aws_secret_key]
+      )
+      copy.columns = options[:column_list] if options[:column_list]
+      copy.options = options[:copy_options] if options[:copy_options]
+      copy
+    end
+
+    # TODO: How can we test this?
+    def self.handle_load_errors(stl_load_errors)
+      return if stl_load_errors.empty?
+      puts "\x1b[31mThere was an error processing the COPY command:"
+      stl_load_errors.each do |e|
+        puts e.pretty_print
+      end
+    end
+
+  end
+
+end
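Boot.boot is the single entry point the CLI hands its merged options to. A minimal sketch of the hash it expects, based on the keys read by create_connection and create_copy_command above (all literal values here are placeholders):

```ruby
require 'flare_up'

# Keys consumed by Boot.create_connection and Boot.create_copy_command;
# :column_list and :copy_options are optional.
FlareUp::Boot.boot(
  :redshift_endpoint => 'example-cluster.us-west-2.redshift.amazonaws.com',  # placeholder
  :database          => 'dev',
  :redshift_username => ENV['REDSHIFT_USERNAME'],
  :redshift_password => ENV['REDSHIFT_PASSWORD'],
  :table             => 'my_table',                    # placeholder
  :data_source       => 's3://my-bucket/my_file.csv',  # placeholder
  :aws_access_key    => ENV['AWS_ACCESS_KEY_ID'],
  :aws_secret_key    => ENV['AWS_SECRET_ACCESS_KEY'],
  :copy_options      => 'CSV'
)
```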
data/lib/flare_up/cli.rb
ADDED
@@ -0,0 +1,60 @@
+module FlareUp
+
+  class CLI < Thor
+
+    desc 'copy DATA_SOURCE REDSHIFT_ENDPOINT DATABASE TABLE', 'COPY data into REDSHIFT_ENDPOINT from DATA_SOURCE into DATABASE.TABLE'
+    long_desc <<-LONGDESC
+      `flare-up copy` executes the Redshift COPY command, loading data from\x5
+      DATA_SOURCE into DATABASE_NAME.TABLE_NAME at REDSHIFT_ENDPOINT.
+
+      Documentation for this version can be found at:\x5
+      https://github.com/sharethrough/flare-up/blob/v#{FlareUp::VERSION}/README.md
+    LONGDESC
+    option :aws_access_key, :type => :string, :desc => "Required unless ENV['AWS_ACCESS_KEY_ID'] is set."
+    option :aws_secret_key, :type => :string, :desc => "Required unless ENV['AWS_SECRET_ACCESS_KEY'] is set."
+    option :redshift_username, :type => :string, :desc => "Required unless ENV['REDSHIFT_USERNAME'] is set."
+    option :redshift_password, :type => :string, :desc => "Required unless ENV['REDSHIFT_PASSWORD'] is set."
+    option :column_list, :type => :array, :desc => 'A space-separated list of columns, should your DATA_SOURCE require it'
+    option :copy_options, :type => :string, :desc => "Appended to the end of the COPY command; enclose \"IN QUOTES\""
+
+    def copy(data_source, endpoint, database_name, table_name)
+      boot_options = {
+        :data_source => data_source,
+        :redshift_endpoint => endpoint,
+        :database => database_name,
+        :table => table_name
+      }
+      options.each { |k, v| boot_options[k.to_sym] = v }
+
+      begin
+        CLI.env_validator(boot_options, :aws_access_key, 'AWS_ACCESS_KEY_ID')
+        CLI.env_validator(boot_options, :aws_secret_key, 'AWS_SECRET_ACCESS_KEY')
+        CLI.env_validator(boot_options, :redshift_username, 'REDSHIFT_USERNAME')
+        CLI.env_validator(boot_options, :redshift_password, 'REDSHIFT_PASSWORD')
+      rescue ArgumentError => e
+        CLI.output_error(e.message)
+        CLI.bailout(1)
+      end
+
+      Boot.boot(boot_options)
+    end
+
+    def self.env_validator(options, option_name, env_variable_name)
+      options[option_name] ||= ENVWrap.get(env_variable_name)
+      return if options[option_name]
+      raise ArgumentError, "One of either the --#{option_name} option or the ENV['#{env_variable_name}'] must be set"
+    end
+
+    # TODO: Extract
+    def self.bailout(exit_code)
+      exit(1)
+    end
+
+    # TODO: Extract
+    def self.output_error(message)
+      puts "\x1b[31m#{message}" unless ENV['TESTING']
+    end
+
+  end
+
+end
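Each credential option falls back to an environment variable via CLI.env_validator. A small sketch of that behaviour; ENVWrap (data/lib/flare_up/env_wrap.rb) is not expanded in this diff and is assumed here to be a thin wrapper over ENV:

```ruby
require 'flare_up'

boot_options = { :aws_access_key => nil }
ENV['AWS_ACCESS_KEY_ID'] = 'AKIAEXAMPLE'  # placeholder value

# Option not given on the command line, so the ENV variable is used.
FlareUp::CLI.env_validator(boot_options, :aws_access_key, 'AWS_ACCESS_KEY_ID')
boot_options[:aws_access_key]  # => "AKIAEXAMPLE"

# With neither the option nor the ENV variable set, an ArgumentError is raised:
# "One of either the --aws_access_key option or the ENV['AWS_ACCESS_KEY_ID'] must be set"
```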
data/lib/flare_up/connection.rb
ADDED
@@ -0,0 +1,74 @@
+module FlareUp
+
+  class HostUnknownOrInaccessibleError < StandardError
+  end
+  class TimeoutError < StandardError
+  end
+  class NoDatabaseError < StandardError
+  end
+  class AuthenticationError < StandardError
+  end
+  class UnknownError < StandardError
+  end
+
+  class Connection
+
+    attr_accessor :host
+    attr_accessor :port
+    attr_accessor :dbname
+    attr_accessor :user
+    attr_accessor :password
+    attr_accessor :connect_timeout
+
+    def initialize(host, dbname, user, password)
+      @host = host
+      @dbname = dbname
+      @user = user
+      @password = password
+
+      @port = 5439
+      @connect_timeout = 5
+    end
+
+    # TODO - Not quite sure how to test this; perhaps fold connect/execute into
+    # TODO one method so we can close connections in case of failure, etc.
+    def execute(statement)
+      @pg_conn ||= connect
+      @pg_conn.exec(statement)
+    end
+
+    private
+
+    def connect
+      begin
+        PG.connect(connection_parameters)
+      rescue PG::ConnectionBad => e
+        case e.message
+          when /nodename nor servname provided, or not known/
+            raise HostUnknownOrInaccessibleError, "Host unknown or unreachable: #{@host}"
+          when /timeout expired/
+            raise TimeoutError, 'Timeout connecting to the database (have you checked your Redshift security groups?)'
+          when /database ".+" does not exist/
+            raise NoDatabaseError, "Database #{@dbname} does not exist"
+          when /password authentication failed for user/
+            raise AuthenticationError, "Either username '#{@user}' or password invalid"
+          else
+            raise UnknownError
+        end
+      end
+    end
+
+    def connection_parameters
+      {
+        :host => @host,
+        :port => @port,
+        :dbname => @dbname,
+        :user => @user,
+        :password => @password,
+        :connect_timeout => @connect_timeout
+      }
+    end
+
+  end
+
+end
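Connection defaults to Redshift's standard port 5439 and a 5-second connect timeout, and only opens the PG connection on the first execute. A usage sketch (endpoint and credentials are placeholders):

```ruby
require 'flare_up'

conn = FlareUp::Connection.new(
  'example-cluster.us-west-2.redshift.amazonaws.com',  # placeholder endpoint
  'dev',
  ENV['REDSHIFT_USERNAME'],
  ENV['REDSHIFT_PASSWORD']
)
conn.connect_timeout = 10  # override the 5s default if needed

# The PG connection is established lazily on first use; connection failures
# surface as the FlareUp error classes defined above (TimeoutError, NoDatabaseError, ...).
result = conn.execute('SELECT COUNT(*) FROM stl_load_errors')
puts result.first['count']
```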
data/lib/flare_up/copy_command.rb
ADDED
@@ -0,0 +1,73 @@
+module FlareUp
+
+  class CopyCommandError < StandardError
+  end
+  class DataSourceError < CopyCommandError
+  end
+  class OtherZoneBucketError < CopyCommandError
+  end
+  class SyntaxError < CopyCommandError
+  end
+
+  class CopyCommand
+
+    attr_reader :table_name
+    attr_reader :data_source
+    attr_reader :aws_access_key_id
+    attr_reader :aws_secret_access_key
+    attr_reader :columns
+    attr_accessor :options
+
+    def initialize(table_name, data_source, aws_access_key_id, aws_secret_access_key)
+      @table_name = table_name
+      @data_source = data_source
+      @aws_access_key_id = aws_access_key_id
+      @aws_secret_access_key = aws_secret_access_key
+      @columns = []
+      @options = ''
+    end
+
+    def get_command
+      "COPY #{@table_name} #{get_columns} FROM '#{@data_source}' CREDENTIALS '#{get_credentials}' #{@options}"
+    end
+
+    def columns=(columns)
+      raise ArgumentError, 'Columns must be an array' unless columns.is_a?(Array)
+      @columns = columns
+    end
+
+    def execute(connection)
+      begin
+        connection.execute(get_command)
+        []
+      rescue PG::InternalError => e
+        case e.message
+          when /Check 'stl_load_errors' system table for details/
+            return STLLoadErrorFetcher.fetch_errors(connection)
+          when /The specified S3 prefix '.+' does not exist/
+            raise DataSourceError, "A data source with prefix '#{@data_source}' does not exist."
+          when /The bucket you are attempting to access must be addressed using the specified endpoint/
+            raise OtherZoneBucketError, "Your Redshift instance appears to be in a different zone than your S3 bucket. Specify the \"REGION 'bucket-region'\" option."
+          when /PG::SyntaxError/
+            matches = /syntax error (.+) \(PG::SyntaxError\)/.match(e.message)
+            raise SyntaxError, "Syntax error in the COPY command: [#{matches[1]}]."
+          else
+            raise e
+        end
+      end
+    end
+
+    private
+
+    def get_columns
+      return '' if columns.empty?
+      "(#{@columns.join(', ').strip})"
+    end
+
+    def get_credentials
+      "aws_access_key_id=#{@aws_access_key_id};aws_secret_access_key=#{@aws_secret_access_key}"
+    end
+
+  end
+
+end
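get_command is pure string assembly, so the SQL that will be sent can be inspected before touching a cluster. A sketch with placeholder credentials and data source:

```ruby
require 'flare_up'

copy = FlareUp::CopyCommand.new(
  'hearthstone_cards',
  's3://my-bucket/cards.csv',      # placeholder data source
  'AKIAEXAMPLE', 'SECRETEXAMPLE'   # placeholder AWS credentials
)
copy.columns = %w(name cost attack health description)
copy.options = "REGION 'us-east-1' CSV"

puts copy.get_command
# Produces a single-line statement (wrapped here for readability):
#   COPY hearthstone_cards (name, cost, attack, health, description)
#   FROM 's3://my-bucket/cards.csv'
#   CREDENTIALS 'aws_access_key_id=AKIAEXAMPLE;aws_secret_access_key=SECRETEXAMPLE'
#   REGION 'us-east-1' CSV
```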
data/lib/flare_up/stl_load_error.rb
ADDED
@@ -0,0 +1,67 @@
+module FlareUp
+
+  class STLLoadError
+
+    attr_reader :err_reason
+    attr_reader :raw_field_value
+    attr_reader :raw_line
+    attr_reader :col_length
+    attr_reader :type
+    attr_reader :colname
+    attr_reader :filename
+    attr_reader :position
+    attr_reader :line_number
+
+    def initialize(err_reason, raw_field_value, raw_line, col_length, type, colname, filename, position, line_number)
+      @err_reason = err_reason
+      @raw_field_value = raw_field_value
+      @raw_line = raw_line
+      @col_length = col_length
+      @type = type
+      @colname = colname
+      @filename = filename
+      @position = position
+      @line_number = line_number
+    end
+
+    def ==(other_error)
+      return false unless @err_reason == other_error.err_reason
+      return false unless @raw_field_value == other_error.raw_field_value
+      return false unless @raw_line == other_error.raw_line
+      return false unless @col_length == other_error.col_length
+      return false unless @type == other_error.type
+      return false unless @colname == other_error.colname
+      return false unless @filename == other_error.filename
+      return false unless @position == other_error.position
+      return false unless @line_number == other_error.line_number
+      true
+    end
+
+    def pretty_print
+      output = ''
+      output += "\e[33mREASON: \e[37m#{@err_reason}\n"
+      output += "\e[33mLINE  : \e[37m#{@line_number}\n"
+      output += "\e[33mPOS   : \e[37m#{@position}\n"
+      output += "\e[33mCOLUMN: \e[37m#{@colname} (LENGTH=#{@col_length})\n" if @colname.length > 0 && @col_length > 0
+      output += "\e[33mTYPE  : \e[37m#{@type}\n" if @type.length > 0
+      output += "\e[33mLINE  : \e[37m#{@raw_line}\n"
+      output += "        \e[37m#{' ' * @position}^"
+    end
+
+    def self.from_pg_results_row(row)
+      STLLoadError.new(
+        row['err_reason'].strip,
+        row['raw_field_value'].strip,
+        row['raw_line'].strip,
+        row['col_length'].strip.to_i,
+        row['type'].strip,
+        row['colname'].strip,
+        row['filename'].strip,
+        row['position'].strip.to_i,
+        row['line_number'].strip.to_i
+      )
+    end
+
+  end
+
+end
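from_pg_results_row maps a raw stl_load_errors row (space-padded string values, as pg returns them) into an STLLoadError. A sketch with a hand-built row to show the shape it expects; every value below is fabricated for illustration:

```ruby
require 'flare_up'

# In real use this hash comes straight out of a PG::Result over stl_load_errors.
row = {
  'err_reason'      => "Invalid digit, Value 'x', Pos 0, Type: Integer   ",
  'raw_field_value' => 'x  ',
  'raw_line'        => 'Wisp,x,1,1,  ',
  'col_length'      => '0   ',
  'type'            => 'int4      ',
  'colname'         => 'cost      ',
  'filename'        => 's3://my-bucket/cards.csv  ',
  'position'        => '5 ',
  'line_number'     => '2 '
}

error = FlareUp::STLLoadError.from_pg_results_row(row)
puts error.pretty_print  # colourised REASON / LINE / POS / ... summary with a caret under the offending position
```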
data/lib/flare_up/stl_load_error_fetcher.rb
ADDED
@@ -0,0 +1,16 @@
+module FlareUp
+
+  class STLLoadErrorFetcher
+
+    def self.fetch_errors(connection)
+      query_result = connection.execute('SELECT * FROM stl_load_errors ORDER BY query DESC, line_number, position LIMIT 1')
+      errors = []
+      query_result.each do |row|
+        errors << STLLoadError.from_pg_results_row(row)
+      end
+      errors
+    end
+
+  end
+
+end
data/lib/flare_up/version.rb
CHANGED
data/lib/flare_up.rb
CHANGED
@@ -1,3 +1,16 @@
+require 'pg'
+require 'thor'
+
+require 'flare_up/env_wrap'
+
+require 'flare_up/connection'
+require 'flare_up/stl_load_error'
+require 'flare_up/stl_load_error_fetcher'
+require 'flare_up/copy_command'
+
+require 'flare_up/boot'
+require 'flare_up/cli'
+
 module FlareUp
 
 end
data/resources/README.md
ADDED
@@ -0,0 +1,7 @@
+## Resources
+Used to facilitate development of ```flare-up```, these are not packaged as part of the gem, nor do they constitute a public API. Consume at your own risk :)
+
+- ```hearthstone_cards.csv```: A small list of Hearthstone cards to COPY.
+- ```load_hearthstone_cards.rb```: Uses the internal API to COPY from a single S3 file into a test Redshift database. Note that the internal design is subject to change, and definitely will change.
+- ```postgresql-8.4-703.jdbc4.jar```: Used by [SQLWorkbenchJ](http://www.sql-workbench.net/index.html) to connect to Redshift. The current official Postgres client does not support connecting to Redshift because Redshift is based on such an old version of Postgres (the client supports >= 8.4; Redshift is 8.0).
+- ```test_schema.sql```: Prepares our test Redshift database for ingestion.
data/resources/load_hearthstone_cards.rb
ADDED
@@ -0,0 +1,64 @@
+require 'flare_up'
+
+host_name = 'flare-up-test.cskjnp4xvaje.us-west-2.redshift.amazonaws.com'
+db_name = 'dev'
+table_name = 'hearthstone_cards'
+data_source = 's3://slif-redshift/hearthstone_cards_short_list.csv'
+
+conn = FlareUp::Connection.new(host_name, db_name, ENV['REDSHIFT_USERNAME'], ENV['REDSHIFT_PASSWORD'])
+
+copy = FlareUp::CopyCommand.new(table_name, data_source, ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY'])
+copy.columns = %w(name cost attack health description)
+copy.options = "REGION 'us-east-1' CSV"
+
+result = copy.execute(conn)
+
+good_command = <<-COMMAND
+  flare-up copy \
+    s3://slif-redshift/hearthstone_cards_short_list.csv \
+    flare-up-test.cskjnp4xvaje.us-west-2.redshift.amazonaws.com \
+    dev \
+    hearthstone_cards \
+    --column-list name cost attack health description \
+    --copy_options "REGION 'us-east-1' CSV"
+COMMAND
+
+bad_data_source_command = <<-COMMAND
+  flare-up copy \
+    s3://slif-redshift/hearthstone_cards_short_lissdsdsdsdsdsdsd.csv \
+    flare-up-test.cskjnp4xvaje.us-west-2.redshift.amazonaws.com \
+    dev \
+    hearthstone_cards \
+    --column-list name cost attack health description \
+    --copy_options "REGION 'us-east-1' CSV"
+COMMAND
+
+bad_data_command = <<-COMMAND
+  flare-up copy \
+    s3://slif-redshift/hearthstone_cards_broken.csv \
+    flare-up-test.cskjnp4xvaje.us-west-2.redshift.amazonaws.com \
+    dev \
+    hearthstone_cards \
+    --column-list name cost attack health description \
+    --copy_options "REGION 'us-east-1' CSV"
+COMMAND
+
+other_zone_bucket = <<-COMMAND
+  flare-up copy \
+    s3://slif-redshift/hearthstone_cards_short_list.csv \
+    flare-up-test.cskjnp4xvaje.us-west-2.redshift.amazonaws.com \
+    dev \
+    hearthstone_cards \
+    --column-list name cost attack health description \
+    --copy_options CSV
+COMMAND
+
+busted_options = <<-COMMAND
+  flare-up copy \
+    s3://slif-redshift/hearthstone_cards_short_list.csv \
+    flare-up-test.cskjnp4xvaje.us-west-2.redshift.amazonaws.com \
+    dev \
+    hearthstone_cards \
+    --column-list name cost attack health description \
+    --copy_options "CSV ;lmlkmlk3"
+COMMAND
data/resources/postgresql-8.4-703.jdbc4.jar
ADDED
Binary file