RubyGems - tumugi-plugin-bigquery - Versions diffs - 0.1.0 → 0.2.0 - Mend

tumugi-plugin-bigquery 0.1.0 → 0.2.0

Files changed (18)

checksums.yaml +4 -4
data/.travis.yml +6 -4
data/CHANGELOG.md +48 -0
data/README.md +23 -3
data/examples/load.rb +24 -0
data/examples/test.csv +6 -0
data/examples/tumugi_config_example.rb +5 -5
data/lib/tumugi/plugin/bigquery/client.rb +48 -18
data/lib/tumugi/plugin/bigquery/version.rb +1 -1
data/lib/tumugi/plugin/target/bigquery_dataset.rb +2 -2
data/lib/tumugi/plugin/target/bigquery_table.rb +2 -2
data/lib/tumugi/plugin/task/bigquery_copy.rb +3 -1
data/lib/tumugi/plugin/task/bigquery_dataset.rb +1 -1
data/lib/tumugi/plugin/task/bigquery_export.rb +112 -0
data/lib/tumugi/plugin/task/bigquery_load.rb +73 -0
data/lib/tumugi/plugin/task/bigquery_query.rb +1 -1
data/tumugi-plugin-bigquery.gemspec +3 -2
metadata +29 -10

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 3ab49f7c9e04361951e1f7ab9eff4019d09c8829
-  data.tar.gz: 1b5ed8b53e77fb413a98ce6cfe4a682611b33f62
+  metadata.gz: 1f82d5d752da3918795afc6cc669a0fb4711cf95
+  data.tar.gz: fed486ae8aeb9266d4fd11cf523a19a8507755af
 SHA512:
-  metadata.gz: ea2e037bc46885c0e7cec71d085a5883dd9508f597d6e98d275b11464d2a9f27596a26cfd4a9141f51090b9ab2c9a96df63eb27c59ebba12113c00e2cac4d239
-  data.tar.gz: 1e18b4578eda83a38d200896cf0b8e22a85726b08f956bd20e9e794d81322df18416004d58f977e78b053b3609e559064d04cfe683ab1ecc60d32e8f89f34067
+  metadata.gz: 8418f29dfe96d38bcdfa0c5098d59efd819edaa715a7e2af0945c57f70a4d08fa5c477560df636389eefe0bcf40712c4d240b2a97d3301cadebf6e5615808f2b
+  data.tar.gz: aa34ee20fdec506277f3ac40b8819b62f7d000db7c802ed4c9fcedd5af1bf33644ecf673b10326c8e64f8d554a9a708cdfc0f9ae7942b5570257adec3108ab28

data/.travis.yml CHANGED

@@ -1,7 +1,9 @@
 language: ruby
 cache: bundler
 rvm:
-- 2.1.10
-- 2.2.5
-- 2.3.1
-- jruby-9.0.5.0
+  - 2.1
+  - 2.3.1
+  - jruby-9.1.2.0
+before_install:
+  - gem install bundler

data/CHANGELOG.md ADDED

@@ -0,0 +1,48 @@
+# Change Log
+## [0.2.0](https://github.com/tumugi/tumugi-plugin-bigquery/tree/0.2.0) (2016-06-06)
+[Full Changelog](https://github.com/tumugi/tumugi-plugin-bigquery/compare/v0.1.0...0.2.0)
+**Implemented enhancements:**
+- Support extract table to FileSystemTarget [\#23](https://github.com/tumugi/tumugi-plugin-bigquery/issues/23)
+- Support load from GCS [\#5](https://github.com/tumugi/tumugi-plugin-bigquery/issues/5)
+- Support extract table to Google Cloud Storage [\#4](https://github.com/tumugi/tumugi-plugin-bigquery/issues/4)
+- Support service account application default auth [\#22](https://github.com/tumugi/tumugi-plugin-bigquery/pull/22) ([hakobera](https://github.com/hakobera))
+**Fixed bugs:**
+- Fix typo and dependency [\#24](https://github.com/tumugi/tumugi-plugin-bigquery/pull/24) ([hakobera](https://github.com/hakobera))
+- Fix missing project\_id of dataset/table [\#21](https://github.com/tumugi/tumugi-plugin-bigquery/pull/21) ([hakobera](https://github.com/hakobera))
+- Fix private key file auth does not work [\#19](https://github.com/tumugi/tumugi-plugin-bigquery/pull/19) ([hakobera](https://github.com/hakobera))
+- Fix support private key file in config section [\#13](https://github.com/tumugi/tumugi-plugin-bigquery/pull/13) ([hakobera](https://github.com/hakobera))
+**Closed issues:**
+- Update tumugi to v0.5.0 [\#8](https://github.com/tumugi/tumugi-plugin-bigquery/issues/8)
+**Merged pull requests:**
+- Cache output [\#26](https://github.com/tumugi/tumugi-plugin-bigquery/pull/26) ([hakobera](https://github.com/hakobera))
+- Prepare release for 0.2.0 [\#25](https://github.com/tumugi/tumugi-plugin-bigquery/pull/25) ([hakobera](https://github.com/hakobera))
+- Use Thor's invoke instead of system method [\#18](https://github.com/tumugi/tumugi-plugin-bigquery/pull/18) ([hakobera](https://github.com/hakobera))
+- Change test ruby version [\#17](https://github.com/tumugi/tumugi-plugin-bigquery/pull/17) ([hakobera](https://github.com/hakobera))
+- Change tumugi dependency version [\#16](https://github.com/tumugi/tumugi-plugin-bigquery/pull/16) ([hakobera](https://github.com/hakobera))
+- Implement extract table to google cloud storage feature [\#15](https://github.com/tumugi/tumugi-plugin-bigquery/pull/15) ([hakobera](https://github.com/hakobera))
+- Add BigqueryLoadTask [\#12](https://github.com/tumugi/tumugi-plugin-bigquery/pull/12) ([hakobera](https://github.com/hakobera))
+- Update dependency gems [\#11](https://github.com/tumugi/tumugi-plugin-bigquery/pull/11) ([hakobera](https://github.com/hakobera))
+- Update tumugi to v0.5.0 [\#9](https://github.com/tumugi/tumugi-plugin-bigquery/pull/9) ([hakobera](https://github.com/hakobera))
+- Add rubygems badge [\#3](https://github.com/tumugi/tumugi-plugin-bigquery/pull/3) ([hakobera](https://github.com/hakobera))
+## [v0.1.0](https://github.com/tumugi/tumugi-plugin-bigquery/tree/v0.1.0) (2016-05-16)
+**Fixed bugs:**
+- Fix unused arguments [\#2](https://github.com/tumugi/tumugi-plugin-bigquery/pull/2) ([hakobera](https://github.com/hakobera))
+**Merged pull requests:**
+- First implementation [\#1](https://github.com/tumugi/tumugi-plugin-bigquery/pull/1) ([hakobera](https://github.com/hakobera))
+\* *This Change Log was automatically generated by [github_changelog_generator](https://github.com/skywinder/Github-Changelog-Generator)*

data/README.md CHANGED

@@ -1,4 +1,4 @@
-[![Build Status](https://travis-ci.org/tumugi/tumugi-plugin-bigquery.svg?branch=master)](https://travis-ci.org/tumugi/tumugi-plugin-bigquery) [![Code Climate](https://codeclimate.com/github/tumugi/tumugi-plugin-bigquery/badges/gpa.svg)](https://codeclimate.com/github/tumugi/tumugi-plugin-bigquery) [![Coverage Status](https://coveralls.io/repos/github/tumugi/tumugi-plugin-bigquery/badge.svg?branch=master)](https://coveralls.io/github/tumugi/tumugi-plugin-bigquery)
+[![Build Status](https://travis-ci.org/tumugi/tumugi-plugin-bigquery.svg?branch=master)](https://travis-ci.org/tumugi/tumugi-plugin-bigquery) [![Code Climate](https://codeclimate.com/github/tumugi/tumugi-plugin-bigquery/badges/gpa.svg)](https://codeclimate.com/github/tumugi/tumugi-plugin-bigquery) [![Coverage Status](https://coveralls.io/repos/github/tumugi/tumugi-plugin-bigquery/badge.svg?branch=master)](https://coveralls.io/github/tumugi/tumugi-plugin-bigquery)  [![Gem Version](https://badge.fury.io/rb/tumugi-plugin-bigquery.svg)](https://badge.fury.io/rb/tumugi-plugin-bigquery)
 # tumugi-plugin-bigquery
@@ -68,6 +68,8 @@ end
 #### Usage
+Copy `test.src_table` to `test.dest_table`.
 ```rb
 task :task1, type: :bigquery_copy do
   param_set :src_dataset_id, 'test'
@@ -77,6 +79,24 @@ task :task1, type: :bigquery_copy do
 end
 ```
+### Tumugi::Plugin::BigqueryLoadTask
+`Tumugi::Plugin::BigqueryLoadTask` is task to load structured data from GCS into BigQuery.
+#### Usage
+Load `gs://test_bucket/load_data.csv` into `dest_project:dest_dataset.dest_table`
+```rb
+task :task1, type: :bigquery_load do
+  param_set :bucket, 'test_bucket'
+  param_set :key, 'load_data.csv'
+  param_set :project_id, 'dest_project'
+  param_set :datset_id, 'dest_dataset'
+  param_set :table_id, 'dest_table'
+end
+```
 ### Config Section
 tumugi-plugin-bigquery provide config section named "bigquery" which can specified BigQuery autenticaion info.
@@ -84,7 +104,7 @@ tumugi-plugin-bigquery provide config section named "bigquery" which can specifi
 #### Authenticate by client_email and private_key
 ```rb
-Tumugi.config do |config|
+Tumugi.configure do |config|
   config.section("bigquery") do |section|
     section.project_id = "xxx"
     section.client_email = "yyy@yyy.iam.gserviceaccount.com"
@@ -96,7 +116,7 @@ end
 #### Authenticate by JSON key file
 ```rb
-Tumugi.config do |config|
+Tumugi.configure do |config|
   config.section("bigquery") do |section|
     section.private_key_file = "/path/to/key.json"
   end

data/examples/load.rb ADDED

@@ -0,0 +1,24 @@
+task :task1, type: :bigquery_load do
+  requires :task2
+  param_set :bucket, 'tumugi-plugin-bigquery'
+  param_set :key, 'test.csv'
+  param_set :dataset_id, -> { input.dataset_id }
+  param_set :table_id, 'load_test'
+  param_set :skip_leading_rows, 1
+  param_set :schema, [
+    {
+      name: 'row_number',
+      type: 'INTEGER',
+      mode: 'NULLABLE'
+    },
+    {
+      name: 'value',
+      type: 'INTEGER',
+      mode: 'NULLABLE'
+    },
+  ]
+end
+task :task2, type: :bigquery_dataset do
+  param_set :dataset_id, 'test'
+end

data/examples/test.csv ADDED

@@ -0,0 +1,6 @@
+row_number,value
+1,1
+2,2
+3,3
+4,4
+5,5

data/examples/tumugi_config_example.rb CHANGED

@@ -1,7 +1,7 @@
-Tumugi.config do |c|
-  c.section('bigquery') do |s|
-    s.project_id = ENV["PROJECT_ID"]
-    s.client_email = ENV["CLIENT_EMAIL"]
-    s.private_key = ENV["PRIVATE_KEY"].gsub(/\\n/, "\n")
+Tumugi.configure do |config|
+  config.section('bigquery') do |section|
+    section.project_id = ENV["PROJECT_ID"]
+    section.client_email = ENV["CLIENT_EMAIL"]
+    section.private_key = ENV["PRIVATE_KEY"].gsub(/\\n/, "\n")
   end
 end

data/lib/tumugi/plugin/bigquery/client.rb CHANGED

@@ -1,4 +1,5 @@
 require 'kura'
+require 'json'
 require_relative './error'
 Tumugi::Config.register_section('bigquery', :project_id, :client_email, :private_key, :private_key_file)
@@ -9,12 +10,22 @@ module Tumugi
       class Client
         attr_reader :project_id
-        def initialize(project_id: nil, client_email: nil, private_key: nil)
-          config = Tumugi.config.section('bigquery')
-          @project_id = project_id || config.project_id
-          @client_email = client_email || config.client_email
-          @private_key = private_key || config.private_key
-          @client = Kura.client(@project_id, @client_email, @private_key)
+        def initialize(project_id: nil, client_email: nil, private_key: nil, private_key_file: nil)
+          @project_id = project_id
+          if client_email.nil? && private_key.nil? && !private_key_file.nil?
+            @client = Kura.client(private_key_file)
+            if @project_id.nil?
+              key = JSON.parse(File.read(private_key_file))
+              @project_id = key['project_id']
+            end
+          else
+            # This method call style is needed for jruby.
+            # JRuby cannot handle correctly if method using keyword hash and last hash argument.
+            # see https://bugs.ruby-lang.org/issues/7529
+            @client = Kura.client(project_id = { "project_id" => @project_id, "client_email" => client_email, "private_key" => private_key },
+                                  client_email = nil, private_key = nil, {http_options: {timeout: 60}})
+          end
         rescue Kura::ApiError => e
           process_error(e)
         end
@@ -77,6 +88,12 @@ module Tumugi
           process_error(e)
         end
+        def table(dataset_id, table_id, project_id: nil)
+          @client.table(dataset_id, table_id, project_id: project_id || @project_id)
+        rescue Kura::ApiError => e
+          process_error(e)
+        end
         def table_exist?(dataset_id, table_id, project_id: nil)
           !@client.table(dataset_id, table_id, project_id: project_id || @project_id).nil?
         rescue Kura::ApiError => e
@@ -163,6 +180,7 @@ module Tumugi
                   use_query_cache: true,
                   user_defined_function_resources: nil,
                   project_id: nil,
+                  job_project_id: nil,
                   job_id: nil,
                   wait: nil,
                   dry_run: false,
@@ -175,7 +193,7 @@ module Tumugi
                         use_query_cache: use_query_cache,
                         user_defined_function_resources: user_defined_function_resources,
                         project_id: project_id || @project_id,
-                        job_project_id: project_id || @project_id,
+                        job_project_id: job_project_id || @project_id,
                         job_id: job_id,
                         wait: wait,
                         dry_run: dry_run,
@@ -185,28 +203,38 @@ module Tumugi
         end
         def load(dataset_id, table_id, source_uris=nil,
-                 schema: nil, delimiter: ",", field_delimiter: delimiter, mode: :append,
-                 allow_jagged_rows: false, max_bad_records: 0,
+                 schema: nil,
+                 field_delimiter: ",",
+                 mode: :append,
+                 allow_jagged_rows: false,
+                 max_bad_records: 0,
                  ignore_unknown_values: false,
                  allow_quoted_newlines: false,
-                 quote: '"', skip_leading_rows: 0,
+                 quote: '"',
+                 skip_leading_rows: 0,
                  source_format: "CSV",
                  project_id: nil,
+                 job_project_id: nil,
                  job_id: nil,
                  file: nil, wait: nil,
                  dry_run: false,
                  &blk)
           @client.load(dataset_id, table_id, source_uris=source_uris,
-                       schema: schema, delimiter: delimiter, field_delimiter: field_delimiter, mode: mode,
-                       allow_jagged_rows: allow_jagged_rows, max_bad_records: max_bad_records,
+                       schema: schema,
+                       field_delimiter: field_delimiter,
+                       mode: mode,
+                       allow_jagged_rows: allow_jagged_rows,
+                       max_bad_records: max_bad_records,
                        ignore_unknown_values: ignore_unknown_values,
                        allow_quoted_newlines: allow_quoted_newlines,
-                       quote: quote, skip_leading_rows: skip_leading_rows,
+                       quote: quote,
+                       skip_leading_rows: skip_leading_rows,
                        source_format: source_format,
                        project_id: project_id || @project_id,
-                       job_project_id: project_id || @project_id,
+                       job_project_id: job_project_id || @project_id,
                        job_id: job_id,
-                       file: file, wait: wait,
+                       file: file,
+                       wait: wait,
                        dry_run: dry_run,
                        &blk)
         rescue Kura::ApiError => e
@@ -219,6 +247,7 @@ module Tumugi
                     field_delimiter: ",",
                     print_header: true,
                     project_id: nil,
+                    job_project_id: nil,
                     job_id: nil,
                     wait: nil,
                     dry_run: false,
@@ -229,7 +258,7 @@ module Tumugi
                           field_delimiter: field_delimiter,
                           print_header: print_header,
                           project_id: project_id || @project_id,
-                          job_project_id: project_id || @project_id,
+                          job_project_id: job_project_id || @project_id,
                           job_id: job_id,
                           wait: wait,
                           dry_run: dry_run,
@@ -242,6 +271,7 @@ module Tumugi
                  mode: :truncate,
                  src_project_id: nil,
                  dest_project_id: nil,
+                 job_project_id: dest_project_id,
                  job_id: nil,
                  wait: nil,
                  dry_run: false,
@@ -250,7 +280,7 @@ module Tumugi
                        mode: mode,
                        src_project_id: src_project_id || @project_id,
                        dest_project_id: dest_project_id || @project_id,
-                       job_project_id: dest_project_id || @project_id,
+                       job_project_id: job_project_id || @project_id,
                        job_id: job_id,
                        wait: wait,
                        dry_run: dry_run,
@@ -280,7 +310,7 @@ module Tumugi
         private
         def process_error(e)
-          raise Tumugi::Plugin::Bigquery::BigqueryError.new(e.reason, e.message)
+          raise Tumugi::Plugin::Bigquery::BigqueryError.new(e.message, e.reason)
         end
       end
     end
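
Note on the `Client#initialize` hunk above: when only a JSON key file is supplied and no `project_id` is passed, 0.2.0 reads `project_id` from the key file. The snippet below is a minimal standalone sketch of that fallback only. `resolve_project_id` is a hypothetical helper name used for illustration and is not part of the gem; the sketch does not call Kura or BigQuery.

```ruby
require 'json'
require 'tempfile'

# Hypothetical helper (not part of the gem): mirrors the fallback in the
# 0.2.0 initializer, where an explicit project_id wins and the key file's
# "project_id" field is used only when none was given.
def resolve_project_id(project_id, private_key_file)
  return project_id unless project_id.nil?
  JSON.parse(File.read(private_key_file))['project_id']
end

# Demonstration with a throwaway key file containing made-up values.
Tempfile.create(['key', '.json']) do |f|
  f.write(JSON.generate('project_id' => 'example-project'))
  f.flush
  puts resolve_project_id(nil, f.path)
  puts resolve_project_id('explicit-project', f.path)
end
```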

data/lib/tumugi/plugin/bigquery/version.rb CHANGED

@@ -1,7 +1,7 @@
 module Tumugi
   module Plugin
     module Bigquery
-      VERSION = "0.1.0"
+      VERSION = "0.2.0"
     end
   end
 end

data/lib/tumugi/plugin/target/bigquery_dataset.rb CHANGED

@@ -17,8 +17,8 @@ module Tumugi
         cfg = Tumugi.config.section('bigquery')
         @project_id = project_id || cfg.project_id
         @dataset_id = dataset_id
-        @client = client || Tumugi::Plugin::Bigquery::Client.new(project_id: @project_id)
-        @dataset = Tumugi::Plugin::Bigquery::Dataset.new(project_id: @project_id, dataset_id: @dataset_id)
+        @client = client || Tumugi::Plugin::Bigquery::Client.new(cfg.to_h.merge(project_id: @project_id))
+        @dataset = Tumugi::Plugin::Bigquery::Dataset.new(project_id: @client.project_id, dataset_id: @dataset_id)
       end
       def exist?

data/lib/tumugi/plugin/target/bigquery_table.rb CHANGED

@@ -18,8 +18,8 @@ module Tumugi
         @project_id = project_id || cfg.project_id
         @dataset_id = dataset_id
         @table_id = table_id
-        @client = client || Tumugi::Plugin::Bigquery::Client.new(project_id: @project_id)
-        @table = Tumugi::Plugin::Bigquery::Table.new(project_id: @project_id, dataset_id: @dataset_id, table_id: @table_id)
+        @client = client || Tumugi::Plugin::Bigquery::Client.new(cfg.to_h.merge(project_id: @project_id))
+        @table = Tumugi::Plugin::Bigquery::Table.new(project_id: @client.project_id, dataset_id: @dataset_id, table_id: @table_id)
       end
       def exist?

data/lib/tumugi/plugin/task/bigquery_copy.rb CHANGED

@@ -15,9 +15,11 @@ module Tumugi
       param :wait, type: :int, default: 60
       def output
+        return @output if @output
         opts = { dataset_id: dest_dataset_id, table_id: dest_table_id }
         opts[:project_id] = dest_project_id if dest_project_id
-        Tumugi::Plugin::BigqueryTableTarget.new(opts)
+        @output = Tumugi::Plugin::BigqueryTableTarget.new(opts)
       end
       def run

data/lib/tumugi/plugin/task/bigquery_dataset.rb CHANGED

@@ -10,7 +10,7 @@ module Tumugi
       param :dataset_id, type: :string, required: true
       def output
-        Tumugi::Plugin::BigqueryDatasetTarget.new(project_id: project_id, dataset_id: dataset_id)
+        @output ||= Tumugi::Plugin::BigqueryDatasetTarget.new(project_id: project_id, dataset_id: dataset_id)
       end
       def run

data/lib/tumugi/plugin/task/bigquery_export.rb ADDED

@@ -0,0 +1,112 @@
+require 'json'
+require 'tumugi'
+require 'tumugi/plugin/file_system_target'
+require_relative '../target/bigquery_table'
+module Tumugi
+  module Plugin
+    class BigqueryExportTask < Tumugi::Task
+      Tumugi::Plugin.register_task('bigquery_export', self)
+      param :project_id, type: :string
+      param :job_project_id, type: :string
+      param :dataset_id, type: :string, required: true
+      param :table_id, type: :string, required: true
+      param :compression, type: :string, default: 'NONE' # GZIP
+      param :destination_format, type: :string, default: 'CSV' # NEWLINE_DELIMITED_JSON, AVRO
+      # Only effected if destiation_format == 'CSV'
+      param :field_delimiter, type: :string, default: ','
+      param :print_header, type: :bool, default: true
+      param :page_size, type: :integer, default: 10000
+      param :wait, type: :integer, default: 120
+      def run
+        unless output.is_a?(Tumugi::Plugin::FileSystemTarget)
+          raise Tumugi::TumugiError.new("BigqueryExportTask#output must be return a instance of Tumugi::Plugin::FileSystemTarget")
+        end
+        client = Tumugi::Plugin::Bigquery::Client.new(config)
+        table = Tumugi::Plugin::Bigquery::Table.new(project_id: client.project_id, dataset_id: dataset_id, table_id: table_id)
+        job_project_id = client.project_id if job_project_id.nil?
+        log "Source: #{table}"
+        log "Destination: #{output}"
+        if is_gcs?(output)
+          export_to_gcs(client)
+        else
+          if destination_format.upcase == 'AVRO'
+            raise Tumugi::TumugiError.new("destination_format='AVRO' is only supported when export to Google Cloud Storage")
+          end
+          if compression.upcase == 'GZIP'
+            logger.warn("compression parameter is ignored, it's only supported when export to Google Cloud Storage")
+          end
+          export_to_file_system(client)
+        end
+      end
+      private
+      def is_gcs?(target)
+        not target.to_s.match(/^gs:\/\/[^\/]+\/.+$/).nil?
+      end
+      def export_to_gcs(client)
+        options = {
+          compression: compression.upcase,
+          destination_format: destination_format.upcase,
+          field_delimiter: field_delimiter,
+          print_header: print_header,
+          project_id: client.project_id,
+          job_project_id: job_project_id || client.project_id,
+          wait: wait
+        }
+        client.extract(dataset_id, table_id, output.to_s, options)
+      end
+      def export_to_file_system(client)
+        schema ||= client.table(dataset_id, table_id, project_id: client.project_id).schema.fields
+        field_names = schema.map{|f| f.respond_to?(:[]) ? (f["name"] || f[:name]) : f.name }
+        start_index = 0
+        page_token = nil
+        options = {
+          max_result: page_size,
+          project_id: client.project_id,
+        }
+        output.open('w') do |file|
+          file.puts field_names.join(field_delimiter) if destination_format == 'CSV' && print_header
+          begin
+            table_data_list = client.list_tabledata(dataset_id, table_id, options.merge(start_index: start_index, page_token: page_token))
+            start_index += page_size
+            page_token = table_data_list[:next_token]
+            table_data_list[:rows].each do |row|
+              file.puts line(field_names, row, destination_format)
+            end
+          end while not page_token.nil?
+        end
+      end
+      def line(field_names, row, format)
+        case format
+        when 'CSV'
+          row.map{|v| v[1]}.join(field_delimiter)
+        when 'NEWLINE_DELIMITED_JSON'
+          JSON.generate(row.to_h)
+        end
+      end
+      def config
+        cfg = Tumugi.config.section('bigquery').to_h
+        unless project_id.nil?
+          cfg[:project_id] = project_id
+        end
+        cfg
+      end
+    end
+  end
+end
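
Note on `bigquery_export.rb` above: the task picks the export path by matching the output target against a `gs://bucket/key` pattern, and formats each row either as a delimited line or as one JSON object per line. The following is a minimal standalone sketch of those two pieces. The names `gcs_target?` and `format_row` are hypothetical, and the sketch assumes each row is a Hash of column name to value, which the diff suggests (`row.map{|v| v[1]}`, `row.to_h`) but does not state outright.

```ruby
require 'json'

# Same pattern as the task's is_gcs? check: scheme, bucket, then a non-empty key.
GCS_PATTERN = %r{^gs://[^/]+/.+$}

# Hypothetical helper: true when the target's string form looks like a GCS URI.
def gcs_target?(target)
  !target.to_s.match(GCS_PATTERN).nil?
end

# Hypothetical helper: formats one row (assumed to be a Hash) for local output.
def format_row(row, format, field_delimiter = ',')
  case format
  when 'CSV'
    row.values.join(field_delimiter)
  when 'NEWLINE_DELIMITED_JSON'
    JSON.generate(row)
  end
end

row = { 'row_number' => 1, 'value' => 1 }
puts gcs_target?('gs://test_bucket/export.csv')
puts gcs_target?('/tmp/export.csv')
puts format_row(row, 'CSV')
puts format_row(row, 'NEWLINE_DELIMITED_JSON')
```

As in the task itself, the CSV branch is a plain `join`, so values that contain the delimiter or a quote character are not escaped.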

data/lib/tumugi/plugin/task/bigquery_load.rb ADDED

@@ -0,0 +1,73 @@
+require 'tumugi'
+require_relative '../target/bigquery_table'
+module Tumugi
+  module Plugin
+    class BigqueryLoadTask < Tumugi::Task
+      Tumugi::Plugin.register_task('bigquery_load', self)
+      param :bucket, type: :string, required: true
+      param :key, type: :string, required: true
+      param :project_id, type: :string
+      param :dataset_id, type: :string, required: true
+      param :table_id, type: :string, required: true
+      param :schema # type: :array
+      param :field_delimiter, type: :string, default: ','
+      param :mode, type: :string, default: 'append' # truncate, empty
+      param :allow_jagged_rows, type: :bool, default: false
+      param :max_bad_records, type: :integer, default: 0
+      param :ignore_unknown_values, type: :bool, default: false
+      param :allow_quoted_newlines, type: :bool, default: false
+      param :quote, type: :string, default: '"'
+      param :skip_leading_rows, type: :interger, default: 0
+      param :source_format, type: :string, default: 'CSV' # NEWLINE_DELIMITED_JSON, AVRO
+      param :wait, type: :integer, default: 60
+      def output
+        return @output if @output
+        opts = { dataset_id: dataset_id, table_id: table_id }
+        opts[:project_id] = project_id if project_id
+        @output = Tumugi::Plugin::BigqueryTableTarget.new(opts)
+      end
+      def run
+        if mode != 'append'
+          raise Tumugi::ParameterError.new("Parameter 'schema' is required when 'mode' is 'truncate' or 'empty'") if schema.nil?
+        end
+        src_uri = "gs://#{bucket}#{normalize_path(key)}"
+        log "Source: #{src_uri}"
+        log "Destination: #{output}"
+        bq_client = output.client
+        opts = {
+          schema: schema,
+          field_delimiter: field_delimiter,
+          mode: mode.to_sym,
+          allow_jagged_rows: allow_jagged_rows,
+          max_bad_records: max_bad_records,
+          ignore_unknown_values: ignore_unknown_values,
+          allow_quoted_newlines: allow_quoted_newlines,
+          quote: quote,
+          skip_leading_rows: skip_leading_rows,
+          source_format: source_format,
+          project_id: output.project_id,
+          wait: wait
+        }
+        bq_client.load(output.dataset_id, output.table_id, src_uri, opts)
+      end
+      private
+      def normalize_path(path)
+        unless path.start_with?('/')
+          "/#{path}"
+        else
+          path
+        end
+      end
+    end
+  end
+end
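
Note on `bigquery_load.rb` above: the source URI is built from the `bucket` and `key` parameters, with `normalize_path` making sure the key starts with a single leading slash. Below is a minimal standalone sketch of that construction. `gcs_uri` is a hypothetical helper name introduced for illustration; `normalize_path` follows the logic shown in the diff.

```ruby
# Prefixes the key with "/" unless it already has one (same logic as the task).
def normalize_path(path)
  path.start_with?('/') ? path : "/#{path}"
end

# Hypothetical helper: assembles the gs:// URI the load task logs and passes on.
def gcs_uri(bucket, key)
  "gs://#{bucket}#{normalize_path(key)}"
end

# Both spellings of the key yield the same URI.
puts gcs_uri('test_bucket', 'load_data.csv')
puts gcs_uri('test_bucket', '/load_data.csv')
```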

data/lib/tumugi/plugin/task/bigquery_query.rb CHANGED

@@ -13,7 +13,7 @@ module Tumugi
       param :wait, type: :int, default: 60
       def output
-        Tumugi::Plugin::BigqueryTableTarget.new(project_id: project_id, dataset_id: dataset_id, table_id: table_id)
+        @output ||= Tumugi::Plugin::BigqueryTableTarget.new(project_id: project_id, dataset_id: dataset_id, table_id: table_id)
       end
       def run

data/tumugi-plugin-bigquery.gemspec CHANGED

@@ -20,8 +20,8 @@ Gem::Specification.new do |spec|
   spec.executables   = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
   spec.require_paths = ["lib"]
-  spec.add_runtime_dependency "tumugi", "~> 0.4.5"
-  spec.add_runtime_dependency "kura", "0.2.16"
+  spec.add_runtime_dependency "tumugi", ">= 0.5.1"
+  spec.add_runtime_dependency "kura", "~> 0.2.17"
   spec.add_development_dependency 'bundler', '~> 1.11'
   spec.add_development_dependency 'rake', '~> 10.0'
@@ -29,4 +29,5 @@ Gem::Specification.new do |spec|
   spec.add_development_dependency 'test-unit-rr'
   spec.add_development_dependency 'coveralls'
   spec.add_development_dependency 'github_changelog_generator'
+  spec.add_development_dependency 'tumugi-plugin-google_cloud_storage'
 end

metadata CHANGED

@@ -1,43 +1,43 @@
 --- !ruby/object:Gem::Specification
 name: tumugi-plugin-bigquery
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 platform: ruby
 authors:
 - Kazuyuki Honda
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2016-05-16 00:00:00.000000000 Z
+date: 2016-06-06 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: tumugi
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - "~>"
+    - - ">="
       - !ruby/object:Gem::Version
-        version: 0.4.5
+        version: 0.5.1
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - "~>"
+    - - ">="
       - !ruby/object:Gem::Version
-        version: 0.4.5
+        version: 0.5.1
 - !ruby/object:Gem::Dependency
   name: kura
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - '='
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.2.16
+        version: 0.2.17
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - '='
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.2.16
+        version: 0.2.17
 - !ruby/object:Gem::Dependency
   name: bundler
   requirement: !ruby/object:Gem::Requirement
@@ -122,6 +122,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '0'
+- !ruby/object:Gem::Dependency
+  name: tumugi-plugin-google_cloud_storage
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description:
 email:
 - hakobera@gmail.com
@@ -131,13 +145,16 @@ extra_rdoc_files: []
 files:
 - ".gitignore"
 - ".travis.yml"
+- CHANGELOG.md
 - Gemfile
 - README.md
 - Rakefile
 - bin/setup
 - examples/copy.rb
 - examples/dataset.rb
+- examples/load.rb
 - examples/query.rb
+- examples/test.csv
 - examples/tumugi_config_example.rb
 - lib/tumugi/plugin/bigquery/client.rb
 - lib/tumugi/plugin/bigquery/dataset.rb
@@ -148,6 +165,8 @@ files:
 - lib/tumugi/plugin/target/bigquery_table.rb
 - lib/tumugi/plugin/task/bigquery_copy.rb
 - lib/tumugi/plugin/task/bigquery_dataset.rb
+- lib/tumugi/plugin/task/bigquery_export.rb
+- lib/tumugi/plugin/task/bigquery_load.rb
 - lib/tumugi/plugin/task/bigquery_query.rb
 - tumugi-plugin-bigquery.gemspec
 homepage: https://github.com/tumugi/tumugi-plugin-bigquery