tumugi-plugin-bigquery 0.2.0 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 1f82d5d752da3918795afc6cc669a0fb4711cf95
- data.tar.gz: fed486ae8aeb9266d4fd11cf523a19a8507755af
+ metadata.gz: 63e4b8a538949b06c7a63d62b60e965c3d167e21
+ data.tar.gz: ead04218cb01d036f9c0c457d6a036ab8b6a12b1
  SHA512:
- metadata.gz: 8418f29dfe96d38bcdfa0c5098d59efd819edaa715a7e2af0945c57f70a4d08fa5c477560df636389eefe0bcf40712c4d240b2a97d3301cadebf6e5615808f2b
- data.tar.gz: aa34ee20fdec506277f3ac40b8819b62f7d000db7c802ed4c9fcedd5af1bf33644ecf673b10326c8e64f8d554a9a708cdfc0f9ae7942b5570257adec3108ab28
+ metadata.gz: 1d21aa4a556541f906d566f18fd61b94960eb6021f3c5c749de07dd2be14444533d228d5229a5ffe4d4b934071e670708aeb6c43dad6976e4feb4c8613dd1474
+ data.tar.gz: 389387e0fbcf5e4ab0719260bcefa70d8176ce0b4f96bb8b6d40c4078a23bde30dfd1b03c1189282cc78cdaddda4498c0e57c72325d6a9cbe747a2f0ef44e2b1
data/.gitignore CHANGED
@@ -7,4 +7,5 @@
  /pkg/
  /spec/reports/
  /tmp/
+ .ruby-version
  tumugi_config.rb
data/CHANGELOG.md CHANGED
@@ -1,7 +1,29 @@
  # Change Log

- ## [0.2.0](https://github.com/tumugi/tumugi-plugin-bigquery/tree/0.2.0) (2016-06-06)
- [Full Changelog](https://github.com/tumugi/tumugi-plugin-bigquery/compare/v0.1.0...0.2.0)
+ ## [v0.3.0](https://github.com/tumugi/tumugi-plugin-bigquery/tree/v0.3.0) (2016-07-16)
+ [Full Changelog](https://github.com/tumugi/tumugi-plugin-bigquery/compare/v0.2.0...v0.3.0)
+
+ **Implemented enhancements:**
+
+ - Support flatten\_result flag [\#30](https://github.com/tumugi/tumugi-plugin-bigquery/issues/30)
+ - Support mode parameter for BigqueryQueryTask [\#28](https://github.com/tumugi/tumugi-plugin-bigquery/issues/28)
+ - Support standard SQL [\#20](https://github.com/tumugi/tumugi-plugin-bigquery/issues/20)
+ - Support force copy table [\#7](https://github.com/tumugi/tumugi-plugin-bigquery/issues/7)
+
+ **Fixed bugs:**
+
+ - Fix JSON export for FileSystemTarget does not work [\#31](https://github.com/tumugi/tumugi-plugin-bigquery/issues/31)
+
+ **Merged pull requests:**
+
+ - Update tumugi to 0.6 [\#35](https://github.com/tumugi/tumugi-plugin-bigquery/pull/35) ([hakobera](https://github.com/hakobera))
+ - Add JSON export test [\#34](https://github.com/tumugi/tumugi-plugin-bigquery/pull/34) ([hakobera](https://github.com/hakobera))
+ - Fix misc [\#33](https://github.com/tumugi/tumugi-plugin-bigquery/pull/33) ([hakobera](https://github.com/hakobera))
+ - Support force\_copy parameter for bigquery\_copy task [\#32](https://github.com/tumugi/tumugi-plugin-bigquery/pull/32) ([hakobera](https://github.com/hakobera))
+ - Support append mode query and use legacy SQL flag [\#29](https://github.com/tumugi/tumugi-plugin-bigquery/pull/29) ([hakobera](https://github.com/hakobera))
+
+ ## [v0.2.0](https://github.com/tumugi/tumugi-plugin-bigquery/tree/v0.2.0) (2016-06-06)
+ [Full Changelog](https://github.com/tumugi/tumugi-plugin-bigquery/compare/v0.1.0...v0.2.0)

  **Implemented enhancements:**

@@ -23,8 +45,10 @@

  **Merged pull requests:**

- - Cache output [\#26](https://github.com/tumugi/tumugi-plugin-bigquery/pull/26) ([hakobera](https://github.com/hakobera))
+ - Update changelog [\#27](https://github.com/tumugi/tumugi-plugin-bigquery/pull/27) ([hakobera](https://github.com/hakobera))
  - Prepare release for 0.2.0 [\#25](https://github.com/tumugi/tumugi-plugin-bigquery/pull/25) ([hakobera](https://github.com/hakobera))
+ - Add rubygems badge [\#3](https://github.com/tumugi/tumugi-plugin-bigquery/pull/3) ([hakobera](https://github.com/hakobera))
+ - Cache output [\#26](https://github.com/tumugi/tumugi-plugin-bigquery/pull/26) ([hakobera](https://github.com/hakobera))
  - Use Thor's invoke instead of system method [\#18](https://github.com/tumugi/tumugi-plugin-bigquery/pull/18) ([hakobera](https://github.com/hakobera))
  - Change test ruby version [\#17](https://github.com/tumugi/tumugi-plugin-bigquery/pull/17) ([hakobera](https://github.com/hakobera))
  - Change tumugi dependency version [\#16](https://github.com/tumugi/tumugi-plugin-bigquery/pull/16) ([hakobera](https://github.com/hakobera))
@@ -32,7 +56,6 @@
  - Add BigqueryLoadTask [\#12](https://github.com/tumugi/tumugi-plugin-bigquery/pull/12) ([hakobera](https://github.com/hakobera))
  - Update dependency gems [\#11](https://github.com/tumugi/tumugi-plugin-bigquery/pull/11) ([hakobera](https://github.com/hakobera))
  - Update tumugi to v0.5.0 [\#9](https://github.com/tumugi/tumugi-plugin-bigquery/pull/9) ([hakobera](https://github.com/hakobera))
- - Add rubygems badge [\#3](https://github.com/tumugi/tumugi-plugin-bigquery/pull/3) ([hakobera](https://github.com/hakobera))

  ## [v0.1.0](https://github.com/tumugi/tumugi-plugin-bigquery/tree/v0.1.0) (2016-05-16)
  **Fixed bugs:**
data/README.md CHANGED
@@ -1,8 +1,8 @@
  [![Build Status](https://travis-ci.org/tumugi/tumugi-plugin-bigquery.svg?branch=master)](https://travis-ci.org/tumugi/tumugi-plugin-bigquery) [![Code Climate](https://codeclimate.com/github/tumugi/tumugi-plugin-bigquery/badges/gpa.svg)](https://codeclimate.com/github/tumugi/tumugi-plugin-bigquery) [![Coverage Status](https://coveralls.io/repos/github/tumugi/tumugi-plugin-bigquery/badge.svg?branch=master)](https://coveralls.io/github/tumugi/tumugi-plugin-bigquery) [![Gem Version](https://badge.fury.io/rb/tumugi-plugin-bigquery.svg)](https://badge.fury.io/rb/tumugi-plugin-bigquery)

- # tumugi-plugin-bigquery
+ # Google BigQuery plugin for [tumugi](https://github.com/tumugi/tumugi)

- tumugi-plugin-bigquery is a plugin for integrate [Google BigQuery](https://cloud.google.com/bigquery/) and [Tumugi](https://github.com/tumugi/tumugi).
+ tumugi-plugin-bigquery is a plugin to integrate [Google BigQuery](https://cloud.google.com/bigquery/) with [tumugi](https://github.com/tumugi/tumugi).

  ## Installation

@@ -12,17 +12,7 @@ Add this line to your application's Gemfile:
  gem 'tumugi-plugin-bigquery'
  ```

- And then execute:
-
- ```sh
- $ bundle
- ```
-
- Or install it yourself as:
-
- ```sb
- $ gem install tumugi-plugin-bigquery
- ```
+ And then execute `bundle install`.

  ## Target

@@ -30,21 +20,65 @@ $ gem install tumugi-plugin-bigquery

  `Tumugi::Plugin::BigqueryDatasetTarget` is a target for a BigQuery dataset.

+ #### Parameters
+
+ | name       | type   | required? | default | description |
+ |------------|--------|-----------|---------|------------------------------------------------------------------|
+ | dataset_id | string | required  |         | Dataset ID |
+ | project_id | string | optional  |         | [Project](https://cloud.google.com/compute/docs/projects) ID |
+
+ #### Examples
+
+ ```rb
+ task :task1 do
+ output target(:bigquery_dataset, dataset_id: "your_dataset_id")
+ end
+ ```
+
+ ```rb
+ task :task1 do
+ output target(:bigquery_dataset, project_id: "project_id", dataset_id: "dataset_id")
+ end
+ ```
+
  #### Tumugi::Plugin::BigqueryTableTarget

  `Tumugi::Plugin::BigqueryTableTarget` is a target for a BigQuery table.

+ #### Parameters
+
+ | name       | type   | required? | default | description |
+ |------------|--------|-----------|---------|------------------------------------------------------------------|
+ | table_id   | string | required  |         | Table ID |
+ | dataset_id | string | required  |         | Dataset ID |
+ | project_id | string | optional  |         | [Project](https://cloud.google.com/compute/docs/projects) ID |
+
+ #### Examples
+
+ ```rb
+ task :task1 do
+ output target(:bigquery_table, table_id: "table_id", dataset_id: "your_dataset_id")
+ end
+ ```
+
  ## Task

  ### Tumugi::Plugin::BigqueryDatasetTask

  `Tumugi::Plugin::BigqueryDatasetTask` is a task to create a dataset.

- #### Usage
+ #### Parameters
+
+ | name       | type   | required? | default | description |
+ |------------|--------|-----------|---------|------------------------------------------------------------------|
+ | dataset_id | string | required  |         | Dataset ID |
+ | project_id | string | optional  |         | [Project](https://cloud.google.com/compute/docs/projects) ID |
+
+ #### Examples

  ```rb
  task :task1, type: :bigquery_dataset do
- param_set :dataset_id, 'test'
+ dataset_id 'test'
  end
  ```

@@ -52,13 +86,41 @@ end

  `Tumugi::Plugin::BigqueryQueryTask` is a task to run `query` and save the result into the table specified by parameters.

- #### Usage
+ #### Parameters
+
+ | name            | type    | required? | default    | description |
+ |-----------------|---------|-----------|------------|-----------------------------------------------------------------------------------------------------------------------------------------------|
+ | query           | string  | required  |            | query to execute |
+ | table_id        | string  | required  |            | destination table ID |
+ | dataset_id      | string  | required  |            | destination dataset ID |
+ | project_id      | string  | optional  |            | destination project ID |
+ | mode            | string  | optional  | "truncate" | specifies the action that occurs if the destination table already exists. [see](#mode) |
+ | flatten_results | bool    | optional  | true       | whether BigQuery automatically flattens nested data when you query it. [see](https://cloud.google.com/bigquery/docs/data#flatten) |
+ | use_legacy_sql  | bool    | optional  | true       | whether to use legacy SQL syntax for BigQuery |
+ | wait            | integer | optional  | 60         | wait time (seconds) for query execution |
+
+ #### Examples
+
+ ##### truncate mode (default)

  ```rb
  task :task1, type: :bigquery_query do
- param_set :query, "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
- param_set :dataset_id, 'test'
- param_set :table_id, "dest_table#{Time.now.to_i}"
+ query "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
+ table_id "dest_table#{Time.now.to_i}"
+ dataset_id "test"
+ end
+ ```
+
+ ##### append mode
+
+ If you set `mode` to `append`, the query result is appended to the existing table.
+
+ ```rb
+ task :task1, type: :bigquery_query do
+ query "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
+ table_id "dest_table#{Time.now.to_i}"
+ dataset_id "test"
+ mode "append"
  end
  ```

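The difference between `truncate` and `append` can be seen with a toy, in-memory stand-in for the destination table (plain Ruby for illustration only; this is not how the plugin talks to BigQuery):

```ruby
# Toy stand-in for a destination table: a plain Array of rows.
# Illustrates the three mode values; not BigQuery itself.
def write_rows(table, rows, mode: "truncate")
  case mode
  when "truncate" # overwrite existing table data
    table.replace(rows)
  when "append"   # keep existing rows, add new ones
    table.concat(rows)
  when "empty"    # fail if the table already contains data
    raise "duplicate" unless table.empty?
    table.concat(rows)
  else
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
  table
end
```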
@@ -66,16 +128,46 @@ end

  `Tumugi::Plugin::BigqueryCopyTask` is a task to copy the table specified by parameters.

- #### Usage
+ #### Parameters
+
+ | name            | type    | required? | default | description |
+ |-----------------|---------|-----------|---------|---------------------------------------------------------------------|
+ | src_table_id    | string  | required  |         | source table ID |
+ | src_dataset_id  | string  | required  |         | source dataset ID |
+ | src_project_id  | string  | optional  |         | source project ID |
+ | dest_table_id   | string  | required  |         | destination table ID |
+ | dest_dataset_id | string  | required  |         | destination dataset ID |
+ | dest_project_id | string  | optional  |         | destination project ID |
+ | force_copy      | bool    | optional  | false   | whether to force the copy when the destination table already exists |
+ | wait            | integer | optional  | 60      | wait time (seconds) for copy job execution |
+
+ #### Examples

  Copy `test.src_table` to `test.dest_table`.

+ ##### Normal use case
+
+ ```rb
+ task :task1, type: :bigquery_copy do
+ src_table_id "src_table"
+ src_dataset_id "test"
+ dest_table_id "dest_table"
+ dest_dataset_id "test"
+ end
+ ```
+
+ ##### force_copy
+
+ If `force_copy` is `true`, the copy operation always executes, even if the destination table exists.
+ This means the destination table's data is deleted, so be careful when enabling this parameter.
+
  ```rb
  task :task1, type: :bigquery_copy do
- param_set :src_dataset_id, 'test'
- param_set :src_table_id, 'src_table'
- param_set :dest_dataset_id, 'test'
- param_set :dest_table_id, 'dest_table'
+ src_table_id "src_table"
+ src_dataset_id "test"
+ dest_table_id "dest_table"
+ dest_dataset_id "test"
+ force_copy true
  end
  ```

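The `force_copy` behaviour comes down to a completeness check: a task whose output already exists is normally skipped, but with `force_copy` it only counts as completed after it has actually run. A simplified, standalone model of that decision (the real implementation is a `completed?` override in the 0.3.0 task source; the helper below is illustrative only):

```ruby
# Simplified model of the force_copy decision. A task normally counts
# as completed when its output already exists; with force_copy it only
# counts as completed once it has run in the current invocation.
def copy_completed?(output_exists:, force_copy:, finished:)
  if force_copy && !finished
    false # always re-run, even though the destination table exists
  else
    output_exists
  end
end
```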
@@ -83,25 +175,154 @@ end

  `Tumugi::Plugin::BigqueryLoadTask` is a task to load structured data from GCS into BigQuery.

- #### Usage
+ #### Parameters
+
+ | name | type | required? | default | description |
+ |-----------------------|-----------------|------------------------------------|---------------------|----------------------------------------------------------------------------------------------------------------------------------------------|
+ | bucket                | string          | required                           |                     | source GCS bucket name |
+ | key                   | string          | required                           |                     | source path of the file, like "/path/to/file.csv" |
+ | table_id              | string          | required                           |                     | destination table ID |
+ | dataset_id            | string          | required                           |                     | destination dataset ID |
+ | project_id            | string          | optional                           |                     | destination project ID |
+ | schema                | array of object | required when mode is not "append" |                     | see [schema format](#schema) |
+ | mode                  | string          | optional                           | "append"            | specifies the action that occurs if the destination table already exists. [see](#mode) |
+ | source_format         | string          | optional                           | "CSV"               | source file format. [see](#format) |
+ | ignore_unknown_values | bool            | optional                           | false               | whether BigQuery should allow extra values that are not represented in the table schema |
+ | max_bad_records       | integer         | optional                           | 0                   | maximum number of bad records that BigQuery can ignore when running the job |
+ | field_delimiter       | string          | optional                           | ","                 | separator for fields in a CSV file. used only when source_format is "CSV" |
+ | allow_jagged_rows     | bool            | optional                           | false               | accept rows that are missing trailing optional columns; the missing values are treated as null. used only when source_format is "CSV" |
+ | allow_quoted_newlines | bool            | optional                           | false               | whether BigQuery should allow quoted data sections that contain newline characters in a CSV file. used only when source_format is "CSV" |
+ | quote                 | string          | optional                           | "\"" (double-quote) | value used to quote data sections in a CSV file. used only when source_format is "CSV" |
+ | skip_leading_rows     | integer         | optional                           | 0                   | number of rows at the top of a CSV file that BigQuery will skip when loading the data. used only when source_format is "CSV" |
+ | wait                  | integer         | optional                           | 60                  | wait time (seconds) for load job execution |
+
+ #### Example

  Load `gs://test_bucket/load_data.csv` into `dest_project:dest_dataset.dest_table`.

  ```rb
  task :task1, type: :bigquery_load do
- param_set :bucket, 'test_bucket'
- param_set :key, 'load_data.csv'
- param_set :project_id, 'dest_project'
- param_set :datset_id, 'dest_dataset'
- param_set :table_id, 'dest_table'
+ bucket "test_bucket"
+ key "load_data.csv"
+ table_id "dest_table"
+ dataset_id "dest_dataset"
+ project_id "dest_project"
+ end
+ ```
+
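Several of the CSV-related parameters above (`field_delimiter`, `skip_leading_rows`) behave like the options of an ordinary CSV parser. A rough illustration using Ruby's standard `CSV` library (an analogy only, not the plugin's implementation):

```ruby
require "csv"

# Sample data using ";" as the field delimiter, with one header row.
data = "id;name\n1;alice\n2;bob\n"

# field_delimiter ";" corresponds to col_sep, and skip_leading_rows 1
# corresponds to dropping the first row before loading.
rows = CSV.parse(data, col_sep: ";").drop(1)
```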
+ ### Tumugi::Plugin::BigqueryExportTask
+
+ `Tumugi::Plugin::BigqueryExportTask` is a task to export a BigQuery table.
+
+ #### Parameters
+
+ | name               | type    | required? | default            | description |
+ |--------------------|---------|-----------|--------------------|-------------------------------------------------------------------------------------|
+ | project_id         | string  | optional  |                    | source project ID |
+ | job_project_id     | string  | optional  | same as project_id | job running project ID |
+ | dataset_id         | string  | required  |                    | source dataset ID |
+ | table_id           | string  | required  |                    | source table ID |
+ | compression        | string  | optional  | "NONE"             | destination file compression. "NONE": no compression, "GZIP": compression by gzip |
+ | destination_format | string  | optional  | "CSV"              | [destination file format](#format) |
+ | field_delimiter    | string  | optional  | ","                | separator for fields in a CSV file. used only when destination_format is "CSV" |
+ | print_header       | bool    | optional  | true               | print a header row in a CSV file. used only when destination_format is "CSV" |
+ | page_size          | integer | optional  | 10000              | number of rows to fetch in one request |
+ | wait               | integer | optional  | 60                 | wait time (seconds) for export execution |
+
+ #### Examples
+
+ ##### Export `src_dataset.src_table` to local file `data.csv`
+
+ ```rb
+ task :task1, type: :bigquery_export do
+ table_id "src_table"
+ dataset_id "src_dataset"
+
+ output target(:local_file, "data.csv")
+ end
+ ```
+
+ ##### Export `src_dataset.src_table` to Google Cloud Storage
+
+ You need [tumugi-plugin-google_cloud_storage](https://github.com/tumugi/tumugi-plugin-google_cloud_storage).
+
+ ```rb
+ task :task1, type: :bigquery_export do
+ table_id "src_table"
+ dataset_id "src_dataset"
+
+ output target(:google_cloud_storage_file, bucket: "bucket", key: "data.csv")
  end
  ```

- ### Config Section
+ ##### Export `src_dataset.src_table` to Google Drive
+
+ You need [tumugi-plugin-google_drive](https://github.com/tumugi/tumugi-plugin-google_drive).
+
+ ```rb
+ task :task1, type: :bigquery_export do
+ table_id "src_table"
+ dataset_id "src_dataset"
+
+ output target(:google_drive_file, name: "data.csv")
+ end
+ ```
+
+ ## Common parameter values
+
+ ### mode
+
+ | value    | description |
+ |----------|-------------|
+ | truncate | If the table already exists, BigQuery overwrites the table data. |
+ | append   | If the table already exists, BigQuery appends the data to the table. |
+ | empty    | If the table already exists and contains data, a 'duplicate' error is returned in the job result. |
+
+ ### format
+
+ | value                  | description |
+ |------------------------|--------------------------------------------|
+ | CSV                    | CSV |
+ | NEWLINE_DELIMITED_JSON | Each line is a JSON object followed by a newline |
+ | AVRO                   | [see](https://avro.apache.org/docs/1.2.0/) |
+
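The three `mode` values above correspond to BigQuery's standard write dispositions (`WRITE_TRUNCATE`, `WRITE_APPEND`, `WRITE_EMPTY`). A minimal sketch of that mapping (the helper name is illustrative, not part of the plugin's API):

```ruby
# Illustrative mapping from the plugin's mode values to BigQuery's
# writeDisposition settings; the helper itself is hypothetical.
MODE_TO_DISPOSITION = {
  "truncate" => "WRITE_TRUNCATE", # overwrite existing table data
  "append"   => "WRITE_APPEND",   # append to existing table data
  "empty"    => "WRITE_EMPTY"     # error if the table contains data
}.freeze

def write_disposition_for(mode)
  MODE_TO_DISPOSITION.fetch(mode.to_s) do
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
end
```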
+ ### schema
+
+ The `schema` parameter is an array of nested objects, like below:
+
+ ```js
+ [
+   {
+     "name": "column1",
+     "type": "string"
+   },
+   {
+     "name": "column2",
+     "type": "integer",
+     "mode": "repeated"
+   },
+   {
+     "name": "record1",
+     "type": "record",
+     "fields": [
+       {
+         "name": "key1",
+         "type": "integer"
+       },
+       {
+         "name": "key2",
+         "type": "integer"
+       }
+     ]
+   }
+ ]
+ ```
+
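A schema in this format can be checked mechanically before submitting a load job. A minimal sketch of such a validator (a hypothetical helper, not provided by the plugin; the accepted type list is an assumption):

```ruby
# Hypothetical validator for the schema format shown above.
VALID_TYPES = %w[string integer float boolean timestamp record].freeze

def valid_schema?(schema)
  return false unless schema.is_a?(Array) && !schema.empty?
  schema.all? do |field|
    next false unless field.is_a?(Hash)
    name = field["name"] || field[:name]
    type = (field["type"] || field[:type]).to_s.downcase
    next false unless name && VALID_TYPES.include?(type)
    # record fields nest their children under "fields"
    type != "record" || valid_schema?(field["fields"] || field[:fields])
  end
end
```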
+ ## Config Section

  tumugi-plugin-bigquery provides a config section named "bigquery", which can specify BigQuery authentication info.

- #### Authenticate by client_email and private_key
+ ### Authenticate by client_email and private_key

  ```rb
  Tumugi.configure do |config|
@@ -113,7 +334,7 @@ Tumugi.configure do |config|
  end
  ```

- #### Authenticate by JSON key file
+ ### Authenticate by JSON key file

  ```rb
  Tumugi.configure do |config|
data/examples/copy.rb CHANGED
@@ -1,21 +1,21 @@
  task :task1, type: :bigquery_copy do
- param_set :src_project_id, ->{ input.project_id }
- param_set :src_dataset_id, ->{ input.dataset_id }
- param_set :src_table_id, ->{ input.table_id }
- param_set :dest_dataset_id, "test"
- param_set :dest_table_id, ->{ "dest_table_#{Time.now.to_i}" }
+ src_project_id { input.project_id }
+ src_dataset_id { input.dataset_id }
+ src_table_id { input.table_id }
+ dest_dataset_id "test"
+ dest_table_id { "dest_table_#{Time.now.to_i}" }

  requires :task2
  end

  task :task2, type: :bigquery_query do
- param_set :query, "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
- param_set :dataset_id, "test" #->{ input.dataset_id }
- param_set :table_id, "dest_#{Time.now.to_i}"
+ query "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
+ dataset_id { input.dataset_id }
+ table_id "dest_#{Time.now.to_i}"

  requires :task3
  end

  task :task3, type: :bigquery_dataset do
- param_set :dataset_id, "test"
+ dataset_id "test"
  end
data/examples/dataset.rb CHANGED
@@ -6,5 +6,5 @@ task :task1 do
  end

  task :task2, type: :bigquery_dataset do
- param_set :dataset_id, 'test'
+ dataset_id "test"
  end
data/examples/export.rb ADDED
@@ -0,0 +1,13 @@
+ task :task1, type: :bigquery_export do
+ dataset_id { input.dataset_id }
+ table_id { input.table_id }
+
+ requires :task2
+ output target(:local_file, "tmp/export.csv")
+ end
+
+ task :task2, type: :bigquery_query do
+ query "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
+ dataset_id "test"
+ table_id "dest_#{Time.now.to_i}"
+ end
data/examples/force_copy.rb ADDED
@@ -0,0 +1,22 @@
+ task :task1, type: :bigquery_copy do
+ src_project_id { input.project_id }
+ src_dataset_id { input.dataset_id }
+ src_table_id { input.table_id }
+ dest_dataset_id "test"
+ dest_table_id "dest_table_1"
+ force_copy true
+
+ requires :task2
+ end
+
+ task :task2, type: :bigquery_query do
+ query "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
+ dataset_id { input.dataset_id }
+ table_id "dest_#{Time.now.to_i}"
+
+ requires :task3
+ end
+
+ task :task3, type: :bigquery_dataset do
+ dataset_id "test"
+ end
data/examples/load.rb CHANGED
@@ -1,11 +1,11 @@
  task :task1, type: :bigquery_load do
  requires :task2
- param_set :bucket, 'tumugi-plugin-bigquery'
- param_set :key, 'test.csv'
- param_set :dataset_id, -> { input.dataset_id }
- param_set :table_id, 'load_test'
- param_set :skip_leading_rows, 1
- param_set :schema, [
+ bucket 'tumugi-plugin-bigquery'
+ key 'test.csv'
+ dataset_id { input.dataset_id }
+ table_id 'load_test'
+ skip_leading_rows 1
+ schema [
  {
  name: 'row_number',
  type: 'INTEGER',
@@ -20,5 +20,5 @@ task :task1, type: :bigquery_load do
  end

  task :task2, type: :bigquery_dataset do
- param_set :dataset_id, 'test'
+ dataset_id "test"
  end
data/examples/query.rb CHANGED
@@ -6,7 +6,7 @@ task :task1 do
  end

  task :task2, type: :bigquery_query do
- param_set :query, "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
- param_set :dataset_id, 'test'
- param_set :table_id, "dest_#{Time.now.to_i}"
+ query "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
+ dataset_id "test"
+ table_id "dest_#{Time.now.to_i}"
  end
data/examples/query_append.rb ADDED
@@ -0,0 +1,13 @@
+ task :task1 do
+ requires :task2
+ run do
+ log input.table_name
+ end
+ end
+
+ task :task2, type: :bigquery_query do
+ query "SELECT COUNT(*) AS cnt FROM [bigquery-public-data:samples.wikipedia]"
+ dataset_id "test"
+ table_id "dest_append"
+ mode "append"
+ end
data/lib/tumugi/plugin/bigquery/client.rb CHANGED
@@ -178,6 +178,7 @@ module Tumugi
  flatten_results: true,
  priority: "INTERACTIVE",
  use_query_cache: true,
+ use_legacy_sql: true,
  user_defined_function_resources: nil,
  project_id: nil,
  job_project_id: nil,
@@ -191,6 +192,7 @@ module Tumugi
  flatten_results: flatten_results,
  priority: priority,
  use_query_cache: use_query_cache,
+ use_legacy_sql: use_legacy_sql,
  user_defined_function_resources: user_defined_function_resources,
  project_id: project_id || @project_id,
  job_project_id: job_project_id || @project_id,
data/lib/tumugi/plugin/bigquery/version.rb CHANGED
@@ -1,7 +1,7 @@
  module Tumugi
  module Plugin
  module Bigquery
- VERSION = "0.2.0"
+ VERSION = "0.3.0"
  end
  end
  end
@@ -12,16 +12,25 @@ module Tumugi
  param :dest_project_id, type: :string
  param :dest_dataset_id, type: :string, required: true
  param :dest_table_id, type: :string, required: true
- param :wait, type: :int, default: 60
+ param :force_copy, type: :bool, default: false
+ param :wait, type: :integer, default: 60

  def output
  return @output if @output
-
+
  opts = { dataset_id: dest_dataset_id, table_id: dest_table_id }
  opts[:project_id] = dest_project_id if dest_project_id
  @output = Tumugi::Plugin::BigqueryTableTarget.new(opts)
  end

+ def completed?
+ if force_copy && !finished?
+ false
+ else
+ super
+ end
+ end
+
  def run
  log "Source: bq://#{src_project_id}/#{src_dataset_id}/#{src_table_id}"
  log "Destination: #{output}"
@@ -10,19 +10,37 @@ module Tumugi
  param :project_id, type: :string
  param :dataset_id, type: :string, required: true
  param :table_id, type: :string, required: true
- param :wait, type: :int, default: 60
+ param :mode, type: :string, default: 'truncate' # append, empty
+ param :flatten_results, type: :bool, default: true
+ param :use_legacy_sql, type: :bool, default: true
+ param :wait, type: :integer, default: 60

  def output
  @output ||= Tumugi::Plugin::BigqueryTableTarget.new(project_id: project_id, dataset_id: dataset_id, table_id: table_id)
  end

+ def completed?
+ if mode.to_sym == :append && !finished?
+ false
+ else
+ super
+ end
+ end
+
  def run
  log "Launching Query"
  log "Query: #{query}"
  log "Query destination: #{output}"

  bq_client = output.client
- bq_client.query(query, project_id: project_id, dataset_id: output.dataset_id, table_id: output.table_id, wait: wait)
+ bq_client.query(query,
+ project_id: project_id,
+ dataset_id: output.dataset_id,
+ table_id: output.table_id,
+ mode: mode.to_sym,
+ flatten_results: flatten_results,
+ use_legacy_sql: use_legacy_sql,
+ wait: wait)
  end
  end
  end
data/tumugi-plugin-bigquery.gemspec CHANGED
@@ -20,14 +20,14 @@ Gem::Specification.new do |spec|
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

- spec.add_runtime_dependency "tumugi", ">= 0.5.1"
+ spec.add_runtime_dependency "tumugi", ">= 0.6.1"
  spec.add_runtime_dependency "kura", "~> 0.2.17"
+ spec.add_runtime_dependency "json", "~> 1.8.3" # json 2.0 does not work with JRuby + MultiJson

  spec.add_development_dependency 'bundler', '~> 1.11'
  spec.add_development_dependency 'rake', '~> 10.0'
  spec.add_development_dependency 'test-unit', '~> 3.1'
  spec.add_development_dependency 'test-unit-rr'
  spec.add_development_dependency 'coveralls'
- spec.add_development_dependency 'github_changelog_generator'
  spec.add_development_dependency 'tumugi-plugin-google_cloud_storage'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: tumugi-plugin-bigquery
  version: !ruby/object:Gem::Version
- version: 0.2.0
+ version: 0.3.0
  platform: ruby
  authors:
  - Kazuyuki Honda
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2016-06-06 00:00:00.000000000 Z
+ date: 2016-07-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: tumugi
@@ -16,14 +16,14 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.5.1
+ version: 0.6.1
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.5.1
+ version: 0.6.1
  - !ruby/object:Gem::Dependency
  name: kura
  requirement: !ruby/object:Gem::Requirement
@@ -38,6 +38,20 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 0.2.17
+ - !ruby/object:Gem::Dependency
+ name: json
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 1.8.3
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 1.8.3
  - !ruby/object:Gem::Dependency
  name: bundler
  requirement: !ruby/object:Gem::Requirement
@@ -108,20 +122,6 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
- - !ruby/object:Gem::Dependency
- name: github_changelog_generator
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
  - !ruby/object:Gem::Dependency
  name: tumugi-plugin-google_cloud_storage
  requirement: !ruby/object:Gem::Requirement
@@ -152,8 +152,11 @@ files:
  - bin/setup
  - examples/copy.rb
  - examples/dataset.rb
+ - examples/export.rb
+ - examples/force_copy.rb
  - examples/load.rb
  - examples/query.rb
+ - examples/query_append.rb
  - examples/test.csv
  - examples/tumugi_config_example.rb
  - lib/tumugi/plugin/bigquery/client.rb