embulk-output-bigquery 0.6.0 → 0.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (50)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +7 -3
  3. data/README.md +12 -7
  4. data/embulk-output-bigquery.gemspec +4 -2
  5. data/lib/embulk/output/bigquery.rb +3 -3
  6. metadata +2 -46
  7. data/example/config_append_direct_schema_update_options.yml +0 -31
  8. data/example/config_client_options.yml +0 -33
  9. data/example/config_csv.yml +0 -30
  10. data/example/config_delete_in_advance.yml +0 -29
  11. data/example/config_delete_in_advance_field_partitioned_table.yml +0 -33
  12. data/example/config_delete_in_advance_partitioned_table.yml +0 -33
  13. data/example/config_expose_errors.yml +0 -30
  14. data/example/config_gcs.yml +0 -32
  15. data/example/config_guess_from_embulk_schema.yml +0 -29
  16. data/example/config_guess_with_column_options.yml +0 -40
  17. data/example/config_gzip.yml +0 -1
  18. data/example/config_jsonl.yml +0 -1
  19. data/example/config_max_threads.yml +0 -34
  20. data/example/config_min_ouput_tasks.yml +0 -34
  21. data/example/config_mode_append.yml +0 -30
  22. data/example/config_mode_append_direct.yml +0 -30
  23. data/example/config_nested_record.yml +0 -1
  24. data/example/config_payload_column.yml +0 -20
  25. data/example/config_payload_column_index.yml +0 -20
  26. data/example/config_progress_log_interval.yml +0 -31
  27. data/example/config_replace.yml +0 -30
  28. data/example/config_replace_backup.yml +0 -32
  29. data/example/config_replace_backup_field_partitioned_table.yml +0 -34
  30. data/example/config_replace_backup_partitioned_table.yml +0 -34
  31. data/example/config_replace_field_partitioned_table.yml +0 -33
  32. data/example/config_replace_partitioned_table.yml +0 -33
  33. data/example/config_replace_schema_update_options.yml +0 -33
  34. data/example/config_skip_file_generation.yml +0 -32
  35. data/example/config_table_strftime.yml +0 -30
  36. data/example/config_template_table.yml +0 -21
  37. data/example/config_uncompressed.yml +0 -1
  38. data/example/config_with_rehearsal.yml +0 -33
  39. data/example/example.csv +0 -17
  40. data/example/example.yml +0 -1
  41. data/example/example2_1.csv +0 -1
  42. data/example/example2_2.csv +0 -1
  43. data/example/example4_1.csv +0 -1
  44. data/example/example4_2.csv +0 -1
  45. data/example/example4_3.csv +0 -1
  46. data/example/example4_4.csv +0 -1
  47. data/example/json_key.json +0 -12
  48. data/example/nested_example.jsonl +0 -16
  49. data/example/schema.json +0 -30
  50. data/example/schema_expose_errors.json +0 -30
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 3c0942035a81c9180260f8329ccaa5ba99de2185ea5f9ec5f1b3ffe87d5e8a73
-  data.tar.gz: c543a1b9f1278cf5d543a96bd3b8c465b2727b03df67d1e1726bef40135a1d42
+  metadata.gz: ddfd10c5e85614e1dae0333494333653f1af95b8158dfda8977f8b00d64b3478
+  data.tar.gz: 2cec70eaa49c828d7fe9347bc0d9699b9398f21db96880e997a66bdab23deb89
 SHA512:
-  metadata.gz: 23559e485346f2f8d65fa76aef2284c8b8c682257ee317b2088b30515e6e2a2936cc3c5b8ab5c3020ee9f9790e735bc48c41bb8e5d30fc777174d681796128c1
-  data.tar.gz: 336988c0afb153c0b9b7532bf9d85523bb9e9641eca7a79b6ab491be5567e2be9204c47417b28a8d42bbae2907cb6892b1ce5abb98261564c696b15478deb3ad
+  metadata.gz: 4782a28272da610f8399aca50cc4ddaefea00b8dbf45a37bec24771d7ecdb05bbdcd6de85ff167c5c3745f6689413c215689bb8d420960705cd6cb2026e99932
+  data.tar.gz: 9dbabb787e2f1b5797ccb2a2cd8786ce28d0e0d01310cd522ea4894337a279e809de10abca14b50b836553b6de95df4afd886596d75e7193d4de60a5c6f95781
data/CHANGELOG.md CHANGED
@@ -1,3 +1,7 @@
+## 0.6.1 - 2019-08-28
+
+* [maintenance] Release a new gem that does not include symlinks, so that it works on Windows.
+
 ## 0.6.0 - 2019-08-11
 
 Cleanup `auth_method`:
@@ -5,14 +9,14 @@ Cleanup `auth_method`:
 * [enhancement] Support `auth_method: authorized_user` (OAuth)
 * [incompatibility change] Rename `auth_method: json_key` to `auth_method: service_account` (`json_key` is kept for backward compatibility)
 * [incompatibility change] Remove deprecated `auth_method: private_key` (p12 key)
-* [incompatibility change] Change the default `auth_method` to `application_default` from `private_key`.
+* [incompatibility change] Change the default `auth_method` to `application_default` from `private_key` because `private_key` was dropped.
 
 ## 0.5.0 - 2019-08-10
 
 * [incompatibility change] Drop deprecated `time_partitioning`.`require_partition_filter`
 * [incompatibility change] Drop `prevent_duplicate_insert` which has no use-case now
-* [incompatibility change] Change default value of `auto_create_table` to `true` from `false`
-* Modes `replace`, `replace_backup`, `append`, `delete_in_advance`, that is, except `append_direct` requires `auto_create_table: true`.
+* [incompatibility change] Modes `replace`, `replace_backup`, `append`, and `delete_in_advance` now require `auto_create_table: true` because, previously, these modes created the target table even with `auto_create_table: false`, which confused users. Note that `auto_create_table: true` is required even for a partition (a table name with a partition decorator), which may not need table creation; this keeps the logic and implementation simple.
+* [incompatibility change] Change the default value of `auto_create_table` to `true` because the above four modes, that is, all modes except `append_direct`, now always require `auto_create_table: true`.
 
 ## 0.4.14 - 2019-08-10
 
data/README.md CHANGED
@@ -37,7 +37,7 @@ OAuth flow for installed applications.
 | location | string | optional | nil | geographic location of dataset. See [Location](#location) |
 | table | string | required | | table name, or table name with a partition decorator such as `table_name$20160929`|
 | auto_create_dataset | boolean | optional | false | automatically create dataset |
-| auto_create_table | boolean | optional | true | `false` is available only for `append_direct` mode. Other modes requires `true`. See [Dynamic Table Creating](#dynamic-table-creating) and [Time Partitioning](#time-partitioning) |
+| auto_create_table | boolean | optional | true | `false` is available only for `append_direct` mode. Other modes require `true`. See [Dynamic Table Creating](#dynamic-table-creating) and [Time Partitioning](#time-partitioning) |
 | schema_file | string | optional | | /path/to/schema.json |
 | template_table | string | optional | | template table name. See [Dynamic Table Creating](#dynamic-table-creating) |
 | job_status_max_polling_time | int | optional | 3600 sec | Max job status polling time |
@@ -213,7 +213,7 @@ You can also embed contents of `json_keyfile` at config.yml.
 ```yaml
 out:
   type: bigquery
-  auth_method: service_account
+  auth_method: authorized_user
   json_keyfile:
     content: |
       {
@@ -239,7 +239,12 @@ out:
 
 #### application\_default
 
-Use Application Default Credentials (ADC).
+Use Application Default Credentials (ADC). ADC is a strategy to locate Google Cloud service account credentials.
+
+1. ADC checks to see if the environment variable `GOOGLE_APPLICATION_CREDENTIALS` is set. If the variable is set, ADC uses the service account file that the variable points to.
+2. ADC checks to see if `~/.config/gcloud/application_default_credentials.json` exists. This file is created by running `gcloud auth application-default login`.
+3. Otherwise, ADC uses the default service account for credentials if the application is running on Compute Engine, App Engine, Kubernetes Engine, Cloud Functions, or Cloud Run.
+
 See https://cloud.google.com/docs/authentication/production for details.
 
 ```yaml
@@ -256,12 +261,12 @@ Table ids are formatted at runtime
 using the local time of the embulk server.
 
 For example, with the configuration below,
-data is inserted into tables `table_2015_04`, `table_2015_05` and so on.
+data is inserted into tables `table_20150503`, `table_20150504`, and so on.
 
 ```yaml
 out:
   type: bigquery
-  table: table_%Y_%m
+  table: table_%Y%m%d
 ```
 
 ### Dynamic table creating
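The strftime-style table naming changed in the hunk above can be sketched in plain Ruby; the fixed timestamp below is only for illustration (the plugin formats with the Embulk server's local time at runtime):

```ruby
# Expand a table name containing strftime placeholders, as the
# `table: table_%Y%m%d` setting does at runtime.
t = Time.new(2015, 5, 3)
table = t.strftime("table_%Y%m%d")
puts table  # table_20150503
```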
@@ -276,7 +281,7 @@ Please set file path of schema.json.
 out:
   type: bigquery
   auto_create_table: true
-  table: table_%Y_%m
+  table: table_%Y%m%d
   schema_file: /path/to/schema.json
 ```
 
@@ -288,7 +293,7 @@ Plugin will try to read schema from existing table and use it as schema template
 out:
   type: bigquery
   auto_create_table: true
-  table: table_%Y_%m
+  table: table_%Y%m%d
   template_table: existing_table_name
 ```
 
data/embulk-output-bigquery.gemspec CHANGED
@@ -1,6 +1,6 @@
 Gem::Specification.new do |spec|
   spec.name = "embulk-output-bigquery"
-  spec.version = "0.6.0"
+  spec.version = "0.6.1"
   spec.authors = ["Satoshi Akama", "Naotoshi Seo"]
   spec.summary = "Google BigQuery output plugin for Embulk"
   spec.description = "Embulk plugin that insert records to Google BigQuery."
@@ -8,7 +8,9 @@ Gem::Specification.new do |spec|
   spec.licenses = ["MIT"]
   spec.homepage = "https://github.com/embulk/embulk-output-bigquery"
 
-  spec.files = `git ls-files`.split("\n") + Dir["classpath/*.jar"]
+  # Exclude the example directory, which uses symlinks, from the generated gem.
+  # Symlinks do not work properly on Windows without administrator privileges.
+  spec.files = `git ls-files`.split("\n") + Dir["classpath/*.jar"] - Dir["example/*"]
   spec.test_files = spec.files.grep(%r{^(test|spec)/})
   spec.require_paths = ["lib"]
 
@@ -304,14 +304,14 @@ module Embulk
304
304
  bigquery.create_table_if_not_exists(task['table'])
305
305
  when 'replace'
306
306
  bigquery.create_table_if_not_exists(task['temp_table'])
307
- bigquery.create_table_if_not_exists(task['table'])
307
+ bigquery.create_table_if_not_exists(task['table']) # needs for when task['table'] is a partition
308
308
  when 'append'
309
309
  bigquery.create_table_if_not_exists(task['temp_table'])
310
- bigquery.create_table_if_not_exists(task['table'])
310
+ bigquery.create_table_if_not_exists(task['table']) # needs for when task['table'] is a partition
311
311
  when 'replace_backup'
312
312
  bigquery.create_table_if_not_exists(task['temp_table'])
313
313
  bigquery.create_table_if_not_exists(task['table'])
314
- bigquery.create_table_if_not_exists(task['table_old'], dataset: task['dataset_old'])
314
+ bigquery.create_table_if_not_exists(task['table_old'], dataset: task['dataset_old']) # needs for when a partition
315
315
  else # append_direct
316
316
  if task['auto_create_table']
317
317
  bigquery.create_table_if_not_exists(task['table'])
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: embulk-output-bigquery
 version: !ruby/object:Gem::Version
-  version: 0.6.0
+  version: 0.6.1
 platform: ruby
 authors:
 - Satoshi Akama
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2019-08-10 00:00:00.000000000 Z
+date: 2019-08-28 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement
@@ -83,50 +83,6 @@ files:
 - README.md
 - Rakefile
 - embulk-output-bigquery.gemspec
-- example/config_append_direct_schema_update_options.yml
-- example/config_client_options.yml
-- example/config_csv.yml
-- example/config_delete_in_advance.yml
-- example/config_delete_in_advance_field_partitioned_table.yml
-- example/config_delete_in_advance_partitioned_table.yml
-- example/config_expose_errors.yml
-- example/config_gcs.yml
-- example/config_guess_from_embulk_schema.yml
-- example/config_guess_with_column_options.yml
-- example/config_gzip.yml
-- example/config_jsonl.yml
-- example/config_max_threads.yml
-- example/config_min_ouput_tasks.yml
-- example/config_mode_append.yml
-- example/config_mode_append_direct.yml
-- example/config_nested_record.yml
-- example/config_payload_column.yml
-- example/config_payload_column_index.yml
-- example/config_progress_log_interval.yml
-- example/config_replace.yml
-- example/config_replace_backup.yml
-- example/config_replace_backup_field_partitioned_table.yml
-- example/config_replace_backup_partitioned_table.yml
-- example/config_replace_field_partitioned_table.yml
-- example/config_replace_partitioned_table.yml
-- example/config_replace_schema_update_options.yml
-- example/config_skip_file_generation.yml
-- example/config_table_strftime.yml
-- example/config_template_table.yml
-- example/config_uncompressed.yml
-- example/config_with_rehearsal.yml
-- example/example.csv
-- example/example.yml
-- example/example2_1.csv
-- example/example2_2.csv
-- example/example4_1.csv
-- example/example4_2.csv
-- example/example4_3.csv
-- example/example4_4.csv
-- example/json_key.json
-- example/nested_example.jsonl
-- example/schema.json
-- example/schema_expose_errors.json
 - lib/embulk/output/bigquery.rb
 - lib/embulk/output/bigquery/auth.rb
 - lib/embulk/output/bigquery/bigquery_client.rb
data/example/config_append_direct_schema_update_options.yml DELETED
@@ -1,31 +0,0 @@
-in:
-  type: file
-  path_prefix: example/example.csv
-  parser:
-    type: csv
-    charset: UTF-8
-    newline: CRLF
-    null_string: 'NULL'
-    skip_header_lines: 1
-    comment_line_marker: '#'
-    columns:
-      - {name: date, type: string}
-      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
-      - {name: "null", type: string}
-      - {name: long, type: long}
-      - {name: string, type: string}
-      - {name: double, type: double}
-      - {name: boolean, type: boolean}
-out:
-  type: bigquery
-  mode: append_direct
-  auth_method: service_account
-  json_keyfile: example/your-project-000.json
-  dataset: your_dataset_name
-  table: your_table_name
-  source_format: NEWLINE_DELIMITED_JSON
-  compression: NONE
-  auto_create_dataset: true
-  auto_create_table: true
-  schema_file: example/schema.json
-  schema_update_options: [ALLOW_FIELD_ADDITION, ALLOW_FIELD_RELAXATION]
data/example/config_client_options.yml DELETED
@@ -1,33 +0,0 @@
-in:
-  type: file
-  path_prefix: example/example.csv
-  parser:
-    type: csv
-    charset: UTF-8
-    newline: CRLF
-    null_string: 'NULL'
-    skip_header_lines: 1
-    comment_line_marker: '#'
-    columns:
-      - {name: date, type: string}
-      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
-      - {name: "null", type: string}
-      - {name: long, type: long}
-      - {name: string, type: string}
-      - {name: double, type: double}
-      - {name: boolean, type: boolean}
-out:
-  type: bigquery
-  mode: replace
-  auth_method: service_account
-  json_keyfile: example/your-project-000.json
-  dataset: your_dataset_name
-  table: your_table_name
-  source_format: NEWLINE_DELIMITED_JSON
-  auto_create_dataset: true
-  auto_create_table: true
-  schema_file: example/schema.json
-  timeout_sec: 400
-  open_timeout_sec: 400
-  retries: 2
-  application_name: "Embulk BigQuery plugin test"
data/example/config_csv.yml DELETED
@@ -1,30 +0,0 @@
-in:
-  type: file
-  path_prefix: example/example.csv
-  parser:
-    type: csv
-    charset: UTF-8
-    newline: CRLF
-    null_string: 'NULL'
-    skip_header_lines: 1
-    comment_line_marker: '#'
-    columns:
-      - {name: date, type: string}
-      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
-      - {name: "null", type: string}
-      - {name: long, type: long}
-      - {name: string, type: string}
-      - {name: double, type: double}
-      - {name: boolean, type: boolean}
-out:
-  type: bigquery
-  mode: replace
-  auth_method: service_account
-  json_keyfile: example/your-project-000.json
-  dataset: your_dataset_name
-  table: your_table_name
-  source_format: CSV
-  compression: GZIP
-  auto_create_dataset: true
-  auto_create_table: true
-  schema_file: example/schema.json
data/example/config_delete_in_advance.yml DELETED
@@ -1,29 +0,0 @@
-in:
-  type: file
-  path_prefix: example/example.csv
-  parser:
-    type: csv
-    charset: UTF-8
-    newline: CRLF
-    null_string: 'NULL'
-    skip_header_lines: 1
-    comment_line_marker: '#'
-    columns:
-      - {name: date, type: string}
-      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
-      - {name: "null", type: string}
-      - {name: long, type: long}
-      - {name: string, type: string}
-      - {name: double, type: double}
-      - {name: boolean, type: boolean}
-out:
-  type: bigquery
-  mode: delete_in_advance
-  auth_method: service_account
-  json_keyfile: example/your-project-000.json
-  dataset: your_dataset_name
-  table: your_table_name
-  source_format: NEWLINE_DELIMITED_JSON
-  auto_create_dataset: true
-  auto_create_table: true
-  schema_file: example/schema.json
data/example/config_delete_in_advance_field_partitioned_table.yml DELETED
@@ -1,33 +0,0 @@
-in:
-  type: file
-  path_prefix: example/example.csv
-  parser:
-    type: csv
-    charset: UTF-8
-    newline: CRLF
-    null_string: 'NULL'
-    skip_header_lines: 1
-    comment_line_marker: '#'
-    columns:
-      - {name: date, type: string}
-      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
-      - {name: "null", type: string}
-      - {name: long, type: long}
-      - {name: string, type: string}
-      - {name: double, type: double}
-      - {name: boolean, type: boolean}
-out:
-  type: bigquery
-  mode: delete_in_advance
-  auth_method: service_account
-  json_keyfile: example/your-project-000.json
-  dataset: your_dataset_name
-  table: your_field_partitioned_table_name
-  source_format: NEWLINE_DELIMITED_JSON
-  compression: NONE
-  auto_create_dataset: true
-  auto_create_table: true
-  schema_file: example/schema.json
-  time_partitioning:
-    type: 'DAY'
-    field: timestamp
data/example/config_delete_in_advance_partitioned_table.yml DELETED
@@ -1,33 +0,0 @@
-in:
-  type: file
-  path_prefix: example/example.csv
-  parser:
-    type: csv
-    charset: UTF-8
-    newline: CRLF
-    null_string: 'NULL'
-    skip_header_lines: 1
-    comment_line_marker: '#'
-    columns:
-      - {name: date, type: string}
-      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
-      - {name: "null", type: string}
-      - {name: long, type: long}
-      - {name: string, type: string}
-      - {name: double, type: double}
-      - {name: boolean, type: boolean}
-out:
-  type: bigquery
-  mode: delete_in_advance
-  auth_method: service_account
-  json_keyfile: example/your-project-000.json
-  dataset: your_dataset_name
-  table: your_partitioned_table_name$20160929
-  source_format: NEWLINE_DELIMITED_JSON
-  compression: NONE
-  auto_create_dataset: true
-  auto_create_table: true
-  schema_file: example/schema.json
-  time_partitioning:
-    type: 'DAY'
-    expiration_ms: 100
data/example/config_expose_errors.yml DELETED
@@ -1,30 +0,0 @@
-in:
-  type: file
-  path_prefix: example/example.csv
-  parser:
-    type: csv
-    charset: UTF-8
-    newline: CRLF
-    null_string: 'NULL'
-    skip_header_lines: 1
-    comment_line_marker: '#'
-    columns:
-      - {name: date, type: string}
-      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
-      - {name: "null", type: string}
-      - {name: long, type: long}
-      - {name: string, type: string}
-      - {name: double, type: double}
-      - {name: boolean, type: boolean}
-out:
-  type: bigquery
-  mode: replace
-  auth_method: service_account
-  json_keyfile: example/your-project-000.json
-  dataset: your_dataset_name
-  table: your_table_name
-  source_format: NEWLINE_DELIMITED_JSON
-  compression: NONE
-  auto_create_dataset: true
-  auto_create_table: true
-  schema_file: example/schema_expose_errors.json
data/example/config_gcs.yml DELETED
@@ -1,32 +0,0 @@
-in:
-  type: file
-  path_prefix: example/example.csv
-  parser:
-    type: csv
-    charset: UTF-8
-    newline: CRLF
-    null_string: 'NULL'
-    skip_header_lines: 1
-    comment_line_marker: '#'
-    columns:
-      - {name: date, type: string}
-      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
-      - {name: "null", type: string}
-      - {name: long, type: long}
-      - {name: string, type: string}
-      - {name: double, type: double}
-      - {name: boolean, type: boolean}
-out:
-  type: bigquery
-  mode: replace
-  auth_method: service_account
-  json_keyfile: example/your-project-000.json
-  dataset: your_dataset_name
-  table: your_table_name
-  source_format: NEWLINE_DELIMITED_JSON
-  compression: GZIP
-  auto_create_dataset: true
-  auto_create_table: true
-  schema_file: example/schema.json
-  gcs_bucket: your_bucket_name
-  auto_create_gcs_bucket: true