google-cloud-bigquery 1.21.2
- checksums.yaml +7 -0
- data/.yardopts +16 -0
- data/AUTHENTICATION.md +158 -0
- data/CHANGELOG.md +397 -0
- data/CODE_OF_CONDUCT.md +40 -0
- data/CONTRIBUTING.md +188 -0
- data/LICENSE +201 -0
- data/LOGGING.md +27 -0
- data/OVERVIEW.md +463 -0
- data/TROUBLESHOOTING.md +31 -0
- data/lib/google-cloud-bigquery.rb +139 -0
- data/lib/google/cloud/bigquery.rb +145 -0
- data/lib/google/cloud/bigquery/argument.rb +197 -0
- data/lib/google/cloud/bigquery/convert.rb +383 -0
- data/lib/google/cloud/bigquery/copy_job.rb +316 -0
- data/lib/google/cloud/bigquery/credentials.rb +50 -0
- data/lib/google/cloud/bigquery/data.rb +526 -0
- data/lib/google/cloud/bigquery/dataset.rb +2845 -0
- data/lib/google/cloud/bigquery/dataset/access.rb +1021 -0
- data/lib/google/cloud/bigquery/dataset/list.rb +162 -0
- data/lib/google/cloud/bigquery/encryption_configuration.rb +123 -0
- data/lib/google/cloud/bigquery/external.rb +2432 -0
- data/lib/google/cloud/bigquery/extract_job.rb +368 -0
- data/lib/google/cloud/bigquery/insert_response.rb +180 -0
- data/lib/google/cloud/bigquery/job.rb +657 -0
- data/lib/google/cloud/bigquery/job/list.rb +162 -0
- data/lib/google/cloud/bigquery/load_job.rb +1704 -0
- data/lib/google/cloud/bigquery/model.rb +740 -0
- data/lib/google/cloud/bigquery/model/list.rb +164 -0
- data/lib/google/cloud/bigquery/project.rb +1655 -0
- data/lib/google/cloud/bigquery/project/list.rb +161 -0
- data/lib/google/cloud/bigquery/query_job.rb +1695 -0
- data/lib/google/cloud/bigquery/routine.rb +1108 -0
- data/lib/google/cloud/bigquery/routine/list.rb +165 -0
- data/lib/google/cloud/bigquery/schema.rb +564 -0
- data/lib/google/cloud/bigquery/schema/field.rb +668 -0
- data/lib/google/cloud/bigquery/service.rb +589 -0
- data/lib/google/cloud/bigquery/standard_sql.rb +495 -0
- data/lib/google/cloud/bigquery/table.rb +3340 -0
- data/lib/google/cloud/bigquery/table/async_inserter.rb +520 -0
- data/lib/google/cloud/bigquery/table/list.rb +172 -0
- data/lib/google/cloud/bigquery/time.rb +65 -0
- data/lib/google/cloud/bigquery/version.rb +22 -0
- metadata +297 -0
data/OVERVIEW.md
ADDED
@@ -0,0 +1,463 @@
# Google Cloud BigQuery

Google BigQuery enables super-fast, SQL-like queries against massive datasets,
using the processing power of Google's infrastructure. To learn more, read [What
is BigQuery?](https://cloud.google.com/bigquery/what-is-bigquery).

The goal of google-cloud is to provide an API that is comfortable to Rubyists.
Your authentication credentials are detected automatically in Google Cloud
Platform (GCP), including Google Compute Engine (GCE), Google Kubernetes Engine
(GKE), Google App Engine (GAE), Google Cloud Functions (GCF) and Cloud Run. In
other environments you can configure authentication easily, either directly in
your code or via environment variables. Read more about the options for
connecting in the {file:AUTHENTICATION.md Authentication Guide}.
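
For example, outside GCP you can pass the project ID and credentials directly
to the constructor. A minimal sketch, assuming a hypothetical project ID and
service account keyfile path:

```ruby
require "google/cloud/bigquery"

# "my-project" and the keyfile path below are placeholders; use your own.
bigquery = Google::Cloud::Bigquery.new project_id:  "my-project",
                                       credentials: "/path/to/keyfile.json"
```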

To help you get started quickly, the first few examples below use a public
dataset provided by Google. As soon as you have [signed
up](https://cloud.google.com/bigquery/sign-up) to use BigQuery, and provided
that you stay in the free tier for queries, you should be able to run these
first examples without the need to set up billing or to load data (although
we'll show you how to do that too).

## Listing Datasets and Tables

A BigQuery project contains datasets, which in turn contain tables. Assuming
that you have not yet created datasets or tables in your own project, let's
connect to Google's `bigquery-public-data` project, and see what we find.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new project: "bigquery-public-data"

bigquery.datasets.count #=> 1
bigquery.datasets.first.dataset_id #=> "samples"

dataset = bigquery.datasets.first
tables = dataset.tables

tables.count #=> 7
tables.map &:table_id #=> [..., "shakespeare", "trigrams", "wikipedia"]
```

In addition to listing all datasets and tables in the project, you can also
retrieve individual datasets and tables by ID. Let's look at the structure of
the `shakespeare` table, which contains an entry for every word in every play
written by Shakespeare.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new project: "bigquery-public-data"

dataset = bigquery.dataset "samples"
table = dataset.table "shakespeare"

table.headers #=> [:word, :word_count, :corpus, :corpus_date]
table.rows_count #=> 164656
```

Now that you know the column names for the Shakespeare table, let's write and
run a few queries against it.

## Running queries

BigQuery supports two SQL dialects: [standard
SQL](https://cloud.google.com/bigquery/docs/reference/standard-sql/) and the
older [legacy SQL (BigQuery
SQL)](https://cloud.google.com/bigquery/docs/reference/legacy-sql), as discussed
in the guide [Migrating from legacy
SQL](https://cloud.google.com/bigquery/docs/reference/standard-sql/migrating-from-legacy-sql).

### Standard SQL

Standard SQL is the preferred SQL dialect for querying data stored in BigQuery.
It is compliant with the SQL 2011 standard, and has extensions that support
querying nested and repeated data. This is the default syntax. It has several
advantages over legacy SQL, including:

* Composability using `WITH` clauses and SQL functions
* Subqueries in the `SELECT` list and `WHERE` clause
* Correlated subqueries
* `ARRAY` and `STRUCT` data types
* Inserts, updates, and deletes
* `COUNT(DISTINCT <expr>)` is exact and scalable, providing the accuracy of
  `EXACT_COUNT_DISTINCT` without its limitations
* Automatic predicate push-down through `JOIN`s
* Complex `JOIN` predicates, including arbitrary expressions

For examples that demonstrate some of these features, see [Standard SQL
Highlights](https://cloud.google.com/bigquery/docs/reference/standard-sql/migrating-from-legacy-sql#standard_sql_highlights).

As shown in this example, standard SQL is the library default:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word, SUM(word_count) AS word_count " \
      "FROM `bigquery-public-data.samples.shakespeare` " \
      "WHERE word IN ('me', 'I', 'you') GROUP BY word"
data = bigquery.query sql
```

Notice that in standard SQL, a fully-qualified table name uses the following
format: <code>`my-dashed-project.dataset1.tableName`</code>.

### Legacy SQL (formerly BigQuery SQL)

Before version 2.0, BigQuery executed queries using a non-standard SQL dialect
known as BigQuery SQL. This variant is optional, and can be enabled by passing
the option `legacy_sql: true` with your query. (If you get an SQL syntax error
with a query that may be written in legacy SQL, be sure that you are passing
this option.) For example:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT TOP(word, 50) as word, COUNT(*) as count " \
      "FROM [bigquery-public-data:samples.shakespeare]"
data = bigquery.query sql, legacy_sql: true
```

Notice that in legacy SQL, a fully-qualified table name uses brackets instead of
back-ticks, and a colon instead of a dot to separate the project and the
dataset: `[my-dashed-project:dataset1.tableName]`.

#### Query parameters

With standard SQL, you can use positional or named query parameters. This
example shows the use of named parameters:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT word, SUM(word_count) AS word_count " \
      "FROM `bigquery-public-data.samples.shakespeare` " \
      "WHERE word IN UNNEST(@words) GROUP BY word"
data = bigquery.query sql, params: { words: ['me', 'I', 'you'] }
```

As demonstrated above, passing the `params` option will automatically set
`standard_sql` to `true`.
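
Positional parameters work the same way: use `?` placeholders in the SQL and
pass the values as an `Array`. A minimal sketch:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

# Each ? is bound, in order, to an element of the params array.
sql = "SELECT word, SUM(word_count) AS word_count " \
      "FROM `bigquery-public-data.samples.shakespeare` " \
      "WHERE word = ? GROUP BY word"
data = bigquery.query sql, params: ["me"]
```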

#### Data types

BigQuery standard SQL supports simple data types such as integers, as well as
more complex types such as `ARRAY` and `STRUCT`.

The BigQuery data types are converted to and from Ruby types as follows:

| BigQuery    | Ruby                                 | Notes                                             |
|-------------|--------------------------------------|---------------------------------------------------|
| `BOOL`      | `true`/`false`                       |                                                   |
| `INT64`     | `Integer`                            |                                                   |
| `FLOAT64`   | `Float`                              |                                                   |
| `NUMERIC`   | `BigDecimal`                         | Will be rounded to 9 decimal places.              |
| `STRING`    | `String`                             |                                                   |
| `DATETIME`  | `DateTime`                           | `DATETIME` does not support time zone.            |
| `DATE`      | `Date`                               |                                                   |
| `TIMESTAMP` | `Time`                               |                                                   |
| `TIME`      | `Google::Cloud::BigQuery::Time`      |                                                   |
| `BYTES`     | `File`, `IO`, `StringIO`, or similar |                                                   |
| `ARRAY`     | `Array`                              | Nested arrays and `nil` values are not supported. |
| `STRUCT`    | `Hash`                               | Hash keys may be strings or symbols.              |

See [Data
Types](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types)
for an overview of each BigQuery data type, including allowed values.
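
These conversions apply to query parameters as well. A minimal sketch, assuming
a table-free query, showing a `BigDecimal` sent as `NUMERIC` and a `Time` sent
as `TIMESTAMP` (the parameter names `price` and `ts` are arbitrary):

```ruby
require "bigdecimal"
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

# Ruby types are converted per the table above before the query runs.
data = bigquery.query "SELECT @price AS price, @ts AS ts",
                      params: { price: BigDecimal("19.99"), ts: Time.now }
data.first[:price] #=> BigDecimal (NUMERIC)
data.first[:ts]    #=> Time (TIMESTAMP)
```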

### Running Queries

Let's start with the simplest way to run a query. Notice that this time you are
connecting using your own default project. It is necessary to have write access
to the project for running a query, since queries need to create tables to hold
results.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT APPROX_TOP_COUNT(corpus, 10) as title, " \
      "COUNT(*) as unique_words " \
      "FROM `bigquery-public-data.samples.shakespeare`"
data = bigquery.query sql

data.next? #=> false
data.first #=> {:title=>[{:value=>"hamlet", :count=>5318}, ...}
```

The `APPROX_TOP_COUNT` function shown above is just one of a variety of
functions offered by BigQuery. See the [Query Reference (standard
SQL)](https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators)
for a full listing.

### Query Jobs

It is usually best not to block for most BigQuery operations, including querying
as well as importing, exporting, and copying data. Therefore, the BigQuery API
provides facilities for managing longer-running jobs. With this approach, an
instance of {Google::Cloud::Bigquery::QueryJob} is returned, rather than an
instance of {Google::Cloud::Bigquery::Data}.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

sql = "SELECT APPROX_TOP_COUNT(corpus, 10) as title, " \
      "COUNT(*) as unique_words " \
      "FROM `bigquery-public-data.samples.shakespeare`"
job = bigquery.query_job sql

job.wait_until_done!
unless job.failed?
  job.data.first
  #=> {:title=>[{:value=>"hamlet", :count=>5318}, ...}
end
```

Once you have determined that the job is done and has not failed, you can obtain
an instance of {Google::Cloud::Bigquery::Data} by calling `data` on the job
instance. The query results for both of the above examples are stored in
temporary tables with a lifetime of about 24 hours. See the final example below
for a demonstration of how to store query results in a permanent table.
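
When a result set spans more than one page, {Google::Cloud::Bigquery::Data} can
fetch the remaining pages for you. A minimal sketch using `Data#all`, which
lazily requests additional pages as the iteration reaches them:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

data = bigquery.query "SELECT word FROM `bigquery-public-data.samples.shakespeare`"

# Iterates every row, issuing additional RPCs for later pages as needed.
data.all do |row|
  puts row[:word]
end
```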

## Creating Datasets and Tables

The first thing you need to do in a new BigQuery project is to create a
{Google::Cloud::Bigquery::Dataset}. Datasets hold tables and control access to
them.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.create_dataset "my_dataset"
```
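
Access rules can be managed through {Google::Cloud::Bigquery::Dataset::Access}.
A minimal sketch, assuming a hypothetical user email address:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

# Grant read access to a (placeholder) user; the changes are saved when the
# block returns.
dataset.access do |access|
  access.add_reader_user "reader@example.com"
end
```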

Now that you have a dataset, you can use it to create a table. Every table is
defined by a schema that may contain nested and repeated fields. The example
below shows a schema with a repeated record field named `cities_lived`. (For
more information about nested and repeated fields, see [Preparing Data for
Loading](https://cloud.google.com/bigquery/preparing-data-for-loading).)

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

table = dataset.create_table "people" do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end
```

Because of the repeated field in this schema, we cannot use the CSV format to
load data into the table.

## Loading records

To follow along with these examples, you will need to set up billing on the
[Google Developers Console](https://console.developers.google.com).

In addition to CSV, data can be imported from files that are formatted as
[Newline-delimited JSON](http://jsonlines.org/),
[Avro](http://avro.apache.org/),
[ORC](https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-orc),
[Parquet](https://parquet.apache.org/) or from a Google Cloud Datastore backup.
It can also be "streamed" into BigQuery.
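
For example, the `people` table created above cannot be loaded from CSV because
of its repeated field, but newline-delimited JSON handles nested and repeated
data naturally. A minimal sketch, assuming a hypothetical `people.json` file
whose records match the schema:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "people"

# people.json is a placeholder path to newline-delimited JSON records.
file = File.open "people.json"
table.load file, format: "json"
```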

### Streaming records

For situations in which you want new data to be available for querying as soon
as possible, inserting individual records directly from your Ruby application is
a great approach.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "people"

rows = [
  {
    "first_name" => "Anna",
    "cities_lived" => [
      {
        "place" => "Stockholm",
        "number_of_years" => 2
      }
    ]
  },
  {
    "first_name" => "Bob",
    "cities_lived" => [
      {
        "place" => "Seattle",
        "number_of_years" => 5
      },
      {
        "place" => "Austin",
        "number_of_years" => 6
      }
    ]
  }
]
table.insert rows
```

To avoid making RPCs (network requests) to retrieve the dataset and table
resources when streaming records, pass the `skip_lookup` option. This creates
local objects without verifying that the resources exist on the BigQuery
service.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset", skip_lookup: true
table = dataset.table "people", skip_lookup: true

rows = [
  {
    "first_name" => "Anna",
    "cities_lived" => [
      {
        "place" => "Stockholm",
        "number_of_years" => 2
      }
    ]
  },
  {
    "first_name" => "Bob",
    "cities_lived" => [
      {
        "place" => "Seattle",
        "number_of_years" => 5
      },
      {
        "place" => "Austin",
        "number_of_years" => 6
      }
    ]
  }
]
table.insert rows
```

There are some trade-offs involved with streaming, so be sure to read the
discussion of data consistency in [Streaming Data Into
BigQuery](https://cloud.google.com/bigquery/streaming-data-into-bigquery).
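
If you are streaming many rows, {Google::Cloud::Bigquery::Table#insert_async}
batches them in a background thread and flushes when size or interval
thresholds are reached. A minimal sketch, assuming the `people` table from
above:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "people"

# The block receives the result of each background insert batch.
inserter = table.insert_async do |result|
  if result.error?
    puts result.error
  else
    puts "inserted #{result.insert_count} rows"
  end
end

inserter.insert [{ "first_name" => "Carla", "cities_lived" => [] }]

# Flush any pending rows and shut down the background thread.
inserter.stop.wait!
```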

### Uploading a file

To follow along with this example, please download the
[names.zip](http://www.ssa.gov/OACT/babynames/names.zip) archive from the U.S.
Social Security Administration. Inside the archive you will find over 100 files
containing baby name records since the year 1880.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "baby_names" do |schema|
  schema.string "name", mode: :required
  schema.string "gender", mode: :required
  schema.integer "count", mode: :required
end

file = File.open "names/yob2014.txt"
table.load file, format: "csv"
```

Because the names data, although formatted as CSV, is distributed in files with
a `.txt` extension, this example explicitly passes the `format` option in order
to demonstrate how to handle such situations. Because CSV is the default format
for load operations, the option is not actually necessary. For JSON saved with a
`.txt` extension, however, it would be.
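
`Table#load` also accepts a Google Cloud Storage URI, so you can load files
that already live in a bucket without downloading them first. A minimal sketch,
assuming a hypothetical bucket and object:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "baby_names"

# gs://my-bucket/names/yob2014.txt is a placeholder URI.
table.load "gs://my-bucket/names/yob2014.txt", format: "csv"
```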

## Exporting query results to Google Cloud Storage

The example below shows how to pass the `table` option with a query in order to
store results in a permanent table. It also shows how to export the result data
to a Google Cloud Storage file. In order to follow along, you will need to
enable the Google Cloud Storage API in addition to setting up billing.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
source_table = dataset.table "baby_names"
result_table = dataset.create_table "baby_names_results"

sql = "SELECT name, count " \
      "FROM baby_names " \
      "WHERE gender = 'M' " \
      "ORDER BY count ASC LIMIT 5"
query_job = dataset.query_job sql, table: result_table

query_job.wait_until_done!

unless query_job.failed?
  require "google/cloud/storage"
  require "securerandom"

  storage = Google::Cloud::Storage.new
  bucket_id = "bigquery-exports-#{SecureRandom.uuid}"
  bucket = storage.create_bucket bucket_id
  extract_url = "gs://#{bucket.id}/baby-names.csv"

  result_table.extract extract_url

  # Download to local filesystem
  bucket.files.first.download "baby-names.csv"
end
```

If a table you wish to export contains a large amount of data, you can pass a
wildcard URI to export to multiple files (for sharding), or an array of URIs
(for partitioning), or both. See [Exporting
Data](https://cloud.google.com/bigquery/docs/exporting-data) for details.
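
A minimal sketch of a sharded export, assuming the result table from the
previous example and a hypothetical bucket; BigQuery replaces the `*` with a
file number as it writes each shard:

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "baby_names_results"

# my-export-bucket is a placeholder; the wildcard yields
# baby-names-000000000000.csv, baby-names-000000000001.csv, and so on.
table.extract "gs://my-export-bucket/baby-names-*.csv"
```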

## Configuring retries and timeout

You can configure how many times API requests may be automatically retried. When
an API request fails, the response will be inspected to see if the request meets
criteria indicating that it may succeed on retry, such as `500` and `503` status
codes or a specific internal error code such as `rateLimitExceeded`. If it meets
the criteria, the request will be retried after a delay. If another error
occurs, the delay will be increased before a subsequent attempt, until the
`retries` limit is reached.

You can also set the request `timeout` value in seconds.

```ruby
require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new retries: 10, timeout: 120
```

See the [BigQuery error
table](https://cloud.google.com/bigquery/troubleshooting-errors#errortable) for
a list of error conditions.

## Additional information

Google BigQuery can be configured to use logging. To learn more, see the
{file:LOGGING.md Logging guide}.