google-cloud-dataproc 0.3.0 → 0.3.1
- checksums.yaml +4 -4
- data/.yardopts +2 -0
- data/AUTHENTICATION.md +199 -0
- data/lib/google/cloud/dataproc/v1/cluster_controller_client.rb +15 -12
- data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/clusters.rb +46 -32
- data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/jobs.rb +21 -17
- data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/workflow_templates.rb +6 -6
- data/lib/google/cloud/dataproc/v1/doc/google/longrunning/operations.rb +1 -1
- data/lib/google/cloud/dataproc/v1/doc/google/protobuf/any.rb +2 -1
- data/lib/google/cloud/dataproc/v1/doc/google/protobuf/field_mask.rb +18 -26
- data/lib/google/cloud/dataproc/v1/doc/google/protobuf/timestamp.rb +15 -13
- data/lib/google/cloud/dataproc/v1/doc/google/rpc/status.rb +17 -14
- data/lib/google/cloud/dataproc/v1/job_controller_client.rb +4 -3
- data/lib/google/cloud/dataproc/v1/jobs_services_pb.rb +2 -1
- data/lib/google/cloud/dataproc/v1/workflow_template_service_client.rb +45 -23
- data/lib/google/cloud/dataproc/v1/workflow_templates_services_pb.rb +2 -1
- data/lib/google/cloud/dataproc/v1beta2/cluster_controller_client.rb +29 -20
- data/lib/google/cloud/dataproc/v1beta2/clusters_pb.rb +1 -1
- data/lib/google/cloud/dataproc/v1beta2/doc/google/cloud/dataproc/v1beta2/clusters.rb +58 -39
- data/lib/google/cloud/dataproc/v1beta2/doc/google/cloud/dataproc/v1beta2/jobs.rb +23 -18
- data/lib/google/cloud/dataproc/v1beta2/doc/google/cloud/dataproc/v1beta2/workflow_templates.rb +10 -9
- data/lib/google/cloud/dataproc/v1beta2/doc/google/longrunning/operations.rb +1 -1
- data/lib/google/cloud/dataproc/v1beta2/doc/google/protobuf/any.rb +2 -1
- data/lib/google/cloud/dataproc/v1beta2/doc/google/protobuf/field_mask.rb +18 -26
- data/lib/google/cloud/dataproc/v1beta2/doc/google/protobuf/timestamp.rb +15 -13
- data/lib/google/cloud/dataproc/v1beta2/doc/google/rpc/status.rb +17 -14
- data/lib/google/cloud/dataproc/v1beta2/job_controller_client.rb +6 -4
- data/lib/google/cloud/dataproc/v1beta2/jobs_services_pb.rb +2 -1
- data/lib/google/cloud/dataproc/v1beta2/workflow_template_service_client.rb +45 -23
- data/lib/google/cloud/dataproc/v1beta2/workflow_templates_services_pb.rb +2 -1
- metadata +6 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 433b67960fde33d581be68d4316747f4ee70009aabb1c6e89029410a627a947a
|
4
|
+
data.tar.gz: e10c766af7e68ea6e1ab8d5471c0153bcbb32487f8240ff5bbe7bbc34e4b1849
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 4688350b720a26778e6751f65408f08a76848504eb5c7c2f729841736cb8004c104b274fbd4e31d47b34ee073a771a38500416da149e558cccc03d3d42018b4a
|
7
|
+
data.tar.gz: 9edfe8721650d1dab30ab294f5174be7d37037198bc1b37de3e56d7b2114a317e38ae03c4d7a03cf716b5659f8ab45e88075616f8fe06a55fea14b2e22b478bd
|
data/.yardopts
CHANGED
data/AUTHENTICATION.md
ADDED
@@ -0,0 +1,199 @@
|
|
1
|
+
# Authentication
|
2
|
+
|
3
|
+
In general, the google-cloud-dataproc library uses [Service
|
4
|
+
Account](https://cloud.google.com/iam/docs/creating-managing-service-accounts)
|
5
|
+
credentials to connect to Google Cloud services. When running within [Google
|
6
|
+
Cloud Platform environments](#google-cloud-platform-environments)
|
7
|
+
the credentials will be discovered automatically. When running on other
|
8
|
+
environments, the Service Account credentials can be specified by providing the
|
9
|
+
path to the [JSON
|
10
|
+
keyfile](https://cloud.google.com/iam/docs/managing-service-account-keys) for
|
11
|
+
the account (or the JSON itself) in [environment
|
12
|
+
variables](#environment-variables). Additionally, Cloud SDK credentials can also
|
13
|
+
be discovered automatically, but this is only recommended during development.
|
14
|
+
|
15
|
+
## Quickstart
|
16
|
+
|
17
|
+
1. [Create a service account and credentials](#creating-a-service-account).
|
18
|
+
2. Set the [environment variable](#environment-variables).
|
19
|
+
|
20
|
+
```sh
|
21
|
+
export DATAPROC_CREDENTIALS=/path/to/json
|
22
|
+
```
|
23
|
+
|
24
|
+
3. Initialize the client.
|
25
|
+
|
26
|
+
```ruby
|
27
|
+
require "google/cloud/dataproc"
|
28
|
+
|
29
|
+
client = Google::Cloud::Dataproc.new
|
30
|
+
```
|
31
|
+
|
32
|
+
## Project and Credential Lookup
|
33
|
+
|
34
|
+
The google-cloud-dataproc library aims to make authentication
|
35
|
+
as simple as possible, and provides several mechanisms to configure your system
|
36
|
+
without providing **Project ID** and **Service Account Credentials** directly in
|
37
|
+
code.
|
38
|
+
|
39
|
+
**Project ID** is discovered in the following order:
|
40
|
+
|
41
|
+
1. Specify project ID in method arguments
|
42
|
+
2. Specify project ID in configuration
|
43
|
+
3. Discover project ID in environment variables
|
44
|
+
4. Discover GCE project ID
|
45
|
+
5. Discover project ID in credentials JSON
|
46
|
+
|
47
|
+
**Credentials** are discovered in the following order:
|
48
|
+
|
49
|
+
1. Specify credentials in method arguments
|
50
|
+
2. Specify credentials in configuration
|
51
|
+
3. Discover credentials path in environment variables
|
52
|
+
4. Discover credentials JSON in environment variables
|
53
|
+
5. Discover credentials file in the Cloud SDK's path
|
54
|
+
6. Discover GCE credentials
|
55
|
+
|
56
|
+
### Google Cloud Platform environments
|
57
|
+
|
58
|
+
While running on Google Cloud Platform environments such as Google Compute
|
59
|
+
Engine, Google App Engine and Google Kubernetes Engine, no extra work is needed.
|
60
|
+
The **Project ID** and **Credentials** are discovered automatically. Code
|
61
|
+
should be written as if already authenticated. Just be sure when you [set up the
|
62
|
+
GCE instance][gce-how-to], you add the correct scopes for the APIs you want to
|
63
|
+
access. For example:
|
64
|
+
|
65
|
+
* **All APIs**
|
66
|
+
* `https://www.googleapis.com/auth/cloud-platform`
|
67
|
+
* `https://www.googleapis.com/auth/cloud-platform.read-only`
|
68
|
+
* **BigQuery**
|
69
|
+
* `https://www.googleapis.com/auth/bigquery`
|
70
|
+
* `https://www.googleapis.com/auth/bigquery.insertdata`
|
71
|
+
* **Compute Engine**
|
72
|
+
* `https://www.googleapis.com/auth/compute`
|
73
|
+
* **Datastore**
|
74
|
+
* `https://www.googleapis.com/auth/datastore`
|
75
|
+
* `https://www.googleapis.com/auth/userinfo.email`
|
76
|
+
* **DNS**
|
77
|
+
* `https://www.googleapis.com/auth/ndev.clouddns.readwrite`
|
78
|
+
* **Pub/Sub**
|
79
|
+
* `https://www.googleapis.com/auth/pubsub`
|
80
|
+
* **Storage**
|
81
|
+
* `https://www.googleapis.com/auth/devstorage.full_control`
|
82
|
+
* `https://www.googleapis.com/auth/devstorage.read_only`
|
83
|
+
* `https://www.googleapis.com/auth/devstorage.read_write`
|
84
|
+
|
85
|
+
### Environment Variables
|
86
|
+
|
87
|
+
The **Project ID** and **Credentials JSON** can be placed in environment
|
88
|
+
variables instead of declaring them directly in code. Each service has its own
|
89
|
+
environment variable, allowing for different service accounts to be used for
|
90
|
+
different services. (See the READMEs for the individual service gems for
|
91
|
+
details.) The path to the **Credentials JSON** file can be stored in the
|
92
|
+
environment variable, or the **Credentials JSON** itself can be stored for
|
93
|
+
environments such as Docker containers where writing files is difficult or not
|
94
|
+
encouraged.
|
95
|
+
|
96
|
+
The environment variables that google-cloud-dataproc checks for project ID are:
|
97
|
+
|
98
|
+
1. `DATAPROC_PROJECT`
|
99
|
+
2. `GOOGLE_CLOUD_PROJECT`
|
100
|
+
|
101
|
+
The environment variables that google-cloud-dataproc checks for credentials are configured on {Google::Cloud::Dataproc::V1::Credentials}:
|
102
|
+
|
103
|
+
1. `DATAPROC_CREDENTIALS` - Path to JSON file, or JSON contents
|
104
|
+
2. `DATAPROC_KEYFILE` - Path to JSON file, or JSON contents
|
105
|
+
3. `GOOGLE_CLOUD_CREDENTIALS` - Path to JSON file, or JSON contents
|
106
|
+
4. `GOOGLE_CLOUD_KEYFILE` - Path to JSON file, or JSON contents
|
107
|
+
5. `GOOGLE_APPLICATION_CREDENTIALS` - Path to JSON file
|
108
|
+
|
109
|
+
```ruby
|
110
|
+
require "google/cloud/dataproc"
|
111
|
+
|
112
|
+
ENV["DATAPROC_PROJECT"] = "my-project-id"
|
113
|
+
ENV["DATAPROC_CREDENTIALS"] = "path/to/keyfile.json"
|
114
|
+
|
115
|
+
client = Google::Cloud::Dataproc.new
|
116
|
+
```
|
117
|
+
|
118
|
+
### Configuration
|
119
|
+
|
120
|
+
The **Project ID** and **Credentials JSON** can be configured instead of placing them in environment variables or providing them as arguments.
|
121
|
+
|
122
|
+
```ruby
|
123
|
+
require "google/cloud/dataproc"
|
124
|
+
|
125
|
+
Google::Cloud::Dataproc.configure do |config|
|
126
|
+
config.project_id = "my-project-id"
|
127
|
+
config.credentials = "path/to/keyfile.json"
|
128
|
+
end
|
129
|
+
|
130
|
+
client = Google::Cloud::Dataproc.new
|
131
|
+
```
|
132
|
+
|
133
|
+
### Cloud SDK
|
134
|
+
|
135
|
+
This option allows for an easy way to authenticate during development. If
|
136
|
+
credentials are not provided in code or in environment variables, then Cloud SDK
|
137
|
+
credentials are discovered.
|
138
|
+
|
139
|
+
To configure your system for this, simply:
|
140
|
+
|
141
|
+
1. [Download and install the Cloud SDK](https://cloud.google.com/sdk)
|
142
|
+
2. Authenticate using OAuth 2.0 `$ gcloud auth login`
|
143
|
+
3. Write code as if already authenticated.
|
144
|
+
|
145
|
+
**NOTE:** This is _not_ recommended for running in production. The Cloud SDK
|
146
|
+
*should* only be used during development.
|
147
|
+
|
148
|
+
[gce-how-to]: https://cloud.google.com/compute/docs/authentication#using
|
149
|
+
[dev-console]: https://console.cloud.google.com/project
|
150
|
+
|
151
|
+
[enable-apis]: https://raw.githubusercontent.com/GoogleCloudPlatform/gcloud-common/master/authentication/enable-apis.png
|
152
|
+
|
153
|
+
[create-new-service-account]: https://raw.githubusercontent.com/GoogleCloudPlatform/gcloud-common/master/authentication/create-new-service-account.png
|
154
|
+
[create-new-service-account-existing-keys]: https://raw.githubusercontent.com/GoogleCloudPlatform/gcloud-common/master/authentication/create-new-service-account-existing-keys.png
|
155
|
+
[reuse-service-account]: https://raw.githubusercontent.com/GoogleCloudPlatform/gcloud-common/master/authentication/reuse-service-account.png
|
156
|
+
|
157
|
+
## Creating a Service Account
|
158
|
+
|
159
|
+
Google Cloud requires a **Project ID** and **Service Account Credentials** to
|
160
|
+
connect to the APIs. You will use the **Project ID** and **JSON key file** to
|
161
|
+
connect to most services with google-cloud-dataproc.
|
162
|
+
|
163
|
+
If you are not running this client within [Google Cloud Platform
|
164
|
+
environments](#google-cloud-platform-environments), you need a Google
|
165
|
+
Developers service account.
|
166
|
+
|
167
|
+
1. Visit the [Google Developers Console][dev-console].
|
168
|
+
1. Create a new project or click on an existing project.
|
169
|
+
1. Activate the slide-out navigation tray and select **API Manager**. From
|
170
|
+
here, you will enable the APIs that your application requires.
|
171
|
+
|
172
|
+
![Enable the APIs that your application requires][enable-apis]
|
173
|
+
|
174
|
+
*Note: You may need to enable billing in order to use these services.*
|
175
|
+
|
176
|
+
1. Select **Credentials** from the side navigation.
|
177
|
+
|
178
|
+
You should see a screen like one of the following.
|
179
|
+
|
180
|
+
![Create a new service account][create-new-service-account]
|
181
|
+
|
182
|
+
![Create a new service account With Existing Keys][create-new-service-account-existing-keys]
|
183
|
+
|
184
|
+
Find the "Add credentials" drop down and select "Service account" to be
|
185
|
+
guided through downloading a new JSON key file.
|
186
|
+
|
187
|
+
If you want to re-use an existing service account, you can easily generate a
|
188
|
+
new key file. Just select the account you wish to re-use, and click "Generate
|
189
|
+
new JSON key":
|
190
|
+
|
191
|
+
![Re-use an existing service account][reuse-service-account]
|
192
|
+
|
193
|
+
The key file you download will be used by this library to authenticate API
|
194
|
+
requests and should be stored in a secure location.
|
195
|
+
|
196
|
+
## Troubleshooting
|
197
|
+
|
198
|
+
If you're having trouble authenticating you can ask for help by following the
|
199
|
+
{file:TROUBLESHOOTING.md Troubleshooting Guide}.
|
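The environment-variable lookup that AUTHENTICATION.md describes can be sketched in plain Ruby. This is only an illustration of the documented precedence, not the library's implementation: the helper names (`lookup_project_id`, `lookup_credentials`) are hypothetical, the environment is passed in as a hash so the sketch runs without the gem, and the separate "path" and "JSON" discovery steps are collapsed into a single pass for brevity.

```ruby
require "json"

# Variable names and their order come from AUTHENTICATION.md.
# Everything else here is a hypothetical illustration.
PROJECT_ENV_VARS = %w[DATAPROC_PROJECT GOOGLE_CLOUD_PROJECT].freeze
CREDENTIAL_ENV_VARS = %w[
  DATAPROC_CREDENTIALS
  DATAPROC_KEYFILE
  GOOGLE_CLOUD_CREDENTIALS
  GOOGLE_CLOUD_KEYFILE
  GOOGLE_APPLICATION_CREDENTIALS
].freeze
# Documented as accepting a file path only, never inline JSON.
PATH_ONLY_VARS = %w[GOOGLE_APPLICATION_CREDENTIALS].freeze

# Returns the first variable name in `names` that has a non-empty value.
def first_set(env, names)
  names.find { |name| !env[name].to_s.empty? }
end

def lookup_project_id(env)
  name = first_set(env, PROJECT_ENV_VARS)
  name && env[name]
end

# True when `text` parses as a JSON object (for example a key file's contents).
def json_object?(text)
  JSON.parse(text).is_a?(Hash)
rescue JSON::ParserError
  false
end

def lookup_credentials(env)
  name = first_set(env, CREDENTIAL_ENV_VARS)
  return nil unless name

  value = env[name]
  inline_json = !PATH_ONLY_VARS.include?(name) && json_object?(value)
  { source: name, kind: (inline_json ? :json : :path), value: value }
end

env = {
  "GOOGLE_CLOUD_PROJECT" => "fallback-project",
  "DATAPROC_PROJECT"     => "my-project-id",
  "DATAPROC_CREDENTIALS" => "path/to/keyfile.json"
}
puts lookup_project_id(env)
puts lookup_credentials(env).inspect
```

Passing the environment in as an argument keeps the sketch testable; in an application the same helpers could be called with `ENV.to_h`.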
data/lib/google/cloud/dataproc/v1/cluster_controller_client.rb
CHANGED
@@ -234,10 +234,11 @@ module Google
|
|
234
234
|
# can also be provided.
|
235
235
|
# @param request_id [String]
|
236
236
|
# Optional. A unique id used to identify the request. If the server
|
237
|
-
# receives two
|
238
|
-
#
|
239
|
-
#
|
240
|
-
#
|
237
|
+
# receives two
|
238
|
+
# {Google::Cloud::Dataproc::V1::CreateClusterRequest CreateClusterRequest}
|
239
|
+
# requests with the same id, then the second request will be ignored and the
|
240
|
+
# first {Google::Longrunning::Operation} created
|
241
|
+
# and stored in the backend is returned.
|
241
242
|
#
|
242
243
|
# It is recommended to always set this value to a
|
243
244
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
@@ -390,10 +391,11 @@ module Google
|
|
390
391
|
# can also be provided.
|
391
392
|
# @param request_id [String]
|
392
393
|
# Optional. A unique id used to identify the request. If the server
|
393
|
-
# receives two
|
394
|
-
#
|
395
|
-
#
|
396
|
-
#
|
394
|
+
# receives two
|
395
|
+
# {Google::Cloud::Dataproc::V1::UpdateClusterRequest UpdateClusterRequest}
|
396
|
+
# requests with the same id, then the second request will be ignored and the
|
397
|
+
# first {Google::Longrunning::Operation} created
|
398
|
+
# and stored in the backend is returned.
|
397
399
|
#
|
398
400
|
# It is recommended to always set this value to a
|
399
401
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
@@ -496,10 +498,11 @@ module Google
|
|
496
498
|
# (with error NOT_FOUND) if cluster with specified UUID does not exist.
|
497
499
|
# @param request_id [String]
|
498
500
|
# Optional. A unique id used to identify the request. If the server
|
499
|
-
# receives two
|
500
|
-
#
|
501
|
-
#
|
502
|
-
#
|
501
|
+
# receives two
|
502
|
+
# {Google::Cloud::Dataproc::V1::DeleteClusterRequest DeleteClusterRequest}
|
503
|
+
# requests with the same id, then the second request will be ignored and the
|
504
|
+
# first {Google::Longrunning::Operation} created
|
505
|
+
# and stored in the backend is returned.
|
503
506
|
#
|
504
507
|
# It is recommended to always set this value to a
|
505
508
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
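The `request_id` comments restored in the hunks above describe idempotent retries: a second request carrying the same id is ignored and the first operation is returned. The sketch below illustrates that contract with a UUID from Ruby's standard library. `FakeBackend` is a hypothetical in-memory stand-in for the server-side behaviour, not part of google-cloud-dataproc, and the operation names it produces are invented for the example.

```ruby
require "securerandom"

# Hypothetical stand-in for the server: it remembers the first operation
# created for each request id and returns it again on a repeat.
class FakeBackend
  def initialize
    @operations = {}
  end

  def create_cluster(cluster_name, request_id:)
    @operations[request_id] ||= {
      name: "operations/#{@operations.size + 1}",
      cluster: cluster_name
    }
  end
end

backend = FakeBackend.new

# Generate the id once, then reuse it for every retry of the same request.
request_id = SecureRandom.uuid

first_attempt = backend.create_cluster("my-cluster", request_id: request_id)
retried       = backend.create_cluster("my-cluster", request_id: request_id)

puts first_attempt.equal?(retried) # the retry maps to the original operation
```

The important detail is that the UUID is generated outside the retry loop; generating a fresh id per attempt would defeat the deduplication the comment describes.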
data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/clusters.rb
CHANGED
@@ -36,8 +36,9 @@ module Google
|
|
36
36
|
# Label **keys** must contain 1 to 63 characters, and must conform to
|
37
37
|
# [RFC 1035](https://www.ietf.org/rfc/rfc1035.txt).
|
38
38
|
# Label **values** may be empty, but, if present, must contain 1 to 63
|
39
|
-
# characters, and must conform to [RFC
|
40
|
-
# No more than 32 labels can be
|
39
|
+
# characters, and must conform to [RFC
|
40
|
+
# 1035](https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be
|
41
|
+
# associated with a cluster.
|
41
42
|
# @!attribute [rw] status
|
42
43
|
# @return [Google::Cloud::Dataproc::V1::ClusterStatus]
|
43
44
|
# Output only. Cluster status.
|
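The label constraints stated in the hunk above (keys of 1 to 63 characters, values empty or up to 63 characters, both conforming to RFC 1035, at most 32 labels per cluster) can be checked client-side before a request is sent. The following is a minimal sketch under assumptions: `label_errors` is a hypothetical helper, and the regular expression encodes the RFC 1035 `<label>` grammar (a letter first, then letters, digits, or hyphens, ending in a letter or digit). The service's own validation may be stricter or looser than this.

```ruby
# Hypothetical client-side check, not part of google-cloud-dataproc.
RFC1035_LABEL = /\A[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?\z/
MAX_LABELS = 32
MAX_LABEL_LENGTH = 63

# Returns an array of human-readable problems; empty means the labels pass.
def label_errors(labels)
  errors = []
  if labels.size > MAX_LABELS
    errors << "too many labels (#{labels.size} > #{MAX_LABELS})"
  end

  labels.each do |key, value|
    key = key.to_s
    value = value.to_s

    unless key.length.between?(1, MAX_LABEL_LENGTH) && key.match?(RFC1035_LABEL)
      errors << "invalid key: #{key.inspect}"
    end

    # Values may be empty; only non-empty values are validated.
    next if value.empty?

    unless value.length <= MAX_LABEL_LENGTH && value.match?(RFC1035_LABEL)
      errors << "invalid value for #{key.inspect}: #{value.inspect}"
    end
  end

  errors
end

puts label_errors({ "env" => "prod", "team" => "" }).inspect
puts label_errors({ "1bad" => "ok" }).inspect
```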
@@ -52,8 +53,8 @@ module Google
|
|
52
53
|
# @return [Google::Cloud::Dataproc::V1::ClusterMetrics]
|
53
54
|
# Contains cluster daemon metrics such as HDFS and YARN stats.
|
54
55
|
#
|
55
|
-
# **Beta Feature**: This report is available for testing purposes only. It
|
56
|
-
# be changed before final release.
|
56
|
+
# **Beta Feature**: This report is available for testing purposes only. It
|
57
|
+
# may be changed before final release.
|
57
58
|
class Cluster; end
|
58
59
|
|
59
60
|
# The cluster config.
|
@@ -89,9 +90,11 @@ module Google
|
|
89
90
|
# Optional. Commands to execute on each node after config is
|
90
91
|
# completed. By default, executables are run on master and all worker nodes.
|
91
92
|
# You can test a node's `role` metadata to run an executable on
|
92
|
-
# a master or worker node, as shown below using `curl` (you can also use
|
93
|
+
# a master or worker node, as shown below using `curl` (you can also use
|
94
|
+
# `wget`):
|
93
95
|
#
|
94
|
-
# ROLE=$(curl -H Metadata-Flavor:Google
|
96
|
+
# ROLE=$(curl -H Metadata-Flavor:Google
|
97
|
+
# http://metadata/computeMetadata/v1/instance/attributes/dataproc-role)
|
95
98
|
# if [[ "${ROLE}" == 'Master' ]]; then
|
96
99
|
# ... master specific actions ...
|
97
100
|
# else
|
@@ -150,11 +153,11 @@ module Google
|
|
150
153
|
# @!attribute [rw] internal_ip_only
|
151
154
|
# @return [true, false]
|
152
155
|
# Optional. If true, all instances in the cluster will only have internal IP
|
153
|
-
# addresses. By default, clusters are not restricted to internal IP
|
154
|
-
# and will have ephemeral external IP addresses assigned to each
|
155
|
-
# This `internal_ip_only` restriction can only be enabled for
|
156
|
-
# enabled networks, and all off-cluster dependencies must be
|
157
|
-
# accessible without external IP addresses.
|
156
|
+
# addresses. By default, clusters are not restricted to internal IP
|
157
|
+
# addresses, and will have ephemeral external IP addresses assigned to each
|
158
|
+
# instance. This `internal_ip_only` restriction can only be enabled for
|
159
|
+
# subnetwork enabled networks, and all off-cluster dependencies must be
|
160
|
+
# configured to be accessible without external IP addresses.
|
158
161
|
# @!attribute [rw] service_account
|
159
162
|
# @return [String]
|
160
163
|
# Optional. The service account of the instances. Defaults to the default
|
@@ -164,7 +167,8 @@ module Google
|
|
164
167
|
# * roles/logging.logWriter
|
165
168
|
# * roles/storage.objectAdmin
|
166
169
|
#
|
167
|
-
# (see
|
170
|
+
# (see
|
171
|
+
# https://cloud.google.com/compute/docs/access/service-accounts#custom_service_accounts
|
168
172
|
# for more information).
|
169
173
|
# Example: `[account_id]@[project_id].iam.gserviceaccount.com`
|
170
174
|
# @!attribute [rw] service_account_scopes
|
@@ -190,7 +194,8 @@ module Google
|
|
190
194
|
# @!attribute [rw] metadata
|
191
195
|
# @return [Hash{String => String}]
|
192
196
|
# The Compute Engine metadata entries to add to all instances (see
|
193
|
-
# [Project and instance
|
197
|
+
# [Project and instance
|
198
|
+
# metadata](https://cloud.google.com/compute/docs/storing-retrieving-metadata#project_and_instance_metadata)).
|
194
199
|
class GceClusterConfig; end
|
195
200
|
|
196
201
|
# Optional. The config settings for Compute Engine resources in
|
@@ -219,7 +224,8 @@ module Google
|
|
219
224
|
# * `n1-standard-2`
|
220
225
|
#
|
221
226
|
# **Auto Zone Exception**: If you are using the Cloud Dataproc
|
222
|
-
# [Auto Zone
|
227
|
+
# [Auto Zone
|
228
|
+
# Placement](/dataproc/docs/concepts/configuring-clusters/auto-zone#using_auto_zone_placement)
|
223
229
|
# feature, you must use the short name of the machine type
|
224
230
|
# resource, for example, `n1-standard-2`.
|
225
231
|
# @!attribute [rw] disk_config
|
@@ -227,7 +233,8 @@ module Google
|
|
227
233
|
# Optional. Disk option config settings.
|
228
234
|
# @!attribute [rw] is_preemptible
|
229
235
|
# @return [true, false]
|
230
|
-
# Optional. Specifies that this instance group contains preemptible
|
236
|
+
# Optional. Specifies that this instance group contains preemptible
|
237
|
+
# instances.
|
231
238
|
# @!attribute [rw] managed_group_config
|
232
239
|
# @return [Google::Cloud::Dataproc::V1::ManagedGroupConfig]
|
233
240
|
# Output only. The config for Compute Engine Instance Group
|
@@ -258,7 +265,8 @@ module Google
|
|
258
265
|
# @return [String]
|
259
266
|
# Full URL, partial URI, or short name of the accelerator type resource to
|
260
267
|
# expose to this instance. See
|
261
|
-
# [Compute Engine
|
268
|
+
# [Compute Engine
|
269
|
+
# AcceleratorTypes](/compute/docs/reference/beta/acceleratorTypes).
|
262
270
|
#
|
263
271
|
# Examples:
|
264
272
|
#
|
@@ -267,7 +275,8 @@ module Google
|
|
267
275
|
# * `nvidia-tesla-k80`
|
268
276
|
#
|
269
277
|
# **Auto Zone Exception**: If you are using the Cloud Dataproc
|
270
|
-
# [Auto Zone
|
278
|
+
# [Auto Zone
|
279
|
+
# Placement](/dataproc/docs/concepts/configuring-clusters/auto-zone#using_auto_zone_placement)
|
271
280
|
# feature, you must use the short name of the accelerator type
|
272
281
|
# resource, for example, `nvidia-tesla-k80`.
|
273
282
|
# @!attribute [rw] accelerator_count
|
@@ -366,10 +375,12 @@ module Google
|
|
366
375
|
# Specifies the selection and config of software inside the cluster.
|
367
376
|
# @!attribute [rw] image_version
|
368
377
|
# @return [String]
|
369
|
-
# Optional. The version of software inside the cluster. It must be one of the
|
370
|
-
# [Cloud Dataproc
|
378
|
+
# Optional. The version of software inside the cluster. It must be one of the
|
379
|
+
# supported [Cloud Dataproc
|
380
|
+
# Versions](/dataproc/docs/concepts/versioning/dataproc-versions#supported_cloud_dataproc_versions),
|
371
381
|
# such as "1.2" (including a subminor version, such as "1.2.29"), or the
|
372
|
-
# ["preview"
|
382
|
+
# ["preview"
|
383
|
+
# version](/dataproc/docs/concepts/versioning/dataproc-versions#other_versions).
|
373
384
|
# If unspecified, it defaults to the latest version.
|
374
385
|
# @!attribute [rw] properties
|
375
386
|
# @return [Hash{String => String}]
|
@@ -419,10 +430,11 @@ module Google
|
|
419
430
|
# @!attribute [rw] request_id
|
420
431
|
# @return [String]
|
421
432
|
# Optional. A unique id used to identify the request. If the server
|
422
|
-
# receives two
|
423
|
-
#
|
424
|
-
#
|
425
|
-
#
|
433
|
+
# receives two
|
434
|
+
# {Google::Cloud::Dataproc::V1::CreateClusterRequest CreateClusterRequest}
|
435
|
+
# requests with the same id, then the second request will be ignored and the
|
436
|
+
# first {Google::Longrunning::Operation} created
|
437
|
+
# and stored in the backend is returned.
|
426
438
|
#
|
427
439
|
# It is recommended to always set this value to a
|
428
440
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
@@ -507,10 +519,11 @@ module Google
|
|
507
519
|
# @!attribute [rw] request_id
|
508
520
|
# @return [String]
|
509
521
|
# Optional. A unique id used to identify the request. If the server
|
510
|
-
# receives two
|
511
|
-
#
|
512
|
-
#
|
513
|
-
#
|
522
|
+
# receives two
|
523
|
+
# {Google::Cloud::Dataproc::V1::UpdateClusterRequest UpdateClusterRequest}
|
524
|
+
# requests with the same id, then the second request will be ignored and the
|
525
|
+
# first {Google::Longrunning::Operation} created
|
526
|
+
# and stored in the backend is returned.
|
514
527
|
#
|
515
528
|
# It is recommended to always set this value to a
|
516
529
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
@@ -537,10 +550,11 @@ module Google
|
|
537
550
|
# @!attribute [rw] request_id
|
538
551
|
# @return [String]
|
539
552
|
# Optional. A unique id used to identify the request. If the server
|
540
|
-
# receives two
|
541
|
-
#
|
542
|
-
#
|
543
|
-
#
|
553
|
+
# receives two
|
554
|
+
# {Google::Cloud::Dataproc::V1::DeleteClusterRequest DeleteClusterRequest}
|
555
|
+
# requests with the same id, then the second request will be ignored and the
|
556
|
+
# first {Google::Longrunning::Operation} created
|
557
|
+
# and stored in the backend is returned.
|
544
558
|
#
|
545
559
|
# It is recommended to always set this value to a
|
546
560
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/jobs.rb
CHANGED
@@ -59,8 +59,10 @@ module Google
|
|
59
59
|
end
|
60
60
|
|
61
61
|
# A Cloud Dataproc job for running
|
62
|
-
# [Apache Hadoop
|
63
|
-
#
|
62
|
+
# [Apache Hadoop
|
63
|
+
# MapReduce](https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html)
|
64
|
+
# jobs on [Apache Hadoop
|
65
|
+
# YARN](https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/YARN.html).
|
64
66
|
# @!attribute [rw] main_jar_file_uri
|
65
67
|
# @return [String]
|
66
68
|
# The HCFS URI of the jar file containing the main class.
|
@@ -75,8 +77,8 @@ module Google
|
|
75
77
|
# @!attribute [rw] args
|
76
78
|
# @return [Array<String>]
|
77
79
|
# Optional. The arguments to pass to the driver. Do not
|
78
|
-
# include arguments, such as `-libjars` or `-Dfoo=bar`, that can be set as
|
79
|
-
# properties, since a collision may occur that causes an incorrect job
|
80
|
+
# include arguments, such as `-libjars` or `-Dfoo=bar`, that can be set as
|
81
|
+
# job properties, since a collision may occur that causes an incorrect job
|
80
82
|
# submission.
|
81
83
|
# @!attribute [rw] jar_file_uris
|
82
84
|
# @return [Array<String>]
|
@@ -142,7 +144,8 @@ module Google
|
|
142
144
|
class SparkJob; end
|
143
145
|
|
144
146
|
# A Cloud Dataproc job for running
|
145
|
-
# [Apache
|
147
|
+
# [Apache
|
148
|
+
# PySpark](https://spark.apache.org/docs/0.9.0/python-programming-guide.html)
|
146
149
|
# applications on YARN.
|
147
150
|
# @!attribute [rw] main_python_file_uri
|
148
151
|
# @return [String]
|
@@ -210,8 +213,8 @@ module Google
|
|
210
213
|
# @!attribute [rw] continue_on_failure
|
211
214
|
# @return [true, false]
|
212
215
|
# Optional. Whether to continue executing queries if a query fails.
|
213
|
-
# The default value is `false`. Setting to `true` can be useful when
|
214
|
-
# independent parallel queries.
|
216
|
+
# The default value is `false`. Setting to `true` can be useful when
|
217
|
+
# executing independent parallel queries.
|
215
218
|
# @!attribute [rw] script_variables
|
216
219
|
# @return [Hash{String => String}]
|
217
220
|
# Optional. Mapping of query variable names to values (equivalent to the
|
@@ -229,8 +232,8 @@ module Google
|
|
229
232
|
# and UDFs.
|
230
233
|
class HiveJob; end
|
231
234
|
|
232
|
-
# A Cloud Dataproc job for running [Apache Spark
|
233
|
-
# queries.
|
235
|
+
# A Cloud Dataproc job for running [Apache Spark
|
236
|
+
# SQL](http://spark.apache.org/sql/) queries.
|
234
237
|
# @!attribute [rw] query_file_uri
|
235
238
|
# @return [String]
|
236
239
|
# The HCFS URI of the script that contains SQL queries.
|
@@ -265,8 +268,8 @@ module Google
|
|
265
268
|
# @!attribute [rw] continue_on_failure
|
266
269
|
# @return [true, false]
|
267
270
|
# Optional. Whether to continue executing queries if a query fails.
|
268
|
-
# The default value is `false`. Setting to `true` can be useful when
|
269
|
-
# independent parallel queries.
|
271
|
+
# The default value is `false`. Setting to `true` can be useful when
|
272
|
+
# executing independent parallel queries.
|
270
273
|
# @!attribute [rw] script_variables
|
271
274
|
# @return [Hash{String => String}]
|
272
275
|
# Optional. Mapping of query variable names to values (equivalent to the Pig
|
@@ -484,8 +487,8 @@ module Google
|
|
484
487
|
# @return [Array<Google::Cloud::Dataproc::V1::YarnApplication>]
|
485
488
|
# Output only. The collection of YARN applications spun up by this job.
|
486
489
|
#
|
487
|
-
# **Beta** Feature: This report is available for testing purposes only. It
|
488
|
-
# be changed before final release.
|
490
|
+
# **Beta** Feature: This report is available for testing purposes only. It
|
491
|
+
# may be changed before final release.
|
489
492
|
# @!attribute [rw] driver_output_resource_uri
|
490
493
|
# @return [String]
|
491
494
|
# Output only. A URI pointing to the location of the stdout of the job's
|
@@ -501,8 +504,9 @@ module Google
|
|
501
504
|
# Label **keys** must contain 1 to 63 characters, and must conform to
|
502
505
|
# [RFC 1035](https://www.ietf.org/rfc/rfc1035.txt).
|
503
506
|
# Label **values** may be empty, but, if present, must contain 1 to 63
|
504
|
-
# characters, and must conform to [RFC
|
505
|
-
# No more than 32 labels can be
|
507
|
+
# characters, and must conform to [RFC
|
508
|
+
# 1035](https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be
|
509
|
+
# associated with a job.
|
506
510
|
# @!attribute [rw] scheduling
|
507
511
|
# @return [Google::Cloud::Dataproc::V1::JobScheduling]
|
508
512
|
# Optional. Job scheduling configuration.
|
@@ -540,8 +544,8 @@ module Google
|
|
540
544
|
# @!attribute [rw] request_id
|
541
545
|
# @return [String]
|
542
546
|
# Optional. A unique id used to identify the request. If the server
|
543
|
-
# receives two {Google::Cloud::Dataproc::V1::SubmitJobRequest SubmitJobRequest}
|
544
|
-
# id, then the second request will be ignored and the
|
547
|
+
# receives two {Google::Cloud::Dataproc::V1::SubmitJobRequest SubmitJobRequest}
|
548
|
+
# requests with the same id, then the second request will be ignored and the
|
545
549
|
# first {Google::Cloud::Dataproc::V1::Job Job} created and stored in the backend
|
546
550
|
# is returned.
|
547
551
|
#
|