google-cloud-dataproc 0.3.0 → 0.3.1
- checksums.yaml +4 -4
- data/.yardopts +2 -0
- data/AUTHENTICATION.md +199 -0
- data/lib/google/cloud/dataproc/v1/cluster_controller_client.rb +15 -12
- data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/clusters.rb +46 -32
- data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/jobs.rb +21 -17
- data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/workflow_templates.rb +6 -6
- data/lib/google/cloud/dataproc/v1/doc/google/longrunning/operations.rb +1 -1
- data/lib/google/cloud/dataproc/v1/doc/google/protobuf/any.rb +2 -1
- data/lib/google/cloud/dataproc/v1/doc/google/protobuf/field_mask.rb +18 -26
- data/lib/google/cloud/dataproc/v1/doc/google/protobuf/timestamp.rb +15 -13
- data/lib/google/cloud/dataproc/v1/doc/google/rpc/status.rb +17 -14
- data/lib/google/cloud/dataproc/v1/job_controller_client.rb +4 -3
- data/lib/google/cloud/dataproc/v1/jobs_services_pb.rb +2 -1
- data/lib/google/cloud/dataproc/v1/workflow_template_service_client.rb +45 -23
- data/lib/google/cloud/dataproc/v1/workflow_templates_services_pb.rb +2 -1
- data/lib/google/cloud/dataproc/v1beta2/cluster_controller_client.rb +29 -20
- data/lib/google/cloud/dataproc/v1beta2/clusters_pb.rb +1 -1
- data/lib/google/cloud/dataproc/v1beta2/doc/google/cloud/dataproc/v1beta2/clusters.rb +58 -39
- data/lib/google/cloud/dataproc/v1beta2/doc/google/cloud/dataproc/v1beta2/jobs.rb +23 -18
- data/lib/google/cloud/dataproc/v1beta2/doc/google/cloud/dataproc/v1beta2/workflow_templates.rb +10 -9
- data/lib/google/cloud/dataproc/v1beta2/doc/google/longrunning/operations.rb +1 -1
- data/lib/google/cloud/dataproc/v1beta2/doc/google/protobuf/any.rb +2 -1
- data/lib/google/cloud/dataproc/v1beta2/doc/google/protobuf/field_mask.rb +18 -26
- data/lib/google/cloud/dataproc/v1beta2/doc/google/protobuf/timestamp.rb +15 -13
- data/lib/google/cloud/dataproc/v1beta2/doc/google/rpc/status.rb +17 -14
- data/lib/google/cloud/dataproc/v1beta2/job_controller_client.rb +6 -4
- data/lib/google/cloud/dataproc/v1beta2/jobs_services_pb.rb +2 -1
- data/lib/google/cloud/dataproc/v1beta2/workflow_template_service_client.rb +45 -23
- data/lib/google/cloud/dataproc/v1beta2/workflow_templates_services_pb.rb +2 -1
- metadata +6 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 433b67960fde33d581be68d4316747f4ee70009aabb1c6e89029410a627a947a
|
4
|
+
data.tar.gz: e10c766af7e68ea6e1ab8d5471c0153bcbb32487f8240ff5bbe7bbc34e4b1849
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 4688350b720a26778e6751f65408f08a76848504eb5c7c2f729841736cb8004c104b274fbd4e31d47b34ee073a771a38500416da149e558cccc03d3d42018b4a
|
7
|
+
data.tar.gz: 9edfe8721650d1dab30ab294f5174be7d37037198bc1b37de3e56d7b2114a317e38ae03c4d7a03cf716b5659f8ab45e88075616f8fe06a55fea14b2e22b478bd
|
data/.yardopts
CHANGED
data/AUTHENTICATION.md
ADDED
@@ -0,0 +1,199 @@
|
|
1
|
+
# Authentication
|
2
|
+
|
3
|
+
In general, the google-cloud-dataproc library uses [Service
|
4
|
+
Account](https://cloud.google.com/iam/docs/creating-managing-service-accounts)
|
5
|
+
credentials to connect to Google Cloud services. When running within [Google
|
6
|
+
Cloud Platform environments](#google-cloud-platform-environments)
|
7
|
+
the credentials will be discovered automatically. When running on other
|
8
|
+
environments, the Service Account credentials can be specified by providing the
|
9
|
+
path to the [JSON
|
10
|
+
keyfile](https://cloud.google.com/iam/docs/managing-service-account-keys) for
|
11
|
+
the account (or the JSON itself) in [environment
|
12
|
+
variables](#environment-variables). Additionally, Cloud SDK credentials can
|
13
|
+
be discovered automatically, but this is only recommended during development.
|
14
|
+
|
15
|
+
## Quickstart
|
16
|
+
|
17
|
+
1. [Create a service account and credentials](#creating-a-service-account).
|
18
|
+
2. Set the [environment variable](#environment-variables).
|
19
|
+
|
20
|
+
```sh
|
21
|
+
export DATAPROC_CREDENTIALS=/path/to/json
|
22
|
+
```
|
23
|
+
|
24
|
+
3. Initialize the client.
|
25
|
+
|
26
|
+
```ruby
|
27
|
+
require "google/cloud/dataproc"
|
28
|
+
|
29
|
+
client = Google::Cloud::Dataproc.new
|
30
|
+
```
|
31
|
+
|
32
|
+
## Project and Credential Lookup
|
33
|
+
|
34
|
+
The google-cloud-dataproc library aims to make authentication
|
35
|
+
as simple as possible, and provides several mechanisms to configure your system
|
36
|
+
without providing **Project ID** and **Service Account Credentials** directly in
|
37
|
+
code.
|
38
|
+
|
39
|
+
**Project ID** is discovered in the following order:
|
40
|
+
|
41
|
+
1. Specify project ID in method arguments
|
42
|
+
2. Specify project ID in configuration
|
43
|
+
3. Discover project ID in environment variables
|
44
|
+
4. Discover GCE project ID
|
45
|
+
5. Discover project ID in credentials JSON
|
46
|
+
|
47
|
+
**Credentials** are discovered in the following order:
|
48
|
+
|
49
|
+
1. Specify credentials in method arguments
|
50
|
+
2. Specify credentials in configuration
|
51
|
+
3. Discover credentials path in environment variables
|
52
|
+
4. Discover credentials JSON in environment variables
|
53
|
+
5. Discover credentials file in the Cloud SDK's path
|
54
|
+
6. Discover GCE credentials
|
55
|
+
|
56
|
+
### Google Cloud Platform environments
|
57
|
+
|
58
|
+
While running on Google Cloud Platform environments such as Google Compute
|
59
|
+
Engine, Google App Engine and Google Kubernetes Engine, no extra work is needed.
|
60
|
+
The **Project ID** and **Credentials** are discovered automatically. Code
|
61
|
+
should be written as if already authenticated. Just be sure when you [set up the
|
62
|
+
GCE instance][gce-how-to], you add the correct scopes for the APIs you want to
|
63
|
+
access. For example:
|
64
|
+
|
65
|
+
* **All APIs**
|
66
|
+
* `https://www.googleapis.com/auth/cloud-platform`
|
67
|
+
* `https://www.googleapis.com/auth/cloud-platform.read-only`
|
68
|
+
* **BigQuery**
|
69
|
+
* `https://www.googleapis.com/auth/bigquery`
|
70
|
+
* `https://www.googleapis.com/auth/bigquery.insertdata`
|
71
|
+
* **Compute Engine**
|
72
|
+
* `https://www.googleapis.com/auth/compute`
|
73
|
+
* **Datastore**
|
74
|
+
* `https://www.googleapis.com/auth/datastore`
|
75
|
+
* `https://www.googleapis.com/auth/userinfo.email`
|
76
|
+
* **DNS**
|
77
|
+
* `https://www.googleapis.com/auth/ndev.clouddns.readwrite`
|
78
|
+
* **Pub/Sub**
|
79
|
+
* `https://www.googleapis.com/auth/pubsub`
|
80
|
+
* **Storage**
|
81
|
+
* `https://www.googleapis.com/auth/devstorage.full_control`
|
82
|
+
* `https://www.googleapis.com/auth/devstorage.read_only`
|
83
|
+
* `https://www.googleapis.com/auth/devstorage.read_write`
|
84
|
+
|
85
|
+
### Environment Variables
|
86
|
+
|
87
|
+
The **Project ID** and **Credentials JSON** can be placed in environment
|
88
|
+
variables instead of declaring them directly in code. Each service has its own
|
89
|
+
environment variable, allowing for different service accounts to be used for
|
90
|
+
different services. (See the READMEs for the individual service gems for
|
91
|
+
details.) The path to the **Credentials JSON** file can be stored in the
|
92
|
+
environment variable, or the **Credentials JSON** itself can be stored for
|
93
|
+
environments such as Docker containers where writing files is difficult or not
|
94
|
+
encouraged.
|
95
|
+
|
96
|
+
The environment variables that google-cloud-dataproc checks for project ID are:
|
97
|
+
|
98
|
+
1. `DATAPROC_PROJECT`
|
99
|
+
2. `GOOGLE_CLOUD_PROJECT`
|
100
|
+
|
101
|
+
The environment variables that google-cloud-dataproc checks for credentials are configured on {Google::Cloud::Dataproc::V1::Credentials}:
|
102
|
+
|
103
|
+
1. `DATAPROC_CREDENTIALS` - Path to JSON file, or JSON contents
|
104
|
+
2. `DATAPROC_KEYFILE` - Path to JSON file, or JSON contents
|
105
|
+
3. `GOOGLE_CLOUD_CREDENTIALS` - Path to JSON file, or JSON contents
|
106
|
+
4. `GOOGLE_CLOUD_KEYFILE` - Path to JSON file, or JSON contents
|
107
|
+
5. `GOOGLE_APPLICATION_CREDENTIALS` - Path to JSON file
|
108
|
+
|
109
|
+
```ruby
|
110
|
+
require "google/cloud/dataproc"
|
111
|
+
|
112
|
+
ENV["DATAPROC_PROJECT"] = "my-project-id"
|
113
|
+
ENV["DATAPROC_CREDENTIALS"] = "path/to/keyfile.json"
|
114
|
+
|
115
|
+
client = Google::Cloud::Dataproc.new
|
116
|
+
```
|
117
|
+
|
118
|
+
### Configuration
|
119
|
+
|
120
|
+
The **Project ID** and **Credentials JSON** can be configured instead of placing them in environment variables or providing them as arguments.
|
121
|
+
|
122
|
+
```ruby
|
123
|
+
require "google/cloud/dataproc"
|
124
|
+
|
125
|
+
Google::Cloud::Dataproc.configure do |config|
|
126
|
+
config.project_id = "my-project-id"
|
127
|
+
config.credentials = "path/to/keyfile.json"
|
128
|
+
end
|
129
|
+
|
130
|
+
client = Google::Cloud::Dataproc.new
|
131
|
+
```
|
132
|
+
|
133
|
+
### Cloud SDK
|
134
|
+
|
135
|
+
This option allows for an easy way to authenticate during development. If
|
136
|
+
credentials are not provided in code or in environment variables, then Cloud SDK
|
137
|
+
credentials are discovered.
|
138
|
+
|
139
|
+
To configure your system for this, simply:
|
140
|
+
|
141
|
+
1. [Download and install the Cloud SDK](https://cloud.google.com/sdk)
|
142
|
+
2. Authenticate using OAuth 2.0 `$ gcloud auth login`
|
143
|
+
3. Write code as if already authenticated.
|
144
|
+
|
145
|
+
**NOTE:** This is _not_ recommended for running in production. The Cloud SDK
|
146
|
+
*should* only be used during development.
|
147
|
+
|
148
|
+
[gce-how-to]: https://cloud.google.com/compute/docs/authentication#using
|
149
|
+
[dev-console]: https://console.cloud.google.com/project
|
150
|
+
|
151
|
+
[enable-apis]: https://raw.githubusercontent.com/GoogleCloudPlatform/gcloud-common/master/authentication/enable-apis.png
|
152
|
+
|
153
|
+
[create-new-service-account]: https://raw.githubusercontent.com/GoogleCloudPlatform/gcloud-common/master/authentication/create-new-service-account.png
|
154
|
+
[create-new-service-account-existing-keys]: https://raw.githubusercontent.com/GoogleCloudPlatform/gcloud-common/master/authentication/create-new-service-account-existing-keys.png
|
155
|
+
[reuse-service-account]: https://raw.githubusercontent.com/GoogleCloudPlatform/gcloud-common/master/authentication/reuse-service-account.png
|
156
|
+
|
157
|
+
## Creating a Service Account
|
158
|
+
|
159
|
+
Google Cloud requires a **Project ID** and **Service Account Credentials** to
|
160
|
+
connect to the APIs. You will use the **Project ID** and **JSON key file** to
|
161
|
+
connect to most services with google-cloud-dataproc.
|
162
|
+
|
163
|
+
If you are not running this client within [Google Cloud Platform
|
164
|
+
environments](#google-cloud-platform-environments), you need a Google
|
165
|
+
Developers service account.
|
166
|
+
|
167
|
+
1. Visit the [Google Developers Console][dev-console].
|
168
|
+
1. Create a new project or click on an existing project.
|
169
|
+
1. Activate the slide-out navigation tray and select **API Manager**. From
|
170
|
+
here, you will enable the APIs that your application requires.
|
171
|
+
|
172
|
+
![Enable the APIs that your application requires][enable-apis]
|
173
|
+
|
174
|
+
*Note: You may need to enable billing in order to use these services.*
|
175
|
+
|
176
|
+
1. Select **Credentials** from the side navigation.
|
177
|
+
|
178
|
+
You should see a screen like one of the following.
|
179
|
+
|
180
|
+
![Create a new service account][create-new-service-account]
|
181
|
+
|
182
|
+
![Create a new service account With Existing Keys][create-new-service-account-existing-keys]
|
183
|
+
|
184
|
+
Find the "Add credentials" drop down and select "Service account" to be
|
185
|
+
guided through downloading a new JSON key file.
|
186
|
+
|
187
|
+
If you want to re-use an existing service account, you can easily generate a
|
188
|
+
new key file. Just select the account you wish to re-use, and click "Generate
|
189
|
+
new JSON key":
|
190
|
+
|
191
|
+
![Re-use an existing service account][reuse-service-account]
|
192
|
+
|
193
|
+
The key file you download will be used by this library to authenticate API
|
194
|
+
requests and should be stored in a secure location.
|
195
|
+
|
196
|
+
## Troubleshooting
|
197
|
+
|
198
|
+
If you're having trouble authenticating, you can ask for help by following the
|
199
|
+
{file:TROUBLESHOOTING.md Troubleshooting Guide}.
|
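The environment-variable lookup order described in the added AUTHENTICATION.md can be sketched in plain Ruby. This is a minimal illustration, not the gem's implementation: the helper names (`project_env_vars`, `credential_path_vars`, `lookup_project_id`, `lookup_credentials`) are hypothetical, and the two-pass behavior (existing file paths first, inline JSON second) is an assumption drawn from the documented order "credentials path" before "credentials JSON".

```ruby
require "json"

# Variable names and their order are taken from AUTHENTICATION.md.
def project_env_vars
  %w[DATAPROC_PROJECT GOOGLE_CLOUD_PROJECT]
end

def credential_path_vars
  %w[DATAPROC_CREDENTIALS DATAPROC_KEYFILE GOOGLE_CLOUD_CREDENTIALS
     GOOGLE_CLOUD_KEYFILE GOOGLE_APPLICATION_CREDENTIALS]
end

# GOOGLE_APPLICATION_CREDENTIALS is documented as "Path to JSON file" only.
def credential_json_vars
  credential_path_vars - ["GOOGLE_APPLICATION_CREDENTIALS"]
end

# Returns the first non-empty project ID, or nil if none is set.
def lookup_project_id(env = ENV)
  project_env_vars.map { |name| env[name] }.find { |v| v && !v.empty? }
end

# Returns a hash describing where credentials were found, or nil.
def lookup_credentials(env = ENV)
  # Pass 1: a variable whose value names an existing file.
  credential_path_vars.each do |name|
    value = env[name]
    return { source: name, kind: :path, value: value } if value && File.file?(value)
  end

  # Pass 2: a variable whose value is itself a JSON object.
  credential_json_vars.each do |name|
    value = env[name]
    next if value.nil?
    begin
      parsed = JSON.parse(value)
      return { source: name, kind: :json, value: parsed } if parsed.is_a?(Hash)
    rescue JSON::ParserError
      next
    end
  end

  nil
end
```

Passing a plain `Hash` instead of `ENV` makes the order easy to check without touching the real environment, for example `lookup_project_id({ "GOOGLE_CLOUD_PROJECT" => "my-project-id" })`.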
data/lib/google/cloud/dataproc/v1/cluster_controller_client.rb
CHANGED
@@ -234,10 +234,11 @@ module Google
|
|
234
234
|
# can also be provided.
|
235
235
|
# @param request_id [String]
|
236
236
|
# Optional. A unique id used to identify the request. If the server
|
237
|
-
# receives two
|
238
|
-
#
|
239
|
-
#
|
240
|
-
#
|
237
|
+
# receives two
|
238
|
+
# {Google::Cloud::Dataproc::V1::CreateClusterRequest CreateClusterRequest}
|
239
|
+
# requests with the same id, then the second request will be ignored and the
|
240
|
+
# first {Google::Longrunning::Operation} created
|
241
|
+
# and stored in the backend is returned.
|
241
242
|
#
|
242
243
|
# It is recommended to always set this value to a
|
243
244
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
@@ -390,10 +391,11 @@ module Google
|
|
390
391
|
# can also be provided.
|
391
392
|
# @param request_id [String]
|
392
393
|
# Optional. A unique id used to identify the request. If the server
|
393
|
-
# receives two
|
394
|
-
#
|
395
|
-
#
|
396
|
-
#
|
394
|
+
# receives two
|
395
|
+
# {Google::Cloud::Dataproc::V1::UpdateClusterRequest UpdateClusterRequest}
|
396
|
+
# requests with the same id, then the second request will be ignored and the
|
397
|
+
# first {Google::Longrunning::Operation} created
|
398
|
+
# and stored in the backend is returned.
|
397
399
|
#
|
398
400
|
# It is recommended to always set this value to a
|
399
401
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
@@ -496,10 +498,11 @@ module Google
|
|
496
498
|
# (with error NOT_FOUND) if cluster with specified UUID does not exist.
|
497
499
|
# @param request_id [String]
|
498
500
|
# Optional. A unique id used to identify the request. If the server
|
499
|
-
# receives two
|
500
|
-
#
|
501
|
-
#
|
502
|
-
#
|
501
|
+
# receives two
|
502
|
+
# {Google::Cloud::Dataproc::V1::DeleteClusterRequest DeleteClusterRequest}
|
503
|
+
# requests with the same id, then the second request will be ignored and the
|
504
|
+
# first {Google::Longrunning::Operation} created
|
505
|
+
# and stored in the backend is returned.
|
503
506
|
#
|
504
507
|
# It is recommended to always set this value to a
|
505
508
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
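The `request_id` behavior restored in these doc comments (a repeated request with the same id is ignored and the first stored operation is returned) can be illustrated with a small in-memory stand-in. This is a sketch under stated assumptions: `submit_create_cluster` and the `store` hash are hypothetical and only mimic the documented server-side semantics; they are not part of the gem. `SecureRandom.uuid` is the Ruby standard-library call for generating the recommended UUID.

```ruby
require "securerandom"

# Hypothetical stand-in for the backend: keeps one operation per request id.
# store is a plain Hash mapping request_id => operation.
def submit_create_cluster(store, cluster_name, request_id)
  # Only the first request with a given id creates an operation;
  # later requests with the same id get the stored one back.
  store[request_id] ||= {
    name: "operations/#{SecureRandom.hex(8)}",
    cluster_name: cluster_name
  }
end

store = {}
request_id = SecureRandom.uuid # recommended: always use a UUID

first = submit_create_cluster(store, "my-cluster", request_id)
second = submit_create_cluster(store, "my-cluster", request_id) # e.g. a retry

puts first[:name] == second[:name]
```

Reusing the same `request_id` on a retry is what makes the call safe to repeat; generating a fresh UUID for each retry would defeat the deduplication.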
data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/clusters.rb
CHANGED
@@ -36,8 +36,9 @@ module Google
|
|
36
36
|
# Label **keys** must contain 1 to 63 characters, and must conform to
|
37
37
|
# [RFC 1035](https://www.ietf.org/rfc/rfc1035.txt).
|
38
38
|
# Label **values** may be empty, but, if present, must contain 1 to 63
|
39
|
-
# characters, and must conform to [RFC
|
40
|
-
# No more than 32 labels can be
|
39
|
+
# characters, and must conform to [RFC
|
40
|
+
# 1035](https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be
|
41
|
+
# associated with a cluster.
|
41
42
|
# @!attribute [rw] status
|
42
43
|
# @return [Google::Cloud::Dataproc::V1::ClusterStatus]
|
43
44
|
# Output only. Cluster status.
|
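The label constraints quoted in the hunk above (keys of 1 to 63 characters, values empty or 1 to 63 characters, both conforming to RFC 1035, and at most 32 labels) can be checked client-side before sending a request. The sketch below is an assumption-laden illustration: `rfc1035_label?` and `label_errors` are hypothetical helpers, and "conform to RFC 1035" is read here as the RFC's `<label>` grammar (starts with a letter, ends with a letter or digit, letters, digits, and hyphens in between). The service may enforce stricter rules.

```ruby
# True if text matches the RFC 1035 <label> grammar and is 1 to 63 characters.
def rfc1035_label?(text)
  text.match?(/\A[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\z/)
end

# Returns a list of human-readable problems; an empty list means the
# labels satisfy the documented constraints as interpreted above.
def label_errors(labels)
  errors = []
  errors << "more than 32 labels" if labels.size > 32

  labels.each do |key, value|
    errors << "invalid key: #{key.inspect}" unless rfc1035_label?(key.to_s)
    next if value.to_s.empty? # values may be empty
    unless rfc1035_label?(value.to_s)
      errors << "invalid value for #{key}: #{value.inspect}"
    end
  end

  errors
end

puts label_errors({ "env" => "prod", "team" => "" }).empty?
```

Returning a list of errors rather than raising on the first one lets a caller report every problem with a label set in a single pass.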
@@ -52,8 +53,8 @@ module Google
|
|
52
53
|
# @return [Google::Cloud::Dataproc::V1::ClusterMetrics]
|
53
54
|
# Contains cluster daemon metrics such as HDFS and YARN stats.
|
54
55
|
#
|
55
|
-
# **Beta Feature**: This report is available for testing purposes only. It
|
56
|
-
# be changed before final release.
|
56
|
+
# **Beta Feature**: This report is available for testing purposes only. It
|
57
|
+
# may be changed before final release.
|
57
58
|
class Cluster; end
|
58
59
|
|
59
60
|
# The cluster config.
|
@@ -89,9 +90,11 @@ module Google
|
|
89
90
|
# Optional. Commands to execute on each node after config is
|
90
91
|
# completed. By default, executables are run on master and all worker nodes.
|
91
92
|
# You can test a node's `role` metadata to run an executable on
|
92
|
-
# a master or worker node, as shown below using `curl` (you can also use
|
93
|
+
# a master or worker node, as shown below using `curl` (you can also use
|
94
|
+
# `wget`):
|
93
95
|
#
|
94
|
-
# ROLE=$(curl -H Metadata-Flavor:Google
|
96
|
+
# ROLE=$(curl -H Metadata-Flavor:Google
|
97
|
+
# http://metadata/computeMetadata/v1/instance/attributes/dataproc-role)
|
95
98
|
# if [[ "${ROLE}" == 'Master' ]]; then
|
96
99
|
# ... master specific actions ...
|
97
100
|
# else
|
@@ -150,11 +153,11 @@ module Google
|
|
150
153
|
# @!attribute [rw] internal_ip_only
|
151
154
|
# @return [true, false]
|
152
155
|
# Optional. If true, all instances in the cluster will only have internal IP
|
153
|
-
# addresses. By default, clusters are not restricted to internal IP
|
154
|
-
# and will have ephemeral external IP addresses assigned to each
|
155
|
-
# This `internal_ip_only` restriction can only be enabled for
|
156
|
-
# enabled networks, and all off-cluster dependencies must be
|
157
|
-
# accessible without external IP addresses.
|
156
|
+
# addresses. By default, clusters are not restricted to internal IP
|
157
|
+
# addresses, and will have ephemeral external IP addresses assigned to each
|
158
|
+
# instance. This `internal_ip_only` restriction can only be enabled for
|
159
|
+
# subnetwork enabled networks, and all off-cluster dependencies must be
|
160
|
+
# configured to be accessible without external IP addresses.
|
158
161
|
# @!attribute [rw] service_account
|
159
162
|
# @return [String]
|
160
163
|
# Optional. The service account of the instances. Defaults to the default
|
@@ -164,7 +167,8 @@ module Google
|
|
164
167
|
# * roles/logging.logWriter
|
165
168
|
# * roles/storage.objectAdmin
|
166
169
|
#
|
167
|
-
# (see
|
170
|
+
# (see
|
171
|
+
# https://cloud.google.com/compute/docs/access/service-accounts#custom_service_accounts
|
168
172
|
# for more information).
|
169
173
|
# Example: `[account_id]@[project_id].iam.gserviceaccount.com`
|
170
174
|
# @!attribute [rw] service_account_scopes
|
@@ -190,7 +194,8 @@ module Google
|
|
190
194
|
# @!attribute [rw] metadata
|
191
195
|
# @return [Hash{String => String}]
|
192
196
|
# The Compute Engine metadata entries to add to all instances (see
|
193
|
-
# [Project and instance
|
197
|
+
# [Project and instance
|
198
|
+
# metadata](https://cloud.google.com/compute/docs/storing-retrieving-metadata#project_and_instance_metadata)).
|
194
199
|
class GceClusterConfig; end
|
195
200
|
|
196
201
|
# Optional. The config settings for Compute Engine resources in
|
@@ -219,7 +224,8 @@ module Google
|
|
219
224
|
# * `n1-standard-2`
|
220
225
|
#
|
221
226
|
# **Auto Zone Exception**: If you are using the Cloud Dataproc
|
222
|
-
# [Auto Zone
|
227
|
+
# [Auto Zone
|
228
|
+
# Placement](/dataproc/docs/concepts/configuring-clusters/auto-zone#using_auto_zone_placement)
|
223
229
|
# feature, you must use the short name of the machine type
|
224
230
|
# resource, for example, `n1-standard-2`.
|
225
231
|
# @!attribute [rw] disk_config
|
@@ -227,7 +233,8 @@ module Google
|
|
227
233
|
# Optional. Disk option config settings.
|
228
234
|
# @!attribute [rw] is_preemptible
|
229
235
|
# @return [true, false]
|
230
|
-
# Optional. Specifies that this instance group contains preemptible
|
236
|
+
# Optional. Specifies that this instance group contains preemptible
|
237
|
+
# instances.
|
231
238
|
# @!attribute [rw] managed_group_config
|
232
239
|
# @return [Google::Cloud::Dataproc::V1::ManagedGroupConfig]
|
233
240
|
# Output only. The config for Compute Engine Instance Group
|
@@ -258,7 +265,8 @@ module Google
|
|
258
265
|
# @return [String]
|
259
266
|
# Full URL, partial URI, or short name of the accelerator type resource to
|
260
267
|
# expose to this instance. See
|
261
|
-
# [Compute Engine
|
268
|
+
# [Compute Engine
|
269
|
+
# AcceleratorTypes](/compute/docs/reference/beta/acceleratorTypes).
|
262
270
|
#
|
263
271
|
# Examples:
|
264
272
|
#
|
@@ -267,7 +275,8 @@ module Google
|
|
267
275
|
# * `nvidia-tesla-k80`
|
268
276
|
#
|
269
277
|
# **Auto Zone Exception**: If you are using the Cloud Dataproc
|
270
|
-
# [Auto Zone
|
278
|
+
# [Auto Zone
|
279
|
+
# Placement](/dataproc/docs/concepts/configuring-clusters/auto-zone#using_auto_zone_placement)
|
271
280
|
# feature, you must use the short name of the accelerator type
|
272
281
|
# resource, for example, `nvidia-tesla-k80`.
|
273
282
|
# @!attribute [rw] accelerator_count
|
@@ -366,10 +375,12 @@ module Google
|
|
366
375
|
# Specifies the selection and config of software inside the cluster.
|
367
376
|
# @!attribute [rw] image_version
|
368
377
|
# @return [String]
|
369
|
-
# Optional. The version of software inside the cluster. It must be one of the
|
370
|
-
# [Cloud Dataproc
|
378
|
+
# Optional. The version of software inside the cluster. It must be one of the
|
379
|
+
# supported [Cloud Dataproc
|
380
|
+
# Versions](/dataproc/docs/concepts/versioning/dataproc-versions#supported_cloud_dataproc_versions),
|
371
381
|
# such as "1.2" (including a subminor version, such as "1.2.29"), or the
|
372
|
-
# ["preview"
|
382
|
+
# ["preview"
|
383
|
+
# version](/dataproc/docs/concepts/versioning/dataproc-versions#other_versions).
|
373
384
|
# If unspecified, it defaults to the latest version.
|
374
385
|
# @!attribute [rw] properties
|
375
386
|
# @return [Hash{String => String}]
|
@@ -419,10 +430,11 @@ module Google
|
|
419
430
|
# @!attribute [rw] request_id
|
420
431
|
# @return [String]
|
421
432
|
# Optional. A unique id used to identify the request. If the server
|
422
|
-
# receives two
|
423
|
-
#
|
424
|
-
#
|
425
|
-
#
|
433
|
+
# receives two
|
434
|
+
# {Google::Cloud::Dataproc::V1::CreateClusterRequest CreateClusterRequest}
|
435
|
+
# requests with the same id, then the second request will be ignored and the
|
436
|
+
# first {Google::Longrunning::Operation} created
|
437
|
+
# and stored in the backend is returned.
|
426
438
|
#
|
427
439
|
# It is recommended to always set this value to a
|
428
440
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
@@ -507,10 +519,11 @@ module Google
|
|
507
519
|
# @!attribute [rw] request_id
|
508
520
|
# @return [String]
|
509
521
|
# Optional. A unique id used to identify the request. If the server
|
510
|
-
# receives two
|
511
|
-
#
|
512
|
-
#
|
513
|
-
#
|
522
|
+
# receives two
|
523
|
+
# {Google::Cloud::Dataproc::V1::UpdateClusterRequest UpdateClusterRequest}
|
524
|
+
# requests with the same id, then the second request will be ignored and the
|
525
|
+
# first {Google::Longrunning::Operation} created
|
526
|
+
# and stored in the backend is returned.
|
514
527
|
#
|
515
528
|
# It is recommended to always set this value to a
|
516
529
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
@@ -537,10 +550,11 @@ module Google
|
|
537
550
|
# @!attribute [rw] request_id
|
538
551
|
# @return [String]
|
539
552
|
# Optional. A unique id used to identify the request. If the server
|
540
|
-
# receives two
|
541
|
-
#
|
542
|
-
#
|
543
|
-
#
|
553
|
+
# receives two
|
554
|
+
# {Google::Cloud::Dataproc::V1::DeleteClusterRequest DeleteClusterRequest}
|
555
|
+
# requests with the same id, then the second request will be ignored and the
|
556
|
+
# first {Google::Longrunning::Operation} created
|
557
|
+
# and stored in the backend is returned.
|
544
558
|
#
|
545
559
|
# It is recommended to always set this value to a
|
546
560
|
# [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier).
|
data/lib/google/cloud/dataproc/v1/doc/google/cloud/dataproc/v1/jobs.rb
CHANGED
@@ -59,8 +59,10 @@ module Google
|
|
59
59
|
end
|
60
60
|
|
61
61
|
# A Cloud Dataproc job for running
|
62
|
-
# [Apache Hadoop
|
63
|
-
#
|
62
|
+
# [Apache Hadoop
|
63
|
+
# MapReduce](https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html)
|
64
|
+
# jobs on [Apache Hadoop
|
65
|
+
# YARN](https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/YARN.html).
|
64
66
|
# @!attribute [rw] main_jar_file_uri
|
65
67
|
# @return [String]
|
66
68
|
# The HCFS URI of the jar file containing the main class.
|
@@ -75,8 +77,8 @@ module Google
|
|
75
77
|
# @!attribute [rw] args
|
76
78
|
# @return [Array<String>]
|
77
79
|
# Optional. The arguments to pass to the driver. Do not
|
78
|
-
# include arguments, such as `-libjars` or `-Dfoo=bar`, that can be set as
|
79
|
-
# properties, since a collision may occur that causes an incorrect job
|
80
|
+
# include arguments, such as `-libjars` or `-Dfoo=bar`, that can be set as
|
81
|
+
# job properties, since a collision may occur that causes an incorrect job
|
80
82
|
# submission.
|
81
83
|
# @!attribute [rw] jar_file_uris
|
82
84
|
# @return [Array<String>]
|
@@ -142,7 +144,8 @@ module Google
|
|
142
144
|
class SparkJob; end
|
143
145
|
|
144
146
|
# A Cloud Dataproc job for running
|
145
|
-
# [Apache
|
147
|
+
# [Apache
|
148
|
+
# PySpark](https://spark.apache.org/docs/0.9.0/python-programming-guide.html)
|
146
149
|
# applications on YARN.
|
147
150
|
# @!attribute [rw] main_python_file_uri
|
148
151
|
# @return [String]
|
@@ -210,8 +213,8 @@ module Google
|
|
210
213
|
# @!attribute [rw] continue_on_failure
|
211
214
|
# @return [true, false]
|
212
215
|
# Optional. Whether to continue executing queries if a query fails.
|
213
|
-
# The default value is `false`. Setting to `true` can be useful when
|
214
|
-
# independent parallel queries.
|
216
|
+
# The default value is `false`. Setting to `true` can be useful when
|
217
|
+
# executing independent parallel queries.
|
215
218
|
# @!attribute [rw] script_variables
|
216
219
|
# @return [Hash{String => String}]
|
217
220
|
# Optional. Mapping of query variable names to values (equivalent to the
|
@@ -229,8 +232,8 @@ module Google
|
|
229
232
|
# and UDFs.
|
230
233
|
class HiveJob; end
|
231
234
|
|
232
|
-
# A Cloud Dataproc job for running [Apache Spark
|
233
|
-
# queries.
|
235
|
+
# A Cloud Dataproc job for running [Apache Spark
|
236
|
+
# SQL](http://spark.apache.org/sql/) queries.
|
234
237
|
# @!attribute [rw] query_file_uri
|
235
238
|
# @return [String]
|
236
239
|
# The HCFS URI of the script that contains SQL queries.
|
@@ -265,8 +268,8 @@ module Google
|
|
265
268
|
# @!attribute [rw] continue_on_failure
|
266
269
|
# @return [true, false]
|
267
270
|
# Optional. Whether to continue executing queries if a query fails.
|
268
|
-
# The default value is `false`. Setting to `true` can be useful when
|
269
|
-
# independent parallel queries.
|
271
|
+
# The default value is `false`. Setting to `true` can be useful when
|
272
|
+
# executing independent parallel queries.
|
270
273
|
# @!attribute [rw] script_variables
|
271
274
|
# @return [Hash{String => String}]
|
272
275
|
# Optional. Mapping of query variable names to values (equivalent to the Pig
|
@@ -484,8 +487,8 @@ module Google
|
|
484
487
|
# @return [Array<Google::Cloud::Dataproc::V1::YarnApplication>]
|
485
488
|
# Output only. The collection of YARN applications spun up by this job.
|
486
489
|
#
|
487
|
-
# **Beta** Feature: This report is available for testing purposes only. It
|
488
|
-
# be changed before final release.
|
490
|
+
# **Beta** Feature: This report is available for testing purposes only. It
|
491
|
+
# may be changed before final release.
|
489
492
|
# @!attribute [rw] driver_output_resource_uri
|
490
493
|
# @return [String]
|
491
494
|
# Output only. A URI pointing to the location of the stdout of the job's
|
@@ -501,8 +504,9 @@ module Google
|
|
501
504
|
# Label **keys** must contain 1 to 63 characters, and must conform to
|
502
505
|
# [RFC 1035](https://www.ietf.org/rfc/rfc1035.txt).
|
503
506
|
# Label **values** may be empty, but, if present, must contain 1 to 63
|
504
|
-
# characters, and must conform to [RFC
|
505
|
-
# No more than 32 labels can be
|
507
|
+
# characters, and must conform to [RFC
|
508
|
+
# 1035](https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be
|
509
|
+
# associated with a job.
|
506
510
|
# @!attribute [rw] scheduling
|
507
511
|
# @return [Google::Cloud::Dataproc::V1::JobScheduling]
|
508
512
|
# Optional. Job scheduling configuration.
|
@@ -540,8 +544,8 @@ module Google
|
|
540
544
|
# @!attribute [rw] request_id
|
541
545
|
# @return [String]
|
542
546
|
# Optional. A unique id used to identify the request. If the server
|
543
|
-
# receives two {Google::Cloud::Dataproc::V1::SubmitJobRequest SubmitJobRequest}
|
544
|
-
# id, then the second request will be ignored and the
|
547
|
+
# receives two {Google::Cloud::Dataproc::V1::SubmitJobRequest SubmitJobRequest}
|
548
|
+
# requests with the same id, then the second request will be ignored and the
|
545
549
|
# first {Google::Cloud::Dataproc::V1::Job Job} created and stored in the backend
|
546
550
|
# is returned.
|
547
551
|
#
|