dataproc-spark-connect 0.8.3__py2.py3-none-any.whl → 1.0.0__py2.py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,105 +0,0 @@
- Metadata-Version: 2.4
- Name: dataproc-spark-connect
- Version: 0.8.3
- Summary: Dataproc client library for Spark Connect
- Home-page: https://github.com/GoogleCloudDataproc/dataproc-spark-connect-python
- Author: Google LLC
- License: Apache 2.0
- License-File: LICENSE
- Requires-Dist: google-api-core>=2.19
- Requires-Dist: google-cloud-dataproc>=5.18
- Requires-Dist: packaging>=20.0
- Requires-Dist: pyspark[connect]~=3.5.1
- Requires-Dist: tqdm>=4.67
- Requires-Dist: websockets>=14.0
- Dynamic: author
- Dynamic: description
- Dynamic: home-page
- Dynamic: license
- Dynamic: license-file
- Dynamic: requires-dist
- Dynamic: summary
-
- # Dataproc Spark Connect Client
-
- A wrapper of the Apache [Spark Connect](https://spark.apache.org/spark-connect/)
- client with additional functionality that lets applications communicate with a
- remote Dataproc Spark session over the Spark Connect protocol without
- requiring additional setup steps.
-
- ## Install
-
- ```sh
- pip install dataproc_spark_connect
- ```
-
- ## Uninstall
-
- ```sh
- pip uninstall dataproc_spark_connect
- ```
-
- ## Setup
-
- This client requires permissions to
- manage [Dataproc Sessions and Session Templates](https://cloud.google.com/dataproc-serverless/docs/concepts/iam).
- If you are running the client outside of Google Cloud, you must set the
- following environment variables (one way to set them from a notebook is
- sketched after this list):
-
- * `GOOGLE_CLOUD_PROJECT` - The Google Cloud project you use to run Spark
- workloads.
- * `GOOGLE_CLOUD_REGION` - The Compute
- Engine [region](https://cloud.google.com/compute/docs/regions-zones#available)
- where you run the Spark workload.
- * `GOOGLE_APPLICATION_CREDENTIALS` - The path to
- your [Application Credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc)
- file.
-
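If you prefer to configure these from code rather than your shell, one option is to set them on the process environment before the session is created. A minimal sketch; the project, region, and key-file values below are placeholders, not values from this package:

```python
import os

# Placeholder values; substitute your own project, region, and
# service-account key file before creating a session.
os.environ["GOOGLE_CLOUD_PROJECT"] = "my-project"
os.environ["GOOGLE_CLOUD_REGION"] = "us-central1"
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/key.json"
```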
- ## Usage
-
- 1. Install the latest versions of the Dataproc Python client and Dataproc
- Spark Connect modules:
-
- ```sh
- pip install google_cloud_dataproc dataproc_spark_connect --force-reinstall
- ```
-
- 2. Add the required imports to your PySpark application or notebook and start
- a Spark session with the following code instead of using
- environment variables:
-
- ```python
- from google.cloud.dataproc_spark_connect import DataprocSparkSession
- from google.cloud.dataproc_v1 import Session
- session_config = Session()
- session_config.environment_config.execution_config.subnetwork_uri = '<subnet>'
- session_config.runtime_config.version = '2.2'
- spark = DataprocSparkSession.builder.dataprocSessionConfig(session_config).getOrCreate()
- ```
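The resulting `spark` object behaves like a regular Spark Connect `SparkSession`, so standard PySpark DataFrame operations run against the remote Dataproc session. A minimal smoke test, assuming the session above was created successfully:

```python
# Run a trivial job against the remote session to confirm it works.
df = spark.range(10)
print(df.count())  # expected output: 10

# Release the connection when finished.
spark.stop()
```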
-
- ## Developing
-
- For development instructions, see the [guide](DEVELOPING.md).
-
- ## Contributing
-
- We'd love to accept your patches and contributions to this project. There are
- just a few small guidelines you need to follow.
-
88
- ### Contributor License Agreement
89
-
90
- Contributions to this project must be accompanied by a Contributor License
91
- Agreement. You (or your employer) retain the copyright to your contribution;
92
- this simply gives us permission to use and redistribute your contributions as
93
- part of the project. Head over to <https://cla.developers.google.com> to see
94
- your current agreements on file or to sign a new one.
95
-
96
- You generally only need to submit a CLA once, so if you've already submitted one
97
- (even if it was for a different project), you probably don't need to do it
98
- again.
99
-
100
- ### Code reviews
101
-
102
- All submissions, including submissions by project members, require review. We
103
- use GitHub pull requests for this purpose. Consult
104
- [GitHub Help](https://help.github.com/articles/about-pull-requests/) for more
105
- information on using pull requests.
@@ -1,12 +0,0 @@
- dataproc_spark_connect-0.8.3.dist-info/licenses/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
- google/cloud/dataproc_spark_connect/__init__.py,sha256=dIqHNWVWWrSuRf26x11kX5e9yMKSHCtmI_GBj1-FDdE,1101
- google/cloud/dataproc_spark_connect/exceptions.py,sha256=WF-qdzgdofRwILCriIkjjsmjObZfF0P3Ecg4lv-Hmec,968
- google/cloud/dataproc_spark_connect/pypi_artifacts.py,sha256=gd-VMwiVP-EJuPp9Vf9Shx8pqps3oSKp0hBcSSZQS-A,1575
- google/cloud/dataproc_spark_connect/session.py,sha256=ZWoW9-otaCJnttPt7h9W3pmhHpdbQsAOl8ypOX3fVbo,33556
- google/cloud/dataproc_spark_connect/client/__init__.py,sha256=6hCNSsgYlie6GuVpc5gjFsPnyeMTScTpXSPYqp1fplY,615
- google/cloud/dataproc_spark_connect/client/core.py,sha256=m3oXTKBm3sBy6jhDu9GRecrxLb5CdEM53SgMlnJb6ag,4616
- google/cloud/dataproc_spark_connect/client/proxy.py,sha256=qUZXvVY1yn934vE6nlO495XUZ53AUx9O74a9ozkGI9U,8976
- dataproc_spark_connect-0.8.3.dist-info/METADATA,sha256=croGipnWGtSrd2NLyMCHrcVagYCk9yJ6cEOqCEAm-Qc,3465
- dataproc_spark_connect-0.8.3.dist-info/WHEEL,sha256=JNWh1Fm1UdwIQV075glCn4MVuCRs0sotJIq-J6rbxCU,109
- dataproc_spark_connect-0.8.3.dist-info/top_level.txt,sha256=_1QvSJIhFAGfxb79D6DhB7SUw2X6T4rwnz_LLrbcD3c,7
- dataproc_spark_connect-0.8.3.dist-info/RECORD,,