AWSGlueDataplanePython 5.0.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- awsglue/README.md +37 -0
- awsglue/__init__.py +15 -0
- awsglue/context.py +690 -0
- awsglue/data_sink.py +49 -0
- awsglue/data_source.py +49 -0
- awsglue/dataframe_transforms/__init__.py +17 -0
- awsglue/dataframe_transforms/apply_mapping.py +76 -0
- awsglue/dataframereader.py +41 -0
- awsglue/dataframewriter.py +21 -0
- awsglue/devutils.py +236 -0
- awsglue/dynamicframe.py +669 -0
- awsglue/functions.py +31 -0
- awsglue/glue_shell.py +38 -0
- awsglue/gluetypes.py +461 -0
- awsglue/job.py +59 -0
- awsglue/scripts/__init__.py +12 -0
- awsglue/scripts/activate_etl_connector.py +362 -0
- awsglue/scripts/connector_activation_util.py +38 -0
- awsglue/scripts/crawler_redo_from_backup.py +75 -0
- awsglue/scripts/crawler_undo.py +121 -0
- awsglue/scripts/scripts_utils.py +106 -0
- awsglue/streaming_data_source.py +28 -0
- awsglue/transforms/__init__.py +47 -0
- awsglue/transforms/apply_mapping.py +72 -0
- awsglue/transforms/coalesce.py +66 -0
- awsglue/transforms/collection_transforms.py +155 -0
- awsglue/transforms/drop_nulls.py +85 -0
- awsglue/transforms/dynamicframe_filter.py +66 -0
- awsglue/transforms/dynamicframe_map.py +72 -0
- awsglue/transforms/errors_as_dynamicframe.py +45 -0
- awsglue/transforms/field_transforms.py +469 -0
- awsglue/transforms/relationalize.py +105 -0
- awsglue/transforms/repartition.py +61 -0
- awsglue/transforms/resolve_choice.py +85 -0
- awsglue/transforms/transform.py +92 -0
- awsglue/transforms/unbox.py +112 -0
- awsglue/transforms/union.py +66 -0
- awsglue/transforms/unnest_frame.py +75 -0
- awsglue/utils.py +159 -0
- awsgluedataplanepython-5.0.0.dist-info/METADATA +178 -0
- awsgluedataplanepython-5.0.0.dist-info/RECORD +45 -0
- awsgluedataplanepython-5.0.0.dist-info/WHEEL +5 -0
- awsgluedataplanepython-5.0.0.dist-info/licenses/LICENSE.txt +96 -0
- awsgluedataplanepython-5.0.0.dist-info/licenses/NOTICE.txt +3 -0
- awsgluedataplanepython-5.0.0.dist-info/top_level.txt +1 -0
awsglue/README.md
ADDED
@@ -0,0 +1,37 @@
+# awsglue
+
+The awsglue Python package contains the Python portion of the [AWS Glue](https://aws.amazon.com/glue) library. This library extends [PySpark](http://spark.apache.org/docs/2.1.0/api/python/pyspark.html) to support serverless ETL on AWS.
+
+Note that this package must be used in conjunction with the AWS Glue service and is not executable independently. Many of the classes and methods use the Py4J library to interface with code that is available on the Glue platform. This repository can be used as a reference and aid for writing Glue scripts.
+
+While scripts using this library can only be run on the AWS Glue service, it is possible to import this library locally. This may be helpful to provide auto-completion in an IDE, for instance. To import the library successfully you will need to install PySpark, which can be done using pip:
+
+pip install pyspark
+
+## Content
+
+This package contains Python interfaces to the key data structures and methods used in AWS Glue. The following are some important modules. More information can be found in the public documentation.
+
+
+#### GlueContext
+The file [context.py](context.py) contains the GlueContext class. GlueContext extends PySpark's [SQLContext](https://github.com/apache/spark/blob/master/python/pyspark/sql/context.py) class to provide Glue-specific operations. Most Glue programs will start by instantiating a GlueContext and using it to construct a DynamicFrame.
+
+
+#### DynamicFrame
+The DynamicFrame, defined in [dynamicframe.py](dynamicframe.py), is the core data structure used in Glue scripts. DynamicFrames are similar to Spark SQL's [DataFrames](https://github.com/apache/spark/blob/master/python/pyspark/sql/dataframe.py) in that they represent distributed collections of data records, but DynamicFrames provide more flexible handling of data sets with inconsistent schemas. By representing records in a self-describing way, they can be used without specifying a schema up front or requiring a costly schema inference step.
+
+DynamicFrames support many operations, but it is also possible to convert them to DataFrames using the `toDF` method to make use of existing Spark SQL operations.
+
+
+#### Transforms
+
+The [transforms](transforms/) directory contains a variety of operations that can be performed on DynamicFrames. These include simple operations, such as `DropFields`, as well as more complex transformations like `Relationalize`, which flattens a nested data set into a collection of tables that can be loaded into a relational database. Once imported, transforms can be invoked using the following syntax:
+
+TransformClass.apply(args...)
+
+## Additional Resources
+
+- The [aws-glue-samples](https://github.com/awslabs/aws-glue-samples) repository contains sample scripts that make use of the awsglue library and can be submitted directly to the AWS Glue service.
+
+- The public [Glue Documentation](http://docs.aws.amazon.com/glue/latest/dg/index.html) contains information about the AWS Glue service as well as additional information about the Python library.
+
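The README in this hunk describes a typical flow: instantiate a GlueContext, construct a DynamicFrame, apply a transform through `TransformClass.apply(...)`, and convert the result with `toDF`. The following is a minimal sketch of that flow, assuming the documented `create_dynamic_frame.from_catalog` reader and the `DropFields` transform. The function name `load_without_fields` and its parameters are illustrative and not part of the package. The imports are guarded because, as the README notes, scripts using this library only run on the AWS Glue service.

```python
# Sketch only: this runs end to end only inside an AWS Glue job.
try:
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.transforms import DropFields
    GLUE_AVAILABLE = True
except ImportError:
    # Outside Glue (or without PySpark installed) the imports fail;
    # the function below can still be defined and inspected.
    GLUE_AVAILABLE = False


def load_without_fields(database, table_name, fields_to_drop):
    """Read a catalog table, drop some fields, return a Spark DataFrame.

    All three parameters are hypothetical inputs chosen for this sketch.
    """
    if not GLUE_AVAILABLE:
        raise RuntimeError("awsglue and pyspark are required; run inside AWS Glue")

    # Most Glue programs start by wrapping the SparkContext in a GlueContext.
    glue_context = GlueContext(SparkContext.getOrCreate())

    # Build a DynamicFrame from a Data Catalog table.
    dyf = glue_context.create_dynamic_frame.from_catalog(
        database=database, table_name=table_name
    )

    # Transforms follow the TransformClass.apply(args...) convention.
    trimmed = DropFields.apply(frame=dyf, paths=fields_to_drop)

    # Convert to a DataFrame to use existing Spark SQL operations.
    return trimmed.toDF()
```

Inside a Glue job this might be called as `load_without_fields("my_db", "my_table", ["unused_col"])`, where the database, table, and column names are placeholders.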
awsglue/__init__.py
ADDED
@@ -0,0 +1,15 @@
+# Copyright 2016-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+# Licensed under the Amazon Software License (the "License"). You may not use
+# this file except in compliance with the License. A copy of the License is
+# located at
+#
+# http://aws.amazon.com/asl/
+#
+# or in the "license" file accompanying this file. This file is distributed
+# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express
+# or implied. See the License for the specific language governing
+# permissions and limitations under the License.
+
+from .dynamicframe import DynamicFrame
+
+__all__ = ['DynamicFrame']
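This `__init__.py` re-exports `DynamicFrame` and lists it in `__all__`, so `from awsglue import DynamicFrame` works and a star import exposes only that name. The effect of `__all__` can be illustrated with a throwaway stand-in module. `demo_pkg` and `helper` are hypothetical names used only for this sketch, and nothing here imports awsglue itself.

```python
import sys
import types

# Build a stand-in module that mimics a package __init__ (hypothetical names).
demo = types.ModuleType("demo_pkg")
demo.DynamicFrame = type("DynamicFrame", (), {})  # placeholder class
demo.helper = lambda: None                        # not listed in __all__
demo.__all__ = ["DynamicFrame"]
sys.modules["demo_pkg"] = demo

# A star import copies only the names listed in __all__.
namespace = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))

# Names left out of __all__ are still reachable by explicit import.
from demo_pkg import helper
```

Here `exported` contains only `DynamicFrame`, while `helper` remains importable by name.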