teradataml 20.0.0.1__py3-none-any.whl → 20.0.0.3__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of teradataml might be problematic.
- teradataml/LICENSE-3RD-PARTY.pdf +0 -0
- teradataml/LICENSE.pdf +0 -0
- teradataml/README.md +306 -0
- teradataml/__init__.py +10 -3
- teradataml/_version.py +1 -1
- teradataml/analytics/__init__.py +3 -2
- teradataml/analytics/analytic_function_executor.py +299 -16
- teradataml/analytics/analytic_query_generator.py +92 -0
- teradataml/analytics/byom/__init__.py +3 -2
- teradataml/analytics/json_parser/metadata.py +13 -3
- teradataml/analytics/json_parser/utils.py +13 -6
- teradataml/analytics/meta_class.py +40 -1
- teradataml/analytics/sqle/DecisionTreePredict.py +1 -1
- teradataml/analytics/sqle/__init__.py +11 -2
- teradataml/analytics/table_operator/__init__.py +4 -3
- teradataml/analytics/uaf/__init__.py +21 -2
- teradataml/analytics/utils.py +66 -1
- teradataml/analytics/valib.py +1 -1
- teradataml/automl/__init__.py +1502 -323
- teradataml/automl/custom_json_utils.py +139 -61
- teradataml/automl/data_preparation.py +247 -307
- teradataml/automl/data_transformation.py +32 -12
- teradataml/automl/feature_engineering.py +325 -86
- teradataml/automl/model_evaluation.py +44 -35
- teradataml/automl/model_training.py +122 -153
- teradataml/catalog/byom.py +8 -8
- teradataml/clients/pkce_client.py +1 -1
- teradataml/common/__init__.py +2 -1
- teradataml/common/constants.py +72 -0
- teradataml/common/deprecations.py +13 -7
- teradataml/common/garbagecollector.py +152 -120
- teradataml/common/messagecodes.py +11 -2
- teradataml/common/messages.py +4 -1
- teradataml/common/sqlbundle.py +26 -4
- teradataml/common/utils.py +225 -14
- teradataml/common/wrapper_utils.py +1 -1
- teradataml/context/context.py +82 -2
- teradataml/data/SQL_Fundamentals.pdf +0 -0
- teradataml/data/complaints_test_tokenized.csv +353 -0
- teradataml/data/complaints_tokens_model.csv +348 -0
- teradataml/data/covid_confirm_sd.csv +83 -0
- teradataml/data/dataframe_example.json +27 -1
- teradataml/data/docs/sqle/docs_17_20/CFilter.py +132 -0
- teradataml/data/docs/sqle/docs_17_20/NaiveBayes.py +162 -0
- teradataml/data/docs/sqle/docs_17_20/OutlierFilterFit.py +2 -0
- teradataml/data/docs/sqle/docs_17_20/Pivoting.py +279 -0
- teradataml/data/docs/sqle/docs_17_20/Shap.py +203 -0
- teradataml/data/docs/sqle/docs_17_20/TDNaiveBayesPredict.py +189 -0
- teradataml/data/docs/sqle/docs_17_20/TFIDF.py +142 -0
- teradataml/data/docs/sqle/docs_17_20/TextParser.py +3 -3
- teradataml/data/docs/sqle/docs_17_20/Unpivoting.py +216 -0
- teradataml/data/docs/tableoperator/docs_17_20/Image2Matrix.py +118 -0
- teradataml/data/docs/uaf/docs_17_20/ACF.py +1 -10
- teradataml/data/docs/uaf/docs_17_20/ArimaEstimate.py +1 -1
- teradataml/data/docs/uaf/docs_17_20/ArimaForecast.py +35 -5
- teradataml/data/docs/uaf/docs_17_20/ArimaValidate.py +3 -1
- teradataml/data/docs/uaf/docs_17_20/ArimaXEstimate.py +293 -0
- teradataml/data/docs/uaf/docs_17_20/AutoArima.py +354 -0
- teradataml/data/docs/uaf/docs_17_20/BreuschGodfrey.py +3 -2
- teradataml/data/docs/uaf/docs_17_20/BreuschPaganGodfrey.py +1 -1
- teradataml/data/docs/uaf/docs_17_20/Convolve.py +13 -10
- teradataml/data/docs/uaf/docs_17_20/Convolve2.py +4 -1
- teradataml/data/docs/uaf/docs_17_20/CopyArt.py +145 -0
- teradataml/data/docs/uaf/docs_17_20/CumulPeriodogram.py +5 -4
- teradataml/data/docs/uaf/docs_17_20/DFFT2Conv.py +4 -4
- teradataml/data/docs/uaf/docs_17_20/DWT.py +235 -0
- teradataml/data/docs/uaf/docs_17_20/DWT2D.py +214 -0
- teradataml/data/docs/uaf/docs_17_20/DickeyFuller.py +18 -21
- teradataml/data/docs/uaf/docs_17_20/DurbinWatson.py +1 -1
- teradataml/data/docs/uaf/docs_17_20/ExtractResults.py +1 -1
- teradataml/data/docs/uaf/docs_17_20/FilterFactory1d.py +160 -0
- teradataml/data/docs/uaf/docs_17_20/GenseriesSinusoids.py +1 -1
- teradataml/data/docs/uaf/docs_17_20/GoldfeldQuandt.py +9 -31
- teradataml/data/docs/uaf/docs_17_20/HoltWintersForecaster.py +4 -2
- teradataml/data/docs/uaf/docs_17_20/IDFFT2.py +1 -8
- teradataml/data/docs/uaf/docs_17_20/IDWT.py +236 -0
- teradataml/data/docs/uaf/docs_17_20/IDWT2D.py +226 -0
- teradataml/data/docs/uaf/docs_17_20/IQR.py +134 -0
- teradataml/data/docs/uaf/docs_17_20/LineSpec.py +1 -1
- teradataml/data/docs/uaf/docs_17_20/LinearRegr.py +2 -2
- teradataml/data/docs/uaf/docs_17_20/MAMean.py +3 -3
- teradataml/data/docs/uaf/docs_17_20/Matrix2Image.py +297 -0
- teradataml/data/docs/uaf/docs_17_20/MatrixMultiply.py +15 -6
- teradataml/data/docs/uaf/docs_17_20/PACF.py +0 -1
- teradataml/data/docs/uaf/docs_17_20/Portman.py +2 -2
- teradataml/data/docs/uaf/docs_17_20/PowerSpec.py +2 -2
- teradataml/data/docs/uaf/docs_17_20/Resample.py +9 -1
- teradataml/data/docs/uaf/docs_17_20/SAX.py +246 -0
- teradataml/data/docs/uaf/docs_17_20/SeasonalNormalize.py +17 -10
- teradataml/data/docs/uaf/docs_17_20/SignifPeriodicities.py +1 -1
- teradataml/data/docs/uaf/docs_17_20/WhitesGeneral.py +3 -1
- teradataml/data/docs/uaf/docs_17_20/WindowDFFT.py +368 -0
- teradataml/data/dwt2d_dataTable.csv +65 -0
- teradataml/data/dwt_dataTable.csv +8 -0
- teradataml/data/dwt_filterTable.csv +3 -0
- teradataml/data/finance_data4.csv +13 -0
- teradataml/data/grocery_transaction.csv +19 -0
- teradataml/data/idwt2d_dataTable.csv +5 -0
- teradataml/data/idwt_dataTable.csv +8 -0
- teradataml/data/idwt_filterTable.csv +3 -0
- teradataml/data/interval_data.csv +5 -0
- teradataml/data/jsons/paired_functions.json +14 -0
- teradataml/data/jsons/sqle/17.20/TD_CFilter.json +118 -0
- teradataml/data/jsons/sqle/17.20/TD_NaiveBayes.json +193 -0
- teradataml/data/jsons/sqle/17.20/TD_NaiveBayesPredict.json +212 -0
- teradataml/data/jsons/sqle/17.20/TD_OneClassSVM.json +9 -9
- teradataml/data/jsons/sqle/17.20/TD_Pivoting.json +280 -0
- teradataml/data/jsons/sqle/17.20/TD_Shap.json +222 -0
- teradataml/data/jsons/sqle/17.20/TD_TFIDF.json +162 -0
- teradataml/data/jsons/sqle/17.20/TD_TextParser.json +1 -1
- teradataml/data/jsons/sqle/17.20/TD_Unpivoting.json +235 -0
- teradataml/data/jsons/sqle/20.00/TD_KMeans.json +250 -0
- teradataml/data/jsons/sqle/20.00/TD_SMOTE.json +266 -0
- teradataml/data/jsons/sqle/20.00/TD_VectorDistance.json +278 -0
- teradataml/data/jsons/storedprocedure/17.20/TD_COPYART.json +71 -0
- teradataml/data/jsons/storedprocedure/17.20/TD_FILTERFACTORY1D.json +150 -0
- teradataml/data/jsons/tableoperator/17.20/IMAGE2MATRIX.json +53 -0
- teradataml/data/jsons/uaf/17.20/TD_ACF.json +1 -18
- teradataml/data/jsons/uaf/17.20/TD_ARIMAESTIMATE.json +3 -16
- teradataml/data/jsons/uaf/17.20/TD_ARIMAFORECAST.json +0 -3
- teradataml/data/jsons/uaf/17.20/TD_ARIMAVALIDATE.json +5 -3
- teradataml/data/jsons/uaf/17.20/TD_ARIMAXESTIMATE.json +362 -0
- teradataml/data/jsons/uaf/17.20/TD_AUTOARIMA.json +469 -0
- teradataml/data/jsons/uaf/17.20/TD_BINARYMATRIXOP.json +0 -3
- teradataml/data/jsons/uaf/17.20/TD_BINARYSERIESOP.json +0 -2
- teradataml/data/jsons/uaf/17.20/TD_BREUSCH_GODFREY.json +2 -1
- teradataml/data/jsons/uaf/17.20/TD_BREUSCH_PAGAN_GODFREY.json +2 -5
- teradataml/data/jsons/uaf/17.20/TD_CONVOLVE.json +3 -6
- teradataml/data/jsons/uaf/17.20/TD_CONVOLVE2.json +1 -3
- teradataml/data/jsons/uaf/17.20/TD_CUMUL_PERIODOGRAM.json +0 -5
- teradataml/data/jsons/uaf/17.20/TD_DFFT.json +1 -4
- teradataml/data/jsons/uaf/17.20/TD_DFFT2.json +2 -7
- teradataml/data/jsons/uaf/17.20/TD_DFFT2CONV.json +1 -2
- teradataml/data/jsons/uaf/17.20/TD_DFFTCONV.json +0 -2
- teradataml/data/jsons/uaf/17.20/TD_DICKEY_FULLER.json +10 -19
- teradataml/data/jsons/uaf/17.20/TD_DTW.json +3 -6
- teradataml/data/jsons/uaf/17.20/TD_DWT.json +173 -0
- teradataml/data/jsons/uaf/17.20/TD_DWT2D.json +160 -0
- teradataml/data/jsons/uaf/17.20/TD_FITMETRICS.json +1 -1
- teradataml/data/jsons/uaf/17.20/TD_GOLDFELD_QUANDT.json +16 -30
- teradataml/data/jsons/uaf/17.20/{TD_HOLT_WINTERS_FORECAST.json → TD_HOLT_WINTERS_FORECASTER.json} +1 -2
- teradataml/data/jsons/uaf/17.20/TD_IDFFT2.json +1 -15
- teradataml/data/jsons/uaf/17.20/TD_IDWT.json +162 -0
- teradataml/data/jsons/uaf/17.20/TD_IDWT2D.json +149 -0
- teradataml/data/jsons/uaf/17.20/TD_IQR.json +117 -0
- teradataml/data/jsons/uaf/17.20/TD_LINEAR_REGR.json +1 -1
- teradataml/data/jsons/uaf/17.20/TD_LINESPEC.json +1 -1
- teradataml/data/jsons/uaf/17.20/TD_MAMEAN.json +1 -3
- teradataml/data/jsons/uaf/17.20/TD_MATRIX2IMAGE.json +209 -0
- teradataml/data/jsons/uaf/17.20/TD_PACF.json +2 -2
- teradataml/data/jsons/uaf/17.20/TD_POWERSPEC.json +5 -5
- teradataml/data/jsons/uaf/17.20/TD_RESAMPLE.json +48 -28
- teradataml/data/jsons/uaf/17.20/TD_SAX.json +210 -0
- teradataml/data/jsons/uaf/17.20/TD_SEASONALNORMALIZE.json +12 -6
- teradataml/data/jsons/uaf/17.20/TD_SIMPLEEXP.json +0 -1
- teradataml/data/jsons/uaf/17.20/TD_TRACKINGOP.json +8 -8
- teradataml/data/jsons/uaf/17.20/TD_UNDIFF.json +1 -1
- teradataml/data/jsons/uaf/17.20/TD_UNNORMALIZE.json +1 -1
- teradataml/data/jsons/uaf/17.20/TD_WINDOWDFFT.json +410 -0
- teradataml/data/load_example_data.py +8 -2
- teradataml/data/medical_readings.csv +101 -0
- teradataml/data/naivebayestextclassifier_example.json +1 -1
- teradataml/data/naivebayestextclassifierpredict_example.json +11 -0
- teradataml/data/patient_profile.csv +101 -0
- teradataml/data/peppers.png +0 -0
- teradataml/data/real_values.csv +14 -0
- teradataml/data/sax_example.json +8 -0
- teradataml/data/scripts/deploy_script.py +1 -1
- teradataml/data/scripts/lightgbm/dataset.template +157 -0
- teradataml/data/scripts/lightgbm/lightgbm_class_functions.template +247 -0
- teradataml/data/scripts/lightgbm/lightgbm_function.template +216 -0
- teradataml/data/scripts/lightgbm/lightgbm_sklearn.template +159 -0
- teradataml/data/scripts/sklearn/sklearn_fit.py +194 -160
- teradataml/data/scripts/sklearn/sklearn_fit_predict.py +136 -115
- teradataml/data/scripts/sklearn/sklearn_function.template +34 -16
- teradataml/data/scripts/sklearn/sklearn_model_selection_split.py +155 -137
- teradataml/data/scripts/sklearn/sklearn_neighbors.py +1 -1
- teradataml/data/scripts/sklearn/sklearn_score.py +12 -3
- teradataml/data/scripts/sklearn/sklearn_transform.py +162 -24
- teradataml/data/star_pivot.csv +8 -0
- teradataml/data/target_udt_data.csv +8 -0
- teradataml/data/templates/open_source_ml.json +3 -1
- teradataml/data/teradataml_example.json +20 -1
- teradataml/data/timestamp_data.csv +4 -0
- teradataml/data/titanic_dataset_unpivoted.csv +19 -0
- teradataml/data/uaf_example.json +55 -1
- teradataml/data/unpivot_example.json +15 -0
- teradataml/data/url_data.csv +9 -0
- teradataml/data/vectordistance_example.json +4 -0
- teradataml/data/windowdfft.csv +16 -0
- teradataml/dataframe/copy_to.py +1 -1
- teradataml/dataframe/data_transfer.py +5 -3
- teradataml/dataframe/dataframe.py +1002 -201
- teradataml/dataframe/fastload.py +3 -3
- teradataml/dataframe/functions.py +867 -0
- teradataml/dataframe/row.py +160 -0
- teradataml/dataframe/setop.py +2 -2
- teradataml/dataframe/sql.py +840 -33
- teradataml/dataframe/window.py +1 -1
- teradataml/dbutils/dbutils.py +878 -34
- teradataml/dbutils/filemgr.py +48 -1
- teradataml/geospatial/geodataframe.py +1 -1
- teradataml/geospatial/geodataframecolumn.py +1 -1
- teradataml/hyperparameter_tuner/optimizer.py +13 -13
- teradataml/lib/aed_0_1.dll +0 -0
- teradataml/opensource/__init__.py +1 -1
- teradataml/opensource/{sklearn/_class.py → _class.py} +102 -17
- teradataml/opensource/_lightgbm.py +950 -0
- teradataml/opensource/{sklearn/_wrapper_utils.py → _wrapper_utils.py} +1 -2
- teradataml/opensource/{sklearn/constants.py → constants.py} +13 -10
- teradataml/opensource/sklearn/__init__.py +0 -1
- teradataml/opensource/sklearn/_sklearn_wrapper.py +1019 -574
- teradataml/options/__init__.py +9 -23
- teradataml/options/configure.py +42 -4
- teradataml/options/display.py +2 -2
- teradataml/plot/axis.py +4 -4
- teradataml/scriptmgmt/UserEnv.py +13 -9
- teradataml/scriptmgmt/lls_utils.py +77 -23
- teradataml/store/__init__.py +13 -0
- teradataml/store/feature_store/__init__.py +0 -0
- teradataml/store/feature_store/constants.py +291 -0
- teradataml/store/feature_store/feature_store.py +2223 -0
- teradataml/store/feature_store/models.py +1505 -0
- teradataml/store/vector_store/__init__.py +1586 -0
- teradataml/table_operators/Script.py +2 -2
- teradataml/table_operators/TableOperator.py +106 -20
- teradataml/table_operators/query_generator.py +3 -0
- teradataml/table_operators/table_operator_query_generator.py +3 -1
- teradataml/table_operators/table_operator_util.py +102 -56
- teradataml/table_operators/templates/dataframe_register.template +69 -0
- teradataml/table_operators/templates/dataframe_udf.template +63 -0
- teradataml/telemetry_utils/__init__.py +0 -0
- teradataml/telemetry_utils/queryband.py +52 -0
- teradataml/utils/dtypes.py +4 -2
- teradataml/utils/validators.py +34 -2
- {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.3.dist-info}/METADATA +311 -3
- {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.3.dist-info}/RECORD +240 -157
- {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.3.dist-info}/WHEEL +0 -0
- {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.3.dist-info}/top_level.txt +0 -0
- {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.3.dist-info}/zip-safe +0 -0
teradataml/utils/dtypes.py
CHANGED
@@ -641,11 +641,13 @@ class _Dtypes:
 
         """
         from teradataml.dataframe.dataframe import TDSeries, TDMatrix, TDGenSeries, TDAnalyticResult
+        from teradataml.store.feature_store.feature_store import Feature
         _DtypesMappers.JSON_TD_TO_PYTHON_TYPE_MAPPER.update({"SERIES": TDSeries,
                                                              "MATRIX": TDMatrix,
                                                              "ART": TDAnalyticResult,
-                                                             "GENSERIES": TDGenSeries
-
+                                                             "GENSERIES": TDGenSeries,
+                                                             "COLUMN": (str, Feature),
+                                                             "COLUMNS": (str, Feature)})
 
         return _DtypesMappers.JSON_TD_TO_PYTHON_TYPE_MAPPER.get(json_td_type.upper())
 
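The effect of the two new mapper entries can be illustrated with a small stand-alone sketch. It does not use teradataml: `Feature`, `TYPE_MAPPER`, `python_type_for()` and `accepts()` are hypothetical stand-ins chosen for illustration. The point is only that mapping a JSON type name to a tuple such as `(str, Feature)` lets one `isinstance` check accept either a column name or a feature object.

```python
class Feature:
    """Hypothetical stand-in for the Feature class imported in the diff."""

    def __init__(self, name):
        self.name = name


# Hypothetical mapper with the same shape as the entries added in the diff:
# a JSON type name maps to the Python type (or tuple of types) it accepts.
TYPE_MAPPER = {
    "COLUMN": (str, Feature),
    "COLUMNS": (str, Feature),
}


def python_type_for(json_td_type):
    """Look up the accepted Python type(s), ignoring case; None if unknown."""
    return TYPE_MAPPER.get(json_td_type.upper())


def accepts(json_td_type, value):
    """Return True when value is an instance of the mapped type(s)."""
    expected = python_type_for(json_td_type)
    return expected is not None and isinstance(value, expected)
```

Under these assumptions, both `accepts("column", "col1")` and `accepts("COLUMNS", Feature("f1"))` hold, while an integer argument is rejected.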
teradataml/utils/validators.py
CHANGED
@@ -1,3 +1,4 @@
+import enum
 import numbers
 import os
 import pandas as pd
@@ -11,6 +12,8 @@ from teradataml.options.configure import configure
 from teradataml.dataframe.sql_interfaces import ColumnExpression
 from functools import wraps, reduce
 
+from teradataml.utils.internal_buffer import _InternalBuffer
+
 
 def skip_validation():
     """
@@ -545,7 +548,7 @@ class _Validators:
                 raise TypeError("Third element in argument information matrix should be bool.")
 
             if not (isinstance(args[3], tuple) or isinstance(args[3], type) or
-                    isinstance(args[3], (_ListOf, _TupleOf))):
+                    isinstance(args[3], (_ListOf, _TupleOf)) or isinstance(args[3], enum.EnumMeta)):
                 err_msg = "Fourth element in argument information matrix should be a 'tuple of types' or 'type' type."
                 raise TypeError(err_msg)
 
@@ -1660,7 +1663,7 @@ class _Validators:
 
         # Check whether table exists on the system or not.
         table_exists = conn.dialect.has_table(conn, table_name=table_name,
-                                              schema=schema_name)
+                                              schema=schema_name, table_only=True)
 
         # If tables exists, return True.
         if table_exists:
@@ -2274,4 +2277,33 @@ class _Validators:
                                                            MessageCodes.INVALID_ARG_VALUE).format(ip_address, "ip_address",
                                                            'of four numbers (each between 0 and 255) separated by periods'))
 
+        return True
+
+
+    @staticmethod
+    @skip_validation()
+    def _check_auth_token(func_name):
+        """
+        DESCRIPTION:
+            Check if the user has set the authentication token.
+
+        PARAMETERS:
+            func_name:
+                Required Argument.
+                Specifies the function name where the authentication token is required.
+                Types: str
+
+        RAISES:
+            TeradataMLException
+
+        RETURNS:
+            None.
+
+        EXAMPLES:
+            >>> _Validators._check_auth_token("udf")
+        """
+        if _InternalBuffer.get("auth_token") is None:
+            raise TeradataMlException(Messages.get_message(MessageCodes.AUTH_TOKEN_REQUIRED,\
+                                                           func_name), MessageCodes.AUTH_TOKEN_REQUIRED)
+
         return True
{teradataml-20.0.0.1.dist-info → teradataml-20.0.0.3.dist-info}/METADATA
CHANGED
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: teradataml
-Version: 20.0.0.
+Version: 20.0.0.3
 Summary: Teradata Vantage Python package for Advanced Analytics
 Home-page: http://www.teradata.com/
 Author: Teradata Corporation
@@ -17,8 +17,8 @@ Classifier: Topic :: Database :: Front-Ends
 Classifier: License :: Other/Proprietary License
 Requires-Python: >=3.8
 Description-Content-Type: text/markdown
-Requires-Dist: teradatasql (>=
-Requires-Dist: teradatasqlalchemy (>=20.0.0.
+Requires-Dist: teradatasql (>=20.0.0.19)
+Requires-Dist: teradatasqlalchemy (>=20.0.0.3)
 Requires-Dist: pandas (>=0.22)
 Requires-Dist: psutil
 Requires-Dist: requests (>=2.25.1)
@@ -27,6 +27,8 @@ Requires-Dist: IPython (>=8.10.0)
 Requires-Dist: imbalanced-learn (>=0.8.0)
 Requires-Dist: pyjwt (>=2.8.0)
 Requires-Dist: cryptography (>=42.0.5)
+Requires-Dist: sqlalchemy (>=2.0)
+Requires-Dist: lightgbm (>=3.3.3)
 
 ## Teradata Python package for Advanced Analytics.
 
@@ -46,6 +48,312 @@ Copyright 2024, Teradata. All Rights Reserved.
 * [License](#license)
 
 ## Release Notes:
+
+#### teradataml 20.00.00.03
+
+* teradataml no longer supports setting the `auth_token` using `set_config_params()`. Users should use `set_auth_token()` to set the token.
+
+* ##### New Features/Functionality
+  * ###### teradataml: DataFrame
+    * New Function
+      * `alias()` - Creates a DataFrame with alias name.
+    * New Properties
+      * `db_object_name` - Get the underlying database object name, on which DataFrame is created.
+
+  * ###### teradataml: GeoDataFrame
+    * New Function
+      * `alias()` - Creates a GeoDataFrame with alias name.
+
+  * ###### teradataml: DataFrameColumn a.k.a. ColumnExpression
+    * _Arithmetic Functions_
+      * `DataFrameColumn.isnan()` - Function evaluates expression to determine if the floating-point
+        argument is a NaN (Not-a-Number) value.
+      * `DataFrameColumn.isinf()` - Function evaluates expression to determine if the floating-point
+        argument is an infinite number.
+      * `DataFrameColumn.isfinite()` - Function evaluates expression to determine if it is a finite
+        floating value.
+
+  * ###### FeatureStore - handles feature management within the Vantage environment
+    * FeatureStore Components
+      * Feature - Represents a feature which is used in ML Modeling.
+      * Entity - Represents the columns which serves as uniqueness for the data used in ML Modeling.
+      * DataSource - Represents the source of Data.
+      * FeatureGroup - Collection of Feature, Entity and DataSource.
+        * Methods
+          * `apply()` - Adds Feature, Entity, DataSource to a FeatureGroup.
+          * `from_DataFrame()` - Creates a FeatureGroup from teradataml DataFrame.
+          * `from_query()` - Creates a FeatureGroup using a SQL query.
+          * `remove()` - Removes Feature, Entity, or DataSource from a FeatureGroup.
+          * `reset_labels()` - Removes the labels assigned to the FeatureGroup, that are set using `set_labels()`.
+          * `set_labels()` - Sets the Features as labels for a FeatureGroup.
+        * Properties
+          * `features` - Get the features of a FeatureGroup.
+          * `labels` - Get the labels of FeatureGroup.
+    * FeatureStore
+      * Methods
+        * `apply()` - Adds Feature, Entity, DataSource, FeatureGroup to FeatureStore.
+        * `archive_data_source()` - Archives a specified DataSource from a FeatureStore.
+        * `archive_entity()` - Archives a specified Entity from a FeatureStore.
+        * `archive_feature()` - Archives a specified Feature from a FeatureStore.
+        * `archive_feature_group()` - Archives a specified FeatureGroup from a FeatureStore. Method archives underlying Feature, Entity, DataSource also.
+        * `delete_data_source()` - Deletes an archived DataSource.
+        * `delete_entity()` - Deletes an archived Entity.
+        * `delete_feature()` - Deletes an archived Feature.
+        * `delete_feature_group()` - Deletes an archived FeatureGroup.
+        * `get_data_source()` - Get the DataSources associated with FeatureStore.
+        * `get_dataset()` - Get the teradataml DataFrame based on Features, Entities and DataSource from FeatureGroup.
+        * `get_entity()` - Get the Entity associated with FeatureStore.
+        * `get_feature()` - Get the Feature associated with FeatureStore.
+        * `get_feature_group()` - Get the FeatureGroup associated with FeatureStore.
+        * `list_data_sources()` - List DataSources.
+        * `list_entities()` - List Entities.
+        * `list_feature_groups()` - List FeatureGroups.
+        * `list_features()` - List Features.
+        * `list_repos()` - List available repos which are configured for FeatureStore.
+        * `repair()` - Repairs the underlying FeatureStore schema on database.
+        * `set_features_active()` - Marks the Features as active.
+        * `set_features_inactive()` - Marks the Features as inactive.
+        * `setup()` - Setup the FeatureStore for a repo.
+      * Property
+        * `repo` - Property for FeatureStore repo.
+        * `grant` - Property to Grant access on FeatureStore to user.
+        * `revoke` - Property to Revoke access on FeatureStore from user.
+
+  * ###### teradataml: Table Operator Functions
+    * `Image2Matrix()` - Converts an image into a matrix.
+
+  * ###### teradataml: SQLE Engine Analytic Functions
+    * New Analytics Database Analytic Functions:
+      * `CFilter()`
+      * `NaiveBayes()`
+      * `TDNaiveBayesPredict()`
+      * `Shap()`
+      * `SMOTE()`
+
+  * ###### teradataml: Unbounded Array Framework (UAF) Functions
+    * New Unbounded Array Framework(UAF) Functions:
+      * `CopyArt()`
+
+  * ###### General functions
+    * Vantage File Management Functions
+      * `list_files()` - List the installed files in Database.
+
+  * ###### OpensourceML: LightGBM
+    * teradataml adds support for lightGBM package through `OpensourceML` (`OpenML`) feature.
+      The following functionality is added in the current release:
+      * `td_lightgbm` - Interface object to run lightgbm functions and classes through Teradata Vantage.
+        Example usage below:
+        ```
+        from teradataml import td_lightgbm, DataFrame
+
+        df_train = DataFrame("multi_model_classification")
+
+        feature_columns = ["col1", "col2", "col3", "col4"]
+        label_columns = ["label"]
+        part_columns = ["partition_column_1", "partition_column_2"]
+
+        df_x = df_train.select(feature_columns)
+        df_y = df_train.select(label_columns)
+
+        # Dataset creation.
+        # Single model case.
+        obj_s = td_lightgbm.Dataset(df_x, df_y, silent=True, free_raw_data=False)
+
+        # Multi model case.
+        obj_m = td_lightgbm.Dataset(df_x, df_y, free_raw_data=False, partition_columns=part_columns)
+        obj_m_v = td_lightgbm.Dataset(df_x, df_y, free_raw_data=False, partition_columns=part_columns)
+
+        ## Model training.
+        # Single model case.
+        opt = td_lightgbm.train(params={}, train_set = obj_s, num_boost_round=30)
+
+        opt.predict(data=df_x, num_iteration=20, pred_contrib=True)
+
+        # Multi model case.
+        opt = td_lightgbm.train(params={}, train_set = obj_m, num_boost_round=30,
+                                callbacks=[td_lightgbm.record_evaluation(rec)],
+                                valid_sets=[obj_m_v, obj_m_v])
+
+        # Passing `label` argument to get it returned in output DataFrame.
+        opt.predict(data=df_x, label=df_y, num_iteration=20)
+
+        ```
+      * Added support for accessing scikit-learn APIs using exposed inteface object `td_lightgbm`.
+
+    Refer Teradata Python Package User Guide for more details of this feature, arguments, usage, examples and supportability in Vantage.
+
+  * ###### teradataml: Functions
+    * `register()` - Registers a user defined function (UDF).
+    * `call_udf()` - Calls a registered user defined function (UDF) and returns ColumnExpression.
+    * `list_udfs()` - List all the UDFs registered using 'register()' function.
+    * `deregister()` - Deregisters a user defined function (UDF).
+
+  * ###### teradataml: Options
+    * Configuration Options
+      * `table_operator` - Specifies the name of table operator.
+
+* ##### Updates
+  * ###### General functions
+    * `set_auth_token()` - Added `base_url` parameter which accepts the CCP url.
+      'ues_url' will be deprecated in future and users
+      will need to specify 'base_url' instead.
+
+  * ###### teradataml: DataFrame function
+    * `join()`
+      * Now supports compound ColumExpression having more than one binary operator in `on` argument.
+      * Now supports ColumExpression containing FunctionExpression(s) in `on` argument.
+      * self-join now expects aliased DataFrame in `other` argument.
+
+  * ###### teradataml: GeoDataFrame function
+    * `join()`
+      * Now supports compound ColumExpression having more than one binary operator in `on` argument.
+      * Now supports ColumExpression containing FunctionExpression(s) in `on` argument.
+      * self-join now expects aliased DataFrame in `other` argument.
+
+  * ###### teradataml: Unbounded Array Framework (UAF) Functions
+    * `SAX()` - Default value added for `window_size` and `output_frequency`.
+    * `DickeyFuller()`
+      * Supports TDAnalyticResult as input.
+      * Default value added for `max_lags`.
+      * Removed parameter `drift_trend_formula`.
+      * Updated permitted values for `algorithm`.
+
+  * ##### teradataml: AutoML
+    * `AutoML`, `AutoRegressor` and `AutoClassifier`
+      * Now supports DECIMAL datatype as input.
+
+  * ##### teradataml: SQLE Engine Analytic Functions
+    * `TextParser()`
+      * Argument name `covert_to_lowercase` changed to `convert_to_lowercase`.
+
+* ##### Bug Fixes
+  * `db_list_tables()` now returns correct results when '%' is used.
+
+#### teradataml 20.00.00.02
+
+* teradataml will no longer be supported with SQLAlchemy < 2.0.
+* teradataml no longer shows the warnings from Vantage by default.
+  * Users should set `display.suppress_vantage_runtime_warnings` to `False` to display warnings.
+
+* ##### New Features/Functionality
+  * ##### teradataml: SQLE Engine Analytic Functions
+    * New Analytics Database Analytic Functions:
+      * `TFIDF()`
+      * `Pivoting()`
+      * `UnPivoting()`
+    * New Unbounded Array Framework(UAF) Functions:
+      * `AutoArima()`
+      * `DWT()`
+      * `DWT2D()`
+      * `FilterFactory1d()`
+      * `IDWT()`
+      * `IDWT2D()`
+      * `IQR()`
+      * `Matrix2Image()`
+      * `SAX()`
+      * `WindowDFFT()`
+  * ###### teradataml: Functions
+    * `udf()` - Creates a user defined function (UDF) and returns ColumnExpression.
+    * `set_session_param()` is added to set the database session parameters.
+    * `unset_session_param()` is added to unset database session parameters.
+
+  * ###### teradataml: DataFrame
+    * `materialize()` - Persists DataFrame into database for current session.
+    * `create_temp_view()` - Creates a temporary view for session on the DataFrame.
+
+  * ###### teradataml DataFrameColumn a.k.a. ColumnExpression
+    * _Date Time Functions_
+      * `DataFrameColumn.to_timestamp()` - Converts string or integer value to a TIMESTAMP data type or TIMESTAMP WITH TIME ZONE data type.
+      * `DataFrameColumn.extract()` - Extracts date component to a numeric value.
+      * `DataFrameColumn.to_interval()` - Converts a numeric value or string value into an INTERVAL_DAY_TO_SECOND or INTERVAL_YEAR_TO_MONTH value.
+    * _String Functions_
+      * `DataFrameColumn.parse_url()` - Extracts a part from a URL.
+    * _Arithmetic Functions_
+      * `DataFrameColumn.log` - Returns the logarithm value of the column with respect to 'base'.
+
+  * ##### teradataml: AutoML
+    * New methods added for `AutoML()`, `AutoRegressor()` and `AutoClassifier()`:
+      * `evaluate()` - Performs evaluation on the data using the best model or the model of users choice
+        from the leaderboard.
+      * `load()`: Loads the saved model from database.
+      * `deploy()`: Saves the trained model inside database.
+      * `remove_saved_model()`: Removes the saved model in database.
+      * `model_hyperparameters()`: Returns the hyperparameter of fitted or loaded models.
+
+* ##### Updates
+  * ##### teradataml: AutoML
+    * `AutoML()`, `AutoRegressor()`
+      * New performance metrics added for task type regression i.e., "MAPE", "MPE", "ME", "EV", "MPD" and "MGD".
+    * `AutoML()`, `AutoRegressor()` and `AutoClassifier`
+      * New arguments added: `volatile`, `persist`.
+      * `predict()` - Data input is now mandatory for generating predictions. Default model
+        evaluation is now removed.
+  * `DataFrameColumn.cast()`: Accepts 2 new arguments `format` and `timezone`.
+  * `DataFrame.assign()`: Accepts ColumnExpressions returned by `udf()`.
+
+  * ##### teradataml: Options
+    * `set_config_params()`
+      * Following arguments will be deprecated in the future:
+        * `ues_url`
+        * `auth_token`
+
+  * #### teradata DataFrame
+    * `to_pandas()` - Function returns the pandas dataframe with Decimal columns types as float instead of object.
+      If user want datatype to be object, set argument `coerce_float` to False.
+
+  * ###### Database Utility
+    * `list_td_reserved_keywords()` - Accepts a list of strings as argument.
+
+  * ##### Updates to existing UAF Functions:
+    * `ACF()` - `round_results` parameter removed as it was used for internal testing.
+    * `BreuschGodfrey()` - Added default_value 0.05 for parameter `significance_level`.
+    * `GoldfeldQuandt()` -
+      * Removed parameters `weights` and `formula`.
+        Replaced parameter `orig_regr_paramcnt` with `const_term`.
+        Changed description for parameter `algorithm`. Please refer document for more details.
+      * Note: This will break backward compatibility.
+    * `HoltWintersForecaster()` - Default value of parameter `seasonal_periods` removed.
+    * `IDFFT2()` - Removed parameter `output_fmt_row_major` as it is used for internal testing.
+    * `Resample()` - Added parameter `output_fmt_index_style`.
+
+* ##### Bug Fixes
+  * KNN `predict()` function can now predict on test data which does not contain target column.
+  * Metrics functions are supported on the Lake system.
+  * The following OpensourceML functions from different sklearn modules in single model case are fixed.
+    * `sklearn.ensemble`:
+      * ExtraTreesClassifier - `apply()`
+      * ExtraTreesRegressor - `apply()`
+      * RandomForestClassifier - `apply()`
+      * RandomForestRegressor - `apply()`
+    * `sklearn.impute`:
+      * SimpleImputer - `transform()`, `fit_transform()`, `inverse_transform()`
+      * MissingIndicator - `transform()`, `fit_transform()`
+    * `sklearn.kernel_approximations`:
+      * Nystroem - `transform()`, `fit_transform()`
+      * PolynomialCountSketch - `transform()`, `fit_transform()`
+      * RBFSampler - `transform()`, `fit_transform()`
+    * `sklearn.neighbors`:
+      * KNeighborsTransformer - `transform()`, `fit_transform()`
+      * RadiusNeighborsTransformer - `transform()`, `fit_transform()`
+    * `sklearn.preprocessing`:
+      * KernelCenterer - `transform()`
+      * OneHotEncoder - `transform()`, `inverse_transform()`
+  * The following OpensourceML functions from different sklearn modules in multi model case are fixed.
+    * `sklearn.feature_selection`:
+      * SelectFpr - `transform()`, `fit_transform()`, `inverse_transform()`
+      * SelectFdr - `transform()`, `fit_transform()`, `inverse_transform()`
+      * SelectFromModel - `transform()`, `fit_transform()`, `inverse_transform()`
+      * SelectFwe - `transform()`, `fit_transform()`, `inverse_transform()`
+      * RFECV - `transform()`, `fit_transform()`, `inverse_transform()`
+    * `sklearn.clustering`:
+      * Birch - `transform()`, `fit_transform()`
+  * OpensourceML returns teradataml objects for model attributes and functions instead of sklearn
+    objects so that the user can perform further operations like `score()`, `predict()` etc on top
+    of the returned objects.
+  * AutoML `predict()` function now generates correct ROC-AUC value for positive class.
+  * `deploy()` method of `Script` and `Apply` classes retries model deployment if there is any
+    intermittent network issues.
+
 #### teradataml 20.00.00.01
 * teradataml no longer supports Python versions less than 3.8.
 
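The METADATA hunks above raise the minimum versions of `teradatasql` and `teradatasqlalchemy` and add `sqlalchemy` and `lightgbm` as dependencies. The following is a minimal sketch, using only the standard library, of how such `Requires-Dist` lines could be compared against an installed version before upgrading. `parse_requires_dist()`, `version_tuple()` and `meets_floor()` are hypothetical helpers. They handle only plain dotted numeric versions with a single `>=` specifier, not the full PEP 440 grammar.

```python
import re

# Requires-Dist lines as they appear on the '+' side of the METADATA diff.
REQUIRES = [
    "Requires-Dist: teradatasql (>=20.0.0.19)",
    "Requires-Dist: teradatasqlalchemy (>=20.0.0.3)",
    "Requires-Dist: sqlalchemy (>=2.0)",
    "Requires-Dist: lightgbm (>=3.3.3)",
]

# Name, then an optional "(>=X.Y.Z)" minimum-version clause.
_PATTERN = re.compile(
    r"^Requires-Dist:\s*([A-Za-z0-9_.\-]+)\s*(?:\(>=\s*([0-9.]+)\))?\s*$")


def parse_requires_dist(line):
    """Return (package_name, minimum_version_or_None) for one line."""
    match = _PATTERN.match(line)
    if match is None:
        raise ValueError(f"Unsupported Requires-Dist line: {line!r}")
    return match.group(1), match.group(2)


def version_tuple(version):
    """Convert '20.0.0.19' to (20, 0, 0, 19); numeric parts only."""
    return tuple(int(part) for part in version.split("."))


def meets_floor(installed, floor):
    """True when installed >= floor, padding the shorter version with zeros."""
    if floor is None:
        return True
    a, b = version_tuple(installed), version_tuple(floor)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

In practice the installed version string could be obtained with `importlib.metadata.version("sqlalchemy")`. Versions with non-numeric suffixes (for example release candidates) would need a proper version parser instead of `version_tuple()`.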