teradataml 20.0.0.1__py3-none-any.whl → 20.0.0.2__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (200)
  1. teradataml/LICENSE.pdf +0 -0
  2. teradataml/README.md +112 -0
  3. teradataml/__init__.py +6 -3
  4. teradataml/_version.py +1 -1
  5. teradataml/analytics/__init__.py +3 -2
  6. teradataml/analytics/analytic_function_executor.py +224 -16
  7. teradataml/analytics/analytic_query_generator.py +92 -0
  8. teradataml/analytics/byom/__init__.py +3 -2
  9. teradataml/analytics/json_parser/metadata.py +1 -0
  10. teradataml/analytics/json_parser/utils.py +6 -4
  11. teradataml/analytics/meta_class.py +40 -1
  12. teradataml/analytics/sqle/DecisionTreePredict.py +1 -1
  13. teradataml/analytics/sqle/__init__.py +10 -2
  14. teradataml/analytics/table_operator/__init__.py +3 -2
  15. teradataml/analytics/uaf/__init__.py +21 -2
  16. teradataml/analytics/utils.py +62 -1
  17. teradataml/analytics/valib.py +1 -1
  18. teradataml/automl/__init__.py +1502 -323
  19. teradataml/automl/custom_json_utils.py +139 -61
  20. teradataml/automl/data_preparation.py +245 -306
  21. teradataml/automl/data_transformation.py +32 -12
  22. teradataml/automl/feature_engineering.py +313 -82
  23. teradataml/automl/model_evaluation.py +44 -35
  24. teradataml/automl/model_training.py +109 -146
  25. teradataml/catalog/byom.py +8 -8
  26. teradataml/clients/pkce_client.py +1 -1
  27. teradataml/common/constants.py +37 -0
  28. teradataml/common/deprecations.py +13 -7
  29. teradataml/common/garbagecollector.py +151 -120
  30. teradataml/common/messagecodes.py +4 -1
  31. teradataml/common/messages.py +2 -1
  32. teradataml/common/sqlbundle.py +1 -1
  33. teradataml/common/utils.py +97 -11
  34. teradataml/common/wrapper_utils.py +1 -1
  35. teradataml/context/context.py +72 -2
  36. teradataml/data/complaints_test_tokenized.csv +353 -0
  37. teradataml/data/complaints_tokens_model.csv +348 -0
  38. teradataml/data/covid_confirm_sd.csv +83 -0
  39. teradataml/data/dataframe_example.json +10 -0
  40. teradataml/data/docs/sqle/docs_17_20/CFilter.py +132 -0
  41. teradataml/data/docs/sqle/docs_17_20/NaiveBayes.py +162 -0
  42. teradataml/data/docs/sqle/docs_17_20/OutlierFilterFit.py +2 -0
  43. teradataml/data/docs/sqle/docs_17_20/Pivoting.py +279 -0
  44. teradataml/data/docs/sqle/docs_17_20/Shap.py +197 -0
  45. teradataml/data/docs/sqle/docs_17_20/TDNaiveBayesPredict.py +189 -0
  46. teradataml/data/docs/sqle/docs_17_20/TFIDF.py +142 -0
  47. teradataml/data/docs/sqle/docs_17_20/Unpivoting.py +216 -0
  48. teradataml/data/docs/uaf/docs_17_20/ACF.py +1 -10
  49. teradataml/data/docs/uaf/docs_17_20/ArimaEstimate.py +1 -1
  50. teradataml/data/docs/uaf/docs_17_20/ArimaForecast.py +35 -5
  51. teradataml/data/docs/uaf/docs_17_20/ArimaValidate.py +3 -1
  52. teradataml/data/docs/uaf/docs_17_20/ArimaXEstimate.py +293 -0
  53. teradataml/data/docs/uaf/docs_17_20/AutoArima.py +354 -0
  54. teradataml/data/docs/uaf/docs_17_20/BreuschGodfrey.py +3 -2
  55. teradataml/data/docs/uaf/docs_17_20/BreuschPaganGodfrey.py +1 -1
  56. teradataml/data/docs/uaf/docs_17_20/Convolve.py +13 -10
  57. teradataml/data/docs/uaf/docs_17_20/Convolve2.py +4 -1
  58. teradataml/data/docs/uaf/docs_17_20/CumulPeriodogram.py +5 -4
  59. teradataml/data/docs/uaf/docs_17_20/DFFT2Conv.py +4 -4
  60. teradataml/data/docs/uaf/docs_17_20/DWT.py +235 -0
  61. teradataml/data/docs/uaf/docs_17_20/DWT2D.py +214 -0
  62. teradataml/data/docs/uaf/docs_17_20/DurbinWatson.py +1 -1
  63. teradataml/data/docs/uaf/docs_17_20/ExtractResults.py +1 -1
  64. teradataml/data/docs/uaf/docs_17_20/FilterFactory1d.py +160 -0
  65. teradataml/data/docs/uaf/docs_17_20/GenseriesSinusoids.py +1 -1
  66. teradataml/data/docs/uaf/docs_17_20/GoldfeldQuandt.py +9 -31
  67. teradataml/data/docs/uaf/docs_17_20/HoltWintersForecaster.py +4 -2
  68. teradataml/data/docs/uaf/docs_17_20/IDFFT2.py +1 -8
  69. teradataml/data/docs/uaf/docs_17_20/IDWT.py +236 -0
  70. teradataml/data/docs/uaf/docs_17_20/IDWT2D.py +226 -0
  71. teradataml/data/docs/uaf/docs_17_20/IQR.py +134 -0
  72. teradataml/data/docs/uaf/docs_17_20/LineSpec.py +1 -1
  73. teradataml/data/docs/uaf/docs_17_20/LinearRegr.py +2 -2
  74. teradataml/data/docs/uaf/docs_17_20/MAMean.py +3 -3
  75. teradataml/data/docs/uaf/docs_17_20/Matrix2Image.py +297 -0
  76. teradataml/data/docs/uaf/docs_17_20/MatrixMultiply.py +15 -6
  77. teradataml/data/docs/uaf/docs_17_20/PACF.py +0 -1
  78. teradataml/data/docs/uaf/docs_17_20/Portman.py +2 -2
  79. teradataml/data/docs/uaf/docs_17_20/PowerSpec.py +2 -2
  80. teradataml/data/docs/uaf/docs_17_20/Resample.py +9 -1
  81. teradataml/data/docs/uaf/docs_17_20/SAX.py +246 -0
  82. teradataml/data/docs/uaf/docs_17_20/SeasonalNormalize.py +17 -10
  83. teradataml/data/docs/uaf/docs_17_20/SignifPeriodicities.py +1 -1
  84. teradataml/data/docs/uaf/docs_17_20/WhitesGeneral.py +3 -1
  85. teradataml/data/docs/uaf/docs_17_20/WindowDFFT.py +368 -0
  86. teradataml/data/dwt2d_dataTable.csv +65 -0
  87. teradataml/data/dwt_dataTable.csv +8 -0
  88. teradataml/data/dwt_filterTable.csv +3 -0
  89. teradataml/data/finance_data4.csv +13 -0
  90. teradataml/data/grocery_transaction.csv +19 -0
  91. teradataml/data/idwt2d_dataTable.csv +5 -0
  92. teradataml/data/idwt_dataTable.csv +8 -0
  93. teradataml/data/idwt_filterTable.csv +3 -0
  94. teradataml/data/interval_data.csv +5 -0
  95. teradataml/data/jsons/paired_functions.json +14 -0
  96. teradataml/data/jsons/sqle/17.20/TD_CFilter.json +118 -0
  97. teradataml/data/jsons/sqle/17.20/TD_NaiveBayes.json +193 -0
  98. teradataml/data/jsons/sqle/17.20/TD_NaiveBayesPredict.json +212 -0
  99. teradataml/data/jsons/sqle/17.20/TD_OneClassSVM.json +9 -9
  100. teradataml/data/jsons/sqle/17.20/TD_Pivoting.json +280 -0
  101. teradataml/data/jsons/sqle/17.20/TD_Shap.json +222 -0
  102. teradataml/data/jsons/sqle/17.20/TD_TFIDF.json +162 -0
  103. teradataml/data/jsons/sqle/17.20/TD_Unpivoting.json +235 -0
  104. teradataml/data/jsons/storedprocedure/17.20/TD_FILTERFACTORY1D.json +150 -0
  105. teradataml/data/jsons/uaf/17.20/TD_ACF.json +1 -18
  106. teradataml/data/jsons/uaf/17.20/TD_ARIMAESTIMATE.json +3 -16
  107. teradataml/data/jsons/uaf/17.20/TD_ARIMAFORECAST.json +0 -3
  108. teradataml/data/jsons/uaf/17.20/TD_ARIMAVALIDATE.json +5 -3
  109. teradataml/data/jsons/uaf/17.20/TD_ARIMAXESTIMATE.json +362 -0
  110. teradataml/data/jsons/uaf/17.20/TD_AUTOARIMA.json +469 -0
  111. teradataml/data/jsons/uaf/17.20/TD_BINARYMATRIXOP.json +0 -3
  112. teradataml/data/jsons/uaf/17.20/TD_BINARYSERIESOP.json +0 -2
  113. teradataml/data/jsons/uaf/17.20/TD_BREUSCH_GODFREY.json +2 -1
  114. teradataml/data/jsons/uaf/17.20/TD_BREUSCH_PAGAN_GODFREY.json +2 -5
  115. teradataml/data/jsons/uaf/17.20/TD_CONVOLVE.json +3 -6
  116. teradataml/data/jsons/uaf/17.20/TD_CONVOLVE2.json +1 -3
  117. teradataml/data/jsons/uaf/17.20/TD_CUMUL_PERIODOGRAM.json +0 -5
  118. teradataml/data/jsons/uaf/17.20/TD_DFFT.json +1 -4
  119. teradataml/data/jsons/uaf/17.20/TD_DFFT2.json +2 -7
  120. teradataml/data/jsons/uaf/17.20/TD_DFFT2CONV.json +1 -2
  121. teradataml/data/jsons/uaf/17.20/TD_DFFTCONV.json +0 -2
  122. teradataml/data/jsons/uaf/17.20/TD_DTW.json +3 -6
  123. teradataml/data/jsons/uaf/17.20/TD_DWT.json +173 -0
  124. teradataml/data/jsons/uaf/17.20/TD_DWT2D.json +160 -0
  125. teradataml/data/jsons/uaf/17.20/TD_FITMETRICS.json +1 -1
  126. teradataml/data/jsons/uaf/17.20/TD_GOLDFELD_QUANDT.json +16 -30
  127. teradataml/data/jsons/uaf/17.20/{TD_HOLT_WINTERS_FORECAST.json → TD_HOLT_WINTERS_FORECASTER.json} +1 -2
  128. teradataml/data/jsons/uaf/17.20/TD_IDFFT2.json +1 -15
  129. teradataml/data/jsons/uaf/17.20/TD_IDWT.json +162 -0
  130. teradataml/data/jsons/uaf/17.20/TD_IDWT2D.json +149 -0
  131. teradataml/data/jsons/uaf/17.20/TD_IQR.json +117 -0
  132. teradataml/data/jsons/uaf/17.20/TD_LINEAR_REGR.json +1 -1
  133. teradataml/data/jsons/uaf/17.20/TD_LINESPEC.json +1 -1
  134. teradataml/data/jsons/uaf/17.20/TD_MAMEAN.json +1 -3
  135. teradataml/data/jsons/uaf/17.20/TD_MATRIX2IMAGE.json +209 -0
  136. teradataml/data/jsons/uaf/17.20/TD_PACF.json +2 -2
  137. teradataml/data/jsons/uaf/17.20/TD_POWERSPEC.json +5 -5
  138. teradataml/data/jsons/uaf/17.20/TD_RESAMPLE.json +48 -28
  139. teradataml/data/jsons/uaf/17.20/TD_SAX.json +208 -0
  140. teradataml/data/jsons/uaf/17.20/TD_SEASONALNORMALIZE.json +12 -6
  141. teradataml/data/jsons/uaf/17.20/TD_SIMPLEEXP.json +0 -1
  142. teradataml/data/jsons/uaf/17.20/TD_TRACKINGOP.json +8 -8
  143. teradataml/data/jsons/uaf/17.20/TD_UNDIFF.json +1 -1
  144. teradataml/data/jsons/uaf/17.20/TD_UNNORMALIZE.json +1 -1
  145. teradataml/data/jsons/uaf/17.20/TD_WINDOWDFFT.json +400 -0
  146. teradataml/data/load_example_data.py +8 -2
  147. teradataml/data/naivebayestextclassifier_example.json +1 -1
  148. teradataml/data/naivebayestextclassifierpredict_example.json +11 -0
  149. teradataml/data/peppers.png +0 -0
  150. teradataml/data/real_values.csv +14 -0
  151. teradataml/data/sax_example.json +8 -0
  152. teradataml/data/scripts/deploy_script.py +1 -1
  153. teradataml/data/scripts/sklearn/sklearn_fit.py +17 -10
  154. teradataml/data/scripts/sklearn/sklearn_fit_predict.py +2 -2
  155. teradataml/data/scripts/sklearn/sklearn_function.template +30 -7
  156. teradataml/data/scripts/sklearn/sklearn_neighbors.py +1 -1
  157. teradataml/data/scripts/sklearn/sklearn_score.py +12 -3
  158. teradataml/data/scripts/sklearn/sklearn_transform.py +55 -4
  159. teradataml/data/star_pivot.csv +8 -0
  160. teradataml/data/templates/open_source_ml.json +2 -1
  161. teradataml/data/teradataml_example.json +20 -1
  162. teradataml/data/timestamp_data.csv +4 -0
  163. teradataml/data/titanic_dataset_unpivoted.csv +19 -0
  164. teradataml/data/uaf_example.json +55 -1
  165. teradataml/data/unpivot_example.json +15 -0
  166. teradataml/data/url_data.csv +9 -0
  167. teradataml/data/windowdfft.csv +16 -0
  168. teradataml/dataframe/copy_to.py +1 -1
  169. teradataml/dataframe/data_transfer.py +5 -3
  170. teradataml/dataframe/dataframe.py +474 -41
  171. teradataml/dataframe/fastload.py +3 -3
  172. teradataml/dataframe/functions.py +339 -0
  173. teradataml/dataframe/row.py +160 -0
  174. teradataml/dataframe/setop.py +2 -2
  175. teradataml/dataframe/sql.py +658 -20
  176. teradataml/dataframe/window.py +1 -1
  177. teradataml/dbutils/dbutils.py +322 -16
  178. teradataml/geospatial/geodataframe.py +1 -1
  179. teradataml/geospatial/geodataframecolumn.py +1 -1
  180. teradataml/hyperparameter_tuner/optimizer.py +13 -13
  181. teradataml/lib/aed_0_1.dll +0 -0
  182. teradataml/opensource/sklearn/_sklearn_wrapper.py +154 -69
  183. teradataml/options/__init__.py +3 -1
  184. teradataml/options/configure.py +14 -2
  185. teradataml/options/display.py +2 -2
  186. teradataml/plot/axis.py +4 -4
  187. teradataml/scriptmgmt/UserEnv.py +10 -6
  188. teradataml/scriptmgmt/lls_utils.py +3 -2
  189. teradataml/table_operators/Script.py +2 -2
  190. teradataml/table_operators/TableOperator.py +106 -20
  191. teradataml/table_operators/table_operator_util.py +88 -41
  192. teradataml/table_operators/templates/dataframe_udf.template +63 -0
  193. teradataml/telemetry_utils/__init__.py +0 -0
  194. teradataml/telemetry_utils/queryband.py +52 -0
  195. teradataml/utils/validators.py +1 -1
  196. {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.2.dist-info}/METADATA +115 -2
  197. {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.2.dist-info}/RECORD +200 -140
  198. {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.2.dist-info}/WHEEL +0 -0
  199. {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.2.dist-info}/top_level.txt +0 -0
  200. {teradataml-20.0.0.1.dist-info → teradataml-20.0.0.2.dist-info}/zip-safe +0 -0
@@ -16,7 +16,7 @@ import os
 import time
 import uuid
 from math import floor
-import tarfile
+import warnings
 import subprocess
 from pathlib import Path
 import teradataml.dataframe as tdmldf
@@ -1012,18 +1012,24 @@ class TableOperator:
             repr_string = "{}\n\n{}".format(repr_string, self.result)
         return repr_string
 
-    def deploy(self, model_column, partition_columns=None, model_file_prefix=None):
+    def deploy(self, model_column, partition_columns=None, model_file_prefix=None, retry=3,
+               retry_timeout=30):
         """
         DESCRIPTION:
-            Function deploys the model generated after running `execute_script()` in database in
+            Function deploys the models generated after running `execute_script()` in database in
             VantageCloud Enterprise or in user environment in VantageCloud Lake.
             If deployed files are not needed, these files can be removed using `remove_file()` in
-            database or `<user_env>.remove_file()` in lake.
+            database or `UserEnv.remove_file()` in lake.
+
+            Note:
+                If the models (one or many) fail to get deployed in Vantage even after retries,
+                try deploying them again using `install_file()` function or remove installed
+                files using `remove_file()` function.
 
         PARAMETERS:
             model_column:
                 Required Argument.
-                Specifies the column name in which model is present.
+                Specifies the column name in which models are present.
                 Supported types of model in this column are CLOB and BLOB.
                 Note:
                     The column mentioned in this argument should be present in
@@ -1051,11 +1057,27 @@ class TableOperator:
                 with underscore(_) to generate model file names.
                 Types: str
 
+            retry:
+                Optional Argument.
+                Specifies the maximum number of retries to be made to deploy the models.
+                This argument helps in retrying the deployment of models in case of network issues.
+                This argument should be a positive integer.
+                Default Value: 3
+                Types: int
+
+            retry_timeout:
+                Optional Argument. Used along with retry argument. Ignored otherwise.
+                Specifies the time interval in seconds between each retry.
+                This argument should be a positive integer.
+                Default Value: 30
+                Types: int
+
         RETURNS:
             List of generated file identifiers in database or file names in lake.
 
         RAISES:
-            TeradatamlException
+            - TeradatamlException
+            - Throws warning when models failed to deploy even after retries.
 
         EXAMPLES:
             >>> import teradataml
@@ -1139,7 +1161,24 @@ class TableOperator:
              'my_prefix_new__0_11',
              'my_prefix_new__1_10',
              'my_prefix_new__1_11']
-
+
+            # Example 5: Assuming that 2 model files fail to get installed due to network issues,
+            #            the function retries installing the failed files twice with timeout between
+            #            retries of 10 secs.
+            >>> opt = obj.deploy(model_column="model", model_file_prefix="my_prefix_",
+                                 partition_columns=["partition_column_1", "partition_column_2"],
+                                 retry=2, retry_timeout=10)
+            RuntimeWarning: The following model files failed to get installed in Vantage:
+            ['my_prefix__1_10', 'my_prefix__1_11'].
+            Try manually deploying them from the path '<temp_path>' using:
+             - `install_file()` when connected to Enterprise/On-Prem system or
+             - `UserEnv.install_file()` when connected to Lake system.
+            OR
+            Remove the returned installed files manually using `remove_file()` or `UserEnv.remove_file()`.
+            >>> opt
+            ['my_prefix__0_10',
+             'my_prefix__0_11']
+
             ## Run in VantageCloud Lake using Apply object.
             # Let's assume an user environment named "user_env" already exists in VantageCloud Lake,
             # which will be used for the examples below.
@@ -1168,7 +1207,7 @@ class TableOperator:
             1    10    b'abdwcd.....dWIz'
             1    11    b'gA4jc4.....agfu'
 
-            # Example 5: Provide both "partition_columns" and "model_file_prefix" arguments.
+            # Example 6: Provide both "partition_columns" and "model_file_prefix" arguments.
             >>> obj.deploy(model_column="model", model_file_prefix="my_prefix_",
                            partition_columns=["partition_column_1", "partition_column_2"])
             ['my_prefix__0_10',
@@ -1183,8 +1222,13 @@ class TableOperator:
         arg_info_matrix.append(["model_column", model_column, False, (str)])
         arg_info_matrix.append(["partition_columns", partition_columns, True, (str, list)])
         arg_info_matrix.append(["model_file_prefix", model_file_prefix, True, (str)])
+        arg_info_matrix.append(["retry", retry, True, (int)])
+        arg_info_matrix.append(["retry_timeout", retry_timeout, True, (int)])
         _Validators._validate_function_arguments(arg_info_matrix)
 
+        _Validators._validate_positive_int(retry, "retry", lbound_inclusive=True)
+        _Validators._validate_positive_int(retry_timeout, "retry_timeout", lbound_inclusive=True)
+
         if self.result is None:
             return "Result is empty. Please run execute_script first."
@@ -1212,11 +1256,29 @@ class TableOperator:
        model_column_type = data._td_column_names_and_sqlalchemy_types[model_column.lower()].__class__.__name__
 
        n_models = len(vals)
-       all_files = []
 
        # Default location for .teradataml is user's home directory if configure.local_storage is not set.
        tempdir = GarbageCollector._get_temp_dir_name()
 
+       def __install_file(model_file, model_file_path):
+           """
+           Function to install the model file in Vantage and return the status.
+           """
+           file_installed = True
+           try:
+               if self.__class__.__name__ == "Script":
+                   from teradataml.dbutils.filemgr import install_file
+                   install_file(file_identifier=model_file, file_path=model_file_path,
+                                is_binary=True, suppress_output=True, replace=True)
+               elif self.__class__.__name__ == "Apply":
+                   self.env.install_file(file_path=model_file_path, suppress_output=True, replace=True)
+           except Exception as e:
+               file_installed = False
+           return file_installed
+
+       installed_files = []
+       failed_files = []
+
        for i, row in enumerate(vals):
            model = row[0]
            partition_values = ""
@@ -1241,15 +1303,39 @@ class TableOperator:
            with open(model_file_path, "wb") as f:
                f.write(model)
 
-           if self.__class__.__name__ == "Script":
-               from teradataml import install_file
-               install_file(file_identifier=model_file, file_path=model_file_path,
-                            is_binary=True, suppress_output=True)
-           elif self.__class__.__name__ == "Apply":
-               self.env.install_file(file_path=model_file_path)
-
-           all_files.append(model_file)
+           file_installed = __install_file(model_file, model_file_path)
 
-           os.remove(model_file_path)
-
-       return all_files
+           if file_installed:
+               installed_files.append(model_file)
+               os.remove(model_file_path)
+           else:
+               # File failed to get installed in Vantage. Hence, keeping the file in tempdir.
+               failed_files.append(model_file)
+
+       while retry and failed_files:
+           # If there are any failed files and retry is not zero, retry installing the failed files.
+           time.sleep(retry_timeout)
+           retry_failed_files = []
+           for model_file in failed_files:
+               model_file_path = os.path.join(tempdir, model_file)
+               file_installed = __install_file(model_file, model_file_path)
+
+               if file_installed:
+                   installed_files.append(model_file)
+                   os.remove(model_file_path)
+               else:
+                   # File failed to get installed in Vantage. Hence, keeping the file in tempdir.
+                   retry_failed_files.append(model_file)
+           failed_files = retry_failed_files
+           retry -= 1
+
+       if failed_files:
+           failed_files.sort()
+           warning_message = "The following model files failed to get installed in Vantage:\n" + str(failed_files) + ".\n"
+           warning_message += "Try manually deploying them from the path '" + tempdir + "' using:\n"
+           warning_message += " - `install_file()` when connected to Enterprise/On-Prem system or\n"
+           warning_message += " - `UserEnv.install_file()` when connected to Lake system.\n"
+           warning_message += "OR\nRemove the returned installed files manually using `remove_file()` or `UserEnv.remove_file()`."
+           warnings.warn(RuntimeWarning(warning_message))
+
+       return installed_files
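The install-then-retry flow added above can be sketched in isolation. This is a minimal, hypothetical version: `install` stands in for `install_file()` / `UserEnv.install_file()`, and the function name `deploy_with_retry` is illustrative, not part of teradataml.

```python
import time

def deploy_with_retry(files, install, retry=3, retry_timeout=30):
    """Try to install every file once; then retry only the failures up to
    `retry` more rounds, sleeping `retry_timeout` seconds between rounds.
    Returns (installed, failed), mirroring the deploy() flow in the diff."""
    installed, failed = [], []
    for f in files:
        (installed if install(f) else failed).append(f)
    while retry and failed:
        time.sleep(retry_timeout)
        still_failed = []
        for f in failed:
            (installed if install(f) else still_failed).append(f)
        failed = still_failed
        retry -= 1
    return installed, failed
```

Note that, as in the diff, the retry counter bounds the number of extra rounds, not the number of per-file attempts, and files that still fail after all rounds are simply reported rather than raising.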
@@ -9,8 +9,7 @@
 # Description: Utilities for Table Operators.                      #
 #                                                                  #
 # ##################################################################
-
-import os
+import os, json
 import teradataml.dataframe as tdmldf
 from teradataml.common.constants import TableOperatorConstants, \
     TeradataConstants, OutputStyle
@@ -24,7 +23,7 @@ from teradataml.scriptmgmt.lls_utils import get_env
 from teradataml.utils.utils import execute_sql
 from teradataml.utils.validators import _Validators
 from functools import partial
-from inspect import isfunction
+from inspect import isfunction, getsource
 
 
 class _TableOperatorUtils:
@@ -281,12 +280,19 @@ class _TableOperatorUtils:
             self.__validate()
         """
         # Validate the user defined function.
-        if not (isfunction(self.user_function) or
-                isinstance(self.user_function, partial)):
-            raise TypeError(Messages.get_message(
-                MessageCodes.UNSUPPORTED_DATATYPE, 'user_function',
-                "'function' or 'functools.partial'")
-            )
+
+        if self.operation == TableOperatorConstants.UDF_OP.value:
+            for udf_function in self.user_function:
+                if not isfunction(udf_function):
+                    raise TypeError(Messages.get_message(
+                        MessageCodes.UNSUPPORTED_DATATYPE, 'user_function', "'function'"))
+        else:
+            if not (isfunction(self.user_function) or
+                    isinstance(self.user_function, partial)):
+                raise TypeError(Messages.get_message(
+                    MessageCodes.UNSUPPORTED_DATATYPE, 'user_function',
+                    "'function' or 'functools.partial'")
+                )
 
         if arg_info_matrix is None:
             arg_info_matrix = []
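The branching validation above accepts a list of plain functions on the UDF path, and a single function or `functools.partial` otherwise. A standalone sketch of that rule (the helper name and error strings here are illustrative; the real code routes through `Messages.get_message`):

```python
from functools import partial
from inspect import isfunction

def validate_user_function(user_function, is_udf_op):
    """Mirror the UDF-aware validation branch from _TableOperatorUtils."""
    if is_udf_op:
        # UDF path: expects an iterable of plain functions.
        for f in user_function:
            if not isfunction(f):
                raise TypeError("user_function must contain 'function' objects")
    # map_row/map_partition/apply path: one function or functools.partial.
    elif not (isfunction(user_function) or isinstance(user_function, partial)):
        raise TypeError("user_function must be 'function' or 'functools.partial'")
```

Note `isfunction()` is False for a `functools.partial`, which is why the non-UDF branch checks both; the UDF branch deliberately rejects partials because their source cannot be pasted into the generated script.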
@@ -349,37 +355,73 @@ class _TableOperatorUtils:
                 os.path.dirname(os.path.abspath(__file__))),
             "table_operators",
             "templates")
-
-        template_name = TableOperatorConstants.APPLY_TEMPLATE.value if \
-            self.operation == TableOperatorConstants.APPLY_OP.value else TableOperatorConstants.MAP_TEMPLATE.value
+        # Get the template.
+        template = {TableOperatorConstants.APPLY_OP.value: TableOperatorConstants.APPLY_TEMPLATE.value,
+                    TableOperatorConstants.UDF_OP.value: TableOperatorConstants.UDF_TEMPLATE.value}
+        template_name = template.get(self.operation, TableOperatorConstants.MAP_TEMPLATE.value)
         # Write to the script based on the template.
         try:
             with open(os.path.join(template_dir, template_name), 'r') as input_file:
                 with open(self.script_path, 'w') as output_file:
-                    output_file.write(
-                        input_file.read().format(
-                            DELIMITER=UtilFuncs._serialize_and_encode(
-                                self.delimiter),
-                            STO_OPERATION=UtilFuncs._serialize_and_encode(
-                                self.operation),
-                            USER_DEF_FUNC=UtilFuncs._serialize_and_encode(
-                                self.user_function),
-                            DF_COL_NAMES_LIST=UtilFuncs._serialize_and_encode(
-                                self.data.columns),
-                            DF_COL_TYPES_LIST=UtilFuncs._serialize_and_encode(
-                                python_input_col_types),
-                            OUTPUT_COL_NAMES_LIST=UtilFuncs._serialize_and_encode(
-                                list(self.returns.keys())),
-                            OUTPUT_CONVERTERS=UtilFuncs._serialize_and_encode(
-                                output_converters),
-                            QUOTECHAR=UtilFuncs._serialize_and_encode(
-                                self.quotechar),
-                            INPUT_CONVERTERS=UtilFuncs._serialize_and_encode(
-                                input_converters),
-                            CHUNK_SIZE=UtilFuncs._serialize_and_encode(
-                                self.chunk_size)
+                    if self.operation == TableOperatorConstants.UDF_OP.value:
+
+                        # Function can have udf as decorator. Remove that.
+                        # With the notation
+                        #     @udf
+                        #     def to_upper(s):
+                        #         return s.upper()
+                        # the source code includes the decorator line.
+                        # But if the notation
+                        #     f = udf(to_upper)
+                        # is used, the source code will not have udf.
+                        # So, remove the first line if the first notation is used.
+                        # For both notations, strip any extra leading whitespace from
+                        # the function definition.
+                        # If multiple UDFs are present, append them as a single string.
+
+                        user_function_code = ""
+                        for udf_code in self.user_function:
+                            udf_code = getsource(udf_code)
+                            udf_code = udf_code.lstrip()
+                            if udf_code.startswith("@"):
+                                udf_code = udf_code[udf_code.find("\n")+1:].lstrip()
+                            user_function_code += udf_code + '\n'
+
+                        output_file.write(input_file.read().format(
+                            DELIMITER=self.delimiter,
+                            QUOTECHAR=self.quotechar,
+                            FUNCTION_DEFINITION=user_function_code,
+                            FUNCTION_ARGS=str(self.function_args),
+                            INPUT_COLUMNS=json.dumps(self.data.columns),
+                            OUTPUT_COLUMNS=json.dumps(list(self.returns.keys())),
+                            COLUMNS_DEFINITIONS=json.dumps(self.columns_definitions),
+                            OUTPUT_TYPE_CONVERTERS=json.dumps(self.output_type_converters)
+                        ))
+                    else:
+                        # Prepare script file from template file for map_row and map_partition.
+                        output_file.write(
+                            input_file.read().format(
+                                DELIMITER=UtilFuncs._serialize_and_encode(
+                                    self.delimiter),
+                                STO_OPERATION=UtilFuncs._serialize_and_encode(
+                                    self.operation),
+                                USER_DEF_FUNC=UtilFuncs._serialize_and_encode(
+                                    self.user_function),
+                                DF_COL_NAMES_LIST=UtilFuncs._serialize_and_encode(
+                                    self.data.columns),
+                                DF_COL_TYPES_LIST=UtilFuncs._serialize_and_encode(
+                                    python_input_col_types),
+                                OUTPUT_COL_NAMES_LIST=UtilFuncs._serialize_and_encode(
+                                    list(self.returns.keys())),
+                                OUTPUT_CONVERTERS=UtilFuncs._serialize_and_encode(
+                                    output_converters),
+                                QUOTECHAR=UtilFuncs._serialize_and_encode(
+                                    self.quotechar),
+                                INPUT_CONVERTERS=UtilFuncs._serialize_and_encode(
+                                    input_converters),
+                                CHUNK_SIZE=UtilFuncs._serialize_and_encode(
+                                    self.chunk_size)
+                            )
                         )
-                    )
         except Exception:
             # We may end up here if the formatting of the templating to create
             # the user script fails.
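The decorator-stripping trick described in the comments above operates on the text that `inspect.getsource` returns for a `@udf`-decorated function. A minimal sketch on a plain source string (`to_upper` and the helper name are illustrative):

```python
# Source text as inspect.getsource would return it for a decorated function.
SRC = '''@udf
def to_upper(s):
    return s.upper()
'''

def strip_decorator(code):
    """Drop a leading decorator line so only the bare `def` remains,
    mirroring the UDF template-preparation step in the diff."""
    code = code.lstrip()
    if code.startswith("@"):
        # Remove everything up to and including the first newline.
        code = code[code.find("\n") + 1:].lstrip()
    return code
```

When the `f = udf(to_upper)` notation is used instead, the source has no leading `@` line, so the helper returns it unchanged, which is exactly why the diff only strips when the first character is `@`.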
@@ -410,9 +452,11 @@ class _TableOperatorUtils:
         """
         try:
             if self.operation in [TableOperatorConstants.MAP_ROW_OP.value,
-                                  TableOperatorConstants.MAP_PARTITION_OP.value]:
+                                  TableOperatorConstants.MAP_PARTITION_OP.value] or \
+                    (self.operation == TableOperatorConstants.UDF_OP.value and self.exec_mode == 'IN-DB'):
                 return self.__execute_script_table_operator()
-            elif self.operation == TableOperatorConstants.APPLY_OP.value:
+            elif self.operation == TableOperatorConstants.APPLY_OP.value or \
+                    (self.operation == TableOperatorConstants.UDF_OP.value and self.exec_mode == 'REMOTE'):
                 return self.__execute_apply_table_operator()
         except Exception:
             raise
@@ -572,8 +616,9 @@ class _TableOperatorUtils:
         if self.exec_mode.upper() == TableOperatorConstants.REMOTE_EXEC.value:
             # If not test mode, execute the script using Apply table operator.
             try:
-                # If APPLY, get environment and use it for installing file.
-                if self.operation == TableOperatorConstants.APPLY_OP.value:
+                # If APPLY or UDF, get environment and use it for installing file.
+                if self.operation in [TableOperatorConstants.APPLY_OP.value,
+                                      TableOperatorConstants.UDF_OP.value]:
                     self.__env.install_file(self.script_path, suppress_output=True)
 
                 # Execute the script.
@@ -617,13 +662,15 @@ class _TableOperatorUtils:
                 suppress_output=True)
 
             # For apply, remove file from remote user environment.
-            if self.operation == TableOperatorConstants.APPLY_OP.value:
+            if self.operation == TableOperatorConstants.APPLY_OP.value or \
+                    (self.operation == TableOperatorConstants.UDF_OP.value and self.exec_mode == 'REMOTE'):
                 self.__env.remove_file(self.script_name, suppress_output=True)
 
             # Remove the entry from Garbage Collector
             if self.operation in [TableOperatorConstants.MAP_ROW_OP.value,
                                   TableOperatorConstants.MAP_PARTITION_OP.value,
-                                  TableOperatorConstants.APPLY_OP.value]:
+                                  TableOperatorConstants.APPLY_OP.value,
+                                  TableOperatorConstants.UDF_OP.value]:
                 GarbageCollector._delete_object_entry(
                     object_to_delete=self.script_entry,
                     object_type=TeradataConstants.TERADATA_SCRIPT,
@@ -0,0 +1,63 @@
+import sys, csv
+import datetime
+
+td_buffer = {{}}
+
+
+{FUNCTION_DEFINITION}
+
+function_args = {FUNCTION_ARGS}
+# Information that is required to help with the script usage.
+# The delimiter to use with the input and output text.
+delimiter = "{DELIMITER}"
+# The names of columns in the input teradataml DataFrame.
+_input_columns = {INPUT_COLUMNS}
+# The names of columns in the output teradataml DataFrame.
+_output_columns = {OUTPUT_COLUMNS}
+# The definition for new columns in output.
+columns_definitions = {COLUMNS_DEFINITIONS}
+# The types of columns in the input/output teradataml DataFrame.
+output_type_converters = {OUTPUT_TYPE_CONVERTERS}
+for k, v in output_type_converters.items():
+    if v == 'datetime.date' or v == 'datetime.time' or v == 'datetime.datetime':
+        output_type_converters[k] = 'str'
+output_type_converters = {{k: getattr(__builtins__, v) for k, v in output_type_converters.items()}}
+# The quotechar to use.
+quotechar = "{QUOTECHAR}"
+if quotechar == "None":
+    quotechar = None
+
+
+# The entry point to the script.
+if __name__ == "__main__":
+
+    records = csv.reader(sys.stdin.readlines(), delimiter=delimiter, quotechar=quotechar)
+    for record in records:
+        record = dict(zip(_input_columns, record))
+        out_rec = []
+        for column in _output_columns:
+
+            # If it is a new column, get the value from definition.
+            if column in columns_definitions:
+                f_args = tuple()
+                # Convert the argument types first.
+                for v in function_args[column]:
+                    if v in _input_columns:
+                        c_type_ = output_type_converters.get(v)
+                        if record[v]:
+                            # If it is a float, replace the empty character.
+                            if c_type_.__name__ == 'float':
+                                arg = output_type_converters.get(v)(record[v].replace(' ', ''))
+                            else:
+                                arg = output_type_converters.get(v)(record[v])
+                        else:
+                            arg = record[v]
+                    else:
+                        arg = v
+                    f_args = f_args + (arg, )
+                func_ = globals()[columns_definitions[column]]
+                out_rec.append(output_type_converters[column](func_(*f_args)))
+            else:
+                out_rec.append(record[column])
+
+        print("{{}}".format(delimiter).join((str(i) for i in out_rec)))
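The template above streams delimited rows from stdin, converts argument types, applies the user's function for any new column, and emits a delimited output row. A self-contained sketch of that per-row flow, using simplified structures in place of the template's format placeholders (`double`, the column names, and the `(function, argument column)` mapping are all illustrative):

```python
import csv
from io import StringIO

# One input row, comma-delimited, as the database would stream it to the script.
raw = StringIO("alice,3\n")
input_columns = ["name", "n"]
output_columns = ["name", "n_doubled"]
converters = {"n": int, "n_doubled": int}

def double(n):
    return n * 2

# New-column definitions: output column -> (function, argument column).
columns_definitions = {"n_doubled": (double, "n")}

rows = []
for record in csv.reader(raw, delimiter=","):
    record = dict(zip(input_columns, record))
    out = []
    for col in output_columns:
        if col in columns_definitions:
            # Convert the argument, apply the function, convert the result.
            func, arg_col = columns_definitions[col]
            out.append(converters[col](func(converters[arg_col](record[arg_col]))))
        else:
            # Existing columns pass through unchanged.
            out.append(record[col])
    rows.append(",".join(str(v) for v in out))
```

The real template additionally handles quoting, literal (non-column) arguments, date/time types downgraded to `str`, and float values with stray spaces, but the row loop has the same shape.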
File without changes
@@ -0,0 +1,52 @@
+ from functools import wraps
+ from teradatasqlalchemy.telemetry.queryband import QueryBand, collect_queryband as tdsqlalchemy_collect_queryband
+
+
+ # Create a global variable to manage querybands for the teradataml package.
+ global session_queryband
+ session_queryband = QueryBand()
+
+
+ def collect_queryband(*qb_deco_pos_args, **qb_deco_kwargs):
+     """
+     DESCRIPTION:
+         Decorator for calling the collect_queryband decorator from the telemetry
+         utility in teradatasqlalchemy, using the session_queryband object and the
+         other positional and keyword arguments expected by collect_queryband.
+
+     PARAMETERS:
+         qb_deco_pos_args:
+             Optional Argument.
+             Specifies the positional arguments accepted by the collect_queryband
+             decorator in the telemetry utility in teradatasqlalchemy.
+
+         qb_deco_kwargs:
+             Optional Argument.
+             Specifies the keyword arguments accepted by the collect_queryband
+             decorator in the telemetry utility in teradatasqlalchemy.
+
+     EXAMPLES:
+         >>> from teradataml.telemetry_utils.queryband import collect_queryband
+         # Example 1: Collect queryband for a standalone function.
+         @collect_queryband(queryband="CreateContext")
+         def create_context(host = None, username ...): ...
+
+         # Example 2: Collect queryband for a class method and use a
+         # class attribute to retrieve the queryband string.
+         @collect_queryband(attr="func_name")
+         def _execute_query(self, persist=False, volatile=False): ...
+
+         # Example 3: Collect queryband for a class method and use a
+         # method of the same class to retrieve the queryband string.
+         @collect_queryband(method="get_class_specific_queryband")
+         def _execute_query(self, persist=False, volatile=False): ...
+     """
+     def outer_wrapper(func):
+         @wraps(func)
+         def inner_wrapper(*func_args, **func_kwargs):
+             # Pass the required argument 'session_queryband' along with the other
+             # expected arguments to the collect_queryband() decorator, which is
+             # imported as tdsqlalchemy_collect_queryband.
+             return tdsqlalchemy_collect_queryband(session_queryband, *qb_deco_pos_args, **qb_deco_kwargs)(func)(*func_args, **func_kwargs)
+         return inner_wrapper
+     return outer_wrapper
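The new module above wraps teradatasqlalchemy's `collect_queryband` decorator so that every call site shares one session-level `QueryBand` object. The underlying pattern, a decorator factory that injects a shared first argument into another decorator, can be sketched in isolation (the `trace` decorator and `CallLog` class below are hypothetical stand-ins, not teradataml or teradatasqlalchemy APIs):

```python
from functools import wraps

class CallLog:
    """Hypothetical shared session object (stand-in for QueryBand)."""
    def __init__(self):
        self.calls = []

def trace(log, label):
    """Hypothetical underlying decorator that needs a shared 'log' object
    (stand-in for teradatasqlalchemy's collect_queryband)."""
    def deco(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            log.calls.append(label)
            return func(*args, **kwargs)
        return wrapper
    return deco

# Module-level shared object, as teradataml does with session_queryband.
session_log = CallLog()

def collect(*pos, **kw):
    """Re-exported decorator that injects the shared session_log,
    mirroring teradataml's collect_queryband wrapper."""
    def outer(func):
        @wraps(func)
        def inner(*args, **kwargs):
            # Build the underlying decorator with the shared object,
            # apply it to func, then invoke with the caller's arguments.
            return trace(session_log, *pos, **kw)(func)(*args, **kwargs)
        return inner
    return outer

@collect(label="CreateContext")
def create_context():
    return "ok"

print(create_context(), session_log.calls)  # ok ['CreateContext']
```

Deferring the `trace(...)` call into `inner` means the shared object is looked up at call time, so all decorated functions observe the same session state.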
@@ -1660,7 +1660,7 @@ class _Validators:
 
          # Check whether table exists on the system or not.
          table_exists = conn.dialect.has_table(conn, table_name=table_name,
-                                               schema=schema_name)
+                                               schema=schema_name, table_only=True)
 
          # If tables exists, return True.
          if table_exists:
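The change above passes `table_only=True` to the Teradata dialect's `has_table()` so the existence check matches tables only (that flag is dialect-specific). With plain SQLAlchemy 2.0, which this release now requires, the generic way to run the same kind of existence check is through the inspector; a small sketch against an in-memory SQLite engine (hypothetical table name, no Teradata specifics):

```python
from sqlalchemy import create_engine, inspect, text

# In-memory database with one table, for illustration only.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE my_table (id INTEGER)"))

# Inspector.has_table() is the portable existence check in SQLAlchemy >= 1.4.
insp = inspect(engine)
print(insp.has_table("my_table"))  # True
print(insp.has_table("no_such"))   # False
```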
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: teradataml
- Version: 20.0.0.1
+ Version: 20.0.0.2
  Summary: Teradata Vantage Python package for Advanced Analytics
  Home-page: http://www.teradata.com/
  Author: Teradata Corporation
@@ -18,7 +18,7 @@ Classifier: License :: Other/Proprietary License
  Requires-Python: >=3.8
  Description-Content-Type: text/markdown
  Requires-Dist: teradatasql (>=17.10.0.11)
- Requires-Dist: teradatasqlalchemy (>=20.0.0.1)
+ Requires-Dist: teradatasqlalchemy (>=20.0.0.2)
  Requires-Dist: pandas (>=0.22)
  Requires-Dist: psutil
  Requires-Dist: requests (>=2.25.1)
@@ -27,6 +27,7 @@ Requires-Dist: IPython (>=8.10.0)
  Requires-Dist: imbalanced-learn (>=0.8.0)
  Requires-Dist: pyjwt (>=2.8.0)
  Requires-Dist: cryptography (>=42.0.5)
+ Requires-Dist: sqlalchemy (>=2.0)
 
  ## Teradata Python package for Advanced Analytics.
 
@@ -46,6 +47,118 @@ Copyright 2024, Teradata. All Rights Reserved.
  * [License](#license)
 
  ## Release Notes:
+ #### teradataml 20.00.00.02
+
+ * teradataml is no longer supported with SQLAlchemy < 2.0.
+ * teradataml no longer shows warnings from Vantage by default.
+   * Users should set `display.suppress_vantage_runtime_warnings` to `False` to display warnings.
+
+ * ##### New Features/Functionality
+   * ##### teradataml: SQLE Engine Analytic Functions
+     * New Analytics Database Analytic Functions:
+       * `TFIDF()`
+       * `Pivoting()`
+       * `UnPivoting()`
+     * New Unbounded Array Framework (UAF) Functions:
+       * `AutoArima()`
+       * `DWT()`
+       * `DWT2D()`
+       * `FilterFactory1d()`
+       * `IDWT()`
+       * `IDWT2D()`
+       * `IQR()`
+       * `Matrix2Image()`
+       * `SAX()`
+       * `WindowDFFT()`
+   * ###### teradataml: Functions
+     * `udf()` - Creates a user defined function (UDF) and returns a ColumnExpression.
+     * `set_session_param()` - Sets database session parameters.
+     * `unset_session_param()` - Unsets database session parameters.
+
+   * ###### teradataml: DataFrame
+     * `materialize()` - Persists a DataFrame into the database for the current session.
+     * `create_temp_view()` - Creates a temporary view on the DataFrame for the session.
+
+   * ###### teradataml DataFrameColumn a.k.a. ColumnExpression
+     * _Date Time Functions_
+       * `DataFrameColumn.to_timestamp()` - Converts a string or integer value to a TIMESTAMP or TIMESTAMP WITH TIME ZONE data type.
+       * `DataFrameColumn.extract()` - Extracts a date component to a numeric value.
+       * `DataFrameColumn.to_interval()` - Converts a numeric or string value into an INTERVAL_DAY_TO_SECOND or INTERVAL_YEAR_TO_MONTH value.
+     * _String Functions_
+       * `DataFrameColumn.parse_url()` - Extracts a part from a URL.
+     * _Arithmetic Functions_
+       * `DataFrameColumn.log` - Returns the logarithm value of the column with respect to 'base'.
+
+   * ##### teradataml: AutoML
+     * New methods added for `AutoML()`, `AutoRegressor()` and `AutoClassifier()`:
+       * `evaluate()` - Performs evaluation on the data using the best model or the model of the user's choice from the leaderboard.
+       * `load()` - Loads the saved model from the database.
+       * `deploy()` - Saves the trained model inside the database.
+       * `remove_saved_model()` - Removes the saved model from the database.
+       * `model_hyperparameters()` - Returns the hyperparameters of fitted or loaded models.
+
+ * ##### Updates
+   * ##### teradataml: AutoML
+     * `AutoML()`, `AutoRegressor()`
+       * New performance metrics added for the regression task type: "MAPE", "MPE", "ME", "EV", "MPD" and "MGD".
+     * `AutoML()`, `AutoRegressor()` and `AutoClassifier()`
+       * New arguments added: `volatile`, `persist`.
+       * `predict()` - Data input is now mandatory for generating predictions. Default model evaluation is removed.
+   * `DataFrameColumn.cast()` - Accepts 2 new arguments: `format` and `timezone`.
+   * `DataFrame.assign()` - Accepts ColumnExpressions returned by `udf()`.
+
+   * ##### teradataml: Options
+     * `set_config_params()`
+       * The following arguments will be deprecated in the future:
+         * `ues_url`
+         * `auth_token`
+
+   * ###### Database Utility
+     * `list_td_reserved_keywords()` - Accepts a list of strings as argument.
+
+   * ##### Updates to existing UAF Functions:
+     * `ACF()` - `round_results` parameter removed as it was used only for internal testing.
+     * `BreuschGodfrey()` - Added default value 0.05 for parameter `significance_level`.
+     * `GoldfeldQuandt()` -
+       * Removed parameters `weights` and `formula`.
+       * Replaced parameter `orig_regr_paramcnt` with `const_term`.
+       * Changed description for parameter `algorithm`. Please refer to the documentation for more details.
+       * Note: This will break backward compatibility.
+     * `HoltWintersForecaster()` - Default value of parameter `seasonal_periods` removed.
+     * `IDFFT2()` - Removed parameter `output_fmt_row_major` as it was used only for internal testing.
+     * `Resample()` - Added parameter `output_fmt_index_style`.
+
+ * ##### Bug Fixes
+   * KNN `predict()` function can now predict on test data which does not contain the target column.
+   * Metrics functions are supported on the Lake system.
+   * The following OpensourceML functions from different sklearn modules are fixed:
+     * `sklearn.ensemble`:
+       * ExtraTreesClassifier - `apply()`
+       * ExtraTreesRegressor - `apply()`
+       * RandomForestClassifier - `apply()`
+       * RandomForestRegressor - `apply()`
+     * `sklearn.impute`:
+       * SimpleImputer - `transform()`, `fit_transform()`, `inverse_transform()`
+       * MissingIndicator - `transform()`, `fit_transform()`
+     * `sklearn.kernel_approximation`:
+       * Nystroem - `transform()`, `fit_transform()`
+       * PolynomialCountSketch - `transform()`, `fit_transform()`
+       * RBFSampler - `transform()`, `fit_transform()`
+     * `sklearn.neighbors`:
+       * KNeighborsTransformer - `transform()`, `fit_transform()`
+       * RadiusNeighborsTransformer - `transform()`, `fit_transform()`
+     * `sklearn.preprocessing`:
+       * KernelCenterer - `transform()`
+       * OneHotEncoder - `transform()`, `inverse_transform()`
+   * OpensourceML returns teradataml objects for model attributes and functions instead of sklearn objects, so that the user can perform further operations like `score()` and `predict()` on top of the returned objects.
+   * AutoML `predict()` function now generates the correct ROC-AUC value for the positive class.
+   * `deploy()` method of `Script` and `Apply` classes retries model deployment if there are intermittent network issues.
+
  #### teradataml 20.00.00.01
  * teradataml no longer supports Python versions less than 3.8.