teradataml 20.0.0.0__py3-none-any.whl → 20.0.0.1__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (108)
  1. teradataml/LICENSE-3RD-PARTY.pdf +0 -0
  2. teradataml/LICENSE.pdf +0 -0
  3. teradataml/README.md +71 -0
  4. teradataml/_version.py +2 -2
  5. teradataml/analytics/analytic_function_executor.py +51 -24
  6. teradataml/analytics/json_parser/utils.py +11 -17
  7. teradataml/automl/__init__.py +103 -48
  8. teradataml/automl/data_preparation.py +55 -37
  9. teradataml/automl/data_transformation.py +131 -69
  10. teradataml/automl/feature_engineering.py +117 -185
  11. teradataml/automl/feature_exploration.py +9 -2
  12. teradataml/automl/model_evaluation.py +13 -25
  13. teradataml/automl/model_training.py +214 -75
  14. teradataml/catalog/model_cataloging_utils.py +1 -1
  15. teradataml/clients/auth_client.py +133 -0
  16. teradataml/common/aed_utils.py +3 -2
  17. teradataml/common/constants.py +11 -6
  18. teradataml/common/garbagecollector.py +5 -0
  19. teradataml/common/messagecodes.py +3 -1
  20. teradataml/common/messages.py +2 -1
  21. teradataml/common/utils.py +6 -0
  22. teradataml/context/context.py +49 -29
  23. teradataml/data/advertising.csv +201 -0
  24. teradataml/data/bank_marketing.csv +11163 -0
  25. teradataml/data/bike_sharing.csv +732 -0
  26. teradataml/data/boston2cols.csv +721 -0
  27. teradataml/data/breast_cancer.csv +570 -0
  28. teradataml/data/customer_segmentation_test.csv +2628 -0
  29. teradataml/data/customer_segmentation_train.csv +8069 -0
  30. teradataml/data/docs/sqle/docs_17_10/OneHotEncodingFit.py +3 -1
  31. teradataml/data/docs/sqle/docs_17_10/OneHotEncodingTransform.py +6 -0
  32. teradataml/data/docs/sqle/docs_17_10/OutlierFilterTransform.py +5 -1
  33. teradataml/data/docs/sqle/docs_17_20/ANOVA.py +61 -1
  34. teradataml/data/docs/sqle/docs_17_20/ColumnTransformer.py +2 -0
  35. teradataml/data/docs/sqle/docs_17_20/FTest.py +105 -26
  36. teradataml/data/docs/sqle/docs_17_20/GLM.py +162 -1
  37. teradataml/data/docs/sqle/docs_17_20/GetFutileColumns.py +5 -3
  38. teradataml/data/docs/sqle/docs_17_20/KMeans.py +48 -1
  39. teradataml/data/docs/sqle/docs_17_20/NonLinearCombineFit.py +3 -2
  40. teradataml/data/docs/sqle/docs_17_20/OneHotEncodingFit.py +5 -0
  41. teradataml/data/docs/sqle/docs_17_20/OneHotEncodingTransform.py +6 -0
  42. teradataml/data/docs/sqle/docs_17_20/ROC.py +3 -2
  43. teradataml/data/docs/sqle/docs_17_20/SVMPredict.py +13 -2
  44. teradataml/data/docs/sqle/docs_17_20/ScaleFit.py +119 -1
  45. teradataml/data/docs/sqle/docs_17_20/ScaleTransform.py +93 -1
  46. teradataml/data/docs/sqle/docs_17_20/TDGLMPredict.py +163 -1
  47. teradataml/data/docs/sqle/docs_17_20/XGBoost.py +12 -4
  48. teradataml/data/docs/sqle/docs_17_20/XGBoostPredict.py +7 -1
  49. teradataml/data/docs/sqle/docs_17_20/ZTest.py +72 -7
  50. teradataml/data/glm_example.json +28 -1
  51. teradataml/data/housing_train_segment.csv +201 -0
  52. teradataml/data/insect2Cols.csv +61 -0
  53. teradataml/data/jsons/sqle/17.20/TD_ANOVA.json +99 -27
  54. teradataml/data/jsons/sqle/17.20/TD_FTest.json +166 -83
  55. teradataml/data/jsons/sqle/17.20/TD_GLM.json +90 -14
  56. teradataml/data/jsons/sqle/17.20/TD_GLMPREDICT.json +48 -5
  57. teradataml/data/jsons/sqle/17.20/TD_GetFutileColumns.json +5 -3
  58. teradataml/data/jsons/sqle/17.20/TD_KMeans.json +31 -11
  59. teradataml/data/jsons/sqle/17.20/TD_NonLinearCombineFit.json +3 -2
  60. teradataml/data/jsons/sqle/17.20/TD_ROC.json +2 -1
  61. teradataml/data/jsons/sqle/17.20/TD_SVM.json +16 -16
  62. teradataml/data/jsons/sqle/17.20/TD_SVMPredict.json +19 -1
  63. teradataml/data/jsons/sqle/17.20/TD_ScaleFit.json +168 -15
  64. teradataml/data/jsons/sqle/17.20/TD_ScaleTransform.json +50 -1
  65. teradataml/data/jsons/sqle/17.20/TD_XGBoost.json +25 -7
  66. teradataml/data/jsons/sqle/17.20/TD_XGBoostPredict.json +17 -4
  67. teradataml/data/jsons/sqle/17.20/TD_ZTest.json +157 -80
  68. teradataml/data/kmeans_example.json +5 -0
  69. teradataml/data/kmeans_table.csv +10 -0
  70. teradataml/data/onehot_encoder_train.csv +4 -0
  71. teradataml/data/openml_example.json +29 -0
  72. teradataml/data/scale_attributes.csv +3 -0
  73. teradataml/data/scale_example.json +52 -1
  74. teradataml/data/scale_input_part_sparse.csv +31 -0
  75. teradataml/data/scale_input_partitioned.csv +16 -0
  76. teradataml/data/scale_input_sparse.csv +11 -0
  77. teradataml/data/scale_parameters.csv +3 -0
  78. teradataml/data/scripts/deploy_script.py +20 -1
  79. teradataml/data/scripts/sklearn/sklearn_fit.py +23 -27
  80. teradataml/data/scripts/sklearn/sklearn_fit_predict.py +20 -28
  81. teradataml/data/scripts/sklearn/sklearn_function.template +13 -18
  82. teradataml/data/scripts/sklearn/sklearn_model_selection_split.py +23 -33
  83. teradataml/data/scripts/sklearn/sklearn_neighbors.py +18 -27
  84. teradataml/data/scripts/sklearn/sklearn_score.py +20 -29
  85. teradataml/data/scripts/sklearn/sklearn_transform.py +30 -38
  86. teradataml/data/teradataml_example.json +77 -0
  87. teradataml/data/ztest_example.json +16 -0
  88. teradataml/dataframe/copy_to.py +8 -3
  89. teradataml/dataframe/data_transfer.py +120 -61
  90. teradataml/dataframe/dataframe.py +102 -17
  91. teradataml/dataframe/dataframe_utils.py +47 -9
  92. teradataml/dataframe/fastload.py +272 -89
  93. teradataml/dataframe/sql.py +84 -0
  94. teradataml/dbutils/dbutils.py +2 -2
  95. teradataml/lib/aed_0_1.dll +0 -0
  96. teradataml/opensource/sklearn/_sklearn_wrapper.py +102 -55
  97. teradataml/options/__init__.py +13 -4
  98. teradataml/options/configure.py +27 -6
  99. teradataml/scriptmgmt/UserEnv.py +19 -16
  100. teradataml/scriptmgmt/lls_utils.py +117 -14
  101. teradataml/table_operators/Script.py +2 -3
  102. teradataml/table_operators/TableOperator.py +58 -10
  103. teradataml/utils/validators.py +40 -2
  104. {teradataml-20.0.0.0.dist-info → teradataml-20.0.0.1.dist-info}/METADATA +78 -6
  105. {teradataml-20.0.0.0.dist-info → teradataml-20.0.0.1.dist-info}/RECORD +108 -90
  106. {teradataml-20.0.0.0.dist-info → teradataml-20.0.0.1.dist-info}/WHEEL +0 -0
  107. {teradataml-20.0.0.0.dist-info → teradataml-20.0.0.1.dist-info}/top_level.txt +0 -0
  108. {teradataml-20.0.0.0.dist-info → teradataml-20.0.0.1.dist-info}/zip-safe +0 -0
@@ -21,13 +21,14 @@ import requests
 
 from json.decoder import JSONDecodeError
 from teradataml import configure
-from teradataml.context.context import _get_user
+from teradataml.context.context import _get_user, get_connection
 from teradataml.common.constants import HTTPRequest, AsyncStatusColumns
 from teradataml.common.exceptions import TeradataMlException
 from teradataml.common.messages import Messages
 from teradataml.common.messagecodes import MessageCodes
 from teradataml.common.utils import UtilFuncs
 from teradataml.clients.pkce_client import _DAWorkflow
+from teradataml.clients.auth_client import _AuthWorkflow
 from teradataml.utils.internal_buffer import _InternalBuffer
 from teradataml.scriptmgmt.UserEnv import UserEnv, _get_auth_token, \
     _process_ues_response, _get_ues_url, _AuthToken
@@ -1548,15 +1549,17 @@ def get_user_env():
 
 
 @collect_queryband(queryband="StAthTkn")
-def set_auth_token(ues_url, client_id=None):
+def set_auth_token(ues_url, client_id=None, pat_token=None, pem_file=None, **kwargs):
     """
     DESCRIPTION:
         Function to set the Authentication token to connect to User Environment Service
         in VantageCloud Lake.
         Note:
-            User must have a privilage to login with a NULL password to use set_auth_token().
+            User must have a privilege to login with a NULL password to use set_auth_token().
             Please refer to GRANT LOGON section in Teradata Documentation for more details.
-
+            If ues_url and client_id are specified, then authentication is through OAuth.
+            If ues_url, pat_token and pem_file are specified, then authentication is through PAT.
+            Refresh token still works, but only for OAuth authentication.
 
     PARAMETERS:
         ues_url:
@@ -1570,6 +1573,32 @@ def set_auth_token(ues_url, client_id=None):
             VantageCloud Lake.
             Types: str
 
+        pat_token:
+            Required, if PAT authentication is to be used, optional otherwise.
+            Specifies the PAT token generated from VantageCloud Lake Console.
+            Types: str
+
+        pem_file:
+            Required, if PAT authentication is to be used, optional otherwise.
+            Specifies the path to the private key file which is generated from VantageCloud Lake Console.
+            Types: str
+
+        **kwargs:
+            username:
+                Specifies the user for which authentication is to be requested.
+                If not specified, then the user associated with the current connection is used.
+                Note:
+                    1. Use this option only if the name of the database username has lower case letters.
+                    2. This option is used only for PAT and not for OAuth.
+                Types: str
+
+            expiration_time:
+                Specifies the expiration time of the token in seconds. After the expiry time the JWT
+                token expires and UserEnv methods do not work; the user should regenerate the token.
+                Note:
+                    This option is used only for PAT and not for OAuth.
+                Default Value: 31536000
+                Types: int
 
     RETURNS:
         True, if the operation is successful.
@@ -1586,31 +1615,105 @@ def set_auth_token(ues_url, client_id=None):
         # Example 2: Set the Authentication token by specifying the client_id.
         >>> set_auth_token(ues_url=getpass.getpass("ues_url : "),
         ...                client_id=getpass.getpass("client_id : "))
+
+        # Example 3: Set the Authentication token by specifying the "pem_file" and "pat_token"
+        #            without specifying "username".
+        >>> import getpass
+        >>> set_auth_token(ues_url=getpass.getpass("ues_url : "),
+        ...                pat_token=getpass.getpass("pat_token : "),
+        ...                pem_file=getpass.getpass("pem_file : "))
+        True
+
+        # Example 4: Set the Authentication token by specifying the "pem_file", "pat_token"
+        #            and "username".
+        >>> import getpass
+        >>> set_auth_token(ues_url=getpass.getpass("ues_url : "),
+        ...                pat_token=getpass.getpass("pat_token : "),
+        ...                pem_file=getpass.getpass("pem_file : "),
+        ...                username="alice")
+        True
     """
+    # Derive the global connection using get_connection().
+    con = get_connection()
+    if con is None:
+        raise TeradataMlException(Messages.get_message(MessageCodes.INVALID_CONTEXT_CONNECTION),
+                                  MessageCodes.INVALID_CONTEXT_CONNECTION)
+
     __arg_info_matrix = []
     __arg_info_matrix.append(["ues_url", ues_url, False, (str), True])
     __arg_info_matrix.append(["client_id", client_id, True, (str), True])
+    __arg_info_matrix.append(["pat_token", pat_token, True, (str), True])
+    __arg_info_matrix.append(["pem_file", pem_file, True, (str), True])
+
+    username = kwargs.get("username", None)
+    __arg_info_matrix.append(["username", username, True, (str), True])
+
+    expiration_time = kwargs.get("expiration_time", 31536000)
+    __arg_info_matrix.append(["expiration_time", expiration_time, True, (int), True])
 
     # Validate arguments.
     _Validators._validate_function_arguments(__arg_info_matrix)
 
+    if client_id and any([pat_token, pem_file]):
+        message = Messages.get_message(MessageCodes.EITHER_THIS_OR_THAT_ARGUMENT,
+                                       "client_id", "pat_token' and 'pem_file")
+        raise TeradataMlException(message, MessageCodes.EITHER_THIS_OR_THAT_ARGUMENT)
+
+    if client_id is None:
+        if (pat_token and pem_file is None) or (pem_file and pat_token is None):
+            message = Messages.get_message(MessageCodes.MUST_PASS_ARGUMENT,
+                                           "pat_token", "pem_file")
+            raise TeradataMlException(message, MessageCodes.MUST_PASS_ARGUMENT)
+
+    # Check if pem file exists.
+    if pem_file is not None:
+        _Validators._validate_file_exists(pem_file)
+
     # Extract the base URL from "ues_url".
     url_parser = urlparse(ues_url)
     base_url = "{}://{}".format(url_parser.scheme, url_parser.netloc)
+    netloc = url_parser.netloc.split('.')[0]
 
-    if client_id is None:
-        netloc = url_parser.netloc
-        client_id = "{}-oaf-device".format(netloc.split('.')[0])
+    # Check if the authentication is PAT based or OAuth.
+    if all(arg is None for arg in [pat_token, pem_file]):
+        configure._oauth = True
+        client_id = "{}-oaf-device".format(netloc) if client_id is None else client_id
+        da_wf = _DAWorkflow(base_url, client_id)
+        token_data = da_wf._get_token_data()
+
+        # Set Open AF parameters.
+        configure._oauth_client_id = client_id
+        configure._oauth_end_point = da_wf.device_auth_end_point
+        configure._auth_token_expiry_time = time() + token_data["expires_in"] - 15
+
+        # Store the jwt token in internal class attribute.
+        _InternalBuffer.add(auth_token=_AuthToken(token=token_data["access_token"]))
+
+    else:
+        configure._oauth = False
+
+        if username is None:
+            # If username is not specified, then the database username associated with
+            # the current context is used.
+            username = _get_user()
+
+        org_id = netloc
+
+        # Construct a dictionary to be passed to _AuthWorkflow().
+        state_dict = {}
+        state_dict["base_url"] = base_url
+        state_dict["org_id"] = org_id
+        state_dict["pat_token"] = pat_token
+        state_dict["pem_file"] = pem_file
+        state_dict["username"] = username
+        state_dict["expiration_time"] = expiration_time
 
-    da_wf = _DAWorkflow(base_url, client_id)
-    token_data = da_wf._get_token_data()
+        auth_wf = _AuthWorkflow(state_dict)
+        token_data = auth_wf._proxy_jwt()
+        # Store the jwt token in internal class attribute.
+        _InternalBuffer.add(auth_token=_AuthToken(token=token_data))
 
     # Set Open AF parameters.
-    configure._oauth_client_id = client_id
     configure.ues_url = ues_url
-    configure._oauth_end_point = da_wf.device_auth_end_point
-    configure._auth_token_expiry_time = time() + token_data["expires_in"] - 15
-    # Store the jwt token in internal class attribute.
-    _InternalBuffer.add(auth_token=_AuthToken(token=token_data["access_token"]))
 
     return True
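Taken together, the new body gives set_auth_token() two mutually exclusive paths: OAuth (client_id, or neither extra argument) and PAT (pat_token plus pem_file). Below is a minimal usage sketch of the PAT path, assuming a Lake connection already exists; the host, UES URL, key path and user name are placeholders, not values taken from this diff.

    # Hedged usage sketch of the PAT path added above; all values are placeholders.
    import getpass
    from teradataml import create_context, set_auth_token

    # set_auth_token() now requires an active connection (see the get_connection() check above).
    create_context(host="<lake_host>", username="alice", password=getpass.getpass())

    set_auth_token(ues_url="<ues_url>",
                   pat_token=getpass.getpass("pat_token : "),   # pat_token + pem_file -> PAT branch
                   pem_file="/secure/alice_private_key.pem",
                   username="alice",            # only needed if the DB username has lower case letters
                   expiration_time=3600)        # PAT-only kwarg; default 31536000 seconds

Passing client_id together with pat_token or pem_file raises the EITHER_THIS_OR_THAT_ARGUMENT error shown in the hunk, and supplying only one of pat_token/pem_file raises MUST_PASS_ARGUMENT.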
@@ -728,7 +728,7 @@ class Script(TableOperator):
         5        1           1
 
         # Example 2 -
-        # Script is tested using test_script and executed on Vantage.
+        # Input data is barrier_new and script is executed on Vantage.
         # use set_data() to reset arguments.
         # Create teradataml DataFrame objects.
         >>> load_example_data("Script", ["barrier_new"])
@@ -751,7 +751,7 @@ class Script(TableOperator):
         ...               sort_ascending=False,
         ...               charset='latin',
         ...               returns=OrderedDict([("word", VARCHAR(15)),("count_input", VARCHAR(2))]))
-        # Script is tested using test_script and executed on Vantage.
+        # Script is executed on Vantage.
         >>> sto.execute_script()
         ############ STDOUT Output ############
                 word count_input
@@ -786,7 +786,6 @@ class Script(TableOperator):
         5        1           1
 
         # Example 3
-        # Script is tested using test_script and executed on Vantage.
         # In order to run the script with same dataset but different data related
         # arguments, use set_data() to reset arguments.
         # Note:
@@ -1015,8 +1015,10 @@ class TableOperator:
     def deploy(self, model_column, partition_columns=None, model_file_prefix=None):
         """
         DESCRIPTION:
-            Function deploys the model generated after `execute_script()` in database or user
-            environment in lake.
+            Function deploys the model generated after running `execute_script()` in database in
+            VantageCloud Enterprise or in user environment in VantageCloud Lake.
+            If deployed files are not needed, these files can be removed using `remove_file()` in
+            database or `<user_env>.remove_file()` in lake.
 
         PARAMETERS:
             model_column:
@@ -1050,12 +1052,14 @@ class TableOperator:
                 Types: str
 
         RETURNS:
-            List of generated file names.
+            List of generated file identifiers in database or file names in lake.
 
         RAISES:
             TeradatamlException
 
         EXAMPLES:
+            >>> import teradataml
+            >>> from teradataml import load_example_data
             >>> load_example_data("openml", "multi_model_classification")
 
            >>> df = DataFrame("multi_model_classification")
@@ -1073,12 +1077,16 @@ class TableOperator:
            -0.615226 -0.546472  0.017496 -0.488720      0                  12                   0          10
             0.579671 -0.573365  0.160603  0.014404      0                   9                   1          10
 
+            ## Run in VantageCloud Enterprise using Script object.
             # Install Script file.
             >>> file_location = os.path.join(os.path.dirname(teradataml.__file__), "data", "scripts", "deploy_script.py")
             >>> install_file("deploy_script", file_location, replace=True)
 
+            >>> execute_sql("SET SESSION SEARCHUIFDBPATH = <db_name>;")
+
             # Variables needed for Script execution.
-            >>> script_command = '/opt/teradata/languages/Python/bin/python3 ./ALICE/deploy_script.py'
+            >>> from teradataml import configure
+            >>> script_command = f'{configure.indb_install_location} ./<db_name>/deploy_script.py enterprise'
             >>> partition_columns = ["partition_column_1", "partition_column_2"]
             >>> columns = ["col1", "col2", "col3", "col4", "label",
                            "partition_column_1", "partition_column_2"]
@@ -1104,10 +1112,10 @@ class TableOperator:
             # is auto generated.
             >>> obj.deploy(model_column="model",
                            partition_columns=["partition_column_1", "partition_column_2"])
-            >>> ['model_file_1710436227163427__0_10',
-                 'model_file_1710436227163427__1_10',
-                 'model_file_1710436227163427__0_11',
-                 'model_file_1710436227163427__1_11']
+            ['model_file_1710436227163427__0_10',
+             'model_file_1710436227163427__1_10',
+             'model_file_1710436227163427__0_11',
+             'model_file_1710436227163427__1_11']
 
             # Example 2: Provide only "model_file_prefix" argument. Here, filenames are suffixed
             # with 1, 2, 3, ... for multiple models.
@@ -1132,6 +1140,43 @@ class TableOperator:
             'my_prefix_new__1_10',
             'my_prefix_new__1_11']
 
+            ## Run in VantageCloud Lake using Apply object.
+            # Let's assume a user environment named "user_env" already exists in VantageCloud Lake,
+            # which will be used for the examples below.
+
+            # Apply table operator returns BLOB type for model column as per deploy_script.py.
+            >>> returns = OrderedDict([("partition_column_1", INTEGER()),
+                                       ("partition_column_2", INTEGER()),
+                                       ("model", BLOB())])
+
+            # Install the script file which returns model and partition columns.
+            >>> user_env.install_file(file_location)
+
+            >>> script_command = 'python3 deploy_script.py lake'
+            >>> obj = Apply(data=df.select(columns),
+                            script_command=script_command,
+                            data_partition_column=partition_columns,
+                            returns=returns,
+                            env_name="user_env")
+
+            >>> opt = obj.execute_script()
+            >>> opt
+            partition_column_1  partition_column_2               model
+                             0                  10  b'gAejc1.....drIr'
+                             0                  11  b'gANjcw.....qWIu'
+                             1                  10  b'abdwcd.....dWIz'
+                             1                  11  b'gA4jc4.....agfu'
+
+            # Example 5: Provide both "partition_columns" and "model_file_prefix" arguments.
+            >>> obj.deploy(model_column="model", model_file_prefix="my_prefix_",
+                           partition_columns=["partition_column_1", "partition_column_2"])
+            ['my_prefix__0_10',
+             'my_prefix__0_11',
+             'my_prefix__1_10',
+             'my_prefix__1_11']
+
+            # Other examples are similar to the examples provided for VantageCloud Enterprise.
         """
 
         arg_info_matrix = []
@@ -1169,6 +1214,9 @@ class TableOperator:
         n_models = len(vals)
         all_files = []
 
+        # Default location for .teradataml is user's home directory if configure.local_storage is not set.
+        tempdir = GarbageCollector._get_temp_dir_name()
+
         for i, row in enumerate(vals):
             model = row[0]
             partition_values = ""
@@ -1178,7 +1226,7 @@ class TableOperator:
                 partition_values = str(i+1)
 
             model_file = f"{model_file_prefix}_{partition_values}"
-            model_file_path = os.path.join(os.path.expanduser("~"), ".teradataml", model_file)
+            model_file_path = os.path.join(tempdir, model_file)
 
             if model_column_type == "CLOB":
                 import base64
@@ -1198,7 +1246,7 @@ class TableOperator:
                 install_file(file_identifier=model_file, file_path=model_file_path,
                              is_binary=True, suppress_output=True)
             elif self.__class__.__name__ == "Apply":
-                self.env.install_file(file_name=model_file_path)
+                self.env.install_file(file_path=model_file_path)
 
             all_files.append(model_file)
@@ -170,7 +170,7 @@ class _Validators:
                 Required Argument.
                 Specifies the name or list of names of columns to be validated
                 for existence.
-                Types: str or List of strings
+                Types: str or List of strings or ColumnExpression or list of ColumnExpression
 
             arg_name:
                 Required Argument.
@@ -204,7 +204,15 @@ class _Validators:
         df_columns = UtilFuncs._all_df_columns(column_expression)
 
         # Let's validate existence of each column one by one.
-        for column_name in columns:
+        columns_ = []
+        for column in columns:
+            if isinstance(column, str):
+                columns_.append(column)
+            else:
+                columns_ = columns_ + UtilFuncs._all_df_columns(column)
+
+        # Let's validate existence of each column one by one.
+        for column_name in columns_:
             # If column name does not exist in DataFrame of a column, raise the exception.
             if column_name not in df_columns:
                 message = "{}. Check the argument '{}'".format(sorted(df_columns), arg_name)
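The new loop simply normalizes the mixed input (plain strings and ColumnExpressions) into a flat list of names before the existence check; this is the plumbing behind DataFrame.agg() accepting ColumnExpressions, per the release notes below. A self-contained sketch of the same pattern, where expand is a stand-in for UtilFuncs._all_df_columns (which needs a live DataFrame and is not reproduced here):

    # Stand-alone illustration of the normalization above; `expand` is hypothetical.
    def flatten_column_args(columns, expand):
        names = []
        for column in columns:
            if isinstance(column, str):
                names.append(column)        # plain column name passes through
            else:
                names += expand(column)     # ColumnExpression -> list of names
        return names

    # With strings only, behavior is unchanged:
    print(flatten_column_args(["col1", "col2"], expand=lambda c: [c.name]))
    # ['col1', 'col2']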
@@ -2237,3 +2245,33 @@ class _Validators:
             raise TeradataMlException(message,
                                       MessageCodes.IMPORT_PYTHON_PACKAGE)
         return True
+
+
+    @staticmethod
+    @skip_validation()
+    def _validate_ipaddress(ip_address):
+        """
+        DESCRIPTION:
+            Check if the ip address is valid.
+
+        PARAMETERS:
+            ip_address:
+                Required Argument.
+                Specifies the ip address to be validated.
+                Types: str
+
+        RETURNS:
+            True, if the ip address is valid.
+
+        RAISES:
+            ValueError
+
+        EXAMPLES:
+            _Validators._validate_ipaddress("190.132.12.15")
+        """
+        import ipaddress
+
+        try:
+            ipaddress.ip_address(ip_address)
+        except Exception:
+            raise ValueError(Messages.get_message(
+                MessageCodes.INVALID_ARG_VALUE).format(ip_address, "ip_address",
+                'of four numbers (each between 0 and 255) separated by periods'))
+
+        return True
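The validator leans entirely on the standard library: ipaddress.ip_address() raises ValueError for anything that is not a valid IPv4 or IPv6 address, which the method converts into teradataml's INVALID_ARG_VALUE message. The underlying behavior can be verified in any Python 3.8+ interpreter:

    # Standard-library behavior the new validator relies on.
    import ipaddress

    ipaddress.ip_address("190.132.12.15")    # -> IPv4Address('190.132.12.15')
    ipaddress.ip_address("2001:db8::1")      # IPv6 is accepted as well
    try:
        ipaddress.ip_address("999.132.12.15")  # octet out of range
    except ValueError as err:
        print(err)  # "'999.132.12.15' does not appear to be an IPv4 or IPv6 address"

Note that because ip_address() also accepts IPv6, the error message's "four numbers separated by periods" wording describes only the IPv4 case.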
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: teradataml
-Version: 20.0.0.0
+Version: 20.0.0.1
 Summary: Teradata Vantage Python package for Advanced Analytics
 Home-page: http://www.teradata.com/
 Author: Teradata Corporation
@@ -8,24 +8,25 @@ License: Teradata License Agreement
 Keywords: Teradata
 Platform: MacOS X, Windows, Linux
 Classifier: Programming Language :: Python :: 3 :: Only
-Classifier: Programming Language :: Python :: 3.5
-Classifier: Programming Language :: Python :: 3.6
-Classifier: Programming Language :: Python :: 3.7
+Classifier: Programming Language :: Python :: 3.8
+Classifier: Programming Language :: Python :: 3.9
 Classifier: Operating System :: Microsoft :: Windows
 Classifier: Operating System :: MacOS :: MacOS X
 Classifier: Operating System :: POSIX :: Linux
 Classifier: Topic :: Database :: Front-Ends
 Classifier: License :: Other/Proprietary License
-Requires-Python: >=3.5
+Requires-Python: >=3.8
 Description-Content-Type: text/markdown
 Requires-Dist: teradatasql (>=17.10.0.11)
-Requires-Dist: teradatasqlalchemy (>=20.0.0.0)
+Requires-Dist: teradatasqlalchemy (>=20.0.0.1)
 Requires-Dist: pandas (>=0.22)
 Requires-Dist: psutil
 Requires-Dist: requests (>=2.25.1)
 Requires-Dist: scikit-learn (>=0.24.2)
 Requires-Dist: IPython (>=8.10.0)
 Requires-Dist: imbalanced-learn (>=0.8.0)
+Requires-Dist: pyjwt (>=2.8.0)
+Requires-Dist: cryptography (>=42.0.5)
 
 ## Teradata Python package for Advanced Analytics.
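The two new pins line up with the PAT feature: pyjwt (with the cryptography backend) is the usual way to sign a JWT with an RSA private key such as the Console-generated pem_file. A hedged sketch of that general technique; the key path, claims and algorithm here are illustrative and not taken from teradataml's internals:

    # Generic RS256 JWT signing with pyjwt + cryptography; payload is illustrative.
    import time
    import jwt  # pyjwt>=2.8.0

    with open("alice_private_key.pem", "rb") as key_file:
        private_key = key_file.read()

    token = jwt.encode(
        {"sub": "alice", "exp": int(time.time()) + 3600},
        private_key,
        algorithm="RS256",  # RSA signing is what pulls in the 'cryptography' dependency
    )
    print(token[:20], "...")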
@@ -45,6 +46,77 @@ Copyright 2024, Teradata. All Rights Reserved.
 * [License](#license)
 
 ## Release Notes:
+#### teradataml 20.00.00.01
+* teradataml no longer supports Python versions less than 3.8.
+
+* ##### New Features/Functionality
+  * ##### Personal Access Token (PAT) support in teradataml
+    * `set_auth_token()` - teradataml now supports authentication via PAT in addition to
+      OAuth 2.0 Device Authorization Grant (formerly known as the Device Flow).
+    * It accepts the UES URL, the Personal Access Token (PAT) and the private key file generated
+      from the VantageCloud Lake Console, plus the optional arguments `username` and
+      `expiration_time` (in seconds).
+
+* ##### Updates
+  * ##### teradataml: SQLE Engine Analytic Functions
+    * `ANOVA()`
+      * New arguments added: `group_name_column`, `group_value_name`, `group_names`, `num_groups` for data containing group values and group names.
+    * `FTest()`
+      * New arguments added: `sample_name_column`, `sample_name_value`, `first_sample_name`, `second_sample_name`.
+    * `GLM()`
+      * Supports stepwise regression and accepts new arguments `stepwise_direction`, `max_steps_num` and `initial_stepwise_columns`.
+      * New arguments added: `attribute_data`, `parameter_data`, `iteration_mode` and `partition_column`.
+    * `GetFutileColumns()`
+      * Arguments `category_summary_column` and `threshold_value` are now optional.
+    * `KMeans()`
+      * New argument added: `initialcentroids_method`.
+    * `NonLinearCombineFit()`
+      * Argument `result_column` is now optional.
+    * `ROC()`
+      * Argument `positive_class` is now optional.
+    * `SVMPredict()`
+      * New argument added: `model_type`.
+    * `ScaleFit()`
+      * New arguments added: `ignoreinvalid_locationscale`, `unused_attributes`, `attribute_name_column`, `attribute_value_column`.
+      * Arguments `attribute_name_column`, `attribute_value_column` and `target_attributes` are supported for sparse input.
+      * Arguments `attribute_data`, `parameter_data` and `partition_column` are supported for partitioning.
+    * `ScaleTransform()`
+      * New arguments added: `attribute_name_column` and `attribute_value_column` support for sparse input.
+    * `TDGLMPredict()`
+      * New arguments added: `family` and `partition_column`.
+    * `XGBoost()`
+      * New argument `base_score` is added for initial prediction value for all data points.
+    * `XGBoostPredict()`
+      * New argument `detailed` is added for detailed information of each prediction.
+    * `ZTest()`
+      * New arguments added: `sample_name_column`, `sample_value_column`, `first_sample_name` and `second_sample_name`.
+  * ##### teradataml: AutoML
+    * `AutoML()`, `AutoRegressor()` and `AutoClassifier()`
+      * New argument `max_models` is added as an early stopping criterion to limit the maximum number of models to be trained.
+  * ##### teradataml: DataFrame functions
+    * `DataFrame.agg()`
+      * Accepts ColumnExpressions and list of ColumnExpressions as arguments.
+  * ##### teradataml: General Functions
+    * Data Transfer Utility
+      * `fastload()` - Improved error and warning table handling with the new arguments listed below.
+        * `err_staging_db`
+        * `err_tbl_name`
+        * `warn_tbl_name`
+        * `err_tbl_1_suffix`
+        * `err_tbl_2_suffix`
+      * `fastload()` - Change in behaviour of the `save_errors` argument.
+        When `save_errors` is set to `True`, error information is available in two persistent tables, `ERR_1` and `ERR_2`.
+        When `save_errors` is set to `False`, error information is available in a single pandas DataFrame.
+    * Garbage collector location is now configurable.
+      Users can set `configure.local_storage` to a desired location.
+
+* ##### Bug Fixes
+  * UAF functions now work if the database name has special characters.
+  * OpensourceML can now read and process NULL/nan values.
+  * Boolean output values are now returned as a VARBYTE column with 0 or 1 values in OpensourceML.
+  * Fixed a bug in `Apply`'s `deploy()`.
+  * Fixed an issue with volatile table creation: the table is now created in the right database, i.e., the user's spool space, regardless of the temp database specified.
+  * `ColumnTransformer` function now processes its arguments in the order they are passed.
+
 #### teradataml 20.00.00.00
 * ##### New Features/Functionality
   * ###### teradataml OpenML: Run Opensource packages through Teradata Vantage
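The fastload() changes in the 20.00.00.01 notes above are behavioral as well as additive. A hedged sketch of the new error-table controls; the database and table names are placeholders, and the argument semantics are inferred from the release notes rather than from the fastload implementation itself:

    # Hedged sketch of the new fastload() error-handling arguments; names are placeholders.
    import pandas as pd
    from teradataml import fastload

    pdf = pd.DataFrame({"id": [1, 2, 3], "val": ["a", "b", "c"]})
    fastload(df=pdf,
             table_name="demo_tbl",
             save_errors=True,            # True: errors persist in ERR_1/ERR_2 tables
             err_staging_db="stage_db",   # where error/warning tables are staged
             err_tbl_name="demo_err",
             warn_tbl_name="demo_warn",
             err_tbl_1_suffix="_e1",
             err_tbl_2_suffix="_e2")

With save_errors=False, per the notes, error information is instead surfaced in a single pandas DataFrame.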