PyPI - snowflake-ml-python - Versions diffs - 1.0.2__py3-none-any.whl → 1.0.3__py3-none-any.whl - Mend

snowflake-ml-python 1.0.2 (py3-none-any.whl) → 1.0.3 (py3-none-any.whl)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (189)

snowflake/ml/_internal/env_utils.py +2 -1
snowflake/ml/_internal/file_utils.py +29 -7
snowflake/ml/_internal/telemetry.py +5 -8
snowflake/ml/_internal/utils/uri.py +7 -2
snowflake/ml/model/_deploy_client/image_builds/base_image_builder.py +15 -0
snowflake/ml/model/_deploy_client/image_builds/client_image_builder.py +259 -0
snowflake/ml/model/_deploy_client/image_builds/docker_context.py +89 -0
snowflake/ml/model/_deploy_client/image_builds/gunicorn_run.sh +24 -0
snowflake/ml/model/_deploy_client/image_builds/inference_server/main.py +118 -0
snowflake/ml/model/_deploy_client/image_builds/templates/dockerfile_template +40 -0
snowflake/ml/model/_deploy_client/snowservice/deploy.py +199 -0
snowflake/ml/model/_deploy_client/snowservice/deploy_options.py +88 -0
snowflake/ml/model/_deploy_client/snowservice/templates/service_spec_template +24 -0
snowflake/ml/model/_deploy_client/utils/constants.py +47 -0
snowflake/ml/model/_deploy_client/utils/snowservice_client.py +178 -0
snowflake/ml/model/_deploy_client/warehouse/deploy.py +24 -6
snowflake/ml/model/_deploy_client/warehouse/infer_template.py +5 -2
snowflake/ml/model/_deployer.py +14 -27
snowflake/ml/model/_env.py +4 -4
snowflake/ml/model/_handlers/custom.py +14 -2
snowflake/ml/model/_handlers/pytorch.py +186 -0
snowflake/ml/model/_handlers/sklearn.py +14 -9
snowflake/ml/model/_handlers/snowmlmodel.py +14 -9
snowflake/ml/model/_handlers/torchscript.py +180 -0
snowflake/ml/model/_handlers/xgboost.py +19 -9
snowflake/ml/model/_model.py +3 -2
snowflake/ml/model/_model_meta.py +12 -7
snowflake/ml/model/model_signature.py +446 -66
snowflake/ml/model/type_hints.py +23 -4
snowflake/ml/modeling/calibration/calibrated_classifier_cv.py +51 -26
snowflake/ml/modeling/cluster/affinity_propagation.py +51 -26
snowflake/ml/modeling/cluster/agglomerative_clustering.py +51 -26
snowflake/ml/modeling/cluster/birch.py +51 -26
snowflake/ml/modeling/cluster/bisecting_k_means.py +51 -26
snowflake/ml/modeling/cluster/dbscan.py +51 -26
snowflake/ml/modeling/cluster/feature_agglomeration.py +51 -26
snowflake/ml/modeling/cluster/k_means.py +51 -26
snowflake/ml/modeling/cluster/mean_shift.py +51 -26
snowflake/ml/modeling/cluster/mini_batch_k_means.py +51 -26
snowflake/ml/modeling/cluster/optics.py +51 -26
snowflake/ml/modeling/cluster/spectral_biclustering.py +51 -26
snowflake/ml/modeling/cluster/spectral_clustering.py +51 -26
snowflake/ml/modeling/cluster/spectral_coclustering.py +51 -26
snowflake/ml/modeling/compose/column_transformer.py +51 -26
snowflake/ml/modeling/compose/transformed_target_regressor.py +51 -26
snowflake/ml/modeling/covariance/elliptic_envelope.py +51 -26
snowflake/ml/modeling/covariance/empirical_covariance.py +51 -26
snowflake/ml/modeling/covariance/graphical_lasso.py +51 -26
snowflake/ml/modeling/covariance/graphical_lasso_cv.py +51 -26
snowflake/ml/modeling/covariance/ledoit_wolf.py +51 -26
snowflake/ml/modeling/covariance/min_cov_det.py +51 -26
snowflake/ml/modeling/covariance/oas.py +51 -26
snowflake/ml/modeling/covariance/shrunk_covariance.py +51 -26
snowflake/ml/modeling/decomposition/dictionary_learning.py +51 -26
snowflake/ml/modeling/decomposition/factor_analysis.py +51 -26
snowflake/ml/modeling/decomposition/fast_ica.py +51 -26
snowflake/ml/modeling/decomposition/incremental_pca.py +51 -26
snowflake/ml/modeling/decomposition/kernel_pca.py +51 -26
snowflake/ml/modeling/decomposition/mini_batch_dictionary_learning.py +51 -26
snowflake/ml/modeling/decomposition/mini_batch_sparse_pca.py +51 -26
snowflake/ml/modeling/decomposition/pca.py +51 -26
snowflake/ml/modeling/decomposition/sparse_pca.py +51 -26
snowflake/ml/modeling/decomposition/truncated_svd.py +51 -26
snowflake/ml/modeling/discriminant_analysis/linear_discriminant_analysis.py +51 -26
snowflake/ml/modeling/discriminant_analysis/quadratic_discriminant_analysis.py +51 -26
snowflake/ml/modeling/ensemble/ada_boost_classifier.py +51 -26
snowflake/ml/modeling/ensemble/ada_boost_regressor.py +51 -26
snowflake/ml/modeling/ensemble/bagging_classifier.py +51 -26
snowflake/ml/modeling/ensemble/bagging_regressor.py +51 -26
snowflake/ml/modeling/ensemble/extra_trees_classifier.py +51 -26
snowflake/ml/modeling/ensemble/extra_trees_regressor.py +51 -26
snowflake/ml/modeling/ensemble/gradient_boosting_classifier.py +51 -26
snowflake/ml/modeling/ensemble/gradient_boosting_regressor.py +51 -26
snowflake/ml/modeling/ensemble/hist_gradient_boosting_classifier.py +51 -26
snowflake/ml/modeling/ensemble/hist_gradient_boosting_regressor.py +51 -26
snowflake/ml/modeling/ensemble/isolation_forest.py +51 -26
snowflake/ml/modeling/ensemble/random_forest_classifier.py +51 -26
snowflake/ml/modeling/ensemble/random_forest_regressor.py +51 -26
snowflake/ml/modeling/ensemble/stacking_regressor.py +51 -26
snowflake/ml/modeling/ensemble/voting_classifier.py +51 -26
snowflake/ml/modeling/ensemble/voting_regressor.py +51 -26
snowflake/ml/modeling/feature_selection/generic_univariate_select.py +51 -26
snowflake/ml/modeling/feature_selection/select_fdr.py +51 -26
snowflake/ml/modeling/feature_selection/select_fpr.py +51 -26
snowflake/ml/modeling/feature_selection/select_fwe.py +51 -26
snowflake/ml/modeling/feature_selection/select_k_best.py +51 -26
snowflake/ml/modeling/feature_selection/select_percentile.py +51 -26
snowflake/ml/modeling/feature_selection/sequential_feature_selector.py +51 -26
snowflake/ml/modeling/feature_selection/variance_threshold.py +51 -26
snowflake/ml/modeling/gaussian_process/gaussian_process_classifier.py +51 -26
snowflake/ml/modeling/gaussian_process/gaussian_process_regressor.py +51 -26
snowflake/ml/modeling/impute/iterative_imputer.py +51 -26
snowflake/ml/modeling/impute/knn_imputer.py +51 -26
snowflake/ml/modeling/impute/missing_indicator.py +51 -26
snowflake/ml/modeling/kernel_approximation/additive_chi2_sampler.py +51 -26
snowflake/ml/modeling/kernel_approximation/nystroem.py +51 -26
snowflake/ml/modeling/kernel_approximation/polynomial_count_sketch.py +51 -26
snowflake/ml/modeling/kernel_approximation/rbf_sampler.py +51 -26
snowflake/ml/modeling/kernel_approximation/skewed_chi2_sampler.py +51 -26
snowflake/ml/modeling/kernel_ridge/kernel_ridge.py +51 -26
snowflake/ml/modeling/lightgbm/lgbm_classifier.py +51 -26
snowflake/ml/modeling/lightgbm/lgbm_regressor.py +51 -26
snowflake/ml/modeling/linear_model/ard_regression.py +51 -26
snowflake/ml/modeling/linear_model/bayesian_ridge.py +51 -26
snowflake/ml/modeling/linear_model/elastic_net.py +51 -26
snowflake/ml/modeling/linear_model/elastic_net_cv.py +51 -26
snowflake/ml/modeling/linear_model/gamma_regressor.py +51 -26
snowflake/ml/modeling/linear_model/huber_regressor.py +51 -26
snowflake/ml/modeling/linear_model/lars.py +51 -26
snowflake/ml/modeling/linear_model/lars_cv.py +51 -26
snowflake/ml/modeling/linear_model/lasso.py +51 -26
snowflake/ml/modeling/linear_model/lasso_cv.py +51 -26
snowflake/ml/modeling/linear_model/lasso_lars.py +51 -26
snowflake/ml/modeling/linear_model/lasso_lars_cv.py +51 -26
snowflake/ml/modeling/linear_model/lasso_lars_ic.py +51 -26
snowflake/ml/modeling/linear_model/linear_regression.py +51 -26
snowflake/ml/modeling/linear_model/logistic_regression.py +51 -26
snowflake/ml/modeling/linear_model/logistic_regression_cv.py +51 -26
snowflake/ml/modeling/linear_model/multi_task_elastic_net.py +51 -26
snowflake/ml/modeling/linear_model/multi_task_elastic_net_cv.py +51 -26
snowflake/ml/modeling/linear_model/multi_task_lasso.py +51 -26
snowflake/ml/modeling/linear_model/multi_task_lasso_cv.py +51 -26
snowflake/ml/modeling/linear_model/orthogonal_matching_pursuit.py +51 -26
snowflake/ml/modeling/linear_model/passive_aggressive_classifier.py +51 -26
snowflake/ml/modeling/linear_model/passive_aggressive_regressor.py +51 -26
snowflake/ml/modeling/linear_model/perceptron.py +51 -26
snowflake/ml/modeling/linear_model/poisson_regressor.py +51 -26
snowflake/ml/modeling/linear_model/ransac_regressor.py +51 -26
snowflake/ml/modeling/linear_model/ridge.py +51 -26
snowflake/ml/modeling/linear_model/ridge_classifier.py +51 -26
snowflake/ml/modeling/linear_model/ridge_classifier_cv.py +51 -26
snowflake/ml/modeling/linear_model/ridge_cv.py +51 -26
snowflake/ml/modeling/linear_model/sgd_classifier.py +51 -26
snowflake/ml/modeling/linear_model/sgd_one_class_svm.py +51 -26
snowflake/ml/modeling/linear_model/sgd_regressor.py +51 -26
snowflake/ml/modeling/linear_model/theil_sen_regressor.py +51 -26
snowflake/ml/modeling/linear_model/tweedie_regressor.py +51 -26
snowflake/ml/modeling/manifold/isomap.py +51 -26
snowflake/ml/modeling/manifold/mds.py +51 -26
snowflake/ml/modeling/manifold/spectral_embedding.py +51 -26
snowflake/ml/modeling/manifold/tsne.py +51 -26
snowflake/ml/modeling/mixture/bayesian_gaussian_mixture.py +51 -26
snowflake/ml/modeling/mixture/gaussian_mixture.py +51 -26
snowflake/ml/modeling/model_selection/grid_search_cv.py +51 -26
snowflake/ml/modeling/model_selection/randomized_search_cv.py +51 -26
snowflake/ml/modeling/multiclass/one_vs_one_classifier.py +51 -26
snowflake/ml/modeling/multiclass/one_vs_rest_classifier.py +51 -26
snowflake/ml/modeling/multiclass/output_code_classifier.py +51 -26
snowflake/ml/modeling/naive_bayes/bernoulli_nb.py +51 -26
snowflake/ml/modeling/naive_bayes/categorical_nb.py +51 -26
snowflake/ml/modeling/naive_bayes/complement_nb.py +51 -26
snowflake/ml/modeling/naive_bayes/gaussian_nb.py +51 -26
snowflake/ml/modeling/naive_bayes/multinomial_nb.py +51 -26
snowflake/ml/modeling/neighbors/k_neighbors_classifier.py +51 -26
snowflake/ml/modeling/neighbors/k_neighbors_regressor.py +51 -26
snowflake/ml/modeling/neighbors/kernel_density.py +51 -26
snowflake/ml/modeling/neighbors/local_outlier_factor.py +51 -26
snowflake/ml/modeling/neighbors/nearest_centroid.py +51 -26
snowflake/ml/modeling/neighbors/nearest_neighbors.py +51 -26
snowflake/ml/modeling/neighbors/neighborhood_components_analysis.py +51 -26
snowflake/ml/modeling/neighbors/radius_neighbors_classifier.py +51 -26
snowflake/ml/modeling/neighbors/radius_neighbors_regressor.py +51 -26
snowflake/ml/modeling/neural_network/bernoulli_rbm.py +51 -26
snowflake/ml/modeling/neural_network/mlp_classifier.py +51 -26
snowflake/ml/modeling/neural_network/mlp_regressor.py +51 -26
snowflake/ml/modeling/preprocessing/ordinal_encoder.py +2 -0
snowflake/ml/modeling/preprocessing/polynomial_features.py +51 -26
snowflake/ml/modeling/semi_supervised/label_propagation.py +51 -26
snowflake/ml/modeling/semi_supervised/label_spreading.py +51 -26
snowflake/ml/modeling/svm/linear_svc.py +51 -26
snowflake/ml/modeling/svm/linear_svr.py +51 -26
snowflake/ml/modeling/svm/nu_svc.py +51 -26
snowflake/ml/modeling/svm/nu_svr.py +51 -26
snowflake/ml/modeling/svm/svc.py +51 -26
snowflake/ml/modeling/svm/svr.py +51 -26
snowflake/ml/modeling/tree/decision_tree_classifier.py +51 -26
snowflake/ml/modeling/tree/decision_tree_regressor.py +51 -26
snowflake/ml/modeling/tree/extra_tree_classifier.py +51 -26
snowflake/ml/modeling/tree/extra_tree_regressor.py +51 -26
snowflake/ml/modeling/xgboost/xgb_classifier.py +51 -26
snowflake/ml/modeling/xgboost/xgb_regressor.py +51 -26
snowflake/ml/modeling/xgboost/xgbrf_classifier.py +51 -26
snowflake/ml/modeling/xgboost/xgbrf_regressor.py +51 -26
snowflake/ml/registry/model_registry.py +74 -56
snowflake/ml/version.py +1 -1
{snowflake_ml_python-1.0.2.dist-info → snowflake_ml_python-1.0.3.dist-info}/METADATA +27 -8
snowflake_ml_python-1.0.3.dist-info/RECORD +259 -0
snowflake_ml_python-1.0.2.dist-info/RECORD +0 -246
{snowflake_ml_python-1.0.2.dist-info → snowflake_ml_python-1.0.3.dist-info}/WHEEL +0 -0

snowflake/ml/modeling/xgboost/xgbrf_regressor.py CHANGED

@@ -7,6 +7,7 @@
 #
 import inspect
 import os
+import posixpath
 from typing import Iterable, Optional, Union, List, Any, Dict, Callable, Set
 from uuid import uuid4
@@ -26,6 +27,7 @@ from snowflake.ml._internal.utils.temp_file_utils import cleanup_temp_files, get
 from snowflake.snowpark import DataFrame, Session
 from snowflake.snowpark.functions import pandas_udf, sproc
 from snowflake.snowpark.types import PandasSeries
+from snowflake.snowpark._internal.type_utils import convert_sp_to_sf_type
 from snowflake.ml.model.model_signature import (
     DataType,
@@ -392,7 +394,6 @@ class XGBRFRegressor(BaseTransformer):
         **kwargs,
     ) -> None:
         super().__init__()
-        self.id = str(uuid4()).replace("-", "_").upper()
         deps: Set[str] = set([f'numpy=={np.__version__}', f'xgboost=={xgboost.__version__}', f'cloudpickle=={cp.__version__}'])
         self._deps = list(deps)
@@ -416,6 +417,15 @@ class XGBRFRegressor(BaseTransformer):
         self.set_drop_input_cols(drop_input_cols)
         self.set_sample_weight_col(sample_weight_col)
+    def _get_rand_id(self) -> str:
+        """
+        Generate random id to be used in sproc and stage names.
+        Returns:
+            Random id string usable in sproc, table, and stage names.
+        """
+        return str(uuid4()).replace("-", "_").upper()
     def _infer_input_output_cols(self, dataset: Union[DataFrame, pd.DataFrame]) -> None:
         """
         Infer `self.input_cols` and `self.output_cols` if they are not explicitly set.
@@ -494,7 +504,7 @@ class XGBRFRegressor(BaseTransformer):
             cp.dump(self._sklearn_object, local_transform_file)
         # Create temp stage to run fit.
-        transform_stage_name = "SNOWML_TRANSFORM_{safe_id}".format(safe_id=self.id)
+        transform_stage_name = "SNOWML_TRANSFORM_{safe_id}".format(safe_id=self._get_rand_id())
         stage_creation_query = f"CREATE OR REPLACE TEMPORARY STAGE {transform_stage_name};"
         SqlResultValidator(
             session=session,
@@ -507,11 +517,12 @@ class XGBRFRegressor(BaseTransformer):
             expected_value=f"Stage area {transform_stage_name} successfully created."
         ).validate()
-        stage_transform_file_name = os.path.join(transform_stage_name, os.path.basename(local_transform_file_name))
+        # Use posixpath to construct stage paths
+        stage_transform_file_name = posixpath.join(transform_stage_name, os.path.basename(local_transform_file_name))
+        stage_result_file_name = posixpath.join(transform_stage_name, os.path.basename(local_transform_file_name))
         local_result_file_name = get_temp_file_path()
-        stage_result_file_name = os.path.join(transform_stage_name, os.path.basename(local_transform_file_name))
-        fit_sproc_name = "SNOWML_FIT_{safe_id}".format(safe_id=self.id)
+        fit_sproc_name = "SNOWML_FIT_{safe_id}".format(safe_id=self._get_rand_id())
         statement_params = telemetry.get_function_usage_statement_params(
             project=_PROJECT,
             subproject=_SUBPROJECT,
@@ -537,6 +548,7 @@ class XGBRFRegressor(BaseTransformer):
             replace=True,
             session=session,
             statement_params=statement_params,
+            anonymous=True
         )
         def fit_wrapper_sproc(
             session: Session,
@@ -545,7 +557,8 @@ class XGBRFRegressor(BaseTransformer):
             stage_result_file_name: str,
             input_cols: List[str],
             label_cols: List[str],
-            sample_weight_col: Optional[str]
+            sample_weight_col: Optional[str],
+            statement_params: Dict[str, str]
         ) -> str:
             import cloudpickle as cp
             import numpy as np
@@ -612,15 +625,15 @@ class XGBRFRegressor(BaseTransformer):
             api_calls=[Session.call],
             custom_tags=dict([("autogen", True)]),
         )
-        sproc_export_file_name = session.call(
-            fit_sproc_name,
+        sproc_export_file_name = fit_wrapper_sproc(
+            session,
             query,
             stage_transform_file_name,
             stage_result_file_name,
             identifier.get_unescaped_names(self.input_cols),
             identifier.get_unescaped_names(self.label_cols),
             identifier.get_unescaped_names(self.sample_weight_col),
-            statement_params=statement_params,
+            statement_params,
         )
         if "|" in sproc_export_file_name:
@@ -630,7 +643,7 @@ class XGBRFRegressor(BaseTransformer):
                 print("\n".join(fields[1:]))
         session.file.get(
-            os.path.join(stage_result_file_name, sproc_export_file_name),
+            posixpath.join(stage_result_file_name, sproc_export_file_name),
             local_result_file_name,
             statement_params=statement_params
         )
@@ -676,7 +689,7 @@ class XGBRFRegressor(BaseTransformer):
         # Register vectorized UDF for batch inference
         batch_inference_udf_name = "SNOWML_BATCH_INFERENCE_{safe_id}_{method}".format(
-                safe_id=self.id, method=inference_method)
+                safe_id=self._get_rand_id(), method=inference_method)
         # Need to do this since if we use self._sklearn_object directly in the UDF, Snowpark
         # will try to pickle all of self which fails.
@@ -768,7 +781,7 @@ class XGBRFRegressor(BaseTransformer):
             return transformed_pandas_df.to_dict("records")
         batch_inference_table_name = "SNOWML_BATCH_INFERENCE_INPUT_TABLE_{safe_id}".format(
-            safe_id=self.id
+            safe_id=self._get_rand_id()
         )
         pass_through_columns = self._get_pass_through_columns(dataset)
@@ -935,11 +948,18 @@ class XGBRFRegressor(BaseTransformer):
             Transformed dataset.
         """
         if isinstance(dataset, DataFrame):
+            expected_type_inferred = "float"
+            # when it is classifier, infer the datatype from label columns
+            if expected_type_inferred == "" and 'predict' in self.model_signatures:
+                expected_type_inferred = convert_sp_to_sf_type(
+                    self.model_signatures['predict'].outputs[0].as_snowpark_type()
+                )
             output_df = self._batch_inference(
                 dataset=dataset,
                 inference_method="predict",
                 expected_output_cols_list=self.output_cols,
-                expected_output_cols_type="float",
+                expected_output_cols_type=expected_type_inferred,
             )
         elif isinstance(dataset, pd.DataFrame):
             output_df = self._sklearn_inference(
@@ -1010,10 +1030,10 @@ class XGBRFRegressor(BaseTransformer):
     def _get_output_column_names(self, output_cols_prefix: str) -> List[str]:
         """ Returns the list of output columns for predict_proba(), decision_function(), etc.. functions.
-        Returns an empty list if current object is not a classifier or not yet fitted.
+        Returns a list with output_cols_prefix as the only element if the estimator is not a classifier.
         """
         if getattr(self._sklearn_object, "classes_", None) is None:
-            return []
+            return [output_cols_prefix]
         classes = self._sklearn_object.classes_
         if isinstance(classes, numpy.ndarray):
@@ -1238,7 +1258,7 @@ class XGBRFRegressor(BaseTransformer):
             cp.dump(self._sklearn_object, local_score_file)
         # Create temp stage to run score.
-        score_stage_name = "SNOWML_SCORE_{safe_id}".format(safe_id=self.id)
+        score_stage_name = "SNOWML_SCORE_{safe_id}".format(safe_id=self._get_rand_id())
         session = dataset._session
         stage_creation_query = f"CREATE OR REPLACE TEMPORARY STAGE {score_stage_name};"
         SqlResultValidator(
@@ -1252,8 +1272,9 @@ class XGBRFRegressor(BaseTransformer):
             expected_value=f"Stage area {score_stage_name} successfully created."
         ).validate()
-        stage_score_file_name = os.path.join(score_stage_name, os.path.basename(local_score_file_name))
-        score_sproc_name = "SNOWML_SCORE_{safe_id}".format(safe_id=self.id)
+        # Use posixpath to construct stage paths
+        stage_score_file_name = posixpath.join(score_stage_name, os.path.basename(local_score_file_name))
+        score_sproc_name = "SNOWML_SCORE_{safe_id}".format(safe_id=self._get_rand_id())
         statement_params = telemetry.get_function_usage_statement_params(
             project=_PROJECT,
             subproject=_SUBPROJECT,
@@ -1279,6 +1300,7 @@ class XGBRFRegressor(BaseTransformer):
             replace=True,
             session=session,
             statement_params=statement_params,
+            anonymous=True
         )
         def score_wrapper_sproc(
             session: Session,
@@ -1286,7 +1308,8 @@ class XGBRFRegressor(BaseTransformer):
             stage_score_file_name: str,
             input_cols: List[str],
             label_cols: List[str],
-            sample_weight_col: Optional[str]
+            sample_weight_col: Optional[str],
+            statement_params: Dict[str, str]
         ) -> float:
             import cloudpickle as cp
             import numpy as np
@@ -1336,14 +1359,14 @@ class XGBRFRegressor(BaseTransformer):
             api_calls=[Session.call],
             custom_tags=dict([("autogen", True)]),
         )
-        score = session.call(
-            score_sproc_name,
+        score = score_wrapper_sproc(
+            session,
             query,
             stage_score_file_name,
             identifier.get_unescaped_names(self.input_cols),
             identifier.get_unescaped_names(self.label_cols),
             identifier.get_unescaped_names(self.sample_weight_col),
-            statement_params=statement_params,
+            statement_params,
         )
         cleanup_temp_files([local_score_file_name])
@@ -1361,18 +1384,20 @@ class XGBRFRegressor(BaseTransformer):
             if self._sklearn_object._estimator_type == 'classifier':
                 outputs = _infer_signature(dataset[self.label_cols], "output")  # label columns is the desired type for output
                 outputs = _rename_features(outputs, self.output_cols)  # rename the output columns
-                self._model_signature_dict["predict"] = ModelSignature(inputs, outputs)
+                self._model_signature_dict["predict"] = ModelSignature(inputs,
+                                                                       ([] if self._drop_input_cols else inputs) + outputs)
             # For regressor, the type of predict is float64
             elif self._sklearn_object._estimator_type == 'regressor':
                 outputs = [FeatureSpec(dtype=DataType.DOUBLE, name=c) for c in self.output_cols]
-                self._model_signature_dict["predict"] = ModelSignature(inputs, outputs)
+                self._model_signature_dict["predict"] = ModelSignature(inputs,
+                                                                       ([] if self._drop_input_cols else inputs) + outputs)
         for prob_func in PROB_FUNCTIONS:
             if hasattr(self, prob_func):
                 output_cols_prefix: str = f"{prob_func}_"
                 output_column_names = self._get_output_column_names(output_cols_prefix)
                 outputs = [FeatureSpec(dtype=DataType.DOUBLE, name=c) for c in output_column_names]
-                self._model_signature_dict[prob_func] = ModelSignature(inputs, outputs)
+                self._model_signature_dict[prob_func] = ModelSignature(inputs,
+                                                                       ([] if self._drop_input_cols else inputs) + outputs)
     @property
     def model_signatures(self) -> Dict[str, ModelSignature]:
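
Note on the `os.path.join` → `posixpath.join` replacements in the hunks above: the diff only says "Use posixpath to construct stage paths" and does not state the motivation. A plausible reading is that stage paths need forward slashes regardless of the client operating system, whereas `os.path.join` follows the local platform's separator. The sketch below illustrates that difference with the standard library alone; the stage and file names are hypothetical placeholders, and `ntpath` stands in for what `os.path` resolves to on Windows.

```python
import ntpath      # Windows flavour of os.path; importable on any platform
import posixpath   # POSIX flavour of os.path; importable on any platform

# Hypothetical names, used only for illustration (not taken from the package).
stage_name = "SNOWML_TRANSFORM_EXAMPLE"
file_name = "model.pkl"

# What os.path.join would produce on a Windows client: a backslash separator.
windows_style = ntpath.join(stage_name, file_name)

# What posixpath.join produces on every platform: a forward slash.
posix_style = posixpath.join(stage_name, file_name)

print(windows_style)
print(posix_style)
```

Because `posixpath.join` does not depend on the host platform, the stage path built on the client has the same form whether the code runs on Windows, macOS, or Linux.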

snowflake/ml/registry/model_registry.py CHANGED

@@ -1,6 +1,7 @@
 import inspect
 import json
 import os
+import posixpath
 import sys
 import tempfile
 import types
@@ -34,7 +35,7 @@ _DEFAULT_REGISTRY_NAME: str = "_SYSTEM_MODEL_REGISTRY"
 _DEFAULT_SCHEMA_NAME: str = "_SYSTEM_MODEL_REGISTRY_SCHEMA"
 _MODELS_TABLE_NAME: str = "_SYSTEM_REGISTRY_MODELS"
 _METADATA_TABLE_NAME: str = "_SYSTEM_REGISTRY_METADATA"
-_DEPLOYMENT_TABLE_NAME: str = "_SYSTEM_REGISRTRY_DEPLOYMENTS"
+_DEPLOYMENT_TABLE_NAME: str = "_SYSTEM_REGISTRY_DEPLOYMENTS"
 # Metadata operation types.
 _SET_METADATA_OPERATION: str = "SET"
@@ -83,9 +84,11 @@ def create_model_registry(
     """
     # These might be exposed as parameters in the future.
-    registry_table_name = _MODELS_TABLE_NAME
-    metadata_table_name = _METADATA_TABLE_NAME
-    deployment_table_name = _DEPLOYMENT_TABLE_NAME
+    database_name = identifier.get_inferred_name(database_name)
+    schema_name = identifier.get_inferred_name(schema_name)
+    registry_table_name = identifier.get_inferred_name(_MODELS_TABLE_NAME)
+    metadata_table_name = identifier.get_inferred_name(_METADATA_TABLE_NAME)
+    deployment_table_name = identifier.get_inferred_name(_DEPLOYMENT_TABLE_NAME)
     statement_params = telemetry.get_function_usage_statement_params(
         project=_TELEMETRY_PROJECT,
         subproject=_TELEMETRY_SUBPROJECT,
@@ -129,14 +132,14 @@ def _create_registry_database(
         database_name: Desired name of the model registry database.
         statement_params: Function usage statement parameters used in sql query executions.
     """
-    registry_databases = session.sql(f"SHOW DATABASES LIKE '{database_name}'").collect(
+    registry_databases = session.sql(f"SHOW DATABASES LIKE '{identifier.get_unescaped_names(database_name)}'").collect(
         statement_params=statement_params
     )
     if len(registry_databases) > 0:
         logging.warning(f"The database {database_name} already exists. Skipping creation.")
         return
-    session.sql(f'CREATE DATABASE "{database_name}"').collect(statement_params=statement_params)
+    session.sql(f"CREATE DATABASE {database_name}").collect(statement_params=statement_params)
 def _create_registry_schema(
@@ -153,31 +156,31 @@ def _create_registry_schema(
         statement_params: Function usage statement parameters used in sql query executions.
     """
     # The default PUBLIC schema is created by default so it might already exist even in a new database.
-    registry_schemas = session.sql(f"SHOW SCHEMAS LIKE '{schema_name}' IN DATABASE \"{database_name}\"").collect(
-        statement_params=statement_params
-    )
+    registry_schemas = session.sql(
+        f"SHOW SCHEMAS LIKE '{identifier.get_unescaped_names(schema_name)}' IN DATABASE {database_name}"
+    ).collect(statement_params=statement_params)
     if len(registry_schemas) > 0:
-        logging.warning(f'The schmea "{database_name}"."{schema_name}" already exists. Skipping creation.')
+        logging.warning(
+            f"The schema {_get_fully_qualified_schema_name(database_name, schema_name)}already exists. "
+            + "Skipping creation."
+        )
         return
-    session.sql(f'CREATE SCHEMA "{database_name}"."{schema_name}"').collect(statement_params=statement_params)
+    session.sql(f"CREATE SCHEMA {_get_fully_qualified_schema_name(database_name, schema_name)}").collect(
+        statement_params=statement_params
+    )
 def _get_fully_qualified_schema_name(database_name: str, schema_name: str) -> str:
-    return ".".join(
-        [
-            f"{identifier.quote_name_without_upper_casing(database_name)}",
-            f"{identifier.quote_name_without_upper_casing(schema_name)}",
-        ]
-    )
+    return ".".join([database_name, schema_name])
 def _get_fully_qualified_table_name(database_name: str, schema_name: str, table_name: str) -> str:
     return ".".join(
         [
             _get_fully_qualified_schema_name(database_name, schema_name),
-            f"{identifier.quote_name_without_upper_casing(table_name)}",
+            table_name,
         ]
     )
@@ -291,10 +294,10 @@ def _create_registry_views(
     )
     # Create views on most recent metadata items.
-    metadata_view_name_prefix = metadata_table_name + "_LAST_"
+    metadata_view_name_prefix = identifier.concat_names([metadata_table_name, "_LAST_"])
     metadata_view_template = formatting.unwrap(
-        """CREATE OR REPLACE VIEW "{database}"."{schema}"."{attribute_view}" COPY GRANTS AS
-            SELECT DISTINCT MODEL_ID, {select_expression} AS {final_attribute_name} FROM "{metadata_table}"
+        """CREATE OR REPLACE VIEW {database}.{schema}.{attribute_view} COPY GRANTS AS
+            SELECT DISTINCT MODEL_ID, {select_expression} AS {final_attribute_name} FROM {metadata_table}
             WHERE ATTRIBUTE_NAME = '{attribute_name}'"""
     )
@@ -302,7 +305,7 @@ def _create_registry_views(
     metadata_view_names = []
     metadata_select_fields = []
     for attribute_name in _LIST_METADATA_ATTRIBUTE:
-        view_name = f"{metadata_view_name_prefix}{attribute_name}"
+        view_name = identifier.concat_names([metadata_view_name_prefix, attribute_name])
         select_expression = f"(LAST_VALUE(VALUE) OVER (PARTITION BY MODEL_ID ORDER BY SEQUENCE_ID))['{attribute_name}']"
         sql = metadata_view_template.format(
             database=database_name,
@@ -315,14 +318,12 @@ def _create_registry_views(
         )
         session.sql(sql).collect(statement_params=statement_params)
         metadata_view_names.append(view_name)
-        metadata_select_fields.append(
-            f"{identifier.quote_name_without_upper_casing(view_name)}.{attribute_name} AS {attribute_name}"
-        )
+        metadata_select_fields.append(f"{view_name}.{attribute_name} AS {attribute_name}")
     # Create a special view for the registration timestamp.
     attribute_name = _METADATA_ATTRIBUTE_REGISTRATION
-    final_attribute_name = attribute_name + "_TIMESTAMP"
-    view_name = f"{metadata_view_name_prefix}{attribute_name}"
+    final_attribute_name = identifier.concat_names([attribute_name, "_TIMESTAMP"])
+    view_name = identifier.concat_names([metadata_view_name_prefix, attribute_name])
     create_registration_view_sql = metadata_view_template.format(
         database=database_name,
         schema=schema_name,
@@ -334,13 +335,11 @@ def _create_registry_views(
     )
     session.sql(create_registration_view_sql).collect(statement_params=statement_params)
     metadata_view_names.append(view_name)
-    metadata_select_fields.append(
-        f"{identifier.quote_name_without_upper_casing(view_name)}.{final_attribute_name} AS {final_attribute_name}"
-    )
+    metadata_select_fields.append(f"{view_name}.{final_attribute_name} AS {final_attribute_name}")
     metadata_views_join = " ".join(
         [
-            'LEFT JOIN "{view}" ON ("{view}".MODEL_ID = "{registry_table}".ID)'.format(
+            "LEFT JOIN {view} ON ({view}.MODEL_ID = {registry_table}.ID)".format(
                 view=view, registry_table=registry_table_name
             )
             for view in metadata_view_names
@@ -348,12 +347,12 @@ def _create_registry_views(
     )
     # Create view to combine all attributes.
-    registry_view_name = registry_table_name + "_VIEW"
+    registry_view_name = identifier.concat_names([registry_table_name, "_VIEW"])
     metadata_select_fields_formatted = ",".join(metadata_select_fields)
     session.sql(
-        f"""CREATE OR REPLACE VIEW {fully_qualified_schema_name}."{registry_view_name}" COPY GRANTS AS
-                SELECT "{registry_table_name}".*, {metadata_select_fields_formatted}
-                FROM "{registry_table_name}" {metadata_views_join}"""
+        f"""CREATE OR REPLACE VIEW {fully_qualified_schema_name}.{registry_view_name} COPY GRANTS AS
+                SELECT {registry_table_name}.*, {metadata_select_fields_formatted}
+                FROM {registry_table_name} {metadata_views_join}"""
     ).collect(statement_params=statement_params)
@@ -376,8 +375,9 @@ def _create_active_permanent_deployment_view(
     # Create a view on active permanent deployments
     # Active deployments are those whose last operation is not DROP.
+    active_deployments_view_name = identifier.concat_names([deployment_table_name, "_VIEW"])
     active_deployments_view_expr = f"""
-        CREATE OR REPLACE VIEW {fully_qualified_schema_name}."{deployment_table_name}_VIEW"
+        CREATE OR REPLACE VIEW {fully_qualified_schema_name}.{active_deployments_view_name}
         COPY GRANTS AS
         SELECT
             DEPLOYMENT_NAME,
@@ -416,14 +416,14 @@ class ModelRegistry:
             database_name: Desired name of the model registry database.
             schema_name: Desired name of the schema used by this model registry inside the database.
         """
-        self._name = database_name
-        self._schema = schema_name
-        self._registry_table = _MODELS_TABLE_NAME
-        self._registry_table_view = self._registry_table + "_VIEW"
-        self._metadata_table = _METADATA_TABLE_NAME
-        self._deployment_table = _DEPLOYMENT_TABLE_NAME
-        self._permanent_deployment_view = self._deployment_table + "_VIEW"
-        self._permanent_deployment_stage = self._deployment_table + "_STAGE"
+        self._name = identifier.get_inferred_name(database_name)
+        self._schema = identifier.get_inferred_name(schema_name)
+        self._registry_table = identifier.get_inferred_name(_MODELS_TABLE_NAME)
+        self._registry_table_view = identifier.concat_names([self._registry_table, "_VIEW"])
+        self._metadata_table = identifier.get_inferred_name(_METADATA_TABLE_NAME)
+        self._deployment_table = identifier.get_inferred_name(_DEPLOYMENT_TABLE_NAME)
+        self._permanent_deployment_view = identifier.concat_names([self._deployment_table, "_VIEW"])
+        self._permanent_deployment_stage = identifier.concat_names([self._deployment_table, "_STAGE"])
         self._session = session
@@ -440,23 +440,39 @@ class ModelRegistry:
         # Check that the required tables exist and are accessible by the current role.
         query_result_checker.SqlResultValidator(
-            self._session, query=f"SHOW DATABASES LIKE '{self._name}'"
+            self._session, query=f"SHOW DATABASES LIKE '{identifier.get_unescaped_names(self._name)}'"
         ).has_dimensions(expected_rows=1).validate()
         query_result_checker.SqlResultValidator(
-            self._session, query=f"SHOW SCHEMAS LIKE '{self._schema}' IN DATABASE \"{self._name}\""
+            self._session,
+            query=f"SHOW SCHEMAS LIKE '{identifier.get_unescaped_names(self._schema)}' IN DATABASE {self._name}",
         ).has_dimensions(expected_rows=1).validate()
         query_result_checker.SqlResultValidator(
-            self._session, query=f"SHOW TABLES LIKE '{self._registry_table}' IN {self._fully_qualified_schema_name()}"
+            self._session,
+            query=formatting.unwrap(
+                f"""
+            SHOW TABLES LIKE '{identifier.get_unescaped_names(self._registry_table)}'
+            IN {self._fully_qualified_schema_name()}"""
+            ),
         ).has_dimensions(expected_rows=1).validate()
         query_result_checker.SqlResultValidator(
-            self._session, query=f"SHOW TABLES LIKE '{self._metadata_table}' IN {self._fully_qualified_schema_name()}"
+            self._session,
+            query=formatting.unwrap(
+                f"""
+            SHOW TABLES LIKE '{identifier.get_unescaped_names(self._metadata_table)}'
+            IN {self._fully_qualified_schema_name()}"""
+            ),
         ).has_dimensions(expected_rows=1).validate()
         query_result_checker.SqlResultValidator(
-            self._session, query=f"SHOW TABLES LIKE '{self._deployment_table}' IN {self._fully_qualified_schema_name()}"
+            self._session,
+            query=formatting.unwrap(
+                f"""
+            SHOW TABLES LIKE '{identifier.get_unescaped_names(self._deployment_table)}'
+            IN {self._fully_qualified_schema_name()}"""
+            ),
         ).has_dimensions(expected_rows=1).validate()
         # TODO(zzhu): Also check validity of views. Consider checking schema as well.
@@ -824,7 +840,7 @@ class ModelRegistry:
                 before setting the metadata attribute. False by default meaning that by default we will check.
         Raises:
-            DataError: Failed to set the meatdata attribute.
+            DataError: Failed to set the metadata attribute.
             KeyError: The target model doesn't exist
         """
         selected_models = self._list_selected_models(id=id, model_name=model_name, model_version=model_version)
@@ -954,13 +970,13 @@ class ModelRegistry:
         # Check if directory or file and adapt accordingly.
         # TODO: Unify and explicit about compression for both file and directory.
         if os.path.isfile(path):
-            self._session.file.put(path, f"{fully_qualified_model_stage_name}/data")
+            self._session.file.put(path, posixpath.join(fully_qualified_model_stage_name, "data"))
         elif os.path.isdir(path):
             with file_utils.zip_file_or_directory_to_stream(path, path) as input_stream:
                 self._session._conn.upload_stream(
                     input_stream=input_stream,
                     stage_location=fully_qualified_model_stage_name,
-                    dest_filename=f"{os.path.basename(path)}.zip",
+                    dest_filename=f"{posixpath.basename(path)}.zip",
                     dest_prefix="",
                     source_compression="DEFLATE",
                     compress_data=False,
@@ -1066,7 +1082,7 @@ class ModelRegistry:
         """
         # Explicitly not calling collect.
         return self._session.sql(
-            'SELECT * FROM "{database}"."{schema}"."{view}"'.format(
+            "SELECT * FROM {database}.{schema}.{view}".format(
                 database=self._name, schema=self._schema, view=self._registry_table_view
             )
         )
@@ -1123,7 +1139,9 @@ class ModelRegistry:
         try:
             del model_tags[tag_name]
         except KeyError:
-            raise connector.DataError(f"Model id {id} has not tag named {tag_name}. Full list of tags: {model_tags}")
+            raise connector.DataError(
+                f"Model {model_name}/{model_version} has no tag named {tag_name}. Full list of tags: {model_tags}"
+            )
         self._set_metadata_attribute(
             _METADATA_ATTRIBUTE_TAGS, model_tags, model_name=model_name, model_version=model_version
@@ -1226,7 +1244,7 @@ class ModelRegistry:
         result = self._get_metadata_attribute(
             _METADATA_ATTRIBUTE_DESCRIPTION, model_name=model_name, model_version=model_version
         )
-        return None if result is None else str(result)
+        return None if result is None else json.loads(result)
     @telemetry.send_api_usage_telemetry(
         project=_TELEMETRY_PROJECT,
@@ -1558,7 +1576,7 @@ class ModelRegistry:
         restored_model = None
         with tempfile.TemporaryDirectory() as local_model_directory:
             self._session.file.get(remote_model_path, local_model_directory)
-            local_path = os.path.join(local_model_directory, os.path.basename(remote_model_path))
+            local_path = os.path.join(local_model_directory, posixpath.basename(remote_model_path))
             if zipfile.is_zipfile(local_path):
                 extracted_dir = os.path.join(local_model_directory, "extracted")
                 with zipfile.ZipFile(local_path, "r") as myzip:
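
Several hunks above swap `os.path` helpers and f-string joins for `posixpath` when building stage locations. The sketch below illustrates the likely motivation, assuming that stage paths always use forward slashes regardless of the client OS. The stage name is hypothetical, and `ntpath` stands in for what `os.path` resolves to on Windows.

```python
import ntpath     # the module that os.path aliases on Windows
import posixpath  # always joins and splits on "/"

# Hypothetical stage location, for illustration only.
stage = "@MY_DB.MY_SCHEMA.MY_STAGE"

# posixpath produces a forward-slash path on every platform.
posix_joined = posixpath.join(stage, "data")

# The Windows flavour of os.path.join inserts a backslash instead.
windows_joined = ntpath.join(stage, "data")

# Extract the file name from a remote (stage) path.
remote_name = posixpath.basename(posix_joined + "/model.zip")

print(posix_joined)
print(windows_joined)
print(remote_name)
```

Using `posixpath` for remote paths and `os.path` for local ones keeps the two path conventions from leaking into each other.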

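The registry hunks above replace string concatenation such as `registry_table_name + "_VIEW"` with `identifier.concat_names(...)`. The sketch below shows why plain concatenation can break once a name is a double-quoted identifier. `concat_identifier_parts` is a hypothetical helper written for this illustration; it is not the package's `identifier.concat_names` implementation, and it does not handle escaped quotes inside identifiers.

```python
from typing import List


def _is_quoted(name: str) -> bool:
    """Return True if the name is wrapped in double quotes."""
    return len(name) >= 2 and name.startswith('"') and name.endswith('"')


def concat_identifier_parts(parts: List[str]) -> str:
    """Hypothetical helper: concatenate identifier parts into one identifier.

    If no part is quoted, the parts are joined as-is. If any part is quoted,
    the quotes are stripped, unquoted parts are upper-cased (unquoted
    identifiers resolve to upper case in Snowflake), and the whole result
    is wrapped in a single pair of quotes.
    """
    if not any(_is_quoted(p) for p in parts):
        return "".join(parts)
    resolved = [p[1:-1] if _is_quoted(p) else p.upper() for p in parts]
    return '"' + "".join(resolved) + '"'


# Naive concatenation leaves the suffix outside the quotes.
naive = '"my-table"' + "_VIEW"

# The helper keeps the suffix inside one quoted identifier.
fixed = concat_identifier_parts(['"my-table"', "_VIEW"])
plain = concat_identifier_parts(["MODELS", "_VIEW"])

print(naive)
print(fixed)
print(plain)
```

The naive result is not a valid single identifier, which is consistent with the 1.0.3 bug fix for database and schema names that contain special characters.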
snowflake/ml/version.py CHANGED

@@ -1 +1 @@
-VERSION="1.0.2"
+VERSION="1.0.3"

{snowflake_ml_python-1.0.2.dist-info → snowflake_ml_python-1.0.3.dist-info}/METADATA RENAMED

@@ -1,11 +1,15 @@
 Metadata-Version: 2.1
 Name: snowflake-ml-python
-Version: 1.0.2
-Description-Content-Type: text/markdown
 Author: Snowflake, Inc
 Author-email: support@snowflake.com
 Home-page: https://github.com/snowflakedb/snowflake-ml-python
 License: Apache License, Version 2.0
+Description-Content-Type: text/markdown
+Summary: The machine learning client library that is used for interacting with Snowflake to build machine learning solutions.
+Project-URL: Changelog, https://github.com/snowflakedb/snowflake-ml-python/blob/main/CHANGELOG.md
+Project-URL: Documentation, https://docs.snowflake.com/developer-guide/snowpark-ml
+Project-URL: Issues, https://github.com/snowflakedb/snowflake-ml-python/issues
+Project-URL: Source, https://github.com/snowflakedb/snowflake-ml-python
 Classifier: Development Status :: 3 - Alpha
 Classifier: Environment :: Console
 Classifier: Environment :: Other Environment
@@ -25,17 +29,15 @@ Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
 Classifier: Topic :: Software Development :: Libraries :: Python Modules
 Classifier: Topic :: Scientific/Engineering :: Information Analysis
 Requires-Python: >=3.8,<4
-Summary: The machine learning client library that is used for interacting with Snowflake to build machine learning solutions.
 Requires-Dist: absl-py>=0.15,<2
 Requires-Dist: anyio>=3.5.0,<4
 Requires-Dist: cloudpickle
-Requires-Dist: cryptography>=3.1.0,<39.0.0
 Requires-Dist: fsspec[http]>=2022.11,<=2023.1
 Requires-Dist: numpy>=1.23,<2
 Requires-Dist: packaging>=20.9,<24
 Requires-Dist: pandas>=1.0.0,<2
 Requires-Dist: pyyaml>=6.0,<7
-Requires-Dist: scikit-learn>=1.2.1,<2
+Requires-Dist: scikit-learn>=1.2.1,<1.3
 Requires-Dist: scipy>=1.9,<2
 Requires-Dist: snowflake-connector-python[pandas]>=3.0.3,<4
 Requires-Dist: snowflake-snowpark-python>=1.4.0,<2
@@ -52,6 +54,7 @@ Provides-Extra: tensorflow
 Requires-Dist: tensorflow>=2.9,<3; extra == 'tensorflow'
 Provides-Extra: torch
 Requires-Dist: torchdata>=0.4,<1; extra == 'torch'
+Version: 1.0.3
 # Snowpark ML
@@ -77,9 +80,7 @@ Snowpark MLOps complements the Snowpark ML Development API, and provides model m
 During PrPr, we are iterating on the API without backward compatibility guarantees. It is better to recreate your registry every time you update the package. This means that, at this time, you cannot use the registry in production.
-- [Documentation](http://docs.snowflake.com/developer-guide/snowpark/python/snowpark-ml-modeling)
-- [Issues](https://github.com/snowflakedb/snowflake-ml-python/issues)
-- [Source](https://github.com/snowflakedb/snowflake-ml-python/)
+- [Documentation](https://docs.snowflake.com/developer-guide/snowpark-ml)
 ## Getting started
 ### Have your Snowflake account ready
@@ -96,6 +97,24 @@ pip install snowflake-ml-python
 ```
 # Release History
+## 1.0.3 (2023-07-14)
+### Behavior Changes
+- Model Registry: When predicting with a model whose output is a list of NumPy ndarrays, the output is no longer flattened; instead, every ndarray acts as a feature (column) in the output.
+### New Features
+- Model Registry: Added support for saving, loading, and deploying PyTorch models (`torch.nn.Module` and `torch.jit.ScriptModule`).
+### Bug Fixes
+- Model Registry: Fix an issue where the model registry cannot be created when the database or schema name provided to `create_model_registry` contains special characters.
+- Model Registry: Fix an issue where `get_model_description` returns the description with additional quotes.
+- Model Registry: Fix an incorrect error message when attempting to remove an unset tag of a model.
+- Model Registry: Fix a typo in the default deployment table name.
+- Model Registry: A Snowpark DataFrame used as sample input, or as input to the `predict` method, that contains a column of Snowflake `NUMBER(precision, scale)` data type with `scale = 0` no longer leads to an error; the column is now correctly recognized as `INT64` in the model signature.
+- Model Registry: Fix an issue that prevents a model logged on a system whose default encoding is not UTF-8 compatible from being deployed.
+- Model Registry: Added an earlier and clearer error message when any file name in the model, or the file name of the model itself, contains characters that cannot be encoded as ASCII. Deploying such a model is currently not supported.
 ## 1.0.2 (2023-06-22)
 ### Behavior Changes
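
The `get_model_description` fix listed in the 1.0.3 notes corresponds to the registry hunk that replaces `str(result)` with `json.loads(result)`. A minimal sketch of the difference, assuming the description is stored as a JSON-encoded string (the sample text is made up):

```python
import json

# Assumption: the registry stores the description as a JSON-encoded string.
stored = json.dumps("My first model")

# Old behaviour: str() keeps the JSON quotes around the text.
as_str = str(stored)

# New behaviour: json.loads() decodes the value back to the original text.
decoded = json.loads(stored)

print(as_str)
print(decoded)
```

Decoding with `json.loads` also restores any characters that JSON escaped on the way in, which `str()` would leave escaped.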

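The last 1.0.3 bug-fix entry describes an early error for file names that cannot be encoded as ASCII. The sketch below shows one way such a pre-deployment check could look. `find_non_ascii_names` and the sample file names are hypothetical and are not taken from the package.

```python
from typing import Iterable, List


def find_non_ascii_names(file_names: Iterable[str]) -> List[str]:
    """Hypothetical check: return the names containing non-ASCII characters."""
    return [name for name in file_names if not name.isascii()]


# Made-up file names for illustration.
names = ["model.pkl", "modèle.pkl", "数据/weights.bin"]
bad = find_non_ascii_names(names)

if bad:
    # Failing early gives a clearer message than an encoding error mid-deploy.
    print(f"Cannot deploy: non-ASCII file names found: {bad}")
```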