arize 8.0.0a10__tar.gz → 8.0.0a12__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (128) hide show
  1. {arize-8.0.0a10 → arize-8.0.0a12}/PKG-INFO +75 -1
  2. {arize-8.0.0a10 → arize-8.0.0a12}/README.md +67 -0
  3. {arize-8.0.0a10 → arize-8.0.0a12}/pyproject.toml +8 -1
  4. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/client.py +3 -1
  5. arize-8.0.0a12/src/arize/embeddings/__init__.py +4 -0
  6. arize-8.0.0a12/src/arize/embeddings/auto_generator.py +108 -0
  7. arize-8.0.0a12/src/arize/embeddings/base_generators.py +255 -0
  8. arize-8.0.0a12/src/arize/embeddings/constants.py +34 -0
  9. arize-8.0.0a12/src/arize/embeddings/cv_generators.py +28 -0
  10. arize-8.0.0a12/src/arize/embeddings/errors.py +41 -0
  11. arize-8.0.0a12/src/arize/embeddings/nlp_generators.py +111 -0
  12. arize-8.0.0a12/src/arize/embeddings/tabular_generators.py +161 -0
  13. arize-8.0.0a12/src/arize/embeddings/usecases.py +26 -0
  14. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/logging.py +0 -5
  15. arize-8.0.0a12/src/arize/utils/online_tasks/__init__.py +5 -0
  16. arize-8.0.0a12/src/arize/utils/online_tasks/dataframe_preprocessor.py +235 -0
  17. arize-8.0.0a12/src/arize/version.py +1 -0
  18. arize-8.0.0a10/src/arize/version.py +0 -1
  19. {arize-8.0.0a10 → arize-8.0.0a12}/.gitignore +0 -0
  20. {arize-8.0.0a10 → arize-8.0.0a12}/LICENSE.md +0 -0
  21. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/__init__.py +0 -0
  22. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_exporter/__init__.py +0 -0
  23. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_exporter/client.py +0 -0
  24. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_exporter/parsers/__init__.py +0 -0
  25. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_exporter/parsers/tracing_data_parser.py +0 -0
  26. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_exporter/validation.py +0 -0
  27. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_flight/__init__.py +0 -0
  28. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_flight/client.py +0 -0
  29. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_flight/types.py +0 -0
  30. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/__init__.py +0 -0
  31. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/__init__.py +0 -0
  32. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/api/__init__.py +0 -0
  33. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/api/datasets_api.py +0 -0
  34. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/api/experiments_api.py +0 -0
  35. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/api_client.py +0 -0
  36. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/api_response.py +0 -0
  37. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/configuration.py +0 -0
  38. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/exceptions.py +0 -0
  39. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/__init__.py +0 -0
  40. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/dataset.py +0 -0
  41. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/dataset_version.py +0 -0
  42. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/datasets_create201_response.py +0 -0
  43. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/datasets_create_request.py +0 -0
  44. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/datasets_list200_response.py +0 -0
  45. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/datasets_list_examples200_response.py +0 -0
  46. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/error.py +0 -0
  47. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/experiment.py +0 -0
  48. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/models/experiments_list200_response.py +0 -0
  49. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/rest.py +0 -0
  50. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/__init__.py +0 -0
  51. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_dataset.py +0 -0
  52. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_dataset_version.py +0 -0
  53. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_datasets_api.py +0 -0
  54. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_datasets_create201_response.py +0 -0
  55. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_datasets_create_request.py +0 -0
  56. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_datasets_list200_response.py +0 -0
  57. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_datasets_list_examples200_response.py +0 -0
  58. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_error.py +0 -0
  59. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_experiment.py +0 -0
  60. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_experiments_api.py +0 -0
  61. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client/test/test_experiments_list200_response.py +0 -0
  62. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/api_client_README.md +0 -0
  63. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/protocol/__init__.py +0 -0
  64. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/protocol/flight/__init__.py +0 -0
  65. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/protocol/flight/export_pb2.py +0 -0
  66. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/protocol/flight/ingest_pb2.py +0 -0
  67. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/protocol/rec/__init__.py +0 -0
  68. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_generated/protocol/rec/public_pb2.py +0 -0
  69. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/_lazy.py +0 -0
  70. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/config.py +0 -0
  71. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/constants/__init__.py +0 -0
  72. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/constants/config.py +0 -0
  73. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/constants/ml.py +0 -0
  74. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/constants/model_mapping.json +0 -0
  75. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/constants/spans.py +0 -0
  76. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/datasets/__init__.py +0 -0
  77. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/datasets/client.py +0 -0
  78. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/exceptions/__init__.py +0 -0
  79. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/exceptions/auth.py +0 -0
  80. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/exceptions/base.py +0 -0
  81. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/exceptions/models.py +0 -0
  82. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/exceptions/parameters.py +0 -0
  83. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/exceptions/spaces.py +0 -0
  84. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/exceptions/types.py +0 -0
  85. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/exceptions/values.py +0 -0
  86. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/experiments/__init__.py +0 -0
  87. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/experiments/client.py +0 -0
  88. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/__init__.py +0 -0
  89. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/batch_validation/__init__.py +0 -0
  90. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/batch_validation/errors.py +0 -0
  91. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/batch_validation/validator.py +0 -0
  92. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/bounded_executor.py +0 -0
  93. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/client.py +0 -0
  94. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/stream_validation.py +0 -0
  95. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/surrogate_explainer/__init__.py +0 -0
  96. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/models/surrogate_explainer/mimic.py +0 -0
  97. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/__init__.py +0 -0
  98. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/client.py +0 -0
  99. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/columns.py +0 -0
  100. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/conversion.py +0 -0
  101. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/__init__.py +0 -0
  102. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/annotations/__init__.py +0 -0
  103. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/annotations/annotations_validation.py +0 -0
  104. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/annotations/dataframe_form_validation.py +0 -0
  105. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/annotations/value_validation.py +0 -0
  106. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/common/__init__.py +0 -0
  107. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/common/argument_validation.py +0 -0
  108. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/common/dataframe_form_validation.py +0 -0
  109. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/common/errors.py +0 -0
  110. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/common/value_validation.py +0 -0
  111. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/evals/__init__.py +0 -0
  112. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/evals/dataframe_form_validation.py +0 -0
  113. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/evals/evals_validation.py +0 -0
  114. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/evals/value_validation.py +0 -0
  115. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/metadata/__init__.py +0 -0
  116. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/metadata/argument_validation.py +0 -0
  117. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/metadata/dataframe_form_validation.py +0 -0
  118. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/metadata/value_validation.py +0 -0
  119. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/spans/__init__.py +0 -0
  120. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/spans/dataframe_form_validation.py +0 -0
  121. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/spans/spans_validation.py +0 -0
  122. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/spans/validation/spans/value_validation.py +0 -0
  123. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/types.py +0 -0
  124. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/utils/__init__.py +0 -0
  125. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/utils/arrow.py +0 -0
  126. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/utils/casting.py +0 -0
  127. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/utils/dataframe.py +0 -0
  128. {arize-8.0.0a10 → arize-8.0.0a12}/src/arize/utils/proto.py +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: arize
3
- Version: 8.0.0a10
3
+ Version: 8.0.0a12
4
4
  Summary: A helper library to interact with Arize AI APIs
5
5
  Project-URL: Homepage, https://arize.com
6
6
  Project-URL: Documentation, https://docs.arize.com/arize
@@ -27,6 +27,13 @@ Classifier: Topic :: System :: Monitoring
27
27
  Requires-Python: >=3.10
28
28
  Requires-Dist: lazy-imports
29
29
  Requires-Dist: numpy>=2.0.0
30
+ Provides-Extra: auto-embeddings
31
+ Requires-Dist: datasets!=2.14.*,<3,>=2.8; extra == 'auto-embeddings'
32
+ Requires-Dist: pandas<3,>=1.0.0; extra == 'auto-embeddings'
33
+ Requires-Dist: pillow<11,>=8.4.0; extra == 'auto-embeddings'
34
+ Requires-Dist: tokenizers<1,>=0.13; extra == 'auto-embeddings'
35
+ Requires-Dist: torch<3,>=1.13; extra == 'auto-embeddings'
36
+ Requires-Dist: transformers<5,>=4.25; extra == 'auto-embeddings'
30
37
  Provides-Extra: dev
31
38
  Requires-Dist: pytest==8.4.2; extra == 'dev'
32
39
  Requires-Dist: ruff==0.13.2; extra == 'dev'
@@ -84,6 +91,10 @@ Description-Content-Type: text/markdown
84
91
  - [Stream log ML Data for a Classification use-case](#stream-log-ml-data-for-a-classification-use-case)
85
92
  - [Log a batch of ML Data for a Object Detection use-case](#log-a-batch-of-ml-data-for-a-object-detection-use-case)
86
93
  - [Exporting ML Data](#exporting-ml-data)
94
+ - [Generate embeddings for your data](#generate-embeddings-for-your-data)
95
+ - [Configure Logging](#configure-logging)
96
+ - [In Code](#in-code)
97
+ - [Via Environment Variables](#via-environment-variables)
87
98
  - [Community](#community)
88
99
 
89
100
  # Overview
@@ -326,6 +337,69 @@ df = client.models.export_to_df(
326
337
  )
327
338
  ```
328
339
 
340
+ ## Generate embeddings for your data
341
+
342
+ ```python
343
+ import pandas as pd
344
+ from arize.embeddings import EmbeddingGenerator, UseCases
345
+
346
+ # You can check available models
347
+ print(EmbeddingGenerator.list_pretrained_models())
348
+
349
+ # Example dataframe
350
+ df = pd.DataFrame(
351
+ {
352
+ "text": [
353
+ "Hello world.",
354
+ "Artificial Intelligence is the future.",
355
+ "Spain won the FIFA World Cup on 2010.",
356
+ ],
357
+ }
358
+ )
359
+ # Instantiate the generator for your usecase, selecting the base model
360
+ generator = EmbeddingGenerator.from_use_case(
361
+ use_case=UseCases.NLP.SEQUENCE_CLASSIFICATION,
362
+ model_name="distilbert-base-uncased",
363
+ tokenizer_max_length=512,
364
+ batch_size=100,
365
+ )
366
+
367
+ # Generate embeddings
368
+ df["text_vector"] = generator.generate_embeddings(text_col=df["text"])
369
+ ```
370
+
371
+ # Configure Logging
372
+
373
+ ## In Code
374
+
375
+ You can use `configure_logging` to tailor the logging behavior of the Arize package to your needs.
376
+
377
+ ```python
378
+ from arize.logging import configure_logging
379
+
380
+ configure_logging(
381
+ level=..., # Defaults to logging.INFO
382
+ structured=..., # if True, emit JSON logs. Defaults to False
383
+ )
384
+ ```
385
+
386
+ ## Via Environment Variables
387
+
388
+ Configure the same options as the section above, via:
389
+
390
+ ```python
391
+ import os
392
+
393
+ # You can enable or disable logging altogether ("true" enables, "false" disables)
394
+ os.environ["ARIZE_LOG_ENABLE"] = "true"
395
+ # Set up the logging level
396
+ os.environ["ARIZE_LOG_LEVEL"] = "debug"
397
+ # Whether or not you want structured JSON logs
398
+ os.environ["ARIZE_LOG_STRUCTURED"] = "false"
399
+ ```
400
+
401
+ The default behavior of Arize's logs is: enabled, `INFO` level, and not structured.
402
+
329
403
  # Community
330
404
 
331
405
  Join our community to connect with thousands of AI builders.
@@ -31,6 +31,10 @@
31
31
  - [Stream log ML Data for a Classification use-case](#stream-log-ml-data-for-a-classification-use-case)
32
32
  - [Log a batch of ML Data for a Object Detection use-case](#log-a-batch-of-ml-data-for-a-object-detection-use-case)
33
33
  - [Exporting ML Data](#exporting-ml-data)
34
+ - [Generate embeddings for your data](#generate-embeddings-for-your-data)
35
+ - [Configure Logging](#configure-logging)
36
+ - [In Code](#in-code)
37
+ - [Via Environment Variables](#via-environment-variables)
34
38
  - [Community](#community)
35
39
 
36
40
  # Overview
@@ -273,6 +277,69 @@ df = client.models.export_to_df(
273
277
  )
274
278
  ```
275
279
 
280
+ ## Generate embeddings for your data
281
+
282
+ ```python
283
+ import pandas as pd
284
+ from arize.embeddings import EmbeddingGenerator, UseCases
285
+
286
+ # You can check available models
287
+ print(EmbeddingGenerator.list_pretrained_models())
288
+
289
+ # Example dataframe
290
+ df = pd.DataFrame(
291
+ {
292
+ "text": [
293
+ "Hello world.",
294
+ "Artificial Intelligence is the future.",
295
+ "Spain won the FIFA World Cup on 2010.",
296
+ ],
297
+ }
298
+ )
299
+ # Instantiate the generator for your usecase, selecting the base model
300
+ generator = EmbeddingGenerator.from_use_case(
301
+ use_case=UseCases.NLP.SEQUENCE_CLASSIFICATION,
302
+ model_name="distilbert-base-uncased",
303
+ tokenizer_max_length=512,
304
+ batch_size=100,
305
+ )
306
+
307
+ # Generate embeddings
308
+ df["text_vector"] = generator.generate_embeddings(text_col=df["text"])
309
+ ```
310
+
311
+ # Configure Logging
312
+
313
+ ## In Code
314
+
315
+ You can use `configure_logging` to tailor the logging behavior of the Arize package to your needs.
316
+
317
+ ```python
318
+ from arize.logging import configure_logging
319
+
320
+ configure_logging(
321
+ level=..., # Defaults to logging.INFO
322
+ structured=..., # if True, emit JSON logs. Defaults to False
323
+ )
324
+ ```
325
+
326
+ ## Via Environment Variables
327
+
328
+ Configure the same options as the section above, via:
329
+
330
+ ```python
331
+ import os
332
+
333
+ # You can enable or disable logging altogether ("true" enables, "false" disables)
334
+ os.environ["ARIZE_LOG_ENABLE"] = "true"
335
+ # Set up the logging level
336
+ os.environ["ARIZE_LOG_LEVEL"] = "debug"
337
+ # Whether or not you want structured JSON logs
338
+ os.environ["ARIZE_LOG_STRUCTURED"] = "false"
339
+ ```
340
+
341
+ The default behavior of Arize's logs is: enabled, `INFO` level, and not structured.
342
+
276
343
  # Community
277
344
 
278
345
  Join our community to connect with thousands of AI builders.
@@ -39,7 +39,6 @@ dependencies = [
39
39
  # "requests_futures==1.0.0",
40
40
  # "googleapis_common_protos>=1.51.0,<2",
41
41
  # "protobuf>=4.21.0,<6",
42
- # "pandas>=0.25.3,<3",
43
42
  # "pyarrow>=0.15.0",
44
43
  # "tqdm>=4.60.0,<5",
45
44
  # "pydantic>=2.0.0,<3",
@@ -77,6 +76,14 @@ ml-batch = [
77
76
  mimic-explainer = [
78
77
  "interpret-community[mimic]>=0.22.0,<1",
79
78
  ]
79
+ auto-embeddings = [
80
+ "Pillow>=8.4.0, <11",
81
+ "datasets>=2.8, <3, !=2.14.*",
82
+ "pandas>=1.0.0,<3",
83
+ "tokenizers>=0.13, <1",
84
+ "torch>=1.13, <3",
85
+ "transformers>=4.25, <5",
86
+ ]
80
87
 
81
88
  [project.urls]
82
89
  Homepage = "https://arize.com"
@@ -12,11 +12,13 @@ if TYPE_CHECKING:
12
12
  from arize.spans.client import SpansClient
13
13
 
14
14
 
15
+ # TODO(Kiko): experimental/datasets must be adapted into the datasets subclient
16
+ # TODO(Kiko): experimental/prompt hub is missing
17
+ # TODO(Kiko): exporter/utils/schema_parser is missing
15
18
  # TODO(Kiko): Go through main APIs and add CtxAdapter where missing
16
19
  # TODO(Kiko): Search and handle other TODOs
17
20
  # TODO(Kiko): Go over **every file** and do not import anything at runtime, use `if TYPE_CHECKING`
18
21
  # with `from __future__ import annotations` (must include for Python < 3.11)
19
- # TODO(Kiko): MIMIC Explainer not done
20
22
  # TODO(Kiko): Go over docstrings
21
23
  class ArizeClient(LazySubclientsMixin):
22
24
  """
@@ -0,0 +1,4 @@
1
+ from arize.embeddings.auto_generator import EmbeddingGenerator
2
+ from arize.embeddings.usecases import UseCases
3
+
4
+ __all__ = ["EmbeddingGenerator", "UseCases"]
@@ -0,0 +1,108 @@
1
+ from typing import Any
2
+
3
+ import pandas as pd
4
+
5
+ from arize.embeddings import constants
6
+ from arize.embeddings.base_generators import BaseEmbeddingGenerator
7
+ from arize.embeddings.constants import (
8
+ CV_PRETRAINED_MODELS,
9
+ DEFAULT_CV_IMAGE_CLASSIFICATION_MODEL,
10
+ DEFAULT_CV_OBJECT_DETECTION_MODEL,
11
+ DEFAULT_NLP_SEQUENCE_CLASSIFICATION_MODEL,
12
+ DEFAULT_NLP_SUMMARIZATION_MODEL,
13
+ DEFAULT_TABULAR_MODEL,
14
+ NLP_PRETRAINED_MODELS,
15
+ )
16
+ from arize.embeddings.cv_generators import (
17
+ EmbeddingGeneratorForCVImageClassification,
18
+ EmbeddingGeneratorForCVObjectDetection,
19
+ )
20
+ from arize.embeddings.nlp_generators import (
21
+ EmbeddingGeneratorForNLPSequenceClassification,
22
+ EmbeddingGeneratorForNLPSummarization,
23
+ )
24
+ from arize.embeddings.tabular_generators import (
25
+ EmbeddingGeneratorForTabularFeatures,
26
+ )
27
+ from arize.embeddings.usecases import UseCases
28
+
29
+ UseCaseLike = str | UseCases.NLP | UseCases.CV | UseCases.STRUCTURED
30
+
31
+
32
+ class EmbeddingGenerator:
33
+ def __init__(self, **kwargs: str):
34
+ raise OSError(
35
+ f"{self.__class__.__name__} is designed to be instantiated using the "
36
+ f"`{self.__class__.__name__}.from_use_case(use_case, **kwargs)` method."
37
+ )
38
+
39
+ @staticmethod
40
+ def from_use_case(
41
+ use_case: UseCaseLike, **kwargs: Any
42
+ ) -> BaseEmbeddingGenerator:
43
+ if use_case == UseCases.NLP.SEQUENCE_CLASSIFICATION:
44
+ return EmbeddingGeneratorForNLPSequenceClassification(**kwargs)
45
+ elif use_case == UseCases.NLP.SUMMARIZATION:
46
+ return EmbeddingGeneratorForNLPSummarization(**kwargs)
47
+ elif use_case == UseCases.CV.IMAGE_CLASSIFICATION:
48
+ return EmbeddingGeneratorForCVImageClassification(**kwargs)
49
+ elif use_case == UseCases.CV.OBJECT_DETECTION:
50
+ return EmbeddingGeneratorForCVObjectDetection(**kwargs)
51
+ elif use_case == UseCases.STRUCTURED.TABULAR_EMBEDDINGS:
52
+ return EmbeddingGeneratorForTabularFeatures(**kwargs)
53
+ else:
54
+ raise ValueError(f"Invalid use case {use_case}")
55
+
56
+ @classmethod
57
+ def list_default_models(cls) -> pd.DataFrame:
58
+ df = pd.DataFrame(
59
+ {
60
+ "Area": ["NLP", "NLP", "CV", "CV", "STRUCTURED"],
61
+ "Usecase": [
62
+ UseCases.NLP.SEQUENCE_CLASSIFICATION.name,
63
+ UseCases.NLP.SUMMARIZATION.name,
64
+ UseCases.CV.IMAGE_CLASSIFICATION.name,
65
+ UseCases.CV.OBJECT_DETECTION.name,
66
+ UseCases.STRUCTURED.TABULAR_EMBEDDINGS.name,
67
+ ],
68
+ "Model Name": [
69
+ DEFAULT_NLP_SEQUENCE_CLASSIFICATION_MODEL,
70
+ DEFAULT_NLP_SUMMARIZATION_MODEL,
71
+ DEFAULT_CV_IMAGE_CLASSIFICATION_MODEL,
72
+ DEFAULT_CV_OBJECT_DETECTION_MODEL,
73
+ DEFAULT_TABULAR_MODEL,
74
+ ],
75
+ }
76
+ )
77
+ df.sort_values(
78
+ by=[col for col in df.columns], ascending=True, inplace=True
79
+ )
80
+ return df.reset_index(drop=True)
81
+
82
+ @classmethod
83
+ def list_pretrained_models(cls) -> pd.DataFrame:
84
+ data = {
85
+ "Task": ["NLP" for _ in NLP_PRETRAINED_MODELS]
86
+ + ["CV" for _ in CV_PRETRAINED_MODELS],
87
+ "Architecture": [
88
+ cls.__parse_model_arch(model)
89
+ for model in NLP_PRETRAINED_MODELS + CV_PRETRAINED_MODELS
90
+ ],
91
+ "Model Name": NLP_PRETRAINED_MODELS + CV_PRETRAINED_MODELS,
92
+ }
93
+ df = pd.DataFrame(data)
94
+ df.sort_values(
95
+ by=[col for col in df.columns], ascending=True, inplace=True
96
+ )
97
+ return df.reset_index(drop=True)
98
+
99
+ @staticmethod
100
+ def __parse_model_arch(model_name: str) -> str:
101
+ if constants.GPT.lower() in model_name.lower():
102
+ return constants.GPT
103
+ elif constants.BERT.lower() in model_name.lower():
104
+ return constants.BERT
105
+ elif constants.VIT.lower() in model_name.lower():
106
+ return constants.VIT
107
+ else:
108
+ raise ValueError("Invalid model_name, unknown architecture.")
@@ -0,0 +1,255 @@
1
+ import os
2
+ from abc import ABC, abstractmethod
3
+ from enum import Enum
4
+ from functools import partial
5
+ from typing import Dict, List, Union, cast
6
+
7
+ import pandas as pd
8
+
9
+ import arize.embeddings.errors as err
10
+ from arize.embeddings.constants import IMPORT_ERROR_MESSAGE
11
+
12
+ try:
13
+ import torch
14
+ from datasets import Dataset
15
+ from PIL import Image
16
+ from transformers import ( # type: ignore
17
+ AutoImageProcessor,
18
+ AutoModel,
19
+ AutoTokenizer,
20
+ BatchEncoding,
21
+ )
22
+ from transformers.utils import logging as transformer_logging
23
+ except ImportError as e:
24
+ raise ImportError(IMPORT_ERROR_MESSAGE) from e
25
+
26
+ import logging
27
+
28
+ logger = logging.getLogger(__name__)
29
+ transformer_logging.set_verbosity(50)
30
+ transformer_logging.enable_progress_bar()
31
+
32
+
33
+ class BaseEmbeddingGenerator(ABC):
34
+ def __init__(
35
+ self, use_case: Enum, model_name: str, batch_size: int = 100, **kwargs
36
+ ):
37
+ self.__use_case = self._parse_use_case(use_case=use_case)
38
+ self.__model_name = model_name
39
+ self.__device = self.select_device()
40
+ self.__batch_size = batch_size
41
+ logger.info(f"Downloading pre-trained model '{self.model_name}'")
42
+ try:
43
+ self.__model = AutoModel.from_pretrained(
44
+ self.model_name, **kwargs
45
+ ).to(self.device)
46
+ except OSError as e:
47
+ raise err.HuggingFaceRepositoryNotFound(model_name) from e
48
+ except Exception as e:
49
+ raise e
50
+
51
+ @abstractmethod
52
+ def generate_embeddings(self, **kwargs) -> pd.Series: ...
53
+
54
+ def select_device(self) -> torch.device:
55
+ if torch.cuda.is_available():
56
+ return torch.device("cuda")
57
+ elif torch.backends.mps.is_available():
58
+ return torch.device("mps")
59
+ else:
60
+ logger.warning(
61
+ "No available GPU has been detected. The use of GPU acceleration is "
62
+ "strongly recommended. You can check for GPU availability by running "
63
+ "`torch.cuda.is_available()` or `torch.backends.mps.is_available()`."
64
+ )
65
+ return torch.device("cpu")
66
+
67
+ @property
68
+ def use_case(self) -> str:
69
+ return self.__use_case
70
+
71
+ @property
72
+ def model_name(self) -> str:
73
+ return self.__model_name
74
+
75
+ @property
76
+ def model(self):
77
+ return self.__model
78
+
79
+ @property
80
+ def device(self) -> torch.device:
81
+ return self.__device
82
+
83
+ @property
84
+ def batch_size(self) -> int:
85
+ return self.__batch_size
86
+
87
+ @batch_size.setter
88
+ def batch_size(self, new_batch_size: int) -> None:
89
+ err_message = "New batch size should be an integer greater than 0."
90
+ if not isinstance(new_batch_size, int):
91
+ raise TypeError(err_message)
92
+ elif new_batch_size <= 0:
93
+ raise ValueError(err_message)
94
+ else:
95
+ self.__batch_size = new_batch_size
96
+ logger.info(f"Batch size has been set to {new_batch_size}.")
97
+
98
+ @staticmethod
99
+ def _parse_use_case(use_case: Enum) -> str:
100
+ uc_area = use_case.__class__.__name__.split("UseCases")[0]
101
+ uc_task = use_case.name
102
+ return f"{uc_area}.{uc_task}"
103
+
104
+ def _get_embedding_vector(
105
+ self, batch: Dict[str, torch.Tensor], method
106
+ ) -> Dict[str, torch.Tensor]:
107
+ with torch.no_grad():
108
+ outputs = self.model(**batch)
109
+ # (batch_size, seq_length/or/num_tokens, hidden_size)
110
+ if method == "cls_token": # Select CLS token vector
111
+ embeddings = outputs.last_hidden_state[:, 0, :]
112
+ elif method == "avg_token": # Select avg token vector
113
+ embeddings = torch.mean(outputs.last_hidden_state, 1)
114
+ else:
115
+ raise ValueError(f"Invalid method = {method}")
116
+ return {"embedding_vector": embeddings.cpu().numpy().astype(float)}
117
+
118
+ @staticmethod
119
+ def check_invalid_index(field: Union[pd.Series, pd.DataFrame]) -> None:
120
+ if (field.index != field.reset_index(drop=True).index).any():
121
+ if isinstance(field, pd.DataFrame):
122
+ raise err.InvalidIndexError("DataFrame")
123
+ else:
124
+ raise err.InvalidIndexError(str(field.name))
125
+
126
+ @abstractmethod
127
+ def __repr__(self) -> str:
128
+ pass
129
+
130
+
131
+ class NLPEmbeddingGenerator(BaseEmbeddingGenerator):
132
+ def __repr__(self) -> str:
133
+ return (
134
+ f"{self.__class__.__name__}(\n"
135
+ f" use_case={self.use_case},\n"
136
+ f" model_name='{self.model_name}',\n"
137
+ f" tokenizer_max_length={self.tokenizer_max_length},\n"
138
+ f" tokenizer={self.tokenizer.__class__},\n"
139
+ f" model={self.model.__class__},\n"
140
+ f" batch_size={self.batch_size},\n"
141
+ f")"
142
+ )
143
+
144
+ def __init__(
145
+ self,
146
+ use_case: Enum,
147
+ model_name: str,
148
+ tokenizer_max_length: int = 512,
149
+ **kwargs,
150
+ ):
151
+ super().__init__(use_case=use_case, model_name=model_name, **kwargs)
152
+ self.__tokenizer_max_length = tokenizer_max_length
153
+ # We don't check for the tokenizer's existence since it is coupled with the corresponding model
154
+ # We check the model's existence in `BaseEmbeddingGenerator`
155
+ logger.info(f"Downloading tokenizer for '{self.model_name}'")
156
+ self.__tokenizer = AutoTokenizer.from_pretrained(
157
+ self.model_name, model_max_length=self.tokenizer_max_length
158
+ )
159
+
160
+ @property
161
+ def tokenizer(self):
162
+ return self.__tokenizer
163
+
164
+ @property
165
+ def tokenizer_max_length(self) -> int:
166
+ return self.__tokenizer_max_length
167
+
168
+ def tokenize(
169
+ self, batch: Dict[str, List[str]], text_feat_name: str
170
+ ) -> BatchEncoding:
171
+ return self.tokenizer(
172
+ batch[text_feat_name],
173
+ padding=True,
174
+ truncation=True,
175
+ max_length=self.tokenizer_max_length,
176
+ return_tensors="pt",
177
+ ).to(self.device)
178
+
179
+
180
+ class CVEmbeddingGenerator(BaseEmbeddingGenerator):
181
+ def __repr__(self) -> str:
182
+ return (
183
+ f"{self.__class__.__name__}(\n"
184
+ f" use_case={self.use_case},\n"
185
+ f" model_name='{self.model_name}',\n"
186
+ f" image_processor={self.image_processor.__class__},\n"
187
+ f" model={self.model.__class__},\n"
188
+ f" batch_size={self.batch_size},\n"
189
+ f")"
190
+ )
191
+
192
+ def __init__(self, use_case: Enum, model_name: str, **kwargs):
193
+ super().__init__(use_case=use_case, model_name=model_name, **kwargs)
194
+ logger.info("Downloading image processor")
195
+ # We don't check for the image processor's existence since it is coupled with the corresponding model
196
+ # We check the model's existence in `BaseEmbeddingGenerator`
197
+ self.__image_processor = AutoImageProcessor.from_pretrained(
198
+ self.model_name
199
+ )
200
+
201
+ @property
202
+ def image_processor(self):
203
+ return self.__image_processor
204
+
205
+ @staticmethod
206
+ def open_image(image_path: str) -> Image.Image:
207
+ if not os.path.exists(image_path):
208
+ raise ValueError(f"Cannot find image {image_path}")
209
+ return Image.open(image_path).convert("RGB")
210
+
211
+ def preprocess_image(
212
+ self, batch: Dict[str, List[str]], local_image_feat_name: str
213
+ ):
214
+ return self.image_processor(
215
+ [
216
+ self.open_image(image_path)
217
+ for image_path in batch[local_image_feat_name]
218
+ ],
219
+ return_tensors="pt",
220
+ ).to(self.device)
221
+
222
+ def generate_embeddings(self, local_image_path_col: pd.Series) -> pd.Series:
223
+ """
224
+ Obtain embedding vectors from your image data using pre-trained image models.
225
+
226
+ :param local_image_path_col: a pandas Series containing the local path to the images to
227
+ be used to generate the embedding vectors.
228
+ :return: a pandas Series containing the embedding vectors.
229
+ """
230
+ if not isinstance(local_image_path_col, pd.Series):
231
+ raise TypeError(
232
+ "local_image_path_col_name must be pandas Series object"
233
+ )
234
+ self.check_invalid_index(field=local_image_path_col)
235
+
236
+ # Validate that there are no null image paths
237
+ if local_image_path_col.isnull().any():
238
+ raise ValueError(
239
+ "There can't be any null values in the local_image_path_col series"
240
+ )
241
+
242
+ ds = Dataset.from_dict({"local_path": local_image_path_col})
243
+ ds.set_transform(
244
+ partial(
245
+ self.preprocess_image,
246
+ local_image_feat_name="local_path",
247
+ )
248
+ )
249
+ logger.info("Generating embedding vectors")
250
+ ds = ds.map(
251
+ lambda batch: self._get_embedding_vector(batch, "avg_token"),
252
+ batched=True,
253
+ batch_size=self.batch_size,
254
+ )
255
+ return cast(pd.DataFrame, ds.to_pandas())["embedding_vector"]
@@ -0,0 +1,34 @@
1
+ DEFAULT_NLP_SEQUENCE_CLASSIFICATION_MODEL = "distilbert-base-uncased"
2
+ DEFAULT_NLP_SUMMARIZATION_MODEL = "distilbert-base-uncased"
3
+ DEFAULT_TABULAR_MODEL = "distilbert-base-uncased"
4
+ DEFAULT_CV_IMAGE_CLASSIFICATION_MODEL = "google/vit-base-patch32-224-in21k"
5
+ DEFAULT_CV_OBJECT_DETECTION_MODEL = "facebook/detr-resnet-101"
6
+ NLP_PRETRAINED_MODELS = [
7
+ "bert-base-cased",
8
+ "bert-base-uncased",
9
+ "bert-large-cased",
10
+ "bert-large-uncased",
11
+ "distilbert-base-cased",
12
+ "distilbert-base-uncased",
13
+ "xlm-roberta-base",
14
+ "xlm-roberta-large",
15
+ ]
16
+
17
+ CV_PRETRAINED_MODELS = [
18
+ "google/vit-base-patch16-224-in21k",
19
+ "google/vit-base-patch16-384",
20
+ "google/vit-base-patch32-224-in21k",
21
+ "google/vit-base-patch32-384",
22
+ "google/vit-large-patch16-224-in21k",
23
+ "google/vit-large-patch16-384",
24
+ "google/vit-large-patch32-224-in21k",
25
+ "google/vit-large-patch32-384",
26
+ ]
27
+ IMPORT_ERROR_MESSAGE = (
28
+ "To enable embedding generation, the arize module must be installed with "
29
+ "extra dependencies. Run: pip install 'arize[auto-embeddings]'."
30
+ )
31
+
32
+ GPT = "GPT"
33
+ BERT = "BERT"
34
+ VIT = "ViT"
@@ -0,0 +1,28 @@
1
+ from arize.embeddings.base_generators import CVEmbeddingGenerator
2
+ from arize.embeddings.constants import (
3
+ DEFAULT_CV_IMAGE_CLASSIFICATION_MODEL,
4
+ DEFAULT_CV_OBJECT_DETECTION_MODEL,
5
+ )
6
+ from arize.embeddings.usecases import UseCases
7
+
8
+
9
+ class EmbeddingGeneratorForCVImageClassification(CVEmbeddingGenerator):
10
+ def __init__(
11
+ self, model_name: str = DEFAULT_CV_IMAGE_CLASSIFICATION_MODEL, **kwargs
12
+ ):
13
+ super().__init__(
14
+ use_case=UseCases.CV.IMAGE_CLASSIFICATION,
15
+ model_name=model_name,
16
+ **kwargs,
17
+ )
18
+
19
+
20
+ class EmbeddingGeneratorForCVObjectDetection(CVEmbeddingGenerator):
21
+ def __init__(
22
+ self, model_name: str = DEFAULT_CV_OBJECT_DETECTION_MODEL, **kwargs
23
+ ):
24
+ super().__init__(
25
+ use_case=UseCases.CV.OBJECT_DETECTION,
26
+ model_name=model_name,
27
+ **kwargs,
28
+ )
@@ -0,0 +1,41 @@
1
+ class InvalidIndexError(Exception):
2
+ def __repr__(self) -> str:
3
+ return "Invalid_Index_Error"
4
+
5
+ def __str__(self) -> str:
6
+ return self.error_message()
7
+
8
+ def __init__(self, field_name: str) -> None:
9
+ self.field_name = field_name
10
+
11
+ def error_message(self) -> str:
12
+ if self.field_name == "DataFrame":
13
+ return (
14
+ f"The index of the {self.field_name} is invalid; "
15
+ f"reset the index by using df.reset_index(drop=True, inplace=True)"
16
+ )
17
+ else:
18
+ return (
19
+ f"The index of the Series given by the column '{self.field_name}' is invalid; "
20
+ f"reset the index by using df.reset_index(drop=True, inplace=True)"
21
+ )
22
+
23
+
24
+ class HuggingFaceRepositoryNotFound(Exception):
25
+ def __repr__(self) -> str:
26
+ return "HuggingFace_Repository_Not_Found_Error"
27
+
28
+ def __str__(self) -> str:
29
+ return self.error_message()
30
+
31
+ def __init__(self, model_name: str) -> None:
32
+ self.model_name = model_name
33
+
34
+ def error_message(self) -> str:
35
+ return (
36
+ f"The given model name '{self.model_name}' is not a valid model identifier listed on "
37
+ "'https://huggingface.co/models'. "
38
+ "If this is a private repository, log in with `huggingface-cli login` or importing "
39
+ "`login` from `huggingface_hub` if you are using a notebook. "
40
+ "Learn more in https://huggingface.co/docs/huggingface_hub/quick-start#login"
41
+ )