arize-phoenix 2.7.0__tar.gz → 2.8.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release: this version of arize-phoenix has been flagged as potentially problematic.
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/PKG-INFO +5 -2
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/README.md +4 -1
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/pyproject.toml +1 -0
- arize_phoenix-2.8.0/src/phoenix/exceptions.py +6 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/functions/classify.py +1 -1
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/anthropic.py +27 -22
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/base.py +1 -56
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/bedrock.py +23 -13
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/litellm.py +10 -17
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/openai.py +46 -53
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/vertex.py +19 -29
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/vertexai.py +1 -20
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/schema.py +2 -3
- arize_phoenix-2.8.0/src/phoenix/server/static/index.js +7195 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/session/session.py +2 -1
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/exporter.py +15 -11
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/fixtures.py +10 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/llama_index/callback.py +5 -5
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/llama_index/streaming.py +3 -4
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/otel.py +49 -21
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/schemas.py +2 -2
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/span_json_decoder.py +5 -4
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/tracer.py +6 -5
- arize_phoenix-2.8.0/src/phoenix/version.py +1 -0
- arize_phoenix-2.7.0/src/phoenix/exceptions.py +0 -2
- arize_phoenix-2.7.0/src/phoenix/server/static/index.js +0 -7155
- arize_phoenix-2.7.0/src/phoenix/version.py +0 -1
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/.gitignore +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/IP_NOTICE +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/LICENSE +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/config.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/core/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/core/embedding_dimension.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/core/evals.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/core/model.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/core/model_schema.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/core/model_schema_adapter.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/core/traces.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/datasets/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/datasets/dataset.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/datasets/errors.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/datasets/fixtures.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/datasets/schema.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/datasets/validation.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/datetime_utils.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/evaluators.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/functions/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/functions/executor.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/functions/generate.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/functions/processing.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/rate_limiters.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/retrievals.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/templates/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/templates/default_templates.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/templates/template.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/utils/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/utils/threads.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/metrics/README.md +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/metrics/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/metrics/binning.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/metrics/metrics.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/metrics/mixins.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/metrics/retrieval_metrics.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/metrics/timeseries.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/metrics/wrappers.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/pointcloud/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/pointcloud/clustering.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/pointcloud/pointcloud.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/pointcloud/projectors.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/pointcloud/umap_parameters.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/py.typed +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/context.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/helpers.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/ClusterInput.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/Coordinates.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/DataQualityMetricInput.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/DimensionFilter.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/DimensionInput.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/Granularity.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/PerformanceMetricInput.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/SpanSort.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/TimeRange.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/input_types/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/interceptor.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Cluster.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DataQualityMetric.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Dataset.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DatasetInfo.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DatasetRole.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DatasetValues.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Dimension.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DimensionDataType.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DimensionShape.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DimensionType.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DimensionWithValue.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DocumentEvaluationSummary.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/DocumentRetrievalMetrics.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/EmbeddingDimension.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/EmbeddingMetadata.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Evaluation.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/EvaluationSummary.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Event.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/EventMetadata.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/ExportEventsMutation.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/ExportedFile.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Functionality.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/MimeType.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Model.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/NumericRange.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/PerformanceMetric.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/PromptResponse.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Retrieval.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/ScalarDriftMetricEnum.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Segments.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/SortDir.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/Span.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/TimeSeries.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/UMAPPoints.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/ValidationResult.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/VectorDriftMetricEnum.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/node.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/types/pagination.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/app.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/evaluation_handler.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/main.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/span_handler.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/apple-touch-icon-114x114.png +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/apple-touch-icon-120x120.png +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/apple-touch-icon-144x144.png +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/apple-touch-icon-152x152.png +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/apple-touch-icon-180x180.png +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/apple-touch-icon-72x72.png +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/apple-touch-icon-76x76.png +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/apple-touch-icon.png +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/favicon.ico +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/index.css +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/static/modernizr.js +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/templates/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/templates/index.html +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/thread_server.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/trace_handler.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/services.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/session/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/session/evaluation.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/dsl/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/dsl/filter.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/dsl/helpers.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/dsl/missing.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/dsl/query.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/errors.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/evaluation_conventions.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/langchain/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/langchain/instrumentor.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/langchain/tracer.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/llama_index/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/llama_index/debug_callback.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/openai/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/openai/instrumentor.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/semantic_conventions.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/span_evaluations.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/span_json_encoder.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/trace_dataset.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/utils.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/v1/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/v1/evaluation_pb2.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/trace/v1/evaluation_pb2.pyi +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/utilities/__init__.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/utilities/error_handling.py +0 -0
- {arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/utilities/logging.py +0 -0
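The headline change in this release is the error-handling rework visible in the list above: the tenacity-based `_retry` helper is deleted from `base.py` (−56 lines), each model wrapper now invokes its client directly under the existing rate limiter, and the two-line `exceptions.py` is replaced by a six-line version defining the `PhoenixContextLimitExceeded` error raised throughout the diffs below. The new module's body is not rendered in this diff; judging only from the import statements that follow, a minimal sketch could look like this (the `PhoenixException` base-class name is an assumption, not confirmed by the diff):

    # src/phoenix/exceptions.py: hypothetical reconstruction; only the name
    # PhoenixContextLimitExceeded is confirmed by the imports in this diff.
    class PhoenixException(Exception):
        """Assumed base class for Phoenix errors."""


    class PhoenixContextLimitExceeded(PhoenixException):
        """Raised when a prompt exceeds a model's context window."""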
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/PKG-INFO
RENAMED
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: arize-phoenix
-Version: 2.7.0
+Version: 2.8.0
 Summary: ML Observability in your notebook
 Project-URL: Documentation, https://docs.arize.com/phoenix/
 Project-URL: Issues, https://github.com/Arize-ai/phoenix/issues
@@ -86,6 +86,9 @@ Description-Content-Type: text/markdown
   <a target="_blank" href="https://pypi.org/project/arize-phoenix/">
     <img src="https://img.shields.io/pypi/pyversions/arize-phoenix">
   </a>
+  <a target="_blank" href="https://hub.docker.com/repository/docker/arizephoenix/phoenix/general">
+    <img src="https://img.shields.io/docker/v/arizephoenix/phoenix?sort=semver&logo=docker&label=image&color=blue">
+  </a>
 </p>


@@ -134,7 +137,7 @@ pip install arize-phoenix[experimental]



-With the advent of powerful LLMs, it is now possible to build LLM Applications that can perform complex tasks like summarization, translation, question and answering, and more. However, these applications are often difficult to debug and troubleshoot as they have an extensive surface area: search and retrieval via vector stores, embedding generation, usage of external tools and so on. Phoenix provides a tracing framework that allows you to trace through the execution of your LLM Application hierarchically. This allows you to understand the internals of your LLM Application and to troubleshoot the complex components of your applicaition. Phoenix is built on top of the OpenInference tracing standard and uses it to trace, export, and collect critical information about your LLM Application in the form of `spans`. For more details on the OpenInference tracing standard, see the [OpenInference Specification](https://github.com/Arize-ai/
+With the advent of powerful LLMs, it is now possible to build LLM Applications that can perform complex tasks like summarization, translation, question and answering, and more. However, these applications are often difficult to debug and troubleshoot as they have an extensive surface area: search and retrieval via vector stores, embedding generation, usage of external tools and so on. Phoenix provides a tracing framework that allows you to trace through the execution of your LLM Application hierarchically. This allows you to understand the internals of your LLM Application and to troubleshoot the complex components of your applicaition. Phoenix is built on top of the OpenInference tracing standard and uses it to trace, export, and collect critical information about your LLM Application in the form of `spans`. For more details on the OpenInference tracing standard, see the [OpenInference Specification](https://github.com/Arize-ai/openinference)

 ### Tracing with LlamaIndex

{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/README.md
RENAMED
@@ -22,6 +22,9 @@
   <a target="_blank" href="https://pypi.org/project/arize-phoenix/">
     <img src="https://img.shields.io/pypi/pyversions/arize-phoenix">
   </a>
+  <a target="_blank" href="https://hub.docker.com/repository/docker/arizephoenix/phoenix/general">
+    <img src="https://img.shields.io/docker/v/arizephoenix/phoenix?sort=semver&logo=docker&label=image&color=blue">
+  </a>
 </p>


@@ -70,7 +73,7 @@ pip install arize-phoenix[experimental]



-With the advent of powerful LLMs, it is now possible to build LLM Applications that can perform complex tasks like summarization, translation, question and answering, and more. However, these applications are often difficult to debug and troubleshoot as they have an extensive surface area: search and retrieval via vector stores, embedding generation, usage of external tools and so on. Phoenix provides a tracing framework that allows you to trace through the execution of your LLM Application hierarchically. This allows you to understand the internals of your LLM Application and to troubleshoot the complex components of your applicaition. Phoenix is built on top of the OpenInference tracing standard and uses it to trace, export, and collect critical information about your LLM Application in the form of `spans`. For more details on the OpenInference tracing standard, see the [OpenInference Specification](https://github.com/Arize-ai/
+With the advent of powerful LLMs, it is now possible to build LLM Applications that can perform complex tasks like summarization, translation, question and answering, and more. However, these applications are often difficult to debug and troubleshoot as they have an extensive surface area: search and retrieval via vector stores, embedding generation, usage of external tools and so on. Phoenix provides a tracing framework that allows you to trace through the execution of your LLM Application hierarchically. This allows you to understand the internals of your LLM Application and to troubleshoot the complex components of your applicaition. Phoenix is built on top of the OpenInference tracing standard and uses it to trace, export, and collect critical information about your LLM Application in the form of `spans`. For more details on the OpenInference tracing standard, see the [OpenInference Specification](https://github.com/Arize-ai/openinference)

 ### Tracing with LlamaIndex

{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/pyproject.toml
RENAMED
@@ -124,6 +124,7 @@ dependencies = [
 [tool.hatch.envs.type]
 dependencies = [
     "mypy==1.5.1",
+    "pydantic==v1.10.14", # for mypy
     "llama-index>=0.9.14",
     "pandas-stubs<=2.0.2.230605", # version 2.0.3.230814 is causing a dependency conflict.
     "types-psutil",
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/functions/classify.py
RENAMED
@@ -249,7 +249,7 @@ def run_relevance_eval(

         This latter format is intended for running evaluations on exported OpenInference trace
         dataframes. For more information on the OpenInference tracing specification, see
-        https://github.com/Arize-ai/
+        https://github.com/Arize-ai/openinference/.

         model (BaseEvalModel): The model used for evaluation.

{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/anthropic.py
RENAMED
@@ -1,6 +1,7 @@
 from dataclasses import dataclass, field
 from typing import TYPE_CHECKING, Any, Dict, List, Optional

+from phoenix.exceptions import PhoenixContextLimitExceeded
 from phoenix.experimental.evals.models.base import BaseEvalModel
 from phoenix.experimental.evals.models.rate_limiters import RateLimiter

@@ -44,12 +45,6 @@ class AnthropicModel(BaseEvalModel):
         self._init_client()
         self._init_tiktoken()
         self._init_rate_limiter()
-        self.retry = self._retry(
-            error_types=[],  # default to catching all errors
-            min_seconds=self.retry_min_seconds,
-            max_seconds=self.retry_max_seconds,
-            max_retries=self.max_retries,
-        )

     def _init_environment(self) -> None:
         try:
@@ -127,7 +122,7 @@ class AnthropicModel(BaseEvalModel):
         kwargs.pop("instruction", None)
         invocation_parameters = self.invocation_parameters()
         invocation_parameters.update(kwargs)
-        response = self.
+        response = self._rate_limited_completion(
             model=self.model,
             prompt=self._format_prompt_for_claude(prompt),
             **invocation_parameters,
@@ -135,14 +130,19 @@ class AnthropicModel(BaseEvalModel):

         return str(response)

-    def
-        @self.retry
+    def _rate_limited_completion(self, **kwargs: Any) -> Any:
         @self._rate_limiter.limit
-        def
-
-
-
-
+        def _completion(**kwargs: Any) -> Any:
+            try:
+                response = self.client.completions.create(**kwargs)
+                return response.completion
+            except self._anthropic.BadRequestError as e:
+                exception_message = e.args[0]
+                if exception_message and "prompt is too long" in exception_message:
+                    raise PhoenixContextLimitExceeded(exception_message) from e
+                raise e
+
+        return _completion(**kwargs)

     async def _async_generate(self, prompt: str, **kwargs: Dict[str, Any]) -> str:
         # instruction is an invalid input to Anthropic models, it is passed in by
@@ -150,20 +150,25 @@ class AnthropicModel(BaseEvalModel):
         kwargs.pop("instruction", None)
         invocation_parameters = self.invocation_parameters()
         invocation_parameters.update(kwargs)
-        response = await self.
+        response = await self._async_rate_limited_completion(
             model=self.model, prompt=self._format_prompt_for_claude(prompt), **invocation_parameters
         )

         return str(response)

-    async def
-        @self.retry
+    async def _async_rate_limited_completion(self, **kwargs: Any) -> Any:
         @self._rate_limiter.alimit
-        async def
-
-
-
-
+        async def _async_completion(**kwargs: Any) -> Any:
+            try:
+                response = await self.async_client.completions.create(**kwargs)
+                return response.completion
+            except self._anthropic.BadRequestError as e:
+                exception_message = e.args[0]
+                if exception_message and "prompt is too long" in exception_message:
+                    raise PhoenixContextLimitExceeded(exception_message) from e
+                raise e
+
+        return await _async_completion(**kwargs)

     def _format_prompt_for_claude(self, prompt: str) -> str:
         # Claude requires prompt in the format of Human: ... Assistant:
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/base.py
RENAMED
@@ -2,22 +2,13 @@ import logging
 from abc import ABC, abstractmethod, abstractproperty
 from contextlib import contextmanager
 from dataclasses import dataclass, field
-from typing import TYPE_CHECKING, Any,
+from typing import TYPE_CHECKING, Any, Generator, List, Optional, Sequence

 from phoenix.experimental.evals.models.rate_limiters import RateLimiter

 if TYPE_CHECKING:
     from tiktoken import Encoding

-
-from tenacity import (
-    RetryCallState,
-    retry,
-    retry_base,
-    retry_if_exception_type,
-    stop_after_attempt,
-    wait_random_exponential,
-)
 from tqdm.asyncio import tqdm_asyncio
 from tqdm.auto import tqdm
 from typing_extensions import TypeVar
@@ -65,52 +56,6 @@ class BaseEvalModel(ABC):
     def reload_client(self) -> None:
         pass

-    def _retry(
-        self,
-        error_types: List[Type[BaseException]],
-        min_seconds: int,
-        max_seconds: int,
-        max_retries: int,
-    ) -> Callable[[Any], Any]:
-        """Create a retry decorator for a given LLM and provided list of error types."""
-
-        def log_retry(retry_state: RetryCallState) -> None:
-            if fut := retry_state.outcome:
-                exc = fut.exception()
-            else:
-                exc = None
-
-            if exc:
-                printif(
-                    self._verbose,
-                    (
-                        f"Failed attempt {retry_state.attempt_number}: "
-                        f"{type(exc).__module__}.{type(exc).__name__}"
-                    ),
-                )
-                printif(
-                    True,
-                    f"Failed attempt {retry_state.attempt_number}: raised {repr(exc)}",
-                )
-            else:
-                printif(True, f"Failed attempt {retry_state.attempt_number}")
-            return None
-
-        if not error_types:
-            # default to retrying on all exceptions
-            error_types = [Exception]
-
-        retry_instance: retry_base = retry_if_exception_type(error_types[0])
-        for error in error_types[1:]:
-            retry_instance = retry_instance | retry_if_exception_type(error)
-        return retry(
-            reraise=True,
-            stop=stop_after_attempt(max_retries),
-            wait=wait_random_exponential(multiplier=1, min=min_seconds, max=max_seconds),
-            retry=retry_instance,
-            before_sleep=log_retry,
-        )
-
     def __call__(self, prompt: str, instruction: Optional[str] = None, **kwargs: Any) -> str:
         """Run the LLM on the given prompt."""
         if not isinstance(prompt, str):
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/bedrock.py
RENAMED
@@ -3,6 +3,7 @@ import logging
 from dataclasses import dataclass, field
 from typing import TYPE_CHECKING, Any, Dict, List, Optional

+from phoenix.exceptions import PhoenixContextLimitExceeded
 from phoenix.experimental.evals.models.base import BaseEvalModel
 from phoenix.experimental.evals.models.rate_limiters import RateLimiter

@@ -54,12 +55,6 @@ class BedrockModel(BaseEvalModel):
         self._init_client()
         self._init_tiktoken()
         self._init_rate_limiter()
-        self.retry = self._retry(
-            error_types=[],  # default to catching all errors
-            min_seconds=self.retry_min_seconds,
-            max_seconds=self.retry_max_seconds,
-            max_retries=self.max_retries,
-        )

     def _init_environment(self) -> None:
         try:
@@ -130,21 +125,36 @@ class BedrockModel(BaseEvalModel):
         accept = "application/json"
         contentType = "application/json"

-        response = self.
+        response = self._rate_limited_completion(
             body=body, modelId=self.model_id, accept=accept, contentType=contentType
         )

         return self._parse_output(response) or ""

-    def
+    def _rate_limited_completion(self, **kwargs: Any) -> Any:
         """Use tenacity to retry the completion call."""

-        @self.retry
         @self._rate_limiter.limit
-        def
-
-
-
+        def _completion(**kwargs: Any) -> Any:
+            try:
+                return self.client.invoke_model(**kwargs)
+            except Exception as e:
+                exception_message = e.args[0]
+                if not exception_message:
+                    raise e
+
+                if "Input is too long" in exception_message:
+                    # Error from Anthropic models
+                    raise PhoenixContextLimitExceeded(exception_message) from e
+                elif "expected maxLength" in exception_message:
+                    # Error from Titan models
+                    raise PhoenixContextLimitExceeded(exception_message) from e
+                elif "Prompt has too many tokens" in exception_message:
+                    # Error from AI21 models
+                    raise PhoenixContextLimitExceeded(exception_message) from e
+                raise e
+
+        return _completion(**kwargs)

     def _format_prompt_for_claude(self, prompt: str) -> str:
         # Claude requires prompt in the format of Human: ... Assisatnt:
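Because Bedrock fronts several model families, the new `_completion` handler has to recognize each provider's wording for the same failure before mapping it to `PhoenixContextLimitExceeded`. The string matching, extracted as a standalone sketch (the three markers are exactly the ones the diff checks for; the helper name is illustrative):

    CONTEXT_LIMIT_MARKERS = (
        "Input is too long",           # Anthropic models on Bedrock
        "expected maxLength",          # Titan models
        "Prompt has too many tokens",  # AI21 models
    )

    def is_context_limit_error(message: str) -> bool:
        # True when a Bedrock error message indicates a context-window overflow
        return any(marker in message for marker in CONTEXT_LIMIT_MARKERS)

    assert is_context_limit_error("ValidationException: Input is too long for the model")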
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/litellm.py
RENAMED
@@ -95,24 +95,17 @@ class LiteLLMModel(BaseEvalModel):

     def _generate(self, prompt: str, **kwargs: Dict[str, Any]) -> str:
         messages = self._get_messages_from_prompt(prompt)
-
-        self.
-
-
-
-
-
-
-
-            **self.model_kwargs,
-        )
+        response = self._litellm.completion(
+            model=self.model_name,
+            messages=messages,
+            temperature=self.temperature,
+            max_tokens=self.max_tokens,
+            top_p=self.top_p,
+            num_retries=self.num_retries,
+            request_timeout=self.request_timeout,
+            **self.model_kwargs,
         )
-
-    def _generate_with_retry(self, **kwargs: Any) -> Any:
-        # Using default LiteLLM completion with retries = self.num_retries.
-
-        response = self._litellm.completion(**kwargs)
-        return response.choices[0].message.content
+        return str(response.choices[0].message.content)

     def _get_messages_from_prompt(self, prompt: str) -> List[Dict[str, str]]:
         # LiteLLM requires prompts in the format of messages
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/openai.py
RENAMED
@@ -14,6 +14,7 @@ from typing import (
     get_origin,
 )

+from phoenix.exceptions import PhoenixContextLimitExceeded
 from phoenix.experimental.evals.models.base import BaseEvalModel
 from phoenix.experimental.evals.models.rate_limiters import RateLimiter

@@ -114,25 +115,11 @@ class OpenAIModel(BaseEvalModel):

     def _init_environment(self) -> None:
         try:
-            import httpx
             import openai
             import openai._utils as openai_util

             self._openai = openai
             self._openai_util = openai_util
-            self._openai_retry_errors = [
-                self._openai.APITimeoutError,
-                self._openai.APIError,
-                self._openai.APIConnectionError,
-                self._openai.InternalServerError,
-                httpx.ReadTimeout,
-            ]
-            self.retry = self._retry(
-                error_types=self._openai_retry_errors,
-                min_seconds=self.retry_min_seconds,
-                max_seconds=self.retry_max_seconds,
-                max_retries=self.max_retries,
-            )
         except ImportError:
             self._raise_import_error(
                 package_display_name="OpenAI",
@@ -265,7 +252,7 @@ class OpenAIModel(BaseEvalModel):
             invoke_params["functions"] = functions
         if function_call := kwargs.get("function_call"):
             invoke_params["function_call"] = function_call
-        response = await self.
+        response = await self._async_rate_limited_completion(
             messages=messages,
             **invoke_params,
         )
@@ -284,7 +271,7 @@ class OpenAIModel(BaseEvalModel):
             invoke_params["functions"] = functions
         if function_call := kwargs.get("function_call"):
             invoke_params["function_call"] = function_call
-        response = self.
+        response = self._rate_limited_completion(
             messages=messages,
             **invoke_params,
         )
@@ -296,45 +283,51 @@ class OpenAIModel(BaseEvalModel):
             return str(function_call.get("arguments") or "")
         return str(message["content"])

-    async def
-        """Use tenacity to retry the completion call."""
-
-        @self.retry
+    async def _async_rate_limited_completion(self, **kwargs: Any) -> Any:
         @self._rate_limiter.alimit
-        async def
-
-        if
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+        async def _async_completion(**kwargs: Any) -> Any:
+            try:
+                if self._model_uses_legacy_completion_api:
+                    if "prompt" not in kwargs:
+                        kwargs["prompt"] = "\n\n".join(
+                            (message.get("content") or "")
+                            for message in (kwargs.pop("messages", None) or ())
+                        )
+                    # OpenAI 1.0.0 API responses are pydantic objects, not dicts
+                    # We must dump the model to get the dict
+                    res = await self._async_client.completions.create(**kwargs)
+                else:
+                    res = await self._async_client.chat.completions.create(**kwargs)
+                return res.model_dump()
+            except self._openai._exceptions.BadRequestError as e:
+                exception_message = e.args[0]
+                if exception_message and "maximum context length" in exception_message:
+                    raise PhoenixContextLimitExceeded(exception_message) from e
+                raise e
+
+        return await _async_completion(**kwargs)
+
+    def _rate_limited_completion(self, **kwargs: Any) -> Any:
         @self._rate_limiter.limit
-        def
-
-        if
-
-
-
-
-
-
-
-
-
+        def _completion(**kwargs: Any) -> Any:
+            try:
+                if self._model_uses_legacy_completion_api:
+                    if "prompt" not in kwargs:
+                        kwargs["prompt"] = "\n\n".join(
+                            (message.get("content") or "")
+                            for message in (kwargs.pop("messages", None) or ())
+                        )
+                    # OpenAI 1.0.0 API responses are pydantic objects, not dicts
+                    # We must dump the model to get the dict
+                    return self._client.completions.create(**kwargs).model_dump()
+                return self._client.chat.completions.create(**kwargs).model_dump()
+            except self._openai._exceptions.BadRequestError as e:
+                exception_message = e.args[0]
+                if exception_message and "maximum context length" in exception_message:
+                    raise PhoenixContextLimitExceeded(exception_message) from e
+                raise e
+
+        return _completion(**kwargs)

     @property
     def max_context_size(self) -> int:
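A subtlety in the rewritten OpenAI helpers: for models on the legacy completions API, chat-style `messages` are flattened into a single `prompt` by joining each message's content with blank lines, but only when the caller did not pass an explicit `prompt`. A self-contained sketch of that conversion (the function name is illustrative; the body mirrors the fallback added in `_completion`):

    def messages_to_prompt(kwargs: dict) -> dict:
        # Only applies when no explicit prompt was provided; tolerates a
        # missing or empty "messages" key, and None message contents.
        if "prompt" not in kwargs:
            kwargs["prompt"] = "\n\n".join(
                (message.get("content") or "")
                for message in (kwargs.pop("messages", None) or ())
            )
        return kwargs

    print(messages_to_prompt({"messages": [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "Say hi."},
    ]}))
    # {'prompt': 'You are terse.\n\nSay hi.'}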
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/vertex.py
RENAMED
@@ -46,12 +46,6 @@ class GeminiModel(BaseEvalModel):
     def __post_init__(self) -> None:
         self._init_client()
         self._init_rate_limiter()
-        self.retry = self._retry(
-            error_types=[],  # default to catching all errors
-            min_seconds=self.retry_min_seconds,
-            max_seconds=self.retry_max_seconds,
-            max_retries=self.max_retries,
-        )

     def reload_client(self) -> None:
         self._init_client()
@@ -115,30 +109,17 @@ class GeminiModel(BaseEvalModel):
         # instruction is an invalid input to Gemini models, it is passed in by
         # BaseEvalModel.__call__ and needs to be removed
         kwargs.pop("instruction", None)
-        response = self._generate_with_retry(
-            prompt=prompt,
-            generation_config=self.generation_config,
-            **kwargs,
-        )

-        return str(response)
-
-    def _generate_with_retry(
-        self, prompt: str, generation_config: Dict[str, Any], **kwargs: Any
-    ) -> Any:
-        @self.retry
         @self._rate_limiter.limit
-        def
+        def _rate_limited_completion(
+            prompt: str, generation_config: Dict[str, Any], **kwargs: Any
+        ) -> Any:
             response = self._model.generate_content(
                 contents=prompt, generation_config=generation_config, **kwargs
             )
             return self._parse_response_candidates(response)

-
-
-    async def _async_generate(self, prompt: str, **kwargs: Dict[str, Any]) -> str:
-        kwargs.pop("instruction", None)
-        response = await self._async_generate_with_retry(
+        response = _rate_limited_completion(
             prompt=prompt,
             generation_config=self.generation_config,
             **kwargs,
@@ -146,18 +127,27 @@ class GeminiModel(BaseEvalModel):

         return str(response)

-    async def
-
-
-
+    async def _async_generate(self, prompt: str, **kwargs: Dict[str, Any]) -> str:
+        # instruction is an invalid input to Gemini models, it is passed in by
+        # BaseEvalModel.__call__ and needs to be removed
+        kwargs.pop("instruction", None)
+
         @self._rate_limiter.alimit
-        async def
+        async def _rate_limited_completion(
+            prompt: str, generation_config: Dict[str, Any], **kwargs: Any
+        ) -> Any:
             response = await self._model.generate_content_async(
                 contents=prompt, generation_config=generation_config, **kwargs
             )
             return self._parse_response_candidates(response)

-
+        response = await _rate_limited_completion(
+            prompt=prompt,
+            generation_config=self.generation_config,
+            **kwargs,
+        )
+
+        return str(response)

     def _parse_response_candidates(self, response: Any) -> Any:
         if hasattr(response, "candidates"):
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/experimental/evals/models/vertexai.py
RENAMED
@@ -52,18 +52,6 @@ class VertexAIModel(BaseEvalModel):

             self._vertexai = vertexai
             self._google_exceptions = google_exceptions
-            self._google_api_retry_errors = [
-                self._google_exceptions.ResourceExhausted,
-                self._google_exceptions.ServiceUnavailable,
-                self._google_exceptions.Aborted,
-                self._google_exceptions.DeadlineExceeded,
-            ]
-            self.retry = self._retry(
-                error_types=self._google_api_retry_errors,
-                min_seconds=self.retry_min_seconds,
-                max_seconds=self.retry_max_seconds,
-                max_retries=self.max_retries,
-            )
         except ImportError:
             self._raise_import_error(
                 package_display_name="VertexAI",
@@ -97,19 +85,12 @@ class VertexAIModel(BaseEvalModel):

     def _generate(self, prompt: str, **kwargs: Dict[str, Any]) -> str:
         invoke_params = self.invocation_params
-        response = self.
+        response = self._model.predict(
             prompt=prompt,
             **invoke_params,
         )
         return str(response.text)

-    def _generate_with_retry(self, **kwargs: Any) -> Any:
-        @self.retry
-        def _completion_with_retry(**kwargs: Any) -> Any:
-            return self._model.predict(**kwargs)
-
-        return _completion_with_retry(**kwargs)
-
     @property
     def is_codey_model(self) -> bool:
         return is_codey_model(self.tuned_model_name or self.model_name)
{arize_phoenix-2.7.0 → arize_phoenix-2.8.0}/src/phoenix/server/api/schema.py
RENAMED
@@ -2,7 +2,6 @@ from collections import defaultdict
 from datetime import datetime
 from itertools import chain
 from typing import Dict, List, Optional, Set, Tuple, Union, cast
-from uuid import UUID

 import numpy as np
 import numpy.typing as npt
@@ -22,7 +21,7 @@ from phoenix.server.api.input_types.Coordinates import (
 from phoenix.server.api.input_types.SpanSort import SpanSort
 from phoenix.server.api.types.Cluster import Cluster, to_gql_clusters
 from phoenix.trace.dsl import SpanFilter
-from phoenix.trace.schemas import SpanID
+from phoenix.trace.schemas import SpanID, TraceID

 from .context import Context
 from .input_types.TimeRange import TimeRange
@@ -264,7 +263,7 @@ class Query:
                 root_spans_only=root_spans_only,
             )
         else:
-            spans = chain.from_iterable(map(traces.get_trace, map(
+            spans = chain.from_iterable(map(traces.get_trace, map(TraceID, trace_ids)))
         if predicate:
             spans = filter(predicate, spans)
         if sort: