azure-ai-evaluation 1.0.0b3__tar.gz → 1.0.0b4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (147)
  1. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/CHANGELOG.md +37 -3
  2. azure_ai_evaluation-1.0.0b4/NOTICE.txt +50 -0
  3. {azure_ai_evaluation-1.0.0b3/azure_ai_evaluation.egg-info → azure_ai_evaluation-1.0.0b4}/PKG-INFO +72 -44
  4. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/README.md +30 -34
  5. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_common/constants.py +4 -2
  6. azure_ai_evaluation-1.0.0b4/azure/ai/evaluation/_common/math.py +18 -0
  7. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_common/rai_service.py +54 -62
  8. azure_ai_evaluation-1.0.0b4/azure/ai/evaluation/_common/utils.py +272 -0
  9. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_constants.py +10 -2
  10. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/_batch_run_client/batch_run_context.py +10 -3
  11. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/_batch_run_client/code_client.py +33 -17
  12. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/_batch_run_client/proxy_client.py +17 -2
  13. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/_eval_run.py +26 -10
  14. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/_evaluate.py +116 -62
  15. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/_telemetry/__init__.py +16 -17
  16. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/_utils.py +44 -25
  17. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_coherence/_coherence.py +3 -2
  18. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_common/_base_eval.py +59 -30
  19. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_common/_base_prompty_eval.py +10 -13
  20. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_common/_base_rai_svc_eval.py +18 -20
  21. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_content_safety/_content_safety.py +15 -20
  22. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_content_safety/_content_safety_chat.py +63 -42
  23. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_content_safety/_hate_unfairness.py +4 -4
  24. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_content_safety/_self_harm.py +4 -4
  25. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_content_safety/_sexual.py +4 -4
  26. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_content_safety/_violence.py +4 -4
  27. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_eci/_eci.py +4 -4
  28. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_f1_score/_f1_score.py +14 -6
  29. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_fluency/_fluency.py +3 -2
  30. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_groundedness/_groundedness.py +3 -2
  31. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_protected_material/_protected_material.py +4 -4
  32. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_qa/_qa.py +4 -3
  33. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_relevance/_relevance.py +3 -2
  34. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_retrieval/_retrieval.py +11 -8
  35. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_rouge/_rouge.py +1 -1
  36. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_similarity/_similarity.py +21 -7
  37. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_xpia/xpia.py +4 -5
  38. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_exceptions.py +9 -6
  39. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_http_utils.py +203 -132
  40. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_model_configurations.py +5 -5
  41. azure_ai_evaluation-1.0.0b4/azure/ai/evaluation/_vendor/__init__.py +3 -0
  42. azure_ai_evaluation-1.0.0b4/azure/ai/evaluation/_vendor/rouge_score/__init__.py +14 -0
  43. azure_ai_evaluation-1.0.0b4/azure/ai/evaluation/_vendor/rouge_score/rouge_scorer.py +328 -0
  44. azure_ai_evaluation-1.0.0b4/azure/ai/evaluation/_vendor/rouge_score/scoring.py +63 -0
  45. azure_ai_evaluation-1.0.0b4/azure/ai/evaluation/_vendor/rouge_score/tokenize.py +63 -0
  46. azure_ai_evaluation-1.0.0b4/azure/ai/evaluation/_vendor/rouge_score/tokenizers.py +53 -0
  47. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_version.py +1 -1
  48. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_adversarial_simulator.py +85 -60
  49. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_conversation/__init__.py +13 -12
  50. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_conversation/_conversation.py +4 -4
  51. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_direct_attack_simulator.py +24 -66
  52. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_helpers/_experimental.py +20 -9
  53. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_helpers/_simulator_data_classes.py +4 -4
  54. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_indirect_attack_simulator.py +22 -64
  55. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_model_tools/_identity_manager.py +67 -21
  56. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_model_tools/_proxy_completion_model.py +28 -11
  57. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_model_tools/_template_handler.py +68 -24
  58. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_model_tools/models.py +10 -10
  59. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_prompty/task_query_response.prompty +0 -5
  60. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_prompty/task_simulate.prompty +0 -4
  61. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_simulator.py +112 -113
  62. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_tracing.py +4 -4
  63. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4/azure_ai_evaluation.egg-info}/PKG-INFO +72 -44
  64. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure_ai_evaluation.egg-info/SOURCES.txt +8 -0
  65. azure_ai_evaluation-1.0.0b4/azure_ai_evaluation.egg-info/requires.txt +9 -0
  66. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/pyproject.toml +1 -2
  67. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/setup.py +3 -5
  68. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/e2etests/custom_evaluators/answer_length_with_aggregation.py +9 -2
  69. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/e2etests/test_adv_simulator.py +51 -24
  70. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/e2etests/test_builtin_evaluators.py +16 -16
  71. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/e2etests/test_evaluate.py +12 -8
  72. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/e2etests/test_sim_and_eval.py +2 -3
  73. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_content_safety_rai_script.py +11 -11
  74. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_eval_run.py +5 -2
  75. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_evaluate.py +4 -4
  76. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_evaluate_telemetry.py +10 -9
  77. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_jailbreak_simulator.py +4 -3
  78. azure_ai_evaluation-1.0.0b4/tests/unittests/test_non_adv_simulator.py +359 -0
  79. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_simulator.py +4 -5
  80. azure_ai_evaluation-1.0.0b3/azure/ai/evaluation/_common/utils.py +0 -102
  81. azure_ai_evaluation-1.0.0b3/azure_ai_evaluation.egg-info/requires.txt +0 -16
  82. azure_ai_evaluation-1.0.0b3/tests/unittests/test_non_adv_simulator.py +0 -129
  83. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/MANIFEST.in +0 -0
  84. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/__init__.py +0 -0
  85. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/__init__.py +0 -0
  86. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/__init__.py +1 -1
  87. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_common/__init__.py +0 -0
  88. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/__init__.py +0 -0
  89. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluate/_batch_run_client/__init__.py +0 -0
  90. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/__init__.py +0 -0
  91. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_bleu/__init__.py +0 -0
  92. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_bleu/_bleu.py +0 -0
  93. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_coherence/__init__.py +0 -0
  94. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_coherence/coherence.prompty +0 -0
  95. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_common/__init__.py +0 -0
  96. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_content_safety/__init__.py +0 -0
  97. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_eci/__init__.py +0 -0
  98. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_f1_score/__init__.py +0 -0
  99. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_fluency/__init__.py +0 -0
  100. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_fluency/fluency.prompty +0 -0
  101. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_gleu/__init__.py +0 -0
  102. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_gleu/_gleu.py +0 -0
  103. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_groundedness/__init__.py +0 -0
  104. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_groundedness/groundedness.prompty +0 -0
  105. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_meteor/__init__.py +0 -0
  106. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_meteor/_meteor.py +0 -0
  107. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_protected_material/__init__.py +0 -0
  108. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_qa/__init__.py +0 -0
  109. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_relevance/__init__.py +0 -0
  110. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_relevance/relevance.prompty +0 -0
  111. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_retrieval/__init__.py +0 -0
  112. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_retrieval/retrieval.prompty +0 -0
  113. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_rouge/__init__.py +0 -0
  114. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_similarity/__init__.py +0 -0
  115. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_similarity/similarity.prompty +0 -0
  116. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_evaluators/_xpia/__init__.py +0 -0
  117. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/_user_agent.py +0 -0
  118. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/py.typed +0 -0
  119. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/__init__.py +0 -0
  120. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_adversarial_scenario.py +0 -0
  121. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_constants.py +0 -0
  122. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_conversation/constants.py +0 -0
  123. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_helpers/__init__.py +0 -0
  124. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_helpers/_language_suffix_mapping.py +0 -0
  125. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_model_tools/__init__.py +0 -0
  126. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_model_tools/_rai_client.py +0 -0
  127. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_prompty/__init__.py +0 -0
  128. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure/ai/evaluation/simulator/_utils.py +0 -0
  129. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure_ai_evaluation.egg-info/dependency_links.txt +0 -0
  130. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure_ai_evaluation.egg-info/not-zip-safe +0 -0
  131. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/azure_ai_evaluation.egg-info/top_level.txt +0 -0
  132. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/setup.cfg +0 -0
  133. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/__init__.py +0 -0
  134. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/__openai_patcher.py +0 -0
  135. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/conftest.py +0 -0
  136. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/e2etests/__init__.py +0 -0
  137. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/e2etests/target_fn.py +0 -0
  138. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/e2etests/test_metrics_upload.py +0 -0
  139. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_batch_run_context.py +0 -0
  140. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_built_in_evaluator.py +0 -0
  141. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_content_safety_defect_rate.py +0 -0
  142. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_evaluators/apology_dag/apology.py +0 -0
  143. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_evaluators/test_inputs_evaluators.py +0 -0
  144. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_save_eval.py +0 -0
  145. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_synthetic_callback_conv_bot.py +0 -0
  146. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_synthetic_conversation_bot.py +0 -0
  147. {azure_ai_evaluation-1.0.0b3 → azure_ai_evaluation-1.0.0b4}/tests/unittests/test_utils.py +0 -0
@@ -1,5 +1,19 @@
  # Release History
 
+ ## 1.0.0b4 (2024-10-16)
+
+ ### Breaking Changes
+
+ - Removed the `numpy` dependency. All NaN values returned by the SDK have been changed from `numpy.nan` to `math.nan`.
+ - `credential` is now required to be passed in for all content safety evaluators and `ProtectedMaterialsEvaluator`. `DefaultAzureCredential` will no longer be chosen if a credential is not passed.
+ - Changed the package extra name from "pf-azure" to "remote".
+
+ ### Bugs Fixed
+ - Adversarial Conversation simulations would fail with `Forbidden`. Added logic to re-fetch the token in the exponential retry logic used to retrieve the RAI service response.
+
+ ### Other Changes
+ - Enhanced the error message to provide clearer instructions when required packages for the remote tracking feature are missing.
+
  ## 1.0.0b3 (2024-10-01)
 
  ### Features Added
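To make the `credential` breaking change in the 1.0.0b4 notes above concrete, here is a minimal sketch (not taken from the package) of constructing a content safety evaluator with an explicit credential. The `ViolenceEvaluator` import and the exact keyword names are assumptions based on the changelog's description and may differ slightly from the released API.

```python
import os

from azure.ai.evaluation import ViolenceEvaluator
from azure.identity import DefaultAzureCredential

azure_ai_project = {
    "subscription_id": os.environ["AZURE_SUBSCRIPTION_ID"],
    "resource_group_name": os.environ["RESOURCE_GROUP"],
    "project_name": os.environ["PROJECT_NAME"],
}

# As of 1.0.0b4 the credential must be supplied explicitly; the SDK no longer
# falls back to DefaultAzureCredential when no credential is passed.
violence_evaluator = ViolenceEvaluator(
    credential=DefaultAzureCredential(),
    azure_ai_project=azure_ai_project,
)

result = violence_evaluator(query="What is the capital of France?", response="Paris.")
print(result)
```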
@@ -54,9 +68,29 @@ evaluate(
  )
  ```
 
+ - Simulator now requires a model configuration to call the prompty instead of an Azure AI project scope. This enables the usage of the simulator with Entra ID based auth.
+ Before:
+ ```python
+ azure_ai_project = {
+     "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
+     "resource_group_name": os.environ.get("RESOURCE_GROUP"),
+     "project_name": os.environ.get("PROJECT_NAME"),
+ }
+ sim = Simulator(azure_ai_project=azure_ai_project, credential=DefaultAzureCredential())
+ ```
+ After:
+ ```python
+ model_config = {
+     "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
+     "azure_deployment": os.environ.get("AZURE_DEPLOYMENT"),
+ }
+ sim = Simulator(model_config=model_config)
+ ```
+ If `api_key` is not included in the `model_config`, the prompty runtime in `promptflow-core` will pick up `DefaultAzureCredential`.
+
  ### Bugs Fixed
 
- - Fixed issue where Entra ID authentication was not working with `AzureOpenAIModelConfiguration`
+ - Fixed issue where Entra ID authentication was not working with `AzureOpenAIModelConfiguration`
 
  ## 1.0.0b2 (2024-09-24)
 
@@ -69,9 +103,9 @@ evaluate(
  ### Breaking Changes
 
  - The `synthetic` namespace has been renamed to `simulator`, and sub-namespaces under this module have been removed
- - The `evaluate` and `evaluators` namespaces have been removed, and everything previously exposed in those modules has been added to the root namespace `azure.ai.evaluation`
+ - The `evaluate` and `evaluators` namespaces have been removed, and everything previously exposed in those modules has been added to the root namespace `azure.ai.evaluation`
  - The parameter name `project_scope` in content safety evaluators have been renamed to `azure_ai_project` for consistency with evaluate API and simulators.
- - Model configurations classes are now of type `TypedDict` and are exposed in the `azure.ai.evaluation` module instead of coming from `promptflow.core`.
+ - Model configurations classes are now of type `TypedDict` and are exposed in the `azure.ai.evaluation` module instead of coming from `promptflow.core`.
  - Updated the parameter names for `question` and `answer` in built-in evaluators to more generic terms: `query` and `response`.
 
  ### Features Added
@@ -0,0 +1,50 @@
+ NOTICES AND INFORMATION
+ Do Not Translate or Localize
+
+ This software incorporates material from third parties.
+ Microsoft makes certain open source code available at https://3rdpartysource.microsoft.com,
+ or you may send a check or money order for US $5.00, including the product name,
+ the open source component name, platform, and version number, to:
+
+ Source Code Compliance Team
+ Microsoft Corporation
+ One Microsoft Way
+ Redmond, WA 98052
+ USA
+
+ Notwithstanding any other terms, you may reverse engineer this software to the extent
+ required to debug changes to any libraries licensed under the GNU Lesser General Public License.
+
+ License notice for nltk
+ ---------------------------------------------------------
+
+ Copyright 2024 The NLTK Project
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+ License notice for rouge-score
+ ---------------------------------------------------------
+
+ Copyright 2024 The Google Research Authors
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: azure-ai-evaluation
- Version: 1.0.0b3
+ Version: 1.0.0b4
  Summary: Microsoft Azure Evaluation Library for Python
  Home-page: https://github.com/Azure/azure-sdk-for-python
  Author: Microsoft Corporation
@@ -21,17 +21,15 @@ Classifier: License :: OSI Approved :: MIT License
  Classifier: Operating System :: OS Independent
  Requires-Python: >=3.8
  Description-Content-Type: text/markdown
+ License-File: NOTICE.txt
  Requires-Dist: promptflow-devkit>=1.15.0
  Requires-Dist: promptflow-core>=1.15.0
- Requires-Dist: numpy>=1.23.2; python_version < "3.12"
- Requires-Dist: numpy>=1.26.4; python_version >= "3.12"
  Requires-Dist: pyjwt>=2.8.0
- Requires-Dist: azure-identity>=1.12.0
+ Requires-Dist: azure-identity>=1.16.0
  Requires-Dist: azure-core>=1.30.2
  Requires-Dist: nltk>=3.9.1
- Requires-Dist: rouge-score>=0.1.2
- Provides-Extra: pf-azure
- Requires-Dist: promptflow-azure<2.0.0,>=1.15.0; extra == "pf-azure"
+ Provides-Extra: remote
+ Requires-Dist: promptflow-azure<2.0.0,>=1.15.0; extra == "remote"
 
  # Azure AI Evaluation client library for Python
 
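Since the optional `promptflow-azure` dependency now sits behind the `remote` extra (see the `Provides-Extra` change above), callers that use remote tracking can guard the import and point users at the new extra name. This is a hedged illustration, not code from the package; the module name `promptflow.azure` is assumed to be what the `promptflow-azure` distribution provides.

```python
# Hypothetical guard for the optional remote-tracking dependency.
try:
    import promptflow.azure  # noqa: F401  # provided by the "remote" extra
except ImportError as exc:
    raise ImportError(
        "Remote tracking needs promptflow-azure; install it with the new extra name: "
        "pip install azure-ai-evaluation[remote]"
    ) from exc
```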
@@ -154,11 +152,6 @@ name: ApplicationPrompty
  description: Simulates an application
  model:
    api: chat
-   configuration:
-     type: azure_openai
-     azure_deployment: ${env:AZURE_DEPLOYMENT}
-     api_key: ${env:AZURE_OPENAI_API_KEY}
-     azure_endpoint: ${env:AZURE_OPENAI_ENDPOINT}
    parameters:
      temperature: 0.0
      top_p: 1.0
@@ -187,52 +180,55 @@ import asyncio
  from typing import Any, Dict, List, Optional
  from azure.ai.evaluation.simulator import Simulator
  from promptflow.client import load_flow
- from azure.identity import DefaultAzureCredential
  import os
+ import wikipedia
 
- azure_ai_project = {
-     "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
-     "resource_group_name": os.environ.get("RESOURCE_GROUP"),
-     "project_name": os.environ.get("PROJECT_NAME")
+ # Set up the model configuration without api_key, using DefaultAzureCredential
+ model_config = {
+     "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
+     "azure_deployment": os.environ.get("AZURE_DEPLOYMENT"),
+     # not providing key would make the SDK pick up `DefaultAzureCredential`
+     # use "api_key": "<your API key>"
  }
 
- import wikipedia
- wiki_search_term = "Leonardo da vinci"
+ # Use Wikipedia to get some text for the simulation
+ wiki_search_term = "Leonardo da Vinci"
  wiki_title = wikipedia.search(wiki_search_term)[0]
  wiki_page = wikipedia.page(wiki_title)
  text = wiki_page.summary[:1000]
 
- def method_to_invoke_application_prompty(query: str):
+ def method_to_invoke_application_prompty(query: str, messages_list: List[Dict], context: Optional[Dict]):
      try:
          current_dir = os.path.dirname(__file__)
          prompty_path = os.path.join(current_dir, "application.prompty")
-         _flow = load_flow(source=prompty_path, model={
-             "configuration": azure_ai_project
-         })
+         _flow = load_flow(
+             source=prompty_path,
+             model=model_config,
+             credential=DefaultAzureCredential()
+         )
          response = _flow(
              query=query,
              context=context,
              conversation_history=messages_list
          )
          return response
-     except:
-         print("Something went wrong invoking the prompty")
+     except Exception as e:
+         print(f"Something went wrong invoking the prompty: {e}")
          return "something went wrong"
 
  async def callback(
-     messages: List[Dict],
+     messages: Dict[str, List[Dict]],
      stream: bool = False,
      session_state: Any = None, # noqa: ANN401
      context: Optional[Dict[str, Any]] = None,
  ) -> dict:
      messages_list = messages["messages"]
-     # get last message
+     # Get the last message from the user
      latest_message = messages_list[-1]
      query = latest_message["content"]
-     context = None
-     # call your endpoint or ai application here
-     response = method_to_invoke_application_prompty(query)
-     # we are formatting the response to follow the openAI chat protocol format
+     # Call your endpoint or AI application here
+     response = method_to_invoke_application_prompty(query, messages_list, context)
+     # Format the response to follow the OpenAI chat protocol format
      formatted_response = {
          "content": response,
          "role": "assistant",
@@ -243,10 +239,8 @@ async def callback(
      messages["messages"].append(formatted_response)
      return {"messages": messages["messages"], "stream": stream, "session_state": session_state, "context": context}
 
-
-
  async def main():
-     simulator = Simulator(azure_ai_project=azure_ai_project, credential=DefaultAzureCredential())
+     simulator = Simulator(model_config=model_config)
      outputs = await simulator(
          target=callback,
          text=text,
@@ -257,17 +251,17 @@ async def main():
              f"I am a teacher and I want to teach my students about {wiki_search_term}"
          ],
      )
-     print(json.dumps(outputs))
+     print(json.dumps(outputs, indent=2))
 
  if __name__ == "__main__":
-     os.environ["AZURE_SUBSCRIPTION_ID"] = ""
-     os.environ["RESOURCE_GROUP"] = ""
-     os.environ["PROJECT_NAME"] = ""
-     os.environ["AZURE_OPENAI_API_KEY"] = ""
-     os.environ["AZURE_OPENAI_ENDPOINT"] = ""
-     os.environ["AZURE_DEPLOYMENT"] = ""
+     # Ensure that the following environment variables are set in your environment:
+     # AZURE_OPENAI_ENDPOINT and AZURE_DEPLOYMENT
+     # Example:
+     # os.environ["AZURE_OPENAI_ENDPOINT"] = "https://your-endpoint.openai.azure.com/"
+     # os.environ["AZURE_DEPLOYMENT"] = "your-deployment-name"
      asyncio.run(main())
      print("done!")
+
  ```
 
  #### Adversarial Simulator
@@ -426,6 +420,20 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con
 
  # Release History
 
+ ## 1.0.0b4 (2024-10-16)
+
+ ### Breaking Changes
+
+ - Removed the `numpy` dependency. All NaN values returned by the SDK have been changed from `numpy.nan` to `math.nan`.
+ - `credential` is now required to be passed in for all content safety evaluators and `ProtectedMaterialsEvaluator`. `DefaultAzureCredential` will no longer be chosen if a credential is not passed.
+ - Changed the package extra name from "pf-azure" to "remote".
+
+ ### Bugs Fixed
+ - Adversarial Conversation simulations would fail with `Forbidden`. Added logic to re-fetch the token in the exponential retry logic used to retrieve the RAI service response.
+
+ ### Other Changes
+ - Enhanced the error message to provide clearer instructions when required packages for the remote tracking feature are missing.
+
  ## 1.0.0b3 (2024-10-01)
 
  ### Features Added
@@ -480,9 +488,29 @@ evaluate(
  )
  ```
 
+ - Simulator now requires a model configuration to call the prompty instead of an Azure AI project scope. This enables the usage of the simulator with Entra ID based auth.
+ Before:
+ ```python
+ azure_ai_project = {
+     "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
+     "resource_group_name": os.environ.get("RESOURCE_GROUP"),
+     "project_name": os.environ.get("PROJECT_NAME"),
+ }
+ sim = Simulator(azure_ai_project=azure_ai_project, credential=DefaultAzureCredential())
+ ```
+ After:
+ ```python
+ model_config = {
+     "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
+     "azure_deployment": os.environ.get("AZURE_DEPLOYMENT"),
+ }
+ sim = Simulator(model_config=model_config)
+ ```
+ If `api_key` is not included in the `model_config`, the prompty runtime in `promptflow-core` will pick up `DefaultAzureCredential`.
+
  ### Bugs Fixed
 
- - Fixed issue where Entra ID authentication was not working with `AzureOpenAIModelConfiguration`
+ - Fixed issue where Entra ID authentication was not working with `AzureOpenAIModelConfiguration`
 
  ## 1.0.0b2 (2024-09-24)
 
@@ -495,9 +523,9 @@ evaluate(
  ### Breaking Changes
 
  - The `synthetic` namespace has been renamed to `simulator`, and sub-namespaces under this module have been removed
- - The `evaluate` and `evaluators` namespaces have been removed, and everything previously exposed in those modules has been added to the root namespace `azure.ai.evaluation`
+ - The `evaluate` and `evaluators` namespaces have been removed, and everything previously exposed in those modules has been added to the root namespace `azure.ai.evaluation`
  - The parameter name `project_scope` in content safety evaluators have been renamed to `azure_ai_project` for consistency with evaluate API and simulators.
- - Model configurations classes are now of type `TypedDict` and are exposed in the `azure.ai.evaluation` module instead of coming from `promptflow.core`.
+ - Model configurations classes are now of type `TypedDict` and are exposed in the `azure.ai.evaluation` module instead of coming from `promptflow.core`.
  - Updated the parameter names for `question` and `answer` in built-in evaluators to more generic terms: `query` and `response`.
 
  ### Features Added
@@ -119,11 +119,6 @@ name: ApplicationPrompty
  description: Simulates an application
  model:
    api: chat
-   configuration:
-     type: azure_openai
-     azure_deployment: ${env:AZURE_DEPLOYMENT}
-     api_key: ${env:AZURE_OPENAI_API_KEY}
-     azure_endpoint: ${env:AZURE_OPENAI_ENDPOINT}
    parameters:
      temperature: 0.0
      top_p: 1.0
@@ -152,52 +147,55 @@ import asyncio
  from typing import Any, Dict, List, Optional
  from azure.ai.evaluation.simulator import Simulator
  from promptflow.client import load_flow
- from azure.identity import DefaultAzureCredential
  import os
+ import wikipedia
 
- azure_ai_project = {
-     "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
-     "resource_group_name": os.environ.get("RESOURCE_GROUP"),
-     "project_name": os.environ.get("PROJECT_NAME")
+ # Set up the model configuration without api_key, using DefaultAzureCredential
+ model_config = {
+     "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
+     "azure_deployment": os.environ.get("AZURE_DEPLOYMENT"),
+     # not providing key would make the SDK pick up `DefaultAzureCredential`
+     # use "api_key": "<your API key>"
  }
 
- import wikipedia
- wiki_search_term = "Leonardo da vinci"
+ # Use Wikipedia to get some text for the simulation
+ wiki_search_term = "Leonardo da Vinci"
  wiki_title = wikipedia.search(wiki_search_term)[0]
  wiki_page = wikipedia.page(wiki_title)
  text = wiki_page.summary[:1000]
 
- def method_to_invoke_application_prompty(query: str):
+ def method_to_invoke_application_prompty(query: str, messages_list: List[Dict], context: Optional[Dict]):
      try:
          current_dir = os.path.dirname(__file__)
          prompty_path = os.path.join(current_dir, "application.prompty")
-         _flow = load_flow(source=prompty_path, model={
-             "configuration": azure_ai_project
-         })
+         _flow = load_flow(
+             source=prompty_path,
+             model=model_config,
+             credential=DefaultAzureCredential()
+         )
          response = _flow(
              query=query,
              context=context,
              conversation_history=messages_list
          )
          return response
-     except:
-         print("Something went wrong invoking the prompty")
+     except Exception as e:
+         print(f"Something went wrong invoking the prompty: {e}")
          return "something went wrong"
 
  async def callback(
-     messages: List[Dict],
+     messages: Dict[str, List[Dict]],
      stream: bool = False,
      session_state: Any = None, # noqa: ANN401
      context: Optional[Dict[str, Any]] = None,
  ) -> dict:
      messages_list = messages["messages"]
-     # get last message
+     # Get the last message from the user
      latest_message = messages_list[-1]
      query = latest_message["content"]
-     context = None
-     # call your endpoint or ai application here
-     response = method_to_invoke_application_prompty(query)
-     # we are formatting the response to follow the openAI chat protocol format
+     # Call your endpoint or AI application here
+     response = method_to_invoke_application_prompty(query, messages_list, context)
+     # Format the response to follow the OpenAI chat protocol format
      formatted_response = {
          "content": response,
          "role": "assistant",
@@ -208,10 +206,8 @@ async def callback(
      messages["messages"].append(formatted_response)
      return {"messages": messages["messages"], "stream": stream, "session_state": session_state, "context": context}
 
-
-
  async def main():
-     simulator = Simulator(azure_ai_project=azure_ai_project, credential=DefaultAzureCredential())
+     simulator = Simulator(model_config=model_config)
      outputs = await simulator(
          target=callback,
          text=text,
@@ -222,17 +218,17 @@ async def main():
              f"I am a teacher and I want to teach my students about {wiki_search_term}"
          ],
      )
-     print(json.dumps(outputs))
+     print(json.dumps(outputs, indent=2))
 
  if __name__ == "__main__":
-     os.environ["AZURE_SUBSCRIPTION_ID"] = ""
-     os.environ["RESOURCE_GROUP"] = ""
-     os.environ["PROJECT_NAME"] = ""
-     os.environ["AZURE_OPENAI_API_KEY"] = ""
-     os.environ["AZURE_OPENAI_ENDPOINT"] = ""
-     os.environ["AZURE_DEPLOYMENT"] = ""
+     # Ensure that the following environment variables are set in your environment:
+     # AZURE_OPENAI_ENDPOINT and AZURE_DEPLOYMENT
+     # Example:
+     # os.environ["AZURE_OPENAI_ENDPOINT"] = "https://your-endpoint.openai.azure.com/"
+     # os.environ["AZURE_DEPLOYMENT"] = "your-deployment-name"
      asyncio.run(main())
      print("done!")
+
  ```
 
  #### Adversarial Simulator
@@ -3,6 +3,8 @@
  # ---------------------------------------------------------
  from enum import Enum
 
+ from azure.core import CaseInsensitiveEnumMeta
+
 
  class CommonConstants:
      """Define common constants."""
@@ -43,7 +45,7 @@ class _InternalAnnotationTasks:
      ECI = "eci"
 
 
- class EvaluationMetrics:
+ class EvaluationMetrics(str, Enum, metaclass=CaseInsensitiveEnumMeta):
      """Evaluation metrics to aid the RAI service in determining what
      metrics to request, and how to present them back to the user."""
 
@@ -56,7 +58,7 @@ class EvaluationMetrics:
      XPIA = "xpia"
 
 
- class _InternalEvaluationMetrics:
+ class _InternalEvaluationMetrics(str, Enum, metaclass=CaseInsensitiveEnumMeta):
      """Evaluation metrics that are not publicly supported.
      These metrics are experimental and subject to potential change or migration to the main
      enum over time.
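The move to `str`-based enums with `CaseInsensitiveEnumMeta` makes member lookup case-insensitive while keeping plain-string comparisons working. The standalone sketch below mirrors (but does not copy) the enum in the diff above to show the resulting behavior:

```python
from enum import Enum

from azure.core import CaseInsensitiveEnumMeta


class EvaluationMetrics(str, Enum, metaclass=CaseInsensitiveEnumMeta):
    """Trimmed stand-in for the enum shown in the diff above."""

    VIOLENCE = "violence"
    XPIA = "xpia"


# Member lookup by name is case-insensitive thanks to the metaclass...
assert EvaluationMetrics["violence"] is EvaluationMetrics.VIOLENCE
# ...and members compare equal to plain strings because the enum derives from str.
assert EvaluationMetrics.VIOLENCE == "violence"
print(EvaluationMetrics.XPIA.value)  # -> xpia
```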
@@ -0,0 +1,18 @@
+ # ---------------------------------------------------------
+ # Copyright (c) Microsoft Corporation. All rights reserved.
+ # ---------------------------------------------------------
+
+ import math
+ from typing import List
+
+
+ def list_sum(lst: List[float]) -> float:
+     return sum(lst)
+
+
+ def list_mean(lst: List[float]) -> float:
+     return list_sum(lst) / len(lst)
+
+
+ def list_mean_nan_safe(lst: List[float]) -> float:
+     return list_mean([l for l in lst if not math.isnan(l)])
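These helpers replace the earlier `numpy`-based aggregation: `list_mean_nan_safe` drops `math.nan` entries before averaging, whereas a plain mean over the same list would be poisoned by NaN. A quick usage sketch (importing the private module purely for illustration):

```python
import math

from azure.ai.evaluation._common.math import list_mean, list_mean_nan_safe

scores = [4.0, math.nan, 5.0]

print(list_mean_nan_safe(scores))     # 4.5 -- NaN entries are filtered out first
print(math.isnan(list_mean(scores)))  # True -- the unfiltered mean is NaN
```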