azure-ai-evaluation 1.0.0b5__tar.gz → 1.0.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of azure-ai-evaluation might be problematic.

Files changed (181)
  1. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/CHANGELOG.md +39 -2
  2. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/MANIFEST.in +2 -1
  3. {azure_ai_evaluation-1.0.0b5/azure_ai_evaluation.egg-info → azure_ai_evaluation-1.0.1}/PKG-INFO +232 -324
  4. azure_ai_evaluation-1.0.1/README.md +345 -0
  5. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/TROUBLESHOOTING.md +15 -4
  6. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_common/_experimental.py +4 -0
  7. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_common/math.py +89 -0
  8. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_common/rai_service.py +80 -29
  9. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_common/utils.py +50 -16
  10. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_constants.py +1 -0
  11. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_batch_run/eval_run_context.py +9 -0
  12. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_batch_run/proxy_client.py +13 -3
  13. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_batch_run/target_run_context.py +11 -0
  14. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_eval_run.py +34 -10
  15. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_evaluate.py +59 -103
  16. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_telemetry/__init__.py +2 -1
  17. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_utils.py +6 -4
  18. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_bleu/_bleu.py +16 -17
  19. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_coherence/_coherence.py +107 -0
  20. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_common/_base_eval.py +17 -5
  21. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_common/_base_prompty_eval.py +4 -2
  22. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_common/_base_rai_svc_eval.py +6 -9
  23. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_content_safety/_content_safety.py +56 -50
  24. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_content_safety/_hate_unfairness.py +129 -0
  25. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_content_safety/_self_harm.py +123 -0
  26. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_content_safety/_sexual.py +125 -0
  27. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_content_safety/_violence.py +126 -0
  28. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_eci/_eci.py +28 -3
  29. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_f1_score/_f1_score.py +20 -13
  30. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_fluency/_fluency.py +104 -0
  31. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_gleu/_gleu.py +13 -15
  32. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_groundedness/_groundedness.py +144 -0
  33. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_meteor/_meteor.py +17 -20
  34. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_multimodal/_content_safety_multimodal.py +10 -8
  35. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_multimodal/_content_safety_multimodal_base.py +0 -2
  36. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_multimodal/_hate_unfairness.py +6 -2
  37. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_multimodal/_protected_material.py +10 -6
  38. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_multimodal/_self_harm.py +6 -2
  39. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_multimodal/_sexual.py +6 -2
  40. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_multimodal/_violence.py +6 -2
  41. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_protected_material/_protected_material.py +113 -0
  42. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_qa/_qa.py +25 -37
  43. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_relevance/_relevance.py +114 -0
  44. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_retrieval/_retrieval.py +112 -0
  45. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_rouge/_rouge.py +24 -25
  46. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_service_groundedness/_service_groundedness.py +65 -67
  47. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_similarity/_similarity.py +26 -20
  48. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_evaluators/_xpia/xpia.py +125 -0
  49. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_exceptions.py +2 -0
  50. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_model_configurations.py +123 -0
  51. azure_ai_evaluation-1.0.1/azure/ai/evaluation/_version.py +5 -0
  52. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_adversarial_scenario.py +15 -1
  53. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_adversarial_simulator.py +25 -34
  54. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_constants.py +11 -1
  55. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_direct_attack_simulator.py +16 -8
  56. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_indirect_attack_simulator.py +11 -1
  57. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_model_tools/_identity_manager.py +3 -1
  58. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_model_tools/_rai_client.py +8 -4
  59. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_simulator.py +51 -45
  60. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_utils.py +25 -7
  61. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1/azure_ai_evaluation.egg-info}/PKG-INFO +232 -324
  62. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure_ai_evaluation.egg-info/SOURCES.txt +5 -1
  63. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure_ai_evaluation.egg-info/requires.txt +0 -1
  64. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/pyproject.toml +1 -2
  65. azure_ai_evaluation-1.0.1/samples/README.md +57 -0
  66. azure_ai_evaluation-1.0.1/samples/data/evaluate_test_data.jsonl +3 -0
  67. azure_ai_evaluation-1.0.1/samples/evaluation_samples_common.py +60 -0
  68. azure_ai_evaluation-1.0.1/samples/evaluation_samples_evaluate.py +395 -0
  69. azure_ai_evaluation-1.0.1/samples/evaluation_samples_simulate.py +249 -0
  70. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/setup.py +1 -2
  71. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/conftest.py +10 -1
  72. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/e2etests/target_fn.py +1 -1
  73. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/e2etests/test_builtin_evaluators.py +26 -50
  74. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/e2etests/test_evaluate.py +210 -11
  75. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_built_in_evaluator.py +6 -16
  76. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_content_safety_rai_script.py +44 -3
  77. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_evaluate.py +65 -22
  78. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_non_adv_simulator.py +5 -5
  79. azure_ai_evaluation-1.0.1/tests/unittests/test_utils.py +258 -0
  80. azure_ai_evaluation-1.0.0b5/README.md +0 -473
  81. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_common/math.py +0 -29
  82. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_coherence/_coherence.py +0 -76
  83. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_content_safety/_content_safety_chat.py +0 -322
  84. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_content_safety/_hate_unfairness.py +0 -84
  85. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_content_safety/_self_harm.py +0 -84
  86. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_content_safety/_sexual.py +0 -84
  87. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_content_safety/_violence.py +0 -84
  88. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_fluency/_fluency.py +0 -73
  89. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_groundedness/_groundedness.py +0 -106
  90. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_protected_material/_protected_material.py +0 -90
  91. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_relevance/_relevance.py +0 -80
  92. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_retrieval/_retrieval.py +0 -197
  93. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_evaluators/_xpia/xpia.py +0 -91
  94. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_model_configurations.py +0 -72
  95. azure_ai_evaluation-1.0.0b5/azure/ai/evaluation/_version.py +0 -5
  96. azure_ai_evaluation-1.0.0b5/tests/unittests/test_utils.py +0 -56
  97. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/NOTICE.txt +0 -0
  98. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/__init__.py +0 -0
  99. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/__init__.py +0 -0
  100. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/__init__.py +0 -0
  101. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_common/__init__.py +0 -0
  102. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_common/constants.py +0 -0
  103. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/__init__.py +0 -0
  104. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_batch_run/__init__.py +0 -0
  105. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluate/_batch_run/code_client.py +0 -0
  106. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/__init__.py +0 -0
  107. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_bleu/__init__.py +0 -0
  108. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_coherence/__init__.py +0 -0
  109. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_coherence/coherence.prompty +0 -0
  110. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_common/__init__.py +0 -0
  111. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_content_safety/__init__.py +0 -0
  112. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_eci/__init__.py +0 -0
  113. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_f1_score/__init__.py +0 -0
  114. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_fluency/__init__.py +0 -0
  115. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_fluency/fluency.prompty +0 -0
  116. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_gleu/__init__.py +0 -0
  117. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_groundedness/__init__.py +0 -0
  118. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_groundedness/groundedness_with_query.prompty +0 -0
  119. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_groundedness/groundedness_without_query.prompty +0 -0
  120. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_meteor/__init__.py +0 -0
  121. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_multimodal/__init__.py +0 -0
  122. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_protected_material/__init__.py +0 -0
  123. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_qa/__init__.py +0 -0
  124. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_relevance/__init__.py +0 -0
  125. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_relevance/relevance.prompty +0 -0
  126. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_retrieval/__init__.py +0 -0
  127. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_retrieval/retrieval.prompty +0 -0
  128. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_rouge/__init__.py +0 -0
  129. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_service_groundedness/__init__.py +0 -0
  130. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_similarity/__init__.py +0 -0
  131. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_similarity/similarity.prompty +0 -0
  132. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_evaluators/_xpia/__init__.py +0 -0
  133. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_http_utils.py +0 -0
  134. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_user_agent.py +0 -0
  135. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_vendor/__init__.py +0 -0
  136. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_vendor/rouge_score/__init__.py +0 -0
  137. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_vendor/rouge_score/rouge_scorer.py +0 -0
  138. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_vendor/rouge_score/scoring.py +0 -0
  139. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_vendor/rouge_score/tokenize.py +0 -0
  140. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/_vendor/rouge_score/tokenizers.py +0 -0
  141. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/py.typed +0 -0
  142. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/__init__.py +0 -0
  143. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_conversation/__init__.py +0 -0
  144. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_conversation/_conversation.py +0 -0
  145. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_conversation/constants.py +0 -0
  146. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_data_sources/__init__.py +0 -0
  147. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_data_sources/grounding.json +0 -0
  148. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_helpers/__init__.py +0 -0
  149. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_helpers/_language_suffix_mapping.py +0 -0
  150. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_helpers/_simulator_data_classes.py +0 -0
  151. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_model_tools/__init__.py +0 -0
  152. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_model_tools/_proxy_completion_model.py +0 -0
  153. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_model_tools/_template_handler.py +0 -0
  154. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_model_tools/models.py +0 -0
  155. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_prompty/__init__.py +0 -0
  156. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_prompty/task_query_response.prompty +0 -0
  157. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_prompty/task_simulate.prompty +0 -0
  158. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure/ai/evaluation/simulator/_tracing.py +0 -0
  159. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure_ai_evaluation.egg-info/dependency_links.txt +0 -0
  160. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure_ai_evaluation.egg-info/not-zip-safe +0 -0
  161. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/azure_ai_evaluation.egg-info/top_level.txt +0 -0
  162. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/setup.cfg +0 -0
  163. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/__init__.py +0 -0
  164. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/__openai_patcher.py +0 -0
  165. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/__pf_service_isolation.py +0 -0
  166. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/e2etests/__init__.py +0 -0
  167. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/e2etests/custom_evaluators/answer_length_with_aggregation.py +0 -0
  168. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/e2etests/test_adv_simulator.py +0 -0
  169. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/e2etests/test_metrics_upload.py +0 -0
  170. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/e2etests/test_sim_and_eval.py +0 -0
  171. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_batch_run_context.py +0 -0
  172. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_content_safety_defect_rate.py +0 -0
  173. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_eval_run.py +0 -0
  174. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_evaluate_telemetry.py +0 -0
  175. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_evaluators/apology_dag/apology.py +0 -0
  176. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_evaluators/test_inputs_evaluators.py +0 -0
  177. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_jailbreak_simulator.py +0 -0
  178. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_save_eval.py +0 -0
  179. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_simulator.py +0 -0
  180. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_synthetic_callback_conv_bot.py +0 -0
  181. {azure_ai_evaluation-1.0.0b5 → azure_ai_evaluation-1.0.1}/tests/unittests/test_synthetic_conversation_bot.py +0 -0
@@ -1,5 +1,42 @@
  # Release History

+ ## 1.0.1 (2024-11-15)
+
+ ### Bugs Fixed
+ - Fixed `[remote]` extra to be needed only when tracking results in Azure AI Studio.
+ - Removing `azure-ai-inference` as dependency.
+
+ ## 1.0.0 (2024-11-13)
+
+ ### Breaking Changes
+ - The `parallel` parameter has been removed from composite evaluators: `QAEvaluator`, `ContentSafetyChatEvaluator`, and `ContentSafetyMultimodalEvaluator`. To control evaluator parallelism, you can now use the `_parallel` keyword argument, though please note that this private parameter may change in the future.
+ - Parameters `query_response_generating_prompty_kwargs` and `user_simulator_prompty_kwargs` have been renamed to `query_response_generating_prompty_options` and `user_simulator_prompty_options` in the Simulator's __call__ method.
+
+ ### Bugs Fixed
+ - Fixed an issue where the `output_path` parameter in the `evaluate` API did not support relative path.
+ - Output of adversarial simulators are of type `JsonLineList` and the helper function `to_eval_qr_json_lines` now outputs context from both user and assistant turns along with `category` if it exists in the conversation
+ - Fixed an issue where during long-running simulations, API token expires causing "Forbidden" error. Instead, users can now set an environment variable `AZURE_TOKEN_REFRESH_INTERVAL` to refresh the token more frequently to prevent expiration and ensure continuous operation of the simulation.
+ - Fix `evaluate` function not producing aggregated metrics if ANY values to be aggregated were None, NaN, or
+ otherwise difficult to process. Such values are ignored fully, so the aggregated metric of `[1, 2, 3, NaN]`
+ would be 2, not 1.5.
+
+ ### Other Changes
+ - Refined error messages for serviced-based evaluators and simulators.
+ - Tracing has been disabled due to Cosmos DB initialization issue.
+ - Introduced environment variable `AI_EVALS_DISABLE_EXPERIMENTAL_WARNING` to disable the warning message for experimental features.
+ - Changed the randomization pattern for `AdversarialSimulator` such that there is an almost equal number of Adversarial harm categories (e.g. Hate + Unfairness, Self-Harm, Violence, Sex) represented in the `AdversarialSimulator` outputs. Previously, for 200 `max_simulation_results` a user might see 140 results belonging to the 'Hate + Unfairness' category and 40 results belonging to the 'Self-Harm' category. Now, user will see 50 results for each of Hate + Unfairness, Self-Harm, Violence, and Sex.
+ - For the `DirectAttackSimulator`, the prompt templates used to generate simulated outputs for each Adversarial harm category will no longer be in a randomized order by default. To override this behavior, pass `randomize_order=True` when you call the `DirectAttackSimulator`, for example:
+ ```python
+ adversarial_simulator = DirectAttackSimulator(azure_ai_project=azure_ai_project, credential=DefaultAzureCredential())
+ outputs = asyncio.run(
+     adversarial_simulator(
+         scenario=scenario,
+         target=callback,
+         randomize_order=True
+     )
+ )
+ ```
+
  ## 1.0.0b5 (2024-10-28)

  ### Features Added
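The 1.0.0 "Bugs Fixed" entry in the CHANGELOG.md hunk above says that values such as None or NaN are now ignored during aggregation, so `[1, 2, 3, NaN]` aggregates to 2 rather than 1.5. The following is a minimal sketch of that rule only, not the package's implementation: `mean_ignoring_invalid` is a hypothetical helper name, and treating infinities as "difficult to process" is an assumption made for this illustration.

```python
import math
from typing import Iterable, Optional


def mean_ignoring_invalid(values: Iterable[object]) -> Optional[float]:
    """Hypothetical helper: average only the finite numeric entries.

    None, NaN, non-numeric values and (by assumption) infinities are
    dropped entirely, so they do not count toward the denominator.
    """
    usable = [
        float(v)
        for v in values
        if isinstance(v, (int, float))
        and not isinstance(v, bool)
        and math.isfinite(v)
    ]
    if not usable:
        # Nothing usable to aggregate.
        return None
    return sum(usable) / len(usable)


print(mean_ignoring_invalid([1, 2, 3, float("nan")]))  # 2.0
print(mean_ignoring_invalid([1, 2, 3, None]))          # 2.0
```

Dropping invalid entries from both the sum and the count is what yields 2 here; counting the NaN slot as a zero-valued fourth entry would give 1.5, the result the changelog rules out.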
@@ -56,8 +93,8 @@ outputs = asyncio.run(custom_simulator(
  - `SimilarityEvaluator`
  - `RetrievalEvaluator`
  - The following evaluators will now have a new key in their result output including LLM reasoning behind the score. The new key will follow the pattern "<metric_name>_reason". The reasoning is the result of a more detailed prompt template being used to generate the LLM response. Note that this requires the maximum number of tokens used to run these evaluators to be increased.
-
- | Evaluator | New Token Limit |
+
+ | Evaluator | New `max_token` for Generation |
  | --- | --- |
  | `CoherenceEvaluator` | 800 |
  | `RelevanceEvaluator` | 800 |
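The context lines in the hunk above describe a result key that follows the pattern "<metric_name>_reason". As a rough sketch of consuming such output, the helper below separates reason strings from the other keys by that suffix. The function name `split_scores_and_reasons` and the example dictionary (the `coherence` key and its values) are assumptions made for illustration, not output captured from the package.

```python
from typing import Any, Dict, Tuple

REASON_SUFFIX = "_reason"


def split_scores_and_reasons(
    result: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: split an evaluator result by the '_reason' suffix.

    Returns (scores, reasons); reasons are keyed by the metric name with
    the suffix removed.
    """
    scores = {k: v for k, v in result.items() if not k.endswith(REASON_SUFFIX)}
    reasons = {
        k[: -len(REASON_SUFFIX)]: v
        for k, v in result.items()
        if k.endswith(REASON_SUFFIX)
    }
    return scores, reasons


# Illustrative data only; a real evaluator result may contain other keys.
example = {"coherence": 4.0, "coherence_reason": "The response flows logically."}
scores, reasons = split_scores_and_reasons(example)
print(scores)   # {'coherence': 4.0}
print(reasons)  # {'coherence': 'The response flows logically.'}
```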
@@ -4,4 +4,5 @@ include azure/__init__.py
  include azure/ai/__init__.py
  include azure/ai/evaluation/py.typed
  recursive-include azure/ai/evaluation *.prompty
- include azure/ai/evaluation/simulator/_data_sources/grounding.json
+ include azure/ai/evaluation/simulator/_data_sources/grounding.json
+ recursive-include samples *